-
42951. Fast Submatch Extraction Using OBDDs
[Information transmission, software and information technology services] [2013-12-18]
Network-based intrusion detection systems (NIDS) commonly use pattern languages to identify packets of interest. Similarly, security information and event management (SIEM) systems rely on pattern languages for real-time analysis of security alerts and event logs. Both NIDS and SIEM systems use pattern languages extended from regular expressions. One such extension, the submatch construct, allows the extraction of substrings from a string matching a pattern. Existing solutions for submatch extraction are based on non-deterministic finite automata (NFAs) or recursive backtracking. NFA-based algorithms are time-inefficient. Recursive backtracking algorithms perform poorly on pathological inputs generated by algorithmic complexity attacks. We propose a new approach for submatch extraction that uses ordered binary decision diagrams (OBDDs) to represent and carry out pattern matching. Our evaluation using patterns from the Snort HTTP rule set and a commercial SIEM system shows that our approach achieves its ideal performance when patterns are combined. In the best case, our approach is faster than RE2 and PCRE by one to two orders of magnitude.
Keywords: regular expressions; pattern matching; submatch; tagged NFA; ordered binary decision diagram (OBDD)
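For readers unfamiliar with the submatch construct, the following minimal sketch shows what submatch extraction returns. It uses Python's built-in `re` module, which is a backtracking engine and not the OBDD-based approach the abstract proposes. The pattern, the group names, and the sample input are assumptions chosen only for illustration.

```python
import re

# Hypothetical pattern: the named groups are the "submatches" to extract.
PATTERN = re.compile(r"GET (?P<path>/\S*) HTTP/(?P<version>\d\.\d)")


def extract_submatches(line):
    """Return the named submatches found in line, or None if there is no match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


# Illustrative input, not taken from the Snort rule set.
result = extract_submatches("GET /index.html HTTP/1.1")
print(result)
```

A line that does not match the pattern yields `None`, so callers can tell "no match" apart from "match with empty submatches".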
-
42952. Joint Optimization of Assortment Selection and Pricing under the Capacitated Multinomial Logit Choice Model with Product-Differentiated Price Sensitivities
[Information transmission, software and information technology services] [2013-12-18]
Many firms face the problem of selecting an assortment of products and determining their prices to maximize total profit subject to a capacity constraint. Customers' purchase behavior is assumed to follow the multinomial logit (MNL) choice model. Our analysis shows that the capacity is always fully used and that the nested structure is lost when price sensitivities are product-differentiated. We propose a nonrecursive polynomial-time algorithm for the joint assortment and multi-product price optimization.
Keywords: multinomial logit model; assortment optimization; multi-product price optimization
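As a rough illustration of the choice model named above, the sketch below computes purchase probabilities and expected profit under a textbook MNL form. The functional form (utility `a_i - b_i * p_i`, with a no-purchase option of utility zero) and all parameter names are assumptions, not taken from the paper. It does not model the capacity constraint or implement the paper's optimization algorithm.

```python
import math


def mnl_choice_probabilities(qualities, sensitivities, prices):
    """Purchase probabilities under an assumed MNL form.

    Product i has utility a_i - b_i * p_i, where b_i is its own
    (product-differentiated) price sensitivity. The no-purchase option
    is assumed to have utility 0, contributing the 1.0 in the denominator.
    """
    weights = [math.exp(a - b * p)
               for a, b, p in zip(qualities, sensitivities, prices)]
    denominator = 1.0 + sum(weights)
    return [w / denominator for w in weights]


def expected_profit(qualities, sensitivities, prices, costs):
    """Expected profit per customer: sum of margin times purchase probability."""
    probabilities = mnl_choice_probabilities(qualities, sensitivities, prices)
    return sum((p - c) * q for p, c, q in zip(prices, costs, probabilities))


# Hypothetical two-product assortment, for illustration only.
probabilities = mnl_choice_probabilities([1.0, 0.5], [0.8, 1.2], [2.0, 1.5])
profit = expected_profit([1.0, 0.5], [0.8, 1.2], [2.0, 1.5], [1.0, 1.0])
```

Because the no-purchase option always has positive weight, the product probabilities sum to less than one.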
-
42953. The Art of Digital Publishing: A Foundation of Combined Standards to Support the Future of Publishing
[Information transmission, software and information technology services; Printing and reproduction of recorded media] [2013-12-18]
Scientific content increasingly relies on the presentation and authoring of complex multimedia diagrams and figures, sometimes interactive, to convey information in a non-textual way. Wikis and user-generated hyperlinked content have both been very successful in the case of text. This is what we aim to do for mathematical diagrams. Many professors in higher education who write textbooks know TeX; however, they often do not know how to program for the Web. The future of building interactive user interfaces should lie not in the hands of programmers, but in the hands of the experts of a given field. The goal of this project is to supply math, physics, and engineering professors with a platform for expressing mathematical concepts to students and providing immersive learning environments.
Keywords: digital publishing; programming; interactive user interfaces
-
42954. Improving Data Center Efficiency through Local Optimization of Cooling Resources
[Information transmission, software and information technology services] [2013-12-18]
Data centers are large computing facilities that can house tens of thousands of computer servers, storage and networking devices. They can consume megawatts of power and, as a result, reject megawatts of heat. For more than a decade, researchers have been investigating methods to improve the efficiency with which these facilities are cooled. One of the key challenges in maintaining highly efficient cooling is providing on-demand cooling resources to each server rack, whose needs may vary with time and with rack location within the larger data center. In common practice today, chilled-water or refrigerant-cooled computer room air conditioning (CRAC) units are used to reject the waste heat outside the data center, and they also work together with the fans in the IT equipment to circulate air within the data center for heat transport. In a raised-floor data center, the cool air exiting the multiple CRAC units enters the underfloor plenum before it is distributed through the vent tiles in the cold aisles to the IT equipment. The vent tiles usually have fixed openings and are not adapted to accommodate the flow demand, which can vary from cold aisle to cold aisle or rack to rack. In this configuration, CRAC units carry the extra responsibility of distributing cooling resources as well as provisioning them.
Keywords: data center; CRAC; cooling efficiency
-
42955. Querying Social Media and Structured Data with InfoSphere BigInsights: An Introduction to Jaql
[Information transmission, software and information technology services] [2013-12-18]
If you're looking to get off to a quick start with big data projects involving IBM InfoSphere BigInsights, learning the basics of how to query, manipulate, and analyze your data is important. This article takes you through simple query examples that show how you can read, write, filter, and refine social media and structured data. You'll even see how business analysts can visualize query results using a spreadsheet-style tool.
Keywords: InfoSphere BigInsights; social media; structured data; Jaql
-
42956. Polynomial-Time Verification of PCTL Properties of MDPs with Convex Uncertainties
[Information transmission, software and information technology services] [2013-12-18]
We address the problem of verifying Probabilistic Computation Tree Logic (PCTL) properties of Markov Decision Processes (MDPs) whose state transition probabilities are only known to lie within uncertainty sets. We first introduce the model of Convex-MDPs (CMDPs), i.e., MDPs with convex uncertainty sets. CMDPs generalize Interval-MDPs (IMDPs) by also allowing more expressive (convex) descriptions of uncertainty. Using results on strong duality for convex programs, we then present a PCTL verification algorithm for CMDPs, and prove that it runs in time polynomial in the size of a CMDP for a rich subclass of convex uncertainty models. This result allows us to lower the previously known algorithmic complexity upper bound for IMDPs from co-NP to PTIME. Using the proposed approach, we verify a consensus protocol and a dynamic configuration protocol for IPv4 addresses.
Keywords: polynomial time; MDP; PCTL; consensus protocol
-
42957. Shark: Fast Data Analysis Using Coarse-Grained Distributed Memory
[Information transmission, software and information technology services] [2013-12-18]
Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unified system for easy data manipulation using SQL and pushing sophisticated analysis closer to data. It scales to thousands of nodes in a fault-tolerant manner. Shark can answer queries 40X faster than Apache Hive and run machine learning programs 25X faster than MapReduce programs in Apache Hadoop on large datasets. This is a complete overview of the development of Shark, including design decisions, performance details, and comparison with existing data warehousing solutions. It demonstrates some of Shark's distinguishing features, including its in-memory columnar caching and its unified machine learning interface.
Keywords: distributed memory; fast analysis; data analysis
-
42958. Creating KVM Backups on IBM PureFlex System
[Information transmission, software and information technology services] [2013-12-18]
IBM PureFlex System, with integrated network and storage virtualization and an open hypervisor, provides an open and cost-effective solution to customers. This article shows how to create a backup of KVM (Kernel-based Virtual Machine) virtual machines in a PureFlex System environment.
Keywords: KVM virtual machine; backup; system-integrated network; storage virtualization
-
42959. Enabling the Keystone LDAP Back End in OpenStack
[Information transmission, software and information technology services] [2013-12-18]
OpenStack is open source software for building public and private clouds that provide an Infrastructure as a Service (IaaS) platform. Keystone is an OpenStack subproject that provides identity services, including user authentication and authorization, for the OpenStack family of projects. This article shows how to configure Keystone to use a Lightweight Directory Access Protocol (LDAP) server as its back end for identity services, instead of the default SQL back end.
Keywords: LDAP tree; configuration; unit testing
-
42960. Results of Periodic Virtual Metrology Calibration in a Hybrid Metrology Scheme
[Information transmission, software and information technology services] [2013-12-18]
Risk and cost must be balanced in the design of semiconductor processing metrology. More specifically, one needs to balance the cost of operating the metrology tool against the loss, in terms of processing cost and yield, caused by limited sampling and by the time lapse between the occurrence and the correction of a process fault. In virtual metrology (VM), the real-time data produced by the processing tool (e.g., plasma etching data during isolation trench formation) is used to predict an outcome of the wafer (e.g., the critical dimension of the trench) using an empirical model.
Keywords: hybrid metrology scheme; virtual metrology calibration; semiconductor