-
43091. Multi-Agent Cluster Scheduling for Scalability and Flexibility
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
This dissertation presents a taxonomy and evaluation of three cluster scheduling architectures for scalability and flexibility, using a common high-level taxonomy of cluster scheduling, a Monte Carlo simulator, and a real system implementation. We begin with the popular Monolithic State Scheduling (MSS), then consider two new architectures: Dynamically Partitioned State Scheduling (DPS) and Replicated State Scheduling (RSS). We describe and evaluate DPS, which uses pessimistic concurrency control for cluster resource sharing. We then present the design, implementation, and evaluation of Mesos, a real-world DPS cluster scheduler that allows diverse cluster computing frameworks to share resources efficiently.
Keywords: scalability; flexibility; multi-agent cluster scheduling
-
43092. A Methodology for Building Russian-Language Extraction Templates for Knowledge-Based IE Systems
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
In this technical report we describe a methodology for building information extraction (IE) rules. Rules are usually developed by experts and are widely used in knowledge-based IE systems. They consist of two parts: the left-hand side (LHS) of a rule is a template that matches a certain syntactico-semantic structure (SSS), and the right-hand side (RHS) is an action that is executed when the LHS template is matched against a particular text fragment. In the report we describe the process of building the more complex LHS part (the template). This methodology was used to develop an information extraction system that extracts business events from news articles written in Russian.
Keywords: information extraction rules; event extraction; dictionaries; rules; patterns; Meaning-Text model; Chomsky grammar
-
43093. Fast Submatch Extraction Using OBDDs
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
Network-based intrusion detection systems (NIDS) commonly use pattern languages to identify packets of interest. Similarly, security information and event management (SIEM) systems rely on pattern languages for real-time analysis of security alerts and event logs. Both NIDS and SIEM systems use pattern languages extended from regular expressions. One such extension, the submatch construct, allows the extraction of substrings from a string matching a pattern. Existing solutions for submatch extraction are based on non-deterministic finite automata (NFAs) or recursive backtracking. NFA-based algorithms are time-inefficient. Recursive backtracking algorithms perform poorly on pathological inputs generated by algorithmic complexity attacks. We propose a new approach for submatch extraction that uses ordered binary decision diagrams (OBDDs) to represent and perform pattern matching. Our evaluation using patterns from the Snort HTTP rule set and a commercial SIEM system shows that our approach achieves its ideal performance when patterns are combined. In the best case, our approach is faster than RE2 and PCRE by one to two orders of magnitude.
Keywords: regular expressions; pattern matching; submatch; tagged NFA; ordered binary decision diagram (OBDD)
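As a reader aid, the submatch construct mentioned in this abstract can be illustrated with capture groups in an ordinary regular expression library. The sketch below uses Python's standard `re` module, which is a backtracking engine; it shows only what submatch extraction returns and is not the OBDD-based method of the report. The pattern, group names, and function name are hypothetical, chosen for illustration.

```python
import re

# Hypothetical pattern, loosely in the spirit of an HTTP rule: capture the
# method and the path of a request line. Names are illustrative assumptions.
REQUEST_LINE = re.compile(r"(?P<method>GET|POST) (?P<path>/\S*) HTTP/1\.[01]")


def extract_submatches(line):
    """Return the named submatches of a request line, or None if it does not match."""
    match = REQUEST_LINE.fullmatch(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract_submatches("GET /index.html HTTP/1.1"))
```

Each named group corresponds to one submatch: the whole string must match the pattern, and the substrings matched by the groups are what gets extracted.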
-
43094. Joint Optimization of Assortment Selection and Pricing under the Capacitated Multinomial Logit Choice Model with Product-Differentiated Price Sensitivities
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
Many firms face the problem of selecting an assortment of products and determining their prices so as to maximize total profit subject to a capacity constraint. Customers' purchase behavior follows the multinomial logit (MNL) choice model. Our analysis shows that the capacity is always fully used and that the nested structure is lost when price sensitivities are product-differentiated. We propose a nonrecursive polynomial-time algorithm for the joint assortment and multi-product price optimization.
Keywords: multinomial logit model; combinatorial optimization; multi-product price optimization
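For orientation, the multinomial logit choice model named in this abstract can be sketched in a few lines. The sketch assumes the common linear utility form u_i = a_i - b_i * p_i with a no-purchase option of utility 0; the abstract does not state the utility specification, so this form, the function names, and the parameter names are assumptions. The capacity constraint and the report's optimization algorithm are not reproduced here.

```python
import math


def mnl_choice_probabilities(qualities, sensitivities, prices):
    """Purchase probability of each offered product under a standard MNL model.

    Assumed utility of product i: qualities[i] - sensitivities[i] * prices[i].
    The no-purchase option has utility 0, contributing exp(0) = 1 to the sum.
    """
    weights = [
        math.exp(a - b * p)
        for a, b, p in zip(qualities, sensitivities, prices)
    ]
    denominator = 1.0 + sum(weights)
    return [w / denominator for w in weights]


def expected_profit(qualities, sensitivities, prices, costs):
    """Expected profit per customer: sum of margin times purchase probability."""
    probabilities = mnl_choice_probabilities(qualities, sensitivities, prices)
    return sum(q * (p - c) for q, p, c in zip(probabilities, prices, costs))


if __name__ == "__main__":
    # Two hypothetical products with different price sensitivities.
    print(mnl_choice_probabilities([1.0, 2.0], [0.5, 1.0], [2.0, 2.0]))
```

Product-differentiated price sensitivities correspond to allowing a different `sensitivities[i]` for each product rather than one shared coefficient.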
-
43095. The Art of Digital Publishing: A Foundation of Combined Standards to Support the Future of Publishing
[Information Transmission, Software, and Information Technology Services; Printing and Reproduction of Recorded Media] [2013-12-18]
Scientific content increasingly relies on the presentation and authoring of complex multimedia diagrams and figures, sometimes interactive, to convey information in a non-textual way. Wikis and user-generated hyperlinked content have both been very successful in the case of text; this is what we aim to do for mathematical diagrams. Many professors in higher education who write textbooks know TeX; however, they often do not know how to program for the Web. The future of building interactive user interfaces should lie not in the hands of programmers, but in the hands of the experts of a given field. The goal of this project is to supply math, physics, and engineering professors with a platform for expressing mathematical concepts to students in immersive learning environments.
Keywords: digital publishing; programming; interactive user interfaces
-
43096. Improving Data Center Efficiency through Local Optimization of Cooling Resources
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
Data centers are large computing facilities that can house tens of thousands of computer servers, storage devices, and networking devices. They can consume megawatts of power and, as a result, reject megawatts of heat. For more than a decade, researchers have been investigating methods to improve the efficiency with which these facilities are cooled. One of the key challenges in maintaining highly efficient cooling is to provide on-demand cooling resources to each server rack, where demand may vary with time and with rack location within the larger data center. In common practice today, chilled-water or refrigerant-cooled computer room air conditioning (CRAC) units are used to reject the waste heat outside the data center, and they also work together with the fans in the IT equipment to circulate air within the data center for heat transport. In a raised-floor data center, the cool air exiting the multiple CRAC units enters the underfloor plenum before it is distributed through the vent tiles in the cold aisles to the IT equipment. The vent tiles usually have fixed openings and are not adapted to accommodate flow demand that can vary from cold aisle to cold aisle or from rack to rack. In this configuration, CRAC units have the extra responsibility of distributing cooling resources as well as provisioning them.
Keywords: data center; CRAC; cooling efficiency
-
43097. Querying Social Media and Structured Data with InfoSphere BigInsights: An Introduction to Jaql
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
If you're looking to get off to a quick start with big data projects involving IBM InfoSphere BigInsights, learning the basics of how to query, manipulate, and analyze your data is important. This article takes you through simple query examples that show how you can read, write, filter, and refine social media and structured data. You'll even see how business analysts can visualize query results using a spreadsheet-style tool.
Keywords: InfoSphere BigInsights; social media; structured data; Jaql
-
43098. Polynomial-Time Verification of PCTL Properties of MDPs with Convex Uncertainties
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
We address the problem of verifying Probabilistic Computation Tree Logic (PCTL) properties of Markov Decision Processes (MDPs) whose state transition probabilities are only known to lie within uncertainty sets. We first introduce the model of Convex-MDPs (CMDPs), i.e., MDPs with convex uncertainty sets. CMDPs generalize Interval-MDPs (IMDPs) by also allowing more expressive (convex) descriptions of uncertainty. Using results on strong duality for convex programs, we then present a PCTL verification algorithm for CMDPs, and prove that it runs in time polynomial in the size of a CMDP for a rich subclass of convex uncertainty models. This result allows us to lower the previously known algorithmic complexity upper bound for IMDPs from co-NP to PTIME. Using the proposed approach, we verify a consensus protocol and a dynamic configuration protocol for IPv4 addresses.
Keywords: polynomial time; MDP; PCTL; consensus protocol
-
43099. Shark: Fast Data Analysis Using Coarse-Grained Distributed Memory
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unified system for easy data manipulation using SQL and pushing sophisticated analysis closer to data. It scales to thousands of nodes in a fault-tolerant manner. Shark can answer queries 40X faster than Apache Hive and run machine learning programs 25X faster than MapReduce programs in Apache Hadoop on large datasets. This is a complete overview of the development of Shark, including design decisions, performance details, and comparison with existing data warehousing solutions. It demonstrates some of Shark's distinguishing features, including its in-memory columnar caching and its unified machine learning interface.
Keywords: distributed memory; fast analysis; data analysis
-
43100. Creating KVM Backups on IBM PureFlex System
[Information Transmission, Software, and Information Technology Services] [2013-12-18]
IBM PureFlex System, with integrated network and storage virtualization and an open hypervisor, provides an open and cost-effective solution for customers. This article shows how to create a backup of KVM (Kernel-based Virtual Machine) virtual machines in a PureFlex System environment.
Keywords: KVM virtual machines; backup; integrated system networking; storage virtualization