-
43251. Autosupport Analysis at NetApp
[Information Transmission, Software and Information Technology Services] [2013-11-20]
This project uses Autosupport data to understand how the production environment and the QA environment relate to each other. Using the K-Means algorithm and a direct matching method, we have identified eight common customer configuration groups, the top customer configurations not tested by any QA machine, and the top QA machines not testing any customer configuration.
Keywords: NetApp; Autosupport analysis; QA environment; big data
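The two techniques named in the abstract above can be sketched as follows. This is only an illustrative sketch, not the project's actual pipeline: it assumes each machine configuration has already been encoded as a numeric tuple (the encoding, the function names `kmeans` and `untested_configs`, and the parameters `iterations` and `seed` are all hypothetical), and it uses plain Lloyd-style K-Means with exact tuple equality standing in for "direct matching".

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster numeric tuples with Lloyd's algorithm.

    Returns (centroids, labels). The initial centroids are k points
    sampled with a fixed seed so that runs are repeatable.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def untested_configs(customer_configs, qa_configs):
    """Direct matching (assumed here to mean exact equality):
    return customer configurations that no QA machine reproduces."""
    qa_set = set(qa_configs)
    return [c for c in customer_configs if c not in qa_set]
```

With this sketch, calling `kmeans(configs, 8)` would correspond to asking for eight configuration groups, and `untested_configs(customer, qa)` lists customer configurations with no exact QA counterpart.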
-
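43252. Optimal Search Algorithms for Structured Problems in Natural Language Processing
[Information Transmission, Software and Information Technology Services] [2013-11-20]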
We will discuss both known and novel algorithms that can find the best path without considering all hyperedges in the hypergraph, and hence can speed up search without sacrificing search quality. We will provide simplified proofs of correctness for these algorithms. We also propose two novel algorithms that permit extraction of the k-best paths instead of the single best. We compare these approaches both against exhaustive search, and against approximate search techniques which speed up search by sacrificing optimality guarantees.
Keywords: search algorithms; natural language processing; structure; search speed
-
43253. Recognition Using Regions
[Information Transmission, Software and Information Technology Services] [2013-11-20]
The success of modeling object viewpoints motivates us to tackle the generic variation problem through component models, where each component characterizes not only a particular viewpoint of objects, but also a particular subcategory or pose. Furthermore, our approach allows the transfer of finer-grained semantic information from the components, such as keypoint locations and segmentation masks.
Keywords: recognition using regions; object modeling; semantic information
-
43254. Language and Framework Support for Reviewably-Secure Software Systems
[Information Transmission, Software and Information Technology Services] [2013-11-20]
My thesis is that languages and frameworks can and should be designed to make it easier for programmers to write reviewably secure systems. A system is reviewably secure if its security is easy for an experienced programmer to verify, given access to the source code. A security reviewer should be able, with a reasonable amount of effort, to gain confidence that such a system meets its stated security goals. This dissertation includes work on language subsetting and web application framework design. It presents Joe-E, a subset of the Java programming language designed to enforce object-capability security, simplifying the task of verifying a variety of security properties by enabling sound, local reasoning. Joe-E also enforces determinism-by-default, which permits functionally-pure methods to be identified by their signature. Functional purity is a useful property that can greatly simplify the task of correctly implementing and reasoning about application code.
Keywords: software systems; information security; framework support; source code; language subsetting
-
43255. HTTPS Vulnerability to Fine-Grained Traffic Analysis
[Information Transmission, Software and Information Technology Services] [2013-11-20]
In this thesis, we apply the pattern recognition and data processing strengths of machine learning to accomplish traffic analysis objectives. Traffic analysis relies on the use of observable features of encrypted traffic in order to infer plaintext contents. We apply a clustering technique to HTTPS encrypted traffic on websites covering medical, legal and financial topics and achieve accuracy rates ranging from 64% to 99% when identifying traffic within each website. The total number of URLs considered on each page ranged from 176 to 366. We present our results along with a justification of the machine learning techniques employed and an evaluation which explores the impact on accuracy of variations in the amount of training data, the number of clustering algorithm invocations, and the convergence threshold. Our technique represents a significant improvement over previous techniques, which have achieved similar accuracy only with the aid of supporting assumptions that simplify traffic analysis. We examine these assumptions more closely and present results suggesting that two assumptions, browser cache configuration and selection of webpages for evaluation, can have considerable impact on analysis. Additionally, we propose a set of minimum evaluation standards for improved quality in traffic analysis evaluations.
Keywords: traffic analysis; machine learning; HTTPS; vulnerability; pattern recognition; data processing
-
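43256. Active Testing: Predicting and Confirming Concurrency Bugs for Concurrent and Distributed Memory Parallel Systems
[Information Transmission, Software and Information Technology Services] [2013-11-20]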
We explain in detail the design decisions and optimizations that were necessary to scale Active Testing to thousands of cores. We present extensions to UPC-Thrille that support hybrid memory models as well. We evaluate the effectiveness of Active Testing by running our tools on several Java and UPC benchmarks, showing that it can predict and confirm real concurrency bugs with low overhead. We demonstrate the scalability of Active Testing by running benchmarks with UPC-Thrille on large clusters with thousands of cores.
Keywords: large clusters; active testing; parallel systems; memory; concurrency; distributed
-
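43257. Using Mobile Technology and Social Networking to Crowdsource Citizen Science
[Information Transmission, Software and Information Technology Services] [2013-11-20]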
This dissertation explores the application of computer science methodologies, techniques, and technologies to citizen science. Citizen science can be broadly defined as scientific research performed in part or in whole by volunteers who are not professional scientists. Such projects are increasingly making use of mobile and Internet technologies and social networking systems to collect or categorize data, and to coordinate efforts with other participants. The dissertation focuses on observations and experiences from the design, deployment, and testing of a citizen science project, CreekWatch. CreekWatch is a collaboration between an HCI research group and a government agency.
Keywords: crowdsourcing; Internet technologies; social networking
-
43258. Autotuning Sparse Matrix-Vector Multiplication for Multicore
[Information Transmission, Software and Information Technology Services] [2013-11-20]
Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering computing. Straightforward parallel implementations of SpMV often perform poorly, and with the increasing variety of architectural features in multicore processors, it is getting more difficult to determine the sparse matrix data structure and corresponding SpMV implementation that optimize performance. In this paper we present pOSKI, an autotuning system for SpMV that automatically searches over a large set of possible data structures and implementations to optimize SpMV performance on multicore platforms. pOSKI explores a design space that depends on both the nonzero pattern of the sparse matrix, typically not known until run-time, and the architecture, which is explored off-line as much as possible, in order to reduce tuning time. We demonstrate significant performance improvements compared to previous serial and parallel implementations, and compare performance to upper bounds based on architectural models.
Keywords: engineering computing; sparse matrices; vector multiplication; multicore processors; data structures
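For readers unfamiliar with the kernel described in the abstract above, the following is a minimal sketch of SpMV over one common sparse data structure, compressed sparse row (CSR). The abstract does not say which formats pOSKI searches over, so CSR is an assumption chosen for illustration; the function names `dense_to_csr` and `spmv_csr` are hypothetical and are not part of pOSKI's interface.

```python
def dense_to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero values, the column
    of each nonzero, and the offset in `values` where each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row, start of the next
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Only the stored nonzeros of row i contribute to y[i].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y
```

The inner loop reads `x` through `col_idx`, so its memory access pattern depends on the nonzero structure of the matrix. That dependence is one reason the best data structure is hard to choose ahead of time and why a search over alternatives, as the abstract describes, can pay off.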
-
43259. Algorithms for Statistical Problems
[Information Transmission, Software and Information Technology Services] [2013-11-20]
The increasing size of our datasets, and perhaps more importantly the increasing complexity of the underlying distributions that we hope to understand, are exposing issues that seem to demand computational consideration. In this dissertation, we apply the computational perspective to three basic statistical questions which underlie and abstract several of the challenges encountered in the analysis of today's large datasets.
Keywords: probability; statistical data; algorithms; large datasets
-
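43260. An Optimized Video-on-Demand System: Theory, Design and Implementation
[Information Transmission, Software and Information Technology Services] [2013-11-20]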
First, we show that by storing only a fraction of the entire catalog everywhere, the system is able to fully support user demand at large scale. Second, we develop a Markov approximation technique to solve the problem of topology selection under a node degree bound using a simple distributed algorithm. We prove that our algorithm achieves a close-to-optimal solution, which we verify using extensive real-world trace simulations. On the system side, we show extensive results to test the algorithm's scalability and robustness to changes in user dynamics and demand patterns. We show that our solution achieves high utilization of the cache nodes' storage and bandwidth resources, and automatically learns and caches videos according to the demand patterns. We observe that there exists a complex interplay between disk space, network bandwidth and node degree bound. We also present guidelines for important practical design choices, including caching update intervals, demand prediction and provisioning. Finally, we demonstrate the feasibility and efficiency of our design choices by building and experimenting with a prototype system at Berkeley.
Keywords: topology selection; algorithms; dynamic user demand; video-on-demand system