-
4591. Sequential Communication Bounds for Fast Linear Algebra
[Information Transmission, Software and Information Technology Services] [2013-12-30]
In this note we obtain communication cost lower and upper bounds on the algorithms for LU and QR given in (Demmel, Dumitriu, and Holtz 2007). The algorithms there use fast, stable matrix multiplication as a subroutine and are shown to be as stable and as computationally efficient as the matrix multiplication subroutine. We show here that they are also as communication-efficient (in the sequential, two-level memory model) as the matrix multiplication algorithm. The analysis for LU and QR extends to all the algorithms in (Demmel, Dumitriu, and Holtz 2007). Further, we prove that in the case of using Strassen-like matrix multiplication, these algorithms are communication optimal.
Keywords: communication cost; matrix multiplication; memory model
-
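4592. Sequential Decision Making in Non-stochastic Environments
[Information Transmission, Software and Information Technology Services] [2013-12-30]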
Keywords: decision making; parametric distributions; probability distributions; uncertainty
-
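4593. Interactive Query Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads
[Information Transmission, Software and Information Technology Services] [2013-12-30]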
Within the past few years, organizations in diverse industries have adopted MapReduce-based systems for large-scale data processing. Along with these new users, important new workloads have emerged which feature many small, short, and increasingly interactive jobs in addition to the large, long-running batch jobs for which MapReduce was originally designed. As interactive, large-scale query processing (e.g., OLAP) is a strength of the RDBMS community, it is important that lessons from that field be carried over and applied where possible in this new domain. However, these new workloads have not yet been described in the literature. We fill this gap with an empirical analysis of MapReduce traces from six separate business-critical deployments inside Facebook and at Cloudera customers in e-commerce, telecommunications, media, and retail. Our key contribution is a characterization of new MapReduce workloads which are driven in part by interactive analysis, and which make heavy use of SQL-like programming frameworks on top of MapReduce. These workloads display diverse behaviors which invalidate prior assumptions about MapReduce such as uniform data access, regular diurnal patterns, and prevalence of large jobs. A secondary contribution is a first step towards creating a TPC-like data processing benchmark for MapReduce.
Keywords: data processing; workloads; query processing; diurnal patterns
-
4594. High-Level Service-Level Objectives for Storage
[Information Transmission, Software and Information Technology Services] [2013-12-30]
Modern datacenters support a large number of applications with diverse performance requirements. These performance requirements are expressed at the application layer as high-level service-level objectives (SLOs). However, large-scale distributed storage systems are unaware of these high-level SLOs. This lack of awareness results in poor performance when workloads from multiple applications are consolidated onto the same storage cluster to increase utilization. In this paper, we argue that because SLOs are expressed at a high level, a high-level control mechanism is required. This is in contrast to existing approaches, which use block- or disk-level mechanisms. These require manual translation of high-level requirements into low-level parameters. We present Frosting, a request scheduling layer on top of a distributed storage system that allows application programmers to specify their high-level SLOs directly. Frosting improves over the state of the art by automatically translating high-level SLOs into internal scheduling parameters and by using feedback control to adapt these parameters to changes in the workload. Our preliminary results demonstrate that our overlay approach can multiplex both latency-sensitive and batch applications to increase utilization, while still maintaining a 100 ms 99th percentile latency SLO for latency-sensitive clients.
Keywords: high-level service-level objectives; high-level control mechanism; low-level parameters; workloads
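The feedback-control idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not Frosting's actual controller, which the abstract does not specify: the proportional control law, the `batch_share` parameter, the gain, and the function names are all assumptions made for illustration. Only the 100 ms 99th-percentile target comes from the abstract.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) using integer arithmetic
    return ordered[rank - 1]


def adjust_batch_share(batch_share, observed_p99_ms, target_ms=100.0, gain=0.5):
    """One proportional control step (hypothetical control law).

    batch_share is an assumed scheduling parameter: the fraction of
    storage capacity granted to batch requests. If the observed p99
    latency exceeds the target, the share shrinks; if there is
    headroom, it grows. The result is clamped to [0, 1].
    """
    error = (target_ms - observed_p99_ms) / target_ms
    new_share = batch_share + gain * error
    return min(1.0, max(0.0, new_share))


if __name__ == "__main__":
    share = 0.5
    # Hypothetical p99 measurements from successive control intervals.
    for observed in (150.0, 120.0, 95.0):
        share = adjust_batch_share(share, observed)
        print(f"p99={observed} ms -> batch share {share:.3f}")
```

The point of the sketch is the shape of the loop: the application states only the latency target, and the low-level parameter is derived from measurements rather than set by hand.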
-
4595. Simple, Accurate Parsing with an All-Fragments Grammar
[Information Transmission, Software and Information Technology Services] [2013-12-30]
We present a simple but accurate parser which exploits both large tree fragments and symbol refinement. We parse with all fragments of the training set, in contrast to much recent work on tree selection in data-oriented parsing and tree-substitution grammar learning. We require only simple, deterministic grammar symbol refinement, in contrast to recent work on latent symbol refinement. Moreover, our parser requires no explicit lexicon machinery, instead parsing input sentences as character streams. Despite its simplicity, our parser achieves accuracies of over 88% F1 on the standard English WSJ task, which is competitive with substantially more complicated state-of-the-art lexicalized and latent-variable parsers. Additional specific contributions center on making implicit all-fragments parsing efficient, including a coarse-to-fine inference scheme and a new graph encoding.
Keywords: parser; grammar learning; implicit center
-
4596. Communication-Optimal Parallel Algorithm for Strassen's Matrix Multiplication
[Information Transmission, Software and Information Technology Services] [2013-12-30]
Parallel matrix multiplication is one of the most studied fundamental problems in distributed and high performance computing. We obtain a new parallel algorithm that is based on Strassen's fast matrix multiplication and minimizes communication. The algorithm outperforms all known parallel matrix multiplication algorithms, classical and Strassen-based, both asymptotically and in practice.
Keywords: parallel matrix multiplication; Strassen; distributed and high-performance computing problems
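For readers unfamiliar with the building block named in the abstract above, the following is a minimal sequential sketch of Strassen's recursion, which replaces the eight half-size block products of the classical method with seven. It is not the parallel, communication-minimizing algorithm the report describes. It assumes square matrices whose size is a power of two, stored as lists of lists; the helper names and the `cutoff` parameter are illustrative.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def classical(X, Y):
    """Standard row-by-column matrix product."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def split(X):
    """Return the four quadrants X11, X12, X21, X22 of an even-sized matrix."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def strassen(A, B, cutoff=2):
    """Multiply n x n matrices (n a power of two) with seven recursive products."""
    n = len(A)
    if n <= cutoff:
        return classical(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    # Stitch the quadrants back into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


if __name__ == "__main__":
    print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]], cutoff=1))
```

In practice the cutoff is chosen well above 1, since the extra additions make the recursion slower than the classical product on small blocks.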
-
4597. A Preliminary Analysis of Cyclops Tensor Framework
[Information Transmission, Software and Information Technology Services] [2013-12-30]
Cyclops (cyclic-operations) Tensor Framework (CTF) is a distributed library for tensor contractions. CTF aims to scale high-dimensional tensor contractions done in Coupled Cluster calculations on massively-parallel supercomputers. The framework preserves tensor symmetry by subdividing tensors cyclically, producing a highly regular parallel decomposition. The parallel decomposition effectively hides any high dimensional structure of tensors, reducing the complexity of the distributed contraction algorithm to known linear algebra methods for matrix multiplication. We also detail the automatic topology-aware mapping framework deployed by CTF, which maps tensors of any dimension and structure onto torus networks of any dimension. We employ virtualization to provide completely general mapping support while maintaining perfect load balance. Performance of a preliminary version of CTF on the IBM Blue Gene/P and Cray XE6 supercomputers shows highly efficient weak scaling, demonstrating the viability of our approach.
Keywords: distributed tensor library; linear algebra matrix multiplication; load balance; subdivision framework
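The cyclic subdivision mentioned in the abstract above can be illustrated with a small sketch. It is not CTF's API or its mapping logic: it only shows the general idea of a cyclic layout, in which the element with index (i_1, ..., i_d) is owned by the processor at grid coordinate (i_1 mod p_1, ..., i_d mod p_d). The function names are hypothetical, and the example uses a dense tensor whose dimensions are divisible by the grid dimensions.

```python
from collections import Counter
from itertools import product


def cyclic_owner(index, grid):
    """Grid coordinate of the processor owning a tensor element.

    index: tuple of element indices, one per tensor dimension.
    grid:  tuple of processor counts, one per tensor dimension.
    """
    return tuple(i % p for i, p in zip(index, grid))


def load_per_processor(shape, grid):
    """Count how many elements of a dense tensor each processor owns."""
    counts = Counter()
    for index in product(*(range(n) for n in shape)):
        counts[cyclic_owner(index, grid)] += 1
    return counts


if __name__ == "__main__":
    # A 4 x 6 tensor on a 2 x 3 processor grid.
    for proc, count in sorted(load_per_processor((4, 6), (2, 3)).items()):
        print(proc, count)
```

Because neighbouring indices land on different processors, every processor sees a regular sample of the whole index space rather than one contiguous block.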
-
4598. Android Permissions: User Attention, Comprehension, and Behavior
[Information Transmission, Software and Information Technology Services] [2013-12-30]
Android’s permission system is intended to inform users about the risks of installing applications. When a user installs an application, he or she has the opportunity to review the application’s permission requests and cancel the installation if the permissions are excessive or objectionable. We examine whether the Android permission system is effective at warning users. In particular, we evaluate whether Android users pay attention to, understand, and act on permission information during installation. We performed two usability studies: an Internet survey of 308 Android users, and a laboratory study where we interviewed and observed 25 Android users. Study participants displayed low attention and comprehension rates: both the Internet survey and laboratory study found that 17% of people paid attention to permissions during installation, and only 3% of Internet survey respondents could correctly answer all three permission comprehension questions. This indicates that current Android permission warnings do not help most users make correct security decisions. However, a notable minority of users demonstrated both awareness of permission warnings and reasonable rates of comprehension. We present recommendations for improving user attention and comprehension, as well as identify open challenges.
Keywords: Android; Internet survey; user attention; permissions
-
4599. A Model and Framework for Reliable Build Systems
[Information Transmission, Software and Information Technology Services] [2013-12-30]
Reliable and fast builds are essential for rapid turnaround during development and testing. Popular existing build systems rely on correct manual specification of build dependencies, which can lead to invalid build outputs and nondeterminism. We outline the challenges of developing reliable build systems and explore the design space for their implementation, with a focus on non-distributed, incremental, parallel build systems. We define a general model for resources accessed by build tasks and show its correspondence to the implementation technique of minimum information libraries, APIs that return no information that the application doesn’t plan to use. We also summarize preliminary experimental results from several prototype build managers.
Keywords: build systems; focus identification; design exploration
-
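4600. DeTail: Reducing the Flow Completion Time Tail in Datacenter Networks
[Information Transmission, Software and Information Technology Services] [2013-12-30]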
Web applications have now become so sophisticated that rendering a typical page may require hundreds of intra-datacenter flows. At the same time, web sites must meet strict page creation deadlines of 200-300 ms to satisfy user demands for interactivity. Long-tailed flow completion times make it challenging for web sites to meet these constraints. They are forced to choose between rendering a subset of the complex page and delaying its rendering, thus missing deadlines and sacrificing either quality or responsiveness. Either option leads to potential financial loss.
Keywords: Web applications; user interactivity; long-tailed flow completion times