欢迎访问行业研究报告数据库

行业分类

当前位置:首页 > 报告详细信息

找到报告 1 篇 当前为第 1 页 共 1

数据密集型集群中并行工作性能的优化技术

Optimizing Parallel Job Performance in Data-Intensive Clusters

作者:Ganesh Ananthanarayanan 作者单位:EECS Department, University of California, Berkeley 加工时间:2015-06-07 信息来源:EECS 索取原文[125 页]
关键词:集群技术;并行工作;数据密集型
摘 要:A simple but key aspect of parallel jobs is the all-or-nothing property: unless all tasks of a job are provided equal improvement, there is no speedup in the completion of the job. The all-or-nothing property is critical for the promise of efficient and fault-tolerant parallel computations on large clusters. Meeting this promise in clusters of these scales is challenging and a key departure from prior work on distributed systems. This talk will look at the execution of a job from first principles and propose techniques spanning the software stack of data analytics systems such that its tasks achieve homogeneous performance while overcoming the various heterogeneities. To that end, we will propose techniques for (i) caching and cache replacement for parallel jobs, which outperforms even Belady's MIN (that uses an oracle), (ii) data locality, and (iii) straggler mitigation. Our analyses and evaluation are performed using workloads from Facebook and Bing production datacenters Along the way, we will also describe how we broke the myth of disk-locality's importance in datacenter computing.
© 2016 武汉世讯达文化传播有限责任公司 版权所有 技术支持:武汉中网维优
客服中心

QQ咨询


点击这里给我发消息 客服员


电话咨询


027-87841330


微信公众号




展开客服