Re-optimizing Data Parallel Computing

Processed: 2013-11-18 Source: EECS Original text: 39 pages
Keywords: optimizing data-parallel computing; query optimizer; efficient execution
Abstract: Performant execution of data-parallel jobs needs good execution plans. Certain properties of the code, the data, and the interaction between them are crucial to generate these plans. Yet, these properties are difficult to estimate due to the highly distributed nature of these frameworks, the freedom that allows users to specify arbitrary code as operations on the data, and since jobs in modern clusters have evolved beyond single map and reduce phases to logical graphs of operations. Using fixed a priori estimates of these properties to choose execution plans, as modern systems do, leads to poor performance in several instances. We present RoPE, a first step towards re-optimizing data-parallel jobs. RoPE collects certain code and data properties by piggybacking on job execution. It adapts execution plans by feeding these properties to a query optimizer. We show how this improves the future invocations of the same (and similar) jobs and characterize the scenarios of benefit. Experiments on Bing's production clusters show up to 2× improvement across response time for production jobs at the 75th percentile while using 1.5× fewer resources.
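
The feedback loop the abstract describes (observe properties while a job runs, then hand them to the optimizer for the next run) can be illustrated with a toy sketch. This is not RoPE's implementation: `ObservedStats`, `run_filter`, `choose_join`, the 0.5 fixed selectivity, and the 1000-row broadcast limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ObservedStats:
    """Properties recorded while a previous run of the job executed (hypothetical)."""
    rows_in: int = 0
    rows_out: int = 0

    @property
    def selectivity(self) -> float:
        # Fraction of input rows that survived the operation.
        return self.rows_out / self.rows_in if self.rows_in else 1.0


def run_filter(rows, predicate, stats):
    """Apply a user-defined predicate and count rows as a side effect of execution."""
    kept = []
    for row in rows:
        stats.rows_in += 1
        if predicate(row):
            stats.rows_out += 1
            kept.append(row)
    return kept


def choose_join(estimated_rows, broadcast_limit=1000):
    """Toy optimizer rule: broadcast a small input, otherwise repartition."""
    return "broadcast" if estimated_rows <= broadcast_limit else "repartition"


INPUT_ROWS = 100_000
FIXED_SELECTIVITY = 0.5  # assumed a priori guess, used before any run is observed

# First invocation: plan with the fixed estimate, then observe actual behavior.
first_plan = choose_join(round(INPUT_ROWS * FIXED_SELECTIVITY))
stats = ObservedStats()
run_filter(range(INPUT_ROWS), lambda r: r % 1000 == 0, stats)

# Next invocation: plan with the selectivity observed during the first run.
next_plan = choose_join(round(INPUT_ROWS * stats.selectivity))
```

In this sketch the fixed estimate overstates the filter's output, so the first run picks the repartitioned join; the observed selectivity lets the next run pick the cheaper broadcast join instead.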