关键词:并行工作;数据密集型集群;优化
摘 要:To cope with the deluge in data that is growing faster than Moores law, computation frameworks have resorted to massive parallelization of analytics jobs into many ne-grained tasks. These frameworks promised to provide efficient and fault-tolerant execution of these tasks. However, meeting this promise in clusters spanning hundreds of thousands of machines is challenging and a key departure from earlier work on parallel computing.