Workload-Driven Design and Evaluation of Large-Scale Data-Centric Systems
Keywords: data-centric; large-scale systems
Abstract: Large-scale data-centric systems help organizations store, manipulate, and derive value from large volumes of data. They consist of distributed components spread across a scalable number of connected machines and involve complex software/hardware stacks with multiple semantic layers. These systems help organizations solve established problems involving large amounts of data, while catalyzing new, data-driven businesses such as search engines, social networks, and cloud computing and data storage service providers. The complexity, diversity, scale, and rapid evolution of large-scale data-centric systems make it challenging to develop intuition about these systems, gain operational experience, and improve performance. Developing a method to design and evaluate such systems based on the empirical behavior of the targeted workloads is therefore an important research problem. Using an unprecedented collection of nine industrial workload traces from business-critical large-scale data-centric systems, we develop a workload-driven design and evaluation method for these systems and apply it to address previously unsolved design problems.