Industry Report Details - Industry Report Database

Shark: Fast Data Analysis Using Coarse-grained Distributed Memory

Author: Clifford Engle  Processed: 2013-12-18  Source: EECS

Keywords: distributed memory; fast analysis; data analysis
Abstract: Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unified system for easy data manipulation using SQL and pushing sophisticated analysis closer to data. It scales to thousands of nodes in a fault-tolerant manner. Shark can answer queries 40X faster than Apache Hive and run machine learning programs 25X faster than MapReduce programs in Apache Hadoop on large datasets. This is a complete overview of the development of Shark, including design decisions, performance details, and comparison with existing data warehousing solutions. It demonstrates some of Shark's distinguishing features including its in-memory columnar caching and its unified machine learning interface.
