Keywords: distributed real-time processing system; data processing; availability
Abstract: Our Capstone project involves working with Apache Storm, an open-source distributed real-time processing system, in collaboration with Cisco Systems, Inc. In this context, “real-time processing” means that the system can respond to requests within seconds or less, while “distributed” means that it runs on multiple computers. The goal of the project is to add a feature called “k-safety” to Storm. With k-safety, Storm will be able to tolerate up to k machine failures without losing data or increasing its response time, making the system highly available. Cisco plans to integrate our modified version of Storm into its data processing pipeline and use it to support internal and customer-facing products.