Plugging Text Processing and Mining in a Cloud Computing Framework
Authors: Akil Rajdho, Marenglen Biba
Affiliations: Department of Computer Science, University of New York in Tirana, Tirana, Albania; School of Computing and Mathematical Sciences, University of Greenwich, London, UK
Processed: 2013-10-10
Source: Science and technology report (other)
Length: 22 pages
Keywords: electronic information; cloud computing; plug-in text processing; text mining; data collection; data storage
Abstract: Computational methods have evolved over the years, giving developers and researchers faster and more sophisticated ways to solve hard data processing tasks. However, with new data collection and storage technologies, the amount of gathered data increases every day, making its analysis an increasingly complex task. One of the main forms of stored data is plain unstructured text, and one of the most common ways of analyzing this kind of data is text mining. Text mining is similar to other types of data mining, but unlike other forms of data that are properly structured (such as XML), the data in text mining is, in the best case, semi-structured. To derive valuable information, text mining systems have to execute many complex natural language processing algorithms. In this chapter we focus on text processing tools that deal with stemming algorithms. Stemming is the step that finds the stem (or root) of a word, which is essential in every text processing procedure. Stemming algorithms are complex and require high computational effort. We present an Apache Mahout plugin for a stemming algorithm that makes it possible to execute the algorithm in a cloud computing environment. We investigate the performance of the algorithm in the cloud and show that the new approach significantly reduces the execution time of the original algorithm over a large dataset of text documents.
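
The abstract does not say which stemming algorithm the chapter uses or how the plugin is structured, so the following is only a rough sketch of the general idea: a toy suffix-stripping stemmer applied document by document in a map step, with the per-document results merged in a reduce step. The suffix list and the function names (`toy_stem`, `map_document`, `reduce_counts`, `stem_corpus`) are assumptions made for illustration. The code runs in a single process and does not use the Apache Mahout or Hadoop APIs.

```python
from collections import Counter
from functools import reduce

# Hypothetical suffix list, for illustration only. This is NOT the
# stemming algorithm evaluated in the chapter.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def map_document(text: str) -> Counter:
    """Map step: stem each whitespace-separated token of one document and count the stems."""
    return Counter(toy_stem(token) for token in text.split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge the stem counts of two partial results."""
    return left + right


def stem_corpus(documents) -> Counter:
    """Apply the map step to every document, then fold the results together."""
    return reduce(reduce_counts, map(map_document, documents), Counter())


if __name__ == "__main__":
    corpus = ["mining documents", "mined document"]
    print(stem_corpus(corpus))
```

Because each document is stemmed independently in the map step, the work could in principle be spread across many machines, which is the kind of parallelism a cloud framework provides. How the chapter's plugin actually partitions the work is not stated in the abstract.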