Keywords: text summarization; text documents; machine learning
Abstract: This dissertation presents techniques for the summarization and exploration of text documents. Many approaches to the analysis of news media are analogous to well-defined, well-studied problems in statistical machine learning. The problem of feature selection, for classification and dimensionality reduction tasks, is formulated to assist with these media analysis tasks. By taking advantage of ℓ1 regularization, these feature selection problems can be solved efficiently as convex programs. The results demonstrate the potential to conduct media analysis at a scale commensurate with the growing volume of data available to news consumers.