Dictionary and pattern-based recognition of organization names in Russian news texts
Authors: Valery Solovyev, Rinat Gareev, Vladimir Ivanov, Sergey Serebryakov, Natalia Vassilieva
Affiliations: Kazan Federal University, Institute of Informatics AS RT, National University of Science and Technology
Date processed: 2013-12-21
Source: HP
Length: 7 pages
Keywords: named entity recognition; knowledge-based; event extraction
Abstract: This paper describes a part of the event extraction system which has been developed in collaboration with HP Labs Russia. The domain of input texts is business news feeds. One of the most important event participant types is 'Organization'. This paper focuses on the problem of organization name recognition in Russian news texts. Two approaches have been implemented. The first is dictionary-based. We propose an algorithm to make a dictionary from a set of legal body full names gathered from a government registry. The main problems with dictionary matching are incorrect stemming and a significant fraction of ambiguous names among dictionary entries. The second recognition approach uses local context clues and internal name words. These words constitute patterns which are intrinsic to organization names. These patterns enable recognition of non-dictionary names. We propose an algorithm to derive such patterns from the original dictionary.
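The two approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the legal-form list `LEGAL_FORMS`, the helper names, and the sample company names are all hypothetical, and exact string matching stands in for the stemming step that the abstract identifies as a source of errors.

```python
import re

# Hypothetical list of Russian legal-form abbreviations (an assumption,
# not taken from the paper).
LEGAL_FORMS = ("ООО", "ОАО", "ЗАО")


def build_dictionary(full_names):
    """Reduce registry full names to short names by dropping the
    legal-form prefix and the surrounding quotes."""
    entries = set()
    for name in full_names:
        tokens = name.split()
        if tokens and tokens[0] in LEGAL_FORMS:
            tokens = tokens[1:]
        short = " ".join(tokens).strip('"«»')
        if short:
            entries.add(short.lower())
    return entries


def dictionary_matches(text, dictionary):
    """Return dictionary entries that occur in the text as whole words.
    Exact matching only: inflected forms are missed, which is the gap
    that stemming is meant to close in a real system."""
    lowered = text.lower()
    found = []
    for entry in sorted(dictionary):
        if re.search(r"(?<!\w)" + re.escape(entry) + r"(?!\w)", lowered):
            found.append(entry)
    return found


# Context-clue pattern (illustrative): a legal-form word followed by a
# quoted name. It can find names that are absent from the dictionary.
CLUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(LEGAL_FORMS) + r")\s+[«\"]([^»\"]+)[»\"]"
)


def pattern_matches(text):
    """Return the quoted names that follow a legal-form clue word."""
    return CLUE_PATTERN.findall(text)


if __name__ == "__main__":
    registry = ["ООО «Ромашка»", "ОАО «Газпром»"]
    dictionary = build_dictionary(registry)
    text = "Акции Газпрома выросли, сообщил Газпром. ЗАО «Вектор» подписало договор."
    print(dictionary_matches(text, dictionary))
    print(pattern_matches(text))
```

In this sketch the dictionary lookup finds the uninflected "Газпром" but skips the genitive "Газпрома", while the clue pattern picks up "Вектор" even though it is not in the dictionary.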