Industry Report Details - Industry Report Database

Speaker Diarization: Current Limitations and New Directions

Author: Mary Tai Knox; Affiliation: Electrical Engineering and Computer Sciences; Processed: 2013-11-07; Source: EECS

Keywords: speech recognition and audio indexing; speaker diarization systems; improvements
Abstract: Speaker diarization is the problem of determining “who spoke when” in an audio recording when the number and identities of the speakers are unknown. Motivated by applications in automatic speech recognition and audio indexing, speaker diarization has been studied extensively over the past decade, and there is currently a wide variety of approaches, including both top-down and bottom-up unsupervised clustering methods. The contributions of this thesis are to provide a unified analysis of the current state of the art, to understand where and why mistakes occur, and to identify directions for improvement.

In the first part of the thesis, we analyze the behavior of six state-of-the-art diarization systems, all evaluated on the National Institute of Standards and Technology (NIST) Rich Transcription 2009 evaluation dataset. While performance is typically assessed in terms of a single number, the diarization error rate (DER), we further characterize the errors based on speech segment durations and their proximity to speaker change points.
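The abstract names the DER but does not define it. As a rough illustration, the sketch below assumes the commonly used definition: missed speech, false-alarm speech, and speaker confusion, summed and divided by the total reference speech time, under the best one-to-one mapping between reference and hypothesis speaker labels. It is not the thesis's scoring code. The function name `frame_der` and the frame-level representation are hypothetical, and the sketch simplifies in three ways: one label per fixed-length frame, at most one speaker per frame (no overlapped speech), and no forgiveness collar around speaker change points. The brute-force search over label mappings is practical only for a handful of speakers.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level diarization error rate (simplified, illustrative sketch).

    Each list holds one label per fixed-length frame; None marks non-speech.
    Assumes at most one speaker per frame (no overlapped speech, no collar).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    pairs = list(zip(reference, hypothesis))
    ref_speech = sum(1 for r, _ in pairs if r is not None)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Speech in the reference that the system labeled as non-speech.
    missed = sum(1 for r, h in pairs if r is not None and h is None)
    # Non-speech in the reference that the system labeled as speech.
    false_alarm = sum(1 for r, h in pairs if r is None and h is not None)
    # Frames where both sides contain speech; these can only be confused.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    ref_ids = list(dict.fromkeys(r for r, _ in both))
    hyp_ids = list(dict.fromkeys(h for _, h in both))

    # Pad the shorter side so every permutation is a one-to-one mapping.
    # Padding entries never match a real reference label.
    size = max(len(ref_ids), len(hyp_ids))
    padded_ref = ref_ids + [None] * (size - len(ref_ids))
    padded_hyp = hyp_ids + [object() for _ in range(size - len(hyp_ids))]

    # Brute-force search for the mapping that maximizes agreeing frames.
    best_correct = 0
    for perm in permutations(padded_ref):
        mapping = dict(zip(padded_hyp, perm))
        correct = sum(1 for r, h in both if mapping[h] == r)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / ref_speech


# Toy example: 1 missed frame, 1 false-alarm frame, 1 confused frame,
# against 7 reference speech frames.
reference = ["A", "A", "A", "B", "B", None, "B", "B"]
hypothesis = ["x", "x", "y", "y", "y", "y", None, "y"]
print(round(frame_der(reference, hypothesis), 3))  # 0.429
```

Because the three error types are counted separately before being summed, the same loop could be extended to bucket errors by segment duration or by distance to the nearest speaker change point, which is the kind of finer-grained breakdown the abstract describes.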
