Keywords: media classification techniques; speech processing; face recognition; automation; search; sensitive information; privacy
Abstract: Automated media classification techniques such as speech processing and face recognition are becoming increasingly commonplace and sophisticated. While such tools can add great value to the public sphere, media searches often process sensitive information, which can compromise client privacy. There is therefore considerable potential for applications involving privacy-preserving searches on public databases such as Google Images, Flickr, or the "Wanted Persons" directories published by various police agencies. This thesis argues that private media searches, which mask the client's query from the server, are both important and practically feasible. The main contributions include an audio search tool that uses private queries to identify a noisy sound clip from a database without revealing information about the query to the database. The proposed scheme is shown to have computation and communication costs that are sublinear in the database size. An important message of this work is that good private search schemes typically require special algorithms designed for the private domain. To that end, some techniques used in the private audio search tool are generalized to adapt nearest-neighbor searches to the private domain. The resulting private nearest-neighbor algorithm is demonstrated in the context of a privacy-preserving face recognition tool.