An Indoor Distant Speaker Identification System

Author:
Chen, Zeya, Computer Science - School of Engineering and Applied Science, University of Virginia
Advisor:
Stankovic, John, Department of Computer Science, University of Virginia
Abstract:

Indoor speaker identification systems have a long history and are widely used in acoustic monitoring systems. Much prior work has focused on improving accuracy under realistic conditions such as noise and speaker distance. However, these approaches require either extra effort, such as measuring room types or obtaining many samples from each speaker, or extra hardware, such as microphone arrays with complex deployment settings. In this project, we introduce a complete speaker identification solution. It uses an artificial reverberation generator with different parameters to adjust the original close-distance speech samples, so that each speaker has a set of different artificial voice samples. Samples recorded in different environments are not required because these artificial samples closely approximate those environments. Two kinds of models, GMM-UBM and i-vector, are evaluated. Models are trained separately on each set of samples, and testing is performed against all of them in parallel. A score fusion approach with two thresholds, a minimum value and a minimum difference, is applied to the scores to produce the final result. Several standard acoustic pre-processing routines, including voice activity detection algorithms and an overlapped speech remover, are also included to make the system deployable. Finally, to measure the improvement gained from reverberation adjustment, we evaluate our system on two speech databases from the literature, one containing 251 people and the other covering four kinds of emotions, and we also perform an in-lab speaking experiment. The evaluation results show that our system identifies distant speakers within 6 meters with more than 90% accuracy when the mood is neutral, and achieves a 10% improvement in distant speaker identification when speakers express other emotions.
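
The two-threshold score fusion mentioned in the abstract could look roughly like the following minimal sketch. The abstract does not specify the fusion rule, so everything here is an assumption made for illustration: the function name `fuse_scores`, the choice to keep each speaker's best score across their reverberation-adjusted models, and the reading of "minimum difference" as the gap between the top two speakers.

```python
from typing import Dict, List, Optional


def fuse_scores(
    scores: Dict[str, List[float]],
    min_value: float,
    min_difference: float,
) -> Optional[str]:
    """Return the identified speaker, or None if the decision is rejected.

    `scores` maps a speaker name to the scores produced by that speaker's
    models (hypothetically, one model per reverberation setting).
    """
    # Assumption: fuse by keeping each speaker's best score across models.
    best = {spk: max(vals) for spk, vals in scores.items() if vals}
    if not best:
        return None

    # Rank speakers from highest to lowest fused score.
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    top_speaker, top_score = ranked[0]

    # Threshold 1: the winning score must reach a minimum value.
    if top_score < min_value:
        return None

    # Threshold 2: the winner must lead the runner-up by a minimum difference.
    if len(ranked) > 1 and top_score - ranked[1][1] < min_difference:
        return None

    return top_speaker
```

Under these assumptions, a call such as `fuse_scores({"alice": [0.2, 0.9], "bob": [0.4, 0.5]}, min_value=0.6, min_difference=0.2)` accepts "alice", while a test utterance whose top two speakers score too closely, or whose best score is too low, is rejected as unidentified.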

Degree:
MS (Master of Science)
Keywords:
speaker identification
Language:
English
Rights:
All rights reserved (no additional license for public reuse)
Issued Date:
2018/07/24