Data augmentation versus noise compensation for x-vector speaker recognition systems in noisy environments

Document pages: 5 pages

Abstract: The explosion of available speech data and new speaker modeling methods based on deep neural networks (DNN) have made it possible to develop more robust speaker recognition systems. Among DNN speaker modeling techniques, the x-vector system has shown a degree of robustness in noisy environments. Previous studies suggest that increasing the number of speakers in the training data and using data augmentation yield more robust speaker recognition systems in noisy environments. In this work, we ask whether explicit noise compensation techniques remain effective despite the general noise robustness of these systems. We use two different x-vector networks: the first is trained on Voxceleb1 (Protocol1), and the second is trained on Voxceleb1+Voxceleb2 (Protocol2). We propose adding a denoising x-vector subsystem before scoring. Experimental results show that the x-vector system used in Protocol2 is more robust than the one used in Protocol1. Despite this observation, we show that explicit noise compensation gives almost the same relative EER gain in both protocols. For example, in Protocol2, denoising techniques improve the EER by 21% to 66%.
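
The abstract reports results as a relative EER gain. As a minimal sketch, assuming the usual definition of relative improvement (the drop in EER divided by the baseline EER), the figure could be computed as follows. The function name and the EER values are hypothetical and are not taken from the paper.

```python
def relative_eer_gain(eer_baseline: float, eer_denoised: float) -> float:
    """Return the relative EER improvement in percent.

    Assumed definition: 100 * (baseline - denoised) / baseline.
    """
    if eer_baseline <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (eer_baseline - eer_denoised) / eer_baseline


# Hypothetical EER values (in %), chosen only for illustration.
example_gain = relative_eer_gain(10.0, 7.5)
print(f"relative gain: {example_gain:.1f}%")
```

Under this definition, a system whose absolute EER is already low can show the same relative gain as a weaker one, which is consistent with the abstract's observation that both protocols benefit to a similar relative degree.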
