
Neural PLDA Modeling for End-to-End Speaker Verification

  • king
  • 20210506


Document pages: 5 pages

Abstract: While deep learning models have made significant advances in supervised classification problems, the application of these models to out-of-set verification tasks like speaker recognition has been limited to deriving feature embeddings. The state-of-the-art x-vector PLDA based speaker verification systems use a generative model based on probabilistic linear discriminant analysis (PLDA) for computing the verification score. Recently, we proposed a neural network approach for backend modeling in speaker verification called the neural PLDA (NPLDA), where the likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function and the learnable parameters of the score function are optimized using a verification cost. In this paper, we extend this work to achieve joint optimization of the embedding neural network (x-vector network) with the NPLDA network in an end-to-end (E2E) fashion. The proposed end-to-end model is optimized directly from the acoustic features with a verification cost function, and during testing the model directly outputs the likelihood ratio score. With various experiments using the NIST speaker recognition evaluation (SRE) 2018 and 2019 datasets, we show that the proposed E2E model improves significantly over the x-vector PLDA baseline speaker verification system.
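
The idea of posing a PLDA likelihood ratio as a discriminative similarity function can be illustrated with a small sketch. The abstract does not give the score formula, so the quadratic form below is an assumption: it is the general shape that a two-covariance Gaussian PLDA log-likelihood ratio takes for a pair of embeddings. The names `pairwise_score`, `P`, and `Q` are hypothetical, and the matrices are random placeholders rather than trained parameters.

```python
import numpy as np


def pairwise_score(enroll, test, P, Q):
    """Quadratic similarity between an enrollment and a test embedding.

    Assumed form: enroll^T Q enroll + test^T Q test + 2 * enroll^T P test.
    P and Q are square matrices; in a discriminative backend they would be
    treated as learnable parameters (an assumption for this sketch).
    """
    within = enroll @ Q @ enroll + test @ Q @ test  # per-embedding terms
    cross = 2.0 * (enroll @ P @ test)               # interaction term
    return float(within + cross)


# Placeholder data: random symmetric matrices and random embeddings.
rng = np.random.default_rng(0)
dim = 4
A = rng.standard_normal((dim, dim))
B = rng.standard_normal((dim, dim))
P = (A + A.T) / 2.0  # symmetric, so the score does not depend on argument order
Q = (B + B.T) / 2.0

x = rng.standard_normal(dim)
y = rng.standard_normal(dim)
score_xy = pairwise_score(x, y, P, Q)
score_yx = pairwise_score(y, x, P, Q)
```

Because `P` is symmetric in this sketch, swapping the enrollment and test embeddings leaves the score unchanged. In an end-to-end setup such as the one the abstract describes, the embeddings would come from the x-vector network and the whole pipeline would be trained against a verification cost; neither of those parts is shown here.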
