Intra-class variation reduction of speaker representation in disentanglement framework

  • 20210506

Document pages: 5 pages

Abstract: In this paper, we propose an effective training strategy to extract robust speaker representations from a speech signal. One of the key challenges in speaker recognition tasks is to learn latent representations or embeddings containing solely speaker characteristic information in order to be robust in terms of intra-speaker variations. By modifying the network architecture to generate both speaker-related and speaker-unrelated representations, we exploit a learning criterion which minimizes the mutual information between these disentangled embeddings. We also introduce an identity change loss criterion which utilizes a reconstruction error to different utterances spoken by the same speaker. Since the proposed criteria reduce the variation of speaker characteristics caused by changes in background environment or spoken content, the resulting embeddings of each speaker become more consistent. The effectiveness of the proposed method is demonstrated through two tasks: disentanglement performance, and improvement of speaker recognition accuracy compared to the baseline model on a benchmark dataset, VoxCeleb1. Ablation studies also show the impact of each criterion on overall performance.
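
The abstract does not give a formula for the identity change loss, so the following is only a minimal sketch of one plausible reading: take the speaker embedding from one utterance, combine it with the speaker-unrelated embedding of a different utterance by the same speaker, and penalize the reconstruction error against that second utterance. The names `identity_change_loss` and `toy_decode`, the concatenation of the two embeddings, and the mean squared error are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, List, Sequence


def identity_change_loss(
    decode: Callable[[List[float]], List[float]],
    speaker_emb_a: Sequence[float],
    residual_emb_b: Sequence[float],
    target_b: Sequence[float],
) -> float:
    """Mean squared reconstruction error after swapping the speaker embedding.

    Assumption: utterances A and B come from the same speaker, so replacing
    B's speaker embedding with A's should still reconstruct B.
    """
    # Hypothetical combination step: concatenate the two embeddings.
    recon_b = decode(list(speaker_emb_a) + list(residual_emb_b))
    if len(recon_b) != len(target_b):
        raise ValueError("reconstruction and target must have the same length")
    return sum((r - t) ** 2 for r, t in zip(recon_b, target_b)) / len(target_b)


def toy_decode(z: List[float]) -> List[float]:
    """Toy stand-in for a decoder: adds the speaker half to the residual half."""
    half = len(z) // 2
    return [s + r for s, r in zip(z[:half], z[half:])]


# Consistent speaker embedding across utterances: no reconstruction error.
consistent = identity_change_loss(toy_decode, [1.0, 2.0], [0.5, 0.5], [1.5, 2.5])

# Speaker embedding that drifted between utterances: the loss becomes positive.
drifted = identity_change_loss(toy_decode, [2.0, 2.0], [0.5, 0.5], [1.5, 2.5])
```

Under this reading, the loss is zero only when the speaker embedding stays the same across utterances of one speaker, which matches the stated goal of reducing intra-speaker variation. A real system would replace `toy_decode` with a trained decoder network and operate on acoustic features.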
