
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech

Date: 2021-05-06


Document pages: 5 pages

Abstract: We introduce an approach to multilingual speech synthesis that uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous approaches. Our model is based on Tacotron 2 with a fully convolutional input text encoder whose weights are predicted by a separate parameter generator network. To boost voice cloning, the model uses an adversarial speaker classifier with a gradient reversal layer that removes speaker-specific information from the encoder. We arranged two experiments to compare our model with baselines using various levels of cross-lingual parameter sharing, in order to evaluate: (1) stability and performance when training on low amounts of data, and (2) pronunciation accuracy and voice quality of code-switching synthesis. For training, we used the CSS10 dataset and our new small dataset based on Common Voice recordings in five languages. Our model is shown to share information effectively across languages and, according to a subjective evaluation test, it produces more natural and accurate code-switching speech than the baselines.
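To make the two key architectural ideas in the abstract concrete, here is a minimal numpy sketch of (a) contextual parameter generation, where a separate generator network predicts the weights of an encoder convolution from a language embedding, and (b) the gradient-reversal rule used by the adversarial speaker classifier. All sizes, names, and the linear form of the generator are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (chosen for illustration, not from the paper)
EMB = 8              # language-embedding dimension
IN, OUT, K = 16, 16, 5   # conv in/out channels and kernel width

# Parameter generator: here just a linear map from the language
# embedding to the flattened weights of one 1-D conv layer of the
# text encoder. The encoder's conv weights are thus *outputs* of the
# generator, not independently trained parameters.
G = rng.normal(0.0, 0.02, size=(OUT * IN * K, EMB))

def generate_conv_weights(lang_emb):
    """Contextual parameter generation for one conv layer."""
    return (G @ lang_emb).reshape(OUT, IN, K)

def conv1d(x, w):
    """'same'-padded 1-D convolution; x: (IN, T), w: (OUT, IN, K)."""
    T = x.shape[1]
    pad = w.shape[2] // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    y = np.zeros((w.shape[0], T))
    for t in range(T):
        y[:, t] = np.einsum('oik,ik->o', w, xp[:, t:t + w.shape[2]])
    return y

# Gradient reversal layer: identity in the forward pass, gradient
# scaled by -lambda in the backward pass, so the encoder is trained
# to *hurt* the speaker classifier and sheds speaker information.
def grad_reverse_forward(x):
    return x

def grad_reverse_backward(grad_out, lam=1.0):
    return -lam * grad_out

# Toy usage: encode a character sequence with language-conditioned weights.
lang_emb = rng.normal(size=EMB)          # embedding of one language
w = generate_conv_weights(lang_emb)      # per-language conv weights
x = rng.normal(size=(IN, 40))            # toy character-embedding sequence
h = conv1d(x, w)                         # encoder features, shape (OUT, 40)
```

In a full implementation the generator would feed every encoder layer and be trained jointly with the synthesizer, but the sketch shows the core mechanism: changing only the language embedding changes the encoder's effective weights.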
