Transformer based unsupervised pre-training for acoustic representation learning

  • king
  • 20210505

Document pages: 5 pages

Abstract: Recently, a variety of acoustic tasks and related applications have arisen. For many acoustic tasks, the amount of labeled data may be limited. To handle this problem, we propose an unsupervised pre-training method that uses a Transformer-based encoder to learn a general and robust high-level representation for all acoustic tasks. Experiments were conducted on three kinds of acoustic tasks: speech emotion recognition, sound event detection, and speech translation. All experiments show that pre-training on a task's own training data significantly improves performance. With a larger pre-training set combining the MuST-C, Librispeech, and ESC-US datasets, the UAR for speech emotion recognition improves by a further 4.3 absolute on the IEMOCAP dataset. For sound event detection, the F1 score improves by a further 1.5 absolute on the DCASE2018 Task 5 development set and 2.1 on the evaluation set. For speech translation, the BLEU score improves by a further 12.2 relative on the En-De dataset and 8.4 on the En-Fr dataset.
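
The abstract reports some gains as absolute (UAR, F1) and others as relative (BLEU). The short sketch below illustrates the difference between the two conventions. The baseline scores are hypothetical placeholders chosen for illustration, not figures from the paper, and reading the relative gains as percentages is an assumption.

```python
def absolute_gain(baseline: float, improved: float) -> float:
    """Absolute gain: the plain difference between two scores."""
    return improved - baseline


def relative_gain(baseline: float, improved: float) -> float:
    """Relative gain: the difference as a percentage of the baseline."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical baselines, NOT taken from the paper.
uar_baseline = 60.0
uar_improved = uar_baseline + 4.3            # an absolute gain of 4.3 points

bleu_baseline = 20.0
bleu_improved = bleu_baseline * (1 + 0.122)  # a relative gain of 12.2 percent

print(round(absolute_gain(uar_baseline, uar_improved), 2))
print(round(relative_gain(bleu_baseline, bleu_improved), 2))
```

Under these assumed baselines, a 12.2 relative gain in BLEU corresponds to a much smaller absolute change than the number suggests, which is why the two kinds of figures should not be compared directly.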
