
End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020


Document pages: 9 pages

Abstract: This paper describes FBK's participation in the IWSLT 2020 offline speech translation (ST) task. The task evaluates systems' ability to translate English TED talks audio into German texts. The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation. Participants can decide whether to work on custom segmentation or not. We used the provided segmentation. Our system is an end-to-end model based on an adaptation of the Transformer for speech data. Its training process is the main focus of this paper and it is based on: i) transfer learning (ASR pretraining and knowledge distillation), ii) data augmentation (SpecAugment, time stretch and synthetic data), iii) combining synthetic and real data marked as different domains, and iv) multi-task learning using the CTC loss. Finally, after the training with word-level knowledge distillation is complete, our ST models are fine-tuned using label-smoothed cross entropy. Our best model scored 29 BLEU on the MuST-C En-De test set, which is an excellent result compared to recent papers, and 23.7 BLEU on the same data segmented with VAD, showing the need for researching solutions addressing this specific data condition.
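
The abstract names two training objectives without giving formulas: word-level knowledge distillation, followed by fine-tuning with label-smoothed cross entropy. As a rough illustration only, the sketch below computes both losses for a single target position using their generic textbook definitions. The function names, the uniform smoothing variant, and the default `eps=0.1` are assumptions made for this example and are not taken from the paper.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def word_level_kd_loss(student_logits, teacher_probs):
    """Cross entropy between the teacher's output distribution and the
    student's distribution at one target position (hypothetical helper)."""
    logp = log_softmax(student_logits)
    return -sum(q * lp for q, lp in zip(teacher_probs, logp))

def label_smoothed_ce(student_logits, target, eps=0.1):
    """Cross entropy against a smoothed one-hot target: (1 - eps) of the
    weight goes to the reference token, eps is spread uniformly over the
    vocabulary. eps=0.1 is an assumed value, not one from the paper."""
    logp = log_softmax(student_logits)
    nll = -logp[target]                 # loss on the reference token
    uniform = -sum(logp) / len(logp)    # loss against a uniform distribution
    return (1.0 - eps) * nll + eps * uniform

# Toy vocabulary of 4 tokens at one decoding step.
student_logits = [2.0, 0.5, -1.0, 0.0]
teacher_probs = [0.7, 0.2, 0.05, 0.05]  # soft targets from a teacher MT model
kd = word_level_kd_loss(student_logits, teacher_probs)
ls = label_smoothed_ce(student_logits, target=0)
```

In the distillation phase the student is pushed toward the teacher's full distribution at every position, whereas in fine-tuning the target is the reference token with a small amount of probability mass spread over the rest of the vocabulary. With `eps=0` and a one-hot teacher the two losses coincide.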
