Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition

Document pages: 5 pages

Abstract: Code-switching (CS) occurs when a speaker alternates words of two or more languages within a single sentence or across sentences. Automatic speech recognition (ASR) of CS speech has to deal with two or more languages at the same time. In this study, we propose a Transformer-based architecture with two symmetric language-specific encoders that capture the individual language attributes and improve the acoustic representation of each language. These representations are combined using a language-specific multi-head attention mechanism in the decoder module. Each encoder and its corresponding attention module in the decoder are pre-trained on a large monolingual corpus to alleviate the impact of limited CS training data. We call such a network a multi-encoder-decoder (MED) architecture. Experiments on the SEAME corpus show that the proposed MED architecture achieves 10.2% and 10.8% relative error rate reduction on the CS evaluation sets with Mandarin and English as the matrix language, respectively.
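The data flow described in the abstract, two language-specific encoders whose outputs are each attended to by a matching attention module in the decoder, can be illustrated with a small toy sketch. This is not the authors' implementation. It uses single-head attention instead of multi-head attention, a trivial scaling function as a stand-in for each encoder, and it assumes the two attention outputs are merged by averaging, which the abstract does not specify. All function names (`encode`, `attend`, `med_cross_attention`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode(frames, scale):
    """Hypothetical stand-in for a language-specific encoder.

    A real encoder would be a stack of Transformer layers; here each
    "encoder" just scales the input features by its own constant.
    """
    return [[scale * x for x in frame] for frame in frames]


def attend(query, memory):
    """Single-head scaled dot-product attention of one query over a memory."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, frame)) / math.sqrt(dim)
        for frame in memory
    ]
    weights = softmax(scores)
    return [
        sum(w * frame[i] for w, frame in zip(weights, memory))
        for i in range(dim)
    ]


def med_cross_attention(query, mandarin_memory, english_memory):
    """Attend to each encoder output separately, then merge the contexts.

    Averaging is an assumption of this sketch, not a detail taken
    from the abstract.
    """
    ctx_zh = attend(query, mandarin_memory)
    ctx_en = attend(query, english_memory)
    return [(a + b) / 2 for a, b in zip(ctx_zh, ctx_en)]


# Toy acoustic frames (3 frames, 2 features) and one decoder query.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, -0.5]

mandarin_memory = encode(frames, 1.0)  # hypothetical Mandarin encoder
english_memory = encode(frames, 0.5)   # hypothetical English encoder
context = med_cross_attention(query, mandarin_memory, english_memory)
```

Keeping the two attention paths separate is what would allow each encoder and its attention module to be pre-trained on monolingual data before the combined model is trained on code-switching speech.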
