Class LM and word mapping for contextual biasing in End-to-End ASR

  • king
  • 20210505

Document pages: 5 pages

Abstract: In recent years, all-neural, end-to-end (E2E) ASR systems have gained rapid interest in the speech recognition community. They convert speech input to text units in a single trainable neural network model. In ASR, many utterances contain rich named entities. Such named entities may be user- or location-specific, and they are not seen during training. A single model makes it inflexible to utilize dynamic contextual information during inference. In this paper, we propose to train a context-aware E2E model and allow the beam search to traverse into the context FST during inference. We also propose a simple method to adjust the cost discrepancy between the context FST and the base model. This algorithm is able to reduce the named entity utterance WER by 57% with little accuracy degradation on regular utterances. Although an E2E model does not need a pronunciation dictionary, it is interesting to make use of existing pronunciation knowledge to improve accuracy. In this paper, we propose an algorithm to map rare entity words to common words via pronunciation and treat the mapped words as an alternative form of the original word during recognition. This algorithm further reduces the WER on the named entity utterances by another 31%.
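
The abstract does not say how rare entity words are matched to common words, so the following is only a minimal sketch of one plausible reading: pick the common word whose phoneme sequence is closest to the rare word's under edit distance. The function names, the toy lexicon, and the ARPAbet-style phoneme symbols are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if pa == pb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def map_to_common_word(rare_pron: Sequence[str],
                       common_lexicon: Dict[str, List[str]]) -> str:
    """Return the common word with the closest pronunciation.

    Assumption: nearest-neighbour search by phoneme edit distance; the
    paper may use a different matching criterion. Ties are broken
    alphabetically so the result is deterministic.
    """
    return min(
        sorted(common_lexicon),
        key=lambda w: phoneme_edit_distance(rare_pron, common_lexicon[w]),
    )


# Hypothetical toy lexicon of common words and their pronunciations.
COMMON_LEXICON: Dict[str, List[str]] = {
    "kyle": ["K", "AY", "L"],
    "call": ["K", "AO", "L"],
    "mile": ["M", "AY", "L"],
}

# A made-up rare entity pronunciation to map onto a common word.
mapped = map_to_common_word(["K", "AY", "AH", "L"], COMMON_LEXICON)
```

In a recognizer following the abstract's description, the mapped word would then be accepted as an alternative form of the original entity during decoding; how that is wired into the context FST is not specified here.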
