Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer

Document pages: 5 pages

Abstract: The demand for fast and accurate incremental speech recognition increases as the applications of automatic speech recognition (ASR) proliferate. Incremental speech recognizers output chunks of partially recognized words while the user is still talking. Partial results can be revised before the ASR finalizes its hypothesis, causing instability issues. We analyze the quality and stability of on-device streaming end-to-end (E2E) ASR models. We first introduce a novel set of metrics that quantify the instability at word and segment levels. We study the impact of several model training techniques that improve E2E model qualities but degrade model stability. We categorize the causes of instability and explore various solutions to mitigate them in a streaming E2E ASR system.

Index Terms: ASR, stability, end-to-end, text normalization, on-device, RNN-T
