
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers


Document pages: 5 pages

Abstract: We propose an end-to-end speaker-attributed automatic speech recognition model that unifies speaker counting, speech recognition, and speaker identification on monaural overlapped speech. Our model is built on serialized output training (SOT) with attention-based encoder-decoder, a recently proposed method for recognizing overlapped speech comprising an arbitrary number of speakers. We extend SOT by introducing a speaker inventory as an auxiliary input to produce speaker labels as well as multi-speaker transcriptions. All model parameters are optimized by speaker-attributed maximum mutual information criterion, which represents a joint probability for overlapped speech recognition and speaker identification. Experiments on LibriSpeech corpus show that our proposed method achieves significantly better speaker-attributed word error rate than the baseline that separately performs overlapped speech recognition and speaker identification.
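
To make the serialized-output idea concrete, the following is a minimal sketch of how a multi-speaker training target might be built: the transcriptions of overlapping utterances are concatenated into one token sequence, with a separator marking each speaker change and a parallel list of speaker labels. This is an illustration only, not the authors' implementation. The token symbols `<sc>` and `<eos>`, the function names, the sample utterances, and the choice to order utterances by start time are all assumptions made for this example.

```python
from typing import List, Tuple

SC = "<sc>"    # hypothetical speaker-change token
EOS = "<eos>"  # hypothetical end-of-sequence token


def serialize_targets(
    utterances: List[Tuple[float, str, str]],
) -> Tuple[List[str], List[str]]:
    """Build one serialized target from overlapping utterances.

    Each utterance is (start_time_seconds, speaker_id, transcript).
    Utterances are ordered by start time (an assumption of this sketch),
    joined with SC, and terminated with EOS.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    tokens: List[str] = []
    speakers: List[str] = []
    for index, (_, speaker_id, transcript) in enumerate(ordered):
        if index > 0:
            tokens.append(SC)  # mark the boundary between speakers
        tokens.extend(transcript.split())
        speakers.append(speaker_id)
    tokens.append(EOS)
    return tokens, speakers


def count_speakers(tokens: List[str]) -> int:
    """Infer the number of utterances from the separator tokens."""
    return tokens.count(SC) + 1


tokens, speakers = serialize_targets(
    [(1.2, "spk_b", "good morning"), (0.0, "spk_a", "hello there")]
)
```

In this sketch the speaker count falls out of the output sequence itself (one more than the number of separators), which is one way to read the abstract's claim that counting, recognition, and identification are handled jointly.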
