
Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition

  • king
  • 20210506


Document pages: 5 pages

Abstract: Categorical speech emotion recognition is typically performed as a sequence-to-label problem, i.e., to determine the discrete emotion label of the input utterance as a whole. One of the main challenges in practice is that most of the existing emotion corpora do not give ground truth labels for each segment; instead, we only have labels for whole utterances. To extract segment-level emotional information from such weakly labeled emotion corpora, we propose using multiple instance learning (MIL) to learn segment embeddings in a weakly supervised manner. Also, for a sufficiently long utterance, not all of the segments contain relevant emotional information. In this regard, three attention-based neural network models are then applied to the learned segment embeddings to attend to the most salient parts of a speech utterance. Experiments on the CASIA corpus and the IEMOCAP database show results that are better than or highly competitive with other state-of-the-art approaches.
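
The abstract does not specify the three attention models, so the following is only a generic sketch of the underlying idea: score each segment embedding, normalize the scores with a softmax, and pool the segments into one utterance-level vector. The names `attention_pool` and `query`, the dot-product scoring, and the toy numbers are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segments, query):
    """Collapse segment embeddings into one utterance embedding.

    segments: list of equal-length float lists, one per segment
              (stand-ins for embeddings learned with MIL).
    query:    float list of the same length; a hypothetical learned
              vector used to score how salient each segment is.
    Returns (weights, pooled): one attention weight per segment and
    the attention-weighted sum of the segment embeddings.
    """
    # Dot-product relevance score for each segment (an assumed scoring rule).
    scores = [sum(s * q for s, q in zip(seg, query)) for seg in segments]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum over segments, computed one dimension at a time.
    pooled = [
        sum(w * seg[d] for w, seg in zip(weights, segments))
        for d in range(dim)
    ]
    return weights, pooled


# Toy utterance with three 2-dimensional segment embeddings (made-up values).
segments = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
weights, pooled = attention_pool(segments, query)
```

In this toy case the second segment has the highest score, so it receives the largest weight and dominates the pooled vector. A classifier predicting the utterance-level emotion label would then operate on `pooled`.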
