
Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset


Abstract: Automatic speech-based affect recognition of individuals in dyadic conversation is a challenging task, in part because of its heavy reliance on manual pre-processing. Traditional approaches frequently require hand-crafted speech features and segmentation of speaker turns. In this work, we design end-to-end deep learning methods to recognize each person's affective expression in an audio stream with two speakers, automatically discovering features and time regions relevant to the target speaker's affect. We integrate a local attention mechanism into the end-to-end architecture and compare the performance of three attention implementations -- one mean pooling and two weighted pooling methods. Our results show that the proposed weighted-pooling attention solutions are able to learn to focus on the regions containing the target speaker's affective information and successfully extract the individual's valence and arousal intensity. Here we introduce and use a "dyadic affect in multimodal interaction - parent to child" (DAMI-P2C) dataset collected in a study of 34 families, where a parent and a child (3-7 years old) engage in reading storybooks together. In contrast to existing public datasets for affect recognition, each instance for both speakers in the DAMI-P2C dataset is annotated for the perceived affect by three labelers. To encourage more research on the challenging task of multi-speaker affect sensing, we make the annotated DAMI-P2C dataset publicly available, including acoustic features of the dyads' raw audios, affect annotations, and a diverse set of developmental, social, and demographic profiles of each dyad.
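To make the pooling comparison in the abstract concrete, the sketch below contrasts uniform mean pooling with a learned weighted-pooling (attention) layer over frame-level acoustic embeddings, feeding a small head that predicts valence and arousal. This is a minimal illustrative sketch, not the authors' released implementation: the GRU encoder, feature dimensions, and module names are assumptions made only for the example.

```python
# Minimal sketch (assumed design, not the paper's code): mean pooling vs.
# learned weighted pooling over frame-level acoustic embeddings.
import torch
import torch.nn as nn


class MeanPooling(nn.Module):
    """Average frame embeddings uniformly over time."""
    def forward(self, frames):            # frames: (batch, time, dim)
        return frames.mean(dim=1)          # -> (batch, dim)


class AttentionPooling(nn.Module):
    """Learn per-frame weights so frames carrying the target speaker's
    affective cues contribute more to the clip-level embedding."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)     # one scalar relevance score per frame

    def forward(self, frames):             # frames: (batch, time, dim)
        weights = torch.softmax(self.score(frames), dim=1)   # (batch, time, 1)
        return (weights * frames).sum(dim=1)                  # (batch, dim)


class AffectRegressor(nn.Module):
    """Encoder over the two-speaker audio features, a pooling layer,
    then a head predicting valence and arousal intensity."""
    def __init__(self, in_dim=40, hidden=128, pooling="attention"):
        super().__init__()
        self.encoder = nn.GRU(in_dim, hidden, batch_first=True)
        self.pool = AttentionPooling(hidden) if pooling == "attention" else MeanPooling()
        self.head = nn.Linear(hidden, 2)   # outputs: [valence, arousal]

    def forward(self, features):           # features: (batch, time, in_dim)
        encoded, _ = self.encoder(features)
        return self.head(self.pool(encoded))


if __name__ == "__main__":
    model = AffectRegressor(pooling="attention")
    dummy = torch.randn(4, 500, 40)        # e.g. 4 clips of 500 frames of 40-d features
    print(model(dummy).shape)               # torch.Size([4, 2])
```

Swapping `pooling="attention"` for `pooling="mean"` reproduces the baseline comparison: the attention variant can down-weight frames dominated by the non-target speaker, whereas mean pooling treats every frame equally.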
