Emotion Recognition in Audio and Video Using Deep Neural Networks

Document pages: 9 pages

Abstract: Humans are able to comprehend information from multiple domains, e.g., speech, text, and vision. With the advancement of deep learning technology, there has been significant improvement in speech recognition. Recognizing emotion from speech is an important aspect of this, and with deep learning technology emotion recognition has improved in accuracy and latency. There are still many challenges to improving accuracy. In this work, we explore different neural networks to improve the accuracy of emotion recognition. Among the architectures explored, we find that a (CNN+RNN) + 3DCNN multi-model architecture, which processes audio spectrograms and the corresponding video frames, gives an emotion prediction accuracy of 54.0% among 4 emotions and 71.75% among 3 emotions on the IEMOCAP [2] dataset.
