
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition

  • king
  • 20210506


Document pages: 10 pages

Abstract: Visual emotion expression plays an important role in audiovisual speech communication. In this work, we propose a novel approach to rendering visual emotion expression in speech-driven talking face generation. Specifically, we design an end-to-end talking face generation system that takes a speech utterance, a single face image, and a categorical emotion label as input and renders a talking face video that is in sync with the speech and expresses the conditioned emotion. Objective evaluation of image quality, audiovisual synchronization, and visual emotion expression shows that the proposed system outperforms a state-of-the-art baseline system. Subjective evaluation of visual emotion expression and video realness also demonstrates the superiority of the proposed system. Furthermore, we conduct a pilot study on human emotion recognition of generated videos with mismatched emotions between the audio and visual modalities, and the results show that humans rely on the visual modality more heavily than on the audio modality in this task.
