Audeo: Audio Generation for a Silent Performance Video

Document pages: 12 pages

Abstract: We present a novel system that takes as input video frames of a musician playing the piano and generates the music for that video. Generating music from visual cues is a challenging problem, and it is not clear whether it is an attainable goal at all. Our main aim in this work is to explore the plausibility of such a transformation and to identify cues and components able to carry the association of sounds with visual events. To achieve the transformation, we built a full pipeline named Audeo containing three components. We first translate the video frames of the keyboard and the musician's hand movements into a raw mechanical symbolic musical representation, the Piano-Roll (Roll), which represents, for each video frame, the keys pressed at that time step. We then adapt the Roll to be amenable to audio synthesis by including temporal correlations. This step turns out to be critical for meaningful audio generation. As a last step, we implement MIDI synthesizers to generate realistic music. Audeo converts video to audio smoothly and clearly with only a few setup constraints. We evaluate Audeo on "in the wild" piano performance videos and find that the generated music is of reasonable audio quality and can be successfully recognized with high precision by popular music identification software.
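
To make the Piano-Roll representation concrete, here is a minimal sketch of how a per-frame Roll could be turned into timed note events of the kind a MIDI synthesizer consumes. It is not the authors' implementation. The function name `roll_to_notes`, the default frame rate of 25 fps, and the mapping of the first key to MIDI pitch 21 (A0 on an 88-key piano) are assumptions chosen for illustration. The sketch also skips the temporal-correlation step that the abstract describes as critical.

```python
from typing import List, Tuple

# A note event: (MIDI pitch, onset in seconds, offset in seconds).
Note = Tuple[int, float, float]


def roll_to_notes(
    roll: List[List[int]],
    fps: float = 25.0,       # assumed video frame rate
    lowest_pitch: int = 21,  # assumed MIDI pitch of the first key (A0)
) -> List[Note]:
    """Convert a per-frame binary Piano-Roll into note events.

    roll[t][k] is 1 if key k is pressed in frame t, else 0.
    Consecutive pressed frames of the same key are merged into one note.
    """
    notes: List[Note] = []
    if not roll:
        return notes

    num_frames = len(roll)
    num_keys = len(roll[0])

    for key in range(num_keys):
        onset = None  # frame index where the current note started
        for t in range(num_frames):
            pressed = bool(roll[t][key])
            if pressed and onset is None:
                onset = t  # key goes down: start a note
            elif not pressed and onset is not None:
                # key goes up: close the note at the current frame
                notes.append((lowest_pitch + key, onset / fps, t / fps))
                onset = None
        if onset is not None:
            # key still held in the last frame: close at end of the clip
            notes.append((lowest_pitch + key, onset / fps, num_frames / fps))

    # Order by onset time, then pitch, for a stable event list.
    notes.sort(key=lambda n: (n[1], n[0]))
    return notes
```

For instance, with three keys, four frames, and `fps=2.0`, the Roll `[[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 1, 0]]` yields one note on pitch 21 from 0.0 s to 1.0 s and one on pitch 22 from 0.5 s to 2.0 s. A raw frame-by-frame Roll like this carries no velocity or sustain information, which is one reason a further adaptation step is needed before synthesis.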
