
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation

  • king
  • 20210506


Document pages: 20 pages

Abstract: Stereophonic audio is an indispensable ingredient to enhance human auditory experience. Recent research has explored the usage of visual information as guidance to generate binaural or ambisonic audio from mono ones with stereo supervision. However, this fully supervised paradigm suffers from an inherent drawback: the recording of stereophonic audio usually requires delicate devices that are expensive for wide accessibility. To overcome this challenge, we propose to leverage the vastly available mono data to facilitate the generation of stereophonic audio. Our key observation is that the task of visually indicated audio separation also maps independent audios to their corresponding visual positions, which shares a similar objective with stereophonic audio generation. We integrate both stereo generation and source separation into a unified framework, Sep-Stereo, by considering source separation as a particular type of audio spatialization. Specifically, a novel associative pyramid network architecture is carefully designed for audio-visual feature fusion. Extensive experiments demonstrate that our framework can improve the stereophonic audio generation results while performing accurate sound separation with a shared backbone.
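
One way to read the idea of source separation as a particular type of audio spatialization: if two independent mono sources are assumed to sit at the far left and far right of the view, they can be treated as the two channels of an extreme stereo recording. The NumPy sketch below illustrates that reading only. The function names and the sum/difference formulation are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def pseudo_stereo_from_mono(source_a, source_b):
    """Hypothetical helper: treat two mono sources as left/right channels."""
    a = np.asarray(source_a, dtype=np.float64)
    b = np.asarray(source_b, dtype=np.float64)
    n = min(len(a), len(b))          # trim both sources to a common length
    left, right = a[:n], b[:n]
    mono_mix = left + right          # what a single-channel recording would contain
    diff = left - right              # channel difference (assumed training target)
    return mono_mix, left, right, diff

def recover_channels(mono_mix, diff):
    """Rebuild left/right channels from the mono mix and a channel difference."""
    left = (mono_mix + diff) / 2
    right = (mono_mix - diff) / 2
    return left, right
```

Under this assumption, recovering the two channels from the mix is the same operation as separating the two sources, which is why abundant mono recordings could serve as extra training data for a stereo model.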
