
Revisiting Representation Learning for Singing Voice Separation with Sinkhorn Distances

  • 20210505


Document pages: 19 pages

Abstract: In this work we present a method for unsupervised learning of audio representations, focused on the task of singing voice separation. We build upon a previously proposed method for learning representations of time-domain music signals with a re-parameterized denoising autoencoder, extending it by using the family of Sinkhorn distances with entropic regularization. We evaluate our method on the freely available MUSDB18 dataset of professionally produced music recordings, and our results show that Sinkhorn distances with a small strength of entropic regularization marginally improve the performance of informed singing voice separation. When the strength of the entropic regularization is increased, the learned representations of the mixture signal consist of almost perfectly additive and distinctly structured sources.
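
For readers unfamiliar with Sinkhorn distances, the following is a minimal NumPy sketch of the generic entropy-regularized optimal transport computation. It is not the paper's training objective or code: the function name `sinkhorn_distance`, the parameter name `reg` (the strength of the entropic regularization), the fixed iteration count, and the toy two-bin histograms are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_distance(a, b, cost, reg, n_iters=200):
    """Entropy-regularized transport cost between histograms a and b.

    a, b : 1-D non-negative arrays that each sum to 1
    cost : 2-D array, cost[i, j] is the cost of moving mass from bin i to bin j
    reg  : strength of the entropic regularization (assumed name)
    """
    K = np.exp(-cost / reg)            # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):           # alternating marginal scaling
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :] # transport plan
    return float(np.sum(plan * cost)), plan

# Toy example (hypothetical data): two identical two-bin histograms.
a = np.array([0.5, 0.5])
b = np.array([0.5, 0.5])
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

d_small, plan_small = sinkhorn_distance(a, b, cost, reg=0.05)
d_large, plan_large = sinkhorn_distance(a, b, cost, reg=1.0)
```

In this toy setting a small `reg` keeps the plan close to the unregularized optimum (almost all mass stays in place), while a larger `reg` spreads mass across bins and raises the transport cost. For very small `reg` the kernel entries underflow, so practical implementations usually work in the log domain.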
