Comparing Representations for Audio Synthesis Using Generative Adversarial Networks

Document pages: 5 pages

Abstract: In this paper, we compare different audio signal representations, including the raw audio waveform and a variety of time-frequency representations, for the task of audio synthesis with Generative Adversarial Networks (GANs). We conduct the experiments on a subset of the NSynth dataset. The architecture follows the benchmark Progressive Growing Wasserstein GAN. We perform experiments both in a fully unconditional manner and with the network conditioned on pitch information. We quantitatively evaluate the generated material using standard metrics for assessing generative models, and compare training and sampling times. We show that the complex-valued Short-Time Fourier Transform (STFT), as well as its magnitude and Instantaneous Frequency, achieve the best results and yield fast generation and inversion times. The code for feature extraction, training and evaluating the model is available online.
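
To make the magnitude and Instantaneous Frequency representation concrete, here is a minimal NumPy sketch. It is not the authors' code. It assumes the common definition of instantaneous frequency as the time difference of the unwrapped STFT phase. The function names (stft, mag_if, mag_if_to_complex), the 16 kHz sample rate, the FFT size, and the hop size are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Hann-windowed STFT. Returns a complex array of shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def mag_if(spec, eps=1e-6):
    """Map a complex STFT to (log-magnitude, instantaneous frequency).

    Assumption: IF is the finite difference along time of the unwrapped phase.
    The first frame keeps its absolute phase so the mapping stays invertible.
    """
    log_mag = np.log(np.abs(spec) + eps)
    phase = np.unwrap(np.angle(spec), axis=0)      # unwrap along the time axis
    inst_freq = np.diff(phase, axis=0, prepend=0)  # row 0 holds the initial phase
    return log_mag, inst_freq

def mag_if_to_complex(log_mag, inst_freq, eps=1e-6):
    """Invert mag_if: integrate IF back to phase and recombine with the magnitude."""
    phase = np.cumsum(inst_freq, axis=0)
    return (np.exp(log_mag) - eps) * np.exp(1j * phase)

# Hypothetical input: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr
x = 0.5 * np.sin(2 * np.pi * 440.0 * t)

spec = stft(x)
log_mag, inst_freq = mag_if(spec)
spec_rec = mag_if_to_complex(log_mag, inst_freq)
```

In this sketch both features have the same shape as the STFT, so they can be stacked as two channels of a single image-like array. Recovering the complex STFT needs only a cumulative sum and an exponential, with no iterative phase estimation.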
