
Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder

  • king
  • 20210506


Document pages: 5 pages

Abstract: In recent works, flow-based neural vocoders have shown significant improvement in real-time speech generation tasks. The sequence of invertible flow operations allows the model to convert samples from a simple distribution into audio samples. However, training a continuous density model on discrete audio data can degrade model performance because of the topological difference between the latent and actual distributions. To resolve this problem, we propose audio dequantization methods in a flow-based neural vocoder for high-fidelity audio generation. Data dequantization is a well-known method in image generation but has not yet been studied in the audio domain. For this reason, we implement various audio dequantization methods in a flow-based neural vocoder and investigate their effect on the generated audio. We conduct various objective performance assessments and a subjective evaluation to show that audio dequantization can improve audio generation quality. In our experiments, audio dequantization produces waveform audio with better harmonic structure and fewer digital artifacts.
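
The abstract does not say which dequantization methods were implemented. As an illustration only, the sketch below shows uniform dequantization, the simplest variant known from image generation, applied to integer PCM audio. The 16-bit depth, the [-1, 1) output range, and the function names `uniform_dequantize` and `requantize` are assumptions made for this example, not details taken from the paper.

```python
import random


def uniform_dequantize(samples, bits=16, rng=None):
    """Spread integer PCM samples over their quantization bins.

    Assumes signed integer samples in [-2**(bits-1), 2**(bits-1) - 1].
    Returns floats in [-1, 1). Hypothetical helper, for illustration only.
    """
    rng = rng or random.Random()
    levels = 2 ** bits          # number of quantization levels
    half = levels // 2          # offset that shifts samples to [0, levels)
    out = []
    for s in samples:
        u = rng.random()        # uniform noise in [0, 1): one bin width
        out.append((s + half + u) / levels * 2.0 - 1.0)
    return out


def requantize(values, bits=16):
    """Floor continuous values in [-1, 1) back to signed integer samples."""
    levels = 2 ** bits
    half = levels // 2
    result = []
    for v in values:
        index = int((v + 1.0) / 2.0 * levels)      # bin index in [0, levels)
        index = min(levels - 1, max(0, index))     # guard against edge rounding
        result.append(index - half)
    return result


if __name__ == "__main__":
    pcm = [-32768, -1, 0, 1, 32767]
    continuous = uniform_dequantize(pcm, rng=random.Random(0))
    print(continuous)
    print(requantize(continuous) == pcm)
```

Because the added noise never exceeds one bin width, flooring recovers the original integers (barring floating-point rounding exactly at a bin edge), so the continuous model sees data with full support while no information about the discrete signal is lost.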
