
Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

  • king
  • 20210505


Document pages: 15 pages

Abstract: In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) waveform generative model, which applies a quasi-periodic (QP) structure to a parallel WaveGAN (PWG) model using pitch-dependent dilated convolution networks (PDCNNs). PWG is a small-footprint GAN-based raw waveform generative model whose generation is much faster than real time because of its compact model and its non-autoregressive (non-AR) and non-causal mechanisms. Although PWG achieves high-fidelity speech generation, its generic and simple network architecture lacks pitch controllability for an unseen auxiliary fundamental frequency ($F_{0}$) feature such as a scaled $F_{0}$. To improve the pitch controllability and speech modeling capability, we apply a QP structure with PDCNNs to PWG, which introduces pitch information to the network by dynamically changing the network architecture according to the auxiliary $F_{0}$ feature. Both objective and subjective experimental results show that QPPWG outperforms PWG when the auxiliary $F_{0}$ feature is scaled. Moreover, analyses of the intermediate outputs of QPPWG show that it is more tractable and interpretable: the cascaded fixed and adaptive blocks of the QP structure model spectral and excitation-like signals, respectively.
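The abstract does not state how the dilation follows the auxiliary $F_{0}$ feature, so the following is only a minimal sketch of the idea of a pitch-dependent dilated convolution, not the authors' implementation. It assumes that the dilation at each sample is scaled by `sample_rate / (f0 * dense_factor)`, that unvoiced samples (`f0 <= 0`) fall back to the fixed base dilation, and that the layer is a non-causal 3-tap filter. The names `pitch_dependent_dilations`, `pdcnn_layer`, `dense_factor`, and `base_dilation` are hypothetical and chosen for illustration.

```python
def pitch_dependent_dilations(f0_hz, sample_rate=24000, dense_factor=4,
                              base_dilation=1):
    """Return one dilation size per sample (illustrative assumption).

    The scale factor sample_rate / (f0 * dense_factor) is an assumed rule:
    a lower pitch gives a longer period and therefore a larger dilation.
    """
    dilations = []
    for f0 in f0_hz:
        if f0 <= 0:
            # Unvoiced sample: keep the fixed dilation (assumption).
            dilations.append(base_dilation)
            continue
        scale = sample_rate / (f0 * dense_factor)
        dilations.append(max(1, round(scale)) * base_dilation)
    return dilations


def pdcnn_layer(x, f0_hz, w_past, w_cur, w_future, **kwargs):
    """Non-causal 3-tap convolution whose tap spacing follows F0 per sample."""
    d = pitch_dependent_dilations(f0_hz, **kwargs)
    n = len(x)
    out = []
    for t in range(n):
        # Taps that fall outside the signal are treated as zero padding.
        past = x[t - d[t]] if t - d[t] >= 0 else 0.0
        future = x[t + d[t]] if t + d[t] < n else 0.0
        out.append(w_past * past + w_cur * x[t] + w_future * future)
    return out
```

Under these assumptions, doubling the auxiliary $F_{0}$ halves the dilation, which is how a scaled $F_{0}$ would change the effective receptive field of the adaptive blocks without retraining.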
