
Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

  • king
  • 20210505


Document pages: 5 pages

Abstract: This paper investigates different trade-offs between the number of model parameters and enhanced speech quality by employing several deep tensor-to-vector regression models for speech enhancement. We find that a hybrid architecture, namely CNN-TT, is capable of maintaining good quality performance with a reduced model parameter size. CNN-TT is composed of several convolutional layers at the bottom for feature extraction to improve speech quality and a tensor-train (TT) output layer on the top to reduce model parameters. We first derive a new upper bound on the generalization power of convolutional neural network (CNN) based vector-to-vector regression models. Then, we provide experimental evidence on the Edinburgh noisy speech corpus to demonstrate that, in single-channel speech enhancement, CNN outperforms DNN at the expense of a small increase in model size. Moreover, CNN-TT slightly outperforms its CNN counterpart while using only 32% of the CNN model parameters, and further performance improvement can be attained if the number of CNN-TT parameters is increased to 44% of the CNN model size. Finally, our experiments on multi-channel speech enhancement with a simulated noisy WSJ0 corpus demonstrate that the proposed hybrid CNN-TT architecture achieves better results than both DNN and CNN models in terms of both enhanced speech quality and parameter size.
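To make the parameter-saving idea behind a tensor-train (TT) output layer concrete, the following is a minimal NumPy sketch of a generic TT-factorized linear map. It is not the paper's implementation: the function names, the mode shapes (4, 8, 8) and the TT-ranks (1, 4, 4, 1) are illustrative assumptions chosen only to show how a dense weight matrix can be replaced by a chain of small cores, and how the two parameter counts compare.

```python
import numpy as np


def tt_param_count(in_modes, out_modes, ranks):
    """Number of weights in a TT-factorized matrix.

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]);
    ranks must have len(in_modes) + 1 entries with ranks[0] == ranks[-1] == 1.
    """
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
               for k in range(len(in_modes)))


def make_tt_cores(in_modes, out_modes, ranks, rng):
    """Random TT cores (hypothetical initialization, for illustration only)."""
    return [0.1 * rng.standard_normal(
                (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
            for k in range(len(in_modes))]


def tt_to_dense(cores):
    """Contract the cores into the full (prod(in_modes), prod(out_modes)) matrix."""
    full = cores[0]                                   # (1, m1, n1, r1)
    for core in cores[1:]:
        # Merge the next core's input/output modes into the running matrix.
        full = np.einsum('aijr,rklb->aikjlb', full, core)
        a, big_m, m, big_n, n, b = full.shape
        full = full.reshape(a, big_m * m, big_n * n, b)
    return full[0, :, :, 0]


def tt_matvec(cores, x):
    """Compute x @ W without ever forming the dense matrix W."""
    # state has shape (outputs produced so far, current rank, remaining inputs)
    state = x.reshape(1, 1, -1)
    for core in cores:
        r, m, n, r_next = core.shape
        out_done = state.shape[0]
        state = state.reshape(out_done, r, m, -1)
        # Sum over the current rank and the current input mode.
        state = np.einsum('ormz,rmnb->onbz', state, core)
        state = state.reshape(out_done * n, r_next, -1)
    return state.reshape(-1)


# Illustrative sizes (assumptions, not taken from the paper).
in_modes, out_modes, ranks = (4, 8, 8), (4, 8, 8), (1, 4, 4, 1)
rng = np.random.default_rng(0)
cores = make_tt_cores(in_modes, out_modes, ranks, rng)

dense_params = int(np.prod(in_modes)) * int(np.prod(out_modes))
tt_params = tt_param_count(in_modes, out_modes, ranks)
print(f"dense: {dense_params} weights, TT: {tt_params} weights")

x = rng.standard_normal(int(np.prod(in_modes)))
print("TT product matches dense product:",
      np.allclose(tt_matvec(cores, x), x @ tt_to_dense(cores)))
```

In this sketch the TT-ranks control the trade-off the abstract describes: larger ranks give the layer more parameters and more capacity, smaller ranks shrink the model. The ratios reported in the abstract (32% and 44% of the CNN size) refer to the authors' full models, not to this toy layer.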
