Boosting Objective Scores of a Speech Enhancement Model by MetricGAN Post-processing

Document pages: 5 pages

Abstract: The Transformer architecture has demonstrated superior performance compared to recurrent neural networks in many natural language processing applications. Therefore, our study applies a modified Transformer to a speech enhancement task. Specifically, positional encoding in the Transformer may not be necessary for speech enhancement, and hence it is replaced by convolutional layers. To further improve the perceptual evaluation of speech quality (PESQ) scores of the enhanced speech, the L1 pre-trained Transformer is fine-tuned using a MetricGAN framework. The proposed MetricGAN can be treated as a general post-processing module to further boost the objective scores of interest. The experiments were conducted using the data sets provided by the organizer of the Deep Noise Suppression (DNS) challenge. Experimental results demonstrated that the proposed system outperformed the challenge baseline by a large margin in both subjective and objective evaluations.
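
To make the fine-tuning idea concrete, the following is a minimal sketch of the kind of least-squares objectives commonly used in MetricGAN-style training. It is not taken from the paper: the function names, the PESQ normalization to [0, 1], and the target score of 1.0 are assumptions chosen for illustration. The discriminator is assumed to act as a learned surrogate of the metric, and the generator (here, the pre-trained enhancement model) is pushed toward outputs that the surrogate scores highly.

```python
# Illustrative sketch only; names and constants are assumptions, not the paper's code.

def normalize_pesq(pesq):
    """Map a raw PESQ score from the assumed range [-0.5, 4.5] to [0, 1]."""
    return (pesq + 0.5) / 5.0


def discriminator_loss(d_clean, d_enhanced, q_enhanced):
    """Squared error that trains the surrogate D to mimic the metric.

    d_clean:    D's output for the (clean, clean) pair; the target is 1.0.
    d_enhanced: D's output for the (enhanced, clean) pair.
    q_enhanced: true normalized metric score of the enhanced speech.
    """
    return (d_clean - 1.0) ** 2 + (d_enhanced - q_enhanced) ** 2


def generator_loss(d_enhanced, target=1.0):
    """Squared error that pushes the enhanced speech toward the target score."""
    return (d_enhanced - target) ** 2


if __name__ == "__main__":
    q = normalize_pesq(2.0)  # hypothetical PESQ of an enhanced utterance
    print(discriminator_loss(d_clean=0.9, d_enhanced=0.4, q_enhanced=q))
    print(generator_loss(d_enhanced=0.4))
```

In a real system these scalar losses would be computed on batches of network outputs and minimized alternately, with the true metric evaluated on each newly enhanced batch to supervise the discriminator.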
