
Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation

  • king
  • 20210505


Document pages: 5 pages

Abstract: The dominant speech separation models are based on complex recurrent or convolutional neural networks that model speech sequences indirectly, conditioning on context by, for example, passing information through many intermediate states in a recurrent neural network, which leads to suboptimal separation performance. In this paper, we propose a dual-path transformer network (DPTNet) for end-to-end speech separation, which introduces direct context-awareness into the modeling of speech sequences. By introducing an improved transformer, elements in speech sequences can interact directly, which enables DPTNet to model speech sequences with direct context-awareness. The improved transformer in our approach learns the order information of the speech sequences without positional encodings by incorporating a recurrent neural network into the original transformer. In addition, the dual-path structure makes our model efficient for modeling extremely long speech sequences. Extensive experiments on benchmark datasets show that our approach outperforms the current state of the art (20.6 dB SDR on the public WSJ0-2mix corpus).
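
The efficiency claim for the dual-path structure can be illustrated with a small sketch. The abstract does not spell out the segmentation scheme, so everything below is an assumption made for illustration: the helper names `segment` and `inter_views`, the 50% overlap between chunks, and the zero padding of the tail are hypothetical choices, not the authors' implementation. The idea shown is only that a long sequence is cut into short chunks, so that one path models positions within a chunk and the other models the same position across chunks.

```python
import math


def segment(seq, chunk_len):
    """Split seq into chunks of chunk_len with 50% overlap (assumed scheme).

    The tail is zero-padded so that the last chunk is full length.
    """
    assert chunk_len >= 2 and chunk_len % 2 == 0, "chunk_len must be even"
    hop = chunk_len // 2
    n = len(seq)
    # Number of chunks needed to cover the whole sequence.
    n_chunks = 1 if n <= chunk_len else math.ceil((n - chunk_len) / hop) + 1
    padded_len = (n_chunks - 1) * hop + chunk_len
    padded = list(seq) + [0] * (padded_len - n)
    return [padded[i * hop : i * hop + chunk_len] for i in range(n_chunks)]


def inter_views(chunks):
    """Regroup chunks so each view holds one position taken from every chunk."""
    return [list(column) for column in zip(*chunks)]


# Intra-chunk path: each chunk is a short local sequence.
chunks = segment(list(range(10)), chunk_len=4)
# Inter-chunk path: each view strides across all chunks.
views = inter_views(chunks)
```

In this sketch, a sequence model applied along either path only ever sees a sequence as long as one chunk or as long as the number of chunks, rather than the full input length, which is the property that keeps very long inputs tractable.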
