MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition

  • king
  • 20210505

Document pages: 10 pages

Abstract: Audio Adversarial Examples (AAE) represent specially created inputs meant to trick Automatic Speech Recognition (ASR) systems into misclassification. The present work proposes MP3 compression as a means to decrease the impact of Adversarial Noise (AN) in audio samples transcribed by ASR systems. To this end, we generated AAEs with the Fast Gradient Sign Method for an end-to-end, hybrid CTC-attention ASR system. Our method is then validated by two objective indicators: (1) Character Error Rates (CER) that measure the speech decoding performance of four ASR models trained on uncompressed, as well as MP3-compressed data sets and (2) Signal-to-Noise Ratio (SNR) estimated for both uncompressed and MP3-compressed AAEs that are reconstructed in the time domain by feature inversion. We found that MP3 compression applied to AAEs indeed reduces the CER when compared to uncompressed AAEs. Moreover, feature-inverted (reconstructed) AAEs had significantly higher SNRs after MP3 compression, indicating that AN was reduced. In contrast to AN, MP3 compression applied to utterances augmented with regular noise resulted in more transcription errors, giving further evidence that MP3 encoding is effective in diminishing only AN.
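
The two indicators named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions: CER as the character-level edit distance divided by the reference length, and SNR as the ratio of clean-signal power to the power of the difference between the perturbed and clean signals, in decibels. The function names `character_error_rate` and `snr_db` are hypothetical, and the paper's own estimation procedure (which works on feature-inverted audio) may differ in detail.

```python
import math


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level Levenshtein distance divided by the reference length.

    Assumed definition; the paper's scoring tool may normalize differently.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)


def snr_db(clean, perturbed) -> float:
    """SNR in dB, treating (perturbed - clean) as the noise (assumed definition)."""
    if len(clean) != len(perturbed):
        raise ValueError("signals must have the same length")
    signal_power = sum(c * c for c in clean)
    noise_power = sum((p - c) ** 2 for c, p in zip(clean, perturbed))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)
```

Under these definitions, a lower CER on MP3-compressed AAEs means fewer transcription errors, and a higher SNR after compression means that less of the perturbation remains relative to the speech signal.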
