Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection

  • 20210506

Document pages: 11 pages

Abstract: Weakly labelled learning has garnered a lot of attention in recent years due to its potential to scale Sound Event Detection (SED), and is formulated as a Multiple Instance Learning (MIL) problem. This paper proposes a Multi-Task Learning (MTL) framework for learning from weakly labelled audio data which encompasses the traditional MIL setup. To show the utility of the proposed framework, we use reconstruction of the input Time-Frequency (T-F) representation as the auxiliary task. We show that the chosen auxiliary task de-noises the internal T-F representation and improves SED performance under noisy recordings. Our second contribution is a two-step Attention Pooling mechanism. By having two steps in the attention mechanism, the network retains better T-F level information without compromising SED performance. Visualising the first-step and second-step attention weights helps in localising the audio event in the T-F domain. To evaluate the proposed framework, we remix the DCASE 2019 Task 1 acoustic scene data with the DCASE 2018 Task 2 sound event data at 0, 10 and 20 dB SNR, resulting in a multi-class weakly labelled SED problem. The proposed total framework outperforms existing benchmark models at all SNRs, with improvements of 22.3%, 12.8% and 5.9% over the benchmark model at 0, 10 and 20 dB SNR respectively. We carry out an ablation study to determine the contribution of each auxiliary task and of the two-step Attention Pooling to the SED performance improvement. The code is publicly released.
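
The abstract does not say how the sound events are remixed with the acoustic scenes, so the following is only a minimal sketch of one common way to mix two signals at a target SNR. It assumes SNR is defined as the ratio of mean event power to mean background power over the whole clip, and the names mix_at_snr, mean_power and measured_snr_db are hypothetical helpers, not taken from the paper's released code.

```python
import math
import random


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(event, background, snr_db):
    """Scale `event` to the requested SNR against `background`, then add them.

    Assumption: SNR is the clip-level power ratio event/background in dB.
    Returns the mixture and the scaled event, both as lists of floats.
    """
    if len(event) != len(background):
        raise ValueError("event and background must have the same length")
    p_event = mean_power(event)
    p_background = mean_power(background)
    if p_event == 0.0 or p_background == 0.0:
        raise ValueError("both signals must have non-zero power")
    # snr_db = 10 * log10(gain^2 * p_event / p_background); solve for gain.
    gain = math.sqrt(p_background / p_event * 10.0 ** (snr_db / 10.0))
    scaled_event = [gain * x for x in event]
    mixture = [e + b for e, b in zip(scaled_event, background)]
    return mixture, scaled_event


def measured_snr_db(scaled_event, background):
    """SNR in dB actually obtained after scaling, for checking the mix."""
    return 10.0 * math.log10(mean_power(scaled_event) / mean_power(background))


# Synthetic stand-ins for real audio: Gaussian noise as the acoustic scene,
# a 440 Hz tone as the sound event, one second at 16 kHz.
rng = random.Random(0)
background = [rng.gauss(0.0, 0.1) for _ in range(16000)]
event = [math.sin(2.0 * math.pi * 440.0 * n / 16000.0) for n in range(16000)]

for snr_db in (0, 10, 20):
    mixture, scaled_event = mix_at_snr(event, background, snr_db)
    print(snr_db, round(measured_snr_db(scaled_event, background), 6))
```

Under this definition a lower SNR means a quieter event relative to the scene, so the 0 dB condition is the noisiest of the three mentioned in the abstract.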
