Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net

Document pages: 4 pages

Abstract: This report describes our systems submitted to DCASE2020 Task 3: Sound Event Localization and Detection (SELD). We consider two systems: a single-stage system that solves sound event localization (SEL) and sound event detection (SED) simultaneously, and a two-stage system that first handles the SED and SEL tasks individually and later combines their results. For the single-stage system, we propose a unified training framework that uses an activity-coupled Cartesian DOA vector (ACCDOA) representation as a single target for both the SED and SEL tasks. To efficiently estimate sound event locations and activities, we further propose RD3Net, which incorporates recurrent and convolution layers with dense skip connections and dilation. To generalize the models, we apply three data augmentation techniques: equalized mixture data augmentation (EMDA), rotation of first-order Ambisonic (FOA) signals, and a multichannel extension of SpecAugment. Our systems demonstrate a significant improvement over the baseline system.
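To illustrate how a single target can carry both activity and direction, the following is a minimal NumPy sketch of an ACCDOA-style encoding, not the authors' implementation. It assumes that each class's target is a Cartesian unit DOA vector scaled by that class's activity (0 or 1), so that the vector norm indicates activity and the vector direction indicates the DOA. The function names, the axis convention (x forward, y left, z up), and the 0.5 decision threshold are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def encode_accdoa(activity, azimuth_deg, elevation_deg):
    """Build ACCDOA-style targets (hypothetical helper).

    activity, azimuth_deg, elevation_deg: arrays of shape (frames, classes).
    Returns an array of shape (frames, classes, 3) holding activity * unit DOA.
    """
    az = np.deg2rad(np.asarray(azimuth_deg, dtype=float))
    el = np.deg2rad(np.asarray(elevation_deg, dtype=float))
    # Unit vector on the sphere (assumed convention: x forward, y left, z up).
    unit = np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)],
        axis=-1,
    )
    # Inactive classes collapse to the zero vector.
    return np.asarray(activity, dtype=float)[..., None] * unit


def decode_accdoa(accdoa, threshold=0.5):
    """Recover activity and DOA from ACCDOA-style vectors (hypothetical helper).

    A class is treated as active when its vector norm exceeds `threshold`
    (an assumed value); the angles are only meaningful where `active` is True.
    """
    x, y, z = accdoa[..., 0], accdoa[..., 1], accdoa[..., 2]
    active = np.linalg.norm(accdoa, axis=-1) > threshold
    azimuth_deg = np.rad2deg(np.arctan2(y, x))
    elevation_deg = np.rad2deg(np.arctan2(z, np.hypot(x, y)))
    return active, azimuth_deg, elevation_deg
```

With a target of this form, one regression output per class (for example, trained with a mean squared error loss) can serve both tasks, and no separate SED branch or loss weighting between SED and SEL is needed.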
