Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs

Document pages: 5 pages

Abstract: Sound Event Localization and Detection (SELD) is a problem in the field of machine listening whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, Deep Learning techniques have become state-of-the-art solutions. The most common ones implement a convolutional recurrent neural network (CRNN) after first transforming the audio signal into a multichannel 2D representation. The squeeze-excitation technique can be considered a convolution enhancement that aims to learn spatial and channel feature maps independently rather than together, as standard convolutions do. This is usually achieved by combining some global clustering operators, linear operators, and a final calibration between the block input and its learned relationships. This work aims to improve the accuracy of the baseline CRNN presented in DCASE 2020 Task 3 by adding residual squeeze-excitation (SE) blocks in the convolutional part of the CRNN. The procedure involves a grid search over the ratio parameter (used in the linear relationships) of the residual SE block, whereas the hyperparameters of the network remain the same as in the baseline. Experiments show that by simply introducing the residual SE blocks, the results obtained clearly exceed the baseline.
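
The squeeze, linear, and calibration steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the common channel-wise SE formulation (global average pooling, two linear layers whose bottleneck width is set by the ratio, a sigmoid gate) and, for brevity, applies the gate directly to the block input, where a real residual block would first pass the input through convolutions. The names `residual_se_block` and `init_se_params`, the random initialization, and the candidate ratios in the loop are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_se_params(channels, ratio, rng):
    """Random weights for the two linear layers (hypothetical initialization).

    The bottleneck width is channels // ratio; `ratio` is the parameter a
    grid search would vary.
    """
    hidden = channels // ratio
    w1 = 0.1 * rng.standard_normal((hidden, channels))
    b1 = np.zeros(hidden)
    w2 = 0.1 * rng.standard_normal((channels, hidden))
    b2 = np.zeros(channels)
    return w1, b1, w2, b2


def residual_se_block(x, w1, b1, w2, b2):
    """Apply a simplified residual SE step to x of shape (channels, height, width)."""
    # Squeeze: one global average per channel.
    squeezed = x.mean(axis=(1, 2))
    # Excitation: bottleneck linear layer + ReLU, then expansion + sigmoid.
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)
    gates = sigmoid(w2 @ hidden + b2)
    # Calibration: rescale every channel of the input by its gate.
    recalibrated = x * gates[:, None, None]
    # Residual connection: add the block input back.
    return x + recalibrated


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((64, 10, 8))  # assumed feature-map size
    for ratio in (2, 4, 8, 16):  # hypothetical grid of candidate ratios
        params = init_se_params(64, ratio, rng)
        out = residual_se_block(feature_map, *params)
        n_params = sum(p.size for p in params)
        print(f"ratio={ratio}: output shape {out.shape}, SE parameters {n_params}")
```

A larger ratio shrinks the bottleneck and therefore the number of extra parameters the SE block adds, which is the trade-off a grid search over the ratio explores.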
