
Are you wearing a mask? Improving mask detection from speech using augmentation by cycle-consistent GANs


Document pages: 5 pages

Abstract: The task of detecting whether a person wears a face mask from speech is useful in modelling speech in forensic investigations, communication between surgeons or people protecting themselves against infectious diseases such as COVID-19. In this paper, we propose a novel data augmentation approach for mask detection from speech. Our approach is based on (i) training Generative Adversarial Networks (GANs) with cycle-consistency loss to translate unpaired utterances between two classes (with mask and without mask), and on (ii) generating new training utterances using the cycle-consistent GANs, assigning opposite labels to each translated utterance. Original and translated utterances are converted into spectrograms which are provided as input to a set of ResNet neural networks with various depths. The networks are combined into an ensemble through a Support Vector Machines (SVM) classifier. With this system, we participated in the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 Computational Paralinguistics Challenge, surpassing the baseline proposed by the organizers by 2.8%. Our data augmentation technique provided a performance boost of 0.9% on the private test set. Furthermore, we show that our data augmentation approach yields better results than other baseline and state-of-the-art augmentation methods.
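
The augmentation idea in steps (i) and (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label convention (1 = with mask, 0 = without mask), the nested-list spectrogram type and the toy "generators" are all assumptions made for illustration. In the actual approach the two translators would be trained cycle-consistent GAN generators. The sketch shows only the cycle-consistency term (a sample translated to the other class and back should match the original) and the label flip applied to each translated utterance.

```python
from typing import Callable, List, Tuple

# Assumption: a spectrogram is a time x frequency grid of floats.
Spectrogram = List[List[float]]
Translator = Callable[[Spectrogram], Spectrogram]
Sample = Tuple[Spectrogram, int]  # assumed labels: 1 = with mask, 0 = without mask


def cycle_consistency_l1(x: Spectrogram, forward: Translator, backward: Translator) -> float:
    """Mean absolute error between x and backward(forward(x))."""
    reconstructed = backward(forward(x))
    diffs = [
        abs(a - b)
        for row_x, row_r in zip(x, reconstructed)
        for a, b in zip(row_x, row_r)
    ]
    return sum(diffs) / len(diffs)


def augment_with_translations(
    samples: List[Sample], to_mask: Translator, to_clear: Translator
) -> List[Sample]:
    """Keep the original samples and append one translated copy of each,
    labelled with the opposite class."""
    augmented = list(samples)
    for spec, label in samples:
        if label == 1:
            augmented.append((to_clear(spec), 0))  # mask -> no mask
        else:
            augmented.append((to_mask(spec), 1))   # no mask -> mask
    return augmented


# Hypothetical stand-ins for trained generators: exact inverses of each other,
# so the cycle-consistency loss is zero by construction.
def toy_to_mask(spec: Spectrogram) -> Spectrogram:
    return [[0.5 * v for v in row] for row in spec]


def toy_to_clear(spec: Spectrogram) -> Spectrogram:
    return [[2.0 * v for v in row] for row in spec]
```

Calling `augment_with_translations(data, toy_to_mask, toy_to_clear)` on a list of labelled spectrograms doubles the training set, with every added sample carrying the opposite label of its source. The augmented set would then be fed to the classifiers described in the abstract.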
