
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation

  • 20210505


Document pages: 5 pages

Abstract: We propose a self-supervised representation learning model for the task of unsupervised phoneme boundary detection. The model is a convolutional neural network that operates directly on the raw waveform. It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle. At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries. As such, the proposed model is trained in a fully unsupervised manner, with no manual annotations in the form of target boundaries or phonetic transcriptions. We compare the proposed approach to several unsupervised baselines using both the TIMIT and Buckeye corpora. Results suggest that our approach surpasses the baseline models and reaches state-of-the-art performance on both data sets. Furthermore, we experimented with expanding the training set with additional examples from the Librispeech corpus. We evaluated the resulting model on distributions and languages that were not seen during the training phase (English, Hebrew and German) and showed that utilizing additional untranscribed data is beneficial for model performance.
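
The test-time step described in the abstract (peak detection over the model outputs) can be illustrated with a minimal sketch. The abstract does not say how the outputs are scored or which peak detector is used, so everything below is an assumption made for illustration: frame representations are compared with cosine dissimilarity between adjacent frames, and a boundary is declared at any local maximum above a hypothetical `threshold`. The function names and the toy `frames` data are likewise invented here and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dissimilarity_scores(frames):
    """Score each pair of adjacent frames; high values suggest a spectral change.

    Assumption: 1 - cosine similarity stands in for the model's output score.
    """
    return [
        1.0 - cosine_similarity(frames[t], frames[t + 1])
        for t in range(len(frames) - 1)
    ]


def detect_peaks(scores, threshold=0.1):
    """Return indices of local maxima in `scores` that exceed `threshold`.

    Assumption: a simple three-point local-maximum test; the first and last
    scores are never reported because they lack a neighbour on one side.
    """
    peaks = []
    for t in range(1, len(scores) - 1):
        is_local_max = scores[t] > scores[t - 1] and scores[t] >= scores[t + 1]
        if is_local_max and scores[t] > threshold:
            peaks.append(t)
    return peaks


# Toy, hand-made frame representations: three frames pointing one way,
# then three pointing another, imitating a change between two phonemes.
frames = [
    [1.0, 0.0], [1.0, 0.1], [0.9, 0.0],
    [0.0, 1.0], [0.1, 1.0], [0.0, 0.9],
]

scores = dissimilarity_scores(frames)
boundaries = detect_peaks(scores)
print(boundaries)
```

In this sketch, a reported index `t` means the boundary falls between frame `t` and frame `t + 1`. In practice the frame vectors would come from the trained network rather than being written by hand, and the threshold would need tuning on held-out data.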
