
Speaker dependent acoustic-to-articulatory inversion using real-time MRI of the vocal tract

  • king
  • 20210506


Document pages: 5 pages

Abstract: Acoustic-to-articulatory inversion (AAI) methods estimate articulatory movements from the acoustic speech signal, which can be useful in several tasks such as speech recognition, synthesis, talking heads and language tutoring. Most earlier inversion studies are based on point-tracking articulatory techniques (e.g. EMA or XRMB). The advantage of rtMRI is that it provides dynamic information about the full midsagittal plane of the upper airway, with a high relative spatial resolution. In this work, we estimated midsagittal rtMRI images of the vocal tract for speaker dependent AAI, using MGC-LSP spectral features as input. We applied FC-DNNs, CNNs and recurrent neural networks, and have shown that LSTMs are the most suitable for this task. As objective evaluation we measured normalized MSE, Structural Similarity Index (SSIM) and its complex wavelet version (CW-SSIM). The results indicate that the combination of FC-DNNs and LSTMs can achieve smooth generated MR images of the vocal tract, which are similar to the original MRI recordings (average CW-SSIM: 0.94).
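
To make the objective measures named in the abstract more concrete, the following is a minimal sketch of how a normalized MSE and an SSIM score between an original and a generated image frame might be computed. It is not the authors' evaluation code. Several points are assumptions: the abstract does not say how the MSE is normalized, so the sketch divides by the mean squared intensity of the original frame; SSIM is normally averaged over local windows, whereas this simplified version uses a single window covering the whole image; and the function names and the 0-255 intensity range are hypothetical. CW-SSIM needs a complex wavelet decomposition and is not shown.

```python
from statistics import fmean


def flatten(image):
    """Turn a 2D list of pixel intensities into a flat list of floats."""
    return [float(pixel) for row in image for pixel in row]


def normalized_mse(target, predicted):
    """MSE divided by the mean squared intensity of the target.

    Assumption: the abstract does not define the normalization,
    so this is only one common choice.
    """
    t, p = flatten(target), flatten(predicted)
    if len(t) != len(p):
        raise ValueError("images must have the same number of pixels")
    energy = fmean(a * a for a in t)
    if energy == 0.0:
        raise ValueError("target image has zero energy")
    mse = fmean((a - b) ** 2 for a, b in zip(t, p))
    return mse / energy


def global_ssim(target, predicted, data_range=255.0):
    """Single-window SSIM over the whole image.

    Simplification: standard SSIM averages this quantity over
    local windows; here one window spans the full frame.
    """
    t, p = flatten(target), flatten(predicted)
    if len(t) != len(p):
        raise ValueError("images must have the same number of pixels")
    # Stabilizing constants with the conventional factors 0.01 and 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_t, mu_p = fmean(t), fmean(p)
    var_t = fmean((a - mu_t) ** 2 for a in t)
    var_p = fmean((b - mu_p) ** 2 for b in p)
    cov = fmean((a - mu_t) * (b - mu_p) for a, b in zip(t, p))
    numerator = (2 * mu_t * mu_p + c1) * (2 * cov + c2)
    denominator = (mu_t ** 2 + mu_p ** 2 + c1) * (var_t + var_p + c2)
    return numerator / denominator


if __name__ == "__main__":
    # Toy 2x2 "frames" standing in for MRI images (made-up values).
    original = [[0, 255], [255, 0]]
    generated = [[10, 240], [250, 5]]
    print("NMSE:", normalized_mse(original, generated))
    print("SSIM:", global_ssim(original, generated))
```

Under these definitions a perfect reconstruction gives a normalized MSE of 0 and an SSIM of 1, so lower NMSE and higher SSIM both indicate generated frames closer to the recordings.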
