Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation

Document pages: 5 pages

Abstract: Automated audio captioning is a machine listening task whose goal is to describe an audio signal using free text. An automated audio captioning system accepts audio as input and outputs a textual description, that is, the caption of the signal. This task can be useful in many applications, such as automatic content description or machine-to-machine interaction. In this work, an automated audio captioning system based on residual learning in the encoder phase is proposed. The encoder phase is implemented with different Residual Network configurations. The decoder phase, which creates the caption, uses recurrent layers plus an attention mechanism. The chosen audio representation is Gammatone. Results show that the proposed framework surpasses the baseline system in the challenge results.
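
The abstract names Gammatone as the audio representation but gives no filter parameters. As a rough illustration only, the sketch below samples the impulse response of a single gammatone filter, g(t) = t^(n-1) exp(-2*pi*b*t) cos(2*pi*fc*t). The filter order of 4, the Glasberg and Moore ERB approximation, the 1.019 bandwidth factor, the 16 kHz sample rate and the function names are all assumptions taken from common practice, not from this work, and this is not the authors' implementation.

```python
import math


def erb_bandwidth(center_hz: float) -> float:
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def gammatone_impulse_response(center_hz: float,
                               sample_rate: int = 16000,
                               duration_s: float = 0.025,
                               order: int = 4) -> list:
    """Sample one gammatone impulse response and scale its peak to 1.

    All default values are illustrative assumptions, not values from the paper.
    """
    b = 1.019 * erb_bandwidth(center_hz)  # bandwidth parameter in Hz
    n_samples = int(round(duration_s * sample_rate))
    response = []
    for k in range(n_samples):
        t = k / sample_rate
        # Gamma-shaped envelope multiplied by a tone at the centre frequency.
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    peak = max(abs(v) for v in response)
    return [v / peak for v in response]


if __name__ == "__main__":
    ir = gammatone_impulse_response(1000.0)
    print(len(ir), round(erb_bandwidth(1000.0), 3))
```

A full gammatone representation would convolve the waveform with a bank of such filters at ERB-spaced centre frequencies and take frame-wise energies per band. That step is omitted here because the abstract does not describe it.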
