Diet deep generative audio models with structured lottery

  • king
  • 20210506

Document pages: 8 pages

Abstract: Deep learning models have provided extremely successful solutions in most audio application fields. However, the high accuracy of these models comes at the expense of a tremendous computation cost. This aspect is almost always overlooked when evaluating the quality of proposed models, yet models should not be evaluated without taking their complexity into account. Complexity is especially critical in audio applications, which rely heavily on specialized embedded hardware with real-time constraints. In this paper, we build on recent observations that deep models are highly overparameterized, by studying the lottery ticket hypothesis on deep generative audio models. This hypothesis states that extremely efficient small sub-networks exist in deep models and would provide higher accuracy than larger models if trained in isolation. However, lottery tickets are found by relying on unstructured masking, which means that the resulting models do not provide any gain in either disk size or inference time. Instead, we develop here a method aimed at performing structured trimming. We show that this requires relying on global selection, and we introduce a specific criterion based on mutual information. First, we confirm the surprising result that smaller models provide higher accuracy than their large counterparts. We further show that we can remove up to 95% of the model weights without significant degradation in accuracy. Hence, we can obtain very light models for generative audio across popular methods such as Wavenet, SING or DDSP, that are up to 100 times smaller with commensurate accuracy. We study the theoretical bounds for embedding these models on Raspberry Pi and Arduino, and show that we can obtain generative models on CPU with equivalent quality to large GPU models. Finally, we discuss the possibility of implementing deep generative audio models on embedded platforms.
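
The difference between unstructured masking and structured trimming with global selection can be illustrated with a minimal sketch. It is not the paper's method: the abstract names a mutual-information criterion, whereas this example swaps in a simple per-unit L2 norm (normalised per layer) as a stand-in score. The function names `global_unit_masks` and `trim_mlp`, the plain MLP layout, and the `keep_ratio` parameter are all hypothetical, chosen only for illustration.

```python
import numpy as np

def global_unit_masks(hidden_layers, keep_ratio):
    """Rank the units (rows) of all hidden layers together and keep the top share.

    ASSUMPTION: the score is the row L2 norm divided by the layer mean.
    The paper uses a mutual-information criterion instead.
    """
    scores = [np.linalg.norm(w, axis=1) for w in hidden_layers]
    scores = [s / (s.mean() + 1e-12) for s in scores]
    flat = np.concatenate(scores)
    n_keep = max(1, int(round(keep_ratio * flat.size)))
    threshold = np.sort(flat)[-n_keep]  # one threshold shared by all layers
    masks = [s >= threshold for s in scores]
    for m, s in zip(masks, scores):
        if not m.any():
            m[np.argmax(s)] = True  # never remove an entire layer
    return masks

def trim_mlp(weights, keep_ratio):
    """Physically remove pruned units from a plain MLP.

    weights: list of (out, in) matrices; the last one is the output layer,
    whose rows are kept whole. Dropping a unit deletes its row in one layer
    and the matching column in the next, so the matrices really shrink.
    """
    masks = global_unit_masks(weights[:-1], keep_ratio)
    trimmed, prev = [], None
    for i, w in enumerate(weights):
        if prev is not None:
            w = w[:, prev]      # drop inputs coming from removed units
        if i < len(masks):
            w = w[masks[i]]     # drop the removed units themselves
            prev = masks[i]
        trimmed.append(w)
    return trimmed

rng = np.random.default_rng(0)
weights = [rng.normal(size=(64, 16)),
           rng.normal(size=(64, 64)),
           rng.normal(size=(8, 64))]
small = trim_mlp(weights, keep_ratio=0.25)
before = sum(w.size for w in weights)
after = sum(w.size for w in small)
print([w.shape for w in small], before, after)
```

An unstructured mask would zero individual entries but leave every matrix at its original shape, which is why it saves neither disk space nor inference time. Here whole units are removed, so the stored matrices are smaller, and the single global threshold lets each layer keep a different share of its units.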
