
AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency

Date: 2021-05-07


Document pages: 6

Abstract: Convolutional Neural Networks (CNNs) achieve great cognitive performance at the expense of a considerable computation load. To relieve this load, many optimization works reduce model redundancy by identifying and removing insignificant model components, through techniques such as weight sparsity and filter pruning. However, these works evaluate a model component's significance statically, using only internal parameter information and ignoring its dynamic interaction with external inputs. Because per-input feature activation can change a model component's significance, such static methods achieve only sub-optimal results. Therefore, we propose a dynamic CNN optimization framework in this work. Based on the neural network attention mechanism, the framework combines (1) testing-phase channel and column feature map pruning with (2) training-phase optimization by targeted dropout. Such a dynamic optimization framework has several benefits: (1) it accurately identifies and aggressively removes per-input feature redundancy by taking the model-input interaction into account; (2) thanks to its multi-dimension flexibility, it maximally removes feature map redundancy across dimensions; (3) the training-testing co-optimization favors dynamic pruning and helps maintain model accuracy even at very high feature pruning ratios. Extensive experiments show that our method brings a 37.4% to 54.5% FLOPs reduction with a negligible accuracy drop on various test networks.
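To make the testing-phase idea concrete, the following is a minimal NumPy sketch of per-input dynamic channel pruning: a per-channel attention score is computed from the current input's activations, and only the strongest channels are kept. All names here are hypothetical illustrations; the paper's actual attention mechanism, column pruning, and targeted-dropout training are not reproduced.

```python
import numpy as np

def dynamic_channel_prune(feature_map, keep_ratio=0.5):
    """Illustrative per-input dynamic channel pruning (hypothetical sketch,
    not the paper's exact AntiDote implementation).

    feature_map: array of shape (C, H, W) for a single input.
    Each channel's attention score is its mean absolute activation
    (a squeeze-style statistic); only the top-k channels by score
    are kept, the rest are zeroed out at test time.
    """
    c = feature_map.shape[0]
    scores = np.abs(feature_map).mean(axis=(1, 2))  # per-channel attention scores
    k = max(1, int(round(c * keep_ratio)))          # number of channels to keep
    keep = np.argsort(scores)[-k:]                  # indices of strongest channels
    mask = np.zeros(c, dtype=feature_map.dtype)
    mask[keep] = 1.0
    return feature_map * mask[:, None, None], keep

# Example: 8 channels, keep half; which channels survive depends on the input.
fmap = np.random.randn(8, 4, 4)
pruned, kept = dynamic_channel_prune(fmap, keep_ratio=0.5)
```

Because the mask is recomputed for every input, a channel that is redundant for one image can still fire for another, which is exactly the model-input interaction that static pruning ignores.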
