
FracBits: Mixed Precision Quantization via Fractional Bit-Widths


Abstract: Model quantization helps to reduce the model size and latency of deep neural networks. Mixed precision quantization is favorable with customized hardware supporting arithmetic operations at multiple bit-widths, to achieve maximum efficiency. We propose a novel learning-based algorithm to derive mixed precision models end-to-end under target computation constraints and model sizes. During the optimization, the bit-width of each layer or kernel in the model is at a fractional status of two consecutive bit-widths, which can be adjusted gradually. With a differentiable regularization term, the resource constraints can be met during quantization-aware training, resulting in an optimized mixed precision model. Further, our method can be naturally combined with channel pruning for better computation cost allocation. Our final models achieve comparable or better performance than previous quantization methods with mixed precision on MobileNet V1/V2 and ResNet18 under different resource constraints on the ImageNet dataset.
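The sketch below illustrates the fractional bit-width idea described in the abstract: a learnable bit-width b is relaxed so that the quantized weight is a linear interpolation between quantization at floor(b) and ceil(b) bits, and a differentiable penalty pushes the total bit budget toward a target. This is a minimal illustration under assumed choices (a symmetric uniform quantizer with a straight-through estimator, and a hinge-style size regularizer); names such as FractionalBitQuantizer and size_regularizer are hypothetical and the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn


def uniform_quantize(w, bits):
    # Symmetric uniform quantizer (an assumed choice) with a
    # straight-through estimator for the rounding step.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    q = torch.round(w / scale).clamp(-qmax - 1, qmax)
    # Forward uses the rounded value; backward passes gradients through.
    return (q * scale - w).detach() + w


class FractionalBitQuantizer(nn.Module):
    """Quantize weights with a learnable, fractional bit-width b.

    The effective weight is an interpolation between quantization at
    floor(b) and ceil(b) bits, so b can be adjusted gradually by
    gradient descent (hypothetical sketch of the abstract's idea).
    """

    def __init__(self, init_bits=6.0, min_bits=2.0, max_bits=8.0):
        super().__init__()
        self.bits = nn.Parameter(torch.tensor(init_bits))
        self.min_bits, self.max_bits = min_bits, max_bits

    def forward(self, w):
        b = self.bits.clamp(self.min_bits, self.max_bits)
        lo, hi = torch.floor(b), torch.ceil(b)
        frac = b - lo  # fractional status between two consecutive bit-widths
        w_lo = uniform_quantize(w, int(lo.item()))
        w_hi = uniform_quantize(w, int(hi.item()))
        return (1 - frac) * w_lo + frac * w_hi


def size_regularizer(quantizers, params_per_layer, target_total_bits):
    # Differentiable model-size term: penalize exceeding a bit budget.
    # Illustrative hinge form; the paper's regularizer may differ.
    total = sum(q.bits.clamp(q.min_bits, q.max_bits) * n
                for q, n in zip(quantizers, params_per_layer))
    return torch.relu(total - target_total_bits)
```

During quantization-aware training, this regularizer would be added to the task loss with a weighting coefficient, so the per-layer bit-widths drift toward an allocation that satisfies the resource constraint while preserving accuracy.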
