Communication-Computation Trade-Off in Resource-Constrained Edge Inference

Document pages: 7 pages

Abstract: The recent breakthrough in artificial intelligence (AI), especially deep neural networks (DNNs), has affected every branch of science and technology. In particular, edge AI has been envisioned as a major application scenario for providing DNN-based services at edge devices. This article presents effective methods for edge inference at resource-constrained devices. It focuses on device-edge co-inference, assisted by an edge computing server, and investigates a critical trade-off between the computation cost of the on-device model and the communication cost of forwarding the intermediate feature to the edge server. A three-step framework is proposed for effective inference: (1) model split point selection to determine the on-device model, (2) communication-aware model compression to reduce the on-device computation and the resulting communication overhead simultaneously, and (3) task-oriented encoding of the intermediate feature to further reduce the communication overhead. Experiments demonstrate that our proposed framework achieves a better trade-off and significantly reduces the inference latency compared with baseline methods.
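
To make the trade-off behind step (1) concrete, the following is a minimal sketch, not the authors' method. It assumes a simple additive latency model (on-device compute time, plus uplink transfer time for the intermediate feature, plus server compute time) and picks the split point with the lowest total. The names `SplitCandidate`, `latency`, and `select_split`, and every number in `CANDIDATES`, are hypothetical and chosen only for illustration. Steps (2) and (3) are not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SplitCandidate:
    """One possible split point (all fields are illustrative assumptions)."""
    name: str
    device_flops: float   # computation run on the device before the split
    feature_bytes: float  # size of the intermediate feature sent to the server
    server_flops: float   # computation left for the edge server


def latency(c: SplitCandidate, device_flops_per_s: float,
            server_flops_per_s: float, uplink_bytes_per_s: float) -> float:
    """Assumed additive model: device compute + transfer + server compute."""
    return (c.device_flops / device_flops_per_s
            + c.feature_bytes / uplink_bytes_per_s
            + c.server_flops / server_flops_per_s)


def select_split(candidates: List[SplitCandidate], device_flops_per_s: float,
                 server_flops_per_s: float,
                 uplink_bytes_per_s: float) -> SplitCandidate:
    """Return the candidate with the lowest modeled end-to-end latency."""
    return min(
        candidates,
        key=lambda c: latency(c, device_flops_per_s,
                              server_flops_per_s, uplink_bytes_per_s),
    )


# Hypothetical candidates: splitting later costs more on-device computation
# but yields a smaller feature to transmit.
CANDIDATES = [
    SplitCandidate("input", device_flops=0.0, feature_bytes=150_000.0,
                   server_flops=2e9),
    SplitCandidate("mid", device_flops=5e8, feature_bytes=20_000.0,
                   server_flops=1.5e9),
    SplitCandidate("output", device_flops=2e9, feature_bytes=10.0,
                   server_flops=0.0),
]

if __name__ == "__main__":
    for uplink in (1e5, 1e7):  # bytes per second, slow vs. fast link
        best = select_split(CANDIDATES, device_flops_per_s=1e9,
                            server_flops_per_s=1e11,
                            uplink_bytes_per_s=uplink)
        print(f"uplink={uplink:.0e} B/s -> split at {best.name}")
```

Under these assumed numbers, a slow uplink favors an intermediate split, while a fast uplink favors sending the raw input and running the whole model on the server. That is the communication-computation trade-off the abstract describes.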