Fused DSConv: Optimizing sparse CNN inference for execution on edge devices

Jia Guo, Radu Teodorescu, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

Accelerating CNN inference on resource-constrained edge devices is becoming an increasingly important problem with the emergence of IoT and edge computing. This paper proposes an execution strategy and an implementation for efficient execution of CNNs. Our execution strategy combines two previously published, but not widely used, ideas: direct sparse convolution and fusion of two convolution layers. Together with a scheme for caching intermediate results, this yields a very efficient mechanism for speeding up inference after the model has been sparsified. We also demonstrate an efficient implementation that uses both multi-core and SIMD parallelism. Our experimental results demonstrate that our scheme significantly outperforms existing implementations on an edge device, while also scaling better in a server environment.
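
For readers unfamiliar with the first of the two ideas named in the abstract, the following is a minimal, generic sketch of direct sparse convolution in pure Python. It is not the authors' implementation and it omits layer fusion, caching, multi-core and SIMD parallelism. The function names (`dense_conv2d`, `compress_filters`, `direct_sparse_conv2d`) and the list-of-lists data layout are assumptions made for illustration only. The point it shows is that the sparse kernel loops over nonzero weights only, so its work shrinks with the sparsity of the model.

```python
def dense_conv2d(x, w):
    """Reference dense convolution (valid, stride 1).

    x is indexed x[c][i][j]; w is indexed w[m][c][r][s].
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    out_h, out_w = H - R + 1, W - S + 1
    y = [[[0.0] * out_w for _ in range(out_h)] for _ in range(M)]
    for m in range(M):
        for c in range(C):
            for r in range(R):
                for s in range(S):
                    for i in range(out_h):
                        for j in range(out_w):
                            y[m][i][j] += w[m][c][r][s] * x[c][i + r][j + s]
    return y


def compress_filters(w):
    """Keep only the nonzero weights of each filter as (c, r, s, value)."""
    return [
        [(c, r, s, v)
         for c, plane in enumerate(filt)
         for r, row in enumerate(plane)
         for s, v in enumerate(row) if v != 0.0]
        for filt in w
    ]


def direct_sparse_conv2d(x, nonzeros, R, S):
    """Convolve directly on the input, visiting nonzero weights only.

    Hypothetical helper: `nonzeros` is the output of compress_filters,
    and R, S are the filter height and width.
    """
    H, W = len(x[0]), len(x[0][0])
    out_h, out_w = H - R + 1, W - S + 1
    y = [[[0.0] * out_w for _ in range(out_h)] for _ in nonzeros]
    for m, entries in enumerate(nonzeros):
        for c, r, s, v in entries:  # one pass per nonzero weight
            for i in range(out_h):
                src = x[c][i + r]
                dst = y[m][i]
                for j in range(out_w):
                    # Scale a shifted input row and accumulate it.
                    dst[j] += v * src[j + s]
    return y
```

The innermost loop walks a contiguous input row, which is the part a real implementation would typically vectorize. No lowering of the input to a matrix (im2col) is needed.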

Original language: English (US)
Title of host publication: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Editors: Laurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 545-554
Number of pages: 10
ISBN (Electronic): 9781728195865
State: Published - May 2021
Event: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 - Virtual, Melbourne, Australia
Duration: May 10 2021 to May 13 2021

Publication series

Name: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021

Conference

Conference: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Country/Territory: Australia
City: Virtual, Melbourne
Period: 5/10/21 to 5/13/21

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
