TY - GEN
T1 - A Fused Inference Design for Pattern-Based Sparse CNN on Edge Devices
AU - Guo, Jia
AU - Teodorescu, Radu
AU - Agrawal, Gagan
N1 - Funding Information:
Acknowledgements: This work was partially supported by the following NSF grants: 1629392, 2007793, 2034850, 2131509, and 2018627.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Weight pruning approaches for Convolutional Neural Networks (CNNs) have been well developed in recent years. Compared with traditional unstructured and structured pruning, the new state-of-the-art sparse convolution pattern (SCP) based pruning uses certain patterns that lead to both a high pruning rate and low accuracy loss. This paper introduces a novel inference scheme to accelerate the execution of SCP-pruned models on IoT devices with limited resources. This inference scheme applies and combines ideas from direct sparse convolution and layer fusion. To fully utilize the power of modern IoT processors, the inference is also mapped to all available cores and optimized with SIMD instructions. The experimental results show good performance improvement as well as scalability of our scheme on an edge device.
AB - Weight pruning approaches for Convolutional Neural Networks (CNNs) have been well developed in recent years. Compared with traditional unstructured and structured pruning, the new state-of-the-art sparse convolution pattern (SCP) based pruning uses certain patterns that lead to both a high pruning rate and low accuracy loss. This paper introduces a novel inference scheme to accelerate the execution of SCP-pruned models on IoT devices with limited resources. This inference scheme applies and combines ideas from direct sparse convolution and layer fusion. To fully utilize the power of modern IoT processors, the inference is also mapped to all available cores and optimized with SIMD instructions. The experimental results show good performance improvement as well as scalability of our scheme on an edge device.
KW - Deep Neural Networks
KW - Edge Computing
UR - http://www.scopus.com/inward/record.url?scp=85125657518&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125657518&partnerID=8YFLogxK
U2 - 10.1109/HiPC53243.2021.00060
DO - 10.1109/HiPC53243.2021.00060
M3 - Conference contribution
AN - SCOPUS:85125657518
T3 - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
SP - 424
EP - 429
BT - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Y2 - 17 December 2021 through 18 December 2021
ER -