An Efficient Partitioning Scheme of DNNs for IoT
The emergence of the Internet of Things (IoT) paradigm has led to a remarkable increase in the volume of data generated at the network edge. Due to limited network bandwidth and data privacy concerns, it is often impractical for these applications to transmit input data from edge devices to a centralized location for DNN processing. At the same time, the limited bandwidth and computational resources of edge devices make it a critical challenge to partition large DNNs and assign workload to individual devices so as to achieve low inference latency while maintaining low communication overhead. In this work, we study the DNN partitioning problem and propose CININ, an efficient scheme for partitioning large-scale CNNs over edge devices with limited computing power. Evaluation over numerous CNN models and datasets demonstrates that CININ can greatly reduce inference latency while incurring almost no loss in performance.
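The abstract does not spell out how CININ partitions a network, so the following is only an illustrative sketch of the core idea behind spatial CNN partitioning in general: a convolutional layer's input can be split into strips, each extended by a small halo of overlapping rows, so that several devices compute disjoint parts of the output independently and the concatenated result matches the unpartitioned computation. The function names and the strip-based split are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np

def conv2d(x, k):
    # Plain "valid" 2D convolution (no padding, stride 1), used as the
    # reference single-device computation.
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def partitioned_conv2d(x, k, num_devices):
    # Split the output rows evenly across devices. Each device receives its
    # input strip plus a halo of (kernel_height - 1) extra rows, so its
    # "valid" output rows cover exactly its share of the full output.
    kh = k.shape[0]
    full_out_rows = x.shape[0] - kh + 1
    bounds = np.linspace(0, full_out_rows, num_devices + 1, dtype=int)
    pieces = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        strip = x[lo:hi + kh - 1]          # input rows needed for output rows lo..hi-1
        pieces.append(conv2d(strip, k))    # each "device" computes independently
    return np.vstack(pieces)               # stitch per-device outputs back together

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
k = rng.standard_normal((3, 3))
assert np.allclose(conv2d(x, k), partitioned_conv2d(x, k, 3))
```

The halo rows are the communication overhead of this style of partitioning: each extra device re-reads `kernel_height - 1` rows of input, which is the kind of compute/communication trade-off the abstract alludes to.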

Xin Dong
Ph.D. student at Harvard on Machine Learning

Xin Dong is a Ph.D. student at Harvard University. His research focuses on efficient deep learning, at the intersection of machine learning and computer architecture. He completed his undergraduate studies at Yingcai Honors College, University of Electronic Science and Technology of China (UESTC). He was a Research Assistant at Nanyang Technological University (NTU), Singapore, and UC San Diego (UCSD), working on techniques related to machine learning.