The emergence of the Internet of Things (IoT) paradigm has led to a remarkable increase in the volume of data generated at the network edge. Due to limited network bandwidth and data privacy concerns, it is often impractical for these applications to transmit input data from edge devices to a centralized location for DNN processing. Inference must therefore be performed at the edge; however, given the limited bandwidth and computational resources of edge devices, it is a critical challenge to partition large DNNs and assign workloads to individual devices so as to achieve low inference latency while maintaining low communication overhead. In this work, we study the DNN partitioning problem for CNNs and propose CININ, an efficient scheme for partitioning large-scale CNNs across edge devices with limited computing power. Evaluation over numerous CNN models and datasets demonstrates that CININ greatly reduces inference latency with almost no loss in model performance.