High-throughput CNN inference on embedded ARM big.LITTLE multi-core processors

Zeng, Y.; Pathania, A.; Mitra, T.; Goel, N.; Wang, S.; Ananthanarayanan, G.

Please use this identifier to cite or link to this item: http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/1873

Full metadata record

DC Field	Value	Language
dc.contributor.author	Wang, S.
dc.contributor.author	Ananthanarayanan, G.
dc.contributor.author	Zeng, Y.
dc.contributor.author	Goel, N.
dc.contributor.author	Pathania, A.
dc.contributor.author	Mitra, T.
dc.date.accessioned	2021-06-20T09:43:17Z
dc.date.available	2021-06-20T09:43:17Z
dc.date.issued	2021-06-20
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/1873
dc.description.abstract	IoT edge intelligence requires Convolutional Neural Network (CNN) inference to take place on the edge devices themselves. The ARM big.LITTLE architecture is at the heart of prevalent commercial edge devices. It comprises single-ISA heterogeneous cores grouped into multiple homogeneous clusters that enable power and performance trade-offs. All cores are expected to be employed simultaneously in inference to attain maximal throughput. However, the high communication overhead involved in parallelizing the computations of convolution kernels across clusters is detrimental to throughput. We present an alternative framework called Pipe-it that employs a pipelined design to split convolutional layers across clusters while limiting the parallelization of their respective kernels to the assigned cluster. We develop a performance-prediction model that uses only the convolutional layer descriptors to predict the execution time of each layer individually on all permitted core configurations (type and count). Pipe-it then exploits these predictions to create a balanced pipeline using an efficient design space exploration algorithm. On average, Pipe-it achieves 39% higher throughput than the highest antecedent throughput.	en_US
dc.language.iso	en_US	en_US
dc.subject	Heterogeneous Multi-Core	en_US
dc.subject	Asymmetric MultiCore	en_US
dc.subject	Edge Inference	en_US
dc.subject	CNN Performance-Prediction	en_US
dc.title	High-throughput CNN inference on embedded ARM big.LITTLE multi-core processors	en_US
dc.type	Article	en_US
Appears in Collections:	Year-2020
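
The balanced-pipeline idea described in the abstract can be illustrated with a minimal sketch. It is not the Pipe-it algorithm from the paper: it assumes exactly two clusters, one pipeline stage per cluster, contiguous layer ranges, and per-layer execution times that are already predicted. The function name, the cluster labels, and the timing values are hypothetical and chosen only for illustration. The sketch searches every split point and both cluster orderings, and keeps the one whose slowest stage is fastest, since the slowest stage bounds pipeline throughput.

```python
from typing import Dict, List, Tuple


def balance_two_stage_pipeline(
    layer_times: Dict[str, List[float]],
) -> Tuple[float, int, Tuple[str, str]]:
    """Return (bottleneck_time, split_index, (first_cluster, second_cluster)).

    layer_times maps a cluster name to the predicted execution time of each
    layer on that cluster (hypothetical inputs). Layers [0, split) run on the
    first cluster and layers [split, n) run on the second cluster.
    """
    clusters = list(layer_times)
    if len(clusters) != 2:
        raise ValueError("this sketch handles exactly two clusters")
    a, b = clusters
    n = len(layer_times[a])
    if len(layer_times[b]) != n or n < 2:
        raise ValueError("need the same layers on both clusters, at least two")

    best = None
    # Try both orderings: which cluster runs the early layers.
    for first, second in ((a, b), (b, a)):
        prefix = 0.0                      # time of stage 1 on `first`
        suffix = sum(layer_times[second])  # time of stage 2 on `second`
        for split in range(1, n):
            prefix += layer_times[first][split - 1]
            suffix -= layer_times[second][split - 1]
            bottleneck = max(prefix, suffix)  # slowest stage limits throughput
            if best is None or bottleneck < best[0]:
                best = (bottleneck, split, (first, second))
    return best


# Hypothetical predicted times (arbitrary units) for a four-layer network.
predicted = {"big": [2.0, 2.0, 2.0, 2.0], "little": [4.0, 4.0, 4.0, 4.0]}
print(balance_two_stage_pipeline(predicted))
# prints (6.0, 3, ('big', 'little'))
```

With these made-up numbers, running three layers on the faster cluster and one on the slower cluster gives a bottleneck of 6 time units per image, compared with 8 if the faster cluster ran all four layers alone. The framework described in the abstract additionally chooses core type and count per stage, which this sketch leaves out.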

Files in This Item:

File	Description	Size	Format
Fulltext.pdf		964.42 kB	Adobe PDF
