“…Recent developments in computing hardware (e.g., graphics processing units (GPUs) and tensor processing units (TPUs) [1]) have enabled large-scale parallel processing, substantially reducing inference and training times for deep learning on PC/server platforms. As hardware performance improvements have allowed neural network models to grow deeper and wider, deep learning models have outperformed humans in fields such as computer vision, natural language processing, and audio classification [2,3,4,5,6]. Many recent studies have sought to bring the superior performance of deep learning algorithms, which normally run on PC/server platforms, to deployment on mobile devices [7,8,9,10,11].…”