2021
DOI: 10.48550/arxiv.2106.08962
Preprint

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Abstract: Deep Learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, information retrieval, and more. However, with the progressive improvements in deep learning models, their number of parameters, latency, resources required to train, etc. have all increased significantly. Consequently, it has become important to pay attention to these footprint metrics of a model as well, not just its quality. We present and motivate the problem of efficiency in deep learning…

Cited by 20 publications (21 citation statements, 2021–2024) · References 90 publications
“…Training and deploying large deep learning models is costly. For example, trying combinations of different hyper-parameters for a large model is computationally expensive and depends heavily on the available training resources [3]. The results of the average training time per epoch and the total training time are presented in Table 2.…”
Section: Training Efficiency · mentioning
confidence: 99%
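The training-efficiency measurement quoted above is straightforward to reproduce. The sketch below, using a hypothetical model and synthetic data (nothing here comes from the cited papers), shows how the average time per epoch and the total training time are typically collected:

    import time
    import torch
    import torch.nn as nn

    # Hypothetical stand-ins for a real model and data loader.
    model = nn.Linear(32, 2)
    data = [(torch.randn(16, 32), torch.randint(0, 2, (16,))) for _ in range(100)]
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    epoch_times = []
    for epoch in range(5):
        start = time.perf_counter()
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        epoch_times.append(time.perf_counter() - start)

    # Average time per epoch and total training time, as reported in a
    # table like the quoted Table 2.
    print(f"avg/epoch: {sum(epoch_times) / len(epoch_times):.3f}s, "
          f"total: {sum(epoch_times):.3f}s")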
“…Although much effort in deep learning has gone into improving the prediction accuracy of state-of-the-art forecasting models, progressive improvements on benchmarks have been correlated with increases in the number of parameters and the amount of training resources required, making large deep learning models costly to train and deploy [3]. Therefore, a lightweight and training-efficient model is essential for fast delivery and deployment.…”
Section: Introduction · mentioning
confidence: 99%
“…Pruning [19, 71, 78, 79, 132, 134, 140, 171, 200, 265, 288]
Quantization [19, 68, 90, 134, 166, 179, 291, 307, 311, 314]
Knowledge Distillation [29, 41, 42, 80, 83, 88, 95, 170, 186, 195, 220, 228, 231, 239, 257, 266, 267, 274, 295, 296, 300, 312]
Low-rank factorization [76, 98, 119, 168, 190, 196, 210, 292]
Conditional Computation…”
Section: Model Compression · mentioning
confidence: 99%
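Two of the compression techniques tabulated above, magnitude pruning and post-training dynamic quantization, can be sketched in a few lines with PyTorch's built-in utilities. This is a minimal illustration on a hypothetical model, not the method of any particular cited paper:

    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # Hypothetical trained model standing in for any real network.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    # Pruning: zero out the 30% of weights with the smallest L1 magnitude.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=0.3)
            prune.remove(module, "weight")  # bake the sparsity into the tensor

    # Quantization: convert Linear weights to int8 for a smaller, faster model.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )
    print(quantized)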
“…Deep learning (DL) technologies have significantly advanced many fields critical to mobile applications, such as image understanding, speech recognition, and text translation [29,43,49]. In addition, considerable research effort has gone into optimizing DL latency and efficiency [11,22,34,40], paving the way toward local intelligent inference on mobile devices such as smartphones. Recent studies [8,45,51] indicate that intelligent apps (iApps), smartphone apps using in-app DL models, will become increasingly popular, which is also verified by our own study in Section 6.1.…”
Section: Introduction · mentioning
confidence: 99%