Big Data 2016
DOI: 10.1016/b978-0-12-805394-2.00004-0

Deep Learning and Its Parallelization

et al.

Cited by 23 publications (16 citation statements)
References 5 publications

“…A deep CNN consists of an input layer that contains image data of m training examples, multiple hidden layers that compute features from the input images, and an output layer, which classifies the learned images. Deep learning models employ non-linear transformation functions to solve complex large-scale problems (Reyes et al., 2015; Li et al., 2016; Badejo et al., 2018). As shown in Figure 1, the hidden layers consist of stacked convolution layers that convolve using a Rectified Linear Unit (ReLU) activation (or transfer) function, as well as a pooling layer, which reduces the dimension of the convolved image.…”
Section: Convolutional Neural Network
confidence: 99%
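
For illustration, the architecture this excerpt describes (stacked convolution layers with ReLU activations, pooling layers that reduce the spatial dimension, and an output layer that classifies the learned features) can be sketched roughly as the following PyTorch model. The layer sizes, kernel shapes, input resolution, and class count are assumptions made for the sketch, not values taken from the cited works.

# A minimal sketch, assuming 28x28 grayscale inputs and 10 classes, of the
# conv + ReLU + pooling + classifier structure described in the excerpt above.
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Hidden layers: convolution with ReLU activation, followed by pooling.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),               # halves the spatial dimension
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Output layer: classifies the learned features.
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# A batch of m = 8 training examples, as in the input layer described above.
logits = SimpleCNN()(torch.randn(8, 1, 28, 28))
print(logits.shape)  # torch.Size([8, 10])
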
“…The Fully Connected (FC) layer connects every input from the previous layer to the next (classification) layer. Significant achievements have been made through the application of deep learning models to image processing, natural language processing, and speech recognition tasks, thereby paving the way for more predictive analysis of big data (Li et al., 2016). The aforementioned layers can be described mathematically as shown in Equations (1)-(7):…”
Section: Convolutional Neural Network
confidence: 99%
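
The Equations (1)-(7) referred to in this excerpt are not reproduced here. As a stand-in, the following is a minimal sketch of the FC layer computation it alludes to: a weight matrix connecting every flattened input feature to every class score, followed by a softmax. The dimensions and helper names (fully_connected, softmax) are illustrative assumptions, not the cited paper's notation.

# A hedged illustration of the fully connected classification layer:
# class scores = W x + b, turned into a probability distribution by softmax.
import numpy as np

def fully_connected(x, W, b):
    """Affine map from flattened features to class scores."""
    return W @ x + b

def softmax(z):
    z = z - z.max()                    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.standard_normal(1568)          # flattened features from the conv/pool stack
W = rng.standard_normal((10, 1568)) * 0.01
b = np.zeros(10)

probs = softmax(fully_connected(x, W, b))
print(probs.sum())                     # 1.0: a probability over the 10 classes
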
“…There are three prominent strategies for partitioning the learning phase of a model: partitioning by input samples (data parallelism), by network structure (model parallelism), and by layer (pipelining). Data parallelism is easy to implement and is therefore the most widely used strategy on multi-GPU systems (Li et al., 2016). We have explored this option focusing on CPUs only, where each core utilises the same sparse model to train on different data subsets.…”
Section: Parallel Training of Deep Neural Network
confidence: 99%
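
A rough sketch of the data-parallel pattern this excerpt describes follows: each worker holds the same parameters, computes gradients on its own data subset, and the gradients are averaged before a single update. The least-squares model, worker count, and learning rate are assumptions made for illustration; this is not the cited paper's sparse model or implementation.

# A minimal CPU data-parallelism sketch: shard the data across worker
# processes, compute per-shard gradients under shared parameters, average.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def shard_gradient(args):
    """Least-squares gradient for one data shard under shared parameters w."""
    w, X, y = args
    residual = X @ w - y
    return X.T @ residual / len(y)

def data_parallel_step(w, X, y, n_workers=4, lr=0.1):
    shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        grads = list(pool.map(shard_gradient, [(w, Xs, ys) for Xs, ys in shards]))
    return w - lr * np.mean(grads, axis=0)   # average shard gradients, one update

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, true_w = rng.standard_normal((1024, 8)), np.arange(8.0)
    y = X @ true_w
    w = np.zeros(8)
    for _ in range(50):
        w = data_parallel_step(w, X, y)
    print(np.round(w, 2))                    # approaches true_w
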
“…The key consideration in exploiting their parallelization is how to divide the work among the GPUs. Hence, three approaches should be considered when training parallelized models: data parallelism, model parallelism, and data-model parallelism [79].…”
Section: Parallelization of Neural Network
confidence: 99%
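
For contrast with the data-parallel sketch above, the following is a hedged illustration of model parallelism: the network itself is split so that different layers live on different devices, and activations are handed from one device to the next during the forward pass. The device names, layer sizes, and the SplitMLP class are illustrative assumptions; on a machine with fewer than two GPUs both halves fall back to CPU.

# A minimal model-parallelism sketch: the first half of the model lives on
# one device, the second half on another, and activations cross between them.
import torch
import torch.nn as nn

dev0 = torch.device("cuda:0" if torch.cuda.device_count() > 0 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

class SplitMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # Partition the network by structure: one part per device.
        self.part0 = nn.Sequential(nn.Linear(784, 256), nn.ReLU()).to(dev0)
        self.part1 = nn.Linear(256, 10).to(dev1)

    def forward(self, x):
        h = self.part0(x.to(dev0))
        return self.part1(h.to(dev1))        # move activations to the next device

out = SplitMLP()(torch.randn(4, 784))
print(out.shape)                             # torch.Size([4, 10])
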