2018
DOI: 10.1016/j.neunet.2017.12.012

Sigmoid-weighted linear units for neural network function approximation in reinforcement learning

Abstract: In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN achieved human-level performance in many Atari 2600 games. The purpose of this study is twofold. First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its deri…
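For reference, a minimal NumPy sketch of the two units named in the abstract, assuming the standard definitions SiLU(x) = x·σ(x) and its derivative dSiLU; the function names are illustrative and not taken from the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    # SiLU: the input multiplied by its sigmoid, silu(x) = x * sigmoid(x)
    return x * sigmoid(x)

def dsilu(x):
    # Derivative of the SiLU, usable as a sigmoid-like activation:
    # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))
```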

Cited by 1,084 publications (552 citation statements)
References 15 publications
“…For MobileNetV3, we use a combination of these layers as building blocks in order to build the most effective models. Layers are also upgraded with modified swish nonlinearities [36,13,16]. Both squeeze and excitation as well as the swish nonlinearity use the sigmoid, which can be inefficient to compute and challenging to maintain accuracy with in fixed-point arithmetic, so we replace it with the hard sigmoid [2,11] as discussed in section 5.2.…”
Section: Efficient Mobile Building Blocks
confidence: 99%
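The hard-sigmoid swap mentioned in the quote above is, in the MobileNetV3 formulation, a piecewise-linear approximation built from ReLU6; the sketch below assumes h-sigmoid(x) = ReLU6(x + 3)/6 and the corresponding h-swish, with illustrative function names.

```python
import numpy as np

def relu6(x):
    # ReLU capped at 6: min(max(x, 0), 6)
    return np.minimum(np.maximum(x, 0.0), 6.0)

def hard_sigmoid(x):
    # Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6
    return relu6(x + 3.0) / 6.0

def hard_swish(x):
    # Swish with the sigmoid replaced by its hard approximation:
    # x * ReLU6(x + 3) / 6
    return x * hard_sigmoid(x)
```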
“…In [36,13,16] a nonlinearity called swish was introduced that, when used as a drop-in replacement for ReLU, significantly improves the accuracy of neural networks. The nonlinearity is defined as…”
Section: Nonlinearities
confidence: 99%
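The definition is truncated in the snippet above; in the cited works the swish nonlinearity is commonly written as the input scaled by its sigmoid, optionally with a scale parameter β:

```latex
\operatorname{swish}_{\beta}(x) = x \cdot \sigma(\beta x),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}
```

With β = 1 this reduces to the SiLU, x·σ(x).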
“…Regarding activation, four functions were tested: ReLU [40], LeakyReLU [40], PReLU [41], and trainable Swish [42]. Swish is a recent activation function similar to the sigmoid-weighted linear unit proposed in [43], but with a trainable parameter. Regarding convolutional layer initialization, two schemes were tested: the so-called Glorot-normal (a.k.a. Xavier-normal) [44] and the He-normal [41].…”
Section: Hyperparameter Tuning and Model Testing
confidence: 99%
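A minimal sketch of the "trainable Swish" mentioned above, assuming the common formulation x·σ(βx) with a learnable β per layer; this is an illustration under those assumptions, not code from the cited study.

```python
import torch
import torch.nn as nn

class TrainableSwish(nn.Module):
    """Swish activation x * sigmoid(beta * x) with a learnable beta."""

    def __init__(self, init_beta: float = 1.0):
        super().__init__()
        # beta is optimized jointly with the network weights
        self.beta = nn.Parameter(torch.tensor(init_beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(self.beta * x)
```

Dropping such a module in place of nn.ReLU() leaves the rest of an architecture unchanged; with β fixed at 1 it reduces to the SiLU.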
“…E. Swish [19]: Swish was introduced to deep learning, particularly for image classification and machine translation tasks, by the Google Brain team in 2017. It is similar to the Sigmoid-weighted Linear Unit (SiL) [35] function, which was used in reinforcement learning, and it has a smoothness property similar to Softplus.…”
Section: Existing Activation Functions For Comparison
confidence: 99%