Attention mechanisms have been found effective for person re-identification (Re-ID). However, the learned "attentive" features are often not naturally uncorrelated or "diverse", which compromises retrieval performance based on the Euclidean distance. We advocate the complementary powers of attention and diversity for Re-ID by proposing an Attentive but Diverse Network (ABD-Net). ABD-Net seamlessly integrates attention modules and diversity regularizations throughout the entire network to learn features that are representative, robust, and more discriminative. Specifically, we introduce a pair of complementary attention modules, focusing on channel aggregation and position awareness, respectively. We then plug in a novel orthogonality constraint that efficiently enforces diversity on both hidden activations and weights. Through an extensive set of ablation studies, we verify that the attentive and diverse terms each contribute to the performance gains of ABD-Net. It consistently outperforms existing state-of-the-art methods on three popular person Re-ID benchmarks.

[Table footnotes: * This method also exploits attention mechanisms. • This result uses a ResNet-152 backbone. This result uses a DenseNet-121 backbone. ‡ Official code is not released; we report the numbers from the original paper, which are better than our re-implementation.]

The improvement becomes 3.40% for top-1 and 6.40% for mAP. We also considered SVDNet [13] and HA-CNN [50], which also proposed to generate diverse and uncorrelated feature embeddings. ABD-Net surpasses both with significant top-1 and mAP improvements. Overall, our observations endorse the superiority of ABD-Net by combining "attentive" and "diverse".
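To make the diversity idea concrete, the following is a minimal sketch of a generic soft orthogonality penalty of the form ||W Wᵀ − I||²_F, a common surrogate for enforcing uncorrelated (orthogonal) rows of a weight or activation matrix. This is an illustrative stand-in, not necessarily the exact regularizer used in ABD-Net; the function name `soft_orthogonality_penalty` is our own.

```python
import numpy as np

def soft_orthogonality_penalty(W):
    """Generic soft orthogonality penalty ||W W^T - I||_F^2.

    Illustrative sketch of a diversity regularizer; ABD-Net's exact
    orthogonality term may differ in form. Rows of W are treated as
    the feature vectors (e.g. flattened conv kernels) to decorrelate.
    """
    W = W.reshape(W.shape[0], -1)     # flatten kernels to a 2-D matrix
    gram = W @ W.T                    # Gram matrix of the rows
    eye = np.eye(W.shape[0])
    return float(np.sum((gram - eye) ** 2))

# A matrix with orthonormal rows incurs (near-)zero penalty.
Q, _ = np.linalg.qr(np.random.randn(8, 8))
print(round(soft_orthogonality_penalty(Q), 6))  # → 0.0
```

In training, such a penalty would be added to the task loss with a small weight; minimizing it pushes the rows of W toward orthonormality, i.e. toward the "diverse" features the text describes.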
Visualizations

Attention Pattern Visualization: We conduct a set of attention visualizations** on the final output feature maps of the baseline (XE), baseline (XE) + PAM + CAM, and ABD-Net (XE), as shown in Fig. 5. We notice that the feature maps from the baseline show little attentiveness. PAM +

** Grad-CAM visualization method [73]: https://github.com/utkuozbulak/pytorch-cnn-visualizations; RAM visualization method [74] for testing images. More results can be found in the supplementary.
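For reference, the core of the Grad-CAM technique cited above [73] can be sketched in a few lines: channel weights are the spatially averaged gradients of the score with respect to a feature map, and the heatmap is a ReLU of the weighted sum of channels. This is a generic sketch of the cited method, not the authors' exact visualization pipeline; `grad_cam_map` is a hypothetical helper name.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """Minimal Grad-CAM-style heatmap (sketch of [73]).

    activations, gradients: arrays of shape (C, H, W) taken from a
    chosen layer. Each channel is weighted by its spatially averaged
    gradient, the weighted channels are summed, and negative values
    are clipped (ReLU) before normalizing to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))             # (C,) channel weights
    cam = np.tensordot(weights, activations, axes=1)  # (H, W) weighted sum
    cam = np.maximum(cam, 0.0)                        # keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1]
    return cam
```

In practice the resulting map is upsampled to the input resolution and overlaid on the image, which is what the qualitative attention figures in papers like this typically show.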