Deformable Convolutional Networks

Dai, Jifeng; Qi, Haozhi; Xiong, Yan; Yi, Li; Zhang, Guodong; Hu, Han; Wei, Yichen

doi:10.1109/iccv.2017.89

Cited by 5,401 publications

(3,440 citation statements)

References 50 publications

Supporting

Mentioning: 3,415

Contrasting

Unclassified


“…Existing approaches basically follow two directions: deformable filters and rotating filters. In [3], a deformable convolution filter was introduced to enhance DCNNs' capacity for modeling geometric transformations by allowing free-form deformation of the sampling grid with offsets learned from the preceding feature maps. However, the deformable filtering is complicated, because it is always associated with the Region of Interest (RoI) pooling technique originally designed for object detection [4].…”

mentioning

confidence: 99%

Gabor Convolutional Networks

Luan

Chen

Zhang

et al. 2018

IEEE Trans. on Image Process.

278

103


Abstract: In steerable filters, a filter of arbitrary orientation can be generated by a linear combination of a set of "basis filters". Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of handling spatial transformations. However, such properties have not yet been well explored in deep convolutional neural networks (DCNNs). In this paper, we develop a new deep model, namely Gabor Convolutional Networks (GCNs or Gabor CNNs), with Gabor filters incorporated into DCNNs such that the robustness of learned features against orientation and scale changes can be reinforced. By manipulating the basic element of DCNNs, i.e., the convolution operator, based on Gabor filters, GCNs can be easily implemented and are readily compatible with any popular deep learning architecture. We carry out extensive experiments to demonstrate the promising performance of our GCNs framework, and the results show its superiority in recognizing objects, especially when scale and rotation changes take place frequently. Moreover, the proposed GCNs have far fewer network parameters to be learned and can effectively reduce the training complexity of the network, leading to a more compact deep learning model while still maintaining a high feature representation capacity. The source code can be found at https://github.com/bczhangbczhang .
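The citation statement above describes deformable convolution as a regular sampling grid deformed by learned offsets. As a rough illustration only, the following NumPy sketch applies a single-channel 3x3 deformable convolution with bilinear interpolation at the shifted sampling points. The function names, the `(H, W, 9, 2)` offset layout, the stride of 1, and the zero padding are assumptions made for this sketch and are not taken from the cited papers. In a real network the offsets would be predicted by an additional convolution over the same input feature map; here they are simply passed in as an array.

```python
import numpy as np


def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2D map at fractional (y, x); out-of-bounds reads as 0."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    # Visit the four integer neighbours of (y, x) and weight them by proximity.
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = max(0.0, 1 - abs(y - yi)) * max(0.0, 1 - abs(x - xi))
                value += weight * feat[yi, xi]
    return value


def deformable_conv2d(feat, kernel, offsets):
    """Single-channel 3x3 deformable convolution (stride 1, zero padding 1).

    feat:    (H, W) input feature map
    kernel:  (3, 3) filter weights
    offsets: (H, W, 9, 2) per-location, per-tap (dy, dx) offsets
             (hypothetical layout chosen for this sketch)
    """
    h, w = feat.shape
    # Regular 3x3 sampling grid, row-major, matching the order of the 9 taps.
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k, (gy, gx) in enumerate(grid):
                oy, ox = offsets[y, x, k]
                # Sample at the regular grid position shifted by the offset.
                acc += kernel[gy + 1, gx + 1] * bilinear_sample(
                    feat, y + gy + oy, x + gx + ox
                )
            out[y, x] = acc
    return out
```

With all offsets set to zero the function reduces to an ordinary 3x3 convolution (cross-correlation) with zero padding, which is a convenient sanity check. The explicit loops are chosen for readability, not speed.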


“…In Dai et al. [19], deformable convolution was achieved by augmenting the input feature map with 2D offsets during convolution. For better understanding from the image processing perspective, we formulated the deformable convolution as follows:…”

Section: Deformable Convolution

mentioning

confidence: 99%

“…In our work, the CNN architecture was developed according to Dai et al. [19], but a different training strategy was used. As shown in Figure 2, based on R-FCN that contains fully convolutional feature maps, RoI pooling and RPN, we used ResNet101 ImageNet pre-trained parameters as the initial values and substituted the res5, res4b22, res4b21 and res4b20 layers with deformable convolution layers.…”

Section: Deformable R-FCN

mentioning

confidence: 99%

“…In the training of deformable ConvNet, we set the learning rate to 0.0005 and performed fine-tuning based on ResNet-101 pre-trained models. The RPN parameters in deformable ConvNet were the same as in Dai et al. [19]. To compare deformable ConvNet's fine-tuned efficiency, we used transferred AlexNet, newly trained AlexNet, RICNN with and without fine-tuning on ImageNet, and R-P-Faster R-CNN with the Zeiler and Fergus (ZF) model or the visual geometry group (VGG) model fine-tuned on ImageNet.…”

mentioning

confidence: 99%


Deformable ConvNet with Aspect Ratio Constrained NMS for Object Detection in Remote Sensing Imagery

Wang

et al. 2017

Remote Sensing

108


Convolutional neural networks (CNNs) have demonstrated their ability in object detection on very high resolution remote sensing images. However, CNNs have obvious limitations for modeling geometric variations in remote sensing targets. In this paper, we introduced a CNN structure, namely deformable ConvNet, to address geometric modeling in object recognition. By adding offsets to the convolution layers, feature mapping of CNN can be applied to unfixed locations, enhancing CNNs' visual appearance understanding. In our work, a deformable region-based fully convolutional network (R-FCN) was constructed by substituting the regular convolution layer with a deformable convolution layer. To efficiently use this deformable convolutional neural network (ConvNet), a training mechanism is developed in our work. We first set the pre-trained R-FCN natural image model as the default network parameters in deformable R-FCN. Then, this deformable ConvNet was fine-tuned on very high resolution (VHR) remote sensing images. To remedy the increase in line-like false region proposals, we developed aspect ratio constrained non-maximum suppression (arcNMS). The precision of deformable ConvNet for detecting objects was then improved. An end-to-end approach was then developed by combining deformable R-FCN, a smart fine-tuning strategy and aspect ratio constrained NMS. The developed method was better than a state-of-the-art benchmark in object detection without data augmentation.
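The abstract above names aspect ratio constrained NMS but does not spell out the rule. The sketch below is therefore only one plausible reading, not the paper's arcNMS: it drops boxes whose long-side to short-side ratio exceeds a bound (to discard elongated, line-like proposals) and then runs standard greedy non-maximum suppression on the rest. The function names, the `max_ratio` parameter, its default of 5.0, and the `[x1, y1, x2, y2]` box format are all assumptions made for illustration.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def aspect_ratio_nms(boxes, scores, iou_thresh=0.5, max_ratio=5.0):
    """Greedy NMS preceded by a hypothetical aspect-ratio filter.

    Returns the indices of the kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    # Long side divided by short side; guard against zero-size boxes.
    ratio = np.maximum(w, h) / np.maximum(np.minimum(w, h), 1e-9)
    keep = []
    for i in np.argsort(-scores):
        if ratio[i] > max_ratio:
            continue  # assumed constraint: discard elongated, line-like boxes
        # Standard greedy step: keep a box only if it does not overlap
        # too much with any higher-scoring box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(int(i))
    return keep
```

Filtering before suppression means an elongated box can never suppress a well-shaped box that overlaps it, which is one reason to apply the constraint first in this sketch.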


“…In Ref. [14], deformable convolutions are used to reformulate the sampling process in convolutions in a learning-based approach. Deformable convolutions can also be regarded as a way of reallocating convolutional weights.…”

Section: Affine Transformation In Deep Network

mentioning

confidence: 99%

Learning adaptive receptive fields for deep image parsing networks

et al. 2018


In this paper, we introduce a novel approach to automatically regulate receptive fields in deep image parsing networks. Unlike previous work which placed much importance on obtaining better receptive fields using manually selected dilated convolutional kernels, our approach uses two affine transformation layers in the network's backbone and operates on feature maps. Feature maps are inflated or shrunk by the new layer, thereby changing the receptive fields in the following layers. By use of end-to-end training, the whole framework is data-driven, without laborious manual intervention. The proposed method is generic across datasets and different tasks. We have conducted extensive experiments on both general image parsing tasks, and face parsing tasks as concrete examples, to demonstrate the method's superior ability to regulate over manual designs.

