The advancement of deep convolutional neural networks (DCNNs) has driven significant improvements in the accuracy of recognition systems for many computer vision tasks. However, their practical deployment is often restricted in resource-constrained environments. In this paper, we introduce projection convolutional neural networks (PCNNs) with discrete back propagation via projection (DBPP) to improve the performance of binarized neural networks (BNNs). The contributions of our paper include: 1) for the first time, the projection function is exploited to efficiently solve the discrete back propagation problem, which leads to new highly compressed CNNs (termed PCNNs); 2) by exploiting multiple projections, we learn a set of diverse quantized kernels that compress the full-precision kernels more efficiently than previously proposed methods; 3) PCNNs achieve the best classification performance compared to other state-of-the-art BNNs on the ImageNet and CIFAR datasets.
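To make the idea concrete, below is a minimal sketch (ours, not the authors' DBPP algorithm) of projection-based quantization: weights are projected onto a discrete set in the forward pass, while a straight-through estimator passes gradients back to the full-precision weights. The names `project` and `ProjectionQuantize` are illustrative.

```python
import torch

def project(weights, levels):
    # Project each full-precision weight onto the nearest value
    # in a discrete set, e.g. {-1, +1} for binarization.
    dist = (weights.unsqueeze(-1) - levels).abs()
    return levels[dist.argmin(dim=-1)]

class ProjectionQuantize(torch.autograd.Function):
    """Discrete forward pass; straight-through backward pass so the
    full-precision weights can still be updated by SGD."""

    @staticmethod
    def forward(ctx, weights, levels):
        return project(weights, levels)

    @staticmethod
    def backward(ctx, grad_output):
        # Pass the gradient through the projection unchanged.
        return grad_output, None

# Example: binarize a bank of 3x3 kernels.
levels = torch.tensor([-1.0, 1.0])
w = torch.randn(8, 3, 3, 3, requires_grad=True)
w_q = ProjectionQuantize.apply(w, levels)
```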
The rapidly decreasing computation and memory cost has recently driven the success of many applications in the field of deep learning. Practical applications of deep learning on resource-limited hardware, such as embedded devices and smartphones, however, remain challenging. For binary convolutional networks, the reason lies in the degraded representation caused by binarizing full-precision filters. To address this problem, we propose new circulant filters (CiFs) and a circulant binary convolution (CBConv) to enhance the capacity of binarized convolutional features via our circulant back propagation (CBP). The CiFs can be easily incorporated into existing deep convolutional neural networks (DCNNs), which leads to new Circulant Binary Convolutional Networks (CBCNs). Extensive experiments confirm that the performance gap between 1-bit and full-precision DCNNs is minimized by increasing the filter diversity, which further increases the representational ability of our networks. Our experiments on ImageNet show that CBCNs achieve 61.4% top-1 accuracy with ResNet18. Compared to the state-of-the-art such as XNOR, CBCNs can achieve up to 10% higher top-1 accuracy with more powerful representational ability.
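As a rough illustration of the idea behind CiFs, the sketch below generates rotated copies of each binarized kernel to increase filter diversity. It assumes 90-degree rotations and XNOR-style scaling; the paper's actual CiF construction and CBP training procedure differ in detail.

```python
import torch

def circulant_filters(weight, n_rot=4):
    # weight: (out_ch, in_ch, k, k). Returns (n_rot, out_ch, in_ch, k, k)
    # holding the kernels rotated by multiples of 90 degrees.
    return torch.stack(
        [torch.rot90(weight, r, dims=(2, 3)) for r in range(n_rot)], dim=0)

def binarize(x):
    # Sign binarization scaled by the mean absolute value,
    # in the spirit of XNOR-style 1-bit filters.
    return x.sign() * x.abs().mean()

w = torch.randn(16, 16, 3, 3)
cifs = binarize(circulant_filters(w))  # four binarized orientations per filter
```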
Deep convolutional neural networks (DCNNs) have dominated recent developments in computer vision, producing a series of record-breaking models. However, it remains a great challenge to deploy powerful DCNNs in resource-limited environments, such as embedded devices and smartphones. Researchers have realized that 1-bit CNNs can be a feasible solution to this issue; however, their performance is still inferior to that of full-precision DCNNs. In this paper, we propose a novel approach, called Bayesian optimized 1-bit CNNs (denoted as BONNs), taking advantage of Bayesian learning, a well-established strategy for hard problems, to significantly improve the performance of extreme 1-bit CNNs. We incorporate the prior distributions of full-precision kernels and features into the Bayesian framework to construct 1-bit CNNs in an end-to-end manner, which has not been considered in any previous related method. The Bayesian losses are derived with theoretical support to optimize the network simultaneously in both continuous and discrete spaces, aggregating different losses jointly to improve model capacity. Extensive experiments on the ImageNet and CIFAR datasets show that BONNs achieve the best classification performance compared to state-of-the-art 1-bit CNNs.
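The following is a much-simplified stand-in (our assumption, not the paper's derivation) for a Bayesian-style kernel loss: a quadratic penalty that pulls each full-precision weight toward the nearest mode of a two-mode prior centered at ±α, added to the ordinary task loss.

```python
import torch

def bayesian_kernel_loss(w, sigma=1.0):
    # Quadratic penalty pulling each weight toward the nearest mode
    # of a two-mode prior centered at +/- alpha, where alpha is the
    # mean absolute weight of each filter (a simplifying assumption).
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)
    nearest_mode = w.sign() * alpha
    return ((w - nearest_mode) ** 2).sum() / (2 * sigma ** 2)

w = torch.randn(16, 8, 3, 3, requires_grad=True)
task_loss = torch.tensor(0.0)  # placeholder for cross-entropy, etc.
total_loss = task_loss + 1e-4 * bayesian_kernel_loss(w)
total_loss.backward()
```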
Compression artifacts reduction (CAR) is a challenging problem in the field of remote sensing. Most recent deep learning based methods have demonstrated superior performance over previous hand-crafted methods. In this paper, we propose an end-to-end one-two-one (OTO) network that combines different deep models, i.e., a summation model and a difference model, to solve the CAR problem. In particular, the difference model, motivated by the Laplacian pyramid, is designed to obtain the high-frequency information, while the summation model aggregates the low-frequency information. We provide an in-depth investigation of the OTO architecture based on the Taylor expansion, which shows that these two kinds of information can be fused in a nonlinear scheme to gain more capacity for handling complicated image compression artifacts, especially the blocking effect in compression. Extensive experiments demonstrate the superior performance of OTO networks compared to the state of the art on remote sensing datasets and other benchmark datasets. The source code will be made available.

[Figure 2: The sub-networks of the OTO network: ResNet (R) with stacked ResUnits, DenseNet (D) with stacked DenseUnits, and a classic CNN (C) with stacked CnnUnits, feeding the difference and summation models.]

ResNet (R): For each ResUnit, we follow the latest variant proposed in [51], which is more powerful than its predecessors. More specifically, in each ResUnit, a batch normalization layer [52], a ReLU layer [53], and a convolution layer are stacked twice in sequence.

[Figure 3: The architecture of OTO(Linear) with a learned α to balance the two branch networks.]
[Figure 4: The architecture of OTO(Sum) without the difference model.]

DenseNet (D): Inspired by Densely Connected Convolutional Networks [54], we adopt a connectivity pattern that further improves the information flow between layers: we introduce direct connections from any layer to all subsequent layers. Compared with ResNet, the feature fusion method in DenseNet changes from addition to concatenation, resulting in wider feature maps. The growth rate k, an important hyperparameter in DenseNet, controls how fast the width of the feature maps grows; in our implementation, we set k to 8. Each DenseUnit follows the same pre-activation style as the ResUnit, except that the number of convolutional layers is reduced to one. As can be seen in Fig. 2, five DenseUnits are stacked sequentially, followed by a convolutional layer that reduces the width of the feature maps so that they can be fused with the other sub-network.

Classic CNN (C): The classic CNN model only takes advantage of convolutional layers and activation layers. Each CnnUnit consists of one convolutional layer and one ReLU layer, ...
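Putting the described pieces together, here is a minimal sketch of the pre-activation ResUnit and an OTO(Linear)-style fusion, in which the sum of the two branches carries the low-frequency content and their difference the high-frequency content, balanced by a learned α. Channel width, the pooling operator, and the final residual reconstruction are our assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResUnit(nn.Module):
    # Pre-activation residual unit: (BN -> ReLU -> Conv) stacked twice.
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class OTOLinear(nn.Module):
    # Normal-scale and small-scale branches; their sum carries the
    # low-frequency content and their difference the high-frequency
    # content, balanced by a learned alpha (OTO(Linear)-style fusion).
    def __init__(self, ch=64, n_units=5):
        super().__init__()
        self.head = nn.Conv2d(1, ch, 3, padding=1)  # grayscale input assumed
        self.normal = nn.Sequential(*[ResUnit(ch) for _ in range(n_units)])
        self.small = nn.Sequential(*[ResUnit(ch) for _ in range(n_units)])
        self.tail = nn.Conv2d(ch, 1, 3, padding=1)
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        f = self.head(x)
        n = self.normal(f)
        s = F.interpolate(self.small(F.avg_pool2d(f, 2)), scale_factor=2,
                          mode='bilinear', align_corners=False)
        fused = (n + s) + self.alpha * (n - s)  # summation + weighted difference
        return self.tail(fused) + x  # residual reconstruction

net = OTOLinear()
y = net(torch.randn(1, 1, 64, 64))
```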