Designing discriminative and powerful texture features that are robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and the analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distributions of local features. Most recent approaches to texture recognition and remote sensing scene classification, however, are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input, with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP-based texture information, provide information complementary to standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate binary-pattern-encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories, and the recently introduced large-scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide information complementary to a standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state of the art for remote sensing scene classification.

Recently, Convolutional Neural Networks (CNNs) have revolutionised computer vision, serving as the catalyst for significant performance gains in many vision applications, including texture recognition [26] and remote sensing scene classification [27,28]. CNNs and other "deep networks" are generally trained on large amounts of labeled training data (e.g. ImageNet [29]) with fixed-size raw image pixels as input. A deep network consists of several convolution and pooling operations followed by one or more fully connected (FC) layers. Several works [30,31] have shown that the intermediate activations of the FC layers in a deep network pre-trained on the ImageNet dataset are general-purpose features applicable to visual recognition tasks. Approaches based on deep features have been shown to provide the best results in recent evaluations of texture recognition [4] and remote sensing scene classification [32].

As mentioned above, the de facto practice is to train deep models on the ImageNet dataset using RGB values of the image patch as an input to the network. These pre-trained RGB deep networks are typically employed...
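To make the two key ingredients concrete, the sketch below is a minimal, hypothetical illustration rather than the paper's actual TEX-Net pipeline: an LBP-coded image is produced per color channel with scikit-image's uniform LBP (the per-channel encoding and the rescaling to [0, 255] are our assumptions), and a small two-stream network fuses the RGB and LBP-coded streams at the fully connected stage, the late-fusion design the abstract refers to. The real TEX-Nets instead reuse full ImageNet-pretrained backbones per stream.

```python
import numpy as np
import torch
import torch.nn as nn
from skimage.feature import local_binary_pattern


def lbp_encode(rgb, n_points=8, radius=1):
    """Map each RGB channel to its uniform-LBP code image.

    Returns a 3-channel uint8 image of the same spatial size that can
    replace raw RGB as the input of a CNN stream. Encoding the three
    channels independently is an assumption made for this sketch; the
    paper defines its own LBP-to-image mapping.
    """
    coded = np.empty_like(rgb, dtype=np.float64)
    for c in range(3):
        coded[..., c] = local_binary_pattern(
            rgb[..., c], P=n_points, R=radius, method="uniform")
    # Uniform LBP with P sampling points yields codes in [0, P + 1];
    # rescale to the usual 8-bit image range.
    return (coded / (n_points + 1) * 255).astype(np.uint8)


class LateFusionTexNet(nn.Module):
    """Toy two-stream late fusion: one CNN on RGB, one on LBP-coded input.

    Each stream produces a descriptor; the descriptors are concatenated
    and classified jointly, i.e. fusion happens after the convolutional
    stages (late fusion) rather than at the input (early fusion).
    """

    def __init__(self, num_classes):
        super().__init__()

        def stream():
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())

        self.rgb_stream = stream()
        self.lbp_stream = stream()
        self.classifier = nn.Linear(64 * 2, num_classes)

    def forward(self, rgb, lbp):
        # Concatenate the per-stream descriptors, then classify.
        fused = torch.cat([self.rgb_stream(rgb), self.lbp_stream(lbp)], dim=1)
        return self.classifier(fused)
```

Keeping the streams separate until the descriptor stage lets each network specialize (color appearance vs. explicit texture) before their evidence is combined, which is the intuition behind the late-fusion variant performing best in the abstract above.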
Due to the high cost of traditional forest plot measurements, the availability of up-to-date in situ forest inventory data has been a bottleneck for remote sensing image analysis in support of important global forest biomass mapping efforts. Capitalizing on the proliferation of smartphones, citizen science is a promising approach to increasing the spatial and temporal coverage of in situ forest observations in a cost-effective way. A digital camera can be used as a relascope to measure basal area, a forest density variable that is closely related to biomass. In this paper, we present the Relasphone mobile application together with an extensive accuracy assessment in two mixed-forest sites from different biomes. Basal area measurements in Finland (boreal zone) were in good agreement with reference forest inventory plot data for pine (R² = 0.75, RMSE = 5.33 m²/ha), spruce (R² = 0.75, RMSE = 6.73 m²/ha) and birch (R² = 0.71, RMSE = 4.98 m²/ha), with a total relative RMSE of 29.66%. In Durango, Mexico (temperate zone), Relasphone stem volume measurements were best for pine (R² = 0.88, RMSE = 32.46 m³/ha) and total stem volume (R² = 0.87, RMSE = 35.21 m³/ha). Relasphone data were then successfully utilized as the only reference data, in combination with optical satellite images, to produce biomass maps. The Relasphone concept has thus been validated for future use by citizens in other locations.
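For readers unfamiliar with the relascope principle behind the app: in Bitterlich angle-count sampling, a tree is tallied when its stem appears wider than the gauge's fixed viewing angle, and each tallied tree contributes one basal area factor (BAF) to the per-hectare estimate. A minimal sketch follows; the function name and the BAF value are assumptions for illustration, and the Relasphone app's internal calculation may differ.

```python
def basal_area_per_ha(tally_count, baf=1.0):
    """Bitterlich (angle-count) estimate of stand basal area.

    Each tree whose stem subtends a wider angle than the gauge is
    tallied once, and basal area (m^2/ha) = BAF * number of tallies.
    baf: basal area factor of the gauge, in m^2/ha per tallied tree.
    """
    return baf * tally_count


# Example: 12 trees tallied through a BAF-2 gauge -> 24 m^2/ha.
print(basal_area_per_ha(12, baf=2.0))
```

The elegance of the method, and the reason a phone camera can implement it, is that no distances or diameters need to be measured: the fixed angle automatically weights each tree by its basal area.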