This paper proposes a Big Data approach to automatically identify and extract buildings from a digital surface model created from aerial laser scanning data. The approach consists of two steps. The first step is a MapReduce process in which neighboring points in a digital surface model are mapped into cubes. The second step uses a non-MapReduce algorithm, first to remove trees and other obstructions and then to extract adjacent cubes. According to this approach, all adjacent cubes belong to the same object, and an object is a set of adjacent cubes that belong to one or more adjacent buildings. Finally, an evaluation study of a section of Dublin, Ireland, demonstrates the applicability of the approach, which achieved a 92% quality level for the extraction of 106 buildings over 1 km², including buildings that had more than 10 adjacent components of different heights and complicated roof geometries. The proposed approach is notable not only for its Big Data context but also for its use of vector data.
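The two steps described above can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the function names, the cube size, the sample points, and the choice of a 26-connected neighbourhood for "adjacent" are all assumptions made for illustration, and the tree-removal stage is omitted entirely.

```python
from collections import defaultdict, deque
from itertools import product


def map_point(point, cube_size):
    """Map step: emit (cube index, point) for one (x, y, z) point."""
    x, y, z = point
    key = (int(x // cube_size), int(y // cube_size), int(z // cube_size))
    return key, point


def reduce_cubes(pairs):
    """Reduce step: collect all points that share the same cube index."""
    cubes = defaultdict(list)
    for key, point in pairs:
        cubes[key].append(point)
    return dict(cubes)


# Assumption: cubes touching by a face, an edge, or a corner count as adjacent.
NEIGHBOUR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def group_adjacent_cubes(cube_keys):
    """Group occupied cubes into objects (connected components) by breadth-first search."""
    remaining = set(cube_keys)
    objects = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            cx, cy, cz = queue.popleft()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (cx + dx, cy + dy, cz + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        objects.append(component)
    return objects


# Hypothetical sample points: three close together and one far away.
points = [(0.2, 0.3, 0.1), (1.4, 0.6, 0.2), (1.9, 1.2, 0.8), (10.5, 10.1, 0.3)]
cubes = reduce_cubes(map_point(p, cube_size=1.0) for p in points)
objects = group_adjacent_cubes(cubes)
```

In a real MapReduce job, the map and reduce functions would run in parallel across partitions of the point cloud; here they are ordinary Python functions so that the data flow is easy to follow.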
This paper proposes an approach to automatically classify, localize, and extract urban objects such as buildings and the ground surface from a digital surface model created from aerial laser scanning data. The approach involves three steps: 1) dividing the original data into smaller, more manageable pieces using a method based on MapReduce gridding for subspace partitioning; 2) applying the DBSCAN algorithm to identify interesting subspaces depending on point density; and 3) grouping the identified subspaces to form potential objects. The method was validated using an architecturally dense and complex portion of Dublin, Ireland. The best results were achieved with a 1 m³ clustering cube, for which the number of classified clusters equaled the number derived manually, with the following scores: correctness = 84.91%, completeness = 84.39%, and quality = 84.65%.
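The density-based step can be illustrated with a generic, textbook-style DBSCAN over 3D points. The abstract does not state what DBSCAN is run on or which parameters were used, so the `eps` and `min_pts` values, the function names, and the sample points below are assumptions; the brute-force neighbour search is only suitable for small inputs.

```python
import math


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbours)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                seeds.extend(j_neighbours)  # j is a core point, so expand through it
    return labels


# Hypothetical sample: two dense groups and one isolated point.
sample = [
    (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0),
    (5.0, 5.0, 5.0), (5.4, 5.0, 5.0), (5.0, 5.4, 5.0),
    (20.0, 20.0, 20.0),
]
labels = dbscan(sample, eps=1.0, min_pts=3)
```

With these assumed parameters the two dense groups receive separate cluster ids and the isolated point is marked as noise, which mirrors the idea of keeping only subspaces with sufficient point density.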
Arabic is the most widely spoken language in the Arab World. Most people of the Islamic World understand Classical Arabic because it is the language of the Qur'an. Although the number of Arabic Internet users (in the Middle East and in North and East Africa) has increased considerably in the last decade, systems to analyze Arabic digital resources automatically are not as easily available as they are for English. Therefore, in this work, an attempt is made to build a real-time Named Entity Recognition system that can be used in web applications to detect the appearance of specific named entities and events in news written in Arabic. Arabic is a highly inflectional language, so we try to minimize the impact of Arabic affixes on the quality of the pattern recognition model applied to identify named entities. These patterns are built up by processing and integrating different gazetteers, from DBPedia (http://dbpedia.org/About, 2009) to GATE (A general architecture for text engineering, 2009) and ANERGazet
While big data technologies are growing rapidly and benefit a wide range of science and engineering domains, many barriers remain for the remote sensing community to fully exploit the benefits provided by these emerging powerful technologies. To overcome these barriers, this paper presents the in-depth experience gained when adopting a distributed computing framework, Hadoop HBase, for storage, indexing, and integration of large-scale, high-resolution laser scanning point cloud data. Four data models were conceptualized, implemented, and rigorously investigated to explore the advantageous features of distributed, key-value database systems. In addition, the comparison of the four models facilitated the reassessment of several well-known point cloud management techniques established in traditional computing environments in the new context of the distributed, key-value database. The four models were derived from two row-key designs and two column structures, thereby demonstrating various considerations during the development of a data solution for a high-resolution, city-scale aerial laser scan of a portion of Dublin, Ireland. This paper presents lessons learned from the data model design and its implementation for spatial data management in a distributed computing framework. The study is a step towards full exploitation of powerful emerging computing assets for dense spatio-temporal data.