This paper presents a study of Geography Markup Language (GML) and the issues that arise from using GML for spatial applications, including storage, parsing, querying, and visualization, as well as the use of GML for mobile devices and web services. GML is a modeling language developed by the Open Geospatial Consortium (OGC) as a medium for uniform geographic data storage and exchange among diverse applications. Many new XML-based languages are being developed as open standards in various application areas. It would be beneficial to integrate such languages with GML during the developmental stages, taking full advantage of a non-proprietary universal standard. As GML is a relatively new language still in development, data processing techniques need further refinement for GML to become a more efficient medium for geospatial applications.
Spatial outliers are spatial objects whose nonspatial attribute values differ markedly from those of their spatial neighbors. Identification of spatial outliers is an important task for data mining researchers and geographers. A number of algorithms have been developed to detect spatial anomalies in meteorological images, transportation systems, and contagious disease data. In this paper, we propose a set of graph-based algorithms to identify spatial outliers. Our method first constructs a graph based on the k-nearest neighbor relationship in the spatial domain, assigns the differences in nonspatial attribute values as edge weights, and successively cuts high-weight edges to identify isolated points or regions that are highly dissimilar to their neighboring objects. The proposed algorithms have three major advantages over existing spatial outlier detection methods: they accurately detect both point and region outliers, they avoid false outliers, and they can compute the local outlierness of an object within subgraphs. We present the time complexity of the algorithms and report experiments conducted on US housing and Census data to demonstrate the effectiveness of the proposed approaches.
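The graph-cutting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the stopping rule (stop at the first cut that isolates a component of at most `max_size` nodes), and the `min_weight` cutoff are assumptions made here for illustration, and the brute-force neighbor search is chosen for brevity rather than efficiency.

```python
from collections import defaultdict
from math import dist


def knn_edges(coords, values, k):
    """Undirected k-NN edges {(i, j): weight}; weight = |values[i] - values[j]|."""
    edges = {}
    for i, p in enumerate(coords):
        # Brute-force spatial neighbor search (fine for a small illustration).
        neighbors = sorted((j for j in range(len(coords)) if j != i),
                           key=lambda j: dist(p, coords[j]))[:k]
        for j in neighbors:
            edges[(min(i, j), max(i, j))] = abs(values[i] - values[j])
    return edges


def components(n, edges):
    """Connected components of an n-node graph, found by iterative DFS."""
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


def graph_cut_outliers(coords, values, k=2, max_size=1, min_weight=0.0):
    """Cut edges from heaviest to lightest until a small component is isolated.

    max_size and min_weight are hypothetical parameters: components with at
    most max_size nodes are reported, and edges lighter than min_weight are
    never cut, so homogeneous data yields no outliers.
    """
    edges = knn_edges(coords, values, k)
    remaining = dict(edges)
    for edge in sorted(edges, key=edges.get, reverse=True):
        if edges[edge] < min_weight:
            break
        del remaining[edge]
        small = [c for c in components(len(coords), remaining)
                 if len(c) <= max_size]
        if small:
            return sorted(i for c in small for i in c)
    return []


# Five points on a line; the middle one has an unusual attribute value.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [10, 11, 50, 12, 11]
print(graph_cut_outliers(coords, values, k=2, min_weight=5))  # → [2]
```

Raising `max_size` above 1 lets the same loop report small isolated regions rather than single points, which loosely mirrors the point-versus-region distinction drawn in the abstract.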
Storytelling connects entities (people, organizations) using their observed relationships to establish meaningful storylines. This can be extended to spatiotemporal storytelling, which incorporates locations, time, and graph computations to enhance coherence and meaning. When performed sequentially, however, these computations become a bottleneck because the massive number of entities makes space and time complexity untenable. This article presents DISCRN, or distributed spatiotemporal ConceptSearch-based storytelling, a distributed framework for performing spatiotemporal storytelling. The framework extracts entities from microblogs and event data, and links these entities using a novel ConceptSearch to derive storylines in a distributed fashion using the key-value pair paradigm. Performing these operations at scale allows deeper and broader analysis of storylines. The novel parallelization techniques speed up the generation and filtering of storylines on massive datasets. Experiments with microblog posts, such as Twitter data, and with Global Database of Events, Language, and Tone events show the efficiency of the techniques in DISCRN.