Optimization approaches to MPI and area merging-based parallel buffer algorithm

Fan, Jiangwen; Ji, Min; Gu, Guomin; Sun, Yihan

doi:10.1590/s1982-21702014000200015

Cited by 14 publications

(16 citation statements)

References 18 publications

“…The first strategy is based on the MPI (message passing interface) protocol, which supports the parallel execution of buffer analysis algorithms on HPC (high-performance computing) cluster computers [14][15][16][17] but suffers from weak extensibility when faced with the big spatial data scenarios. The second strategy comes from the distributed high-performance computing framework for big data, MapReduce and Spark, which have recently gained much attention and popularity in the field of geoanalysis.…”

Section: Input Buffers With Intact Boundaries (mentioning)

confidence: 99%

“…The solution was applied to a two-node self-build cluster with a dataset containing 200,000 features. Fan [17] proposed a parallel buffer algorithm based on area merging and vertex partitioning to improve the performances of buffer analyses when processing large datasets. Fan implemented the algorithm in the MPI programming model, which is the mainstream architecture in high-performance computing.…”

Section: Related Work (mentioning)

confidence: 99%

“…We found that space-filling curves have good advantages in terms of the aggregation of neighboring spatial objects when compared with an ordinary grid [14]. Space-filling curves in parallel buffer analysis has not been addressed in recent research [16,17,32], though in his literature [17], Fan presented this topic as a future research direction. Therefore, we present a more efficient partitioning method based on space-filling curves, which essentially contains two steps.…”

Section: Hilbert-curve-based Data Partition (mentioning)

confidence: 99%

“…This strategy is practically supported in two ways. Primarily, the splitting-and-conquer method is one of the ways to create a buffer zone [17]. Moreover, the topology relationship between split lines is insignificant for the dissolve-type buffer result.…”

Section: Boundary Object Processing (mentioning)

confidence: 99%

“…Second, a single high-density partitioned series D may require a lengthy calculation time, thereby causing processing congestion. To solve this problem, an even decomposition strategy, which decomposes the data according to the number of vertices, is utilized [16,17,32]. Our goal is to minimize the chances of producing a single large data block as well as control the total number of data blocks.…”

Section: Grid-based Accumulative Data Decomposition (mentioning)

confidence: 99%
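
The citation statement above describes an even decomposition driven by vertex counts. As a rough illustration only, and not the cited authors' implementation, the idea can be sketched as splitting an ordered list of features into contiguous blocks whose vertex totals are roughly equal. The function name `decompose_by_vertices` and its parameters are hypothetical, and the greedy boundary rule is an assumption chosen for simplicity.

```python
from typing import List, Sequence


def decompose_by_vertices(vertex_counts: Sequence[int], num_blocks: int) -> List[List[int]]:
    """Split feature indices into at most num_blocks contiguous blocks
    with roughly equal vertex totals (hypothetical helper, greedy rule)."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")

    total = sum(vertex_counts)
    blocks: List[List[int]] = [[]]
    running = 0  # vertices assigned so far, across all blocks

    for idx, count in enumerate(vertex_counts):
        # Cumulative share of vertices the blocks opened so far may hold.
        boundary = total * len(blocks) / num_blocks
        # Open a new block when this feature would push past that share,
        # unless the current block is empty or the block limit is reached.
        if blocks[-1] and running + count > boundary and len(blocks) < num_blocks:
            blocks.append([])
        blocks[-1].append(idx)
        running += count

    return blocks


if __name__ == "__main__":
    # One dense feature ends up alone; the sparse features share a block.
    print(decompose_by_vertices([100, 1, 1, 1, 1], 2))
```

Balancing on vertex counts rather than feature counts keeps one dense feature from dominating a block, which is the congestion the statement refers to. A single feature larger than the per-block share still forms an oversized block in this sketch.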

Approach to Accelerating Dissolved Vector Buffer Generation in Distributed In-Memory Cluster Architecture

Shen, Chen, et al. 2018

IJGI

Abstract: The buffer generation algorithm is a fundamental function in GIS, identifying areas of a given distance surrounding geographic features. Past research largely focused on buffer generation algorithms generated in a stand-alone environment. Moreover, dissolved buffer generation is data- and computing-intensive. In this scenario, the improvement in the stand-alone environment is limited when considering large-scale mass vector data. Nevertheless, recent parallel dissolved vector buffer algorithms suffer from scalability problems, leaving room for further optimization. At present, the prevailing in-memory cluster-computing framework, Spark, provides promising efficiency for computing-intensive analysis; however, it has seldom been researched for buffer analysis. On this basis, we propose a cluster-computing-oriented parallel dissolved vector buffer generating algorithm, called the HPBM, that contains a Hilbert-space-filling-curve-based data partition method, a data skew and cross-boundary objects processing strategy, and a depth-given tree-like merging method. Experiments are conducted in both stand-alone and cluster environments using real-world vector data that include points and roads. Compared with some existing parallel buffer algorithms, as well as various popular GIS software, the HPBM achieves a performance gain of more than 50%.

Geographical information system parallelization for spatial big data processing: a review

et al. 2015

An enhanced active caching strategy for data-intensive computations in distributed GIS

et al. 2017

Caching can prepare data for computational tasks in advance by tracking the requirements and behaviors of distributed geographical information systems to reduce network latency and improve computational performance. This paper presents an enhanced method to actively cache data for data-intensive computations that considers both data relationships and the timeliness of those relationships. First, the access correlations, the correlation steps and the times of the correlations are computed based on the behaviors of the computational tasks. Because the influence of historically accessed records will decrease gradually over time, only recently accessed records are used. To track changes in the relationships and prevent cache waste problems, each record is given a different age-based weight. A conditional caching probability can then be computed based on the timeliness relationships, which can be used to find the appropriate data to compute simultaneously. Finally, we present several experiments that compare the proposed method with techniques that use other data placement strategies, active caching strategies and passive caching algorithms. The results show that the proposed model has better performance than other algorithms in all respects. In addition, the proposed model results in a lower cache replacement ratio. The experiments with different data sets on different data scales indicate that the proposed algorithm can also be used in large-scale distributed environments.
