Francesco Lettich scite author profile

The ability to timely process significant amounts of continuously updated spatial data is mandatory for an increasing number of applications. In this paper we focus on a specific data-intensive problem concerning the repeated processing of huge amounts of k nearest neighbours (k-NN) queries over massive sets of moving objects, where the spatial extents of queries and the position of objects are continuously modified over time. In particular, we propose a novel hybrid CPU/GPU pipeline that significantly accelerate query processing thanks to a combination of ad-hoc data structures and non-trivial memory access patterns. To the best of our knowledge this is the first work that exploits GPUs to efficiently solve repeated k-NN queries over massive sets of continuously moving objects, even characterized by highly skewed spatial distributions. In comparison with state-of-the-art sequential CPU-based implementations, our method highlights significant speedups in the order of 10x-20x, depending on the datasets, even when considering cheap GPUs

show abstract

Leveraging feature selection to detect potential tax fraudsters

Matos¹,

Macêdo²,

Lettich³

et al. 2020

Expert Systems with Applications

View full text Add to dashboard Cite

Detecting avoidance behaviors between moving object trajectories

Lettich

Älvares

Bogorny

et al. 2016

Data & Knowledge Engineering

View full text Add to dashboard Cite

Parallel Traversal of Large Ensembles of Decision Trees

Lettich

Lucchese

Nardini

et al. 2019

IEEE Trans. Parallel Distrib. Syst.

View full text Add to dashboard Cite

Machine-learnt models based on additive ensembles of regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. The deployment of such models is computationally demanding: to compute the final prediction, the whole ensemble must be traversed by accumulating the contributions of all its trees. In particular, traversal cost impacts applications where the number of candidate items is large, the time budget available to apply the learnt model to them is limited, and the users' expectations in terms of quality-of-service is high. Document ranking in web search, where sub-optimal ranking models are deployed to find a proper trade-off between efficiency and effectiveness of query answering, is probably the most typical example of this challenging issue. This paper investigates multi/many-core parallelization strategies for speeding up the traversal of large ensembles of regression trees thus obtaining machine-learnt models that are, at the same time, effective, fast, and scalable. Our best results are obtained by the GPU-based parallelization of the state-of-the-art algorithm, with speedups of up to 102.6x.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.