Andrew Sugaya scite author profile

Andrew Sugaya

4 Publications

30 Citation Statements Received

73 Citation Statements Given

Affiliations

Massachusetts Institute of Technology, Vassar College

Publications

An effective coreset compression algorithm for large scale sensor networks

Feldman, Sugaya, Rus (2012)

The wide availability of networked sensors such as GPS and cameras is enabling the creation of sensor networks that generate huge amounts of data. For example, vehicular sensor networks where in-car GPS sensor probes are used to model and monitor traffic can generate on the order of gigabytes of data in real time. How can we compress streaming high-frequency data from distributed sensors? In this paper we construct coresets for streaming motion. The coreset of a data set is a small set which approximately represents the original data. Running queries or fitting models on the coreset will yield similar results when applied to the original data set. We present an algorithm for computing a small coreset of a large sensor data set. Surprisingly, the size of the coreset is independent of the size of the original data set. Combining map-and-reduce techniques with our coreset yields a system capable of compressing in parallel a stream of O(n) points using space and update time that is only O(log n). We provide experimental results and compare the algorithm to the popular Douglas-Peucker heuristic for compressing GPS data.
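For readers unfamiliar with the baseline named in the abstract, the Douglas-Peucker heuristic can be sketched as below. This is a minimal illustration of the standard polyline-simplification heuristic for 2D points, not the paper's coreset algorithm. The function names `douglas_peucker` and `point_segment_distance`, the tolerance parameter `epsilon`, and the sample `track` are assumptions made for this sketch.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = -1.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is close enough to the chord: drop them all.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # avoid duplicating the shared split point


# Hypothetical GPS-like track: a slightly noisy "L" shape.
track = [(0.0, 0.0), (1.0, 0.05), (2.0, -0.05), (3.0, 0.0),
         (3.0, 1.0), (3.05, 2.0), (3.0, 3.0)]
simplified = douglas_peucker(track, epsilon=0.1)
print(simplified)  # -> [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0)]
```

Unlike a coreset, the output size of this heuristic depends on the shape of the input and the chosen tolerance, and the recursion needs the whole polyline in memory rather than a stream.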

An effective coreset compression algorithm for large scale sensor networks

Feldman, Sugaya, Rus (2012)

iDiary

Feldman, Sung, Sugaya, et al. (2015). ACM Trans. Sen. Netw.

This article describes iDiary, a system that takes as input GPS data streams generated by users’ phones and turns them into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries on their activities (e.g., “Where did I buy books?”) and receive textual answers based on their GPS signals. iDiary uses novel algorithms for semantic compression and trajectory clustering of massive GPS signals in parallel to compute the critical locations of a user. We encode these problems as follows. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k,m)-segment mean has an additional constraint that the projection of the k segments on R^d consists of only m ≤ k segments. A coreset for this problem is a smart compression of the input signal that allows computation of a (1+ε)-approximation to its k-segment or (k,m)-segment mean in O(n log n) time for arbitrary constants ε, k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. Using an external database, we then map these locations to textual descriptions and activities so that we can apply text mining techniques on the resulting data (e.g., LSA or transportation mode recognition). We provide experimental results for both the system and algorithms and compare them to existing commercial and academic state of the art. This is the first GPS system that enables text-searchable activities from GPS data.
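To make the k-segment mean defined in the abstract concrete, the sketch below solves it exactly with a brute-force dynamic program. It rests on several assumptions that the abstract does not state: the signal is one-dimensional (d = 1), "regression distance" is taken to mean the sum of squared vertical residuals, and the k line segments need not join continuously. It runs in O(k n^2) time and is not the article's coreset construction, which targets a (1+ε)-approximation far more cheaply. The function name `k_segment_mean` and the sample signal are hypothetical.

```python
def k_segment_mean(ts, ys, k):
    """Exact k-segment least-squares fit of the signal (ts[i], ys[i]).

    Returns (total squared error, list of half-open index ranges (start, end)).
    """
    n = len(ts)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(ts)")

    # Prefix sums so the fit error of any index range costs O(1).
    pt, py, ptt, pty, pyy = [0.0], [0.0], [0.0], [0.0], [0.0]
    for t, y in zip(ts, ys):
        pt.append(pt[-1] + t)
        py.append(py[-1] + y)
        ptt.append(ptt[-1] + t * t)
        pty.append(pty[-1] + t * y)
        pyy.append(pyy[-1] + y * y)

    def sse(a, b):
        """Squared error of the best-fit line through points a..b-1."""
        m = b - a
        st, sy = pt[b] - pt[a], py[b] - py[a]
        stt = (ptt[b] - ptt[a]) - st * st / m
        sty = (pty[b] - pty[a]) - st * sy / m
        syy = (pyy[b] - pyy[a]) - sy * sy / m
        if stt <= 1e-12:
            # All times coincide (e.g. a single point): fit a constant.
            return max(0.0, syy)
        return max(0.0, syy - sty * sty / stt)

    inf = float("inf")
    # best[j][i]: minimum error covering the first i points with j segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                cost = best[j - 1][s] + sse(s, i)
                if cost < best[j][i]:
                    best[j][i] = cost
                    back[j][i] = s

    # Walk the back-pointers to recover the segment boundaries.
    bounds, i = [], n
    for j in range(k, 0, -1):
        s = back[j][i]
        bounds.append((s, i))
        i = s
    bounds.reverse()
    return best[k][n], bounds


# Hypothetical tent-shaped signal: rises for three samples, then falls.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
values = [0.0, 1.0, 2.0, 2.0, 1.0, 0.0]
error, segments = k_segment_mean(times, values, k=2)
print(segments)  # -> [(0, 3), (3, 6)]
```

The quadratic inner loop is what makes this impractical for massive GPS signals, which is the gap a small coreset is meant to close: the same kind of solver could be run on the compressed signal instead of the raw one.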

iDiary

Feldman, Sugaya, Sung, et al. (2013)

This paper describes a system that takes as input GPS data streams generated by users' phones and creates a searchable database of locations and activities. The system is called iDiary and turns large GPS signals collected from smartphones into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries on their activities (e.g., "Where did I buy books?") and receive textual answers based on their GPS signals. iDiary uses novel algorithms for semantic compression (known as coresets) and trajectory clustering of massive GPS signals in parallel to compute the critical locations of a user. Using an external database, we then map these locations to textual descriptions and activities so that we can apply text mining techniques on the resulting data (e.g., LSA or transportation mode recognition). We provide experimental results for both the system and algorithms and compare them to existing commercial and academic state of the art. This is the first GPS system that enables text-searchable activities from GPS data.
