Fuheng Zhao scite author profile

Fuheng Zhao

3 Publications

6 Citation Statements Received

43 Citation Statements Given


Affiliations

University of California, Santa Barbara

Publications


SpaceSaving±

et al. 2022


In this paper, we propose the first deterministic algorithms to solve the frequency estimation and frequent items problems in the bounded-deletion model. We establish the space lower bound for solving the deterministic frequent items problem in the bounded-deletion model, and propose the Lazy SpaceSaving± and SpaceSaving± algorithms with optimal space bounds. We develop an efficient implementation of the SpaceSaving± algorithm that minimizes the latency of update operations using novel data structures. Experimental evaluations show that SpaceSaving± produces accurate frequency estimates and achieves very high recall and precision across different data distributions while using minimal space. Our experiments clearly demonstrate that, given the same space, SpaceSaving± is more accurate than the state-of-the-art protocols with up to (log U − 1)/log U of the items deleted, where U is the size of the input universe. Moreover, motivated by prior work, we propose Dyadic SpaceSaving±, the first deterministic quantile approximation sketch in the bounded-deletion model.
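As a rough illustration of the kind of counter-based summary this abstract describes, the following is a minimal sketch of the classic SpaceSaving insertion rule combined with a lazy deletion rule that decrements a counter only when the deleted item is currently monitored. The class name `LazySpaceSavingSketch`, the `capacity` parameter, and the exact deletion handling are assumptions made for illustration and are not taken from the paper. This dictionary-based version also does not use the efficient data structures the abstract mentions.

```python
class LazySpaceSavingSketch:
    """Illustrative counter summary (hypothetical name, not the paper's code)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity   # maximum number of monitored items
        self.counts = {}           # monitored item -> estimated count

    def insert(self, item):
        if item in self.counts:
            # Item is already monitored: increment its counter.
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            # Free slot available: start monitoring the item.
            self.counts[item] = 1
        else:
            # Classic SpaceSaving rule: evict the item with the minimum
            # count and let the new item inherit that count plus one.
            victim = min(self.counts, key=self.counts.get)
            min_count = self.counts.pop(victim)
            self.counts[item] = min_count + 1

    def delete(self, item):
        # Assumed lazy rule: only monitored items are decremented;
        # deletions of unmonitored items are ignored.
        if item in self.counts:
            self.counts[item] -= 1

    def estimate(self, item):
        # Unmonitored items are reported with an estimate of 0.
        return self.counts.get(item, 0)


sketch = LazySpaceSavingSketch(capacity=2)
for x in ["a", "a", "b", "c"]:
    sketch.insert(x)
sketch.delete("a")
```

The linear scan for the minimum counter keeps the example short; a practical implementation would need a structure that finds the minimum faster.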


Differentially Private Linear Sketches: Efficient Implementations and Applications

Zhao, Qiao, Redberg et al. 2022

Preprint


Linear sketches have been widely adopted to process fast data streams, and they can be used to accurately answer frequency estimation queries, approximate the top-K items, and summarize data distributions. When data are sensitive, it is desirable to provide privacy guarantees for linear sketches to preserve private information while delivering useful results with theoretical bounds. We show that linear sketches can ensure privacy and maintain their unique properties with a small amount of noise added at initialization. Building on the differentially private linear sketches, we show that the state-of-the-art quantile sketch in the turnstile model can also be made private while maintaining high performance. Experiments further demonstrate that our proposed differentially private sketches are quantitatively and qualitatively similar to noise-free sketches with high utilization on synthetic and real datasets.
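To make the idea of "noise added at initialization" concrete, here is a minimal sketch of a Count-Min style linear sketch whose counters start from random noise instead of zero. Everything specific here is an assumption for illustration: the class name `NoisyCountMin`, the choice of Laplace noise, the `noise_scale` parameter, and the hash construction are not taken from the paper, and the example makes no claim about the privacy guarantee a given scale would provide.

```python
import hashlib
import random


class NoisyCountMin:
    """Illustrative linear sketch with noise added once at initialization."""

    def __init__(self, width, depth, noise_scale=0.0, seed=None):
        rng = random.Random(seed)
        self.width = width
        self.depth = depth

        def laplace():
            # Difference of two exponentials with mean noise_scale
            # is a Laplace(0, noise_scale) sample.
            if noise_scale <= 0:
                return 0.0
            rate = 1.0 / noise_scale
            return rng.expovariate(rate) - rng.expovariate(rate)

        # Each counter starts from a noise sample rather than zero.
        self.table = [[laplace() for _ in range(width)] for _ in range(depth)]

    def _bucket(self, row, item):
        # Deterministic per-row hash derived from SHA-256 (an assumption).
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def update(self, item, delta=1):
        # Linear update: supports insertions (delta > 0) and
        # deletions (delta < 0), as in the turnstile model.
        for row in range(self.depth):
            self.table[row][self._bucket(row, item)] += delta

    def estimate(self, item):
        # Count-Min style estimate: minimum over the rows.
        return min(self.table[row][self._bucket(row, item)]
                   for row in range(self.depth))


plain = NoisyCountMin(width=64, depth=4, noise_scale=0.0)
plain.update("x", 5)
plain.update("x", -2)
```

Because the noise is added only once, later updates remain plain additions, so the sketch keeps its linear structure. With `noise_scale=0.0` the class reduces to an ordinary Count-Min sketch.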


Panakos: Chasing the Tails for Multidimensional Data Streams

et al. 2023


System operators are often interested in extracting different feature streams from multi-dimensional data streams and reporting their distributions at regular intervals, including the heavy hitters that contribute to the tail portion of the feature distribution. Satisfying these requirements at increasing data rates with limited resources is challenging. This paper presents the design and implementation of Panakos, which makes the best use of available resources to accurately report a given feature's distribution, its tail contributors, and other stream statistics (e.g., cardinality and entropy). Our key idea is to leverage the skewness inherent to most feature streams in the real world. We leverage this skewness by disentangling the feature stream into hot, warm, and cold items based on their feature values. We then use different data structures for tracking objects in each category. Panakos provides solid theoretical guarantees and achieves high performance for various tasks. We have implemented Panakos in both software and hardware and compared it to other state-of-the-art sketches using synthetic and real-world datasets. The experimental results demonstrate that Panakos often achieves one order of magnitude better accuracy than the state-of-the-art solutions for a given memory budget.

