Sacha Servan-Schreiber scite author profile

Sacha Servan-Schreiber

5 Publications

49 Citation Statements Received

46 Citation Statements Given

Affiliations

Massachusetts Institute of Technology

Publications

ProSecCo: progressive sequence mining with convergence guarantees

Servan-Schreiber¹,

Riondato²,

Zgraggen³

2019

Knowl Inf Syst

We present ProSecCo, an algorithm for the progressive mining of frequent sequences from large transactional datasets: it processes the dataset in blocks and it outputs, after having analyzed each block, a high-quality approximation of the collection of frequent sequences. ProSecCo can be used for interactive data exploration, as the intermediate results enable the user to make informed decisions as the computation proceeds. These intermediate results have strong probabilistic approximation guarantees and the final output is the exact collection of frequent sequences. Our correctness analysis uses the Vapnik-Chervonenkis (VC) dimension, a key concept from statistical learning theory. The results of our experimental evaluation of ProSecCo on real and artificial datasets show that it produces fast-converging high-quality results almost immediately. Its practical performance is even better than what is guaranteed by the theoretical analysis, and ProSecCo can even be faster than existing state-of-the-art non-progressive algorithms. Additionally, our experimental results show that ProSecCo uses a constant amount of memory, and orders of magnitude less than other standard, non-progressive, sequential pattern mining algorithms.
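
The block-by-block idea in this abstract can be illustrated with a minimal sketch. This is not the ProSecCo algorithm: the function names, the `max_len` limit on pattern length, and the fixed `slack` value are all assumptions made for illustration, and `slack` merely stands in for the VC-dimension-based error bound the paper derives. The sketch also keeps every candidate count in memory, so it does not reproduce the constant-memory behavior reported above.

```python
from collections import Counter
from itertools import combinations


def patterns(seq, max_len=2):
    """Distinct subsequences of seq (order preserved) of length 1..max_len."""
    found = set()
    for k in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), k):
            found.add(tuple(seq[i] for i in idx))
    return found


def progressive_frequent(dataset, block_size, theta, slack):
    """Yield (sequences_seen, pattern_set) after each block.

    Intermediate outputs use the lowered threshold theta - slack
    (a hypothetical stand-in for a proven error bound); the last
    output uses theta itself and is therefore exact.
    """
    counts = Counter()
    seen = 0
    for start in range(0, len(dataset), block_size):
        block = dataset[start:start + block_size]
        for seq in block:
            # Each sequence contributes at most once per pattern.
            counts.update(patterns(seq))
        seen += len(block)
        done = seen == len(dataset)
        cutoff = theta if done else max(theta - slack, 0.0)
        yield seen, {p for p, c in counts.items() if c / seen >= cutoff}


dataset = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
results = list(progressive_frequent(dataset, block_size=2, theta=0.6, slack=0.25))
```

Each intermediate result can be shown to the user immediately, and the last element of `results` holds the exact set of patterns whose frequency is at least `theta`.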

SchengenDB: A Data Protection Database Proposal

Kraska¹,

Brodie²,

Servan-Schreiber³

et al. 2019

GDPR in Europe and similar regulations, such as the California CCPA, require new levels of privacy support for consumers. Most challenging to IT departments is the "right to be forgotten". Hence, an enterprise must ensure that ALL information about a specific consumer be deleted from enterprise storage, when requested. Since enterprises are internally heavily "siloed", sharing of information is usually accomplished by copying data between systems. This makes finding and deleting all copies of data on a particular consumer difficult. GDPR also requires the notion of purposes, which is an access control model orthogonal to the one customarily in SQL. Herein, we sketch an implementation of purposes and show how it fits within a conventional access control framework. We then propose two solutions to supporting GDPR in a DBMS. When a "green field" environment is present, we propose a solution which directly supports the process of ensuring GDPR compliance at enterprise-scale. Specifically, it is designed to store every fact about a consumer exactly once. Therefore, the right to be forgotten is readily supported by deleting that fact. On the other hand, when dealing with legacy systems in the enterprise, we propose a second solution which tracks all copies of personal information, so they can be deleted on request. Of course, this solution entails additional overhead in the DBMS. Once data leaves the DBMS, it is in some application. We propose "sandboxing" applications in a novel way that will prevent them from leaking data to the outside world when inappropriate. Lastly, we discuss the challenges associated with auditing and logging of data. This paper sketches the design of the above GDPR compliant facilities, which we collectively term SchengenDB.
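
The second solution in this abstract, tracking every copy of personal information so it can be deleted on request, can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the `CopyTracker` class, its method names, and the system names are hypothetical and are not taken from the SchengenDB design.

```python
from collections import defaultdict


class CopyTracker:
    """Hypothetical registry that records where each consumer's data lives."""

    def __init__(self):
        self._stores = defaultdict(dict)    # system -> {record_id: value}
        self._locations = defaultdict(set)  # consumer_id -> {(system, record_id)}

    def write(self, system, record_id, consumer_id, value):
        """Store a value and remember that it belongs to consumer_id."""
        self._stores[system][record_id] = value
        self._locations[consumer_id].add((system, record_id))

    def copy(self, consumer_id, src, dst):
        """Copy a record between systems; the copy is tracked as well."""
        src_system, src_id = src
        dst_system, dst_id = dst
        value = self._stores[src_system][src_id]
        self.write(dst_system, dst_id, consumer_id, value)

    def forget(self, consumer_id):
        """Delete every tracked copy; return how many were removed."""
        locations = self._locations.pop(consumer_id, set())
        for system, record_id in locations:
            self._stores[system].pop(record_id, None)
        return len(locations)

    def holds(self, system, record_id):
        """Report whether a system still stores the given record."""
        return record_id in self._stores.get(system, {})


tracker = CopyTracker()
tracker.write("crm", "r1", "alice", "alice@example.com")
tracker.copy("alice", ("crm", "r1"), ("billing", "b9"))
tracker.write("crm", "r2", "bob", "bob@example.com")
removed = tracker.forget("alice")
```

The extra bookkeeping on every write and copy is a small-scale picture of the additional overhead the abstract mentions for this solution.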

ShorTor: Improving Tor Network Latency via Multi-hop Overlay Routing

Hogan¹,

Servan-Schreiber²,

Newman³

et al. 2022

Trellis: Robust and Scalable Metadata-private Anonymous Broadcast

Langowski¹,

Servan-Schreiber²,

Devadas³

2023

ProSecCo: Progressive Sequence Mining with Convergence Guarantees

Servan-Schreiber¹,

Riondato²,

Zgraggen³

2018
