We introduce a number of privacy definitions for the multi-armed bandit problem, based on differential privacy. We relate them through a unifying graphical model representation and connect them to existing definitions. We then derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We show that for all of them, the learner's regret is increased by a multiplicative factor dependent on the privacy level ε, but that the dependency is weaker when we do not require local differential privacy for the rewards.
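For reference, the base ε-differential-privacy constraint that such definitions adapt can be stated for a bandit policy π as follows; this is a minimal textbook formulation, and the choice of neighboring relation (here, reward histories differing in a single round) is exactly what the paper's variants refine:

```latex
% \epsilon-differential privacy for a bandit policy \pi:
% for every round t, every action sequence a_{1:t}, and every pair of
% reward histories r, r' that differ in the rewards of a single round,
\Pr\!\left[\pi \text{ selects } a_{1:t} \mid r\right]
  \;\le\; e^{\epsilon}\,
\Pr\!\left[\pi \text{ selects } a_{1:t} \mid r'\right].
```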
We study model-based reinforcement learning in an unknown finite communicating Markov decision process. We propose a simple algorithm that leverages variance-based confidence intervals. We show that the proposed algorithm, UCRL-V, achieves the optimal regret Õ(√(DSAT)) up to logarithmic factors, and so our work closes a gap with the lower bound without additional assumptions on the MDP. We perform experiments in a variety of environments that validate the theoretical bounds and show that UCRL-V outperforms state-of-the-art algorithms.
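As an illustration of the variance-based confidence intervals mentioned above, the sketch below implements the empirical-Bernstein bound of Maurer and Pontil (2009) in Python. This is one standard bound of that type, not UCRL-V's exact confidence set:

```python
import numpy as np

def bernstein_ucb(rewards, delta):
    """Empirical-Bernstein upper confidence bound on the mean of
    [0, 1]-valued samples. The sqrt term scales with the observed
    variance, so low-variance arms get tighter intervals."""
    n = len(rewards)
    assert n >= 2, "sample variance needs at least two observations"
    mean = np.mean(rewards)
    var = np.var(rewards, ddof=1)  # unbiased sample variance
    log_term = np.log(2.0 / delta)
    return mean + np.sqrt(2.0 * var * log_term / n) \
                + 7.0 * log_term / (3.0 * (n - 1))
```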
We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This problem arises in applications such as adaptive clinical trials, experiment design, and user-targeted advertising, where private information is tied to individual rewards. Our major contribution is to show that there exist (ε, δ)-differentially private variants of Upper Confidence Bound algorithms with optimal regret, O(ε⁻¹ + log T). This is a significant improvement over previous results, which only achieve poly-log regret O(ε⁻² log² T), and it rests on our use of a novel interval-based mechanism. We also substantially improve the bounds of a previous family of algorithms that use a continual release mechanism. Experiments clearly validate our theoretical bounds.
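To show where the privacy noise enters this family of algorithms, here is a deliberately simplified Python sketch that perturbs a UCB index with Laplace noise. It is a generic Laplace-mechanism baseline, not the paper's interval-based mechanism, and a full (ε, δ) accounting over all T rounds would require a continual-release scheme rather than fresh noise per query:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_ucb_index(reward_sum, n, t, eps):
    """UCB index computed from a Laplace-perturbed reward sum.
    With rewards in [0, 1], one user's reward changes the sum by at
    most 1, so noise of scale 1/eps masks any single contribution."""
    private_sum = reward_sum + rng.laplace(scale=1.0 / eps)
    return private_sum / n + np.sqrt(2.0 * np.log(t) / n)
```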
Smartphones are a key enabling technology in the Internet of Things (IoT) for gathering crowd-sensed data. However, collecting crowd-sensed data for research is not simple: issues related to device heterogeneity, security, and privacy have prevented the rise of crowd-sensing platforms for scientific data collection. For this reason, we implemented VIVO, an open framework for gathering crowd-sensed Big Data for IoT services, where security and privacy are managed within the framework. VIVO introduces the enrolled crowd-sensing model, which allows the deployment of multiple simultaneous experiments on the mobile phones of volunteers. The collected data can be accessed both at the end of the experiment, as in traditional testbeds, and in real time, as required by many Big Data applications. We present the VIVO architecture, highlighting its advantages over existing solutions, and four relevant real-world applications running on top of VIVO.
We present a simple set of algorithms based on Thompson Sampling for stochastic bandit problems with graph feedback. Thompson Sampling is generally applicable, without the need to construct complicated upper confidence bounds. As we show in this paper, it has excellent performance in problems with graph feedback, even when the graph structure itself is unknown and/or changing. We provide theoretical guarantees on the Bayesian regret of the algorithm, as well as extensive experimental results on real and simulated networks. More specifically, we test our algorithms on power-law, planted-partition, and Erdős–Rényi graphs, as well as on graphs derived from Facebook and Flixster data, and show that they clearly outperform related methods that employ upper confidence bounds.
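A minimal sketch of Thompson Sampling with graph feedback for Bernoulli arms, assuming a known, fixed feedback graph; `neighbors` and `pull` are hypothetical interfaces standing in for the environment:

```python
import numpy as np

rng = np.random.default_rng(0)

def ts_graph_feedback(neighbors, pull, horizon):
    """neighbors[i]: arms whose rewards are revealed when arm i is
    played (including i itself); pull(j): draws arm j's Bernoulli
    reward. Beta(1, 1) priors; every revealed reward updates its arm,
    which is what the side observations buy us over plain TS."""
    k = len(neighbors)
    alpha, beta = np.ones(k), np.ones(k)
    for _ in range(horizon):
        theta = rng.beta(alpha, beta)   # one posterior draw per arm
        i = int(np.argmax(theta))       # play the best sampled arm
        for j in neighbors[i]:          # side observations from the graph
            r = pull(j)
            alpha[j] += r
            beta[j] += 1 - r
    return alpha, beta
```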