Benjamin Freed scite author profile

Communication Learning via Backpropagation in Discrete Channels with Unknown Noise

This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies than techniques that treat communication as actions in a traditional RL framework. However, modern communication networks (e.g., Wi-Fi or Bluetooth) rely on discrete communication channels, to which existing differentiable approaches that assume real-valued messages either cannot be applied directly or can be applied only with biased gradient estimators. Some works have overcome this problem by treating the message space as an extension of the action space and using standard RL to optimize message selection, but these methods tend to converge more slowly and to inferior policies. In this paper, we propose a stochastic message encoding/decoding procedure that makes a discrete communication channel mathematically equivalent to an analog channel with additive noise, through which gradients can be backpropagated. Additionally, we introduce an encryption step for use in noisy channels that forces channel noise to be message-independent, allowing us to compute unbiased derivative estimates even in the presence of unknown channel noise. To the best of our knowledge, this work presents the first differentiable communication learning approach that can compute unbiased derivatives through channels with unknown noise. We demonstrate the effectiveness of our approach in two example multi-robot tasks: a path finding problem and a collaborative search problem. There, we show that our approach achieves learning speed and performance similar to differentiable communication learning with real-valued messages (i.e., unlimited communication bandwidth), while naturally handling more realistic real-world communication constraints.

Content Areas: Multi-Agent Communication, Reinforcement Learning.
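
The abstract does not spell out the encoding/decoding procedure. One well-known construction with the stated property is subtractive dithered quantization: the sender adds a shared random offset before rounding to a discrete symbol, and the receiver subtracts the same offset, so the decoded value equals the original plus bounded noise that does not depend on the message. The sketch below assumes that construction. The function names, the shared dither sample, and the step size are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def encode(z, shared_noise, step=1.0):
    """Map a real-valued message to an integer symbol (the discrete message)."""
    # Round (z + dither) to the nearest multiple of `step`.
    return math.floor((z + shared_noise) / step + 0.5)


def decode(symbol, shared_noise, step=1.0):
    """Recover a real value from the symbol by removing the shared dither."""
    return symbol * step - shared_noise


def send(z, rng, step=1.0):
    """Pass z through the discrete channel; returns z plus bounded noise."""
    # Assumption: sender and receiver can draw the same dither sample,
    # for example from a common random seed.
    u = rng.uniform(-step / 2, step / 2)
    return decode(encode(z, u, step), u, step)


if __name__ == "__main__":
    rng = random.Random(0)
    for z in (-1.3, 0.0, 0.42, 2.75):
        received = send(z, rng)
        print(f"sent {z:+.2f}, received {received:+.3f}, error {received - z:+.3f}")
```

Because the decoded value behaves like `z + noise` with the noise bounded by half a quantization step, a learning framework can treat the channel as an identity map plus additive noise when propagating gradients.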

Convolution Neural Network Algorithm for Shockable Arrhythmia Classification Within a Digitally Connected Automated External Defibrillator

Shen, Freed, Walter, et al. 2023. JAHA

Background: Diagnosis of shockable rhythms leading to defibrillation remains integral to improving out‐of‐hospital cardiac arrest outcomes. New machine learning techniques have emerged to diagnose arrhythmias on ECGs. In out‐of‐hospital cardiac arrest, an algorithm within an automated external defibrillator is the major determinant to deliver defibrillation. This study developed and validated the performance of a convolution neural network (CNN) to diagnose shockable arrhythmias within a novel, miniaturized automated external defibrillator.

Methods and Results: There were 26 464 single‐lead ECGs that comprised the study data set. ECGs of 7‐s duration were retrospectively adjudicated by 3 physician readers (N=18 total readers). After exclusions (N=1582), ECGs were divided into training (N=23 156), validation (N=721), and test data sets (N=1005). CNN performance to diagnose shockable and nonshockable rhythms was reported with area under the receiver operating characteristic curve analysis, F1, and sensitivity and specificity calculations. The duration for the CNN to output was reported with the algorithm running within the automated external defibrillator. Internal and external validation analyses included CNN performance among arrhythmias, often mistaken for shockable rhythms, and performance among ECGs modified with noise to mimic artifacts. The CNN algorithm achieved an area under the receiver operating characteristic curve of 0.995 (95% CI, 0.990–1.0), sensitivity of 98%, and specificity of 100% to diagnose shockable rhythms. The F1 scores were 0.990 and 0.995 for shockable and nonshockable rhythms, respectively. After input of a 7‐s ECG, the CNN generated an output in 383±29 ms (total time of 7.383 s). The CNN outperformed adjudicators in classifying atrial arrhythmias as nonshockable (specificity of 99.3%–98.1%) and was robust against noise artifacts (area under the receiver operating characteristic curve range, 0.871–0.999).

Conclusions: We demonstrate high diagnostic performance of a CNN algorithm for shockable and nonshockable rhythm arrhythmia classifications within a digitally connected automated external defibrillator.

Registration URL: https://clinicaltrials.gov/ct2/show/NCT03662802; Unique identifier: NCT03662802
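
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and F1 follow from confusion-matrix counts when "shockable" is treated as the positive class. The function name and the counts in the usage example are hypothetical and are not taken from the study.

```python
def rhythm_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp: shockable rhythms correctly flagged as shockable
    fp: nonshockable rhythms wrongly flagged as shockable
    tn: nonshockable rhythms correctly flagged as nonshockable
    fn: shockable rhythms wrongly flagged as nonshockable
    """
    sensitivity = tp / (tp + fn)          # share of shockable rhythms detected
    specificity = tn / (tn + fp)          # share of nonshockable rhythms cleared
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    print(rhythm_metrics(tp=45, fp=10, tn=90, fn=5))
```

The F1 score for the nonshockable class is obtained the same way after swapping the roles of the two classes, which is why the abstract reports one F1 value per rhythm type.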

Simultaneous Policy and Discrete Communication Learning for Multi-Agent Cooperation

Freed, Sartoretti, Choset 2020. IEEE Robot. Autom. Lett.

Learning Cooperative Multi-Agent Policies With Partial Reward Decoupling

Freed, Kapoor, Abraham, et al. 2022. IEEE Robot. Autom. Lett.


Benjamin Freed

Sparse Discrete Communication Learning for Multi-Agent Cooperation Through Backpropagation

Communication Learning via Backpropagation in Discrete Channels with Unknown Noise

Convolution Neural Network Algorithm for Shockable Arrhythmia Classification Within a Digitally Connected Automated External Defibrillator

Simultaneous Policy and Discrete Communication Learning for Multi-Agent Cooperation

Learning Cooperative Multi-Agent Policies With Partial Reward Decoupling
