Access to medical data is highly restricted due to its sensitive nature, preventing communities from using this data for research or clinical training. Common methods of de-identification implemented to enable the sharing of data are sometimes inadequate to protect the individuals contained in the data. In this research, we investigate the ability of generative adversarial networks (GANs) to produce realistic medical time series data which can be used without concerns over privacy. The aim is to generate synthetic ECG signals representative of normal ECG waveforms. GANs have been used successfully to generate good quality synthetic time series and have been shown to prevent re-identification of individual records. In this work, a range of GAN architectures are developed to generate synthetic sine waves and synthetic ECG. Two evaluation metrics are then used to quantitatively assess how suitable the synthetic data is for real-world applications such as clinical training and data analysis. Finally, we discuss the privacy concerns associated with sharing synthetic data produced by GANs and test their ability to withstand a simple membership inference attack. For the first time, we demonstrate both quantitatively and qualitatively that GAN architectures can successfully generate time series signals that are not only structurally similar to the training sets but also diverse across generated samples. We also report on their ability to withstand a simple membership inference attack, protecting the privacy of the training set.
Keywords: generative adversarial networks • time series synthesis • health data
Studies of generative adversarial networks (GANs) have grown exponentially in the past few years. Their impact has been seen mainly in the computer vision field, where realistic image and video manipulation, especially generation, has made significant advances. While these computer vision advances have garnered much attention, GAN applications have diversified across disciplines such as time series and sequence generation. As this is a relatively new niche for GANs, work in the field is ongoing to develop high-quality, diverse and private time series data. In this paper, we review GAN variants designed for time series related applications. We propose a classification into discrete-variant GANs and continuous-variant GANs, which deal with discrete time series and continuous time series data respectively. Here we showcase the latest and most popular literature in this field: the architectures, results, and applications. We also provide a list of the most popular evaluation metrics and their suitability across applications. Also presented is a discussion of privacy measures for these GANs and further protections and directions for dealing with sensitive data. We aim to frame clearly and concisely the latest state-of-the-art research in this area and its applications to real-world technologies.
Access to medical data is highly regulated due to its sensitive nature, which can constrain communities' ability to utilise these data for research or clinical purposes. Common de-identification techniques to enable the sharing of data may not provide adequate protection for an individual's personal data in every circumstance. We investigate the ability of Generative Adversarial Networks (GANs) to generate realistic medical time series data to address these privacy and identification concerns. We generate synthetic, and more significantly, multichannel electrocardiogram (ECG) signals that are representative of waveforms observed in patients. Successful generation of high-quality synthetic time series data has the potential to act as an effective substitute for actual patient data. For the first time, we demonstrate a multivariate GAN architecture that can successfully generate dependent multichannel time series signals. We present the first application of multivariate dynamic time warping as a means of evaluating generated GAN samples. Quantitative evidence demonstrates our GAN can generate data that is structurally similar to the training set and diverse across generated samples, all whilst ensuring sufficient privacy guarantees for the underlying training data.
CCS Concepts: • Computing methodologies → Machine learning.
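The abstract above names multivariate dynamic time warping (DTW) as its evaluation measure but does not say which variant is used. The following is a minimal sketch of one common formulation, the "dependent" variant, in which all channels share a single warping path and the local cost is the squared Euclidean distance across channels. The function name `dtw_dependent`, the cost function, and the unconstrained warping window are assumptions made for illustration, not details taken from the paper.

```python
import math


def dtw_dependent(a, b):
    """Dependent multivariate DTW distance (illustrative sketch).

    a, b: sequences of time steps, each time step a sequence of channel
    values, e.g. [[ch1, ch2], [ch1, ch2], ...]. All channels are warped
    together along one shared path (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # acc[i][j] holds the cheapest cumulative cost of aligning the first
    # i steps of a with the first j steps of b.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: squared Euclidean distance over all channels.
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            acc[i][j] = d + min(
                acc[i - 1][j],      # step in a only
                acc[i][j - 1],      # step in b only
                acc[i - 1][j - 1],  # step in both
            )
    return math.sqrt(acc[n][m])


if __name__ == "__main__":
    real = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    # Same two-channel shape with the first sample repeated (a time shift).
    shifted = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    print(dtw_dependent(real, shifted))
```

Under this sketch, a generated sample that differs from a real one only by a local stretch in time scores a distance of zero, which is the property that makes DTW attractive for comparing waveforms such as ECG beats. The quadratic loop is fine for short windows; longer signals would usually call for a banded or library implementation.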
Ischemic heart disease is the leading cause of mortality globally each year. This puts a massive strain not only on the lives of those affected, but also on public healthcare systems. To understand the dynamics of the healthy and unhealthy heart, doctors commonly use electrocardiogram (ECG) and blood pressure (BP) readings. These methods are often quite invasive, particularly when continuous arterial blood pressure (ABP) readings are taken, and they are also very costly. Using machine learning methods, we develop a framework capable of inferring ABP from a single optical photoplethysmogram (PPG) sensor alone. We train our framework across distributed models and data sources to mimic a large-scale distributed collaborative learning experiment that could be implemented across low-cost wearables. Our time-series-to-time-series generative adversarial network (T2TGAN) is capable of high-quality continuous ABP generation from a PPG signal with a mean error of 2.95 mmHg and a standard deviation of 19.33 mmHg when estimating mean arterial pressure on a previously unseen, noisy, independent dataset. To our knowledge, this framework is the first example of a GAN capable of continuous ABP generation from an input PPG signal that also uses a federated learning methodology.
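The mean error and standard deviation quoted in the abstract above summarise how far generated mean arterial pressure (MAP) values fall from reference values. A rough sketch of how such a summary could be computed is given below. It assumes that MAP is taken as the time average of each ABP window, that the error is signed (generated minus reference), and that the sample standard deviation is used; the abstract does not state any of these choices, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def mean_arterial_pressure(abp_window):
    """MAP of one window, taken here as the time average of the ABP samples
    (an assumption of this sketch)."""
    return mean(abp_window)


def map_error_summary(reference_windows, generated_windows):
    """Return (mean error, standard deviation of error) in mmHg.

    Errors are signed: generated MAP minus reference MAP. The sample
    standard deviation is used, which is also an assumption.
    """
    errors = [
        mean_arterial_pressure(gen) - mean_arterial_pressure(ref)
        for ref, gen in zip(reference_windows, generated_windows)
    ]
    return mean(errors), stdev(errors)


if __name__ == "__main__":
    # Made-up ABP windows in mmHg, for illustration only.
    reference = [[80.0, 120.0], [70.0, 110.0]]
    generated = [[82.0, 122.0], [74.0, 114.0]]
    print(map_error_summary(reference, generated))
```

A small mean error paired with a large standard deviation, as in the reported figures, indicates estimates that are close to unbiased on average but vary widely from window to window.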