2021
DOI: 10.48550/arxiv.2112.03837
Preprint

Augment & Valuate : A Data Enhancement Pipeline for Data-Centric AI

citations
Cited by 3 publications
(5 citation statements)
references
References 0 publications
“…Alternatively, data-centric methods, which leave the system's architecture unchanged, endeavor to improve system performance by using enhanced data [27][28][29] obtained through, e.g. data set augmentation [30][31][32], data set distillation [33,34], label analysis and accuracy improvements [35][36][37][38], data set validation [39,40], domain randomization [41], and combinations of these [42][43][44].…”
Section: Introduction (mentioning)
confidence: 99%
“…Data-centric ML techniques have found wide applicability in a variety of domains including legal [43], natural language processing [37], image classification [44], and medical prognosis [39]. In the context of QIS, data-centric approaches, namely engineered data sets, have been used to demonstrate prediction advantage in quantum machine learning [47] and improve the accuracy of state reconstruction systems [48][49][50].…”
Section: Introduction (mentioning)
confidence: 99%
“…Examples include increasing the number of hidden layers in a deep neural network, tailoring the structure of a model, modifying the loss function [26] or tweaking the reward function in reinforcement learning. Alternatively, data-centric [27][28][29][30][31][32][33][34][35][36] methods, leaving the system's architecture unchanged, endeavor to improve system performance by using enhanced data sets, e.g., removing spurious correlations, increasing the accuracy of labels, increasing the variety of sampled situations covered by the data, or distilling the data sets to improve efficiency [37].…”
Section: Introduction (mentioning)
confidence: 99%
“…Data-centric ML techniques have found wide applicability in a variety of domains including legal [32], natural language processing [40], image classification [35], and medical prognosis [36]. In the context of QIS, a data-centric approach to improving state reconstruction accuracy was implemented where expected statistical and experimental noise were included into training sets of a convolutional neural network (CNN), resulting in overall performance improvements [41][42][43].…”
Section: Introduction (mentioning)
confidence: 99%
“…In DC-AI, more emphasis is on data; the model can simply be used/imported, and therefore, transfer learning can play a vital role in this new paradigm. • Data-enhanced pipelines (or instruments): To solve noisy data and scarcity issues, many sophisticated pipelines have been developed recently to enhance the performance of machine learning techniques [136]. These pipelines can significantly enhance the accuracy of ML methods by utilizing a small portion of data.…”
mentioning
confidence: 99%