2022 56th Asilomar Conference on Signals, Systems, and Computers
DOI: 10.1109/ieeeconf56349.2022.10064696
Data Shapley Valuation for Efficient Batch Active Learning

Cited by 50 publications (81 citation statements)
References 16 publications
“…This data value can be of social-ecological, economical [25,26], functional and/or symbolic nature [10,26] with the purpose of adding a measurable business value [13]. Thereby, data value is determined by a multitude of value drivers [27,28], underlying theories [29][30][31][32][33], as well as frameworks [34,35].…”
Section: Data Valuation Business Capability
confidence: 99%
“…The first characteristic economic comprises all theories that determine the data value based on price-quantity diagrams, cost curves and conventional hardware-oriented pricing (cost, competition, customer) [29,32,57]. In addition, game theory can be used as a data valuation theory, which can be divided into two characteristics, cooperative [30,31] and non-cooperative game theory [63,75]. The fourth characteristic decision theory summarizes approaches, e.g., analytic hierarchy process [51,67] or fair knapsack [77], that assess the data value while taking uncertainty and vagueness into account.…”
Section: Data Valuation Theory
confidence: 99%
“…Multi-source AL for NLP While AL has been studied for a variety of tasks in NLP (Siddhant and Lipton, 2018; Lowell et al., 2019; Ein-Dor et al., 2020; Shelmanov et al., 2021; Margatina et al., 2021; Yuan et al., 2022; Schröder et al., 2022; Margatina et al., 2022; Kirk et al., 2022; Zhang et al., 2022), the majority of work remains limited to settings where training data is assumed to stem from a single source. Some recent works have sought to address the issues that arise when relaxing the single-source assumption (Ghorbani et al., 2021), though results remain primarily limited to image classification. Moreover, these works study how AL fares under the presence of corrupted training data, such as duplicating images or adding Gaussian noise, and they do not consider settings where sampling from multiple sources may be beneficial due to complementary source attributes.…”
Section: Related Work
confidence: 99%
“…Recently, training dynamics, i.e., traces of SGD updates or logits during training, have been used in analyzing catastrophic forgetting [27], measuring the importance of data samples for a learning task [28], large-dataset analysis [29], and identifying noisy labels [30]. Toneva et al. [27] use training dynamics to identify forgetting events over the course of training.…”
Section: Related Work
confidence: 99%
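The forgetting events mentioned above can be counted directly from per-epoch correctness records: a forgetting event for a sample is a transition from classified-correctly to misclassified between consecutive epochs. A minimal sketch, assuming a simple list-of-lists `correct_history` format (an illustrative interface, not the one used by Toneva et al.):

```python
def count_forgetting_events(correct_history):
    """Count forgetting events per training sample.

    correct_history[e][i] is True when sample i was classified
    correctly at epoch e. A forgetting event is a correct -> incorrect
    transition between consecutive epochs (illustrative sketch).
    """
    n = len(correct_history[0])
    events = [0] * n
    for prev, curr in zip(correct_history, correct_history[1:]):
        for i in range(n):
            if prev[i] and not curr[i]:
                events[i] += 1
    return events
```

Samples with many forgetting events tend to be hard or mislabeled, while samples that are never forgotten are candidates for pruning.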
“…Toneva et al. [27] use training dynamics to identify forgetting events over the course of training. Ghorbani et al. [28] propose a method called Data Shapley to quantify the contribution of each training sample to predictor performance and to identify potentially corrupted data samples. Dataset Cartography [29] builds a dataset map from training dynamics to identify ambiguous, easy-to-learn, and hard-to-learn data regions in a feature space.…”
Section: Related Work
confidence: 99%
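Data Shapley, as referenced in the statements above, values each training point by its average marginal contribution to a performance (utility) function over random orderings of the training set. A minimal Monte Carlo sketch with an illustrative additive utility (the actual TMC-Shapley algorithm of Ghorbani et al. adds performance-tolerance truncation and retrains a model inside `utility`; all names here are hypothetical):

```python
import random

def data_shapley(n, utility, num_perms=200, seed=0):
    """Monte Carlo estimate of Data Shapley values for n points.

    utility(subset) returns model performance when trained on the
    given list of point indices. Each point's value is its marginal
    contribution averaged over random permutations (sketch only).
    """
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(num_perms):
        perm = list(range(n))
        rng.shuffle(perm)
        subset, prev = [], utility([])  # performance with no data
        for idx in perm:
            subset.append(idx)
            score = utility(subset)
            totals[idx] += score - prev  # marginal contribution
            prev = score
    return [t / num_perms for t in totals]

# Toy additive utility: points 0 and 1 are useful, point 2 is noise.
values = data_shapley(3, lambda s: len(set(s) & {0, 1}) / 2)
```

With an additive utility the estimate is exact: points 0 and 1 each receive value 0.5, the noise point receives 0, and the values sum to the utility of the full set, which is the efficiency property that makes Shapley values attractive for batch active learning.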