Robust Data Programming with Precision-guided Labeling Functions
2020 · DOI: 10.1609/aaai.v34i04.5742

Abstract: Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved to deal with this problem is data programming. An existing data programming paradigm allows human supervision to be provided as a set of discrete labeling functions (LFs) that output possibly noisy labels for input instances, together with a generative model for consolidating the weak labels. We enhance and generalize this paradigm by supporting functions that output a continuous score (instead of a hard label) that noi…

Cited by 12 publications (19 citation statements)
References 9 publications
“…Data Programming and Unsupervised Learning: Snorkel (Ratner et al, 2016) has been proposed as a generative model to determine correct label probability using consensus on the noisy and conflicting labels assigned by the discrete LFs. Chatterjee et al (2020) proposed a graphical model, CAGE, that uses continuous-valued LFs with scores obtained using soft match techniques such as cosine similarity of word vectors, TF-IDF score, distance among entity pairs, etc. Owing to its generative model, Snorkel is highly sensitive to initialisation and hyper-parameters.…”
Section: Related Work
confidence: 99%
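The excerpt above describes continuous-valued LFs whose scores come from soft-match techniques such as cosine similarity. A minimal sketch of that idea, using bag-of-words cosine similarity in place of word vectors (the `make_continuous_lf` helper and the `lf_sports` example are hypothetical, not from CAGE itself):

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words count dicts."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def make_continuous_lf(seed_phrase, label):
    """Build a continuous LF that returns (label, score), where score is
    the cosine similarity of the input text to a seed phrase.
    Hypothetical illustration of a soft-match LF, not the CAGE API."""
    seed = Counter(seed_phrase.lower().split())
    def lf(text):
        return label, cosine_similarity(seed, Counter(text.lower().split()))
    return lf

# A toy LF that softly votes for class 1 ("sports") via lexical overlap.
lf_sports = make_continuous_lf("match team score win", label=1)
label, score = lf_sports("the team managed to win the final match")
```

A discrete LF would threshold such a score into a hard vote; the continuous variant instead passes the score itself to the generative model, preserving how confident the match was.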
“…may also produce conflicting labels. In the past, generative models such as Snorkel (Ratner et al, 2016) and CAGE (Chatterjee et al, 2020) have been proposed for consensus on the noisy and conflicting labels assigned by the discrete LFs to determine the probability of the correct labels. Labels thus obtained could be used for training any supervised model/classifier and evaluated on a test set.…”
Section: Motivating Example
confidence: 99%
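The excerpt above describes reaching consensus on noisy, conflicting LF votes. As a much simpler stand-in for the generative models it names (Snorkel, CAGE), a majority-vote baseline sketches the consolidation step; the `majority_vote` function and the vote matrix below are illustrative assumptions, not either system's actual model:

```python
from collections import Counter

ABSTAIN = None  # an LF may abstain rather than vote

def majority_vote(vote_rows):
    """Consolidate noisy, possibly conflicting LF votes by simple majority.

    A simplistic stand-in for generative consensus models; ties resolve
    to the first-seen label, and all-abstain rows stay unlabeled.
    """
    consolidated = []
    for votes in vote_rows:
        counts = Counter(v for v in votes if v is not ABSTAIN)
        consolidated.append(counts.most_common(1)[0][0] if counts else ABSTAIN)
    return consolidated

# Rows are instances; columns are three hypothetical discrete LFs.
votes = [
    [1, 1, 0],         # two LFs say 1, one says 0 -> 1
    [0, ABSTAIN, 0],   # abstains are ignored      -> 0
    [ABSTAIN] * 3,     # no LF fires               -> unlabeled
    [1, 1, ABSTAIN],   # agreement                 -> 1
]
consolidated = majority_vote(votes)  # [1, 0, None, 1]
```

The generative models improve on this baseline by learning per-LF accuracies, so a reliable LF can outvote several unreliable ones; the consolidated (probabilistic) labels then train any downstream classifier, as the excerpt notes.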
“…Firstly, avoiding over-dependence on such hand-crafted features, since the above approaches limit the scope for in-the-wild HOI detections. Such over-dependence has been averted in both textual [2] and image [18] domains and we take inspiration from such works. More often than not, the 3D poses or 3D centroids of objects (used as features) are either not available or are too erroneously estimated to be simply plugged into a model trained on hand-crafted features.…”
Section: Related Work
confidence: 99%