2017
DOI: 10.14778/3157794.3157797

Snorkel: Rapid Training Data Creation with Weak Supervision

Abstract: Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to ground truth by incorporating the first end-to-end implementation of our recently proposed machine learning paradigm, data programming…
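To make the workflow the abstract describes concrete, here is a minimal sketch using the open-source snorkel package. The decorator and LabelModel API come from the later pip-released library rather than the prototype evaluated in this paper, and the spam/ham heuristics and toy data are invented for illustration:

```python
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

# Label space: ABSTAIN lets a heuristic decline to vote on an example.
ABSTAIN, HAM, SPAM = -1, 0, 1

@labeling_function()
def lf_contains_link(x):
    # Noisy heuristic (hypothetical): messages with URLs are often spam.
    return SPAM if "http" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_short_message(x):
    # Another heuristic, with unknown accuracy and correlation to the first.
    return HAM if len(x.text.split()) < 5 else ABSTAIN

df_train = pd.DataFrame({"text": [
    "check out http://example.com now",
    "ok thanks see you later",
    "free prizes at http://spam.example",
    "lunch tomorrow?",
]})

# Apply all labeling functions to get an (n_examples x n_LFs) label matrix.
applier = PandasLFApplier([lf_contains_link, lf_short_message])
L_train = applier.apply(df_train)

# Fit the generative label model on the LF outputs alone -- no ground truth.
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train, n_epochs=100, seed=42)

# Probabilistic ("soft") labels for training a downstream discriminative model.
probs_train = label_model.predict_proba(L_train)
```

The key point of the paradigm is the last two steps: the label model estimates each labeling function's accuracy from agreements and disagreements alone, then emits denoised probabilistic labels.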

Cited by 468 publications (68 citation statements)
References 42 publications
Citation statements: 0 supporting, 68 mentioning, 0 contrasting
“…However, our results hint that periphery nodes could also be noisy sources of information and possibly warrant omission in standard link prediction. Our fringe measurements can also be viewed as adding noisy training data, which is related to training data augmentation methods [29,30].…”
Section: Discussion (mentioning)
confidence: 99%
“…Some of them use filtering to identify potentially mislabeled examples in the training dataset. This kind of filter is usually based on the labels of close neighbours (similar instances) [16] or exploit the disagreements in the prediction of classifiers trained on different portions of the dataset [15,17].…”
Section: Classification Under Weak Supervision (mentioning)
confidence: 99%
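As a rough illustration of the neighbour-based filtering this excerpt describes, the sketch below flags an instance as potentially mislabeled when its label disagrees with the majority label of its k nearest neighbours. This is a generic scikit-learn construction, not the specific method of [16]:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def flag_suspect_labels(X, y, k=5):
    """Flag instances whose label disagrees with the majority label of
    their k nearest neighbours (excluding the instance itself)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)           # idx[:, 0] is the point itself
    neighbor_labels = y[idx[:, 1:]]     # shape: (n_samples, k)
    # Majority vote among the neighbours of each instance.
    majority = np.array([np.bincount(row).argmax() for row in neighbor_labels])
    return majority != y                # True = potentially mislabeled

# Toy usage: one deliberately flipped label in a separable dataset.
X = np.array([[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2], [1.3]])
y = np.array([0, 0, 1, 0, 1, 1, 1, 1])  # the label at index 2 is suspicious
print(flag_suspect_labels(X, y, k=3))   # flags only index 2
```

The disagreement-based variant in [15,17] follows the same pattern, but replaces the neighbour vote with predictions from classifiers trained on different portions of the dataset.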
“…In [19], Ratner et al transform a set of weak supervision sources, that may disagree with each other, into soft labels used to train a discriminative model. They show experimentally that this approach outperforms the naïve majority voting strategy for generating the target labels.…”
Section: Related Work (mentioning)
confidence: 99%
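For contrast with the naïve baseline this excerpt mentions, a minimal majority-vote aggregator over a label matrix L (rows are examples, columns are weak sources, -1 means abstain) can be written as follows; Snorkel's generative model instead produces soft labels weighted by each source's estimated accuracy:

```python
import numpy as np

def majority_vote(L, cardinality=2, abstain=-1):
    """Naive baseline: per example, take the most common non-abstain vote
    among the weak sources; a tie or all-abstain row yields abstain."""
    labels = np.full(L.shape[0], abstain)
    for i, row in enumerate(L):
        votes = row[row != abstain]
        if votes.size == 0:
            continue                       # every source abstained
        counts = np.bincount(votes, minlength=cardinality)
        winners = np.flatnonzero(counts == counts.max())
        if winners.size == 1:              # unique winner; otherwise abstain
            labels[i] = winners[0]
    return labels

# Three weak sources voting on four examples (-1 = abstain).
L = np.array([
    [1, 1, -1],    # two sources agree        -> 1
    [0, 1, 0],     # majority                 -> 0
    [-1, -1, -1],  # no votes                 -> abstain
    [0, 1, -1],    # tie                      -> abstain
])
print(majority_vote(L))  # [ 1  0 -1 -1]
```

Majority voting weights every source equally and discards ties, which is exactly where the experiments in [19] show probabilistic soft labels gain their advantage.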