Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer 2021
DOI: 10.18653/v1/2021.acl-demo.40

skweak: Weak Supervision Made Easy for NLP

Abstract: We present skweak, a versatile, Python-based software toolkit enabling NLP developers to apply weak supervision to a wide range of NLP tasks. Weak supervision is an emerging machine learning paradigm based on a simple idea: instead of labelling data points by hand, we use labelling functions derived from domain knowledge to automatically obtain annotations for a given dataset. The resulting labels are then aggregated with a generative model that estimates the accuracy (and possible confusions) of each labellin…
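The idea described in the abstract can be illustrated with a minimal sketch in plain Python. It does not use skweak's API: the label names, the three labelling functions, and the `aggregate` helper are all hypothetical, and the aggregation step is a simple majority vote standing in for the generative model that the abstract mentions.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical label names for a toy sentiment task (not from the paper).
POSITIVE, NEGATIVE = "POS", "NEG"

# A labelling function maps a text to a label, or None to abstain.
LabellingFunction = Callable[[str], Optional[str]]


def lf_positive_words(text: str) -> Optional[str]:
    """Vote POS if the text contains a word from a small hand-written lexicon."""
    words = set(text.lower().split())
    return POSITIVE if words & {"good", "great", "excellent"} else None


def lf_negative_words(text: str) -> Optional[str]:
    """Vote NEG if the text contains a word from a small hand-written lexicon."""
    words = set(text.lower().split())
    return NEGATIVE if words & {"bad", "poor", "terrible"} else None


def lf_exclamation(text: str) -> Optional[str]:
    """Weak heuristic: a trailing exclamation mark suggests POS."""
    return POSITIVE if text.rstrip().endswith("!") else None


def aggregate(text: str, lfs: List[LabellingFunction]) -> Optional[str]:
    """Combine the votes by majority; abstain when no function fires or on a tie.

    This is a simplification: it weights every function equally instead of
    estimating per-function accuracies as a generative model would.
    """
    votes = [label for lf in lfs if (label := lf(text)) is not None]
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between labels
    return ranked[0][0]


LFS: List[LabellingFunction] = [lf_positive_words, lf_negative_words, lf_exclamation]
```

Calling `aggregate("A great film!", LFS)` collects two POS votes and returns `"POS"`, while a text on which the functions disagree evenly yields `None`. A learned aggregation model would instead weight each function by its estimated reliability.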

Cited by 20 publications (4 citation statements)
References 27 publications
“…Recently, more packages and environments (e.g. Snorkel [43], skweak [50]) have been created to apply weak supervision in general domain NLP practice. Thus, a promising future study is to adapt the current weak supervision infrastructures or the ideas behind them to the clinical NLP domain and establish best practice in the field, and a recent work is Trove [42].…”
Section: Conclusion Discussion and Future Studies
Type: mentioning
confidence: 99%
“…Work in progress aims to extend this to other types of entities (locations and social media links) and approach the problem in a multitask framework to leverage the connections in the output data (e.g., person names might be part of social media links too, but are unlikely to be part of a location). We also aim to improve the connections between the different components using a weak supervision framework, particularly Skweak [12].…”
Section: Phase 1: Specific Applications and Domains
Type: mentioning
confidence: 99%
“…When creating a dataset to train our model, we need to define the assignment of labels by existing linguistic properties of each class. To assign speech act annotations to linguistic expressions containing the linguistic properties just mentioned, we use the weak supervision library skweak [22]. This library enables the generation of so-called weak labels through annotator functions.…”
Section: B Heuristic Labelling Using Weak Supervision
Type: mentioning
confidence: 99%