2020
DOI: 10.14778/3415478.3415559
Leveraging organizational resources to adapt models to new data modalities

Abstract: As applications in large organizations evolve, the machine learning (ML) models that power them must adapt the same predictive tasks to newly arising data modalities (e.g., a new video content launch in a social media application requires existing text or image models to extend to video). To solve this problem, organizations typically create ML pipelines from scratch. However, this fails to utilize the domain expertise and data they have cultivated from developing tasks for existing modalities. We demonstrate …

Cited by 5 publications (2 citation statements)
References: 71 publications
“…We implement semantic rules via similarity based on contextual token embeddings (BERT [5] and ELMo [16]). In addition to the active learning approach previously implemented in Ruler, TagRuler proposes a novel active learning component focusing on false positives, thus contributing to one of the main challenges in data programming: surfacing difficult (or borderline) labeled examples [20]. In this active learning approach, unlabeled examples that have higher potential to identify false positives will have higher probability to be sampled as next instance to be labeled.…”
Section: Data Programming by Demonstration (DPBD); citation type: mentioning
confidence: 99%
“…TagRuler is a system designed to learn from expert knowledge, and expert manual annotation is expensive, so it is important to obtain informative data with as few annotations as possible. The active learning approach focuses on contributing to this main challenge in data programming: generating difficult (or borderline) examples [20]. TagRuler samples the text to be displayed in Figure 2 (A) after each annotation using an active learning technique that leverages the trained label model and a small labeled development set.…”
Section: Active Sampler; citation type: mentioning
confidence: 99%