2022
DOI: 10.48550/arxiv.2205.05435
Preprint

Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers

Cited by 2 publications (2 citation statements)
References 0 publications
“…(2) Supervised machine learning methods use labeled data to classify and predict. However, these methods are sensitive to changes in the data distribution, such as shifts in the domain or temporal changes, as shown in recent research (Alkhalifa et al, 2022; AL-Sharuee et al, 2021; Bjerva et al, 2019, inter alia). This sensitivity may potentially affect the accuracy of our analysis, particularly since our questions focus on changes manifested in the data over time.…”
Section: Lexicons As Sentiment Classifiers (mentioning)
confidence: 99%
“…It is motivated by recent research showing that the performance of the models drops as the test data becomes more distant, with respect to time, from the training data. This is true for classification [1,11,16], but also the research in information retrieval shows that deep neural network-based IR systems are dependent on the consistency between the train and test data [20]. To be able to study this, one needs several test collections created over sequential time periods, which would allow doing observations at different time stamps 𝑡, and most importantly, comparing the performance across different time stamps 𝑡 and 𝑡′.…”
Section: Longeval Collections (mentioning)
confidence: 99%