Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP 2022
DOI: 10.18653/v1/2022.nlppower-1.1

Raison d’être of the benchmark dataset: A Survey of Current Practices of Benchmark Dataset Sharing Platforms

Abstract: This paper critically examines the current practices of benchmark dataset sharing in NLP and suggests a better way to inform reusers of the benchmark dataset. As the dataset sharing platform plays a key role not only in distributing the dataset but also in informing the potential reusers about the dataset, we believe data sharing platforms should provide a comprehensive context of the datasets. We survey four benchmark dataset sharing platforms: HuggingFace, PaperswithCode, Tensorflow, and Pytorch to diagnose …

Cited by 1 publication (1 citation statement)
References 18 publications
“…We estimated the knowledge by topics of the thread that askers and answerers collaborated on to address RQ 1. We used the topic modeling algorithm which takes word embeddings and estimates distinctive topics in the vector space (Angelov, 2020), which was enabled through several deep learning natural language processing (NLP) frameworks (Park & Jeoung, 2022). As we used a thread as the unit of analysis and the topic vector will be given to threads, we can estimate which topic the users asked and answered.…”
Section: Methods; citation type: mentioning
confidence: 99%