2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019
DOI: 10.1109/msr.2019.00021

Cleaning StackOverflow for Machine Translation

Cited by 6 publications
(3 citation statements)
References 31 publications
“…In addition to the raw text data associated with messages, we expect to process the mailing lists to automatically identify text structures such as email signatures, code and quotes in replies [23]. Quotes, in particular, are potentially powerful as they provide a window into moderation norms and practices.…”
Section: Challenges Improvements and Future Work (mentioning)
confidence: 99%
“…In [25], the author utilizes a machine learning approach to detect stack size, which is best for beam threshold runtime values for machine translation. In [26], the authors propose a sentiment analysis approach for MT.…”
Section: Background and Literature Survey (mentioning)
confidence: 99%
“…While we focus on bug fix identification as a single, running example, this problem is more general, spanning any classification of interest (e.g., tangled commits [3]), on any artifact of interest (e.g., bug reports [11], Stack Overflow [12]), for which simplistic heuristics are used to categorize the artifacts. We argue that using heuristics is not the problem, rather it is using not enough of them, for lack of a way to effectively gather, integrate, and combine them.…”
Section: Introduction (mentioning)
confidence: 99%