Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Softw 2021
DOI: 10.1145/3468264.3468569

Validation on machine reading comprehension software without annotated labels: a property-based method

Citations: cited by 21 publications (4 citation statements)
References: 41 publications
“…As introduced above, QA software has been widely used in daily human life, thus there is an urgent demand to assure the quality of its returned answers and reveal its undisclosed defects. But currently, almost all the NLP models, including the core models in QA software, are mainly tested in the reference-based paradigm Ribeiro et al (2020); Chen et al (2021b). As explained in Section 1, using this test paradigm, the testers must obtain a well-annotated benchmark dataset at first, which means that the manually annotated reference answers are mandatory during testing QA software.…”
Section: Motivation (citation type: mentioning)
confidence: 99%
“…Besides, we propose to validate the machine reading comprehension (MRC) DL models with MT in our previous work Chen et al (2021b). It aims to provide the MRC models with one systematic and extensible assessment of language understanding capabilities against required linguistic properties.…”
Section: Metamorphic Testing For Deep Learning Software (citation type: mentioning)
confidence: 99%
“…Constructing proper oracles has long been difficult for testing DNNs [93]. Metamorphic or differential testing has been used extensively to overcome the difficulty of explicitly establishing testing oracles [24,68,73,84]. Consequently, DNNs are considered incorrect if they produce inconsistent results.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…To discover erroneous behaviors in NLP software, researchers have designed various software testing techniques [10,25,40,63,81]. A test case for NLP software is in the form of a text (e.g., a sentence) and its label, where the label is the expected correct output of the NLP software.…”
Section: Introduction (citation type: mentioning)
confidence: 99%