Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis 2022
DOI: 10.1145/3533767.3534394
AEON: a method for automatic evaluation of NLP test cases

Cited by 14 publications (5 citation statements)
References 50 publications
“…By leveraging web-based platforms, educators can create interactive and engaging learning experiences for students, facilitating language acquisition and proficiency. Huang et al. (2022) proposed the AEON method for the automatic evaluation of NLP test cases, which contributes to the advancement of automated testing techniques in natural language processing. Automated evaluation methods like AEON are essential for ensuring the accuracy and reliability of NLP systems in applications such as chatbots, virtual assistants, and machine translation.…”
Section: Literature Review
confidence: 99%
“…By considering factors such as word frequency, concreteness, and connotation, these models can provide a more nuanced evaluation of vocabulary usage. The development of large-scale annotated datasets for vocabulary assessment has facilitated the training and evaluation of machine learning models [6]. These datasets contain diverse texts with annotated vocabulary levels, allowing researchers to benchmark the performance of different algorithms and improve their accuracy over time.…”
Section: Introduction
confidence: 99%
“…NLP software, such as machine translation software and chatbots, has also been widely used in human life. As with AI software, researchers have proposed various methods to validate the reliability of NLP software with respect to correctness [18,19,40,44], toxicity [53,54], and fairness [47,52,57].…”
Section: Related Work 6.1 Testing of AI Software
confidence: 99%
“…The existence of adversarial samples has negative effects on security-sensitive DNN-based applications, such as self-driving [27] and medical diagnosis [6,7]. Therefore, it is necessary to understand DNNs [15,32,33,40] and enhance attack algorithms to better identify a DNN model's vulnerabilities, which is the first step toward improving their robustness against adversarial samples [23,24,42].…”
Section: Introduction
confidence: 99%