Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

Qi, Ji; Chen, Yuxiang; Hou, Lei; Li, Juanzi; Xu, Bin

doi:10.48550/arxiv.2301.06841

Cited by 1 publication

(4 citation statements)

References 23 publications

“…Despite the widespread interest in these benchmarks and the related OpenIE approaches provides promising results. However, the traditional peer-to-peer matching-based evaluation can not measure the robustness of those approaches, where the syntax and expression may be various with underlying meaning (Qi et al, 2023). This work significantly fills the gap between traditional metrics and missed robustness evaluation for OpenIE and calls for more efforts in this research area.…”

Section: Related Work (mentioning)

confidence: 96%

“…In order to analyze the syntactic divergence in the cliques, we need a metric to measure the syntactic correlation between two sentences. A fast and effective algorithm is the HWS distance proposed in (Qi et al, 2023), which calculates the syntactic tree distance between two sentences based on a hierarchically weighted matching strategy, where smaller weights imply a greater focus on the comparison of skeletons. The value domain of this is [0, 1], where 1 indicates the farthest distance.…”

Section: Syntactic Analysis (mentioning)

confidence: 99%
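
The citation statement above describes the HWS distance only at a high level: a tree distance built from hierarchically weighted matching, with values in [0, 1]. The sketch below is not the algorithm from Qi et al. (2023); it is a toy stand-in that illustrates the general idea of weighting tree levels differently. The tree encoding (nested `(label, children)` tuples), the function names, and the `decay` parameter are all assumptions made for this illustration.

```python
from collections import Counter


def labels_by_depth(tree, depth=0, out=None):
    """Collect a multiset of node labels for every depth of the tree.

    A tree is a hypothetical (label, children) tuple, e.g. ("S", [("NP", [])]).
    """
    if out is None:
        out = {}
    label, children = tree
    out.setdefault(depth, Counter())[label] += 1
    for child in children:
        labels_by_depth(child, depth + 1, out)
    return out


def toy_weighted_tree_distance(tree_a, tree_b, decay=0.5):
    """Toy level-weighted distance in [0, 1]; NOT the published HWS algorithm.

    Each depth contributes its label mismatch ratio, weighted by decay**depth,
    so a smaller decay puts more emphasis on the upper levels (the skeleton).
    """
    levels_a = labels_by_depth(tree_a)
    levels_b = labels_by_depth(tree_b)
    total, norm = 0.0, 0.0
    for depth in range(max(len(levels_a), len(levels_b))):
        a = levels_a.get(depth, Counter())
        b = levels_b.get(depth, Counter())
        overlap = sum((a & b).values())  # labels shared at this depth
        size = max(sum(a.values()), sum(b.values()))
        weight = decay ** depth
        total += weight * (1 - overlap / size)
        norm += weight
    return total / norm  # normalised so 1 is the farthest distance


if __name__ == "__main__":
    t1 = ("S", [("NP", []), ("VP", [])])
    t3 = ("S", [("NP", []), ("PP", [])])
    print(toy_weighted_tree_distance(t1, t1))
    print(toy_weighted_tree_distance(t1, t3))
```

Normalising by the sum of the level weights keeps the result in [0, 1] regardless of tree depth, which mirrors the value range stated in the citation.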

“…The revised Hierarchically Weighted Syntactic Distance Algorithm (HWS distance) is shown in algorithm 1. We fix the over-counting problem for repeated consecutive spans while preserving the efficiency with the same time complexity in the original work (Qi et al, 2023).…”

Section: A21 Hierarchically Weighted Syntactic (mentioning)

confidence: 99%

“…Research including these efforts has been devoted to evaluating the pairwise matching correctness between model extractions and golden facts on a sentence. However, the conventional evaluation benchmarks do not measure the robustness of models in the realistic open-world scenario, where the syntactic and expressive forms may vary under the same knowledge meaning (Qi et al, 2023). As shown in Figure 1, while the three sentences s1, s2, s3 contain the same structured knowledge (a1, p, a2, a3), the state-of-the-art model OpenIE6 successfully extracts facts (in green color) on sentence s1, but fails to predict arguments (in red color) on the other sentences due to the syntactic and expressive drifts.…”

Section: Introduction (mentioning)

confidence: 99%

Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction

Qi, Zhang, Wang et al. 2023

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The robustness to distribution changes ensures that NLP models can be successfully applied in the realistic world, especially for information extraction tasks. However, most prior evaluation benchmarks have been devoted to validating pairwise matching correctness, ignoring the crucial validation of robustness. In this paper, we present the first benchmark that simulates the evaluation of open information extraction models in the real world, where the syntactic and expressive distributions under the same knowledge meaning may drift variously. We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique that consists of sentences with structured knowledge of the same meaning but with different syntactic and expressive forms. By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate on the overall cliques. We perform experiments on typical models published in the last decade as well as a representative large language model, and the results show that the existing successful models exhibit a frustrating degradation, with a maximum drop of 23.43 F1 score. Our resources and code are available at https://github.com/qijimrc/ROBUST.
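
The abstract says a model counts as robust when it is consistently accurate across each clique, but it does not spell out the formula. One plausible reading, assumed here purely for illustration, is to score each clique by its worst sentence and then average over cliques. The function names and the example numbers below are hypothetical and are not taken from the paper or its code.

```python
def sentence_level_mean(f1_by_clique):
    """Conventional view: average F1 over all sentences, ignoring cliques."""
    scores = [s for clique in f1_by_clique for s in clique]
    return sum(scores) / len(scores)


def clique_worst_case_mean(f1_by_clique):
    """Assumed robustness view: each clique counts only as its worst sentence.

    f1_by_clique is a list of cliques, each a non-empty list of per-sentence
    F1 scores for paraphrases that carry the same structured knowledge.
    """
    worst = [min(clique) for clique in f1_by_clique]
    return sum(worst) / len(worst)


if __name__ == "__main__":
    # Made-up scores: two cliques with three and two paraphrases each.
    cliques = [[0.9, 0.5, 0.7], [0.8, 0.8]]
    print(sentence_level_mean(cliques))
    print(clique_worst_case_mean(cliques))
```

Under this reading the clique-level score can never exceed the best sentence in any clique, so a model that does well on only one paraphrase per clique is penalised, which is the kind of degradation the abstract reports.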
