DOI: 10.29007/398z

A Data-Driven Metric of Hardness for WSC Sentences

Abstract: The Winograd Schema Challenge (WSC), the task of resolving pronouns in certain sentences where shallow parsing techniques seem not to be directly applicable, has been proposed as an alternative to the Turing Test. According to Levesque, having access to a large corpus of text would likely not help much in the WSC. Among a number of attempts to tackle this challenge, one particular approach has demonstrated the plausibility of using commonsense knowledge automatically acquired from raw text in English Wikiped…

Cited by: 4 publications (18 citation statements)
References: 5 publications (7 reference statements)
“…1). Contributors are allowed to take on this second role if they meet two requirements: first, the percentage of their valid and approved (by other Evaluators) schemas among those that they have contributed that far exceeds a certain threshold (which we have set to be 90%, corresponding to the bar for near adult human abilities on the WSC [3]); second, their score (which we discuss later) is above a certain other threshold. Contributors who are also Evaluators choose the role in which they interact with WinoFlexi at login time.…”
Section: Contributing and Evaluating (citation type: mentioning)
confidence: 99%
“…Towards this goal, we follow a single-step approach for labeling schemas with a hardness score which indirectly shows if a schema is considered hard to answer by a machine; Winograd schemas are accordingly labeled as such by the computed hardness index. For this purpose we use a recent tool [3] that can take any Winograd schema and output a score that shows its hardness index. The hardness index is presented to the Contributors and the Evaluators.…”
Section: Un-validated Schemas (citation type: mentioning)
confidence: 99%
“…On a second front, we expect that the adoption and use of WSC-based CAPTCHAs will encourage more AI researchers to work on the problem of actually trying to solve the WSC, and perhaps, in the process, help towards the building of machines able to reason with commonsense knowledge. At the same time, it will also present AI researchers with the novel challenge of automating the construction of new WSC instances, or evaluating how hard they might be to humans (as pursued, for example, in [13]).…”
Section: Discussion (citation type: mentioning)
confidence: 99%