Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing - SemiSupLearn '09 2009
DOI: 10.3115/1621829.1621830

Coupling semi-supervised learning of categories and relations

Abstract: We consider semi-supervised learning of information extraction methods, especially for extracting instances of noun categories (e.g., 'athlete,' 'team') and relations (e.g., 'playsForTeam(athlete,team)'). Semi-supervised approaches using a small number of labeled examples together with many unlabeled examples are often unreliable as they frequently produce an internally consistent, but nevertheless incorrect set of extractions. We propose that this problem can be overcome by simultaneously learning classifiers f…

Cited by 45 publications
(37 citation statements)
References 16 publications
“…Information about relation similarity is used in training and evaluation, as it roughly indicates how confusable the linguistic expression of two relations are. This would indicate, for example, that relation colearning (Carlson et al 2009) would not work for similar relations. Ambiguity is defined for each relation as the max relation similarity for the relation.…”
Section: Crowd Truth (mentioning)
confidence: 99%
“…While these earlier methods showed the feasibility of semi-supervised learning of extraction patterns, they were limited because accurate learning requires more constraints than are provided by a few dozen labeled training examples. Our algorithm achieves significantly higher accuracy by using the input ontology itself to provide additional constraints that guide the learner [9]. For example, when our algorithm learns extraction patterns for the predicates 'person', 'team' and 'plays-on-team', prior knowledge from the ontology requires that for any unlabeled sentence containing noun phrases A and B, the extractor for 'plays-on-team' can label <A, B> a positive example of the relation only if the 'person' classifier labels A positive, and the 'team' classifier labels B positive.…”
Section: The Problem (mentioning)
confidence: 99%
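
The type-checking constraint described in the statement above can be illustrated with a minimal Python sketch. It is not the cited system's implementation: the function name `type_checked`, the three predicate callables, and the toy noun phrases are all hypothetical and exist only to show that a relation candidate <A, B> is accepted only when both argument classifiers agree.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]

def type_checked(
    candidates: Iterable[Pair],
    relation_fires: Callable[[Pair], bool],
    arg1_ok: Callable[[str], bool],
    arg2_ok: Callable[[str], bool],
) -> List[Pair]:
    """Keep <A, B> only if the relation extractor fires AND
    the category classifiers accept A and B respectively."""
    return [
        (a, b)
        for a, b in candidates
        if relation_fires((a, b)) and arg1_ok(a) and arg2_ok(b)
    ]

# Hypothetical toy data standing in for trained classifiers.
persons = {"J. Doe"}
teams = {"Red Hawks"}
raw_hits = {
    ("J. Doe", "Red Hawks"),
    ("Red Hawks", "J. Doe"),
    ("J. Doe", "Springfield"),
}
candidates = [
    ("J. Doe", "Red Hawks"),
    ("Red Hawks", "J. Doe"),    # arguments in the wrong order
    ("J. Doe", "Springfield"),  # second argument is not a team
    ("K. Roe", "Red Hawks"),    # relation extractor does not fire
]

accepted = type_checked(
    candidates,
    relation_fires=lambda pair: pair in raw_hits,
    arg1_ok=lambda np: np in persons,
    arg2_ok=lambda np: np in teams,
)
print(accepted)
```

Only the pair whose first argument is a known person and whose second argument is a known team survives, even though the raw relation extractor fires on three of the four candidates.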
“…The textual pattern learner, CBL [9], iteratively grows a set of extraction patterns while obeying mutual exclusion, subset, and type checking constraints given by the ontology. The HTML pattern learner, SEAL [10], learns patterns of HTML and text tokens that capture regularities such as HTML lists of predicate instances.…”
Section: The Readtheweb System (mentioning)
confidence: 99%
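
As a rough illustration of how mutual exclusion and subset constraints might filter candidates in one bootstrapping iteration, consider the Python sketch below. It is an assumption-laden simplification, not CBL itself: it ignores pattern ranking and confidence scores, assumes the subset hierarchy is acyclic, and the names `promote`, `ancestors`, `close_upward`, and the toy instances are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Assignments = Dict[str, Set[str]]

def ancestors(cat: str, parent: Dict[str, str]) -> List[str]:
    """Return cat followed by its supersets (assumes no cycles)."""
    chain = [cat]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def close_upward(assignments: Assignments, parent: Dict[str, str]) -> Assignments:
    """Subset constraint: an instance of a category is an instance of every superset."""
    closed: Assignments = {}
    for cat, insts in assignments.items():
        for c in ancestors(cat, parent):
            closed.setdefault(c, set()).update(insts)
    return closed

def promote(
    candidates: Assignments,
    believed: Assignments,
    mutex: Set[FrozenSet[str]],
    parent: Dict[str, str],
) -> Assignments:
    """Add candidates to the believed set unless a mutual exclusion is violated."""
    known = close_upward(believed, parent)
    proposed = close_upward(candidates, parent)

    def conflicts(cat: str, inst: str) -> bool:
        for pair in mutex:
            if cat in pair:
                for other in pair - {cat}:
                    if inst in known.get(other, set()) or inst in proposed.get(other, set()):
                        return True
        return False

    result = {cat: set(insts) for cat, insts in known.items()}
    for cat, insts in candidates.items():
        for inst in insts:
            chain = ancestors(cat, parent)
            if any(conflicts(c, inst) for c in chain):
                continue  # reject: inst is claimed by an exclusive category
            for c in chain:
                result.setdefault(c, set()).add(inst)
    return result

# Hypothetical toy ontology and instances.
believed = {"athlete": {"a1"}, "team": {"t1"}}
candidates = {"athlete": {"a2", "t1"}, "team": {"t2", "a2"}}
mutex = {frozenset({"person", "team"})}
parent = {"athlete": "person"}

result = promote(candidates, believed, mutex, parent)
print(result)
```

In this toy run, `a2` is proposed both as an athlete (hence a person) and as a team, so it is rejected from both; `t1` is already believed to be a team, so it cannot become an athlete; only `t2` is promoted. Rejecting an ambiguous candidate everywhere is one conservative design choice among several.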
“…9 The pseudo-code of the proposed algorithm to wrapper induction. It can be noticed that the first concept k_1 aggregates in itself the information about the 6 , o 7 surrounded from the left by such prefixes as <li class = "film title"><br/> and <ul><li class = "film title"><br/> etc. The objects o_i ∈ k_1 are surrounded by HTML tokens expansions of lengths conceptLength(k_1) = {2, 3}.…”
Section: Fig (mentioning)
confidence: 99%
“…[21]. IESs, such as Never-Ending Language Learner (NELL), Know It All, TextRunner, or Snowball represent this approach [1,3,6,9,10,22,23,56,59,68,78]. The systems mentioned above represent the trend called open IE.…”
Section: State Of the Art And Related Work (mentioning)
confidence: 99%