Proceedings of the First International Conference on Human Language Technology Research (HLT '01), 2001
DOI: 10.3115/1072133.1072142

Automatic pattern acquisition for Japanese information extraction

Abstract: One of the central issues for information extraction is the cost of customization from one scenario to another. Research on the automated acquisition of patterns is important for portability and scalability. In this paper, we introduce the Tree-Based Pattern representation, in which a pattern is denoted as a path in the dependency tree of a sentence. We outline the procedure for acquiring Tree-Based Patterns in Japanese from un-annotated text. The system extracts the relevant sentences from the training data based on TF/…

Cited by 25 publications (35 citation statements). References 1 publication.
“…To automatically decide whether a sentence is definitional, we could use a simple cutoff in which sentences that are ranked more highly are considered definitional. This is similar to the work by Sudo et al. [19], who proposed an unsupervised learning method for pattern discovery that uses TF-IDF weights to select a set of relevant documents and sentences and then builds patterns from them.…”
Section: Step 2: Unsupervised Labeling Using PRF
confidence: 64%
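As a rough illustration of the ranking-and-cutoff idea in this excerpt, the sketch below scores sentences by the average TF-IDF weight of their tokens and keeps those above a threshold. The whitespace tokenization, the averaging scheme, and the cutoff value are assumptions for illustration, not details taken from either cited work.

```python
import math
from collections import Counter

def tfidf_rank_sentences(sentences, cutoff=0.1):
    """Rank sentences by the average TF-IDF weight of their tokens and
    keep those scoring above an (illustrative) relevance cutoff."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    # document frequency: in how many sentences each token appears
    df = Counter(tok for toks in docs for tok in set(toks))
    kept = []
    for sent, toks in zip(sentences, docs):
        if not toks:
            continue
        tf = Counter(toks)
        score = sum((tf[t] / len(toks)) * math.log(n / df[t]) for t in tf) / len(tf)
        if score >= cutoff:
            kept.append((score, sent))
    kept.sort(reverse=True)
    return [sent for _, sent in kept]
```

Sentences returned first are the strongest candidates; in the cited setting, the highest-ranked ones would be the ones treated as definitional.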
“…Similarly, Yangarber et al. [21] used a set of basic patterns as "seeds" and learned more scenario-oriented extraction patterns automatically. Most relevant to our application of PRF, Sudo et al. [19] put forward an unsupervised learning method for pattern discovery. They utilized TF×IDF to obtain a set of relevant documents and sentences and built patterns from them.…”
Section: Related Work
confidence: 99%
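Complementing the sentence-level cutoff sketched earlier, the document-level retrieval step mentioned here can be pictured as follows. The scenario query terms and the top-k parameter are hypothetical; this only shows the general shape of a TF×IDF-based selection of relevant documents before pattern building, not the cited systems' actual procedure.

```python
import math
from collections import Counter

def retrieve_relevant_docs(documents, query_terms, top_k=10):
    """Score each document against scenario query terms with TF*IDF
    and return the top-k documents as the relevant set."""
    tokenized = [doc.lower().split() for doc in documents]
    n = len(tokenized)
    # document frequency of each token across the collection
    df = Counter(tok for toks in tokenized for tok in set(toks))
    scored = []
    for doc, toks in zip(documents, tokenized):
        tf = Counter(toks)
        score = sum(tf[q] * math.log(n / df[q]) for q in query_terms if q in tf)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```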
“…Several recent approaches to IE have used patterns based on a dependency analysis of the input text (Yangarber, 2003; Sudo et al., 2001; Sudo et al., 2003; Bunescu and Mooney, 2005). These approaches have used a variety of pattern models (schemes for representing IE patterns based on particular parts of the dependency tree).…”
Section: Introduction
confidence: 99%
“…The predicate-argument (SVO) model allows subtrees containing only a verb and its direct subject and object as extraction pattern candidates (Yangarber, 2003). The chain model represents extraction patterns as a chain-shaped path from each target slot value to the root node of the dependency tree (Sudo et al., 2001). Two chain-model patterns sharing the same verb are linked to each other to construct a linked-chain model pattern (Greenwood and Stevenson, 2006).…”
Section: Introduction
confidence: 99%
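The chain model described in this excerpt is easy to picture with a small sketch: starting from a slot-filling node, walk the head pointers up to the root and record the path. The toy tree and dependency labels below are a hypothetical English example, not data from the cited papers (the original chain-model work operates on Japanese dependency trees).

```python
def chain_pattern(heads, labels, slot):
    """Return the chain of (token, dependency label) pairs from the
    slot token up to the root (whose head is None)."""
    chain = []
    node = slot
    while node is not None:
        chain.append((node, labels.get(node, "root")))
        node = heads[node]
    return chain

# toy dependency tree for "Acme hired Alice as CEO"
heads = {"hired": None, "Acme": "hired", "Alice": "hired",
         "as": "hired", "CEO": "as"}
labels = {"Acme": "subj", "Alice": "obj", "as": "prep", "CEO": "pobj"}

# chain pattern anchored at the slot value "CEO":
# [("CEO", "pobj"), ("as", "prep"), ("hired", "root")]
print(chain_pattern(heads, labels, "CEO"))
```

By contrast, the SVO model would keep only the verb with its direct subject and object, and a linked-chain pattern would join two such chains that share the same verb.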
“…A number of pattern induction approaches based on dependency analysis have recently been investigated (Yangarber, 2003; Sudo et al., 2001; Greenwood and Stevenson, 2006; Sudo et al., 2003). The natural language texts in the training instances are parsed by a dependency analyzer and converted into dependency trees.…”
Section: Introduction
confidence: 99%
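As a concrete, hedged example of that conversion step, the snippet below uses spaCy as a stand-in dependency analyzer and records each token's head and dependency label; any off-the-shelf analyzer producing head/label pairs would do, and the original work uses Japanese analyzers rather than this English model.

```python
import spacy

# spaCy's small English model stands in for a dependency analyzer;
# it must be installed separately (python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")
doc = nlp("Acme hired Alice as CEO.")

heads, labels = {}, {}
for token in doc:
    # spaCy marks the root by pointing its head at itself; map it to None
    heads[token.text] = None if token.head is token else token.head.text
    labels[token.text] = token.dep_
    print(f"{token.text:>6} --{token.dep_}--> {token.head.text}")
```

The resulting head/label maps are exactly the kind of structure that the chain-model sketch above walks over.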