2006
DOI: 10.1017/s1351324906004116

InfoXtract: A customizable intermediate level information extraction engine

Abstract: Information Extraction (IE) systems assist analysts to assimilate information from electronic documents. This paper focuses on IE tasks designed to support information discovery applications. Since information discovery implies examining large volumes of heterogeneous documents for situations that cannot be anticipated a priori, they require IE systems to have breadth as well as depth. This implies the need for a domain-independent IE system that can easily be customized for specific domains: end users must be…

Cited by 33 publications (19 citation statements)
References 22 publications
“…These are considered Location Entities). Srihari et al [2008] discuss the architecture and the grammar framework of Semantex™ in detail. We have used the grammar toolkit of Semantex™ to develop the morphological analyzer for Urdu.…”
Section: Example 2 [:: [\P{alphabetic}]+‫\پر‬b::/t (Nnp | Nnpc) ∼Neo
Citation type: mentioning
confidence: 99%
“…Further, text decomposition in accordance with semantics-based text themes (Salton, Singhal, Buckley, & Mitra, 1996) is an important future task. In particular, information extraction (Srihari, Li, Cornel, & Niu, 2006), feature identification (Popescu & Etzioni, 2005), aspect identification (Kobayashi, Iida, Inui, & Matsumoto, 2006), or discourse analysis (Teufel & Moens, 2002) facilitate the identification of text segments related to specific properties of an ontology concept. For example, 13 sentential features were extracted to classify scientific articles into a fixed set of rhetorical categories (Teufel & Moens, 2002).…”
Section: Limitations and Future Work
Citation type: mentioning
confidence: 99%
“…An Information Extraction (IE) engine, Semantex [2] was used to extract the features from document collections. Semantex tags named entities, common relationships associated with person and organization, as well as providing subject-verb-object (SVO) relationships.…”
Section: Data Preparation
Citation type: mentioning
confidence: 99%
“…Formally, given the source topic $c_1$ and destination topic $c_n$, a concept chain $CC_{1 \to n}$ is basically a sequence of concepts $c_1 \to c_2 \to c_3, \ldots, c_{n-1} \to c_n$, the transitive strength of a path from $c_1$ to $c_n$ made up of the links $\{(c_1, c_2), \ldots, (c_{n-1}, c_n)\}$, denoted by $g(c_1, c_n)$, is given by the following equation: (2) where $\omega(c_i, c_{i+1})$ is the weight of the edge between $c_i$ and $c_{i+1}$ in the CAG. We call a path an optimal path from $c_1$ to $c_n$ with path length $l$ if its transitive strength is maximal among all the possible paths with length $l$.…”
Section: Concept Chain Queries
Citation type: mentioning
confidence: 99%
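
The optimal-path definition in the statement above can be illustrated with a small sketch. The excerpt does not reproduce equation (2), so the way edge weights combine into a transitive strength is left as a parameter here; the default (a product of edge weights) is purely an illustrative assumption, not the cited authors' formula. The function name `optimal_chain`, the dictionary-based graph, and the restriction to paths that do not revisit a concept are likewise hypothetical choices made for this example.

```python
from math import prod
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Weighted directed graph: graph[u][v] is the weight of the edge u -> v.
Graph = Dict[str, Dict[str, float]]


def optimal_chain(
    graph: Graph,
    source: str,
    target: str,
    length: int,
    strength: Callable[[Sequence[float]], float] = prod,  # assumed stand-in for equation (2)
) -> Optional[Tuple[List[str], float]]:
    """Return the path with exactly `length` edges from source to target
    whose strength is maximal, or None if no such path exists."""
    best_path: Optional[List[str]] = None
    best_score = float("-inf")

    def extend(path: List[str], weights: List[float]) -> None:
        nonlocal best_path, best_score
        node = path[-1]
        if len(weights) == length:
            # Only chains that end at the target are candidates.
            if node == target:
                score = strength(weights)
                if score > best_score:
                    best_path, best_score = list(path), score
            return
        for nxt, weight in graph.get(node, {}).items():
            if nxt in path:  # assumption: a chain does not revisit a concept
                continue
            path.append(nxt)
            weights.append(weight)
            extend(path, weights)
            path.pop()
            weights.pop()

    extend([source], [])
    if best_path is None:
        return None
    return best_path, best_score


if __name__ == "__main__":
    # Toy graph with made-up weights, for illustration only.
    toy = {"a": {"b": 0.9, "c": 0.5}, "b": {"d": 0.4}, "c": {"d": 0.9}, "d": {}}
    print(optimal_chain(toy, "a", "d", 2))
```

Exhaustive enumeration is exponential in the path length, so this is only suitable for small graphs; passing a different `strength` callable (for example `min`) changes how a chain is scored without touching the search.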