2005
DOI: 10.1016/j.infsof.2004.08.003

An information extraction approach to reorganizing and summarizing specifications

Abstract: Materials and Process Specifications are complex semi-structured documents containing numeric data, text, and images. This article describes a coarse-grain extraction technique to automatically reorganize and summarize spec content. Specifically, a strategy for semantic-markup, to capture content within a semantic ontology, relevant to semi-automatic extraction, has been developed and experimented with. The working prototypes were built in the context of Cohesia's existing software infrastructure, and use tech…

Cited by 10 publications (8 citation statements)
References 26 publications
“…So, the representation language should have the provision to more or less preserve the grid layout of a table to promote readability and enable changes to the original table to be easily incorporated in text, while describing the interpretation of each row/column in a way that is flexible and applicable to all rows/columns for further machine manipulation. We have looked into two different avenues, each with its own pros and cons (Thirunarayan, 2005). In Water (Plusch, 2003), annotation definition can encapsulate interpretation and be treated as a method, while the annotated data can be viewed as a method call.…”
Section: Representation Of Tabular Data For Semi-automatic Translat…
mentioning
confidence: 99%
“…Water, an XML-inspired programming language, provides a rich substrate for formalizing and querying heterogeneous documents (Thirunarayan, 2005). The annotated data can be interpreted as a method call, and the XML element as a method, as illustrated below in the context of the example in … We will now attempt to annotate a document containing the table text, to capture its semantics via suitably chosen XML tags and XSLT stylesheets that manipulate the table according to its semantics.…”
mentioning
confidence: 99%
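
The idea quoted above, that an XML element can act as a method and the annotated data as a call to it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not Water syntax and not the cited authors' implementation, and the `hardness` tag, its attributes, the sample values, and the `METHODS` table are all hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated table text: one element per table row, so the
# grid order of the original table is preserved in the markup.
ANNOTATED = """
<table>
  <hardness material="Alloy A" low="30" high="35"/>
  <hardness material="Alloy B" low="40" high="45"/>
</table>
"""


def hardness(material: str, low: str, high: str) -> dict:
    """Interpretation shared by every row that carries the 'hardness' tag."""
    return {"material": material, "hardness_range": (float(low), float(high))}


# Maps a tag name to the function that interprets it (the "method").
METHODS = {"hardness": hardness}


def interpret(xml_text: str) -> list:
    """Treat each child element as a call: the tag selects the method,
    the attributes supply the arguments."""
    root = ET.fromstring(xml_text)
    return [METHODS[row.tag](**row.attrib) for row in root]


records = interpret(ANNOTATED)
```

Because the interpretation lives in one function per tag, a change to a row of the table only touches the markup, while a change to how rows are read only touches the function.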
“…The output of which is a generic document class with the ability to apply specific document classes where necessary. Related work manifests itself in the approach of Thirunarayan, Berkovich, and Sokol (2005), who have developed and applied an ontology to the structure of a document, which aids the definition of a document's layout and logical composition both for the extraction of text but also for the composition of new documents too. Somewhat differently, Wu (2009) sets out a document based approach to the management of information and knowledge using factoring and synthesising to structure and present information within a structured document format.…”
Section: Introduction
mentioning
confidence: 99%
“…For most companies and individuals, collecting product specifications for comparison and evaluation is a necessary task before purchasing something. Specifications can be viewed as semi-structured documents with standard terminologies (Thirunarayan et al, 2005), or more specifically, information technology (IT) products. Despite product specifications having a common format, a little difference still exists in various Web sites.…”
Section: Introduction
mentioning
confidence: 99%