2011 IEEE 12th International Conference on Mobile Data Management 2011
DOI: 10.1109/mdm.2011.23

Enabling Structured Queries over Unstructured Documents

Abstract: With the information explosion on the internet, finding precise answers efficiently is a common requirement of many users. Today, search engines answer keyword queries with a ranked list of documents. Users might not always be willing to read the top-ranked documents in order to satisfy their information need. It would save a lot of time and effort if the answer to a query could be provided directly, instead of a link to a document which might contain the answer. To realize this functionality, users must …

Cited by 5 publications (4 citation statements)
References: 15 publications
“….,(p ij , (l ijr ; d ijr ), v ijr )} The proposed TRSs for unstructured data are simple and straight forward. Unlike Information Extraction (IE) tool, we translate the unstructured data into the collection of triples without extracting the structure from the data (Grishman, 1997;Doan et al, 2009a;Doan et al, 2009b;Al-Mathami, 1998), because the existing IE tools have the following disadvantages (Kastrati et al, 2011): first, such approaches are costly due to a very large collection of data have high preprocessing cost, second, automatic extraction of structure is a source of uncertainty (Sarma et al, 2009), and third, they consist of out-of-dated version of extracted data already stored in somewhere. Therefore, we have adopted an approach proposed by F. Kastrati et.…”
Section: Unstructured Data Model (citation type: mentioning)
confidence: 99%
“…Therefore, an IE based approach is not suitable for data extraction. In this work, we have adopted a just-in-time query processing over a large collection of documents, which are result of a corpus selection procedure (Kastrati et al, 2011). This approach utilizes the functionality of search engine for selecting a relevant document based on the input keywords, and locating the appropriate data segments from a selected document.…”
Section: Unstructured Data Model (citation type: mentioning)
confidence: 99%
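
The two steps named in the statement above (keyword-based document selection, then locating data segments inside each selected document) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `select_documents`, `locate_segments`, and `answer` are hypothetical, a simple keyword-count ranking stands in for a real search engine, and sentences stand in for data segments. It is not the method of Kastrati et al. (2011) or of the citing paper.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_documents(corpus, keywords, top_k=2):
    """Stand-in for a search engine (assumption): rank documents by the
    number of keyword occurrences and return the ids of the top_k matches."""
    terms = set(tokenize(" ".join(keywords)))
    scored = []
    for doc_id, text in corpus.items():
        score = sum(1 for tok in tokenize(text) if tok in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def locate_segments(text, keywords):
    """Treat each sentence as a data segment (assumption) and keep the
    sentences that mention at least one keyword."""
    terms = set(tokenize(" ".join(keywords)))
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if terms & set(tokenize(s))]


def answer(corpus, keywords, top_k=2):
    """Just-in-time processing: nothing is extracted ahead of time; the
    segments are located only in documents selected for this query."""
    return {
        doc_id: locate_segments(corpus[doc_id], keywords)
        for doc_id in select_documents(corpus, keywords, top_k)
    }


if __name__ == "__main__":
    corpus = {
        "a": "Paris is the capital of France. It has many museums.",
        "b": "Berlin is the capital of Germany.",
        "c": "Bananas are yellow.",
    }
    print(answer(corpus, ["capital", "France"]))
```

Because the work happens at query time, no extracted data is stored in advance, which is the contrast with IE-based preprocessing that the citing authors draw.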
“…A dataspace system copes with the problem of integrating a variety of data based on their structures and semantics such as structured, semi-structured, and unstructured data, and returns the best-effort or approximate answer to its users [16,17,30]. The existing works on query processing and query answering paid attention to return top-k answers to the users in the area of dataspace system [6,10,15,18,29,40,42,4,26,22,38,19,34,39,37,33,41]. The motivation behind this work is to focus on "how is information important?"…”
Section: Introduction (citation type: mentioning)
confidence: 99%