2010
DOI: 10.1016/j.datak.2009.10.002

Information extraction for search engines using fast heuristic techniques

Cited by 47 publications (30 citation statements)
References 29 publications
“…Therefore, according to Fig. 1(b), the set of leaf nodes of T_1 (T_2) is {#text_6, #text_25, #text_21 + similarity(#text_6, span_9) + similarity(#text_6, font_11) + similarity(a_7, span_9) + similarity(a_7, font_11) + similarity(span_9, font_11))/6 = (0 + 0 + 0 + 0 + 0 + 0)/6 = 0. Similarly, avgSim(r_2) = 0.…”
Section: Cohesion Calculation For C-record Set (mentioning)
confidence: 99%
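The arithmetic in the excerpt above (six pairwise similarity terms divided by 6) is consistent with an average over all unordered pairs of four nodes. The following is a minimal sketch of that kind of calculation, not the cited paper's method: the function names `avg_pairwise_similarity` and `tag_equality` are hypothetical, the similarity measure (1 if two nodes share a tag name, 0 otherwise) is an assumption chosen only for illustration, and the pair (#text_6, a_7) is inferred from the divisor 6 because the excerpt is truncated at that point.

```python
from itertools import combinations


def avg_pairwise_similarity(nodes, similarity):
    """Average a similarity function over all unordered pairs of nodes."""
    pairs = list(combinations(nodes, 2))
    if not pairs:
        # Fewer than two nodes: no pairs to compare.
        return 0.0
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


def tag_equality(a, b):
    """Hypothetical similarity: 1.0 if the tag names match, else 0.0.

    The excerpt does not define its similarity measure; this is a stand-in.
    """
    return 1.0 if a[0] == b[0] else 0.0


# Nodes as (tag name, index) pairs, following the labels in the excerpt.
record = [("#text", 6), ("a", 7), ("span", 9), ("font", 11)]

# Four nodes give six unordered pairs; every pair has different tag names.
print(avg_pairwise_similarity(record, tag_equality))
```

With four nodes whose tag names all differ, every one of the six terms is 0 under this stand-in measure, so the average is 0, which matches the value the excerpt reports.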
“…Although they can be handcrafted [15,24,4,42,51,50,20], the costs involved motivated many researchers to work on proposals to learn them automatically. These proposals are either supervised, i.e., they require the user to provide a number of information samples to be extracted [11,44,58,26,32,8,22,9,14,18,30,5,40,21,59], or unsupervised, i.e., they extract as much prospective information as they can and the user then gathers the relevant information from the results [62,12,16,2,28,25,60,39,46,64,67,38,59,57]. Since typical web documents are growing in complexity, a number of authors are also working on techniques whose goal is to identify the region within a web document where relevant information is most likely to be contained [37,7,61,63,27,34,65,66,52,…”
Section: Introduction (mentioning)
confidence: 99%
“…In the case of prediction, the interpretability of the knowledge extracted and used by the predictive models may be of secondary importance, in which case the models are commonly known as ''black box'' models. Rule extraction technique is one of techniques applied for knowledge extraction, and it is an important task in knowledge discovery from imperfect training dataset in uncertain environments such as medical diagnosis, mechanical faults and electric fields [1][2][3][4][5][6]. Rule extraction methods have been categorized into de-compositional, pedagogical, and eclectic techniques.…”
Section: Introduction (mentioning)
confidence: 99%