Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2000
DOI: 10.1145/347090.347164

A classifier for semi-structured documents

citations
Cited by 76 publications
(28 citation statements)
references
References 4 publications
“…This allows taking into account either the structure itself or both the structure and content of these documents. Yi & Sundaresan (2000) have also used a vector model containing terms or XML tree paths as the vector elements. Ghosh & Mitra (2008) have proposed a composite kernel for fusion of content and structure information.…”
Section: Related Work (mentioning)
confidence: 99%
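
The vector model described in this statement can be illustrated with a minimal sketch. It assumes each vector element is either a lowercased text term or a root-to-element tag path, and that element values are raw counts. The function name `path_term_features`, the feature keys, and the sample document are hypothetical and are not taken from the cited paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_term_features(xml_text):
    """Build a sparse count vector whose elements are terms or XML tree paths."""
    root = ET.fromstring(xml_text)
    features = Counter()

    def walk(node, prefix):
        # The tree path is the sequence of tags from the root to this element.
        path = f"{prefix}/{node.tag}"
        features[("path", path)] += 1
        # Only the element's leading text is tokenized; tail text is ignored
        # to keep the sketch short.
        for term in (node.text or "").lower().split():
            features[("term", term)] += 1
        for child in node:
            walk(child, path)

    walk(root, "")
    return features


# Hypothetical sample document.
doc = (
    "<resume><name>Ada Lovelace</name>"
    "<skills><skill>math</skill><skill>math engines</skill></skills></resume>"
)
vec = path_term_features(doc)
```

Keeping paths and terms in one sparse counter lets a structure-only model read just the `"path"` keys, while a structure-and-content model reads both.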
“…This is a discriminative model which directly computes the posterior probability corresponding to the document relevance for each class. [22] present an extension of the Naive Bayes model to semi-structured documents in which global word frequency estimators are essentially replaced with local estimators computed for each path element. [18] propose using Probabilistic Relational Models to classify structured documents, and more precisely Web pages.…”
Section: Previous Work (mentioning)
confidence: 99%
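
The idea of replacing global word frequency estimators with per-path local estimators can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of the cited paper: documents are assumed to be lists of `(path, term)` pairs, Laplace smoothing with parameter `alpha` is an assumption, and the class name `PathLocalNaiveBayes`, the labels, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


class PathLocalNaiveBayes:
    """Naive Bayes whose term estimators are kept per (class, path) pair."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                  # Laplace smoothing (assumption)
        self.class_docs = Counter()         # label -> number of training docs
        self.counts = defaultdict(Counter)  # (label, path) -> term counts
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            self.class_docs[label] += 1
            for path, term in doc:
                self.counts[(label, path)][term] += 1
                self.vocab.add(term)
        return self

    def log_score(self, doc, label):
        total_docs = sum(self.class_docs.values())
        score = math.log(self.class_docs[label] / total_docs)
        vocab_size = len(self.vocab)
        for path, term in doc:
            # Local estimator: counts come only from this path in this class.
            local = self.counts.get((label, path), Counter())
            numerator = local[term] + self.alpha
            denominator = sum(local.values()) + self.alpha * vocab_size
            score += math.log(numerator / denominator)
        return score

    def predict(self, doc):
        return max(self.class_docs, key=lambda label: self.log_score(doc, label))


# Hypothetical toy data: both classes contain the same words, but under
# different paths, so only the path-local estimators can separate them.
train_docs = [
    [("/resume/skills", "python"), ("/resume/hobby", "chess")],
    [("/resume/skills", "chess"), ("/resume/hobby", "python")],
]
train_labels = ["engineer", "player"]
model = PathLocalNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict([("/resume/skills", "python")])
```

In this toy data a global bag-of-words estimator would score both classes identically, since each class contains both words once; conditioning on the path is what breaks the tie.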
“…Methods based on structure and content take into account not only the structure but also the contents of the document. Commonly used classification methods include K-nearest neighbor [5], Bayesian classifier [6], SVM [7], and bottom-up classification [8]. Whichever document classification method is used, its accuracy depends on the document similarity measure, so the problem of computing the similarity of XML documents is studied.…”
Section: Introduction (mentioning)
confidence: 99%