Classifying Wikipedia entities into fine-grained classes

Tkatchenko, Maksim; Ulanov, Alexander; Simanovsky, Andrey

doi:10.1109/icdew.2011.5767662

Cited by 5 publications

(7 citation statements)

References 7 publications

“…For different languages, it is difficult to perfectly implement the previous works because language-dependent features and heuristic rules are usually adopted to achieve better classification performance, such as in Japanese [19] and Arabic [20]. In order to evaluate the resulting training set in classification, we re-implemented the state-of-the-art method presented by Tkatchenko et al [17], and their baseline method that is similar to Tardif's classifier [18]. Their baseline method used the text of the first paragraph as a basic feature space, and a range of additional ones, namely, Title, Infobox, Sidebar, and Taxobox tokens, stemmed and tokenized categories, and template names.…”

Section: Results (mentioning)

confidence: 99%

“…Saleh et al [16] extracted features from abstracts, infobox, category, and persondata structure, and improved the recall of different NE types, by using beta-gamma threshold adjustment. Tkatchenko et al [17] adopted similar features to Tardif et al [18]. They added a 'List of' feature to the bag-of-words (BOW) representation, and added a boolean feature, which is the result of a binary rule, to increase separability between the articles of NEs and non-entities.…”

Section: Related Work (mentioning)

confidence: 99%

“…These hierarchical layouts, especially list items, are usually used to extract hyponymy-relation candidates and to then realize large-scale hyponymy-relation acquisition [22,23]. Tkatchenko et al [17] added the 'List of' feature, which is constructed by tokenizing and stemming the titles of all 'List of …' articles containing the current article, to the BOW representation. In this paper, the co-occurrence articles are utilized to extract representative words and increase their weights because only a small number of articles tagged with co-occurrence relation can be found in Chinese Wikipedia.…”

Section: B Structured Feature (mentioning)

confidence: 99%
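
The 'List of' feature described in the statement above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `listof=` prefix that keeps these tokens apart from ordinary bag-of-words tokens, the decision to drop the leading "List of" words, and the crude suffix-stripping stemmer (a stand-in for a real stemmer such as Porter's) are all assumptions made for the example.

```python
import re
from collections import Counter


def crude_stem(token):
    """Strip one common English suffix; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def list_of_features(article, list_pages):
    """Count stemmed title tokens of every 'List of ...' page containing `article`.

    `list_pages` maps a list-page title to the set of article titles it links to
    (an assumed, simplified representation of Wikipedia list membership).
    """
    counts = Counter()
    prefix = "List of "
    for list_title, members in list_pages.items():
        if not list_title.startswith(prefix) or article not in members:
            continue
        # Assumption: the constant "List of" prefix carries no signal, so skip it.
        remainder = list_title[len(prefix):].lower()
        for token in re.findall(r"[a-z0-9]+", remainder):
            counts["listof=" + crude_stem(token)] += 1
    return counts


# Toy data, invented for illustration only.
list_pages = {
    "List of rivers of France": {"Seine", "Loire"},
    "List of longest rivers": {"Seine", "Nile"},
    "List of cities in France": {"Paris"},
}

# Bag of words from the article text, extended with the 'List of' tokens.
bow = Counter(re.findall(r"[a-z0-9]+", "the seine is a river in northern france"))
seine_features = list_of_features("Seine", list_pages)
bow.update(seine_features)
```

Prefixing the tokens keeps "river" seen in a list title distinct from "river" seen in the article text, so a classifier can weight the two sources separately; merging them into one token would be an equally plausible reading of the description.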

Classifying Articles in Chinese Wikipedia with Fine-Grained Named Entity Types

Zhou, Li, Tang. 2014. Journal of Computing Science and Engineering.

Named entity classification of Wikipedia articles is a fundamental research area that can be used to automatically build large-scale corpora of named entity recognition or to support other entity processing, such as entity linking, as auxiliary tasks. This paper describes a method of classifying named entities in Chinese Wikipedia with fine-grained types. We considered multi-faceted information in Chinese Wikipedia to construct four feature sets, designed different feature selection methods for each feature, and fused different features with a vector space using different strategies. Experimental results show that the explored feature sets and their combination can effectively improve the performance of named entity classification.

“…Bollegala [38] also studies the identification of named entities in microtexts, the short texts published on social networks such as Twitter and Facebook. Other studies, such as that of Tkatchenko [39], propose a semi-supervised approach to constructing training sets for named entity classification. Its development used a named entity taxonomy called BBN [40], a threshold of at least 40 Wikipedia articles, and a subset of the 400 most frequent lowercase words from the Reuters corpus.…”

Section: Named Entities (NE) (unclassified)

“…In 2007, Hirano [51] proposed adding a supervised learning mechanism to the process, which improves precision by 4.4%. In 2011, Tkatchenko [39] proposed using an SVM-based semi-supervised classifier to establish relations between named entities within Wikipedia. The work reports precision levels close to 1 (100%) when the classifier is applied to 18 classes, which is a notable result.…”

Section: Relations Between Named Entities (unclassified)

Identificación de relaciones entre los nodos de una red social

Barón, Salinas. 2013. Ing.

This article reviews the representation and classification of membership relations between the nodes of a social network. To that end, it addresses aspects of Natural Language Processing, Text Mining, Information Retrieval, and Named Entities. Each of these is described, and notable academic works developed on the topic are referenced and discussed.

Learning multilingual named entity recognition from Wikipedia

Nothman, Ringland, Radford, et al. 2013. Artificial Intelligence.
