2014
DOI: 10.15514/ispras-2014-26(1)-18

Texterra: A Framework for Text Analysis

Abstract: This paper describes the Texterra project, which produced an infrastructure for text analysis. Texterra provides a scalable solution for fast processing of text documents, based on knowledge extracted from Web resources and text documents. The paper covers the project's implementation details, use cases, and the results of experimental evaluation of the developed tools. Keywords: text analysis, natural language processing, Wikipedia, ко…

Cited by 6 publications (1 citation statement)
References 9 publications

“…For example, TerMine 2 is based on CValue/NC-Value methods (and academic usage only); FlexiTerm contains C-Value and "a simple term variant normalisation method" [41]; TOPIA 3 lists only one method without algorithm description and it is not updated since 2009; TermRider 4 utilizes TF-IDF only; TermSuite [11] ranks candidates by Weirdness method, but focuses on recognizing term variants based on syntactic and morphological patterns. Some tools are limited by searching for mentions of (named) entities (for example, OpenCalais 5 ) or named entities and Wikipedia concepts (Texterra [42]). Another tool 6 supports only supervised recognition of 1-word and 2-words terms.…”
Section: Atr Software Tools
Citation type: mentioning
confidence: 99%