2016
DOI: 10.18287/2412-6179-2016-40-4-572-582

Extraction the knowledge and relevant linguistic means with efficiency estimation for formation of subject-oriented text sets

Abstract: The paper addresses two interrelated problems: extracting knowledge units from a set (corpus) of topical texts, and selecting texts for the corpus by analyzing their relevance to an initial phrase. These problems are relevant to building systems for processing, analyzing, evaluating, and understanding information. The ultimate practical goal is to find the most rational way of conveying meaning by means of a given natural language, so that knowledge fragments can subsequently be recorded in a thesaurus and ontology of the subject domain. …

Cited by 16 publications
(10 citation statements)
References 0 publications
“…A model of interaction of subjects (subjects) in a single information space, which can be implemented using multi-agent software, can be proposed. The ideas of indirect and conditional project management, creating a soft influence on highly motivated autonomous actors, are successfully implemented in online communities and social networks [10][11][12][13][14].…”
Section: Technology Review (mentioning)
confidence: 99%
“…• Twitter is not a specialized network, which means it reflects the public opinion of a wider range of users [9]. The data collection from the Twitter social network can be carried out using the software products Apache Ambari and Flume, this method is described in more detail in [10]. However, it is often more convenient to develop a dedicated software product using standard libraries (twitter4j, tweepy, etc.)…”
Section: Social Network Data Collection (mentioning)
confidence: 99%
“…Weight was assigned to each word in the word index, thus each group took the form of a vector of attributes (words) with own weights. In this paper, it was decided to use the word frequency count as the weight [10,11]. Such an approach for calculating the weights of words in word indexes using traditional methods and technologies requires huge computational resources and takes a long time when the volume and the number of analyzed word indexes increases, so it was decided to use BigData technology and computational clusters for this purpose [12].…”
Section: Determination Of the Proximity Of Groups Using Bigdata Techn (mentioning)
confidence: 99%
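
The weighting scheme described in the citation statement above (each word in a word index weighted by its frequency count) might be sketched as follows. This is only a minimal illustration, not the citing authors' implementation: the function name `word_weights`, the regex tokenizer, and the sample text are all assumptions introduced here, and the sketch runs on a single machine rather than on the computational cluster the statement mentions.

```python
import re
from collections import Counter


def word_weights(text: str) -> dict[str, int]:
    """Return a mapping word -> frequency count, used here as the word's weight.

    Hypothetical helper: tokenization by lowercasing and matching word
    characters is an assumption, not something the source specifies.
    """
    words = re.findall(r"\w+", text.lower())
    return dict(Counter(words))


# Made-up sample text standing in for one group's word index.
group_vector = word_weights("big data cluster data")
print(group_vector)
```

Each group then takes the form of a vector of (word, weight) pairs, which is the representation the statement describes before the groups are compared.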