This paper discusses linguistic and psychological aspects of automatic modus-dictum analysis of texts published in social networks and other electronic media. In this connection, it revisits theoretical questions concerning the linguistic nature of modus, the means of expressing “ego-meanings” in speech, the distinction between modus proper (autoreferential signs) and modal-evaluative predicates in dictum position, the implicit ways of communicating modus information, and the resources for reading this information on the basis of discursive speech practices (conventional meanings). The applied goal of the paper is to provide humanities-based (psychological and linguistic) support for the development of machine “mining” programs, i.e., programs that automatically monitor network content and identify texts with a certain subjective modality. To this end, we describe, in particular, the lexical and grammatical features of texts that can be significant for determining the psychological state of an individual or a professional group and for identifying certain public opinions. Conceptually, this research is connected with the idea of a speech system, which is manifested both at the level of styles and genres and within independent communicative units, as well as with one of the most important trends in the field of artificial intelligence: the method of relational-situational analysis of natural-language texts. Thematic groups of words (TGW) were compiled, including “evaluation collocations” typical of those texts. The templates created on the basis of the psychological and linguistic description model suggested in this paper can later be used to develop algorithms for automatic monitoring and evaluation of network texts on a given theme (professional stability or mobility, professional crisis, etc.).
Speech Genres as an Object of Computer Analysis (Based on Academic Texts)
Keywords: speech genre, discourse, functional stylistics, computer text analysis, academic text, speech systematicity, relational-situational method, mental-speech operation.
The article examines the thematic organization of instructional texts from the standpoint of problems relevant to building a cognitive assistant. The purpose of the assistant is to provide a user with the information needed to follow the rules of a particular scenario and successfully achieve the goal stated in the search query. The query, which contains certain keywords that are further specified as the task is solved, targets a detailed set of topics that mark the subject areas reflected in the scenario. The authors review linguistic works devoted to the theme-rhematic structuring of a produced text and to its compression into a set of keywords. They emphasize the importance of describing a text's thematic chains in order to obtain detailed, objective information on its thematic structure. The authors compare the list of keywords identified by the automatic system TextAppliance in a collection of instructional texts retrieved from the Internet with the results of manual analysis of the same texts, in order to determine the place of various nominative units in the thematic organization of a text. On this basis they consider the most significant characteristics of a keyword, which different nominative units display to varying degrees. These characteristics are a high value as a text identifier, content capacity, and the communicative significance of a word or noun phrase as a marker of information important to the recipient. Defining keywords both in whole instructional texts and in relatively independent text fragments (subtexts) that describe individual stages of achieving the user's goal (for example, selecting a car, inspecting it, making the transaction, and registering the car) makes it possible to improve the quality of scenario identification on the Web. Extracting keywords together with their context makes it possible to build a database of recommendations for users automatically.
The article also shows the significance of analyzing the theme-rhematic structure of a text, treated as a sign, for modeling the text within the sign-based picture of the world.
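The idea of extracting keywords per subtext together with their context can be illustrated with a minimal sketch. It is not the TextAppliance method, which the abstract does not describe: raw word frequency stands in for keyword scoring, and the stop-word list, function names, and sample subtext are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "it", "with", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(subtext, n=3):
    """Rank the words of one subtext by raw frequency, ignoring stop words.

    Assumption: frequency is only a stand-in for a real keyword score.
    """
    counts = Counter(t for t in tokenize(subtext) if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


def keyword_contexts(subtext, keyword, window=2):
    """Return each occurrence of `keyword` with `window` tokens on either side."""
    tokens = tokenize(subtext)
    contexts = []
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            contexts.append(" ".join(tokens[start:i + window + 1]))
    return contexts


# Hypothetical subtext describing one stage of a car-buying scenario.
inspection = "Inspect the car body. Check the car engine and the car brakes."
print(top_keywords(inspection, n=1))
print(keyword_contexts(inspection, "engine", window=1))
```

Storing the context strings alongside each keyword, keyed by scenario stage, is one simple way such output could feed a recommendation database.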
The paper demonstrates that speech genres, as forms of text production and interpretation, have a claim to being one of the main objects of formal linguistic analysis in comprehensive studies on cognitive modeling, an intensively developing area of artificial intelligence. Understanding a speech genre as a form of spiritual socio-cultural activity (artistic, scientific, political, ideological, etc.), objectified through a system of speech actions in the text as a unit of communication, makes it possible to describe the systems of speech genres in various spheres of communication. The authors analyze the speech genres that objectify the main stages of academic theoretical research. Using artificial intelligence methods, the research addresses the problem of recognizing the speaker's intentions in the cognitive-speech actions that shape the genre form of a text. It thereby contributes to the fundamental problem of machine “understanding” of the meaning of an utterance. The research is based on an interdisciplinary, complex method of text analysis. In terms of software implementation, the proposed approach uses templates to extract a small set of high-level linguistic features from clauses and then trains classifiers on these features. To create the templates, the authors carry out a linguistic and psychological analysis aimed at identifying markers of cognitive and speech actions as accurately as possible, in accordance with the standards of perception. In the course of the study, the authors obtained high indices of cognitive and speech action identification, ranging from 0.78 to 0.99.
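The two-step scheme described above (templates produce clause features, then a classifier is trained on them) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the regex templates, the training clauses, and the choice of a perceptron as the classifier are all hypothetical.

```python
import re

# Hypothetical templates: each regex marks one cue of a cognitive-speech action.
TEMPLATES = {
    "definition_cue": re.compile(r"\b(is defined as|we define|refers to)\b"),
    "hypothesis_cue": re.compile(r"\b(we assume|we hypothesize|suppose that)\b"),
    "conclusion_cue": re.compile(r"\b(therefore|thus|we conclude)\b"),
}


def clause_features(clause):
    """Map a clause to a binary vector: 1 where a template matches, else 0."""
    text = clause.lower()
    return [1 if pattern.search(text) else 0 for pattern in TEMPLATES.values()]


def train_perceptron(samples, labels, epochs=10):
    """Train a binary perceptron (assumed classifier) on feature vectors."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            # Update only when the prediction is wrong (error is -1 or 1).
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias


def predict(clause, weights, bias):
    """Return 1 if the clause is classified as a 'conclusion' action, else 0."""
    x = clause_features(clause)
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical training clauses; label 1 marks the 'conclusion' action.
train_clauses = [
    "Therefore the model converges.",
    "We define a genre as a form of activity.",
    "Thus the hypothesis holds.",
    "Suppose that the text is coherent.",
]
train_labels = [1, 0, 1, 0]

weights, bias = train_perceptron(
    [clause_features(c) for c in train_clauses], train_labels
)
print(predict("We conclude that genres differ.", weights, bias))
print(predict("This term refers to a speech act.", weights, bias))
```

Keeping the feature set small and template-driven, as the abstract describes, means the classifier itself can stay simple; any standard classifier could replace the perceptron used here.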
This paper discusses improvements to the structure of a predicate word dictionary used for knowledge acquisition and text analysis. It presents the principle of an open dictionary architecture, which takes into account the stylistic differentiation of speech and involves describing the subsystems of predicate words that function in particular speech varieties.