2003
DOI: 10.1142/s0129065703001352

Applying the SOM Model to Text Classification According to Register and Stylistic Content

Abstract: We report on the application of the Self-Organizing Map (SOM) classification method to the task of categorizing texts according to their register and the style of their author. The SOM has been selected as its performance in various data-mining applications has been found to be highly successful. Here, the method is evaluated against the task of clustering textual data which are corpora of texts written in the Greek language; the parameters used depict linguistically important structural properties of the text…

Citations: cited by 15 publications (9 citation statements)
References: 9 publications
“…Moreover, LSI consumes ample time in calculating similarities of new queries against all documents, but a SOM only needs to calculate similarities versus some representative subset of old input data and can then map new input straight onto the most similar models without having to recompute the whole mapping. Tambouratzis et al (2003) use SOMs for categorizing texts according to register and author style, and show that the results are compatible to those generated by statistical methods.…”
Section: Self-organizing Maps (supporting)
confidence: 61%
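The statement above notes that a trained SOM can place a new input on the map by comparing it only with the stored model vectors. A minimal sketch of that lookup follows, assuming Euclidean distance and a flat list of model vectors; the function name `best_matching_unit`, the codebook values, and the feature vector are hypothetical and are not taken from the cited papers.

```python
import math

def best_matching_unit(codebook, x):
    """Return (index, distance) of the model vector closest to x.

    codebook: list of model vectors (one per map node), all of equal length.
    x: feature vector of a new document, same length as each model vector.
    """
    best_index, best_distance = None, math.inf
    for index, model in enumerate(codebook):
        # Euclidean distance is an assumption; other metrics could be used.
        distance = math.dist(model, x)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance

# Hypothetical codebook of four map nodes with 3-dimensional model vectors.
codebook = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
]

# Hypothetical feature vector for a new document.
new_document = [0.9, 0.8, 1.1]
node, distance = best_matching_unit(codebook, new_document)
```

The lookup costs one distance computation per map node, so its cost depends on the map size rather than on the number of documents used to train the map.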
“…It is claimed that if sufficient differences exist between the manner in which different authors express themselves, the SOM should retrieve such differences even though it is not provided with information about the actual authorship of the documents. Indeed, preliminary results reviewed in the next two sections have indicated that author characteristics can be detected by the SOM in data-mining experiments [11]. Here, the emphasis is placed on confirming these results using more extensive datasets and determining the extent to which each of the feature categories contributes to the discrimination between the authors via the SOM.…”
Section: SOM Neural Network (mentioning)
confidence: 89%
“…In these studies, originally statistical methods were used [9], [10], since they possessed solid theoretical foundations for the classification tasks at hand and, being supervised, were expected to provide the highest possible classification accuracy. Following that, SOM-based techniques were evaluated in the same tasks, in order to indicate the effectiveness of biologically inspired unsupervised methods in this task [11]. It has been found that using both the SOM and classical statistical techniques, the separation of registers is easier than the separation of documents belonging to the same register according to their author's style.…”
Section: Introduction (mentioning)
confidence: 99%
“…Details on the semi-automatic feature-selection process are provided in [1], [2]. The features were counted in a completely automated manner and can be distinguished into the following main groups:…”
Section: B Features Counted (mentioning)
confidence: 99%