2011 14th International Conference on Network-Based Information Systems 2011
DOI: 10.1109/nbis.2011.71

Author Identification in Albanian Language

Cited by 8 publications (6 citation statements)
References 6 publications
“…While many studies rely on established benchmark datasets like Enron [20], C50 [7], PAN [22], IMDb62 [6,21] and others [9], the scarcity of standard datasets, particularly for low-resource languages, presents a unique challenge. Creating specialized corpora has paved the way for promising advancements in the field, demonstrated by projects like UNAAC [5], BAAD [2], UrduCorpus [5], A3C Corpus [8,25], and more [4]. These corpora tailored for AA contribute significantly to the field, expanding its resources.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In today's digital age [1], the Internet has expanded anonymous content, making AA an increasing concern. This issue carries substantial implications across various domains, including literature [2][3][4], journalism [5][6][7][8], and forensics [9].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…An important part of the literature consists of studies on English language [4,5,6,7,8]. There are also many studies done in many different languages including Japanese [9], Mongolian [10], Persian [11], Albanian [12], Indian [13,14], Brazilian [15], Russian [16,17], German [18], and Arabic [19]. When the existing studies were examined, it was seen that different types of data sets were used for author identification tasks.…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…When the existing studies were examined, it was seen that different types of data sets were used for author identification tasks. Some studies have been carried out on newspaper articles [4,15,18,19], while others were carried out on poems [13], novels [11,12,16], email content [20], song lyrics [21], source codes [22], or tweets, blog posts, and forums [8,9,23]. In some cases, different types of data sources were combined or compared [17,25]. Early studies in author identification focused on different stylometric techniques.…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…The initial study aimed to identify authors of literary texts using stylometric techniques (Varela et al, 2016). Author attribution isn't just a literary problem (Phani et al, 2017), (Zhou et al, 2022), (Paci et al, 2011). The remaining sections of the paper are structured as follows. In Section 2, we take a look at the author-related tasks.…”
Section: Introduction
Citation type: mentioning
confidence: 99%