2020
DOI: 10.37624/ijert/13.9.2020.2380-2384

Unsupervised Automatic Text Summarization of Konkani Texts using K-means with Elbow Method

Abstract: Text Summarization is an emerging field of research in Natural Language Processing (NLP). The bulk of the work is related to texts in English and other popular languages. This paper presents some of the early work on single-document extractive Automatic Text Summarization (ATS) of Konkani language documents; Konkani is an under-researched language in the ATS domain. The input documents need to be cleaned of punctuation and then sentence scores are calculated for each…

Cited by 18 publications (10 citation statements)
References 14 publications

“…Tables 1, 2 and 3, outline the metrics of the assessment of the corresponding human summaries with the system-generated summaries using deep learning. These tables also provide the scores of automatic text summarization system built with k-means clustering with 3 clusters with the output summaries generated with the same Konkani folk tales dataset provided as input to the system [30]. The performance of ATS systems can be compared against baseline systems, like using leading sentences from the input text document [2].…”
Section: Results (mentioning)
confidence: 99%
“…In our experiment, we compute sentence embeddings, using fastText word embeddings, which act as feature vectors ideal to be used with MLPs [29]. Previous work using machine learning for text summarization in Konkani used k-means clustering on the same Konkani dataset [30]. We compare this system with the system presented in this paper.…”
Section: Related Work (mentioning)
confidence: 99%
“…In [12], we discussed the content recommendation system approaches based on grouping for similar articles that used TF-IDF to perform vector transformation of the document contents and, through cosine similarity, applied k-means [13] for clustering them. In [14], the authors automatically summarized texts using TF-IDF and k-means to determine the document's textual groups used to create the abstract. Then, TF-IDF is considered the primary technique for vectorizing textual content and k-means the most used algorithm for unsupervised machine learning.…”
Section: State-of-the-art Review (mentioning)
confidence: 99%
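
The TF-IDF vectorization and cosine similarity mentioned in this statement can be illustrated with a minimal sketch. It is not taken from any of the cited papers: the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, the smoothed IDF formula, and the three English toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumption of this sketch: weight = relative term frequency
    times a smoothed IDF, log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy sentences, not drawn from the Konkani dataset.
docs = [
    "the fox ran into the forest",
    "the fox hid in the forest",
    "the king counted his gold coins",
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # two sentences about the fox
sim_02 = cosine(vecs[0], vecs[2])  # fox sentence vs. king sentence
```

In this toy input the two fox sentences share more weighted terms than the fox and king sentences, so `sim_01` comes out larger than `sim_02`. Vectors of this kind are what a clustering step such as k-means would then group.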
“…This method can be illustrated through a line plot between SSE (Sum of Squared error) compared to the total cluster and finding a point that represents "an elbow point" (the point after SSE or inertia starts decreasing in a linear fashion). Elbow method is often used in previous studies for determining the optimal number of clusters [14], [15], in addition to the silhouette coefficient method [16].…”
Section: ) (mentioning)
confidence: 99%
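
The elbow method described in this statement can be sketched in a few lines. This is a hedged illustration, not the procedure of the cited paper: the pure-Python k-means, the deterministic farthest-first initialisation, the "largest second difference of SSE" rule for locating the elbow, and the twelve toy 2-D points are all assumptions introduced here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_sse(points, k, iters=50):
    """Run a basic k-means and return the final SSE (inertia)."""
    # Farthest-first initialisation: deterministic, an assumption of this sketch.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def elbow_k(sse_by_k):
    """Pick the k where the SSE curve bends most (largest second difference).

    This is one common heuristic for reading the elbow off the plot,
    assumed here for illustration.
    """
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = sse_by_k[prev_k] - 2 * sse_by_k[k] + sse_by_k[next_k]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Hypothetical toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]
sse_by_k = {k: kmeans_sse(points, k) for k in range(1, 7)}
best_k = elbow_k(sse_by_k)
```

On this toy data the SSE drops sharply up to three clusters and only marginally afterwards, so the heuristic selects `best_k == 3`. On real sentence vectors the curve is usually smoother, which is why the statement above also mentions the silhouette coefficient as an alternative.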