2023
DOI: 10.3390/app13127228

Unlocking the Potential of Keyword Extraction: The Need for Access to High-Quality Datasets

Zaira Hassan Amur,
Yew Kwang Hooi,
Gul Muhammad Soomro
et al.

Abstract: Keyword extraction is a critical task that enables various applications, including text classification, sentiment analysis, and information retrieval. However, the lack of a suitable dataset for semantic analysis of keyword extraction remains a serious problem that hinders progress in this field. Although some datasets exist for this task, they may not be representative, diverse, or of high quality, leading to suboptimal performance, inaccurate results, and reduced efficiency. To address this issue, we conduct…

Cited by 7 publications (4 citation statements)
References 38 publications
“…After all data points are assigned to clusters, the centroids are updated by computing the mean of all data points assigned to each cluster. The assignment and update steps are repeated iteratively until convergence, where the centroids no longer change significantly or a maximum number of iterations is reached [28], [29], [30], [31], [32].…”
Section: K-means Clustering
Citation type: mentioning
confidence: 99%
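The assignment and update steps described in the statement above can be sketched in a few lines of Python. This is a minimal illustration and not the citing authors' implementation: the function name `kmeans`, its parameters (`init`, `max_iters`, `tol`, `seed`), the random choice of starting centroids, and the handling of empty clusters are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, init=None, max_iters=100, tol=1e-6, seed=0):
    """Illustrative k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: start from k sampled points unless the caller supplies centroids.
    start = init if init is not None else rng.sample(points, k)
    centroids = [tuple(c) for c in start]
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Convergence check: stop once no centroid moved more than tol.
        shift = max(
            math.dist(old, new) for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels
```

Calling `kmeans(points, 2)` on a list of 2-D tuples returns the final centroids and one cluster label per point. The loop ends either when the largest centroid movement falls below `tol` or after `max_iters` passes, which mirrors the two stopping conditions named in the quoted passage.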
“…K-means clustering is a versatile and widely used algorithm for partitioning data into clusters, with applications across different domains and industries. It provides an efficient and interpretable way to explore data structure and discover underlying patterns, such as customer segmentation, image compression, document clustering, and anomaly detection [28], [29], [30], [31], [32].…”
Section: K-means Clustering
Citation type: mentioning
confidence: 99%
“…Furthermore, there is a need for developing robust methods to handle outliers, exceptions, and edge cases that are often encountered in short answer assessment. This requires careful consideration of the characteristics of short answers and the design of models and algorithms that can handle such situations effectively [36][37][38][39][40]. To overcome these challenges, potential solutions include utilizing data augmentation techniques to generate more diverse training data, developing novel algorithms and models specifically tailored for short answer assessment, and leveraging domain-specific knowledge and expertise to enhance the performance of machine learning models.…”
Section: Model Implications
Citation type: mentioning
confidence: 99%
“…These industry researchers have been found to trust a variety of ML techniques, including the most widely used recurrent neural networks (RNNs), artificial neural networks (ANNs), and support vector machine (SVM), with the infrequent AI method involving fuzzy inference systems (FISs). Many researchers have shown success using ML, AI, and smart technology in water-based applications to optimize modeling techniques [24]. While these achievements have been acknowledged, ML and AI applications do not come without constraints that need to be addressed before general execution.…”
Citation type: mentioning
confidence: 99%