Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm

Chakraborty, Saptarshi; Paul, Debolina; Das, Swagatam

doi:10.1609/aaai.v35i8.16854

Cited by 12 publications

(6 citation statements)

References 36 publications


“…Based on the K-Means machine learning algorithm to complete the modeling process of the fractal model, the clustering idea is used to mine the potential correlation of the genotype data, and the data are grouped categorized, and visualized. The K-Means algorithm was proposed by the Lloyd scholars in 1982, and the algorithm is one of the most classical and commonly used unsupervised learning algorithms to solve the clustering problem ( Chakraborty et al., 2020 ; Sinaga and Yang, 2020 ). It divides the set of samples into K-class clusters and uses Euclidean distance to measure the similarity between the samples, which results in high similarity within clusters of the same class and low similarity between clusters of different classes ( Mirzal, 2020 ).…”

Section: Methods (mentioning)

confidence: 99%

KASP-IEva: an intelligent typing evaluation model for KASP primers

Chen, Huang, Fan et al. 2024. Front. Plant Sci.


KASP marker technology is used in molecular marker-assisted breeding because of its high efficiency and flexibility, and an intelligent model for evaluating KASP primer typing results is essential for efficient large-scale marker development. To this end, this paper proposes a gene population delineation method based on an NTC identification module and a data distribution judgment module to improve the accuracy of K-Means clustering, and introduces a decision tree to construct the KASP-IEva primer typing evaluation model. First, the model uses the NTC identification module and the data distribution judgment module to extract, group, and categorize four types of data, which makes the amplification product signals easier to distinguish. Second, the K-Means algorithm clusters the data, the five resulting clusters are visualized, and the morphology location eigenvalues are obtained. Finally, evaluation criteria for the typing effect level are constructed, and a logical decision tree applies conditional discrimination to the eigenvalues to predict a score. The model was tested on KASP marker typing results for 2519 groups of cotton varieties, with the following conclusions: the model can visualize how the amplification products of NTC, pure genotypes, heterozygous genotypes, and untyped genotypes aggregate and classify, enabling rapid and accurate KASP marker typing evaluation. Compared with the expert evaluation results, the average accuracy of the four grades evaluated by the model was 87%, and the overall evaluation results showed an uneven distribution of grades with significant differential characteristics.
Evaluating the 2519 KASP fractal maps took the experts 15 hours and the model only 8 min 27.45 s, so the model is significantly faster than expert evaluation. The model will further enhance the application of KASP markers in molecular marker-assisted breeding and provide technical support for the large-scale screening and identification of excellent genotypes.
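The K-Means step mentioned in the citation statement and abstract above (assign each sample to the nearest centroid by Euclidean distance, then move each centroid to the mean of its members) can be sketched as follows. This is a minimal illustration of plain Lloyd-style K-Means, not the KASP-IEva implementation; the function name `kmeans`, the toy `points`, and the choice of initial centroids are assumptions made for the example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain K-Means: alternate an assignment step and a centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

The initial centroids are passed in explicitly so the run is deterministic; practical implementations usually pick them randomly or with a seeding scheme, and the result can depend on that choice.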


“…Then, this detected speech is sent for Speaker segmentation, where the speech is segmented based on Speaker Change Detection [24] and the constant thresholds are estimated using Proposed FEOSA. Next to speaker segmentation, the clustering or Speaker diarization process is conducted using entropy weighting power k means algorithm [25], where the weight update is accomplished through same proposed FEOSA. Figure 1 portrays the schematic illustration of proposed FEOSA.…”

Section: Methods (mentioning)

confidence: 99%

A Fractional Ebola Optimization Search Algorithm Approach for Enhanced Speaker Diarization

Kangala, Ramisetty 2023. ISI


Speaker diarization, the task of ascertaining speaker homogeneity within a collection of audio recordings featuring multiple speakers, is crucial for answering queries such as "who spoke when". Diverse speaker recordings, encompassing meetings, reality shows, and news broadcasts, typically populate the speaker diarization database. Traditional methods primarily rely on clustering speaker embeddings, yet these approaches often fail to minimize diarization errors effectively and struggle to accurately account for speaker overlaps. Addressing these limitations, we propose a robust model leveraging the Fractional Ebola Optimization Search Algorithm (FEOSA) for speaker segmentation and diarization. This model represents an amalgamation of the Fractional Calculus (FC) concept and the Ebola Optimization Search Algorithm (EOSA), thereby enhancing the efficacy of the diarization process. The diarization task is executed employing an entropy weighted power k-means algorithm, with weights updated via the proposed FEOSA. The proposed FEOSA demonstrated superior testing accuracy, reaching a maximum of 0.913, and significantly reduced diarization errors to a minimum of 0.566. Further, False Discovery Rate (FDR), False Negative Rate (FNR) and False Positive Rate (FPR) were recorded at 0.257, 0.128, and 0.104 respectively, underscoring the effectiveness of the proposed model in enhancing speaker diarization.


“…Where M_c represents the largest number of clusters. As one of the widely used clustering algorithms, K-means [13] algorithm can group data vectors into several clusters. When K-means algorithm is initialized, it is necessary to determine the number of clusters, and this parameter has a great impact on the performance of the algorithm.…”

Section: Identify Electricity Theft In Output Layer (mentioning)

confidence: 99%

Electricity theft detection algorithm based on contrast learning and cluster combination discrimination

Wang 2023. Third International Conference on Mechanical, Electronics, and Electrical and Automation Control (METMS 2023)


Power systems are susceptible to various kinds of abnormal electricity consumption behavior, and detecting electricity theft can improve their stability. To address the dependence on labeled data and the low accuracy of existing electricity theft detection methods, this paper proposes a theft detection algorithm based on contrastive learning and clustering combination discrimination. The algorithm consists of a representation extraction module and a clustering combination discrimination module. The representation extraction module first uses BiGRU to extract the context information of energy consumption data, then obtains the representation of the power data with a dilated convolutional network, and uses hierarchical contrastive learning to improve the learning effect of the network. The cluster combination discrimination module is based on K-means and an adaptive DBSCAN algorithm, and uses a combination discrimination mechanism to classify the representations and determine theft. The effectiveness of the proposed method is verified on the public SGCC dataset, and the experimental results show that it is superior to other unsupervised anomaly detection methods.
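The citation statement for this entry notes that the number of clusters must be fixed when K-means is initialized and that this parameter strongly affects the result. The sketch below illustrates that sensitivity on scalar data by running plain K-means for several values of k and recording the within-cluster sum of squares. It is an illustration only, not the paper's method; the function `kmeans_1d`, the evenly spaced initialization, and the `readings` values are assumptions made for the example.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain K-means on scalars; returns the within-cluster sum of squares."""
    ordered = sorted(values)
    # Assumed deterministic initialization: k evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:  # no center moved: converged
            break
        centers = updated
    return sum(min((v - c) ** 2 for c in centers) for v in values)


# Hypothetical readings that form three natural groups.
readings = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
wcss_by_k = {k: kmeans_1d(readings, k) for k in (1, 2, 3, 4)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 3))
```

On data like this, the sum of squares drops sharply until k matches the number of natural groups and only marginally afterwards, which is why the chosen k matters so much for the quality of the clustering.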
