2021
DOI: 10.1016/j.jocs.2021.101445

K and starting means for k-means algorithm

Cited by 41 publications (15 citation statements)
References 28 publications

Citation statements:
“…Based on these components, hierarchical clustering was performed. Ward's method was applied to determine the ideal number of clusters, while the classification itself was done with the K-means method, since hierarchical clustering was used as "complementary" to, rather than a replacement for, non-hierarchical methods [38]. The statistical test identified one outlier, but the falsely flagged shop was merely an extreme point, since nearly all of the examined attributes were available in its case.…”
Section: Results (mentioning)
Confidence: 99%
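
To make the two-stage design in the excerpt above concrete, here is a minimal sketch, assuming standardized numeric features. The synthetic data, the "largest merge-distance gap" heuristic for picking k, and all variable names are illustrative assumptions, not the cited study's code.

```python
# Illustrative two-stage clustering: Ward's hierarchical method suggests the
# number of clusters, and K-means performs the final (non-hierarchical)
# classification. Synthetic data; not the cited study's code.
import numpy as np
from scipy.cluster.hierarchy import linkage
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(200, 5)))  # placeholder data

# Stage 1: Ward linkage. A common heuristic reads the ideal k off the
# dendrogram: cut where the merge distance jumps the most.
Z = linkage(X, method="ward")
gaps = np.diff(Z[-10:, 2])            # distance increases over the last 10 merges
k = max(2, 9 - int(np.argmax(gaps)))  # cluster count just before the biggest jump

# Stage 2: the classification itself is done by K-means.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
```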
“…Notice that, even though the hostility measure uses k-means as the base of the method, it is not affected by its main drawbacks: the k-means method depends on the initialization and cannot form non-convex shapes [9]. Nevertheless, the hostility measure overcomes these problems thanks to its initialization with a high value of k, which maximizes the number of layers and, consequently, the resulting information to combine.…”
Section: Proposed Methods (mentioning)
Confidence: 99%
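
As a quick illustration of the initialization dependence this excerpt mentions, a single-initialization k-means run can land in different local optima depending on the random seed. This is a sketch on synthetic blob data, not the cited paper's experiment.

```python
# k-means with a single random initialization: different seeds can converge
# to different local optima, visible as different final inertia values.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=10, random_state=0)
inertias = [
    round(KMeans(n_clusters=10, init="random", n_init=1, random_state=s)
          .fit(X).inertia_)
    for s in range(5)
]
print(inertias)  # typically several distinct values, one per local optimum
```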
“…Among the partitional clustering algorithms, the KMA is currently the most popular one, and its implementation consists of five main steps: (1) specify the number of clusters (NC) m; (2) select m samples randomly as the initial cluster centers (ICCs); (3) assign each sample to the cluster whose center is nearest; (4) update the clustering centers by treating the mean of each cluster as the new center; and (5) repeat steps (3) and (4) until the clustering centers no longer change [26]. In a subsequent study, Sculley [27] improved the KMA and proposed Mini-batch K-means (MBK), which accelerates clustering by randomly selecting a subset instead of the whole dataset to train the ICCs; like the KMA, however, its clustering performance is unstable and very sensitive to the ICCs [28]. K-means++, a representative partition-based clustering algorithm proposed by Arthur and Vassilvitskii, provides an effective solution for determining the ICCs, as shown in Algorithm 1: the ICCs are chosen by a D² weighting method following the principle that the larger the distances among the ICCs, the more reasonable the selection, which effectively reduces the possibility of multiple ICCs appearing in the same cluster [29].…”
Section: Clustering Algorithms (mentioning)
Confidence: 99%
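
As a concrete reference for the five steps and the D² weighting described above, here is a minimal NumPy sketch. The function names (kmeanspp_init, kmeans) and the synthetic data are illustrative; it follows the textbook formulations rather than any of the cited papers' code.

```python
import numpy as np

def kmeanspp_init(X, m, rng):
    """K-means++ (D^2 weighting): each new initial cluster center (ICC) is
    sampled with probability proportional to its squared distance from the
    nearest center already chosen, spreading the ICCs apart."""
    centers = [X[rng.integers(len(X))]]            # first ICC: uniform at random
    for _ in range(m - 1):
        d2 = np.min(((X[:, None] - np.asarray(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.asarray(centers)

def kmeans(X, m, rng, max_iter=100):
    centers = kmeanspp_init(X, m, rng)             # steps (1)-(2), with ++ seeding
    for _ in range(max_iter):                      # step (5): repeat until stable
        # step (3): assign each sample to its nearest center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # step (4): move each center to the mean of its cluster
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(m)])
        if np.allclose(new, centers):              # centers unchanged: converged
            return labels, new
        centers = new
    return labels, centers

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))                      # placeholder data
labels, centers = kmeans(X, m=3, rng=rng)
```

The D² weighting makes far-away points the likeliest seeds, which is exactly the property the excerpt credits with keeping several ICCs from landing in the same cluster.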