2017
DOI: 10.1007/s00357-017-9240-z

rCOSA: A Software Package for Clustering Objects on Subsets of Attributes

Abstract: rCOSA is a software package interfaced to the R language. It implements statistical techniques for clustering objects on subsets of attributes in multivariate data. The main output of COSA is a dissimilarity matrix that one can subsequently analyze with a variety of proximity analysis methods. Our package extends the original COSA software (Friedman and Meulman, 2004) by adding functions for hierarchical clustering methods, least squares multidimensional scaling, partitional clustering, and data visualization. …

Cited by 7 publications (12 citation statements)
References 24 publications

“…1, region ML and FL), COSA (clustering objects on subsets of attributes) analysis was used. This analysis is appropriate when the differentiation among groups of objects is unclear and when there are objects that do not clearly belong to any of the groups (Kampert, Meulman & Friedman, 2016). It is a method involving iterations that minimize the distance among individuals.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…(1): where d_ijk is the dissimilarity between objects i and j as evaluated for attribute k, and N is the number of objects. This method was implemented using the software rCOSA (Kampert, Meulman & Friedman, 2016). The NNS were grouped a priori based on two criteria: occurrence in more/fewer localities and the presence/absence of NCR.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…The testing procedure proposed by Janitza et al is particularly appealing here, but it is expected to work only if the holdout permutation importance is centred around 0 and symmetric for noisy variables. Following Kampert et al, 102 samples were generated having 500 normally distributed attributes. The first 50 attributes are generated from normal distributions with different means so that three groups containing 34 samples each are generated (μ_1 = −1.2, μ_2 = 0, μ_3 = 1.2, and σ = 0.2) with a total of 500 attributes; 450 of which are normally distributed noise that do not contribute to the clusters.…”
Section: Applications | Citation type: mentioning
confidence: 99%
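
The simulation design quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cited authors' code: the function name `simulate_clustered_data` is hypothetical, and the quote does not give the mean or standard deviation of the 450 noise attributes, so the standard normal used here (mean 0, standard deviation 1) is an assumption.

```python
import random


def simulate_clustered_data(n_groups=3, per_group=34, n_attributes=500,
                            n_signal=50, means=(-1.2, 0.0, 1.2),
                            signal_sd=0.2, noise_sd=1.0, seed=0):
    """Generate objects whose first `n_signal` attributes carry group structure.

    The group sizes, group means and signal standard deviation follow the
    quoted description. `noise_sd` (and the noise mean of 0) are ASSUMPTIONS,
    because the quote does not specify the noise distribution's parameters.
    """
    rng = random.Random(seed)
    data, labels = [], []
    for group in range(n_groups):
        for _ in range(per_group):
            # Signal attributes: centred on this group's mean.
            signal = [rng.gauss(means[group], signal_sd) for _ in range(n_signal)]
            # Noise attributes: same distribution for every group.
            noise = [rng.gauss(0.0, noise_sd) for _ in range(n_attributes - n_signal)]
            data.append(signal + noise)
            labels.append(group)
    return data, labels


data, labels = simulate_clustered_data()
print(len(data), len(data[0]))  # number of objects, number of attributes
```

With the defaults this yields 3 × 34 = 102 objects with 500 attributes each, of which only the first 50 separate the groups.
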
“…For categorical data, the dissimilarity function called simple matching (KAUFMAN; ROUSSEEUW, 1990) is commonly used to measure the difference between qualitative attributes. Such a function is expressed by:…”
Section: Dissimilarity Functions | Citation type: mentioning
confidence: 99%
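
The formula in the statement above is cut off in the excerpt. For orientation, here is a minimal sketch of simple matching in one common form, the proportion of attributes on which two objects disagree. The citing paper's exact expression may differ (for example, some definitions use the raw mismatch count without dividing by the number of attributes), and the function name is hypothetical.

```python
def simple_matching_dissimilarity(x, y):
    """Proportion of categorical attributes on which objects x and y differ.

    Assumes the normalized form of simple matching; this is one common
    definition, not necessarily the one used in the quoted source.
    """
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    if not x:
        raise ValueError("objects must have at least one attribute")
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    return mismatches / len(x)


# Two objects that differ on one of three attributes.
d = simple_matching_dissimilarity(["red", "small", "round"],
                                  ["red", "large", "round"])
print(d)
```
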
“…It is very common using such strategy to define the number of the clusters a dataset might has, although it can be also done using the third strategy, the relative performance measures. Some examples of internal performance measures are silhouette width (ROUSSEEUW, 1987; KAUFMAN; ROUSSEEUW, 1990), Dunn index (DUNN, 1974), Davies-Bouldin index (DAVIES; BOULDIN, 1979), also known as DB index, PBM index (PAKHIRA; BANDYOPADHYAY; MAULIK, 2004), and c-index (HUBERT; LEVIN, 1975). To obtain more information about internal performance measures, see in Vendramin, Campello and Hruschka (2010) and Xiong and Li (2014).…”
Section: Performance Measures | Citation type: mentioning
confidence: 99%
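
Of the internal measures listed above, silhouette width can be computed directly from a dissimilarity matrix, which is the kind of output the abstract attributes to COSA. The sketch below follows the standard definition s(i) = (b(i) − a(i)) / max(a(i), b(i)); the function name and the toy data are illustrative assumptions, and assigning 0 to objects in singleton clusters is a common convention rather than something stated in the source.

```python
def silhouette_widths(dissim, labels):
    """Silhouette width per object from a square dissimilarity matrix.

    a(i): mean dissimilarity of object i to the other members of its cluster.
    b(i): smallest mean dissimilarity of object i to any other cluster.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # convention for singleton clusters
            continue
        a = sum(dissim[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dissim[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy example: four points on a line forming two well-separated pairs.
points = [0.0, 1.0, 10.0, 11.0]
dissim = [[abs(p - q) for q in points] for p in points]
widths = silhouette_widths(dissim, [0, 0, 1, 1])
average_width = sum(widths) / len(widths)
print(average_width)
```

Values near 1 indicate objects that sit well inside their cluster, and values near 0 or below indicate objects on or across a cluster boundary.
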