Augmented versus artificial intelligence for stratification of patients with myositis

We read with interest the recent article by Pinal-Fernandez and Mammen, 1 which comments on the paper by Spielmann et al 2 and, to a lesser extent, on the contribution by Mariampillai et al, 3 4 and raises concerns about the artificial intelligence (AI)-driven approach used to define subgroups of patients with idiopathic inflammatory myopathy (IIM). To illustrate this, Pinal-Fernandez and Mammen constructed a library of 1000 observations and selected the four variables using a multivariate normal distribution, thus finding a clustering similar to that in the original paper by Spielmann et al. 2 We share some of the concerns about unsupervised learning techniques raised by Pinal-Fernandez and Mammen. 1

In this letter, we would like to highlight several aspects related to AI-driven methodologies. Machine learning (ML) is a subset of AI that enables a computer to make decisions based on large datasets. When applied to clustering, it will always give an 'optimal' solution for the number of clusters 'present' in a dataset. However, it is up to the human user to determine whether those clusters really exist.

An ML algorithm determines the number of clusters by separating the dataset into subgroups while optimising two criteria: (1) the separation between clusters should be as large as possible and (2) within each cluster, the distance from each point to the cluster centre should be as small as possible. In other words, the goal is to obtain tight individual clusters that are clearly distinguishable from one another. For any dataset, the algorithm will present an optimal solution under these or similar criteria, but this does not mean that the resulting clusters are truly significant or meaningful.
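The point that a clustering algorithm returns a partition even when no subgroups exist can be illustrated with a minimal sketch. It is not the analysis performed in any of the cited papers. It assumes 1000 observations of four independent standard normal variables (an identity covariance matrix, chosen here for simplicity), uses plain k-means as a stand-in clustering algorithm, and relies only on the Python standard library. The function names `simulate`, `kmeans` and `mean_silhouette` are hypothetical helpers introduced for this illustration; in practice a library such as scikit-learn would normally be used.

```python
import math
import random


def simulate(n=1000, d=4, seed=0):
    """Draw n points from a d-dimensional standard normal: no true clusters."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centres, within-cluster SS)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, centres, wss


def mean_silhouette(points, labels, k):
    """Mean silhouette width; values near 1 indicate well-separated clusters."""
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = [0.0] * k, [0] * k
        for j, q in enumerate(points):
            if j != i:
                sums[labels[j]] += math.sqrt(sq_dist(p, q))
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:
            continue  # singleton cluster: silhouette taken as 0
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(sums[j] / counts[j]  # mean distance to nearest other cluster
                for j in range(k) if j != own and counts[j] > 0)
        total += (b - a) / max(a, b)
    return total / len(points)


points = simulate()
for k in (2, 3, 4):
    labels, centres, wss = kmeans(points, k)
    sizes = sorted(labels.count(j) for j in range(k))
    print(f"k={k}: cluster sizes={sizes}, within-cluster SS={wss:.1f}")

labels, _, _ = kmeans(points, 3)
print(f"mean silhouette for k=3: {mean_silhouette(points, labels, 3):.2f}")
```

Whatever value of k is requested, the algorithm returns that many clusters, although the data were generated without any subgroup structure. A quantitative check such as the mean silhouette width, together with clinical judgement, is what indicates whether the partition is meaningful.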
Visualising the clusters using dimensionality reduction techniques such as principal component analysis or t-distributed stochastic neighbour embedding is vital for this process, in addition to more quantitative methods such as comparing intracluster variation, intercluster variation and silhouette scoring. That is why researchers using ML should ideally be 'bilingual' and understand both the mathematics and algorithms, as well as the science and clinical meaning behind the results.

To conclude, we emphasise that ML undoubtedly has the potential to improve the stratification of patients with IIM if certain concepts of data science are followed, as also pointed out by a task force of the European League Against Rheumatism for big data and AI. 5 ML relies on large, standardised and curated datasets, which in turn require large patient cohorts. Because IIM is rare, larger patient cohorts (such as MyoNet/EuroMyositis) 6 are required to generate quality data. Once larger and curated datasets are available, the ML approach is a powerful alternative to human judgement and can improve future classification criteria for IIM. 4 7 8 Today, we argue for the use of ML alon...