2022
DOI: 10.1371/journal.pcbi.1010301

Archetypal Analysis for population genetics

Abstract: The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods are computationally-intensive, prohibitively so in the case of nationwide biobanks. Here we explore Archetypal Analysis as an efficient, unsupervised approach for identifying genetic clusters and for associating indiv…

Cited by 14 publications (6 citation statements)
References: 17 publications
“…Principal Component Analysis (PCA) clustering was carried out using PLINK (Purcell et al, 2007). We complemented our PCA analysis with archetypal analysis following Gimbernat-Mayol et al (2022), to test biases in the PCA analysis due to irregular sample sizes and to identify latent factors. We removed multiallelic SNPs prior to running archetypal analysis and performed the analysis with k ranging from 2 to 4.…”
Section: Population Structure (mentioning)
confidence: 99%
“…For example, grouping all racialized groups together often creates challenges with translating study findings into the real-world setting where racialized groups are far from a homogenous population. In recent years, ethnicity determined through genomic analysis has been proposed as a more precise approach to contextualize disparities rather than the social construct of race (14). For example, the use of genetic patterns, including variations of drug metabolism and drug targets, indicates that there are issues in representing human population genetic structures in evaluating drug safety and efficiency and relating this structure to drug response.…”
Section: The Promises and Challenges of Precision Medicine (mentioning)
confidence: 99%
“…Two widely used parametric tools are STRUCTURE [11] and ADMIXTURE [12], which estimate the proportions of different ancestries (or ancestral populations) for each individual, known as admixture. Recently, Archetypal analysis was shown to be more computationally efficient and provide more interpretable results than ADMIXTURE [13]. In contrast, non-parametric methods do not have a finite set of parameters and instead rely on the intrinsic structure of the data to determine which data points best resemble each other.…”
Section: Introduction (mentioning)
confidence: 99%