Exome-wide evidence of compound heterozygous effects across common phenotypes in the UK Biobank

Lassen, Frederik H.; Venkatesh, Samvida S.; Baya, Nikolas; Hill, Barney; Zhou, Wei; Bloemendal, Alex; Neale, Benjamin M.; Kessler, Benedikt M.; Whiffin, Nicola; Lindgren, Cecilia M.; Palmer, Duncan S.

doi:10.1016/j.xgen.2024.100602

Cell Genomics

2024

Cited by 2 publications

References 67 publications

Genetic Transformer: An Innovative Large Language Model Driven Approach for Rapid and Accurate Identification of Causative Variants in Rare Genetic Diseases

Liang, Chen, Wang et al. 2024

Preprint

Background: Identifying causative variants is crucial for the diagnosis of rare genetic diseases. Over the past two decades, the application of genome sequencing technologies in the field has significantly improved diagnostic outcomes. However, the complexity of data analysis and interpretation continues to limit the efficiency and accuracy of these applications. Various genotype and phenotype-driven filtering and prioritization strategies are used to generate a candidate list of variants for expert curation, with the final report variants determined through knowledge-intensive and labor-intensive expert review. Despite these efforts, the current methods fall short of meeting the growing demand for accurate and efficient diagnosis of rare disease. Recent developments in large language models (LLMs) suggest that LLMs possess the potential to augment or even supplant human labor in this context.

Methods: In this study, we have developed Genetic Transformer (GeneT), an innovative large language model (LLM) driven approach to accelerate identification of candidate causative variants for rare genetic disease. A comprehensive evaluation was conducted between the fine-tuned large language models and four phenotype-driven methods, including Xrare, Exomiser, PhenIX and PHIVE, alongside six pre-trained LLMs (Qwen1.5-0.5B, Qwen1.5-1.8B, Qwen1.5-4B, Mistral-7B, Meta-Llama-3-8B, Meta-Llama-3-70B). This evaluation focused on performance and hallucinations.

Results: Genetic Transformer (GeneT) as an innovative LLM-driven approach demonstrated outstanding performance on identification of candidate causative variants, identified the average number of candidate causative variants reduced from an average of 418 to 8, achieving recall rate of 99% in synthetic datasets. Application in real-world clinical setting demonstrated the potential for a 20-fold increase in processing speed, reducing the time required to analyze each sample from approximately 60 minutes to around 3 minutes. Concurrently, the recall rate has improved from 94.36% to 97.85%. An online analysis platform iGeneT was developed to integrate GeneT into the workflow of rare genetic disease analysis.

Conclusion: Our study represents the inaugural application of fine-tuned LLMs for identifying candidate causative variants, introducing GeneT as an innovative LLM-driven approach, demonstrating its superiority in both simulated data and real-world clinical setting. The study is unique in that it represents a paradigm shift in addressing the complexity of variant filtering and prioritization of whole exome or genome sequencing data, effectively resolving the challenge akin to finding a needle in a haystack.

RGnet: Recessive Genotype Network in a Large Mendelian Disease Cohort

Ai, Kang, Zeng et al. 2024

Preprint

Recessive genotypes, including compound heterozygotes and homozygotes formed by rare variants that impact gene function, affect both alleles and were linked to numerous diseases and traits. However, the underlying patterns and interconnections of these recessive genotypes in large cohorts have rarely been studied. To address this gap, the Recessive Genotype Network (RGnet) was developed. This network model maps variant and genotype features to visualize and analyze recessive genotype patterns within large cohorts. Additionally, it uses permutation-based analyses to assess the enrichment of these genotypes in relation to specific phenotypes. Demonstrated through its application to the genetic deafness gene SLC26A4 in 22,125 cases affected by hearing loss, RGnet successfully identified pathogenic variants with high connectivity, providing a reliable method for exploring the pathogenic mechanisms underlying recessive disorders or traits.
