2021
DOI: 10.1002/tpg2.20135

Machine learning approaches to identify core and dispensable genes in pangenomes

Abstract: A gene in a given taxonomic group is either present in every individual (core) or absent in at least a single individual (dispensable). Previous pangenomic studies have identified certain functional differences between core and dispensable genes. However, identifying if a gene belongs to the core or dispensable portion of the genome requires the construction of a pangenome, which involves sequencing the genomes of many individuals. Here we aim to leverage the previously characterized core and dispensable gene …
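The core/dispensable definition in the abstract can be illustrated with a small presence/absence example. This is a minimal sketch of the definition only, not the machine learning approach the paper describes. The function name `classify_genes`, the individual labels, and the gene names are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Set


def classify_genes(presence: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each gene as 'core' or 'dispensable'.

    `presence` maps an individual (hypothetical label) to the set of genes
    found in that individual. A gene is core if every individual carries it,
    and dispensable if at least one individual lacks it.
    """
    if not presence:
        return {}
    gene_sets = list(presence.values())
    all_genes = set().union(*gene_sets)         # every gene seen in any individual
    core_genes = set.intersection(*gene_sets)   # genes shared by all individuals
    return {
        gene: ("core" if gene in core_genes else "dispensable")
        for gene in sorted(all_genes)
    }


# Hypothetical toy pangenome with three individuals.
toy_pangenome = {
    "individual_1": {"geneA", "geneB", "geneC"},
    "individual_2": {"geneA", "geneB"},
    "individual_3": {"geneA", "geneC", "geneD"},
}

labels = classify_genes(toy_pangenome)
for gene, label in labels.items():
    print(gene, label)
```

In this toy input only `geneA` occurs in all three individuals, so it is the only core gene. Note that the labels depend on which individuals are sampled: adding an individual that lacks `geneA` would reclassify it as dispensable, which is why the abstract says a pangenome built from many sequenced individuals is normally required.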

Cited by 8 publications (8 citation statements)
References 45 publications
“…Pan-genome enabled the characterization of the genetic diversity present in a species. In plants, core genes are often associated with essential metabolic processes, while dispensable genes are related to adaptive functions such as disease resistance and stress responses (Danilevicz et al, 2020; Yocca and Edger, 2022). HvCaM/CML was a large gene family, and according to the released first-generation barley pan-genome data (Jayakodi et al, 2020), 81 (95.3%) HvCaMs/CMLs were core genes (Figure 6).…”
Section: HvCaMs/CMLs In Different Barley Genotypes (mentioning)
confidence: 99%
“…The pangenome of the three strains (Figures 3 and S3) consists of 6259 genes, with 4089 classified as core genes. Dispensable genes, defined as those absent in at least one of the strains [83], comprise 674 genes (10.8%). The G. polyisoprenivorans pangenome was analyzed using a dataset comprising three strains: G. polyisoprenivorans 135 and G. polyisoprenivorans C (CP073075.1) which are closely related, and G. polyisoprenivorans VH2 (CP003119.1), which is phylogenetically distinct from this pair.…”
Section: The Pangenome Of Coding Regions (mentioning)
confidence: 99%
“…ML is widely used for crop improvement; some of the case studies include plant–pathogen interactions [210], traits, and phenotyping [211], and applications include at the molecular level in plants [212]. The use of ML in plant genomics has increased in the last decade [213]; applications include the classification of genes into active and inactive genes in maize [214], identifying genome crossovers [215], identification of near-complete genetically fixed genomic regions [216], gene regulatory networks in maize [217], gene prediction with deep learning with a variety of architectures [218], diagnosis of pests and disease [219], gene prediction concerning climatic conditions [220], predicted gene expression levels from genomic sequence data [221], identifying variants based on short-read sequence alignments [222], and classifying genes as core and dispensable genes [223].…”
Section: Data Science and Artificial Intelligence (mentioning)
confidence: 99%