2021
DOI: 10.1073/pnas.2101056118

Novel functional sequences uncovered through a bovine multiassembly graph

Abstract: Many genomic analyses start by aligning sequencing reads to a linear reference genome. However, linear reference genomes are imperfect, lacking millions of bases of unknown relevance and are unable to reflect the genetic diversity of populations. This makes reference-guided methods susceptible to reference-allele bias. To overcome such limitations, we build a pangenome from six reference-quality assemblies from taurine and indicine cattle as well as yak. The pangenome contains an additional 70,329,827 bases co…

Cited by 58 publications (86 citation statements)
References 75 publications

“…Sequencing technologies and assemblers evolve rapidly, and so even recently generated bovine assemblies, including the ones reported here, have been produced under non-uniform conditions (e.g., (Crysnanto et al, 2021; Talenti et al, 2021)). Given the differences we observed between HiFi- and ONT-based assemblies, especially in comparison to the CLR-based ARS-UCD1.2 reference, it was crucial to examine how pangenome construction responded to different assembly inputs.…”
Section: Results (mentioning)
confidence: 99%
“…Larger structural variants (SVs) and variation located in repetitive or challenging regions have rarely been studied across Bovinae due to the inherent limitations of short sequencing reads and incomplete reference genomes. Moreover, no reference assembly of a single individual can reflect the immense genomic diversity present in global breeds of domestic cattle (Crysnanto et al, 2019, 2021; Crysnanto & Pausch, 2020; Talenti et al, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…We expect further explorations of other cell types to highlight additional MEIs that are not covered in the current set. In total, there may be many other functional sequences that are hidden by a haploid reference but that could be discovered through a pan-genomic reference [12].…”
Section: Discussion (mentioning)
confidence: 99%
“…This analysis is based on the latest reference sequence ARS-UCD1.2 of a Hereford cow, however, the reference sequence still spans many gaps and includes more than 2000 unplaced scaffolds that could potentially harbor important coding sequences [37]. Recently, it was shown that reference graphs based on sequence data of several breeds can include new functional sequences [97]. Another drawback of our approach is that we restricted our analysis to coding variants, while non-coding variants impairing the expression of a protein or more complex structural variants disturbing the function of genes might also be causative.…”
Section: Discussion (mentioning)
confidence: 99%