2012
DOI: 10.1186/1471-2164-13-206

Limitations of the rhesus macaque draft genome assembly and annotation

Abstract: Finished genome sequences and assemblies are available for only a few vertebrates. Thus, investigators studying many species must rely on draft genomes. Using the rhesus macaque as an example, we document the effects of sequencing errors, gaps in sequence and misassemblies on one automated gene model pipeline, Gnomon. The combination of draft genome with automated gene finding software can result in spurious sequences. We estimate that approximately 50% of the rhesus gene models are missing, incomplete or inco…

Cited by 58 publications (76 citation statements)
References: 16 publications

“…Breakpoints generated by rearrangements of these smaller SFs are either the footprint of bona fide structural rearrangements, or they may be artifacts produced by misassembled sequences. For example, previous studies revealed problems in the rheMac2 assembly version of the rhesus genome (31)(32)(33), which is one of the species showing a large discrepancy of the number and the rate of breakpoints at the two resolutions. Even though we used a more recent version of the rhesus genome (rheMac3), it is not clear whether all of the assembly problems in the previous version were completely fixed.…”
Section: Discussion (mentioning)
confidence: 99%
“…One can compare gene lists from different assemblies, but gaps and other issues in assemblies create ambiguity [21, 22]. Available evidence suggests that humans and chimpanzees experienced more rapid changes in gene copy number than did orangutans or rhesus macaques [13].…”
Section: Genomic Differences Among Primates (mentioning)
confidence: 99%
“…The initial draft assemblies for nonhuman primates all provide much useful information, but are not complete or reliable enough to support all current scientific goals [21, 22]. One limitation of draft genomes is the presence of gaps in chromosomal sequences, resulting in missing exons or genes.…”
Section: Future Directions (mentioning)
confidence: 99%
“…Interestingly, we observed 72-78% aligned reads for NHP #2 independent of region or sequencing depth. Overall 72-84% aligned read results are less than anticipated as compared to alignment with well annotated human, mouse or rat reference genome builds and is suggested to be a result of missing gene models annotated in the draft MMUL 1.0 reference genome build (Zhang et al, 2012). However, despite non-optimal alignment efficiency using the draft Rhesus macaque genome build, the sample data yielded accurately detectable transcript abundance counts from 7845 to 9937 transcripts for brain regions of NHP #1 and from 6331 to 8826 transcripts for brain regions of NHP #2.…”
Section: 7 (mentioning)
confidence: 83%