2019
DOI: 10.1186/s12864-019-5996-3

DiscoverY: a classifier for identifying Y chromosome sequences in male assemblies

Abstract: Background: Although the Y chromosome plays an important role in male sex determination and fertility, it is currently understudied due to its haploid and repetitive nature. Methods to isolate Y-specific contigs from a whole-genome assembly broadly fall into two categories. The first involves retrieving Y-contigs using proportion sharing with a female, but such a strategy is prone to false positives in the absence of a high-quality, complete female reference. A second strategy uses the ratio of dep…
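The first strategy named in the abstract, proportion sharing with a female, can be illustrated with a toy k-mer comparison. This is a minimal sketch under stated assumptions, not DiscoverY's implementation: the function names, the `k` value, and the `max_shared` threshold are all hypothetical, and only forward-strand k-mers are considered.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_proportion(contig: str, female_kmers: Set[str], k: int) -> float:
    """Fraction of the contig's distinct k-mers that also occur in the female set."""
    contig_kmers = kmers(contig, k)
    if not contig_kmers:
        raise ValueError("contig is shorter than k")
    return len(contig_kmers & female_kmers) / len(contig_kmers)


def candidate_y_contigs(
    contigs: Dict[str, str],
    female_seq: str,
    k: int = 4,                # hypothetical toy value; real k-mers are much longer
    max_shared: float = 0.2,   # hypothetical threshold, chosen for illustration
) -> List[str]:
    """Names of contigs that share few k-mers with the female sequence."""
    female_kmers = kmers(female_seq, k)
    return [
        name
        for name, seq in contigs.items()
        if len(seq) >= k and shared_proportion(seq, female_kmers, k) <= max_shared
    ]


# Toy data: one contig fully covered by the female sequence, one with no overlap.
female = "ACGTACGTACGT"
male_contigs = {"autosomal": "CGTACGTA", "y_like": "TTTTGGGGCCCC"}
result = candidate_y_contigs(male_contigs, female)
```

With these toy inputs, only `y_like` falls under the threshold. The sketch also shows the weakness the abstract points out: any sequence missing from an incomplete female reference shares few k-mers with it and would be flagged as a false positive.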

Cited by 24 publications (22 citation statements)
References 21 publications
“…For bonobo and Sumatran orangutan, we generated and assembled (53) deep-coverage short sequencing reads from male individuals and identified putative Y contigs by mapping them against the corresponding female reference assemblies (54). These contigs were then scaffolded with mate-pair reads (55).…”
Section: Methods; citation type: mentioning
confidence: 99%
“…This approach is unsatisfactory because (1) as shown in this article, it takes substantially more space than direct k-mer compression, (2) k-mer counting on the fly adds significant time and memory to the decompression process, and (3) there are applications where the k-mer set cannot be reproduced by simply counting k-mers in a FASTA file, for example, when it is a product of a multi-sample error correction algorithm (Yang et al, 2012). Further, there are applications where the k-mer set is not related to sequence read data at all, for example, a universal hitting set (Orenstein et al, 2017), a chromosome-specific reference dictionary (Rangavittal et al, 2019), or a winnowed min-hash sketch [e.g., as in Sahlin and , or see Marçais et al (2019) and Rowe (2019) for a survey].…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…Yella is a male Labrador Retriever, and while reads from the Y-chromosome could be detected via alignment to an existing partial Y chromosome reference sequence, the Y-chromosome for Yella was not able to be resolved beyond an acceptable threshold for a published reference genome. This is similar to issues experienced across mammalian genomics, in which the short and highly repetitive nature of the Y-chromosome, along with its homology to the X-chromosome can make it difficult to detect and assemble (G. Li et al 2013; Oetjens et al 2018; Carvalho and Clark 2013; Rangavittal et al 2019).…”
Section: Mitochondrial Sequence and Y-chromosome; citation type: mentioning
confidence: 68%