2019
DOI: 10.1007/978-3-030-17083-7_14

De Novo Clustering of Long-Read Transcriptome Data Using a Greedy, Quality-Value Based Algorithm

Cited by 16 publications (18 citation statements)
References 44 publications
“…Optimizing the normalization protocol would be the best strategy for sequencing of a transcriptome where we expect transcripts are expressed at hugely variable levels. Specific bioinformatics tools including TAPIS pipeline (Abdel-Ghany et al, 2016), de novo AS (Liu et al, 2017), IsoCon (Sahlin et al, 2018), and isONclust (Sahlin and Medvedev, 2019) could be used to process and cluster read data prior further functional annotation at transcript isoform level.…”
Section: Discussion (mentioning)
confidence: 99%
“…The input to our algorithm is a cluster of reads originating from transcripts of a single gene family. Such clusters can be generated from a whole-transcriptome dataset by using our previously published tool isONclust (Sahlin and Medvedev 2019). Each cluster is then processed individually and in parallel with isONcorrect, with the goal of correcting all the sequencing errors.…”
Section: Results (mentioning)
confidence: 99%
“…This strategy is in sharp contrast to approaches which cluster based on the isoform of origin. Such clustering results in low read coverage per transcript (Sahlin and Medvedev 2019), particularly for genes expressing multiple isoforms with variable start and stop sites and makes error correction unable to utilize full coverage over shared exons. By using isONclust to cluster at the gene family level, each read retains more complete exon coverage and helps the correction process preserve allele- or copy-specific small variant differences between transcripts that otherwise share the same structure.…”
Section: Discussion (mentioning)
confidence: 99%
“…isONclust is one of the recent clustering tools that are designed to cluster PacBio and Oxford nanopore data efficiently. Clustering in isONclust is based on shared k-mers and a minimizer scheme [20]. Although MMseqs2/Linclust is not designed to work for isoforms, we included this tool as the results in [22] showed that it is faster than CD-HIT.…”
Section: Introduction (mentioning)
confidence: 99%
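
The statement above mentions shared k-mers and a minimizer scheme. As a rough illustration of that general idea only, the Python sketch below picks the lexicographically smallest k-mer in each window of w consecutive k-mers and compares two reads by the fraction of minimizers they share. It is not isONclust's implementation: the function names, the lexicographic ordering, and the values of `k` and `w` are assumptions made for this example, and the quality-value weighting named in the paper's title is left out.

```python
def minimizers(seq, k, w):
    """Return the set of (position, k-mer) minimizers of seq.

    A minimizer here is the lexicographically smallest k-mer in each
    window of w consecutive k-mers (leftmost one on ties). The ordering
    is an illustrative assumption; real tools often use a hash instead.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    # An empty range (read too short for one full window) yields no minimizers.
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: (window[j], j))
        picked.add((start + offset, window[offset]))
    return picked


def shared_minimizer_fraction(read_a, read_b, k=5, w=3):
    """Fraction of read_a's distinct minimizer k-mers also found in read_b.

    Hypothetical similarity score for illustration; positions are ignored.
    """
    a = {kmer for _, kmer in minimizers(read_a, k, w)}
    b = {kmer for _, kmer in minimizers(read_b, k, w)}
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

In a greedy clustering setting, a score like this could be compared against a threshold to decide whether a read joins an existing cluster or starts a new one; the threshold itself would be another assumption.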