2020
DOI: 10.1007/s00239-020-09954-0

EasyDIVER: A Pipeline for Assembling and Counting High-Throughput Sequencing Data from In Vitro Evolution of Nucleic Acids or Peptides

Abstract: In vitro evolution is a well-established technique for the discovery of functional RNA and peptides. Increasingly, these experiments are analyzed by high-throughput sequencing (HTS) for both scientific and engineering objectives, but computational analysis of HTS data, particularly for peptide selections, can present a barrier to entry for experimentalists. We introduce EasyDIVER (Easy pre-processing and Dereplication of In Vitro Evolution Reads), a simple, user-friendly pipeline for processing high-throughput…

Cited by 15 publications (16 citation statements)
References 13 publications
“…The raw, paired-end, demultiplexed Illumina read files (i.e., FASTQ files) of all k-Seq samples were processed with the EasyDIVER pipeline to create count files containing dereplicated lists of the central variable region sequences and their count reads (106). Every unique sequence detected in the count file of the input sample was tracked across all k-Seq samples.…”
Section: Methods (mentioning)
confidence: 99%
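
The dereplication and tracking described in the statement above can be illustrated with a minimal Python sketch. This is not EasyDIVER's code or its file format: the function names `dereplicate` and `track_across_samples` are hypothetical, and the reads are assumed to be already joined and trimmed to the central variable region.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads into (sequence, count) pairs.

    Pairs are sorted by descending count, then alphabetically,
    so the output order is deterministic.
    """
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def track_across_samples(input_reads, samples):
    """Look up every unique input-sample sequence in each other sample.

    `samples` maps a sample name to its list of reads (hypothetical layout).
    Sequences absent from a sample get a count of 0.
    """
    input_seqs = [seq for seq, _ in dereplicate(input_reads)]
    sample_counts = {name: Counter(reads) for name, reads in samples.items()}
    return {
        seq: {name: counts[seq] for name, counts in sample_counts.items()}
        for seq in input_seqs
    }


if __name__ == "__main__":
    table = track_across_samples(
        ["ACGT", "ACGT", "TTTT"],
        {"sample_1": ["ACGT", "GGGG"], "sample_2": ["TTTT", "TTTT"]},
    )
    for seq, per_sample in table.items():
        print(seq, per_sample)
```

Sequences that appear only in a later sample (here `GGGG`) are ignored, which mirrors the statement's choice to track only sequences detected in the input sample.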
“…First, we examined whether any novel ribozyme families arose in either selection condition. The HTS result of each selection pool was analyzed by EasyDIVER pipeline (106) to create dereplicated lists of the central variable region sequences and their count reads. The sequences on the list were clustered into a peak if the sequences were within a Hamming distance of ≤3 from the known family wild types (S1a, S1b, S2a, S2b, and S3).…”
Section: Methods (mentioning)
confidence: 99%
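
The Hamming-distance criterion in the statement above can be sketched as follows. This is an illustration only, not the citing authors' code: `assign_to_peaks` is a hypothetical name, the wild-type sequences are short placeholders rather than the real S1a to S3 sequences, and two details are assumptions (sequences whose length differs from a wild type are skipped for that wild type, and a sequence within range of several wild types goes to the nearest one).

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def assign_to_peaks(sequences, wild_types, max_distance=3):
    """Group sequences into peaks around wild-type centers.

    `wild_types` maps a family name to its wild-type sequence.
    A sequence joins the nearest wild type within `max_distance`;
    sequences outside every peak are left unassigned.
    """
    peaks = {name: [] for name in wild_types}
    for seq in sequences:
        best = None  # (family name, distance) of the closest match so far
        for name, wt in wild_types.items():
            if len(seq) != len(wt):
                continue  # assumption: only equal-length comparisons
            d = hamming(seq, wt)
            if d <= max_distance and (best is None or d < best[1]):
                best = (name, d)
        if best is not None:
            peaks[best[0]].append(seq)
    return peaks


if __name__ == "__main__":
    # Placeholder wild types; NOT the real family sequences.
    wild_types = {"famA": "AAAAAAAA", "famB": "CCCCCCCC"}
    pool = ["AAAAAAAT", "CCCCGGGC", "ACGTACGT"]
    print(assign_to_peaks(pool, wild_types))
```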
“…After purification with the QiaQuick PCR Purification Kit (Qiagen), the enriched DNA pool was transcribed (see Supplementary Data) and subjected to an additional round of selection (black arrow Figure 1B). After six rounds of selection, the G6 DNA pool was analyzed by high-throughput Illumina sequencing (Illumina MiSeq, nano output) and the resulting sequence data was processed using EasyDIVER (41) (Supplementary Materials and Methods and Supplementary Table S1).…”
Section: Methods (mentioning)
confidence: 99%
“…Sequencing reads were processed using trimmomatic SE CROP:90 to facilitate joining 76, and then paired-end reads were joined and unique sequences were enumerated using EasyDIVER 77. Joining was performed using the following PANDAseq 78 flags: -a -l 1 -A pear -C completely_miss_the_point:0.…”
Section: Computational Analyses Of K-seq Data (mentioning)
confidence: 99%
“…Sequences were clustered into families based on sequence similarity, using a custom Python script (see Data Availability). The script ClusterBOSS.py uses the enumerated read output files generated from the EasyDIVER package 77. In general, first, all sequences were sorted according to their read count values.…”
Section: Clustering Analysis Of Sequences From Selections (mentioning)
confidence: 99%