2022
DOI: 10.3390/biotech11010007

High Performance Integration Pipeline for Viral and Epitope Sequences

Abstract: With the spread of COVID-19, sequencing laboratories started to share hundreds of sequences daily. However, the lack of a commonly agreed standard across deposition databases hindered the exploration and study of all the viral sequences collected worldwide in a practical and homogeneous way. During the first months of the pandemic, we developed an automatic procedure to collect, transform, and integrate viral sequences of SARS-CoV-2, MERS, SARS-CoV, Ebola, and Dengue from four major database institutions (NCBI…

Citations: cited by 3 publications (2 citation statements)
References: 23 publications
“…In High Performance Integration Pipeline for Viral and Epitope Sequences [10], Tommaso Alfonsi, Pietro Pinoli and Arif Canakoglu presented an integrated pipeline to collect, transform, and integrate viral sequences of SARS-CoV-2, MERS, SARS-CoV, Ebola, and Dengue from four major database institutions (NCBI, COG-UK, GISAID, and NMDC). This pipeline allowed the development of VirusViz and EpiSurf, two data exploration interfaces, and of ViruSurf, one of the largest databases of viral sequences.…”
mentioning
confidence: 99%
“…The integration of efficient methodologies for data processing and retrieval of large amounts of data within these tools is a major problem in computational biology and in bioinformatics [6,7]. Data integration will allow for better management and facilitate the system's ability to store and retrieve data in addition to interacting with outside sources within the same field [8].…”
Section: Introduction
mentioning
confidence: 99%