2020
DOI: 10.1101/2020.10.11.335406
Preprint

Unsupervised explainable AI for simultaneous molecular evolutionary study of forty thousand SARS-CoV-2 genomes

Abstract: Unsupervised AI (artificial intelligence) can obtain novel knowledge from big data without particular models or prior knowledge and is highly desirable for unveiling hidden features in big data. SARS-CoV-2 poses a serious threat to public health and one important issue in characterizing this fast-evolving virus is to elucidate various aspects of its genome sequence changes. We previously established unsupervised AI, a BLSOM (batch-learning SOM), which can analyze five million genomic sequences simultaneously…

Cited by 7 publications (20 citation statements)
References 20 publications
“…Herein, we focused on pentanucleotide composition, but similar separations were obtained for other lengths of oligonucleotides (Ikemura et al 2020). BLSOM is an explanatory AI that can clarify combinatorial patterns of oligonucleotides that contribute to the separation according to clades and .…”
Section: Conclusion and Perspectives (mentioning)
confidence: 83%
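
The "pentanucleotide composition" mentioned in this statement can be illustrated with a minimal Python sketch. It is not the cited authors' pipeline: the function name `kmer_composition`, the choice to skip windows containing ambiguous bases, and the normalization to relative frequencies are all assumptions made for illustration. It only shows how a genome sequence could be turned into the kind of fixed-length oligonucleotide vector that a clustering method such as a BLSOM takes as input.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k=5, alphabet="ACGT"):
    """Return the relative frequency of every k-mer over `alphabet` in `seq`.

    Hypothetical helper for illustration only. Windows containing any
    character outside `alphabet` (for example N) are skipped, which is an
    assumption of this sketch, not a documented step of the cited work.
    """
    seq = seq.upper()
    valid = set(alphabet)
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    total = sum(counts.values())
    # Enumerate all 4**k possible k-mers so every genome yields a vector
    # with the same keys in the same order (1024 entries for k=5).
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}
```

Calling `kmer_composition(genome)` with the default `k=5` gives a 1024-entry dictionary per genome; other oligonucleotide lengths only require changing `k`.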
“…Next, it will be important to know the relationship between the strains isolated in clades and their subclusters and the causative mutations. When it comes to oligonucleotides as long as 15-mers, most are only present in one copy in the viral genome; therefore, changes in 15-mer sequences can be directly linked to mutations, and we have already started analysis from this perspective (Ikemura et al 2020).…”
Section: Conclusion and Perspectives (mentioning)
confidence: 99%
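
The reasoning in this statement (long oligonucleotides are mostly single-copy, so a changed 15-mer points to a specific mutation) can be sketched in Python. The helpers `single_copy_fraction` and `kmer_diff` are hypothetical names introduced here, and the toy sequences use k=3 so the result can be checked by hand; this is an illustration of the idea, not the cited authors' method.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def single_copy_fraction(seq, k):
    """Fraction of distinct k-mers that occur exactly once in `seq`."""
    counts = kmer_counts(seq, k)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


def kmer_diff(ref, alt, k):
    """Return (lost, gained): k-mers only in `ref` and k-mers only in `alt`."""
    ref_kmers = set(kmer_counts(ref, k))
    alt_kmers = set(kmer_counts(alt, k))
    return ref_kmers - alt_kmers, alt_kmers - ref_kmers


# Toy example: one substitution (T -> A at index 3) in a sequence whose
# 3-mers are all single-copy changes exactly the k windows covering it.
ref = "ACGTTGCA"
alt = "ACGATGCA"
lost, gained = kmer_diff(ref, alt, 3)
# lost   == {"CGT", "GTT", "TTG"}
# gained == {"CGA", "GAT", "ATG"}
```

For a genome of roughly 30 kb and k=15 there are 4^15 (about 1.07 billion) possible 15-mers, which is why most 15-mers would be expected to appear only once, and a single substitution away from the sequence ends typically replaces up to 15 of them.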
“…Investigation of these black points revealed that they frequently contained the O (Other) clade sequences that GISAID did not classify as known specific clades. We believe that significant numbers of O-clade sequences can be classified into known clades (Ikemura et al, 2020). In Fig.…”
Section: BLSOM of Short Oligonucleotides (mentioning)
confidence: 96%
“…Due to the current explosive increase in available sequences, we must develop new technologies that can grasp the whole picture of big-sequence data and support efficient data mining. We have recently analyzed time-series changes in short and long oligonucleotide compositions in a large number of SARS-CoV-2 genomes and found many oligonucleotides that are expanding rapidly in the virus population, which allowed us to predict candidate advantageous mutations for growth in human cells (Wada et al, 2020b; Ikemura et al, 2020). Furthermore, the oligonucleotide BLSOM can classify the virus sequences into not only the known clades but also their subgroups (Abe et al, 2021).…”
Section: Analyses of SARS-CoV-2 (mentioning)
confidence: 99%