2018
DOI: 10.26599/bdma.2018.9020018

Survey on encoding schemes for genomic data representation and feature learning—from signal processing to machine learning

Ning Yu,
Zhihua Li,
Zeng Yu

Abstract: Data-driven machine learning, especially deep learning technology, is becoming an important tool for handling big data issues in bioinformatics. In machine learning, DNA sequences are often converted to numerical values for data representation and feature learning in various applications. A similar conversion occurs in Genomic Signal Processing (GSP), where genome sequences are transformed into numerical sequences for signal extraction and recognition. This kind of conversion is also called an encoding scheme. The …

Citations: cited by 57 publications (9 citation statements)
References: 107 publications (185 reference statements)
“…It is assumed that proteins with more similar GO annotations are more functionally coherent [20]. We calculate and analyze such functional similarity by the fraction of aligned proteins that share same GO annotations.…”
Section: Results (mentioning)
confidence: 99%
“…Triplet Encoding [6,42] 64 codons are encoded by weights [44] Each base is encoded with the value of the base distance between the next itself and the same base.…”
Section: X=[agctaccgtg] (mentioning)
confidence: 99%
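
The second scheme in the statement above appears to encode each base by its distance to the next occurrence of the same base. The quoted wording is garbled, so this reading is an interpretation, not the cited authors' definition. A minimal Python sketch under that assumption follows; the function name `next_same_base_distance` and the `missing` placeholder value (0 when a base does not occur again) are hypothetical choices made for illustration.

```python
def next_same_base_distance(seq, missing=0):
    """Encode each base as the distance to the next occurrence of the same base.

    Assumption: a base with no later occurrence gets `missing` (0 here);
    the source does not say how this case is handled.
    """
    next_index = {}                  # base -> index of its nearest occurrence to the right
    distances = [missing] * len(seq)
    # Scan right to left so the nearest later occurrence is always known.
    for i in range(len(seq) - 1, -1, -1):
        base = seq[i]
        if base in next_index:
            distances[i] = next_index[base] - i
        next_index[base] = i
    return distances


# Example on the sequence named in the section label, X = [agctaccgtg].
print(next_same_base_distance("agctaccgtg"))  # [4, 6, 3, 5, 0, 1, 0, 2, 0, 0]
```

The right-to-left scan keeps the encoding linear in the sequence length. A different convention for the final occurrence of each base (for example, the distance to the end of the sequence) can be tried by changing only the `missing` handling.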
“…Encoding [49] 1000 Chaos Game Representation (CGR) [6,50] A: (0, 0), T: (1, 0), G: (1, 1), C: (0, 1)…”
Section: Snp Gin (mentioning)
confidence: 99%
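
The corner assignment quoted above (A: (0, 0), T: (1, 0), G: (1, 1), C: (0, 1)) is enough to sketch a Chaos Game Representation. The Python sketch below assumes the usual CGR iteration: start at the centre of the unit square and, for each base, move halfway towards that base's corner. The starting point (0.5, 0.5) and the function name `cgr_points` are assumptions, not details taken from the quoted statement.

```python
# Corner coordinates as listed in the quoted citation statement.
CORNERS = {"A": (0.0, 0.0), "T": (1.0, 0.0), "G": (1.0, 1.0), "C": (0.0, 1.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR trajectory of a DNA sequence as a list of (x, y) points.

    Assumption: each step moves halfway from the current point to the corner
    of the current base. Bases other than A, C, G, T raise KeyError.
    """
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(cgr_points("AG"))  # [(0.25, 0.25), (0.625, 0.625)]
```

Each point depends on the whole prefix read so far, so the final point alone identifies the sequence up to floating-point precision. For long sequences, the points are typically binned into a fixed grid to obtain a fixed-size feature matrix.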
“…Another popular class of alignment-free representation schemes is one that is based on information theory principles such as entropy, although numerous other representations do not fall in either of these two categories [11]. A comprehensive review of the recent numerical encoding schemes can be found in [28].…”
Section: Related Work (mentioning)
confidence: 99%