2012
DOI: 10.1186/1471-2105-13-97

Exploration of multivariate analysis in microbial coding sequence modeling

Abstract: Background: Gene finding is a complicated procedure that encapsulates algorithms for coding sequence modeling, identification of promoter regions, issues concerning overlapping genes and more. In the present study we focus on coding sequence modeling algorithms; that is, algorithms for identification and prediction of the actual coding sequences from genomic DNA. In this respect, we promote a novel multivariate method known as Canonical Powered Partial Least Squares (CPPLS) as an alternative to the commonly used…

Cited by 8 publications
(7 citation statements)
References 42 publications
“…Also, in models with more than one component, VIP might not give the correct relationship between the pattern of variables and response (Y). In the case of OPLS-DA, loading weights of the predictive component will always give a correct relationship; however, there are difficulties in defining a threshold based on loading weights [for review, see Mehmood et al. (2012)]. Thus, in the present study, a combination of VIP and the weights of the predictive component was used.…”
Section: Discussion (mentioning)
confidence: 99%
“…We have used the Partial Least Squares (PLS) method [24], which is one in a long list of supervised learning methods. PLS is well established and has been used in many bioinformatics applications, also for the analysis of sequence data [25, 26]. PLS is especially applicable when there are many correlated explanatory variables.…”
Section: Methods (mentioning)
confidence: 99%
“…This is a supervised learning method that has been used in many bioinformatics applications (e.g. [28–32]). A reason for the widespread use of PLS is that it is especially applicable when we have many correlated explanatory variables, which is typical for the present K-mer data, especially as K increases.…”
Section: Methods (mentioning)
confidence: 99%
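
The last statement notes that K-mer data yields many correlated explanatory variables, especially as K increases. The following is a minimal sketch of why, not code from any of the cited papers: the function name `kmer_counts` and the toy sequence are hypothetical, and only the four-letter DNA alphabet is assumed. It counts overlapping K-mers and reports how the number of feature columns grows as 4^K.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping K-mers in a DNA sequence.

    Returns one entry per possible K-mer over A, C, G, T (4**k entries),
    so every sequence maps to a feature vector of the same length.
    Windows containing other characters are ignored.
    """
    seq = seq.upper()
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: observed.get(kmer, 0) for kmer in all_kmers}


# Hypothetical toy sequence, for illustration only.
toy = "ATGGCGAAATTTGCGTAA"
for k in (1, 2, 3):
    counts = kmer_counts(toy, k)
    # Number of feature columns, and total count across all columns.
    print(k, len(counts), sum(counts.values()))
```

For a sequence of length L containing only A, C, G and T, the counts always sum to L - K + 1, so the columns are linearly dependent by construction, and the number of columns quadruples with each increment of K. This is the setting (many, mutually correlated predictors) in which the quoted statements describe PLS as especially applicable.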