Novel RNA motif design is of great practical importance for technology
and medicine. Increasingly, computational design plays an important
role in such efforts. Our coarse-grained RAG (RNA-As-Graphs) framework
offers strategies for enumerating the universe of RNA 2D folds, selecting
“RNA-like” candidates for design, and
determining sequences that fold onto these candidates. In RAG, RNA
secondary structures are represented as tree or dual graphs. Graphs
with known RNA structures are called “existing”, and
the others are labeled “hypothetical”. By using simplified
features for RNA graphs, we have clustered the hypothetical graphs
into “RNA-like” and “non-RNA-like” groups
and proposed RNA-like graphs as candidates for design. Here, we propose
a new way of designing graph features by using Fiedler vectors. The
new features reflect graph shapes better, and they lead to a more
clustered organization of existing graphs. We show significant increases
in K-means clustering accuracy by using the new features (e.g., up
to 95% and 98% accuracy for tree and dual graphs, respectively). In
addition, we propose a scoring model for top graph candidate selection.
This scoring model allows users to set a threshold for candidates,
and it incorporates weighting of existing graphs based on their corresponding
number of known RNAs. We include a list of top-scoring RNA-like candidates,
which we hope will stimulate future novel RNA design.
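
To make the central object concrete, the following minimal sketch computes the Fiedler vector of a small graph, that is, the eigenvector of the graph Laplacian L = D - A associated with its second-smallest eigenvalue. It is an illustration only: the function names (`laplacian`, `fiedler`) and the two example trees are assumptions made for this sketch, and it does not reproduce how the Fiedler vectors are converted into clustering features, which the abstract does not specify.

```python
import numpy as np


def laplacian(n_vertices, edges):
    """Return the Laplacian L = D - A of an undirected graph.

    `edges` is a list of (u, v) vertex-index pairs; repeated pairs are
    counted as multiple edges.
    """
    adjacency = np.zeros((n_vertices, n_vertices))
    for u, v in edges:
        adjacency[u, v] += 1.0
        adjacency[v, u] += 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def fiedler(n_vertices, edges):
    """Return (second-smallest eigenvalue, matching eigenvector) of L."""
    lap = laplacian(n_vertices, edges)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(lap)
    vector = eigenvectors[:, 1]
    # An eigenvector is defined only up to sign; fix the sign so that the
    # first nonzero component is positive, to make results comparable.
    nonzero = np.flatnonzero(np.abs(vector) > 1e-12)
    if nonzero.size and vector[nonzero[0]] < 0:
        vector = -vector
    return eigenvalues[1], vector


# Two hypothetical 4-vertex trees: a linear path and a star.
path_edges = [(0, 1), (1, 2), (2, 3)]
star_edges = [(0, 1), (0, 2), (0, 3)]

path_value, path_vector = fiedler(4, path_edges)
star_value, star_vector = fiedler(4, star_edges)

print("path:", path_value, path_vector)
print("star:", star_value, star_vector)
```

The two trees have the same number of vertices but different second-smallest eigenvalues (2 - sqrt(2) for the path, 1 for the star), and the path's Fiedler vector changes sign between its two halves. Note that for the star this eigenvalue is repeated, so its Fiedler vector is not unique; any feature built on it would need to account for that.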