Background: Proteins play a crucial role in life activities, such as catalyzing metabolic reactions, replicating DNA, and responding to stimuli. Identifying protein structures and functions is critical for both basic research and applications. Because traditional experiments for studying protein structures and functions are expensive and time consuming, computational approaches are highly desirable. The key to computational methods is how to efficiently extract features from protein sequences. During the last decade, many powerful feature extraction algorithms have been proposed, significantly advancing the study of protein structures and functions. Objective: To help researchers catch up with recent developments in this important field, this study gives an updated review focusing on sequence-based feature extraction from protein sequences. Method: The sequence-based features of proteins were grouped into three categories: composition-based features, autocorrelation-based features, and profile-based features. The features in each group were introduced in detail, and their advantages and disadvantages were discussed. In addition, some useful tools for generating these features were introduced. Results: Generally, autocorrelation-based features outperform composition-based features, and profile-based features outperform autocorrelation-based features. The reason is that profile-based features incorporate evolutionary information, which is useful for identifying protein structures and functions. However, profile-based features are more time consuming to compute, because a multiple sequence alignment is required. Conclusion: In this study, some recently proposed sequence-based features were introduced and discussed, such as basic k-mers, PseAAC, auto-cross covariance, and top-n-gram.
These features have made great contributions to the development of protein sequence analysis. Future studies can focus on exploring combinations of these features. In addition, techniques from other fields, such as signal processing, natural language processing (NLP), and image processing, could also contribute to this important field. In particular, natural languages (such as English) and protein sequences share some similarities: proteins can be treated as documents, and features such as k-mers, top-n-grams, and motifs can be treated as the words of the language. Techniques from these fields will offer new ideas and strategies for extracting features from proteins.
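The "protein as document, k-mers as words" analogy can be illustrated with a minimal sketch of a composition-based feature. The function name `kmer_composition`, the restriction to the 20 standard amino acids, and the normalization to relative frequencies are assumptions made for illustration; they are not taken from the reviewed paper or from any specific tool it covers.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids form the "alphabet".
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=2):
    """Return a fixed-length vector of relative k-mer frequencies.

    The vector has 20**k entries, ordered lexicographically over
    AMINO_ACIDS. K-mers containing non-standard residues are ignored.
    """
    sequence = sequence.upper()
    # Every possible "word" of length k over the amino-acid alphabet.
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Slide a window of width k along the sequence and count each word.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Normalize by the number of counted in-vocabulary k-mers.
    total = sum(counts[word] for word in vocabulary)
    return [counts[word] / total if total else 0.0 for word in vocabulary]


if __name__ == "__main__":
    features = kmer_composition("MKVLAAGIVGLL", k=2)
    print(len(features))  # one entry per possible dipeptide
```

With k = 1 this reduces to plain amino acid composition. Larger k captures more local order but the vector grows as 20^k, which is one reason such features are often combined with or replaced by autocorrelation-based and profile-based descriptors.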
This paper reviews studies on neural networks in aerodynamic data modeling. We first analyze the shortcomings of computational fluid dynamics (CFD) and traditional reduced-order models (ROMs). Subsequently, the history and fundamental methodologies of neural networks are introduced. Furthermore, we classify the neural-network-based studies in aerodynamic data modeling and compare them. These studies demonstrate that neural networks are effective approaches to aerodynamic data modeling. Finally, we identify three important trends for future studies in aerodynamic data modeling: a) the transformation method and physics-informed models will be combined to solve high-dimensional partial differential equations; b) in steady aerodynamic response prediction, model-oriented studies and data-integration-oriented studies will become the future research directions, while in unsteady aerodynamic response prediction, radial basis function neural networks (RBFNNs) are the best tools for capturing the nonlinear characteristics of flow data, and convolutional neural networks (CNNs) are expected to replace long short-term memory networks (LSTMs) for capturing the temporal characteristics of flow data; and c) in steady or unsteady flow field reconstruction, CNN-based conditional generative adversarial networks (cGANs) will be the best frameworks in which to discover the spatiotemporal distribution of flow field data.
INDEX TERMS Aerodynamics, convolutional neural networks, neural networks, generative adversarial networks, recurrent neural networks.