Here we report the development of a new neural-network-based approach for rapid quantification of protein secondary structure from Fourier transform infrared (FTIR) spectra of proteins. A technique that reduces the amount of spectral data by almost 90% is suggested to speed up the neural network analysis. Additionally, an automatic procedure is introduced for selecting only those regions within the amide I band of protein FTIR spectra that can be best related to secondary structure content by subsequent neural network analysis. Based on a given reference set of FTIR spectra from proteins with known secondary structure, a subset of merely 29 out of 101 amide I absorbance values was identified, which led to improved prediction accuracy. The average prediction error achieved for helix, sheet, turn, bend, and other is 4.96%, which is better than that achieved by previously reported alternative methods, indicating the significant potential of this approach. Our suggested automatic amide I frequency selection procedure may be easily extended to identify promising regions in spectral data recorded by other spectroscopic techniques, for example circular dichroism spectroscopy.
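The abstract does not say how the selection procedure scores candidate amide I points, so the following is only a minimal sketch of the general idea of keeping a subset of spectral points. It assumes a simple correlation filter as a stand-in, not the authors' neural-network-driven procedure. The function names `pearson` and `select_points` and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_points(spectra, targets, k):
    """Return the sorted indices of the k spectral points whose absorbance
    correlates most strongly (in absolute value) with the target values.

    ASSUMPTION: a correlation filter stands in for the unspecified
    selection criterion used in the original work.
    """
    n_points = len(spectra[0])
    scored = []
    for j in range(n_points):
        column = [spectrum[j] for spectrum in spectra]
        scored.append((abs(pearson(column, targets)), j))
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return sorted(j for _, j in scored[:k])


# Toy data (made up): five "spectra" with four points each.
# Point 1 rises with the helix content, point 3 falls with it,
# and points 0 and 2 are uncorrelated with it.
helix_percent = [10.0, 20.0, 30.0, 40.0, 50.0]
noise_a = [3.0, 1.0, 2.0, 1.0, 3.0]
noise_b = [2.0, 3.0, 1.0, 3.0, 2.0]
spectra = [
    [noise_a[i], h / 100.0, noise_b[i], 1.0 - h / 100.0]
    for i, h in enumerate(helix_percent)
]

selected = select_points(spectra, helix_percent, 2)
```

With a real reference set, the analogous call would be `select_points(reference_spectra, helix_percent, 29)` to keep 29 of 101 points, where `reference_spectra` is a hypothetical list of 101-point amide I absorbance vectors.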
Fourier transform infrared (FTIR) spectroscopy is an attractive tool for proteomics research, as it can be used to rapidly characterize protein secondary structure in aqueous solution. Recently, a number of secondary structure prediction methods have been suggested that are based on reference sets of FTIR spectra from proteins whose structure is known from X-ray crystallography. These prediction methods, often referred to as pattern-recognition-based approaches, demonstrated good prediction accuracy as judged by an error measure such as the standard error of prediction (SEP). However, to avoid possible adverse effects from differences in recording, the analysis has mostly been based on reference sets of FTIR spectra recorded in one laboratory only. As a result, these studies were based on reference sets of FTIR spectra from a limited number of proteins. Pattern-recognition-based approaches, however, rely on reference sets of FTIR spectra from as many proteins as possible, so that the full range of band shape variation can be related to the diversity of protein structural classes. Hence, if we want to build reliable pattern-recognition-based systems that support proteomics research and are capable of making good predictions from spectral data of any unknown protein, one common goal should be to build a comprehensive protein infrared spectra databank (PISD) containing FTIR spectra of proteins of known structure. We have started developing a comprehensive PISD composed of spectra recorded in different laboratories. As part of this work, we investigate here how the prediction accuracy of a neural network analysis is affected when the reference sets are composed of FTIR spectra from different laboratories.
The surprisingly low magnitude of the differences in SEPs throughout all our experiments suggests that FTIR spectra recorded in different laboratories may be safely combined into one reference set, with only minor deterioration of prediction accuracy in the worst case.
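The abstract does not define the SEP, and definitions vary between authors (some divide by N - 1 or subtract the mean bias first). As a rough illustration only, the sketch below assumes the plain root-mean-square form over N proteins. The function name `sep` and the numbers are made up for the example.

```python
import math


def sep(predicted, reference):
    """Standard error of prediction, ASSUMED here to be the root-mean-square
    difference between predicted and reference structure contents (percent)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Made-up helix contents (percent) for four proteins.
reference_helix = [30.0, 40.0, 50.0, 60.0]
predicted_helix = [33.0, 36.0, 50.0, 60.0]

# Squared errors are 9 + 16 + 0 + 0 = 25; 25 / 4 = 6.25; sqrt(6.25) = 2.5
helix_sep = sep(predicted_helix, reference_helix)
```

Comparing a single-laboratory reference set with a pooled one then amounts to computing this value for each set and looking at the difference.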
The lack of reliable methods for accurate estimation of protein secondary structure from infrared spectra of proteins is a major barrier to the widespread use of infrared spectroscopy in protein secondary structure characterisation. Here we report a method for protein secondary structure estimation from FTIR spectra of proteins, based on a multi-layer feed-forward neural network approach using an enhanced "resilient backpropagation" learning algorithm. The method utilises a database consisting of infrared spectra of 18 proteins with known X-ray structure as the reference set. Our study revealed that the best overall results were obtained by providing the neural network analysis with only part of the amide I region, taken from empirically determined structure-sensitive regions, in combination with appropriate pre-processing of the spectral data. This led to a standard error of prediction (SEP) of 4.47% for α-helix, an SEP of 6.16% for β-sheet, and an SEP of 4.61% for turns. Compared to a previous factor analysis study by Lee et al. using the same set of 18 FTIR spectra of proteins, the prediction error for α-helix and β-sheet was reduced by 3.33% and 3.54%, respectively, with a minor increase of 0.31% for turns. Generally, our neural network analysis achieved comparable, and in most cases even better, prediction accuracy than most of the previously reported alternative pattern-recognition-based methods, indicating the significant potential of this approach.
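The "enhanced" variant of resilient backpropagation used in the study is not described in the abstract. To show the basic idea of the algorithm family, the sketch below assumes a simple sign-based Rprop update that skips the weight change after a gradient sign flip. It is applied to a toy quadratic function rather than to a neural network. All names and parameter defaults here are illustrative choices, not the authors' settings.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad, w, n_iter=200, step0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
    """Minimise a function from its gradient using per-weight step sizes.

    Only the SIGN of each partial derivative is used. A step size grows
    while the sign stays the same and shrinks when the sign flips.
    ASSUMPTION: a basic variant without weight backtracking.
    """
    w = list(w)
    steps = [step0] * len(w)
    prev_g = [0.0] * len(w)
    for _ in range(n_iter):
        g = list(grad(w))
        for i in range(len(w)):
            agreement = g[i] * prev_g[i]
            if agreement > 0:
                # Same direction as before: accelerate.
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif agreement < 0:
                # Overshot a minimum: shrink the step and skip this update.
                steps[i] = max(steps[i] * eta_minus, step_min)
                g[i] = 0.0
            w[i] -= sign(g[i]) * steps[i]
            prev_g[i] = g[i]
    return w


def toy_gradient(w):
    """Gradient of f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2, minimum at (3, -1)."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


w_min = rprop_minimize(toy_gradient, [0.0, 0.0])
```

Because the update ignores gradient magnitude, the two coordinates converge at a similar rate even though their curvatures differ by a factor of ten. This insensitivity to gradient scale is the usual motivation for Rprop-style training.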