Abstract. Gene set enrichment analysis is a widely used tool for analyzing gene expression data. However, current implementations are slow because the analysis requires a large number of samples to achieve good statistical power. In this paper we present a novel algorithm that efficiently reuses one sample multiple times and thus speeds up the analysis. We show that it is possible to perform hundreds of thousands of permutations in a few minutes, which leads to very accurate p-values. This, in turn, allows applying standard FDR correction procedures, which are more accurate than the ones currently used. The method is implemented as an R package and is freely available at https://github.com/ctlab/fgsea.
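To make the link between permutation counts, p-value accuracy and FDR correction concrete, the following is a minimal Python sketch, not the fgsea algorithm itself. It assumes a simplified statistic (the mean gene-level score of a set) in place of the enrichment score, and the function names `permutation_p_value` and `benjamini_hochberg` are hypothetical. It illustrates that the smallest attainable empirical p-value is 1/(n_perm + 1), which is why many permutations are needed before a standard procedure such as Benjamini-Hochberg becomes useful.

```python
import random


def permutation_p_value(gene_stats, gene_set, n_perm=1000, seed=0):
    """Empirical p-value of a gene set against random sets of equal size.

    Assumption: the statistic is the mean gene-level score, a simple
    stand-in for the enrichment score. This is NOT the fgsea algorithm.
    """
    rng = random.Random(seed)
    genes = list(gene_stats)
    k = len(gene_set)
    observed = sum(gene_stats[g] for g in gene_set) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        if sum(gene_stats[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With `n_perm=1000` no p-value can fall below roughly 0.001, so a set tested alongside thousands of others can never survive the adjustment; raising the permutation count lowers that floor.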
A new computational approach for the efficient docking of flexible ligands in a rigid protein is presented. It exploits the binding modes of functional groups determined by an exhaustive search with solvation. The search in ligand conformational space is performed by a genetic algorithm whose scoring function approximates steric effects and intermolecular hydrogen bonds. Ligand conformations generated by the genetic algorithm are docked in the protein binding site by optimizing the fit of their fragments to optimal positions of chemically related functional groups. We show that the use of optimal binding modes of molecular fragments makes it possible to dock known inhibitors with about ten rotatable bonds in the active site of the uncomplexed and complexed conformations of thrombin and HIV-1 protease.
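The conformational search described above can be illustrated with a generic genetic algorithm over torsion angles. This is only a sketch under stated assumptions: `toy_score` is a hypothetical stand-in for the scoring function (the abstract's steric and hydrogen-bond terms are not reproduced), and the selection, crossover and mutation choices are illustrative rather than those of the published method.

```python
import math
import random


def toy_score(torsions):
    """Hypothetical stand-in for a docking score (lower is better)."""
    # Assumption: the best conformation has every torsion at 60 degrees.
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)


def genetic_search(n_torsions, score, pop_size=30, generations=40, seed=0):
    """Minimise `score` over torsion-angle vectors (needs n_torsions >= 2)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        # Truncation selection: the better half survives unchanged.
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_torsions)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_torsions)        # mutate a single torsion
            # Perturb the angle and wrap it back into [-180, 180).
            child[i] = (child[i] + rng.gauss(0.0, 20.0) + 180.0) % 360.0 - 180.0
            children.append(child)
        population = parents + children
    return min(population, key=score)
```

Because the surviving parents are carried over unchanged, the best score never gets worse from one generation to the next; a ligand with about ten rotatable bonds would correspond to `n_torsions=10` here.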
The diversity of experimental workflows involving LC-MS/MS and the extended range of mass spectrometers tend to produce extremely variable spectra. This variability reduces the accuracy of compound identification by commonly available spectral library search software. We introduce here a new algorithm that successfully matches MS/MS spectra generated by a range of instruments and acquired under different conditions. Our algorithm, called X-Rank, first sorts the peak intensities of a spectrum and then establishes a correlation between two sorted spectra. X-Rank then computes the probability that a rank from an experimental spectrum matches a rank from a reference library spectrum. In a training step, characteristic parameter values are generated for a given data set. We compared the efficiency of the X-Rank algorithm with the dot-product algorithm implemented by MS Search from the National Institute of Standards and Technology (NIST) on two test sets produced with different instruments. Overall, the X-Rank algorithm accurately discriminates correct from wrong matches and detects more correct substances than MS Search. Furthermore, X-Rank could correctly identify and top rank eight chemical compounds in a commercially available test mix. This confirms the ability of the algorithm to perform both a straight single-platform identification and a cross-platform library search in comparison to other tools. It also opens the possibility for efficient general unknown screening (GUS) against large compound libraries.
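The idea of comparing spectra by intensity rank rather than raw intensity can be sketched as follows. This is an illustration under stated assumptions, not the X-Rank implementation: spectra are assumed to be lists of `(mz, intensity)` pairs with distinct m/z values, the m/z tolerance `tol` is arbitrary, and `toy_rank_score` is a hypothetical placeholder for the trained rank-match probabilities that the abstract describes.

```python
def intensity_ranks(spectrum):
    """Map each m/z to its intensity rank (1 = most intense peak)."""
    ordered = sorted(spectrum, key=lambda peak: peak[1], reverse=True)
    return {mz: rank for rank, (mz, _) in enumerate(ordered, start=1)}


def matched_rank_pairs(query, reference, tol=0.5):
    """Pair the ranks of peaks whose m/z values agree within `tol`."""
    q_ranks = intensity_ranks(query)
    r_ranks = intensity_ranks(reference)
    if not r_ranks:
        return []
    pairs = []
    for q_mz, q_rank in q_ranks.items():
        # Closest reference peak, accepted only if within the tolerance.
        r_mz = min(r_ranks, key=lambda mz: abs(mz - q_mz))
        if abs(r_mz - q_mz) <= tol:
            pairs.append((q_rank, r_ranks[r_mz]))
    return sorted(pairs)


def toy_rank_score(pairs):
    """Hypothetical score: larger when matched peaks have similar ranks."""
    return sum(1.0 / (1 + abs(q - r)) for q, r in pairs)


query = [(100.0, 50.0), (150.1, 999.0), (200.0, 10.0)]
reference = [(100.1, 300.0), (150.0, 800.0), (250.0, 500.0)]
pairs = matched_rank_pairs(query, reference)
```

Working on ranks makes the comparison insensitive to the absolute intensity scale, which is one plausible reason a rank-based method copes with spectra from different instruments.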
We present an integrated proteomics platform designed for performing differential analyses. Since reproducible results are essential for comparative studies, we explain how we improved reproducibility at every step of our laboratory processes, e.g. by taking advantage of the powerful laboratory information management system we developed. The differential capacity of our platform is validated by detecting known markers in a real sample and by a spiking experiment. We introduce an innovative two-dimensional (2-D) plot for displaying identification results combined with chromatographic data. This 2-D plot is very convenient for detecting differential proteins. We also adapt standard multivariate statistical techniques to show that peptide identification scores can be used for reliable and sensitive differential studies. The interest of the protein separation approach we generally apply is justified by numerous statistics, complemented by a comparison with a simple shotgun analysis performed on a small volume sample. By introducing an automatic integration step after mass spectrometry data identification, we are able to search numerous databases systematically, including the human genome and expressed sequence tags. Finally, we explain how rigorous data processing can be combined with the work of human experts to set high quality standards, and hence obtain reliable (false positive < 0.35%) and nonredundant protein identifications.
Program to engineer peptides (PEP) is a build-up approach for ligand docking and design with implicit solvation. It requires the knowledge of a seed from which it iteratively grows polymeric ligands consisting of any type of amino acid, i.e., natural and/or nonnatural from a user-defined library. At every growing step, a genetic algorithm is used for conformational optimization of the last added monomer in the rigid binding site. Pruning is performed at every growing step by selecting sequences according to binding energy with electrostatic solvation. PEP is applied to three members of the caspase family of cysteine proteases using Asp at P1 as seed. The optimal P4-P2 peptide recognition motifs and variants thereof are docked correctly in the active site (backbone root-mean-square deviation < 0.9 Å). Moreover, for each caspase, the P4-P2 sequences of potent aldehyde inhibitors are ranked among the 15 hits with the most favorable PEP energy.
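The grow-and-prune loop described above resembles a beam search, which can be sketched as follows. This is a simplified illustration, not the PEP program: the per-step conformational optimization by a genetic algorithm is omitted, `toy_energy` and its per-monomer values are hypothetical, and sequences are stored in growth order (seed first).

```python
def grow_peptides(seed, library, energy, n_steps, keep=3):
    """Grow sequences from `seed`, keeping the `keep` lowest-energy ones per step."""
    beam = [tuple(seed)]
    for _ in range(n_steps):
        # Extend every surviving sequence by each monomer in the library.
        candidates = [seq + (monomer,) for seq in beam for monomer in library]
        # Pruning: retain only the most favorable (lowest-energy) candidates.
        beam = sorted(candidates, key=energy)[:keep]
    return beam


# Hypothetical per-monomer contributions; a real energy would depend on
# the docked conformation and include solvation.
TOY_CONTRIBUTION = {"A": -0.5, "V": -1.5, "E": -2.0, "D": -1.0}


def toy_energy(sequence):
    """Assumed additive energy, used only to make the sketch runnable."""
    return sum(TOY_CONTRIBUTION[m] for m in sequence)


# Grow three positions from an Asp seed, as in a P1 -> P4 build-up.
hits = grow_peptides(("D",), ["A", "V", "E", "D"], toy_energy, n_steps=3, keep=2)
```

The `keep` parameter bounds the work per step at `keep * len(library)` evaluations, instead of the exponentially growing number of sequences an exhaustive build-up would require.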