As modern biotechnologies advance, it has become increasingly common to collect different modalities of high-dimensional molecular data (termed “omics” data in this paper), such as gene expression, methylation, and copy number, from the same patient cohort to predict clinical outcome. While prediction based on omics data has been widely studied in the last fifteen years, the statistical literature has paid little attention to integrating multiple omics modalities in order to select a subset of variables for prediction, which is a critical task in personalized medicine. In this paper, we propose a simple penalized regression method that addresses this problem by assigning different penalty factors to different data modalities for feature selection and prediction. The penalty factors can be chosen in a fully data-driven fashion by cross-validation or by taking practical considerations into account. In simulation studies, we compare the prediction performance of our approach, called IPF-LASSO (Integrative LASSO with Penalty Factors) and implemented in an R package, with that of the standard LASSO and the sparse group LASSO. The use of IPF-LASSO is also illustrated through applications to two real-life cancer datasets. All data and code are available on the companion website to ensure reproducibility.
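The idea of modality-specific penalty factors can be illustrated with a small sketch. This is not the authors' R implementation. It assumes that each penalty factor simply rescales the L1 penalty of every feature in its modality, and it fits the model by plain coordinate descent without intercept or standardization. The function names, the simulated data, and the values of `lam` and `penalty_factors` are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, lam, penalty_factors, n_iter=200):
    """Coordinate descent for
    (1 / (2n)) * ||y - X b||^2 + lam * sum_j penalty_factors[j] * |b_j|.
    Assumption: features of one modality share the same penalty factor.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * penalty_factors[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical data: features 0-2 form modality A, features 3-5 modality B.
random.seed(0)
n_samples = 50
X = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(n_samples)]
y = [2.0 * row[0] + 2.0 * row[3] + 0.1 * random.gauss(0.0, 1.0) for row in X]

# Equal penalty factors reduce to the standard LASSO.
beta_equal = weighted_lasso(X, y, lam=0.1, penalty_factors=[1.0] * 6)
# A very large factor on modality B effectively excludes it from the model.
beta_ipf = weighted_lasso(X, y, lam=0.1,
                          penalty_factors=[1.0] * 3 + [1000.0] * 3)
```

The factor of 1000 is deliberately extreme so that the effect is visible. In line with the abstract, the factors would in practice be chosen by cross-validation or from practical considerations such as the cost of measuring a modality.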
Formation of the mammalian primitive streak appears to rely on cell proliferation to a minor extent only, but compensating cell movements have not yet been directly observed. This study analyses individual cell migration and proliferation simultaneously, using multiphoton and differential interference contrast time-lapse microscopy of late pregastrulation rabbit blastocysts. Epiblast cells in the posterior gastrula extension area accumulated medially and displayed complex planar movements including U-turns and a novel type of processional cell movement. In the same area metaphase plates tended to be aligned parallel to the anterior-posterior axis, and statistical analysis showed that rotations of metaphase plates causing preferred orientation were near-complete 8 min before anaphase onset; in some cases, rotations were strikingly rapid, achieving up to 45° per min. The mammalian primitive streak appears to be formed initially with its typically minimal anteroposterior elongation by a combination of oriented cell divisions with dedicated planar cell movements. Developmental Dynamics 240:1905-1916
It has been shown previously that N-glycosylation of Asn-144 and/or Asn-627 is important for functional expression of neutral endopeptidase-24.11 (NEP). All glycosylation sites of NEP are conserved within endothelin-converting enzyme-1 (ECE-1). In the present study we investigated the importance of proper glycosylation for the biologic function of ECE-1. We show that the double mutation of Asn-632 and Asn-651 leads to expression of an enzymatically inactive ECE-1 protein. In contrast, the single mutation of either Asn-632 or Asn-651 did not alter the enzymatic activity of ECE-1b.
Background: Reconstruction of protein-protein interaction or metabolic networks based on expression data often involves in silico predictions, while on the other hand, there are unspecific networks of in vivo interactions derived from knowledge bases. We analyze networks designed to come as close as possible to data measured in vivo, both with respect to the set of nodes which were taken to be expressed in experiment as well as with respect to the interactions between them which were taken from manually curated databases. Results: A signaling network derived from the TRANSPATH database and a metabolic network derived from KEGG LIGAND are each filtered onto expression data from breast cancer (SAGE) considering different levels of restrictiveness in edge and vertex selection. We perform several validation steps, in particular we define pathway over-representation tests based on refined null models to recover functional modules. The prominent role of the spindle checkpoint-related pathways in breast cancer is exhibited. High-ranking key nodes cluster in functional groups retrieved from literature. Results are consistent between several functional and topological analyses and between signaling and metabolic aspects. Conclusions: This construction involved as a crucial step the passage to a mammalian protein identifier format as well as to a reaction-based semantics of metabolism. This yielded good connectivity but also led to the need to perform benchmark tests to exclude loss of essential information. Such validation, albeit tedious due to limitations of existing methods, turned out to be informative, and in particular provided biological insights as well as information on the degrees of coherence of the networks despite fragmentation of experimental data. Key node analysis exploited the networks for potentially interesting proteins in view of drug target prediction.
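The refined null models behind the pathway over-representation tests are not described in this abstract. As a point of comparison only, the sketch below shows the standard hypergeometric over-representation test, which is a stand-in and not the authors' method. The function name and the example counts are hypothetical.

```python
from math import comb


def overrep_pvalue(population, pathway_size, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population   -- number of nodes in the whole network
    pathway_size -- number of those nodes annotated to the pathway
    selected     -- number of nodes in the module being tested
    overlap      -- number of module nodes that belong to the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 50 module nodes fall in a 40-node pathway,
# within a network of 1000 nodes (2 would be expected by chance).
p_example = overrep_pvalue(1000, 40, 50, 8)
```

This baseline treats all nodes as exchangeable and ignores network topology, which is presumably the kind of simplification a refined null model is meant to avoid.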