In predictive model development, gene expression data is associated with the unique challenge that the number of samples (n) is much smaller than the amount of features (p). This “n ≪ p” property has prevented classification of gene expression data from deep learning techniques, which have been proved powerful under “n > p” scenarios in other application fields, such as image classification. Further, the sparsity of effective features with unknown correlation structures in gene expression profiles brings more challenges for classification tasks. To tackle these problems, we propose a newly developed classifier named Forest Deep Neural Network (fDNN), to integrate the deep neural network architecture with a supervised forest feature detector. Using this built-in feature detector, the method is able to learn sparse feature representations and feed the representations into a neural network to mitigate the overfitting problem. Simulation experiments and real data analyses using two RNA-seq expression datasets are conducted to evaluate fDNN’s capability. The method is demonstrated a useful addition to current predictive models with better classification performance and more meaningful selected features compared to ordinary random forests and deep neural networks.
The beta distribution is a versatile function that accommodates a broad range of probability distribution shapes. Beta regression based on the beta distribution can be used to model a response variable y that takes values in open unit interval (0, 1). Zero/one inflated beta (ZOIB) regression models can be applied when y takes values from closed unit interval [0, 1]. The ZOIB model is based a piecewise distribution that accounts for the probability mass at 0 and 1, in addition to the probability density within (0, 1). This paper introduces an R packagezoib that provides Bayesian inferences for a class of ZOIB models. The statistical methodology underlying the zoib package is discussed, the functions covered by the package are outlined, and the usage of the package is illustrated with three examples of different data and model types. The package is comprehensive and versatile in that it can model data with or without inflation at 0 or 1, accommodate clustered and correlated data via latent variables, perform penalized regression as needed, and allow for model comparison via the computation of the DIC criterion.
In a phase 1 randomized, single-center clinical trial, inactivated influenza virus vaccine delivered through dissolvable microneedle patches (MNPs) was found to be safe and immunogenic. Here, we compare the humoral and cellular immunologic responses in a subset of participants receiving influenza vaccination by MNP to the intramuscular (IM) route of administration. We collected serum, plasma, and peripheral blood mononuclear cells in 22 participants up to 180 days post-vaccination. Hemagglutination inhibition (HAI) titers and antibody avidity were similar after MNP and IM vaccination, even though MNP vaccination used a lower antigen dose. MNPs generated higher neuraminidase inhibition (NAI) titers for all three influenza virus vaccine strains tested and triggered a larger percentage of circulating T follicular helper cells (CD4 + CXCR5 + CXCR3 + ICOS + PD-1+) compared to the IM route. Our study indicates that inactivated influenza virus vaccination by MNP produces humoral and cellular immune response that are similar or greater than IM vaccination.
A unique challenge in predictive model building for omics data has been the small number of samples (n) versus the large amount of features (p). This "n p" property brings difficulties for disease outcome classification using deep learning techniques. Sparse learning by incorporating external gene network information such as the graph-embedded deep feedforward network (GEDFN) [19] model has been a solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful on classification and feature selection. To address this limitation and develop a robust classification model without relying on external knowledge, we propose a forest graph-embedded deep feedforward network (forgeNet) model, to integrate the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and specifically constructed for a given prediction task. To validate the method's capability, we experimented the forgeNet model with both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.