Type 1 Diabetes (T1D) is an autoimmune disease in which immune cells destroy insulin-producing beta cells. The etiology of this complex disease is dependent on the interplay of multiple heterogeneous cell types in the pancreatic environment. Here, we provide a single-cell atlas of pancreatic islets of 24 T1D, autoantibody-positive, and non-diabetic organ donors across multiple quantitative modalities including ~80,000 cells using single-cell transcriptomics, ~7,000,000 cells using cytometry by time-of-flight, and ~1,000,000 cells using in situ imaging mass cytometry. We develop an advanced integrative analytical strategy to assess pancreatic islets and identify canonical cell types. We show that a subset of exocrine ductal cells acquires a signature of tolerogenic dendritic cells in an apparent attempt at immune suppression in T1D donors. Our multimodal analyses delineate cell types and processes that may contribute to T1D immunopathogenesis and provide an integrative procedure for exploration and discovery of human pancreas function.
In high-dimensional data, the performance of a classifier depends largely on the selection of important features. Most individual classifiers combined with existing feature selection (FS) methods do not perform well on highly correlated data. Obtaining important features with an FS method and then selecting the best-performing classifier is a challenging task in high-throughput data. In this article, we propose a combination of resampling-based least absolute shrinkage and selection operator (LASSO) feature selection (RLFS) and ensembles of regularized regression models (ERRM) that is capable of handling data with high correlation structures. The ERRM boosts prediction accuracy using the top-ranked features obtained from RLFS. The RLFS uses the LASSO penalty together with the sure independence screening (SIS) condition to select the top k ranked features. The ERRM includes five individual penalty-based classifiers: LASSO, adaptive LASSO (ALASSO), elastic net (ENET), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). It is built on the ideas of bagging and rank aggregation. In simulation studies and in an application to smokers’ cancer gene expression data, we demonstrate that the proposed combination of ERRM with RLFS achieves superior performance in terms of accuracy and geometric mean.
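The resampling idea behind RLFS can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes bootstrap resampling, a single fixed penalty `lam`, and a ranking by how often a feature receives a nonzero LASSO coefficient, with ties broken by mean absolute coefficient. The function names (`fit_logistic_lasso`, `rlfs_rank`), the simple proximal-gradient solver, and the simulated data are all hypothetical choices made for illustration, and the cutoff k is left to the caller because the abstract does not spell out how it is set.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of the L1 norm (shrinks values toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fit_logistic_lasso(X, y, lam, n_iter=300):
    """L1-penalized logistic regression fitted by proximal gradient descent.

    Illustrative solver only; the intercept is left unpenalized.
    """
    n, p = X.shape
    beta = np.zeros(p)
    intercept = 0.0
    # Step size 1/L, where L bounds the curvature of the mean logistic loss.
    X1 = np.hstack([X, np.ones((n, 1))])
    step = 4.0 * n / (np.linalg.norm(X1, 2) ** 2)
    for _ in range(n_iter):
        eta = np.clip(X @ beta + intercept, -30.0, 30.0)
        resid = 1.0 / (1.0 + np.exp(-eta)) - y
        beta = soft_threshold(beta - step * (X.T @ resid) / n, step * lam)
        intercept -= step * resid.mean()
    return beta


def rlfs_rank(X, y, lam=0.05, n_resamples=30, seed=0):
    """Rank features by selection frequency across bootstrap LASSO fits."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns
    selected = np.zeros(p)
    magnitude = np.zeros(p)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # bootstrap sample (assumed scheme)
        beta = fit_logistic_lasso(X[idx], y[idx], lam)
        selected += beta != 0
        magnitude += np.abs(beta)
    freq = selected / n_resamples
    mean_abs = magnitude / n_resamples
    # lexsort uses the last key as primary: frequency first, then magnitude.
    order = np.lexsort((-mean_abs, -freq))
    return order, freq


# Simulated example: only the first three of 20 features carry signal.
data_rng = np.random.default_rng(42)
n, p = 300, 20
X = data_rng.normal(size=(n, p))
logits = 2.0 * (X[:, 0] + X[:, 1] + X[:, 2])
y = (data_rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

order, freq = rlfs_rank(X, y)
top_k = order[:3]  # k = 3 chosen by hand for this toy example
print("top-ranked features:", sorted(top_k.tolist()))
```

The simulated features here are independent, so the sketch does not reproduce the highly correlated setting the article targets; it only shows how resampling turns a single LASSO fit into a feature ranking that a downstream ensemble could consume.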
Type 1 and Type 2 diabetes are distinct genetic diseases of the pancreas that are defined by abnormal blood glucose levels. Understanding the initial molecular perturbations that occur during the pathogenesis of diabetes is of critical importance in understanding these disorders. The inability to biopsy the human pancreas of living donors hampers insights into early detection, as the majority of diabetes studies have been performed on peripheral leukocytes from the blood, which is not the site of pathogenesis. Therefore, efforts have been made by various teams, including the Human Pancreas Analysis Program (HPAP), to collect pancreatic tissues from deceased organ donors with different clinical phenotypes. HPAP is designed to define the molecular pathogenesis of islet dysfunction by generating detailed datasets of functional, cellular, and molecular information in pancreatic tissues of clinically well-defined organ donors with Type 1 and Type 2 diabetes. Moreover, data generated by HPAP continuously become available through a centralized database, PANC-DB, thus enabling the diabetes research community to access these multi-dimensional data pre-publication. Here, we present the computational workflow for single-cell RNA-seq data analysis of 258,379 high-quality cells from the pancreatic islets of 67 human donors generated by HPAP, the largest existing scRNA-seq dataset of human pancreatic tissues. We report various computational steps, including preprocessing, doublet removal, clustering, and cell type annotation, across single-cell RNA-seq data from islets of four distinct classes of organ donors, i.e., non-diabetic control, autoantibody-positive but normoglycemic, Type 1 diabetic, and Type 2 diabetic individuals. Moreover, we present an interactive tool, CellxGene, developed by the Chan Zuckerberg Initiative, to navigate these high-dimensional datasets. Our data and interactive tools provide a reliable reference for single-cell pancreatic islet biology studies, especially diabetes-related conditions.
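Two of the steps named above, preprocessing and cell type annotation, can be sketched on a toy count matrix. This is a simplified illustration, not the HPAP workflow: the marker panel (GCG, INS, SST, PPY, KRT19, PRSS1) uses widely known pancreatic marker genes rather than a panel taken from the source, the depth threshold `min_counts` and the function names are hypothetical, labels are assigned per cell rather than per cluster, and doublet removal and clustering are omitted.

```python
import numpy as np

# Widely used marker genes for pancreatic cell types (assumed panel; the
# source does not list the markers used in its annotation step).
MARKERS = {
    "alpha": "GCG",
    "beta": "INS",
    "delta": "SST",
    "gamma": "PPY",
    "ductal": "KRT19",
    "acinar": "PRSS1",
}


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to the same total count, then apply log(1 + x)."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * target_sum)


def annotate_by_marker(counts, gene_names, min_counts=500):
    """Drop low-depth cells and label the rest by their strongest marker."""
    counts = np.asarray(counts, dtype=float)
    keep = counts.sum(axis=1) >= min_counts  # crude quality filter
    logn = normalize_log1p(counts[keep])
    col = {gene: i for i, gene in enumerate(gene_names)}
    types = list(MARKERS)
    marker_expr = np.column_stack([logn[:, col[MARKERS[t]]] for t in types])
    labels = np.array(types)[marker_expr.argmax(axis=1)]
    return keep, labels


# Toy data: rows are cells, columns are genes (values are made up).
genes = ["GCG", "INS", "SST", "PPY", "KRT19", "PRSS1", "ACTB"]
counts = [
    [900, 10, 5, 0, 2, 1, 100],    # glucagon-dominated cell
    [15, 1200, 8, 1, 0, 3, 150],   # insulin-dominated cell
    [5, 20, 0, 0, 1, 0, 24],       # low-depth cell, removed by the filter
    [3, 30, 1, 0, 400, 12, 300],   # KRT19-dominated cell
]

keep, labels = annotate_by_marker(counts, genes)
print(keep.tolist(), labels.tolist())
```

In practice, annotation is usually performed on clusters rather than on individual cells, partly because highly abundant transcripts such as INS can appear as background in many cell types; the per-cell rule here is only meant to make the logic of marker-based labeling concrete.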