Increased levels of tumor-infiltrating lymphocytes (TILs) indicate favorable outcomes in many types of cancer. Manual quantification of immune cells is inaccurate and time-consuming for pathologists. Our aim is to leverage a computational solution to automatically quantify TILs in standard diagnostic hematoxylin and eosin-stained sections (H&E slides) from lung cancer patients. Our approach is to transfer an open-source machine learning method for the segmentation and classification of nuclei in H&E slides, trained on public data, to TIL quantification without manual labeling of the data. Our results show that the resulting TIL quantification correlates with patient prognosis and compares favorably to the current state-of-the-art method for immune cell detection in non-small cell lung cancer (current standard, CD8 cells in DAB-stained tissue microarrays (TMAs): HR 0.34, 95% CI 0.17–0.68; TILs in H&E whole-slide images (WSIs): HoVer-Net PanNuke Aug model HR 0.30, 95% CI 0.15–0.60, and HoVer-Net MoNuSAC Aug model HR 0.27, 95% CI 0.14–0.53). Our approach bridges the gap between machine learning research, translational clinical research and clinical implementation. However, further validation is warranted before implementation in a clinical setting.
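As a rough illustration of how per-nucleus classifications could be turned into a slide-level TIL score, the sketch below computes the fraction of classified nuclei that carry an immune label. This is an assumption for illustration only: the abstract does not state how the TIL score is defined, and the label names and the `til_fraction` helper are hypothetical.

```python
from collections import Counter

# Hypothetical class name; the real label set depends on the training data.
IMMUNE_LABEL = "inflammatory"


def til_fraction(nucleus_labels, immune_label=IMMUNE_LABEL):
    """Return the fraction of classified nuclei that carry the immune label.

    Returns 0.0 for an empty input so that slides without detected nuclei
    do not raise a division error.
    """
    counts = Counter(nucleus_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts[immune_label] / total


# Made-up labels for ten nuclei from one slide (illustrative data only).
example_labels = ["neoplastic"] * 6 + ["inflammatory"] * 3 + ["connective"]
example_score = til_fraction(example_labels)
```

A fraction is only one possible summary. A density per unit of tissue area would need the slide resolution, which is not given here.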
As a novel approach, we will combine trajectories or longitudinal studies of gene expression with information on annual influenza epidemics. Seasonality of gene expression in immune cells from blood could be a consequence of within-host seasonal immunity interacting with the seasonal epidemics of influenza (flu) in temperate regions, with potentially valuable analogies to the proposed seasonal development of COVID-19.
Here we operationalized within-host immunity as genes with both a significant seasonal term and a significant flu term in the sine-cosine model. Information on gene expression was based on microarray analysis of RNase buffered blood samples collected randomly from a population-based cohort of Norwegian middle-aged women in 2003-2006, the Norwegian Women and Cancer (NOWAC) study. The unique discovery (N=425) and replication (N=432) design was based on identical sampling and preprocessing. Data on the proportion of sick leave due to flu and on the flu intensities per week were obtained from the National Institute of Public Health, giving a semi-ecological analysis.
The discovery analysis found 2942 (48.1%) significant genes in a generalized seasonal model over four years. For 1051 within-host genes, both the seasonal and the flu term were significant. These genes closely followed the flu intensities. The trajectories showed slightly more genes with a maximum in early winter than in late summer. Moving the flu intensity forward in time indicated a better fit 3-4 weeks before the observed influenza. In the replication analyses, 369 genes (35.1% of 1051) were significant. Exclusion of genes with unknown functions and with more than a season in difference reduced the number of genes in the discovery dataset to 305, illustrating the variability in the measurements and the problem in assessing weak biological relationships.
Thus, we found, for the first time, a clear seasonality in gene expression with marked responses to the annual seasonal influenza in a unique discovery-replication design. Hypothetically, this could support the within-host seasonal immunity concept.
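The sine-cosine model with a flu term mentioned above can be sketched as an ordinary least-squares fit of expression on an intercept, one annual sine and cosine pair, and the weekly flu intensity. This is a minimal sketch under stated assumptions, not the authors' analysis: the weekly time scale, the single annual harmonic, the function names, and the synthetic, noise-free data are all illustrative, and no significance testing is shown.

```python
import math

WEEKS_PER_YEAR = 52.0  # assumption: weekly resolution, one cycle per year


def design_row(week, flu):
    """Intercept, annual sine and cosine terms, and the flu covariate."""
    angle = 2.0 * math.pi * week / WEEKS_PER_YEAR
    return [1.0, math.sin(angle), math.cos(angle), flu]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_sine_cosine(weeks, flu, expression):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    rows = [design_row(w, f) for w, f in zip(weeks, flu)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    return solve(xtx, xty)


def seasonal_amplitude(beta):
    """Amplitude of the combined sine and cosine component."""
    return math.hypot(beta[1], beta[2])


def peak_week(beta):
    """Week of the year at which the seasonal component is largest."""
    # b1*sin(t) + b2*cos(t) = A*cos(t - phase), with phase = atan2(b1, b2).
    phase = math.atan2(beta[1], beta[2])
    return (phase / (2.0 * math.pi) * WEEKS_PER_YEAR) % WEEKS_PER_YEAR


# Synthetic four-year example (made-up values, no noise added).
weeks = list(range(208))
year_intensity = [1.0, 0.6, 1.4, 0.8]  # hypothetical epidemic sizes per year
flu = [
    year_intensity[w // 52]
    * max(0.0, math.cos(2.0 * math.pi * w / WEEKS_PER_YEAR)) ** 4
    for w in weeks
]
expression = [
    2.0 + 0.5 * row[1] + 0.3 * row[2] + 1.5 * row[3]
    for row in (design_row(w, f) for w, f in zip(weeks, flu))
]
beta = fit_sine_cosine(weeks, flu, expression)
```

In this sketch, `beta[1]` and `beta[2]` together form the seasonal term and `beta[3]` is the flu term. Shifting the `flu` series by a few weeks before fitting is one way to explore the kind of lead or lag the abstract describes.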
Standardizing and documenting computational analyses is necessary to ensure reproducible results. We describe an R-based implementation of data management and preprocessing that is well integrated with the analysis tools typically used for statistical analysis of omics data. We have used these tools to organize data storage and documentation, and to standardize the analysis of gene expression data, in the Norwegian Women and Cancer study.
Background. Standardizing and documenting computational analyses are necessary to ensure reproducible results. It is especially important for large and complex projects where data collection, analysis, and interpretation may span decades. Our objective is therefore to provide methods, tools, and best practice guidelines adapted for analyses in epidemiological studies that use -omics data. Results. We describe an R-based implementation of data management and preprocessing. The method is well-integrated with the analysis tools typically used for statistical analysis of -omics data. We document all datasets thoroughly and use version control to track changes to both datasets and code over time. We provide a web application to perform the standardized preprocessing steps for gene expression datasets. We provide best practices for reporting data analysis results and sharing analyses. Conclusion. We have used these tools to organize data storage and documentation, and to standardize the analysis of gene expression data, in the Norwegian Women and Cancer (NOWAC) system epidemiology study. We believe our approach and lessons learned are applicable to analyses in other large and complex epidemiology projects.