Successful deployment of GPS by combining complex data and recognizable iconography led to a tool that enabled point-of-care genomic delivery with high usability. Continued scalability and incorporation of additional clinical elements to be considered alongside PGx information could expand future impact.
The international pediatric oncology community has a long history of research collaboration. In the United States, the 2019 launch of the Children's Cancer Data Initiative puts the focus on developing a rich and robust data ecosystem for pediatric oncology. In this spirit, we present here our experience in constructing the Pediatric Cancer Data Commons (PCDC) to highlight the significance of this effort in fighting pediatric cancer and improving outcomes and to provide essential information to those creating resources in other disease areas. The University of Chicago's PCDC team has worked with the international research community since 2015 to build data commons for children's cancers. We identified six critical features of successful data commons design and implementation: (1) establish the need for a data commons, (2) develop and deploy the technical infrastructure, (3) establish and implement governance, (4) make the data commons platform easy and intuitive for researchers, (5) socialize the data commons and create working knowledge and expertise in the research community, and (6) plan for longevity and sustainability. Data commons are critical to conducting research on large patient cohorts that will ultimately lead to improved outcomes for children with cancer. There is value in connecting high-quality clinical and phenotype data to external sources of data such as genomic, proteomics, and imaging data. Next steps for the PCDC include creating an informed and invested data-sharing culture, developing sustainable methods of data collection and sharing, standardizing genetic biomarker reporting, incorporating radiologic and molecular analysis data, and building models for electronic patient consent. The methods and processes described here can be extended to any clinical area and provide a blueprint for others wishing to develop similar resources.
PURPOSE Robust institutional tumor banks depend on continuous sample curation or else subsequent biopsy or resection specimens are overlooked after initial enrollment. Curation automation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies existing pathology specimen elements necessary for locating specimens for future use in a manner that can be re-implemented by other institutions. PATIENTS AND METHODS Pathology reports from patients with gastroesophageal cancer enrolled in The University of Chicago GI oncology tumor bank were used to train and validate a novel composite natural language processing-based pipeline with a supervised machine learning classification step to separate notes into internal (primary review) and external (consultation) reports; a named-entity recognition step to obtain label (accession number), location, date, and sublabels (block identifiers); and a results proofreading step. RESULTS We analyzed 188 pathology reports, including 82 internal reports and 106 external consult reports, and successfully extracted named entities grouped as sample information (label, date, location). Our approach identified up to 24 additional unique samples in external consult notes that could have been overlooked. Our classification model obtained 100% accuracy on the basis of 10-fold cross-validation. Precision, recall, and F1 for class-specific named-entity recognition models show strong performance. CONCLUSION Through a combination of natural language processing and machine learning, we devised a re-implementable and automated approach that can accurately extract specimen attributes from semistructured pathology notes to dynamically populate a tumor registry.
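The 10-fold cross-validation mentioned above can be illustrated with a minimal sketch. The abstract does not say how folds were formed (for example, whether they were stratified by report type), so the plain shuffled split below is an assumption, and `k_fold_indices` and its `seed` parameter are hypothetical names, not the authors' code.

```python
import random


def k_fold_indices(n_items: int, k: int = 10, seed: int = 0):
    """Yield (train_idx, test_idx) pairs so each item is tested exactly once."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Assumption: a simple seeded shuffle, with no stratification by class.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 188 is the number of pathology reports stated in the abstract.
splits = list(k_fold_indices(188, k=10))
```

Each report lands in exactly one held-out fold, so the reported accuracy would be an average over ten models that never saw their own test reports during training.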
Objective Adherence to a treatment plan by HIV-positive patients is necessary to decrease their mortality and improve their quality of life; however, some patients display poor appointment adherence and become lost to follow-up (LTFU). We applied natural language processing (NLP) to analyze indications towards or against LTFU in HIV-positive patients’ notes. Materials and Methods Unstructured lemmatized notes were labeled with an LTFU or Retained status using a 183-day threshold. An NLP and supervised machine learning system with a linear model and elastic net regularization was trained to predict this status. The prevalence of characteristic domains in the learned model weights was evaluated. Results We analyzed 838 LTFU vs 2964 Retained notes and obtained a weighted F1 mean of 0.912 via nested cross-validation; another experiment with notes from the same patients in both classes showed substantially lower metrics. “Comorbidities” were associated with LTFU through, for instance, “HCV” (hepatitis C virus) and likewise “Good adherence” with Retained, represented with “Well on ART” (antiretroviral therapy). Discussion Mentions of mental health disorders and substance use were associated with disparate retention outcomes; however, history vs active use was not investigated. There remains a further need to model transitions between LTFU and being retained in care over time. Conclusion We provided an important step for the future development of a model that could eventually help to identify patients who are at risk for falling out of care and to analyze which characteristics could be factors for this. Further research is needed to enhance this method with structured electronic medical record fields.
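As a rough illustration of two quantities named in this abstract, the sketch below labels a note with the 183-day threshold and computes a support-weighted F1. The abstract does not define how the gap is measured, so measuring from the note date to the next visit (or to the end of observation) is an assumption, and `label_note`, `weighted_f1`, and `censor_date` are hypothetical names rather than the authors' implementation.

```python
from datetime import date
from typing import Optional

LTFU_THRESHOLD_DAYS = 183  # threshold stated in the abstract


def label_note(note_date: date, next_visit: Optional[date], censor_date: date) -> str:
    """Label a note "LTFU" or "Retained" under an assumed gap rule."""
    # Assumption: the gap runs from the note to the next attended visit,
    # or to the end of observation when no later visit exists.
    end = next_visit if next_visit is not None else censor_date
    gap_days = (end - note_date).days
    return "LTFU" if gap_days > LTFU_THRESHOLD_DAYS else "Retained"


def weighted_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Mean of per-class F1 scores, weighted by each class's true count."""
    total = 0.0
    for cls in set(y_true):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true examples of this class
        total += f1 * support
    return total / len(y_true) if y_true else 0.0
```

Weighting by support matters here because the classes are imbalanced (838 LTFU vs 2964 Retained notes), so the majority class contributes more to the averaged score.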
The mechanisms that underlie the timing of labor in humans are largely unknown. In most pregnancies, labor is initiated at term (≥ 37 weeks gestation), but in a significant number of women spontaneous labor occurs preterm and is associated with increased perinatal mortality and morbidity. The objective of this study was to characterize the cells at the maternal–fetal interface (MFI) in term and preterm pregnancies in both the laboring and non-laboring state in Black women, who have among the highest preterm birth rates in the U.S. Using mass cytometry to obtain high-dimensional single-cell resolution, we identified 31 cell populations at the MFI, including 25 immune cell types and six non-immune cell types. Among the immune cells, maternal PD1+ CD8 T cell subsets were less abundant in term laboring compared to term non-laboring women. Among the non-immune cells, PD-L1+ maternal (stromal) and fetal (extravillous trophoblast) cells were less abundant in preterm laboring compared to term laboring women. Consistent with these observations, the expression of CD274, the gene encoding PD-L1, was significantly depressed and less responsive to fetal signaling molecules in cultured mesenchymal stromal cells from the decidua of preterm compared to term women. Overall, these results suggest that the PD1/PD-L1 pathway at the MFI may perturb the delicate balance between immune tolerance and rejection and contribute to the onset of spontaneous preterm labor.