Clinical annotations for prostate cancer research: Defining data elements, creating a reproducible analytical pipeline, and assessing data quality

Keegan, Niamh M.; Vasselman, Samantha E.; Barnett, Ethan S.; Nweji, Barbara; Carbone, Emily; Blum, Alexander; Morris, Michael J.; Rathkopf, Dana E.; Slovin, Susan F.; Danila, Daniel C.; Autio, Karen A.; Scher, Howard I.; Abida, Wassim; Stopsack, Konrad H.

doi:10.1002/pros.24363

Cited by 7 publications

(1 citation statement)

References 23 publications


“…This ensures that while treatments and technology evolve, comprehensive capture of core data elements using structured and routinely updated fields remains standardized across institutions. Overall, our work is synergistic with other ongoing efforts to define data elements in prostate cancer care for the construction of analytical pipelines and improved health information sharing [57].…”

Section: Discussion (mentioning)

confidence: 89%

Identification of Key Elements in Prostate Cancer for Ontology Building via a Multidisciplinary Consensus Agreement

Moreno, Solanki, et al. 2023

Cancers


Background: Clinical data collection related to prostate cancer (PCa) care is often unstructured or heterogeneous among providers, resulting in a high risk for ambiguity in its meaning when sharing or analyzing data. Ontologies, which are shareable formal (i.e., computable) representations of knowledge, can address these challenges by enabling machine-readable semantic interoperability. The purpose of this study was to identify PCa-specific key data elements (KDEs) for standardization in clinic and research. Methods: A modified Delphi method using iterative online surveys was performed to report a consensus agreement on KDEs by a multidisciplinary panel of 39 PCa specialists. Data elements were divided into three themes in PCa and included (1) treatment-related toxicities (TRT), (2) patient-reported outcome measures (PROM), and (3) disease control metrics (DCM). Results: The panel reached consensus on a thirty-item, two-tiered list of KDEs focusing mainly on urinary and rectal symptoms. The Expanded Prostate Cancer Index Composite (EPIC-26) questionnaire was considered most robust for PROM multi-domain monitoring, and granular KDEs were defined for DCM. Conclusions: This expert consensus on PCa-specific KDEs has served as a foundation for a professional society-endorsed, publicly available operational ontology developed by the American Association of Physicists in Medicine (AAPM) Big Data Sub Committee (BDSC).


Automated real-world data integration improves cancer outcome prediction

Jee, Fong, Pichotta, et al. 2024

Nature


The digitization of health records and growing availability of tumour DNA sequencing provide an opportunity to study the determinants of cancer outcomes with unprecedented richness. Patient data are often stored in unstructured text and siloed datasets. Here we combine natural language processing annotations [1, 2] with structured medication, patient-reported demographic, tumour registry and tumour genomic data from 24,950 patients at Memorial Sloan Kettering Cancer Center to generate a clinicogenomic, harmonized oncologic real-world dataset (MSK-CHORD). MSK-CHORD includes data for non-small-cell lung (n = 7,809), breast (n = 5,368), colorectal (n = 5,543), prostate (n = 3,211) and pancreatic (n = 3,109) cancers and enables discovery of clinicogenomic relationships not apparent in smaller datasets. Leveraging MSK-CHORD to train machine learning models to predict overall survival, we find that models including features derived from natural language processing, such as sites of disease, outperform those based on genomic data or stage alone as tested by cross-validation and an external, multi-institution dataset. By annotating 705,241 radiology reports, MSK-CHORD also uncovers predictors of metastasis to specific organ sites, including a relationship between SETD2 mutation and lower metastatic potential in immunotherapy-treated lung adenocarcinoma corroborated in independent datasets. We demonstrate the feasibility of automated annotation from unstructured notes and its utility in predicting patient outcomes. The resulting data are provided as a public resource for real-world oncologic research.


RAD21 promotes oncogenesis and lethal progression of prostate cancer

Su, Stopsack, Schmidt, et al. 2024

Proc. Natl. Acad. Sci. U.S.A.


Higher levels of aneuploidy, characterized by imbalanced chromosome numbers, are associated with lethal progression in prostate cancer. However, how aneuploidy contributes to prostate cancer aggressiveness remains poorly understood. In this study, we assessed which genes on chromosome 8q, one of the most frequently gained chromosome arms in prostate tumors, were most strongly associated with long-term risk of cancer progression to metastases and death from prostate cancer (lethal disease) in 403 patients and found the strongest candidate was the cohesin subunit gene, RAD21, with an odds ratio of 3.7 (95% CI 1.8, 7.6) comparing the highest vs. lowest tertiles of mRNA expression and adjusting for overall aneuploidy burden and Gleason score, both strong prognostic factors in primary prostate cancer. Studying prostate cancer driven by the TMPRSS2-ERG oncogenic fusion, found in about half of all prostate tumors, we found that increased RAD21 alleviated toxic oncogenic stress and DNA damage caused by oncogene expression. Data from both organoids and patients indicate that increased RAD21 thereby enables aggressive tumors to sustain tumor proliferation, and more broadly suggest one path through which tumors benefit from aneuploidy.
