Purpose: With large amounts of multidimensional molecular data on cancers generated and deposited into public repositories such as The Cancer Genome Atlas and the International Cancer Genome Consortium, a cancer type-agnostic, integrative platform will help to identify signatures with clinical relevance. We devised such a platform and showcase it by identifying a molecular signature for patients with metastatic and recurrent (MR) head and neck squamous cell carcinoma (HNSCC).

Methods: We devised a statistical framework accompanied by a graphical user interface-driven application, Clinical Association of Functionally Established MOlecular CHAnges (CAFE MOCHA; https://github.com/binaypanda/CAFEMOCHA), to discover molecular signatures linked to a specific clinical attribute in a cancer type. The platform integrates mutations and indels, gene expression, DNA methylation, and copy number variations, first to discover a classifier and then to classify an incoming tumor for the same clinical attribute. It does so by pulling defined class variables into a single framework that incorporates a coordinate geometry-based algorithm, complete specificity margin-based clustering, which ensures maximum specificity. CAFE MOCHA classifies an incoming tumor sample using either its matched normal or a built-in database of normal tissues. The application is packaged and deployed using the install4j multiplatform installer. We tested CAFE MOCHA in HNSCC tumors (n = 513), followed by validation in tumors from an independent cohort (n = 18), to discover a signature linked to distant MR.

Results: CAFE MOCHA identified an integrated signature, MR44, associated with distant MR HNSCC, with 80% sensitivity and 100% specificity in the discovery stage and 100% sensitivity and 100% specificity in the validation stage.

Conclusion: CAFE MOCHA is a cancer type-agnostic and clinical attribute-agnostic statistical framework to discover integrated molecular signatures.
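
The abstract does not spell out how complete specificity margin-based clustering works, so the following is only a minimal sketch of the general idea of trading sensitivity for guaranteed specificity, not the CAFE MOCHA algorithm itself. It assumes each tumor has already been reduced to a single hypothetical signature score, places the cut-off at or above every negative (non-MR) sample so that no negative is called positive, and then reports the resulting sensitivity and specificity. The function names, the `margin` parameter, and the toy scores are all illustrative assumptions and are not taken from the paper or its data.

```python
from typing import Sequence, Tuple


def complete_specificity_threshold(neg_scores: Sequence[float],
                                   margin: float = 0.0) -> float:
    """Return a cut-off at the highest negative score plus an optional margin.

    Samples are called positive only when their score is strictly greater
    than this cut-off, so no negative training sample is called positive.
    (Hypothetical helper, not part of CAFE MOCHA.)
    """
    return max(neg_scores) + margin


def sensitivity_specificity(pos_scores: Sequence[float],
                            neg_scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Compute sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP)."""
    tp = sum(s > threshold for s in pos_scores)   # positives called positive
    fn = len(pos_scores) - tp                     # positives missed
    fp = sum(s > threshold for s in neg_scores)   # negatives called positive
    tn = len(neg_scores) - fp                     # negatives called negative
    return tp / (tp + fn), tn / (tn + fp)


# Toy, made-up scores for illustration only (not data from the study).
pos = [0.9, 0.8, 0.75, 0.7, 0.4]   # hypothetical MR tumors
neg = [0.1, 0.2, 0.3, 0.5]         # hypothetical non-MR tumors

thr = complete_specificity_threshold(neg)
sens, spec = sensitivity_specificity(pos, neg, thr)
print(f"threshold={thr}, sensitivity={sens:.0%}, specificity={spec:.0%}")
```

In this toy setting the cut-off is fixed by the negative class alone, so specificity on the training samples is 100% by construction, while sensitivity depends on how many positives fall above the cut-off. Increasing `margin` makes the rule more conservative on unseen samples at the cost of sensitivity.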