Hepatocellular carcinoma is the most common primary liver cancer, accounting for 90% of cases, and a major cause of death worldwide. Despite this, alpha-fetoprotein tests are the only blood-based diagnostic tools available, and their use is limited by low sensitivity. DNA methylation changes, which have been implicated in a majority of cancers, offer an alternative route to diagnosis: they can be measured in the circulating cell-free DNA present in blood plasma.
Method
A genetic programming-based symbolic regression approach was applied to gain the benefits of machine learning while avoiding the opacity of "black box" models. The data comprised plasma samples from 36 patients with hepatocellular carcinoma and from a control group of 55 patients, some with cirrhosis and some without. The data were split 75-25 into training and test sets before training.
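As a rough illustration of the 75-25 split described above, the following minimal sketch shuffles sample indices and holds out a quarter of them. The function name `train_test_split_indices`, the random seed, and the choice to round the test-set size up are assumptions made for illustration; the text does not state how the split was implemented or whether it was stratified by diagnosis.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Assumption: the test-set size is rounded up to a whole sample.
    n_test = math.ceil(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# Cohort sizes from the text: 36 HCC patients (label 1), 55 controls (label 0).
labels = [1] * 36 + [0] * 55
train_idx, test_idx = train_test_split_indices(len(labels))
print(len(train_idx), len(test_idx))  # 68 23
```

Under these assumptions the 91 samples yield 68 for training and 23 for testing; a stratified split would additionally preserve the case-to-control ratio in both sets.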
Results
Symbolic regression produced an equation that uses the methylation levels of three biomarkers and achieved an accuracy of 91.3%, a sensitivity of 100%, and a specificity of 87.5% on the test data. This performance matches that reported in prior research while adding the benefit of transparency.
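To make the three reported metrics concrete, the sketch below computes them from confusion-matrix counts. The counts are hypothetical: the text does not report them. They are simply one set of values that reproduces the stated percentages, assuming a test set of 23 samples (25% of 91, rounded up).

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts (not reported in the text): 7 HCC cases, all detected,
# and 16 controls, of which 2 are misclassified as HCC.
metrics = classification_metrics(tp=7, fn=0, tn=14, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# accuracy: 91.3%
# sensitivity: 100.0%
# specificity: 87.5%
```

With a test set this small, each misclassified sample shifts accuracy by more than four percentage points, which is why validation on a larger cohort matters.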
Conclusion
Circulating cell-free DNA presents opportunities for minimally invasive early diagnosis of hepatocellular carcinoma, and transparent machine learning approaches such as symbolic regression can allow accurate diagnosis by combining biological and mathematical principles. Validating the model obtained here on a larger and more diverse dataset could reveal the potential of such approaches in cancer diagnosis and pave the way for further research.