High-resolution magic angle spinning (HR MAS) nuclear magnetic resonance (NMR) spectroscopy is increasingly being used to study metabolite levels in human breast cancer tissue, assessing, for instance, correlations with prognostic factors, survival outcome or therapeutic response. However, the impact of intratumoral heterogeneity on metabolite levels in breast tumor tissue has not been studied comprehensively. More specifically, when biopsy material is analyzed, it remains questionable whether one biopsy is representative of the entire tumor. Therefore, multi-core sampling (n = 6) of tumor tissue from three patients with breast cancer, followed by lipid (0.9- and 1.3-ppm signals) and metabolite quantification using HR MAS ¹H NMR, was performed, resulting in the quantification of 32 metabolites. The mean relative standard deviation across all metabolites for the six tumor cores sampled from each of the three tumors ranged from 0.48 to 0.74. This was considerably higher when compared with a morphologically more homogeneous tissue type, here represented by murine liver (0.16-0.20). Despite the seemingly high variability observed within the tumor tissue, a random forest classifier trained on the original sample set (training set) was, with one exception, able to correctly predict the tumor identity of an independent series of cores (test set) that were additionally sampled from the same three tumors and analyzed blindly. Moreover, significant differences between the tumors were identified using one-way analysis of variance (ANOVA), indicating that the intertumoral differences for many metabolites were larger than the intratumoral differences for these three tumors. That intertumoral differences, on average, were larger than intratumoral differences was further supported by the analysis of duplicate tissue cores from 15 additional breast tumors.
In summary, despite the observed intratumoral variability, the results of the present study suggest that the analysis of one, or a few, replicates per tumor may be acceptable, and support the feasibility of performing reliable analyses of patient tissue.
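The two statistics named in this abstract, the mean relative standard deviation across metabolites and the one-way ANOVA F statistic, can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the input layout (one dict of metabolite levels per tissue core) and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def mean_rsd(cores):
    """Mean relative standard deviation (SD / mean) across metabolites.

    `cores` is a list of dicts, one per tissue core, mapping a metabolite
    name to its quantified level (hypothetical layout, assumed here).
    """
    rsds = []
    for metabolite in cores[0]:
        values = [core[metabolite] for core in cores]
        mu = mean(values)
        if mu == 0:
            continue  # RSD is undefined for a zero mean; skip (assumption)
        rsds.append(stdev(values) / mu)  # sample SD, n - 1 denominator
    return mean(rsds)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for one metabolite.

    `groups` holds one list of values per tumor. A large F means the
    between-tumor variance dominates the within-tumor variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```

Turning the F statistic into a p-value additionally requires the F distribution with (k - 1, N - k) degrees of freedom, which is omitted here.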
Motivation: Disease classification from molecular measurements typically requires an analysis pipeline from raw noisy measurements to final classification results. Multi-capillary column ion mobility spectrometry (MCC-IMS) is a promising technology for the detection of volatile organic compounds in the air of exhaled breath. From raw measurements, the peak regions representing the compounds have to be identified, quantified, and clustered across different experiments. Currently, several steps of this analysis process require manual intervention of human experts. Our goal is to identify a fully automatic pipeline that yields competitive disease classification results compared to an established but subjective and tedious semi-manual process.
Method: We combine a large number of modern methods for peak detection, peak clustering, and multivariate classification into analysis pipelines for raw MCC-IMS data. We evaluate all combinations on three different real datasets in an unbiased cross-validation setting. We determine which specific algorithmic combinations lead to high AUC values in disease classifications across the different medical application scenarios.
Results: The best fully automated analysis process achieves even better classification results than the established manual process. The best algorithms for the three analysis steps are (i) SGLTR (Savitzky-Golay Laplace-operator filter thresholding regions) and LM (Local Maxima) for automated peak identification, (ii) EM clustering (Expectation Maximization) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for the clustering step and (iii) RF (Random Forest) for multivariate classification. Thus, automated methods can replace the manual steps in the analysis process to enable an unbiased high throughput use of the technology.
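The evaluation design described here, scoring every combination of peak detector, clustering method and classifier by AUC across datasets, could look roughly like the sketch below. It is illustrative only: `evaluate` is a hypothetical callback standing in for a full cross-validated pipeline run, and the rank-based AUC is the standard Mann-Whitney formulation rather than anything confirmed by the abstract.

```python
from itertools import product
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    `labels` are 1 (diseased) or 0 (healthy); ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_pipelines(detectors, clusterers, classifiers, evaluate, datasets):
    """Rank all (detector, clusterer, classifier) combinations by mean AUC.

    `evaluate(combo, dataset)` is a hypothetical function that runs one
    pipeline under cross-validation and returns its AUC.
    """
    ranking = []
    for combo in product(detectors, clusterers, classifiers):
        mean_auc = mean(evaluate(combo, d) for d in datasets)
        ranking.append((mean_auc, combo))
    ranking.sort(key=lambda item: item[0], reverse=True)
    return ranking
```

Averaging AUC over datasets is one possible way to compare combinations across application scenarios; the abstract does not state how results were aggregated.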
The multi-capillary column ion mobility spectrometry (MCC-IMS) technology for measuring breath gas can be used to distinguish between healthy and diseased subjects or between different types of diseases. The statistical methods for classifying the corresponding breath samples typically neglect potential confounding clinical and technical variables, reducing both the accuracy and the generalizability of the results. In particular, measuring samples on different technical devices can heavily influence the results. We conducted a controlled breath gas study including 49 healthy volunteers to evaluate the effect of the variables sex, smoking habits and technical device. Every person was measured twice, once before and once after consuming a glass of orange juice. The two measurements were obtained on two different devices. The evaluation of the MCC-IMS data regarding metabolite detection was performed once using the software VisualNow, which requires manual interaction, and once using the fully automated algorithm SGLTR-DBSCAN. We present statistical solutions, namely peak alignment and scaling, to adjust for the different devices. For the other potential confounders, sex and smoking, no significant influence was identified in our study.
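One simple form that scaling for device differences could take is standardizing each peak's intensities separately within each device. The sketch below assumes this per-device z-scoring; the abstract does not specify the scaling actually used, and the function name and input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def scale_per_device(intensities, devices):
    """Center and scale one peak's intensities within each device.

    `intensities[i]` is the peak intensity of measurement i and
    `devices[i]` the device it was recorded on (assumed layout).
    """
    by_device = defaultdict(list)
    for value, device in zip(intensities, devices):
        by_device[device].append(value)
    # Per-device mean and population standard deviation
    stats = {d: (mean(v), pstdev(v)) for d, v in by_device.items()}
    scaled = []
    for value, device in zip(intensities, devices):
        mu, sd = stats[device]
        # A constant peak on a device carries no information; map it to 0
        scaled.append((value - mu) / sd if sd > 0 else 0.0)
    return scaled
```

After this step, a systematic offset or gain difference between the two devices no longer separates the samples, although real device effects may also require aligning peak positions first.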
Ion mobility spectrometry (IMS) is a technology for the detection of volatile compounds in the air of exhaled breath that is increasingly used in medical applications. One major goal is to classify patients into disease groups, for example diseased versus healthy, from simple breath samples. Raw IMS measurements are data matrices in which peak regions representing the compounds have to be identified and quantified. A typical analysis process consists of pre-processing and peak detection in single experiments, peak clustering to obtain consensus peaks across several experiments, and classification of samples based on the resulting multivariate peak intensities. Recently, several automated algorithms for peak detection and peak clustering have been introduced to overcome the current need for human-based analysis, which is slow, subjective and sometimes not reproducible. We present an unbiased comparison of a multitude of combinations of peak processing and multivariate classification algorithms on a disease dataset. The specific combination of algorithms for the different analysis steps determines the classification accuracy, with the encouraging result that certain fully automated combinations perform even better than current manual approaches.
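To make the peak detection step concrete, the sketch below finds local maxima in a raw data matrix. It is a deliberately simplified illustration: the 8-neighbourhood, the intensity threshold and the function name are assumptions, and published peak detectors for IMS data involve additional smoothing and filtering.

```python
def local_maxima(matrix, threshold=0.0):
    """Return (row, col, intensity) for each strict local maximum.

    A cell counts as a peak if its intensity exceeds `threshold` and is
    strictly greater than all of its up to eight neighbours (assumption).
    """
    rows, cols = len(matrix), len(matrix[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = matrix[r][c]
            if value <= threshold:
                continue
            neighbours = [
                matrix[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
                if (i, j) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks
```

The peak coordinates found in single experiments would then be clustered across experiments to form consensus peaks, whose intensities serve as the features for classification.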