Mass spectrometry (MS) is a powerful technique that can provide the biochemical signature of a wide range of biological materials, such as cells and biofluids. However, MS data usually contain a large number of variables, which may hinder discriminatory analysis and incur a high computational cost. In this paper, principal component analysis combined with linear discriminant analysis (PCA-LDA) and with quadratic discriminant analysis (PCA-QDA) were applied to discriminate between healthy control and cancer samples (ovarian and prostate cancer) based on MS data sets. In addition, prostate cancer subtypes were identified. The results were satisfactory, especially for PCA-QDA: selectivity and specificity ranged from 90 to 100%, equal or superior to those of support vector machine (SVM)-based algorithms. These techniques provided reliable identification of cancer samples, which may lead to fast and less invasive clinical procedures.
Keywords: mass spectrometry, classification, ovarian cancer, prostate cancer, QDA
Introduction
Mass spectrometry (MS) is an analytical technique used to determine the chemical composition of a given sample, to quantify compounds [1], and to help elucidate molecular structures [2,3]. This technique has been increasingly utilized in biomedical and clinical research [4], since it can overcome many limitations of classical immunoassays [5,6] and supports the development of fast and less invasive clinical procedures [7-9]. MS is usually coupled with a chromatographic separation, such as liquid chromatography (LC-MS) or gas chromatography (GC-MS). Other techniques, such as surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF) and matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF), are often used in MS applications, including disease screening and diagnosis [5]. Some examples of MS applications include toxicology screening and toxic drug quantification using quadrupole MS/MS [10]; identification of inborn errors of metabolism or genetic defects in newborns for prenatal screening programs using electrospray tandem MS [11]; detection of drug-induced hepatotoxicity using MS-based metabolomics [12]; and identification and quantification of bleomycin in serum and tumor tissue by high-resolution LC-MS [13]. MS-based techniques have been widely employed for cancer identification, such as for breast cancer [14], prostate cancer [15,16], ovarian cancer [17], lung cancer [18], and pancreatic cancer [19], as well as for identifying many biomarkers [18,20-24].
One of the main fields using MS data is metabolomics, which aims to identify and quantify small molecules involved in metabolic reactions [25]. Metabolomics studies have been applied in several areas, especially cancer [26]. These analyses are typically performed using either targeted or untargeted approaches [25]. The targeted approach aims to identify and quantify specific metabolites or metabolite classes, whereas in the untargeted analysis a new hypothesis for further tests is generated by ...