Abstract
This review presents a retrospective of studies carried out over the last 10 years (2006-2016) that used spectroscopic methods as a research tool in virology. Spectroscopic analyses are sensitive to variations in the biochemical composition of the sample, non-destructive and fast, and they require minimal sample preparation, which makes them of great interest in biological studies. Important chemometric algorithms that have been used in virological studies are also highlighted as good options for analyzing spectra and for discriminating and classifying samples. Techniques that have not yet been used in virology are also suggested. This methodology is emerging as a new and promising field of research and may be used in the near future as a diagnostic tool for detecting diseases caused by viruses.
Analytical technologies that can improve disease diagnosis are highly sought after. Current screening and diagnostic tests for several diseases are limited by moderate diagnostic performance, invasiveness, costly and laborious methodologies, or the need for multiple tests before a definitive diagnosis. Spectroscopic techniques, including infrared (IR) and Raman, have attracted great interest in the medical field, with applications expanding from early disease detection to monitoring and real-time diagnosis. This review highlights applications of IR and Raman spectroscopy since 2015, with a focus on cancer and infectious diseases, and underscores the diverse sample types that can be analyzed, such as biofluids, cells and tissues. To retain the clinical focus of the paper, studies were considered eligible if they involved more than 25 participants per group (disease and control groups, or more than 25 in the disease group when there was no control group). Following literature searches, we identified 94 spectroscopic studies on different cancers and 30 studies on infectious diseases.
Motivation
Data splitting is a fundamental step in building classification models with spectral data, especially in biomedical applications. It is performed after pre-processing and before model construction, and consists of dividing the samples into at least a training set and a test set: the training set is used for model construction and the test set for model validation. Two of the most-used data-splitting methods are random selection (RS) and the Kennard-Stone (KS) algorithm. RS assigns samples to each set at random, whereas KS selects samples based on the Euclidean distances between them. We propose the Morais-Lima-Martin (MLM) algorithm as an alternative method to improve data splitting in classification models. MLM modifies the KS algorithm by adding a random-mutation factor.
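The distance-based selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the published MATLAB implementation: `kennard_stone` follows the standard Kennard-Stone rule (start from the two most distant samples, then repeatedly add the sample farthest from those already selected), while `split_with_mutation` is a hypothetical reading of the random-mutation factor in which a fraction of training samples is swapped at random with test samples. The function names, the swap rule and the default `mutation_rate` are assumptions made for this example.

```python
import math
import random


def kennard_stone(samples, n_train):
    """Return indices of n_train samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(samples)")
    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]
    while len(selected) < n_train:
        # Add the sample whose nearest selected neighbour is farthest away.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


def split_with_mutation(samples, n_train, mutation_rate=0.1, seed=0):
    """KS split followed by random swaps between training and test sets.

    The swap rule and the default mutation_rate are assumptions for
    illustration; they are not taken from the original algorithm.
    """
    rng = random.Random(seed)
    train = kennard_stone(samples, n_train)
    test = [i for i in range(len(samples)) if i not in train]
    n_swaps = min(round(mutation_rate * len(train)), len(test))
    # Choose distinct positions in each set and exchange their contents.
    train_pos = rng.sample(range(len(train)), n_swaps)
    test_pos = rng.sample(range(len(test)), n_swaps)
    for tp, sp in zip(train_pos, test_pos):
        train[tp], test[sp] = test[sp], train[tp]
    return train, test
```

With `mutation_rate=0` the second function reduces to a plain KS split, which makes it easy to compare the two selections on the same data.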
Results
The performance of RS, KS and MLM is compared on simulated data and on six real-world biospectroscopic applications using principal component analysis with linear discriminant analysis (PCA-LDA). MLM gave better predictive performance than the RS and KS algorithms, in particular in terms of sensitivity and specificity, and its classification results were better balanced. RS showed the poorest predictive response, followed by KS, which predicted with good accuracy but with relatively unbalanced sensitivities and specificities. These findings demonstrate the potential of the MLM algorithm as a sample selection method for classification applications, compared with the standard methods often applied to this type of data.
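To make the distinction between accuracy and balanced sensitivity and specificity concrete, the following short sketch computes all three from true and predicted class labels. The function name and the toy labels are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of positives correctly identified
    specificity = tn / (tn + fp)  # fraction of negatives correctly identified
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Toy labels (hypothetical): 4 diseased samples (1) and 6 controls (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
sens, spec, acc = sensitivity_specificity(y_true, y_pred)
```

In this toy case the accuracy is 0.8 even though only half of the diseased samples are detected (sensitivity 0.5, specificity 1.0), which is the kind of imbalance that accuracy alone can hide.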
Availability and implementation
The MLM algorithm is freely available for MATLAB at https://doi.org/10.6084/m9.figshare.7393517.v1.