Surface-enhanced Raman spectroscopy (SERS) has wide diagnostic applications because of narrow spectral features that allow multiplexed analysis. Machine learning (ML) has been used for non-dye-labeled SERS spectra but has not been applied to SERS dye-labeled materials with known spectral shapes. Here, we compare the performance of spectral decomposition, support vector regression, random forest regression, partial least squares regression, and a convolutional neural network (CNN) for SERS "spectral unmixing" from a multiplexed mixture of 7 SERS-active "nanorattles" loaded with different dyes for mRNA biomarker detection. We showed that the CNN most accurately determined the relative contribution of each distinct dye-loaded nanorattle. The CNN and comparative models were then used to analyze SERS spectra from a singleplexed, point-of-care assay detecting an mRNA biomarker for head and neck cancer in 20 samples. The CNN, trained on simulated multiplexed data, determined the correct dye contributions from the singleplex assay with RMSE_label = 6.42 × 10⁻². These results demonstrate the potential of CNN-based ML to advance SERS-based diagnostics.
INTRODUCTION

Raman spectroscopy, particularly surface-enhanced Raman scattering (SERS), has widespread applications in chemical sensing,[1] biological sensing,[2][3][4] and imaging.[5] The advantages of SERS include high sensitivity and very narrow Raman peaks, enabling single-molecule detection[6] and high degrees of multiplexing, respectively. These narrow peaks are an important improvement over common fluorescence techniques, which have broad emission spectra without distinct maxima, precluding multiplex analysis. These unique characteristics make the development of highly multiplexed SERS-based technology an appealing target for the advancement of molecular diagnostics.

Molecular diagnostics are powerful medical tools for the diagnosis and monitoring of a variety of diseases, including cancer,[7] infectious disease,[8] and molecular