Surface-Enhanced Raman Spectroscopy (SERS) is often used for heavy metal ion detection. However, large variations in signal strength and spectral profile, together with the nonlinearity of the measurements, often produce inconsistent results and raise concerns about reproducibility. Consequently, manual classification of SERS spectra requires carefully controlled experimentation, which further hinders large-scale adoption. Recent advances in machine learning offer promising opportunities to address these issues. However, well-documented procedures for model development and evaluation, as well as benchmark datasets, are missing. To this end, we provide a SERS spectral benchmark dataset of lead(II) nitrate (Pb(NO3)2) for a heavy metal ion detection task and evaluate the classification performance of several machine learning models. We also perform a comparative study to find the best combination of preprocessing method and machine learning model. The proposed model successfully identifies Pb(NO3)2 from SERS measurements of independent test experiments. In particular, it achieves a balanced accuracy of 84.6% on the cross-batch testing task.
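
For readers unfamiliar with the reported metric, balanced accuracy is the mean of the per-class recall, which keeps a majority class from dominating the score when classes are imbalanced. The following is a minimal Python sketch of that definition only. The function name `balanced_accuracy` and the example labels are hypothetical illustrations, not data or code from this work.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall is computed per class, then averaged with equal class weight.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels for one held-out batch (assumption for illustration):
# 1 = analyte present, 0 = analyte absent.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

In this toy case the recall is 3/4 for class 1 and 1/2 for class 0, so the balanced accuracy (0.625) is lower than the plain accuracy (about 0.667), because the weaker minority-class recall receives equal weight.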