Neonatal jaundice is common in neonates. Excess bilirubin leads to hyperbilirubinemia, which can cause irreversible damage such as kernicterus. It is therefore important to monitor neonates’ bilirubin levels in real time so that intervention can begin immediately. However, current screening protocols have inherent limitations, which creates a need for more convenient measurement methods. In this proof-of-concept study, we evaluated the feasibility of using machine learning to screen for hyperbilirubinemia in neonates from smartphone-acquired photographs. Several machine learning models were compared to better understand how feature selection affects model performance in bilirubin determination. An in vitro study was conducted with a bilirubin-containing tissue phantom to identify potential biological and environmental confounding factors. The study systematically characterizes the confounding effect of each factor through separate parametric tests. These tests point to potential image pre-processing techniques and highlight important biological features (light-scattering properties and skin thickness) and external features (ISO, lighting conditions, and white balance), which together contribute to robust models for accurately determining bilirubin concentration. The models achieved an accuracy of 0.848 in classification and 0.812 in regression, indicating strong potential to aid the design of clinical studies using patient-derived images.