Electronic Health Records (EHRs) hold large volumes of clinical data with the potential to improve healthcare outcomes. This study examines the use of Apache Spark for EHR analysis, with a specific focus on improving diabetes care. Combining Apache Spark with a machine learning framework, we automated EHR analysis by processing large datasets, preprocessing them, and extracting relevant features. Spark's distributed processing allowed machine learning models to be trained and evaluated concurrently, and its in-memory data processing reduced reliance on disk input/output, improving performance and scalability. This methodology enabled fast and thorough analysis of EHR data, and the resulting insights were visualized and reported to help healthcare professionals make informed decisions. The iterative nature of the process allowed the analysis to be refined continuously. Combining Apache Spark with machine learning techniques proved to be an efficient strategy for EHR analysis. The approach shows promise for advancing healthcare outcomes by enabling effective prediction and management of diabetes, which may in turn improve patient care and reduce healthcare costs. The findings underscore the potential of integrating modern data analysis tools into the healthcare sector.