Introduction:
Considerable effort is now going into predicting heart disease from medical data using data science methods. Cardiovascular diseases (CVDs) are the leading cause of death worldwide, claiming millions of lives each year and accounting for about 31% of all deaths. Heart failure is a common complication of CVDs, so people with a history of cardiovascular disease, or at high risk of it, need early diagnosis and management. Machine learning models are a promising tool for this task.
Methodology:
This study applies five machine learning classifiers to medical data: logistic regression, support vector classifier, decision tree classifier, random forest classifier, and K-nearest neighbors. The dataset contains records for more than 900 individuals, sourced from reputable medical institutions. Its features include age, sex, chest pain severity, blood pressure, cholesterol level, blood sugar level, electrocardiogram results, and other relevant clinical measurements.
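The comparison described above could be set up roughly as follows. This is a minimal sketch, not the study's actual pipeline. It assumes scikit-learn, replaces the clinical dataset with synthetic data from `make_classification`, and uses an 80/20 stratified split, feature scaling, and default hyperparameters, none of which are specified in the study. The accuracies it prints will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the clinical data (about 900 rows, 11 features).
X, y = make_classification(
    n_samples=900, n_features=11, n_informative=6, random_state=0
)

# Assumption: 80/20 stratified hold-out split; the study's split is not stated.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The five model families named in the study, with default hyperparameters.
# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "support_vector_classifier": make_pipeline(StandardScaler(), SVC()),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "k_nearest_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Report models from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

Keeping the models in a single dictionary makes it easy to score all of them on the same held-out split, which is what makes the accuracies comparable.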
Results:
Model performance varied across the evaluation metrics. Logistic regression and the support vector classifier tied for the highest accuracy in predicting heart failure, at 88% each. The decision tree classifier scored below 85%, the random forest classifier reached 84%, and the K-nearest neighbors classifier reached 82%. Analysis of the dataset revealed a balanced distribution and highlighted gender-based disparities in heart failure. Age, chest pain severity, blood glucose level, and indicators of physical activity all correlated significantly with heart disease.
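The two dataset checks mentioned above, class balance and feature-to-outcome correlation, can be illustrated with a short sketch. The eight age/label pairs below are made-up values for illustration only, not data from the study, and `pearson` is a hypothetical helper name. The Pearson coefficient is used here as one common choice; the study does not state which correlation measure it used.

```python
import math

# Hypothetical toy values for illustration only (not from the study's dataset).
ages = [40, 45, 50, 55, 60, 65, 70, 75]
heart_disease = [0, 0, 1, 0, 1, 1, 0, 1]  # 1 = disease present, 0 = absent


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Class balance: the fraction of positive cases; 0.5 means perfectly balanced.
positive_fraction = sum(heart_disease) / len(heart_disease)

# Correlation between a numeric feature and the binary outcome.
age_correlation = pearson(ages, heart_disease)

print(f"positive fraction: {positive_fraction:.2f}")
print(f"age vs. heart disease correlation: {age_correlation:.3f}")
```

With a binary outcome, the Pearson coefficient is equivalent to the point-biserial correlation, so the same helper works for any numeric feature in the dataset.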
Conclusion:
Applying and comparing multiple machine learning models facilitates early detection of heart failure. Because accuracy varied across the models, however, selecting the most appropriate one requires careful consideration of project requirements.