Predicting Cardiovascular Disease Using Ensemble Machine Learning Algorithms
Abstract
Cardiovascular diseases (CVDs) are the leading cause of death worldwide, making early detection and efficient treatment paramount. Traditional diagnostic techniques are invasive, time-consuming, and expensive, which limits their usability; machine learning (ML) therefore emerges as an attractive choice for disease prediction. The present study evaluates ML models for predicting heart disease on a UCI repository dataset with 14 key medical attributes, including age, gender, chest pain type, and blood pressure. Four major ML algorithms were used to classify patients according to their risk of heart disease: Random Forest, K-Nearest Neighbors (KNN), Logistic Regression, and Artificial Neural Networks (ANN). Of these, Random Forest had the highest recall, at 94%. Decision Trees, Naïve Bayes, Support Vector Machines (SVM), and XGBoost were also tested, with some classifiers attaining 100% accuracy. The work demonstrates the potential of ML-based models to improve early detection of CVD while limiting reliance on expensive clinical tests. Future directions include integration with deep learning, remote monitoring through wearables, and federated learning to enable secure healthcare analytics.
Keywords: Heart disease prediction, K-Nearest Neighbors, Logistic Regression, XGBoost.
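The abstract reports both recall and accuracy, and the two can diverge. As a minimal sketch of how they are computed for a binary heart-disease classifier, the helper below uses only the standard definitions: recall = TP / (TP + FN) and accuracy = correct / total. The function name `recall_and_accuracy` and the toy labels are hypothetical and are not taken from the study or its dataset.

```python
def recall_and_accuracy(y_true, y_pred):
    """Return (recall, accuracy) for binary labels where 1 = disease present."""
    pairs = list(zip(y_true, y_pred))
    # True positives: diseased patients the model flagged.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    # False negatives: diseased patients the model missed.
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return recall, accuracy


# Hypothetical toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]

recall, accuracy = recall_and_accuracy(y_true, y_pred)
print(f"recall={recall:.3f} accuracy={accuracy:.3f}")
```

In this toy case one diseased patient is missed, so recall (0.750) is lower than accuracy (0.875). For screening, a missed case is usually the costlier error, which is one reason to report recall alongside accuracy.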