Enhanced Regularized Polynomial XGBoost (ERP-XGB): Reducing Bias and Optimizing Performance in Cardiovascular Risk Prediction
Abstract

Cardiovascular diseases are among the leading causes of death globally, underscoring the need for machine learning models that are both accurate and fair in clinical decision-making. This study introduces the Enhanced Regularized Polynomial XGBoost (ERP-XGB) model, which integrates polynomial feature expansion with L1, L2, and gamma regularization terms to improve classification accuracy, address class imbalance, and reduce algorithmic bias. ERP-XGB was evaluated on four benchmark datasets: Heart Failure (299 samples), Heart Attack (1,319 samples), Heart Disease (917 samples), and BRFSS (253,679 samples). On the Heart Attack dataset, ERP-XGB achieved a ROC AUC of 99.59 ± 0.21%, accuracy of 96.97 ± 0.49%, F1 score of 97.73 ± 0.43%, precision of 96.30 ± 0.73%, and recall of 98.87 ± 0.47%, with an average runtime of 30.63 seconds. In terms of fairness, ERP-XGB reported an Equalized Odds (EO) score of 0.02 ± 0.01, a Disparate Impact (DI) of 0.96 ± 0.02, and Demographic Parity (DP) values of 0.61 ± 0.01 for the unprivileged group and 0.64 ± 0.01 for the privileged group. On the Heart Disease dataset, ERP-XGB performed even more strongly, achieving a perfect ROC AUC of 100.00 ± 0.00%, accuracy of 98.60 ± 0.43%, F1 score of 98.58 ± 0.37%, precision of 100.00 ± 0.00%, and recall of 97.29 ± 0.48%, with a runtime of 41.45 seconds. Fairness evaluation showed EO at 0.03 ± 0.01, DI at 1.78 ± 0.03, and DP values of 0.69 ± 0.01 for the unprivileged group and 0.38 ± 0.01 for the privileged group. On Heart Failure, ERP-XGB achieved a ROC AUC of 89.82 ± 0.02% and accuracy of 82.93 ± 0.03%, with strong fairness (DI = 0.91 ± 0.31). On BRFSS, it attained 90.57 ± 0.000% accuracy but showed lower recall (11.89 ± 0.004%) and fairness challenges (DI = 0.38 ± 0.03). These results confirm that ERP-XGB offers an effective balance between high predictive performance and robust fairness on clinical datasets, making it a promising tool for equitable cardiovascular disease diagnosis.
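The group-fairness metrics reported above (Demographic Parity, Disparate Impact, Equalized Odds) follow standard definitions and can be computed directly from predictions and a binary sensitive attribute. The sketch below is illustrative only, not the paper's implementation; all function and variable names are our own, and `sensitive == 1` is assumed to mark the privileged group:

```python
def group_rate(y_pred, mask):
    """Mean (positive-prediction) rate over entries where mask is True."""
    sel = [p for p, m in zip(y_pred, mask) if m]
    return sum(sel) / len(sel)

def fairness_metrics(y_true, y_pred, sensitive):
    """DP per group, DI ratio, and EO gap for binary labels and predictions."""
    priv = [s == 1 for s in sensitive]
    unpriv = [s == 0 for s in sensitive]
    # Demographic Parity: positive-prediction rate within each group
    dp_priv = group_rate(y_pred, priv)
    dp_unpriv = group_rate(y_pred, unpriv)
    # Disparate Impact: ratio of unprivileged to privileged rate (1.0 is parity)
    di = dp_unpriv / dp_priv
    # Equalized Odds gap: largest TPR/FPR difference between the groups
    def tpr_fpr(mask):
        tpr = group_rate(y_pred, [m and t == 1 for m, t in zip(mask, y_true)])
        fpr = group_rate(y_pred, [m and t == 0 for m, t in zip(mask, y_true)])
        return tpr, fpr
    tpr_p, fpr_p = tpr_fpr(priv)
    tpr_u, fpr_u = tpr_fpr(unpriv)
    eo = max(abs(tpr_p - tpr_u), abs(fpr_p - fpr_u))
    return {"dp_priv": dp_priv, "dp_unpriv": dp_unpriv, "di": di, "eo": eo}

# Toy example: 8 patients, first 4 privileged, last 4 unprivileged
m = fairness_metrics(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 0, 1, 0, 1, 1, 0, 0],
    sensitive=[1, 1, 1, 1, 0, 0, 0, 0],
)
```

Under these definitions, a DI near 1 and an EO gap near 0 indicate fair treatment of the two groups, which matches how the abstract reads its Heart Attack results (DI = 0.96, EO = 0.02).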
How to cite: Boughareb, D. (2025). Enhanced Regularized Polynomial XGBoost (ERP-XGB): Reducing Bias and Optimizing Performance in Cardiovascular Risk Prediction. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 14, e32367. https://doi.org/10.14201/adcaij.32367