Integrated Ensemble Strategy for Breast Cancer Detection Using Dimensionality Reduction Technique
Abstract
Breast cancer remains a critical global health concern, requiring advanced and reliable diagnostic methods for early detection and effective intervention. This work introduces an integrated ensemble framework that combines multiple dimensionality reduction (DR) techniques, namely Principal Component Analysis (PCA), Non-negative Matrix Factorization (NMF), and Singular Value Decomposition (SVD), with robust machine learning (ML) classifiers for improved breast cancer detection. The publicly available Wisconsin Breast Cancer Dataset (WBCD) was used, with data preprocessing that addresses missing values and anomalies through median imputation and class imbalance through stratified sampling. To mitigate overfitting and underfitting, dimensionality reduction was coupled with cross-validation and ensemble strategies. The predictive performance of Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Multi-Layer Perceptron (MLP) was systematically evaluated. Experimental results show that SVM consistently achieves a maximum accuracy of 97.9% across all applied DR techniques, while MLP and LR also reach 97.9% accuracy with PCA and NMF, though MLP performance varies with the selected DR method. The findings provide practical guidance for healthcare practitioners and researchers, supporting the adoption of explainable and scalable AI-driven diagnostic tools. Limitations include the reliance on a single dataset and the need for further validation on larger and more diverse clinical cohorts. Future work will focus on enhancing model interpretability, external validation, and real-world deployment in resource-constrained settings.
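The kind of pipeline the abstract describes (median imputation, a DR step, a classifier, stratified cross-validation) can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn's bundled `load_breast_cancer` data (the diagnostic variant of the Wisconsin dataset, which may differ from the WBCD version used in the study), a hypothetical choice of 10 components, an RBF-kernel SVM with default settings, and `TruncatedSVD` standing in for SVD. It is not expected to reproduce the reported 97.9% figure.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import NMF, PCA, TruncatedSVD
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled Wisconsin diagnostic data stands in for WBCD.
X, y = load_breast_cancer(return_X_y=True)

N_COMPONENTS = 10  # hypothetical value; the abstract does not state one

# The three DR techniques named in the abstract.
reducers = {
    "PCA": PCA(n_components=N_COMPONENTS, random_state=0),
    "NMF": NMF(n_components=N_COMPONENTS, init="nndsvda",
               max_iter=1000, random_state=0),
    "SVD": TruncatedSVD(n_components=N_COMPONENTS, random_state=0),
}

# Stratified folds keep the benign/malignant ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, reducer in reducers.items():
    model = make_pipeline(
        SimpleImputer(strategy="median"),  # median imputation of missing values
        MinMaxScaler(clip=True),           # keeps inputs in [0, 1], as NMF requires
        reducer,
        SVC(kernel="rbf"),
    )
    # Every step is refit inside each fold, so no test data leaks into training.
    fold_acc = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    scores[name] = fold_acc.mean()

for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Wrapping the imputer, scaler, and reducer in one pipeline means each is fitted on the training folds only, which is one common way to pair dimensionality reduction with cross-validation. Replacing `SVC` with another estimator (for example `LogisticRegression` or `MLPClassifier`) would extend the comparison to the other classifiers listed.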
Ansari, Z. A., Arif, M., Rajaboina, N. B., Shaikh, A. A., & Singh, Y. (2025). Integrated Ensemble Strategy for Breast Cancer Detection Using Dimensionality Reduction Technique. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 14, e31899. https://doi.org/10.14201/adcaij.31899