Optimizing Breast Cancer Prediction Accuracy Using the Random Forest Method
DOI:
https://doi.org/10.52436/1.jpti.802
Keywords:
Model accuracy, Medical diagnosis, Prediction optimization, Breast cancer, Random forest
Abstract
Breast cancer is a serious threat to women's health. Early detection and accurate diagnosis are essential for improving patient outcomes and reducing mortality, yet the accuracy and reliability of current diagnostic methods still need improvement. This study aims to optimize the Random Forest method for breast cancer prediction in order to improve the accuracy and efficiency of the diagnostic model. Using a Kaggle dataset covering 569 patients with 30 features describing breast cell characteristics, the study proceeds through data loading, preprocessing (cleaning and normalization), splitting into training and test data, training the Random Forest model, prediction, model evaluation with accuracy, precision, recall, F1-score, and AUC-ROC, and visualization of the results. The study achieved an accuracy of 96%, an improvement over the 95% achieved previously. The optimized Random Forest model performed very well in classifying breast cancer, with a precision of 98%, a recall of 93%, an F1-score of 95%, and a reported AUC of 1.00, which the study characterizes as perfect classification ability. The model can serve as an effective tool in the medical diagnosis of breast cancer, helping to improve patient outcomes and reduce mortality. The study therefore recommends the optimized Random Forest method as an effective and efficient tool for breast cancer prediction.
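The workflow summarized in the abstract (loading, normalization, train/test split, Random Forest training, prediction, and evaluation) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. It assumes scikit-learn's bundled breast cancer dataset as a stand-in for the Kaggle file, since it also has 569 samples and 30 features. The min-max scaling, the 80/20 stratified split, the 200 trees, and the seed are all assumptions, as the abstract does not state these settings, so the resulting numbers need not match the reported ones.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler


def run_pipeline(test_size=0.2, n_estimators=200, seed=42):
    """Train and evaluate a Random Forest; all defaults are assumed values."""
    # Stand-in data (assumption): scikit-learn's bundled set, 569 rows x 30 features.
    X, y = load_breast_cancer(return_X_y=True)
    # scikit-learn encodes malignant as 0; flip so malignant is the positive class.
    y = 1 - y

    # Hold out a stratified test set so both classes keep their proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)

    # Normalization is fitted on the training split only, then reused on the test split.
    model = make_pipeline(
        MinMaxScaler(),
        RandomForestClassifier(n_estimators=n_estimators, random_state=seed))
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)               # hard labels for threshold metrics
    y_score = model.predict_proba(X_test)[:, 1]  # malignant probability for AUC-ROC

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc_roc": roc_auc_score(y_test, y_score),
    }


if __name__ == "__main__":
    for name, value in run_pipeline().items():
        print(f"{name}: {value:.3f}")
```

Accuracy, precision, recall, and F1 are computed from the hard predictions at the default 0.5 threshold, while AUC-ROC is computed from the predicted probabilities. This is why a model can rank the two classes almost perfectly and still misclassify a few cases at a fixed threshold.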