OPTIMIZATION OF SUPPORT VECTOR MACHINE WITH SMOTE AND BAYESIAN METHOD FOR HEART FAILURE CLASSIFICATION
Abstract
This study applies an integrated approach to heart failure classification, combining the Synthetic Minority Over-sampling Technique (SMOTE) with Bayesian Optimization of a Support Vector Machine (SVM). The main objective is to address class imbalance in medical datasets and to improve the accuracy, sensitivity, and generalization of the classification model. The problem is urgent: cardiovascular diseases cause approximately 17.9 million deaths worldwide each year. Using a quantitative experimental approach, the study analyzes the "Heart Failure Prediction Dataset" from Kaggle, which consists of 918 records. The data were normalized and encoded, and SMOTE was then applied to the training set to balance the class distribution. This step raised model accuracy from 88.41% to 90.22% and minority-class recall from 0.82 to 0.88. Bayesian Optimization was then used to tune the SVM hyperparameters, yielding a final model whose accuracy of 89.13% is slightly below the SMOTE-only result but which demonstrated better generalization. Overall, the integrated approach improves the stability, sensitivity, and generalization of the model, supporting its use in clinical decision support systems for predicting heart failure.
Keywords: Bayesian optimization; heart failure; machine learning; SMOTE; SVM
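The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the study's code: the function names `smote` and `balance_training_set`, the toy data, and the choice of `k` are assumptions made for illustration, and a real experiment would more likely use a library implementation. The sketch only shows the core idea of SMOTE, which is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours, using the training split alone.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance_training_set(X_train, y_train, k=5, seed=0):
    """Oversample the smaller class of a binary training set to parity.
    The test set is deliberately not touched."""
    counts = {c: y_train.count(c) for c in set(y_train)}
    minority_label = min(counts, key=counts.get)
    majority_count = max(counts.values())
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    new_points = smote(minority, majority_count - len(minority), k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * len(new_points)


# Toy, already-normalized training data (hypothetical): 3 positives, 5 negatives.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
           [0.9, 1.0], [1.0, 0.8], [0.8, 0.9], [1.0, 1.0], [0.9, 0.7]]
y_train = [1, 1, 1, 0, 0, 0, 0, 0]

X_bal, y_bal = balance_training_set(X_train, y_train, k=2)
print(y_bal.count(0), y_bal.count(1))  # prints "5 5"
```

Balancing only the training split keeps synthetic records out of the evaluation data, so the reported accuracy and recall are still measured on real patient records.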