PERFORMANCE EVALUATION OF A GRADIENT BOOSTED TREES MODEL FOR PREDICTING COMORBIDITY STATUS IN GALLSTONE PATIENTS

Fasya Gilar Amali, Hasbi Firmansyah, Wahyu Asriani

Abstract


ABSTRACT (translated from Indonesian)

This study evaluates the performance of a Gradient Boosted Trees (GBT) model in predicting comorbidity status (present/absent) in gallstone patients. The data come from the UCI Machine Learning Repository and cover 319 individuals with 38 clinical features. The modelling process comprises data cleaning, feature transformation, a 70:30 split into training and test data, GBT training under a 10-fold cross-validation scheme, and evaluation with four main metrics: accuracy, classification error, weighted mean precision, and weighted mean recall. The experiments show that the GBT model reaches an accuracy of 82.29%, a classification error of 17.71%, a weighted mean precision of 89.63%, and a weighted mean recall of 72.58%. Closer analysis reveals perfect precision (100.00%) for the comorbidity class and perfect recall (100.00%) for the non-comorbidity class, whereas recall for the comorbidity class is only 45.16%, which indicates that the model is biased toward the majority class and is limited in detecting comorbid patients. These findings show that GBT is a promising approach as a decision-support tool for confirming comorbidity, but it is still far from ideal as an initial screening tool. Follow-up research is advised to apply class-imbalance handling techniques, hyperparameter optimisation, threshold adjustment, and comparison with other ensemble algorithms in order to improve the model's sensitivity and generalisation.

Keywords: Comorbidity, Gallstone, Gradient Boosted Trees, Classification, RapidMiner, Weighted Mean Precision

 

ABSTRACT

 

This study aims to evaluate the performance of a Gradient Boosted Trees (GBT) model for predicting comorbidity status (present/absent) in gallstone patients. The dataset, obtained from the UCI Machine Learning Repository, comprises 319 individuals with 38 clinical features. The modelling pipeline includes data cleaning, feature transformation, a 70:30 train–test split, GBT training with 10-fold cross-validation, and evaluation using four metrics: accuracy, classification error, weighted mean precision, and weighted mean recall. Experimental results show that the GBT model achieves an accuracy of 82.29%, a classification error of 17.71%, a weighted mean precision of 89.63%, and a weighted mean recall of 72.58%. Further analysis reveals perfect precision (100.00%) for the comorbidity class and perfect recall (100.00%) for the non-comorbidity class, while recall for the comorbidity class is only 45.16%. This pattern indicates bias toward the majority class and limited sensitivity for detecting comorbid patients. These findings suggest that GBT is promising as a decision-support tool to confirm the presence of comorbidity, but less suitable as a primary screening tool. Future research should apply class-imbalance handling, hyperparameter optimisation, threshold adjustment, and comparison with alternative ensemble algorithms to improve sensitivity and generalisability.
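The relationship between the four reported metrics can be checked with a short calculation. The sketch below is illustrative only: the abstract does not publish a confusion matrix, so the counts used here (14 true positives, 17 false negatives, 0 false positives, 65 true negatives) are an assumption, chosen because they match a test set of about 96 patients (30% of 319) and reproduce every reported percentage. The function name `metrics` is hypothetical. Under this assumption, the reported "weighted mean" precision and recall coincide with the simple equal-weight average over the two classes.

```python
def metrics(tp, fn, fp, tn):
    """Return percentage metrics (2 decimals) for a binary confusion matrix.

    The positive class is 'comorbidity', the negative class 'no comorbidity'.
    """
    total = tp + fn + fp + tn
    recall_pos = tp / (tp + fn)        # sensitivity for comorbid patients
    recall_neg = tn / (tn + fp)        # recall for non-comorbid patients
    precision_pos = tp / (tp + fp)     # precision for comorbid predictions
    precision_neg = tn / (tn + fn)     # precision for non-comorbid predictions
    values = {
        "accuracy": (tp + tn) / total,
        "classification_error": (fp + fn) / total,
        "recall_comorbid": recall_pos,
        "recall_non_comorbid": recall_neg,
        "precision_comorbid": precision_pos,
        "precision_non_comorbid": precision_neg,
        # Equal-weight average over the two classes
        "mean_precision": (precision_pos + precision_neg) / 2,
        "mean_recall": (recall_pos + recall_neg) / 2,
    }
    return {name: round(100 * v, 2) for name, v in values.items()}


# ASSUMED counts (not stated in the abstract), inferred from the percentages.
reported = metrics(tp=14, fn=17, fp=0, tn=65)

for name, value in reported.items():
    print(f"{name}: {value}%")
```

If these counts are right, all 17 errors are comorbid patients predicted as non-comorbid, which is why accuracy stays above 82% while sensitivity for the comorbidity class is below 50%.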

 

Keywords: Comorbidity; Gallstone; Gradient Boosted Trees; Classification; RapidMiner; Weighted Mean Precision.


Full Text:

PDF [21-37]

DOI: https://doi.org/10.31326/sistek.v8i1.2641



JOURNAL IDENTITY

Journal Name:  Journal Information System and Science Technology

Jurnal Sistem Informasi dan Sains Teknologi


e-ISSN: 2684-8260
Publisher: Program Studi Sistem Informasi, Universitas Trilogi, Jakarta Selatan, Indonesia
Publication Schedule: February and August
Language: Indonesian and English
APC: Free of charge (submission, publishing) 
Indexing: Google Scholar, Garuda, Neliti, One Search, Base, DRJI, Road, Crossref, Index Copernicus, WorldCat, Scilit, Dimensions (find by DOI article)
OAI address: http://trilogi.ac.id/journal/ks/index.php/SISTEK/oai?verb=ListRecords&metadataPrefix=oai_dc
Collaboration Partners: Indonesian Association of Higher Education in Informatics and Computing (APTIKOM)

Contact: sistek@trilogi.ac.id (WhatsApp number: +628192454119)

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Journal Information System and Science Technology (Jurnal Sistem Informasi dan Sains Teknologi) is published by the Information System Department, Trilogi University, South Jakarta, Indonesia.
