COMPARISON OF CLUSTERING MODELS FOR GROUPING LIFESTYLE PATTERNS AND OBESITY FACTORS

  • Khalid Al Mas Ud, Sriwijaya University
  • Fathoni Fathoni, Sriwijaya University
  • Hafiz Muhammad Kurniawan, INTI International University
Keywords: Agglomerative, Clustering, GMM, K-means, Lifestyle patterns, Obesity

Abstract

Obesity is an escalating global health concern, and unhealthy lifestyle patterns contribute significantly to its development. This study evaluates and compares three clustering techniques for grouping lifestyle patterns and obesity-related factors: K-Means, Agglomerative Clustering, and the Gaussian Mixture Model (GMM). The data come from the Food Nutrition dataset, which includes variables on dietary habits, physical activity, and socio-economic status. The three methods were assessed with three evaluation metrics: Silhouette Score, Davies-Bouldin Index (DBI), and Calinski-Harabasz Index (CHI). K-Means achieved the best cluster separation, with a Silhouette Score of 0.5559, while GMM was more flexible in handling more complex data. Agglomerative Clustering produced acceptable results, but its clusters overlapped more than those of the other two methods. These findings offer guidance on selecting a clustering technique suited to the characteristics of the data.
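
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the standardization step, the choice of three clusters, and the synthetic `make_blobs` data standing in for the Food Nutrition dataset are all assumptions made for illustration, and the helper name `compare_clusterings` is hypothetical.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def compare_clusterings(X, n_clusters=3, seed=0):
    """Fit three clustering models and score each with three internal metrics.

    n_clusters=3 and the scaling step are assumptions, not values from the study.
    """
    # Standardize features so no single variable dominates the distances.
    X_scaled = StandardScaler().fit_transform(X)

    models = {
        "K-Means": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "Agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "GMM": GaussianMixture(n_components=n_clusters, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        # All three estimators expose fit_predict, which returns hard labels.
        labels = model.fit_predict(X_scaled)
        results[name] = {
            # Higher is better, range [-1, 1].
            "silhouette": silhouette_score(X_scaled, labels),
            # Lower is better, minimum 0.
            "dbi": davies_bouldin_score(X_scaled, labels),
            # Higher is better, unbounded above.
            "chi": calinski_harabasz_score(X_scaled, labels),
        }
    return results


# Synthetic stand-in data (assumption): 300 samples, 4 features, 3 groups.
X_demo, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)
scores = compare_clusterings(X_demo)

for name, m in scores.items():
    print(
        f"{name:14s} silhouette={m['silhouette']:.4f} "
        f"DBI={m['dbi']:.4f} CHI={m['chi']:.1f}"
    )
```

The three metrics point in different directions: a higher Silhouette Score and CHI indicate better-separated clusters, whereas a lower DBI does. Scores from synthetic data like this say nothing about the values reported in the study.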

           

 



Published
2025-12-22
How to Cite
Al Mas Ud, K., Fathoni, F., & Muhammad Kurniawan, H. (2025). COMPARISON OF CLUSTERING MODELS FOR GROUPING LIFESTYLE PATTERNS AND OBESITY FACTORS. JURTEKSI (jurnal Teknologi Dan Sistem Informasi), 12(1), 61 - 68. https://doi.org/10.33330/jurteksi.v12i1.4265