Application of SMOTE in Multiclass Body Mass Index Classification: A Study on Data Imbalance and Model Performance

Selly Anastassia Amellia Kharis; Melisa Arisanty; Arman Haqqi Anna Zili

doi:10.33830/isst.v4i1.5229

Authors

Selly Anastassia Amellia Kharis Universitas Terbuka, Mathematics Study Program, South Tangerang, Banten, Indonesia, 15437
Melisa Arisanty Universitas Terbuka, Library and Information Science Study Program, South Tangerang, Banten, Indonesia, 15437
Arman Haqqi Anna Zili Department of Mathematics, Universitas Indonesia, Depok, West Java, Indonesia, 16424

DOI:

https://doi.org/10.33830/isst.v4i1.5229

Keywords:

body mass index, classification, decision tree, K-nearest neighbor, logistic regression, random forest, SMOTE, support vector machine

Abstract

The Body Mass Index (BMI) is a widely used measure that estimates body fat from a person's height and weight, and it can be used to monitor and describe a person's nutritional status. BMI classification is not limited to the binary case; it extends naturally to multiclass scenarios. A common challenge in BMI classification, however, is class imbalance: some classes have significantly fewer instances than others. This research evaluates the performance of multiclass BMI classification with and without the Synthetic Minority Over-Sampling Technique (SMOTE). The study divides BMI into six groups: extremely weak, weak, normal, overweight, obesity, and extreme obesity. The machine learning algorithms used in this research are Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression. After applying SMOTE, the F1-score improved significantly across all models, with SVM increasing from 82.72% to 93.67% and KNN from 87.02% to 94.95%. Similarly, overall accuracy improved by up to 7.84% in the SVM model. These results demonstrate that SMOTE effectively enhances the predictive performance of multiclass classification, especially in recognizing underrepresented classes.
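To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea in plain Python: each synthetic minority sample is interpolated between an existing sample and one of its nearest same-class neighbours. It is an illustration only, not the study's implementation; the function name `smote_oversample`, the parameter `k`, the choice to balance every class up to the majority count, and the toy height/weight rows are all assumptions introduced here.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample every non-majority class up to the majority class size.

    X is a list of numeric feature vectors, y the matching list of labels.
    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its k nearest neighbours from the same class.
    (Hypothetical helper for illustration, not the study's code.)
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point, excluding itself
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic = [b + gap * (n - b) for b, n in zip(base, neighbour)]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Toy (height in cm, weight in kg) rows with made-up labels; not the study's data.
X = [[170, 65], [172, 68], [168, 63], [175, 70], [171, 66], [169, 64],
     [180, 50], [178, 49],
     [160, 95], [162, 98], [158, 93]]
y = ["normal"] * 6 + ["weak"] * 2 + ["obesity"] * 3

X_res, y_res = smote_oversample(X, y, k=2)
print(Counter(y), "->", Counter(y_res))
```

In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would normally be used instead of hand-written code. Oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data; the abstract does not state how the study handled this.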
