Application of SMOTE in Multiclass Body Mass Index Classification:
A Study on Data Imbalance and Model Performance
DOI:
https://doi.org/10.33830/isst.v4i1.5229
Keywords:
body mass index, classification, decision tree, K-nearest neighbor, logistic regression, random forest, SMOTE, support vector machine
Abstract
Body Mass Index (BMI) is a widely used measure that estimates body fat from a person's height and weight, and it can be used to monitor and describe a person's nutritional status. BMI classification is not limited to the binary case; it extends naturally to multiclass scenarios. A common challenge, however, is imbalance in the class distribution, where some classes have far fewer instances than others. This research evaluates the performance of multiclass BMI classification with and without the Synthetic Minority Over-Sampling Technique (SMOTE). The study divides BMI into six groups: extremely weak, weak, normal, overweight, obesity, and extreme obesity. The machine learning algorithms used are Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression. After applying SMOTE, the F1-score improved significantly across all models, with SVM rising from 82.72% to 93.67% and KNN from 87.02% to 94.95%. Overall accuracy likewise improved, by up to 7.84% in the SVM model. These results indicate that SMOTE enhances the predictive performance of multiclass classification, especially in recognizing underrepresented classes.
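To make the oversampling step concrete, the following is a minimal sketch of the SMOTE idea in plain Python: each minority class is grown by interpolating between a sample and one of its nearest same-class neighbours until all classes match the largest one. It is an illustration only, not the authors' pipeline; the function name `smote_oversample`, the parameter defaults, and the toy (height in cm, weight in kg) records are assumptions made for this example. In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would normally be used instead.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """SMOTE-style balancing: add synthetic points to every class
    until each class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no same-class neighbour to interpolate towards
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point
            others = [m for m in members if m is not base]
            others.sort(key=lambda m: math.dist(base, m))
            neighbour = rng.choice(others[:k])
            # New point at a random position on the segment base -> neighbour
            gap = rng.random()
            synthetic = tuple(b + gap * (nb - b) for b, nb in zip(base, neighbour))
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data (height cm, weight kg); not taken from the study.
X = [(170, 65), (175, 70), (168, 62), (180, 75), (172, 68), (165, 60),
     (160, 95), (158, 98), (185, 45), (182, 47)]
y = ["normal"] * 6 + ["obesity"] * 2 + ["extremely weak"] * 2

X_bal, y_bal = smote_oversample(X, y, k=3, seed=42)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

When a model is evaluated, oversampling of this kind is usually applied to the training split only, so that the test set keeps the original class distribution and contains no synthetic records.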
License
Copyright (c) 2025 Selly Anastassia Amellia Kharis, Melisa Arisanty, Arman Haqqi Anna Zili

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.