A feature selection approach to determine obesity using machine learning methods
Abstract
Obesity is a growing health concern in developing and low-income countries. It is
recognized as a complex health issue caused by various factors, including genetics
and behaviour. Obesity is not just a matter of physique or appearance; it is a
chronic medical condition that exposes the body to many diseases and shortens life.
Obesity frequently leads to a wide variety of other disorders, including
cardiovascular disease, hypertension, diabetes, numerous malignancies, and
more. Developed countries are deeply concerned about these health issues and have
already undertaken some measures. In contrast, people in low- and middle-income
countries are often still unaware of these risks and will face significant health
challenges in the future. In Bangladesh specifically, many people have diabetes,
and many recent deaths from heart disease and cancer could have been prevented had
people been more health conscious. Recent studies say that the young generation is more prone
to obesity because they are more influenced by Western lifestyles, eat a lot of junk
food, and spend most of their time on the internet. For this research, we collected
data from more than 500 people across different groups around Bangladesh. We
aim to predict the BMI ranges at which people are more prone
to diseases. To predict the outcome, we analyzed our sample dataset using
machine learning approaches, namely Naive Bayes, Random Forest, Decision Tree,
k-nearest neighbours (KNN), and Logistic Regression. Among these algorithms,
Decision Tree gave the best accuracy, 96.67%. To select the essential variables
from the dataset, we used the BorutaShap wrapper feature selection method.
This algorithm extracts a better subset of attributes from a large volume of data and
lets the model train faster. Because the Boruta algorithm identifies the key features
and reduces the model size, our dataset became easier to train on, and we obtained
better accuracy with it in our research. This research
will help the people of Bangladesh understand obesity and its detrimental effects.
Moreover, it will help them become more conscious of their health conditions and
anticipate which BMI levels put them at risk.
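
The workflow summarized above (wrapper feature selection followed by a comparison of classifiers) can be illustrated with a minimal sketch. It makes several assumptions that do not come from the study: the data are synthetic (generated with scikit-learn's make_classification, not the Bangladesh survey), the selection step is a simplified Boruta-style shadow-feature test based on Random Forest importances rather than the BorutaShap package itself (which ranks features by SHAP values and applies a statistical test), and the helper name shadow_feature_select and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def shadow_feature_select(X, y, n_rounds=10, seed=0):
    """Simplified Boruta-style selection (hypothetical helper, not BorutaShap).

    A feature is kept if its Random Forest importance beats the best
    shuffled ("shadow") feature in more than half of the rounds.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    hits = np.zeros(n_features, dtype=int)
    for r in range(n_rounds):
        # Shadow features: each column shuffled on its own, which breaks
        # any relationship between that column and the target.
        shadows = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(n_features)]
        )
        forest = RandomForestClassifier(n_estimators=100, random_state=seed + r)
        forest.fit(np.hstack([X, shadows]), y)
        importances = forest.feature_importances_
        threshold = importances[n_features:].max()  # best shadow importance
        hits += importances[:n_features] > threshold
    return hits > n_rounds / 2


# Synthetic stand-in data: with shuffle=False the first 4 columns are the
# informative ones and the remaining 8 are pure noise.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)

mask = shadow_feature_select(X, y)
X_selected = X[:, mask]

# The five classifier families named in the abstract, with default settings.
models = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy on the selected features only.
scores = {
    name: cross_val_score(model, X_selected, y, cv=5).mean()
    for name, model in models.items()
}

print("Selected feature indices:", np.flatnonzero(mask))
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The accuracies printed by this sketch describe the synthetic data only and are not comparable to the 96.67% reported for the study's dataset.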