Machine learning based stream selection of secondary school students in Bangladesh
Abstract
A strong civilization is built on a strong foundation, and education plays a vital
role in acquiring the necessary information and skills for success in life. This thesis
focuses on the education system in Bangladesh, which is divided into three levels:
primary (PEC), middle school (JSC), and secondary school (SSC). The
selection of a stream after the eighth grade is crucial for students’ higher studies and
career planning, with three options available: Science, Business Studies, and Humanities.
To address the challenge of stream selection based solely on PEC and JSC
results, we have collected a dataset from various Bangladeshi schools, comprising
student records that include subject-wise results, parents’ academic qualifications,
parents’ professions, parents’ monthly income, sibling information, district, etc. In
this study, we employ a series of machine learning regression algorithms to analyze
the data. Furthermore, we use R² scores and other performance metrics to evaluate
and validate the models’ performance. Among the regressors, the gradient boosting
algorithm performs best for the Science stream, achieving an
R² score of 0.34540. For the Business Studies stream, the Support Vector Machine
performs best, with an R² score of 0.534092. Finally, for the
Humanities stream, extreme gradient boosting achieves an R² score of 0.80337.
To enhance the interpretability of our models, we apply the Local
Interpretable Model-agnostic Explanations (LIME) technique. The analysis and
findings of this research are expected to assist prospective students and stakeholders
in making informed decisions regarding stream selection, ensuring alignment with
their future goals and aspirations.
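
As an illustrative aside, the R² score used above to compare regressors can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the thesis's actual evaluation code: the function names `r2_score` and `best_model` and all toy values are hypothetical and chosen only for illustration. It uses the standard definition R² = 1 - SS_res / SS_tot, under which a perfect predictor scores 1 and a predictor that always outputs the mean of the targets scores 0.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean target.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


def best_model(scores):
    """Return the name of the model with the highest R^2 (hypothetical helper)."""
    return max(scores, key=scores.get)


# Toy targets (made-up values, not taken from the thesis dataset).
y_true = [3.0, 4.0, 5.0, 6.0]
perfect = list(y_true)          # predictions identical to the targets
mean_only = [4.5] * len(y_true)  # always predicts the mean of y_true

print(r2_score(y_true, perfect))
print(r2_score(y_true, mean_only))
```

Read this way, the reported scores are measured against the mean-only baseline at 0 and a perfect fit at 1, so a higher score indicates that the regressor accounts for more of the variation in the evaluated data.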