Analysis and manipulation of the data set obtained from the study of pre-primary education development using machine learning
Abstract
A data set on pre-primary education consisting of 1,114 records was collected from IED BRAC University. At the end of the study, the students took a test marked out of eighteen and were scored according to their performance. The entire data set was divided into three subsets. The first subset consisted of the socio-economic attributes and the test result. The second contained other attributes that could have affected the results, such as the education of parents and siblings and the average education of the student's family. The third consisted of the marks scored on each question and the total mark obtained, to test the students' readiness. The results were classified into four groups: fully prepared, partially prepared, unprepared, and needs help. Each data set was converted to the Attribute-Relation File Format (ARFF), which is compatible with WEKA, before being loaded. The data sets were then analyzed using several machine learning algorithms. The main goal of the analysis was to test whether pre-primary education prepares students for primary education. The algorithms used were a support vector machine trained with Sequential Minimal Optimization (SMO), a multilayer perceptron trained with backpropagation, Naive Bayes, random tree, and random forest. The results obtained from each algorithm were compared, and the best-performing algorithm was selected.
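As an illustration of the file preparation step, the following is a minimal sketch of how a table with a four-way readiness label might be written in the format WEKA reads. The relation name, the attribute names (`parent_education_years`, `total_mark`), the underscore spelling of the class labels, and the two sample rows are all assumptions made for the example; they are not taken from the study's data.

```python
# Minimal sketch: serialize rows to WEKA's ARFF text format.
# Attribute names and sample rows are hypothetical, not from the study.

# The four readiness groups named in the abstract, written without spaces
# so they need no quoting in the ARFF nominal declaration.
READINESS_LABELS = ["fully_prepared", "partially_prepared", "unprepared", "needs_help"]


def to_arff(relation, numeric_attributes, rows):
    """Return ARFF text for rows shaped as (numeric values..., readiness label)."""
    lines = [f"@relation {relation}", ""]
    for name in numeric_attributes:
        lines.append(f"@attribute {name} numeric")
    # The class is declared as a nominal attribute listing every allowed label.
    lines.append("@attribute readiness {" + ",".join(READINESS_LABELS) + "}")
    lines.append("")
    lines.append("@data")
    for *values, label in rows:
        if label not in READINESS_LABELS:
            raise ValueError(f"unknown readiness label: {label!r}")
        if len(values) != len(numeric_attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(str(v) for v in values) + "," + label)
    return "\n".join(lines) + "\n"


# Made-up rows for illustration only.
example_rows = [
    (3, 16, "fully_prepared"),
    (1, 7, "needs_help"),
]
arff_text = to_arff(
    "preprimary_readiness",
    ["parent_education_years", "total_mark"],
    example_rows,
)
print(arff_text)
```

Declaring the class as a nominal attribute, rather than a string or numeric one, is what lets classifiers treat the four groups as discrete categories.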