Detection of Acute Lymphocytic Leukemia (ALL) and its type by image processing and machine learning
Abstract
Cancer starts when cells in the body begin to grow uncontrollably. Cells in nearly any part
of the body can become cancerous and can spread to other areas of the body. Chronic
Lymphocytic Leukemia (CLL) originates in the bone marrow, where it causes the uncontrolled
growth of a large number of abnormal cells. Over time, these leukemia cells enter the blood
and cause fatal disease. There are four main types of leukemia: Acute Lymphoblastic
Leukemia (ALL), Acute Myeloid Leukemia (AML), Chronic Lymphocytic Leukemia (CLL) and
Chronic Myeloid Leukemia (CML). In this paper, we
propose a methodology to detect leukemia with the help of image processing and machine
learning. We use a two-stage Otsu optimization approach, a Lab color space algorithm and
a wrapper method. To preprocess the images so that they fit the classifiers, the Image to
Feature Vector and Label Encoding methods have been applied to the dataset. Furthermore,
we applied various machine learning algorithms, namely Logistic Regression, Decision Tree,
Gaussian Naive Bayes and K-Nearest Neighbor (KNN), as well as a neural network algorithm,
the Convolutional Neural Network (CNN). We made an effort to build a comprehensive
comparison among these machine learning algorithms.
Although this has been done in past research papers, in this paper we collected a small set
of image data from Dhaka Medical College and preprocessed it together with another public
image dataset named ADL to attain a promising test accuracy. Moreover, in this research
paper we tried to challenge a common recent belief, namely that the Convolutional Neural
Network (CNN) is the only appropriate model to train on an image dataset. We implemented
an AdaBoost classifier
which achieved a test accuracy of 87% and a high cross-validation accuracy of 90%. We also
used a Voting Classifier that combines the AdaBoost, Gaussian Naive Bayes and K-Nearest
Neighbor (KNN) classifiers; it achieved a test accuracy of 89%, close to the 90% of the
Convolutional Neural Network (CNN). Thus, we conclude that simple machine learning
algorithms can be trained on an image dataset for pattern recognition at low computational
cost while still reaching high accuracy.
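
To make the preprocessing steps named in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the implementation used in this paper. It assumes the standard single-threshold Otsu method (the two-stage Otsu optimization is not specified here), a simple row-by-row flattening as the image-to-feature-vector step, and an alphabetical integer mapping as the label encoding. The function names, the toy 4x4 grayscale image and the class names are hypothetical and serve only as illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Standard Otsu: return the gray level t that maximizes the
    between-class variance. Pixels <= t form one class, pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]          # number of pixels with value <= t
        sum_bg += t * hist[t]    # intensity sum of those pixels
        if w_bg == 0:
            continue             # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                # all pixels are in the lower class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def image_to_feature_vector(image):
    """Flatten a 2D image (list of rows) into a 1D feature vector."""
    return [value for row in image for value in row]


def label_encode(labels):
    """Map class names to integers in sorted order of the class names."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


# Hypothetical toy image: dark values on the left, bright values on the right.
toy_image = [
    [10, 12, 200, 210],
    [11, 13, 205, 220],
    [12, 10, 215, 200],
    [30, 11, 210, 205],
]

t = otsu_threshold(image_to_feature_vector(toy_image))
# Binarize with the Otsu threshold, then flatten for use by a classifier.
mask = [[1 if value > t else 0 for value in row] for row in toy_image]
features = image_to_feature_vector(mask)

# Hypothetical class names, one per image in a dataset.
encoded, classes = label_encode(["ALL", "healthy", "ALL"])
```

In this sketch the resulting feature vector has one entry per pixel, so every image must be resized to the same dimensions before it is passed to classifiers such as those compared in this paper.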