Colorectal cancer detection using transformer-based approach with attention mechanism
Abstract
Image classification is the process of labeling pixels or feature vectors within
an image according to predefined rules, using spectral or textural features.
Computer vision research centers on image classification, localization,
segmentation, and object recognition. Image classification remains one of the
most challenging of these tasks, and it underlies many object recognition problems.
It is applied in medical imaging, satellite object tracking, traffic
management, brake light detection, and many other fields.
“Maximum likelihood” and “minimum distance” are two popular image classification
algorithms that rely on training data. Maximum likelihood classification uses the
mean and standard deviation of the image’s spectral and textural indices as
statistical evidence. Assuming a normal distribution for each class’s pixel data,
it computes the probability that each pixel belongs to each class. Many traditional
statistical approaches and probabilistic relationships are also applied. Each pixel
is then assigned to the class for which its probability is highest.
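
The two training-data-based classifiers described above can be sketched as follows. This is a minimal illustration and not code from this work: it assumes independent features (a diagonal covariance) and equal class priors, and the function names `fit_class_stats`, `max_likelihood_classify`, and `min_distance_classify`, as well as the toy data, are hypothetical.

```python
import numpy as np


def fit_class_stats(features, labels):
    """Estimate the per-class mean and standard deviation of each feature."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    stds = np.stack([features[labels == c].std(axis=0) for c in classes])
    # Guard against zero variance so the likelihood stays finite.
    stds = np.maximum(stds, 1e-6)
    return classes, means, stds


def max_likelihood_classify(pixels, classes, means, stds):
    """Assign each pixel to the class with the highest Gaussian log-likelihood.

    Assumption: features are independent and class priors are equal, so the
    constant term of the normal density is dropped.
    """
    z = (pixels[:, None, :] - means[None, :, :]) / stds[None, :, :]
    log_lik = -0.5 * (z ** 2).sum(axis=2) - np.log(stds).sum(axis=1)[None, :]
    return classes[np.argmax(log_lik, axis=1)]


def min_distance_classify(pixels, classes, means):
    """Assign each pixel to the class whose mean is nearest (Euclidean)."""
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]


# Toy training pixels with two features each (illustrative values only).
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0],
                    [10.0, 10.0], [11.0, 11.0], [10.0, 11.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])

classes, means, stds = fit_class_stats(train_x, train_y)
new_pixels = np.array([[0.5, 0.5], [10.5, 10.5]])
print(max_likelihood_classify(new_pixels, classes, means, stds))
print(min_distance_classify(new_pixels, classes, means))
```

Minimum distance uses only the class means, whereas maximum likelihood also weighs each feature by the class's spread, so the two can disagree when classes have very different variances.
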
In this study, we used the attention-based Vision Transformer (ViT) to distinguish
diseased and healthy colons. We evaluated several models to find the best
performer, comparing the CNN models VGG16, VGG19, ResNet101, and ResNet50 against
our chosen transformer model, ViT-16, which supports attention-based techniques.
The models were compared on validation accuracy, validation loss, precision,
recall, and F1 score, and the confusion matrix further indicated that ViT-16
performed well. In this report, ViT-16 achieved the best validation accuracy,
validation loss, precision, recall, and F1 score, while ResNet101 ranked second.
Thus, ViT-16, which uses the attention mechanism, is the best-performing model
among those compared for colorectal cancer detection.
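
For reference, the comparison metrics named above can be derived from a binary confusion matrix as in the sketch below. It is an illustration only: the helper names `confusion_counts` and `binary_metrics` are hypothetical, the labels are made-up toy values rather than results from this work, and treating the cancer class as the positive label `1` is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only): 1 = cancer, 0 = healthy.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(*confusion_counts(y_true, y_pred)))
```

In a screening setting, recall on the cancer class is usually the metric to watch most closely, since a false negative corresponds to a missed case.
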