Colorectal cancer detection using transformer-based approach with attention mechanism

Sarker, Showmen; Fardin, Sadman; Rahman, Saik; Islam, Md.Tanjimul; Sifat, Golam

Date

2022-09

Publisher

Brac University

Abstract

Image classification is the process of labeling pixels or vectors within an image according to preset rules, using spectral or textural features. It is one of the central challenges in computer vision, alongside localization, segmentation, and object recognition, and it underlies many object recognition problems. Its applications include medical imaging, satellite object tracking, traffic management, and brake light detection, among many other fields. “Maximum likelihood” and “minimum distance” are two popular classification algorithms based on training data. Maximum likelihood classification uses the mean and standard deviation of an image’s spectral and textural indices: a normal distribution is fitted to each class’s pixel data, the probability of each pixel belonging to each class is calculated, and each pixel is assigned to the class with the highest probability. Many other traditional statistical approaches and probabilistic relationships are also applied.
In this study, we used the attention-based Vision Transformer (ViT) to distinguish afflicted and healthy colons. We evaluated several CNN models used for colorectal cancer detection, namely VGG16, VGG19, ResNet101, and ResNet50, and compared their results with our chosen transformer model, ViT-16. The models were compared on validation accuracy, validation loss, precision, recall, and F1 score, and the confusion matrix gave further evidence that ViT-16 performed well. In this report, ViT-16 achieved the best validation accuracy, validation loss, precision, recall, and F1 score, while ResNet101 ranked second. We therefore conclude that ViT-16, which uses the attention mechanism, is the best of the compared models for colorectal cancer detection.
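
The maximum likelihood and minimum distance classifiers mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the synthetic three-feature "pixels", and the assumption of independent features (one mean and standard deviation per feature and class) are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_gaussian_classes(pixels, labels):
    """Estimate a per-class mean and standard deviation for each feature."""
    classes = np.unique(labels)
    means = np.array([pixels[labels == c].mean(axis=0) for c in classes])
    # A small constant keeps the standard deviation strictly positive.
    stds = np.array([pixels[labels == c].std(axis=0) for c in classes]) + 1e-6
    return classes, means, stds

def classify_max_likelihood(pixels, classes, means, stds):
    """Assign each pixel to the class with the highest Gaussian log-likelihood.

    Assumes independent features; the constant term shared by all
    classes is omitted because it does not change the argmax.
    """
    # Broadcast to shape (n_pixels, n_classes, n_features).
    z = (pixels[:, None, :] - means[None, :, :]) / stds[None, :, :]
    log_lik = -0.5 * (z ** 2).sum(axis=2) - np.log(stds).sum(axis=1)[None, :]
    return classes[np.argmax(log_lik, axis=1)]

def classify_min_distance(pixels, classes, means):
    """Assign each pixel to the class whose mean is nearest (Euclidean)."""
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

# Synthetic training data (hypothetical): two well-separated classes.
rng = np.random.default_rng(0)
train = np.vstack([
    rng.normal(0.2, 0.05, (50, 3)),  # class 0
    rng.normal(0.8, 0.05, (50, 3)),  # class 1
])
train_labels = np.array([0] * 50 + [1] * 50)

classes, means, stds = fit_gaussian_classes(train, train_labels)
test_pixels = np.array([[0.18, 0.22, 0.20], [0.79, 0.81, 0.80]])
ml_pred = classify_max_likelihood(test_pixels, classes, means, stds)
md_pred = classify_min_distance(test_pixels, classes, means)
```

On data this cleanly separated both classifiers agree. They can differ when class variances are unequal, because only the maximum likelihood rule accounts for the spread of each class.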

Keywords

Image classification; Object recognition; Traditional statistical methods; Maximum likelihood; Traffic management; Categorization applications

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2022.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 52-55).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1495]