dc.contributor.advisor | Alam, Md. Ashraful | |
dc.contributor.author | Bhuiyan, Mahdi Hasan | |
dc.contributor.author | Haldar, Sumit | |
dc.contributor.author | Chowdhury, Maisha Shabnam | |
dc.contributor.author | Bushra, Nazifa | |
dc.contributor.author | Jilan, Tahsin Zaman | |
dc.date.accessioned | 2024-05-20T06:48:30Z | |
dc.date.available | 2024-05-20T06:48:30Z | |
dc.date.copyright | ©2024 | |
dc.date.issued | 2024-01 | |
dc.identifier.other | ID: 20101541 | |
dc.identifier.other | ID: 20101544 | |
dc.identifier.other | ID: 20101459 | |
dc.identifier.other | ID: 20101536 | |
dc.identifier.other | ID: 20101581 | |
dc.identifier.uri | http://hdl.handle.net/10361/22888 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 33-36). | |
dc.description.abstract | Early detection of retinal diseases can help people avoid partial or complete blindness. In this research, we implement an interpretable diagnosis of retinal diseases using a hybrid model that combines VGG-16 and a Swin Transformer, and we visualize its predictions with Grad-CAM. Using Optical Coherence Tomography (OCT) images gathered from various sources, we develop a unique multi-label classification approach for diagnosing various retinal diseases. The approach uses a transformer-based hybrid architecture derived from the Vision Transformer (ViT), a model that classifies images. Recent competitive architectures for image classification build on the original Transformer concept. A ViT applies this architecture to patches of an image, often called visual tokens, using several embedding and tokenization techniques, and it can handle different data modalities. To accurately highlight the key regions in an image, we use gradient-weighted class activation mapping (Grad-CAM), a technique that provides visual explanations of deep model predictions in image classification, image captioning, and several other tasks. It explains network decisions by using the gradients from back-propagation as weights. Our model uses both VGG-16, a variant of Convolutional Neural Networks (CNN), and Swin Transformers, which we combine into a single hybrid model. When tested, the VGG-16 component achieved an accuracy of 0.8888, while the Vision Transformer component achieved an accuracy of 0.9139. After some fine-tuning, the hybrid model achieved an accuracy of 0.988. | en_US |
dc.description.statementofresponsibility | Mahdi Hasan Bhuiyan | |
dc.description.statementofresponsibility | Sumit Haldar | |
dc.description.statementofresponsibility | Maisha Shabnam Chowdhury | |
dc.description.statementofresponsibility | Nazifa Bushra | |
dc.description.statementofresponsibility | Tahsin Zaman Jilan | |
dc.format.extent | 41 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Disease detection | en_US |
dc.subject | Ocular diseases screening | en_US |
dc.subject | GradCAM | en_US |
dc.subject | Deep learning | en_US |
dc.subject | Vision transformers | en_US |
dc.subject | Retinal diseases | |
dc.subject.lcsh | Neural networks (Computer science) | |
dc.subject.lcsh | Eye--Diseases | |
dc.subject.lcsh | Deep learning (Machine learning) | |
dc.title | An interpretable diagnosis of retinal diseases using vision transformer and Grad-CAM | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B.Sc in Computer Science | |