
dc.contributor.advisor: Alam, Md. Ashraful
dc.contributor.author: Bhuiyan, Mahdi Hasan
dc.contributor.author: Haldar, Sumit
dc.contributor.author: Chowdhury, Maisha Shabnam
dc.contributor.author: Bushra, Nazifa
dc.contributor.author: Jilan, Tahsin Zaman
dc.date.accessioned: 2024-05-20T06:48:30Z
dc.date.available: 2024-05-20T06:48:30Z
dc.date.copyright: ©2024
dc.date.issued: 2024-01
dc.identifier.other: ID: 20101541
dc.identifier.other: ID: 20101544
dc.identifier.other: ID: 20101459
dc.identifier.other: ID: 20101536
dc.identifier.other: ID: 20101581
dc.identifier.uri: http://hdl.handle.net/10361/22888
dc.description: This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. [en_US]
dc.description: Cataloged from PDF version of thesis.
dc.description: Includes bibliographical references (pages 33-36).
dc.description.abstract: Early detection of retinal diseases can help people avoid partial or complete blindness. In this research, we implement an interpretable diagnosis of retinal diseases using a hybrid model that combines VGG-16 and Swin Transformer, and we visualize its predictions with Grad-CAM. Using Optical Coherence Tomography (OCT) images gathered from various sources, we develop a unique multi-label classification approach for diagnosing various retinal diseases. The hybrid architecture builds on the Vision Transformer (ViT), a Transformer-based model for image classification. Recent competitive architectures for image classification have adopted the original Transformer concept. ViT applies it to patches of an image, often called visual tokens; it can handle different data modalities and employs several embedding and tokenization techniques. To highlight the key regions in an image accurately, we use gradient-weighted class activation mapping (Grad-CAM), a technique for explaining deep model predictions in image classification, image captioning, and several other tasks. It explains network decisions by using the gradients from back-propagation as weights. Our model uses both VGG-16, a Convolutional Neural Network (CNN) architecture, and Swin Transformer, combined into a single hybrid model. In testing, the VGG-16 component reached an accuracy of 0.8888 and the Vision Transformer component reached 0.9139. After fine-tuning, the hybrid model reached an accuracy of 0.988. [en_US]
dc.description.statementofresponsibility: Mahdi Hasan Bhuiyan
dc.description.statementofresponsibility: Sumit Haldar
dc.description.statementofresponsibility: Maisha Shabnam Chowdhury
dc.description.statementofresponsibility: Nazifa Bushra
dc.description.statementofresponsibility: Tahsin Zaman Jilan
dc.format.extent: 41 pages
dc.language.iso: en [en_US]
dc.publisher: Brac University [en_US]
dc.rights: Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject: Disease detection [en_US]
dc.subject: Ocular diseases screening [en_US]
dc.subject: GradCAM [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Vision transformers [en_US]
dc.subject: Retinal diseases
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Eye--Diseases
dc.subject.lcsh: Deep learning (Machine learning)
dc.title: An interpretable diagnosis of retinal diseases using vision transformer and Grad-CAM [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Department of Computer Science and Engineering, Brac University
dc.description.degree: B.Sc in Computer Science

