dc.contributor.advisor | Alam, Md. Golam Rabiul | |
dc.contributor.advisor | Reza, Md. Tanzim | |
dc.contributor.author | Amin, Mahzabin Yasmin Binte | |
dc.contributor.author | Shammo, Weney Hasan | |
dc.contributor.author | Sayed, Jawad Bin | |
dc.contributor.author | Hossain, MD Junaied | |
dc.date.accessioned | 2023-12-31T05:38:13Z | |
dc.date.available | 2023-12-31T05:38:13Z | |
dc.date.copyright | 2023 | |
dc.date.issued | 2023-05 | |
dc.identifier.other | ID 18101479 | |
dc.identifier.other | ID 19101601 | |
dc.identifier.other | ID 21341025 | |
dc.identifier.other | ID 20101204 | |
dc.identifier.uri | http://hdl.handle.net/10361/22042 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 36-37). | |
dc.description.abstract | The intent of this paper is to make the process of interpreting and understanding the information within ultrasound images simpler and quicker by addressing the lack of techniques for automatically deciphering medical images. To this end, we propose a method of ultrasound image caption generation using AI that highlights the potential of machine translation in converting medical images into textual notations. The model is trained on an ultrasound image dataset of the abdominal region, including the uterus, myometrium, endometrium, and cervix, an area of the medical sector that remains inadequately addressed. Two pre-trained CNN models, VGG16 and Inception v3, are used to extract features from the ultrasound images. The encoder-decoder model then takes two kinds of input, one for each of its branches: the text sequence and the image features. Both a vanilla LSTM and a bidirectional LSTM are used to build the language generation model. An embedding layer followed by the LSTM layer processes the text input, and finally the outputs of the two branches are merged. | en_US |
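Note: the abstract describes a merge-style encoder-decoder captioning model (pre-extracted CNN image features in one branch, an embedding plus LSTM text branch in the other, merged before the word prediction). The following is a minimal sketch of such a model, assuming Keras/TensorFlow; the vocabulary size, caption length, embedding width, and the 4096-dimensional VGG16 feature vector are illustrative assumptions, not values taken from the thesis.

    # Minimal merge-architecture captioning model (sketch, not the thesis code)
    from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
    from tensorflow.keras.models import Model

    vocab_size = 5000   # assumed vocabulary size
    max_length = 30     # assumed maximum caption length

    # Image branch: features pre-extracted with VGG16 (assumed 4096-d fc2 output)
    image_input = Input(shape=(4096,))
    img = Dropout(0.5)(image_input)
    img = Dense(256, activation="relu")(img)

    # Text branch: embedding layer followed by an LSTM, as in the abstract
    text_input = Input(shape=(max_length,))
    txt = Embedding(vocab_size, 256, mask_zero=True)(text_input)
    txt = Dropout(0.5)(txt)
    txt = LSTM(256)(txt)

    # Merge the two branches and predict the next word of the caption
    merged = add([img, txt])
    decoder = Dense(256, activation="relu")(merged)
    output = Dense(vocab_size, activation="softmax")(decoder)

    model = Model(inputs=[image_input, text_input], outputs=output)
    model.compile(loss="categorical_crossentropy", optimizer="adam")

For the bidirectional variant also compared in the thesis, the text branch's LSTM layer could be wrapped as Bidirectional(LSTM(256)) from tensorflow.keras.layers; the rest of the sketch stays the same.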
dc.description.statementofresponsibility | Mahzabin Yasmin Binte Amin | |
dc.description.statementofresponsibility | Weney Hasan Shammo | |
dc.description.statementofresponsibility | Jawad Bin Sayed | |
dc.description.statementofresponsibility | MD Junaied Hossain | |
dc.format.extent | 49 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Ultrasound image | en_US |
dc.subject | Image captioning | en_US |
dc.subject | Medical image captioning | en_US |
dc.subject | Convolutional Neural Network | en_US |
dc.subject | LSTM | en_US |
dc.subject.lcsh | Imaging systems in medicine | |
dc.subject.lcsh | Diagnostic ultrasonic imaging | |
dc.subject.lcsh | Neural networks (Computer science) | |
dc.title | A comparative analysis of different CNN-LSTM models for caption generation of medical images | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B.Sc. in Computer Science | |