dc.contributor.advisor | Alam, Md. Golam Rabiul | |
dc.contributor.advisor | Reza, Md. Tanzim | |
dc.contributor.author | Alam, Salman | |
dc.contributor.author | Oni, Atquiya Labiba | |
dc.contributor.author | Samir, Jubair | |
dc.contributor.author | Hossain, Asif Mosharrof | |
dc.date.accessioned | 2023-12-11T07:29:01Z | |
dc.date.available | 2023-12-11T07:29:01Z | |
dc.date.copyright | 2023 | |
dc.date.issued | 2023-05 | |
dc.identifier.other | ID 19301037 | |
dc.identifier.other | ID 19301039 | |
dc.identifier.other | ID 22241149 | |
dc.identifier.other | ID 19201006 | |
dc.identifier.uri | http://hdl.handle.net/10361/21953 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 42-44). | |
dc.description.abstract | Sickle Cell Disease is a monogenic genetic disorder that often leads to complications affecting multiple vital organs simultaneously. Treatment for Sickle Cell is diverse and often varies from patient to patient, but several background studies have shown that the progression and symptoms of Sickle Cell can be predicted to a great extent from a patient’s genetic mutation type in the HBB gene. Research on genetic mutation prediction exists in other fields of medicine, such as cancer, but is scarce for Sickle Cell. Further limitations include the complexity and unavailability of genetic testing, the limited clinical data available, and privacy concerns regarding patients' medical information. Hence, our study aimed to build a Federated Siamese Bidirectional LSTM to predict the Sickle Cell genotype from clinical data when the data is sparse and decentralized. A Sickle Cell clinical dataset with 216 instances and 4 genotype class labels was pre-processed to train the model and evaluate its performance. The dataset was then used to create pairs with corresponding similarity scores, and the Siamese Bi-LSTM was trained for several epochs to compute the similarity between two instances. In the federated setting, the data was divided among client devices and the Siamese Bi-LSTM was trained locally to update the global model; the test data was then used to assess the performance of both models. Based on the performance analysis, the Siamese Bi-LSTM achieved an accuracy of 90.45% with an F1 score of 90.66%, and the Federated Siamese Bi-LSTM model (FFSB-LSTM) achieved an accuracy of 88.25% and an F1 score of 88.57%, showing significant improvement over the baseline KNN and Logistic Regression models. | en_US |
dc.description.statementofresponsibility | Salman Alam | |
dc.description.statementofresponsibility | Atquiya Labiba Oni | |
dc.description.statementofresponsibility | Jubair Samir | |
dc.description.statementofresponsibility | Asif Mosharrof Hossain | |
dc.format.extent | 55 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Sickle cell | en_US |
dc.subject | Clinical data | en_US |
dc.subject | Genotype | en_US |
dc.subject | Federated learning | en_US |
dc.subject | Few-shot siamese | en_US |
dc.subject | Federated siamese bidirectional LSTM | en_US |
dc.subject.lcsh | Machine learning | |
dc.subject.lcsh | Computer algorithms | |
dc.subject.lcsh | Sickle cell anemia | |
dc.title | Prediction of genetic mutation from clinical data of sickle cell disease using few-shot siamese bidirectional LSTM and federated learning | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B.Sc. in Computer Science and Engineering | |