dc.contributor.advisor: Rhaman, Md. Khalilur
dc.contributor.advisor: Mukta, Jannatun Noor
dc.contributor.author: Bhuiyan, Abir Ahammed
dc.contributor.author: Neha, Samiha Afaf
dc.contributor.author: Khan, Md. Ishrak
dc.date.accessioned: 2024-05-15T04:34:03Z
dc.date.available: 2024-05-15T04:34:03Z
dc.date.copyright: ©2024
dc.date.issued: 2024-01
dc.identifier.other: ID: 20101197
dc.identifier.other: ID: 20101266
dc.identifier.other: ID: 20101051
dc.identifier.uri: http://hdl.handle.net/10361/22830
dc.description: This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024. [en_US]
dc.description: Cataloged from PDF version of thesis.
dc.description: Includes bibliographical references (pages 51-54).
dc.description.abstract: microbial ecosystems. This has led to their increased utilization in several research areas, such as bacterial genome engineering, phage therapy, disease diagnostics, and viral host identification. The structure of phages is made up of proteins called phage virion proteins (PVPs). Classifying these proteins is important for genomic research, which in turn helps us understand the complex interactions between phages and their hosts in the context of developing antibacterial drugs. In place of tedious traditional procedures, a growing number of computational strategies are being employed to annotate phage protein sequences acquired through high-throughput sequencing. Among these techniques, deep learning approaches demonstrate improved classification performance. Such approaches require special sequence encodings so that the model can perceive the protein sequences together with their distinctive features. Numerous encodings have been examined and assessed, and novel methods continue to emerge that aim to optimize the task in terms of resource utilization and prediction accuracy. The objective of our work, ProteoKnight, is to explore and develop a unique encoding technique for phage proteins and demonstrate its effectiveness via classification. We use the time-separated PVP dataset introduced in [47]. Furthermore, this study aims to address the lack of research on uncertainty analysis by exploring uncertainty in binary PVP classification using the Monte Carlo Dropout (MCD) method. The experimental findings demonstrate the effectiveness of our strategy for binary classification, achieving a prediction accuracy of 90.2%. However, the accuracy for multi-class classification remains suboptimal. Furthermore, our uncertainty analysis reveals that the prediction confidence of our proposed classification approach varies with class and sequence length. [en_US]
dc.description.statementofresponsibility: Abir Ahammed Bhuiyan
dc.description.statementofresponsibility: Samiha Afaf Neha
dc.description.statementofresponsibility: Md. Ishrak Khan
dc.format.extent: 68 pages
dc.language.iso: en [en_US]
dc.publisher: Brac University [en_US]
dc.rights: Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject: Phage virion [en_US]
dc.subject: Deep learning [en_US]
dc.subject: DNA-walk [en_US]
dc.subject: Monte Carlo dropout [en_US]
dc.subject: Convolutional neural network (CNN) [en_US]
dc.subject.lcsh: Neural networks (Computer science)
dc.subject.lcsh: Deep learning (Machine learning)
dc.title: ProteoKnight: phage virion protein classification with CNN and uncertainty quantification [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Department of Computer Science and Engineering, Brac University
dc.description.degree: B.Sc. in Computer Science

