dc.contributor.advisor | Sadeque, Farig Yousuf | |
dc.contributor.author | Ahmed, Saib | |
dc.date.accessioned | 2025-02-18T06:53:25Z | |
dc.date.available | 2025-02-18T06:53:25Z | |
dc.date.copyright | 2024 | |
dc.date.issued | 2024-10 | |
dc.identifier.other | ID 22166032 | |
dc.identifier.uri | http://hdl.handle.net/10361/25436 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, 2024. | en_US |
dc.description | Cataloged from the PDF version of the thesis. | |
dc.description | Includes bibliographical references (pages 49-52). | |
dc.description.abstract | Documenting clinical notes is a vital but time-consuming task in healthcare. Even
in this modern era, medical doctors spend considerable time documenting clinical
notes from encounters with patients. While there have been significant advancements
in general text summarization, research in clinical conversation summarization
remains sparse due to the scarcity of open-source datasets available to the
NLP community. Accurate summarization is paramount in clinical note generation,
given its implications for human health. Our research demonstrates the efficacy
of decoder-only models over traditional encoder-decoder models in generating more
precise clinical notes from doctor-patient conversations. The study also tackles key
challenges such as ensuring medical accuracy and complying with healthcare privacy
standards. We utilized the MTS-DIALOG dataset [28], which includes 1,700 such
dialogues and corresponding clinical notes. This dataset was featured in the 2023
MEDIQA-Chat challenge, where the leading team, WangLab, achieved a
state-of-the-art (SOTA) ROUGE-1 score of 0.4466 and BERTScore of 0.7307 [27]. Our study
surpasses these benchmarks by fine-tuning the "meta-llama/Meta-Llama-3-8B"
model enhanced with QLoRA 8-bit quantization. We assessed our models using ROUGE
scores and BERTScore to validate their superiority in performance. By evaluating
the system on real-world clinical conversations, we show that the decoder-only
LLM-generated notes closely match human-written ones in terms of completeness
and clinical relevance. This research highlights the potential for decoder-only LLMs
to revolutionize clinical workflows, making medical documentation more efficient
while allowing doctors to focus more on patient care. | en_US |
dc.description.statementofresponsibility | Saib Ahmed | |
dc.format.extent | 64 pages | |
dc.language.iso | en | en_US |
dc.publisher | BRAC University | en_US |
dc.rights | BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | ClinicalNLP | en_US |
dc.subject | Dialogue2Note | en_US |
dc.subject | BERT | en_US |
dc.subject | Data documentation | en_US |
dc.subject | Medical documentation | en_US |
dc.subject | Text summarization | en_US |
dc.subject | Note generation | |
dc.subject.lcsh | Information storage and retrieval systems--Technology. | |
dc.subject.lcsh | Medical records--Management. | |
dc.subject.lcsh | Natural language processing (Computer science). | |
dc.subject.lcsh | Medical records--Data processing. | |
dc.title | Clinical note generation from doctor-patient conversations using decoder-only large language models | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, BRAC University | |
dc.description.degree | M.Sc. in Computer Science | |