Retrieval-augmented generation based doctor recommendation system using knowledge graph
Abstract
Finding suitable healthcare professionals is challenging, and the growing demand for
efficient healthcare access calls for advanced language modeling techniques that can
provide tailored medical advice. A review of existing research on doctor recommendation
systems shows that while previous authors developed functional models,
their datasets are likely outdated and no longer reflective of current healthcare
trends. These models would require complete retraining with a new or updated
dataset to ensure their relevance and accuracy. In contrast, our approach allows the
existing database to be updated with additional, current information without
retraining the model from scratch. By simply integrating
the updated data, we can maintain the integrity and functionality of the
previous system, making our method both time-efficient and resource-conserving.
This approach eliminates the need for redundant work, allowing us to leverage the
previous model while ensuring the data remains current and applicable. The proposed
Doctor Recommendation System leverages a Knowledge Graph built with
Neo4j and is enhanced by a Retrieval-Augmented Generation (RAG) model using the
LangChain framework. The system aims to provide up-to-date, personalized and
accurate doctor recommendations by integrating structured and unstructured data
sources. The Neo4j Knowledge Graph captures comprehensive relationships between
doctors, specialties, diseases, symptoms and medical conditions, offering a robust
data foundation. The LangChain framework, incorporating a large language model
(LLM), enhances the recommendation process by generating context-aware suggestions
based on patient queries and historical data. The Doctor’s Specialty Recommendation
dataset has been used with three chains, RetrievalQAChain, GraphCypherQAChain and
RetrievalQAWithSourcesChain, alongside similarity search. Based on correctness,
distance, context accuracy and CoT context accuracy, GraphCypherQAChain performed best.
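As a minimal illustrative sketch (not the paper's exact implementation), the snippet below shows how a GraphCypherQAChain can be wired to a Neo4j knowledge graph with GPT-3.5-turbo in LangChain; the connection details and the example patient query are placeholder assumptions.

```python
# Minimal sketch, assuming a Neo4j instance already populated with the knowledge graph;
# the URL, credentials and example query are illustrative placeholders, not the paper's values.
from langchain_community.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI
from langchain.chains import GraphCypherQAChain

# Connect to the Neo4j knowledge graph of doctors, specialties, diseases and symptoms.
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")

# GPT-3.5-turbo translates the patient's question into Cypher and then
# phrases the query results as a recommendation.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,  # newer LangChain versions require explicit opt-in
)

response = chain.invoke({"query": "Which specialty and doctor should I see for persistent migraines?"})
print(response["result"])
```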
In addition, ROUGE-1, ROUGE-2, ROUGE-3, ROUGE-L and BLEU are used to measure
the performance of all the chains and to compare them with GPT-4. On these evaluation
metrics, GraphCypherQAChain with GPT-3.5-turbo achieved 86% for ROUGE-1, 82% for
ROUGE-2, 77% for ROUGE-3, 86% for ROUGE-L and 46% for BLEU, whereas GPT-4
achieved 78%, 60%, 48%, 78% and 40% respectively.
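The following is a minimal sketch of how such ROUGE and BLEU scores can be computed for a generated recommendation against a reference answer, assuming the rouge-score and nltk packages; the example sentences are illustrative and not drawn from the dataset.

```python
# Minimal sketch of the scoring step; the reference and generated answers are made-up examples.
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "You should consult a neurologist for persistent migraines."
generated = "A neurologist is the recommended specialist for persistent migraines."

# ROUGE-1/2/3 measure n-gram overlap; ROUGE-L measures longest-common-subsequence overlap.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rouge3", "rougeL"], use_stemmer=True)
for name, score in scorer.score(reference, generated).items():
    print(f"{name}: {score.fmeasure:.2f}")

# BLEU: modified n-gram precision with a brevity penalty, smoothed for short answers.
bleu = sentence_bleu(
    [reference.split()],
    generated.split(),
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU: {bleu:.2f}")
```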