RiskRadar: an NLP-driven summarization system for query-based security insights
Date
2024-10
Publisher
BRAC University
Author
Zilane, Md. Shahanur
Rahman, Mohammad Mushfiqur
Jisa, Aniqa Ibnat
Elma, Qurratul Ayen
Rifat, Asaduzzaman
Abstract
The growing complexity and frequency of cyber threat incidents demand robust,
user-friendly systems that educate users and help them understand and
mitigate these threats as far as possible. This thesis describes
RiskRadar, a query-based information retrieval and response system that uses
advanced Natural Language Processing (NLP) methods to provide precise, context-aware
responses to cybersecurity queries.
The system employs a multi-module architecture, with each module tailored to
a specific task, such as query correction, semantic sequence analysis, information
retrieval, and response generation. The core NLP models used are BERT for semantic
similarity, BM25 for effective retrieval of relevant content, and a distilled
BART model for summarization and context-based response generation. A unique
rule-based mechanism improves query understanding and maintains contextual continuity
across user interactions, addressing the challenges of multi-turn dialogue in
technical domains. The proposed system not only provides detailed responses but also includes
relevant articles to help users better understand specific incidents or trends.
The system’s performance is measured by its ability to retain the context of user
queries, retrieve and rank relevant content accurately, and generate coherent, informative
responses.
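
As an illustration of the retrieval step named above, the following is a minimal sketch of BM25 ranking in plain Python. It is not the thesis's implementation: the whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75, the smoothed IDF variant, and the function names tokenize and bm25_rank are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real preprocessing.
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (score, document index) pairs, best match first.

    Hypothetical helper for illustration; k1 and b are common defaults,
    not values taken from the thesis.
    """
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scored = []
    for idx, tokens in enumerate(tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1
            )
            # Term frequency saturates with k1; b controls length normalization.
            denom = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scored.append((score, idx))
    # Highest score first; ties broken by original document order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


articles = [
    "ransomware attack hits hospital network",
    "new phishing campaign targets bank customers",
    "hospital staff training schedule",
]
ranking = bm25_rank("ransomware hospital", articles)
```

In a pipeline like the one described, the top-ranked articles would then be passed on for semantic re-ranking and summarization; that hand-off is not shown here.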
The system’s real-time implementation dynamically updates the dataset based on
daily scraping of cybersecurity articles, ensuring that responses are timely and relevant.
To address computational constraints, the architecture favors efficient
methods such as sequence-based rule application and DistilBART over more computationally
intensive models such as GPT-Neo. This trade-off balances accuracy and
resource availability, resulting in a solution that is both practical and efficient. This
thesis aims to contribute a scalable solution that meets the growing need
for real-time, user-oriented cybersecurity information systems.