Efficient network traffic management and intelligent decision-making through machine learning and DNS log analysis
Date
2023-08
Publisher
Brac University
Author
Hossain, Syed Abed
Abstract
This research presents a comprehensive approach to network traffic management
and analysis by leveraging DNS log analysis, machine learning techniques, and
Software-Defined Networking (SDN) integration. In an office environment, a DNS
server was set up to collect DNS logs from nearly 200 users over a month. The
collected data was cleaned and enriched with additional extracted information
in Google BigQuery. Demographic analysis was conducted using Google Looker Studio,
providing valuable insights into user behavior patterns during different office
hours. Subsequently, various supervised and unsupervised machine learning models
were employed to predict browsing categories based on the DNS log analysis.
Among the models evaluated, the Random Forest Classifier (RFC) performed strongly,
achieving accuracy, precision, recall, and F1 score of 82.54%, 82.79%, 82.54%,
and 81.81%, respectively, during training. The trained RFC model also kept the
discrepancy between predicted probabilities and actual class values low. The
trained model was
then exported and integrated into a virtual Linux machine to simulate an SDN
environment. In real-time testing, the system categorized DNS queries with high
accuracy: 100% for categories such as Ads and Entertainment, and 98.57%,
87.5%, and 87.21% for Search Engines, Social Networks, and CDNs, respectively.
Accuracy was slightly lower but still respectable for the Computer/Technology
and Learning categories, at 81.82% and 80.95%, respectively, further supporting
the system's reliability in intelligently managing network traffic.
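
As a rough illustration of how per-category accuracy figures like those above can be computed, the following minimal sketch takes accuracy for a category to be the fraction of queries truly in that category that the model labelled correctly. That definition is an assumption, since the abstract does not state how the per-category rates were derived. The function name `per_category_accuracy` and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def per_category_accuracy(true_labels, predicted_labels):
    """Return {category: fraction of that category's queries predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    # Number of queries whose true label is each category.
    totals = Counter(true_labels)
    # Number of those queries where the prediction matched the true label.
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {category: correct[category] / totals[category] for category in totals}


# Made-up labels for illustration only; these are not results from the study.
true_labels = ["Ads", "Ads", "CDNs", "CDNs", "CDNs", "CDNs", "Learning", "Learning"]
predicted_labels = ["Ads", "Ads", "CDNs", "CDNs", "CDNs", "Ads", "Learning", "CDNs"]

accuracy_by_category = per_category_accuracy(true_labels, predicted_labels)
for category, accuracy in sorted(accuracy_by_category.items()):
    print(f"{category}: {accuracy:.2%}")
```

Under this definition, a category with few test queries can reach 100% from only a handful of correct predictions, so per-category rates are best read alongside the number of queries in each category.
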
The predictive capabilities of the system have practical applications for office
network management, including website blocking, traffic rerouting based on predictions,
and bandwidth management, all facilitated through the SDN controller. The
findings of this study highlight the efficacy of combining DNS log analysis, machine
learning, and SDN integration for enhancing network security, optimizing resource
allocation, and delivering an enhanced user experience in a standard office environment.
The presented approach can serve as a blueprint for efficient network traffic
management and intelligent decision-making in similar settings.
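
The applications named above (website blocking, traffic rerouting, and bandwidth management) amount to mapping each predicted category to a network action. The sketch below shows one possible shape for that mapping. It is not the author's implementation: the policy table, the `Action` type, the `decide` function, and the rate and uplink values are all hypothetical, and a real deployment would translate each action into rules through the SDN controller's own interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    kind: str                     # "block", "rate_limit", "reroute", or "allow"
    detail: Optional[str] = None  # e.g. a rate cap or a target uplink name


# Hypothetical office policy keyed by predicted browsing category.
POLICY = {
    "Ads": Action("block"),
    "Entertainment": Action("rate_limit", "2 Mbps"),
    "CDNs": Action("reroute", "secondary-uplink"),
}
# Categories without an explicit rule are allowed through unchanged.
DEFAULT_ACTION = Action("allow")


def decide(predicted_category: str) -> Action:
    """Look up the action for a category predicted by the classifier."""
    return POLICY.get(predicted_category, DEFAULT_ACTION)


for category in ["Ads", "Entertainment", "CDNs", "Learning"]:
    action = decide(category)
    print(f"{category} -> {action.kind} {action.detail or ''}".rstrip())
```

Keeping the policy in a plain lookup table separates the classifier's output from the enforcement logic, so an administrator could change what happens to a category without retraining the model.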