Pathway to perception: a smart navigation approach for visually impaired individuals leveraging YOLO, Faster R-CNN, and LLaMA
Date: 2024-10
Publisher: BRAC University
Authors: Susmit, Tahsin Ashrafee; Mehejabin, Maliha; Hasan, Isratul; Kausar, Azmain Ibn; Akbar, Suraiya Binte
Abstract
The purpose of our study is to create a navigation system that significantly improves mobility and independence for visually impaired people. We use YOLOv11 and Faster R-CNN for object detection, combined with Llama 3.2-3B-Instruct for context-aware navigation that offers helpful guidance about the user's immediate surroundings. Our work addresses the failure points of today's technologies: limited flexibility in dynamic and unfamiliar environments, unreliable performance under changing lighting conditions, and inefficient obstacle detection. By training both models and selecting the output with the highest confidence score, we enhance spatial awareness, identifying obstacles in key regions such as the left, right, or center of the scene. This approach, complemented by personalized navigation instructions, supports improved decision-making and safety in real-world scenarios. Building on the advanced locational technologies available today and anticipating those of tomorrow, we aspire to render current navigation methods obsolete by fostering more efficient, real-time, and autonomous tools for visually impaired people in both familiar and unfamiliar environments.
After fine-tuning the Llama 3.2-3B-Instruct model, BLEU-4 increased from 0.0442 to 0.1175 and ROUGE-L improved from 0.2102 to 0.3204, indicating enhanced fluency and coherence in the generated text.
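To illustrate the selection-and-zoning step described above, the following minimal Python sketch picks whichever detector's output carries the higher confidence for a frame and maps each detection to a left/center/right zone. The box format, selection rule, and function names are assumptions for demonstration, not the thesis's actual implementation.

# Hypothetical sketch: choose between YOLOv11 and Faster R-CNN outputs by
# confidence, then map each detection to a coarse zone for spoken guidance.
# Assumed box format: (x1, y1, x2, y2, confidence, label).

def pick_highest_confidence(yolo_boxes, frcnn_boxes):
    """Keep the detection set whose best box has the higher confidence."""
    best_yolo = max((b[4] for b in yolo_boxes), default=0.0)
    best_frcnn = max((b[4] for b in frcnn_boxes), default=0.0)
    return yolo_boxes if best_yolo >= best_frcnn else frcnn_boxes

def zone_of(box, frame_width):
    """Assign a box to 'left', 'center', or 'right' by its horizontal midpoint."""
    x1, x2 = box[0], box[2]
    mid = (x1 + x2) / 2
    if mid < frame_width / 3:
        return "left"
    if mid < 2 * frame_width / 3:
        return "center"
    return "right"

def describe(detections, frame_width):
    """Produce a short scene description for the language model to elaborate."""
    return ", ".join(f"{b[5]} on the {zone_of(b, frame_width)}" for b in detections)

# Example: one person detected near the left edge of a 640-px-wide frame.
chosen = pick_highest_confidence([(40, 80, 200, 300, 0.91, "person")],
                                 [(50, 90, 210, 310, 0.84, "person")])
print(describe(chosen, 640))  # -> person on the left

In the full system, a description like this would be handed to Llama 3.2-3B-Instruct to generate the personalized navigation instructions the abstract mentions.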
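The BLEU-4 and ROUGE-L figures above are standard overlap metrics for generated text. Below is a brief, illustrative way to compute them with the nltk and rouge-score packages; the example sentences are invented and this is not the thesis's evaluation setup.

# Illustrative only: scoring one generated instruction against a reference.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

reference = "turn slightly left to avoid the chair ahead"
generated = "turn left to avoid the chair"

# BLEU-4: geometric mean of 1- to 4-gram precisions (smoothed for short texts).
bleu4 = sentence_bleu([reference.split()], generated.split(),
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=SmoothingFunction().method1)

# ROUGE-L: F-measure over the longest common subsequence.
rougeL = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True) \
    .score(reference, generated)["rougeL"].fmeasure

print(f"BLEU-4={bleu4:.4f}  ROUGE-L={rougeL:.4f}")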