An AI and NLP approach for detecting grooming behavior
Date
2024
Publisher
BRAC University
Author
Shanto, Hasibul Hossain
Farooqui, Farhan
Rafi, Abdullah Al
Feona, Maisha Maliha
Phul, Progya Talukder
Abstract
"Grooming children on social media is a dangerous side effect of the modern internet
era. AI models, especially NLP models, have the potential to play a critical role in detecting
grooming behavior. Although past studies have attempted to build grooming
detection systems, there is limited research on building such systems using
modern NLP techniques. In this paper, we propose a modern sexual grooming
detection system using state-of-the-art NLP models and techniques that can detect
and alert users to potentially dangerous online interactions between groomers and
their targets. Our detection system is a ConversationClassifier that classifies
conversations as grooming or non-grooming. With over 19,000 grooming
sentences collected from PervertedJustice grooming conversations, we created an
annotated dataset exhibiting the grooming characteristics. Conversational data was
also collected from both PervertedJustice and the PAN12 dataset. With the sentence-level
annotated dataset, we trained SentenceClassifier models based on RoBERTa
and DeBERTa to predict whether a sentence has grooming characteristics.
The ConversationClassifier was built on top of the SentenceClassifier
with LSTM and GRU to capture the sequential features in the conversation. Furthermore,
a self-attention mechanism was added so that the model can focus on relevant
sentences. Our models achieved promising results. The SentenceClassifier
achieved an accuracy of 93% with RoBERTa and 94% with DeBERTa. Pairing
the RoBERTa-based SentenceClassifier with LSTM yielded an accuracy of
97%, and pairing the DeBERTa-based SentenceClassifier with GRU yielded an accuracy
of 95%."
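
The abstract mentions an attention mechanism that lets the ConversationClassifier focus on relevant sentences. As a rough illustration only, the sketch below shows one simple form of attention pooling in plain Python: each sentence vector is scored against a query vector, the scores are normalized with a softmax, and the weighted sum becomes a single conversation vector. The function names, the dot-product scoring, the query vector, and the toy numbers are all assumptions made for this example; the abstract does not specify how the thesis implements its self-attention layer, and in a real model the sentence vectors would come from the LSTM or GRU outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sentence_states, query):
    """Collapse per-sentence vectors into one conversation vector.

    sentence_states: list of equal-length float lists, one per sentence
        (hypothetical stand-ins for recurrent-layer outputs).
    query: float list of the same length (a learned parameter in a real
        model; fixed by hand here for illustration).
    Returns (weights, pooled).
    """
    # Dot-product relevance score for each sentence (assumed scoring rule).
    scores = [sum(h * q for h, q in zip(state, query))
              for state in sentence_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the sentence vectors, one feature at a time.
    pooled = [sum(w * state[i] for w, state in zip(weights, sentence_states))
              for i in range(dim)]
    return weights, pooled


# Toy conversation: three sentences with two made-up features each.
states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(states, query)
```

In this toy input the second sentence has the largest score, so it receives the largest weight and dominates the pooled vector, which is the behavior the abstract describes as focusing on relevant sentences.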