Analyzing Schizophrenic-prone text from social media content: a novel approach through ML and NLP

Rodela, Raisa Rahman; Efty, Farhan Tanvir; Rahman, Mubashira; Wajiha, Shaira

dc.contributor.advisor	Reza, Md Tanzim
dc.contributor.advisor	Rahman, Rafeed
dc.contributor.author	Rodela, Raisa Rahman
dc.contributor.author	Efty, Farhan Tanvir
dc.contributor.author	Rahman, Mubashira
dc.contributor.author	Wajiha, Shaira
dc.date.accessioned	2024-05-19T09:24:13Z
dc.date.available	2024-05-19T09:24:13Z
dc.date.copyright	©2024
dc.date.issued	2024-01
dc.identifier.other	ID: 19301011
dc.identifier.other	ID: 19301014
dc.identifier.other	ID: 19301010
dc.identifier.other	ID: 19301018
dc.identifier.uri	http://hdl.handle.net/10361/22875
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 80-81).
dc.description.abstract	Schizophrenia is a serious mental disorder in which people interpret reality abnormally and may engage in harmful behavior if not diagnosed promptly. This study identifies language patterns indicative of schizophrenia-prone text in online communication, with the aim of contributing to early intervention techniques in mental health using ML and NLP methods. Two datasets of social media posts are used. The first, the pre-existing dataset, was obtained from a repository focused on identifying schizophrenia-related posts and serves as a baseline for comparison and evaluation. The second, the newly scraped dataset, was collected from subreddits associated with schizophrenia and offers a wider range of language patterns. The dual-phase technique entails training models on the pre-existing dataset and evaluating their performance on the newly scraped dataset. The models include the transformer model BERT, the recurrent neural network models Bi-LSTM and GRU, and the machine learning models Support Vector Classifier, Logistic Regression, Multinomial Naive Bayes, Random Forest, and Decision Tree, each used to predict whether a text is suggestive of schizophrenia. The language of schizophrenia-prone texts differs from that of texts written by mentally healthy individuals in phonological, morphological, and syntactic aspects, and these models can learn such linguistic patterns. The DistilBERT transformer model achieves 97% and 84% accuracy, GRU achieves 91% and 79%, and logistic regression achieves 93% and 83% on the pre-existing and newly scraped datasets, respectively.
To check that the models can handle new data, we conducted a contemporary comparison. This analysis showed that consistent data collection is necessary for accurate predictions.	en_US
dc.description.statementofresponsibility	Raisa Rahman Rodela
dc.description.statementofresponsibility	Farhan Tanvir Efty
dc.description.statementofresponsibility	Mubashira Rahman
dc.description.statementofresponsibility	Shaira Wajiha
dc.format.extent	94 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Schizophrenia	en_US
dc.subject	Logistic regression	en_US
dc.subject	Mental illness	en_US
dc.subject	Decision tree	en_US
dc.subject	Language pattern	en_US
dc.subject	Social media post	en_US
dc.subject	Natural language processing	en_US
dc.subject	GRU
dc.subject	Bi-LSTM
dc.subject	BERT
dc.subject.lcsh	Natural language processing (Computer science)
dc.subject.lcsh	Machine learning
dc.title	Analyzing Schizophrenic-prone text from social media content: a novel approach through ML and NLP	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B.Sc in Computer Science and Engineering

Files in this item

Name: 19301011, 19301014, 19301010, ...
Size: 1.172Mb
Format: PDF


This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1480]
