dc.contributor.advisor Chakrabarty, Amitabha
dc.contributor.author Hasnat, Fahim
dc.contributor.author Hasan, Md. Mazidul
dc.contributor.author Khan, Nayeem Hasan
dc.contributor.author Ali, Asif
dc.date.accessioned 2018-12-18T10:46:31Z
dc.date.available 2018-12-18T10:46:31Z
dc.date.copyright 2018
dc.date.issued 8/2/2018
dc.identifier.other ID 14101043
dc.identifier.other ID 14301104
dc.identifier.other ID 14301113
dc.identifier.other ID 12201068
dc.identifier.uri http://hdl.handle.net/10361/11026
dc.description Cataloged from PDF version of thesis.
dc.description Includes bibliographical references (pages 43-46).
dc.description This thesis is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2018. en_US
dc.description.abstract Financial, educational and communal activities produce an enormous amount of data. Automatic text classification has significant applications in content organization, point-of-view extraction, evaluation analysis, spam filtering and sentiment analysis. Automatic classification of text documents requires information extraction, machine learning and natural language processing. We propose a probabilistic framework for text classification. The proposed classification model is composed of three major modules: pre-processing of unstructured text, learning of a probabilistic model, and classification of unseen data using the learned model. The framework is trained and tested on the “20 newsgroup” dataset, which contains twenty different news categories, e.g. politics, sports, religion and PC hardware. We use both supervised and unsupervised algorithms to gain insight into the relationships among various text classification techniques. Among the supervised algorithms we used, Naïve Bayes achieved the highest accuracy, 84.51% for 4 categories. The unsupervised algorithms achieved 62.79% homogeneity for 4 categories. These results demonstrate the effectiveness of the proposed automatic text classification approach. en_US
dc.description.statementofresponsibility Fahim Hasnat
dc.description.statementofresponsibility Md. Mazidul Hasan
dc.description.statementofresponsibility Nayeem Hasan Khan
dc.description.statementofresponsibility Asif Ali
dc.format.extent 46 pages
dc.language.iso en en_US
dc.publisher BRAC University en_US
dc.rights BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject Text classification en_US
dc.subject Machine learning en_US
dc.subject Pre-processing en_US
dc.subject Feature extraction en_US
dc.subject Naïve bayes en_US
dc.subject Decision tree en_US
dc.subject.lcsh Machine learning.
dc.subject.lcsh Text processing (Computer science)
dc.title Text classification using machine learning algorithms en_US
dc.type Thesis
dc.contributor.department Department of Computer Science and Engineering, BRAC University
dc.description.degree B. Computer Science and Engineering

