dc.contributor.advisor | Rasel, Annajiat Alim | |
dc.contributor.advisor | Khan, Rubayat Ahmed | |
dc.contributor.author | Mahmud, Aqil | |
dc.contributor.author | Khan, Aswat Karim | |
dc.contributor.author | Hasan Rafi, Mohammad Mehdi | |
dc.contributor.author | Fahim, Kazi Rayhan | |
dc.date.accessioned | 2023-08-27T08:18:08Z | |
dc.date.available | 2023-08-27T08:18:08Z | |
dc.date.copyright | 2023 | |
dc.date.issued | 2023-01 | |
dc.identifier.other | ID: 18341010 | |
dc.identifier.other | ID: 18301282 | |
dc.identifier.other | ID: 18101629 | |
dc.identifier.other | ID: 18301114 | |
dc.identifier.uri | http://hdl.handle.net/10361/19954 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2023. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 23-24). | |
dc.description.abstract | This paper is intended as a practical guide to getting up and running
with reinforcement learning (RL). It aims to bridge the gap between practical implementation and the available RL theory. The theory of reinforcement
learning involves two main components: an environment, which is the game itself,
and an agent, which performs an action based on its observation of the environment. Initially, no in-game rules will be given to the agent, and it will be rewarded
or punished based on the action that it will take. The goal is to use Proximal
Policy Optimization (PPO) to maximize the reward that our agent receives, so that over
time it learns which actions to take. To this end, we will develop
an AI agent that will be able to learn how to play one of the most popular arcade
games of all time, Street Fighter. We preprocess our game environment and apply
hyperparameter tuning using PyTorch, Stable Baselines, and Optuna. This
approach trains different types of RL architecture and finds the model with
the best-performing parameters. We then fine-tune that model and
run our test cases on it to observe how a reinforcement learning algorithm
learns to play. | en_US |
dc.description.statementofresponsibility | Aqil Mahmud | |
dc.description.statementofresponsibility | Aswat Karim Khan | |
dc.description.statementofresponsibility | Mohammad Mehdi Hasan Rafi | |
dc.description.statementofresponsibility | Kazi Rayhan Fahim | |
dc.format.extent | 24 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Reinforcement learning | en_US |
dc.subject | Neural networks | en_US |
dc.subject | Games | en_US |
dc.subject | AI | en_US |
dc.subject | Proximal policy optimization | en_US |
dc.subject.lcsh | Reinforcement learning. | |
dc.title | Implementation of reinforcement learning architecture to augment an AI that can self-learn to play video games | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B. Computer Science and Engineering | |