dc.contributor.advisor: Rashid, Warida
dc.contributor.advisor: Islam, Riashat
dc.contributor.author: Khan, Mahir Asaf
dc.contributor.author: Ashraf, Adib
dc.contributor.author: Amin, Tahmid Adib
dc.date.accessioned: 2023-05-23T04:43:23Z
dc.date.available: 2023-05-23T04:43:23Z
dc.date.copyright: 2022
dc.date.issued: 2022-05
dc.identifier.other: ID 22141075
dc.identifier.other: ID 20241063
dc.identifier.other: ID 22141076
dc.identifier.uri: http://hdl.handle.net/10361/18306
dc.description: This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2022. [en_US]
dc.description: Cataloged from PDF version of thesis.
dc.description: Includes bibliographical references (pages 42-43).
dc.description.abstract: In this work we will analyze control variates and baselines in policy optimization methods in deep reinforcement learning (RL). Recently there has been a lot of progress in policy gradient methods in deep RL, where baselines are typically used for variance reduction. However, there has been recent progress on the mirage of state and state-action dependent baselines in policy gradients. To this end, it is not clear how control variates play a role in the optimization landscape of policy gradients. This work will dive into understanding the landscape issues of policy optimization, to see whether control variates are only for variance reduction or whether they play a role in smoothing out the optimization landscape. Our work will further investigate the issues of different optimizers used in deep RL experiments, and ablation studies of the interplay of control variates and optimizers in policy gradients from an optimization perspective. [en_US]
dc.description.statementofresponsibility: Mahir Asaf Khan
dc.description.statementofresponsibility: Adib Ashraf
dc.description.statementofresponsibility: Tahmid Adib Amin
dc.format.extent: 43 pages
dc.language.iso: en [en_US]
dc.publisher: Brac University [en_US]
dc.rights: Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject: Optimization landscape [en_US]
dc.subject: Policy optimization [en_US]
dc.subject: Deep reinforcement learning [en_US]
dc.subject: Variance reduction [en_US]
dc.subject: Control variates [en_US]
dc.subject.lcsh: Cognitive learning theory
dc.subject.lcsh: Machine learning
dc.title: Analyzing optimization landscape of recent policy optimization methods in deep RL [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Department of Computer Science and Engineering, Brac University
dc.description.degree: B. Computer Science

