Beyond Markov Decision Process with Scalar Markovian Rewards

Authors

Shuwa Miura

Manning College of Information and Computer Sciences, University of Massachusetts Amherst

Proceedings:

SOCS-22 Volume 15

Issue:

Vol. 15 No. 1 (2022): Fifteenth International Symposium on Combinatorial Search

Track:

Student Papers

Abstract:

Real-world decision problems often involve multiple competing objectives or a complex reward structure that violates the Markov assumption. However, existing research on sequential decision making under uncertainty has primarily focused on Markov Decision Processes (MDPs) with scalar Markovian reward signals. My thesis considers settings where scalar Markovian rewards are not sufficient to produce desired behaviors. The first part of my thesis develops algorithms to optimize lexicographically ordered objectives. The second part considers autonomous agents that incorporate the perspective of their observer. Because the observer's perspective can depend on how the agents have behaved so far, rewards in this setting can depend on histories (that is, they are non-Markovian). In the final part of my thesis, I hope to characterize, from a decision-theoretic perspective, when rewards beyond scalar Markovian signals are needed.
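
As an illustration of what "lexicographically ordered objectives" can mean in practice, the following is a minimal sketch, not the algorithm developed in the thesis. It assumes each action already has a vector of value estimates, one per objective, listed from highest to lowest priority. The function name `lexicographic_best`, the `slack` parameter, and the example values are all hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def lexicographic_best(
    q_values: Dict[Hashable, Sequence[float]], slack: float = 0.0
) -> List[Hashable]:
    """Filter actions objective by objective, highest priority first.

    q_values maps each action to its value estimates, ordered by priority.
    slack (an assumption of this sketch) is how far below the best value
    an action may fall on an objective and still be kept.
    """
    candidates = list(q_values)
    num_objectives = len(next(iter(q_values.values())))
    for i in range(num_objectives):
        # Best achievable value on objective i among the surviving actions.
        best = max(q_values[a][i] for a in candidates)
        # Keep only actions that are within `slack` of that best value.
        candidates = [a for a in candidates if q_values[a][i] >= best - slack]
    return candidates


# Hypothetical value estimates: (primary objective, secondary objective).
q = {"left": (1.0, 5.0), "right": (1.0, 7.0), "stay": (0.5, 9.0)}
chosen = lexicographic_best(q)
```

With no slack, the secondary objective only breaks ties among actions that are optimal for the primary one. A scalar weighted sum of the two objectives cannot in general reproduce this strict priority, which is one reason such settings fall outside scalar reward formulations.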

DOI:

10.1609/socs.v15i1.21805
