Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

Authors

KLAŠKA David, KUČERA Antonín, KŮR Vojtěch, MUSIL Vít, ŘEHÁK Vojtěch

Year of publication 2024
Type Article in Proceedings
Conference Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)
MU Faculty or unit

Faculty of Informatics

Citation
Web Paper URL
DOI http://dx.doi.org/10.1609/aaai.v38i18.29993
Keywords Markov decision processes; invariant distribution
Attached files
Description Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
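The distinction the abstract draws between limit frequency and short-horizon frequency can be illustrated with a small sketch (not the paper's algorithm): under a fixed memoryless policy, an MDP induces a Markov chain, and the long-run average objective concerns that chain's invariant distribution, i.e., the limit frequency of state visits. The 3-state transition matrix below is a hypothetical example chosen for illustration.

```python
import numpy as np

# Hypothetical Markov chain induced by some fixed policy: a deterministic
# 3-state cycle 0 -> 1 -> 2 -> 0.
P = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])

# Invariant distribution: solve pi P = pi, sum(pi) = 1, via the
# eigenvector of P^T associated with eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()
print(pi)  # uniform: each state has limit visit frequency 1/3

# Local instability: over a bounded window of a run, the empirical
# frequency of visits can differ sharply from pi, even though it
# converges to pi in the limit.
rng = np.random.default_rng(0)
state, window = 0, []
for _ in range(2):  # first two steps of the run
    window.append(state)
    state = rng.choice(3, p=P[state])
freq = np.bincount(window, minlength=3) / len(window)
print(freq)  # [0.5, 0.5, 0.0] -- far from [1/3, 1/3, 1/3]
```

The gap between `freq` and `pi` over the short window is exactly the kind of local deviation the paper's algorithm aims to control.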
