BIRS Workshop Lecture Videos

Robust Markov decision processes with average and Blackwell optimality

Petrik, Marek

Description

Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known about average optimality (optimizing the long-run average of the rewards obtained over time) and Blackwell optimality. In this talk, we prove several foundational results for RMDPs beyond the discounted return. We show that average optimal policies can be chosen stationary and deterministic for sa-rectangular RMDPs. Perhaps surprisingly, for s-rectangular RMDPs, average optimal policies may not exist and, when they do exist, may need to be history-dependent (Markovian). We also study Blackwell optimality for sa-rectangular RMDPs. Overall, we demonstrate the superior practical properties of distance-based sa-rectangular models over s-rectangular models for average and Blackwell optimality.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International