Robust Markov decision processes with average and Blackwell optimality

Description

Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known about average optimality (optimizing the long-run average of the rewards obtained over time) and Blackwell optimality. In this talk, we prove several foundational results for RMDPs beyond the discounted return. We show that average optimal policies can be chosen stationary and deterministic for sa-rectangular RMDPs but, perhaps surprisingly, that for s-rectangular RMDPs average optimal policies may not exist and, if they exist, may need to be history-dependent (Markovian). We also study Blackwell optimality for sa-rectangular RMDPs. Overall, we demonstrate the superior practical properties of distance-based sa-rectangular models over s-rectangular models for average and Blackwell optimality.

Item Metadata

Title	Robust Markov decision processes with average and Blackwell optimality
Creator	Petrik, Marek
Publisher	Banff International Research Station for Mathematical Innovation and Discovery
Date Issued	2026-02-25
Extent	31.0 minutes
Subject	Mathematics; Operations Research; Computer Science; Mathematical Programming
Type	Moving Image
File Format	video/mp4
Language	eng
Notes	Author affiliation: University of New Hampshire
Series	BIRS Workshop Lecture Videos (Banff, Alta)
Date Available	2026-03-02
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0451594
URI	http://hdl.handle.net/2429/93708
Affiliation	Non UBC
Peer Review Status	Unreviewed
Scholarly Level	Researcher
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Item Media

202602251039-Petrik_hrv-0.mov -- 907.83MB
