Piecewise linear Markov decision processes with an application to partially observable Markov models

UBC Theses and Dissertations

Sawaki, Katsushige

Abstract

This dissertation applies policy improvement and successive approximation (value iteration) to a general class of Markov decision processes with discounted costs. In particular, it studies a class of Markov decision processes called piecewise-linear. Piecewise-linear processes are characterized by the property that the value function of a process observed for one period and then terminated is piecewise-linear whenever the terminal reward function is piecewise-linear. Partially observable Markov decision processes have this property. It is shown that there exist ε-optimal piecewise-linear value functions and piecewise-constant policies that are simple, meaning that they have only finitely many pieces, each defined on a convex polyhedral set. Algorithms based on policy improvement and successive approximation are developed to compute simple approximations to an optimal policy and the optimal value function.
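
The one-period property described in the abstract can be illustrated with a small sketch. It is not taken from the dissertation: it uses the standard textbook enumeration for a finite partially observable model, and every name and number in it (`backup`, `value`, `direct_value`, the two-state cost, transition and observation tables, the discount factor `beta`) is a hypothetical choice made for illustration. The terminal function is assumed to be a cost that is minimized, represented as the minimum of finitely many linear pieces over the belief simplex.

```python
from itertools import product


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def value(pieces, belief):
    # Piecewise-linear cost function: minimum of finitely many linear pieces.
    return min(dot(belief, p) for p in pieces)


def backup(pieces, cost, trans, obs, beta):
    """One-period backup of a piecewise-linear terminal cost function.

    cost[a][s], trans[a][s][s2], obs[a][s2][z] are illustrative model tables.
    Returns a finite list of linear pieces for the one-period value function.
    """
    n_states = len(cost[0])
    n_obs = len(obs[0][0])
    new_pieces = []
    for a in range(len(cost)):
        # g[z][k][s]: expected terminal cost of piece k jointly with seeing z.
        g = [[[sum(trans[a][s][s2] * obs[a][s2][z] * alpha[s2]
                   for s2 in range(n_states))
               for s in range(n_states)]
              for alpha in pieces]
             for z in range(n_obs)]
        # Choose one terminal piece per observation (cross-sum enumeration).
        for choice in product(range(len(pieces)), repeat=n_obs):
            new_pieces.append(
                [cost[a][s] + beta * sum(g[z][choice[z]][s] for z in range(n_obs))
                 for s in range(n_states)])
    return new_pieces


def direct_value(pieces, cost, trans, obs, beta, belief):
    """Reference computation of the same quantity via Bayes belief updates."""
    n_states = len(belief)
    best = float("inf")
    for a in range(len(cost)):
        total = dot(belief, cost[a])
        for z in range(len(obs[a][0])):
            # Unnormalised posterior over next states after action a, observation z.
            post = [sum(belief[s] * trans[a][s][s2] for s in range(n_states))
                    * obs[a][s2][z] for s2 in range(n_states)]
            pz = sum(post)
            if pz > 0:
                total += beta * pz * value(pieces, [p / pz for p in post])
        best = min(best, total)
    return best


# Hypothetical two-state, two-action, two-observation model.
cost = [[1.0, 4.0], [2.0, 2.0]]
trans = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
obs = [[[0.8, 0.2], [0.3, 0.7]], [[0.5, 0.5], [0.5, 0.5]]]
beta = 0.9
terminal = [[0.0, 3.0], [2.0, 1.0]]

stage1 = backup(terminal, cost, trans, obs, beta)
print(len(stage1))  # 2 actions * 2**2 piece choices = 8 linear pieces
```

The backup returns finitely many linear pieces, so the one-period value function is again piecewise-linear; `direct_value` recomputes the same quantity at a single belief without using that representation, which gives a simple consistency check. The enumeration grows quickly with the number of observations, so a practical implementation would prune pieces that are never minimal.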

Item Metadata

Title	Piecewise linear Markov decision processes with an application to partially observable Markov models
Creator	Sawaki, Katsushige
Publisher	University of British Columbia
Date Issued	1977
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-02-25
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0094312
URI	http://hdl.handle.net/2429/20920
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Business Administration
Affiliation	Business, Sauder School of
Degree Grantor	University of British Columbia
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

UBC_1977_A1 S28.pdf -- 3.13MB
