Algorithmic learning in games

UBC Theses and Dissertations

Possnig, Clemens

Abstract

This dissertation studies algorithmic agents that interact repeatedly in strategic settings. Chapter 2 provides asymptotic results for a family of reinforcement learning algorithms known as ‘actor-critic learners’. Each such agent simultaneously estimates a ‘critic’, such as a value function, and updates its policy, referred to as the ‘actor’. The critic indicates directions of improvement for the actor. I establish sufficient conditions for the consistency of each agent’s parametric critic estimator, which enables the agents to adapt and find optimal responses despite the non-stationarity inherent to multi-agent settings. The conditions depend on the environment, the number of observations used in the critic estimation, and the policy stepsize. Chapter 3 presents an analytical characterization of the long-run policies learned by algorithmic agents in the multi-agent setting. The algorithms studied there form a superset of the family considered in Chapter 2. These algorithms update policies, which are maps from observed states to actions. I show that the long-run policies correspond to equilibria that are stable points of a tractable differential equation. In Chapter 4, I consider algorithmic agents playing a repeated Cournot game of quantity competition. In this setting, learning the stage-game Nash equilibrium serves as a noncollusive benchmark. I give necessary and sufficient conditions for this Nash equilibrium not to be learned. These conditions are requirements on the state variables of the algorithms and on the stage game. When algorithms determine actions based only on the past period’s price, the Nash equilibrium can be learned. However, agents may condition their actions on richer state variables beyond the past period’s price. In that case, I give sufficient conditions under which the policies converge to a collusive equilibrium with positive probability, while never converging to the Nash equilibrium.
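To make the noncollusive benchmark of Chapter 4 concrete, the following is a minimal sketch of a Cournot stage game. It is an illustration only, not the dissertation's model: the linear inverse demand P = a - b(q1 + q2), the constant marginal cost c, the parameter values, and all function names are assumptions introduced here. It compares the stage-game Nash quantity with an equal split of the monopoly output, which stands in for a collusive outcome.

```python
# Illustrative Cournot duopoly (assumed linear demand and constant marginal cost;
# these choices are hypothetical and not taken from the dissertation).

def best_response(q_other, a=10.0, b=1.0, c=1.0):
    """Profit-maximizing quantity against a rival producing q_other."""
    return max(0.0, (a - c - b * q_other) / (2.0 * b))

def profit(q_own, q_other, a=10.0, b=1.0, c=1.0):
    """Stage-game profit under inverse demand P = a - b * (q_own + q_other)."""
    price = max(0.0, a - b * (q_own + q_other))
    return (price - c) * q_own

def nash_quantity(a=10.0, b=1.0, c=1.0, iterations=200):
    """Approximate the symmetric Nash quantity by iterating best responses."""
    q1 = q2 = 0.0
    for _ in range(iterations):
        q1, q2 = best_response(q2, a, b, c), best_response(q1, a, b, c)
    return q1

def collusive_quantity(a=10.0, b=1.0, c=1.0):
    """Per-firm quantity when two firms split the monopoly output equally."""
    return (a - c) / (4.0 * b)

q_nash = nash_quantity()
q_coll = collusive_quantity()
print("Nash quantity per firm:", q_nash)
print("Collusive quantity per firm:", q_coll)
print("Profit at Nash:", profit(q_nash, q_nash))
print("Profit at collusion:", profit(q_coll, q_coll))
print("Best response to collusive rival:", best_response(q_coll))
```

Under these assumed parameters, each firm earns more at the collusive quantity than at the Nash quantity, yet the one-shot best response to a colluding rival is to produce more. This is why the stage-game Nash equilibrium is the natural noncollusive reference point, and why sustaining a collusive outcome requires policies that condition on past play.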

Item Metadata

Title	Algorithmic learning in games
Creator	Possnig, Clemens
Supervisor	Li, Hao; Farinha Luz, Vitor
Publisher	University of British Columbia
Date Issued	2023
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2023-09-07
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0435830
URI	http://hdl.handle.net/2429/85877
Degree	Doctor of Philosophy - PhD
Program	Economics
Affiliation	Arts, Faculty of; Vancouver School of Economics
Degree Grantor	University of British Columbia
Graduation Date	2023-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
