UBC Theses and Dissertations

Bridging control and reinforcement learning with partial model knowledge

Wang, Shuyuan

Abstract

This thesis develops a series of complementary approaches that bridge control theory and reinforcement learning through systematic exploitation of partial model knowledge. Control theory leverages known system structure to deliver precise solutions but struggles with unknown dynamics, whereas reinforcement learning is flexible yet typically suffers from poor sample efficiency. The proposed methods integrate the strengths of both paradigms by combining model-based control, where knowledge is available, with learning-based adaptation for the unknown components.

We first consider linear systems and introduce Partial Knowledge Least Squares Policy Iteration (PLSPI), which decomposes the system dynamics into known and unknown components (A = A1 + A2, B = B1 + B2). This formulation enables a principled interpolation between optimal control and reinforcement learning, improving sample efficiency while retaining robustness to modeling errors. We then provide a theoretical analysis explaining when and why PLSPI achieves better convergence than standard LSPI: through a spectral analysis of the value function estimator, we show that the estimator norm in PLSPI can be smaller under certain conditions, leading to reduced variance and improved convergence behavior. Experiments across different partial-knowledge configurations further illustrate how the choice of known model components influences learning performance.

We then extend the partial-knowledge paradigm to nonlinear systems through a hybrid architecture that combines structured control modules with neural network policies. Unlike PLSPI, this framework explicitly separates the roles of known and unknown dynamics within the policy itself, enabling effective nonlinear control with better sample efficiency than black-box deep reinforcement learning.
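To make the decomposition concrete, the following is a minimal illustrative sketch (not code from the thesis) of the known/unknown split the abstract describes for a linear system x_{t+1} = A x_t + B u_t, with A = A1 + A2 and B = B1 + B2. All matrices here are hypothetical example values; (A1, B1) stand for the part a model-based controller can exploit, and (A2, B2) for the residual a learner must account for.

```python
# Hypothetical illustration of the partial-knowledge decomposition
# A = A1 + A2, B = B1 + B2 from the abstract.  Example numbers only.

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def add_vec(a, b):
    return [x + y for x, y in zip(a, b)]

# True system (assumed unknown in full to the controller).
A = [[1.0, 0.1], [0.0, 1.0]]
B = [[0.0], [0.1]]

# Known part available to the model-based component ...
A1 = [[1.0, 0.0], [0.0, 1.0]]
B1 = [[0.0], [0.0]]
# ... and the residual left to learning: A2 = A - A1, B2 = B - B1.
A2 = [[A[i][j] - A1[i][j] for j in range(2)] for i in range(2)]
B2 = [[B[i][j] - B1[i][j] for j in range(1)] for i in range(2)]

def step(x, u):
    """One step of the true dynamics, written as known part plus residual."""
    known = add_vec(mat_vec(A1, x), mat_vec(B1, u))
    residual = add_vec(mat_vec(A2, x), mat_vec(B2, u))
    return add_vec(known, residual)

# Stepping via the split reproduces the full dynamics exactly.
print(step([1.0, 0.0], [1.0]))
```

The point of the split is that the known terms (A1, B1) can be handled with standard optimal-control machinery, leaving the learner a smaller residual to estimate from data.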
Finally, we develop DiLQR, a framework that makes the iterative Linear Quadratic Regulator (iLQR), a numerical nonlinear controller, fully differentiable via implicit differentiation. The proposed method computes exact gradients while accounting for all parameter dependencies, and introduces a forward algorithm with O(T) complexity, yielding substantial gains in computational efficiency and learning performance. Overall, this thesis presents principled methods for leveraging structural knowledge without sacrificing adaptability, with applications in robotics, autonomous systems, and industrial process control where sample efficiency and reliability are critical.
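The implicit-differentiation idea underlying DiLQR can be illustrated on a much simpler problem. The sketch below is not the thesis's DiLQR code; it shows the general technique on a scalar example: an iterative solver computes x* with x*^2 = theta, and rather than differentiating through the solver's iterations, the implicit function theorem applied to F(x, theta) = x^2 - theta = 0 gives dx*/dtheta = 1 / (2 x*) directly at the solution.

```python
# Generic illustration (assumed, not from the thesis) of implicit
# differentiation through an iterative solver: gradients are taken at
# the fixed point instead of unrolling the iterations.

def solve(theta, iters=50):
    """Babylonian iteration converging to x* = sqrt(theta)."""
    x = theta if theta > 1 else 1.0
    for _ in range(iters):
        x = 0.5 * (x + theta / x)
    return x

def grad_implicit(theta):
    """dx*/dtheta via the implicit function theorem on x^2 - theta = 0:
    dF/dx = 2x, dF/dtheta = -1, so dx*/dtheta = 1 / (2 x*)."""
    x_star = solve(theta)
    return 1.0 / (2.0 * x_star)

print(solve(2.0), grad_implicit(2.0))
```

For iLQR the same principle applies at the converged trajectory, which is what allows exact gradients without storing or back-propagating through every solver iteration; the O(T) forward algorithm mentioned in the abstract refers to the per-time-step cost of that gradient computation.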


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International