Reinforcement learning of a feedforward controller with soft actor-critic for a reaching task

UBC Theses and Dissertations

Srungarapu, Venkata Praneeth

Abstract

Learning to control is a complicated process, yet humans seamlessly control a wide variety of complex movements. Motor theory suggests that humans begin motor learning by learning to act in a feedforward manner. However, it is still unclear how humans learn feedforward control strategies. We hypothesize that this mechanism is governed by the criterion of success (reinforcement) or failure (penalty) at the task. Taking this as inspiration, we investigate how a feedforward controller can be learned using reinforcement learning (RL). We also investigate how factors such as the difficulty of the task and the noise present in the motor system relate to human motor control. To this end, we build a one-dimensional muscle-based biomechanical model that defines a reaching task. The model contains an actuator controlled by an agonist-antagonist muscle pair and a goal, or target, to reach. We then train an end-to-end RL-based feedforward controller to estimate control signals while taking the difficulty level of the reaching task and the noise level into account. To design the learning-based controller, we adapted the model-free RL algorithm "Soft Actor-Critic" (SAC). During training, we observed that the SAC-based feedforward controller learned to prepare co-activation in order to reach a target in the kinematic space using a minimum number of controller predictions. Moreover, we found that the controller learned to estimate high-amplitude muscle activations as a way of adapting to the noise level in the motor system. Finally, we conducted an information analysis similar to Fitts' analysis to determine how task difficulty and noise affected the controller. We quantify these effects by finding the relationship between the number of controller predictions, the task difficulty, and the amount of noise.
Our analysis demonstrates that the number of controller predictions increases exponentially with the difficulty of the task when the amount of noise is held constant. With the index of difficulty (ID) held constant, the number of controller predictions increases linearly with the amount of noise. Additionally, we found that the effect of target width is more dominant than that of distance, which confirms Welford's observation.
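
As a rough illustration of the Fitts-style quantities mentioned above, the following minimal Python sketch computes an index of difficulty from a target distance and width. The abstract does not say which ID formulation the thesis uses, so the original Fitts form, ID = log2(2D/W), is an assumption here, and the function names `fitts_id` and `welford_terms` are hypothetical. The second function splits the difficulty into separate distance and width terms, which is the kind of decomposition under which width and distance can be weighted differently, as in Welford's observation.

```python
import math


def fitts_id(distance: float, width: float) -> float:
    """Index of difficulty in bits, assuming Fitts' original form log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def welford_terms(distance: float, width: float) -> tuple[float, float]:
    """Separate distance and width terms, log2(D) and log2(1 / W).

    Fitting one coefficient per term (instead of one coefficient for a
    combined ID) lets the width effect differ from the distance effect.
    """
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(distance), -math.log2(width)


# Illustrative values only; these are not taken from the thesis.
base = fitts_id(8.0, 2.0)      # log2(8)  = 3 bits
farther = fitts_id(16.0, 2.0)  # log2(16) = 4 bits
narrower = fitts_id(8.0, 1.0)  # log2(16) = 4 bits
print(base, farther, narrower)
```

In the combined ID, doubling the distance and halving the width each add exactly one bit, so the two are indistinguishable. A dominant width effect, as reported above, can only appear once the two terms are analysed separately.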

Item Metadata

Title	Reinforcement learning of a feedforward controller with soft actor-critic for a reaching task
Creator	Srungarapu, Venkata Praneeth
Supervisor	Fels, Sidney
Publisher	University of British Columbia
Date Issued	2021
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2021-10-04
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0402421
URI	http://hdl.handle.net/2429/79880
Degree	Master of Applied Science - MASc
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2021-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
