UBC Theses and Dissertations


Distributed reinforcement learning in emergency response simulation

Lopez, Cesar


In this thesis we present the implementation of a coordinated decision-making agent for emergency response scenarios. The agent is implemented using Reinforcement Learning (RL), a machine learning technique that enables an agent to learn by experimenting. The agent’s learning is based on rewards: feedback signals proportional to how good its actions are. The simulation platform is i2Sim, the Infrastructure Interdependencies Simulator, on which we tested the suitability of the approach in previous studies. In this work, we add new features that increase the speed of convergence and enable distributed processing. These additions include enhanced reward and exploration schemes and a scheduler for orchestrating the distributed training. We include two test cases. The first case is a compact model with 4 critical infrastructures. In this model, the agent’s training required only 10% of the attempts reported in past studies by our group. The improvement in convergence comes from the enhanced shaping reward and exploration schemes. We trained the agent across 24 simultaneous configurations of our model (scenarios). The complete distributed training process took 4 minutes. The second case is an extended model, a more detailed representation of the first case, with additional infrastructures and a higher level of resolution. Adding more infrastructures made the dimensionality of the problem four thousand times larger. This growth did not degrade performance, and training converged even faster. We ran 96 parallel instances of the extended model and the process completed in 2.87 minutes. The results show a fast and stable convergence framework with a wide range of applicability. This agent could help during multiple stages of emergency response, including real-time situations.
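The ingredients named in the abstract (a shaping reward, an exploration scheme, and simultaneous training over several scenarios) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the toy chain environment, the `potential` function, the hyperparameters, and the thread pool standing in for the scheduler. None of it is taken from the thesis or from i2Sim.

```python
import random
from concurrent.futures import ThreadPoolExecutor

GAMMA = 0.95
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def potential(state, goal):
    """Hypothetical shaping potential: progress toward the goal.

    It is zero at the terminal state so that the shaping term does not
    change which policy is optimal.
    """
    return 0.0 if state == goal else state / goal


def step(state, action, goal):
    """Toy deterministic chain: reward 1 only on reaching the goal state."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == goal
    return nxt, (1.0 if done else 0.0), done


def train(n_states, seed, episodes=300, alpha=0.5, max_steps=200):
    """Tabular Q-learning with a shaping reward and decaying epsilon-greedy."""
    rng = random.Random(seed)  # private generator keeps each scenario reproducible
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    epsilon = 1.0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Exploration scheme: random action with probability epsilon.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            nxt, reward, done = step(state, action, goal)
            # Shaping reward: environment reward plus the potential difference.
            shaped = reward + GAMMA * potential(nxt, goal) - potential(state, goal)
            target = shaped if done else shaped + GAMMA * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        epsilon = max(0.05, epsilon * 0.98)  # explore less as training proceeds
    # Greedy action for every non-terminal state.
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(goal)]


def train_all(scenarios, workers=4):
    """Train every (n_states, seed) scenario concurrently; threads stand in for a scheduler."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cfg: train(*cfg), scenarios))


SCENARIOS = [(5, 0), (6, 1), (7, 2), (8, 3)]
policies = train_all(SCENARIOS)
```

Each scenario owns its Q-table and random generator, so the runs share no state and could equally be dispatched to separate processes or machines. On this toy problem the shaping term rewards every step toward the goal immediately, which is the intuition behind the faster convergence the abstract attributes to reward shaping.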



Attribution-NonCommercial-NoDerivatives 4.0 International