BIRS Workshop Lecture Videos
Multi-armed bandit problem on rescue resource allocation
Chiang, Wen-Hao
Natural disasters pose a significant threat in many areas of the United States. Throughout history, many catastrophic ones have taken the lives of millions of people and caused severe economic damage. For example, Hurricane Harvey caused losses of more than 125 billion dollars in the United States in 2017. It also claimed at least 107 lives, displaced 30,000 people, and prompted more than 17,000 rescues. However, safety preparations can be made in advance to mitigate the potential damage and reduce losses. Accurate prediction of natural disaster hotspots and timely allocation of rescue resources are two primary precautions that allow help to be sent before major destructive events strike the targeted area. By monitoring past events, we can decide strategically where to deploy rescue teams beforehand or when to announce an early evacuation warning. Such a prediction task can be viewed as a reinforcement learning process, in which we make decisions based on the rewards from our past actions. We formulate this decision-making problem as a multi-armed bandit (MAB) problem. We treat every region in the disaster-stricken area as an arm of a multi-armed bandit machine. In every time period, by "pulling the arms," we observe the number of disaster events of interest in only a few regions, and we decide which regions to observe in the next time period based on these limited past observations. We propose a MAB strategy that balances exploring overlooked places against exploiting regions with high event intensity. Our strategy combines the traditional epsilon-greedy algorithm, which allows us to probe uncharted areas, with a non-parametric Hawkes process, which captures the temporal dynamics in well-observed regions. Several MAB algorithms are compared with our proposed model.
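The region-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_regions` and `update_estimates`, the `decay` parameter, and the exponentially weighted count used as the intensity estimate are all assumptions made for illustration. In particular, the weighted count is a simple stand-in for the non-parametric Hawkes process, which the abstract does not specify in detail.

```python
import random


def select_regions(estimates, k, epsilon, rng):
    """Pick k distinct regions (arms) to observe in the next time period.

    For each of the k slots, explore with probability epsilon (uniform choice
    among regions not yet picked); otherwise exploit the unpicked region with
    the highest estimated event intensity.
    """
    remaining = list(range(len(estimates)))
    chosen = []
    for _ in range(k):
        if rng.random() < epsilon:
            pick = rng.choice(remaining)  # explore an arbitrary region
        else:
            pick = max(remaining, key=lambda r: estimates[r])  # exploit
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def update_estimates(estimates, observed, decay=0.5):
    """Update intensity estimates for the observed regions only.

    ASSUMPTION: an exponentially weighted average of event counts stands in
    for the Hawkes-process intensity used in the proposed model.
    """
    for region, count in observed.items():
        estimates[region] = decay * estimates[region] + (1 - decay) * count
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.0, 0.0, 0.0, 0.0]
    # Hypothetical event counts per region for three time periods.
    periods = [[0, 3, 1, 0], [1, 4, 0, 0], [0, 5, 2, 1]]
    for counts in periods:
        chosen = select_regions(estimates, k=2, epsilon=0.2, rng=rng)
        update_estimates(estimates, {r: counts[r] for r in chosen})
        print(chosen, estimates)
```

Only the chosen regions reveal their counts in each period, which is what makes exploration necessary: a region that is never observed keeps its initial estimate regardless of how many events occur there.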
Specifically, we gather a dataset of city service requests reported in Houston, Texas, during Tropical Storm Harvey. Flooding events are extracted as the natural disaster of interest. Our model and the baselines are evaluated by their recall of flooding events over the recorded timespan of Hurricane Harvey. The proposed model achieves an improvement of 36.38% over the best baseline model. This improvement shows that our model can not only exploit the best available options, where the probability of flooding is high, but also gather more information in overlooked places. In future work, more timely and relevant information, such as tweets from Twitter and water-level records from observation points near rivers and the ocean, will be integrated into our model to make its predictions more accurate.
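The abstract does not define how recall is computed. One plausible reading, sketched below in Python, is the share of all recorded flooding events that occurred in a region the strategy had selected for that time period. The function name `recall` and the input layout are assumptions made for illustration, not the authors' evaluation code.

```python
def recall(event_counts, chosen_regions):
    """Fraction of all events that fell in regions selected for their period.

    event_counts[t][r] is the number of events in region r during period t;
    chosen_regions[t] lists the regions selected for period t.
    ASSUMPTION: this definition of recall is one reading of the abstract.
    """
    total = 0
    captured = 0
    for counts, chosen in zip(event_counts, chosen_regions):
        total += sum(counts)
        # set() guards against counting a region twice in one period.
        captured += sum(counts[r] for r in set(chosen))
    return captured / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for two periods and three regions.
    counts = [[3, 0, 1], [0, 2, 2]]
    picks = [[0], [1, 2]]
    print(recall(counts, picks))
```

Under this reading, a strategy scores higher by selecting regions where events actually occur, so pure exploration and pure exploitation can both lose recall when event hotspots shift over time.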
Attribution-NonCommercial-NoDerivatives 4.0 International