Reinforcement learning for data scheduling in internet of things (IoT) networks

UBC Theses and Dissertations

Reinforcement learning for data scheduling in internet of things (IoT) networks Rashtian, Hootan

Abstract

I investigate data prioritization and scheduling problems in Internet of Things (IoT) networks that handle large volumes of data. The criteria for prioritizing data depend on several factors, such as preserving the importance and timeliness of data messages, in environments with different levels of complexity. I explore three representative problems within the landscape of data prioritization and scheduling. First, I study the problem of scheduling the polling of data from sensors when it is not possible to gather all data at a processing centre. I present a centralized mechanism for choosing which sensors to gather data from at each polling epoch. The mechanism prioritizes sensors using information about the data generation rate, the expected value of the data, and its time sensitivity. This work relates to the restless bandit model but, unlike many other such models, operates in a continuous state space. The contribution is to derive an index policy and to show, through a quantitative study in which event arrivals follow a hyper-exponential distribution, that it can be useful even when it is not optimal. Second, I study the problem of balancing timeliness and criticality when gathering data from multiple sources using a hierarchical approach. A central decision-maker decides which local hubs to allocate bandwidth to, and the local hubs have to prioritize the sensors’ messages. An optimal policy requires global knowledge of the messages at each local hub and is therefore impractical. I propose a reinforcement-learning approach that accounts for both requirements. Evaluation results show that the proposed policy outperforms all other policies in the experiments except the impractical optimal policy. Finally, I consider the problem of handling the trade-off between timeliness and criticality when gathering data from multiple resources in complex environments. Dependencies among sensors in such environments lead to patterns in the data that are hard to capture. Motivated by the success of the Asynchronous Advantage Actor-Critic (A3C) approach, I modify A3C by embedding Long Short-Term Memory (LSTM) to improve performance in cases where vanilla A3C cannot capture patterns in the data. I demonstrate the effectiveness of the proposed solution with results from multiple scenarios.

Item Metadata

Title	Reinforcement learning for data scheduling in internet of things (IoT) networks
Creator	Rashtian, Hootan
Publisher	University of British Columbia
Date Issued	2020
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2020-09-01
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0394147
URI	http://hdl.handle.net/2429/75833
Degree	Doctor of Philosophy - PhD
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2020-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
