"Applied Science, Faculty of"@en . "Electrical and Computer Engineering, Department of"@en . "DSpace"@en . "UBCV"@en . "Mao, Shaobo"@en . "2012-07-30T17:26:25Z"@en . "2012"@en . "Master of Applied Science - MASc"@en . "University of British Columbia"@en . "The energy management policy of a rechargeable wireless sensor network (WSN) needs to take into account the energy harvesting process, and is thus different from that of a traditional WSN powered by non-rechargeable batteries. In this thesis, we study the energy allocation for sensing and transmission in an energy harvesting sensor node with a rechargeable battery. The sensor aims to maximize the expected total amount of data transmitted subject to time-varying energy harvesting rate, energy availability in the battery, data availability in the data buffer, and channel fading. In this thesis, we first consider the energy allocation problem that assumes a fixed sensor lifetime. Then, we extend the energy allocation problem by taking into account the randomness of the senor lifetime.\nIn the first part of this thesis, we study the joint energy allocation for sensing and transmission in an energy harvesting sensor node with a fixed sensor lifetime. We formulate the energy allocation problem as a finite-horizon Markov decision process\n(MDP) and propose an optimal energy allocation (OEA) algorithm using backward induction. We conduct simulations to compare the performance between our proposed OEA algorithm and the channel-aware energy allocation (CAEA) algorithm extended from [1]. Simulation results show that the OEA algorithm can transmit a much larger amount of data over a finite horizon than the CAEA algorithm under different settings.\nIn the second part of this thesis, we extend the joint energy allocation problem by taking into account the randomness of the sensor lifetime, and formulate the problem as an infinite-horizon discounted MDP. We propose an optimal stationary energy allocation (OSEA) algorithm using the value iteration. We then consider a special case with infinite data backlog and prove that the optimal transmission energy allocation (OTEA) policy is monotone with respect to the amount of battery energy available. Finally, we conduct extensive simulations to compare the performance of the OSEA, OTEA, and CAEA algorithms. Results show that the OSEA algorithm transmits the largest amount of data, and the OTEA algorithm can achieve a near-optimal performance."@en . "https://circle.library.ubc.ca/rest/handle/2429/42832?expand=metadata"@en . "Joint Energy Allocation for Sensing and Transmission in Rechargeable Wireless Sensor Networks by Shaobo Mao B.E., Zhejiang University, China, 2010 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES (Electrical and Computer Engineering) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) July 2012 c\u00C2\u00A9 Shaobo Mao, 2012 Abstract The energy management policy of a rechargeable wireless sensor network (WSN) needs to take into account the energy harvesting process, and is thus different from that of a traditional WSN powered by non-rechargeable batteries. In this thesis, we study the energy allocation for sensing and transmission in an energy harvesting sensor node with a rechargeable battery. 
The sensor aims to maximize the expected total amount of data transmitted subject to a time-varying energy harvesting rate, energy availability in the battery, data availability in the data buffer, and channel fading. In this thesis, we first consider the energy allocation problem that assumes a fixed sensor lifetime. Then, we extend the energy allocation problem by taking into account the randomness of the sensor lifetime.

In the first part of this thesis, we study the joint energy allocation for sensing and transmission in an energy harvesting sensor node with a fixed sensor lifetime. We formulate the energy allocation problem as a finite-horizon Markov decision process (MDP) and propose an optimal energy allocation (OEA) algorithm using backward induction. We conduct simulations to compare the performance between our proposed OEA algorithm and the channel-aware energy allocation (CAEA) algorithm extended from [1]. Simulation results show that the OEA algorithm can transmit a much larger amount of data over a finite horizon than the CAEA algorithm under different settings.

In the second part of this thesis, we extend the joint energy allocation problem by taking into account the randomness of the sensor lifetime, and formulate the problem as an infinite-horizon discounted MDP. We propose an optimal stationary energy allocation (OSEA) algorithm using value iteration. We then consider a special case with infinite data backlog and prove that the optimal transmission energy allocation (OTEA) policy is monotone with respect to the amount of battery energy available. Finally, we conduct extensive simulations to compare the performance of the OSEA, OTEA, and CAEA algorithms. Results show that the OSEA algorithm transmits the largest amount of data, and the OTEA algorithm can achieve a near-optimal performance.

Preface

Hereby, I declare that I am the first author of this thesis. Chapters 2 and 3 are based on work that has been published or submitted for publication. The related publications were done in collaboration with Dr. Man Hon Cheung and Prof. Vincent Wong. For all publications, I conducted the literature survey on related topics, performed the analysis, and carried out the simulations of the considered communication systems. Dr. Man Hon Cheung checked the analytical model and the simulation codes. Prof. Vincent Wong gave important suggestions on the presentation of the papers. The papers were originally prepared by me, and further revised by all the co-authors.

The following publications are accomplished through this research.

Journal Paper

• Shaobo Mao, Man Hon Cheung, and Vincent W.S. Wong, "Joint energy allocation for sensing and transmission in rechargeable wireless sensor networks," submitted.

Conference Paper

• Shaobo Mao, Man Hon Cheung, and Vincent W.S. Wong, "An optimal energy allocation algorithm for energy harvesting wireless sensor networks," in Proc. of IEEE International Conference on Communications (ICC), Ottawa, Canada, June 2012.

Table of Contents

Abstract
Preface
Table of Contents
List of Figures
List of Acronyms
Acknowledgments
1 Introduction
  1.1 Wireless Sensor Network
  1.2 Energy Harvesting Technology
    1.2.1 Energy Harvesting Methods
    1.2.2 Energy Harvesting Architectures
  1.3 Energy Management in Energy Harvesting WSNs
  1.4 Motivations
  1.5 Contributions
  1.6 List of Publications
  1.7 Structure of the Thesis
2 Energy Allocation Algorithm Based on Finite-horizon MDP
  2.1 System Model
  2.2 Problem Formulation
  2.3 Finite-Horizon MDP
  2.4 Performance Evaluation
3 Energy Allocation Algorithms Based on Infinite-horizon MDP
  3.1 Problem Formulation
  3.2 Energy Allocation Algorithms
    3.2.1 General Case
    3.2.2 Special Case: Infinite Data Backlog
  3.3 Performance Evaluation
4 Conclusions and Future Work
  4.1 Conclusions
  4.2 Future Work
Bibliography

List of Figures

1.1 Energy harvesting architectures with and without energy storage capability.
2.1 The system model of an energy harvesting wireless sensor node transmitting data to the receiver Rx of the sink.
2.2 Timing diagram of a Markov decision process (MDP).
2.3 A three-state Markov chain for the channel gain, where "B", "N", and "G" represent the channel in the bad, normal, and good states, respectively.
2.4 The total amount of data transmitted of the two algorithms for different numbers of total time slots K.
2.5 The total amount of data transmitted of the two algorithms for different average energy harvesting rates when K = 30.
2.6 The total amount of data transmitted of the two algorithms for different values of data-sensing efficiency parameter γ.
3.1 The total amount of data transmitted of the OTEA algorithm under different percentages of energy allocated for sensing p. Since the OSEA algorithm does not allocate a fixed amount of energy for sensing, its total amount of data transmitted is independent of p.
3.2 The optimal percentage of energy allocated for sensing under different data-sensing efficiency γ for the OTEA algorithm.
3.3 The total amount of data transmitted of the three algorithms for different average energy harvesting rates H̄.
3.4 The total amount of data transmitted of the OSEA algorithm and the OTEA algorithm for different values of data-sensing efficiency parameter γ.
3.5 The total amount of data transmitted of the OSEA algorithm and the OTEA algorithm for different battery storage capacities b_max.
3.6 The total amount of data transmitted of the OSEA algorithm and the OTEA algorithm for different data buffer sizes q_max.
3.7 The total amount of data transmitted of the OSEA algorithm and the OTEA algorithm for different values of discount factor ν.

List of Acronyms

AWGN  Additive White Gaussian Noise
CAEA  Channel-Aware Energy Allocation
CSI  Channel State Information
MAC  Medium Access Control
MDP  Markov Decision Process
OEA  Optimal Energy Allocation
OSEA  Optimal Stationary Energy Allocation
OTEA  Optimal Transmission Energy Allocation
POMDP  Partially Observable Markov Decision Process
SNR  Signal-to-Noise Ratio
WSN  Wireless Sensor Network

Acknowledgments

Foremost, I would like to express my deepest gratitude to my supervisor, Prof. Vincent Wong, for his continuous support of my graduate study and research, and for his patience, motivation, enthusiasm, and immense knowledge. His guidance helped me throughout the research and writing of this thesis. I could not have imagined having a better advisor and mentor for my Master's study.

I would also like to thank Prof. Victor Leung and Prof. Sathish Gopalakrishnan for their invaluable comments on my thesis. I am heartily thankful to my colleague, Dr. Man Hon Cheung, who has provided me with invaluable assistance during my graduate research. I would also like to thank my colleagues in Prof. Wong's research group: Dr. Vahid Shah-Mansouri, Dr. Taesoo Kwon, Dr. Keivan Ronasi, Pedram Samadi, Binglai Niu, Enxin Yao, Bojiang Ma, and Suyang Duan, as well as my colleagues in the Communications Lab: Dr. Derrick Wing Kwan Ng, Dr. Hu Jin, Peiran Wu, and Jun Zhu, who have provided constructive suggestions on my work.

Finally, I would like to express my profound gratitude to my beloved parents for their understanding, support, and endless love. To them I dedicate this thesis.

This research is supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Chapter 1

Introduction

This chapter introduces background on wireless sensor networks (WSNs), energy harvesting technology, and energy management in rechargeable WSNs. The scope of the thesis is given at the end of this chapter.

1.1 Wireless Sensor Network

A WSN consists of a large number of spatially distributed sensor nodes, which have the capabilities of sensing, data processing, and communicating [2]. It can be deployed for remote environmental monitoring and target tracking, for example, volcano monitoring [3], habitat monitoring [4], vehicle tracking [5], and structural monitoring [6].
A wireless sensor node is typically equipped with three basic components: a sensing module for data acquisition from the surrounding environment, a processing module for local data processing and storage, and a wireless communication module for data transmission. A battery with a limited energy budget supplies the energy needed by the device to perform its tasks. In addition, an actuator may also be incorporated in a sensor node, depending on the application and the type of sensors used.

In the design of a WSN, there are several constraints, such as a limited amount of energy due to finite battery capacity, short communication range, low bandwidth, and limited processing and storage in each sensor node. Among all these design constraints, the major limitation is that the sensor node can only operate for a limited amount of time due to the finite capacity of the battery. However, a sensor network should have a lifetime long enough to fulfill the requirements of the application. In many cases, the sensor network may be required to perform its task for several months, or even years. Therefore, how to prolong the lifetime of a WSN is a crucial question.

Considerable research effort has been dedicated to prolonging the lifetime of a WSN by improving its energy efficiency. Examples include power-aware storage, energy-aware medium access control (MAC) protocols [7],[8],[9], routing protocols [10],[11], and duty-cycling strategies [12]. While all of these techniques optimize the energy consumption so as to maximize the lifetime of the sensor network, the lifetime remains bounded and finite, and thus the energy-related limitations are not removed.

1.2 Energy Harvesting Technology

Recently, the idea of energy harvesting was proposed to address the problem of finite lifetime in a WSN by enabling the sensor nodes to replenish energy from ambient sources, for example, by using solar panels to convert sunlight into electricity, by using vibration-based energy harvesting technology, or by utilizing thermoelectric generators [13] [14] [15].

1.2.1 Energy Harvesting Methods

There are mainly three energy harvesting methods, as mentioned above: the photonic method, the vibrational method, and the thermal method.

• Photonic method: Silicon solar cells exploit the photovoltaic effect to convert sunlight into electricity. When the photons of sunlight strike the silicon cell, their energy may be absorbed and transferred to electrons of the silicon, which are then able to escape from their normal positions in the silicon to become part of the current in an electrical circuit. This phenomenon is called the photovoltaic effect. Since solar energy is a convenient harvesting source, many solar energy harvesting sensor nodes have already been implemented, for example, Heliomote [16], Everlast [17], Prometheus [18], and HydroWatch [19].

• Vibrational method: Vibrations can generate electric energy. There are mainly three methods to harvest vibrations: piezoelectric materials, inductive systems, and capacitive systems [15].

• Thermal method: The thermoelectric effect is the direct conversion of temperature differences to electric voltage. Thermoelectric devices utilize this effect and can generate electricity when there exists a temperature gradient across the device.
Compared with vibration-based devices, thermoelectric devices can function for a much longer duration due to the absence of any moving parts.

1.2.2 Energy Harvesting Architectures

In general, energy harvesting architectures for sensor nodes can be divided into two categories, the harvest-use architecture and the harvest-store-use architecture [14], as shown in Figure 1.1.

[Figure 1.1: Energy harvesting architectures with and without energy storage capability. (a) harvest-use; (b) harvest-store-use.]

• Harvest-use architecture: As shown in Figure 1.1 (a), in the harvest-use architecture, the energy harvesting system powers the sensor node directly. Therefore, in order to keep the sensor operational, the power output of the harvesting system must be continuously above the minimum operating point. Otherwise, the sensor node will be disabled.

• Harvest-store-use architecture: Figure 1.1 (b) depicts the harvest-store-use architecture, which has an additional energy storage component compared with the harvest-use architecture. The energy is harvested by the harvesting system and stored in the energy storage component. The energy storage is quite useful when the harvested energy exceeds the sensor's current need. The stored energy can be used later, either when there is no harvesting opportunity or when the energy usage of the sensor node has to be increased to improve performance.

1.3 Energy Management in Energy Harvesting WSNs

The energy management of an energy harvesting WSN differs from that of a WSN powered by non-rechargeable batteries in several ways. First, with a potentially infinite amount of energy available to the sensor nodes, an energy harvesting WSN can remain functional for a long period of time. Hence, energy conservation is not the prime design issue. Second, the energy management strategy for an energy harvesting WSN needs to take into account the energy replenishment process. For example, an overly conservative energy expenditure may limit the amount of transmitted data by failing to take full advantage of the energy harvesting process. On the other hand, an overly aggressive use of energy may result in an energy outage, which prevents some sensor nodes from functioning properly. Third, the energy availability constraint, which requires the energy consumption to be less than the energy stored in the battery, must be met at all times. This constraint complicates the design of an energy management policy, since the current energy consumption decision affects future outcomes.

A lot of research effort has been devoted recently to studying energy management and data transmission in energy harvesting WSNs. Kansal et al. in [20] proposed analytically tractable models to characterize the complex time-varying nature of energy sources. Distributed algorithms were developed to utilize the harvested energy efficiently. Sharma et al. in [21] proposed energy management schemes for a single energy harvesting sensor node that achieve the maximum throughput and minimum mean delay. A greedy policy was shown to achieve both objectives in the low signal-to-noise ratio (SNR) regime. Gatzianas et al. in [22] presented an online
adaptive transmission scheme for wireless networks with rechargeable batteries that maximizes total system utility and stabilizes the data queue using Lyapunov techniques. Huang et al. in [23] proposed an online algorithm that achieves a close-to-optimal utility performance with finite-capacity energy storage devices; Lyapunov optimization techniques with weight perturbation were used. In [24], utility-optimal energy allocation algorithms were proposed for systems with predictable or stochastic energy availability.

References [25, 26] studied the transmission completion time minimization problem in energy harvesting wireless networks, and assumed that the energy harvesting times and harvested energy amounts were known before the transmission started. Yang et al. in [25] investigated two different scenarios of data arrivals and proposed optimal off-line scheduling policies. Antepli et al. in [26] considered the problem with an additive white Gaussian noise (AWGN) broadcast channel. The special structure in the problem was exploited, and an iterative off-line algorithm that minimizes the transmission completion time for the case of a two-user broadcast channel was proposed.

Some of the recent works on energy harvesting WSNs have formulated the energy management problem as a Markov decision process (MDP) [27, 28]. Ho et al. in [1] proposed a throughput-optimal energy allocation algorithm for a time-slotted system under a time-varying fading channel and energy source by using MDP. In [29], a throughput-optimal energy allocation policy was derived in a continuous time model, and suboptimal online waterfilling schemes were proposed to address the dimensionality problem inherent in the MDP solution. Chen et al. in [30] studied the energy allocation problem of a single node using the shortest path approach. A simple distributed heuristic scheme was proposed that solves the joint energy allocation and routing problem in a rechargeable WSN. Li et al. in [31] proposed energy efficient scheduling strategies for cooperative communications in energy harvesting WSNs to maximize the long-term utility. The scheduling problems under two different assumptions were formulated and solved using MDP and partially observable MDP (POMDP).

1.4 Motivations

Most of these results from [1, 29, 30, 21, 22, 23, 24] for energy management in energy harvesting WSNs only considered the special case where there is either an infinitely long data backlog or an infinite data buffer. Yet, it is more practical to consider a finite data buffer. Moreover, the energy consumed in data sensing has largely been overlooked in the literature. This motivates us to design an optimal energy allocation (OEA) algorithm for energy harvesting WSNs that takes into account both the data sensing energy consumption and the finite capacity of the data buffer. However, these considerations introduce new challenges. For instance, if the sensor node consumes an insufficient amount of energy for sensing but an excessive amount of energy for transmission, then the data buffer may be empty, which leads to a reduction in the total amount of data transmitted. Thus, the sensor node needs to maintain a good balance between the energy consumed for sensing and the energy consumed for transmission.

1.5 Contributions

In this thesis, we consider the design of joint energy allocation algorithms for sensing and transmission in energy harvesting WSNs.
We consider a point-to-point wireless link between an energy harvesting sensor node and a sink. The channel and energy harvesting rate may vary over time. The sensor node has a rechargeable battery and a data buffer with finite capacity. Our objective is to maximize the expected total amount of data transmitted. The sensor node needs to decide the amount of energy it should allocate for sensing and transmission in each time slot by taking into account the battery energy level, data buffer level, energy harvesting rate, and channel condition. In Chapter 2, we consider the case that the sensor lifetime is fixed. In Chapter 3, we take into account the randomness of the sensor lifetime. The main contributions of this thesis are as follows:

• In Chapter 2, we study the energy allocation problem for sensing and transmission in an energy harvesting WSN over a finite horizon. The sensor lifetime is a fixed value. We formulate it as a finite-horizon MDP under channel fluctuations and energy variations in a time-slotted system. We obtain the optimal energy allocation policy and propose the OEA algorithm by using backward induction. We provide extensive simulation results to compare the performance of the OEA algorithm and the channel-aware energy allocation (CAEA) algorithm extended from [1]. The results show that the OEA algorithm can transmit a much larger amount of data over a finite horizon than the CAEA algorithm under different settings.

• In Chapter 3, we extend the joint energy allocation problem by taking into account the randomness of the sensor lifetime, and formulate the problem as an infinite-horizon discounted MDP. We obtain the optimal stationary energy allocation (OSEA) policy and propose the OSEA algorithm by using value iteration in MDP. We also study the transmission energy allocation problem under the assumption of infinite data backlog. We obtain structural results for the optimal transmission energy allocation (OTEA) policy, and prove that the OTEA policy is a monotonically increasing function of the available battery energy. Finally, we provide extensive simulation results to compare the performance of the OSEA, OTEA, and CAEA algorithms. We study the impact of the average energy harvesting rate, the battery capacity, the data buffer size, the lifetime of the sensor node, and the data-sensing efficiency (i.e., the amount of data that the sensor can sense per unit energy) on the total amount of transmitted data. The results show that the OSEA algorithm transmits the largest amount of data among the three algorithms, and the OTEA algorithm can achieve a near-optimal performance.

1.6 List of Publications

The following publications have been completed based on the work in this thesis.

• Shaobo Mao, Man Hon Cheung, and Vincent W.S. Wong, "An optimal energy allocation algorithm for energy harvesting wireless sensor networks," in Proc. of IEEE International Conference on Communications (ICC), Ottawa, Canada, June 2012.

• Shaobo Mao, Man Hon Cheung, and Vincent W.S. Wong, "Joint energy allocation for sensing and transmission in rechargeable wireless sensor networks," submitted to IEEE Transactions on Wireless Communications, 2012.
1.7 Structure of the Thesis

The rest of the thesis is organized as follows. In Chapter 2, we present the energy allocation algorithm design in energy harvesting WSNs over a finite horizon. In Chapter 3, we extend the energy allocation problem in Chapter 2 by taking into account the randomness of the sensor lifetime. Conclusions and future work are given in Chapter 4.

Chapter 2

Energy Allocation Algorithm Based on Finite-horizon MDP

In this chapter, we present the design of energy allocation algorithms for sensing and transmission in an energy harvesting sensor node with a rechargeable battery and a finite data buffer. We formulate the energy allocation problem as a finite-horizon MDP and solve it by using backward induction.

2.1 System Model

As shown in Figure 2.1, we consider a single energy harvesting sensor node, which contains a rechargeable battery with capacity b_max Joules and a data buffer with size q_max Mbits. We assume that the system is time-slotted with K time slots and the duration of a time slot is τ sec. We let k ∈ K ≜ {0, 1, . . . , K − 1} be the time slot index. The sensor node performs sensing in the field, stores the sensed data in the buffer, and transmits the data to the receiver Rx of the sink over a wireless channel. We consider an AWGN channel with block flat fading. That is, the channel remains constant for the duration of each time slot, but may change at the slot boundaries. Let α_k be the channel gain in time slot k. We assume that the data transmission in every time slot is successful, which is reasonable since we can apply proper channel coding techniques.

[Figure 2.1: The system model of an energy harvesting wireless sensor node transmitting data to the receiver Rx of the sink.]

We assume that the sink sends delayed channel state information (CSI) of the previous time slot back to the sensor node. At the beginning of time slot k, the sensor node only knows the value of α_{k−1}, but not α_k. The stored battery level is b_k and the amount of stored data in the data buffer is q_k. During the whole time slot k, the sensor node is able to replenish energy by h_k, which can be used for sensing or transmission from time slot k + 1 onward. As a result, the sensor node does not know the value of h_k until the beginning of the next time slot k + 1. In other words, at the beginning of time slot k, the sensor node knows the value of h_{k−1}, but not h_k.

If the channel gain is α_k and the allocated transmission energy is e_k in time slot k, then the instantaneous transmission power is e_k/τ. We consider that the sensor node is able to transmit µ(e_k, α_k) bits of data in time slot k, where µ(e_k, α_k) is, in general, a monotonically non-decreasing and concave function in e_k given α_k. One such function is given by [32, pp. 172]:

$$\mu(e_k, \alpha_k) = \tau W \log_2\!\left(1 + \frac{\alpha_k e_k}{N_0 W \tau}\right) \text{ bits}, \qquad (2.1)$$

where N_0 is the power spectral density of the Gaussian noise, and W is the bandwidth of the channel.
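To make (2.1) concrete, the following is a minimal Python sketch of the rate function (not part of the thesis simulations, which were written in Matlab). The bandwidth W, noise density N_0, and channel gains follow the values given later in Section 2.4; the slot length τ = 1 s and the 10 J energy input are illustrative assumptions.

```python
import math

def mu(e_k, alpha_k, tau=1.0, W=1e5, N0=1e-18):
    """Bits transmittable in slot k according to (2.1)."""
    return tau * W * math.log2(1.0 + (alpha_k * e_k) / (N0 * W * tau))

# Spending 10 J in the "Normal" channel state (alpha = 1e-13, cf. Section 2.4):
print(mu(10.0, 1e-13) / 1e6, "Mbits")   # ~0.35 Mbits in one slot
```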
For sensing in time slot k, we let x(s_k) be the amount of data generated when s_k units of energy are used for sensing. In general, x(s_k) is a monotonically non-decreasing and concave function in s_k. The data obtained by sensing in time slot k are stored in the data buffer until they are transmitted in the subsequent time slots. Apart from sensing and transmission, we assume that the other energy consumption in the sensor node, for example, the processing energy, the energy consumed for storing information in the memory, the energy for turning the transmitter on and off, the energy consumed for receiving feedback from the sink, the energy leakage from the battery, and battery relaxation effects [33], is negligible.

In time slot k, the sensor node needs to choose e_k and s_k, for all k ∈ K, such that the expected total amount of data transmitted is maximized. To achieve this goal, the sensor has to maintain a good tradeoff between the energy allocations e_k and s_k. Given a fixed energy budget in a time slot, if e_k is too small, then the amount of data transmitted in time slot k will be small. However, if e_k is too large, then s_k will be small, so that an insufficient amount of sensed data is stored in the buffer for transmission in the next time slot, which reduces the amount of data transmitted in future time slots. In addition, the total energy budget e_k + s_k in time slot k should also be carefully controlled. If the energy management policy is overly aggressive, such that the rate of energy consumption is greater than the rate of energy harvesting, the sensor node may stop functioning because of an energy outage. On the other hand, an overly conservative energy management policy would limit the amount of data transmitted in each time slot. Thus, it is a challenging problem to decide the values of e_k and s_k optimally in each time slot k ∈ K.

[Figure 2.2: Timing diagram of a Markov decision process (MDP).]

2.2 Problem Formulation

In this section, we formulate the problem of finding the optimal energy allocation for sensing and transmission as an MDP [27] [28], which consists of five elements: decision epochs, states, actions, state transition probabilities, and rewards. Referring to Figure 2.2, the decision epochs are k ∈ K = {0, 1, . . . , K − 1}. The state of the system is denoted as y = (b, q, h, α), which includes the battery energy state b and data buffer state q for the current time slot, as well as the energy harvesting state h and channel state α in the previous time slot. We denote the state space as Y = B × Q × H × A, where B is the set of battery energy states, Q is the set of data buffer states, H is the set of energy harvesting states, and A is the set of channel states. Let y_k denote the state of the system at time slot k, i.e., y_k = (b_k, q_k, h_{k−1}, α_{k−1}).

First, for the battery energy state in time slot k, the sensor node harvests h_k units of energy from the environment. On the other hand, it consumes e_k units of energy for data transmission and s_k units of energy for sensing.
Since the battery has a finite capacity b_max, the energy stored in the battery is updated as

$$b_{k+1} = \min\{b_k - (e_k + s_k) + h_k,\; b_{\max}\}, \quad \forall\, k \in \mathcal{K}, \qquad (2.2)$$

such that the battery energy state transition probability is given by

$$P(b_{k+1} \,|\, b_k, h_k, e_k, s_k) = \begin{cases} 1, & \text{if (2.2) is satisfied,} \\ 0, & \text{otherwise.} \end{cases} \qquad (2.3)$$

Equation (2.2) ensures that the maximum stored energy b_max is not exceeded. We assume that the initial energy b_0 is known and satisfies the constraint 0 ≤ b_0 ≤ b_max. Moreover, the amount of energy consumed for sensing and transmission must be no more than the battery level:

$$e_k + s_k \le b_k, \quad \forall\, k \in \mathcal{K}. \qquad (2.4)$$

Second, for the data buffer state in time slot k, x(s_k) amount of sensed data is generated and queued up in the data buffer if s_k units of energy are allocated for sensing. On the other hand, if the amount of data available in the data buffer for transmission at time slot k is q_k, and e_k units of energy are used for transmission, then the amount of data transmitted and removed from the data buffer at time slot k is given by min{µ(e_k, α_k), q_k}. Since the data buffer is finite with capacity q_max, the amount of data in the buffer is updated as

$$q_{k+1} = \min\{[q_k - \mu(e_k, \alpha_k)]^{+} + x(s_k),\; q_{\max}\}, \quad \forall\, k \in \mathcal{K}, \qquad (2.5)$$

where $[z]^{+} = \max\{z, 0\}$. The data buffer state transition probability is then given by

$$P(q_{k+1} \,|\, q_k, \alpha_k, e_k, s_k) = \begin{cases} 1, & \text{if (2.5) is satisfied,} \\ 0, & \text{otherwise.} \end{cases} \qquad (2.6)$$

We assume that the initial amount of data in the data buffer q_0 is known and satisfies 0 ≤ q_0 ≤ q_max. Equation (2.5) implies that if the sensor allocates too much energy for transmission, such that µ(e_k, α_k) > q_k, then energy will be wasted. On the other hand, if the sensor allocates too much energy for sensing, so that x(s_k) > q_max, then the data buffer will overflow and energy will be wasted too. Thus, the sensor should make a proper energy allocation decision in each time slot.
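A compact sketch of the one-slot state updates (2.2) and (2.5) may help fix ideas. The function name and the convention of passing µ and x in as callables are assumptions made for this illustration, not an interface from the thesis.

```python
def update_states(b_k, q_k, e_k, s_k, h_k, alpha_k, b_max, q_max, mu, x):
    """One-slot battery and buffer updates per (2.2) and (2.5).

    mu(e, alpha) is the rate function (2.1) and x(s) the sensing function;
    both are supplied by the caller in this sketch.
    """
    assert e_k + s_k <= b_k, "infeasible action: violates (2.4)"
    # Battery: spend e_k + s_k, add the slot's harvest, clip at capacity -- (2.2)
    b_next = min(b_k - (e_k + s_k) + h_k, b_max)
    # Buffer: remove transmitted data, add sensed data, clip at buffer size -- (2.5)
    q_next = min(max(q_k - mu(e_k, alpha_k), 0.0) + x(s_k), q_max)
    return b_next, q_next
```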
The constraint ek + sk \u00E2\u0089\u00A4 bk, \u00E2\u0088\u0080 k \u00E2\u0088\u0088 K ensures that the amount of energy consumed for sensing and transmission must be no more than the battery level. In addition, it is possible to impose additional constraints on ak. For example, a constraint on the minimum amount of energy for sensing or transmission to ensure a minimum amount of sensed data or transmitted data for each time slot, respectively. Also, a maximum transmission power constraint can be imposed. The state transition probability P (yk+1 |yk,ak) is the probability that the system will go into state yk+1 if action ak is taken at state yk at time slot k. Due to the 16 Chapter 2. Energy Allocation Algorithm Based on Finite-horizon MDP independence between (bk+1, hk) and (qk+1, \u00CE\u00B1k) for all k \u00E2\u0088\u0088 K, we can simplify the state transition probability as P (yk+1 |yk,ak)=P (bk+1, qk+1, hk, \u00CE\u00B1k | bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921, ek, sk) =P (bk+1, hk | bk, hk\u00E2\u0088\u00921, ek, sk)P (qk+1, \u00CE\u00B1k | qk, \u00CE\u00B1k\u00E2\u0088\u00921, ek, sk) (2.8) =P (bk+1 | bk, hk, ek, sk)P (hk | hk\u00E2\u0088\u00921)P (qk+1 | qk, \u00CE\u00B1k, ek, sk)P (\u00CE\u00B1k |\u00CE\u00B1k\u00E2\u0088\u00921), where P (bk+1 | bk, hk, ek, sk) and P (qk+1 | qk, \u00CE\u00B1k, ek, sk) are defined in (2.3) and (2.6), respectively. Given the current state yk and the action ak, E\u00CE\u00B1k [\u00C2\u00B5(ek, \u00CE\u00B1k) |\u00CE\u00B1k\u00E2\u0088\u00921] is the expected amount of data that can be transmitted when ek units of energy are used for trans- mission. However, since the data available in the data buffer for transmission at time slot k are qk, the expected amount of data transmitted at time slot k is given by E\u00CE\u00B1k [min{\u00C2\u00B5(ek, \u00CE\u00B1k), qk} |\u00CE\u00B1k\u00E2\u0088\u00921]. We define the reward at time slot k, r(yk,ak) to be the expected amount of data transmitted at time slot k. That is, r(yk,ak) = E\u00CE\u00B1k [min{\u00C2\u00B5(ek, \u00CE\u00B1k), qk} |\u00CE\u00B1k\u00E2\u0088\u00921]. (2.9) A decision rule prescribes a procedure for action selection in each state at a specified time slot. We denote the deterministic Markovian decision rule at time slot k as \u00CE\u00B4k, i.e., ak = \u00CE\u00B4k(yk), which specifies the action choice ak when the system occupies state yk at time slot k. A policy pi = (\u00CE\u00B40, \u00CE\u00B41, . . . , \u00CE\u00B4K\u00E2\u0088\u00921) is a sequence of decision rules to be used at all the time slots. A feasible policy should satisfy (2.7) at all the time slots. Let \u00CE\u00A0 be the feasible set of pi. The sensor node aims to find an optimal and feasible sensing and transmit energy allocation policy pi\u00E2\u0088\u0097 that maximizes the expected total reward for all the K time slots. That is, for any given initial state 17 Chapter 2. 
Energy Allocation Algorithm Based on Finite-horizon MDP y0 = (b0, q0, h\u00E2\u0088\u00921, \u00CE\u00B1\u00E2\u0088\u00921) at the first time slot, the optimal expected total reward is given by T \u00E2\u0088\u0097 = max pi\u00E2\u0088\u0088\u00CE\u00A0 K\u00E2\u0088\u00921\u00E2\u0088\u0091 k=0 E { r(yk,ak) \u00E2\u0088\u00A3\u00E2\u0088\u00A3\u00E2\u0088\u00A3y0, pi} = max pi\u00E2\u0088\u0088\u00CE\u00A0 K\u00E2\u0088\u00921\u00E2\u0088\u0091 k=0 E { min{qk, \u00C2\u00B5(ek, \u00CE\u00B1k)} \u00E2\u0088\u00A3\u00E2\u0088\u00A3\u00E2\u0088\u00A3y0, pi}, (2.10) where E{\u00C2\u00B7} denotes the statistical expectation taken over all relevant random vari- ables given initial state y0 and policy pi. It should be noted that with a different policy pi and initial state y0, a different action will be chosen in each time slot in general, which results in a different state transition probability when the expectation E{\u00C2\u00B7} is computed. 2.3 Finite-Horizon MDP In this section, we solve problem (2.10) by using finite-horizon MDP. An OEA algo- rithm is proposed that can transmit the maximal total amount of data in problem (2.10). Let Vk(bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921) be the maximum expected amount of data transmitted from time slot k to K \u00E2\u0088\u0092 1, given that the system is in state (bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921) imme- diately before the decision at time slot k. The Bellman\u00E2\u0080\u0099s equations are given by the 18 Chapter 2. Energy Allocation Algorithm Based on Finite-horizon MDP following recursive equations starting from k = K \u00E2\u0088\u0092 1 to k = 0. For k = K \u00E2\u0088\u0092 1, we have VK\u00E2\u0088\u00921(bK\u00E2\u0088\u00921, qK\u00E2\u0088\u00921, hK\u00E2\u0088\u00922, \u00CE\u00B1K\u00E2\u0088\u00922) = max aK\u00E2\u0088\u00921\u00E2\u0088\u0088U(yK\u00E2\u0088\u00921) E\u00CE\u00B1K\u00E2\u0088\u00921 { min{\u00C2\u00B5(eK\u00E2\u0088\u00921, \u00CE\u00B1K\u00E2\u0088\u00921), qK\u00E2\u0088\u00921} |\u00CE\u00B1K\u00E2\u0088\u00922 } . (2.11a) For k = K \u00E2\u0088\u0092 2, . . . , 0, we have Vk(bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921) = max ak \u00E2\u0088\u0088U(yk) { E\u00CE\u00B1k { min{\u00C2\u00B5(ek, \u00CE\u00B1k), qk} |\u00CE\u00B1k\u00E2\u0088\u00921 } + Ehk,\u00CE\u00B1k { Vk+1(bk+1, qk+1, hk, \u00CE\u00B1k) | hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921 }} , (2.11b) where bk+1 and qk+1 are updated as in (2.2) and (2.5), respectively. Notice that if the feasible set of ak is U(yk) as defined in (2.7), then (2.11a) can be simplified as VK\u00E2\u0088\u00921(bK\u00E2\u0088\u00921, qK\u00E2\u0088\u00921, hK\u00E2\u0088\u00922, \u00CE\u00B1K\u00E2\u0088\u00922) =E\u00CE\u00B1K\u00E2\u0088\u00921 { min{\u00C2\u00B5(bK\u00E2\u0088\u00921, \u00CE\u00B1K\u00E2\u0088\u00921), qK\u00E2\u0088\u00921} |\u00CE\u00B1K\u00E2\u0088\u00922 } . (2.12) That is, we use all the available energy for transmission in the final time slot. Thus the optimal energy allocation for the final time slot K\u00E2\u0088\u00921 is (e\u00E2\u0088\u0097K\u00E2\u0088\u00921, s \u00E2\u0088\u0097 K\u00E2\u0088\u00921) = (bK\u00E2\u0088\u00921, 0). For (2.11b), the first and second terms on the right hand side represent, respectively, the expected immediate reward for time slot k and the expected total future rewards for time slot k + 1 to K \u00E2\u0088\u0092 1 if action ak is chosen. 
Hence, the equation in (2.11b) describes the tradeoff between the current reward and the future rewards. Theorem 2.1. The optimal policy of problem (2.10) is pi\u00E2\u0088\u0097 = {a\u00E2\u0088\u0097k(yk), \u00E2\u0088\u0080yk, k \u00E2\u0088\u0088 K}, where a\u00E2\u0088\u0097k(yk) = arg max ak \u00E2\u0088\u0088U(yk) { E\u00CE\u00B1k { min{\u00C2\u00B5(ek, \u00CE\u00B1k), qk} |\u00CE\u00B1k\u00E2\u0088\u00921 } +Ehk,\u00CE\u00B1k { Vk+1(bk+1, qk+1, hk, \u00CE\u00B1k) | hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921 }} . (2.13) 19 Chapter 2. Energy Allocation Algorithm Based on Finite-horizon MDP Moreover, for every initial state y0 = (b0, q0, h\u00E2\u0088\u00921, \u00CE\u00B1\u00E2\u0088\u00921), the maximum amount of transmitted data T \u00E2\u0088\u0097 is given by V0(b0, q0, h\u00E2\u0088\u00921, \u00CE\u00B1\u00E2\u0088\u00921). Proof. The proof follows by applying the Bellman\u00E2\u0080\u0099s equations and backward induction [27] and using (2.2) and (2.5). Algorithm 1 Optimal Energy Allocation (OEA) Algorithm for Energy Harvesting Sensor Node. 1: Planning Phase: 2: Set VK\u00E2\u0088\u00921(bK\u00E2\u0088\u00921, qK\u00E2\u0088\u00921, hK\u00E2\u0088\u00922, \u00CE\u00B1K\u00E2\u0088\u00922), \u00E2\u0088\u0080 bK\u00E2\u0088\u00921, \u00E2\u0088\u0080 qK\u00E2\u0088\u00921, \u00E2\u0088\u0080hK\u00E2\u0088\u00922, \u00E2\u0088\u0080\u00CE\u00B1K\u00E2\u0088\u00922, using (2.11a). 3: Set k := K \u00E2\u0088\u0092 2. 4: while k \u00E2\u0089\u00A5 0 do 5: Calculate Vk(bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921), \u00E2\u0088\u0080 bk, \u00E2\u0088\u0080 qk, \u00E2\u0088\u0080hk\u00E2\u0088\u00921, \u00E2\u0088\u0080\u00CE\u00B1k\u00E2\u0088\u00921, using (2.11b). 6: Find the optimal action a\u00E2\u0088\u0097k(yk) := (e \u00E2\u0088\u0097 k(yk), s \u00E2\u0088\u0097 k(yk)), using (2.13). 7: Set k := k \u00E2\u0088\u0092 1. 8: end while 9: Sensing and Transmission Phase: 10: Set k := 0. 11: while k \u00E2\u0089\u00A4 K \u00E2\u0088\u0092 1 do 12: Track the energy harvesting rate of the previous time slot hk\u00E2\u0088\u00921. 13: Track the energy available for use in the battery bk. 14: Track the amount of data in the buffer qk. 15: Obtain the channel feedback \u00CE\u00B1k\u00E2\u0088\u00921 from the sink. 16: Set yk := (bk, qk, hk\u00E2\u0088\u00921, \u00CE\u00B1k\u00E2\u0088\u00921). 17: Obtain a\u00E2\u0088\u0097k(yk) := (e \u00E2\u0088\u0097 k(yk), s \u00E2\u0088\u0097 k(yk)) based on optimal policy pi \u00E2\u0088\u0097. 18: Consume e\u00E2\u0088\u0097k(yk) amount of energy for transmission and s \u00E2\u0088\u0097 k(yk) amount of energy for sensing. 19: Update battery energy bk+1 by using (2.2) and the amount of data in the buffer qk+1 by using (2.5). 20: Set k := k + 1. 21: end while We then propose our OEA algorithm in Algorithm 1. In the planning phase, the sensor solves for the optimal policy pi\u00E2\u0088\u0097 and records it as a look-up table. In the sensing and transmission phase, the sensor first tracks the energy harvesting rate of the previous time slot hk\u00E2\u0088\u00921, the battery energy level bk, the amount of data in the buffer qk, and obtains the channel feedback \u00CE\u00B1k\u00E2\u0088\u00921 from the sink. Then, the sensor chooses the action a\u00E2\u0088\u0097k = (e \u00E2\u0088\u0097 k, s \u00E2\u0088\u0097 k) based on current system state yk and the optimal 20 Chapter 2. 
[Figure 2.3: A three-state Markov chain for the channel gain, where "B", "N", and "G" represent the channel in the bad, normal, and good states, respectively.]

2.4 Performance Evaluation

In this section, we simulate in Matlab the performance of our OEA and CAEA algorithms in terms of the total amount of data transmitted, where the CAEA algorithm is extended from the algorithm proposed in [1]. We consider a band-limited AWGN channel, where the channel bandwidth is W = 100 kHz and the noise power spectral density is N_0 = 10^{-18} W/Hz. The channel state can be "G = Good", "N = Normal", or "B = Bad". It evolves according to the three-state Markov chain shown in Figure 2.3 [35], with the transition matrix

$$P_\alpha = \begin{bmatrix} P_{BB} & P_{BN} & P_{BG} \\ P_{NB} & P_{NN} & P_{NG} \\ P_{GB} & P_{GN} & P_{GG} \end{bmatrix} = \begin{bmatrix} 0.3 & 0.7 & 0 \\ 0.25 & 0.5 & 0.25 \\ 0 & 0.7 & 0.3 \end{bmatrix}, \qquad (2.14)$$

where P_{XZ} represents the probability of the channel state going from state X to state Z, X, Z ∈ {B, N, G}. The channel gain α is 0.5 × 10^{-13}, 1 × 10^{-13}, and 1.5 × 10^{-13} when the channel state is "Bad", "Normal", and "Good", respectively. The battery capacity b_max is set to 100 Joules, and the data buffer size q_max is set to 1 Mbits.
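For reference, a short snippet showing how the chain (2.14) can be simulated and its long-run behaviour computed; this is an illustrative sketch, not the thesis simulation code (which was written in Matlab).

```python
import numpy as np

rng = np.random.default_rng(0)

# Transition matrix (2.14) over the states (B, N, G) and their gains
P_alpha = np.array([[0.30, 0.70, 0.00],
                    [0.25, 0.50, 0.25],
                    [0.00, 0.70, 0.30]])
gains = np.array([0.5e-13, 1.0e-13, 1.5e-13])   # bad, normal, good

def simulate_channel(num_slots, start=1):
    """Sample a gain trajectory alpha_0, ..., alpha_{K-1} from the chain (2.14)."""
    states = [start]
    for _ in range(num_slots - 1):
        states.append(rng.choice(3, p=P_alpha[states[-1]]))
    return gains[states]

# Stationary distribution: left eigenvector of P_alpha for eigenvalue 1
w, v = np.linalg.eig(P_alpha.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()
print(pi)   # ~[0.208, 0.583, 0.208]: long-run fraction of bad/normal/good slots
```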
For tractability, we assume that the energy harvesting state h_k takes values from the finite set H = {H_1, H_2, H_3, H_4} and evolves according to the four-state Markov chain with the state transition probabilities given by

$$P_h = \begin{bmatrix} P_{H_1H_1} & P_{H_1H_2} & P_{H_1H_3} & P_{H_1H_4} \\ P_{H_2H_1} & P_{H_2H_2} & P_{H_2H_3} & P_{H_2H_4} \\ P_{H_3H_1} & P_{H_3H_2} & P_{H_3H_3} & P_{H_3H_4} \\ P_{H_4H_1} & P_{H_4H_2} & P_{H_4H_3} & P_{H_4H_4} \end{bmatrix} = \begin{bmatrix} 0.3 & 0.7 & 0 & 0 \\ 0.25 & 0.5 & 0.25 & 0 \\ 0 & 0.25 & 0.5 & 0.25 \\ 0 & 0 & 0.7 & 0.3 \end{bmatrix}, \qquad (2.15)$$

where P_{H_iH_j} represents the probability of the energy harvesting state going from state H_i to state H_j, ∀ i, j ∈ {1, 2, 3, 4}. The steady-state probabilities are then given by [P_{H_1} P_{H_2} P_{H_3} P_{H_4}] = [0.13 0.37 0.37 0.13]. x(s_k) is assumed to be a linear function of s_k [30], given by

$$x(s_k) = \gamma s_k, \qquad (2.16)$$

where γ is the data-sensing efficiency parameter (i.e., the amount of data that the sensor can sense per unit energy). Unless specified otherwise, we assume that γ is equal to 0.02 Mbits/J.

The CAEA algorithm in [1] assumed an infinitely backlogged data queue and neglected the sensing energy. For a fair comparison, we modify the CAEA algorithm by allowing the data buffer to be finite with size q_max. We assume that the sensor allocates a fixed percentage of the energy available in the battery for sensing in each time slot, and optimizes the energy allocated for transmission to achieve the maximal amount of total transmitted data.

[Figure 2.4: The total amount of data transmitted of the two algorithms for different numbers of total time slots K.]

We start by examining the total amount of transmitted data of the OEA algorithm and the CAEA algorithm for different numbers of total time slots K. We set the fixed percentage of energy for sensing in the CAEA algorithm to 10%, which is reasonable in WSNs. The energy harvesting rate takes values from the set H = {H_1, H_2, H_3, H_4} = {6, 12, 18, 24} J/time slot. As shown in Figure 2.4, our proposed OEA algorithm outperforms the CAEA algorithm in terms of the amount of transmitted data. The reason is that in the CAEA algorithm, the sensor only optimally controls the energy for transmission, while the sensing energy is fixed. In our OEA algorithm, however, both the sensing and transmission energy are optimally allocated, which results in a better performance than the CAEA algorithm.
Next, we consider the performance of the two algorithms under different average energy harvesting rates H̄, where $\bar{H} = \sum_{i=1}^{4} H_i P_{H_i}$.

[Figure 2.5: The total amount of data transmitted of the two algorithms for different average energy harvesting rates when K = 30.]

In Figure 2.5, we plot the total amount of transmitted data against the average energy harvesting rate when the total number of time slots is K = 30. We observe that our OEA algorithm performs much better than the CAEA algorithm, especially when the average energy harvesting rate H̄ is high. As shown in Figure 2.5, our OEA algorithm transmits 110% more data than the CAEA algorithm when the average energy harvesting rate is H̄ = 35 J/time slot. Moreover, the performance of the CAEA algorithm saturates very quickly as the average harvesting rate is increased. This is because the harvested energy cannot be accommodated, and more and more energy is lost due to overflow of the battery. In our algorithm, however, energy wastage will not occur as long as the harvesting rate is less than b_max and the data buffer is large enough. The reason is that under the OEA algorithm, the sensor node maintains a good balance between the energy allocated for sensing and transmission, and thus achieves a better performance.

Finally, we study the impact of the data-sensing efficiency on the amount of transmitted data. We fix K to be 30 and examine the amount of total transmitted data under different values of γ. A larger value of γ corresponds to a higher data-sensing efficiency, since the sensor node spends less energy to sense the same amount of data. As shown in Figure 2.6, when γ is increased, the amount of transmitted data increases as well, because more energy is available for data transmission. However, the performance saturates as γ is increased beyond a certain value. When γ approaches infinity, it corresponds to the case where sensing is extremely efficient. The throughput of this case provides an upper bound on the performance of the OEA algorithm for sensor nodes with different sensing efficiencies.

[Figure 2.6: The total amount of data transmitted of the two algorithms for different values of data-sensing efficiency parameter γ.]

Chapter 3

Energy Allocation Algorithms Based on Infinite-horizon MDP

In Chapter 2, we considered the case that the lifetime of the sensor node is fixed. The joint energy allocation problem was formulated as a finite-horizon MDP and solved by using backward induction. In this chapter, we consider a different setting, where the lifetime of the sensor node is a random variable, and formulate the energy allocation problem as an infinite-horizon discounted MDP. We solve the infinite-horizon MDP by using the value iteration algorithm.

3.1 Problem Formulation

The system model in this chapter is the same as the model in Chapter 2, as shown in Figure 2.1.
The sensor node can function for K time slots and will stop functioning after that time. We take into account the randomness of the sensor node lifetime K and assume that K is geometrically distributed with mean 1/(1 − ν), where ν ∈ [0, 1). We apply the same notation as in Chapter 2. The sensor aims to maximize the total amount of data transmitted from the first time slot to the time that the sensor stops functioning.

This setting of the sensor lifetime K and the objective of maximizing the total amount of transmitted data are reasonable in many applications. For example, on the battlefield, a camera with a wireless transceiver deployed in enemy territory to monitor the enemy's actions can be detected at any time. Thus, its objective is to sense and transmit as much information as possible before it is detected and destroyed by the enemy. Another application is forest fire monitoring: in Europe, sensor nodes are nowadays placed in forests to monitor fires, and these sensors aim to gather and transmit as much information as possible before they are damaged by the fire.

Considering that the lifetime of the sensor node is a random variable, for any given state y_0 = (b_0, q_0, h_{−1}, α_{−1}) at the first time slot, the expected total reward from the first time slot until the sensor stops functioning under policy π ∈ Π is given by

$$J_\pi(y_0) = E\left\{ E_K\left\{ \sum_{k=0}^{K-1} r(y_k, a_k) \right\} \,\middle|\, y_0, \pi \right\}, \qquad (3.1)$$

where E{·} denotes the statistical expectation taken over all relevant random variables given the initial state y_0 and policy π, and E_K{·} denotes the expectation with respect to the random variable K, the lifetime of the sensor node.

Lemma 3.1. Based on the geometric distribution of the lifetime K of the sensor node with mean 1/(1 − ν), equation (3.1) is equivalent to the objective function of an infinite-horizon MDP with discounted reward, given by

$$J_\pi(y_0) = E\left\{ \sum_{k=0}^{\infty} \nu^k r(y_k, a_k) \,\middle|\, y_0, \pi \right\}. \qquad (3.2)$$

Proof. Since K is distributed as $P(K = m) = \nu^{m-1}(1 - \nu)$, m = 1, 2, 3, . . . , equation (3.1) can be written as

$$\begin{aligned} J_\pi(y_0) &= E\left\{ \sum_{K=1}^{\infty} \sum_{k=0}^{K-1} r(y_k, a_k)\, \nu^{K-1}(1 - \nu) \,\middle|\, y_0, \pi \right\} \\ &= E\left\{ \sum_{k=0}^{\infty} \sum_{K=k+1}^{\infty} r(y_k, a_k)\, \nu^{K-1}(1 - \nu) \,\middle|\, y_0, \pi \right\} \\ &= E\left\{ \sum_{k=0}^{\infty} \nu^k r(y_k, a_k) \,\middle|\, y_0, \pi \right\}, \end{aligned}$$

where the last equality follows from $\sum_{K=k+1}^{\infty} \nu^{K-1}(1-\nu) = \nu^k$.
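Lemma 3.1 is easy to sanity-check numerically. The following Monte Carlo sketch uses an arbitrary bounded reward sequence as a stand-in for r(y_k, a_k); all parameter values are illustrative assumptions, not from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
nu = 0.9                         # discount factor; E[K] = 1/(1 - nu) = 10 slots
r = rng.random(200)              # arbitrary bounded per-slot rewards r_k

# Left side of Lemma 3.1: E_K[sum_{k<K} r_k] with P(K = m) = nu^{m-1}(1 - nu)
K_samples = rng.geometric(1 - nu, size=200_000)
lhs = np.mean([r[:min(K, r.size)].sum() for K in K_samples])

# Right side, eq. (3.2): sum_k nu^k r_k (truncating at 200 terms is negligible)
rhs = np.sum(nu ** np.arange(r.size) * r)
print(lhs, rhs)                  # the two estimates agree up to Monte Carlo error
```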
Lemma 3.2. $J^{\pi}(y_0)$ defined in (3.2) is finite. That is, $|J^{\pi}(y_0)| < \infty$.

Proof. Since
\[
\sup_{a \in U(y),\, y \in Y} |r(y, a)| = \max_{\alpha \in A} \left\{ E_{\alpha'}\left[ \min\{\mu(b_{\max}, \alpha'), q_{\max}\} \,|\, \alpha \right] \right\} < \infty, \tag{3.3}
\]
the objective function $J^{\pi}(y_0)$ of the infinite-horizon MDP converges to a finite value [28, pp. 121].

The sensor node aims to find an optimal sensing and transmission energy allocation policy that maximizes the expected total discounted reward given in (3.2). That is, given the initial state $y_0$, the sensor aims to obtain the optimal expected total discounted reward $J(y_0)$ and the optimal policy $\pi^*$ defined as
\[
J(y_0) = \max_{\pi \in \Pi} J^{\pi}(y_0) \quad \text{and} \quad \pi^* = \arg\max_{\pi \in \Pi} J^{\pi}(y_0). \tag{3.4}
\]

A policy is said to be stationary if $\delta_k = \delta$ for all $k \in \mathcal{K}$, so that $\pi = (\delta, \delta, \ldots)$. For the rest of this chapter, a general policy is denoted by $\pi$, while a stationary policy is denoted by $\delta$. For an infinite-horizon MDP, the only case of interest is when a stationary optimal policy exists. Thus, our objective is to find an optimal stationary deterministic policy $\delta^*$ that maximizes the expected total discounted reward in (3.2).

3.2 Energy Allocation Algorithms

In this section, we obtain the optimal stationary policies for energy allocation. First, we consider a general case that takes into account a finite data buffer and the energy allocated for sensing. Next, we study a special case where we assume that there is an infinite data backlog.

3.2.1 General Case

In this subsection, we obtain the optimal stationary policy for the general case. An OSEA algorithm that achieves the maximum expected total discounted reward in (3.4) is proposed based on the value iteration algorithm [28, pp. 161]. The optimal expected total discounted reward $J(y)$ given the current state $y$ satisfies Bellman's equation of optimality [28]:
\[
J(y) = \max_{a \in U(y)} \left\{ r(y, a) + \nu \sum_{y' \in Y} P(y' \,|\, y, a)\, J(y') \right\}. \tag{3.5}
\]
In equation (3.5), the first and second terms on the right hand side represent, respectively, the immediate reward at the current time slot and the expected total discounted future reward if action $a$ is chosen. Hence, equation (3.5) describes the tradeoff between the current reward and the expected future reward.

Theorem 3.1. There exists an optimal stationary deterministic policy $\delta^*$ that maximizes the right hand side of (3.5), given by
\[
\delta^*(y) = \arg\max_{a \in U(y)} \left\{ r(y, a) + \nu \sum_{y' \in Y} P(y' \,|\, y, a)\, J(y') \right\}. \tag{3.6}
\]

Proof.
Notice that the system state space $Y$ is countable and discrete, and $U(y)$ is finite for each $y \in Y$. From [28, Theorem 6.2.10], an optimal stationary deterministic policy exists.

We then propose the OSEA algorithm in Algorithm 2. In the planning phase, the sensor solves for the optimal stationary policy $\delta^*$ based on the value iteration algorithm, and records it as a look-up table. Specifically, in line 2, we initialize $J_0(y)$ for all $y \in Y$ arbitrarily, specify the error bound $\varepsilon$, and set the iteration index $n$ to 0. In line 3, we compute $J_{n+1}(y)$ for each $y \in Y$ based on the knowledge of $J_n(y)$. In line 4, we check whether $\|J_{n+1} - J_n\| < \varepsilon(1-\nu)/(2\nu)$ holds, where $J_{n+1} = (J_{n+1}(y), \forall y \in Y)$, $J_n = (J_n(y), \forall y \in Y)$, and the norm is defined as $\|J\| = \max_{y \in Y} |J(y)|$. If the inequality holds, which means that the value iteration algorithm has converged, then we proceed to obtain the optimal stationary policy $\delta^*$ in line 5 and stop. Otherwise, we go back to line 3 and continue to iterate. In the sensing and transmission phase, the sensor node first tracks the energy harvesting rate of the previous time slot $h_{k-1}$, the battery energy level $b_k$, and the amount of data in the buffer $q_k$, and obtains the channel feedback $\alpha_{k-1}$ from the sink, in lines 9 to 12. Then, the sensor node chooses the action $\delta^*(y) = (e^*(y), s^*(y))$ based on the current system state $y$ and the optimal stationary policy $\delta^*$ in line 14. That is, it consumes $e^*(y)$ and $s^*(y)$ amounts of energy for transmission and sensing, respectively.

Algorithm 2 Optimal Stationary Energy Allocation (OSEA) Algorithm for Energy Harvesting Sensor Node.
1: Planning Phase:
2: Arbitrarily select $J_0(y)$ for each $y \in Y$, specify $\varepsilon > 0$, and set $n := 0$.
3: For each $y \in Y$, compute $J_{n+1}(y)$ by
\[
J_{n+1}(y) := \max_{a \in U(y)} \left\{ r(y, a) + \nu \sum_{y' \in Y} P(y' \,|\, y, a)\, J_n(y') \right\}. \tag{3.7}
\]
4: If $\|J_{n+1} - J_n\| < \varepsilon(1-\nu)/(2\nu)$, go to line 5. Otherwise, increment $n$ by 1 and go to line 3.
5: For each $y \in Y$, choose the stationary $\varepsilon$-optimal policy
\[
\delta^*(y) := \arg\max_{a \in U(y)} \left\{ r(y, a) + \nu \sum_{y' \in Y} P(y' \,|\, y, a)\, J_{n+1}(y') \right\}, \tag{3.8}
\]
and stop.
6: Sensing and Transmission Phase:
7: Set $k := 0$.
8: while $k \le K - 1$ do
9: Track the energy harvesting rate of the previous time slot $h_{k-1}$.
10: Track the energy available for use in the battery $b_k$.
11: Track the amount of data in the buffer $q_k$.
12: Obtain the channel feedback $\alpha_{k-1}$ from the sink.
13: Set $y := (b_k, q_k, h_{k-1}, \alpha_{k-1})$.
14: Obtain the action $\delta^*(y) := (e^*(y), s^*(y))$ based on the optimal policy.
15: Consume $e^*(y)$ amount of energy for transmission and $s^*(y)$ amount of energy for sensing.
16: Update the battery energy $b_{k+1}$ using (2.2) and the amount of data in the buffer $q_{k+1}$ using (2.5).
17: Set $k := k + 1$.
18: end while
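To make the planning phase concrete, the following is a minimal sketch of lines 2-5 of Algorithm 2 on a generic finite MDP, including the line-4 stopping rule. The state space, action space, transition tensor, and rewards below are random stand-ins for $Y$, $U(y)$, $P(y'|y,a)$, and $r(y,a)$, not the actual sensor model:

```python
import numpy as np

# Sketch of the OSEA planning phase on a generic finite MDP (illustrative
# stand-ins; in the sensor model, infeasible actions would be excluded per state).
rng = np.random.default_rng(1)
S, A = 50, 5                                  # assumed numbers of states and actions
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a, s'] = transition probability
r = rng.uniform(0.0, 1.0, size=(S, A))        # r[s, a] = one-slot reward
nu, eps = 0.95, 1e-3

J = np.zeros(S)                               # arbitrary initialization J_0 (line 2)
while True:
    Q = r + nu * P @ J                        # Q[s, a] = r(s,a) + nu * sum_s' P(s'|s,a) J(s')
    J_next = Q.max(axis=1)                    # Bellman update, equation (3.7)
    if np.max(np.abs(J_next - J)) < eps * (1 - nu) / (2 * nu):   # line-4 stopping rule
        J = J_next
        break
    J = J_next
policy = (r + nu * P @ J).argmax(axis=1)      # epsilon-optimal stationary policy, (3.8)
```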
Regarding convergence, the sequence $J_n(y)$ generated in line 3 converges in norm to $J(y)$ for all $y \in Y$. The stationary policy $\delta^*$ defined in line 5 is $\varepsilon$-optimal: whenever the convergence criterion $\|J_{n+1} - J_n\| < \varepsilon(1-\nu)/(2\nu)$ is satisfied, $\|J_{n+1} - J\| < \varepsilon/2$ holds, where $J = (J(y), \forall y \in Y)$ is the vector of optimal expected total discounted rewards defined in (3.5). Moreover, the convergence is linear at rate $\nu$. In practice, choosing $\varepsilon$ small enough ensures that we obtain a policy that is very close to optimal.

Lemma 3.3. (a) $J(b, q, h, \alpha)$ is increasing in the battery state $b$ for any given data buffer state $q$, energy harvesting state $h$, and channel state $\alpha$. (b) $J(b, q, h, \alpha)$ is increasing in $q$ for any given $b$, $h$, and $\alpha$.

Proof. We prove it by mathematical induction. In order to show that $J(b, q, h, \alpha)$ is increasing in $b$ and $q$, we aim to prove that $J_n(b, q, h, \alpha)$ generated by equation (3.7) in Algorithm 2 is increasing in $b$ and $q$ for all $n$. Since for any initialization $J_0(b, q, h, \alpha)$, $J_n(b, q, h, \alpha)$ converges to the same optimal expected total discounted reward $J(b, q, h, \alpha)$ [28], we can select a $J_0(b, q, h, \alpha)$ that is increasing in $b$ and $q$. Assume $J_n(b, q, h, \alpha)$ is increasing in $b$ and $q$. We expand (3.7) as
\[
J_{n+1}(b, q, h, \alpha) = \max_{a \in U(y)} \Big\{ E_{\alpha'}\big[\min\{\mu(e, \alpha'), q\} \,|\, \alpha\big] + \nu\, E_{h',\alpha'}\Big[ J_n\big( \min\{b - (e + s) + h', b_{\max}\},\, \min\{[q - \mu(e, \alpha')]^+ + x(s),\, q_{\max}\},\, h', \alpha' \big) \,\Big|\, h, \alpha \Big] \Big\}. \tag{3.9}
\]
Note that the first term on the right hand side of equation (3.9) is independent of $b$ and increasing in $q$, and the second term is increasing in $b$ and $q$ based on the assumption that $J_n(b, q, h, \alpha)$ is increasing in $b$ and $q$. Therefore, $J_{n+1}(b, q, h, \alpha)$ is increasing in $b$ and $q$. By induction, $J_n(b, q, h, \alpha)$ is increasing in $b$ and $q$ for all $n$. Thus, $J(b, q, h, \alpha) = J_\infty(b, q, h, \alpha)$ is increasing in $b$ for any given $q$, $h$, and $\alpha$, and increasing in $q$ for any given $b$, $h$, and $\alpha$.

This property is intuitive. If more energy is available in the battery (i.e., a larger $b$), we can allocate more energy for sensing and transmission, so that the total reward $J$ increases.
Similarly, if more data are available in the data buffer for transmission (i.e., a larger $q$), we can allocate less energy for sensing and more energy for transmission, which results in a larger total reward $J$.

3.2.2 Special Case: Infinite Data Backlog

In this subsection, we consider a special case where the sensor has an infinite data backlog. As a result, we do not need to consider the sensing energy $s$ or the data buffer state $q$, so the system state reduces to three elements: the battery energy $b$ for the current time slot, and the energy harvesting rate $h$ and channel state $\alpha$ for the previous time slot. Based on the current system state, the sensor chooses $e$ units of energy for transmission. We denote the optimal expected total discounted reward by $\hat{J}(b, h, \alpha)$, which satisfies the following Bellman's equation of optimality:
\[
\hat{J}(b, h, \alpha) = \max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}(b - e, h, \alpha) \right\}, \tag{3.10}
\]
where
\[
\bar{J}(\hat{b}, h, \alpha) = E_{h',\alpha'}\big[ \hat{J}(\min\{\hat{b} + h', b_{\max}\}, h', \alpha') \,\big|\, h, \alpha \big]. \tag{3.11}
\]
The first term on the right hand side of equation (3.10) represents the immediate reward for allocating $e$ units of energy for transmission, and the second term represents the total future discounted reward. Equation (3.10) can be solved via the value iteration algorithm as in Section 3.2.1. However, we can prove some properties of $\hat{J}(b, h, \alpha)$ and $\bar{J}(\hat{b}, h, \alpha)$ in Lemmas 3.4 and 3.5, which lead to the monotone policy [28] in Theorem 3.2.

Lemma 3.4. $\hat{J}(b, h, \alpha)$ is increasing in $b$ for any given $h$ and $\alpha$.

Proof. We prove it by mathematical induction. The optimal discounted reward $\hat{J}(b, h, \alpha)$ is obtained by the value iteration algorithm via
\[
\hat{J}_{n+1}(b, h, \alpha) = \max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b - e, h, \alpha) \right\}, \tag{3.12}
\]
where
\[
\bar{J}_n(\hat{b}, h, \alpha) = E_{h',\alpha'}\big[ \hat{J}_n(\min\{\hat{b} + h', b_{\max}\}, h', \alpha') \,\big|\, h, \alpha \big]. \tag{3.13}
\]
In order to show that $\hat{J}(b, h, \alpha)$ is increasing in $b$, we aim to prove that $\hat{J}_n(b, h, \alpha)$ generated by equation (3.12) is increasing in $b$ for all $n$. Since for any initialization $\hat{J}_0(b, h, \alpha)$, $\hat{J}_n(b, h, \alpha)$ converges to the same optimal expected total discounted reward $\hat{J}(b, h, \alpha)$, we can select a $\hat{J}_0(b, h, \alpha)$ that is increasing in $b$. Assume $\hat{J}_n(b, h, \alpha)$ is increasing in $b$, which implies that $\bar{J}_n(\hat{b}, h, \alpha)$ is increasing in $\hat{b}$.
Let $b' > b$. We have
\[
\hat{J}_{n+1}(b', h, \alpha) = \max_{0 \le e \le b'} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b' - e, h, \alpha) \right\}, \tag{3.14a}
\]
\[
\hat{J}_{n+1}(b, h, \alpha) = \max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b - e, h, \alpha) \right\}. \tag{3.14b}
\]
For any chosen action $e$, the first term on the right hand side of (3.14a) is the same as that of (3.14b), and the second term on the right hand side of (3.14a) is greater than or equal to that of (3.14b). Besides, the action set on the right hand side of (3.14a), $\{e \,|\, 0 \le e \le b'\}$, is larger than the action set on the right hand side of (3.14b), $\{e \,|\, 0 \le e \le b\}$. Thus, we have
\[
\hat{J}_{n+1}(b', h, \alpha) \ge \hat{J}_{n+1}(b, h, \alpha). \tag{3.15}
\]
By induction, $\hat{J}(b, h, \alpha) = \hat{J}_\infty(b, h, \alpha)$ is increasing in $b$ for any given $h$ and $\alpha$.

Lemma 3.5. (a) $\hat{J}(b, h, \alpha)$ is concave in $b$ for any given $h$ and $\alpha$. (b) $\bar{J}(\hat{b}, h, \alpha)$ is concave in $\hat{b}$ for any given $h$ and $\alpha$.

Proof. We prove it by mathematical induction. Since for any initialization $\hat{J}_0(b, h, \alpha)$, the sequence $\hat{J}_n(b, h, \alpha)$ generated by equation (3.12) converges to the optimal discounted reward $\hat{J}(b, h, \alpha)$, we can choose a $\hat{J}_0(b, h, \alpha)$ that is concave in $b$ for any given $h$ and $\alpha$. Assume $\hat{J}_n(b, h, \alpha)$ is concave in $b$ for any given $h$ and $\alpha$. We denote the optimal action that achieves $\hat{J}_{n+1}(b_1, h, \alpha)$ by $e_1$, and the optimal action that achieves $\hat{J}_{n+1}(b_2, h, \alpha)$ by $e_2$. Then, we have
\[
\hat{J}_{n+1}(b_1, h, \alpha) = E_{\alpha'}[\mu(e_1, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b_1 - e_1, h, \alpha), \tag{3.16}
\]
\[
\hat{J}_{n+1}(b_2, h, \alpha) = E_{\alpha'}[\mu(e_2, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b_2 - e_2, h, \alpha). \tag{3.17}
\]
Since $\mu(e, \alpha')$ is concave in $e$ for any given $\alpha'$, $E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha]$ is also concave in $e$, because it is a weighted sum of concave functions. We then prove that $\bar{J}_n(\hat{b}, h, \alpha)$ is concave in $\hat{b}$. From the proof of Lemma 3.4, $\hat{J}_n(b', h', \alpha')$ is increasing in $b'$ for given $h'$ and $\alpha'$ for all $n$.
We have already assumed at the beginning of the proof that $\hat{J}_n(b', h', \alpha')$ is concave in $b'$ for given $h'$ and $\alpha'$, and $b' = \min\{\hat{b} + h', b_{\max}\}$ is a concave function of $\hat{b}$ [36]. Thus, by applying the composition results [36, (3.10)], we conclude that $\hat{J}_n(\min\{\hat{b} + h', b_{\max}\}, h', \alpha')$ is concave in $\hat{b}$ for given $h'$ and $\alpha'$, which implies that $\bar{J}_n(\hat{b}, h, \alpha)$ is concave in $\hat{b}$, since it is a weighted sum of concave functions.

Now, combining equations (3.16) and (3.17), and using the concavity of $E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha]$ and $\bar{J}_n(\hat{b}, h, \alpha)$, we have
\[
\lambda \hat{J}_{n+1}(b_1, h, \alpha) + (1 - \lambda) \hat{J}_{n+1}(b_2, h, \alpha) \le E_{\alpha'}[\mu(e_\lambda, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b_\lambda - e_\lambda, h, \alpha), \tag{3.18}
\]
where $e_\lambda = \lambda e_1 + (1 - \lambda) e_2$ and $b_\lambda = \lambda b_1 + (1 - \lambda) b_2$. Since $0 \le e_1 \le b_1$ and $0 \le e_2 \le b_2$, we have $0 \le e_\lambda \le b_\lambda$. By applying the definition of the maximum and of $\hat{J}_{n+1}(b, h, \alpha)$ in equation (3.12), we have
\[
E_{\alpha'}[\mu(e_\lambda, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b_\lambda - e_\lambda, h, \alpha) \le \max_{0 \le e \le b_\lambda} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b_\lambda - e, h, \alpha) \right\} = \hat{J}_{n+1}(b_\lambda, h, \alpha). \tag{3.19}
\]
Combining inequalities (3.18) and (3.19), we have
\[
\lambda \hat{J}_{n+1}(b_1, h, \alpha) + (1 - \lambda) \hat{J}_{n+1}(b_2, h, \alpha) \le \hat{J}_{n+1}(\lambda b_1 + (1 - \lambda) b_2, h, \alpha). \tag{3.20}
\]
Inequality (3.20) shows that $\hat{J}_{n+1}(b, h, \alpha)$ is concave in $b$ for given $h$ and $\alpha$. By induction, we conclude that $\hat{J}_n(b, h, \alpha)$ is concave in $b$ for given $h$ and $\alpha$ for all $n$, and that $\bar{J}_n(\hat{b}, h, \alpha)$ is concave in $\hat{b}$ for all $n$. Hence, $\hat{J}(b, h, \alpha) = \hat{J}_\infty(b, h, \alpha)$ is concave in $b$ for given $h$ and $\alpha$, and $\bar{J}(\hat{b}, h, \alpha) = \bar{J}_\infty(\hat{b}, h, \alpha)$ is concave in $\hat{b}$ for given $h$ and $\alpha$.
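The concavity just established is what yields the increasing-differences property, inequality (3.27) below, used in the proof of Theorem 3.2. A small numerical illustration (with $\sqrt{\cdot}$ as an arbitrary stand-in concave function, not the actual $\bar{J}_n$):

```python
import numpy as np

# Illustrative check (not from the thesis): for a concave g, increments are
# diminishing, i.e. g(w + d) - g(w) >= g(v + d) - g(v) whenever w <= v, d >= 0.
# This is the property used in the proof of Theorem 3.2 below; g here is an
# arbitrary stand-in concave function, not the actual value function.
g = np.sqrt                          # any concave, increasing function works
rng = np.random.default_rng(2)
for _ in range(10_000):
    w = rng.uniform(0, 10)
    v = rng.uniform(w, 20)           # ensures w <= v
    d = rng.uniform(0, 5)
    assert g(w + d) - g(w) >= g(v + d) - g(v) - 1e-12   # holds for every sample
print("increasing-differences property verified on random samples")
```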
Since $\mu(e, \alpha')$ is concave in $e$, $E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha]$ is also concave in $e$. By applying Lemma 3.5(b), $\nu \bar{J}(b - e, h, \alpha)$ is concave in $b - e$. Thus, the concavities of the two terms in (3.10) translate into a diminishing marginal reward for consuming energy at the current time slot and for saving energy for future time slots, respectively. Balancing these two terms properly results in an optimal policy.

Theorem 3.2. The optimal stationary policy
\[
\hat{e}^*(b, h, \alpha) = \min\left\{ e' \in \arg\max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}(b - e, h, \alpha) \right\} \right\} \tag{3.21}
\]
is monotone increasing in $b$ for any given $h$ and $\alpha$. That is, for any $b' \ge b$, we have
\[
\hat{e}^*(b', h, \alpha) \ge \hat{e}^*(b, h, \alpha), \quad \forall h \in H,\; \forall \alpha \in A. \tag{3.22}
\]

Proof. We prove Theorem 3.2 by applying [37, Theorem 2]. We aim to prove that $\hat{e}_{n+1}(b, h, \alpha)$, defined as
\[
\hat{e}_{n+1}(b, h, \alpha) = \min\left\{ e' \in \arg\max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu \bar{J}_n(b - e, h, \alpha) \right\} \right\}, \tag{3.23}
\]
is increasing in $b$ for given $h$ and $\alpha$ for all $n$. We drop the arguments $h$ and $\alpha$ from all functions, and denote $f(e) = E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha]$ and $g_n(\hat{b}) = \nu \bar{J}_n(\hat{b}, h, \alpha)$. Then, equation (3.12) can be written as
\[
\hat{J}_{n+1}(b) = \max_{0 \le e \le b} \left\{ f(e) + g_n(b - e) \right\}. \tag{3.24}
\]
Define $e^l(b)$ and $e^u(b)$ to be the lower and upper bounds, respectively, of the set of feasible actions $e$ when the available energy in the battery is $b$. In equation (3.24), we have $e^l(b) = 0$ and $e^u(b) = b$, which are both increasing in $b$. To apply [37, Theorem 2], it is sufficient to show that $f(e) + g_n(b - e)$ has increasing differences in $(b, e)$; that is, for any $b' \ge b$ and $e' \ge e$,
\[
\big(f(e') + g_n(b' - e')\big) - \big(f(e') + g_n(b - e')\big) \ge \big(f(e) + g_n(b' - e)\big) - \big(f(e) + g_n(b - e)\big). \tag{3.25}
\]
Inequality (3.25) can be simplified to
\[
g_n(b' - e') - g_n(b - e') \ge g_n(b' - e) - g_n(b - e), \quad \forall\, b' \ge b,\; e' \ge e. \tag{3.26}
\]
From the proof of Lemma 3.5, $g_n(\hat{b}) = \nu \bar{J}_n(\hat{b}, h, \alpha)$ is concave in $\hat{b}$ for all $n$.
By applying the property of concave functions, we have
\[
g_n(w + \Delta) - g_n(w) \ge g_n(v + \Delta) - g_n(v), \quad \forall\, w \le v,\; \Delta \ge 0. \tag{3.27}
\]
Substituting $w = b - e'$, $v = b - e$, and $\Delta = b' - b$, we obtain (3.26). Now, by applying the conclusion of [37, Theorem 2], we have that $\hat{e}_{n+1}(b, h, \alpha)$ is increasing in $b$ for any given $h$ and $\alpha$ for all $n$. Thus, $\hat{e}^*(b, h, \alpha) = \hat{e}_\infty(b, h, \alpha)$ is increasing in $b$ for given $h$ and $\alpha$.

With this monotone structure, we can significantly reduce the computational complexity of the value iteration algorithm, and we propose our OTEA algorithm in Algorithm 3. The planning phases of the OTEA algorithm and the OSEA algorithm (i.e., Algorithm 2) are similar. The main difference is the procedure for computing $\hat{J}_{n+1}(b, h, \alpha)$ in (3.28) in Algorithm 3, which has a lower complexity than that of the OSEA algorithm. Specifically, in line 6 of Algorithm 3, for any given $h \in H$ and $\alpha \in A$, we have $\hat{e}_{n+1}(b + \Delta b, h, \alpha) \ge \hat{e}_{n+1}(b, h, \alpha)$ from the proof of Theorem 3.2, where $\Delta b$ is the quantization resolution of the battery energy. When we compute $\hat{J}_{n+1}(b + \Delta b, h, \alpha)$ and search for $\hat{e}_{n+1}(b + \Delta b, h, \alpha)$, we can therefore restrict the search to the interval $[\hat{e}_{n+1}(b, h, \alpha),\, b + \Delta b]$ instead of the longer interval $[0,\, b + \Delta b]$.

Algorithm 3 Optimal Transmission Energy Allocation (OTEA) Algorithm for Energy Harvesting Sensor Node.
1: Planning Phase:
2: Arbitrarily select $\hat{J}_0(b, h, \alpha)$ for each $b \in B$, $h \in H$, $\alpha \in A$, specify $\varepsilon > 0$ and $\Delta b > 0$, and set $n := 0$.
3: for each $h \in H$, $\alpha \in A$ do
4: Set $b := 0$ and $l := 0$.
5: while $b \le b_{\max}$ do
6: Compute
\[
\hat{J}_{n+1}(b, h, \alpha) := \max_{l \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu E_{h',\alpha'}\big[\hat{J}_n(\min\{b - e + h', b_{\max}\}, h', \alpha') \,|\, h, \alpha\big] \right\}, \tag{3.28}
\]
\[
\hat{e}_{n+1}(b, h, \alpha) := \min\left\{ e' \in \arg\max_{l \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu E_{h',\alpha'}\big[\hat{J}_n(\min\{b - e + h', b_{\max}\}, h', \alpha') \,|\, h, \alpha\big] \right\} \right\}.
\]
7: Set $l := \hat{e}_{n+1}(b, h, \alpha)$ and then $b := b + \Delta b$.
8: end while
9: end for
10: If $\|\hat{J}_{n+1} - \hat{J}_n\| < \varepsilon(1-\nu)/(2\nu)$, go to line 11. Otherwise, increment $n$ by 1 and go to line 3.
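A minimal sketch of the monotone-restricted search in lines 4-8, for one fixed $(h, \alpha)$ pair: the objective q_value below is an illustrative concave stand-in for the bracketed term in (3.28), so only the bookkeeping of the search interval is meant to mirror the algorithm.

```python
import numpy as np

# Sketch of Algorithm 3, lines 4-8, for one fixed (h, alpha) pair.
# q_value is a stand-in objective: concave immediate reward plus concave
# continuation, so the optimal e is monotone in b as Theorem 3.2 guarantees.
db = 0.5                                   # quantization resolution (assumed)
b_grid = np.arange(0.0, 30.0 + db, db)     # battery levels 0 .. b_max

def q_value(b: float, e: float) -> float:
    return np.log1p(e) + 0.95 * np.sqrt(b - e)

lo = 0.0                                   # l in Algorithm 3: lower end of the search
e_star = []
for b in b_grid:
    candidates = np.arange(lo, b + db / 2, db)          # search [l, b], not [0, b]
    best = candidates[np.argmax([q_value(b, e) for e in candidates])]
    e_star.append(best)
    lo = best                              # the next state's optimum is >= this one
print(np.all(np.diff(e_star) >= 0))        # the resulting policy is monotone in b: True
```

Each search resumes from the previous optimum rather than from zero, which is where the complexity reduction over the OSEA planning phase comes from.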
11: For each $b \in B$, $h \in H$, $\alpha \in A$, choose the stationary $\varepsilon$-optimal policy
\[
\hat{\delta}^*(b, h, \alpha) := \arg\max_{0 \le e \le b} \left\{ E_{\alpha'}[\mu(e, \alpha') \,|\, \alpha] + \nu E_{h',\alpha'}\big[\hat{J}_{n+1}(\min\{b - e + h', b_{\max}\}, h', \alpha') \,|\, h, \alpha\big] \right\}, \tag{3.29}
\]
and stop.
12: Sensing and Transmission Phase:
13: Set $k := 0$.
14: while $k \le K - 1$ do
15: Track the energy harvesting rate of the previous time slot $h_{k-1}$.
16: Track the energy available for use in the battery $b_k$.
17: Track the amount of data in the buffer $q_k$.
18: Obtain the channel feedback $\alpha_{k-1}$ from the sink.
19: Choose the amount of energy for sensing to be $\hat{s}^* := p b_k$, where $p$ is the fixed percentage of energy for sensing.
20: Choose the amount of energy for transmission to be $\hat{e}^* := \hat{\delta}^*((1 - p) b_k, h_{k-1}, \alpha_{k-1})$ based on the optimal policy.
21: Consume $\hat{e}^*$ amount of energy for transmission and $\hat{s}^*$ amount of energy for sensing.
22: Update the battery energy $b_{k+1}$ using (2.2) and the amount of data in the buffer $q_{k+1}$ using (2.5).
23: Set $k := k + 1$.
24: end while

In the sensing and transmission phase, when we apply our OTEA algorithm to a practical system, we still need to take the energy for sensing into account. Accordingly, we fix the percentage of energy allocated for sensing to be $p$ in line 19. The other operations in the sensing and transmission phase are the same as those in the OSEA algorithm.

3.3 Performance Evaluation

In this section, we evaluate the performance of the OSEA, OTEA, and CAEA algorithms in terms of the total amount of data transmitted via simulations in Matlab. We consider the same wireless channel as in Chapter 2, which evolves according to the three-state Markov chain shown in Figure 2.3, with the transition matrix given by (2.14). Unless specified otherwise, we assume that the battery buffer size $b_{\max} = 30$ J and the data buffer size $q_{\max} = 0.5$ Mbits. The initial amount of energy in the battery is 10 J, and the initial amount of data in the buffer is 0.1 Mbits.
For tractability, we assume that the energy harvesting rate $h_k$ takes values from the finite set $H = \{H_1, H_2, H_3\} = \{4, 8, 12\}$ J/time slot, and evolves according to a three-state Markov chain with state transition probability matrix
\[
P_h = \begin{bmatrix} P_{H_1 H_1} & P_{H_1 H_2} & P_{H_1 H_3} \\ P_{H_2 H_1} & P_{H_2 H_2} & P_{H_2 H_3} \\ P_{H_3 H_1} & P_{H_3 H_2} & P_{H_3 H_3} \end{bmatrix} = \begin{bmatrix} 0.5 & 0.5 & 0 \\ 0.25 & 0.5 & 0.25 \\ 0 & 0.5 & 0.5 \end{bmatrix}. \tag{3.30}
\]
The steady-state probability vector is then $[P_{H_1}\; P_{H_2}\; P_{H_3}] = [0.25\; 0.5\; 0.25]$. $x(s_k)$ is assumed to be a linear function of $s_k$, given by $x(s_k) = \gamma s_k$, and we adopt $\gamma = 0.08$ Mbits/J. For the value iteration algorithm, we choose $\varepsilon = 10^{-3}$ and the discount factor $\nu = 0.95$. Since the CAEA algorithm considers energy allocation over a finite horizon, where the lifetime of the sensor node is known, we fix the sensor lifetime in the CAEA algorithm to be equal to the mean of the sensor lifetime in the OSEA and OTEA algorithms.

Since the performance of the OTEA algorithm depends on the fixed amount of energy allocated for sensing, we examine the total amount of data transmitted by the OTEA algorithm under different percentages of energy allocated for sensing, and compare it with that of the OSEA algorithm. As shown in Figure 3.1, with around 50% of the available battery energy allocated for sensing, the OTEA algorithm transmits the largest amount of data, which is close to that of the OSEA algorithm. This implies that we can apply the OTEA algorithm, which has a lower complexity than the OSEA algorithm, and choose the optimal fixed percentage of energy for sensing to achieve near-optimal performance.

Figure 3.1: The total amount of data transmitted by the OTEA algorithm under different percentages of energy allocated for sensing $p$. Since the OSEA algorithm does not allocate a fixed amount of energy for sensing, its total amount of data transmitted is independent of $p$.

Moreover, the optimal percentage of energy allocated for sensing for the OTEA algorithm depends on the data-sensing efficiency $\gamma$. With a larger $\gamma$, the sensor can sense more data using the same amount of energy. Figure 3.2 shows that as $\gamma$ increases, the optimal percentage of energy allocated for sensing decreases.

Figure 3.2: The optimal percentage of energy allocated for sensing under different data-sensing efficiencies $\gamma$ for the OTEA algorithm.
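The stated steady-state distribution can be verified directly from (3.30); a short check (any numerical package works, shown here in Python):

```python
import numpy as np

# Steady-state distribution of the harvesting chain (3.30): solve pi P_h = pi
# with sum(pi) = 1 via the left eigenvector of P_h for eigenvalue 1.
P_h = np.array([[0.50, 0.50, 0.00],
                [0.25, 0.50, 0.25],
                [0.00, 0.50, 0.50]])
w, v = np.linalg.eig(P_h.T)                 # left eigenvectors of P_h
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()
print(pi)                                   # -> [0.25 0.5 0.25], as stated
H = np.array([4.0, 8.0, 12.0])
print(pi @ H)                               # baseline average harvesting rate: 8 J/time slot
```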
We then examine the total amount of data transmitted by the OSEA, OTEA, and CAEA algorithms under different average energy harvesting rates $\bar{H}$, where $\bar{H} = \sum_{i=1}^{3} H_i P_{H_i}$. For the OTEA and CAEA algorithms, the percentage of energy allocated for sensing is fixed to be 50%. In Figure 3.3, we plot the total amount of data transmitted against the average energy harvesting rate for these three algorithms. We observe that the OSEA algorithm performs better than the OTEA and CAEA algorithms, since the OSEA algorithm achieves the optimal performance by solving problem (3.4). Moreover, the OTEA algorithm performs better than the CAEA algorithm, because the OTEA algorithm takes into account the randomness of the lifetime of the sensor node, while the CAEA algorithm simply takes the lifetime to be its mean value. Besides, the total amount of data transmitted by all three algorithms saturates as the average harvesting rate is increased beyond a certain level. This is because when the energy harvesting rate is larger than the battery capacity, part of the harvested energy cannot be accommodated and is lost due to the overflow of the battery energy.

Figure 3.3: The total amount of data transmitted by the three algorithms for different average energy harvesting rates $\bar{H}$.

Next, we examine the total amount of data transmitted by the OSEA and OTEA algorithms under different data-sensing efficiencies $\gamma$. As shown in Figure 3.4, when $\gamma$ is increased, the total amount of data transmitted increases as well, because more energy is left for data transmission. However, the performance saturates as $\gamma$ is increased beyond a certain value. In the extreme, when $\gamma$ approaches infinity, sensing is extremely efficient; the total amount of data transmitted in this case provides an upper bound on the performance of the OSEA algorithm for sensor nodes with different sensing efficiencies.

Figure 3.4: The total amount of data transmitted by the OSEA algorithm and the OTEA algorithm for different values of the data-sensing efficiency parameter $\gamma$.

Figure 3.5 shows the impact of the battery storage capacity $b_{\max}$ on the total amount of data transmitted. Here, we consider that the value of $h$ is taken from the set $H = \{20, 24, 28\}$ J/time slot. As shown in Figure 3.5, the total amount of data transmitted increases as the battery storage capacity $b_{\max}$ increases.

Figure 3.5: The total amount of data transmitted by the OSEA algorithm and the OTEA algorithm for different battery storage capacities $b_{\max}$.
This is because with a larger battery storage capacity $b_{\max}$, the sensor node can manage the harvested energy better, since it can save more energy for future use if necessary. In other words, the sensor has more freedom in managing the incoming energy when $b_{\max}$ is larger. The total amount of data transmitted saturates as $b_{\max}$ grows large because, under the given energy harvesting rates, the battery energy level never exceeds a certain value; for all battery capacities $b_{\max}$ larger than that value, the sensor achieves the same performance.

In Figure 3.6, we study the impact of the data buffer size $q_{\max}$ on the total amount of data transmitted by the OSEA and OTEA algorithms. We observe that the total amount of transmitted data increases as $q_{\max}$ increases. The performance saturates when $q_{\max}$ is increased beyond a certain large value. This means that the amount of data in the buffer never exceeds a certain level under the optimal energy allocation policy; otherwise, the total amount of transmitted data would continue to increase with the data buffer size $q_{\max}$.

Figure 3.6: The total amount of data transmitted by the OSEA algorithm and the OTEA algorithm for different data buffer sizes $q_{\max}$.

Finally, we study the total amount of transmitted data of the OSEA algorithm and the OTEA algorithm under different discount factors $\nu$. Since $1/(1-\nu)$ is the average lifetime of the sensor node, a larger $\nu$ corresponds to a longer lifetime, which leads to a larger total amount of data transmitted. In Figure 3.7, the total amount of data transmitted increases as the discount factor $\nu$ increases. As $\nu$ approaches 1, so that the lifetime of the sensor node approaches infinity, the total amount of transmitted data grows without bound. Besides, the number of iterations required for the value iteration algorithm to converge depends on $\nu$: with a larger $\nu$, a larger number of iterations is required.

Figure 3.7: The total amount of data transmitted by the OSEA algorithm and the OTEA algorithm for different values of the discount factor $\nu$.

Chapter 4

Conclusions and Future Work

In this chapter, we conclude the thesis by summarizing our contributions. We also suggest topics for further research.

4.1 Conclusions

In this thesis, we studied the problem of maximizing the expected total amount of data transmitted for an energy harvesting sensor node under energy harvesting rate variations and channel fluctuations in a time-slotted system. A finite data buffer and the energy consumed for sensing data were considered for the first time. In this case, the sensor should achieve a good tradeoff between the energy consumed for sensing and transmission so as to achieve a large amount of total transmitted data. We considered two cases of the sensor lifetime, a fixed value and a random variable, in Chapters 2 and 3, respectively.
• In Chapter 2, we studied the energy allocation problem for sensing and transmission in an energy harvesting WSN over a finite horizon, where the sensor lifetime is a fixed value. We formulated it as a finite-horizon MDP under channel fluctuations and energy variations in a time-slotted system. We obtained the optimal energy allocation policy and proposed the OEA algorithm using backward induction. We also provided extensive simulation results to compare the performance of the OEA algorithm and the CAEA algorithm. The results showed that the OEA algorithm can transmit a much larger amount of data over a finite horizon than the CAEA algorithm under different settings.

• In Chapter 3, we extended the joint energy allocation problem by taking into account the randomness of the sensor lifetime. Since the lifetime of the sensor node is a random variable with a geometric distribution, we formulated the problem as an infinite-horizon MDP. We obtained the optimal stationary energy allocation policy and proposed the OSEA algorithm based on value iteration. We also studied the transmission energy allocation problem under the assumption of an infinite data backlog. We obtained structural results for the OTEA policy and proved that the OTEA policy is a monotonically increasing function of the available battery energy. Finally, we provided extensive simulation results to compare the performance of the OSEA, OTEA, and CAEA algorithms, and studied the impact of the average energy harvesting rate, the data-sensing efficiency, the battery capacity, the data buffer size, and the lifetime of the sensor node on the total amount of data transmitted. The results showed that the OSEA algorithm transmits the largest amount of data among the three algorithms, and that the OTEA algorithm can achieve near-optimal performance when the fixed percentage of energy for sensing is chosen properly.

4.2 Future Work

In terms of future work, we can consider several potential extensions of the current work.

Maximizing the average throughput of an energy harvesting wireless sensor node over an infinite horizon. We can consider maximizing the average throughput of a sensor node over an infinite horizon instead of the total amount of transmitted data. In this case, we can formulate the problem as an infinite-horizon MDP with average reward, and solve it using value iteration, policy iteration, or linear programming.

Joint optimal energy allocation, scheduling, and routing in a multi-hop energy harvesting wireless sensor network. We can consider extending the single-hop scenario to a multi-hop setting for data transmission. In the multi-hop case, the sensor nodes sense the field and transmit data to a fusion node. In each time slot, a sensor node can harvest energy from the environment and store it in a rechargeable battery for future use. The sensor is in either sleep mode or active mode in any time slot. In sleep mode, a sensor node can only harvest energy; it cannot sense or transmit data. In active mode, a sensor node can harvest energy and sense, process, and transmit data. Each node can transmit data to any other node in the network. For such a model, we can aim to develop a joint optimal energy allocation, scheduling, and routing policy that ensures a fair utilization of the network resources.
We can also take into consideration the fact that the energy harvesting process is unpredictable and stochastic in nature, and develop adaptive routing and scheduling algorithms that are able to dynamically learn and adapt to time variations in the energy harvesting process and network environment.

Bibliography

[1] C. K. Ho and R. Zhang, "Optimal energy allocation for wireless communications powered by energy harvesters," in Proc. of IEEE Int'l Symp. on Inform. Theory (ISIT), Austin, TX, Jun. 2010.

[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "Wireless sensor networks: A survey," Computer Networks, vol. 38, no. 4, pp. 393–422, Mar. 2002.

[3] G. Werner-Allen, K. Lorincz, M. Ruiz, O. Marcillo, J. Johnson, J. Lees, and M. Welsh, "Deploying a wireless sensor network on an active volcano," IEEE Internet Comput., vol. 10, no. 2, pp. 18–25, Mar. 2006.

[4] A. Mainwaring, D. Culler, J. Polastre, R. Szewczyk, and J. Anderson, "Wireless sensor networks for habitat monitoring," in Proc. of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, Atlanta, GA, Sept. 2002.

[5] M. Karpiriski, A. Senart, and V. Cahill, "Sensor networks for smart roads," in Proc. of IEEE International Conference on Pervasive Computing and Communications Workshops, Pisa, Italy, Mar. 2006.

[6] K. Chebrolu, B. Raman, N. Mishra, P. K. Valiveti, and R. Kumar, "Brimon: A sensor network system for railway bridge monitoring," in Proc. of ACM International Conference on Mobile Systems, Applications, and Services, Breckenridge, CO, Jun. 2008.

[7] L. van Hoesel, T. Nieberg, H. Kip, and P. Havinga, "Advantages of a TDMA based, energy-efficient, self-organizing MAC protocol for WSNs," in Proc. of IEEE Vehicular Technology Conference, Los Angeles, CA, Sept. 2004.

[8] Y. Kim, H. Shin, and H. Cha, "Y-MAC: An energy-efficient multi-channel MAC protocol for dense wireless sensor networks," in Proc. of the 7th International Conference on Information Processing in Sensor Networks, St. Louis, MO, Apr. 2008.

[9] Q. Ren and Q. Liang, "An energy-efficient MAC protocol for wireless sensor networks," in Proc. of IEEE Globecom, St. Louis, MO, Nov. 2005.

[10] D. Ganesan, R. Govindan, S. Shenker, and D. Estrin, "Highly-resilient, energy-efficient multipath routing in wireless sensor networks," ACM SIGMOBILE Mobile Computing and Communications Review, vol. 5, no. 4, pp. 11–25, Oct. 2001.

[11] S. D. Muruganathan, D. C. F. Ma, R. I. Bhasin, and A. O. Fapojuwo, "A centralized energy-efficient routing protocol for wireless sensor networks," IEEE Communications Magazine, vol. 43, no. 3, pp. S8–S13, Mar. 2005.

[12] F. Wang and J. Liu, "Duty-cycle-aware broadcast in wireless sensor networks," in Proc. of IEEE INFOCOM, Rio de Janeiro, Brazil, Apr. 2009.

[13] D. Niyato, E. Hossain, M. M. Rashid, and V. K. Bhargava, "Wireless sensor networks with energy harvesting technologies: A game-theoretic approach to optimal energy management," IEEE Wireless Communications, vol. 14, no. 4, pp. 90–96, Aug. 2007.
[14] S. Sudevalayam and P. Kulkarni, "Energy harvesting sensor nodes: Survey and implications," IEEE Communications Surveys and Tutorials, vol. 13, no. 3, pp. 443–461, Third Quarter 2011.

[15] S. Priya and D. J. Inman, Energy Harvesting Technologies. Springer, 2009.

[16] V. Raghunathan, A. Kansal, J. Hsu, J. Friedman, and M. Srivastava, "Design considerations for solar energy harvesting wireless embedded systems," in Proc. of the 4th International Symposium on Information Processing in Sensor Networks, Los Angeles, CA, Apr. 2005.

[17] F. Simjee and P. H. Chou, "Everlast: Long-life, supercapacitor-operated wireless sensor node," in Proc. of the 2006 International Symposium on Low Power Electronics and Design, Tegernsee, Germany, Oct. 2006.

[18] X. Jiang, J. Polastre, and D. Culler, "Perpetual environmentally powered sensor networks," in Proc. of the 4th International Symposium on Information Processing in Sensor Networks, Los Angeles, CA, Apr. 2005.

[19] J. Taneja, J. Jeong, and D. Culler, "Design, modeling, and capacity planning for micro-solar power sensor networks," in Proc. of the 7th International Conference on Information Processing in Sensor Networks, St. Louis, MO, Apr. 2008.

[20] A. Kansal, J. Hsu, S. Zahedi, and M. B. Srivastava, "Power management in energy harvesting sensor networks," ACM Trans. on Embedded Computing Systems, vol. 6, no. 4, Sept. 2007.

[21] V. Sharma, U. Mukherji, V. Joseph, and S. Gupta, "Optimal energy management policies for energy harvesting sensor nodes," IEEE Trans. on Wireless Communications, vol. 9, no. 4, pp. 1326–1336, Apr. 2010.

[22] M. Gatzianas, L. Georgiadis, and L. Tassiulas, "Control of wireless networks with rechargeable batteries," IEEE Trans. on Wireless Communications, vol. 9, no. 2, pp. 581–593, Feb. 2010.

[23] L. Huang and M. J. Neely, "Utility optimal scheduling in energy harvesting networks," in Proc. of ACM MobiHoc, Paris, France, May 2011.

[24] M. Gorlatova, A. Wallwater, and G. Zussman, "Networking low-power energy harvesting devices: Measurements and algorithms," in Proc. of IEEE INFOCOM, Shanghai, China, Apr. 2011.

[25] J. Yang and S. Ulukus, "Optimal packet scheduling in an energy harvesting communication system," IEEE Trans. on Communications, vol. 60, no. 1, pp. 220–230, Jan. 2012.

[26] M. A. Antepli, E. Uysal-Biyikoglu, and H. Erkal, "Optimal packet scheduling on an energy harvesting broadcast link," IEEE J. Sel. Areas Communications, vol. 29, no. 8, pp. 1721–1731, Sept. 2011.

[27] D. P. Bertsekas, Dynamic Programming and Optimal Control: Volume 1, 2nd ed. Athena Scientific, 2000.

[28] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York, NY: John Wiley and Sons, 2005.

[29] O. Ozel, K. Tutuncuoglu, J. Yang, S. Ulukus, and A. Yener, "Adaptive transmission policies for energy harvesting wireless nodes in fading channels," in Proc. of IEEE Information Sciences and Systems (CISS), Baltimore, MD, Mar. 2011.
[30] S. Chen, P. Sinha, N. B. Shroff, and C. Joo, "Finite-horizon energy allocation and routing scheme in rechargeable sensor networks," in Proc. of IEEE INFOCOM, Shanghai, China, Apr. 2011.

[31] H. Li, N. Jaggi, and B. Sikdar, "Relay scheduling for cooperative communications in sensor networks with energy harvesting," IEEE Trans. on Wireless Communications, vol. 10, no. 9, pp. 2918–2928, Sept. 2011.

[32] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.

[33] C.-F. Chiasserini and R. R. Rao, "Improving battery performance by using traffic shaping techniques," IEEE J. Sel. Areas Communications, vol. 19, no. 7, pp. 1385–1394, Jul. 2001.

[34] J. Lei, R. Yates, and L. Greenstein, "A generic model for optimizing single-hop transmission policy of replenishable sensors," IEEE Trans. on Wireless Communications, vol. 8, no. 2, pp. 547–551, Feb. 2009.

[35] Q. Zhang and S. A. Kassam, "Finite-state Markov model for Rayleigh fading channels," IEEE Trans. on Communications, vol. 47, no. 11, pp. 1688–1692, Nov. 1999.

[36] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

[37] R. Amir, "Supermodularity and complementarity in economics: An elementary survey," Southern Economic Journal, vol. 71, no. 3, pp. 636–660, Jan. 2005.