Information-based Sampling for Spatiotemporal FieldEstimation and Reconstruction in EnvironmentalMonitoringbyJiahong ChenB.Eng., Xiamen University, 2015A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinTHE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES(Mechanical Engineering)The University of British Columbia(Vancouver)March 2019c© Jiahong Chen, 2019The following individuals certify that they have read, and recommend to the Faculty ofGraduate and Postdoctoral Studies for acceptance, the thesis entitled:Information-based Sampling for Spatiotemporal Field Estimation andReconstruction in Environmental Monitoringsubmitted by Jiahong Chen in partial fulfillment of the requirements for the degree ofDoctor of Philosophy in Mechanical Engineering.Examining Committee:Prof. Clarence W. de Silva, Mechanical EngineeringSupervisorProf. Mu Chiao, Mechanical EngineeringSupervisory Committe MemberProf. Jose´ Martı´, Electrical and Computer EngineeringUniversity ExaminerProf. Ryozo Nagamune, Mechanical EngineeringUniversity ExaminerAdditional Supervisory Committee Members:Prof. Hsi-Yung (Steve) Feng, Mechanical EngineeringSupervisory Committee MemberProf. Dana Grecov, Mechanical EngineeringSupervisory Committee MemberiiAbstractThis dissertation addresses the near-optimal deployment problem of robot-sensory nodesin a spatiotemporal field. With limited resources, monitoring of a complex environmentmay face serious challenges in providing sufficient information for spatiotemporal signalestimation and reconstruction. It is therefore essential to retrieve most useful informationfrom sampling locations while using a small number of sensor nodes. In this dissertation,three aspects are investigated to overcome the shortcomings of the existing information-based sampling methods.First, a sensor node deployment method is designed to find the minimum number of sen-sor deployment locations while achieving near-optimal field estimation error. To this end,a sampling-based field exploration method is used to find near-optimal sampling locationsover an infinite horizon. Moreover, spatiotemporal correlations of the sampling data arestudied to find redundant signals. The corresponding sampling locations of the redundantsignals are eliminated concerning the network connectivity.Second, a deep reinforcement learning approach is proposed to accelerate the field ex-ploration. Typically, field exploration methods are heavily dependent on random sampling,which has low efficiency. To avoid unnecessary or redundant sampling locations, obser-vations from the sampling locations are utilized. Then a model-based information gaindetermination of the sampling locations is developed to evaluate the effectiveness of the ap-proach. The proposed method can determine the informativeness of the spatiotemporal fieldby learning the information gain from the sampled area. The mobile sensory agents are theniiiencouraged to take more samples in the area of higher information gain. Consequently, thespatiotemporal field can be efficiently explored. Moreover, the selected sampling locationscan near-optimally reconstruct the spatiotemporal field using statistical methods.Third, a deep learning framework is designed to provide accurate reconstruction andprediction of the spatiotemporal field, using a limited number of observations. Nonlin-ear mapping from limited observations to the entire spatiotemporal field is needed in asufficiently large spatiotemporal field. Hence, a deep learning method is proposed to ex-tract sparse representations of the field and their nonlinear mappings. It is also proven thatthe proposed framework obeys Lipschitz continuity and that the observations collected bysparse representations are sufficient for spatiotemporal field reconstruction.ivLay SummaryThe environmental pollution is a critical issue in todays world, and the ecological changesand pollution may happen anywhere. For example, aquatic pollutants may come fromhousehold waste, industrial effluent, and environmental contamination. Large scale moni-toring systems can be established to determine the polluted areas and their dynamics. Then,corrective actions can be taken using that information. This dissertation seeks to developinformation-based sampling approaches that can retrieve as much information as possiblefrom a spatiotemporal (space-time) field. Redundant sampling locations are detected andeliminated to reduce the infrastructural cost. In this manner, the efficiency of the field ex-ploration is improved while reducing the computational cost, by learning the sampling data.The collected data can be used to estimate the spatiotemporal field. Besides, the field canbe reconstructed from the sampling data, and the environmental pollution and changes canbe detected or predicted.vPrefaceAll the work presented in this dissertation was conducted by Jiahong Chen in the In-dustrial Automation Laboratory (IAL) at the University of British Columbia (UBC), Van-couver, under the direct supervision and guidance of Dr. Clarence W. de Silva, Professor inthe Department of Mechanical Engineering at UBC. The technical content of each chapter,including literature survey, algorithm development and implementation, and experimenta-tion, was all completed by Jiahong Chen under the guidance and advise of Dr. Clarence W.de Silva. In particular, Dr. de Silva proposed and supervised the overall research project,suggested the research topics, obtained funding, provided research facilities and field testopportunities, and revised and refined the dissertation and other publications.Chapter 2 and Chapter 6 are based on the work published on:• Jiahong Chen, Teng Li, Tongxin Shu, and Clarence W. de Silva, “Rapidly-exploringTree with Linear Reduction: A Near-optimal approach for Spatiotemporal SensorDeployment in Aquatic Fields using Minimal Sensor Nodes.”, IEEE Sensors Journal,vol. 18, no. 24, pp. 10225-10239, 15 Dec.15, 2018.Jiahong Chen was responsible for all major areas of problem formulation, algorithmpreparation, and implementation, experimental validation, as well as manuscript composi-tion. Teng Li and Tongxin Shu were involved in the early stages of concept formulationand contributed to manuscript editing. The prototype of the unmanned surface vehicle wasdeveloped collaboratively by Teng Li, Tongxin Shu and me, in the UBC Industrial Automa-tion Laboratory, with the guidance of Dr. Clarence W. de Silva. Dr. Clarence W. de Silvaviwas the supervisory author on this work and was involved throughout the project in conceptformation and manuscript composition, as indicated above.A version of Chapter 3 and Chapter 4 has been submitted to a journal:• Jiahong Chen, Tongxin Shu, Teng Li, and Clarence W. de Silva, “Deep ReinforcedLearning Tree for Spatiotemporal Monitoring with Wireless Sensor Network.”.Jiahong Chen was responsible for all major areas of problem formulation, algorithmpreparation and implementation, numerical experimentation, and manuscript composition.Tongxin Shu and Teng Li were involved in the early stages of concept formulation andcontributed to manuscript editing. Dr. Clarence W. de Silva was the supervisory authoron this work and was involved throughout the project in concept formation and manuscriptcomposition, as indicated above.A version of Chapter 5 is to be submitted to a journal:• Jiahong Chen and Clarence W. de Silva, “Sparse Sensor Deployment for Spatiotem-poral Field Reconstruction and Prediction.”.Jiahong Chen was responsible for all major areas of problem formulation, algorithmpreparation and implementation, numerical simulation, and manuscript composition. Dr.Clarence W. de Silva was the supervisory author on this work and was involved throughoutthe project in concept formation and manuscript composition, as noted above.viiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xivNomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviiGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xixAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.3.1 Field Coverage with a Static Sensor Network . . . . . . . . . . . . 51.3.2 Field Coverage with Dynamic Sensor Network . . . . . . . . . . . 6viii1.3.3 Spatiotemporal Monitoring with Wireless Robotic Sensor Networks 71.3.4 Field Exploration Assisted by Reinforcement Learning . . . . . . . 81.3.5 Spatiotemporal Field Reconstruction . . . . . . . . . . . . . . . . . 101.3.6 Optimal Sparse Representation Through Deep Learning . . . . . . . 121.4 Contributions and Organization of the Dissertation . . . . . . . . . . . . . . 132 Spatiotemporal Sensor Deployment using Minimal Sensor Nodes . . . . . . . 172.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192.2.1 Modeling of Environment . . . . . . . . . . . . . . . . . . . . . . 192.2.2 Estimation Error of Deployment Location . . . . . . . . . . . . . . 212.2.3 Matrix Linear Redundancy . . . . . . . . . . . . . . . . . . . . . . 222.2.4 WSN Connectivity of the Informative Sampling Locations . . . . . 232.2.5 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . 242.3 Rapidly-exploring Random Tree with Connected Linear Reduction . . . . . 252.3.1 Optimization Scheme for Near-optimal Sensor Node Deployment . 252.3.2 Linearly Dependent Sampling Location Elimination with regard toNetwork Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . 302.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 Deep Reinforced Learning Tree for Efficient Spatiotemporal Exploration . . 433.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433.2 Problem Definition and Mathematical Background . . . . . . . . . . . . . . 463.2.1 Modeling of Environment . . . . . . . . . . . . . . . . . . . . . . 463.2.2 Sampling Quality of Deployment Location . . . . . . . . . . . . . 473.2.3 Field Coverage and Network Connectivity Analysis . . . . . . . . . 493.3 Deep Reinforced Learning Tree . . . . . . . . . . . . . . . . . . . . . . . . 51ix3.3.1 Guaranteed Near-optimality for Informative Sampling Using Ex-ploring Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513.3.2 Model-based Spatiotemporal Information Gain . . . . . . . . . . . 543.3.3 Deep Reinforcement Learning Model for Maximum InformationRetrieval over a Spatiotemporal Field . . . . . . . . . . . . . . . . 583.3.4 Architecture for Deep Reinforced Learning Tree . . . . . . . . . . . 613.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633.4.1 NOAA Sea Surface Temperature Dataset . . . . . . . . . . . . . . . 633.4.2 Benchmark Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 643.4.3 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . 653.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724 Model-based Sensor Deployment for Spatiotemporal Field Reconstruction . 734.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734.2 Mathematical Background and Problem Formulation . . . . . . . . . . . . 744.2.1 Sparse Sensor Deployment Analysis . . . . . . . . . . . . . . . . . 744.2.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . 754.3 Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754.3.1 Sparse Learning for Spatiotemporal Reconstruction . . . . . . . . . 754.3.2 Model-based Sparse Sensor Deployment with Informative Sampling 774.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 794.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 804.4.2 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . 814.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 835 Model-free Sensor Deployment for Spatiotemporal Field Reconstruction andPrediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84x5.2 Preliminary and Problem Formulation of Sensor Selection . . . . . . . . . . 865.2.1 Sparse Sensor Deployment Analysis . . . . . . . . . . . . . . . . . 875.3 Sparse Sensor Deployment using Auto Field Reconstructor . . . . . . . . . 885.3.1 Sparse Sampling for Reconstruction . . . . . . . . . . . . . . . . . 885.3.2 Deep Learning based Spatiotemporal Field Reconstruction . . . . . 905.3.3 Lipschitz Continuity and Reconstruction Capability for the ProposedFramework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 945.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1015.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 1015.4.2 Training Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1045.4.3 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . 1065.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1126 Hardware Implementation and Experimentation . . . . . . . . . . . . . . . . 1146.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1146.2 Hardware Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1156.2.1 Water Quality Sensors . . . . . . . . . . . . . . . . . . . . . . . . 1156.2.2 Onboard Controllers . . . . . . . . . . . . . . . . . . . . . . . . . 1166.2.3 Mobile Base of USV . . . . . . . . . . . . . . . . . . . . . . . . . 1176.3 In Situ Tests and Experimental Results . . . . . . . . . . . . . . . . . . . . 1186.3.1 Experiment Setup for In Situ Tests . . . . . . . . . . . . . . . . . . 1186.3.2 Experimental Results for Spatiotemporal Sensor Deployment usingMinimal Sensor Nodes . . . . . . . . . . . . . . . . . . . . . . . . 1196.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1237 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . 1257.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1257.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127xiBibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129xiiList of TablesTable 2.1 Statistical measurement of the field estimation uncertainty. . . . . . . . . 40Table 3.1 Numerical results for DRLT and benchmarks. . . . . . . . . . . . . . . . 71Table 4.1 MSE of the proposed method and the benchmark algorithms. . . . . . . 83Table 5.1 Comparison of performance using the two datasets. . . . . . . . . . . . 113Table 6.1 Specifications for the water quality sensors. . . . . . . . . . . . . . . . . 116Table 6.2 Specifications for the propeller. . . . . . . . . . . . . . . . . . . . . . . 117Table 6.3 Statistical measurement of the field estimation uncertainty. . . . . . . . . 121xiiiList of FiguresFigure 2.1 The framework of optimal sensor node deployment with minimal sensornodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18Figure 2.2 Illustration of network connectivity. (Red sensor nodes have spatiotem-porally correlated sampling data.) . . . . . . . . . . . . . . . . . . . . 23Figure 2.3 Illustration of redundant sensor elimination. . . . . . . . . . . . . . . . 31Figure 2.4 Deployment locations in CFD simulation. . . . . . . . . . . . . . . . . 36Figure 2.5 Deployment results generated by RRT-based algorithms. . . . . . . . . 38Figure 2.6 Deployment results generated by random search-based algorithms. . . . 39Figure 2.7 Field estimation uncertainty results of RRT-based algorithms. . . . . . . 40Figure 2.8 Change of minimum estimation error. . . . . . . . . . . . . . . . . . . 41Figure 3.1 Schematic architecture for DRL model . . . . . . . . . . . . . . . . . . 58Figure 3.2 Illustration of the action structure. Blue circle: current sampling loca-tions; diamonds: possible sampling locations for next step. . . . . . . . 60Figure 3.3 Deployment strategy for information-based sampling. Red dots: com-puted information-based sampling locations; blue lines: exploring tree;yellow squares: informative centers for RRTPI. . . . . . . . . . . . . . 66Figure 3.4 Field estimation uncertainty for information-based sampling. . . . . . . 67xivFigure 3.5 Reduction of the estimation error with iteration (Blue solid line: resultof INFO; green dash line: RRC; red dash-dot line: RRT; light blue dotline: RRTPI; purple dash-dash line: DRLT). . . . . . . . . . . . . . . . 68Figure 3.6 Estimation error of the information-based deployment strategy deter-mined by the proposed and the benchmark algorithms decreases mono-tonically with time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69Figure 3.7 Estimation error of the information-based deployment strategy deter-mined by the proposed and the benchmark algorithms decreases mono-tonically with time. (Zoomed in) . . . . . . . . . . . . . . . . . . . . . 70Figure 3.8 Cumulative reward and estimation error of DRLT. (Red dashed line: cu-mulative reward of the DRLT; solid blue line: corresponding estimationError). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71Figure 4.1 Ground-truth and reconstruction results from the proposed and bench-mark algorithms. (Red circle: sampling locations generated from dif-ferent information-based planning algorithms.) . . . . . . . . . . . . . 82Figure 5.1 Illustration of generating sparse sampling locations using a canonicalbasis. Dark blue region in matrix C stands for 1, rest of the matrixcontains 0; second, seventh and fifth row of φ is extracted according toC and compose y. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90Figure 5.2 The framework of the auto field reconstructor. Yellow nodes representthe compression layers which optimize sparse sampling locations; bluenodes are neurons in the hidden layer for model fitting; red layer is theoutput layer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92Figure 5.3 Training using SST dataset with 100, 200, 302, 400 sparse samplingsensors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105xvFigure 5.4 Training using PREC dataset with 50, 100, 206, 300 sparse samplingsensors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105Figure 5.5 Ground truth of the testing spatiotemporal field and reconstruction fromthe proposed and the benchmark algorithms in SST dataset. (Black dia-monds: sampling locations.) . . . . . . . . . . . . . . . . . . . . . . . 107Figure 5.6 Reconstruction variance for all test snapshots in the SST dataset. . . . . 109Figure 5.7 Ground truth of the testing spatiotemporal field and reconstruction fromthe proposed and benchmark algorithms in the PREC dataset. (Bluediamonds: sampling locations.) . . . . . . . . . . . . . . . . . . . . . . 111Figure 5.8 Reconstruction variance for all test snapshots in the PREC dataset. . . . 111Figure 6.1 The prototype of the developed USV. . . . . . . . . . . . . . . . . . . . 115Figure 6.2 Feedback control of the USV. . . . . . . . . . . . . . . . . . . . . . . . 117Figure 6.3 The framework for in situ experiments. . . . . . . . . . . . . . . . . . 118Figure 6.4 Sensor deployment for the in situ test. . . . . . . . . . . . . . . . . . . 119Figure 6.5 Deployment results generated by RRT-based algorithms. . . . . . . . . 121Figure 6.6 Deployment results generated by random search-based algorithms. . . . 122Figure 6.7 Field estimation uncertainty results of RRT-based algorithms. . . . . . . 122Figure 6.8 Change of minimum estimation error. . . . . . . . . . . . . . . . . . . 123xviNomenclatureA a two-dimensional environment of interestA temporal information matrixA action set for reinforcement learningb bias vectorC measurement matrixη infinite horizon costE neighbors of a sampling locationE edges of a space-filling treeG deep learning frameworkGr reconstruction framework in GG space-filling treeH(·) entropy functionI information matrix of Gaussian EliminationJ set of redundant sampling locationsJ matrix of onesM set of sensor readingsM sensor reading matrixMI(·) Mutual Information functionN deployment strategyNˆi a set of deployment strategies at iteration ixviiP estimation error matrixpi(·) maximum eigenvalue of a matrixψ a location in the two-dimensional environmentφ high-dimensional spatiotemporal fieldΨ principal basisR· covariance matrix of Gaussian white noiseρ step size for field exploration methodsSψ the spatial information matrix at location ψΣ covariance matrixT (·) element-wise nonlinear rectifier functionV vertices of a space-filling treevψ[t] an observation in location ψ at time tW weight matrixx[t] time-varying coefficient vectorxviiiGlossaryCS Compressive sensingDL Deep learningDNN Deep neural networkDO Dissolved oxygenDRL Deep Reinforcement LearningDRLT Deep Reinforced Learning TreeEOI Environment of InterestGAN Generative adversarial networksGB GigaByteGPS Global Positioning SystemLSTM Long short-term memoryMOO Multi-objective optimizationMSE Mean square errorMSN Mobile sensor networkNCEI National Centers for Environmental InformationxixNCOM Naval Oceanographic Office Regional Navy Coastal Ocean ModelNOAA National Oceanic and Atmospheric AdministrationORP Oxidation-reduction potentialPCA Principal component analysisPREC PRECipitation datasetReLU Rectified Linear UnitsRL Reinforcement learningRMSE Root mean square errorROS Robot operating systemsRRC Rapidly-exploring random cyclesRRCstar Rapidly-exploring random cycles starRRLR Rapid Random exploring tree with Linear ReductionRRT Rapidly-exploring random treeSAFER Spatiotemporal sparse Auto FiEld ReconstructorSLAM Simultaneous localization and mappingSVD Singular value decompositionUSV Unmanned surface vehicleVAE Variational auto-encoderWSN Wireless sensor networkxxAcknowledgmentsFirst and foremost, I wish to express my deepest gratitude to my research supervisor,Professor Clarence W. de Silva, who has continuously given me rigorous advice and en-couragement throughout my entire research program at the University of British Columbia.His invaluable suggestions not only pushed my critical thinking to a higher level but alsotaught me to become an independent researcher. I truly appreciate all the support and advicethat he provided throughout the years.I wish to thank all my colleagues in the Industrial Automation Laboratory (IAL), Dr.Yunfei Zhang, Mr. Shan Xiao, Dr. Muhammad Tufail Khan, Dr. Lili Meng, Dr. Min Xia,Dr. Shujun Gao, Mr. Hani Balkhair, Dr. Teng Li, Mr. Zhuo Chen, Mr. Tongxin Shu, Mr.Sheikh Tanvir, Mr. Fan Yang, Mr. Lucas Falch, Ms. Swapna Premasiri, Mr. HiroshanGunawardane, Mr. Jing Wang, Mr. Kuangen Zhang for their friendship and help on bothacademic affairs and personal life.Additionally, I would like to acknowledge the financial support for the research pre-sented in this thesis, through the research grants from IC-IMPACTS: India-Canada Cen-tre for Innovative Multidisciplinary Partnership to Accelerate Community Transformationand Sustainability, obtained by Dr. Clarence W. de Silva as the Principal Investigator, andthe Tier 1 Canada Research Chair in Mechatronics and Industrial Automation held by Dr.Clarence W. de Silva.In particular and closing, I want to thank my family for their unquestionable supportthroughout my upbringing.xxiChapter 1Introduction1.1 MotivationIn recent decades, public awareness has been considerably raised with regard to varioustypes of environmental pollution issues. Environment-related problems significantly affectpublic health, in both developing and developed countries. In this context, environmen-tal monitoring systems can be designed to sample the key attributes that adversely affectthe living environment and then inform the public and relevant authorities about the lat-est undesirable conditions and take corrective actions. Deployed sensors may comprise awireless sensor network (WSN) that can carry out different types of tasks such as persistentmonitoring [1], event detection [2], and spatiotemporal field estimation and reconstruction[3, 4].Typically, an environmental monitoring system consists of a group of sensor nodes de-ployed in an environment of interest (EOI). In practice, only a limited number of sensornodes are deployed in critical and fixed locations due to a limitation of resources, such asbattery life and the cost of sensor node components. It is therefore essential to allocatesensor nodes to optimal locations and possible move them as appropriate when the opti-mal conditions change. In this manner, a deployed sensor network can receive the greatestamount of information while maintaining an acceptable environmental monitoring estima-1tion error.Traditionally, dense deployment has been performed to achieve high monitoring resolu-tion for a field. The deployed static sensors were assumed to detect events within a specificsensing range, which is regarded as the coverage area of sensor nodes. As the sensor net-work spreads over the field, the EOI can be fully covered. Originally, researchers tendedto deploy sensor nodes randomly in the field; efficient data transmission protocols and thesleep-active scheduling method were then used to increase the network lifetime [5, 6]. How-ever, randomly deployed sensor nodes in a field can result in high infrastructure costs andredundant area coverage. Hence, the sensor nodes should be deployed within an infinitehorizon and dynamically according to suitable optimizing criteria.In contrast to static sensor networks, mobile sensor networks provide mobility to sensornodes, enabling them to move as appropriate within the EOI. In this manner, mobile sensorscan optimize their locations to attain sufficient area coverage in an effective manner. Themobility of sensor nodes may also help the sensor network adapt to environmental changes.Researchers have made great efforts to maximize the coverage ratio and minimize the trav-eling distance of the mobile sensors [7, 8]. Typically, a small number of mobile sensors arerandomly deployed or placed along a narrow area. A planning algorithm then instructs eachmobile agent to move around to increase the coverage ratio while minimizing the travelingdistance. In this manner, the maximum coverage area will be provided by the deployedsensors in an “optimal” manner. Redundant sensors can be removed or relocated to furtheroptimize the system and reduce costs.However, if the EOI is large, deploying sensors in a distributed manner may still requirea large number of sensor nodes. Hence, the EOI can be determined as a spatiotemporal field,and sampling-based deployment strategies can be developed to use only a small numberof sensors to estimate the entire spatiotemporal field optimally. The sensor deploymentlocations of the sampling-based strategies are determined by minimizing the estimationerror over the EOI, or by maximizing the information gain from the field. Novel algorithms2have been developed to explore the spatiotemporal field efficiently and find the near-optimalsampling locations for the sampling data [3, 9]. These approaches first model the EOI as ananalytical model. Mobile sensor nodes are then sent out to explore the EOI. Near-optimalsampling locations can subsequently be determined by minimizing the field estimation error.Moreover, maximum information can be retrieved from the EOI by selecting the mostinformative subset from the sampling locations [10]. Information theory-based methodscalculate the entropy or mutual information of the sampling data, after which the subset ofthe sampling data that provides maximum information is selected. Thus, the correspond-ing locations of the selected subset can be determined as the most informative samplinglocations, and the maximum information gain can be achieved from the EOI.In some cases, merely achieving minimal estimation error in a field might result in awaste of limited resources. Sensory data collected from the deployment locations may havesome redundancy according to specific criteria, such as linear dependency or informationtheory. In such cases, the selected near-optimal sampling locations can be further reducedto minimize the number of deployed sensor nodes [11]. A redundancy-removing algorithmcan first find the optimal sensor deployment location set, which contains a set of sensorsthat achieves the lowest estimation error in the EOI. The correlation of various samplinglocations is then investigated. If the sampling data at one or more locations can be recon-structed from the remainder of the observations, they are regarded as redundant locations.By removing these redundant locations, the most informative subset can be selected fromthe optimal set to reduce the number of sensor nodes.Currently, efficient optimization methods, such as RRT-based approaches, heavily de-pend on exploration metrics, which may result in extensive computing time. The rapidly-exploring random tree (RRT) algorithm, a common exploration method, can efficiently ex-plore an infinite horizon [12]. Randomly selecting the subsequent sampling point in a fieldguarantees an optimal solution for deployment locations; however, continuously samplingrandom locations may result in a duplication of efforts. Hence, information acquired from3sampled locations can be utilized to accelerate the exploration [13]. By learning the qual-ity of sampled locations, reinforcement learning (RL)-based algorithms can decide the bestfuture exploration steps given the current sampling locations. Furthermore, geodesic dis-tance, information gain, and estimation error can be considered as the sampling reward,which helps to avoid unnecessary exploration locations.Given the observations from the optimal sampling locations, it is also essential to predictthe full state of the EOI. Recently, researchers have attempted to reconstruct the EOI accord-ing to the spatiotemporal correlations of the sampling data [4]. Statistical methods, such asprincipal component analysis (PCA) and compressive sensing (CS), have been applied toperform signal compression and reconstruction. These approaches project the original sig-nal onto a lower-dimensional space, which indicates essential parts of the original signal.Thus, the corresponding sampling location can be selected as the critical sampling locationfor reconstruction, and used to interpret how the signal is encoded into this low-dimensionalspace. Statistical methods can then decode the observations from the low-dimensional spaceto the original signal. Hence, given future observations from the selected sampling location,the entire spatiotemporal field can be predicted. However, traditional statistical approachesmay have limitations in capturing both spatial and temporal correlations. Consequently,Deep Learning (DL) may be utilized to reconstruct signals at higher efficiency and preci-sion.1.2 Research ObjectivesIt is important to identify and evaluate the performance and the weaknesses of existingenvironmental monitoring systems, with the aim of conducting persistent monitoring, spa-tiotemporal reconstruction, and the information-based sampling of the systems. Optimiza-tion methods can then be implemented to generate more efficient deployment strategies andto precisely reconstruct the spatiotemporal field according to the sampled data. The primaryproblem in this context is to find, using limited observations, an efficient and near-optimal4sensor deployment strategy for estimating and reconstructing the spatiotemporal field.The first goal of this dissertation is to design a near-optimal sensor node deploymentalgorithms with minimal sensor nodes for comprehensive environmental monitoring. Thesecond goal is to develop efficient exploration methods to accelerate the optimization. Thesampling-based near-optimal strategy that is developed for sensor node deployment shouldbe capable of learning sampled data and taking intelligent sampling actions. The thirdgoal is to reconstruct and predict the extensive spatiotemporal field, using limited data.Since only a limited number of sensor nodes are deployed in the spatiotemporal field, theobservations taken from these sampling locations should be able to accurately reconstructthe full scenario.1.3 Related Work1.3.1 Field Coverage with a Static Sensor NetworkA static WSN, which is the most widely used type of sensor network in real-worldapplications, provides high system redundancy and monitoring resolution over the EOI.In the beginning, Vuran et al. worked on addressing challenges with regard to WSNs thatwere densely deployed at fixed locations and used a large number of sensor nodes to con-tinuously observe physical phenomenon [5]. The spatial and temporal correlations withinthe WSNs were utilized for the development of efficient communication protocols to savenetwork costs and energy consumption. Next, Yoon et al. focused on the spatiotemporalcorrelations of sensed data in WSNs and the inherent physics of the environment [6]. Theresearchers developed a clustering algorithm to cluster nodes according to similar sensingvalues and spatiotemporal correlation. Then, only one sensor reading is transmitted to onecluster, to reduce transmission costs.However, the spatial correlation between sensors and cluster-heads may result in lowaccuracy and may not fully exploit spatial and temporal correlations. Moreover, the trans-mitted data cannot be used to reconstruct the entire network, and choosing an appropriate5threshold that suits the current environment would be extremely difficult.Consequently, Gupta and his colleagues designed techniques that exploit spatiotempo-ral correlations in sensor data to minimize the communication and energy costs of datacollection in a sensor network [1]. Their algorithm used linear regression methods to findhighly correlated sensor nodes and cluster them into small subsets. The sensor network thenneeded to collect data from only one sensor node in each subset. Based on Guptas work,Alsheikh et al. utilized neural networks to determine redundant sensors [11]. The use ofneural networks made the algorithm suitable for a variety of sensory data and improved theregression accuracy.Villas et al. considered the case of large-scale, dense WSNs [2, 14]. With high-densitydeployment it is very likely to detect spatially and temporally correlated information in asmall group of sensor nodes. Such information can be used to reduce the extent of commu-nication and data exchange; by eliminating such redundancy, energy can be saved.However, in these methods, since these sensor networks are static, and unnecessarysensor nodes are turned off but not removed, the deployment is not “optimal” and also therewill be significant initial deployment cost. Thus, it is not desirable to deploy static sensornodes arbitrarily and densely in large-scale field monitoring.1.3.2 Field Coverage with Dynamic Sensor NetworkIn contrast to static sensor networks, mobile sensor networks empower sensor nodeswith mobility, enabling them to provide dynamic coverage of the field of interest. In thismanner, overlapping of the sensing areas by nodes can be minimized, and hence the numberof sensor nodes can be reduced. The mobility of sensor nodes may also help the sensornetwork adapt to environmental changes in an optimal manner [15].Heo and Varshney presented an autonomous deployment method for mobile WSNs[16, 17]. Their algorithm converts random deployment into uniform distribution with min-imal sensor node movement. They proposed an algorithm to increase the ratio of region6coverage while using limited computational time and relatively short mean sensor nodetravel distance, which can extend the network lifetime over the field of interest.Abo and his colleagues attempted to use robotic sensors to maintain dynamic coverageof a field for as long as possible [8, 18, 19]. The sensor node deployment strategy in the fieldis the essential factor related to the coverage; however, the traditional dense deploymentmethod requires many static sensor nodes and can cause coverage gaps in the sensing field.Therefore, the researchers proposed a new deployment method based on multi-objectiveimmune optimization to minimize the coverage gaps. The coverage area of the WSN ismaximized by rescheduling the placement of the mobile sensors with limited mobility andcommunication connectivity. The sensors with mobility not only improve the coverage ofthe area but also reduce the cost of long-term deployment.Although mobile WSNs succeeded in minimizing the number of sensors required in thefield and the energy consumption of the network, they did not consider the sampling datacorrelation, which is widely viewed in static sensor network protocols. In particular, dataredundancy can be minimized by modeling environmental dynamics, using information-based sampling strategies.1.3.3 Spatiotemporal Monitoring with Wireless Robotic SensorNetworksTypically, a wireless robotic sensor network consists of a fleet of mobile sensor nodes,which can be spatiotemporally deployed in an EOI. Information-based sampling methodsutilize a limited number of mobile sensor nodes, which provide sufficient information froma field when deployed in relatively few critical locations. It is therefore essential to allocatesensor nodes to their optimal locations for receiving maximum possible information whileminimizing the environmental monitoring estimation error. To achieve this goal, sensornodes need to search dynamically within an infinite horizon with specific optimizing criteria[20], and strategies such as spatiotemporal sensor deployment can be used [21] [22].Outfitting sensor nodes with mobility makes a sensor network more efficient and flex-7ible by reorganizing the sensor network and reducing the number of sensor nodes. Thesemethods deploy a limited number of sensor nodes that can provide maximum useful in-formation [23] or achieve the least estimation error for an environment [3]. Krause et al.proposed a mutual information approximation method that improves the sensor node place-ment quality [10]. They tested this criterion using real-world datasets and determined thatmutual information can achieve lower root mean square error (RMSE) in the environmentalestimation than entropy. Du et al. proposed a near-optimal sensor node deployment strategyfor monitoring hydro-dynamics and water quality in urban areas. They used entropy [24]and mutual information [23] ] to gather maximum information in selected locations [25][26].Lan et al. developed Rapidly-exploring random cycles (RRC) to estimate a spatiotem-poral field in a dynamic environment by deciding the locations in the field where one ormore robots should reach to acquire data [27–29]. RRC first models the environment usingpast data, based on which mobile sensor nodes are deployed to explore the EOI. Locationsare then determined by minimizing the largest eigenvalue of the error covariance matrixof the field estimate error over an infinite horizon. By doing so, they seek to estimate theenvironment continuously with a minimum number of sensor nodes. Lan et al. proposeda new scheme called Rapidly-exploring Random Cycles star (RRCstar) that enhances RRCby escaping local minima [3]. RRCstar, a variant of RRC, seeks to find the lowest globalestimation error over the field by rewiring those deployments with high estimation error.1.3.4 Field Exploration Assisted by Reinforcement LearningOutfitting sensor nodes with mobility to monitor infinite horizon problems has improvedthe performance and efficiency of WSNs significantly. Mobility approaches deploy a lim-ited number of sensor nodes that can provide adequate amounts of useful information andachieve the minimum estimation error for an environment [30].Field exploration algorithms, such as the RRT, have been further proposed to efficiently8search a non-convex field [20]. Karaman et al. evaluated the quality of solutions providedby RRTs. The proposed algorithm, RRT∗, was proven to be asymptotically optimal as thenumber of samples increases [31]. Subsequently, sampling-based path planning algorithmshave been used to study sampled data on-the-fly while exploring [3, 9, 32]. However, theperformance of RRT-based algorithms is highly reliant on the distance measures that areapplied, and the exploration strategy has low efficiency.Sivamurugan et al. proposed a sampling-based policy iteration to explore an infinitehorizon efficiently [13]. Specifically, reinforcement learning (RL) was used to study therelationship of geodesic distance among the sampled points on-the-fly, which significantlyimproved the computational efficiency of RRT-based methods. However, only geodesicdistance was utilized as the cost function to calculate the rewards of the algorithm, whichneglected spatiotemporal correlations of the sampling data. With the help of RL, algorithmscan determine the best actions under the current state and converge to a near-optimal situa-tion [33–38]. Nevertheless, more computation time is consumed by the algorithm to exploreall possible states, as the environment and the states become more complex. Moreover, theapplications of RL-based algorithms have been limited to environments with directly ex-tractable features or to low-dimensional environments that are fully observable. Thus, DLhas been introduced to approximate similar states, which enables automatic feature learningthrough the gradient descent process [39, 40].Mnih et al. improved RL with the advantage of deep neural networks (DNNs) to addressthe issue about a complex environment [41, 42]. A DNN, which can generate a precisemodel of a complex nonlinear domain, is a useful tool in many fields, including imagerecognition [43], natural language processing [44], and bioinformatics [45]. Mnih et al.tested their Deep Reinforcement Learning (DRL) algorithms on the practical domain ofAtari 2600 games [46]. In their algorithm, DRL uses only the pixels and the game score asinputs and achieves performance at the professional human level. Pfeiffer and his colleaguesadopted a DRL model for path-planning problems with respect to sensor data and goal9information [47]. Robots can utilize such information to plan paths over an infinite horizonand efficiently reach most informative goal positions. Shiarlis et al. also proposed an RL-based scheme to avoid repeatedly calculating costly planning procedures [48]]; their methodachieved better performance with increased computational efficiency. However, the abovemethods are suitable only for end-to-end navigation and not for searching over an EOI.1.3.5 Spatiotemporal Field ReconstructionRecently, data-driven approaches have been developed to find principal locations for op-timal sensor deployment. After the sampled data from a spatiotemporal field are analyzed,and the principal locations are calculated, the entire spatiotemporal field can be recon-structed using a limited set of observations. Data-driven approaches are of low complexityand more amenable to implementation than model-based convex optimization approaches.These data-driven approaches are called sparse representations, which can define a rela-tively large number of regions, such asO(2N), with merelyO(N) examples [49].Numerouscomplex systems can be reconstructed and predicted using data-driven sparse representationapproaches. Applications include image reconstruction [50], image denoising [51–53], im-age super-resolution [54, 55], and structured signal recovery [56].Compressive Sensing (CS) is a widely used signal recovery method for unknown signalreconstruction that utilizes sparse representations [57, 58]. Brunton et al. proposed a CS-based sparse sensor placement method for characterizing or classifying a high-dimensionalsystem [59]. Their algorithm solves the `1 minimization problem of finding the measure-ment with the fewest zero entities for reconstructing the original signal. Lu et al. thenproposed a convolutional CS framework to avoid inefficiency and blocking artifacts in thetraditional CS algorithms [60]. Their algorithm senses the input image using a set of con-volutional filters and reconstructs the image using linear convolutional measurements. Al-though CS can recover a comprehensive class of signals, it has limitations in capturingspatiotemporal patterns and does not perform well for large signals. Moreover, CS-based10algorithms need to solve a linear program to recover low-dimensional signals, which istime-consuming and has low efficiency.In contrast, principal component analysis (PCA) extracts low-dimensional patterns andfeatures from high-dimensional systems. PCA transforms input data into their orthogonalcoordinate system and projects the data onto coordinates according to the significance ofthe variance. The selected principal components then retain the most variance in the dataand can easily reconstruct the system. Manohar et al. proposed an optimal sparse sensordeployment scheme based on QR factorization of the sampling data matrix and singularvalue decomposition (SVD) [4], which can outperform the CS-based approaches. Guo etal. developed a sampling technique based on sparse topological data analysis to reconstructa high-dimensional signal from a limited set of observations [61]. The optimal sparse setof samples was selected and used to reconstruct the input signal at high efficiency andprecision. Lu et al. used SVD to learn basis images from training data and perform sparsereconstruction of fluid flows from random samples [62]. They investigated the interplayof data sparsity in the underlying flow system and reconstructed the entire space using alimited set of random samples.A variant of PCA, called Sparse PCA (SPCA), has been developed as well. The princi-pal components of traditional PCA are usually linear combinations of all input data, whichcan lead to errors in interpreting the derived principal components [63–65]. Hence, to en-code the high-order information of the input data, Jenatton et al. developed an extension ofSPCA, which specializes in image recognition and reconstruction [66]. Hein et al. focusedon nonlinear eigenproblems, which comprise the fundamental part of many machine learn-ing and statistics problems [67]. They developed a generalized inverse power method thatguarantees convergence to a nonlinear eigenvector. Drmac et al. proposed a novel frame-work to construct a discrete empirical interpolation method [68]. Their algorithm utilizesQR factorization with column pivoting to provide near-optimal sampling for the recon-struction of nonlinear systems. The algorithm, based on structured regularization in which11sparse patterns are structured and stored in different sets of predetermined shapes, achievesstate-of-the-art performance when applied to clustering and SPCA problems. Erichson etal. modeled SPCA as a matrix factorization problem with orthogonality constraints [69].Their algorithm minimizes the subset of projection variables by incorporating various spar-sity regularizers. Jankova´ et al. suggested a de-biased SPCA to estimate the first vector ofloadings and the largest eigenvalue of the covariance matrix [70]. They identified the bias inthe traditional Lasso-based estimator and guaranteed asymptotic normality of the proposedestimator. Mollenhauer et al. provided a solid analytical foundation for the eigenvalue de-composition of operators on reproducing kernel Hilbert spaces, and extended the method ofusing SVDs to solve the eigenvalue decomposition problem [71].1.3.6 Optimal Sparse Representation Through Deep LearningRecently, DL methods have been utilized to improve the accuracy of signal reconstruc-tion. Zeiler et al. investigated the functionality of convolution networks and deconvolutionnetworks in image classification [72, 73]. Convolution networks downscale an originalimage into a low-rank feature space, which can be regarded as compressed information.In contrast, the unpooling of deconvolution networks reconstructs the compressed featureback to the original signal. With the help of deep convolution/deconvolution networks, theperformance of compression and reconstruction improves significantly. Mousavi et al. de-veloped a DL framework for recovering structured signals [56], using DL frameworks tocapture nonlinear relationships. The proposed stacked denoising autoencoder is an unsu-pervised feature learner that captures statistical dependencies between various observationsand significantly improves signal recovery quality. Loh et al. built a DL framework to learnthe event reconstruction problem by placing photon sensors in an antineutrino detector [74].Furthermore, generative models have succeeded in modeling data distributions usingneural networks. Kingma et al. proposed variational auto-encoders (VAEs) [75] to encodea complex signal into a low-dimensional space and decode it. Goodfellow et al. then de-12veloped generative adversarial networks (GANs), which either generate superficial imagesor automatically encode input signals into compressed signals, using two neural networksthat contest with each other [76]. Bora et al. proposed a generative model based on thecompressive sensing method to use fewer measurements to achieve high accuracy [77]. Thegenerative model is built upon VAEs and GANs. The researchers also proved that giventhe Lipschitz constant of the proposed algorithm, recovery performance can be guaranteedwithin a certain probability and error rate. Calculation of the Lipschitz constant was in-vestigated in [78], where Gouk et al. discussed the effects of regularizing neural networkswith Lipschitz continuity. They provided a method for calculating the Lipschitz constantof a neural network and then used the calculated Lipschitz constant to formulate the train-ing process of the neural network. The performance of neural networks has cumulativelyimproved through regularization using Lipschitz continuity.However, DL-based approaches usually encode the original signal into a low-dimensionalspace with a complex model, which can exacerbate the difficulty of deciding the most criti-cal observation locations.1.4 Contributions and Organization of the DissertationThe main contributions of the dissertation are listed below.• A near-optimal sensor node deployment strategy for deploying a minimal number ofsensor nodes in a spatiotemporal field is developed. This strategy relocates sensornodes to their best sensing locations while reducing the required number of sensornodes without losing information. The approach is based on an environmental modeland the linear dependence of sensor readings. In particular, the spatiotemporal corre-lation of sensor node deployment in a large geographic EOI is minimized. It is alsoproven that the optimization objective function, which uses the prior estimation error,is submodular and results in a near-optimal solution. Compared with other sensornode deployment strategies and baseline algorithms, this new strategy requires much13fewer sensor nodes to achieve the same, or lower, estimation error. The proposed al-gorithm is applied to a real-world dataset, which is collected by the developed roboticsensor nodes. The in situ tests confirm that the proposed near-optimal sensor nodedeployment strategy can achieve lowest field estimation error with least sensor nodes.• A DRL framework is proposed to improve the efficiency of searching for the mostinformative sampling locations. The parameterized sampling locations in an infinitehorizon space are modeled according to their spatiotemporal correlations and sub-ject to various constraints, including field estimation error and information gain. Asthe model-based information gain can be efficiently calculated over an infinite hori-zon, the effectiveness of the sampling locations is learned during exploration by therobotic sensors. The robotic sensors are then instructed to avoid unnecessary sam-pling locations in future iterations. Also, it is proven that the proposed algorithm iscapable of effectively searching for near-optimal sampling locations and guaranteeinga minimum field estimation error. Simulation based on the Naval Oceanographic Of-fice Regional Navy Coastal Ocean Model (NCOM) is presented, which demonstratesthe significant enhancements made by the proposed algorithm. Compared with tradi-tional approaches, such as information theory, which is based on a greedy approach,or random sampling, the proposed method shows superior performance with regardto both estimation error and planning efficiency. Besides, a model-based statisticalapproach is developed to reconstruct the entire spatiotemporal field with the obser-vations collected from the most informative sampling locations. The results showthat the most informative sampling locations generated by the DRL framework cannear-optimally condition the spatiotemporal field reconstruction.• A model-free approach is developed to reconstruct the spatiotemporal field from lim-ited observations. The sampling locations in an infinite horizon are obtained viacanonical basis compression according to their spatiotemporal correlations and sub-14ject to reconstruction performance. Then, a simulated annealing-based approach isutilized to optimize the canonical basis for the optimal sampling locations. The sparserepresentations of the spatiotemporal field given by the canonical basis indicate opti-mal sampling locations. Auto-encoders are then tuned to reconstruct the spatiotem-poral field from the observations in the selected sampling locations. In this manner,reconstruction from the sparse representation to the spatiotemporal field is learned.It is also proven that the proposed model is Lipschitz continuous and that the sparsesampling strategy is sufficient for reconstruction.The organization of the dissertation is as follows:Chapter 2 introduces the developed near-optimal sensor node deployment strategy, whichis termed Rapid Random exploring tree with Linear Reduction and Connectivity (RRLRConnectivity). Linear correlations are investigated among the sampling data to identifyspatiotemporally redundant observations. The corresponding sensor nodes of the redundantobservations are then removed to reduce the number of deployed sensor nodes, with regardto the network connectivity. Besides, details regarding the proof about submodularity andnear-optimal guarantee are provided.Chapter 3 presents a novel spatiotemporal field exploration method called Deep Rein-forced Learning Tree (DRLT), which can learn the information gain at sampling locationsduring exploration. The area with a lower information gain will subsequently have de-creased probability of exploration. In this manner, the speed of finding the near-optimalsampling locations will increase.Chapter 4 presents a model-based sensor deployment strategy for spatio-temporal fieldreconstruction. It shows that the model-based informative sampling strategy that is proposedin Chapter 3 can near-optimally condition the reconstruction of the spatiotemporal field.Also, a statistical method is developed to learn the linear mapping from observations of theinformative sampling locations to the entire spatiotemporal field.Chapter 5 presents a model-free deep learning framework for spatiotemporal field recon-15struction, called Spatiotemporal sparse Auto FiEld Reconstructor (SAFER). The canonicalbasis is utilized to compress the spatiotemporal field into limited observations, and nonlin-ear mapping from the observations to the original spatiotemporal field is learned. Further,proofs regarding Lipschitz continuity of the deep learning framework and sufficiency forthe reconstruction of sparse representation are provided.Chapter 6 illustrates the hardware design, the implementation of the robotic sensor mod-ule, and the results of the in situ experimentation. An environmental model is establishedbased on the sampling data from the conducted in situ tests, according to the metrics intro-duced in Chapter 2. Then, the proposed methods are applied to the established environmentmodel to test its effectiveness.Chapter 7 concludes the dissertation by summarizing the main contributions, and indi-cating possible future work.16Chapter 2Spatiotemporal Sensor Deploymentusing Minimal Sensor Nodes2.1 IntroductionIn recent decades, different types of pollution issues have significantly raised the pub-lic awareness, where water pollution has become a key concern. Water-related problemsseverely affect peoples health not only in developing countries but also in developed coun-tries. In this context, an environmental monitoring system has been designed in our lab-oratory to sample the critical water quality attributes, determine the quality of water, andinform the public and relevant authorities the obtained results with view to possible correc-tive actions.The layout of the developed environmental monitoring system is shown in Figure 2.1.In the beginning, technologies such as satellite imaging, mobile WSNs, and static WSNsmay be utilized for sampling the conditions at several locations in the field. Then, basedon the time series data sourced from different locations, methods like kriging can be usedto establish a spatiotemporal model for the existing environment [79]. Next, based on thisenvironmental model, field exploration method and heuristic algorithms can be developed tofind the optimal sensor node deployment locations that meet specific optimization criteria17(e.g., receiving most information, achieving least estimation error, minimizing the powerusage). Given these deployment locations, the redundant sensor nodes may be removed tokeep a minimal number of sensors in the field to reduce infrastructure costs. In this manner,the deployed sensor nodes can form a WSN that persistently monitors the most informativelocations in the field for sensing.Search theenvironmentModel the currentenvironmentFind optimal locationswith minimal sensornodesDeployment of mobilesensor nodesEnvironmentalchangedetectionStartPersitentlymonitor theenvironmentFigure 2.1: The framework of optimal sensor node deployment with minimal sensornodes.It is essential to preserve the network connectivity of a WSN while eliminating possi-ble redundant sensor nodes. Traditionally, the selection of informative sampling locationsdoes not consider the correlation of sampling data or the network connectivity. Hence, thespatiotemporal filed might be over-sampled or under-sampled. Over-sampled sensor de-ployment strategies do not consider the spatiotemporal correlation of the sampling data andwill result in sensor node redundancy. Under-sampled sensor deployment strategies cannotprovide sufficient sensor nodes to ensure network connectivity.In this chapter, a method is proposed to identify proper nodes and remove linearly redun-dant sensor nodes efficiently while maintaining network connectivity. Through this process,the optimal sensor node deployment locations that achieve the lowest estimation error andshare least linear correlation are determined. Based on these selected deployment locations,18mobile sensor nodes can be deployed to monitor the geographic EOI persistently. When achange in the geographic EOI is detected, the above process has to be repeated to generatea new model that satisfies the current environment.The underlying method as proposed in this dissertation is called Rapid Random explor-ing tree with Linear Reduction and Connectivity (RRLR Connectivity). RRLR Connectivityis a near-optimal strategy of sensor node deployment, which is developed here to optimizesensor node deployment locations for field estimation. Initially, it is proved that using priorestimation error as the optimization function has a characteristic of submodularity, whichwill result in a near-optimal solution. Then, an RRT-based approach is developed to searchfor possible sensor node deployment locations by building space-filling trees. During thetree expansion, sampling locations with the lowest prior estimation error are found. Then,linearly correlated sampling data are investigated, and the redundant sampling locationsare eliminated according to the network connectivity. In essence, the proposed methodsignificantly reduces the number of sensor nodes required for persistent monitoring whileachieving a lower or similar prior estimation error.2.2 Preliminaries2.2.1 Modeling of EnvironmentA common environmental model that is used for geological and environmental systemsis chosen in this dissertation [79]. Consider a discrete time-variant scalar field vψ[t] ⊂ R,where ψ ∈ A is an arbitrary point in the two-dimensional EOI (A), which can be non-convex and filled with obstacles. Then, the environmental model can be represented asthe output value of desired locations at a given timestamp. The model can be expressedas a function of the temporal information matrix (A), the spatial information matrix (Sψ),location in the field ψ and time t:vψ[t] = F (A,Sψ, ψ, t) (2.1)19The spatiotemporal environmental model is considered to be separable [80]. Thus, vψ[t]may be decomposed as the dot product of the spatial information Sψ and the time-varyingcoefficient vector x[t][3]:vψ[t] = Sψx[t] + v[t] (2.2)where v[t] ∼ N(0,Rv) is Gaussian white noise, and Rv is its covariance matrix, which ispositive semidefinite. The noise is assumed to be time independent with E{v[i]v[j]T} =0,∀ i 6= j. For simplicity, the variance of different noise is assumed to be the same [80].The temporal information matrix A, can be used for calculating the time-varying coef-ficient vector x[t] = [xt1 · · · xtk ]T , which is regard as the state variable of a linear discretetime-invariant system:x[t+ 1] = Ax[t] + w[t] (2.3)where w[t] ∼ N(0,Rw), is Gaussian white noise and Rw is its covariance matrix, whichis positive semidefinite. Also, E{w[i]w[j]T} = 0,∀ i 6= j. Since v and w are uncorrelated,E{w[i]v[j]T} = 0,∀ i, j. In this dissertation, temporal information matrix A and its covari-ance matrix Rw can be learnt from sampled data at the station using subspace identificationmethods [81, 82], with the raw observation data from National Centers for EnvironmentalInformation (NCEI) of U.S. National Oceanic and Atmospheric Administration (NOAA)[83–85].Then, the spatial information matrix Sψ is the row vector consisting of k spatial basisfunctions [79]:Sψ = [s1(ψ), · · · , sk(ψ)] (2.4)Each basis function is represented by a Gaussian basis function as si(ψ) = Ke−||(ψ)−(ψi)||22σ2 ,where ψi represents the location of the ith basis center, and σ is its variance.In this manner, Equations 2.2 and 2.3 jointly represent a linear discrete time-invariantsystem, and the spatiotemporal information is captured. Although due to its robustness theGaussian radial basis function is selected to represent the basis, the proposed method does20not rely on the specific type of basis function.2.2.2 Estimation Error of Deployment LocationIt has been shown that the linear Kalman filter is suitable for estimating time-varyingcoefficients, and its prior error covariance matrix can be used as a quality measure for thecriteria. It is an optimal estimator that minimizes the mean square error (MSE) of thesystem [80] [86]. The prior estimation of the linear Kalman filter is used for updating theerror covariance matrix as:P[t+ 1|t] = AP[t|t]AT + Rw (2.5)where:(2.6)P[t|t] = P[t|t− 1]−P[t|t− 1]STψi × (SψiP[t|t− 1]STψi + Rv)−1SψiP[t|t− 1]Hence, the prior estimation error matrix of the linear Kalman filter can be representedas:(2.7)P[t+ 1|t] = AP[t|t− 1]AT−AP[t|t− 1]STψi × (SψiP[t|t− 1]STψi + Rv)−1SψiP[t|t− 1]AT + RwThe maximum eigenvalue of the prior covariance matrix, pi(P[t+ 1|t]), is the objectivefunction for optimization because the largest eigenvalue represents the worst estimation ac-curacy at the current point. Furthermore, it has been shown that to remove the effect oftransient high error of estimation in the beginning, the prior error (persistent monitoringerror) should be evaluated as time goes to infinity. Also, when there are several deploy-ment locations in each deployment strategy, estimation about a single sensor node in onedeployment strategy will converge as time goes to infinity [3]. It can be represented as:(2.8)Pi∞ = APi∞AT −APi∞STψi × (SψiPi∞STψi + Rv)−1SψiPi∞AT + Rwwhere Pi∞ indicates the prior error covariance for the ith deployment location of strategyN as time goes to infinity.21Then, the minimum largest eigenvalue in each possible deployment strategy is the cost:Γ(N ) = pi(PN∞) = maxi=1,2,...pi(Pi∞) (2.9)where N is the deployment strategy containing several deployment locations that are se-lected by the algorithm, Pi∞ indicates the prior error covariance for the ith deploymentlocation of strategyN as time goes to infinity, and pi(PN∞) denotes the largest eigenvalue ofthe prior error covariance matrix for strategy N as time goes to infinity.2.2.3 Matrix Linear RedundancyAssume that a set of sensor nodes, N = {n1, n2, · · · , nk}, is persistently deployed ina geographic EOI. Denote sensor readings for M snapshots asM = {m1,m2, · · · ,mk},where m ∈ RM×1. Hence, the readings of these k sensor nodes can be placed in a matrixM ∈ Rk×M as M = [m1,m2, · · · ,mk]T . The goal is to find if there are readings mi thatcan be linearly represented by other data rows in matrix M; i.e., M does not have the fullrank, and these rows can be expressed as a linear combination of others:mi =∑kj ·mj, (kj 6= 0 and mi,mj ∈M). (2.10)In this way, the redundancy of the node deployment can be indicated by the linear depen-dent relationship of sensor node readings when they are persistently deployed. Any nonzeromj with the associated nonzero kj is then defined as a correlated neighbor of mi. Then, mitogether with its correlated neighbors comprise a linearly correlated set Φ. If several nodesare linearly dependent in set Φ, information from some of these nodes (called redundantnodes) can be inferred from other nodes in this set.222.2.4 WSN Connectivity of the Informative Sampling LocationsThe deployed sensor nodes comprise a wireless sensor network. During the processof redundant sensor node elimination, the network connectivity might be broken if the re-maining sensor nodes are far from each other. As shown in Figure 2.2, red sensor nodesdenote the ones that have spatiotemporally correlated sampling data, and one of them canbe removed. However, eliminating the sensor node as in Figure 2.2 (a) might break theconnectivity of the WSN due to the increase of communication distance. Instead, removingthe sensor node as in Figure 2.2 (b) can preserve the sensor network connectivity whileeliminating the redundant sensor node.(a) Network connectivity broken due toredundant sensor node removal.(b) Network connectivity preserved afterredundant sensor node removal.Figure 2.2: Illustration of network connectivity. (Red sensor nodes have spatiotempo-rally correlated sampling data.)In this dissertation, the network connectivity is assumed to be based on the communi-cation distance of the robotic sensors and does not depend on the node size. Therefore, thenetwork connectivity can be formulated as:∀ni, nj ∈ N ,∃k-hop communication path. (2.11)For RRT-based informative sampling strategies, the distance between any 1-hop sensor23nodes will be less than its exploration step size. Hence, making the step size less than thecommunication distance can preserve the sensor networks connectivity. However, remov-ing the redundant sensor nodes may break the sensor network connectivity, as shown inFigure 2.2. Thus, the constraint of network connectivity should be considered during theoptimization of sampling locations.2.2.5 Problem FormulationNow, the informative sensor deployment problem can be formulated mathematically.Assume that the spatial sensing field is geographically infinite since every location in thefield might be explored to obtain a better estimation of the environment. A small subset oflocations, N ⊆ A, can then choose to best estimate the environment. The quality of theselected deployment strategy N can be evaluated by Γ(N ), which denotes the persistentestimation error and the infinite horizon cost. Each deployment strategyN may have severaldeployment locations, and every sampling location will persistently sample data from theenvironment and generate the data matrix M. Moreover, redundant sensor nodes should beeliminated while obtaining sufficient information and preserving the network connectivity.The problem can be formally defined as:minN⊆AΓ(N ), (2.12)subject to: |M|= rank(M)Equation 2.11242.3 Rapidly-exploring Random Tree with ConnectedLinear Reduction2.3.1 Optimization Scheme for Near-optimal Sensor NodeDeploymentSelecting a small subset of locations over the infinite horizon to minimize the priorestimation error has been shown to be NP-hard [10]. Therefore, it is unlikely to find theoptimal subsetN ⊆ A of this problem. Instead, a near-optimal solution is provided. In thischapter, Rapid-exploring random tree (RRT) is used to search a non-convex geographic EOIand find near-optimal solutions. RRT can efficiently build a space-filling tree G = (V,E)over the infinite horizon, where V represents the vertices andE represents the edges. Space-filling treeG contains many spanning trees, whose vertices indicate the possible deploymentlocations within the field. Hence, the prior estimation error of the corresponding samplinglocations can be determined, and the spanning tree with the smallest estimation error can becomputed.The critical factor for near-optimal optimization is that more observation will result inless information gain and estimation error reduction. Moreover, it leads to convergenceof the maximization or minimization of the cost function. This concept is formalized assubmodularity [87]. A function Γ is said to be submodular if:∀A∗ ⊆ A∗∗ ⊆ A and n ∈ A \ A∗∗Γ(A∗ ∪ n)− Γ(A∗) ≥ Γ(A∗∗ ∪ n)− Γ(A∗∗)Proposition 2.1 show that the cost function of the proposed method is submodular.Proposition 2.1. Function Γ(N ) is submodular.Proof. According to Equation 2.9, let N ⊆ N ∗ ⊆ A and a location n ∈ A \ N ∗. The goal25is to satisfy following equation:Γ(N ∪ n)− Γ(N ) ≥ Γ(N ∗ ∪ n)− Γ(N ∗). (2.13)Then, let N ∪A = N ∗,Γ(N ∗) = pi(PN ∗∞ ) = pi(PN∪A∞ )= max(maxi∈Npi(Pi∞),maxi∈Api(Pi∞))= max(α, β),(2.14)where α = maxi∈Npi(Pi∞), β = maxi∈Api(Pi∞)). Thus, Γ(N ) = maxi∈Npi(Pi∞) = α.Similarly,Γ(N ∗ ∪ n) = pi(PN ∗∪n∞ ) = pi(PN∪A∞ ∪ n)= max(maxi∈Npi(Pi∞),maxi∈Api(Pi∞), pi(Pn∞))= max(α, β, γ),(2.15)where pi(Pn∞) = γ. Then,Γ(N ∪ n) = pi(PN∪n∞ )= max(maxi∈Npi(Pi∞), pi(Pn∞))= max(α, γ).(2.16)Next,Γ(N ∪ n)− Γ(N )− Γ(N ∗ ∪ n) + Γ(N ∗)= max(α, γ)− α−max(α, β, γ) + max(α, β).(2.17)If γ ≤ α, Equation 2.17 equals α− α−max(α, β) + max(α, β) = 0.26If γ > α, when γ < β, Equation 2.17 equals:γ − α−max(γ, β) +max(α, β)→γ − α− β + β→γ − α> 0.If γ > α, when γ > β, Equation 2.17 equals:γ − α− γ +max(α, β)→max(α, β)− α≥ 0.Thus,max(α, γ)− α−max(α, β, γ) + max(α, β) > 0→Γ(N ∪ n)− Γ(N )− Γ(N ∗ ∪ n) + Γ(N ∗) > 0→Γ(N ∪ n)− Γ(N ) > Γ(N ∗ ∪ n)− Γ(N ∗),which satisfies Equation 2.13.Next, Proposition 2.2 proves that using RRT-based approach to find sensor deploymentlocations guarantees a near-optimal result.Proposition 2.2. RRT-based sampling algorithm finds near-minimal estimation error, asthe error monotonically decreases with the algorithm iterates.Proof. Let Ni be the set of feasible deployment strategy produced in ith iteration, and thecorresponding estimation error is σi. Similarly, the set of feasible deployment strategy ini + 1th iteration is Ni+1 and the estimation error is σi+1. Then, denote the new searching27location in i + 1th iteration as n. According to Equation 2.12, the deployment strategy isgreedily selected to be the one with minimal estimation error:σi+1 = minN⊆Ni+1Γ(N )= min( minN⊆NiΓ(N ), minN⊆Ni+1\NiΓ(N )).(2.18)Then,σi+1 = min(σi, minN⊆Ni+1\NiΓ(N )) ≤ σi. (2.19)Hence, the estimation error will decrease monotonically as the algorithm iterates. Be-sides, according to the principle of submodularity, the greedily selected deployment strategyis near-optimal. Thus, the result of RRT-based sampling algorithm has the near-minimal es-timation error as it converges according to the iteration.In this manner, the near-optimal sensor deployment locations can be found. Besides, thelinear dependence of the sampling data can be captured and eliminated by detecting theirlinear correlation while expanding the spanning tree. The algorithm of RRLR Connectivityis presented as Algorithm 2.1.The inputs of Algorithm 2.1 are the exploration step size ρ, matrices A and S thatcontain the spatiotemporal information, and the geographic EOI (A). Line 1 initializes thealgorithm by setting up the starting location of the space-filling tree, where G denotes thespace-filling tree generated by the rapid-exploring random tree, and V and E stand for itsvertices and edges, respectively. Then, line 2 initializes the estimation error and the sensordeployment set. Also, τ denotes the set of computed deployment locations, which shouldbe empty in the beginning, and η records the infinite horizon cost calculated through priorestimation error on locations in set τ .Next, the algorithm is executed iteratively to find near-optimal sensor deployment loca-tions. In each iteration, a location vrand will be randomly selected in A. Then, the nearestvertex in the space-filling tree G will be calculated as vnearest, and the new sampling loca-28Algorithm 2.1: Rapid-exploring Random tree with ConnectedLinear Reduction.Input: ρ, A, S, AOutput: η, τ1 V ← φ0, E ← ∅, G← (V,E);2 τ ← ∅, η ←∞;3 for i = 1 to T do do4 vrand = RAND NODE(A);5 vnearest = NEARVERTEX(vrand, G);6 vnew = STEP(vnearest, vrand, ρ);7 Nˆ = GenPath(G, vnew, vnearest);8 η˜ = minN˜∈Nˆ Γ(N˜ );9 τ˜ = arg minN˜∈Nˆ Γ(N˜ );10 if η˜ < η then11 η = η˜;12 M = genSample(A,S, τ˜ , i,M );13 M∗, I∗ = LinearDependentDetection(M, );14 G, τ = RedundancyElimination(M∗, I∗, G, τ˜ , dcom);15 end16 V ← V ∪ vnew;17 E ← E ∪ edge(vnearest, vnew);18 endtion vnew is selected by moving from vnearest in the direction of vrand at most ρ. Next, line 7finds spanning trees Nˆ of G that pass through the vertices vnew and vnearest. Nˆ indicates thepossible sensor deployment strategies. The strategy with minimal estimation error withinset Nˆ is calculated in line 8 to line 9. Therefore, as shown in line 10 to line 15, if a sensordeployment strategy that has the lowest estimation error is found, it will be recorded andchecked for sampling location redundancy.Line 12 generates sampling data based on the environment model and the sampling lo-cation. Then, the data matrix M is passed to Algorithm 2.2 to detect linearly correlatedneighbors, and the results of Algorithm 2.2 are used to eliminate redundant sampling loca-tions with regard to network connectivity according to Algorithm 2.4.Last, the new vertex and its edge are added to the space-filling tree for next iterations.RRLR Connectivity runs on a local or cloud server, which can communicate with each29mobile sensor node. Currently, it is assumed that no communication is needed among thesensor nodes. As a result, information will be processed by the local server. Instructions onthe sampling locations are sent to the mobile sensor nodes after computation.2.3.2 Linearly Dependent Sampling Location Elimination withregard to Network ConnectivityUsually, multiple homogeneous sensor nodes will be deployed in the spatiotemporalfield. Each of them will generate a time series of data for each type of sensor. Hence,sampling data collected in different locations should be checked to examine the redundancyof the deployed locations. If linearly correlated time series exist, the matrix of data willnot be of full rank, and the dimension of the data matrix can be reduced to its rank byeliminating the redundant rows. Accordingly, it is not necessary to deploy sensor nodes inall required locations because the information provided by these redundant sensor nodes canbe reconstructed [88, 89]. However, arbitrary removal of redundant sampling data accordingto the rank of the data matrix might result in information loss. Figure 2.3 illustrates acase for preserving information of the sampled data while removing redundant samplinglocations.Figure 2.3 (a) presents a 5-by-M matrix with redundant entities, whose rank is 3. There-fore, there exist two rows of the matrix that can be removed while not resulting in informa-tion loss. Assuming that the rows of the matrix that have the same color are linearly corre-lated, each row can be reconstructed by the rest of the rows with the same color accordingto Equation 2.10. Note that the nodes that have gradient color from yellow to blue are lin-early correlated with rows with both colors. In this manner, the rows with gradient colorcontain the most duplicated information, and removing them may not result in informationloss since they have a higher chance to be reconstructed.Figure 2.3 (b) presents a case where randomly removing redundant rows of the datamatrix may result in information loss. If the first row and the second row of the matrix areremoved due to the rank of the matrix, the information will be lost as the removed rows30(a) A 5-by-M matrix with redundantentities.(b) The redundant removal strategy thatresults in information loss.(c) The removed entities can bereconstructed.Figure 2.3: Illustration of redundant sensor elimination.cannot be reconstructed from the remaining rows. Therefore, a possible redundant removalstrategy is to remove the rows with the highest redundancy first. As shown in Figure 2.3(c), removing the second row and the third row of the matrix does not result in informationloss.During the process of removing redundant rows in the data matrix, the sensor networkconnectivity should also be considered. Removing all the redundant sensor nodes mightbreak the connectivity of the wireless sensor network. Thus, the constraint concerning the31wireless network connectivity is added:∀ni ∈ N ,∃nj ∈ N : dist(ni, nj) ≤ dcom, (2.20)where ni and nj are two locations belonging to the deployment set N , and dcom is thethreshold for the largest communication distance. If there exists at least one location whosenearest neighbor is further than τcom, the connectivity of the network is broken.Hence, the redundant removal strategy of RRLR Connectivity is proposed to detect andeliminate redundant sensor nodes with regard to information loss and network connectivity.This method first identifies linearly correlated neighbors and then removes the redundantsampling locations. As a precursor to the algorithm, a way to detect linearly correlatedsampling data for a given matrix M is presented as Algorithm 2.2.Algorithm 2.2: Linearly Correlated Neighbor Detection.Input: M, Output: M∗, I∗1 M,k ← size(M);2 r ← rank(M);3 I = eye(M);4 M∗, I∗ = GaussianElimination(M, I, ,M, k, r);As shown in Algorithm 2.2, the inputs are the data matrix M and the noise threshold. Line 1 calculates the size of the input data matrix, and line 2 calculates its rank. Then,line 3 builds an identity matrix at the size of sampling locations. Identity matrix records therow addition actions of the Gaussian elimination and contains the information of linearlycorrelated sets. The use of the identity matrix ensures that the ith row in I represents thesame row in matrix M. By carrying out the same row addition action (i.e., add the samerow with the same coefficient) as in matrix M, I can store the information about how rowaddition is done in the matrix M. Then, data matrix, identity matrix, noise threshold, matrixsize, and the rank of the data matrix are passed to Algorithm 2.3 to detect linearly correlatedneighbors.32Algorithm 2.3: Gaussian Elimination.Input: M, I, ,M, k, rOutput: M∗, I∗1 i =1, j = 1;2 while i ≤M do3 imax = arg maxi=j,···,M abs(M[i, j]);4 if M[imax, j] ≤ then5 j + +;6 else7 pivot = M[imax, j];8 swap(M[i, ·],M[imax, ·]);9 swap(I[i, ·], I[imax, ·]);10 M[i : M, j : k]− = M[i : M, j]× J1,k/pivot;11 I[i : M, j : k]− = I[i : M, j]× J1,k/pivot;12 i+ +, j + +;13 end14 M∗ = M, I∗ = I;Line 1 of Algorithm 2.3 initializes the index of the data matrix. Then, line 3 finds thepivot of the ith row. This step improves the stability of the algorithm and avoids addingthe same row multiple times. Then, the effect of noise is considered, and a threshold isutilized to remove such effect. Line 4 checks whether the value of the pivot is less than forthe current iteration. If so, the current iteration should be skipped because it is consideredas a zero row.Line 7 to Line 12 carry out row addition for each nonzero row. Line 7 records thevalue of the pivot of this iteration. Then, line 10 carries out row addition for removingrow redundancy, where J is a 1-by-k matrix of ones. At the same time, line 11 recordsthe row addition action in identity matrix I. When the iteration ends, there should be zerorows in the output matrix M∗ if the input matrix M has row redundancy, and the numberof zero rows plus the rank of the input matrix M should be equal to the row dimensionM . Moreover, the input identity matrix I records the information of how row addition isoperated in the matrix M, and such information is stored in the output matrix I∗.Removing one redundant row found by Algorithm 2.3 will not result in information loss33since it can be linearly represented by other rows. However, randomly removing multiplerows in the data matrix may result in information loss, as shown in Figure 2.3.Once the linearly correlated rows that stand for the corresponding redundant nodes havebeen detected, they can be relocated to other uncovered places or completely removed, toreduce the cost of resources. The fundamental requirements in the removal of redundantrows from a correlating cluster are: 1. not breaking the correlation of any clusters, and2. not resulting in information loss or broken network connectivity. Typically, differentclusters might share the same redundant nodes, and there remains uncertainty about whetherremoving such nodes will break the linear correlation in other clusters. Besides, as theremight be multiple redundant nodes in all sampling nodes, randomly removing all of themmight result in information loss if the sampling data from the removed nodes cannot bereconstructed.To this end, redundant sensor readings are found by utilizing the output matrices M∗, I∗through Algorithm 2.2. The row addition information about the Gaussian elimination ofdata matrix M is stored in I∗. Zero rows of the data matrix are the redundant ones, whoserow addition information is stored in the corresponding rows of the matrix I∗, and the col-umn indices of nonzero elements of I∗ represent the linearly correlated rows in matrix M.Therefore, linearly correlated sampling data can be recovered from I∗.Algorithm 2.4 aims to eliminate redundant rows and the deployment of the correspond-ing sensor nodes. This algorithm reduces the size of the deployment set by discovering thelinear redundancy among the collected sampling data in different sensor nodes, based onthe outputs of Algorithm 2.2. Moreover, the network connectivity is preserved during theelimination of redundant deployment locations.Algorithm 2.4 first finds the set that stores the redundant sampling locations from line2 to line 4. All rows (sensor nodes) within the linearly correlated set are picked into theredundant sampling location J . Then, only |M∗|−rank(M∗) sensor nodes in the set Jshould be selected as shown in line 7. Besides, as there are multiple options for removing34Algorithm 2.4: Redundancy elimination w.r.t. connectivity.Input: M∗, I∗, G, τ, dcomOutput: G, τ1 J ← ∅;2 foreach max(M∗·i) ≤ and I∗ij ≥ do3 J ← J ∪ j;4 end5 J = |J |;6 foreach j ∈ J do7 while |J |≥ J + r −M do8 if min-dist(τ(j), τ \ τ(j)) < dcom and |τ |< |M∗|−rank(M∗) then9 τ \ τ(j);10 G(τ(j)) = ∅;11 J \ j;12 end13 end14 endthe redundant sensor nodes to reduce the size of matrix M to its rank, network connectivityshould be considered during this process. As line 8 shows, if the minimal distance betweenthe redundant deployment location and the entire deployment strategy τ \ τ(j), which iscalculated by min-dist function, is less than the threshold, this redundant deployment canbe removed. In this manner, the network connectivity will not break when removing theredundant sensor nodes.2.4 Simulation ResultsNumerical simulations are conducted using the dataset that was generated from a com-putational fluid dynamics (CFD) model to simulate a 2-dimensional spatiotemporal fieldbased on COMSOL software [90]. Real-world experiments usually have restrictions onthe resources, which calls for the deployment of a limited number of sensor nodes in thefield. CFD simulations avoids this problem by generating an unlimited number of sensinglocations in the field. The test data generated by COMSOL software contains a simulationof three heat sources in a squared aquatic field. It also considers the effects of diffusion,35horizontal heat transfers by small flows and the influence of obstacles. In the present work,a simulation generates a time series for 50 random sampling locations with noise in a 50-meter by 50-meter square field. Figure 2.4 illustrates the sensor deployments. Orangesquares denote sensor node deployment locations, and the blue area indicates the aquaticenvironment.Figure 2.4: Deployment locations in CFD simulation.The field estimation uncertainty is calculated to validate the effectiveness of the gener-ated near-optimal deployment locations. Then, a color map is drawn to indicate the esti-mation uncertainties according to the given deployment locations. High uncertain area hasbright color, and lower ones is shown in a dark color. The estimation uncertainty is calcu-lated for each point ψ over the field A, which is denoted by its variance V (ψ), ψ ∈ A. The36variance at location ψ can be calculated as:V (ψ) = E[(Sψ(x[t]− xˆ[t])) · (Sψ(x[t]− xˆ[t]))T ]= tr(P[t]STψSψ),(2.21)where tr function calculates the trace of the matrix.Before applying RRLR Connectivity to these datasets, the parameters should be tuned tosatisfy the environment of the sampling area. Matrices A,S,Rw and Rv, are studied fromthe data sampled from five sampling locations. In this case, the number of basis functionsis set to five. The parameter values for si(ψ) = Ke−||(ψit)−(ψj)||22σ2 are chosen as: σ2 = squaresum of the basis function variance, and K = average of the entire vector. The number ofiterations is set to 2000 and the step length is 10 meters.The comparison algorithms, RRC and RRCstar are executed using the same parame-ter values. Besides this, two algorithms, random search (RS) and greedy algorithm, areexecuted as benchmark problems with the same common parameters. RS moves aroundrandomly from the initial point to different sampling locations. Then, the deployment withminimal estimation error is computed from all locations visited in the field. Greedy Algo-rithm is based on RS, but it does not take new observations randomly. It takes four (GD4)or eight (GD8) random attempts in the current location. Each attempt will generate a newobservation location near the current location of the sensing agent. Then, together withthe existing observations, the action leading to lower estimation error will be picked andexecuted as the next step. Also, a version of RRLR Connectivity that did not consider thenetwork connectivity, which is termed as RRLR, is added for benchmark. All experimentswere conducted on a PC with 4.0 GHz Quad-Core CPU and 16 G memory.Figure 2.5 presents the simulation results generated from the CFD dataset using RRT-based algorithms. RRC requires 10 sensor nodes to monitor the selected environment per-sistently while RRC needs 12, as indicated in Figure 2.5 (a)-(b). In contrast, RRLR andRRLR Connectivity reduce 9 and 4 redundant deployment locations, respectively, with the37help of Linear Reduction. They only require 12 and 6 sensor nodes in the field, respectively.Comparing Figure 2.5 (a)-(d), RRLR-based algorithms remove several closely deployedsensor nodes according to their linear redundancy in the sampling data. This process notonly reduces the number of sensors in the field but also helps the system to explore morearea in the field with limited resources. Besides, by eliminating redundant deployments,RRLR and RRLR Connectivity explore a more deployable area (i.e., the bottom left area inFigure 2.5 (c)-(d)), which will reduce the field estimation uncertainty, as indicated in Figure2.7.Figure 2.5: Deployment results generated by RRT-based algorithms.38Figure 2.6 (a) - (c) present the results generated by the random search and the greedyalgorithm, where too many sensor nodes are stuck in a narrow area of local minima andcannot escape. Especially, RS is stuck around its initial search point as it picks the nextsampling location randomly. Although the greedy-based approach has good coverage onexploring the EOI, the results easily entrap in local minima. As indicated in the figure, thesampling nodes are narrowly deployed in a small area, which will result in a considerableloss.Figure 2.6: Deployment results generated by random search-based algorithms.Figure 2.7 presents the field estimation uncertainty using the results generated throughthe CFD dataset. The results show that all four RRT-based algorithms generate similaruncertain regions in the EOI since all of them are near-optimal algorithms. RRLR-basedalgorithms generate better results by placing the sensors close to the bottom left area of theEOI, which results in lower estimation uncertainty in that area. The statistical measurementsof the estimation uncertainty and the required number of sensor nodes are shown in Table2.1. The results show that the RRLR-based algorithms achieve low uncertainty over the fieldwith a small number of sensors. Especially, RRLR Connectivity finds the best deploymentstrategy with low uncertainty and variance while using minimal sensor nodes.Figure 2.8 shows the estimation error from different algorithms over the iterations. The39Figure 2.7: Field estimation uncertainty results of RRT-based algorithms.Table 2.1: Statistical measurement of the field estimation uncertainty.Max Min Variance # sensorsRRC 3.74 3.05 ∗ 10−5 0.14 10RRCstar 3.37 3.11 ∗ 10−5 0.16 12RRLR 3.37 1.80 ∗ 10−5 0.16 12RRLR Connectivity 3.48 1.82 ∗ 10−5 0.17 640randomized algorithm does not perform well and can hardly reduce the estimation error.However, all RRT-based algorithms reduce the estimation error significantly and also con-verge. Hence, according to Proposition 2.2, all the RRT-based algorithms achieve a near-optimal solution with the CFD dataset. Moreover, as indicated in Table 2.1, RRLR Connec-tivity uses least sensor nodes to perform the persistent monitoring with minimal field esti-mation uncertainty. RRLR Connectivity only requires six sensor nodes in the field, whileall other benchmark algorithms need more than ten. Thus, it can be concluded that the pro-posed algorithm achieves near-optimal solution while significantly reducing the number ofsensor nodes in the field.Figure 2.8: Change of minimum estimation error.In conclusion, RRLR Connectivity requires fewer sensor nodes to persistently monitorthe environment over the EOI while receiving the lowest estimation error and field estima-tion uncertainty. This significant improvement results from the elimination of redundant41sampling data during the tree expansion. RRLR Connectivity can also realize the near-optimal sensor deployment strategy while preserving the network connectivity.Another test based on in situ results is provided in Chapter 6.2.5 SummaryIn this chapter, a novel strategy was proposed to find a near-optimal solution for de-termining the sensor node deployment locations in a spatiotemporal field. The proposedmethod aimed to reduce the redundancy in sampled data while optimizing the estimationover an infinite horizon and preserving the network connectivity. It utilized row redundancyin the sampling data matrix to eliminate unnecessary rows (sensor nodes) and the structureof the space-filling tree in the field to determine the suitable deployment locations. Also,the redundant sampling locations were eliminated with regard to the network connectiv-ity. Simulation results showed that the number of sensor nodes needed in the field wassignificantly reduced while achieving maximum prior estimation error and minimal fieldestimation uncertainty.42Chapter 3Deep Reinforced Learning Tree forEfficient Spatiotemporal Exploration3.1 IntroductionAs introduced in Chapter 2, sampling location optimization is an essential part of spa-tiotemporal monitoring, and the sampling redundancy can be detected and eliminated throughthat. However, RRT-based spatiotemporal field exploration schemes are inefficient due torandom sampling. Therefore, this chapter seeks to improve the efficiency of spatiotemporalfield exploration.Considerable research effort has gone into the monitoring of environmental processesthat occur in the ecosystems. Such work can facilitate the generation of early warningsabout events that can adversely affect the stability of an Environment of Interest (EOI) [91].It will also provide adequate time for responding and minimizing possible negative impactson the environment, and taking proper corrective actions.Typically, data sampling from a large EOI can be quite laborious. For example, whenmonitoring an aquatic environment, sensors are primarily used to sample real-time dataof some critical parameters (e.g., temperature, conductivity) of the water [92]. For this,oceanographers and limnologists commonly rely on the manual sensor deployment at fixed43locations from boat [93]. Clearly, searching for suitable monitoring locations and conduct-ing persistent monitoring using robotic sensors is a better approach.Generally, the environmental features of an EOI are spatiotemporally correlated. Thus,persistent monitoring using sensor networks or robotic sensors is required to ensure the ob-servation quality while employing a relatively small number of sensor nodes in the field[94]. Common strategies that can provide field coverage for persistent monitoring includegeometry coverage using disk models, and by placing robotic sensors in the most informa-tive locations of the EOI.The traditional approach of field monitoring utilizes a static sensor network, which isnot accurate and efficient. This approach first densely deploys sensor nodes in the EOI.Then, redundant sensor nodes are removed according to the spatiotemporal correlation ofthe sensor node location and readings [1, 5, 95–97]. Such methods typically require a densedeployment of sensor nodes in the initial stage and then turning off the deployed sensorsselectively, which is costly not adequately accurate. For better performance, mobile sensornodes should be employed.Researchers considered as well taking critical observations from the field according toinformation theory rather than taking as many samples as possible. Guestrin et al., Krauseet al. and Singh et al. proposed mutual information-based methods that produce a near-optimal informative sampling strategy [10, 23, 87]. They tested their methods using real-world datasets and in situ tests and determined that mutual information could acquire moreinformation from the field while achieving a lower root mean square error (RMSE) forthe environment estimation. Du et al. proposed an optimal sensor placement strategy formonitoring hydrodynamics and water quality in an urban district, using entropy and mutualinformation to gather most information in selected locations [24, 25]. Nguyen et al. pro-posed an efficient spatial sensor selection method using Mutual Information and GaussianMarkov random fields [98]. Ma et al. developed a data-driven learning method and planningmetrics for environmental sampling [99]. However, approaches that are based on traditional44information theory only focus on choosing a subset that is most informative from the per-spective of pre-determined observation locations. Although these approaches find samplinglocations in a near-optimal way, they may not be suitable for infinite horizon problems.Researchers have recognized that mobile sensor nodes can search over an infinite hori-zon with improved performance and efficiency in persistent field monitoring. Typically,field exploration methods like RRT are used to search over an infinite horizon. How-ever, RRT-based approaches pick sampling locations randomly, and do not utilize the sam-pled information, which causes low efficiency. RL is utilized to accelerate the samplingscheme [13]. When the learning agents in the RL-based algorithms face complex real-world situations, it is hard for them to derive environmental features efficiently throughhigh-dimensional sensory inputs. Deep learning may be utilized to help RL understand acomplex environment [42].In the present dissertation, the EOI is modeled as a Gaussian process (GP), and the cov-erage of the field can be provided by searching for the most informative sampling locations[10, 100]. The objective of the informative planning method that is developed in this dis-sertation is to determine a set of sampling locations that minimize the field estimation errorover the EOI. This objective can be achieved by taking samples from the EOI and identify-ing the most informative subset of sampling locations, using field exploration methods likeRRT. Then, as the sampling data from different locations are not equally important, not allsampling locations that are selected by RRT need to be visited. Therefore, a deep reinforce-ment learning (DRL) approach is utilized in the present work to study the effectiveness ofthe sampled locations and to avoid unnecessary sampling locations. The process of search-ing over the spatiotemporal EOI can be accelerated in this manner and the most informativesampling locations can found efficiently.Also, a novel field exploration method is developed in the dissertation based on the de-veloped DRL model. The developed algorithm is called Deep Reinforced exploring Learn-ing Tree (DRLT), which can efficiently find near-optimal sampling locations for a set of45mobile robotic sensors in an EOI. The robotic sensors are assumed to be point sensors,which only capture information at their location, in the presence of Gaussian white noise.Moreover, the placement quality of the robotic sensors is determined by the field estimationerror as calculated by a Kalman filter [3], and the field coverage is evaluated by estimatingthe monitoring uncertainty in the field. Usually, finding an optimal solution over an infi-nite horizon is intractable, and it has shown to be NP-hard. However, DRLT can producea series of sampling locations with monotonically decreasing cost, which is proved to benear-optimal with minimum estimation error of the EOI.3.2 Problem Definition and Mathematical Background3.2.1 Modeling of EnvironmentIn this chapter, the primary goal is to accelerate the field exploration scheme that wasintroduced in Chapter 2, by learning from the sampled data. The environment modelingapproach that is used in the present chapter is the same as in Chapter 2, which is a com-mon environmental model used for geological and environmental systems. A discrete time-variant scalar field vψ[t] ⊂ R is used to present the spatiotemporal field, where ψ ∈ A is anarbitrary point in a two-dimensional (2-D) EOIA, which can be non-convex and filled withobstacles. The model can be expressed as a function of the temporal information matrix(A), the spatial information matrix (Sψ), location in the field ψ and time t:vψ[t] = Sψx[t] + v[t],x[t+ 1] = Ax[t] + w[t].(3.1)The temporal information matrix A is used to calculate the time-varying coefficientvector x[t] = [xt1 · · ·xtk ]T . x[t], which is considered as the state vector of a linear discretetime-invariant system. The spatial information matrix Sψ is the row vector consisting of kspatial basis functions Sψ = [s1(ψ), · · · , sk(ψ)].463.2.2 Sampling Quality of Deployment LocationEstimation ErrorIn the present chapter, a linear Kalman filter is also used for estimating the system outputvψ[t], as in Chapter 2. Linear Kalman filter is an optimal estimator for a linear discrete time-invariant system[86, 101]. Thus, the prior estimation error calculated by the Kalman filter isused as a criterion to indicate the estimation quality over the field [28]. The prior estimationof the linear Kalman filter is used for updating the error covariance matrix [101].As shown in Chapter 2, the largest eigenvalue in the estimation error matrix is regardedas the cost of the deployment strategy:Γ(N ) = pi(PN∞) = maxi=1,2,...pi(Pi∞).Here pi(PN∞) denotes the largest eigenvalue of the prior error covariance matrix for strat-egy N as time goes to infinity. The efficient computation of the prior estimation errorfollows the approach given in [3].Accordingly, the near-optimal solution is achieved by finding the deployment strategiesof minimal cost:Nopt = arg minNΓ(N ) = arg minNpi(PN∞)= arg minN(maxi∈Npi(Pi∞)).Information Gain of New Sampling LocationsIn the present work, information theory provides the basis for learning tree expansion.It depends on a location with higher entropy or mutual information that will collect moreinformation from the field. Then, the sampling location has higher information gain andgives a higher opportunity to reduce the field estimation error. Assume that A is a field to47be monitored and all possible observations compose the set A. Let N ⊂ A means a subsetin this field, which contains information about the deployment locations.In the set of selected locations in the EOI, Ni = {n1, n2, · · ·ni} ⊂ A, the chain rule ofentropy gives:H(Ni) = H(ni|Ni−1) + · · ·+H(n2|N1) +H(n1|N0). (3.2)Based on entropy, a subset is found that maximizes the mutual information (MI) betweenthe observed locations (N ) and the rest of the space (A \ N ). MI can be expressed as afunction of entropy:MI(N ) =MI(N ;A \ N ) = H(N )−H(N|A \ N )= H(N )−H(A) +H(A \ N ),(3.3)where H(A \ N ) is the entropy of unobserved locations in the field, and H(A \ N|N ) isthe conditional entropy of unobserved location set A \ N given observations at N .Deep Reinforcement LearningUnlike supervised learning, reinforcement learning (RL) does not require a dataset totrain; it learns through interaction with the environment [102, 103]. In this work, the learn-ing process of DRL network takes place while analyzing the information gain of the newobservations in the field. Higher information gain encourages the DRL to take more ob-servations and not get trapped at locations, which helps in reducing the estimation error.Thus, the goal of the proposed method is to select actions that maximize cumulative futureinformation gain.To model a RL problem, five key attributes should be defined: states n ∈ A, action seta ∈ A, transition probability between states T (n, a, n′), policy P (n), reward r and valuefunction QP (·)(n, a). The action set contains information about the actions can be taken in48different states; e.g., move forward or make a turn. Thus, each state n corresponds to anaction set A and actions will make the robot move from state n to n′. In the present work,the transition between states is deterministic; hence, T (n, a, n′) = 1. To select suitableactions in the current state, policy P (n) should be defined, which regulates the rule ofsearching over the field and determines all possible next states that can be reached by thegiven actions at ∈ Ant in its current state st. Then, the reward r of taking action a atstate n can be calculated. Rewards provide a form of evaluation for possible actions froma particular state, and they are bounded. A value function is introduced to quantify theaccumulated reward based on the selected action in the current state. It can be representedusing Bellman equations for a given policy P (·):Q(st, at)← Q(st, at) + α[rt+1 + γmaxaQ(st+1, at+1)−Q(st, at)]. (3.4)Here γ is the discount factor for future rewards and α is the learning rate. Also, themaximization function ensures that actions with best future rewards will be selected in eachiteration.Based on the mentioned theory, the process of RL is shown as follows. First, initialstate s0 is selected. Second, an action a0 is selected according to the rewards, which leadsto some state s1. Third, by executing these steps iteratively, RL will generate a sequenceincluding states and action:s0, a0, s1, a1, · · · , st, at.The accumulated rewards will converge to a maximum value and lead to a steady state[104] [105].3.2.3 Field Coverage and Network Connectivity AnalysisApart from achieving the lowest field estimation error, it is also essential to ensurethat the deployment strategy has good coverage of the EOI. Good coverage of EOI avoids49densely sampling on a narrow area with high variance and provides better recognition ofthe field. In the present work, point sensors are carried by the robotic platforms to mon-itor a large field. Hence, the traditional disk model [106] for active sensors, like radar orsonar, cannot be used, and the coverage of the field is determined by taking into accountestimation uncertainty. The field estimation uncertainty is determined by the variance of theobservation in each point of the field, given that the robotic sensor is deployed accordingto the deployment strategy. The variance of a location in the EOI, V ar(vψ), according toEquation 2.2 and [3], is given by:V ar(vψ) = E[(Sψ(x[t]− xˆ[t])) · (Sψ(x[t]− xˆ[t]))T ]= tr(P[t]STψSψ).Then the pixel-wise uncertainty of the EOI is calculated based on the correspondingcovariance matrix P[t] and the spatial correlation matrix Sψ.Robotic sensors establish a WSN network, which requires them to communicate witheach other. It is assumed that the network connectivity is based only on the positions of therobotic sensors and does not depend on the node size. Also, it is assumed that the locationsare known, bandwidth is adequate, and all the robotic sensors are homogeneous. Constraintsensure that the deployed sensors and the sinks form a connected wireless sensor network isas:∀ni ∈ N ,∃nj ∈ N : ||ni − nj||≤ τcom.Since the informative deployment strategy is generated from the exploring tree, theparent node np of ni will be no farther than the exploring step size τstep. Hence, lettingτstep ≤ τcom, there exists at least one node np in N within maximum communication dis-tance of each robotic sensor ni. This ensures the network connectivity.503.3 Deep Reinforced Learning TreeIn this section, a sampling-based exploring tree is presented to find the near-optimalsolution for informative sampling over an infinite horizon. Unlike the traditional RRT-basedalgorithms, in the present approach, the sampling locations are not chosen randomly, andthe information gain is utilized to assist a DRL model in efficiently finding areas with moresignificant information gain. Typically, RRT-based algorithms sample new observationsfrom the field randomly, which has low efficiency. In the present work, DRLT uses a DRLmodel to study the information gain of new observations of each action. Higher informationgain will give a higher reward to the DRL, and the DRL will instruct the exploring tree tosample areas with more information rather than sample randomly.However, naively computing the information gain from the infinite horizon is compu-tationally intensive. Therefore, the present method also provides an efficient computationmethod for the information gain. In this chapter, the EOI is modeled as a linear-Gaussiandynamic system, which captures spatiotemporal correlations. Hence, the sampling value ateach point of the EOI can be modeled as a dependent variable of other sampling locations.Additionally, a model-based mutual information gain is developed. Given a specific envi-ronmental model, the information gain can be computed efficiently from the spatiotemporalcorrelation. Thus, by encouraging the exploration and learning of areas of higher informa-tion gain, the locations for achieving a lower estimation error can be determined efficiently.3.3.1 Guaranteed Near-optimality for Informative Sampling UsingExploring TreeThe sampling-based exploring tree can find the deployment strategy with minimal esti-mation error over the field, and overall, the solution is near-optimal. A pertinent propositionis presented now.Proposition 3.1. DRLT algorithm gives the minimal feasible estimation error, which mono-tonically decreases as the exploring tree expands.51Proof of Proposition 3.1. Let Gi is the graph produced by the exploring tree iteration i,which contains a set of paths Nˆi. Then, Ni ∈ Nˆi gives the sampling locations computedby DRLT in the ith iteration, where Ni = arg minN∈Nˆi Γ(N ), and the estimation error isi = minN∈Nˆi Γ(N ). Thus, Ni gives the deployment set with minimal feasible estimationerror i in its iteration.Next, in the iteration i + 1, a new sampling location n is added to graph Gi, and thegraph become Gi+1, and the set of paths become Nˆi+1. Denoting the added paths as Nˆ ,where Nˆ ∪ Nˆi = Nˆi+1. The minimal error in iteration i+ 1 becomes:i+1 = minN∈Nˆi+1Γ(N ) = minN∈{Nˆ∪Nˆi}Γ(N )= min(i, minN∈NˆΓ(N )) ≤ i.Thus, DRLT produces a monotonically decreasing estimation error as the algorithmiterates.Given the deployment strategy with minimal estimation error over the field, it can beshown that this result is near-optimal. Generally, the selection of locations over an EOIto minimize the cost is NP-hard [10], and finding the optimal solution is computationallyexpensive. Thus, a near-optimal solution is provided.Normally, a near-optimal optimization can be guaranteed by greedy maximization orminimization of the submodular function [107]. The diminishing return property ensuresconvergence of the submodular function. A function f(·) is submodular if:f(A ∪ n)− f(A) ≥ f(B ∪ n)− f(B), (3.5)where A ⊂ B ⊂ C, and n ∈ C \B.Proposition 3.2. The sampling locations with minimum estimation error returned by DRLTare near-optimal.52Proof of Proposition 3.2. First, the function to calculate the estimation error should be sub-modular. Let N and N ∗ be two sets that contain sampling locations, where N ⊂ N ∗ ⊂ A.Denote the new observation location as n ∈ A \ N ∗, and the subtraction of sets is Nˆ =N ∗ \ N . Thus, the estimation error of set N ∗ is:Γ(N ∗) = pi(PN ∗∞ ) =pi(PN∪Nˆ∞ )= max(maxi∈Npi(Pi∞),maxi∈Nˆpi(Pi∞)).Similarly,Γ(N ∗ ∪ n) =pi(PN∪Nˆ∪n∞ )= max(maxi∈Npi(Pi∞),maxi∈Nˆpi(Pi∞), pi(Pn∞))Γ(N ∪ n) =pi(PN∪n∞ )= max(maxi∈Npi(Pi∞), pi(Pn∞)).Then,Γ(N ∪ n)− Γ(N )− Γ(N ∗ ∪ n) + Γ(N ∗)= max(maxi∈Npi(Pi∞), pi(Pn∞))−maxi∈Npi(Pi∞)−max(maxi∈Npi(Pi∞),maxi∈Nˆpi(Pi∞), pi(Pn∞))+max(maxi∈Npi(Pi∞),maxi∈Nˆpi(Pi∞))≥0.(3.6)Equation 3.6 can be reorganized as:Γ(N ∪ n)− Γ(N ) ≥ Γ(N ∗ ∪ n)− Γ(N ∗). (3.7)This is identical to Equation 3.5. Thus, the set function Γ(·) is submodular. Then,53using the greedily selected deployment set as shown in Proposition 3.1, we can achievenear-optimal solution.3.3.2 Model-based Spatiotemporal Information GainNow, a model-based spatiotemporal mutual information calculation is developed to ef-ficiently capture the information gain of new observation locations. The result can be usedas the action rewards of the DRL model, which accelerate the process of finding the deploy-ment strategy with near-optimal estimation error on the EOI.First, the covariance of two random variables, which represents the modeled value intwo locations of EOI, can be computed from the environmental model according to:cov(X, Y ) = E[XY T ]− E[X]E[Y ]T , (3.8)where X = vψi [t + 1], Y = vψj [t + 1]. Given the environmental model in Equation 2.2,where w[t] and v[t] are both Gaussian white noise with a zero mean, we can represent theX and Y in Equation 3.8 as:vψi [t] =Sψix[t] + v[t]=Sψi(Ax[t− 1] + w[t− 1]) + v[t]⇒ X = vψi [t+ 1] =SψiAx[t] + Sψiw[t] + v[t+ 1].(3.9)andY = vψj [t+ 1] = SψjAx[t] + Sψjw[t] + v[t+ 1]. (3.10)54Substitute, Equation 3.9 and 3.10 into E[XY T ]:E[XY T ] =E[(SψiAx[t] + Sψiw[t] + v[t+ 1])· (SψjAx[t] + Sψjw[t] + v[t+ 1])T ]=E[SψiAx[t]x[t]TATSTψj + Sψiw[t]x[t]TATSTψj+ v[t+ 1]x[t]TATSTψj + SψiAx[t]w[t]TSTψj+ Sψiw[t]w[t]TSTψj + v[t+ 1]w[t]TSTψj+ SψiAx[t]v[t+ 1]T + Sψiw[t]v[t+ 1]T+ v[t+ 1]v[t+ 1]T ].Since v[t] does not depend on x[t], we have E[v[t]x[t]T ] = 0. Also, E{w[i]v[j]T} =0,∀ i, j, E{w[t]v[t+ 1]T} = 0. Hence,E[XY T ] =E[SψiAx[t]x[t]TATSTψj ]+ E[Sψiw[t]x[t]TATSTψj ]+ E[SψiAx[t]w[t]TSTψj ]+ E[Sψiw[t]w[t]TSTψj ] + E[v[t+ 1]v[t+ 1]T ].(3.11)Furthermore, E[x[t]w[t]T ] can be simplified as:E[x[t]w[t]T ] = E[(Ax[t− 1] + w[t− 1])w[t]T ]= E[Ax[t− 1]w[t]T + w[t− 1]w[t]T ]= E[Ax[t− 1]w[t]T ]= · · ·= E[At−1x[0]w[t]T ] = At−1x[0]E[w[t]T ]= At−1x[0] · 0 = 0.(3.12)Next, substitute E[w[t]w[t]T ] = Rw, E[v[t+ 1]v[t+ 1]T ] = Rv and Equation 3.12 into55Equation 3.11:E[XY T ] =SψiAE[x[t]x[t]T ]ATSTψj + SψiRwSTψj+ Rv. (3.13)Then, the E[X]E[Y ]T part in Equation 3.8 can be expressed as :E[X]E[Y ]T =E[SψiAx[t] + Sψiw[t] + v[t]]· E[SψjAx[t] + Sψjw[t] + v[t]]T=E[SψiAx[t]] · E[SψjAx[t]]=E[SψiAx[t]x[t]TASψj ]=SψiAE[x[t]x[t]T ]ASψj .(3.14)Substitute equation 3.13 and 3.14 into function 3.8:cov(vψi , vψj) = SψiRwSTψj+ RvSome Gaussian noise is added to avoid the singularity.As the values of all feasible sampling points, vψi , are random variables, they can beformulated as a random vector V = [vψ1 , vψ2 , · · · , vψn ]T . The matrix Σ is the covariancematrix of V, where Σij = cov(vψi , vψj) = SψiRwSTψj+ Rv. Then,Σ =Sψ1RwSTψ1Sψ1RwSTψ2· · · Sψ1RwSTψnSψ2RwSTψ1Sψ2RwSTψ2· · · Sψ2RwSTψn...... . . ....SψnRwSTψ1SψnRwSTψ2· · · SψnRwSTψn+ Rv · J, (3.15)56where J is the matrix of ones.It has been shown that entropy can be represented as a function of covariance [10].Hence, the conditional entropy function in the right-hand side of Equation 3.2 can be repre-sented as the monotonic function of its variance:H(ni|Ni−1) = 12log(2pieσ2ni|Ni−1)=12log(σ2ni|Ni−1) +12(log(2pi) + 1).(3.16)Denote Σ∗, the sub-matrix of Σ, as the covariance matrix of ni and Ni−1, which com-putes the conditional variance σni|Ni−1 [108, 109]. The covariance matrix Σ∗ is denotedas:Σ∗ =Σ11 Σ12Σ21 Σ22 ,where Σ11 is the variance of ni, and Σ22 is the covariance of set Ni−1. Then, Σ12 and Σ21give the covariances between the random variable ni and the set Ni−1. The conditionalvariance σni|Ni−1 can be expressed as:σni|Ni−1 = Σ11 − Σ12Σ−122 Σ21. (3.17)As the present goal is to maximize the increased MI while retrieving new observationsn from the field in each iteration, the increased MI (information gain) can be representedas:δn =MI(N ∪ n)−MI(N ). (3.18)57According to [10] and Equations 3.3, 3.16 and 3.17, Equation 3.18 can be simplified as:δn = H(n|N )−H(n|N¯ )=12log(σ2n|N )−12log(σ2n|N¯ )= log(|Σn − ΣnNΣ−1NNΣNnΣn − ΣnN¯Σ−1N¯ N¯ΣN¯n|),(3.19)where N¯ = A \ (N ∪ n).In this manner, the model-based spatiotemporal information gain can be calculated ef-ficiently using Equation 3.15 and Equation 3.19. Besides, the calculation for informationgain on any new sampling location over the infinite horizon is feasible.3.3.3 Deep Reinforcement Learning Model for MaximumInformation Retrieval over a Spatiotemporal FieldIn this section, a DRL model is developed for learning the information gain of samplinglocations and gives instructions on sampling actions. The DRL model does not dependon prior training knowledge to accelerate the exploration, and it facilitates the exploringtree to expand by monitoring the quality of the new sampling locations online. The modelarchitecture is shown in Figure 3.1.Inputs3@90x60Featuremaps32@14x14Featuremaps64@8x8Featuremaps64@5x5Convolution12x8 kernelConvolution7x7 kernelConvolution4x4 kernelHiddenunits512Action1FullyconnectedFullyconnectedFigure 3.1: Schematic architecture for DRL model11This figure is generated by adapting the code from https://github.com/gwding/draw convnet58The input for the DRL model is the frame that indicates the current sampling locationson the EOI. Then the input frame is downscaled to a 90×60×3 image (from 1200×800×3) to reduce the computational expense and the memory usage. The input frame to the DNNhas size 90×60×3, which is processed by a convolution layer with 32 filters of 12×8 withstride 6× 4, followed by rectifier linearity (ReLu). The second layer has 64 filters of 7× 7with stride 1, is also followed by a ReLu. The third convolutional layer convolves 64 filtersof size 4× 4 with stride 1 followed by ReLu. The final hidden layer is fully-connected andconsists of 512 rectifier units. The output layer is fully-connected with a single output foreach valid action.In the case of random exploring trees, the location of the newly added observation onthe EOI is infinite. So, in the proposed model, the selection of the location is narroweddown towards the direction of its closest vertex in the spanning tree. Then, their relativedirection is divided into right angles, and the selection of new locations is decomposed intofour actions. In Figure 3.2, the blue circle represents the current sampling locations, anddiamonds stand for the possible sampling locations for the next step. The relative angle isdivided into four right angles as indicated by the background color. Note that the actionspass to the DRL model includes only the corresponding right angle of the new samplinglocation.The reward that is passed to the DRL network is given by the information gain, as ob-tained in Section 3.3.3. In each iteration, the information gain of the new sampling locationis calculated and encourages the mobile agents to move towards the area of higher informa-tion gain. Besides, as the variable with the small value in the logarithmic function (Equation3.19) gives a sizable negative information gain, all non-positive gains are clipped to zero.Clipping the information gain in this manner makes the model to merely focus on searchingthe area with a more significant positive information gain and makes it easier to learn.The DRL model takes on the learning process as the tree expands. In the beginning, theDRL model produces random actions for the exploring tree. Then, as more rewards from59θ1θ2Figure 3.2: Illustration of the action structure. Blue circle: current sampling locations;diamonds: possible sampling locations for next step.Algorithm 3.1: extendDRL()Input: G,A,ΣOutput: n1 Initialization: set replay buffer B with size N , set value function Q with randomizedweight θ;2 With probability p select a random action at ∈ A, otherwise selectat = arg maxaQ(st, at; θ);3 Compute new sampling location n based on at and A.;4 Generate deployment strategy N based on n and G;5 Limit selected action at to step size;6 Execute action at to calculate reward rt = δn|N from Σ and observe new state st+1;7 Save state transition (st, at, st+1, rt) to B;8 Perform a Adam optimization to update the model.the sampling location are given, more precise instructions on the tree expanding will beobtained. The deep Q-learning algorithm for tree expansion is given in Algorithm 3.1.The input for the Algorithm 3.1 is the exploring tree, which provides the current stateand then outputs the generated sampling location. In the beginning, the agent selects anaction with the p-greedy policy based on Q, which selects a random action from the actionset A with probability p and uses the Q-based greedy policy with probability 1 − p. Then,60the action is executed, and the deployment strategy is generated to calculate the reward.The deployment strategy is the new sampling location n combined with its closest path inthe tree G. Last, the model is updated with experience replay [42] and a double Q-networkto [110] to avoid divergence. In this manner, the utilization of the deep Q-network canstudy the information gain during the tree exploration period. Then, as areas with higherinformation gain are sampled at a higher priority, the overall estimation error will reducequickly.3.3.4 Architecture for Deep Reinforced Learning TreeBased on the results in Sections 3.3.1, 3.3.2 and 3.3.3, DRLT is proposed here to searchfor near-optimal sampling locations and efficiently estimate the spatiotemporal EOI.The deep reinforced exploring learning tree is built based on the proposed DRL model toaccelerate the process of finding near-optimal sampling locations. The DRLT builds a rein-forced space-filling tree, which efficiently searches the non-convex EOI. Unlike traditionalRRT-based algorithms, DRLT does not randomly expand the space-filling tree. It studiesthe collected information from the field and focuses more on areas of higher interest, whichhelps to achieve lower estimation error faster. The DRLT is described in Algorithm 3.2. Itbegins with the initialization of the exploring tree and the value of estimation error. The ini-tial estimation error is set to positive infinity. Then, the algorithm iterates T times. In eachiteration, the sampling location is estimated by the DRL model using the method presentedin Algorithm 3.1. Then, line 4 finds the nearest vertex of n in G. And line 5 finds a set ofvertices in tree G, which is within the radius r of the sampling point n. It is followed by thefunction that generates all possible sampling strategies in line 6. The details of line 6 arein Algorithm 3.3. Next, line 7 finds the sampling strategies with the lowest estimation errorgiven the new sampling point n. Last, the algorithm checks if a lower overall estimationerror is reached and correspondingly expands the exploring tree. In this manner, the DRLTalgorithm finds the near-optimal solution for the informative sampling problem, which is61executed efficiently through the help of DRL model.Algorithm 3.2: Deep Reinforced exploring Learning TreeInput: A,A,Rw,Rv,Sψ,ΣOutput: N1 Initialization: set exploring tree V with starting point ψ0, edge E as empty set ∅ andtree G as (V,E). The minimum estimation error is set to infinity in the beginning,ω =∞;2 for 1 to T do3 n = extendDRL(G,A,Σ);4 nnear = Nearest(n,G);5 E = Cluster(G, n, r);6 Nˆ = GenDEPL(n, E, G);7 N˜ = arg minN˜∈Nˆ Γ(N˜ );8 if Γ(N˜ ) < ω then9 ω = Γ(N˜ );10 N = N˜ ;11 end12 V ← V ∪ n;13 E ← E ∪ edge(n, nnear);14 G← (V,E);15 endAlgorithm 3.3 generates possible sampling strategies given the new sampling location nand its neighbors E. It outputs a set of subtrees ofG as sampling plans including node n andeach of its neighbors. Then, the sampling plan with lowest estimation error is selected fromthese available sampling plans. Next, the deployment strategy is compared with the overallestimation error to find the global minimal. The input set for Algorithm 3.3 is the newsampling point n, the set of its neighbors E, and the exploration tree G. Each deploymentstrategy is generated based on the basic RRT-based planning principle, which traces backfrom the locations in the set E to its root. Hence, the algorithm generates sampling strategiesfor each neighbor of n.In this manner, the DRLT algorithm can find a near-optimal deployment strategy withminimal estimation error from the field. By encouraging itself to explore areas with higherinformation gain using DRL, the DRLT algorithm also completes the goal efficiently.62Algorithm 3.3: GenDEPL()Input: n, E, GOutput: Nˆ1 Nˆ = ∅;2 foreach ne ∈ E do3 n′ = ne;4 N = {n, n′};5 while hasparent(n′) do6 n′=parent(n′, G);7 N ∪ n′8 end9 Nˆ ∪N10 end3.4 Simulation ResultsThis section presents numerical simulations using the proposed algorithm and data fromNOAA. Specifically, the DRLT algorithm and benchmark algorithms are used to plan themost informative sampling locations to estimate the SST over the Caribbean Sea.3.4.1 NOAA Sea Surface Temperature DatasetThe test dataset for the proposed algorithm and the benchmark algorithms is NOAAcoastal water temperature dataset, which contains the sea surface temperature of a rectangu-lar region of size 1200 units× 800 units at 1/30 degree (∼3km) resolution in the CaribbeanSea [83, 85]. In this dataset, the locations of the obstacles and the sampling data of thebasis center are given. The preprocessing of the dataset fills the missing measurements ofthe stationary data, and the missing measurements are merely recorded as the same value asthe previous one. As the values of such attributes do not change significantly in a day whencompared to seasonal changes, the dataset is down-sampled to simplify the modeling pro-cess by setting a time window to 6 hours. After that, the subspace identification algorithm[82] is utilized to learn temporal information of the environmental model.633.4.2 Benchmark AlgorithmsThe proposed algorithm are compared with three state-of-art algorithms: Informationdriven planner (INFO) [99], Rapidly exploring Random Cycles (RRC) [3], and Policy iter-ation on continuous domains using RRTs (RRTPI) [13]. Also, a native RRT is implementedto find informative sampling locations for the benchmark algorithm. Only the informa-tiveness maximization procedure of INFO is adapted to produce sampling locations; theexploration method remains the same for the proposed algorithm and other benchmark al-gorithms. The planner of INFO samples the next point with the largest mutual informationgain in the field. By building an exploring tree with such points, it will gradually find themost informative strategy with the lowest estimation error. Besides, RRC seeks to buildcycles from the exploring tree with the lowest estimation error. The deployment strategy isestablished by finding suitable cycles with the lowest estimation error among the paths ofan exploring tree. Similar to RRC, the RRT finds the paths but not cycles of the exploringtree with the lowest estimation error. Also, both RRC and RRT explore the tree by randomsampling from the field, which does not utilize the feedback of information gain of the sam-pling points. RRTPI is an approach that is intended for solving geodesic-based problems.Therefore, top-10 locations are picked in the area with the highest mutual information. Theexploring tree will have more substantial rewards when approaching those top-10 locations.In this manner, the exploring tree will tend to sample more points in the areas of higherinformation gain. The simulation was conducted on a PC with 4.0 GHz Quad-Core CPUand 16 G memory. The DRL model was implemented using the TensorFlow framework[111] and run without GPUs.The change of SST is modeled using the spatial basis function and the temporal corre-lation. As given in Equation 2.4, the Gaussian radian basis function is selected as the basisfunction. Hence, the ith element of the temporal matrix is si(ψ) = Ke−||(ψit)−(ψj)||2/2σ2 ,whereK is the scale factor, ψit is the location of the ith robotic sensor in the field at time t, ψjis the location of the ith basis center, and σ is the Gaussian variance. In the present work, the64scale factors are selected as 10 and the variance is 100 units. The Gaussian radial basis cen-ters are selected using the metrics described in [3], and the number of basis centers is nine.The temporal information is learned from the NOAA dataset. The starting location of theexploring tree for the proposed and benchmark algorithms is ψ0 = {500units, 200units}and the step size for the exploring tree is 300 units. The number of iteration is limited to500 times due to the fast convergence of the proposed algorithm. It is found that the perfor-mance of the proposed work does not depend on the selected environmental model and itsparameters.3.4.3 Performance ComparisonFigure 3.3 (a) - (e) presents the information-based sensor deployment strategies as com-puted using INFO, RRC, RRT, RRTPI, and DRLT, respectively, where the black stars indi-cate the location of informative sampling locations, the grey area represents the obstacles,the white area is the EOI, and the blue lines indicate the exploring tree generated fromdifferent algorithms. In Figure 3.3 (d), yellow squares represent the pre-calculated top-10locations with the highest information gain in the EOI. For all five figures, the proposedalgorithm and the benchmark algorithms, both searched over most of the EOI, which fulfillthe goal of minimizing the estimation error over the EOI by making observations. Besides,the proposed algorithm tends to search some specific areas that have higher informationgain and leads to a faster realization of the minimum estimation error.Figure 3.4 shows the field estimation uncertainty over the EOI when deploying roboticsensors using the information-based deployment strategy as generated by the proposed andthe benchmark algorithms. Then, the color map is used to indicate the estimation uncertain-ties when deploying robotic sensors in the locations of red dots. A more significant numberin the color bar represents a higher variance in the EOI, and indicates that the deploy-ment strategy will have higher uncertainty in this area. Also, as all five algorithms achievenear-optimal solutions, the estimation uncertainty over the field of the resulting sampling65(a) Deployment result of INFO (b) Deployment result of RRC(c) Deployment result of RRT (d) Deployment result of RRTPI(e) Deployment result of DRLTFigure 3.3: Deployment strategy for information-based sampling. Red dots: com-puted information-based sampling locations; blue lines: exploring tree; yellowsquares: informative centers for RRTPI.strategies is at a low level. Specifically, when the robotic sensors are deployed using thestrategy provided by DRLT, the maximum uncertainty over the field is 4.75, and the uncer-tainty does not exceed 2 in most areas. Also, the average field uncertainty is as low as 0.73,which means that the proposed information-based deployment strategy has fast convergencein the EOI. INFO has performed well in minimizing the field estimation uncertainty withthe computed deployment strategy. As it is an algorithm that is based on information gain,it attempts to find the sampling location with the highest information gain from the field.66(a) Field estimation uncertainty of INFO (b) Field estimation uncertainty of RRC(c) Field estimation uncertainty of RRT (d) Field estimation uncertainty of RRTPI(e) Field estimation uncertainty of DRLT0.51.01.52.02.53.03.54.04.5Figure 3.4: Field estimation uncertainty for information-based sampling.Hence, it achieves a low field estimation uncertainty. Other benchmark algorithms alsoachieve low estimation uncertainty as all of them can find near-optimal results.Figure 3.5 shows how the estimation error changes as the number of iterations increases.The upper bound for visualizing the lowest estimation error is limited to 400 due to the highestimation error in the beginning. The cost of the computed informative sampling strate-gies decreases monotonically as the algorithms execute. The proposed algorithm, DRLT,uses the least number of iterations to reduce the overall estimation error from infinity to400, which shows an exceptional ability in finding better sampling locations with limitedinformation. The DRLT also finds a deployment strategy with lower estimation error faster,compared to the benchmark algorithms in the subsequent iterations. Moreover, among allfive algorithms, DRLT achieves the lowest estimation error in the field using a limited num-670 100 200 300 400 500Number of iterations050100150200250300350400Estimation errorINFORRCRRTRRTPIDRLTFigure 3.5: Reduction of the estimation error with iteration (Blue solid line: result ofINFO; green dash line: RRC; red dash-dot line: RRT; light blue dot line: RRTPI;purple dash-dash line: DRLT).ber of iterations, which also shows that it can find a near-optimal solution efficiently andusing fewer iterations of computation.Figure 3.6 presents the minimal estimation found by the proposed and the benchmarkalgorithms with respect to the computational time. The RRC algorithm takes the longesttime to complete 500 iterations of tree expansion, followed by the INFO algorithm. Bothalgorithms take more than two thousand seconds, which is not computationally efficientwhen compared to the speed of other three algorithms. Since RRT, RRTPI, and DRLT useless than eight hundred seconds to complete the computation, Figure 3.6 is zoomed in asFigure 3.7 to compare their performance.Figure 3.7 indicates that DRLT uses the least computing time to find the deploymentstrategy with the lowest estimation error in the field. RRT and RRTPI also find a gooddeployment strategy efficiently, but they seem to get trapped locally, and the estimation680 2000 4000 6000 8000 10000 12000Computing time050100150200250300350400Estimation errorINFORRCRRTRRTPIDRLTFigure 3.6: Estimation error of the information-based deployment strategy determinedby the proposed and the benchmark algorithms decreases monotonically withtime.error stops decreasing quite early. INFO performs worst with the highest estimation errorand long computation time. Moreover, although RRC finds the second best solution withinthree hundred seconds, it takes more than ten thousand seconds to finish all five hundrediterations, which might be caused by the high computational cost of finding cycles amongthe paths in the exploring tree. The cycles are determined by finding two paths that sharethe same parent from the newly added observation. Hence, as the size of the exploringtree grows, the computation for finding the cycles with a lower estimation error increasesexponentially.Figure 3.8 presents the change of the cumulative reward of DRLT during the explorationand the corresponding estimation error in each iteration. In the present work, the cumulativereward is set to the sum of the information gain when sampling in a new location, and highercumulative rewards can lead to a better learning process. Notably, DRLT maximizes the690 100 200 300 400 500 600 700 800Computing time050100150200250300350400Estimation errorINFORRCRRTRRTPIDRLTFigure 3.7: Estimation error of the information-based deployment strategy determinedby the proposed and the benchmark algorithms decreases monotonically withtime. (Zoomed in)cumulative reward in a stable manner as the tree explores, which lead to a steady state withthe near-optimal result. By comparing the two curves in Figure 3.8, it is seen that DRLT canretrieve information from the field continuously to efficiently reduce the estimation error.Table 3.1 summaries the numerical results of the simulations. It shows that the DRLTachieves the lowest estimation error at 33.66, which is 9% lower than the second-best result(RRC) and about half the worst result. Also, DRLT does not require a high computing timeto obtain this near-optimal result. It only takes 771.85 seconds to complete 500 iterations oftree exploration, which only takes 7% of the computing time to outperform the second-bestresult. The small estimation error together with the low computing time indicate that DRLTis the most efficient algorithm out of those considered, for obtaining a near-optimal deploy-ment strategy. Besides, by adopting sampling strategies for robotic sensor deployment fromboth the proposed and benchmark algorithms, the field estimation uncertainty also becomes700 100 200 300 400 500Number of iterations0200400600800100012001400Cumulative reward050100150200250300350400Estimation errorFigure 3.8: Cumulative reward and estimation error of DRLT. (Red dashed line: cu-mulative reward of the DRLT; solid blue line: corresponding estimation Error).small. In particular, the maximum uncertainty over the field does not exceed 5 for all algo-rithms, and the average uncertainty is about 0.7. The speed reduction of the estimation errorconvergence is also examined. Setting the threshold at 400, DRLT only takes 11 iterationsand 11.44 seconds to reduce the estimation error from infinity to the threshold value. Allother algorithms require more than 30 iterations to achieve this reduction.Table 3.1: Numerical results for DRLT and benchmarks.Algorithm INFO RRC RRT RRTPI DRLTEstimation Error 66.62 36.86 44.31 41.78 33.66Computing Time 2608.61 10375.94 653.27 649.10 771.85Max Uncertainty 4.60 4.61 4.61 4.86 4.75Min Uncertainty 0.0056 0.0056 0.0056 0.0060 0.0058Mean Uncertainty 0.70 0.71 0.69 0.81 0.73Estimation Error Reaches400 (iteration) 39 30 39 30 11Estimation Error Reaches400 (time) 49.49 21.30 4.96 8.017 11.44713.5 SummaryThis chapter focused on developing a WSN deployment strategy that could efficientlyachieve minimal estimation error. A novel algorithm, DRLT, was proposed to search overa sizable spatiotemporal field efficiently with the assistance of information gain. The de-veloped deployment strategy was proved to be near-optimal. The simulation results wereobtained for the NOAA dataset, and they confirmed that the proposed algorithm could out-perform the benchmark algorithms in both performance and efficiency.72Chapter 4Model-based Sensor Deployment forSpatiotemporal Field Reconstruction4.1 IntroductionBesides spatiotemporal field estimation, it is essential to reconstruct the spatiotemporalfield using the sampling data. The pollution and environmental changes can be detectedand predicted from the reconstructed field. Subsequently, corrective actions may be takento minimize the hazards. This chapter seeks to reconstruct the entire spatiotemporal fieldfrom the observations sampled at near-optimal sampling locations.The spatiotemporal field should be constructed using observations sampled in the model-based near-optimal sampling locations. With the help of sparse learning, statistical methodsare widely used to recover signals from limited observations [4, 58, 112]. Sparse learninginvestigates linear mappings from low-dimensional sparse observations to the original high-dimensional signal. To this end, a method based on principal component analysis (PCA)can be utilized to learn a projection basis from the training data. Then, a tailored singularvalue decomposition (SVD) approach can be used to reconstruct the entire spatiotemporalfield, given future observations in the field.In this chapter, a model-based approach for sensor node deployment and spatiotemporal73field reconstruction is proposed. The proposed approach deploys sensor nodes that can rep-resent the spatiotemporal field near-optimally given the environmental model. Observationsin the deployed locations can then be used to reconstruct the spatiotemporal field by learn-ing the sparse representation mapping. Moreover, it is proven that the sensor deploymentmethod, DRLT, generates near-optimal sensor deployment locations for spatiotemporal fieldreconstruction.4.2 Mathematical Background and Problem Formulation4.2.1 Sparse Sensor Deployment AnalysisThe PCA is optimal for recovering high-dimensional signals from unknown contents.It converts correlated observations into principal components that have mere linear redun-dancy. In this manner, high-dimensional signals can be represented by a few sparse repre-sentations in the field. Then, sparse learning methods can be used to study the relationshipbetween the compressed signal and the original one, to reconstruct a full signal from a smallsubset of measurements.Denote the EOI as a high dimensional state φ ∈ Rm, and the spatiotemporal dynamicsof φ can be captured by the low dimensional observations. The selected observations (y)can be represented by a measurement matrix C ∈ Rr×m, where r m is the number ofmeasurements:y = C · φ. (4.1)Note that φ can be sparsely represented by a principal basis:φ = Ψr · a, (4.2)where Ψr ∈ Rm×r is the principal basis learned from training data, and a ∈ Rr.744.2.2 Problem FormulationMathematically, the sparse signal reconstruction and be expressed as the projection froma low-dimensional observation space to a high-dimensional signal:φˆ← ξ(y) : Rm ← Rr.As r m, this system is underdetermined. Hence, there are infinite solutions. In thismanner, sampling locations should be optimized, and the mean error is utilized to evaluatethe reconstructed spatiotemporal field:arg minξ‖φ− ξ(y)‖.4.3 Proposed Method4.3.1 Sparse Learning for Spatiotemporal ReconstructionHigh-dimensional states φ ∈ Rm can be represented as linear combinations of r or-thogonal eigenmodes ψ. This low-dimensional state can be learned from training data bypruning the PCA basis. φ may be decomposed to produce temporal and spatial coefficient[4]. In this manner, the linear combination of φ shown in Equation 4.2 can be representedas:φi ≈r∑k=1ak(ti)ψk(φ), (4.3)where φi is the spatiotemporal field at time i and ψk(φ) is the time independent spatialcoefficient. The temporal coefficient ak(ti) varies with time ti.Next, ψk and ak can be computed by Singular Value Decomposition (SVD). Trainingdata Φ = [φ1φ2 . . .φM ] are given for M snapshots. The tailored SVD basis consists oforthonormal left singular vectors Ψ, right singular vectors V and the diagonal matrix S.75Hence, the training data can be represented as:Φ = Ψ · S ·V. (4.4)Then, the dimension of the right-hand part of Equation 4.4 can be reduced to r accordingto the Eckart-Young theorem [113]:Φ ≈ Φ∗ = arg minΦ˜∥∥∥Φ− Φ˜∥∥∥Fs.t. rank(Φ˜) = r,(4.5)where Φ∗ = Ψr ·Sr ·Vr and ‖·‖F is the Frobenius norm. In this manner, PCA can reduce thedimension of the high-dimensional system by using orthogonal projection. In the presentwork, rank r indicates the number of observations in the EOI. Hence, information in thecorresponding r rows of Ψ, S, and V will be extracted as Ψr, Sr, and Vr respectively.Then, a canonical measurement matrix C is used to select critical sampling locationssparsely, according to Equation 4.1:C = [eγ1 , eγ2 , · · · , eγr ]T , (4.6)where ej ∈ Rm is the canonical basis vectors. Since matrix C is canonical, the observationy can be simplified as:y = Cφ = [φγ1 , φγ2 , · · · , φγr ]T , (4.7)where γ = {γ1, γ2, · · · , γr} ⊂ {1, 2, · · · ,m} is the set of indexes for selected sensors. Inthis manner, Equation 4.1 can be used to derive sampling locations from φ efficiently.Next, the observations in the field, φj ∈ φ, can be represented as a linear combinationof the basis and the coefficients according to Equation 4.3 and Equation 4.5:φj =r∑k=1Ψjkak. (4.8)76Observations in the sensor deployment locations can then be expressed as the linearcombination of the canonical basis and the states according to Equation 4.7 and Equation4.8:yi =n∑j=1Cijφj =n∑j=1Cij ·r∑k=1Ψjkak. (4.9)Equation 4.9 can be simplified as:y = C ·Ψr · a. (4.10)According to Equation 4.3, φ can be reconstructed by the basis coefficients aˆ:φˆ = Ψraˆ. (4.11)Usually, only r sensors are deployed in the field, and only the corresponding readingsare obtained, which is denoted as y ∈ Rr. Hence, according to Equation 4.10 and Equation4.11:φˆ =Ψraˆ=Ψr(CΨr)−1y,(4.12)where Ψr ∈ Rn×r can be learned through SVD. C ∈ Rr×n can be obtained by finding themost informative locations in the field, such as near-optimal sensor deployment locationsdeveloped in Chapter 3. Thus, the environment φˆ can be reconstructed given r observationsy, where m r.4.3.2 Model-based Sparse Sensor Deployment with InformativeSamplingOptimal sensor deployment locations can guarantee the most accurate reconstruction ofφˆ. For this reason, the locations of the sensor nodes should be optimized to acquire the most77information from point measurements in sparsely selected sampling locations. Observationsfrom the sparse sampling locations can then be used to reconstruct high-dimensional states,given the tailored SVD basis. Recall that γ represents the structure of the canonical matrixC, which gives the sampling locations in the field. Therefore, γ affects the reconstructionperformance. Thus, the optimal γ and its corresponding sampling locations should providethe most information about the field.Choosing a suitable number of sensors in the field while maintaining a low noise levelin the data is a challenge. Gavish et al. proved an optimal threshold, singular value hardthresholding (SVHT), that guarantees minimal signal reconstruction error [114]. Whileensuring least reconstruction error, SVHT finds only the optimal value for reducing thesize of the tailored SVD basis and cannot find the optimal sensor deployment locations.Hence, the result provided by SVHT is regarded as the reconstruction target. In this manner,the number of sampling locations, r, can be calculated, and r rows of Ψ, correspondingto the sensor deployment locations should be obtained with maximum information in thespatiotemporal field.DRLT is a model-based informative sampling technique that finds near-optimal sam-pling locations for spatiotemporal field estimation. Proposition 4.1 shows that the samplinglocations selected by DRLT can near-optimally represent the field.Proposition 4.1. DRLT generates sampling locations that near-optimally condition the rep-resentation of the spatiotemporal field for spatiotemporal reconstruction.Proof. Proposition 3.2 shows that the objective function of DRLT is submodular. The di-minishing returns property of the submodular function guarantees a near-optimal solutionin the case of greedy maximization/minimization [107].Moreover, as shown in algorithm 3.2, DRLT greedily selects target sensor deploymentlocations. Hence, as the learning tree expands, sufficient samples are observed in the field,and the greedy selection guarantees that the resulting sampling set is near-optimal. Theobjective function finds the sensor deployment set that seeks the lowest estimation error78during the greedy selection process. At the same time, DRLT encourages a sensor to takemore samples in the area with higher information gain. Therefore, the mutual informationbetween the sampling locations and the rest of the field is maximized. Thus, DRLT cangenerate sampling locations that near-optimally represent and reconstruct the spatiotempo-ral field by collecting the most information from the spatiotemporal field.In this manner, r sampling locations selected by DRLT can be converted into γ to con-struct the canonical measurement matrix C. Given the observations at these locations, thespatiotemporal field can be reconstructed.The proposed sparse learning method for spatiotemporal field reconstruction is given inAlgorithm 4.1. Line 1 initializes the algorithm by pre-processing the training data, wherethe training data can be obtained from the environmental model. Mean normalization iscarried out to facilitate faster learning. Line 2 learns the training data by decomposing itinto orthonormal matrices using SVD. Line 3 to line 7 project the calculated near-optimalsampling set (N ) into the corresponding locations (γ) in the φ space, whereN is generatedby DRLT using Algorithm 3.2. Next, γ is organized as the canonical matrix C, and thelow-dimensional representation of the orthonormal singular vector Ψ is extracted to finishthe training process. Last, line 10 performs the spatiotemporal reconstruction from limitedobservation y to φˆ.In this way, the spatiotemporal field can be reconstructed by using the observations fromthe calculated sparse sampling locations.4.4 SimulationThis section presents the simulation results of the proposed model-based sensor deploy-ment approach for spatiotemporal field reconstruction, using a National Oceanic and Atmo-spheric Administration (NOAA) dataset. The proposed algorithm is compared with state-of-the-art informative sensor deployment methods, and the reconstruction performance is79Algorithm 4.1: Sparse Learning from Training Data.Input: N ,Φtrain,yOutput: φˆ1 Init.;2 Ψ,S,V = svd(Φtrain);3 γ = ∅;4 for i = 1, 2, · · · , r do5 γiΨ←− Ni;6 γ = [γ, γi];7 end8 C← canonical(γ);9 Ψr ← C ·Ψ;10 φˆ← Ψr(CΨr)−1y;analyzed based on the mean square error (MSE) between the reconstructed field and theground truth.4.4.1 Experimental SetupThe NOAA dataset covers the southeast Americas Seas region, which includes the Gulfof Mexico and the Caribbean Sea [83]. The daily sea surface temperature in 2017 wasextracted from this dataset to model the environment.The performance target and the benchmark algorithms for informative sensor deploy-ment are as follows:• SVHT. Singular value hard thresholding (SVHT) finds optimal thresholds for thedimensions of the tailored SVD basis that guarantee minimal signal reconstructionerror [114]. The SVHT compression and reconstruction method reduces only the sizeof the SVD basis, and the resulting tailored SVD basis is used to represent a higher-dimensional space. Therefore, SVHT cannot find the sampling locations because itrequires the state to be fully observable and encodes the original signal to anotherspace with a lower dimension. The result generated by SVHT is regarded as theperformance target.80• RRC: Rapidly-exploring random cycles (RRC) is an RRT-based method that gen-erates sampling locations with the lowest estimation error along with a cycled pathwithin the space-filling tree [3]. The cycled sampling locations can be used to performperiodic sampling using a single robot.• INFO: The approach of informativeness maximization is utilized to find samplinglocations with maximum mutual information in the field [99]. As more samples aretaken in the spatiotemporal field, a near-optimal sampling set that maximizes themutual information is generated.• RRTPI: RRTPI is a reinforcement learning-based RRT method that efficiently solvesgeodesic-based exploration [13]. Ten locations with the highest mutual informationare selected in the spatiotemporal field, and the exploring tree has a higher probabilityof exploring areas with higher information gain and efficiently finding the samplinglocations with the lowest estimation error.4.4.2 Simulation ResultsFigure 4.1 presents the ground truth of a spatiotemporal field for testing as well asthe reconstruction results. The red circles in the figure indicate the informative samplinglocations calculated by the corresponding planning algorithms. SVHT generates the bestreconstruction result as it tailors only the information that is less significant in the SVDbasis. The input signal is compressed and reconstructed via an orthogonal projection with atailored SVD basis.RRC and RRTPI collect only limited information and have the worst reconstruction per-formance as they focus merely on minimizing the estimation error of the spatiotemporalfield. The MSEs for the field reconstruction of RRC and RRTPI are 39.37 and 64.17, re-spectively. As these two methods do not optimize the information gain in the field, sensornodes may be placed in locations that provide less effective observations. Some spatial lo-cations may reduce the estimation error by finding abrupt changes in the field but not help81(a) Ground truth (b) SVHT (c) RRC(d) RRTPI (e) INFO (f) DRLTFigure 4.1: Ground-truth and reconstruction results from the proposed and bench-mark algorithms. (Red circle: sampling locations generated from differentinformation-based planning algorithms.)to represent the general information in the field.Note that both INFO and DRLT achieve lower reconstruction error as they maximize themutual information in the field, which is critical for representing the spatiotemporal field.The MSEs for the field reconstruction of INFO and DRLT are 9.08 and 2.59, respectively.INFO captured more information in the southern Caribbean Sea and performed better inreconstruction than RRC and RRTPI. However, the reconstruction in the northern CaribbeanSea and the Gulf of Mexico is rather poor due to the inadequateness of information iscaptured. DRLT leverages the minimization of the estimation error and the maximizationof information gain to outperform the benchmarks in reconstruction. As shown in Figure4.1 (f), the reconstruction in both southern and northern Caribbean Sea is better than inthe benchmark algorithms. The Gulf of Mexico is the only location where information ismissing as no sampling locations exist.Table 4.1 indicates the MSE for the reconstruction using the proposed method and thebenchmark algorithms. The target reconstruction performance, generated by SVHT, is 0.12.DRLT produces the sampling locations with lower MSE when compared to the benchmark82Table 4.1: MSE of the proposed method and the benchmark algorithms.SVHT∗ RRC RRTPI INFO DRLT0.12 39.37 64.17 9.08 2.59algorithms.4.5 SummaryThis chapter proposed a model-based sensor deployment strategy for spatiotemporalfield reconstruction. The proposed model-based sensor deployment strategy significantlyoutperformed the benchmark algorithms in reconstructing a spatiotemporal field with lim-ited observations. However, the reconstruction accuracy of the proposed method was highcompared with the target performance generated by SVHT. Therefore, the reconstruction ofa spatiotemporal field has room for improvement. In general, model-based sampling meth-ods may result in inaccuracy when the environmental model cannot adequately express ahighly dynamic field of interest. Hence, model-free approaches, such as DL approaches,should be utilized to improve the spatiotemporal reconstruction accuracy.83Chapter 5Model-free Sensor Deployment forSpatiotemporal Field Reconstructionand Prediction5.1 IntroductionChapter 4 introduced the linear reconstruction of the spatiotemporal field from the ob-servations. Linear reconstruction may cause some inaccuracies in capturing spatiotemporalmappings between the observations and the overall spatiotemporal field. In this chapter,nonlinear spatiotemporal mappings are investigated to optimize the sampling location andimprove the spatiotemporal reconstruction accuracy.Optimal spatiotemporal field reconstruction usually leverages a latent low-dimensionalrepresentation, which empowers sparse sampling. In this context, it is clearly advanta-geous to deploy a small number of sensors optimally when estimating and predicting ahigh-dimensional systems. The optimal sensor deployment is intractable using brute-forceapproaches, and it is shown to be NP-hard.Typically, sensor locations are chosen using convex optimization. Model-based sam-pling approaches have been utilized for deploying sensors near-optimally to estimate moderate-84size spaces [3, 9, 115]. Also, robotic sensors can be deployed to sample and estimate anenvironment [116–119]. Moreover, probabilistic approaches have been proposed to han-dle the trade-off between exploration and exploitation in an increasingly generic manner[120, 121]. Their objective is formulated based on information theory, which enables arobotic sensor to sample unexplored areas in the field at higher information gain. However,these approaches mainly focus on simultaneous localization and mapping (SLAM) tasks,which are limited to geometric relationships. In general, model-based convex optimizationmethods highly rely on the environmental model and use a greedy approach only to navi-gate robots to the most informative sampling locations. Hence, model-based near-optimalsensor deployment methods may entrap at local minima and result in less accurate fieldreconstruction and prediction.Recently, data-driven approaches, such as principal component analysis (PCA) andcompressive sensing (CS), have been developed to determine the primary locations for opti-mal or near-optimal sensor deployment. Data-driven approaches are of low complexity andare easier to implement than model-based convex optimization approaches. Observationsobtained from the sampling locations can be used to reconstruct the entire spatiotemporalfield. In contrast to CS and PCA systems, which employ sparse linear representations forsignal reconstruction, a deep learning (DL) framework can support both linear and nonlin-ear measurement reconstruction and provide better reconstruction performance. Besides,the nonlinear activation function of a DL framework reduces the likelihood of gradient van-ishing.In this chapter, a novel DL framework called Spatiotemporal sparse Auto FiEld Recon-structor (SAFER) is developed to optimize the sensor deployment locations for spatiotem-poral field reconstruction and prediction. The spatiotemporal field is encoded into sparsesampling locations, where key deployment locations are selected as the low-rank repre-sentation of the entire spatiotemporal field. Then, an auto-reconstructor is used to capturenonlinear spatiotemporal correlations between the sparse sampling locations and the rest of85the field. The observations in the selected locations can then be decoded to the entire spa-tiotemporal field. The proposed DL framework is proven to be Lipschitz continuous, and itis adequate for reconstructing and predicting the spatiotemporal field via observations in aminimal number of sampling locations.5.2 Preliminary and Problem Formulation of SensorSelectionThere are(mr)= m(m−1)...(m−r+1)r(r−1)...1 possible selections of r sensors in an m-dimensionalenvironment. It has been shown that recovering the corresponding r-sparse signal fromr m observations is an NP-hard problem [122]. The PCA and CS methods have showntheir capability to recover high-dimensional signals from an unknown content. However,they are limited to the analysis of linear correlations between the low-dimensional rep-resentations and the high-dimensional space. Besides, the spatiotemporal pattern of thesignal can be obtained by extracting dominating features from the training set. Hence, ahigher reconstruction performance can be achieved by learning the nonlinear mappings inthe specific domain of interest.Typically, sparse sampling can reconstruct a full signal from a small subset of measure-ments. Denote the physical phenomenon of an Environment of Interest (EOI) as the highdimensional state φ ∈ Rm. The nonlinear spatiotemporal dynamics of φ can be captured bythe low dimensional observations for prediction and control. Hence, φ˜ can be reconstructedfrom its low-rank representations based on sparse sampling that optimizes the spatiotempo-ral field reconstruction:φ˜ = Gr(y), (5.1)where Gr : Rr → Rm is the reconstruction function, and y ∈ Rr is the sparsely sampleddata from φ. Therefore, the principal task of the proposed work is to design a measurement86matrix C ∈ Rr×m to sparsely sample critical observations from φ:y = C · φ, (5.2)where r m is the number of measurements.Substituting Equation 5.1 into Equation 5.2 yields:φ˜ = Gr(C · φ) = G(φ), (5.3)where G is the DL framework used for reconstructing the spatiotemporal field with sparselysampled measurements. The goal in this work is to optimize sensor placement locationswithin φ, which correspond to the measurement matrix C.5.2.1 Sparse Sensor Deployment AnalysisIn sparse sensor deployment, the spatiotemporal field is only observable at the locationsthat have deployed sensors. Then, a model is trained to reconstruct the entire spatiotemporalfield with limited observations by investigating their nonlinear mappings.A high-dimensional system φ ∈ Rm may be expressed as the linear/nonlinear combina-tions of r orthogonal eigenmodes. Hence, high-dimensional systems can be projected intothe lower-dimensional subspace. Then, high-dimensional system at time i, φi ∈ Rm, canbe represented according to Equation 5.1:φi ≈ φ˜i = Gr(yi), (5.4)where yi are the observations from the deployment locations and Gr is the time independentauto-reconstructor. The temporal coefficient yi varies with time i. Therefore, G can be canbe trained from the sampled data, given the measurements Φ = [φ1,φ2, . . .φM ] for M87snapshots:Φ˜ = G(Φ, θˆ), (5.5)where θˆ stands for the hyper-parameters.The model is trained to minimize the difference between the training data and the modeloutput:Φ∗ = arg minΦ˜∥∥∥Φ− Φ˜∥∥∥2, (5.6)where Φ∗ is the reconstructed signal and ‖·‖2 is the `2 norm. The model G that generatesthe reconstructed signal can be used for predicting the future spatiotemporal field, given thesampling data at that time.In this manner, the problem of sparse sensor deployment can be formulated as:min∑φ∈Φ∥∥∥φ− G(φ, θˆ)∥∥∥2,s.t. |C · φ|= r,(5.7)where |·| is the cardinality of the selected sampling locations.5.3 Sparse Sensor Deployment using Auto FieldReconstructorThe proposed approach incorporates a DL framework to learn spatiotemporal correla-tions in the training data, and determines the critical sampling locations that optimize thespatiotemporal field reconstruction. It is a model-free approach, which does not need tomodel the environment with prior knowledge.5.3.1 Sparse Sampling for ReconstructionIn this chapter, a high-dimensional signal is assumed to be reconstructed from a smallsubset of point observations. Therefore, the sensor deployment locations are optimizedto achieve better reconstruction performance. Denote the spatiotemporal field as an m-88dimensional space:φ = [φ1φ2 · · ·φm]T , (5.8)where φ· is a random variable that represents the sampling value at the corresponding loca-tion.Then, the measurement matrix C can be used to select the sampling locations:C = [eγ1eγ2 · · · eγr ]T , (5.9)where ej ∈ Rm are the canonical basis vectors.According to Equation 5.2, observations in the sensor placement locations can be ex-pressed as a linear combination of the canonical basis and the high dimensional input data:yi = eγiφSince matrix C is canonical, the observation y can be simplified as:y = [φγ1φγ2 · · ·φγr ]T , (5.10)where γ = {γ1, γ2, · · · , γr} ⊂ {1, 2, · · · ,m} is the set of indices for the selected sam-pling locations. Hence, the observations in y are downsampled from the m-dimensionalspatiotemporal field. Then, as y can be directly observed from the spatiotemporal field, thespatiotemporal field φ˜ can be reconstructed given the observations at the sampling locationsand the nonlinear mapping Gr.A schematic diagram for sparse sensor deployment is illustrated in Figure 5.1. Theobservations at the sampling locations y are selected by the measurement matrix C to guar-antee the best feasible reconstruction φ˜.The sensor deployment locations in the spatiotemporal field correspond to the sensorreadings in the input matrix φ. Hence, the sparse sensor deployment strategy is to compute89y C φ= ×Figure 5.1: Illustration of generating sparse sampling locations using a canonical ba-sis. Dark blue region in matrix C stands for 1, rest of the matrix contains 0;second, seventh and fifth row of φ is extracted according to C and compose y.the rows of φ that optimally represent the spatiotemporal correlation in the field. Gr shouldbe trained to learn spatiotemporal correlations between the observation y and φ.5.3.2 Deep Learning based Spatiotemporal Field ReconstructionAccording to Equation 5.1, input signal φ can be compressed as y = Cφ. Hence, theinput training dataset can be expressed as Φ and the corresponding compressed signal isY = [y1y2 . . .yM ]. Given a suitable canonical matrix C ∈ Rr×m, the input data matrix canbe compressed to key observations:Y = CΦ (5.11)Then, the training set for the proposed deep learning framework hasM pairs of the inputsignal and the corresponding observations in the selected locations:Strain = {(y1,φ1), (y2,φ2), · · · , (yM ,φM)}.The nonlinear signal reconstruction mapping is learned from this training set. Similarly,90the test set Stest contains several pairs of signals and observations.Traditional sparse reconstruction methods show that using greedy and iterative approachesprovide better performance by using the matrix-vector multiplication with the computa-tional cost of O(nr) [56]. Besides, fully connected layers suffer from the curse of dimen-sionality and consume vast system memory [123]. For example, the spatiotemporal fieldin the training data is a 360 × 180 image, and there are 44219 non-obstacle pixels in eachimage. Then, the weights between two fully connected layers will be stored in a 44219-by-44219 matrix using float 32, which consumes a massive amount of memory and it is hardto train. Specifically,memory consumption = 442192 ∗ 32 bits= 62570238752 bits= 7.82 GB.(5.12)Therefore, a stacked structure is utilized to save memory, where each layer of the DLframework has m inputs and r outputs or vice versa.Then, a multi-layer DNN is used to learn the signal reconstruction from limited samplingpoints. In the beginning, the input signal is compressed to r-dimension using a logicalmatrix with the canonical basis:y = C · φ. (5.13)Then, each hidden layer can be represented asxhi+1 = T (Whixhi + bhi), (5.14)where Whi ∈ Rm×r,bhi ∈ Rm or Whi ∈ Rr×m,bhi ∈ Rr, depending on the size of theinput data.Besides, x denotes the hidden neurons and T (·) is the element-wise nonlinear rectifierfunction. Note that xh0 is the input of the hidden layers, y. Then, the calculation for the91φy = C · φxhi+1 = T (Whixhi + bhi) ~φ = G(φ)Figure 5.2: The framework of the auto field reconstructor. Yellow nodes represent thecompression layers which optimize sparse sampling locations; blue nodes areneurons in the hidden layer for model fitting; red layer is the output layer.output layer is given by:φ˜ = T (Woxh + bo), (5.15)where Wo ∈ Rr×m and bo ∈ Rm are the weighting matrix and the bias of last hidden layer,respectively. As the size of the output layer must match the dimension of the environment,the number of hidden layers must be even.Denote hyper-parameters as θˆ = {C,Wh1 ,bh1 ,Wh2 ,bh2 , · · · ,Wo,bo}. The nonlinearmapping from the input training data φ to the output data φ˜ can be represented as φ˜ =G(φ, θˆ). The the mean squared error (MSE) is used as the loss function:L(G,C) = 1MM∑i=1∥∥∥G(φi, θˆ)− φi∥∥∥22(5.16)The framework of the proposed model is shown in Figure 5.2. Two-dimensional images92of the spatiotemporal field are reshaped into a one-dimensional vector at size m and fedinto the input layer. Then, r key sampling locations, y, are selected via the canonical mea-surement matrix C. Next, y is passed through the multi-layer stacked auto-reconstructor toreconstruct the spatiotemporal field (φ˜). All hidden layers are applied with nonlinear Rec-tified Linear Units (ReLU). Last, Adam optimization is utilized to compare the differencebetween φ˜ and φ, and minimize the loss.Once the training process minimizes the difference between the model output and theinput data, the auto field reconstructor learns the spatiotemporal correlation between theselected deployment locations and the entire spatiotemporal field. Hence, once the sensorsamples new data after the deployment, the entire spatiotemporal field can be reconstructed.Next, it is essential to optimize C matrix for sparse sampling. Finding the best samplinglocations helps to train optimal decoders for the best reconstruction performance. Algorithm5.1 explains how the sparse sampling locations are optimized during the model trainingprocess for reconstruction.Algorithm 5.1: SAFERInput: ΦOutput: G(·), C1 Initialization: set measurement matrix C as a random canonical matrix, andrandomly draw hyper-parameters θˆ from uniform distributions with a zero mean.;2 for 1 to T do3 Shuffle m input data;4 Use Adam optimization to minimize L(G,C);5 Cˆ = swap(C);6 if L(Gˆ, Cˆ) < L(G,C)||rand() < ρ then7 C = Cˆ;8 G = Gˆ;9 end10 endInitially, a random measurement matrix C is generated to randomly select the sensordeployment locations. Also, the hyper-parameters are initialized. Then, the training pro-cess iterates for T times to optimize the deployment strategy for the spatiotemporal field93reconstruction. In each iteration, m input data are shuffled to reduce the variance and en-sure the model to be general and has less over-fit. Line 4 minimize the loss according tothe model indicated in Figure 5.2. Line 5 to line 10 seek to find a better sensor deploymentstrategy in each iteration. Line 5 swaps one non-zero column in the matrix C with one zerocolumn to search for better sensor deployment strategies by introducing new observationsfor the model. Line 6 checks if the updated measurement matrix and the model will resultin a lower loss. If new observations can achieve better performance, the updated modelwill be able to provide overall better performance. This scheme not only achieves a bettermodel but also adopts a worse model according to a slowly decreasing probability ρ. Al-lowing worse solutions provides the capability to avoid local minima in simulated annealingapproaches [124].5.3.3 Lipschitz Continuity and Reconstruction Capability for theProposed FrameworkLipschitz continuity is a strong property of function continuity in mathematical analysis.It limits the change rate of the function and bounds the slope of the function within theLipschitz constant. Lipschitz continuity also guarantees that the function obeys the Cauchy-Lipschitz Theorem, which ensures the existence and uniqueness of solutions to certain first-order equations [125]. Definition 5.1 and Proposition 5.1 show that the proposed model Gis L-Lipschitz. Besides, Definition 5.2 and Lemma 5.1 show that r samples are sufficientfor spatiotemporal field reconstruction.The definition of Lipschitz continuity can be formulated as follows:Definition 5.1. A function, G : Rr → Rm , is Lipschitz continuous if there exists a positivereal constant L such that:‖G(x1)− G(x2)‖2 ≤ L ‖x1 − x2‖2 ,∀x1,x2 ∈ Rr, (5.17)where the value of L is the Lipschitz constant, and the function G is L-Lipschitz.94Then, as shown in Proposition 5.1, the Lipschitz continuity of the proposed network canbe proved layer-by-layer and the corresponding Lipschitz constant can be calculated.Proposition 5.1. G is L-Lipschitz.Proof. The Lipschitz constant of the proposed framework can be computed as a series offunction compositions:G(φ) = (G1 ◦ G2 ◦ · · ·)(x), (5.18)where Gi can be the linear operation between layers and activation function. Denote theLipschitz constant of the function as L(·). Then, Lipschitz constant of the functional com-positions can be expressed as:L(G) =∏L(Gi). (5.19)In this manner, the Lipschitz constant of G can be calculated layer by layer. Denote ahidden layer with r inputs asGh(φ) = Wx + b. (5.20)The Lipschitz constant is given by the spectral norm of the weighting matrix under the`2 norm [126, 127]. Substitute Equation 5.20 in to Equation 5.17:‖(Wx1 + b)− (Wx2 + b)‖2 ≤ L ‖x1 − x2‖2‖Wα‖2 ≤ L ‖α‖2‖Wα‖2‖α‖2≤ L,(5.21)where α = x1 − x2. Hence, the smallest Lipschitz constant can be derived as the operatornorm:L(Gh) = supα 6=0‖Wα‖2‖α‖2. (5.22)In the case of `2 norm, the operator norm can be computed by using the largest singular95value of the weighting matrix W [78, 126].The W is initialized with Gaussian random variables. Then WTW is its Wishart en-semble. The expectation of the largest eigenvalue of WTW is 4r according to [128–130].Hence, the largest singular value of W is the square root of WTW, which is 2√r. Thus,Gh(·) is Lipschitz continuous and its Lipschitz constant is 2√r. Similarly, the hidden layerwith m inputs has a Lipschitz constant of 2√m.The Lipschitz constant of the compression layer can also be calculated similarly. Thesmallest Lipschitz constant of the compression layer is the largest singular value of thelogical matrix with the canonical basis:L(Gc) = supα 6=0‖Cα‖2‖α‖2. (5.23)The singular values of the matrix C are the square roots of the eigenvalues of CTC.Denote λ as the eigenvalue of CTC and v = [v1v2 · · · vm]. Then:λv = CTCv= Ev,(5.24)where E = diag(e1, e2, · · · , em), ei ∈ {0, 1} is a diagonal identity matrix.Therefore,(λ− ei)vi = 0. (5.25)The eigenvalue of CTC is either 0 or 1, thus the Lipschitz constant of compress layer(largest singular value of matrix C) is L(Gc) = 1.The Lipschitz continuity of the activation function can be analyzed in a similar manner.In this chapter, ReLU is used as the activation function:ReLU(x) = max(0, x), (5.26)96whose derivative is:ReLU ′(x) =0, if x > 01, otherwise.(5.27)Hence, the maximum gradient of the ReLU function is 1, which indicates that ReLU isLipschitz continuous and the Lipschitz constant of ReLU is L(Gai) = 1.In this manner, the Lipschitz constant for the whole network can be derived accordingto Equation 5.19:L(G) =∏L(Ghi)×∏L(Gai)× L(Gc)=∏L(Ghi)= (4√mr)d/2,(5.28)where d is the number of hidden layers. Thus, the proposed model G follows Lipschitzcontinuity.Next, it is essential to show that the sparsely sampled observations are sufficient toreconstruct a spatiotemporal field. In sparse sampling and reconstruction, Restricted Eigen-value Condition (REC) is a sufficient condition for robust reconstruction. REC requires allsparse vectors φ to be far from the null-space of matrix C:‖Cφ‖ ≥ γ ‖φ‖ , (5.29)where γ is a constant greater than 0. It has been proved that the functions satisfying theREC is sufficient for reconstruct sparse signals, such as Lasso [131, 132] and Elastic Net[133].In this chapter, a restricted version of REC is proposed to fit the sparse sample recon-struction problem, called Canonical Restricted Eigenvalue Condition (CREC).Definition 5.2. Denoting G ∈ Rr, a canonical matrix C is said to satisfy CREC(G, γ, δ), if97∀x1, x2 ∈ S and ∃γ > 0, δ ≥ 0,‖C(x1 − x2)‖2 ≥ γ ‖x1 − x2‖2 − δ.The CREC definition restricts that for any two signals in G, the corresponding com-pressed observations should have at least the same significance as the original signal. There-fore, there is a possibility to recover the unknown vector from these observations. A canon-ical matrix C is used to perform sparse sampling for signal reconstruction rather than theiid random matrix used for signal compression in [77].Proposition 5.1 has shown that the proposed framework for signal reconstruction is L-Lipschitz. Then, following lemmas show that the canonical matrix sampling r observationswill satisfy the CREC for sparse signal reconstruction given Gr : Rr → Rm.Lemma 5.1. Denote B(r) = {y|y ∈ Rr} as the `2-norm ball in Rr. A random canonicalmatrix, C, satisfies CREC(Gr(B(r)), γ, δ), where γ > 0, δ > 0.Proof. Denote y ∈ Rr, where x = Gr(y), for any y,y′ ∈ B(r), the sparse signal recon-struction model is L-Lipschitz according to Definition 5.1. Hence,‖Gr(y)− Gr(y′)‖2 ≤ L ‖y − y′‖2 .Denote a net ζ to be the δL-net onB(r). Then, Gr(ζ) is the δ-cover of Gr(B(r)) accordingto the L-Lipschitz property.Then, ∀y,y′ ∈ B, ∃y1,y2 ∈ ζ , such that Gr(y1) and Gr(y2) are δ-close to Gr(y) andGr(y′), respectively. Therefore, according to triangle inequality,‖Gr(y)− Gr(y′)‖2 ≤‖Gr(y)− Gr(y1)‖2 +‖Gr(y1)− Gr(y2)‖2 + ‖Gr(y2)− Gr(y′)‖2≤‖Gr(y1)− Gr(y2)‖2 + 2δ(5.30)98Similarly,‖CGr(y1)−CGr(y2)‖2 ≤‖CGr(y1)−CGr(y)‖2 +‖CGr(y)−CGr(y′)‖2 +‖CGr(y′)−CGr(y2)‖2≤‖CGr(y)−CGr(y′)‖2 +O(δ).(5.31)Then, for samples in the δ-cover of Gr(B(r)),‖Gr(y1)− Gr(y2)‖2 =∥∥CTCGr(y1)−CTCGr(y2)∥∥2≤∥∥CT∥∥2‖CGr(y1)−CGr(y2)‖2 .(5.32)The `2 norm,∥∥CT∥∥2, is the largest singular value of CT according to Lemma 5.2.Hence,‖CGr(y1)−CGr(y2)‖2‖Gr(y1)− Gr(y2)‖2≥ 1‖CT‖2≥ 1‖CGr(y1)−CGr(y2)‖2 ≥1‖CT‖2‖Gr(y1)− Gr(y2)‖2 .(5.33)Next, substitute Equation 5.30 and Equation 5.31 into Equation 5.33:1‖CT‖ ‖Gr(y)− Gr(y′)‖ ≤ ‖CGr(y)−CGr(y′)‖+O(δ). (5.34)Thus, C satisfies CREC(Gr(B(r)), γ, δ), where γ = 1‖CT ‖ .Lemma 5.2. `2 norm of the canonical matrix C equals the largest singular value of C.Proof. `2 norm of a matrix can be expressed as the square root of the inner product of thematrix itself and its conjugate transpose:‖C‖2 =√C†C. (5.35)99Besides, the spectral norm of matrix C, ‖C‖2 can be represented as:‖C‖2 = sup‖x‖2=1‖Cx‖2 , (5.36)where x is a temporary variable that assists the proof in this lemma.Denote H = C†C, which is a Hermitian matrix. Then, iff (if and only if) a lineartransformation of Euclidean space E is Hermite, there exists an orthonormal basis of E thatcomprises all the eigenvectors of H. Let λ1, λ2, · · · , λm be the eigenvalues of Hermitianmatrix H, and e1, e2, · · · , em be the orthonormal basis of E.Also, let x = a1e1 + a2e2 + · · ·+ amem. The norm of x is :‖x‖ = 〈m∑i=1aiei,m∑i=1aiei〉 12 =√∑mi=1a2i = 1. (5.37)Hence,‖Cx‖2 = 〈Cx,Cx〉 = 〈x,C†Cx〉= 〈x,Hx〉=∑mi=1√λiai≤√λ∗,(5.38)where√λ∗ as the largest eigenvalue of matrix H. Therefore, ‖C‖2 ≤√λ∗.Besides,‖C‖2 ≥ 〈x,Hx〉 = 〈e∗, λ∗e∗〉 =√λ∗. (5.39)Thus, ‖C‖2 =√λ∗. Also, as the largest singular value of C is equal to the square rootof H, the `2 norm of the canonical matrix C equals the largest singular value of C.Therefore, Proposition 5.1 and Lemma 5.1 are proved, and they jointly confirm that the100proposed model G is L-Lipschitz and is sufficient to reconstruct a spatiotemporal field withr observations.5.4 Simulation ResultsThis section presents the conducted numerical simulations of SAFER using data fromNOAA. The proposed algorithm is compared with some state-of-the-art algorithms for re-constructing a global spatiotemporal field.5.4.1 Experimental SetupModel trainingBack-propagation is used to fine-tune the layer weights and biases for SAFER. Theproposed network is trained with ADAM optimizer [134] by using the default settings:β1 = 0.9, β2 = 0.999 and = 108. The learning rate is set to 0.001, and the mini-batchsize is set to 20. The input sequence is shuffled to avoid model overfitting and reducethe variance. The dataset is divided into a training set, and a test set according to theirsequence in the time series. The beginning part of the time series is regarded as the trainingset for learning the nonlinear map from the observations to the entire spatiotemporal field.The test set then composes the remaining snapshots of the spatiotemporal field in the timeseries. Hence, the training error indicates the reconstruction performance of the conductedalgorithm, while the test error indicates the prediction performance. In this manner, giventhe unknown future observations in the deployed location, the ability to predict the futurespatiotemporal field is examined.The simulation was conducted on a PC with 4.0 GHz Quad-Core CPU and 16 G mem-ory. The DL framework was implemented using the TensorFlow framework [111] and runwithout GPUs.101DatasetSimulations were conducted using two global climate datasets. The first one is the Na-tional Oceanic and Atmospheric Administration (NOAA) global sea surface temperaturedataset (SST) spanning from 1990 to 2018, which is publicly available online at [135]. TheSST dataset provides weekly global sea surface temperature means in 1.0-degree latitude×1.0-degree longitude global grid (180× 360) [136]. The second dataset is the NOAA’s PRE-Cipitation REConstruction Dataset (PREC) [137]. This dataset provides monthly globalprecipitation constructed on a 2.5-degree latitude × 2.5-degree longitude grid over theglobal (72 × 144) for the period from Jan. 1948 to Aug. 2018 [138].Following [4], the data for the first 16 years are selected as the training set, and theremaining data are used as the test set, for the SST dataset. As for PREC dataset, the first70% of spatiotemporal field scenarios are chosen as the training set, and the remaining 30%scenarios are used as the test set for performance evaluation.Evaluation MetricsThe proposed framework is evaluated by using two metrics, MSE@N , and VAR@N ,where N is the number of deployed sensor nodes. MSE@N is the Mean Square Error thatmeasures the average of the squares of the errors between predictions and the ground truth.It is calculated as:MSE@N =1m ·Mm∑i=1M∑j=1(Φij − Φ˜ij)2. (5.40)VAR@N calculates the mean variance of MSE for all snapshots of the spatiotemporalfield. Then, a low VAR@N indicates that the MSE of the predicting test snapshots will notchange significantly in most locations. Hence, the model can perform a more generalizedreconstruction and have a higher prediction ability.VAR@N =1mm∑i=1(1MM∑j=1(Φij − Φ˜ij)2 −MSE@N)2. (5.41)102Benchmark AlgorithmThe proposed algorithm, SAFER, is compared with the following benchmark algo-rithms:• SVHT. Singular value hard thresholding (SVHT) is a PCA-based approach that findsoptimal thresholds for the dimensions of the tailored SVD basis, guaranteeing min-imal signal reconstruction error [114]. The SVHT compression and reconstructionmethod reduces only the size of the SVD basis, and the resulting tailored SVD ba-sis can be used to represent a higher-dimensional space as a low-dimensional one.The reconstruction is based on the signal that is compressed via the tailored SVDbasis, using the left singular vectors and the corresponding orthogonal projections[139, 140]. Therefore, SVHT cannot find the sampling locations because it requiresthe state to be fully observable and encodes the original signal to another space witha lower dimension. Nevertheless, SVHT-based reconstruction is widely used in sta-tistical signal processing to reduce the size of an original signal. Hence, the result ofSVHT is regarded as the target performance.• Q-DEIM. QR factorization with the discrete empirical interpolation method (Q-DEIM)utilizes a greedy approximation solution provided by the matrix QR factorization withcolumn pivoting. The sensor deployment scheme of Q-DEIM seeks rows of Ψr, cor-responding to the point sensor locations in the spatiotemporal field that will optimallycondition the inversion of the measurement matrix [4, 68]. The canonical measure-ment matrix C is then extracted from the corresponding sampling locations of Ψr.• CS. Compressive sensing (CS) can efficiently compress and reconstruct a signal bysolving suitable underdetermined linear systems [57, 58, 141]. Similar to SAFER andQ-DEIM, the signal is compressed via a canonical sparse sensing matrix: y = C ·φ.• `2 Random. `2 Random is a baseline algorithm in which the sensor nodes are ran-domly deployed in the field. The reconstruction method of `2 Random is the same103as that of Q-DEIM, and the random seed for generating the sampling locations is setsame as the one used in CS. Hence, the sensors are deployed in the same locationsas in CS. The spatiotemporal field is reconstructed using the observations from theserandom locations and the information is learned from the training data.5.4.2 Training DetailsFigure 5.3 presents the reconstruction error in the SST dataset for both the training setand the test set during the model training. Four thresholds for sparse reconstruction areselected, including the optimal threshold for signal reconstruction, calculated by singularvalue hard thresholding (SVHT) [114]. The optimal hard threshold for singular value re-construction is 302, and the other thresholds are chosen as 100, 200, and 400. All selectedthresholds result in a rapid MSE decrease within a few iterations. All the final training errorsare less than 0.3, and the test errors are less than 0.45. The MSE variance during the train-ing process increases as more sensor nodes are deployed in the field. Note that 100 sparsesampling sensors are sufficient to achieve the lowest training error and the second-lowesttest error. Moreover, since increasing the number of sampling locations would not result ina significant reconstruction improvement in the test set, 100 sparse sampling locations aresufficient for sparse field reconstruction in this dataset.Figure 5.4 presents the training and test errors for the PREC dataset. The optimal hardthreshold, SVHT, is 206, and the other thresholds are selected as 50, 100, and 300. Fig-ure 5.4 (a) shows that the MSE of the training decreases steadily for all threshold values.Thresholds 206 and 300 achieve the lowest training error, which is close to 0.1, and the cor-responding test error, which is approximately 0.25, is significantly lower than the test errorfor the rest of the threshold values. Furthermore, the training variance is small for thresholds206 and 300. In these cases, deploying 206 sensors in the field can provide a highly accu-rate reconstruction for global precipitation reconstruction while using the smallest numberof sensor nodes.1040 5 10 15 20 25 30 35 4010 Epochs0.00.20.40.60.81.01.2MSENumber of sparse sampling sensors: 100Number of sparse sampling sensors: 200Number of sparse sampling sensors: 302Number of sparse sampling sensors: 400(a) Training error.0 5 10 15 20 25 30 35 4010 Epochs0.40.60.81.01.2MSENumber of sparse sampling sensors: 100Number of sparse sampling sensors: 200Number of sparse sampling sensors: 302Number of sparse sampling sensors: 400(b) Test error.Figure 5.3: Training using SST dataset with 100, 200, 302, 400 sparse sampling sen-sors.0 20 40 60 80 100Epochs0.10.20.30.40.50.60.70.8MSENumber of sparse sampling sensors: 50Number of sparse sampling sensors: 100Number of sparse sampling sensors: 206Number of sparse sampling sensors: 300(a) Training error.0 20 40 60 80 100Epochs0.20.30.40.50.60.70.8MSENumber of sparse sampling sensors: 50Number of sparse sampling sensors: 100Number of sparse sampling sensors: 206Number of sparse sampling sensors: 300(b) Test error.Figure 5.4: Training using PREC dataset with 50, 100, 206, 300 sparse sampling sen-sors.1055.4.3 Performance ComparisonSST DatasetFigure 5.5 presents the reconstruction results for the SST dataset with 100 sensor nodes,using the proposed and the benchmark algorithms. The black diamonds in this figure indi-cate the sensor deployment locations. The first snapshot of the SVHT and SAFER producesthe visibly best reconstruction when compared with the ground truth. Details of the groundtruth are fully reconstructed, and critical features of global SST, such as ocean current andEl Nin˜o, are captured. As shown in Figure 5.5 (f), SAFER tends to deploy sensor nodesnear the boundaries of different temperature levels.As sea surface temperature is profoundly affected by variations in solar radiation andthe natural properties of water, global SST is usually similar at the same latitude or near thesame ocean current. Hence, the discovered temperature boundary can be utilized to providemore information as the environment changes. In this manner, the SST distribution patternsand the sampling locations identified in the training data can provide sufficient informationfor nonlinear mapping, from limited observations to the entire spatiotemporal field. More-over, the features and sampling locations can better fit the test data that are generated in thefuture. Consequently, more information can be collected, and the reconstruction model canbe generalized to estimate the entire spatiotemporal field, given limited observations.Q-DEIM produces slightly worse reconstruction results than SAFER. The reconstructedSST increases significantly over the Indian Ocean. Nevertheless, the SST reconstruction forother oceans is successful. Note that the sensor deployment locations selected by Q-DEIMare close to the borders of continents, which may occur due to higher dynamics in coastalareas. However, although the information collected in coastal regions has a higher variancethan in blue water, it is difficult to generalize local information to a global situation. Instead,more observations in the oceanic area will help to produce a better reconstruction model. Incomparison, the proposed method, SAFER, learns more information by deploying sensor106051015202530(a) Ground truth051015202530(b) SVHT051015202530(c) Q-DEIM051015202530(d) `2 Random051015202530(e) CS051015202530(f) SAFERFigure 5.5: Ground truth of the testing spatiotemporal field and reconstruction fromthe proposed and the benchmark algorithms in SST dataset. (Black diamonds:sampling locations.)107nodes in a more distributed manner over the spatiotemporal field.CS generates less accurate reconstruction results than Q-DEIM. The temperature dis-tributions along the latitude are captured; however, the details of the ground truth are lost,which blurs the reconstructed field image. Information about inland bodies of water, suchas the Mediterranean Sea, the Black Sea, and the Great Lakes, is also lost. Moreover, asCS-based approaches do not provide a scheme to optimize the sensor deployment locations,the sparse sensing matrix is randomly chosen. Therefore, the reconstruction process mayhave higher variance if the selected sampling locations are ineffective.`2 Random produces the worst reconstruction results. The reconstructed SST is signifi-cantly higher than the ground truth in several oceans. As in CS, the sampling locations arerandomly selected; therefore, less valid observations are obtained. As a result, `2 Randomcannot provide satisfactory results in reconstruction and estimation.Figure 5.6 presents the prediction variance for all test sets, which examine the recon-struction variance for long-term observation and evaluate the ability to predict the futurespatiotemporal field. The presented variance is calculated pixel-wise and evaluates if thereconstruction MSE changes significantly for the prediction of all future time series gener-ations. Hence, a higher variance shown in this figure indicates a lower confidence for futureprediction.As shown in Figure 5.6 (a), SVHT achieves the lowest variance across the entire field.SAFER has the second-lowest estimation variance for the test data. Higher uncertainty inprediction occurs in some offshore areas. For example, the variance in the northern PacificOcean is high, which may be affected by the Aleutian Islands for the SST disturbance in-troduced by the archipelago. CS and Q-DEIM have higher variance in predicting futurespatiotemporal fields than SAFER. Their reconstruction and prediction models are not gen-eralized and are not robust for estimating environmental changes.Moreover, the selected sensor deployment locations cannot provide sufficient informa-tion for the reconstruction model. As a result, when the environment changes, the sensor10800.511.522.533.544.55(a) SVHT00.511.522.533.544.55(b) Q-DEIM00.511.522.533.544.55(c) `2 Random00.511.522.533.544.55(d) CS00.511.522.533.544.55(e) SAFERFigure 5.6: Reconstruction variance for all test snapshots in the SST dataset.109deployment strategies generated by these two approaches cannot perform well. `2 Randomhas the highest variance due to randomly deployed sensor nodes. Hence, CS, Q-DEIM and`2 Random are not suitable for sensor deployment or spatiotemporal field reconstruction.PREC DatasetFigure 5.7 presents the reconstruction results for the PREC dataset. Generally, as globalprecipitation is more complicated than sea surface temperature, the reconstruction resultsare less accurate for the PREC dataset than for the SST dataset. Similar to the results forthe SST dataset, SVHT and SAFER produce the best reconstruction results for the PRECdataset, while Q-DEIM produces the second-best reconstruction result. SAFER generatesa more generalized model than Q-DEIM. Although some features are lost, the spatiotem-poral field is accurately reconstructed. In contrast, Q-DEIM tends to over-fit the trainingdata. Hence, slight increases/decreases in precipitation may have a significant impact onthe reconstruction, causing a higher reconstruction MSE.As shown in Figure 5.7 (d)-(e), CS and `2 Random cannot produce effective reconstruc-tion for the PREC dataset. Significant disturbance and variance are introduced, and theoriginal spatiotemporal field can neither be reconstructed nor predicted.The variances in predicting future spatiotemporal fields for the PREC dataset are pre-sented in Figure 5.8. The results for `2 Random are excluded from this figure due to the highvariance of this approach in all the sampling locations. SAFER achieves the lowest overallvariance throughout the spatiotemporal field, which means that this approach has high ro-bustness for predicting a complex spatiotemporal field. SVHT and Q-DEIM have elevatedvariance for all test sets, and the variance in test snapshots changes dramatically. Moreover,future spatiotemporal field reconstruction and prediction will be significantly less accurate.Hence, although the reconstruction snapshots shown in Figure 5.7 (b) - (c) are satisfactory,SVHT and Q-DEIM are not suitable for long-term spatiotemporal field reconstruction andprediction. Similarly, CS performs poorly in future spatiotemporal field reconstruction and110-4-2024(a) Ground truth-4-2024(b) SVHT(c) Q-DEIM-4-2024(d) `2 Random-4-2024(e) CS-4-2024(f) SAFERFigure 5.7: Ground truth of the testing spatiotemporal field and reconstruction fromthe proposed and benchmark algorithms in the PREC dataset. (Blue diamonds:sampling locations.)012345(a) SVHT012345(b) Q-DEIM012345(c) CS012345(d) SAFERFigure 5.8: Reconstruction variance for all test snapshots in the PREC dataset.prediction.Table 5.1 summarizes the performance of the SST and the PREC datasets based on theevaluation metrics described in Section 5.4.1. In general, SAFER achieves the best recon-struction accuracy for long-term spatiotemporal field reconstruction and prediction in both111datasets, given limited observations. SAFER provides a performance superior to that of thebenchmark algorithms, concerning both reconstruction accuracy and model robustness. Theminimum reconstruction errors of SAFER in the SST and the PREC datasets are 0.393 and0.252, respectively. Also, these values that are close to the optimal reconstruction bound.The reconstruction accuracy is at least 61.49% higher than in the benchmark algorithms,and the model robustness in predicting long-term spatiotemporal fields is increased by atleast 33.85%. The lowest variance in predicting long-term precipitation data is obtained inSAFER. Hence, in reconstructing and predicting complex long-term spatiotemporal data,SAFER can produce a more generalized and robust model.Q-DEIM and CS generate the second-best results in reconstructing and predicting thespatiotemporal data, given few observations. Q-DEIM can achieve better performance ifthe number of sensor nodes is small since its mechanism attempts to extract the essentialinformation from the SVD basis. However, as the number of sensor nodes in the fieldincreases and more information is observed, CS gradually achieves better results.`2 Random produces the worst results with both datasets, as it generates the highestreconstruction error and variance.5.5 SummaryIn this chapter, a model-free sensor deployment strategy for spatiotemporal field recon-struction and prediction was developed. A novel DL framework was established to optimizesensor deployment locations for spatiotemporal field reconstruction and prediction. It wasproven in this chapter that the proposed model is Lipschitz continuous and that the sparsesampling strategy is sufficient for reconstruction. Simulation results were conducted usingtwo NOAA environmental datasets. The results showed that the proposed method couldoutperform the benchmark algorithms in both reconstruction accuracy and model robust-ness.1SVHT provides the optimal reconstruction MSE bound and is regarded as the target algorithm for recon-structing an encoded signal. Note that SVHT cannot reconstruct the signal given few observations.112Table 5.1: Comparison of performance using the two datasets.Dataset Metric SVHT1 Q-DEIM `2 Random CS SAFER Improv.SSTMSE@100 0.212 1.783 25.520 2.083 0.396 77.79%MSE@200 0.161 2.329 27.477 1.564 0.408 73.91%MSE@302 0.133 2.534 26.748 1.400 0.393 71.93%MSE@400 0.114 2.53 27.904 1.253 0.407 67.52%VAR@100 0.002 0.527 163.12 1.174 0.130 75.33%VAR@200 0.001 0.654 195.78 0.426 0.129 70.41%VAR@302 8.8e-04 1.049 155.66 0.250 0.106 57.60%VAR@400 6.6e-04 1.076 124.73 0.192 0.127 33.85%PRECMSE@50 0.174 0.857 19.213 1.216 0.330 61.49%MSE@100 0.119 0.981 23.400 1.013 0.281 71.36%MSE@206 0.076 1.147 30.260 1.019 0.258 74.68%MSE@300 0.059 1.424 33.822 0.900 0.252 72.00%VAR@50 0.002 0.193 185.36 0.550 4.6e-04 99.76%VAR@100 0.001 0.406 137.07 0.338 6.96e-05 99.98%VAR@206 4.3e-04 0.44 106.64 0.342 9.49e-05 99.97%VAR@300 2.7e-04 0.528 81.895 0.227 6.47e-05 99.97%113Chapter 6Hardware Implementation andExperimentation6.1 IntroductionThis chapter presents the hardware design and implementation of the environmentalmonitoring system, as well as the in situ experimentation carried out at the Yosef WoskReflecting Pool of the University of British Columbia in Vancouver, Canada. A prototypeunmanned surface vehicle (USV) is developed to take water data samples in the aquaticfield, which can model the environment for validating the proposed algorithms.Researchers have developed robotic sensors and sensor networks to monitor aquaticenvironments. However, the existing work has some drawbacks such as not using self-propelled monitoring platforms [87, 142], or has difficulties in long-term deployment inextensive areas [143, 144]. In this chapter, a fully automated and cost-effective USV isdeveloped for environmental monitoring.As shown in Figure 6.1, the USV consists of three groups of components: water qualitysensors, onboard controllers, and the mobile base of the USV. Figure 6.1 (a) shows thetop view of the USV; for signal transmission, the onboard controller and a GPS receiverare placed on top of the USV. Figures 6.1 (b)-(c) show the arrangement of the sensors and114(a) Top view. (b) Back view.(c) Front-right view. (d) Side view.Figure 6.1: The prototype of the developed USV.propellers; to take water data samples, the water quality sensors are placed around the USVwithout interfering with one another. Also, to power the sensors, a solar panel is used,which is placed on top of the USV.The USVs can explore the field of interest and sample data in a distributed manner. Thecollected data is then used to build the environmental model for information-based sensordeployment.6.2 Hardware Design6.2.1 Water Quality SensorsThe water quality sensors used in the hardware implementation are Libelium Plug &Sense devices. Five types of water quality sensors are adopted in the developed USV: pH115Table 6.1: Specifications for the water quality sensors.Sensor Sensing range AccuracypH value sensor 0 - 14 pH ±0.5 mVDO sensor 0 - 20 mg/L ±2%Temperature sensor 0 - 100◦C ±0.60Conductivity sensor N/A N/AORP sensor 0 - ±1999 mV N/Avalue, dissolved oxygen (DO), temperature, conductivity, and oxidation-reduction potential(ORP). These sensors can adequately carry out water quality monitoring. Sensor data isread via serial ports, using Arduino-based software 1. The specifications for the selectedsensors are given in Table 6.1 26.2.2 Onboard ControllersTThe onboard controller and the design of the autonomous navigation system of theUSV are all based on Pixhawk, an autopilot controller for small unmanned vehicles 3. TheUSV can be automatically navigated to selected locations, based on the GPS coordinatesand gyroscope data. In the beginning of the navigation process, a map of the field of interestis loaded, and obstacles are detected. The GPS receiver provides the current location of theUSV, and the pose of the USV is obtained from the gyroscope. The USV can then adjustitself during the navigation process according to the relative angle between the USV andthe target location. The feedback control loop of the USV navigation system is shown inFigure 6.2. As shown there, the USV can continuously adjust its movements according tothe localization information and through feedback of its relative position from the targetpoint.1https://www.arduino.cc/2http://www.libelium.com/downloads/documentation/smart water sensor board.pdf3http://pixhawk.org116OnboardControllerActuator onmobile baseFeedback fromtarget locationUSV movementGPSGyroscopeFigure 6.2: Feedback control of the USV.6.2.3 Mobile Base of USVThe mobile base is developed from the Blue Robotics T100 Thruster 4. To extend theUSV life, fewer propellers are used, and the water sealing and buoyancy of the USV areredesigned. A buoy is added to provide sufficient buoyancy for onboard equipment, such aswater quality sensors and the onboard controller. To prevent water damage to the electronicequipment, the onboard equipment is placed in a sealed enclosure, while the antennas arekept on top of the USV to transmit data.As the in situ experiments are carried out in an aquatic environment with a weak current,the USV is equipped with only two propellers. More propellers can be added to the USVto provide sufficient thrust, depending on the environment of a field test. The specificationsfor the propellers are given in Table 6.2 5Table 6.2: Specifications for the propeller.Item SpecificationMaximum Forward Thrust 2.36 kgfMaximum Reverse Thrust 1.85 kgfMinimum Thrust 0.01 kgfRotational Speed 300-4200 rev/min4https://www.bluerobotics.com/5https://docs.bluerobotics.com/thrusters/1176.3 In Situ Tests and Experimental Results6.3.1 Experiment Setup for In Situ TestsThe developed USV is used for in situ experiments carried out at the Yosef Wosk Re-flecting Pool. The framework for the conducted tests is shown in Figure 6.3. In the begin-ning of the experimental process, the USV explores the field of interest for a generalizedunderstanding of the environment. Then, based on the sampled data, the environment is spa-tiotemporally modeled according to the metrics introduced in Chapter 2. Next, information-based sampling strategies are executed to find near-optimal sampling locations, such as fieldestimation, reconstruction, and prediction. Finally, the sensor nodes (USVs) are deployedin the computed locations for long-term monitoring. The above process will be periodicallyexecuted to update the model and obtain better performance of the algorithm.Exploration of thefield of interest. Model ofenvironmentInformation-basedsampling strategiesSensor deploymentat key locationsPeriodicallyupdate themodelEstimation PredictionReconstructionFigure 6.3: The framework for in situ experiments.The in situ test collects five key water quality attributes (temperature, pH value, con-118ductivity, ORP, and DO) from five locations in the pool. Sampling locations are shown inFigure 6.4 (a), where an orange square denotes a sensor node location, blue area indicatesthe aquatic environment, and the gray area is the region (land or obstacle) that the mobilesensor nodes cannot reach. The area shown in Figure 6.4 (a) is 120 meters wide and 90meters long. Boundaries are drawn according to the rough GPS coordinates and turned intothe scale of one meter by selecting the top-left point as the origin. Figure 6.4 (b) presentsthe real-world experiment scenario. The proposed information-based sampling method as-sumes that each sensor node collects only one water quality attribute, and the spatiotemporalmodel is established based on the temperature sensor readings.(a) Sensor deployment locations. (b) Real-world deployment scenario.Figure 6.4: Sensor deployment for the in situ test.6.3.2 Experimental Results for Spatiotemporal Sensor Deploymentusing Minimal Sensor NodesIn this section, methods introduced in Chapter 2 are carried out based on the in situtest results. Figure 6.5 shows the results generated from four RRT-based algorithms and theenvironmental model generated from the real-world sampled data. Figure 6.5 (a) and Figure6.5 (b) are generated from two benchmark algorithms, and Figure 6.5 (c) and Figure 6.5(d) show the results generated by the proposed algorithm RRLR and RRLR Connectivity,respectively. The yellow square denotes the suitable sensor node deployment location; red119diamond blocks represent the redundant deployment locations, and the blue lines indicategraph G with several spanning trees.As shown in Figure 6.5 (a) and Figure 6.5 (b), both RRC and RRCstar require 12 sensornode deployment locations to achieve the minimum prior estimation error. However, somesensor nodes are too close, which can result in redundancy. They can be eliminated by usingthe RRLR algorithm, as shown in Figure 6.5 (c) and Figure 6.5 (d). Only eight sensor nodesare needed to be deployed in the geographic EOI when using RRLR to achieve minimumprior estimation error. Moreover, RRLR Connectivity not only reduces the redundant sam-pling locations but also maintains the network connectivity. If we remove the results of allredundant sensor nodes from RRLR as shown in Figure 6.5 (c), the wireless sensor networkwill be split into two parts, and the connectivity will be broken. RRLR Connectivity onlyremoves two redundant sampling locations to maintain both network connectivity and theminimal deployment location.Figures 6.6 (a) - (c) present the results generated by RS, GD4, and GD8, respectively.RS only searches a small part of the environment as it uses a randomized approach, whichmay not generate a near-optimal result. In contrast, GD4 and GD8 search the field suffi-ciently and achieves a lower estimation error as indicated in Figure 6.8. However, thesebenchmark methods require many deployment locations and may not be able to explore asufficient geographic EOI. Hence, the randomized approach cannot collect sufficient infor-mation from the field, and the sampling data redundancy is not investigated. Hence, theycannot guarantee a near-optimal spatiotemporal sensor deployment while using minimalsensor nodes.Figure 6.7 presents the field estimation uncertainty results when using the deploymentstrategies generated by the RRT-based algorithms. The field estimation uncertainty is cal-culated based on what is presented in Chapter 2. Estimation uncertainties of random searchmethods are ignored due to their poor performance in generating deployment strategies. AsFigure 6.7 shows, all four methods will generate reasonable estimation of the EOI since120Figure 6.5: Deployment results generated by RRT-based algorithms.most regions of the estimated field have low estimation uncertainty.Moreover, as Table 6.3 shows, the overall estimation uncertainty is very low over theEOI for all algorithms. Thus, both the proposed algorithm and the benchmark RRT-basedalgorithm will generate good deployment locations with low estimation uncertainty. Be-sides, the proposed algorithms use 33% and 25% fewer sensor nodes to achieve similarperformance.Table 6.3: Statistical measurement of the field estimation uncertainty.Method Max Min Variance # sensorsRRC 1.31 ∗ 10−118 0 2.55 ∗ 10−120 12RRCstar 1.02 ∗ 10−118 0 1.99 ∗ 10−120 12RRLR 8.39 ∗ 10−118 0 1.63 ∗ 10−119 8RRLR Connectivity 2.09 ∗ 10−118 0 4.08 ∗ 10−120 9Figure 6.8 presents the minimal prior estimation error found by the algorithms at each121Figure 6.6: Deployment results generated by random search-based algorithms.Figure 6.7: Field estimation uncertainty results of RRT-based algorithms.iteration. All results are monotonically decreasing as the minimum estimation error isrecorded in each iteration. The prior estimation error generated by the randomized methodsstays at a high level, while the one provided by the RRT-based algorithms decrease signif-icantly. RS algorithm produces the worst result since it is only able to search in a limitedarea. The greedy search-based approach improves the performance slightly by exploring122more parts of EOI. However, all these three algorithms can easily entrap at local minimawithout achieving a near-optimal solution.Figure 6.8: Change of minimum estimation error.According to Proposition 2.2, all four RRT-based algorithms provides near-optimal re-sults after 1000 iterations as the estimation error converges. Hence, the estimation error ofthe RRT-based algorithms converges with the number of iteration. In this manner, the in situtest confirms that the proposed method achieves the lowest estimation error while deployingminimal sensor nodes.6.4 SummaryIn this chapter, the developed prototype of the autonomous surface vehicle was de-scribed and used for the in situ experiments. An environmental model was built basedon the conducted in situ tests and used for evaluating the effectiveness of the proposedinformation-based sampling strategies. The results confirmed that the proposed methods123outperform the benchmark algorithms in both simulation tests and real-world tests.124Chapter 7Conclusions and Future Work7.1 ConclusionsIn this dissertation, several main challenges of informative sampling for spatiotemporalfield estimation, reconstruction, and prediction were investigated. Sampled data from thefield was analyzed so as to preserve the most informative sampling locations and minimizeinfrastructure costs. In this manner, the selected sampling locations were able to providesufficient information for various monitoring tasks such as field estimation, reconstruction,and prediction. Learning methods were also used to improve the model efficiency andperformance. In the present work, the actual sensor navigation cost is not considered, whichsimplifies the computations in the optimization algorithms.The first goal of this dissertation was to find the most informative sampling locationsin a field in order to minimize the infrastructure costs while maintaining a good estimationaccuracy of the field and preserving the network connectivity. To this end, a near-optimalsensor node deployment strategy was developed. The proposed method investigated thelinear dependency of the sampled data, given the environment model. Then, spatiotempo-rally correlated sampling locations could be eliminated while giving due consideration tonetwork connectivity. Moreover, it was proven in the dissertation that the proposed methodhad submodularity and generated a near-optimal sampling plan. The results demonstrated125that the proposed method required fewer sensor nodes than the benchmark algorithms toachieve a sufficiently low estimation error.Second, the computational complexity in finding the near-optimal sampling locationswas reduced. A DRL-based field exploration method was proposed to accelerate the pro-cess of finding the most informative sampling locations. Feedback from each sampling lo-cation in the field was studied to avoid unnecessary sampling locations, and a model-basedinformation gain was calculated to evaluate the effectiveness of each sampling location dur-ing the exploration. As a result, the proposed method, DRLT, was able to learn from thesampled locations and guide the robotic sensor to avoid unnecessary sampling locations.The sampling locations found by the DRL model were proven to be near-optimal. Simula-tions were conducted using an NCOM dataset. The proposed algorithm showed significantimprovement in both field estimation performance and planning efficiency.Then, the spatiotemporal field was reconstructed based on the observations from the se-lected near-optimal sampling locations. A model-based sensor deployment strategy was de-veloped to reconstruct the spatiotemporal field with limited observations. Statistical meth-ods were utilized to find linear mappings from the observations to the entire spatiotemporalfield. The sampling locations generated by the proposed method were proven to be near-optimal for spatiotemporal reconstruction. Moreover, the simulation results demonstratedthat the proposed information-based sampling methods provided the best reconstructionperformance when compared with the benchmark algorithms.However, model-based reconstruction methods find only the linear mappings from thesampling data to the overall spatiotemporal field. Some features of the original signal werelost when the spatiotemporal field became more complicated. Therefore, a model-free DL-based framework was developed to capture the nonlinear relationships between the observa-tions in the critical sampling locations and the spatiotemporal field. The sensor deploymentlocations were then optimized by a simulated annealing-based approach during the training.Hence, the new observations in the selected sampling locations could be used to reconstruct126and predict the entire spatiotemporal field, given the trained DL framework.It was proven as well that the proposed DL framework had Lipschitz continuity, andthat the generated sampling locations were sufficient for spatiotemporal reconstruction andprediction.Several USVs were developed for environmental monitoring, and in situ tests were con-ducted. The collected data samples were used to build an environmental model for aninformation-based sampling algorithm. The experimental results showed that the proposedalgorithm had superior performance in real-world scenarios when compared to the bench-mark algorithms.7.2 Future WorkIn possible future developments, multichannel sensor data analysis may be carried outto perform multi-object information sampling. In the present investigation, the proposedmethods and the state-of-the-art approaches focused only on detecting the most informativesampling locations for a single channel of data sampling. In future work, sensor fusion andmultichannel signal processing may be utilized to analyze various types of sensor data andto decide the optimal sampling locations for spatiotemporal field monitoring.In this dissertation, DL methods were utilized to investigate the optimal nonlinear spatialmappings from the observations to the entire spatiotemporal field. However, the analysis ofthe spatial and temporal information was separate. Time series data analysis methods maybe incorporated to more accurately predict the spatiotemporal field. Techniques such asrecurrent neural networks (RNNs) and long short-term memory (LSTM) may also be usedto learn temporal changes while learning the spatial mappings between the observations andthe overall spatiotemporal field [145–147]. In this manner, nonlinear spatial mappings canbe captured based on the temporal predictions. Consequently, the reconstruction precisionof the field will improve.The USV developed in this dissertation has limitations in carrying out experiments in127quite complex environments. Currently, the environment of interest is assumed to be known,and the obstacles are assumed to be static. Hence, fully autonomous exploration in an un-known environment with dynamic obstacles would not be feasible with the current USVs.Simultaneous localization and mapping (SLAM) techniques may be utilized to build a mapof an unknown environment while simultaneously enabling the USV to localize [148–150].RGB cameras and laser range finders may be employed to meet the requirements of SLAM,and to simultaneously enable the USV to avoid dynamic obstacles. Moreover, open-sourcerobotic middleware, such as robot operating systems (ROS) may be used for easier devel-opment and control of a USV [151].128Bibliography[1] H. Gupta, V. Navda, S. Das, and V. Chowdhary, “Efficient gathering of correlateddata in sensor networks,” ACM Transactions on Sensor Networks (TOSN), vol. 4,no. 1, p. 4, 2008. → pages 1, 6, 44[2] L. A. Villas, A. Boukerche, H. A. De Oliveira, R. B. De Araujo, and A. A. Loureiro,“A spatial correlation aware algorithm to perform efficient data collection in wirelesssensor networks,” Ad Hoc Networks, vol. 12, pp. 69–85, 2014. → pages 1, 6[3] X. Lan and M. Schwager, “Rapidly exploring random cycles: Persistent es-timation of spatiotemporal fields with multiple sensing robots,” IEEE Trans-actions on Robotics, vol. 32, no. 5, pp. 1230–1244, 2016. → pages1, 3, 8, 9, 20, 21, 46, 47, 50, 64, 65, 81, 85[4] K. Manohar, B. W. Brunton, J. N. Kutz, and S. L. Brunton, “Data-driven sparse sen-sor placement for reconstruction: Demonstrating the benefits of exploiting knownpatterns,” IEEE Control Systems, vol. 38, no. 3, pp. 63–86, 2018. → pages1, 4, 11, 73, 75, 102, 103[5] M. C. Vuran, O¨. B. Akan, and I. F. Akyildiz, “Spatio-temporal correlation: theoryand applications for wireless sensor networks,” Computer Networks, vol. 45, no. 3,pp. 245–259, 2004. → pages 2, 5, 44[6] S. Yoon and C. Shahabi, “The clustered aggregation (cag) technique leveraging spa-tial and temporal correlations in wireless sensor networks,” ACM Transactions onSensor Networks (TOSN), vol. 3, no. 1, p. 3, 2007. → pages 2, 5[7] F. Li, J. Luo, W. Wang, and Y. He, “Autonomous deployment for load balancingk-surface coverage in sensor networks,” IEEE Transactions on Wireless Communi-cations, vol. 14, no. 1, pp. 279–293, 2015. → page 2[8] M. Abo-Zahhad, S. M. Ahmed, N. Sabor, and S. Sasaki, “Utilisation of multi-objective immune deployment algorithm for coverage area maximisation with limitmobility in wireless sensors networks,” IET Wireless Sensor Systems, vol. 5, no. 5,pp. 250–261, 2015. → pages 2, 7[9] J. Chen, T. Li, T. Shu, and C. W. de Silva, “Rapidly-exploring tree with linear re-duction: A near-optimal approach for spatiotemporal sensor deployment in aquaticfields using minimal sensor nodes,” IEEE Sensors Journal, 2018. → pages 3, 9, 85129[10] A. Krause, A. Singh, and C. Guestrin, “Near-optimal sensor placements in gaus-sian processes: Theory, efficient algorithms and empirical studies,” Journal ofMachine Learning Research, vol. 9, no. Feb, pp. 235–284, 2008. → pages3, 8, 25, 44, 45, 52, 57, 58[11] M. A. Alsheikh, S. Lin, H.-P. Tan, and D. Niyato, “Area coverage under low sen-sor density,” in Sensing, Communication, and Networking (SECON), 2014 EleventhAnnual IEEE International Conference on, pp. 173–175, IEEE, 2014. → pages 3, 6[12] S. M. LaValle, “Rapidly-exploring random trees: A new tool for path planning,”1998. → page 3[13] M. S. Sivamurugan and B. Ravindran, “Rrtpi: Policy iteration on continuous do-mains using rapidly-exploring random trees,” in Robotics and Automation (ICRA),2014 IEEE International Conference on, pp. 4362–4367, IEEE, 2014. → pages4, 9, 45, 64, 81[14] L. A. Villas, A. Boukerche, D. L. Guidoni, H. A. De Oliveira, R. B. De Araujo, andA. A. Loureiro, “An energy-aware spatio-temporal correlation mechanism to performefficient data collection in wireless sensor networks,” Computer Communications,vol. 36, no. 9, pp. 1054–1066, 2013. → page 6[15] N. Ganganath, C.-T. Cheng, and K. T. Chi, “Distributed antiflocking algorithms fordynamic coverage of mobile sensor networks,” IEEE Transactions on Industrial In-formatics, vol. 12, no. 5, pp. 1795–1805, 2016. → page 6[16] N. Heo and P. K. Varshney, “A distributed self spreading algorithm for mobile wire-less sensor networks,” in Wireless Communications and Networking, 2003. WCNC2003. 2003 IEEE, vol. 3, pp. 1597–1602, IEEE, 2003. → page 6[17] N. Heo and P. K. Varshney, “Energy-efficient deployment of intelligent mobile sensornetworks,” IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systemsand Humans, vol. 35, no. 1, pp. 78–92, 2005. → page 6[18] M. Abo-Zahhad, S. M. Ahmed, N. Sabor, and S. Sasaki, “Rearrangement of mobilewireless sensor nodes for coverage maximization based on immune node deploymentalgorithm,” Computers & Electrical Engineering, vol. 43, pp. 76–89, 2015. → page7[19] M. Abo-Zahhad, N. Sabor, S. Sasaki, and S. M. Ahmed, “A centralized immune-voronoi deployment algorithm for coverage maximization and energy conservationin mobile wireless sensor networks,” Information Fusion, vol. 30, pp. 36–51, 2016.→ page 7[20] S. M. LaValle, Planning algorithms. Cambridge university press, 2006. → pages7, 9130[21] L. Zhao, W. Zhang, J. Hu, A. Abate, and C. J. Tomlin, “On the optimal solutions ofthe infinite-horizon linear sensor scheduling problem.,” IEEE Trans. Automat. Contr.,vol. 59, no. 10, pp. 2825–2830, 2014. → page 7[22] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion plan-ning with deterministic µ-calculus specifications,” in American Control Conference(ACC), 2012, pp. 735–742, IEEE, 2012. → page 7[23] C. Guestrin, A. Krause, and A. P. Singh, “Near-optimal sensor placements in gaus-sian processes,” in Proceedings of the 22nd international conference on Machinelearning, pp. 265–272, ACM, 2005. → pages 8, 44[24] N. Cressie, Statistics for spatial data. John Wiley & Sons, 2015. → pages 8, 44[25] W. Du, Z. Xing, M. Li, B. He, L. H. C. Chua, and H. Miao, “Optimal sensor place-ment and measurement of wind for water quality studies in urban reservoirs,” inInformation Processing in Sensor Networks, IPSN-14 Proceedings of the 13th Inter-national Symposium on, pp. 167–178, IEEE, 2014. → pages 8, 44[26] W. Du, Z. Xing, M. Li, B. He, L. H. C. Chua, and H. Miao, “Sensor placement andmeasurement of wind for water quality studies in urban reservoirs,” ACM Transac-tions on Sensor Networks (TOSN), vol. 11, no. 3, p. 41, 2015. → page 8[27] X. Lan and M. Schwager, “Planning periodic persistent monitoring trajectories forsensing robots in gaussian random fields,” in Robotics and Automation (ICRA), 2013IEEE International Conference on, pp. 2415–2420, IEEE, 2013. → page 8[28] X. Lan and M. Schwager, “A variational approach to trajectory planning for persistentmonitoring of spatiotemporal fields,” in American Control Conference (ACC), 2014,pp. 5627–5632, IEEE, 2014. → page 47[29] X. Lan and S. Di Cairano, “Continuous curvature path planning for semi-autonomousvehicle maneuvers using rrt,” in Control Conference (ECC), 2015 European,pp. 2360–2365, IEEE, 2015. → page 8[30] L. V. Nguyen, S. Kodagoda, R. Ranasinghe, and G. Dissanayake, “Mobile roboticwireless sensor networks for efficient spatial prediction,” in Intelligent Robots andSystems (IROS 2014), 2014 IEEE/RSJ International Conference on, pp. 1176–1181,IEEE, 2014. → page 8[31] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion plan-ning,” The international journal of robotics research, vol. 30, no. 7, pp. 846–894,2011. → page 9[32] R. Cui, Y. Li, and W. Yan, “Mutual information-based multi-auv path planning forscalar field sampling using multidimensional rrt,” IEEE Transactions on Systems,Man, and Cybernetics: Systems, vol. 46, no. 7, pp. 993–1004, 2016. → page 9131[33] C. Diuk, A. Cohen, and M. L. Littman, “An object-oriented representation for effi-cient reinforcement learning,” in Proceedings of the 25th international conference onMachine learning, pp. 240–247, ACM, 2008. → page 9[34] Y. Bengio et al., “Learning deep architectures for ai,” Foundations and trends R© inMachine Learning, vol. 2, no. 1, pp. 1–127, 2009.[35] C.-T. Law and J. I. Gold, “Reinforcement learning can account for associative andperceptual learning on a visual-decision task,” Nature neuroscience, vol. 12, no. 5,pp. 655–663, 2009.[36] S. Lange and M. Riedmiller, “Deep auto-encoder neural networks in reinforcementlearning,” in Neural Networks (IJCNN), The 2010 International Joint Conference on,pp. 1–8, IEEE, 2010.[37] M. Riedmiller, “Neural fitted q iteration-first experiences with a data efficient neuralreinforcement learning method,” in ECML, vol. 3720, pp. 317–328, Springer, 2005.[38] C. Liu, X. Xu, and D. Hu, “Multiobjective reinforcement learning: A comprehensiveoverview,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 45,no. 3, pp. 385–398, 2015. → page 9[39] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural networks,vol. 61, pp. 85–117, 2015. → page 9[40] Y. Li, “Deep reinforcement learning: An overview,” arXiv preprintarXiv:1701.07274, 2017. → page 9[41] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, andM. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprintarXiv:1312.5602, 2013. → page 9[42] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare,A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., “Human-level con-trol through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533,2015. → pages 9, 45, 61[43] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deepconvolutional neural networks,” in Advances in neural information processing sys-tems, pp. 1097–1105, 2012. → page 9[44] T. Mikolov, M. Karafia´t, L. Burget, J. Cernocky`, and S. Khudanpur, “Recurrent neu-ral network based language model.,” in Interspeech, vol. 2, p. 3, 2010. → page 9[45] D. Chicco, P. Sadowski, and P. Baldi, “Deep autoencoder neural networks for geneontology annotation predictions,” in Proceedings of the 5th ACM Conference onBioinformatics, Computational Biology, and Health Informatics, pp. 533–540, ACM,2014. → page 9132[46] M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, “The arcade learning en-vironment: An evaluation platform for general agents.,” J. Artif. Intell. Res.(JAIR),vol. 47, pp. 253–279, 2013. → page 9[47] M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena, “From perceptionto decision: A data-driven approach to end-to-end motion planning for autonomousground robots,” in Robotics and Automation (ICRA), 2017 IEEE International Con-ference on, pp. 1527–1533, IEEE, 2017. → page 10[48] K. Shiarlis, J. Messias, and S. Whiteson, “Rapidly exploring learning trees,” inRobotics and Automation (ICRA), 2017 IEEE International Conference on, pp. 1541–1548, IEEE, 2017. → page 10[49] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org. → page 10[50] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse mri: The application of compressedsensing for rapid mr imaging,” Magnetic Resonance in Medicine: An Official Journalof the International Society for Magnetic Resonance in Medicine, vol. 58, no. 6,pp. 1182–1195, 2007. → page 10[51] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, “Stacked de-noising autoencoders: Learning useful representations in a deep network with a lo-cal denoising criterion,” Journal of machine learning research, vol. 11, no. Dec,pp. 3371–3408, 2010. → page 10[52] H. C. Burger, C. J. Schuler, and S. Harmeling, “Image denoising: Can plain neu-ral networks compete with bm3d?,” in Computer Vision and Pattern Recognition(CVPR), 2012 IEEE Conference on, pp. 2392–2399, IEEE, 2012.[53] J. Xie, L. Xu, and E. Chen, “Image denoising and inpainting with deep neural net-works,” in Advances in neural information processing systems, pp. 341–349, 2012.→ page 10[54] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a deep convolutional network forimage super-resolution,” in European conference on computer vision, pp. 184–199,Springer, 2014. → page 10[55] C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convo-lutional networks,” IEEE transactions on pattern analysis and machine intelligence,vol. 38, no. 2, pp. 295–307, 2016. → page 10[56] A. Mousavi, A. B. Patel, and R. G. Baraniuk, “A deep learning approach to structuredsignal recovery,” in Communication, Control, and Computing (Allerton), 2015 53rdAnnual Allerton Conference on, pp. 1336–1343, IEEE, 2015. → pages 10, 12, 91[57] M. E. Tipping, “Sparse bayesian learning and the relevance vector machine,” Journalof machine learning research, vol. 1, no. Jun, pp. 211–244, 2001. → pages 10, 103133[58] Z. Zhang, T.-P. Jung, S. Makeig, Z. Pi, and B. D. Rao, “Spatiotemporal sparsebayesian learning with applications to compressed sensing of multichannel physio-logical signals,” IEEE transactions on neural systems and rehabilitation engineering,vol. 22, no. 6, pp. 1186–1197, 2014. → pages 10, 73, 103[59] B. W. Brunton, S. L. Brunton, J. L. Proctor, and J. N. Kutz, “Sparse sensor place-ment optimization for classification,” SIAM Journal on Applied Mathematics, vol. 76,no. 5, pp. 2099–2122, 2016. → page 10[60] X. Lu, W. Dong, P. Wang, G. Shi, and X. Xie, “Convcsnet: A convolutional compres-sive sensing framework based on deep learning,” arXiv preprint arXiv:1801.10342,2018. → page 10[61] W. Guo, K. Manohar, S. L. Brunton, and A. G. Banerjee, “Sparse-tda: Sparse real-ization of topological data analysis for multi-way classification,” IEEE Transactionson Knowledge and Data Engineering, vol. 30, no. 7, pp. 1403–1408, 2018. → page11[62] C. Lu and B. Jayaraman, “Interplay of sensor quantity, placement and systemdimensionality on energy sparse reconstruction of fluid flows,” arXiv preprintarXiv:1806.08428, 2018. → page 11[63] L. De Lathauwer, B. De Moor, and J. Vandewalle, “A multilinear singular valuedecomposition,” SIAM journal on Matrix Analysis and Applications, vol. 21, no. 4,pp. 1253–1278, 2000. → page 11[64] H. Zou, T. Hastie, and R. Tibshirani, “Sparse principal component analysis,” Journalof computational and graphical statistics, vol. 15, no. 2, pp. 265–286, 2006.[65] K. Sjo¨strand, L. H. Clemmensen, R. Larsen, G. Einarsson, and B. K. Ersbøll,“Spasm: A matlab toolbox for sparse statistical modeling,” Journal of StatisticalSoftware, vol. 84, no. 10, 2018. → page 11[66] R. Jenatton, G. Obozinski, and F. Bach, “Structured sparse principal component anal-ysis,” in Proceedings of the Thirteenth International Conference on Artificial Intelli-gence and Statistics, pp. 366–373, 2010. → page 11[67] M. Hein and T. Bu¨hler, “An inverse power method for nonlinear eigenproblems withapplications in 1-spectral clustering and sparse pca,” in Advances in Neural Informa-tion Processing Systems, pp. 847–855, 2010. → page 11[68] Z. Drmac and S. Gugercin, “A new selection operator for the discrete empirical in-terpolation method—improved a priori error bound and extensions,” SIAM Journalon Scientific Computing, vol. 38, no. 2, pp. A631–A648, 2016. → pages 11, 103[69] N. B. Erichson, P. Zeng, K. Manohar, S. L. Brunton, J. N. Kutz, and A. Y. Ar-avkin, “Sparse principal component analysis via variable projection,” arXiv preprintarXiv:1804.00341, 2018. → page 12134[70] J. Jankova´ and S. van de Geer, “De-biased sparse pca: Inference and testing foreigenstructure of large covariance matrices,” arXiv preprint arXiv:1801.10567, 2018.→ page 12[71] M. Mollenhauer, I. Schuster, S. Klus, and C. Schu¨tte, “Singular value decompositionof operators on reproducing kernel hilbert spaces,” arXiv preprint arXiv:1807.09331,2018. → page 12[72] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional networks,”2010. → page 12[73] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,”in European conference on computer vision, pp. 818–833, Springer, 2014. → page12[74] C.-W. Loh, Z.-Q. Qian, R. Zhang, Y.-H. Liu, D.-W. Cao, W. Wang, H.-B. Yang, andM. Qi, “Deep learning the effects of photon sensors on the event reconstruction per-formance in an antineutrino detector,” Advances in High Energy Physics, vol. 2018,2018. → page 12[75] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv preprintarXiv:1312.6114, 2013. → page 12[76] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair,A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neuralinformation processing systems, pp. 2672–2680, 2014. → page 13[77] A. Bora, A. Jalal, E. Price, and A. G. Dimakis, “Compressed sensing using generativemodels,” arXiv preprint arXiv:1703.03208, 2017. → pages 13, 98[78] H. Gouk, E. Frank, B. Pfahringer, and M. Cree, “Regularisation of neural networksby enforcing lipschitz continuity,” arXiv preprint arXiv:1804.04368, 2018. → pages13, 96[79] M. L. Stein, Interpolation of spatial data: some theory for kriging. Springer Science& Business Media, 2012. → pages 17, 19, 20[80] J. Corte´s, “Distributed kriged kalman filter for spatial estimation,” IEEE Transactionson Automatic Control, vol. 54, no. 12, pp. 2816–2827, 2009. → pages 20, 21[81] L. Ljung, System Identification Toolbox 7: Getting Started Guide. The MathWorks,2008. → page 20[82] “Estimate state-space model using subspace method - matlab n4sid.” https://www.mathworks.com/help/ident/ref/n4sid.html. Accessed: 2017-08-24. → pages 20, 63[83] “Naval oceanographic office regional navy coastal ocean model (ncom).”https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/navoceano-ncom-reg. Accessed: 2017-08-25. → pages 20, 63, 80135[84] U. D. of Commerce and N. N. C. for Environmental Information, “Coastal watertemperature guide.” http://www.nodc.noaa.gov/cwtg/all tmap.html, Jun 2018.[85] G. K. Rutledge, J. Alpert, and W. Ebisuzaki, “Nomads: A climate and weather modelarchive at the national oceanic and atmospheric administration,” Bulletin of the Amer-ican Meteorological Society, vol. 87, no. 3, pp. 327–341, 2006. → pages 20, 63[86] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Journalof basic Engineering, vol. 82, no. 1, pp. 35–45, 1960. → pages 21, 47[87] A. Singh, A. Krause, C. Guestrin, and W. J. Kaiser, “Efficient informative sensingusing multiple robots,” Journal of Artificial Intelligence Research, vol. 34, pp. 707–755, 2009. → pages 25, 44, 114[88] B. Fateh and M. Govindarasu, “Energy minimization by exploiting data redundancyin real-time wireless sensor networks,” Ad Hoc Networks, vol. 11, no. 6, pp. 1715–1731, 2013. → page 30[89] D. Ganesan, D. Estrin, and J. Heidemann, “Dimensions: Why do we need a new datahandling architecture for sensor networks?,” ACM SIGCOMM Computer Communi-cation Review, vol. 33, no. 1, pp. 143–148, 2003. → page 30[90] R. Jedermann, H. Paul, and W. Lang, “Compressed radio transmission of spatial fieldmeasurements by virtual sensors,” in Wireless for Space and Extreme Environments(WiSEE), 2016 IEEE International Conference on, pp. 184–189, IEEE, 2016. →page 35[91] H. M. La, W. Sheng, and J. Chen, “Cooperative and active sensing in mobile sen-sor networks for scalar field mapping,” IEEE Transactions on Systems, Man, andCybernetics: Systems, vol. 45, no. 1, pp. 1–12, 2015. → page 43[92] T. Li, M. Xia, J. Chen, Y. Zhao, and C. de Silva, “Automated water quality survey andevaluation using an iot platform with mobile sensor nodes,” Sensors, vol. 17, no. 8,p. 1735, 2017. → page 43[93] G. Hitz, E. Galceran, M.-E`. Garneau, F. Pomerleau, and R. Siegwart, “Adaptivecontinuous-space informative path planning for online environmental monitoring,”Journal of Field Robotics, vol. 34, no. 8, pp. 1427–1449, 2017. → page 44[94] A. Boubrima, W. Bechkit, and H. Rivano, “Optimal wsn deployment models forair pollution monitoring,” IEEE Transactions on Wireless Communications, vol. 16,no. 5, pp. 2723–2735, 2017. → page 44[95] C. Liu, K. Wu, and J. Pei, “An energy-efficient data collection framework for wire-less sensor networks by exploiting spatiotemporal correlation,” IEEE transactions onparallel and distributed systems, vol. 18, no. 7, 2007. → page 44136[96] E. Karasabun, I. Korpeoglu, and C. Aykanat, “Active node determination for corre-lated data gathering in wireless sensor networks,” Computer Networks, vol. 57, no. 5,pp. 1124–1138, 2013.[97] S. Temel, N. Unaldi, and O. Kaynak, “On deployment of wireless sensors on 3-d terrains to maximize sensing coverage by utilizing cat swarm optimization withwavelet transform,” IEEE Transactions on Systems, Man, and Cybernetics: Systems,vol. 44, no. 1, pp. 111–120, 2014. → page 44[98] L. V. Nguyen, S. Kodagoda, and R. Ranasinghe, “Spatial sensor selection via gaus-sian markov random fields,” IEEE Transactions on Systems, Man, and Cybernetics:Systems, vol. 46, no. 9, pp. 1226–1239, 2016. → page 44[99] K.-C. Ma, L. Liu, H. K. Heidarsson, and G. S. Sukhatme, “Data-driven learning andplanning for environmental sampling,” Journal of Field Robotics, 2017. → pages44, 64, 81[100] J. Binney, A. Krause, and G. S. Sukhatme, “Optimizing waypoints for monitor-ing spatiotemporal phenomena,” The International Journal of Robotics Research,vol. 32, no. 8, pp. 873–888, 2013. → page 45[101] C. W. De Silva, Sensor systems: Fundamentals and applications. Crc Press, 2016.→ page 47[102] R. S. Sutton, A. G. Barto, F. Bach, et al., Reinforcement learning: An introduction.MIT press, 1998. → page 48[103] J. Han, J. Pei, and M. Kamber, Data mining: concepts and techniques. Elsevier,2011. → page 48[104] T. Jaksch, R. Ortner, and P. Auer, “Near-optimal regret bounds for reinforcementlearning,” Journal of Machine Learning Research, vol. 11, no. Apr, pp. 1563–1600,2010. → page 49[105] L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, Reinforcement learning anddynamic programming using function approximators, vol. 39. CRC press, 2010. →page 49[106] M. Abo-Zahhad, N. Sabor, S. Sasaki, and S. M. Ahmed, “A centralized immune-voronoi deployment algorithm for coverage maximization and energy conservationin mobile wireless sensor networks,” Information Fusion, vol. 30, pp. 36–51, 2016.→ page 50[107] A. Krause and D. Golovin, “Submodular function maximization.,” 2014. → pages52, 78[108] R. A. Johnson and D. Wichern, Multivariate analysis. Wiley Online Library, 2002.→ page 57137[109] P. Ding, “On the conditional distribution of the multivariate t distribution,” The Amer-ican Statistician, vol. 70, no. 3, pp. 293–295, 2016. → page 57[110] H. Van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with doubleq-learning.,” in AAAI, vol. 16, pp. 2094–2100, 2016. → page 61[111] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado,A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Is-ard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane´, R. Monga,S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever,K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Vie´gas, O. Vinyals, P. Warden,M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machinelearning on heterogeneous systems,” 2015. Software available from tensorflow.org.→ pages 64, 101[112] Z. Zhang and B. D. Rao, “Sparse signal recovery with temporally correlated sourcevectors using sparse bayesian learning,” IEEE Journal of Selected Topics in SignalProcessing, vol. 5, no. 5, pp. 912–926, 2011. → page 73[113] C. Eckart and G. Young, “The approximation of one matrix by another of lowerrank,” Psychometrika, vol. 1, no. 3, pp. 211–218, 1936. → page 76[114] M. Gavish and D. L. Donoho, “The optimal hard threshold for singular values is4/√3,” IEEE Transactions on Information Theory, vol. 60, no. 8, pp. 5040–5053,2014. → pages 78, 80, 103, 104[115] T. Li, M. Xia, J. Chen, S. Gao, and C. W. de Silva, “A hexagonal grid-based sam-pling planner for aquatic environmental monitoring using unmanned surface vehi-cles,” in Systems, Man, and Cybernetics (SMC), 2017 IEEE International Conferenceon, pp. 3683–3688, IEEE, 2017. → page 85[116] L. V. Nguyen, S. Kodagoda, R. Ranasinghe, and G. Dissanayake, “Adaptive place-ment for mobile sensors in spatial prediction under locational errors,” IEEE SensorsJournal, vol. 17, no. 3, pp. 794–802, 2017. → page 85[117] K.-C. Ma, L. Liu, and G. S. Sukhatme, “Informative planning and online learningwith sparse gaussian processes,” in Robotics and Automation (ICRA), 2017 IEEEInternational Conference on, pp. 4292–4298, IEEE, 2017.[118] M. Dunbabin and A. Grinham, “Quantifying spatiotemporal greenhouse gas emis-sions using autonomous surface vehicles,” Journal of Field Robotics, vol. 34, no. 1,pp. 151–169, 2017.[119] S. Garg and N. Ayanian, “Persistent monitoring of stochastic spatio-temporal phe-nomena with a small team of robots.,” in Robotics: Science and Systems, 2014. →page 85138[120] B. J. Julian, S. Karaman, and D. Rus, “On mutual information-based control of rangesensing robots for mapping applications,” in 2013 IEEE/RSJ International Confer-ence on Intelligent Robots and Systems, pp. 5156–5163, Nov 2013. → page 85[121] B. J. Julian, S. Karaman, and D. Rus, “On mutual information-based control of rangesensing robots for mapping applications,” The International Journal of Robotics Re-search, vol. 33, no. 10, pp. 1375–1392, 2014. → page 85[122] D. Baron, M. F. Duarte, M. B. Wakin, S. Sarvotham, and R. G. Baraniuk, “Distributedcompressive sensing,” arXiv preprint arXiv:0901.3403, 2009. → page 86[123] R. Bellman, Dynamic programming. Courier Corporation, 2013. → page 91[124] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated anneal-ing,” science, vol. 220, no. 4598, pp. 671–680, 1983. → page 94[125] E. A. Coddington and N. Levinson, Theory of ordinary differential equations. TataMcGraw-Hill Education, 1955. → page 94[126] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization forgenerative adversarial networks,” arXiv preprint arXiv:1802.05957, 2018. → pages95, 96[127] B. Neyshabur, “Implicit regularization in deep learning,” arXiv preprintarXiv:1709.01953, 2017. → page 95[128] A. Soshnikov and Y. V. Fyodorov, “On the largest singular values of random matriceswith independent cauchy entries,” Journal of mathematical physics, vol. 46, no. 3,p. 033302, 2005. → page 96[129] P. Vivo, S. N. Majumdar, and O. Bohigas, “Large deviations of the maximum eigen-value in wishart random matrices,” Journal of Physics A: Mathematical and Theoret-ical, vol. 40, no. 16, p. 4317, 2007.[130] S. Mendelson, G. Paouris, et al., “On the singular values of random matrices,”Preprint, 2014. → page 96[131] F. Santosa and W. W. Symes, “Linear inversion of band-limited reflection seis-mograms,” SIAM Journal on Scientific and Statistical Computing, vol. 7, no. 4,pp. 1307–1330, 1986. → page 97[132] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of theRoyal Statistical Society. Series B (Methodological), pp. 267–288, 1996. → page97[133] H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,”Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67,no. 2, pp. 301–320, 2005. → page 97139[134] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXivpreprint arXiv:1412.6980, 2014. → page 101[135] NOAA, “Noaa optimum interpolation sea surface temperature v2.”https://www.esrl.noaa.gov/psd/data/gridded/data.noaa.oisst.v2.html. [AccessedAugust 3, 2018]. → page 102[136] R. W. Reynolds, N. A. Rayner, T. M. Smith, D. C. Stokes, and W. Wang, “An im-proved in situ and satellite sst analysis for climate,” Journal of climate, vol. 15,no. 13, pp. 1609–1625, 2002. → page 102[137] NOAA, “Noaa’s precipitation reconstruction dataset (prec).”https://www.esrl.noaa.gov/psd/data/gridded/data.prec.html. [Accessed Sep 29,2018]. → page 102[138] M. Chen, P. Xie, J. E. Janowiak, and P. A. Arkin, “Global land precipitation: A 50-yrmonthly analysis based on gauge observations,” Journal of Hydrometeorology, vol. 3,no. 3, pp. 249–266, 2002. → page 102[139] G. Berkooz, P. Holmes, and J. L. Lumley, “The proper orthogonal decompositionin the analysis of turbulent flows,” Annual review of fluid mechanics, vol. 25, no. 1,pp. 539–575, 1993. → page 103[140] P. Holmes, J. L. Lumley, G. Berkooz, and C. W. Rowley, Turbulence, coherent struc-tures, dynamical systems and symmetry. Cambridge university press, 2012. → page103[141] E. J. Cande`s and M. B. Wakin, “An introduction to compressive sampling,” IEEEsignal processing magazine, vol. 25, no. 2, pp. 21–30, 2008. → page 103[142] C. Alippi, R. Camplani, C. Galperti, and M. Roveri, “A robust, adaptive, solar-powered wsn framework for aquatic environmental monitoring,” IEEE Sensors Jour-nal, vol. 11, no. 1, pp. 45–55, 2011. → page 114[143] “Bluerov2 by blue robotics.” http://docs.bluerobotics.com/brov2/. Accessed: 2016.→ page 114[144] “Heron by clearpath robotics.” http://docs.bluerobotics.com/brov2/. Accessed: 2016.→ page 114[145] S. Haykin, Neural networks, vol. 2. Prentice hall New York, 1994. → page 127[146] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation,vol. 9, no. 8, pp. 1735–1780, 1997.[147] K. S. Tai, R. Socher, and C. D. Manning, “Improved semantic representa-tions from tree-structured long short-term memory networks,” arXiv preprintarXiv:1503.00075, 2015. → page 127140[148] H. Durrant-Whyte and T. Bailey, “Simultaneous localization and mapping: part i,”IEEE robotics & automation magazine, vol. 13, no. 2, pp. 99–110, 2006. → page128[149] T. Bailey and H. Durrant-Whyte, “Simultaneous localization and mapping (slam):Part ii,” IEEE Robotics & Automation Magazine, vol. 13, no. 3, pp. 108–117, 2006.[150] M. Montemerlo, S. Thrun, D. Koller, B. Wegbreit, et al., “Fastslam: A factored solu-tion to the simultaneous localization and mapping problem,” Aaai/iaai, vol. 593598,2002. → page 128[151] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, andA. Y. Ng, “Ros: an open-source robot operating system,” in ICRA workshop on opensource software, vol. 3, p. 5, Kobe, Japan, 2009. → page 128141
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Information-based sampling for spatiotemporal field...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Information-based sampling for spatiotemporal field estimation and reconstruction in environmental monitoring Chen, Jiahong 2019
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Information-based sampling for spatiotemporal field estimation and reconstruction in environmental monitoring |
Creator |
Chen, Jiahong |
Publisher | University of British Columbia |
Date Issued | 2019 |
Description | This dissertation addresses the near-optimal deployment problem of robot-sensory nodes in a spatiotemporal field. With limited resources, monitoring of a complex environment may face serious challenges in providing sufficient information for spatiotemporal signal estimation and reconstruction. It is therefore essential to retrieve most useful information from sampling locations while using a small number of sensor nodes. In this dissertation, three aspects are investigated to overcome the shortcomings of the existing information-based sampling methods. First, a sensor node deployment method is designed to find the minimum number of sensor deployment locations while achieving near-optimal field estimation error. To this end, a sampling-based field exploration method is used to find near-optimal sampling locations over an infinite horizon. Moreover, spatiotemporal correlations of the sampling data are studied to find redundant signals. The corresponding sampling locations of the redundant signals are eliminated concerning the network connectivity. Second, a deep reinforcement learning approach is proposed to accelerate the field exploration. Typically, field exploration methods are heavily dependent on random sampling, which has low efficiency. To avoid unnecessary or redundant sampling locations, observations from the sampling locations are utilized. Then a model-based information gain determination of the sampling locations is developed to evaluate the effectiveness of the approach. The proposed method can determine the informativeness of the spatiotemporal field by learning the information gain from the sampled area. The mobile sensory agents are then encouraged to take more samples in the area of higher information gain. Consequently, the spatiotemporal field can be efficiently explored. Moreover, the selected sampling locations can near-optimally reconstruct the spatiotemporal field using statistical methods. Third, a deep learning framework is designed to provide accurate reconstruction and prediction of the spatiotemporal field, using a limited number of observations. Nonlinear mapping from limited observations to the entire spatiotemporal field is needed in a sufficiently large spatiotemporal field. Hence, a deep learning method is proposed to extract sparse representations of the field and their nonlinear mappings. It is also proven that the proposed framework obeys Lipschitz continuity and that the observations collected by sparse representations are sufficient for spatiotemporal field reconstruction. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2020-05-31 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0377718 |
URI | http://hdl.handle.net/2429/69385 |
Degree |
Doctor of Philosophy - PhD |
Program |
Mechanical Engineering |
Affiliation |
Applied Science, Faculty of Mechanical Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2019-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2019_may_chen_jiahong.pdf [ 19.05MB ]
- Metadata
- JSON: 24-1.0377718.json
- JSON-LD: 24-1.0377718-ld.json
- RDF/XML (Pretty): 24-1.0377718-rdf.xml
- RDF/JSON: 24-1.0377718-rdf.json
- Turtle: 24-1.0377718-turtle.txt
- N-Triples: 24-1.0377718-rdf-ntriples.txt
- Original Record: 24-1.0377718-source.json
- Full Text
- 24-1.0377718-fulltext.txt
- Citation
- 24-1.0377718.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0377718/manifest