Accounting for preferential sampling in the statistical analysis of spatio-temporal data

UBC Theses and Dissertations

Watson, Joe

Abstract

Spatio-temporal statistical methods are widely used to model natural phenomena across both space and time. Example phenomena include the concentrations of airborne pollutants and the distributions of endangered species. A spatio-temporal process is said to have been preferentially sampled when the locations and/or times chosen to observe it depend stochastically on the values of the process at the chosen locations and/or times. When standard statistical methodologies are used, predictions of a preferentially sampled spatio-temporal process into unsampled regions and times may be severely biased. Preferential sampling within spatio-temporal data may be the rule rather than the exception in practice. The work presented in this dissertation addresses the issue of preferential sampling. We develop the first general framework for modelling preferential sampling in spatio-temporal data and apply it to historical UK black smoke measurements. We demonstrate that existing estimates of population-level black smoke exposures may be highly inaccurate due to preferential sampling. By leveraging the information contained in the chosen sampling locations, we can adjust estimates of black smoke exposure for the presence of preferential sampling. Next, we develop a fast, intuitive, powerful, and general test for preferential sampling. We provide a user-friendly R package that performs the test. We demonstrate its utility both in a thorough simulation study and by successfully replicating previously published results on preferential sampling. Finally, we adapt our ideas on preferential sampling to the setting of spatio-temporal point patterns. By considering the observed point pattern as a spatio-temporal thinned, marked log-Gaussian Cox process, we show that preferential sampling can be directly accounted for within the model. Under certain assumptions, the true distribution of locations can then be recovered.
Using these ideas, we develop a framework for combining multiple data sources to estimate the spatio-temporal distribution of an animal. We then apply our framework to estimate effort-corrected space-use of an endangered ecotype of killer whales. Ultimately, we hope that investigations into preferential sampling will become an essential component of spatio-temporal analyses, akin to model diagnostics. The methods developed in this dissertation are widely applicable, allowing researchers to routinely perform such investigations.
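
The definition of preferential sampling given in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's model or its R package; it is a toy one-dimensional example in Python, and the field shape, the parameter `beta`, and all function names are assumptions made purely for illustration. Sites are drawn with probability proportional to `exp(beta * value)`, so `beta = 0` corresponds to non-preferential sampling and `beta > 0` favours sites where the process is high.

```python
import math
import random


def simulate_field(n_sites, rng):
    """Toy 'pollution' surface on [0, 1): a sine wave plus small site-level noise."""
    xs = [i / n_sites for i in range(n_sites)]
    return [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.1) for x in xs]


def sample_sites(field, n_obs, beta, rng):
    """Draw site indices (with replacement) with probability proportional to
    exp(beta * value). beta is a hypothetical preference parameter:
    0 means uniform sampling, larger values mean stronger preference."""
    weights = [math.exp(beta * v) for v in field]
    return rng.choices(range(len(field)), weights=weights, k=n_obs)


def naive_mean(field, sites):
    """Plain average of the observed values, ignoring how sites were chosen."""
    return sum(field[i] for i in sites) / len(sites)


rng = random.Random(1)
field = simulate_field(2000, rng)
true_mean = sum(field) / len(field)

uniform_est = naive_mean(field, sample_sites(field, 500, 0.0, rng))
preferential_est = naive_mean(field, sample_sites(field, 500, 2.0, rng))

print(f"true mean:                {true_mean:.3f}")
print(f"uniform sampling estimate: {uniform_est:.3f}")
print(f"preferential estimate:     {preferential_est:.3f}")
```

Under these assumptions the naive average from the uniformly sampled sites should sit close to the true mean, while the average from the preferentially sampled sites should be noticeably too high. This is the kind of bias the abstract describes when standard methods are applied to preferentially sampled data.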

Item Metadata

Title	Accounting for preferential sampling in the statistical analysis of spatio-temporal data
Creator	Watson, Joe
Publisher	University of British Columbia
Date Issued	2020
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2020-12-16
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NoDerivatives 4.0 International
DOI	10.14288/1.0395329
URI	http://hdl.handle.net/2429/76831
Degree	Doctor of Philosophy - PhD
Program	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	2021-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nd/4.0/
Aggregated Source Repository	DSpace
