BIRS Workshop Lecture Videos



Statistical Inference for Massive Distributed Spatial Data Using Low-Rank Models
Katzfuss, Matthias


Due to rapid data growth, it is becoming increasingly infeasible to move massive datasets, and statistical analyses have to be carried out where the data reside. If several massive datasets stored in separate physical locations are all relevant to a given problem, the challenge is to obtain valid inference based on all data without moving the datasets. This distributed-data problem frequently arises in the geophysical and environmental sciences, for example when a spatial process of interest is measured by several satellite instruments. We show that for the widely used class of spatial low-rank models, which contain a component that can be written as a linear combination of spatial basis functions, computationally feasible spatial inference and prediction for massive distributed data can be carried out exactly and in parallel. The required number of floating-point operations is linear in the number of data points, while the required amount of communication does not depend on the data sizes at all. After discussing several extensions and special cases of this result, we apply our methodology to carry out spatio-temporal filtering inference on total precipitable water measured by three different sensor systems.
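
To make the parallel, communication-light structure concrete, here is a minimal sketch in Python with NumPy. It is not the speaker's implementation. It assumes the simplest low-rank setting: each dataset j follows y_j = B_j eta + noise, with basis weights eta ~ N(0, K) and independent Gaussian noise of known variance. The Gaussian basis functions, the function names (`local_summary`, `combine`, `basis`), and all numerical settings are hypothetical choices made for illustration. The models in the talk may contain further components that this sketch ignores.

```python
import numpy as np


def local_summary(B, y, noise_var):
    """Run where the data reside. Cost is O(n r^2) for n data points and
    r basis functions; the output size depends only on r, not on n."""
    R = B.T @ B / noise_var        # r x r information matrix
    gamma = B.T @ y / noise_var    # length-r information vector
    return R, gamma


def combine(summaries, K):
    """Run on a central node. Returns the posterior mean and covariance of
    the basis weights eta given all datasets (assumed model, see lead-in)."""
    R_total = sum(R for R, _ in summaries)
    gamma_total = sum(g for _, g in summaries)
    post_prec = np.linalg.inv(K) + R_total
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ gamma_total
    return post_mean, post_cov


def basis(s, centers, scale=0.2):
    """Hypothetical Gaussian radial basis functions on the unit interval."""
    return np.exp(-0.5 * ((s[:, None] - centers[None, :]) / scale) ** 2)


# Simulate three "sensor systems" with different sizes and noise levels.
rng = np.random.default_rng(0)
centers = np.linspace(0.0, 1.0, 5)
K = np.eye(len(centers))                 # assumed prior covariance of eta
eta_true = rng.normal(size=len(centers))

datasets = []
for n, noise_var in zip([200, 300, 150], [0.1, 0.2, 0.05]):
    s = rng.uniform(0.0, 1.0, size=n)
    B = basis(s, centers)
    y = B @ eta_true + rng.normal(scale=np.sqrt(noise_var), size=n)
    datasets.append((B, y, noise_var))

# Each summary could be computed on a separate server, in parallel.
summaries = [local_summary(B, y, v) for B, y, v in datasets]
post_mean, post_cov = combine(summaries, K)

# Reference: the same posterior mean computed with all data pooled together.
B_all = np.vstack([B for B, _, _ in datasets])
y_all = np.concatenate([y for _, y, _ in datasets])
v_all = np.concatenate([np.full(len(y), v) for _, y, v in datasets])
prec_all = np.linalg.inv(K) + B_all.T @ (B_all / v_all[:, None])
central_mean = np.linalg.solve(prec_all, B_all.T @ (y_all / v_all))

print(np.allclose(post_mean, central_mean))
```

In this sketch each server sends only an r x r matrix and a length-r vector, so the communication cost is the same whether a dataset has a hundred points or a billion. The final comparison checks that combining the summaries gives the same posterior mean as pooling all the data in one place, under the stated assumptions.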



Attribution-NonCommercial-NoDerivs 2.5 Canada