Open Collections
BIRS Workshop Lecture Videos
Statistical Inference for Massive Distributed Spatial Data Using Low-Rank Models
Katzfuss, Matthias
Description
Due to rapid data growth, it is becoming increasingly infeasible to move massive datasets, and statistical analyses have to be carried out where the data reside. If several massive datasets stored in separate physical locations are all relevant to a given problem, the challenge is to obtain valid inference based on all data without moving the datasets. This distributed data problem frequently arises in the geophysical and environmental sciences, for example when a spatial process of interest is measured by several satellite instruments. We show that for the widely used class of spatial low-rank models, which contain a component that can be written as a linear combination of spatial basis functions, computationally feasible spatial inference and prediction for massive distributed data can be carried out exactly and in parallel. The required number of floating-point operations is linear in the number of data points, while the required amount of communication does not depend on the data sizes at all.
After discussing several extensions and special cases of this result, we apply our methodology to carry out spatio-temporal filtering inference on total precipitable water measured by three different sensor systems.
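The idea in the description can be illustrated with a minimal sketch. It is not the speaker's implementation and rests on simplifying assumptions that do not come from the talk: the process is only a linear combination of r basis functions with weights eta, eta has a Gaussian prior with known precision, and each site has independent Gaussian noise with known variance. The names `local_summary`, `combine`, `basis`, and all numeric values are hypothetical. Each site reduces its data to an r x r matrix and an r-vector in one pass, so the work is linear in the local sample size and the communicated summary has a fixed size.

```python
import random

def local_summary(B, z, noise_var):
    """Run at one data site: reduce (B, z) to an r x r matrix and an r-vector."""
    r = len(B[0])
    A = [[0.0] * r for _ in range(r)]
    b = [0.0] * r
    for row, zi in zip(B, z):  # single pass over local data: O(n * r^2)
        for k in range(r):
            b[k] += row[k] * zi / noise_var
            for l in range(r):
                A[k][l] += row[k] * row[l] / noise_var
    return A, b

def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def combine(summaries, prior_precision):
    """Run at the central node: add the summaries, then solve for the posterior mean."""
    r = len(prior_precision)
    P = [list(row) for row in prior_precision]
    rhs = [0.0] * r
    for A, b in summaries:
        for k in range(r):
            rhs[k] += b[k]
            for l in range(r):
                P[k][l] += A[k][l]
    return solve(P, rhs)

def basis(s):
    # Hypothetical basis functions evaluated at location s: 1, s, s^2
    return [1.0, s, s * s]

def simulate_site(n, noise_var, rng, eta):
    # Synthetic data for illustration only
    B = [basis(rng.uniform(-1.0, 1.0)) for _ in range(n)]
    z = [sum(bk * ek for bk, ek in zip(row, eta)) + rng.gauss(0.0, noise_var ** 0.5)
         for row in B]
    return B, z, noise_var

rng = random.Random(0)
eta_true = [1.0, -2.0, 0.5]
sites = [simulate_site(n, v, rng, eta_true)
         for n, v in [(200, 0.1), (500, 0.2), (50, 0.05)]]
prior_precision = [[1.0 if k == l else 0.0 for l in range(3)] for k in range(3)]

# Distributed route: only the fixed-size summaries leave each site.
distributed = combine([local_summary(B, z, v) for B, z, v in sites], prior_precision)

# Pooled route for comparison: move all (noise-standardized) data to one place.
pooled_B = [[x / v ** 0.5 for x in row] for B, z, v in sites for row in B]
pooled_z = [zi / v ** 0.5 for B, z, v in sites for zi in z]
pooled = combine([local_summary(pooled_B, pooled_z, 1.0)], prior_precision)
```

Under these assumptions the two routes agree up to floating-point rounding, which is the sense in which the distributed computation is exact rather than approximate.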
Item Metadata
Title | Statistical Inference for Massive Distributed Spatial Data Using Low-Rank Models
Creator | Katzfuss, Matthias
Publisher | Banff International Research Station for Mathematical Innovation and Discovery
Date Issued | 2014-02-13
Extent | 40 minutes
Subject |
Type |
File Format | video/mp4
Language | eng
Notes | Author affiliation: Texas A&M University
Series |
Date Available | 2014-08-07
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI | 10.14288/1.0043894
URI |
Affiliation |
Peer Review Status | Unreviewed
Scholarly Level | Faculty
Rights URI |
Aggregated Source Repository | DSpace