UBC Theses and Dissertations

Crop yield estimation in the Canadian Prairies: assessing the relative importance of scale, satellite and biophysical data

Gogoi, Jumi

Abstract

Crop production plays an important role in the Canadian Prairie economy. Crop yield models based on satellite data offer promise for obtaining consistent yield estimates across large regions. With the growing availability of observational yield data, mostly at coarser spatial scales, linear regression models have been a popular modeling tool in Canadian agriculture. Using historical crop yields from two spatial scales for key crops in the Canadian Prairies, this dissertation evaluates how to improve satellite-based yield estimation models by incorporating biophysical data and using new modeling algorithms. First, I developed satellite-based linear regression models for historical yield estimation in the Canadian Prairies (Alberta, Saskatchewan, Manitoba), trained on municipality-scale data. I tested vegetation indices derived from different satellite datasets (Terra-MODIS, Landsat 7 and 8, Sentinel-2). I found that a model using a single vegetation index could perform as well as a model trained on multiple vegetation indices, as long as the right index was identified for the purpose. This highlights the critical need to identify the target application and the corresponding sensor-specific vegetation index or indices. Second, using the same municipality-scale data, I investigated how adding weather data to a vegetation index-based model influences estimation accuracy. I also compared the performance of multiple linear regression with that of machine learning. I found that combining weather data with vegetation indices within a machine learning approach improved estimation accuracy by 3-5% across crops compared to linear regression using satellite data alone. Third, I developed a crop yield prediction model calibrated using sub-field-scale yields from one farm in the Canadian Prairies. I assessed the performance of machine learning and deep learning models and of different combinations of satellite and biophysical data. I found that Random Forests using optimal inputs improved prediction accuracy by 7-8% across crops compared to linear regression using only vegetation indices. Applying the optimal model in within-season forecasting mode showed that end-of-season yield forecasts stabilized within three months of lead time for canola and wheat. The findings of this dissertation demonstrate the potential of additional biophysical inputs and non-linear machine learning models for improving crop yield estimation and forecasting.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International