Title: Detection of cropland field parcels from Landsat imagery 1 Authors: Jordan Graessera,*, Navin Ramankuttyb 2 Author details: 3 a Department of Geography, McGill University, Montreal, QC H3A2K6, Canada 4 b The Liu Institute for Global Issues and Institute for Resources, Environment, and Sustainability, 5 University of British Columbia, 6476 NW Marine Drive, Vancouver, BC V6T1Z2 Canada 6 Role of authors: JG contributed to ideas, work, writing, and stewardship, with an overall 7 estimated contribution of 80%. NR contributed to ideas, work, writing, and stewardship, with 8 an overall estimated contribution of 20%. 9 Conflict of interest: The authors declare no conflict of interest. 10 Role of the funding source: The funding source had no involvement in the research. 11 12 *corresponding author 13 *Email: graesser@bu.edu 14 *Permanent address: 60 Otis St., Apt. 1, Cambridge, MA, USA, 02141 15 *Revised Manuscript with no Changes HighlightedClick here to download Revised Manuscript with no Changes Highlighted: graesser_ramankutty_MANUSCRIPT.docxhttps://doi.org/10.1016/j.rse.2017.08.027Graesser, J., & Ramankutty, N. (2017). Detection of cropland field parcels from Landsat imagery. Remote sensing of environment, 201, 165-180. 2 Abstract 16 A slowdown in global agricultural expansion, spurred by land limitations, improved 17 technologies, and demand for specific crops has led to increased agricultural intensification. 18 While agricultural expansion has been heavily scrutinized, less attention has been paid to 19 changes within cropland systems. Here we present a method to detect individual cropland field 20 parcels from temporal Landsat imagery to improve cropland estimates and better depict the 21 scale of farming across South America. The methods consist of multi-spectral image edge 22 extraction and multi-scale contrast limited adaptive histogram equalization (CLAHE) and 23 adaptive thresholding using Landsat Surface Reflectance Climate Data Record (CDR) products. 24 We tested our methods across a South American region with approximately 82% of the 25 2000/2001 total cropland area, using a Landsat time series composite with a January 1, 2000 to 26 August 1, 2001 timeframe. A thematic accuracy assessment revealed an overall cropland f-score 27 of 91%, while an object-based assessment of 5,480 fields showed low geometric errors. The 28 results illustrate that Landsat time series can be used to accurately estimate cropland in South 29 America, and the low geometric errors of the per-parcel estimates highlight the applicability of 30 the proposed methods over a large area. Our approach offers a new technique of analyzing 31 agricultural changes across a broad geographic scale. By using multi-temporal Landsat imagery 32 with a semi-automatic field extraction approach, we can monitor within-agricultural changes at 33 a high degree of accuracy, and advance our understanding of regional agricultural expansion 34 and intensification dynamics across South America. 35 Keywords: cropland, field, Landsat, South America 36 37 3 1. Introduction 38 In the latter half of the 20th century, growing food demand was met through intensification of 39 agricultural production, while global agricultural expansion slowed down (Tilman et al., 2001). 40 Farmers raised productivity through increased application of inputs such as fertilizers, 41 herbicides, and pesticides, and by adopting modern plant varieties, mechanization, and new 42 farming techniques (Deininger & Byerlee, 2012; Matson et al., 1997). While reduced land 43 clearing for agriculture (Gibbs et al., 2010; Gibbs et al., 2015; Graesser et al., 2015) can 44 contribute greatly to biodiversity preservation and habitat conservation (Foley et al., 2005), 45 intensification can be environmentally harmful when inputs such as nitrogen and phosphorous 46 are mismanaged (Barrett et al, 2001; Tilman et al., 2001; Tilman et al., 2002). Thus, there is a 47 critical need for agricultural monitoring to assess the environmental implications of agro-48 industrialization and intensification. 49 Timely and consistent monitoring of agricultural intensification is challenging because 50 the availability of data that describe intensification over large areas is limited. For example, 51 agricultural censuses provide information about farm size, machinery, and fertilizers, but these 52 data lack the spatial and temporal resolution needed to consistently monitor detailed changes 53 over large areas. Remote sensing, however, offers a unique solution to this problem. Remote 54 sensing provides the capability to detect indicators of intensification, namely indicators of 55 physical agricultural characteristics. For example, agricultural morphology, i.e., field shape or 56 size, is observable with moderate- to high-resolution sensors, and would be an invaluable piece 57 of information for multiple reasons. Field size is important in order to understand farm 58 management practices such as crop diversity and rotation, and to assess tradeoffs between 59 4 agricultural scale and efficiency, biodiversity monitoring, landscape fragmentation, and 60 ecological diversity (Barrett et al., 2001; Fahrig et al., 2015). Field size is also complementary to 61 farm size. For example, if a farmer’s capacity to expand land holdings is limited, farm size 62 remains unchanged. But a farmer can still alter the landscape through changes in management 63 such as field enlargement. Therefore, field size can provide important information about the 64 planet’s rapidly changing agricultural systems that would not otherwise be captured with farm 65 size data from agricultural censuses and surveys. Fortunately, Landsat, one of the remote 66 sensing community’s long-standing pillars of global change monitoring, offers the geographic 67 and temporal coverage as well as the spatial resolution necessary to detect cropland field 68 parcels over large areas. The challenge is to exploit this vast resource and design practical and 69 robust methods to accurately depict field parcels, which will complement growing information 70 about agricultural expansion. 71 While the remote sensing community has remarkably improved the capacity to monitor 72 extensive land cover changes, particularly into forests (Hansen et al., 2008; Hansen et al., 2012; 73 Potapov et al., 2012), less has been accomplished to remotely depict land use intensification 74 such as field size changes. Part of the challenge is that traditional per-pixel based methods are 75 not suitable for understanding landscape shape and context. Instead, image processing 76 methods are essential to solving this problem. For example, contextual information, often 77 referred to as image texture, can provide useful information about the structure and 78 morphology of landscape context. This approach was used in combination with linear 79 regression to estimate field sizes at a continuous scale in Eastern Europe (Kuemmerle et al., 80 2009). Though computationally simple and shown to produce accurate estimates, the method 81 5 restricts the data output to large area units rather than individual field parcels. Similar to this 82 work, European-wide field size estimates were conducted from interpolation of survey data 83 (Kuemmerle et al., 2013). A very different approach from the previous studies made use of 84 crowdsourcing to rapidly produce many field size samples from satellite imagery (Fritz et al., 85 2015). By doing so, the authors produced, to our knowledge, the first and only global estimate 86 of field sizes, offering a first look at the major global patterns on the scale of food production. 87 However, the methodology does not provide wall-to-wall estimates, instead interpolating 88 between crowdsourced samples. Although somewhat expected in global studies, the result is 89 an over-generalization of field sizes because of categorical field size classes and assumptions 90 about field size patterns over interpolated space. Still, Argentina—particularly scrutinized 91 because of our paper’s regional focus—is a case in point of the limitations of this approach. 92 Only remnants of small fields (although ‘small’ is not explicitly defined) were estimated, when 93 in fact many small fields exist throughout the country, as we shall show later. 94 The limitations (reliance on third-party data, coarse field estimates) of the approaches 95 above warrant a solution that can produce wall-to-wall, large-scale estimates of individual 96 fields. Yan and Roy (2014) developed a novel procedure to detect individual parcels from multi-97 temporal Landsat imagery by combining image-processing techniques such as image 98 morphology and segmentation. The authors employed temporal Web-enabled Landsat Data 99 (WELD) (Roy et al., 2010) to extract fields over a five-year period in the United States and 100 presented the first large-scale estimate of individual fields. More recently, the authors reduced 101 the timeframe to one year, refined the methods, and applied their algorithm to the contiguous 102 US (Yan and Roy, 2016). Their approach, utilizing a combination of image processing methods, is 103 6 more promising than previous approaches. Another European-wide study illustrated the 104 potential for ‘field patch’ segmentation from satellite imagery (Weissteiner et al., 2016). 105 However, whereas the estimates of Yan and Roy (2016) were kept at the field level, Weissteiner 106 et al. (2016) aggregated their data to a much coarser scale than individual fields. 107 In this study, we present an image processing method to detect individual field parcels 108 from multi-temporal Landsat imagery, with some key differences from the Yan and Roy studies, 109 and with application over different agricultural landscapes across much of South America. 110 South America’s agricultural landscape has changed rapidly over the past several decades 111 (Berdegué and Fuentealba, 2011; Dros, 2004; FAO, 2015; Graesser et al., 2015; Martinelli, 112 2012). Better estimates of cropland and data that describe the nature of the agricultural 113 changes are needed in order to accurately monitor and understand the consequences of these 114 rapid changes. Field-size data, in particular, would greatly enhance the capacity to monitor the 115 scale of these changes. This study addresses two questions: 1) Is Landsat a reliable sensor for 116 cropland observations? and 2) Can individual field parcels be detected at the continental scale 117 from multi-temporal Landsat imagery over a broad range of crop types and field 118 configurations? To characterize cropland, we used all available Landsat scenes over a 1.5-year 119 period, from 2000 to 2001. We then estimated cropland at a parcel level using multi-temporal 120 Landsat imagery and contemporary edge-based methods, and tested the robustness of our 121 methods over a large and complex agricultural region of South America. 122 123 7 2. Data and study area 124 2.1 Study area 125 We identified individual crop field parcels across a selected region of South America, broadly 126 defined as cropland south of the Amazon River and north of Patagonia (Fig. 1). The test region 127 was chosen from Landsat scenes that intersected selected world ecoregions (Olson et al., 2001) 128 in order to include a wide range of crop types and landscapes. This selected region contained 129 approximately 82% of the 2001 South American cropland area (Graesser et al., 2015). Argentina 130 and Brazil comprise the majority of the agricultural land within the study region. Bolivia, 131 Paraguay, and Uruguay have substantially less agricultural land, but are important agricultural 132 economies. Chile’s agricultural landscape is perhaps the most diverse, with a wide range of 133 small, specialized production, to large-scale grain production. Of the countries included, only 134 Argentina, Paraguay, and Uruguay are completely within the study boundary. In contrast, the 135 area included of Bolivia, Brazil, and Chile generally comprises land outside of the Amazon and 136 Andes regions, but still includes a large portion of the respective country’s agricultural area. 137 8 138 Figure 1. South America test region for semi-automatic crop-field extraction. The study area (shown in black 139 outline) consists of Landsat scenes that intersect selected ecoregions. For field parcel validation, 10 km x 10 km 140 grids were generated to cover the test region. The grids were then restricted to those with >=10% cropland area 141 (shown in cyan). Finally, we randomly sampled 1,000 grids (shown in red) from this >=10% cropland grid to use for 142 crop field parcel validation. Inset A illustrates a larger-scale view of the sample grids. 143 2.2 Data and timeframe 144 The Landsat Surface Reflectance Climate Data Record (CDR) product is freely available by 145 request through the U.S. Geological Survey (USGS) Earth Resources Observation and Science 146 9 Center (EROS) Science Processing Architecture (ESPA) (http://espa.cr.usgs.gov). The USGS 147 creates the CDR product by converting Landsat data to surface reflectance using the 6S 148 radiative transfer model (Vermote et al., 1997). We acquired all Landsat 5 Thematic Mapper 149 (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) CDR scenes over the study area for 150 the 2000/2001 season that had 70% or less cloud cover. The temporal range of the coverage is 151 January 1, 2000 to August 1, 2001 and captures one full cycle of the agricultural growing season 152 in the study area. Each image comprises 6 reflective bands (1-5 and 7, specifically the blue 153 (0.45—0.52 µm), green (0.53—0.61 µm), red (0.63—0.69 µm), near-infrared (0.76—0.9 µm), 154 and mid-infrared (1.55—1.75 µm and 2.08—2.35 µm) wavelengths) and the Fmask atmospheric 155 mask with cloud, water, and shadow flags (Zhu & Woodcock, 2012). 156 Our objective in this study was to test the field extraction methods over a large 157 geographic space. We could have chosen any timeframe to apply the methods. We used 158 imagery from 2000 and 2001 because there was a plentiful supply of Landsat data and it 159 roughly represents the starting point from which agriculture began to expand rapidly in South 160 America (FAO 2015). 161 162 3. Methods 163 The proposed methodology uses Landsat satellite time-series composites to detect crop field 164 parcels over an agricultural growing season, using a semi-automatic approach. The generalized 165 workflow is shown in Figure 2 and broadly defined as: 1) image time-series computation over 166 one agricultural growing season; 2) multi-scale contrast limited adaptive histogram equalization 167 (CLAHE) and adaptive thresholding (ATh); 3) morphological edge cleaning; and 4) delineation of 168 10 individual cropland field parcels. The details are described in the following sub-sections and 169 explain the semi-automatic procedure to extract land-cover object edges and label individual 170 crop fields. 171 172 Figure 2. The general workflow of Landsat processing, cropland classification, and cropland field parcel detection, 173 with the main components: A) Landsat pre-processing; B) time-series computation; C) land-cover object 174 extraction; and D) cropland and field parcel intersection. 175 11 3.1 Field definition 176 None of the literature reviewed clearly defined an agricultural field. The definition of a crop 177 field may vary by disciplines and research questions. For some, a field may represent land 178 tenure, which is related to rules of property rights allocation rather than how the plot of land is 179 managed. But tenure is more closely associated to farm size rather than field size. Here we 180 defined fields as representing individual crops planted throughout a year (Fig. 3). In Figure 3 we 181 illustrate one plot of land under the same ownership over one year, with three crops planted 182 during the first phase of the rotation and two crops planted during the second, with a double 183 crop of soy replacing wheat. Although the land tenure did not change during the year, the type 184 and number of crops did. The objective of this study is to identify the highest parcel-level 185 resolution at which crops were cultivated during a growing season, illustrated by the three 186 crops in the first half of the rotation. 187 188 Figure 3. Illustrative diagram of field sizes under one tenure system. The left panel displays three crops that were 189 grown during the first phase of the crop rotation. The following rotation phase brought a double crop of soybeans 190 (wheat to soybean rotation). This study considers the highest field resolution of cultivated crops throughout the 191 example season to be 3. 192 12 3.2 Landsat pre-processing 193 Prior to field extraction, we processed Landsat imagery in the following steps: 1) climate 194 surface reflectance (CDR) imagery were acquired from USGS; and for each scene we 2) 195 normalized cross-track surface reflectance variations; 3) computed spectral indices; and 4) 196 computed temporal descriptive statistics from step 3. 197 198 3.2.1 Radiometric adjustments 199 Radiometric normalization is desirable to remove scene-to-scene variations and allow for 200 standardized land cover sampling across scenes. Atmospheric conditions and surface 201 inconsistencies impede image normalization and standard classification models, particularly 202 over large areas. Thus, radiometric adjustments are necessary to account for varying 203 atmospheric conditions (Flood et al., 2013). The 6S radiative transfer model, used to produce 204 the CDR product, converts Landsat data to atmospherically corrected surface reflectance. 205 However, the CDR products are not corrected for cross-track degradation, an issue observed in 206 Landsat imagery (Hansen and Loveland, 2012; Toivonen et al., 2006), and observed in the bulk 207 of the scenes covering our study region. We used the MODerate-resolution Imaging 208 Spectroradiometer (MODIS) bidirectional distribution function (BRDF) satellite product 209 (MCD43A4) to normalize the CDR cross-track spectral response. First, for each pixel we 210 computed the two-year (2000 and 2001) median of the MODIS BRDF MCD43A4 product 211 (hereafter, referred to as MODMED). Following methods similar to Gao et al. (2011) and 212 Potapov et al. (2012), we fit robust linear functions of each individual CDR band against the 213 matching MODMED wavelength. Each Landsat band was downsampled to 500 m to match the 214 13 spatial resolution of the MODMED composites. Then the Landsat values were subtracted from 215 MODMED (MODMED minus Landsat reflectance bias) for each MODMED-Landsat paired band. 216 Starting in the western scan edge all values within a 40 km-wide (i.e., 0 km to 40 km from scene 217 edge) column were extracted. Any cloud or shadow pixels in the CDR Fmask, or poor-quality 218 pixels in the MODIS MCD43A2 quality layer, were removed. We then used the mean of the 219 remaining MODMED-Landsat bias pixels. The mean column bias ( ) and the distance (km) from 220 the western scene edge to the column center ( ) were recorded. The next 40 km-wide column 221 was selected by moving the column selection 20 km east of the current scan column, creating 222 columns with 20 km overlaps. This process was repeated until the eastern scene edge was 223 reached. Lastly, the set of bias values ( ) and distance to column centers ( ) were regressed 224 with a Theil Sen linear regression (Theil, 1992). As mentioned, we used the CDR Fmask and 225 MODIS MCD43A2 quality products to mask “bad” pixels. However, the Theil Sen model was 226 used in order to address any misclassifications (extreme outliers) in the mask products. Finally, 227 the predicted bias was applied to the original 30 m CDR band to create the final adjusted band. 228 An illustration of the procedure is shown in Figure 4. Scenes where we could not extract 229 sufficient cross-track column samples were adjusted using a linear function of all clear 230 MODMED pixels ( ) to estimate CDR surface reflectance ( ) (i.e., , with no cross-231 track adjustments). The cross-track method improved the scene-to-scene normalization after 232 visually comparing the results to the CDR bands (i.e., with no cross-track adjustments) for the 233 study region. 234 14 235 Figure 4. Landsat cross-track adjustment using the 500 m MODIS Bidirectional Reflectance Distribution Function 236 product-derived MODMED. Starting at the western scene edge, Landsat and MODMED pixel values are extracted 237 from a km-wide column ( =40). After removing atmospherically contaminated pixels, the mean of the 238 remaining column (C1) pixels is computed and stored in the y vector. For the X vector, the distance from the scene 239 edge to the center of C1 is recorded. The column is then shifted half the width of (20 km), overlapping the first 240 column, and the process is repeated until the eastern scene edge is reached. The y and X vectors are used to 241 normalize the image. 242 243 3.2.2 Spectral transformations 244 The timing of crop planting and harvesting can vary from field to field and over large regions. 245 Therefore, we used multiple spectral transformations to exploit the complete spectrum of 246 different crops. We used descriptive statistics computed from temporal composites of spectral 247 15 indices—instead of the CDR spectral bands—for image edge extraction and classification. 248 Initially, we computed five vegetation spectral indices from each CDR image: the normalized 249 difference vegetation index (NDVI), the enhanced vegetation index (EVI2), the green normalized 250 difference vegetation index (GNDVI), the normalized difference soil index (NDSI), and the 251 normalized difference bareness index (NDBaI) (Table 1). Each spectral index image array and its 252 image information were stored in a HDF5 (Hierarchical Data Format) database (Alted & Vilata, 253 2002--) for efficient storage and fast query indexing. 254 255 Table 1. Spectral vegetation transformations. 256 Vegetation Index Equation Reference NDVI Tucker, 1979 EVI2 Jiang et al., 2008 NDSI Rogers & Kearney, 2004 NDBaI Zhao & Chen, 2005 GNDVI Gitelson et al., 1996 257 3.2.3 Time series composites 258 Seasonal satellite composites are necessary to accurately estimate row-crop agriculture 259 because cropland typically produces a unique spectral response if measured over an entire 260 growing season. Landsat time series compositing methods have become more commonplace 261 since the opening of the Landsat archive in 2008 (Woodcock et al., 2008), which allowed 262 researchers to develop composite datasets at a higher spatial resolution than previously 263 16 possible. Compositing methods range from the selection of single statistics over a year or multi-264 year timespan (Maxwell and Sylvester, 2012; Yan and Roy, 2014; Yan and Roy, 2016), ‘best 265 available pixel’ selection based on probability scoring functions over several years (Griffiths et 266 al., 2013; White et al., 2014), spatial and temporal gap-free (proxy value) composites 267 (Hermosilla et al., 2015), atmospheric coverage rules (Roy et al., 2010), preferential seasonal 268 selection (Potapov et al., 2012), synthetic time series generation (Schmidt et al., 2016), and 269 descriptive statistics of spectral indices and transformations from an image time series 270 (Gebhardt et al., 2014). Regardless of the compositing technique, pixel-based compositing 271 offers advantages over previous scene-based approaches, particularly by the inclusion of 272 (potentially) thousands of cloud-free samples that were otherwise excluded in scene-based 273 criteria. In this study, we used pixel-based compositing to produce radiometrically consistent 274 data across the study region. The composites were then used to estimate cropland and extract 275 individual field parcels across the study area. 276 To compute a time series composite for a single Landsat path/row, we automatically 277 checked each image in the path/row for: 1) extents larger than two standard deviations from 278 the path/row mean extent size (indicating large image shifts); and 2) images with valid data 279 (i.e., non-background) totals smaller than two standard deviations from the path/row mean 280 valid data count. If any of the two checks were true, then the image was discarded from further 281 use. A third, visual, check was performed to identify mis-registered images (pixel shifts as 282 opposed to large image offsets) post time-series compositing, and automatically adjusted using 283 cross-correlation (Guizar-Sicairos et al., 2008). If a path/row time-series composite was found 284 to contain a mis-registered image, we cross-checked the mis-registered image in the path/row 285 17 for alignment against the nearest Julian day reference image with sufficient clear observations, 286 and recorded the and coordinate shifts (derived from the cross-correlation procedure) 287 against the nearest reference image. The and coordinate shift was inverted to apply the 288 shift adjustment. To limit atmospheric contamination in the cross-correlation process, test and 289 reference images were scanned for clear image blocks. The image was scanned for blocks with 290 99% clear data, starting with the largest square block that fit an image’s valid data area. The 291 block size was reduced by 64 pixels on each side and the image was re-scanned until an 292 uncontaminated block was found. The final set of images for the path/row was then used for 293 time series calculations. 294 Next, the images were sorted by year and Julian day, and cloud and shadow pixels were 295 flagged using the CDR Fmask at each acquisition date. Missing observations in each spectral 296 transformation were interpolated using the flagged pixels and linear interpolation following: 297 (1) where, and are index positions around the position to be interpolated, , and and 298 are the values around the day to be interpolated, , at pixel location . After interpolation, 299 we applied a second order Savitsky-Golay filter (Savitsky & Golay, 1964) to smooth the time 300 series at each pixel. We then computed 10 time-series statistics, namely the minimum, 301 maximum, mean, coefficient of variation (CV), 25th, 50th, and 75th percentiles, maximum slope, 302 minimum slope, and day of maximum value. The maximum and minimum slopes were 303 computed with a moving window ( =7). The resulting set of variables was: 304 18 where, is the set of annual statistics for image , and the superscript denotes the number of 305 time series statistics computed. Thus, there were fifty statistical layers for each path/row time 306 series. 307 308 3.3 Edge and object extraction overview 309 The large range of field sizes in South America prevents the use of global thresholds on dense 310 concentrations of small fields (generally less than 2 ha), particularly at the Landsat spatial scale 311 of 30 m. Here, small field size is relative to the study region, where farm and field sizes are 312 larger than what is found in regions such as sub-Saharan Africa and South and Southeast Asia. 313 High global edge-thresholds that appeal to large fields with strong edges tend to under-314 segment small, dense field clusters. Alternatively, lower edge-thresholds that address dense 315 regions with very small fields over-segment large fields. Our solution was to apply a series of 316 multi-directional convolution kernels to extract the edge gradient magnitude (EGM) from 317 selected variables (explained in detail in section 3.3.1), and then use multi-scale contrast 318 limited adaptive histogram equalization (CLAHE) and adaptive thresholding (ATh), followed by 319 image morphology to extract land cover object edges. The multi-scale approach reduced the 320 threshold bias that might occur with one window size. Image edges were subsequently 321 “cleaned” with image morphology, where edges were morphologically thinned and edge ends 322 linked to close small gaps. Finally, detected object edges were intersected with thematic 323 cropland estimates (procedure described in section 3.4) to predict individual cropland field 324 parcels. The following sub-sections describe the edge extraction methods in detail. 325 326 19 3.3.1 Edge extraction 327 A subset of was required for field extraction, because whereas a machine learning approach 328 can ‘learn’ to set aside noisy data, our object segmentation approach uses all input data given. 329 The set of variables used for field extraction is given as: 330 where the new superscript denotes the 50th, 75th, and 95th percentile variables, and the CV, and 331 excludes the remaining 6 time-series statistics. After visual analysis, this subset of time series 332 statistics provided the most consistent set of variables that were least affected by cloud mask 333 errors. Prior to edge extraction, each variable was smoothed using a bilateral filter 334 (Tomasi & Manduchi, 1998). A bilateral filter is a weighted average of local pixels but, 335 importantly, considers pixel neighbors in the color and spatial domain. This has the effect of 336 smoothing image noise while preserving true edges. Next, we applied a series of , 337 convolution kernels to extract the normalized EGM (dynamic range of 0-1) from . Image 338 convolution is the sum of the local product of a kernel pair ( ) (Fig. 5) and an image. 339 340 341 Figure 5. Pre-defined convolution kernel pairs. Each bracket consists of a , symmetric kernel pair used to detect 342 object edges. 343 344 20 The pairs are reflections of each other, so we computed a set of EGM images by image 345 convolution as: 346 347 (2) where is the set of and EGM values for image , sub-variable set 348 (e.g., ), at pixel location . The image gradient, ( ), for kernel 349 pair is computed by convolving a time series statistic ( ) with a kernel as: 350 (3) 351 where is the set of kernel pixel offset locations relative to the image pixel location ( ). The 352 EGM was computed over the set of 8 convolution kernel pairs, applied independently to each 353 time series statistic, and the mean and maximum EGM of the 8 EGMs were calculated to 354 produce . A crop field edge should have a high EGM maximum and mean over all 355 sub-variables. Thus, in order to give preferential treatment to consistently high and strong 356 EGMs, we weighted them by the and EGM, followed by a gamma transform 357 ( ) to increase low EGM values relative to higher values (Eq. 3). 358 359 (4) 360 21 In the next step, we used CLAHE to adjust and standardize the local contrast of EGM values. The 361 CLAHE procedure is a histogram equalization applied tile by tile, as opposed to traditional 362 histogram equalization that uses the entire image for contrast adjustment. The CLAHE approach 363 only considers the range of data within a particular tile, and therefore produces a 364 transformation function for each tile, relevant at the tile’s center pixel. All other pixels within 365 the tile are bi-linearly interpolated with the adjusted tile center values, for computational 366 efficiency and to produce smooth transitions between tile borders. The ‘contrast limited’ 367 portion of the method is derived from clipping the local histograms to predetermined 368 thresholds prior to computing the cumulative distribution function (Zuiderveld, 1994). We used 369 a range of tile sizes (5 x 5, 7 x 7, 9 x 9, or 150m x 150m, or 210m x 210m, and 270m x 270m, 370 respectively) and clip percentage limits (1, 2.5, 5), and then used the mean adjustment over all 371 combinations (i.e., 9 iterations): 372 373 (5) 374 where is the locally CLAHE adjusted EGM at pixel location , at CLAHE tile 375 size and clip percentage threshold pairs. Next, we applied a binary, edge or no-edge, 376 threshold to the CLAHE results using an adaptive threshold (ATh) approach. As mentioned, a 377 single threshold value cannot be consistently applied to an entire image or image region. 378 Therefore, we used ATh to convert object edges to binary edge or no-edge values. For each 379 local tile, a threshold value was considered from the local pixels only. A threshold was explicitly 380 22 computed (as opposed to interpolated) for each pixel, for each local tile centered around pixel 381 . We used the tile median EGM to determine the threshold at scale . The binary object edge 382 estimate for image is thus: 383 384 (6) 385 where is either a land cover edge or no-edge and is the ATh 386 procedure at scale . The results are cumulatively combined ( ) by taking the 387 product of each successive CLAHE process, where is initialized at 1. 388 389 3.3.2 Edge morphological cleaning 390 After adaptively thresholding image edges, we applied a series of morphological thinning (Lam 391 et al., 1992), endpoint linking, and endpoint trimming to clean edges. First, we morphological 392 thinned the cumulative edges with 1 iteration. Then gaps between endpoints were closed by 393 linking the shortest path between the two endpoints following a set of rules: 1) gaps between 394 two endpoints that were less than 4 pixels wide were closed; 2) gaps between two endpoints 395 that were between 4 pixels and 8 pixels wide were closed if the endpoints were at inverse 396 angles. Here, we estimated the directional angle of each endpoint and considered inverse 397 rounded angle pairs as 0 and 180, 45 and 225, 135 and 315, and 90 and 270 degrees; 3) gaps 398 between an endpoint and an edge that were greater than 8 pixels wide were closed if there was 399 23 sufficient EGM (mean EGM > 0.1) along the shortest connecting path and the endpoints were at 400 inverse angles, as described in the second criteria. This last check was to ensure field 401 boundaries with true gaps were not incorrectly closed. The connecting path between endpoints 402 was determined by the shortest distance between endpoints that did not cross land cover 403 edges. After thinning and closing small gaps, the final edges were thinned with a morphological 404 skeleton operator, which is morphological thinning with infinite iterations (Zhang and Suen, 405 1984), that ensured all edges were a maximum of 1 pixel wide. The final edge image after the 406 morphological cleaning process is: 407 (7) 408 where is the final binary object estimate for image , at pixel , and is 409 the morphological cleaning operations of thinning, endpoint linking, and skeleton. The final 410 procedure (inverse) was an inversion of edges to objects and the removal of small objects with 411 a predetermined threshold of 5 pixels (~0.45ha) for the minimum field size. The lower threshold 412 was determined from visual observation of the minimum detectable object from Landsat in the 413 study region. The steps described above are illustrated in rows A—D of Figure 6. 414 24 415 Figure 6. Principal steps of the field extraction process. Each row represents a step in the workflow and each 416 column a sample grid within the test region. The methodological steps are: A) Time-series variable, NDVI CV, 417 smoothed with a bilateral filter; B) Edge gradient magnitude after multi-scale CLAHE normalization; C) Binary edges 418 after multi-scale adaptive thresholding; D) Binary edges after morphological cleaning; E) Thematic land cover 419 classification (orange illustrates cropland); F) Land cover objects after intersection with cropland (color display is 420 random); and G) Field parcels after intersection with cropland, where the field size is illustrated according to the 421 color ramp on the bottom. The center coordinates for each sample grid are: I) -31° -59’ -9.44” S, -61° -57’ -58.57” 422 W; II) -27° -45’ -12.89” S, -53° -21’ -8.83” W; III) -17° -12’ -45.13” S, -62° -8’ -24.1” W; IV) -24° -44’ -25.49” S, -64° -8’ 423 25 -42.96” W; V) -25° -26’ -2.47” S, -63° -44’ -56.13” W; VI) -12° -52’ -12.8” S, -45° -40’ -39.62” W; VII) -12° -38’ -57.96” 424 S, -46° -10’ -40.07” W; VIII) -12° -50’ -42.67” S, -45° -23’ -36.3” W. 425 426 3.4 Field labeling 427 The object extraction approach using localized edge thresholding was fully automated, whereas 428 we used a semi-supervised approach to label objects as cropland or not. We collected samples 429 for 9 land cover categories (Cr=cropland; Pg=pasture/grassland; Tr=trees (natural); Ub=Urban; 430 Wt=Water/Wetland; Ba=Bare; Sh=Shrub; Pl=trees (plantation); and Cl=Cleared trees (during 431 study timeframe)), and used the samples collected across the study region to train a supervised 432 predictive model to estimate cropland and non-cropland land covers from . Land cover was 433 sampled similarly to Graesser et al. (2015), yet whereas the reference study was restricted to 434 high-resolution imagery, this study supplemented high-resolution acquisition gaps by sampling 435 directly from Landsat imagery. As in Graesser et al. (2015), Latin America regional experts and 436 image interpretation experts used high-resolution imagery from Google Earth and the Landsat 437 time series composites to distinguish between the 9 land covers. 438 We used the classification model, Extremely Randomized Trees (ERT) (Geurts et al., 439 2006), which is an ensemble of randomized decision trees similar to Random Forests (RF) 440 (Breiman, 2001). However, rather than bootstrap sampling as typically performed in RFs, the 441 entire set of samples is used in each tree with ERT. Additionally, a RF searches for the best split 442 from a random subset of predictor variables, whereas an ERT randomly chooses a split, typically 443 resulting in smoother decision boundaries in feature space. We used the Scikit-learn 444 implementation of ERTs (Pedregosa et al., 2011) to construct the predictive land cover model. 445 26 To identify individual crop fields, per-pixel cropland estimates were intersected with the 446 segmented objects described in section 3.3.1. If estimated cropland pixels accounted for 50% or 447 more of a segmented object, the object was identified as a crop field. Otherwise, the object was 448 removed. 449 We conducted a predictor variable ranking and ERT parameter optimization prior to 450 training the classification model. First, we ranked the importance of all fifty possible predictor 451 variables from with a test (Liu & Setiono, 1995) and the ERT feature importance, and 452 removed features that fell in the lower 50th percentile of both of the respective ranking 453 methods, resulting in fourteen of the fifty possible predictor variables removed. This reduced 454 predictor set was supplemented with Shuttle Radar Topography Mission (SRTM) 1 arc-second 455 (~30 m) elevation and slope data, as well as , coordinate information of each sample, 456 culminating in forty predictors. Then, we applied a 5-folds cross-validation of a range of ERT 457 parameters on the forty predictor variables. At each fold, 50% of the land cover samples were 458 randomly selected with replacement. We then tested the overall accuracy on the samples 459 withheld after each parameter combination of trees (500 and 1,000), maximum depth (5, 10, 460 15, 20, 25, 30, and 50), and minimum samples to split a node (2, 5, and 10). The parameter 461 combination with the highest overall accuracy on withheld samples was 500, 30, and 2 for 462 trees, maximum depth, and minimum samples, respectively, and we used defaults for the 463 remaining parameters. This final set of parameters was used to train an ERT model using all 464 available land cover samples. 465 466 27 3.5 Validation & assessment 467 3.5.1 Sampling & assessment 468 We conducted two accuracy assessments: 1) an assessment of the thematic map accuracy of 469 the cropland ERT estimates; and 2) an object-based assessment of randomly stratified fields 470 throughout the study region to evaluate the field parcel accuracy. For the thematic land cover 471 assessment, we assessed model performance on 30% of the samples randomly withheld from 472 the ERT model. Note that the model assessment on 30% of the withheld samples was 473 conducted post-parameter optimization (section 3.4) on the final parameter set, and not on 474 cross-validated samples. 475 We also assessed the local error of the land cover map by using the complete set of land 476 cover samples with a spatially constrained systematic sampling approach following methods 477 described in Foody (2005), and illustrated in Figure 7. For the spatially constrained map 478 assessment, we systematically created a point grid of local assessment locations every 200 km 479 across the study area. At an assessment location, we recorded the -nearest ( =100 for this 480 study) land cover samples described in section 3.4. We constrained the sample search by 481 limiting the maximum search distance to 300 km around each assessment location. The 482 cropland class f-score, a weighted average of the producer’s and user’s accuracy 483 , was then recorded for those -nearest 100 land cover samples, at an 484 assessment location. This was repeated for all assessment locations, and the cropland -scores 485 were interpolated to create a continuous grid for the entire study area. We used the -score in 486 place of overall (or global) accuracy because the cropland class was of most interest for this 487 28 study. The subsequent grid is indicative of model fit rather than map accuracy because we used 488 the same set of samples to train the classification model. 489 490 491 Figure 7. Spatially constrained systematic sampling for thematic accuracy assessment. At each sample location 492 (black dots, currently at ), the n-nearest land cover samples (red dots) are selected for map assessment. The local 493 assessment scores are stored at each sample location. The land cover sample selection is first limited to the -494 nearest samples and then by the search radius around the current sample location (here, set at 300 km). 495 For the object-based assessment we created wall-to-wall 10 km x 10 km grids across the 496 test region and randomly selected 1,000 grids with at least 10% crop area (estimates based on 497 2000/2001 Landsat cropland estimates from this study) (Fig. 1). Our goal was to have more crop 498 field samples throughout the entire region rather than many samples from one area. Using the 499 Landsat time series composites and Google Earth imagery, we manually digitized at least 2 500 fields within each grid for a total of 5,480 validation field parcels, limiting delineated parcels to 501 29 those in which we had high confidence that they were indeed crop fields. In each grid, we 502 included the range of field sizes and shapes found throughout the study region. Moreover, we 503 attempted to delineate fields with ‘weak’ edges or where field interiors were not clean. Still, an 504 entire field must be visible for an interpreter to identify it from imagery. Thus, we did not 505 include all fields within each grid. 506 Traditional per-pixel validation approaches are not appropriate for object-based 507 validation. Instead, we assessed individual field parcels following proposed metrics (Persello 508 and Bruzzone, 2010; Whiteside et al., 2014; Yan and Roy, 2014). Land-cover objects can be 509 evaluated based on location (the spatial coherence between a predicted and reference object) 510 and overlap (over- and under-segmentation, shape comparison). The Euclidean distance of 511 object centroids was used to measure offset errors, while over- and under-segmentation, 512 fragmentation, and shape eccentricity were used to measure object overlap and shape errors. 513 We only considered overlapping objects when evaluating these object-based metrics. That is, 514 for each reference object we extracted all overlapping, predicted objects. Only the object with 515 maximum overlap was used for over- and under-segmentation, whereas all overlapping 516 predicted objects were considered with fragmentation (see Persello and Bruzzone (2010) for 517 detailed explanation). In Persello and Bruzzone (2010), error metrics are defined in a 0-1 518 dynamic range, with perfect agreement being values of 0. We express error metrics in inverted 519 percentages (i.e., ). Therefore, perfect agreement equals 100. 520 521 30 3.5.2 Field distribution curves 522 To evaluate inequality in field size distributions, we plotted Lorenz curves (Lorenz, 1905) 523 (commonly used to show economic inequality) of field parcel sizes within each province. The 524 Lorenz curve shows the crop field deciles in a province on the x-axis versus the cumulative 525 percentage of total cropland on the y-axis. We computed the Gini index (as a measure of field 526 size inequality) from the Lorenz curve as: , where is the area between the 1:1 527 line and the plotted distribution (Lorenz curve), and is the remaining area under the curve. 528 The Gini coefficient ranges from 0 to 1, where 0 is equally distributed and 1 is high inequality. 529 530 4. Results 531 4.1 Composite, cropland, & field parcel estimates 532 The seasonal compositing resulted in seamless path/row boundaries throughout the study area, 533 which allowed the use of a single classification model and limited artificial image edges. Two 534 subsets of the composites illustrate the effectiveness of the normalized CDR surface reflectance 535 products to produce a standardized continental Landsat dataset (Fig. 8). The CV and the 50th 536 and 75th percentiles illustrate the seasonality fluctuations of row-crop agriculture. Generally, 537 dark blue and purple shades are those with high seasonal variance and low to medium 50th and 538 75th percentile values, and pink illustrates land cover with a high CV and high 75th percentile. 539 Stable values are generally shades of yellow, where the CV is very low and the 50th and 75th 540 percentiles are high. 541 31 542 Figure 8. Examples of annual composites, with the 75th percentile displayed as red, the 50th percentile displayed as 543 green, and the coefficient of variation displayed as blue. Each location shown, highlighted by a blue or orange 544 frame in the top left inset, covers an area of 450 km x 150 km. The gray overlapping grids illustrate overlapping 545 Landsat path/row grids. 546 547 32 The thematic cropland results capture the key cropland areas of the study area and the 548 detection of individual field parcels illustrates the broad range of field sizes across this vast area 549 (Fig. 9). Small fields and larger (displayed in shades of green and orange) are distributed 550 throughout the study area, but generally more prevalent in the southern half. In southern 551 Argentina and southern Brazil, small fields tend to be mixed with larger fields (shades of blues), 552 but the broad pattern is one of large-to-small fields emanating from core areas. For example, 553 the densest cluster of small fields in the Argentine Pampas is in central Santa Fe province, on 554 the northern fringe of the agricultural core. Elsewhere in Brazil’s most southern state, Rio 555 Grande du Sol, small fields surround larger fields in the main agricultural centers. This pattern is 556 also similar through the Brazilian states of Santa Catarina, Paraná, and São Paulo, and to some 557 extent in Santa Cruz, Bolivia. However, newer agricultural regions have contrasting field 558 arrangements. For example, the configuration of larger fields is more prominent in the 559 Argentine Chaco (Salta and Santiago del Estero provinces), with the exception of small fields in 560 the Andes foothills. Similarly, the u-shaped agricultural belt in the Brazilian Cerrado, covering 561 states of Mato Grosso, Mato Grosso do Sul, Goías, and Bahia, consists principally of fields 562 around, or larger, than 100 ha. Finally, results for central Chile and southwestern Uruguay 563 illustrate landscapes dominated by small fields during the 2000/2001 timeframe. 564 33 565 Figure 9. Cropland and field size results for the January 1, 2000 to August 1, 2001 period. Here, the cropland 566 footprint is shown by field size and is illustrated by non-gray colors. The right inset shows a large-scale map view of 567 marker A in the left map. 568 Figures 10 and 11 illustrate crop field extraction results at a larger scale than shown in 569 Figure 9. The 3-band time series sub-samples showcase the importance of seasonal information 570 for cropland detection. In the center column, detected fields are displayed by random colors, 571 while in the right column, detected fields are displayed by area, on a scale of 0.45 to >=100 ha 572 (0, or black, is non-cropland). Many small fields (although small may be relative for this part of 573 34 the world) exist in the core agricultural region of Argentina. Small and large fields are mixed 574 throughout the Argentine landscape, as illustrated in the Río de la Plata basin, where there are 575 many fields less than 10 ha (Fig. 10). Elsewhere, in the main agricultural region of Rio Grande du 576 Sol, Brazil, the field size range spans the entire scale spectrum, but very large fields dominate 577 this landscape (Fig. 11). A noticeable difference between the Brazilian example (Fig. 11) and the 578 Argentine example (Fig. 10) is the field morphology, or shape. The flat, Argentine Pampas 579 allows for regularly gridded farming patterns, whereas a much different pattern exists in 580 southern Brazil, where field shapes follow tree corridors, water bodies, and the topography. 581 Elsewhere throughout the region, settlement patterns produce a complex agricultural 582 landscape, with very small fields approaching the limitations of Landsat’s detectable scale. In 583 some agricultural landscapes, such as in Santa Cruz, Bolivia, fields approaching 5 ha and smaller 584 are more difficult to detect because they are long and narrow. 585 35 586 Figure 10. Field extraction examples from Santa Fe province, Argentina. Left column: RGB display of temporal 587 statistics computed from 1.5 years of Landsat imagery. Generally, blue, purple, and pink represent row-crop 588 agriculture, green and blue represent pastures, and yellow represents trees. Center column: Field extraction 589 results, displayed by random colors. Right column: field extraction results, colored by increasing field size. Each 590 inset is 60 km x 30 km. 591 592 36 593 Figure 11. Field extraction examples from Rio Grande do Sul state, Brazil. Left column: RGB display of temporal 594 statistics computed from 1.5 years of Landsat imagery. Generally, blue, purple, and pink represent row-crop 595 agriculture, green and blue represent pastures, and yellow represents trees. Center column: Field extraction 596 results, displayed by random colors. Right column: field extraction results, colored by increasing field size. Each 597 inset is 60 km x 30 km. 598 Field size distributions at the second administrative unit (i.e., provincial or state) are 599 shown for 5 countries (Figure 12). The figure illustrates the provincial-level Lorenz curves and 600 Gini coefficients of field size for each country. The highest Gini scores (i.e., most unequal 601 distribution of cropland by field size) among all administrative units assessed were in Piauí, 602 Brazil (0.73), Neuquén, Argentina (0.70), Alto Paraguay, Paraguay (0.68). The high Gini scores 603 are illustrated by the dominance of large fields in the respected provinces (Fig. 9). Provinces 604 37 with cropland dominated by large fields are found in Argentina, Brazil, and Paraguay. However, 605 Chile and Uruguay show more equal distributions of small fields, which noticeably dominate the 606 cropland landscape (Fig. 9). 607 38 608 Figure 12. 2000/2001 field size distributions in selected second level administrative units (province or state) in 609 Argentina, Brazil, Chile, Paraguay, and Uruguay. Lorenz curves and Gini coefficients of provincial-level field size 610 distributions. The x-axis represents the cumulative crop field deciles within a province, while the y-axis represents 611 the cumulative percentage of total crop area at each decile. The colors illustrate the province’s Gini coefficient 612 39 scores (1=unequal distribution of cropland area toward large fields; 0=equal cropland distribution among all field 613 sizes). 614 615 4.2 Map and object accuracy 616 The thematic cropland accuracy assessment resulted in cropland producer’s, user’s, and f-617 scores of 90.7%, 97.1%, and 91%, respectively (Table 2). The other classes included in the 618 classification model are also shown but not highlighted, along with overall (84%), kappa (0.8), 619 and the sample-adjusted (Olofsson et al., 2013) overall score (88%). Figure 13, panel A shows 620 the results of the f-score spatially constrained assessment. Cropland f-scores were in the upper 621 90s across the region, indicating a spatially well-balanced and generalizable predictive model. 622 Pockets of poorer estimates, such as in central Chile, decreased the overall f-scores (Table 2). 623 In general, over-segmentation errors were worse than under-segmentation (Table 3), 624 particularly on the northwestern and northeastern fringes of the study region (Fig. 13, panels B-625 C). The highest under-segmentation errors were in and north of São Paulo, Brazil. Overall, 626 object-based errors are clustered near the ideal target of 70 and higher (Fig. 14). Segmentation 627 and fragmentation errors decrease as field size enlarges (top left and bottom left insets). 628 However, the eccentricity shape error and centroid offset do not indicate any relationship with 629 field size (top right and bottom right insets). 630 631 632 633 40 Table 2. Error matrix for thematic classification results (From section 3.4: Cr=cropland; Pg=pasture/grassland; 634 Tr=trees (natural); Ub=Urban; Wt=Water; Wd=Wetland; Ba=Bare; Sh=Shrub; Pl=trees (plantation); Cl=Cleared trees 635 (during study timeframe)). Nine land cover classes were sampled, but we highlight cropland (in gray) for the 636 purposes of this study’s interests. (UserA and ProdA refer to sample-adjusted producer’s and user’s accuracy, 637 respectively, following Olofsson et al., (2013); *=Overall accuracy; **=Kappa score; ***=Sample-adjusted overall 638 accuracy). 639 640 641 642 Observed Cr Pg Tr Ub Wt Ba Sh Pl Cl Total User UserA Predicted Cr 997 9 1 4 3 -- -- 13 -- 1,027 97.1 97.1 Pg 91 1,002 24 55 8 6 32 8 3 1,229 81.5 81.5 Tr 10 8 946 8 1 -- -- 30 2 1,005 94.1 94.1 Ub -- 2 -- 620 -- 8 -- 1 -- 631 98.3 98.3 Wt -- -- -- -- 244 2 -- -- -- 246 99.2 99.2 Ba -- 2 -- 1 39 86 2 -- -- 130 66.2 66.2 Sh -- 2 1 1 -- 3 152 -- -- 159 95.6 95.6 Pl -- -- 3 -- -- -- -- 93 -- 96 96.9 96.9 Cl 1 -- 1 -- -- -- -- -- 50 52 96.2 96.2 Total 1,099 1,025 976 689 295 105 186 145 55 Prod 90.7 97.8 96.9 90 82.7 81.9 81.7 64.1 90.9 *84 **0.8 ProdA 67.5 98.8 97.2 32.2 56.2 78.7 78.9 8.1 25 ***88.1 41 Table 3. Object assessment summary of 5,480 fields. The range of over- and under-segmentation, eccentricity 643 error, and fragmentation is 0 to 100, with 100 being perfect agreement. The offset error is given in meters. 644 Over-segmentation Under- segmentation Eccentricity Fragmentation Offset Relative % error 68.02 86.58 90.23 99.85 7.18 -10.07 50th percentile 75.96 94.44 95.26 99.89 2.98 -16.78 Number of Fields greater than 3,473 4,346 3,730 3,340 4,098 3,574 Percentage of fields greater than 63 79 68 61 75 65 Margin of error (p=0.05) 0.64 0.63 0.34 0.004 0.29 2.03 645 646 42 647 Figure 13. Spatially constrained errors of: A) Cropland f-scores (the f-scores are scaled to percentages). For every 648 systematically sampled point location (200 km spacing), the local cropland f-scores of the n-nearest (n=100) land 649 cover samples was calculated. The results for all points were then interpolated to produce a continuous error 650 estimate for the entire study area. The points used for accuracy assessment were the same points used to train the 651 classification model, thus the illustration is a better representation of model accuracy than map accuracy; B) Mean 652 of over- and under-segmentation. C) Over-segmentation; and D) Under-segmentation. Each metric was masked by 653 a 1,000 m x 1,000 m grid of cropland, where areas in white do not contain any cropland. 654 655 43 656 Figure 14. Object accuracy binned scatterplots. For each object metric, 5,480 manually delineated fields were 657 assessed against the predictions of the crop extraction algorithm. The best possible score for the ‘Over-Under’, 658 ‘Eccentricity’, and ‘Fragmentation’ metrics is 100, whereas the distance metric is optimal at 0. The dashed lines 659 represent the median score for the respective metrics. Top left: Over- and under-segmentation. Note that over-660 segmentation ranges from 0 to 100, while under-segmentation was inverted (0 to -100) for illustration purposes. 661 Top right: Eccentricity shape error (range 0 to 100). Bottom left: Fragmentation error (range 0 to 100). Bottom 662 right: Distance offset (units in meters). The x-axis is logged for display, with real units shown. 663 44 5. Discussion 664 Cropland mapping is often reliant on a seasonal profile to accurately distinguish it from other 665 vegetation. Previously, dense time series stacks were a major limitation with the Landsat 666 satellite because of computational limitations and the cost and access to Landsat imagery 667 (Woodcock et al., 2008). The opening of the archive and computational advancements have 668 since made Landsat a more viable option for regional (Yan and Roy, 2016) and global earth 669 observation (Hansen et al., 2013). As shown from Figure 9 and Table 2, the extent of cropland 670 can be accurately estimated in South America using multi-temporal Landsat imagery, and is 671 comparable to MODIS cropland estimates that utilized higher temporal but coarser spatial 672 resolution (Graesser et al., 2015). At the field parcel level, the object validation showed that 673 parcels can be extracted across a large region. Additionally, the more general field-size patterns 674 across the region illustrated that the older, traditional agricultural regions of southern Brazil, 675 Santa Cruz, Bolivia, and the Argentine Pampas comprise a wide range of field sizes, whereas the 676 newer agricultural frontiers of northern Argentina and the Brazilian Cerrado have a greater 677 presence of large fields. This is also reflected in the Lorenz plots, where there is a larger 678 distribution spread in administrative units closer to the agricultural frontiers. 679 680 5.1.1 Temporal challenges 681 In this study, we extracted field parcels for the period January 1, 2000 to August 1, 2001. This 682 timespan presented few limitations to data coverage and sensor issues (Kovalskyy and Roy, 683 2013). However, time series compositing with pre-2000s Landsat data will prove more 684 challenging because of less coverage over the South American continent. Moving forward, 685 45 Landsat availability for the 21st century is generally plentiful. However, while applications of 686 post-2000 field detection can utilize the ETM+ sensor, SLC-off corrections will be imperative in 687 order to eliminate potential artificial field boundaries introduced from the scan line errors. This 688 will be an important challenge for detecting fields in the SLC-off/pre-Landsat 8 period. While 689 linear interpolation might suffice for uses such as visual aid and per pixel land cover 690 classification, it will likely hinder object segmentation methods. Gap-filling methods that utilize 691 spatial context are more promising. We consider two such methods that have proven to 692 provide better estimates of missing data gaps: 1) Geostatistical kriging (Zhang et al., 2007) and 693 2) the USGS Phase 2 method (USGS, 2004). Both of these methods use neighboring pixels from 694 multiple scenes to fill SLC-off data gaps. Future work should assess the tradeoffs between the 695 Zhang et al. (2007) method, which is more statistically rigorous, but more computationally 696 intensive, and the USGS Phase 2 least squares approach. The recent launches of Landsat 8 and 697 Sentinel-2 series should ensure continued broad temporal coverage at a sub-30 m scale beyond 698 the SLC-off period, though. 699 700 5.1.2 Spatial challenges 701 At 30 m spatial resolution, Landsat can be used to detect a large range of field sizes, and much 702 of the sub-Andean row-crop agriculture in our study area is at or above Landsat’s minimum size 703 requirements. However, fields smaller than 1 ha are not as reliably detected, depending on the 704 field’s configuration. In the Argentine Pampas, for example, the gridded settlement, road, and 705 agricultural patterns produce many uniform-sized fields. Thus, fields around 0.5 ha are 706 detectable. However, land cover objects are only detectable if there is sufficient separation 707 46 between an object’s interior and its edges. Otherwise, the object simply becomes a line. 708 Elongated fields under 1 ha are more challenging to detect because of a lack of separability in 709 the object’s interior. In addition to field size, shape, and configuration, another obstacle to 710 accurately identify field boundaries is crop type. In the absence of natural or man-made 711 borders such as streams, wetlands, or roads, field parcel detection is reliant on the detection of 712 salient features to distinguish one crop from another. Neighboring crops may often be different 713 species types (e.g., maize or wheat), and therefore have different growing patterns. Or, the 714 crop’s temporal signature may be sufficient to identify boundaries. However, when neighboring 715 crop fields are of the same species or are only separated by a small gap or fence line, the 716 minute separation becomes saturated at the Landsat scale. 717 These scale impediments were illustrated in the object accuracy results, where the worst 718 observed scores were in the main sugarcane region of Brazil and in northeastern Brazil. 719 Sugarcane is challenging to segment because of the homogeneous connectivity from field to 720 field (especially in the absence of fence lines), as well as the terraced planting methods 721 employed by farmers. In some cases, even high-resolution imagery is not sufficient to visibly 722 distinguish between field boundaries in this region. In northeastern Brazil, the scale of 723 agricultural production affected the results, where very small-scale agriculture led to poor 724 results in the region. An obvious solution to the issue of very small fields or subtle borders is the 725 employment of higher-resolution imagery. If the tradeoffs between spatial, temporal, and 726 radiometric resolution do not limit the parcel detection, then Landsat’s panchromatic band 727 (15m) and the newly launched Sentinel-2 (10m and 20m) could be valuable for future field 728 extraction. Future work on field parcel detection from satellite imagery should utilize the 729 47 growing resource-base of high-resolution imagery, as well as assess possibilities for algorithmic 730 improvement to reduce over-segmentation errors. 731 732 5.1.3 Large-scale estimates beyond land cover 733 Rich satellite datasets and technological advancements are changing the nature in which we 734 can observe and monitor vegetation across the planet (Hansen et al., 2013). However, global 735 datasets, especially agricultural land cover maps, still lack in precision and detail (Fritz, 2013). 736 Information that describes agriculture beyond area, such as agricultural intensification, is 737 especially absent (Kuemmerle et al., 2013). Land cover maps that describe agricultural area are 738 not sufficient to improve our understanding of rapidly changing agricultural landscapes 739 (Vallejos et al., 2015). Global agricultural expansion has slowed in recent decades, and future 740 agricultural changes will likely stem from greater intensification rather than expansion. Remote 741 sensing, image processing, data storage, and compute infrastructures can provide the capacity 742 to improve agricultural datasets at the field level (Fritz et al., 2015; Yan and Roy, 2015). The 743 scientific community will likely still rely on censuses and surveys for information such as 744 agricultural inputs and mechanization, but remote sensing can contribute a great deal toward 745 monitoring of agricultural intensification, crop rotations, and crop productivity, often in near 746 real-time. 747 748 48 5.1.4 Caveats and uncertainties 749 The cropland field object accuracy assessment was objective in the sampling of the 10 km x 10 750 km grids (Figure 1), but not in the field delineation within the 1,000 randomly sampled grids. 751 Our decision to include parcels that we had high confidence in was a subjective one. Given a 752 lack of ground data about crop boundaries for South America, we considered this the best 753 option to assess the method over a large region. Moreover, any field manually delineated from 754 satellite imagery is biased from human image or photo interpretation. Though it is more costly 755 and time consuming, future work should make use of ground-collected field data for optimal 756 and unbiased validation. 757 758 6. Summary & conclusion 759 The methods presented in this paper describe the detection of individual crop field parcels from 760 multi-temporal Landsat imagery across a large region of South America. We employed a multi-761 directional and multi-spectral object extraction technique that was able to identify cropland 762 field parcels at a high percentage, intersected with thematic cropland estimated at a rate of 763 91% (f-score). We performed two levels of assessment to address whether the methods could 764 prove robust across varying landscapes and field types. The two, per-pixel thematic and object-765 based, validation assessments were spatially rigorous and included multiple ecoregions and 766 countries. With a continuous spectrum of field-size data, we generated Lorenz curves for 767 selected states and provinces in the study region, illustrating how the scale of farming at the 768 field level varies among and within countries across much of South America. 769 49 We applied the experimental field detection in a continent that has experienced 770 profound agricultural changes. South American agriculture has been at the forefront of 771 environmental concerns, but largely because of its role in tropical deforestation. Agricultural 772 changes have not been limited to expansion into tropical forests, or forests alone for that 773 matter. With increasing awareness of agriculture’s role in South American deforestation, plus 774 fewer land expansion options in many areas, the remaining avenues for farmers to increase 775 productivity are often to acquire existing cropland from other landholders, increase inputs, or 776 sell and move to areas of expansion. There is evidence of expansion (Graesser et al., 2015), but 777 further research is needed to identify cropland intensification from remote sensing. The 778 methods in this study can contribute greatly to understanding the dynamics of agricultural 779 changes, whether they be extensive in the form of forest or grassland clearance, or intensive in 780 the form of increased mechanization, scale of production, or altered crop rotations. 781 Agricultural technological advancements are changing the face of farming across the 782 globe. Global deforestation watchdogs, data access, and computational improvements have 783 augmented our understanding of expansion into grasslands and forests. Perhaps as important, 784 though, are the results of land use changes and the implications for a host of environmental 785 issues. Agro-industrialization is changing how food is grown across the globe, yet it is not easily 786 observed with current monitoring programs. Satellite data availability, advanced image 787 processing techniques, and computational advancements will continue to be valuable resources 788 to help fill this data gap. 789 790 50 Acknowledgements 791 This research was funded by a Discovery Grant to N Ramankutty from the Natural Science and 792 Engineering Research Council (NSERC) of Canada. 793 794 51 References 795 Alted, F. & Vilata, I. (2002--). PyTables: Hierarchical datasets in Python. 796 http://www.pytables.org. 797 Barrett, C.B., Barbier, E.B., & Reardon, T. (2001). Agroindustrialization, globalization, and 798 international development: environment implications. Environment and Development 799 Economics, 419—433. 800 Berduegué, J.A. & Fuentealba, R. (2011). Latin America: The state of smallholders in agriculture. 801 IFAD Conference on New Directions for Smallholder Agriculture, Rome, Italy, Vol. 24. 802 Breiman, L. (2001). Random Forests. Machine Learning 45, 5—32. 803 Deininger, K., Byerlee, D. (2012). The rise of large farms in land abundant countries: Do they 804 have a future? World Development 40, 701—714. 805 Dros, J.M. (2004). Managing the soy boom: Two scenarios of soy production expansion in South 806 America. Amsterdam: AIDEnvironment. 807 Fahrig, L., Girard, J., Duro, D., Pasher, J., Smith, A., Javorek, S., King, D., Lindsay, K.F., Mitchell, S., 808 & Tischendorf, L. (2015). Farmlands with smaller crop fields have higher within-field 809 biodiversity. Agriculture, Ecosystems and Evironment, 200, 219—234. 810 FAO (Food and Agriculture Organization). (2015). FAOSTAT. http://faostat.fao.org. 811 Flood, N., Danaher, T., Gill, T., & Gillingham, S. (2013). An operational scheme for standardised 812 surface reflectance from Landsat TM/ETM+ and SPOT HRG imagery for Eastern Australia. 813 Remote Sensing, 5, 83—109. 814 Foley, J.A., DeFries, R., Asner, G.P., Barford, C., Bonan, G., Carpentar, S.R., Chapin, F.S., Coe, 815 M.T., Daily, G.C., Gibbs, H.K., Helkowski, J.H., Holloway, T., Howard, E.A., Kucharik, C.J., 816 52 Monfreda, C., Patz, J.A., Prentice, I.C., Ramankutty, N., & Snyder, P.K. (2005). Global 817 consequences of land use. Science, 309, 570—574. 818 Foody, G.M. (2005). Local characterization of thematic classification accuracy through spatially 819 constrained confusion matrics. International Journal of Remote Sensing, 26:6, 1217—820 1228. 821 Fritz, S., See, L., You, L., Justice, C., Becker-Reshef, I., Bydekerke, L., Cumani, R., Defourny, P., 822 Erb, K., Foley, J., Gilliams, S., Gong, P., Hansen, M., Hertel, T., Herold, M., Herrero, M., 823 Kayitakire, F., Latham, J., Leo, O., McCallum, I., Obersteiner, M., Ramankutty, N., Rocha, J., 824 Tang, H., Thornton, P., Vancutsem, C., van der Velde, M., Wood, S., & Woodcock, C. 825 (2013). The need for improved maps of global cropland. Eos, Transactions, American 826 Geophysical Union, 94(3), 31—32. 827 Fritz, S., See, L., McCallum, I., You, L., Bun, A., Moltchanova, E., Duerauer, M., Albrecht, F., Schill, 828 C., Perger, C., Havlik, P., Mosnier, A., Thornton, Ph., Wood-Sichra, U., Herrero, M., Becker-829 Reshef, I., Justice, C., Hansen, M., Gong, P., Abdel Aziz, S., Cipriani, A., Cumani, R., Cecchi, 830 G., Conchedda, G., Ferreira, S., Gomez, A., Haffani, M., Kayitakire, F., Malanding, J., 831 Mueller, R., Newby, T., Nonguierma, A., Olusegun, A., Ortner, S., Ram Rajak, D., Rocha, J., 832 Schepaschenko, D., Schepaschenko, M., Terekhov, A., Tiangwa, A., Vancutsem, C., 833 Vintrou, E., Wenbin, W., Van der Velde, M., Dunwoody, A., Kraxner, F., & Obersteiner, M. 834 (2015). Mapping global cropland and field size. Global Change Biology, 21, 1980—1992. 835 Gao, F., Masek, J. G., Wolfe, R. E., & Huang, C. (2011). Building a consistent medium resolution 836 satellite data set using moderate resolution imaging spectroradiometer products as 837 reference. Journal of Applied Remote Sensing, 4, 043526. 838 53 Gebhardt, S., Wehrmann, T., Ruiz, M. A. M., Maeda, P., Bishop, J., Schramm, M., Kopeinig, R., 839 Cartus, O., Kellndorfer, J., Ressl, R., Santos, L. A., & Schmidt, M. (2014). MAD-MEX: 840 Automatic wall-to-wall land cover monitoring for the Mexican REDD-MRV program using 841 all Landsat data. Remote Sensing, 6, 3923—3943. 842 Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63, 843 3—42. 844 Gibbs, H.K., Ruesch, A.S., Achard, F., Clayton, M.K., Holmgren, P., Ramankutty, N., & Foley, J.A. 845 (2010). Tropical forests were the primary sources of new agricultural land in the 1980s 846 and 1990s. Proceedings of the National Academy of Sciences of the United States of 847 America, 107(38), 16732—16737. 848 Gibbs, H.K., Rausch, L., Munger, J., Schelly, I., Morton, D.C., Noojipady, P., Soares-Filho, B., 849 Barreto, P., Micol, L., & Walker, N.F. (2015). Brazil’s soy moratorium. Science, 347(6220), 850 377—378. 851 Gitelson, A.A., Kaufman, Y.J., & Merzlyak, M.N. (1996). Use of a green channel in remote 852 sensing of global vegetation from EOS-MODIS. Remote Sensing of Environment, 58(3), 853 289—298. 854 Graesser J., Aide, T.M., Grau, H.R., & Ramankutty, N. (2015). Cropland/pastureland dynamics 855 and the slowdown of deforestation in Latin America. Environmental Research Letters, 10, 856 034017. 857 Griffiths, P., van der Linden, S., Kuemmerle, T., & Hostert, P. (2013). A pixel-based Landsat 858 compositing algorithm for large area land cover mapping. IEEE Journal of Selected Topics 859 in Applied Earth Observations and Remote Sensing, 6(5), 2088—2101. 860 54 Guizar-Sicairos, M., Thurman, S.T., Fienup, J.R. (2008). Efficient subpixel image registration 861 algorithms. Optics Letters, 33(2), 156—158. 862 Hansen, M.C., Roy, D.P., Lindquist, E., Adusei, B., Justice, C. O., & Altstatt, A. (2008). A method 863 for integrating MODIS and Landsat data for systematic monitoring of forest cover and 864 change in the Congo Basin. Remote Sensing of Environment, 112, 2495—2513. 865 Hansen, M.C., Loveland, T.R. (2012). A review of large area monitoring of land cover change 866 using Landsat data. Remote Sensing of Environment, 122, 66—74. 867 Hansen, M.C., Potapov, P.V., Moore, R., Hancher, M., Turubanova, S.A., Tyukavina, A., Thau, D., 868 Stehman, S.V., Goetz, S.J., Loveland, T.R., Kommareddy, A., Egorov, A., Chini, L., Justice, 869 C.O., Townshend, J.R.G. (2013). High-resolution global maps of 21st-century forest change. 870 Science, 342, 850—853. 871 Hermosilla, T., Wulder, M.A., White, J.C., Coops, N.C., Hobart, G.W. (2015). An integrated 872 Landsat time series protocol for change detection and generation of annual gap-free 873 surface reflectance composites. Remote Sensing of Environment, 158, 220—234. 874 Jiang, Zhangyan, Huete, Alfredo R., Didan, Kamel, & Miura, Tomoaki. (2008). Development of a 875 two-band enhanced vegetation index without a blue band. Remote Sensing of 876 Environment, 112, 3833—3845. 877 Kuemmerle, T., Hostert, P., St-Louis, V., & Radeloff, V. C. (2009). Using image texture to map 878 farmland field size: a case study in Eastern Europe. Journal of Land Use Science, 4(1-2), 879 85—107. 880 Kuemmerle, T., Erb, K., Meyfroidt, P., Müller, D., Verburg, P. H., Estel, S., Haberl, H., Hostert, P., 881 Jepsen, M. R., Kastner, T., Levers, C., Lindner, M., Plutzar, C., Verkerk, P. J., van der 882 55 Zanden, E., H., Reenberg, A. (2013). Challenges and opportunities in mapping land use 883 intensity globally. Current Opinion in Environmental Sustainability, 5(1), 1—10. 884 Kovalskyy, V. & Roy, D.P. (2013). The global availability of Landsat 5 TM and Landsat 7 ETM+ 885 land surface observations and implications for global 30 m Landsat data product 886 generation. Remote Sensing of Environment, 130, 280—293. 887 Lam, L., Seong-Whan, L., Suen, C.Y. (1992). Thinning methodologies—A comprehensive survey. 888 IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(9), 869—885. 889 Liu, H. & Setiono, R. (1995). Chi2: feature selection and discretization of numeric attributes. In 890 the Proceedings of the 7th International Conference on Tools with Artificial Intelligence, 891 (Nov), 388—391. 892 Lorenz, M.O. (1905). Methods of measuring the concentration of wealth. Publications of the 893 American statistical association, 9(70), 209—219. 894 Martinelli, L.A. (2012). Ecosystem services and agricultural production in Latin America and the 895 Caribbean. Inter-American Development Bank. 896 Maxwell, S. K. & Sylvester, K. M. (2012). Identification of “ever-cropped” land (1984-2010) using 897 Landsat annual maximum NDVI image composites: Southwestern Kansas case study. 898 Remote Sensing of Environment, 121, 186—195. 899 Matson, P.A., Parton, W.J., Power, A.G., Swift, M.J. (1997). Agricultural intensification and 900 ecosystem properties. Science, 277, 504—509. 901 Olofsson, P., Foody, G. M., Stehman, S. V., & Woodcock, C. E. (2013). Making better use of 902 accuracy data in land change studies: Estimating accuracy and area and quantifying 903 uncertainty using stratified estimation. Remote Sensing of Environment, 129, 122—131. 904 56 Olson, D. M., Dinerstein, E., Wikramanayake, E. D., Burgess, N. D., Powell, G. V. N., Underwood, 905 E. C., D’amico, J. A., Itoua, I., Strand, H. E., Morrison, J. C., Loucks, C. J., Allnutt, T. F., 906 Ricketts, T. H., Kura, Yumiko, Lamoreux, J. F., Wettengel, W. W., Hedao, P., & Kassem, K. 907 R. (2001). Terrestrial ecoregions of the world: a new map of life. BioScience, 51, 933—938. 908 Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., 909 Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., 910 Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. 911 Journal of Machine Learning Research, 12, 2825—2830. 912 Persello C. & Bruzzone, L. (2010). A novel protocol for accuracy assessment in classification of 913 very high resolution images. IEEE Transactions on Geoscience and Remote Sensing, 48:3, 914 1232—1244. 915 Potapov, P. V., Turubanova, S. A., Hansen, M. C., Adusei, B., Broich, M., Altstatt, A., Mane, L., & 916 Justice, C. O. (2012). Quantifying forest cover loss in Democratic Republic of the Congo, 917 2000-2010, with Landsat ETM+ data. Remote Sensing of Environment, 122, 106—116. 918 Rogers, A. S. & Kearney, M. S. (2004) Reducing signature variability in unmixing coastal marsh 919 Thematic Mapper scenes using spectral indices. International Journal of Remote Sensing, 920 25(12), 2317—2335. 921 Roy, D. P., Ju, J., Kline, K., Scaramuzza, P. L., Kovalskyy, V., Hansen, M., Loveland, T. R., Vermote, 922 E., & Zhang, C. (2010). Web-enabled Landsat data (WELD): Landsat ETM+ composited 923 mosaics of the conterminous United States. Remote Sensing of Environment, 114, 35—49. 924 Savitsky, A. & Golay, M.J.E. (1964). Smoothing and differentiation of data by simplified least 925 squares procedures. Analytical Chemistry, 36(8), 1627—1639. 926 57 Schmidt, M., Pringle, M., Devadas, R. Denham, R. & Tindall, D. (2016). A framework for large-927 area mapping of past and present cropping activity using seasonal Landsat images and 928 time series metrics. Remote Sensing, 8(312), 1—25. 929 Theil, H. (1992). A rank-invariant method of linear and polynomial regression analysis. Henri 930 Theil’s contributions to economics and econometrics (Springer), 345—381. 931 Tilman, D., Farigione, J., Wolff, B., D’Antonio, C., Dobson, A., Howarth, R., Schindler, D., 932 Schlesinger, W.H., Simberloff, D., Swackhamer, D. (2001). Forecasting agriculturally driven 933 environmental change. Science, 292, 281—284. 934 Tilman, D., Cassman, K. G., Matson, P. A., Naylor, R., & Polasky, S. (2002). Agricultural 935 sustainability and intensive production practices. Nature, 418. 936 Tomasi, C. & Manduchi, R. (1998). Bilateral filtering for gray and color images. Proceedings of 937 the Sixth IEEE International Conference on Computer Vision, Bombay, India, 839—846. 938 Toivonen, T., Kalliola, R., Ruokolainen, K., & Malik, R. N. (2006). Across-path DN gradient in 939 Landsat TM imagery of Amazonian forests: A challenge for image interpretation and 940 mosaicking. Remote Sensing of Environment, 100(4), 550—562. 941 Tucker, C. J. (1979). Red and photographic red linear combinations for monitoring vegetation. 942 Remote Sensing of Environment, 8, 127—150. 943 USGS (2004) SLC-off gap-filled products gap-fill algorithm methodology: Phase2 gap-fill al- 944 gorithm (http://landsat.usgs.gov). 945 Vallejos, M., Volante, J. N., Mosciaro, M. J., Vale, L. M., Bustamante, M. L., & Paruelo, J. M. 946 (2015). Transformation dynamics of the natural cover in the Dry Chaco ecoregion: A plot 947 level geo-database from 1976 to 2012. Journal of Arid Environments, 123, 3—11. 948 58 Vermote, E. F., Tanré, D., Deuzé, J. L., Herman, M., & Morcrette, J.-J. (1997). Second simulation 949 of the satellite signal in the solar spectrum, 6S: an overview. IEEE Transactions on 950 Geoscience and Remote Sensing, 35(3), 675—686. 951 Weissteiner, C. J., García-Feced, C., Paracchini, M. L. (2016). A new view on EU agricultural 952 landscapes: Quantifying patchiness to assess farmland heterogeneity. Ecological 953 Indicators, 61, 317—327. 954 White, J.C., Wulder, M.A., Hobart, G.W., Luther, J.E., Hermosilla, T., Griffiths, P., Coops, N.C., 955 Hall, R.J., Hostert, P., Dyk, A., & Guindon, L. (2014). Pixel-based image compositing for 956 large-area dense time series applications and science. Canadian Journal of Remote 957 Sensing, 40(3), 192—212. 958 Whiteside T.G., Maier S.W., & Boggs G.S. (2014). Area-based and location-based validation of 959 classified image objects. International Journal of Applied Earth Observation and 960 Geoinformation, 28, 117—130. 961 Woodcock, C.E., Allen, A.A., Anderson, M., Belward, A.S., Bindschadler, R., Cohen, W.B., Gao, F., 962 Goward, S.N., Helder, D., Helmer, E., Nemani, R., Oreopoulos, L., Schott, J., Thenkabail, 963 P.S., Vermote, E.F., Vogelmann, J., Wulder, M.A., & Wynne, R. (2008). Free access to 964 Landsat imagery. Science, 320, 1011. 965 Yan, L. & Roy, D. P. (2014). Automated crop field extraction from multi-temporal Web Enabled 966 Landsat Data. Remote Sensing of Environment, 144, 42—64. 967 Yan, L. & Roy, D. P. (2016). Conterminous United States crop field size quantification from 968 multi-temporal Landsat data. Remote Sensing of Environment, 172, 67—86. 969 59 Zhao, H. & Chen, X. (2005). Use of normalized difference bareness index in quickly mapping 970 bare areas from TM/ETM+. Proceedings of the 2005 IEEE Geoscience and Remote Sensing 971 Symposium. 972 Zhang, T.Y. & Suen, C.Y. (1984). A fast parallel algorithm for thinning digital patterns. Image 973 Processing and Computer Vision, 27(3), 236—239. 974 Zhang, C., Li, W., & Travis D. (2007). Gaps-fill of SLC-off Landsat ETM+ satellite image using a 975 geostatistical approach. International Journal of Remote Sensing, 28(22), 5103—5122. 976 Zhu, Z. & Woodcock, C. E. (2012). Object-based cloud and cloud shadow detection in Landsat 977 imagery. Remote Sensing of Environment, 118, 83—94. 978 Zuiderveld, K. (1994). Contrast limited adaptive histogram equalization. Graphics gems IV, 979 Academic Press Professional, Inc., 474—485. 980