UBC Theses and Dissertations

Using multiple scales for enhancing predictive capacity in modelling responses to the cumulative effects… Kielstra, Brian William 2020


USING MULTIPLE SCALES FOR ENHANCING PREDICTIVE CAPACITY IN MODELLING RESPONSES TO THE CUMULATIVE EFFECTS OF DISTURBANCE IN STREAMS

by Brian William Kielstra

B.Sc. (Env), The University of Guelph, 2010
M.Sc., Queen’s University, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Forestry)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

March 2020

© Brian William Kielstra, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: Using multiple scales for enhancing predictive capacity in modelling responses to the cumulative effects of disturbance in streams, submitted by Brian Kielstra in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Forestry.

Examining Committee:
Dr. John Richardson; Forest & Conservation Sciences; Supervisor
Dr. R. Daniel Moore; Geography; Supervisory Committee Member
Dr. Jeanine Rhemtulla; Forest & Conservation Sciences; Supervisory Committee Member
Dr. Allan Carroll; Forest & Conservation Sciences; University Examiner
Dr. Mark Johnson; Earth, Ocean and Atmospheric Sciences; University Examiner

Abstract

Disturbances affect ecosystems in complex ways at multiple spatial and temporal scales. Much research has focused on quantifying multiscale landscape patterns but less has focused on using multiscale information to increase predictive capacity in understanding complex ecosystem processes. Predicting cumulative effects of stressors is important for managing ecologically resilient landscapes. Of particular importance is predicting how changes in land use cumulatively affect aquatic ecosystems, such as streams.
Furthermore, despite strong functional linkages showing that headwaters deliver important resources downstream, limited research focuses on understanding landscape-scale variation in headwaters or predicting how many instances of fine-scale headwater alteration cumulatively affect downstream ecosystems. My dissertation addressed predictive capacity and headwater variability in several ways. First, I used empirical data from headwaters in an urbanizing region to examine multiscale variability in headwater condition and showed that incorporating spatial dependencies can nearly double predictive capacity. Second, I developed and analyzed data from a citizen science protocol designed to examine functional and structural variability of urban headwater streams. Specifically, I examined cotton strip decomposition rates to show that local-scale variation explained nearly 70% of the variability. Third, I developed benthic macroinvertebrate cumulative effects models to examine the effects of environmental context and land cover conditions. I showed that the relative importance of environmental, land cover, spatial, and headwater variables was indicator-dependent, suggesting that practitioners should address context dependency when evaluating land cover conclusions. Fourth, I applied a common, multiscale analytical framework (i.e., spatial stream network models) to examine the variability of chemical, decomposition, respiration, and benthic macroinvertebrate indicators and showed that incorporating multiscale dependencies can increase predictive capacity on average, but again this was indicator-dependent. To confront complexity, generate stronger predictions, identify knowledge gaps, and improve understanding of cumulative effects, environmental practitioners stand to benefit from incorporating multiscale dependencies.

Lay Summary

Ecosystems are threatened by natural and human stressors occurring at multiple scales (e.g., local to global).
For environmental managers, it is crucial to understand and predict how these stressors combine to impact ecosystems. When it is not possible to measure the many potential stressors at many scales, managers can use models to look more deeply into patterns of ecosystem indicators (e.g., biological diversity and water quality) to examine the important threats and scales that need more study. This can result in better predictions and understanding of ecosystem conditions. I examined the effects of environmental variability (e.g., geology) and land cover (e.g., urban land use) on stream ecosystems using several studies. I found that models that incorporate multiple scales can increase predictive ability but that important threats and spatial patterns depended on the indicator. More use of models by environmental managers that include multiple scales should provide better predictions and better insight into important threats and knowledge gaps.

Preface

All work presented here is associated with an NSERC Strategic Partnership Grant (463575-14), “Cumulative effects in a riverscape across scales: thresholds of disturbance in ecosystem integrity,” held by Dr. John Richardson at the University of British Columbia. Dr. Richardson provided design, analysis, and writing advice for all dissertation components.

Chapter 2 is an analysis of field work conducted by the Toronto and Region Conservation Authority following the Ontario Stream Assessment Protocol Section 4 Module 10 “Assessing Headwater Drainage Features” (Stanfield et al. 2013). In appreciation of the benefit of the Ontario Stream Assessment Protocol, of the Flowing Waters Information System infrastructure, the Centre for Community Mapping and of the effort of southern Ontario’s stream monitoring community, we gratefully acknowledge the Toronto and Region Conservation Authority for the use of their data.
The inspiration for understanding headwater variability and assigning headwater conditions comes from Stanfield et al.'s (2014) discussion paper on cumulative effects of headwater alteration and from discussions with Les Stanfield. Dr. Lenka Kuglerová helped develop the headwater condition index. I used the field data to design and carry out spatial and statistical analyses. I wrote the final manuscript.

Chapter 3 is an analysis of field work conducted by citizen scientists and in collaboration with EcoSpark, particularly Executive Director Joyce Chau. A version of Chapter 3 has been published; it is altered here for formatting consistency: Kielstra, B. W., J. Chau, and J. S. Richardson. 2019. Measuring function and structure of urban headwater streams with citizen scientists. Ecosphere 10:e02720. The study was conducted collaboratively using an Ontario Trillium Foundation Seed Grant (SD94917) held by EcoSpark. Joyce Chau recruited and organized citizen scientists, helped design the citizen science sampling protocol, and contributed edits to the final manuscript. The citizen scientists and I collected field data and samples. A laboratory assistant and I prepared cotton strips for analysis. I designed the study, carried out field and laboratory work, carried out spatial and statistical analyses, and wrote the final manuscript.

Chapter 4 is an analysis of benthic macroinvertebrate data gathered from several agencies over several decades.
In appreciation of the benefit of the Ontario Stream Assessment Protocol, of the Flowing Waters Information System (FWIS) infrastructure, the Centre for Community Mapping and of the effort of southern Ontario’s stream monitoring community, we gratefully acknowledge the following organizations for the use of their data (to the best of my ability and as attributed in the FWIS database): Anishinabek Ontario Fisheries Resource Centre, Cataraqui Region Conservation Authority, City of Ottawa, Conservation Halton, Ganaraska Region Conservation Authority, Kawartha Conservation, Lake Simcoe Region Conservation Authority, Lower Trent Conservation, Mississippi Valley Conservation, Ontario Ministry of Natural Resources and Forestry (Aurora/Midhurst Districts, Aquatic Research and Development Section, Lake Erie Management Unit, Salmonid Ecology Unit), Ontario Streams, Rainy River First Nations, Sir Sandford Fleming College, and the Toronto and Region Conservation Authority. Data collected using standard methods (outlined in Chapter 4) were also contributed by various members of the Ontario Benthos Biomonitoring Network. The late Dr. Antoine Morin (University of Ottawa) conceived the study. Bernadette Charpentier (University of Ottawa) compiled data and performed initial landscape and statistical analysis using code from Dr. Morin. I took the initial analyses and improved on them by generating automated landscape analysis, making new data inclusion decisions, using quantile random forest regression, and adding analyses evaluating the importance of spatial and headwater variables. This work benefited greatly from taxonomic compilations and statistical analysis by Élysabeth Théberge (University of Ottawa). Bernadette Charpentier and Les Stanfield contributed their guidance and knowledge throughout the analysis. I wrote the final manuscript.

Chapter 5 is an analysis of field work conducted in collaboration with Dr. Kuglerová.
We were both interested in how varying amounts of urban land cover affect riparian and stream ecosystems. Some of our sites overlapped but our analysis did not. I designed and carried out the field work and laboratory work (stream sampling) with the help of summer research assistants and laboratory assistants. I processed the data, conducted the spatial and statistical analyses, and wrote the final manuscript.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements

Chapter 1: Introduction
  1.1 The problem of pattern, scale, and prediction in ecology
  1.2 Cumulative effects
  1.3 Predictive capacity and multiscale models
  1.4 Multiscale models to improve predictions in stream ecosystems
  1.5 Dissertation goals, overarching hypotheses, and structure
  1.6 Figures

Chapter 2: Predicting variability in the condition of urbanizing headwaters using a composite indicator, landscape variables and spatial dependencies
  2.1 Introduction
  2.2 Methods
    2.2.1 Study region and headwater drainage features dataset
    2.2.2 Multivariate analysis of HDF data
    2.2.3 Evaluating a composite indicator of headwater condition (HCI)
  2.3 Results
    2.3.1 General properties of HDF data
    2.3.2 Multivariate analysis of HDF data
    2.3.3 Results for evaluating a composite indicator of headwater condition (HCI)
  2.4 Discussion
  2.5 Tables
  2.6 Figures

Chapter 3: Measuring function and structure of urban headwater streams with citizen scientists
  3.1 Introduction
  3.2 Methods
    3.2.1 Data collection
    3.2.2 Data analysis
  3.3 Results
    3.3.1 Structural measurements
    3.3.2 Functional measurements
    3.3.3 Citizen science engagement
  3.4 Discussion
  3.5 Tables
  3.6 Figures

Chapter 4: Predicting cumulative effects of landscape variables on stream benthic macroinvertebrate communities using spatiotemporally extensive biomonitoring data and hindcasting
  4.1 Introduction
  4.2 Methods
    4.2.1 Data collection
    4.2.2 Data analysis
  4.3 Results
    4.3.1 Landscape variables and BMI community metrics
    4.3.2 Predictive capacity for BMI metrics and predictor importance
    4.3.3 Hindcasted deviations
    4.3.4 Incorporating spatial and headwater variables in a regional subset
  4.4 Discussion
  4.5 Tables
  4.6 Figures

Chapter 5: Incorporating spatial dependencies for increased understanding of stream indicator responses to landscape disturbance across multiple scales
  5.1 Introduction
  5.2 Methods
    5.2.1 Data collection
    5.2.2 Data analysis
  5.3 Results
    5.3.1 Potential versus realized stream networks
    5.3.2 Stream sampling
  5.4 Discussion
  5.5 Tables
  5.6 Figures

Chapter 6: Conclusion
  6.1 Multiscale dependency effects on predictive capacity and explained variation
  6.2 Multiscale dependency effects on statistical significance of predictors
  6.3 Urban and local-scale predictor importance
  6.4 Importance of headwaters to downstream
  6.5 Concluding remarks

Bibliography

Supporting information for Chapter 2 – Headwater condition index and sensitivity analysis
  Generating a headwater condition index, detailed approach
  Tables

Supporting information for Chapter 2 – Geospatial analysis
  Geospatial procedures
  Spatial stream networks
  Tables
  Figures

Supporting information for Chapter 2 – Multivariate analysis
  Tables
  Figures

Supporting information for Chapter 3
  Calculation of LDI
  Tables
  Figures

Supporting information for Chapter 4
  Details for benthic macroinvertebrate sampling
  Tables
  Figures

Supporting information for Chapter 5
  Details for cotton strip decomposition rate determination
  Details for quantile random forest regression estimates of stream temperature
  Tables
  Figures

List of Tables

Table 2.1 Headwater drainage feature variables used for generating the HCI, feature-scale. Original categories are protocol values for each variable and scores are for generating the HCI. “NA” indicates no score applied.
Table 2.2 Headwater drainage feature site variables used for generating the HCI, site-scale. Original categories are protocol values for each variable and scores are for generating the HCI. “NA” indicates no score applied.
37 Table 2.3 Model selection table for fixed effects for HCI. See Section 2.2.3.4 for land cover definitions. AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. Spatial parameters for the tail-up (TU), tail-down (TD), and Euclidean (EUC) components of the model are the partial sill (θv) and range (θr). R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations. ................................................................... 38 Table 2.4 Model selection table for spatial components for HCI. Models for tail-up (TU) and tail-down (TD) are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean (EUC) models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. Spatial parameters for the TU, TD, and EUC components of the model are the partial sill (θv) and range (θr). R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations. Dashed lines (--) indicate no estimate. .......................................... 39 Table 2.5 AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. RMSPE is Root Mean Squared Prediction Error where lower values indicate a better fit. 
R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ................................................................................................................................... 41 Table 3.1 Linear mixed model results from model comparison. Models are ordered from best to worst candidate. Model numbers in parentheses refer to models from Table D.4. Text in model column also refers to the scale at which the variables were estimated. AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. For site-scale variables, their values can be estimated over “c_” or “l_250” catchments, xv  where “c_” is the full catchment and “l_” is a local 250 m buffer, respectively. Depth categories: Depth 1 – Buried under sediment; Depth 2 – On stream bottom; Depth 3 – Floating above stream bottom; Depth 5 – unable to determine. Depth 4 was excluded during data processing. Coefficient 95% CI are ± 1.96 x SE of the coefficient. Those CI not overlapping 0 are bolded. “Int.” is the intercept. Variance components (2) are presented for reference. R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects. En dash indicates “not applicable.” .............................................................. 76 Table 4.1 Landscape variables measured at the catchment-scale used in random forest models. 
The term “masl” indicates “metres above sea level”. Summary statistics are percentiles for the predictor. ..................................................................................................................................... 106 Table 4.2 Benthic macroinvertebrate metrics used as response variables in random forest models. Summary statistics are percentiles for the BMI metric. .............................................................. 107 Table 4.3 Landscape variable importance for its most influenced and least influenced BMI metric. Variables are ordered by highest median importance across all BMI metrics. Variable importance as % increase in mean squared error (MSE). Abbreviations can be found in Table 4.1 and Table 4.2. .............................................................................................................................. 108 Table 5.1 Generalized fixed effects, random effects, and spatial dependencies used in spatial stream network models per response variable. HDF P/A is the probabilistic stream network model. BMI is benthic macroinvertebrates. ................................................................................ 147 Table 5.2 Benthic macroinvertebrate (BMI) metrics used as response variables in random forest models Summary statistics in Table F.21. .................................................................................. 148 Table 5.3 Model comparison and component variation for spatial stream network models applied across seven catchments. HDF P/A is the probabilistic stream network model. Respiration* is the set of models run after removing an outlier site. For models, Null includes random effects, Non is non-spatial version of the best spatial model (BestS). Brier scores near zero indicate high predictive accuracy. Area under the receiver operator curve (AUC) scores near one indicate good model presence/absence discrimination. ∆AIC and Rel L. 
is relative to the best model in the Null, Non, and BestS set. R2 (CV) is the cross-validation R2. Spatial dependencies are tail-up (TU), tail-down (TD), and Euclidean (EUC). Dashed lines (--) indicate no estimate. ......................... 149 Table A.1 Results for headwater condition index (HCI) sensitivity analysis. Variables are ordered by scale and first-order sensitivity index, 𝑆𝑖, where i is an individual variable. Higher 𝑆𝑖 indicates higher independent contribution of each variable’s variance to variance in HCI. Total-effect index, 𝑆𝑇𝑖, is similar but also includes higher-order interactions. Greater differences (𝑆𝑇𝑖 −  𝑆𝑖) indicate the variable interacts with other variables in contributing to HCI variance...................................................................................................................................................... 206 Table B.1 GIS data uses and sources used throughout dissertation. ........................................... 217 xvi  Table B.2 Variables or datasets derived from data sources used in analyses. Only basic derivation details are presented here. Different summary statistics are used per variable. Not all variables are calculated for each analysis................................................................................................... 221 Table B.3 Land cover classes and landscape disturbance index (LDI) for land cover layers AAFC and SOLRIS 2.0 (Table B.1). Code is the raster integer class. Coarse 1 and 2 refer to coarse labels used in their respective dissertation chapters. Priority refers to the fusion layer generated from AAFC and SOLRIS 2.0. For each class, if it is not a priority then the class code is changed to NA and replaced by classes in the priority layer. LD refers to the LD coefficient used in Table B.1 calculations. .......................................................................................................................... 
224 Table B.4 Quaternary geology classes and geolC coefficients for Ontario Quaternary Geology layer (Table B.1). Code is the raster integer class. G Model refers to the coarser categorization of unit names used in Neff et al. 2005’s G Model. Column geolC refers to the geolC coefficient used in the Table B.1 BFI calculation. ........................................................................................ 227 Table B.5 Euclidean distance covariance functions available in the ‘SSN’ package (Ver Hoef et al. 2014). Terms include ℎ as the separation distance, 𝜃𝑣 as the partial-sill parameter, and 𝜃𝑟as the range parameter. Note that values 4.4, 3, and 3 in Cauchy, Exponential, and Gaussian models, respectively, are used to have autocorrelation at 0.05 when h equals 𝜃𝑟. ..................... 228 Table B.6 Tail-up and tail-down covariance functions available in the ‘SSN’ package (Ver Hoef et al. 2014). Terms include ℎ as the separation distance for flow-connected sites (tail-up model), a or b as distances used in flow-unconnected model as in Figure B.1 (tail-down model), 𝜃𝑣 as the partial-sill parameter, and 𝜃𝑟as the range parameter. For tail-down models, there are different models depending on if sites are flow-connected (i.e., “if C”) or flow-unconnected (i.e., “if U”). Note that values 3 and 90 in Exponential and Mariah models, respectively, are used to have autocorrelation at 0.05 when h equals 𝜃𝑟. .................................................................................. 229 Table C.1 Loadings for feature-scale PCoA, feature-scale reduced PCoA, and site-scale PCoA. Loadings are the coefficients (mean) from a linear model fit to the axis scores using envfit in ‘vegan’ (Oksanen et al. 2016). Numbers following each variable name are the categories within a variable. Any categories that did not exist (e.g., Riparian vegetation, L, 0–1.5 m – 3) are not included. Category descriptions can be found in Table 2.1. 
....................................................... 236 Table D.1 Select landscape summary statistics for the York Region, sites used in this study, and headwater drainage features (HDFs) found within York Region. For York Region, values are taken over the entire region. For HDFs and sites, values are for the catchments expressed as the “median (range)” of the sites. ........................................ 250 Table D.2 Numerical (N) and categorical (C) loadings. For N, these are analogous to principal component correlations with the original variables. For C, these are average scores from the PCoA per factor level. Numbers following each variable name are the factor levels. Generally, these rank from lowest (i.e., clay substrate, no riparian vegetation) to greatest (i.e., bedrock, forest). Any factor levels that did not exist (e.g., Riparian vegetation, L, 0–1.5 m – 3) are not included. Factor levels can be found in Kielstra et al. (2019) in Appendix S2. ......................... 251 Table D.3 Model sets with explanatory factors considered at strip and location (4a), local catchment (4b), and full catchment (4c) scales compared using maximum likelihood estimation. Abbreviations “Cohes” and “Dens” are patch cohesion index and patch density, respectively. AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. ........................................ 253 Table E.1 Sampling event details for final dataset used in random forest analysis.
Sources were the Flowing Waters Information System/Ontario Stream Assessment Protocol (FWIS); Ontario Benthic Biomonitoring Network (OBBN); the Ontario Ministry of Environment, Conservation, and Parks (MOE); and the Toronto and Region Conservation Authority (TRCA). The term “K & S” means kick-and-sweep. Numbers are sampling events with percentages in parentheses. ..... 257 Table E.2 Loadings for benthic macroinvertebrate community Principal Coordinates Analysis (PCoA). Taxa are OBBN/OSAP 27-group taxa and asterisks indicate additional taxa. Taxon numeric order references Figure E.1. Loadings are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis. Summary statistics are the percentiles of the rarefied abundance data. ........................................ 258 Table F.1 Potential stream-road crossings visited across seven catchments in 2015 and 2016. Catchments ordered from low to high landscape disturbance index values. .............................. 263 Table F.2 Statistics for landscape variables taken at main sites across seven catchments (section 5.2.1 for definitions). Catchments ordered least → most disturbed in terms of most downstream site c_LDI. Statistics are 50th (2.5th, 97.5th) percentiles per catchment. .................................... 264 Table F.3 Model comparison for probabilistic stream network fixed effects. Fixed effects are those substituted while holding others constant (section 5.2.1 for definitions). Brier scores near zero indicate high predictive accuracy. Area under the receiver operating characteristic curve (AUC) scores near one indicate good model presence/absence discrimination. For proportion of variation, fixed is fixed effects and catchment is random effect. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant as “exponential” for each. .......................................
266 Table F.4 Model comparison for best probabilistic stream network fixed effects with spatial structure substitution. Brier scores near zero indicate high predictive accuracy. Area under the receiver operating characteristic curve (AUC) scores near one indicate good model presence/absence discrimination. Fixed is fixed effects and catchment is random effect. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 267 Table F.5 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) probabilistic stream network models. Brier scores near zero indicate high predictive accuracy. Area under the receiver operating characteristic curve (AUC) scores near one indicate good model presence/absence discrimination. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects (section 5.2.1 for definitions). Spatial parameters for the TU, TD, and EUC components of the model are the partial sill (θv) and range (θr). Dashed lines (--) indicate no estimate. ........................................ 268 Table F.6 Statistics for water chemistry variables taken across sites within the seven catchments (section 5.2.1 for definitions). Catchments ordered least → most disturbed in terms of most downstream site c_LDI. Statistics are 50th (2.5th, 97.5th) percentiles per catchment. ................ 269 Table F.7 Model comparison for SpCond fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model.
For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. ........ 270 Table F.8 Model comparison for best SpCond fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 271 Table F.9 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) SpCond models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ........................................ 272 Table F.10 Model comparison for pH fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects.
Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. .............. 273 Table F.11 Model comparison for best pH fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ..... 274 Table F.12 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) pH models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ........................................ 275 Table F.13 Model comparison for DO.PercEU fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. ........ 276 Table F.14 Model comparison for best DO.PercEU fixed effects with spatial structure substitution.
∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 277 Table F.15 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) DO.PercEU models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ............. 278 Table F.16 Model comparison for respiration rate fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. .............................. 279 Table F.17 Model comparison for null (Null) and best fixed effects (Best) respiration rate models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV).
R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ........................................ 280 Table F.18 Model comparison for decomposition rate fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. ........................................ 281 Table F.19 Model comparison for best decomposition rate fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 282 Table F.20 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) decomposition rate models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit.
Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. ............. 283 Table F.21 Statistics for benthic macroinvertebrate metrics taken across sites within the seven catchments (Table 5.2 for definitions). Catchments ordered least → most disturbed in terms of c_LDI. Statistics are 50th (2.5th, 97.5th) percentiles per catchment. ........................................ 284 Table F.22 Loadings for benthic macroinvertebrate community Principal Coordinates Analysis (PCoA), coarse taxonomy. Taxa are OBBN/OSAP 27-group taxa. Taxon numeric order references Figure F.5. Loadings are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis. Summary statistics are the percentiles of the abundance data. ........................................ 286 Table F.23 Loadings for benthic macroinvertebrate community Principal Coordinates Analysis (PCoA), fine taxonomy. Taxa are the lowest practical taxonomic level achieved. Taxon numeric order references Figure F.6. Only first 4 axes presented here. Loadings are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis. Summary statistics are the percentiles of the abundance data. ........................................ 287 Table F.24 Model comparison for BMI coarse PCoA1 fixed effects.
Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. ........................................ 290 Table F.25 Model comparison for best BMI coarse PCoA1 fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 291 Table F.26 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) BMI coarse PCoA1 models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate. .............
292 Table F.27 Model comparison for Pct.EPT fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. ........ 293 Table F.28 Model comparison for best Pct.EPT fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 294 Table F.29 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) Pct.EPT models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. “s_” in fixed effects indicates dominant substrate type. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate.
........................................ 295 Table F.30 Model comparison for HBI fixed effects. Fixed effects are substituted while holding other variables constant (section 5.2.1 for definitions). ∆AIC is relative to the best model (model 1). Rel L. is the relative probability of being the best model. For proportion of variation, fixed is fixed effects and catchment and site are random effects. Tail-up (TU), tail-down (TD), and Euclidean (EUC) are spatial structures held constant here as “exponential” for each. .............. 297 Table F.31 Model comparison for best HBI fixed effects with spatial structure substitution. ∆AIC is relative to best model (model 1). Rel L. is relative probability of being best model. Catchment and site are random effects. Structures for TU and TD are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). Dashed lines (--) indicate no estimate. ........................................ 298 Table F.32 Model comparison for null (Null), non-spatial (Non), and best spatial (Best) HBI models. Lower Akaike Information Criterion (AIC) indicates better and more parsimonious fit. Root mean-squared prediction error (RMSPE) uses leave-one-out cross validation predictions (LOOCV). R2 (fixed) is explained variation of the fixed effects (section 5.2.1 for definitions). R2 (full) is explained variation of fixed and random effects. R2 (CV) is from site-level 5-fold cross validation. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. “s_” in fixed effects indicates dominant substrate type. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only).
Dashed lines (--) indicate no estimate. ........................................ 299
List of Figures
Figure 1.1 Increasing complexity of incorporating spatial dependencies into ecological models. Independent and grouping can be considered non-spatial (i.e., lacking spatial reference grid) whereas adding coordinates, distance to disturbance, pairwise distances, or pairwise distances/topology represent increasingly spatial models. ........................................ 12 Figure 1.2 Approximate scales of study used in this dissertation. Sites overlay the satellite-derived built-up index. Stippling indicates no data. Landsat imagery courtesy of the U.S. Geological Survey. ........................................ 13 Figure 1.3 Relationships between data chapters based on methods (dark arrows) and prediction (white arrows) information transfer. ........................................ 14 Figure 2.1 Map of HDF sites (nsite = 1140) within TRCA jurisdiction (subwatershed boundaries are broken lines) overlaid on built-up index values from Landsat imagery (i.e., [NDVI – NDBI]). Darker pixels indicate more built-up area. Landsat imagery courtesy of the U.S. Geological Survey. ........................................ 42 Figure 2.2 A headwater drainage feature (HDF) site separated into upstream and downstream components according to flow direction. Light dashed lines indicate spatial extent of site measurements (40 m). Dark dashed lines indicate primary feature contributing most site flow. Multiple features per site can be measured (e.g., tile drain, ditch). Figure adapted from OSAP S4.M10 (Stanfield et al. 2013).
........................................ 43 Figure 2.3 Procedure for generating the Headwater Condition Index (HCI). HCI is the scaled (0–1) geometric mean of the site geometric mean (range: 1–10) and flow-weighted feature geometric mean (range: 1–10) of variables at their respective scales. ........................................ 44 Figure 2.4 Percentage of dominant vegetation categories in zones 0–1.5 m, 1.5–10 m, and 10–30 m extending out from the stream, including both left and right banks, taken over all headwater drainage features. ........................................ 45 Figure 2.5 Percentage of HCI values at various scales. For feature-scale (a) and site-scale (b) histograms, the values have been rescaled from 1–10 to 0–1. ........................................ 46 Figure 2.6 Plots of site-scale HCI with first three axes of feature-scale reduced PCoA (a–c) and site-scale PCoA (d–f). Spearman’s ρ values are 0.40, -0.07, 0.44 (p < 0.05) for a–c and -0.55, 0.49, and -0.003 for d–f (p < 0.05 for d and e but p > 0.05 for f). ........................................ 47 Figure 2.7 Results of Headwater Condition Index (HCI) sensitivity analysis. Sensitivity value is either the first-order sensitivity index (closed circles; simple additive effect) or the total sensitivity index (open triangles; additive effect + interactions with other variables) per input factor. High sensitivity values indicate higher contribution to systematic variance in HCI. ........ 48 Figure 2.8 Results from plotting observed HCI values vs. predicted HCI values based on leave-one-out cross-validation predictions (LOOCV) for a non-spatial model (a) and a model incorporating spatial dependency (b). Line is the 1:1 relationship. ........................................
49 Figure 2.9 Relationship of BU with HCI, controlling for the effects of topographic wetness index, baseflow index, and catchment area. ................................................................................. 50 Figure 2.10 Spatial distribution of Headwater Condition Index (HCI) values (points) and predictions (lines) for headwater drainage feature sites in the Toronto and Region Conservation Authority jurisdiction (a) and in the Rouge River watershed (b, delineated with black outline in a). Lines are DEM-generated streams with a 1 ha initiation threshold. Stream colours are the aggregated HCI value based on high density prediction points (~5 points per stream). Colours range from poor to good HCI values (purple → blue → yellow). Grey lines are streams greater than 2.91 km2 catchment area. ...................................................................................................... 51 Figure 3.1 Map of sites (nsite = 40) within York Region of the Greater Toronto Area overlaid on the maximum value composite image of NDVI values for 2016. Symbol sizes are proportional to decomposition rates estimated at the site-level. Landsat imagery courtesy of the U.S. Geological Survey. .......................................................................................................................................... 78 Figure 3.2 Percentage of dominant vegetation categories in zones 0-1.5 m, 1.5-10 m, and 10-30 m extending out from the stream, including both left and right banks, taken over all site:location combinations. ................................................................................................................................ 79 Figure 3.3 Decomposition rate variation with strip depth category. Points are predicted means and 95% CI for strip depth category at time of collection. Estimates do not consider uncertainty in the random effects. 
Strips could be 1) buried under sediment, 2) on stream bottom, 3) floating above stream bottom, 4) out of water (not used in analysis), or 5) unable to determine (e.g., if water was too murky). ........................................ 80 Figure 3.4 Decomposition rate variation with stream velocity. Points are predicted means and 95% CI at the location level when the estimates of a model including only depth and random intercepts are subtracted from the response. Line is predicted relationship with 95% CI and does not consider uncertainty in the random effects. ........................................ 81 Figure 3.5 Decomposition rate variation with NDVI at different levels of TWI. Points are means and 95% CI at the site level. Polygons are the 95% CI for predicted relationship between decomposition rate and NDVI in a 250 m upstream buffer at low TWI (-1.5 TWI SD; light grey), average TWI (medium grey), and high TWI (+1.5 SD; dark grey). ........................................ 82 Figure 3.6 Number of hours volunteered by citizen scientists according to project stage and individual. ........................................ 83 Figure 4.1 Map of sites (black points; nsite = 1952) used in random forest cumulative effects models and sites (red points; nsite = 184) used for adding spatial predictors. ............................ 109 Figure 4.2 Observed vs. out-of-bag (OOB) predictions for benthic macroinvertebrate (BMI) metrics from random forest models. OOB predictions are the median predicted value for an observation across all trees where the observation was not used (generally 1/3 of trees). Abbreviations are available in Table 4.2. ........................................
110 Figure 4.3 Landscape variable importance (% increase in mean-squared error [MSE]) summarized across all benthic macroinvertebrate metrics. Each point represents the importance of the variable to a single metric. Abbreviations are available in Table 4.1............................... 111 Figure 4.4 Partial variable plots for the top three predicted benthic macroinvertebrate metrics, moving downwards, and their top environmental (left) and land cover (right) predictors. The rank order of the variable importance for each BMI metric precedes the x-axis label. For each plot, the relationship is only shown between the predictor’s 10th and 90th percentile (see Section 4.3.2 for explanation). ................................................................................................................. 112 Figure 4.5 Deviations from hindcasted reference conditions (lowest impact) for each benthic macroinvertebrate (BMI) metric. Intervals represent standard deviations from the mean hindcasted deviation. Interpretation of the direction of deviation will change per metric. BMI metrics are ordered as in Figure 4.2 and descriptions are in Table 4.2. ...................................... 113 Figure 4.6 Map of sites coloured by standardized deviations from hindcasted predictions for Hilsenhoff Biotic Index (HBI). Higher values (yellow) indicate higher deviation from hindcasted prediction and indicate higher likelihood of organic pollution according to HBI. ..................... 114 Figure 4.7 Observed vs. out-of-bag (OOB) predictions for benthic macroinvertebrate (BMI) metrics when adding spatial variables to the Toronto region subset. OOB predictions are the median predicted value across all trees in which an observation was not used to generate the tree...................................................................................................................................................... 
115 Figure 4.8 Variable importance (% increase in mean-squared error [MSE]) of AEM variables for HBI (left) and spatial representation of HBI values and AEM values for chosen variables (right). For HBI, more shaded points indicate higher HBI values (i.e., more organic pollution). For AEMs, white points indicate positive AEM scores on the axis with the relative size indicating higher scores whereas black points indicate negative scores with higher relative size indicating lower scores. ........................................ 116 Figure 4.9 Variable importance (% increase in mean-squared error [MSE]) of MEM variables for HBI (left) and spatial representation of HBI values and MEM values for chosen variables (right). For HBI, more shaded points indicate higher HBI values (i.e., more organic pollution). For MEMs, white points indicate positive MEM scores on the axis with the relative size indicating higher scores whereas black points indicate negative scores with higher relative size indicating lower scores. ........................................ 117 Figure 5.1 Maps of catchment main sites (red triangles) and potential stream-roadway crossings (points) (upper panel) as a subset of seven catchments (lower panel) of varying landscape disturbance. Moving from west to east, landscape disturbance index (LDI; lower is less disturbed) at the most downstream site is 0.05 for Rogers, 0.39 for Vaughan, 0.55 for Wilket, 0.51 for Morningside, 0.50 for Ganatsekiagon, 0.39 for Pringle, and 0.03 for Ganaraska, where 0.60 is maximally disturbed (i.e., 100% urban cover) in this dataset. ........................................ 150 Figure 5.2 Longitudinal patterns of built-up index (BU) estimated at four scales at main sites across seven catchments.
Scales were a 100 m-radius buffer around sites (l_100), a 1000 m-radius buffer around sites (l_1000), a 50 m buffer extending out perpendicular to the stream on either side and along the upstream reaches (r_50), and the whole catchment (c_). “% distance downstream” is the percentage of distance travelled along the network from the most upstream site to the most downstream site. Catchments are Ganaraska (GR), Rogers (RO), Ganatsekiagon (GN), Vaughan (VN), Pringle (PR), Morningside (MO), and Wilket (WI) coloured based on landscape disturbance index values (LDI) at most downstream site (purple → green → yellow indicates increasing catchment disturbance).
Figure 5.3 Longitudinal patterns of landscape disturbance index (LDI), chemistry (SpCond), respiration rate, decomposition rate (% tensile loss per degree day [TLDD]), and benthic macroinvertebrate (Pct.EPT) variables across seven catchments. “% distance downstream” is the percentage of distance travelled along the network from the most upstream site to the most downstream site. Catchments are Ganaraska (GR), Rogers (RO), Ganatsekiagon (GN), Vaughan (VN), Pringle (PR), Morningside (MO), and Wilket (WI) coloured based on landscape disturbance index values (LDI) at most downstream site (purple → green → yellow indicates increasing catchment disturbance).
Figure 5.4 Percentage of variation explained by fixed effects, random effects (catchment and site), and spatial dependencies (tail-up [TU], tail-down [TD], and Euclidean [EUC]) across response variables used in SSN models. Null models include random effects and Non is the non-spatial version of the best spatial model (BestS).
Figure 5.5 Predicted probabilities of stream reaches in three catchments of varying catchment disturbance using the null, best fixed effects non-spatial model, and best spatial model. Here, landscape disturbance index values (LDI) are based on the most downstream site, where 0.6 is maximally disturbed (i.e., 100% urban).
Figure 5.6 Boxplots of chemistry variables (a, c, e) and relationships with strongest predictor (b, d, f). Boxes (a, c, e) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 x IQR. Polygon (b, d, f) is 95% confidence interval (CI) about the relationship. Points (b, d, f) are means and 95% CIs for sites. “Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted. Lines (b, d, f) indicate different regression lines based on years.
Figure 5.7 Boxplots of respiration rate (a) and its relationship with strongest predictor (b). Boxes (a) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 x IQR. Polygon (b) is 95% confidence interval (CI) about the relationship. Points (b) are means and 95% CIs for sites.
Figure 5.8 Boxplots of % tensile loss per degree day (TLDD) (a) and relationship with strongest predictor (b). Boxes (a) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 x IQR. Polygon (b) is 95% confidence interval (CI) about the relationship. Points (b) are means and 95% CIs for sites. “Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted.
Lines (b) indicate different regression lines based on years.
Figure 5.9 Boxplots of benthic macroinvertebrate metrics (a, c, e) and relationships with strongest predictor (b, d, f). Boxes (a, c, e) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 x IQR. Polygon (b, d, f) is 95% confidence interval (CI) about the relationship. Points (b, d, f) are means and 95% CIs for sites. “Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted. Lines (b, d, f) indicate different regression lines based on years (different line types) and substrate category.
Figure B.1 Spatial stream network distances and topologies. Each reach, i, can have l_i and u_i, which are the most downstream and upstream distances on that reach, respectively. The u_i at terminal branches are assumed to be infinite for modelling purposes. Flow-unconnected distances are simple hydrological distance (e.g., s2 → s3 = a + b and vice versa) whereas flow-connected distances are only in the downstream direction (e.g., s2 → s3 = 0, s2 → s1 = a + c, and s1 → s2 = 0). Example adapted from Peterson and Ver Hoef (2010).
Figure B.2 Calculation of spatial weights and catchment characteristics using the SSN framework. For a) and b), reaches, i, are associated with a reach-contributing area (RCA_i). At confluences (i.e., p5 and p9), catchment areas are used to calculate the proportional influence (RCA PI) of each upstream reach on the downstream reach. The product of all downstream reach RCA PIs inclusive of its own is used to calculate the additive function (RCA AFV) that is used in calculating spatial weights (e.g., ω) in c).
For c), ratios (i.e., proportion that a site moves upstream on its reach) and catchment characteristics from b) are used in eqn. (14) to calculate point-specific catchment characteristics.
Figure B.3 Comparison of estimated catchment characteristics using the classic approach (i.e., delineating each site individually) and the edge-generated approach (i.e., using RCAs and site distances/topologies) for catchment area (a) and mean NDBI (b). The solid lines indicate a 1:1 relationship.
Figure B.4 Buffers surrounding a site of 50, 100, and 250 m (increasing radius and line thickness) for estimating local catchment characteristics (i.e., l_).
Figure B.5 Riparian buffers surrounding stream reaches of 50 m and 100 m (increasing radius and line thickness) for estimating riparian buffer catchment characteristics using eqn. (14) (i.e., r_). If a buffer overlaps another catchment (i.e., upper left of figure), the buffer is truncated by the catchment boundary.
Figure C.1 Scores of features shaded by variable category along the first two feature-scale PCoA axes (a), 95% confidence ellipses of feature scores shaded by variable category along those axes (b), and proportion of data in each variable category (c) with shades corresponding to (a) and (b). Variables differ per three panel plot and have sub-captions at the bottom of each plot. Category descriptions can be found in Table 2.1.
Figure C.2 Scores of features shaded by variable category along the first two reduced feature-scale PCoA axes (a), 95% confidence ellipses of feature scores shaded by variable category along those axes (b), and proportion of data in each variable category (c) with shades corresponding to (a) and (b). Variables differ per three panel plot and have sub-captions at the bottom of each plot. Category descriptions can be found in Table 2.1.
Figure C.3 Scores of sites shaded by variable category along the first two site-scale PCoA axes (a), 95% confidence ellipses of site scores shaded by variable category along those axes (b), and proportion of data in each variable category (c) with shades corresponding to (a) and (b). Variables differ per three panel plot and have sub-captions at the bottom of each plot. Category descriptions can be found in Table 2.2.
Figure D.1 Mean daily stream temperature for all locations during their deployment periods (grey lines) and the mean of mean daily temperatures across all locations (circles) for the study duration.
Figure D.2 Gradients of headwater structural measurements using Principal Coordinates Analysis. Areas of focus are a) channel characteristics and b) riparian vegetation characteristics. Axis percentages indicate variation explained by each axis of the original data. Points are site:location scale and are coloured by a) channel substrate size (darker ∝ larger substrates), and b) riparian vegetation score (darker ∝ denser, more forested vegetation). Ellipses are 95% confidence ellipses for the dominant substrate categories, where darker ∝ larger substrates.
Figure E.1 Scores of sites (n = 1952; grey points) and benthic macroinvertebrate taxa (n = 31; red points with number) along the first two Principal Coordinates Analysis (PCoA) axes. Numbers refer to “Taxon numeric order” in Table E.2. Scores for taxa are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis.
Figure E.2 Scores of sites (n = 1952, grey points) and benthic macroinvertebrate metrics (n = 18, red text). Angles between arrows with themselves and axes reflect their correlations. Percentages represent explained variation per axis. Abbreviations can be found in Table 4.2.
Figure F.1 Example of quantile random forest regression stream temperature predictions (dotted lines) based on nearby weather stations against stream temperature observations (solid line) at Ganaraska Site 1 in 2015. OOB R² for this model was 0.88. OOB predictions are the median predicted value for an observation across all trees where the observation was not used (generally 1/3 of trees). Predictions are predicted stream temperatures based on nearby weather stations and the random forest model.
Figure F.2 Scores of all chemistry measurements (n_obs = 197) taken at main sites across seven catchments. Angles between arrows with themselves and axes reflect their correlations. Percentages represent explained variation per axis. Variable definitions in section 5.2.1.
Figure F.3 Observed vs. out-of-bag (OOB) predictions for chemistry metrics based on random forest analysis with landscape variables.
OOB predictions are the median predicted value for an observation across all trees where the observation was not used (generally 1/3 of trees). Variable definitions in section 5.2.1.
Figure F.4 Observed vs. out-of-bag (OOB) predictions for chemistry metrics based on random forest analysis with landscape, chemistry, and headwater condition variables. OOB predictions are the median predicted value for an observation across all trees where the observation was not used (generally 1/3 of trees). Abbreviations are available in Table 5.2.
Figure F.5 Scores of sites (n = 262; grey points) and benthic macroinvertebrate taxa (n = 27; red points with number) along the first two Principal Coordinates Analysis (PCoA) axes. Numbers refer to “Taxon numeric order” in Table F.22. Scores for taxa are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis.
Figure F.6 Scores of sites (n = 262; grey points) and benthic macroinvertebrate taxa (n = 78; red points with number) along the first two Principal Coordinates Analysis (PCoA) axes. Numbers refer to “Taxon numeric order” in Table F.23. Scores for taxa are the coefficients (mean) from a linear model fit to the axis scores. Percentages represent explained variation per axis.
Figure F.7 Image comparison of high-resolution satellite imagery (Google Maps Image) with satellite-derived normalized difference vegetation index (NDVI), normalized difference built-up index (NDBI), and BU (NDBI – NDVI). In NDVI, lighter pixels indicate higher vegetation density. In NDBI, lighter pixels indicate higher urban density. In BU, lighter pixels indicate higher urban density.
Note that agricultural fields (i.e., upper right portion of images) have higher NDBI values (i.e., lighter) than urban areas (i.e., lower left portion of images). However, they have lower BU (i.e., darker) values than urban areas. This suggests BU better resolves agricultural fields and urban areas than NDBI and NDVI can do individually. Map data: Google, TerraMetrics, National Oceanic and Atmospheric Administration; Landsat imagery courtesy of the U.S. Geological Survey.

Acknowledgements

This dissertation is the outcome of so much support. First, I’d like to thank Dr. John Richardson for providing endless knowledge, opportunities, patience, and encouragement in all aspects of PhD life. From the beginning, I have respected your sincere commitment to the success of students, postdoctoral fellows, and collaborators. Thanks to Dr. Dan Moore and Dr. Jeanine Rhemtulla for providing constructive criticism of my proposal, during committee meetings, and with drafts of the dissertation. I am also grateful for the mentorship and friendship given by Dr. Lenka Kuglerová during many hours of field work and research discussion. Hopefully your long winters in Sweden don’t take away from your truly positive outlook and strong work ethic. I’d like to acknowledge and thank Les Stanfield for his enthusiastic and far-reaching commitment to stream research in Ontario (e.g., OSAP, FWIS, Stream Monitoring and Research Team networks to name only a few). Similarly, I would like to acknowledge collaborators Dr. Jim Buttle, the late Dr. Antoine Morin, and Dr. Ibrahim Rashid for their thoughtful discussions throughout my work. In particular, I would like to thank Bernadette Charpentier for her support and research skills.
I’d also like to acknowledge the many members of the Southern Ontario Stream Monitoring and Research Team for always having useful practical and theoretical discussions and a place to present my ongoing work. I’m very grateful to Dr. Don Jackson for welcoming me into his lab group as a visiting student for the final years of my PhD.

I also owe much of my day-to-day research enjoyment to interactions with members of the Richardson lab at UBC and the Jackson lab at University of Toronto. To my office mates (at UBC: Ana Chará-Serna, Roseanna Gamlen-Greene, Kasey Moran, Tonya Ramey, Matt Wilson; at UofT: David Benoit, Jenn Bonjte, Tessa Brinklow, Ruben Cordero, Abby Daigle, Lauren Lawson, Melissa Orobko, Ian Richter), it was an absolute pleasure to interact with you on a daily basis as we struggled through together. It was also a pleasure to have interactions with other UBC lab members: Danielle Courcelles, Gillian Fuss, Liliana García Lago, Sean Naman, Felipe Rossetti de Paula, Elizabeth Perkin, David Tavernini, Alex Yeung and others. At UofT, it was always nice to have discussions with Karen Alofs, Omar Cabezis, Andrew Chin, Daniel Gillis, Bailey Hewitt, Lauren Jarvis, Karl Lamothe, Lauren Lawson, Charlie Loewen, Ken Minns, Jennifer Robinson and Laura Third. I am indebted to the hard work and dedication of Chris Blackford, Madeline Burnatowski, Shantel Catania, Alyssa Fiedler, and Steven Typa for field and laboratory help. I also thank Erik Emilson for his mentorship and friendship throughout my academic journey.

My family was an important source of encouragement. Thanks to my parents, Peter and Judy, for always offering your love and support in countless ways. To Colleen Smith, for also providing endless support and encouragement to our family. To my siblings, David and Wendy, and their families, thank you for the encouragement you provided. To Kaye, Angeline, and Celine, thank you for being there while we were living in Toronto.
I cannot even begin to thank my wife, Corina, our daughter, Ruby, or our son, Jasper, for the joy you all bring to my life. You are so patient, supportive, loving, and fun to be around. Corina, we went through so many life changes these last few years. I am grateful that we could do it together.

Chapter 1: Introduction

1.1 The problem of pattern, scale, and prediction in ecology

Levin’s “The problem of pattern and scale in ecology” (1992) both synthesized and provided a launching point for ecological thinking that there is no correct scale at which to conduct science, and that a unifying problem across disciplines is how best to translate information across scales. Since then, improvements in computing power, biochemical methods, environmental sensing, and worldwide communications have shifted our treatment of scale issues from “verbal” to “quantitative” (Schneider 2001, Chave 2013). Around the same time, Peters’ “A critique for ecology” (1991) synthesized thinking that ecology was relatively weak in generating useful concepts, hypotheses, and predictions. Since then, strong predictions that demonstrate our understanding, if any, are still lacking (Houlahan et al. 2017). For example, a meta-analysis revealed that ecologists can explain < 5% of the variation on average in the main factors of interest (Møller and Jennions 2002). Similarly, spatial autocorrelation (i.e., that observations spaced close together are more similar) can be a better predictor of abundance than niche-based models (i.e., that the abundance is dependent on some environmental factors) (Bahn and McGill 2007). Furthermore, questions of predictive transferability (i.e., in different contexts), predictive capacity (i.e., predictions with “no knowledge” versus “full knowledge”), and the upper limit on predictive capacity (i.e., observational error) are rarely explicitly addressed in ecological studies (Houlahan et al. 2017).
This could be because practitioners recognize that predictions lie on an apparent continuum of “explanatory” to “anticipatory” based on predictive practice, with the former based on repeated measures of individual systems and the latter assuming underlying relations are true for anticipating system responses (Mouquet et al. 2015).

This problem of prediction arises, in part, because patterns of physical and biological indicators emerge from complex and multiscale processes. In discussing the importance of the Long Term Ecological Research program, two papers – “Long-term research and the invisible present” and “Long term research and the invisible place” – persuasively argue that current conditions are best understood with temporal and spatial contexts considered (Magnuson 1990, Swanson and Sparks 1990). For example, without opening up a 132-year record of ice cover duration one cannot appreciate a single datapoint as uncharacteristically low, related to decadal patterns of El Niño events, and related to nearly a century of average decline (Magnuson 1990). The invisible present and invisible place operate at spatial and temporal scales that are less perceptible than those that contemporary and regional ecological studies can appreciate.

Human activities (e.g., land use decisions) also emerge from complex interactions within social-ecological systems; they are often carried out using the best available data despite high degrees of complexity and uncertainty about future impacts (Polasky et al. 2011). Human activities have had such a pervasive effect on their physical and biological environments that we have recently entered what some consider a new geological epoch – the Anthropocene (Crutzen 2006, Reid et al. 2019). However, the effects of activities, such as urbanization along stream networks, can change depending on natural context (Booth et al. 2016, Utz et al. 2016).
Combined, natural and anthropogenic disturbances occur at varying scales resulting in patchy mosaics of landscapes in various stages of recovery (Urban et al. 1987, Turner 1989). When a thorough understanding of temporal and spatial contexts is lacking, environmental practitioners should employ multiscale techniques that seek non-random spatial patterns which point to drivers of variation or scales requiring more study (Dray et al. 2012).

1.2 Cumulative effects

Cumulative effects can be either the frequent and dense application of single activities or the combined effects of multiple activities on an indicator across broader spatial and temporal scales. Indicators can be any “valued ecosystem component” representing an ecosystem structure (e.g., diversity) or function (e.g., growth rate) (Jones 2016). Indicators can also be “stressor-based” or “effects-based” – stressor-based indicators reflect likely exposure to stressors (e.g., catchment % impervious cover) whereas effects-based indicators reflect the integration of many potential biological and stressor interactions (e.g., biological diversity; Jones 2016). The biological relevance of stressor-based or effects-based indicators may be less clear than that of a valued ecosystem component since ecosystem indicators may respond very differently to the same stressor. Just as indicators take many forms, so can the expected cumulative effects. In the simple case of two stressors (a, b), cumulative effects responses can be additive (a + b); synergistic, responses where two impacts together exceed the impact of simply adding both effects (> a + b); or antagonistic, responses where two impacts together are less than the impact of both effects (< a + b) (Crain et al. 2008). Recent definitions distinguish between positive and negative effects on indicators appropriate to environmental practitioners working in disturbed environments (e.g., positive or negative synergism; Piggott et al. 2015).
The complexity in outcomes of just two stressors highlights one of the many challenges associated with cumulative effects.

Despite challenges of scale and complexity when evaluating cumulative effects, environmental practitioners are often tasked with developing indicator targets or standards for evaluating whether current or future conditions are “acceptable” or “unacceptable” (Seitz et al. 2011, Polasky et al. 2011, Jones 2016). Therefore, predicting cumulative effects has important implications for conservation and land management. Cumulative effects assessment is a subdiscipline of environmental impact assessment; it considers effects of all activities on a valued ecosystem component across spatial and temporal scales whereas environmental impact assessments are limited in scope to cause-effect-mitigation on a per project or per ecosystem component basis (MacDonald 2000, Seitz et al. 2011, Jones 2016). Assessing cumulative effects can suffer in flexibility because it can lack tangible and suitable approaches for evaluating complexity of interactions on an indicator (Seitz et al. 2011, Jones 2016). This matters, for example, when an indicator such as fish growth rate is not directly altered by a single activity (e.g., house construction) but the intensification of one activity (e.g., many houses) or multiple activities (e.g., roads and shopping centre construction) affects quantity and quality of resources for the indicator (e.g., availability in the amount or types of benthic macroinvertebrates that support growth). Consequently, the response of an indicator is expected to be a cumulative product of complex interactions related to those activities and to other ecosystem patterns and processes that occur at multiple spatial and temporal scales. Prediction of the response of indicators requires multiscale approaches to increase understanding of the outcomes of many activities.
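
The two-stressor classification described in this section (additive, synergistic, antagonistic) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: effects are expressed as impact magnitudes on a common scale, and the function name `classify_interaction` and the tolerance `tol` are hypothetical choices, not part of any cited framework. In practice, the comparison with the additive expectation would be made against confidence intervals rather than a fixed tolerance.

```python
def classify_interaction(effect_a, effect_b, effect_ab, tol=1e-9):
    """Compare a combined two-stressor effect with the additive expectation a + b.

    effect_a, effect_b: impact magnitudes of each stressor applied alone.
    effect_ab: impact magnitude observed when both stressors act together.
    tol: hypothetical tolerance for treating the combined effect as additive.
    """
    expected = effect_a + effect_b  # additive expectation (a + b)
    if abs(effect_ab - expected) <= tol:
        return "additive"
    # Greater than a + b is synergistic; less than a + b is antagonistic.
    return "synergistic" if effect_ab > expected else "antagonistic"


# Example: stressors that reduce an indicator by 2 and 3 units alone,
# and by 7 units together, exceed the additive expectation of 5.
print(classify_interaction(2.0, 3.0, 7.0))
```

The directional distinctions of Piggott et al. (2015), such as positive versus negative synergism, would require signed effects and are deliberately left out of this sketch.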
1.3 Predictive capacity and multiscale models

To improve on weak predictions, predictive ecology must be flexible enough to support multiple working hypotheses and modern hierarchies (i.e., multi-level information translation across scales), practical enough to measure complementary variables from and for other disciplines, and simple enough to provide practitioners with reasonable direction to complex problems (Peters 1991). Across many subdisciplines of ecology, and following an explosion in the amounts of data collected, the challenge still remains how best to translate information across scales to obtain better predictions (Levin 1992, Chave 2013). Examining indicator responses through the lenses of multiple scales can provide better understanding about where studies fail or succeed in prediction.

Quantitative methods developed in the fields of community, landscape, and numerical ecology can be used to confront scale and complexity issues in assessing cumulative effects. These three fields incorporate multiscale heterogeneity rather than ignore important ecological phenomena occurring at scales other than a focal ecosystem component (Urban et al. 1987, Brown et al. 2011, Legendre and Legendre 2012). In community ecology, the metacommunity framework suggests that local ecological communities are shaped by interactions with broader-scale communities, a “metacommunity”, in complex ways (Leibold et al. 2004, Brown et al. 2011); an example being that a connected community is more resilient to extirpation since dispersing individuals “rescue” a population or community (Symons and Arnott 2013). In landscape ecology, variation in patterns and processes of both response and explanatory variables are quantified at different scales (Urban et al. 1987, Turner 1989).
Finally, numerical ecology formalizes these associations between ecosystem components and/or explanatory variables in univariate and multivariate space (e.g., communities of benthic invertebrates that are influenced by regional and local biological and abiotic patterns (Legendre and Legendre 2012)). Together, through various methods these fields offer the ability to quantify complexity at different scales, produce explanatory models, and generate testable hypotheses that can increase the predictive capacity of modelling ecological response to cumulative effects of disturbance.

Modelling multiple scales can be done in several ways depending on research questions and data at hand (Figure 1.1). One can consider spatial variation as a nuisance or as unmeasured variation resulting from complex spatial and temporal scales worth investigating (Dray et al. 2012). In geostatistics, non-random spatial variation is modelled and used for predicting indicator values at unmeasured locations (Dale and Fortin 2014). Generally, inclusion of spatial dependencies can be a continuum of simple to increasingly complex, involving tradeoffs between generality and specificity. Importantly, not appropriately accounting for spatial and temporal dependency can increase type I errors (rejecting the null hypothesis when it is true) (Zuur et al. 2010). If observations are not independent, a simple approach can be to group data into hierarchies (i.e., scales) of observations. Groups could be included as simple categorical variables as in regression models (e.g., different geographic regions) or more complex dependency structures as in multilevel regression (e.g., mixed effects models). If observations are not spatially independent, dependencies can also be incorporated using simple categorical variables or covariates (e.g., x- and y-coordinates, distances to disturbances), more complex covariates (e.g., scores along multivariate axes representing the distances between sites; Blanchet et al.
2011), or through modelled adjustments to the covariance matrix based on site distances and topologies (Ver Hoef and Peterson 2010). In any of these scenarios, observational variation can be partitioned and interpreted with respect to several scales (e.g., between-group or shared variation as a function of separation distance). Partitioning variation at various scales and monitoring predictive capacity using different approaches will be useful for practitioners modelling cumulative effects since it can be instructive for targeting management action at the relevant scale (e.g., reducing urban cover near streams versus within the whole catchment).

1.4 Multiscale models to improve predictions in stream ecosystems

Freshwater ecosystems are threatened by many natural stressors and human activities. Over the past decades, freshwater organisms have been under increasingly greater threat than their marine and terrestrial counterparts (Strayer and Dudgeon 2010, Reid et al. 2019). Both biodiversity and human water security are threatened by the effects of water resource development, pollution, catchment disturbance, and biotic factors like invasive species (Vörösmarty et al. 2010).

The analysis of stream ecosystems stands to benefit from multiscale modelling because they integrate the complex spatial, temporal, biological, and human interactions mentioned above. Stream ecosystems have directional flow through dendritic networks – small streams combine to form successively larger streams. While it is useful to think of streams as having characteristic shifts in geomorphology and biological communities moving from headwaters to downstream (e.g., the river continuum concept; Vannote et al. 1980), strong spatial factors like disturbances and tributaries of varying sizes can result in discontinuities that extend in both upstream and downstream directions to confound the interpretation of longitudinal patterns (Benda et al. 2004, Kiffney et al.
2006, Ellis and Jones 2016, Jones and Schmidt 2018). It is well known that freshwaters are integrators of their contributing catchments and dependent on flows of energy and materials from adjacent ecosystems to support food web productivity from algae/bacteria through to fish (Richardson et al. 2010, Marcarelli et al. 2011, Tanentzap et al. 2014, 2017, but see Brett et al. 2017). It is also well known that the position of sites along dendritic stream networks results in varying proportional influences of local- versus broader-scale processes. For example, conditions of small streams are strongly influenced by localized catchment conditions like topography, soil, rainfall, and vegetation, whereas waterbodies with successively larger drainage areas (i.e., when many small headwater systems join) are more a function of fluvial network processes (Gomi et al. 2002, Richardson 2019b, 2019a). For this reason, multiscale network approaches are increasingly favoured that consider the landscape context of sites (i.e., network position and quantity and quality of contributing catchments) to provide insight into whether current indicator values result from expected gradients of environmental variation (i.e., longitudinal patterns) or depend on a set of disturbances occurring at multiple scales (Peterson et al. 2013). Following the development of theory for appropriately accounting for stream network topologies and autocovariance (Cressie et al. 2006, Ver Hoef et al. 2006, Peterson and Ver Hoef 2010, Ver Hoef and Peterson 2010), many applications of these models may reveal more general patterns within and between regions to improve on existing conceptual frameworks.

A remaining challenge for both conceptual and cumulative effects models in streams is how to best account for the effects of small streams (Bishop et al. 2008, Stanfield et al. 2014).
It is well understood that small streams are tightly linked with local and catchment properties that may leave them more susceptible to immediate effects of landscape disturbance than downstream environments (Gomi et al. 2002, Freeman et al. 2007). It is also well understood that, in aggregate, small streams contribute substantially to the total catchment area and stream length of larger systems (Richardson 2019b), the processing and export of important materials downstream (Wipfli et al. 2007), the maintenance of biodiversity in stream networks (Meyer and Wallace 2001, Meyer et al. 2007), and the availability of habitat and refugia for unique assemblages of terrestrial and aquatic organisms (Meyer et al. 2007, Ramey and Richardson 2017). Nevertheless, small streams are disproportionately buried or converted to drainage infrastructure in agricultural and urbanizing regions (Kaushal and Belt 2012, Stammler et al. 2013). Furthermore, despite strong functional links showing that headwaters deliver important resources downstream, limited research focuses on predicting how multiple instances of local-scale headwater alterations may cumulatively affect downstream environments. For example, impairment of headwaters may affect local habitats (e.g., nutrient enrichment from fertilized lawn runoff) but the strength of these “small” effects is likely diluted when multiple headwater systems combine to form larger networks. However, impairment of many headwater systems could cumulatively alter downstream environments within the network. Incorporating small streams into cumulative effects models will require understanding their multiscale spatial variability to disentangle their individual and aggregate effects on downstream environments.
1.5 Dissertation goals, overarching hypotheses, and structure

In this dissertation, my goals were to 1) examine how incorporating spatial dependencies affects predictive capacity and explained variation in stream network models across long gradients of land cover (e.g., in forested, agricultural, and urban settings), and 2) examine the structural and functional properties of headwaters across an urbanizing region and their importance for downstream ecosystems. Examining spatial dependencies was my attempt to reveal the invisible present and invisible place not captured by measured variables (Magnuson 1990, Swanson and Sparks 1990). To be applicable, I worked at scales associated with watershed managers and biomonitoring programs (Figure 1.2). For consistency, I applied similar landscape and data analyses in subsequent chapters with slight modifications (Figure 1.3). In addition to other chapter-specific goals, I examined several overarching hypotheses:

1) Predictive capacity and explained variation increase by incorporating spatial dependencies representing multiple scales.
2) Less total variation is explained by fixed effects and fewer environmental variables are statistically significant when incorporating multiple scales (i.e., reduced type I error) since they represent spatial autocorrelation rather than “true” environmental gradients.
3) Urban factors (e.g., % impervious cover) explain more indicator variation than environmental factors (e.g., geology) since anthropogenic effects on ecosystems are so pervasive.
4) Local-scale landscape variables (e.g., those generated in 100 m-radius circular buffers around sites) and spatial dependencies explain more between-site variability than between-catchment variability and larger scale variables (e.g., those generated at whole catchment scales) since disturbance disrupts longitudinal patterns.
5) Headwater conditions explain similar or more variation than other landscape variables since they better represent nuanced cumulative effects.

In Chapter 2, I examined the variability of headwater stream structural properties, developed a composite indicator of headwater condition, and related headwater conditions to landscape variables across an urbanizing region using multiscale spatial models; I examined changes in predictive capacity and explained variation by comparing the best model with a non-spatial model.

In Chapter 3, I developed and analyzed data from a citizen science protocol designed to examine the functional and structural variability of urban headwater streams. I examined variability in cotton strip decomposition rate as it relates to predictor variables measured at several scales.

In Chapter 4, I developed regional benthic macroinvertebrate cumulative effects models to examine the effects of environmental and land cover variables. From this model, I predicted conditions when land cover disturbance variables were set to low levels (i.e., hindcasting) and examined how sites deviated from these expected “reference” conditions. I also took a regional subset of sites to examine how spatial variables and headwater conditions (from Chapter 2) affect predictive capacity.

In Chapter 5, I applied a common multiscale analytical framework to examine the spatial variability of stream chemistry, decomposition and respiration rate, and benthic macroinvertebrates as they related to variables measured at several scales across a land cover disturbance gradient. For the indicators, I used the best predictors and spatial dependencies to compare predictive capacity and explained variation with non-spatial and null models (i.e., no predictors).

In Chapter 6, I discussed how my findings advance our understanding of stream network models with respect to explained variation, predictive capacity, and cumulative effects research.
I placed my findings in the context of existing literature and their implications for future research.

1.6 Figures

Figure 1.1 Increasing complexity of incorporating spatial dependencies into ecological models. Independent and grouping can be considered non-spatial (i.e., lacking spatial reference grid) whereas adding coordinates, distance to disturbance, pairwise distances, or pairwise distances/topology represent increasingly spatial models.

Figure 1.2 Approximate scales of study used in this dissertation. Sites overlay the satellite-derived built-up index. Stippling indicates no data. Landsat imagery courtesy of the U.S. Geological Survey.

Figure 1.3 Relationships between data chapters based on methods (dark arrows) and prediction (white arrows) information transfer.

Chapter 2: Predicting variability in the condition of urbanizing headwaters using a composite indicator, landscape variables and spatial dependencies

2.1 Introduction

Small streams (e.g., headwaters) are important physical and biological components of stream networks. Although individually small, these streams can cumulatively contribute over 70% of catchment area and stream length. Small streams, whether ephemeral or perennial, also function to retain, process, and export important materials to downstream environments (e.g., sediments, inorganic chemicals, and organic materials) as well as provide habitat and refugia for aquatic and terrestrial organisms (Hynes 1975, Meyer and Wallace 2001, Gomi et al. 2002, Meyer et al. 2007, Wohl 2017, Richardson 2019b, 2019a). Small stream functions are tightly linked with local and catchment properties (Gomi et al. 2002, Freeman et al. 2007).
In forested regions, small streams tend to have rough surfaces that increase hydrological and material retention (e.g., coarse and unsorted substrates, fallen debris), whereas streams in agricultural or urban regions can be buried (i.e., for drainage infrastructure) or tend to have reduced riparian vegetation and simplified structures designed for the quick conveyance of surface water (Paul and Meyer 2001, Freeman et al. 2007, Wipfli et al. 2007, Kaushal and Belt 2012). Individually, this degraded physical condition and disruption to natural flow regime can result in increased peak flows; temporarily increased sediment supply, nutrients, contaminants, and scouring during storm events; reduced residence time for uptake and processing of nutrients, contaminants, and organic matter before they move downstream; and reduced local habitat availability and variability for organisms (Paul and Meyer 2001, Meyer et al. 2005, Walsh et al. 2005, Booth et al. 2016). Despite decades of research into the impacts of land cover disturbance on small to large streams, how alteration to individual or many headwaters cumulatively impacts downstream environments remains a challenge in stream ecology (Freeman et al. 2007, Wenger et al. 2009).

From a practical perspective, the cumulative effects of alterations to small streams may be difficult to measure because of their high density within watersheds, their types and intensities of land uses (e.g., extremes of 0% and 100% of different land cover classes are common), and because the effects of individual alterations are likely diluted as small streams combine into larger streams. Furthermore, much of our understanding of small stream functions comes from studies using a few sites to examine community or process responses to stressors (Wallace et al. 1997, Meyer et al. 2005, Yeung et al. 2017).
Fewer studies take a landscape approach to study or incorporate headwaters, and most that do investigate variation in responses as they relate to site or landscape variables taken out of spatial context (i.e., assuming independence of sites; e.g., Alexander et al. 2007, Dodds and Oakes 2008, Abbott et al. 2018). While these landscape approaches are certainly research question-dependent, moving forward with cumulative effects assessment requires better understanding and prediction of the spatial variability of small stream properties across the landscape.

Spatial stream network (SSN) models are useful for incorporating spatial context into predictive models. SSN models are a class of spatial generalized linear mixed effects models that use stream topology and between-site distances to incorporate spatial dependency (Peterson and Ver Hoef 2010, Ver Hoef and Peterson 2010). For a given physical or biological indicator, SSN models can be used for constructing models with site, reach, or landscape variables and for predicting values at unmeasured locations using spatial dependencies. The approach allows for studying important spatial relationships that may represent unmeasured variation (Peterson and Ver Hoef 2010). Furthermore, the predictions can be aggregated as their own landscape variable to assess small stream influences on a downstream indicator. In this way, SSN models provide a rich spatial framework for understanding and predicting cumulative effects of headwater conditions.

Here, we explored the spatial variability of headwater conditions across an urbanizing region using SSN models. Our goals were to use an empirical dataset of rapid assessment measurements taken at headwaters to explore variation in these measurements, construct a composite indicator of headwater condition using these measurements, and relate this indicator to landscape variables while accounting for spatial dependencies. We were interested in answering three questions.
First, what are the general properties of headwater drainage features (HDFs) and how does the composite indicator relate to the original data? Second, what landscape variables and spatial dependencies explain variability in our composite indicator? Third, how does incorporating spatial dependencies change relationships with landscape variables and change predictive capacity versus a model without these dependencies? For the first question, we used multivariate analysis to explore variability in measurements taken at two different scales within a site, correlated our composite indicator with scores of sites along multivariate axes, and conducted a sensitivity analysis of our composite indicator. For the second question, we constructed SSN models and examined how our indicator related to landscape variables taken at different scales (e.g., within localized GIS-generated buffers, within GIS-generated buffers surrounding the stream, or within the whole catchment) and using different types of spatial dependencies. For the third question, we fit our best model with and without spatial dependencies and compared their outputs. We expected wide variability in the structural properties of HDFs but that headwater conditions (i.e., composite indicator) would be strongly related to landscape variables. We also expected that incorporating spatial dependencies would change the magnitude of landscape variable effects but increase predictive capacity overall. If a standardized composite indicator of headwater condition can be reasonably predicted with landscape variables and spatial context at unmeasured locations, then this approach would be useful to practitioners for predicting the cumulative effects of headwater alterations on downstream environments.

2.2 Methods

2.2.1 Study region and headwater drainage features dataset

Our urbanizing region was within the Toronto and Region Conservation Authority (TRCA) watershed management jurisdiction (Figure 2.1).
The Greater Toronto Area is a densely populated region of 6.9 million people with an expected growth to 9.7 million by 2041 (OMMA 2017, OMF 2018). Following historic growth patterns, it is expected that development will proceed away from downstream reaches of large rivers to headwater reaches (TRCA 2007). Sites were located in the Ecological Land Classification of Canada’s Mixedwood Plains Ecozone (glacial deposits and sedimentary rock geology, mixture of deciduous and coniferous vegetation, warm summer and cool winter climate; Statistics Canada 2018), mostly draining from the Oak Ridges Moraine in the north into Lake Ontario in the south.

To explore landscape variability in headwater streams, we used an empirical dataset of HDFs generated using the Ontario Stream Assessment Protocol (OSAP) (Stanfield et al. 2013). An HDF is broadly defined as “a depression in the land that conveys surface flow.” The protocol is designed to assess HDFs (e.g., natural streams, ditches, swales) with respect to their relative conditions (e.g., bank erosion) and contributions to stream networks (e.g., discharge, sediment movement). Several HDFs can be nested within a headwater site (spatial extent: 40 m upstream and downstream) and rapid assessments are undertaken at both scales; headwater sites are most often measured at road crossings due to ease of sampling (Figure 2.2).

We obtained the data through the Flowing Waters Information System (FWIS), a collaborative database overseen by the Centre for Community Mapping (COMAP) with various government and academic contributors through which OSAP data can be accessed (COMAP 2019). Three tables were exported from FWIS that include all data collected using OSAP S4.M10 “Assessing Headwater Drainage Features”: “tblHeadwaterSE” contains site-level data, “tblStreamFeature” contains feature-level data, and “tblFlowMeasure” contains feature-level flow measurement data (Stanfield et al. 2013).
For each table, we screened those variables used in subsequent analyses and removed sites having suspicious data (e.g., records outside bounds of protocol categories, extreme outliers). If data were suspicious at the feature-level, we removed the entire site. We used only upstream features to estimate conditions at the site-scale to avoid biasing stream characteristics with the immediate effects of a roadway. Our cleaned analysis dataset consisted of 1140 sites (Figure 2.1).

2.2.2 Multivariate analysis of HDF data

We explored the cleaned TRCA HDF dataset using multivariate analyses at two separate scales: feature- and site-scale. We used a principal coordinates analysis (PCoA) with the Gower dissimilarity matrix since it allows for mixed variable types (e.g., numerical and factor variables) (Legendre and Legendre 2012, Borcard et al. 2018). At each scale, we selected only those variables used in generating the composite indicator (outlined below) and only those sites with complete data (variables in Table 2.1 and Table 2.2). At the feature-scale, one potential issue was that six variables estimate dominant riparian vegetation per feature (three bands extending 0–1.5 m, 1.5–10 m, and 10–30 m on each side of a feature); these variables might be redundant and overwhelm the PCoA such that it primarily describes variation in riparian vegetation. Therefore, in a separate multivariate analysis of the feature-scale data, we aggregated these bands by selecting the most frequently occurring riparian vegetation category (e.g., scrubland) across all band:side combinations (six aggregated to one). Altogether, three PCoAs were run: feature-scale, feature-scale riparian-reduced, and site-scale. For each PCoA, we generated the Gower dissimilarity using daisy in ‘cluster’ 2.0.6 and ran the PCoA in Microsoft R Open 3.5.1 (cmdscale) on the square-rooted dissimilarity matrix to ensure the dissimilarities were Euclidean (Legendre and Legendre 2012, Maechler et al.
2018, Microsoft 2018, R Core Team 2018). We retained axes based on the broken stick method (Legendre and Legendre 2012). Using envfit in ‘vegan’ 2.5-4, we fit our data back to the ordination space and explored patterns using numerical output of factor loadings, plotting, and 95% confidence ellipses (Oksanen et al. 2019).

2.2.3 Evaluating a composite indicator of headwater condition (HCI)

2.2.3.1 Generating the HCI

For broader applicability to biomonitoring programs, we summarized the HDF protocol data into a composite indicator expressing headwater condition at the site-scale, the headwater condition index (HCI). The indicator was constructed using guidance on developing theoretical frameworks, variable selection, missing data, weighting and aggregation schemes, and conducting multivariate and sensitivity analysis from the Organisation for Economic Co-operation and Development’s handbook on composite indicators (OECD 2008). The indicator ranged from 0–1 and represented the quality of a site, where 0 indicated poor conditions and 1 indicated good conditions. We assumed that good condition sites tend to have natural attributes (i.e., natural channel streams, low valley and bank erosion, forested riparian vegetation) whereas poor condition sites tend to have anthropogenic attributes (i.e., channelized, high valley and bank erosion, lack of riparian vegetation). We applied this definition of condition to both feature-scale and site-scale variables (Table 2.1 and Table 2.2).

HCI integrated upstream feature conditions and site conditions (Figure 2.3). Briefly, we selected variables at each scale that could describe site condition. For each variable at the feature-scale, we applied a ranking scheme to the raw values (Table 2.1) and produced a standardized score (i.e., condition) for each value.
For each feature, using these scores we calculated the geometric mean across all variables ($\sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$, where $x_i$ is the score of the $i$th variable for a given feature). We used the geometric mean because we did not expect variables to be independent or to compensate for another variable’s condition (Saltelli et al. 2008). Feature-scale condition was then calculated as the weighted geometric mean across features (since there can be multiple features per site), with weights based on each feature’s contribution to site flow. At the site-scale, we again applied a ranking scheme (Table 2.2) and standardization and then calculated the geometric mean condition across all site variables. Finally, we calculated the geometric mean of feature conditions and site conditions to generate the final HCI score per site (Figure 2.3). See Appendix A for more detail.

2.2.3.2 Evaluation of HCI using sensitivity analysis

Using a variance-based global sensitivity analysis of the HCI-generating procedure, we asked two questions. First, which feature- or site-scale variables are important contributors to systematic variance in the HCI? Second, which scale (feature- or site-scale) is the most important contributor? Variance-based sensitivity analysis uses Monte Carlo sampling of the full input variable space to explore how variation in model input interacts to affect model output (Saltelli et al. 2008). Two metrics were evaluated from the sensitivity analysis, $S_i$ and $S_{Ti}$. $S_i$ is the first-order sensitivity index (i.e., simple additive effect) and $S_{Ti}$ is the total sensitivity index (i.e., additive effect + interactions with other variables) for each input factor $X_i$ (e.g., feature type, sediment deposition). High sensitivity index values indicate that the factor is an important contributor to systematic variance in a model output (here, HCI). If $S_{Ti} \gg S_i$, this indicates that the factor also has important interactions with other variables.
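To make the two indices concrete, the following Python sketch computes $S_i$ and $S_{Ti}$ exactly by enumerating a small grid of discrete inputs. It is an illustration only and rests on stated assumptions: the function names, the two-input toy model, and the 1–10 score range are hypothetical, and brute-force enumeration stands in for the Monte Carlo estimator used in the actual analysis.

```python
from itertools import product
from statistics import fmean, pvariance


def geometric_mean(scores):
    """n-th root of the product of n positive scores."""
    result = 1.0
    for score in scores:
        result *= score
    return result ** (1.0 / len(scores))


def sobol_indices(model, levels):
    """Exact first-order (S_i) and total (S_Ti) indices on a full grid.

    `levels` holds one list of equally likely discrete values per input.
    """
    grid = list(product(*levels))
    outputs = {point: model(point) for point in grid}
    total_var = pvariance(list(outputs.values()))
    first_order, total = [], []
    for i, values in enumerate(levels):
        # S_i: variance of the conditional mean E[Y | X_i], divided by Var(Y).
        cond_means = [
            fmean([outputs[p] for p in grid if p[i] == v]) for v in values
        ]
        first_order.append(pvariance(cond_means) / total_var)
        # S_Ti: mean of Var(Y | all inputs except X_i), divided by Var(Y).
        groups = {}
        for p in grid:
            groups.setdefault(p[:i] + p[i + 1:], []).append(outputs[p])
        within = [pvariance(group) for group in groups.values()]
        total.append(fmean(within) / total_var)
    return first_order, total


# Hypothetical example: two inputs, each scored 1-10, combined by geometric mean.
scores = list(range(1, 11))
first_order, total = sobol_indices(geometric_mean, [scores, scores])
print("S_i :", [round(s, 3) for s in first_order])
print("S_Ti:", [round(s, 3) for s in total])
```

For a purely additive toy model (e.g., passing Python's built-in `sum` as the model), the first-order indices account for all of the output variance and each $S_{Ti}$ equals its $S_i$; for the geometric mean, each $S_{Ti}$ exceeds its $S_i$ because the inputs interact.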
If the sum of $S_i$ is approximately 1, this indicates that the model is mostly additive. If the sum of $S_i$ is well below 1 (equivalently, the sum of $S_{Ti}$ exceeds 1), this indicates that there are important interactions to consider. To answer question one, we calculated $S_i$ and $S_{Ti}$ for each variable based on 10,000 randomly sampled Monte Carlo realizations from each variable’s discrete integer values. To answer question two, we summed each $S_i$ for variables at each respective scale. This was done using sobolSalt in ‘sensitivity’ 1.15.2 in R (Iooss et al. 2018).

2.2.3.3 Evaluation of HCI using multivariate analysis

We used Spearman rank correlation tests (base R’s cor.test) to examine how HCI correlates with variation in the original dataset. We correlated the scores along the first three axes of the reduced feature-scale PCoA and the site-scale PCoA with the HCI. For simplicity, we assumed that observations within each of the variables (e.g., PCoA scores or HCI) were independent.

2.2.3.4 Evaluation of HCI using spatial analysis

We explored the spatial distribution of HCI and how it relates to landscape variables (e.g., catchment area, local topography, land cover) using SSN models. Spatial dependency can be expressed in several ways. Considering a stream confluence shaped like “Y” that flows from the upper branches to the lower branch, autocovariance can be tail-up (TU), indicating that sites at upper branches are correlated with sites downstream but not with each other; tail-down (TD), indicating that all sites are correlated based on stream network distances to a common junction; or Euclidean (EUC), indicating that sites are correlated based on straight line distances between sites. The three autocovariance types can be used simultaneously to account for different spatial dependencies. We constructed SSN models within the Toronto and Region Conservation Authority jurisdiction. See Appendix B for more detail.

We used a two-stage approach to model HCI using SSN models, similar to other studies (Peterson and Ver Hoef 2010).
First, we selected the best landscape variables using Akaike information criterion (AIC)-based model selection with a fixed set of spatial dependencies: an exponential TU, an exponential TD, and an exponential EUC. We fit these models using maximum likelihood and compared the models using AIC, choosing the lowest AIC model as the best model. We evaluated the assumptions of the model using residual plots. Second, using the best model, we fit alternative covariance structures with restricted maximum likelihood and again compared models using AIC, choosing the lowest AIC model as the best model for generating predictions. Although interpretation of models within a range of ∆AIC from the lowest AIC model is more appropriate as they represent similarly likely models, we chose the lowest AIC model for practical reasons (e.g., 5 best landscape models × 30 potential spatial models = 150 model runs). There were cases where sites were sampled twice. Although we wished to specify sites as a random effect, the lack of more than two visits meant insufficient replication to do so. Therefore, we assumed the spatial model would account for this non-independence. We assessed predictive capacity as the R² of a linear regression between leave-one-out cross validation (LOOCV) predictions and the original observations per model.

For the predictor variables, we expected that HCI would be a function of local topography, geology, catchment area, and land cover. For topography, we used the topographic wetness index (TWI) estimated within a 30 m-radius buffer around a site; TWI is a topographically-based index of the tendency for a cell to accumulate water and is proportional to catchment area and inversely proportional to local slope (Beven and Kirkby 1979). For geology, we used the baseflow index (BFI) estimated at the catchment-scale; BFI is a geologically-based index of the average proportion of streamflow derived from groundwater (Neff et al. 2005).
For catchment area, we used log10-transformed estimates (original units: m²).

For land cover, we were uncertain about the predictive capacity of variables or the scale at which their measurement would be most relevant. Therefore, we held TWI, BFI, and catchment area constant in each model but substituted one of five land cover variables estimated over one of four separate spatial scales (5 variables × 4 scales = 20 models). The variables were the landscape disturbance index (LDI), mean annual maximum normalized difference vegetation index (NDVI), mean annual median normalized difference built-up index (NDBI), mean built-up index (BU; [NDBI – NDVI]), and road density, which represented different ways of expressing catchment composition (e.g., LDI = weighted % disturbed land; NDVI-NDBI-BU = continuous indicators of surface reflectance associated with land covers/vegetation densities). The scales were a 100 m-radius circular buffer (l_100), a 1000 m-radius circular buffer (l_1000), a 50 m-radius buffer extending perpendicularly from each side of the stream along the entire upstream network (r_50), and the full catchment (c_) for each site. See Appendix B for more detail.

2.3 Results

2.3.1 General properties of HDF data

The HDF dataset captured a wide array of headwater conditions across the TRCA jurisdiction. The median (2.5th to 97.5th percentile range) of site catchment area and LDI was 0.29 km² (0.0062–4.15 km²) and 0.09 (0.01–0.56, where 0.6 indicates the maximally disturbed value in our dataset), respectively. Across all HDFs, there were nearly equal proportions of feature types sampled (approximately 10% of data each of natural streams, channelized streams, wetlands, etc.). Across all HDFs, there was less natural riparian vegetation in 1.5–10 m and 10–30 m riparian zones (approximately 50%) than in the narrow 0–1.5 m band (67%).
Of the non-natural riparian vegetation categories (i.e., None, Lawn, or Cropped Land), cropped land was consistently dominant (20%, 24% and 30% of all features in the 0–1.5 m, 1.5–10 m, and 10–30 m zones, respectively). A large proportion (86%) of the features had no evidence of adjacent bank erosion (e.g., rills and gullies). Across all sites, the majority had no evidence of major sources of nutrients upstream (98%; e.g., intensive agriculture like feedlots); however, nearly equal proportions had ongoing and active (54%) or no evidence (45%) of potential contaminant sources upstream (e.g., point and non-point sources such as storm sewer outflow, industrial discharge pipes, non-intensive agricultural operations).

2.3.2 Multivariate analysis of HDF data

At the feature-scale, we retained five PCoA axes that cumulatively explained 52% of the variation in the data. We found that the mean position and confidence ellipses of individual riparian vegetation categories (e.g., forest, shrubland) tended to be similar across all band:side combinations (e.g., left bank 0–1.5 m versus right bank 0–1.5 m) (Table C.1.1 and Figure C.1). This indicated redundancy and supported the need for the reduced PCoA.

In the reduced feature-scale PCoA, we retained three PCoA axes that cumulatively explained 39% of the variation. PCoA1 (14% of variation) was a gradient of sediment deposition and adjacent bank erosion; features with lower scores tended to have higher deposition and higher bank erosion. Feature types were distributed variably across this axis, but “defined natural channels” and “multi-thread” features tended to have lower scores whereas “wetland” and “pond outlet” features tended to have higher scores. There was no strong discrimination of features based on riparian vegetation along this axis (Table C.1.2 and Figure C.2).
PCoA2 (13% of variation) tended to separate categories of riparian vegetation; features with lower scores were dominated by “cropped land” riparian vegetation, whereas features with higher scores were dominated by “scrubland” (Table C.1.2 and Figure C.2). PCoA3 (12% of variation) tended to separate sites based on agriculture; features with lower scores tended to have higher adjacent bank erosion, tended to be dominated by agricultural cropland vegetation in the riparian area, and tended to be tile drain outlets or to be undefined features (Table C.1.2). Considering these axes as disturbance gradients, more disturbed sites would have lower PCoA1 scores (natural or multi-thread channels with higher erosion), lower PCoA2 scores (dominated by riparian cropland), and lower PCoA3 scores (tile drain outlets with higher erosion surrounded by cropland).

In the site-scale PCoA, we retained seven axes that cumulatively explained 93% of the variation. Here, we only discuss the first three that cumulatively explained 55% of the variation. In assessing the possibility of site-level impairment based on several factors, most of the data fell into two of five possible categories: “ongoing” (mean 22% across all variables) and “no evidence” (mean 74% across all variables). We will discuss these two categories as they relate to the three axes. Along PCoA1 (23% of variation), sites with higher scores had ongoing or historical channel hardening, dredging or straightening, barriers and/or dams in proximity, online ponds upstream, springs or seeps, and evidence of channel scouring or erosion (Table C.1.3 and Figure C.3). Along PCoA2 (19% of variation), sites with lower scores had ongoing or historical major nutrient and contaminant sources upstream, dredging or straightening, online ponds upstream, and evidence of springs or seeps (Table C.1.3 and Figure C.3).
Along PCoA3 (13% of variation), sites with lower scores had ongoing or historical major contaminants, channel hardening, dredging or straightening, barriers and/or dams in proximity, and evidence of channel scouring (Table C.1.3). Considering these axes as disturbance gradients, the most disturbed sites would have high PCoA1 scores, low PCoA2 scores, and low PCoA3 scores; however, due to the mostly binary nature of these site-scale data it is difficult to assign a functional property to each axis.

2.3.3 Results for evaluating a composite indicator of headwater condition (HCI)

2.3.3.1 General properties of the HCI

HCI values had a median value of 0.51 (2.5th–97.5th percentiles: 0.18–0.87) with a corresponding mean of 0.52 ± 0.19 (1 SD) across the TRCA jurisdiction (Figure 2.5). Although scaling from 0–1 is done only after all calculations have been performed (i.e., endpoint of the HCI process diagram in Figure 2.3), we scaled the feature-scale and site-scale geometric means from 1–10 to 0–1 to plot and derive similar summary statistics for those scales for comparison. The feature-scale variables had a median value of 0.65 (2.5th–97.5th percentiles: 0.15–1.0) with a corresponding mean of 0.60 ± 0.14 (1 SD) and the site-scale (i.e., without multiplying by the feature-scale value) had a median value of 0.46 (2.5th–97.5th percentiles: 0.12–1.0) with a corresponding mean of 0.46 ± 0.10 (1 SD) (Figure 2.5).

2.3.3.2 Results for evaluation of HCI using multivariate analysis

Based on the interpretation of PCoA axes above, we found that HCI was significantly correlated with indicators of disturbance at feature- and site-scales. At the feature-scale, PCoA1 and PCoA3 were significantly correlated with the overall HCI (Spearman’s ρ = -0.41 and ρ = -0.45 with p < 0.001, respectively) (Figure 2.6), suggesting that low HCI indicates headwater features with more eroded banks and sediment deposition and/or those surrounded by cropland.
At the site-scale, PCoA1 and PCoA2 were significantly correlated with the overall HCI (Spearman’s ρ = -0.55 and ρ = 0.50, respectively, both with p < 0.001) (Figure 2.6), suggesting that low HCI indicates headwater sites with altered channel characteristics (i.e., dredging, straightening) and more nutrient and contaminant sources upstream.

2.3.3.3 Results for evaluation of HCI using sensitivity analysis

The HCI was more sensitive to systematic uncertainty in feature-scale variables ($\sum_{i}^{q} S_i = 0.56$) than site-scale variables ($\sum_{i}^{r} S_i = 0.44$) (Figure 2.7). The model is additive ($\sum_{i}^{q+r} S_i = 1.001$), indicating that variables do not have strong systematic interactions with one another. At the feature-scale, systematic uncertainty in Type was the strongest contributor to variance in the HCI ($S_i = 0.18$), followed by Sediment Deposition ($S_i = 0.12$), Adjacent Sediment Transport ($S_i = 0.11$), and Valley Sediment Transport ($S_i = 0.11$). Individual riparian vegetation bands were small individual contributors since they aggregate to a single riparian vegetation variable in the model (but were entered as separate variables to the model based on the algorithm). At the site-scale, all variables contributed approximately equally ($S_i = 0.05$). Sensitivity values corresponding to Figure 2.7 are presented in Table A.1.

2.3.3.4 Results for evaluation of HCI using spatial analysis

The first stage of our two-stage approach indicated that our BU index estimated over the entire catchment (i.e., scale c_) was the best landscape variable for use in subsequent models (lowest AIC and greater relative likelihood compared to other models). All landscape variables explained little of the overall variation in HCI (≤ 5% based on R2 [fixed]), with satellite indices tending to be the best, although the top three models had indices estimated at different scales (c_, r_50, and l_100) (Table 2.3).
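The AIC comparison in this first stage can be illustrated with a minimal sketch. It assumes that the relative likelihood (Rel L.) in Table 2.3 follows the conventional form exp(-∆AIC / 2); the text does not state the formula, although this form is consistent with the rounded values reported for the top three models. The function name `relative_likelihoods` is hypothetical and used only for illustration.

```python
import math


def relative_likelihoods(aic_by_model):
    """Return {model: (delta_aic, relative_likelihood)} against the lowest-AIC model.

    Assumes the conventional relative likelihood exp(-delta_aic / 2);
    this form is an assumption, not a formula given in the text.
    """
    best_aic = min(aic_by_model.values())
    results = {}
    for name, aic in aic_by_model.items():
        delta = aic - best_aic
        results[name] = (delta, math.exp(-0.5 * delta))
    return results


# AIC values for the top three landscape models in Table 2.3
aic = {"c_BU": -812.14, "r_50_BU": -804.51, "l_100_NDVI": -802.16}

for name, (delta, rel) in relative_likelihoods(aic).items():
    print(f"{name}: dAIC = {delta:.2f}, relative likelihood = {rel:.2f}")
```

Under this assumption, a model about 7.6 AIC units above the best one has a relative likelihood of roughly 0.02, which is why models with ∆AIC < 10 are treated only conservatively as competitive.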
For the second stage of our two-stage approach, models incorporating spatial dependencies accounted for 17–76% of the variation (R2 [full] in Table 2.4). The landscape variables explained 5% of the variation in HCI, and this was similar across all potential spatial models (median R2 [fixed] = 0.043; range: 0.036–0.082). Based on LOOCV R2 values, predictive capacity increased from 12% to 23% (range: 16–23%) when comparing models without spatial dependencies to ones with a dependency structure (Figure 2.8). In the best model, TWI (i.e., topography), BFI (i.e., geology), and c_BU (i.e., land use) variables were significant (95% confidence interval not overlapping zero) with c_BU having the strongest effect (Table 2.5). A 1 SD increase in TWI (1.24) decreased HCI by 0.01, a 1 SD increase in BFI (0.14) increased HCI by 0.01, and a 1 SD increase in BU (0.15) decreased HCI by 0.05 (Figure 2.9). Together, this meant that sites with more urban cover tended to have poorer headwater conditions, controlling for the effects of local topography and geology. For the spatial components, the linear-with-sill TU explained 2% of the variation, the linear-with-sill TD explained 9% of the variation, and the exponential EUC explained 42% of the variation. The EUC range parameter (θr = 723 m) indicates that spatial autocorrelation between nearby sites < 1 km apart is high (Table 2.5). Using the landscape variables and spatial dependencies together, we generated a set of high-density prediction points along the stream network (~5 points per reach) and took the average along each reach to produce Figure 2.10. This figure and associated predictions can be used to assess how expected headwater conditions affect downstream environments (e.g., moving average, Chapters 4 and 5).

2.4 Discussion

Small streams are numerous across many landscapes, so understanding their cumulative influence on downstream environments requires understanding their spatial variability.
We analyzed data from a flexible protocol designed to capture a wide array of headwater types, constructed a composite indicator for assessing site conditions, and related this indicator to landscape variables while accounting for spatial dependencies. One interesting finding was that a high proportion of headwaters had non-natural riparian cover (> 30%), suggesting potential anthropogenic threats at even the origins of larger streams in this region. At the site-scale, we found that sites tended to be classified into dichotomies of whether channels were modified (e.g., straightened) and had potential contaminant sources upstream or not. Our composite indicator correlated well with gradients in the original data (i.e., PCoA axes) and most sites were classified as having moderate conditions (mean HCI: 0.52 ± 0.19 [1 SD]). Satellite indices appeared to be the best landscape predictors of HCI; however, despite attempting to estimate landscape variables at local- to broad-scales (i.e., local buffers to catchments), landscape variables did not explain large amounts of variation in HCI (R2 ≤ 0.05). Rather, incorporating spatial dependencies increased predictive capacity and explained variation.

Composite indicators are regularly used in assessing the status of freshwaters (Vollmer et al. 2016). For example, variations of an index of biological integrity (Karr 1981) have been applied to stream, lake, and wetland communities within and across regions for over three decades (Steedman 1988, Minns et al. 1994, Hoyle and Yuille 2016, Vollmer et al. 2016). Given the many variables recorded at HDF sites at different scales (i.e., feature- and site-scales), we found it useful to condense this information into a single composite indicator. Additionally, we could impose directionality based on categorical rankings. To the best of our knowledge, no other standardized protocol exists for the rapid assessment of headwater drainage features (although see Fritz et al.
2006 for comprehensive assessments). Our goal was to use these data to construct an index of natural headwater features (i.e., high values having more natural channels/wetlands with intact riparian forest cover and little erosion) that could be used to describe a site’s current condition but also be aggregated along stream networks to describe headwater characteristics as an additional predictor in cumulative effects models. Importantly, for observations, one can examine the individual variables that contributed to low HCI and develop mitigating strategies. Likewise, while predictions of HCI at unmeasured locations may be useful in aggregate, site-specific measures may be required for determining the individual factors resulting in poor headwater condition. In this way, the composite indicator should be a useful link between site-specific factors and regional factors related to headwater conditions.

Freshwater practitioners also regularly use landscape models for inference and prediction of ecosystem components when guiding management decisions. For example, loss of sensitive benthic macroinvertebrates (e.g., Ephemeroptera, Plecoptera, and Trichoptera) is commonly found with increased catchment disturbance (e.g., increased road density, urban cover) (e.g., Stanfield and Kilgour 2006, Bazinet et al. 2010, Coles et al. 2012, Wallace et al. 2013). Given the density and importance of headwater streams, we examined how landscape variables could be used to generate predictions of headwater conditions at unmeasured locations. We were unsurprised by the relationship of HCI with landscape variables (i.e., that HCI tended to decrease with increasing catchment disturbance) but were surprised that explanatory power was so low (R2 [fixed] ≤ 0.05).
We expected that variables estimated at local scales (l_100 or l_1000) would be better landscape predictors as they might represent local landscape configuration better than catchment composition (i.e., at a site there may be high urban cover in the immediate vicinity but low urban cover in the entire catchment). However, our top models (conservatively, ∆AIC < 10 from best model) had landscape variables estimated at different scales; the best model used a whole-catchment satellite index of urban cover. Satellite indices did appear to be better predictors than relatively static measures of land cover like road density and LDI, although our land cover categories were derived from annual satellite data (AAFC 2011). Care should be taken in the interpretation of these predictors given that explanatory power of the fixed effects was so low. Nevertheless, it is possible that satellite indices better represent nuances in landscape composition that cannot be captured in land cover classifications and better reflect that land covers are more continuous than discrete (i.e., that “urban” is not simply impervious cover devoid of vegetation and that “forest” can have wide variation due to species compositions and vegetation densities), even if satellite pixels are susceptible to noise (Pettorelli et al. 2005). Finally, our coarse landscape variables do not capture land-use decisions made at very fine scales. For example, HDFs with 30% agricultural land cover in a 100 m buffer could have very different degrees of riparian protections. These finer scale decisions may better explain HCI and, overall, much more study is needed into the landscape variables associated with HCI. Incorporating spatial dependency with landscape models can increase predictive capacity since predictions “borrow strength” from nearby sites when there are strong underlying spatial patterns (Peterson and Ver Hoef 2010, Ver Hoef and Peterson 2010, Frieden et al. 2014).
Examining the spatial distribution of observations and predictions in Figure 2.10, there is certainly spatial aggregation of headwater conditions, particularly in the northwestern and northeastern regions. In the absence of strong landscape drivers of HCI, we anticipated that incorporating spatial dependencies would change landscape variable estimates (e.g., magnitude and significance of effects) and increase predictive capacity as in other studies (Isaak et al. 2014, Frieden et al. 2014). We compared our best landscape model with and without spatial dependencies. For landscape variables, adding spatial dependencies resulted in 14–40% decreases in the effects of landscape variables and a 55% decrease in the fixed effects explanatory power. However, it resulted in a 110% increase in predictive capacity and accounted for non-independence of sites. Interestingly, our best model suggested that Euclidean distances (EUC) explained most of the spatial variation in HCI (42%) and that autocorrelation was strongest < 1000 m away. If we consider the confluence of two headwaters shaped like a “Y”, one might expect similar headwater conditions based on Euclidean distances if there are strong localized landscape conditions (i.e., temperature, precipitation, land use) versus the downstream site which likely has stronger network influences of upstream sites (e.g., TU and TD autocorrelation) (Gomi et al. 2002). A study of the importance of EUC, TU, and TD variance components for an indicator may reveal that EUC becomes less important as more downstream sites are added to a dataset. However, this intriguing finding is tempered by the fact that other top spatial models (∆AIC < 10) had no EUC component, had range parameters far exceeding the maximum distance between points indicating autocorrelation at all practical distances, and that predictive capacity was relatively stable across all spatial models despite widely varying spatial structures.
Furthermore, there may have been a lack of sites to adequately model TU dependencies since 72% of sites had no upstream sites, 12% had one, and the remaining 16% had more than one. This lack of upstream sites would likely be a challenge for any SSN models used for headwater streams. A study of a network with a high density of headwater sites with more network connections could improve our understanding of which headwater spatial dependencies best improve predictive capacity.

Our study recognized that landscape-scale analysis and prediction of small stream conditions are lacking despite their prevalence and expected cumulative effects on downstream environments (Meyer and Wallace 2001, Lowe and Likens 2005, Meyer et al. 2007, Wohl 2017). Our analysis of HDF variables revealed that a composite indicator of small stream condition is useful for mapping variability of observations and predictions of headwater condition across this large urbanizing region. However, our landscape analysis also highlighted the challenges of predicting these conditions using landscape variables since the conditions may be highly localized and more dependent on finer-scale variation (e.g., land use decisions on riparian composition) than landscape variables can approximate. As seen in other studies, incorporating spatial dependencies improved predictive capacity along stream networks (Peterson and Ver Hoef 2010, Isaak et al. 2014, 2017, Frieden et al. 2014). However, strong conclusions about the nature of those spatial dependencies (EUC, TU, TD) are tenuous without further study on well-sampled networks. In conclusion, predicting properties of small streams across landscapes is challenging, but incorporating these properties into cumulative effects stream research may reveal important links between headwaters and downstream environments.

2.5 Tables

Table 2.1 Headwater drainage feature variables used for generating the HCI, feature-scale.
Original categories are protocol values for each variable and scores are for generating the HCI. “NA” indicates no score applied.

Variable                        Condition adjustments
  Category                      Original  Score

Feature type
  Defined natural channels      1         3
  Channelized                   2         1
  Multi-thread                  3         3
  No defined feature            4         2
  Tiled                         5         1
  Wetland                       6         3
  Swale                         7         3
  Roadside ditch                8         1
  Pond outlet                   9         2

Adjacent sediment transport
  None                          1         5
  Rills                         2         3
  Rills and gully               3         1
  Gully                         4         2
  Tile outlet scour             5         3
  Sheet erosion                 6         4
  Instream bank erosion         7         1
  Other                         8         NA

Sediment deposition
  None                          1         5
  Minimal                       2         4
  Moderate                      3         3
  Substantial                   4         2
  Extensive                     5         1

Riparian vegetation (left bank, 0–1.5 m); Riparian vegetation (right bank, 0–1.5 m); Riparian vegetation (left bank, 1.5–10 m); Riparian vegetation (right bank, 1.5–10 m); Riparian vegetation (left bank, 10–30 m); Riparian vegetation (right bank, 10–30 m)
  None                          1         1
  Lawn                          2         2
  Cropped land                  3         3
  Meadow                        4         4
  Scrubland                     5         5
  Forest                        6         6
  Wetland                       7         6

Table 2.2 Headwater drainage feature site variables used for generating the HCI, site-scale. Original categories are protocol values for each variable and scores are for generating the HCI. “NA” indicates no score applied.

Variable                        Condition adjustments
  Category                      Original  Score

Major nutrient sources upstream; Potential contaminant sources upstream; Channel hardening; Dredging or straightening; Barriers and/or dams in proximity; Online ponds upstream; Evidence of channel scouring or erosion
  Ongoing & active              1         1
  Historical evidence           2         2
  No evidence, but reported     3         3
  No evidence                   4         4
  Unknown                       5         NA

BMPs or restoration activities; Springs or seeps at site
  Ongoing & active              1         4
  Historical evidence           2         3
  No evidence, but reported     3         2
  No evidence                   4         1
  Unknown                       5         NA

Table 2.3 Model selection table for fixed effects for HCI. See Section 2.2.3.4 for land cover definitions. AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1).
Rel L. is the relative probability of being the best model. Spatial parameters for the tail-up (TU), tail-down (TD), and Euclidean (EUC) components of the model are the partial sill (θv) and range (θr). R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations.

Model  Land cover   AIC      ∆AIC   Rel L.  TU θv  TU θr  TD θv  TD θr  EUC θv  EUC θr  R2 (fixed)  R2 (full)  R2 (CV)
1      c_BU         -812.14  0.00   1.00    0.006  290    0.008  1624   0.004   16265   0.05        0.57       0.23
2      r_50_BU      -804.51  7.63   0.02    0.006  300    0.007  1988   0.004   22562   0.04        0.57       0.22
3      l_100_NDVI   -802.16  9.98   0.01    0.004  298    0.010  1695   0.005   27912   0.03        0.58       0.22
4      c_NDVI       -800.10  12.04  0.00    0.009  381    0.005  4381   0.004   21499   0.03        0.57       0.22
5      l_100_BU     -799.49  12.65  0.00    0.006  301    0.008  2260   0.005   34303   0.03        0.59       0.22
6      r_50_NDVI    -797.84  14.30  0.00    0.010  400    0.005  4873   0.004   25756   0.03        0.58       0.22
7      l_1000_BU    -795.07  17.08  0.00    0.005  310    0.009  1752   0.004   20176   0.03        0.57       0.21
8      c_NDBI       -794.32  17.82  0.00    0.006  308    0.008  2073   0.006   41838   0.02        0.60       0.22
9      l_1000_NDVI  -790.58  21.56  0.00    0.007  334    0.007  2747   0.004   23374   0.02        0.57       0.21
10     r_50_NDBI    -790.48  21.67  0.00    0.007  322    0.007  2642   0.007   52304   0.02        0.61       0.22
11     l_100_LDI    -787.84  24.30  0.00    0.008  399    0.007  3661   0.005   27864   0.02        0.58       0.22
12     l_1000_ROAD  -785.87  26.28  0.00    0.005  274    0.009  1843   0.005   17956   0.02        0.58       0.21
13     l_1000_NDBI  -785.41  26.73  0.00    0.005  319    0.010  1782   0.006   45758   0.02        0.60       0.21
14     l_1000_LDI   -785.09  27.05  0.00    0.006  300    0.008  2230   0.005   16007   0.02        0.58       0.21
15     l_100_NDBI   -785.00  27.14  0.00    0.008  331    0.007  3092   0.008   59839   0.02        0.62       0.21
16     l_100_ROAD   -784.78  27.36  0.00    0.006  358    0.008  2658   0.008   54079   0.02        0.62       0.21
17     c_LDI        -783.77  28.38  0.00    0.007  326    0.007  3158   0.005   27671   0.02        0.59       0.21
18     r_50_LDI     -781.80  30.34  0.00    0.007  339    0.007  3049   0.006   34645   0.01        0.59       0.21
19     c_ROAD       -781.21  30.93  0.00    0.007  339    0.008  2485   0.007   41315   0.01        0.60       0.21
20     r_50_ROAD    -779.14  33.01  0.00    0.007  339    0.008  2535   0.009   62227   0.01        0.62       0.21

Table 2.4 Model selection table for spatial components for HCI. Models for tail-up (TU) and tail-down (TD) are Epanechnikov (EP), Exponential (EX), Linear with sill (LS), Mariah (MA), and Spherical (SP). Euclidean (EUC) models are Cauchy (CA), Exponential (EX), Gaussian (GA), and Spherical (SP). AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. Spatial parameters for the TU, TD, and EUC components of the model are the partial sill (θv) and range (θr). R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations. Dashed lines (--) indicate no estimate.

Model  Spatial components      AIC      ∆AIC    Rel L.  TU θv  TU θr   TD θv  TD θr   EUC θv  EUC θr  R2 (fixed)  R2 (full)  R2 (CV)
1      LS-TU + LS-TD + EX-EUC  -794.01  0.00    1.00    0.001  101212  0.003  92577   0.014   723     0.05        0.58       0.23
2      LS-TU + LS-TD           -789.08  4.93    0.08    0.014  375     0.004  68078   0.000   --      0.05        0.58       0.22
3      EP-TU + EP-TD           -787.65  6.36    0.04    0.014  411     0.003  76282   0.000   --      0.05        0.58       0.22
4      SP-TU + SP-TD           -787.20  6.81    0.03    0.014  457     0.003  78371   0.000   --      0.05        0.57       0.22
5      EX-TU + EX-TD           -786.78  7.23    0.03    0.014  926     0.004  136430  0.000   --      0.05        0.57       0.22
6      MA-TU + MA-TD + SP-EUC  -786.61  7.40    0.02    0.011  2119    0.004  115513  0.025   328009  0.04        0.72       0.23
7      MA-TU + MA-TD + EX-EUC  -786.39  7.62    0.02    0.011  1746    0.005  95181   0.009   207102  0.04        0.67       0.23
8      LS-TU + LS-TD + GA-EUC  -786.17  7.84    0.02    0.014  375     0.004  68119   0.026   194772  0.04        0.94       0.22
9      MA-TU + MA-TD           -785.99  8.02    0.02    0.013  4111    0.004  602616  0.000   --      0.05        0.57       0.22
10     MA-TU + MA-TD + CA-EUC  -785.77  8.24    0.02    0.011  1943    0.005  92904   0.003   27138   0.04        0.88       0.23
11     MA-TU + MA-TD + GA-EUC  -785.69  8.32    0.02    0.011  1918    0.005  90963   0.002   20149   0.04        0.94       0.23
12     EP-TU + EP-TD + EX-EUC  -785.50  8.51    0.01    0.007  288     0.007  1491    0.004   27787   0.04        0.77       0.23
13     EP-TU + EP-TD + SP-EUC  -784.98  9.04    0.01    0.007  284     0.007  1505    0.004   18874   0.04        0.59       0.23
14     SP-TU + SP-TD + EX-EUC  -784.91  9.10    0.01    0.006  308     0.007  1535    0.004   26295   0.04        0.76       0.23
15     SP-TU + SP-TD + GA-EUC  -784.75  9.26    0.01    0.014  429     0.003  77488   0.022   144618  0.04        0.68       0.22
16     EX-TU + EX-TD + GA-EUC  -784.57  9.44    0.01    0.014  952     0.003  137601  0.001   17367   0.05        0.95       0.22
17     SP-TU + SP-TD + SP-EUC  -784.34  9.67    0.01    0.006  303     0.008  1556    0.004   18829   0.04        0.66       0.23
18     EX-TU + EX-TD + CA-EUC  -784.32  9.70    0.01    0.014  832     0.003  101795  0.006   100480  0.04        0.70       0.23
19     SP-TU + SP-TD + CA-EUC  -784.04  9.97    0.01    0.006  306     0.008  1552    0.003   11980   0.04        0.93       0.23
20     EP-TU + EP-TD + GA-EUC  -783.50  10.51   0.01    0.007  284     0.008  1540    0.003   14043   0.05        0.68       0.23
21     LS-TU + LS-TD + CA-EUC  -783.46  10.55   0.01    0.013  988     0.003  67831   0.010   120579  0.04        0.58       0.22
22     EX-TU + EX-TD + EX-EUC  -783.35  10.66   0.00    0.010  386     0.004  5559    0.005   41828   0.04        0.69       0.23
23     EX-TU + EX-TD + SP-EUC  -783.21  10.81   0.00    0.011  421     0.004  7565    0.003   18984   0.04        0.61       0.23
24     EP-TU + EP-TD + CA-EUC  -778.88  15.13   0.00    0.014  1081    0.002  24137   0.015   81985   0.04        0.90       0.22
25     LS-TU + LS-TD + SP-EUC  -778.15  15.86   0.00    0.014  375     0.000  55      0.005   18494   0.04        0.85       0.22
26     CA-EUC                  -775.25  18.76   0.00    --     --      --     --      0.016   632     0.08        0.56       0.22
27     EX-EUC                  -770.81  23.20   0.00    --     --      --     --      0.016   1449    0.08        0.56       0.22
28     SP-EUC                  -742.03  51.98   0.00    --     --      --     --      0.007   18466   0.04        0.50       0.17
29     GA-EUC                  -737.54  56.47   0.00    --     --      --     --      0.004   6131    0.06        0.19       0.17
30     --                      -684.40  109.61  0.00    --     --      --     --      --      --      0.12        0.12       0.11

Table 2.5 AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. RMSPE is Root Mean Squared Prediction Error where lower values indicate a better fit. R2 (fixed) is the variance explained by the fixed effects. R2 (full) is the variance explained by the fixed effects and random effects (i.e., spatial dependencies). R2 (CV) is the leave-one-out cross validation R2 between predictions and observations. Fixed effects estimates include 95% CI; bolded CIs indicate significant effects. Parameters for random effects are partial sill (θv; overall variance parameter) and range (θr; spatial effects only). Dashed lines (--) indicate no estimate.

Model comparison
Model  AIC      RMSPE  R2 (fixed)  R2 (full)  R2 (CV)
Best   -794.01  0.16   0.05        0.58       0.23
Non    -684.40  0.18   0.12        0.12       0.11

Fixed effects, estimate (95% CI)
Model  Int.                TWI                    BU                     BFI                  log10 Area
Best   0.513 (0.47, 0.55)  -0.013 (-0.02, -0.00)  -0.045 (-0.06, -0.03)  0.015 (0.00, 0.028)  0.005 (-0.01, 0.02)
Non    0.504 (0.49, 0.52)  -0.022 (-0.03, -0.01)  -0.052 (-0.06, -0.04)  0.025 (0.01, 0.03)   0.001 (-0.01, 0.01)

Random effects
Model  Parameter                TU      TD     EUC
Best   Partial sill (θv)        0.001   0.003  0.014
Best   Range (θr)               101212  92576  723
Best   Proportion of variation  0.02    0.10   0.42
Non    Partial sill (θv)        --      --     --
Non    Range (θr)               --      --     --
Non    Proportion of variation  --      --     --

2.6 Figures

Figure 2.1 Map of HDF sites (nsite = 1140) within TRCA jurisdiction (subwatershed boundaries are broken lines) overlaid on built-up index values from Landsat imagery (i.e., [NDVI – NDBI]). Darker pixels indicate more built-up area. Landsat imagery courtesy of the U.S. Geological Survey.

Figure 2.2 A headwater drainage feature (HDF) site separated into upstream and downstream components according to flow direction. Light dashed lines indicate spatial extent of site measurements (40 m). Dark dashed lines indicate primary feature contributing most site flow. Multiple features per site can be measured (e.g., tile drain, ditch). Figure adapted from OSAP S4.M10 (Stanfield et al. 2013).

Figure 2.3 Procedure for generating the Headwater Condition Index (HCI). HCI is the scaled (0–1) geometric mean of the site geometric mean (range: 1–10) and flow-weighted feature geometric mean (range: 1–10) of variables at their respective scales.

Figure 2.4 Percentage of dominant vegetation categories in zones 0–1.5 m, 1.5–10 m, and 10–30 m extending out from the stream, including both left and right banks, taken over all headwater drainage features.

Figure 2.5 Percentage of HCI values at various scales. For feature-scale (a) and site-scale (b) histograms, the values have been rescaled from 1–10 to 0–1.

Figure 2.6 Plots of site-scale HCI with first three axes of feature-scale reduced PCoA (a–c) and site-scale PCoA (d–f). Spearman’s ρ values are 0.40, -0.07, 0.44 (p < 0.05) for a–c and -0.55, 0.49, and -0.003 (p < 0.05 for d and e but p > 0.05 for f).

Figure 2.7 Results of Headwater Condition Index (HCI) sensitivity analysis.
Sensitivity value is either the first-order sensitivity index (closed circles; simple additive effect) or the total sensitivity index (open triangles; additive effect + interactions with other variables) per input factor. High sensitivity values indicate higher contribution to systematic variance in HCI.

Figure 2.8 Results from plotting observed HCI values vs. predicted HCI values based on leave-one-out cross-validation predictions (LOOCV) for a non-spatial model (a) and a model incorporating spatial dependency (b). Line is the 1:1 relationship.

Figure 2.9 Relationship of BU with HCI, controlling for the effects of topographic wetness index, baseflow index, and catchment area.

Figure 2.10 Spatial distribution of Headwater Condition Index (HCI) values (points) and predictions (lines) for headwater drainage feature sites in the Toronto and Region Conservation Authority jurisdiction (a) and in the Rouge River watershed (b, delineated with black outline in a). Lines are DEM-generated streams with a 1 ha initiation threshold. Stream colours are the aggregated HCI value based on high density prediction points (~5 points per stream). Colours range from poor to good HCI values (purple → blue → yellow). Grey lines are streams with catchment areas greater than 2.91 km2.

Chapter 3: Measuring function and structure of urban headwater streams with citizen scientists

3.1 Introduction

Monitoring ecosystem patterns and processes is important for evaluating the relative condition of freshwaters and how they respond to stressors. Despite links between pattern (i.e., “structural” properties like species richness and biomass) and process (i.e., “functional” properties like species turnover and nutrient cycling), monitoring programs often assess structural measurements and ignore complementary functional ones that could result in more comprehensive ecosystem assessments (Gessner and Chauvet 2002, Tiegs et al. 2013, von Schiller et al. 2017).
In stream ecosystems, one important function is the decomposition of organic matter. Organic matter originates from both terrestrial sources (e.g., leaf litter) and aquatic sources (e.g., decomposing organisms) and its decomposition can be the basis of community productivity, particularly in small streams (Wallace et al. 1997, Richardson et al. 2010). Traditionally, decomposition rate is estimated by measuring mass loss of leaf material over an incubation period. Studies often contrast rates based on mode (e.g., benthic macroinvertebrate vs. microbial decomposition), species (e.g., deciduous vs. coniferous), or environmental gradients (Gessner and Chauvet 2002, Kuglerová et al. 2017a, Lecerf 2017). Lack of consistency in the quality of decomposing material over larger spatial and temporal scales is a remaining challenge for monitoring programs (Tiegs et al. 2019). To confront this challenge, decomposition rate can be estimated using a recently developed standardized cotton-strip assay, where cotton strips are incubated in-stream and subsequently measured for their tensile strength loss (i.e., ability to resist breaking when pulled apart) (Tiegs et al. 2013). Although not a direct surrogate for leaf litter (Tiegs et al. 2007), these strips can be applied across a wide array of habitat types and stream conditions for assessing decomposition rate variability.

Urban streams run through catchments with complex land uses and land covers. Despite regional differences in climate, physiography, and management that can result in varied stream responses, urban streams tend to have altered structural properties such as increased flow variability, altered stream morphologies, elevated nutrient and contaminant concentrations, and reduced richness of species but increased dominance of tolerant species (Meyer et al. 2005, Walsh et al. 2005, Booth et al. 2016).
Decomposition rate is expected to be sensitive to many of these stressors and likely in a nonlinear fashion (Young et al. 2008, Woodward et al. 2012). For example, broad-scale surveys suggest decomposition rates can be hump-shaped across a gradient of ambient water nutrient concentrations, with high rates at intermediate concentrations and low rates at the low and high nutrient extremes; local effects, however, are equivocal and highly context dependent (Woodward et al. 2012, Chauvet et al. 2016). Similarly, low urban cover likely stimulates decomposition rate through nutrient enrichment (e.g., increased runoff) but higher urban cover likely has deleterious effects on microbial and benthic communities because of high enrichment (e.g., increased biological oxygen demand) or contaminant loads (Meyer et al. 2005, Chadwick et al. 2006, Young et al. 2008, Woodward et al. 2012, Booth et al. 2016). Urban streams can have lower onsite organic matter retention, which negatively affects nutrient removal and consequently increases nutrient export downstream (Meyer et al. 2005, Walsh et al. 2005). However, the size, nutrient status, and location of a site along the stream network influence the processing rates of materials from adjacent ecosystems (Wipfli et al. 2007, Richardson and Sato 2015, Tornwall et al. 2017).

Headwater streams (generally 1st and 2nd order with a drainage area of approximately < 1 km2) are strongly influenced by local conditions like topography, soil characteristics, rainfall, and vegetation whereas progressively larger streams (i.e., when many headwater streams combine) are more a function of fluvial network processes (Gomi et al. 2002). As catchments become more urbanized, their headwaters are disproportionately converted to ditches and storm drains, or are themselves buried, when compared to larger streams (Kaushal and Belt 2012, Stammler et al. 2013).
Cumulatively, this headwater loss likely reduces ecosystem services like decomposition and results in substantial changes to downstream ecosystems: by reducing connectivity that allows species to move between headwater and downstream reaches, by removing sustained supplies of nutrients such as organic matter, and by altering whole-network flow dynamics (Gomi et al. 2002, Meyer et al. 2007, Freeman et al. 2007, Wipfli et al. 2007). Understanding individual headwater structural and functional variability could provide valuable insight into these cumulative effects, yet considering that headwater streams can represent 70–80% of the total stream length in unimpaired stream networks, monitoring their presence (i.e., in the case of ephemeral streams; Russell et al. 2015, Williamson et al. 2015) and properties requires considerable effort.

Citizen scientists provide valuable data at greater spatial and temporal scales than could be achieved by conventional research (Theobald et al. 2015). Citizen scientists are typically community volunteers that become local, active participants in data collection, stewardship, and policy initiatives (Dickinson et al. 2012, McKinley et al. 2017). Despite some concerns about data quality, perception about the use of citizen science data by peers, and the extent to which citizen science data are actually used to fill data gaps in peer-reviewed research, these factors partially depend on the role (i.e., acquisition/analysis), training, and demographics of the citizen scientists themselves (Lewandowski and Specht 2015, Burgess et al. 2017, McKinley et al. 2017). With appropriate training and sampling design, citizen scientists can generate data comparable to researchers for basic and applied research (Boudreau and Yan 2004, Lewandowski and Specht 2015, van der Velde et al. 2017).
Here, citizen scientists tested the feasibility of a field component of a standardized decomposition rate survey aimed at becoming an active community-based monitoring program.

We trained and deployed citizen scientists (the “Rot Squad”) to measure structural and functional properties of headwater streams in the rapidly urbanizing Greater Toronto Area (York Region), Canada. Our goals were to 1) contribute data on headwater stream structural and functional properties, in particular, data on cotton-strip decomposition rate, and 2) evaluate the sensitivity of decomposition rate to multiscale factors (i.e., from strip- to site-scale) such as microhabitat conditions, stream and riparian vegetation characteristics, and catchment characteristics. For structural properties, citizen scientists measured and categorized stream and riparian vegetation characteristics across a gradient of urbanization. For functional properties, they deployed cotton strips for an approximately three-week period. We evaluated the similarities between sites and between variables of the structural properties using a multivariate analysis. We then evaluated the sensitivity of decomposition rate by partitioning variance at three spatial scales and proposing a series of models wherein we added explanatory variables at those scales including the expected hump-shaped distribution with urbanization. We expected that decomposition rate would respond equally to local-scale characteristics (e.g., substrate, riparian vegetation, and velocity) and site-scale catchment characteristics since smaller streams are tightly linked to their surrounding landscape. If cotton-strip decomposition rate is sensitive to indicators of landscape disturbance, then it would be a useful measure to incorporate into regional headwater monitoring programs using citizen scientists.
3.2 Methods

3.2.1 Data collection

3.2.1.1 Study region and site selection

We selected 40 headwater stream sites in York Region, Ontario, Canada, across gradients of land use and catchment sizes (Figure 3.1). York Region, a suburban region near Toronto, is important for monitoring ecosystem changes with urbanization as its population is expected to grow by 62%, from 1.11 million in 2016 to 1.79 million by 2041 (OMMA 2017, Statistics Canada 2017). Its current land use is 17% forested, 37% mixed agriculture, and 42% urban (AAFC 2015) and its catchments already contain many agricultural and urban threats to streams.

Sites were located at stream-road crossings coinciding with the Ontario Stream Assessment Protocol’s Section 4 Module 10 (OSAP S4.M10) “Assessing Headwater Drainage Features” (Stanfield et al. 2013). A headwater drainage feature (HDF) is “a depression in the land that conveys surface flow” and includes such features as natural streams, swales, and roadside ditches. We selected the 40 sites based on adequate flow during sampling and ease-of-access for citizen scientists using the OSAP HDF database (n_site = 27) and by ground-truthing additional sites (n_site = 13). Our sites were representative of York Region HDF catchment sizes and land uses, with a slight overrepresentation of catchments with high urban cover (Table D.1). Our study was conducted in the springtime (April–early May) of 2017.

At each site, we established one location at 15 m downstream and a second at 15 m upstream of the road crossing (hereafter, locations). This distance was applied to avoid the immediate effects of the road crossing (e.g., culvert plunge pools, artificial and armoured channel substrate). In many cases, sampling at road crossings also mimics nick points in stream gradient, with flatter and slower upstream waters and faster downstream waters.
3.2.1.2 Citizen scientists recruitment and training

In fall 2016, we recruited 3 ecology graduate student volunteers and 2 EcoSpark staff members to pilot our training materials and protocol; their feedback and logistical considerations resulted in the current training structure and protocol.

During spring 2017, we used social media, an online job board, and local post-secondary institution listservs to recruit citizen scientists. Interested individuals were directed to an EcoSpark-hosted website providing background information and pictorial overviews of the “Rot Squad mission.” Individuals were encouraged to sign up with their colleagues in groups. Twenty-five individuals volunteered with eight volunteering as group leaders; group leaders received a prior day of detailed training. On deployment day, citizen scientists received in-class and in-field training in the morning and applied methods at two field sites per team in the afternoon. Group leaders coordinated transportation and participated in field activities; group members performed various field activities that they felt most comfortable completing (e.g., measuring or recording). Six of the deployment day citizen scientists and two ecology graduate students volunteered for cotton strip collection; these volunteers were sent a slide presentation and given a brief demonstration on collecting strips.

3.2.1.3 Site measurements and cotton strip deployment/collection

Citizen scientists located pre-selected sites using printed maps. At two locations within each site, they assessed channel characteristics and riparian vegetation, and deployed cotton strips and temperature loggers using a standardized protocol. Brief methods are outlined here but see Kielstra et al. (2019) for full protocol.
For channel characteristics at each location, they documented flow conditions (i.e., five ranked categories from “No surface water” to “Surface flow substantial”); recorded wetted stream widths, depths, and velocities using a hierarchical choice of methods based on flow conditions to aid in estimating discharge; and estimated dominant and subdominant channel substrates (e.g., from seven size classes including clay, gravel, and boulders).

For riparian vegetation at each location, they categorized dominant riparian vegetation composition in bands extending 0–1.5 m, 1.5–10 m, and 10–30 m outwards from the location on the left and right bank, looking upstream (e.g., “Lawn”, “Cropped Land”, and “Forest”). They also assessed the canopy cover directly over the location, choosing among four ranked categories from “0–24%” to “75–100%”.

To deploy cotton strips and temperature loggers at each location, they hammered a piece of rebar into the streambed and tied nylon string to a large metal washer which was slid over the rebar. Cotton strips, which were prepared following Tiegs et al. (2013) using unprimed #12 cotton duck canvas (Product CC12A72F, Curry’s Art Supply, Toronto, Canada), were attached at 0.5 m intervals to the nylon string using a zip tie through a pre-pierced hole in the strip. Galvanized steel nuts were attached between the strips using zip ties to ensure strips remained submerged. Temperature loggers were fixed to the rebar out of direct sunlight. Loggers were either an Onset HOBO Pendant UA-001-64 (n = 18; temperature resolution: 0.14°C; logging interval: 15 minutes; Onset Computer Corporation, Bourne, MA, USA) or an iButton DS1922L-F5 (n = 62; temperature resolution: 0.5°C; logging interval: 20 minutes; Maxim Integrated, San Jose, CA, USA). Strip and logger deployment times were recorded.

Strips and loggers were collected after approximately three weeks. At each location, stream wetted width was measured.
Starting at the most downstream strip, each strip was assigned a collection depth category (e.g., “Buried under sediment”, “Floating above stream bottom”, or “Out of water”). The zip tie was then cut, the strip was lightly brushed using a paintbrush to remove loose particles and then lowered into 70% isopropyl alcohol for 30 seconds using a clothespin. The strip was wrapped in aluminum foil along with a unique label (e.g., SiteX-LocationX-StripX). The temperature logger was retrieved, and collection times were recorded.

In the laboratory, cotton strips were unwrapped and placed in a drying oven at 40°C. After drying (approximately 48 hours), we painted approximately 2 cm of the strip ends on both sides with Simply Brand Acrylic Structure Gel (part no. 126250900; Daler-Rowney, Bracknell, England). In earlier tests, we found that painting the ends before use on the tensiometer significantly decreased numbers of strips breaking at the tensiometer clamps (paint = 0%, no paint = 40%, χ² = 4.47, p = 0.05). We also prepared reference strips that underwent the same procedure above but were not conditioned in a stream.

3.2.1.4 Decomposition rate determination

We measured tensile strength using an Instron 3367 Universal Testing Machine with a 2 kN load cell at room temperature and ambient humidity (Instron, Norwood, MA, USA). Approximately 1 cm of each strip end was placed in the clamps (Instron 2712-003). The clamps were smooth-surfaced and supplied by 550 kPa of air, sufficient pressure to minimize slippage and breakage at contact points. Strips were pulled upward at 2 cm/min. Maximum tensile strength before breaking was recorded (units: Newtons) along with type of break (e.g., middle of strip, at the clamp).

We estimated % tensile loss per degree day (TLDD) to compensate for temperature effects on decomposition rate and for comparison with other studies. Degree days for each location were calculated by summing the daily mean temperatures over the incubation period.
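The degree-day sum and the TLDD standardization described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function names, the `base_c` argument (which assumes that days at or below the base temperature contribute zero), and the capping of strips at the reference strength (taken from the data-selection rule in Section 3.2.2.1) are assumptions made here. The example inputs simply combine medians reported in this chapter and do not describe any single strip.

```python
def degree_days(daily_mean_temps_c, base_c=0.0):
    """Sum daily mean temperatures (deg C) over the incubation period.

    Assumption: days at or below `base_c` contribute zero degree days.
    """
    return sum(max(t - base_c, 0.0) for t in daily_mean_temps_c)


def tldd(tensile_strength, reference_strength, dd):
    """Percent tensile loss per degree day for a single strip."""
    if reference_strength <= 0 or dd <= 0:
        raise ValueError("reference strength and degree days must be positive")
    # Assumption: strips stronger than the reference mean are treated as
    # showing no decomposition, so they are capped at the reference value.
    capped = min(tensile_strength, reference_strength)
    percent_loss = (1.0 - capped / reference_strength) * 100.0
    return percent_loss / dd


# Hypothetical example: 21 days at a constant 9.6 deg C daily mean,
# a strip at 15.48 kg and a reference mean of 31.06 kg.
daily_means = [9.6] * 21
dd = degree_days(daily_means)
rate = tldd(15.48, 31.06, dd)
print(f"{dd:.1f} degree days, TLDD = {rate:.3f} % per degree day")
```

Keeping the degree-day sum separate from the tensile-loss calculation mirrors the scales involved: degree days are a location-level quantity, whereas tensile loss is measured on each strip.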
At three sites with a single location-level logger missing, we used the other location’s data since we found no significant difference in summary metrics (e.g., mean, coefficient of variation, and degree days) over the study period between locations that had both loggers (paired t-tests). At one site with both loggers missing, we used the per-day average of all available loggers. TLDD was calculated for each strip using

$$\mathrm{TLDD}_{ijk} = \frac{\left(1 - \dfrac{\mathrm{TensileStrength}_{ijk}}{\mathrm{TensileStrength}_{\mathrm{reference}}}\right) \times 100}{\mathrm{DD}_{jk}}$$

where i is a strip, j is a location, and k is a site; DD_jk is the degree days accumulated at location j in site k; and TensileStrength_reference is the mean of reference strips. There was a strong correlation between TLDD and tensile loss per day not accounting for temperature (correlation aggregated to location-level mean: Pearson’s r = 0.95, p < 0.001). Hereafter, decomposition rate refers to TLDD.

3.2.1.5 Site-level landscape characteristics

We delineated catchments using a digital elevation model and estimated catchment land cover at three different scales: within the entire upland catchment of sites (denoted “c_” for catchment) and within circular buffers extending 250 m and 1000 m away from the site and intersected with the shape of the upland catchment (denoted “l_250” and “l_1000” for local). The goal was to examine land use at local scales (buffers) and the entire catchment; however, values could be identical for very small catchments across those scales. We estimated land cover in terms of composition (e.g., proportions) and configuration (e.g., connectedness of different types).

To delineate catchments, we used a 10 m grain provincial digital elevation model and topographically corrected it using a breaching algorithm (OMNRF 2015) in SAGA GIS 6.2.0 through R 3.4.3 and ‘RSAGA’ 1.0.0 (Brenning 2008, OMNRF 2015, Conrad et al. 2015, R Core Team 2017).
We used a D8 flow-routing algorithm and 1 ha accumulation threshold to build DEM-generated flow lines to which we “snapped” our sites (mean snapping distance ± 1 SD: 8 m ± 12 m) using tools in ‘openSTARS’ 1.0.0 and ‘rgrass7’ 0.1-10 that used GRASS 7.4.0 (Bivand 2017, GRASS Development Team 2017, Kattwinkel and Szöcs 2017). A boundary drawn around topographically contributing cells is defined as the catchment. Using the DEM, we also calculated the pixel-specific topographic wetness index (TWI), which is an index of a pixel’s tendency to accumulate water (i.e., proportional to local drainage area and inversely proportional to local slope) (Beven and Kirkby 1979); we took the mean value over the l_250 and l_1000 buffers for each site. Other important tools for spatial data manipulation were R packages ‘gdalUtils’ 2.1.0.7, ‘raster’ 2.6.7, ‘sp’ 1.2.7, and ‘rgdal’ 1.2-16 (Pebesma and Bivand 2005, Greenberg and Mattiuzzi 2015, Bivand et al. 2017, Hijmans 2017).

For land cover composition, we calculated land cover proportions by extracting pixels within each scale and calculating the proportion of each land cover class. Land cover categories were taken from AAFC (2015), which we found was more temporally accurate than other provincial land cover datasets despite a larger pixel size (30 m here vs. 10 m). We aggregated these proportions to calculate the landscape disturbance index (LDI) sensu Stanfield et al. (2009), a weighted proportion of “disturbed” land cover, where low LDI indicates a relatively undisturbed catchment (Appendix D). We also calculated a satellite-derived continuous index of vegetation cover. This was done by generating a composite image of the maximum Normalized Difference Vegetation Index (NDVI) for June–August 2016, using Landsat 8 imagery in Google Earth Engine (Gorelick et al. 2017a). Every pixel in the composite image takes its maximum value across several images taken throughout the year, that is, its maximum vegetation density (Pettorelli et al.
2005). The mean of these composite pixels was taken at each site for each catchment scale, respectively.

For landscape configuration, we chose urban/developed as our category of interest to calculate patch density and the patch cohesion index using ‘SDMTools’ 1.1-221 in R (VanDerWal et al. 2014). Patch density measures the total number of patches divided by the total area (i.e., how many urban patches are present?) whereas patch cohesion measures the connectedness of patches (i.e., are urban patches clumped together or dispersed across the landscape?). At one site, there was no urban cover in the buffers based on our land cover grain (30 m). Rather than omit this site from analyses, we inserted a “0” for both indices, signifying that any urban cover present is greatly disaggregated.

3.2.2 Data analysis

3.2.2.1 Data selection

We determined tensile strength for 464 of 480 originally deployed cotton strips (recovery rate: 97%). We excluded strips that were not under water at collection time (7% of recovered strips). The tensile strength of our 11 reference strips was 31.06 ± 1.23 kg (mean ± 1 SD). All strips with tensile strength above this mean were assigned the reference strip value (c. 1% of strips) to signify no decomposition over the sampling period.

Tensile strength measurements can be influenced by how the material failed. For example, breaking at clamps can indicate failure due to pressure from contact points rather than the material itself. Therefore, we excluded data based on each strip’s breaking properties. We subset the data by each location (i.e., one scale above the lowest level of replication, strips) and made the following decisions. We assumed that strips breaking in the middle yielded the best quality results. If at least 4 strips broke in the middle at each location, we only used those data.
If fewer than 4 strips met this criterion, we also included strips breaking closer to the top/bottom portions of the strip, or that broke at two places between the clamps. If fewer than 4 strips met these criteria, then we included all available strips. Our final dataset was 345 strips in 78 locations in 40 sites (74% of recovered strips, 98% of locations, and 100% of sites).

3.2.2.2 Statistical analysis

We analyzed our data to determine 1) how stream structural properties (i.e., channel characteristics and riparian vegetation) and decomposition rate vary within and between sites, and 2) how decomposition rate varies with local scale factors and catchment land cover.

First, we analyzed structural measurements using a Principal Coordinates Analysis (PCoA) on the Gower dissimilarity matrix of site:location combinations (n_site = 40 and n_location = 79 had complete data for n_variables = 13). The Gower dissimilarity matrix allowed us to analyze numerical and categorical data together for discerning site and/or variable groupings for structural properties. We generated the Gower dissimilarity using ‘cluster’ version 2.0.6 and ran the PCoA in base R (cmdscale) on the square-rooted Gower dissimilarity matrix to ensure the dissimilarities were Euclidean (Maechler et al. 2018). We used a subset of the PCoA axis scores at the site:location level as predictors in models below. We also conducted Spearman rank correlation tests of the subset PCoA scores against catchment variables generated at the varying scales (i.e., buffers and whole catchment scales).

Second, we analyzed the decomposition rate by investigating the variation at each scale (e.g., strip, location, and site) and how local structural or catchment variables explain that variation. We fit linear mixed effects models using ‘SSN’ 1.1.12 in R, assuming a Gaussian error distribution (Ver Hoef and Peterson 2010, Peterson and Hoef 2014). Our random effects were locations within sites.
We fit models using SSN since it also allows for determining if adding spatial random effects (e.g., spatial autocorrelation) increases explanatory power. Since variables can be measured at different scales, a corresponding change of estimated residual σ² at that variable’s scale was monitored (% change in σ²).

To represent riparian vegetation, we calculated a “riparian vegetation score” as the sum of the ranked vegetation categories (1 = none to 7 = forested) for the right and left bank of each zone (6 zones total) and then divided this by the maximum score possible. For example, a site that was completely dominated (i.e., all 6 zones) by forest vegetation (i.e., score of 7) would have a sum of 42 and a riparian vegetation score of 1 (i.e., 42/42 [maximum possible]). We also took the most common type amongst the six zones as the predictor, “riparian vegetation mode”.

We used an information theoretic framework to evaluate the level of support for a series of candidate models investigating how variation at each scale might be explained. First, we fit a null model to express residual variation at each scale. Then, we fit sets of local models (strip and location scale data, n_model = 6), site buffer models (n_model = 8), and catchment models (n_model = 6) (Table D.3). We compared these proposed models using Akaike’s Information Criterion (AIC) and accepted the best models in each set (relative likelihood > 0.5 based on AIC) to generate a final set for comparing across scales (Table 3.1). Finally, we fit Euclidean distance spatial random effects (i.e., overland distance) to determine if adding spatially explicit information improves predictive ability. We did not fit network distance models since our spatial locations were sparse and spread over several independent stream networks. For comparing models, we needed to remove two locations due to missing data for some variables.
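The AIC-based screening step can be illustrated with a short Python sketch. It assumes the standard definition of relative likelihood, exp((AIC_min − AIC_i) / 2), computed within one candidate set; the model names and AIC values below are hypothetical and are not taken from Table 3.1 or Table D.3.

```python
import math


def relative_likelihoods(aic_by_model):
    """Relative likelihood of each model against the lowest-AIC model in the set."""
    best = min(aic_by_model.values())
    return {name: math.exp((best - aic) / 2.0) for name, aic in aic_by_model.items()}


def retained_models(aic_by_model, threshold=0.5):
    """Names of models whose relative likelihood exceeds the threshold."""
    rl = relative_likelihoods(aic_by_model)
    return [name for name, value in rl.items() if value > threshold]


# Hypothetical candidate set (illustrative AIC values only).
local_set = {"velocity": 100.0, "substrate": 101.0, "null": 104.0}
for name, value in relative_likelihoods(local_set).items():
    print(f"{name}: relative likelihood = {value:.3f}")
print("retained:", retained_models(local_set))
```

Under this definition, a threshold of 0.5 is equivalent to keeping models within 2 ln 2 (about 1.39) AIC units of the best model in the set.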
We additionally calculated marginal R² and conditional R² sensu Nakagawa and Schielzeth (2013).

3.3 Results

Generally, our streams had substrates dominated by silt and sand, slow-moving water, and were surrounded by meadow or scrubland riparian vegetation. Decomposition rate was rather variable at all scales of investigation (i.e., strip, location, and site). Strip depth categories, stream velocity, and the interaction between local vegetation density and topography were statistically significant, yet poor explanatory factors at these scales (i.e., low marginal R²).

3.3.1 Structural measurements

For channel characteristics, the median wetted width was 1.76 m (range: 0.42–16.30 m), the median depth was 0.13 m (range: 0.02–0.41 m), and the median velocity was 0.17 m/s (range: 0–0.72 m/s). The channel substrates were dominated by silt (51%) and sand (15%) and less so by clay, gravel, and cobble (each c. 10%) or boulders (1%). The median average temperature of our locations across the study period was 9.6 °C (range: 7.8–13.1 °C), and the median standard deviation was 2.4 °C (range: 1.4–4.7 °C) (Figure D.1).

For riparian characteristics, the immediate riparian zones (0–1.5 m and 1.5–10 m) were dominated by meadow type habitat (< 25% tree cover) followed by scrubland (> 25% and < 60% tree cover) and smaller proportions of no vegetation cover, lawns, or forest (Figure 3.2). In the zone extending from 10–30 m, the dominant riparian vegetation was either no vegetation cover or lawns and approximately equal proportions of meadow, scrubland, or forest (Figure 3.2). Due to the springtime season of study, canopy cover was mostly low (96% of sites had < 50% cover).

The first four PCoA axes cumulatively explained 39% of the variation and were used in subsequent decomposition rate models. We discuss the first two here (explaining 17% and 8% of the variation, respectively). Using envfit in ‘vegan’, we fit our factors to the ordination space.
Overall, the PCoA was more strongly influenced by riparian vegetation characteristics than channel characteristics, based on regressions between original variables with location scores along those axes using envfit (R² = 0.25–0.41 for vegetation categories and R² = 0.16 for canopy cover; R² = 0.16–0.29 for substrate and flow categories and R² = 0.35 for channel velocity and non-significant R² for depth and width). Along PCoA1, sites with lower scores tended to have slower flow, greater channel widths and depths, smaller channel substrate, and meadow-type riparian habitats (Table D.2 and Figure D.2). Otherwise, sites had varying channel and riparian vegetation characteristics along this axis. Along PCoA2, sites with lower scores tended to have deeper, wider, and faster moving water whereas those with more positive scores tended to be forested with smaller, slower moving channels of varying substrate (Table D.2 and Figure D.2). Neither PCoA1 nor PCoA2 was strongly correlated with any catchment scale variables (e.g., LDI, NDVI, urban patch density, and urban patch cohesion at c_, l_250, and l_1000 scales) at the 0.002 significance level (Bonferroni corrected, 0.002 = 0.05/24 tests).

3.3.2 Functional measurements

The median decomposition rate of strips was 0.23% tensile loss/degree day (range: 0–0.47% tensile loss/degree day) corresponding to a median 2.43% tensile loss/day (range: 0–6.64%) and a median absolute tensile loss of 50% (range: 0–96%) over a median 21-day incubation period (range: 16–33 days). The median tensile strength of the strips was 15.48 kg (range: 1.39–31.06 kg). Based on our null model, between-site differences accounted for 30% of the variation in decomposition rate while between-location differences accounted for 36%. The remaining 34% was among-strip variation (Table 3.1, model 4). These percentages suggest approximately equal controls on decomposition rates at site, location, and strip scales.
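As a rough illustration of how such percentages can be obtained, the sketch below expresses each variance component of a null model as a share of the total variance. It assumes the reported percentages were derived in this way; the component values are hypothetical, chosen only so that the shares match the 30%, 36%, and 34% reported above, and the function name is likewise an illustrative choice.

```python
def variance_shares(components):
    """Express each variance component as a percentage of the total variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {scale: 100.0 * value / total for scale, value in components.items()}


# Hypothetical variance components (arbitrary units) from a null model with
# random intercepts for site and for location within site; the residual
# term represents among-strip variation.
components = {"site": 0.0030, "location": 0.0036, "strip": 0.0034}
for scale, share in variance_shares(components).items():
    print(f"{scale}: {share:.0f}% of total variation")
```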
Overall, the chosen variables in the 20 a priori models explained little of the total variation in decomposition rate, despite attempts to fit models at multiple scales (Table 3.1, R² of fixed effects and % change in site or location σ²). However, the R² of each full model (i.e., including random effects) was much higher, suggesting that decomposition rate has high site- and location-specific affinity. The top models from all scales (i.e., those with relative likelihood > 0.5 in each set) tended to have significant predictors (i.e., 95% CI did not overlap zero) that explained a small amount of the total variation.

The top model (Table 3.1, model 1) indicated that strip depth category was important, with strips buried under sediment having a 15% lower decomposition rate than those on the stream bottom and 24% lower than those where it was difficult to determine the depth (e.g., typically sediment-laden murky water) (Figure 3.3). This was location specific, since including this variable reduced the location σ² by 11%. The top model at the location scale (Table 3.1, model 2) indicated that decomposition rate increased with stream velocity (Figure 3.4). An increase in velocity of 0.14 m·s⁻¹ (1 SD) was associated with a 6% increase in decomposition rate. This was site specific, since including this variable reduced the site σ² by 6%. The top models at the site scale (Table 3.1, models 4 and 5) did not have any significant predictors. However, the significant interaction between local vegetation density (i.e., mean NDVI in a 250 m upstream buffer) and topography (i.e., mean TWI in a 250 m upstream buffer) suggested that the positive relationship between vegetation density and decomposition rate strengthened at sites with a greater tendency to be saturated with water (i.e., moving from steep-banked channels with low accumulation to more low-sloped, wetland-type channels) (Figure 3.5).
For example, sites with low (-1.5 SD in TWI: 6.52), average (TWI: 7.75), and high (+1.5 SD in TWI: 8.98) TWI values would correspond to a -12%, ~0%, and +12% change in decomposition rate for a 1 SD increase in NDVI value, respectively.

For the top models (Table 3.1, models 1–2 and 4–5), when we added spatial components to the models based on Euclidean distances, we found no improvement in explanatory power. We added 4 different autocorrelation functions per model and found that they all explained < 5% of total variation with the majority explaining < 1%.

3.3.3 Citizen science engagement

Thirty citizen science volunteers contributed a total of 334 hours to piloting, training, deployment, and collection (Figure 3.6). Nineteen sites were completed on the deployment day, supplemented by 21 sites on subsequent days. Twenty-six sites were completed on the collection day, supplemented by 14 sites on subsequent days. Median time spent per site was 43 minutes (range: 19–96 minutes) for deployment and 24 minutes (range: 9–41 minutes) for collection. Participation in follow-up surveys was variable with 100% of participants from the pilot, 100% from the group-leader training session, 32% from the deployment day, and 42% from the collection day.

3.4 Discussion

The citizen scientists conducted a rapid survey of structural and functional properties of headwater streams in an urbanizing region, providing data for understanding headwater variability. For structural properties, we found that natural riparian vegetation cover declined moving away from the stream in our region, particularly beyond a 10 m distance. We also found that meadow-type riparian habitats were associated with wider, deeper, slower-moving streams with smaller substrates. Decomposition rate was most sensitive to local factors such as strip microhabitat (i.e., strip depth category), stream velocity, and both nearby upland riparian vegetation density and topography.
Local scale variables (i.e., strip- and location-scale) best explained decomposition rate and accounted for 70% of the total variation. Recent studies have shown that cotton-strip decomposition rate increases with temperature (Griffiths and Tiegs 2016, Scrine et al. 2017) but that this temperature dependency can change with season and deployment period (Vyšná et al. 2014). We controlled for the effects of temperature by standardizing % tensile loss measurements by degree days (base 0°C) and conducting our study in the spring season (April–May). Low spring temperatures may have limited microbial activity and biological effects on decomposition rate; however, our variation in cotton-strip decomposition rates (both accounting for temperature and not) is similar to those reported elsewhere (Clapcott and Barmuta 2010, Tiegs et al. 2013, Griffiths and Tiegs 2016).

Against our expectations, dominant substrate and riparian vegetation cover did not explain significant variation in decomposition rate. For substrate, this is likely because our sites lacked variability and consisted of mostly small particle substrates. For riparian vegetation cover, this could be because we assessed dominant vegetation rather than density of that cover. Furthermore, the measurement of vegetation density (% canopy cover) likely did not discriminate between sites due to the spring season (i.e., low canopy cover at all sites). Low riparian forest density can increase concentration of small particle sizes and reduce macroinvertebrate densities that can drive decomposition rates (Sponseller and Benfield 2001, Lecerf and Richardson 2010). When we grouped local variables (PCoA1), decomposition rate increased along this gradient moving away from meadow riparian habitats with wide and deep channels and small particles towards more variable riparian habitats (mostly scrubland) with narrower channels and larger substrates.

Strip burial was associated with lower decomposition rate.
Increased stream velocity could have reduced sediment settling rates, which would explain the positive relationship we found. Leaf material undergoes three phases of decomposition upon entering the stream (i.e., leaching of soluble materials, microbial conditioning, and finally physical fragmentation or ingestion by macroinvertebrates) while microbial decomposition is the expected principal breakdown mechanism of cotton strips over these short incubation periods (Gessner et al. 1999, Tiegs et al. 2007). Even if depositional areas in headwaters have higher microbial activity (Clapcott and Barmuta 2010), burial likely degrades microhabitat conditions conducive to organic matter breakdown by microbes and benthic macroinvertebrates in several ways (Herbst 1980, Tillman et al. 2003, Scott and Zhang 2012). For example, burial can reduce oxygen and nutrient delivery to microbes in pore water, reduce abrasion from streamflow, and reduce effective colonization area for biota (Herbst 1980, Danger et al. 2012). Burial depth or low oxygen/nutrient levels in the hyporheic zone likely did not affect our results since strips were generally found in the first few centimetres below the streambed where pore water is readily replenished unless there is groundwater upwelling (Findlay 1995, Tillman et al. 2003). Furthermore, fine-scale studies of cotton-strip decomposition in headwater habitat patches revealed no relationship with sediment nutrients (Clapcott and Barmuta 2010). Physical abrasion likely affects leaf litter more than cotton strips, since the cotton strips retain their tightly woven shape and small fragments of material are not easily broken off (Lecerf 2017; but see Clapcott and Barmuta 2010 for slight differences among depositional and coarse gravel substrates).
The effect of burial, then, likely represents a combination of reduced microbial activity, reduced access to cotton strips, or reduced microbially-mediated palatability of cotton strips to any macroinvertebrates consuming the cotton (Herbst 1980, Sinsabaugh et al. 1985, Sponseller and Benfield 2001, Clapcott and Barmuta 2010, Scott and Zhang 2012). Anecdotally, we found benthic macroinvertebrates such as dipteran larvae and amphipods attached to our cotton strips at many sites. Given the many local scale factors affecting a large proportion of variation in cotton-strip decomposition rates (70%), more mechanistic studies with coincident sampling of microbial activity, benthic macroinvertebrates, and water quality are needed as covariates if cotton strips are to be used as a broader scale indicator of ecosystem functional health.

The broader scale variables (i.e., site-scale GIS variables) were not strong predictors of the remaining 30% of variation attributed to between-site differences. We did find that sites with higher local vegetation (i.e., NDVI values in the 250 m upstream buffer) had higher decomposition rates and that this effect was stronger at sites that tended to accumulate water (i.e., TWI values in the 250 m upstream buffer); we expect that these areas may have higher organic carbon and nutrient content with potentially higher microbial activity and decomposition rates as a result (Manzoni et al. 2012, Grabs et al. 2012).

We did not find strong relationships between our measures of urban cover and decomposition rates despite the expected hump-shaped relationship. In small streams, some studies have found this relationship while others have not, which may be attributed to a combination of factors including decomposition rate indicator type and variation in the range of urban cover and catchment sizes examined.
For example, increased leaf litter decomposition rate associated with increasing urban cover (0–20%) was attributed to increased microbial activity, but this varied by leaf species (Imberger et al. 2008). Similarly, a threshold of 30–40% urban cover was observed after which leaf litter decomposition rates declined for two leaf species; the threshold was similar but the magnitude of response varied by species (Chadwick et al. 2006). In contrast, controlling for variation in material quality by using cotton strips revealed no patterns with urban cover despite strong relationships with nutrients or stream temperature; the authors suggested that context-specific factors explained at least some of this lack of relationship (Imberger et al. 2010). In larger streams across a gradient of freshwater to estuary sites (4th to 5th order), cotton-strip decomposition rate was hump-shaped across a gradient of “pristine catchment cover” (% native cover) but the response was the inverse of our expected relationship (i.e., lowest rates at 50% pristine cover) and only apparent when phosphorus concentrations and salinity were accounted for (Bierschenk et al. 2012). Across a range of functional indicators, catchment sizes, and land uses, another study found leaf litter and wooden stick decomposition rates responded negatively and linearly to “landuse stress scores” whereas cotton-strip decomposition rates showed a threshold response; however, an urban gradient was not used in the study (Young and Collier 2009). We anticipated that sampling many headwater streams with varying land uses, which are tightly coupled with surrounding landscapes, would reveal stronger urbanization controls (Gomi et al. 2002). More study is needed across a range of catchment sizes and land uses to establish at what scale cotton-strip decomposition rate is most sensitive to catchment disturbances.
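The variance partition discussed above (about 70% local, 30% between-site) and the relative likelihoods reported in Table 3.1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the 70%/30% split refers to the variance components of the null model (model 3), and it assumes Rel L. follows the conventional formula exp(-∆AIC / 2), which appears to reproduce the tabulated values. The function names are hypothetical.

```python
import math

# Variance components of the null model (model 3) as reported in Table 3.1.
# Assumption: these are the values behind the 70%/30% split in the text.
null_variances = {"site": 0.00275, "location": 0.00326, "strip": 0.00308}


def variance_shares(components):
    """Return each component's share of the summed variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


def relative_likelihood(delta_aic):
    """Conventional relative likelihood of a model, exp(-delta_AIC / 2)."""
    return math.exp(-delta_aic / 2.0)


shares = variance_shares(null_variances)
local_share = shares["location"] + shares["strip"]
print(f"between-site share: {shares['site']:.2f}")
print(f"local (location + strip) share: {local_share:.2f}")

# Delta-AIC values for models 1 to 4 in Table 3.1.
for delta in (0.000, 0.347, 0.912, 1.848):
    print(delta, round(relative_likelihood(delta), 3))
```

Under these assumptions the site component is roughly 0.30 of the total and the location and strip components together roughly 0.70, in line with the proportions quoted in the discussion.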
Citizen scientists provide substantial in-kind contributions to basic and applied research (Theobald et al. 2015, McKinley et al. 2017). In addition to contributing to data collection, citizen scientists can be active participants and advocates for conservation actions at the local level (Gray et al. 2017, Sullivan et al. 2017). Many of our volunteers were post-secondary students or recent graduates who wanted to add to their in-field experience, demonstrating both a proactive concern for the environment and a practical need for experience for future employment. Furthermore, we had several volunteers contribute to multiple stages of the project, supporting the notion that there is often a long-tail distribution of participation in citizen science projects (Figure 3.6) (Segal et al. 2015). Despite partnering with a well-established non-governmental organization with high capacity for recruiting and coordinating volunteers, we did have challenges with recruitment and obtaining feedback. For example, the protocol was inappropriate for individuals unable to navigate steep and uneven terrain. The need for personal vehicles may have discouraged participation. The project also took place when key community contacts and volunteers were heavily engaged in the province’s coordinated land-use planning review. Finally, we found that feedback was higher when surveys were completed “in-class” rather than “at-home”; our deployment day logistics did not allow for in-person debriefing. Regardless of these challenges, our citizen scientists completed the field component with minimal assistance and few recording errors and were able to compress nearly a month’s worth of researcher field activity into several days. Prior knowledge and experience, training (particularly field training), and repeated measures improve accuracy and precision of citizen-science gathered data (Lewandowski and Specht 2015).
We relied on crew leader and volunteer training (i.e., establishing prior knowledge), making ourselves available for consultation (e.g., phone calls in the case of uncertainty), simple methodologies and field sheets, and site photographs for ensuring data quality. Additional measurements suggested here (e.g., coincident sampling of microbial activity, water quality, and sampling of benthic macroinvertebrates in headwaters) would require additional training and time per site, potentially outweighing the benefits of incorporating citizen scientists into a monitoring program. Collaborating and contributing to existing citizen science programs for measuring water quality could be a useful first step (e.g., FreshWater Watch provides sample kits and a platform for data upload and visualization). Still, a broader scale assessment across a larger range of catchment sizes and land uses would be more appropriate to establish the scale at which citizen scientists could best be deployed.

More study is needed on mechanisms driving patterns of cotton-strip decomposition rates at local and regional scales. Local scale organic matter decomposition is an important functional property and a basis for community production in downstream reaches. Monitoring this function, particularly in headwater streams where much of the organic matter entrainment and processing takes place, would require considerable effort for monitoring programs. Urban streams are complex and, despite the apparent lack of sensitivity to urban cover in these important upper reaches of the stream network, we provide estimates of variation at different scales and a protocol outlining how citizen scientists could be deployed to capture a large range of variation in both structural and functional stream measurements that could inform regional scale models.
Importantly, we suggest the protocol be used across a range of catchment sizes, land uses, and seasons to establish the scale at which decomposition rate is sensitive to catchment disturbance. We suggest a subset of sites be used to establish sensitivity to patterns of microbial, macroinvertebrate, and water quality measurements and how these relate to patterns of catchment disturbance. Together, establishing local scale mechanisms and monitoring broader scale structural and functional properties could inform managers when sites are outside the normal range of variation.

3.5 Tables

Table 3.1 Linear mixed model results from model comparison. Models are ordered from best to worst candidate. Model numbers in parentheses refer to models from Table D.4. Text in model column also refers to the scale at which the variables were estimated. AIC is used for comparing models, where a low AIC indicates a better and more parsimonious fit. ∆AIC is the change in AIC relative to the best model (model 1). Rel L. is the relative probability of being the best model. For site-scale variables, their values can be estimated over “c_” or “l_250” catchments, where “c_” is the full catchment and “l_” is a local 250 m buffer, respectively. Depth categories: Depth 1 – Buried under sediment; Depth 2 – On stream bottom; Depth 3 – Floating above stream bottom; Depth 5 – unable to determine. Depth 4 was excluded during data processing. Coefficient 95% CI are ± 1.96 x SE of the coefficient. Those CI not overlapping 0 are bolded. “Int.” is the intercept. Variance components (σ²) are presented for reference. R² (fixed) is the variance explained by the fixed effects. R² (full) is the variance explained by the fixed effects and random effects. En dash indicates “not applicable.”

Model | AIC | ∆AIC | Rel L. | Fixed effects | β (95% CI) | Site σ² | Location σ² | Strip σ² | % change Site σ² | % change Location σ² | R² (fixed) | R² (full)
1 (1); Strip | -841.443 | 0.000 | 1.000 | Int. (Depth 1) | 0.200 (0.17, 0.23) | 0.00321 | 0.00291 | 0.00302 | +16.8 | -10.7 | 0.02 | 0.67
 | | | | Strip depth 2 | 0.031 (0.01, 0.06) | | | | | | |
 | | | | Strip depth 3 | 0.023 (-0.01, 0.05) | | | | | | |
 | | | | Strip depth 5 | 0.048 (> 0.00, 0.10) | | | | | | |
2 (2); Location | -841.096 | 0.347 | 0.841 | Int. | 0.225 (0.20, 0.25) | 0.00257 | 0.00326 | 0.00308 | -6.4 | 0.0 | 0.01 | 0.65
 | | | | Velocity | 0.018 (> 0.00, 0.04) | | | | | | |
 | | | | Depth | -0.003 (-0.02, 0.02) | | | | | | |
3 (--); Null | -840.531 | 0.912 | 0.634 | Int. | 0.225 (0.20, 0.25) | 0.00275 | 0.00326 | 0.00308 | – | – | 0.00 | 0.66
4 (15); Site | -839.595 | 1.848 | 0.397 | Int. | 0.202 (0.17, 0.24) | 0.00281 | 0.00322 | 0.00308 | +2.4 | -1.2 | 0.01 | 0.66
 | | | | c_LDI | 0.002 (-0.02, 0.03) | | | | | | |
 | | | | c_LDI² | 0.023 (-0.01, 0.05) | | | | | | |
5 (16); Site | -839.531 | 1.912 | 0.384 | Int. | 0.230 (0.21, 0.25) | 0.00263 | 0.00327 | 0.00308 | -4.4 | +0.3 | 0.01 | 0.66
 | | | | c_LDI | 0.012 (-0.01, 0.04) | | | | | | |
 | | | | c_log10Area | 0.014 (-0.01, 0.04) | | | | | | |
 | | | | c_LDI x c_log10Area | -0.019 (-0.05, 0.01) | | | | | | |
6 (7); Site | -839.521 | 1.923 | 0.382 | Int. | 0.240 (0.21, 0.26) | 0.00270 | 0.00324 | 0.00308 | -1.7 | -0.7 | 0.01 | 0.66
 | | | | l_250 NDVI | 0.001 (-0.02, 0.03) | | | | | | |
 | | | | l_250 TWI | 0.010 (-0.01, 0.04) | | | | | | |
 | | | | l_250 NDVI x l_250 TWI | 0.029 (> 0.00, 0.06) | | | | | | |

3.6 Figures

Figure 3.1 Map of sites (nsite = 40) within York Region of the Greater Toronto Area overlaid on the maximum value composite image of NDVI values for 2016. Symbol sizes are proportional to decomposition rates estimated at the site-level. Landsat imagery courtesy of the U.S. Geological Survey.

Figure 3.2 Percentage of dominant vegetation categories in zones 0-1.5 m, 1.5-10 m, and 10-30 m extending out from the stream, including both left and right banks, taken over all site:location combinations.

Figure 3.3 Decomposition rate variation with strip depth category. Points are predicted means and 95% CI for strip depth category at time of collection. Estimates do not consider uncertainty in the random effects.
Strips could be 1) buried under sediment, 2) on stream bottom, 3) floating above stream bottom, 4) out of water (not used in analysis), or 5) unable to determine (e.g., if water was too murky).

Figure 3.4 Decomposition rate variation with stream velocity. Points are predicted means and 95% CI at the location level when the estimates of a model including only depth and random intercepts are subtracted from the response. Line is predicted relationship with 95% CI and does not consider uncertainty in the random effects.

Figure 3.5 Decomposition rate variation with NDVI at different levels of TWI. Points are means and 95% CI at the site level. Polygons are the 95% CI for predicted relationship between decomposition rate and NDVI in a 250 m upstream buffer at low TWI (-1.5 TWI SD; light grey), average TWI (medium grey), and high TWI (+1.5 SD; dark grey).

Figure 3.6 Number of hours volunteered by citizen scientists according to project stage and individual.

Chapter 4: Predicting cumulative effects of landscape variables on stream benthic macroinvertebrate communities using spatiotemporally extensive biomonitoring data and hindcasting

4.1 Introduction

Biomonitoring programs assess how ecological communities deviate from reference conditions to guide protection and restoration. Both natural and anthropogenic stressors can cumulatively affect this deviation in complex ways (i.e., non-linear) and at multiple scales (i.e., local and regional effects) (Crain et al. 2008, Piggott et al. 2015, Jones 2016). Consequently, reference conditions are best referred to by a distribution of a given indicator (e.g., biological diversity) under least disturbed conditions rather than an absolute value (Stoddard et al. 2006).
Considering that biomonitoring programs tend to focus on general ecosystem conditions across the landscape (e.g., national, state/provincial, municipal; Therivel and Ross 2007), a significant challenge is the lack of suitable reference sites for testing how far sites deviate from reference conditions.

A complementary approach to using reference sites is employing larger regional datasets that capture a wide array of site conditions to construct relationships between indicators and predictors; these relationships are then used to generate predictions of a site’s expected reference condition at low anthropogenic disturbance (hereafter, hindcasting; Hawkins et al. 2010). Contemporary observations and hindcasted predictions can then be compared to assess a site’s condition. Hindcasting is used in various forms (i.e., different statistical models) for many purposes in the literature. For example, species distribution modelling uses hindcasting to employ “climate envelope models” for predicting past distributions of organisms based on their climatic niches (Nogués-Bravo 2009). Models have used contemporary species distributions and hindcasting to assess how land use changes caused deviation from historic distributions (Regos et al. 2018). In lakes, paleolimnology uses sediment community composition for inferring past climate and environmental conditions (Smol et al. 1991, Dixit et al. 1992). In streams, multiple regression has been used to hindcast fish and benthic macroinvertebrate (BMI) community metrics for assessing impairment, but modelling constraints (e.g., predictors:observations ratio, collinearity of predictors) compel the use of smaller subsets of indicators and predictors (Baker et al. 2005, Kilgour and Stanfield 2006, Angradi et al. 2009).
For biomonitoring, an ideal hindcasting approach should accommodate a variety of indicators and predictors (both natural and anthropogenic) under a common modelling framework that does not rely heavily on model specification (e.g., distributions of indicators or predictors, interactions between predictors). Rather, it should focus on improving predictive capacity and understanding drivers of regional-scale variation.

Random forest models are particularly suitable as cumulative effects models for hindcasting or forecasting (Jones et al. 2017). These models are a form of classification and regression tree models wherein random subsets of observations and predictors (i.e., trees) are successively split along predictor variables into smaller groups (i.e., leaves; Breiman 2001). Ultimately, observations in the same leaf are expected to have the same value. Because the models impose few constraints (e.g., no assumed linearity in the response), this analysis is especially suited for data when there is an expected threshold response to stressors. For example, steep curvilinear declines in abundance of many stream benthos are expected at even low levels of urbanization (< 5%; King et al. 2011), although linear responses have also been found, including in our study region (Bazinet et al. 2010, Wallace et al. 2013). Results of random forest analysis can also be used to map predicted response across single stressor gradients while averaging across all other variables (i.e., an independent effect analogous to partial regression plots; Jones and Linder 2015). Similarly, hindcasting can be used by setting disturbance predictors (e.g., urban cover) to low levels while accounting for other environmental predictors (e.g., geology, temperature) for modelling cumulative effects.

Despite the advantages of using random forests as cumulative effects models, the measured predictors may not capture all relevant spatial structures (i.e., scales) of an indicator (from local- to broad-scale variation).
Indicator values at sites can result from induced spatial dependence (i.e., relationship between an indicator and a predictor variable) or autocorrelation (i.e., similarity with other sites) (Legendre and Legendre 2012). Classic geostatistical models incorporate spatial structures (e.g., model-based adjustments to the covariance matrix) with goals of describing the spatial structures and increasing predictive capacity at unsampled locations (Peterson and Ver Hoef 2010). For random forests, incorporating additional spatial variables as predictors could be similarly useful and provide valuable insight into any scales that require more study. These variables could be intuitive or more abstract spatial representations. For example, an additional spatial variable could be a covariate representing condition of headwaters weighted by their distance or flow contribution to a BMI sampling site (i.e., intuitive predictor with a distance component). Or, additional spatial variables might be derived from spatial eigenfunction analysis that constructs variables analogous to principal components based on matrices representing connectivity and distances between sites (Blanchet et al. 2008, 2011, Legendre and Legendre 2012, Borcard et al. 2018). Incorporating these predictors could help resolve scale issues when interpreting relationships found in regional biomonitoring data.

Here, we used a spatiotemporally extensive biomonitoring dataset to develop BMI cumulative effects models. For each site, we generated commonly used BMI community metrics as our indicator variables and generated a suite of predictor variables that describe environmental variation (e.g., geology) and anthropogenic variation (e.g., % non-natural land cover). For broadest applicability, we focused on predictors that could be extracted from regional-scale databases (e.g., geology, annual temperature) rather than reach- or site-scale predictors (e.g., substrate characteristics).
We were interested in answering four questions. First, what BMI metrics are best predicted by our environmental context and land cover variables? Second, across all BMI metrics, what predictor variables are most important for explaining variation across the landscape? Third, across all BMI metrics, what proportion of sites significantly deviate from reference conditions (i.e., what proportion of sites are > 1 and > 2 standard deviations [SD] from the mean hindcasted deviations)? Fourth, how does incorporating spatial variables increase predictive capacity? For the first question, we ran random forest models for each BMI metric using the predictor variables; we evaluated the predictive capacity of the full set of predictors using cross-validation. For the second question, we evaluated the importance of individual predictors for each BMI metric using the % increase in model mean-squared error (MSE), if the predictor were randomly permuted, as a diagnostic. For the third question, we generated hindcasted predictions for each site and evaluated how sites deviated from that prediction. For the fourth question, we took a regional subset and constructed spatial variables within the region and tracked the change in predictive capacity from a base model when we added the spatial variables. Together, these analysis steps can help guide conservation practitioners in assessing suitable BMI metrics for a given question (e.g., urban impacts on BMI diversity), in assessing which sites are expected to deviate from reference conditions using regionally-informed predictions, in understanding those predictors that drive the BMI metric, and in understanding if more predictors are needed at certain scales.

4.2 Methods

4.2.1 Data collection

4.2.1.1 Study region and dataset

To generate a spatially and temporally extensive stream macroinvertebrate dataset, we merged four datasets, primarily from Southern Ontario, Canada (Figure 4.1).
Sites were located in two Ecological Land Classification of Canada Ecozones: 1) the Mixedwood Plains Ecozone (97%) characterized by glacial deposits and sedimentary rock, a mixture of deciduous and coniferous vegetation, warm summers, and cool winters; and 2) the southern Boreal Shield Ecozone (3%) characterized by thin gravel soils and bedrock, coniferous and some deciduous vegetation interspersed with wetlands and lakes, short warm summers, and long cold winters (Statistics Canada 2018).

We compiled a dataset of 2550 unique stream sites and 5949 unique sampling events (i.e., a unique point in space and time) with standardized site codes and taxonomy using FME Desktop 2019 (Safe Software 2019); this dataset was reduced for analysis in Section 4.2.2.1 below. Data were compiled from the multi-agency Flowing Waters Information System (FWIS; 24% of sampling events), the Ontario Ministry of the Environment (MOE; 35%), the Ontario Benthos Biomonitoring Network (OBBN; 16%), and the Toronto and Region Conservation Authority (TRCA; 25%). Although sources vary in sampling methods, they produce similar results when deriving indices commonly used in biomonitoring programs (Borisko et al. 2007). Further sampling details are found in Appendix E.

4.2.1.2 Landscape variables

To assess the importance of landscape variables for BMI metrics, we extracted nine environmental and eleven land cover variables for each site (Table 4.1). Environmental variables were meant to provide landscape context for underlying gradients in stream characteristics (e.g., geology, precipitation), whereas land cover variables were those more associated with anthropogenic activities (e.g., % urban cover).
In a subset of sites in the Toronto region, we also incorporated eigenvector-based spatial variables representing network (AEM, described below) and straight-line distances (MEM, described below) between sites, and headwater variables representing upstream headwater conditions, to determine how this addition affects predictive capacity. We chose this region for practical reasons since sampling of these stream networks is relatively dense and there were existing spatial data for generating AEM, MEM, and headwater variables (Chapter 2).

To calculate landscape variable statistics, we first delineated catchments for our sites as a polygon overlaying all pixels topographically contributing to a site based on a digital elevation model (DEM). For broadest applicability, we used the Ontario Integrated Hydrology Data (OIH; OMNRF 2016). Briefly, sites were “snapped” to the nearest stream network point (OIH layer Stream Grid) using a tolerance threshold of 40 m or were manually adjusted to a known location. Catchments were generated for these points using the OIH layer Enhanced Flow Direction. These two processing steps were done using Arc Hydro Tools in ESRI ArcMap 10.3 (ESRI 2014). Then, for gridded data (e.g., elevation and land use), we extracted pixels within the catchment and calculated variable-specific statistics with a custom R script employing spatial libraries cited below; for some statistics such as the mean pixel value, we weighted the pixels by their contribution to the area of the catchment. For vector data (e.g., roads), we intersected the catchments with the vector data and calculated variable-specific statistics (e.g., road density). The custom R scripts for this processing used ‘exactextractr’ 0.1.0, ‘gdalUtils’ 2.0.1.14, ‘lwgeom’ 0.1-5, ‘raster’ 2.8-19, ‘rgdal’ 1.3-6, ‘rgeos’ 0.4-2, ‘SDMTools’, ‘sf’ 0.7-2, and ‘sp’ 1.3-1 and dependencies (Pebesma and Bivand 2005, VanDerWal et al. 2014, Bivand and Rundel 2018, Greenberg and Mattiuzzi 2018, Pebesma 2018a, 2018b, Baston 2019, Hijmans 2019).

For environmental variables, we calculated statistics for catchment elevation, geology, physiography, and site temperature and precipitation. For elevation statistics (mean, standard deviation, and range of catchment elevation), we used the OIH Filled DEM. For geology, we calculated % catchment area occupied by Quaternary surficial geology classes and then calculated the Baseflow Index (BFI), a geologically-based index of the groundwater contribution to stream flow (see Appendix B for calculation; G Model in Neff et al. 2005). For physiography, we calculated % catchment area occupied by physiographic features (e.g., beaches, eskers, and rock ridges) and conducted a principal components analysis (PCA) on the site by feature matrix using R’s ‘vegan’ 2.5-4, extracting scores along the first two components as predictors (10% and 9% of the variation, respectively) (Legendre and Legendre 2012, Oksanen et al. 2019). For climate, we extracted site point estimates of the annual mean temperature and total precipitation from annual grids provided by Natural Resources Canada, taking the mean over the time period 1995–2010. See Appendix B for data sources and analysis details.

For land cover variables, we calculated statistics for land use, satellite indices of greenness and urban cover, and road density. For land use, we calculated % catchment area occupied by land classes from a custom-generated grid from two land cover grids ca. 2011. Using these % classes, we calculated the Landscape Disturbance Index (LDI) sensu Stanfield et al. (2009), a weighted proportion of “disturbed” land cover, where low LDI indicates a relatively undisturbed catchment. We also calculated landscape configuration metrics of patch density (i.e., how many urban patches are present per unit area) and patch cohesion (i.e., urban patch clumping) for the urban landscape category.
For the configuration metrics, the algorithm produced a “not applicable” if there was no urban cover in the catchment. Rather than discard the site from the random forest model, we inserted a “0” for these indices signifying that any urban cover less than the land cover grain (i.e., 30 m) would be greatly disaggregated. To account for temporal differences in land cover, we also calculated satellite-derived images of annual maximum Normalized Difference Vegetation Index (NDVI) and annual median Normalized Difference Built-up Index (NDBI). Higher values of NDVI are associated with higher greenness (or vegetation density) whereas higher values of NDBI are associated with higher densities of urban cover (Zha et al. 2003, Pettorelli et al. 2005). Each sample event took on the mean of catchment values of NDVI and NDBI in the year that the site was sampled. For road density, we intersected the catchment boundary with a road layer and calculated the density (units: length [km] / area [km²]). See Appendix B for data sources and analysis details.

To determine the effect of incorporating spatial and headwater variables on predictive capacity, we constructed two sets of eigenvector-based spatial variables and three headwater variables for a subset of sites coincident with the analysis in Chapter 2 (i.e., in the TRCA jurisdiction and sites sampled after 2009; nsite = 184). We fit these sites to the spatial stream network to derive topological and geographical information for input to subsequent analyses. The first set of eigenvector-based variables came from an asymmetric eigenvector maps (AEM) analysis; this constructs spatial variables through singular value decomposition (analogous to principal components analysis) of a weighted connectivity matrix representing the spatial connections (i.e., upstream → downstream) and distances between sites (Blanchet et al. 2008, 2011, Legendre and Legendre 2012, Borcard et al. 2018).
For the AEM, we used inverse-distance weights of the stream lengths sensu Blanchet et al. (2011) to represent ease-of-travel between sites. Similarly, the second set of eigenvector-based variables came from a distance-based Moran’s eigenvector maps (MEM) analysis which used straight-line geographic distances (Dray et al. 2006, Legendre and Legendre 2012, Borcard et al. 2018). For headwater variables, we developed several ways of estimating headwater condition at the sites. First, we used the best predictive model from Chapter 2 to generate predictions at the sites, representing the moving average estimate of HCI (HCIma). Since this moving average would incorporate the effects of larger-order streams, we also isolated headwaters in two ways. We calculated the mean of all reaches < 95th percentile catchment area of headwater drainage features from Chapter 2 (< 2.9 km²), weighted either 1) by their individual contributions to the total catchment area of the site (HCIArea) or 2) by their distance from the site (HCIDistance).

4.2.2 Data analysis

4.2.2.1 Data selection and processing

We constrained our dataset by taxonomy, sampling months, and geography. To control for varying taxonomic resolution across datasets, we used the OBBN/OSAP 27-group taxa which is a mixture of Classes, Orders, sub-Orders, and Families (Jones et al. 2004, Stanfield 2017). We included an additional four taxa to better link datasets (Arachnida, Clitellata, Other, Worms; Appendix E). To control for sampling effort across datasets, we performed individual-based rarefaction on each sample to a maximum count of 100 individuals (Gotelli and Colwell 2011).
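As an illustration of the rarefaction step, the following minimal sketch draws a random subsample of individuals without replacement from a single sample. It is not the thesis code: the function name `rarefy`, the taxon counts, and the choice to leave samples with 100 or fewer individuals unchanged are assumptions made for the example.

```python
import random


def rarefy(counts, depth=100, seed=None):
    """Randomly subsample a {taxon: count} sample to at most `depth` individuals."""
    total = sum(counts.values())
    if total <= depth:
        # Assumption: samples at or below the target depth are kept as they are.
        return dict(counts)
    rng = random.Random(seed)
    # Expand the counts to one label per individual, then draw without replacement.
    individuals = [taxon for taxon, n in counts.items() for _ in range(n)]
    drawn = rng.sample(individuals, depth)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in drawn:
        rarefied[taxon] += 1
    return rarefied


# Hypothetical sample of 250 individuals across four taxa.
sample = {"Chironomidae": 140, "Ephemeroptera": 60, "Isopoda": 35, "Oligochaeta": 15}
rarefied = rarefy(sample, depth=100, seed=1)
print(rarefied, sum(rarefied.values()))
```

Fixing the seed makes a draw repeatable; a single random draw is only one realisation, so repeated draws could be averaged if expected abundances are wanted.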
We separated our rarefied data into those gathered from pools (17% of all samples), riffles (53%), composites (22%) (i.e., pool and riffle combined), and those missing information (8%); we included only samples from riffles or composite samples (total: 75% of all samples) since composite samples tend to be mostly from riffles (i.e., 2 riffles and 1 pool per sample). In cases where multiple samples were taken per sampling event, we averaged the rarefied sample abundances to a single sample. We removed taxa absent from > 99% of the sites. We excluded sites in Northwestern Ontario as this region was not well represented (1% of sites). In the case of multiple sampling events per site, we selected the most recent sampling event. Finally, to avoid potentially erroneous delineations of small catchments, we removed sites with catchment areas < 2.5th percentile of the catchment area (< 1.05 km²). Altogether, this processing resulted in a final dataset consisting of 1952 sampling events with 31 taxa spanning years 1995–2016.

We calculated a set of BMI metrics used in biomonitoring programs; these included richness/diversity metrics, % composition of various taxonomic groups, scores along multivariate axes from a principal coordinates analysis (PCoA) of the rarefied abundance data, and the Hilsenhoff family biotic index (HBI) modified to our coarse taxonomy level (Stanfield and Kilgour 2006) (Table 4.2). To examine correlations between BMI metrics, we performed PCA on the site by metrics matrix.

4.2.2.2 Statistical analysis

We used random forest models to investigate the cumulative effects of environmental and land cover variables on our BMI metrics. For each metric, we fit our landscape variables using quantile regression forests with ‘quantregForest’ 1.3.7 in R (Meinshausen 2017). We used 1000 trees, six random variables at each split, and a minimum end node size of five.
A prediction for an observation is generated as the median prediction of all trees in which the observation was not used (i.e., out-of-bag data [OOB], generally 1/3 of trees). In this way, random forest analysis has implicit cross-validation, and predictive capacity can be assessed using the goodness-of-fit between OOB predictions and observations (e.g., using R² of a linear model as our diagnostic). We evaluated the importance of each predictor variable to each BMI metric as the % increase in MSE, which in this case is the increase in MSE if a predictor variable is randomly permuted, averaged across all trees.

For hindcasting, we kept the environmental variables identical but fixed land cover variables to represent minimal anthropogenic disturbance. Hindcasted predictions were generated by “dropping” the environmental and adjusted land cover matrix (e.g., minimally disturbed) data down the classification trees of the random forest; the hindcasted prediction was calculated as the median across all trees and represents the expected BMI metric under minimally disturbed conditions. Hindcasted deviations were calculated by subtracting the observed BMI metric from the hindcasted prediction. In this way, we track how a site deviated from expected values at a given baseline of anthropogenic disturbance (i.e., 0% disturbance). We monitored the proportion of sites that deviated from the mean deviation by 1 SD and 2 SD at each level, classifying them as “moderately impaired” and “impaired” depending on the metric. We evaluated the effects of different baseline conditions (e.g., 10% disturbed versus 50%) by generating separate prediction sets and increasing land cover variables from zero to 50% of each land cover variable’s maximum value by intervals of 10% (i.e., five sets where all land cover variables increase by the same relative amount).
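The deviation and classification steps described above can be sketched as follows. This is a simplified illustration, not the thesis code: the values are invented, the sample SD is an assumption, and the classification is two-sided, whereas the text indicates the direction of impairment depends on the metric.

```python
from statistics import mean, stdev


def classify_deviations(observed, hindcasted):
    """Return hindcasted deviations and a label for each site."""
    # Deviation as described in the text: hindcasted prediction minus observed value.
    deviations = [h - o for o, h in zip(observed, hindcasted)]
    centre = mean(deviations)
    spread = stdev(deviations)  # assumption: sample SD of the deviations
    labels = []
    for d in deviations:
        distance = abs(d - centre)  # assumption: two-sided distance from the mean
        if distance > 2 * spread:
            labels.append("impaired")
        elif distance > spread:
            labels.append("moderately impaired")
        else:
            labels.append("within range")
    return deviations, labels


# Hypothetical metric values for eight sites.
observed = [10, 12, 11, 9, 10, 11, 10, 2]
hindcasted = [11, 12, 12, 10, 11, 11, 11, 12]
deviations, labels = classify_deviations(observed, hindcasted)
for d, label in zip(deviations, labels):
    print(d, label)
```

For a metric where only one direction signals degradation, the absolute distance would be replaced with a signed comparison against the mean deviation.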
At each interval, we calculated Pct.NNat as the sum of Pct.Agri and Pct.Urba since their sum is highly correlated with Pct.NNat in the dataset (Pearson’s r = 0.99) (see Table 4.1 for landscape variable definitions). For the spatial and headwater variables, we generated a base model for the TRCA spatial subset and then added the variables (AEMs, MEMs, HCIm, HCIArea, HCIDistance) in separate random forest analyses for monitoring the change in predictive capacity across all metrics. For these models, we increased the number of trees to 5000 in order to increase the probability of all AEMs and MEMs being included in the random splits.

4.3 Results

4.3.1 Landscape variables and BMI community metrics

We examined varied BMI communities across many sizes, compositions, and configurations of stream catchments (Table 4.1 and Table 4.2). For landscape variables, catchments had a median area of 17 km² with a 2.5th and 97.5th percentile range of 1.14–251 km² and were dominated by non-natural cover (median: 81% of catchment area) of varying compositions that were mostly agriculture (median: 61% of catchment area) and urban (median: 7% of catchment area). For BMI metrics, sites were dominated by Insecta (median: 84% of organisms) of varying compositions that were mostly Chironomidae (median: 30% of organisms). However, non-Insecta could range near 87% of organisms (97.5th percentile); this group was not dominated by any particular sub-group (e.g., Amphipoda [i.e., freshwater shrimp] or Isopoda [i.e., sowbugs]).

For our community PCoA, we examined the first two axes explaining 24% and 14% of the variation, respectively (Table E.2 and Figure E.1). Sites with high PCoA1 values had community compositions with relatively high abundances of Ephemeroptera (i.e., mayflies) and Trichoptera (i.e., caddisflies) whereas sites with low values had relatively high abundances of Isopoda, Oligochaeta (i.e., aquatic earthworms), and Chironomidae (i.e., non-biting midges).
Sites with high PCoA2 values were associated with relatively high abundances of Amphipoda and Isopoda whereas sites with low values had relatively high abundances of Chironomidae and Oligochaeta.

From the PCA on the BMI community metrics, we found that some metrics were redundant for classifying communities (Figure E.2). For example, Pct.EPT and HBI were negatively correlated with each other and positively and negatively correlated with PC1, respectively. Pct.Dipt and Pct.Chir were also positively correlated with each other and positively correlated with PC2. Some metrics were moderately correlated with these variables and axes. For example, Shannon and Simpson diversity metrics and Evenness were moderately positively correlated with Pct.EPT but negatively with Pct.Chir. Similarly, Pct.NInse and associated metrics such as Pct.Isop or Pct.IG were moderately positively correlated with HBI but negatively with Pct.Chir (Figure E.2).

4.3.2 Predictive capacity for BMI metrics and predictor importance

Our random forest analysis showed high variability in how BMI metrics could be predicted by landscape variables (Figure 4.2; OOB prediction error R² median and range: 0.21 [0.10–0.53]). The BMI metrics best predicted by landscape variables were PCoA1 (OOB R² = 0.53), Pct.EPT (OOB R² = 0.44), and HBI (OOB R² = 0.39). The random forest analysis also showed wide variability in predictor importance across BMI metrics (Figure 4.3 and Table 4.3). Overall, landscape variables had a median and range importance of 27% increase in MSE (12–70%); environmental variables had the same median and range importance of 27% increase in MSE (12–70%) whereas land cover variables had a median and range importance of 26% (12–38%). For environmental variables, sampling month, mean elevation, and mean temperature had the highest median variable importance; 36%, 34%, and 34% increase in MSE, respectively.
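The % increase in MSE reported here rests on permuting one predictor at a time and measuring how much prediction error grows. The Python sketch below illustrates that idea in its simplest form. It is an assumption-laden illustration and not the importance calculation used in this analysis: it permutes a column for a fixed toy predictor rather than working tree by tree on out-of-bag data, it reports a plain percentage relative to the unpermuted MSE, and the names `pct_increase_mse` and `toy_predict` and the simulated data are hypothetical.

```python
import random


def mse(y, yhat):
    """Mean squared error between observations and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)


def pct_increase_mse(predict, X, y, col, n_repeats=20, seed=1):
    """Mean % increase in MSE after randomly permuting column `col`."""
    rng = random.Random(seed)
    base = mse(y, [predict(row) for row in X])
    increases = []
    for _ in range(n_repeats):
        # Shuffle one predictor across sites, leaving the others intact.
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        permuted = mse(y, [predict(row) for row in X_perm])
        increases.append(100.0 * (permuted - base) / base)
    return sum(increases) / n_repeats


# Hypothetical toy data: the response depends on predictor 0 only.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(50)]
y = [2.0 * row[0] + rng.gauss(0.0, 0.1) for row in X]


def toy_predict(row):
    """Stand-in for a fitted model; it ignores predictor 1 entirely."""
    return 2.0 * row[0]


importance_used = pct_increase_mse(toy_predict, X, y, col=0)
importance_unused = pct_increase_mse(toy_predict, X, y, col=1)
```

In this toy case, permuting the predictor that the model ignores leaves the error unchanged, whereas permuting the predictor that drives the response inflates it substantially.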
For land cover variables, mean NDVI, standard deviation NDVI, and % non-natural had the highest median importance; 29%, 28% and 28% increase in MSE, respectively. We visualized relationships between well-predicted BMI metrics and landscape variables using partial variable plots (Figure 4.4); these plots sequentially fix values through the range of a predictor, holding all others at their respective values, for visualizing the shape and magnitude of the independent relationship between a BMI metric and landscape variable. Since care should be taken interpreting predictor extremes due to generally insufficient data, we plotted predictor data between the 10th and 90th percentiles. For PCoA1, more Ephemeroptera and Trichoptera were expected at sites with higher elevation ranges (25–125 m) but the relationship appears to diminish at higher elevation ranges (> 125 m). More Ephemeroptera and Trichoptera were also expected at sites with higher catchment vegetation density (i.e., higher mean NDVI) but the relationship was much weaker. For % Ephemeroptera, Plecoptera, and Trichoptera, like the relationship with PCoA1, higher values were associated with larger elevation ranges and there was a weak positive relationship with increased urban density. For HBI, lower values were associated with sites sampled in late summer to autumn whereas higher values were associated with higher LDI values in the range of 0–30 LDI.

4.3.3 Hindcasted deviations

The hindcasted deviations from our random forest analysis showed that most sites fell within the range of natural variation, defined as [-1, 1] SD of the standardized hindcasted deviations (Figure 4.5). A median and range of 72% (61–91%) of data fell in this natural range across all metrics whereas 22% and 5% of sites fell beyond 1 and 2 SD away, respectively, from the mean standardized hindcasted deviations (i.e., 0).
These proportions did not strongly deviate when we used the range of baseline scenarios (e.g., 0–50% of the maximum value of the land cover variables).

4.3.4 Incorporating spatial and headwater variables in a regional subset

In our TRCA subset, we found that predictive capacity was lower (OOB R² median and range: 0.12 [0.02–0.37]) than the regional model (OOB R²: 0.21 [0.10–0.53]) and that the rank order of best-predicted metrics was quite different. For example, Pct.EPT was ranked much lower in the TRCA subset (14th vs. 2nd) and PCoA1 was ranked lower (5th vs. 1st) whereas HBI remained high ranking (1st vs. 3rd) and some metrics ranked much higher, like Pct.Amph (2nd vs. 17th) and Richness (4th vs. 13th); this suggests inter-regional variation in landscape variables driving differences in BMI metrics but that some metrics may be useful across scales (e.g., HBI). Interestingly, however, the correlation between hindcasted deviations from the TRCA subset and those for the same sites from the regional model was high (median and range of Spearman’s ρ across all BMI metrics: 0.95 [0.70–0.98]).

The importance of spatial proxies for unmeasured variables (i.e., AEMs and MEMs) affecting predictive capacity was BMI metric-specific (Figure 4.7). Overall, incorporating AEM or MEM spatial variables did not increase predictive capacity based on paired one-tailed t-tests for OOB R² between BMI metrics with and without AEMs or MEMs (all p > 0.05). For AEMs, some metrics with the lowest base predictive capacity increased their OOB R² by 200%, but these values were still low overall (OOB R² < 0.15). For the best predicted metric, HBI, OOB R² increased by 0.05 (+9%) with AEM variables; broad-scale (e.g., AEM3) and moderate-scale variables (e.g., AEM40) were the most important (Figure 4.8). For MEMs, the BMI metrics that gained most from these spatial variables were Pct.Isop (+51%), Pct.EPT (+46%), and Pct.Dipt (+34%) but again the overall OOB R² was still low (OOB R² < 0.12).
For HBI, OOB R² increased by 0.01 (+3%) and again, the most important MEM variables were broad-scale (e.g., MEM2) and moderate-scale (e.g., MEM28) (Figure 4.9).

Like spatial variables, headwater variables did not increase predictive capacity on average and their effect depended on the BMI metric of interest (paired one-tailed t-tests p > 0.05 for OOB R² between BMI metrics and the three HCI measures). Median change in predictive capacity was 4% for HCIm, 1% for HCIArea, and 8% for HCIDistance. The BMI metric with the highest absolute increase in OOB R² was Pct.Chir (increase of 0.05) with an 89% increase in predictive capacity when HCIDistance was included. Examining partial variable plots (not shown) revealed a 10% decline in Pct.Chir moving from HCI values of poor condition headwaters (HCI ~ 0.30) to average quality headwaters (HCI ~ 0.55).

4.4 Discussion

In this paper, we used a spatiotemporally extensive dataset to generate a suite of BMI metrics, related them to environmental and anthropogenic landscape variables, and generated hindcasted predictions for evaluating how sites deviated from reference conditions. We found that three BMI metrics were best predicted: a community gradient representing shifts from tolerant to sensitive taxa (PCoA1), Pct.EPT, and HBI (Figure 4.2). Importance of individual predictors was BMI metric-specific with no dominant predictors as indicated by the overlapping importance values across all predictors (Figure 4.3). Across all metrics, we found that the majority of sites fell within 1 SD of the mean hindcasted deviation with decreasing frequency beyond 1 SD and 2 SD (Figure 4.5). Urban regions tended to have higher frequencies of impaired sites (e.g., HBI deviations > 1 SD and 2 SD); for example, the Toronto region along the northwestern portion of Lake Ontario and the Ottawa region in the northeastern portion of our study region (Figure 4.6).
Finally, we found that incorporating spatial variables or headwater variables did not substantially increase predictive capacity in our regional subset (Figure 4.7). Overall, the random forest hindcasting framework was useful for understanding environmental and anthropogenic factors impacting BMI metrics and deviations from reference conditions.

Community metrics are typically derived for comparison with a set of reference conditions (Reynoldson et al. 1997, Stoddard et al. 2006, Hawkins et al. 2010). Our hindcasted predictions should be considered “least disturbed conditions” representing the “best available conditions given today’s landscape” since they are model-generated from contemporary observations (Stoddard et al. 2006). Accuracy of the hindcasted predictions will depend on the predictive capacity of the model itself (i.e., OOB R²). Our BMI metrics ranged widely in OOB R² from 0.10–0.53; three metrics (i.e., PCoA1, Pct.EPT, and HBI with OOB R² ≥ 0.4) had reasonably high predictive capacity and were comparable to R² values from other studies in the region (Kilgour and Stanfield 2006, Wallace et al. 2013). Although PCoA1 was best predicted and represented a disturbance gradient contrasting tolerant with sensitive communities (i.e., high proportions of Chironomidae and Oligochaeta vs. Ephemeroptera and Trichoptera), multivariate analyses are more useful for exploration than as indicators since dataset changes require re-analysis and interpretation. Rather, community indicators like Pct.EPT and HBI are more interpretable and usually are within fixed ranges (e.g., Pct.EPT falls between 0–100%). Pct.EPT and HBI (or analogues) are regularly used as indicators of disturbance in urban and rural settings as they represent the loss of sensitive/intolerant organisms – Pct.EPT is expected to decrease with disturbance and HBI is expected to increase (Klemm et al. 2003, Cuffney et al. 2010, Coles et al. 2012, Johnson et al. 2012).
For example, across nine large metropolitan areas similar to Toronto, EPT richness and the community tolerance index (analogous to HBI) consistently and strongly responded to urban development (Cuffney et al. 2010, Coles et al. 2012). Importantly, confounding environmental gradients may alter interpretation of how anthropogenic gradients affect these metrics (Reece et al. 2001, Cuffney et al. 2010, Johnson et al. 2012, Booth et al. 2016). Our attempt to include environmental and anthropogenic gradients by expanding geographic extent and relaxing model assumptions allowed for reasonably good landscape models for generating hindcasted predictions.

To understand cumulative effects and to generate hindcasted predictions, practitioners seek predictors that drive indicator variability (Klemm et al. 2003, Baker et al. 2005, Kilgour and Stanfield 2006, Cuffney et al. 2010, Théberge 2016). However, given that different models may be required to account for different indicator numerical boundaries, statistical distributions, or the shapes of relationships with predictors (e.g., non-linear models), comparison of predictor importance may be non-trivial. Some practitioners aggregate response metrics to a single index that is subsequently related to predictors, and, although attempts have been made to standardize multimetric procedures (Stoddard et al. 2008), geographic and temporal constraints, weighting schemes, and inclusion criteria may increase complexity of interpretation or compound errors associated with estimating individual metrics (Reece et al. 2001, Klemm et al. 2003, Angradi et al. 2009, Théberge 2016). For this reason, we examined the importance of predictors under a common statistical framework allowing for uniform interpretation and relative ranking of predictors across all of our BMI metrics. We could also visualize marginal relationships between individual predictors and indicators while accounting for other predictor variables.
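A marginal relationship of this kind can be sketched as follows: fix one predictor at each value along a grid, keep every other predictor at its observed values, and average the predictions. The Python code below is a minimal illustration under stated assumptions, not the plotting code behind Figure 4.4; the function names, the toy response that rises with the first predictor and then plateaus at 30, and the simulated sites are all hypothetical.

```python
def percentile_grid(values, lo=0.10, hi=0.90, n=9):
    """Evenly spaced grid between two empirical quantiles of `values`."""
    ordered = sorted(values)
    low = ordered[int(lo * (len(ordered) - 1))]
    high = ordered[int(hi * (len(ordered) - 1))]
    return [low + (high - low) * i / (n - 1) for i in range(n)]


def partial_dependence(predict, X, col, grid):
    """Average prediction when column `col` is fixed at each grid value.

    All other predictors keep their observed values for every site.
    """
    curve = []
    for value in grid:
        preds = [predict(row[:col] + [value] + row[col + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_predict(row):
    """Hypothetical fitted model: rises with predictor 0, flat above 30."""
    return 5.0 + 0.05 * min(row[0], 30.0) + row[1]


# Hypothetical sites: predictor 0 spans 0-100, predictor 1 is constant.
X = [[float(i), 1.0] for i in range(101)]
grid = percentile_grid([row[0] for row in X])
curve = partial_dependence(toy_predict, X, col=0, grid=grid)
```

Restricting the grid to the 10th to 90th percentiles mirrors the choice made for the partial variable plots, where predictor extremes are excluded because data there are sparse.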
In this way, a practitioner can seek an indicator most likely to respond to a planned catchment activity. LDI was the most important anthropogenic variable for HBI in our study (Table 4.3 and Figure 4.4). Examining HBI in Figure 4.4, if increased urban or agricultural cover increased catchment LDI from 10 to 40, one would expect HBI to increase between LDI 10 and LDI 30 but remain at similar levels from LDI 30 to LDI 40. However, the marginal effect of sampling month is similar in range to the marginal effect of LDI, suggesting that timing of measurements is just as important (Figure 4.4). Similarly, sampling month is most important for Pct.EPT but has a larger marginal effect than NDBI (Figure 4.4). This exercise can be used to generate hypotheses for testing more specific empirical models (Jones et al. 2017, Kuglerová et al. 2019). Here, we focused more on general results as they relate to predictive capacity and hindcasted deviations rather than metric-specific relationships.

Hindcasted predictions were generated as an alternative to specifying reference sites, assuming that a larger geographic extent would capture more catchment variability. Our data were certainly biased towards disturbed catchments, with a median of 80% non-natural cover and a 2.5th percentile of nearly 25%. However, we did have catchments at the extremes of urban and agricultural cover (i.e., near 0 and near 100% cover; Table 4.1). Because of this land cover bias, we evaluated a site’s deviation in the context of a distribution of other site deviations (i.e., standardized hindcasted deviations; Figure 4.5), similar to Kilgour and Stanfield (2006). It was unsurprising that most sites fell within 1 SD of the mean deviation across all metrics given the moderate predictive capacity of our models.
Across the landscape, we were also unsurprised that high deviation sites (beyond 1 SD) were found either inside or outside of urban/agricultural regions, depending on the metric’s expected response to these stressors, given the well-known threats that anthropogenic activities pose to the physical and ecological condition of receiving waterbodies (Paul and Meyer 2001, Vörösmarty et al. 2010, García et al. 2017). Our hindcasted predictions are likely conservative since there is a large amount of unexplained variation across all models and since each prediction incorporates this uncertainty. Our landscape models only used coarse landscape variables measured at catchment scales. Incorporating sample-, site-, or reach-scale predictors could serve to further account for confounding temporal or spatial factors and further resolve the classification of sites. In the absence of such measures, however, we used spatial variables (i.e., representing the similarity of sites based on network or straight-line distances) or a headwater variable (i.e., representing the cumulative condition of upstream headwaters) as proxies for these predictors.

Incorporating spatial dependency can change predictive capacity and the interpreted relationships of predictors for an indicator. Spatial dependency can be incorporated in several ways including adjusting covariance structures (e.g., geostatistics), adding covariates expected to account for spatial dependency (e.g., geology), or adding covariates representing distances between sites (e.g., spatial eigenanalysis) (Legendre and Legendre 2012). Recent spatial improvements to random forest modelling include incorporating geographic buffers for interpolation that show comparable results with geostatistical techniques (Hengl et al. 2018), but we were interested in understanding if any important spatial gradients changed predictive capacity or required more investigation.
We expected that incorporating spatial variables would explain similarity between sites not captured by our landscape variables (e.g., similarity in channel substrates, morphology, temperature, and chemistry). We were surprised to find that there was little change in predictive capacity, and in fact, incorporating these variables sometimes reduced predictive capacity likely by adding more variability to the dataset. This suggests that, overall, our covariates captured most of the spatial variability between sites.

Despite finding that spatial variables did not substantially change predictive capacity, some AEMs and MEMs were still important for individual metrics. Incorporating AEMs or MEMs led to only a small increase in predictive capacity (< 10%) for the best-predicted metric, HBI. For AEMs and MEMs, it is expected that spatial variables explain broader- to finer-scale variation as eigenvalues decrease (Blanchet et al. 2011). The broad-scale spatial variables (high eigenvalue AEMs/MEMs) tended to be most important and appeared to represent between-network clustering. Using HBI as an example, there was obvious spatial clustering in HBI values related to AEM3 and MEM2 which represented broad-scale patterns in network distances and straight-line distances, respectively (Figure 4.8, Figure 4.9). The moderate-scale spatial variables (AEM40 and MEM28) were more difficult to interpret but represented some clustering between sites when mapped (MEM28); spatial eigenvectors can be more difficult to interpret when the sampling is not uniform (Blanchet et al. 2008, Legendre and Legendre 2012).

The three headwater variables did not substantially improve predictive capacity but were important for our PCoA2 gradient and Evenness (> 20% increase in MSE). Headwater variables were estimated from models that incorporate similar landscape variables (models not shown).
It is likely that, despite attempting to weight headwater variables by distance or catchment area, at these large catchment scales headwater heterogeneity is less important and that their values are increasingly correlated to whole-catchment landscape variables. More study is needed on how to incorporate headwater heterogeneity in downstream community models given their density and ecological significance within watersheds (Gomi et al. 2002, Meyer et al. 2007, Wipfli et al. 2007, Richardson 2019a). Overall, adding spatial variables did not substantially change predictive capacity but was useful since different spatial structures were important for different metrics; mapping this variability may be key for informing practitioners which scales require further investigation.

Our approach used a uniform statistical framework for generating cumulative effects models that can also be used for hindcasting reference conditions. We also showed that incorporating spatial variables can provide useful information for individual metrics even if the gains in predictive capacity were low. In regions where practitioners have limited availability of reference sites or limited resources to establish and sample reference sites, using a hindcasting approach with larger regional datasets could be invaluable. Furthermore, random forest models require less expert knowledge to fit but can incorporate complex model structures if required (e.g., here, with spatial eigenfunction analysis and headwater variables or interpolation sensu Hengl et al. 2018). Hindcasting, generating contemporary predictions for unmeasured sites, or forecasting can be completed with minimal extra effort although care should be taken when data are outside of dataset boundaries where geostatistical models tend to have much more power (Hengl et al. 2018). Ultimately, random forest models should be used to guide complementarily robust empirical models.
For biomonitoring, regionally informed random forest models can provide reasonable expected community conditions before getting any boots wet.

4.5 Tables

Table 4.1 Landscape variables measured at the catchment-scale used in random forest models. The term “masl” indicates “metres above sea level”. Summary statistics are percentiles for the predictor.

Variable grouping | Variable | Abbreviation | Summary statistics P50% (P2.5%, P97.5%)
Environmental | Baseflow Index (unitless) | BFI | 0.58 (0.34, 0.89)
Environmental | Catchment area (log10(area [km²] + 0.1)) | Area.Log10 | 1.22 (0.06, 2.40)
Environmental | Catchment elevation mean (masl) | Elev.Mean | 239.29 (86.21, 448.50)
Environmental | Catchment elevation range (masl) | Elev.Rang | 84.46 (17.81, 293.53)
Environmental | Catchment elevation standard deviation (m) | Elev.StDe | 17.17 (2.77, 65.45)
Environmental | Mean precipitation (mm; mean annual precipitation 1995–2010) | Precip.Mean | 915.38 (832.06, 1081.00)
Environmental | Mean temperature (°C; mean annual mean 1995–2010) | Temp.Mean | 7.45 (6.29, 8.94)
Environmental | PCA scores of % physiography (unitless): Axis 1 scores | Phys.PC1 | -0.07 (-0.33, 0.72)
Environmental | PCA scores of % physiography (unitless): Axis 2 scores | Phys.PC2 | 0.08 (-0.76, 0.46)
Environmental | Sampling month | Samp.Month |
Land cover | % Agriculture | Pct.Agri | 61.20 (0.94, 91.53)
Land cover | % Non-natural | Pct.NNat | 80.53 (23.94, 98.62)
Land cover | % Urban | Pct.Urba | 7.12 (1.02, 94.37)
Land cover | Landscape disturbance index (unitless) | LDI | 10.41 (2.31, 78.38)
Land cover | Mean annual maximum NDVI (unitless) | NDVI.Mean | 0.69 (0.43, 0.76)
Land cover | Mean annual median NDBI (unitless) | NDBI.Mean | -0.215 (-0.33, -0.09)
Land cover | Patch cohesion index | Patch.Cohes | 9.20 (4.60, 9.90)
Land cover | Patch density | Patch.Dens | 6 × 10⁻⁵ (2 × 10⁻⁷, 3 × 10⁻⁴)
Land cover | Road density (log10(road density [km/km²] + 0.1)) | Road.Log10 | 0.17 (-1.00, 1.00)
Land cover | Standard deviation annual maximum NDVI (unitless) | NDVI.StDe | 0.106 (0.06, 0.21)
Land cover | Standard deviation annual median NDBI (unitless) | NDBI.StDe | 0.127 (0.08, 0.17)

Table 4.2 Benthic macroinvertebrate metrics used as response variables in random forest models. Summary statistics are percentiles for the BMI metric.

BMI metric | Abbreviation | Summary statistics P50% (P2.5%, P97.5%)
% Amphipoda | Pct.Amph | 0.50 (0.00, 56.11)
% Chironomidae | Pct.Chir | 23.00 (0.75, 79.42)
% Diptera | Pct.Dipt | 30.59 (3.00, 85.00)
% Ephemeroptera, Plecoptera, and Trichoptera | Pct.EPT | 19.52 (0.00, 78.84)
% Insecta | Pct.Inse | 84.50 (12.89, 99.80)
% Isopoda | Pct.Isop | 0.00 (0.00, 56.00)
% Isopoda and Gastropoda | Pct.IG | 1.00 (0.00, 58.50)
% Mollusca | Pct.Moll | 0.50 (0.00, 31.93)
% Non-Insecta | Pct.NInse | 15.50 (0.20, 87.11)
% Worms | Pct.Worms | 2.00 (0.00, 52.00)
Evenness | Evenness | 0.67 (0.33, 0.84)
Hilsenhoff Biotic Index | HBI | 5.86 (4.03, 7.66)
Principal coordinates analysis scores (Hellinger distance matrix) | PCoA1 & PCoA2 | PCoA1: -0.019 (-0.53, 0.56); PCoA2: 0.02 (-0.57, 0.41)
Richness | Richness | 10.00 (5.00, 15.00)
Shannon Diversity | Shannon | 1.55 (0.60, 2.07)
Simpson Diversity | Simpson | 0.71 (0.26, 0.84)
Simpson Diversity Index | SimpsonDiv | 0.29 (0.16, 0.74)

Table 4.3 Landscape variable importance for its most influenced and least influenced BMI metric. Variables are ordered by highest median importance across all BMI metrics. Variable importance as % increase in mean squared error (MSE). Abbreviations can be found in Table 4.1 and Table 4.2.

Landscape variable | Most important for (% MSE) | Least important for (% MSE)
Samp.Month | Pct.EPT (70) | Pct.Moll (13)
Elev.Mean | Pct.Inse (58) | Pct.Moll (21)
Temp.Mean | Shannon (40) | Pct.Worms (16)
Precip.Mean | PCoA1 (46) | Pct.Moll (17)
NDVI.Mean | PCoA1 (38) | Pct.Moll (14)
NDVI.StDe | PCoA1 (34) | Pct.Moll (15)
Pct.NNat | PCoA1 (35) | Pct.Worms (18)
Pct.Agri | PCoA2 (36) | Pct.Moll (19)
Phys.PC1 | PCoA1 (42) | Pct.Worms (20)
Area.Log10 | Pct.EPT (45) | Pct.Moll (14)
Patch.Dens | Richness (32) | Pct.Moll (12)
Phys.PC2 | PCoA2 (40) | Pct.Moll (15)
Elev.Rang | Pct.EPT (57) | Pct.Moll (16)
Pct.Urba | PCoA1 (33) | Pct.Moll (16)
BFI | Pct.Inse (35) | Pct.Moll (12)
Patch.Cohes | PCoA1 (32) | Pct.Moll (15)
LDI | HBI (32) | Pct.Moll (12)
Elev.StDe | PCoA1 (41) | Pct.Moll (17)
Road.Log10 | HBI (30) | Pct.Moll (16)
NDBI.Mean | PCoA1 (35) | Pct.Moll (13)
NDBI.StDe | Pct.EPT (33) | Pct.Moll (13)

4.6 Figures

Figure 4.1 Map of sites (black points; nsite = 1952) used in random forest cumulative effects models and sites (red points; nsite = 184) used for adding spatial predictors.

Figure 4.2 Observed vs. out-of-bag (OOB) predictions for benthic macroinvertebrate (BMI) metrics from random forest models. OOB predictions are the median predicted value for an observation across all trees where the observation was not used (generally 1/3 of trees). Abbreviations are available in Table 4.2.

Figure 4.3 Landscape variable importance (% increase in mean-squared error [MSE]) summarized across all benthic macroinvertebrate metrics. Each point represents the importance of the variable to a single metric. Abbreviations are available in Table 4.1.

Figure 4.4 Partial variable plots for the top three predicted benthic macroinvertebrate metrics, moving downwards, and their top environmental (left) and land cover (right) predictors. The rank order of the variable importance for each BMI metric precedes the x-axis label. For each plot, the relationship is only shown between the predictor’s 10th and 90th percentile (see Section 4.3.2 for explanation).

Figure 4.5 Deviations from hindcasted reference conditions (lowest impact) for each benthic macroinvertebrate (BMI) metric. Intervals represent standard deviations from the mean hindcasted deviation. Interpretation of the direction of deviation will change per metric. BMI metrics are ordered as in Figure 4.2 and descriptions are in Table 4.2.

Figure 4.6 Map of sites coloured by standardized deviations from hindcasted predictions for Hilsenhoff Biotic Index (HBI). Higher values (yellow) indicate higher deviation from hindcasted prediction and indicate higher likelihood of organic pollution according to HBI.

Figure 4.7 Observed vs. out-of-bag (OOB) predictions for benthic macroinvertebrate (BMI) metrics when adding spatial variables to the Toronto region subset. OOB predictions are the median predicted value across all trees in which an observation was not used to generate the tree.

Figure 4.8 Variable importance (% increase in mean-squared error [MSE]) of AEM variables for HBI (left) and spatial representation of HBI values and AEM values for chosen variables (right). For HBI, more shaded points indicated higher HBI values (i.e., more organic pollution). For AEMs, white points indicate positive AEM scores on the axis with the relative size indicating higher scores whereas black points indicate negative scores with higher relative size indicating lower scores.

Figure 4.9 Variable importance (% increase in mean-squared error [MSE]) of MEM variables for HBI (left) and spatial representation of HBI values and MEM values for chosen variables (right). For HBI, more shaded points indicated higher HBI values (i.e., more organic pollution).
For MEMs, white points indicate positive MEM scores on the axis with the relative size indicating higher scores whereas black points indicate negative scores with higher relative size indicating lower scores.

Chapter 5: Incorporating spatial dependencies for increased understanding of stream indicator responses to landscape disturbance across multiple scales.

5.1 Introduction

Freshwater ecosystems are threatened by the cumulative effects of natural and anthropogenic stressors occurring at multiple scales (e.g., local, regional, and global) (Resh et al. 1988, Strayer and Dudgeon 2010, Vörösmarty et al. 2010, Reid et al. 2019). For freshwater practitioners, understanding sources of variation and threats for various indicators is crucial for developing protection and conservation strategies at relevant scales (Reid et al. 2019).

Indicator responses to stressors may differ depending on spatial context (Resh et al. 1988, Utz et al. 2016). In urban streams, for example, relationships between stream chemistry or biological assemblages with urbanization may be consistent but their magnitudes may vary within and between regions (Booth et al. 2016, Utz et al. 2016). For example, urbanization tends to increase ionic strength through point and non-point source contaminants, but regions with naturally high ionic strength may have more chemical and biotic resistance than those regions with naturally low ionic strength (Cuffney et al. 2010, Utz et al. 2016). Many studies use gradients of catchment land use (e.g., urban cover) to understand important landscape drivers of variation in physical and biological ecosystem components (Stanfield and Kilgour 2006, Cuffney et al. 2010, Bazinet et al. 2010, Riva-Murray et al. 2010, Wallace et al. 2013, Baruch et al. 2018, Kielstra et al. 2019, Kuglerová et al. 2019). For example, road density was a strong predictor of benthic macroinvertebrate family richness and a biotic index across Toronto, Ontario streams (Wallace et al.
2013). Most of these studies use stream sites and their catchments as independent observational units for modelling. Fewer studies examine sources of variation for the indicator within those catchments (e.g., longitudinal variation), the scales at which predictors explain variation (e.g., between-catchment or between-site differences), or how incorporating spatial dependencies affects predictive capacity and explained variation. Examining these sources of variation and gains in predictive capacity could provide useful insight into multiscale patterns beyond the predictors themselves.

Indicators can be directly and indirectly linked to landscape variables through abiotic and biotic pathways (King et al. 2005). Some indicators may have strong correlations with landscape variables leading to relatively simple interpretation whereas others may be more nuanced and better explained by a combination of landscape variables and spatial dependencies acting as surrogates for unmeasured variables (i.e., induced spatial dependency versus autocorrelation; Legendre and Legendre 2012). Furthermore, not properly accounting for data dependencies can increase the chance of detecting relationships that are spurious (i.e., type I errors; Zuur et al. 2010). To account for non-independence, additional covariates or adjustments to covariance matrices can be used. Covariates, for example, could be geological settings, distances to disturbance, or scores along multivariate axes that represent distances between sites (Blanchet et al. 2008, Zuur et al. 2010). To avoid overparameterization and increase generalization to other systems, the simplest covariance matrix adjustment would include random effects representing hierarchies of observations (i.e., mixed effects models; Bolker et al. 2009).
In these models, observations can be spatially clustered within their respective hierarchies that are assumed to come from populations at their respective levels (e.g., populations of sites within populations of catchments). However, this clustering is not spatially explicit and does not consider distance relationships between catchments or sites.

Spatial stream network (SSN) models are a good example of how spatial dependencies might contribute to our understanding of stream networks. These are more complex but potentially more informative extensions of generalized linear mixed effects models in that they incorporate fixed effects, random effects, and appropriate spatial relationships based on network topology and distances (Peterson and Ver Hoef 2010, Ver Hoef and Peterson 2010). In this way, SSN models are particularly suited for tracking sources of variation. They can also improve predictions at unmeasured locations since these values are a function of predictors and autocorrelation with nearby sites (i.e., universal kriging; Isaak et al. 2014). By combining this rich spatial framework with landscape gradients, practitioners can tailor analyses to better understand multiscale aspects of stream networks (i.e., on-network and across-network variability; Peterson et al. 2013). From a practical management perspective, examining null sources of variation and how incorporating dependencies affects model parameters and fit can help determine whether simple or more complex models are most suited for predicting indicators at unmeasured locations.

Here, we explored indicator variation within and between stream catchments across a land cover disturbance gradient. Generally, our goals were to examine which landscape and site variables were associated with indicators and to track changes in the sources of variation as these predictors and spatial dependencies were incorporated into models.
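To make the idea of tracking sources of variation concrete in the simplest non-spatial case, the Python sketch below splits the variance of an indicator into between-catchment and within-catchment (between-site) components and reports the share attributable to catchments. It is only an illustration under stated assumptions: it uses a method-of-moments estimator from a one-way random-effects ANOVA on a balanced design, whereas the spatial stream network models considered here additionally estimate spatial covariance components; the function name and the toy values are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Between- and within-group variance for a balanced design.

    groups: dict mapping a catchment label to a list of site values,
        with the same number of sites in every catchment.
    Returns (between-catchment variance, within-catchment variance,
    proportion of total variance that lies between catchments).
    """
    values = list(groups.values())
    k = len(values)
    n = len(values[0])
    if any(len(v) != n for v in values):
        raise ValueError("this sketch assumes a balanced design")
    group_means = [mean(v) for v in values]
    grand_mean = mean([x for v in values for x in v])
    # Mean squares between and within catchments.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for v, m in zip(values, group_means) for x in v
    ) / (k * (n - 1))
    # Method-of-moments estimate, truncated at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    total = var_between + ms_within
    share = var_between / total if total > 0 else 0.0
    return var_between, ms_within, share


# Hypothetical indicator values: two catchments, three sites each.
toy = {"A": [1.0, 2.0, 3.0], "B": [11.0, 12.0, 13.0]}
var_between, var_within, share_between = variance_components(toy)
```

Comparing these components before and after predictors are added shows, in a crude way, whether a predictor accounts for between-catchment or between-site differences.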
For each indicator, we constructed spatial stream network models to determine the best fixed effects and spatial dependency structures followed by tracking variation from a null (i.e., intercept + random effects), to a non-spatial (i.e., fixed effects + random effects), to a spatial model (i.e., fixed effects + random effects + spatial dependencies). For our first response variable, we used digitally generated stream-roadway crossings to model the probability of headwater drainage feature sites (HDF, outlined below) as a function of land cover and spatial dependencies. Predictions from the best model were used to generate a probabilistic stream network for weighting reach-scale predictions of headwater condition index values (Chapter 2) as predictors in subsequent models. For the remaining response variables, we took samples of water chemistry, respiration rate, decomposition rate, and benthic macroinvertebrate communities within catchments of varying land cover disturbance. We expected high between-catchment variability in all our responses since our gradient exploited between-catchment differences in land cover disturbance. We also expected that local-scale landscape variables (e.g., those generated in 100 m-radius circular buffers around sites) and spatial dependencies would tend to explain more between-site differences. If adding spatial dependencies explains null variation and improves predictive capacity, then these complex models are useful for targeting management action at the locations and scales that most influence an indicator.

5.2 Methods

5.2.1 Data collection

5.2.1.1 Probabilistic stream network

To determine the extent of stream networks in our seven catchments, we searched for headwater drainage features (HDFs) at potential stream-roadway crossings.
Under the Ontario Stream Assessment Protocol 9’s S4.M10 “Assessing Headwater Drainage Features”, HDFs are generally sampled at stream-roadway crossings and defined broadly as “a depression in the land that conveys surface flow” (Stanfield et al. 2013). Since the extent of wet streams could be temporally variable, we chose this method to generate a probabilistic stream network based on land cover and landscape variables. Reach-scale predictions were used to weight the influence of individual headwaters and their reach-scale predicted headwater condition index (HCI) values (from Chapter 2) on downstream indicators in subsequent models.

Following methods in Chapter 2, we generated potential stream networks in the seven catchments using a 1 ha accumulation threshold (Figure 5.1; details of catchment sampling outlined below). For each network, we generated paper maps of stream-roadway intersections and used smartphone-based GPS information to locate each potential site. We searched for infrastructure that could convey surface flow under the roadway (i.e., culverts, bridges). If there was any means of surface water connection within approximately 20 m (i.e., maximum of 2 pixels difference on the DEM), then we coded the site as 1 (present); otherwise, it was coded as 0 (absent). Due to a variety of logistical constraints (e.g., access to landowner properties, high density of some road networks in heavily urbanized catchments), we were not able to sample all stream-roadway crossings that would be generated by an intersection of stream and road GIS layers (Table F.1).

5.2.1.2 Stream sampling

To examine the influence of landscape variables and spatial dependencies on downstream ecosystems, we took samples of water chemistry, cotton strips for measuring respiration rate and decomposition rate, and benthic macroinvertebrates (BMI) as indicators. In 2015, we took samples in Ganaraska (GR), Ganatsekiagon (GN), Morningside (MO), and Wilket (WI) catchments.
In 2016, we repeated some sampling at these streams to complete the dataset (e.g., if samples were unusable or lost in the previous year), to examine year-to-year variation in some variables, and to add a new variable (i.e., respiration rate). In 2016, we also filled gaps in our land cover gradient and extended the geographic extent of sampling by adding Rogers (RO), Vaughan (VA), and Pringle (PR) catchments (Figure 5.1).

Up to twelve sites were sampled per catchment, spaced at approximately equal distances based on stream length and following the major branch upstream (Figure 5.1). In some cases, sites were removed or required adjustment due to lack of landowner permission or unsuitable conditions. We felt this equidistant approach to be suitable for examining longitudinal variation since the wide array of site conditions across forested, agricultural, and urban streams did not lend themselves to standardizing by channel types or riparian cover. Nevertheless, we sampled near our predetermined points in zones that tended to be of moderate velocity (i.e., avoiding pools or uncharacteristically high velocity riffles) and that represented average conditions of the reach within approximately 40 m upstream and downstream of a site. In each year, our sampling was conducted from mid-May to early July (i.e., early summer). In total, we sampled 83 sites across seven catchments.

For site chemistry, we took measurements twice per site during cotton strip deployment (see below) using a calibrated YSI EXO SONDE 2 multi-parameter probe (YSI Incorporated, Yellow Springs, OH, USA).
Parameters of interest were: temperature (Temp; °C), specific conductivity (SpCond; µS/cm standardized to 25°C), dissolved oxygen (DO.mgL for mg/L and DO.PercEU for % saturation), pH, turbidity (Turb; formazin nephelometric units [FNU], unitless), Chlorophyll a (Chl-a; µg/L), blue-green algae phycoerythrin (BGA; µg/L), and fluorescent dissolved organic matter (FDOM; quinine sulfate units [QSU] where 1 QSU = 1 µg/L quinine sulfate). At each sampling event, we took continuous measurements at 1 s intervals for approximately 5 minutes and calculated the median value for each parameter.

As an assay for respiration rate and decomposition rate, we deployed eight cotton strips per site for approximately three weeks. In 2015, a wooden stake was driven into the streambed and individual strips were fanned out from the stake using malleable wire and zip ties. In 2016, following the loss of many 2015 cotton strip samples, we improved this method by driving rebar into the streambed and fixing the strips to a nylon string set up longitudinally along the streambed and held down by weights. Strips were collected from downstream to upstream to avoid site disturbance.

For respiration rate of the cotton strips, we took measurements at three sites per catchment in 2016 (most upstream site, a middle site, and the most downstream site) following methods and calculations in Tiegs et al. (2013). Briefly, respiration rate is determined by measuring changes in the dissolved oxygen (DO) of stream water following the two-hour incubation of strips in respiration chambers (i.e., 50 mL centrifuge tubes), accounting for differences in stream DO, strip dry mass, and incubation time (Tiegs et al. 2013). We measured DO with a calibrated WTW Oxi 330 (Xylem Analytics, Weilheim, Germany). After drying the strips for > 48 hours in a 40°C drying oven, we measured strip dry mass to the nearest 0.0001 g using a static-free analytical balance.

For decomposition rate of the cotton strips, we also followed Tiegs et al.
(2013) with modifications detailed in Kielstra et al. (2019) and Chapter 3. Briefly, cotton strips were placed in a drying oven at 40°C for > 48 hours in the laboratory. After drying, approximately 2 cm of the strip ends were painted on both sides with acrylic structure gel to minimize breakage at tensiometer clamps. Strips were stored in a drying oven or desiccator until tensile strength determination. Based on tensiometer availability, strips were analyzed with two different machines in 2015 and 2016, but the methods were identical and reference strip tensile strength was comparable (Appendix F). A strip was placed in tensiometer clamps and pulled upward at 2 cm per minute. Maximum tensile strength before breakage was recorded (units: N) along with breakage characteristics (e.g., middle of strip, at clamp). Decomposition rate was calculated as the percentage loss of tensile strength per degree day (TLDD); degree days (base temperature: 0°C) were calculated using data from simultaneously deployed iButton temperature loggers at each site (temperature resolution: 0.5°C; logging interval: 20 minutes; Maxim Integrated, San Jose, CA, USA). TLDD was used to compensate for temperature effects between years and facilitate comparison with other studies. For sites with missing temperature data (e.g., temperature logger washed away), we used the data from the closest upstream site. At some sites, we had unanticipated logger storage capacity problems that required us to extrapolate temperature readings based on quantile random forest regression-modelled relationships with nearby weather stations (Appendix F).

For BMIs, we collected replicate samples using a Surber sampler at water depths < 0.5 m. This quantitative sampler (organisms/m2) uses a standard grid placed on the stream bottom with a 250 µm mesh net fixed to the frame and positioned open to upstream flow.
Although not suitable for low water-velocity zones, which were present at some sites (e.g., upper reaches of catchments), we felt it more appropriate to use the same sampling device across all sites for consistency. Replicates were taken at three equidistant locations along a transect spanning the stream width. This tended to capture flow variability within the sites. In some cases of very narrow stream width, we took samples longitudinally at equidistant locations moving from downstream to upstream. Sediments were disturbed within the grid and were washed into the net by water flow; at low velocity zones, water was forced into the net by vigorously pushing water from upstream. For each replicate, we retained and preserved captured material/benthos > 500 µm in 80% isopropyl alcohol. In the laboratory, contents were examined under dissecting microscopes (up to 35X) or compound microscopes (up to 400X) and BMIs were identified to the Ontario Benthos Biomonitoring Network (OBBN)/OSAP 27-group taxa or lower using various BMI keys and online resources (e.g., to the family-level for many insect taxa) (Peckarsky 1990, Jones et al. 2004, Merritt et al. 2008, Thorp and Covich 2010, Stanfield 2013). Some samples (9%) could not be examined as they dried out due to improper storage or during transport. For substrate characteristics, we classified substrates within each BMI Surber sample into dominant and subdominant categories using definitions in OSAP 9 S1.M5 (Hogg et al. 2013).

5.2.1.3 Site-scale landscape characteristics

We expected that our indicators would be a function of environmental (e.g., geology, local topography, catchment area) and land cover (e.g., landscape disturbance index [LDI]) landscape variables. We fit our sites to the spatial stream network derived in Chapter 2 and estimated landscape variables as in Chapter 2.
For topography, we calculated the topographic wetness index (TWI) in a 30 m-radius circular buffer; TWI is proportional to catchment area and inversely proportional to local slope and indicates tendency for water accumulation (Beven and Kirkby 1979). For geology, we used the baseflow index (BFI); BFI is a geologically-based estimate of groundwater contribution to streamflow (Neff et al. 2005). For catchment area, we used log10-transformed estimates (original units: m2). For land cover, we used land cover proportions to calculate the land cover-weighted landscape disturbance index (LDI). We also calculated satellite indices including the Normalized Difference Vegetation Index (NDVI; indicator of vegetation density), the Normalized Difference Built-up Index (NDBI; indicator of urban density) and the built-up index (BU; [NDBI – NDVI] is an indicator of urban density controlling for vegetation density) based on 2016 satellite imagery. We also calculated road density (ROAD). Land cover variables were estimated at four spatial scales per site: within a 100 m-radius circular buffer (l_100), within a 1000 m-radius circular buffer (l_1000), within a 50 m-radius buffer extending perpendicularly from each side of the stream along the entire upstream network intersected with their individual reach-contributing areas (r_50), and within its whole catchment (c_). See Appendix B for more detail. Summary statistics for sites in this study are found in Table F.2.

5.2.2 Data analysis

We analyzed our data to determine which landscape and spatial factors best explain variability in our five responses (i.e., HDF presence/absence, water chemistry, respiration rate, decomposition rate, and BMI community composition). Although we took slightly different approaches depending on the response variable data structure (each outlined below), we generally followed the two-stage SSN approach as in Chapter 2.
First, random effects and spatial dependencies were held constant for comparing a set of models with different fixed effects. Second, the best fixed effects were held constant for comparing a set of models with different spatial dependencies. The best model contained the best fixed effects and spatial dependency. Finally, the best spatial model was compared with a null model (i.e., intercept + random effects) and a non-spatial model (i.e., best fixed effects + random effects). All data processing was done using Microsoft R Open 3.5.1 (Microsoft 2018, R Core Team 2018).

For the probabilistic stream network, we used binomial SSN models to model the probability of an HDF at potential stream-roadway crossings based on landscape variables. Our response variable was the presence/absence of an HDF-type site at potential stream-roadway crossings. Our fixed effects were TWI, BFI, catchment area, land cover, and the interaction between land cover and catchment area. Twenty models were fit substituting the land cover variable (5 land cover variables × 4 scales of estimation). For comparing fixed effects, the dependency structure had a random effect of catchment, an exponential tail-up (EXP-TU), an exponential tail-down (EXP-TD), and an exponential Euclidean (EXP-EUC) component. Models were compared with the Brier score (ranging between 0 and 1, where 0 indicates the best fit; analogous to mean-squared error) calculated using the leave-one-out cross validation (LOOCV) predictions and the original observations per model. With the best fixed effects model, we fit and compared 29 spatial models. Finally, we compared the best spatial model with non-spatial and null models. There were 995 potential stream-roadway crossing observations across seven catchments with a varying number of observations per catchment.

For water chemistry, we first used quantile random forest regression to screen chemistry indicators best predicted by landscape variables.
In addition to single water chemistry variables, we included scores from the first two axes of a principal components analysis of the chemistry variables in the event that a gradient of water chemistry was better predicted than a single variable (Figure F.2). We used default settings for the random forest (ntrees = 500, six random variables at each split, and minimum end node size of five), assumed independence of observations, and selected the three best-predicted chemistry variables as those with the highest OOB R2 (model-based implicit cross-validation; see Chapter 3 for detail). For each of the three best-predicted chemistry variables, we used the two-stage SSN approach. We fit the fixed SSN models as above, using a Gaussian distribution and including sampling year as a categorical fixed effect. We also added four models meant to examine the effects of headwater conditions on downstream chemistry using the following approach. The best model from Chapter 2 was used to generate predictions of the HCI, which was aggregated to a reach-scale average for our purposes. We generated four estimates of headwater condition. The first was the simple moving average estimate of HCI (HCIma) for an individual site’s reach. For the next three, we isolated headwaters since HCIma includes larger-order streams. We calculated the weighted mean of HCI for all reaches < 95th percentile catchment area of headwater drainage features from Chapter 2 (< 2.9 km2), with weights based on catchment areas of the reaches (HCIArea), the distance reaches were from the site (HCIDistance), or the predicted probability that the stream is a part of the network (HCIProb, where probabilities are reach-scale averaged probabilities from the best probabilistic model above). The fixed effects model dependency structure had catchment and catchment:site as categorical random effects and EXP-TU, EXP-TD, and EXP-EUC as spatial dependencies.
With the best fixed effects model (lowest AIC and highest relative likelihood; Burnham and Anderson 2002, Zuur et al. 2013), we fit and compared 29 spatial models. Again, for each best chemistry variable, we compared the best model to a non-spatial and a null model. Our measure of predictive capacity was the R2 between observations and observation-level predictions using a 5-fold cross validation procedure stratified by sites (i.e., 80% of the sites are used to predict 20% of the sites). This analysis used 196 observations across 83 sites and seven catchments.

For respiration rate, we fit models without spatial dependencies because we had sparse sampling of sites (i.e., 3 sites per catchment). We fit the fixed SSN models as above (i.e., substituting land cover variables); however, we included only the land cover variable, catchment area, and their interaction as fixed effects to avoid overfitting. We also added models including the single effects of HCIma, HCIArea, HCIDistance, HCIProb, SpCond, Chl-a, and FDOM. For the water chemistry variables, we used the mean for a given catchment:site:year combination in order to have a single estimate for each respiration rate sampling event. We used these chemical variables since they represent variation in resources (i.e., ionic strength for SpCond, algal resources for Chl-a, and organic resources for FDOM). For the dependency structure, we included catchment and catchment:site as random effects. We compared the best model to the null model. Our measure of predictive capacity in this case was the R2 of a leave-one-group-out cross validation (i.e., sites) since we had few sites. This analysis used 168 observations across 21 sites and seven catchments.
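The site-stratified cross validation described above can be sketched as follows. This is a minimal illustration in Python, not the R workflow used in this chapter: the function names, the straight-line stand-in model, and the toy data are all hypothetical, and R2 is assumed here to be the squared Pearson correlation between observations and held-out predictions. The key point is that folds are assigned to sites, not to individual observations, so that every observation from a site is held out together.

```python
import random


def site_folds(site_ids, k=5, seed=1):
    """Assign whole sites to k folds; all observations of a site share one fold."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site = {site: i % k for i, site in enumerate(sites)}
    return [fold_of_site[s] for s in site_ids]


def fit_line(x, y):
    """Least-squares straight line (a stand-in for the real model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def squared_correlation(observed, predicted):
    """Squared Pearson correlation (assumed definition of the CV R2)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)


def grouped_cv_r2(site_ids, x, y, k=5, seed=1):
    """Predict each fold of sites from the remaining sites, then score."""
    folds = site_folds(site_ids, k, seed)
    predictions = [None] * len(y)
    for fold in range(k):
        train = [i for i, f in enumerate(folds) if f != fold]
        test = [i for i, f in enumerate(folds) if f == fold]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            predictions[i] = intercept + slope * x[i]
    return squared_correlation(y, predictions)


# Toy data (made up): 10 sites with 2 observations each.
site_ids = [s for s in range(10) for _ in range(2)]
x = [float(i) for i in range(20)]
y = [2.0 * xi + (1.0 if i % 2 else -1.0) for i, xi in enumerate(x)]

print(grouped_cv_r2(site_ids, x, y, k=5))   # 5-fold, stratified by site
print(grouped_cv_r2(site_ids, x, y, k=10))  # leave-one-site-out
```

Setting k equal to the number of sites gives the leave-one-group-out variant used where sites are few.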
For decomposition rate, we fit the same fixed effects models as water chemistry but also added models having year as a categorical fixed effect and the single effects of HCIma, HCIArea, HCIDistance, HCIProb, SpCond, Chl-a, FDOM, and the top three BMI metrics (see below). Again, for water chemistry and BMI variables, we used the mean for a given catchment:site:year combination to have a single estimate for each decomposition rate sampling event. The fixed effects model dependency structure had catchment and catchment:site as categorical random effects and TU-EXP, TD-EXP, and EUC-EXP as spatial dependencies. With the best fixed effects model (lowest AIC and highest relative likelihood), we fit and compared 29 spatial models. Our measure of predictive capacity was the 5-fold cross validation as above. We only used sites with complete data. This analysis used 495 observations across 81 sites and seven catchments.

For BMI communities, we generated the same BMI metrics as in Chapter 4 for our samples (Table 5.2). As with water chemistry, we screened the BMI metrics for those best predicted by landscape variables using quantile random forest regression. For the three best-predicted BMI metrics, we used the two-stage SSN approach. We fit the same fixed effects models as decomposition rate (i.e., substituting land cover variables and adding models having headwater condition and water chemistry). Again, for water chemistry, we used the mean for a catchment:site:year combination to have a single estimate for each benthic sampling event. Similarly, we used the mean for a catchment:site:year combination for TLDD. The dependency structure had catchment and catchment:site as random effects with TU-EXP, TD-EXP, and EUC-EXP as spatial dependencies. With the best fixed effects model (lowest AIC and highest relative likelihood), we fit and compared 29 spatial models.
Again, we compared the best model to a non-spatial and null model and assessed predictive capacity using 5-fold cross validation as above. We only used sites with complete data. This analysis used 249 observations across 81 sites and seven catchments.

5.3 Results

Our landscape and response variables captured regional-scale (i.e., between-catchment) to local-scale (i.e., between-site) variation across our seven catchments. For landscape variables, the proportion of total variation attributed to between-catchment differences tended to increase as variables were estimated over larger areas (i.e., from l_100 to c_) whereas between-site differences tended to decrease (Figure 5.2). For response variables, visual inspection and null models with random effects indicated that between-catchment and between-site differences accounted for varying amounts of total variation per response (Figure 5.3, Figure 5.4); these proportions also variably decreased as fixed effects and spatial dependencies were added (Figure 5.4, Table 5.3).

Based on the null models, between-catchment differences explained a median 47% (range: 6–81%) whereas between-site differences explained a median 25% (range: 1–51%) across all response variables. Based on the best model per response variable, fixed effects explained a median 15% (range: 2–30%) of the total variation. For those that had a spatial dependency in the best model (5 of 8 response variables tested), the spatial dependencies explained a median 51% (range: 32–84%) of the total variation. In general, adding fixed effects and/or spatial dependencies did not increase predictive capacity (measured by the LOOCV Brier score or cross-validation R2) but had noticeable effects on the significance of predictors and explained variation of model components.
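To make concrete how proportions of total variation such as those above can be obtained, the following minimal Python sketch partitions variation among model components. It rests on an assumed convention that is not confirmed by the text: the fixed-effects share is taken as a generalized R2, and the remaining variation is divided among the random-effect, spatial, and residual variance estimates in proportion to their size. The function name and all numbers are hypothetical and purely illustrative; they are not values from Table 5.3.

```python
def variance_proportions(fixed_r2, components):
    """Split total variation into a fixed-effects share and component shares.

    fixed_r2   : proportion explained by fixed effects (0 for a null model)
    components : dict of variance estimates, e.g. random effects, spatial
                 partial sills, and the residual (nugget)
    """
    total = sum(components.values())
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    shares = {"fixed effects": fixed_r2}
    for name, variance in components.items():
        # Each component receives its share of the variation left after fixed effects.
        shares[name] = (1.0 - fixed_r2) * variance / total
    return shares


# Made-up variance estimates for a null model (intercept + random effects).
null_model = variance_proportions(
    0.0, {"catchment": 2.0, "site": 1.0, "residual": 1.0}
)

# Made-up estimates after adding fixed effects and one spatial component.
spatial_model = variance_proportions(
    0.2, {"catchment": 0.5, "site": 0.5, "tail-up": 2.0, "residual": 1.0}
)

for label, shares in (("null", null_model), ("spatial", spatial_model)):
    print(label, {name: round(value, 3) for name, value in shares.items()})
```

Comparing the shares across the null, non-spatial, and spatial fits shows how variation initially attributed to catchment or site can shift to fixed effects or to spatial components.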
5.3.1 Potential versus realized stream networks

Our potential networks (i.e., > 1 ha accumulation threshold) had a median of 82 km of stream length across the seven catchments (range: 57–126 km). Our median confirmation rate was 23% but this varied by catchment (range: 5–52%); between-catchment differences explained 24% of the variation in presence/absence (Table 5.3, HDF P/A null model). Using a conservative predicted probability of 0.8 as the cut-off for a stream reach to be present would reduce the median stream length to 49 km (range: 27–105 km) across the seven catchments. For a probability of 0.5, the median would be 67 km (range: 46–120 km) across the seven catchments.

The best probabilistic stream network model included spatial dependencies (Table 5.3 and Tables F.3–F.5). Fixed effects explained 11% of the total variation; between-catchment differences decreased from 24% to 5% of total variation; and spatial dependencies of SP-TU (44%), SP-TD (8%) and SP-EUC (32%) explained the highest proportion of total variation. The large TU component indicated strong autocorrelation between sites but its range (1.4 m) suggested a large amount of variation at very small scales. The EUC component indicated overland autocorrelation among sites gradually diminishing at separation distances of 1.5 km. Catchment area had a positive effect on HDF probability whereas r_50_ROAD and TWI had negative effects; the r_50_ROAD effect size was double that of catchment area and TWI. A 1 SD increase in r_50_ROAD (4.4 km/km2) above mean road density (5.5 km/km2) was associated with a 0.36 decrease in probability (Table F.5). We did not find the expected interaction between land cover and catchment area. The model was a good fit to the data, having a low Brier score (0.08) and high AUC (0.90). The final model was used to produce Figure 5.5 and predictions in subsequent models for weighting HCI.
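The Brier score and the probability cut-offs used above can be illustrated with a short Python sketch. The function names and the example values are hypothetical and are not data from this study; the Brier score is computed as the mean squared difference between predicted probabilities and observed presence/absence, and the realized network length is assumed to be the summed length of reaches whose predicted probability meets the cut-off.

```python
def brier_score(observed, predicted):
    """Mean squared difference between 0/1 observations and predicted probabilities."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def network_length_at_cutoff(reach_lengths_km, reach_probabilities, cutoff):
    """Total length of reaches whose predicted probability is at least the cut-off."""
    return sum(
        length
        for length, probability in zip(reach_lengths_km, reach_probabilities)
        if probability >= cutoff
    )


# Made-up example: four crossings with held-out (e.g., LOOCV) predictions.
observed = [1, 0, 0, 1]
predicted = [0.9, 0.2, 0.1, 0.6]
print(brier_score(observed, predicted))

# Made-up example: three reaches with lengths (km) and predicted probabilities.
lengths = [1.0, 2.0, 3.0]
probabilities = [0.9, 0.6, 0.2]
print(network_length_at_cutoff(lengths, probabilities, 0.8))  # conservative cut-off
print(network_length_at_cutoff(lengths, probabilities, 0.5))  # more inclusive cut-off
```

A lower cut-off retains more reaches, which is why the estimated network length grows as the threshold is relaxed from 0.8 to 0.5.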
5.3.2 Stream sampling

5.3.2.1 Water chemistry

Our water chemistry quantile random forest regression found SpCond, pH, and DO.PercEU were best predicted by our landscape variables. Their OOB R2 values were 0.80, 0.60, and 0.28, respectively (Figure F.3). Chemistry parameter summary statistics can be found in Table F.6. Based on null models with random effects, the proportion of between-catchment and between-site residual variation differed markedly per parameter (Table 5.3). SpCond had higher between-catchment than between-site differences (81% versus 1%), pH had nearly equal amounts (34% each), and DO.PercEU had higher between-site than between-catchment differences (51% versus 2%). Together, these proportions suggest varying scales of influence on different water chemistry parameters.

For SpCond, the most likely model (lowest AIC, highest relative likelihood) had fixed effects explaining 30% of the total variation and a GA-EUC spatial dependency explaining 51% of the variation (Table 5.3 and Tables F.6–F.9). Comparing the null with non-spatial models, fixed effects explained mostly between-catchment differences (Table 5.3, reduction in Catchment proportion of variation). The strongest predictor was r_50_BU; a 1 SD increase (0.16) was associated with a nearly 1000 µS/cm increase in SpCond when holding other predictors at average values (Figure 5.6). The GA-EUC dependency suggested progressively increasing spatial variability leveling off near 10 km separation distances. The R2 (CV) for this model was 0.81.

For pH, the non-spatial model was the most likely with fixed effects explaining 25% of the total variation that reduced between-catchment variation from 34% to 9% (based on the null model) but had little effect on between-site variation (Table 5.3 and Tables F.10–F.12).
The strongest predictors were catchment area and BFI; a 1 SD increase in both predictors was associated with 0.09 increased pH units when holding other predictors at average values (Figure 5.6). Although the non-spatial model was most likely, adding a SP-EUC dependency reduced between-site variation from 32% to 17% and suggested spatial autocorrelation between nearby sites within approximately 1.8 km. The R2 (CV) for the non-spatial model was 0.34.

For DO.PercEU, the non-spatial model was also most likely (Table 5.3 and Tables F.13–F.15). The fixed effects explained 4% of the total variation and Year was the strongest predictor (mean difference of -5.69% DO in 2016). The strongest landscape predictor was HCIma but it was not significant (Figure 5.6). A substantial proportion of between-site variation remained (51% of total as for the null model) even after adding a SP-EUC dependency (2% of total variation). The R2 (CV) for the non-spatial model was 0.02.

5.3.2.2 Respiration rate

The median cotton strip respiration rate was 0.12 mg O2 g-1 hr-1 (range: 0.03–0.64). Respiration was highest in our most disturbed catchment, WI (median and range: 0.20 [0.14–0.31]), and lowest in one of our least disturbed catchments, RO (median and range: 0.07 [0.06–0.12]) (Figure 5.7). The null model was most likely but 91% of the total variation was attributed to between-site differences and nearly 0% to between-catchment differences; we suspected one outlier site could be influencing these results (Figure 5.7, mean respiration rate > 0.4 mg O2 g-1 hr-1; Tables F.16–F.17). Upon removing this site and re-running model comparisons (not shown), there were slight differences in model ranking but Chl-a remained the best predictor. Null model between-catchment and between-site proportions changed markedly, to 63% and 25%, respectively (Table 5.3).
Furthermore, the non-spatial model was now most likely and indicated that a 1 SD increase in Chl-a concentration (3.72) was associated with a 0.03 increase in cotton strip respiration rate. This fixed effect explained 17% of the total variation and reduced between-catchment differences from 63% to 40% and between-site differences from 25% to 16%. The R2 (CV) of the best model was 0.43.

5.3.2.3 Decomposition rate

The median decomposition rate of strips was 0.22% tensile loss per degree day (range: 0.00–0.38). Decomposition was highest in GA (median and range: 0.30 [0.05–0.36]) and lowest in RO (median and range: 0.11 [0.01–0.21]), which were our least disturbed catchments (Figure 5.8). Null models showed more between-catchment differences (51%) than between-site differences (28%) (Table 5.3 and Tables F.18–F.20). Fixed effects explained 10% of the total variation in the non-spatial model and decreased between-catchment differences to 30% but increased between-site differences to 35% of the total variation. Adding SP-TU and SP-TD spatial dependencies reduced the variation explained by fixed effects from 10% to 2% and reduced the between-catchment and between-site differences to near 0% each. These dependencies explained 67% (SP-TU) and 12% (SP-TD) of the total variation, respectively. Both BFI and catchment area became non-significant after adding spatial dependencies. The range parameter of SP-TU suggested autocorrelation at flow-connected sites gradually declining at distances up to 18 km. The R2 (CV) of the best model was 0.44.

5.3.2.4 Benthic macroinvertebrates

Our quantile random forest regression of BMI metrics found PCoA1, Pct.EPT, and PCoA1.fine (i.e., higher taxonomic resolution) were best predicted. Their OOB R2 values were 0.68, 0.65, and 0.64, respectively (Figure F.4). Summary statistics for all BMI metrics can be found in Table F.21 and PCoA analysis results can be found in Tables F.22–F.23 and Figures F.5–F.6.
We found site scores along PCoA1 and PCoA1.fine to be highly correlated (Spearman’s ρ = 0.99, p < 0.001) suggesting these are similar community gradients regardless of taxonomic resolution. Therefore, we chose PCoA1 (24% of variation in community abundance) for use in subsequent models. Sites with higher scores along this axis tended to have fewer aquatic worms (Oligochaeta; loading as the mean from a linear model fit to the axis scores: -0.35) and more caddisflies (Trichoptera; 0.16), stoneflies (Plecoptera; 0.25), black flies (Simuliidae; 0.28), crane flies (Tipulidae; 0.30), and mayflies (Ephemeroptera; 0.49) (Table F.22 and Figure F.5). Since the next best-predicted metric (after PCoA1.fine) explained only 8% of the variation in community abundance (PCoA4.fine, OOB R2 = 0.62) and for ease of interpretation, we used HBI in subsequent models (OOB R2 = 0.61). Therefore, PCoA1, Pct.EPT, and HBI were used in SSN models. PCoA1 was strongly correlated with Pct.EPT (Spearman’s ρ = 0.80, p < 0.001) and HBI (Spearman’s ρ = -0.81, p < 0.001). Based on the null models for PCoA1, Pct.EPT, and HBI, between-stream differences explained 45–50% of the total variation and between-site differences explained 20–25%.

For PCoA1, the most likely model had SpCond as a fixed effect explaining 12% of the variation and a GA-EUC spatial dependency explaining 30% of the variation (Table 5.3 and Tables F.24–F.26). Comparing this with the non-spatial model suggested catchment-scale effects on benthic communities through chemistry (specifically, SpCond). The GA-EUC dependency suggested progressively increasing spatial variability leveling off at distances of approximately 2.5 km. The R2 (CV) for this model was 0.55.

For Pct.EPT, the most likely model had fixed effects explaining 17% of the variation with an EX-TU explaining 17%, an EX-TD explaining 2%, and a GA-EUC spatial dependency explaining 26% of the variation.
Pct.EPT was highest in Ganaraska (median and range: 43% [0–85]) and lowest in Pringle (median and range: 0% [0–6]) (Figure 5.9). Fixed effects explained mostly between-catchment differences whereas spatial dependencies explained mostly between-site differences (Table 5.3 and Tables F.27–F.29). Adding spatial dependencies caused catchment area and BFI to become non-significant. The strongest predictor was Year (mean difference of -8.54 Pct.EPT in 2016), likely driven by catchments sampled in 2016 having low Pct.EPT (catchments VA and PR). The next strongest predictor was an interaction between catchment area and l_1000_BU; sites with larger catchment areas had stronger negative relationships between Pct.EPT and localized urban cover (Table F.29). Like PCoA1, the GA-EUC dependency showed progressively increasing spatial variability leveling off at approximately 3 km. The R2 (CV) of this model was 0.37.

For HBI, the most likely model was non-spatial and had fixed effects explaining 15% of the variation that reduced between-catchment differences from 47% to 25% of the total variation but had no strong effect on between-site differences (Table 5.3 and Tables F.30–F.32). As with Pct.EPT, there was a significant interaction between catchment area and l_1000_BU; more downstream sites had stronger positive relationships between HBI and local urban cover. The R2 (CV) of this model was 0.48.

5.4 Discussion

In this paper, we used a land cover disturbance gradient to explore how multiscale variation of indicators was explained by landscape and site variables and spatial dependencies. Our approach provided useful insight into stream indicators along three of the four dimensions of the urban watershed continuum (i.e., longitudinal, lateral, and landscape development patterns over time, but not vertical [i.e., surface and groundwater connections]; Kaushal and Belt 2012).
For example, null models with random effects provided insight into how variation was partitioned at multiple sampling scales; contained in these null sources of variation are the cumulative effects of environmental and anthropogenic factors affecting aspects of the four dimensions above. We found that between-catchment variation tended to be the highest for our indicators and reflected our landscape disturbance gradient. Our study’s examination of the effects of landscape predictors measured at different scales provided insight on those scales best targeted for conservation and mitigation (i.e., those scales with highest explained variation). We found that landscape predictors tended to explain more between-catchment variation than between-site variation but that these variables tended to be estimated at the local scale (i.e., l_1000); this suggests that local-scale disturbance tends to be a function of the catchment-scale disturbance itself. Finally, examining spatial dependencies provided insight into how random variation was spatially structured. We found that the random variation attributed to between-catchment and between-site differences could be shifted to modelled spatial variation (i.e., rather than simple shifts due to random intercepts), which could be used for generating similarly predictive models.

Mapping the “true” extent of stream networks is non-trivial. From an analysis perspective, choice of stream initiation threshold (i.e., minimum catchment area at which a stream is initiated), extraction technique, and topographic errors all contribute to changes in the position and extent of digitally generated stream networks (Lindsay 2006). From a field perspective, the wetted extent of stream networks varies seasonally. Other studies have attempted to evaluate the extent of permanent and ephemeral streams versus digitally generated streams.
For example, one study used terrain derivatives like slope and local topographic curvature to generate a probabilistic stream network for perennial and intermittent streams with good success (Russell et al. 2015). Another study found that the flow initiation thresholds (i.e., channel heads) of perennial, intermittent, and ephemeral streams differ greatly between forested and urban contexts (Roy et al. 2009). We combined these approaches by targeting potential HDF sites across a landscape disturbance gradient to examine the importance of fixed effects and spatial dependencies. Based on our null model, there was a moderate amount of between-catchment variability, suggesting that the absolute probability is partially dependent on the catchment itself. Adding our fixed effects explained some of this variation. The spatial model explained 100% of the total variation and was split into fixed effects (11%), catchment effects (5%), TU (44%), TD (8%), and EUC (32%) components. Predictive capacity was good but similar for the null, non-spatial, and spatial models (Brier scores < 0.12). These scores may have been biased by the low confirmation rate overall (> 70% were absent). Nevertheless, our best spatial model did have some intriguing anecdotal findings. First, in Ganaraska, the upper portions of the network have distinctly sandy soils with high infiltration resulting in a lack of overland flow despite having large catchment areas; using field-based observations and spatial dependencies helped model this known drainage pattern. Second, in Pringle, there were lower probabilities in the highly urbanized lower portions of the network than in similarly sized streams in the upper portions of the network; again, model (i.e., r_50_BU) and field-based observations helped model this known drainage pattern. Finally, in Wilket, we noticed that the length of the main channel was overestimated; this could have been avoided by targeting stream-roadway crossings at known mainstem sites as well.
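As a rough illustration of the caveat above about Brier scores under a low confirmation rate, the sketch below computes the Brier score for an uninformative model and for a hypothetical informative one. The 25% prevalence, the predicted probabilities, and the function names are assumptions chosen for illustration only; they are not the data or code used in this study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """1 - BS / BS_ref, where BS_ref predicts the observed prevalence at every site."""
    prevalence = sum(outcomes) / len(outcomes)
    reference = brier_score([prevalence] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("skill score is undefined when all outcomes are identical")
    return 1.0 - brier_score(probs, outcomes) / reference


# Hypothetical outcomes: 25 confirmed streams (1) and 75 absences (0).
outcomes = [1] * 25 + [0] * 75
prevalence = sum(outcomes) / len(outcomes)

# Uninformative model: predicts the prevalence everywhere.
baseline = brier_score([prevalence] * len(outcomes), outcomes)

# Hypothetical informative model: 0.7 at presences, 0.1 at absences.
informative = [0.7] * 25 + [0.1] * 75
score = brier_score(informative, outcomes)
skill = brier_skill_score(informative, outcomes)

print(round(baseline, 4), round(score, 4), round(skill, 2))
```

A constant prediction equal to the prevalence p scores p(1 − p), so the uninformative baseline itself approaches zero as absences dominate. Reporting a skill score relative to that baseline, alongside the raw Brier score, is one way to separate model skill from class imbalance.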
Overall, this work suggests that SSN models are useful for determining null variation and augmenting fixed effects models with more nuanced spatial landscapes.

Our analysis revealed multiscale patterns of stream chemistry. In null SSN models, we found quite different scales of variation depending on the indicator – variation in SpCond was much more related to between-catchment differences than between-site differences, pH was nearly equally related, and DO.PercEU was mostly related to between-site differences. We were not able to predict DO.PercEU using our expected model; the only significant relationship was a slight decline in values in 2016. We suspect that oxygen saturation was more likely related to reach-scale variability in channel morphology directly upstream of the site that our TWI could not capture (i.e., presence of nearby riffles) or diurnal fluctuations (Allan and Castillo 2009). For pH, sites with larger catchment areas and higher BFI values (i.e., catchments draining geology of coarse lacustrine deposits) tended to be more alkaline. However, these factors mostly explained between-catchment differences and there was little practical change (i.e., acidic → alkaline) in pH units across our study catchments (range for all measurements: 7.26–8.67).

Our finding that SpCond was higher in more disturbed catchments was expected since point and non-point sources tend to increase chemical loading, despite regional differences in the expected magnitude of this relationship (Paul and Meyer 2001, Walsh et al. 2005, Cuffney et al. 2010, Bazinet et al. 2010, Wallace et al. 2013, Corsi et al. 2015, Booth et al. 2016). Our research adds predictor and spatial insight into this expected gradient in several ways. For the first aspect, the satellite-derived BU, specifically in the riparian zone, was the best predictor explaining 30% of the total variation in SpCond. The next best predictors were HCI variables, but they were best predicted by c_BU (Chapter 2).
Many studies use an estimate of impervious cover, but BU is an intriguing satellite index in that it may better reflect a continuum of urban land cover than categorical indices (e.g., impervious cover) or the satellite indices NDBI and NDVI themselves. NDBI tends to overestimate built-up areas when there are bare soils (e.g., agricultural fields) and NDVI may underestimate urban cover in neighbourhoods with extensive tree cover (Zha et al. 2003, He et al. 2010). Based on visual inspection, we found that BU better distinguished agricultural fields from forested stream valleys (compared with NDVI) and from built-up areas (compared with NDBI) (Figure F.7). Based on this finding, BU may be a useful index for tracking urbanization patterns through time. For the second aspect, with respect to spatial variation and longitudinal relationships, there were no clear longitudinal or autocorrelation patterns in SpCond (e.g., TU dependency) that one might expect with stream chemistry in more undisturbed catchments. We believe that this is due to varying patterns in urbanization across catchments moving from upstream to downstream. For example, Pringle becomes progressively more urbanized moving from upstream to downstream. In contrast, Morningside is heavily urbanized upstream but has wider riparian zones at its downstream sites. Similarly, Wilket emerges from a large upstream stormwater network and then meanders through an urban park. These contrasting patterns are likely why a landscape variable that can reflect both local riparian condition and upstream influences (r_50_BU) was most important.
These patterns are also likely why GA-EUC was the best spatial dependency structure – in contrast with other EUC dependencies which model gradual decreases in spatial similarity with increasing separation distances, the Gaussian dependency structure indicates progressively declining similarity of sites (i.e., sites close to each other are more similar but this rapidly declines; Dale and Fortin 2014, Ver Hoef et al. 2014). Based on these findings, the importance of longitudinal development patterns of urbanization and BU as predictors for stream indicators requires more study.

Cotton strip respiration and decomposition rates are important and related functional indicators for streams. With cotton strips, both can be considered integrated measures of stream microbial activity; however, respiration rate more likely indicates current autotrophic and heterotrophic activity whereas decomposition rate likely indicates longer-term heterotrophic activity involved in the breakdown of cellulose (Tiegs et al. 2013). Increased oxygen demand (i.e., respiration) can indicate increased organic pollution whereas the interpretation of decomposition rate is less clear due to potential non-linear relationships with landscape disturbance (Allan 2004, Young et al. 2008, Woodward et al. 2012). To the best of our knowledge, no other studies have examined cotton strip respiration rate across a landscape disturbance gradient. Respiration rate was highest in our most disturbed catchments and lowest in our least disturbed catchments. Chl-a concentration explained 17% of the total variation and reduced between-catchment and between-site differences by relatively equal proportions (approximately 35% decrease each compared with null variation). Chl-a represents the autotrophic constituent of microbial activity and its increase can be associated with increased nutrient supply or light (Bunn et al. 1999).
Both Chl-a and BGA (cyanobacteria) were correlated with c_LDI (both Spearman’s ρ > 0.55, p < 0.001), suggesting that site and catchment disturbances tended to increase microbial activity and oxygen demand reflected in respiration rate. Respiration rate field measurements are time consuming and were moderately correlated with decomposition rate (Spearman’s ρ = 0.43, p < 0.001); however, as noted in Tiegs et al. (2013), these complementary measurements may address different research questions. More studies are certainly needed on landscape drivers and spatial variation in cotton strip respiration rate to determine its utility in stream ecosystem health assessments.

Leaf litter and cotton strips have been used widely for decomposition rate studies; cotton strips are a more recent attempt at standardizing methods for multiscale analysis (Woodward et al. 2012, Tiegs et al. 2013, 2019, Chauvet et al. 2016). The magnitude and shape of the relationship of decomposition rate with disturbances is equivocal and likely depends on the cumulative effects of environmental and landscape variables and the scale of study (Chadwick et al. 2006, Young et al. 2008, Young and Collier 2009, Imberger et al. 2010, Bierschenk et al. 2012, Woodward et al. 2012, Chauvet et al. 2016, Kielstra et al. 2019, Tiegs et al. 2019). Our results suggest that there are strong spatial controls (i.e., autocorrelation) on decomposition rate along stream networks. Based on visual inspection of longitudinal gradients, decomposition rate was highly variable across sites but there appeared to be some clustering in consecutive sites. Our best spatial model suggested that this is partially dependent on local-scale variation in catchment disturbance (l_1000_LDI; 2% of total variation) but mostly dependent on similarity to upstream sites (TU; 67% of total variation) at distances up to 18 km (i.e., all practical distances).
Extending this analysis to include more network branches (i.e., headwaters) could provide excellent insight on how the spatial properties of this functional indicator affect the interpretation of its relationship with landscape disturbance.

Benthic macroinvertebrate communities can also provide useful insight into the structural and functional conditions of stream ecosystems. Our metrics used in SSN models generally represented shifts from communities with sensitive organisms (i.e., high PCoA1, high Pct.EPT, and low HBI) to those without. Similar to chemistry, we were unsurprised that the loss of sensitive organisms was coincident with increasing urbanization since this is a pattern regularly seen in the literature (Paul and Meyer 2001, Klemm et al. 2003, Walsh et al. 2005, Stanfield and Kilgour 2006, Cuffney et al. 2010, Bazinet et al. 2010, Wallace et al. 2013, Booth et al. 2016). Fixed effects of conductivity and the interaction between l_1000_BU and catchment area explained up to 17% of the total variation that was mostly related to between-catchment differences. This tended to be lower than the R2 values in other studies in the region, but our models included all sampling scales and it was unclear how replicate samples were handled in their regression models (Stanfield and Kilgour 2006, Bazinet et al. 2010, Wallace et al. 2013). We were surprised that these correlated metrics had different spatial dependencies in the best models: PCoA1 had an EUC spatial dependency (30% of variation); Pct.EPT had TU (17%), TD (2%), and EUC (26%) spatial dependencies; and the non-spatial model was most likely for HBI. These results suggest that even if BMI metrics are considered redundant, they may vary in spatial dependencies that require further examination.

There were several challenges with the approaches used here. First, a substantial amount of unexplained variation remained across all models (20–40%) even after accounting for random effects and spatial dependencies.
This is likely because we chose to include all observations (i.e., replicates within sites) and control for data hierarchies using random effects. Interpretation was more complex since it required comparing random effect variation between models with and without model components. However, we felt that this better represented total variability than averaging within-site data to align with site-scale predictors, which we found was often done in regression models. Second, for our random forest screening models, we assumed that observations were independent. This may have increased type I errors since randomly excluded within-site observations (i.e., OOB data) are predicted by in-bag within-site data because the random forest model classifies data based on predictors which, for all within-site data, will be equal. Third, there were challenges with the “best model” approach. Typically, one would want to examine and summarize models with AIC differences within a range (e.g., < 10) or at least those models with a high relative likelihood of being the best (i.e., relative likelihood > 0.5). Furthermore, for practical reasons we restricted the fixed effects to structures that we thought were ecosystem-relevant (i.e., the effect of a land cover variable controlling for catchment area, topography, and geology). These may not have been the most suitable model forms for each variable and further study could examine several indicator-relevant fixed effects models. Fourth, our study lacked temporal variability in sampling which may change the relationships we found here. However, applying the same approaches with both spatial and temporal variation could yield even further insight into the relevant scales of variation for management. Fifth, our sites were set up in a linear fashion along stream networks which did not represent the dendritic nature of stream networks.
We used this pattern mostly to examine longitudinal variation but our study would have benefited from including a more diverse configuration of sites. Finally, our SSN models were only capable of examining average spatial variation; there may have been consistent shifts in spatial variation with disturbance that could be explored if shifts in the spatial parameters were allowed (i.e., analogous to random slopes in mixed effects models). Unfortunately, we did not have enough sites per network to conduct separate analyses for exploring across-network variability in spatial dependencies. Nevertheless, our approach employing SSN models provides a very useful first step towards examining variation and improving predictions along stream networks.

Here, we focused on tracking sources of variation as fixed effects and spatial dependencies were added to null models. Although we had slight gains in predictive capacity, our results support the notion that there are significant information gains by incorporating spatial dependencies into stream network models (Peterson et al. 2013). Random effects and spatial dependencies explained a larger proportion of the variation than fixed effects. Examining these sources of variation should guide practitioners about what scales require better sampling or the scales at which conservation or mitigation action is best targeted for sites positioned in the urban watershed continuum (Kaushal and Belt 2012). For example, one should consider that local landscape disturbance negatively influenced decomposition rate but was not more important than the similarity and influence of decomposition rates at upstream sites. Exploring spatial connections should increase our understanding of where and how stressors are acting on stream networks and other freshwater ecosystems.

5.5 Tables

Table 5.1 Generalized fixed effects, random effects, and spatial dependencies used in spatial stream network models per response variable.
HDF P/A is the probabilistic stream network model. BMI is benthic macroinvertebrates.

| Response | Fixed effects | | | | | | Dependencies | | |
| | Landscape | HCI | Year | Chemistry | Decomposition | BMI | Catchment | Site | Spatial |
| HDF P/A | ✓ | | | | | | ✓ | ✓ | ✓ |
| Chemistry | ✓ | ✓ | ✓ | | | | ✓ | ✓ | ✓ |
| Respiration | ✓ | ✓ | | ✓ | | | ✓ | ✓ | |
| Decomposition | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ |
| BMI | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | ✓ |

Table 5.2 Benthic macroinvertebrate (BMI) metrics used as response variables in random forest models. Summary statistics in Table F.21.

| BMI metric | Abbreviation |
| % Amphipoda | Pct.Amph |
| % Chironomidae | Pct.Chir |
| % Diptera | Pct.Dipt |
| % Ephemeroptera, Plecoptera, and Trichoptera | Pct.EPT |
| % Insecta | Pct.Inse |
| % Isopoda | Pct.Isop |
| % Isopoda and Gastropoda | Pct.IG |
| % Mollusca | Pct.Moll |
| % Non-Insecta | Pct.NInse |
| % Worms | Pct.Worms |
| Count | Count |
| Evenness | Evenness |
| Hilsenhoff Biotic Index | HBI |
| Principal coordinates analysis scores (27-group taxonomy; Hellinger distance matrix) | PCoA1–PCoA3 |
| Principal coordinates analysis scores (finer taxonomy; Hellinger distance matrix) | PCoA1.fine–PCoA8.fine |
| Richness | Richness |
| Shannon Diversity | Shannon |
| Simpson Diversity | Simpson |
| Simpson Diversity Index | SimpsonDiv |

Table 5.3 Model comparison and component variation for spatial stream network models applied across seven catchments. HDF P/A is the probabilistic stream network model. Respiration* is the set of models run after removing an outlier site. For models, Null includes random effects and Non is the non-spatial version of the best spatial model (BestS). Brier scores near zero indicate high predictive accuracy. Area under the receiver operator curve (AUC) scores near one indicate good model presence/absence discrimination. ∆AIC and Rel L. are relative to the best model in the Null, Non, and BestS set. R2 (CV) is the cross-validation R2. Spatial dependencies are tail-up (TU), tail-down (TD), and Euclidean (EUC). Dashed lines (--) indicate no estimate.
| Response | Model | Model comparison | Proportion of variation | | | | | |
| | | | Fixed | Catchment | Site | TU | TD | EUC |
| | | Brier score; AUC | | | | | | |
| HDF P/A | Null | 0.11; 0.67 | -- | 0.24 | -- | -- | -- | -- |
| | Non | 0.10; 0.85 | 0.08 | 0.14 | -- | -- | -- | -- |
| | BestS | 0.08; 0.90 | 0.11 | 0.05 | -- | 0.44 | 0.08 | 0.32 |
| | | ∆AIC; Rel L.; R2 (CV) | | | | | | |
| SpCond | Null | 137; 0.00; 0.76 | -- | 0.81 | 0.01 | -- | -- | -- |
| | Non | 8.91; 0.01; 0.81 | 0.31 | 0.54 | 0.02 | -- | -- | -- |
| | BestS | 0.00; 1.00; 0.81 | 0.30 | 0.00 | 0.00 | -- | -- | 0.51 |
| pH | Null | 19.70; 0.00; 0.21 | -- | 0.34 | 0.34 | -- | -- | -- |
| | Non | 0.00; 1.00; 0.34 | 0.25 | 0.09 | 0.32 | -- | -- | -- |
| | BestS | 1.70; 0.43; 0.34 | 0.23 | 0.06 | 0.17 | -- | -- | 0.20 |
| DO.PercEU | Null | 13.39; 0.00; 0.00 | -- | 0.06 | 0.51 | -- | -- | -- |
| | Non | 0.00; 1.00; 0.02 | 0.04 | 0.02 | 0.51 | -- | -- | -- |
| | BestS | 3.92; 0.14; 0.02 | 0.04 | 0.00 | 0.50 | -- | -- | 0.02 |
| Respiration | Null | 0.00; 1.00; 0.53 | -- | 0.00 | 0.93 | -- | -- | -- |
| | Non | 4.17; 0.12; 0.05 | 0.01 | 0.00 | 0.93 | -- | -- | -- |
| Respiration* | Null | 6.75; 0.03; 0.29 | -- | 0.63 | 0.25 | -- | -- | -- |
| | Non | 0.00; 1.00; 0.43 | 0.17 | 0.40 | 0.16 | -- | -- | -- |
| TLDD | Null | 34.80; 0.00; 0.39 | -- | 0.51 | 0.28 | -- | -- | -- |
| | Non | 34.82; 0.00; 0.42 | 0.10 | 0.30 | 0.35 | -- | -- | -- |
| | BestS | 0.00; 1.00; 0.44 | 0.02 | 0.00 | 0.00 | 0.67 | 0.12 | -- |
| PCoA1 | Null | 21.06; 0.00; 0.47 | -- | 0.55 | 0.20 | -- | -- | -- |
| | Non | 8.22; 0.02; 0.48 | 0.11 | 0.22 | 0.31 | -- | -- | -- |
| | BestS | 0.00; 1.00; 0.55 | 0.12 | 0.09 | 0.13 | -- | -- | 0.30 |
| Pct.EPT | Null | 122.83; 0.00; 0.30 | -- | 0.41 | 0.23 | -- | -- | -- |
| | Non | 21.63; 0.00; 0.36 | 0.21 | 0.16 | 0.22 | -- | -- | -- |
| | BestS | 0.00; 1.00; 0.37 | 0.17 | 0.08 | 0.00 | 0.17 | 0.02 | 0.26 |
| HBI | Null | 5.11; 0.08; 0.42 | -- | 0.47 | 0.25 | -- | -- | -- |
| | Non | 0.00; 1.00; 0.48 | 0.15 | 0.26 | 0.22 | -- | -- | -- |
| | BestS | 2.90; 0.23; 0.45 | 0.15 | 0.25 | 0.00 | 0.03 | 0.21 | -- |

5.6 Figures

Figure 5.1 Maps of catchment main sites (red triangles) and potential stream-roadway crossings (points) (upper panel) as a subset of seven catchments (lower panel) of varying landscape disturbance.
Moving from west to east, landscape disturbance index (LDI; lower is less disturbed) at the most downstream site is 0.05 for Rogers, 0.39 for Vaughan, 0.55 for Wilket, 0.51 for Morningside, 0.50 for Ganatsekiagon, 0.39 for Pringle, and 0.03 for Ganaraska, where 0.60 is maximally disturbed (i.e., 100% urban cover) in this dataset.

Figure 5.2 Longitudinal patterns of built-up index (BU) estimated at four scales at main sites across seven catchments. Scales were a 100 m-radius buffer around sites (l_100), a 1000 m-radius buffer around sites (l_1000), a 50 m buffer extending out perpendicular to the stream on either side and along the upstream reaches (r_50), and the whole catchment (c_). “% distance downstream” is the percentage of distance travelled along the network from the most upstream site to the most downstream site. Catchments are Ganaraska (GR), Rogers (RO), Ganatsekiagon (GN), Vaughan (VN), Pringle (PR), Morningside (MO), and Wilket (WI) coloured based on landscape disturbance index values (LDI) at most downstream site (purple → green → yellow indicates increasing catchment disturbance).

Figure 5.3 Longitudinal patterns of landscape disturbance index (LDI), chemistry (SpCond), respiration rate, decomposition rate (% tensile loss per degree day [TLDD]), and benthic macroinvertebrate (Pct.EPT) variables across seven catchments. “% distance downstream” is the percentage of distance travelled along the network from the most upstream site to the most downstream site. Catchments are Ganaraska (GR), Rogers (RO), Ganatsekiagon (GN), Vaughan (VN), Pringle (PR), Morningside (MO), and Wilket (WI) coloured based on landscape disturbance index values (LDI) at most downstream site (purple → green → yellow indicates increasing catchment disturbance).
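The “% distance downstream” axis used in Figures 5.2 and 5.3 can be sketched as a simple rescaling of along-network position. The snippet below is a minimal illustration, assuming each site has a position measured along the main channel in metres with larger values further downstream; the function name and the example positions are hypothetical and are not taken from the study data.

```python
def percent_distance_downstream(positions_m):
    """Rescale along-network positions to 0-100% between the most upstream
    and most downstream sites (assumes larger values are further downstream)."""
    upstream = min(positions_m)
    downstream = max(positions_m)
    span = downstream - upstream
    if span == 0:
        raise ValueError("need at least two sites at different positions")
    return [100.0 * (x - upstream) / span for x in positions_m]


# Hypothetical site positions (m) along one catchment's main channel.
example = percent_distance_downstream([0, 500, 2000])
print(example)
```

Expressing position this way puts catchments of different lengths on a common 0–100% axis, at the cost of hiding absolute distances between sites.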
Figure 5.4 Percentage of variation explained by fixed effects, random effects (catchment and site), and spatial dependencies (tail-up [TU], tail-down [TD], and Euclidean [EUC]) across response variables used in SSN models. Null models include random effects and Non is the non-spatial version of the best spatial model (BestS).

Figure 5.5 Predicted probabilities of stream reaches in three catchments of varying catchment disturbance using the null, best fixed effects non-spatial model, and best spatial model. Here, landscape disturbance index values (LDI) are based on the most downstream site, where 0.6 is maximally disturbed (i.e., 100% urban).

Figure 5.6 Boxplots of chemistry variables (a, c, e) and relationships with strongest predictor (b, d, f). Boxes (a, c, e) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 × IQR. Polygon (b, d, f) is 95% confidence interval (CI) about the relationship. Points (b, d, f) are means and 95% CIs for sites. “Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted. Lines (b, d, f) indicate different regression lines based on years.

Figure 5.7 Boxplots of respiration rate (a) and its relationship with strongest predictor (b). Boxes (a) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 × IQR. Polygon (b) is 95% confidence interval (CI) about the relationship. Points (b) are means and 95% CIs for sites.

Figure 5.8 Boxplots of % tensile loss per degree day (TLDD) (a) and relationship with strongest predictor (b). Boxes (a) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 × IQR. Polygon (b) is 95% confidence interval (CI) about the relationship. Points (b) are means and 95% CIs for sites.
“Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted. Lines (b) indicate different regression lines based on years.

Figure 5.9 Boxplots of benthic macroinvertebrate metrics (a, c, e) and relationships with strongest predictor (b, d, f). Boxes (a, c, e) are interquartile range (IQR, 25th and 75th percentiles) and whiskers extend to data within 1.75 × IQR. Polygon (b, d, f) is 95% confidence interval (CI) about the relationship. Points (b, d, f) are means and 95% CIs for sites. “Partial” indicates that estimates were generated in absence of predictor and subtracted from observations; random effects are not subtracted. Lines (b, d, f) indicate different regression lines based on years (different line types) and substrate category.

Chapter 6: Conclusion

Freshwater ecosystems will continue to be threatened by natural and anthropogenic stressors in the Anthropocene. Likewise, environmental practitioners will continue to be tasked with understanding and predicting cumulative effects of these stressors in the context of complex interactions across spatial and temporal scales (i.e., the invisible present and invisible place). Given the importance of freshwaters and this inherent complexity, studies and methods in ecology need to increase their multiscale predictive capacity.

I used several studies across long land cover disturbance gradients to test hypotheses related to predictive capacity and explained variation. I found good support for the hypothesis that predictive capacity and explained variation increase by incorporating dependencies representing multiple spatial scales. I found varying support for the hypothesis that incorporating spatial dependencies reduces the statistical significance of some predictors when examined (Chapters 2 and 5).
At larger regional scales (Chapter 4), I did not find general support for the hypothesis that urban factors were more important than other environmental factors but this depended on the indicator. At the smaller regional scales in which I employed specific land-use gradients (Chapters 2, 3, and 5), I did find general support for this hypothesis, but again this depended on the indicator; however, predictors had low to moderate explanatory power overall. I found general support for the hypothesis that local-scale landscape variables explain more variation than whole-catchment variables (Chapters 2, 3, and 5). Finally, I did not find support for the hypothesis that headwater conditions explain similar or more variation than other landscape variables (Chapters 4 and 5). The support for these hypotheses is outlined below along with limitations and future considerations.

6.1 Multiscale dependency effects on predictive capacity and explained variation

I hypothesized that predictive capacity and explained variation increase by incorporating dependencies that represent multiple spatial scales. Since I used slightly different ways of examining this hypothesis per chapter, I assess support for the hypothesis for each chapter below.

In Chapter 2, incorporating spatial dependencies doubled predictive capacity of headwater conditions (i.e., HCI) although it was still relatively low (LOOCV R2 = 0.23). Adding spatial dependencies increased explained variation from 12% (i.e., non-spatial model) to a range of 17–76% depending on the dependency structure. There was wide variability in explained variation with different dependency structures despite relative stability in predictive capacity. This suggested that adding “any” spatial dependency structure increased predictive capacity. However, only 16% of sites had more than one upstream site and 72% had no upstream sites.
This lack of upstream and downstream connections is challenging for spatial stream network models attempting to model flow-connected autocorrelation since few observations would be in each distance class. I anticipate the study of more densely sampled headwater networks should reveal more insight into the best spatial dependencies for headwater indicators. I also anticipate that simulation studies considering multiscale variability (i.e., between-catchment, between-site, and between-observations) for any parameter may reveal when increased sampling density stabilizes spatial parameters. In this way, coupling finer-scale study of spatial dependencies along dense networks with simulation studies may reveal more insight into best dependency structures for increasing predictive capacity for headwater indicators.

In Chapter 3, I only evaluated explained variation in the study of decomposition rates in urban headwaters. Adding Euclidean spatial dependencies did not increase explained variation. However, applying a hierarchical data structure (i.e., grouping) explained 66% of the total variation. In mixed effects models, predictions can be marginal or conditional. Marginal refers to the expected value given the fixed effects whereas conditional refers to the expected value given the fixed effects and random effects (Pinheiro and Bates 2000). Considering a null model with hierarchical clustering of observations, the conditional expectations or predictions (also known as Best Linear Unbiased Predictors) are simply group-level departures from the mean. By design, this increases predictive capacity. However, if variance estimates are used in subsequent studies or if multiscale variation is used for power analysis in designing sampling effort, these multiscale dependencies can also increase predictive capacity.

In Chapter 4, I evaluated predictive capacity using out-of-bag prediction error (OOB R2) for benthic macroinvertebrate (BMI) metrics.
For explained variation, a useful measure was the % increase in model mean-squared error (% increase MSE) if the predictor was randomly permuted. Predictive capacity was generally higher at the larger regional scale than in the smaller Toronto region. In the Toronto region, incorporating spatial variables (asymmetric and Moran’s eigenvector maps or headwater variables) did not increase predictive capacity on average but tended to increase predictive capacity in those metrics that were most poorly predicted in a base model. Furthermore, adding spatial variables was useful since important variables seemed to represent between-network differences that can result from historical or community level interactions (e.g., priority effects) that may not be captured by a land cover gradient (Urban and De Meester 2009, Brown et al. 2011). With respect to % increase MSE for predictors, environmental predictors were about as important as land cover predictors for BMI metrics. Together, these pieces of information support the notion that adding spatial and temporal contexts can increase predictive capacity and explained variation.

In Chapter 5, I used hierarchical structures in null models and evaluated how predictive capacity and explained variation changed when predictors and spatial dependencies were incorporated into models. Based on LOOCV and 5-fold cross-validations, predictive capacity did not substantially increase from null to non-spatial to spatial models. This was likely because null models included hierarchical structures that explained a median 72% of the variation, indicating strong correlations of observations within groups. More interesting were the shifts in explained variation from hierarchical structure to spatial dependencies. Of the remaining variation after accounting for fixed effects, spatial dependencies accounted for a median 58% (range: 1–100%) versus the hierarchical structure (median and range: 25% [0–96%]).
Although indicator-dependent, these findings tended to support the notion that adding spatial dependencies increases predictive capacity and explained variation.

Although widely recognized in the literature, a remaining challenge in environmental research is how best to account for temporal and spatial scales for increased predictive capacity (Peters 1991, Levin 1992, Polasky et al. 2011, Chave 2013, Houlahan et al. 2017). My studies certainly support that indicator choice matters since predictive capacity, explained variation, and the importance of predictors and spatial dependencies were indicator specific (Siddig et al. 2016). From an indicator perspective, my screening approach using random forests (Chapters 4 and 5) is a useful contribution for three reasons: 1) it can point to those variables best predicted by potential predictors, 2) it can allow visualization of relationships (e.g., non-linearities) with potential predictors, and 3) it can accommodate more potential predictors (i.e., potentially more nuanced effects) than more common approaches like linear regression. Machine learning techniques (e.g., random forest regression) are increasingly used in environmental sciences (Farley et al. 2018); however, I see major disadvantages considering that incorporating dependency structures (hierarchical and/or spatial) was important in my studies. Specific predictors would be needed to model dependencies (e.g., categorical factors, AEMs, MEMs), but even then, current machine learning algorithms may not adequately sample from levels or separation distances to avoid spurious conclusions. Nevertheless, we did use AEMs and MEMs as spatial variables in our attempts to examine unmeasured variation. With only increased predictive capacity in mind (i.e., not the underlying spatial relationship), random forest techniques have been developed using distance buffers to “borrow strength” from nearby sites, akin to kriging in geostatistics (Hengl et al. 2018).
An interesting extension of this idea would be to have buffers that extend in the upstream, downstream, or all directions (akin to the TU, TD, and EUC) compared with SSN model predictions. However, the strongest gains in predictive capacity, in my opinion, would come from coupling machine learning techniques (choice of indicator and potential predictors) with more explicit empirical models to examine causal relationships that account for multiscale dependencies.

6.2 Multiscale dependency effects on statistical significance of predictors

I found limited support for the hypothesis that incorporating multiscale dependencies reduces explained variation and statistical significance of fixed effects. For example, in Chapter 2, adding spatial dependencies reduced the explained variation of fixed effects and the effect sizes of the parameters but not their statistical significance. In Chapter 5, adding spatial dependencies reduced explained variation by a relatively small amount, and in only three of eight cases (SpCond, TLDD, and Pct.EPT) did the statistical significance of any factor change.

It is well known that not incorporating dependency structures can lead to spurious conclusions in environmental studies (Zuur et al. 2010, Legendre and Legendre 2012). In response, environmental scientists continue to develop methods for confronting complexity in observational studies. For example, only recently have relatively complex but powerful Bayesian spatiotemporal frameworks become more accessible to novice practitioners (e.g., R-INLA; Blangiardo et al. 2013, Zuur et al. 2017). Some promising methods even consider phylogeny and species traits along with spatial and temporal contexts (Ovaskainen et al. 2017). Stream ecologists have benefited from pioneering developments that align statistical frameworks with how stream networks are assumed to function (Cressie et al. 2006, Ver Hoef et al. 2006, Ver Hoef and Peterson 2010).
For example, these frameworks recognize that autocorrelation patterns may differ for chemistry (e.g., from upstream to downstream) and fish (e.g., in both directions) and that different weighting schemes (e.g., stream order, stream slope) may be more useful than others for research questions (Peterson et al. 2013, Frieden et al. 2014). These models have wide applicability from predicting stream temperature through to the abundance of organisms and testing of conceptual models (Isaak et al. 2014, 2017, Larsen et al. 2019). More studies applying this approach may reveal important insight into the spatial variation of stream ecosystems across scales. However, there are some limitations to overcome. In my studies, for example, I was mostly interested in variation of indicators at different levels (i.e., between-catchment and between-site), which requires variance component estimates. In the current models, variance components and spatial parameters do not have measures of uncertainty. SSN models may benefit from employing Bayesian Markov chain Monte Carlo sampling to provide estimates of uncertainty about all parameters (e.g., Finley et al. 2007). This would include those fixed effects parameters susceptible to type I errors based on covariance of observations in frequentist frameworks.

6.3 Urban and local-scale predictor importance

I hypothesized that urban predictors (e.g., landscape disturbance index, satellite-based built-up index) would be the strongest in my studies since anthropogenic effects are so pervasive in freshwaters. I also hypothesized that local-scale landscape variables (e.g., l_100_LDI) and spatial dependencies would explain more total variation than larger-scale landscape variables (e.g., c_LDI) and tend to explain more between-site variability than between-catchment variability. This was because I anticipated that discontinuities through local-scale disturbances are likely more common than broader longitudinal patterns.
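As a minimal illustration of what "between-catchment" versus "between-site" variability means in the hypothesis above, the sketch below partitions the variance of an indicator for a balanced design using one-way ANOVA method-of-moments estimates. This is not the estimation approach used in my studies (which relied on mixed effects and SSN models); the function name `variance_components`, the catchment labels, and the indicator values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def variance_components(values, groups):
    """Method-of-moments variance components for a balanced one-way design.

    Returns (between-group variance, within-group variance).
    Assumes every group has the same number of observations.
    """
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    k = len(labels)            # number of catchments
    n = len(values) // k       # sites per catchment (balanced design assumed)

    group_means = np.array([values[groups == g].mean() for g in labels])
    grand_mean = values.mean()

    # Mean squares between and within catchments
    ms_between = n * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    ss_within = sum(
        np.sum((values[groups == g] - m) ** 2)
        for g, m in zip(labels, group_means)
    )
    ms_within = ss_within / (k * (n - 1))

    # Negative estimates are truncated at zero
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, ms_within


# Hypothetical indicator values: three catchments with two sites each
values = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
catchments = ["A", "A", "B", "B", "C", "C"]

var_catchment, var_site = variance_components(values, catchments)
share_catchment = var_catchment / (var_catchment + var_site)
print(f"between-catchment share of variance: {share_catchment:.2f}")
```

In this toy example most of the variance sits between catchments; a predictor that explains mainly between-catchment differences would therefore shrink the first component far more than the second.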
My studies were consistent with evidence in the literature that urban landscapes exert strong influence on freshwater indicators like water chemistry (e.g., SpCond) and sensitive organisms (e.g., HBI) (Paul and Meyer 2001, Walsh et al. 2005, Stanfield and Kilgour 2006, Vörösmarty et al. 2010, Booth et al. 2016, Reid et al. 2019). They were also consistent with evidence that different environmental settings can be equally important or can change the shape and magnitude of relationships (Cuffney et al. 2010, Booth et al. 2016, Utz et al. 2016). Together, they support the notion that cumulative effects assessment should consider spatial and temporal contexts (Jones 2016). In Chapter 2, c_BU had an approximately three times greater effect than environmental factors on headwater conditions despite low overall explanatory power. In Chapter 3, the cotton strip-scale depth property (e.g., burial) had nearly equal or greater effects (dependent on depth category) than NDVI; the NDVI effect was only stronger at higher water accumulation sites. In Chapter 4, the importance of environmental and land cover predictors was nearly equal in the larger regional model but the variation in importance was wide. Even in the smaller Toronto region, where I expected land cover to be more important since environmental settings were comparable across the region, this finding was similar. This suggests that, overall, spatial context needs to be accounted for to avoid spurious findings. In Chapter 5, r_50_BU had nearly double the effect size of other variables in the probabilistic stream network model. For chemistry, r_50_BU had the strongest effect on SpCond but there were no strong urban predictors for pH or DO.PercEU. For decomposition rate, l_1000_LDI was the strongest predictor but nearly equal to the categorical effect of Year. For respiration rate, the strongest predictor, Chl-a, was moderately correlated with LDI.
Finally, for benthic macroinvertebrates, SpCond was most important for PCoA1, which was related to r_50_BU, and l_1000_BU had relatively strong effects when interacting with catchment area for Pct.EPT and HBI. Considering all my studies, urban predictors, and particularly satellite indices, appeared to be important contributors to explaining indicator variation.

Our studies were also consistent with findings that stream models should consider local-scale landscape variables since they may better predict indicator responses than whole catchment measures (King et al. 2005, Frieden et al. 2014, Kuglerová et al. 2019). Several lines of evidence support that local-scale landscape variables were better predictors and that the best predictors tended to be satellite indices. In Chapter 2, c_BU was the strongest predictor, but local estimates were very similar to whole-catchment estimates at these headwater sites. In Chapter 5, r_50_BU was the strongest predictor in the probabilistic stream network model; l_1000_BU was the strongest for SpCond, Pct.EPT, and HBI; and l_1000_LDI was strongest for TLDD. Coupling our local-scale finding with our finding that satellite indices tended to be the best landscape predictors, a useful avenue of research would be integrating gradient surface metrics; these metrics are suites of derived metrics based on continuous data rather than more common and well-established categorical analogues (Kedron et al. 2018). For example, we used the standard deviation of satellite indices in Chapter 4 to estimate “surface roughness” or average vegetation density variability.

There are some important limitations to consider with this local-scale finding. First, there were more local-scale variables (e.g., l_100, l_1000, and r_50 versus c_), which may have increased the chances of this finding. Second, for consistency across studies, I felt that my GIS buffers represented practical local scales of variation as in other studies (Frieden et al.
2014, Kuglerová et al. 2019) but a more thorough understanding of the “scale of effect” (i.e., radius of buffer) might reveal more important information for each indicator (Jackson and Fahrig 2015, Miguet et al. 2016). Third, my hypothesis was not correct. Rather than explaining between-site differences as I expected, those local variables tended to explain between-catchment differences, which is consistent with the finding that catchments can set a background template to local scale estimates (King et al. 2005). Furthermore, spatial dependencies explained more between-catchment differences for chemistry, between-catchment and between-site differences equally for TLDD, and more between-site differences for benthic macroinvertebrates. These findings suggest varying spatial controls on indicators and that application of SSN models to a variety of indicators may yield more general findings in urban stream networks.

6.4 Importance of headwaters to downstream

I hypothesized that headwater conditions explain similar or more variation than other landscape variables since they may represent more nuanced cumulative effects depending on how they are aggregated (e.g., distance-weighted vs. moving average). In Chapter 4, HCI did not increase predictive capacity but had similar median importance to many other variables across the BMI metrics. In Chapter 5, HCI ranked high as a predictor for SpCond, explaining 25% of the variation. However, rankings of HCI variables varied widely between indicators and tended to be in the mid to lower ranges where the relative likelihood of these models was < 0.01. This suggests that, generally, HCI was not explaining more variation than others. A possible reason for this finding is that local-scale landscape variables were better predictors than whole catchment predictors; my HCI predictors were aggregate conditions of upstream headwaters and therefore more correlated to whole catchment predictors.
I attempted to account for spatial arrangement of sites (i.e., distance-based and probability-based weightings of headwater conditions) but this did not result in high variability between the different aggregation schemes. My use of SSN models was more for capturing patterns of variation in HCI across the landscape and there may be more suitable network approaches (e.g., network structure approaches; Grant et al. 2007, Peterson et al. 2013, Stanfield et al. 2014) that link these predictions to downstream than my simple aggregation schemes.

Despite wide recognition of tight linkages between headwaters and surrounding catchments and that they individually provide habitat, processing, and exporting functions, fewer studies have taken landscape approaches that directly link headwater conditions to downstream properties (Bishop et al. 2008, Stanfield et al. 2014, Kuglerová et al. 2017b). More studies link an indicator to its downstream counterpart (e.g., water chemistry or temperature) or some aspect of headwater connectivity to downstream (Wipfli et al. 2007, Fritz et al. 2018). Some studies take hybrid approaches using properties like catchment area and topological connections to examine the influence of tributaries on downstream (Benda et al. 2004, Kiffney et al. 2006, Jones and Schmidt 2018). Owing to dendritic structure, network length of these tributaries can far exceed that of downstream, which poses significant challenges to biomonitoring in terms of resources, let alone expected spatial and temporal variability (Gomi et al. 2002, Richardson 2019b, 2019a). Tracking headwater variability may be futile since it is expected that smaller scale variability is diluted as more and more streams combine (Abbott et al. 2018, Fritz et al. 2018).
It is imperative to examine the individual and cumulative effects of headwater impairment on downstream environments considering that, despite their perceived importance on the landscape (Wohl 2017), they are still disproportionately buried or turned into drainage infrastructure in practice (Kaushal and Belt 2012, Stammler et al. 2013, Napieralski et al. 2015). I anticipated that SSN models would confront many of these challenges because I could 1) generate a diagnostic indicator of headwater condition, 2) relate it to a set of landscape variables, and 3) use fixed effects and spatial dependencies to predict headwater condition “accumulation” down the stream network. My HCI had relatively weak links to landscape variables; nevertheless, incorporating spatial dependencies increased explanatory power and predictive capacity. This suggests more intensive sampling is beneficial and that landscape links to headwaters need to be further examined. It could also be that HCI might benefit from specific management considerations (e.g., erosion) rather than capturing general headwater condition. For headwater streams, there is still strong promise in SSN models for developing network predictions for headwater indicators and downstream indicators, relating them to each other, and determining at what scales headwater effects diminish.

6.5 Concluding remarks

In this dissertation, I was interested in using multiscale approaches to improve our understanding of complex interactions in stream ecology. Predictive capacity is still generally poor in ecology; this likely owes more to contemporary studies existing in the invisible present and invisible place and the cumulative effects of natural and human stressors than to analytical methods.
Nevertheless, more emphasis needs to be placed on examining null hierarchical variation and non-random spatial and temporal residual variation for improving conceptual understanding of contemporary systems and reducing our tendency to generate weak predictions; examining these patterns may reveal important and missing scales of sampling. Examining null hierarchical variation allows for the consideration of scales that most vary in the context of our studies. Examining and quantifying residual variation can guide us on what scales need further consideration. I strongly suspect that wide application and reporting of these pieces of information will provide useful insight for making empirical models and concepts more spatially explicit (e.g., river continuum and serial discontinuity concepts). From Levin (1992), “the identification of pattern is an entrée into the identification of scales.” Relatively new techniques such as random forest models have little utility if they only lead to good prediction but no holistic understanding of sources of variation. The United Nations defines capacity building and development as increasing capability over time (United Nations Office for Disaster Risk Reduction 2017). Ecological predictive capacity building should result from multiscale models that appropriately account for spatial dependencies (e.g., spatial stream network models) to make the best decisions in the face of complexity.

Bibliography

AAFC [Agriculture and Agri-Food Canada]. 2011. Annual Crop Inventory, 2011. Government of Canada. Available: https://open.canada.ca/data/en/dataset/58ca7629-4f6d-465a-88eb-ad7fd3a847e3
AAFC [Agriculture and Agri-Food Canada]. 2015. Annual Crop Inventory, 2015. Government of Canada. Available: https://open.canada.ca/data/en/dataset/3688e7d9-7520-42bd-a3eb-8854b685fef3
Abbott, B. W., G. Gruau, J. P. Zarnetske, F. Moatar, L. Barbe, Z. Thomas, O. Fovet, T. Kolbe, S. Gu, A.-C. Pierson‐Wickmann, P. Davy, and G. Pinay. 2018.
Unexpected spatial stability of water chemistry in headwater stream networks. Ecology Letters 21:296–308.
Alexander, R. B., E. W. Boyer, R. A. Smith, G. E. Schwarz, and R. B. Moore. 2007. The role of headwater streams in downstream water quality. JAWRA Journal of the American Water Resources Association 43:41–59.
Allan, J. D. 2004. Landscapes and riverscapes: the influence of land use on stream ecosystems. Annual Review of Ecology, Evolution, and Systematics 35:257–284.
Allan, J. D., and M. M. Castillo. 2009. Stream ecology: structure and function of running waters. Second Edition. Springer, Dordrecht, The Netherlands.
Angradi, T. R., M. S. Pearson, T. M. Jicha, D. L. Taylor, D. W. Bolgrien, M. F. Moffett, K. A. Blocksom, and B. H. Hill. 2009. Using stressor gradients to determine reference expectations for great river fish assemblages. Ecological Indicators 9:748–764.
Baker, E. A., K. E. Wehrly, P. W. Seelbach, L. Wang, M. J. Wiley, and T. Simon. 2005. A multimetric assessment of stream condition in the Northern Lakes and Forests ecoregion using spatially explicit statistical modeling and regional normalization. Transactions of the American Fisheries Society 134:697–710.
Baruch, E. M., K. A. Voss, J. R. Blaszczak, J. Delesantro, D. L. Urban, and E. S. Bernhardt. 2018. Not all pavements lead to streams: variation in impervious surface connectivity affects urban stream ecosystems. Freshwater Science 37:673–684.
Baston, D. 2019. exactextractr: Fast extraction from raster datasets using polygons. Available: https://github.com/isciences/exactextractr
Bazinet, N. L., B. M. Gilbert, and A. M. Wallace. 2010. A comparison of urbanization effects on stream benthic macroinvertebrates and water chemistry in an urban and an urbanizing basin in Southern Ontario, Canada. Water Quality Research Journal 45:327–341.
Benda, L., N. L. Poff, D. Miller, T. Dunne, G. Reeves, G. Pess, and M. Pollock. 2004.
The network dynamics hypothesis: how channel networks structure riverine habitats. BioScience 54:413–427.
Beven, K. J., and M. J. Kirkby. 1979. A physically based, variable contributing area model of basin hydrology. Hydrological Sciences Journal 24:43–69.
Bierschenk, A. M., C. Savage, C. R. Townsend, and C. D. Matthaei. 2012. Intensity of land use in the catchment influences ecosystem functioning along a freshwater-marine continuum. Ecosystems 15:637–651.
Bishop, K., I. Buffam, M. Erlandsson, J. Fölster, H. Laudon, J. Seibert, and J. Temnerud. 2008. Aqua Incognita: the unknown headwaters. Hydrological Processes 22:1239–1242.
Bivand, R. 2017. rgrass7: Interface between GRASS 7 Geographical Information System and R. Software available through: https://CRAN.R-project.org/package=rgrass7.
Bivand, R., T. Keitt, and B. Rowlingson. 2017. rgdal: Bindings for the “Geospatial” Data Abstraction Library. Software available through: https://CRAN.R-project.org/package=rgdal.
Bivand, R., and C. Rundel. 2018. rgeos: Interface to geometry engine - open source ('GEOS'). Software available through: https://CRAN.R-project.org/package=rgeos.
Bivand, R. S., E. Pebesma, and V. Gomez-Rubio. 2013. Applied spatial data analysis with R, Second edition. Springer, New York, USA.
Blanchet, F. G., P. Legendre, and D. Borcard. 2008. Modelling directional spatial processes in ecological data. Ecological Modelling 215:325–336.
Blanchet, F. G., P. Legendre, R. Maranger, D. Monti, and P. Pepin. 2011. Modelling the effect of directional spatial ecological processes at different scales. Oecologia 166:357–368.
Blangiardo, M., M. Cameletti, G. Baio, and H. Rue. 2013. Spatial and spatio-temporal models with R-INLA. Spatial and Spatio-temporal Epidemiology 4:33–49.
Bolker, B. M., M. E. Brooks, C. J. Clark, S. W. Geange, J. R. Poulsen, M. H. H. Stevens, and J.-S. S. White. 2009. Generalized linear mixed models: a practical guide for ecology and evolution. Trends in Ecology & Evolution 24:127–135.
Booth, D. B., A. H. Roy, B. Smith, and K. A. Capps. 2016. Global perspectives on the urban stream syndrome. Freshwater Science 35:412–420.
Borcard, D., F. Gillet, and P. Legendre. 2018. Numerical Ecology with R. Second edition. Springer-Verlag, New York, USA.
Borisko, J. P., B. W. Kilgour, L. W. Stanfield, and F. C. Jones. 2007. An evaluation of rapid bioassessment protocols for stream benthic invertebrates in Southern Ontario, Canada. Water Quality Research Journal 42:184–193.
Boudreau, S. A., and N. D. Yan. 2004. Auditing the accuracy of a volunteer-based surveillance program for an aquatic invader Bythotrephes. Environmental Monitoring and Assessment 91:17–26.
Breiman, L. 2001. Random forests. Machine Learning 45:5–32.
Brenning, A. 2008. Statistical geocomputing combining R and SAGA: The example of landslide susceptibility analysis with generalized additive models. Pages 23–32 SAGA – Seconds Out (= Hamburger Beitraege zur Physischen Geographie und Landschaftsoekologie, vol. 19). Institut für Geographie der Universität Hamburg, Hamburg, Germany.
Brett, M. T., S. E. Bunn, S. Chandra, A. W. E. Galloway, F. Guo, M. J. Kainz, P. Kankaala, D. C. P. Lau, T. P. Moulton, M. E. Power, J. B. Rasmussen, S. J. Taipale, J. H. Thorp, and J. D. Wehr. 2017. How important are terrestrial organic carbon inputs for secondary production in freshwater ecosystems? Freshwater Biology 62:833–853.
Brown, B. L., C. M. Swan, D. A. Auerbach, E. H. Campbell Grant, N. P. Hitt, K. O. Maloney, and C. Patrick. 2011. Metacommunity theory as a multispecies, multiscale framework for studying the influence of river network structure on riverine communities and ecosystems. Journal of the North American Benthological Society 30:310–327.
Bunn, S. E., P. M. Davies, and T. D. Mosisch. 1999. Ecosystem measures of river health and their response to riparian and catchment degradation. Freshwater Biology 41:333–345.
Burgess, H. K., L. B. DeBey, H. E. Froehlich, N. Schmidt, E. J. Theobald, A. K.
Ettinger, J. HilleRisLambers, J. Tewksbury, and J. K. Parrish. 2017. The science of citizen science: Exploring barriers to use as a primary research tool. Biological Conservation 208:113–120.
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference. Springer, New York, USA.
Chadwick, M. A., D. R. Dobberfuhl, A. C. Benke, A. D. Huryn, K. Suberkropp, and J. E. Thiele. 2006. Urbanization affects stream ecosystem function by altering hydrology, chemistry, and biotic richness. Ecological Applications 16:1796–1807.
Chapman, L. J., and D. F. Putnam. 1997. Physiography of southern Ontario; Ontario Geological Survey, Miscellaneous Release—Data 228. Ontario Geological Survey, Sudbury, Ontario.
Chauvet, E., V. Ferreira, P. S. Giller, B. G. McKie, S. D. Tiegs, G. Woodward, A. Elosegi, M. Dobson, T. Fleituch, M. A. S. Graça, V. Gulis, S. Hladyz, J. O. Lacoursière, A. Lecerf, J. Pozo, E. Preda, M. Riipinen, G. Rîşnoveanu, A. Vadineanu, L. B.-M. Vought, and M. O. Gessner. 2016. Litter Decomposition as an Indicator of Stream Ecosystem Functioning at Local-to-Continental Scales. Pages 99–182 Advances in Ecological Research Volume 55. Elsevier, UK.
Chave, J. 2013. The problem of pattern and scale in ecology: what have we learned in 20 years? Ecology Letters 16:4–16.
Clapcott, J. E., and L. A. Barmuta. 2010. Metabolic patch dynamics in small headwater streams: exploring spatial and temporal variability in benthic processes. Freshwater Biology 55:806–824.
Coles, J. F., G. McMahon, A. H. Bell, L. R. Brown, F. A. Fitzpatrick, B. C. Scudder Eikenberry, M. D. Woodside, T. F. Cuffney, W. L. Bryant Jr., K. Cappiella, L. Fraley-McNeal, and W. P. Stack. 2012. Effects of urban development on stream ecosystems in nine metropolitan study areas across the United States. Page 152. USGS Numbered Series, U.S. Geological Survey, Reston, VA.
COMAP [Centre for Community Mapping]. 2019. Flowing Waters Information System (FWIS).
University of Waterloo, Waterloo, Ontario.
Conrad, O., B. Bechtel, M. Bock, H. Dietrich, E. Fischer, L. Gerlitz, J. Wehberg, V. Wichmann, and J. Böhner. 2015. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geoscientific Model Development 8:1991–2007.
Corsi, S. R., L. A. De Cicco, M. A. Lutz, and R. M. Hirsch. 2015. River chloride trends in snow-affected urban watersheds: increasing concentrations outpace urban growth rate and are common among all seasons. Science of The Total Environment 508:488–497.
Crain, C. M., K. Kroeker, and B. S. Halpern. 2008. Interactive and cumulative effects of multiple human stressors in marine systems. Ecology Letters 11:1304–1315.
Cressie, N., J. Frey, B. Harch, and M. Smith. 2006. Spatial prediction on a river network. Journal of Agricultural, Biological, and Environmental Statistics 11:127–150.
Crutzen, P. J. 2006. The “Anthropocene.” Pages 13–18 in E. Ehlers and T. Krafft, editors. Earth System Science in the Anthropocene. Springer, Berlin, Heidelberg, Germany.
Cuffney, T. F., R. A. Brightbill, J. T. May, and I. R. Waite. 2010. Responses of benthic macroinvertebrates to environmental changes associated with urbanization in nine metropolitan areas. Ecological Applications 20:1384–1401.
Dale, M. R. T., and M.-J. Fortin. 2014. Spatial analysis: a guide for ecologists. Second Edition. Cambridge University Press, Cambridge, UK; New York, USA.
Danger, M., J. Cornut, A. Elger, and E. Chauvet. 2012. Effects of burial on leaf litter quality, microbial conditioning and palatability to three shredder taxa: Leaf litter burial and palatability. Freshwater Biology 57:1017–1030.
Dickinson, J. L., J. Shirk, D. Bonter, R. Bonney, R. L. Crain, J. Martin, T. Phillips, and K. Purcell. 2012. The current state of citizen science as a tool for ecological research and public engagement. Frontiers in Ecology and the Environment 10:291–297.
Dixit, S. S., A. S. Dixit, and J. P. Smol. 1992.
Assessment of changes in lake water chemistry in Sudbury area lakes since preindustrial times. Canadian Journal of Fisheries and Aquatic Sciences 49:8–16.
Dodds, W. K., and R. M. Oakes. 2008. Headwater influences on downstream water quality. Environmental Management 41:367–377.
Dray, S., P. Legendre, and P. R. Peres-Neto. 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM). Ecological Modelling 196:483–493.
Dray, S., R. Pélissier, P. Couteron, M.-J. Fortin, P. Legendre, P. R. Peres-Neto, E. Bellier, R. Bivand, F. G. Blanchet, M. De Cáceres, A.-B. Dufour, E. Heegaard, T. Jombart, F. Munoz, J. Oksanen, J. Thioulouse, and H. H. Wagner. 2012. Community ecology in the age of multivariate multiscale spatial analysis. Ecological Monographs 82:257–275.
Ellis, L. E., and N. E. Jones. 2016. A test of the serial discontinuity concept: longitudinal trends of benthic invertebrates in regulated and natural rivers of northern Canada. River Research and Applications 32:462–472.
ESRI [Environmental Systems Research Institute]. 2014. ArcMap. Redlands, USA.
Farley, S. S., A. Dawson, S. J. Goring, and J. W. Williams. 2018. Situating ecology as a big-data science: Current advances, challenges, and solutions. BioScience 68:563–576.
Findlay, S. 1995. Importance of surface-subsurface exchange in stream ecosystems: The hyporheic zone. Limnology and Oceanography 40:159–164.
Finley, A. O., S. Banerjee, and B. P. Carlin. 2007. spBayes: an R package for univariate and multivariate hierarchical point-referenced spatial models. Journal of Statistical Software 19:1–24.
Freeman, M. C., C. M. Pringle, and C. R. Jackson. 2007. Hydrologic connectivity and the contribution of stream headwaters to ecological integrity at regional scales. JAWRA Journal of the American Water Resources Association 43:5–14.
Frieden, J. C., E. E. Peterson, J. Angus Webb, and P. M. Negus. 2014.
Improving the predictive power of spatial statistical models of stream macroinvertebrates using weighted autocovariance functions. Environmental Modelling & Software 60:320–330.
Fritz, K. M., B. R. Johnson, and D. M. Walters. 2006. Field operations manual for assessing the hydrologic permanence and ecological condition of headwater streams. U.S. Environmental Protection Agency, Office of Research and Development, Washington DC.
Fritz, K. M., K. A. Schofield, L. C. Alexander, M. G. McManus, H. E. Golden, C. R. Lane, W. G. Kepner, S. D. LeDuc, J. E. DeMeester, and A. I. Pollard. 2018. Physical and chemical connectivity of streams and riparian wetlands to downstream waters: A synthesis. JAWRA Journal of the American Water Resources Association 54:323–345.
García, L., W. F. Cross, I. Pardo, and J. S. Richardson. 2017. Effects of landuse intensification on stream basal resources and invertebrate communities. Freshwater Science 36:609–625.
Gessner, M. O., and E. Chauvet. 2002. A case for using litter breakdown to assess functional stream integrity. Ecological Applications 12:498–510.
Gessner, M. O., E. Chauvet, and M. Dobson. 1999. A perspective on leaf litter breakdown in streams. Oikos 85:377.
Gomi, T., R. C. Sidle, and J. S. Richardson. 2002. Understanding processes and downstream linkages of headwater systems. BioScience 52:905.
Gorelick, N., M. Hancher, M. Dixon, S. Ilyushchenko, D. Thau, and R. Moore. 2017a. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment.
Gorelick, N., M. Hancher, M. Dixon, S. Ilyushchenko, D. Thau, and R. Moore. 2017b. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment 202:18–27.
Gotelli, N. J., and R. K. Colwell. 2011. Estimating species richness. Page in A. E. Magurran and B. J. McGill, editors. Biological diversity: frontiers in measurement and assessment. Oxford University Press, Oxford; New York.
Grabs, T., K. Bishop, H. Laudon, S. W.
Lyon, and J. Seibert. 2012. Riparian zone hydrology and soil water total organic carbon (TOC): implications for spatial variability and upscaling of lateral riparian TOC exports. Biogeosciences 9:3901–3916.
Grant, E. H. C., W. H. Lowe, and W. F. Fagan. 2007. Living in the branches: population dynamics and ecological processes in dendritic networks. Ecology Letters 10:165–175.
GRASS Development Team. 2017. Geographic Resources Analysis Support System (GRASS GIS) Software, Version 7.4. Open Source Geospatial Foundation. Software available through: https://grass.osgeo.org.
Gray, S., R. Jordan, A. Crall, G. Newman, C. Hmelo-Silver, J. Huang, W. Novak, D. Mellor, T. Frensley, M. Prysby, and A. Singer. 2017. Combining participatory modelling and citizen science to support volunteer conservation action. Biological Conservation 208:76–86.
Greenberg, J. A., and M. Mattiuzzi. 2015. gdalUtils: Wrappers for the Geospatial Data Abstraction Library (GDAL) Utilities. Software available through: https://CRAN.R-project.org/package=gdalUtils.
Greenberg, J. A., and M. Mattiuzzi. 2018. gdalUtils: Wrappers for the Geospatial Data Abstraction Library (GDAL) Utilities. Software available through: https://CRAN.R-project.org/package=gdalUtils.
Griffiths, N. A., and S. D. Tiegs. 2016. Organic-matter decomposition along a temperature gradient in a forested headwater stream. Freshwater Science 35:518–533.
Hawkins, C. P., J. R. Olson, and R. A. Hill. 2010. The reference condition: predicting benchmarks for ecological and water-quality assessments. Journal of the North American Benthological Society 29:312–343.
He, C., P. Shi, D. Xie, and Y. Zhao. 2010. Improving the normalized difference built-up index to map urban built-up areas using a semiautomatic segmentation approach. Remote Sensing Letters 1:213–221.
Hengl, T., M. Nussbaum, M. N. Wright, G. B. M. Heuvelink, and B. Gräler. 2018. Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables.
PeerJ 6:e5518. Herbst, G. N. 1980. Effects of burial on food value and consumption of leaf detritus by aquatic invertebrates in a lowland forest stream. Oikos 35:411. 181  Hijmans, R. J. 2017. raster: Geographic Data Analysis and Modeling. Software available through: https://CRAN.R-project.org/package=raster. Hijmans, R. J. 2019. raster: Geographic Data Analysis and Modeling. Software available through: https://CRAN.R-project.org/package=raster. Hogg, S., L. Stanfield, and C. Jones. 2013. Site features for water quality surveys. Pages 77–86 in L. Stanfield, editor. Ninth edition. Ontario Ministry of Natural Resources, Peterborough, Canada. Houlahan, J. E., S. T. McKinney, T. M. Anderson, and B. J. McGill. 2017. The priority of prediction in ecological understanding. Oikos 126:1–7. Hoyle, J. A., and M. J. Yuille. 2016. Nearshore fish community assessment on Lake Ontario and the St. Lawrence River: A trap net-based index of biotic integrity. Journal of Great Lakes Research 42:687–694. Hynes, H. B. N. 1975. The stream and its valley. SIL Proceedings, 1922-2010 19:1–15. Imberger, S. J., R. M. Thompson, and M. R. Grace. 2010. Searching for effective indicators of ecosystem function in urban streams: assessing cellulose decomposition potential: Cellulose strips as a lotic functional indicator. Freshwater Biology 55:2089–2106. Imberger, S. J., C. J. Walsh, and M. R. Grace. 2008. More microbial activity, not abrasive flow or shredder abundance, accelerates breakdown of labile leaf litter in urban streams. Journal of the North American Benthological Society 27:549–561. Iooss, B., A. Janon, G. Pujol,  with contributions from K. Boumhaout, S. D. Veiga, T. Delage, J. Fruth, L. Gilquin, J. Guillaume, L. L. Gratiet, P. Lemaitre, B. L. Nelson, F. Monari, R. Oomen, O. Rakovec, B. Ramos, O. Roustant, E. Song, J. Staum, R. Sueur, T. Touati, and 182  F. Weber. 2018. sensitivity: Global Sensitivity Analysis of Model Outputs. 
Software available through: https://CRAN.R-project.org/package=sensitivity. Isaak, D. J., E. E. Peterson, J. M. Ver Hoef, S. J. Wenger, J. A. Falke, C. E. Torgersen, C. Sowder, E. A. Steel, M.-J. Fortin, C. E. Jordan, A. S. Ruesch, N. Som, and P. Monestiez. 2014. Applications of spatial statistical network models to stream data: Spatial statistical network models for stream data. Wiley Interdisciplinary Reviews: Water 1:277–294. Isaak, D. J., J. M. Ver Hoef, E. E. Peterson, D. L. Horan, and D. E. Nagel. 2017. Scalable population estimates using spatial-stream-network (SSN) models, fish density surveys, and national geospatial database frameworks for streams. Canadian Journal of Fisheries and Aquatic Sciences 74:147–156. Jackson, H. B., and L. Fahrig. 2015. Are ecologists conducting research at the optimal scale? Global Ecology and Biogeography 24:52–63. Johnson, R. C., M. M. Carreiro, H.-S. Jin, and J. D. Jack. 2012. Within-year temporal variation and life-cycle seasonality affect stream macroinvertebrate community structure and biotic metrics. Ecological Indicators 13:206–214. Jones, C., K. Somers, B. Craig, and T. Reynoldson. 2004. OBBN [Ontario Benthos Biomonitoring Network] Protocol Manual, Version 1.0. Ontario Ministry of Environment. Jones, F. C. 2016. Cumulative effects assessment: theoretical underpinnings and big problems. Environmental Reviews 24:187–204. Jones, F. C., R. Plewes, L. Murison, M. J. MacDougall, S. Sinclair, C. Davies, J. L. Bailey, M. Richardson, and J. Gunn. 2017. Random forests as cumulative effects models: A case 183  study of lakes and rivers in Muskoka, Canada. Journal of Environmental Management 201:407–424. Jones, N. E., and B. J. Schmidt. 2018. Influence of tributaries on the longitudinal patterns of benthic invertebrate communities. River Research and Applications 34:165–173. Jones, Z., and F. Linder. 2015. Exploratory data analysis using random forests. Available: http://zmjones.com/static/papers/rfss_manuscript.pdf Karr, J. R. 
1981. Assessment of biotic integrity using fish communities. Fisheries 6:21–27. Kattwinkel, M., and E. Szöcs. 2017. openSTARS: An open source implementation of the “ArcGIS” toolbox “STARS.” Software available through: https://CRAN.R-project.org/package=openSTARS. Kaushal, S. S., and K. T. Belt. 2012. The urban watershed continuum: evolving spatial and temporal dimensions. Urban Ecosystems 15:409–435. Kedron, P. J., A. E. Frazier, G. A. Ovando-Montejo, and J. Wang. 2018. Surface metrics for landscape ecology: a comparison of landscape models across ecoregions and scales. Landscape Ecology 33:1489–1504. Kielstra, B. W., J. Chau, and J. S. Richardson. 2019. Measuring function and structure of urban headwater streams with citizen scientists. Ecosphere 10:e02720. Kiffney, P. M., C. M. Greene, J. E. Hall, and J. R. Davies. 2006. Tributary streams create spatial discontinuities in habitat, biological productivity, and diversity in mainstem rivers. Canadian Journal of Fisheries and Aquatic Sciences 63:2518–2530. Kilgour, B. W., and L. W. Stanfield. 2006. Hindcasting reference conditions in streams. Page 698 in R. M. Hughes, L. Wang, and P. W. Seebach, editors. Landscape influences on 184  stream habitats and biological assemblages. American Fisheries Society, Bethseda, Maryland, USA. King, R. S., M. E. Baker, P. F. Kazyak, and D. E. Weller. 2011. How novel is too novel? Stream community thresholds at exceptionally low levels of catchment urbanization. Ecological Applications 21:1659–1678. King, R. S., M. E. Baker, D. F. Whigham, D. E. Weller, T. E. Jordan, P. F. Kazyak, and M. K. Hurd. 2005. Spatial considerations for linking watershed land cover to ecological indicators in streams. Ecological Applications 15:137–153. Klemm, D. J., K. A. Blocksom, F. A. Fulk, A. T. Herlihy, R. M. Hughes, P. R. Kaufmann, D. V. Peck, J. L. Stoddard, W. T. Thoeny, M. B. Griffith, and W. S. Davis. 2003. 
Development and evaluation of a macroinvertebrate biotic integrity index (MBII) for regionally assessing mid-Atlantic highlands streams. Environmental Management 31:656–669. Kuglerová, L., L. García, I. Pardo, Y. Mottiar, and J. S. Richardson. 2017a. Does leaf litter from invasive plants contribute the same support of a stream ecosystem function as native vegetation? Ecosphere 8:e01779. Kuglerová, L., E. M. Hasselquist, J. S. Richardson, R. A. Sponseller, D. P. Kreutzweiser, and H. Laudon. 2017b. Management perspectives on Aqua incognita: Connectivity and cumulative effects of small natural and artificial streams in boreal forests. Hydrological Processes 31:4238–4244. Kuglerová, L., B. W. Kielstra, R. D. Moore, and J. S. Richardson. 2019. Importance of scale, land-use, and stream network properties for riparian plant communities along an urban gradient. Freshwater Biology 64:587–600. 185  Larsen, S., M. C. Bruno, I. P. Vaughan, and G. Zolezzi. 2019. Testing the River Continuum Concept with geostatistical stream-network models. Ecological Complexity 39:100773. LaZerte, S. E., and S. Albers. 2018. weathercan: Download and format weather data from Environment and Climate Change Canada. The Journal of Open Source Software 3:571. Lecerf, A. 2017. Methods for estimating the effect of litterbag mesh size on decomposition. Ecological Modelling 362:65–68. Lecerf, A., and J. S. Richardson. 2010. Litter decomposition can detect effects of high and moderate levels of forest disturbance on stream condition. Forest Ecology and Management 259:2433–2443. Legendre, P., and L. Legendre. 2012. Numerical ecology. Third English Edition. Elsevier, Amsterdam, The Netherlands. Levin, S. A. 1992. The problem of pattern and scale in ecology. Ecology 73:1943–1967. Lewandowski, E., and H. Specht. 2015. Influence of volunteer and project characteristics on data quality of biological surveys: Data quality of volunteer surveys. Conservation Biology 29:713–723. Lindsay, J. B. 2006. 
Sensitivity of channel mapping techniques to uncertainty in digital elevation data. International Journal of Geographical Information Science 20:669–692. Lowe, W. H., and G. E. Likens. 2005. Moving headwater streams to the head of the class. BioScience 55:196–197. MacDonald, L. H. 2000. Evaluating and managing cumulative effects: Process and constraints. Environmental Management 26:299–315. 186  Maechler, M., P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. 2018. cluster: Cluster analysis basics and extensions. Software available through: https://CRAN.R-project.org/package=cluster. Magnuson, J. J. 1990. Long-term ecological research and the invisible present. BioScience 40:495–501. Manzoni, S., P. Taylor, A. Richter, A. Porporato, and G. I. Ågren. 2012. Environmental and stoichiometric controls on microbial carbon-use efficiency in soils: Research review. New Phytologist 196:79–91. Marcarelli, A. M., C. V. Baxter, M. M. Mineau, and R. O. Hall. 2011. Quantity and quality: unifying food web and ecosystem perspectives on the role of resource subsidies in freshwaters. Ecology 92:1215–1225. McKenney, D. W., M. F. Hutchinson, P. Papadopol, K. Lawrence, J. Pedlar, K. Campbell, E. Milewska, R. F. Hopkinson, D. Price, and T. Owen. 2011. Customized spatial climate models for North America. Bulletin of the American Meteorological Society 92:1611–1622. McKinley, D. C., A. J. Miller-Rushing, H. L. Ballard, R. Bonney, H. Brown, S. C. Cook-Patton, D. M. Evans, R. A. French, J. K. Parrish, T. B. Phillips, S. F. Ryan, L. A. Shanley, J. L. Shirk, K. F. Stepenuck, J. F. Weltzin, A. Wiggins, O. D. Boyle, R. D. Briggs, S. F. Chapin, D. A. Hewitt, P. W. Preuss, and M. A. Soukup. 2017. Citizen science can improve conservation science, natural resource management, and environmental protection. Biological Conservation 208:15–28. Meinshausen, N. 2017. quantregForest: Quantile regression rorests. Software available through: https://CRAN.R-project.org/package=quantregForest. 
187  Merritt, R. W., K. W. Cummins, and M. B. Berg. 2008. An introduction to the aquatic insects of North America. Kendall/Hunt, Dubuque, USA. Meyer, J. L., M. J. Paul, and W. K. Taulbee. 2005. Stream ecosystem function in urbanizing landscapes. Journal of the North American Benthological Society 24:602–612. Meyer, J. L., D. L. Strayer, J. B. Wallace, S. L. Eggert, G. S. Helfman, and N. E. Leonard. 2007. The contribution of headwater streams to biodiversity in river networks. JAWRA Journal of the American Water Resources Association 43:86–103. Meyer, J. L., and J. B. Wallace. 2001. Lost linkages and lotic ecology: Rediscovering small streams. Pages 295–317 in M. C. Press, N. J. Huntly, and S. A. Levin, editors. Ecology: Achievement and Challenge: 41st Symposium of the British Ecological Society. Blackwell Science, Oxford, UK. Microsoft, R. C. T. 2018. Microsoft R Open. Microsoft, Redmond, Washington. Miguet, P., H. B. Jackson, N. D. Jackson, A. E. Martin, and L. Fahrig. 2016. What determines the spatial extent of landscape effects on species? Landscape Ecology 31:1177–1194. Minns, C. K., V. W. Cairns, R. G. Randall, and J. E. Moore. 1994. An index of biotic integrity (IBI) for fish assemblages in the littoral zone of great lakes’ areas of concern. Canadian Journal of Fisheries and Aquatic Sciences 51:1804–1822. Møller, A., and M. Jennions. 2002. How much variance can be explained by ecologists and evolutionary biologists? Oecologia 132:492–500. Mouquet, N., Y. Lagadeuc, V. Devictor, L. Doyen, A. Duputié, D. Eveillard, D. Faure, E. Garnier, O. Gimenez, P. Huneman, F. Jabot, P. Jarne, D. Joly, R. Julliard, S. Kéfi, G. J. Kergoat, S. Lavorel, L. Le Gall, L. Meslin, S. Morand, X. Morin, H. Morlon, G. Pinay, R. 188  Pradel, F. M. Schurr, W. Thuiller, and M. Loreau. 2015. Predictive ecology in a changing world. Journal of Applied Ecology 52:1293–1310. Nakagawa, S., and H. Schielzeth. 2013. 
A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution 4:133–142. Napieralski, J., R. Keeling, M. Dziekan, C. Rhodes, A. Kelly, and K. Kobberstad. 2015. Urban stream deserts as a consequence of excess stream burial in urban watersheds. Annals of the Association of American Geographers 105:649–664. Neff, B. P., S. M. Day, A. R. Piggot, and L. M. Fuller. 2005. Base flow in the Great Lakes Basin. Page 23. United States Geological Survey, Reston, Virginia, USA. Nogués-Bravo, D. 2009. Predicting the past distribution of species climatic niches. Global Ecology and Biogeography 18:521–531. OECD, and Joint Research Centre - European Commission. 2008. Handbook on constructing composite indicators: Methodology and user guide. OECD. OGS [Ontario Geological Survey]. 2000. Quaternary geology, seamless coverage of the province of Ontario: Ontario Geological Survey, Data Set 14----Revised. Government of Ontario, Sudbury, Ontario, Canada. Oksanen, J., F. G. Blanchet, M. Friendly, R. Kindt, P. Legendre, D. McGlinn, P. R. Minchin, R. B. O’Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, E. Szoecs, and H. Wagner. 2019. vegan: Community Ecology Package. Software available through: https://CRAN.R-project.org/package=vegan.  [OMF] Ontario Ministry of Finance. 2018. Ontario Population Projections Update. Available: https://www.fin.gov.on.ca/en/economy/demographics/projections/ 189  [OMMA] Ontario Ministry of Municipal Affairs. 2017. Growth Plan for the Greater Golden Horseshoe, 2017. Available: http://placestogrow.ca. OMNRF [Ontario Ministry of Natural Resources and Forestry]. 2011. Southern Ontario Land Resource Information System Version 2.0. Government of Ontario, Sault Ste. Marie, Ontario, Canada. Available: https://www.ontario.ca/page/land-information-ontario. OMNRF [Ontario Ministry of Natural Resources and Forestry]. 2015. Digital Elevation Model - Version 2.0.0. 
Available: https://www.ontario.ca/page/land-information-ontario. OMNRF [Ontario Ministry of Natural Resources and Forestry]. 2016. Ontario Integrated Hydrology Data. Page 44. Government of Ontario, Peterborough, Ontario. Available: https://www.ontario.ca/page/land-information-ontario. OMNRF [Ontario Ministry of Natural Resources and Forestry]. 2019. Ontario Road Network Road Net Element. Government of Ontario, Peterborough, Ontario. Available: https://www.ontario.ca/page/land-information-ontario. Ovaskainen, O., G. Tikhonov, A. Norberg, F. Guillaume Blanchet, L. Duan, D. Dunson, T. Roslin, and N. Abrego. 2017. How to make more out of community data? A conceptual framework and its implementation as models and software. Ecology Letters 20:561–576. Paul, M. J., and J. L. Meyer. 2001. Streams in the urban landscape. Annual Review of Ecology and Systematics 32:333–365. Pebesma, E. 2018a. lwgeom: Bindings to Selected “liblwgeom” Functions for Simple Features. Software available through: https://CRAN.R-project.org/package=lwgeom. Pebesma, E. 2018b. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal. Available: https://journal.r-project.org/archive/2018/RJ-2018-009/index.html 190  Pebesma, E. J., and R. S. Bivand. 2005. Classes and methods for spatial data in R. R News 5:9–13. Peckarsky, B. L., editor. 1990. Freshwater macroinvertebrates of northeastern North America. Comstock Publishing Associates, Ithaca, New York, USA. Peters, R. H. 1991. A critique for ecology. Cambridge University Press, Cambridge, UK; New York, USA. Peterson, E. E., and J. M. Ver Hoef. 2010. A mixed-model moving-average approach to geostatistical modeling in stream networks. Ecology 91:644–651. Peterson, E. E., J. M. Ver Hoef, D. J. Isaak, J. A. Falke, M.-J. Fortin, C. E. Jordan, K. McNyset, P. Monestiez, A. S. Ruesch, A. Sengupta, N. Som, E. A. Steel, D. M. Theobald, C. E. Torgersen, and S. J. Wenger. 2013. 
Modelling dendritic ecological networks in space: an integrated network perspective. Ecology Letters 16:707–719. Peterson, E., and J. V. Hoef. 2014. STARS : An ArcGIS toolset used to calculate the spatial information needed to fit spatial statistical models to stream network data. Journal of Statistical Software 56:1–17. Pettorelli, N., J. O. Vik, A. Mysterud, J. M. Gaillard, C. J. Tucker, and N. C. Stenseth. 2005. Using the satellite-derived NDVI to assess ecological responses to environmental change. Trends in Ecology & Evolution 20:503–510. Piggott, J. J., C. R. Townsend, and C. D. Matthaei. 2015. Reconceptualizing synergism and antagonism among multiple stressors. Ecology and Evolution 5:1538–1547. Pinheiro, J. C., and D. M. Bates. 2000. Mixed-effects models in S and S-PLUS. Springer, New York, USA. 191  Polasky, S., S. R. Carpenter, C. Folke, and B. Keeler. 2011. Decision-making under great uncertainty: environmental management in an era of global change. Trends in Ecology & Evolution 26:398–404. R Core Team. 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Software available through: https://www.r-project.org. R Core Team. 2018. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Software available through: https://www.r-project.org Ramey, T. L., and J. S. Richardson. 2017. Terrestrial invertebrates in the riparian zone: Mechanisms underlying their unique diversity. BioScience 67:808–819. Reece, P. F., T. B. Reynoldson, J. S. Richardson, and D. M. Rosenberg. 2001. Implications of seasonal variation for biomonitoring with predictive models in the Fraser River catchment, British Columbia. Canadian Journal of Fisheries and Aquatic Sciences 58:1411–1417. Regos, A., L. Imbeau, M. Desrochers, A. Leduc, M. Robert, B. Jobin, L. Brotons, and P. Drapeau. 2018. 
Hindcasting the impacts of land-use changes on bird communities with species distribution models of Bird Atlas data. Ecological Applications 28:1867–1883. Reid, A. J., A. K. Carlson, I. F. Creed, E. J. Eliason, P. A. Gell, P. T. J. Johnson, K. A. Kidd, T. J. MacCormack, J. D. Olden, S. J. Ormerod, J. P. Smol, W. W. Taylor, K. Tockner, J. C. Vermaire, D. Dudgeon, and S. J. Cooke. 2019. Emerging threats and persistent conservation challenges for freshwater biodiversity. Biological Reviews 94:849–873. 192  Resh, V. H., A. V. Brown, A. P. Covich, M. E. Gurtz, H. W. Li, G. W. Minshall, S. R. Reice, A. L. Sheldon, J. B. Wallace, and R. C. Wissmar. 1988. The role of disturbance in stream ecology. Journal of the North American Benthological Society 7:433–455. Reynoldson, T. B., R. H. Norris, V. H. Resh, K. E. Day, and D. M. Rosenberg. 1997. The reference condition: A comparison of multimetric and multivariate approaches to assess water-quality impairment using benthic macroinvertebrates. Journal of the North American Benthological Society 16:833–852. Richardson, J. 2019a. Biological diversity in headwater streams. Water 11:366. Richardson, J. S. 2019b. Headwater Streams. Page Reference Module in Earth Systems and Environmental Sciences. Elsevier. Richardson, J. S., and T. Sato. 2015. Resource subsidy flows across freshwater-terrestrial boundaries and influence on processes linking adjacent ecosystems: Cross-ecosystem resource subsidies across the water-land boundary. Ecohydrology 8:406–415. Richardson, J. S., Y. Zhang, and L. B. Marczak. 2010. Resource subsidies across the land-freshwater interface and responses in recipient communities. River Research and Applications 26:55–66. Riva-Murray, K., R. Riemann, P. Murdoch, J. M. Fischer, and R. Brightbill. 2010. Landscape characteristics affecting streams in urbanizing regions of the Delaware River Basin (New Jersey, New York, and Pennsylvania, U.S.). Landscape Ecology 25:1489–1503. Roy, A. H., A. L. Dybas, K. M. Fritz, and H. 
R. Lubbers. 2009. Urbanization affects the extent and hydrologic permanence of headwater streams in a midwestern US metropolitan area. Journal of the North American Benthological Society 28:911–928. 193  Russell, P. P., S. M. Gale, B. Muñoz, J. R. Dorney, and M. J. Rubino. 2015. A spatially explicit model for mapping headwater streams. JAWRA Journal of the American Water Resources Association 51:226–239. Safe Software. 2019. FME Desktop. Safe Software, Surrey, British Columbia, Canada. Saltelli, A., M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, and S. Tarantola. 2008. Global Sensitivity Analysis. The Primer. John Wiley & Sons, Ltd, Chichister, West Sussex, UK. von Schiller, D., V. Acuña, I. Aristi, M. Arroita, A. Basaguren, A. Bellin, L. Boyero, A. Butturini, A. Ginebreda, E. Kalogianni, A. Larrañaga, B. Majone, A. Martínez, S. Monroy, I. Muñoz, M. Paunović, O. Pereda, M. Petrovic, J. Pozo, S. Rodríguez-Mozaz, D. Rivas, S. Sabater, F. Sabater, N. Skoulikidis, L. Solagaistua, L. Vardakas, and A. Elosegi. 2017. River ecosystem processes: A synthesis of approaches, criteria of use and sensitivity to environmental stressors. Science of The Total Environment 596–597:465–480. Schneider, D. C. 2001. The rise of the concept of scale in ecology. BioScience 51:545–553. Scott, S. E., and Y. Zhang. 2012. Contrasting effects of sand burial and exposure on invertebrate colonization of leaves. The American Midland Naturalist 167:68–78. Scrine, J., M. Jochum, J. S. Ólafsson, and E. J. O’Gorman. 2017. Interactive effects of temperature and habitat complexity on freshwater communities. Ecology and Evolution 7:9333–9346. Segal, A., Y. (Kobi) Gal, R. J. Simpson, V. Victoria Homsy, M. Hartswood, K. R. Page, and M. Jirotka. 2015. Improving productivity in citizen science through controlled intervention. 194  Pages 331–337 Proceedings of the 24th International Conference on World Wide Web - WWW ’15 Companion. ACM Press, Florence, Italy. Seitz, N. E., C. J. 
Westbrook, and B. F. Noble. 2011. Bringing science into river systems cumulative effects assessment practice. Environmental Impact Assessment Review 31:172–179. Siddig, A. A. H., A. M. Ellison, A. Ochs, C. Villar-Leeman, and M. K. Lau. 2016. How do ecologists select and use indicator species to monitor ecological change? Insights from 14 years of publication in Ecological Indicators. Ecological Indicators 60:223–230. Sinsabaugh, R. L., A. E. Linkins, and E. F. Benfield. 1985. Cellulose digestion and assimilation by three leaf‐shredding aquatic insects. Ecology 66:1464–1471. Smol, J. P., I. R. Walker, and P. R. Leavitt. 1991. Paleolimnology and hindcasting climatic trends. SIL Proceedings, 1922-2010 24:1240–1246. Sponseller, R. A., and E. F. Benfield. 2001. Influences of land use on leaf breakdown in southern Appalachian headwater streams: a multiple-scale analysis. Journal of the North American Benthological Society 20:44–59. Stammler, K. L., A. G. Yates, and R. C. Bailey. 2013. Buried streams: Uncovering a potential threat to aquatic ecosystems. Landscape and Urban Planning 114:37–41. Stanfield, L. 2007. Ontario Stream Assessment Protocol, Version 7. Government of Ontario. Stanfield, L. 2013. Ontario Stream Assessment Protocol, Version 9. Government of Ontario. Stanfield, L. 2017. Ontario Stream Assessment Protocol, Version 10. Government of Ontario. Stanfield, L., L. Del Giudice, E. Bearss, and D. Morodvanschi. 2013. Assessing Headwater Drainage Features. Pages 403–435 in L. Stanfield, editor. Ninth edition. Ontario Ministry of Natural Resources, Peterborough, Ontario, Canada. 195  Stanfield, L. W., L. Del Giudice, F. Lutscher, M. Trudeau, L. Alexander, W. F. Fagan, R. Fertik, R. Mackereth, J. S. Richardson, N. Shrestha, and Tetreault, G. 2014. A discussion paper on: Cumulative effects from alteration of headwater drainage features and the loss of ecosystem integrity of river networks. Ontario Ministry of Natural Resources, interal publication. Stanfield, L. 
W., B. Kilgour, K. Todd, S. Holysh, A. Piggott, and M. Baker. 2009. Estimating Summer Low-Flow in Streams in a Morainal Landscape using Spatial Hydrologic Models. Canadian Water Resources Journal 34:269–284. Stanfield, L. W., and B. W. Kilgour. 2006. Effects of percent impervious cover on fish and benthos assemblages and instream habitats in Lake Ontario tributaries. Page 698 in R. M. Hughes, L. Wang, and P. W. Seebach, editors. Landscape influences on stream habitats and biological assemblages. American Fisheries Society, Bethseda, Maryland, USA. Statistics Canada. 2018. Ecological Land Classification, 2017. Government of Canada, Ottawa, Canada. Available: http://publications.gc.ca/pub?id=9.849860&sl=0 Statistics Canada. 2016. Census Profile, 2016 Census - York, Regional municipality [Census division], Ontario and Ontario [Province]. Available: https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/prof/details/page.cfm?Lang=E&Geo1=CD&Code1=3519&Geo2=PR&Code2=35&Data=Count&SearchText=York&SearchType=Begins&SearchPR=01&B1=Population&TABID=3. Steedman, R. J. 1988. Modification and assessment of an index of biotic integrity to quantify stream quality in Southern Ontario. Canadian Journal of Fisheries and Aquatic Sciences 45:492–501. 196  Stoddard, J. L., A. T. Herlihy, D. V. Peck, R. M. Hughes, T. R. Whittier, and E. Tarquinio. 2008. A process for creating multimetric indices for large-scale aquatic surveys. Journal of the North American Benthological Society 27:878–891. Stoddard, J. L., D. P. Larsen, C. P. Hawkins, R. K. Johnson, and R. H. Norris. 2006. Setting expectations for the ecological condition of streams: The concept of reference condition. Ecological Applications 16:1267–1276. Strayer, D. L., and D. Dudgeon. 2010. Freshwater biodiversity conservation: recent progress and future challenges. Journal of the North American Benthological Society 29:344–358. Sullivan, B. L., T. Phillips, A. A. Dayer, C. L. Wood, A. Farnsworth, M. J. Iliff, I. J. Davies, A. 
Wiggins, D. Fink, W. M. Hochachka, A. D. Rodewald, K. V. Rosenberg, R. Bonney, and S. Kelling. 2017. Using open access observational data for conservation action: A case study for birds. Biological Conservation 208:5–14. Swanson, F. J., and R. E. Sparks. 1990. Long-term ecological research and the invisible place. BioScience 40:502–508. Tanentzap, A. J., B. W. Kielstra, G. M. Wilkinson, M. Berggren, N. Craig, P. A. del Giorgio, J. Grey, J. M. Gunn, S. E. Jones, J. Karlsson, C. T. Solomon, and M. L. Pace. 2017. Terrestrial support of lake food webs: Synthesis reveals controls over cross-ecosystem resource use. Science Advances 3. Tanentzap, A. J., E. J. Szkokan-Emilson, B. W. Kielstra, M. T. Arts, N. D. Yan, and J. M. Gunn. 2014. Forests fuel fish growth in freshwater deltas. Nature Communications 5. Théberge, É. 2016. Macroinvertebrate assemblage responses to anthropogenic stressors: a bioassessment scoring tool for managing stream ecosystems in Eastern Ontario. 197  University of Ottawa, Ottawa, Ontario, Canada. Available: https://ruor.uottawa.ca/handle/10393/34263. Theobald, E. J., A. K. Ettinger, H. K. Burgess, L. B. DeBey, N. R. Schmidt, H. E. Froehlich, C. Wagner, J. HilleRisLambers, J. Tewksbury, M. A. Harsch, and J. K. Parrish. 2015. Global change and local solutions: Tapping the unrealized potential of citizen science for biodiversity research. Biological Conservation 181:236–244. Therivel, R., and B. Ross. 2007. Cumulative effects assessment: Does scale matter? Environmental Impact Assessment Review 27:365–385. Thorp, J. H., and A. P. Covich, editors. 2010. Ecology and classification of North American freshwater invertebrates. Third Edition. Academic Press, Amsterdam, The Netherlands; Boston, USA. Tiegs, S. D., J. E. Clapcott, N. A. Griffiths, and A. J. Boulton. 2013. A standardized cotton-strip assay for measuring organic-matter decomposition in streams. Ecological Indicators 32:131–139. Tiegs, S. D., D. M. Costello, M. W. Isken, G. Woodward, P. B. 
McIntyre, M. O. Gessner, E. Chauvet, N. A. Griffiths, A. S. Flecker, V. Acuña, R. Albariño, D. C. Allen, C. Alonso, P. Andino, C. Arango, J. Aroviita, M. V. M. Barbosa, L. A. Barmuta, C. V. Baxter, T. D. C. Bell, B. Bellinger, L. Boyero, L. E. Brown, A. Bruder, D. A. Bruesewitz, F. J. Burdon, M. Callisto, C. Canhoto, K. A. Capps, M. M. Castillo, J. Clapcott, F. Colas, C. Colón-Gaud, J. Cornut, V. Crespo-Pérez, W. F. Cross, J. M. Culp, M. Danger, O. Dangles, E. de Eyto, A. M. Derry, V. D. Villanueva, M. M. Douglas, A. Elosegi, A. C. Encalada, S. Entrekin, R. Espinosa, D. Ethaiya, V. Ferreira, C. Ferriol, K. M. Flanagan, T. Fleituch, J. J. F. Shah, A. Frainer, N. Friberg, P. C. Frost, E. A. Garcia, L. G. Lago, P. E. G. Soto, S. 198  Ghate, D. P. Giling, A. Gilmer, J. F. Gonçalves, R. K. Gonzales, M. A. S. Graça, M. Grace, H.-P. Grossart, F. Guérold, V. Gulis, L. U. Hepp, S. Higgins, T. Hishi, J. Huddart, J. Hudson, S. Imberger, C. Iñiguez-Armijos, T. Iwata, D. J. Janetski, E. Jennings, A. E. Kirkwood, A. A. Koning, S. Kosten, K. A. Kuehn, H. Laudon, P. R. Leavitt, A. L. L. da Silva, S. J. Leroux, C. J. LeRoy, P. J. Lisi, R. MacKenzie, A. M. Marcarelli, F. O. Masese, B. G. McKie, A. O. Medeiros, K. Meissner, M. Miliša, S. Mishra, Y. Miyake, A. Moerke, S. Mombrikotb, R. Mooney, T. Moulton, T. Muotka, J. N. Negishi, V. Neres-Lima, M. L. Nieminen, J. Nimptsch, J. Ondruch, R. Paavola, I. Pardo, C. J. Patrick, E. T. H. M. Peeters, J. Pozo, C. Pringle, A. Prussian, E. Quenta, A. Quesada, B. Reid, J. S. Richardson, A. Rigosi, J. Rincón, G. Rîşnoveanu, C. T. Robinson, L. Rodríguez-Gallego, T. V. Royer, J. A. Rusak, A. C. Santamans, G. B. Selmeczy, G. Simiyu, A. Skuja, J. Smykla, K. R. Sridhar, R. Sponseller, A. Stoler, C. M. Swan, D. Szlag, F. T. Mello, J. D. Tonkin, S. Uusheimo, A. M. Veach, S. Vilbaste, L. B. M. Vought, C.-P. Wang, J. R. Webster, P. B. Wilson, S. Woelfl, M. A. Xenopoulos, A. G. Yates, C. Yoshimura, C. M. Yule, Y. X. Zhang, and J. A. Zwart. 2019. 
Global patterns and drivers of ecosystem functioning in rivers and riparian zones. Science Advances 5:eaav0486. Tiegs, S. D., S. D. Langhans, K. Tockner, and M. O. Gessner. 2007. Cotton strips as a leaf surrogate to measure decomposition in river floodplain habitats. Journal of the North American Benthological Society 26:70–77. Tillman, D. C., A. H. Moerke, C. L. Ziehl, and G. A. Lamberti. 2003. Subsurface hydrology and degree of burial affect mass loss and invertebrate colonisation of leaves in a woodland stream. Freshwater Biology 48:98–107. 199  Tornwall, B. M., C. M. Swan, and B. L. Brown. 2017. Manipulation of local environment produces different diversity outcomes depending on location within a river network. Oecologia 184:663–674. [TRCA] Toronto and Region Conservation Authority. 2007. The natural functions of headwater drainage features: A literature review. Available: http://www.trca.on.ca/dotAsset/79264.pdf. Turner, M. G. 1989. Landscape ecology: The effect of pattern on process. Annual Review of Ecology and Systematics 20:171–197. United Nations Office for Disaster Risk Reduction. 2017. Terminology. https://www.unisdr.org/we/inform/terminology#letter-c. Urban, D. L., R. V. O’Neill, and H. H. Shugart,. 1987. Landscape ecology. BioScience 37:119–127. Urban, M. C., and L. De Meester. 2009. Community monopolization: local adaptation enhances priority effects in an evolving metacommunity. Proceedings of the Royal Society B: Biological Sciences 276:4129–4138. Utz, R. M., K. G. Hopkins, L. Beesley, D. B. Booth, R. J. Hawley, M. E. Baker, M. C. Freeman, and K. L. Jones. 2016. Ecological resistance in urban streams: the role of natural and legacy attributes. Freshwater Science 35:380–397. VanDerWal, J., L. Falconi, S. Januchowski, L. Shoo, and C. Storlie. 2014. SDMTools: Species Distribution Modelling Tools: Tools for processing data associated with species distribution modelling exercises. . 
Software available through: https://CRAN.R-project.org/package=SDMTools. 200  Vannote, R. L., G. W. Minshall, K. W. Cummins, J. R. Sedell, and C. E. Cushing. 1980. The river continuum concept. Canadian Journal of Fisheries and Aquatic Sciences 37:130–137. van der Velde, T., D. A. Milton, T. J. Lawson, C. Wilcox, M. Lansdell, G. Davis, G. Perkins, and B. D. Hardesty. 2017. Comparison of marine debris data collected by researchers and citizen scientists: Is citizen science data worth the effort? Biological Conservation 208:127–138. Ver Hoef, J. M., and E. E. Peterson. 2010. A moving average approach for spatial statistical models of stream networks. Journal of the American Statistical Association 105:6–18. Ver Hoef, J. M., E. E. Peterson, D. Clifford, and R. Shah. 2014. SSN : An R Package for Spatial Statistical Modeling on Stream Networks. Journal of Statistical Software 56. . Software available through: https://CRAN.R-project.org/package=SSN Ver Hoef, J. M., E. Peterson, and D. Theobald. 2006. Spatial statistical models that use flow and stream distance. Environmental and Ecological Statistics 13:449–464. Vollmer, D., H. M. Regan, and S. J. Andelman. 2016. Assessing the sustainability of freshwater systems: A critical review of composite indicators. Ambio 45:765–780. Vörösmarty, C. J., P. B. McIntyre, M. O. Gessner, D. Dudgeon, A. Prusevich, P. Green, S. Glidden, S. E. Bunn, C. A. Sullivan, C. R. Liermann, and P. M. Davies. 2010. Global threats to human water security and river biodiversity. Nature 467:555–561. Vyšná, V., F. Dyer, W. Maher, and R. Norris. 2014. Cotton-strip decomposition rate as a river condition indicator – Diel temperature range and deployment season and length also matter. Ecological Indicators 45:508–521. 201  Wallace, A. M., M. V. Croft-White, and J. Moryk. 2013. Are Toronto’s streams sick? A look at the fish and benthic invertebrate communities in the Toronto region in relation to the urban stream syndrome. 
Environmental Monitoring and Assessment 185:7857–7875.
Wallace, J. B., S. L. Eggert, J. L. Meyer, and J. R. Webster. 1997. Multiple trophic levels of a forest stream linked to terrestrial litter inputs. Science 277:102–104.
Walsh, C. J., A. H. Roy, J. W. Feminella, P. D. Cottingham, P. M. Groffman, and R. P. Morgan. 2005. The urban stream syndrome: current knowledge and the search for a cure. Journal of the North American Benthological Society 24:706–723.
Wenger, S. J., A. H. Roy, C. R. Jackson, E. S. Bernhardt, T. L. Carter, S. Filoso, C. A. Gibson, W. C. Hession, S. S. Kaushal, E. Martí, J. L. Meyer, M. A. Palmer, M. J. Paul, A. H. Purcell, A. Ramírez, A. D. Rosemond, K. A. Schofield, E. B. Sudduth, and C. J. Walsh. 2009. Twenty-six key research questions in urban stream ecology: an assessment of the state of the science. Freshwater Science 28:1080–1098.
Williamson, T. N., C. T. Agouridis, C. D. Barton, J. A. Villines, and J. G. Lant. 2015. Classification of ephemeral, intermittent, and perennial stream reaches using a TOPMODEL-based approach. JAWRA Journal of the American Water Resources Association 51:1739–1759.
Wipfli, M. S., J. S. Richardson, and R. J. Naiman. 2007. Ecological linkages between headwaters and downstream ecosystems: Transport of organic matter, invertebrates, and wood down headwater channels. JAWRA Journal of the American Water Resources Association 43:72–85.
Wohl, E. 2017. The significance of small streams. Frontiers of Earth Science 11:447–456.
Woodward, G., M. O. Gessner, P. S. Giller, V. Gulis, S. Hladyz, A. Lecerf, B. Malmqvist, B. G. McKie, S. D. Tiegs, H. Cariss, M. Dobson, A. Elosegi, V. Ferreira, M. A. S. Graca, T. Fleituch, J. O. Lacoursiere, M. Nistorescu, J. Pozo, G. Risnoveanu, M. Schindler, A. Vadineanu, L. B.-M. Vought, and E. Chauvet. 2012. Continental-scale effects of nutrient pollution on stream ecosystem functioning. Science 336:1438–1440.
Yeung, A. C. Y., A. Lecerf, and J. S. Richardson. 2017.
Assessing the long-term ecological effects of riparian management practices on headwater streams in a coastal temperate rainforest. Forest Ecology and Management 384:100–109.
Young, R. G., and K. J. Collier. 2009. Contrasting responses to catchment modification among a range of functional and structural indicators of river ecosystem health. Freshwater Biology 54:2155–2170.
Young, R. G., C. D. Matthaei, and C. R. Townsend. 2008. Organic matter breakdown and ecosystem metabolism: functional indicators for assessing river ecosystem health. Journal of the North American Benthological Society 27:605–625.
Zha, Y., J. Gao, and S. Ni. 2003. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. International Journal of Remote Sensing 24:583–594.
Zuur, A. F., J. M. Hilbe, and E. N. Ieno. 2013. A beginner’s guide to GLM and GLMM with R. Highland Statistics, UK.
Zuur, A. F., E. N. Ieno, and C. S. Elphick. 2010. A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution 1:3–14.
Zuur, A. F., E. N. Ieno, A. A. Savelʹev, and A. F. Zuur. 2017. Using GLM and GLMM. Highland Statistics Ltd, Newburgh, UK.

Appendices

Supporting information for Chapter 2 – Headwater condition index and sensitivity analysis

Generating a headwater condition index, detailed approach

There are $n$ features within $m$ sites, with $q$ variables assessed at the feature-scale and $r$ variables assessed at the site-scale. For each variable, we developed a ranking scheme that could allow for equivalencies among categories within each variable and could allow for different ranking schemes between variables (e.g., variables could have rankings 1–3, 1–2, 1–4, etc.). We took each variable and scored the rankings from 1–10. We aggregated feature scores, aggregated site scores, and then calculated an overall HCI using the following methods.
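The aggregation detailed in the remainder of this section (eqns. 1–5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the thesis: the function names are hypothetical, the riparian vegetation scores are assumed to have already been combined into a single variable, and the final rescaling from 1–10 to 0–1 is assumed to be linear.

```python
import math


def geometric_mean(scores):
    """Unweighted geometric mean of a list of positive scores (eqns. 1 and 4)."""
    return math.prod(scores) ** (1.0 / len(scores))


def flow_weights(flows):
    """Each feature's share of the summed ordinal flow ranks (eqn. 2)."""
    total = sum(flows)
    return [f / total for f in flows]


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean, computed on the log scale (eqn. 3)."""
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / sum(weights))


def headwater_condition_index(feature_scores, flows, site_scores):
    """Overall HCI for one site.

    feature_scores: one list of 1-10 variable scores per feature
    flows:          ordinal flow rank (1-4) per feature
    site_scores:    list of 1-10 site-scale variable scores
    """
    conditions = [geometric_mean(s) for s in feature_scores]    # eqn. 1
    weights = flow_weights(flows)                                # eqn. 2
    feature_hci = weighted_geometric_mean(conditions, weights)   # eqn. 3
    site_hci = geometric_mean(site_scores)                       # eqn. 4
    hci = math.sqrt(feature_hci * site_hci)                      # eqn. 5
    return (hci - 1.0) / 9.0  # assumed linear rescaling from 1-10 to 0-1


# Flow ranks from the worked example in the text: (1, 4, 4)
print([round(w, 2) for w in flow_weights([1, 4, 4])])  # [0.11, 0.44, 0.44]
```

Working on the log scale in `weighted_geometric_mean` is a design choice for numerical stability; it is algebraically the same as raising the weighted product to the power of one over the summed weights.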
To aggregate feature scores, we calculated the geometric mean across feature-scale variables ($\mathrm{Condition}_n$) and then calculated the weighted geometric mean across all features ($\text{Feature-scale HCI}_m$), weighted by each feature’s contribution to flow ($w_i$). Prior to calculating the feature-scale geometric mean, we took the geometric mean of the six riparian vegetation variable scores (left and right bank for zones 0–1.5 m, 1.5–10 m, 10–30 m) to generate a new variable representing riparian vegetation condition. $\mathrm{Condition}_n$ was calculated using

$$\mathrm{Condition}_n = \left( \prod_{i=1}^{q} \mathrm{Condition}_i \right)^{1/q} \qquad (1)$$

where $q$ is the number of feature-scale variables and $\mathrm{Condition}_i$ is the score of feature-scale variable $i$. Feature-scale flow weights, $w_i$, were calculated using the “Flow” variable, which is an ordered categorical variable ranked from low (1) to high (4) flow (e.g., no flow to significant flow > 0.5 L/s). The $w_i$ values were calculated using

$$w_i = \frac{\mathrm{Flow}_i}{\sum_{i=1}^{n} \mathrm{Flow}_i} \qquad (2)$$

where $n$ is the number of features at a site. For example, if there are three features with flows equal to (“No Flow”, “Flow Substantial”, “Flow Substantial”), or (1, 4, 4), then their weights would be (0.11, 0.44, 0.44), respectively. Finally, $\text{Feature-scale HCI}_m$ was calculated using

$$\text{Feature-scale HCI}_m = \left( \prod_{i=1}^{n} \mathrm{Condition}_i^{\,w_i} \right)^{1 / \sum_{i=1}^{n} w_i} \qquad (3)$$

where $n$ is the number of features, $\mathrm{Condition}_i$ is a feature’s $\mathrm{Condition}_n$ from eqn. 1, and $w_i$ is a feature’s weight from eqn. 2.

To aggregate site scores, we calculated the geometric mean across all site-scale variables ($\text{Site-scale HCI}_m$). This was calculated using

$$\text{Site-scale HCI}_m = \left( \prod_{i=1}^{r} \mathrm{Condition}_i \right)^{1/r} \qquad (4)$$

where $r$ is the number of site-scale variables and $\mathrm{Condition}_i$ is the score of site-scale variable $i$. Finally, I calculated the overall HCI ($\mathrm{HCI}_m$) as the geometric mean of the feature and site conditions using

$$\mathrm{HCI}_m = \sqrt{\text{Feature-scale HCI}_m \times \text{Site-scale HCI}_m} \qquad (5)$$

This final $\mathrm{HCI}_m$ was scaled from the interval 1–10 to the interval 0–1.

Tables

Table A.1 Results for headwater condition index (HCI) sensitivity analysis.
Variables are ordered by scale and first-order sensitivity index, $S_i$, where $i$ is an individual variable. Higher $S_i$ indicates a higher independent contribution of each variable’s variance to variance in HCI. The total-effect index, $S_{Ti}$, is similar but also includes higher-order interactions. Greater differences ($S_{Ti} - S_i$) indicate that the variable interacts with other variables in contributing to HCI variance.

| Scale | Variable | $S_i$ | $S_{Ti}$ | $S_{Ti} - S_i$ |
|---|---|---|---|---|
| Feature | Type | 0.182 | 0.192 | 0.009 |
| Feature | SedimentDeposition | 0.123 | 0.126 | 0.003 |
| Feature | AdjacentSedimentTransport | 0.114 | 0.121 | 0.007 |
| Feature | ValleySedimentTransport | 0.109 | 0.120 | 0.011 |
| Feature | RiparianVeg0m1_5mLeft | 0.008 | 0.003 | -0.004 |
| Feature | RiparianVeg1_5m10mRight | 0.007 | 0.003 | -0.004 |
| Feature | RiparianVeg10m30mLeft | 0.006 | 0.003 | -0.003 |
| Feature | RiparianVeg10m30mRight | 0.006 | 0.003 | -0.003 |
| Feature | RiparianVeg1_5m10mLeft | 0.005 | 0.003 | -0.002 |
| Feature | RiparianVeg0m1_5mRight | 0.005 | 0.003 | -0.001 |
| Site | ChannelHardening | 0.052 | 0.052 | 0.001 |
| Site | MajorNutrientSourcesUpstream | 0.051 | 0.054 | 0.003 |
| Site | OnlinePondsUpstream | 0.051 | 0.053 | 0.002 |
| Site | SpringsOrSeepsAtTheSite | 0.049 | 0.051 | 0.002 |
| Site | EvidenceofChannelScouringOrErosion | 0.049 | 0.051 | 0.001 |
| Site | PotentialContaminantSourcesUpstream | 0.048 | 0.053 | 0.005 |
| Site | BMPsOrRestorationActivities | 0.048 | 0.055 | 0.007 |
| Site | BarriersAndOrDamsInProximity | 0.045 | 0.052 | 0.007 |
| Site | DredgingOrStraightening | 0.043 | 0.052 | 0.009 |

Supporting information for Chapter 2 - Geospatial analysis

Geospatial procedures

Generally, we gathered spatial information for input to regional-scale models relevant to land management. Using provincial digital elevation models (DEMs) and associated products (i.e., Ontario Integrated Hydrology layers [OIH]), we produced flow lines (streams/reaches) and reach-scale or site-scale catchment boundaries representing all DEM pixels that contribute topographically to a given feature.
For reach-scale boundaries, this represented a set of non-overlapping tessellated polygons wherein all DEM pixels uniquely contribute to a given reach. For site-scale boundaries, this represented all DEM pixels contributing to a given site. These different catchment types were used in different analyses. A DEM was also used to produce topographic surface derivatives (e.g., slope). These surface derivatives and secondary data sources (e.g., satellite imagery, land cover layers) were used to extract landscape statistics at various spatial scales. These scales could include local buffers around a site (e.g., denoted as l_100 for a buffer of radius 100 m), buffers around a stream (e.g., denoted as r_50 for a buffer of 50 m on each side of the stream), or the whole catchment (denoted c_). More details about data sources and uses are available in Table B.1 and Table B.2.

Spatial stream networks

Spatial stream network (SSN) models are a class of generalized linear mixed effects models designed to model fixed effects (i.e., covariates) and spatial dependencies by incorporating topological and distance properties of branched stream networks. Here, we outline the basic construction of SSN models using examples from Peterson and Ver Hoef (2010). A basic linear model is:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad (1)$$

where $\mathbf{y}$ is the vector of the response variable, $\mathbf{X}$ is a matrix of explanatory variables, $\boldsymbol{\beta}$ is a vector of parameters relating $\mathbf{X}$ to $\mathbf{y}$, and $\boldsymbol{\varepsilon}$ is a vector of random errors. Generally, $\mathrm{var}(\boldsymbol{\varepsilon}) = \boldsymbol{\Sigma}$, where $\boldsymbol{\Sigma}$ is a covariance matrix. In a linear model, the errors are typically assumed to be independent and identically distributed (i.e., the assumption of independence), so that $\boldsymbol{\Sigma} = \sigma^2 \mathbf{I}$. In a spatial generalized linear mixed model, the error term is expanded into “random effects” allowing for dependency among residuals. For SSN models, these random effects can represent simple dependencies (e.g., grouping by site) or spatial dependencies, and including them involves adjusting the covariance matrix of the model.
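These adjustments could be Euclidean (EUC), tail-down (TD), or tail-up (TU) dependency structures, outlined below. An SSN model is a linear model with the error term expanded into random effects ($\mathbf{z}$):

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \sigma_{\mathrm{EUC}}\mathbf{z}_{\mathrm{EUC}} + \sigma_{\mathrm{TD}}\mathbf{z}_{\mathrm{TD}} + \sigma_{\mathrm{TU}}\mathbf{z}_{\mathrm{TU}} + \sigma_{\mathrm{NUG}}\mathbf{z}_{\mathrm{NUG}}, \qquad (2)$$

where $\mathrm{cor}(\mathbf{z}_{\mathrm{EUC}}) = \mathbf{R}_{\mathrm{EUC}}$, $\mathrm{cor}(\mathbf{z}_{\mathrm{TD}}) = \mathbf{R}_{\mathrm{TD}}$, and $\mathrm{cor}(\mathbf{z}_{\mathrm{TU}}) = \mathbf{R}_{\mathrm{TU}}$ are correlation matrices for the EUC, TD, and TU spatial dependency structures outlined below, and $\mathbf{z}_{\mathrm{NUG}}$ is the nugget effect (i.e., independent random errors) with $\mathrm{cor}(\mathbf{z}_{\mathrm{NUG}}) = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix. The covariance matrices (e.g., $\sigma^2_{\mathrm{EUC}}\mathbf{R}_{\mathrm{EUC}}$) of these dependency structures are combined to form:

$$\begin{aligned} \boldsymbol{\Sigma} &= \sigma^2_{\mathrm{EUC}}\mathbf{R}_{\mathrm{EUC}} + \sigma^2_{\mathrm{TD}}\mathbf{R}_{\mathrm{TD}} + \sigma^2_{\mathrm{TU}}\mathbf{R}_{\mathrm{TU}} + \sigma^2_{\mathrm{NUG}}\mathbf{I} \\ &= \sigma^2_{\mathrm{EUC}}\mathbf{R}_{\mathrm{EUC}} + \sigma^2_{\mathrm{TD}}\mathbf{R}_{\mathrm{TD}} + \sigma^2_{\mathrm{TU}}\mathbf{C}_{\mathrm{TU}} \odot \mathbf{W}_{\mathrm{TU}} + \sigma^2_{\mathrm{NUG}}\mathbf{I}, \end{aligned} \qquad (3)$$

where $\mathbf{C}_{\mathrm{TU}}$ contains the flow-connected correlations and $\mathbf{W}_{\mathrm{TU}}$ contains the weights used in the TU dependency structure. Note that not all of these structures need to be incorporated into an SSN model; a single structure or a combination of types can be applied.

Distances and topology

Distances between points can be Euclidean, flow-unconnected, or flow-connected depending on stream network topology. Euclidean distances are “as the crow flies” straight-line distances between points across the x- and y- map coordinate space. Flow-unconnected and flow-connected distances are hydrological distances between points following the stream network. Flow-unconnected distances are “as the fish swims” simple hydrological distances between sites (e.g., in Figure B.1, s2 → s3 = a + b), whereas flow-connected distances are “as the water flows” hydrological distances that incorporate topology: distances greater than 0 occur only if flow moves from one site to another (e.g., in Figure B.1, distance s2 → s1 = a + c whereas distance s1 → s3 = 0).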
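The two hydrological distance types can be illustrated with a short Python sketch. The network below is a hypothetical toy, not Figure B.1: the reach lengths, site positions, and all function names are assumptions made for illustration, and nothing here reproduces how the ‘SSN’ package computes its distance matrices.

```python
# Hypothetical toy network: reach id -> (downstream reach id, length in m).
# Reaches 2 and 3 join to form reach 1 (the outlet); reaches 4 and 5 join to form reach 2.
REACHES = {1: (None, 100.0), 2: (1, 80.0), 3: (1, 120.0), 4: (2, 60.0), 5: (2, 70.0)}

# Hypothetical sites: name -> (reach id, distance in m above the downstream end of that reach).
SITES = {"s1": (1, 40.0), "s2": (2, 30.0), "s3": (3, 50.0)}


def downstream_path(reach):
    """Reaches that water entering `reach` passes through, from `reach` to the outlet."""
    path = []
    while reach is not None:
        path.append(reach)
        reach = REACHES[reach][0]
    return path


def reach_bottom(reach):
    """Upstream distance of the downstream end of `reach`, measured from the outlet."""
    return sum(REACHES[r][1] for r in downstream_path(reach)[1:])


def upstream_distance(site):
    """Upstream distance of a site, measured from the outlet."""
    reach, offset = SITES[site]
    return reach_bottom(reach) + offset


def flows_into(src, dst):
    """True if water leaving `src` passes `dst`."""
    on_path = SITES[dst][0] in downstream_path(SITES[src][0])
    return on_path and upstream_distance(src) >= upstream_distance(dst)


def flow_connected_distance(src, dst):
    """Directional distance: non-zero only when flow moves from `src` to `dst`."""
    if flows_into(src, dst):
        return upstream_distance(src) - upstream_distance(dst)
    return 0.0


def hydrologic_distance(a, b):
    """Along-network distance between two sites, ignoring flow direction."""
    ua, ub = upstream_distance(a), upstream_distance(b)
    if flows_into(a, b) or flows_into(b, a):
        return abs(ua - ub)
    shared = set(downstream_path(SITES[a][0])) & set(downstream_path(SITES[b][0]))
    # The common junction is the upstream end of the shared reach farthest from the outlet.
    junction = max(reach_bottom(r) + REACHES[r][1] for r in shared)
    return (ua - junction) + (ub - junction)


print(hydrologic_distance("s2", "s3"))      # 80.0 (a + b)
print(flow_connected_distance("s2", "s1"))  # 90.0 (a + c)
print(flow_connected_distance("s1", "s3"))  # 0.0 (no flow from s1 to s3)
```

In this toy network a = 30 m, b = 50 m, and c = 60 m, so the three printed values correspond to the a + b, a + c, and 0 cases described above.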
Autocovariance functions

As implemented in the ‘SSN’ package in R, spatial dependency structures (i.e., autocovariance functions) can be Euclidean (EUC), tail-up (TU), or tail-down (TD) and depend on straight line distances (EUC) or topology and stream network distances (TU and TD) between points. Here, I briefly outline the construction of covariance functions using examples from Cressie et al. (2006), Peterson and Ver Hoef (2010), and Ver Hoef and Peterson (2010).

Consider the stream network in Figure B.1. The lowest downstream point, $l_1$, is at an upstream distance of 0. All distances are relative to this point. Each $i$ indexes a stream reach. The $s_i$ are sites at distance $s$ relative to $l_1$ on reach $i$ (e.g., it is possible that sites can have the same upstream distance relative to $l_1$ but they will be on different reaches). Similarly, the $u_i$ are distances at the most upstream point of reach $i$ and the $l_i$ are distances at the most downstream point of reach $i$, all relative to $l_1$. The full set of stream reaches is $I$. The set of stream reaches upstream of $s_i$, excluding $i$, is $U_{x_i}$. For $s_1$, this would be $U_{s_1} = \{2, 3, 4, 5\}$. Similarly, the set of stream reaches downstream of $s_i$, including $i$, is $D_{x_i}$. For $s_3$, this would be $D_{s_3} = \{1, 3\}$. This can be extended on a per-reach basis such that $U_{[1]} = \{2, 3, 4, 5\}$ and $D_{[3]} = \{1, 3\}$. These topologies and distances are used in developing covariance functions.

For covariance functions based on straight line distances, those supported by ‘SSN’ are in Table B.5. For covariance between sites’ network distances, more complex constructions are needed. A generalized construction for autocovariance functions integrates a moving average function over a white noise function. This construction is:

$$Z(s \mid \theta) = \int_{-\infty}^{\infty} g(x - s \mid \theta)\, dW(x), \qquad (4)$$

where $Z(s)$ is a random variable, $x$ and $s$ are locations on a line (i.e., one dimension), and $g(x \mid \theta)$ is the moving average function on the line.
Since $E[Z(s \mid \theta)^2] = \int_{-\infty}^{\infty} g(x \mid \theta)^2\, dx$, this can be adjusted to allow for valid covariance between $Z(s)$ and $Z(s + h)$ using:

$$C(h \mid \theta) = \int_{-\infty}^{\infty} g(x \mid \theta)\, g(x - h \mid \theta)\, dx, \qquad (5)$$

where $h$ is the separation distance between sites. This formula is then adjusted according to the covariance assumed between sites, based on the topological considerations explained further below.

TU dependency structures are ideal for variables that have no expected covariance when flow-unconnected. Examples include stream chemistry where values downstream tend to be influenced by values upstream and not vice versa. For TU models, the moving average function is only positive upstream of a site; when moving average functions overlap there is covariance between those sites. Generally, the integral in eqn. (4) is done per reach and summed for all reaches containing the moving average function $g(x \mid \theta)$. Ver Hoef et al. (2006) showed that the moving average function must be split at a fork in the stream network in order to maintain stationarity (i.e., constant variance). This results in:

$$C_u(s_{i[1]}, s_{i[2]} \mid \theta) = \begin{cases} \left( \prod_{k \in B_{s_{i[1]}, s_{i[2]}}} \sqrt{\omega_k} \right) C(h \mid \theta) & \text{if } s_{i[1]} < s_{i[2]} \text{ are flow-connected} \\ 0 & \text{if } s_{i[1]} \text{ and } s_{i[2]} \text{ are flow-unconnected,} \end{cases} \qquad (6)$$

where $s_{i[1]}$ and $s_{i[2]}$ are two different sites separated by distance $h$, $\prod_{k \in B_{s_{i[1]}, s_{i[2]}}} \sqrt{\omega_k}$ are spatial weights where $k \in B_{s_{i[1]}, s_{i[2]}}$ indicates reaches downstream and inclusive of the reach with $s_{i[2]}$ but exclusive of the reach with $s_{i[1]}$, and $C(h \mid \theta)$ is the autocovariance function based on eqn. (5). Briefly, weights are assigned to each stream reach and can be generated based on properties of the upstream catchment, the stream itself, or simple weighting (e.g., catchment area, stream order, 50/50 split, respectively). Assigning weights is a three-step process of calculating the proportional influence (PI) of a reach on its downstream reach at a fork, calculating an additive function value (AFV) based on the PI, and then calculating the weight.
This process is shown in Figure B.2. Assigning weights has an added benefit of adjusting the extent to which autocorrelation might occur based on the variable of interest. For example, consider measuring stream conductivity downstream of a confluence of two streams where one has a much higher flow volume: the measurement is likely more similar to that of the stream with the higher flow volume. Using eqn. (6), one of the tail-up covariance models generated using moving average constructions from Table B.6, and the separation distances between sites, the covariance matrix can be constructed. For the network in Figure B.1 using the exponential model, this would be:

$$\mathbf{R}_{\mathrm{TU}} = \mathbf{C}_{\mathrm{TU}} \odot \mathbf{W}_{\mathrm{TU}} = \begin{pmatrix} \theta_v & \theta_v e^{-3h_{s_1,s_2}/\theta_r} & \theta_v e^{-3h_{s_1,s_3}/\theta_r} \\ \theta_v e^{-3h_{s_1,s_2}/\theta_r} & \theta_v & \theta_v e^{-3h_{s_2,s_3}/\theta_r} \\ \theta_v e^{-3h_{s_1,s_3}/\theta_r} & \theta_v e^{-3h_{s_2,s_3}/\theta_r} & \theta_v \end{pmatrix} \odot \begin{pmatrix} 1 & \sqrt{\omega_2} & \sqrt{\omega_3} \\ \sqrt{\omega_2} & 1 & 0 \\ \sqrt{\omega_3} & 0 & 1 \end{pmatrix}. \qquad (7)$$

TD dependency structures are useful for variables that could have covariance between flow-connected and flow-unconnected sites. Examples in this case include organisms that can travel between reaches both upstream and downstream. For TD models, the moving average function is only positive downstream of a site. Again, when moving average functions overlap there is covariance between those sites, such that both flow-connected (e.g., s2 and s1) and flow-unconnected sites (e.g., s2 and s3) can covary. Using minus signs in the arguments turns the moving average functions into TD models. For flow-connected sites, the resultant covariance is:

$$C_c(h \mid \theta) = \int_{-\infty}^{-h} g(-x \mid \theta)\, g(-x - h \mid \theta)\, dx, \qquad (8)$$

whereas for flow-unconnected sites, the resultant covariance for $b \geq a$ is:

$$C_n(a, b \mid \theta) = \int_{-\infty}^{-b} g(-x \mid \theta)\, g(-x - (b - a) \mid \theta)\, dx, \qquad (9)$$

where $a$ is the distance from one site to the common junction with the second site, and $b$ is the corresponding distance for the second site. Again, when spatial autocorrelation functions are converted to moving average functions, the reparametrized moving average functions can be used to fill in the TD covariance matrix.
For the network in Figure B.1 using the exponential model, this would be:

$$\mathbf{R}_{\mathrm{TD}} = \begin{pmatrix} \theta_v & \theta_v e^{-3(a+b)/\theta_r} & \theta_v e^{-3(a+b)/\theta_r} \\ \theta_v e^{-3h_{s_1,s_2}/\theta_r} & \theta_v & \theta_v e^{-3(a+b)/\theta_r} \\ \theta_v e^{-3h_{s_1,s_3}/\theta_r} & \theta_v e^{-3h_{s_2,s_3}/\theta_r} & \theta_v \end{pmatrix} \qquad (10)$$

Spatial predictions

Predictions through ‘SSN’ are generated using either universal kriging or universal block kriging, where block kriging is prediction over an area or reach. Universal kriging consists of predicting unsampled sites ($y_o$) based on estimated predictors (e.g., % urban cover) and covariance relationships with sampled sites ($\mathbf{y}$) with:

$$\hat{y}_{o[i]} = \mathbf{x}_{o[i]} \hat{\boldsymbol{\beta}} + \mathbf{v}' \mathbf{V}^{-1} (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}), \qquad (11)$$

where $\mathbf{x}_{o[i]}$ are predictor data for a given site, $\hat{\boldsymbol{\beta}}$ are estimated predictor coefficients, $\mathbf{v}'$ is the covariance vector of $y_{o[i]}$ and $\mathbf{y}$, and $\mathbf{V}$ is the covariance matrix of $\mathbf{y}$. Recall from eqn. (3) that $\mathbf{V}$ could be:

$$\mathbf{V} = \mathrm{cov}(\mathbf{Y}) = \boldsymbol{\Sigma} = \sigma^2_{\mathrm{EUC}}\mathbf{R}_{\mathrm{EUC}} + \sigma^2_{\mathrm{TD}}\mathbf{R}_{\mathrm{TD}} + \sigma^2_{\mathrm{TU}}\mathbf{R}_{\mathrm{TU}} + \sigma^2_{\mathrm{NUG}}\mathbf{I}$$

The prediction error variance is:

$$\sigma^2_{o[i]} = \sigma^2_o - \mathbf{v}' \mathbf{V}^{-1} \mathbf{v} + \boldsymbol{\delta} (\mathbf{X}' \mathbf{V}^{-1} \mathbf{X})^{-1} \boldsymbol{\delta}', \qquad (12)$$

where $\boldsymbol{\delta} = \mathbf{x}_{o[i]} - \mathbf{v}' \mathbf{V}^{-1} \mathbf{X}$. This variance can be broken down into components of deterministic error (term 1: $\sigma^2_o$), error due to covariance (0 if all observations are uncorrelated with $y_{o[i]}$ and $\sigma^2_o$ if $y_o$ is an observation location), and error due to parameter estimation ($\mathrm{var}(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) = (\mathbf{X}' \mathbf{V}^{-1} \mathbf{X})^{-1}$, which increases as the distance between $y_{o[i]}$ and site $y_i$ increases) (Bivand et al. 2013).

Reach-generated catchment characteristics

Using topological and distance information from the SSN framework, upstream catchment characteristics for sites and prediction points can be generated using reach-scale catchment characteristics (Peterson and Ver Hoef 2014).

Consider the small stream network in Figure B.2. Following Table b in Figure B.2, for each stream reach, $i$, there is an associated reach-scale contributing area, $\mathrm{RCA}_i$. RCAs are non-overlapping polygons that have a 1:1 relationship with stream reaches; these areas are DEM-generated and represent the unique topographically contributing cells to each stream reach.
For each RCA, we can generate characteristics taken over the RCA based on other datasets using GIS (e.g., mean NDBI value, % urban cover).

Analysts more commonly wish to summarize an attribute over the entire catchment area rather than the RCA (i.e., all topographically contributing cells: for Figure B.2, p5, this would be [RCA4 + RCA5]). For the most downstream point on each stream reach, this can be achieved for proportional data and real-number data using the following:

$$W_i = \frac{(w_i\, \mathrm{Area}_i) + \sum_{k \in U_i} (w_k\, \mathrm{Area}_k)}{\mathrm{Area}_i + \sum_{k \in U_i} \mathrm{Area}_k}, \qquad (13)$$

where $W_i$ is the estimated watershed characteristic, $w_i$ and $w_k$ are the characteristics taken at the RCA-scale (e.g., mean NDBI, % urban cover), $k \in U_i$ indicates an individual stream reach, $k$, as an element of the set of all reaches upstream of stream reach $i$, $U_i$, and $\mathrm{Area}$ is the RCA area for a given stream reach. Importantly, for sites and prediction points that fall away from the most downstream point on a stream reach, the formula must be adjusted. Here, we assume that moving 10% up a stream reach length is equivalent to removing 10% of the RCA area. The adjusted formula is:

$$W_i = \frac{((1 - r_i)\, w_i\, \mathrm{Area}_i) + \sum_{k \in U_i} (w_k \times \mathrm{Area}_k)}{(1 - r_i)\, \mathrm{Area}_i + \sum_{k \in U_i} \mathrm{Area}_k}, \qquad (14)$$

where $r_i$ is the distance travelled upstream along reach $i$ as a proportion of its total length. Note that an important implication of eqn. (14) is that for sites at the uppermost endpoints of the stream network (i.e., $r_i = 1$ in a headwater stream), the estimate of catchment area would be 0 and the catchment characteristics would be undefined. In these cases, we calculated a small number to replace $(1 - r_i)$ in eqn. (14), indicating that a site has a relatively small catchment area; this number is the stream length/catchment perimeter. Figure B.2 demonstrates the calculation of some catchment characteristics for points along the stream network.
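As a rough illustration of eqns. (13) and (14), the area-weighted aggregation can be written as a single Python function. The two-reach network, its areas and values, and the function and argument names are all hypothetical; they are not taken from Figure B.2. The headwater case ($r_i = 1$ with no upstream reaches) is deliberately rejected here rather than patched with the small replacement value described above.

```python
def catchment_value(reach, rca_value, rca_area, upstream, ratio=0.0):
    """Area-weighted catchment characteristic at a point on `reach`.

    rca_value: reach -> characteristic summarized over that reach's RCA (w)
    rca_area:  reach -> RCA area (Area)
    upstream:  reach -> list of all reaches upstream of it (U_i)
    ratio:     proportion of the reach length travelled upstream (r_i);
               ratio = 0 is the reach outlet and reduces eqn. (14) to eqn. (13)
    """
    local_area = (1.0 - ratio) * rca_area[reach]
    numerator = local_area * rca_value[reach] + sum(
        rca_value[k] * rca_area[k] for k in upstream[reach]
    )
    denominator = local_area + sum(rca_area[k] for k in upstream[reach])
    if denominator == 0:
        # Headwater endpoint: eqn. (14) is undefined without a replacement for (1 - r_i).
        raise ValueError("catchment area is zero at this point")
    return numerator / denominator


# Hypothetical example: headwater reach "A" drains into reach "B".
value = {"A": 10.0, "B": 50.0}  # e.g., % urban cover within each RCA
area = {"A": 2.0, "B": 6.0}     # RCA areas in ha
ups = {"A": [], "B": ["A"]}

print(catchment_value("B", value, area, ups))             # 40.0 at the outlet of B
print(catchment_value("B", value, area, ups, ratio=0.5))  # 34.0 halfway up B
```

Moving halfway up reach B removes half of its RCA from the weighting, so the estimate shifts toward the value of the upstream reach.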
Figure B.3 compares estimates of catchment area and mean NDBI within the catchment using the classic approach (i.e., delineating each catchment separately) and the edge-generated approach (i.e., the approach outlined above). We found that catchment area tended to be underpredicted for sites < 3 ha ($10^{4.5} \approx 30\,000$ m$^2$) based on the 1:1 line, whereas the mean NDBI was more consistent (Figure B.3).

Finer-scale catchment characteristics

In some c