SPATIAL ANALYSIS IN A SUCCESSIONAL PERSPECTIVE: A BOREAL MIXEDWOOD LANDSCAPE IN NORTHEASTERN BRITISH COLUMBIA. by MARCO ALBANI Diploma di Laurea in Scienze Forestall, Universita degli Studi di Firenze, 1993 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE F A C U L T Y OF G R A D U A T E STUDIES THE F A C U L T Y OF FORESTRY (Department of Forest Sciences) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH C O L U M B I A October 2001 Â© Marco Albani, 2001 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department The University of British Columbia Vancouver, Canada DE-6 (2/88) A B S T R A C T Digital elevation models (DEMs), multitemporal Landsat TM images and forest inventory maps were use to study the spatial patterns of boreal mixedwoods in the Boreal White and Black Spruce zone of British Columbia. A Bayesian classification method was developed to produce a high resolution map of posterior probabilities of forest cover classes. The procedure employed the forest inventory maps as prior probabilities, while evidence was provided by the results of a maximum likelihood classifier applied to the satellite images. Topographic parameters, derived from the DEMs by fitting a conic surface equation, were used in a fuzzy-logic based classification to produce maps of terrain classes. A filtering technique, expressly developed to mitigate the systematic errors present, was applied to the DEMs before the classification. The resulting maps were employed to investigate the relations between terrain and vegetation using an appropriately modified version of Ivlev's electivity index. The analysis suggests that topography plays a role in the distribution of the different types of boreal mixedwood forest stands. As expected, black spruce (Picea mariana (Mill.) B.S.P.) and tamarack (Larix laricina (Du Roi) K. Koch) dominated stands were strongly associated with flat areas. Hardwood dominated stands were found significantly associated with convex slopes and ridges, white spruce (Picea glauca (Moench) Voss) dominated stands positively were associated with concave slopes and channels, and mixed stands were associated with neutral slopes. The forest cover maps were used to study the distribution of mixedwood and hardwood stands in relation to white spruce stands. The analysis employed a logistic regression model where topographic variables and stand age were used as covariates. The likelihood of a site being occupied by hardwoods instead of mixedwoods was found to be positively correlated with increasing distance from white spruce stands. This result suggests that the rates or the outcomes of succession in the mixedwood forest are influenced by the spatial arrangement of different stand types at the landscape level. The most likely explanation is the control of seed dispersal on white spruce seed availability in the landscape, although a direct causation cannot be inferred from this correlation analysis. ii T A B L E O F C O N T E N T S ABSTRACT ii TABLE OF CONTENTS iii LIST OF TABLES vii LIST OF FIGURES ix ACKNOWLEDGEMENTS xi DEDICATION xii CHAPTER 1: INTRODUCTION 1 1.1: BOREAL MIXEDWOOD DYNAMICS 1 1.2: RESEARCH OBJECTIVES ; 6 1.3: THESIS STRUCTURE 9 CHAPTER 2: LITERATURE REVIEW 11 2 .1 : THE PROBLEM OF SCALE IN ECOLOGY. 11 2.1.1: Scale: a glossary. . IJ 2.1.1.1: Clarifications on the concept of scale. 11 2.1.1.2: Definition of grain and extent. . 12 2.1.2: Scale effects on observations 13 2.1.3: Scale and analysis artefacts 17 2.1.4: Choosing the appropriate scale 18 2.2: METHODOLOGIES OF SPATIAL ANALYSIS IN LANDSCAPE ECOLOGY 2 1 2 .3 : THE USE OF DIGITAL ELEVATION MODELS IN ECOLOGICAL ANALYSIS 2 7 CHAPTER 3: INTRODUCTION TO THE STUDY AREA AND THE DATA SETS USED. _ 30 3 .1 : INTRODUCTION TO THE STUDY AREA 3 0 iii 3.1.1: Climate 32 3.1.2: Geology and Soils__ 33 3.1.3: Ecological Classification 35 3.2: DATA SOURCES 36 3.2.1: Overview of the data used 37 3.2.2: Field Data 38 3.3: CLASSIFICATION OF FOREST COVER USING REMOTE SENSING AND FOREST INVENTORY MAPS 42 3.3.1: Rationale 42 3.3.2: Remote Sensing Images Used and Pre-Processing 43 3.3.2.1: Images used 43 3.3.2.2: Image pre-processing 44 3.3.3: Image classification 45 3.3.3.1: Classification method 45 3.3.3.2: Classes used and training sites delineation 46 3.3.3.3: Integration of forest inventory information and remote sensing information using Bayesian probability theory. 50 3.3.3.4: Production of prior probability maps and classification results 51 CHAPTER 4: TERRAIN CLASSIFICATION FROM TRIM DEM DATA 59 4.1: BUILDING D E M S FROM T R I M . 59 4.1.1: An Introduction to the production of DEMs from TRIM fides and collection line artefacts ____ 59 4.1.2: Collection line artefacts as spatially structured elevation error. 60 4.1.3: The filter employed: line-based cross-smoothing. 63 4.1.4: The probability-based replacement function. . 67 4.1.5: Implementation 68 4.1.6: Effect of filter implementation on slope and aspect. 72 4.1.7: Conclusions. _ 76 iv 4.2: COMPUTATION OF TOPOGRAPHIC PARAMETERES FOR DEM-BASED TERRAIN CLASSIFICATION_ 79 4.2.1: Quantitative Analysis of Topographic Surfaces 79 4.2.1.1: Quadratic approximation of the topographic surface 79 4.2.1.2: The choice of appropriate scales for the analysis 82 4.2.1.3: Influence of DEM errors on topographic variables 83 4.2.1.4: TRIM interpolation effect _91 4.2.1.5: Excessive Generalisation 94 4.2.2: Terrain classification. 96 4.2.2.1: Identification of morphometric features from topographic variables. 96 4.2.2.2: From morphometric classification to terrain classification 100 CHAPTER 5: SPATIAL ANALYSIS 106 5.1: ASSOCIATION OF FOREST COVER WITH TERRAIN. 106 5.1.1: Analysis method 107 5.1.1.1: The modified Electivity Index 107 5.1.1.2: Statistical testing 108 5.1.1.3: Analysis implementation. 111 5.1.2: Results 114 5.1.3: Discussion 114 5.2: INFLUENCE OF THE DISTANCE FROM SW STANDS ON THE DISTRIBUTION OF H W D AND M W D STANDS. 1 1 7 5.2.1: Choice of data for the regression model 117 5.2.1.1: Distance from white spruce seed sources. 117 5.2.1.2: Dependent variable and choice of observations 119 5.2.2: Results : 120 5.2.2.1: Regression Models 120 5.2.3: Discussion 126 CHAPTER 6: DISCUSSION AND CONCLUSIONS 129 v 6.1: DATA QUALITY AND THE SPATIAL ANALYSIS OF BOREAL MIXEDWOODS. 129 6.1.1: Forest cover classification . 129 6.1.2: Digital Elevation Models 132 6.2: ECOLOGICAL SIGNIFICANCE OF THE SPATIAL ANALYSIS FINDINGS 133 6.2.1: Spatial association of terrain classes and forest cover. 133 6.2.2: Distance from white spruce stands and probability of mixedwood presence. 134 6.3: RECOMMENDATIONS FOR FURTHER RESEARCH 137 6.4: CONCLUSIONS 141 REFERENCES 144 vi L I S T O F T A B L E S Table 3.1: Characteristics of the satellite images used. 43 Table 3.2: The eight classes of the forest cover classification used. 46 Table 3.3: Number of training site pixels for each class: 48 Table 3.4: Average and minimum Bhattacharyya separability measures for 6 different combinations of seven of the eight available channels. 49 Table 3.5: Confusion matrixes for maximum likelihood classification using two different band combinations. 50 Table 3.6: Error Matrix Analysis of ground control points against MAXBAY results for t = 0.5. 57 Table 3.7: Error Matrix Analysis of ground control points against MAXBAY results for t = 0.75. 57 Table 3.8: Error Matrix Analysis of ground control points against MAXBAY results for t = 1. 58 Table 3.9: Errors of commission and omission in the MAXBAY classification for different values of the FIP confidence parameter t. 58 Table 4.1: Parameters used in the implementation of the line-based cross-smoothing filter. 68 Table 4.2: Summary statistics of the three DEMs elevation points and the effects of cross-smoothing: the change has non-zero mean. 69 Table 4.3: Areas (ha) of map sheets 94B.020 and 94B.060 falling in each slope class when slope is calculated from the original or the cross-smoothed DEM with window sizes of 3X3 and 5X5. Cross-smoothing does not "smooth out" the steeper slopes. 78 Table 4.4: Areas (ha) of map sheets 94B.020 and 94B.060 falling in each aspect class when aspect is calculated from the original or the cross-smoothed DEM with window sizes of 3X3 and 5X5. 78 Table 4.5: Distribution of the calculated RMSE of the topographic variables for the study area using Evans' method and a window size of 7 with no weighting function. 90 Table 4.6: Fuzzy membership functions for the terrain classes. 104 Table 4.7: Window-dependent parameters of the fuzzy membership functions. 104 vii Table 5.1: Modified electivity index for forest cover types and terrain classes (7X7 estimation window). 112 Table 5.2: Modified electivity index for forest cover types and terrain classes (15X15 estimation window). 112 Table 5.3: Modified electivity index for forest cover types and terrain classes (23X23 estimation window). 113 Table 5.4: Modified electivity index for forest cover types and terrain classes (31X31 estimation window). 113 Table 5.5: Logistic regression of distance from SW versus P(HWD): parameter estimates and deviance table. 121 Table 5.6: Logistic regression of P(HWD) on FIP polygon age: parameter estimates and deviance table. 125 Table 5.7: Logistic regression of P(HWD) on age, terrain and distance variables: parameter estimates and deviance table. 126 viii L I S T O F F I G U R E S Figure 3.1: Location of the study area. 31 Figure 3.2: Walter and Lieth diagram for Fort St. John Airport. 33 Figure 3.3: The study area (DEMs and Landsat TM intersection) in relation with the BEC zones, sub-zones and variants. 35 Figure 3.4: The Forest Inventory and Planning map sheets used in the thesis. 37 Figure 3.5: Prior probabilities for the bare opening, p(OB), and vegetated opening, p(OV), cover types calculated from the stand age. 52 Figure 3.6: The relation between the raster map of HWD prior probability and the FIP map in a section of the study area. 54 Figure 4.1: The DEM points superimposed to an analytical hill-shading for part of TRIM map sheet 94B.060, showing the collection line artefacts. 60 Figure 4.2: The main steps of the cross-smoothing algorithm. 64 Figure 4.3: Spatial distribution of elevation change between the original and the filtered DEM for TRIM map sheets 94B.020 and 94B.060. 71 Figure 4.4: Slope maps for TRIM map sheet 94B.020 as determined using the original or the filtered DEM and a 3X3 or a 5X5 window. 74 Figure 4.5: Slope maps for TRIM map sheet 94B.060 as determined using the original or the filtered DEM and a 3X3 or a 5X5 window. 75 Figure 4.6: Aspect maps for TRIM map sheet 94B.020 as determined using the original or the filtered DEM and a 3X3 or a 5X5 window. 76 Figure 4.7: Aspect maps for TRIM map sheet 94B.060 as determined using the original or the filtered DEM and a 3X3 or a 5X5 window. 77 Figure 4.8: Empirical correlogram of the elevation error of an interpolated DEM of 30 m resolution, obtained by adding a spatially uncorrelated error term to the TRIM sample points. 88 ix Figure 4.9: RMSE of the conic parameter d at different evaluation window sizes for different structure of the elevation error 89 Figure 4.10: Standard deviation of the difference between the conic parameters of artificial fractal surfaces and the conic parameters of their reproductions from simulated TRIM sampling and interpolation. 93 Figure 4.11: Box-plot of the distribution of probability values for the x 2 test for surface generalisation at different window sizes (x-axis) and decay coefficient values (trellis panel). 95 Figure 4.12: Linear and S-shaped fuzzy membership functions with parameters a and b shown. 102 Figure 5.1: Histogram of posterior probabilities for the SW forest cover class, showing their strong bimodal distribution. 118 Figure 5.2: Logistic regression curve of P(HWD) on distance and hexagonal binning of the observations. 121 Figure 5.3: Detail of the logistic regression of P(HWD) on distance in the zero to five km distance range. 122 Figure 5.4: Regression of P(HWD) on FIP polygon age. 125 x A C K N O W L E D G E M E N T S This thesis could not have been written without the help of several persons. First of all I would like to thank the members of my supervising committee Drs. Brian Klinkenberg and Peter Marshall, my research supervisor Dr. David Andison and my academic supervisor, Dr. Hamish Kimmins. They provided me with the necessary direction and support in this exploration of spatial analysis and boreal mixedwoods. Dr. Val LeMay, Dr. David Tait and Jerry Maedel at UBC, Dr. Richard Kabzems and Steve Warrington at the Ministry of Forests in Fort St. John provided much appreciated help. I gratefully acknowledge the assistance of Rene Beyers and Andras Darabant with work in the field and the laboratory. I am especially thankful to Dr. Leonardo Petri for his invaluable assistance during the field work and for his great friendship. The progress of the work was stimulated by several conversations with fellow graduate students and post-docs at UBC Forestry, both in the "Ponderosa days" and later: Jennifer Bennet, Robin Duchesneau, Sharon Hope, Aled Hoggett, John Lavery, Eliot Mclntire, Brad Seeley, Adrian Weber, and Clive Welham, as well as many others. Green College UBC provided much more than a place to live in the last stage of my time at UBC. Without its wonderful community I would have had a maybe shorter but certainly less interesting graduate program. Initial funding for the research was provided by Forest Renewal BC. I also received personal financial support through the R. Howard Webster Foundation Fellowship, the Canfor Corporation Fellowship in Forest Ecosystem Management, the Edward W. Bassett Memorial Scholarship in Reforestation, the University Graduate Fellowship of the University of British Columbia and the Hugh Robert Duncan Chisholm Scholarship in Forestry. xi D E D I C A T I O N Xpuoov y a p oi 5iÂ£r|UEVoi yfjv TroAAriv opuaaouoi KCU EupiaKouoiv oAiyov Heraclitus, fragm. 94 The game of science is, in principle, without end. Karl Popper, The Logic of Scientific Discovery. This thesis is dedicated to the memory of my father, Riccardo Albani. xu CHAPTER 1 : I N T R O D U C T I O N Observation of spatial patterns as a basis for understanding underlying ecological processes is certainly not a new idea in ecology. Indeed, one could trace it to the origins of the science of ecology in the works of Willdenow, Von Humboldt and Warming (Barbour et al, 1987). As Naveh and Lieberman (1994) note in the additions to the second edition of their seminal book, advances in remote sensing and geographic information systems have improved our ability to appreciate spatial patterns at the landscape scale. At the same time, practitioners of landscape ecology now have available a large number of quantitative methods by which to address in a rigorous way the study of landscape patterns and their relation to underlying processes (see for examples Turner and Gardner, 1991; Turner et al, 1991; Gustafson, 1998). The analysis of landscape patterns can be particularly insightful in the study of ecological systems where complex processes interact at different scales, especially where apparently stochastic large-scale processes are involved. The boreal mixedwood forest is such a system. 1.1: BOREAL MIXEDWOOD DYNAMICS The boreal forest is the largest forest biome in North America, and the major forest type of Canada. The boreal forest of Canada* occupies 213,960 thousand hectares, of which 139,751 are non-reserved productive forest land (Low et al., 1996). In British Columbia, the Note that these figures represent a subset of the Boreal Forest Region described in Rowe (1972) which is larger since it includes also the "boreal forest and grass" and "boreal forest and barren" regions of the inventory (Low et al, 1996). 1 boreal forest occupies 15,167 thousand hectares, o f which 11,401 are non-reserved productive forest land (Low et al., 1996). The Forest Regions of Canada (Rowe, 1972) distinguishes 44 sections in the boreal forest region, of which 12 occur in B . C . , but only two o f them are relevant to the area studied in this thesis: the mixedwood section and the lower foothills section. The mixedwood section is characterised by a mixture of trembling aspen (Populus tremuloides L . ) , balsam poplar (Populus balsamifera L . ) , paper birch (Betula papyrifera L . ) , white spruce {Picea glauca (Moench) Voss) and balsam fir (Abies balsamea (L.) M i l l . ) , with jack pine (Pinus banksiana Lamb.), black spruce (Picea mariana (Mi l l . ) B.S.P.) and tamarack (Larix laricina (Du Roi) K . Koch) in azonal sites. The mixedwood section extends from south-western Manitoba, and finds its westernmost reaches in northeastern B . C . The lower foothills section, a transition between the boreal and subalpine forest regions, extends west of the Mixedwood section. The Lower Foothills section has lodgepole pine (Pinus contorta Dougl.) and subalpine fir (Abies lasiocarpa (Hook.) Nutt.) as additional species, while jack pine is absent (Rowe, 1972). A s noted above, the study area includes both the mixedwood and the lower foothills sections, and one can expect it to be largely transitional between the two to a large extent, given the mapping scale at which these units were identified. In the biogeoclimatic classification system o f Brit ish Columbia the boreal mixedwoods are part o f the Boreal White and Black Spruce zone (Meidinger and Pojar, 1991). Ecologically the boreal forest can be characterised by a few main factors: the temperature regime is the most l imiting to tree growth of any forest biome (Thorpe, 1992), wildfires play an extremely important role in forest dynamics (Alexander and Euler, 1980), and there is generally a low diversity of tree species (Thorpe, 1992). 2 Boreal mixedwoods, especially in British Columbia, occur in landscapes where sharp changes in slope and drainage are rare, with the exception of the valley of the major rivers, (see Chapter 4 for quantitative data). Thick glacial deposits, either morainal or glaciolacustrine, overlay and mask most of the differences in bedrock (Catto et al, 1996). Those glacial features that would provide sharp contrast in soil texture and drainage, like eskers or eolian deposits, are rare, so that the physical environment frequently lacks sharp boundaries over large spatial scales (Farstad et al, 1965; Lord and Green, 1986). A noteworthy difference between the conifer dominated northern boreal forest and the mixedwood section is the importance of successional dynamics. As reported in a review of several studies of coniferous boreal forest dynamics (Heinselman, 1980), conifer dominated boreal forests see frequent stand replacing disturbance generally followed by the development of a single cohort of tree species which, although going through phases of stand development, will generally occupy the site until the next stand replacing disturbance. This led Heinselman (1980) to question the applicability of the concept of succession in boreal conifer forests. Although Heinselman's (1980) note refers mostly to the black spruce dominated coniferous forest, a similar pathway is found in the white spruce and aspen stands in interior Alaska (Youngblood, 1995). Single cohort mixtures play an important role in the mixedwood section as well (Thorpe, 1992), but there is abundant evidence that slower and gradual recruitment of spruce under aspen canopies is also possible. In central Alberta, aspen dominated stands were found to have spruce recruitment over time spans ranging from relatively short (15-20 years, producing essentially single story, even aged stands) to decades-long, producing an uneven-aged understory (Lieffers et al, 1996b). In some cases the regeneration of spruce occurs 40 or more years after stand initiation, but in a single pulse that creates a two-cohort stand (Lieffers et al, 1996b). 3 As Lieffers et al. (1996b) point out, the literature on age structure and development of white spruce in the understory of aspen stands is sparse. Nonetheless it is quite clear that in boreal mixedwoods there is the possibility of a process similar to autogenic succession, where white spruce regenerates under aspen canopy and eventually, in absence of a stand replacement event, dominates the stand (DeLong, 1989; Lieffers et al, 1996a). The proximity of seed sources (DeLong, 1989) and the presence of adequate safe sites (DeLong et al, 1997) and light levels (Constabel and Lieffers, 1996) have all been documented as critical factors in the establishment and growth of white spruce. While an aspen canopy seems to reduce the risk of frost damage (Groot and Carlson, 1996; Tanner et al, 1996), it does so at the expense of spruce height growth (Kabzems and Lousier, 1992). In plantation studies, germination and seedling survival seem to be more related to microsite than anything else (DeLong et al, 1997). Lieffers et al. (1996a) have proposed a multiple pathway model of mixedwood succession where the driving variables are the intensity of the stand replacement disturbance, fire, as well as the availability of white spruce seed at certain key phases of the stand development. Some of these driving variables might actually be controlled by processes occurring at the landscape level, and be dependent on the spatial arrangement of the mixedwood mosaic at any given time. Disturbance by fire is a landscape level process. There is disagreement in the literature on the relative importance of fuel and weather in controlling wildfire ignition and spread (Cumming, 2000), but there is evidence that different forest types have different ignition rates and fire return intervals (Renkin and Despain, 1992; Larsen, 1997; Cumming, 2001). The topographic location of a forest stand, as well as the nature of its neighbouring 4 stands, could influence its probability of being disturbed by fire, and maybe even the intensity of disturbance. The availability of white spruce seed is also potentially influenced by the landscape mosaic. White spruce does not have a permanent seed bank that can survive stand replacing disturbance by fire, and must rely on incoming seed transported by wind from neighbouring trees to re-colonise a site after disturbance (Zasada et al, 1992; Greene et al, 1999; Bourgeau-Chavez et al, 2000a). Studies of dispersal distance for white spruce seed show that it is effectively limited to within hundreds of meters of the source (Greene and Johnson, 2000). It has been suggested that the reduction of spruce seed sources in the landscape through selective logging and changes in disturbance regime can cause local extinction of white spruce and the conversion of mixedwoods to pure deciduous stands (Tanner et al, 1996; Wier and Johnson, 1998): Direct study of the process of white spruce seed dispersal and regeneration and its interaction with disturbance severity and intensity presents several difficulties. The intensity and severity of fires can vary widely both between fires and within the same fire (Johnson, 1992), affecting the survival rates of on-site propagules of white spruce competitors (aspen roots, lodgepole pine and black spruce serotinous cones). Timing of disturbance in relation to white spruce seed crop years is another key issue in terms of the ability of the white spruce to re-invade the disturbed site, since white spruce seed production is extremely variable from year to year. While stand-level studies are fundamental to understand the mechanics of the successional processes, the spatial and temporal variability inherent in the processes confines most direct studies to case studies, whose results in terms of rates and relative importance of processes are difficult to generalise. This is especially true considering that, while small fires in the boreal forest are relatively frequent, a limited number of extremely large fires is 5 generally responsible for the disturbance of the majority of any given area (Johnson, 1992), which makes it very hard to find proper replicates to any field measure or field experiment. A complementary approach to direct studies of ecosystem processes at the stand level is to search for evidence of successional processes in landscape level patterns. The patterns of spatial association between forest stand types and topography can be an indicator of the relationship between the physical environment and different boreal mixedwood successional pathways. The patterns of spatial arrangements of the different forest stand types relative to each other can be used to analyse how the presence of white spruce seed sources in the proximity of a site might influence the probability of that site developing into a mixedwood instead of a pure hardwood stand. 1.2: RESEARCH OBJECTIVES The analysis of spatial patterns requires spatial data. Because of the limitations that data resolution and extent impose on the analysis of spatial patterns (discussed in the next chapter in more detail), successful spatial analysis requires large quantities of spatial data, which are generally beyond the collection capability of a single researcher. This is why landscape ecologists often resort to spatially explicit data sets, that have been produced by land management agencies, but with objectives that are different from those of ecological analysis. Such spatial data are often not in a format that can be directly employed for a landscape ecological analysis. Collecting the available spatial information and processing it with the appropriate techniques so that it can be used for analysis is therefore a very important first step in any investigation in landscape ecology. The first objective of this thesis was to collect available spatial information for the boreal mixedwood forest in Northeastern British Columbia and to use it to produce an 6 ecological classification of the physical environment and of the vegetation. These data had to be collected in a way that ensured their suitability for the testing of relevant hypotheses about the spatial structure of the boreal mixedwoods. Extensive spatial data on forest cover can be obtained from forest inventory information. Cumming (1997; 2001) has used forest inventory data to investigate landscape patterns and natural disturbance processes of boreal mixedwoods in Alberta at the township scale. While that scale is appropriate to observe certain patterns, it is likely to miss others because of lack of resolution. To increase the resolution of the forest cover, I used optical digital remote sensing information provided by the Landsat 5 TM sensor. While other types of remote sensing information could have been purchased, optical remote sensing has been successfully used in several forest inventory and mapping applications, and methodologies of forest cover classification from remote sensing data are well known and tested (Wulder, 1998). For the boreal mixedwoods in particular, the Landsat TM sensor has been successfully used in forest cover classification in Alberta (Ghitter et al, 1995; Hall and Klita, 1997; Hall et al, 2000). Moreover, the processing of optical remote sensing images is not very complex and can be accomplished with a moderately priced GIS software package like Idrisi32 (Eastman, 1999). Landsat TM is therefore a reasonable choice for a project where remote sensing classification is a tool to be used in an ecological investigation. Obtaining spatial information about the physical environment independently from the forest cover information is necessary to avoid circularity in the investigation of their relations (Orloci, 1975). To do so, one option is to derive geomorphometric information from a digital elevation model (Pike, 2000). There are few applications of quantitative geomorphometry in forest and landscape ecological research; however, they are promising. Especially interesting is the possibility of producing an algorithm that is able to classify the 7 landscape into terrain classes, based on a digital elevation model, that also have ecological relevance. Once suitable spatial data sets had been assembled, it was possible to move to the second objective of the thesis, which was the use of spatial analysis to answer two main questions about the spatial patterns of the boreal mixedwood landscape: 1) The first question is: do any of the forest cover classes have spatial association, positive or negative, with terrain classes? Specifically, I was interested in investigating whether those stand types that are considered possible stages in mixedwood succession as outlined by Lieffers et al. (1996a) differ in their patterns of association with terrain classes from those that are not. The expectation is that these classes will differ in these patterns of association, since mixedwood succession is thought to be limited to the less extreme sites, while very dry or very wet sites are expected to be occupied mainly by black spruce and lodgepole pine. I was also interested in any difference in the selection for terrain classes within the group of stands that could be part of mixedwood successional pathway. The expectation was that there should not be any difference, since succession is thought to be driven exclusively by the characteristics of disturbance and the initial post-disturbance floristic composition, both of which are highly stochastic. Differences could be found, however, if terrain influences the outcome of succession. To have this influence, terrain need not to always override the effects of disturbance and initial floristic composition; but it might merely influence the probability distribution of these stochastic processes, so that while any successional pathway could be possible on any terrain type, their likelihood would differ. These difference would only be apparent by observing succession over large spatial or temporal scales. 8 These questions can be answered by testing the null hypothesis of randomness in the spatial distribution of forest cover types over the different terrain types. However, as will be discussed in Section 5.1.1, testing this null hypothesis with spatial data is a more complex task than it might first appear. 2) The second question is: do the spatial patterns of the boreal mixedwood support the idea that distribution of white spruce, and thus the formation of mixedwoods, is dispersal limited? This question can be answered by testing if the likelihood that any given site is occupied by a deciduous stand instead of by a mixedwood stand is independent of its distance from a spruce seed source. If spruce seed dispersal is very important, mixedwood stands should show a strong positive spatial association to white spruce seed sources, while pure aspen stands should show a negative association with white spruce seed sources. In contrast, if the distance to seed source is not a limiting factor, such an association should not be observed. 1.3: THESIS STRUCTURE The thesis is structured into five chapters in addition to this one. The following chapter (Chapter 2) presents a general literature review to set the stage for this research. First I will review the problem of scale in ecology, its implications for ecological science in general and for spatial analysis in particular. Then I will introduce some of the literature on the application of spatial analysis in ecology, describing several of the quantitative techniques that are available to describe spatial patterns. Finally, I will describe some applications of quantitative geomorphometry in the study of vegetation distribution. In Chapter 3,1 will provide an introduction to the study area, as well an introduction to the spatial data sets that are used. I will then describe the procedure used to produce the 9 forest cover classification from the integration of optical remote sensing and forest inventory data. The classification of terrain from quantitative geomorphometry is described in Chapter 4, where the filtering technique that I developed to remove systematic errors from the digital elevation models is also described. The spatial analysis is presented in Chapter 5, together with the technique that I developed to test spatial association using probabilistic instead of categorical maps. Finally, the results of spatial analysis and their ecological implications are discussed in Chapter 6. 10 C H A P T E R 2: L I T E R A T U R E R E V I E W 2.1: THE PROBLEM OF SCALE IN ECOLOGY. 2.1.1: Scale: a glossary. Ecology as a science is plagued with unclear terminology (Mcintosh, 1991). This is particularly true for landscape ecology: the same term is used with different meanings by different authors, while different terms are used with the same meaning. It is then important to start with some clarification and eventually the definitions of the terminology that I will employ. I am deriving the definitions mostly from Allen et al. (1984) and Kolasa and Rollo (1991). 2.1.1.1: Clarifications on the concept of scale. The first clarification to be made is about the interchangeable use of spatial and temporal scales in the discussion. In ecology, concepts and analytical methods elaborated for the temporal domain have often been transferred to the spatial domain, as in the general case of time series analysis. Some generalisations of terminology and discussion can be made between temporal and spatial scales, and indeed Allen and Starr (1982) treat temporal and spatial scales interchangeably in their discussion of hierarchy. Nonetheless space and time differ in important ways, the foremost being that time is unidimensional and unidirectional. A second clarification is on the use of the terms 'large' and 'small' in reference to scale. In landscape ecology, the term scale is used with opposite meaning than in cartography when referring to space. 'Large scale' identifies objects or processes occurring on spatial scales described by small scale maps. An example is the continental weather patterns influencing fire characteristics of the boreal forest, a large scale process, described by maps at an approximate scale of 1:50,000,000 (Johnson, 1992), which would be small scale maps in the cartographic sense. Conversely, 'small scale' identifies process usually described by large scale maps. While this use of the term for spatial scales is inconsistent with the cartographic terminology, it provides consistency of terminology between spatial and temporal scales. 2.1.1.2: Definition of grain and extent. A better description of scale is possible through the use of the concepts of grain and extent. Let us consider a set of observations. The grain of the observation set can be defined as the resolution of the data, or the interval of space or time over which the signal has been integrated to obtain the observation, and it determines the finest distinction that can be made in an observation set. The extent of the observation, instead, refers to the total space or time actually covered by the observation set. The extent of the observation set determines the largest possible phenomenon or pattern observed, and thus is the largest scale attainable from the observation set (Allen and Starr, 1982; Allen et al, 1984). This definition of grain and extent required the introduction of an observer's filter, as it has defined the scale (grain plus extent) of an observation set, not of a process. While the distinction between the external process (ontogenetically real, external to the observer) and the observation of it is a matter of extended philosophical discussion, the introduction of an observer's filter brings about an interesting consideration: the difference between functional scale and measurement scale. This discussion parallels the discussion of the difference between functional heterogeneity and measurement heterogeneity in Kolasa and Rollo (1991), although the concepts are applied to scale rather than to heterogeneity. But the two concepts seem to be deeply interlocked. Let us consider the scale of a measurement, or observation set, as defined above. 12 A variable of interest is observed assuming different values in different positions in the set, and, if one uses, for example, spectral analysis (Piatt and Denman, 1975), it is possible to assign a proportion of the variability observed to different frequencies (corresponding to different possible scales) through a Fourier transformation. This analysis, although constrained by the grain and extent of the observation in the smallest and largest frequencies observable, will define the importance of different scales in determining the variability of the variable of interest. We have described the measurement scales of this variable. On the other hand, some of these scales might not be relevant to any specific ecological entity. This happens if the ecological entity perceives and responds to variations that occur only in a certain range of scales. Variations in physical soil properties at a scale smaller than one cm are probably integrated by the root system of a mature tree. So if the variable of interest is some physical soil property, and one is concerned with whole trees, it is quite reasonable to consider as noise the variations in soil properties at a scale smaller than a certain level, even if one can measure them and assign them a certain fraction of variability in a spectral density diagram. At the same time, variations in soil physical properties that occur at geological time scales are functionally meaningless for a single individual tree. Functional scale is then defined by the interaction of a variable with an observer (the ecological entity). 2.1.2: Scale effects on observations The concept of a functional scale, and the concept of ecological entities as observers of the environment at a specific scale, can be quite fruitful in ecological analysis (Kolasa and Rollo, 1991). Using such an approach draws attention to how the scale of observation influences what is observed, and in certain cases even defines what is observed (Beals, 1973). 13 The scale of an observational data set potentially influences what is observed in many ways. Some of the effects are relatively obvious: processes occurring at a scale lower than the observation grain are integrated, and thus not observed in their complexity (I use the term "integrated" instead of "averaged", because the integration of the signal over the grain of the observation is not necessarily an average). Similarly, the processes occurring at a scale larger than the observational extent cannot be observed in their entirety, and can only be guessed at. Some other effects are more subtle. The grain of the observation has some interesting consequences when observations with different grain are related in a heterogeneous system. An illuminating example is the consequence of sub-grain environmental variability in gradient analysis. In vegetation sampling it is common to estimate some of the environmental parameters of a vegetation plot from randomly located subplots. The effect of this procedure, if the vegetation sampling is based on cover estimates on the subplot, is to potentially overestimate the ecological amplitude of the sampled species (Palmer and Dixon, 1990). If assumptions can be made on the frequency distribution of a species cover relative to the environmental parameter, the effect can be corrected for, but if no assumption is possible about the species' frequency distribution, then the effect remains. The reduction of grain, the aggregation of data to form observations at larger scale, involves many similar problems. The effect of data aggregation on analysis is known in geography as the Modifiable Areal Unit Problem (MAUP) (Jelinski and Wu, 1996). The MAUP is composed of a "scale problem", which related to the extent of aggregation, and "zoning problem", which relates to the shape and orientation of the aggregation window. The effect of the MAUP on simple metrics is relatively straightforward. As long as the data are 14 weighted on a per area basis, the global mean is insensitive to size and zoning*. The effect of the MAUP on the global standard deviation is less straightforward, but still some simple trends can be expected. The standard deviation will decrease with increasing aggregation, since the range of the data can only be reduced by a progressive averaging. The rate of decrease will be dictated by the level of spatial autocorrelation of the data, and such a property is at the base of the blocking technique of Greig-Smith (1952) and its subsequent modification (reviewed in Turner et al, 1991). But the rate of decrease is also sensitive to the zoning strategy, making the blocking methods generally sensitive to block size and to the starting position in the analysis. Landscape metrics (O'Neill et al, 1988) are extremely sensitive to the MAUP, sometimes in counterintuitive ways. Both the suite of landscape metrics (patch shape indexes, contagion, dominance) developed for categorical maps (O'Neill et al, 1996) and the geostatistical indexes (Moran's /, Geary ratio, Cliff-Ord Statistic) better suited for field data (Jelinski and Wu, 1996; Qi and Wu, 1996) are affected by the grain of the data. Landscape metrics based on categorical maps are also influenced by the number of different categories present in a map, thus by the classification scheme (Gustafson, 1998). The grain effect is also related to what is sometimes identified as a "size effect" in the literature. Hulshoff (1995), in her analysis of a Dutch landscape, points out that the shape index calculated as mean perimeter to area ratio is influenced by patch size as well as by patch shape. This is easily understood by considering that, for the same geometrical shape, the perimeter is a linear function of size, while the area is quadratic. A square of unit size side has a perimeter to area ratio of 4, but if the length of the side is doubled, the ratio becomes 2. An important * Jelinski and Wu (Fig. 1 in Jelinski and Wu, 1996) show a zoning effect on the calculation of the mean in their grid example, but this is because their mean is not weighted on the unit area. 15 implication is that the perimeter area ratio is also affected by units of measurement. This happens because the index is not dimensionless: it has a dimension of length'*'. Thus the perimeter to area ratio of a square of one hectare area is 0.04 m"1 or 40 km"1, while a square whose side measures 1 inch has a perimeter to area ratio of 4 if measured in inches"1, but of 1.575 if measured in cm"1. If the index, as is often the case, is calculated in lattice units, it is evident that changing the size of the grid will affect the value of the index even if no resampling of the elements is necessary. Size affects not only the perimeter to area ratio, but also edge density and contagion. Failure to identify the "size effect" on landscape metrics has brought about some confusion in the evaluation of the ability of landscape metrics to discriminate between different patch arrangements over a landscape (Hargis et al., 1998). Similar consequences are to be expected if grain effects are not considered when analysis with different resolutions are compared. Not only the grain, but also the extent of observations has non-trivial consequences on the patterns observed. When varying the extent of an observational data set, let us say a vegetation survey, the correlation between the occurrence of some specific values, e.g. the presence of two species, can change from significantly negative to non-significant, to significantly positive (Beals, 1973; Allen et al, 1984). A similar behaviour is common for all methods based on correlation such as joint count measures, or the electivity index (Pastor and Broschart, 1990). Landscape metrics are also sensitive to the extent of the observation. O'Neill et al. (1996) showed for example that the limited extent of 640 km2 hexagonal samples of land use in South-eastern United States prevented an accurate estimate of contagion and dominance, because many patches were sliced in smaller and simpler elements by the sampling unit. 16 2.1.3: Scale and analysis artefacts The scale of observation does matter in the observation of ecological phenomena. Both grain and extent can substantially affect the results of any analysis, and the analysis of the spatial arrangement of ecological entities or processes is particularly sensitive to scale effects. This is not necessarily detrimental to research. Scale effects can be artefacts of the analysis process, as well as signals of relevant ecological processes. Analysis methods that eliminate scale effects, if altogether possible, might actually lose more insight than they can gain (Jelinski and Wu, 1996). The problem is parallel to that observed in dealing with the spatial structure of ecosystems. The elimination of spatial autocorrelation between observation in a data set might enable one to use simpler statistical procedures, but at the same time can obscure some important characteristics of the system under study (Borcard et al, 1992; Rossi etal, 1992; Legendre, 1993). As an example, let us consider the extent effect reported by Allen et al. (1984). In their example, two forest species, one a pioneer and the other late successional, are sampled within a forest. At the forest level the two species show significant negative correlation, the pioneer occupying recently disturbed sites, and the late successional species occupying older forest patches. As the observational extent is increased, an adjacent grassland is included in the analysis. As one adds more and more grassland samples, the strength of the negative correlation between the two forest species diminishes, until, if enough additional grassland is included, it changes sign and becomes a positive correlation. The authors indicate that the initial and final analysis results are not data artefacts, but the consequence of two different processes operating at two different scales: a smaller scale process that segregates the two species from each other (e.g. succession) and a larger scale process (e.g. regional climate) that segregates the forest species from the grassland. 17 On the other hand scale effects can be analysis artefacts, unrelated to any ecological process. An example is provided by O'Neill et al. (1996), where a large region is sampled for landscape structure using regularly spaced hexagons of fixed area. If the landscape is composed of patches of size comparable with that of the sample units, the inadequate maximum extent of the sampling unit slices them, creating spurious small patches, and modifying the distribution of patch sizes. The spurious small patches will disappear as the sampling unit size is increased. This will have an effect in the calculation of those landscape metrics that are sensitive to patch size, although the changes in the indexes with scale cannot be associated with any ecological process. By changing scale it is possible to observe different ecological processes, but it is also possible to miss them in random noise, or to generate artefacts. The first consequence is that there are inappropriate scales, which provide no information or, worse, erroneous information, and there are appropriate scales that provide valuable information. The second consequence is that there is more than just one appropriate scale (or range of scales), depending on the system and the processes studied. 2.1.4: Choosing the appropriate scale The examples provided above show how some scales of observation are more insightful than others, and how some choices of grain and extent might produce data artefacts. Because most landscape ecology literature deals with the problem of scale in maps obtained by remote sensing or aerial photo interpretation, where continuous cover is provided, although with a specific resolution (grain), I avoided the discussion of scale in spaced sampling. In spaced sampling the concept of grain is more complex, since it has two components: the spacing between sample units, which corresponds to the minimum "lag" in 18 geostatistics, and the size of the sample units (Bellehumeur and Legendre, 1998). In sampling designs both spacing and sample unit size effect the high frequency component of the variance, especially if their scale is comparable, and different patterns can be observed varying one or the other or both (Jonsson and Moen, 1998). In choosing the observation scale, one would want to observe the system at scales that reveal some significant pattern, while at the same time avoid artefacts. The example above showed how the possibility to modify the scale of the analysis, both in terms of grain and extent, can be quite insightful and reveal possible scale effects and artefacts. Unfortunately this is not always possible, or at least the possibilities of modifying grain and extent of observations are bounded by the data collection method, thus, the importance of a careful selection of the data collection methodology, or of the understanding of the technology involved when, as in the case of satellite-based remote sensing, it is driving the choice of grain and extent (Fisher, 1997). The problem of the choice of scale in exploratory analysis differs from that in investigations that are aimed at specific processes. In the former case one has no specific reason to choose one scale over another, and, depending on the available data sources, one should possibly perform the analyses on a wide range of scales. Since the analysis is exploratory, the main risk is not of not finding anything, but to find structure that is an artefact of the data collection procedure: the result that no relevant structure could be detected is an important result if the analysis involved a wide range of scales. Artefacts are more likely to appear at the extremes of the scale spectrum, as either the extent or the grain of the observation interferes heavily with the analysis. In landscape pattern analysis, for example, O'Neill et al. (1996) advise the use of a grain that is 2 to 5 times smaller than the smaller patch, and an extent that is 2 to 5 times larger than the largest patch. 19 The risks of employing a wrong scale are obviously higher when one is interested in a specific process, because now one must avoid the inability to detect the process of interest because of scale problems, as opposed to the non-occurrence of the process. At the same time, the confounding effects of other processes occurring at different scales must be avoided. Knowledge of a process usually enables one to define at least a range of scales at which the process should be observed. If one is interested in competition between plant species, the average size of an individual plant is an obvious lower bound for the.grain of analysis, while the range of the spatial autocorrelation of the local environmental properties is a reasonable upper bound for grain, if competition is not to be confounded with habitat selection (Jonsson and Moen, 1998). The use of a small scale "pilot" sample to explore the structure of spatial autocorrelation of the variable of interest at scales lower than the minimum grain of the study can be quite insightful (Bellehumeur and Legendre, 1998), although one must be wary of the possibility that the variable of interest is not second-order stationary, i.e. that the spatial covariance is not only a function of the lag and of the orientation of the distance vector, but also of the positioning in the study area. The choice of scale is complicated by the fact that many ecological processes occur over different scales. I will use fire spread as an example, since it is quite relevant to the boreal mixedwoods. Fire spread is influenced by wind speed, fuel moisture content, fuel physical and chemical characteristics and terrain slope (Johnson, 1992). The scale of temporal variation of these variables is quite diverse, ranging from hours for wind and fuel moisture content to millennia for terrain slope (Johnson, 1992: 25). The scales of wind and moisture content overlap. Moisture content is more important than fuel chemistry in determining the rate of spread, because of the large heat sink of water. Slope is important, but 20 it varies at such a slow rate in time, that it can be considered constant. Thus, the temporal scale of wind and fuel drying is the one most relevant to the process. The grain will have then to be in the order of minutes to days, the exact choice depending on the temporal autocorrelation structure of the two variables, and on the limits of data collecting capabilities. Data collecting capabilities often limit the range of observable scales, although the present array of data sources is certainly wider than at any previous time in history. The development of remote sensing, both from satellites and aircraft, has considerably expanded the range of scales that can be directly observed, although the temporal or spatial resolution of this data source is often limiting. In satellite based optical sensors, for example, there is generally a trade-off between spatial resolution, temporal frequency and spectral resolution. The Landsat TM sensor has 4 bands in the visible and near-infrared region of the spectrum and a nominal pixel resolution of 30 m, while the sensor of the French SPOT satellite offers a 10 m nominal ground resolution, but with only one band that spans the visible region. The Landsat sun-synchronous orbit gives an acquisition period of 16 days for the same site, while the NOAA AVHRR gives up to four acquisitions daily, but at the expense of spatial resolution, which is l-km(Quattrochi and Pelletier, 1991). If resolution is a problem in remotely sensed data, extent is usually the problem with ground collected data, especially for physiological parameters. These are usually collected over relatively small areas, with serious implications when these observations have to be scaled-up in larger scale ecological studies (e.g. Jarvis, 1993; Pierce and Running, 1995). 2.2: METHODOLOGIES OF SPATIAL ANALYSIS IN LANDSCAPE ECOLOGY The spatial arrangement of ecosystem components can be an important clue to the understanding of ecological processes. Some examples are the use of spatial sequences in lieu of temporal sequences to study succession, or the often cited deduction of processes 21 from spatial patterns in Watt's (1947) description of heath or beech ecosystems in Britain. These examples of the analysis of spatial distribution are classical examples of ecological research, as are the studies of vegetation changes over climatic or edaphic gradients, but they all have in common the characteristics of involving a very simple relation between spatial arrangement and the variables driving the process of interest. Where the spatial distribution of some environmental variables is more complex, spatial patterns have often been seen more as a problem than an opportunity, and rightly so, since they reduce the ability to use simple statistical techniques because of spatial autocorrelation. On the other hand, spatial heterogeneity, and spatial and temporal autocorrelation, are a consequence of the structure and functions of ecosystems, and the insight that can be gained by removing them is obviously limited (Kolasa and Rollo, 1991). A wide array of quantitative techniques exists to deal with spatial structure in ecological studies. While some of these techniques are not new, e.g. Greig-Smith's (1952) blocking method, the use of spatial analysis has gained momentum in the last decade with the growing interest in landscape ecology and large scale studies (Rossi et al, 1992). Since most ecological systems are heterogeneous at different scales, I find it useful to distinguish between methods that compare variability across scales, and methods that are preoccupied with describing the arrangement of a variable's values at a specific scale. The first family of methods includes the various blocking methods (e.g. Greig-Smith, 1952; Pielou, 1977; see Turner et al, 1991, for a review) that use a nested analysis of variance to describe the relevant scales in the patterns of a certain variable. Blocking methods are used to detect changes in within vs. between block variance as block size and distance between blocks increase. The original method, proposed by Greig-Smith (1952), is based on the progressive aggregation of blocks on a regular spaced grid, so that the block 22 size is quadrupled at every step, but the method can be extended to linear transects, where block size is doubled each time. The mean square variance is plotted against the block size. If the variable is clumped, peaks in the mean square variance should appear at the block size corresponding to the clump size, since the blocks at that scale will be quite different from each other. This technique has several statistical limitations: the total number of units in the analysis must be a power of four (a power of two in the case of transects), the number of degrees of freedom in the nested analysis of variance decreases with increasing block size, the analysis is sensitive to block size and starting position, and the estimate of the variance at a given block size is not independent from the estimate at other block sizes so that it is not possible to test for significance of the peaks that are found (Turner et al, 1991). Since the blocking technique is relatively old and has had widespread use, these problems have received attention in the literature, and several improved blocking techniques, reviewed by Turner et al. (1991), have been proposed to overcome the limitations. Nonetheless there are two inescapable limitations of any blocking technique: it will be biased in the identification of regularly repeating patterns, and it will not identify where clump edges occur. The moving window analysis is an effective alternative when one is interested in determining the position of edges (Brunt and Conley, 1990). The moving window analysis is an automatic edge detection algorithm based on the sliding of an analysis window over the data series. Although moving windows are used both in single and two-dimensional data (e.g. Fortin, 1997), here I will describe only the single dimension case (transect). The moving window analysis is conduced by sliding a window of Â«-plot size, with n being even, across the data series, and comparing the two window halves with an appropriate metric. The distance between the two halves is plotted against the position of the window centre, and peaks in the distance metric can be interpreted as edges 23 (Turner et al, 1991). The main advantages of this technique are that the metric to be used can be chosen to suit the data type, so multivariate data can be analysed directly without data reduction, and that, since it identifies specific edges and not characteristics sizes, the method is applicable to the detection of irregularly spaced patches. Both the choice of the distance metric and of window size used influence the results. Different distance metrics have different properties. Both the Square Euclidean Distance (SED) and Mahalanobis distance have been used in the literature (Turner et al., 1991). These two metrics have specific advantages and specific problems. The SED is quite straightforward in calculation, and its properties are better known when applied to ecological data. On the other hand the SED is sensitive to the measurement units and to absolute values of the variables, which can lead to undesirable consequences when comparing vegetation samples, e.g. samples that have similar vegetation but in very different abundance might end up being more distant than samples that don't have the same species in common (Brunt and Conley, 1990). The advantage in the use of the Mahalanobis distance is that it accounts explicitly for the correlations that could be present between the variables (Dillon and Goldstein, 1984), although if the correlations are non-linear it is possible that the advantage would be lost (Turner et al, 1991). While an analysis of the performance of SED-based window analysis in simulation data sets is available (Brunt and Conley, 1990), I am aware of no such analysis employing the Mahalanobis distance. The size of the moving window is also important for the discrimination of edges. In general, wider windows are better at discriminating "real" edges from background heterogeneity, although a patch can be totally lost, or interpreted as a single edge, if the window is comparable in size with the patch size (Brunt and Conley, 1990). Another factor 24 influencing the algorithm performance is the number of variables used. Increasing the number of variables was found to reduce the ability of the algorithm to find "real" edges in simulated data sets when the "real" edge corresponded to a change in expected value for only a subset of the variables (Brunt and Conley, 1990). The moving window analysis somehow overlaps between the methods used to describe patterns across scale and the methods used to describe patterns at a given scale, since it can be used for examining either the arrangement of the edges detected, or the effects of changing window size on edge detection. There is also a set of statistics designed to describe patterns at a given scale that is found in the landscape ecology literature (O'Neill et al, 1988; Hulshoff, 1995; Jelinski and Wu, 1996; O'Neill et al, 1996; Gustafson, 1998; Hargis et al, 1998). These statistics, usually referred to as landscape indices (O'Neill et al, 1988; Hulshoff, 1995) or landscape metrics (Hargis et al, 1998), are designed to summarise the characteristics of a landscape described by a thematic map. While some of these indices are independent from the spatial arrangement of the classes on the landscape, relying only on their relative abundance (diversity, richness, evenness, dominance), the most interesting ones are those that attempt to summarise the spatial structure of the landscape. There is a wide array of such indices (Gustafson, 1998), that generally try to tackle either the complexity of patch shapes, as fractal measures (O'Neill et al, 1988) and shape indexes (Hulshoff, 1995), or the amount of edge, as contagion (O'Neill et al, 1988). As discussed in section 2.1.2, the use of these indexes has shown them to be generally sensitive to the scale of the analysis (O'Neill et al., 1996), the size of the patches (Hulshoff 1995; Hargis et al. 1998) and the distribution of the landscape in the different classes (Gustafson and Parker, 1992). 25 As noted above, the applicability of these metrics is generally limited to categorical maps. The rationale behind the use of landscape metrics is usually the study of fragmentation of a natural landscape matrix (Forman and Gordon, 1984), where a sharp contrast between the matrix and the patches is to be expected. This is especially true when the theoretical context used in the study is that of island biogeography (MacArthur and Wilson, 1967), where the landscape is seen as a binary mosaic of suitable and non suitable habitat (Gustafson, 1998). When dealing with ecological properties not easily defined by class variables, the use of these metrics can be more treacherous, since the results will be heavily influenced by the classification scheme that is employed. Theoretically, it is possible to account for possible effects of the classification scheme using statistics that keep track of the role of the different classes, like measures of edge sharing or the electivity index (Pastor and Broschart, 1990), or by contrasting the impact of different classification schemes on summary landscape metrics. An appealing alternative when dealing with continuously varying variables is to describe the spatial heterogeneity of a landscape using a textural index, the Inverse Difference Moment (IDM) (Musick and Grover, 1991). The IDM is a measure of homogeneity that has also the desirable property of being relatively insensitive to window size when calculated with a moving window. It is calculated as the sum of the relative frequency of the co-occurrence of every couple of values in the image, weighted by the difference between the values in a way that reduces the IDM value increasingly for the co-occurrence of values that are further apart. The IDM is a bounded index with a maximum value of 1 for a completely homogeneous image, and because the difference between categorical values has no sense, it is limited in use to continuous values. It has been proven to be useful in improving image classification from remotely sensed data (Franklin and Peddle, 26 1990; Kushwaha et al, 1994) and in assessing species diversity from air photos (Bijlsma, 1993), generally performing better than other textural measures. 2.3: THE USE OF DIGITAL ELEVATION MODELS IN ECOLOGICAL ANALYSIS Digital Elevation Models (DEMs) and Digital Terrain Models (DTMs), usually derived from DEMs in a GIS environment, are used in a wide range of landscape investigations (Pike, 2000), and present several opportunities in the study of vegetation distribution at the landscape scale (Florinsky, 1998b). DTMs can be defined as digital representations of the topographic variables for a given area, distinguishing them from DEMs, which only describe elevation (Florinsky, 1998b). Although it is theoretically possible to have a DTM produced by direct measurement of the topographic variable of interest, DTMs are generally generated from DEMs through a variety of analytical procedures (e.g. Evans, 1980; Haralick, 1983; Zevenbergen and Thorne, 1987; L i and De Dapper, 1996). Florinsky (1998b), in his review of the use of DTMs in landscape investigation, describes sixteen topographic variables, all derived from DEM data, that have been used in a variety of investigations. The objective of these topographic variables is to describe terrain shape, generally in the context of its relation with hydrological flows and solar radiation. Because of the importance of these processes in terrestrial ecosystems, DTM variables could be used in ecological studies that address topographic control on the community. In some applications, topographic variables have been used to automatically derive terrain feature classifications (Pellegrini, 1995; Wood, 1996; Coops et al, 1998). A reasonable question is how much can DEM-derived terrain variables represent the terrain information that is relevant to ecological processes at a given scale, or how much of 27 this relevant information is lost in the processes of DEM production and DTM derivation. While a general answer is not possible, since it will necessarily depend on the quality of the specific DEM used, there is evidence in the literature that DEM-derived topographic variables can be successfully used in statistical models to predict vegetation cover. DEM or DTM data have been successfully used to integrate remotely sensed information, with sensible improvements in the results of vegetation classification (e.g. Franklin, 1987), but here I will review only a few instances were DTM information has been used to predict independently estimated vegetation parameters. Pickup and Chewings (1996) used a set of topographic indexes (slope, horizontal and vertical curvature, drainage area) integrated with some biological indexes (non-vegetated drainage area, both total and percent, and distance from water points) to predict vegetation cover in an arid grazed rangeland in central Australia using multiple linear regression. The vegetation cover was measured form several multitemporal satellite images. The authors obtained multiple correlation coefficients between 0.55 and 0.70, with the higher values corresponding to the steeper areas which also had less grazing impact. In their study, the use of the ancillary ecological information (distance from water and percent of non-vegetated drainage area) was important to improve the model performance. Florinsky and Kuryakova (1996) studied the influence of topographic characteristics on vegetation in eastern Kazakhstan. They calculated a set of 9 topographic variables from a DEM in four different landscapes of 16-square km in size. They used the topographic variables to predict a rank index of vegetation cover with linear rank correlative analysis, obtaining correlation coefficients in the range of 0.26 to 0.68. Their study areas are characterised by a wide range of elevations, 400 to 1800 m between areas and 350-500 m 28 within-area, and by an extremely continental climate. Their best result in predicting vegetation index was in the area with lowest human impact. Ostendorf and Reynolds (1998) used a terrain based model to predict vegetation in an arctic watershed in north-central Alaska. They used slope and discharge measures derived from a DEM and a 6 class vegetation map obtained from field survey. A probabilistic model, based on a subset of the total study area, was produced to predict the probability of occurrence of a given vegetation type on a given combination of slope and discharge classes. When the model was extrapolated to the entire study area, it obtained an overall spatial accuracy of ca. 60%. Overall these studies show how, in environments where terrain controlled processes are very important, DEM derived measures provide relatively good predictors of vegetation, supporting the idea that DEM derived measures can be rightfully used as indirect terrain measures in large scale ecological investigations. 29 C H A P T E R 3: I N T R O D U C T I O N T O T H E S T U D Y A R E A A N D T H E D A T A S E T S U S E D . 3.1: INTRODUCTION TO THE STUDY AREA The study area is located in northeastern British Columbia. It spans from 56Â° 00' to 56Â° 54' latitude north, and from 122Â° 31' to 120Â° 48' longitude west (Figure 3.1), extending from the Rocky Mountain foothills in the west to Charlie Lake in the east, and from the Blueberry River in the north to the Peace River and the Moberly River in the south. The boundaries shown in Figures 3.1 and 3.3 are those of the intersection of the available digital map coverage with the Landsat TM images that were used. Areas with elevation over 1100 m, areas classified as "not forest" in the forest inventory, and some small watersheds east of Butler Ridge that drain into Williston Lake, in the south-west comer of the study area, were removed from the study, leaving an area of 485,220 ha subject to analysis. Administratively, the study area is within the Fort St. John Forest District and the Fort St. John Timber Supply Area (TSA) north of the Peace River. South of the Peace River, the area is part of Dawson Creek Forest District and of the Dawson Creek TSA. The topography of the area is characterised by flat plains or gently rolling hills, with the exception of Butler Ridge in the south-west corner, which is part of the Rocky Mountains foothills. The plains and gentle hills are dissected by the deeply cut valleys of the Peace River in the south, and by its tributaries. Most of the study area lies between 600 and 900 m elevation, with some the hilltops rising to over 1000 meters. Elevations over 1100 m are limited to Butler Ridge, and have been eliminated from the study area. 30 31 Farrel Creek, Halfway River and its tributaries, Graham River and Cameron River, drain the central and northern part of the study area, flowing from the north-west into the Peace River in the south, and their watersheds make up most of the area. The extreme northeastern part of the study area spans part of the watershed of the Blueberry River, which also has a mainly north-west to south-east orientation, and eventually drains into the Beatton River, a tributary of the Peace River. The extreme south-east part of the area spans into the valleys of the Moberly and Pine Rivers, two southern tributaries of the Peace River. The area is dotted by glacial lakes. The largest is Charlie's Lake, near Fort St. John, while several others dot the uplands dividing the Peace River and the Moberly River valleys. There are also extensive wetlands, particularly in the valley of Halfway River. Non-native use of resources in the area is relatively recent, due to the relative remoteness and lack of. accessibility of the area. Although the area was one of the first parts of British Columbia to be reached by European fur traders, with Sir Alexander Mackenzie reaching the Peace River area in 1792, the area was not accessed by railway until 1931 when the Northern Alberta Railway reached Dawson Creek. Accessibility increased with the construction of the Alaskan Highway in 1942, with the John Hart Highway connecting Dawson Creek to Prince George in 1952, and with the completion of the pacific Great Eastern Railway in 1958 (Farstad et al., 1965). 3.1.1: Climate The climate is continental, with long cold winters and short summers with moderate precipitation. At the Fort St. John Airport, the closest Environment Canada weather station to the study area, the average daily minimum temperature of the coldest month is -19Â° C in January, while the average daily maximum of the warmest moth is 21Â° C in July. 32 The annual precipitation average is 467 mm, of which 171 mm falls as snow, providing measurable snow cover between October and April. June, July and August are the only frost-free months, and the only three months with daily mean temperature above 10Â° C. The degree-days sum (above 5Â° C) is 1295. The Walter and Lieth diagram for Fort St. John Airport is shown in Figure 3.2. 3.1.2: Geology and Soils Bedrock in the study area is mainly medium to fine-textured sandstone and Cretaceous shales. Higher lands and flat-topped mesas occur where the sandstones are particularly thick (Farstad et al, 1965; Lord and Green, 1986). Most of the area is capped by Fort St. John Airport (1942-1990) 695 m. 1.6Â° C 467.5 mm (33.6Â°) 21.5"-19.4Â° (-47.2Â°) Temperature Precipitation 50 , _ 100 Figure 3.2: Walter and Lieth diagram for Fort St. John Airport. Data source: Environment Canada. 33 thick glacial till, sometimes covered by glaciolacustrine deposits (Lord and Green, 1986). The area was affected by Montane ice from the Rocky Mountains passes, a pedemontan lobe of Cordilleran ice outflowing from the Parsnip and Peace River Valleys, and Laurentide ice advancing south-west from the Lake Athabasca area (Catto et al, 1996). Recent research shows that the Montane and Cordilleran advance from the west was pre-Late Wisconsinan, while in the Late Wisconsinan the Laurentide ice expanded to the west reaching its maximum around 20,000 BP (Catto et al, 1996). During the retreat of the Laurentide ice and the expansion of the Montane ice, the centre of the study area was occupied by a large lake dammed by the Laurentide ice in the east (Mathews, 1980; Catto et al, 1996). The surficial geology of the area is thus dominated by these glacial events, with morainal loamy deposits in the uplands, and glaciolacustrine clayey and loamy, or glaciolacustrine loamy deposits in the lowlands. Glaciofluvial deposits are present in the embankments of the Peace River. Colluvial undifferentiated deposits are common on the steep slopes bounding the deeply cut rivers and creeks, while organic deposits are present under bogs and wetlands (Lord and Green, 1986). The available soil surveys cover only part of the study area (Farstad et al, 1965; Lord and Green, 1986). The map of Lord and Green (1986) covers from 120Â° 00' E 55Â° 30' N to 122Â° 00' E 56Â° 30' N , so no soil maps are available for the northernmost 15' of latitude and the westernmost 31' of longitude of the study area. For the part covered, the soil survey reports mostly Orthic Gray Luvisols in the morainal northeastern uplands, Brunosolic Gray Luvisols in the glaciolacustrine deposits around Halfway River and west of it, and Regosols in the alluvial and colluvial areas near active rivers and creeks. 34 3.1.3: Ecological Classification The area spans both the Mixedwood Section and the Lower Foothills Section of the Boreal Forest Region of Canada (Rowe, 1972). According to the Ecoregion Classification of British Columbia, the area is located in the Boreal Plains Ecoprovince, spanning the Central Alberta Uplands Ecoregion and the Peace River Basin Ecoregion (Demarchi, 1995). The study area is almost fully within the mw 1 variant (Moist Warm Subzone, Variant 1) of the Boreal White and Black Spruce (BWBS) zone of the Biogeoclimatic Ecosystem Classification (BEC) of British Columbia (Meidinger and Pojar, 1991), with some B E C digital maps provided by BC Environment's spatial data warehouse (ftp://ftp.env.gov.bc.ca/dist/arcwhse/) Crown Copyright, used with permission. Figure 3.3: The study area (DEMs and Landsat TM intersection) in relation with the BEC zones, sub-zones and variants. 35 higher elevation hills falling into the Wet Cool (wk) Subzone, Variants 1 and 2 (Figure 3.3). As noted above, the portions of the area falling into the Engelmann Spruce - Subalpine Fir (ESSF) and Alpine Tundra (AT) zones were actually masked out from the analysis by removing any area with elevation above 1100 meters, as well as the small watersheds draining into the Peace Reach of the Williston Lake east of Butler Ridge. 3.2: DATA SOURCES As discussed in section 1.2, the first objective of this thesis is to assess the fitness of available spatial information for the landscape analysis of boreal mixedwoods of Northeastern British Columbia. Several different spatial data sources are available, and used in this thesis. These include field data, satellite remote sensing images, aerial photographs, and GIS maps produced by provincial agencies: forest inventory maps produced by the BC Ministry of Forests, and Terrain Resource Information Management (TRIM) maps produced by the BC Ministry of Environment Lands and Parks. In this section I will describe briefly all data sources, giving more detail for the field data. Remote sensing images, forest inventory maps and TRIM data were subject to extensive GIS operations to produce the forest cover and terrain classification layers used in spatial analysis. The discussion of the data processing procedures is described at length later: the production of posterior probability maps of forest cover classes from the integration of forest inventory and remote sensing is described in section 3.3; the more complex procedure to obtain terrain classification from quantitative gemorphometry of TRIM maps is presented in Chapter 4. 36 3.2.1: Overview of the data used The first suite of data sources is the one used to describe forest cover, including Forest Inventory and Planning (FIP) digital maps from the B C Ministry o f Forest, two Landsat T M images, colour air photos and field data for ground control of the classification. The FIP maps used, listed in Figure 3.4, were provided in S A I F format courtesy o f the Fort St. John Forest District. I used the Feature Manipulation Engine ( F M E ) software to convert the FIP maps from S A I F to E S R I Arc V i e w files. O f the information available in the FIP database, I retained typing (to separate forest from non-forest), age and forest cover by species. To describe topography the data used are the digital elevation models ( D E M s ) o f the Terrain Resource Information Management (TRIM) digital maps produced by the Minis t ry of Environment, Land, and Parks ( M E L P ) of the Province of Bri t ish Columbia. The T R I M project was completed in 1996, with the production of 1:20,000 scale digital maps, covering 94B098 94B099 94B100 94A091 94A092 94A093 94A094 94A095 94A096 94A097 94A098 94B088 94B089 94B090 94A081 94A082 94A083 94A084 94A085 94A086 94A087 94A088 94B078 94B079 94B080 94A071 94A072 94A073 94A074 94A075 94A076 94A077 94A078 94B068 94B069 94B070 94A061 94A062 94A063 1 94A064 94A065 94A066 94A067 94A068 94B058 94B059 94B060 94A051 94A052 94A053 94A054 94A055 94A056 94A057 94A058 94B048 94B049 94B050 U4AU41 'MAD42 â€¢I4A04.I 94A044 94A045 94A046 94A047 94A048 94B038 94B039 94B040 "4A031 94A032 94A033 94A034 94A035 94A036 94A037 94A038 94B028 94B029 94B030 94A021 94A022 94A023 94A024 94A025 94A026 94A027 94A028 94B018 94B019 94B020 94A011 94A012 94A013 94A014 94A01S 94A016 94A017 94A018 94B008 94B009 94B010 94A001 94A002 94A003 94A004 94A005 94A006 94A007 94A008 Figure 3.4: The Forest Inventory and Planning map sheets used in the thesis (boldface and grey background) identified using the B C Geographic system, which divides each N T S quadrangle in one hundred map sheets numbered from 001 to 100. 37 the entire province with 7027 map sheets (GDBC, 1999). The TRIM digital elevation models are designed to produce triangulated irregular networks (TINs), and are composed of elevation points and break lines (GDBC, 1992). 3.2.2: Field Data Several sets of geo-referenced field data were collected as the structure of the research project evolved over the course of the study. In the initial design, the research was to be limited to a smaller area, map sheet 94B040 of the BC Geographic System, where spatial information was to be collected in the Terrestrial Ecosystem Mapping (T.E.M.) format. A consulting company was contracted for the preparation of the T.E.M. product. While T.E.M. is mainly based on photo-interpretation, a total of 155 sites were visited in the field in the summer of 1997. Ten sites were described using the full Ministry of Forests Ecosystem Field Forms plots, 40 sites were described using the T.E.M. Visual Inspection Form (Resource Inventory Committee, 1995) and 105 sites were described by reconnaissance data: field notes to describe the vegetation and other pertinent data to aid photo-interpretation. Al l the site positions were recorded on 1:10,000 colour aerial photographs of the area. While I accompanied the field crew of the consulting company throughout the data collection, I had no control on the sampling design or the classification scheme employed. In the summer of 1998,1 sampled polygons from the Terrestrial Ecosystem Map that was produced in 1998. The objectives of this sampling activity were twofold: firstly to check the accuracy of the T.E.M. product; secondly to collect data that would enable me to merge some of the numerous T.E.M. classes if they did not differ significantly in forest 38 composition. The sampling scheme used was a stratified random sampling, with the polygons as sampling units and the T.E.M. classes as strata. The T.E.M. specifications allow a polygon to be composed of up to three different ecosystem classes. This poses some serious problems in sampling polygons constituted by classes that are expected to be similar in forest structure, because one cannot know which of the classes one is sampling. For this reason, the selection was limited to homogenous or non-dubious polygons, non-dubious polygons being defined as polygons whose components have high contrast (e.g. a forest vs. a non-forest unit, or mixedwood vs. pure black spruce or black spruce / lodgepole pine units). This choice obviously introduces a risk of bias if the results are used for statistical inference about the entire polygon population, since it assumes that homogenous and non-dubious polygons are a representative subset of the population. Although this assumption is questionable, it is necessary for any sampling design based on the T.E.M. polygons. The only other alternative was to have included the heterogeneous polygons in the sampled population, and successively identified the selected ecosystem units and structural stage combinations on the ground, introducing yet another source of bias. Another problem was the selection of the strata: the T.E.M. legend for the map had 7 "ecosystem units", each with 6 "structural stages", giving a total of 42 possible strata, an excessively high number given my objectives and resources. Because the focus of the project was on mixedwood dynamics, the sampling involved only the T.E.M. ecosystem units that could have a role in the successional model of Lieffers et al. (1996a), excluding black spruce lowland and upland ecosystem units and the non-forested ecosystem units. Structural stages of 3 (shrub/herb) or lower were also excluded from the sampling. The sampling was also limited to those combinations of classes and structural types that represented at least 1% of 39 the total area. This set of selection criteria resulted in eight combinations of ecosystem unit and structural stages. An initial round of sampling was targeted at 24 polygons, 3 per class. The UTM co-ordinates of the polygon centroids were extracted from the GIS version of the T.E.M. map and entered into a Global Position System (GPS) unit for navigation. The centroids were identified in the field using the GPS. At each point, we estimated canopy cover and measured basal area. Basal area was sampled on a 200 or 400 square meter area (a plot of 2X50, 4X50 or 2X100 m). Plot size selection was based on density and homogeneity of tree layer: if the area was dense and homogenous a 2X50 m plot was used. A 4X50 m plot was used in areas judged to be homogeneous in trees distribution but with lower stem density, while the 2X100 m plot was used in areas with a patchy structure. The different plot sizes account for the wide differences in stem density and average stem size found in the stands. To assess canopy height and time since the last stand replacing disturbance, we measured the height and took an increment core of the three largest diameter trees in each plot. In some cases it was not possible to obtain sound cores, especially in pure aspen stands. The position of the plot centroid was recorded using the "ACU-LOCK" function of the GPS unit, which averages of 600 readings, one reading every 5 seconds. The results of the ACU-LOCK, which has the objective of compensating for selective availability error, are believed to have an horizontal RMS of 50 m. The first result of the sampling activity was the finding of a very high rate of misclassification for the T.E.M. polygons: we found 8 of 24 polygons (33%) to be grossly misclassified, i.e. their forest cover composition was totally incongruent with their ecosystem unit and structural stage definition. Surprised by the high level of inaccuracy, I cross-checked the T.E.M. map with the FIP information for congruency. Using IDRISI, I extracted the area-40 based average forest cover composition, area based average age, and minimum and maximum age of each TEM polygon from the FIP database. I then used a set of conservative congruity rules to compare the TEM classification with the FIP data. These rules were based on the TEM ecosystem unit and structural stage description. Out of 776 TEM polygons, 216 failed the consistency test (27%). As a consequence of the high error rate observed, I was decided to re-direct the sampling campaign so that reliable data sets could be produced as an alternative to the T.E.M. map. I decided to rely on the Landsat images for forest cover classification and at the same time to lay down a series of transects in the area mapped with T.E.M. to describe small scale variability within that area. I then sampled 23 points outside the 94B040 map sheet that were to be used for accuracy assessment of the remote sensing classification. The points were selected during extensive ground reconnaissance of the study area. I selected areas that would represent large targets for the TM sensor (large clearcuts or large homogenous stands), assessed canopy cover by tree species, vegetation cover, and recorded the geographic position using the ACU-LOCK function of the GPS unit. The arbitrary sampling enabled me to cover a wide range of forest cover conditions. Within map sheet 94B040 I laid out ten 1.2 km arbitrarily selected transects. The transect layout was designed to avoid large wetlands, clearcuts, roads and concentrations of seismic lines, while ensuring the most continuous forest cover possible, high variability in topography, vegetation and surficial material. Each transect consisted of a succession of 50 plots spaced 25 meters apart (for a total distance covered of 1225 m). On each plot I measured tree basal area by species (for DBH > 2.5) on two concentric circular plots: a 100 m 2 plot for DBH<22.5 and a 200 m 2 plot for DBH>22.5. The ecological rationale in the use of concentric plots with different diameter 41 cut-offs is that larger trees will probably influence a larger area with their canopy. A visual estimate of vegetation cover in the shrub and herb layers was taken in the 100 m 2 area. I also collected a sample of the upper 15 cm of the mineral soil for texture assessment, while humus form (Mor/Moder/Mull) and depth to mineral soil were recorded. Slope, aspect and slope position were recorded at every point, while at points 1, 25 and 50 of the transect a GPS ACU-LOCK position was acquired for geo-referencing. Preliminary analysis of the transect data for small-scale variability was not encouraging, because of the limited length of the transects, so I decided not to pursue it any further and to concentrate on the large scale patterns and on the classification of forest cover using remote sensing and forest cover maps. Use of the field data was restricted to building training sites and to ground-truthing. 3.3: CLASSIFICA TION OF FOREST COVER USING REMOTE SENSING AND FOREST INVENTOR Y MAPS 3.3.1: Rationale To analyse the forest cover distribution, I wanted a forest cover spatial database that was independent from the topographic information, and had consistency of resolution and consistency of quality over the entire study area. To accomplish this, I integrated multispectral optical remote sensing with the FIP forest inventory maps. Forest inventory polygons are delimited by the photo interpreter subject to a minimum mapping unit constraint. In the BC forest inventory data, this is generally 2 ha for well-defined boundaries and 5 ha for indistinct boundaries (Resources Inventory Task Force, 1999). The interpreters are also required to simplify complex polygon boundaries, and 42 boundaries may be additionally simplified where they are close to administrative boundaries or the map sheet neatline (Resources Inventory Task Force, 1999). I used the remote sensing information to break down forest cover polygons into components smaller than the minimum mapping unit used by the photo interpreter. The advantage of this strategy is that the level of consistency of classification accuracy and resolution over the study area is increased, since the remote sensing acquisition conditions are the same for the entire area, as are the classification procedure and the interpreter. Such an application has been described recently by Wulder and Franklin (2001), although I developed my approach independently from their work. 3.3.2: Remote Sensing Images Used and Pre-Processing 3.3.2.1: Images used For the classification of forest cover, I acquired two Landsat TM quarter scenes (Table 3.1) with data in bands 3, 4 and 5 (red, near infrared and middle infrared). The first image was taken on March 14th 1994 and the second on June 2 n d of the same year to maximise the phenological differences in the hardwoods (Ghitter et al., 1995). The images were provided by Radarsat International on CD-ROM in the BSQ (Band Sequential) format and were subsequently converted into formats compatible with the Idrisi and Arc-View software. Table 3.1: Characteristics of the satellite images used. Satellite Sensor Scene date Bands Track/Frame Scene Centre Landsat 5 TM 14 Mar 1994 343 49721 56Â° 25'24"N 121Â° 39' 35"W Landsat 5 TM 02Junl994 345 49/21 56Â° 25'24"N 121Â°42'01"W 43 3.3.2.2: Image pre-processing The first operation I performed was to correct the images for atmospheric scattering. This was accomplished using the method described by Chavez (1988), which, unlike more sophisticated methods (e.g. Gilalbert et al, 1994), does not require all the visible bands. Next, it was necessary to geo-reference the satellite imagery to the reference system used by the GIS information (UTM projection, Zone 10 NAD 83). Because the satellite position changes with time, it was necessary to geo-reference the two dates separately. Using the TRIM line-work and a colour composite image in Arc-View I identified and digitised about 150 ground control points (GCP's), such as road and seismic line intersections, that were clearly identifiable both in the Landsat image and on the TRIM line-work. The different sun position in the two images made some features, such as seismic lines, more clear in one image than the other, so the sets of GCP's of each images are not exactly the same. I used the control point co-ordinates with the Idrisi32 RESAMPLE module to register the images to the UTM reference system. I obtained a root-mean-square error (RMSE) of 0.57 pixels (17 m in the nominal resolution of the Landsat TM) using a second order (quadratic) polynomial fit with 111 GCP's for the March and 108 GPC's for the June image. To reduce the amount of image degradation from resampling, I carried out the image classification procedure in the image co-ordinates of the June image. This means that only the March image had to be resampled prior to image classification. This was done using 102 of the same GCP's used to relate the two images to the UTM system, and it was accomplished with a RMSE of 0.57 pixels. 44 3.3.3: Image classification 3.3.3.1: Classification method The classification methodology chosen was supervised classification, using a maximum likelihood classifier on the band data, integrated with the ancillary information from the FIP data. Maximum likelihood classification is the most commonly applied classification procedure in supervised classification of forests from optical satellite remote sensing (Wulder, 1998). Maximum likelihood uses statistical rules to determine the membership of each image pixel to a particular forest class based on the "spectral signature" of that class as obtained from training sites. For this research, I used the BAYCLASS classification routine of Idrisi32 (Eastman, 1999). The BAYCLASS routine is a classification method based on the maximum likelihood classifier which produces maps of class membership probabilities based on Bayes' theorem (Eastman, 1999). I decided to use the BAYCLASS method for two reasons: 1) I wanted to introduce the ancillary information from the forest inventory in the classification. 2) I wanted to preserve the classification uncertainty. The maximum likelihood classifier assumes a multivariate normal distribution of the values from the information channels used. Ancillary information, like the information coming from the forest inventory maps, violates this distribution assumption, and is not well suited to be introduced in the maximum likelihood classification process. An alternative approach, to which the BAYCLASS module is well suited, is to produce class membership probabilities using the maximum likelihood classifiers, and then integrate them with the 45 forest inventory information to complete the classification, as it will be described in section 3.3.3.3. 3.3.3.2: Classes used and training sites delineation Considering the objectives of the classification, I was interested in distinguishing between different types of forest cover, based on species composition. Since I was not interested in classifying non-forest land cover, I used the FIP maps to mask out all non-forest land. I devised an eight-class forest classification scheme that accounted for the major forest types that had been described in the Terrestrial Ecosystem Mapping expanded legend for map sheet 94B040 (D. Marcoux, 1998 unpublished data), as well as my observations during Table 3.2: The eight classes of the forest cover classification used. Class name Description HDW Hardwood stands: dominated by trembling aspen, balsam poplar or birch SBLW Black spruce lowlands: stands dominated by black spruce and sphagnum, with possible presence of lodgepole pine and tamarak. SBPL Upland Black Spruce/Lodgepole Pine: Upland stands dominated by black spruce and mosses, often with abundant lodgepole pine. Generally very dense stands with poor growth. OB Bare Openings: where bare soil dominates the reflectance: recent clearcuts, recent bums. OV Vegetated Openings: openings dominated by herbaceous or shrubby vegetation, including NSR and NCB classes of the inventory as well as regenerating clearcuts where the canopy has not closed. MWD Mixedwood stands: stands with hardwood, mostly trembling aspen, and conifers, lodgepole pine and/or white spruce, with the minority component having over 15% canopy cover. PL Stands dominated by lodgepole pine SW Stands dominated by white spruce 46 field work. The classes are reported in Table 3.2. Initially I attempted to distinguish between aspen/white spruce, aspen/lodgepole pine, and aspen/white spruce/lodgepole pine mixedwood stands, but the three classes had extremely poor separability, so I decided to collapse them into a single class. To delineate the training sites I used the field information described in Section 3.2.2, together with 1:10,000 colour aerial photos where all of the T.E.M. field plots had been typed, as well as a colour composite of the June image. The field data from summer 1998 was entered in a GIS layer and overlaid on colour composite images of the remote sensing data. The T.E.M. control points had been typed on the air photos. By visual interpretation of the colour composite images (one for June and one for March) and the aerial photos, it was possible to extend the information from the field plots to delineate training sites polygons for each class that had sufficient size and were not compromised by too many edge pixels, where the spectral response was likely influenced by more than one class. It must be noted that a large number of T.E.M. field plots could not be used because they were often collected at the edge of patches of continuos homogenous forest cover, rather than at the centre of the patch. Since they were designed to aid the interpretation of the 1:10,000 aerial photos, the T.E.M. plots were often located in patches too small to provide a reasonable sized target for the TM sensor. The need to find large patches is due to the resolution of the TM sensor (30X30 m) but also to the spatial positioning errors inherent in the geo-referencing of the TM images (both to the UTM gird and to each other) as well as the spatial errors in the location of the field plots from GPS selective availability. In this phase the transect were very useful since they provided almost continuous forest cover information for long tracts of forest, the edge of two consecutive transect plots being only 5 m apart. 47 Some additional polygons for the least represented classes (SW and PL) were obtained only by visual interpretation of the June satellite image backed by the forest inventory data. In the end, I delineated 87 training site polygons for a total of 11,208 pixels (Table 3.3). The BAYCLASS module of Idrisi32 is limited to a maximum input of 7 channels. I had 6 basic image channels available (bands 3, 4, and 5 for two dates) with the possibility of adding other information channels obtained by processing these 6. In particular, I wanted to use the Normalised Difference Vegetation Index (NDVI) as an additional information source, since Hall and co-workers (Ghitter et al, 1995; Hall and Klita, 1997; Hall et al, 2000) successfully used it in classification of boreal mixedwood stands. I calculated the NDVI for each date using TM band 4 as the near infrared channel, and linearly transformed the results to byte values so they could be used in the classification routine. This left me with 8 possible information channels to use. To decide which combination of channels would produce the best results, I first compared the signature separability statistics, using the multispectral classification tools of PCI Geomatics to calculate the Bhattacharyya distance (B-distance) separability measure Table 3.3: Number of training site pixels for each class: Class Pixels HWD 2,975 SBLW 750 SBPL 732 OB 1,740 OV 1,249 MWD 1,631 PL 1,315 SW 816 Total 11,208 48 (Bhattacharyya, 1943). I tried six different combinations, dropping alternatively bands 4 or 5 or the NDVI from each date. Al l the combinations performed similarly, both in the highest average separability and in the highest minimum separability, which in all cases was between PL and SBPL, with the two combinations that drop band 5 for one of the dates performing better than the others (Table 3.4). To decide between these two combinations, I compared their accuracy in a maximum likelihood classifier when used without the aid of the FIP information. To asses the classification accuracy I used those field plots that had not been used in the training sites, as well as 120 photo-interpreted points which I sampled systematically in the area covered by the colour air photos, using an hexagonal grid design, for a total of 259 points (Table 3.5). Table 3.4: Average and minimum Bhattacharyya separability measures for 6 different combinations of seven of the eight available channels. Bands Combination Average B-distance Minimum B-distance March: 3 5 NDVI; June: 3 4 5 NDVI March: 3 4 NDVI; June: 3 4 5 NDVI March: 3 4 5; June: 3 4 5 NDVI March: 3 4 5 NDVI; June: 3 4 5 March: 3 4 5 NDVI; June: 3 5 NDVI March: 3 4 5 NDVI; June: 3 4 NDVI 1.838384 1.855565 1.830334 1.837321 1.843381 1.873538 1.085724 1.126493 1.093846 1.026297 1.028498 1.179610 49 The difference in accuracy between the two combinations used is very small, with the combination that had lower signature separability having a higher accuracy level. Overall, these statistics give little reason to prefer one set of channels over the other. I decided then to use the combination with TM bands 3, 4, and NDVI for March and bands 3, 4, 5, and NDVI for June, considering that the TM band 5 for the March image is mostly separating the snow cover from the bare soil, a distinction that is counterproductive in this classification exercise. 3.3.3.3: Integration of forest inventory information and remote sensing information using Bayesian probability theory. The output of the BAYCLASS module is comprised of several probability maps, one Table 3.5: Confusion matrixes for maximum likelihood classification using two different band combinations. EC are the percent errors of commission, EO the errors of omission. March 3 4 5 NDVI; June 3 4 NDVI HWD OB MWD OV PL SBLW SBPL SW EC HWD 10 0 4 0 0 0 0 0 29% OB 0 11 0 0 0 0 0 0 0% MWD 6 0 54 0 1 0 0 3 16% OV 6 0 7 36 1 2 0 0 31% PL 0 0 9 0 7 1 2 5 71% SBLW 0 0 1 0 1 10 3 5 50% SBPL 0 0 5 0 3 14 25 3 50% SW 0 0 8 0 0 4 4 8 67% EO 55% 0% 39% 0% 46% 68% 26% 67% Overall accuracy 62% March 3 4 NDVI; June 3 4 5 NDVI HWD OB MWD OV PL SBLW SBPL SW EC HWD 9 0 4 0 0 0 0 0 31% OB 0 11 0 0 0 0 0 0 0% MWD 6 0 50 0 1 1 0 2 17% OV 7 0 5 36 0 0 0 0 25% PL 0 0 13 0 7 0 1 3 71% SBLW 0 0 3 0 2 13 3 4 48% SBPL 0 0 4 0 3 15 27 6 51% SW 0 0 9 0 0 2 3 9 61% EO 59% 0% 43% 0% 46% 58% 21% 63% Overall accuracy 63% 50 for each class, reporting the posterior probability, according to Bayes' Theorem, that each pixel belongs to that cover class. If equal prior probabilities are used, the BAYCLASS outputs are just the result of scaling the likelihood from the maximum likelihood classifier to sum to one (Eastman, 1999). Bayes' Theorem can be used again, using the forest inventory maps to produce prior probability maps, with the BAYCLASS results as evidence, to produce posterior probability maps that include the information from remote sensing as well as from the forest inventory maps. Consider the formula of Bayes' Theorem: where: p{hi\e) is the posterior probability of the hypothesis / being true given the evidence e p(e\ hi) is the probability of the finding the evidence e when the hypothesis i is true, in my case corresponding to the BAYCLASS result p( h,) is the probability of the hypothesis being true independently of the evidence, in our case obtained from the forest inventory maps. The summation term in the denominator scales the probabilities assuming that the hypotheses are exhaustive, that is, that one of the hypotheses presented must be true. I implemented this approach using the BAYES module of Idrisi32 (Eastman, 1999), using the FIP maps to produce the prior probability layers. 3.3.3.4: Production of prior probability maps and classification results P(hi I e) = pjelhj-pjhf (3.1) 51 â€¢ P(OV) -P(OB) 0 10 20 30 40 50 60 Stand Age Figure 3.5: Prior probabilities for the bare opening, p(OB), and vegetated opening, p (OV) , cover types calculated from the stand age. The FIP records the percent cover for the leading tree species in each polygon. These values were used to produce raster maps o f percent cover for each species, at the resolution of the Landsat image. A raster map o f stand age was also produced. I then used the following rules to produce basic probabilities for the forest cover classes: 1) The cover of trembling aspen (At) and poplar (Ac) were summed to obtain a basic hardwood cover. 2) I calculated a basic mixedwood probability p(MW) from this hardwood cover using a linear fuzzy membership function so that pixels with hardwood cover lower than 5% or higher than 95% have a mixedwood prior probability o f zero, 52 pixels with hardwood cover of 15% to 75% have a basic probability of 0.5 of being mixed at the pixel level, with intermediate values linearly interpolated. 3) I calculated the probability of the opening types from the age class data, using a fuzzy membership function, as shown in Figure 3.5. 4) It is now possible to calculate the basic probability of the non-opening cover types p(HWD) = (At + Ac) â€¢ (l - p(MW)) â€¢ [l - (p(OB) + p(OV))] p(PL) = â€¢ (1 - p(MW)) â€¢ [1 - (p(OB) + p(OV))) p(SW) = (Sw) â€¢ (l - p(MW)) â€¢ [l - (p(OB) + p(OV))] p(SBLW)-p(SBPL) = 'Sb ^ â€¢ + Lt Sb + Pl â€¢(l-p(MW))-[l-(p(OB)+p(OV))] (l-p(MW))-[l-(p(OB)+p(OV))) (3.2) p(MWD) - p(MW) â€¢ [l - (p(OB) + p(OV))] where PI is the proportion of lodgepole pine, Sw is the proportion of white spruce, Sb is the proportion of black spruce and Lt is the proportion of tamarack. 5) To account for spatial positioning errors as well as simplification of the boundaries in the forest cover mapping, I applied a 7X7 Gaussian filter to each prior probability map using the FILTER module of Idrisi32. 53 HWD prior probability Figure 3.6: The relation between the raster map of HWD prior probability and the FIP map in a section of the study area. This procedure resulted in basic prior probability maps for each cover class (Figure 3.6). Using these maps directly with the BAYES module implies the assumption that both the FIP information, and that the rules used to translate FIP information into cover class probabilities are 100% correct. To avoid using this assumption it is possible to modify the 54 maps using a coefficient t, between 0 and 1, to account for possible uncertainty in FIP mapping: where n is the number of hypotheses h (i.e. the number of cover classes). When t is equal to zero the classification is based entirely on the remote sensing evidence, while for t equal to one, the classification is limited to the possibilities given by the FIP map, no matter how heavily the remote sensing evidence would weigh for an hypothesis ruled out by the FIP information. To select a value for /, it is possible to compare again the classification results. Using the M A X B A Y module of Idrisi32, each pixel is assigned to the class that has the largest posterior probability. The resulting maps (one for each value of/) can be then assessed for accuracy against the ground control points available. The ranking of the different choices can be based on several accuracy assessment measures (Stehmen, 1997). These include the overall accuracy, the rates of errors of commission and errors of omission for each class, which are the complements of the user's and producer's accuracy, the kappa coefficient as well as the tau coefficient (Stehmen, 1997). As Stehmen (1997) points out, overall accuracy, as well as the rates of the errors of omission and error of commission, can be interpreted directly in probabilistic terms. For this reason I decided to use them instead of the more commonly used kappa or tau coefficients. The results for t equal to 0.5, 0.75 and 1 are shown in tables 3.6 to 3.8 (the accuracy of the classification for t = 0 is the same as the classification without the use of the FIP information). The cells entries are the number of samples in each combination of ground (3.3) n 55 control and classification class, while EC. is the percentage of errors of commission and E.O. is the percentage of errors of omission. Overall the introduction of the forest cover information increases classification accuracy, although not greatly nor significantly: if one compares the success rates of the classifications with an exact binomial test, it is not possible to reject the hypothesis that they all come from a population with a true success rate of 0.63, that of the classification without forest cover information. Especially similar are the results for t's of 0.75 and 1, with a difference in overall accuracy of only 1%. Moving to the results for the single classes, it is clear that the largest single source of confusion is between the conifer classes, especially between the two black spruce dominated classes. If the two classes are collapsed together, the overall accuracy for r = 1 becomes 75%. While the error rates for most classes decrease with increasing values of t, there are some exceptions. Table 3.9 shows the estimated error rates, omission and commission, for each class, for the different values of t. As it can be seen, not all the error rates decrease when t is assigned its possible maximum value of 1. The class whose classification is mostly improved by a t of 1 is SW, while the accuracy for MWD is the largest for a t of 0.75, but the changes are no large. The slight increases in accuracy for t = 1 come at the expense of the number of pixels classified. Because of rounding errors, some pixels are assigned by the maximum likelihood classifiers a probability of zero in some classes. If these are the only classes with a non-zero prior probability, as defined by the FIP information layer, the pixels cannot be classified, hence the smaller sample size in Table 3.8. Considering the small differences in classification performance, I then decided to use a r of 0.75 for the analysis. 56 Table 3.6: Error Matrix Analysis of ground control points (columns : truth) against M A X B A Y results for t = 0.5 (rows : mapped). EC are the percent errors of commission, EO the errors of omission. Forest Cover Type HWD OB MWD OV PL SBLW SBPL SW EC Total HWD 13 0 5 1 0 0 0 0 32% 19 OB 0 10 0 0 0 0 0 0 0% 10 MWD 6 0 52 0 1 0 0 2 15% 61 OV 3 0 2 32 0 0 0 0 14% 37 PL 0 0 11 0 7 0 1 5 71% 24 SBLW 0 0 2 0 0 12 3 2 37% 19 SBPL 0 0 5 0 3 18 27 7 55% 60 SW 0 0 9 0 0 1 3 7 65% 20 EO 41% 0% 40% 3% 36% 61% 21% 70% Total 22 10 86 33 11 31 34 23 samples 250 Overa l l accuracy 64% Table 3.7: Error Matrix Analysis of ground control points (columns : truth) against M A X B A Y results for t = 0.75 (rows : mapped). EC are the percent errors of commission, EO the errors of omission. Forest Cover Type HWD OB MWD OV PL SBLW SBPL SW EC Total HWD 13 0 5 1 0 0 0 0 32% 19 OB 0 10 0 0 0 0 0 0 0% 10 MWD 6 0 55 0 2 0 0 2 15% 65 OV 3 0 2 32 0 0 0 0 14% 37 PL 0 0 8 0 6 0 1 4 68% 19 SBLW 0 0 2 0 0 12 3 1 33% 18 SBPL 0 0 6 0 3 18 27 7 56% 61 SW 0 0 8 0 0 1 3 9 57% 21 EO 41% 0% 36% 3% 45% 61% 21% 61% Total 22 10 86 33 11 31 34 23 samples 250 Overa l l accuracy 66% 57 Table 3.8: Error Matrix Analysis of ground control points (columns : truth) against M A X B A Y results for t = 1 (rows : mapped). EC are the percent errors of commission, EO the errors of omission. Forest Cover Type HWD OB MWD OV PL SBLW SBPL SW EC Total HWD 13 0 6 1 0 0 0 0 35% 20 OB 0 10 0 0 0 0 0 0 0% 10 MWD 6 0 49 0 2 0 0 2 17% 59 OV 0 0 0 32 0 0 0 0 0% 32 PL 0 0 8 0 6 0 1 4 68% 19 SBLW 0 0 2 0 0 11 3 1 35% 17 SBPL 0 0 7 0 3 17 28 5 53% 60 SW 0 0 8 0 0 2 2 11 52% 23 EO 32% 0% 39% 3% 45% 63% 18% 52% Total 19 10 80 33 11 30 34 23 samples 240 Overall accuracy 67% Table 3.9: Errors of commission and omission in the M A X B A Y classification for different values of the FIP confidence parameter t. Errors of Commission Errors of Omission t 0 0.5 0.75 1 0 0.5 0.75 1 HWD 31% 32% 32% 35% 59% 41% 41% 32% OB 0% 0% 0% 0% 0% 0% 0% 0% MWD 17% 15% 15% 17% 43% 40% 36% 39% OV 25% 14% 14% 0% 0% 3% 3% 3% PL 71% 71% 68% 68% 46% 36% 45% 45% SBLW 48% 37% 33% 35% 58% 61% 61% 63% SBPL 51% 55% 56% 53% 21% 21% 21% 18% SW 61% 65% 57% 52% 63% 70% 61% 52% 58 C H A P T E R 4: T E R R A I N C L A S S I F I C A T I O N F R O M T R I M D E M D A T A 4.1: BUILDING DEMs FROM TRIM. 4.1.1: A n Introduction to the production of D E M s from T R I M files and collection line artefacts The Digital Elevation Models (DEMs) I used in this thesis are produced from the Terrain Resource Information Management (TRIM) product of the Province of British Columbia, Canada, a 1:20,000 scale digital map. TRIM DEMs are produced directly from photogrammetric models, and are designed to produce triangulated irregular networks (TINs), and are composed of elevation points and break lines (GDBC, 1992). Elevation points are distributed on a semi-regular grid, and are collected in sequence along a specific direction, generally north-south (GDBC, 1992), so that groups of elevation points measured sequentially, here referred to as "collection lines", are generally recognisable. In some TRIM maps, especially in areas with limited topographic variability, elevation values within collection lines have a higher degree of spatial autocorrelation than elevation values between collection lines. This produces topographic artefacts, specific errors of systematic nature that appear like "stripes" or "ridges" aligned with the collection lines. These collection line artefacts are particularly evident in analytical hill-shading images or in aspect maps (Figure 4.1). In the next sections I will describe in more detail the nature of these artefacts, and why they are problematic to my research. I will then introduce a filtering technique I devised to mitigate the artefacts impact on the DEMs, describe its implementation and its impact on basic terrain variables that can be derived from the DEM like slope and aspect. 59 4 . 1 . 2 : Collection line artefacts as spatially structured elevation error. DEM production processes are known to produce artefacts. An example is that of contour line "ghosts" in USGS DEMs (Guth, 1999), which are a consequence of one of several problems related to the interpolation of DEMs from contour line maps (Carrara et al. 1997). Collection line artefacts in TRIM appear as "stripes" aligned with the collection lines o o 5 5 4 0 0 0 5 5 6 0 0 0 5 5 8 0 0 0 U T M X gure 4 .1 : The DEM points are superimposed to an analytical hill-shading for part of TRIM map sheet 94B.060, showing the collection line artefacts. The area in the white square and the line of white points are the ones used in the example in Figure 4.2. of the DEM points when the TINs are converted to raster DEMs. The perfect alignment with the collection lines, and the fact that the "stripes" do not conform to the watershed pattern delineated by watercourses in the TRIM line files, made them recognisable as artefacts and not authentic topographic features. I confirmed this by field and aerial photograph observation for one of the map sheets, and by visual inspection of Landsat 5 satellite images for the rest of the map sheets. Similar artefacts have been described by Garbrecht and Starks (1995) in USGS 7.5-minutes DEMs produced with the manual profiling method. Collection line artefacts can be assumed to be a consequence of a specific spatial distribution of elevation errors. For the stripes to appear, part of the elevation error must have high autocorrelation in the direction of the collection lines, and low autocorrelation in the orthogonal direction. In this, the collection line artefacts are similar to the scan-line noise (Crippen, 1989) or "striping" (Pan and Chang, 1992) of satellite imagery, and generally conform to the expected error structure of any measurement that is performed in sequence. A possible error model for DEM data points is the following: Z(x,j)=z(x)+k(j)+e (4.1) where Z is the measured elevation at the point x in geographical space and position j in a collection line, z is the real elevation at the same point, k is an error term which is only a function of the position on the collection line and e is a residual error term. Note that because we have only one measure of Z at each (x,j), it is not possible to attempt an estimate of k at a given j without making assumptions on the spatial distribution of k and z. However, only three assumptions are necessary to produce the collection lines artefacts: k must be autocorrelated along j, k must be larger than e, and z must be autocorrelated in x. As will be shown in section 4.1.5, the presence of line collection artefacts does not necessarily indicate a violation of the accuracy standards for the TRIM map product. Indeed 61 it is possible to assume that the distribution ofk + e meets the mapping accuracy standards, with variance lower than the square of the Root Mean Square Error (RMSE): VAR(k + e)<RMSE{Zy (4.2) The accuracy specification of the TRIM product for spot elevation points is an elevation Linear Mapping Accuracy Standard (LMAS) of less than 5 m, corresponding to a maximum RMSE of 3 m (GDBC, 1992). When the DEM points are irregularly distributed, the specifications for the spacing of the points are expressed in terms of minimum point density, with an average spacing of 100 m in areas of slope less than 25Â°, and of 75 m in areas of slope higher than 25Â° (GDBC, 1992). As a consequence, the RMSE of the slope calculated by linear interpolation between two independent points is 6 % when the points are 100 m apart, which is sufficient to create artefacts in low relief areas. It is well recognised that spatial distribution of elevation errors influences local operations that are performed on DEMs (Lee et al, 1992; Hunter and Goodchild, 1997; Fisher, 1998; Heuvelink, 1998). Terrain parameters that are based on the first or second derivative of the DEM surface, such as slope, aspect, and surface curvature, are particularly influenced by the spatial autocorrelation of DEM errors. In DEMs where the spatial autocorrelation of errors is high compared with the size of the window used for parameter estimation, the error will behave like a local bias with less influence on surface shape. Conversely, DEMs with low spatial autocorrelation of error will introduce the most noise in the estimates of these parameters (Hunter and Goodchild, 1997; Heuvelink, 1998). This is why low-pass filters are traditionally employed to mitigate the effect of isotropic elevation errors on parameter estimations by removing the high-frequency component of the DEM, either in the space domain with moving-window averaging filters or in the frequency domain using Fourier transformation (Pellegrini, 1995). 62 Collection line artefacts have strong anisotropy in the autocorrelation of the error, and influence parameter estimation differently than an isotropic autocorrelated error. The error influence on slope will be greater or lesser depending on the orientation of the collection lines relative to the orientation of the terrain surface. Since the collection line orientation in the TRIM product is mostly the same for the entire map sheet, aspect estimates will not only be locally erroneous, but also globally biased toward the direction orthogonal to the collection lines. Collection line artefacts will also negatively affect the parameterisation of drainage features in the same way as the striping patterns in USGS 7.5-minute DEMs (Garbrecht and Martz, 2000). This is why I decided to mitigate line collection artefacts with an appropriate filter prior to performing local operations on the DEM data. 4.1.3: The filter employed: line-based cross-smoothing. Because of their specific structure, collection line artefacts are easier to identify and target for removal than less structured error. The artefacts should be removed using a filter that targets the component of the elevation signal that is autocorrelated along^, but has no autocorrelation inx (Equation 4.1). Also, to avoid the effect of interpolation errors, the filter should operate, when possible, directly on the measured points and not on an interpolated surface. This last requirement is especially necessary because collection lines can have different lengths, spacing and orientation in a map sheet, ruling out the use of filters based on Fast Fourier Transformation, that have been successfully used in removing striping artefacts in remote sensing images (Pan and Chang, 1992; Watson, 1993). 63 The filtering algorithm I developed, line-based cross-smoothing, is a modification of Crippen's (1989) filtering routine, and it was written as a series of functions in the S-plus command line language, taking advantage of the matrix operations of that language. It consists of seven main processing steps (Figure 4.2). â€¢ Original elevation Filtered surface si New elevation * Residuals from filtering Loess-smoothed residuals [lire 4.2: The main steps of the cross-smoothing algorithm: a) a collection line is identified (step 1 of the algorithm) and a surface is interpolated around it (step 2); b) a low-pass filter is applied to the surface in the direction orthogonal to the collection line (step 3); c) a loess interpolator is applied to the residuals of line points from the filtered surface (steps 4,5, and 6); d) the smoothed residuals are subtracted from the original elevation creating a new elevation for each point (step 7). The step numbers in parenthesis refer to the explanation in the text. 64 1. The data points are assigned to individual collection lines, based on their entry order in the database, their relative distance in UTM co-ordinates, and the orientation of the segments defined by each sequence of two points. Collection lines are defined as sequences of points where the Manhattan distance between two consecutive points is lower than a specified threshold, and the difference in azimuth between two consecutive segments is within a specified threshold. This step requires the definition of three parameters: XMAX and YMAX, the maximum distances in the x and y directions between two consecutive points before a line is broken, and AMAX, the maximum change of azimuth allowed between two consecutive segments before a line is broken. 2. A surface grid is then interpolated around each collection line. The grid is aligned with the collection line, so that it is possible to apply directional filters that are orthogonal to the collection line. This step is necessary to separate x from j as much as possible, by considering only the variation of Z in the direction orthogonal to j. The nodes are spaced to minimise the average distance from each original point to a node in the grid. The algorithm uses the interp function of S-plus (MathSoft, 2000) for the interpolation. The resolution of the grid, RES, is the most important parameter in this step. The other parameters are the number of points used for the interpolation of each node, NPSINT, and the size of the grid in the direction orthogonal to the collection line. The size of the grid should be chosen so that it can accommodate the filter described in the next step. 3. A low-pass filter is applied to the grid in the direction orthogonal to the collection line. The filter employed is a simple mean filter with fixed kernel size LEN. The product of LEN and RES gives the dimension of the kernel in map units, and should be calibrated to fit the spacing of the collection lines, so that the kernel spans a few collection lines, 65 keeping in mind that the spacing of the collection lines is generally not constant over a DEM. 4. Each original point in the collection line is assigned a filtered value by inverse distance weight (IDW) interpolation (Lam, 1983) of the four closest nodes of the grid. 5. The difference between the original elevation value and the filtered value, R, is recorded for every point. This residual value can be considered composed of the error k, assumed to have low frequency along the collection line, and hz, the high-frequency component of elevation in the direction orthogonal to the collection line. R = hz + k (4.3) 6. A loess interpolator, a localised regression function, is then applied to the R values along each collection line. The interpolator is used to separate k from hz, since hz is assumed to have high frequency also along the collection line. The loess function of S-plus is used, using the distance of each point from the beginning of the collection line as the predictor variable. Only lines that reach a threshold minimum number of points MINPS are processed. The scale parameter for the loess function is the span, the size of the kernel where the local regression is applied. In the S-plus loess function, the span is specified as a fraction of the total number of points. In the algorithm implementation the span is calculated for each individual collection line, so that the same number of points (NPOINTS) is included in the kernel for each collection line. 7. A new filtered elevation estimate FZ is than obtained by subtracting the loess interpolation result from the original measurement: FZ = Z-loess(R) (4.4) 66 The loess interpolator was chosen because it allows for unevenly spaced points along the collection line, it does not require any specific assumption about the relation between k and the position on the collection line, and it is relatively robust and can be implemented even when collection lines are relatively short. 4.1.4: The probability-based replacement function. While the filter described above will work well in many cases, sometimes it can produce values of FZ that depart from Z more than it would be reasonable to expect considering the assumed distribution of k. For this reason I developed a probability-based replacement function, which I used to constrain replacements of Z with FZ within an acceptable range. This function is based on the concept that the probability that FZ represents a better estimate of the true elevation decreases when the difference AZ between FZ and Z exceeds certain thresholds, based on the mapping accuracy standards. Equation 4.2 assumes that the total error (line collection plus spatially unstructured) is distributed within the mapping accuracy standards. Let us add the assumption that the error is normally distributed with mean zero. This later assumption is not always justified in real DEMs (Monckton, 1994), but it is the assumption behind the use of RMSE as a single descriptor of DEM accuracy (Li, 1988). Based on this assumption, it is possible to calculate the probability p(AZ) that the difference between the measured elevation and the unknown real elevation would be equal or lower than AZ. Where p(AZ) is higher than a certain threshold, it is reasonable to retain the original Z value as a better estimate of the real z. Instead of using a single threshold, a "fuzzy" threshold can be employed. The fuzzy replacement function is defined by two probability levels pi andp2. The first probability, p,, is the one at which the filtered value substitutes the 67 original value. The probability p2 is the one at which the original elevation value is maintained. At intermediate probabilities, the new elevation NZ is calculated as the weighted average of the two possible elevations: NZ = (l - a)Z + aFZ where a 1 for p(AZ)>p p(AZ)-p2 Pi-Pi 0 for p2 < p(AZ) < P l (4.5) for p(AZ) < p2 The choice of the thresholds determines the amount of variation that the filtering process is allowed to impose on the original data points. The fuzzy threshold allows for a smoother transition between filtered and non-filtered values in areas where the filtered values might depart gradually from the original values. 4.1.5: Implementation The parameter values for the filtering algorithms used in this study are reported in Table 4.1.1 applied the cross-smoothing algorithm described above, including the replacement function, to all 63 available TRIM map sheets. For reasons of brevity, the results for only two map sheets, chosen among those most plagued by the line collection artefacts, Table 4.1: Parameters used in the implementation of the line-based cross-smoothing filter (see text for explanation). Parameter Value XMAX, YMAX 230nT AMAX 15Â° RES 20 m NPSINT 6 LEN 15 MINPS 9 NPOINTS 7 P, 0.850 P 2 0.997 68 are reported here. These are the map sheets 94B.020 and 94B.060 in the British Columbia Geographic System of Mapping (GDBC, 1992). Table 4.2 reports the summary statistics for the point files before and after the cross-smoothing. The number of points affected by the cross-smoothing is lower then the overall number of points because RES cannot not be calculated at the edges of the DEM, where it is not possible to apply the orthogonal filter. FZ is also not calculated for any individual collection line that has a number of points with valid RES lower than the MINPS parameter, the minimum for the loess function to operate. For all of these points, the original Z values are retained. The summary statistics for change in Table 4.2 are calculated for those points where a FZ value was calculated. For this applicationpx was set to 0.850 andp2 to 0.997. Given the RMSE of TRIM files, the range of possible change for each point was limited to Â± 4.37 m. It can be seen that in each map sheet the mean of the change is slightly positive; that is, the cross-smoothed DEMs are slightly lower then the original ones. Although both values are Table 4.2: Summary statistics of the three DEMs elevation points and the effects of cross-smoothing: the change has non-zero mean. Map Sheet 94B.020 94B.060 N Z points 21892 25175 N points modified 20818 22153 Original Z Values Mean 831.72 767.26 Standard Deviation 127.17 68.66 Minimum * 658.00 595.00 Maximum f 1372.00 960.00 Cross-smoothed Z values Mean 831.48 767.15 Standard Deviation 127.00 68.57 Minimum 658.00 595.00 Maximum 1372.00 958.09 Change (Original - Cross-smoothed) Mean 0.24 0.13 Standard Deviation 1.56 1.78 * The original values have a precision of 1 m. 69 quite low compared with the original D E M precision of 1 m, the bias is statistically significant: the hypothesis that E ( Z - F Z ) = 0 given V A R ( Z - F Z ) = 2 R S M E 2 can be rejected at a = 0.001 (z test). Nonetheless, it should be noted that the range of the distribution has not been substantially altered by the filter. It is revealing to look at the spatial distribution of the changes, which is represented in Figure 4.3 as the difference between the raster DEMs obtained from the original and the filtered point files in a TIN, as described in the next section. Fifty-meter elevation contours obtained from the cross-smoothed D E M are superimposed on the difference images, so that it is possible to relate the filter interaction with the shape of land. The filter has performed quite well in picking up the autocorrelated noise along the lines in the flatter areas. Nonetheless, extreme change values are found in areas of more complex topography. In the east-facing slopes at the western edge of map sheet 94B.020, the filter, while still operating differently on different lines, is generally lowering the elevation. This happens again in the north-west comer and in the eastern section of 94B.060. The explanation lies in the general north-south orientation of the collection lines. The wide-kernel low-pass orthogonal filter used in step 3 of the algorithm wil l particularly affect slopes where the gradient direction is orthogonal to the collection lines. If the slope face is sufficiently wide and has the same general orientation of the collection line, the change imposed by the low pass filter does not have a high-frequency component along the collection line. This change is then indistinguishable from autocorrelated noise, and is captured by the loess function in step 6 of the cross-smoothing algorithm. Since in this area the slopes are convex, the filter effect is to lower the elevation. 70 94B.020 550000 555000 560000 550000 555000 560000 94B.060 550000 555000 560000 550000 555000 Figure 4.3: Spatial distribution of elevation change between the original and the filtered DEM for TRIM map sheets 94B.020 and 94B.060. 71 By contrast, the local minima, which are increased by the averaging fdter, tend to occur in deeply cut and meandering channels. These are generally not aligned with the collection lines, so their residuals from the horizontal fdtering have mostly a high-frequency component along the collection line and are not removed from the elevation in step 6. 4.1.6: Effect of filter implementation on slope and aspect. As discussed in the introduction, the main objective of applying the cross-smoothing fdter to the DEM data is to improve the description of the terrain shape, not to improve the DEM accuracy at any given point. To explore the effect of the fdter on DEM-derived parameters, both the original and the fdtered point files, together with the unaltered break-line files, were used to produce a TIN using the 3D Analyst extension of ESRI Arc-View GIS 3.2. Each TIN was then converted to a 30 m resolution raster DEM using the interpolation routine of Arc-View 3D Analyst, and slope and aspect were calculated from the raster DEMs. In this section, the evaluation of the performance of the cross-smoothing filter is based only on how well it removes the collection-line artefacts, identified visually and subjectively as departures from the expected natural shape of the terrain. A proper quantitative estimate would require measures of slope and aspect from a source of greater accuracy, which was not available. The method used by Skidmore (1989a) to compare different algorithms to calculate slope and aspect, where contours are hand-drawn by an operator interpolating the measurement of elevation from the grid, is not appropriate in this case as the elevation themselves are changing, and not the calculation method. The calculation of slope and aspect from a gridded DEM is influenced by the algorithm used, with different algorithms generally producing different results (Skidmore, 1989a; Jones, 1998). The quadratic regression model (Evans, 1980 cited in Wood, 1996), as 72 implemented in the LandSerf software package by Wood (1996), has proven to be the most precise method for these calculations (Florinsky, 1998a; Jones, 1998). Wood's (1996) implementation of the algorithm in LandSerf, which is freely available for download on the Internet, has the advantage that it is possible to calculate parameters with different window sizes, where the use of a wider window is akin to applying a square smoothing filter to the DEM, as the parameters are estimated at different scales. Slope and aspect were calculated from the gridded DEMs with 3X3 and 5X5 windows, corresponding to 90X90 and 150X150 meters. Both windows are wide enough to cover the signal from at least two different collection lines, given the general spacing of the lines in the TRIM data. Slope data were successively reclassified into five slope classes. The class limits used, reported in Table 4.3, are those relevant for the ecological classification of the area (Beckinghamera/., 1996). Figures 4.4 and 4.5 show the slope maps obtained from the original and filtered DEMs with the two different window sizes. In both map sheets it is possible to observe the influence of the line collection artefacts on the slope estimate in the DEM obtained from the original TRIM data: vertical stripes of different slope class appear in areas of generally homogenous slope. These artefact effects persist even when the slope is evaluated on a 5X5 window. The cross-smoothing filter is able to eliminate most of the artefact effects and retain the original detail, even when slopes are evaluated with a 3X3 window. 73 ORIGINAL CROSS-SMOOTHED Figure 4.4: Slope maps for TRIM map sheet 94B.020 as determined using the original or the fdtered DEM and a 3 X 3 or a 5 X 5 window. The area distribution among the slope classes (Table 4.3) is slightly altered by the use of the cross-smoothing fdter, compared to the use of a 5 X 5 evaluation window, which eliminates the very steep slopes and greatly reduces the area of steep slopes. What is probably of greatest interest is how the cross-smoothing fdter alters the direction of the slopes, as it is evident in the aspect maps (Figures 4.6 and 4.7). 74 ORIGINAL C R O S S - S M O O T H E D Figure 4.5: Slope maps for TRIM map sheet 94B.060 as determined using the original or the fdtered DEM and a 3X3 or a 5X5 window. The evaluation of aspect is severely influenced by the collection line artefacts in both map sheets, and again the cross-smoothing fdter performs quite well in removing the artefacts' influence, even when the smaller window is used to evaluate aspect. However, there are some straight-line features in both the aspect and slope maps that are not affected by the cross-smoothing. These features correspond to roads when the elevation measurements along the roads differ from the height of land interpolated from the point fdes. The road line-work comes from the TRIM break lines fdes and was not subject to 75 ORIGINAL CROSS-SMOOTHED Figure 4.6: Aspect maps for TRIM map sheet 94B.020 as determined using the original or the fdtered DEM and a 3X3 or a 5X5 window. cross-smoothing. As expected from the north-south orientation of the collection line artefacts, the original DEMs show constantly lower area in the north and south aspect classes than the cross-smoothed ones (Table 4.4). 4.1.7: Conclusions. Collection-line artefacts, like scan-line noise in satellite images (Crippen, 1989), are a form of error that is potentially more disruptive than its magnitude would indicate. Because 76 ORIGINAL C R O S S - S M O O T H E D Figure 4.7: Aspect maps for TRIM map sheet 94B.060 as determined using the original or the fdtered DEM and a 3X3 or a 5X5 window. of their high level of spatial structure, these artefacts greatly influence the estimate of local topographic parameters from a DEM in areas of generally low relief, and are likely to adversely affect any of the several possible subsequent GIS applications based on these parameters. The line-based cross-smoothing algorithm I devised has proven to satisfactorily remove the line collection artefacts, producing a DEM surface where calculated slope and aspect are more realistic. 77 Table 4.3: Areas (ha) of map sheets 94B.020 and 94B.060 falling in each slope class when slope is calculated from the original or the cross-smoothed DEM with window sizes of 3X3 and 5X5. Cross-smoothing does not "smooth out" the steeper slopes. Slope class 94B 3X3 window . . , cross smoothed .020 5X5 window . . , cross smoothed 94B 3X3 window n . . . cross smoothed .060 5X5 window . . , cross smoothed Flat (< 2 %) Slight (2-10%) Moderate (10-30%) Steep (30 - 70 %) Very Steep (> 70 %) 4113.27 4354.74 5970.69 6027.21 2675.97 2394.27 271.80 255.42 0.27 0.36 4479.57 3788.28 5945.40 7003.62 2405.52 2061.27 201.51 178.83 0.00 0.00 2276.37 2667.06 7233.39 7209.36 3348.81 3002.49 172.44 152.19 0.99 0.90 2541.15 1972.26 7591.41 8634.87 2819.16 2358.54 80.28 66.33 0.00 0.00 Table 4.4: Areas (ha) of map sheets 94B.020 and 94B.060 falling in each aspect class when aspect is calculated from the original or the cross-smoothed DEM with window sizes of 3X3 and 5X5. The original DEMs have lower area in the north and south aspect classes than the cross-smoothed ones. 94B.020 94B.060 3X3 window 5X5 window 3X3 window 5X5 window Aspect (Azimuth) original cross smoothed Original cross smoothed Original cross smoothed original cross smoothed No Aspect 120.24 47.88 45.99 27.00 29.79 18.00 14.94 0.00 337.5 - 22.5 985.41 1020.24 973.53 988.56 1375.47 1642.05 1500.21 1667.07 22.5 - 67.5 2204.82 2484.99 2340.54 2572.56 1692.36 1787.67 1745.10 1833.30 67.5-112.5 4569.03 4575.87 4695.30 4722.03 2196.27 2006.91 2053.80 1940.13 112.5 - 157.5 1761.30 1956.06 1812.33 1962.18 1593.45 1571.13 1647.99 1663.74 157.5 - 202.5 932.31 1048.95 931.32 1001.70 1741.50 2017.08 1835.37 2002.77 202.5 - 247.5 800.73 678.60 768.78 645.93 1724.76 1695.06 1797.66 1735.02 247.5 - 292.5 892.08 586.08 744.66 538.20 1403.46 1130.22 1216.98 1064.43 292.5 - 337.5 766.08 633.33 719.55 573.84 1274.94 1163.88 1219.95 1125.54 It did this without reducing the detail of the DEM and with modifications of the original elevations that are smaller than the LMAS of the original TRIM data. Nonetheless, the algorithm introduced a small but significant bias in the elevation measures. This could indicate that the improved topographic description is obtained at the expense of elevation 78 accuracy. One should therefore be wary of using its results in applications that are more dependent on the elevation values than they are on the shape of the land. 4.2: COMPUTATION OF TOPOGRAPHIC PARAMETERES FOR DEM-BASED TERRAIN CLASSIFICATION 4.2.1: Quantitative Analysis of Topographic Surfaces Some ecological applications in the rapidly expanding field of geomorphometry (Pike, 2000) have been introduced in Chapter 2.4.1. In this section I will describe the methodology I employed to produce the maps of four ecologically relevant topographic variables (slope, aspect, profile and plan curvature), and the terrain classification that will be used in Chapter 5 for hypothesis testing. 4.2.1.1: Quadratic approximation of the topographic surface Slope, aspect, profile and plan curvature can be defined for any continuous surface (Evans, 1980). In order to estimate them from a DEM, it is necessary to convert the gridded elevations of the DEM into a continuous surface. The surface must also be double-differentiable, if the curvatures are to be described. Evans (1980)* approximated the terrain surface using a conic equation of the form: z = ax2 + by2 + cxy + dx + ey + f (4.6) where z is the elevation and x and y are the co-ordinates of the point in the reference plane. Using equation 4.6, it is relatively simple to estimate the parameter from a regularly spaced grid. While other equation forms have been proposed and used, in particular the one 79 of Zevenbergen and Thorne (1987), Evans' approach provides the most precise estimate of the terrain parameters in presence of elevation errors (Florinsky, 1998a). At co-ordinates x = 0, y = 0, one finds that the slope G, aspect A, profile curvature kr and plan curvature kt can all be calculated from the parameters of the conic equation (Wood, 1996): G = arctan(V(/ 2 + e2) A = arctanf k. = k, -200(ad2 +be2 +cde) (e2 +d2)(l + d2 +e2y/2 -200(ae 2 +bd2 -cde) (e2+d2) (4.7) where the curvature* terms are multiplied by 100 since they are often very small. Florinsky (1998a) uses the symbols kv for vertical (profile) curvature and kh for horizontal (plan) curvature, but his formula for the calculation of the horizontal curvature is slightly different. Zevenbergen and Thome (1987) use the notation PROFC and PLANC, but their calculations are based on a different conic form. To avoid confusion I prefer using the notation kr and kh although in practical terms all these measures are dealing with the same surface properties. In Evans' (1980) original application, the terrain surface parameters were estimated at every node of the DEM grid using a 3X3 window, or nine elevation measurements. With * Wood (1996) reports an earlier work of Evans (1979), but Evans (1980) is a more accessible source. f Wood (1996) erroneously states that these formulae are for the calculation of directional derivatives, when they actually compute the curvatures, hence the term elevated to the 3/2 power. 80 nine elevation measurements, equation 4.6 can be solved using a least squares method with two degrees of freedom. Wood (1996) generalised Evans' approach to a square window whose side can be any odd integer. Wood's approach has the advantage of enabling the analysis of terrain surface at multiple scales, using the conic equation to generalise the terrain surface. Wood's approach is conceptually similar to Pellegrini's (1995) idea of simplifying the DEM using Fast Fourier Transformation and a filter in the frequency domain, but it is more elegant, since it does not require pre-processing of the data and avoids the problems of ringing artefacts that can be created by back-transforming fdtered surfaces. Another advantage of this approach is the possibility of using a weighting matrix to reduce the influence of the points at the margins of the trend surface (Wood, 1996). My implementation of Wood's approach differs from Wood's original one only in the form of the weighting function used. Wood uses a decay function of the form: w0 = (4.8) " (^+1)" where dy is the Euclidean distance in grid cells of the element from the centre of the window, and n is an exponent that determinates how fast the decay is occurring. I instead used a Gauss-like weighting function of the form 2/r Wy = e K ' (4.9) where h is the maximum distance along one side of the evaluation window and r is a decay parameter. This weighting function has a different decay shape, and is scaled on the window size, so that the weight distribution in terms of relative window position is constant, and the window size is the only factor in defining the scale of observation. This does not happen with 81 equation 4.8, where, for values of n > 0, the size of the window loses relevance as it increases, since the external elements assume increasingly smaller values, and the scale of observation is eventually mainly determined by n. Once a weight matrix has been built, the parameters of the quadratic function can be calculated using the least squares method. This is done by solving the system (Wood, 1996): f V " 1 4 V 2 2 2_,xi y> wi Zx . 3 > v/ Z*>2-w Z*<2wÂ« ^ ( V 2 ^ y< wi Z-^V Z-w3*".- Z-"y>2w< Z^.-V b Z z--^ 2^ Z V w y wi Z-^-w zw^ c z^-w Z*.2 > v. Zjc'->;'w' d Z z^w< Z V w / 2~Lxiy?wi 2Zxiyiwi Z ^ 2 ^ Z-w e Zz<-w v 2 Â» . z ^ Z^.-w Z x / w . Z w * j Jj > z.w. (4.10) which is done by inverting the first matrix. Several methods for matrix inversion are available in the Matrix library of S-plus, with each method producing different numerical results (MathSoft, 2000). For consistency of results with Wood's software LandSerf, in the implementation I chose to use L U decomposition. Because of the regular nature of the DEM grid, one needs to invert the matrix only once for any combination of window size and weight matrix. 4.2.1.2: The choice of appropriate scales for the analysis With the method described above it is possible to calculate the topographic variables at different scales using a range of window sizes, as well as a range of r values to determinate the relative weight of the centre and the periphery of the window. The minimum possible scale is dictated by the resolution of the DEM grid, while the maximum possible scale is limited by the extent of the DEM. Not all these scales will produce meaningful results, because of limitations of the data as well as limitation of the analysis method: 82 1. Very small windows are most influenced by errors in the DEM elevations, especially if the error has very low spatial autocorrelation (Lee et al, 1992; Fisher, 1998; Heuvelink, 1998). 2. The DEM grid used in the analysis is produced through interpolation from the TRIM, and it is thus a simplification of the actual topography from the TRIM sampling scheme. At very small scales, the grid surface shape is more likely to represent the shape of the specific interpolator used than the "real" topographic surface. 3. As the evaluation window gets larger, it is less likely that the quadratic equation will be a good descriptor of the topographic surface in the neighbourhood of the evaluation point, especially for low values of r. 4.2.1.3: Influence of DEM errors on topographic variables The influence of DEM errors on topographic variables has generally been dealt with using error modelling and stochastic simulations (Fisher, 1997; Hunter and Goodchild, 1997; Heuvelink, 1998). In the specific case of Evans' (1980) approach on a 3X3 window, Florinsky (1998a) has produced formulae to calculate the RMSE of topographic variables using a first-order Taylor approximation. In the following paragraphs, I extend Florinsky's (1998a) approach to the general case of any combination of grid size, window size and decay parameter. Assuming spatial independence of the errors, the general formula for the variance of a parameter p of the conic is (Florinsky, 1998a; Heuvelink, 1998): 2 (4.11) 83 where a 2 2 is the variance of the elevation error (i.e. the square of the RMSE). If one expresses equation 4.10 in its general matrix format and solves for the parameter vector b, one obtains (Dillon and Goldstein, 1984) b = (X'X)-'X'Z (4.12) where X is the independent variables (co-ordinates) matrix and Z is the dependent variables (elevations) vector. Then the variance of b can then be estimated as 2 2 _ (X'X)-'X' dZ dz. (4.13) but since Z is just the sequence of zâ€ž the derivative of Z in z, is a position vector, a vector of 0's with a 1 in the itb position, 4.13 simplifies to 2 2â€”^ Â°b =cr2 _ c 2 Xi Wi 2 x.v,.w. (X'X)"1 x,.w,. V wi ) (4.14) Thus, it is relatively straightforward to estimate the effect of changing the window size and weights structure on the variance of the parameters of the conic equation. Equations 4.11-14 assume the of absence of spatial correlation in elevation errors. This is an extremely unlikely occurrence (Fisher, 1997; Heuvelink, 1998), especially considering that the DEMs are derived from the TRIM point and line files through interpolation of the TIN model. Considering the case of Taylor expansion with spatial 84 autocorrelation between random variables (Heuvelink, 1998, pg. 37), 4.14 can be altered to include the correlation of the elevation error: < J (X'X)-1 X j2W j yj wJ XjyjWj XjWj yjwj (X'X)" 2 X; W; ylwi X.J/.W,. yw (4.15) where p (z^zj) is the correlation of elevation error for the positions i and j. If the weighting function is symmetric, the following relations hold for a symmetric DEM grid (allowing for rounding error in the matrix inversion): 2 2 2 2 (4.16) which makes it easier to calculate the variance of the terrain parameters defined in equation 4.7: VAR(G) VAR(A) = 'dG\2 2 fdG^ dd CT = . . . = 8A dd fdA^2 1 (\ + d2+e2) __â€ž2 d +e (4.17) (4.18) 85 VAR(kr) = rdk\2 2 fdk\2 2 fdk.^2 v 3a y 0 40000 2 CT + 5 6 2 V ^ V o rdk\2 2 fdky ydd j0 (e 2 +J 2 ) 2 +( l + e 2 + J 2 ) 3 â€¢{ ( e 4 +J 4 ) - c r a 2 + e 2 J 2 . c r c 2 + "+" Cj (2<fa + ce)2 + (2eb + cdf + M + be2 + cde) (d2 + e2) + \i + d2 +e2j (4.19) (ad2 +be2 +cdef (d2+e2) VAR(k,) = da cr, + 'dk,^2 40000 V ^ y 0 cr, + (e 2+rf 2) 2+(l + e 2 + J 2 ) 3 ( e 4 + J 4 ) - C r f l 2 + e 2 J 2 . , T c 2 + c r ; . (2bd - cef + (2ae - c j ) 2 - 4 k ! t ^ l z ^ I (4.20) For the case of a 3X3 window and no spatial autocorrelation, VAR(a) = 2, VAR(c) = 0.25 and VAR(d) = 1/6, so that substituting these terms in equations 4.17, 4.18, and 4.19 yields Florinsky's (1998a) equations 24, 25 and 27, when one accounts for the different notation used by Florinsky and for a small typographical error in Florinsky's (1998a) equation 27 (I. V. Florinsky, personal communication, January 2001). Equation 4.19 differs from Florinsky's (1998a) equation 26 because Florinsky's kh is not the same curvature as kt. If one assumes the elevation error to be stationary, the variances of the conic parameters depend only on the evaluation window size and on the weighting function, and are the same for the entire DEM. The variances of the topographic variables at any point are 86 a function o f the variances o f the conic parameters as wel l as the value of the conic parameters at the evaluation point, and thus assume different values throughout the D E M . Inspecting equations 4.16-4.20, we find that the variances o f slope and aspect w i l l generally increase for decreasing values of d and e, and thus with decreasing slope, while the variances of the curvatures w i l l have a more complex relation with the value of the conic parameters. The numerical results for equations 4.15 to 4.20 depend on the spatial structure o f the elevation error. Generally, one would want to derive the spatial correlation structure o f elevation error from an adequate sample o f elevation values of higher accuracy than the D E M (Fisher, 1998), but such a sample was not available to me, so I had to take a decision based on a reasonable structure for the elevation error. M y first choice was to account for the specific structure of T R I M . Assuming no spatial correlation o f errors on the cross-smoothed T R I M points, I derived a spatial correlation structure for the elevation errors in the interpolated D E M s through Monte Carlo simulation. I randomly selected 6 non-contiguous squares with 1200 m sides from the study area. From each square, using the A r c V i e w " T I N to G r i d " function, I generated a 30 m resolution D E M using the cross-smoothed T R I M elevations, as wel l as 6 alternative realisations o f the D E M produced by adding a random term with zero mean and standard deviation o f 3 m to the elevation of each T R I M point. 87 I then subtracted the original DEM from the each of the realised DEMs to create 36 simulated elevation error maps. I used these maps to compute an empirical correlogram of the elevation error, and I fitted a smoothing spline model to the average correlation at each distance (Figure 4.8). The spacing of the TRIM points and the Arc View interpolation algorithm result in the elevation errors of the interpolated grid having a correlation range of 5 grid units, i.e. 150 m. For additional numerical simulation I used an autocorrelation function of the form: . f _ _ ] 2 p(zi,zj) = e2^^ (4.21) where dy is the Euclidean distance between the points z,- and zj, and range is a range 0.8 -f 7 9 11 13 15 17 19 distance (grid units) Figure 4.8: Empirical correlogram of the elevation error of an interpolated DEM of 30 m resolution, obtained by adding a spatially uncorrelated error term to the TRIM sample points. 88 parameter so that the correlation coefficient py converges to zero for dy = range. Because o f the nature o f the conic function, we find that the variances of the parameters a, b, c, d, and e are generally positively correlated, and that the variance o f d and e is much larger (often by a 103 to 104 factor) than the variances o f a, b and c. I decided to concentrate on the variance and R M S E of d as the best predictor o f the overall variance of the quadratic parameters. The results o f calculating the R M S E o f d for various combinations o f autocorrelation structure and window sizes are shown in Figure 4.9. It has been said in the literature that the assumption of no spatial autocorrelation is a worst case scenario for the calculation of slope and aspect R M S E on a 3X3 evaluation sizes and different autocorrelation structures 0.05 0.04 â€”' I 0.03 i CC Q . $0.02 CL. 0.01 0.00 ~i 1 1 1 1 1 1 1 1 r 9 11 13 15 17 19 21 23 25 27 29 window size Figure 4.9: R M S E of the conic parameter d at different evaluation window sizes for different structure o f the elevation error: the assumption of no autocorrelation is not a worst case scenario. 89 window (Hunter and Goodchild, 1997; Heuvelink, 1998). This is not the case when one uses Evan's method, especially if one uses window sizes larger than 3X3, as my numerical simulations show. The maximum RMSE of the parameters d and e of the quadratic model in a 3X3 window occurs for a range value of 4 grid cells in Equation 4.21. It is interesting to note that the distance between the corners of a 3X3 evaluation window is 3x2Â° 5 = 4.243, which means that the RMSE of d is maximised when the corners of the evaluation window are independent while the intermediate points are not. This makes sense, considering that the quadratic interpolation function is mostly affected by the edge values. It is likely that the relation between the size of the evaluation window and the range giving maximum RMSE of the shape parameters could be determined analytically, but that falls beyond the scope of this thesis. What can be determined from Figure 4.9 is that the RMSE of d and e is generally very small once window sizes of 7 or larger are employed. Using the empirical correlation structure from the simulated error on the TRIM points, I calculated the RMSE of the terrain variables for the entire study area. As the summary results show on Table 4.5, the values are very small and cannot substantially affect the results of the analysis. Only a very small fraction of RMSE values for the A (aspect) Table 4.5: Distribution of the calculated RMSE of the topographic variables for the study area using Evans' method and a window size of 7 with no weighting function. G A K Minimum 0.00900 0.01866 0.00000 0.00000 Median range â€” 0.26-0.85 â€” â€” Mean 0.01792 - 0.00049 0.00047 3 r d Quartile range - 0.41-1.96 - -Maximum 0.01800 - 0.01993 0.02133 90 variable get very large, but this happens when d and e are very close to zero, in which case the slope is zero too; the aspect is undefined, and, in ecological terms, irrelevant. Because of these few extremely large values, the mean of the RMSE of A is not a good indicator of the central tendency of the distribution, and in Table 4.5 I reported the range of the medians and of the 3 r d quartile of the 8 sub-areas into which I divided the study area for computational purposes. Summarising, the results from this analysis indicate that the DEM elevation errors have only a minimal influence on the conic parameters as well as on the topographic variables for evaluation window sizes of 7X7 or greater. 4.2.1.4: TRIM interpolation effect The DEM grid used in the analysis is produced through interpolation from the TRIM, and at very small scales the grid surface shape is more likely to represent the shape of the specific interpolator used than the "real" terrain surface. To assess the importance of this interpolation effect at various scales, I created a series of 200X200 pixel synthetic surfaces with different fractal dimensions, sampled them using the TRIM sampling scheme, and recreated the surfaces from the sampled points. Using the fractal surface generator of LandSerf 1 .7, one fractal surface with pixel size of 30 meters was created for fractal dimension values of 2.1, 2.3, 2.5, 2.7 and 2.9. Each surface was then sampled using a "simulated TRIM scheme" obtained by extracting a 6000X 6000 meters square section of the TRIM data for the study area and overlaying it on the fractal surface. I then calculated the vector of quadratic model parameters for the original surface and the simulated TRIM DEM using different evaluating window sizes, and calculated the difference between the two vectors. Figure 4.10 shows how the choice of window size affects 91 the standard deviation of the difference between the quadratic parameter o f the original surface and those of the simulated T R I M for different fractal dimensions o f the original surface (the number in the bar on top of each panel). The value of this exercise lies in the insight it gives in the interaction between the scale of the T R I M sampling scheme, the complexity o f the topography (expressed by the fractal dimension), and the choice o f window size for the estimation o f the conic parameters. The shape o f the curves in Figure 4.10 is more important than the absolute values found, which cannot be compared with those described in the previous section, because these values are a function o f the relief of the fractal surface, and do not necessarily correspond to the range o f relieves found in the study area. A s is to be expected, the higher the fractal dimension o f the original surface, the larger the difference between the original and the sampled surface. The parameter values converge for larger window sizes, where the evaluation windows spans more and more original T R I M points. The more interesting result is how it takes window sizes o f up to 7 X 7 or 9 X 9 for the second derivative o f the curves to change sign, even for mid-values of the fractal dimension, which indicates that window sizes of 7X7 or larger should be employed to minimise the influence of the interpolation function on the terrain variables. 92 parameter a Q012 Qorjfji QCOtf QOOOi 2.1 2.3 25 27 29 7 V 8 s V h I V â€¢^â€¢BSBBBBBSH V 0 B V 8v - 'â€¢â€¢mmâ€” 7 0 B7 0.012 Q006 Q0O4 0.000 3 9 15 21 27 3 9 1521 27 3 9 1521 27 3 9 1521 27 3 9 15 21 27 8 I T J parameter c H= Q00& 0) "55 E CO ro CL O c .0 _ > Q) TD TJ CO i_ T J C CO â€”' co QCOtf 0.0021 2.1 2.3 25 2.7 2.9 B 7 B V 0 0 â€¢7 â€¢>7 $7 . ."MSSSUBBIHI 7 Â° 7 S - *Â»SWWBBBB V B 07 ^*Â»SSgÂ§IBB M'WfBDMWO Tâ€”iâ€”iâ€”r 1â€”1â€”1â€”1â€”rnâ€”1â€”1â€”1â€”1â€”1â€”1â€”1â€”1â€”r QC06 O.0C4 -Q0Q2 -0.000 3 9 1521 27 3 9 1521 27 3 9 1521 27 3 9 1521 27 3 9 1521 27 -2 parameter d 0.3-Q2i QH QOi 2.1 2.3 2.5 27 2.9 7 0 7 0 H 7 B 7 * V iv 87 . ^ i B t u n m , I .. . % Â» Â» Â» . Â°7 D Â°o77 â€¢0.3 Q2 0.1 0.0 * r=0 â€¢ r = 1 0 r=2 v r = 3 3 9 1521 27 3 9 1521 27 3 9 1521 27 3 9 1521 27 3 9 1521 27 window size (pixels) Figure 4.10: Standard deviation of the difference between the conic parameters of artificial fractal surfaces and the conic parameters of their reproductions from simulated TRIM sampling and interpolation using different window sizes (x axis), decay parameters (r values), and fractal dimensions (trellis panel). 93 4.2.1.5: Excessive Generalisation So far we have seen how using increasingly larger windows improves the ability of Evans' method to describe the shape of the underlying topographic surface with less and less influence from measurement and interpolation errors. This improved performance comes with a cost, since it involves a higher and higher generalisation of the terrain surface. Moreover, the least square method minimises the weighted sum of square of the residuals on the entire window, and, as the window gets larger, it is less likely that the quadratic equation will be a good descriptor of the terrain surface in the neighbourhood of the point where the terrain variables are estimated. It is then important to assess the amount of generalisation occurring at different window sizes. One way to do this is to measure the variance of the residuals of the quadratic model in a neighbourhood of the evaluation points. Because I am interested in the surface shape, and not in the elevation per se, I used a modified residual term, obtained by subtracting the residual at the centre of the evaluation window from the neighbourhood residuals. This corresponds to a translation of the interpolated surface along the z-axis, so that it has a residual of zero at the evaluation point. The variance of the modified residuals is then a better index of how the interpolated surface is different from the data surface in slope, aspect, and curvature. For every DEM cell, it is possible to compare the residual variance with the maximum variance of the elevation error, as stated in the TRIM specifications, using a %2 test. The test result indicates the probability of a random error creating a sample variance of the residuals equal or larger than the one observed; that is, the probability that a random error would create a larger change in the surface shape than the one introduced by the interpolation procedure. 94 The trellis box-plot in Figure 4.11 shows the distribution of the i2 test results for a random sample of 600 DEM cells, obtained by systematically sampling 100 cells from each of the six 1200X1200 m square sub-areas described in section 4.3.1.3. It is interesting to notice how the relation between the distribution of the test results and the window size reflects the general topography of the study area, with few areas (mostly the deeply incised river trenches) where even a small window size produces a substantial divergence from the original surface, and several areas (the extensive flat or gently rolling areas) where even vary 1.0 0.8 " 0.6 " 0.4 " 0.2 -0.0 1.0 -ro .Q o i 0.8 -CL TJ 0.6 -<U i ra 0.4 " cr </] 0.2 " !c o.o -o 1.0 -0.8 -0.6 -0.4 " 0.2 -0.0 -â€¢ n 0.5 1 1 1 1 1 1 1 1 T â€” l 1 1 I 1 1 1 1 1 1 1 1 1 i 1 T -7 11 15 19 23 27 31 35 39 43 47 51 3 7 11 15 19 23 27 31 35 39 43 47 51 window 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Figure 4.11: Box-plot of the distribution of probability values for the x 2 test for surface generalisation at different window sizes (x-axis) and decay coefficient values (trellis panel). 95 large window sizes create little deviation from the original surface. Figure 4.11 also shows how there might not be a single "best" scale to evaluate terrain variables over the entire study area. Different portions of the study area will allow a higher degree of generalisation then others before the local shape is modified significantly. Based on the analyses described in the last three sections, I decided to produce maps of the conic parameters and the terrain variables using four combinations of window sizes and r values: a window size of 7X7 with an r of 0, a window size of 15X15 and an r of 0.5, a window size of 23X23 and r of 1, and finally a window size of 31X31 and an r of 2. 4.2.2: Terrain classification. While topographic variables have been successfully correlated directly with vegetation characteristics (Skidmore, 1989b; Florinsky and Kuryakova, 1996; Pickup and Chewings, 1996; Ostendorf and Reynolds, 1998), they can also be used to classify the DEM in terrain classes that are similar to those used by ecologists to describe sites in the field. In the following section I will describe the methodology I used to produce such a classification. 4.2.2.1: Identification of morphometric features from topographic variables. It is possible to use topographic variables to identify some basic morphometric features of the DEM surface, basing the classification on surface slope and the rate of change of the surface slope in two orthogonal directions. Such a classification usually recognises peaks, pits, ridges, channels, saddles, neutral slopes and flats (e.g. Haralick, 1983; Pellegrini, 1995; Wood, 1996; but see Coops et al, 1998, for a different approach). While sometimes the terms used in the literature are different, the basic concepts remain the same: 1. Peaks and pits are points were the slope is equal to zero and the surface is convex (peak) or concave (pit) in every direction. These can also be defined as points where all 96 directional first derivatives of the terrain surface are equal to zero, while the directional second derivatives have all the same sign. 2. A saddle occurs where the surface has no slope and is concave in one direction and convex in the orthogonal direction. In derivative terms, a saddle is point where all directional first derivatives are zero, while orthogonal directional second derivatives have opposing sign. 3. A ridge occurs where the surface has a slope and it is convex in the direction orthogonal to the slope, or, if the slope is equal to zero, when the surface is convex in the direction of maximum curvature and has no curvature in the other direction. A channel is the mirror of a ridge. 4. Neutral slopes and flats are areas that are neither concave nor convex, with either positive (neutral slopes) or no slope (flats). These are areas that have all directional second derivatives equal to zero Because I am using Evans' method, and approximating the DEM with a quadratic surface, every point of the DEM surface falls under one of these categories, and it is not be possible to find inflection points. The use of the quadratic surface has two other consequences that should be considered: 1. For a quadratic surface described by Equation 4.6, peaks and pits occur at a single geometric point, and only if the quadratic surface defines an elliptic conic section. A saddle, too, occurs at a single geometric point, and only if the quadratic surface defines a hyperbolic conic section. It is extremely unlikely that a DEM grid node will correspond exactly with one of these points. 2. Except for some very specific cases (the singular points mentioned above, the planar cases where all a, b, and c are zero), every point of the surface satisfies the definition of 97 either a ridge or a channel given above, this because any vertical section of the surface described by Equation 4.6 is a parabola, and thus it has a definite profile curvature. A better definition of ridges and channels is based on the axes of the general conic defined by a horizontal section of the surface (Wood, 1998): in all cases where the axes are defined, they identify the points of maximum and minimum planar curvature of the surface, which are the most ridge-like and channel-like regions of the surface. In a surface with an elliptical conic section, the axis corresponding to the largest planar curvature (in absolute value) designates the feature, while the sign of the plan curvature indicates if the axis is a channel or a ridge. In a surface with a hyperbolic conic section, one of the two axes is a channel and the other a ridge, while in a surface with a parabolic conic section there is only one axis, and the sign of the plan curvature indicates if it is a ridge or a channel. The conic axes and their intersection define then the important morphometric features of the surface, but, as noted above, there is no guarantee that the DEM grid points, where the surface topographic variables are evaluated, will fall on one of these points. Wood (1998) presents a surface classification algorithm that searches in a radius r of each grid points for intersections with the conic axes, and uses these axes, if present, to classify the grid point. This approach recognises that the most important feature of the DEM pixel might not be located at the evaluation point, and, in this, is conceptually similar to Pellegrini's (1995) search for zero-crossings of the directional derivatives of elevation. In the discussion below I generally follow Wood (1998), but some modifications have been introduced for clarity. 98 The first step in this approach is to identify the type of conic section defined by the surface. If c2 - Aab & 0 , the section is either elliptic or parabolic, and has then two defined axes. The axes meet at co-ordinates (Wood, 1998) 2bd - cd 2ae - cd xc = 2 , , > yc = 2 A , (4-22) c -Aab c -Aab and if both xc and yc are smaller than half the DEM resolution, then the evaluation pixel contains the centre of the conic section, and should be classified accordingly. If c2 â€” Aab > 0 it should be classified as a saddle. If c2 - Aab < 0, the pixel should be classified as a peak if a + b > 0, as a pit otherwise. In the case where the centre of the conic section does not lie in the pixel, there is still the possibility that a conic axis would pass through the pixel. To identify the axes it is necessary to rotate the reference system by an angle 0 defined as (Wood, 1998): 1 ( c ^ # = â€” arctan 2 (4.23). \a-bj The rotated reference system X, Y has the following relation with the evaluation reference system: x = Xcos&-Ysm&, ^ = X s i n ^ + Tcos^ (4.24) and the equation of the terrain surface in the X, Y, system has no mixed term. This is obtained by substituting 4.24 in 4.6, which gives: z = a'X2 +b'Y2 +d'X + e'Y + f (4.25) where 99 a'= acos2 3 + bsin2 $ + ccos3sin& b'-bcos2 9 + a sin 2 9-c cos 9 sin 9 d'= d C 0 S i 9 + esin^ e'= e cos Â£ - a " sin .9 In the new co-ordinate systems, the conic axes correspond to the lines defined by = 0 for any X= dX 2a' (4.26) d z nr v ~e< â€” = 0 for any Y= 8Y J 2b' and the closest point of each axis to the origin can be expressed in the original co-ordinates system by rotating back the axes: d' Q d' . _ x= cos-i9, y = sin-.9 2a' 2a' x = â€” s i n - 9, y = cos- 9 2V 2b' (4.27) and it is possible then to see if one of the axes crosses the pixel. Note that in the parabolic case only one of the two axes is defined. If one of the axes crosses through the pixel, the axis can be classified as a ridge or a channel by looking at the sign of the second derivative of z moving orthogonal to the axis, which will be 2a' for the axis parallel to X, and 2b' for the axis parallel to Y. It must be noted that the plan curvature of the axis is identical to the second derivative, since the first derivative of z at the axis is zero. 4.2.2.2: From morphometric classification to terrain classification The morphometric classification described in the previous section is geared toward the delineation of drainage networks, and is mostly concerned with the sign of planar curvature, which controls the divergence or convergence of the surface flow. 100 In forest ecosystem mapping, terrain morphology is often rather described in terms of topographic position (Luttrnerding et al, 1990). Topographic position is related to profile curvature, controlling the acceleration or deceleration of the downhill flow, and to the size of the up-slope contributing area, factors that influence the moisture and nutrient regimes of a site (Kimmins, 1997). Both the most recent field guide for the study area (DeLong, 1988), and the field guide for its ecological equivalent in Alberta (Beckingham et al., 1996) use a classification of topographic position into seven classes: crest (also called ridge), upper slope, middle slope, lower slope, toes, depression and level. The topographic position classes are then used, together with slope and soil properties, to asses the moisture regime of a site. While topographic position alone is not enough to determinate the moisture regime, in conjunction with slope it restricts the range of probable moisture regimes of a site. Topographic position is related to the drainage network, and methods to classify topographic position in a DEM have been proposed in the literature (Skidmore, 1990; Coops et al, 1998). Skidmore (1990) identifies ridges and channels on a DEM, and then considers the ratio of the distances of each pixel from the closest ridge and the closest channel to produce a classification of the slopes into upper, mid and lower slopes. Coops et al. (1998) improve on Skidmore's (1990) method, introducing a more complex algorithm that analyses the flow lines between ridges and flat areas, and classifies upper (convex) or lower (concave) slopes based on the departure of the flow-line profile from a straight line joining the top and the bottom of the slope. A computationally simpler alternative is to classify slopes into convex (shedding), neutral and concave (receiving) based on their profile curvatures, since upper slopes are generally convex in profile, while lower slopes and toes are generally concave. This is the 101 Figure 4.12: Linear and S-shaped fuzzy membership functions with parameters a and b shown. approach used by Pellegrini (1995) and the one I implemented, with modifications that account for the use of the quadratic model in calculating profile curvature. The classification I devised is geared towards the identifications of moisture regimes proposed by Beckingham et al. (1996, Table 1.10), and it recognises seven classes: level areas (flats), channels, concave slopes, neutral slopes, convex slopes, ridges and very steep slopes. I defined these classes as fuzzy sets (Zadeh, 1965), so that each DEM pixel can have a membership value varying from 0 (completely non member) to 1 (completely member). For all classes, membership is based on a single terrain variable using either a linear or "S-shaped" (cosine based) fuzzy membership function (Figure 4.12). The classification was implemented in S+, and produced directly from the map of the quadratic parameters. 102 The type of fuzzy membership functions and their parameters are reported in Tables 4.6 and 4.7. The "neutral slope" class is the complement to the sum of concave slope, convex slope and level memberships. Al l membership algorithms account for the presence of conic axis, as discussed in the previous section. Where one axis of the conic section crosses the pixel, the evaluation point is moved from the pixel centre to the axis point closest to the pixel centre. If the centre of the conic section falls in the pixel, the evaluation point is moved to the centre of the conic section, and the curvature evaluated is the maximum in the ridge case and the minimum in the channel case. The use of distance-dependent parameters (Table 4.7) accounts for the increased smoothness of the surface when larger window sizes are used. For ridge and channel classification, the variable used in fuzzy membership function is the directional first derivative of the terrain surface in the direction orthogonal to the aspect (or the conic axis) at a distance d from the evaluation point. It is a linear transformation of the planar curvature, since all vertical sections of the model surface are parabolas and have constant second derivative. However it expresses the curvature in more intuitive terms: S(d) of 0.01 means that if one were to walk a distance d from a ridge top in the direction opposite to the ridge axis, the slope in that direction would increase to 1%. In the concave and convex slope classification, the parameter values of a and b are based on the curvature effect on a slope of 0% on a parabola. For g=30, a 0% slope changes to 1.5% in 30 m for k, = a, while it changes to 3% for kr = b. These parameter values have been chosen considering the slope classes used by Beckingham et al. (1996, Table 1.10), where terrain is considered level if the slope is less than 2%. To insure that the fuzzy membership functions are mutually exclusive and exhaustive, i.e. that the sum of the fuzzy membership for each terrain classes over a single 103 Table 4.6: Fuzzy membership functions for the terrain classes. Terrain Class Variable Function type a Level G S-shaped*** 1% Steep G S-shaped 25% Ridge S(d)' Linear 0.01 Channel S(df Linear 0.005 Convex Slope kr Linear 1.5/g Concave Slope -kr Linear 2.5% 35% 0.09 0.07 0.3/g 0.3/g Slope at distance d in the direction orthogonal to the gradient (a linear function planar curvature). If a conic axis is crossing the pixel, the planar curvature is evaluated at the axis point closer to the pixel centre. **If a conic axis is crossing the pixel, kr is replaced by kt evaluated at the axis point closer to the pixel centre. "'Decreasing function: membership is 1 for G < a and 0 for G > b. Table 4.7: Window-dependent parameters of the fuzzy membership functions. Window Size D g 7X7 (4.41 ha) 15m 30 m 15X15 (20.25 ha) 30 m 60 m 23X23 (47.61 ha) 45 m 90 m 31X31 (86.49 ha) 60 m 120 m pixel is identically equal to one, I employed an algorithm designed to preserve the maximum amount of information. First the classes are made mutually exclusive in pairs (ridge vs. channel, concave vs. convex, flat vs. steep) by dividing the membership value by the maximum of 1 and the sum of the two pairs. The mutually exclusive "channel" and "ridge" classes, are then preserved, having the largest amount of information, while the other classes are recalculated, in sequence, using the following set of equations, that account for the decreasing amount of terrain information 104 when moving from the channel/ridge class pair, on to the concave/convex, and finally to the steep/flat pair: concaveme = concavem â€¢ (l - (channelm + ridgem )) convexme = convexm â€¢ (l - (channelm + ridgem )) steepme = steepm â€¢ (l - (channelm + ridgem + concaveme + convexme)) (4.28) fl atme = flatm ' C1 ~{channelm + ridgem + concaveme + convexme)) neutralme = 1 - (channelm + ridgem + concaveme + convexme + flatme + steepme) The m subscript indicates the pair-wise mutually exclusive membership, while the me subscript indicates the overall mutually exclusive and exhaustive membership function. Both the posterior probabilities produced in Chapter 3, and this exhaustive and mutually exclusive fuzzy membership classification can be interpreted as sub-pixel classification: a pixel that has a posterior probability (or fuzzy membership) of 0.3 for class A, and 0.7 for class B, can be interpreted has being comprised of the (crisp) classes A and B for respectively 30% and 70% of its surface. This interpretation is at the base of the analysis methods that I will use in the next chapter. 105 C H A P T E R 5: S P A T I A L A N A L Y S I S 5.1: ASSOCIATION OF FOREST COVER WITH TERRAIN. In Section 3.3 I described the development of forest cover classification, while terrain classification was described in Chapter 4. Since these two classifications are independent, it is now possible to investigate if any of the forest cover types has a particular positive or negative spatial association with any of the terrain types. This is done using a methodology based on map overlay. The first step in the analysis is to assess how well the data used and the scales employed capture those relations between terrain and forest cover that are known to exist: lowland black spruce dominated stands are expected to be strongly correlated with topography and drainage. A failure to detect this relation would indicate that one of the classification scheme, the data quality and the scale of analysis are inadequate for ecological investigation, and the methodology is flawed. Once the methodology has proven itself, it can be used to investigate the association between terrain and those forest cover types that are of more interest in the study of mixedwood successional patterns. In particular, I am interested in investigating whether there is a different pattern of association between those stand types that are considered possible stages in mixedwood succession as outlined by Lieffers et al. (1996a), namely HWD, MWD and SW, and those that are not included: SBPL and SBLW (see Table 3.3 at page 41 for definitions). I am also interested in any difference in the selection for terrain classes within the group of stands that could be part of mixedwood succession. 106 5.1.1: Analysis method 5.1.1.1: The Modified Electivity Index To investigate the relation between terrain and forest cover, I developed a modified version of Ivlev's Electivity Index. The Electivity Index is a measure of deviation from randomness between the association of different objects in a contingency table. It was developed to study foraging habits of animals (Jacobs, 1974), and it has been previously applied in landscape ecology (Pastor and Broschart, 1990). The Electivity Index assumes no uncertainty about the membership of an observation in the sets that form the rows and columns of the contingency table. I developed the Modified Electivity Index to accommodate for uncertainty in the site classification (expressed as fuzzy membership) and in forest cover classification (expressed as posterior probability) maps produced in Chapter 3 and Chapter 4 respectively. The Modified Electivity Index differs from Ivlev's Electivity Index in the way the proportion of the area in each class and the joint occurrences of two classes are calculated. Let us consider two classes i and j, where i is a terrain class and j is a forest cover class. The Modified Electivity Index is calculated as: where pi is the average value of the fuzzy membership map of i; pj is the average value of the posterior probability map of/, and r^ is the average of the pixel-wise product of the two maps. r,( l -Pi) (5.1) 107 This is equivalent to interpreting the fuzzy membership level or the posterior probability of a pixel in a class as the fraction of the pixel belonging to the same class, as suggested for posterior probabilities by Eastman (1999) in the description of the BAYCLASS module of Idrisi32. Consequently, the proportion of the study area in a class is the average value of all the pixels in the fuzzy membership or posterior probability map for that class. The map of the joint occurrence of two classes at the sub-pixel level can be calculated as the pixel-wise product of their probability and fuzzy membership measures, as would be expected in the case of two independent probabilities. The average of this map is the proportion of the study area where the two classes occur together, r^ . 5.1.1.2: Statistical testing The Modified Electivity Index is an indicator of how strongly each entry in the contingency table departs from values expected in the case of independence or random association of the row and column classes. While the expected value for Ey is 0 in the case of completely random association between / and j, the magnitude of Ey does not necessary imply a statistically significant departure from random association, since Ey does not take into account the number of observed random events in the contingency table. In the case of the fuzzy membership map, this operation implies also the definition of the terrain classes as a crisp instead of a fuzzy set. It was to extend this sub-pixel interpretation to the fuzzy membership maps that the fuzzy memberships were forced to be exhaustive and mutually exclusive (i.e. the summation of the membership level over all classes is 1 for each pixel). 108 It is possible to test the significance of the departure from random association of each entry in a contingency table using Fisher's exact method. If we are interested in a single element k,l of a nxm contingency table with nkt occurrences, it is possible to reduce the full table to a 2X2 table by summing row-ise and column-wise all the occurrences in the rows and columns other than k and /: nu=nki nn =-nkl+ Ynkj 7 = 1 n m n 1=1 7=1 1=1 (5.2) In this case the departure from random association is tested against the chi-squared distribution using one degree of freedom (the period subscript indicating the column or row total): 2 = n (nu-n22-nl2-n2J ^ nt-n2-n] -n2 which can be expressed in proportion terms, where py = nyln.. : x2 _njpn-p22-pl2-p2{f P.\-P.2-P\.-Pi. It can be seen from Equation 5.4 that the result of the test depends very much on the total number of observations in the contingency table. In using this test with GIS raster overlays, one must ask if and when each pixel can be counted as an independent observation, since these decisions will influence the result of the test. This question can be re-expressed as the choice of the minimum area that can be considered as an independent random event. Given that the various p in Equation 5.4 are the 109 proportion of the total surface of the study area A in each combination of classes, Â«. can be expressed as: where m is the minimum area considered as an independent random event. The original pixel size, the resolution of the spatial information collected, is an obvious lower bound to m. In the case of raster maps obtained completely or partially from polygon maps, as in the case of my forest cover type maps, the minimum mapping unit of the polygon map must also be considered. But these data collection issues are only a lower bound because they not do consider the "natural" spatial autocorrelation of the classes in the contingency table. Both terrain and forest cover are likely to have a spatially autocorrelated structure at scales larger than the minimum mapping unit, so that the assumption of independence between observations at the base of Fisher's exact test would be violated if one were to set m to the minimum mapping unit. In the absence of compelling evidence to set a specific size for m, I decided to take a different approach. Substituting 5.5 in 5.4, one obtains: Since the critical chi-squared forp =0.05 is 3.84, we can calculate the minimum size of m that would reject the hypothesis of independence of the two classes at 95% confidence A n = â€” m (5.5) m = A-(Pn -Pn-Pu -Pu) X2 â€¢ P.l â€¢ P.2 â€¢ Pi. â€¢ Pi. (5.6) as: (5.7) Px-P.2-Px.-P2. 110 It is then possible to calculate m for each entry of the contingency table and consider it not only in terms of data collection scale, comparing it to the original mapping resolutions, but also in terms of the scale of the processes that are likely to influence the relation between the two set of classes considered, which in our case are topography and forest cover. 5.1.1.3: Analysis implementation. I implemented the analysis as a Microsoft Excel Visual Basic for Application (VBA) program, using Application Programming Interface (API) calls to Idrisi32, where routinely each cover type posterior probability map was multiplied by each fuzzy membership terrain map using the OVERLAY module of Idrisi32, and the average value of the resulting map was extracted for the analysis area using the EXTRACT module of Idrisi32. The EXTRACT module's output is a text file that was read by the V B A program into an Excel spreadsheet where the Modified Electivity Index and m were calculated. Given the large size of the study area (each map had a size of 8,607,827 pixels, or 34 Mb of data in double precision format), the procedure was quite machine intensive, requiring over a hour on a Dell Workstation running WinNT 4.0 with an Intel PHI processor at 933 MHz and 256 Mb of RAM. I l l Table 5.1: Modified electivity index for forest cover types and terrain classes (7X7 estimation window). The m parameter (ha) for each combination is reported in brackets. OB MWD OV PL SBLW SBPL SW HWD % total area channel -0.52 (4.4) -0.21 (21.8) 0.11 (4.9) -0.45 (20.9) -0.01 (0.0) -0.36(17.7) 0.69 -0.15 (17.2) 1.9% (286.0) ridge -1.09 (5.1) -0.02 (0.1) 0.10(1.8) -0.63 (15.0) -1.28 (27.8) -1.00 (33.3) 0.01 (0.0) 0.35 (63.9) 0.8% concave -0.47 (7.6) -0.24 (52.4) 0.21 (42.0) -0.47 (45.1) -0.03 (0.2) -0.37 (37.2) 0.58 -0.10 (16.4) 4.0% (356.9) convex -0.84 (23.6) -0.01 (0.2) 0.05 (2.5) -0.51 (69.1) -1.27 -0.84 -0.19(23.0) 0.41 5.3% (177.5) (169.8) (536.7) flat -0.09(1.3) -0.36 -0.10(24.8) -0.50 1.62 0.05(3.4) . -0.62 0.11 (74.9) 14.6% (382.2) (178.7) (5,118.5) (485.1) steep -1.45 (4.8) -0.15 (3.4) -0.19(3.6) -0.39(5.1) -1.86(26.9) -1.86(42.3) 0.78 0.19(10.8) 0.6% (128.3) neutral 0.41 (36.8) 0.29 -0.01 (0.1) 0.59 -1.14 0.27 0.08(18.6) -0.18 72.7% (431.1) (413.7) (2,288.5) (129.5) (311.1) % total area 1.2% 21.7% 15.5% 6.9% 5.4% 8.3% 12.4% 28.5% Table 5.2: Modified electivity index for forest cover types and terrain classes (15X15 estimation window). The m parameter (ha) for each combination is reported in brackets. OB MWD OV PL SBLW SBPL SW HWD % total area channel -0.48 (3.2) -0.17 (12.3) 0.11 (4.6) -0.47(18.3) -0.08 (0.6) -0.44 (20.3) 0.65 -0.11 (8.0) 1.6% (197.6) ridge -0.89(3.4) -0.02 (0.0) 0.02 (0.1) -0.70(15.0) -1.88 (32.8) -1.40 (40.6) -0.11 (1.1) 0.46 0.7% (103.1) concave -0.51 (10.9) -0.19(43.2) 0.25 (74.8) -0.50 (63.8) -0.10(2.6) -0.44 (63.1) 0.54 -0.09(14.7) 5.0% (358.6) convex -0.91 (33.3) 0.01 (0.2) -0.07 (5.4) -0.61 -1.77 -1.21 -0.28 (59.7) 0.52 6.7% (113.8) (306.4) (336.1) (1,192.1) flat -0.08(1.2) -0.36 -0.10(25.9) -0.47 1.63 0.10(12.5) -0.62 0.10(59.8) 15.2% (399.4) (164.9) (5,174.1) (493.8) steep -0.87(1.1) -0.15(1.3) -0.24 (2.1) 0.06 (0.1) -2.19(11.4) -1.63 (14.7) 0.79 (50.8) 0.11 (1.3) 0.2% neutral 0.43 (42.8) 0.26 0.01 (0.4) 0.61 -1.08 0.33 0.15(68.0) -0.24 70.4% (392.1) (469.2) (1,993.9) (193.3) (596.4) % total area 1.2% 21.7% 15.5% 6.9% 5.4% 8.3% 12.4% 28.5% 112 Table 5.3: Modified electivity index for forest cover types and terrain classes (23X23 estimation window). The m parameter (ha) for each combination is reported in brackets. OB MWD ov PL SBLW SBPL SW HWD % total area channel -0.34(1.9) -0.15 (10.0) 0.13 (5.8) -0.30 (8.9) -0.06 (0.3) -0.28 (9.6) 0.56 -0.14(12.6) 1.6% (141.3) ridge -0.79 (3.3) 0.01 (0.1) 0.00 (0.0) -0.57(12.3) -1.97 (37.9) -1.40(45.4) -0.23 (5.3) 0.47 0.8% (123.5) concave -0.42 (9.7) -0.17(44.4) 0.24 (85.2) -0.41 (55.5) -0.07 (1.7) -0.34 (49.7) 0.49 -0.10(21.6) 6.0% (327.4) convex -0.77 (30.8) 0.03 (1.8) -0.12(18.9) -0.54 -1.79 -1.24 -0.31 (83.9) 0.54 7.8% (108.0) (360.9) (401.3) (1,437.6) flat -0.05 (0.4) -0.35 -0.11 (27.0) -0.43 1.59 0.13 (24.6) -0.60 0.08 (39.6) 15.3% (371.4) (144.7) (4,920.4) (473.1) steep -0.96 (0.6) -0.13(0.5) -0.32 (1.7) 0.23 (0.6) -2.56 (6.0) -1.50 (6.6) 0.68(16.3) 0.17 (1.6) 0.1% neutral 0.38 (36.5) 0.24 0.02 (1.5) 0.55 -1.00 0:31 0.17 (94.7) -0.25 68.3% (339.7) (422.3) (1,692.8) (186.1) (654.6) % total 1.2% 21.7% 15.5% 6.9% 5.4% 8.3% 12.4% 28.5% area Table 5.4: Modified electivity index for forest cover types and terrain classes (31X31 estimation window). The m parameter (ha) for each combination is reported in brackets. OB MWD OV PL SBLW SBPL SW HWD % total area channel -0.26 (1.3) -0.16(12.1) 0.13 (6.8) -0.19(4.7) -0.02 (0.1) -0.16(4.1) 0.52 -0.17(20.5) 1.9% (134.8) ridge -0.74 (3.9) 0.03 (0.3) -0.02 (0.1) -0.46(11.6) -1.98 (48.7) -1.33 (55.4) -0.28 (9.6) 0.46 1.0% (153.2) concave -0.36 (9.0) -0.17 (49.8) 0.24 (97.8) -0.34 (49.0) -0.02 (0.1) -0.22 (27.2) 0.45 -0.13 (43.6) 7.4% (321.7) convex -0.66 (29.4) 0.05 (7.5) -0.14(30.8) -0.44 (91.8) -1.72 -1.18 -0.32 0.52 9.2% (417.7) (454.1) (101.8) (1,544.9) flat -0.04 (0.2) -0.34 -0.12(32.3) -0.42 1.58 0.15 (32.2) -0.58 0.07 (31.4) 15.2% (350.4) (138.8) (4,826.5) (455.7) steep -1.58(0.6) -0.08 (0.1) -0.38(1.4) 0.22 (0.3) -2.73 (3.8) -1.61 (4.4) 0.66 (9.5) 0.19(1.3) 0.1% neutral 0.37 (36.9) 0.21 0.02 (2.5) 0.51 -0.94 0.31 0.16(90.5) -0.25 65.1% (297.1) (394.6) (1,456.0) (190.3) (667.6) % total 1.2% 21.7% 15.5% 6.9% 5.4% 8.3% 12.4% 28.5% area 113 5.1.2: Results The analysis results are reported in Tables 5.1 through 5.4. Each table reports the Modified Electivity Index values for each combination of terrain and forest cover classes, while the value of the parameter m, in hectares, is reported in brackets. The total area of each row and column, expressed as percent of the total study area, is reported at the margins. SBLW has a strong positive association with the "flat" terrain class (Modified Electivity Index of 1.63 in the case of the 15X15 window) that is also significant for its large m. SBLW also has a very strong negative association with the "steep", "ridge", and "convex" terrain classes (modified electivity indices of -2.19, -1.88 and -1.77 respectively for the 15X15 case), and strong negative association with the "neutral slope" class. Of the other cover types, SBPL is the only one that presents strong indices of positive or negative association (Modified Electivity Index > 1 in absolute value): it has strong negative association with both steep slopes and ridges. Other classes show less marked values of the electivity index, but in some cases have relatively large values of m, indicating a considerable departure from randomness. SW is positively associated with channels, concave slopes, and steep slopes, and negatively associated with flat areas. HWD instead is positively associated with ridges and convex slopes, and it lacks the clear negative association with flat areas exhibited by both SW and MWD. MWD falls in between the two, having positive association with neutral slopes and negative association with flat areas. 5.1.3: Discussion In interpreting the analysis results, it is important to considered the marginal totals of the classes. The Modified Electivity Index is more sensitive to classes with small marginal 114 totals; the parameter m will be smaller in these cases, since it is harder to detect departure from randomness when the expected values for one element of the contingency table are particularly small. Several combinations have very large values of m compared to the 0.09 ha size of the map pixels. Large values of the Modified Electivity Index are often associated with relatively small values of m in the terrain classes "steep" and "ridge", given the small proportion of the landscape occupied by these two classes. The changes in window size, while altering the marginal totals of the terrain classes, have little or no influence on the general patterns of the Modified Electivity Index. The results unequivocally show some very strong deviations from randomness in the association between SBLW and terrain. These deviations, which remain clear for all scales of evaluation of the terrain parameters, show that the SBLW class is strongly associated with the terrain class with the poorest drainage, "flat", while negatively associated with the better drained classes. While the result is ecologically obvious, it indicates that the data and the methodology used are able to detect site-stand relations. The most evident difference between the "mixedwood group" ( MWD, HWD, SW) and the "non-mixedwood" group (SBPL, SBLW) is the general strength of association with terrain classes. SBPL and SBLW have much larger absolute values of the Modified Electivity Index, suggesting that the physical environment plays a more important role in the distribution of these types than it does for the "mixedwood group". Within the "mixedwood group", there are interesting differences in patterns of association. In general HWD stands are positively correlated with sites that can be interpreted as being drier or poorer (ridges, convex slopes, flat areas), while SW is positively correlated 115 with richer and moister sites (channels, concave slopes), but not sites where there could be excessive moisture (flats). These results must be interpreted with caution: the differences observed could be due to many different reasons, and do not necessarily indicate a difference in successional pathways between the different terrain classes. Different hypotheses can be proposed to explain these results: 1) There is no difference in successional pathways between the different terrain classes, and succession after stand replacement disturbance is controlled exclusively by disturbance intensity, which is not correlated to terrain class. Topography has instead a certain amount of control on disturbance frequency, so that generally early successional stands (HWD) are more common on the drier sites, while mostly late successional stands (SW) are more common in the moister sites. 2) Different terrain classes support different successional pathways. Ridges and convex slopes are more likely to regenerate to pure aspen after fire, while channels and concave slopes are more likely to regenerate to spruce, mixedwoods being more likely in more mesic sites. This can be due either to terrain control on disturbance intensity or to site influences on regeneration and competition after fire or to a combination of both. It is not possible at this point to discriminate between these two hypotheses using the data sets available, but this analysis has been useful in providing some fruitful observations from which new experiments can be devised. 116 5.2: INFLUENCE OF THE DISTANCE FROM SW STANDS ON THE DISTRIBUTION OF HWD AND MWD STANDS. Disturbance intensity and spruce seed availability are major determinants of post-disturbance mixedwood development in the model proposed by Lieffers et al. (1996a). This model implies that the spatial arrangement of spruce seed sources on the landscape is very important, because the distribution of white spruce is thought to be dispersal-limited (Greene and Johnson, 2000). Widespread removal of white spruce, with consequent reduction in the strength of seed source, is reported to be the possible cause of a shift of mixedwoods to pure aspen stands after disturbance (DeLong, 1989). If white spruce seed availability, and therefore white spruce regeneration, is indeed distance-dependent, mixedwood stands should show a positive spatial association with white spruce seed sources, while pure aspen stands should show a negative association with white spruce seed sources. In contrast, such an association will not be observed if the distance to seed source is not a limiting factor. A null hypothesis can be formulated as: H 0 : the likelihood that any given site is dominated by HWD instead of by MWD is independent of its distance from a white spruce seed source. This hypothesis can be tested using logistic regression of class membership (MWD vs. HWD) on the distance from the closest white spruce seed source. 5.2.1: Choice of data for the regression model 5.2.1.1: Distance from white spruce seed sources. To calculate the distance of each pixel from a white spruce seed source it is first necessary to produce a map of the seed sources. I did that by reclassifying as a white spruce 117 seed source any pixel where the posterior probability of membership in the S W cover class was higher than 0.75. While this cut-off value is arbitrary, its exact value is not very important because of the strongly bimodal distribution of posterior probabilities in the SW class (Figure 5.1): cut-off values between 0.4 and 0.8 have little influence in the production of the final map. After defining the white spruce seed sources, I used the DISTANCE module of Idrisi32 to calculate the distance of each pixel from the closest seed source. Figure 5.1: Histogram of posterior probabilities for the SW forest cover class, showing their strong bimodal distribution. 118 5.2.1.2: Dependent variable and choice of observations While the dependent variable of a logistic regression is generally a binary measure (Ter Braak and Looman, 1995) it can also be a probability, which is bounded between 0 and 1. Considering also that all the posterior probability data obtained from the Bayesian classification of forest cover have a strongly bimodal distribution, like the one shown in Figure 5.1, logistic regression seems a very appropriate method by which to investigate the influence of distance from seed source on the posterior probability of a pixel being HWD vs. being MWD. 5.2.1.2.1: Choice of dependent variable The logistic regression discriminates between two possible outcomes: they are either belonging to the HWD or to the MWD class. In the forest cover data set, it is possible that a pixel may not belong to either class, although in the testing of hypothesis H 0 1 am only interested in the discrimination between MWD and HWD. The dependent variable cannot then be either HWD or MWD alone, the posterior probabilities of the two classes, since HWD is not necessarily 1 when MWD is 0.1 then defined the dependent variable as: HWD P(HWD) = P(HWD\MWDKJHWD) = (5.8) MWD + HWD where P(HWD) can also be interpreted as the posterior probability of HWD in the case where MWD is the only other possible class. 5.2.1.2.2: Selection of observations The choice of P(HWD) as the dependent variable in the logistic regression imposes the exclusion from the observation data set of all those pixels where the sum of MWD and HWD is zero, since P(HWD) is undefined there. I actually decided to exclude from the 119 regression all pixels where the sum of MWD and HWD was equal or lower than 0.5. This choice is based on the assumption that any error affecting MWD and HWD has a much larger effect on P(HWD) as the sum of the two terms get smaller, making the measure unstable. I decided to exclude from the observation data set all the pixels belonging to FIP polygons with age lower than 80. This was done to limit the analysis to those stands which had had enough time since the last disturbance event to succeed to mixedwood. I also limited the analysis to each 25 th row and 25th column of the raster maps, using the CONTRACT module of Idrisi32. This corresponds to a regular grid sampling scheme, with observations spaced 750 m apart to remove spatial autocorrelation in the model variables. Once all these operations had been performed, the sample size was 1739. ' 5.2.2: Results 5.2.2.1: Regression Models The simplest possible logistic model is (Ter Braak and Looman, 1995): log r P(HWD) A \ + P(HWD) = b0 + bxd (5.9) where d is the distance of the observation from a SW>0.75 pixel. This model does not account for the influence of the terrain parameters, nor for the influence of age class. 120 Q _ 0 5000 10000 15000 distance (m) Figure 5.2: Logistic regression curve of P(HWD) on distance and hexagonal binning of the observations. The results of fitting this model using the G L M routine of S-plus are shown in Table 5.5. Both parameters are significant at a = 0.001. The fitted line is plotted in Figure 5.2, Table 5.5: Logistic regression of distance from SW versus P(HWD): parameter estimates and deviance table. Deviance Residuals: Minimum ls tQuartile Median 3rdQuartile Maximum -2.80497 -1.04936 Coefficients: Intercept b0 distance bi 0.307741 1.126468 1.427189 Value Std. Error t value -0.6126 0.078313 -7.82244 0.000704 6.73 10"05 10.46085 Null Deviance: 2261.116 on 1738 degrees of freedom Residual Deviance: 2118.798 on 1737 degrees of freedom Number of Fisher Scoring Iterations: 3 121 CM O O o 40 21 1 1000 â€”I 1â€” 2000 3000 distance (m) 4000 5000 Figure 5.3: Detail of the logistic regression of P(HWD) on distance in the zero to five km distance range. while Figure 5.3 shows the detail for the distance range 0-5000 m. Because of the large number of observations, in both figures the original observations are represented through hexagonal binning, using the hexbin function of S-plus (MathSoft, 2000). The null hypothesis of independence between distance from seed source and hardwood probability has been rejected. A visual interpretation of Figure 5.2 might suggest that the regression is pulled by the longer tail of the observations with P(HWD) close or equal to 1. This is actually not true. If the model is fitted only in the 0-5000 m range, which removes only 0.01% of the observations, the relation is still significant (a =0.0001) and bi actually increases to 0.000723, giving a steeper sigmoid curve. This trend is even accentuated if the model is fit 122 only in the 0-2000 m range: this removes 12.5 % of the observations, and yields a significant (a =0.0001) bj of 0.000938, with an even steeper sigmoid curve. The regression is more influenced by the observations closest to the spruce seed source. If the observations in the closest 90 m to the seed source are excluded, bt is reduced to 0.000663 (p < 0.0001). Removing these observation is aimed at mitigating the effect of SW "speckles" inside MWD stands. These could be the consequences of classification errors as shown in chapter 3 (Tables 3.7 to 3.9), MWD pixels have been incorrectly classified as SW, while that does not happen for HWD. Yet, even when these observations are removed, there is still a significant correlation between distance and P(HWD). As shown through section 5.1, MWD and HWD pixels differ in Modified Electivity Index in almost all topographic classes. They also differ in stand age, as reported by the FIP maps. The mean age of all the observations weighted on P(HWD) is 105.50, while it is 118.34 weighted on P(MWD), the complement of P(HWD). If the same observations are divided in HWD and MWD based on their maximum likelihood, the two groups have means of 105.44 and 118.46 years respectively, and the difference is statistically significant (Student's t test, a = 0.001). The forest inventory stand age can indeed be successfully used in logistic regression to predict P(HWD), as shown in Table 5.6 and Figure5.4. To assess the relation of all these variables with the distance from seed source, it is necessary to see how their addition to the logistic model influences the model fitting and its results. 123 With these additions, the model is: log P(HWD) = b0 + bxd + b2a + b3G + b4k, + b5kr + V \ + P(HWD) +b6channel + bridge + +bsflat + b9steep + +bl0concave + buconvex (5.10) where a is the FIP polygon age, the quantitative terrain variables are those defined in Equation 4.7, calculated on a 7X7 window, while the terrain classes are the fuzzy memberships, described in section 4.2.2.2, also calculated on a 7X7 window. The model is not very parsimonious, and several of the terrain variables are highly correlated with each other, both by definition and for topographic reasons, but this is not a problem given the large number of observations. The results of fitting the logistic regression with the full model are shown in Table 5.7. The distance variable is still highly significant even when all the other variables are added to the model, as it can be seen from the coefficient t value (Ter Braak and Looman, 1995). The residual deviance of the model in Equation 5.9 is 1914.266 with 1727 degrees of freedom. If the distance variable is dropped, the model has a residual deviance of 2044.759 with 1728 degrees of freedom. The difference of residual deviance due to distance is then 130.494 with 1 degree of freedom, which is highly significant (chi-squared test,/? = 0). 124 Table 5.6: Logistic regression of P(HWD) on FIP polygon age: parameter estimates and deviance table. Deviance Residuals: Minimum 1st Quartile Median 3 r d Quartile Maximum -1.635953 -1.156148 0.4600737 1.014448 2.00779 Coefficients: Value Std. Error t value Intercept ~~b0 3.617 0.302 1L98 age b, -0.03189 0.00269 -11.83 Null Deviance: 2261.116 on 1738 degrees of freedom Residual Deviance: 2094.894 on 1737 degrees of freedom Number of Fisher Scoring Iterations: 2 Figure 5.4: Regression of P(HWD) on FIP polygon age and hexagonal binning of the observations. The observations left of the dashed line have age lower than 80 years and they have not been used in the regression. 125 5.2.3: Discussion The results of the logistic regression analysis reject the hypothesis that the likelihood that any given site being dominated by HWD instead of by MWD is independent of its distance from a white spruce seed source. The result is very clear, especially considering that the SW class had large error rates in the remote sensing classification (Tables 3.8 and 3.9 in Chapter 3). Since these are errors of both omission and commission, they can create false seed sources, as well as eliminate real ones from the analysis, and they are much more likely to obscure a distance dependent relation than to create a spurious one. Another factor that has the potential to have obscured this relation is the impossibility to discriminate in the cover Table 5.7: Logistic regression of P(HWD) on age, terrain and distance variables: parameter estimates and deviance table. Deviance Residuals: Minimum 1st Quartile Median 3 r d Quartile Maximum -2.825203 -0.9717914 0.2572394 0.9660549 2.044349 Coefficients: Value Std. Error t value Intercept b0 1.894 0.335 5.654 ** distance b, 0.000733 0.0000735 9.975 ** age b2 -0.0295 0.0028 -10.492 ** G bs 0.148 0.019 7.557 ** k, b4 -1.569 1.780 -0.881 K b5 2.579 1.436 1.796 * channel b6 -0.775 0.922 -0.840 ridge b7 0.293 5.074 0.058 flat b8 2.037 1.909 1.067 * steep b9 -2.011 0.632 -3.180 ** concave bw 0.688 0.368 1.869 * convex bu 1.932 2.026 0.953 Null Deviance: 2261.116 on 1738 degrees of freedom Residual Deviance: 1914.266 on 1727 degrees of freedom Number of Fisher Scoring Iterations: 4 **/><0.01 *p<0.05 126 classification between mixedwoods with a spruce component and lodgepole pine-aspen stands. A stand that had developed to a pine and aspen mixture because of excessive distance from spruce seed source, would still be likely to have been classified as MWD in this analysis, weakening the relation between distance from spruce seed source and P(HWD) The results of the fitting of Equation 5.10 show that when the model is expanded to include forest inventory age and terrain variables, the influence of distance is slightly . increased (Table 5.7), as the regression parameter for distance goes from 7.04xl0~4 to 7.33xl0"4. It is particularly important that distance is still significant after the introduction of the terrain variables, which account for site effects on P(HWD). Assume that some sites have a lower probability to become mixedwoods, either immediately after disturbance or through succession, independently from the presence adjacent sources of white spruce seed. The results from the first section of this chapter support this hypothesis, showing how HWD, MWD and SW stands associate preferentially with different site types. If the "HWD sites" are generally spatially separated from the "SW sites", the spatial segregation of HWD from SW for reasons of geomorphology is confounded with the effect of distance from seed sources. The results of the second logistic regression show instead that the distance variable significantly contributes to the model even when the site variables have been accounted for, and thus the spatial arrangement of SW, HWD, and MWD cannot be explained only in terms of site selection. Overall, these results support the hypothesis that the distance from white spruce seed source is an important factor in mixedwood succession, and that in some cases it might drive the succession pathways after disturbance. 127 Much less successional significance should be attached to the strong correlation between stand age and mixedwood probability (Figure 5.4). As recalled in Chapter 2, the literature reports two main pathways of mixedwood formation: either immediately after disturbance or through invasion of spruce under aspen canopy. If the first pathway prevails, there should be little or no relation between stand age and P(HWD), given that disturbance rates are the same for each stand type. The strong relation between age and P(HWD) could then be interpreted as a prevalence of the second pathway in the study area. However, it is also likely that stand ages for aspen dominated stands are underestimated in the forest inventory, as Cumming et al. (2000) have found for Alberta. In this data set, the relation could have also been reinforced by the direction of the possible classification errors, since OV (vegetated opening) stands could have been classified as HWD, which is the reason why the stands younger than 80 years have been excluded from the analysis. A successional interpretation of these data is also confounded with possible differences in disturbance frequencies, especially in fire return intervals. It has been suggested in the literature that fire return intervals differ between deciduous and coniferous stands in the western boreal mixedwood forest (Larsen, 1997; Cumming, 2001), although the results of Cumming differ both quantitatively and in ranking with those of Larsen. It is also unclear to me how the aspen-pine mixtures classified as mixedwoods could have influenced this analysis, since these stands are unlikely to have a similar disturbance regime to that of the aspen-spruce mixedwood stands. 128 C H A P T E R 6: D I S C U S S I O N A N D C O N C L U S I O N S 6.1: DATA QUALITY AND THE SPATIAL ANALYSIS OF BOREAL MIXEDWOODS. Data quality is a serious issue in spatial analysis, especially in GIS operations where it is often hard to assess how errors in the spatial data sets will propagate through the analysis (Heuvelink, 1998). The issue is not limited to errors or uncertainty in the data, but, as discussed in Chapter 2.1, it also involves the resolution and extent of the data sets, which all contribute to the fitness of a specific data set to be used for ecological investigation. The first objective of this thesis was to assess how suitable the spatial data available in British Columbia are for ecological investigations at the landscape scale in the boreal mixedwoods. For technical reasons, this was not done through a formal assessment of the propagation through the entire analysis of the errors estimated in the spatial data sets. Just the production of the data sets used for the analyses described in Chapter 5 required several machine-hours and several Gigabytes of GIS maps. To produce the number of alternate map realisations necessary for a proper error propagation analysis with the Monte Carlo method would have required an amount of computing power and storage capacity beyond the possibilities and the scope of this thesis. Hence, I will limit myself to drawing conclusions from what I have learned from the error analysis that I have conducted at the level of the individual data sets, as well as from the GIS and remote sensing operations that produced the GIS layers used in the analysis. 6.1.1: Forest cover classification The forest cover classification performed below my expectations. The accuracy level of my classification was below the levels obtained by Ghitter et al. (1995) and by Hall et al. 129 (2000) for relatively similar stands in Alberta, especially considering that I employed a coarser classification scheme. It must be considered that my error analysis tables are relatively conservative, since they refer to a "hardening" of the soft classification employed in the analysis. The Bayesian classification has a higher level of information content than its "hard" equivalent, since it incorporates knowledge about the uncertainty of the classification, as discussed for fuzzy classifications by Zhang and Stuart (2001). When the Bayesian classifier is "hardened" to create a categorical map, the uncertainty information is ignored, and each pixel is assigned to the class with the highest posterior probability. Incorrectly classified pixels are all considered equally erroneous, independently of the uncertainty information that is incorporated in the classification, and subsequently used in the analysis. Unfortunately, accounting for uncertainty in the classification greatly increases the complexity of the accuracy assessment procedure, as is shown in the methods proposed by Gopal and Woodcock (1994) and Woodcock and Gopal (2001) for crisp categorical maps with fuzzy ground control points. A parametric error analysis of a probabilistic classification would have required a much larger number of sample points than I had available. Even considering the conservative nature of the accuracy assessment, the classification results are far from satisfactory. This can be probably attributed to the use of only 3 TM bands in the classification. For a classification of Alberta mixedwood stands, Ghitter et al. (1995) showed that, while the use of all 7 TM bands did not produce better class separability than the use of bands 3, 4 and 5, it had a far better classification accuracy. Another component of my classification procedure that could be improved is the selection of training sites. Because the study was initially limited to map sheet 94b040, most training sites, as well as most ground control points, were located in this general area. This area 130 lacked good targets from some of the forest cover classes, especially the pure white spruce and mature lodgepole pine classes, and was partly covered by clouds in the March image. It is also surprising that the introduction of the FIP information resulted in little improvement in the classification accuracy. This could be partly attributed to accuracy problems in the FIP information. The most recent forest inventory audit for the Fort St. John TSA was aimed at assessing the accuracy of the volumes and site index labels, rather than the accuracy of the cover labels, but, in the case of the forest versus non-forest labels, the reported accuracy was 86% on a sample size of thirty (Resources Inventory Branch, 2000). However, the limited impact of the FIP data on the classification results suggests that most of the classification accuracy problems can be attributed to the remote sensing information. The outputs of the Bayesian classifier of Idrisi32 had a strongly bimodal distribution, with most pixels having probabilities close to 0 for every class except for one. In this situation, there was little possibility for the FIP information to make a difference. It is possible that a different GIS-remote sensing integration methodology might have produced better results. Rule-based methods like the one used by Hall et al. (2000) in Alberta, or the methods recently proposed by Wulder and Franklin (2001), might prove to be more successful. The introduction of hybrid segmentation procedures (e.g. Stuckens et al, 2000) could be also interesting, but somehow problematic. While segmentation is likely to improve classification accuracy, it imposes decisions about the scale at which the classification takes place. This is a very relevant issue in mapping of mixedwoods where the mixture of conifer and hardwoods occurs at a very fine scale as well as in clumps. 131 6.1.2: Digital Elevation Models While I had concerns about DEM quality at the outset of this project, especially considering the low-relief topography of the study area, I did not expect to encounter the systematic error patterns that I found in the TRIM data sets: the collection-line artefacts described in Chapter 4. The TRIM DEM data model, based directly on a TIN structure, is unique to British Columbia, so there is little information in the literature on the applications of such a DEM structure in forestry and environmental sciences. Systematic error patterns similar to those found in the TRIM DEMs of the study area have been described for some USGS Level 1 7.5-minutes DEMs (Garbrecht and Starks, 1995; Garbrecht and Martz, 2000). There is an important difference in the data structure between the TRIM DEMs and the USGS DEMs; while the USGS DEMs are distributed as raster maps, the original TIN structure is available for TRIM. This enables the application of a line-based cross-smoothing filter, which proved to be effective in mitigating the line-collection artefacts. It is important to notice that the application of the cross-smoothing filter is based on the preservation of entry order in TRIM elevation points. If the point order in the database is shuffled by subsequent GIS operations, and is not recorded in the associated database, it becomes almost impossible to reconstruct the collection lines and apply the appropriate filter. As for non-systematic errors, the analysis of the error propagation in sections 4.2.1.1 and following showed that fully random or even isotropic spatially autocorrelated errors have a limited influence on the estimation of local shape parameters, especially when the estimation window is of large enough dimensions to reduce the effects of the surface simplification operated by the TRIM sampling scheme. The scale of terrain analysis does not appear to be particularly important, as there is only a limited difference in the electivity index tables for the range of analysis scales used. 132 This is quite likely a consequence of the specific terrain of the study area, which alternates very gentle topography with deeply cut small valleys, so that there is little in the way of intermediate topographic features. The use of more sophisticated algorithms for drainage analysis specifically designed for low relief topography, like the one developed by Martz and Garbecht (1998), might provide the kind of DEM-derived drainage information that I chose not to use because of the limited ability of the automatic drainage algorithms of Idrisi32 or Arc View 3.2 to perform well in areas of low relief. 6.2: ECOLOGICAL SIGNIFICANCE OF THE SPA TIAL ANALYSIS FINDINGS 6.2.1: Spatial association of terrain classes and forest cover. The analysis of the spatial association between forest cover and topographic classes provides some interesting results, suggesting that deciduous stands, mixedwoods and white spruce dominated stands can be located along a moisture and nutrient gradient, since deciduous stands are associated with convex topographic features, usually ridge tops and crests, while spruce dominated stands are associated with concave topographic features which occur at the bottom of the slopes. A similar relation was shown by Bridge and Johnson (2000) for the boreal mixedwoods of Saskatchewan, where the topographic gradient was described using allometric measures of hillslopes from topographic maps. A direct effect of the nutrient and moisture regimes on the competitive ability of different species is not the only mechanism by which terrain could influence the composition of the main tree vegetation. Different topographic features could be associated with different disturbance regimes, in terms of both fire frequency and fire intensity. To explain longer fire intervals for the bottoms of big river valleys, Cumming (2000) suggests that fire ignition and spread in areas of lower slope position might be limited by soil and fuel moisture, while 133 crests might be exposed to more frequent lightning strikes. Also, if fire is more likely to move up slopes, upper topographic positions would be exposed to fires more frequently even if ignition probabilities were not correlated to topographic position. Different topographic positions might also have different probabilities of experiencing specific fire intensities. This might then drive post-fire succession by controlling seedbed availability for white spruce as well as the quantity of viable aspen roots and lodgepole pine serotinous cones left on the site, as in the model suggested by Lieffers et al. (1996a). Since both Larsen (1997) and Cumming (2001) report differences in fire return intervals between deciduous dominated, mixed and conifer dominated stands, it is difficult to assess from only the spatial distribution of the forest cover types if the relation with terrain could be a cause or an effect of disturbance frequencies. However, all of the studies cited above are from either Alberta or Saskatchewan, where lodgepole pine is not present. In my study area, lodgepole pine is a component of the boreal mixedwoods, not only on extreme sites but also in mesic sites (Beckingham et al., 1996), forming mixtures with aspen, spruce, and spruce and aspen. If the forest cover classification maps could be improved to discriminate successfully between aspen-pine and aspen-spruce mixedwoods, they could be used to investigate the role of lodgepole pine in mixedwood dynamics, particularly the relation between mixedwood stands with a lodgepole pine component and topographic position, stand age and spruce seed sources. 6.2.2: Distance from white spruce stands and probability of mixedwood presence. As discussed in Chapter 5, the results of the analysis of HWD vs. MWD stands and their distance from SW stands supports the hypothesis that white spruce seed dispersal is acting as a limiting factor in the establishment of mixedwood stands. I believe this is an 134 interesting result. It is obvious that a mixedwood with a spruce component cannot be established if white spruce has no seed source from which to enter a recently disturbed stand. This is a consequence of the fire ecology and life history traits of white spruce, which does not develop a seed or bud bank, since it has neither serotinous cones nor the ability to sucker from roots as its competitors (Zasada et al, 1992; Greene et al, 1999; Bourgeau-Chavez et al, 2000a). Greene and Johnson, who have intensively studied and modelled the regeneration dynamics of boreal forests (Greene and Johnson, 1989, 1994, 1995, 1996, 1999), estimate that abundant regeneration of white spruce in bums is limited to about 70 m from the burn edge, if the edge is a spruce seed source. Under the assumption of very low spruce mortality during stand development, they estimate that at maturity white spruce will reach an abundance equivalent to surrounding stands only up to 600 m from the edge of the burn (Greene and Johnson, 2000). However, this does not necessarily imply that the seed dispersal limitation of spruce regeneration actually occurs in the natural landscape dynamics of boreal mixedwood stands: the convoluted shape of bums as well as the presence of seed sources in bum islands may minimise the occurrence of situations in which seed rain is the limiting factor to white spruce establishment (Greene and Johnson, 2000). The results of my analysis suggest that distance from seed sources does indeed play a limiting role in boreal mixedwood dynamics, and that the shape of fires or the presence of residual stands do not offset the dispersal limitation of white spruce. This supports the idea that landscape context is an important driver in mixedwood succession not only in managed, but also in natural landscapes. If fires in the BWBS have differential burning preferences, as Cumming (2001) has shown for northeastern Alberta, disturbance could be reinforcing the 135 dispersal limitation in those cases where residual stands and burn edges correspond mainly to deciduous stands. The results from this study are far from conclusive, given the quality of the forest cover information. If one were to obtain better forest cover information, it would be possible to extend the spatial analysis of P(HWD) distance dependence on SW stands to account for a series of factors that were not included in the present analysis. First of all it would be interesting to produce a map of potential white spruce seed rain instead of just a distance map. The distance map accounts for the closest potential seed source, but does not account for the overlapping effect of multiple seed sources, nor for the differences in strength between the different seed sources. The possibility of seed dispersal from the older mixedwood stands is also not accounted for, nor are those seed sources that have been lost relatively recently to natural disturbance, logging or land clearing, but which might have played a relevant role, prior to their loss, in determining the species composition of the present landscape. Ideally, the analysis of distance relations should be limited to those stands that have been disturbed recently enough that most of the seed sources that played a role in succession since stand replacement can be accounted for. The predictor variable should be a cumulative measure of potential seed rain instead of the simple distance from the closest seed source. The limitation to conducting such an analysis is not in analytical techniques or in the availability of an adequate seed dispersal model, but in the quality of the forest cover information: the power of such a detailed analysis would be severely limited by high uncertainty in the forest classification map. 136 6.3: RECOMMENDATIONS FOR FURTHER RESEARCH Before any further investigation in the landscape ecology of the study area, I recommend the production of a reliable, high-resolution forest cover classification map. By high-resolution, I refer to a nominal resolution of 30 m, attainable with the Landsat T M sensor. This resolution is adequate to investigate landscape dynamics, and it has been successfully used in other landscape investigations of northern forests (He et al, 1998). Since it is hard to find good high-resolution data on the physical environment, any additional increase in resolution of the forest cover data is unlikely to provide benefits for landscape investigations, as the resolution of other data sources will become the analysis bottleneck. This forest cover map needs not be produced for the entire BWSB zone, nor I am suggesting a change in Forest Inventory practices. What I am advocating is the production of a map for research purposes, which exceeds the quality standards of regular Forest Inventory maps, and can be used to study landscape patterns. This forest cover map should be a raster and not a vector map. The vector data model implies the presence of distinct geographic objects to which attributes are attached (Burrough and MacDonald, 1998). In the case of forest inventory or Terrestrial Ecosystem Mapping, these objects are the polygons identified by an experienced photo-interpreter. This data model has several advantages in situations where there are sharp boundaries between stands, but where these boundaries do not exist, the use of a vector data model becomes problematic, and produces some serious limitations to landscape analysis. The first limitation is the absence of an explicit data resolution. It is true that a minimum mapping unit is defined in the mapping specifications, but its application is dependent on the distinctness of the polygon boundaries, and it involves a subjective judgement call from the interpreter. The consequence is that the minimum mapping unit (i.e. 137 the resolution or grain of the data) will be inconsistent between cover types as well as between interpreters. This has several consequences for the analysis of landscape patterns, which, as discussed in Chapter 2, is dependent on the resolution of the spatial data used. The size class distribution of the different forest cover types will be affected by these inconsistencies, as well as the spatial distribution of small polygons, since different areas will have been mapped by different interpreters at different times. These spatial inconsistencies will also affect the spatial distribution of measures of fragmentation, interspersion and landscape diversity. The second serious limitation of the vector data model is specific to the definition of mixedwoods. A mixedwood is a scale dependent concept, and very much related to the minimum mapping unit. Consider a set of clumps of 0.5 ha monocultures. These will be grouped into a mixedwood stand if the minimum mapping unit is 5 ha, while they will be identified as individual small stands if the minimum mapping unit is 0.1 ha. It is impossible to discriminate between intimate mixtures and clumped mixtures when the clumps are smaller than the minimum mapping unit. However, a variable and unclear minimum mapping unit confuses the matter further, since the scale at which a mixedwood has been defined is inconsistent between different mixedwood polygons. The use of a pixel based classification mapping (i.e. the use of a raster model) clearly defines what the minimum mapping unit is. This does not imply that the classification procedure has to operate exclusively on individual pixels in isolation. Contextual information can be introduced to improve the accuracy (Stuckens et al., 2000), but the segmentation procedure should be designed so that it is consistent across the cover classes as well as across the classification area. In the production of this high-resolution forest cover map, I recommend the investment of sufficient resources to obtain a satisfactory level of discrimination between 138 mixedwood classes as well as a satisfactory level of classification accuracy. The classification should also include uncertainty information, and possibly a spatial model of the classification errors. This will probably require the use of more sophisticated remote sensing classification techniques than those that I have employed, and possibly a larger selection of satellite images and image channels. It will also require a larger number of training sites and ground control points. However, an accurate classification is crucial for any quantitative measurement of spatial patterns of the mixedwood landscape. If the classification is unbiased, global estimates of abundance of the different classes are reasonably robust over large areas. The same cannot be said for spatial measures. The calculation of landscape indices, for example, is so affected by the spatial accuracy of the information that even simple GIS operations, such as converting the data from one format to another, can significantly affect the outcome (Bettinger et al., 1996). In absence of a spatial model of the classification errors and in the presence of high classification errors, it is even impossible to assess the potential consequences of the propagation of classification errors into the calculation of the most common indices of landscape patterns. The procedures proposed by Heuvelink (1998) or Fisher (1998) to assess the propagation of errors in the spatial operations are likely to simply tell us that analysis results are completely unreliable. I believe that the mapping I am advocating is indispensable to any investigation of mixedwood patterns that wants to exceed the limitations presented by the forest inventory spatial information, especially in the description of small scale patterns. If the spatial patterns of natural landscapes are to be a reference and a guideline for managed mixedwood landscape, the reference measurements should encompass the entire range of scales at which forest management can be practised. The currently available spatial data only captures the medium to large scale patterns well, and is seriously limited in its capacity to describe small 139 scale (-0.33 ha of resolution) patterns, the scale of the patterns that could be created using group selection silviculture. The ability to describe small scale patterns is related to the need to produce an ecologically sound and scale-explicit definition of mixedwoods. Since a mixedwood is a scale dependent concept, I believe that it is very important to identify the range of scales where defining a mixedwood is ecologically meaningful; that is, the degree of mixture intimacy under which a mixedwood stand has ecological characteristics and processes that appreciably differ from the sum of its components in monoculture. This is a complex research problem, because there are many processes that are likely to behave differently in a mixture, even a clumped one, than in a monoculture, and they are likely to occur at different spatial scales. For example, the influence of a group of deciduous trees on a nearby group of conifers will be exerted up to a different distance according whether one considers light competition, root competition, or litter dispersal. This distance will also change through time as the relative heights and the structure of neighbouring stands change. Other processes might have an influence at much larger scales: seed dispersal, probability of fire ignition and spread, and herbivory through habitat relations. I would recommend proceeding with research on two fronts. Critical scales of mixedwood aggregations existing on the landscape should be quantified using the analysis of landscape patterns based on high resolution forest cover mapping. At the same time, spatially explicit individual tree simulation should be used to investigate at which critical scales of aggregations does a mixedwood start to behave as a set of individual monocultures. A final recommendation relates to the study of the role of lodgepole pine in the boreal mixedwood dynamics in British Columbia. Since lodgepole pine is a not relevant component of boreal mixedwoods in most of the north American boreal forest (Bourgeau-140 Chavez et al., 2000b), its role in ecological succession in boreal mixedwoods is not accounted for in most of the boreal mixedwood literature. However, in western Alberta and British Columbia, lodgepole pine is an important component of mesic ecosites associated with mixedwoods (Beckingham et al, 1996). Mixedwoods with a lodgepole pine component are likely to have different successional dynamics than mixedwoods composed of only aspen and white spruce, considering that lodgepole pine has a very different autoecology and fire ecology from both aspen and white spruce (Bourgeau-Chavez et al, 2000a). I believe that some fundamental questions on the role of lodgepole pine need to be addressed. Do aspen and pine mixtures have a different fire regime from pure aspen stands? Is a burned aspen and pine stand more likely to be colonised by white spruce seed than a burn in pure aspen? Is recruitment of white spruce under aspen facilitated or made more difficult by the presence of pine in the main canopy? Answering these questions is extremely important to an understanding of mixedwood dynamics in the BWBSmwl. 6.4: CONCLUSIONS The results of Section 5.2, where significant expected and unexpected associations between forest cover and topographic classes were detected, show that, despite its limitations, the remote sensing classification I used is adequate for use in spatial analysis. The results also show that TRIM DEMs can provide relevant topographic information even in an area of such limited relief, and that this method of analysis has the potential for providing important ecological information. At the same time my experience shows that in working with large scale spatial data produced by various agencies one is likely to encounter several data quality issues, both expected and unexpected. In dealing with these data quality problems I had to develop some novel methods as well as extend known ones: I developed the line-based cross-smoothing to 141 mitigate the effect of systematic errors in TRIM data. I expanded Wood's (1996; 1998) geomorphometry techniques to accommodate for uncertainty in the classification of terrain using fuzzy sets theory. I also expanded Florinsky's (1998a) work on error propagation to topographic variables derived from digital elevation models, so that I could account for spatial autocorrelation in the error. I used a new measure of excessive generalisation of the DEM using the chi-squared test, and tested the use of fractal surfaces to asses the effect of the TRIM sampling scheme on topographic variables. I also expanded Pastor and Broschart's (1990) application of Ivlev's electivity index to landscape ecology in two important ways: by incorporating uncertainty into the electivity index, and by the introduction of the m parameter, which accounts for the presence of spatial autocorrelation in the geographical data. Al l these methodological innovations were necessary to address the specific nature of large scale data, and indicate that practitioners of landscape ecology should be wary of taking large scale data at face value, without considering the implications of the errors inherent in the data sets and their possible propagation through analysis. At the moment, the possibility of additional analysis of the study area with this data is primarily limited by the quality of the forest cover information, while the DEM-derived classification does not appear to be a limiting factor. Especially restrictive are the high classification error rates for the white spruce dominated stands, as well as the inability to discriminate between At/Pi, At/Pl/Sw and At/Sw mixtures. The landscape level analysis of the relation between forest cover and terrain strongly suggests that terrain is playing a role in the distribution of the different types of boreal mixedwood forest stands. The role of terrain could be mainly a direct effect of moisture, nutrient and temperature regimes on the competitive ability of the different tree species, or it could be a control of topography on disturbance frequency or intensity, or both these 142 processes might be going on a the same time, mutually reinforcing each other. If the boreal mixedwood is a shifting mosaic continuously rearranged by disturbance, this is happening within the constraints set by the permanent features of landscape topography. While any kind of mixedwood successional pathway might still be possible on any kind of site, given the right set of initial conditions, my results indicate that the likelihood of a specific pathway are influenced enough by the topography of the site to produce a detectable signal in the spatial patterns at the landscape level. The analysis of mixedwood and hardwood stand distribution in relation to spruce stands suggests that the rates and/or the outcomes of succession in the mixedwood are significantly influenced by the spatial arrangement of the stands on the landscape. This supports the hypothesis that there are important ecological processes happening at the landscape scale, which cannot be explained by stand-level conditions alone. Dispersal-limitation of the seed availability of white spruce after disturbance is the most likely explanation for the patterns observed, suggesting that distance from seed sources does indeed play a limiting role in boreal mixedwood dynamics, which is not counterbalanced by the shape of fires or by the presence of residual stands, nor obliterated by the characteristics of disturbance and by the competitive abilities of aspen and lodgepole pine. Overall the results support the idea, advocated by Andison and Kimmins (1999), that British Columbia boreal mixedwoods can be understood and managed only considering large scale processes, as well as stand level processes. 143 R E F E R E N C E S Alexander, M.E. and Euler, D.L. 1980. Ecological Role of Fire in the Uncut Boreal Mixedwood Forest. In Boreal Mixedwood Symposium, Thunder Bay, Ont. Edited by: R. D. Whitney and K. M. McClain. Allen, T.F.H., O'Neill, R.V. and Hoekstra, T.W. 1984. mterlevel Relations in Ecological Research and Management: Some Working Principles From Hierarchy Theory. General Technical Report, RM-110. USDA Forest Service. Allen, T.F.H. and Starr, T.B. 1982. Hierarchy. Perspectives for Ecological Complexity. University of Chicago Press, Chicago. Andison, D.W. and Kimmins, J.P. 1999. Scaling up to understand British Columbia's boreal mixedwoods. Environmental Reviews, 7: 19-30. Barbour, M.G., Burk, J.H. and Pitts, W.D. 1987. Terrestrial Plant Ecology. The Benjamin/Cummings Publishing Company, Menlo Park, California. Beals, E.W. 1973. Ordination: Mathematical elegance and ecological naivete. Journal of Ecology, 61: 23-36. Beckingham, J.D., Corns, I.G.W. and Archibald, J.H. 1996. Field guide to ecosites of west-central Alberta. Special Report, 9. Canadian Forest Service, Northwest Region, Northern Forestry Center. Bellehumeur, C. and Legendre, P. 1998. Multiscale sources of variation in ecological variables: modelling spatial dispersion, elaborating sampling designs. Landscape Ecology, 13: 15-25. Bettinger, P., Bradshaw, G.A. and Weaver, G.W. 1996. Effects of geographic information system vector-raster-vector data conversion on landscape indices. Canadian Journal of Forest Research, 26: 1416-1425. Bhattacharyya, A. 1943. On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin Calcutta Mathematical Society, 35:99-109. Bijlsma, R.J. 1993. The characterization of natural vegetation using first-order and texture measurements in digitized, colour-infrared photography. International Journal of Remote Sensing, 14: 1547-1562. Borcard, D., Legendre, P. and Drapeau, P. 1992. Partialling out the spatial component of ecological variation. Ecology, 73: 1045-1055. 144 Bourgeau-Chavez, L.L., Alexander, M.E., Stocks, B J . and Kasischke, E.S. 2000a. Distribution of Forest Ecosystems and the Role of Fire in the North American Boreal Region. In Fire, Climate Change, and Carbon Cycling in the Boreal Forest. Edited by: Eric S Kasischke and Brian J Stocks. Ecological Studies, New York, pp. 111-131. Bourgeau-Chavez, L.L., Kasischke, E.S., Mudd, J.P. and French, N.H.F. 2000b. Characteristics of Forest Ecozones in the North American Boreal Region. In Fire, Climate Change and Carbon Cycling in the Boreal Forest. Edited by: Eric S Kasischke and Brian J Stocks. Ecological Studies, pp. 258-273. Bridge, S.R.J, and Johnson, E.A. 2000. Geomorphic principles of terrain organization and vegetation gradients. Journal of Vegetation Science, 11: 57-70. Brunt, J.W. and Conley, W. 1990. Behavior of a multivariate algorithm for ecological edge detection. Ecological Modelling, 49: 179-203. Burrough, P.A. and MacDonald, R.A. 1998. Principles of geographical information systems. Oxford University Press, Oxford. Carrara, A., Bitelli, G. and Carla, R. 1997. Comparison of techniques for generating digital terrain models from contour lines. International Journal of Geographical Information Science, 11: 451-473. Catto, N. , Liverman, D.G.E., Bobrowsky, P.T. and Rutter, N . 1996. Laurentide, Cordilleran and Montane glaciation in the western Peace River - Grand Praire region, ALberta and British Columbia, Canada. Quaternary International, 32: 21-32. Chavez, P.S. 1988. An improved dark-object subtraction technique for atmospheric scattering correction of multispectral data. Remote Sensing of Environment, 24: 459-479. Constabel, A.J. and Lieffers, V.J. 1996. Seasonal patterns of light transmission through boreal mixedwood canopies. Canadian Journal of Forest Research, 26: 1008-1014. Coops, N.C., Gallant, J.C, Loughhead, A.N., Mackey, B.J., Ryan, P.J., Mullen, I.C. and Austin, M.P. 1998. Developing and testing procedures to predict topographic position from Digital Elevation Models (DEMs) for species mapping (Phase 1). Report to Environment Australia, Canberra. Crippen, R.E. 1989. A Simple Spatial Filtering Routine for the cosmetic Removal of Scan-Line Noise from Landsat TM P-Tape Imagery. Photogrammetric Engineering and Remote Sensing, 55: 326-331. Cumming, S.G. 1997. Landscape Dynamics of the Boreal Mixedwood Forest. Ph.D. Thesis, University of British Columbia, Vancouver. Cumming, S.G. 2000. A synopsis of fire research in the boreal mixedwood forest. Technical Report, BERL-2000-02. Boreal Ecosystems Research Ltd., Edmonton, Alberta. 145 Cumming, S.G. 2001. Forest type and wildfire in the Alberta boreal mixedwood: What do fires burn? Ecological Applications, 11: 97-110. Cumming, S.G., Schmiegelow, F.K.A. and Burton, P.J. 2000. Gap dynamics in boreal aspen stands: is the forest older that we think? Ecological Applications, 10: 744-759. DeLong, C. 1988. A field guide for the identification and interpretation of serai aspen ecosystems of the BWBScl, Prince George Forest Region. Land Management Handbook, 16. BC Ministry of Forests, Victoria. DeLong, C. 1989. Dynamics of boreal mixedwood ecosystems. In Northern Mixedwood '89, 1991, Fort St. John, BC. Edited by: A. Shortreid. DeLong, H.B., Lieffers, V.J. and Blenis, P.V. 1997. Microsite effects of first-year establishment and overwinter survival of white spruce in aspen-dominated boreal mixedwoods. Canadian Journal of Forest Research, 27: 1452-1457. Demarchi, D.A. 1995. Ecoregions of British Columbia. Fourth Edition. British Columbia Wildife Branch, Ministry of Environment, Lands and Parks, Victoria, BC. Map (1:2,000,000). Dillon, W.R. and Goldstein, M . 1984. Multivariate Analysis. Methods and Applications. John Wiley and Sons, New York. Eastman, J.R. 1999. Idrisi32 Guide to GIS and Image Processing, Volume 2. Clark Labs, Worcester, MA. Evans, I.S. 1979. An integrated system of terrain analysis and slope mapping. Final report on grant DA-ERO-591-73-G0040, University of Durham, England. Evans, I.S. 1980. An integrated system of terrain analysis and slope mapping. Zeitschrift fur Geomorphologie, Supp l - B d 36: 274-295. Farstad, T., Lord, T.M., Green, A.J. and Hortie, H.J. 1965. Soil Survey of the Peace River Area in British Columbia. British Columbia Soil Survey, 8. Canada Department of Agriculture Research Station, Vancouver, BC. Fisher, P. 1997. The pixel: A snare and a delusion. International Journal of Remote Sensing, 18: 679-685. Fisher, P. 1998. Improved Modeling of Elevation Error with Geostatistics. Geolnformatica, 2: 215-233. Florinsky, I.V. 1998a. Accuracy of local topographic variables derived from digital elevation models. International Journal of Geographical Information Science, 12: 47-61. Florinsky, I.V. 1998b. Combined analysis of digital terrain models and remotely sensed data in landscape investigations. Progress in Physical Geography, 22: 33-60. 146 Florinsky, I.V. and Kuryakova, G.A. 1996. Influence of topography on some vegetation cover properties. Catena, 27: 123-141. Forman, R.T.T. and Gordon, M. 1984. Landscape Ecology. John Wiley and Sons, New York. Fortin, M.J. 1997. Effects of data types on vegetation boundary delineation. Canadian Journal of Forest Research, 27: 1851-1858. Franklin, S.E. 1987. Terrain analysis from digital patterns in geomorphometery and landsat MSS spectral response. Photogrammetric Engineering and Remote Sensing, 53: 59-65. Franklin, S.E. and Peddle, D.R. 1990. Classification of SPOT HRV imagery and texture features. International Journal of Remote Sensing, 11: 551-556. Garbrecht, J. and Martz, L.W. 2000. Digital elevation model issues in water resources modeling. In Hydrology and Hydraulic Modeling Support with Geographic Information Systems. Edited by: David Maidment and Dean Djokic. ESRI Press, Redlands, CA, pp. 1-28. Garbrecht, J. and Starks, P. 1995. Note on the use of USGS Level 1 7.5-Minute DEM coverages for landscape drainage analysis. Photogrammetric Engineering and Remote Sensing, 61: 519-522. GDBC 1992. Digital Baseline Mapping at 1:20 000. British Columbia Specifications and Guidelines for Geomatics, Content Series Volume 3. Geographic Data BC. GDBC 1999. Terrain Resource Information Management Program, URL: http://www.env.gov.bc.ca/gdbc/trim.htm Last accessed: April 27, 2000, Ghitter, G.S., Hall, R.J. and Franklin, S.E. 1995. Variability of Landsat Thematic Mapper data in boreal deciduous and mixed-wood stands with conifer understory. International Journal of Remote Sensing, 16: 2989-3002. Gilalbert, M.A., Conese, C. and Maselli, F. 1994. An atmospheric correction method for the automatic retrieval of surface reflectances from TM images. International Journal of Remote Sensing, 15: 2065-2086. Gopal, S. and Woodcock, C E . 1994. Theory and methods for accuracy assessment of thematic maps using fuzzy sets. Photogrammetric Engineering and Remote Sensing, 60: 181-188. Greene, D.F. and Johnson, E.A. 1989. A model of wind dispersal of winged plumed trees. Ecology, 70: 339-347. Greene, D.F. and Johnson, E.A. 1994. Estimating the mean annual seed production of trees. Ecology, 75: 642-647. 147 Greene, D.F. and Johnson, E.A. 1995. Long distance wind dispersal of seed trees. Canadian Journal of Botany, 73: 1036-1045. Greene, D.F. and Johnson, E.A. 1996. Wind dispersal of seeds from a forest into a clearing. Ecology, 77: 595-609. Greene, D.F. and Johnson, E.A. 1999. Recruitment of Populus tremuloides, Pinus banksiana, and Picea mariana into burns in the mixedwood forest of central Saskatchewan. Canadian Journal of Forest Research, 29: 462-473. Greene, D.F. and Johnson, E.A. 2000. Tree recruitment from bum edges. Canadian Journal of Forest Research, 30: 1264-1274. Greene, D.F., Zasada, J.C, Sirois, L., Kneeshaw, D., Morin, H., Charron, I. and Simard, M . -J. 1999. A review of the regeneration dynamics of North American boreal forest tree species. Canadian Journal of Forest Research, 29: 824-839. Greig-Smith, P. 1952. The use of random and contiguos quadrats in the study of the structure of plant communities. Annals of Botany, New Series, 16: 293-316. Groot, A. and Carlson, D.W. 1996. Influence of shelter on night temperatures, frost damage, and bud break of white spruce seedlings. Canadian Journal of Forest Research, 26: 1531-1538. Gustafson, E.J. 1998. Quantifying Landscape Spatial Pattern: What Is the State of the Art? Ecosystems, 1: 143-156. Gustafson, E.J. and Parker, G.R. 1992. Relationships between landcover proportion and indeces of landscape spatial pattern. Landscape Ecology, 7: 101-110. Guth, P.L. 1999. Contour line "ghosts" in USGS Level 2 DEMs. Photogrammetric Engineering and Remote Sensing, 65: 289-296. Hall, R.J. and Klita, D.L. 1997. Remote Sensing-GIS integration: progress toward defining a conifer understory classification system for use with Landsat TM data. In GER '97, Ottawa, Ontario. Hall, R.J., Peddle, D.R. and Klita, D.L. 2000. Mapping conifer understory within boreal mixedwoods from Landsat TM satellite imagery and forest inventory information. Forestry Chronicle, 76: 887-902. Haralick, R.M. 1983. Ridge and valley on digital images. Computer Vision, Graphics and Image Processing, 22: 28-38. Hargis, CD. , Bissonette, J.A. and David, J.L. 1998. The behavior of landscape metrics commonly used in the study of habitat fragmentation. Landscape Ecology, 13: 167-186. 148 He, H.S., Mladenoff, D.J., Radeloff, V.C. and Crow, T.R 1998. Integration of GIS data and classified satellite imagery for regional forest assessment. Ecological Applications, 8: 1072-1083. Heinselman, M.L. 1980. Fire and Succession in the Conifer Forests of Northern North America. In Forest Succession. Concepts and Applications. Edited by: Darrel C. West, Herman H. Shugart and Daniel B. Botkin. Springer Verlag, New York Heidelberg Berlin. Heuvelink, G.B.M. 1998. Error propagation in environmental modelling with GIS. Taylor & Francis, London. Hulshoff, R.M. 1995. Landscape indicies describing a Dutch landscape. Landscape Ecology, 10:101-111. Hunter, G.J. and Goodchild, M.F. 1997. Modeling the uncertainty of Slope and Aspect Estimates Derived from Spatial Databases. Geographical Analysis, 29: 35-49. Jacobs, J. 1974. Quantitative measurement of food selection: a modification of the forage ratio and Ivlev's electivity index. Oecologia, 14: 413-417. Jarvis, P.G. 1993. Prospects for Bottom-Up Models. In Scaling Physiological Processes from Leaf to Globe. Edited by: J.R. Ehleringer and C B . Field. Academic Press, San Diego. Jelinski, D.E. and Wu, J. 1996. The modifiable areal unit problem and implications for landscape ecology. Landscape Ecology, 11: 129-140. Johnson, E.A. 1992. Fire and vegetation dynamics: Studies from the North American Boreal Forest. Cambridge University Press., Cambridge. Jones, K.H. 1998. A Comparison of Two Approaches to Ranking Algorithms Used to Compute Hill Slopes. Geolnformatica, 2: 235-256. Jonsson, B.G. and Moen, J. 1998. Patterns in species associations in plant communities: the importance of scale. Journal of Vegetation Science, 9: 327-332. Kabzems, R. and Lousier, J.D. 1992. Regeneration, Growth and Development of Picea glauca under Populus spp. Canopy in the Boreal White and Black Spruce Zone. Canada - British Columbia Partnership Agreement on Forest Resource Development FRDAII, Victoria, BC. Kimmins, J.P. 1997. Forest Ecology. Prentice - Hall, London. Kolasa, J. and Rollo, C D . 1991. Introduction: The Heterogeneity of Heterogeneity: A Glossary. In Ecological Heterogeneity. Edited by: Jurek Kolasa and Steward T. A. Pickett. Ecological Studies. Springer-Verlag, New York, pp. 1-24. 149 Kushwaha, S.P.S., Kuntz, S. and Oesten, G. 1994. Applications of image texture in forest classification. International Journal of Remote Sensing, 15: 2273-2284. Lam, N.S.N. 1983. Spatial interpolation methods: a review. American Cartographer, 10: 129-149. Larsen, C.P.S. 1997. Spatial and temporal variations in boreal forest fire frequency in northern Alberta. Journal of Biogeography, 24: 663-673. Lee, J., Snyder, P.K. and Fisher, P.F. 1992. Modelling the effect of data errors on feature extraction from digital elevation models. Photogrammetric Engineering and Remote Sensing, 58: 1461-1467. Legendre, P. 1993. Spatial autocorrelation: trouble or new paradigm? Ecology, 74: 1659-1673. Li , M . and De Dapper, M . 1996. Geomorphometric varaibles extraction based on a grided digital terrain model (DTM). In GIS & Remote Sensing: Research, Development & Application. The procceedings of Geoinformatics '96., West Palm Beach, Florida, USA. Edited by: W. Guan, B. L i , T. Lo, S. L. Shaw and Q. Zhou. pp. 90-100. L i , Z. 1988. On the measure of digital terrain model accuracy. Photogrammetric Record, 12: 873-877. Lieffers, V.J., Macmillan, R.B., MacPherson, D., Branter, K. and Stewart, J.D. 1996a. Semi-natural and intensive silvicultural systems for the boreal mixedwood forest. Forestry Chronicle, 72: 286-292. Lieffers, V.J., Stadt, K.J. and Navratil, S. 1996b. Age structure and growth of understory white spruce under aspen. Canadian Journal of Forest Research, 26: 1002-1007. Lord, T.M. and Green, A.J. 1986. Soils of the Fort St. John - Dawson Creek area, British Columbia. Land Resources Research Centre Contribution, 85-27. Research Branch, Agriculture Canada, Vancouver, BC. Low, J.J., Power, K. and Gray, S.L. 1996. Canada's Forest Inventory 1991: the 1994 version. Information Report, BC-X-362. Pacific Forestry Center, Victoria. Luttmerding, H.A., Demarchi, D.A., Lea, E.C., Meidinger, D. and Void, T. 1990. Describing forest ecosystems in the field, 2nd Edition. Manual, 11. B.C. Ministr. Env., Victoria, B.C. MacArthur, R.H. and Wilson, E.O. 1967. The theory of island biogeography. Princeton University, Princeton, NJ. Martz, L.W. and Garbecht, J. 1998. The treatment of flat areas and depressions in automated drainage analysis of raster digital elevation models. Hydrological Processes, 12: 843-855. 150 Mathews, W.H. 1980. Retreat of the last ice sheets in northeastern British Columbia and adjacent Alberta. Bullettin 331, Geological Survey of Canada. MathSoft 2000. S-Plus 2000 Guide to Statistics, 1. Data Analysis Products Division, MathSoft, Seattle, WA. Mcintosh, R.P. 1991. Concept and Terminolgy of Homogeneity and Heterogeneity in Ecology. In Ecological Heterogeneity. Edited by: Jurek Kolasa and Steward T. A. Pickett. Ecological Studies. Springer-Verlag, New York, pp. 24-46. Meidinger, D. and Pojar, J. 1991. Ecosystems of British Columbia. Special Report Series, 6. British Columbia Ministry of Forests. Monckton, C.G. 1994. An investigation into the spatial structure of error in digital elevation data. In Innovation in GIS 1. Edited by: Michael F. Worboys. Taylor & Francis, London, pp. 201-214. Musick, H.B. and Grover, H.D. 1991. Image textural measures as indexes of landscape patterns. In Quantitative Methods in Landscape Ecology. Edited by: M . G. Turner and R. H. Gardner. Springer, New York, pp. 77-104. Naveh, Z. and Lieberman, A.S. 1994. Landscape Ecology, Theory and Application. Springer-Verlag, New York. O'Neill, R.V., Hunsaker, C.T., Timmins, S.P., Jackson, B.L., Jones, K.B., Riitters, K.H. and Wickham, J.D. 1996. Scale problems in reporting landscape pattern at the regional scale. Landscape Ecology, 11: 169-180. O'Neill, R.V. et al. 1988. Indices of landscape pattern. Landscape Ecology, 1: 153-162. Orloci, L. 1975. Multivariate analysis in vegetation research. Dr. W. Junkb. v., Publishers, The Hague. Ostendorf, B. and Reynolds, J.F. 1998. A model of arctic tundra vegetation derived from topographic gradients. Landscape Ecology, 13: 187-201. Palmer, M.W. and Dixon, P.M. 1990. Small scale environmental heterogeneity and the analysis of species distributions along gradients. Journal of Vegetation Science, 1: 57-65. Pan, J.-J. and Chang, C.-I. 1992. Destriping of Landsat MSS images by filtering techniques. Photogrammetric Engineering and Remote Sensing, 58: 1417-1423. Pastor, J. and Broschart, M . 1990. The spatial pattern of a northern conifer-hardwood landscape. Landscape Ecology, 4: 55-68. Pellegrini, G.J. 1995. Terrain shape classification of digital elevation models using eigenvectors and Fourier transforms. Ph.D. Thesis, State University of New York, Syracuse. 151 Pickup, G. and Chewings, V.H. 1996. Correlations between DEM-derived topographic indices and remotely-sensed vegetation cover in rangelands. Earth Surface Processes and Landforms, 21: 517-529. Pielou, E.C. 1977. Mathematical Ecology. John Wiley and Sons, New York. Pierce, L.L. and Running, S.W. 1995. The effects of aggregating sub-grid land surface variation on large-scale estimates of net primary production. Landscape Ecology, 10: 239-253. Pike, R.J. 2000. Geomorphometry - diversity in quantitative surface analysis. Progress in Physical Geography, 24: 1-20. Piatt, T. and Denman, K.L. 1975. Spectral Analysis in Ecology. Annual Review of Ecology and Systematics, 6: 189-210. Qi, Y. and Wu, J. 1996. Effects of changing spatial resolution on the results of landscape pattern analysis using spatial autocorrelation indexes. Landscape Ecology, 11: 39-49. Quattrochi, D.A. and Pelletier, R.E. 1991. Remote Sensing for Analysis of Landscapes: An Introduction. In Quantitative Methods in Landscape Ecology. Edited by: M . G. Turner and R. H. Gardner. Springer, New York, pp. 51-76. Renkin, R.A. and Despain, D.G. 1992. Fuel moisture, forest type and lightning-caused fire in Yellowstone National Park. Canadian Journal of Forest Research, 22: 37-45. Resource Inventory Committee 1995. Terrestrial Ecosystem Mapping Methodology for British Columbia. Unpublished report, Ministry of Environment. Resources Inventory Branch 2000. Fort St. John TSA Inventory Audit, URL: http://www.for.gov.bc.ca/resinv/audits/forttsa.htm Last accessed: June 6, 2001. Resources Inventory Task Force 1999. Vegetation Resources Inventory - Photo Interpretation Procedures. Version 2.2. BC Ministry of Forests. Rossi, R.E., Mulla, D.J., Journel, A.G. and Franz, E.H. 1992. Geostatistical tools for modeling and interpreting ecological spatial dependence. Ecological Monographs, 62:277-314. Rowe, J.S. 1972. Forest Regions of Canada. Publication, No. 1300. Department of the Environment Canadian Forestry Service, Ottawa. . Skidmore, A.K. 1989a. A comparison of techniques for calculating gradient and aspect from a gridded elevation model. International Journal of Geographical Information Systems, 3: 323-334. Skidmore, A.K. 1989b. An expert system classifies eucalypt forest types using Thematic Mapper data and a Digital Terrain Model. Photogrammetric Engineering and Remote Sensing, 55: 1449-1464. 152 Skidmore, A.K. 1990. Terrain position as mapped from a gridded digital elevation model. International Journal of Geographical Information Systems, 4: 33-49. Stehmen, S.V. 1997. Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment, 62: 77-89. Stuckens, J., Coppin, P.R. and Bauer, M.E. 2000. Integrating contextual Information with per-pixel classification for improved land cover classification. Remote Sensing of Environment, 71: 282-296. Tanner, D., DeLong, C. and Eastman, A. 1996. Investigations of Planting White Spruce under a Trembling Aspen Canopy. In Silviculture of Temperate and Boreal Broadleaf-Conifer Mixtures. Edited by: P. G. Comeau and K. D. Thomas. Province of British Columbia. Ministry of Forests Research Program, pp. 114-121. Ter Braak, C.J.F. and Looman, C.W.N. 1995. Regression. In Data Analysis in Community and Landscape Ecology. Edited by: R. H. G. Jongman, C. J. F. Ter Braak and O. F. R. Van Tongeren. Cambridge University Press, Cambridge, UK, pp. 29-77. Thorpe, J.P. 1992. Patterns of Diversity in the Boreal Forest. In The Ecology and Silviculture of Mixed-Species Forests. A Festschrift for David M. Smith. Edited by: M . J. Kelty, B. C. Larson and C. D. Oliver. Kuwler Academic Publishers, Dordrecht. Turner, M.G. and Gardner, R.H. 1991. Quantitative Methods in Landscape Ecology. Springer, New York. Turner, S.J., O'Neill, R.V., Conley, W., Conley, M.R. and Humphries, H.C. 1991. Pattern and Scales: Statistics for Landscape Ecology. In Quantitative Methods in Landscape Ecology. Edited by: M . G. Turner and R. H. Gardner. Springer, New York, pp. 17-50. Watson, K. 1993. Processing remote sensing images using the 2-D FFT - Noise reduction and other applications. Geophysics, 58: 835-852. Watt, A.S. 1947. Pattern and process in the plant community. Journal of Ecology, 35: 1-22. Wier, J.M.H. and Johnson, E.A. 1998. Effects of escaped settlement fires and logging on forest composition in the mixedwood boreal forest. Canadian Journal of Forest Research, 28: 459-467. Wood, J. 1998. Modelling the Continuity of Surface Form Using Digital Elevation Models. In 8th International Symposium on Spatial Data Handling, July 1 l-15th, 1998, Vancouver, Canada. Edited by: T.K. Poiker and N.R. Chrisman. pp. 725-736. Wood, J.D. 1996. The geomorphological characterisation of Digital Elevation Models. Ph.D. Thesis, University of Leicester. 153 Woodcock, C E . and Gopal, S. 2001. Fuzzy set theory and thematic maps: accuracy assessment and area estimation. International Journal of Geographical Information Science, 14: 153-172. Wulder, M . 1998. Optical remote-sensing techniques for the assessment of forest inventory and biophysical parameters. Progress in Physical Geography, 22: 449-476. Wulder, M . and Franklin, S.E. 2001. Polygon decomposition with remotely sensed data: rationale, methods and application. Geomatica, 55: 451-462. Youngblood, A. 1995. Development patterns in young conifer-hardwood forests of interior Alaska. Journal of Vegetation Science, 6: 229-236. Zadeh, L.A. 1965. Fuzzy Sets. Information and Control, 8: 338. Zasada, C.J., Sharik, T.L. and Nygren, M . 1992. The reproductive process in boreal forest trees. In A system analysis of the global boreal forest. Edited by: H. H. Shugart, R. Leemans and G. Bonan. Cambridge University Press, Cambridge, UK. Zevenbergen, L.W. and Thorne, CR. 1987. Quantitative analysis of land surface topography. Earth Surface Processes and Landforms, 12: 47-56. 154
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Spatial analysis in a successional perspective : a...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Spatial analysis in a successional perspective : a boreal mixedwood landscape in northeastern British… Albani, Marco 2001
pdf
Page Metadata
Item Metadata
Title | Spatial analysis in a successional perspective : a boreal mixedwood landscape in northeastern British Columbia |
Creator |
Albani, Marco |
Date Issued | 2001 |
Description | Digital elevation models (DEMs), multitemporal Landsat TM images and forest inventory maps were use to study the spatial patterns of boreal mixedwoods in the Boreal White and Black Spruce zone of British Columbia. A Bayesian classification method was developed to produce a high resolution map of posterior probabilities of forest cover classes. The procedure employed the forest inventory maps as prior probabilities, while evidence was provided by the results of a maximum likelihood classifier applied to the satellite images. Topographic parameters, derived from the DEMs by fitting a conic surface equation, were used in a fuzzy-logic based classification to produce maps of terrain classes. A filtering technique, expressly developed to mitigate the systematic errors present, was applied to the DEMs before the classification. The resulting maps were employed to investigate the relations between terrain and vegetation using an appropriately modified version of Ivlev's electivity index. The analysis suggests that topography plays a role in the distribution of the different types of boreal mixedwood forest stands. As expected, black spruce (Picea mariana (Mill.) B.S.P.) and tamarack (Larix laricina (Du Roi) K. Koch) dominated stands were strongly associated with flat areas. Hardwood dominated stands were found significantly associated with convex slopes and ridges, white spruce (Picea glauca (Moench) Voss) dominated stands positively were associated with concave slopes and channels, and mixed stands were associated with neutral slopes. The forest cover maps were used to study the distribution of mixedwood and hardwood stands in relation to white spruce stands. The analysis employed a logistic regression model where topographic variables and stand age were used as covariates. The likelihood of a site being occupied by hardwoods instead of mixedwoods was found to be positively correlated with increasing distance from white spruce stands. This result suggests that the rates or the outcomes of succession in the mixedwood forest are influenced by the spatial arrangement of different stand types at the landscape level. The most likely explanation is the control of seed dispersal on white spruce seed availability in the landscape, although a direct causation cannot be inferred from this correlation analysis. |
Extent | 7705673 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-09-16 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0099698 |
URI | http://hdl.handle.net/2429/12832 |
Degree |
Doctor of Philosophy - PhD |
Program |
Forestry |
Affiliation |
Forestry, Faculty of |
Degree Grantor | University of British Columbia |
GraduationDate | 2001-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_2001-71425X.pdf [ 7.35MB ]
- Metadata
- JSON: 831-1.0099698.json
- JSON-LD: 831-1.0099698-ld.json
- RDF/XML (Pretty): 831-1.0099698-rdf.xml
- RDF/JSON: 831-1.0099698-rdf.json
- Turtle: 831-1.0099698-turtle.txt
- N-Triples: 831-1.0099698-rdf-ntriples.txt
- Original Record: 831-1.0099698-source.json
- Full Text
- 831-1.0099698-fulltext.txt
- Citation
- 831-1.0099698.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0099698/manifest