Open Collections

UBC Theses and Dissertations

Application of artificial neural networks for terrain stability mapping Pavel, Mihai 2003

APPLICATION OF ARTIFICIAL NEURAL NETWORKS FOR TERRAIN STABILITY MAPPING

by

MIHAI PAVEL

Dipl. Eng., Faculty of Forestry, University of Brasov, Romania, 1987
M.F., University of British Columbia, Vancouver, 1997

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
Department of Forestry - Forest Resources Management

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 2003
© Mihai Pavel, 2003

ABSTRACT

This thesis investigates terrain stability mapping using Artificial Neural Networks (ANN). Preliminary analyses were conducted to evaluate the numerous types of ANN and select the one considered most appropriate for this problem. Kohonen Self-Organizing Maps were selected to be used in this study. Self-Organizing Maps include in principle two architectures (paradigms): Learning Vector Quantization (LVQ) for supervised learning, and the Self-Organizing Map itself (SOM) for unsupervised learning. Both architectures were used in this thesis. Analyses were performed on two study areas in southwestern British Columbia. Data were stored in a Geographic Information System (GIS), and terrain analyzed was represented in the raster format. Analyses were conducted based on topographic and geomorphic terrain attributes. Both supervised and unsupervised analyses produced good results. The attributes most relevant to terrain stability mapping were identified as slope, elevation, aspect, and existing geomorphic processes.
In supervised mode, unstable terrain was delineated with accuracies of 94% and 95% for the two study sites, and unstable and potentially unstable terrain were delineated with accuracies of 91% and 82%, respectively. A comparison with a physically-based model showed that LVQ-based analyses yielded superior results. Unsupervised analyses also produced accurate terrain mappings, and SOM proved to have good explanatory power with respect to the influence of the attributes used.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER 1. INTRODUCTION
1.1. STATEMENT OF THE ISSUE
1.2. STATEMENT OF OBJECTIVES
1.3. ASSUMPTIONS USED IN THIS THESIS
1.4. SCIENTIFIC AND OPERATIONAL IMPLICATIONS OF THE RESEARCH
1.5. THESIS OUTLINE

CHAPTER 2. SLOPE STABILITY ASSESSMENT: AN OVERVIEW
2.1. DEFINITION AND CLASSIFICATION OF SLOPE MOVEMENTS
2.2. FACTORS INFLUENCING TERRAIN STABILITY
    2.2.1 Hydrologic and hydrogeologic factors
    2.2.2 Terrain and soil properties
    2.2.3 Vegetation influence
    2.2.4 Forest development
2.3. METHODS BASED ON PROBABILISTIC REASONING
    2.3.1. Principle (physically-based) models
    2.3.2. Statistical methods
2.4. METHODS BASED ON ARTIFICIAL INTELLIGENCE (AI)
    2.4.1. Subjective geomorphic mapping
    2.4.2. Expert Systems
2.5. METHODS BASED ON FUZZY SETS AND FUZZY LOGIC
2.6. METHODS BASED ON ANN
2.7. SUMMARY REMARKS

CHAPTER 3. GENERAL DESCRIPTION OF ARTIFICIAL NEURAL NETWORKS AND SELF-ORGANIZING MAPS
3.1. SUPERVISED AND UNSUPERVISED CLASSIFICATION
    3.1.1. Supervised classification
    3.1.2. Unsupervised classification
3.2. GENERAL DESCRIPTION OF ANN AND SELECTION OF ANN USED IN THIS STUDY
    3.2.1. General description of ANN
    3.2.2. Selection of the ANN used for terrain stability mapping
3.3. SELF-ORGANIZING MAPS
    3.3.1. Introduction
    3.3.2. Learning Vector Quantization (LVQ)
    3.3.3. The Self-Organizing Map (SOM)
    3.3.4. Comments on LVQ- and SOM-based data analysis, and interpretation of results

CHAPTER 4. STUDY AREAS
4.1. DESCRIPTION OF SEYMOUR STUDY SITE
4.2. DESCRIPTION OF JEUNE LANDING STUDY SITE

CHAPTER 5. DATA COLLECTION AND PRE-PROCESSING
5.1. DATA SUPPORT (STORAGE) AND MANAGEMENT
5.2. TYPES OF DATA AVAILABLE
    5.2.1. Topographic data
    5.2.2. Location of existing landslides
    5.2.3. Geomorphic data
5.3. DATA CODING
    5.3.1. Coding of ratio-type data
    5.3.2. Coding of nominal and ordinal data
5.4. PRE-PROCESSED DATA

CHAPTER 6. METHODOLOGY
6.1. THE MODELING PROCESS
6.2. EVALUATION OF RESULTS
6.3. SUPERVISED LEARNING
    6.3.1. Analyses based on topographic attributes
    6.3.2. Analyses based on geomorphic attributes
    6.3.3. Analyses based on topographic and geomorphic attributes
6.4. CROSS-VALIDATION OF THE MODEL
6.5. COMPARISON BETWEEN TERRAIN STABILITY ASSESSMENT METHOD DEVELOPED IN THIS STUDY AND AN EXISTING MODEL (SINMAP)
6.6. UNSUPERVISED LEARNING
    6.6.1. Unsupervised learning analyses
    6.6.2. Interpretation of SOM-based results

CHAPTER 7. RESULTS AND DISCUSSION
7.1. SUPERVISED LEARNING
    7.1.1. Results based on topographic attributes
    7.1.2. Results based on geomorphic attributes
    7.1.3. Results based on topographic and geomorphic attributes
7.2. RESULTS FOR CROSS-VALIDATION OF THE MODEL
7.3. MODEL PREDICTION VS. SINMAP
7.4. UNSUPERVISED LEARNING
    7.4.1. Results for unsupervised learning
    7.4.2. Interpretation of SOM-based results
7.5. SUMMARY OF RESULTS

CHAPTER 8. GENERAL DISCUSSION
8.1. DISCUSSION OF RESULTS AND IMPLEMENTATION OF THE MODEL
8.2. GENERAL DISCUSSION ON ANN-BASED TERRAIN STABILITY MAPPING
8.3. INSIGHTS AND POSSIBLE SECONDARY MODELS DERIVED FROM TERRAIN STABILITY ANALYSIS WITH SELF-ORGANIZING MAPS
    8.3.1. Temporal terrain stability analysis
    8.3.2. Implementation of terrain stability mapping with Self-Organizing Maps in a forest planning model
    8.3.3. Utilization of Self-Organizing Maps to predict debris travel distance
    8.3.4. Other comments on geomorphic attributes

CHAPTER 9. CONCLUSIONS

REFERENCES

APPENDIX 1 - GLOSSARY
APPENDIX 2 - TERRAIN SYMBOL
APPENDIX 3 - INTERPRETATION OF TERRAIN STABILITY CLASSES
APPENDIX 4 - CRITERIA FOR ASSIGNING TERRAIN STABILITY CLASSES
APPENDIX 5 - PREPARATION OF DEM
APPENDIX 6 - DESCRIPTION OF GEOMORPHIC ATTRIBUTES
APPENDIX 7 - DESCRIPTION OF THE MODELING PROCESS
APPENDIX 8 - DESCRIPTION OF THE IMPLEMENTATION PROCESS

LIST OF TABLES

TABLE 3.1 A TYPICAL SET OF TRAINING VECTORS
TABLE 5.1 CODING OF MATERIALS USING THE 1-OF-N CODING METHOD - EXAMPLE
TABLE 5.2 THE NEW CODING SYSTEM FOR SURFICIAL EXPRESSION - EXAMPLE
TABLE 6.1 CORRELATION COEFFICIENTS FOR SEYMOUR
TABLE 6.2 CORRELATION COEFFICIENTS FOR JEUNE LANDING
TABLE 6.3 RESULTS OF MDA - CANONICAL CORRELATION COEFFICIENTS
TABLE 6.4 RANKING OF ATTRIBUTES BASED ON MDA
TABLE 6.5 SCENARIOS ANALYZED WITH COMBINATIONS OF TOPOGRAPHIC ATTRIBUTES
TABLE 6.6 ANALYSES PERFORMED WITH GEOMORPHIC ATTRIBUTES
TABLE 6.7 LIST OF INITIAL AND FINAL PARAMETERS USED IN SINMAP
TABLE 7.1 SEYMOUR - RESULTS FOR ANALYSES BASED ON TOPOGRAPHIC ATTRIBUTES
TABLE 7.2 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON TOPOGRAPHIC ATTRIBUTES
TABLE 7.3 SEYMOUR - RESULTS FOR ANALYSES BASED ON GEOMORPHIC ATTRIBUTES
TABLE 7.4 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON GEOMORPHIC ATTRIBUTES
TABLE 7.5 ANALYSES PERFORMED WITH THE REDUCED SET OF GEOMORPHIC AND TOPOGRAPHIC ATTRIBUTES
TABLE 7.6 SEYMOUR - RESULTS FOR ANALYSES BASED ON THE REDUCED SET OF TOPOGRAPHIC AND GEOMORPHIC ATTRIBUTES
TABLE 7.7 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON THE REDUCED SET OF TOPOGRAPHIC AND GEOMORPHIC ATTRIBUTES
TABLE 7.8 RESULTS OF UNSUPERVISED LEARNING ANALYSES
TABLE 7.9 SUMMARY OF RESULTS: PREDICTION ACCURACY (%) FOR VARIOUS TYPES OF ANALYSES

LIST OF FIGURES

FIGURE 2.1 DESCRIPTION OF THE INFINITE SLOPE MODEL
FIGURE 3.1 DEVELOPMENT OVER TIME OF MOST POPULAR ANN
FIGURE 3.2 LEARNING VECTOR QUANTIZATION (LVQ) NEURAL NET
FIGURE 3.3 SIMPLIFIED REPRESENTATION OF THE LVQ LEARNING RULE
FIGURE 3.4 REPRESENTATION OF NEIGHBOURHOODS IN SOM
FIGURE 3.5 A PURELY THEORETICAL EXAMPLE OF A SOM WITH SIX NEURONS
FIGURE 3.6 REPRESENTATION OF SOM CODEBOOK VECTORS
FIGURE 4.1 LOCATION OF THE TWO STUDY SITES
FIGURE 4.2 SEYMOUR - DEM, ROADS AND STREAMS
FIGURE 4.3 SEYMOUR - TERRAIN STABILITY MAPPING
FIGURE 4.4 SEYMOUR - DISTRIBUTION OF SURFICIAL MATERIALS
FIGURE 4.5 SEYMOUR - DISTRIBUTION OF SUBSURFICIAL MATERIALS
FIGURE 4.6 SEYMOUR - DISTRIBUTION OF GEOMORPHIC PROCESSES
FIGURE 4.7 JEUNE LANDING - DEM, ROADS AND STREAMS
FIGURE 4.8 JEUNE LANDING - TERRAIN STABILITY MAPPING
FIGURE 4.9 JEUNE LANDING - DISTRIBUTION OF SURFICIAL AND SUB-SURFICIAL MATERIALS
FIGURE 4.10 JEUNE LANDING - DISTRIBUTION OF GEOMORPHIC PROCESSES
FIGURE 5.1 OVERVIEW OF DATA PREPARATION AND PRE-PROCESSING
FIGURE 5.2 DEFINITION OF SPECIFIC CATCHMENT AREA
FIGURE 5.3 BUFFERING OF LANDSLIDES
FIGURE 5.4 VISUAL REPRESENTATION OF GEOMORPHIC PROCESSES
FIGURE 5.5 MAPPING OF GEOMORPHIC PROCESSES BASED ON THEIR RELEVANCE TO TERRAIN STABILITY
FIGURE 5.6 DISTRIBUTION OF SCA FOR SEYMOUR AND JEUNE LANDING
FIGURE 6.1 DESCRIPTION OF ANALYSES CONDUCTED FOR MODEL DEVELOPMENT AND TESTING
FIGURE 7.1 SEYMOUR - PREDICTION ERRORS FOR ANALYSES BASED ON TOPOGRAPHIC ATTRIBUTES
FIGURE 7.2 MAP OF SEYMOUR SHOWING PREDICTION USING SLOPE, ELEVATION, SCA, ASPECT, BASED ON EXISTING LANDSLIDES
FIGURE 7.3 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON TOPOGRAPHIC ATTRIBUTES
FIGURE 7.4 SEYMOUR - RESULTS FOR ANALYSES BASED ON GEOMORPHIC ATTRIBUTES
FIGURE 7.5 MAP OF SEYMOUR SHOWING PREDICTION BASED ON GEOMORPHIC ATTRIBUTES
FIGURE 7.6 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON GEOMORPHIC ATTRIBUTES
FIGURE 7.7 MAP OF JEUNE LANDING SHOWING PREDICTION BASED ON GEOMORPHIC ATTRIBUTES
FIGURE 7.8 SEYMOUR - RESULTS FOR ANALYSES BASED ON THE REDUCED SET OF TOPOGRAPHIC AND GEOMORPHIC ATTRIBUTES
FIGURE 7.9 MAP OF SEYMOUR SHOWING PREDICTION BASED ON SLOPE, GEOMORPHIC PROCESSES, ELEVATION, AND ASPECT
FIGURE 7.10 MAP OF SEYMOUR SHOWING PREDICTION BASED ON SLOPE, GEOMORPHIC PROCESSES, ELEVATION, AND ASPECT
FIGURE 7.11 JEUNE LANDING - RESULTS FOR ANALYSES BASED ON THE REDUCED SET OF TOPOGRAPHIC AND GEOMORPHIC ATTRIBUTES
FIGURE 7.12 MAP OF JEUNE LANDING SHOWING PREDICTION BASED ON SLOPE, GEOMORPHIC PROCESSES, ELEVATION, AND ASPECT
FIGURE 7.13 MAP OF JEUNE LANDING SHOWING PREDICTION BASED ON SLOPE, GEOMORPHIC PROCESSES, ELEVATION, AND ASPECT
FIGURE 7.14 SEYMOUR - RESULTS OF SINMAP ANALYSIS
FIGURE 7.15 SEYMOUR - RESULTS OF UNSUPERVISED LEARNING
FIGURE 7.16 JEUNE LANDING - RESULTS OF UNSUPERVISED LEARNING
FIGURE 7.17 MAP OF SEYMOUR SHOWING RESULTS OF SOM ANALYSIS
FIGURE 7.18 SEYMOUR - REPRESENTATION OF SOM
FIGURE 7.19 SEYMOUR - SOM INCLUDING ONLY STABLE NEURONS
FIGURE 7.20 SEYMOUR - RESULTS OF SOM ANALYSIS; A) GEOGRAPHICAL LOCATION OF STABLE NEURONS MAPPED ON THE SW CORNER OF SOM; B) DETAIL
FIGURE 7.21 SEYMOUR - RESULTS OF SOM ANALYSIS; A) GEOGRAPHICAL LOCATION OF NEURONS MAPPED IN THE CORNERS AND CENTER OF SOM; B) DETAIL
FIGURE 7.22 SEYMOUR - SOM DISPLAY BY ATTRIBUTE: A) ELEVATION; B) SLOPE; C) ASPECT; D) PLAN CURVATURE
FIGURE 7.23 SEYMOUR - SOM DISPLAY BY ATTRIBUTE: A) PROFILE CURVATURE; B) INTERACTION OF CURVATURES; C) SPECIFIC CATCHMENT AREA

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my academic advisors, Drs. J.D. Nelson and R.J.
Fannin, for providing the opportunity and technical support to carry out this project. Their advice and guidance from the initiation of this study to completion of the thesis were invaluable. I am grateful to the other members of my committee, Drs. J.L. Innes and T. Sayed, for their questions, reviews, and guidance during this project.

I thank D. Byng and P. Bavis from Western Forest Products Ltd. for their financial support and for providing the Jeune Landing data set. Their contribution was a very important component of this research. I also thank Dr. T. Lewis for providing valuable information on the Jeune Landing study site.

My gratitude is also directed towards D. Bonin, L. Gilmour, and D. Dunkley from the Greater Vancouver Regional District for offering the Seymour data set, for providing valuable input to this project, and for access to the Seymour area.

I am very grateful to the Science Council of British Columbia for partially funding this study through a Graduate Research in Engineering and Technology (GREAT) Scholarship. I also thank the Forest Engineering Research Institute of Canada (FERIC) for their support during the entire period of my graduate studies.

Heartfelt thanks are extended to my friends Kevin Crowe, Tim Ross, and Oleg Godeanu for their help with compilation of the C code used in this thesis.

Last but not least, I thank my wife Luminita and our children Adrian and Diana for their never-ending support and understanding during the entire research period.

This work is dedicated to my parents, Ionel and Maria, who considered creating the best conditions for studying for us, their children, the most important thing in their lives.

What do you have that you were not given? And if it was given to you, how can you brag? (I Corinthians: 4,7)

CHAPTER 1.
INTRODUCTION

Landslides are various types of gravitational mass movements of the earth surface triggered by factors such as earthquakes, volcanic activity, rainfall, and natural and anthropogenic changes of slope geometry. They may result in catastrophic disasters by destroying settlements in urban areas, and cause great economic losses through the destruction of construction works, and occasionally of cultural and natural heritage. For these reasons, landslide risk mitigation and the protection of natural and cultural heritage are extremely important. Landslides are studied in the fields of earth sciences (geology, geomorphology, and geophysics), water sciences (hydrology and hydraulics), and engineering sciences (civil and mining engineering, forest and agricultural engineering), and are also relevant in cultural and social sciences (Sassa 2003).

This thesis investigates landslides in relation to forest development. The main objective of this study is the creation of a terrain stability mapping tool. The method developed in this study is aimed at having wide applicability in mountainous terrain. However, to avoid unrealistic assumptions or generalizations on data availability or its applicability, data analysis was conducted in the specific conditions existing in the Canadian province of British Columbia (BC). The data available and the type of outputs incorporated in the model were determined by the conditions and requirements relevant in this province.

1.1. STATEMENT OF THE ISSUE

In the past few decades, the problem of landslides has become a very important issue in forest resource management. The main reasons are: (i) valuable timber resources on flat valley bottoms have already been depleted; (ii) advanced harvesting technologies (cable yarding, powerful machines) have been introduced, providing access to timber on steeper slopes at higher elevation; and (iii) water quality and environmental concerns have become an integral part of forest resources management.
Slope failure, both naturally occurring and forest development-related, causes productive forest site loss, increases industrial operating costs (to replace roads and bridges), interferes with fisheries by damaging fish habitat where fine sediment impinges on a productive watercourse, has a detrimental visual (aesthetic) impact, and reinforces environmentalists' negative image of economic development activities. The whole concept of sustainable development may be viewed cynically by the public where there is a lack of serious attention to sustaining the land base (Slaymaker 2000).

To identify terrain that may become unstable as a result of forest development, various methods have been developed for terrain stability mapping. The most widely used method in the world is the genetic method (also called subjective geomorphic mapping). This method is also in use in BC, and is regulated by the Forest Practices Code of this province (Province of BC 1995). The Code requires that a careful evaluation of landslide hazard within any areas proposed for forest development be conducted to ensure that forestry operations are performed in a manner that will not increase the frequency and magnitude of soil mass movements. Typically, assessments are carried out by geoscientists or engineers, using a combination of field and office based techniques. The method relies almost exclusively on the experience of the mapper, and the products of genetic mapping are by definition subjective. It is common knowledge that, with this method, no two mappers will produce identical polygons for a given landscape, or describe an area in exactly the same manner; furthermore, the same specialist may produce different maps of the same location if mapping is redone at a later time. This lack of consistency, along with overly conservative assessments, has negative impacts on forest development and is a major criticism of the method.
To minimize the subjectivity of terrain stability mapping, there is a tendency to improve the assessments by including an objective component. A major step in this process was the introduction of Geographic Information Systems (GIS). GIS made it possible to represent terrain with Digital Elevation Models (DEM). Huge increases in data availability through remote sensing methods, coupled with exponential increases in computing power, have enabled this recent shift.

As an alternative to subjective mapping, terrain stability specialists have been attracted by the theoretical rigour of physically-based models. However, application of these models over large tracts of land is hindered by problems related to accurate evaluation of the factors that control slope stability, and their spatial and temporal distribution. The use of physically-based numerical models for slope stability analysis requires a reliable estimate of input parameters for each site. In particular, slope angle, shear strength of the material, good estimates of depth to the failure plane, and groundwater regime at the time of failure are needed. For the time being, very few of these parameters can be measured or estimated with the required accuracy. Also, limitations of the methods used require the introduction of simplifying assumptions, which sometimes represent serious departures from physical reality. Attempts to include the spatial distribution of input parameters across the landscape have included probabilistic methods and concepts from Fuzzy Sets Theory. Numerous functional models for terrain stability assessment have been developed; however, none of them has gained general acceptance.

Other methods commonly used for terrain stability mapping are based on statistical analysis of relevant parameters. Statistical methods are inherently complex.
Assumptions of these methods are in general difficult to meet; for example, the multivariate normality assumption, required by many statistical methods, is rarely met with natural data. Transformations make the analysis even more complex, and the methods more difficult to apply in practical situations. To date, both univariate and multivariate methods for terrain stability mapping have produced results unsuitable for practical purposes.

Based on the existing situation, this thesis investigates the utilization of Artificial Neural Networks (ANN) for terrain stability mapping. The main characteristics of ANN indicated that it is a good method for the problem at hand. Kohonen (2001) recommends that ANN should be considered in the following cases:

• Abundance of noisy and ill-defined data. Natural data are not always describable by statistical parameters, and their distributions are non-Gaussian. The functional relations between natural data elements are often nonlinear. Under these conditions, adaptive ANN computing methods are more effective and economic than the traditional ones.

• Collective effects. Although dependencies between input and output variables can be considered in any of the statistical formalisms, only ANN models rely on redundancy of representations in space and time. In other words, they normally disregard individual signal or pattern variables and concentrate on collective properties of sets of variables. ANN are often suitable for nonlinear estimation in which the classical probabilistic methods fail.

• To create new information processing functions such as specific feature detectors and ordered internal representations, in response to frequently occurring patterns. These are 'intelligent' information processing functions that emerge only in ANN. Also, only ANN can create higher abstractions (symbolisms) from raw data automatically.
Intelligence in ANN ensues from abstractions, not from heuristic rules or manual logic programming.

Also, ANN are fast and compact, and much easier to apply to practical problems; this makes them a good method for terrain stability mapping.

In the literature, ANN-based analyses are often described as 'pattern recognition' problems. This thesis uses the same pattern-based approach. However, a more complete (although non-standard) definition of the method used in this study would be 'pattern analysis using ANN based on competitive learning'.

In general, ANN analyses are separated based on the presence or absence of examples: when examples exist, this is called supervised classification; when there are no examples, this is unsupervised classification. For the terrain stability problem, supervised classification consists of classifying new terrain based on existing terrain stability maps. In unsupervised classification all terrain units have the same status, and the purpose of the analysis is to cluster them based on their attributes; next, clusters are labeled (classified) according to stability criteria. In general, when terrain stability analysis is performed in conjunction with forest development, the supervised learning case is suitable for older developments, for which examples exist, while the unsupervised case is appropriate for new developments.

1.2. STATEMENT OF OBJECTIVES

The main objective of this thesis was the development of a model for terrain stability mapping using ANN (in ANN analyses the word 'model' may be considered inappropriate, and a better term may be 'approach'; however, in this thesis, these two words are used interchangeably). A first task in this process was the evaluation of the most common ANN and the selection of the one considered most appropriate for the problem at hand. This thesis aimed at incorporating topographic and geomorphic terrain attributes that can be accurately measured or estimated across the landscape.
Another requirement was that attributes be easily extracted from aerial photos or topographic maps, and be technically meaningful. This model aimed at incorporating attributes that are measured with about the same (consistent) accuracy. The main criterion for evaluating the attributes included in the model is their intrinsic accuracy: topographic attributes, which are more accurately computed, are included first; geomorphic attributes that improve the quality of terrain stability mapping are then added. The development and testing of this model was conducted on two sites in the southern part of coastal British Columbia.

The detailed objectives (tasks) of this thesis are differentiated for supervised and unsupervised classification. The objectives of supervised classification are:

1. Delineate unstable terrain at the two study sites based on existing examples. Each study site is divided in two parts; the model learns patterns of instability from the first part, then applies these patterns to the second (unseen) part.
2. Identify the terrain attributes (topographic and geomorphic) which are the most important in terrain stability mapping, in supervised classification.
3. Cross-validate the model between the two study sites, i.e. use the patterns of instability identified in one site to map the other site.
4. Contrast the prediction of this model with that of a physically-based model.

The objectives of unsupervised classification are:

5. Delineate (cluster) terrain units which are similar with respect to their attributes relevant to stability, and label the clusters according to stability criteria.
6. Identify the terrain attributes which are most important in unsupervised classification and compare them with those identified in supervised mode.
7. Analyze in greater detail the impact of the terrain attributes included in this study and attempt to explain their influence on terrain stability mapping.
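The split-site procedure in objective 1 (learn from one part of a site, then classify the unseen part) can be illustrated with a minimal sketch. It assumes the basic LVQ1 update rule with one codebook vector per class; the LVQ configuration actually used in the thesis is described in Chapter 3 and is not reproduced here. The grid cells, the two attributes (slope and elevation scaled to [0, 1]), the labelling rule, and the function names train_lvq1, predict and make_cells are all hypothetical and serve only as illustration.

```python
import random


def nearest(codebook, x):
    """Return the index of the codebook vector closest to x (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i][0], x)),
    )


def train_lvq1(samples, labels, n_epochs=30, lr0=0.1, seed=0):
    """Train one codebook vector per class with the basic LVQ1 update rule."""
    rng = random.Random(seed)
    codebook = []
    for cls in sorted(set(labels)):
        members = [x for x, y in zip(samples, labels) if y == cls]
        mean = [sum(col) / len(members) for col in zip(*members)]
        codebook.append((mean, cls))  # start each vector at its class mean
    order = list(range(len(samples)))
    for epoch in range(n_epochs):
        lr = lr0 * (1 - epoch / n_epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, cls = codebook[nearest(codebook, x)]
            # Attract the winner if its class matches the example, repel otherwise.
            sign = 1.0 if cls == y else -1.0
            for k in range(len(w)):
                w[k] += sign * lr * (x[k] - w[k])
    return codebook


def predict(codebook, x):
    """Classify x with the label of its nearest codebook vector."""
    return codebook[nearest(codebook, x)][1]


def make_cells(n, rng):
    """Hypothetical grid cells: [slope, elevation]; steep cells are unstable."""
    cells, labels = [], []
    for _ in range(n):
        slope, elevation = rng.random(), rng.random()
        cells.append([slope, elevation])
        labels.append("unstable" if slope > 0.6 else "stable")
    return cells, labels


rng = random.Random(42)
cells, labels = make_cells(400, rng)
# First half of the site is used for learning, second half is unseen.
train_x, train_y = cells[:200], labels[:200]
test_x, test_y = cells[200:], labels[200:]

codebook = train_lvq1(train_x, train_y)
hits = sum(predict(codebook, x) == y for x, y in zip(test_x, test_y))
accuracy = hits / len(test_x)
print(f"accuracy on unseen cells: {accuracy:.2f}")
```

Starting the codebook vectors at the class means is a design choice made here for robustness, not something taken from the thesis; with real terrain data, several codebook vectors per class would normally be needed.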
While the solution to the terrain stability problem is the primary output of the thesis, it is not the only one. An investigation like this is also expected to produce new insights and ideas connected to the problem domain and related issues. Although more speculative in nature, these better identify the potential applications of this model, and are outlined as a separate objective:

8. Identify secondary models and potential new studies derived from the approach developed in this study.

What may constitute examples is driven by data availability. In this study, the definition of examples is based on the terrain stability mapping method used in BC, and examples are selected using the following criteria: (i) existing landslides; (ii) terrain classified as unstable; (iii) terrain classified as unstable and potentially unstable; and (iv) a five-class system.

The model analyzes data stored in GIS format. In GIS, the fundamental units to be used in stability analysis can be identified in the form of either subdrainages or pixels. Analyses based on subdrainages are in general affected by the aggregation of data, and the resulting map is rather generalized and may not fulfill requirements at a large scale. In this study, grid-cell units were used in the analysis, because in this representation the analyst can choose the grid size according to needs and data quality.

The landslides associated with forest development are mostly of the shallow translational type; this is the only type analyzed in this study. Rock failures and slope movements due to dynamic loading caused by earthquakes or blasting fall outside the scope of this thesis. It is also recognized that, along with initiation, the travel distance of landslides needs to be assessed. In general, in a probabilistic framework, the risk associated with landslides is defined as the hazard of initiation times the consequences (driven mainly by travel distance) of these movements.
However, in this thesis, only the problem of landslide initiation is addressed.

1.3. ASSUMPTIONS USED IN THIS THESIS

The topography of the two study sites used in this thesis is represented in GIS, using Digital Elevation Models (DEM). The assumption related to this representation is that DEM accurately represent the terrain. This implies that all topographic attributes derived from the DEM are also accurate.

Existing terrain stability maps, in digital format, are used in this study for the extraction of geomorphic attributes and to contrast the predictions yielded by this model with the ones produced by specialists. The assumption related to attributes and stability maps is that they accurately represent the 'reality'.

1.4. SCIENTIFIC AND OPERATIONAL IMPLICATIONS OF THE RESEARCH

The ANN-based approach represents a new method for terrain stability mapping. At the time when this study started, a literature survey showed that no similar method had been applied to this problem. However, as this thesis was nearing completion, results of another Ph.D. thesis using a similar approach (Fernandez-Steeger 2002) became available on the Internet. In any case, this study can be considered one of the first that introduced the application of ANN for terrain stability mapping. The type of ANN selected in this thesis and the attributes included in the analyses, plus the results obtained and the practical applicability of the method, clearly differentiate this thesis from the other ANN-based approach.

ANN-based terrain stability mapping differs fundamentally from the physically-based (functional) models developed in Soil Mechanics: in this approach, the model is developed from data, as opposed to functional models that are based on theoretical principles. General knowledge of terrain stability from other sciences is used in the model development process, but data are the main driver. In some respects, this approach is similar to multivariate statistical analyses.
The operational implications of this study are driven mainly by the existing legal framework. In British Columbia, after the introduction of the Forest Practices Code in 1995, assessment of terrain stability on forest lands became a legal requirement. This has increased the need for both terrain stability mapping and associated ground checks. The purpose of this mapping is to identify the timber harvesting land base and to minimize the occurrence of landslides and their impacts. Mapping and assessing procedures have been developed, but these are recognized to be subjective. There is a need for improved techniques to assess landslide-prone terrain that complement the existing qualitative procedures. The model developed in this thesis has the potential to introduce objectivity to terrain stability assessments.

This model was designed as a decision-support model, to be used in BC. The model aims to delineate with greater accuracy areas of potential instability, and identify zones that require detailed ground checks, thus significantly reducing costs. It can provide guidance to terrain specialists for decisions on harvest blocks and layout, and results can be used both for independent risk analysis and incorporated into harvest scheduling to assess impacts on timber supply. As with other models developed for natural resources management, this model is not meant to replace the analyst but to assist him or her in terrain stability mapping.

This approach allows creation of hazard classification maps and assessment of potential impacts on timber supply and harvest schedules over a large area in a relatively short time. The model can assist in limiting expensive ground checks to the most vulnerable areas. It has the potential to increase both the speed and objectivity of terrain stability mapping.
The model can also provide the information necessary for addressing the following issues: loss of site productivity; loss of operational revenue; adverse impacts on aquatic resources; and environmental liability in forest operations.

1.5. THESIS OUTLINE

Definition and classification of landslides, and a presentation of existing methods for terrain stability mapping, are provided in Chapter 2. Chapter 3 presents an overview of supervised and unsupervised classification methods, offers a general description of ANN, and gives a more detailed description of the ones chosen in this thesis. The two study areas used for model development and testing are described in Chapter 4. Data available for terrain stability mapping are introduced in Chapter 5. Given the importance of the topographic and geomorphic terrain attributes used in this study, a fairly detailed description of the methods used to derive them is given. This chapter also includes pre-processing methods, and a description of the final format of the data used in the analysis. The methodology used in the thesis is described in Chapter 6. Since a large number of analyses were conducted in this study, to facilitate their interpretation, Chapter 7 includes both results and short discussions on them. A more general discussion of results, in a larger context, and secondary findings of the thesis are presented in Chapter 8. Chapter 9 includes the conclusions of this study. A glossary was added at the end of the thesis to explain some of the terms used.

CHAPTER 2. SLOPE STABILITY ASSESSMENT: AN OVERVIEW

Problems of slope stability and the associated mass movement processes represent research themes common to both geotechnical engineers and geomorphologists, although their perspectives clearly differ (Anderson and Richards 1987). The main differences are driven by the spatial scales of analyses typical for the two domains, which are related in principle to data availability.
Also, the temporal scale of analyses sometimes creates a major difference between the two fields of study. The first part of this chapter introduces the definition and classification of landslides, and next, a description of factors controlling terrain stability is presented. The last sections of the chapter present descriptions of methods used for terrain stability analysis and mapping, and summary remarks. Traditionally, methods used to assess terrain stability are classified as either qualitative (subjective), based on the knowledge and experience of the specialist, or quantitative (objective), using principles of soil mechanics, hydrology and hydrogeology, or based on statistical analysis of parameters relevant to terrain stability. Descriptions of existing methods and their classification are presented by Province of BC (1996a), Guzzetti et al. (1999), Aleotti and Chowdhury (1999), and Dai et al. (2002). These classifications, however, fail to clearly separate existing methods: the 'objective' methods incorporate in many cases parameters subjectively derived, and the 'subjective' methods often incorporate parameters and principles specific to objective sciences. For example, physically-based models use soil parameters which are simply educated guesses, and subjective geomorphic mapping incorporates principles from soil mechanics theory. Kohonen (2001) identifies the following computational formalisms that have been developed to cope with real-world situations: probabilistic reasoning, Fuzzy Sets theory, Fuzzy Logic, Artificial Intelligence, Genetic Algorithms, and Artificial Neural Networks. These formalisms are used in the last sections of this chapter to describe existing methods for terrain stability analysis. However, no terrain stability methods based on Genetic Algorithms were identified in the literature, hence this formalism is not included.

2.1.
DEFINITION AND CLASSIFICATION OF SLOPE MOVEMENTS

In general, 'landslide' is widely used as an all-inclusive term for almost all varieties of slope movements. Other terms like mass movements and slope failures are often synonymously used for landslides. In this study, the terms landslide, soil mass movement, mass wasting, and slope failure are used interchangeably. The definition of this phenomenon also varies. The most common definition is that provided by Cruden (1991): the term landslide is used to denote the movement of a mass of rock, debris or earth down a slope. Many partial or complete classifications of landslides have appeared in different languages. Slope movements may be classified in many ways, each having some usefulness in emphasizing features pertinent to recognition, avoidance, control, and correction. Terzaghi (1950) states that "landslides involve a multitude of combinations between materials and disturbing agents, and this opens unlimited vistas for the classification enthusiast. The result of classification depends quite obviously on the classifier's opinion regarding the relative importance of the many different aspects of the classified phenomenon". The criteria most widely accepted for classification of landslides are based on Varnes (1978), and Cruden and Varnes (1996), and emphasize the type of movement and type of material. Landslides are classified and described by two nouns: the first describes the material and the second the type of movement. The names for the types of materials are: rock, debris, and earth. Movements have been divided into five types: falls, topples, slides, spreads, and flows. The term 'complex' is also used for describing landslides that share the characteristics of more than one category. Landslide hazard is defined by Varnes (1984) as the probability of occurrence of a potentially damaging landslide phenomenon within a specified period of time and within a given area.

2.2.
FACTORS INFLUENCING TERRAIN STABILITY

It is widely recognized that a critical combination of factors and material characteristics is required for landslide initiation to occur. Selby (1985) states that resistance to denudation of a certain site is a function of soil, rock and vegetation strength, and reflects the capacity of the respective site to absorb applied energy. The landslide triggering factors are classified by Terzaghi (1950) as external or internal. External causes produce an increase in shear stress acting on slope materials, but without change in the shear strength of those materials, and internal factors consist of changes in the shear strength of soil, without any change in the shear stress applied. In the context of this thesis, examples of external factors may be removal of toe support when roads are constructed, imposition of surcharges due to heavy machinery, and weight of rain or snow, and examples of internal factors may be change in soil strength due to an increase of the pore water pressure, or loss of apparent cohesion due to root decay. Other transformations may include temporal variation in material properties such as friction angle and cohesion due to weathering, and mechanical alterations such as cementation, or related changes in site characteristics such as slope angle due to extensive erosion. Such changes at geological time scales fall outside the scope of this study. Numerous investigations have been conducted to identify the mechanism of landslides (e.g., Sitar et al. 1992; Iverson 2000). Positive pore-water pressure due to increasing groundwater levels is widely recognized as the triggering factor for most slope failures. This hypothesis is supported by experimental data and by the observation that slope failure occurrences increase during periods of intense rainfall or major rain-on-snow events. Based on the mechanism postulated, the most common triggering factors are intense rainfall or rapid snowmelt (Wieczorek 1996).
Anderson and Sitar (1995) found that at various times during a given storm the soil in a potential landslide area may have a very low factor of safety (against failure) and even a very small stress increment may produce failure (only a small increase in applied shear stress is necessary to exceed the peak shear strength of the material). Examples of additional stresses are other sources of sudden loading, such as those caused by windthrow, soil failing higher up the slope, rockfall from higher slopes, or possibly even a strong gust of wind. Innes (1983) includes vibrations caused by thunder as possible landslide triggering factors. Factors that control terrain stability are related to hydrologic and hydrogeologic conditions, terrain and soil properties, vegetation, and land use. In relation to land use, this thesis focuses on forest development, and the other factors are described in the remainder of this section.

2.2.1 Hydrologic and hydrogeologic factors

Hydrologic factors relevant to terrain stability analysis include precipitation, evapotranspiration, infiltration and snowmelt. These factors are in general well understood, along with their interaction with terrain, soil, and vegetation. A comprehensive description of hydrologic factors and mechanisms of surficial hydrology is presented in Kirkby (1978). In relation to terrain stability, researchers focus mainly on the study of groundwater hydrology, and more precisely on the distribution of pore-water pressure within the soil mass, direction and magnitude of seepage forces, and existence of preferential flowpaths (macropores). There have been major efforts to identify the groundwater conditions leading to the initiation of rainfall-induced debris flows. The most widely accepted conceptual model is based on the pore pressure generation on a soil (regolith) covered hillslope, based on the assumption that the soil is more pervious than the underlying bedrock.
The infiltrating water accumulates at the interface and creates a transient perched water table above the bedrock. Failure then occurs as a consequence of a combination of increased pore pressure and seepage forces (Sitar et al. 1992). In a slope with given hydrogeological properties, the height of the water table and its fluctuations are controlled by the infiltration rates that occur under the prevailing climatic regime. However, the spatial and temporal relation between precipitation and pore pressures is poorly understood. The groundwater exerts gravitational, buoyancy and drag forces on a soil mass. Hodge and Freeze (1977) analyzed a variety of hydrogeological environments and discussed the implications of the hydraulic head patterns in relation to regional slope stability. The authors concluded that on the scale of individual hillslopes, there are major complications introduced especially by layered stratigraphy. Freeze (1980) studied the hydraulic conductivity of surface soils and found a great variation in this parameter. Individual hillslopes often exhibit different runoff-generating mechanisms at different places during the same storm or at the same place during different storms. Such complexities have significant impact on infiltration rates and the development and dissipation of pore pressure through time. Rulon and Freeze (1985) emphasized that layered slopes feature multiple seepage faces, perched water tables, and wedge-shaped unsaturated zones. The pore-pressure distributions and the locations of the seepage faces are strongly dependent on the positions of the impeding layers and their hydraulic properties. The authors conclude that predictions of pore-pressure fields based on homogeneous saturated analyses may be significantly in error when applied to layered slopes. De Vries and Chow (1978) and Sidle et al.
(1985) state that the hydrologic response of slopes is influenced strongly by the presence of very thin soils, a dense network of roots and other macropore structures and highly divergent or convergent steep slopes. Dietrich and Sitar (1997) comment that the colluvial mantle or the bedrock underlying the mantle may have layers of higher hydraulic conductivity which act as conduits and layers of lower hydraulic conductivity which act as barriers to the flow of the infiltrating water. Both of these conditions (plus exfiltration from conductive bedrock) lead to the development of localized areas of highly elevated pore pressures affecting slope stability. Fannin and Jaakkola (1999, 2000) and Fannin et al. (2000) found that pore water pressure heads induced by storms are very variable in space, and are weakly correlated with storm intensity. Piezometric response was found to be independent of rainfall intensity and duration, and the great variability of hydrologic response was attributed to the influence of preferential flow paths in the soil matrix. Piezometric data indicated the potential for short-term pore pressures which are in excess of hydrostatic pressures, and potentially artesian, to develop during precipitation events. Wilkinson (1996) states that despite observations of a great spatial variation in piezometric responses of a hillslope, many attempts to model or predict these responses have been made. These models aim at predicting groundwater levels based on a number of input parameters, including total precipitation and intensity, antecedent moisture conditions, soil porosity and hydraulic conductivity, infiltration, evaporation, slope morphology or position, and catchment area. These models are empirical in nature or based on simplified probabilistic lumped-parameter approaches.
Groundwater modeling commonly incorporates the assumption that the initial, or dry hydraulic conductivity is low, and saturated hydraulic conductivity is the maximum rate of water movement in the soil. A steady state is achieved when the entire soil profile is transmitting water at the maximum rate permitted by that horizon. It is common practice to assume that the worst possible groundwater flow condition for stability occurs when the ground is saturated and flow is parallel to the slope surface. Analysis of relations between hydrologic and physiographic parameters has been attempted either through experimental studies or empirical observations. Eisbacher and Clague (1981) comment that in the region around Vancouver, BC, widespread landsliding probably is initiated when 24-h rainfall exceeds 100-150 mm. Such storms have return periods of approximately 3 years. O'Loughlin (1972) found that landslides on sediment-veneered rock slopes in the Coast Mountains north of Vancouver commonly occur when 24-h rainfall exceeds 150 mm. These values are attained on average about once every 2 years in the southern Coast Mountains, and once every 3-5 years immediately south of the mountain front, but are much less frequent farther south. Sidle and Swanston (1982) estimated that storms with 24-h intensities great enough to cause complete saturation of shallow mid-slope soils in south-east Alaska have return periods of 2-5 years. Wieczorek (1987) and Haneberg (1991) observed that thin soil instability is usually related to short term storms of high intensity (i.e. less dependent on antecedent moisture and precipitation) while thicker hillside soil instability is usually related to prolonged storms of moderate intensity.
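The return periods quoted above can be read as annual exceedance probabilities. As a rough illustration only, the sketch below converts a return period into the chance of at least one threshold-exceeding storm within a planning horizon. It assumes that years are statistically independent, which none of the cited studies states, and the function name and the 5-year horizon are hypothetical choices made for this example.

```python
def prob_at_least_one_exceedance(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one exceedance in `horizon_years`.

    Assumes each year is independent with annual probability 1 / return period
    (an illustrative simplification, not a result from the cited studies).
    """
    annual_probability = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_probability) ** horizon_years


# Hypothetical usage: a 2-year return period storm over a 5-year horizon.
p_two_year = prob_at_least_one_exceedance(2.0, 5)
# The same horizon for a 5-year return period storm.
p_five_year = prob_at_least_one_exceedance(5.0, 5)
```

Under this assumption, a storm with a short return period is close to certain to occur at least once in the horizon, so rainfall thresholds of this kind say little on their own about where failures will occur.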
2.2.2 Terrain and soil properties

Terrain properties are related to geology and geomorphology of the region, slope geometry (topographic profile of slope), slope gradient and aspect, thickness of unstable mass, and geometry of potential slip surface. Soil properties that influence initiation include physical characteristics such as in-situ density, hydraulic conductivity, and soil strength characteristics, including stress history in some cases. Soil strength parameters refer in principle to the angle of internal friction and cohesion. These parameters sometimes display great variability over relatively small areas. A variety of field and laboratory approaches are used to obtain the relevant values of soil strength parameters, ranging from multi-stage triaxial tests to simple estimates of reasonable strength parameters. As both laboratory and field tests involve some disturbance compared to in-situ conditions, results may be very different compared to the 'real' values (Price 1985). In relation to terrain properties, many researchers attempted to develop criteria for practical applications. Of particular interest was identification of the minimum slope angle for hillslope activity. Innes (1983) found that most studies consider this angle to be 30°, although downslope processes also occur on slopes as low as 20°, and in general, these phenomena are likely to occur on shallower gradients in gullies as the material is essentially cohesionless, whereas on hillslope sites, some degree of cohesion frequently exists. In BC, the slope of 60% (approximately 30°) is frequently considered a threshold between stable and potentially unstable terrain. Many observations on terrain attributes are presented in correlation with hydrologic characteristics and are relevant only for some specific areas.
For example, O'Loughlin (1972) found that in Coastal BC and the Pacific Northwest, soil mass movements typically occur on steep slopes where relatively shallow and cohesionless soils are underlain by impermeable bedrock or glacial till. These conditions together with high intensity, long-duration rainfall predispose the area to a moderate to high natural level of rapid, shallow soil mass movements. Elevation is also an important attribute, and particularly susceptible to mass movement are the mid to upslope concave depressions, which accumulate groundwater, and sites where rain-on-snow events can occur. Evans (1982) observed that glacial till and associated colluvium commonly mantle steep slopes in the major mountain ranges in BC, and instability of these materials has been visible particularly where steep slopes exist in areas of heavy rainfall. The influence of aspect was investigated by Jakob (2000), in a study performed on the west coast of Vancouver Island, and O'Loughlin (1972) in Coastal BC. The authors found that most landslides initiated on east-, southeast-, and south-facing slopes (in the azimuth range from 90° to 225°). Carrara et al. (1991) came to the conclusion that west and north facing slopes did not prove to be significant variables when predicting instability, and considered this a counter-intuitive result. Attempts to explain the influence of this attribute relate to the direct solar radiation received by different sites, snow melting regime, and the amount of water that infiltrates into the ground. More details on terrain and soil parameters in the context of various stability methods are presented in Chapter 2.3.

2.2.3 Vegetation influence

Greenway (1987) and Wu and Sidle (1995) discuss the effects of vegetation on slope stability, and classify vegetation factors as hydrological or mechanical in nature.
Hydrological factors include interception, infiltration, and transpiration, and mechanical factors include root reinforcement, the weight surcharge of vegetation, and wind effects. Of these factors, the only one that is considered to have a major impact on terrain stability is root reinforcement. The binding effect of the tree roots contributes substantially to the shear strength of the material. Studies to analyze resistance provided by roots were performed by O'Loughlin (1972), Preston and Crozier (1999), and Schmidt et al. (2001). These studies analyzed the tensile strength, species, depth, orientation, relative health, density of roots, diameters, and also the effect of anthropogenic disturbance. Studies have demonstrated that root reinforcement does not affect the friction angle of soil, but does act to provide an apparent cohesive component to the shear strength. Other studies that summarize the research performed on this topic are represented by Hammond et al. (1992) and Sidle et al. (1985). Empirical observations confirmed that in the first years after forest harvesting, when the process of root decay starts, the root cohesion decreases dramatically (Schmidt et al. 2001). Rollerson et al. (2001) observed that most post-logging landslides occur within five years of harvest. After attaining a minimum value, as the new stand grows, the root cohesion is restored. The duration of this cycle is estimated at about 15-20 years, and depends on the type of vegetation and location of the site. Sidle (1991) presented a method of simulation for changes in root cohesion following harvesting. The proposed method involves both root deterioration after logging and regrowth of newly planted or invading vegetation. However, direct measurements of maximum tensile resistance of roots may not reflect the actual situation, as this may not be mobilized because the roots may pull out prior to breaking in tension.
The contribution of root cohesion to shear strength is further complicated by effects of root morphology, density and distribution, which may depend on species, soil thickness, substratum penetration, and the resulting likelihood of intersection of roots and the failure plane.

2.2.4 Forest development

There is a common belief that forest development (namely, road construction and harvesting) is linked to terrain instability. Studies on the influence of forest development on terrain stability include O'Loughlin (1972), Wu and Swanston (1980), Krag et al. (1986), Sauder and Wellburn (1987), and Luce and Wemple (2001). Considerable evidence suggests that some logging practices can affect the stability of forested hillslopes. The main points of this argument are: (1) forest removal changes the microclimate, i.e. precipitation and infiltration rates; (2) logging machinery compacts the soil, and surficial yarding disturbance results in channelization of surficial runoff; (3) logging produces changes in the macropore network by closing existing macropores through compaction with heavy machinery and triggers the development of a new macropore network by means of root degradation; (4) loss of root cohesion after forest removal; (5) altered drainage paths and a resulting concentration of hydrological stresses due to road building; and (6) construction of potentially unstable roadfills on hillslopes. Jakob (2000), Montgomery et al. (2000), and Roering et al. (2003) found that the frequency of landslides in logged terrain is several times higher than in undisturbed forest. In gully channels, logging activities which result in woody debris accumulations can trap sediment and increase the probability of initiation due to gully sidewall instability or high streamflow in the channel. Blocked culverts at locations where forestry roads cross gully channels may also lead to the initiation of debris torrents during high streamflow.
The following sections present the existing methods for terrain stability analysis and mapping based on the formalisms introduced earlier in this chapter.

2.3. METHODS BASED ON PROBABILISTIC REASONING

2.3.1. Principle (physically-based) models

This section discusses physically-based models. When applied to small areas (for urban development) these methods are frequently applied deterministically. However, for larger areas, the uncertainty related to parameter input makes the deterministic approach infeasible, and these methods are applied in a probabilistic manner. Terrain stability analysis, like most branches of science, concentrated in its early stages on the specific technical details required to develop functional models. Initial efforts in this field focussed on the processes, not on the distribution of properties that influence terrain stability (Davis 1999). Soil strength properties and slope stability analysis methods are typically described in Soil Mechanics textbooks, beginning with those written by the founders of this science (Terzaghi 1943; Terzaghi and Peck 1967; Taylor 1962), to the more recent ones (Craig 1992), and also in numerous research papers (e.g. Terzaghi 1950; Morgenstern and Sangrey 1978; Duncan 1996; Wu 1993). The basis for defining soil failure is the Mohr-Coulomb criterion, which states that the shear strength ($s$) is:

$$ s = c + \sigma \tan\phi \qquad \text{(Equation 2.1)} $$

where $\sigma$ is normal stress on the rupture surface, $c$ is cohesion, and $\phi$ is the angle of internal friction. Shear strength of soils is strongly influenced by drainage conditions. To account for that, a fundamental principle in soil engineering is the use of effective stress ($\sigma'$), which was first defined by Terzaghi as:

$$ \sigma' = \sigma - u \qquad \text{(Equation 2.2)} $$

where $\sigma$ is total stress, and $u$ is pore water pressure. Similarly to Equation 2.1, the shear strength can be expressed consistently in terms of effective stress, using the corresponding strength parameters (for effective stress), $c'$ and $\phi'$.
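As a minimal sketch of how Equations 2.1 and 2.2 combine, the Python function below evaluates the Mohr-Coulomb shear strength in effective-stress terms. The function name, the units (kPa and degrees), and the numerical values are hypothetical choices made for illustration and are not taken from the thesis.

```python
import math


def shear_strength_kpa(c_eff_kpa: float, phi_eff_deg: float,
                       sigma_total_kpa: float, pore_pressure_kpa: float) -> float:
    """Mohr-Coulomb shear strength expressed in terms of effective stress."""
    # Equation 2.2: effective stress is total stress minus pore water pressure.
    sigma_eff_kpa = sigma_total_kpa - pore_pressure_kpa
    # Equation 2.1 written with the effective-stress parameters c' and phi'.
    return c_eff_kpa + sigma_eff_kpa * math.tan(math.radians(phi_eff_deg))


# Hypothetical soil: c' = 5 kPa, phi' = 35 degrees, total normal stress 20 kPa.
strength_drained = shear_strength_kpa(5.0, 35.0, 20.0, 0.0)
strength_with_pore_pressure = shear_strength_kpa(5.0, 35.0, 20.0, 8.0)
```

With these illustrative numbers, raising the pore water pressure from 0 to 8 kPa lowers the available shear strength even though the total stress is unchanged, which is the mechanism behind rainfall-triggered failures described in Section 2.2.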
Analyses of slopes can be divided into two categories: those used to evaluate the stability of slopes and those used to estimate slope movement (Duncan 1996). Stability of slopes is usually analyzed by methods of limit equilibrium. These analyses require information about the strength parameters of the soil, and they provide no information about the magnitude of movements of the slope. In limit equilibrium techniques, slope stability is analyzed by first computing the factor of safety. The factor of safety ($F$), as applied to the analysis of translational slides, is defined as the ratio of the shear strength to the shear stress required for equilibrium of the slope:

$$ F = \frac{\text{shear strength}}{\text{shear stress required for equilibrium}} \qquad \text{(Equation 2.3)} $$

To calculate a value for the factor of safety, a potential slip surface must be described. Slip surfaces are mechanical idealizations of the surface of rupture. The factor of safety must be determined for the surface that is most likely to fail by sliding, the so-called critical slip surface. To evaluate the stability of a slope by limit equilibrium methods, it is necessary to evaluate a considerable number of possible slip surfaces, to determine the location of the critical slip surface and the corresponding minimum value of the factor of safety. This process is described as searching for the critical slip surface. Slope stability analysis is more complex for partially saturated soils (Fredlund 1987), because it involves computation of both pore-water and pore-air pressures, and requires selection of an angle of internal friction that reflects the influence of suction on strength. In practical applications, to account for anisotropic and heterogeneous materials, and variable pore-water pressure, analyses are performed based on methods of slices. These methods subdivide the potential sliding mass into slices, for purposes of analysis. The equilibrium conditions are considered slice by slice.
If a condition of equilibrium is satisfied for each and every slice, it is also satisfied for the entire mass. Methods developed can satisfy either force equilibrium, or both force and moment equilibria. In these analyses, the number of unknowns exceeds the number of equations, and the problems are statically indeterminate. To make up the imbalance between equations and unknowns, either various assumptions are used, or methods that do not satisfy all conditions of equilibrium are employed. Methods developed for slope stability analysis are presented in Soil Mechanics textbooks (e.g. Craig 1992), and are summarized in technical papers (Nash 1987; Duncan 1996). The most popular two-dimensional methods are: Fellenius (ordinary method of slices), Bishop, Janbu's simplified and generalized, US Army Corps of Engineers, Lowe and Karafiath, Spencer, Morgenstern and Price, and Sarma. Generalizations in three dimensions are presented by Hungr (1987) and Leshchinsky and Huang (1991). Numerous software packages for application of limit equilibrium methods have been developed, including XSTABL (Sharma 1991), Slope/W (Geo-Slope International Ltd. 1998), CHASM (Bristol Innovations Software Sales Ltd. 2000), and CLARA-W (O. Hungr Geotechnical Research Inc. 2003). Most software models are sophisticated, can analyze slope stability using either effective stress or total stress methods, and include more than one type of analysis. Traditionally, the effective stress approach, with drained strength parameters and an estimate of pore pressure or seepage forces, has been used to arrive at a factor of safety against failure. Parameters required in limit equilibrium analyses have to be determined from field or laboratory tests. In cases where a landslide has occurred, soil strength parameters are sometimes determined by back analysis. Slope stability analysis based on limit equilibrium analysis seems intuitively appealing and relatively easy to apply.
However, it is important to note that a limit equilibrium analysis does not consider deformation, and pore pressure generated in the soil mass as it starts to deform, unless explicitly added (Anderson and Sitar 1995). Methods used to estimate slope movement usually employ the finite-element method. The principles of this method are presented in Morgenstern and Sangrey (1978). Finite-element methods require information on the strengths of the soils, and their stress-strain behaviour. The difficulties of representing stress-strain relations make models based on deformation analysis less successful. Soil is generally nonlinear, nonuniform, inelastic, and anisotropic. Each of these characteristics must be idealized and used in a deformation analysis, and the difficulty in describing natural soil deposits in these terms is the major factor that limits the application of the finite element method in the analysis of slopes. The problem is aggravated by our limited knowledge of the in situ stresses (Morgenstern and Sangrey 1978). Methods based on deformations define movements and stresses throughout the slope, but they do not provide a direct measure of stability, such as the factor of safety calculated by limit equilibrium analyses. Duncan (1996) states that finite-element analyses are more difficult and time-consuming than slope stability analyses, and they require special expertise if they are to be done successfully and productively. Experience with finite-element analyses has shown that they are most useful when performed in conjunction with field instrumentation studies. For the types of landslides related to forest development, the only practical approach is the so-called infinite slope analysis, characterized by slip surfaces that are very long compared with their depth.
This is a two-dimensional model, which incorporates the main assumption that a layer of firm soil or rock lies parallel to the surface of the slope at shallow depth, and the slip surface is constrained to be parallel to the slope. Such analyses ignore the driving force at the upper end of the slide mass and the resisting force at the lower end. Various formulae for calculation of the factor of safety exist, based on the type of soil and position of groundwater, as presented in Craig (1992), Duncan (1996), and Hammond et al. (1992). A landmark in distributed slope stability analysis was the model called Level I Stability Analysis (LISA), developed by Hammond et al. (1992). This is a computer program for estimation of the relative stability of natural slopes, and can be used to make qualitative, relative comparisons between the stability of landforms, and to identify areas that should be targeted for additional analysis. The principles on which LISA is based are presented in Figure 2.1.

Figure 2.1 Description of the infinite slope model (from Hammond et al. 1992)

The factor of safety calculated in LISA is based on Equation 2.4.

$$ FS = \frac{C_r + C_s + \cos^2\theta \, \left[ \rho_s g (D - D_w) + (\rho_s g - \rho_w g) D_w \right] \tan\phi}{D \rho_s g \sin\theta \cos\theta} \qquad \text{(Equation 2.4)} $$

where: $C_r$ is root cohesion, $C_s$ is soil cohesion, $\theta$ is slope angle, $\rho_s$ is wet soil density, $\rho_w$ is the density of water, $g$ is gravitational acceleration, $D$ is the vertical soil depth, $D_w$ is the vertical height of the water table within the soil layer, and $\phi$ is the internal friction angle of the soil.

The assumptions of the infinite slope model are (Hammond et al. 1992):
• groundwater surface and failure plane assumed parallel to the ground surface,
• failure plane assumed to be of infinite extent, with no side-effects or contributions to strength from lateral boundaries,
• only one layer of soil,
• the shear surface is located below the root mat, limiting root cohesion contributions to the periphery of the initiation zone.
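The infinite slope factor of safety in Equation 2.4 can be sketched as a short Python function. This is an illustrative transcription of the equation, not the LISA code itself. The function name, SI units (Pa, kg/m³, m, degrees), and the example values are assumptions made for this sketch.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def infinite_slope_fs(c_root: float, c_soil: float, slope_deg: float,
                      rho_soil: float, rho_water: float,
                      depth: float, water_depth: float, phi_deg: float) -> float:
    """Factor of safety following the form of Equation 2.4.

    Cohesion in Pa, densities in kg/m^3, depths in m, angles in degrees.
    """
    theta = math.radians(slope_deg)
    phi = math.radians(phi_deg)
    # Soil weight above the water table plus buoyant weight below it.
    effective_weight = (rho_soil * G * (depth - water_depth)
                        + (rho_soil * G - rho_water * G) * water_depth)
    resisting = c_root + c_soil + math.cos(theta) ** 2 * effective_weight * math.tan(phi)
    driving = depth * rho_soil * G * math.sin(theta) * math.cos(theta)
    return resisting / driving


# Hypothetical 1 m soil layer on a 35 degree slope, dry and then fully saturated.
fs_dry = infinite_slope_fs(2000.0, 1000.0, 35.0, 1800.0, 1000.0, 1.0, 0.0, 35.0)
fs_saturated = infinite_slope_fs(2000.0, 1000.0, 35.0, 1800.0, 1000.0, 1.0, 1.0, 35.0)
```

A useful check on the transcription is the cohesionless, dry case, where the expression reduces to the ratio of tan φ to tan θ, so a slope inclined at the friction angle gives a factor of safety of 1.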
LISA uses Monte Carlo simulation to estimate the probability of slope failure, rather than a single factor of safety value. Distributions for input parameters are, however, only educated guesses, and are based solely on experience. Usually, there is little or no evidence at all to justify the selection. Application of distributed limit equilibrium models for large areas indicated that the major problem is determination of reliable values for input parameters. The important factors affecting shallow soil mass movements (soil characteristics, pore water pressure, slope steepness, the depth to the potential failure plane, and reinforcement provided by vegetation) are highly variable over space and time. Other than slope angle, all these parameters are difficult to measure in the field and vary greatly along and across natural slopes. Pore-water pressure and the position of the groundwater table in relation to the thickness of the soil mass are particularly difficult to measure, as they vary with storm patterns and local slope hydrology. Various methods have been used to derive (acquire) the necessary data. These methods aim in principle at interpolating values for some parameters based on existing samples. Examples include the research of Wilkinson (1996) and Juang et al. (2001). Wilkinson (1996) used geostatistical (qualitative) techniques to derive parameters needed in slope stability analysis. Geostatistical methods of interpolation (popularly known as kriging) attempt to optimize interpolation using spatially autocorrelated, but physically difficult to explain, connections between variables. Juang et al. (2001) used ANN (quantitative) methods to describe slope stability parameters, on a relatively small construction site, for engineering purposes.
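The Monte Carlo idea described for LISA can be illustrated with a small Python sketch: input parameters are drawn from assumed distributions, the factor of safety is computed for each draw, and the fraction of draws with a factor of safety below 1 is taken as the probability of failure. This is not the LISA implementation. The uniform distributions and their ranges below are hypothetical stand-ins for the "educated guesses" mentioned above, and all parameters are sampled independently.

```python
import math
import random

G = 9.81            # gravitational acceleration, m/s^2
RHO_WATER = 1000.0  # density of water, kg/m^3


def factor_of_safety(c_root, c_soil, slope_deg, rho_soil, depth, water_depth, phi_deg):
    """Infinite slope factor of safety in the form of Equation 2.4 (SI units)."""
    theta = math.radians(slope_deg)
    effective_weight = (rho_soil * G * (depth - water_depth)
                        + (rho_soil * G - RHO_WATER * G) * water_depth)
    resisting = (c_root + c_soil
                 + math.cos(theta) ** 2 * effective_weight * math.tan(math.radians(phi_deg)))
    driving = depth * rho_soil * G * math.sin(theta) * math.cos(theta)
    return resisting / driving


def failure_probability(slope_deg, n_draws=10_000, seed=0):
    """Fraction of random draws with FS < 1 for a given slope angle."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    failures = 0
    for _ in range(n_draws):
        depth = rng.uniform(0.5, 1.5)                   # hypothetical soil depth, m
        fs = factor_of_safety(
            c_root=rng.uniform(0.0, 5000.0),            # hypothetical root cohesion, Pa
            c_soil=rng.uniform(0.0, 2000.0),            # hypothetical soil cohesion, Pa
            slope_deg=slope_deg,
            rho_soil=rng.uniform(1600.0, 2000.0),       # hypothetical wet density, kg/m^3
            depth=depth,
            water_depth=depth * rng.uniform(0.0, 1.0),  # water table within the layer
            phi_deg=rng.uniform(30.0, 40.0),            # hypothetical friction angle
        )
        if fs < 1.0:
            failures += 1
    return failures / n_draws


p_gentle = failure_probability(10.0)
p_steep = failure_probability(45.0)
```

The output is only as meaningful as the assumed input distributions, which is the limitation the text raises: changing the hypothetical ranges changes the estimated probabilities directly.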
Existing attempts confirm the statement made by Bjerrum in 1967, that "accurate stability analysis of natural slopes by means of intensively applied geotechnical techniques cannot be carried out at reasonable cost over large areas" (cf. Carrara 1983).

Introduction of Geographic Information Systems (GIS) had a major impact on terrain stability analysis. GIS allows representation of topography by using Digital Elevation Models (DEM), and also allows modeling of hydrologic processes. A landmark in the modeling of hydrologic processes was the work of Beven and Kirkby (1979). Their basic flow model is driven by topographic slope and transmissivity of the soil, while lateral flux is related to the contributing area. Contributing area is defined as the upslope area that drains into a certain point (this parameter is described in greater detail in Chapter 5.2.1).

Based on the work of Beven and Kirkby (1979), two types of models were developed: (1) steady state models, which include: TOPMODEL (Beven et al. 1984), TOPOG - based on the work of O'Loughlin (1986) and his colleagues from the Commonwealth Scientific and Industrial Research Organization (CSIRO), in Australia (cf. CSIRO 2003), SHALSTAB (Montgomery and Dietrich 1994; Dietrich and Montgomery 1998), and SINMAP (Pack et al. 1998); and (2) dynamic models based on the kinematic wave equation, which include: TAPES - based on the work of Moore et al. (1988) and their colleagues at the Centre for Resource and Environmental Studies (CRES, Australia), which later was developed into TAPES-C for contour elevation data, and TAPES-G for grid elevation data (cf. CRES 2003), and dSLAM (Wu 1993; Wu and Sidle 1995). Dynamic models allow input of rain storms and continuous change over time in root vegetation strength, but require more detailed input, like values for soil porosity and hydraulic conductivity.
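The notion of contributing area defined above can be made concrete on a raster DEM. The sketch below uses the single-direction (D8) flow routing convention, in which each cell passes all of its accumulated area to its steepest downslope neighbour. D8 is chosen here only because it is simple; it is an assumption of this example and not necessarily the routing scheme used by any of the models listed above.

```python
import math


def contributing_area(dem, cell_size=1.0):
    """Upslope contributing area per cell using D8 flow routing.

    dem is a list of equal-length rows of elevations. The returned grid holds,
    for each cell, its own area plus the area of every cell draining through it.
    """
    rows, cols = len(dem), len(dem[0])
    area = [[cell_size ** 2 for _ in range(cols)] for _ in range(rows)]
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]

    # Visit cells from highest to lowest, so that each cell has received all
    # of its upslope area before it passes that area downslope.
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)

    for z, r, c in order:
        best_slope, target = 0.0, None
        for dr, dc in neighbours:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                distance = cell_size * math.hypot(dr, dc)
                slope = (z - dem[rr][cc]) / distance
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
        # Cells with no lower neighbour (pits, flats, outlets) keep their area.
        if target is not None:
            area[target[0]][target[1]] += area[r][c]
    return area
```

On a uniformly tilted plane each cell drains straight downhill, so contributing area grows by one cell per row; in a converging valley the outlet cell collects the area of the whole grid.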
The majority of models for distributed terrain stability analysis utilize the infinite slope stability approach. Refinement of this equation has taken place to allow more precise modeling of slopes, and this has resulted in a demand for input data which is often difficult to satisfy in practice. In general, the more rigorous and accurate a slope analysis becomes, the more restricted that analysis is to specific slopes. Consequently, the analyst is faced with insurmountable problems of data acquisition when attempting to model mass movements over large areas and long time scales. Furthermore, the stochastic variables in the infinite slope equation are considered independent. Although there exists some contradiction in the literature, soil cohesion and angle of internal friction are generally considered to be inversely related, whereas unit weight and angle of internal friction are positively correlated (Hammond et al. 1992). Treating these variables as independent could result in simulating unrealistic values of soil shear strength.

The vast majority of functional models applied for distributed terrain stability mapping use a probabilistic approach. However, deterministic studies have also been performed. For example, van Westen and Terlien (1996) used a deterministic approach to identify areas that may become unstable after earthquakes. The unsatisfactory results obtained suggest that the deterministic approach is not appropriate.

Models based on the limit equilibrium approach incorporate the same specific assumptions of the method. Almost all flow models incorporate the common assumptions of uniformly constant soil depth and an impermeable subsurface layer parallel to the ground surface. Supplementary assumptions are incorporated for modeling of groundwater flow: e.g., dynamic models assume that groundwater flows in stream tubes that do not communicate with each other.
In general, limitations of existing methods require introduction of simplifying assumptions which sometimes represent great departures from physical reality. Hydrologic parameters are affected by inaccuracies of measurements and by inadequate methods of describing flow and storage of water and antecedent moisture conditions. Very few models consider the effects of the unsaturated soil zone or the influence of seepage. Iverson and Major (1986) have demonstrated the influence of seepage direction on slope stability, and are skeptical of typical seepage-parallel-to-the-slope assumptions. Due to the varying degree of weathering, fracturing, conductivity and topography of underlying 'impermeable' bedrock, the seepage-parallel assumption is rarely accurate. It has been shown that the effect of non-parallel seepage is to impose an upward or downward gradient on surficial materials, which in turn changes effective stresses and hence shear strength of frictional materials.

There is also great heterogeneity with respect to variation in space and time of parameters input in limit equilibrium models. For example, soil strength parameters vary in general one order of magnitude over large areas, whereas hydraulic conductivity can vary four or five orders of magnitude on relatively small areas. Besides, topographically based models are in general complex and difficult to apply in real-life analyses. At the end of his master's thesis, in which he monitored groundwater flow patterns in coastal BC and evaluated the most recent distributed terrain stability models, Jaakkola (1998) states: "If the interpreter of the model is not well versed in the idiosyncrasies of topographically based models, the results may in fact be difficult to understand, overly conservative, or drastically underestimated.
Based upon the well known and observed complexity of groundwater behaviour, it seems difficult to justify the use of more complex flow models that attempt to incorporate more detailed processes within the hydrologic cycle. It seems more reasonable to work with a simplified base hydrologic model".

K. Terzaghi, the founder of Soil Mechanics, in his address to the First International Conference on Soil Mechanics and Foundation Engineering, in 1936, stated: "In Soil Mechanics the accuracy of computed results never exceeds that of a crude estimate, and the principal function of theory consists of teaching us what and how to observe in the field" (cf. Cedergren 1967). It seems obvious that such comments set the limits of methods based on Soil Mechanics theory for large areas.

2.3.2. Statistical methods

Terrain stability analysis using statistical methods became feasible after the introduction of GIS. GIS opened many possibilities to analyze the relations between landslide distribution and thematic information (stored in GIS). It has also overcome many of the difficulties associated with data handling and has increased the amount of data available through coupling with remote sensing.

In GIS, terrain units can be identified and used in stability analysis. Commonly, there are two types of analysis units used: grid cell, and slope/catchment. Grid cells are fundamental units in terrain representation, and slope/catchment units are obtained by dividing an entire watershed into smaller spatial entities. Each analysis unit can have a series of attributes relevant to stability attached, including: elevation, slope, aspect, type of surficial material, texture, etc. Attributes can be regarded as coordinates in a multidimensional space, and thus each unit can be represented by a vector. In general a vector can be visualized as a line from the origin of the coordinate system to the point in question.
In statistics and data processing in general, a vector is regarded just as an array of numbers. Based on the representation used, the terrain classification problem consists of analyzing high-dimensional data sets, and is amenable to statistical analysis. In real-life analyses, however, the problem is more complex because numerous parameters have to be considered. Furthermore, these parameters are both numerical (e.g. elevation, slope, etc.) and class-type (e.g. surficial material type, texture etc.).

Statistical methods used range from simple inventories of existing soil mass movements (e.g., density of landslides per hectare or per kilometer of road) to univariate or multivariate analysis of topographic, geologic, geomorphologic and hydrologic attributes. A review of existing methods is presented in Province of BC (1996a).

Univariate models are based on the assumption that slope instability is dictated by a single variable or process. This approach allows the mapper to observe the influence of individual slope attributes, but it ignores the possibility that a collection of variables, each only weakly associated with the outcome, can become an important predictor when taken together (Gulyas 1995).

A multivariate modeling approach assumes that processes operating on the landscape are influenced to a greater or lesser degree by multiple factors (multivariate statistical methods consider different attributes of the same spatial entities and their interactions). Examples of studies that used multivariate statistical methods are (some studies used multiple methods of investigation):
• Discriminant analysis: Carrara (1983), Carrara et al. (1991), Baeza and Corominas (1996), Dhakal (1999), Guzzetti et al. (1999).
• Logistic regression: Mark and Ellen (1995), Gulyas (1995), Guzzetti et al. (1999), Millard (1999), Dai and Lee (2001).
• Cluster analysis: Niemann and Howes (1992).
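The representation of an analysis unit as a vector that mixes numerical and class-type attributes, described at the start of this section, can be sketched as follows. Numerical attributes are stored as they are, while a class-type attribute is expanded into one 0/1 component per class. The attribute set, the list of material classes, and the function name are hypothetical and serve only to illustrate the representation.

```python
# Hypothetical list of surficial material classes, used only for illustration.
MATERIALS = ("colluvium", "morainal", "bedrock")


def encode_unit(elevation_m, slope_deg, aspect_deg, material):
    """Return one terrain unit as a plain array of numbers.

    The first three components are numerical attributes; the remaining
    components flag which material class the unit belongs to.
    """
    if material not in MATERIALS:
        raise ValueError(f"unknown material class: {material!r}")
    material_flags = [1.0 if material == m else 0.0 for m in MATERIALS]
    return [float(elevation_m), float(slope_deg), float(aspect_deg)] + material_flags
```

A unit at 850 m elevation on a 32 degree, south-facing slope of morainal material then becomes a six-component vector, and a whole study area becomes a table with one such row per grid cell.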
Some of these studies performed complex transformations on the original data, or used combinations of topographic, hydrologic, and geomorphic attributes that do not have a physical correspondent. Results of statistical methods are extremely variable: from very low success to very high. One of the studies which reported great success was carried out by Baeza and Corominas (1996), who employed discriminant analysis on terrain catchments. After performing complex transformations on the variables, the study reported a prediction accuracy of 95%.

Other studies analyzed terrain instability only in relation to forest roads. For example Rollerson (1992) used parametric and non-parametric statistical tests, and Pack (1995) used Chi-square Automatic Interaction Detector (CHAID¹) to identify factors related to terrain instability triggered by road construction. Both studies concluded that the most important factors that indicate instability are slope and the presence of geomorphic processes in the area.

Statistical analyses are usually carried out at the research level, and are commonly affected by the following problems: (1) in order to carry out the analysis, assumptions specific to various methods must be met. Multivariate normality and homogeneity of variances are seldom met with natural data. When assumptions are not met, data transformations are necessary, which make analyses more complex, and interpretation of results more difficult; (2) the analysis is complex in itself, and oftentimes an analyst with an advanced degree in statistics is required; (3) when new data are added, the entire analysis has to be redone. It may be the case that the new data set does not meet the assumptions, and different transformations may be necessary; and (4) analyses can be expensive and time consuming.

¹ CHAID is an exploratory data analysis method used to study the relationship between a criterion (dependent) variable, which is categorical in nature, and a series of possible predictor (independent) variables. CHAID modeling selects a set of predictors and their interactions that optimally predict the dependent measure; the algorithm relies on the Chi-square test to determine the best next split at each step. The developed model is a classification (or data partitioning) non-binary tree that shows how major "types" formed from the predictor variables differentially predict the criterion variable (StatSoft 2003; Huba 2003).

Statistical methods are complex and subtle, and their application has sometimes generated contradictory results. For example, Wieczoreck et al. (1997) conducted logistic regression on various terrain attributes related to landslide initiation and concluded that slope is not a significant variable.

2.4. METHODS BASED ON ARTIFICIAL INTELLIGENCE (AI)

AI is a vast domain which includes many types of methods and tools to address a large variety of problems. The products of AI applicable to the terrain stability problem are the Expert Systems (ES). ES are computer programs capable of approximating human ability to solve problems, by using knowledge derived from experts in the field. They embody factual, heuristic, and procedural knowledge to address specific aspects of particular problems and to manipulate relevant knowledge expressed in symbolic description.

None of the existing classifications (of terrain stability methods) consider subjective geomorphic mapping an AI method. Although results of subjective geomorphic mapping are not usually presented as computer programs, the method has the same major feature as ES: delineation of terrain polygons and assignment of stability classes is based on the knowledge of the specialist. Results of such analyses are usually presented as lists or tables used for classification.
The main reason for not implementing them as computer programs is that their applicability is limited to the area where they were developed. This section presents first the subjective mapping method, and then ES.

2.4.1. Subjective geomorphic mapping

Subjective geomorphic mapping is the most widely used method in the world (Soeters and van Westen 1996; Keaton and DeGraff 1996) and the official method adopted for purposes of forest development planning in BC. The first step in this process consists of terrain mapping. This is a method to categorize, describe and delineate characteristics and attributes of surficial materials, landforms, and geomorphic processes within the natural landscape. In this province, terrain mapping is based on the Terrain Classification System for B.C. (Howes and Kenk 1997), and on the recommendations for mapping standards and procedures in the Guidelines and Standards for Terrain Mapping in BC (Province of BC 1996b, 1998).

Delineation of terrain polygons is sometimes simply based on inferences. To account for uncertainties in polygon delineation, the mapping system uses dashed lines where boundaries are not clear and dotted lines where they are assumed. All terrain attributes are summarized in a terrain symbol attached to each polygon. A description of terrain symbols and an example are presented in Appendix 2. Apart from the terrain symbol, the following types of data are recorded for each polygon:
• slope gradient of the entire polygon, which usually includes an estimated range, or range and average of typical slope gradients.
• soil drainage classes, as described in Canadian System of Soil Classification (Agriculture Canada 1998).
• other symbols, such as: (1) potential for landslide debris to enter streams, which is in fact a rough approximation of debris travel distance, based on hillslope gradient and slope morphology; (2) soil erosion potential - based on slope gradient, generic material, texture and soil drainage; and (3) risk of sediment derived from erosion to enter streams.

Terrain stability mapping is a derivative of terrain mapping, as it uses the terrain polygons and the attributes identified as part of this activity. Essentially, terrain stability mapping is a method to delineate areas of slope stability with respect to stable, potentially unstable, and unstable terrain within a particular landscape. The criteria used to separate terrain stability classes are usually defined based on slope gradient, surficial materials, texture, material thickness, slope morphology, moisture conditions and ongoing geomorphic processes.

The Forest Practices Code (FPC) of BC uses a 5-class system. The description of stability classes according to FPC - Mapping and Assessing Terrain Stability (MATS) Guidebook (Province of BC 1999) is presented in Appendix 3. Terrain stability classes provide a relative ranking of the likelihood of a landslide occurring after timber harvesting or road construction, but they give no indication of the expected magnitude of a landslide.

An example of criteria used for assigning stability classes to terrain polygons is presented in Appendix 4. The system presented in Appendix 4 clearly shows that slope is the most important factor in terrain stability assessment. The next important factor in assigning terrain stability classes is the presence of geomorphic processes. Further separation of stability classes is based on material and surficial expression.
Terrain mapping and terrain stability mapping are undertaken initially by stereoscopic interpretation of aerial photographs supplemented with topographic maps (plus other types of maps and data, if available) and field checking. Terrain representation uses Terrain Resource Information Mapping (TRIM) contour maps at a scale of 1:20,000 with 20-m contour intervals, or privately produced maps at 1:20,000 or larger scale.

The MATS Guidebook specifies that the mapper must develop criteria for terrain stability classes specific to the mapped area. The criteria for terrain stability classes are typically qualitative and depend on the knowledge and experience of the terrain mapper. It is recognized that the assignment of terrain stability classes, which are used to identify areas of moderate and high likelihood of landslide initiation, is subjective. Because of regional variations in climate, geology, soils and other factors, few specific criteria apply universally across all regions of the province. Also, the mapper must ensure that the terrain stability criteria and interpretations are not overly cautious. Such interpretations can lead to unnecessary Terrain Stability Field Assessments, increased logging and road construction costs and unnecessary prohibitions on some forest practices.

In terrain mapping, average polygon size is a function of the natural variability, steepness and complexity of the terrain being mapped. However, since these polygons are manipulated by human operators (e.g. stereo-transferred on topographic maps), the recommended minimum map polygon size is 1 cm² irrespective of map scale. At a 1:20,000 scale, which is commonly used in practice, 1 cm on the map represents 200 m on the ground, so 1 cm² corresponds to 200 m × 200 m = 4 ha. Terrain polygons are relatively large, and consequently, most of them have complex descriptions that often include up to three types of materials and geomorphic processes.
The classification system presented in Appendix 4 includes only one surficial material type per polygon, and the subsurficial materials are ignored. One can imagine that the process of assigning stability classes becomes more complex when more than one material is present and when variation of slope across the polygon is considered. Pavel (2001) investigated the possibility for implementing an Expert System for assigning terrain stability classes (similar to that described in Appendix 4). Development of the system proved that geomorphic factors may be present in a great number of combinations, and attempting to account for all combinations is a very complex task, sometimes impossible to achieve.

2.4.2. Expert Systems

In ES, knowledge modeling is achieved using the symbolic AI approach. ES are developed to capture human expertise, mostly in terms of if - then - else rules. The most difficult, time-consuming, and expensive task in building an expert system is that of constructing and debugging its knowledge base. Knowledge acquisition is the process of collecting domain knowledge from experts and transforming it into a computerized representation. Facts are the basic semantic building blocks of the knowledge base. Rules are built up from them and the inference engine attempts to deduce or verify them.

Experience has shown that an explanation capability is considered one of the most important functions provided by ES. The ability to generate explanations is important for the user acceptance of such systems. However, of equal importance is the quality of the explanations delivered (Andrews et al. 1995).

The development of ES suffers from the so-called knowledge acquisition bottleneck: the hardest part of constructing any knowledge based system is extracting the information from 'the expert'. Various methods have been adopted for the development of the knowledge base, like interviews with experts, literature review, and questionnaire.
However, it appears that no formal methodologies have yet crystallized. Domain knowledge of experts comes through years of experience and is often biased toward their own heuristics. Although human experts can solve many problems, they usually have great difficulty explaining why and how they make a particular decision, which renders traditional knowledge acquisition methods powerless (Huang and Xing 2002). Besides, human intelligence springs not from rules of logic but from knowledge about particular problems and about the world in general (Feigenbaum 1995).

Other problems in ES development relate to combining knowledge from multiple experts, and combining expert opinion with actual data. Also, with respect to knowledge representation, traditional AI has almost exclusively employed symbolic techniques. The semantics of logic imply that each symbol used in a representation and reasoning system (e.g. a procedural program or an expert system) can be assigned some meaning. The mechanisms to update symbolic representations are often complex in nature (Poole et al. 1998).

Some examples of ES for terrain stability analysis are: XPENT (Aste et al. 1995; Faure et al. 1988, 1995), which can analyze the stability of large slopes of excavated material or embankments formed during roadworks; SISIPHE (Aste et al. 1995), a system that includes modules for the analysis of regional geomorphic information, for the validation of geotechnical data, and for the design of slope rehabilitation measures; STABCON (Grivas and Reaga 1988), a system also meant for engineering applications; and the system developed by Wislocki and Bentley (1989) - a prototype, designed for hazard zonation for development purposes, based on proximity of the area to existing landslides or terrain showing signs of instability.

Moula et al.
(1995) and Toll (1996) reviewed existing AI applications in geotechnical engineering, and concluded that AI techniques are good for some aspects of solving engineering problems, but do not work very well for others. For the terrain stability problem, given the enormous difficulties of acquiring the knowledge required for such a system, the authors conclude that most existing ES can be regarded as simple prototypes, which are not capable of reasoning at the level of a human expert, but can rather serve as decision support tools, or 'assistants'.

Kohonen (2001) states that one of the major shortcomings of rule-based AI is the combinatorial problem: there exist natural tasks where it is simply impossible to create rules to completely describe the systems. Cawsey (1998) presents some simple principles for selecting problems amenable to be developed into ES, and slope stability assessment clearly falls outside the area of such problems.

The aim of conventional AI approaches has been to make computers perform as well as humans. However, no general purpose approaches exist today which show any signs of being able to do that. The goal has not yet been met and it does not look likely that conventional AI holds any promising future (Kohonen 2001).

2.5. METHODS BASED ON FUZZY SETS AND FUZZY LOGIC

Fuzzy Logic, as its name suggests, is the logic underlying models of reasoning which are approximate rather than exact. It may be also defined as "the science bridging mathematical precision and vagueness of common-sense reasoning" (Bojadziev and Bojadziev 1995). Essentially, Fuzzy Logic is a generalization of one of the basic laws of thought, the principle of the excluded middle. The laws of thought in binary form are based on exact ideas of truth or falsehood, and for this reason, conventional logic sometimes leads to unsolvable paradoxes.
The argument of the Sorites Paradox is suggested as a test of whether a concept is vague: if it is, the concept should be modeled using Fuzzy Logic; otherwise, a Boolean model may be appropriate (Fisher 2000).

Fuzziness is a type of imprecision characterizing classes that for various reasons cannot have or do not have sharply defined boundaries. These inexactly defined classes are called fuzzy sets. Fuzziness is not a probabilistic attribute, in which the degree of membership of an entity is linked to a given statistically defined probability function. Rather, it is an admission of possibility that an individual is a member of a set, or that a given statement is true (Burrough and McDonnell 1998). Instead of probability, fuzzy sets theory uses concepts of admitted possibility, which is described in terms of the fuzzy membership function; these functions permit individuals to be partial members of different, overlapping sets. In fuzzy sets, the grade of membership is expressed in terms of a scale that can vary continuously between 0 and 1; this gives the degree to which the entity belongs to the set in question. Fuzzy sets refer to how we assign an object to one class or another, and fuzzy logic is about how we manipulate rules and concepts and how we infer different conclusions from these.

Applications of Fuzzy Sets Theory and Fuzzy Logic in terrain stability analysis are still in the early stages. Abolmasov and Obradovic (1997) and Dodagoudar and Venkatachalam (2000) used limit equilibrium methods for terrain stability analysis incorporating the uncertainties in the soil parameters by considering them as fuzzy variables. In general, the problem with such methods is to determine the membership function unambiguously. Existing methods employ either expert opinion (which is subjective), or mathematically derived functions (which need large amounts of data). Besides, the relationship derived may be only site specific, and extrapolation would be very difficult.
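A fuzzy membership function of the kind described above can be sketched with a simple piecewise-linear ramp. The two sets ("gentle" and "steep" slopes) and their breakpoints are invented for illustration; as the text notes, choosing such functions unambiguously is precisely the difficult part.

```python
def ramp_up(x, low, high):
    """Membership grade rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def steep(slope_deg):
    """Grade of membership in a hypothetical fuzzy set 'steep slope'."""
    return ramp_up(slope_deg, 20.0, 40.0)


def gentle(slope_deg):
    """Grade of membership in a hypothetical fuzzy set 'gentle slope'."""
    return 1.0 - ramp_up(slope_deg, 10.0, 30.0)
```

With these breakpoints a 25 degree slope is a partial member of both overlapping sets, with a grade of 0.25 in each, whereas a Boolean model would force it into exactly one class. The grades need not sum to 1, which reflects the point made above that they are not probabilities.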
For terrain stability mapping on large areas it is recognized that many geographical phenomena are vague in their nature. Vagueness is an inherent property of geographical data, and ignoring it strips away the essence of much of those data (Fisher 2000). Fuzzy boundaries and fuzzy attributes are needed for a spatial analysis. The theoretical foundation exists (or is in an advanced stage of development). For example, Cross and Firat (2000) developed a 'fuzzy object data model', and Guesgen and Albrecht (2000) address the problem of imprecise reasoning in GIS. However, the problems related to lack of reliable spatial data persist, and reduce the applicability of this method to real-life situations.

2.6. METHODS BASED ON ANN

In a general sense, an ANN is defined as an information-processing system that has certain performance characteristics in common with biological neural networks. ANN have been developed as generalizations of mathematical models of human cognition and neural biology, based on the following assumptions (Fausett 1994):
• information processing occurs at many simple elements called neurons,
• signals are passed between neurons over connection links,
• each connection link has an associated weight, which, in a typical neural network, multiplies the signal transmitted,
• each neuron applies an activation function (usually non-linear) to its net input (sum of weighted input signals) to determine its output signal.

There are two aspects of this type of model which are computationally appealing: representation and learning (Sarkaria 2000). In ANN, data representation is numeric, as opposed to the symbolic representation used in Expert Systems. Machine learning in general refers to performance improvement of models based on their input. In ANN, this almost always means updating weights of inter-unit connections, according to some algorithm.
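The assumptions listed by Fausett (1994) translate almost directly into a few lines of code for a single neuron: each input signal is multiplied by the weight of its connection, the products are summed into the net input, and an activation function turns the net input into the output signal. The logistic sigmoid used below is one common non-linear activation and is an assumption of this sketch, as are the function and argument names.

```python
import math


def neuron_output(inputs, weights):
    """Output signal of one neuron with a logistic sigmoid activation."""
    if len(inputs) != len(weights):
        raise ValueError("each input signal needs exactly one connection weight")
    # Net input: sum of the weighted input signals.
    net_input = sum(w * x for w, x in zip(weights, inputs))
    # Non-linear activation squashes the net input into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-net_input))
```

Learning then amounts to adjusting the numbers in `weights`; no symbolic rule needs to be rewritten, which is the practical advantage of the numeric representation.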
In practical application, it is much easier to update weights than to update symbolic representation (Poole et al. 1998).

A criticism of ANN is their inability to explain the reasoning used to attain a certain result. ANN act like black boxes providing little insight into how decisions are made, and this can make users lack confidence in the reasoning of the system. To solve this problem, researchers developed humanly understandable representations for ANN by extracting (fuzzy) rules from trained ANN. Techniques for extracting rules from trained ANN involve creation of Fuzzy Neural Networks, as described by Sayed and Razavi (2000), Huang and Xing (2002), and Sayed et al. (2003).

Very few examples of application of ANN to terrain stability were found in the literature. Adeli (2001) reviewed applications of ANN in Civil Engineering developed until the year 2000 and found none related to terrain stability analysis. More recently, ANN-based studies were performed by Al-Tuhami (2000) and Vulliet and Mayoraz (2000). Both these studies used feed forward networks trained using the back-propagation algorithm (Rumelhart et al. 1986a, 1986b) to analyze data obtained in lab or field tests and to compare theoretical results produced by ANN with results of other analytical methods. Originally, back-propagation was in fact a gradient descent search method, and like all methods of this type, could easily get stuck at local optima. To address this problem, improved versions of this algorithm were developed.

A study was conducted concurrently with this thesis by Fernandez-Steeger (2002). This study used a resilient back-propagation (RPROP²) algorithm (i.e. only supervised analysis). Various topographic, geological and geotechnical attributes were ranked according to experience and known relevance to the problem, and used to predict terrain instability. Data were stored and manipulated in GIS, and terrain was represented using a 20-m grid.
To account for the shortcomings of the back-propagation algorithm, various architectures and learning parameters were investigated, and a success rate of up to 85% was achieved.

² RPROP is a version of the back-propagation algorithm, modified for faster convergence (cf. Rojas 1996).

2.7. SUMMARY REMARKS

This chapter gave a general overview of terrain stability analysis and the important factors in this activity, and reviewed existing methods for terrain stability mapping. The description of each method was accompanied by a discussion which identified the complexities and shortcomings of the method that limit its applicability to real-life problems.

At a more general level, Kasabov (1996) comments that the well-established methods of probability theory can handle uncertainties when they are strictly represented in the terms of this theory. They are not suitable for chain reasoning or to represent subjective knowledge. Also, the symbolic methods of AI fail to provide comprehensive approximate reasoning techniques.

The approaches based on principle models attempt to predict the behaviour of terrain units based on known laws of nature. The implicit assumption of these approaches is that all factors that play a role in terrain stability are taken into account, which implies that causes of terrain instability are determined. In fact, all functional models imply causality. Some studies explicitly declare as their objective the identification of cause and effect relations in terrain stability. Pearl (2000) reviewed the problem of causality throughout the history of science and philosophy, and concluded that in relation to natural systems, this issue has not been solved. Pearl (2000) states that causality is a learnable habit of the mind, and no satisfactory answer has been provided to the fundamental questions: "what empirical evidence produces cause-effect perception?", and "what empirical evidence legitimizes a cause-effect connection?".
The existing scientific works presented in this chapter reinforced the idea that ANN are a viable method for terrain stability investigation.

CHAPTER 3. GENERAL DESCRIPTION OF ARTIFICIAL NEURAL NETWORKS AND SELF-ORGANIZING MAPS.

In the previous chapter, spatial entities to be classified with statistical methods were represented as vectors of measurements of feature values. Terrain classification was investigated in this thesis based on the same representation, for the supervised and unsupervised cases. The first part of this chapter discusses these two types of classification in relation to terrain stability mapping. Next, an overview of the most common ANN is presented. The last part of the chapter includes a detailed description of the ANN used in this study, the Self-Organizing Maps. This includes a general description, algorithms, and issues related to data preparation and the post-processing of data.

3.1. SUPERVISED AND UNSUPERVISED CLASSIFICATION

3.1.1. Supervised classification

Supervised classification (learning) is a problem of pattern recognition. Vesanto (2000) defines the supervised learning problem as the process of training predictive models with data: there is an output 'y' which needs to be constructed from the inputs 'x'. For the terrain stability problem, in the supervised case, a number of examples (pixels) already classified is assumed to be available; the problem is to assign new pixels to various classes based on how similar they are to the examples included in these classes. In the statistics literature, supervised classification is often called discriminant analysis. The criterion (energy function) that discriminant analysis tries to minimize emphasizes the structure of the data. Typically, these criteria involve minimizing some measure of dissimilarity among the entities within each group, while maximizing the dissimilarity between different groups.
Another popular approach for supervised classification is to model the classes using prototypes, and classify according to the shortest distance (defined in a suitable sense) to a prototype. Nearest neighbour (NN) classifiers, especially the k-NN algorithm, are among the simplest and yet most efficient classification rules and are widely used in practice (Laaksonen and Oja 1996). In k-NN each class is given as a set of sample prototypes, which is a training set of pattern vectors from that class. When an unknown vector is to be classified, its k closest neighbours are found from among all the prototype vectors, and the class label is decided based on a majority rule. A possible tie between two or more classes is broken by decreasing k by one and re-voting. This rule is simple and elegant, and the error rate is small in practice. A generalization of this approach is the fuzzy k-NN algorithm (e.g. Sayed and Abdelwahab 1998).

3.1.2. Unsupervised classification

Unsupervised classification is a problem of pattern discovery; no examples are available. In this type of analysis, data sets need to be summarized to gain insight into them; the goal is to present the data in a form that is easily understandable, but at the same time preserves as much of the essential information in the data as possible. Such discovery is often described as data mining, the process of extracting information from what appears to be arbitrary data. An area for which stability assessment is carried out can be essentially represented as a database with a large number of spatial entity records, which may be in the order of thousands to millions of records. The fields for each record represent various attributes relevant to stability analysis, noted for each spatial entity. All terrain units have the same status, and the purpose of the analysis is to cluster them in stability classes, based on their attributes.
At the last step, clusters are labeled according to some criteria. Most often, labeling is done manually, analyzing clusters one by one. For larger data sets, however, manual labeling can be performed on a subset of the data, then extended to the whole data set.

Quick methods for analyzing high-dimensional data items have been proposed, based on direct visualization (e.g. Tukey 1977; Manly 1994; Kaski 1997). More elaborate methods for unsupervised classification are based on two main approaches: clustering, and projection methods.

The goal of clustering is to reduce the amount of data by categorizing or grouping similar data items together. Cluster analysis groups entities by looking for the most similar items in a data set, combining them as one item, looking for the next most similar item, and so on until all the items in the data set are combined. The similarity of two items, or its opposite, can be measured in various ways. The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are related. By cutting the dendrogram at a desired level, a clustering of data items into disjoint groups is obtained.

Vector projection methods are used to find low-dimensional coordinates for high-dimensional data samples such that certain features of the original data set are preserved as well as possible. The goal is dimensionality reduction, and often visualization if the output space is 2- or 3-dimensional. Typically the features to be preserved are pairwise distances between data samples, and subsequently, preservation of the shape of the data manifold in the projection. The most common projection methods, and corresponding techniques, are: (1) linear projection methods: e.g. Principal Component Analysis (e.g. Manly 1994); and (2) non-linear projection methods: Multidimensional scaling (MDS; e.g. Manly 1994); Sammon's mapping (Sammon Jr. 1969); Principal curves (Hastie and Stuetzle 1989).
The method used in conjunction with the Self-Organizing Maps is Sammon's mapping. Sammon's mapping is a nonlinear projection method that tries to optimize a cost function that describes how well the pairwise distances in a data set are preserved. The goal of the projection is to optimize the representations so that the distances between the items in the two-dimensional space will be as close to the original distances as possible. This aims at creating a space which represents the relations of the data faithfully, but also reduces the dimensionality of the data set to a sufficiently small value to allow visual inspection. The cost function to be minimized is Equation 3.1 (Vesanto 2000):

$$E = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{(d_{ij} - d'_{ij})^2}{d_{ij}}$$    Equation 3.1

where:
$d_{ij}$ is the distance between data samples i and j in the input space;
$d'_{ij}$ is the corresponding distance between the projection coordinates in the output space;
N is the total number of samples.

Sammon's mapping is fundamentally a gradient descent search method, and in practical applications the computation of the projection must be repeated several times starting from different initial configurations.

3.2. GENERAL DESCRIPTION OF ANN AND SELECTION OF ANN USED IN THIS STUDY.

3.2.1. General description of ANN.

The development of ANN was motivated by a desire to understand the brain and emulate some of its capabilities. McCulloch and Pitts (1943) designed what are generally regarded as the first neural networks. These researchers recognized that combining many simple neurons into neural systems was the source of increased computational power. After the seminal paper of McCulloch and Pitts, other models were developed, many of them being simplified implementations of existing biological concepts (e.g. Hebb 1949; Rosenblatt 1958).
A common view in these early stages was that the brain is in fact a set of simple elements interconnected into a huge graph, with some laws governing the time course of the distribution of activity across the elements. Some of the elements are connected to sensors like eyes or ears and are considered input; others are attached to muscles and are considered output. For simplicity, the elements are called neurons, and the graph they form 'neural networks' or 'neural nets'; other names that have been used for the field include connectionism, parallel distributed processing, neural computation, adaptive networks, and collective computation.

In general, an ANN is characterized by (1) architecture - the pattern of connections between the neurons; (2) learning algorithm - the method of determining the weights on the connections (i.e. the method of tuning the weights to the correct values); and (3) its activation function. ANN can be applied to a wide variety of problems, such as storing and recalling data or patterns, classifying patterns, performing general mappings from input patterns to output patterns, grouping similar patterns, or finding solutions to constrained optimization problems.

In many cases neural and statistical techniques are seen as alternatives, or ANN are seen as a subset of Statistics. Many studies have been conducted to identify the specifics or to produce a synthesis of the two fields. Holmstrom et al. (1996a; 1996b) emphasize, however, that outside some well-defined and focused application fields like multivariate classification, the two fields are different and the long-term goals of ANN in designing autonomous machine intelligence still remain. Also, the two methods are very different computationally. From the very beginning of ANN research the goal was to demonstrate problem-solving without explicit programming. The neurons and networks learn from examples and store this knowledge in a distributed way among the connection weights.
In terms of explicit error functions, the original methodology was bottom-up, exactly opposite to the goal-driven or top-down design of statistical classifiers. The power of 'neural' algorithms is increased flexibility of architecture in the sense of the richness of the discriminant function family, and the possibility for incremental learning. In real-world problems, speed is another important consideration, and ANN are fast and compact. These characteristics make ANN suitable for analyzing very complex data structures, like the ones encountered in terrain stability mapping.

3.2.2. Selection of the ANN used for terrain stability mapping.

A great diversity of ANN exists, comprising different architectures and learning laws, which makes their classification a complex task. A presentation of the most common ANN in chronological order is given in Figure 3.1 (items introduced in this figure are described later in this chapter).

[Figure 3.1 timeline, 1950 to 1990: Perceptron (1958); Multi-Layer Perceptron (1960); Backpropagating Perceptron (1974); LVQ, SOM (1981, 1982); Discrete Hopfield (1982); ART I (1983); Continuous Hopfield (1984); Boltzmann Machine (1984); Modified Backpropagating Perceptron (1986-1990); ART II (1987); RBF (1988); ART III (1989).]

Figure 3.1 Development over time of most popular ANN (from Hodju and Halme 1999).

Descriptions and classifications of artificial nets are presented in most books on this topic, e.g. Haykin (1994), Fausett (1994), Bishop (1995). Kohonen (2001) classifies ANN based on their function into: signal-transfer networks, state-transfer networks, and networks based on competitive learning.

Signal-transfer networks. In these nets the output value depends uniquely on input; these structures are thus designed for input transformations. The mapping can be also fitted by algebraic computation or gradient-step optimization.
Typical representatives are layered feed-forward nets such as the multilayer Perceptron (Rosenblatt 1958), and the feed-forward net in which learning is defined by an error back-propagation algorithm (this popular algorithm is presented in many papers, e.g. Rumelhart et al. 1986a; 1986b). The radial-basis-function (RBF) networks (e.g. Haykin 1994) can also be counted as signal-transfer networks. These nets are used for the identification and classification of input patterns, for control problems, for coordinate transformations, and for evaluation of input data.

State-transfer networks. In these networks, the feedbacks and nonlinearities are so strong that the activity state very quickly converges to one of its stable values, which are called attractors. Input information sets the initial activity state, and the final state represents the result of computations. Typical representatives are the Hopfield network (Hopfield 1982) and the Boltzmann machine (Ackley et al. 1984). The main applications of these nets are in various associative memory functions and optimization problems. Although they have also been used for pattern recognition, their accuracy remains much below that of other ANN.

Networks based on competitive learning. The essence of competitive learning is that, when an input arrives, the neuron that is best able to represent it wins the competition among the other neurons and is allowed to learn it even better. This creates a sort of division of labour between neurons and increases the performance of the net. The most representative nets based on competition are: Adaptive Resonance Theory (ART), introduced by Carpenter and Grossberg and presented in numerous ANN publications (e.g. Fausett 1994), designed for unsupervised learning; and Self-Organizing Maps (Kohonen 2001), which include Learning Vector Quantization (LVQ) for supervised learning, and the SOM itself for unsupervised learning.
Apart from the nets presented above, there are other types that do not fall in any of the above categories. Many ANN are constructed from a different perspective, based on energy functions and other concepts from Physics (Kartalopoulos 1996). ANN are a dynamic field, and many new algorithms are continuously reported in journals and conferences as belonging to this field.

ANN were explored in the early stages of this study, either by programming them, or through analysis and studying their applications (Pavel 2000a, 2000b). From the networks available, the Self-Organizing Maps were considered the most suitable for terrain stability analysis. LVQ yielded very promising results in preliminary analyses conducted on subsets of data. This is a powerful algorithm, which performed very well in many other practical applications (Holmstrom et al. 1996a, 1996b; Laaksonen and Oja 1996). LVQ was purposely developed for statistical pattern recognition, especially when dealing with very noisy high-dimensional stochastic data. The main benefit of this net is very good recognition accuracy combined with a radical reduction of computing operations when compared with more traditional statistical methods.

SOM was selected for its unique capability to combine clustering and projection methods. SOM is a special case in that it can be used at the same time both to reduce the amount of data (by clustering, or vector quantization), and to project the data nonlinearly onto a lower-dimensional display (vector projection). By virtue of its properties, SOM groups together clusters that are similar in the input space. When applied to the terrain stability problem, this means that terrain units that are misclassified are very likely to be assigned to one of the neighbouring classes, thus reducing very serious misclassifications.
As presented in Chapter 2, the most important criticism of connectionist models is their inability to explain the reasoning for a result in a useful way. SOM has very good explanation capabilities, and this was another reason for selecting it.

In this thesis, the purpose was to perform terrain stability mapping both in supervised and unsupervised mode. Having methods that stem from the same principle for both cases seemed to be a good choice. In the remainder of this thesis, the name 'Self-Organizing Maps' will be used to describe both nets; for remarks related to only one of them, they will be referred to as LVQ and SOM, respectively.

3.3. SELF-ORGANIZING MAPS

3.3.1. Introduction

Kohonen introduced the Self-Organizing Maps in the early 1980s. The author and his collaborators at the ANN lab at Helsinki University of Technology worked on further developing and refining these techniques. The SOM and LVQ have been described in numerous book chapters, review articles and edited books, and three world congresses have been dedicated to research on Self-Organizing Maps. The research group also maintains an up-to-date list of all studies and applications of these techniques, and the current list includes more than 5,300 items. Applications are extremely diverse, from analyses of the pulp and paper industry of the world, complex industrial applications, exploratory analysis of financial and economic data, medical applications, telecommunications, and machine vision, to classification of clouds, rainfall estimation, automated sleep classification, grading of beer quality, etc. However, there are very few applications in geological engineering and none in terrain stability mapping.

The rest of this chapter gives a more detailed presentation of Self-Organizing Maps. Some notation needs to be introduced as a first step in this process. The vectors corresponding to spatial entities in the input data set will be denoted by $X_k$, $k = 1, \ldots, N$; here $X_k \in \mathbb{R}^n$, where N represents the total number of units (pixels) to be classified, and n represents the dimensionality of each entity (i.e. the number of attributes used to describe a spatial entity). More explicitly, the n-dimensional vector is denoted by $X_k = \{x_{k1}, x_{k2}, \ldots, x_{kn}\}^T$ (the transposed notation is routinely preferred for typographic reasons); vector elements could represent different terrain attributes: e.g. $x_{k1}$ could represent slope, $x_{k2}$ elevation, $x_{k3}$ aspect, etc. In this thesis, the terminology of Mathematics and Earth Sciences is used, and the elements of data vectors are called attributes.

To quantify how similar or dissimilar patterns are (i.e. for grouping spatial entities), a distance measure needs to be defined. This is usually the first step in Multivariate Statistics and ANN; detailed descriptions are presented in the literature (Manly 1994; Kohonen 2001). Some of the most important distance measures are: Euclidean, 'city-block' or Manhattan, Mahalanobis, Hamming, and Levenshtein. Selection of the appropriate distance measure is driven by the intrinsic nature of the data analyzed. In this study, based on recommendations from the literature (Kohonen 2001; Kaski 1997; Fausett 1994), the Euclidean distance was used. For example, the Euclidean distance between the vectors $X_k = \{x_{k1}, x_{k2}, \ldots, x_{kn}\}^T$ and $X_l = \{x_{l1}, x_{l2}, \ldots, x_{ln}\}^T$ is defined by Equation 3.2:

$$d(X_k, X_l) = \sqrt{(x_{k1} - x_{l1})^2 + (x_{k2} - x_{l2})^2 + \ldots + (x_{kn} - x_{ln})^2}$$    Equation 3.2

3.3.2. Learning Vector Quantization (LVQ)

LVQ can be trained to perform mappings when target values are available (classification is known) for the input training patterns. LVQ classifies data using prototypes, i.e. the purpose of analysis is to group the N input vectors into M clusters (M < N), by using M representative (or exemplar, or codebook) vectors.
At the end of the training process each class of vectorial input samples is represented by its own set of codebook vectors. LVQ describes class borders around the representative vectors. A typical LVQ net is presented in Figure 3.2. In an LVQ net there are as many input units as an input vector has elements (in our case, n). The output vectors, denoted by $Y_1, Y_2, \ldots, Y_M$, have the same dimensionality, n, as the input vectors. The weight vector for an output unit is in fact represented by its elements. During training, the net determines the output unit that is the best match for the current input vector; the weight vector for the winner is then adjusted with the so-called 'learning rate', in accordance with the learning algorithm of the net.

Figure 3.2 Learning Vector Quantization (LVQ) neural net.

The LVQ algorithm

A typical set of vectors to be classified is presented in Table 3.1 (in Table 3.1, the dot '.' is used as a placeholder for the vector index).

Table 3.1 A typical set of training vectors.

Vector   X.1 (e.g. Elevation)   X.2 (e.g. Slope)   etc.   X.i    etc.   X.n
X_1      x_11                   x_12               ...    x_1i   ...    x_1n
X_2      x_21                   x_22               ...    x_2i   ...    x_2n
...
X_k      x_k1                   x_k2               ...    x_ki   ...    x_kn
...
X_N      x_N1                   x_N2               ...    x_Ni   ...    x_Nn

LVQ training consists of moving the codebook vectors iteratively toward or away from the training samples. The variations of the LVQ algorithm differ in the way the codebook vectors are updated. The LVQ algorithm used in this thesis is described by the following procedure:

1. Initialize the M reference vectors. Several strategies exist; the simplest one is to take the first M vectors from the data set.
2. Initialize the learning rate $\alpha$; this should be a relatively small value. Typical values can be found in the literature or through preliminary analyses.
3. While the stopping condition is false, do:
   4. Read the current training vector, $X_k$.
   5. Calculate the distance from $X_k$ to all codebook vectors, and find the index l for which $d(X_k, Y_l)$ is minimum.
   6. Update $Y_l$ using the following rule:
      If $X_k$ and $Y_l$ belong to the same class Then
         $Y_l^{new} = Y_l^{old} + \alpha (X_k - Y_l^{old})$
      Else
         $Y_l^{new} = Y_l^{old} - \alpha (X_k - Y_l^{old})$
      End If
   7. Reduce the learning rate; the learning rate is usually reduced to 0 over the training process. In practice it may also be sufficient to use a fixed small value for the learning rate.
   8. Test the stopping condition. This condition may specify a fixed number of epochs (the number of times the data set is presented to the net), the learning rate reaching a sufficiently small value, or no further changes occurring in the classification.

A simplified application of the LVQ learning rule for the terrain stability problem is presented in Figure 3.3. The input vectors are considered to have a dimensionality equal to 3 (which may be represented by Elevation, Slope, and Aspect). Assume also that 3 codebook vectors are shown in the figure and represent 2 stability classes, as follows: 2 codebook vectors (represented by squares) for clustering the unstable pixels, and 1 codebook vector (represented by a circle) for the stable pixels. When a new vector (represented by the black dot) is input to the net, distances to all codebook vectors are calculated. If the closest codebook vector belongs to the same class as the input vector, the codebook vector is moved 'closer' to the input vector; if the codebook vector belongs to a different class, it is moved 'away' from the input vector.

Figure 3.3 Simplified representation of the LVQ learning rule (axes: Elevation, Slope, Aspect).

At the end of the training process the net can be used to classify new vectors, for which classification is not known. The new vectors are input into the net and a label (class) is assigned to each of them based on the closest codebook vector.
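The training procedure and the classification step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function names (`lvq_update`, `train_lvq`, `classify`), the linear decay of the learning rate, and the default parameter values are all hypothetical choices made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors (Equation 3.2)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest(codebook, x):
    """Index l of the codebook vector closest to x (step 5)."""
    return min(range(len(codebook)), key=lambda l: distance(x, codebook[l][0]))


def lvq_update(y, x, alpha, same_class):
    """Step 6: move y toward x if the classes match, away from x otherwise."""
    sign = 1.0 if same_class else -1.0
    return [yi + sign * alpha * (xi - yi) for xi, yi in zip(x, y)]


def train_lvq(samples, labels, codebook, alpha0=0.1, epochs=20):
    """Train in place. codebook is a list of [vector, class_label] pairs.

    Assumption: the learning rate decays linearly from alpha0 toward 0 and
    training stops after a fixed number of epochs (steps 7 and 8).
    """
    total = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x, cls in zip(samples, labels):
            alpha = alpha0 * (1.0 - step / total)
            l = nearest(codebook, x)
            y, y_cls = codebook[l]
            codebook[l][0] = lvq_update(y, x, alpha, y_cls == cls)
            step += 1
    return codebook


def classify(codebook, x):
    """Assign to x the class of its closest codebook vector."""
    return codebook[nearest(codebook, x)][1]
```

A call such as `train_lvq(samples, labels, [[[0.6, 0.6], "U"], [[0.4, 0.4], "S"]])` followed by `classify(codebook, pixel)` mirrors the workflow in the text; attribute values are assumed to have been scaled to comparable ranges beforehand, since the Euclidean distance is sensitive to the relative scales of the attributes.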
Other comments on LVQ

Mathematically, LVQ is the equivalent of a Voronoi tessellation in an n-dimensional space. The Voronoi region of unit l is defined to be the set consisting of the points in the input space that are closer to reference vector $Y_l$ than to any other reference vector.

LVQ is less complex and computationally much lighter than SOM. Nevertheless, owing to the intrinsic properties of supervised learning, LVQ yields better results than SOM, and has a greater potential to be used for terrain stability mapping.

3.3.3. The Self-Organizing Map (SOM)

Self-Organization as a process is common in many biological systems, e.g. the human brain. It has been known for a long time that the various areas of the brain are organized according to different sensory modalities: there are areas performing specialized tasks, e.g., speech control and analysis of sensory signals (visual, auditory, somatosensory, etc.). Experimental data and observations convincingly demonstrate the existence of a meaningful spatial order and organization of brain functions. Pioneering results in the study of Self-Organization were obtained by von der Malsburg (1973).

Self-Organizing nets group similar vectors together without the use of labelled training data. A sequence of input vectors is provided, but no target vectors are specified. The net modifies its weights so that the most similar input vectors are assigned to the same output (or cluster) unit. The net produces a representative (codebook) vector for each cluster formed. Computationally, SOM can be considered to have an architecture similar to that of LVQ (Figure 3.2).

The SOM algorithm

The SOM defines a mapping from the n-dimensional input data space ($\mathbb{R}^n$) onto a two-dimensional (2-D) array of nodes, in such a way that observations that are similar in the input space are also close to each other on the 2-D map; the 2-D array of nodes is in fact the SOM.
The lattice type of the array can be defined to be rectangular, hexagonal or even irregular (Figure 3.4). The hexagonal lattice is effective for visual display. When a new pattern is presented, the neuron that is best able to represent it (the one with the shortest Euclidean distance) wins the competition with the other neurons, and is allowed to learn it even better (as defined in Chapter 2, machine learning refers to performance improvement of models based on their input). Apart from the winner, the nodes that are topographically close in the array, up to a certain geometric distance, will activate each other to learn something from the same input, in a sequence of steps (the closer a neuron is to the winner, the better it can learn a new pattern presented to the network). Neighbourhoods can be hexagonal or rectangular, as presented in Figure 3.4 (the winning neuron is the one in the center of the neighbourhood).

Figure 3.4 Representation of neighbourhoods in SOM (hexagonal on left, rectangular on right; from Vesanto 2000).

A reference vector is associated with every node in the SOM. Reference vectors can be visualized as 'hanging' from each neuron on the lattice. SOM training occurs in two phases. Ordering of the SOM occurs during the first phase; the second phase is needed for the fine adjustment of the map. The process is described in the following procedure (the data set is similar to the one presented in Table 3.1; N.B. no classification is available for the samples):

1. Select the dimensions of the map (i.e. 2-D lattice) and initialize the reference vectors. Several strategies are presented in the literature for these tasks. The map dimensions can be selected by performing Sammon's mapping on the data to be analyzed or through preliminary analyses. For the reference vectors (Y), random or ordered initialization is possible; the simplest (although more time consuming) way is random initialization.
2. Begin the first phase of learning: initialize the first learning rate $\alpha_1$ and the first radius R. Typical values for the learning rate are recommended in the literature, or can be found through preliminary analyses. The initial radius should be relatively large; it can be more than half the diagonal of the network.
3. While the stopping condition for the first phase is false, do:
   4. Read the current training vector, $X_k$.
   5. Calculate the distance from $X_k$ to all codebook vectors, and find the index l for which $d(X_k, Y_l)$ is minimum.
   6. For all neurons within the neighbourhood of radius R around $Y_l$, do:
      7. Update Y using the following rule: $Y^{new} = Y^{old} + \alpha (X_k - Y^{old})$
   8. Reduce the radius of the topological neighbourhood and the learning rate. A linearly decreasing function is satisfactory for practical applications.
   9. Test the stopping condition for the first phase; this condition usually specifies a certain number of epochs.
10. Begin the second phase of learning: initialize the second learning rate $\alpha_2$ and the second radius r. Both should be much smaller than the corresponding values for the first phase.
11. While the stopping condition for the second phase is false, do: repeat steps 4-8 with the new parameters.
12. Test the stopping condition for the second phase. This condition is similar to the one for the first phase.

During training, neighbouring neurons will gradually specialize to represent similar inputs, the representation will become ordered on the map lattice (i.e. the SOM), and the grid can be used efficiently in visualization.

At the end of the training process, insight into the SOM results can be gained by labeling the neurons (map calibration). This can be done either directly, by analyzing the entities grouped on each neuron, or by manually labeling another set of data and inputting it into the SOM. Labels are assigned by majority voting.
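The two-phase procedure and the calibration step above can be sketched as follows. This is an illustrative sketch only, not the software used in the thesis: it assumes a rectangular lattice, attributes scaled to the interval [0, 1], random initialization, a linear decrease of both learning rate and radius, and a fixed number of epochs per phase. All function names and parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_matching_unit(grid, x):
    """Lattice position of the node whose reference vector is closest to x."""
    return min(grid, key=lambda node: distance(grid[node], x))


def run_phase(grid, data, epochs, alpha0, radius0):
    """One training phase (steps 3-9); rate and radius decrease linearly."""
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = 1.0 - step / total
            alpha, radius = alpha0 * frac, radius0 * frac
            winner = best_matching_unit(grid, x)
            for node, y in grid.items():
                # Update the winner and every node within the current radius.
                if distance(node, winner) <= radius:
                    for i in range(len(y)):
                        y[i] += alpha * (x[i] - y[i])
            step += 1


def train_som(data, rows, cols, seed=0):
    """Random initialization, then an ordering phase and a fine-tuning phase."""
    rng = random.Random(seed)
    n = len(data[0])
    grid = {(r, c): [rng.random() for _ in range(n)]
            for r in range(rows) for c in range(cols)}
    # Illustrative parameter values; suitable ones depend on the data set.
    run_phase(grid, data, epochs=20, alpha0=0.5, radius0=float(max(rows, cols)))
    run_phase(grid, data, epochs=50, alpha0=0.05, radius0=1.5)
    return grid


def calibrate(grid, data, labels):
    """Label each node by majority vote of the labelled samples mapped to it."""
    votes = {}
    for x, label in zip(data, labels):
        votes.setdefault(best_matching_unit(grid, x), Counter())[label] += 1
    return {node: counts.most_common(1)[0][0] for node, counts in votes.items()}
```

Nodes that attract no labelled sample remain unlabelled in this sketch; how such nodes and tied votes should be handled is a design decision left open here.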
Other comments on SOM

Although the basic principle of SOM seems simple, the process behavior, especially relating to the more complex input representations, has been very difficult to describe in mathematical terms. The reasons for the self-organizing phenomena are very subtle and have strictly been proven only in the simplest cases. Kohonen (2001) provides a mathematical proof for the case of a one-dimensional, linear, open-ended array of functional units (neurons) to which one-dimensional input signals (vectors) are presented. The author proved that, starting with randomly chosen initial values, these become ordered in an ascending or descending sequence during the self-organization process. Once ordered, the values cannot become disordered in further updating. Rigorous mathematical treatment of the SOM algorithm has turned out to be extremely difficult, and the theoretical properties of SOM still remain without proof in the general case, despite the tremendous efforts of several authors (Cottrell et al. 1998).

The SOM algorithm is relatively simple and has the ability to produce organization, starting from possible total disorder. Learning consists of repeatedly modifying the synaptic weights of a neural network in response to activation patterns and in accordance with prescribed rules, until a final configuration develops. Haykin (1994) comments that "the key question is how a useful configuration can finally develop from self-organization. The answer to this question lies essentially in an observation that Alan Turing made in 1952, namely global order can arise from local interactions. This observation is of fundamental importance, since it applies to the brain and just as well to SOM. In particular, many originally random local interactions between neighboring neurons of a network can coalesce into states of global order and ultimately lead to coherent behaviour, which is the essence of self-organization".
Self-Organization as a process was also studied in relation to larger systems, e.g. at the level of social life and even for the whole universe, with no reference to Kohonen networks. Krohn et al. (1990) review previous studies in this area, in which self-organization is described as an increment of order. Two phases are identified in this process. In the first phase, environmental perturbations are incorporated by the system in its structure, and in the second phase, the system expands. The first phase is called 'order from noise', while the second is 'order from order'. There is clearly a striking similarity in these descriptions, as the two phases are also identified in Kohonen networks.

3.3.4. Comments on LVQ- and SOM-based data analysis, and interpretation of results.

The problems of pattern recognition and/or discovery can be partitioned into: data acquisition and preprocessing, classification, and interpretation/evaluation of results. For Self-Organizing Maps, classification is performed according to the algorithms described above. However, preprocessing and interpretation of results are equally important steps in the analysis.

Data acquisition and preprocessing.

The quality of the data is crucial in ANN-based analyses. Regardless of how good the methodology is, the results will fundamentally depend on the quality and suitability of the data. Also, the choice of suitable representations for the data items is a key step in data analysis. The selection and usefulness of different preprocessing methods depend strongly on the application, and a priori knowledge plays a very important role in this phase. Kaski (1997) notes that in practical applications the selection and preprocessing of the data may be even more important than the choice of the analysis method. For example, changes in the relative scales of the features have a drastic effect on the results. A frequently occurring problem in this type of analysis is that of missing data.
Some of the components of the data vectors are not available for all items, or may not even be applicable or defined. Several approaches, from simple to more complex, have been proposed for tackling this problem. In Self-Organizing Maps, distances between vectors are calculated using only those components that are available. When the reference vectors are updated, only the components that are available are modified. If only a small proportion of the components of the data vector is missing, the result of the comparison will be reasonably accurate. It has been demonstrated that better results can be obtained with this approach than by discarding the data items from which components are missing (Kaski 1997). However, when the majority of the elements of a vector are missing, it is better to discard the whole vector.

Interpretation and evaluation of results

In this study, the findings of LVQ and SOM can be interpreted and evaluated by importing them back into the original representation system, i.e. the GIS. Clusters are then evaluated according to the terrain stability criteria used. Apart from that, little insight can be gained into the mapping performed by LVQ. Codebook vectors are n-dimensional and reside in an unordered space (i.e. a vector representing stable terrain can be among vectors representing unstable terrain), and no readily available methods exist for extracting this information in another format (for example, rules). For SOM, as it also performs a projection onto a lower-dimensional space, a more detailed visual evaluation is feasible, and its visualization capabilities are an important component of this thesis. A description of these features is provided below.

Distance matrices are the most commonly used visualization technique to detect clusters from SOM. A distance matrix consists of distances between neighboring map units.
It can either hold all distances between map units and their immediate neighbours, like the Unified-Matrix (U-matrix; Ultsch and Siemon, 1990), or just a single value for each map unit, such as the median of the distances to the neighbours. The distance matrices can be visualized using shades of a certain colour. For a better understanding of results presented later in this study, an example of a SOM is presented in Figure 3.5. This figure presents the result of a SOM analysis, which mapped a certain number of pixels on six (stable and unstable) neurons.

Figure 3.5 A purely theoretical example of a SOM with six neurons.

In Figure 3.5 the neurons are labeled according to the terrain stability problem as stable (S) or unstable (U). The graphical representation in Figure 3.5 is based on the computer implementation of Elomaa et al. (1999) (this implementation requires that one neuron be selected at all times, which masks the details of neuron 4, which is also unstable). The colours of the neurons illustrated in Figure 3.5 represent the U-matrix distances (the central part of the hexagon), and the individual distances to each neighbouring neuron (the sides of the hexagon). By examining the distances between spatial entities grouped between various neurons on the SOM (in 2-D space), one can infer the distances in the original n-dimensional space. A scale for distances is provided on the right side of the SOM: the light shading typically represents a short distance, and the dark shading represents a large distance. The SOM depicted in Figure 3.5 clearly presents the principle of self-organization, as unstable neurons are grouped in the left part of the grid, and stable neurons are on the right side. The mathematical representation of the SOM displayed above is presented in Figure 3.6 (based on the method used by Kohonen 2001; and Hodju et al. 1999).
# A Self-Organizing Map codebook file
4 hexa 3 2
8.460000E-001 1.330000E+000 -2.700000E-001 -2.620000E+000 U,1
8.060000E-001 1.360000E+000 -4.180000E-002 -3.030000E+000 U,2
6.340000E-001 1.450000E+000 8.270000E-002 -3.650000E+000 S,3
5.620000E-001 1.270000E+000 1.810000E-001 -3.470000E+000 U,4
4.290000E-001 8.680000E-001 7.700000E-002 -2.580000E+000 U,5
-5.450000E-002 3.280000E-001 -2.760000E-001 -1.430000E+000 S,6

Figure 3.6 Representation of SOM codebook vectors.

In Figure 3.6, the first line (marked with '#') is a comment. The second line presents the following information: (i) data dimensionality - each vector has 4 elements, i.e. 4 attributes are used to describe each spatial entity; (ii) the type of neighbourhood - hexagonal in this case; and (iii) the dimensions of the map, which in this example is 3 x 2. The last six lines give the details for each neuron: the first four numbers give the values for the attributes (in floating point format), and the last two symbols give the label assigned (U or S) and the neuron number (N.B. neuron no. 4 is unstable). The data dimensionality of 4 (greater than 3) was purposely selected, so that the vectors cannot be seen in the original, 4-D space. The only visualization possible is after the projection and clustering performed by SOM, which produced the 2-D map (lattice).

If one considers only its projection properties, the SOM can be compared with geographical projections. However, geographical projections are meant to 'transfer' data from a 3-D space (real world) to a 2-D space (a geographic map), whereas the SOM can project items from an original space with a dimensionality greater than three to a 2-D map (like the one presented in Figure 3.5). Obviously, geographic projections have no intent of clustering the items projected.

A powerful method for investigating the 'reasoning' used by the SOM in the clustering process is to visualize component planes of the map.
Each component plane can be thought of as a slice of the map and it consists of the values of a single vector component in all neurons (as presented before, vector elements can be visualized as 'hanging' from each neuron). Each component plane visualizes the spread of values of that vector element. Any component plane of the SOM can be displayed separately using colormaps, and from these maps, the analyst can see what values of the terrain attributes are typical for various regions of the map (e.g. for stable and unstable terrain). The method is used and presented in greater detail in conjunction with the results of my study.

CHAPTER 4. STUDY AREAS

This study was conducted using data from two sites: (1) the Seymour Watershed within the Greater Vancouver Regional District (GVRD), and (2) Jeune Landing, a site situated on Northern Vancouver Island. The location of the study sites is presented in Figure 4.1.

Figure 4.1 Location of the two study sites.

For both sites, digital topographic data were obtained in the form of BC Terrain Resource Inventory Mapping (TRIM) contour maps at a scale of 1:20,000, with 20-m contour intervals. Data were projected using the Universal Transverse Mercator (UTM) projection and North American Datum (NAD) 83. Other data obtained for both sites in GIS format included: location of roads, streams, and lakes; forest cover; and terrain stability maps. The sets of air photos used for terrain stability mapping were also available. An average of three days was spent doing reconnaissance work at each study site.

4.1. DESCRIPTION OF SEYMOUR STUDY SITE

Seymour is one of the three community watersheds for the Greater Vancouver area. It lies within the Pacific Ranges of the Coast Mountains, and has a total area of 56.7 km². The terrain is characterized by rugged topography, steep rocky slopes at higher elevations, and mountain peaks more than 1300 m above the valley floor.
Gentle to moderate slopes at low and mid elevations are typically mantled by deposits that range in texture from clays and silts to coarse gravels. More recent deposits, mostly coarse-textured, have been deposited on gentle slopes by streams and debris flows, and on steeper slopes by rockfall. Slopes are commonly steeper than 35%, with steepness generally increasing with elevation. Bedrock consists primarily of intrusive igneous rocks - granodiorite, quartz diorite, and lesser amounts of gabbro and migmatite. Bedrock is exposed at mid and upper elevations. The climate is generally mild and wet. The cold season features rain-on-snow events and large accumulations of snow at high elevations, which generally last into the summer (Anon. 1999).

The Greater Vancouver Regional District (GVRD) conducted a very comprehensive ecological study of the watersheds it administers between 1995 and 1999. The study was coordinated by Acres International Ltd., and combined remote sensing techniques with on-the-ground surveys. It was estimated that this was the most comprehensive landscape-level database developed to that date for any watershed lands in Canada, and perhaps in North America³.

Figure 4.2 displays a DEM of the area. Elevation in Seymour ranges from 40 m to more than 1535 m, and slope varies between 0 and 72°. The locations of the 212 existing slides are also shown in Figure 4.2.

³ The statement is according to Streamline (vol. 5, no. 2) - a publication of the BC Ministry of Environment, Watershed Restoration Program. Based on the same source, for this study, Acres International Ltd. received in 2000 the award of merit of the Association of Consulting Engineers of Canada (ACEC).

Figure 4.2 Seymour - DEM, roads and streams. [Legend: elevation classes from 0 to 1500 m; landslides; streams; roads]
GVRD practices very conservative management in order to provide good quality water, and does not focus on maximizing the return from the timber resource. For this reason, only a limited road network was developed for small logging operations that have since been curtailed.

Figure 4.3 displays a terrain stability map of the area. Within Seymour, 397 terrain polygons were delineated (definition of stability classes is provided in Appendix 3). The map also includes the location of existing landslides.

Figure 4.3 Seymour - terrain stability mapping.

Figure 4.4 shows the distribution of surficial materials. Polygons delimited in terrain stability mapping were described by one to three surficial materials.

Figure 4.4 Seymour - distribution of surficial materials; a) First surficial material; b) Second surficial material; c) Third surficial material. [Legend: Colluvium; Fluvial; Lacustrine; Moraine (till); Rock; Glacio-fluvial]

The area is covered mainly by colluvial and morainal deposits (till), or combinations of these materials and rock. The valley floor is covered by fluvial and glaciofluvial materials. Distribution of subsurficial materials is presented in Figure 4.5. Up to two subsurficial materials were used to describe terrain polygons in Seymour.

Figure 4.5 Seymour - distribution of subsurficial materials; a) First subsurficial material; b) Second subsurficial material.

For large areas, the subsurficial materials consist of till and rock. However, the valley floors represent a particular challenge with respect to stability, because the surficial deposits overlie glaciolacustrine deposits. Although these areas are characterized by low slopes, their high clay content makes them prone to instability, as confirmed by the existing landslides in the area. The geomorphic processes identified in the study site are presented in Figure 4.6.

Figure 4.6 Seymour - distribution of geomorphic processes; a) First Process; b) Second Process; c) Third Process. [Legend: Avalanches; Slow mass movements - initiation zone; Rapid mass movements; Rapid mass movements - initiation zone; Gully erosion]

Large portions of the area analyzed are affected by various geomorphic processes, as presented in Figure 4.6. The processes identified are avalanches, gully erosion, rapid mass movements, and to a smaller extent, slow mass movements. Figure 4.6 clearly shows how complex the geomorphic conditions are, and consequently the difficulties associated with terrain stability mapping.

4.2. DESCRIPTION OF JEUNE LANDING STUDY SITE

The Jeune Landing area is located on Northern Vancouver Island. The total area of the site studied is 52.6 km². The dominant rock type is andesite, which is part of the Bonanza Group. Other rock types known in the area consist of pillow basalt of the Karmutsen, and the Quatsino limestone (Muller et al. 1974). Common to this region are U-shaped valleys separated by rounded ridges.

The area is characterized by cool, wet winters and warm, moist summers. Mean annual precipitation is about 3000 mm at sea level, and seventy to eighty percent of the precipitation occurs between October and March. Snow is usually confined to higher elevations and is ephemeral at lower and mid elevations. The area is subject to occasional rain-on-snow events. Rain storms of the greatest precipitation intensity and duration, and the highest wind velocities, usually occur during the winter months. However, rain storms of sufficient magnitude to initiate debris flows usually occur several times every year and may not be confined to the winter period.

Figure 4.7 displays a DEM of the area. The study site consists primarily of a long valley, which has a northwest orientation that conforms to the regional geology. Victoria Lake is situated at the northern limit of the site. The topography is characterized by a broad and flat valley floor, changing to a moderate gradient along the lower ridge toe-slopes.
Further up, it continues with a steep-gradient ridge crown, consisting largely of bedrock bluffs, followed by the ridge summit, which has a relatively gentle slope. Elevation ranges from 100 to 1300 m and slope varies between 0 and 75° (however, the vast majority of the area has slopes less than 50°). The area is dissected by numerous streams. The road network within the study site is relatively dense, because of intense harvesting in the area over the last few decades. Terrain stability mapping for Jeune Landing is presented in Figure 4.8.

Figure 4.7 Jeune Landing - DEM, roads and streams. [Legend: elevation classes from 100 to 1300 m; streams; roads]

Within the Jeune Landing area, 258 terrain polygons were delineated. The location of existing slides was not available. As shown in Figure 4.8, the majority of the area was classified as Class III, while some polygons on the valley floor, on flat terrain, were assigned stability class IV. Office analysis and field visits could not clearly identify the criteria used for classification of Class IV polygons.

Figure 4.9 shows the distribution of surficial and subsurficial materials. For this area, up to two surficial materials and one subsurficial material were recorded for each polygon.

Figure 4.9 Jeune Landing - distribution of surficial and sub-surficial materials; a) First surficial material; b) Second surficial material; c) Subsurficial material.

As opposed to Seymour, in Jeune Landing extensive areas were classified as being covered by organic materials. The most widespread surficial materials are colluvium and till, and the only subsurficial material recorded was rock (fluvial material is negligible). Geomorphic processes identified in the area are presented in Figure 4.10.
Most polygons are described by one or two geomorphic processes, and a very small area is described with three geomorphic processes. Figure 4.10 shows that processes identified in the study area are avalanches, gully erosion, and rapid mass movements. For rapid mass movements, the initiation zone was not identified.

Figure 4.10 Jeune Landing - distribution of geomorphic processes; a) First Process; b) Second Process; c) Third Process.

Overall, the quality of the data for Jeune Landing is lower than for Seymour. This can be easily explained by the time and resources invested in construction of the two data sets. Also, the structure of the Jeune Landing dataset indicates that its main purpose was ecological mapping and this may explain some of the discrepancies noted in terrain stability mapping.

CHAPTER 5. DATA COLLECTION AND PRE-PROCESSING

Two types of data are considered in this study: topographic attributes derived from the DEM, and geomorphic attributes included in the terrain mapping. Apart from these, for Seymour, the location of existing landslides was available and included in the analysis. This chapter presents the data available and the methods used for data preparation.

Data preparation is application dependent and there are no universally valid specifications of how it should be done, although general guidelines exist in the literature (Bishop 1995; Vesanto 2000). The fundamental aim of data preparation is to make it easier to build precise and reliable models. Successful preparation enables one to construct more reliable and more understandable models faster and with less data.
The preparation step has several objectives:
• to select variables to be used for building the models,
• to clean erroneous or otherwise uninteresting values from the data,
• to generate new features which capture interesting problem characteristics better than the original, raw data,
• to account for missing data and for input normalization, and
• to transform the data into a format which the modeling tool can best utilize.

An overview of this chapter is presented in Figure 5.1. The first part of this chapter presents the data support and management system; the second part describes in detail the types of data available; the third part presents the coding methods used for various types of data. Based on the coding method, preliminary analyses were carried out to evaluate their impact on the attributes, and results are presented in the last part of this chapter.

Figure 5.1 Overview of data preparation and pre-processing. [Flowchart boxes: Topographic Attributes - Elevation, Slope, Aspect, Plan curvature, Profile curvature (Chapter 5.2.1); Computation of New Topographic Attributes - Interaction of Curvatures, Specific Catchment Area (Sca) (Chapter 5.2.1); Geomorphic Attributes - Surficial and Sub-surficial Materials, Texture, Surficial Expression, Geomorphic Processes (Chapter 5.2.3); Location of Landslides, Buffer existing Landslides, Total Area of Landslides (Chapter 5.2.2); Data Coding (Chapter 5.3); Preliminary analyses (Chapter 5.4); Pre-processed Data]

5.1. DATA SUPPORT (STORAGE) AND MANAGEMENT

Topography of the two sites was represented in GIS, using the raster structure. The steps taken in the process of creation of the DEM are presented in Appendix 5.
Rasters were chosen to illustrate topography as they offer a better representation for the gradual variation of attributes in space than vectors, and also because they are more convenient for computations. Given the quality of the raw topographic data, the size of the study sites, and the main objectives of this study, a 20-m cell size for the DEM was chosen. The cell size is the smallest unit of interest in the mapping process, and defines the limit of the spatial accuracy. The smaller the cell size, the more accurate the representation. However, this is at the expense of larger data sets and slower processing time.

Geomorphic data is of vector type, and consists of terrain polygons delineated according to the Terrain Classification System of BC. This method creates a discrete representation of the space (the transition between various regions is crisp). Therefore points that are close in the real space and have very similar characteristics may be assigned to different polygons and regarded as different. This is a well-known problem in spatial analysis and numerous studies have addressed it (Davis 1999). Apart from this, the inherent uncertainty of natural data makes the problem even more difficult. Data on existing landslides is also presented in vector format, and it consists of points where the landslide initiation was assumed to have occurred.

When both topographic and geomorphic attributes were used, the data were converted to the raster format, by overlaying the terrain polygons and position of landslides on the DEM. This procedure, however, did not change the shortcomings associated with vector-type data, where all cells that have their centroids inside a certain terrain polygon share the same geomorphic attributes. Also, only those cells that included landslide initiation points were initially considered to have this attribute. With respect to landslide initiation, this problem was addressed in the data preparation process.

5.2. TYPES OF DATA AVAILABLE

5.2.1. Topographic data

The digital representation of terrain in a DEM contains essentially one parameter, the elevation. Based on elevation (m), the following parameters relevant to terrain stability were computed:
• Slope (deg.): maximum rate of change from a cell to its neighbours.
• Aspect (deg.): the steepest down-slope azimuth direction from each cell to its neighbours.
• Plan (planform) curvature: defines the shape of the surface perpendicular to the direction of the slope, and it is the second derivative of the elevation function, in plan.
• Profile curvature: defines the shape of the surface in the direction of the slope, and it is the second derivative of the elevation function, in profile.

Curvatures can be used to describe the physical characteristics of a surface, such as erosion and runoff processes within a landscape, and they identify convergence and divergence of flow within a landscape.

Two additional topographic features, considered relevant to terrain stability, were generated in this study: (1) interaction of curvatures, and (2) specific catchment area.

Interaction of curvatures. This attribute was introduced to capture the interaction of plan and profile curvatures, as together they were thought to describe the terrain better than the individual attributes. These parameters can have positive or negative values, and there is no standard method to account for their interaction. Given the way these attributes are defined in GIS, their interaction was calculated as the difference between the plan and profile curvatures (plan curvature minus profile curvature). Thus, large negative values are characteristic of topographic 'hollows', and large positive values are specific for 'noses'. This coding method clearly has some shortcomings, especially when the two attributes have the same sign and by subtraction they cancel out.
However, the attribute was meant primarily to identify areas where water concentrates because of topographic gradients. Regarded from a statistical perspective, this new attribute is a linear combination of existing attributes and it is unlikely to add new information. However, it has a strong practical meaning, because it identifies areas of water concentration and water dissipation, and these areas are in general characterized by very different vegetation types, which is also an important factor in terrain stability.

Specific catchment area (Sca). This parameter was introduced by Beven and Kirkby (1979), and is considered one of the landmark developments in recent hydrology (Pack et al. 1998). Sca is defined as the upslope drainage area per unit contour length (it has units of m²/m), and its computation is based on the assumption that lateral subsurface flow follows topographic gradients. The attribute is visually described in Figure 5.2.

Figure 5.2 Definition of Specific Catchment Area (from Pack et al. 1998).

Sca may be regarded as a complex combination of topographic attributes. Because it can reflect the position on slope (distance from the ridge and from valley floor), and the drainage basin order, the introduction of Sca was considered worthwhile. Sca was computed based on the D∞ algorithm developed by Tarboton (1997), implemented in the public domain software SINMAP (Pack et al. 1998). Less sophisticated hydrologic functions for computing Sca are also available in GIS.

5.2.2. Location of existing landslides

Location of existing slides was provided in the format of (x, y) coordinates. The presence / absence of landslides was in fact a Boolean value, i.e. raster cells were classified as being or not being in the landslide initiation area. An important issue about landslide initiation is the accuracy of their location.
In general, initiation points cannot be located very accurately because these events may initiate at a certain point and then retrogress very rapidly up to a new point which becomes the headscarp of the landslide. Many studies assume that the headscarp is the point of initiation. This is a well-known problem in terrain stability analysis (e.g. Pack et al. 1998) and researchers usually conduct detailed field inspections in order to produce a more accurate location of landslide initiation points. To account for this problem, this study used buffers created in the GIS around initiation points. The 40-m diameter of the buffer was selected subjectively, based on previous experience and on knowledge of existing landslides in the watersheds. All cells that have their centroids within the buffers were assumed to be unstable, i.e. similar to the initiation points, as presented in Figure 5.3. However, if the buffer included cells from stable polygons (Class I, II, or III), these were eliminated, as were cells inside the buffer with elevations higher than the headscarp.

Figure 5.3 Buffering of landslides. [Figure labels: contour line; landslide initiation point; buffer; terrain similar to landslide initiation point]

Figure 5.3 presents the area of intersection of three polygons, Class III, IV, and V, along with contour lines. The landslide existing in the Class V polygon is marked with a dot, and the buffer around the landslide also extends into the Class III and Class IV polygons. The first contour line with an elevation greater than that of the landslide is outlined. As described, only points inside the buffer with an elevation lower than the landslide were considered to be similar to the initiation points. These are represented by the shaded area in Figure 5.3.
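The buffering rule of Section 5.2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GIS procedure actually used in the study: the `Cell` record, the function name `buffered_unstable_cells`, and the choice to treat a centroid lying exactly on the buffer edge or exactly at the headscarp elevation as inside are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot

# Stability classes treated as stable in the text (Class I, II, or III).
STABLE_CLASSES = {"I", "II", "III"}


@dataclass(frozen=True)
class Cell:
    """Hypothetical raster cell: centroid coordinates, elevation, polygon class."""
    x: float                # centroid easting (m)
    y: float                # centroid northing (m)
    elevation: float        # cell elevation (m)
    stability_class: str    # "I" .. "V", inherited from the terrain polygon


def buffered_unstable_cells(cells, slide_xy, slide_elevation, diameter=40.0):
    """Return cells assumed similar to a landslide initiation point.

    A cell is kept when (a) its centroid lies within the circular buffer,
    (b) it does not belong to a stable polygon, and (c) it is not higher
    than the initiation point (headscarp).
    """
    radius = diameter / 2.0
    sx, sy = slide_xy
    return [
        c for c in cells
        if hypot(c.x - sx, c.y - sy) <= radius        # inside the buffer (edge inclusive: assumption)
        and c.stability_class not in STABLE_CLASSES   # drop Class I-III cells
        and c.elevation <= slide_elevation            # drop cells above the headscarp
    ]


# Illustrative cells on a 20-m grid around a slide at (0, 0), elevation 500 m.
example_cells = [
    Cell(0.0, 0.0, 500.0, "V"),     # the initiation cell itself
    Cell(20.0, 0.0, 495.0, "V"),    # downslope, unstable polygon: kept
    Cell(0.0, 20.0, 505.0, "V"),    # above the headscarp: dropped
    Cell(-20.0, 0.0, 490.0, "III"), # stable polygon: dropped
    Cell(40.0, 0.0, 480.0, "V"),    # outside the 20-m radius: dropped
]
kept = buffered_unstable_cells(example_cells, (0.0, 0.0), 500.0)
```

With a 20-m cell size and a 40-m buffer diameter, each initiation point can add at most a handful of neighbouring cells, which is consistent with the modest increase in landslide pixels reported in the text.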
As a result of buffering, the number of pixels considered to be landslides increased to 3,292.

5.2.3. Geomorphic data

According to the Terrain Classification System of BC (Howes and Kenk 1997), the first step in terrain mapping consists of delineating polygons that are homogeneous with respect to slope stability attributes. Next, based on these attributes, a stability class is assigned to each polygon. Criteria used in this process are provided in Appendix 3.

In the BC system, the following geomorphologic attributes are collected for each terrain polygon, and grouped in a terrain symbol:
• Surficial Material (1-3 elements)
• Texture of Surficial Material (1-3 types per element)
• Expression of Surficial Material (1-3 types per element)
• Subsurficial Material (1-2 elements)
• Texture of Subsurficial Material (1-3 types per element)
• Expression of Subsurficial Material (1-3 types per element)
• Delimiters that show the areal spread of materials
• Geomorphic Processes identified in the polygon (1-3 elements)

A generic description of terrain symbols was already introduced in Chapter 2, and is also presented in Appendix 2. Each polygon may be described by up to three surficial materials and two subsurficial materials. Materials (surficial or subsurficial) are also described by texture and expression, and a full description of texture or expression (for one material) may include up to three attributes. Consequently, a complete description of one material (material itself plus texture and expression) could consist of seven attributes, and five materials (three surficial and two subsurficial) may thus include 35 attributes. Apart from this, there may be three types of geomorphic processes identified in the polygon, and three delimiters included in the terrain symbol. Therefore, the total number of attributes used for a complete geomorphic description of one polygon may be 41.
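The count of 41 attributes follows directly from the maxima listed above. The short Python sketch below simply reproduces that arithmetic; the constant names are illustrative and do not come from the Terrain Classification System.

```python
# Maximum counts taken from the description of the terrain symbol above.
MAX_SURFICIAL_MATERIALS = 3
MAX_SUBSURFICIAL_MATERIALS = 2
MAX_TEXTURES_PER_MATERIAL = 3
MAX_EXPRESSIONS_PER_MATERIAL = 3
MAX_PROCESSES = 3
MAX_DELIMITERS = 3

# One material = the material itself + its textures + its expressions.
attributes_per_material = 1 + MAX_TEXTURES_PER_MATERIAL + MAX_EXPRESSIONS_PER_MATERIAL

# Three surficial plus two subsurficial materials.
material_attributes = (
    MAX_SURFICIAL_MATERIALS + MAX_SUBSURFICIAL_MATERIALS
) * attributes_per_material

# Add geomorphic processes and delimiters to obtain the full symbol.
total_attributes = material_attributes + MAX_PROCESSES + MAX_DELIMITERS
```

Here `attributes_per_material` evaluates to 7, `material_attributes` to 35, and `total_attributes` to 41, matching the figures given in the text.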
A generic example of a complete description of a terrain symbol is presented below:

d tM1e  d tM2e  d tM3e - P1 P2 P3
tsM1e  tsM2e

The following abbreviations were used:
t - texture
M - surficial material
e - surficial expression
d - delimiter
P - geomorphic process
sM - subsurficial material

In the formula above, the following notes clarify the use of delimiters: (1) when used in front of the first surficial material, a delimiter shows that surficial deposits cover the underlying materials only partially; (2) when more than one subsurficial material is recorded, it is usually not possible to quantify their areal spread, so in this case no delimiter is used and materials are assumed to be equally spread.

In practical applications, however, all these types of data are never collected. Texture and surficial expression are usually recorded for one or two material types.

In this thesis, surficial and subsurficial materials, texture, surficial expression, and delimiters are used strictly based on their definition provided in the Terrain Classification System of BC. A detailed description of these attributes is provided in Appendix 6. Interpretation and analysis of geomorphic processes was sometimes different from that of the Terrain Classification System of BC, and this attribute is described in the following paragraphs.

Geomorphic Processes. The following geomorphic processes were considered relevant to terrain stability and were used in this study: Piping (P); Gully erosion (V); Snow avalanches (A); Slow mass movements (F); Slow mass movements - initiation zone (F"); Rapid mass movements (R); Rapid mass movements - initiation zone (R"); Surface seepage (L). Piping and surface seepage were included in the model, although not encountered in the study sites. The list of processes includes snow avalanches although these are not closely related to terrain stability.
However, in most cases snow avalanches incorporate rock, surficial material and vegetation debris. For the two sites analyzed, the vast majority of polygons for which snow avalanches were recorded were also characterized by rapid mass movements.

The Terrain Classification System includes subclasses and subtypes for geomorphic processes. Some subclasses identify (some) polygons as initiation zones for some processes, as is the case for slow and rapid mass movements. However, in this study, initiation zones were considered separate processes, and analyzed as such. Other subtypes of processes, which identify the types of materials entrained, were not included in this study, as they were considered either very subjective, or of little relevance to terrain stability prediction.

In some areas, especially on the valley floor, the processes identified were not related to terrain stability, e.g. meandering channel. When this situation was encountered, the respective process was simply eliminated and the other processes moved one position forward.

Up to three processes may be used in a terrain symbol for a polygon and these are written in order of decreasing visible areal extent. In the Terrain Classification System no information about frequency and intensity of events is intended, and the areal proportion of the processes is not stated. Processes are applied to polygons that include initiation, transportation, and depositional zones (or initiation, tracks, and runout). Therefore, geomorphic processes are recorded for polygons which are eventually assigned stability classes from I to V. This situation is illustrated in Figure 5.4. Figure 5.4 depicts a case where two polygons, Class V and II, are affected by two events, an avalanche and a rapid mass movement.
Two observations are important in this case: (1) since the avalanche has the greater areal extent, it is recorded first in the terrain symbol; and (2) both the Class V polygon (where the processes initiate) and the Class II polygon (which is only traversed by the events) have the two processes included in their symbols.

Figure 5.4 Visual representation of geomorphic processes.

This study also investigated the case when processes were ordered according to their relevance to terrain stability. With respect to Figure 5.4, the scenario investigated was 'what if' rapid mass movements are recorded before avalanches. Details on the consequences of this notation are presented in Chapter 7. Changes produced by this notation on mapping the first geomorphic process for the two study sites are presented in Figure 5.5. Figure 5.5 shows major differences compared to previous representations (Figures 4.5 and 4.9). For relatively large areas, avalanches and gully erosion are replaced by rapid mass movements.

Figure 5.5 Mapping of geomorphic processes based on their relevance to terrain stability; a) Seymour; b) Jeune Landing. (Legend: Avalanches; Slow mass movements - initiation zone; Rapid mass movements; Rapid mass movements - initiation zone; Gully erosion.)

Apart from the geomorphic attributes collected according to the Terrain Classification System of BC, other parameters may be collected for each polygon (as presented in Chapter 2.4.1), such as the slope gradient of the entire polygon, the soil drainage class, and the potential for sediment delivery. The potential for sediment delivery was not considered important for terrain stability analysis, and the drainage class is a highly inferential parameter, driven mainly by the type of material and texture. Hence, these attributes were not included. Slope gradient was included in the analysis. This parameter is not strictly regulated and is usually input as an estimated range over the entire polygon.
Most commonly, the five slope classes defined for surface expression are used, but in numeric format; e.g. for a polygon with slopes ranging from moderate to steep, slope will be recorded as '3 - 5'. For clarity, in the remainder of this thesis, the names of topographic and geomorphic attributes are capitalized.

5.3. DATA CODING

Data coding is necessary to convert the variables to a form that the neural nets can best utilize. Like many other modeling tools, Self-Organizing Maps can only utilize numerical information. Based on the common criteria for data classification (e.g. Manly 1994), the data used in this study fall in the following classes:
• Ratio-type: all topographic attributes.
• Ordinal: Polygon Slope.
• Nominal: all geomorphic attributes, except for Polygon Slope.

5.3.1. Coding of ratio-type data

In general, most numeric attributes fall in this category, as is the case for the topographic attributes. When analysis was conducted using only the seven topographic attributes, two cases were distinguished: (1) raw data were used; and (2) data standardization was performed before analysis. Standardization was performed because numeric attributes have different units and differ in magnitude. Consequently, their impact on the distance metric may be different. Standardized variables are obtained for each attribute using the following formula:

$$x'_i = \frac{x_i - \bar{x}}{s}$$    Equation 5.1

where $x'_i$ is the standardized i-th observation of variable x, $x_i$ is the i-th observation, $\bar{x}$ is the mean of the variable, and $s$ is the standard deviation of the variable.

5.3.2. Coding of nominal and ordinal data

The method chosen to code nominal and ordinal data is 1-of-n coding, which is described in many ANN publications (e.g. Bishop 1995), and is used for coding symbolic variables into a set of numerical variables. The method consists of assigning to each attribute a number of classes equal to the number of parameters (levels) specific for the respective attribute.
Each attribute is represented by a vector, which has a value of '1' for the class that corresponds to the actual value of the attribute, and values of '0' for all the other classes. An example of how Materials were coded according to this method is presented in Table 5.1.

Table 5.1 Coding of materials using the 1-of-n coding method - example.

    Material (example)    Material types
                          C   D   F   FG  L   LG  M   O   R   U   V   WG
    Colluvium             1   0   0   0   0   0   0   0   0   0   0   0
    Till                  0   0   0   0   0   0   1   0   0   0   0   0
    Rock                  0   0   0   0   0   0   0   0   1   0   0   0

The following abbreviations are used in Table 5.1, as described in Appendix 6: C - Colluvium; D - Weathered Bedrock (in situ); F - Fluvial Material; FG - Glaciofluvial Material; L - Lacustrine Material; LG - Glaciolacustrine Material; M - Morainal Material (Till); O - Organic Material; R - Bedrock; U - Undifferentiated Materials; V - Volcanic Material; WG - Glaciomarine Material.

A ten-class system was used for Polygon Slope; the first five classes were used for the minimum slope and the last five classes were used for the maximum slope existing in a polygon. With this representation, the attributes for a certain location (in this case, a pixel) that are sent to the ANN consist of a series of '0' and '1' values.

When both topographic and geomorphic attributes were used, the 1-of-n coding method was used for both data types, to ensure compatibility. Each topographic attribute was coded according to a ten-class system, and classes were defined by equally dividing the range of the respective attribute. The number of classes for topographic attributes was chosen subjectively. The 1-of-n coding method was investigated both for standardized and non-standardized topographic attributes. A note should be made, however, for the case when the analysis comprises two or more sites. If the data are not standardized and 1-of-n coding is used, the ranges for each variable should be the same, so that identical values from different sites are in the same class.
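The two coding steps described in Section 5.3 can be sketched as follows. This is a minimal illustration and not the code used in the thesis: the function names, the use of the sample standard deviation, and the slope values are assumptions made only for the example.

```python
import numpy as np

def standardize(x):
    """Equation 5.1: subtract the mean and divide by the standard deviation.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def one_of_n(x, lo, hi, n_classes=10):
    """Code each value as a 0/1 vector over n_classes equal-width classes on [lo, hi]."""
    x = np.asarray(x, dtype=float)
    # Class index from the relative position in the range; the maximum goes in the last class.
    idx = np.floor((x - lo) / (hi - lo) * n_classes).astype(int)
    idx = np.clip(idx, 0, n_classes - 1)
    coded = np.zeros((x.size, n_classes), dtype=int)
    coded[np.arange(x.size), idx] = 1
    return coded

# Hypothetical slope values for two sites (illustrative numbers only).
site_a = np.array([5.0, 20.0, 45.0])
site_b = np.array([20.0, 60.0])

# A shared range keeps identical values from different sites in the same class.
lo = min(site_a.min(), site_b.min())
hi = max(site_a.max(), site_b.max())
coded_a = one_of_n(site_a, lo, hi)
coded_b = one_of_n(site_b, lo, hi)
```

With the shared range, the value 20.0 receives the same ten-digit code at both sites, which would not be guaranteed if each site were binned over its own range.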
If normalization is performed, data should be pooled first.

5.4. PRE-PROCESSED DATA

After coding, many preliminary analyses were conducted to assess the impact of the attributes and of the coding methods. These analyses identified that Sca, surficial and subsurficial materials, surficial expression, texture, and geomorphic processes needed further investigation. Results of these investigations are presented in the rest of this section.

Specific catchment area. The distribution of this attribute for the two sites is presented in Figure 5.6. This figure shows that values for Sca range from less than 100 m²/m (for pixels close to the ridge) to hundreds of hectares per metre (for pixels along the valley bottoms and the main streams).

Figure 5.6 Distribution of Sca for Seymour and Jeune Landing.

This great difference in magnitude obscured the influence of pixels with small contributing areas, and cancelled the contribution this parameter would make in delineating stable and unstable terrain. To account for this, the upper limit of Sca for points to be included in the analysis was set to 3,000 m²/m. This limit was chosen so that all existing slides in Seymour were included. It also proved to be a reasonable choice for Jeune Landing. Pixels with Sca greater than this limit were excluded, producing a reduction in area of about 7% for Seymour and 5% for Jeune Landing. This reduction helped to better investigate the influence of Sca on terrain stability mapping. Based on the locations of the points eliminated, there is good reason to believe that this operation did not affect the results.

Among the geomorphic attributes, materials, surficial expression, texture, and processes were analyzed individually with respect to the coding method and their impact on prediction. The procedure consisted of selecting a subset of polygons that had all the other attributes similar except for the one analyzed.
Predictions made with and without the parameter were evaluated prior to deciding whether or not it should be included.

Surficial and subsurficial materials. The number of elements describing this attribute and included in a terrain symbol was investigated. The following example clarifies this issue, based on the comparison of two terrain polygons, one described by three materials (e.g. C.M.R - colluvium, till, and rock equally spread), and another described by only one material (e.g. C - colluvium). If we compare only the first material (i.e. assume for the second terrain polygon that the second and third materials are missing values), we may draw the conclusion that the two polygons are identical. However, if we represent the second symbol as 'C.C.C' (which has exactly the same meaning as before), it is clear that the two polygons are different. The solution is straightforward when comparing polygons described by one or three attributes (as above), but more complicated when two attributes are recorded.

When a polygon is described by two attributes, an alternate coding method was considered. The first attribute was also copied to the third position (i.e. C.M was changed to C.M.C). With respect to Euclidean distance, this is equivalent to copying the first attribute to the second position and shifting the second attribute to the third position (i.e. changing C.M to C.C.M). Preliminary analysis showed that this coding method did not yield good results, because differences in the first (most important) and third materials were treated the same by the system. Consequently, polygons described by two materials were left as they are, and the third was considered a missing value. As Texture and Surface Expression accompany Materials, they are affected by the same problem and therefore the same procedure was applied.

Geomorphic Processes. This attribute is affected by the same problems as Materials, and therefore the same coding method was used.
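The effect of the representation chosen for Materials can be illustrated with a short sketch. It assumes that each of the three material positions is coded with the 1-of-n method over the twelve material types of Table 5.1, and that components marked as missing are simply skipped when the distance is computed. The thesis refers to Chapter 3 for the actual treatment of missing values, so the skipping rule and all function names here are assumptions.

```python
import numpy as np

# Material types in the order of Table 5.1.
MATERIALS = ["C", "D", "F", "FG", "L", "LG", "M", "O", "R", "U", "V", "WG"]

def encode_materials(symbols, n_positions=3):
    """1-of-n code for up to three materials; unrecorded positions are NaN (missing)."""
    vec = np.full((n_positions, len(MATERIALS)), np.nan)
    for pos, sym in enumerate(symbols[:n_positions]):
        vec[pos] = 0.0
        vec[pos, MATERIALS.index(sym)] = 1.0
    return vec.ravel()

def masked_distance(a, b):
    """Euclidean distance over the components recorded in both vectors (assumed rule)."""
    known = ~np.isnan(a) & ~np.isnan(b)
    return float(np.sqrt(np.sum((a[known] - b[known]) ** 2)))

three = encode_materials(["C", "M", "R"])   # C.M.R
one = encode_materials(["C"])               # C, with positions 2 and 3 missing
padded = encode_materials(["C", "C", "C"])  # C.C.C, same meaning as C
two = encode_materials(["C", "M"])          # C.M, third position left missing

d_missing = masked_distance(one, three)     # only the first material is compared
d_padded = masked_distance(padded, three)   # positions 2 and 3 now differ
```

Under these assumptions the single-material polygon coded with missing values is indistinguishable from C.M.R (distance 0), whereas the equivalent symbol C.C.C is clearly separated from it (distance 2), which is the point made by the example in the text.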
Surface Expression. Problems related to this attribute are a direct consequence of the coding method used in the Terrain Classification System. For the attribute Surficial Expression, the system allows a variety of inputs: some inputs include information on both slope and depth of surficial deposits, others on only one of them. Sometimes terrain elements have none of these attributes recorded, or combinations of different descriptors are used for the same attribute: e.g. material thickness can be described as veneer (v), blanket (b), or veneer-blanket (vb). Although the thickness of these materials is about the same, the Euclidean distance between cells covered by such material is greater than it should be (the distance between two descriptors which are essentially the same is similar to the distance between two attributes at the extreme points of the scale). Combinations of descriptors artificially increased the number of classes for this parameter. Up to 33 classes were necessary to describe Surface Expression in Seymour, and this had a negative impact on the analysis, because it grouped very similar cells in different stability classes and also increased computational time.

To account for this, a new system was devised for this attribute, which makes it more consistent with the other attributes (and more logical). In this system, each terrain polygon is considered to be characterized by slope and depth attributes (if not recorded, these were considered missing data). More complex surface expressions may or may not be present. Consequently, Surface Expression was split into the following three components:
• Expression Thickness - types included: thin veneer (x), veneer (v), blanket (b), and mantle of variable thickness (w).
• Expression Slope - types included: plain (p), gentle slope (j), moderate slope (a), moderately steep slope (k), steep slope (s).
• Expression Complex - types included: undulating topography (u); rolling topography (m); hummocks (h); ridges (r); depressions (d); fan (f); cone (c); terraces (t).

To make the stability classification more conservative (realistic) when two descriptors were recorded for the same attribute, the one most unfavorable to stability was retained. An example of the new coding method is presented in Table 5.2. For missing attributes, the symbol 'x' was used. In such cases, computation of distance and updating of vectors was performed as described in Chapter 3.

Table 5.2 The new coding system for Surficial Expression - example.

    Surficial Expression    Expression Thickness    Expression Slope    Expression Complex
    (example)               x  v  b  w              p  j  a  k  s       u  m  f  h  r  c  d  t
    fan (f)                 x  x  x  x              x  x  x  x  x       0  0  1  0  0  0  0  0
    veneer (v)              0  1  0  0              x  x  x  x  x       0  0  0  0  0  0  0  0
    hummock-steep (hs)      x  x  x  x              0  0  0  0  1       0  0  0  1  0  0  0  0

Texture. This is known to be a highly variable attribute because materials of extremely different textures can be encountered within relatively small areas. Most often, in the terrain mapping process, this attribute is simply inferred. In the example provided in Appendix 3, texture is not even included among the attributes used in the classification process. Preliminary analyses showed that, as it is recorded, this parameter carries little weight and often hinders the analysis. Consequently, texture was eliminated from the analyses carried out in this thesis.

Given the number of attributes used and the number of classes per attribute, when 1-of-n coding is used, the description of geomorphic attributes for each pixel requires 188 digits for Seymour and 127 digits for Jeune Landing. Topographic attributes are coded by 70 digits in each study site (seven attributes with ten classes each), and thus the total number of digits necessary for a complete description of a pixel is 258 in Seymour and 197 in Jeune Landing.

CHAPTER 6. METHODOLOGY

This chapter describes the analyses performed during model development.
The first step in this process is understanding the structure and shape of the 'cloud' formed by the data set (Vesanto 2000). Because the human eye is a very sophisticated, general-purpose pattern recognition system, simple visualizations are important tools. This 'feeling' for the data is usually achieved by what is called 'playing with the data'. This was an important step in this study. Data were visualized in GIS, either as subsets or as the whole dataset for each site. This chapter presents how the modeling process was conducted, the criteria used for evaluating the results, and details on how analyses were carried out for each case identified in the modeling process.

6.1. THE MODELING PROCESS

Modeling is the step where the solution to the problem is found. The previous steps are basically preparation for modeling and the later steps deal with its practical implementation, but the actual solution is specified at this step. Essentially, LVQ and SOM were the methods used for modeling, for the supervised and unsupervised cases, respectively. An overview of the modeling process is presented in Figure 6.1. A summary of the steps taken for data preparation, analysis, evaluation, and visualization of the results is given in Appendix 7.

Figure 6.1 Description of analyses conducted for model development and testing. (Flowchart. Supervised learning: (1) model development and confirmation within each study site, Seymour and Jeune Landing, using 2/3 vs. 1/3 of the data; attributes available are topographic attributes, geomorphic attributes, and the location of existing landslides; criteria for evaluation are a. landslides vs. stable area, b. stable (cls. I-IV) vs. unstable (cls. V), c. stable and potentially stable (cls. I-III) vs. unstable and potentially unstable (cls. IV-V), d. 5-class system; the output is a reduced set of topographic and geomorphic attributes for each site and for both study sites; (2) model testing between watersheds, with the reduced set of attributes; (3) comparison of model prediction vs. SINMAP. Unsupervised learning: location of landslides not known; criteria for evaluation not applicable; analyses start from the reduced sets of attributes.)

The rest of this section presents a more detailed description of the modeling process introduced in Figure 6.1. Figure 6.1 in fact identifies four distinct modules of the analysis: three in the supervised case and one in unsupervised mode. The analyses performed in these four modules are described in greater detail in Chapters 6.3 - 6.6. The following paragraphs are meant to be explanatory with respect to Figure 6.1, which offers an overview of these analyses. Supervised learning is described first, then unsupervised learning.

Supervised learning (in relation to Figure 6.1)

As presented in Chapter 3, due to its intrinsic nature, supervised learning is a more reliable method, and Figure 6.1 emphasizes it as the main analysis method used in this thesis. In supervised mode, analyses were organized at three levels (marked accordingly, with Arabic numerals), which are described next.

At Level 1, model development and confirmation for each study site is conducted. This corresponds to Objectives 1 and 2 of this thesis. For ANN-based models it is well known that if all the data available are used, the resulting net can be 'brittle', essentially learning to fit all the noise in the data with exacting precision (Michalewicz and Fogel 2000). When new data are collected and presented to a network that was trained with all the previously available data, the network often fails to perform to the same level as indicated in training. To address this 'learning vs.
generalization dilemma', it is necessary to hold out some of the available data for testing the neural classifier that is trained on the remaining subset of the data. In the modeling process, the first important step to be clarified is how the data are split for training and testing. The amount of data to separate into training and testing sets is problem dependent and requires statistical judgement. Sometimes the available data are separated at random into equal sets for training and testing, but there is no rule that will work in all cases. Most practitioners (e.g. Sayed and Razavi 2000) recommend splitting the data into two thirds for training and one third for testing. This step is also identified in Figure 6.1. The total number of pixels was 126,168 for Seymour and 122,301 for Jeune Landing, and these were split according to the rule presented above.

The types of attributes available (topographic, geomorphic, and location of landslides) were presented in Chapter 5. The following criteria were used to delineate unstable vs. stable terrain, and to assess the quality of predictions yielded during model development:
• Criterion a: unstable is considered to be only the terrain already affected by landslides, and the rest of the area is considered stable. This criterion is only applicable to the Seymour watershed.
• Criterion b: unstable terrain is considered to be only terrain Class V, and terrain Classes I - IV are considered stable.
• Criterion c: unstable terrain in this case corresponds to Classes IV - V, and terrain Classes I - III are considered stable. This criterion is outlined (in bold characters) in Figure 6.1, because it is the most important for practical applications, and is therefore considered the chief criterion in this analysis.
• Criterion d: classification is based on the 5-class system. This attempts to classify terrain according to the system used in BC (Province of BC 1999).
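The four criteria listed above amount to four different ways of deriving a target label from the same mapped stability class. A minimal sketch of that mapping is given below; the function name, the label strings, and the `has_landslide` flag are illustrative assumptions and not part of the thesis.

```python
# Stability classes of the 5-class system, as ranks.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}

def target_label(stability_class, criterion, has_landslide=False):
    """Return the target label of one pixel under Criteria a-d (hypothetical helper)."""
    rank = ROMAN[stability_class]
    if criterion == "a":
        # Only terrain already affected by landslides counts as unstable.
        return "unstable" if has_landslide else "stable"
    if criterion == "b":
        # Class V vs. Classes I-IV.
        return "unstable" if rank == 5 else "stable"
    if criterion == "c":
        # Classes IV-V vs. Classes I-III.
        return "unstable" if rank >= 4 else "stable"
    if criterion == "d":
        # Full 5-class system: the class itself is the label.
        return stability_class
    raise ValueError(f"unknown criterion: {criterion!r}")

# A Class IV pixel without a mapped landslide, labelled under each criterion.
labels_iv = {c: target_label("IV", c) for c in "abcd"}
```

The Class IV pixel is the case where the criteria disagree: it is stable under Criteria a and b, unstable under Criterion c, and keeps its own class under Criterion d.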
These criteria were used for all analyses performed in this thesis. For clarity of presentation, in the rest of this thesis the criteria described above will be simply referred to as Criteria a - d.

The analysis was carried out separately for topographic and for geomorphic attributes, producing a reduced set of attributes for both types. As presented in Figure 6.1, the next step consisted of combining these short lists and conducting further analysis to identify the topographic and geomorphic attributes which are the most important in terrain stability mapping. Obviously, this approach is open to criticism, on the grounds that some combinations that could yield good results might have been left out. Conducting the analysis this way was mainly driven by domain knowledge related to the nature of the attributes, the accuracy with which they are collected, and terrain stability in general. Also, to account for some of the uncertainty involved in this process, different combinations of attributes were used, as described in Chapter 6.3.

At Level 2 of the supervised analysis, cross-validation was used for model evaluation, which corresponds to Objective 3 of the thesis. When performing cross-validation, the map developed with the training data at one site was used to map the test data at the other site. The basic idea is to estimate how well the current hypothesis will predict unseen data.

Level 3 of the supervised analysis addresses Objective 4 of the thesis. At this level, the mapping produced by this model was compared with that of another model (SINMAP), developed on different (physically-based) principles.

Unsupervised learning (in relation to Figure 6.1)

As presented in Figure 6.1, unsupervised analyses were conducted starting with the short lists of topographic and geomorphic attributes. This was driven by the comments made in Chapter 3 with respect to the expected accuracy of these methods.
These analyses aimed at producing terrain stability maps and identifying the most important attributes, and they address Objectives 5 and 6 of the thesis. Given the ability of SOM to project and cluster vectors (pixels), an interpretation of the SOM (i.e. the 2-D lattice) was performed to analyze the influence of each attribute on terrain stability mapping; this corresponds to Objective 7 of the study. Objective 8 is more speculative in nature, and it is addressed at the end of the thesis, as a summary of the results obtained, combined with general knowledge of the terrain stability problem.

6.2. EVALUATION OF RESULTS

Michalewicz and Fogel (2000) stress that in any real-world problem, the analyst chooses the evaluation function, because it is not given with the problem. Different models are typically compared in terms of their accuracy, and the definition of accuracy should be specified before the analysis. In this model, quantitative evaluations of solutions were conducted through model confirmation (a comprehensive discussion on verification, validation, and confirmation of models developed for natural processes is presented in Oreskes et al. 1994). Essentially, the mapping produced by the model was tested against mapping produced by terrain specialists. Quality of solutions was evaluated based on the area predicted as unstable by the model vs. the existing prediction made by human experts. For the Seymour watershed, comments were also made with respect to the number of landslides correctly predicted. Landslides correctly predicted were also used at Level 3 of the analysis, when the ANN model was contrasted with another type of terrain stability model (SINMAP).

For terrain stability mapping, the classical binary classification problem includes two kinds of errors: assignment of a negative instance to a positive class, and assignment of a positive instance to a negative class.
Stated in statistical terms (hypothesis testing), this translates into:
• The null hypothesis (H0): states for each entity analyzed that 'the pixel is stable'.
• The alternate hypothesis (H1): states that 'the pixel is not stable'.

The two types of errors are:
• Type I error: reject H0 when in fact it is true; this is a false alarm, where the model indicates instability in a stable area.
• Type II error: fail to reject H0 when in fact it is false; this is a miss, where the model indicates stability and in fact the terrain is unstable.

The two types of errors are not equally costly. In terrain stability, as in many other fields (e.g. medical image analysis), Type II errors are much more costly than Type I errors, because extra checks are much less of a problem than a miss. However, given the specifics of this study, Type II errors need to be addressed in further detail. As presented in Chapter 2, terrain polygons are delineated manually, and this limits the smallest spatial unit to about 1 cm², which at this scale represents 4 ha. Topographic data are represented using the raster data model, and the smallest spatial unit used in this study is a 20-m pixel (400 m²), which is 100 times smaller than the unit used in the vector representation. There is clearly an inconsistency between the level of detail of vector and raster data. This inconsistency is further investigated in Chapter 7, as it played a relatively important role in the analysis. There were cases when a certain polygon was assigned to a certain stability class by the human specialists, whereas the model assigned only a portion of it to the same class.

Applicability of the above concepts is restricted to Criteria a - c. For Criterion d, which used a 5-class system, any misclassification was counted as an error.

6.3. SUPERVISED LEARNING

Supervised learning was performed using the LVQ algorithm. Based on preliminary analyses and on data from the literature (Kohonen et al.
1996a; Kohonen 2001) the number of codebook vectors for the LVQ analyses was set to 2,000, and the learning rate selected was 0.03. The training data were passed to the ANN two times (two epochs), and the number of steps taken in the training process was about 170,000. This proved to be a good selection to avoid overtraining of the ANN on the training data and to achieve a good classification of the test data. These parameters were used for both Seymour and Jeune Landing.

6.3.1. Analyses based on topographic attributes

In this case, each pixel has a series of attributes (Elevation, Slope, Aspect, Plan Curvature, Profile Curvature, Interaction of Curvatures, Sca) which may be very similar to those of neighbouring pixels, but not identical. It is very likely that stable and unstable areas delineated in the watershed will not follow the terrain polygons very precisely. All attributes included in this case are numerical (ratio-type) data.

Preliminary analyses were carried out to gain more insight into the data set, and to rank the attributes analyzed. Topographic attributes were first examined using correlation analysis, since this is, in general, a good method to identify how the variables depend on each other. Correlation coefficients for the Seymour and Jeune Landing study areas are presented in Tables 6.1 and 6.2, respectively.

Table 6.1 Correlation coefficients for Seymour.

    Variable          Elevation   Slope   Aspect   Plan crv.   Profile crv.   Inter. of crv.   Sca
    Elevation            1         0.49    0.18      0.03        -0.15            0.11        -0.18
    Slope                0.49      1       0.25      0.01        -0.02            0.02        -0.17
    Aspect               0.18      0.25    1         0.02        -0.01            0.01        -0.02
    Plan crv.            0.03      0.01    0.02      1           -0.42            0.85        -0.35
    Profile crv.        -0.15     -0.02   -0.01     -0.42         1              -0.84         0.21
    Inter. of crv.       0.11      0.02    0.01      0.85        -0.84            1           -0.33
    Sca                 -0.18     -0.17   -0.02     -0.35         0.21           -0.33         1

The coefficients displayed in Table 6.1 do not show high correlation between any variables. There is a moderate correlation (0.49) between slope and elevation (i.e.
steep terrain occurs at higher elevations). Apart from that, the moderate to high correlations between curvatures were expected, due to the nature of these attributes. Sca is only weakly correlated with the other topographic attributes.

Table 6.2 Correlation coefficients for Jeune Landing.

    Variable          Elevation   Slope   Aspect   Plan crv.   Profile crv.   Inter. of crv.   Sca
    Elevation            1         0.19    0.20      0.06        -0.12            0.11        -0.24
    Slope                0.19      1       0.56      0.04         0.01            0.01        -0.14
    Aspect               0.20      0.56    1         0.03        -0.01            0.02        -0.13
    Plan crv.            0.06      0.04    0.03      1           -0.41            0.69        -0.26
    Profile crv.        -0.12      0.01   -0.01     -0.41         1              -0.94         0.09
    Inter. of crv.       0.11      0.01    0.02      0.69        -0.94            1           -0.17
    Sca                 -0.24     -0.14   -0.13     -0.26         0.09           -0.17         1

Correlation coefficients for Jeune Landing are similar to those for Seymour. Apart from the expected moderate to high correlations between curvatures, the only noticeable correlation is between slope and aspect (0.56), which indicates that steep slopes occur predominantly on certain aspects. Again, Sca is only weakly correlated with the other attributes.

To identify and rank the attributes which contribute most to the correct classification of data, Multiple Discriminant Analysis (MDA) was performed. MDA was conducted using Statistical Analysis Software (SAS) version 8.02, with the procedures Candisc and Discrim. MDA was conducted to discriminate between two classes, according to Criterion c: terrain Classes IV - V vs. Classes I - III. First, for both sites, the training datasets were used to identify the discriminant function. Next, the testing datasets were classified with these functions, and the accuracy of the classification was assessed. MDA correctly classified 64% of the testing data for Seymour, and only 38% for Jeune Landing. From MDA, the canonical correlation coefficients were extracted and are presented in Table 6.3. The canonical correlation coefficients express the contribution of each attribute to the quality of the classification.
Table 6.3 Results of MDA - canonical correlation coefficients.

    Variable          Canonical correlation coefficients
                      Seymour     Jeune Landing
    Elevation           0.58         0.40
    Slope               0.99         0.89
    Aspect              0.19         0.39
    Plan crv.           0.03        -0.24
    Profile crv.       -0.09         0.11
    Inter. of crv.      0.07        -0.18
    Sca                -0.21         0.09

Based on Table 6.3, for Seymour, the two most important attributes are Slope and Elevation. The attribute that comes third based on its absolute value is Sca; however, the fact that it is negatively correlated with the discriminant function is somewhat counterintuitive, especially after eliminating pixels with very high values of Sca. The importance of the other attributes can also be inferred from the absolute value of their canonical correlation coefficients. One note should also be made about Profile Curvature: the negative value of this coefficient cannot be explained.

For Jeune Landing, Slope is also the most important attribute. Elevation and Aspect are ranked next, but the difference between their coefficients is small. Plan and Profile Curvatures swap ranks for this site. Sca ranks last with respect to its contribution to the correct classification. Based on the above results, the ranking of attributes for Seymour and Jeune Landing is presented in Table 6.4.

Table 6.4 Ranking of attributes based on MDA.

    Rank    Seymour           Jeune Landing
    1       Slope             Slope
    2       Elevation         Elevation
    3       Sca               Aspect
    4       Aspect            Plan crv.
    5       Profile crv.      Inter. of crv.
    6       Inter. of crv.    Profile crv.
    7       Plan crv.         Sca

The ranking of attributes in Table 6.4 is debatable. One can argue, especially for Jeune Landing, that the ranking is based on a classification that yields relatively unsatisfactory results (38% correct). To address this problem, different scenarios were analyzed, which included various combinations of topographic attributes. Combinations were driven both by the importance of the attributes, as derived from data already presented in this chapter, and by domain knowledge.
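As a check on the ranking step, the order in Table 6.4 can be reproduced by sorting the attributes by the absolute value of the canonical correlation coefficients in Table 6.3. The sketch below only re-uses the published coefficients; the variable and function names are illustrative.

```python
# Canonical correlation coefficients copied from Table 6.3.
COEFFS = {
    "Seymour": {
        "Elevation": 0.58, "Slope": 0.99, "Aspect": 0.19, "Plan crv.": 0.03,
        "Profile crv.": -0.09, "Inter. of crv.": 0.07, "Sca": -0.21,
    },
    "Jeune Landing": {
        "Elevation": 0.40, "Slope": 0.89, "Aspect": 0.39, "Plan crv.": -0.24,
        "Profile crv.": 0.11, "Inter. of crv.": -0.18, "Sca": 0.09,
    },
}

def rank_attributes(coeffs):
    """Order attribute names by decreasing absolute coefficient."""
    return sorted(coeffs, key=lambda name: abs(coeffs[name]), reverse=True)

rankings = {site: rank_attributes(c) for site, c in COEFFS.items()}
for site, order in rankings.items():
    print(site, "->", ", ".join(order))
```

Sorting by absolute value gives Slope, Elevation, Sca, Aspect, Profile crv., Inter. of crv., Plan crv. for Seymour, and Slope, Elevation, Aspect, Plan crv., Inter. of crv., Profile crv., Sca for Jeune Landing, in agreement with Table 6.4.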
Table 6.5 lists the combinations analyzed for Seymour and Jeune Landing. Obviously, the selection of attributes included in each scenario was influenced by the results obtained in previously analyzed scenarios. However, for clarity of presentation, all analyses are introduced in this chapter, and results are presented in Chapter 7.

Table 6.5 Scenarios analyzed with combinations of topographic attributes.

Scenario | Attributes included: Seymour | Attributes included: Jeune Landing
1 | Slope, Elevation, Sca, Aspect, Profile Curvature, Interaction of Curvatures, Plan Curvature. | Slope, Elevation, Aspect, Plan Curvature, Interaction of Curvatures, Profile Curvature, Sca.
2 | Slope, Elevation, Sca, Aspect, Profile Curvature, Interaction of Curvatures. | Slope, Elevation, Aspect, Plan Curvature, Interaction of Curvatures, Profile Curvature.
3 | Slope, Elevation, Sca, Aspect, Profile Curvature. | Slope, Elevation, Aspect, Plan Curvature, Interaction of Curvatures.
4 | Slope, Elevation, Sca, Aspect. | Slope, Elevation, Aspect, Plan Curvature.
5 | Slope, Elevation, Sca. | Slope, Elevation, Aspect, Sca.
6 | Slope, Elevation, Aspect. | Slope, Elevation, Aspect.
7 | Slope, Elevation. | Slope, Elevation.
8 | Slope, Sca. | Slope, Sca.
9 | Slope. | Slope.

For the Seymour site:
• Scenarios 1-3 analyze the influence of the three curvatures.
• Scenarios 4-7 analyze Sca and Aspect; analyses are performed with both, one, or none of them, to identify their influence.
• Scenario 8 includes only Slope and Sca (these parameters are considered very important in SINMAP).
• Scenario 9 is conducted (for completeness) to analyze the quality of prediction based only on Slope.

The analyses for Jeune Landing can be described as follows:
• Scenarios 1-4 investigate the influence of Sca and the three curvatures.
• Scenarios 5-9 replicate the corresponding scenarios for the Seymour site.
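The structure of the Seymour column of Table 6.5 can be written down compactly: Scenarios 1-5 keep the top-ranked attributes from the MDA ranking, dropping one attribute at a time, while Scenarios 6-9 are specific combinations listed in the table. The sketch below encodes this reading of the table; the function name is illustrative, and the top-k rule is an observation about the table rather than a procedure stated in the text.

```python
# MDA ranking for Seymour (Table 6.4), written with full attribute names.
SEYMOUR_RANKING = [
    "Slope", "Elevation", "Sca", "Aspect",
    "Profile Curvature", "Interaction of Curvatures", "Plan Curvature",
]

def seymour_scenarios(ranking):
    """Attribute subsets for the Seymour column of Table 6.5 (illustrative reconstruction)."""
    scenarios = {}
    # Scenarios 1-5: keep the 7, 6, 5, 4 and 3 highest-ranked attributes.
    for number, keep in zip(range(1, 6), range(7, 2, -1)):
        scenarios[number] = ranking[:keep]
    # Scenarios 6-9: combinations listed explicitly in Table 6.5.
    scenarios[6] = ["Slope", "Elevation", "Aspect"]
    scenarios[7] = ["Slope", "Elevation"]
    scenarios[8] = ["Slope", "Sca"]
    scenarios[9] = ["Slope"]
    return scenarios

scenarios = seymour_scenarios(SEYMOUR_RANKING)
for number in sorted(scenarios):
    print(number, ", ".join(scenarios[number]))
```

A dictionary of this kind could then be used to select the input columns for each training run, so that every scenario is evaluated with the same LVQ settings.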
This approach was taken both to produce a list of the most important attributes for terrain stability mapping, and to further investigate the influence of Sca.

As a part of the analysis with topographic attributes, the effect of data standardization was investigated.

When a decision was made as to whether to retain an attribute, a very conservative approach was taken. When the contribution of an attribute was not clear, the attribute was further investigated. When there was even a slight indication that an attribute contributes to a good prediction, it was retained. Vesanto (2000) specifies that, in general, if the analyst is uncertain about what variables to use, it is better to take too many variables than too few. It is part of the modeling to eliminate the unimportant or unreliable variables.

6.3.2. Analyses based on geomorphic attributes

In this case, all pixels within a polygon share the same parameters. Consequently, the polygons after classification follow the boundaries of the existing ones. Although this model was built from data, domain knowledge is also used in the development process. Russell and Norvig (1995) describe this as 'the problem of learning when you already know something', stressing that the use of background knowledge allows much faster learning than one might expect.

For analyses based on geomorphic attributes, the first step consisted of ranking them, in the following order of decreasing importance:
• Geomorphic Processes - when present, they clearly indicate stability problems.
• Polygon Slope - in general this is an important attribute for stability assessment; when recorded, this parameter makes Expression Slope redundant.
• Expression Thickness - all studies based on the infinite slope model (e.g. Hammond et al. 1992) stress that thickness (depth) of materials is an important factor in terrain stability.
• Surficial and Subsurficial Materials - in general, these parameters are important, but their ranking was also dictated by the accuracy with which they can be delineated.
• Expression Complex - the importance of this parameter is relatively low since its 'value' is mostly included in the slope.
• Delimiters - this parameter was ranked last, mainly because of its vague definition and difficulties related to coding it.

Although the above classification is subjective, it was driven by the importance of the attributes and the accuracy with which they are recorded. To account for this subjectivity, many analyses were carried out by combining these attributes (Table 6.6).

In Table 6.6, the first 6 scenarios include a different number of attributes for the two sites. This was a direct consequence of the fact that Jeune Landing was described with only two Surficial Materials and one Subsurficial Material. Scenarios 7-21 are identical for both sites. A list of attributes is provided in Table 6.6 for each site and each scenario. At the end of the list the total number of attributes used is provided, along with the dimensionality of the input vectors, based on the 1-of-n coding (in brackets).

Table 6.6 Analyses performed with geomorphic attributes.

Scenario 1. All Geomorphic attributes.
  Seymour:       3 Surficial Mat., 2 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Delimiters, Polygon Slope, 3 Processes - 29 attributes (188).
  Jeune Landing: 2 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Delimiters, Polygon Slope, 3 Processes - 19 attributes (127).
Scenario 2. All Geomorphic attributes less Delimiters.
  Seymour:       3 Surficial Mat., 2 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Polygon Slope, 3 Processes - 25 attributes (179).
  Jeune Landing: 2 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Polygon Slope, 3 Processes - 17 attributes (118).
Scenario 3. All Geomorphic attributes less Expr. Slope.
  Seymour:       3 Surficial Mat., 2 Subsurf. Mat. (including Expr. Thickness and Expr. Complex), Delimiters, Polygon Slope, 3 Processes - 24 attributes (163).
  Jeune Landing: 2 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness and Expr. Complex), Delimiters, Polygon Slope, 3 Processes - 16 attributes (112).
Scenario 4. All Geomorphic attributes less Polygon Slope.
  Seymour:       3 Surficial Mat., 2 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Delimiters, 3 Processes - 27 attributes (178).
  Jeune Landing: 2 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness, Expr. Slope, Expr. Complex), Delimiters, 3 Processes - 16 attributes (117).
Scenario 5. All Geomorphic attributes less Expr. Slope and Expr. Complex.
  Seymour:       3 Surficial Mat., 2 Subsurf. Mat. (including only Expr. Thickness), Delimiters, Polygon Slope, 3 Processes - 19 attributes (123).
  Jeune Landing: 2 Surficial Mat., 1 Subsurf. Mat. (including only Expr. Thickness), Delimiters, Polygon Slope, 3 Processes - 13 attributes (88).
Scenario 6. Eliminate 2nd Subsurf. Mat., 3rd Surf. Mat. and corresponding Expr. Thickness and Delimiters.
  Seymour:       2 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness), Delimiters, Polygon Slope, 3 Processes - 13 attributes (88).
  Jeune Landing: This is identical to Scenario No. 5 for Jeune Landing.
Scenario 7. Eliminate 2nd Surf. Mat. and corresponding Expr. Thickness and Delimiter.
  Seymour:       1 Surficial Mat., 1 Subsurf. Mat. (including Expr. Thickness), Polygon Slope, 2 Processes - 9 attributes (69).
  Jeune Landing: Same as for Seymour.
Scenario 8. Eliminate 1st Subsurf. Mat. and corresponding Expr. Thickness; only the Most Representative Process.
  Seymour:       1 Surficial Mat., Expr. Thickness (of 1st Surficial Mat.), Polygon Slope, Most Repres. Process - 5 attributes (34).
  Jeune Landing: Same as for Seymour.
Scenario 9. As No. 8 but with the 1st Process.
  Seymour:       1 Surficial Mat., Expr. Thickness (of 1st Surficial Mat.), Polygon Slope, 1st Process - 5 attributes (34).
  Jeune Landing: Same as for Seymour.
Scenario 10. Eliminate Expr. Thickness (N.B. 1st Surf. Material has also a partial cover delimiter).
  Seymour:       1 Surficial Mat., Polygon Slope, Most Repres. Process - 4 attributes (30).
  Jeune Landing: Same as for Seymour.
Scenario 11. As No. 10 but with the 1st Process.
  Seymour:       1 Surficial Mat., Polygon Slope, 1st Process - 4 attributes (30).
  Jeune Landing: Same as for Seymour.
Scenario 12. Expr. Thickness (for 1st Surf. Mat.), Most Repres. Process, Polygon Slope.
  Seymour:       Expr. Thickness, Polygon Slope, Most Repres. Process - 3 attributes (22).
  Jeune Landing: Same as for Seymour.
Scenario 13. As No. 12 but with the 1st Process.
  Seymour:       Expr. Thickness, Polygon Slope, 1st Process - 3 attributes (22).
  Jeune Landing: Same as for Seymour.
Scenario 14. Expr. Thickness (for 1st Surf. Mat.) and Most Repres. Process.
  Seymour:       Expr. Thickness, Most Repres. Process - 2 attributes (12).
  Jeune Landing: Same as for Seymour.
Scenario 15. As No. 14 but with the 1st Process.
  Seymour:       Expr. Thickness, 1st Process - 2 attributes (12).
  Jeune Landing: Same as for Seymour.
Scenario 16. Expr. Thickness (for 1st Surf. Mat.) and Polygon Slope.
  Seymour:       Expr. Thickness, Polygon Slope - 2 attributes (14).
  Jeune Landing: Same as for Seymour.
Scenario 17. Most Repres. Process, Polygon Slope.
  Seymour:       Most Repres. Process, Polygon Slope - 2 attributes (18).
  Jeune Landing: Same as for Seymour.
Scenario 18. 1st Process, Polygon Slope.
  Seymour:       1st Process, Polygon Slope - 2 attributes (18).
  Jeune Landing: Same as for Seymour.
Scenario 19. Most Repres. Process.
  Seymour:       Most Repres. Process - 1 attribute (8).
  Jeune Landing: Same as for Seymour.
Scenario 20. 1st Process.
  Seymour:       1st Process - 1 attribute (8).
  Jeune Landing: Same as for Seymour.
Scenario 21. Polygon Slope.
  Seymour:       Polygon Slope - 1 attribute (10).
  Jeune Landing: Same as for Seymour.

A description of scenarios included in Table 6.6 follows:
• Scenarios 1-4: investigate all geomorphic attributes available, the influence of delimiters, and the contrast between Polygon Slope and Expression Slope.
• Scenario 5: is based on the results of previous scenarios and investigates the influence of Expression Complex.
• Scenarios 6-11: analyze the influence of Surficial and Subsurficial Materials.
• Scenarios 12-21: analyze the influence of Expression Thickness, Geomorphic Processes and Polygon Slope.
• The pairs of scenarios (8, 9), (10, 11), (12, 13), (14, 15), (17, 18), and (19, 20) investigate the impact of using either the most spread process identified in a certain polygon, or the most destructive process (as presented in Chapter 5.2.3). As the total number of attributes was reduced, a special concern was the number of attributes that represent the same type of data. For example, at a certain point (Scenario 7), it was clear that geomorphic processes carry too much weight. Therefore, the number of processes was reduced from three to two, and then to one. At this point, the one to be retained was either the one with the largest areal extent, or the one most destructive with respect to terrain stability.
• Scenarios 19-21 are mainly performed for completeness, so that a full evaluation of all geomorphic attributes is done.

In a similar manner as with the topographic attributes, even when there was a slight indication that a geomorphic attribute contributes to a good prediction, it was retained for the next phase of the analysis.

6.3.3. Analyses based on topographic and geomorphic attributes.

In this case, each pixel is characterized by unique topographic attributes derived from the DEM, and by geomorphic attributes common to all pixels that fall inside the same terrain polygon. This series of analyses starts with the topographic and geomorphic attributes retained as a result of previous analyses. As presented in Figure 6.1, at this stage, the two short lists of attributes are combined, and analyses are performed to further reduce the list. The attributes retained in previous analyses, and the types of analyses performed at this stage are presented in Chapter 7.

6.4. CROSS-VALIDATION OF THE MODEL.
The purpose of this analysis was to investigate to what extent results from one site can be extrapolated to the other. Coupled with the evaluation based on training and testing data for each site, this type of analysis can indicate how mapping of a certain area can be extended to map a neighbouring area, based on the principles developed in this study. In my case, the factor favorable to this analysis is that the two watersheds are similar to a certain extent with respect to their geomorphologic and hydrologic attributes. However, the two watersheds are not as similar as two adjacent watersheds would be, and the quality of the data available for the two sites is different. In Chapter 4, some issues were identified in Jeune Landing regarding the classification of some Class IV polygons. Also, for Jeune Landing the initiation zone was not identified for geomorphic processes. To conduct the cross-validation, this information also had to be discarded from Seymour. In the light of these modifications, extrapolation of results was conducted only as a secondary check for the approach developed in this study. The main criterion was the prediction yield within each watershed, using the test data.

The analysis was performed using the reduced set of topographic and geomorphic attributes. First, the codebook vectors created for Seymour were used to classify the test data in Jeune Landing. Then, the analysis was repeated in the opposite direction: Seymour test data were classified with the codebook vectors from Jeune Landing.

6.5. COMPARISON BETWEEN TERRAIN STABILITY ASSESSMENT METHOD DEVELOPED IN THIS STUDY AND AN EXISTING MODEL (SINMAP)

Prediction yield by this model was contrasted with one of the most modern models for terrain stability assessment. The software called SINMAP (Pack et al. 1998) was used for this comparison. The main features of the SINMAP formulation are:
• It is based on an infinite slope stability model.
• Topography is represented as a DEM.
• Spatial distribution of groundwater is based on shallow subsurface flow convergence and topographic slope.
• Uncertainty of parameters is incorporated through ranges of soil and hydrology parameters.
• It is interactively calibrated.
• It is designed to work as an ArcView extension (ArcView is a registered trademark of the Environmental Systems Research Institute Inc. (ESRI)).

Essentially, SINMAP maps potential landslide initiation zones. The software applies only to shallow translational landslides controlled by shallow groundwater flow convergence. SINMAP inputs first consist of a unique value for soil density (kg/m3), and three soil parameters (given as ranges): (i) dimensionless cohesion, defined in SINMAP as a function of root cohesion, soil cohesion, soil thickness and soil density; (ii) soil angle of internal friction (deg.); and (iii) the ratio transmissivity / recharge. Transmissivity is a function of hydraulic conductivity and soil thickness, and recharge is a measure of water input (this ratio has units of m). If available, a theme with existing landslides is also used to improve the accuracy of the prediction.

The main SINMAP output is a Stability Index for each cell in the DEM. Essentially, the Stability Index is a measure of the stability of the cell. Also, a calibration plot is presented. In this plot, a predefined number of random pixels are displayed, in coordinates Slope - Sca. The plot reflects the distribution of mapped area by stability classes. Statistics on the areal distribution of stability classes are also provided. If available, existing slides are plotted in the graph as well. Based on the calibration plot (and especially on the distribution of existing slides between stability classes in the plot), the input parameters are adjusted. The analysis is repeated until a satisfactory state is reached. At the end of this process, the map obtained is considered the prediction made by SINMAP.
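To make the role of the inputs listed above more concrete, the following Python sketch evaluates an infinite-slope factor of safety for a single cell and a single parameter combination. It follows the form in which the SINMAP factor of safety is commonly stated in the SINMAP documentation (dimensionless cohesion, a relative wetness capped at saturation, and the water-to-soil density ratio). This formula is not reproduced in this chapter, so the expression, the function name and the argument names should be read as assumptions. The actual Stability Index is derived over the parameter ranges rather than from one combination.

```python
import math

WATER_DENSITY = 1000.0  # kg/m3


def factor_of_safety(slope_deg, sca_m, cohesion, friction_deg, t_over_r_m, soil_density):
    """Infinite-slope factor of safety for one cell (illustrative sketch).

    slope_deg    - topographic slope (degrees), must be > 0
    sca_m        - specific catchment area (m)
    cohesion     - dimensionless cohesion C
    friction_deg - soil angle of internal friction (degrees)
    t_over_r_m   - transmissivity / recharge ratio (m)
    soil_density - soil density (kg/m3)
    """
    if slope_deg <= 0:
        raise ValueError("slope must be positive")
    theta = math.radians(slope_deg)
    phi = math.radians(friction_deg)
    # Relative wetness: catchment area against drainage capacity, capped at 1 (saturated).
    wetness = min(sca_m / (t_over_r_m * math.sin(theta)), 1.0)
    density_ratio = WATER_DENSITY / soil_density
    resisting = cohesion + math.cos(theta) * (1.0 - wetness * density_ratio) * math.tan(phi)
    return resisting / math.sin(theta)


# Example: a 40 degree slope with assumed, purely illustrative parameter values.
print(round(factor_of_safety(40.0, 500.0, 0.5, 40.0, 2000.0, 1800.0), 3))
```

Values below 1 indicate that the destabilizing component exceeds the resisting one for that parameter combination; larger Sca or steeper slope lowers the value, which is consistent with the calibration plot being drawn in Slope - Sca coordinates.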
As the location of landslides was available only for Seymour, SINMAP was tested only on this site. There is no direct correspondence between the prediction yield by SINMAP and the classification system in use in BC. A first method for assessment was to compare the number of landslides correctly predicted by the two methods. Also, the area considered unstable or potentially unstable based on its Stability Index was contrasted with polygons Class IV and V in the existing mapping.

The parameters input in SINMAP were chosen based on results presented by Wilkinson (1996) and Jaakkola (1998), who conducted field tests and SINMAP analyses in an area adjacent to Seymour. Soil density (ρ) was set to 1800 kg/m3. The list of initial and final parameters is presented in Table 6.7. The main purpose of this analysis was to evaluate the two analytical methods represented by these models.

Table 6.7 List of initial and final parameters used in SINMAP.

                                            Values
Parameters                                  Initial         Final
Dimensionless cohesion - C                  0 - 2           0 - 1.2
Angle of internal friction - φ (deg.)       28 - 47         35 - 47
Transmissivity / Recharge - T/R (m)         200 - 3000      1000 - 3000

6.6. UNSUPERVISED LEARNING

Preliminary steps in unsupervised learning consisted of selecting the SOM (i.e. 2-D lattice) size, the number of steps, neighbourhood size, and the learning rate required in training. Selection of the map size is an important step: maps that are very large may take a longer time to train and may have large empty (unused) areas, whereas maps that are too small may group dissimilar pixels on the same neuron. The first step in selection of the map size utilized Sammon's mapping. Sammon's mapping (a gradient descent search) gives only a general indication of the spread of the data, and for large data sets it may be very time consuming. For this reason, it was performed on a subset of 1,000 pixels, randomly selected from Seymour.
Sammon's mapping can handle only numerical data, and was conducted only on topographic parameters. Beyond this, preliminary SOM analyses were conducted with subsamples of data, from different portions of Seymour and Jeune Landing. Based also on recommendations from the literature (Kohonen et al. 1996b; Kohonen 2001), the following parameters were used for both Seymour and Jeune Landing: the dimensions of the map were set to 15 x 10. For the first phase of training, the neighborhood width was set to 12, and the learning rate to 0.05. The dataset was presented to the net in two epochs (the number of steps was approximately 170,000). For the second phase of training, the parameters were, respectively: 3 for neighborhood width, 0.02 for learning rate, and six epochs (approximately 510,000 steps).

6.6.1. Unsupervised learning analyses

In unsupervised learning the training data were used to create the SOM, and test data were used to calibrate it (label neurons as stable or unstable, by majority voting, based on the number of hits). This method is more objective compared to manual labeling of neurons and is also faster, but may artificially increase the quality of calibration, because test data consist of random pixels scattered over the entire area and thus include all the possible situations.

For unsupervised analysis, only the reduced set of topographic and geomorphic attributes from LVQ was used. This decision was made because background knowledge on terrain stability gives us strong reason to believe that the attributes that proved to be important in LVQ analysis will also be important in SOM. Besides, the SOM analysis is more time consuming than LVQ.

No cross-validation of results was performed for unsupervised learning. This was driven by the practical consideration that new areas are better classified in the supervised mode, using existing maps that are carefully checked.

6.6.2. Interpretation of SOM-based results

As presented in Chapter 3, visualization of individual attributes on SOM is possible, but only valid for numerical parameters. This method allows investigation of the 'reasoning' used for terrain stability mapping. To conduct this analysis, terrain stability mapping was performed for all the topographic attributes, and their distribution on the map was visualized. This type of analysis confirms the importance (weight) of attributes in terrain stability mapping.

CHAPTER 7. RESULTS AND DISCUSSION

Results are presented in this chapter in the same order introduced in Chapter 6. When analyses are performed separately for the two study sites, results for Seymour are introduced first, then for Jeune Landing. Results are presented based on the four criteria described in Chapter 6.1, and in general, presentation and discussion of results follow the order of criteria (from Criterion a to d). For most analyses results are presented both in graphic and in tabular form, and thus, occasional duplication occurs. This style was chosen because the graphics allow a better presentation of trends, and tables show the exact numbers. In this chapter, when errors are reported they refer to the total error (Type I plus Type II); in some cases, however, comments are made only on one type of error and this is clearly stated. A summary of the results is presented in Chapter 7.5.

7.1. SUPERVISED LEARNING

7.1.1. Results based on topographic attributes

Results are presented according to scenarios described in Table 6.5.

Seymour

The analysis was performed using standardized and non-standardized data, both with raw numbers (i.e. numbers as they are) and with 1-of-n coding. When numbers are used, standardized data produce better results. In general, these results were 2 - 4% better with respect to Criterion c, and varied for the other criteria. However, when 1-of-n coding is used, predictions are very similar.
A possible explanation for this is that after elimination of pixels with very large Sca, topographic attributes differ from one another by no more than one order of magnitude. When coupled with 1-of-n coding, these differences become less visible, and make standardization unnecessary. Results for Seymour, using non-standardized numbers and 1-of-n coding, are presented in Table 7.1. Results with respect to Criteria b and c are also displayed in Figure 7.1.

Table 7.1 Seymour - results for analyses based on topographic attributes.

            Errors (%) by criterion
Scenario    Crit. a   Crit. b   Crit. c   Crit. d   Attributes included
1           29        18        16        43        Slope, Elevation, Sca, Aspect, Prof.crv., Int.crv., Plan.crv.
2           26        17.4      16.3      42.7      Slope, Elevation, Sca, Aspect, Prof.crv., Int.crv.
3           29        17.6      16.4      42.4      Slope, Elevation, Sca, Aspect, Prof.crv.
4           27.4      17.9      16.4      43        Slope, Elevation, Sca, Aspect.
5           28.2      20.8      21        48.8      Slope, Elevation, Sca.
6           29.7      19        17.1      42.9      Slope, Elevation, Aspect.
7           26.8      22.4      24        52        Slope, Elevation.
8           28.1      25.7      24.9      53.2      Slope, Sca.
9           2.6       37.7      32.1      52.9      Slope.

Figure 7.1 Seymour - prediction errors for analyses based on topographic attributes, Criteria b and c.

Table 7.1 shows that prediction errors associated with Criterion a are high, being in the range of 26 - 30%. For Scenario 9, the error was only 2.6% because the entire area was classified as stable. This error is small simply because the area of existing landslides (and associated buffers) is small compared to the entire watershed - this is an artifact produced by representation of results using percentages. Errors associated with Criteria b and c are similar, which indicates consistent terrain stability mapping with respect to terrain Class IV and V. For Criterion c, results indicate that there is no change in prediction errors when curvatures are dropped (Scenarios 1 - 3).
For Scenario 4, which includes Slope, Elevation, Sca, and Aspect, the errors are similar to the previous ones which include more attributes. Dropping Aspect (Scenario 5) increases the error. Scenario 6, based on Slope, Elevation, and Aspect, indicates that Sca has a relatively minor contribution to the quality of mapping. Scenarios 7-9 demonstrate the importance of Slope and Elevation. Although Scenarios 8 and 9 show that Sca has a major impact on classification, the series of Scenarios 7 - 9 tends to show that, in fact, the influence of Sca is accounted for by Elevation. Based on these results, Slope, Elevation, Aspect, and Sca were included in the next phase of analysis.

LVQ classification with respect to Criterion c for the four attributes retained has an accuracy of 83.6% (Scenario 4). This represents a major improvement compared to the classification based on statistical methods (MDA), which classified correctly only 64% of the testing data.

Since there is little doubt about the influence of Slope, Elevation, and Aspect, their impact on the quality of prediction is further analyzed with respect to Criterion c. Scenario 9 shows that prediction based only on Slope has an error of 32.1% (approximately 68% accurate); this confirms that Slope is indeed a very powerful attribute in delineating unstable and potentially unstable terrain. The error is further reduced to 24% by introducing Elevation (Scenario 7). Scenario 6 shows that introduction of Aspect reduces the error to 17.1% (i.e. prediction based on Slope, Elevation, and Aspect is 82.9% accurate). Obviously, these reductions in error represent not only the influence of an individual attribute, but also the combined influence of the newly introduced attribute and the existing ones. Classification with respect to Criterion b follows a similar trend, and the accuracy of prediction based on the same attributes (Slope, Elevation, and Aspect; Scenario 6) is 81%.
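The stepwise reading of Scenarios 9, 7 and 6 amounts to simple arithmetic on the Criterion c errors of Table 7.1. A minimal Python sketch is given below; the error values are copied from Table 7.1, while the names and the helper functions are illustrative only. As noted above, each reduction reflects the combined effect of the new attribute and those already present, so the numbers depend on the order in which attributes are introduced.

```python
# Criterion c errors (%) for Seymour, copied from Table 7.1.
CRITERION_C_ERROR = {
    ("Slope",): 32.1,                        # Scenario 9
    ("Slope", "Elevation"): 24.0,            # Scenario 7
    ("Slope", "Elevation", "Aspect"): 17.1,  # Scenario 6
}


def accuracy(error_pct):
    """Accuracy (%) is the complement of the total error (%)."""
    return round(100.0 - error_pct, 1)


def stepwise_reductions(errors):
    """Error reduction (percentage points) obtained each time one attribute is added."""
    ordered = sorted(errors, key=len)  # attribute sets from smallest to largest
    steps = []
    for previous, current in zip(ordered, ordered[1:]):
        added = [name for name in current if name not in previous][0]
        steps.append((added, round(errors[previous] - errors[current], 1)))
    return steps


REDUCTIONS = stepwise_reductions(CRITERION_C_ERROR)
print(REDUCTIONS)
print(accuracy(CRITERION_C_ERROR[("Slope", "Elevation", "Aspect")]))
```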
With respect to Criterion d, Table 7.1 shows that classification errors range from 42 to 53%. These high errors indicate that when using only topographic attributes, the model is unable to generate a classification system based on 5 classes.

Figure 7.2 presents the prediction based on the attributes retained at this phase (Slope, Elevation, Sca, Aspect - Scenario 4), with respect to Criterion a. Terrain polygons Class V are displayed as well. Terrain identified as unstable in Figure 7.2 captures the majority of existing landslides, except for 16 (all the landslides missed occur at lower elevation, as outlined in the map detail, Figure 7.2b).

Figure 7.2 Map of Seymour showing prediction using Slope, Elevation, Sca, Aspect, based on existing landslides; a) The entire watershed; b) Detail. (Legend: stable terrain; unstable terrain; landslide and buffer.)

Terrain mapped as unstable outside the existing landslides (and buffers) represents Type I error (which constitutes the vast majority of errors in this case), while existing landslides not identified by the system represent Type II errors. In general, unstable terrain falls within terrain polygons Class V. This map is compared with SINMAP's prediction in Chapter 7.3.

Jeune Landing

The first step for Jeune Landing also consisted of evaluating coding methods: standardized vs. non-standardized numbers, with and without 1-of-n coding. Results were similar to those obtained for Seymour. Standardized numbers produced better results, but when using 1-of-n coding, the predictions are the same with both methods. Consequently, 1-of-n coding with non-standardized attributes was used. Based on the above results, data for Jeune Landing were recoded, this time based on the ranges of attributes for Seymour. Selecting Seymour as a basis for 1-of-n coding was considered best, as ranges of attributes for Seymour generally include those for Jeune Landing.
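The idea of coding both sites on a common basis can be sketched as follows. This is only an illustration in Python: it assumes that a numerical attribute is divided into equal-width ranges derived from the reference site and that each value is represented by a vector with a single 1 marking its range. The number of ranges, the equal-width choice and all names are assumptions, since the coding details are not given in this section.

```python
from bisect import bisect_right


def make_bin_edges(reference_values, n_bins):
    """Interior edges of n_bins equal-width ranges spanning the reference site's values."""
    lowest, highest = min(reference_values), max(reference_values)
    width = (highest - lowest) / n_bins
    return [lowest + width * i for i in range(1, n_bins)]


def one_of_n(value, edges):
    """1-of-n code: one position per range, with a single 1 at the range containing value.

    Values outside the reference range fall into the first or the last range.
    """
    code = [0] * (len(edges) + 1)
    code[bisect_right(edges, value)] = 1
    return code


# Hypothetical slope values (degrees); edges come from the reference site only.
reference_slopes = [0.0, 50.0, 100.0]
edges = make_bin_edges(reference_slopes, 4)   # ranges split at 25, 50 and 75
print(one_of_n(10.0, edges))                  # value from the reference site
print(one_of_n(60.0, edges))                  # value from the other site, same edges
```

Because the edges are computed once from the reference site and reused for the other site, a given code position has the same meaning in both data sets.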
Coding both areas according to the same basis makes the cross-validation of mappings feasible. After the new coding, the analyses for Jeune Landing were redone, with essentially the same results, and 1-of-n coding with non-standardized numbers was still the best method.

Results for Jeune Landing using topographic attributes are presented in Table 7.2. As noted before, data on existing landslides were not available, therefore Criterion a cannot be evaluated for Jeune Landing. Results for Criteria b and c are also presented in Figure 7.3. The first difference from Seymour is that errors for Criteria b and c are different. The error associated with Criterion c is always higher, which indicates an inconsistency (a change in accuracy level) for mapping of terrain Class IV polygons.

Table 7.2 and Figure 7.3 indicate that with respect to Criterion b errors are almost at the same level for the first eight scenarios. For Criterion c, prediction accuracy of Scenarios 1 - 6 is almost the same (as presented in Table 6.5, Scenarios 1 - 5 do not include the same attributes as Seymour, but Scenarios 6 - 9 are identical). Scenarios 1 - 4 indicate that dropping Sca and the curvatures has no effect on the quality of mapping. Scenarios 5-6 further investigate the influence of Sca and indicate that for Jeune Landing, this attribute has a minor influence. Scenarios 7-9 show that dropping Aspect has a major impact on prediction. Similarly with Seymour, Scenarios 7 and 8 show that Elevation and Sca have a similar effect when combined with Slope.

Table 7.2 Jeune Landing - results for analyses based on topographic attributes.

            Errors (%) by criterion
Scenario    Crit. b   Crit. c   Crit. d   Attributes included
1           14.3      25        33        Slope, Elevation, Aspect, Plan.crv., Int.crv., Prof.crv., Sca.
2           14.2      25.1      32.9      Slope, Elevation, Aspect, Plan.crv., Int.crv., Prof.crv.
3           14.2      24.8      33.1      Slope, Elevation, Aspect, Plan.crv., Int.crv.
4           14        25        33        Slope, Elevation, Aspect, Plan.crv.
5           14.1      24.9      32.7      Slope, Elevation, Aspect, Sca.
6           14.7      25.2      34.7      Slope, Elevation, Aspect.
7           16.5      37.7      41.3      Slope, Elevation.
8           14.8      34.5      40.7      Slope, Sca.
9           19.7      35.2      41        Slope.

Figure 7.3 Jeune Landing - results for analyses based on topographic attributes, Criteria b and c.

As a result of these analyses, attributes retained for the next phase were the same as for Seymour: Slope, Elevation, Aspect and Sca. LVQ classification with respect to Criterion c for the four attributes retained has an accuracy of 75.1% (Scenario 5). For Jeune Landing, MDA classified correctly only 38% of the testing data. LVQ classification represents a major improvement compared to MDA, and a much greater improvement than for Seymour.

The impact of Slope, Elevation, and Aspect on the quality of prediction is further analyzed with respect to Criterion c. Scenario 9 shows that prediction based only on Slope has an error of 35.2% (approximately 65% accurate); this is similar to the accuracy achieved in Seymour, and confirms once again the importance of Slope. Introduction of Elevation (Scenario 7) increases the error to 37.7%. This increase is counterintuitive and cannot be explained; however, since introduction of Elevation in Scenarios 5 and 6 does not hinder the analysis, elimination of this attribute is out of the question. Scenario 6 shows that introduction of Aspect reduces the error to 25.2% (i.e. prediction based on Slope, Elevation, and Aspect is 74.8% accurate). Prediction with respect to Criterion b for the same attributes is 85.3% accurate.

Similarly with Seymour, prediction based on Criterion d is characterized by high errors, which indicate the inability of the model to map according to the 5-class system.

7.1.2. Results based on geomorphic attributes.

Results are presented according to scenarios in Table 6.6, for each study site.
Seymour

Results of geomorphic attributes (Criteria b - d) are presented in Table 7.3 (in this table, for typographic reasons, the attributes included in each scenario are not shown). Criterion a is not feasible in this case because landslides cannot be associated with whole polygons. Results are also presented in Figure 7.4. With respect to Criteria b and c, Scenarios 1-12 show that most geomorphic attributes have little impact on the prediction. Attributes that proved to be the most important are: Expression Thickness, the most representative Geomorphic Process, and Polygon Slope (Scenario 12). Scenarios 13-21 further investigate these attributes and confirm their importance. Scenarios 17 and 18 indicate that Expression Thickness carries little weight compared to the other two attributes, but it still contributes to a better prediction.

Table 7.3 Seymour - results for analyses based on geomorphic attributes.

            Errors (%) by criterion
Scenario    Criterion b   Criterion c   Criterion d
1           3.9           5.9           9.1
2           4.3           5.5           12.4
3           5.9           5.5           9.6
4           3.6           8.1           18.2
5           4.1           6.2           12.2
6           4.7           3.6           18.3
7           5.1           6.9           15.6
8           4.2           5.2           17.1
9           6.1           7.8           17.5
10          4.2           5.8           13.7
11          5.2           6.7           13.8
12          5.4           4.1           17.2
13          5.7           7.9           17.9
14          12.2          12.3          25.9
15          13            13.5          27.7
16          19.8          9.1           28.8
17          6.6           5.4           16.8
18          9.2           5.9           19.6
19          20.3          16.3          25.2
20          24.8          17.6          25.4
21          17.5          9.7           37.8

Figure 7.4 Seymour - results for analyses based on geomorphic attributes (prediction errors by scenario, Criteria b, c and d).

The errors associated with Criteria b and c are in general similar, except for Scenarios 16, and 19-21 (Scenario 16 includes the attributes Expression Thickness and Polygon Slope, and Scenarios 19-21 include only one attribute, respectively most representative Geomorphic Process, first Geomorphic Process, and Polygon Slope).
The high errors associated with these scenarios and the inconsistency with previous scenarios (much higher error for Criterion b than for Criterion c) seem to indicate that the attributes or combinations of attributes included are not adequate for terrain stability mapping (in this form).

For Geomorphic Processes and Polygon Slope the results are in concordance with current practice. The example provided in Appendix 4 clearly identifies these attributes as the most important ones for terrain stability mapping. Expression Thickness, which quantifies the thickness of the layer of soil that overlies impervious bedrock, is also recognized as an important attribute in the literature. However, it is variable and difficult to measure over large areas.

The scenarios carried out in pairs to identify the influence of the first (most widely spread) and the most representative Geomorphic Process indicate that the second choice yields better results. In Scenario 12, the errors for Criteria b - c are in the range of 4 - 5%. These errors are unrealistically low and they are a direct consequence of the classification that assigns all pixels in a polygon the same attributes.

With respect to the 5-class system (Criterion d), the best classification is obtained with all geomorphic attributes (Scenario 1 - 9.1% error); when attributes are eliminated from the complete description, the quality of prediction degrades rapidly. Results for Scenario 1 are presented in Figure 7.5, where each terrain class is depicted with a different color, and for each stability class the corresponding terrain polygons are shown. The practical significance of this result is that the LVQ model can accurately assign stability classes to terrain polygons already mapped (based on the B.C. Terrain Classification System). The model can basically perform the function of a classification table, similar to the one presented in Appendix 4.
To apply it, terrain polygons should be delineated and symbols assigned to all of them. Then, a sample of these polygons (covering the most representative situations in the area) should have stability classes assigned. The model can learn stability criteria based on this classified subset, and then assign stability classes to the remaining polygons. However, this seems to have limited practical applicability.

Based on results obtained with geomorphic attributes with respect to Criterion c (which was identified as the most important), the conclusion is that the attributes to be retained for the next phase of the analysis are: Expression Thickness, the most representative Geomorphic Process, and Polygon Slope. However, when both terrain mapping and a DEM are available, each pixel has a slope derived from the DEM, hence the slope of the polygon is redundant. The only attributes to be used in the next phase (combined with topographic attributes) are Expression Thickness and the most representative Geomorphic Process.

Figure 7.5 Map of Seymour showing prediction based on geomorphic attributes; a) Class I; b) Class II; c) Class III; d) Class IV; e) Class V.

Jeune Landing

Results for Jeune Landing are presented in Table 7.4 and Figure 7.6. To a certain extent, these results are similar to the ones in Seymour. Scenarios 1-12 show that, with respect to Criteria b and c, most geomorphic attributes have little impact on the prediction. The most important attributes identified in this study site are the same: Expression Thickness, the most representative Geomorphic Process, and Polygon Slope (Scenario 12). Scenarios 13-21 confirm this, and Scenarios 17 and 18 indicate the relatively minor impact of Expression Thickness. However, Table 7.4 and Figure 7.6 show a different trend in errors for Criteria b and c, i.e. a major drop in the quality of prediction is visible from Criterion b to c.
This seems to indicate an inconsistency in the quality of terrain stability mapping with respect to terrain Classes V and IV.

Table 7.4 Jeune Landing - results for analyses based on geomorphic attributes.

Scenario    Errors (%) by criterion
            Criterion b    Criterion c    Criterion d
1           3.1            19.9           24.5
2           4.5            19.5           26.5
3           6.2            19.5           26.2
4           3.9            22.1           29.7
5           3.9            20.2           28.3
6           -              -              -
7           5.7            20.9           29.2
8           6.2            19.2           31.4
9           6.9            21.8           31.2
10          5.2            19.8           26.5
11          5.1            20.7           27.7
12          4.3            18.3           30.1
13          4.9            21.9           28.6
14          14.2           26.3           35.8
15          13.7           27.5           37.2
16          17.9           25.2           39.4
17          6.5            21.2           31.2
18          9.1            20.8           33.7
19          21.5           25.1           43.2
20          25.3           27.6           45.8
21          19.2           23.7           37.5

Figure 7.6 Jeune Landing - results for analyses based on geomorphic attributes.

As with Seymour, the exceptions are represented by Scenarios 19-21 and (to a smaller extent) by Scenario 16 (Scenario 16 includes the attributes Expression Thickness and Polygon Slope, and Scenarios 19-21 each include only one attribute: the most representative Geomorphic Process, the first Geomorphic Process, and Polygon Slope, respectively). However, for Jeune Landing, the inconsistency with previous scenarios is indicated by a reduced difference in accuracy between the two criteria. These results seem to indicate that the attributes or combinations of attributes included are not adequate for terrain stability mapping. The scenarios carried out in pairs to identify the influence of the first (most widely spread) and the most representative Geomorphic Process indicate that the second choice yields better results. For Criterion d, similar to Seymour, the best prediction was achieved when all attributes were included (Scenario 1, 24.5% error).
Results for the 5-class system are presented in Figure 7.7, where each stability class is depicted with a different colour, and for each class the corresponding terrain polygons are shown (the complex polygon shapes, and the large area covered by Class III polygons, make it more difficult to assess the maps displayed in this figure).

Figure 7.7 Map of Jeune Landing showing prediction based on geomorphic attributes; a) Class I; b) Class II; c) Class III; d) Class IV; e) Class V.

The practical significance of results presented in Figure 7.7 is that LVQ classification based on geomorphic attributes can accurately assign stability classes to terrain polygons already mapped (based on the Terrain Classification System), and the model can perform the function of a classification table (similar to the one presented in Appendix 4). However, as already stated for Seymour, this approach has limited practical applicability.

As an overall comment, the quality of the best results obtained in these analyses (Scenario 12, Criterion b - 95.7%, and Criterion c - 81.7%) is a direct consequence of assigning the same attributes to all pixels in a polygon. The most important attributes identified in these analyses are (the same as for Seymour) Expression Thickness, the most representative Geomorphic Process, and Polygon Slope. However, given the availability of a Slope for each pixel (calculated based on the DEM), the only attributes retained for the next phase of the analysis are Expression Thickness and the most representative Geomorphic Process.

7.1.3. Results based on topographic and geomorphic attributes.

These analyses are conducted with attributes retained in previous phases, namely: Slope, Elevation, Aspect, Sca, Expression Thickness, and Geomorphic Processes. As a first step, these attributes were ranked in the following order: Slope, Geomorphic Processes, Elevation, Aspect, Sca, and Expression Thickness.
The rationale for this order is based on the following:

• Slope was ranked first based on previous results and also on general information from other studies, as already presented in this thesis.
• Geomorphic Processes are a good indicator of instability, and this attribute proved to be important in the previous phase. However, as geomorphic processes are recorded both for polygons where processes initiate and for those traversed by them, this attribute was ranked second. Overall, previous analyses indicated that both Slope and Geomorphic Processes are very important attributes, and it is very likely that both will be kept in the final selection; therefore, their relative order is not relevant.
• The other topographic parameters were ranked based on the initial order and on their performance in the previous phase. These analyses also justify the swapping of places between Aspect and Sca. Furthermore, previous analyses came close to justifying the elimination of Sca; however, this attribute was kept for a thorough investigation of its influence on stability analysis.
• Expression Thickness was ranked last because of its relatively minor influence on the results and because of the low accuracy of its evaluation.

Based on this ranking, the analyses carried out are presented in Table 7.5. In the remainder of this sub-chapter, results are presented for the two study areas.

Table 7.5 Analyses performed with the reduced set of geomorphic and topographic attributes.

Scenario    Description
1           Slope, Geomorphic Processes, Elevation, Aspect, Sca, Expression Thickness.
2           Slope, Geomorphic Processes, Elevation, Aspect, Sca.
3           Slope, Geomorphic Processes, Elevation, Aspect.
4           Slope, Geomorphic Processes, Elevation, Sca.
5           Slope, Geomorphic Processes, Elevation.
6           Slope, Geomorphic Processes.

Seymour

Results for Seymour are presented in Table 7.6.

Table 7.6 Seymour - results for analyses based on the reduced set of topographic and geomorphic attributes.
Scenario    Attributes included                                          Errors (%) by criterion
                                                                         Crit. a    Crit. b    Crit. c    Crit. d
1           Slope, Geom. Proc., Elevation, Aspect, Sca, Expr. Thick.     25.7       6.1        9.5        29.2
2           Slope, Geom. Proc., Elevation, Aspect, Sca.                  25         5.8        9.8        29.3
3           Slope, Geom. Proc., Elevation, Aspect.                       24.6       6.1        9.1        29.7
4           Slope, Geom. Proc., Elevation, Sca.                          26.3       8.1        11.2       28.9
5           Slope, Geom. Proc., Elevation.                               27.9       8.7        12.3       31.5
6           Slope, Geom. Proc.                                           34.9       18.5       19.8       37.8

Table 7.6 shows that prediction based on Criterion a is still characterized by a low accuracy. Predictions for Criteria b - c are also presented in Figure 7.8.

[Figure: prediction errors (%) by scenario number for Criterion b and Criterion c]
Figure 7.8 Seymour - results for analyses based on the reduced set of topographic and geomorphic attributes, Criteria b - c.

For Criterion b, the lowest error is achieved with Scenario 2, and for Criterion c, the lowest error is achieved with Scenario 3. Since Criterion c is considered the most important, and the difference between the two scenarios with respect to Criterion b is only 0.3%, the conclusion is that the best prediction is achieved with Scenario 3 using the attributes: Slope, Geomorphic Processes, Elevation, and Aspect.

The accuracy of prediction with respect to Criterion c (Scenario 3) is 90.9%. As presented in Chapter 7.1.1, prediction based only on Slope was 68% accurate. Scenarios 6, 5, and 3 in Table 7.6 show that a steady increase in accuracy is achieved by including Geomorphic Processes, Elevation, and Aspect, respectively. However, the main approach taken in this study is to add relevant geomorphic attributes to topographic attributes already selected. As presented in Chapter 7.1.1, an accuracy of 82.9% was achieved based on Slope, Elevation, and Aspect. Adding Geomorphic Processes to these attributes (Scenario 3 in Table 7.6) increases the accuracy to 90.9%. With respect to Criterion b, results followed a similar trend.
Accuracy based only on topographic attributes was 81%, and increased to 93.9% when Geomorphic Processes were added. The greater increase with respect to this criterion shows that Geomorphic Processes are better indicators of instability in relation to Class V polygons than to Class IV polygons. Results for Criterion b are presented in Figure 7.9. This map also displays terrain polygons Class V. The error consists mainly of small areas classified as unstable within the stable area (Type I error).

Figure 7.9 Map of Seymour showing prediction based on Slope, Geomorphic Processes, Elevation, and Aspect (Scenario 3), for Criterion b.

Prediction with respect to Criterion c is presented in Figure 7.10 along with terrain polygons Class IV and V. Type I errors are relatively similar to the ones identified in Figure 7.9. Type II errors, however, are higher than in the previous case, because these errors consist mainly of an almost complete miss of a polygon situated in the lower (south) part of the valley, and of another one near the central part (these polygons are marked on the map with numbers 1 and 2). Portions of polygons are also misclassified, mainly at lower elevations, at the 'bottom' of Class IV polygons, as presented in the detail (Figure 7.10b).

Figure 7.10 Map of Seymour showing prediction based on Slope, Geomorphic Processes, Elevation, and Aspect (Scenario 3), for Criterion c; a) The entire area; b) Detail. Arrows point to Type II errors (missed polygons).

Polygons 1 and 2 were both classified (by terrain stability specialists) as stability Class IV. Polygon 1 has the terrain symbol /dzsMv over Rk, which stands for a discontinuous veneer of till consisting of mixed fragments, silt and sand, overlying moderately steep rock.
The classification system presented in Appendix 4 (for the same slope class) assigns both materials (Mv and Rk) a stability class of III. The case of polygon 1 shows that stability classes are assigned in a subjective manner, and different mappers assign different stability classes to similar polygons. The terrain symbol for polygon 2 is aCbv//Rhs, which stands for a colluvial blanket and veneer of blocks, which is much more extensive than hummocky, steep rock. The classification system in Appendix 4 assigns this polygon to Class IV.

The mapping system plays an important role in this classification. For the two polygons described above, if smaller polygons were delineated, it is likely that at least the rock component of both polygons would have a different class assigned.

As shown in Figure 7.10b, relatively large portions of polygons were misclassified, especially in areas situated at lower elevations. A more detailed evaluation was conducted for these polygons by revisiting the aerial photos. The (obviously subjective) conclusion was that the mapping system again played an important role: if smaller polygons were delineated, boundaries could have been drawn further uphill, thus reducing the misclassification (Type II error). The misclassified polygons could be split into an upper portion which is indeed Class IV, and a lower portion which can be classified as Class III. This problem is also related to the difference in size between terrain polygons and pixels. As presented in Table 7.6, the accuracy of prediction in this case (Scenario 3, Criterion c) was 90.9%; however, the above comments suggest that real accuracy could be even higher with a mapping system using smaller terrain units.

For the 5-class system (Criterion d), Table 7.6 shows an improvement in accuracy compared to the case when only topographic attributes are used.
However, prediction is still considered unsatisfactory, and the best results with respect to this criterion are still the ones obtained with all geomorphic attributes (Chapter 7.1.2).

Overall, results presented in this section are fundamental in this thesis, and they address Objectives 1 and 2 (for Seymour). These results prove that using a subset of topographic attributes, and Geomorphic Processes (from the entire set of geomorphic parameters), accurate delineation of unstable (Criterion b), or unstable and potentially unstable terrain (Criterion c) can be achieved. Accuracies of predictions with respect to these criteria were 93.9% and 90.9%, respectively. These results may not look like an improvement compared to predictions based solely on geomorphic attributes (in fact, for Seymour these predictions are less accurate). However, these results suggest that most attributes required for a good terrain stability mapping can be extracted from a DEM, and Geomorphic Processes can be produced based on a more simplified terrain mapping method. This makes implementation of the method much easier. A discussion on the quality of these data and the methods to acquire them is presented in Chapter 8.

Jeune Landing

Results for Jeune Landing are presented in Table 7.7. Results with respect to Criteria b - c are also presented in Figure 7.11. Similarly to predictions based only on topographic attributes, Table 7.7 and Figure 7.11 show a relatively large difference in accuracy related to Criteria b - c. For Criterion c, the series of Scenarios 1-5 shows that for Jeune Landing, Expression Thickness can be eliminated from the analysis, but also that Aspect and Sca had little impact on the results. A good prediction can be obtained based on Slope, Geomorphic Processes, and Elevation (Scenario 5).
However, these results still support the idea that a good prediction with respect to unstable and potentially unstable terrain can be obtained based on Slope, Geomorphic Processes, Elevation, and Aspect (Scenario 3). For these attributes, the accuracy of prediction with respect to Criterion c (Scenario 3) is 82.4%. This represents an increase of 7.6% compared to prediction based only on Slope, Elevation, and Aspect (74.8%, as presented in Chapter 7.1.1).

Table 7.7 Jeune Landing - results for analyses based on the reduced set of topographic and geomorphic attributes.

Scenario    Attributes included                                          Errors (%) by criterion
                                                                         Crit. b    Crit. c    Crit. d
1           Slope, Geom. Proc., Elevation, Aspect, Sca, Expr. Thick.     6.2        17.4       23.1
2           Slope, Geom. Proc., Elevation, Aspect, Sca.                  5.7        17.3       25.2
3           Slope, Geom. Proc., Elevation, Aspect.                       5.2        17.6       25.8
4           Slope, Geom. Proc., Elevation, Sca.                          5          17.4       26.1
5           Slope, Geom. Proc., Elevation.                               5.4        17.6       26.9
6           Slope, Geom. Proc.                                           10.3       24.6       35.4

[Figure: prediction errors (%) by scenario number for Criterion b and Criterion c]
Figure 7.11 Jeune Landing - results for analyses based on the reduced set of topographic and geomorphic attributes, Criteria b - c.

With respect to Criterion b, results followed a similar trend. Accuracy based only on topographic attributes was 85.3%, and increased to 94.8% when Geomorphic Processes were added (Scenario 3). Similarly to Seymour, the greater increase with respect to this criterion shows that Geomorphic Processes are better indicators of instability in relation to Class V polygons than to Class IV polygons. Results for Criterion b are presented in Figure 7.12, which also displays terrain polygons Class V. Although the accuracy of prediction with respect to Criterion b is very similar to the one for Seymour, the nature of the errors is different. For Jeune Landing, errors are mainly of Type II, and consist primarily of portions of unstable polygons, or entire unstable polygons, that are classified as stable.
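The distinction between Type I errors (stable terrain classified as unstable) and Type II errors (unstable terrain classified as stable) can be made explicit with a small sketch. The function name, the 0/1 coding of the two classes, and the choice to express each error as a percentage of all pixels are assumptions made for illustration; the thesis does not state how the two error types were tabulated.

```python
import numpy as np

def error_rates(reference, predicted):
    """Split the total error into Type I and Type II parts.

    reference, predicted: arrays coded 0 (stable) and 1 (unstable).
    Each rate is a percentage of all pixels (an assumption of this sketch).
    """
    reference = np.asarray(reference).ravel()
    predicted = np.asarray(predicted).ravel()
    n = reference.size
    # Type I: reference says stable, prediction says unstable.
    type1 = 100.0 * np.sum((reference == 0) & (predicted == 1)) / n
    # Type II: reference says unstable, prediction says stable.
    type2 = 100.0 * np.sum((reference == 1) & (predicted == 0)) / n
    return {"total": type1 + type2, "type1": type1, "type2": type2}

# Hypothetical 10-pixel example, not data from either study site.
ref = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
pred = np.array([0, 0, 0, 0, 1, 0, 1, 1, 0, 0])
rates = error_rates(ref, pred)
print(rates)
```

Two maps with the same total error can therefore differ in character, which is the situation described for Seymour (mainly Type I) and Jeune Landing (mainly Type II) under Criterion b.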
Results with respect to Criterion c are presented in Figure 7.13, which also displays terrain polygons Class IV and V. The accuracy of this mapping is about 8% lower than the corresponding prediction for Seymour.

Figure 7.13 Map of Jeune Landing showing prediction based on Slope, Geomorphic Processes, Elevation, and Aspect (Scenario 3), for Criterion c.

Figure 7.13 shows that large areas of stable terrain are classified as unstable (Type I error), but also that areas inside unstable and potentially unstable polygons are classified as stable (Type II error). Although the calculated accuracy is 82.4%, visual inspection of Figure 7.13 indicates that relatively large areas are being misclassified (this accuracy may look high since polygons Class IV and V occupy a relatively small percentage of the total area, the majority of it being classified as Class III). For Criterion d, accuracy of prediction is considered low.

Overall, results for Jeune Landing support the results obtained for Seymour, and although Aspect did not prove to be as important as in the first case, this attribute did not hinder the analysis. Therefore, the conclusion is that good classifications with respect to Criteria b - c can be obtained based on Slope, Elevation, Aspect, and Geomorphic Processes (Scenario 3). Results presented in this section address Objectives 1 and 2 for the Jeune Landing study site. These results also support the idea of extracting topographic attributes from a DEM and Geomorphic Processes from a simplified terrain mapping, thus improving the practical applicability of the method.

7.2. RESULTS FOR CROSS-VALIDATION OF THE MODEL.

This section addresses Objective no. 3 of the thesis. Cross-validation consisted of mapping one site (i.e. classifying the test data) using the representative vectors from the other site.
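The cross-validation step just described can be sketched as a nearest-codebook-vector classification: each pixel of one site receives the class of the closest representative vector learned on the other site. This is a minimal sketch, not the software used in the thesis; the function names, the Euclidean distance, the two-attribute vectors, and all numeric values are hypothetical, and it assumes both sites were scaled in the same way.

```python
import numpy as np

def classify_with_codebook(pixels, codebook, codebook_labels):
    """Give each pixel the label of its nearest codebook vector (Euclidean)."""
    pixels = np.asarray(pixels, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Distance matrix of shape (n_pixels, n_codebook_vectors).
    dist = np.linalg.norm(pixels[:, None, :] - codebook[None, :, :], axis=2)
    return np.asarray(codebook_labels)[np.argmin(dist, axis=1)]

def accuracy(reference, predicted):
    """Percentage of pixels whose predicted class matches the reference."""
    return 100.0 * np.mean(np.asarray(reference) == np.asarray(predicted))

# Hypothetical codebook learned on site A (columns: scaled slope, elevation).
codebook_a = np.array([[-1.0, -0.5], [1.0, 0.8]])
labels_a = np.array([0, 1])  # 0 = stable, 1 = unstable

# Hypothetical test pixels from site B with their mapped reference classes.
pixels_b = np.array([[-0.9, -0.2], [0.7, 1.1], [0.2, 0.1], [-1.2, 0.3]])
reference_b = np.array([0, 1, 0, 0])

predicted_b = classify_with_codebook(pixels_b, codebook_a, labels_a)
print(predicted_b, accuracy(reference_b, predicted_b))
```

In this toy case the third pixel lies closer to the "unstable" vector of site A although it is mapped as stable at site B, which illustrates how differences between sites translate into a loss of accuracy.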
Since previous analyses did not produce good results for Criteria a and d, cross-validation was conducted only for Criteria b - c. To ensure compatibility between the two data sets, the analysis for Seymour was redone by neglecting the initiation zones for Geomorphic Processes. Accuracy of prediction dropped for Criteria b and c to 90% and 88%, respectively (the decrease in accuracy was greater for terrain Class V, because initiation zones are more relevant for this class).

Jeune Landing was mapped based on the Seymour classification, resulting in an accuracy of prediction for Criteria b and c of 84% and 74%, respectively. Mapping Seymour based on results from Jeune Landing produced accuracies of 87% and 71%, respectively. The drop in accuracy for both scenarios was an expected result, as mapping one site based on codebook vectors of another site is less accurate than mappings obtained within the same site. The magnitude of the drop is primarily driven by the degree of similarity between the two sites. However, the drop with respect to Criterion c is more severe than for Criterion b. As was the case in the previous analysis, this is related to the quality of mapping in the two study sites in relation to unstable and potentially unstable terrain, and possibly to the mapping system.

7.3. MODEL PREDICTION VS. SINMAP

Objective no. 4 of the thesis is addressed in this section, by contrasting the results of LVQ-based terrain stability mapping with those produced by SINMAP. The mapping produced by SINMAP is presented in Figure 7.14 (the figure also shows terrain polygons Class IV and V). On this map, the brown and red colours (lower and upper threshold) symbolize unstable and potentially unstable terrain. For terrain labeled as 'defended', SINMAP cannot model stability. It is described as 'stable for unknown reasons' (essentially, this terrain needs further investigation, and for practical purposes it is safer to consider it unstable).
Figure 7.14 Seymour - results of SINMAP analysis; a) The entire area; b) Detail.

Although reasonable effort was put into tuning the SINMAP attributes, the system still failed to identify 21 landslides (18 near the valley floor and 3 in the rest of the area). Also, very often, areas classified as unstable and potentially unstable include isolated pixels or clusters of pixels of stable areas. In three cases, the stable pixels include existing landslides, as presented in the map detail (Figure 7.14b). To a certain extent, the SINMAP prediction is similar to the LVQ prediction conducted only with topographic attributes, as presented in Figure 7.2.

For a more quantitative analysis, the SINMAP prediction was also evaluated with respect to unstable and potentially unstable areas, as mapped by terrain specialists. The SINMAP prediction was considered to include the upper and lower threshold and the defended zone, and this was compared to terrain Class IV and V. Accuracy of prediction was 82%, which is about 2% lower than the corresponding LVQ prediction based on topographic attributes, as shown in Table 7.1, Scenario 4 (LVQ predictions based only on Slope and Sca did not yield very good results, either). Compared to the best LVQ prediction (based on Geomorphic Processes, Slope, Elevation, and Aspect - Table 7.6, Criterion c, Scenario 3), the accuracy of the SINMAP prediction is 9% lower. However, given the principles on which it is based, the SINMAP analyses have an element of subjectivity, and predictions may vary with the experience of the user. In contrast with SINMAP, LVQ-based mapping produces relatively large contiguous areas of similar terrain (stable or unstable), as shown for example in Figure 7.10; this is an important feature for practical applications.
These results identify a major advantage of the 'pattern recognition' (ANN) approach used in this model versus physically-based models: the ANN approach was able to 'learn' the patterns of instability existing on the entire watershed, whereas the physically-based approach is restricted by its principles, and was not successful in predicting instability in complex conditions.

7.4. UNSUPERVISED LEARNING

The first part of this sub-chapter presents the results of unsupervised learning. The second part provides interpretation of SOM-based results.

7.4.1. Results for unsupervised learning

As outlined in Figure 6.1 (and justified in Chapter 6.1), unsupervised learning was conducted only for the reduced set of topographic and geomorphic attributes. Essentially, the analyses conducted are the same as for the supervised case, described in Table 7.5. Given the results obtained in supervised mode, unsupervised analyses were conducted only with respect to Criteria b - c. Results are presented in numeric format, for both sites, in Table 7.8. For Seymour, results are also presented in Figure 7.15.

Table 7.8 Results of unsupervised learning analyses.

Scenario    Attributes included                                          Errors (%) by criterion
                                                                         Seymour               Jeune Landing
                                                                         Crit. b    Crit. c    Crit. b    Crit. c
1           Slope, Geom. Proc., Elevation, Aspect, Sca, Expr. Thick.     10.9       14.9       11.4       21.2
2           Slope, Geom. Proc., Elevation, Aspect, Sca.                  10.6       15.1       11.9       20.9
3           Slope, Geom. Proc., Elevation, Aspect.                       11.2       15.3       10.9       20.7
4           Slope, Geom. Proc., Elevation, Sca.                          10.6       14.1       12.2       22.3
5           Slope, Geom. Proc., Elevation.                               10.8       13.2       12.5       22.7
6           Slope, Geom. Proc.                                           17.9       19.4       17.8       29.4

[Figure: prediction errors (%) by scenario number for Criterion b and Criterion c]
Figure 7.15 Seymour - results of unsupervised learning.

Scenarios 1 - 5 in Table 7.8 and Figure 7.15 show that there is little difference in predictions with respect to Criterion b.
For Criterion c, the best prediction is obtained with Slope, Geomorphic Processes, and Elevation (Scenario 5), and this is approximately 2% better than the case which includes Aspect (Scenario 3).

Results for Jeune Landing are presented in Figure 7.16. Predictions for Jeune Landing tend to reinforce the idea that Aspect is a useful attribute, because prediction with this attribute (Scenario 3) is better than the one without it (Scenario 5) for both Criteria b and c.

[Figure: prediction errors (%) by scenario number for Criterion b and Criterion c]
Figure 7.16 Jeune Landing - results of unsupervised learning.

Overall, these results address Objectives no. 5 and no. 6 of the thesis, and prove that good quality mappings can be produced in unsupervised mode, based on the same attributes retained in supervised analyses: Slope, Elevation, Aspect, and Geomorphic Processes. As expected, the accuracy of mappings in unsupervised learning is lower than in the supervised case. With respect to Criteria b - c, accuracies of these predictions are 3 - 6% lower than the corresponding ones for the supervised case. However, in the absence of good examples to be used in supervised analyses, unsupervised learning can be used to predict with reasonable accuracy unstable, or unstable and potentially unstable, terrain.

7.4.2. Interpretation of SOM-based results

This section addresses Objective no. 7 of the thesis. Interpretation of SOM-based results was conducted based on the principles presented in Chapter 3, which consist of analyzing the influence of each attribute on the quality of mapping. Given the previous results and the quality of the primary data for the two study sites, this analysis was performed only for Seymour, with respect to Criterion c. Separation and visualization of individual attributes (i.e. vector elements) is feasible only for those of ratio type.
SOM analysis was performed for all seven topographic attributes: Elevation, Slope, Aspect, Plan curvature, Profile curvature, Interaction of curvatures, and Sca, based on standardized variables (in numeric representation, standardized data produced better results). Terrain mapping based on all seven topographic attributes is presented in Figure 7.17, which also displays terrain polygons Class IV and V. Prediction accuracy for this map is 80.4%.

Figure 7.17 Map of Seymour showing results of SOM analysis; all topographic attributes.

The corresponding SOM (i.e. the 2-D lattice) is displayed in Figure 7.18. This figure shows the essential feature of self-organization: pixels close in high dimensional space (in this case data dimensionality is 7) are also close when projected on a 2-D space. Compact zones are formed on the SOM for pixels mapped as stable and unstable, respectively. For a better understanding, the stable zone is mapped separately in Figure 7.19. As described in Chapter 3, the colour of the neuron (hexagon) reflects the median distance of the neuron to its neighbours, and the sides reflect the distance to the corresponding neighbouring neuron.

Figure 7.18 Seymour - representation of SOM.

Figure 7.19 Seymour - SOM including only stable neurons.

Due to the random initialization of the codebook vectors, different SOMs are obtained at each analysis. These SOMs, however, produce very similar results. The one described in Figures 7.18 and 7.19 was selected because it is easier to interpret, and the connection between different regions of the SOM and corresponding pixels on the geographic map is straightforward. For a better understanding of the self-organizing process, Figures 7.20 and 7.21 were generated.
Figure 7.20 displays the pixels that were mapped in the isolated stable zone, included in the unstable zone (the south-western corner of the SOM, as shown in Figures 7.18 and 7.19). This identifies mainly the large polygons of terrain stability Class II in the south-eastern part of Seymour (Figure 4.2), plus some more scattered pixels in higher elevation areas.

Figure 7.20 Seymour - results of SOM analysis; a) Geographical location of stable neurons mapped on the SW corner of SOM; b) Detail.

Figure 7.21 displays pixels mapped at the extremes of the SOM (the four corners and the center). This figure confirms that neurons further apart on the map correspond to pixels further apart in the 7-D 'terrain stability' space, i.e. pixels corresponding to the five neurons are spread over the entire area and represent different terrain stability conditions.

Figure 7.21 Seymour - results of SOM analysis; a) Geographical location of neurons mapped in the corners and center of SOM; b) Detail.

The map presented in Figure 7.21b is further described in this paragraph, from east to west. In this figure, Class I pixels close to the valley floor were mapped on neurons 15 and 150. Out of the five neurons selected, none includes pixels from the Class V zone adjacent to Class I (label placement in ArcView may create some confusion). Next, Class II pixels are all mapped on neuron 15, and no pixels in the Class III zone that follows were mapped on the five neurons selected. Further up, at higher elevations, Class IV and V pixels were mapped on neurons 1 and 68, and in the last zone, Class III pixels were mapped on neurons 136 and 150.

The influence of each attribute on the mapping can be investigated by inspecting its distribution over the SOM.
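Inspecting the distribution of one attribute over the SOM amounts to taking the same element from every codebook vector and arranging those values on the lattice. The sketch below illustrates this under simplifying assumptions that are not taken from the thesis: a rectangular lattice with row-major neuron numbering (the thesis uses hexagonal neurons), a hypothetical 2 x 3 map with two attributes, and made-up stable/unstable labels for the neurons.

```python
import numpy as np

def component_plane(codebook, rows, cols, attribute_index):
    """Return one attribute of every codebook vector, laid out on the lattice."""
    codebook = np.asarray(codebook, dtype=float)
    if codebook.shape[0] != rows * cols:
        raise ValueError("codebook size does not match the lattice size")
    return codebook[:, attribute_index].reshape(rows, cols)

def mean_by_class(codebook, neuron_labels, attribute_index):
    """Mean value of one attribute over the neurons of each class."""
    codebook = np.asarray(codebook, dtype=float)
    neuron_labels = np.asarray(neuron_labels)
    return {int(c): float(codebook[neuron_labels == c, attribute_index].mean())
            for c in np.unique(neuron_labels)}

# Hypothetical codebook: 6 neurons, columns = (slope, elevation), scaled 0-1.
codebook = np.array([[0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
                     [0.4, 0.6], [0.5, 0.5], [0.6, 0.4]])
neuron_labels = np.array([0, 0, 0, 1, 1, 1])  # 0 = stable, 1 = unstable

slope_plane = component_plane(codebook, rows=2, cols=3, attribute_index=0)
slope_means = mean_by_class(codebook, neuron_labels, attribute_index=0)
print(slope_plane)
print(slope_means)
```

An attribute whose plane separates cleanly between the stable and unstable zones of the map, as the hypothetical slope values do here, is one that contributes to the classification.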
As presented in Chapter 3, this distribution can be presented by taking 'slices' of codebook vectors. Distributions for all seven topographic attributes are presented in Figures 7.22 and 7.23, using values of normalized variables based on the scale on the right hand side of the figure. The influence of the attribute is inferred based on the magnitude of its value for each stability class.

Figure 7.22 shows the distribution of Elevation, Slope, Aspect, and Plan Curvature. Figure 7.22a shows the distribution of Elevation. Based on previous analyses (and very subjectively) Elevation can be ranked second with respect to its contribution to delineation of unstable and potentially unstable terrain. The distribution in Figure 7.22a clearly shows that most stable terrain is located at low elevations (represented by dark-coloured neurons), while both stable and unstable terrain exist at higher elevations (represented by light-coloured neurons).

The distribution of Slope is presented in Figure 7.22b. As presented in previous analyses, Slope is the most important parameter in delineating unstable terrain. This is clearly reflected in the map, where darker neurons (low slope) correspond to stable areas, and the lighter neurons (higher slope) correspond to unstable areas. The distribution of this attribute follows the distribution of stable and unstable neurons much better than Elevation.

The influence of Aspect is reflected in Figure 7.22c. This figure can be somewhat confusing because very high and very low values for Aspect may represent relatively similar compass directions (i.e. from North-West to North-East). The problem is aggravated by the various shades of colours that represent similar values of the attribute. Visual inspection of Figure 7.22c shows that the only trend discernible for Aspect is related to the green colour (which corresponds to the mid-scale colour on a black-and-white picture).
A (subjective) count of green neurons showed that 27 are included in the unstable zone and only 14 in the stable zone. This result indicates that most unstable areas face South-East to South-West (although the influence of this attribute is not as clearly discernible as for Slope and Elevation). It is also visible that flat terrain (represented by black neurons) is stable.

Figure 7.22d displays the distribution of Plan Curvature. The vast majority of pixels are characterized by very similar curvatures, and no clear influence of this attribute can be identified.

Figures 7.23a-b show the distribution of Profile Curvature and Interaction of Curvatures. As with Plan Curvature, these maps show that these parameters do not contribute to a good classification of terrain. The very similar distribution of Plan Curvature and Interaction of Curvatures is driven by the magnitude of the numbers, because Plan Curvature has larger values than Profile Curvature.

Figure 7.23c shows the distribution of Sca (Specific Catchment Area), and shows that Sca does not contribute to classification. Despite the selection of pixels with Sca less than 3,000 m²/m, the vast majority of the remaining Sca values are small. There are two clusters of neurons where pixels with high Sca are grouped, but they reflect opposite states of stability. This is further evidence that Sca was not a useful attribute for this analysis; Sca may be more useful for other studies, possibly when analysis is conducted on smaller or more homogeneous areas.

Overall, although based on a map that is (only) 80.4% accurate, results in this section clarify the impact of topographic attributes on terrain stability mapping. These results are also in agreement with previous findings of this thesis. Although these results do not have the same role as rules have in Expert Systems, they have great explanatory power in describing the influence of topographic attributes on stability.

7.5. SUMMARY OF RESULTS

The results of the most important analyses performed in this thesis are summarized in Table 7.9 (in this table, numbers are rounded off to the nearest integer). This table includes both the results presented in this chapter and those of Multiple Discriminant Analysis (MDA) presented in Chapter 6. Table 7.9 shows results in terms of accuracy, rather than errors.

Table 7.9 Summary of results: prediction accuracy (%) for various types of analyses

Type of analysis and attributes included    Seymour                   Jeune Landing
                                            Criterion b  Criterion c  Criterion b  Criterion c
MDA (all topographic attrib.)               -            64           -            38
LVQ (Slope)                                 62           68           81           65
LVQ (Slope, Elev., Aspect)                  81           83           85           75
LVQ (Slope, Elev., Aspect, Geom. Proc.)     94           91           95           82
SOM (Slope, Elev., Aspect, Geom. Proc.)     89           85           89           79

Table 7.9 shows that ANN methods performed better than MDA. This reflects the quality (greater detail) of the discriminant functions of LVQ and SOM, which generate a better classification. Analyses showed that Slope has a major and clearly discernible influence on terrain stability mapping. Therefore, the influence of this attribute is analyzed individually. The influence of Elevation and Aspect for the two study sites and in different scenarios is not always very clear, and for this reason, the combined influence of these attributes is presented in Table 7.9.

In this study, LVQ prediction based on Slope alone produced a better result than MDA prediction including all topographic attributes (more sophisticated MDA analyses could be conducted, but this falls outside the scope of this thesis). In LVQ analysis, Elevation and Aspect further improved the quality of analyses: their contribution ranges from 19 percentage points (Seymour - Criterion b) to 4 percentage points (Jeune Landing - Criterion b). The vast majority of studies reviewed in Chapter 2 identified Slope and Elevation as important attributes in terrain stability mapping. Aspect is not as frequently included in stability analyses as Slope and Elevation.
However, this study identified it as an important attribute; these results are in agreement with those of O'Loughlin (1972), Carrara et al. (1991), and Jakob (2000), as presented in Chapter 2. The accuracy of predictions is further improved by the introduction of Geomorphic Processes; the contribution of this attribute ranges from 13 percentage points (Seymour - Criterion b) to 7 percentage points (Jeune Landing - Criterion c). For both study sites, introduction of Geomorphic Processes produced a greater increase in accuracy with respect to Criterion b than for Criterion c, because this attribute is more closely related to terrain stability Class V than to Class IV. As expected, predictions with SOM are less accurate than with LVQ; the difference in accuracy ranges from 6 percentage points (Seymour - Criterion c, and Jeune Landing - Criterion b) to 3 percentage points (Jeune Landing - Criterion c).

CHAPTER 8. GENERAL DISCUSSION

This chapter starts with comments on the results obtained and how they could be implemented. Next, a general discussion on this study and on ANN-based terrain stability mapping is included. The last part of the chapter presents insights and possible secondary models that can be derived from this study.

8.1. DISCUSSION OF RESULTS AND IMPLEMENTATION OF THE MODEL

A preliminary result of this thesis was the evaluation of the most common ANN and the selection of Self-Organizing Maps. The results obtained with this method are very promising. The method uses data currently available, lays the groundwork for mapping large areas in a short time, and has practical applicability. The main findings of this study are summarized below, in relation to the objectives addressed:

Objectives 1 and 2:
• Good delineation of unstable and potentially unstable terrain was obtained in supervised analysis, based on a subset of topographic attributes: Slope, Elevation, and Aspect.
The accuracy of this delineation for the two study sites was 83% and 75%, respectively, which is much better than that obtained with Multiple Discriminant Analysis (64% and 38%, respectively).
• A further improvement in terrain mapping accuracy was obtained by including in the analyses areas already affected by Geomorphic Processes. Accuracy of this prediction for the two study sites was 91% and 82%, respectively. A discussion was already presented arguing that even higher accuracies may be obtained with an improved mapping system.
• Mappings based on strictly unstable terrain followed the same trend as for unstable and potentially unstable terrain, and achieved higher accuracies, of 94% and 95%, respectively.
• For the case when mapping is conducted based only on geomorphic attributes, the model showed a good capability for assigning stability classes to terrain polygons described according to the Terrain Classification System. Based on all geomorphic attributes, it did so with accuracies of 90% for Seymour (Table 7.3, Scenario 1) and 76% for Jeune Landing (Table 7.4, Scenario 1). However, this result has limited applicability.
• Analyses based on existing landslides yielded modest results. This points to the need for a temporal analysis of terrain stability, and more details are provided later in this chapter.

Objective 3:
• Cross-validation showed that the model has good potential. Accuracies obtained in this study were not very high, mainly because of physiographic differences between the two study sites, and also due to differences in data quality. Further investigation of the model could be conducted on a site adjacent to Seymour that is mapped to the same standard.

Objective 4:
• The model developed in this study was compared to a physically-based model and produced results that were 9% more accurate.
Also, the model produced contiguous stable / unstable areas, which increases its practical applicability.

Objective 5:
• Unsupervised analyses proved to be a good tool for terrain stability mapping that can be used when no examples are available. For the two study sites, unsupervised learning delineated unstable and potentially unstable terrain with accuracies of 85% and 79%, respectively, and (strictly) unstable terrain with an accuracy of 89% for each of the study sites.

Objective 6:
• The attributes identified as the most important in unsupervised analyses are the same as in the supervised case, namely: Slope, Elevation, Aspect, and Geomorphic Processes.

Objective 7:
• The unsupervised analysis proved to be a good tool for explaining the contribution each topographic attribute makes to the classification. This representation demonstrated the functionality of SOM and its strong capability to reduce the dimensionality of the data. Individual representation of topographic attributes confirmed the results found in Objectives 2 and 6, and increased the explanatory power of this method. SOM-based results identified Slope as the most important attribute in terrain stability mapping, and confirmed the importance of Elevation and Aspect. The contribution of curvatures and Specific Catchment Area proved to be negligible.

The fundamental objective of this study was to produce terrain stability mapping based on attributes consistent in their level of accuracy. This was accomplished, as Slope and Aspect are directly derived from Elevation, and the level of consistency between topographic attributes and Geomorphic Processes is similar. Unlike many other kinds of quantitative data, Elevation is easy and inexpensive to measure. Efforts are continuously made to make these data more reliable and affordable.
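Because Slope and Aspect are derived directly from Elevation, the derivation can be illustrated with a short sketch. This is not the GIS procedure used in this thesis; it is a minimal finite-difference illustration in Python, assuming a north-up raster with square cells (the default cell size matches the 20-m grid used in this study), rows that increase southward, and a value of -1 for the aspect of flat cells. The function name slope_aspect and these conventions are assumptions made for the example.

```python
import numpy as np

def slope_aspect(dem, cell_size=20.0):
    """Return (slope, aspect) in degrees for a north-up elevation raster.

    Assumptions for this sketch: square cells, row index increases southward,
    aspect is the compass bearing of the downslope direction, flat cells get -1.
    """
    dem = np.asarray(dem, dtype=float)
    # np.gradient returns the derivative along rows first, then along columns.
    dz_drow, dz_dcol = np.gradient(dem, cell_size)
    dz_dnorth = -dz_drow  # rows increase southward
    dz_deast = dz_dcol
    slope = np.degrees(np.arctan(np.hypot(dz_deast, dz_dnorth)))
    # Downslope direction is the negative gradient; bearing is clockwise from north.
    aspect = np.mod(np.degrees(np.arctan2(-dz_deast, -dz_dnorth)), 360.0)
    aspect = np.where(slope == 0.0, -1.0, aspect)
    return slope, aspect

# Usage: a synthetic plane that rises toward the east at 30 degrees.
cols = np.arange(5, dtype=float)
east_rising = np.tile(cols * 20.0 * np.tan(np.radians(30.0)), (5, 1))
slope_deg, aspect_deg = slope_aspect(east_rising)
```

A plane that rises toward the east faces west, so its aspect is 270 degrees under the convention assumed here. Note that Aspect is a circular variable, which is why very high and very low values can describe similar compass directions.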
Digital photogrammetry and digital orthophoto mapping provide data on terrain elevation and land cover directly in digital form, and terrain representation using laser altimeter scanners (Abedini et al. 1997) is becoming more common.

Geomorphic Processes coded with the 1-of-n method further improved the prediction based on topographic attributes. This is in agreement with the results of Rollerson (1992) and Pack (1995), who found slope and geomorphic processes to be the most important indicators of terrain instability. This was an expected result, as many other examples can be found in the literature in which Self-Organizing Maps correctly classify entities described by nominal attributes coded using the 1-of-n system. For example, Kohonen (2001) presents a taxonomic classification of animals and birds. The 1-of-n coding method was used for coding their morphological and physiological characteristics, and the SOM was very successful in grouping similar animals and birds in various parts of the map.

The fact that, of all geomorphic attributes, Geomorphic Processes was the only one important in this study clearly suggests that a different, much simplified terrain mapping method is sufficient for extracting this attribute. This method requires only identification of areas affected by existing geomorphic processes. Delineation of such polygons, with areas much smaller than those identified based on the Terrain Classification System, makes it feasible to record only one Geomorphic Process per polygon (the most relevant to terrain stability). This eliminates one of the major issues (driven by the Terrain Classification System) that had to be addressed in this study, where geomorphic processes were recorded based on their areal extent. Existing technology, like digital photogrammetry, makes delineation of geomorphic processes much easier, and requires less time and a less skilled person than the entire terrain mapping procedure.
Besides, Geomorphic Processes represent the most accurately delineated of all the geomorphic attributes considered in this study. They can be identified on air photos, and field inspections can confirm their areal spread. A description of the steps necessary to apply the method developed in this study is presented in Appendix 8.

Both supervised and unsupervised training proved valuable in this study. In the supervised case, existing terrain stability maps can be used to train the model. Next, the model could replicate the mapping system in a consistent manner on new, similar areas. Unsupervised learning can be used when no similar maps are available, and this technique can cluster areas which are similar with respect to stability. Calibration of clusters can be done either manually or by mapping a subset of the data and using it for automatic calibration. A good working knowledge of the area would greatly reduce the time necessary for this step and improve its quality. Unsupervised learning is a more complex process, which requires more time and experience. It proved important in the theoretical aspects of terrain stability mapping investigation, but it may have reduced practical applicability.

This study started from specific conditions in British Columbia, and the final product can be applied in similar physiographic regions. Moreover, analyses can be repeated almost anywhere, based on commonly available data. In different regions, the importance of terrain attributes may change. However, there is good reason to believe that the principles identified in this thesis will still be valid. Standardization in data collection and representation makes this model easier to apply. Also, there are data relevant to terrain stability mapping which are collected in conjunction with other forest management activities (e.g. timber cruising, road and block layout, etc.) and are not used.
Experience has shown that getting the necessary data is only a matter of organizing information that already exists. For example, when timber cruising or locating roads, crew members identify wet spots, springs, and seep areas. However, this information is not transferred to the terrain stability specialists. These areas can be digitized, included in maps, and considered in the stability analysis. This may result in very small polygons, which is discouraged in current practice because each polygon is analyzed 'manually'. However, this is feasible with my approach, because the analysis is performed by a computer model.

8.2. GENERAL DISCUSSION ON ANN-BASED TERRAIN STABILITY MAPPING

Michalewicz and Fogel (2000) state that the most common mistake when dealing with a model is to forget the assumptions. Assumptions used in this thesis were previously declared and were not violated during analyses. However, the most important assumption incorporated in this model, not explicitly declared so far, is the so-called 'Tobler's First Law of Geography' (Tobler 1970). This law is widely accepted in Geography, because it is simple and has very rich content. Waldo Tobler, a famous geographer in the United States, wrote 'everything is related to everything else, but closer things are more closely related'. For the problem at hand, Tobler's law simply implies that terrain units with similar stability attributes cluster together. By virtue of this law, all factors considered important in terrain stability mapping are incorporated in the analysis: soil strength parameters are implicitly considered, but they are assumed to change gradually; changes in root cohesion are not abrupt, but occur gradually over a certain distance; the importance of macropores resulting from root decay and animal activity is not neglected, but the change in their distribution is not abrupt.
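The clustering implied by Tobler's law can be checked empirically on any raster attribute. The sketch below is an illustration only, not an analysis performed in this thesis: it computes an empirical semivariance (half the mean squared difference between cells a given number of columns apart) on a synthetic, smoothly varying surface that stands in for a terrain attribute. The function name, the synthetic surface, and the restriction to the east-west direction are all assumptions made for brevity.

```python
import numpy as np

def semivariance_east_west(grid, lag):
    """Empirical semivariance between cells `lag` columns apart (lag >= 1)."""
    grid = np.asarray(grid, dtype=float)
    if lag < 1 or lag >= grid.shape[1]:
        raise ValueError("lag must be between 1 and the number of columns - 1")
    diff = grid[:, lag:] - grid[:, :-lag]
    return 0.5 * np.mean(diff ** 2)

# Synthetic stand-in for a smoothly varying terrain attribute (hypothetical data).
rows, cols = np.mgrid[0:60, 0:60]
surface = np.sin(cols / 10.0) + np.cos(rows / 15.0)

near = semivariance_east_west(surface, lag=1)  # adjacent cells
far = semivariance_east_west(surface, lag=5)   # cells five columns apart
```

If closer cells are more closely related, the semivariance at short lags is smaller than at longer lags, which is the behaviour this synthetic surface exhibits.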
The assumptions implied by Tobler's law seem to be much easier to accept than assumptions used by other models, e.g.: uniform soil depth, groundwater flow following topographic gradients, and flow of groundwater taking place in channels that do not communicate with each other. Practical experience supports this assumption, because the majority of big landslides are also accompanied by smaller ones, which occur within relatively short distances.

Tobler's law is similar to the problem of spatial autocorrelation in Statistics. In some analyses, to meet the assumptions of the methods used, spatial and temporal autocorrelation of values is considered a nuisance and various methods are used to eliminate it. In other cases, autocorrelation - discovering how variables are connected in space and in time - is in fact the main objective of the analysis (Legendre 1993). This study falls in the latter category.

Further investigations were conducted in this study with respect to the coding methods. For 1-of-n coding, apart from the binary notation used, the bipolar notation was also explored. The bipolar notation is similar to the binary, but the value '0' is replaced with '-1'. Results obtained with bipolar notation followed the same trend, but were in general less accurate than with binary notation. The m-of-n (or thermometer) coding was also explored; with this method, each attribute is represented by a vector which has values of '1' for all classes up to the class that includes the actual value of the parameter, and values of '0' for the remaining classes. This method is intended for ordinal- and ratio-type data, as classes should be arranged in increasing order of magnitude. Although this study also included nominal-type data, the m-of-n coding was tried by (subjectively) ranking nominal attributes based on their impact on terrain stability. Results with this method were less accurate, and sometimes erratic. Thermometer coding is strongly affected by the subjectivity in ranking the nominal data, and also by the number of classes created for an attribute.

8.3. INSIGHTS AND POSSIBLE SECONDARY MODELS DERIVED FROM TERRAIN STABILITY ANALYSIS WITH SELF-ORGANIZING MAPS

This section addresses Objective no. 8 of the thesis. The model developed in this study is characterized only by spatial resolution. No temporal resolution is incorporated, i.e. there is no indication of the time when the terrain becomes unstable. This shortcoming became especially apparent in relation to the ability of the model to predict instability based on existing landslides. The model cannot mimic the conditions that triggered a landslide at a certain point in time and thereby identify terrain instability based on the same principles.

Factors that are relevant in temporal analysis include climatic conditions (e.g. variation of precipitation) and changes produced by forest development (e.g.: vegetation age since harvesting occurred, which determines root cohesion; distance to roads and distance to culverts, based on the date when roads were constructed, which reflect changes in the drainage pattern and removal of toe support). Terrain stability in a certain area may also be affected by general processes existing at a certain time, which cannot be quantified. Selby (1985) states that many natural processes are episodic in operation and land surfaces may evolve in a series of leaps, with periods of stability followed by brief periods of severe erosion. However, there are good reasons to believe that all the above factors are 'built-in' during temporal analysis and will be accounted for by virtue of Tobler's law of Geography.

Creation of a dataset suitable for temporal analysis became feasible only recently. The Forest Practices Code of BC requires that all landslides be well documented, including a detailed description of the events whenever they occur.
Based on descriptions, and on maps and sketches attached to these reports, it is possible to transfer them to a GIS and conduct a temporal analysis on terrain stability. There is good reason to believe that in a relatively short time, good datasets can be assembled.

8.3.1. Temporal terrain stability analysis

The critical piece of information needed to apply temporal learning is the time when landslides occur. In remote areas, identification of recent landslides relied on air photos; however, these were sometimes updated only at long intervals, which made the entire process unreliable. More recently, techniques like video mapping make landslide inventory much easier and more reliable. Pfister (1999) and Davis et al. (2002) present video mapping and the opportunities for updating geographic data based on video techniques and Global Positioning Systems (GPS). Using oblique video monitoring, it is possible to capture seasonal and event influences by gathering data whenever new slide events are suspected to have occurred. The technique does not require perfect weather conditions, and could take place at nearly any time of year. The costs can be low, as video data can be gathered opportunistically (for example, when a helicopter is ferrying equipment or personnel through the area). After processing, data can be integrated in a GIS.

For temporal analysis, new data could be added to improve the accuracy of prediction:
• Forest Cover. New attributes can be extracted from forest cover data (e.g. Vegetation Resources Inventory Guidelines - Province of BC 2001): species, including trees, shrubs and weeds - based on the assumption that species have specific root cohesions; Site Index - this shows how well trees can grow on that site; crown closure - may account for interception of rainfall in the crown of trees, evaporation etc.; stand age - it is expected that root cohesion varies with age.
Also, age, species, and crown closure can be a good replacement for root cohesion and the overburden imposed by the stand.
• Logging method. For recently harvested blocks (probably less than 20 years of age), this might be a good indication of soil compaction, because macropores are closed by heavy machines and the drainage patterns are badly disturbed.
• Distance to roads at the time of the event.
• Distance to nearest logging block.
• Distance to culverts.
• Distance to streams.

Distances can be calculated uphill or downhill, based on the steepest path or rain-drop path. For logging blocks, distances can be calculated to the centroid or to the closest boundary point. Also, the location of roads, blocks or culverts could be assessed in relation to existing landslides. In temporal analysis, there may be a need to differentiate between events that initiate on open slopes and in gullies, and between landslides that initiate in clearcuts and in forested terrain.

8.3.2. Implementation of terrain stability mapping with Self-Organizing Maps in a forest planning model

If temporal analysis is incorporated, the method for terrain stability assessment developed in this study can also be applied in a forest planning model. Assuming that forest planning is performed starting at a certain time, the ANN should be trained to learn the events that occurred before that time, following these steps:
1) Start at some point in time, before the first landslide.
2) Describe the site, using classes of attributes like: topography (elevation, slope, aspect); terrain affected by geomorphic processes; forest cover (species, site index, crown closure, age); logging method for existing blocks; ecosystem classification; location of roads, culverts and streams.
3) At the time when the first landslide occurred, let the system learn it. If possible, differentiate between natural and development-related landslides. In the second case, use distance to roads, culverts and blocks, otherwise not.
4) Keep aging the forest until the next landslide occurs. If forest development occurred, update the location of blocks, roads and culverts.
5) Continue the process until the current time.

Essentially, forest planning consists of scheduling harvest blocks over a planning horizon. Usually, the purpose of the planning activity is to ensure that a desired volume of wood can be harvested from the area, while at the same time respecting rules related to land management, e.g.: leave riparian buffers, leave sensitive areas, maintain a certain structure of the forest (distribution by age class), visual quality, adjacency (a block cannot be cut right beside an existing block until the latter has attained a certain age/height) etc. To access the blocks, the corresponding roads must also be constructed (with an appropriate time lag). As the planning progresses through time, the factors relevant to terrain stability (i.e. new ages of stands, distances to blocks and roads etc.) are updated. Supervised learning (LVQ) is the technique to be used. The LVQ net trained in the first phase reads the new attributes and redefines unstable areas as planning progresses through time. The final decision on whether or not the situation is acceptable rests entirely with the planner.

The model could be linked with the forest planning software ATLAS (A Tactical Landscape Analysis Software; Nelson 1999). The model could provide ATLAS with information on landslide initiation for every spatial entity (i.e. polygon or pixel). Since different cutblock sizes and shapes are laid out for a region, an average hazard index could be produced for each cutblock. ATLAS can incorporate output from the new model, which is subsequently used to modify the harvest schedule and display the landslide hazard associated with various management scenarios.

8.3.3. Utilization of Self-Organizing Maps to predict debris travel distance

It is widely recognized that, along with landslide initiation, there is a need for assessment of the travel distance of a debris flow, and a corresponding evaluation of risk to downslope resources; both are important components of risk assessment. The movement of a mass consisting of soil, rock, organics, and water is a complex phenomenon governed by the properties of the materials and the path of movement. An overview of modeling techniques for determining travel distance is given in Fannin and Wise (2001). Analytical techniques for determining travel distance may be categorized either as empirical or as using a dynamic approach. Simplifying assumptions are often made where input parameters cannot be measured easily.

Based on the method developed in this study, the same principles used for initiation can also be used to estimate travel distance using supervised learning (LVQ). The following steps should be taken in this process:
1) Create a DEM (raster) for the area.
2) For each unstable cell in the DEM (previously determined based on the initiation model), identify the rain-drop path. This is essentially the steepest path downhill from the cell to the valley floor. The main assumption is that a debris flow will follow this path. One argument against this assumption (i.e. one factor that needs further investigation) is related to the size of an obstacle that may change the path of the rain-drop, but not that of a debris flow.
3) Break the (rain-drop) path into reaches. Wise (1997) defines a reach as a linear portion of the event path which has consistent slope morphology, slope angle, azimuth, width, and volumetric behaviour characteristics. Other criteria, like changes in vegetation type and age, can also be included.
4) Run the analysis, i.e.
train the net with existing data, and predict the travel distance for other unstable locations.

The essence of this method is that for an existing slide, the theoretical travel path (rain-drop) extends to the valley floor. However, for various reasons, the actual slide may come to rest somewhere on the hillside. The LVQ model will be able to learn this. In an ANN-based analysis, estimates of initial volume, or of volume entrained or deposited along the path, need not be explicitly input, because they are 'built into' the principles of supervised learning. Along with initiation, travel distance can be incorporated in forest planning, to develop an integrated risk analysis tool. Overall, this implementation can address the problem of magnitude-frequency of mass movements (Innes 1985), as the impact of landslides can be predicted both in relation to hydrological events and forest development.

8.3.4. Other comments on geomorphic attributes

Based on the results in this study, the concept of simplicity suggested by Ockham's razor looks very appealing. Ockham's razor suggests choosing the simplest hypothesis that matches the observed examples. In this study, prediction based solely on ANN proved to be highly accurate. However, there is reason to believe that some attributes were considered unimportant simply because they were inadequately represented, and this may be the case for surficial materials. For future studies, combination with other techniques may improve representation of this attribute and yield valuable results. A good candidate for this seems to be Fuzzy Set Theory. If spatial entities in vector format (polygons) are included for materials delineation, a fuzzy representation may be more appropriate. Points on the border may represent different materials, and equally belong to both types, while further away from the border, the degree of membership will change, i.e. increase for one material type and decrease for the other.
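The fuzzy representation of a material boundary described above can be sketched as follows. This is a hypothetical illustration, not part of the model developed in this thesis: it assumes a linear membership ramp across a transition zone of assumed width centred on the mapped polygon border, and the function and parameter names are invented for the example.

```python
import numpy as np

def material_memberships(signed_distance_m, transition_width_m=40.0):
    """Fuzzy memberships in materials A and B near a polygon border.

    signed_distance_m: distance from the border, positive inside material A
    and negative inside material B (assumed convention).
    transition_width_m: assumed total width of the zone of mixed membership.
    """
    if transition_width_m <= 0:
        raise ValueError("transition_width_m must be positive")
    d = np.asarray(signed_distance_m, dtype=float)
    # Linear ramp: 0.5 on the border, 1.0 well inside A, 0.0 well inside B.
    mu_a = np.clip(0.5 + d / transition_width_m, 0.0, 1.0)
    mu_b = 1.0 - mu_a
    return mu_a, mu_b

# Usage: points 20 m inside B, on the border, and 10 m and 20 m inside A.
mu_a, mu_b = material_memberships([-20.0, 0.0, 10.0, 20.0])
```

A point on the border belongs equally to both materials (membership 0.5 each), and membership in one material rises as membership in the other falls. Other ramp shapes (e.g. sigmoidal) would serve equally well; the linear form is chosen only for simplicity.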
Some notes should be made about the geomorphic processes included in this study. The model can accommodate all processes that may produce landslides; however, in the two study sites analyzed, piping and surface seepage were not encountered. These processes seem to have a smaller influence on stability than the ones included. However, mathematically, their impact is the same, and this should be investigated further. Also, the number of classes used in 1-of-n coding of topographic attributes should be investigated more thoroughly.

Many terrain stability studies use geological description as an input, especially the dip of the bedrock. However, geology was not included in this research, because geological description at a scale compatible with the other data is rarely available. Furthermore, the accuracy of geological descriptions is not high (K. Shimamura, Geological Survey of Canada, personal communication, February 2000), and if the description of surficial deposits is considered inaccurate, there is no reason to use geological materials, which are probably described with even lower accuracy.

CHAPTER 9. CONCLUSIONS

This thesis investigated the application of Self-Organizing Maps for terrain stability mapping. An approach for delineation of unstable terrain was developed and tested on two study sites, relatively non-uniform with respect to the quality of the data available. In supervised mode, the model was able to delineate unstable and potentially unstable terrain with accuracies of 91% and 82%, and (strictly) unstable terrain with accuracies of 94% and 95%, respectively. In unsupervised mode, the model also performed well, and delineated unstable and potentially unstable terrain with accuracies of 85% and 79%, respectively, and (strictly) unstable terrain with an accuracy of 89% for each of the study sites.
All predictions were based on attributes consistent in their level of accuracy, namely, Slope, Elevation, Aspect, and Geomorphic Processes. Based on these results, arguments were presented that good quality terrain stability mapping could be performed with a set of topographic attributes that can be easily extracted from a DEM, plus geomorphic processes that could be obtained through a much simplified terrain mapping procedure. The quality of the results obtained in this study is due to a careful selection of the ANN used, an informed selection of topographic and geomorphic attributes input, and a proficient pre-processing of variables. To my knowledge, this is one of the first two studies that used ANN for terrain stability mapping, and so far, is the one that produced the best results.

The results of this thesis have the potential to influence the professional practice in terrain stability mapping. The method developed proved to be successful and efficient in delineation of unstable and potentially unstable terrain, and thus can assist in limiting expensive ground checks to the most vulnerable areas. As presented in Chapter 8, the method can also be used in Forest Planning. Oftentimes, in real life applications, the chief criterion used to delineate stable and unstable terrain is strictly related to slope. Selection of this criterion is mainly driven by liability of terrain stability specialists, and leads to conservative approaches and unnecessary deferment of development in some areas. The approach introduced in this study takes a very detailed look at other attributes important in terrain stability assessment and incorporates them in the analysis. The new model is easy to implement with data already available. Improvements proposed in the previous chapter could make this mapping method even more useful.
Of particular importance is the potential to incorporate temporal analysis, since none of the existing methods has the capability to do it in a relatively easy manner.

The last paragraphs of this chapter include comments on the quality of the data used, comments on ANN and AI in general, and final (philosophical) remarks on this thesis.

Terrain representation through DEM and data manipulation in GIS were very important components of this study. This model created the framework for data analysis at any resolution; a 20-m grid was used in this study, but spatial resolution can be changed relatively easily in GIS. However, scale consistency for the attributes included in analyses and the (thematic, positional, and temporal) accuracy of representations need to be assessed in GIS-based models. Davis (1999) addresses the problem of data uncertainty at length and states that any map at a scale smaller than 1:1 leaves a gap between representation and reality. This gap (and in fact any abstraction that involves sampling and filtering) is in essence data uncertainty. Data uncertainty is not explicitly recognized in GIS (or, more exactly, by those who conduct GIS-based modeling), and oftentimes it is virtually ignored. A GIS enables spatial data to be viewed and manipulated at virtually any scale and is, in fact, a scaleless working environment. Besides, a GIS stores data at a resolution capable of locating a point to a very high precision, and it also reports all information (coordinates, areas, etc.) with the same excessive precision. Davis (1999) stresses that these two characteristics - lack of scale and the reporting of extreme precision - virtually eliminated the implicit recognition of uncertainty found in manual cartography (the dichotomy between imprecise reality and its precise digital representation makes many users reluctant to accept the fact that an answer of lesser precision can be more 'correct').
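As an illustration of how easily spatial resolution can be changed, the following Python sketch coarsens a raster by averaging non-overlapping blocks of cells, for example from a 20-m to a 40-m grid. The function name `coarsen` and the elevation values are hypothetical; a GIS would normally perform this resampling with its own tools, and block averaging is only one of several possible methods.

```python
def coarsen(grid, factor):
    """Aggregate a raster by averaging non-overlapping factor x factor blocks.

    `grid` is a list of equal-length rows. Rows and columns at the bottom
    and right edges that do not fill a complete block are dropped.
    """
    rows = len(grid) // factor
    cols = len(grid[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect every cell that falls inside the current block.
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 elevation raster (metres) on a 20-m grid.
dem_20m = [
    [100, 102, 110, 112],
    [104, 106, 114, 116],
    [120, 122, 130, 132],
    [124, 126, 134, 136],
]

# A factor of 2 turns the 20-m grid into a 40-m grid.
dem_40m = coarsen(dem_20m, 2)
print(dem_40m)
```

Resampling changes only the representation: the coarser grid carries no more information than its source, and a finer grid obtained by interpolation would not be more accurate than the original data.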
The main method to address this problem is the introduction of standards for digital data capture. This field of inquiry is still in its infancy, due in part to the complexity of the problem. However, recognition of such problems, coupled with modern methods for terrain representation, makes GIS an extremely powerful tool for terrain stability mapping.

In relation to ANN, most books on this topic also contain comments on, and comparisons between, these models and their biological counterparts. Kohonen (2001) states that the early models, although still primitive, were definitely meant to be descriptions of the brain, whereas most of the present ANN seem to have been elaborated for new generations of information-processing devices. As for the question of how the mind works, many more open questions have been raised than definite answers given (Wurtz 2000). Another subject of much debate related to ANN and AI in general is whether or not these systems can achieve intelligence. A discussion on this topic can be found in Russell and Norvig (1995); for obvious reasons, when evaluated with respect to criteria such as intentionality and consciousness, ANN do not rate very high. However, the limited similarity to biological systems or 'real intelligence' does not reduce the applicability of ANN to many real-life problems, but rather sets out more clearly the difference between biological systems and artificial ones.

Aristotle stated in his Nicomachean Ethics that each science (field of study) has a built-in degree of precision: for some sciences, high precision is specific; for others, only lower degrees of precision are feasible. Geotechnical engineering is known as an 'imprecise' area of engineering because it deals with a material produced by nature (Toll 1996; Chan 1999). In many circumstances, our fundamental understanding of soil and rock behaviour still falls short of being able to predict how the ground will behave.
Furthermore, there is also a major difference between terrain stability mapping carried out for urban development (i.e., housing, industrial) and mapping done for forest development. In the first case, the instability of a very small area may have catastrophic consequences; in the second case, if only a small portion is unstable, the consequences are usually minor. In this light, one can draw the conclusion that Self-Organizing Maps proved to be a good method for terrain stability mapping for forest development purposes.

Alfred Whitehead, co-author with Bertrand Russell of Principia Mathematica, wrote (cf. Russell and Norvig 1995) that "Civilization advances by extending the number of important operations that we can do without thinking about them". As with many other issues, the opposite view was also expressed. Weizenbaum (1976) stated that "Strengthening a particular technique (i.e. introducing automation), putting muscles on it, contributes nothing to its validity. The poverty of the technique, if it is indeed unable to deal with its presumed subject-matter, is hidden behind a mountain of effort... the harder the subproblems were to solve, and the more technical success was gained in solving them, the more is the original technique fortified". This study does not claim in any way to produce an advancement of civilization; besides, it is not clear how important terrain stability mapping is from this perspective. However, there is good reason to believe that the approach introduced responds to both requirements expressed in the previous statements: this model makes terrain stability mapping more amenable to automation, and also produces good-quality predictions. In a general case, after the preliminary operations related to data preparation and preprocessing, the analyst can do the rest of the work without being aware of how the system works, and without thinking too much about it.
The result is, however, a good-quality terrain stability map. Obviously, further studies of this method will clarify its applicability, and will reveal the merits and the shortcomings of the approach developed in this study.

REFERENCES

Abedini, M.J., Dickinson, W.T., Rudra, R.P. 1997. Integration of GIS tools and laser-scanned DEM with implications for rainfall-runoff modeling. Paper no. 973029, presented at the ASAE Annual International Meeting, August 10-14, Minneapolis, Minnesota. 15 pp.
Abolmasov, B., Obradovic, I. 1997. Evaluation of geological parameters for landslide hazard mapping by fuzzy logic. In Marinos, P.G., Koukis, G.C., Tsiambaos, G.C., Stournaras, G.C. (eds.) Proc. of Int. Symp. on Eng. Geol. and the Environment, Athens, Greece, pp. 471-476.
Ackley, D.H., Hinton, G.E., Sejnowski, T.J. 1985. A learning algorithm for Boltzmann machines. Cognitive Science, 9. pp. 147-169. Reprinted in Anderson and Rosenfeld, 1988.
Adeli, H. 2001. Neural networks in Civil Engineering: 1989-2000. Computer-Aided Civil and Infrastructure Engineering, no. 16. pp. 126-142.
Agriculture Canada, 1987. Canadian system of soil classification. Ottawa, Ont.
Al-Tuhami, A.A., 2000. Neural Networks: a solution for the factor of safety problem in slopes. In: Landslides in research, theory and practice. Proceedings of the 8th International Symposium on Landslides, Cardiff, pp. 45-50.
Aleotti, P., Chowdhury, R. 1999. Landslide hazard assessment: summary review and new perspectives. Bull. Eng. Geol. Env., no. 58. pp. 21-44.
Anderson, J.A., Rosenfeld, E. (eds.) 1988. Neurocomputing: Foundations of Research. MIT Press, Cambridge, MA.
Anderson, M.G., Richards, K.S. 1987. Modeling slope stability: the complementary nature of geotechnical and geomorphological approaches. In Anderson, M.G., Richards, K.S. (eds.)
Slope stability: Geotechnical engineering and Geomorphology. John Wiley & Sons, Chichester, pp. 1-9.
Anderson, S.A., Sitar, N. 1995. Analysis of rainfall-induced debris flows. J. of Geotech. Eng., Vol. 121, No. 7. pp. 544-552.
Andrews, R., Diederich, J., Tickle, A.B. 1995. Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-Based Systems, Vol. 8, No. 6. pp. 373-389.
Anon. 1999. Greater Vancouver Regional District (GVRD) watershed ecological inventory program. Study coordinated by Acres International Limited. 3 volumes, 500+ pp.
Aste, J.P., Ke, C., Faure, R.M., Mascarelli, D. 1995. The SISIPHE and XPENT projects - Expert Systems for slope instability. In Bell, D.H. (ed.) Proc. of the 6th Int. Symp. on Landslides, Christchurch. Balkema, Rotterdam, pp. 1647-1652.
Baeza, C., Corominas, J. 1996. Assessment of shallow landslide susceptibility by means of statistical techniques. In Senneset K. (ed.) Proc. of the 7th Int. Symp. on Landslides, Trondheim. Balkema, Rotterdam, pp. 147-152.
Beven, K.J., Kirkby, M.J. 1979. A physically based, variable contributing area model of basin hydrology. Hydrological Sciences Bulletin, 24(1). pp. 43-69.
Beven, K., Kirkby, M.J., Schofield, N., Tagg, A.F. 1984. Testing a physically based flood forecasting model (TOPMODEL) for three UK catchments. J. Hydrol., 69. pp. 119-143.
Bishop, C.M. 1995. Neural Networks for pattern recognition. Oxford University Press. Oxford.
Bojadziev, G., Bojadziev, M. 1995. Fuzzy Sets, Fuzzy Logic, applications. World Scientific Publishing Co., New Jersey.
Bristol Innovations Software Sales Ltd. 2000. CHASM: combined hydrology and stability model. User's manual, v.3.3., Bristol, UK. 23 pp.
Burrough, P.A., McDonnell, R.A. 1998. Principles of Geographical Information Systems. Oxford University Press. Oxford.
Carrara, A. 1983.
Multivariate models for landslide hazard evaluation. Mathematical Geology, Vol. 15, No. 3. pp. 403-426.
Carrara, A., Cardinali, M., Detti, R., Pasqui, V., Reichenbach, P. 1991. GIS techniques and statistical models in evaluating landslide hazard. Earth Surface Processes and Landforms, Vol. 16. pp. 427-445.
Cawsey, A. 1998. The essence of Artificial Intelligence. Prentice Hall Europe. London.
Cedergren, H.R. 1967. Seepage, Drainage, and Flow Nets. John Wiley & Sons, Inc. New York.
Center for Resource and Environmental Studies (CRES) 2003. TAPES. (June, 2003)
Chan, Y.C. 1999. Foreword. In Geotechnical risk management. Procs. of the 18th annual seminar organized by the Geotech. Div. of Hong Kong Inst. of Engineers, May 14. Hong Kong. pp. 1-2.
Commonwealth Scientific and Industrial Research Organization (CSIRO) 2003. TOPOG. (June, 2003)
Cottrell, M., Fort, J.C., Pages, G. 1998. Theoretical aspects of the SOM algorithm. Neurocomputing, 21. pp. 119-138.
Craig, R.F. 1992. Soil Mechanics. Fifth Edition. Chapman & Hall, London.
Cross, V., Firat, A. 2000. Fuzzy objects for Geographical Information Systems. Fuzzy Sets and Systems, 113. pp. 19-36.
Cruden, D.M. 1991. A simple definition of a landslide. Bulletin of the International Assoc. of Engineering Geology, No. 43. pp. 27-29.
Cruden, D.M., Varnes, D.J. 1996. Landslide types and processes. In Turner, A.K., Schuster, R.L. (eds.) Landslides - investigation and mitigation. National Academy Press, Transportation Research Board Special Report 247, Washington, DC. pp. 36-76.
Dai, F.C., Lee, C.F. 2001. Terrain-based mapping of landslide susceptibility using a geographical information system: a case study. Can. Geotech. J., 38. pp. 911-923.
Dai, F.C., Lee, C.F., Ngai, Y.Y. 2002. Landslide risk assessment and management: an overview. Engineering Geology, no. 64. pp. 65-87.
Davis, T.J. 1999.
Towards verification of a natural resource uncertainty model. Ph.D. Thesis, University of British Columbia, Vancouver, BC.
Davis, T.J., Klinkenberg, B., Keller, P.C. 2002. Updating inventory using oblique videogrametry data fusion. Journal of Forestry, March 2002. pp. 45-50.
De Vries, J., Chow, T.L. 1978. Hydrologic behavior of a forested mountain soil in Coastal British Columbia. Water Res. Research, 14 (5). pp. 935-942.
Dhakal, A.S. 1999. Remote Sensing and GIS techniques for landslide hazard evaluation and prediction models. Ph.D. Thesis, U. of Tsukuba, Japan.
Dietrich, W.E., Sitar, N. 1997. Geoscience and geotechnical engineering aspects of debris-flow hazard assessment. In Chen, C-I. (ed.) Procs. of 1st Int. Conf. on debris-flow hazards mitigation, San Francisco, California, pp. 656-676.
Dietrich, W.E., Montgomery, D.R., 1998. SHALSTAB: A digital terrain model for mapping shallow landslide potential. 30+ pp. (Nov. 2000)
Dodagoudar, G.R., Venkatachalam, G. 2000. Reliability analysis of slopes using fuzzy sets theory. Computers and Geotechnics, 27. pp. 101-115.
Duncan, J.M. 1996. Soil slope stability analysis. In Turner, A.K., Schuster, R.L. (eds.) Landslides - investigation and mitigation. National Academy Press, Transportation Research Board Special Report 247, Washington, DC. pp. 36-76.
EBA Engineering Consultants Ltd. 1999. Detailed and reconnaissance terrain mapping with interpretation of terrain stability, surface erosion potential, and potential fine sediment transfer, west Sugar Lake and Gates Creek areas, British Columbia. Project No. 0801-98-87752. Project funded by Forest Renewal BC. 34 pp.
Eisbacher, G.H., Clague, J.J. 1981. Urban landslides in the vicinity of Vancouver, British Columbia, with special reference to the December 1979 rainstorm. Can. Geotech. J., 18. pp. 205-216.
Elomaa, J., Halme, J., Hassinen, P., Hodju, P., Ronkko, J. 1999.
NENET - Demo version 1.1a. (March, 2003)
Environmental Systems Research Institute, Inc. (ESRI), 1996a. Using ArcView GIS.
Environmental Systems Research Institute, Inc. (ESRI), 1996b. Using the ArcView Spatial Analyst.
Evans, S.G. 1982. Landslides and surficial deposits in urban areas of British Columbia: a review. Can. Geotech. J., 19. pp. 269-288.
Fannin, R.J., Jaakkola, J. 1999. Hydrological response of hillslope soils above a debris-slide headscarp. Can. Geotech. J. 36(6). pp. 1111-1122.
Fannin, R.J., Jaakkola, J. 2000. Piezometric observations: implications for debris flow initiation on forested hillslopes. In Bromhead, E., Dixon, N., Ibsen, M-L. (eds.) Proc. of the 8th Int. Symp. on Landslides, Cardiff. Thomas Telford Ltd., London, pp. 537-542.
Fannin, R.J., Jaakkola, J., Wilkinson, J.M.T., Hetherington, E.D. 2000. The hydrologic response of soils to rainfall at the Carnation Creek watershed, British Columbia. Water Resources Research, 36(6). pp. 1481-1494.
Fannin, R.J., Wise, M.P. 2001. An empirical-statistical model for debris flow travel distance. Can. Geotech. J., 38. pp. 982-994.
Faure, R.M., Leroueil, S., Rajot, J.P., Larochelle, P., Seve, G., Tavenas, F. 1988. XPENT, Expert System in slope stability (in French). Bonnard, C. (ed.) Proc. of the 5th Int. Symp. on Landslides, Lausanne. Balkema, Brookfield VT. pp. 625-629.
Faure, R.M., Mascarelli, D., Vaunat, J., Leroueil, S., Tavenas, F. 1995. Present state of development of XPENT, Expert System for slope stability problems. In Bell, D.H. (ed.) Proc. of the 6th Int. Symp. Landslides, Christchurch. Balkema, Rotterdam, pp. 1671-1678.
Fausett, L. 1994. Fundamentals of neural networks: architectures, algorithms, and applications. Prentice Hall, Inc. New Jersey.
Feigenbaum, E.A. 1979. Themes and case studies of knowledge engineering. In Michie D. (ed.)
Expert Systems in the Micro-Electronic Age. Edinburgh University Press, Edinburgh, pp. 3-25.
Fernandez-Steeger, T.M. 2002. Objectivation of landslide hazard analysis with Artificial Neural Networks (ANN). Overview of Ph.D. thesis. (June, 2003)
Fisher, P. 2000. Sorites paradox and vague geographies. Fuzzy Sets and Systems, 113. pp. 7-18.
Fredlund, D.G. 1987. Slope stability analysis incorporating the effect of soil suction. In Anderson, M.G., Richards, K.S. (eds.) Slope stability: Geotechnical engineering and Geomorphology. John Wiley & Sons, Chichester, pp. 145-186.
Freeze, R.A. 1980. A stochastic-conceptual analysis of rainfall-runoff processes on a hillslope. Water Resources Res., 16. pp. 391-408.
Geo-Slope International Ltd. 1998. SLOPE/W for slope stability analysis. User's manual, v. 4. Calgary, Alberta. 100+ pp.
Greenway, D.R. 1987. Vegetation and slope stability. In Anderson, M.G., Richards, K.S. (eds.) Slope stability: Geotechnical engineering and Geomorphology. John Wiley & Sons, Chichester, pp. 187-230.
Grivas, D.A., Reaga, G.C. 1988. An expert system for the evaluation and treatment of earth slope instability. In Bonnard, C. (ed.) Proc. of the 5th Int. Symp. on Landslides, Lausanne. Balkema, Brookfield VT. pp. 649-654.
Guesgen, H.W., Albrecht, J. 2000. Imprecise reasoning in geographic information systems. Fuzzy Sets and Systems, 113. pp. 121-131.
Gulyas, G. 1995. Terrain stability assessment using logistic regression analysis for the Jamieson-Orchid-Elbow creeks subdrainage, Seymour River Basin, BC. M.Sc. Thesis, UBC, Vancouver.
Guzzetti, F., Carrara, A., Cardinali, M., Reichenbach, P. 1999. Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology, no. 31. pp. 181-216.
Hammond, C., Hall, D., Miller, S., Swetik, P. 1992. Level I Stability Analysis (LISA).
Documentation for Version 2.0. USDA, Intermountain Research Station. General Technical Report INT-285. 128 pp.
Haneberg, W.C. 1991. Pore pressure diffusion and the hydrologic response of nearly saturated, thin landslide deposits to rainfall. Journal of Geology, 99. pp. 886-992.
Hastie, T., Stuetzle, W. 1989. Principal curves. Journal of the American Statistical Association, 84. pp. 502-516.
Haykin, S. 1994. Neural Networks. A comprehensive foundation. Macmillan College Publishing Company, New York.
Hebb, D.O. 1949. The Organization of Behavior. Wiley, New York. Introduction and Chapter 4, 'The first stage of perception: growth of the assembly', pp. xi-xix, 60-78. Reprinted in Anderson and Rosenfeld, 1988.
Hodge, R.A., Freeze, R.A. 1977. Groundwater flow systems and slope stability. Can. Geotech. J., 14. pp. 466-476.
Hodju, P., Halme, J. 1999. Neural Networks Tool - Nenet (March, 2003)
Holland, S.S. 1976. Landforms of British Columbia: a physiographic outline. BC Department of Mines and Petroleum Resources - Bulletin 48. Victoria, BC. 138 pp.
Holmstrom, L., Koistinen, P., Laaksonen, J., Oja, E. 1996a. Comparison of neural and statistical classifiers - theory and practice. Research Report A13, Rolf Nevanlinna Institute, University of Helsinki, Finland. 37 pp.
Holmstrom, L., Koistinen, P., Laaksonen, J., Oja, E. 1996b. Neural network and statistical perspectives of classification. Proceedings of the ICPR '96, Vienna, Austria, pp. 286-290.
Hopfield, J.J. 1982. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79. pp. 2554-2558. Reprinted in Anderson and Rosenfeld, 1988.
Howes, D.E., Kenk, E. (contributing eds.) 1997. Terrain classification system for British Columbia (version 2). Ministry of Environment and Surveys, and Ministry of Crown Lands. Victoria, BC. 101 pp.
Huang, S.H., Xing, H. 2002.
Extract intelligible and concise fuzzy rules from neural networks. Fuzzy Sets and Systems, 132. pp. 233-243.
Huba, G.J. 2001. CHAID. The Measurement Group. (Feb., 2003)
Hungr, O. 1987. An extension of Bishop's simplified method of slope stability analysis to three dimensions. Geotechnique, 37(1). pp. 113-117.
Innes, J.L. 1983. Debris flows. Progress in Physical Geography, Vol. 7, No. 1. pp. 469-501.
Innes, J.L. 1985. Magnitude-frequency relations of debris flows in Northwest Europe. Geografiska Annaler, 67 A (1-2). pp. 23-32.
Iverson, R.M., Major, J.J. 1986. Groundwater seepage vectors and the potential for hillslope failure and debris flow mobilization. Water Res. Research, 22(11). pp. 1543-1548.
Iverson, R.M. 2000. Landslide triggering by rain infiltration. Water Res. Research, Vol. 36, No. 7. pp. 1897-1910.
Jaakkola, J. 1998. Forest Groundwater Hydrology: Implications for Terrain Stability in Coastal British Columbia. MASc Thesis. UBC.
Jakob, M. 2000. The impacts of logging on landslide activity at Clayoquot Sound, British Columbia. Catena, 38. pp. 279-300.
Juang, C.H., Jiang, T., Christopher, R.A. 2001. Three-dimensional site characterization: neural network approach. Geotechnique, 51(9). pp. 799-809.
Kartalopoulos, S.V. 1996. Understanding neural networks and fuzzy logic. Basic concepts and applications. IEEE Press, New York.
Kasabov, N.K. 1996. Learning fuzzy rules and approximate reasoning in fuzzy neural networks and hybrid systems. Fuzzy Sets and Systems, 82. pp. 135-149.
Kaski, S. 1997. Data exploration using Self-Organizing Maps. Acta Polytechnica Scandinavica, Mathematics, Computing and Management in Engineering Series No. 82, Espoo. Published by the Finnish Academy of Technology. 57 pp.
Keaton, J.R., DeGraff, J.V. 1996. Surface observation and geologic mapping. In Turner, A.K., Schuster, R.L. (eds.) Landslides - investigation and mitigation.
National Academy Press, Transportation Research Board Special Report 247, Washington, DC. pp. 36-76.
Kernighan, B.W., Ritchie, D.M. 1988. The C Programming Language. Second edition. Prentice Hall Inc., New Jersey.
Kirkby, M.J. (ed.) 1978. Hillslope hydrology. John Wiley & Sons, Chichester.
Kohonen, T. 1982. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43. pp. 59-69. Reprinted in Anderson and Rosenfeld, 1988.
Kohonen T., Hynninen J., Kangas J., Laaksonen J., Torkkola K. 1996a. Learning Vector Quantization. Technical Report A30, Helsinki University of Technology, Laboratory of Computer and Information Science, Espoo, Finland. 32 pp.
Kohonen T., Hynninen J., Kangas J., Laaksonen J. 1996b. Self-Organizing Maps. Technical Report A31, Helsinki University of Technology, Laboratory of Computer and Information Science, Espoo, Finland. 38 pp.
Kohonen, T. 2001. Self-organizing maps. Third edition. Springer-Verlag, Berlin.
Krag, R.K., Sauder, E.A., Wellburn, G.V. 1986. A forest engineering analysis of landslides in logged areas on the Queen Charlotte Islands, British Columbia. BC Ministry of Forests and Lands, Land Management Report No. 43, also FERIC Special Report SR-39. Vancouver, BC. 138 pp.
Krohn, W., Küppers, G., Nowotny, H. 1990. Selforganization - the Convergence of Ideas: An Introduction. In Krohn, W., Küppers, G., Nowotny, H. (eds.) Selforganization: Portrait of a Scientific Revolution. Kluwer Academic Publishers, Dordrecht, pp. 1-10.
Laaksonen, J., Oja, E. 1996. Classification with Learning k-Nearest Neighbors. In Proceedings of ICNN '96, Washington, DC. pp. 1480-1483.
Lam, N.S.N. 1983. Spatial interpolation methods: a review. The American Cartographer, Vol. 10, No. 2. pp. 129-149.
Legendre, P. 1993. Spatial autocorrelation: Trouble or New Paradigm? Ecology, 74 (6). pp. 1659-1673.
Leshchinsky, D., Huang, C-C. 1991.
Generalized three-dimensional slope stability analysis. Journal of Geotechnical Engineering, 118(11). pp. 1748-1765.
Luce, C.H., Wemple, B.C. 2001. Introduction to special issue on hydrologic and geomorphic effects of forest roads. Earth Surface Processes and Landforms, 26. pp. 111-113.
Manly, B.F.J. 1994. Multivariate statistical methods: a primer. Second edition. Chapman and Hall, London.
Mark, R.K., Ellen, S.D. 1995. Statistical and simulation models for mapping debris-flow hazard. In Carrara, A., Guzzetti, F. (eds.) Geographical Information Systems in assessing natural hazards. Kluwer Acad. Publ., Dordrecht, pp. 93-105.
Masters, T. 1993. Practical neural network recipes in C++. Academic Press, Inc. Boston.
McCulloch, W., Pitts, W. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5. pp. 115-133. Reprinted in Anderson and Rosenfeld, 1988.
Michalewicz, Z., Fogel, D.B. 2000. How to solve it: modern heuristics. Springer-Verlag, Berlin.
Millard, T. 1999. Debris flow initiation in Coastal British Columbia gullies. BC Forest Service, Technical Report TR-002, Nanaimo, BC. 22 pp.
Montgomery, D.R., Dietrich, W.E. 1994. A physically based model for topographic control on shallow landsliding. Water Res. Research, vol. 30, no. 4. pp. 1153-1171.
Montgomery, D.R., Schmidt, K.M., Greenberg, H.M. and Dietrich, W.E. 2000. Forest clearing and regional landsliding. Geology, 28. pp. 311-314.
Moore, I.D., O'Loughlin, E.M., Burch, G.J. 1988. A contour-based topographic model for hydrological and ecological applications. Earth Surface Processes and Landforms, 13. pp. 305-320.
Morgenstern, N.R., Sangrey, D.A. 1978. Methods of stability analysis. In Schuster, R.L., Krizek, R.J. (eds.) Landslides: analysis and control. National Academy of Sciences, Transportation Research Board Special Report 176, Washington, DC. pp.
155-172.
Moula, M., Toll, D.G., Vaptismas, N. 1995. Knowledge-based systems in geotechnical engineering. Geotechnique, Vol. 45, No. 2. pp. 209-221.
Muller, J.E., Northcote, K.E., Carlisle, D. 1974. Geology and mineral deposits of Alert - Cape Scott map area, Vancouver Island, British Columbia. Geological Survey of Canada, Department of Energy, Mines and Resources, Ottawa. Paper 74-8.
Nash, D.F.T. 1987. A comparative review of limit equilibrium methods of stability analysis. In Anderson, M.G., Richards, K.S. (eds.) Slope stability: Geotechnical engineering and Geomorphology. John Wiley & Sons, Chichester, pp. 11-76.
Nelson, J. 1999. Forest Planning Studio (fps) - ATLAS Program. Faculty of Forestry, University of British Columbia, Vancouver, BC. 66 pp.
Niemann, K.O., Howes, D.E. 1992. Slope stability evaluations using digital terrain models. BC Ministry of Forests, Land Management Report No. 74, Victoria, BC. 28 pp.
O. Hungr Geotechnical Research, Inc. 2003. CLARA-W. (June, 2003)
O'Loughlin, C.L. 1972. An investigation of the stability of the steepland forest soils in the Coast Mountains, southwest British Columbia. Ph.D. Thesis, UBC.
O'Loughlin, E.M. 1986. Prediction of surface saturation zones in natural catchments by topographic analysis. Water Resources Research, 22(5). pp. 794-804.
Oreskes, N., Shrader-Frechette, K., Belitz, K. 1994. Verification, validation, and confirmation of numerical models in the Earth Science. Science, Vol. 263. pp. 641-646.
Pack, R.T. 1995. Statistically-based terrain stability mapping methodology for the Kamloops Forest Region, British Columbia. Proc. of the 48th Canadian Geotech. Conf., Canadian Geotech. Soc., Vancouver, BC. 8 pp.
Pack, R.T., Tarboton, D.G., Goodwin, C.N. 1998. A Stability Index Approach to Terrain Stability Hazard Mapping. SINMAP User's Manual. 68 pp.
Pavel, M. 2000a. Autoassociative Neural Nets.
Project prepared for EECE 592 - Architecture for Learning Systems, UBC. 18 pp + software.
Pavel, M. 2000b. Application of the Backpropagation algorithm for landslide hazard analysis. Project prepared for EECE 592 - Architecture for Learning Systems, UBC. 20 pp + software.
Pavel, M. 2001. An Expert System for assigning stability classes to terrain polygons. Term project prepared for MMPE 578 - Industrial Expert Systems, UBC. 10 pp + software.
Pearl, J. 2000. Causality: models, reasoning, and inference. Cambridge University Press, Cambridge, MA.
Pfister, B. 1999. Video mapping: another candidate for next great technology. GEO World, Oct. 1999. pp. 36-37.
Poole, D., Mackworth, A., Goebel, R. 1998. Computational Intelligence: a logical approach. Oxford University Press. New York.
Preston, N.J., Crozier, M.J. 1999. Resistance to shallow landslide failure through root-derived cohesion in east Coast Hill Country soils, North Island, New Zealand. Earth Surface Processes and Landforms, 24. pp. 665-675.
Price, M. 1985. Introducing groundwater. George Allen & Unwin Publishers Ltd., London.
Province of British Columbia. 1995. Forest Practices Code of British Columbia Act. Government of BC, Ministry of Forests. Victoria, BC.
Province of British Columbia. 1996a. Terrain stability mapping in British Columbia. A review and suggested methods for landslide hazard and risk mapping. Government of BC, Resources Inventory Committee. Victoria, BC. 50+ pp.
Province of British Columbia, 1996b. Guidelines and standards for terrain mapping in British Columbia. Government of BC, Resources Inventory Committee. Victoria, BC. 131 pp.
Province of British Columbia. 1998. Standard for digital terrain data capture in British Columbia. Government of BC, Resources Inventory Committee. Victoria, BC. 107 pp.
Province of British Columbia, 1999. Mapping and assessing terrain stability guidebook.
Second Edition. Forest Practices Code of British Columbia. BC Ministry of Forests and BC Environment, Victoria, BC. 36 pp.
Province of British Columbia, 2001. Vegetation resources inventory: photo interpretation procedures. BC Ministry of Forests, Victoria, BC. 136 pp.
Roering, J.J., Schmidt, K.M., Stock, J.D., Dietrich, W.E., Montgomery, D.R. 2003. Shallow landsliding, root reinforcement, and the spatial distribution of trees in the Oregon Coast Range. Canadian Geotech. Journal, 40. pp. 237-253.
Rojas, R. 1996. Neural Networks: a systematic introduction. Springer-Verlag, Berlin.
Rollerson, T.P. 1992. Relationship between landscape attributes and landslide frequencies after logging: Skidgate Plateau, Queen Charlotte Islands. BC Ministry of Forests, Land Management Report No. 76, Victoria, BC. 11 pp.
Rollerson, T., Millard, T., Jones, C., Trainor, K., Thomson, B. 2001. Predicting post-logging landslide activity using terrain attributes: Coast Mountains, British Columbia. Technical Report TR-011, Vancouver Forest Region, Nanaimo, BC. 20 pp.
Rollerson, T., Millard, T., Thomson, B. 2002. Using terrain attributes to predict post-logging landslide likelihood on southwestern Vancouver Island. Technical Report TR-015, Vancouver Forest Region, Nanaimo, BC. 15 pp.
Rosenblatt, F. 1958. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65. pp. 386-408. Reprinted in Anderson and Rosenfeld, 1988.
Rulon, J.J., Freeze, R.A. 1985. Multiple seepage faces on layered slopes and their implications for slope stability analysis. Can. Geotech. J., 22. pp. 347-356.
Rumelhart, D.E., Hinton, G.E., Williams, R.J. 1986a. Learning internal representations by error backpropagation. In Parallel Distributed Processing: Explorations in the microstructure of cognition, Vol. I, Rumelhart D.E. and McClelland J.L.
(Eds.), MIT Press, Cambridge, MA. pp. 318-362. Reprinted in Anderson and Rosenfeld, 1988.
Rumelhart, D.E., Hinton, G.E., Williams, R.J. 1986b. Learning representations by back-propagation errors. Nature, 323. pp. 533-536. Reprinted in Anderson and Rosenfeld, 1988.
Russell, S.J., Norvig, P. 1995. Artificial Intelligence: a modern approach. Prentice Hall, Inc. New Jersey.
Sammon Jr., J.W. 1969. A nonlinear mapping for data structure analysis. IEEE Transactions on Computers, 18. pp. 401-409.
Sarkaria, S. 2000. Architectures for learning systems. Course notes, EECE 592. University of British Columbia.
SAS Institute. 2001. SAS User's Guide, version 8.02. The SAS Institute, Cary, N.C.
Sassa, K. 2003. Establishment of a new International Consortium on Landslides. Landslide News, 14/15. pp. 2.
Sauder, E.A., Wellburn, G.V. 1987. Studies of yarding operations on sensitive terrain, Queen Charlotte Islands, BC. BC Ministry of Forests, Land Management Report No. 52, also FERIC Special Report SR-43. Vancouver, BC. 45 pp.
Sayed, T., Abdelwahab, W. 1998. Comparison of fuzzy and neural classifiers for road accidents analysis. J. of Computing in Civil Eng., 12 (1). pp. 42-47.
Sayed, T., Razavi, A. 2000. Comparison of neural and conventional approaches to mode choice analysis. Journal of Computing in Civil Engineering, 14(1). pp. 23-30.
Sayed, T., Tavakolie, A., Razavi, A. 2003. Comparison of adaptive network based fuzzy inference systems and B-spline neuro-fuzzy mode choice models. J. of Computing in Civil Eng., 17(2). pp. 123-130.
Schmidt, K.M., Roering, J.J., Stock, J.D., Dietrich, W.E., Montgomery, D.R. 2001. The variability of root cohesion as an influence on shallow landslide susceptibility in the Oregon Coast Range. Can. Geotech. J., 38. pp. 995-1024.
Selby, M.J. 1985. Earth's changing surface: an introduction to geomorphology. Oxford University Press. New York.
Sharma, S. 1991. XSTABL: an integrated slope stability analysis program for personal computers. Reference Manual, v.4.0. Interactive Software Designs, Inc., Moscow, Idaho. 131 pp.
Sidle, R.C., Swanston, D.N. 1982. Analysis of a small debris slide in coastal Alaska. Can. Geotech. Journal, 19(2). pp. 167-174.
Sidle, R.C., Pierce, A.J., O'Loughlin, C.L. 1985. Hillslope stability and land use. Water Resources Monograph 11. American Geophysical Union, Washington, DC. 140 pp.
Sidle, R.C. 1991. A conceptual model of changes in root cohesion in response to vegetation management. Journal of Environmental Quality, 20. pp. 43-52.
Sitar, N., Anderson, S.A., Johnson, K.A. 1992. Conditions for initiation of rainfall-induced debris flows. Geotechnical Engineering Division Specialty Conference: Stability and performance of slopes and embankments - II. ASCE, New York. pp. 834-839.
Slaymaker, O. 2000. Assessment of the geomorphic impacts of Forestry in British Columbia. Ambio, Vol. 29, No. 7. pp. 381-387.
Soeters, R., van Westen, C.J. 1996. Slope instability: recognition, analysis, and zonation. In Turner, A.K., Schuster, R.L. (eds.) Landslides - investigation and mitigation. National Academy Press, Washington, DC. pp. 36-76.
StatSoft. 2003. StatSoft Electronic Textbook. (Feb., 2003)
Tarboton, D.G. 1997. A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water Resources Research, 33(2). pp. 309-319.
Taylor, D.W. 1962. Fundamentals of Soil Mechanics. John Wiley & Sons, Inc., New York.
Terzaghi, K. 1943. Theoretical Soil Mechanics. John Wiley and Sons, Inc., New York.
Terzaghi, K. 1950. Mechanism of landslides. In: Application of Geology to engineering practice, Berkeley Volume, The Geological Society of America. pp. 83-123.
Terzaghi, K., Peck, R.B. 1967. Soil mechanics in engineering practice. Second Edition.
John Wiley & Sons, Inc., New York.
Tobler, W. 1970. A computer movie for simulating urban growth in the Detroit Region. Economic Geography, 46(2). pp. 234-240.
Toll, D.J. 1996. Artificial Intelligence applications in Geotechnical Engineering. Electronic Journal of Geotechnical Engineering. University of Durham. 30 pp. (March, 2003)
Tukey, J.W. 1977. Exploratory Data Analysis. Addison-Wesley, Reading, MA.
Ultsch, A., Siemon, H.P. 1990. Kohonen's Self Organizing Feature Maps for exploratory data analysis. In Proceedings of International Neural Network Conference (INNC '90). Kluwer Academic Publishers, Dordrecht. pp. 305-308.
van Westen, C.J., Terlien, M.T.J. 1996. Deterministic landslide hazard analysis in GIS. A case study from Manizales (Colombia). Earth Surf. Proc. and Landf., 21. pp. 853-868.
Varnes, D.J. 1978. Slope movement types and processes. In Schuster, R.L., Krizek, R.J. (eds.) Landslides: analysis and control. National Academy of Sciences, Washington, DC. pp. 11-33.
Varnes, D.J. 1984. Landslide hazard zonation: a review of principles and practice. Natural Hazards, No. 3. UNESCO, Paris. 63 pp.
Vesanto, J. 2000. Using SOM in Data Mining. Licentiate's thesis. Helsinki University of Technology, Department of Computer Science and Engineering, Espoo, Finland. 50 pp.
von der Malsburg, C. 1973. Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14. pp. 85-100. Reprinted in Anderson and Rosenfeld, 1988.
Vulliet, L., Mayoraz, F. 2000. Coupling Neural Networks and mechanical models for a better landslide management. In: Landslides in research, theory and practice. Proceedings of the 8th International Symposium on Landslides, Cardiff. pp. 1521-1526.
Weizenbaum, J. 1976. Computer power and human reason: from judgement to calculation. W.H. Freeman, San Francisco.
Wieczorek, G.F. 1987.
Effect of rainfall intensity and duration on debris flows in Central Santa Cruz Mountains, California. In Costa, J.E., Wieczorek, G.F. (eds.) Debris flows / avalanches. Geol. Soc. of America, Rev. Eng. Geology, 7. pp. 93-104.
Wieczorek, G.F. 1996. Landslide triggering mechanisms. In Turner, A.K., Schuster, R.L. (eds.) Landslides - investigation and mitigation. National Academy Press, Transportation Research Board Special Report 247, Washington, DC. pp. 76-90.
Wieczorek, G.F., Mandrone, G., DeCola, L. 1997. The influence of hillslope shape on debris-flow initiation. In Chen, C-I. (ed.) Procs. of 1st Int. Conf. on debris-flow hazards mitigation, San Francisco, California. pp. 21-31.
Wilkinson, J.M.T. 1996. Landslide initiation: a unified geostatistical and probabilistic modeling technique for terrain stability assessment. M.A.Sc. Thesis, UBC.
Wise, M.P. 1997. Probabilistic modeling of debris flow travel distance using empirical volumetric relationships. M.A.Sc. Thesis, UBC.
Wislocki, A.P., Bentley, S.P. 1989. An Expert System for landslide hazard and risk assessment. In Topping, B.H.V. (ed.): Artificial Intelligence techniques and applications for civil and structural engineers, Civil-Comp Press, Edinburgh. pp. 249-252.
Wu, T.H., Swanston, D.N. 1980. Risk of landslides in shallow soils and its relation to clearcutting in Southeastern Alaska. Forest Science, 26(3). pp. 495-510.
Wu, T.H., Tang, W.H., Einstein, H.H. 1996. Landslide hazard and risk assessment. In Turner, A.K., Schuster, R.L. (eds.) Landslides: investigation and mitigation. Transportation Research Board, Special Report 247. National Academy Press, Washington, DC. pp. 106-128.
Wu, W. 1993. Distributed slope stability analysis in steep, forested basins. Ph.D. Thesis, Utah State University, Logan, Utah.
Wu, W., Sidle, R.C. 1995.
A distributed slope stability model for steep forested basins. Water Res. Research, Vol. 31, No. 8. pp. 2097-2110.
Wurtz, R.P. 2000. Gossiping Nets. Book review of J.A. Anderson and E. Rosenfeld (Eds.), Talking Nets: An oral history of neural networks. MIT Press, Cambridge, MA, 1988, 448 pp. Artificial Intelligence, 119 (2000). pp. 295-299.
Zadeh, L.A. 1965. Fuzzy sets. Information and Control, 8. pp. 338-353.
Zhang, W., Montgomery, D.R. 1994. Digital elevation model grid size, landscape representation, and hydrologic simulations. Water Res. Research, Vol. 30, No. 4. pp. 1019-1028.

APPENDIX 1 - GLOSSARY

(Most definitions related to ANN are adopted from Kohonen 2001; they were meant to be more explanatory than distinctive, and very concise.)

artificial intelligence: ability of an artificial system to perform tasks that are usually thought to require intelligence.
artificial neural network (ANN): massively parallel interconnected network of simple (usually adaptive) elements and their hierarchical organizations, intended to interact with the objects of the real world in the same way as biological nervous systems do. In a more general sense, ANN also encompass abstract schemata, such as mathematical estimators and systems of symbolic rules, constructed automatically from masses of examples, without heuristic design or other human intervention. Such schemata are supposed to describe the operation of biological or artificial neural networks in a highly idealized form and define certain performance limits.
bottom-up: relating to inductive inference, i.e. from particulars to generals.
calibration: determination of scaling or labeling for an analytical device by means of well-validated input data.
colluvium: materials that have reached their present positions as a result of direct, gravity-induced mass movements.
No agent of transportation such as water or ice is involved, although the moving material may have contained water or ice.
DEM (Digital Elevation Model): any digital representation of the continuous variation of ground relief over space.
dimensionality: number of elements in a vector or matrix.
discriminant function: mathematical function defined for each class of patterns separately. Input data are determined to belong to that class for which the discriminant function attains the largest value.
energy function: objective function to be minimized in some optimization problem.
Expert System: system embodying specialist expertise. Sometimes, ES are distinguished from Knowledge-Based Systems (KBS) in that the knowledge base of an expert system is not derived from generally available (public) knowledge (e.g. textbooks, etc.), but comes from expert specialists in a problem domain and their 'private' knowledge of a field. This distinction is not very clear because the concept of expertise is itself not well defined. The terms are often used as synonyms in the literature.
fluvial: pertaining to streams and rivers.
Fuzzy Logic: continuous-valued logic consisting of maximum and minimum selection operators and making use of membership functions to define the graded affiliation to sets. The three-valued logic was established independently by Lukasiewicz in 1920 and Post in 1921. The primary philosophical work in continuum-valued logics was published by Lukasiewicz and Tarski in 1930, followed by Black in 1937, who introduced the concept of vague set. The most influential study in recent years is the introduction of Fuzzy Sets by Zadeh (1965), which was directly preceded by the work of Kaplan and Schott in 1951 (cf. Bojadziev and Bojadziev 1995; Fisher 2000).
Fuzzy Set Theory: set theory in which the operators obey rules of fuzzy logic.
generalization: way of responding similarly (consistently) to a class of inputs, some of which do not belong to the training set of the same class.
genetic algorithm: learning principle, in which learning results are found from generalizations of solutions by crossing and eliminating their members. An improved behaviour usually ensues from selective stochastic replacements in subsets of system parameters.
Geomorphology: the study of the origin of landforms, the processes whereby they are formed, and the materials of which they consist.
geomorphic processes: dynamic actions or events that occur at the earth's surface due to application of natural forces resulting from gravity, temperature changes, freezing and thawing, chemical reactions, seismic shaking, and the agencies of wind and moving water, ice and snow. Where and when a force exceeds the strength of the earth material, the material is changed by deformation, translocation, or chemical reactions.
Geographic Information System (GIS): a set of tools for collecting, storing, retrieving at will, transforming and displaying spatial data from the real world for a particular set of purposes (Burrough and McDonnell 1998).
glaciofluvial materials: deposits and landforms formed by glacial meltwater streams.
glaciolacustrine materials: sediments deposited in or along the margins of glacial lakes; primarily lake floor sediments consisting of fine sand, silt and clay settled from suspension or from turbidity currents, and including coarser sediments (e.g. ice-rafted boulders) released by the melting of floating ice.
gradient: vector, the components of which are partial derivatives of a scalar function with respect to various dimensions in the data space.
intelligence: ability to perform new tasks that are directly or indirectly vital to survival, and solve problems for which no precedent cases exist. Human intelligence is measured by standardized mental tests.
lacustrine materials: sediments that have settled from suspension or underwater gravity flows in lakes.
landform: any physical, recognizable form or feature of the earth's surface, having a characteristic shape, and produced by natural processes.
lattice: regular, often two- or three-dimensional spatial configuration of nodes or neurons.
laws of thought: the basic laws of thought were first developed by Aristotle, namely: (1) the law of identity; (2) the law of non-contradiction; and (3) the principle of the excluded middle. The principle of the excluded middle ensures that all statements in conventional logic can have only two values, true or false; its role in mathematical (and logical) proof - reductio ad absurdum - has been of paramount importance in scientific and philosophical development (cf. Burrough and McDonnell 1998).
learning: generic name for any behavioural changes that depend on experiences and improve the performance of a system. In a more restricted sense learning is identical with adaptation, especially selective modification of parameters of a system.
learning rate: rate of change of parameters especially relating to one learning step (or in time-continuous systems, to the time constants).
learning vector quantization (LVQ): supervised-learning vector quantization method in which the decision surfaces, relating to those of the Bayesian classifier, are defined by nearest-neighbour classification with respect to sets of codebook vectors assigned to each class and describing it.
model: simplified and approximate description of a system or process based on a finite set of essential variables and their analytically definable behaviour.
neighbourhood: set of neurons (eventually including the neuron itself to which the neighbourhood is related) located spatially in the neural network up to a defined radius from the neuron.
neuron: any of the numerous types of specialized cells in the brain or other nervous systems that transmit and process neural signals. The nodes of ANN are also called neurons.
Ockham: William of Ockham (c. 1285 - 1349), the most influential philosopher of his century and a major contributor to medieval epistemology, logic, and metaphysics, is credited with a statement called Ockham's Razor - in Latin, 'Entia non sunt multiplicanda praeter necessitatem', and in English, 'Entities are not to be multiplied without necessity'. Unfortunately, this laudable piece of advice is nowhere to be found in his writings in precisely these words (cf. Russell and Norvig 1995).
off-line: relating to computing operations such as training performed prior to the use of the device.
on-line: relating to direct connection between machine and its environment and its operation in real time.
piping: subsurface erosion of particulate materials by flowing water, resulting in the formation of underground caves and conduits and the development of collapse depressions at the land surface.
pixel: a data element having both spatial and spectral aspects. The spatial variation defines the apparent size of the resolution cell, and the area on the ground represented by the data value. Also, numerically expressed picture element.
postprocessing: posterior operation performed on the outputs from, say, an ANN. Its purpose may be to segment sequences of output results into discrete elements, for instance to correct text on the basis of grammatical rules, or to improve labeling in image analysis.
preprocessing: set of normalization, enhancement, feature-extraction, or other similar operations on signal patterns before they are presented to the central nervous system or other pattern recognition systems.
projection: ordered mapping from a higher-dimensional to a lower-dimensional manifold.
prototype: typical sample from a class of items used in training.
raster data: an array of cells (pixels) referenced by a set of row and column numbers. It is one of the fundamental ways of representing and storing spatial data.
Sammon's mapping: clustering method that visualizes mutual distances between high-dimensional data in a two-dimensional display.
self-organization: in the original sense, simultaneous development of both structure and parameters in learning.
self-organizing map (SOM): result of a nonparametric regression process that is mainly used to represent high-dimensional, nonlinearly related data items in an illustrative, often two-dimensional display, and to perform unsupervised classification and clustering.
spatial resolution: a measure of the smallest angular or linear separation between two objects, usually expressed as radians or meters.
Sorites paradox: named after the Greek word for heap - soros, and judged to be among the most profound and important of all those known to philosophers. It starts with an initial condition that is true: one grain of sand is not a heap. We have a premise, which is apparently true: for any value of n, adding 1 grain will not turn a non-heap to a heap. At the end of a repeated application of the premise we have the false conclusion that a collection of a million grains (for example) is not a heap. Having a correct sequence of premises, which reach a false conclusion, is a paradox (not a fallacy!). As this paradox was known in ancient Greece, some philosophers refused to respond to all questions (in a series) which identified a boundary condition (cf. Fisher 2000).
steepest descent: minimization of an objective function of the system by changing the parameters in the direction of its negative gradient with respect to the parameter vector.
supervised learning: learning with a teacher; learning scheme in which the average expected difference between wanted output for training samples, and the average output, respectively, is decreased.
symbolic: relating to representation or processing of information, whereby the elements of representation refer to complete items or their complexes.
syntax: set of rules that describe how symbols can be joined to make meaningful strings; systematic joining of elements, especially combination of elements defined by a grammar (set of rules).
till: material deposited by glaciers and ice sheets without modification by any other agent of transportation.
top-down: relating to deductive inference, i.e. from generals to particulars.
training: forced learning, teaching.
unsupervised learning: learning without a priori knowledge about the classification of samples; learning without a teacher. Often the same as formation of clusters, whereafter these clusters can be labeled. Also optimal allocation of computing resources when only unlabeled, unclassified data are input.
updating: modification of parameters in learning; also, bringing data up-to-time.
vector data: a set of graphic data that can be ultimately decomposed into point locations generally described by coordinates; it may include points, lines or polygons. It is one of the fundamental ways of representing and storing spatial data.
vector quantization: representation of a distribution of vectorial data by a smaller number of reference or codebook data; optimal quantization of the signal space that minimizes the average expected quantization error.
Voronoi tessellation: division of the data space by segments of hyperplanes that are midplanes between closest codebook vectors. One partition in a Voronoi tessellation is a Voronoi polygon in 2-D, or a Voronoi polyhedron in a multidimensional space.
water table: the upper surface of the zone of groundwater saturation in permeable rocks or surficial materials.
weight vector: real-valued parametric vector that usually defines the input activation of a neuron as a dot product of the input vector and itself.
winner: neuron in competitive-learning neural networks that detects the maximum activation for, or minimum difference between, input and weight vectors.

APPENDIX 2 - TERRAIN SYMBOL

General description of terrain symbol, from Terrain Classification System of BC (Howes and Kenk 1997).

TEXTURE (one to three lower case letters) describes the size, roundness and sorting of particles in mineral sediments and the fiber content of organic materials.
QUALIFIERS (up to two superscript upper case letters) are used where appropriate to provide information about surficial materials and geomorphological processes.
SURFICIAL MATERIAL (one upper case letter) is classified according to its mode of deposition.
GEOMORPHOLOGICAL PROCESSES (one to three upper case letters) describes geomorphological processes that are modifying either surficial material or landforms.
SURFACE EXPRESSION (one to three lower case letters) describes the form (shape) of the land surface or the thickness of the surficial material.

Example of a complex description for a terrain polygon: sgFG/Mv.Rhs - R"bdVA

Description: glacio-fluvial material with a sandy-gravelly texture, morainal (till) veneer and steep, hummocky rock. The glacio-fluvial material is the most extensive material, and till and rock cover the rest of the surface in almost equal proportions. These deposits are underlain by glacio-lacustrine materials. Rapid mass movements were identified in the polygon (" indicates initiation zone), i.e. rockfall (b) and debris flow (d), as well as gully erosion and avalanches.

APPENDIX 3 - INTERPRETATION OF TERRAIN STABILITY CLASSES
(Forest Practices Code of BC - Mapping and Assessing Terrain Stability Guidebook, Province of BC 1999)

Terrain stability class    Interpretation
I      • No significant stability problems exist.
II     • There is a very low likelihood of landslides following timber harvesting or road construction.
       • Minor slumping is expected along road cuts, especially for 1 or 2 years following construction.
III    • Minor stability problems can develop.
       • Timber harvesting should not significantly reduce terrain stability. There is a low likelihood of landslide initiation following timber harvesting.
       • Minor slumping is expected along road cuts, especially for 1 or 2 years following construction. There is a low likelihood of landslide initiation following road construction.
IV     • Expected to contain areas with a moderate likelihood of landslide initiation following timber harvesting or road construction.
V      • Expected to contain areas with a high likelihood of landslide initiation following timber harvesting or road construction.

The classification addresses landslides greater than 0.05 ha in size, conventional timber harvesting practices, and sidecast road construction. The system also included the stability class IVR, which is expected to contain areas with a moderate likelihood of landslide initiation following road construction and a low or very low likelihood of landslide initiation following timber harvesting. This stability class is not specifically addressed in the thesis, and comments about this situation are made in Chapter 8.

APPENDIX 4 - CRITERIA FOR ASSIGNING TERRAIN STABILITY CLASSES
(from EBA Engineering Consultants Ltd. 1999)

SLOPE CLASS
1      0-5% (0-3°)
2      6-27% (3-15°)
3      28-49% (15-26°)
4      50-70% (26-35°), subdivided into 50-60% (26-30°) and 61-70% (31-35°)
5      >70% (>35°)

TERRAIN STABILITY CLASS
I      Mv, Mb; FGp, FGu; Fp; Dv; LGp, LGu; Rp, Ru; Rj, Ru
II     Mv, Mb; FGf, FGu, FGj; Ff, Fj; Cf; Dv; LGj, Lpi; ACa; Ra; Ruh, Rum, Rur with Mw, Cv, and/or Dv; Lga
III    Mv, Mb; FGak, FGa; Cv, Cb; aCk; Rk; LGa-V, LGk-V (-V refers to dissected slopes or single gully)
IV     LGk, LGs; Mb-V; Cb-V; FGks; Uks-V, Rs-R'V (-V refers to dissected slopes); Mv, Mb; FGk, FGs; Cv, Cb; Mks-V; FGks-V; Cvb-V
V      LGks-V, Uks-V (-V refers to single gully); all materials and landforms that are unstable (i.e. include the initiation zone of mass movements: -F", -R"s, and/or -R"b)

APPENDIX 5 - PREPARATION OF DEM

The initial data consisted of 20-m contour lines (vector-type data) in digital format (TRIM data). DEMs for the two study sites were created in ArcView; the process included the following steps:
1. Random points were generated along each contour line (distance between these points was much shorter than 20 m).
2. The DEM was created by interpolating between the points, using the Inverse Distance Weighting (IDW) method.
3. The DEM was pit-filled. When a smooth continuous surface is approximated by a square grid it is inevitable that some cells will be surrounded by neighbours that all have higher elevations. These pits are in general artifacts of the gridding process. The problem with artifact pits is that they disrupt the drainage topology and need to be removed. In my case, pits were removed by filling them up. This is a standard step in DEM creation.
4. To account for uniformity existing over small areas (i.e. to account again for some unrealistic values created by the algorithms used), new values were computed for all pixels as neighbourhood statistics for all topographic attributes. This is called spatial filtering (or convolution) and involves passing a square window (otherwise known as kernel or filter) over the surface and computing a new value of the central cell of the window as a function of the cell values covered by the window. The smoothing (low-pass) filter computes the value for the cell at the center of the window as a simple arithmetic average of the values of the other cells. It has the effect of removing extremes from the data and producing a smoother image.
Elevation was recomputed for all pixels based on a 3 x 3 grid that was moved over the entire DEM. The selection of the grid size was subjective, and is generally determined by topography.
5. The other topographic parameters were derived: slope, aspect, plan and profile curvature, and specific catchment area. For each of these attributes a new grid was created.

Previous steps were accomplished using either functions readily implemented in ArcView or scripts available on the web site of ESRI. Interpolating between contour lines to calculate elevation for intermediate cells, and generating continuous surfaces from discrete (point) data (in my case, generating a DEM from contour lines) are classical problems in spatial analysis. Various solutions are applied in different situations but there is no solution valid for the general case. These are problems studied in their own right, and discussions of these issues are presented in Burrough and McDonnell (1998), and Lam (1983).

Grids created in ArcView based on the algorithm presented above (floating-point grids) do not have an associated table of attributes. Extraction of attributes was performed by converting the grids back to vector format, and the centroid of each pixel was projected on a two-dimensional network of points. These points can be used for mapping or they can be converted back to grid format. Topographic attributes were spatially joined one by one, producing a table that summarized them. Geomorphic attributes were then spatially joined to these points. As a last step, for both watersheds, a screening was performed to eliminate the areas covered by lakes, rivers and major streams.

APPENDIX 6 - DESCRIPTION OF GEOMORPHIC ATTRIBUTES

This appendix presents surficial and subsurficial materials, texture, surficial expression, and delimiters, as described in the Terrain Classification System of BC (Howes and Kenk 1997).

Surficial and Subsurficial Material.
The types of materials (and corresponding map symbols) considered by the Terrain Classification System are presented in the following table.

Types of materials.
Material                        Symbol
Colluvium                       C
Weathered Bedrock (in situ)     D
Fluvial Material                F
Glaciofluvial Material          FG
Lacustrine Material             L
Glaciolacustrine Material       LG
Morainal Material (Till)        M
Organic Material                O
Bedrock                         R
Undifferentiated Materials      U
Volcanic Material               V
Glaciomarine Material           WG
Anthropogenic                   A
Eolian                          E
Ice                             I
Marine                          W

Not all these material types were encountered in the two study sites. However, the model was designed to include all materials that were considered relevant, for future analysis of other sites [7].

Texture of Material. This is separated in the following subclasses and types:
• specific clastic terms: blocks (a), boulders (b), cobbles (k), pebbles (p), sand (s), silt (z), clay (c).
• common clastic terms: mixed fragments (d), angular fragments (x), gravel (g), rubble (r), mud (m), shells (y).
• organic terms: fibric (e), mesic (u), humic (h).

Surface Expression (for Subsurficial and Surficial Material). This attribute is described using a binary key, based on the following criteria:
• Topography is either bedrock-controlled or it reflects the surface configuration of the underlying material. The types are: thin veneer (x), veneer (v), blanket (b), and mantle of variable thickness (w).
• There is no apparent relation between the topography of the surficial material and that of underlying material. Two cases are distinguished here:
  - Simple, constructional or erosional landforms, consisting primarily of planar slopes. The types are: plain (p) - between 0 and 3 deg.; gentle slope (j) - between 4 and 15 deg.; moderate slope (a) - between 16 and 26 deg.; moderately steep slope (k) - between 27 and 35 deg.; steep (s) - steeper than 35 deg.
  - More complex depositional or erosional landforms, consisting mainly of multi-directional, non-linear surfaces.
The types are: undulating topography (u); rolling topography (m); hummocks (h); ridges (r); depressions (d); fan (f); cone (c); terraces (t).

Delimiters. These are intended to represent the areal spread of materials. The following types are used:
• "." components on either side of the symbol are of approximately equal proportion.
• "/" the component in front of the symbol is more extensive than the one that follows. This can also be used to indicate a discontinuous covering of material.
• "//" the component in front of the symbol is considerably more extensive than the component that follows.
There is no quantitative description of the delimiters in the classification system.

[7] The Anthropogenic, Eolian, Ice, and Marine materials were not included in the analysis as they were considered not related to terrain stability.

APPENDIX 7 - DESCRIPTION OF THE MODELING PROCESS

In this study, the analysis was carried out in three different environments, organized in five steps:
1. Data preparation. Geomorphic and topographic attributes were manipulated in ArcView as presented in Appendix 5. At the end of this process, each pixel (i.e. point) had both topographic and geomorphic attributes attached to it. The resulting attribute table was exported for further processing.
2. Data coding. Coding was performed as described in Chapter 5. This process had to be monitored very closely, to make sure that each parameter was coded correctly. The most appropriate environment for this step was considered to be a spreadsheet. Data exported from ArcView were imported in Excel 97, where computer programs were written in Visual Basic for data coding.
3. Data analysis. Given the size of data and complexity of computations required, a faster tool was used. Training and testing were performed using programs written in C (Kernighan and Ritchie 1988).
The code was compiled in Visual C++ 5.0, based on either source code or pseudocode from the following sources: Masters (1993), Fausett (1994), Kohonen et al. (1996a, 1996b), and Kohonen (2001).
4. Evaluate results. Results were imported back in Excel [8]. Their quality was evaluated on the testing data set (vs. existing mapping), again using programs written in Visual Basic.
5. Display results. At the end of analysis, results of the important (relevant) analyses were imported back and displayed in ArcView.

Essentially, GIS was used first to assemble the data and to export them. Data pre-processing, the analysis (modeling) and evaluation of results were carried out (programmed) outside the system using standard computer environments (languages). Once the model had been run and evaluated, the results of relevant analyses were transferred (imported) back to the GIS and displayed.

[8] Excel, Visual Basic, and Visual C++ are trademarks or registered trademarks of Microsoft Corporation.

APPENDIX 8 - DESCRIPTION OF THE IMPLEMENTATION PROCESS

To implement the terrain stability mapping method developed in this study, the following steps need to be taken:
1. Produce a digital representation of the terrain. A common method employed in this phase uses contour lines encoded in GIS. If data are not available in digital format, analog data can be converted to digital format, based on methods described in most GIS books.
2. Create a DEM in raster format. This process consists essentially of interpolating new values for elevation among existing ones. The interpolation method and the size of the grid cell are a function of the topography of the region, the quality of the raw data (how accurately it represents the terrain), the initial spacing between contour lines, and the intrinsic nature of the problem analyzed (i.e. how large a pixel would be appropriate for the analysis).
There is no general method to address these issues, and good knowledge o f the terrain and the problem analyzed are very important. These issues are analyzed in many GIS publications (e.g. L a m 1983; Zhang and Montgomery 1994; Burrough and McDonne l l 1998). 3. Refine the D E M . Based on the same criteria used at the previous step, pit-filling and spatial filtering may be used to refine the D E M created, using the same procedures described in Appendix 5. 4. Compute the other topographic attributes. Slope and aspect can be computed using algorithms available in all GIS packages. A new grid is created for each attribute. 5. Delineate areas affected by geomorphic processes on digital air photos. This step requires good knowledge o f GIS techniques for georeferencing the air photos to the existing terrain representation, and to embed the newly created data into the GIS. Polygons that describe terrain already affected by geomorphic processes can be delineated directly on the screen. For each polygon, only the most destructive process should be recorded, based on the following order: rapid mass movements, slow mass movements, gully erosion, and 202 avalanches. If air photos are not available in digital format, the process can be performed on analog (paper) photos, and the polygons can be stereo-transferred and digitized to import them in GIS format. However, these steps should be performed by a person with good knowledge o f photogrammetric techniques. 6. Spatially jo in the attributes. Topographic and geomorphic attributes need to be spatially joined one by one, producing a grid that includes all o f them. Based on the specific GIS implementation, this may be done directly in the raster format, or a conversion to vector format may be necessary, as described in Appendix 5. 7. Screen the data. A screening needs to be performed to eliminate the areas covered by lakes, rivers, major streams, and other areas not relevant with respect to terrain stability. 8. 
Export, code and analyze the data. Data should be exported in a convenient format (preferably text). Coding, analysis, and evaluation of various scenarios, should be performed using procedures developed in this thesis. 9. Visualize the results. The selected results should be imported back in GIS for mapping and visualization. 203 
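The rule in step 5 of Appendix 8, which records only the most destructive geomorphic process for each polygon, can be illustrated with a short sketch. The thesis gives no code for this rule, so the Python below is an assumption-laden illustration rather than the author's implementation: the label strings, the function name most_destructive, and the polygon identifiers are hypothetical. Only the priority order is taken from the text.

```python
# Illustrative sketch only. The priority order comes from step 5 of Appendix 8;
# the label strings, function name, and polygon IDs are hypothetical.
PROCESS_PRIORITY = (
    "rapid mass movement",   # most destructive
    "slow mass movement",
    "gully erosion",
    "avalanche",             # least destructive in the stated order
)


def most_destructive(processes):
    """Return the highest-priority process among those observed, or None."""
    observed = set(processes)
    unknown = observed - set(PROCESS_PRIORITY)
    if unknown:
        # Fail loudly instead of silently ignoring a mistyped label.
        raise ValueError(f"unrecognized process label(s): {sorted(unknown)}")
    for process in PROCESS_PRIORITY:
        if process in observed:
            return process
    return None  # polygon shows no recorded geomorphic process


# Hypothetical polygons, each with the processes observed on the air photos.
polygons = {
    "P1": ["gully erosion", "slow mass movement"],
    "P2": ["avalanche"],
    "P3": [],
}

# One recorded process per polygon, following the stated order.
recorded = {pid: most_destructive(obs) for pid, obs in polygons.items()}
```

Keeping the order in a single tuple means the rule is stated once, so a different ranking would only require editing PROCESS_PRIORITY.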

