PREDICTION OF UNCONFINED DEBRIS SLIDE-FLOW TRAVEL DISTANCE USING SET THEORY

by

BOGDAN MIHAI STRIMBU

B.Sc., Transilvania University, Romania, 1992

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Science

in

THE FACULTY OF GRADUATE STUDIES
Faculty of Forestry
Department of Forest Resources Management

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 2002
© Bogdan Mihai Strimbu, 2002

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

Mass movement risk assessments are usually separated into two components: movement initiation and the travel distance of the event. The initiation point is the point on the slope where the mass failed. This position is hard to determine, and a common assumption is that it is the highest elevation of the scar. The travel distance is the distance from the initiation point to the point where all material is deposited. This study concentrates on particular forms of mass movement, namely unconfined debris flows and debris slides.

The parameters that characterize mass movements change over time. Measurements are usually made after the event, so the resulting data are of questionable precision. The reliability of any mass movement travel investigation depends on the accuracy of the measured values.
The results obtained depend on the precision of the original data, which can affect predictions made from the data in two ways: uncertainties in model lack-of-fit (data suitability) and uncertainties in data meaning. This study builds a new debris slide-flow travel distance prediction model with a narrow confidence interval that can take into account the vagueness of the variables. Fuzzy set theory was applied to overcome uncertainties related to the true value of the parameters.

The study was performed using data from the Arrow Forest District, British Columbia, Canada. A total of 38 events, classified as unconfined debris slide-flows traveling through forested terrain, were measured and used to build and test the debris slide-flow travel distance prediction model. The relationship between debris slide-flow length and other debris slide-flow attributes (i.e. geomorphology, geology, canopy closure and species) was established using regression analysis on crisp sets.

A new attribute was introduced to capture the debris slide-flow path. The new path variable is based on the one-to-one relationship that exists between the binary and decimal numeration systems. The path variable uses uniform sections of the debris slide-flow, called reaches, which are longer than 25 m, except for the first and last reaches. Each reach can have a value of 0 or 1, depending on how its slope compares with the slope of the upstream reach. The first (uppermost) reach always has a value of 1. The values assigned to the other reaches follow the rule that if the slope of the reach is less than the slope of the reach immediately above it, it is assigned a value of 0; if the slope of the reach is greater than that of the upstream reach, it is assigned a value of 1. The event stops if the slope is less than 20° or if it reaches a stream.
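The reach-coding rule described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `path_code`, the choice of the uppermost reach as the most significant binary digit, and the coding of equal slopes as 0 are all assumptions made only for illustration, since the abstract does not specify them.

```python
from typing import Sequence, Tuple


def path_code(slopes_deg: Sequence[float]) -> Tuple[str, int]:
    """Encode a debris slide-flow path as a binary string and its decimal value.

    slopes_deg lists the reach slopes from the uppermost reach downwards.
    Assumptions (not stated in the abstract): the uppermost reach is the
    most significant digit, and a reach with the same slope as the reach
    above it is coded 0.
    """
    if not slopes_deg:
        raise ValueError("at least one reach is required")

    bits = ["1"]  # the first (uppermost) reach always has a value of 1
    for upstream, current in zip(slopes_deg, slopes_deg[1:]):
        # 1 if the reach is steeper than the reach immediately above it, else 0
        bits.append("1" if current > upstream else "0")

    binary = "".join(bits)
    return binary, int(binary, 2)  # binary digits read as a decimal number


if __name__ == "__main__":
    # Hypothetical four-reach event with slopes of 35, 28, 31 and 22 degrees
    print(path_code([35.0, 28.0, 31.0, 22.0]))  # ('1010', 10)
```

Because each distinct sequence of steeper and gentler reaches of a given length maps to a distinct integer, a single numeric predictor of this kind can carry the shape of the path into a regression.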
The analysis of the crisp data set revealed that the new path variable, slope, azimuth, height of the stand, canopy closure, and horizontal and vertical curvature are the significant variables (at a significance level of α = 0.05) affecting debris slide-flow travel distance. The significant variables supplied by the regression analysis using crisp sets were fuzzified in order to introduce the vagueness of reality. The fuzzified variables were used in a fuzzy regression analysis based on non-linear programming. The same variables used in the regression analysis of crisp sets were used in the fuzzy analysis. The confidence interval for the debris slide-flow travel distance prediction model using the fuzzy sets was smaller than 40% of the event travel distance.

The models for the crisp and fuzzy sets show similar trends. Each model predicts the debris slide-flow travel distance with more than 80% precision. The final model used for the prediction combines both models, thereby minimising the confidence interval and the variable fuzziness. The equations derived from the models can be implemented in management software that uses digitized contour maps.

Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgements
Glossary
1 Introduction
  1.1 Risk assessment of landslides
  1.2 Goal of this study
2 Literature review
  2.1 Landslide classification
  2.2 Debris flow initiation
  2.3 Debris flow travel distance
  2.4 Conclusion
3 Study area
  3.1 Geology
  3.2 Soils
  3.3 Climate
    3.3.1 Temperature
    3.3.2 Precipitation
    3.3.3 Summary
  3.4 Vegetation
4 Methods
  4.1 Sampling design
  4.2 Data set construction
  4.3 Derived variables
  4.4 Data characterization
  4.5 Reach characterization
  4.6 Defining the path as a variable
  4.7 Hypotheses
  4.8 Assumptions
  4.9 Steps in building the debris slide-flow travel distance prediction model
    4.9.1 Overview
    4.9.2 Selecting the significant variables
    4.9.3 Identifying and testing the model using crisp sets
      4.9.3.1 Normal distribution of the errors
      4.9.3.2 Errors involving homoscedasticity
      4.9.3.3 Errors are not correlated
      4.9.3.4 Model multi-collinearity
      4.9.3.5 Identifying and assessing the outliers
    4.9.4 Regression function assessment in relation to the debris flow ending point
      4.9.4.1 Regression function for events that end/do not end in a stream
      4.9.4.2 Inferences about the two regression functions
    4.9.5 Regression model built using fuzzy sets
      4.9.5.1 Fuzzifying the variables from the final crisp set model
      4.9.5.2 Fuzzy regression analysis
    4.9.6 Regression assessment
  4.10 Risk assessment
5 Results
  5.1 Significance of the correlation between debris flow travel distance and the raw variables
  5.2 Transforming the raw variables
  5.3 Selection of the estimation and prediction data sets
  5.4 Debris flow travel distance model built using crisp sets
    5.4.1 Preliminary regression analysis
      5.4.1.1 Testing the significance of the multiple regression
      5.4.1.2 Identification of outliers and influential observations
      5.4.1.3 Remedial measures for outliers and influential observations
    5.4.2 Final regression analysis
      5.4.2.1 Multiple regression significance
      5.4.2.2 Identification of the outliers and influential observations
      5.4.2.3 Distribution of errors
      5.4.2.4 Homoscedasticity of the errors
      5.4.2.5 Correlation amongst the errors
      5.4.2.6 Selection of the significant variables
      5.4.2.7 Debris flow travel distance prediction model built using crisp sets
      5.4.2.8 Regression assessment in relation to debris flow termination
  5.5 Fuzzification of the selected variables
  5.6 Debris slide-flow travel distance model built using fuzzy sets
  5.7 Regression assessment
6 Discussion and conclusions
  6.1 Discussion
  6.2 Conclusions
    6.2.1 Recommendations for future research
References
Appendix 1: Raw geomorphological variables used in the analysis
Appendix 2: Details of individual debris flows used in the study
Appendix 3: Basic concepts in fuzzy sets theory
Appendix 4: Prediction and confidence intervals for the crisp set regression equation
Appendix 5: Prediction and confidence intervals for the fuzzy set regression equation
Appendix 6: Pearson's correlation coefficients for raw and transformed variables
Appendix 7: Plots of debris flow travel distance against continuous raw variables from Table 5.1
Appendix 8: Result tables

List of Tables

Table 2.1 Classification of mass movements, based on Hansen (1984) and Cruden and Varnes (1996). All are considered to be a form of landslide
Table 2.2 Landslide classification by speed of movement and possible destructive significance in populated areas (modified from Cruden and Varnes, 1996)
Table 3.1 Meteorological stations used to derive climatic data for the study area (Meteorological Service of Canada, February 2002)
Table 3.2 Mean monthly temperatures for the period 1940-1990 at sites in or close to the study area (from Meteorological Service of Canada, February 2002)
Table 3.3 Monthly precipitation at various sites in the study area (multi-annual average) (from Meteorological Service of Canada, February 2002)
Table 3.4 De Martonne climate types
Table 3.5 Thornthwaite climate types, using the climatic indices of Thornthwaite (1931)
Table 4.1 Types of mass movements in the study area
Table 4.2 Distribution of events by bedrock lithology in the study area
Table 4.3 Distribution of debris flow events in different categories
Table 4.4 Identification numbers of events to be sampled in each class
Table 4.5 Elements measured for each event
Table 4.6 Summary statistics
Table 5.1 Pearson's correlation coefficients between debris flow travel distance and selected predictor variables
Table 5.2 Correlations between the transformed variables and debris flow travel distance
Table 5.3 The eight variables representing the interaction of plane curvature, profile curvature and canopy closure (K)
Table 5.4 Parameter estimates of the regression equation
Table 5.5 Robust regression weight
Table 5.6 Regression significance
Table 5.7 Parameter estimates for the final regression
Table 5.8 Robust regression weights
Table 5.9 Tests for the distribution of errors
Table 5.10 Test the errors homoscedasticity
Table 5.11 Error correlation using Durbin-Watson procedure
Table 5.12 Backward selection of the significant variables
Table 5.13 Forward selection of the significant variables
Table 5.14 Stepwise selection of the significant variables
Table 5.15 Regression equation parameters estimate for events that did not end in a stream
Table 5.16 Parameter estimates for regression equation modeling the events that ended in a stream
Table 5.17 Statistics for testing the normality assumption
Table 5.18 White test for homoscedasticity
Table 5.19 Durbin-Watson auto-correlation test
Table 5.20 Weights of the Beacon-Tukey test
Table 5.21 Summary statistics for the variable L_DIF
Table 5.22 Significance of the mixed equation coefficients
Table 5.23 Elements used to perform Levene test
Table 5.24 Significance of the coefficients of the combined function
Table 5.25 Confidence limits of predicted travel distance using crisp sets
Table 5.26 Assessment of the fuzzy regression analysis
Table 6.1 Input data range used in debris flow travel distance prediction

List of Figures

Figure 3.1 Location of mass movements in the southern portion of the study area
Figure 3.2 Geology of the study area (Ministry of Energy and Mines, BC, 2002)
Figure 3.3 Major soils distribution in the study area (Agriculture and Agri-Food Canada, 2002)
Figure 3.4 Monthly average temperature variation for the period 1940-1990 (from Meteorological Service of Canada, February 2002)
Figure 3.5 Mean monthly precipitation at various sites in the study area (from Meteorological Service of Canada, February 2002)
Figure 4.1 Debris slide/avalanche with three reaches
Figure 4.2 Profile of a three-reach event
Figure 4.3 Comparison of the ratio of width bottom to width top for two types of reaches
Figure 4.4 Lateral angle, y, of a reach with trapezoidal shape
Figure 4.5 Event 21-101
Figure 4.6 Longitudinal profile of event 21-101
Figure 4.7 Profile of event 52-13
Figure 4.8 Two different debris slide-flows coded by two different numbers
Figure 4.9 Azimuth variation consistent with kinetic energy of the mass movement
Figure 4.10 Exact collinearity. The coefficients of the regression are undetermined. A change of the regression's parameters plane does not change the errors sum of squares (Belsley et al., 1980)
Figure 4.11 Forcing the predicted value to be larger than the lower bound of the fuzzified measured value
Figure 4.12 Forcing the predicted value to be smaller than the upper bound of the fuzzified measured value
Figure 4.13 Flowchart for selecting the prediction function
Figure 6.1 Dependency of debris flow travel distance on average slope

Acknowledgements

Firstly, I would like to thank my thesis advisor, Dr. John L. Innes, for his trust in my research abilities. His guidance, as well as his academic and logistic support during the research, made the work on the project enjoyable and satisfying. I would also like very much to thank Dr. Jonathan Fannin for his advice during the modeling process. His experience in terrain stability made the mechanisms involved in terrain failure easier to understand. I also thank Dr. Peter Marshall for his enthusiastic support of some of the ideas developed in the research. His guidance on the statistical methods used during the modelling process made the model assessment a much simpler task. I thank Dr. John Nelson for his support during the research.
His advice during the fuzzy modelling stage made the non-linear programming techniques easier to understand. I would also like to thank Dr. Peter Jordan for his help with the landslide sampling design. This research was funded by Arrow IFPA. Kookanee Consultants, including Mr. Paul Jeakins, provided all the logistic support that I needed during the data collection process. Special thanks to Mr. Dave Abrosimoff, who patiently showed me the Arrow Forest District. I would like to thank my wife for her never-ending understanding and support. Lastly, I would like to thank all the people who helped me throughout my education.

Glossary

1. Aridity index: The ratio of a region's multi-annual mean precipitation (in mm) to the sum of 10 and the multi-annual mean temperature (in °C) (de Martonne, 1926)
2. Assumption: A fact or statement (as a proposition, axiom, postulate, or notion) considered true without demonstration (based only on empirical observations or theoretical inferences)
3. Azimuth: Horizontal direction expressed as the angular distance between the north direction and the direction of the object
4. Canopy closure: The percentage of ground area covered by the vertically projected tree crown areas
5. Compact set: A set in which all convergent series have their limit as an element of the set
6. Composite variable: Variable obtained by combining two or more transformed or untransformed raw variables
7. Confidence interval: An estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data
8. Confined landslide: Landslide that has as its path a gully or channel
9. Cook's distance: Statistical test carried out to assess the influence of an individual observation on the dependent variable
10. Correlation coefficient: A number between -1 and 1 which measures the degree to which two variables are linearly related
11. COVRATIO: Statistical test carried out to find the outliers in the data set related to the independent variables that can affect the regression
12. Crisp sets: A set with compounding elements having as the membership function codomain the set {0, 1}
13. Debris flow: Mass of poorly sorted sediment, saturated with water, flowing down as a result of the gravitational forces; both solid and fluid processes influence the motion
14. Debris slide: Small, rapid movement of largely unconsolidated material that slides or rolls downward to produce irregular topography
15. Dependent variable: The predicted or response variable of the functional or stochastic relation
16. DFBETAS: Statistical test carried out to assess the influence of the outliers in the data set related to the dependent variable (detects changes in the coefficients when an observation is eliminated from analysis)
17. DFFITS: Statistical test carried out to assess the influence of the outliers in the data set related to the dependent variable (detects the difference between predicted values determined first with all the observations in the analysis and second with one dropped)
18. Element at risk: Includes any land, resources, environmental values, buildings, economic activities and/or people in the area that may be affected by the landslide hazard
19. Failure surface: Surface that forms the lower boundary of displaced material below the original ground surface
20. Fan: The lowest part of the landslide where the displaced material lies above original ground surface
21. Fine particle content: The percentage of particles smaller than 0.002 mm (silt and clay)
22. Fuzzy number: Fuzzy set with linear symmetric membership function, determined by its center (most likely value) and spread (difference between most likely and less likely value)
23. Fuzzy regression: Regression that uses fuzzy numbers
24. Fuzzy sets: Fuzzy logic is a superset of conventional (Boolean) logic that introduces the concept of partial truth, that is, truth-values between "completely true" and "completely false" (a set with compounding elements having as the membership function co-domain the compact set [0, 1])
25. Hat matrix leverage: Statistical test carried out to find the outliers in the data set related to the independent variables that can affect the regression
26. Homoscedasticity: The condition of equal error variances
27. Horizontal length: The length of the landslide measured horizontally; it equals the product of the slope length and the cosine of the slope
28. Hypothesis: A theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved
29. Independent variable: The predictor or explanatory variable of the functional or stochastic relation
30. Influential observation: Observation that causes major changes in the fitted regression equation if it is excluded from the data set
31. Lack-of-fit: Test statistics used to assess the correctness of the functional part of the model
32. Laminar flow: Flow that moves in parallel layers with one layer of fluid sliding over another
33. Landslide: Any mass movement down the slope of rock, debris or earth
34. Landslide initiation point: Highest point of the landslide, limited above by the undisplaced adjacent material
35. Landslide path: The landslide trajectory
36. Landslide travel distance: The slope distance between landslide initiation point and landslide tip
37. Membership function: Function that associates a value between 0 and 1 (degree of belief) to any element of a set (set of interest)
38. Modified variable: Variable used in regression analysis obtained from a raw variable using different functions or procedures
39. Multi-collinearity: Situation when the predictor variables are correlated among themselves (also known as inter-correlation)
40. Normal distribution: A continuous random variable X, taking all real values in the range (-∞, +∞), is said to follow a Normal distribution with parameters μ and σ, written $X \sim N(\mu, \sigma^2)$, if it has probability density function $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
41. Outlier: An observation in a data set that is far removed in value from the others in the data set
42. Plan/profile curvature: The terrain curvature expressed in terms of convexity/concavity of the horizontal/vertical profile
43. Precipitation effectiveness index: $PE = \sum_{n=1}^{12} 115\left(\frac{P}{T-10}\right)^{10/9}$, where P is the average monthly precipitation (inches) of each month and T is the average monthly temperature (°F)
44. Probability of occurrence: The chance or probability that a landslide hazard will occur
45. Raw variable: Variable expressed in international system units
46. Reach: A linear portion of the event trajectory, having the same geology, constant slope, azimuth, width, volumetric behavior characteristics and confinement type
47. Regression assessment: Set of statistical tests used to evaluate the lack-of-fit of the regression function
48. Regression significance: An estimated measure of the degree to which the relationship represented by the regression equation is representative for the population
49. Rheology: Science dealing with the deformation and flow of matter
50. Ridge regression: Method used to remedy the multi-collinearity problems of the regression equations by modifying the least squares method to allow biased estimators of the regression coefficients
51. Risk: A characteristic of a situation or action wherein one or several uncertain undesirable outcomes are possible
52. Risk assessment: The process of quantifying and describing the risk associated with some situation or action
53. Run-out: Length of the last part of a mass movement characterized only by deposition
54. Shearing stress: Stress that slices matter into parallel sections that slide in opposite directions along their adjacent sides
55. Significance level: The probability of rejecting the null hypothesis when it is true
56. Significant variables: Independent variable used in regression analysis having the significance level smaller than an initial established value
57. Single path landslide: Landslide with trajectory described by a single continuous line
58. Slope length: The landslide length measured on the slope
59. Soil specific weight: Ratio between weight of given volume of material and weight of equal volume of water at 4 °C
60. Temperature efficiency index: $TE = \sum_{n=1}^{12} \frac{T-32}{4}$, where T is the mean monthly temperature measured in °F
61. Tip: Point on the fan farthest from landslide initiation point
62. Travel distance: Distance measured from the initiation point until all the moving material is deposited
63. Turbulent flow: Flow that moves in a chaotic manner rather than in parallel sliding layers
64. Vulnerability: The degree of damage caused by a landslide hazard to the elements at risk

1 Introduction

Landslides are natural phenomena occurring in steep terrain and involving down-slope mass movement. The material involved in the movement contains water, soil, rocks and organic debris. The speed of motion varies from extremely slow (<16 mm/year) to extremely rapid (>5 m/sec), depending on the movement type (Cruden and Varnes, 1996). Without exception, any landslide poses a risk to downslope resources. If the mass movement is rapid, human life can also be threatened (Hansen, 1984). In Canada, over a period of 157 years from 1840 to 1996, a total of 40 landslide disasters were recorded. Of these 40 events, the most destructive were rock avalanches, rockslides and rock falls; they caused 233 deaths. Debris flows and debris avalanches were responsible for 65 deaths. The province with the highest concentration of landslide disasters was British Columbia, with more than 50% of the events that occurred in Canada (Evans, 1997).

What can be done to prevent these kinds of catastrophic events? The first step is to identify the causes of such events. Work done to date enables the areas most susceptible to terrain failure to be identified.
The most common mechanisms triggering landslides have been identified as intense rainfall, rapid snowmelt, water-level change, volcanic eruption and earthquakes (Wieczorek, 1996). All these activities are connected with local conditions, and several algorithms using different input data and analysis techniques have been developed to identify areas susceptible to terrain failure (Turner and McGuffey, 1996; Soeters and van Westen, 1996). The types of analysis used in landslide investigations include inventory, heuristic analysis, statistical analysis and deterministic analysis. Regardless of the analysis type, the terrain attributes that are always considered are slope, geology and land use. As a result of the studies targeting landslide causes and their initiation processes, a series of guidebooks for use in current geomorphological and engineering practice has been developed (Hammond et al., 1992; Mapping and Assessing Terrain Stability Guidebook, 1999).

Most studies of landslide travel distance have considered a small number of attributes describing the landslide path, with the most frequently used attributes being average slope or event geometry (Miao and Ai, 1988; Heim, 1989; Takahashi, 1991; Hammond et al., 1992; Cannon, 1993; Carrara et al., 1995; Hansen, 1996; Corominas, 1996; Megahan and Ketcheson, 1996; Wieczorek, 1996; Wise, 1997; Lau and Woods, 1997; Atkinson and Massari, 1998; Hungr, 1999; Finlay et al., 1999; Robinson et al., 1999; Fannin and Wise, 2001). However, the trajectory of a landslide is determined by a complex suite of factors, and not simply the slope or geometry. Some of the attributes related to terrain geomorphology and flow are often completely ignored in analyses because of the difficulties associated with their mathematical or physical expression, such as the variation in terrain or granulometry along the landslide path or the occurrence of turbulent flow.
Landslide investigation models are normally divided into two groups: landslide initiation and landslide travel distance processes. Landslide initiation is determined by local conditions, but the travel distance is determined by a large number of attributes that describe the local and general conditions. Where the slide stops determines the risk associated with terrain failure. Therefore, if the starting point is known, the terrain stability risk assessment is largely dependent on the travel distance of the mass involved in the movement.

The precision of any model used to predict landslide travel distances usually determines how useful it is. This inference is based on the underlying assumption that data used in the modeling process are measured without error. For landslides, data are usually collected after the event has occurred. Therefore, data suffer from uncertainty arising from the post-event modification of attributes. These uncertainties change the appropriateness of particular investigation methods, so the assumptions used to build landslide travel distance models based on data with crisp meaning are of questionable validity. To redress this problem, techniques that consider the fuzziness of the data need to be used in the modeling process.

The vagueness associated with the data can be resolved using different methods. The most popular methods are poly-logic (Zadeh, 1965) and errors in all variables (Adcock, 1877, 1878; Kummel, 1876, 1879). The latter method, also known as orthogonal regression, eigenvalue methods or total least squares methods, is founded on dual-logic methodology. In this study, fuzzy sets are used to deal with uncertainties related to the data values. Fuzzy sets represent a new stage in data analysis due to the involvement of the poly-logic concept.
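To make the contrast between crisp and fuzzy data concrete, the glossary's description of a fuzzy number (a linear, symmetric membership function fixed by a center and a spread) can be sketched as follows. This is an illustrative sketch only: the function names, the convention that membership falls to 0 at exactly one spread from the center, and the example slope values are assumptions, not values or code taken from the study.

```python
from typing import Tuple


def membership(x: float, center: float, spread: float) -> float:
    """Degree of belief (0 to 1) that x belongs to a symmetric fuzzy number.

    Assumes membership is 1 at the center (most likely value) and
    decreases linearly to 0 at a distance of one spread on either side.
    """
    if spread <= 0:
        raise ValueError("spread must be positive")
    return max(0.0, 1.0 - abs(x - center) / spread)


def level_interval(center: float, spread: float, h: float) -> Tuple[float, float]:
    """Interval of values whose membership is at least h (0 <= h <= 1)."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    half_width = (1.0 - h) * spread
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical slope measured as 30 degrees with a spread of 5 degrees
    for slope in (30.0, 32.5, 36.0):
        print(slope, membership(slope, center=30.0, spread=5.0))
    print(level_interval(30.0, 5.0, h=0.5))
```

A crisp measurement corresponds to the limiting case in which only the center value has full membership and every other value has none; the spread is what lets a post-event measurement carry its own uncertainty into the analysis.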
1.1 Risk assessment of landslides

Risk involves, at a conceptual level, the possibility that an undesirable outcome will occur, plus the uncertainty related to the magnitude and timing of that outcome. If either of these two elements is missing, there is no risk. The most common approach defines risk as the product of the probability (P) that an undesired outcome (hazard) will occur and the consequences (C) of that occurrence:

$R = P \times C$ (1.1)

Terrain Stability Mapping in British Columbia (1997) defines a hazard as "a condition or event that puts something or someone, in a position of loss or injury or in a position of potential loss or injury". The probability that an adverse outcome will occur is called the probability of occurrence and is defined as "the chance that a landslide will occur". The probability of occurrence is usually determined through landslide initiation models.

A consequence is the result of the undesirable event that occurred. It depends on the property or human life being affected (the element at risk), but is also related to their vulnerability to the landslide. Consequence is usually determined as the product of the elements at risk (E) and their vulnerability (V):

$C = E \times V$ (1.2)

If the initiation point of a landslide is known, the vulnerability depends only upon the event travel distance. Therefore, the elements used to calculate the travel distance identify the risk. If the mass movement hits the element at risk, the value associated with that element is affected. Consequence is therefore determined through landslide travel distance models.

1.2 Goal of this study

The goal of this study is to calculate the debris slide-flow travel distance.
Three objectives will help achieve this goal: • Development of an empirical relation between debris slide-flow travel distance and path attributes; • Determination of as narrow a confidence interval for the predicted value as possible; and • Incorporation of uncertainty (vagueness) in the analysis. 4 2 Literature review The investigation of landslides has a relatively short history compared with other sciences studying threats to human lives and goods. Some of the first investigations related to landslides were undertaken at the start of 19 th century and are related to the Rossberg event (1806) in Switzerland, and the Bindon event (1839) in England (Voight, 1978; Heim, 1989; Turner and Jayaprakash, 1996). In this section I review the literature on landside classification, debris flow initiation and debris flow travel distance. 2.1 Landslide classification Mass movements have been described and interpreted using different vocabularies, and a range of classification systems (Innes, 1983). Even now, some scientists use old terms like sturzstorms (correctly spelt Sturzstrom), (incorrectly) translated as "downfall storm" (Corominas, 1996), for rock avalanches. Such hybridizations of English and German terminology appear to have limited value, especially because of the errors in their use and translation. In this study, the classification proposed by Varnes (1978) and Cruden and Varnes (1996) has been adopted. According to Hansen (1984) and Cruden (1991), a landslide is any downslope mass movement of rock, debris or earth. Table 2.1 Classification of mass movements, based on Hansen (1984) and Cruden and Varnes (1996). A l l are considered to be a form of landslide. 
Material type \ Movement type | Fall | Topple | Slide | Spread | Flow
Rock | Rock fall | Rock topple | Rock slide | Rock spread | Rock flow
Soil, coarse particles predominant (debris) | Debris fall | Debris topple | Debris slide | Debris spread | Debris flow
Soil, fine particles predominant (earth) | Earth fall | Earth topple | Earth slide | Earth spread | Earth flow

This thesis focuses on particular forms of landslides as defined by Hansen and Cruden: debris flows, debris slides and their combinations. Based on the velocity of movement, landslides have been classified into seven classes (Hansen, 1984; Cruden and Varnes, 1996), shown in Table 2.2. Debris flows and debris slides vary from rapid to extremely rapid movements (Resources Inventory Committee, 1997).

Table 2.2 Landslide classification by speed of movement and possible destructive significance in populated areas (modified from Cruden and Varnes, 1996)

Class | Description | Typical velocity | Possible destructive significance
1 | Extremely slow | < 16 mm/year | Construction possible with precautions
2 | Very slow | 16 mm/year - 1.6 m/year | Some structures undamaged
3 | Slow | 1.6 m/year - 13 m/month | Remedial construction can be undertaken
4 | Moderate | 13 m/month - 1.8 m/hour | Temporary / insensitive structures temporarily maintained
5 | Rapid | 1.8 m/hour - 3 m/min | Evacuation possible; structure and goods destroyed
6 | Very rapid | 3 m/min - 5 m/sec | Some lives may be lost
7 | Extremely rapid | > 5 m/sec | Many deaths; escape unlikely

Landslide risk assessment is usually separated into two parts: landslide initiation and landslide travel distance.
Each uses different procedures and mechanisms to explain landslide behaviour, including fluid mechanics (Takahashi, 1981, 1991; Innes, 1983; Leshchinsky and Huang, 1992; Hungr, 1995; Iverson, 1997), statistics (Chowdhury and Xu, 1993; Fannin and Wise, 1995; Corominas, 1996; Atkinson and Massari, 1996; Hungr, 1999), topography (Montgomery and Dietrich, 1994; Wu and Sidle, 1995; Carrara et al., 1995) or vegetation (Greenway et al., 1984; Sidle et al., 1985; Wu et al., 1994; Watson et al., 1994; Helliwell, 1994).

2.2 Debris flow initiation

The investigation of debris flow initiation must differentiate between the conditions required for an event to occur and the triggering activities (Innes, 1983). The minimum conditions for debris flows to occur are steep slopes, unconsolidated residual or transported material and high pore water pressures (Innes, 1983). Flows may be triggered by meteorological conditions or by a range of extrinsic factors, including earthquakes, volcanic eruptions, deforestation, and the excavation of a slope or its toe (Varnes, 1978; Wieczorek, 1996).

Debris flow initiation is often investigated using fluid mechanics theory (Innes, 1983; Hungr et al., 1984; Takahashi, 1991; Iverson, 1997), but there are also models involving statistics (Atkinson and Massari, 1996), vegetation combined with a topographic index (Wu and Sidle, 1995), and forest practices (Fannin et al., 1996). Most studies investigating debris flow initiation combine fluid mechanics with a specific application from another field, such as soil mechanics (Terzaghi, 1943; Takahashi, 1991; Iverson, 1997) or root strength (Wu and Sidle, 1995). All studies include the determination of the event initiation slope, θ. Some rheological studies also include the degree of packing of the sediment (c*), density of the sediment (σ), density of the fluid (ρ) and internal friction of debris in the bed (φ) in the analysis.
An example (from Takahashi, 1981, 1991; Innes, 1983) is given in Equation 2.1:

\[ \tan\theta = \frac{c_*(\sigma - \rho)}{c_*(\sigma - \rho) + \rho}\,\tan\varphi \tag{2.1} \]

The greatest weakness of rheological studies is the difficulty associated with the measurement of the parameters that appear in Equation 2.1. This problem makes modeling techniques that involve fluid mechanics unattractive for practitioners.

If the infinite slope model is used to investigate debris flow initiation, then rheological parameters are substituted with a series of variables defined by physical dynamics (Nelkon, 1974; Halliday and Resnick, 1986). The main mathematical tool used in risk analysis is the Factor of Safety (FS), defined as:

\[ FS = \frac{\sum \text{Resistant forces}}{\sum \text{Driving forces}} \tag{2.2} \]

The terrain is considered stable if FS is larger than 1. A series of models are used to calculate FS, ranging from the very simple (Powrie, 1997), which uses only slope (β), soil depth (z), pore water pressure (u), critical state strength (φ'crit) and soil unit weight (γ):

\[ FS = \left(1 - \frac{u}{\gamma z \cos^2\beta}\right)\frac{\tan\varphi'_{crit}}{\tan\beta} \tag{2.3} \]

to more complex models (Hammond et al., 1992) that use, in addition to the above variables, an apparent cohesion attributed to tree root strength (C_r), saturated soil thickness (z_w), tree surcharge (q_0), soil cohesion (C_s), soil strength (φ'), soil unit weight dry (γ_d), moist (γ) and saturated (γ_sat), and water unit weight (γ_w):

\[ FS = \frac{C_r + C_s + \cos^2\beta\,\left[q_0 + \gamma(z - z_w) + (\gamma_{sat} - \gamma_w)\,z_w\right]\tan\varphi'}{\sin\beta\cos\beta\,\left[q_0 + \gamma(z - z_w) + \gamma_{sat}\,z_w\right]} \tag{2.4} \]

Regardless of the model used, the underlying assumptions of the infinite slope model make the results unreliable.
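As an illustration of how the simple expression in Equation 2.3 behaves, the following minimal Python sketch evaluates it for one set of inputs. The function name, the argument names and all numerical values are assumptions made for this example only; they are not taken from Powrie (1997) or from the study data.

```python
import math


def infinite_slope_fs(beta_deg, z, u, phi_crit_deg, gamma):
    """Factor of Safety from Equation 2.3 (simple infinite slope model).

    beta_deg     : slope angle (degrees)
    z            : soil depth (m)
    u            : pore water pressure on the failure plane (kPa)
    phi_crit_deg : critical state friction angle (degrees)
    gamma        : soil unit weight (kN/m^3)
    """
    beta = math.radians(beta_deg)
    phi = math.radians(phi_crit_deg)
    # Reduction of the effective normal stress caused by pore water pressure
    pore_term = 1.0 - u / (gamma * z * math.cos(beta) ** 2)
    return pore_term * math.tan(phi) / math.tan(beta)


# Hypothetical inputs, chosen only to show the effect of pore pressure
fs_dry = infinite_slope_fs(beta_deg=30.0, z=2.0, u=0.0, phi_crit_deg=35.0, gamma=18.0)
fs_wet = infinite_slope_fs(beta_deg=30.0, z=2.0, u=9.0, phi_crit_deg=35.0, gamma=18.0)
print(f"FS (dry) = {fs_dry:.2f}, FS (u = 9 kPa) = {fs_wet:.2f}")
```

With these assumed inputs the dry slope has FS above 1 and the same slope with pore pressure has FS below 1, which is the sensitivity that Equation 2.3 is meant to capture.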
The infinite slope model is based on the following assumptions (Hammond et al., 1992; Powrie, 1997):
• the failure surface is planar;
• the failure plane has infinite extent and is uniform;
• the failure plane and the phreatic surface are parallel to the ground surface;
• a single soil layer is considered;
• the soil is homogeneous in all parameters used in calculation; and
• the two-dimensional analysis models reality (which has three dimensions).

Because of these restrictive assumptions, FS can be either over- or underestimated (Iverson and Major, 1986). As an alternative, for event initiation, a series of models that consider a circular slip surface (Fellenius, 1927; Bishop, 1955) or non-circular slips (Morgenstern and Price, 1965; Janbu, 1973) have been used:

Bishop's method:

\[ FS = \frac{1}{\sum_{i=1}^{n} h_i \sin\alpha_i}\,\sum_{i=1}^{n}\frac{h_i\,(1 - r_u)\tan\varphi'_{crit}}{\cos\alpha_i + \dfrac{\tan\varphi'_{crit}\,\sin\alpha_i}{FS}} \tag{2.5} \]

Janbu's method:

\[ FS = \frac{1}{\sum_{i=1}^{n} w_i \tan\alpha_i}\,\sum_{i=1}^{n}\left[(w_i - u_i b_i)\tan\varphi'_{crit}\,\frac{1 + \tan^2\alpha_i}{1 + \dfrac{\tan\varphi'_{crit}\,\tan\alpha_i}{FS}}\right] \tag{2.6} \]

where
α: angle of inclination of slip surface to the horizontal;
h: average height of a slice (in Bishop's method);
b: slice width;
w: weight of soil element; and
r_u: pore pressure ratio, r_u = u / (γh).

The main problems with these models are the assumptions that the soil has homogeneous properties in the area investigated and that the failure surface can be expressed by relatively simple functions. Even if some of the assumptions are relaxed (cf. Hammond et al., 1992), the models that use circular or non-circular failure surfaces will still predict with a low degree of accuracy.

If statistical methods are used, the main tool that is applied is multiple regression analysis, with emphasis on the general linear model (Carrara et al., 1991; Wang and Unwin, 1992; Atkinson and Massari, 1996) or on the fundamentals of probability distributions (Bergado et al., 1988).
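Returning briefly to the slip-surface methods given earlier in this subsection: FS appears on both sides of Bishop's expression (Equation 2.5), so it cannot be evaluated directly. A common way to handle this is fixed-point iteration, sketched below in Python. This is a minimal sketch under stated assumptions: slices of equal width, a single r_u for all slices, and a hypothetical set of slice heights and base inclinations; the function name and its parameters are illustrative and do not come from the cited sources.

```python
import math


def bishop_fs(heights, alphas_deg, phi_crit_deg, r_u, tol=1e-10, max_iter=200):
    """Solve Equation 2.5 for FS by fixed-point iteration (equal slice widths)."""
    tan_phi = math.tan(math.radians(phi_crit_deg))
    alphas = [math.radians(a) for a in alphas_deg]
    # Denominator of Equation 2.5: sum of h * sin(alpha) over all slices
    driving = sum(h * math.sin(a) for h, a in zip(heights, alphas))
    fs = 1.0  # starting guess
    for _ in range(max_iter):
        resisting = sum(
            h * (1.0 - r_u) * tan_phi / (math.cos(a) + tan_phi * math.sin(a) / fs)
            for h, a in zip(heights, alphas)
        )
        fs_new = resisting / driving
        if abs(fs_new - fs) < tol:
            return fs_new
        fs = fs_new
    raise RuntimeError("FS iteration did not converge")


# Hypothetical slices: average heights (m) and base inclinations (degrees)
heights = [1.0, 2.0, 2.5, 2.0, 1.0]
alphas_deg = [45.0, 35.0, 25.0, 15.0, 5.0]
print(round(bishop_fs(heights, alphas_deg, phi_crit_deg=30.0, r_u=0.2), 3))
```

As a consistency check, when every slice has the same inclination and r_u is zero, the iteration converges to tan φ'crit / tan α, which is the dry infinite slope result.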
Statistical models rely on the assumption that the predictor variables used in modeling are significantly correlated with event initiation and that the model compensates for the gaps generated by using a limited number of variables for prediction. The ease of implementation of this model in current practice makes it very attractive. However, there is a large amount of uncertainty associated with the prediction.

New methods are being developed both in mathematics and physics that lead to a more detailed and precise investigation, such as neural networks, operational research, fuzzy sets, Petri nets, catastrophe theory and game theory. Further studies combining different methods to address specific problems seem to be appropriate for debris flow initiation investigations.

2.3 Debris flow travel distance

After the debris flow initiation point has been established, there is a need to know how far the debris flow will travel. A series of models and theories have been developed and used to answer this question. The most popular methods used to predict debris flow travel distance are based on dynamics, rheology, hydrology, topography and statistics.

The simplest model for debris flow travel distance is based on dynamics. An example of such a model is presented in Equation 2.7 (Fang and Zhang, 1988):

[Equation 2.7 is not legible in the source]

where
S: debris flow travel distance;
v_max: maximum velocity of the mass movement;
μ: mobility;
β: inclination of the sliding path;
f': kinetic friction;
g: gravitational acceleration; and
h_max, l_max: maximum height and horizontal travel distance of the centre of the sliding mass.

Equation 2.7 has more of a didactic role than a practical one. The parameters that it involves describe the movement mechanism very poorly.
However, the main reason that this type of equation is unattractive, apart from it being mechanically inappropriate, is that only three elements fully describe the travel distance: β, h_max and l_max. The element at risk can be easily identified using Pythagoras' theorem or simple trigonometry. Therefore Equation 2.7 does not bring any new information to the analysis. In general, approaches to the prediction of debris flow travel distances that rely solely on dynamics provide unreliable results because of the restrictive set of assumptions used (e.g., homogeneous sliding material, the material moves downslope as a rigid mass, the kinetic friction is constant during the movement).

Rheological approaches relax some of the constraints imposed on the moving material. The material is usually considered as having both rigid and fluid movement (e.g., the Bingham equation). An alternative model used in rheological approaches to debris flow travel distance is the Bagnold equation (1954). Rheological models attempt to solve mass movement travel distance in a deterministic manner. This type of model (Takahashi, 1981, 1991; Innes, 1983; Iverson, 1997) introduces variables that characterize the flow movement, such as viscosity, apparent density of the fluid when it incorporates suspended particles, and particle density.

The theoretical framework used to determine debris flow travel distance is based on fluid kinematics. Two methods are generally used to describe the fluid motion by mathematical analysis: the Lagrangian method and the Eulerian method. The Lagrangian method describes the fluid based on the assumption that the motion of a particle is completely specified if the particle coordinates are known.
The following set of equations is used in a three-dimensional space:

\[ x = F_1(a,b,c,t), \qquad y = F_2(a,b,c,t), \qquad z = F_3(a,b,c,t) \tag{2.8} \]

where
x, y, z: the particle coordinates;
a, b, c: independent variables;
t: time; and
F_1, F_2, F_3: functions describing the particle motion.

Equation 2.8 describes the spatial position (x, y, z) of any fluid particle at different times, expressed relative to the initial position (x_0 = a, y_0 = b, z_0 = c) at the initial time t = t_0.

The Eulerian method describes the flow characteristics (e.g. velocity, acceleration) at various points on the fluid particle flow path in a different way. In three-dimensional space, the equations that are used are:

\[ u = f_1(x,y,z,t), \qquad v = f_2(x,y,z,t), \qquad w = f_3(x,y,z,t) \tag{2.9} \]

where
u, v, w: flow characteristics (usually velocity) components;
x, y, z: three-dimensional space components;
t: time; and
f_1, f_2, f_3: functions describing the relationship between flow characteristics and three-dimensional space.

The two methods are related by the following set of equations:

\[ u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt}, \qquad w = \frac{dz}{dt} \tag{2.10} \]

The Lagrangian method yields a complete description of the fluid particle path, but the mathematical difficulties encountered in solving the set of Equations 2.8 make the method impractical. The fluid motion expressed in terms of flow velocity through time at various points is of great practical significance, therefore the Eulerian method is commonly used. The mathematical simplicity of the method is an added advantage, even if the method itself does not lead to a complete description of the fluid particle path (Pao, 1961).

In the one-dimensional flow case, the Eulerian method leads to the following dynamic equation for the steady flow of a non-viscous fluid:

\[ \frac{1}{\rho}\frac{dp}{ds} + g\frac{dz}{ds} + v\frac{dv}{ds} = 0 \tag{2.11} \]

where
p: fluid pressure;
s: length;
ρ: density of the fluid;
z: elevation;
v: fluid velocity; and
g: gravitational acceleration.

Equation 2.11 is termed Euler's equation for one-dimensional flow.
If the fluid is homogeneous and incompressible, the integration of Euler's equation yields the relationship:

\[ \frac{p}{\rho} + gz + \frac{v^2}{2} = \text{constant} \tag{2.12} \]

which is known as Bernoulli's equation (Pao, 1961; Kaufmann, 1963).

Real fluids are characterized by viscosity, which is mainly the result of inter-particle interactions. The most important role of viscosity in fluid motion is in the shear stress, τ. Unlike elastic solids, where the shear stress depends on the magnitude of the deformation, the shear stress of a viscous fluid is proportional to the deformation rate of the fluid. Isaac Newton first formulated the law that governs fluid viscosity:

\[ \tau = \mu\frac{du}{dy} \tag{2.13} \]

where μ: viscosity.

The simplest model determining debris flow travel distance assumes that the flow is laminar, that the fluid is incompressible, and that the flow occurs in a straight channel with parallel boundaries (Pao, 1961):

\[ L = \frac{\Delta p\, b^2}{12\,\mu V} \tag{2.14} \]

where
L: debris flow travel distance;
Δp: pressure drop between initiation and deposition point;
V: average velocity of the fluid across a transverse profile;
μ: viscosity; and
b: distance between the channel boundaries.

This model is extremely simple and its reliance on the assumptions stated above makes it of little practical value. Consequently, a more complex model has been developed. The improved version of Equation 2.14 is termed the Hagen-Poiseuille theory, and it explains the laminar flow of incompressible fluids in a circular channel. The debris flow travel distance determined using the Hagen-Poiseuille equation could be stated as:

\[ L = \frac{\Delta p\, D^2}{32\,\mu V} \tag{2.15} \]

where D is the diameter of the circular channel. For a semicircular channel of slope β with radius D/2 = R, the flow velocity (u) at radial distance r from the channel axis is (Hungr et al., 1984):

\[ u = \frac{\rho g\,(R^2 - r^2)\sin\beta}{4\mu} \tag{2.16} \]

Equations 2.15 and 2.16 are based on the assumptions that the fluid viscosity follows Newton's law of viscosity (Equation 2.13), and that there is no relative motion between fluid particles and boundaries (i.e.
no slip of the fluid particles at the boundary). These results (Equations 2.15 and 2.16) are more complex than Equation 2.14, yet do not introduce a significant amount of new information.

In addition to the practical problems encountered with the modeling activity described above, there is an even more significant problem. None of the approaches considers variation in the terrain. There is always an assumption of a constant slope. Consequently, the models described above are difficult to use in practice.

Johnson and Rodine (1984) replaced Newton's viscosity law with a combination of rigid dynamics and fluid mechanics (the Bingham equation):

\[ \tau = k + \mu\frac{du}{dy} \tag{2.17} \]

where k: the shear strength of the material.

Equation 2.17 yields the following formula for the velocity distribution within a viscous debris flow:

\[ u = \frac{0.5\,\gamma\sin\beta\,(T^2 - y^2) - k\,(T - y)}{\mu} \tag{2.18} \]

where
γ: unit weight of the debris; and
T: debris thickness.

If y > T_k, the critical thickness for a Bingham material, then Equation 2.18 is used. However, even this approach is difficult to use in practice because it does not consider variations in slope.

R.A. Bagnold considered flows containing different particle sizes. He tried to explain the importance of the flow composition for the fluid movement (he assumed spherical, perfectly elastic particles with uniform dimensions and uniform average dispersion). The equation for flow velocity using Bagnold's ideas is (Hungr et al., 1984):

[Equation 2.19 is not legible in the source; it expresses the flow velocity in terms of the grain concentration per unit volume (c), the particle diameter and the linear concentration of particles (λ)]

Takahashi (1981, 1991) and Hungr et al. (1984) developed a debris flow travel distance model combining dynamics and fluid mechanics theory. The equation used to determine the event travel distance is:

\[ L = \frac{U^2}{G}, \qquad U = u_u\cos(\theta_u - \theta_d)\left[1 + \frac{g\,h_u\cos\theta_u\,\bigl(\rho_m + c_u k_a(\sigma - \rho_m)\bigr)}{2\,u_u^2\,\bigl(\rho_m + c_u(\sigma - \rho_m)\bigr)}\right], \qquad G = g\left[\frac{c_u\cos\theta_d\tan\alpha\,(\sigma - \rho_m)}{\rho_m + c_u(\sigma - \rho_m)} - \sin\theta_d\right] \tag{2.20} \]

where
k_a: ratio between the longitudinal section area of a moving earth block and the square of the depth of the surface water flow behind the moving earth block;
α: angle of particle encounter, analogous to the kinetic friction angle; and
θ_d, θ_u: slope of the deposition/upstream section of the debris flow.

Statistical procedures have been used to overcome the difficulties involved in measuring rheological variables. Statistical methods try to express the debris flow travel distance using different attributes that are easy to measure, such as slope, azimuth, drainage area, vegetation type and species, geology or soil type. Statistical procedures (usually multiple regression) may be practical, but the results are uncertain because rheology is excluded.

Simple models include a maximum of four predictor variables (Cannon, 1993; Megahan and Ketcheson, 1996; Corominas, 1996; Finlay et al., 1999; Fannin and Wise, 1995, 2001). The dependent variable is sometimes transformed (Cannon, 1993; Corominas, 1996; Finlay et al., 1999), or different techniques may be used to infer debris flow travel distance (Megahan and Ketcheson, 1996; Fannin and Wise, 1995, 2001; Hungr, 1999).

As debris flow travel distance depends on terrain configuration and debris flow properties, one way to express this is to consider the volume variation along the path. All the statistical studies that have used this method have included the volume of either a part or the whole event. This approach is affected by recursivity, as noted by Corominas (1996). Even though the results obtained using this recursive technique are invalid, the lack of options for expressing the terrain configuration has led to the use of this method in practice.
Cannon (1993) used Equation 2.25 to determine debris flow travel distance on an elementary uniform terrain (cell):

\[ \log\frac{V_i - V_f}{D} = 0.14\log R - 1.4\log\beta + 2.16 \tag{2.25} \]

where
V_i - V_f: volume change during debris flow movement on the cell;
V_i: debris flow volume that enters the cell;
V_f: debris flow volume that leaves the cell;
D: length of the debris flow path throughout the cell; and
R: radius of the channel.

Unfortunately, there is insufficient detail to ensure that Equation 2.25 meets all the assumptions required for the prediction of debris flow travel distance. In addition, the model assumes that methods to calculate entrainment and deposition are known, which is not the case. Back analysis is used to determine the travel distance from Equation 2.25 (Duncan, 1996). As the error in estimating the dependent variable, log((V_i - V_f)/D), is 26% (Cannon, 1993) and the dependent variable is not a linear function but a logarithmic one, back analysis has to be performed using probability theory. Therefore, the confidence interval associated with the travel distance prediction, D, will become wider than that supplied by the direct transformation of the dependent variable, log((V_i - V_f)/D), into original units. This will provide an unreliable result for the predicted debris flow travel distance.

Finlay et al. (1999) used Equation 2.26 to predict landslide travel distance for cut slope events:

\[ \log L = 0.062 + 0.965\log H - 0.558\log\beta \tag{2.26} \]

where
L: landslide travel distance;
H: height of failure, or elevation difference between landslide scarp and toe; and
β: slope angle.

The coefficient of determination for Equation 2.26 was 0.85 and the number of events (515 debris flows) used in the analysis was quite large. As with Cannon (1993), insufficient details were given to infer whether all the regression assumptions were met.
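To make the scale issue concrete, the short Python sketch below evaluates Equation 2.26 on the log scale for one hypothetical cut slope. It is only an illustration: the function name is invented for this example, base-10 logarithms are assumed, and the units of H and of the slope angle are not stated here, so the input values are placeholders. The last step is a naive back-transformation of the point estimate, shown only to mark where the difficulty with the confidence interval arises.

```python
import math


def finlay_log_travel_distance(height, slope_angle):
    """log10(L) from Equation 2.26 for a failure height H and a slope angle."""
    return 0.062 + 0.965 * math.log10(height) - 0.558 * math.log10(slope_angle)


# Hypothetical cut slope (placeholder values, units as in the original study)
log_L = finlay_log_travel_distance(height=10.0, slope_angle=45.0)

# Naive back-transformation of the point estimate only; an interval computed
# for log L does not carry over to L in this simple way.
naive_L = 10.0 ** log_L
print(f"log L = {log_L:.3f}, naive L = {naive_L:.2f}")
```

The regression and its coefficient of determination refer to log L; any statement about L itself needs one of the retransformation approaches discussed in this chapter.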
As the event travel distance, L, was transformed, the confidence interval associated with the predicted value, log(L), cannot be transformed back into the original units using an exponential function (Mihoc and Firescu, 1966). (Several methods used for inference based on transformed variables are presented at the end of this chapter.) As the dependent variable was not transformed back to the original units, the confidence interval of the predicted landslide travel distance was very wide, making the use of the equation unattractive.

In the same way as Finlay et al. (1999), Megahan and Ketcheson (1996) developed an equation based on stepwise multiple regression:

\[ \log(L) = 0.637 + 0.554\log(V) - 0.139\log(\mathit{Obstr}) + 0.501\log(\mathit{Gradient}) + 0.163\log(\mathit{SourceArea}) \tag{2.27} \]

where
Obstr: length of obstructions measured normal to the fall line of the hillside per 30 m of slope;
Gradient: hillside gradient; and
SourceArea: run-off contributing area.

The coefficient of determination of Equation 2.27 was 0.91. This value indicates that the confidence interval for the predicted value, log(L), was narrow. However, as the event travel distance was transformed, the confidence interval for the debris flow travel distance in original units was too large to make the equation useful.

Corominas (1996) used Equation 2.28 for prediction:

\[ \log\frac{H}{L} = -0.105\log V - 0.012 \tag{2.28} \]

The coefficient of determination is 0.763 for the 71 events considered in the study. The confidence interval associated with the predicted value refers to the transformed dependent variable, log(H/L), and not to the event travel distance. The relatively small coefficient of determination indicates that the confidence interval for the event travel distance would be quite large, as presented at the end of this chapter.

Hansen (1996) replied to Corominas' (1996) article with a comment that needs to be examined.
He transformed back into original units the equation used to describe all events (Equation 1 in the article, Equation 2.29 in this thesis), which had a coefficient of multiple determination of 0.625:

\[ \log\frac{H}{L} = -0.085\log V - 0.047 \tag{2.29} \]

into Equation 2.30 (Equation 2 in the article):

\[ \frac{H}{L} = 0.897\,V^{-0.085} \tag{2.30} \]

This transformation:

\[ \log\frac{H}{L} = -0.085\log V - 0.047 \;\Rightarrow\; \frac{H}{L} = 10^{-0.085\log V - 0.047} = 10^{-0.047}\,V^{-0.085} = 0.897\,V^{-0.085} \tag{2.31} \]

is not allowed in stochastic dependency, which is based on probability distributions. The inverse transformation can be applied only to a bijective (functional) relation. Because a regression expresses a stochastic relation rather than a bijective one, this transformation is invalid.

Two studies have treated the confidence interval for the predicted value appropriately: Hungr (1999) and Fannin and Wise (2001). Both studies developed a regression equation to predict debris flow travel distance, with different predictor variables being used in each study. Hungr (1999) used terrain class, tributary drainage area to each reach and degree of lateral confinement of the flow path as predictors and predicted the volume. In contrast, Fannin and Wise (2001) used reach length, reach width, azimuth and slope as predictor variables and predicted the event volume on a reach-by-reach basis. These studies considered the terrain variation, expressed as the volume variation along the path, to determine the event travel distance. The probability distribution was used to calculate the predicted debris flow travel distance. Even though both methods use correct procedures to determine the confidence interval, the value supplied for prediction is associated with a wide confidence interval.

If linear regression is used for debris flow travel distance prediction, the dependent variable representing the event travel distance can be either transformed or not.
When transforming back into the raw units, the following cases are possible:
• if the correlation coefficient is 1, the retransformation is the inverse of the transformation used (the dependency is no longer stochastic but functional);
• retransformation using a probability distribution (as done by Hungr, 1999; Fannin and Wise, 1995, 2001);
• retransformation using a regression equation between the transformed and raw variable;
• if the transformation is linear, the retransformation into original units is the inverse of the original linear function used (which is also linear).

2.4 Conclusion

There has been a considerable amount of work done on the prediction of debris flow travel distance, with some studies focusing on runout (e.g. Hungr, 1995). From this, it is possible to conclude that rheological models are difficult to use in practice because of the problems associated with the measurement of the elements describing flow behaviour, because the assumptions are far from reality, and because of the failure to include terrain configuration in the analysis. Statistical models are easy to implement but the confidence intervals associated with the predicted values are too large to provide useful results; in some cases the confidence intervals are larger than 200% of the actual length (Neter et al., 1996).

3 Study area

The BC Ministry of Forests, Research Branch, conducted an inventory of the landslides present in the Arrow Forest District. The area investigated amounts to about 1.4 million ha, of which about 900,000 ha is provincial forest and private land, with the rest comprising lakes, glaciers and rocks. The landslides are distributed between Castlegar in the south and Nakusp in the north, Arrow Lake in the west and Kootenay Lake in the east. Landslides were identified using aerial photographs with a scale of 1:20,000. When the study started, of the 1784 slides identified, 582 located in the southern part of the District had been mapped.
These are shown in Figure 3.1.

[Figure 3.1 legend: landslides, towns, TRIM map sheets, lakes; scale bar 0-20 kilometers]
Figure 3.1 Location of mass movements in the southern portion of the study area

Only the mass movements located in the south of the study area were included in this study. This selection was made because the central and northern events had not been mapped precisely when the study started. Therefore, the total number of events included in the sampling design was 582.

3.1 Geology

The geology of the study area is shown in Figure 3.2.

[Figure 3.2 legend:
1. Metamorphic rocks, Monashee complex (Proterozoic to Paleozoic)
2. Metamorphic rocks (Carboniferous to Permian)
3. Metamorphic rocks (Carboniferous to Permian)
4. Quartz monzonitic (Paleocene)
5. Granite, alkali-feldspar granite (Paleocene to Eocene)
6. Granite, granodiorite (Middle Jurassic)
7. Granodiorite (Cretaceous)
8. Basaltic volcanic rocks (Jurassic)
9. Basaltic volcanic rocks (Carboniferous to Permian)
10. Quartzite (Proterozoic to lower Paleozoic)
11. Limestone, slate, siltstone, argillite (Mississippian to Permian)
12. Mudstone, siltstone, shale, fine clastic (Jurassic)
13. Limestone, slate, siltstone, argillite (Triassic)
14. Quartzite (Proterozoic to Cambrian)]
Figure 3.2 Geology of the study area (Ministry of Energy and Mines, BC, 2002)

A variety of rock types are found in the area, including intrusive, metamorphic and sedimentary types (Figure 3.2). Intrusive and metamorphic rocks dominate between Arrow Lake and Kootenay Lake; sedimentary rocks outcrop to the north and east of Silverton. In the Slocan watershed, intrusive acidic rocks occupy more than 70% of the surface, with granite and granodiorite being the most common.

3.2 Soils

The soils identified in the area are humo-ferric podzols and dystric brunisols. Their distribution is shown in Figure 3.3.

[Figure 3.3 legend: 1. Humo-ferric podzolic; 2. Dystric brunisolic; 3. Glacier or rock; 4. Lakes]
Figure 3.3 Major soils distribution in the study area (Agriculture and Agri-food Canada, 2002).

The dystric brunisols consist of acidic brunisolic soils with a weak development of the organic-mineral surface horizon. These soils occur on parent material with a low base status (i.e. granite, granodiorite). The humo-ferric podzols occur under forest vegetation and their properties are accentuated by coniferous forest species. Humo-ferric podzols typically develop from coarse- to medium-textured acidic lithologies. Soils are considered in this study because they supply the main material involved in mass movement and also provide support for the vegetation.

3.3 Climate

The climate in the area is characterized by hot summers and cold winters. The temperature and precipitation data were obtained from several stations covering the perimeter and centre of the study area. The stations are located mostly in areas where mass movements are rare, and therefore the meteorological and climatic values characterizing the locations affected by landslides have to be derived by interpolation. The following stations were considered: Kaslo, South Slocan, New Denver, Nakusp, Castlegar A and B, and Kootenay NP West Gate. Details for each station are presented in Table 3.1. The climatic elements considered were: mean monthly temperature (°C), rainfall (mm m⁻²), snowfall (cm m⁻²) and overall precipitation (mm m⁻²).

Table 3.1 Meteorological stations used to derive climatic data for the study area (Meteorological Service of Canada, February 2002)

Station name: New Denver; Kootenay West Gate; South Slocan; Kaslo; Nakusp; Castlegar A; Castlegar B
Station id: 1145460; 1154410; 1147620; 1143900; 1145300; 1141450; 1141457
Latitude (North): 50°38'; 49°27'; 49°55'; 50°15'; 49°20'; 4 9 o i' 2
Longitude (West): 117°22'; 116°04'; 117°31'; 116°55'; 117°48'; 117°40'; 117°47'
Altitude (m): 1219; 899; 457; 591; 457; 495; 476

3.3.1 Temperature

The mean monthly temperature for the period 1940-1990 is presented in Table 3.2.
Table 3.2 Mean monthly temperatures (°C) for the period 1940-1990 at sites in or close to the study area (from Meteorological Service of Canada, February 2002).

Month | Kaslo | Kootenay | S. Slocan | Castlegar A | Castlegar B | New Denver | Nakusp
Jan | -3.2 | -9 | -3.6 | -3.2 | -2.4 | -3.3 | -3.2
Feb | -0.6 | -4.6 | -0.7 | -0.7 | -0.2 | -0.7 | -1.3
Mar | 2.6 | 1.5 | 3 | 3.7 | 3.9 | 3 | 2.1
Apr | 6.9 | 6.6 | 7.9 | 8.3 | 8.4 | 7.3 | 6.9
May | 11.4 | 11.5 | 12.4 | 13 | 12.8 | 12.1 | 11.9
June | 15.3 | 15.4 | 16.3 | 16.9 | 16.8 | 15.9 | 15.8
July | 18 | 18 | 19.2 | 19.9 | 19.8 | 18.6 | 18.3
Aug | 17.9 | 17.5 | 19.2 | 19.8 | 19.6 | 18.7 | 17.8
Sep | 12.8 | 11.7 | 13.8 | 14.4 | 14.2 | 13.4 | 12.7
Oct | 7.2 | 5.1 | 7.6 | 7.8 | 8.1 | 7.3 | 6.8
Nov | 1.6 | -2.2 | 1.7 | 1.9 | 2.4 | 1.8 | 1.9
Dec | -2.1 | -8 | -2.6 | -2.3 | -1.5 | -2.3 | -2.1
Year | 7.3 | 5.3 | 7.9 | 8.3 | 8.5 | 7.7 | 7.3

[Figure: Average monthly temperature variation; x-axis: Month; series: Kaslo, Kootenay, S. Slocan, Castlegar B, New Denver, Nakusp]
Figure 3.4 Monthly average temperature variation for the period 1940-1990 (from Meteorological Service of Canada, February 2002).

All the stations have the same pattern of temperature variation throughout the year, with a winter minimum below 0°C, and a summer maximum of about 20°C (Figure 3.4).

3.3.2 Precipitation

The average monthly precipitation variation during the year is presented in Table 3.3. The values for rainfall and overall precipitation are in mm m⁻² and for snowfall are in cm m⁻².

Table 3.3 Monthly precipitation at various sites in the study area (multi-annual average) (from Meteorological Service of Canada, February 2002).

Station | Measure | Jan | Feb | Mar | Apr | May | June | July | Aug | Sep | Oct | Nov | Dec | Year
Kaslo | Rainfall | 32.5 | 37.9 | 47.8 | 47.2 | 53.2 | 64.9 | 51.4 | 51.4 | 51.5 | 64.3 | 79.4 | 46.1 | 627.7
Kaslo | Snowfall | 76 | 34.6 | 11.7 | 0.8 | 0 | 0 | 0 | 0 | 0 | 0.8 | 24 | 76.9 | 224.7
Kaslo | Overall | 108.6 | 72.6 | 59.4 | 47.9 | 53.2 | 64.9 | 51.4 | 51.4 | 51.5 | 65.1 | 103.4 | 123 | 852.4
Kootenay | Rainfall | 4.6 | 4.1 | 9.4 | 25.9 | 39.3 | 58 | 44.3 | 38.9 | 34.3 | 20.4 | 16.2 | 7.9 | 303.2
Kootenay | Snowfall | 29.9 | 17.6 | 8.3 | 3.2 | 0.3 | 0 | 0 | 0 | 0.1 | 1.5 | 14.9 | 35 | 110.8
Kootenay | Overall | 34.5 | 21.7 | 17.8 | 29.1 | 39.6 | 58.1 | 44.3 | 38.9 | 34.4 | 21.9 | 31.1 | 42.8 | 414.1
S. Slocan | Rainfall | 33.1 | 41.9 | 50.4 | 50.8 | 59 | 69 | 49.2 | 45.7 | 50.3 | 57.9 | 76 | 45.9 | 629.2
S. Slocan | Snowfall | 65.2 | 23 | 7.6 | 0.4 | 0 | 0 | 0 | 0 | 0 | 1.4 | 20.8 | 67.6 | 186
S. Slocan | Overall | 98.3 | 64.8 | 57.9 | 51.2 | 59 | 69 | 49.2 | 45.7 | 50.3 | 59.3 | 96.8 | 114.2 | 815.9
Castlegar A | Rainfall | 20.4 | 23.6 | 44.8 | 47.2 | 60.7 | 64.2 | 43.9 | 42 | 43.6 | 52.8 | 57.4 | 32.5 | 533.2
Castlegar A | Snowfall | 67.9 | 37.1 | 16.2 | 2.3 | 0.2 | 0 | 0 | 0 | 0 | 1.6 | 32 | 67.3 | 224.6
Castlegar A | Overall | 74.5 | 57.4 | 59.6 | 49.9 | 60.9 | 64.2 | 43.9 | 42 | 43.6 | 54.4 | 88.5 | 92.9 | 731.9
Castlegar B | Rainfall | 22.3 | 22.6 | 37.7 | 43.2 | 55.8 | 59.4 | 39.8 | 39.7 | 45.1 | 38.1 | 47.3 | 29.5 | 480.7
Castlegar B | Snowfall | 39.8 | 20.2 | 4.5 | 0.7 | 0 | 0 | 0 | 0 | 0 | 1 | 15 | 34.7 | 116
Castlegar B | Overall | 62.1 | 43.2 | 42.3 | 43.9 | 55.9 | 59.4 | 39.8 | 39.7 | 45.1 | 39.2 | 62.3 | 64.2 | 597.1
New Denver | Rainfall | 29.2 | 37.6 | 49.8 | 49.3 | 59.5 | 74.2 | 61 | 58.5 | 55.2 | 67.1 | 76.3 | 41.1 | 658.9
New Denver | Snowfall | 77.7 | 33.2 | 8.1 | 0.7 | 0 | 0 | 0 | 0 | 0 | 0.6 | 16.7 | 69.3 | 206.4
New Denver | Overall | 106.9 | 70.8 | 57.9 | 50.1 | 59.5 | 74.2 | 61 | 58.5 | 55.2 | 67.8 | 93 | 111.2 | 866
Nakusp | Rainfall | 23.3 | 33.7 | 46.3 | 48.3 | 59.6 | 79.8 | 63.2 | 55.9 | 61.7 | 62.3 | 68.7 | 38.7 | 641.6
Nakusp | Snowfall | 88.1 | 33.4 | 6.6 | 0.6 | 0 | 0 | 0 | 0 | 0 | 0.9 | 20.3 | 65.6 | 215.5
Nakusp | Overall | 111.4 | 67.1 | 54.6 | 48.9 | 59.6 | 79.8 | 63.2 | 55.9 | 61.7 | 63.2 | 89 | 104.3 | 858.8

All the stations have two maximum precipitation peaks, one in winter and one in summer (Figure 3.5). The winter peak is larger than that in the summer for the northerly stations (Nakusp, Kaslo, New Denver) and the stations located in narrow valleys (South Slocan and Castlegar A). The summer maximum is larger than the winter maximum for the southerly stations (Kootenay, Castlegar B).
The driest conditions occur at the East Kootenay station, where spring and fall precipitation is extremely low (about 3.5 times lower than the overall maximum in the summer).

Figure 3.5 Mean monthly precipitation at various sites in the study area (from Meteorological Service of Canada, February 2002).

Two stations, Nakusp and Castlegar B, have a third maximum in the fall. This is related to the uniformity of precipitation throughout the year, with the ratio between the overall minimum and maximum being less than 50%.

3.3.3 Summary

The de Martonne (1926) aridity index was calculated for each station in order to determine the climatic regions according to the de Martonne classification. The formula for determining this aridity index is:

$$\text{Aridity index} = \frac{P}{T + 10} \qquad (3.1)$$

where P: average multi-annual precipitation in mm m⁻²; and
T: average multi-annual temperature in °C.

The climate information supplied by the seven stations in the area is presented in Table 3.4.

Table 3.4 de Martonne climate types.

Station        Aridity index   Climate type
Kaslo          49.2            Wet
Kootenay       27.1            Wet
S. Slocan      45.7            Wet
Castlegar A    40.0            Wet
Castlegar B    32.3            Wet
New Denver     49.1            Wet
Nakusp         49.6            Wet

The precipitation effectiveness index and temperature efficiency index were calculated using the Thornthwaite (1931) classification system. The formulae for these indexes are:

Precipitation effectiveness

$$PE = \sum_{i=1}^{12} 115 \left( \frac{P_i}{T_i + 10} \right)^{10/9} \qquad (3.2)$$

where P_i: average multi-annual precipitation (inches) of each month; and
T_i: average multi-annual temperature of each month (°F).

Temperature efficiency

$$TE = \sum_{i=1}^{12} \frac{T_i - 32}{4} \qquad (3.3)$$

Table 3.5 Thornthwaite climate types, using the climatic indices of Thornthwaite (1931).

Station        Precipitation effectiveness   Temperature efficiency   Seasonal distribution of precipitation   Thornthwaite climate type
Kaslo          1511                          40                       Rainfall adequate in all seasons         Wet micro-thermal
Kootenay       987                           29                       Rainfall deficient in summer             Wet taiga
S. Slocan      1427                          42                       Rainfall adequate in all seasons         Wet micro-thermal
Castlegar A    1183                          45                       Rainfall adequate in all seasons         Wet micro-thermal
Castlegar B    861                           46                       Rainfall adequate in all seasons         Wet micro-thermal
New Denver     1478                          41                       Rainfall adequate in all seasons         Wet micro-thermal
Nakusp         1464                          39                       Rainfall adequate in all seasons         Wet micro-thermal

The Thornthwaite and de Martonne indexes are consistent with each other. Both indicate that the climate in the area is wet and favourable to forest vegetation. The extremely wet climate (PE>128, which is the upper limit for the Thornthwaite classification system) indicates that eluviation processes (washing away and depletion) in the soil will be relatively active when the forest cover is removed. This increases the risk of slope failure (Greenway, 1987; Selby, 1993).

3.4 Vegetation

The forest vegetation of the Arrow Forest District is diverse. Douglas-fir (Pseudotsuga menziesii var. glauca) stands cover 26% of the area, fir (Abies sp.) 24%, pine (Pinus sp.) 18%, and larch (Larix sp.) 14% (Ministry of Forests, BC, 2002). Hemlock (Tsuga heterophylla), spruce (Picea sp.), western redcedar (Thuja plicata) and deciduous species (e.g. Acer sp., Betula sp., Populus sp., and Alnus sp.) are not as prevalent, but contribute significantly to the diversity of species found throughout the District. Coniferous species dominate 68% of the stands.

The rooting habits vary among the different species. Most of the species encountered in the study area develop a shallow root system (e.g., Abies lasiocarpa, Picea engelmannii, Pinus contorta, Thuja plicata) because the soil depth is 20-75 cm (Agriculture and Agri-food Canada, 2002).
Depending on the soil condition, Pinus contorta can develop a taproot, but this may bend or become horizontal if it reaches a hardpan or water (Burns and Honkala, 1990). Pseudotsuga menziesii, Abies grandis and Larix occidentalis develop a deep and extensive root system, but if bedrock or water is close to the soil surface, the lateral roots of Abies grandis and Pseudotsuga menziesii replace the taproot (Burns and Honkala, 1990). Because the soil in the region is relatively shallow (<75 cm), most species develop a lateral root system.

Usually, shrubs have a smaller root spread than trees (Gray, 1995). Furthermore, shrub roots have a lower tensile strength than tree roots (e.g. the roots of Vaccinium sp. have a tensile strength of 16 MPa, whereas the roots of Thuja plicata have a strength of 56 MPa) (Greenway, 1987). Therefore, the roots of trees are more important for soil strength. The contribution of shrubs or trees to terrain stability also depends on soil characteristics such as depth, structure and texture (e.g. if the soil is very shallow or rocky, shrubs can provide more stability than trees). Overall, woody vegetation is more effective for shallow landslide stabilization than herbs (Gray, 1994).

4 Methods

4.1 Sampling design

The landslides included in the Ministry of Forests inventory were classified as debris slides, debris flows and their mixtures, rock slides, rock falls and rock slides transformed into debris slides, slumps (bedrock) and earth flows (following the Cruden and Varnes (1996) classification). Debris flows, debris slides and combinations of the two represent more than 95% of the total number of events (571 events). Because of this high frequency, only debris flows, debris slides and their combinations were considered in this study. The classification of events based on type is presented in Table 4.1.
Table 4.1 Types of mass movements in the study area

Type of movement                                         Number of events   Percentage
Debris slide                                             423                72.7
Debris flow                                              11                 1.9
Debris slide and debris flow (mixture)                   137                23.5
Rock fall and rock slide transformed into debris flow    1                  0.2
Rock slide                                               5                  0.8
Slump bedrock                                            1                  0.2
Slump superficial                                        4                  0.7
TOTAL                                                    582                100

Debris slides and flows may be confined or unconfined. Each of these events can develop into single-path or multi-path events. This study takes into account only events that were unconfined and had a single path, resulting in 435 events.

A large number of elements were measured in the field in order to develop a relationship between different attributes of the debris slide-flows and their travel distances. Sampling was simple random sampling without replacement (SRS). This design was chosen because there were insufficient elements to calculate the stratum sample size in more complicated designs (e.g. it would have been necessary to estimate the within-stratum variance). The values necessary to determine the sample size for SRS are:

1. Population size, N; in this case N=435.
2. Sampling error specified as a percentage, PV.
3. Coefficient of variation, C_v. This was derived from other similar studies.

The coefficient of variation is defined as

$$C_v = \frac{s}{\bar{x}} \qquad (4.1)$$

where s: standard deviation of the lengths of all events; and
\bar{x}: mean length of the sampled events.

For a sample size greater than 20, the formula used to determine the sample size was (Cochran, 1977)

$$n = \frac{1}{\left( \dfrac{PV}{t \, C_v} \right)^2 + \dfrac{1}{N}} \qquad (4.2)$$

where n: sample size;
N: population size;
C_v: coefficient of variation;
PV: precision desired; and
t: value of the Student distribution for n-1 degrees of freedom (DF) and for α=0.15.

As the t-value depends on n, finding the sample size is an iterative process. The iterations stop when the t-value used to calculate the sample size and the t-value provided by Student's distribution for n-1 degrees of freedom are the same.
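The iteration described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the procedure actually used in the study: it assumes that Equation 4.2 has the usual finite-population form n = 1 / ((PV / (t·C_v))² + 1/N), it rounds n up to the next integer, and it takes the Student t-value from a caller-supplied function. The names `sample_size_srs` and `t_quantile` are illustrative.

```python
import math
from typing import Callable


def sample_size_srs(
    population: int,
    cv: float,
    pv: float,
    t_quantile: Callable[[int], float],
    n_start: int = 30,
    max_iter: int = 100,
) -> int:
    """Iterate n = 1 / ((pv / (t * cv))**2 + 1 / N) until n stops changing.

    t_quantile(df) must return the Student t-value for df degrees of freedom
    at the chosen significance level (assumption: supplied by the caller).
    """
    n = n_start
    for _ in range(max_iter):
        t = t_quantile(n - 1)  # t depends on the current sample size
        n_new = math.ceil(1.0 / ((pv / (t * cv)) ** 2 + 1.0 / population))
        n_new = max(n_new, 2)  # keep at least one degree of freedom
        if n_new == n:  # the t-value used and the t-value implied by n agree
            return n
        n = n_new
    raise RuntimeError("sample size iteration did not converge")


# Illustration only: a constant t = 2.0 stands in for a real Student t table.
n_demo = sample_size_srs(population=435, cv=0.8, pv=0.3, t_quantile=lambda df: 2.0)
```

In practice `t_quantile` would return the tabulated value for the chosen α and n-1 degrees of freedom, so the result depends on the table used and on whether a one-sided or two-sided value is taken.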
The goal of any sampling design is to provide a sample size that leads to as high a precision as possible within reasonable costs. Mathematically, the distance between the mean sample length and the true mean length of all the events, d, is expressed by:

$$d = L_m - A_m \qquad (4.3)$$

where L_m: mean length of sampled events; and
A_m: true mean length of all the events.

Another way of writing the above formula is

$$d = L_m - A_m = L_m \left( 1 - \frac{A_m}{L_m} \right) \qquad (4.4)$$

The above equations convert the question from a dimensional quantity to a ratio. Consequently, establishing the percentage of sampling error, PV = 1 - A_m/L_m, leads to the desired distance between the mean of the sample length and the mean length for all the events. As this study has an exploratory character, the percentage of sampling error was established at a value of 30%, PV=0.3.

Three studies were used to identify the desired value for the coefficient of variation: Wise (1997), Megahan and Ketcheson (1996), and Fannin and Rollerson (1993). The studies performed by Wise and by Fannin and Rollerson dealt with mass movement events on the Queen Charlotte Islands, British Columbia. The Queen Charlotte Islands (53°00'N, 132°30'W) have a temperate coastal maritime climate. Mean annual temperature is 8.1°C and mean annual precipitation is 1359 mm (Sandspit meteorological station). The study of Megahan and Ketcheson (1996) was carried out at Silver Creek, Idaho (44°25'N, 115°45'W). Mean annual temperature is 4.4°C and mean annual precipitation is 900 mm (close to Deadwood Dam Idaho station #102385), both figures being similar to those from the Arrow Forest District (as shown in Tables 3.2 and 3.3). The values needed for the sample size determination were taken from these three studies because their species, soil types, geology and climate are similar to those of the Arrow Forest District.

Wise (1997) found the average length of open-slope, single-path events to be 250 m, with a standard deviation of 200 m. The coefficient of variation was therefore 0.819. With the above values for PV and α, the resulting sample size is 25.

In the study by Fannin and Rollerson (1993), the average length of a type 1 event (a single-path event that initiates on a relatively uniform planar slope) was 120 m, with a standard deviation of 100 m. The coefficient of variation was 0.82. These values also indicate that a sample size of 25 is required.

Megahan and Ketcheson (1996) separated the events based on their source: culvert, rock drain or fill. The only events from their study considered in this research are those originating from culverts. Events originating from rock drains and fill had a very short average length (under 15 m), making them incompatible with the Arrow District events. The mean length of events originating from culverts was 53.0 m, with a standard deviation of 40 m. Therefore, the coefficient of variation is 74.7% and consequently the required sample size is 21.

Combining these results, the greatest required sample size is 25. Therefore the minimum number of events needed to develop a debris slide-flow travel distance model was taken to be 25.

The events constituting a sample need to cover as much variability as possible (Elfving, 1952; Demaerschalk and Kozak, 1974, 1975; Marshall and Demaerschalk, 1986). To ensure that this condition was fulfilled, the events were divided into different categories. The categories were chosen based on their possible influence on debris slide-flow travel distance: slope (Heim, 1989; Cannon, 1993; Hungr, 1995; Fannin and Rollerson, 1993; Corominas, 1996; Lau and Woods, 1997), geology, as an indicator of possible process (Finlay et al., 1999; Corominas, 1996), and event surface, as an indication of length. A number of classes were established for each category to ensure representation across the total number of events.
Wise (1997) split unconfined mass movements into two groups based on volumetric behaviour: entrainment and deposition. Based on slope, each group was then separated into two classes: for the entrainment group the cut-off value was established at 30°, and for the deposition group it was set at 25°. Hungr (1999) used four classes of slope for unconfined events: 0-15°, 16-20°, 21-30° and 31-50°. As the number of classes proposed by Hungr could lead to a large sample size, the number was reduced to three in this study: 0-25°, 26-35°, and greater than 35°. These groupings were based on the necessity of covering as much variation as possible. The coverage in terms of the overall events considered in this study was:

• 15% (64 events) in the first class (slope less than 25°);
• 56% (242 events) in the second class (slope between 25° and 35°); and
• 29% (129 events) in the third class (slope greater than 35°).

The number of events in the different geological classes is given in Table 4.2.

Table 4.2 Distribution of events by bedrock lithology in the study area

Rock type                                            Number of events
Quaternary                                           3
Fine sedimentary (Mesozoic-Slocan group)             37
Granite                                              291
Fine-textured meta-sediments (Paleozoic-Lardeau)     12
Gneiss                                               87
Volcanic (Mesozoic-Rossland group)                   5

The lithologies taken into account were granite (67%), gneiss (20%) and fine sedimentary (8%). The other lithologies were excluded from the sample, as inclusion would have increased the required sample size without adding much useful information.

The Ministry of Forests inventory considered only the surface of the landslides. Because the shape of a reach comprising a debris slide-flow can be approximated with a rectangle (Wise, 1997), there is a strong relationship between surface and length. The event's area can therefore be used to represent the travel distance of the debris slide-flow. In the Ministry of Forests inventory, events were classified into four groups: 0.02-0.05 ha, 0.05-0.2 ha, 0.2-1 ha and 1-5 ha.
Events with a surface area smaller than 0.05 ha identified using aerial photographs were checked in the field. A large number of them were found not to be debris slide-flows, and this class was therefore removed from the analysis. Consequently, three classes of event size were used: 0.05-0.2 ha, 0.2-1 ha and 1-5 ha.

The final classification of events covered slope (three classes), geology (three classes) and size (three classes). The number of events in each class and category is given in Table 4.3.

Table 4.3 Distribution of debris flow events in different categories.

Size [ha]: 0.05-0.2, 0.2-1, 1-5
Slope [°] (within each size class): <25°, 25°-35°, >35°
Geology: Granite, Gneiss, Fine sedimentary
Number of events: 25 7 0 106 28 14 38 13 7 6 8 0 43 29 19 50 6 5 3 0 0 3 1 3 0 0 0

As the established sample size (25) is small and the events are unevenly distributed, a proportional allocation was considered for sampling. However, this would have led to a very small number of events (below 0.5) in some classes and a large number in others. Consequently, an equal allocation procedure was chosen with the intent of having two events per class. However, some classes had only one event selected, because either the other events in the respective cell were incorrectly classified or the events were too inaccessible. The maximum survey time allocated for each event was one day; events that would have involved more time to survey were classed as inaccessible and dropped from the sample. A random selection of the events from each class was performed and the events selected are presented in Table 4.4.
Table 4.4 Identification numbers of events to be sampled in each class¹

Size [ha]: 0.05-0.2, 0.2-1, 1-5
Slope [°] (within each size class): <25°, 25°-35°, >35°
Geology: Granite, Gneiss, Fine sedimentary
71-12 22-102 73-18 31-4 74-20 73-28
21-101 31-3 73-17 21-102 61-18 62-23D 52-26 61-15 62-14 94-30 94-34 94-36 94-35b
51-3 52-13 52-10 53-5 52-12 53-6 52-30 62-10 94-23 94-24 94-26d
<25°: 83-7; 25°-35°: 61-19; >35°: 61-10, 52-32

¹ The first number is the map sheet and the second is the landslide (or landslide group) id on that map sheet, e.g. 94-35b is the second event from the 35th landslide group on map sheet F094.

4.2 Data set construction

The elements to be measured were determined based on their potential ability to influence debris flow travel distance. They were established through a literature review, knowledge of the physical behaviour of mass movements, and experience. The elements that were measured are presented in Table 4.5.

Table 4.5 Elements measured for each event

Category        Element                                                                          Values
Vegetation      Stand composition                                                                Composition expressed in % canopy closure (i.e. 90% Hemlock 10% Cedar)
                Canopy closure                                                                   Percentage from 0% to 100%, in 10% steps
                Average diameter of each species                                                 Continuous (cm)
                Average height of each species                                                   Continuous (m)
Geomorphology   Plan curvature                                                                   Plane, convex, concave
                Vertical curvature                                                               Plane, convex, concave
                Type of reach                                                                    Entering, deposition or both
                Slope of the reach                                                               In ° (degrees)
                Azimuth of the reach                                                             In ° (degrees)
                Gully                                                                            Presence vs. absence
                Position on the slope                                                            Top, middle, bottom
Geometrical     Length of each reach                                                             Continuous (m)
                Width at the top part of the reach                                               Continuous (m)
                Width at the bottom part of the reach                                            Continuous (m)
                Depth of the top part of the reach measured at ¼ of the width (see Appendix 2)   Continuous (m)
                Depth of the top part of the reach measured at ½ of the width (see Appendix 2)   Continuous (m)
                Depth of the top part of the reach measured at ¾ of the width (see Appendix 2)   Continuous (m)
Risk            The event reaches the stream or not                                              Yes vs. No
Terrain state   Human activity                                                                   Logging, road, absence

Elements describing vegetation were measured following the procedures of Munteanu et al. (1980). Stand composition and canopy closure were estimated using a spherical densiometer. For diameter, the trees representing the average diameter for each species were identified visually, then measured. The measurements were made at 3-4 sampling points and included at least four trees at each sampling point. The average height (based on at least three trees at each sampling point) was also measured for the trees representing the average diameter category.

Each reach, as defined by Wise (1997), was presumed to have linearity and uniformity of slope, azimuth, width and volumetric behaviour characteristics. This study adds two more restrictions to the definition of a reach: uniform geology over the path and consistent flow confinement. A schematic debris slide with three reaches is presented in Figure 4.1 and Figure 4.2.

Figure 4.1 Debris slide/avalanche with three reaches

Figure 4.2 Profile of a three-reach event

A reach in this study was defined as a linear portion of the event trajectory, having the same geology, constant slope, azimuth, width, volumetric behaviour characteristics and confinement type. The constancy of slope, azimuth, width and volumetric behaviour characteristics needs to be defined more specifically. The limits within which these elements can be considered "unchanged" are location-dependent.
Different parts of British Columbia can have different cut-off values defining a reach. These limits are precisely defined in Section 4.5, where exact values have been assigned.

Each event was surveyed for the elements in Table 4.5, which involved walking the entire length of each event. Two soil samples were taken from each reach, one from inside the event and one from outside. Specific weight and granulometry were determined.

4.3 Derived variables

The variables derived from the field measurements include slope and horizontal length, average slope, area and volume of the event. Slope length was calculated as the sum of the lengths of all individual reaches:

$$L = \sum_{i=1}^{k} L_i \qquad (4.5)$$

where L: slope length [m];
L_i: slope length of the i-th reach [m]; and
k: number of reaches in the event.

Horizontal length was calculated as the sum of the horizontal lengths of all individual reaches:

$$L_{hor} = \sum_{i=1}^{k} L_i \cos\varphi_i \qquad (4.6)$$

where L_{hor}: horizontal length [m];
L_i: slope length of the i-th reach [m];
\cos\varphi_i: cosine of the slope of reach i, \varphi_i; and
k: number of reaches in the event.

Average slope was calculated as the slope of the event as a whole:

$$\varphi = \arccos\left( \frac{L_{hor}}{L} \right) \qquad (4.7)$$

where \varphi: average slope of the event [°];
arccos: the inverse of the cosine (arc cosine); and
L_{hor}, L: as above.

The area of the debris flow was defined as the sum of the scoured areas of all reaches. The area can be expressed on the slope or projected onto a horizontal plane, with slope area being expressed as:

$$A = \sum_{i=1}^{k} L_i \frac{Wt_i + Wb_i}{2} \qquad (4.8)$$

where A: slope area [m²];
Wt_i: width at the top of reach i [m];
Wb_i: width at the bottom of reach i [m]; and
L_i, k: as above.

Horizontal area, A_{hor}, is expressed by:

$$A_{hor} = \sum_{i=1}^{k} L_i \cos\varphi_i \, \frac{Wt_i + Wb_i}{2} \qquad (4.9)$$

where L_i, \varphi_i, Wt_i, Wb_i are as above.
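The length, slope and area definitions above (Equations 4.5 to 4.9) can be illustrated with a short Python sketch. The `Reach` record and the function names are hypothetical and chosen only for this illustration; the single example reach is made-up data, not a surveyed event.

```python
import math
from typing import NamedTuple, Sequence


class Reach(NamedTuple):
    length: float        # slope length L_i [m]
    slope_deg: float     # slope of the reach, phi_i [degrees]
    width_top: float     # Wt_i [m]
    width_bottom: float  # Wb_i [m]


def slope_length(reaches: Sequence[Reach]) -> float:
    """Eq. 4.5: sum of the slope lengths of all reaches."""
    return sum(r.length for r in reaches)


def horizontal_length(reaches: Sequence[Reach]) -> float:
    """Eq. 4.6: sum of the horizontal projections of all reaches."""
    return sum(r.length * math.cos(math.radians(r.slope_deg)) for r in reaches)


def average_slope_deg(reaches: Sequence[Reach]) -> float:
    """Eq. 4.7: arccos of horizontal length over slope length, in degrees."""
    ratio = horizontal_length(reaches) / slope_length(reaches)
    return math.degrees(math.acos(min(1.0, ratio)))  # clamp against rounding


def slope_area(reaches: Sequence[Reach]) -> float:
    """Eq. 4.8: each reach is treated as a trapezoid on the slope."""
    return sum(r.length * (r.width_top + r.width_bottom) / 2.0 for r in reaches)


def horizontal_area(reaches: Sequence[Reach]) -> float:
    """Eq. 4.9: trapezoid areas projected onto a horizontal plane."""
    return sum(
        r.length * math.cos(math.radians(r.slope_deg))
        * (r.width_top + r.width_bottom) / 2.0
        for r in reaches
    )


# Made-up example: one 100 m reach on a 60 degree slope, 10 m wide at the
# top and 20 m wide at the bottom.
demo = [Reach(length=100.0, slope_deg=60.0, width_top=10.0, width_bottom=20.0)]
```

For the made-up reach, the slope length is 100 m, the horizontal length is about 50 m, and the slope and horizontal areas are about 1500 m² and 750 m² respectively.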
The volume of the slide was defined as the sum of the volume of all reaches:

$$Vol_{tot} = \sum_{i=1}^{k} L_i \frac{St_i + Sb_i}{2} = \sum_{i=1}^{k} L_i \frac{Wt_i \left( d_{ti;0.25} + d_{ti;0.5} + d_{ti;0.75} \right) + Wb_i \left( d_{bi;0.25} + d_{bi;0.5} + d_{bi;0.75} \right)}{8} \qquad (4.10)$$

where Vol_{tot}: volume of the debris flow as a whole [m³];
St_i, Sb_i: cross-sectional area of the top/bottom part of reach i [m²];
d_{ti;0.25}, d_{ti;0.5}, d_{ti;0.75}: depth at the top of reach i at 0.25, 0.5 and 0.75 of the reach's width [m];
d_{bi;0.25}, d_{bi;0.5}, d_{bi;0.75}: depth at the bottom of reach i at ¼, ½ and ¾ of the reach's width [m]; and
L_i, k: as above.

The volume at the initiation point was defined as the volume of the first reach, with the initial volume being expressed by:

$$Vol_1 = L_1 \frac{Wt_1 \left( d_{t1;0.25} + d_{t1;0.5} + d_{t1;0.75} \right) + Wb_1 \left( d_{b1;0.25} + d_{b1;0.5} + d_{b1;0.75} \right)}{8} \qquad (4.11)$$

where all the elements are as above, but for the first reach.

The magnetic azimuth was measured in the field with a compass. In 2001, the magnetic declination for the Arrow district area was 18° East (Natural Resources Canada, 2002). As the azimuth used in the model has to be in agreement with the TRIM maps, the geographic azimuth was used.

These derived variables were considered as raw data because they express some of the attributes of debris flows.

4.4 Data characterization

The data collected for each event are presented in Appendix 1. Table 4.6 presents the basic statistics for the 38 events.

Table 4.6 Summary statistics.

Element                                       Average   Median    Minimum   Maximum   Standard deviation
Number of reaches/event                       2.7       2         1         12        2.23
Slope length [m]                              249.5     108.8     25.8      1341.7    329.32
Horizontal length [m]                         209.8     82.95     21.6      1200      287.30
Width at initiation point [m]                 21.5      12.25     2.2       168.7     29.79
Slope initiation point [°]                    36.9      38        18        53        6.53
Average slope [°]                             34.9      34.5      18        53        7.92
Depth initiation [m]                          1.2       1.1       0.1       3         0.60
Canopy closure [%]                            0.8       0.9       0         1         0.29
Height of the stand - first reach [m]         25.5      28        0         45        11.92
Diameter of the stand - first reach [cm]      22.7      23        0         50        11.01
Slope surface [m²]                            4416.2    1754.5    333       24708     5786.68
Horizontal surface [m²]                       4101.1    1482      251       22113     5337.14
Volume of first reach [m³]                    1096.4    543       49        10021     1810.85
Volume of whole event [m³]                    4545.1    1113      192       36446     7801.43

The values reveal the great heterogeneity amongst the data, with the standard deviation being greater than the mean for almost all the elements. This indicates that the sampling captured the variability within the population. The debris slide-flows vary from small-scale events, 192 m³ (event 73-18), to medium-scale events, 36,446 m³ (event 61-10) (Innes, 1983). Slope lengths vary from 25.8 m to 1341.7 m. This level of variation suggested that regression analysis would be useful to analyze the data (Demaerschalk and Kozak, 1974, 1975).

4.5 Reach characterization

A reach, as defined in Section 4.2, presumes uniformity of geology, slope, azimuth, width, volumetric behaviour, confinement type and trajectory. This definition is too broad to be functional, and limits to the variation of the attributes are required. Slope, azimuth and width require ranges. The remaining attributes are easy to identify, as they are class variables. The trajectory of a reach should be linear in order to fulfill energetic conditions. (In a Euclidean space, the smallest amount of energy dissipation occurs on a linear trajectory.) A reach, as part of an event, is defined according to its relations with the event as a whole and with its neighbours.
The variation of those attributes that required limits is presented in Appendix 1, Table A.1.4. The values in the tables from Appendix 1 suggest that a reach be defined according to the following criteria:

• The minimum difference between two adjacent reaches must be at least 3° in slope or 20° in azimuth. Only four events failed to fulfill this condition. These values were considered because events 61-10 and 62-10 do not respect these conditions and were dropped as outliers during the model-building process using different statistical procedures (such as studentized deleted residuals and hat matrix leverage) and quantitative analysis. Event 21-101 did not influence the overall model and 94-26d was used only for testing purposes.

• The maximum difference in slope between two reaches must be smaller than 26° (the value is area-dependent; the determination is based on the data set), except when one reach is the fan. If the difference is more than 26°, the down-stream reach can have a slope greater than 46° (104%), making it extremely unstable. Given this, initiation of the terrain failure can occur in one of the intermediate reaches and not at the first reach. The rest of the reaches, from the real failure point to the present, presumed, initiation point, can appear only after the loss of the basal support.

• The length of the first reach or fan must be greater than 10 m. Only one event falls outside this criterion. Event 61-18 is only 37.2 m long, and the fan represents more than 20% of the total length. These, together with the real length of the first reach (8.5 m), justify setting the minimal cut-off value for the fan and first reach at 10 m.

• The length of any reach, except the fan and the first one, must be greater than 25 m. Two events break this rule. For 61-10, the minimum length is 24.8 m, which can be considered sufficiently close to 25 m. Event 73-12 is an outlier in many respects, including this criterion.

• The maximum length of a reach must be less than 200 m. Two events did not meet this requirement; both events are outliers in many respects, including this criterion. The maximum length rule is not applied if the reach has a slope larger than 45°.

• The ratio between the lengths of two adjacent reaches must be greater than 20% and smaller than 500%, except when one is the fan. In such a case, the ratio must lie between 16% and 625%. This means that a reach cannot be five times longer or shorter than any adjacent reach, except fans. Two events do not fulfill this rule. The 31-3 event has one extremely long reach, 20 times longer than the adjacent one, suggesting that the reach was incorrectly identified in the field. For event 52-30, the value is 19%, very close to the minimal value of 20%.

• The stopping rule for the event is related to the slope. Based on the statistics of the data set, if the slope is less than 18°, then the event stops (Table A.1.4). Of the intermediate reaches, 99% have slopes greater than 18°, making the selected cut-off value consistent. This ensures that the event will not cross a portion of forested terrain with a slope less than 18° and a length greater than 25 m. This means that when the event reaches a slope smaller than 18°, only one reach is added to the path variable.

Events 51-3, 52-12, 73-18 and 83-7 all stop on slopes greater than 18°. They can be explained as follows:

  • Event 51-3 was halted by very dense forest.
  • Event 52-12 is very short and wide; the length-width ratio is 15%. This, correlated with the shallowness of the event, suggests that there was insufficient energy to maintain the event movement.
  • Event 73-18, which was caused by a road, is short (31.5 m) and stops on a 44° slope. The environmental conditions are similar to those of event 52-12; the soil is shallow (maximum 50 cm depth) and the forest is very dense. These conditions suggest that initiation occurred during heavy rain (to raise the pore water pressure to a critical level), but the momentum developed by the initial sliding volume was insufficient to ensure further travel.
  • Event 83-7 stops on a slope of 23°. The slide started in a clearcut area and stops in dense forest, suggesting that the presence of the forest was the critical factor halting movement.

Two events have intermediate reaches with slopes less than 18°.

  • Event 61-10 has two reaches with slopes of 15° and 12°, respectively. The situation is identical to the 21-101 event, with the reaches having triangular profiles determined by rocks.

The width uniformity criterion expresses the variation of a reach's width along its path. One way to present this criterion is to calculate the ratio between the width of the top and the bottom of the reach. If this ratio is close to 1, then it can be inferred that width does not vary along the reach path. This approach is very simplistic and can misrepresent the width uniformity along the reach (e.g. a regular reach can have a ratio of width top to width bottom of 0.85, as in Figure 4.3, but an event represented by only one reach can have a value of 0.5 for the same ratio, formed by a trapezoidal shape).

Figure 4.3 Comparison of the ratio of width bottom to width top for two types of reaches (panel labels: width bottom / width top = 0.5 and width bottom / width top = 0.85)

Figure 4.4 Lateral angle, γ, of a reach with trapezoidal shape

To solve this problem, the uniformity determined from the width is transformed into a uniformity of the lateral angle, γ, of the trapezoid representing the reach (Figure 4.4). The angle is determined using Equation 4.12:

$$\gamma = \arctan\left( \frac{W_{top} - W_{bottom}}{2 L} \right) \qquad (4.12)$$

where γ: lateral angle of the trapezoid;
W_{top}, W_{bottom}: width of the top and bottom of the reach; and
L: slope length of the reach.

The criterion can be stated as: the width of a reach is uniform if the lateral slope angle is less than 15°, except on a fan.
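The width uniformity criterion can be sketched as follows. This is a minimal illustration under stated assumptions: the angle is taken from the absolute width difference (so that widening and narrowing reaches are treated alike, which the text does not state explicitly), the exception for fans is not handled, and the function names are hypothetical.

```python
import math


def lateral_angle_deg(width_top: float, width_bottom: float, length: float) -> float:
    """Lateral angle of the trapezoid representing a reach, in degrees.

    Assumption: the absolute width difference is used, so the sign of
    (width_top - width_bottom) does not matter.
    """
    return math.degrees(math.atan(abs(width_top - width_bottom) / (2.0 * length)))


def is_width_uniform(
    width_top: float,
    width_bottom: float,
    length: float,
    limit_deg: float = 15.0,
) -> bool:
    """Width is considered uniform if the lateral angle is below the limit.

    The 15 degree default is the cut-off quoted in the text; the exception
    for fans is left to the caller.
    """
    return lateral_angle_deg(width_top, width_bottom, length) < limit_deg


# Made-up examples: a nearly parallel-sided reach and a strongly flaring one.
narrow_change = is_width_uniform(width_top=10.0, width_bottom=12.0, length=50.0)
strong_flare = is_width_uniform(width_top=10.0, width_bottom=30.0, length=10.0)
```

In the second made-up example the width changes by 20 m over a 10 m reach, which gives a lateral angle of 45° and therefore fails the criterion.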
Three reaches do not fulfill the condition. Reach 2/event 62-14 and reach 1/event 94-30 have angles greater than 15° because bedrock outcrops transformed the sections from relative ellipsoids to a triangular shape. Reach 1/event 83-7, which starts in a clearcut, is very wide. When the event reaches closed forest, the path becomes very narrow.

These criteria determine a debris slide-flow's trajectory. The elements that bound the values of the variables are the effective and relative length of a reach, the slope variation between reaches, the azimuth variation between reaches, the stopping rule, and the variation in width. The selected cut-off values are area-dependent. This means that different geographical areas can have different cut-off values. Their identification requires suitable field sampling and a procedure similar to that described above.

4.6 Defining the path as a variable

Each event can be represented on a contour map, as shown for event 21-101 in Figure 4.5.

Figure 4.5 Event 21-101

The profile of an event is represented by the reach succession along the debris slide-flow trajectory. The longitudinal profile of event 21-101 is presented in Figure 4.6.

Figure 4.6 Longitudinal profile of event 21-101.

A major challenge is to express the profile of an event with a single number. A possible solution for this problem is found in number theory. The quantification of terrain variation along the debris slide-flow path is based on the correspondence between binary and decimal numeration systems as stated in Theorem 1 (Creanga, 1965):

Theorem 1. The transformation from one numeration system to another numeration system is a bijective function.

The theorem ensures that a number in a numeration system can be expressed in only one way in any different numeration system.
The decimal system is used in statistics and non-linear programming calculations. If the path variable is expressed in any other numeration system, Theorem 1 ensures that its expression in the decimal system is unique. The binary system was used to express the debris slide-flow path as a single number. Each reach can have a value of 1 or 0 according to its neighbouring reaches. The first reach has a value of 1. The remaining reaches obey the following rule: a reach has the value 0 if the slope of the reach immediately above it is greater, and 1 if the opposite holds. An additional rule is that if the event ends in a stream, no value is assigned to the reach containing the stream. For example, event 52-13 in Figure 4.7 has the following slope values:

Reach 1: 33°
Reach 2: 28°
Reach 3: 33°
Reach 4: stream

As the event ends in a stream, there is no digit for the final reach. If the event had ended in a fan with a slope angle less than that of reach three, then the assigned value would be 0.

Figure 4.7 Profile of event 52-13

The binary coding for event 52-13 is 1 0 1. The first 1 is for the first reach. As the slope of the second reach is less than that of the first, the assigned value is 0. The slope of the third reach is 33°, greater than the slope of the second reach (28°), and the assigned value is therefore 1. The stream has a slope of 0°, less than reach three, but as a stream is present, no value is assigned to it. This means that no extra digits are attached to the binary coding.

A value obtained in this way has to be transformed into the decimal system to be interpreted with the remaining variables. Theorem 1 ensures that for each number in the binary system, there is one and only one corresponding number in the decimal system. The decimal system number can therefore represent the coding of an event; for event 52-13, the binary coding 101 corresponds to the decimal value 5.
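The coding rule can be sketched in Python as below. This is an illustrative sketch, not the thesis's implementation: the function name `path_code` is hypothetical, the caller is expected to pass only the reaches that receive a digit (a final reach containing a stream is left out), and a reach whose slope equals that of the upstream reach is assigned 1, which is an assumption because the text does not cover ties.

```python
from typing import Sequence, Tuple


def path_code(slopes_deg: Sequence[float]) -> Tuple[str, int]:
    """Return the binary path coding and its decimal value.

    slopes_deg lists the slopes of the coded reaches from the top of the
    event downwards; a final reach containing a stream must be omitted.
    """
    if not slopes_deg:
        raise ValueError("at least one reach is required")
    digits = ["1"]  # the first (uppermost) reach always has the value 1
    for upstream, current in zip(slopes_deg, slopes_deg[1:]):
        # 0 if the reach immediately above is steeper, otherwise 1
        # (assumption: equal slopes are coded as 1)
        digits.append("0" if upstream > current else "1")
    binary = "".join(digits)
    return binary, int(binary, 2)


# Event 52-13 from the text: reaches of 33, 28 and 33 degrees, then a stream.
binary_52_13, decimal_52_13 = path_code([33, 28, 33])
```

For event 52-13 this yields the binary coding 101, matching the worked example in the text, and the decimal value 5.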
As the binary system identifies each event path based on reaches, two different debris slide-flows are represented by two different numbers (Figure 4.8).

Figure 4.8 Two different debris slide-flows coded by two different numbers (profile views against horizontal distance; event with 4 reaches: path codification 1010, decimal system 10; event with 5 reaches: path codification 11010, decimal system 26)

The coding explains the variation in the debris slide-flow in two ways: by the variation of the slope along the path (based on the binary coding), and by the variation of the direction of flow (expressed by azimuth) along the path (based on reach characterization).

The slope variation explained by the binary coding crudely characterizes the energy variation along the debris slide-flow path. If the slope increases, the assigned value for the corresponding reach is 1 and therefore the binary number is greater. The supplementary conditions required to codify an event state that a reach has to be longer than 25 m and that the event stops when a reach has a slope less than 18°.

The variation in azimuth is incorporated into the binary coding through the reach characterization and definition. Two reaches are considered different if the difference between their azimuths is greater than 20°. However, a change in azimuth occurs because a reach has to follow the greatest slope, meaning the largest kinetic energy variation. A new reach identified by a change in azimuth is perceived at the energetic level as an increase in the kinetic energy of the mass movement (Figure 4.9). Therefore a new reach results in a higher number in the binary coding and consequently in the decimal system, consistent with the energetic variation of the event.

Figure 4.9 Azimuth variation consistent with kinetic energy of the mass movement

The path variable is dependent only on the terrain.
Its value represents the terrain variation along the trajectory of the debris slide-flow. The cut-off values represent slope morphology in relation to terrain stability.

4.7 Hypotheses

Information from geomorphology, geology, physics and biology was combined to develop the following hypothesis:

There is a significant relationship between the debris slide-flow travel distance and geomorphology, geology, tree species, stand characteristics, canopy closure and soil attributes.

The geomorphological attributes considered were debris slide-flow travel path, slope, azimuth, plan and profile curvature, and position on the slope. The soil attributes that were included were variation of granulometry, fine particle content, and specific weight. The average height and diameter of the stand at the initiation point (first reach) were used to indicate the stand characteristics that influence the event travel distance (e.g. root strength and structure, stand mass).

The identification of quantifiable attributes enabled a more flexible approach to the testing of the general hypothesis. The next step was therefore to test the significance of each attribute for the debris slide-flow travel distance. A series of secondary hypotheses were developed as the first step in the testing of the general hypothesis.

A new variable, which quantifies the trajectory of the event and which has been termed here the path variable, should have a significant influence on the debris slide-flow travel distance (Corominas, 1996). The hypothesis that the path variable influences debris slide-flow travel distance was tested.

The average slope of the debris slide-flow, $\alpha$, is determined by Equation 4.13:

$$ \alpha = \arctan\left( \frac{\sum_{i=1}^{n} l_i \sin(\alpha_i)}{\sum_{i=1}^{n} l_i \cos(\alpha_i)} \right) \tag{4.13} $$

where
$\alpha_i$: slope of the individual reaches comprising the event;
$l_i$: length of the individual reaches comprising the event; and
$n$: number of reaches comprising the event.

The average slope is the most widely studied attribute characterizing debris slide-flow events.
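As a hedged illustration, Equation 4.13 can be read as the arctangent of the ratio between the summed vertical and the summed horizontal components of the reaches. The sketch below assumes that reach lengths are slope lengths in metres and that slopes are given in degrees; the function name and the sample values are hypothetical and do not come from the study data.

```python
import math


def average_slope_deg(reach_lengths_m, reach_slopes_deg):
    """Average slope of an event from its reaches (reading of Equation 4.13).

    Assumptions (not from the source): lengths are slope lengths in
    metres, slopes are in degrees, and both sequences have equal length.
    """
    if len(reach_lengths_m) != len(reach_slopes_deg) or not reach_lengths_m:
        raise ValueError("need one slope per reach and at least one reach")

    # Sum of vertical components: l_i * sin(alpha_i)
    vertical = sum(l * math.sin(math.radians(a))
                   for l, a in zip(reach_lengths_m, reach_slopes_deg))
    # Sum of horizontal components: l_i * cos(alpha_i)
    horizontal = sum(l * math.cos(math.radians(a))
                     for l, a in zip(reach_lengths_m, reach_slopes_deg))
    return math.degrees(math.atan2(vertical, horizontal))


# Hypothetical three-reach event
print(average_slope_deg([60.0, 45.0, 80.0], [33.0, 28.0, 33.0]))
```

With a single reach the function returns the slope of that reach, and for several reaches the result lies between the smallest and largest reach slopes, as expected of an average slope.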
This variable has been considered when inferring the relative importance of different attributes on travel distance (Terzaghi, 1943; Heim, 1989; Cannon, 1993; Corominas, 1996; Hungr, 1999; Finlay et al., 1999). The hypothesis that the average terrain slope influences debris slide-flow travel distance was tested.

Together with the average slope, fan slope and slope of the first reach were introduced into the analysis because of their possible influence on event travel distance (Pao, 1961; Corominas, 1996). The hypothesis that there is a significant relationship between slope of the fan, slope of the first reach and debris slide-flow travel distance was tested, as was the hypothesis that initial volume has a significant influence on the debris slide-flow travel distance (Cannon, 1993; Corominas, 1996; Fannin et al., 1996).

As northerly aspects have a different water regime than southerly ones, aspect may have an influence on travel distance. This is because an increase in pore water pressure can reduce the stability of the terrain (Greenway, 1987; Wise, 1997). In addition, species composition varies with aspect (Hosie, 1969; Burns and Honkala, 1990; Selby, 1993). Each species has unique rooting habits and therefore the stability of the terrain varies according to the species association (Chirita, 1974; Burns and Honkala, 1990; Wu et al., 1994). The hypothesis that aspect has a significant influence on the debris slide-flow travel distance was tested.

Water flow on a slope increases towards its base (Viessman and Lewis, 1996; Powrie, 1997). Given the role of pore-water pressure on the stability of a slope (Terzaghi, 1943; Kenney, 1984; Powrie, 1997), the hypothesis that the position of the debris slide-flow initiation point on the slope is correlated with the travel distance of the flow was tested.

Terrain curvature influences the local microclimate by reducing or accelerating water flow on the slope (Corominas, 1996; Viessman and Lewis, 1996; Megahan and Ketcheson, 1996).
The local topography of the reach can help fulfill conditions for the triggering of the event, and the influence of terrain curvature on debris slide-flow travel distance was tested.

The rooting habit of each species influences terrain stability (Sidle, Pearce and O'Loughlin, 1985; Wu et al., 1994; Watson et al., 1994; Helliwell, 1994). There are differences in root structure between single-species stands and mixed-species stands (Stanescu et al., 1997). The influence of the roots on travel distance was tested.

Slope stability can be reduced if the mass of vegetation on a slope is large (Greenway, 1987). Consequently, there may be a relationship between travel distance of the debris slide-flow and the mass of the vegetation on the slope at the time of failure. The mass of the stand can be represented by the combination of species, canopy closure, diameter and height. The influence of average stand height and diameter on debris slide-flow travel distance was tested.

Gross precipitation is the precipitation falling above the stand canopy. It is similar to the precipitation in an open area when there are no forest edge or topographic effects. The gross precipitation is separated into rainfall interception loss, throughfall and stemflow. The interception loss (I) is the portion of the precipitation retained by the canopy surface. For different types of forest, there exists a linear relationship between interception loss and gross precipitation during a rainfall (Hashino et al., 2002):

$$ I = \alpha R + \beta \tag{4.14} $$

where
$R$: rainfall;
$\alpha$: empirical constant, approximately the ratio between interception loss and gross rainfall. Gash (1979) showed that $\alpha = \bar{e}/\bar{r}$, where $\bar{e}$ is the average interception rate and $\bar{r}$ the average rainfall intensity; and
$\beta$: an empirical constant indicating the intercepted rainwater remaining on leaves and branches when the rain stops (e.g. $\beta$ = 1.3-2.0 mm for coniferous trees (Rutter et al., 1975; Gash, 1979)).
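A short numerical sketch may help to fix the meaning of the two constants in Equation 4.14. The function name and the coefficient values are hypothetical: the slope constant 0.2 is chosen only for illustration, and 1.5 mm is simply a value inside the 1.3-2.0 mm range quoted for coniferous trees.

```python
def interception_loss_mm(rainfall_mm, alpha, beta_mm):
    """Linear interception model I = alpha * R + beta (Equation 4.14).

    alpha and beta_mm are empirical constants; the values used in the
    example below are illustrative assumptions, not fitted coefficients.
    """
    if rainfall_mm < 0:
        raise ValueError("rainfall must be non-negative")
    return alpha * rainfall_mm + beta_mm


# Hypothetical 10 mm rainfall over a coniferous stand
loss = interception_loss_mm(10.0, alpha=0.2, beta_mm=1.5)
print(f"interception loss: {loss:.1f} mm")  # interception loss: 3.5 mm
```

Under these assumed constants, 3.5 mm of a 10 mm rainfall would be retained by the canopy, leaving 6.5 mm as throughfall and stemflow.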
Tree foliage intercepts 10% to 25% of precipitation and up to 100% of light rainfall (Greenway, 1987). The interception loss is usually greater for conifers (20% to 40%) than for hardwoods (10% to 20%) (Zinke, 1967). The amount of interception loss depends on species, stand age, annual precipitation, canopy structure, and meteorological factors such as wind speed and vapour pressure deficit (Rutter et al., 1971; Xiao et al., 2000). The quantity of water reaching the ground influences the water regime of the soil (Innes, 1983; Greenway, 1987). The soil water volume can be directly related to the risk of failure (Terzaghi, 1943; Kenney, 1984; Powrie, 1997). I therefore tested whether stand canopy closure influences travel distance.

The substrate influences the development of soil and vegetation. As the nature of the soil and vegetation influences slope stability (Chirita, 1974; Traci, 1985), the bedrock geology may be an important factor influencing slope stability, and this was tested.

A series of studies (Innes, 1983; Takahashi, 1991; Iverson, 1997) have presented the importance of soil granulometry for the travel distance of debris slide-flows. The significance of the relationships between debris slide-flow travel distance and soil granulometry, soil fine particle content, and soil specific weight at the initiation point was examined using simple linear regression.

Logging activities can influence mass movement activity (Sidle, 1992, 2000; Rollerson et al., 2001). The length that a debris slide-flow travels depends on whether it is in a clearcut area or in forest (Robinson et al., 1999). Watson et al. (1994) found that 50% of the root strength of harvested trees was lost in 1^1 years after harvesting, depending on the species and climatic region.
The critical period for terrain stability is the period between the time when soil strength is reduced through the decay of the roots of the harvested trees and the time when soil strength starts to increase through the development of the rooting systems of the new stand. This period of reduced soil strength occurs 3-15 years after harvesting (Robinson et al., 1999). Using only events that travelled through the forest, except for the first reach, I examined the influence of terrain on travel distance.

All the above hypotheses were tested using the statistical methods described in Section 4.9.

4.8 Assumptions

A number of assumptions have to be made in order to test the hypotheses. The underlying assumption of this study is:

The mass movement travel distance can be fully explained by mass movement attributes.

This assumption requires that a number of attributes be attached to the event. These attributes completely describe the behaviour of the mass movement in time and space. It is impossible to identify all the attributes, so the most important have to be selected. Their selection is determined by the starting set of attributes used in the study. One set can lead to some significant attributes while another set can lead to a different group. For example, choosing slope, gully profile and soil granulometric properties as predictor variables can lead to one result. If the predictor variables include slope, species and terrain curvature the result is different, but not necessarily wrong.

The selection of the initial set of attributes is based on experience, interpretation of the laws of physics, deduction, and previous studies performed in the area of interest. The larger the number of attributes considered, the more accurate are the results. An overlapping set of initial attributes considered by different studies (e.g.
slope and volume (Corominas, 1996), volume and obstruction length (Megahan and Ketcheson, 1996), slope, transverse radius of channel curvature and volume (Cannon, 1993), and slope and height of failure (Finlay et al., 1999)) is recommended to allow inferences about the significance of different attributes acting in different combinations. The variation of the significance of attributes depending on which combination is used can be assessed by adding new attributes to the initial set.

In this respect, this study considered a series of variables taken into account by studies performed by Greenway et al. (1984), Fannin and Rollerson (1993), Corominas (1996), Megahan and Ketcheson (1996), Turner and McGuffey (1996), Wise (1997), and Finlay et al. (1999), such as slope, azimuth, and volume. Some studies have also considered certain rheological attributes (e.g. McLellan and Kaiser (1984), Fang and Zhang (1988)), which have been used together with slope. In statistical analyses, the importance of the rheological attributes is reduced when they are used together with geomorphological attributes (Neter et al., 1996; Iverson, 1997) as rheology is considered constant along the event path. Therefore, rheological attributes (e.g. dynamic friction coefficient, uplift pressure) were not used here.

The set of attributes used in this study extends those of the studies listed above in the following categories:
• geomorphology: introduction of a succession of different slope angles along the event trajectory (path variable), terrain curvature and position on the slope;
• slope vegetation: introduction of species structure, stand characteristics (average height and diameter); and
• event geometry: depth at 1/4, 1/2 and 3/4 of the width.

There are two constraints that impede recording all the attributes: the limited time given for any study and the availability of suitable methods. Because of these two constraints it is impossible to record all attributes that describe a mass movement.
Therefore, new assumptions have to be made that are consistent with the assumption that a mass movement is explained by its attributes.

Usually, any landslide study assumes that the measured attributes represent the actual values. Although some studies recognize this to be false, subsequent calculations ignore the implications of using data that have imprecise meaning. Usually, operating with these kinds of data and ignoring their vagueness leads to unreliable results. The assumption that builds the vagueness of the attribute values into the calculation is that some attributes are measured with low accuracy.

Particle size distribution (PSD) type and grading, and soil specific weight are usually measured with low accuracy because they cannot be measured in a continuous manner along the event path. In this study, I assumed that soil granulometric properties are constant throughout individual reaches. This is known to be incorrect given the spatial variability of soils (Iverson, 1997).

The assumption that governs most landslide investigations is that the attributes do not change from the moment of occurrence until the moment when their value is measured. This study incorporates into the analysis the fuzziness represented by the evolution of a landslide following the event. The assumption is that the attributes that characterize the event do not change through time (from the occurrence until the moment of the inventory). The question that usually arises during data collection is: Has there been modification of the landslide from its occurrence until the moment when the measurement was made? If the measurements are made soon after the event occurs, this assumption tends to hold (Petley, 1984). However, if the data are collected long after the event, some of the attributes may have changed, such as the length and width of the event (Petley, 1984; Wise, 1997). This is because erosion of the landslide scar is accelerated in the absence of vegetation.
The same process also affects the sidewalls of the event, leading to the calculation of false volumes involved in the initial motion. If the soil is well-developed and the vegetation around the event path is unaffected by the landslide, the rate of erosion of the scar and sides may be low. In this study, all events had occurred within the last 20 years, and forest vegetation existed around the landslide path.

The measurements made in the field assume that the initiation point of a landslide is the point with the highest elevation. Theoretically, this is not always true as a landslide can start at a point situated below the top of the back scarp. Many factors can lead to this: natural erosion of the scar and the occurrence of a new event are two of the most common factors (Abramson et al., 1996; Cruden and Varnes, 1996). This uncertainty over the precise position of the real initiation point has to be included in the analysis. The assumption that deals with this uncertainty is that the event starts at the actual initiation point. If the event starts from a point situated below the actual initiation point then the conditions for occurrence were met in a place other than that identified as the point of initiation. This assumption states that all the causes of the slope failure are located at the initiation point. If this assumption is not met then two conclusions can be drawn:
1. the conditions of occurrence at the actual initiation point were not fulfilled; and
2. the attributes characterizing the event are incorrect (unnecessary reach(es) were considered).

The data set used in this study does not come from long-term, detailed monitoring, and therefore the assumption that the landslides started at the measured initiation point has to be made.

The last assumption is related to the trajectory of the event. From an energetic perspective, the landslide energy increases if the slope increases.
This is because the potential energy is in a one-to-one relationship with slope: if the slope increases then the energy increases, provided that no deposition of the moving material occurs. This is stated in the energy conservation equation (assuming constant mass):

$$ E = E_c + E_p = \frac{m v^2}{2} + m g \Delta h \tag{4.15} $$

where
$E$: total energy;
$E_c$, $E_p$: kinetic and potential energy;
$m$: mass in movement;
$v$: speed of the movement;
$\Delta h$: elevation change of the movement; and
$g$: gravitational acceleration.

The assumption that is made in order to meet the relationship between energy and slope is that the unconfined event path follows the greatest slope trajectory. In the case of landslides, some local elements can influence the event trajectory dramatically. This means that at the local level, the path cannot fulfill the assumption (e.g. a large rock or a tree can deflect the landslide trajectory in a direction that does not have the greatest slope). However, after such a point is passed, the trajectory always follows the steepest slope.

The set of five assumptions listed above constitutes the framework for hypothesis testing. The first assumption represents the epistemological background of the study. Its validity gives meaning to, and creates the foundation for, the mathematical tools to be applied during the data analysis. The last four assumptions are related to the physical part of the mass movement process.

Usually, techniques involving the vagueness of the data lead to low-precision models. There are two types of mathematical models that attempt to involve data uncertainties in the analysis: theories based on dual logic (Blackburn et al., 2001), and theories based on poly-logic (Lukasievicz, 1963). Theories based on dual logic assign an error to each value. A variable can have any value in the interval defined by the error, with the same likelihood. These theories have been discussed by Adcock (1877, 1878) and Kummell (1876, 1879) and later developed by Koopmans (1937) and Levin (1964).
Poly-logic theories have gained a large impetus from fuzzy set theory. The core of fuzzy sets is that the likelihood of a value can vary from zero to certainty. Fuzzy set theory tries to reduce the impact of the real data on the analysis by including a variable degree of belief for each value, in comparison to dual logic, which has only two degrees of belief (Zadeh, 1965; Nahorski, 1992).

4.9 Steps in building the debris slide-flow travel distance prediction model

4.9.1 Overview

Several steps are necessary to build a debris slide-flow travel distance prediction model. First, the attributes significantly related to event travel distance have to be established. The techniques used to identify these attributes are supplied by statistical methods. As statistical methods use variables, the attributes in the mathematical procedures used to develop the model will be termed variables.

The most desirable relationship for a precise model involves unmodified variables. Therefore a preliminary analysis was performed on the raw variables. This analysis represented a part of the hypothesis testing described in Section 4.7.

Usually, the non-transformed variables are not correlated with the debris slide-flow travel distance (Cannon, 1993; Megahan and Ketcheson, 1996; Corominas, 1996; Finlay et al., 1999; Fannin and Wise, 2001). The next step in identifying the significant variables is to identify transformations that make their correlation with debris slide-flow travel distance significant. These transformations are functions that can involve one or more variables. The most desirable case is that each function is determined by a single variable. As the functions defined for compact sets of real numbers are usually, for certain intervals, bijective (one-to-one relationship), the relationship of the raw variable to the transformed one is easy to interpret.

A series of transformations involve more than one variable and represent their interaction.
These transformations usually express more powerfully the correlation between the transformed interaction and the dependent variable. In order to identify these transformations, two difficulties have to be overcome:
• identification of the association of the variables that makes the correlation strong; and
• identification of the function that links the selected variables.

Usually, the combination of variables is dependent on the function that connects them. Information regarding the association between the variables can be drawn from the literature. If no existing information is applicable, then new associations have to be derived. Another approach is based on information provided by plotting the variables on scatter graphs. Building a new composite variable consisting of a combination of other transformed variables is one of the most difficult actions in modeling. The significance of the transformed variables is determined in the same way as for the raw ones.

The debris slide-flow travel distance prediction model can include raw or transformed variables. The significance of the travel distance equation should be tested using the same procedure as for the raw variables. The equation must be tested for outliers and for influential cases. As the model will be used for further inferences, a series of assumptions have to be met (Neter et al., 1996). These assumptions are related to the normal distribution of the errors, homoscedasticity and the absence of correlated errors.

The next stage in the development of the model consists of selecting the significant variables for the significant model, using stepwise, backward or forward methods.

The significant variables supplied by the selection procedures described above were used in the fuzzy regression analysis. Fuzzy regression analysis is based on triangular fuzzy numbers (Tanaka et al., 1982). To use fuzzy regression, the variables supplied by the regression analysis using crisp sets have to be fuzzified.
The fuzzification process is based on the limitations of human operators and devices and the variation through time of the variables. The resulting model is based on the non-linear programming of Tanaka et al. (1982), later improved by Tanaka et al. (1989), Heshmaty and Kandel (1985), Wang and Li (1990), Bardossy (1990), Nather and Albrecht (1990), Savic and Pedrycz (1991), Fruhwirth-Schnatter (1992), Wang and Ha (1992), Tanaka and Ishibuchi (1992), Watanabe and Imaizumi (1993), Romer and Kandel (1995), and Hong et al. (2001).

The two models (crisp and fuzzy) were tested on a data set different from that used in the modeling process. This constituted the model's validation (Snee, 1977). For each case, the confidence intervals for the predicted values were calculated. The final model was the one with the smallest confidence interval for the predicted debris slide-flow travel distance.

4.9.2 Selecting the significant variables

The process of selecting significant variables was based on the significance of the linear regressions. A simple linear regression was performed for each pair of variables: debris slide-flow travel distance and predictor variable. If the regression was significant, the relationship between the variable and the debris slide-flow travel distance was significant. Consequently that variable was introduced into the final model as a predictor variable. This process was identical for each variable, raw or transformed. The significance of the regressions represented the mathematical evidence to support or reject the hypotheses presented in Section 4.7.

The least squares method (LSM) was used to find the relationship between the dependent and independent variables. This method was preferred as it provides extremely good results if all the assumptions are fulfilled (Ciucu, 1963; Mihoc and Firescu, 1966). The Gauss-Markov theorem ensures that the estimators are unbiased and have minimum variance when using LSM (Neter et al., 1996).
The mathematical formulas that were used are presented in Neter et al. (1996, section 2.7 and appendices A5-A6). The significance level was established as α = 0.05. The significance level used in past debris flow studies has varied from α = 0.2 (Wise, 1997) to α = 0.05 (Megahan and Ketcheson, 1996).

The raw variables were first tested for significant correlation with event travel distance. The raw variables were then individually transformed and the significance of their relationship with event travel distance was tested. The last step was to build transformed variables that included more than one independent variable. The type of transformation was suggested by plotting the debris slide-flow length against each variable. The significant variables were then used to build the debris slide-flow travel distance prediction model using crisp sets.

4.9.3 Identifying and testing the model using crisp sets

The correlations between dependent and independent variables were tested using all 38 observations. The data set was divided into a data subset used to build the model, called the estimation data, and a data subset used to test the model, called the prediction data. To split the data set, its size must be larger than 2p+25, where p is the number of variables used in the model (Snee, 1977). As there were 38 events, the regression equation cannot include more than seven predictor variables. The size of the estimation data subset is recommended to be larger than p+15 (Snee, 1977). The remaining data represented the prediction data subset.

The data set identification was based on the significance analysis of the raw variables. The method recommended by Snee (1977) was applied, with some modifications. These were based on the idea that the prediction data subset has to be as different as possible from the estimation data subset.
The overlap between the two data subsets can only be related to the dependent variable; this means that the debris slide-flow travel distance for the prediction subset has to be within the range of the estimation data subset.

The transformed and significant raw variables were all used in the multiple regression analysis. The predictor variables for the regression equation were chosen based on their individual significance in relation to debris slide-flow travel distance as well as on a subjective interpretation of the information available in the literature. The significant independent variables for the multiple regression equation were selected using three procedures: stepwise, backward and forward selection. If these procedures supplied the same results then the model for predicting debris slide-flow travel distance was represented by the resulting regression equation.

Inference based on multiple regression results requires that a series of statistical assumptions be met. The model therefore had to be adjusted to fulfill these assumptions. This could lead to the inclusion of variables that were not significantly correlated with the debris slide-flow travel distance but which resulted in a model that met the assumption requirements. The three assumptions that have to be met by the regression equation are that the errors are normally distributed, that they show homoscedasticity, and that the observations are independent. In addition to these assumptions, a series of conditions have to be met. The model should not be affected by multicollinearity and there should be no outliers of the dependent variable or the independent variables. These assumptions and requirements were tested, always with a significance level of α = 0.05 (the lowest reported in the literature on terrain stability).
4.9.3.1 Normal distribution of the errors

This condition has to be fulfilled to make inferences based on the regression equation coefficients and also on the confidence intervals of the predicted value. Several tests can be applied to check if the assumption of normality is violated, with the most commonly used being the Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises, and Anderson-Darling tests. The tests are presented in Conover (1999). Emphasis in this analysis was placed on the Kolmogorov-Smirnov and Shapiro-Wilk tests (Craiu, 1998; Conover, 1999).

4.9.3.2 Errors involving homoscedasticity

Homoscedasticity means that the error variance is the same for all observations in the sample, and therefore that the variance of the dependent variable is the same for all observations in the sample. If heteroscedasticity is present, the spread of the predicted variable depends on the values of the predictor variables (Rutemiller and Bowers, 1968; Breusch and Pagan, 1979). In this case, even if the regression coefficients are unbiased and consistent, they are not efficient (i.e. they do not have minimum standard error). The presence of heteroscedasticity was determined using White's test (White, 1980), with a significance level of α = 0.05.

4.9.3.3 Errors are not correlated

If the errors are serially correlated then the least squares procedure, used to estimate the regression equation coefficients, has the following undesirable consequences (Berk, 1977; Bernstein et al., 1988; Neter et al., 1996).
The regression equation coefficients do not have the minimum variance (from the Gauss-Markov theorem), the error variance can be underestimated by the mean square error (MSE), the confidence intervals based on the t distribution are no longer strictly applicable, the tests involving the F distribution cannot be used, and the variance of the regression coefficients determined using the least squares method (LSM) can underestimate the real variance of the estimated regression coefficients.

First order error serial correlation was considered. This means that the equations to be tested were:

$$ \begin{aligned} L &= a_0 + \sum_{i} a_i X_i + \varepsilon_j && \text{(initial regression equation)} \\ \varepsilon_j &= \rho\, \varepsilon_{j-1} + u_j && \text{(first order serial correlation)} \end{aligned} \tag{4.16} $$

where
$L$: dependent variable (debris flow travel distance);
$X_i$: independent variables;
$\varepsilon_j$: error of the $j$th observation;
$a_0$, $a_i$: estimated regression equation coefficients;
$u_j$: error term for $\varepsilon_j$; and
$\rho$: parameter expressing the serial correlation of the errors.

The test used to check whether the errors were serially correlated was described by Durbin and Watson (1950) and Durbin (1970). The Durbin-Watson statistic, $d$, is calculated from:

$$ d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} \tag{4.17} $$

where $e_i$: error of the $i$th observation.

The test is based on the distribution of the Durbin-Watson statistic, $d$. The Durbin-Watson bounds depend on the level of significance (already established at α = 0.05), the number of observations and the number of predictor variables. If $d$ determined using the above calculation is larger than the upper Durbin-Watson bound then the errors are independent; if $d$ is smaller than the lower bound then the errors can be correlated; and if $d$ is between the upper and lower bounds the test is inconclusive. If the test was inconclusive then the Theil-Nagar test was used (Theil and Nagar, 1961).

4.9.3.4 Model multi-collinearity

The multi-collinearity test is used to detect one or more collinear relations among the set of predictor variables.
Multi-collinearity is extremely problematic in regression analysis, leading to local solutions instead of global ones. Some of the problems caused by multi-collinearity are presented in Figure 4.10.

Figure 4.10 Exact collinearity. The coefficients of the regression are undetermined. A change of the plane of the regression parameters does not change the error sum of squares (Belsley et al., 1980).

A series of studies have developed techniques to detect the subset of independent variables involved in collinear relations. Chatterjee and Price (1977) used ridge regression and the variance inflation factor (VIF). The VIF is used in determining the biasing constant used in ridge regression. The equation for VIF is:

$$VIF_k = (1 - R_k^2)^{-1} \tag{4.18}$$

where
$VIF_k$: VIF for the independent variable $k$;
$R_k^2$: coefficient of multiple determination when $X_k$ is regressed on the other $p-1$ independent variables in the model;
$p$: number of independent variables; and
$k$: a natural number from 1 to $p$.

Neter et al. (1996) recommend 10 as a cut-off value for VIF. If VIF > 10, multi-collinearity is present, and least squares is not the best method to use in determining the regression coefficients.

4.9.3.5 Identifying and assessing the outliers

Outliers are events clearly separated from the rest of the data. Outliers often involve large residuals and can therefore have a strong impact on regression equation estimation if the least squares method is used. These events have to be studied individually, so that a decision can be made as to whether they should be retained or eliminated from the analysis.

Studentized-deleted residuals, based on single row effect investigation (Belsley et al., 1980), were used to identify outlying values of the dependent variable. The equation used to calculate this statistic was:

$$Student_i = e_i \left[ \frac{n-p-1}{SSE\,(1-h_{ii}) - e_i^2} \right]^{0.5} \tag{4.19}$$

where
$e_i$: the $i$th error;
$SSE$: sum of squares of errors; and
$h_{ii}$: the $i$th diagonal element of the hat matrix $H = X(X'X)^{-1}X'$.

Studentized-deleted residuals were used because they have equal variances (homoscedasticity) and can be easily related to the t distribution. However, the Studentized-deleted residuals, $Student_i$, can conceal some influential data that have relatively small values for residuals.

To assess the Studentized-deleted residuals, their relationship with the t distribution was used because $Student_i$ follows the t distribution with $n-p-1$ degrees of freedom. The Bonferroni test was used and therefore the critical value was $t_{1-\alpha/(2n);\,n-p-1}$. The significance level was $\alpha = 0.05$. If $Student_i < t_{1-\alpha/(2n);\,n-p-1}$ then the $i$th observation is not an outlier of the dependent variable.

Outliers can also be identified in the set of independent variables. Two tests were used to determine if an individual observation was an outlier of the independent variables: hat matrix leverage (H) and COVRATIO.

The hat matrix leverage statistic, $h_{ii}$, supplies an indication regarding the outliers in the independent variables. The cut-off value for hat matrix leverage is $2p/n$, where $p$ is the number of independent variables and $n$ is the number of observations (Belsley et al., 1980). If $h_{ii} < 2p/n$ then the $i$th observation is not an outlier of the independent variables.

The COVRATIO statistics are determined using Equation 4.20.

$$COVRATIO = \frac{\det\left(s_{(i)}^2\,(X_{(i)}'X_{(i)})^{-1}\right)}{\det\left(s^2\,(X'X)^{-1}\right)} \tag{4.20}$$

where
$s_{(i)}^2$: errors variance estimated after deleting the $i$th observation;
$s^2$: errors variance; and
$X_{(i)}$: $X$ matrix without the $i$th observation.

The cut-off value proposed by Belsley et al. (1980) is:

$$|COVRATIO - 1| = \frac{3p}{n} \tag{4.21}$$

If $|COVRATIO - 1| < 3p/n$ then the $i$th observation is not an outlier of the independent variables.

The influence of all the events identified as possible outliers on the regression equation estimates needs to be assessed.
An influential observation is one whose exclusion from the data set would cause major changes to the regression equation estimates. The tests carried out to assess the influence of an individual observation are based on its omission from the data set. The tests used are:

• DFFITS - the difference between the fitted value $\hat{y}_i$ for the $i$th event when the whole data set is used in fitting the regression function and the predicted value $\hat{y}_{i(i)}$ for the $i$th event obtained when the $i$th event is omitted in fitting the regression function (Belsley et al., 1980; Neter et al., 1996);
• Cook's distance measure (Cook, 1977, 1979; Belsley et al., 1980; Neter et al., 1996); and
• DFBETAS - the difference between the corresponding estimated regression coefficients determined using the whole data set and when the $i$th event is omitted (Belsley et al., 1980; Neter et al., 1996).

The DFFITS statistic is calculated using the formula (Belsley et al., 1980; Neter et al., 1996):

$$DFFITS_i = e_i \left[ \frac{n-p-1}{SSE\,(1-h_{ii}) - e_i^2} \right]^{0.5} \left( \frac{h_{ii}}{1-h_{ii}} \right)^{0.5} = Student_i \left( \frac{h_{ii}}{1-h_{ii}} \right)^{0.5} \tag{4.22}$$

The general cut-off value for DFFITS is $2\sqrt{p/n}$ (Belsley et al., 1980) or 1 (Neter et al., 1996). If $DFFITS < 1$ or $DFFITS < 2\sqrt{p/n}$ then there are no individual events that have a significant influence on the regression.

Cook's distance measure considers the influence of the $i$th event on all predicted events from the data set. Cook's distance, $D_i$, is calculated using the formula (Cook, 1979; Belsley et al., 1980; Neter et al., 1996):

$$D_i = \frac{e_i^2}{p\,s^2} \cdot \frac{h_{ii}}{(1-h_{ii})^2} \tag{4.23}$$

Cook's distance measure, $D_i$, is related to the F distribution. The value of $D_i$ is compared to the corresponding percentile value of $F_{p,\,n-p}$. When the percentile value is less than 20%, then the $i$th observation has only a small influence on the fitted values (Neter et al., 1996). If $\alpha$, determined using the formula $F(\alpha;\, p,\, n-p) = D_i$, is smaller than 0.2, then there is no evidence of a particular event having a significant influence on the fit of the regression equation.
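The single-observation diagnostics of Equations 4.19, 4.22 and 4.23 can be illustrated with a minimal sketch. It assumes a simple linear regression with one predictor, so that the leverage has the closed form $h_{ii} = 1/n + (x_i - \bar{x})^2/S_{xx}$, and it takes $p$ as the number of regression parameters (intercept plus slope), following Neter et al. (1996). The function name and the sample data are hypothetical and are not taken from the study.

```python
import math


def influence_diagnostics(x, y):
    """Leverage, Studentized-deleted residual, DFFITS and Cook's distance
    for a simple linear regression y = b0 + b1 * x (illustrative only)."""
    n = len(x)
    p = 2  # assumption: p counts regression parameters (intercept + slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    errors = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in errors)
    mse = sse / (n - p)

    rows = []
    for xi, e in zip(x, errors):
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx            # hat matrix diagonal
        student = e * math.sqrt((n - p - 1) / (sse * (1 - h) - e ** 2))  # Eq. 4.19
        dffits = student * math.sqrt(h / (1 - h))        # Eq. 4.22
        cook = e ** 2 / (p * mse) * h / (1 - h) ** 2     # Eq. 4.23
        rows.append({"leverage": h, "student": student,
                     "dffits": dffits, "cook": cook})
    return rows


# Hypothetical data: the last observation departs from the linear trend.
x_values = [1, 2, 3, 4, 5, 6, 7, 8]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 30.0]
diagnostics = influence_diagnostics(x_values, y_values)
most_influential = max(range(len(diagnostics)),
                       key=lambda i: abs(diagnostics[i]["dffits"]))
```

In this made-up data set the last observation has both the largest DFFITS and the largest Cook's distance, which is the pattern the cut-off values above are meant to flag.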
The DFBETAS statistic is the scaled measure of the change in each parameter estimate (Belsley et al., 1980; Neter et al., 1996). DFBETAS is calculated by deleting the $i$th observation using the equation:

$$DFBETAS_{j(i)} = \frac{b_j - b_{j(i)}}{s_{(i)} \sqrt{(X'X)^{-1}_{jj}}} \tag{4.24}$$

where
$(X'X)^{-1}_{jj}$: the $j$th diagonal element of $(X'X)^{-1}$; and
$b_{j(i)}$: $j$th coefficient calculated deleting the $i$th observation.

Different authors recommend distinct cut-off values in the DFBETAS assessment. Belsley et al. (1980) recommended a cut-off value of $2/\sqrt{n}$ and Neter et al. (1996) recommend a cut-off value of 1 for small- to medium-size data sets. If DFBETAS is smaller than the selected cut-off value, then the $i$th observation is not an influential case.

If an outlier that is also influential is not obviously erroneous it requires further examination. Such cases provide useful information about the adequacy of the model. The elimination from the analysis of outlying influential observations that are not clearly erroneous should be done rarely and with caution. The final decision over the data set structure is made using robust regression (Rousseeuw and Leroy, 1987; Hoaglin et al., 1985; Neter et al., 1996). The robust regression procedure diminishes the influence of the outlying events when LSM is used. Of the many procedures developed for robust regression, this study used the iteratively re-weighted least squares method. The weight function used is the bi-square function, $w$ (Rousseeuw and Leroy, 1987; Cleveland and Devlin, 1988), expressed by:

$$w(u) = \begin{cases} \left[ 1 - \left( \dfrac{u}{4.685} \right)^2 \right]^2 & |u| \le 4.685 \\ 0 & |u| > 4.685 \end{cases} \tag{4.25}$$

where $u$ is calculated using the Beaton-Tukey (1974) method.

Robust regression is an iterative process. The iterations stop when convergence of the assigned weights is obtained. Cases whose final weights are relatively small are considered outlying, and therefore require investigation.
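As a rough illustration of the weighting step in Equation 4.25, the sketch below computes bi-square weights for a set of residuals. The scaling of the residuals is an assumption made only for this example: each residual is divided by the median absolute deviation (MAD) over 0.6745, a common choice in iteratively re-weighted least squares, and is not necessarily the scaling used in this study. The function names and residual values are hypothetical.

```python
import statistics


def bisquare_weight(u, c=4.685):
    """Bi-square weight of Equation 4.25 for a scaled residual u."""
    if abs(u) > c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def scaled_residuals(residuals):
    """Assumption: scale residuals by MAD / 0.6745 (not taken from the thesis)."""
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals]) / 0.6745
    return [r / mad for r in residuals]


# Hypothetical residuals from one iteration; the last one is far from the rest.
residuals = [0.5, -0.3, 0.1, -0.2, 0.4, 12.0]
weights = [bisquare_weight(u) for u in scaled_residuals(residuals)]
```

In a full iteratively re-weighted fit these weights would feed a weighted least squares step, and the two steps would repeat until the weights stop changing. Here the last residual receives a weight of zero, so it would be flagged for investigation.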
If there was no evidence that the outlying events had to be dropped from the data set, they were kept and the final regression equation built using crisp sets was the one supplied by the robust regression.

4.9.4 Regression function assessment in relation to the debris flow ending point

Events that ended in a stream are usually assessed separately from those that did not. The new variable that quantifies the path in relation to terrain morphology solved this problem. Using the new approach, the two cases were studied together. However, a separate analysis was performed to show the validity of combining the data. This was done by first identifying the regression equations modeling each type of event, i.e. one equation for events that ended in a stream and one equation for events that did not. As the goal of this method was to check the equation modeling both types of events, the two functions that predicted debris flow travel distance for events that ended and did not end in streams used the same set of predictor variables supplied by the final model for crisp sets from Section 4.9.4.2.

4.9.4.1 Regression function for events that end/do not end in a stream

The first step in the analysis was to test the significance of each equation built for events that ended/did not end in streams using the above variables. The hypotheses to be tested initially were:

H0: There is no significant relationship between the debris flow travel distance (for events that end in a stream/do not end in a stream) and the predictor variables of the model built using crisp sets;

H1: There is a significant relationship between the debris flow travel distance (for events that end in a stream/do not end in a stream) and the predictor variables of the model built using crisp sets.

These hypotheses were tested using the F distribution, as described in Section 4.9.2.
This test requires that the assumptions of homoscedasticity of the errors, normal distribution of the errors, and independence of observations are met. These assumptions were checked using the same methods as in Section 4.9.3. The models obtained for each of the two cases were also checked for outliers using robust regression, as described in Section 4.9.3.5.

4.9.4.2 Inferences about the two regression functions

The Fisher distribution was used to test the difference between the two regression equations obtained in Section 4.9.4.1. A modified Levene test (Levene, 1960) can be used to verify whether the variances of the error terms for the two models are equal, as it does not depend on the error terms being normally distributed. The absolute errors were used for each regression equation. The elements needed to perform the Levene test are:

$$d_{i1} = |e_{i1} - \bar{e}_1| \qquad d_{i2} = |e_{i2} - \bar{e}_2| \tag{4.26}$$

where
$e_{i1}$: errors of the regression function built from events that did not end in a stream;
$e_{i2}$: errors of the regression function built from events that ended in a stream;
$\bar{e}_1$, $\bar{e}_2$: error means for the regression functions of the events that did not end/ended in a stream; and
$n_1$, $n_2$: number of events that did not end/ended in a stream, with $n_1 + n_2 = n$.

The pooled variance $s^2$ is determined with the equation:

$$s^2 = \frac{\sum (d_{i1} - \bar{d}_1)^2 + \sum (d_{i2} - \bar{d}_2)^2}{n_1 + n_2 - 2} \tag{4.27}$$

The Levene test is based on the t distribution and the t statistic, which is calculated from:

$$t_L = \frac{\bar{d}_1 - \bar{d}_2}{s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \tag{4.28}$$

The critical t value for $\alpha = 0.05$ and $n-2$ degrees of freedom (DF) is $t_{1-\alpha/2;\,n-2}$. If $|t_L| < t_{critic}$ the null hypothesis cannot be rejected and there is no difference between the error variances of the two regression lines.

The assessment of the difference between the two regression functions is based on the class variable technique (Neter et al., 1996), the set of two equations being identified with a class variable. Using this concept, a new equation can be built:

$$y_i = f_1(X_i) + c_i\, f_2(X_i) + \varepsilon_i = \sum_{i} a_{i1} x_{i1} + c_i \sum_{i} a_{i2} x_{i2} + \varepsilon_i = \sum_{i} a_{1i} x_i + \sum_{i} a_{2i} x_i + \varepsilon_i \tag{4.29}$$

where
$y_i$: debris flow travel distance;
$f_1$: regression function for the events that did not end in a stream (in terms of variables);
$f_2$: regression function for the events that ended in a stream (in terms of variables);
$X_i$: set of independent variables;
$c_i$: a variable defined in the following way:

$$c_i = \begin{cases} 1 & \text{if the event ends in the stream} \\ 0 & \text{if the event does not end in the stream} \end{cases}$$

$\varepsilon_i$: error term; and
$a_{1i} = a_{i1}$ and $a_{2i} = c_i \cdot a_{i2}$.

The significance of the regression is ensured by the variable selection. The level of significance for the test is established at $\alpha = 0.05$. The F test is used and the associated DF are $p_2 + 1$ for the numerator and $n - p_1 - p_2 - 2$ for the denominator, where $p_1$ and $p_2$ are the number of variables in the regression function for the events that did not end in a stream ($p_1$) and ended in a stream ($p_2$). If the F-value determined using the hypotheses' constraints satisfies $F_{calc} < F_{1-\alpha;\; p_2+1;\; n-p_1-p_2-2}$ then the functions describing the events that ended/did not end in a stream are not significantly different. If this is the case, then a single function can be used to explain both types of events.

Information about the difference between the functions that model the travel distance for events that ended/did not end in a stream can be obtained by checking the significance of the corresponding coefficients in the model mixing the functions for the two types of events and from a summary of the statistics for each type of event.

This test was the final assessment in the process of regression building using crisp sets. The final debris flow travel distance prediction model was represented by the function fulfilling all the assumptions and conditions described above.

4.9.5 Regression model built using fuzzy sets

The goal of a regression analysis is to describe how the dependent variable is related to the independent variables.
Because the relationship is investigated using a data set, the regression rarely describes the precise relationship between the variables (Bardossy, 1990). There are two causes of modeling errors: omitting independent variables and choosing the wrong regression function (linear, logarithmic, etc.).

The reason that fuzzy regression is used in modeling natural processes such as landslides is related to the uncertainty of the data (e.g., the soil granulometry). The regression parameters can be crisp sets as the uncertainties are related only to the data values. This means that the modeling process involves fuzzy data and crisp parameters: data with uncertain values and parameters with precise values. The use of fuzzy sets partially addresses the second cause of modeling errors, choosing the wrong regression function (Klir and Yuan, 1995). The fuzzy approach can therefore solve two problems simultaneously, namely uncertainties associated with the data values and lack of fit caused by using the wrong regression function. The fuzzy set theory used to develop the landslide travel distance model is presented in Appendix 3.

4.9.5.1 Fuzzifying the variables from the final crisp set model

The variable fuzzification is based on three aspects of the data. Firstly, the variable can vary through time, from the event occurrence to the moment of measurement. This change over time needs to be considered in the model as the initial values are not known exactly. Secondly, the data are collected directly (by walking the length of the event) or by using instruments such as a clinometer or compass. Neither method provides exact results because of the limitations imposed by human error and by the instruments. Thirdly, the debris flow measurements are inaccurate. The process of data collection using maps or aerial photographs can lead to incorrect values (Robinson et al., 1999).
Considering these three points, the fuzzification can be done as percentages around the measured value or by adding or subtracting a certain value from the measured value. Possible variation through time of the variables and difficulties in data collection (if maps are used) have to be considered in order to express reality.

As the fuzzy sets used are symmetrical triangular fuzzy numbers (Appendix 3), the fuzzification process has to follow a series of rules (Zadeh, 1965; Klir and Yuan, 1995; Nguyen and Walker, 2000). The vagueness of a value has to be the same in both directions: larger or smaller. The likelihood of any value around the measured one must follow a linear function. The fuzzified variables must be used in the regression analysis, not the crisp variables. The process of crisp variable fuzzification is based on the Zadeh extension principle (see Appendix 3).

The fuzzification process permits as large a spread as possible for the predictor variables and as small a spread as possible for the predicted variable. The arithmetic of fuzzy sets reveals that if the spread of the predictor variables is large, that of the predicted one is larger. Following a comparison with the confidence intervals supplied by the crisp sets, the spread of the dependent variable was established as 20% (i.e. +/- 20%) of its value. This ensures a level of confidence of the model of at least 80% of the predicted debris flow travel distance. The spread of the independent variables is determined in relation to the maximum variation of a variable due to error in recording, limitations of the instruments used in data recording and evolution of the debris flow from its occurrence until the time of data collection, and by ensuring that the mathematical requirements of the fuzzy regression calculations are fulfilled. The actual spread of the value assigned to each independent variable is determined as a percentage of its crisp value and is presented in Section 5.5.
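The fuzzification rules above (equal vagueness in both directions, linear membership around the measured value, spread expressed as a percentage of the crisp value) can be sketched as a small data structure. This is only an illustration: the class name, the helper function and the 250 m travel distance are hypothetical, and only the 20% spread for the dependent variable comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Symmetric triangular fuzzy number: membership is 1 at the center and
    falls linearly to 0 at center - spread and center + spread."""
    center: float
    spread: float

    @property
    def lower(self):
        return self.center - self.spread

    @property
    def upper(self):
        return self.center + self.spread

    def membership(self, value):
        if self.spread == 0:
            # A zero spread degenerates to a crisp value.
            return 1.0 if value == self.center else 0.0
        return max(0.0, 1.0 - abs(value - self.center) / self.spread)


def fuzzify(crisp_value, percent):
    """Spread taken as a percentage of the measured (crisp) value."""
    return TriangularFuzzyNumber(crisp_value, abs(crisp_value) * percent / 100.0)


# Hypothetical measured travel distance of 250 m with the +/- 20% spread.
travel_distance = fuzzify(250.0, 20.0)
```

For this example the support of the fuzzy number runs from 200 m to 300 m, and a value of 275 m has a membership of 0.5.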
Fuzzified variables were used in the fuzzy regression analysis.

4.9.5.2 Fuzzy regression analysis

The fuzzy regression method is presented in Appendix 3. The variables used in the regression analysis were supplied by the fuzzified variables of the final regression equation using crisp sets. The non-linear programming set of equations used in fuzzy regression estimation includes, besides mixed transformed variables (variables built as a product of two or more individual variables), the individual variables compounding the mixed variables (i.e. together with the mixed variable slope * stand height, slope and stand height will also appear separately).

The equations proposed by Tanaka et al. (1982), Tanaka and Ishibuchi (1992) and Klir and Yuan (1995), which are the basis of the fuzzy linear regression analysis of this study, are expressed as:

$$\text{minimize} \quad \sum_{j=1}^{m} \left| s_j - \sum_{i=1}^{n} |a_i|\, s_{ij} \right| \tag{4.30}$$

subject to

$$-\sum_{i=1}^{n} |a_i|\, s_{ij} + \sum_{i=1}^{n} a_i x_{ij} \le y_j + s_j$$
$$\sum_{i=1}^{n} |a_i|\, s_{ij} + \sum_{i=1}^{n} a_i x_{ij} \ge y_j - s_j \qquad \text{for all } j \in \{1, \ldots, m\} \tag{4.31}$$

where
$y_j$: dependent variable (debris flow travel distance);
$x_{ij}$: independent variables;
$a_i$: regression equation coefficients (crisp numbers);
$n$: the number of variables;
$m$: the number of events studied;
$s_{ij}$: spread of the fuzzy $i$th variable from the $j$th event; and
$s_j$: the spread of run-out distance for the $j$th event.

The membership function (Zadeh, 1965) does not appear in these equations. This is because the minimum value of the membership function in the Tanaka et al. (1982) model is 0. This implies that the degree of certainty associated with each predicted value is larger than 0. The goal of the model is to provide as large a degree of certainty as possible (for $\mu = 1$, fuzzy sets become crisp sets). As the study has an exploratory character, the minimum degree of certainty was established as 10%. The degree of certainty associated with a certain value can be understood as the equivalent of the significance level from crisp set theory.
The degree of certainty $\mu = 0.1$ is larger than the significance level $\alpha = 0.05$, as the objective of the study is to produce a model with a small confidence interval and with a high degree of certainty. The comparison between the models, built using crisp or fuzzy sets, is made by selecting the same value for the significance level and degree of certainty, $\alpha = \mu = 0.05$.

Two more groups of constraints were added to the above equations. The first group of equations restricts the confidence interval to values that are strictly positive. This constraint does not exist in the model built on crisp sets, whose confidence interval could include negative numbers. The mathematical expression of this condition is:

$$\sum_{i=1}^{n} a_i x_{ij} - \sum_{i=1}^{n} |a_i|\, s_{ij} > 0 \tag{4.32}$$

where the symbols are as above.

The second group of conditions forces the predicted value to be greater than the lower bound of the measured fuzzified debris flow travel distance. The conditions imposed by Tanaka et al. (1982) ensure the overlapping of the measured and predicted fuzzified values for the dependent variable (Figures 3, 4 and 5 from Appendix 3). The condition expressed in Equation 4.35 increases the overlapping area. In geometrical terms the condition is represented in Figure 4.11.

Figure 4.11 Forcing the predicted value to be larger than the lower bound of the fuzzified measured value.

In mathematical terms this condition can be represented by:

$$\sum_{i=1}^{n} a_i x_{ij} - L_{left} > 0 \tag{4.33}$$

where $L_{left}$: lower bound of the measured fuzzified value.

If the predicted value is larger than the upper bound of the fuzzified measured value, as in Figure 4.12, then a similar condition states that:

$$\sum_{i=1}^{n} a_i x_{ij} - L_{right} < 0 \tag{4.34}$$

where $L_{right}$: upper bound of the measured fuzzified value.

Figure 4.12 Forcing the predicted value to be smaller than the upper bound of the fuzzified measured value.
As the two conditions described above are extremely restrictive, leading to a confidence interval for the predicted value smaller than 40% of the event's travel distance, there is usually no feasible solution for the whole set of equations. As a result, only one of the above two sets of conditions can be selected in the modeling process, expressed by:

$$\sum_{i=1}^{n} a_i x_{ij} - L_{left} > 0 \tag{4.35}$$

There are two reasons for selecting this condition. Firstly, from the point of view of risk it is better to overestimate the travel distance than to underestimate it. This means that the predicted travel distance should be larger than the lowest possible value determined for the travel distance. Secondly, the lower boundary condition ensures that the predicted value can have only positive values.

The complete set of functions involved in the debris flow travel distance prediction model using fuzzy sets is:

$$\text{minimize} \quad \sum_{j=1}^{m} \left| s_j - \sum_{i=1}^{n} |a_i|\, s_{ij} \right| \tag{4.36}$$

subject to

$$-\sum_{i=1}^{n} |a_i|\, s_{ij} + \sum_{i=1}^{n} a_i x_{ij} \le y_j + s_j = L_{right}$$
$$\sum_{i=1}^{n} |a_i|\, s_{ij} + \sum_{i=1}^{n} a_i x_{ij} \ge y_j - s_j = L_{left} \qquad \text{for all } j \in \{1, \ldots, m\} \tag{4.37}$$

$$\sum_{i=1}^{n} a_i x_{ij} - \sum_{i=1}^{n} |a_i|\, s_{ij} > 0 \tag{4.38}$$

$$\sum_{i=1}^{n} a_i x_{ij} - L_{left} > 0 \tag{4.39}$$

The parameters supplied by the above restrictions represent the coefficients of the final model used in the debris flow travel distance prediction based on fuzzy set theory. The formula used for the confidence interval, CI, of the predicted value was:

$$CI = \sum_{i=1}^{n} a_i x_{ij} \pm \sum_{i=1}^{n} |a_i|\, s_{ij} = \hat{y}_j \pm \sum_{i=1}^{n} |a_i|\, s_{ij}$$

4.9.6 Regression assessment

The performance of the models determined using crisp or fuzzy sets needs to be assessed. As described in Section 4.9.3, the data set was split into two subsets, one for estimation and the other for prediction. The function coefficients were determined using the estimation data subset and the model was assessed using the prediction data subset. The prediction data subset also included the events identified as outliers by the statistical analysis on the crisp sets and eliminated from the estimation data subset.
The predicted values for the debris flow travel distances were calculated using the models supplied by both the regression analysis on crisp sets and the regression analysis on fuzzy sets. The confidence intervals for the values determined using crisp sets were:

$$\hat{y}_i \pm t_{1-\alpha/2;\,n-p}\; s_{predict}$$

$$s_{predict}^2 = MSE \cdot \left( 1 + X_i' (X'X)^{-1} X_i \right) \tag{4.40}$$

where
$\hat{y}_i$, $X_i'$, $X'$, $X$, $X_i$ are as above;
$s_{predict}^2$: variance for the predicted value; and
$MSE$: mean square error for the multiple regression equation.

For fuzzy sets, the confidence intervals were expressed by:

$$CI = \hat{y}_j \pm \sum_{i=1}^{n} |a_i|\, s_{ij}$$

If the actual values fell within the confidence interval predicted by the models then the functions proposed to predict the debris flow travel distance were considered as being correctly estimated. If the proportion of events that were incorrectly estimated was larger than the selected significance level, then the models were considered to be poor. In such an event, the regression analysis would need to be repeated, starting with the selection of the transformed variables.

The events that did not fall into the confidence interval had to be analyzed individually. Information from analysis of the wrongly predicted events was used to improve the performance of the model. If the analysis indicated that such a step was necessary, some adjustment to the functions used in the prediction had to be made. A regression assessment using a completely different data set represented the final test in the process of model validation.

The final model was chosen in relation to two features. If the data were considered to be accurate then the crisp set model was used, provided that its CI was smaller than that supplied by the fuzzy set function. The selected model was the one with the smaller confidence interval, except when the data accuracy was reduced, in which case the fuzzy model was used regardless of the confidence interval.
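The fuzzy confidence interval and the coverage check described above can be sketched as follows. This is an illustrative outline only: the function names, coefficients, spreads and travel distances are hypothetical, and an intercept can be handled by including a variable fixed at 1 with zero spread.

```python
def fuzzy_interval(coefficients, centers, spreads):
    """Interval sum(a_i * x_i) +/- sum(|a_i| * s_i) for one event."""
    predicted = sum(a * x for a, x in zip(coefficients, centers))
    half_width = sum(abs(a) * s for a, s in zip(coefficients, spreads))
    return predicted - half_width, predicted + half_width


def miss_rate(actual_values, intervals):
    """Proportion of events whose measured value falls outside its interval."""
    misses = sum(1 for y, (low, high) in zip(actual_values, intervals)
                 if not low <= y <= high)
    return misses / len(actual_values)


# Hypothetical coefficients and two hypothetical events from a prediction subset.
coefficients = [2.0, -1.0]
event_intervals = [
    fuzzy_interval(coefficients, [10.0, 4.0], [1.0, 0.5]),
    fuzzy_interval(coefficients, [12.0, 6.0], [1.0, 0.5]),
]
observed = [14.0, 25.0]
rate = miss_rate(observed, event_intervals)
model_is_poor = rate > 0.05  # compared with the selected significance level
```

With these made-up numbers the first event falls inside its interval and the second does not, so the miss rate of 0.5 exceeds the 0.05 level and the model would be sent back for re-analysis.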
The selection process for the model that best fitted the data and the data accuracy is presented in Figure 4.13.

DATA
• Accurate data -> high degree of certainty associated with each value -> model determined using crisp sets has CI smaller than model using fuzzy sets -> use model determined using crisp set theory
• Inaccurate data -> low degree of certainty associated with each value -> model determined using crisp sets has CI larger than model using fuzzy sets -> use model determined using fuzzy set theory

Figure 4.13 Flow chart for selecting the prediction function.

4.10 Risk assessment

After establishing the model used to determine the travel distance of the event, the risk was assessed as follows:

a) identification of the initiation point of the event;
b) prediction of the event travel distance and its confidence interval;
c) a check of the position of the element at risk (E):
   i. if the distance between the debris flow initiation point and the position of the element at risk was smaller than the upper limit of the confidence interval, the vulnerability (V) was considered to be 100%, with the significance level or degree of certainty established;
   ii. if the distance between the debris flow initiation point and the position of the element at risk was greater than the upper limit of the confidence interval, the vulnerability was considered to be 0%, with the significance level or degree of certainty established;
d) determination of the risk (R) using the following equation (Covello and Merkhofer, 1993; Mapping and Assessing Terrain Stability Guidebook, 1999):

$$R = P \times C = P \times E \times V = \begin{cases} P \times E & \text{if the distance to the element at risk is smaller than the upper CI limit of the travel distance} \\ 0 & \text{if the distance to the element at risk is larger than the upper CI limit of the travel distance} \end{cases} \tag{4.43}$$

where
P: probability of event occurrence;
C: consequences.

The E-value depends on the nature of the element at risk and was therefore uninfluenced by the debris flow's initiation point or travel distance.
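The risk calculation of Equation 4.43 reduces to a threshold on the distance between the initiation point and the element at risk. The sketch below is a minimal illustration with hypothetical function names and values. The text does not say what happens when the distance is exactly equal to the upper CI limit, so treating that case as zero vulnerability is an assumption made here.

```python
def vulnerability(distance_to_element, upper_ci_limit):
    """V = 1 (100%) if the element lies closer than the upper CI limit of the
    predicted travel distance, otherwise 0. Equality is treated as 0
    (assumption: the source leaves this case undefined)."""
    return 1.0 if distance_to_element < upper_ci_limit else 0.0


def risk(probability, element_value, distance_to_element, upper_ci_limit):
    """R = P x E x V, following Equation 4.43."""
    return probability * element_value * vulnerability(
        distance_to_element, upper_ci_limit)


# Hypothetical case: P = 0.1, E = 1000 (arbitrary units), upper CI limit 200 m.
risk_inside = risk(0.1, 1000.0, 150.0, 200.0)   # element within reach
risk_outside = risk(0.1, 1000.0, 250.0, 200.0)  # element beyond reach
```

In this example the element 150 m from the initiation point carries a risk of P x E, while the element 250 m away carries no risk.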
Equation 4.43 is very simple and is completely determined by the initiation point of the debris flow.

5 Results

This chapter explains how the debris flow travel distance prediction model was built. The calculations were performed using SAS version 8.2 for statistics and non-linear programming and MS Excel version 2000 for data manipulation.

5.1 Significance of the correlation between debris flow travel distance and the raw variables

The first step in building the model was to determine the raw variables that were significantly correlated with debris flow travel distance. This was done by performing a simple linear regression analysis for each of the independent variables. As trends and relationships were difficult to identify, the analysis was performed using the whole data set, i.e., the 38 events.

The results of the simple linear regression analyses are summarized in Table 5.1. Eight variables were correlated with debris flow travel distance: path, terrain curvature expressed by the combination of plan and profile curvature, stand height and diameter, soil fine particle percentage, and terrain state represented by main and secondary human activities considered together. The rest of the variables considered in the analysis were not correlated with debris flow travel distance at the selected $\alpha$ level (0.05).

Table 5.1 Pearson's correlation coefficients between debris flow travel distance and selected predictor variables.
| Variable | Coefficient of correlation | F calc | Pr > F | H0 |
|---|---|---|---|---|
| Path variable | 0.36 | 20.33 | 0.0001 | Rejected |
| Average slope | 0.03 | 1.1 | 0.3 | Accepted |
| Fan slope | 0.02 | 0.76 | 0.39 | Accepted |
| First reach slope | 0.002 | 0.09 | 0.76 | Accepted |
| Initial volume | 0.02 | 0.55 | 0.46 | Accepted |
| Azimuth | 0.07 | 2.82 | 0.1 | Accepted |
| Position on slope | 0.07 | 2.86 | 0.1 | Accepted |
| Plan curvature | 0.06 | 1.1 | 0.34 | Accepted |
| Profile curvature | 0.02 | 0.44 | 0.64 | Accepted |
| Terrain curvature - plan and profile curvature (separate) | 0.08 | 0.75 | 0.56 | Accepted |
| Terrain curvature - plan and profile curvature (interacting) | 0.45 | 2.94 | 0.01 | Rejected |
| Species | 0.04 | 0.52 | 0.67 | Accepted |
| Stand height | 0.13 | 5.33 | 0.027 | Rejected |
| Stand diameter | 0.13 | 5.44 | 0.025 | Rejected |
| Canopy closure | 0.001 | 0.04 | 0.84 | Accepted |
| Geology | 0.05 | 0.96 | 0.39 | Accepted |
| Particle size distribution (PSD) type | 0.02 | 0.32 | 0.72 | Accepted |
| PSD grading | 0.12 | 2.34 | 0.11 | Accepted |
| Soil fine particle content | 0.08 | 3.1 | 0.08 | Rejected |
| Soil specific weight | 0.02 | 0.9 | 0.35 | Accepted |
| Terrain state (main human activity) | 0.02 | 0.48 | 0.62 | Accepted |
| Terrain state (main and secondary human activities considered separate) | 0.62 | 10.46 | 0.0001 | Rejected |
| Terrain state (main and secondary human activities considered interacted) | 0.64 | 6.49 | 0.0001 | Rejected |

Two sets of class variables represented terrain curvature and terrain state. To determine the significance of the correlation between debris flow travel distance and the class variables, two cases were considered: the variables alone (Equation 5.1), and interactions amongst the variables (Equation 5.2).

L = f (plan curvature, profile curvature)
L = f (main human action, secondary human action)   (5.1)

L = f (plan curvature * profile curvature)
L = f (main human action * secondary human action)   (5.2)

For terrain curvature, each case led to a different conclusion. Plan and profile curvature, when considered as separate variables, were not correlated with travel distance, but a correlation was present when they were considered as interacting variables.
Only the variable representing the interaction between plan and profile curvature was considered in the model-building process.

The variables representing terrain state were significant whether they were considered separately or as interactive variables. In both cases, the influence of the main human activity was not significant. These variables were not used in the model-building process.

Variables that were not correlated with debris flow travel distance were included in the analysis in a transformed form. The transformations were performed to improve variable correlation with event travel distance.

5.2 Transforming the raw variables

The analysis performed on the raw variables revealed that the variable with the highest influence, based on correlation coefficients, was the path variable. It is possible to test whether there is a variable transformation that would increase the significance level of any of the correlation coefficients; trigonometric (sine and cosine), logarithmic and power (positive and negative exponent) functions were used to modify the raw variables. The graphs in Appendix 7 indicate the transformation types that were used to increase the correlation between variables.

Slope at the initiation point ($\varphi$) can be expressed as a raw variable (in degrees), but it can also be transformed. A sine function was selected for this study as the sine function shows the variation of elevation in relation to slope distance. The transformation leading to a new variable for slope is $\sin(\varphi)$.

Azimuth expresses the aspect of the debris flow. As mentioned in Chapter 4, water regimes vary from southerly exposures to northerly ones. This study considered northerly aspects as having wet regimes, southerly exposures as having dry regimes, and easterly and westerly exposures as neutral (intermediate).
A cosine function was chosen because it has the following properties:
• for an azimuth of 0° or 360° its value is 1;
• for an azimuth of 180° its value is -1; and
• for an azimuth of 90° or 270° its value is 0.

This means that the cosine function assigns a value of 1 to a northerly aspect, a value of -1 to a southerly aspect, and a value of 0 to east and west aspects. A power function was applied to accentuate this information. The only restriction on the power function was that the exponent had to be an odd number in order to preserve the sign. After a series of tests, the exponent that seemed appropriate for this investigation was five. The transformed variable can therefore be expressed by:

$CAZ = (\cos(\text{azimuth}))^{5}$          (5.3)

Canopy closure is an indicator of terrain state. This study considered only the canopy closure of the first reach. The analysis performed in Section 5.1 revealed that there was no significant correlation between debris flow travel distance and canopy closure of the first reach. Transforming the variable might therefore increase the significance level or even make the correlation significant. Figure 7 in Appendix 7 and a series of attempts to derive an appropriate transformation indicated that a suitable function for improving the relationship between debris flow travel distance and canopy closure was $\frac{1}{k + 0.01}$, where k is stand canopy closure.

The height of the stand at the initiation point was correlated with debris flow travel distance (see Section 5.1). Stand height varied from 0 to 50 m. This large range of values, when compared with the values for transformed azimuth (from -1 to 1), makes a comparison of the variable coefficients inappropriate (Bernstein et al., 1987; Neter et al., 1996). This was considered undesirable and a further transformation of stand height was undertaken, as linear transformations leave the collinearity diagnostics little altered (Belsley et al., 1980).
The most common transformation is normalization; this study relied on the idea of normalization, but did not use the mean or standard deviation in the calculation. This did not affect the results, as transformed height was not used to infer height; rather, it was used in the inferences related to debris flow travel distance. The new height variable, codified ht, is

$ht = \frac{h + 1}{10}$

where h is stand height at the first reach. The number 1 was added to avoid zero values for height. The number 10 was selected as supplying the largest correlation coefficient with debris flow travel distance.

The transformed variables height and slope of the first reach were combined to form a new variable. The new combined variable should be more significant than either alone, and should also be more strongly correlated with debris flow travel distance. A series of tests was performed to establish the best combination. The selected function represents the interaction between slope of the first reach and stand height:

$\left(\frac{h + 1}{10}\right) (1 + \sin\varphi)^{5}$

The correlation between the new variable expressing the debris flow path and event travel distance can be improved if further transformations are applied. As Figure 1 in Appendix 7 does not offer sufficient information regarding the functions to be used, several attempts were made to improve the transformation, resulting in the following transformation:

$\left(\log\left(\sqrt{\text{path}} + 1\right)\right)^{1.2}$

Table 5.2 Correlations between the transformed variables and debris flow travel distance.

Variable                                                Coefficient of correlation  F calculated  Pr > F    H0
Transformed slope                                       0.003                       0.13          0.72      Not rejected
Transformed azimuth                                     0.03                        1.3           0.29      Not rejected
Transformed canopy closure                              0.02                        0.81          0.37      Not rejected
Transformed stand height                                0.13                        5.33          0.03      Rejected
Interaction slope at the first reach and stand height   0.28                        14.23         0.0006    Rejected
Transformed path variable                               0.68                        76.1          <0.0001   Rejected

Variable transformations had different effects on their correlation coefficients and significance.
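The single-variable transformations of Section 5.2 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, angles are assumed to be in degrees, and the units of canopy closure k are assumed to be whatever the field data used. The path and slope-height interaction variables are left out.

```python
import math


def transform_slope(slope_deg):
    """Sine of the slope at the initiation point (slope given in degrees)."""
    return math.sin(math.radians(slope_deg))


def transform_azimuth(azimuth_deg, power=5):
    """CAZ: cosine of the azimuth raised to an odd power, which keeps the sign."""
    return math.cos(math.radians(azimuth_deg)) ** power


def transform_canopy_closure(k):
    """Reciprocal canopy closure; the 0.01 offset avoids division by zero."""
    return 1.0 / (k + 0.01)


def transform_stand_height(h_m):
    """Linear rescaling of stand height (metres): (h + 1) / 10."""
    return (h_m + 1.0) / 10.0


if __name__ == "__main__":
    # North, east, south and west aspects (hypothetical inputs).
    for azimuth in (0.0, 90.0, 180.0, 270.0):
        print(azimuth, round(transform_azimuth(azimuth), 6))
```

Because the exponent is odd, northerly aspects stay positive and southerly aspects stay negative, while intermediate aspects are pushed towards zero.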
Slope modification improved the correlation coefficient only 1.5 times. However, this transformation, combined with that of stand height, was used further in fulfilling the multiple linear regression assumptions. The azimuth modification reduced the correlation coefficient 2.5 times. The canopy closure modification improved the correlation coefficient about 20 times in relation to raw canopy closure (from 0.001 to 0.02). Although some of these transformations reduced the correlation coefficient, they helped to fulfill the multiple linear regression assumptions. The results obtained for transformed stand height did not differ from those based on untransformed stand height. This is not surprising, as the transformation is linear and therefore the modification only influenced the coordinate system. The goal of this transformation was not to increase the level of significance but to scale the height variable to a level similar to the other variables.

The correlation for the variable representing the interaction between slope at the first reach and stand height (R = 0.28) was at least twice as strong as that of either component variable, and it was therefore used in the regression analysis. The transformed path variable was correlated with debris flow travel distance (Table 5.1), as was the untransformed variable. However, the correlation coefficient increased from R² = 0.35 for the untransformed variable to R² = 0.68 for the transformed variable.

5.3 Selection of the estimation and prediction data sets

The simple linear regression building process using raw or transformed variables was performed using the whole data set. The results from Section 5.1 show that there is no significant correlation between geology and debris flow travel distance. The division procedure was presented in Section 4.9.3. Two subsets were used, each having a different geology. Three geological types were represented in the data set: granite, gneiss and fine sedimentary.
As the size of the estimation data set should be bigger than p+15, it had to contain two types of geology. The prediction set therefore had to contain the third geological type, based on the requirements established in Section 4.9.3. The estimation data set contained the events that occurred on granite and gneiss, and the prediction data set contained the events that occurred on fine sedimentary rocks. Under the conditions required for regression, the travel distance of the events on fine sedimentary rocks must be within the range defined by the events that occurred on granite and gneiss. The debris flows in the estimation data set had travel distances ranging from 25.8 m to 1341.7 m. The debris flows in the prediction data set had distances ranging from 43.3 m to 131.6 m, and so the above condition was fulfilled.

The separation of the data set led to the following structure:

Estimation data set
• 30 events
• Events on granite or gneiss

Prediction data set
• 8 events
• Events on fine sedimentary lithologies

5.4 Debris flow travel distance model built using crisp sets

The estimation data set that was used to build the model contained 30 observations. The regression analysis was separated into two parts: a preliminary regression analysis, which identified the regression equation required to predict the debris flow travel distance and checked the regression analysis requirements (multi-collinearity and influential observations), and a final regression analysis, which checked the assumptions and requirements of the estimated equation in relation to the results of the preliminary analysis.

5.4.1 Preliminary regression analysis

The preliminary regression analysis applied the results presented in Section 4.9 to the estimation data set.
For the preliminary analysis, the assumptions needed for inference were not verified, because these requirements do not need to be fulfilled in order to test the regression equation for multi-collinearity and influential observations (Chatterjee and Price, 1977; Belsley et al., 1980; Neter et al., 1996). Four tests were performed: regression equation significance, multi-collinearity, identification of the outliers and influential observations, and remedial measures for outliers and influential observations.

Sections 5.1 and 5.2 supplied the variables considered in the regression analysis:
• transformed path variable: $\left(\log\left(\sqrt{\text{path}} + 1\right)\right)^{1.2}$ - coded LPATH;
• transformed azimuth: $(\cos(\text{azimuth}))^{5}$ - coded CAZ;
• transformed variable representing the interaction between slope of the first reach and stand height: $\left(\frac{h + 1}{10}\right) (1 + \sin\varphi)^{5}$ - coded TST; and
• interaction between plan and profile curvature and transformed canopy closure.

The transformations of path, azimuth, slope and stand height have no physical meaning; they were chosen to achieve statistical significance and to fulfill the theoretical assumptions of regression.

The last variable was a combination of categorical and continuous variables. The categorical variables were quantified using a binary logic system. This composite variable led to a set of nine variables. As the combination of plane plan curvature with plane profile curvature does not occur in the data set, this interaction was eliminated from the set of nine variables. Consequently the interaction between plan and profile curvature and transformed canopy closure was represented by eight variables (Table 5.3).

Table 5.3 The eight variables representing the interaction of plan curvature, profile curvature and canopy closure (k).
Variable       Plan curvature            Profile curvature         Variable value
codification   concave  convex  plane    concave  convex  plane
Cvcv           1        0       0        1        0       0        1/(k+0.01) if plan curvature is concave and profile curvature is concave, else 0
Cvcx           1        0       0        0        1       0        1/(k+0.01) if plan curvature is concave and profile curvature is convex, else 0
Cvp            1        0       0        0        0       1        1/(k+0.01) if plan curvature is concave and profile curvature is plane, else 0
Cxcv           0        1       0        1        0       0        1/(k+0.01) if plan curvature is convex and profile curvature is concave, else 0
Cxcx           0        1       0        0        1       0        1/(k+0.01) if plan curvature is convex and profile curvature is convex, else 0
Cxp            0        1       0        0        0       1        1/(k+0.01) if plan curvature is convex and profile curvature is plane, else 0
Pcv            0        0       1        1        0       0        1/(k+0.01) if plan curvature is plane and profile curvature is concave, else 0
Pcx            0        0       1        0        1       0        1/(k+0.01) if plan curvature is plane and profile curvature is convex, else 0

The total number of variables involved in the analysis was 11, representing the event path, the event azimuth, the slope of the first reach combined with stand height, and eight combinations of plan and profile curvature and canopy closure at the first reach.

5.4.1.1 Testing the significance of the multiple regression

The F value calculated for the multiple regression analysis was 11.71, which was compared with the critical value F = 2.051. This revealed that the multiple regression equation was related to the debris flow travel distance (P < 0.0001). The multiple coefficient of determination was R² = 0.88. This meant that further investigations based on the above regression analysis could be performed reliably.
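The coding scheme of Table 5.3 can be illustrated with a short Python sketch. The function name and the lowercase variable codes are chosen for illustration, following the codes used in the regression tables, and are not taken from the software used in the study. Exactly one of the eight variables takes the value 1/(k+0.01) and the others are 0; the plane-plane combination, which does not occur in the data set, maps to all zeros.

```python
# Short codes for the three curvature classes of Table 5.3.
CODES = {"concave": "cv", "convex": "cx", "plane": "p"}

# The eight retained curvature-by-canopy variables (the plane-plane case is excluded).
VARIABLES = ["cvcv", "cvcx", "cvp", "cxcv", "cxcx", "cxp", "pcv", "pcx"]


def curvature_canopy_variables(plan, profile, k):
    """Return the eight interaction variables for one event.

    plan, profile: 'concave', 'convex' or 'plane'
    k: canopy closure of the first reach
    """
    active = CODES[plan] + CODES[profile]
    value = 1.0 / (k + 0.01)
    return {name: (value if name == active else 0.0) for name in VARIABLES}


if __name__ == "__main__":
    # Hypothetical event: convex plan curvature, concave profile curvature.
    print(curvature_canopy_variables("convex", "concave", 0.49))
```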
The coefficients associated with each variable, together with the significance test and variance inflation factor (VIF) determined using the estimation data set, are presented in Table 5.4. The VIF for every parameter estimate was smaller than 10 and therefore the model was not affected by multi-collinearity.

Table 5.4 Parameter estimates of the regression equation.

Variable    Parameter estimate   t value   Pr > |t|   Variance inflation
Intercept   -67.39               -0.98     0.3389     0
LPATH       138.41               2.73      0.0138     2.69511
TST         0.21                 5.23      <.0001     1.59004
CAZ         76.02                1.10      0.2875     1.46401
cvcv        0.37                 0.22      0.8254     1.08028
cvcx        1.24                 0.68      0.5072     1.31053
cvp         50.16                0.71      0.4864     1.21761
cxcv        295.68               2.85      0.0107     2.21196
cxcx        -246.16              -2.55     0.0199     1.35959
cxp         -113.79              -1.03     0.3189     1.15782
pcv         0.73                 0.44      0.6680     1.07611
pcx         -45.83               -0.46     0.6512     1.25344

The significance associated with each parameter estimate revealed that the path variable, slope at the first reach combined with stand height, and two of the eight variables representing interactions among plan curvature, profile curvature and canopy closure were strongly significant. A further selection of the significant variables could have led to the elimination of one of the above variables. However, at this stage of the study, the selection procedure was not applied, as the regression first had to be tested for outliers and influential observations.

5.4.1.2 Identification of outliers and influential observations

The next step in the investigation was to identify outliers in the data set that might affect the regression. Three tests were used, namely Studentized deleted residuals for the dependent variable (debris flow travel distance), hat matrix leverage, and COVRATIO values for the independent variables. The values for these three statistics are presented in Table A.8.1. The cut-off value for determining outliers in the dependent variable was 2 (Belsley et al., 1980). The Studentized deleted residuals indicated that three observations were larger than the cut-off value: 31-3, 52-30 and 62-10.
These events needed to be further investigated to see if they had a significant influence on the regression function. The theory of Belsley et al. (1980) was used to identify outliers in the independent observations. The cut-off values derived using this theory were:
• for hat matrix leverage: $\frac{2p}{n} = \frac{2 \cdot 11}{30} = 0.733$
• for COVRATIO: $|\text{covratio} - 1| = \frac{3p}{n} = \frac{3 \cdot 11}{30} = 1.1$

The values for hat matrix leverage indicated that there were four events larger than 0.733 that required further investigation: 61-10, 62-23, 62-23b and 74-20. The values for COVRATIO indicated that there were 13 observations larger than 1.1 that required further investigation: events 31-4, 51-3, 52-12, 52-16, 52-32, 61-10, 61-15, 61-19, 62-23, 62-23b, 64-20, 73-18 and 74-20.

Three procedures were used to test the influence of these outliers: DFFITS, Cook's distance and DFBETAS. The values for these statistics are presented in Table A.8.2. The cut-off value for DFFITS was $2\sqrt{p/n} = 2\sqrt{11/30} = 1.211$. The events identified as influential when using the DFFITS criterion were 31-3, 52-30, 62-10, 62-23, 62-23b and 74-20.

Cook's distance measure is related to the F distribution. The value of $D_i$ was compared to the corresponding percentile value of the $F_{p,\,n-p}$ distribution. If the percentile value was less than 20%, the i-th observation had little influence on the fitted value. The critical value to fulfill this condition was $F(0.2;\ 11,\ 19) = 1.531$. Three observations were considered as influential when using this criterion: 62-23, 62-23b and 74-20.

For the DFBETAS criterion, Belsley et al. (1980) recommend the following number as a cut-off value: $\frac{2}{\sqrt{n}} = \frac{2}{\sqrt{30}} = 0.3651$. Neter et al. (1996) proposed a cut-off value of 1 for small- to medium-size data sets. Given the size of the data sets used in this study, I have adopted this recommendation. Table 5.6 indicates that there are six influential events: 21-102, 31-3, 52-30, 62-23b, 73-28 and 74-20.
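The size-dependent cut-off values quoted above (2p/n for hat matrix leverage, 3p/n for |COVRATIO - 1|, 2√(p/n) for DFFITS and 2/√n for DFBETAS) depend only on the number of variables p and the number of observations n, so they can be recomputed for any data set size. The following Python sketch is illustrative only; the function and key names are hypothetical, and the F percentile used for Cook's distance is not included because it needs a statistics library.

```python
import math


def diagnostic_cutoffs(p, n):
    """Size-dependent cut-offs for p variables and n observations."""
    return {
        "hat_leverage": 2 * p / n,         # hat matrix leverage
        "covratio_deviation": 3 * p / n,   # bound on |COVRATIO - 1|
        "dffits": 2 * math.sqrt(p / n),    # DFFITS
        "dfbetas": 2 / math.sqrt(n),       # DFBETAS (Belsley et al. form)
    }


if __name__ == "__main__":
    # Preliminary analysis: 11 variables, 30 observations.
    for name, value in diagnostic_cutoffs(p=11, n=30).items():
        print(f"{name}: {value:.4f}")
```

Calling `diagnostic_cutoffs(11, 30)` gives approximately 0.733, 1.1, 1.211 and 0.365, which agrees with the values in the text.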
These results indicated that there were six events that should be considered as outliers and which were also influential: 31-3, 52-30, 62-10, 62-23, 62-23b and 74-20. They required further investigation to see whether applying remedial measures could reduce their influence on the regression analysis. The analysis revealed that the remaining observations identified as outliers were not influential. They did not require further examination, as they would not have to be eliminated from the estimation data set.

5.4.1.3 Remedial measures for outliers and influential observations

The remedial measures used for the six events were based on robust regression (Section 4.9.3.5). The results of the robust regression procedure are presented in Table 5.5.

Table 5.5 Robust regression weights.

Id       21-101  21-102  31-3    31-4    51-3    52-10   52-12   52-13   52-16   52-26
Weight   0.983   0.992   0.000   0.983   0.891   0.994   0.984   0.984   0.958   0.925

Id       52-30   52-32   53-5    53-6    61-10   61-15   61-18   61-19   62-10   62-14
Weight   0.927   0.999   0.754   0.983   0.000   0.999   0.874   0.997   0.000   0.919

Id       62-23   62-23b  64-20   73-12   73-17   73-18   73-18b  73-28   74-20   83-7
Weight   0.996   0.000   0.632   0.953   0.949   0.994   1.000   0.917   0.999   1.000

Four events had weights close to 0: 31-3, 61-10, 62-10 and 73-12. If further investigation were to reveal recording errors or inadvertent variable definitions, the four events could be eliminated from the estimation data set.

Event 31-3 was identified as an outlier in relation to the dependent variable and as influential using the DFFITS and DFBETAS criteria. One of the reaches of this event was 67 times longer than the adjacent reach. This suggested that the event was erroneously recorded. The event was therefore eliminated from the estimation data set.
The investigation of event 61-10 indicated that the reach characterization from Section 4.5 did not apply to this event because the minimum slope difference between two adjacent reaches was smaller than the selected cut-off value, 3°. The corresponding difference in azimuth was only 18°, not 20°, the established cut-off value. In addition, there was an intermediate reach with a length smaller than the selected cut-off value of 25 m. This debris flow was dropped from the estimation data set, as the event did not fulfill the reach definition and was also an outlier.

Event 62-10 had similar problems to 61-10: there were two reaches that did not fulfill the requirements established for reach definition, namely the slope difference between adjacent reaches was smaller than 3° and the azimuths of the respective reaches did not differ by more than 20°, the established cut-off value. This debris flow was dropped from the estimation data set because the event did not fulfill the reach definition and was an outlier and an influential case.

Event 73-12 had a second reach that was 14.4 m long and therefore did not fulfill the reach definition (Section 5.1). As the difference between the established cut-off value, 25 m, and the actual length was large, this event was eliminated from the analysis.

Events 52-30, 62-23, 62-23b and 74-20 had weights in the robust regression greater than 0.999. Their investigation revealed no errors in data recording or problems related to reach definition. This indicated that these events should be kept in the estimation data set.

5.4.2 Final regression analysis

The final regression analysis was based on the modified estimation data set, which had 26 observations. This was more than the minimum number of events required to ensure the desired percentage of variation (PV=30%) and significance level (α=0.15) established in Chapter 4. The final model built using crisp sets was used in the debris flow travel distance prediction.
As the model had to fulfill all the assumptions of regression analysis, the analysis of the regression significance, outliers, influential case identification and assumptions is presented below.

5.4.2.1 Multiple regression significance

The significance of the regression equation using the reduced estimation data set is presented in Table 5.6.

Table 5.6 Regression significance.

Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             11   1007625          91602         50.01     <.0001
Error             14   25641            1831
Corrected Total   25   1033266

As the probability of obtaining an F value larger than the calculated value, F = 50.01, is less than 0.0001, the multiple regression was significant, and the independent variables and debris flow travel distance are correlated. The coefficient of determination was R² = 0.975. As a reduced data set was used, an increase in significance and in the coefficient of determination was expected. The regression coefficients are presented in Table 5.7 together with the VIF.

Table 5.7 Parameter estimates for the final regression.

Variable    Parameter estimate   t value   Pr > |t|   Variance inflation
Intercept   -140.348             -5.16     0.0001     0
LPATH       257.426              10.31     <.0001     2.830
TST         0.035                2.00      0.0652     2.608
CAZ         43.649               1.89      0.0801     1.836
cvcv        0.337                0.73      0.4763     1.108
cvcx        0.212                0.40      0.6950     1.471
cvp         -8.473               -0.41     0.6893     1.143
cxcv        212.106              6.72      <.0001     1.494
cxcx        -42.636              -1.35     0.1969     1.950
cxp         11.753               0.35      0.7328     1.439
pcv         0.683                1.49      0.1584     1.101
pcx         20.523               0.72      0.4842     1.375

The model was not affected by multi-collinearity; all VIF values were smaller than 10, the selected cut-off value. The variable significance was tested using the t-distribution. The path variable and one of the class variables representing the interaction between plan curvature, profile curvature and canopy closure were significant at α=0.05. This indicated that a further selection of the significant variables was needed.
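The entries of Table 5.6 are linked by simple arithmetic: each mean square is the sum of squares divided by its degrees of freedom, the F value is the ratio of the two mean squares, and the coefficient of determination is the model sum of squares divided by the corrected total. The Python sketch below (the function name is hypothetical) reproduces these relations from the sums of squares reported in the table.

```python
def anova_summary(ss_model, df_model, ss_error, df_error):
    """Mean squares, F value and R-squared from an ANOVA decomposition."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_value": ms_model / ms_error,
        "r_squared": ss_model / (ss_model + ss_error),
    }


if __name__ == "__main__":
    # Sums of squares and degrees of freedom as reported in Table 5.6.
    summary = anova_summary(ss_model=1007625, df_model=11,
                            ss_error=25641, df_error=14)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

The computed F value (about 50.01) and R² (about 0.975) agree with the reported figures.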
5.4.2.2 Identification of the outliers and influential observations

The regression analysis performed on the reduced estimation data set was checked for outliers and influential observations using the same statistics as were used for the preliminary regression analysis. The cut-off values were recalculated for the new estimation data set, as the elimination of the four events changed the size of the original estimation data set. The new cut-off values were:
• for hat matrix leverage: $\frac{2p}{n} = \frac{2 \cdot 11}{26} = 0.846$
• for COVRATIO: $|\text{covratio} - 1| = \frac{3p}{n} = \frac{3 \cdot 11}{26} = 1.269$
• for DFFITS: $2\sqrt{p/n} = 2\sqrt{11/26} = 1.3$
• for Cook's distance: the corresponding F distribution critical value, $F(0.2;\ 11,\ 15) = 1.5866$
• for DFBETAS: $\max\left(\frac{2}{\sqrt{n}} = \frac{2}{\sqrt{26}} = 0.39,\ 1\right) = 1$

The results given in Table A.8.3 reveal that there was only one outlier for the dependent variable, 73-17; the rest of the events had Studentized deleted residuals smaller than 2, the cut-off value. Different results were obtained for the independent variables:
• Hat matrix leverage identified events 52-30, 62-23, 62-23b and 74-20 as outliers;
• COVRATIO identified events 21-101, 21-102, 31-4, 52-10, 52-12, 52-13, 52-16, 52-30, 52-32, 53-6, 61-15, 61-19, 62-23, 62-23b, 64-20, 73-28 and 74-20 as outliers.

The outliers identified with the above procedures were examined for their influence on the regression analysis, which was tested using DFFITS, Cook's distance and DFBETAS (Table A.8.4). There were several influential observations. The results were quite similar even when different procedures were used. The DFFITS procedure identified 51-3, 52-30, 62-23, 62-23b, 73-18 and 74-20 as influential events. Cook's distance procedure identified 62-23, 62-23b and 74-20 as influential events. The DFBETAS procedure identified 52-30, 62-23, 62-23b, 73-18 and 74-20 as influential events. Robust regression was again used to explore the outliers and influential cases (Table 5.8).
Events identified only as influential were not eliminated from the analysis.

Table 5.8 Robust regression weights.

Event    21-101  21-102  31-4    51-3    52-10   52-12   52-13   52-16   52-26
Weight   0.987   0.984   0.996   0.899   0.997   0.977   0.981   0.958   0.933

Event    52-30   52-32   53-5    53-6    61-15   61-18   61-19   62-14   62-23
Weight   0.984   0.999   0.751   0.984   0.990   0.884   0.998   0.915   1.000

Event    62-23b  64-20   73-17   73-18   73-18b  73-28   74-20   83-7
Weight   1.000   0.996   0.609   0.939   0.947   0.995   1.000   0.913

All events had a weight greater than 0.6. This indicated that all outliers that were influential could be kept in the reduced estimation data set. Therefore the regression function obtained using the data set with 26 observations could be used for further inferences if the multiple linear regression assumptions were met.

5.4.2.3 Distribution of errors

The distribution of the errors was checked for normality using the procedures described in Section 4.9.3.1. The results are presented in Table 5.9.

Table 5.9 Tests for the distribution of errors.

Tests for Normality
Test                 Statistic        p Value
Shapiro-Wilk         W      0.977     Pr < W      0.8188
Kolmogorov-Smirnov   D      0.109     Pr > D      >0.1500
Cramer-von Mises     W-Sq   0.035     Pr > W-Sq   >0.2500
Anderson-Darling     A-Sq   0.246     Pr > A-Sq   >0.2500

There is no evidence to reject the null hypothesis; therefore the errors were considered to be normally distributed.

5.4.2.4 Homoscedasticity of the errors

The assumption of error homoscedasticity was verified using White's test (Section 4.9.3.2); the results are presented in Table 5.10.

Table 5.10 Test of the homoscedasticity of the errors.

Test of First and Second Moment Specification
DF   Chi-Square   Pr > ChiSq
26   24.97        0.5207

As the null hypothesis could not be rejected at the established significance level, α=0.05, the errors were considered to be homoscedastic. Fulfilling this assumption ensured that the inferences based on the regression function had the smallest confidence interval of all the possible estimates (Gauss-Markov theorem).
5.4.2.5 Correlation amongst the errors

Correlation amongst the errors was tested using the Durbin-Watson procedure (Section 4.9.3.3); the results are presented in Table 5.11.

Table 5.11 Error correlation using the Durbin-Watson procedure.

Durbin-Watson D             2.397
Number of observations      26
1st order autocorrelation   -0.231

The null hypothesis could not be rejected at the selected significance level (α=0.05) and the errors were considered to be non-correlated. The Durbin-Watson procedure showed that there was no evidence to infer that there was a correlation between two consecutive errors (no time-series correlation). As the debris flows were selected randomly and no two of them were on the same slope, it could be considered that there was no spatial correlation among events. The no-correlation assumption provided information related to the Gauss-Markov theorem requirements: it ensured that the results could be applied to the estimation data set.

5.4.2.6 Selection of the significant variables

The regression function established and tested in Section 5.4 contained variables that were not correlated with the debris flow travel distance, so a further selection of the significant variables was needed. The procedures used to select the significant variables were backward selection, forward selection, and stepwise selection. The significance level for a variable to stay in the model was established as 0.1 (used in the backward and stepwise selection procedures). The significance level for a variable to enter the model was 0.1 (used in the forward and stepwise selection procedures). If all the selection procedures supplied similar results, the variables would be selected based on the information presented by the selection procedures. If the results varied from procedure to procedure, the variable selection would be based on judgement.

Table 5.12 Backward selection of the significant variables.
Variable    Parameter Estimate   Standard Error   Type II SS    F Value   Pr > F
Intercept   -122.69              18.44            70185         44.27     <.0001
LPATH       247.65               18.72            277427        174.99    <.0001
TST         0.04                 0.01             10577         6.67      0.0178
CAZ         38.31                17.46            7633.28068    4.81      0.0402
cxcv        213.84               28.26            90754         57.25     <.0001
cxcx        -52.87               26.59            6269.61060    3.95      0.0606

All continuous variables and two components of the class variables were significant (Table 5.12). As class variables could not be separated in the selection process, the proposed function based on the backward selection procedure was:

L = f (path, first reach slope * stand height, azimuth, first reach plan curvature * profile curvature * canopy closure)

where L = predicted debris flow travel distance.

Table 5.13 Forward selection of the significant variables.

Variable    Parameter Estimate   Standard Error   Type II SS    F Value   Pr > F
Intercept   -134.69              19.65            91346         46.99     <.0001
LPATH       274.23               17.39            483316        248.62    <.0001
CAZ         43.40                19.09            10046         5.17      0.0331
cxcv        185.59               28.76            80940         41.64     <.0001

The transformed path, transformed azimuth and one component of the class variable were significant (Table 5.13). The class variable cannot be separated in the selection process and therefore all the components had to be included in the final regression. Consequently the proposed function based on the forward selection procedure was:

L = f (path, azimuth, first reach plan curvature * profile curvature * canopy closure)

Table 5.14 Stepwise selection of the significant variables.

Variable    Parameter Estimate   Standard Error   Type II SS    F Value   Pr > F
Intercept   -134.69              19.65            91346         46.99     <.0001
LPATH       274.23               17.39            483316        248.62    <.0001
CAZ         43.40                19.09            10046         5.17      0.0331
cxcv        185.59               28.76            80940         41.64     <.0001

The results of stepwise selection presented in Table 5.14 were identical to the results of the forward selection procedure.
The proposed function was:

L = f (path, azimuth, first reach plan curvature * profile curvature * canopy closure)

A comparison of the results from the three selection procedures suggested that the path variable, azimuth, and the interaction among plan and profile curvature and canopy closure were correlated with the debris flow travel distance, and that the variable representing the interaction between slope of the first reach and first reach stand height was correlated with debris flow travel distance only if the backward selection procedure was used. As the slope of the first reach defines the initial energy of an event, the variable representing the interaction between the first reach slope and stand height was kept in the model.

The variables used in the final regression analysis that led to the final debris flow travel distance model built on crisp sets were:
• transformed path variable: $\left(\log\left(\sqrt{\text{path}} + 1\right)\right)^{1.2}$ - codified LPATH;
• transformed azimuth: $(\cos(\text{azimuth}))^{5}$ - codified CAZ;
• transformed variable representing the interaction between slope of the first reach and stand height: $\left(\frac{h + 1}{10}\right) (1 + \sin\varphi)^{5}$ - codified TST; and
• interaction among plan and profile curvature and transformed canopy closure.

The TST variable was introduced in the model to fulfill the regression analysis assumptions.

5.4.2.7 Debris flow travel distance prediction model built using crisp sets

Based on the selection procedures presented in the preceding section, the final model selected for predicting debris flow travel distance was:

L = -140.35 + 257.42*LPATH + 0.03*TST + 43.65*CAZ + curvature + e_i          (5.4)

Curvature = 0.34*cvcv + 0.2*cvcx - 8.5*cvp + 212.1*cxcv - 42.6*cxcx + 11.75*cxp + 0.7*pcv + 20.5*pcx

where the symbols are the same as in the previous section.

The final debris flow travel distance prediction model fulfilled all the assumptions and requirements necessary to build a regression equation. The variables that dominate the model are LPATH and cxcv.
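To illustrate how Equation 5.4 is evaluated, the sketch below takes the coefficients as printed in the equation and applies them to already transformed inputs. The function name and the input values are hypothetical and serve only as an example; the error term is omitted, so the result is the predicted mean travel distance in metres.

```python
# Curvature coefficients as printed in Equation 5.4.
CURVATURE_COEFFICIENTS = {
    "cvcv": 0.34, "cvcx": 0.2, "cvp": -8.5, "cxcv": 212.1,
    "cxcx": -42.6, "cxp": 11.75, "pcv": 0.7, "pcx": 20.5,
}


def predict_travel_distance(lpath, tst, caz, curvature_terms=None):
    """Evaluate Equation 5.4 for transformed inputs.

    curvature_terms maps variable codes (for example 'cxcv') to their
    values, 1/(k + 0.01) for the active combination; missing codes are 0.
    """
    curvature_terms = curvature_terms or {}
    curvature = sum(CURVATURE_COEFFICIENTS[name] * value
                    for name, value in curvature_terms.items())
    return -140.35 + 257.42 * lpath + 0.03 * tst + 43.65 * caz + curvature


if __name__ == "__main__":
    # Hypothetical transformed inputs, not taken from the data set.
    print(predict_travel_distance(lpath=1.0, tst=0.0, caz=0.0,
                                  curvature_terms={"cxcv": 1.0}))
```

Because the LPATH and cxcv coefficients are by far the largest relative to the ranges of their variables, small changes in those two inputs move the prediction most.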
The model built using crisp sets used only data with a precise meaning, with no vagueness being associated with any of the values.

CAZ has a value of 1 for north aspects, -1 for south aspects, and 0 for east and west aspects. The power function accentuates the influence of aspect on event travel distance.

TST varies between 0 for height 0 and 105 m for height 45 m and slope 45° (for heights less than 20 m, the influence of TST on travel distance is insignificant). This variable was kept in the model because it helps in fulfilling all the linear regression assumptions.

Curvature presents the influence of the interaction between local terrain configuration and stand canopy closure (e.g. when the terrain is convex in plan curvature and concave in profile curvature, the event travel distance increases).

5.4.2.8 Regression assessment in relation to debris flow termination

The results of the regression assessment for debris flow termination were based on the methods presented in Section 4.9.4. Two types of procedure were used. Firstly, informative procedures using regression equations for each type of event (ended/did not end in a stream) included a descriptive statistical summary for each type of event and a check of the significance of the corresponding coefficients in the model mixing the functions for the two types of events. Secondly, an exact procedure based on class variable techniques was adopted. The informative procedures offered only guidance about the regression functions built for the two types of events; their results could not be used for inferences. The result of the exact procedure could be used to infer the difference between the model predicting the events that ended in a stream and the model predicting the events that did not end in a stream.

The regression analysis for the events that did not end in a stream had a multiple coefficient of determination R² = 0.989.
The probability that $F_{regression} = 76.63 > F_{critical}$ was 0.0001, so at a significance level of α = 0.05 the relationship was significant.

Table 5.15 Parameter estimates of the regression equation for events that did not end in a stream.

Variable     Parameter estimate   Standard error   t value   Pr > |t|
Intercept          -90.38              44.50        -2.03      0.08
LPATH              205.63              48.74         4.22      0.00
TST                  0.06               0.04         1.64      0.14
CAZ                 76.94              28.05         2.74      0.03
cvcv                 0.16               0.57         0.28      0.79
cvcx                 0.68               0.66         1.02      0.34
cvp                -35.78              37.68        -0.95      0.37
cxcv               224.82              98.58         2.28      0.05
cxcx               -86.48              81.89        -1.06      0.32
cxp                -23.31              42.96        -0.54      0.60
pcv                  0.52               0.57         0.91      0.39

The regression equation for events that did not end in a stream was therefore:

$$L = -90.38 + 205.63\,LPATH + 0.06\,TST + 76.94\,CAZ + \text{curvature} + e_i \qquad (5.5)$$

$$\text{curvature} = 0.16\,cvcv + 0.68\,cvcx - 35.78\,cvp + 224.82\,cxcv - 86.48\,cxcx - 23.31\,cxp + 0.52\,pcv$$

where the symbols are as previously stated.

For events that ended in a stream, the multiple coefficient of determination was R² = 0.91. The probability that $F_{regression} = 7.65 > F_{critical}$ was 0.01, indicating a significant relationship. The parameter estimates are presented in Table 5.16.

Table 5.16 Parameter estimates of the regression equation for events that ended in a stream.

Variable     Parameter estimate   Standard error   t value   Pr > |t|
Intercept         -114.06              56.19        -2.03      0.09
LPATH              205.56              42.68         4.82      0.00
TST                 -0.02               0.03        -0.73      0.49
CAZ                 -9.71              26.30        -0.37      0.72
cvcv                72.11              33.44         2.16      0.07
cvcx                38.63              41.13         0.94      0.38
cvp                 49.31              31.89         1.55      0.17
cxcx                49.71              34.78         1.43      0.20
pcx                 41.74              35.00         1.19      0.28

The regression equation for the events that ended in a stream was therefore:

$$L = -114.06 + 205.56\,LPATH - 0.02\,TST - 9.71\,CAZ + \text{curvature} + e_i \qquad (5.6)$$

$$\text{curvature} = 72.11\,cvcv + 38.63\,cvcx + 49.31\,cvp + 49.71\,cxcx + 41.74\,pcx$$

where the symbols are as stated previously.

As before, the assumptions of normally distributed errors, homoscedasticity of the errors, and independence of the observations had to be satisfied.
The distribution of errors was tested using the Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling tests (Table 5.17).

Table 5.17 Statistics for testing the normality assumption.

                        Events not ending in a stream    Events ending in a stream
Test                    Statistic        p value         Statistic        p value
Shapiro-Wilk            W     0.9532     0.4477          W     0.9121     0.1462
Kolmogorov-Smirnov      D     0.1528     >0.1500         D     0.1750     0.1268
Cramer-von Mises        W-Sq  0.0603     >0.2500         W-Sq  0.0677     >0.2500
Anderson-Darling        A-Sq  0.3650     >0.2500         A-Sq  0.4565     0.2347

The proposed models fulfilled the assumption that the errors were distributed normally. The test used to check the homoscedasticity assumption was that of White (1980) (Table 5.18).

Table 5.18 White test for homoscedasticity.

Event type                     DF   Chi-Square   Pr > ChiSq
Events did not end in stream   25     18.87        0.8032
Events ended in stream         17     14.74        0.6141

As there was no evidence that the errors were heteroscedastic, the assumption that the errors have constant variance was fulfilled for each of the equations modeling events that ended/did not end in a stream.

Correlation amongst the errors was tested using the Durbin-Watson test (Table 5.19). There was no evidence that the errors were correlated for either of the equations modeling events that ended/did not end in a stream.

Table 5.19 Durbin-Watson auto-correlation test.

Event type                     Durbin-Watson D   Number of observations   1st order autocorrelation
Events did not end in stream        2.522                  19                     -0.332
Events ended in stream              1.395                  15                      0.204

Outliers and influential observations were examined through Beaton-Tukey bi-weight robust regression using IRLS (Table 5.20). The results indicated that there were no outliers or influential cases that needed to be removed from the data set.

Table 5.20 Weights of the Beaton-Tukey test.
Events not ending in a stream    Events ending in a stream
Event      Weight                Event      Weight
21-101     0.94                  21-102     0.96
51-3       0.95                  31-4       0.99
52-12      1.00                  52-10      0.96
52-30      0.99                  52-13      0.99
53-5       0.64                  52-16      1.00
53-6       1.00                  52-26      0.88
61-10      1.00                  62-14      0.94
61-15      1.00                  64-20      0.99
61-18      0.80                  73-17      0.94
61-19      0.37                  73-18b     0.92
62-23      0.97                  73-28      0.99
62-23b     0.99                  94-23      0.97
73-18      1.00                  94-26d     0.95
74-20      0.99                  94-34      0.99
83-7       0.95                  94-34b     0.93
94-24      0.52
94-30      0.76
94-35b     1.00
94-36      0.93

These results indicated that the assumptions required for the tests performed in this Section were met.

Inferences about the two regression functions

The inferences regarding the two functions modeling events that did not end/ended in a stream were separated into two categories: informative and exact. The results supplied by the informative procedures provided information about the two functions, but they were not reliable enough for inference. In contrast, the results supplied by the exact procedure were reliable.

Two informative procedures were used to assess the differences between the functions modeling events that ended/did not end in a stream. The summary statistics using the overall function are presented in Table 5.21. The variable used for these statistics was:

$$L_{dif} = L - L_{pred}$$

where $L$ is the debris flow travel distance, $L_{pred}$ is the predicted debris flow travel distance, and $L_{dif}$ expresses the difference between the actual and predicted values of the debris flow travel distance. The $L_{dif}$ statistics were calculated using the final function from Section 5.4.2.7 for both types of events. If the statistics were not significantly different, then the regression function could detect the termination point of the debris flow, regardless of whether or not a stream was present.

Table 5.21 Summary statistics for the variable Ldif.
Analysis variable: L - Lpred

Event type               N    Mean   Std Dev   Minimum   Maximum
Not ending in a stream   14   0.48    31.36    -41.05     69.44
Ending in a stream       12   0.56    34.18    -85.41     40.04

The difference between the predicted and actual debris flow travel distances for events that did not end in streams was not significantly different from 0 (Table 5.21). As

$$t_{calc} = \frac{0.479}{31.3597/\sqrt{14}} = 0.057 < 2.16 = t_{crit} = t_{0.975;13}$$

there was no evidence to reject the hypothesis that the difference between predicted and actual travel distance was 0. The same conclusion was reached for events that ended in streams (Table 5.21), as supported by the t-values:

$$t_{calc} = \frac{\bar{L}_{dif}}{34.1779/\sqrt{12}} = 0.0566 < 2.201 = t_{crit} = t_{0.975;11}$$

The variances of the two types of events were also not significantly different:

$$F_{calc} = \frac{34.1779^2}{31.3598^2} = 1.1878 < 3.40 = F_{crit} = F_{0.975;11,13}$$

The results of the procedure that tested the parameter estimates of the combined equation are presented in Table 5.22. The coefficients of the mixed equation obtained using the procedure described in Section 4.9.4.2 are also presented in Table 5.22. The 'c' in front of the variables transforms them according to the procedure explained in Section 4.9.4.2. If the coefficients representing the events that ended in streams in the mixed function were not significantly different from zero, then the mixed function made no distinction between events that ended or did not end in streams. The variables cccxp, ccpcv and ccpcx did not enter into the model because they led to a singular matrix (determinant zero). The results were not affected by their elimination because the information provided by them was null or redundant. The coefficients corresponding to the equation modeling the events that ended in streams were not significantly different from 0 (α = 0.05).
This indicated that there was a possibility that all of them, simultaneously, were not significantly different from 0.

Table 5.22 Significance of the mixed equation coefficients (parameter estimates).

Variable    DF   Parameter estimate   Standard error   t value   Pr > |t|
Intercept    1         -90.38            38.39959       -2.35     0.0337
Lpr          1         205.63            42.05758        4.89     0.0002
TST          1           0.06             0.03235        1.90     0.0785
CAZ          1          76.94            24.20635        3.18     0.0067
Cvcv         1           0.16             0.49352        0.33     0.7498
Cvcx         1           0.68             0.57070        1.19     0.2553
Cvp          1         -35.78            32.51505       -1.10     0.2897
Cxcv         1         224.82            85.05677        2.64     0.0193
Cxcx         1         -86.48            70.65725       -1.22     0.2412
Cxp          1         -23.31            37.06980       -0.63     0.5397
Pcv          1           0.52             0.49309        1.05     0.3102
Pcx          1          41.74            47.51549        0.88     0.3945
C            1         -23.67            85.41406       -0.28     0.7857
Clpr         1          -0.08            71.60667       -0.00     0.9992
CTST         1          -0.08             0.04733       -1.69     0.1127
CCAZ         1         -86.65            43.13926       -2.01     0.0643
Ccvcv        1          71.95            45.40680        1.58     0.1354
Ccvcx        1          37.95            55.84517        0.68     0.5078
Ccvp         1          85.09            54.14919        1.57     0.1384
Ccxcx        1         136.19            84.98430        1.60     0.1314

The F distribution was used to test the difference between the two regression equations obtained in Section 5.4.2. The use of the F distribution for testing requires the equality of the variances of the error terms of the two models. A modified Levene test was used to check this condition. The Levene procedure uses the absolute errors for each regression equation. The elements needed to perform the Levene test are presented in Table 5.23.

Table 5.23 Elements used to perform the Levene test.

Element                        Events not ending in a stream   Events ending in a stream
$n$                                       19                              15
$\bar{d}$                                 25.98                           18.92
$\sum_i (d_i - \bar{d})^2$              8936.28                         1221.78

The pooled variance $s^2$ was:

$$s^2 = \frac{\sum_i (d_{i1} - \bar{d}_1)^2 + \sum_i (d_{i2} - \bar{d}_2)^2}{n_1 + n_2 - 2} = \frac{8936.28 + 1221.78}{19 + 15 - 2} \approx 317.44$$

where $n_1$, $n_2$ are the numbers of observations in each category (events that did not end/ended in a stream). The Levene test is based on the t distribution, and the calculated value of t was 1.147:

$$t_L = \frac{\bar{d}_1 - \bar{d}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{25.98 - 18.92}{17.82\sqrt{\dfrac{1}{19} + \dfrac{1}{15}}} = 1.147$$

The critical t value for α = 0.05 and 32 degrees of freedom is 2.035. As $t_L < t_{0.975;32}$, there was no evidence that the variances of the two regression lines were different.

The two functions fulfilled all the assumptions and requirements needed to test the simultaneous significance of the mixing function's coefficients. The results of the analysis, based on the methods outlined in Section 4.9.4.2, are shown in Table 5.24.

Table 5.24 Significance of the coefficients of the combined function (test of coefficients; dependent variable: slope length).

Source        DF   Mean Square   F Value   Pr > F
Numerator      8   2922.69530     1.44     0.2619
Denominator   14   2025.17572

As the probability that the calculated F value was greater than the critical F value was 0.2619 > 0.05, the coefficients were not significantly different from 0, and the combination of the two equations was not significantly different from either of the individual equations that were studied. This meant that the equations proposed in Section 5.4.2 could be used for both types of events.

5.5 Fuzzification of the selected variables

Fuzzy regression analysis is based on non-linear programming (Tanaka et al, 1982). Consequently, this study considered other variables together with the significant ones identified using crisp sets (Section 5.4.2.6). The newly added variables were the continuous variables that entered into the composition of TST. The following ranges were established for the fuzzy numbers based on the rules provided in Section 4.9.5.1:
• 20% for the actual slope length of the debris flow. This value was selected to provide as narrow a confidence interval as possible. A cut-off value of 80% precision is generally considered acceptable (Corominas, 1996; Wise, 1997);
• 40% for the transformed path variable (LPATH). As LPATH does not have a real meaning, this value was selected based on the error allowed in the path identification process.
The selected spread of LPATH allowed an error of three reaches in the identification of path reaches for long events and one reach for short ones;
• 10% for the transformed variable representing the slope of the first reach (codified st). This spread allowed an error of 10% for small angles and 15% for large ones;
• 30% for the transformed variable representing the interaction between slope at the initiation point and stand height at the first reach. In terms of real measurements this spread allowed 75% error in determining the stand height (assuming no error in measuring slope at the initiation point);
• 20% for the transformed stand height at the first reach (codified ht). In terms of real variables, this allowed an error of 20% of the stand height; and
• 20% for the transformed canopy closure at the first reach. This spread allowed an error of 16% from the real canopy closure.

5.6 Debris slide-flow travel distance model built using fuzzy sets

Section 4.9.5.2 presented the methods used to determine the debris slide-flow travel distance model using fuzzy sets. As fuzzy regression does not involve any assumptions or supplemental forecast constraints, the estimated equation represents the final model for debris flow travel distance prediction. The fuzzy regression based on the fuzzified data set was:

$$L = -193.8 + 196.68\,LPATH + \text{0M8}\,TST + \text{252J}\,st - \text{U8}\,ht + 46.52\,CAZ + \text{curvature} \qquad (5.7)$$

$$\text{curvature} = 0.59\,cvcv + 0.07\,cvcx + 4.77\,cvp + 150.16\,cxcv - 5.7\,cxp + 0.23\,pcv + 21.47\,pcx$$

where the symbols are as defined in Sections 5.4.2.6 and 5.5.

The optimization technique used to solve the non-linear programming problem was the quasi-Newton method (Simmons, 1975; Bazaraa et al., 1993). A total of 114 iterations were performed to reach the convergence point, the solution of the non-linear programming problem. The convergence criterion of the objective function, the difference between two iterations, was satisfied.
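The spreads listed in Section 5.5 can be read as symmetric intervals around each crisp value. The following Python sketch shows that reading under stated assumptions: the fuzzy numbers are taken to be symmetric, the spread is applied as a fraction of the crisp value, and the dictionary keys (including "cc" for canopy closure) are hypothetical labels. For the travel distance the sketch reproduces the fuzzy limits reported in Table 5.26; for the other variables the same construction is only assumed.

```python
# Relative spreads from Section 5.5. Keys are illustrative labels.
SPREADS = {
    "L": 0.20,      # actual slope length of the debris flow
    "LPATH": 0.40,  # transformed path variable
    "st": 0.10,     # transformed slope of the first reach
    "TST": 0.30,    # slope x stand height interaction
    "ht": 0.20,     # transformed stand height at the first reach
    "cc": 0.20,     # transformed canopy closure at the first reach
}


def fuzzify(value, spread):
    """Return (lower, centre, upper) of a symmetric fuzzy number.

    Assumption: the half-width is spread * |value|.
    """
    half_width = abs(value) * spread
    return (value - half_width, value, value + half_width)


# Event 31-3 has an actual length of 1169.6 m.
low, centre, up = fuzzify(1169.6, SPREADS["L"])
print(round(low, 2), centre, round(up, 2))
```

For event 31-3 this gives limits of 935.68 m and 1403.52 m, the values listed for that event in the fuzzy assessment table.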
The minimum of the objective function was 1835.2, and the bias was

$$\text{bias} = \frac{\sum_{i=1}^{26} (y_i - \hat{y}_i)}{26} = \frac{-275.061}{26} = -10.58$$

where $y_i$ is the actual value of the debris flow travel distance and $\hat{y}_i$ is the predicted debris slide-flow travel distance.

Compared to the crisp regression, where the bias was 0, this did not seem to be very good. However, the procedures are completely different. The crisp regression was determined by minimizing the squared errors, which led to a bias equal to 0 (Gauss-Markov theorem), whereas the fuzzy regression was calculated by minimizing the confidence interval of the predicted value.

The next step was to use t statistics to determine whether the bias of the model built using fuzzy sets was significantly different from 0:

$$t_{calc} = \frac{\text{bias}}{s_{bias}} = \frac{-10.58}{66.13327/\sqrt{26}} = -0.81568$$

$$t_{critic} = t_{0.975;25} = 2.056$$

As $|t_{calc}| < t_{critic}$, the bias was not significantly different from 0.

5.7 Regression assessment

The debris flow travel distance prediction models obtained above had to be assessed for their forecasting performance (using the procedures outlined in Section 4.9.6). The regression was evaluated using the prediction data set, which contained eight events. The regression was also tested on the events eliminated from the prediction data set as outliers and influential cases. The results of the regression function developed using crisp sets are presented in Table 5.25.

Table 5.25 Confidence limits of predicted travel distance using crisp sets.

Event     Lower confidence limit   Predicted value   Upper confidence limit   Actual value
31-3             131.6                 267.3                403.1               1169.6
61-10           1308.9                1554.7               1800.5               1341.7
62-10            458.3                 605.2                752.0                915.1
73-12            161.2                 283.2                405.3                 91.6
94-23            -58.8                  61.5                181.9                115.0
94-24             39.8                 149.4                259.0                 98.7
94-26d           -71.9                  51.9                175.6                131.6
94-30            -63.4                  31.7                126.9                102.9
94-34            -34.0                  65.4                164.8                120.0
94-34b           -63.7                  32.3                128.4                135.0
94-35b           -60.7                  34.4                129.6                 43.3
94-36            -16.7                 148.7                314.0                116.0

The model predicted seven of the eight events correctly, which is within the established limits.
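The count of correct predictions can be checked directly from Table 5.25. The sketch below assumes that a prediction counts as correct when the actual value lies between the lower and upper confidence limits, and that the first four events in the table are the ones eliminated as outliers and influential cases (the remaining eight form the prediction data set). Both assumptions are labeled in the code; the variable and function names are illustrative.

```python
# Rows of Table 5.25: (event, lower limit, upper limit, actual value), in metres.
TABLE_5_25 = [
    ("31-3", 131.6, 403.1, 1169.6),
    ("61-10", 1308.9, 1800.5, 1341.7),
    ("62-10", 458.3, 752.0, 915.1),
    ("73-12", 161.2, 405.3, 91.6),
    ("94-23", -58.8, 181.9, 115.0),
    ("94-24", 39.8, 259.0, 98.7),
    ("94-26d", -71.9, 175.6, 131.6),
    ("94-30", -63.4, 126.9, 102.9),
    ("94-34", -34.0, 164.8, 120.0),
    ("94-34b", -63.7, 128.4, 135.0),
    ("94-35b", -60.7, 129.6, 43.3),
    ("94-36", -16.7, 314.0, 116.0),
]

# Assumption: these four events are the outliers and influential cases.
OUTLIER_EVENTS = {"31-3", "61-10", "62-10", "73-12"}


def count_hits(rows):
    """Return (hits, total); a hit means lower <= actual <= upper (assumed criterion)."""
    hits = sum(1 for _, low, up, actual in rows if low <= actual <= up)
    return hits, len(rows)


prediction_rows = [r for r in TABLE_5_25 if r[0] not in OUTLIER_EVENTS]
outlier_rows = [r for r in TABLE_5_25 if r[0] in OUTLIER_EVENTS]

print(count_hits(prediction_rows))  # (7, 8)
print(count_hits(outlier_rows))     # (1, 4)
```

Under these assumptions the only prediction-set event that falls outside its interval is 94-34b (actual 135.0 m against an upper limit of 128.4 m).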
The model correctly predicted only one of the four outliers. This shows that the crisp model is sensitive to the correct identification of the variables. The events identified as outliers are discussed in more detail below.

The results of the test data regression analysis performed using the fuzzy sets are presented in Table 5.26.

Table 5.26 Assessment of the fuzzy regression analysis.

Event     Fuzzy low actual length   Actual length   Fuzzy up actual length   Fuzzy low predicted   Fuzzy predicted   Fuzzy up predicted
31-3             935.68                1169.6             1403.52                 130.306              248.51              366.72
61-10           1073.36                1341.7             1610.04                 761.893             1214.77             1667.65
62-10            732.08                 915.1             1098.12                 316.805              533.63              750.45
73-12             73.28                  91.6              109.92                 113.608              245.70              377.79
94-23             92.00                 115.0              138.00                  69.073              152.80              236.52
94-24             78.96                  98.7              118.44                 109.615              202.60              295.59
94-26d           105.28                 131.6              157.92                  71.628              154.96              238.29
94-30             82.32                 102.9              123.48                  36.809              102.85              168.89
94-34             96.00                 120.0              144.00                  91.942              176.79              261.64
94-34b           108.00                 135.0              162.00                  33.996              101.84              169.68
94-35b            34.64                  43.3               51.96                  50.502              117.31              184.12
94-36             92.80                 116.0              139.20                  52.580              155.59              258.60

The fuzzy model predicted 100% correctly for the test data set. For the outliers, it correctly predicted two of the four events. An analysis of the incorrectly predicted outliers revealed that several factors were at play.

For event 31-3, the reach definition was violated (one of the reaches was 67 times larger than the adjacent one). Supplemental reaches are needed, with the number of reaches for the same distance being at least six. This would have led to a correct answer being derived by the fuzzy model.

For event 62-10, the reach definition was also violated (one of the reaches was too long, at 299 m). This reach should have been split into two smaller ones. The fuzzy regression correctly predicted the event, but the crisp set model forecast a value outside the confidence interval. If one reach had been added to the event, the model using crisp sets would have predicted correctly.
For event 73-12, there were too many reaches for a short distance, violating the requirement for a minimum length of a reach. For the distance involved, there should only have been two reaches, but four were recorded. The fuzzy model would have had a correct result if only three reaches had been identified.

Given these results, the model built using fuzzy set methods was considered to be robust, even to outliers. The only condition that had to be imposed was to respect the reach definition.

6 Discussion and conclusions

6.1 Discussion

The analysis of the correlation between the raw variables and debris flow travel distance revealed a series of interesting observations. The simple linear regression from Section 5.1 showed that slope morphology, expressed by LPATH, has a greater effect on travel distance than all the other terrain attributes. The path analysis presented a correlation coefficient of 0.68, indicating that debris flow travel distance and terrain morphology co-vary over more than two thirds of their total variation. This result is consistent with other studies (Corominas, 1996; Finlay et al., 1999). The importance of slope morphology in relation to the other attributes was confirmed via simple linear regression utilising the remaining variables.

The average slope analysis produced a non-significant correlation with debris flow travel distance. Initially, this result seemed to be in contradiction with theory (Corominas, 1996; Wise, 1997), given that it suggests that event travel distance does not depend on the terrain slope, but a closer examination of the results in Table 5.1 revealed that they are consistent with the laws of geometry. Debris flow travel distance predicted using only the slope (Fang and Zhang, 1988) was based on the equation:

$$L = \frac{\Delta H}{\sin\varphi} \qquad (6.1)$$

where $L$ is the debris flow travel distance; $\Delta H$ is the elevation difference between the start and end points of the debris flow; and $\varphi$ is the average slope.
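To make the geometry of Equation 6.1 concrete, the short Python sketch below evaluates it for a few hypothetical slopes and elevation differences. The function name and the numerical inputs are illustrative only; they are not taken from the data set.

```python
import math


def slope_travel_distance(delta_h_m, avg_slope_deg):
    """Equation 6.1: L = delta_H / sin(phi), with phi given in degrees."""
    if not 0.0 < avg_slope_deg <= 90.0:
        raise ValueError("average slope must be in (0, 90] degrees")
    return delta_h_m / math.sin(math.radians(avg_slope_deg))


# Same average slope, different elevation differences: L changes.
print(round(slope_travel_distance(50.0, 30.0), 1))   # 100.0
print(round(slope_travel_distance(100.0, 30.0), 1))  # 200.0

# Different slope and elevation difference, same L.
delta_h_45 = 100.0 * math.sin(math.radians(45.0))
print(round(slope_travel_distance(delta_h_45, 45.0), 1))  # 100.0
```

Two events with the same average slope can therefore have very different travel distances, and two events with different slopes can have the same one, which is why a weak correlation between average slope and travel distance does not contradict Equation 6.1.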
However, debris flow travel distance depends not only on slope but also on the elevation difference (Figure 6.1). As the travel distance depends on two elements, slope and elevation difference, the variation of one element will not necessarily lead to variation of travel distance.

Figure 6.1 Dependency of debris flow travel distance on average slope.

Consequently, slope alone cannot explain debris flow travel distance; the elevation difference is also required. The analysis also showed that there was no correlation with the travel distance for the slope of the first reach or the fan (last reach). The statistical approach demonstrated that the stochastic analysis is consistent with the functional theory represented by Equation 6.1.

Two studies have stressed the influence of the initial volume on debris flow travel distance: Wise (1997) and Fannin and Rollerson (1993). However, these studies considered only events that did not end in streams, and this constraint does not apply to the model developed in this study. Different results might be expected in this study, as the sampling design included events that both ended and did not end in streams. In fact, there was no correlation between volume at the initiation point and debris flow travel distance (Table 5.1). For 55% of the events that started on the lower parts of slopes, the initial volume was greater than 40% of the volume of the whole event. Events that started on the lower parts of slopes had a short length, as the termination point (in some cases the stream) was usually close to the point of initiation. Regardless of the slope position, events that involved a large part of their volume in the first reach generally did not have a long path.

Slide azimuth alone did not influence debris flow travel distance; different events with the same azimuth had different lengths.
For example, event 21-101, with an azimuth of 289°, had a length of 359.2 m, whereas event 64-20, with an azimuth of 292°, had a length of only 97.8 m (for more examples please refer to Appendix 1).

Subsurface water flow tends to vary according to slope position: the closer to the top of the slope, the smaller the quantity of water moving through the soil (Viessman and Lewis, 1996). In the Arrow Forest District, this is not always the case, as there can be significant water flow coming off shallow slopes above the main valley-side slopes. This has been identified as a management problem, especially where road construction has concentrated the flow of subsurface water (Jordan, 2002). Increased water flow through the undisturbed soil increases the pore water pressure, and therefore the effective stress is reduced (Terzaghi, 1943; Kenney, 1984; Viessman and Lewis, 1996; Powrie, 1997). There was no correlation between the position of the debris flow initiation point on the slope and debris flow travel distance (Table 5.1). For example, a significant number of short events initiated mid-slope: 67% of the mid-slope events were less than 100 m long. This analysis has shown that the travel distance was not dependent upon slope position.

Terrain curvature influences the local hydrological conditions (Viessman and Lewis, 1996). As the index of terrain curvature that was used in this study only characterized the first reach, it was unlikely that any correlation with the debris flow travel distance would be significant. This was confirmed by the results presented in Table 5.1. However, the interaction between plane and profile curvature was significantly correlated with debris flow travel distance (Table 5.1). This suggests that the local curvature of the first reach significantly influenced travel distance.
This may be related to slope hydrology, since for the same slope, vegetation, soil and geology, local hydrological conditions are influenced by canopy closure and by plane and profile curvature (Viessman and Lewis, 1995, and Bernoulli's law)¹.

Vegetation influences terrain stability through the interception of precipitation and by the contribution of the roots to soil strength (Sidle et al, 1985; Greenway, 1987; Selby, 1993; Watson et al, 1994; Wu et al, 1995; Mapping and Assessing Terrain Stability Guidebook, 1999). The stand dendrometric attributes considered in this study were the average height and diameter of the dominant tree layer measured at the first reach (Selby, 1993). Stand composition did not vary by more than 30% for each species along the paths of individual debris flows, except around the first reach. The average stand height and diameter did not vary by more than 20% from one reach to another, except within the first reach. Both average stand height and diameter were correlated with the event travel distance (Table 5.1). This correlation is an indication of the relationship between terrain state, as expressed by stand parameters, and travel distance. As the elements describing the stand considered in this study characterized only the first reach, their influence must be correlated with other variables explaining the terrain variation along the event trajectory (Corominas, 1996).

The influence of the vegetation species around the first reach on event travel distance was shown in Table 5.1. No correlation was observed between tree composition around the first reach and debris flow travel distance. While stand characteristics likely played a non-significant role once a failure had occurred, these characteristics influenced the probability of terrain failure (Traci, 1985; Watson et al, 1994).

¹ For the same horizontal distance, total head variation is larger for a concave terrain profile than for a convex one.
Stand canopy closure is another element considered in terrain stability analysis (Selby, 1993; Mapping and Assessing Terrain Stability Guidebook, 1999). Canopy closure is usually considered in debris flow initiation investigations as a class variable (Greenway, 1987; Selby, 1993). No correlation was found between canopy closure and travel distance in this study (Table 5.1), but canopy closure may be important when it is considered in combination with other attributes, such as terrain configuration and vegetation type (Gray, 1994). Canopy closure was measured only for the first reach, and therefore its contribution to travel distance is reflected in the quantity of water that reaches the ground. When statistics are used to analyze the relation between event travel distance and different attributes, canopy closure can help in fulfilling the regression analysis assumptions and requirements (Neter et al., 1996).

Finlay et al. (1999) concluded that slope geology influences travel distance; however, the function developed in this study excluded geology. Finlay et al. (1999) did not present any information regarding the significance of the relationship between geology and debris flow travel distance. Usually, the techniques used to establish the correlation between qualitative variables (such as geology) and travel distance are based on a regression equation that uses class variables, i.e. transformed dependent variables (Neter et al., 1996). However, as transformation of the dependent variable can give rise to the problems described in Chapter 2, the information supplied by studies using this technique must be interpreted with caution. More studies are needed to clarify the relationship between geology and debris flow travel distance.

Rheological studies have stressed the importance of soil granulometric properties for terrain stability (Terzaghi, 1943; Innes, 1983; Hungr et al, 1984; Takahashi, 1991; Iverson, 1997).
The size of the particles involved in the mass movement was less than 20 cm (which includes cobbles). In this research several statistical tests were performed to determine whether there was a correlation between soil granulometric properties and debris slide-flow travel distance. The data set in Appendix 1 presents a relatively constant PSD (particle size distribution) along the path for both control and event soil samples: only sand, sand-gravel or gravel. The results show that the PSD type and grading did not have a significant influence on the debris flow travel distance (Table 5.1). The PSD should be seen as a class variable, and in this case the data set used contains only one category (because of the PSD uniformity); therefore there is no relationship between event travel distance and granulometry. This result is useful for the rheological models because Galileo's equation¹ can be used as a tool to determine the event travel distance (Innes, 1983; Hungr et al., 1984; Takahashi, 1991). However, as this research was essentially exploratory, more investigation is needed before it can be inferred that PSD type and grading do not influence debris flow travel distance.

Logging activities have a large impact on debris flow initiation and travel distance (Fannin et al., 1996; Sidle and Wu, 1997). The data set used to build the debris flow travel distance model contained only events that occurred in the stand, with the exception of the first reach. For the events considered in this study, the forestry actions in the first reach did not influence debris flow travel distance (Table 5.1). It would seem that logging activities on the first reach (such as clearcutting or roads) influence the factors affecting debris flow initiation mechanisms, but have a non-significant influence on the travel distance of the subsequent event over un-logged terrain.

The model built using statistics indicated the strong predictive capability of the set of transformed variables.
The transformed variables selected for building this model revealed the same patterns as the non-transformed variables. The only difference is that the correlation coefficient between the independent variables and debris flow travel distance was greater for the transformed variables than for the raw variables. As the co-domain of the transformed variables, except for TST, varied over the same range of values (from 0 to 100), the regression coefficients indicated the relative significance of the independent variables (Freedman et al, 1991; Neter et al, 1996). The most important variable for predicting the debris flow travel distance was the event path. The transformed variables led to a regression equation with a coefficient of determination of 0.975. Considered independently, only two variables were correlated with debris flow travel distance: the path variable and stand height. Considered together, the set of seven variables achieved good predictability (as expressed by the coefficient of determination) and reliable inferences (as expressed by the fulfillment of the assumptions required for the regression).

¹ The final velocity of a solid ($v_f$) depends on the initial velocity ($v_i$), the acceleration ($a$) and the distance between the initial and final positions ($\Delta l$): $v_f^2 = v_i^2 + 2a\Delta l$.

As the debris flow travel distance prediction model was built using statistical procedures, the resulting equation had an intercept. There were two reasons for this approach. Firstly, as the goal of the study was to build a model with as narrow a confidence interval as possible, the Gauss-Markov theorem assumptions must be fulfilled. The Gauss-Markov theorem assumptions could not be fulfilled if the model was built without an intercept. Secondly, if all the variables had a value of 0 (implying no event, a planar surface and no vegetation) the intercept should be 0.
Under such conditions, the predicted length would be:

$$L_{predicted} = -140.35 + 43.65 + \text{curvature} = -96.7 + \text{curvature} \qquad (6.2)$$

If the surface is planar, there is no horizontal or vertical curvature. However, as this combination did not exist in the model (the data set did not have any event with this combination), any coefficient can be assigned to the plan-plan variable. Equation 6.2 demonstrates that when the coefficient of the plane-plane curvature is 0.967 (at canopy closure 0), the curvature has a value of 96.7. This value leads to an intercept equal to 0, and consequently the model can be considered to have no intercept. However, this rationale is only valid for this case; further investigations might assign a different value to the plan-plan variable.

The regression equation can only be used for predictions in the data range defined by the estimation data set (Table 6.1).

Table 6.1 Input data range used in debris flow travel distance prediction.

Variable                                 Lower limit allowed for prediction   Upper limit allowed for prediction
                                         (using proposed model)               (using proposed model)
Debris slide-flow travel distance [m]                 25                                   770
Path [number of reaches]                               1                                     7
Azimuth [°]                                            2                                   357
Slope [°]                                             18                                    53
Stand height [m]                                       0                                    45
Canopy closure [%]                                     0                                   100
Plane curvature [class]                  Plane, convex, concave
Profile curvature [class]                Plane, convex, concave

The model built using the crisp set regression equation conforms to the laws of physics. Whereas the magnitude of a coefficient shows the importance of the effect of the variable on event travel distance, the sign of the coefficient (i.e. positive or negative) presents information related to the relationship between the model and the laws of physics. The path variable coefficient has a positive value in all models that use a regression equation for debris flow travel distance prediction. This means that the larger the LPATH value, the larger the event travel distance.
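The construction of the path variable can be sketched in a few lines of Python. The rule itself (first reach coded 1; a later reach coded 1 if it is steeper than the reach immediately upstream and 0 if it is less steep; the binary string then read as a decimal number) follows the definition of the path variable given in this thesis. Three details are assumptions made only for this illustration: the first reach is treated as the most significant bit, a reach with exactly the same slope as the upstream reach is coded 0, and the logarithm in LPATH is taken as base 10. All function names are hypothetical.

```python
import math


def path_code(reach_slopes_deg):
    """Binary code of a path from the slopes of its reaches, listed from the top down.

    The first reach is always 1; a later reach is 1 if steeper than the reach
    immediately upstream, otherwise 0 (equal slopes coded 0 by assumption).
    """
    if not reach_slopes_deg:
        raise ValueError("at least one reach is required")
    bits = [1]
    for upstream, current in zip(reach_slopes_deg, reach_slopes_deg[1:]):
        bits.append(1 if current > upstream else 0)
    return bits


def path_variable(bits):
    """Decimal value of the binary code (first reach = most significant bit, assumed)."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def lpath(path_value):
    """Transformed path variable log(path + 1); base 10 is an assumption."""
    return math.log10(path_value + 1)


# Hypothetical four-reach path, slopes in degrees from the initiation point down.
bits = path_code([35, 40, 30, 32])
print(bits, path_variable(bits))  # [1, 1, 0, 1] 13
```

Under these assumptions, a four-reach path that steepens at every reach is coded 1111 (15), while one that flattens at every reach is coded 1000 (8), so for the same number of reaches the path variable grows when downstream reaches are steeper than upstream ones.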
The path variable is large if the upstream reach slope is smaller than the downstream reach slope (for the same number of reaches). A positive path variable coefficient therefore demonstrates that the proposed equation is consistent with the law of energy conservation: if the energy of the mass movement increases along the event path, the debris flow travel distance will also increase.

TST varies from 0 to 3328, which explains why its coefficient is small (at least 1000 times smaller) compared with the coefficients for the rest of the significant variables. The maximum influence of TST on event travel distance was 100 m. The coefficient for the variable expressing the interaction between slope and stand height at the first reach (TST) had a positive value. The positive coefficient is consistent with the physics of mass movements: where the slope increases, the event travel distance increases (Newton et al. 2002). However, the positive correlation between debris flow travel distance and stand height at the first reach indicates that the larger the stand height, the larger the travel distance will be. The data set in Appendix 1 reveals that stand diameter and height were recorded where the canopy closure was greater than 0.5. Where the initiation point was surrounded by forest cover (canopy closure greater than 0.5), the water quantity required to initiate mass movement was greater than if there were no vegetation (Selby, 1993). Therefore, for the same event path, travel distance would be larger for debris flows starting within a stand than for those starting in a clearcut. This is because debris flow initiation conditions are more difficult to achieve in forested terrain than in clearcuts (Sidle et al., 1985; Greenway, 1987; Selby, 1993).
In forested terrain, when debris flow initiation conditions are fulfilled at the initiation point, the down-slope hydrological conditions required for terrain failure are also achieved (principle of continuity combined with Pascal's law) (Pao, 1961; Kaufmann, 1963).

The CAZ variable varies from -1 to +1, as the cosine varies from -1 to +1. As the CAZ co-domain is close to the LPATH range (0-3), the absolute value of the coefficient also demonstrates the importance of the independent variables for debris flow travel distance (Tucker, 1962; Freedman et al., 1991). As the CAZ coefficient is smaller than the LPATH coefficient, CAZ is less significant than LPATH. This result confirms the analysis presented in Section 5.2: LPATH was correlated with debris flow travel distance, but CAZ was not.

The value of the CAZ coefficient is consistent with geomorphological theory (Selby, 1993). The positive coefficient demonstrates that events with a northerly exposure have larger travel distances than those with a southerly exposure; at a maximum, an event would be 87.3 m longer on a northern face than on a southern one. The reason likely lies in the difference between the water regimes on the two exposures; southerly faces receive more sun, and therefore evapotranspiration is more intense than on northerly faces.

The terrain curvature coefficients vary according to the relative influence of the variables on event travel distance (Tucker, 1962; Freedman et al., 1991). The largest coefficients are associated with the variables with the highest correlations. These coefficients correspond to the cxcv and cxcx variables, which makes the function consistent with the selection procedures used in Section 5.4.2.6. The value of the coefficient indicates the intrinsic properties of the terrain morphology at the first reach. There are two variables with negative coefficients: cvp and cxcx.
These two negative coefficients raise the possibility that vertical curvature controls the event length, whereas the combination of profile and plan curvature controls the absolute value of the coefficient. This suggests that the energy variation of events is controlled more by the profile curvature than by the plan curvature. This would be consistent with hydrology along the path: concave shapes have a wetter regime than planar or convex ones (Powrie, 1997). This wet regime is a significant element in debris flow triggering and travel distance (Takahashi, 1991; Selby, 1993).

The model was further investigated using the Pearson correlation coefficients presented in Appendix 6. The path variable, TST and CAZ coefficients from the regression equation were consistent with the Pearson correlation coefficients, all of them being positive. For the class variables quantifying the terrain curvature, the model equation coefficients and the Pearson correlation coefficients were consistent only for cvp, cxcv and cxcx (see Table 5.3). For the remaining class variables, the Pearson correlation coefficient signs did not match the model equation coefficient signs. This requires further explanation. Firstly, none of the variables with opposite coefficient signs were correlated with the dependent variable. Therefore the information supplied by these variables does not contribute significantly to the modification of the dependent variable. Secondly, the opposite sign demonstrates that the information corresponding to the respective variables is also supplied by different variables (Bernstein et al., 1987). This means that the significant variables provide the same information as the ones with opposite signs for the Pearson correlation coefficients and model equation coefficients. Therefore, the use of non-significant variables in the final crisp set regression model is based only on the need to preserve the class variable as a whole (Neter et al., 1996).
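The sign mismatch described above can be reproduced with a small numerical sketch. The data below are synthetic (they are not taken from the thesis data set), and the helper names are hypothetical. The second predictor nearly duplicates the first, so its Pearson correlation with the response is positive while its coefficient in the two-predictor least squares fit is negative.

```python
import math


def centered_sum(a, b):
    """Sum of products of the deviations of a and b from their means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))


def pearson(a, b):
    """Pearson correlation coefficient between a and b."""
    return centered_sum(a, b) / math.sqrt(centered_sum(a, a) * centered_sum(b, b))


def two_predictor_slopes(x1, x2, y):
    """Slopes b1, b2 of y = b0 + b1*x1 + b2*x2 from the normal equations."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Synthetic data: x2 carries almost the same information as x1.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [v + d for v, d in zip(x1, [0.5, -0.5] * 4)]
y = [3 * a - b for a, b in zip(x1, x2)]

r_x2_y = pearson(x2, y)                   # simple correlation of x2 with y
b1, b2 = two_predictor_slopes(x1, x2, y)  # slopes when both predictors are used
```

Here `r_x2_y` is positive and `b2` is negative, because most of what `x2` says about `y` is already supplied by `x1`.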
Pearson correlation coefficients for the raw variables are also presented in Appendix 6. The results show that the transformation of the path variable into LPATH increases its correlation with debris flow travel distance from 0.84 to 0.93. The remaining variables used in the regression were not correlated with event travel distance in their untransformed state. These results were also produced by simple linear regression analysis, with the exception of stand height at the initiation point (due to the data set used). The simple linear regression performed earlier in this Section used the whole data set (38 events), whereas the Pearson coefficients were determined using the reduced estimation data set from Section 5.4.2.

Further examination reveals some interesting trends in these results. Slope, stand height and canopy closure around the first reach are all strongly correlated with each other. This means that if they were all correlated with the event travel distance (which they are not), they would supply essentially the same information. The difference in the sign of the Pearson correlation coefficient demonstrates the lack of correlation between the slope of the first reach and event travel distance; the negative value suggests travel distance increases as slope decreases, although the non-significance of the coefficient means that any relationship can be dismissed. However, this demonstrates that the interpretation of the Pearson correlation coefficient and simple linear regression analysis is only useful for significant relationships (Freedman et al., 1991; Mendenhall, 1984).

The azimuth of the event as a whole is correlated with debris flow travel distance. This means that the variables characterizing the entire event, path and azimuth, have a stronger impact on the travel distance than the variables describing local terrain properties (i.e. initial slope, stand height, canopy closure, plan and profile curvature).
However, the variables describing local terrain properties are useful in fulfilling the mathematical assumptions required for regression analysis. As the transformed variables are more significant than the raw variables, they supply useful information for prediction in the model.

The fuzzification process aimed to capture the variability of attributes over time, including both intrinsic variability and the limitations of human operators and instruments (Zadeh, 1965). Variation in each variable was initially interpreted independently. Where the non-linear programming algorithm used to solve the fuzzy approach did not supply a convergent feasible solution, the initial variable variation was adjusted to fulfill the mathematical requirements needed for convergence and feasibility.

As a result of backwards erosion of the head scarp, the length of a debris flow continues to increase from the moment of occurrence until the moment when data about it are collected. The initial event length is therefore smaller than the measured value. Data recording is also imperfect, especially measurements taken using a hypsometer while walking the length of the event, which tend to exaggerate the total length of the event. These two opposing trends determine the procedures for assigning the fuzzy value for the event length. The most likely value for the debris flow travel distance is the measured one (assumption two). The least likely values are determined by subtracting the spread of the fuzzy number from, and adding it to, the most likely value. Both the non-linear programming convergence requirements and the minimal confidence interval of the predicted debris flow travel distance were considered in establishing the spread of the fuzzy number. Wise (1997) found a maximum coefficient of determination of 0.828 for unconfined events. This coefficient of determination led to confidence intervals larger than 20% for event length.
Corominas (1996) obtained a correlation coefficient of 0.763 between dependent and independent variables for debris flows. This value also provided confidence intervals for the predicted values that were larger than 20% of the event length. For the purpose of this research, the variation of the value assigned for the debris flow travel distance was established at 20%, to fulfill the non-linear programming convergence requirement and also ensure the smallest confidence interval for the predicted value (Cannon, 1993; Corominas, 1996; Megahan and Ketcheson, 1996; Wise, 1997; Finlay et al., 1999).

As the predictor variable, the transformed path variable should have as large a variation as possible. Even if the path identification were inaccurate, the model would still have to predict correctly. Path recognition varies with event length. Consequently, for short events, the path variable could change within one reach; for long events, the path could vary within three reaches. The average number of reaches for short events, with a length of less than 200 m, is two. The analysis of the reach distribution showed that, with an error of one reach, more than 80% of the events had between one and three reaches. If the event was longer than 500 m, its path was difficult to identify accurately, as both event trajectory and map scale influence accurate reach identification. As this study had an exploratory nature regarding the path variable, a maximum of three reaches was allowed for event path misidentification. This value allowed a variation of 35-55% of the LPATH value. The variation in LPATH was established at 40% of the LPATH value, as required to fulfill the non-linear programming conditions.

The variation of the first reach slope depended on the slope value; for steeper slopes the errors in identification can also be high. This is because of the difficulties encountered in precisely measuring distances on maps.
When contours were close, there was a greater error in the determination of plan length. Since the cosine function had a circular variation, the difference between shallow and steep slopes should not have been very large. This study therefore considered that the errors in slope identification could range from 10 to 15% of the actual slope of the first reach. The non-linear programming conditions forced the variation associated with the cosine of the slope to be in a range of 10% of the slope cosine.

The method used to measure average stand height in the field resulted in a 10% variation from the real average value (Munteanu et al., 1980). However, since stand height varies with the sampled trees, an increase in variation was considered. The total stand average variation was established at 20% of the measured average stand height for the purpose of this model. As the relationship between the raw and transformed stand heights was linear, the variation of the transformed height was also 20% of the transformed value. The variation of TST was determined in relation only to the non-linear programming requirements. The transformed variable was allowed to vary within a range of 30% of the measured value.

Canopy closure at the first reach was very difficult to measure, especially when it had low values. Measured canopy closure typically varies within a range of 10% from the real value (Munteanu et al., 1980). As several ecosystems could be present in the area around the first reach, canopy closure was allowed to vary by more than 10%. The non-linear programming requirements indicated that a variation of the transformed canopy closure of 20% would fulfill the mathematical conditions. Translated back to the raw variable, this allowed a variation of 16% in the canopy closure at the first reach.

The azimuth was not fuzzified because the symmetrical triangular fuzzy number could lead to a value for the cosine of greater than 1, contradicting the cosine definition.
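The spreads listed above can be illustrated with a symmetrical triangular fuzzy number. The sketch below is a minimal illustration, not the procedure used in the study: it assumes that each percentage is the half-width of the triangle relative to the measured value, and the function names, the 250 m example length and the 10% spread applied to the cosine of the azimuth are hypothetical.

```python
import math

# Relative spreads reported in the text, as fractions of the central value.
SPREADS = {
    "travel_distance": 0.20,
    "LPATH": 0.40,
    "cos_slope": 0.10,
    "stand_height": 0.20,
    "TST": 0.30,
    "canopy_closure_transformed": 0.20,
}


def triangular_support(center, rel_spread):
    """Lower and upper ends of a symmetrical triangular fuzzy number."""
    half_width = abs(center) * rel_spread
    return center - half_width, center + half_width


def membership(x, center, rel_spread):
    """Membership grade: 1 at the measured value, 0 at and beyond the ends."""
    half_width = abs(center) * rel_spread
    if half_width == 0:
        return 1.0 if x == center else 0.0
    return max(0.0, 1.0 - abs(x - center) / half_width)


# A measured travel distance of 250 m (hypothetical) with the 20% spread.
low, high = triangular_support(250.0, SPREADS["travel_distance"])
grade_at_275 = membership(275.0, 250.0, SPREADS["travel_distance"])

# Why the azimuth stays crisp: for a due-north event cos(azimuth) is already 1,
# so any symmetrical spread (10% here, a hypothetical value) exceeds 1.
caz = math.cos(math.radians(0.0))
caz_low, caz_high = triangular_support(caz, 0.10)
```

The upper end `caz_high` lies above 1, which no cosine can reach; this is the contradiction that led to the azimuth being left un-fuzzified.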
Azimuth was the only un-fuzzified variable in the model.

More predictor variables were considered in the fuzzy set model than in the crisp set model. This was because of the non-linear programming method that was adopted. In order to supply as precise a model as possible, a large number of independent variables had to be considered (Chvatal, 1983; Winston, 1994). The non-linear programming method considered the same seven variables used in the crisp set modeling but also incorporated ht and st, the components of TST (see Footnote 1 below). These two variables increased the precision of the fuzzy set prediction model. The coefficient signs for the fuzzy set model were the same as for the crisp set model. The new variable, st, had a coefficient consistent with the physics of the processes involved: steeper slopes are associated with longer events. Although the coefficient for stand height was larger than 1, its influence on event travel distance was not important (the maximum influence was 5.5 m). The negative sign for the relationship between stand height at the initiation point and event travel distance can be discounted, as the relationship was not significant.

[Footnote 1] ht is the transformed average stand height ($h$), obtained using the equation $ht = (h + 1)/10$, and st is the transformed slope of the first reach ($\varphi$), obtained using the equation $st = \sin\varphi$.

The regression equation used for debris flow travel distance prediction was determined using a sample of 38 events from the total of 582 inventoried. If crisp sets were used, a larger data set would lead to a better estimation of the coefficients and also to a narrower confidence interval for predicted debris flow travel distance (McClave and Dietrich, 1991; Craiu, 1997). For the fuzzy set model, a larger data set would result in a larger number of constraints and therefore a more precise identification of the feasible region (Chvatal, 1983; Winston, 1994).
A better identification of the feasible region would also lead to a more precise solution for the convergence point, and consequently a more accurate debris flow travel distance prediction.

For the crisp set model, because the dependent variable is the unmodified event travel distance, there is no bias; under the Gauss-Markov theorem, the least squares method provides unbiased estimators. The Gauss-Markov theorem also demonstrates that the estimated equation coefficients provide the smallest confidence intervals for the predicted variable. The situation is different for the fuzzy set model. As each value is associated with a degree of uncertainty, the application of crisp set algorithms would have led to a large confidence interval for the predicted value (Dubois and Prade, 1980; Klir and Yuan, 1995; Nguyen and Elbert, 1999). To reduce the confidence interval associated with the predicted value, different techniques are required. The work of Tanaka et al. (1982) minimized the confidence interval but did not consider the degree of bias of the estimators. The bias of the fuzzy set model was only 10.58, representing 4% of the average debris flow travel distance. As the bias of the fuzzy set equation was not significantly different from 0, there is not sufficient evidence to infer that the fuzzy regression is biased.

The two equations obtained using different procedures from set theory were comparable: both were unbiased and provided the minimum confidence interval for the predicted value. The assessment of the two regression equations, based on crisp and fuzzy sets, demonstrated that both models performed extremely well; the predicted values fell within the established confidence intervals. The fuzzy set model correctly predicted 100% of the events from the prediction data set, whereas the crisp set model incorrectly predicted one event. The fuzzy set approach supplies a more precise model than the crisp set approach for events shorter than 150 m.
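One way the two equations could be combined in software is sketched below. This is a hypothetical arrangement, not the rule used in the study: it assumes that the crisp prediction itself is compared with the 150 m boundary, and the two model functions are placeholders, since equations 5.4 and 5.6 are not reproduced in this section.

```python
def select_prediction(features, crisp_model, fuzzy_model, threshold_m=150.0):
    """Return (predicted length in m, name of the model used).

    Assumption: the crisp prediction decides which equation is reported;
    short events go to the fuzzy model, longer ones stay with the crisp model.
    """
    crisp_length = crisp_model(features)
    if crisp_length < threshold_m:
        return fuzzy_model(features), "fuzzy"
    return crisp_length, "crisp"


# Placeholder models for illustration only (not equations 5.4 and 5.6).
def toy_crisp(features):
    return 100.0 * features["lpath"]


def toy_fuzzy(features):
    return 90.0 * features["lpath"] + 15.0


short_event = select_prediction({"lpath": 1.0}, toy_crisp, toy_fuzzy)
long_event = select_prediction({"lpath": 3.0}, toy_crisp, toy_fuzzy)
```

With these placeholder models the short event is reported from the fuzzy equation and the long event from the crisp one.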
The crisp set regression equation had negative values for the lower limit of the confidence interval for 78% of debris flows with a travel distance of less than 150 m. However, for longer events, the crisp set model had a narrower confidence interval. The model determined using the fuzzy set approach is more robust than the crisp set model. In this regard, the fuzzy set model is more valuable and its results more trustworthy than those of the crisp set model.

The narrow confidence interval for predicted debris flow travel distance led to a precise identification of the element at risk. The probability of misidentification of the element at risk was less than 5%, the type I error level selected in Chapter 5, for both models. Therefore, using equations 5.4 and 5.6 to assess the risk associated with terrain failure will decrease the probability of misidentification of an element at risk.

6.2 Conclusions

The debris flow travel distance model is based on regression analysis, using the unmodified event travel distance as the dependent variable. The regression equation predicts the debris flow travel distance with a high degree of precision (coefficient of determination $R^2 = 0.975$, or an event travel distance spread of 20% of its value), regardless of whether the model was based on a crisp or fuzzy set approach. As all the assumptions of regression analysis are met and the equation predicts values within the established confidence interval ($\alpha = 0.05$), the model is robust, regardless of the skewness of the initial data set.

Compared to other debris flow travel distance models (Cannon, 1993; Corominas, 1996; Megahan and Ketcheson, 1996; Wise, 1997; Hungr, 1999), the equation proposed here incorporated a larger number of terrain attributes. A reduced set of terrain attributes can explain only a small part of the phenomena occurring in the debris flow process and leaves a series of processes un-investigated (e.g. geomorphological attributes do not explain the rheology).
Incorporating a large number of attributes in the model made the prediction more accurate than that produced by models that use fewer attributes, or a subset of the larger one (Luckasievicz, 1963). Statistically, more predictor variables lead to larger coefficients of determination and a smaller mean square error associated with the predicted hyperplane equation (Neter et al., 1996). This reduction in mean square error represents the increase in prediction accuracy (McClave and Dietrich, 1991).

In addition, a newly designed variable that describes the event path enabled this debris flow travel distance model to include both event types, ending and not ending in a stream, in the same analysis. This created more flexibility, with two effects on debris flow travel distance prediction. Firstly, the model considered events that conformed to the real terrain variation, without imposing any restriction based on the termination point. Secondly, examining both types of events simultaneously enabled the sample size of the estimation data set to be increased, and consequently the confidence interval of the predicted length was narrower.

The event path was based on the reach definition. If a larger number of events had been inventoried, a better definition of the reach boundaries established in Section 4.5 would have been achieved. As these limits are area-dependent, and the values can vary from region to region, a more exact definition of a reach should consider the physics of mass movements.

The fuzzy set theory used in this study was shown to be a powerful tool for quantifying the uncertainties related to the data values. For events with short travel distances, the results supplied by the fuzzy set model are better than those provided by the crisp set model.

The variable that describes terrain variation, LPATH, uses attributes related only to terrain morphology. The elemental terrain variation unit is the reach.
Consequently, LPATH captures the terrain variation from the debris slide-flow perspective. LPATH represents the terrain morphology by a single number. The quantification process is based on the assumption that reach limits and succession are correctly identified. Because the binary codification has a dynamic character, resulting from the comparison between two successive slopes rather than with a fixed value, it captures the terrain variation at each specific site. This feature makes the codification attractive because it is suitable for any type of terrain variation.

The set of variables required to determine the debris flow travel distance does not include the position of the initiation point. To calculate the travel distance, only maps that describe the profile morphology (path variable, local plan and profile curvature), stand characteristics (canopy closure, average height) and azimuth are required. Therefore the model is easy to implement in any software that works with this type of information, such as ArcView 3.2 or ArcGIS 8.0. As the debris flow travel distance model is represented by a set of two equations and a logical operator (IF) that decides what type of equation is applied (based on fuzzy or crisp sets), the necessary calculations are fast and consequently the results are obtained in real time.

6.2.1 Recommendations for future research

The models developed here have brought into focus two issues related to the investigation of terrain stability, namely the quantification of terrain variability and the uncertainties related to the data values. As the quantification of terrain variability is based on the reach definition, which in turn is determined by the local conditions, a more precise identification of reaches is recommended. This could focus on efforts to eliminate the constraints imposed by the regression analysis approach.
A more dynamic definition, determined by the geomorphological attributes of the event trajectory, could be a goal of future investigations based on terrain variation.

The goal of the debris flow travel distance prediction model was to provide as narrow a confidence interval as possible. For the crisp set model, this goal was achieved by a regression equation with a coefficient of determination very close to 1. For the fuzzy set model, a narrow confidence interval for the predicted value was achieved by the small spread of the dependent variable, event travel distance. A more precise approach would be to establish the spread of the debris flow travel distance as a fuzzy number for each event in metric units, not in percentages.

Soil fine particle content plays an important role in terrain stability investigation (Iverson, 1997), and this study revealed that there is a significant correlation between soil fine particle content at the initiation point and debris flow travel distance (Table 5.1). As the granulometry was not considered along the path, its influence on the event length requires further investigation, directed especially at quantifying the variation in granulometry along the debris flow trajectory. This is important for valley-confined flows, where debris lying in the channel may be entrained in the flow. It is obviously not important for events that move over the vegetation surface.

References

Abramson, L.W., Lee, T.S., Sharma, S., and Boyce, G.M. (1996). Slope stability and stabilization methods. New York: John Wiley and Sons, pp. 629
Adcock, R.J. (1877). Note on the method of least squares. The Analyst 4: 183-184.
Adcock, R.J. (1878). A problem in least squares. The Analyst 5: 53-54.
Agriculture and Agri-Food Canada (2002). Canadian Soil Information System. Ottawa: National Research Council Canada, 187 pp
Atkinson, P.M. and Massari, R. (1996). Predicting the relative likelihood of landsliding in the Central Apennines, Italy. In R.J.
Abrahart and M. Abrahart (eds.), Extended Abstracts from the 1st International Conference on GeoComputation. Leeds: University of Leeds, http://www.ashville.demon.co.uk/gcl996/abs005.htm (18 Aug. 2002)
Atkinson, P.M. and Massari, R. (1998). Mapping susceptibility to landsliding in the Central Apennines, Italy. Computers and Geosciences 24: 373-385.
Bagnold, R.A. (1954). Experiments on a gravity-free dispersion of large solid spheres in a Newtonian fluid under shear. Proceedings of the Royal Society of London 225: 49-63.
Bardossy, A. (1990). Note on fuzzy regression. Fuzzy Sets and Systems 37: 65-75.
Bazarra, M.S., Sherali, H.D., Shetty, C.M. (1993). Nonlinear programming. Theory and Algorithms (2nd ed.). New York: John Wiley & Sons, pp. 368
BC Ministry of Energy and Mines (2002). BCGS Geology Map. http://ebony.gov.bc.ca/mapplace/minpot/bcgs.cfm (18 Aug. 2002)
BC Ministry of Forests (1999). Mapping and Assessing Terrain Stability Guidebook. (2nd ed.) Victoria: Ministry of Forests. 43 pp
Beaton, A.E. and Tuckey, J.W. (1974). The fitting of power series, meaning polynomials, illustrated on bandspectroscopic data. Technometrics 15: 147-185.
Belsley, D.A., Kuh, E., and Welsch, R.E. (1980). Regression diagnostic: Identifying influential data and sources of collinearity. New York: John Wiley and Sons, pp. 292
Bergado, D.T., Miura, N., Onitsuka, K., Anderson, L.R., Bowels, D.S. and Sharp, K.D. (1988). Probabilistic assessment of earth slope stability by variation reduction and nearest-neighbor methods. In C. Bonnard (ed.), Proceedings of the fifth international symposium on landslides. Rotterdam: A.A. Balkema. pp. 501-514.
Berk, K.N. (1977). Tolerance and condition in regression computations. Journal of the American Statistical Association 72: 863-866.
Bernstein, I.H., Garbin, C.P. and Teng, G.K. (1987). Applied multivariate analysis. New York: Springer Verlag. pp. 508
Bishop, A.W. (1955).
The use of the slip circle in the stability analysis of slopes. Geotechnique 5: 7-17.
Blackburn, P., de Rijke, M., and Venema, Y. (2002). Modal logic. Cambridge: Cambridge University Press, pp. 554
Breush, T.S. and Pagan, A.R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica 47: 1287-1294.
Burns, R.M. and Honkala, B.H. (1990). Silviculture of North America. Washington, DC: US Government Printing Office, pp. 675
Cannon, S.H. (1993). An empirical model for the volume-change behaviour of debris flow. In H.W. Shen & F. Wen (eds.), Hydraulic Engineering '93. San Francisco: ASCE. pp. 1768-1773
Carrara, A., Cardinali, M., Deti, R., Guzzeti, F., Pasqui, V., and Reichenbach, P. (1991). GIS techniques and statistical models in evaluating landslide hazard. Earth Surfaces Processes and Landforms 16: 427-445.
Carrara, A., Cardinali, M., Gauzzetti, F., and Reichenbach, P. (1995). GIS technology in mapping landslide hazard. In Carrara A & Guzzetti F. (eds.), Geographical Information Systems in Assessing Natural Hazards. Dordrecht: Kluwer Academic Publisher, pp. 135-176
Chatterjee, S. and Price, B. (1977). Regression Analysis by Example. New York: John Wiley and Sons, pp. 228
Chirita, C.D. (1974). Ecopedologie cu baze de pedologie generala. Cluj Napoca: Editura Ceres, pp. 591
Ciucu, G. (1963). Elemente de teoria probabilitatilor si statistica matematica. Bucuresti: EDP. pp. 358
Chowdhury, R.N. and Xu, D.W. (1993). Rational polynomial technique in slope-reliability analysis. Journal of Geotechnical Engineering 119: 1910-1927.
Chvatal, V. (1984). Linear programming. New York: W.H. Freeman and Company, pp. 477
Cleveland, W.S., Devlin, S.J., & Grosse, E. (1988). Regression by local fitting. Journal of Econometrics 37: 87-114.
Cochran, W.G. (1977). Sampling techniques. (3rd ed.) Singapore: John Wiley and Sons, pp. 428
Conover, W.J. (1999). Practical Nonparametric Statistics. (3rd ed.)
New York: John Wiley and Sons, pp. 584
Cook, R.D. (1977). Detection of Influential Observations in Linear Regression. Technometrics 19: 15-18.
Cook, R.D. (1979). Influential Observations in Linear Regression. Journal of the American Statistical Association 74: 169-174.
Corominas, J. (1996). The angle of reach as a mobility index for small and large landslides. Canadian Geotechnical Journal 33: 260-271.
Corominas, J. (1996). The angle of reach as a mobility index for small and large landslides: Reply. Canadian Geotechnical Journal 33: 1029-1031.
Covello, V. and Merkhofer, M. (1993). Risk Assessment Methods. New York: Plenum Press, pp. 318
Craiu, M. (1998). Statistica matematica. Bucuresti: Matrix Rom. pp. 243
Craiu, V. (1997). Statistica matematica. Bucuresti: Ed. Universitatii Bucuresti. pp. 253.
Creanga, I. (1965). Introducere in teroria numerelor. Bucuresti: EDP. pp. 453
Cruden, D.M. (1991). A simple definition of a landslide. Bulletin of the International Association of Engineering Geology 43: 27-29.
Cruden, D.M. and Varnes, D.J. (1996). Landslide Types and Processes. In A.K. Turner and R.L. Schuster (eds.), Landslides investigation and mitigation (2nd ed.). Washington DC: National Academy Press, pp. 36-71
de Martonne, E. (1926). Une nouvelle fonction climatologique: l'indice d'aridite. La Meteorologie 2: 449-458.
Demaerschalk, J.P. and Kozak, A. (1974). Suggestions and criteria for more effective regression sampling. Canadian Journal of Forest Research 4: 341-348.
Demaerschalk, J.P. and Kozak, A. (1975). Suggestions and criteria for more effective regression sampling 2. Canadian Journal of Forest Research 5: 496-497.
Dubois, D. and Prade, H. (1980). Fuzzy sets and systems: theory and applications. New York: Academic Press. pp. 393
Duncan, J.M. (1996). Soil slope stability analysis. In A.K. Turner and R.L. Schuster (eds.), Landslides investigation and mitigation (2nd ed.). Washington DC: National Academy Press. pp. 337-371
Durbin, J.
and Watson, G.S. (1950). Testing for serial correlation in least square regression: I. Biometrika 37: 409-428.
Durbin, J. (1970). Testing for serial correlation in least squares regression when some of the regressors are lagged dependent variables. Econometrica 38: 410-421.
Elfving, G. (1952). Optimum allocation in linear regression theory. Annals of Mathematical Statistics 23: 255-262.
Environment Canada. Canadian Climate and Water Information. 2002. http://www.mscsmc.ec.gc.ca/climate/climate normals/stnselect e,cfm?&RequestTimeout=120&StationName=&SearchTy pe=&Province=BC (18 Aug. 2002)
Evans, S.G. (1997). Fatal landslides and landslide risk in Canada. In D.M. Cruden and R. Fell (eds.), Landslide risk assessment. Rotterdam: A.A. Balkema. pp. 185-196
Fang, Y.S. and Zhang, Z.Y. (1988). Kinematic mechanism of catastrophic landslides and prediction of their velocity and traveling distance. In C. Bonnard (ed.), Proceedings of the fifth international symposium on landslides. Rotterdam: A.A. Balkema. pp. 125-128
Fannin, R.J. and Rollerson, T.P. (1993). Debris flow: some physical characteristics and behavior. Canadian Geotechnical Journal 30: 71-81.
Fannin, R.J. and Wise, M.P. (1995). A method for calculation of debris flow travel distance. In Proceedings of the 48th Canadian Geotechnical Conference. Vancouver, pp. 643-650
Fannin, R.J., Wise, M.P., Wilkinson, J.M.T., and Rollerson, T.P. (1996). Landslide initiation and runout on clearcut hillslopes. In Proceedings of the 48th Canadian Geotechnical Conference. Vancouver, pp. 195-199
Fannin, R.J. and Wise, M.P. (2001). An empirical-statistical model for debris flow travel distance. Canadian Geotechnical Journal 38: 982-994.
Fellenius, W. (1927). Erdstatische Berechnungen mit Reibung und Kohasion (Adhasion) und unter Annahme Kreiszylindrischer Gleitflachen (Earth stability calculations assuming friction and cohesion on circular slip surfaces). Berlin: W. Ernst & Sohn Verlag. pp. 40
Finlay, P.
J., Mostyn, G . R., and Fell, R. (1999). Landslide risk assessment: prediction of travel distance. Canadian Geotechnical Journal, 36: 556-562. Freedman, D., Pisani, R., Purves, R., and Adhikari, A . (1991). Statistics. (2 ed.) New York: W W Norton and Company, pp.619 Fruhwirth-Schnatter, S. (1992) On statistical inference for fuzzy data with application to descriptive statistics. Fuzzy Sets and Systems 50: 143-165. Gash, J. H . C. (1979). A n analytical model of rainfall interception by forests. Quarterly Journal of Royal Meteorological Society 105: 43-55. Gray, D . H . (1995). Keynote address. Influence of vegetation on the stability of slope. In Barker, D . H . (ed.) Vegetation and slopes. Trowbridge: Thomas Thelford services, pp.2-25. Greenway, D. R., Brian-Boys, K. C , and Anderson, M . G . (1984). Influence of vegetation on slope stability in Hong Kong. In Proceedings 4th International Symposium of Landslides. Toronto, pp. 399-404 Greenway, D. R. (1987). Vegetation and Slope stability. In L.R.Anderson and K. S. Richards (eds.), Slope stability. Plymouth: John Wiley and Sons. pp. 187-230 Halliday, D. and Resnick, R. (1986). Fundamentals of Physics. (2 ed.) New York: John Wiley and Sons. pp. 922 Hammond, C , Hall, D., Miller, S., and Swetik, P. (1992). Level I stability analysis documentation for version 2.0 Ogden: US Department of Agriculture, pp. 190 Hansen, A . (1984). Landslide hazard analysis. In D.Brunsden and D. B. Prior (eds.), Slope Instability. Chichester: John Wiley and Sons. pp. 523-602 Hansen, D. (1996). The angle of reach as a mobility index for small and large landslides: Discussion. Canadian Geotechnical Journal 33: 1027-1029. Hashino, M . , Yao, H . , and Yoshida, H . (2002). Studies and evaluations on interception processes during rainfall based on a tank model. Journal of Hydrology 255: 1-11. 150 Heim, A . (1989). Landslides and human lives. Vancouver: BiTech Publishers, pp. 195 Helliwell, D. R. (1994). 
Rooting habits and moisture requirements of trees and other vegetation. In D.H.Barker (ed.), Vegetation and slopes. Trowbridge: Thomas Telford Services, pp. 260-263 Heshmaty, B. and Kandel, A . (1985). Fuzzy linear regression and its applications to forecasting in uncertain environment. Fuzzy Sets and Systems 15: 159-191. Hoaglin, D. C , Mosteller, F., and Tukey, J. W. (1985). Exploring data tables, trends, and shapes. New York: John Wiley and Sons, pp.560. Hong, D. H . , Lee, S., and Do. (2001) Fuzzy linear regression analysis for fuzzy input-output data using shapepreserving operations. Hosie, R.C. (1969). Fuzzy Sets and Systems Native trees of Canada. 122: 513-526. (7 ed.). Ottawa: Canadian Forest Service, pp.380 Hungr, O., Morgan, G . C , and Kellerhals, R. (1984). Quantitative analysis of debris torrent hazards for design of remedial measures. Canadian Geotechnical Journal 21: 663-677. Hungr, O. (1995). A model for the runout analysis of the rapid flow slides, debris flows and avalanches. Canadian Geotechnical Journal 32: 610-623. Hungr, O. (1999). Landslide/Debris flow runout prediction for risk assessment. (FR-96/97-777). Vancouver: Department of Earth and Oceanic Sciences. U B C . pp. 52 Innes, J. L . (1983). Debris flow. Progress in Physical Geography 1: 469-501. Iverson, R. M . and Major, J. J. (1986). Groundwater seepage vectors and the potential for hillslope failure and debris flow mobilization. Water Resources Research 22: Iverson, R. M . (1997). The physics of debris flows. 1543-1548. Reviews of Geophysics 35: 245-296. Janbu, N . (1973). Slope stability computations . In R.C.Hirschfel and S. J. Poulos (eds.), Embankment Dam Engineering: Casagrande Memorial Volume. New York: John Wiley and Sons, pp.47-86. Johnson, A . M . and Rodine, J. R. (1984). Debris flow. In D.Brunsden and D. B. Prior (eds.), Slope instability. Norwich: John Wiley and Sons. pp. 257-361 Jordan, P. (2002). 
Landslide frequencies and terrain attributes in Arrow and Kootenay Lake Forest District. Unpublished, pp. 30 151 Kaufmann, W . (1963). Fluid mechanics. New York: McGraw-Hill Book Company, pp.432. Kenney, C . (1984). Properties and behaviors of soils relevant to slope instability. In D.Brunsden and D . B. Prior (eds.), Slope instability. John Wiley and Sons. pp. 27-65 Klir, G . and Yuan, B. (1995). Fuzzy sets and fuzzy logic. New Jersey: Prentice Hall PTR. pp.574 Koopmans, T. C. (1937). Linear Regression Analysis of Economic Time Series. Haarlem: Netherlands Economic Institute, pp.150 Kummell, C. H . (1876). New investigation of the law of errors of observation. The Analyst 3: 165-171. Kummell, C. H . (1879). Revision of Proof of the Formula for the Error of Observation. The Analyst 6: 80-81. Lau, K. C. and Woods, N . W . (1997). Review of methods for predicting the travel distance of debris flow from landslides on natural terrain. (GEO Technical Note 7/97) Hong Kong: Geotechnical Engineering Office. pp.48 Leshchinsky, D. and Huang, C. C . (1992). Generalized slope stability analysis: interpretation, modification and comparison. Journal of Geotechnical Engineering 118: 1559-1575. Levene, H . (1960). Robust Tests for the Equality of Variance. In I.Olkin (ed.), Contributions to Probability and Statistics. Palo Alto: Stanford University Press, pp. 278-292 Levin, M . J. (1964). Estimation of a system pulse transfer function in the presence of noise. I E E E Transactions on Automatic Control, AC-4: 37-43. Lukasiewicz, J. (1963) Elements of mathematical logic. Oxford: Pergamon Press, pp.124 Marshall, P. L . and Demaerschalk, J. P. (1986). A strategy for efficient sample selection in simple linear regression problems with unequal per unit sampling costs. Forestry Chronicle 62:16-19. McClave, J. T. and Dietrich, F. H . (1991). Statistics. (5 ed.) San Francisco: Dellen Publishing Company, pp.928 McLellan, P. J. and Kaiser, P. K . (1984). 
Application of a two parameters model to rock avalanches of the Mackenzie Mountains. In Proceedings IV International symposium on landslides. Downsview: Canadian Geotechnical Society, pp. 1: 559-565 Megahan, W . F. and Ketcheson, G . L . (1996). Predicting downslope travel of granitic sediments from forest roads in Idaho. Water Resources Bulletin 32: 371-381. 152 Mendenhall, W . (1984). A course in business statistics. Boston: Duxbury Press, pp.670 Miao, T. D. & A i , N . S. (1988). Landslide analysis and prediction by catastrophe theory. In C . Bonnard (ed.), Proceedings of the fifth international symposium on landslides. Rotterdam: A.A.Balkema. pp. 731-733 Mihoc, G . and Firescu, D. (1966). Statistica matematica. Bucuresti: Editura didactica si pedagogica. pp.360 Montgomery, D. R. and Dietrich, W . E . (1994). A physically based model for the topographic control on shallow landsliding. Water Resources Research 30: 1153-1171. Morgenstern, N . R. and Price, V . E . (1965). The analysis of the stability of general slip circles. Geotechnique 15: 79-93. Munteanu, C., Neagu, I., Cristescu, C., Predescu, G., Ceuca, G., Patrascoiu, G., Moise, I., Nicoara, I., Smeykal, G., Enasescu, S., and Draghiciu, I. (1980). Indrumar pentru amenajarea padurilor. Bucuresti: ICAS.pp.429 Nahorski, Z . (1992). Regression analysis: a perspective. In J.Kacprzyk (ed.), Fuzzy regression analysis. Warsaw: Omnitech Press, pp. 3-13 Nather, W. and Albrecht, M . (1990). Linear Regression with Random Fuzzy Observations. Statistics 21[4]: 521-531. Naturale Resources Canada. Magnetic declination. 2002. http://www,geolab,nrcan,gc.ca/geomag/e cgrf.html#MIRP (18 Aug. 2002) Nelkon, M . (1974). Fundamental of Physics. (2 ed.) Norwich: Fletcher and Son . pp.782 Neter, J., Kutner, M . , Nachtsheim, C J . and Wassermann, W . (1996). Applied linear statistical models. (4 ed.) BunRidge: WCB/McGraw-Hill. pp.1408 Newton, I., Cohen L B . and Whitman, A . (2002). 
The Principia: mathematical principles of natural philosophy. Los Angeles: University of California Press, pp.1025 Nguyen, H . T. and Walker, E. A . (2000). A first course in fuzzy logic. (2 ed.) Boca Raton: Chapman & Hall/CRC. pp.300 Pao, R. H . F. (1961). Fluid mechanics. New York: John Wiley and Sons, pp.502 Petley, D. J. (1984). Ground investigation, sampling and testing for studies of slope instability. In D.Brunsden and D. B. Prior (eds.), Slope instability. Whilshire: John Wiley and Sons. pp. 67-101 Powrie, M . (1997). Soil Mechanics. London: E & F N Spon. pp.420 153 Resources Inventory Committee (1997). Terrain Stability Mapping in British Columbia. Victoria: Resources Inventory Committee, http://srmwww.gov.bc.ca/risc/pubs/earthsci/terrain2/index.htm (18 Aug. 2002) Robinson, E . G., Mills, K., Paul, J., Dent, L . , and Skaugset, A . (1999). Storm impact and landslides of 1996. Oregon Department of Forestry, pp.145 Rollerson, T.P., Millard, T., Jones, C , Trainor, K. and Thompson, .B. (2001) Predicting post-logging landslide activity using terrain attributes: Coast Mountains, British Columbia. Technical report TR-011. Vancouver: Vancouver Forest Region, pp.21. Romer, C . and Kandel, A . (1995). Statistical tests for fuzzy data. Fuzzy Sets and Systems 72: 1-26. Rousseeuw, P. J. and Leroy, A . M . (1987). Robust regression and outliers detection. New York: John Wiley and Sons, pp.329. Rutemiller, H . C . and Bowers, D. A . (1968). Estimation in heteroscedastic regression model. Journal of the American Statistical Association 63: 552-557. Rutter, A . J., Kershwa, K. A . , Robins, P. C , and Morton, A . J. (1971). A predictive model of rainfall interception in forests. I. Derivation of the model from observations in a plantation of Corsican pine. Agricultural meteorology 9: 367-384. Rutter, A . J., Morton, A . J., and Robins, P. C . (1975). A predictive model of rainfall interception in forests. II. 
Generalization of the model and comparison with observations in some coniferous and hardwood stands. Journal of Applied Ecology 12: 367-380. Savic, D. A . and Pedrycz, W. (1991). Evaluation of fuzzy linear regression models. Fuzzy Sets and Systems 39: 5163. Selby, M . J. (1993). Hillslope materials and processes. (2 ed.) Oxford: Oxford University Press, pp.451 Sidle, R. C , Pearce, A . J., and OLoughlin, C . L . (1985). Hillslope stability and land use. Washington, D.C.: American Geophysical Union, pp.140 Sidle, R. C . (1992) A theoretical model of the effects of timber harvesting on slope stability. Water Resources Research 28: 1897-1910. 154 Sidle, R. C. and Wu, W . (1997). Application of a Distributed Shallow Landslide Analysis Model (dSLAM) to Managed Forested Catchments in Coastal Oregon. In Human Impact on Erosion and Sedimentation; Proceedings of 5th Scientific Assembly of IAHS. Wallingford: IAHS Publ. no. 245. pp. 213-221 Sidle, R . C . (2000). Watershed Challenges for the 21 Century: A Global Perspective for Mountainous Terrain. st Missoula: U S D A Forest Service R M R S - P - 13, pp.45-56 Simmons, D . M . (1975) Nonlinear programming for operations research. Englewood Cliffs, Pretince-Hall, pp.448 Snee, R. D. (1977). Validation of regression models: methods and examples. Technometrics 19: 415-428. Soeters, R. and van Westen C.J. (1996). Slope instability Recognition, Analysis and Zonation. In A.K.Turner and R. L . Schuster (eds.), Landslide investigation and mitigation. Washington D C : National Academy Press, pp. 129-173 Stanescu, V . , Sofletea, N . , Popescu, O. (1997). Flora forestiera lemnoasa a Romaniei. Bucuresti: Editura Ceres. pp.367 Takahashi, T. (1981). Debris flow. Annual Reviews of Fluid Mechanics 13: 57-77. Takahashi, T. (1991). Debris flow. Rotterdam: A A Balkema. pp.165 Tanaka, H . , Uejima, S., and Asai, K. (1982). Linear regression analysis with fuzzy model. IEEE Transaction on System, Man and Cybernetics 12: 903-907. Tanaka, H . 
, Hasayashi, I., and Watada, J. (1989). Possibilistic linear regression analysis for fuzzy data. European Jurnal of Operational Research 40: 389-396. Tanaka, H . and Ishibuchi, H . (1992). Possibilistic regression analysis based on linear programming. In J.Kacprzyk (ed.), Fuzzy regression analysis. Warsaw: Omnitech Press, pp. 47-60 Terzaghi, K. (1943). Theoretical soil mechanics. New York: John Wiley and Sons, pp.510 Theil, H . and Nagar, A . L . (1961) Testing the Independence of Regression Disturbance. Journal of the American Statistical Association 56(296):793-806 Thornthwaite, C. W . (1931). The climates of north America according to a new classification. Geography review 21: 633-655. Traci, C. (1985). Impadurirea terenurilor degradate. Bucuresti: Editura Ceres.pp.187 155 Tucker, H . G . (1962). A n introduction to probability and mathematical statistics. New York: Academic Press. pp.248 Turner, A . K. and Jayaprakash, G . P. (1996). Introduction. In A.K.Turner and R. L . Schuster (eds.), Landslides investigation and mitigation (2 ed.). Washington D C : National Academy Press, pp. 3-11 Turner, A . K. and McGuffey, V . C. (1996). Organization of Investigation Process. In A.K.Turner and R. L . Schuster (eds.), Landslide investigation and mitigation (2 ed.). Washington D C : National Academy Press, pp. 121128 Varnes, D. J. (1978). Slope Movement Types and processes. In R.L.Schuster and R. J. Krizek (eds.), Landslides: Analysis and Control (1 ed.). Washington D C : National Research Council, pp. 11-33 Viessman, W . and Lewis, G . L . (1995). Introduction in Hydrology. (4 ed.) New Jersey: Prentice Hall, pp.760 Voight, B. and Pariseau, W. G . (1978). Rockslide and avalanches: an introduction. In B.Voight (ed.), Rockslide and avalanches. Amsterdam: Elsevier Scientific Publishing Company, pp. 1-71 Wang, S. Q. and Unvin, D. J. (1992). Modelling landslide distribution on loess soils in China: an investigation. 
International Journal of Geographical Information Systems 6: 391-405. Wang, X . and Ha, M . (1992). Fuzzy linear regression analysis. Fuzzy Sets and Systems 51: 179-188. Wang, Z . Y . and L i , S. M . (1990). Fuzzy linear regression analysis of fuzzy valued variables. Fuzzy Sets and Systems 36: 125-136. Watanabe, N . and Imaizumi, T. (1993). A fuzzy statistical test of fuzzy hypotheses. Fuzzy Sets and Systems 53: 167178. Watson, A . , Marden, M . , and Rowan, D. R. (1994). Tree species performance and slope stability. In D.H.Barker (ed.), Vegetation and slope. Trowbridge: Thomas Telford Services, pp. 161-172 White, H . (1980). A heteroskedasticty-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817-838. Wieczorek, G . F. (1996). Landslide triggering Mechanism. In A.K.Tumer and R. L . Schuster (eds.), Landslide investigation and mitigation (2 ed.). Washington D C : National Academy Press, pp. 76-87 Winston, W . L . (1994). Operations research. (3 ed.) Belmont: Duxbury Press, pp.1372. 156 Wise, M . P. (1997). Probabilistic modeling of debris flow travel distance using empirical volumetric relationships. MSc University of British Columbia. Wu, T. H . , Riestenberg, M . M , and Flege, A . (1994). Root propoerties for design of slope stabilization. In D.H.Barker (ed.), Vegetation and slopes. Trowbridge: Thomas Telford Services, pp. 52-60 Wu, W. and Sidle, R. C. (1995). A distributed slope stability model for steep forested basins. Water Resources Research 31: 2097-2110. Xiao, Q., McPherson, E . G . , Ustin, S. L . , Grismer, M . E., and Simpson, J. R. (2000). Winter rainfall interception by two mature open-grown trees in Davis, California. Hydrological processes 14: 763-784. Zadeh, L . (1965). Fuzzy sets. Information and Control 8: 338-353. Zinke, P. J. (1967). Forest interception study in the United States. In W.E.Sopper and H. W. Lull (eds.), Forest Hydrology. Oxford: Pergamon Press pp. 
137-161 157 ror ro ro I CO O) ro CO ro CD CO M J CD TCD Cl CD Cl CO CD ro ui vi lie. CO CD CD ro To TO L CD CD P o o ro ro cr cT CD 03 O cu TCD CD Cl Q:I Ul co 3 o CD 3 o Cl CD O CO Ul vl oo H I o ro I CO 5:i rrol I I CO CO I CD Q:I 3? I CO Ul _ i i CO CD Si co co Ul ro CD r2 r2 co ro ro ICO. roi of ro or ro co CO Ul to OLl ^n ro TOT ro i oo cn co ro -SU o_ o" o' P ro co 03 vl CO co co ro ro Vl co co ..vL oo 00 vL 00 3 CL !< ro o icnj CD CL cr CO CO co oo cn col ro cn 03 Uol CO co CO CD |00| CO 00 CO CO co ro co co co co co loo. cn 3_ CO v L ro ro co 00 CO ro cn co o rt o CO CO GO o ro 00_ oo co jro oo ui ro r§ o bo ro cn cn o DP P 3 o ST crq n < CD ^1 N CD o 3 rt O a; 3" ro oo ro oo co ro coi ro P Q Q f = rt o iro o IP co 00 to o ml vj_ o bo a o Uo cn j o l CD o 01 o cn o CO mi iro o CO bo o jo JO ro oo ro irol re ro | S3 ro o o cn o v L o ml o bo o> mi o bo o O t/5 I M b i joj QI CD CO 00 CD cn CD 03 —i o c CD CO CD cn CD cn CD CO CD CO CD cn CD CD 5' ro CD CD 4*. CO N Q. 46.1 ro e/d O CO CD 30.2 Oi CD 76.3 75.8 773.9 ro o CD 32.1 Oi Ul 4>- 154.4 CD 66.1 130.5 ro Ul oo ro 30.9 W CD 126.1 e/d 530.4 15.5 Ul 13.9 e/d Oi CD e/d ro e/d 68.8 94.4 CD 77.3 180.3 Ul ro CD e/d CO e/d L 34.2 ro e/d CD 39.5 4^ C O e/d Ul e/d CJ) CD 228.1 CD 88.5 CD 272.9 81.9 93.1 67.5 1341.7 4*. vj oo e/d CO CD e/d CD 115.2 o CD 24.8 CL 67.5 ro Ul a. Reach Type Length CT 52-26 o CO Ul CO 52-26 Oi Ul CO 1 52-26 i 52-30 o CO 52-30 o Ui 52-30 o i 52-30 i 52-30 O Oi Oi Oi 52-30 i 52-30 o o o o o CD 52-30 1 52-32 • 52-32 • 52-32 CJ) CJ) 52-32 i 52-32 o —k CJ) • 53-5 o i 53-5 o i 53-5 O) Oi Oi 53-6 1 53-6 CT> S> ci CO Ul So bi CO o co O CO ro ro bo CO Q Ul Ul 4^ 4* Ul ro CO CO 4^ Ul bi Ul IO Ul o 4^ Oi C D to bo bi o Ul CO o Ul o CO CO ro CT4^ ro ro cn ro Ul io CO IO Oi U l bo o Ul CT4> ro ro CO CO ro ro co CO N C O CO 4> 00 o CO Ul CO oo Ul 4>. CT) 4> co bo Ul 4* o bo 4* CO 4*. 4^ 4> r 1 4^ k CO 4^ 00 bo CO 4^ o CD o • . 
[Table of reach-level field measurements by event (recoverable column headings: Reach, Type, Length, Vegetation, Slope, Azimuth, Width bottom, Depth); the values did not survive extraction.]
Table A.1.2. Geology and granulometry for the event soil sample

id       Where          Geology  Path   Slide    Slide  Slide  Slide
         (start point)                  Weight   Type   Grad   Fine
21-101   m              G        18     2.600    s      u      3
21-102   b              G        1      2.649    gs     g      6
31-3     t              G        4      2.458    sg     w      10
31-4     t              G        2      2.609    gs     u      2
51-3     b              N        1      2.567    gs     g      4
52-10    b              N        1      2.638    gs     g      1
52-12    m              N        1      2.617    gs     u      5
52-13    m              N        5      2.593    s      g      13
52-16    m              N        20     2.707    sg     g      8
52-26    b              N        2      2.604    sg     g      2
52-30    t              N        106    2.604    gs     g      13
52-32    m              N        8      2.411    gs     u      11
53-5     m              N        4      2.609    sg     w      4
53-6     m              N        4      2.648    gs     u      1
61-10    m              G        2344   2.603    gs     g      7
61-15    b              N        2      2.617    sg     w      9
61-18    b              G        2      2.596    gs     u      13
61-19    m              G        44     2.756    sg     g      2
62-10    m              N        68     2.684    gs     g      6
62-14    b              N        3      2.605    sg     g      3
62-23    m              N        1      2.643    gs     g      3
62-23b   m              N        1      2.694    gs     u      2
64-20    b              G        2      2.606    gs     g      2
73-12    m              G        12     2.601    gs     u      6
73-17    b              G        1      2.627    gs     g      3
73-18    b              G        1      2.577    gs     g      3
73-18b   b              G        1      2.552    gs     u      2
73-28    b              G        2      2.621    gs     u      5
74-20    m              G        5      2.651    sg     u      2
83-7     m              G        2      2.763    sg     g      11
94-23    b              Fs       2      2.518    gs     u      6
94-24    b              Fs       3      2.608    gs     u      2
94-26d   b              Fs       2      2.639    gs     u      9
94-30    b              Fs       1      2.622    gs     u      2
94-34    b              Fs       2      2.643    s      u      7
94-34b   b              Fs       1      2.619    gs     u      5
94-35b   m              Fs       1      2.615    gs     u      10
94-36    b              Fs       3      2.724    gs     u      9

b (m, t) represents the event starting point on the slope: bottom (middle, top); Geology: G-granite; N-gneiss and Fs-fine sedimentary; PSD type for event soil sample: g-gravel; gs-gravel sandy; sg-sandy-gravel; s-sand; PSD grading for event soil sample: g-gap graded; u-uniform graded and w-well graded; Slide fine - percentage of silt and clay in the event soil sample.

Table A.1.3. Human activity and granulometry for the control soil sample

id       Contr    Contr  Contr  Contr  Clearcut
         Weight   Type   Grad   Fine
21-101   2.6      s      u      3      0
21-102   2.677    s      u      8      0
31-3     2.577    sg     w      11     0
31-4     2.527    g      u      2      0
51-3     2.577    g      u      4      0
52-10    2.67     s      u      8      0
52-12    2.682    g      u      7      1
52-13    2.656    sg     g      7      0
52-16    2.707    sg     g      8      0
52-26    2.604    sg     g      2      0
52-30    2.636    sg     g      9      0
52-32    2.629    gs     g      6      0
53-5     2.632    sg     g      9      0
53-6     2.419    g      u      4      0
61-10    2.603    gs     g      7      0
61-15    2.646    sg     g      5      0
61-18    2.61     gs     u      3      0
61-19    2.757    sg     g      4      0
62-10    2.652    sg     g      8      0
62-14    2.753    sg     g      7      0
62-23    2.664    sg     g      3      1
62-23b   2.679    gs     g      2      1
64-20    2.573    gs     g      7      0
73-12    2.528    g      u      5      0
73-17    2.629    gs     u      3      0
73-18    2.364    sg     w      22     0
73-18b   2.552    gs     u      2      0
73-28    2.671    s      g      8      0
74-20    2.725    s      g      8      0
83-7     2.565    gs     u      4      1
94-23    2.518    gs     u      5      0
94-24    2.608    gs     u      2      0
94-26d   2.639    gs     u      9      0
94-30    2.673    gs     u      5      0
94-34    2.643    sg     u      9      0
94-34b   2.619    gs     u      7      0
94-35b   2.615    g      u      14     0
94-36    2.781    g      u      2      0

PSD type for control soil sample: g-gravel; gs-gravel sandy; sg-sandy-gravel; s-sand; PSD grading for control soil sample: g-gap graded; u-uniform graded and w-well graded; Contr fine - percentage of silt and clay in the control soil sample; Clearcut - presence or absence of the clearcut at the first reach.

Table A.1.4. Elements characterizing the debris flows

Columns:
(1) Minimum slope difference
(2) Maximum slope difference
(3) Minimum azimuth difference, corresponding minimum slope difference
(4) % length adjacent reach
(5) Minimum length fan/first reach
(6) Minimum length (intermediate reaches)
(7) Slope to stop
(8) Maximum angle of the lateral sides of trapezoid

Event    (1)    (2)   (3)   (4)   (5)     (6)     (7)   (8)
21-101   2      7     3     25    79.5    28.6    0     7
21-102   0      0     0     100   0       75.8    0     9
31-3     6      6     25    5     17.3    13.4    0     7
31-4     4      4     20    53    55.4    28.5    0     5
51-3     0      0     0     100   0       143     32    1
52-10    0      0     0     100   0       55.5    0     2
52-12    0      0     0     100   0       25.8    33    0
52-13    5      5     44    42    41.4    50.4    0     1
52-16    6      8     8     36    34.9    52.5    0     5
52-26    14     14    20    65    30.2    46.1    0     9
52-30    3      26    14    19    30.9    32.1    12    5
52-32    2      7     41    26    66.1    75.8    0     5
53-5     2      4     22    22    65      15.5    18    5
53-6     6      10    15    44    77.3    34.2    0     4
61-10    2      17    18    21    24.8    39.5    2     8
61-15    0      0     0     20    0       10      0     13
61-18    0      0     0     28    0       8.5     0     8
61-19    7.5    13    6     25    133     43.7    14    9
62-10    2      4     6     47    88.8    25      0     2
62-14    8      8     32    50    0       26.4    0     19
62-23    0      0     0     100   0       93.2    18    2
62-23b   0      0     0     100   0       58.6    0     1
64-20    0      0     0     100   0       11.6    12    10
73-12    5      22    52    39    14.4    18.3    5     9
73-17    0      0     0     100   0       109.6   0     1
73-18    0      0     0     100   0       31.5    44    9
73-18b   0      0     0     100   0       35.8    0     6
73-28    0      0     0     100   0       11.3    0     12
74-20    7      7     8     61    41.6    41.9    20    3
83-7     9      9     33    16    0       17.8    23    20
94-23    0      0     0     100   0       115     0     2
94-24    0      0     0     100   0       98.7    0     1
94-26d   1      1     8     35    0       34.9    0     4
94-30    9      9     3     48    0       33.6    0     18
94-34    0      0     0     100   0       15      0     2
94-34b   0      0     0     100   0       120     0     2
94-35b   3      3     38    68    0       25.4    0     7
94-36    15     15    9     81    0       52.2    0     1
[Per-event reach tables (recoverable column headings: Reach, Slope, Slope length, Elevation, Horizontal length, Horizontal length cumulated, Elevation cumulated, Width top, Width bottom, Azimuth, Depth 1/4, Depth 1/2, Depth 3/4, with Mean and Standard deviation rows); the values did not survive extraction.]
[Figures: one diagram per event, annotated with reach lengths. Events 21-101, 21-102, 31-3, 31-4, 51-3, 52-10 (Reach 1, L = 55.5 m), 52-13 (Reach 3, L = 97.9 m), 52-16, 52-26, 52-30, 52-32, 53-6, 61-10, 61-15, 61-18, 61-19 (Reach 4, L = 200.7 m), 62-10, 62-14, 62-23 (Reach 1, L = 93.2 m), 62-23b (Reach 1, L = 58.6 m), 64-20, 73-12, 73-17 (Reach 1, L = 109.6 m), 73-18, 73-18b, 73-28 (Reach 1, L = 96.7 m; Reach 2, L = 11.3 m), 74-20, 83-7.]

Appendix 3 Basic concepts in fuzzy sets theory

1. Elements of set theory

In traditional set theory, termed crisp set theory and based on two-valued logic, an element either belongs to a set or does not. Suppose X is the universe of discourse, i.e. the set of all possible elements with respect to a property. A crisp set A is represented by the membership function $\mu_A: X \to \{0,1\}$, defined as follows: if an element $x \in X$ belongs to A, then $\mu_A(x) = 1$; if not, then $\mu_A(x) = 0$. Therefore a crisp set can be expressed as a set of ordered pairs

\[ A = \{(x, \mu_A(x)) \mid x \in X\}, \]

where $\mu_A: X \to \{0,1\}$ is the membership function of element x in set A.

Example:
True sentence: 2 is an even number, so $\mu_A(2) = 1$, where A is the set of even numbers.
False sentence: the bald eagle is a mammal, so $\mu_A(\text{bald eagle}) = 0$, where A is the set of mammals.
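The crisp membership function described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`crisp_membership`, `as_ordered_pairs`) and the finite universe {1, ..., 6} are assumptions made for the example, not part of the thesis.

```python
def crisp_membership(predicate):
    """Build mu_A: X -> {0, 1} for the crisp set A = {x in X : predicate(x)}."""
    def mu(x):
        return 1 if predicate(x) else 0
    return mu


def as_ordered_pairs(universe, mu):
    """Express the set as ordered pairs (x, mu_A(x)) over a finite universe."""
    return [(x, mu(x)) for x in universe]


# Hypothetical example: A is the set of even numbers, X = {1, ..., 6}.
mu_even = crisp_membership(lambda x: x % 2 == 0)
pairs = as_ordered_pairs(range(1, 7), mu_even)
```

Here `mu_even(2)` evaluates to 1, matching the "true sentence" example, and every element of the universe receives a grade of either 0 or 1, never anything in between.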
A fuzzy set is represented in the same manner as a crisp set, the only difference being the codomain of the membership function, which changes from the discrete set {0,1} to the continuous interval [0,1]. As with crisp sets, a fuzzy set is expressed as a set of ordered pairs

\[ A = \{(x, \mu_A(x)) \mid x \in X\}, \]

where $\mu_A: X \to [0,1]$ is the membership function of element x in set A.

Example: Temperature is around 22°C. If X = [0, 50], then

\[ \mu_A(x) = \begin{cases} 0.5x - 10 & \text{if } x \in [20, 22) \\ 1 & \text{if } x = 22 \\ \sin(45x - 900) & \text{if } x \in (22, 24] \end{cases} \]

with the argument of the sine in degrees, so that the function equals 1 at x = 22 and 0 at x = 24. This means that the range of temperatures is from 0°C to 50°C. When this range changes, the membership function can also change. If X = [-100, 500], then

\[ \mu_A(x) = \begin{cases} 0.1x - 1.2 & \text{if } x \in [12, 22) \\ 1 & \text{if } x = 22 \\ \sin(4.5x - 9) & \text{if } x \in (22, 42] \end{cases} \]

The above examples show that the membership function depends on X, the universe of discourse.

2. Basic concepts on fuzzy sets

2.1. α-cuts, support of a fuzzy set and normal fuzzy set

A very useful tool in dealing with fuzzy sets is the concept of α-cuts. For any fuzzy set A, defined on the universe of discourse X, and any real number $\alpha \in [0,1]$, the α-cut is defined as the crisp set

\[ {}^{\alpha}A = \{x \mid \mu_A(x) \ge \alpha\}. \]

The usual notation for an α-cut is ${}^{\alpha}A$. An important property of α-cuts is that the order of the α values is inversely preserved by the inclusion of the corresponding sets. In formal mathematical language: for any $\alpha_1 < \alpha_2$, with $\alpha_1, \alpha_2 \in [0,1]$,

\[ {}^{\alpha_1}A \supseteq {}^{\alpha_2}A. \]

This property is equivalent to the following equalities:

\[ {}^{\alpha_1}A \cup {}^{\alpha_2}A = {}^{\alpha_1}A, \qquad {}^{\alpha_1}A \cap {}^{\alpha_2}A = {}^{\alpha_2}A. \]

The support of a fuzzy set A, which has the universe of discourse X, is the crisp set containing all the elements of X whose membership grade is strictly larger than 0. The height of a fuzzy set A is the largest membership value attained by the elements of A. A normal fuzzy set is a fuzzy set whose height is equal to 1. A fuzzy set that is not normal can be normalized by dividing its membership function by its height.

2.2. Convexity

The concept of a convex fuzzy set is a natural generalization of the classical concept of a convex crisp set. A fuzzy set on the real line ($\mathbb{R}$) is convex if all its α-cuts are convex crisp sets (convex in the classical sense). A result very useful in possibility theory is the following theorem:

Theorem 1. A fuzzy set A on $\mathbb{R}$ is convex if and only if, for any $x_1, x_2 \in \mathbb{R}$ and all $\lambda \in [0,1]$,

\[ \mu_A(\lambda x_1 + (1-\lambda) x_2) \ge \min[\mu_A(x_1), \mu_A(x_2)]. \]

3. Basic operation on fuzzy sets

The basic operations on fuzzy sets are similar to the ones defined on crisp sets: union, intersection, inclusion and complement. In conventional set theory these operations are defined in the following manner, using the characteristic function.

A crisp set A is empty, $A = \emptyset$, if and only if (iff) $\mu_A(x) = 0$ for any $x \in X$.

Suppose A and B are two crisp sets with characteristic functions $\mu_A$ and $\mu_B$. Two crisp sets are equal, A = B, iff A and B are subsets of X and $\mu_A(x) = \mu_B(x)$ for any $x \in X$.

A crisp set A is included in a crisp set B, $A \subseteq B$, iff A and B are subsets of X and $\mu_A(x) \le \mu_B(x)$ for any $x \in X$.

\[ A \cup B = \{x \in X \mid x \in A \text{ or } x \in B\} = \{x \in X \mid \mu_A(x) = 1 \text{ or } \mu_B(x) = 1\} = \{x \in X \mid \mu_A(x) \ne 0 \text{ or } \mu_B(x) \ne 0\} \]

\[ A \cap B = \{x \in X \mid x \in A \text{ and } x \in B\} = \{x \in X \mid \mu_A(x) = 1 \text{ and } \mu_B(x) = 1\} = \{x \in X \mid \mu_A(x) \ne 0 \text{ and } \mu_B(x) \ne 0\} \]

\[ \bar{A} = \{x \in X \mid x \notin A\} = \{x \in X \mid \mu_A(x) = 0\}, \quad \text{with } \mu_{\bar{A}}(x) = 1 - \mu_A(x). \]

The same operations are translated to fuzzy sets.

A fuzzy set A is empty, $A = \emptyset$, if and only if (iff) $\mu_A(x) = 0$ for any $x \in X$.

Suppose A and B are two fuzzy sets. Two fuzzy sets are equal, A = B, iff A and B are subsets of X and $\mu_A(x) = \mu_B(x)$ for any $x \in X$.

A fuzzy set A is included in a fuzzy set B, $A \subseteq B$, iff A and B are subsets of X and $\mu_A(x) \le \mu_B(x)$ for any $x \in X$.

\[ A \cup B = \{x \in X \mid \mu_A(x) \ne 0 \text{ or } \mu_B(x) \ne 0\}, \quad \text{with } \mu_{A \cup B}(x) = \max(\mu_A(x), \mu_B(x)), \text{ therefore } \mu_{A \cup B} = \max(\mu_A, \mu_B) \]

\[ A \cap B = \{x \in X \mid \mu_A(x) \ne 0 \text{ and } \mu_B(x) \ne 0\}, \quad \text{with } \mu_{A \cap B}(x) = \min(\mu_A(x), \mu_B(x)), \text{ therefore } \mu_{A \cap B} = \min(\mu_A, \mu_B) \]

\[ \bar{A}: \quad \mu_{\bar{A}}(x) = 1 - \mu_A(x), \text{ therefore } \mu_{\bar{A}} = 1 - \mu_A \]

These operations are called the standard fuzzy operations. To generalize these concepts, a more general definition is used.

3.1. Fuzzy sets intersections

The fuzzy intersection of two sets A and B is defined as a binary operation on [0,1]×[0,1]. Suppose that i is a function $i: [0,1] \times [0,1] \to [0,1]$. By definition

\[ \mu_{A \cap B}(x) = i[\mu_A(x), \mu_B(x)]. \]

For example, the function i can be defined as

\[ i(x, y) = \min(x, y) = \frac{x + y - |x - y|}{2}, \quad x, y \in [0,1], \]

which leads to the standard fuzzy intersection. To qualify as a fuzzy intersection, the binary operation i should satisfy the following requirements (Klir and Yuan, 1995):

Axiom i1. Boundary condition: $i(a, 1) = a$ for any $a \in [0,1]$.
Axiom i2. Monotonicity: for any $a, b, d \in [0,1]$ with $b \le d$, $i(a, b) \le i(a, d)$.
Axiom i3. Commutativity: for any $a, b \in [0,1]$, $i(a, b) = i(b, a)$.
Axiom i4. Associativity: for any $a, b, d \in [0,1]$, $i(a, i(b, d)) = i(i(a, b), d)$.

The set of axioms stated above defines a class of functions known as t-norms. An additional axiom is usually stated:

Axiom i5. Function i is continuous.

The standard fuzzy intersection is a special case of t-norm, which has the property $i(a, a) = a$ for any $a \in [0,1]$ (idempotency). Examples of fuzzy intersection functions, each defined for any $a, b \in [0,1]$, are:

Yager class: $i_w(a, b) = 1 - \min\{1, [(1-a)^w + (1-b)^w]^{1/w}\}$, where $w \in (0, \infty)$;
Algebraic product: $i(a, b) = ab$;
Bold intersection (or bounded difference): $i(a, b) = \max(0, a + b - 1)$, which is the Yager intersection with w = 1: $i_1(a, b) = 1 - \min(1, 2 - a - b)$;
Drastic intersection:

\[ i(a, b) = \begin{cases} a & \text{when } b = 1 \\ b & \text{when } a = 1 \\ 0 & \text{otherwise.} \end{cases} \]

3.2. Fuzzy sets union

The fuzzy union is defined in the same manner as the fuzzy intersection. The fuzzy union of two sets A and B is defined as a binary operation on [0,1]×[0,1]: $u: [0,1] \times [0,1] \to [0,1]$. By definition

\[ \mu_{A \cup B}(x) = u[\mu_A(x), \mu_B(x)]. \]
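Looking back at the intersection functions of Section 3.1, the listed t-norms can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration; the formulas are the ones given in Section 3.1.

```python
import math


def t_min(a, b):
    """Standard fuzzy intersection."""
    return min(a, b)


def t_product(a, b):
    """Algebraic product."""
    return a * b


def t_bounded(a, b):
    """Bold intersection (bounded difference)."""
    return max(0.0, a + b - 1.0)


def t_drastic(a, b):
    """Drastic intersection."""
    if b == 1:
        return a
    if a == 1:
        return b
    return 0.0


def t_yager(a, b, w):
    """Yager class, w in (0, inf)."""
    return 1.0 - min(1.0, ((1.0 - a) ** w + (1.0 - b) ** w) ** (1.0 / w))


def satisfies_boundary(i, grades):
    """Check Axiom i1, i(a, 1) = a, on a finite sample of grades."""
    return all(math.isclose(i(a, 1.0), a, abs_tol=1e-12) for a in grades)
```

For a = 0.7 and b = 0.6 the Yager function with w = 1 agrees with the bold intersection up to rounding, and the four fixed t-norms are ordered drastic ≤ bold ≤ product ≤ standard, a well-known property of these functions. The boundary check only samples a few grades, so it illustrates Axiom i1 rather than proving it.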
For example, the function u can be defined as

\[ u(x, y) = \max(x, y) = \frac{x + y + |x - y|}{2}, \quad x, y \in [0,1], \]

which leads to the standard fuzzy union. The function defining a fuzzy union has to satisfy the following axioms (Klir and Yuan, 1995):

Axiom u1. Boundary condition: $u(a, 0) = a$ for any $a \in [0,1]$.
Axiom u2. Monotonicity: for any $a, b, d \in [0,1]$ with $b \le d$, $u(a, b) \le u(a, d)$.
Axiom u3. Commutativity: for any $a, b \in [0,1]$, $u(a, b) = u(b, a)$.
Axiom u4. Associativity: for any $a, b, d \in [0,1]$, $u(a, u(b, d)) = u(u(a, b), d)$.

These properties define a class of functions named t-conorms. An additional axiom is usually stated:

Axiom u5. Function u is continuous.

The standard fuzzy union is a special case of t-conorm which has the property $u(a, a) = a$ (idempotency). Examples of union functions, each defined for any $a, b \in [0,1]$, are:

Yager class: $u_w(a, b) = \min[1, (a^w + b^w)^{1/w}]$, where $w \in (0, \infty)$. The standard fuzzy union is obtained from the Yager class in the limit $w \to \infty$;
Probabilistic sum (sometimes called algebraic sum): $u(a, b) = a + b - ab$;
Bold union (or bounded sum): $u(a, b) = \min(1, a + b)$, which is the Yager union for w = 1;
Drastic union:

\[ u(a, b) = \begin{cases} a & \text{when } b = 0 \\ b & \text{when } a = 0 \\ 1 & \text{otherwise.} \end{cases} \]

3.3. Fuzzy set complement

Let $c: [0,1] \to [0,1]$ be a function which assigns a value $c(\mu_A(x))$ to each membership grade $\mu_A(x)$:

\[ \mu_{\bar{A}}(x) = c(\mu_A(x)). \]

To qualify as a complement, the function c should satisfy the following requirements (Klir and Yuan, 1995):

Axiom c1. Boundary condition: $c(0) = 1$ and $c(1) = 0$.
Axiom c2. Monotonicity: for any $a, b \in [0,1]$, if $a \le b$ then $c(a) \ge c(b)$.

If c is involutive, meaning that $c(c(a)) = a$ for any $a \in [0,1]$, then c is also bijective. Examples of involutive fuzzy complements are:

Yager class: $c_w(a) = (1 - a^w)^{1/w}$, where $w \in (0, \infty)$;
Sugeno class: $c_\lambda(a) = \dfrac{1 - a}{1 + \lambda a}$, where $\lambda \in (-1, \infty)$.

The standard complement is a particular case of the Yager fuzzy complement (w = 1) and of the Sugeno fuzzy complement (λ = 0).

3.4. Disjunctive sum of fuzzy sets

The disjunctive sum is the name for the logical operation known as "exclusive OR". The definition of the fuzzy disjunctive sum is based on its crisp definition

\[ A \oplus B = (A \cap \bar{B}) \cup (\bar{A} \cap B). \]

The disjunctive sum is represented in Figure 1.

Figure 1. Disjunctive sum for crisp sets

For fuzzy sets this definition becomes

\[ \mu_{A \oplus B}(x) = \max\{\min[\mu_A(x), 1 - \mu_B(x)], \min[1 - \mu_A(x), \mu_B(x)]\}. \]

The basic idea of the exclusive OR operation is to eliminate the common area from the union of A and B. A new operator can be defined with this idea in mind:

\[ \mu_{A \triangle B}(x) = |\mu_A(x) - \mu_B(x)|. \]

This operator leads to a different result than the disjunctive sum defined previously and therefore has a different name: disjoint sum.

3.5. Difference of two fuzzy sets

For crisp sets the difference is defined as

\[ A - B = A \cap \bar{B}. \]

The idea of the difference between two sets is to eliminate from the first set the elements that it has in common with the second set. For fuzzy sets there are two ways of defining the difference:

Simple difference, based on the crisp set definition: $\mu_{A-B}(x) = \min(\mu_A(x), 1 - \mu_B(x))$, if the standard fuzzy intersection is used;
Bounded difference, based on the underlying idea of difference: $\mu_{A \ominus B}(x) = \max(0, \mu_A(x) - \mu_B(x))$.

3.6. Cartesian product of fuzzy sets

The Cartesian product of fuzzy sets is based on the Cartesian product of crisp sets. If $\mu_{A_1}(x_1), \mu_{A_2}(x_2), \ldots, \mu_{A_n}(x_n)$ are the membership functions of the fuzzy sets $A_1, A_2, \ldots, A_n$, and $x_i \in A_i$ for $i \in \{1, \ldots, n\}$, then

\[ \mu_{A_1 \times A_2 \times \cdots \times A_n}(x_1, x_2, \ldots, x_n) = \min(\mu_{A_1}(x_1), \mu_{A_2}(x_2), \ldots, \mu_{A_n}(x_n)). \]

3.7. Distance in fuzzy set

A distance is defined as a function that has the following properties:

$d(A, B) \ge 0$ for any $A, B \in X \subset \mathbb{R}$
$d(A, B) = d(B, A)$ for any $A, B \in X \subset \mathbb{R}$
$d(A, C) \le d(A, B) + d(B, C)$ for any $A, B, C \in X \subset \mathbb{R}$
$d(A, A) = 0$ for any $A \in X \subset \mathbb{R}$

A large number of functions fulfill the above requirements, but usually three types of distance are used: Euclidean, Hamming and Minkowski distance.
• Euclidean distance is defined as

\[ e(A, B) = \sqrt{\sum_{i=1}^{n} (\mu_A(x_i) - \mu_B(x_i))^2}, \]

where A, B are fuzzy sets on $\mathbb{R}$ (A, B ∈ X). An additional distance based on the Euclidean distance is defined when card(X) = n is finite. It is called the relative Euclidean distance and has the equation

\[ \varepsilon(A, B) = \frac{e(A, B)}{\sqrt{n}}, \quad \text{where card}(X) = n. \]

• Hamming distance, also named the distance between class sets, is defined as

\[ d(A, B) = \sum_{i=1,\, x_i \in X}^{n} |\mu_A(x_i) - \mu_B(x_i)|. \]

This distance shows how different two sets are with regard to their components. As for the Euclidean distance, a relative Hamming distance is defined:

\[ \delta(A, B) = \frac{d(A, B)}{n}, \quad \text{where card}(X) = n. \]

• Minkowski distance is a generalization of the Hamming and Euclidean distances. It is defined as

\[ d_w(A, B) = \left( \sum_{x \in X} |\mu_A(x) - \mu_B(x)|^w \right)^{1/w} \quad \text{for } w \in [1, \infty]. \]

For w = 1 the Minkowski distance becomes the Hamming distance, and for w = 2 it becomes the Euclidean distance.

4. Extension principle

To work with functions defined on fuzzy sets, a rule that determines the membership function in the codomain has to be established. Zadeh established this rule, known as the extension principle, in 1965. The extension principle is stated as follows: consider two sets X and Y and a function f from A, a fuzzy subset of X (A ⊂ X), to B, a fuzzy subset of Y (B ⊂ Y). Then the membership function $\mu_B(y)$ is defined as

\[ \mu_B(y) = \max_{x \in f^{-1}(y)} \mu_A(x), \]

where $f^{-1}(y)$ is the set of points in X which are mapped onto y by f.

5. Fuzzy numbers

A fuzzy number is defined as a fuzzy set A on $\mathbb{R}$ which has the following properties (Klir and Yuan, 1995):

• A is a normal fuzzy set;
• ${}^{\alpha}A$ is a closed interval for every $\alpha \in (0,1]$;
• The support of A must be bounded.

A more relaxed definition is given by Lee (2002), which replaces the boundedness and closedness conditions on the intervals with the convexity requirement. The definition used in this study is the one provided by Klir and Yuan (1995).
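The three distances of Section 3.7 can be sketched in Python for fuzzy sets on a finite universe, each set being given as a list of membership grades in the same element order. The function names and the two example sets are assumptions made for this illustration.

```python
import math


def hamming(mu_a, mu_b):
    """Hamming distance: sum of absolute differences of membership grades."""
    return sum(abs(a - b) for a, b in zip(mu_a, mu_b))


def euclidean(mu_a, mu_b):
    """Euclidean distance between two lists of membership grades."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_a, mu_b)))


def minkowski(mu_a, mu_b, w):
    """Minkowski distance for a finite w >= 1."""
    return sum(abs(a - b) ** w for a, b in zip(mu_a, mu_b)) ** (1.0 / w)


def relative_hamming(mu_a, mu_b):
    """Hamming distance divided by n = card(X)."""
    return hamming(mu_a, mu_b) / len(mu_a)


def relative_euclidean(mu_a, mu_b):
    """Euclidean distance divided by sqrt(n), n = card(X)."""
    return euclidean(mu_a, mu_b) / math.sqrt(len(mu_a))


# Hypothetical fuzzy sets on a three-element universe.
A = [0.2, 0.5, 1.0]
B = [0.4, 0.5, 0.0]
```

With these definitions `minkowski(A, B, 1)` reproduces `hamming(A, B)` and `minkowski(A, B, 2)` reproduces `euclidean(A, B)`, up to floating-point rounding, as stated for w = 1 and w = 2.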
The operations with fuzzy numbers are based on two properties:

• each fuzzy set can be uniquely represented by its α-cuts;
• the α-cuts of a fuzzy number are closed intervals on $\mathbb{R}$.

The basic operations defined on closed intervals are:

addition: $[a, b] + [c, d] = [a + c, b + d]$
subtraction: $[a, b] - [c, d] = [a - d, b - c]$
multiplication: $[a, b] \cdot [c, d] = [\min(ac, ad, bc, bd), \max(ac, ad, bc, bd)]$
division: $[a, b] / [c, d] = [a, b] \cdot [1/d, 1/c] = [\min(a/c, a/d, b/c, b/d), \max(a/c, a/d, b/c, b/d)]$, provided that $0 \notin [c, d]$

An operation $*$ between two fuzzy numbers A and B is defined, using α-cuts, as

\[ {}^{\alpha}(A * B) = {}^{\alpha}A * {}^{\alpha}B. \]

If $*$ is one of the four operations defined above on closed intervals, the result is also a fuzzy number and is expressed as

\[ A * B = \bigcup_{\alpha \in [0,1]} {}^{\alpha}(A * B), \]

with the membership function determined by the extension principle:

\[ \mu_{A * B}(z) = \sup_{z = x * y} \min[\mu_A(x), \mu_B(y)] \quad \text{for all } z \in \mathbb{R}. \]

For the basic arithmetic operations defined for closed intervals the above equation becomes:

\[ \mu_{A+B}(z) = \sup_{z = x + y} \min[\mu_A(x), \mu_B(y)] \]
\[ \mu_{A-B}(z) = \sup_{z = x - y} \min[\mu_A(x), \mu_B(y)] \]
\[ \mu_{A \cdot B}(z) = \sup_{z = x \cdot y} \min[\mu_A(x), \mu_B(y)] \]
\[ \mu_{A/B}(z) = \sup_{z = x / y} \min[\mu_A(x), \mu_B(y)] \]

The most used shape for a fuzzy number is the triangle. A triangular fuzzy number is a fuzzy number represented by three points, with linear functions between these points. The notation for a triangular fuzzy number is $A = (a_1, a_2, a_3)$ and its membership function is

\[ \mu_A(x) = \begin{cases} \dfrac{x - a_1}{a_2 - a_1} & a_1 \le x \le a_2 \\ \dfrac{a_3 - x}{a_3 - a_2} & a_2 \le x \le a_3 \\ 0 & \text{otherwise.} \end{cases} \]

[Figure: membership value plotted against x value for the triangular fuzzy number A = (3, 15, 20).]

Fig. 1. Triangular fuzzy number

A symmetric triangular fuzzy number is a triangular fuzzy number with the property $a_2 - a_1 = a_3 - a_2$.

6. Possibility theory

Possibility theory is the part of fuzzy set theory that defines the degrees of belief that a given element belongs to a fuzzy set. To formalize the theory a series of definitions are made.
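Returning briefly to the interval arithmetic and the triangular fuzzy number of Section 5, the sketch below shows how these pieces might be coded. Intervals are represented as `(low, high)` tuples and a triangular number as `(a1, a2, a3)` with a1 < a2 < a3; these representations and all function names are assumptions made for the illustration.

```python
def interval_add(p, q):
    """[a, b] + [c, d] = [a + c, b + d]."""
    return (p[0] + q[0], p[1] + q[1])


def interval_sub(p, q):
    """[a, b] - [c, d] = [a - d, b - c]."""
    return (p[0] - q[1], p[1] - q[0])


def interval_mul(p, q):
    """[a, b] * [c, d] = [min of the four products, max of the four products]."""
    products = [p[0] * q[0], p[0] * q[1], p[1] * q[0], p[1] * q[1]]
    return (min(products), max(products))


def interval_div(p, q):
    """[a, b] / [c, d] = [a, b] * [1/d, 1/c]; undefined if 0 lies in [c, d]."""
    if q[0] <= 0 <= q[1]:
        raise ZeroDivisionError("divisor interval contains 0")
    return interval_mul(p, (1.0 / q[1], 1.0 / q[0]))


def triangular_membership(x, a1, a2, a3):
    """Membership grade of x in the triangular fuzzy number (a1, a2, a3)."""
    if a1 <= x <= a2:
        return (x - a1) / (a2 - a1)
    if a2 < x <= a3:
        return (a3 - x) / (a3 - a2)
    return 0.0


def alpha_cut(tri, alpha):
    """Closed interval of all x whose membership grade is at least alpha (alpha > 0)."""
    a1, a2, a3 = tri
    return (a1 + alpha * (a2 - a1), a3 - alpha * (a3 - a2))
```

For the number A = (3, 15, 20) shown in Fig. 1, `alpha_cut((3, 15, 20), 0.5)` gives the interval (9.0, 17.5), and both endpoints have membership grade 0.5. Adding two triangular numbers then amounts to calling `interval_add` on their α-cuts for each α of interest.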
A fuzzy measure on the pair $(X, \mathcal{F})$, where $\mathcal{F}$ is a nonempty family of subsets of X, is a function $f: \mathcal{F} \to [0,1]$ that has the following properties:

• $f(\emptyset) = 0$ and $f(X) = 1$;
• for any $A, B \in \mathcal{F}$, if $A \subseteq B$ then $f(A) \le f(B)$;
• for any increasing sequence $A_1 \subset A_2 \subset \ldots$ in $\mathcal{F}$, if $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$, then $\lim_{i \to \infty} f(A_i) = f\left(\bigcup_{i=1}^{\infty} A_i\right)$;
• for any decreasing sequence $A_1 \supset A_2 \supset \ldots$ in $\mathcal{F}$, if $\bigcap_{i=1}^{\infty} A_i \in \mathcal{F}$, then $\lim_{i \to \infty} f(A_i) = f\left(\bigcap_{i=1}^{\infty} A_i\right)$.

A belief measure is a function, Bel, defined on all the subsets of the set X (the power set of X, $\mathcal{P}(X)$) with the following properties:

• $Bel: \mathcal{P}(X) \to [0,1]$;
• $Bel(\emptyset) = 0$ and $Bel(X) = 1$;
• $Bel(A_1 \cup A_2 \cup \ldots \cup A_n) \ge \sum_{j} Bel(A_j) - \sum_{j<k} Bel(A_j \cap A_k) + \ldots + (-1)^{n+1} Bel(A_1 \cap A_2 \cap \ldots \cap A_n)$.

For every $A \in \mathcal{P}(X)$, Bel(A) can be interpreted as the degree of belief that a certain element of X belongs to the set A. A property of the belief measure is

\[ Bel(A) + Bel(\bar{A}) \le 1. \]

A plausibility measure is a function, Pl, defined on all the subsets of the set X with the following properties:

• $Pl: \mathcal{P}(X) \to [0,1]$;
• $Pl(\emptyset) = 0$ and $Pl(X) = 1$;
• $Pl(A_1 \cap A_2 \cap \ldots \cap A_n) \le \sum_{j} Pl(A_j) - \sum_{j<k} Pl(A_j \cup A_k) + \ldots + (-1)^{n+1} Pl(A_1 \cup A_2 \cup \ldots \cup A_n)$.

A series of properties follows from the above definitions of the belief and plausibility measures, for any $A \in \mathcal{P}(X)$:

\[ Pl(A) + Pl(\bar{A}) \ge 1 \]
\[ Pl(A) + Bel(\bar{A}) = 1 \]
\[ Pl(\bar{A}) + Bel(A) = 1 \]
\[ Pl(A) \ge Bel(A) \]

The belief and plausibility measures are uniquely determined by a function having certain properties. This function is called the basic probability assignment and has the following properties:

\[ m: \mathcal{P}(X) \to [0,1], \qquad \sum_{A \in \mathcal{P}(X)} m(A) = 1. \]

The equations that define the belief and plausibility measures from the basic probability assignment are

\[ Bel(A) = \sum_{B \mid B \subseteq A} m(B), \qquad Pl(A) = \sum_{B \mid A \cap B \ne \emptyset} m(B), \quad \text{where } A, B \in \mathcal{P}(X). \]

An element A of the power set of X, $\mathcal{P}(X)$, such that m(A) > 0 is called a focal element of m. A focal element is a subset of X on which the available evidence focuses. If X is a finite set, then the basic probability assignment can be characterized by a finite list of its focal elements.
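The two sums that define Bel and Pl from a basic probability assignment can be sketched as follows. The assignment is stored as a dictionary mapping focal elements (as `frozenset` objects) to their masses; this representation, the function names and the three-element universe are assumptions made for the example.

```python
def belief(m, a):
    """Bel(A): total mass of the focal elements B contained in A."""
    return sum(mass for focal, mass in m.items() if focal <= a)


def plausibility(m, a):
    """Pl(A): total mass of the focal elements B that intersect A."""
    return sum(mass for focal, mass in m.items() if focal & a)


# Hypothetical body of evidence on X = {a, b, c}; the masses sum to 1.
X = frozenset("abc")
m = {
    frozenset("a"): 0.5,
    frozenset("ab"): 0.3,
    X: 0.2,
}
```

With this assignment the subset {a} has belief 0.5 and plausibility 1.0, so Pl(A) ≥ Bel(A) holds, and for any subset A the sum `plausibility(m, A) + belief(m, X - A)` equals 1 up to rounding, in line with the properties listed above.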
A body of evidence is a pair $(\mathcal{F}, m)$, where $\mathcal{F}$ represents a set of focal elements and m the corresponding basic probability assignment. Total ignorance is represented in terms of the basic probability assignment by m(X) = 1 and m(A) = 0 for any A ≠ X. If the focal elements of a body of evidence are nested, the belief and plausibility measures gain the following property:

\[ Bel(A \cap B) = \min[Bel(A), Bel(B)], \qquad Pl(A \cup B) = \max[Pl(A), Pl(B)] \quad \text{for any } A, B \in \mathcal{P}(X). \]

If the body of evidence is nested, the belief measure is known as a necessity measure and the plausibility measure as a possibility measure. The necessity and possibility measures can also be defined using fuzzy measures.

A fuzzy measure on $\langle X, \mathcal{C} \rangle$ ($\mathcal{C}$ a nonempty family of subsets of X) with the property

\[ Nec\left(\bigcap_{i \in I} A_i\right) = \inf_{i \in I} Nec(A_i) \]

for any family $\{A_i \mid i \in I\}$ in $\mathcal{C}$ such that $\bigcap_{i \in I} A_i \in \mathcal{C}$, where I is an arbitrary index set, is called a necessity measure.

A fuzzy measure on $\langle X, \mathcal{C} \rangle$ ($\mathcal{C}$ a nonempty family of subsets of X) with the property

\[ Pos\left(\bigcup_{i \in I} A_i\right) = \sup_{i \in I} Pos(A_i) \]

for any family $\{A_i \mid i \in I\}$ in $\mathcal{C}$ such that $\bigcup_{i \in I} A_i \in \mathcal{C}$, where I is an arbitrary index set, is called a possibility measure.

The necessity and possibility measures have the following properties, for any $A, B \in \mathcal{P}(X)$ (Klir and Yuan, 1995):

\[ Nec(A \cap B) = \min[Nec(A), Nec(B)] \]
\[ Pos(A \cup B) = \max[Pos(A), Pos(B)] \]
\[ Nec(A) + Nec(\bar{A}) \le 1 \]
\[ Pos(A) + Pos(\bar{A}) \ge 1 \]
\[ Nec(A) + Pos(\bar{A}) = 1 \]
\[ \min[Nec(A), Nec(\bar{A})] = Nec(A \cap \bar{A}) = 0 \]
\[ \max[Pos(A), Pos(\bar{A})] = Pos(A \cup \bar{A}) = 1 \]
if Nec(A) > 0 then Pos(A) = 1
if Pos(A) < 1 then Nec(A) = 0

7. Fuzzy linear regression with fuzzy data

This type of fuzzy linear regression assumes that the data are fuzzy numbers but the regression coefficients are crisp. Tanaka et al. (1982) propose possibility theory as a way to solve the regression equation. The theory proposed by Tanaka uses symmetric triangular fuzzy numbers as its basis. The traditional crisp regression equation, based on the Cotes (1722) assumption, is

\[ y = \sum_{i=1}^{n} a_i x_i + e, \]

where
y: dependent variable
$x_i$: independent variables
$a_i$: regression equation coefficients
n: the number of variables
e: error

The model proposed by Tanaka includes the error term in the vagueness of the predictor variables. Therefore the fuzzy regression equation is

\[ Y = \sum_{i=1}^{n} a_i X_i, \]

where
Y: dependent variable, a symmetric triangular fuzzy number
$X_i$: independent variable, a symmetric triangular fuzzy number
$a_i$, n: as above

A symmetric triangular fuzzy number a can be written as

\[ a = (a_1, a_2, a_3) = (a_2, s), \]

where $a_1, a_2, a_3$ are the points defining the triangular fuzzy number and $s = a_2 - a_1 = a_3 - a_2$, because a is a symmetric fuzzy number.

As in the crisp regression equation, the aim is to calculate the coefficients $a_1, a_2, \ldots, a_n$ such that the linear fuzzy function fits the fuzzy data as well as possible. For crisp sets a series of methods were developed to find the coefficients: the absolute values criterion (or L1), least squares (or L2), maximum likelihood, etc. For fuzzy sets the equation has to fulfill two goodness criteria (Tanaka et al., 1982; Klir and Yuan, 1995) to be "the best":

• The total difference between the area of the actual fuzzy number and the area of the predicted fuzzy number has to be minimal;
• The predicted fuzzy number and the actual fuzzy number have to be compatible at least to a certain level $h \in [0,1]$. The compatibility is defined as

\[ com(Y_1, Y_2) = \sup_{y \in \mathbb{R}} \min[Y_1(y), Y_2(y)], \quad \text{where } Y_1, Y_2 \text{ are fuzzy numbers.} \tag{A1} \]

For a data set containing $m \in \mathbb{N}$ observations and $n \in \mathbb{N}$ variables $x_i$, $i \in \{1, \ldots, n\}$, the above relations are written as follows:

$X_{ij} = \langle x_{ij}, s_{ij} \rangle$ for $j \in \{1, \ldots, m\}$ and $i \in \{1, \ldots, n\}$
$Y_j = \langle y_j, s_j \rangle$: actual dependent fuzzy number

The total spread of the predicted fuzzy number is $\sum_{i=1}^{n} |a_i| s_{ij}$.

The two goodness criteria lead to a non-linear programming problem. The spread criterion is written in mathematical terms as

\[ \text{minimize } \sum_{j=1}^{m} \left| s_j - \sum_{i=1}^{n} |a_i| s_{ij} \right| \tag{A2} \]

This criterion ensures that the spread of the predicted fuzzy number is close to the spread of the actual one. Condition A2 avoids situations like the one presented in Figure 2 and leads to solutions like the one in Figure 3. The predicted fuzzy value has to overlap with the actual fuzzy value, ensuring a minimum α-level for both. When two fuzzy numbers overlap, it is possible that one of them has a very large spread compared with the other (Fig. 2). Condition A2 eliminates this kind of solution.

Figure 2. Comparing the spread of two fuzzy numbers (undesired case) [axes: membership function against length, m]

Figure 3. Comparing the spread of two fuzzy numbers (desired case) [axes: membership function against length, m]

The compatibility criterion (second criterion) ensures that the predicted fuzzy value and the actual fuzzy value overlap. If the selected compatibility level is $h \in (0,1]$, then the equation used is

\[ com(Y_1, Y_2) = \sup_{y \in \mathbb{R}} \min[Y_1(y), Y_2(y)] \ge h \tag{A3} \]

The solid line in Figure 4 shows the maximum level of compatibility between the two fuzzy numbers. The compatibility level ensures that the predicted value has a degree of belief at least as high as the established level; the larger the compatibility level, the closer the predicted value is to the actual value.

Fig. 4. Compatibility between two fuzzy numbers

For symmetric triangular fuzzy numbers, the above condition can be written as the pair of inequalities

\[ \sum_{i=1}^{n} a_i x_{ij} - (1-h) \sum_{i=1}^{n} |a_i| s_{ij} \le y_j + (1-h) s_j \]
\[ \sum_{i=1}^{n} a_i x_{ij} + (1-h) \sum_{i=1}^{n} |a_i| s_{ij} \ge y_j - (1-h) s_j \quad \text{for all } j \in \{1, \ldots, m\} \tag{A4} \]

Written for all m observations, the above condition leads to 2m inequalities. The final non-linear programming problem is

\[ \text{minimize } \sum_{j=1}^{m} \left| s_j - \sum_{i=1}^{n} |a_i| s_{ij} \right| \]
\[ \text{subject to } \sum_{i=1}^{n} a_i x_{ij} - (1-h) \sum_{i=1}^{n} |a_i| s_{ij} \le y_j + (1-h) s_j \text{ and } \sum_{i=1}^{n} a_i x_{ij} + (1-h) \sum_{i=1}^{n} |a_i| s_{ij} \ge y_j - (1-h) s_j \quad \text{for all } j \in \{1, \ldots, m\} \tag{A5} \]

Appendix 4: Prediction and confidence intervals for the crisp set regression equation.

All lengths are in metres.

Event   Actual length  Predicted length  Lower conf. limit  Upper conf. limit
52-12   25.8           6.7               -99.6              112.9
73-18   31.5           64.4              -55.2              184.1
73-18b  35.8           5.1               -105.5             115.6
61-18   37.2           84.8              -11.9              181.6
52-10   55.5           66.4              -39.8              172.5
62-23b  58.6           59.1              -70.7              188.9
61-15   62.1           75.2              -32.9              183.4
21-102  75.8           93.1              -21.5              207.7
52-26   76.3           113.4             12.1               214.7
62-14   80.2           120.2             20.0               220.5
31-4    84.9           76.1              -44.5              196.7
62-23   93.2           93.8              -36.0              223.6
53-5    94.4           163.8             67.8               259.8
64-20   97.8           107.0             1.0                213.0
73-28   108.0          99.2              -12.7              211.2
73-17   109.6          24.2              -73.9              122.2
83-7    130.4          92.7              -4.0               189.4
51-3    143.0          102.0             -11.7              215.6
74-20   150.8          149.9             20.1               279.7
53-6    180.3          164.6             68.6               260.6
52-13   189.7          172.2             73.8               270.5
52-16   343.6          369.1             252.4              485.8
21-101  359.2          342.8             237.9              447.8
52-32   530.4          535.0             413.6              656.3
61-19   719.1          712.6             600.6              824.6
52-30   773.9          753.6             628.0              879.3

Appendix 5: Prediction and confidence intervals for the fuzzy set regression equation.

All values are in metres.

Event   Fuzzy low limit  Actual travel distance  Fuzzy up limit  Fuzzy model low limit  Fuzzy model  Fuzzy model up limit
21-101  287.4            359.2                   431.0           144.8                  287.4        429.9
21-102  60.6             75.8                    91.0            37.7                   106.5        175.3
31-4    67.9             84.9                    101.9           68.1                   152.2        236.3
51-3    114.4            143.0                   171.6           52.1                   114.4        176.7
52-10   44.4             55.5                    66.6            51.1                   111.0        170.9
52-12   20.6             25.8                    31.0            6.6                    65.9         125.1
52-13   151.8            189.7                   227.6           83.4                   181.9        280.4
52-16   274.9            343.6                   412.3           239.1                  390.3        541.5
52-26   61.0             76.3                    91.6            80.4                   156.1        231.8
52-30   619.1            773.9                   928.7           382.8                  619.1        855.4
52-32   424.3            530.4                   636.5           299.9                  452.9        605.9
53-5    75.5             94.4                    113.3           77.1                   171.9        266.8
53-6    144.2            180.3                   216.4           90.7                   187.0        283.3
61-15   49.7             62.1                    74.5            50.9                   136.3        221.6
61-18   29.8             37.2                    44.6            44.6                   119.6        194.6
61-19   575.3            719.1                   862.9           370.6                  575.3        780.0
62-14   64.2             80.2                    96.2            46.5                   134.5        222.5
62-23   74.6             93.2                    111.8           14.1                   74.6         135.1
62-23b  46.9             58.6                    70.3            7.0                    70.5         133.9
64-20   78.2             97.8                    117.4           84.0                   165.2        246.4
73-17   87.7             109.6                   131.5           27.2                   87.7         148.2
73-18   25.2             31.5                    37.8            37.8                   105.6        173.4
73-18b  28.6             35.8                    43.0            7.2                    72.3         137.4
73-28   86.4             108.0                   129.6           70.8                   152.1        233.4
74-20   120.6            150.8                   181.0           23.1                   120.6        218.2
83-7    104.3            130.4                   156.5           36.3                   111.4        186.4

Appendix 6:
Pearson's correlation coefficients for raw and transformed variables

Pearson correlation coefficients for transformed variables (used in the regression)

Simple Statistics

Variable   N        Mean     Std Dev         Sum     Minimum     Maximum
Lef       26   178.73462   203.29939        4647    25.80000   773.90000
Lpr       26     1.07180     0.57682    27.86676     0.64416     2.89420
TST       26   525.18201   781.50639       13655   0.0000131        3328
CAZ       26     0.05497     0.50147     1.42927    -0.98791     1.00000
cvcv      26     4.04911    19.57479   105.27690           0   100.00000
cvcx      26     3.96877    19.58966   103.18790           0   100.00000
cvp       26     0.15583     0.44092     4.05147           0     1.40845
cxcv      26     0.09225     0.33123     2.39855           0     1.40845
cxcx      26     0.10532     0.37982     2.73825           0     1.63934
cxp       26     0.08556     0.30424     2.22467           0     1.23457
pcv       26     3.92650    19.59728   102.08900           0   100.00000
pcx       26     0.12364     0.35141     3.21477           0     1.23457

Pearson Correlation Coefficients, N = 26
Prob > |r| under H0: Rho=0
(symmetric matrix, lower triangle shown; the p-value is printed below each coefficient)

Variable       Lef       Lpr       TST       CAZ      cvcv      cvcx       cvp      cxcv      cxcx       cxp       pcv       pcx
Lef        1.00000
Lpr        0.92962   1.00000
            <.0001
TST        0.30533   0.36539   1.00000
            0.1293    0.0664
CAZ        0.46935   0.37863   0.09908   1.00000
            0.0156    0.0565    0.6301
cvcv      -0.12424  -0.15481  -0.14234  -0.02041   1.00000
            0.5454    0.4502    0.4879    0.9212
cvcx      -0.02346   0.05632  -0.13052  -0.41934  -0.04358   1.00000
            0.9094    0.7847    0.5251    0.0330    0.8326
cvp       -0.02509   0.05113  -0.09793   0.10922  -0.07603  -0.07446   1.00000
            0.9032    0.8041    0.6341    0.5953    0.7120    0.7177
cxcv       0.61203   0.36718  -0.16245   0.22386  -0.05992  -0.05868  -0.10237   1.00000
            0.0009    0.0650    0.4278    0.2716    0.7712    0.7758    0.6188
cxcx      -0.14608  -0.10619   0.49211  -0.16695  -0.05965  -0.05842  -0.10191  -0.08032   1.00000
            0.4764    0.6057    0.0107    0.4150    0.7722    0.7768    0.6203    0.6965
cxp       -0.14043  -0.21684   0.08803   0.17513  -0.06050  -0.05926  -0.10337  -0.08146  -0.08110   1.00000
            0.4938    0.2873    0.6689    0.3922    0.7691    0.7737    0.6153    0.6924    0.6937
pcv       -0.08461  -0.14948  -0.13835  -0.02285  -0.04310  -0.04222  -0.07364  -0.05804  -0.05778  -0.05860   1.00000
            0.6811    0.4661    0.5003    0.9118    0.8344    0.8378    0.7207    0.7782    0.7792    0.7761
pcx       -0.19525  -0.22924   0.00321  -0.26345  -0.07569  -0.07413  -0.12932  -0.10192  -0.10146  -0.10291  -0.07332   1.00000
            0.3391    0.2600    0.9876    0.1935    0.7132    0.7189    0.5289    0.6203    0.6219    0.6169    0.7219

Pearson correlation coefficients for raw variables

Simple Statistics

Variable   N        Mean     Std Dev         Sum     Minimum     Maximum
Lef       26   178.73462   203.29939        4647    25.80000   773.90000
Path      26     9.23077    21.79231   240.00000     1.00000   106.00000
St        26    34.65385     5.76154   901.00000    18.00000    44.00000
az        26   188.53846   105.23316        4902           0   344.00000
H         26    23.76923    12.57078   618.00000           0    45.00000
K         26     0.78462     0.31073    20.40000           0     1.00000

Pearson Correlation Coefficients, N = 26
Prob > |r| under H0: Rho=0
(symmetric matrix, lower triangle shown; the p-value is printed below each coefficient)

Variable       Lef      Path        St        az         H         K
Lef        1.00000
Path       0.84167   1.00000
            <.0001
St        -0.03752   0.04845   1.00000
            0.8556    0.8142
az         0.23633   0.36154   0.09592   1.00000
            0.2451    0.0696    0.6411
H          0.16680   0.30362   0.50143   0.14614   1.00000
            0.4154    0.1316    0.0091    0.4762
K          0.13396   0.13227   0.54430   0.32370   0.59094   1.00000
            0.5141    0.5195    0.0040    0.1067    0.0015

Appendix 8.
Result tables

Table A.8.1. Statistics used to identify the outlier observations (whole dataset)

Event   Studentized deleted residual  Hat diag H   Cov ratio
21-101                          0.74        0.15        1.61
21-102                         -1.29        0.50        1.28
31-3                            6.23        0.58        0.00
31-4                            0.56        0.71        5.52
51-3                            0.27        0.48        3.64
52-10                          -0.45        0.18        2.12
52-12                          -0.33        0.32        2.71
52-13                           0.59        0.11        1.76
52-16                           0.08        0.35        3.02
52-26                          -0.35        0.14        2.13
52-30                          -3.76        0.51        0.00
52-32                          -0.43        0.62        4.61
53-5                           -0.57        0.08        1.72
53-6                           -0.03        0.08        2.16
61-10                          -0.17        0.79        9.30
61-15                        -0.5643      0.3590      2.4790
61-18                        -0.2345      0.0963      2.1137
61-19                         0.6178      0.3131      2.2144
62-10                         2.2535      0.3711      0.1371
62-14                        -0.5833      0.1504      1.8425
62-23                        -0.5630      0.9998    8979.317
62-23b                       -0.1964      0.9997    6262.330
64-20                         0.4263      0.3092      2.5301
73-12                        -0.5095      0.1513      1.9505
73-17                         0.5650      0.1112      1.7871
73-18                        -0.2703      0.6665      5.6551
73-18b                        0.9757      0.3918      1.6975
73-28                        -1.6379      0.3790      0.5510
74-20                         0.4397      0.9998    9557.203
83-7                          0.0766      0.0888      2.1700

Table A.8.2. Statistics used to determine the influential observations (whole dataset)

                            DFBETAS
Event       DFFITS  Cook's D     LPATH       TST       CAZ      cvcv      cvcx       cvp      cxcv      cxcx       cxp       pcv       pcx
21-101      0.3161     0.009    0.2099   -0.1595   -0.0669   -0.0569   -0.1077   -0.1392   -0.2193   -0.0031   -0.0196   -0.0482   -0.0712
21-102     -1.2800     0.132    0.1796   -0.3172   -0.2881   -0.0624   -0.1870   -0.0406   -0.1740    0.0515    0.0542   -0.0618   -1.1123
31-3        7.3309     1.445   -3.4200    6.8106   -0.6937   -0.1681    0.1347    0.0110    1.7014   -3.7303   -1.7499   -0.1613   -1.6314
31-4        0.8872     0.068    0.0315   -0.0629    0.1355    0.0154    0.0598   -0.0121   -0.0004    0.7910    0.0088    0.0153    0.0614
51-3        0.2605     0.006   -0.0099   -0.0123    0.1112   -0.0001    0.0428   -0.0237    0.0106    0.0209    0.1923    0.0002    0.0248
52-10      -0.2117     0.004    0.0663    0.0232   -0.1433    0.0382   -0.0252    0.0779   -0.0143    0.0150    0.0774    0.0417    0.0286
52-12      -0.2248     0.004    0.0725   -0.0006    0.0877    0.0200    0.0386   -0.1792   -0.0409    0.0289    0.0090    0.0198    0.0454
52-13       0.2082     0.004    0.0617   -0.1112   -0.0410   -0.0526   -0.0761   -0.0922   -0.0998   -0.0168   -0.0390   -0.0571   -0.0739
52-16       0.0552     0.000    0.0152   -0.0161    0.0142    0.0011    0.0040    0.0366   -0.0094    0.0107    0.0027    0.0011    0.0073
52-26      -0.1421     0.002    0.0208    0.0352   -0.0853    0.0324   -0.0098    0.0616    0.0116    0.0064    0.0492    0.0313    0.0231
52-30      -3.8068     0.699   -1.0647   -2.1722   -1.0445   -0.2242   -0.4305    0.8485    1.0268    0.8741    0.4385   -0.2220   -0.0882
52-32      -0.5534     0.027    0.3513   -0.0964   -0.0870    0.0373   -0.0544   -0.0514   -0.5153    0.0587    0.0894    0.0358    0.0473
53-5       -0.1716     0.003   -0.0071    0.0297    0.0362    0.0516    0.0554    0.0744    0.0578    0.0486    0.0513    0.0503    0.0775
53-6       -0.0101     0.000   -0.0004    0.0016    0.0023    0.0027    0.0036    0.0043    0.0034    0.0029    0.0030    0.0029    0.0046
61-10      -0.3365     0.010   -0.1722    0.0230    0.1196   -0.0227    0.0516    0.0017   -0.0453   -0.0111   -0.0584   -0.0216   -0.0094
61-15      -0.4223     0.015    0.0315   -0.0629    0.1355    0.0154    0.0598   -0.0121   -0.0004   -0.2592    0.0088    0.0153    0.0614
61-18      -0.0766     0.001    0.0066    0.0241    0.0140    0.0223    0.0261    0.0301    0.0175    0.0169    0.0230    0.0242    0.0344
61-19       0.4171     0.015   -0.0311   -0.0709    0.2588   -0.0176    0.0860   -0.0720    0.2179    0.0523   -0.0553   -0.0163    0.0408
62-10       1.7310     0.204    0.7717   -0.0871    0.6445    0.1430    0.2423    0.8524   -0.5001    0.2344    0.0824    0.1408    0.3374
62-14      -0.2454     0.005    0.0174   -0.0079    0.1684    0.0607    0.1142    0.0504    0.0527    0.0940    0.0466    0.0596    0.1242
62-23     -42.2972   154.968    0.0847   -0.1033   -0.0617   -0.0742   -0.1017   -0.1252   -0.1382   -0.0436   -0.0568  -40.1312   -0.1033
62-23b    -11.1805    11.005   -0.0168   -0.0523    0.0049  -10.6114   -0.0452   -0.0779   -0.0384   -0.0347   -0.0597   -0.0546   -0.0709
64-20       0.2852     0.007    0.0371   -0.0595    0.0592   -0.0036    0.0135   -0.0266   -0.0266    0.0327   -0.0014   -0.0034    0.2480
73-12      -0.2151     0.004   -0.1244    0.1351    0.0101    0.0451    0.0625    0.1013    0.1359   -0.0107    0.0220    0.0440    0.0451
73-17       0.1998     0.003   -0.0510   -0.0491   -0.0468   -0.0599   -0.0672   -0.0647   -0.0195   -0.0511   -0.0625   -0.0639   -0.0948
73-18      -0.3822     0.013   -0.0099   -0.0123    0.1112   -0.0001    0.0428   -0.0237    0.0106    0.0209   -0.3467    0.0002    0.0248
73-18b      0.7831     0.051    0.0635   -0.1268   -0.3914   -0.0447   -0.1933    0.0299   -0.0844   -0.0356    0.0499   -0.0448    0.4414
73-28      -1.2795     0.125    0.5002   -0.4108    0.3251    0.0251    0.0676   -1.1187   -0.3415    0.2551    0.0714    0.0251    0.1700
74-20      32.6403    92.948    0.0722   -0.3500   -0.1063    0.0501   27.9864    0.1555    0.0477    0.2300    0.1695    0.0482    0.1188
83-7        0.0239     0.000   -0.0043   -0.0035   -0.0046   -0.0077   -0.0079   -0.0091   -0.0041   -0.0073   -0.0080   -0.0067   -0.0114

Table A.8.3. Statistics for identifying the outliers (reduced dataset)

Event   Studentized deleted residual  Hat diag H   Cov ratio
21-101                         0.446       0.307       2.927
21-102                        -0.595       0.559       4.003
31-4                           0.381       0.725       7.760
51-3                           1.459       0.533       0.844
52-10                         -0.302       0.338       3.381
52-12                          0.535       0.340       2.837
52-13                          0.431       0.148       2.410
52-16                         -0.960       0.617       2.789
52-26                         -0.979       0.218       1.326
52-30                          1.380       0.875       3.768
52-32                         -0.204       0.748       9.277
53-5                          -1.845       0.094       0.165
53-6                           0.374       0.094       2.363
61-15                         -0.381       0.389       3.487
61-18                         -1.199       0.111       0.778
61-19                          0.204       0.489       4.585
62-14                         -1.045       0.193       1.146
62-23                         -1.051       1.000    5530.163
62-23b                        -0.792       1.000    5410.017
64-20                         -0.254       0.334       3.443
73-17                          2.537       0.141       0.023
73-18                         -1.459       0.700       1.313
73-18b                         0.967       0.451       1.926
73-28                          0.277       0.488       4.426
74-20                          1.642       1.000    1467.102
83-7                           0.929       0.110       1.265

Table A.8.4. Statistics to identify the influential cases for final regression (reduced dataset)

                            DFBETAS
Event       DFFITS  Cook's D     LPATH       TST       CAZ      cvcv      cvcx       cvp      cxcv      cxcx       cxp       pcv       pcx
21-101      0.2974      0.01    0.2457   -0.1796   -0.1111   -0.0174   -0.1283   -0.0822   -0.1646    0.0726    0.0685   -0.0127   -0.0074
21-102     -0.6698      0.04    0.2158   -0.2559   -0.1889   -0.0123   -0.1481   -0.0337   -0.1141    0.1271    0.1173   -0.0130   -0.4698
31-4        0.6187      0.03    0.0263   -0.0746    0.1045    0.0100    0.0468    0.0015   -0.0324    0.4902    0.0157    0.0100    0.0598
51-3        1.5584      0.19   -0.2640   -0.0247    0.7943   -0.0511    0.3697   -0.0871   -0.0677    0.0976    0.8496   -0.0466    0.1326
52-10      -0.2157      0.00    0.1275   -0.0305   -0.1635    0.0498   -0.0593    0.0596    0.0098    0.0398    0.1080    0.0514    0.0410
52-12*      0.3840      0.01
52-13*      0.1794      0.00
52-16      -1.2179      0.12   -0.6207    0.5671   -0.2302   -0.0948   -0.0139   -0.7675    0.3880   -0.4660   -0.2641   -0.0923   -0.3306
52-26      -0.5168      0.02    0.2027    0.0257   -0.3632    0.1387   -0.1035    0.1803    0.0996    0.0483    0.2293    0.1336    0.0945
52-30       3.6455      1.04    0.5661    2.1153    0.1666    0.2904    0.1725   -0.2350   -0.2081   -1.3061   -0.4371    0.2845   -0.2084
52-32      -0.3518      0.01    0.0919   -0.0735    0.0307    0.0101   -0.0028   -0.0153   -0.3233    0.0571    0.0317    0.0099    0.0395
53-5*      -0.5940      0.03
53-6*       0.1207      0.00
61-15      -0.3037      0.01    0.0263   -0.0746    0.1045    0.0100    0.0468    0.0015   -0.0324   -0.1124    0.0157    0.0100    0.0598
61-18      -0.4245      0.01    0.0719    0.0693    0.0296    0.1380    0.1205    0.1849    0.1064    0.0924    0.1439    0.1472    0.1964
61-19       0.2001      0.00    0.0919   -0.0735    0.0307    0.0101   -0.0028   -0.0153    0.0681    0.0571    0.0317    0.0099    0.0395
62-14      -0.5106      0.02    0.0300   -0.1049    0.3377    0.1156    0.2358    0.1323    0.0242    0.2287    0.0986    0.1138    0.2725
62-23     -81.6501    551.48    0.2693   -0.2408   -0.1845   -0.1314   -0.2625   -0.2460   -0.2885   -0.0059   -0.0256  -76.3511   -0.1684
62-23b    -49.5174    209.93   -0.2065   -0.1004    0.1373  -46.3313   -0.1426   -0.3861   -0.2223   -0.1840   -0.3467   -0.2964   -0.3597
64-20      -0.1795      0.00   -0.0347    0.0574   -0.0359    0.0011   -0.0055    0.0116    0.0339   -0.0415   -0.0115    0.0010   -0.1548
73-17       1.0277      0.06   -0.4175   -0.0141   -0.0455   -0.3515   -0.2245   -0.3920   -0.1038   -0.3162   -0.4225   -0.3667   -0.5205
73-18      -2.2264      0.38   -0.2640   -0.0247    0.7943   -0.0511    0.3697   -0.0871   -0.0677    0.0976   -1.8846   -0.0466    0.1326
73-18b      0.8753      0.06    0.2459   -0.2238   -0.4938   -0.0177   -0.2918   -0.0123   -0.0647    0.0567    0.1647   -0.0194    0.4709
73-28       0.2697      0.01   -0.1071    0.1270   -0.0168   -0.0062    0.0157    0.2197    0.0713   -0.0889   -0.0490   -0.0060   -0.0461
74-20     124.8502   1158.65    0.0536   -0.7046   -0.4351    0.1864  100.7551    0.4733    0.3069    0.6367    0.5539    0.1793    0.3700
83-7*       0.3258      0.01

* The DFBETAS of these events are interleaved in the scanned source and cannot be assigned to individual variables. In scan order they read:
52-12 and 52-13: -0.0925 0.0493 -0.0528 -0.0327 -0.0331 -0.0705 0.0277 -0.0789 -0.0338 -0.0373 0.3008 -0.0437 -0.0371 -0.0754 -0.0727 0.0821 -0.1064 -0.0915 0.0226 -0.0004 -0.0406 -0.0404
53-5 and 53-6: 0.0127 0.1157 0.1828 0.1889 0.2620 0.1728 0.1836 0.1816 0.1785 0.2815 -0.0423 -0.0527 -0.0344 -0.0385 -0.0365 -0.0359 -0.0577 0.0025 -0.0001 -0.0009 -0.0259 -0.0327
83-7: -0.0186 -0.0759 -0.0444 -0.1265 -0.1413 -0.1023 -0.1724 -0.1109 0.0328 -0.1154 -0.1340
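The DFFITS values in Table A.8.2 can be cross-checked against the studentized deleted residuals and hat diagonals in Table A.8.1 through the standard identity DFFITS_i = t_i * sqrt(h_i / (1 - h_i)), where t_i is the studentized deleted residual and h_i the hat diagonal of observation i. The sketch below is an illustration only and is not code from the thesis; the function name `dffits` and the selection of four events are assumptions made for the example.

```python
import math


def dffits(t_deleted: float, hat: float) -> float:
    """Return DFFITS from a studentized deleted residual and a hat diagonal.

    Uses the textbook identity DFFITS = t * sqrt(h / (1 - h)); `hat` must lie
    strictly between 0 and 1.
    """
    return t_deleted * math.sqrt(hat / (1.0 - hat))


# (event, studentized deleted residual, hat diagonal) taken from Table A.8.1,
# followed by the DFFITS value reported for the same event in Table A.8.2.
rows = [
    ("61-15", -0.5643, 0.3590, -0.4223),
    ("62-10", 2.2535, 0.3711, 1.7310),
    ("62-14", -0.5833, 0.1504, -0.2454),
    ("73-28", -1.6379, 0.3790, -1.2795),
]

for event, t, h, reported in rows:
    computed = dffits(t, h)
    print(f"{event}: computed {computed:+.4f}, reported {reported:+.4f}")
```

For these four events the recomputed values agree with the tabulated DFFITS to within rounding of the published inputs. The same check is less informative for events whose hat diagonal is reported as 0.9998 or 1.000, because h / (1 - h) is then extremely sensitive to rounding.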
Item Metadata
Title | Prediction of unconfined debris slide-flow travel distance using set theory |
Creator | Strimbu, Bogdan Mihai |
Date Issued | 2002 |
Extent | 20697223 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-10-09 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0074930 |
URI | http://hdl.handle.net/2429/13864 |
Degree | Doctor of Philosophy - PhD |
Program | Forestry |
Affiliation | Forestry, Faculty of |
Degree Grantor | University of British Columbia |
Graduation Date | 2003-05 |
Campus | UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |