Comparison of Neural Classifiers and Conventional Approaches to Mode Choice Analysis

by

Stella Yu Wai Chow

B.A.Sc., The University of British Columbia, 2000

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
(Department of Civil Engineering)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

April 2002

© Stella Yu Wai Chow, 2002

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Civil Engineering
The University of British Columbia
Vancouver, Canada

ABSTRACT

This thesis provides a comparison of three modeling techniques which can be used for mode choice analysis: the conventional logit model, artificial neural networks (ANNs), and neurofuzzy models. The three modeling techniques were applied to mode choice data extracted from the 1999 24-hour trip diary survey of the Greater Vancouver Regional District. The travel mode of each individual was explained using explanatory variables acquired from three categories of the database: the household database, the personal database, and the trip database. The results showed that, as modeling techniques, both ANNs and neurofuzzy models are highly adaptive and very efficient in dealing with problems involving complex interrelationships among many variables. The neurofuzzy technique combines the learning ability of artificial neural networks and the transparent nature of fuzzy logic.
In addition, the neurofuzzy technique selects only the variables that significantly influence mode choice and displays the stored knowledge in terms of fuzzy linguistic rules. This allows the modal decision-making process to be examined and understood in great detail. The results of the comparison also indicated that the neurofuzzy models produced the best results in terms of model accuracy and selected the fewest variables to achieve these results.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
CHAPTER 1 INTRODUCTION
  1.1 BACKGROUND
  1.2 THESIS OBJECTIVE
  1.3 THESIS STRUCTURE
CHAPTER 2 LITERATURE REVIEW
  2.1 BACKGROUND
  2.2 MODE CHOICE
    2.2.1 CONVENTIONAL MODE CHOICE MODELING
      2.2.1.1 LOGIT MODEL
      2.2.1.2 PROBIT MODEL
    2.2.2 ARTIFICIAL NEURAL NETWORKS
    2.2.3 FUZZY LOGIC
    2.2.4 NEUROFUZZY
  2.3 PREVIOUS WORK
CHAPTER 3 DATA COLLECTION
  3.1 INTRODUCTION
  3.2 DATA DESCRIPTION
  3.3 TRAINING AND TESTING DATA
  3.4 METHODOLOGY
    3.4.1 INTRODUCTION TO NEUFRAME 3.0
CHAPTER 4 DATA ANALYSIS AND RESULTS
  4.1 INTRODUCTION
  4.2 CONVENTIONAL MODE CHOICE LOGIT MODELS
    4.2.1 METHODOLOGY
    4.2.2 RESULTS OF THE WORK/SCHOOL TRIP
    4.2.3 RESULTS OF THE RECREATIONAL TRIP
  4.3 ANN MODE CHOICE MODELS
    4.3.1 METHODOLOGY
    4.3.2 RESULTS OF THE WORK/SCHOOL TRIP
    4.3.3 RESULTS OF THE RECREATIONAL TRIP
  4.4 NEUROFUZZY MODE CHOICE MODELS
    4.4.1 METHODOLOGY
    4.4.2 RESULTS OF THE WORK/SCHOOL TRIP
    4.4.3 RESULTS OF THE RECREATIONAL TRIP
  4.5 CONCLUSION
CHAPTER 5 ANALYSIS AND DISCUSSION
  5.1 INTRODUCTION
  5.2 CROSS COMPARISON ANALYSIS
    5.2.1 THE WORK/SCHOOL TRIP
      5.2.1.1 CROSS COMPARISON (I)
      5.2.1.2 CROSS COMPARISON (II)
    5.2.2 THE RECREATIONAL TRIP
      5.2.2.1 CROSS COMPARISON (I)
      5.2.2.2 CROSS COMPARISON (II)
  5.3 DISCUSSION
    5.3.1 PREDICTION ACCURACY
    5.3.2 THE NUMBER OF INPUT VARIABLES
    5.3.3 THE SIGNIFICANCE OF INPUT VARIABLES
    5.3.4 CLASSIFICATION THRESHOLD
    5.3.5 NETWORK TRANSPARENCY
  5.4 CONCLUSION
CHAPTER 6 CONCLUSION
  6.1 SUMMARY
  6.2 CONCLUSIONS
CHAPTER 7 REFERENCE
APPENDICES
  APPENDIX 4-A SAMPLE CALCULATION OF THE PROBABILITY OF AN EVENT OCCURRING (LOGIT MODEL)
  APPENDIX 4-B RESULTS OF THE LOGIT MODE CHOICE MODELS FOR THE WORK/SCHOOL TRIP
  APPENDIX 4-C RESULTS OF THE LOGIT MODE CHOICE MODELS FOR THE RECREATIONAL TRIP
  APPENDIX 4-D SAMPLE CALCULATION OF THE KAPPA STATISTIC
  APPENDIX 5-A RESULTS OF THE LOGIT MODEL FOR CROSS COMPARISONS (II)
LIST OF TABLES

Table 2-1: Previous Research
Table 3-1: Travel Modal Share of the Two Trip Purposes
Table 3-2: Data Variables
Table 3-3: Descriptive Statistics of the Data Set for the Work/School Trip
Table 3-4: Descriptive Statistics of the Data Set for the Recreational Trip
Table 4-1: Approximate Strength of Agreement of Kappa statistic (Landis and Koch 1997)
Table 4-2: Result of the Logit Model with 20 Variables for the Work/School Trip
Table 4-3: Prediction Results of the Logit Model with 20 Variables for the Work/School Trip (Classification Threshold = 0.5)
Table 4-4: Logit Models with Variable Licenses and without Variable Licenses for the Work/School Trip (Classification Threshold = 0.5)
Table 4-5: Summary of All 5 Logit Models for the Work/School Trip
Table 4-6: The Best Logit Model with 14 Variables for the Work/School Trip
Table 4-7: Result of the Logit Model with 20 Variables for the Recreational Trip
Table 4-8: Prediction Results of the Logit Model with 20 Variables for the Recreational Trip (Classification Threshold = 0.5)
Table 4-9: Logit Models with Variable Licenses and without Variable Licenses for the Recreational Trip (Classification Threshold = 0.5)
Table 4-10: Summary of All 5 Logit Models for the Recreational Trip
Table 4-11: Result of the Logit Model with 8 Variables for the Recreational Trip
Table 4-12: Result of the Logit Model with 7 Variables for the Recreational Trip
Table 4-13: Prediction Results of the ANN Model with 20 Variables for the Work/School Trip (Classification Threshold = 0.50)
Table 4-14: Success Rate of the ANN Model for the Work/School Trip under Different Confidence Levels
Table 4-15: Prediction Results of the ANN Model with 20 Variables for the Recreational Trip (Classification Threshold = 0.50)
Table 4-16: Success Rate of the ANN Model for the Recreational Trip under Different Confidence Levels
Table 4-17: Prediction Results of Neurofuzzy Model 1 for the Work/School Trip (Classification Threshold = 0.50)
Table 4-18: Prediction Results of Neurofuzzy Model 2 for the Work/School Trip (Classification Threshold = 0.50)
Table 4-19: Success Rates of Neurofuzzy Models for Work/School Trip under Different Confidence Levels
Table 4-20: Prediction Results of Neurofuzzy Model 1 for the Recreational Trip (Classification Threshold = 0.50)
Table 4-21: Prediction Results of Neurofuzzy Model 2 for the Recreational Trip (Classification Threshold = 0.50)
Table 4-22: Success Rates of Neurofuzzy Models for the Recreational Trip under Different Confidence Levels
Table 5-1: Prediction Results of the ANN Model Based on the Best Logit Model for the Work/School Trip (Classification Threshold = 0.50)
Table 5-2: Prediction Results of the Neurofuzzy Model Based on the Best Logit Model for the Work/School Trip (Classification Threshold = 0.50)
Table 5-3: Prediction Results of the Logit Model Based on the Best Neurofuzzy Model for the Work/School Trip (Classification Threshold = 0.50)
Table 5-4: Prediction Results of the ANN Model Based on the Best Neurofuzzy Model for the Work/School Trip (Classification Threshold = 0.50)
Table 5-5: Summary of All Comparisons for the Work/School Trip
Table 5-6: Prediction Results of the ANN Model Based on the Best Logit Model for the Recreational Trip (Classification Threshold = 0.50)
Table 5-7: Prediction Results of the Neurofuzzy Model Based on the Best Logit Model for the Recreational Trip (Classification Threshold = 0.50)
Table 5-8: Prediction Results of the Logit Model Based on the Best Neurofuzzy Model for the Recreational Trip (Classification Threshold = 0.50)
Table 5-9: Prediction Result of the ANN Model Based on the Best Neurofuzzy Model for the Recreation Trip (Classification Threshold = 0.50)
Table 5-10: Summary of All Comparisons for the Recreational Trip
Table 5-11: Variable Comparison between the Best Logit and Neurofuzzy Models (the Work/School Trip)
Table 5-12: Variable Comparison between the Best Logit and Neurofuzzy Models (the Recreational Trip)
Table 5-13: Fuzzy Set Representation
Table 5-14: Fuzzy Rules Used by the Neurofuzzy Model for the Work/School Trip

LIST OF FIGURES

Figure 2-1: Urban Transportation Planning Process (Khisty and Lall 1998)
Figure 2-2: Four Basic Models Used in Transportation Planning (Khisty and Lall 1998)
Figure 2-3: Model Split Model in Two-Mode Situation (Khisty and Lall 1998)
Figure 2-4: A typical Multi-layer Feed-forward Neural Network (Yang et al. 1993)
Figure 2-5: A Neuron in the Hidden Layer (Maier et al. 2000)
Figure 2-6: Logistic Function
Figure 2-7: Flow Chart of a Back-Propagation Training Algorithm (Yang et al. 1993)
Figure 2-8: Triangular Fuzzy Sets for Temperature
Figure 2-9: The Basic Components of a Fuzzy System (Maier et al. 2000)
Figure 2-10: B-spline Functions with k=1 to 4 (Bossley 1997)
Figure 2-11: Structure of a Typical B-Spline Neurofuzzy System (Sayed and Razavi 2000)
Figure 2-12: Additive Model Structure (Bossley 1997)
Figure 2-13: Univariate Addition and Univariate Deletion (Bossley 1997)
Figure 2-14: Tensor Product and Tensor Split (Bossley 1997)
Figure 2-15: Knot Insertion and Knot Deletion (Bossley 1997)
Figure 3-1: Greater Vancouver Regional District Subregions
Figure 3-2: Mode Choice Distribution of Recreational Trip
Figure 3-3: Mode Choice Distribution of Work/School Trip
Figure 3-4: Fuzzy Rules in NEUframe Version 3.0
Figure 3-5: Rule Fire Matrix in NEUframe Version 3.0
Figure 4-1: Structure of an ANN Mode Choice Model
Figure 4-2: Structure of Neurofuzzy Model 1 for the Work/School Trip
Figure 4-3: Comparison of Neurofuzzy Model 1 and Model 2 for the Work/School Trip
Figure 5-1: Classification Thresholds Comparison for the Work/School Trip
Figure 5-2: Classification Thresholds Comparison for the Recreational Trip
Figure 5-3: Membership Functions of the Neurofuzzy Model

ACKNOWLEDGEMENTS

First of all, I would like to give special thanks to my supervisor, Dr. Tarek Sayed, for his immeasurable guidance and patience. It was a great pleasure to complete this thesis under his supervision. Second, I would also like to thank Mr. Clark Lim, a senior transportation engineer at Translink, for spending hours retrieving data on my behalf and giving me helpful direction for handling the data. Next, I would like to thank my parents and my sister for supporting me throughout my studies. My Mom and Dad have always believed that learning is the most important stage in life. They have always given my sister and me the best opportunities to learn. THANK YOU! I am sure they will be proud of me for completing a master's degree.
Finally, I would like to thank Mr. Neon Koon, who has given me support throughout both my graduate and undergraduate programs.

CHAPTER 1 INTRODUCTION

1.1 Background

The rapidly growing number of single-occupancy vehicles is steadily reducing the available road space. This has a significant impact on the environment, on economic development, and on quality of life. To enhance the existing transportation system, a good transportation plan is needed to improve the movement of people and goods. Travel demand forecasting models play an important role in transportation planning. The traditional travel demand forecasting model involves four steps: trip generation, trip distribution, mode choice, and trip assignment. Among these four steps, mode choice is perhaps the most critical since it affects many transportation policy issues such as travel cost and time. What would the mode shift be if the transit fare were decreased? Would this attract 20% more transit riders? Such questions cannot be answered easily without an efficient mode choice model. A good mode choice model is needed to examine the relationship between an individual's mode choice and variables such as policy, demographic, and group variables. Mode choice models help determine how commuters can travel more efficiently. Mode choice modeling techniques are employed in a wide range of applications, such as route choice modeling (Schwartz et al. 1999) and freight transport (Abdelwahab and Sayed 1998).

Different mathematical mode choice models have been used. The most widely used conventional mode choice models are probably the logit and probit models. They are parametric models, which are restricted to specific distribution functions (Sayed and Razavi 2000).
With a logistic distribution, the resulting model is the logit model; with the standard normal distribution, the resulting model is the probit model. In addition, the relationships between the inputs and outputs of a mode choice model are complex and highly non-linear. The logit and probit models, which use linear regression to develop mode choice models, may not produce good results when applied to such complex, non-linear problems. Therefore, exploring other advanced modeling approaches may improve the mode choice modeling process. Two of these advanced modeling techniques are artificial neural networks (ANNs) and neurofuzzy models.

An artificial neural network (ANN) attempts to mimic, in a very simplified way, the neural structure and functions of the human brain. ANNs are highly adaptive and very efficient in modeling complex, non-linear problems. Neurofuzzy systems combine the transparent, linguistic representation of a fuzzy system with the learning ability of ANNs. Both neural networks and fuzzy systems have their strengths and weaknesses. Neural networks are good at recognizing patterns; they have the ability to learn from a data set. However, a neural network remains a "black box" whose outputs are difficult to interpret, which makes it hard to make necessary changes to the model. Because of this implicit knowledge representation, long-term experience and great effort, for example in identifying the significant input variables and setting the parameters of the learning and training algorithms, are required to obtain the optimal model. The advantage of a fuzzy system is that its output is easy to interpret, so an explicit knowledge representation can be generated for optimization. As well, linguistic variables are used in fuzzy systems to represent natural-language terms such as very cold, cold, and warm. This allows a gradual transition between membership and non-membership of a set.
As a result, computer programs can reason in a more human-like way. However, the main disadvantage of a fuzzy system is that expert knowledge is required to derive the "if-then" rules from the data set manually and to describe the membership functions. This takes a great deal of effort, especially for large data sets. Therefore, the combination of fuzzy systems and neural networks, called neurofuzzy, has been employed to overcome the weaknesses of each individual technique. Neurofuzzy combines the transparent representation of a fuzzy system with the learning capability of neural networks. A fuzzy system can be set up automatically as a neural network and then trained. This allows further insight into the modeling process.

To develop an efficient mode choice model, the number and types of explanatory variables required for calibrating the model are critical because a rich data source is usually not available. Models with fewer variables reduce the computational cost and time. In addition, a good mode choice model not only provides accurate predictions but also provides insight into the modeling process, so that relationships between the explanatory variables and the travel behavior of commuters can be investigated. Furthermore, the ability to add expert knowledge to the developed model for further adjustment would also be a useful feature of a good mode choice model.

1.2 Thesis Objective

In this thesis, three modeling techniques, namely the conventional logit model, artificial neural networks (ANNs), and neurofuzzy approaches, were used to develop mode choice models. The main objective of this thesis is to compare the results of the three modeling techniques and identify their advantages. All three modeling techniques were applied to the data set obtained from the 24-hour trip diary survey of the Greater Vancouver Regional District (GVRD).
The data set consisted of three categories: household, personal, and trip characteristics. Two trip purposes are used in the thesis: the "to and from work/post-secondary school" trip and the "social/recreational/personal" trip. The two travel modes are transit and automobile drive. Analyses of the results and comparisons were carried out to evaluate the performance of each modeling technique.

1.3 Thesis Structure

This thesis consists of six chapters. Chapter 2 presents a literature review on the background of transportation planning, mode choice analysis, and the three modeling techniques (conventional, ANNs, and neurofuzzy), as well as the results of previous research comparing artificial intelligence methods with conventional methods. Chapter 3 provides a detailed description and statistical summary of the data set used in this thesis, together with a description of the neurofuzzy software NEUframe. In Chapter 4, the data analysis and the results of the three mode choice modeling techniques are presented. Comparisons and discussion of all results are presented in Chapter 5. Finally, the research conclusions and recommendations for future research are provided in Chapter 6.

CHAPTER 2 LITERATURE REVIEW

2.1 Background

Transportation has a significant impact on our daily life. It affects our quality of life, the quality of the environment, and the economic development of our society. The aim of transportation planning is to help prevent problems such as traffic congestion, adverse land use patterns, unsafe travel patterns, severe environmental impacts such as air pollution, and the unwise use of government funding (Beimborn 1995). These problems are often related to transportation policies, which lead to changes in traveler behavior and changes to the existing infrastructure.
As a result, good transportation planning is necessary to develop concise and useful information for making decisions on the development and management of future transportation systems, so that the movement of people and goods can be improved, particularly in urban areas (Khisty and Lall 1998, Beimborn 1995).

Transportation planning is a complicated process which involves many stakeholders: public organizations (local and regional agencies), private organizations (developers, bankers, land owners), and other interest groups (environmental agencies). The transportation planning process includes a sequence of steps such as data collection, forecasting, and developing alternatives (Beimborn 1995). Figure 2-1 shows the urban transportation planning process. There are two elements in the transportation plan: the long-range element and the transportation systems management (TSM) element. The former identifies major construction, changes to existing facilities, and long-range policy actions, such as building a new freeway. The latter improves the current transportation system and provides for short-range transportation enhancements. Travel demand forecasting is an important planning tool. It is used in many elements of the planning process including TSM, long-range elements, plan refinement, and the updating process (Khisty and Lall 1998).

[Figure 2-1: Urban Transportation Planning Process (Khisty and Lall 1998). Flowchart stages: Organization (MPO, State, Operating Agencies); Planning Work Programs (Prospectus, Unified Planning Work Program); Transportation Plan: Long-Range Element and Transportation Plan: Transportation Systems Management Element (each with Planning Tools, Evaluation of Plan Alternatives, Selection of Plan Element); Plan Refinement; Transportation Improvement Program (Staged Multiyear Element, Annual Element).]

Travel demand forecasting models play an important role in transportation engineering. They project the changes that future traffic will require, such as changes in highway capacity and public transportation development. Before the travel demand forecasting process begins, information on the study area, urban activities, the transportation system, and travel is required (Khisty and Lall 1998). With regard to the study area, the planner should define the boundaries and subdivide the study area into transportation analysis units such as zones, districts, sectors, or rings. Afterwards, urban activity information for the transportation analysis units can be collected, and the network geometry of the transportation system can then be coded, for example by numbering the intersections (Khisty and Lall 1998).

The traditional travel demand forecasting model involves four steps: trip generation, trip distribution, mode choice, and trip assignment. The study area is divided into zones. Zones differ in size, and each zone is characterized by its own population, employment rate, land development, etc. Figure 2-2 illustrates the four-step travel demand forecasting model. Feeding into travel demand forecasting are the urban activity forecasts, which are needed to estimate future population, economic activity, and land use (Beimborn 1995). Trip generation, the first of the four steps, estimates the number of trips by purpose that will be produced by and attracted to each zone.
Home-based school trips and home-based work trips are two typical trip purposes. A home-based school trip starts from home and ends at school, while a home-based work trip starts from home and ends at the workplace. After the trip production and trip attraction of each zone have been estimated, trip distribution is the next step; it determines the number of trips between each pair of origin and destination zones. The Fratar method and the gravity model are the most widely used models. The details of each method are explained in transportation planning textbooks. Following trip distribution is the determination of how trips between the origin and the destination split among different travel modes; in other words, how people will choose their travel modes. Mode choice modeling is used for this purpose. Finally, trip assignment analyzes the routes that people will use to travel from trip origin to trip destination. Thus, traffic flow patterns can be developed and congestion points can be located for further investigation.

[Figure 2-2: Four Basic Models Used in Transportation Planning (Khisty and Lall 1998). Flowchart boxes: Urban Activity, Trip Generation, Trip Distribution, Mode Usage, Highway Assignment, Transit Assignment.]

2.2 Mode Choice

Mode choice is one of the most critical steps in four-stage travel demand forecasting. Mode choice affects many transportation policy issues such as travel time and cost (Horowitz et al. 1986). For instance, how will the number of transit riders change if the fare is increased? Will many transit riders switch to other travel modes such as driving? Will transit revenue remain at an acceptable level? What about constructing a new station or changing a transit route? What effects will the fare increase have on ridership, profit, and traffic? Many policy variables affect the mode choice of travelers.
Thus, predicting the change in mode choice caused by any change in the relevant variables is important. There are many methods of mode choice modeling, such as elasticity, aggregate, and disaggregate mode choice modeling (Horowitz et al. 1986). Aggregate mode choice modeling relies on either zonal travel information or the characteristics of groups of travelers. Disaggregate mode choice models, on the other hand, are based on the mode choices made by individual travelers. Disaggregate mode choice models have played a significant role in travel demand analysis for 25 years (Bierlaire 1997). In this thesis, a disaggregate mode choice model is employed.

Many factors influence the mode choice of commuters. Basically, they fall into three groups: the socio-economic characteristics of the trip maker, the characteristics of the trip, and the characteristics of the transportation system (Khisty and Lall 1998, Ortuzar and Willumsen 1994). By discovering the relationships between the influential factors and the mode choice of commuters, future mode choice can be predicted. This can also facilitate improvement of the potential travel modes. Some possible factors affecting mode choice are listed below (Khisty and Lall 1998, Ortuzar and Willumsen 1994):

The trip maker
• Family income;
• Number of automobiles available;
• Driver license;
• Household type (age, working status, family size, etc.);
• Density of living area.

The trip
• Average trip length;
• Trip purposes (work trip, school trip, recreational trip, etc.);
• The starting time of the trip (peak hour, off-peak period, etc.).

The transportation system
• Travel time, including waiting time, riding time, and walking time for any travel mode;
• Travel cost (fare, parking fee, fuel, maintenance cost, insurance);
• Convenience and flexibility;
• Safety.

Mode choice analysis can be undertaken at various points in the forecasting process.
The two commonly used approaches are the direct generation usage model, also called the trip end model, and the trip interchange mode usage model (Khisty and Lall 1998). The former is applied immediately after trip generation, whereas the latter is applied after trip distribution. Figure 2-3 shows both mode choice models in a two-mode situation: transit and automobile. The direct generation usage model, illustrated in Figure 2-3(a), generates trips by mode before distributing trips to destinations. This model was popular in the past, particularly in the US, because people believed that individual characteristics played an important role in mode choice modeling (Ortuzar and Willumsen 1994). With respect to model calibration, the direct generation usage model is easy to apply and requires only a small data sample (Meyer and Miller 1984). Nevertheless, because this method depends only on individual characteristics, policy decisions such as increasing parking fees or improving the travel time of public transportation have no effect on the predicted travel mode selection (Ortuzar and Willumsen 1994). Therefore, competing travel modes cannot be compared easily. Conversely, the trip interchange mode usage model, shown in Figure 2-3(b), is the most common approach because the competing travel modes can be compared once the destinations of the travelers are known. Thus, the characteristics of the transportation network are more measurable (Khisty and Lall 1998).
[Figure 2-3: Model Split Model in Two-Mode Situation (Khisty and Lall 1998). Panels: (a) Direct-generation usage mode; (b) Trip interchange usage mode. Flowchart boxes include land-use characteristics, socioeconomic characteristics, trip generation, person trip distribution, mode usage, auto occupancy, transit traffic assignment, and highway traffic assignment.]

2.2.1 Conventional Mode Choice Modeling

The logit and probit models are the two most widely used conventional mode choice models. They predict a decision made by an individual as a function of a number of variables. This function is based on the utility of competing modes (Meyer and Miller 1984). The logit and probit models are explained in sections 2.2.1.1 and 2.2.1.2 respectively.

2.2.1.1 LOGIT MODEL

Model Structure and Operation

A disaggregate binary logit mode choice model is used when two mode choices are available, such as transit and private car. If there are more than two modes, a multinomial logit model can be used. A logit model analyzes the change in the log odds of the dependent variable with every unit change in the independent variables. The mathematical function is based on the utility of each competing mode. A competing travel mode with a high utility has more potential travelers. This is based on the behavioral principle called "utility maximization", in which an individual chooses the alternative with the maximum utility (Meyer and Miller 1984). Nevertheless, because not all components of the utility function can be measured, the utility function for individual i choosing mode k includes a deterministic part ($V_{ik}$) and a random part ($\varepsilon_i$).
The deterministic part contains the observable components; the random part consists of unmeasurable components such as travelers' perceptions when choosing their travel modes (Meyer and Miller 1984). A linear utility function (Abdelwahab and Sayed 1998) can be expressed as:

$$U_{ik} = \alpha_k + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_j X_j + \varepsilon_i$$

$$U_{ik} = V_{ik} + \varepsilon_i \tag{2-1}$$

where
$i = 1, 2, 3, \ldots, n$ (number of individuals);
$U_{ik}$ = utility of individual i choosing mode k;
$V_{ik}$ = the deterministic utility of individual i choosing mode k;
$\alpha_k$ = constant for mode k;
$X_1, X_2, \ldots, X_j$ = explanatory variables;
$\beta_1, \beta_2, \ldots, \beta_j$ = corresponding calibration coefficients;
$\varepsilon_i$ = random error term.

A logit model uses an S-shaped (sigmoid) curve in which the regressed values lie in the interval [0, 1], with 0 being no probability and 1 being 100 percent probability of success (Khisty and Lall 1998). This curve ensures that the output probabilities are always between 0 percent and 100 percent. To predict an individual's mode choice, the utility function should be transformed into a probability function. Equation (2-2) is the transformation equation for the binary logit model (Abdelwahab and Sayed 1998); equation (2-3) is for the multinomial logit model (Meyer and Miller 1984).

$$P_k(i) = \left(1 + e^{-(V_{ik} - V_{ij})}\right)^{-1} \tag{2-2}$$

$$P_{ik} = \frac{e^{V_{ik}}}{\sum_{j \in C_i} e^{V_{ij}}} \tag{2-3}$$

where
$P_k(i)$ = probability that individual i chooses mode k;
$P_{ik}$ = probability of individual i choosing alternative k from a set of alternatives j;
$V_{ik}$ = the deterministic utility of individual i choosing mode k;
$V_{ij}$ = the deterministic utility of individual i choosing alternative j;
$C_i$ = a set of alternatives.

Independence of Irrelevant Alternatives

One of the major weaknesses of the logit model is the property of independence of irrelevant alternatives (IIA), in which the alternatives in the choice set are assumed to be independent of each other (Meyer and Miller 1984).
The ratio of the probability of choosing one travel mode to the probability of choosing another mode is independent of the other available modes. This problem is more common in multinomial logit models (Abdelwahab and Sayed 1998, Meyer and Miller 1984). Based on equation (2-3), the ratio of the probabilities of choosing two alternatives is given by equation (2-4) (Meyer and Miller 1984).

$$\frac{P_{ik}}{P_{ig}} = \frac{e^{V_{ik}}}{e^{V_{ig}}} \tag{2-4}$$

where
$P_{ig}$ = probability of individual i choosing alternative g from a set of alternatives;
$V_{ig}$ = the deterministic utility of individual i choosing mode g.

It should be noted that the term $\sum_{j \in C_i} e^{V_{ij}}$ in equation (2-3) has disappeared. This term is the sum of the exponentiated utilities of all competing travel modes in the choice set. Equation (2-4) therefore demonstrates that the ratio of the probabilities of choosing two alternatives is independent of the attributes and the availability of the other competing travel modes. For example, if a multinomial logit model consists of transit, SkyTrain, and automobile, the ratio of the probability of choosing transit to that of choosing automobile is independent of the availability of SkyTrain. When SkyTrain is improved, for instance by decreasing fares to attract more travelers, the probability of choosing SkyTrain will increase; however, the probabilities of transit and automobile will both change by the same proportion. In other words, the ratio of the probability of choosing transit to that of choosing automobile will remain constant. This can lead to serious prediction errors.

The logit model is similar to a regression model in that a function is fitted to the observed data set. Maximum likelihood estimation is used to estimate the coefficients of the utility function through an iterative numerical search. This estimation is employed in both logit and probit models (Abdelwahab and Sayed 1998).
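The IIA property described above can be checked numerically. The following is a minimal sketch, not the calibrated models of this thesis: the function names and all utility values are hypothetical and chosen only for illustration. It evaluates equations (2-2) and (2-3) and confirms that the transit-to-automobile ratio in equation (2-4) does not move when only the SkyTrain utility changes.

```python
import math


def binary_logit(v_k, v_j):
    """Equation (2-2): probability of mode k against a single competing mode j."""
    return 1.0 / (1.0 + math.exp(-(v_k - v_j)))


def multinomial_logit(utilities):
    """Equation (2-3): P_k = exp(V_k) / sum_j exp(V_j) over the choice set."""
    # Subtracting the largest utility avoids overflow and leaves the ratios unchanged.
    v_max = max(utilities.values())
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


# Hypothetical deterministic utilities (illustrative values only).
base = {"transit": -0.5, "skytrain": -0.8, "auto": 0.4}
improved = dict(base, skytrain=-0.2)  # e.g. after a SkyTrain fare reduction

p_base = multinomial_logit(base)
p_improved = multinomial_logit(improved)

# Equation (2-4): the ratio depends only on the two utilities involved.
ratio_base = p_base["transit"] / p_base["auto"]
ratio_improved = p_improved["transit"] / p_improved["auto"]

print("SkyTrain share before/after:", p_base["skytrain"], p_improved["skytrain"])
print("transit/auto ratio before/after:", ratio_base, ratio_improved)
```

The SkyTrain share rises while the transit-to-automobile ratio stays at exp(V_transit - V_auto), which is the behavior the text identifies as a potential source of prediction error.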
2.2.1.2 PROBIT MODEL

The probit model is another method of transforming the utility function into a bounded probability function (Sayed and Razavi 2000). Unlike the logit model, the probit model employs the standard normal cumulative distribution function, which also has an S-shaped curve. This S-shaped curve can explain the relationships between the mode choices of individuals and the associated explanatory variables. From a practical point of view, the coefficients of a probit model are difficult to interpret except in the binary case. In the binary case, the performance of a probit model is comparable to that of a logit model. The binary probit model is shown in equation (2-5) (Abdelwahab and Sayed 1998).

$$P_k(i) = \Phi\left(\frac{V_{ik} - V_{ij}}{\sigma}\right) \tag{2-5}$$

where
$\Phi(\cdot)$ = standardized cumulative normal distribution function;
$\sigma$ = standard deviation of $(\varepsilon_k - \varepsilon_j)$, which sets the arbitrary scale of the utility function.

2.2.2 Artificial Neural Networks

Artificial neural networks (ANNs) have been applied in a wide variety of fields. They have many successful applications in civil engineering, such as traffic engineering and highway safety and maintenance (Abdelwahab and Sayed 1998).

Model Structure and Operation

According to previous studies (Abdelwahab and Sayed 1998, Sayed and Razavi 2000, Rao et al. 1998, Hensher and Ton 2000, Yang et al. 1993), ANNs are composed of numerous interconnected artificial neurons. These neurons are the processing elements and they are arranged in a number of layers. Neurons in adjoining layers are linked through connection weights, which represent the strength of the connection between the neurons. An ANN model usually consists of three types of layers: an input layer, a number of hidden layers, and an output layer.
The number of neurons in the input layer corresponds to the number of input variables of the model; the number of neurons in the output layer corresponds to the number of outputs of the model. Each neuron in a hidden layer receives and combines the weighted outputs of the neurons in the previous layer. It then converts this linear combination into an output using a transfer function, and the converted output is transmitted to the neurons in the following layer (see Figure 2-5). The data is therefore passed from one layer to the next through the links. A typical three-layer feed-forward network is shown in Figure 2-4. Neurons are connected through the connection weights $w_{ij}$ and $w_{jk}$ (Yang et al. 1993). The output of a neuron in the hidden layer is mathematically defined as:

$$y_j = f\left(\sum_{i=1}^{m} w_{ij} x_i\right) \tag{2-6}$$

where f is a transfer function and $x_i$ are the input values in the input layer (Yang et al. 1993). A number of non-linear functions can serve as transfer functions. A logistic function, which is a typical transfer function, is displayed in Figure 2-6 (Hensher and Ton 2000). The output of a neuron in the output layer is the sum of the weighted outputs from each neuron in the last hidden layer.

[Figure 2-4: A Typical Multi-layer Feed-forward Neural Network (Yang et al. 1993). Layers: input layer, hidden layer, output layer.]

[Figure 2-5: A Neuron in the Hidden Layer (Maier et al. 2000)]

[Figure 2-6: Logistic Function. Axes: x (horizontal), f(x) from 0 to 1 (vertical).]

Supervised and Unsupervised Learning

ANNs can be divided into two learning categories: supervised and unsupervised (Abdelwahab and Sayed 1998). In supervised learning, a desired output is required to train the network. The back-propagation algorithm, the most commonly used supervised learning algorithm, can be used for training the network.
It estimates the difference between the predicted and target values and then propagates the errors from the output layer back to the hidden and input layers. The connection weights can therefore be adjusted to reduce the error at each cycle. ANNs have a global response because each weight adjustment can affect the model output for any input. Equations (2-7) and (2-8) give the mathematical functions for adjusting the connection weights (Yang et al. 1993).

$$\min \sum_{k=1}^{q} (z_k - d_k)^2 \tag{2-7}$$

$$\Delta W(n) = \eta\,\delta W(n) + \alpha\,\Delta W(n-1) \tag{2-8}$$

where
$d_k$ = target value of the k-th neuron in the output layer;
$z_k$ = the predicted value;
$\delta$ = the error;
$\Delta W$ = change in weight;
n = cycle number;
$\eta$ = the training rate ($0 < \eta < 1$);
$\alpha$ = the momentum rate ($0 < \alpha < 1$).

The training rate controls the size of the weight adjustments: a larger training rate leads to a bigger change in the connection weights. Similarly, the momentum rate controls the changes in direction of the weights (NEUframe Version 3.0 Manual). The process of the back-propagation algorithm is shown in Figure 2-7.

The performance of ANNs depends on many parameters: the size of the data, the learning rate, and the training strategy. The network complexity depends on the number of hidden layers and neurons in the model.

In unsupervised learning, the desired output is not known, so the weights cannot be adjusted against a target output. This learning method is generally used for clustering the input data by finding the similarities among them (NEUframe Version 3.0 Manual).

[Figure 2-7: Flow Chart of a Back-Propagation Training Algorithm (Yang et al. 1993). Recoverable steps: initialize parameters; set maximum number of iterations N; back-propagation updates W(n+1) = W(n) + ηΔW(n) + αΔW(n-1) and θ(n+1) = θ(n) + ηΔθ(n) + αΔθ(n-1); n = n + 1.]

2.2.3 Fuzzy Logic

Fuzzy logic was developed in 1965 by Lotfi A. Zadeh.
It has been applied to many different problems, such as cruise control for automobiles and single-button control for washing machines. Fuzzy logic makes the interpretation used by a computer similar to that used by human beings (Zadeh 1984), which brings a more human-like way of thinking to computer programming. With fuzzy logic, things are not simply "true" or "false". Instead, there is a degree of truth, expressed as a number between 0 and 1.

Basic Components

Fuzzy logic uses natural-language terms such as cold, warm, hot, small, medium, large, rather small, and quite noisy. Such terms are not precise and they are called linguistic variables (Zadeh 1984, Sayed and Razavi 2000). A fuzzy set is a class with fuzzy boundaries. The class is associated with a grade of membership that defines the degree of membership in that class. Conventionally, classes have sharp or, as they are normally called, "crisp" boundaries (a member either belongs to a class or does not). Fuzzy logic introduces a gradual transition between member and non-member of a set, and computers can process this membership representation. The degree of membership ranges from 1 for a full member to 0 for a full non-member (Zadeh 1984, Sayed and Razavi 2000). Mathematically, a fuzzy set A is a function defined on the domain X:

$$\mu_A(x): X \rightarrow [0, 1] \tag{2-9}$$

where A is a linguistic variable describing the variable x, and $\mu_A(x)$ represents the grade of membership of x in fuzzy set A. Figure 2-8 illustrates the triangular fuzzy sets for temperature (Bossley 1997). For instance, the upper bound of the fuzzy set "warm temperature" is 35°C and its lower bound is 15°C. Thus, a temperature of 25°C has a grade of membership of 1 in the set "warm temperature", while a temperature of either 30°C or 20°C receives a grade of membership of 0.5 in that set.
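The membership grades quoted above follow directly from a triangular membership function. The sketch below is illustrative only: the function name is hypothetical, the "warm" bounds (15, 25, 35°C) come from the text, and the "hot" bounds (25, 35, 45°C) are an assumption read off the tick marks of Figure 2-8.

```python
def triangular_membership(x, lower, peak, upper):
    """Grade of membership of x in a triangular fuzzy set (0 outside [lower, upper])."""
    if x <= lower or x >= upper:
        return 0.0
    if x <= peak:
        return (x - lower) / (peak - lower)   # rising edge
    return (upper - x) / (upper - peak)       # falling edge


def warm(temperature_c):
    # Bounds taken from the text: 15 degC to 35 degC, full membership at 25 degC.
    return triangular_membership(temperature_c, 15.0, 25.0, 35.0)


def hot(temperature_c):
    # Assumed bounds (not stated in the text): 25 degC to 45 degC, peak at 35 degC.
    return triangular_membership(temperature_c, 25.0, 35.0, 45.0)


for t in (20.0, 25.0, 30.0):
    print(t, "warm:", warm(t), "hot:", hot(t))
```

Under these assumed bounds, 30°C is partly "warm" and partly "hot" at the same time, which is the gradual transition between sets that distinguishes fuzzy from crisp classes.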
[Figure 2-8: Triangular Fuzzy Sets for Temperature. Fuzzy sets: very cold, cold, warm, hot, very hot; horizontal axis: temperature (°C), marked at 5, 15, 25, 35, 45.]

Fuzzy Systems

A fuzzy system uses a set of "IF THEN" rules to process information. Fuzzy logic rules represent the relationships between the inputs and the outputs of a system (Bossley 1997). The following rule is for an n-dimensional input, single-output fuzzy system (Bossley 1997):

$$r_{ij}: \text{IF } (x_1 \text{ is } A_1^i) \text{ AND } \ldots \text{ AND } (x_n \text{ is } A_n^i) \text{ THEN } (y \text{ is } B^j) \quad (c_{ij}) \tag{2-10}$$

where $A_k^i$, k = 1, ..., n, and $B^j$ are the fuzzy membership functions for the inputs and the output respectively, and $c_{ij}$ represents the corresponding rule confidence. The rule confidence, which has a range of [0,1], represents the confidence level of the fuzzy rule. For example, $c_{ij} = 1$ denotes that the fuzzy rule contributes 100 percent to the output of the fuzzy system. The part of the sentence between "IF" and "THEN" is the antecedent of the rule. The part after "THEN" is the consequent of the rule (Bossley 1997). The following is an example of a fuzzy system:

IF the "room temperature" is low THEN "increase the heat".
IF the "room temperature" is high THEN "decrease the heat".

How much the heat should be increased depends on the degree to which "room temperature is low" is true.

The basic components of a fuzzy system are shown in Figure 2-9 (Bossley 1997, Maier et al. 2000). There are three processes within the system: fuzzification, fuzzy inference, and defuzzification. Fuzzification converts crisp data into fuzzy numbers within the fuzzy domains; fuzzy inference, or rule firing, is the computation on the fuzzy numbers; defuzzification converts the computed fuzzy outputs back to crisp data.

[Figure 2-9: The Basic Components of a Fuzzy System (Maier et al. 2000). Components: crisp inputs, fuzzifier, fuzzy inference (using fuzzy sets and fuzzy rules), defuzzifier, crisp output.]
B-Spline Functions

A B-spline function is one way to represent fuzzy sets. B-spline functions are piecewise polynomials (Sayed and Razavi 2000). The smoothness of a B-spline function depends on its order k: the B-spline is piecewise constant when k=1, piecewise linear when k=2, piecewise quadratic when k=3, and piecewise cubic when k=4 (Bossley 1997). Different orders of B-splines can therefore be used to represent fuzzy sets. The smoothness of the function output increases as k increases (Bossley 1997). Figure 2-10 shows some B-spline functions of different orders k. B-spline functions have a number of advantageous properties that make them suitable for representing fuzzy membership functions (Bossley 1997, Harris et al. 1995). Some of these advantages are:

i. B-spline functions are transparent and straightforward. Their steady recurrence makes the interpretation of the degree of fuzzy membership simple.
ii. B-spline functions are piecewise polynomials of different orders with compact support. This allows information to be stored locally over only a small number of basis functions.
iii. The basis functions form a partition of unity, $\sum_i \mu_{A_i}(x) = 1$. This produces a precise estimation.

[Figure 2-10: B-spline Functions with k = 1 to 4 (Bossley 1997). Panels include piecewise constant and piecewise linear functions; horizontal axes: input knots.]

2.2.4 Neurofuzzy

ANNs are good at recognizing patterns, but they are not good at explaining their results. The results obtained from ANNs remain a "black box" that is difficult to interpret when changes to the model are needed. In contrast, the outputs of fuzzy systems are easy to understand because of the use of linguistic variables. However, the main disadvantage of fuzzy systems is that expert knowledge is required to derive the "IF THEN" rules from the data set manually.
This requires significant effort, especially for large data sets. The combination of fuzzy systems and ANNs has therefore been employed to overcome the weaknesses of each individual system. Neurofuzzy modeling combines the transparent representation of a fuzzy system with the learning capability of neural networks. In other words, a fuzzy system can be set up as a neural network and trained. This allows further insight into the modeling process.

Model Structure and Operation

B-spline Associative Memory Networks (B-spline AMNs) are one kind of neurofuzzy model. It has been shown that B-spline AMNs and some types of fuzzy models are learning equivalent (Maier et al. 2000, Mills and Harris 1996). Figure 2-11 shows the typical structure of a B-spline AMN model. A p-dimensional lattice normalizes the input space. There is only one hidden layer in a neurofuzzy system, and the nodes in the hidden layer are composed of basis functions. Functions such as threshold, B-spline, and Gaussian functions can be used as basis functions; in the case of B-spline AMNs, B-spline functions are used. The size, shape, and overlap of these basis functions determine the complexity and structure of a model (Maier et al. 2001). The output y is simply the weighted sum of the normalized fuzzy membership functions from the basis function layer (Sayed and Razavi 2000, Maier et al. 2000). Equation (2-11) represents the output of the B-spline neurofuzzy network.

$$y = \sum_{i=1}^{p} a_i w_i = \mathbf{a}(\mathbf{x})^T \mathbf{w} \tag{2-11}$$

where
$\mathbf{a}(\mathbf{x}) = (a_1(\mathbf{x}), \ldots, a_p(\mathbf{x}))$ is the output vector of the basis functions;
$\mathbf{x} = (x_1, \ldots, x_n)$ is the vector of input variables;
$\mathbf{w} = (w_1, \ldots, w_p)$ is the vector of weights of the system.

AMNs have a local response. Given a set of model inputs, only a small number of weights contribute to the output of the model (Maier et al. 2000). Consequently, the adjustment of weights does not affect every input; only the weights that contribute to the model output are trained and adjusted.
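The local response can be made concrete with a one-input example. The following is a minimal sketch under stated assumptions, not the NEUframe implementation: it uses order-2 (piecewise linear) B-splines on hypothetical knots, hypothetical weights, and a simple error-proportional weight correction chosen only to show which weights move.

```python
def linear_bspline_memberships(x, knots):
    """Order-2 (piecewise linear) B-spline memberships of x, one value per knot."""
    memberships = []
    for i, centre in enumerate(knots):
        value = 0.0
        if x == centre:
            value = 1.0
        elif i > 0 and knots[i - 1] < x < centre:
            value = (x - knots[i - 1]) / (centre - knots[i - 1])
        elif i < len(knots) - 1 and centre < x < knots[i + 1]:
            value = (knots[i + 1] - x) / (knots[i + 1] - centre)
        memberships.append(value)
    return memberships


def amn_output(memberships, weights):
    """Equation (2-11): weighted sum of the basis function outputs."""
    return sum(a * w for a, w in zip(memberships, weights))


# Hypothetical knots and weights, for illustration only.
knots = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = [0.0, 0.2, 0.9, 0.4, 0.1]

x = 1.25
a = linear_bspline_memberships(x, knots)
y = amn_output(a, weights)

# Assumed update rule: move each weight in proportion to its own membership.
target, rate = 0.5, 0.5
error = target - y
updated = [w + rate * error * a_i for w, a_i in zip(weights, a)]
changed = [i for i, (old, new) in enumerate(zip(weights, updated)) if new != old]

print("memberships:", a)
print("output:", y)
print("indices of adjusted weights:", changed)
```

Only the two basis functions that overlap the input have non-zero membership, so only their weights are adjusted; the memberships also sum to one, matching the partition-of-unity property of B-splines.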
In other words, the information stored in the network is used locally, which increases the convergence speed. In addition, the model can integrate new information into the network without modifying the information already stored elsewhere in the model.

[Figure 2-11: Structure of a Typical B-Spline Neurofuzzy System (Sayed and Razavi 2000). Labels: normalized input space; basis functions.]

Curse of Dimensionality

There are many different combinations of ANNs and fuzzy systems. However, many of them suffer from the "curse of dimensionality" (Sayed and Razavi 2000, Mills and Harris 1996). If a fuzzy system has N input variables and each input variable has P membership functions, the fuzzy system will have as many as $P^N$ potential fuzzy rules. For example, if there are 6 variables and each variable has 5 fuzzy sets, the number of potential fuzzy rules will be as large as $5^6 = 15625$. Because of this, many neurofuzzy structures cannot be applied to problems with a large number of inputs (high-dimensionality problems).

Additive Neurofuzzy Networks

As mentioned above, one of the major weaknesses of the conventional neurofuzzy model is the "curse of dimensionality". Many interactions among the input variables are irrelevant to the model. Additive modeling, which decomposes the high-dimensional neurofuzzy model into several low-dimensional, conventional neurofuzzy sub-models, can therefore be employed to mitigate the effect of the curse of dimensionality. ANOVA (ANalysis Of VAriance) is a decomposition approach for additive modeling (Sayed and Razavi 2000, Maier et al. 2000, Bossley 1997). It breaks down the output function into a number of sub-functions. The ANOVA representation is shown in equation (2-12):

$$f(\mathbf{x}) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} f_{i,j}(x_i, x_j) + \cdots + f_{1,2,\ldots,n}(\mathbf{x}) \tag{2-12}$$

where $f_0$ represents a constant, the second term contains the univariate sub-functions, the third term contains the bivariate sub-functions, and so on. Because some combinations of input variables are redundant, each sub-model has a smaller set of fuzzy rules. Thus, the structure becomes simpler and more transparent. Figure 2-12 shows two possible neurofuzzy hierarchical structures that can overcome the curse of dimensionality (Bossley 1997). In that figure, the problem originally has 4 input variables (a 4D model). It is decomposed into two 1D sub-models and one 2D sub-model. If there are 5 fuzzy sets for each input variable, the number of fuzzy rules is reduced from $5^4 = 625$ to $5 + 5 + 5^2 = 35$.

[Figure 2-12: Additive Model Structure (Bossley 1997)]

There are many advantages to using additive modeling. For instance, it can eliminate most of the redundant input variables, thus producing a parsimonious model. In addition, each sub-model is still represented by fuzzy rules, so the transparency of the model is maintained. Moreover, the outputs of the sub-models are simply weighted and then added together, which retains the linear structure (Bossley 1997, Mills and Harris 1996).

ASMOD

Models with numerous parameters are associated with high-dimensionality problems and greatly increase computational cost and time. It is therefore better to develop a model that maintains the desired output with the smallest number of model parameters (Maier et al. 2000). ASMOD (Adaptive Spline Modeling of Observational Data) uses B-splines as its basis functions. Based on a set of training data, ASMOD uses ANOVA to automatically decompose a high-dimensional B-spline model additively into several lower-dimensional neurofuzzy sub-models. The process begins with a simple structure, which is then refined through an iterative search so that the error of the model is minimized.
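Because ASMOD searches over additive structures, the size of a candidate structure can be measured by its number of potential fuzzy rules. The sketch below is only a bookkeeping aid with hypothetical function names; it reproduces the rule counts quoted earlier in this section for a full lattice and for an additive decomposition.

```python
def full_lattice_rules(n_inputs, sets_per_input):
    """Potential rules of one sub-model over all inputs: P ** N."""
    return sets_per_input ** n_inputs


def additive_rules(submodel_dimensions, sets_per_input):
    """Total rules of an additive model: one lattice per sub-model, summed."""
    return sum(sets_per_input ** d for d in submodel_dimensions)


# 6 inputs with 5 fuzzy sets each (the curse-of-dimensionality example).
print(full_lattice_rules(6, 5))

# 4 inputs with 5 fuzzy sets each, before and after decomposition
# into two 1D sub-models and one 2D sub-model.
print(full_lattice_rules(4, 5))
print(additive_rules([1, 1, 2], 5))
```

The saving grows quickly with the number of inputs, which is why an additive structure is attractive for a data set with many explanatory variables.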
There are three refinements: univariate addition, tensor multiplication, and knot insertion (Bossley 1997, Mills and Harris 1996).

ASMOD has the ability to incorporate a new univariate input variable into the existing model. Figure 2-13 shows an example of univariate addition and deletion. From left to right, variable $x_5$ is added to the existing model. From right to left, variable $x_5$ is removed from the model, which demonstrates univariate deletion.

[Figure 2-13: Univariate Addition and Univariate Deletion (Bossley 1997). Labels: univariate addition; subnetwork deletion.]

In tensor multiplication, ASMOD replaces a sub-model with one that also depends on one or more other input variables within the network. Figure 2-14 shows an example of tensor multiplication and tensor split. From left to right, variable $x_3$ is incorporated into the top sub-model. Thus, variable $x_3$ interacts with both $x_1$ and $x_4$ after the tensor multiplication. The reverse of this process is called tensor split.

[Figure 2-14: Tensor Product and Tensor Split (Bossley 1997)]

Knot insertion is a process in which an extra knot (a new basis function) is inserted into an input fuzzy set to increase the flexibility of a sub-model. Figure 2-15 shows an example of knot insertion. Again, looking from left to right, an extra knot is inserted into the top sub-model.

[Figure 2-15: Knot Insertion and Knot Deletion (Bossley 1997)]

As a consequence, ASMOD can automatically establish the required number of input variables, the number of sub-models, and the number and size of the basis functions.

Maier et al. (2000) used a B-spline neurofuzzy model to forecast concentrations of the cyanobacterium Anabaena spp. in the River Murray in Australia. They illustrated that the stopping criterion played an essential role in the results obtained from the B-spline AMN model.
Meanwhile, they also showed that the choice of potential input variables had the next largest influence on the results, followed by the order of the B-spline basis functions. They demonstrated that second-order B-spline functions produced a better and more transparent output.

2.3 Previous Work

Considerable research has been completed on the comparison of artificial intelligence systems with conventional methods. Some of this previous research is listed in Table 2-1. Hensher and Ton (2000) compared ANNs with the nested logit model in the analysis of commuter mode choice. The results illustrated that the performance of ANNs and the nested logit model was similar. Two papers compared artificial intelligence models with conventional models in freight mode choice analysis. Abdelwahab and Sayed (1999) showed that the predictive abilities of ANNs and conventional mode choice models were comparable. Shortly after, Sayed and Razavi (2000) demonstrated that a neurofuzzy model, which used the fewest explanatory variables, was able to produce the same prediction accuracy as conventional models and ANN models. Maier et al. (2000) developed a model for forecasting cyanobacterial concentrations using B-spline associative memory networks (AMNs). The results showed that the performance of the B-spline AMN model was slightly better than that of back-propagation multilayer perceptrons. In summary, most of the research showed that the prediction accuracies of both neurofuzzy models and ANNs were at least as good as those of the conventional models.

Table 2-1: Previous Research

Author                          Conventional   ANNs   Neurofuzzy
P.V. Subba Rao et al. (1998)    ✓              ✓      ✗
Abdelwahab and Sayed (1999)     ✓              ✓      ✗
Hensher and Ton (2000)          ✓              ✓      ✗
Sayed and Razavi (2000)         ✓              ✓      ✓
Maier et al. (2000)             ✗              ✓      ✓

✓ Analysis was performed. ✗ Analysis was not performed.
CHAPTER 3 DATA COLLECTION

3.1 Introduction

The data set used in this research was obtained from the Greater Vancouver Transportation Authority, "TransLink". The data set was extracted from the 1999 24-hour trip diary survey of the Greater Vancouver Regional District (see Figure 3-1). The trip diary survey gathers travel and household information for current and future transportation planning. In this mode choice analysis, the travel mode of each individual was explained by a set of explanatory variables. These explanatory variables were acquired from three categories of database: the household database, the personal database, and the trip database.

The household database included items such as the number of people in the household, the number of vehicles, the size and structure of the household, and the average income of the household. The personal database described the attributes of each individual, such as age, gender, employment status, driver's license ownership, and non-working status. The trip database described the characteristics of individual trips, such as travel time and travel distance. Each individual therefore had a set of explanatory variables covering all three categories.

An individual might use more than one travel mode in a single trip. For example, some individuals might drive to a bus station and then take transit. To resolve this, TransLink has created a mode hierarchy list that identifies the primary mode of a trip. According to this hierarchy list, transit has a higher priority than both automobile driving and walking. For instance, if a person drove to a bus station and then took transit, transit would be the primary mode. As a result, the travel mode used in this research was based on an individual's primary mode.
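The primary-mode rule can be illustrated with a small lookup. This is a sketch only: the mode labels and the full ordering are hypothetical, since the text states just that transit outranks automobile driving and walking; the actual TransLink hierarchy list is not reproduced here.

```python
# Hypothetical ranking: a lower number means a higher priority. Only "transit
# outranks automobile driving and walking" is taken from the text; the
# remaining order and the labels themselves are assumptions.
MODE_PRIORITY = {
    "transit": 0,
    "auto_driver": 1,
    "auto_passenger": 2,
    "walk_bike": 3,
}


def primary_mode(trip_legs):
    """Return the highest-priority mode among the legs of a single trip."""
    return min(trip_legs, key=MODE_PRIORITY.__getitem__)


# A person drives to a bus station and then takes transit.
print(primary_mode(["auto_driver", "transit"]))
```

Reducing each trip to a single primary mode is what allows the dependent variable to be treated as one categorical choice per observation.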
There were 4 main modes of travel in the data set: automobile driver, automobile passenger, transit, and walking or biking. However, only two travel modes were used to construct the models in this research: transit and automobile driving.

[Figure 3-1: Greater Vancouver Regional District Subregions]

3.2 Data Description

There were about 20,000 observations in the data set. Initially, the data set could be divided into at least 6 different trip purposes: to work/post-secondary school trips, from work/post-secondary school trips, during work trips, to grade school trips, from grade school trips, and social/recreational/personal business trips. However, dividing trips into 6 different purposes is cumbersome. Moreover, some trip purposes, such as grade school trips, did not have sufficient observations to support either the training or the testing of the models. Since the purpose of this research is to compare the predictive ability of different modeling methods, combining similar trips should not have a large impact on the results of the analysis. Consequently, to work/post-secondary school trips and from work/post-secondary school trips were combined into one trip purpose. The other trip purpose used in this study was social/recreational/personal business trips. For brevity, "the work/school trip" is used to represent to and from work/post-secondary school trips and "the recreational trip" is used to represent social/recreational/personal business trips.

The mode choice distribution was computed for both the work/school trip and the recreational trip. The results are shown in Figure 3-2 and Figure 3-3. The automobile driver, indicated by "auto", had the largest share (over 60 percent) in both trip purposes. The share of transit was only 5 percent in the recreational trip and 11 percent in the work/school trip.
It should be noted that all four travel modes were included in these mode choice distributions. In this study, only automobile driving and transit were examined, so observations with travel modes of either walking/biking or automobile passenger were removed. In addition, observations recorded as "not stated" or "invalid" were eliminated. The total number of observations and the travel mode percentages for each trip purpose are listed in Table 3-1. The work/school trip consisted of 2599 observations; the recreational trip had 3179 observations.

[Figure 3-2: Mode Choice Distribution of Recreational Trip (Social/Recreational/Personal Business)]

[Figure 3-3: Mode Choice Distribution of Work/School Trip (Work/Post-Secondary School). Shares: auto 68%, walk/bike 13%, transit 11%, autopass 8%.]

Table 3-1: Travel Modal Share of the Two Trip Purposes

Trip Purpose                                     Transit %   Automobile %
The Work/School Trip (2599 observations)         13.89%      86.11%
The Recreational Trip (3179 observations)        7.14%       92.86%

The variables used in building the models are listed in Table 3-2. There are 20 variables in total. For total travel costs, individuals who took transit paid transit fares according to the three travel zones. The one-zone fare is $1.75, the two-zone fare is $2.50, and the three-zone fare is $3.50. Individuals who drove paid operating costs and parking fees. The operating cost of driving was approximated using a rate of $0.09/km. Maintenance and insurance costs were not included because the appropriate data were not available. For example, an estimated 30 km drive would cost $0.09/km x 30 km = $2.70. Parking fees were applied only to Downtown Vancouver and post-secondary schools such as universities and colleges. Based on the UBC B-lot parking fees, $3.25 for whole-day parking was used in the study.
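The cost rules above amount to a small calculation. The following is a minimal sketch using the fares and rates quoted in the text; the function names and the `paid_parking` flag are hypothetical, and how a destination was classified as a paid-parking location in the study is not modeled here.

```python
# Values quoted in the text.
TRANSIT_FARES = {1: 1.75, 2: 2.50, 3: 3.50}   # dollars, by number of zones
OPERATING_COST_PER_KM = 0.09                   # dollars per km
WHOLE_DAY_PARKING_FEE = 3.25                   # dollars


def transit_cost(zones):
    """Total travel cost of a transit trip: the fare for the zones travelled."""
    return TRANSIT_FARES[zones]


def driving_cost(distance_km, paid_parking):
    """Total travel cost of a driving trip: operating cost plus any parking fee.

    paid_parking is a hypothetical flag standing in for "the destination is
    Downtown Vancouver or a post-secondary school".
    """
    parking = WHOLE_DAY_PARKING_FEE if paid_parking else 0.0
    return OPERATING_COST_PER_KM * distance_km + parking


print(f"30 km drive, no parking fee: ${driving_cost(30, False):.2f}")
print(f"30 km drive, paid parking:   ${driving_cost(30, True):.2f}")
print(f"two-zone transit trip:       ${transit_cost(2):.2f}")
```

Keeping the fare table and rates as named constants makes it straightforward to test how sensitive the cost variable is to each assumption.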
Parking in other areas was considered to be free. Total travel costs were used in the analysis. For instance, if a person took transit, the total travel cost for that person would consist only of the transit fare. On the other hand, if a person drove, the total travel cost would include the automobile operating cost plus the applicable parking cost.

Table 3-2: Data Variables

Variable code   Description

Household Database
H_muncd         Municipality of household (1 to 26 different parts of Lower Mainland, e.g. 3 = Burnaby)
Pph_all         Total persons in household (persons/household)
Pph_16          Total persons in household who are 16 years of age or older (persons/household)
Hh_inc          Household income (1 = less than $29,999, 2 = between $30,000 to $59,999, 3 = between $59,000 to $89,999, 4 = more than $90,000)
H_type          Household structure type (7 different types of household structures, e.g. 1 = single detached houses without suites)
Vph             Vehicles per household (vehicles/household)
Bph             Bicycles per household (bicycles/household)
H_tot_trip      Total trips taken by everyone in the household within 24 hr (trips/household)

Person Database
Age             Age
Sex             Gender (1 = male, 2 = female)
Sch_stat        School attendance status (1 = full-time student, 2 = part-time student, 3 = no basis student, 4 = not a student)
Emp_stat        Employment status (1 = full-time employed, 2 = part-time employed, 3 = self-employed, 4 = no basis employed, 5 = not employed)
Nw_stat         Status of those unemployed (1 = student, 2 = preschooler, 3 = homemaker, 4 = volunteer, 5 = retired, 6 = unemployed)
License         Does an individual have a driver's license? (1 = yes, 2 = no)
P_tot_trip      Total trips taken by an individual within 24 hr (trips/individual)

Trip Database
Trav_time       Total estimated travel time (hour)
Est_dist        Estimated distance (km)
O_muncd         Trip origin municipality (1 to 26 different parts of Lower Mainland, e.g. 3 = Burnaby)
D_muncd         Trip destination municipality (1 to 26 different parts of Lower Mainland, e.g. 3 = Burnaby)
Tcost           Total travel costs ($)

The statistical summaries of all 20 variables for both trip purposes are shown in Table 3-3 and Table 3-4. The two tables show the minimum and maximum values, the mean, the standard deviation, and the variance of each variable.

The descriptive statistics for the work/school trip are shown in Table 3-3. The most common household municipality (29 percent of observations) was Vancouver. Approximately 51 percent of the observations lived in municipalities such as Surrey, Burnaby, Coquitlam, Delta, North Vancouver and Richmond. Most observations had an average of 2 to 3 people or family members living together in a household. Around 62 percent of the data had a household income of between $30,000 and $90,000 per year. Less than 10 percent of the observations had an annual household income below $30,000. Approximately 82 percent of the observations were not students but had either a full-time or part-time job. Ninety-seven percent of the observations had driver's licenses. The majority of trip destinations were Vancouver, Burnaby, Surrey or Richmond. Typically, the travel distance and travel time were 13 km and 0.46 hours respectively. The average travel cost was $1.42 per individual.

The descriptive statistics for the recreational trip, with 3179 observations, are shown in Table 3-4. The household municipalities, the average number of family members living together in a household, the household income, the trip destinations, and the percentage of driver's license ownership of most observations were similar to those of the work/school trip. Eighty-eight percent of the observations were not students. Of this majority, 46 percent had either a full-time or part-time job, and 42 percent of the observations were retired. The employment status and non-working status show the greatest differences between the two trip purposes.
The typical travel distance and travel time were 6.9 km and 0.28 hours 42 Stella Y . W. Chow D A T A COLLECTION respectively. Meanwhile, the average travel cost was $0.72 per individual, which is half of that of the individual travel costs in the work/school trip. In conclusion, the observations in each trip purpose have similar household characteristics except for their personal and trip characteristics. The large majority of observations in the work/school trip were either students or part-/full-time workers. However, about half of the observations in the recreational trip were retired. As well, the travel distance, travel cost and travel time of the work/school trip were double those of the recreational trip. Table 3 -3 : Descriptive Statistics of the Data Set for the Work/School Trip Minimum Maximum Mean Std. Deviation Variance H M U N C D 3.00 26.00 12.4840 5.7801 33.410 PPH A L L 1.00 8.00 2.7649 1.3317 1.773 PPH 16 1.00 8.00 2.2466 .9577 .917 H H INC 1.00 4.00 2.7649 .9578 .917 H TYPE 1.00 7.00 2.2555 1.4145 2.001 V P H .00 7.00 1.8388 .9121 .832 B P H .00 6.00 1.7888 1.4881 2.214 H TOT TRIP 1.00 46.00 10.0342 6.7198 45.156 A G E 15.00 79.00 39.8730 11.5812 134.124 SEX 1.00 2.00 1.4540 .4980 .248 SCH STAT 1.00 5.00 3.6302 .9474 .897 EMP STA1 1.00 5.00 1.6460 1.0821 1.171 N W STAT .00 7.00 8.542E-02 .4928 .243 LICENSE 1.00 2.00 1.0296 .1696 2.876E-02 P TOT TRIP 1.00 18.00 4.3440 2.2402 5.019 T R A V TIME .02 3.50 .4574 .3032 9.192E-02 EST DIST .01 89.33 13.1578 11.1217 123.692 0 M U N C D 3.00 26.00 12.5818 5.7792 33.399 D M U N C D 3.00 26.00 12.4921 5.6928 32.408 TCOST .00 8.74 1.4173 1.1860 1.407 M O D E 1.00 2.00 1.1389 .3459 .120 43 Stella Y . W. Chow D A T A COLLECTION Table 3-4: Descriptive Statistics of the Data Set for the Recreational Trip Minimum Maximum Mean Std. 
Deviation Variance H M U N C D 2.00 26.00 12.6796 5.6478 31.898 PPH A L L 1.00 8.00 2.9303 1.3979 1.954 PPH 16 1.00 8.00 2.1636 .9161 .839 H H INC 1.00 4.00 2.5627 .9927 .985 H T Y P E 1.00 7.00 2.1434 1.4007 1.962 V P H .00 6.00 1.7178 .8626 .744 B P H .00 6.00 1.7852 1.6514 2.727 H TOT TRIP 2.00 46.00 12.9052 7.4134 54.959 A G E 13.00 86.00 46.1155 14.7023 216.159 SEX 1.00 2.00 1.5609 .4964 .246 SCH STAT 1.00 5.00 3.8013 .7468 .558 EMP STA1 1.00 5.00 3.1519 1.7178 2.951 N W STAT .00 7.00 1.8207 2.3230 5.396 LICENSE 1.00 2.00 1.0238 .1525 2.326E-02 P TOT TRIP 1.00 18.00 6.0512 2.5737 6.624 T R A V TIME .02 2.25 .2826 .2348 5.513E-02 EST DIST .01 103.10 6.9258 8.5546 73.182 0 M U N C D 2.00 28.00 12.6966 5.6498 31.920 D M U N C D 2.00 28.00 12.7182 5.6849 32.318 TCOST .00 9.28 .7177 .8531 .728 M O D E 1.00 2.00 1.0715 .2577 6.638E-02 3.3 Training and Testing Data The data of the two trip purposes were divided into two sets. One set (about 70% of the data) was used to train the model and the other set (about 30% of the data) was for testing the predictive ability of each developed model. The details of each trip purpose are discussed below: The Work / School Trip Cross-validation was carried out to evaluate the predictive accuracy of a model for data testing from which the model was developed by training a proportion of the data. 44 Stella Y . W. Chow D A T A COLLECTION In this research, 70 percent of the total number of observations (1820 observations) was randomly selected for training the model and 30 percent (779 observations) was used for testing the predictive performance of the model. Both the training and testing data had the same automobile drive and transit modal shares to avoid any empirical bias. The Recreational Trip Similar to the work/school trip, cross-validation was also applied. 
Seventy percent of the total number of observations (2225 observations) was randomly selected for training the model and 30 percent (954 observations) was used for testing the predictive ability of the model. Again, to avoid any empirical bias, both the training and testing data had the same automobile drive and transit modal shares.

3.4 Methodology

To evaluate the different mode choice analysis techniques, the following models were constructed based on 70 percent of the data. The work was organized in four parts. The first part was to develop models with all explanatory variables; the second part was to calibrate and select the best model for each modeling technique. The best model is defined as the model that maintains the same predictive ability with the smallest number of input variables. The third part was to construct ANN and neurofuzzy models based on the explanatory variables identified by the best logit model; the last part was to construct ANN and logit models based on the explanatory variables selected by the best neurofuzzy model. The purpose of the last two parts was comparison.

Developing Different Mode Choice Models Using All 20 Variables:
- An ANN mode choice model using 20 variables;
- A logit mode choice model using 20 variables;
- A neurofuzzy mode choice model using 20 variables.

Developing and Selecting the Best Mode Choice Model (excluding ANNs):
- The best logit mode choice model;
- The best neurofuzzy mode choice model.

Cross Comparison (I), Based on the Best Logit Mode Choice Model:
- An ANN mode choice model was built based on the input variables selected from the best logit mode choice model;
- A neurofuzzy mode choice model was built based on the input variables selected from the best logit mode choice model.
Cross Comparison (II), Based on the Best Neurofuzzy Mode Choice Model:
- An ANN mode choice model was built based on the input variables selected from the best neurofuzzy mode choice model;
- A logit mode choice model was built based on the input variables selected from the best neurofuzzy mode choice model.

Two sets of analysis were developed: one for the work/school trip and the other for the recreational trip. This thesis used SPSS version 8.0 to perform the logistic regression and NEUframe version 3.0 to develop ANN and B-spline neurofuzzy models. A brief description of NEUframe version 3.0 is presented in the following section.

3.4.1 Introduction to NEUframe 3.0

NEUframe version 3.0 is a tool that helps extract knowledge from a data set. It offers both clustering and supervised modeling. The multi-layered perceptron, the neufuzzy, and the radial basis function are the three supervised modeling algorithms. The Kohonen algorithm is the clustering algorithm. Each algorithm has its own features that can be used to model different combinations of data. In this research, standard back propagation, often called the multi-layered perceptron, was used to develop the ANN model, whereas the neufuzzy algorithm was used for neurofuzzy modeling.

There are four statistical performance measures in NEUframe 3.0: Akaike, Bayesian, final prediction error, and generalized cross validation. The first two are best used for low quality, noisy data. The last two are effective for good quality data. In practice, the neufuzzy algorithm accepts fewer than 30 input variables. It can perform a wide variety of tasks such as classification, time series forecasting, modeling, control systems, and some data mining. Due to its locally adapted response, its training speed is relatively slow.
NEUframe 3.0 not only allows users to visualize the fuzzy rules (see Figure 3-4), but it also presents two output features that further explain the results and processes of the tested models. One of these is the "Rule Fire Matrix" and the other is the "Output Graph".

[Figure 3-4: Fuzzy Rules in NEUframe Version 3.0. Screenshot of the fuzzy rules list for SubNetwork2, showing:
5: IF Input2 is Set1 THEN Output is Set1 (0.04) OR Output is Set2 (0.96)
6: IF Input2 is Set2 THEN Output is Set1 (0.21) OR Output is Set2 (0.79)
7: IF Input2 is Set3 THEN Output is Set1 (1.00) OR Output is Set2 (0.00)]

The "Rule Fire Matrix" displays how the fuzzy rules apply to each observation in the testing data (see Figure 3-5). It is presented in a table format with each row representing a fuzzy rule and each column denoting an observation in the querying process. Each cell is colored to indicate how strongly a given fuzzy rule applies to a specific observation. The darker the cell, the more significant the fuzzy rule is for that observation. For instance, the cell highlighted by the red square indicates that Fuzzy Rule 1 has a fairly significant effect on Observation 5 in the query set. Fuzzy Rule 7, with the darkest color, has the greatest impact on Observation 5. Similarly, the "Output Graph" shows the mean square error and the statistical significance of each sub-network in each pass during the training process.

[Figure 3-5: Rule Fire Matrix in NEUframe Version 3.0]

CHAPTER 4 DATA ANALYSIS AND RESULTS

4.1 Introduction

This chapter describes the methodologies and procedures employed to develop models with all explanatory variables for the three modeling techniques (logit, ANNs, and neurofuzzy).
The means for calibrating and selecting the best model of each modeling technique are also presented in this chapter. The best model is defined as the model that maintains the same predictive ability with the smallest number of explanatory variables. Therefore, models with different numbers of explanatory variables have to be estimated in order to identify the best model. Section 4.2 describes the results of the conventional mode choice logit model. Sections 4.3 and 4.4 present the results of the ANN and the neurofuzzy mode choice models respectively. Section 4.5 is the conclusion of this chapter.

4.2 Conventional Mode Choice Logit Models

The purpose is to calibrate two logit models for the work/school trip and another two logit models for the recreational trip. One logit model was estimated with all twenty variables without taking the significance of each input variable into account; the other logit model, which was considered to be the best model, was constructed with the smallest number of input variables. Section 4.2.1 describes the methodology and the statistical measures, such as the Wald statistic and the kappa statistic, used in developing the logit model.

4.2.1 Methodology

Originally, a logit model was developed based on all twenty explanatory variables. Subsequently, five more models, based on the smallest possible number of variables, were built using the following methods: the forward Wald method, the forward likelihood ratio, the backward Wald method, the backward likelihood ratio, and a manually performed method. The first four methods were carried out automatically by SPSS logistic regression; the last method was performed manually by determining the significance of each explanatory variable in the model developed with all twenty variables. After calibrating the model, 30 percent of the data set was used to carry out the testing.
In addition, the Wald statistic, the model chi-square, and the kappa statistic were used to further evaluate the performance of each model.

Wald Statistic

The statistical significance of each independent variable is determined from the Wald statistic. The Wald statistic, which has a chi-square distribution, can be calculated using the following equation:

$$ z = \left( \frac{B}{S.E.} \right)^{2} \tag{4-1} $$

where
z = Wald statistic;
B = coefficient of a certain input variable;
S.E. = standard error.

If the coefficient of a variable has a Wald statistic value greater than 3.84, that variable is significant at a 95% confidence level. However, when the coefficient of a variable is large, its standard error is also large, which may decrease the Wald statistic of that variable. In this case, the Wald statistic may not be used to verify the importance of that variable. Instead, two models should be developed: one with the variable and one without the variable. Hypothesis testing can then be based on the model log-likelihood. This is one adverse property of the Wald statistic. However, the Wald statistic performs better on large sample sizes (SPSS Regression Models 9.0).

Goodness of Fit

Two methods can be used to determine how well the model fits the data. One method is to examine the model chi-square, and the other is to calculate the predictive success rate. The likelihood is the probability of the observed results (SPSS Regression Models 9.0), which is less than 1. Thus, -2 times the log of the likelihood (-2 log likelihood) is used instead. In general, the higher the likelihood, the better the model fits. A higher likelihood generates a smaller -2 log likelihood value. The calculation of the model chi-square is shown in equation (4-2). Subtracting the final -2 log likelihood from the initial -2 log likelihood gives the model chi-square. The initial -2 log likelihood would be the same for all modeling methods; however, the final -2 log likelihood would be different.
If the model were good, its final -2 log likelihood would be small. Therefore, the larger the model chi-square, the better the model fits.

$$ \text{Model Chi-square} = [-2 \text{ Log Likelihood}_{\text{initial}}] - [-2 \text{ Log Likelihood}_{\text{final}}] \tag{4-2} $$

Another method to measure the goodness of fit is to employ the predictive success rate, calculated by applying the estimated variable coefficients to the remaining 30% of the data set.

Kappa statistic

Another measure, the kappa statistic (Sayed and Abdelwahab 1998), is applied to further examine the predictive accuracy of the models. The kappa statistic coefficient is calculated using the following equation:

$$ K = \frac{P - P_e}{1 - P_e} \tag{4-3} $$

where
P = overall percent agreement;
P_e = overall percent agreement expected by chance.

The kappa statistic is an index that measures the agreement and disagreement between the true observations and the observations predicted by the model. The kappa statistic coefficient lies in the range [-1, +1]. A positive value suggests that there is agreement between the actual and predicted observations. A value of zero indicates a level of agreement between the actual and predicted observations that can be expected by chance. A negative value denotes disagreement between the actual and predicted observations. In other words, a kappa statistic coefficient larger than zero indicates agreement, and a value of +1 denotes perfect agreement. Although there are no absolute rules to interpret the kappa statistic coefficient, some guidelines are available to help understand it. Table 4-1 shows the approximate strength of agreement of the kappa statistic. It is noted that this table can only serve as a rough guideline (Landis and Koch 1977).
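The kappa coefficient described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's SPSS procedure: the function names and the 2x2 counts are hypothetical, and the chance-agreement term P_e is computed here from the row and column totals of the classification table, which is the usual Cohen form and an assumption about how P_e is obtained. The strength labels follow the Landis and Koch (1977) bands.

```python
def kappa_statistic(table):
    """Kappa for a square classification table.

    table[i][j] = number of observations with observed class i
    and predicted class j (rows observed, columns predicted).
    """
    n = sum(sum(row) for row in table)
    # P: overall proportion of agreement (diagonal cells).
    p = sum(table[i][i] for i in range(len(table))) / n
    # P_e: agreement expected by chance, from the marginal totals
    # (assumption: standard Cohen form).
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p - p_e) / (1 - p_e)


def strength_of_agreement(kappa):
    """Rough Landis and Koch label; a value of exactly 0 is
    treated as 'Slight' here, which is an assumption."""
    if kappa < 0.0:
        return "Poor"
    for upper, label in [(0.20, "Slight"), (0.40, "Fair"),
                         (0.60, "Moderate"), (0.80, "Substantial")]:
        if kappa <= upper:
            return label
    return "Almost Perfect"


# Hypothetical counts, not taken from the thesis data.
example = [[40, 10],
           [5, 45]]
k = kappa_statistic(example)
print(round(k, 4), strength_of_agreement(k))
```

For the hypothetical table, P = 0.85 and P_e = 0.50, so the coefficient is 0.70, which falls in the "Substantial" band.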
Table 4-1: Approximate Strength of Agreement of Kappa statistic (Landis and Koch 1977)

| Kappa | Strength of Agreement |
|---|---|
| < 0.00 | Poor |
| 0.01-0.20 | Slight |
| 0.21-0.40 | Fair |
| 0.41-0.60 | Moderate |
| 0.61-0.80 | Substantial |
| 0.81-1.00 | Almost Perfect |

4.2.2 Results of the Work/School Trip

Six work/school trip models were developed, with the method of each indicated in brackets:
- Model 1 - All 20 variables
- Model 2 - The smallest possible number of variables (Manually)
- Model 3 - The smallest possible number of variables (Forward Wald)
- Model 4 - The smallest possible number of variables (Forward Likelihood)
- Model 5 - The smallest possible number of variables (Backward Wald)
- Model 6 - The smallest possible number of variables (Backward Likelihood)

The results of Model 1, with its estimated coefficients, the corresponding Wald statistics and the significance levels of all variables, are shown in Table 4-2. In examining the variables, the primary focus is on the Wald statistic of each variable and the sign of its coefficient. The sign provides the direction of the relationship between the explanatory variable and the dependent variable (Abdelwahab and Sayed 1998). For instance, variable "vph" has a negative coefficient (see Table 4-2). This suggests that a larger value of "vph", the number of vehicles in a household, reduces the probability of selecting "transit" as one's travel mode. Model 1, with a chi-square value of 758 (20 degrees of freedom), is significant. According to Table 4-3, the model correctly predicts 66 observations as taking the transit travel mode but incorrectly identifies 42 transit observations as taking the automobile travel mode. This results in a 61% success rate (66/108 = 61%) for the transit travel mode. On the other hand, only 15 automobile observations are wrongly predicted as taking the transit travel mode, which results in a success rate of 97.8% for the automobile travel mode. The overall prediction success rate of Model 1 is 92.7%.
As well, the kappa statistic of Model 1 is 0.4993, which indicates a moderate strength of agreement between the observed and predicted values. Based on the Wald statistic, the coefficients on the household's municipality; the number of people older than 16 years in a household; the number of vehicles owned per household; and the age, gender, employment status, non-working status, number of trips made by an individual, estimated travel time, estimated travel distance, origin municipality, destination municipality, and total travel costs of individuals are significant at a 95% confidence level. Therefore, a total of 13 out of 20 variables are significant. The large coefficient of the variable "licenses" results in a large standard error (Appendix 4-B). Consequently, the Wald statistic may not be used to verify the significance of variable "licenses". In order to perform hypothesis testing, another model without variable "licenses" was created. The results, as presented in Table 4-4, were tested by model log-likelihood. From Table 4-4, the model with variable "licenses" offers a better fit and prediction success rate than the model without variable "licenses", thus indicating that variable "licenses" should be included in the model. Consequently, the number of explanatory variables used in Model 2 increases from 13 to 14 after taking variable "licenses" into consideration.
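The comparison of the models with and without "licenses" can be read as a likelihood ratio test. The sketch below is an illustration only, not the SPSS output: it assumes the two models are nested and fitted to the same training data, so that they share the same initial -2 log likelihood and the difference between their model chi-squares equals the difference between their final -2 log likelihoods. The function names are hypothetical; the two chi-square values are those reported in Table 4-4. For one degree of freedom, the chi-square tail probability can be written with the complementary error function, so only the standard library is needed.

```python
import math


def chi2_tail_1df(x):
    """P(chi-square with 1 degree of freedom > x).

    For 1 df, chi-square is the square of a standard normal
    variable, so the tail equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def likelihood_ratio_1df(chi2_with, chi2_without):
    """Likelihood ratio statistic and p-value for dropping one variable.

    Assumes both models share the same initial -2 log likelihood,
    so the statistic is the difference of the model chi-squares.
    """
    lr = chi2_with - chi2_without
    return lr, chi2_tail_1df(lr)


# Model chi-squares reported in Table 4-4 (20 df and 19 df).
lr, p_value = likelihood_ratio_1df(758.229, 702.881)
print(f"LR statistic = {lr:.3f}, p-value = {p_value:.2e}")
print("significant at the 95% level:", lr > 3.84)
```

The statistic of about 55.3 is far above the 3.84 critical value used elsewhere in this chapter, which agrees with the decision to keep "licenses" in the model even though its Wald statistic is small.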
Table 4-2: Result of the Logit Model with 20 Variables for the Work/School Trip

| Variable | Coefficient | Wald statistic | Significance |
|---|---|---|---|
| Constant | -8.0175 | 7.3602 | 1.1866 |
| H_muncd | -0.1262 | 17.928* | 0.0000 |
| Pph_all | 0.0291 | 0.0278 | 0.8675 |
| Pph_16 | 0.4959 | 6.2896* | 0.0121 |
| Hh_inc | 0.0597 | 0.2359 | 0.6272 |
| H_type | 0.0307 | 0.1388 | 0.7095 |
| Vph | -1.7824 | 91.515* | 0.0000 |
| Bph | 0.1312 | 2.2481 | 0.1338 |
| H_tot_trip | -0.0466 | 1.8530 | 0.1734 |
| Age | -0.0276 | 7.2210* | 0.0072 |
| Sex (1) | -0.5102 | 5.8314* | 0.0157 |
| Sch_stat | -0.1421 | 1.3184 | 0.2509 |
| Emp_stat | -0.3556 | 8.4606* | 0.0036 |
| Nw_stat | 0.6283 | 8.4515* | 0.0036 |
| License | 8.1652^a | 1.2506 | 0.2634 |
| P_tot_trip | -0.1575 | 4.4001* | 0.0359 |
| Trav_time | 2.3597 | 33.083* | 0.0000 |
| Est_dist | -0.1333 | 65.409* | 0.0000 |
| O_muncd | 0.1207 | 14.865* | 0.0001 |
| D_muncd | 0.0857 | 14.055* | 0.0002 |
| Tcost | 0.9456 | 72.025* | 0.0000 |

Model Chi-Square: 758.229 (20 df)
* Significant at the 95% level.
a: A large coefficient with a large standard error.

Table 4-3: Prediction Results of the Logit Model with 20 Variables for the Work/School Trip (Classification Threshold = 0.5)

| Observed | From SPSS Model: Transit | From SPSS Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 66 | 42 | 108 | 61.1% |
| Automobile | 15 | 656 | 671 | 97.8% |
| Total | 77 | 702 | 779 | 92.7% |

Kappa statistic calculation:
P = (66+656)/779 = 0.9268
P_e = (108/779) (66/779) + (671/779) (656/779) = 0.8539
Kappa statistic = (0.9268 - 0.8539) / (1 - 0.8539) = 0.4993

Table 4-4: Logit Models with and without Variable Licenses for the Work/School Trip (Classification Threshold = 0.5)

| | With variable Licenses | Without variable Licenses |
|---|---|---|
| Model Chi-Square | 758.229 (20 df) | 702.881 (19 df) |
| Prediction Success Rate, Transit (%) | 61.11 | 54.63 |
| Prediction Success Rate, Autodr. (%) | 97.76 | 97.32 |
| Prediction Success Rate, Overall (%) | 92.68 | 91.40 |
| Kappa statistic | 0.4993 | 0.4313 |

After running all 5 models, a comparison was carried out to identify the best logit model with the smallest number of variables. Table 4-5 presents a summary of all 5 results.
Backward methods appear to give a better result because they start with all the explanatory variables and then eliminate some explanatory variables through an iterative procedure (SPSS Regression Models 9.0). Models 2, 5 and 6 are superior to Models 3 and 4 in terms of their overall goodness of fit. They all have higher values of the model chi-square. In addition, Models 2, 5, and 6 have the highest kappa statistic values, which denotes that they have stronger agreement between the observed and predicted values than Models 3 and 4. Moreover, the overall success rates of Models 2, 5, and 6 are better than the overall success rates of the remaining models. Models 2, 5 and 6 have the same results and variable coefficients. Thus, any one of them can serve as the best logit model. Table 4-6 displays the best logit model. All of the variables except "licenses" are significant at a 95% confidence level. Variable "licenses" has a large coefficient value of 8.05. Thus, another model without "licenses" is necessary to demonstrate the importance of variable "licenses". The entire results are presented in Appendix 4-B.

Table 4-5: Summary of All 5 Logit Models for the Work/School Trip

| Model | Method | Chi-square | Kappa | Variables | Correct Prediction %, Overall | Correct Prediction %, Transit | Correct Prediction %, Autodr. |
|---|---|---|---|---|---|---|---|
| 2* | Manually | 752.8 | 0.479 | 14 | 92.4 | 58.3 | 97.9 |
| 3 | Forward Wald | 344.6 | 0.167 | 1 | 88.5 | 19.4 | 99.6 |
| 4 | Forward LR | 725.8 | 0.463 | 10 | 92.3 | 55.6 | 98.2 |
| 5* | Backward Wald | 752.8 | 0.479 | 14 | 92.4 | 58.3 | 97.9 |
| 6* | Backward LR | 752.8 | 0.479 | 14 | 92.4 | 58.3 | 97.9 |

* Significant model

Table 4-6: The Best Logit Model with 14 Variables for the Work/School Trip

| Variable | Coefficient | Wald statistic | Significance |
|---|---|---|---|
| Constant | -7.9304 | 1.1684 | 0.2797 |
| H_muncd | -0.1193 | 16.4938* | 0.0000 |
| Pph_16 | 0.4430 | 12.8388* | 0.0003 |
| Vph | -1.7647 | 108.482* | 0.0000 |
| Age | -0.0314 | 10.644* | 0.0011 |
| Sex (1) | -0.4490 | 5.8811* | 0.0153 |
| Emp_stat | -0.3118 | 8.0375* | 0.0046 |
| Nw_stat | 0.6362 | 9.5931* | 0.0020 |
| License | 8.0473^a | 1.2139 | 0.2706 |
| P_tot_trip | -0.2167 | 14.703* | 0.0001 |
| Trav_time | 2.3313 | 32.120* | 0.0000 |
| Est_dist | -0.1352 | 69.977* | 0.0000 |
| O_muncd | 0.1189 | 14.565* | 0.0001 |
| D_muncd | 0.0844 | 13.721* | 0.0002 |
| Tcost | 0.9722 | 81.108* | 0.0000 |

Model Chi-Square: 752.783 (14 df)
* Significant at a 95% level.
a: A large coefficient with a large standard error.

4.2.3 Results of the Recreational Trip

In a similar manner, six models of the recreational trip were developed. The method for each model is indicated in brackets:
- Model 1 - All 20 variables
- Model 2 - The smallest possible number of variables (Manually)
- Model 3 - The smallest possible number of variables (Forward Wald)
- Model 4 - The smallest possible number of variables (Forward Likelihood)
- Model 5 - The smallest possible number of variables (Backward Wald)
- Model 6 - The smallest possible number of variables (Backward Likelihood)

The results of Model 1 are shown in Table 4-7. The model chi-square demonstrates that Model 1 is significant. Model 1 has an overall success rate of 97.5% (see Table 4-8). In addition, the kappa statistic of Model 1 is 0.6722 (see Appendix 4-D for the sample calculation of the kappa statistic), which indicates a substantial strength of agreement between the observed and the predicted values.
Based on the Wald statistic, the coefficients on the number of people older than 16 in a household, household income, number of vehicles owned, number of trips made by an individual, estimated travel time, estimated travel distance, and total travel costs are significant at a 95% confidence level. Therefore, a total of 7 out of 20 variables are essential in the model. Similar to the previous case, variable "licenses" has a large regression coefficient that creates a large standard error (Appendix 4-C). Consequently, another model without variable "licenses" was created to perform hypothesis testing. The results, as presented in Table 4-9, were tested by model log-likelihood, success rate and kappa statistic. The results in Table 4-9 show that variable "licenses" is significant in the model. Consequently, Model 2 was developed based on 8 significant variables. Only 8 variables are important, far fewer than those selected for the work/school trip.

Table 4-7: Result of the Logit Model with 20 Variables for the Recreational Trip

| Variable | Coefficient | Wald statistic | Significance |
|---|---|---|---|
| Constant | -12.403 | 1.2483 | 0.2639 |
| H_muncd | 0.0373 | 0.6923 | 0.4054 |
| Pph_all | -0.3664 | 1.0036 | 0.3164 |
| Pph_16 | 0.8899 | 5.7256* | 0.0167 |
| Hh_inc | -0.4083 | 4.4214* | 0.0355 |
| H_type | 0.0487 | 0.2042 | 0.6514 |
| Vph | -1.6689 | 31.201* | 0.0000 |
| Bph | -0.2491 | 2.3072 | 0.1288 |
| H_tot_trip | -0.0100 | 0.0189 | 0.8908 |
| Age | 0.0018 | 0.0245 | 0.8757 |
| Sex (1) | 0.1160 | 0.1273 | 0.7212 |
| Sch_stat | -0.1041 | 0.2533 | 0.6148 |
| Emp_stat | 0.0257 | 0.0181 | 0.8930 |
| Nw_stat | 0.0032 | 0.0006 | 0.9801 |
| License | 10.635^a | 0.9340 | 0.3338 |
| P_tot_trip | -0.4105 | 9.6525* | 0.0019 |
| Trav_time | 3.0859 | 46.809* | 0.0000 |
| Est_dist | -0.1756 | 69.296* | 0.0000 |
| O_muncd | 0.0250 | 0.3190 | 0.5722 |
| D_muncd | 0.0195 | 0.3621 | 0.5473 |
| Tcost | 2.2159 | 111.35* | 0.0000 |

Model Chi-Square: 799.980 (20 df)
* Significant at a 95% level.
a: A large coefficient with a large standard error.

Table 4-8: Prediction Results of the Logit Model with 20 Variables for the Recreational Trip (Classification Threshold = 0.5)

| Observed | From SPSS Model: Transit | From SPSS Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 53 | 15 | 68 | 77.9% |
| Automobile | 9 | 877 | 886 | 99.0% |
| Total | 62 | 892 | 954 | 97.5% |

Table 4-9: Logit Models with and without Variable Licenses for the Recreational Trip (Classification Threshold = 0.5)

| | With variable Licenses | Without variable Licenses |
|---|---|---|
| Model Chi-Square | 799.98 | 738.66 |
| Prediction Success Rate, Transit | 77.9% | 70.6% |
| Prediction Success Rate, Autodr. | 99.0% | 98.9% |
| Prediction Success Rate, Overall | 97.5% | 96.9% |
| Kappa statistic | 0.6722 | 0.5977 |

After running all 5 models, a comparison was carried out to identify the best logit model. Table 4-10 summarizes all 5 results. Models 2 and 4 are superior to Models 3, 5 and 6 in terms of overall goodness of fit. They all have higher values of the model chi-square. Model 2 has the highest kappa statistic, which indicates that it has stronger agreement between the observed and predicted values than the other models. Thus, according to the kappa statistic, Model 2 is superior to Model 4. Nevertheless, Model 2 requires 8 variables whereas Model 4 requires only 7 variables. Besides, 7 out of 8 variables in Model 2 are significant at a 95% confidence level while all 7 variables are significant in Model 4. Therefore, Model 4 is better than Model 2 in terms of using the smallest number of input variables. Interestingly, the variables selected by both models are the same except that Model 2 has an extra variable, "pph_16". Table 4-11 and Table 4-12 present the results of the two models. "Licenses" has been shown to be significant in both models (Appendix 4-C). Model 4 is chosen to be the most efficient logit model because it has the lowest number of explanatory variables and all of its variables are significant.
Furthermore, the kappa statistic of Model 4 is only 0.0085 less than that of Model 2. As a result, Model 4 is the best logit model.

Table 4-10: Summary of All 5 Logit Models for the Recreational Trip

| Model | Method | Chi-square | Kappa | Variables | Correct Prediction %, Overall | Correct Prediction %, Transit | Correct Prediction %, Autodr. |
|---|---|---|---|---|---|---|---|
| 2* | Manually | 786.78 | 0.6722 | 8 | 97.5 | 77.9 | 99.0 |
| 3 | Forward Wald | 218.97 | 0.0239 | 1 | 92.0 | 2.94 | 98.9 |
| 4* | Forward LR | 783.82 | 0.6637 | 7 | 97.5 | 75.0 | 99.2 |
| 5 | Backward Wald | 795.93 | 0.6369 | 10 | 97.2 | 75.0 | 98.9 |
| 6 | Backward LR | 795.93 | 0.6369 | 10 | 97.2 | 75.0 | 98.9 |

* Significant model

Table 4-11: Result of the Logit Model with 8 Variables for the Recreational Trip

| Variable | Coefficient | Wald statistic | Significance |
|---|---|---|---|
| Constant | -10.782 | 0.9939 | 0.3188 |
| Pph_16 | 0.3841 | 3.1563 | 0.0756 |
| Hh_inc | -0.5079 | 8.2061* | 0.0042 |
| Vph | -1.8509 | 39.508* | 0.0000 |
| License | 10.6179^a | 0.9655 | 0.3258 |
| P_tot_trip | -0.4479 | 25.884* | 0.0000 |
| Trav_time | 2.9868 | 47.916* | 0.0000 |
| Est_dist | -0.1826 | 83.372* | 0.0000 |
| Tcost | 2.2794 | 129.01* | 0.0000 |

Model Chi-Square: 786.784 (8 df)
* Significant at a 95% level.
a: A large coefficient with a large standard error.

Table 4-12: Result of the Logit Model with 7 Variables for the Recreational Trip

| Variable | Coefficient | Wald statistic | Significance |
|---|---|---|---|
| Constant | -10.552 | 0.9737 | 0.3238 |
| Hh_inc | -0.4469 | 6.5071* | 0.0107 |
| Vph | -1.6278 | 38.033* | 0.0000 |
| License | 10.8423^a | 1.0297 | 0.3102 |
| P_tot_trip | -0.4589 | 27.028* | 0.0000 |
| Trav_time | 2.8572 | 46.222* | 0.0000 |
| Est_dist | -0.1857 | 86.165* | 0.0000 |
| Tcost | 2.3014 | 130.54* | 0.0000 |

Model Chi-Square: 783.818 (7 df)
* Significant at a 95% level.
a: A large coefficient with a large standard error.

4.3 ANN Mode Choice Models

The purpose is to calibrate an ANN model with all 20 explanatory variables for both the work/school trip and the recreational trip. Section 4.3.1 describes the methodology for developing an ANN model. Sections 4.3.2 and 4.3.3 present the resulting ANN models of the work/school trip and the recreational trip respectively.
4.3.1 Methodology

All models have three layers: an input layer, a hidden layer and an output layer. The two neurons in the output layer, having the format [automobile, transit], denote the automobile and transit travel modes. An ANN output coded as [1,0] represents automobile; an output coded as [0,1] corresponds to transit. The analyses applied a learning rate of 0.2, a momentum rate of 0.8, and a training error of 0.05 as the training parameters. After calibrating a model, the prediction success rates were calculated using the 30% of the data set reserved for testing the performance of the developed models.

Under a 50% classification scheme, an observation was assigned to a mode when the output value for that mode was 50% or higher. For instance, an output of [0.11,0.89] clearly indicates transit as the travel mode. However, an output of [0.48,0.52] does not clearly indicate which travel mode is preferred. Although 0.52 is greater than 0.5, this value carries significant uncertainty. Therefore, different confidence levels (50%, 60%, 70%, 80%, 90%, 95%, and 99%) were examined for further investigation. Based on a 50% classification threshold, an observation with an output of [0.48,0.52] would still be considered as transit mode; however, this observation would be counted as a wrong prediction under a 60% classification threshold.

Another technique to examine the prediction accuracy of the model is the expected percent right, %R (Shalaby 1998). This represents the correct prediction percentage of the model. %R is calculated using the following equation:

$$ \%R = \frac{1}{N} \sum_{x} \sum_{j} P_x(j)\, y_{jx} \tag{4-4} $$

where
P_x(j) = the probability that observation x chooses mode j;
y_jx = set to 1 if observation x uses mode j; otherwise, it is set to zero;
N = total number of observations of the data set.
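The two accuracy measures described in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the NEUframe procedure: the function names and the four output vectors are hypothetical, outputs are in the [automobile, transit] format, the observed mode is given as an index (0 = automobile, 1 = transit), and an observation counts as correct at a given threshold only when the largest output belongs to the observed mode and reaches that threshold.

```python
def success_rate(outputs, observed, threshold=0.5):
    """Share of observations predicted correctly at a threshold.

    outputs  : list of [automobile, transit] network outputs
    observed : list of observed mode indices (0 = automobile, 1 = transit)
    """
    correct = 0
    for out, mode in zip(outputs, observed):
        predicted = max(range(len(out)), key=lambda j: out[j])
        # Counted as correct only if the winning output is the
        # observed mode and is at least as large as the threshold.
        if predicted == mode and out[predicted] >= threshold:
            correct += 1
    return correct / len(observed)


def expected_percent_right(outputs, observed):
    """Expected percent right: the mean output assigned to the
    mode that each observation actually used (as a proportion)."""
    total = sum(out[mode] for out, mode in zip(outputs, observed))
    return total / len(observed)


# Hypothetical outputs, not taken from the thesis data.
outputs = [[0.11, 0.89], [0.48, 0.52], [0.95, 0.05], [0.30, 0.70]]
observed = [1, 1, 0, 0]

print(success_rate(outputs, observed, 0.5))   # 3 of 4 correct
print(success_rate(outputs, observed, 0.6))   # [0.48, 0.52] no longer counts
print(expected_percent_right(outputs, observed))
```

In this example the second observation is counted as correct at the 50% threshold but as wrong at 60%, which mirrors the [0.48,0.52] case discussed above.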
4.3.2 Results of the Work/School Trip

All 20 explanatory variables were used to construct an ANN model. The model network, as shown in Figure 4-1, is composed of three layers. Layer 1 has 20 linear nodes; layer 2, the hidden layer, has 5 sigmoid nodes; and layer 3 has 2 sigmoid nodes. Table 4-13 shows the prediction results. The success rate for transit is 85.2% (92/108 = 85.2%), and for automobile is 99.3% (666/671 = 99.3%). The overall prediction success rate is about 97% ((92+666)/779 = 97.3%). This is obtained by adding the numbers of correctly predicted transit and automobile observations together, and then dividing the result by the total number of observations. The kappa statistic of the model is 0.7905, which indicates a substantial agreement between the observed and predicted values. Table 4-14 shows the success rates under different confidence levels. The success rates are reasonably steady as the confidence level increases. When the confidence level reaches 90%, the success rate still maintains a high accuracy level of 87.7%. The correct prediction percentage, %R, is approximately 96%.

Table 4-13: Prediction Results of the ANN Model with 20 Variables for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 92 | 16 | 108 | 85.2% |
| Automobile | 5 | 666 | 671 | 99.3% |
| Total | 97 | 682 | 779 | 97.3% |

Table 4-14: Success Rate of the ANN Model for the Work/School Trip under Different Confidence Levels

| Confidence Level | Success Rate |
|---|---|
| 0.5 | 97.3% |
| 0.6 | 96.9% |
| 0.7 | 96.5% |
| 0.8 | 94.1% |
| 0.9 | 87.7% |
| 0.95 | 78.3% |
| 0.99 | 55.8% |

4.3.3 Results of the Recreational Trip

Similar to the work/school trip, 20 explanatory variables were applied to construct an ANN model. The prediction results of the model are shown in Table 4-15.
The resulting model has the same network structure as that of the work/school trip. The transit success rate is 79.4%; the automobile success rate is 99.3%. The overall prediction success rate is about 98%. The kappa statistic of the model is 0.7149, which designates a substantial agreement between the observed and predicted values. Table 4-16 shows the success rates under different confidence levels. The success rates remain reasonably steady as the confidence level increases. When the confidence level reaches 90%, the success rate maintains a high accuracy level of 84.4%. The correct prediction percentage, %R, is about 94.5%.

Table 4-15: Prediction Results of the ANN Model with 20 Variables for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 54 | 14 | 68 | 79.4% |
| Automobile | 6 | 880 | 886 | 99.3% |
| Total | 60 | 894 | 954 | 97.9% |

Table 4-16: Success Rate of the ANN Model for the Recreational Trip under Different Confidence Levels

| Confidence Level | Success Rate |
|---|---|
| 0.5 | 97.9% |
| 0.6 | 97.4% |
| 0.7 | 95.3% |
| 0.8 | 93.3% |
| 0.9 | 84.4% |
| 0.95 | 66.6% |
| 0.99 | 41.3% |

4.4 Neurofuzzy Mode Choice Models

Two models were developed for each trip purpose. The difference between the two models was their stopping criterion: one model applied the Bayesian stopping criterion and the other used the Akaike stopping criterion. Section 4.4.1 describes the methodology of the modeling process. Section 4.4.2 presents the resulting models for the work/school trip, together with a brief comparison between the two stopping criteria. Finally, Section 4.4.3 presents the results for the recreational trip.

4.4.1 Methodology

Unlike the logit and ANN models, NEUframe constructs the optimal neurofuzzy model automatically. It can determine the number of input variables and the dimension of basis functions.
To obtain the best neurofuzzy model, different stopping criteria are employed. In this thesis, the Akaike and Bayesian statistical performance measures are the two stopping criteria. Maier et al. (2000) showed that the order of the B-spline functions has a negligible impact on the network output and suggested that second order (k=2) B-spline functions normally generate a more transparent model. Thus, only the stopping criterion is varied in this thesis. The two models using the two different stopping criteria are listed below.

Model 1 - Using all 20 variables (Bayesian, k=2)
Model 2 - Using all 20 variables (Akaike, k=2)

The output consists only of the variable "mode", with a value of 1 corresponding to transit and a value of 0 representing the automobile. With a 50% classification threshold, an output coded as [0.11] indicates the automobile travel mode; an output of [0.92] denotes the transit travel mode. In brief, an output of [0.11] means that the probability of choosing the automobile is 89% (1 - 0.11 = 0.89) and the probability of selecting transit is 11%; an output of [0.92] means that the probabilities of selecting transit and the automobile are 92% and 8% respectively. As with the ANN and logit models, the prediction success rates were calculated using the remaining 30% of the data set. In addition, different confidence levels (50%, 60%, 70%, 80%, 90%, 95%, and 99%) were applied to further examine the performance of each model.

4.4.2 Results of the Work/School Trip

The network of Model 1 is composed of three subnetworks. It consists of five variables: licenses, non-work status, estimated travel time, estimated travel distance, and total travel costs. Figure 4-2 illustrates the structure of the model. Each subnetwork has two variables connected with it. For instance, the estimated travel distance and total travel costs are attached to subnetwork 1 in Figure 4-2.
The first subnetwork contains 300 fuzzy rules; the second subnetwork has 10 fuzzy rules and the third subnetwork includes 15 fuzzy rules. A total of 325 fuzzy rules was created. Table 4-17 shows the prediction results of Model 1. The success rate of transit is 89.8% and that of the automobile is 99.6%. The overall success rate is 98.2%. The kappa statistic is 0.8565, which indicates an almost perfect agreement between the observed and predicted values. The correct prediction percentage, %R, is 98%. Table 4-19 shows the success rates of Model 1 under different confidence levels.

[Figure 4-2: Structure of Neurofuzzy Model 1 for the Work/School Trip]

Table 4-17: Prediction Results of Neurofuzzy Model 1 for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 97 | 11 | 108 | 89.8% |
| Automobile | 3 | 668 | 671 | 99.6% |
| Total | 100 | 679 | 779 | 98.2% |

In the second model, the Akaike stopping criterion and second order B-spline functions were used. The network of Model 2 is composed of five subnetworks. It consists of eight variables: household municipalities, number of people older than 16 in a household, number of vehicles owned per household, non-work status, estimated travel time, estimated travel distance, destination municipalities, and total travel costs. Each subnetwork has one to three variables connected with it. The first subnetwork contains 300 fuzzy rules; the second subnetwork has 10 fuzzy rules and the third subnetwork includes 15 fuzzy rules. A total of 325 fuzzy rules was created. Table 4-18 shows the prediction results of Model 2. The overall prediction accuracy is 98.5%. The kappa statistic is 0.8744, which indicates an almost perfect agreement between the observed and predicted values. The correct prediction percentage, %R, is 100%.
Table 4-19 shows the success rates of Model 2 under different confidence levels.

Table 4-18: Prediction Results of Neurofuzzy Model 2 for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 97 | 11 | 108 | 89.8% |
| Automobile | 1 | 670 | 671 | 99.9% |
| Total | 98 | 681 | 779 | 98.5% |

Table 4-19: Success Rates of Neurofuzzy Models for the Work/School Trip under Different Confidence Levels

| Confidence Level | Success Rate (Model 1) | Success Rate (Model 2) |
|---|---|---|
| 0.5 | 98.2% | 98.5% |
| 0.6 | 98.1% | 98.5% |
| 0.7 | 98.0% | 98.3% |
| 0.8 | 97.7% | 98.3% |
| 0.9 | 97.7% | 98.2% |
| 0.95 | 95.9% | 98.2% |
| 0.99 | 89.2% | 98.1% |

A Comparison of Akaike and Bayesian Stopping Criteria

The results obtained from the two models are rather similar. Model 2, which uses the Akaike stopping criterion, gives a better success rate, kappa statistic, and %R. Under different confidence levels, the success rates of Model 2 are also superior to those of the model with the Bayesian stopping criterion. The Akaike model maintains a success rate of about 98% at all confidence levels. Figure 4-3 compares the success rates of the two stopping criteria under different confidence levels. The success rate of the Bayesian stopping criterion is excellent as well: it maintains a prediction accuracy of about 98% up to a confidence level of 90%. Although the Akaike stopping criterion performs slightly better than the Bayesian stopping criterion, it uses more input variables: Akaike needs 8 input variables, while Bayesian requires only 5. The prediction abilities of both stopping criteria are almost the same. Furthermore, the kappa statistics of the two stopping criteria indicate that their prediction performances are reliable.
Thus, the Bayesian stopping criterion is considered to be superior to the Akaike stopping criterion according to prediction accuracy and the number of input variables.

[Figure 4-3: Comparison of Neurofuzzy Model 1 and Model 2 for the Work/School Trip. Horizontal axis: Confidence Level (50 to 99); vertical axis: 86% to 100%; series: Bayesian, Akaike.]

4.4.3 Results of the Recreational Trip

In this model, the Bayesian stopping criterion and second order B-spline function were employed. The overall model structure contains 4 subnetworks with one to three explanatory variables linked to each subnetwork. Six variables were identified when constructing the model: vehicles owned per household, age, driver's license possession, estimated travel time, estimated travel distance, and total travel costs. A total of 869 fuzzy rules was created. Table 4-20 shows the prediction results. The overall prediction success rate is 100%. The kappa statistic is 1.0, which indicates a perfect agreement between the observed and predicted values. The correct prediction percentage, %R, is 100%.

Table 4-20: Prediction Results of Neurofuzzy Model 1 for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 68 | 0 | 68 | 100% |
| Automobile | 0 | 886 | 886 | 100% |
| Total | 68 | 886 | 954 | 100% |

The Akaike stopping criterion and second order B-spline function were used to construct neurofuzzy Model 2. The network of the model is composed of three subnetworks. Each subnetwork contains three input variables. Eight variables were selected when developing the network: household municipality, household income, vehicles owned per household, driver's license possession, age, estimated travel time, estimated travel distance, and total travel costs. A total of 1250 fuzzy rules was created. Table 4-21 shows the prediction results.
Results such as the kappa statistic and %R are the same as those of the model with the Bayesian stopping criterion.

Table 4-21: Prediction Results of Neurofuzzy Model 2 for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 68 | 0 | 68 | 100% |
| Automobile | 0 | 886 | 886 | 100% |
| Total | 68 | 886 | 954 | 100% |

Table 4-22: Success Rates of Neurofuzzy Models for the Recreational Trip under Different Confidence Levels

| Confidence Level | Success Rate (Model 1) | Success Rate (Model 2) |
|---|---|---|
| 0.5 | 100% | 100% |
| 0.6 | 100% | 100% |
| 0.7 | 100% | 100% |
| 0.8 | 100% | 100% |
| 0.9 | 100% | 100% |
| 0.95 | 100% | 100% |
| 0.99 | 100% | 100% |

For the recreational trip, the two stopping criteria offer the same predictive ability. However, the number of input variables identified by the two models is different. The Bayesian stopping criterion requires 6 input variables while the Akaike stopping criterion needs 8. As a result, the Bayesian stopping criterion generally provides a better neurofuzzy model.

4.5 Conclusion

The most efficient (best) mode choice model for each modeling technique has been identified. Several statistical measures, namely the prediction success rates, the kappa statistic, and the correct prediction percentage, were applied to examine each developed model. In addition, different classification thresholds were used to further investigate the performance consistency of each model. In general, the success rates of the recreational trip are better than those of the work/school trip. This might imply that the data used in this thesis fit the recreational trip better than the work/school trip. With the best mode choice model of each modeling technique identified, further comparisons and analyses are carried out in the next chapter.
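
The prediction success rates reported in this chapter follow directly from each confusion matrix. As a minimal sketch (the function name `success_rates` is hypothetical and is not part of NEUframe), the figures of Table 4-13 can be recomputed as follows:

```python
def success_rates(matrix):
    """Return per-mode and overall success rates (percent) from a confusion matrix.

    matrix[i][j] is the number of observations of observed mode i
    that the model predicted as mode j.
    """
    per_mode = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return per_mode, 100.0 * correct / total


# Table 4-13 (ANN model, work/school trip).
# Rows and columns are ordered [transit, automobile].
table_4_13 = [[92, 16],
              [5, 666]]

per_mode, overall = success_rates(table_4_13)
print([round(r, 1) for r in per_mode], round(overall, 1))  # [85.2, 99.3] 97.3
```

The same function applies to any of the other two-mode tables in this chapter by substituting the corresponding counts.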
CHAPTER 5 ANALYSIS AND DISCUSSION

5.1 Introduction

In the previous chapter, the best mode choice model of each modeling technique was selected. Each of the best mode choice models identified its own set of explanatory variables. In this chapter, cross comparison analyses were carried out for both trip purposes. In cross comparison (I), the ANN and neurofuzzy models were constructed based on the explanatory variables identified by the best logit model. In cross comparison (II), both logit and ANN models were developed using the explanatory variables obtained from the best neurofuzzy model. Section 5.2 presents the results of the cross comparison analyses. Section 5.3 discusses all comparisons and results. Finally, Section 5.4 concludes the chapter.

5.2 Cross Comparison Analysis

The cross comparison analysis is composed of 2 main sections. Section 5.2.1 shows the results and comparisons of the work/school trip and Section 5.2.2 presents the comparison results of the recreational trip. A summary table of all developed mode choice models is presented at the end of each of Sections 5.2.1 and 5.2.2. The summary table consists of two parts: the results of the best models and the results of the cross comparison analysis.

5.2.1 The Work/School Trip

5.2.1.1 CROSS COMPARISON (I)

An ANN Model Based on the Best Logit Model

An ANN model was developed based on the 14 explanatory variables selected by the best logit model. The prediction results of the ANN model are shown in Table 5-1. The overall success rate is 98.6 percent and the kappa statistic is 0.8837, which indicates an almost perfect agreement. The overall success rate and the kappa statistic of the ANN model are much better than those of the best logit model.
Table 5-1: Prediction Results of the ANN Model Based on the Best Logit Model for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 97 | 11 | 108 | 89.8% |
| Automobile | 0 | 671 | 671 | 100.0% |
| Total | 97 | 682 | 779 | 98.6% |

A Neurofuzzy Model Based on the Best Logit Model

A neurofuzzy model was calibrated with the 14 input variables obtained from the best logit model. The Bayesian stopping criterion and second order B-spline function were used. Of the 14 input variables, 5 were selected for construction of the neurofuzzy network: non-work status, driver's license ownership, estimated travel time, estimated travel distance, and total travel costs. The network is composed of three subnetworks. Each subnetwork has two variables connected with it. A total of 325 fuzzy rules was created. Table 5-2 shows the prediction results of the constructed model. Transit has a success rate of 89% while that of the automobile is 99.6%. The overall success rate is 98%. The kappa statistic is 0.8423, which indicates an almost perfect agreement between the observed and predicted values. Finally, the correct prediction percentage is 97.1%.

Table 5-2: Prediction Results of the Neurofuzzy Model Based on the Best Logit Model for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 96 | 12 | 108 | 88.9% |
| Automobile | 3 | 668 | 671 | 99.6% |
| Total | 99 | 680 | 779 | 98.1% |

5.2.1.2 CROSS COMPARISON (II)

A Logit Model Based on the Best Neurofuzzy Model

A logit model was created based on the 5 input variables obtained from the best neurofuzzy model. The results of the logit model are shown in Table 5-3. The success rate of transit is 26%; that of the automobile is 97% and the overall success rate is 87%.
The kappa statistic is approximately 0.1897, which indicates only a slight agreement between the observed and the predicted values. The %R is 80.7 percent.

Table 5-3: Prediction Results of the Logit Model Based on the Best Neurofuzzy Model for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From Logit Model: Transit | From Logit Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 28 | 45 | 108 | 25.9% |
| Automobile | 14 | 648 | 671 | 96.6% |
| Total | 77 | 702 | 779 | 86.8% |

An ANN Model Based on the Best Neurofuzzy Model

An ANN model was also calibrated based on the 5 selected variables. Table 5-4 displays the prediction results of the developed ANN model. The overall success rate is 97 percent. The kappa statistic and %R are 0.7356 and 97.8 percent respectively.

Table 5-4: Prediction Results of the ANN Model Based on the Best Neurofuzzy Model for the Work/School Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 84 | 24 | 108 | 77.8% |
| Automobile | 2 | 669 | 671 | 99.7% |
| Total | 86 | 693 | 779 | 96.7% |

Table 5-5 shows a summary of the results for the work/school trip. The upper section of the table consists of the best mode choice model obtained from each modeling technique; the lower section contains the results of the two cross comparisons. The individual travel mode success rates, the overall success rate, the number of input variables, the kappa statistic, and the %R are included in the summary table.

Table 5-5: Summary of All Comparisons for the Work/School Trip

| Model | Transit % | Auto % | Overall % | No. of variables | Kappa | %R |
|---|---|---|---|---|---|---|
| Best Model Comparison | | | | | | |
| Best Logit Model | 58.3 | 97.9 | 92.4 | 14 | 0.479 | 86.4 |
| ANN Model | 85.2 | 99.3 | 97.3 | 20 | 0.791 | 96.0 |
| Best Neurofuzzy Model* | 89.8 | 99.6 | 98.2 | 5 | 0.857 | 98.0 |
| Cross Comparison (I): Best Logit Model | | | | | | |
| ANN Model | 89.8 | 100 | 98.6 | 14 | 0.833 | 95.3 |
| Neurofuzzy Model* | 88.9 | 99.6 | 98.1 | 5 | 0.842 | 97.1 |
| Cross Comparison (II): Best Neurofuzzy Model* | | | | | | |
| Logit Model | 25.9 | 96.6 | 86.8 | 5 | 0.190 | 80.7 |
| ANN Model | 77.8 | 99.7 | 96.7 | 5 | 0.7356 | 97.8 |

*Bayesian stopping criteria and second order B-splines function.

5.2.2 The Recreational Trip

5.2.2.1 CROSS COMPARISON (I)

An ANN Model Based on the Best Logit Model

An ANN mode choice model was constructed using the 7 explanatory variables obtained from the best logit model. The prediction results are shown in Table 5-6. The overall success rate is 97.6 percent and the kappa statistic is 0.6731. The overall success rate and the kappa statistic of the ANN model are better than those attained from the best logit model.

Table 5-6: Prediction Results of the ANN Model Based on the Best Logit Model for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 51 | 17 | 68 | 75.0% |
| Automobile | 6 | 880 | 886 | 99.3% |
| Total | 57 | 897 | 954 | 97.6% |

A Neurofuzzy Model Based on the Best Logit Model

A neurofuzzy model was calibrated based on the 7 input variables obtained from the best logit model. Of the seven input variables, four were used by the neurofuzzy network: household income, estimated travel time, estimated travel distance, and total travel costs. The network is composed of two subnetworks, each containing two input variables. A total of 136 fuzzy rules was developed. The overall success rate is 100 percent (see Table 5-7) and the kappa statistic is 1.0.
Table 5-7: Prediction Results of the Neurofuzzy Model Based on the Best Logit Model for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From Neurofuzzy Model: Transit | From Neurofuzzy Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 68 | 0 | 68 | 100% |
| Automobile | 0 | 886 | 886 | 100% |
| Total | 68 | 886 | 954 | 100% |

5.2.2.2 CROSS COMPARISON (II)

A Logit Model Based on the Best Neurofuzzy Model

A logit model was created based on the 6 variables obtained from the best neurofuzzy model. The prediction results are shown in Table 5-8. The overall success rate is 97.1 percent and the kappa statistic is 0.604.

Table 5-8: Prediction Results of the Logit Model Based on the Best Neurofuzzy Model for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From Logit Model: Transit | From Logit Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 46 | 22 | 68 | 67.7% |
| Automobile | 6 | 880 | 886 | 99.3% |
| Total | 52 | 902 | 954 | 97.1% |

An ANN Model Based on the Best Neurofuzzy Model

Similarly, an ANN model based on the best neurofuzzy model was developed. The structure of the resulting ANN model consists of 3 layers. The input layer contains 6 nodes; the hidden layer has 3 nodes; and the output layer has 2 nodes. The overall success rate is 98%, as shown in Table 5-9. The kappa statistic and %R are 0.776 and 96.6% respectively.

Table 5-9: Prediction Results of the ANN Model Based on the Best Neurofuzzy Model for the Recreational Trip (Classification Threshold = 0.50)

| Observed | From ANN Model: Transit | From ANN Model: Automobile | Total | % Correct |
|---|---|---|---|---|
| Transit | 56 | 12 | 68 | 82.4% |
| Automobile | 3 | 883 | 886 | 99.7% |
| Total | 59 | 895 | 954 | 98.4% |

Table 5-10 presents a summary of the results for the recreational trip. The upper section of the table consists of the best mode choice model obtained from each modeling technique; the lower section contains the results of the two cross comparisons.
The individual travel mode success rates, the overall success rate, the number of input variables, the kappa statistic, and the %R are included in the summary table.

Table 5-10: Summary of All Comparisons for the Recreational Trip

| Model | Transit % | Auto % | Overall % | No. of variables | Kappa | %R |
|---|---|---|---|---|---|---|
| Best Model Comparison | | | | | | |
| Best Logit Model | 75.0 | 99.2 | 97.5 | 7 | 0.664 | 95.6 |
| ANN Model | 79.4 | 99.3 | 97.9 | 20 | 0.715 | 94.5 |
| Best Neurofuzzy Model* | 100 | 100 | 100 | 6 | 1.000 | 100 |
| Cross Comparison (I): Best Logit Model | | | | | | |
| ANN Model | 75.0 | 99.3 | 97.9 | 7 | 0.673 | 95.9 |
| Neurofuzzy Model* | 100 | 100 | 100 | 4 | 1.000 | 100 |
| Cross Comparison (II): Best Neurofuzzy Model* | | | | | | |
| Logit Model | 67.7 | 99.3 | 97.1 | 6 | 0.604 | 95.1 |
| ANN Model | 82.4 | 99.7 | 98.4 | 6 | 0.776 | 96.6 |

*Bayesian stopping criteria and second order B-splines function.

5.3 Discussion

In the following sections, the performance of the three modeling techniques is discussed in terms of several measures: prediction accuracy, the number and significance of input variables, classification threshold, and model transparency. Each measure is compared and explained in detail.

5.3.1 Prediction Accuracy

The Work/School Trip

Among the three mode choice analysis techniques, neurofuzzy models always offer the best prediction accuracy regardless of the measures used to assess the models' success. The upper section of Table 5-5 shows the best results obtained from the three models for the work/school trip. With regard to the overall success rate, both the ANN and neurofuzzy models achieve better results than the conventional technique (logit model). It should be noted that the success rates of the individual travel modes, transit and the automobile, are as important as the overall success rate. This is because the proportions of the two travel modes in the data differ significantly.
More than 85 percent of the observations in the data set selected the automobile as the travel mode; only a small portion of the observations chose transit. Therefore, even if the success rate of the transit mode were low, the overall success rate would still remain high. This occurs in the best logit model, as shown in the summary table. The success rate of transit predicted by the best logit model is only 58 percent, yet the overall success rate is 92 percent. Thus, the kappa statistic plays an essential role under this circumstance. The kappa statistic of the best logit model is 0.479, which is far less than that of the ANN and neurofuzzy models. The ANN model has a kappa statistic of 0.791 while the neurofuzzy model has a kappa statistic larger than 0.85. Although the prediction success rates of the ANN and neurofuzzy models are fairly similar, the neurofuzzy model has a higher kappa statistic and %R, which indicates that it is the more reliable model. Moreover, cross comparisons (I) and (II) demonstrate that the neurofuzzy model can produce superior results. In cross comparison (I), the neurofuzzy model has the highest kappa statistic and %R; in cross comparison (II), the success rates, kappa statistic, and %R of the ANN and logit models are not as good as those of the neurofuzzy model. To illustrate, the logit model in cross comparison (II) only has a kappa statistic of 0.190, which is below the acceptable level.

The Recreational Trip

Table 5-10 shows the summary table of the recreational trip. In a similar manner, the table consists of two sections: the best model comparison and the cross comparisons. For the best model comparison, the overall success rates of the ANN and logit models are comparable, but the ANN model still provides a slightly better estimate of individual modal shares and, therefore, results in a higher kappa statistic.
However, the %R of the best logit model is slightly higher than that of the ANN model. Among the three mode choice modeling techniques, the neurofuzzy model offers the best results. It is worth noting that both the Bayesian and Akaike stopping criteria achieve perfect results on all statistical measures. With regard to cross comparisons, the neurofuzzy model again achieves an excellent result in cross comparison (I). In cross comparison (II), the results of both the ANN and logit models are much lower than those obtained from the best neurofuzzy model.

In summary, the neurofuzzy models always offer the best prediction accuracy regardless of the measures used to assess the model's success. The results obtained from cross comparison (I) show that the neurofuzzy model whose input variables are based on the best logit model can still provide good performance. In cross comparison (II), the results of the ANN and logit models whose input variables are based on the best neurofuzzy model are not as good as those of the neurofuzzy model itself. Therefore, after investigating the prediction accuracy results of the two trip purposes, it is found that the neurofuzzy model has the highest performance in terms of prediction accuracy.

5.3.2 The Number of Input Variables

The number of input variables required for calibrating a model is important since a rich data source might not always be available. Some data are easy to collect but some are very difficult, such as the waiting time of individual transit passengers. In addition, not all variables in the data set are related to mode choice analysis. Models with fewer variables reduce computational cost and time. Consequently, the number of input variables required to implement mode choice analysis plays a significant role in the modeling process.
Among the three mode choice modeling techniques, the neurofuzzy technique can automatically develop an optimal model with a minimum number of input variables. For instance, the best logit model of the work/school trip requires 14 input variables while the neurofuzzy model needs only 5, which is about one third as many. Moreover, cross comparison (I) shows that the neurofuzzy technique can further reduce the number of input variables selected by the best logit model from 14 down to 5. In addition, the neurofuzzy model generates even better results than the logit model. In cross comparison (II) of the work/school trip, the results of the logit model, based on the 5 input variables identified by the best neurofuzzy model, are poor: the model has a low kappa statistic of 0.19. Although the neurofuzzy model uses the smallest number of input variables, it maintains the highest performance on all statistical measures. Thus, the neurofuzzy model is the most efficient model based on its number of input variables.

Table 5-10 presents the comparison summary results of the recreational trip. As mentioned previously, the number of input variables in each model is similar. However, the neurofuzzy model still has the highest performance. In cross comparison (I), the neurofuzzy model further reduces the number of input variables identified by the best logit model from 7 down to 4 and maintains its perfect prediction results.

5.3.3 The Significance of Input Variables

Table 5-11 shows a variable selection comparison between the best logit model and two neurofuzzy models for the work/school trip. All 14 variables in the best logit model, which are all statistically significant at a 95 percent confidence level, are listed in the table. The variables are arranged in order of their statistical significance, with the most significant variable at the top of Table 5-11.
The exception is that the variable "licenses" has a large coefficient that results in a low Wald statistic. However, it has been shown in the preceding chapters that "licenses" is an important variable in the model. The selected input variables of both neurofuzzy models are all statistically significant. Some selected variables are highly important, such as "travtime", "estdist", and "tcost".

Table 5-12 also shows a variable comparison between the best logit and neurofuzzy models for the recreational trip. The seven variables chosen by the best logit model are statistically significant at a 95 percent confidence level. The rest of the explanatory variables, which are not included in the best logit model, are insignificant. Both neurofuzzy models have their most important variables located in the upper portion of the table. Surprisingly, some selected variables in the neurofuzzy model were not statistically significant. This may challenge the conventional method of choosing explanatory variables based on their Wald or t-statistics.

Table 5-11: Variable Comparison between the Best Logit and Neurofuzzy Models (the Work/School Trip)

| Variable (Best Logit Model) | Wald statistic | Neurofuzzy¹ (5 variables) | Neurofuzzy² (8 variables) |
|---|---|---|---|
| Vph | 108.48* | | ✓ |
| Tcost | 81.108* | ✓ | ✓ |
| Est_dist | 69.977* | ✓ | ✓ |
| Trav_time | 32.12* | ✓ | ✓ |
| Hmuncd | 16.494* | | ✓ |
| Ptottrip | 14.703* | | |
| Omuncd | 14.565* | | |
| D_muncd | 13.721* | | ✓ |
| Pph_16 | 12.839* | | ✓ |
| Age | 10.644* | | |
| Nw_stat | 9.5931* | ✓ | ✓ |
| Emp_stat | 8.0375* | | |
| Sex (1) | 5.8811* | | |
| License | 1.2139* | ✓ | |

✓ Variables chosen by Neurofuzzy Model
* Variables significant at a 95% confidence level
¹ Bayesian stopping criteria and second order B-splines function.
² Akaike stopping criteria and second order B-splines function.
Table 5-12: Variable Comparison between the Best Logit and Neurofuzzy Models (the Recreational Trip)

| Variable (Best Logit Model) | Wald statistic | Neurofuzzy¹ (6 variables) | Neurofuzzy² (8 variables) |
|---|---|---|---|
| Tcost | 130.54* | ✓ | ✓ |
| Est_dist | 86.165* | ✓ | ✓ |
| Trav_time | 46.222* | ✓ | ✓ |
| Vph | 38.033* | ✓ | ✓ |
| Ptottrip | 27.028* | | |
| Hhinc | 6.5071* | | ✓ |
| License | 1.0297* | ✓ | ✓ |
| Age | Not in Best Logit | ✓ | ✓ |
| Hh_muncd | Not in Best Logit | | ✓ |

✓ Variables chosen by Neurofuzzy Model
* Variables significant at a 95% confidence level
¹ Bayesian stopping criteria and second order B-splines function.
² Akaike stopping criteria and second order B-splines function.

5.3.4 Classification Threshold

The classification threshold is the value used to convert the predicted output value to a discrete value equal to either 1 or 0. Different classification thresholds were applied to each developed model. Figure 5-1 compares the three modeling techniques at different confidence levels for the work/school trip, while Figure 5-2 does so for the recreational trip. Each figure shows three comparisons: the best model comparison, cross comparison (I), and cross comparison (II). The success rates of most models decrease as the confidence level increases. For the work/school trip, the neurofuzzy model maintains a steady success rate throughout the confidence levels. Even if the confidence level increases to 99 percent, the neurofuzzy model can still achieve a success rate of 90 percent. In Figure 5-1(a), the success rate of the ANN model begins to decrease gradually when the confidence level reaches 90 percent. Similarly, the success rate of the logit model begins to decline when the confidence level reaches 70 percent. In Figure 5-1(b), the curves of the ANN and logit models look alike. When the confidence level reaches 99 percent, the success rates of the ANN and logit models drop below 40 percent.
In Figure 5-1(c), the logit model, with its sharp downward slope, has the worst performance. Its success rate decreases rapidly for confidence levels beyond 90 percent and falls to almost zero at the 99 percent confidence level. Until this point, the neurofuzzy model sustains a relatively constant success rate. The neurofuzzy model is followed by the ANN and then by the logit model.

The confidence level results for the recreational trip (see Figure 5-2) are similar to those for the work/school trip, except that the performance of the logit model is better than that of the ANN model. For both trip purposes, the neurofuzzy model always offers a high, constant success rate throughout the confidence levels.

[Figure 5-1: Classification Thresholds Comparison for the Work/School Trip. Panels: (a) A Comparison of All Three Best Models; (c) Cross Comparison (II): Based on the Best Neurofuzzy Model. Axes: success rate (0% to 100%) against confidence level (50 to 99). Series: Logit, ANN, Bayesian.]

[Figure 5-2: Classification Thresholds Comparison for the Recreational Trip. Panels: (b) Cross Comparison (I): Based on the Best Logit Model; (c) Cross Comparison (II): Based on the Best Neurofuzzy Model. Axes: success rate (0% to 100%) against confidence level (50 to 99). Series: Logit, ANN, Bayesian.]

5.3.5 Network Transparency

ANNs have self-learning capability and are good at recognizing patterns.
However, the results obtained from an ANN model remain a "black box", which is difficult to interpret and therefore difficult to adjust when changes to the model are needed. The knowledge enclosed in the ANN is kept in weight matrix form (Sayed and Razavi 2000). The weights presented in the weight matrix are ambiguous and difficult to understand. On the other hand, fuzzy logic can handle ambiguous information by using fuzzy sets and membership functions. Fuzzy systems provide explicit knowledge representation for easy optimization. However, the main disadvantage of fuzzy systems is that expert knowledge is required to derive the "if-then" rules from the data set manually. Therefore, both ANNs and fuzzy systems have their own weaknesses and strengths.

Neurofuzzy models combine the advantages of both ANNs and fuzzy logic to overcome the problems of each individual system. They combine the transparent representation of a fuzzy system with the learning capability of ANNs. A neurofuzzy model can be set up as a neural network and trained. This allows further insight into the modeling process, and fuzzy rules can be set up automatically from a set of data.

Figure 5-3 shows some of the membership functions obtained from the neurofuzzy model for the work/school trip. The membership function in Figure 5-3(a) is for total travel costs and the one in Figure 5-3(b) is for driver's license ownership. The variable total travel costs consists of 10 fuzzy sets. Fuzzy set 1 represents total travel costs less than $1.09 (very inexpensive) and fuzzy set 10 corresponds to total travel costs greater than $7.65 (very expensive). The remaining fuzzy sets might have the representation shown in Table 5-13. With the use of fuzzy logic, ambiguous information can be applied in mode choice modeling.
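How a travel cost is mapped onto overlapping fuzzy sets can be sketched as below. This is a minimal illustration, assuming triangular (piecewise-linear) membership functions, which is what second order B-splines reduce to, with peaks at the dollar values listed in Table 5-13. The two open-ended sets (less than $1.09 and greater than $7.65) are approximated here by flat shoulders, and the actual knot placement used by NEUframe is not given in the text, so the function `triangular_memberships` is hypothetical.

```python
from bisect import bisect_right

# Peak positions assumed from the dollar values listed in Table 5-13.
COST_PEAKS = [1.09, 1.64, 2.64, 2.73, 3.28, 4.37, 5.46, 7.65]


def triangular_memberships(x, peaks):
    """Return the degree of membership of x in the set centred on each peak.

    Adjacent triangles overlap so that the memberships always sum to 1.
    """
    memberships = [0.0] * len(peaks)
    if x <= peaks[0]:          # left shoulder: fully in the cheapest set
        memberships[0] = 1.0
        return memberships
    if x >= peaks[-1]:         # right shoulder: fully in the most expensive set
        memberships[-1] = 1.0
        return memberships
    i = bisect_right(peaks, x) - 1          # peaks[i] <= x < peaks[i + 1]
    weight = (x - peaks[i]) / (peaks[i + 1] - peaks[i])
    memberships[i] = 1.0 - weight
    memberships[i + 1] = weight
    return memberships


# A cost halfway between $1.09 and $1.64 belongs equally to the first two sets.
print(triangular_memberships(1.365, COST_PEAKS))
```

The design choice worth noting is the partition of unity: every cost activates at most two neighbouring sets, which keeps the resulting rule base readable.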
Table 5-13: Fuzzy Set Representation

Fuzzy Set | Representation | Name of Fuzzy Set
1 | <$1.09 | Very inexpensive
2 | =$1.09 | Inexpensive
3 | =$1.64 | Considerably inexpensive
4 | =$2.64 | Moderately inexpensive
5 | =$2.73 | Slightly inexpensive
6 | =$3.28 | Fair
7 | =$4.37 | Slightly expensive
8 | =$5.46 | Moderately expensive
9 | =$7.65 | Expensive
10 | >$7.65 | Very expensive

[Figure 5-3: Membership Functions of the Neurofuzzy Model. Panels: (a) Membership Function for Total Travel Costs; (b) Membership Function for Driver License Ownership, plotted against Driver License Ownership (normalized).]

With the use of linguistic variables, the model can easily be interpreted and any necessary changes can be made. Table 5-14 illustrates some of the fuzzy rules created by the neurofuzzy model for the work/school trip. Two input variables, "licenses" and "est_dist", are connected to subnetwork number 2. Subnetwork number 1 also has two input variables, "est_dist" and "tcost". By examining these fuzzy rules, the relationships between different variables can be investigated. Although the neurofuzzy model can automatically develop an optimal network, expert knowledge can still be added to refine the model. Therefore, neurofuzzy modeling not only has the ability to construct an optimal model, but also has the ability to incorporate new information, such as fuzzy rules, membership functions, and subnetworks, into that model.
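The way rule confidences of this kind can be turned into a mode prediction is sketched below, using rules 233 and 234 from Table 5-14. The combination step is an assumption: each rule's confidences are weighted by how strongly the rule fires and the result is normalised. The inference method actually used by NEUframe is not described in the text, and the firing strengths in the example are hypothetical.

```python
# Rule confidences copied from Table 5-14 (subnetwork 1, "tcost" is $5.46).
RULES = {
    233: {"auto": 0.84, "transit": 0.16},   # "Est_dist" is 37 km
    234: {"auto": 0.00, "transit": 1.00},   # "Est_dist" is 37.5 km
}


def combine(firing_strengths):
    """Weight each rule's confidences by its firing strength and normalise."""
    total = sum(firing_strengths.values())
    scores = {"auto": 0.0, "transit": 0.0}
    for rule_id, strength in firing_strengths.items():
        for mode, confidence in RULES[rule_id].items():
            scores[mode] += strength * confidence / total
    return scores


# Hypothetical case: a trip whose distance lies halfway between the two rules.
scores = combine({233: 0.5, 234: 0.5})
print(scores, "->", max(scores, key=scores.get))
```

Because the rules and their confidences are explicit, an analyst can inspect or override any single entry of `RULES`, which is the transparency advantage discussed above.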
Table 5-14: Fuzzy Rules Used by the Neurofuzzy Model for the Work/School Trip

Subnetwork number | Rule number | Rule
(2) | 301 | If "License" is Yes AND "Est_dist" is (0 km to 25 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 302 | If "License" is No AND "Est_dist" is (0 km to 25 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 303 | If "License" is Yes AND "Est_dist" is (25 km to 28 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 304 | If "License" is No AND "Est_dist" is (25 km to 28 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 305 | If "License" is Yes AND "Est_dist" is (28 km to 33 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 306 | If "License" is No AND "Est_dist" is (28 km to 33 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 307 | If "License" is Yes AND "Est_dist" is (33 km to 45 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 308 | If "License" is No AND "Est_dist" is (33 km to 45 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 309 | If "License" is Yes AND "Est_dist" is (45 km to 89 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(2) | 310 | If "License" is No AND "Est_dist" is (45 km to 89 km) Then "mode" is auto (0.58) OR "mode" is transit (0.42)
(1) | 231 | If "Est_dist" is 30 km AND "tcost" is $5.46 Then "mode" is auto (0.60) OR "mode" is transit (0.40)
(1) | 232 | If "Est_dist" is 31 km to 36 km AND "tcost" is $5.46 Then "mode" is auto (0.56) OR "mode" is transit (0.44)
(1) | 233 | If "Est_dist" is 37 km AND "tcost" is $5.46 Then "mode" is auto (0.84) OR "mode" is transit (0.16)
(1) | 234 | If "Est_dist" is 37.5 km AND "tcost" is $5.46 Then "mode" is transit (1.00)
(1) | 240 | If "Est_dist" is >67 km AND "tcost" is $5.46 Then "mode" is auto (0.59) OR "mode" is transit (0.41)
(1) | 297 | If "Est_dist" is 45 km AND "tcost" is >$7.65 Then "mode" is auto (0.56) OR "mode" is transit (0.44)
(1) | 298 | If "Est_dist" is 56 km AND "tcost" is >$7.65 Then "mode" is auto (0.54) OR "mode" is transit (0.46)
(1) | 299 | If "Est_dist" is 67 km AND "tcost" is >$7.65 Then "mode" is auto (0.62) OR "mode" is transit (0.38)
(1) | 300 | If "Est_dist" is >67 km AND "tcost" is >$7.65 Then "mode" is auto (0.58) OR "mode" is transit (0.42)

5.4 Conclusion

In this chapter, the best model comparison and the cross comparisons were carried out for the two trip purposes. The performances of the three modeling techniques were discussed in terms of several measures: prediction accuracy, the number and significance of input variables, classification threshold, and model transparency. The performances of the neurofuzzy models for both trip purposes are relatively consistent. The neurofuzzy models provide the best results among the three mode choice modeling methods, particularly for the recreational trip. These models have the highest and most consistent success rates throughout all classification thresholds. The results of the logit models are less accurate than those obtained from the ANNs for the work/school trip, but they are better than those of the ANNs for the recreational trip.

CHAPTER 6 CONCLUSION

6.1 Summary

Travel-demand forecasting models play an important role in transportation planning.
Within the traditional four-step travel-demand forecasting model, mode choice is perhaps the most critical step because it affects many transportation policy issues. The logit and probit models are the most widely used conventional mode choice models. However, their linear specification of the explanatory variables may not produce good results when applied to nonlinear and complex problems. Therefore, exploring other advanced modeling approaches may enhance the mode choice modeling process.

In this thesis, three modeling techniques (the conventional logit model, artificial neural networks (ANNs), and neurofuzzy approaches) were used to develop mode choice models for two trip purposes. Neurofuzzy modeling combines the transparent representation of a fuzzy system with the learning capability of neural networks. A fuzzy system can therefore be set up automatically as a neural network and trained. This allows further insight into the modeling process.

There are two main analyses in this thesis: the best model comparison and the cross comparison. The best model comparison selects the best mode choice model from each modeling method, while the cross comparison develops models whose input variables are based on the other constructed models. Different statistical measures, such as the kappa statistic and the percentage of correct predictions, were used to examine the results. The efficiency of a mode choice model was judged on several criteria: prediction accuracy, the number and significance of input variables, classification threshold, and model transparency.

6.2 Conclusions

The results of the logit models are worse than those obtained from the ANNs for the work/school trip, but they are better than those of the ANNs for the recreational trip. On the other hand, the performances of the neurofuzzy models for both trip purposes are relatively consistent.
The neurofuzzy model provides the most promising results among the three mode choice modeling methods, particularly for the recreational trip. It has the highest and most consistent success rates throughout all classification thresholds. Furthermore, the neurofuzzy model can construct an optimum model by automatically selecting the significant input variables from the data set. Overall, the main conclusions of this thesis are that:

1. Neurofuzzy models provide the highest prediction accuracy and reliability.
2. Neurofuzzy models can maintain the highest success rates even with the least number of explanatory variables.
3. Neurofuzzy models offer the most consistent success rates throughout different classification thresholds.
4. Neurofuzzy models have a transparent knowledge representation.
5. Neurofuzzy models have the ability to incorporate new information into the models for further refinement.

Based on the results of all three modeling techniques, the neurofuzzy technique is found to be the most efficient mode choice model. The explicit knowledge representation of the neurofuzzy model makes it much simpler to explore the relationships between choice behavior and the associated explanatory variables. In addition, the ability to incorporate expert knowledge into the developed model is an advantage for further model refinement. With advanced computer technology, the neurofuzzy approach might have a wide range of applications in the field of transportation engineering, such as road safety. An example of a potential application in road safety might be to analyze which types of road surface conditions increase the chance of a car accident.

An efficient mode choice model can provide a better understanding of how individuals or groups select their travel modes for a particular trip purpose. Some recommendations might further enhance the development of the mode choice model.
For instance, level-of-service (LOS) variables, such as transit waiting time, and automobile-related variables, such as insurance cost, can be included in the data set to make the model more realistic. Regarding trip purposes, the "to & from work/school trip" may be separated into four trip purposes: to work, from work, to school, and from school. Similarly, the social/recreational/personal business trip may be subdivided into three trip purposes for a more precise modeling result. In this way, a more accurate prediction of the share of trips attracted to public transportation can be achieved.

CHAPTER 7 REFERENCE

Abdelwahab, Walid and Tarek Sayed. "Freight Mode Choice Models Using Artificial Neural Networks." Civil. Eng. and Env. Syst. 16 (1999): 267-286.
Beimborn, Edward A. "A Transportation Modeling Primer." Center for Urban Transportation Studies. May 1995. 15 June 2001 <http://www.uwm.edu/Dept/CUTS/primer.htm>
Bierlaire, Michel. "Discrete choice models." Intelligent Transportation Systems Program. 22 May 1997. 15 Jun. 2001 <http://rosowww.epfl.ch/mbi/papers/discretechoice/paper.html>
Bossley, Kevin Martin. "Neurofuzzy Modeling Approaches in System Identification." Doctor of Philosophy in the Engineering and Applied Science thesis (1997). University of Southampton, Highfield, Southampton, UK.
Harris, C. J. et al. "Advances in Neurofuzzy Algorithms for Real-time Modeling and Control." Engng. Applic. Artif. Intell. 9.1 (1996): 1-16.
Hensher, David A. and Tu T. Ton. "A Comparison of The Predictive Potential of Artificial Neural Networks and Nested Logit Models for Commuter Mode Choice." Transportation Research Part E 36 (2000): 155-172.
Horowitz, Joel L., Frank S. Koppelman and Steven R. Lerman. "A Self-Instructing Course in Disaggregate Mode Choice Modeling - FTA." National Transportation Library. DOT-T-93-18 (1986). 15 Jun. 2001 <http://ntl.bts.gov/DOCS/381SIC.html>
Khisty, C. Jotin, and B. Kent Lall. "Urban Transportation Planning." Transportation Engineering: An Introduction. 2nd ed. New Jersey: Prentice-Hall, 1998. 449-515.
Landis, J. Richard and Koch, G. Gary. "The Measurement of Observer Agreement for Categorical Data." Biometrics 33.1 (1977): 159-174.
Maier, Holger R., Tarek Sayed and Barbara J. Lence. "Forecasting Cyanobacterial Concentrations Using B-Spline Networks." Journal of Computing in Civil Engineering 14.1 (2000): 183-189.
Meyer, Michael D., and Eric J. Miller. "Demand Analysis." Urban Transportation Planning: A Decision-Oriented Approach. The United States of America: McGraw-Hill, 1984. 225-273.
Mills, D. and Harris C. J. "Neurofuzzy Modeling and Control of a Six Degree of Freedom AUV." Image, Speech and Intelligent Systems (1995/6). University of Southampton. 15 June 2001 <http://www.ecs.soton.ac.uk/publications/rj/1995-1996/isis/djm/rj95.htm>
"NEUframe Version 3.0 Manual." Neural Computer Science. UK: Neusciences Intelligent Solutions. 1997.
Ortuzar, J. de D. and L.G. Willumsen. "Modal Split and Direct Demand Models." Modeling Transport. 2nd ed. England: John Wiley & Sons, 1994. 187-202.
Rao, P.V. Subba et al. "Another Insight into Artificial Neural Networks Through Behavioral Analysis of Access Mode Choice." Comput., Environ. and Urban Systems 22.5 (1998): 485-496.
Sayed, Tarek and Abdolmehdi Razavi. "Comparison of Neural and Conventional Approaches to Mode Choice Analysis." Journal of Computing in Civil Engineering 14.1 (2000): 23-30.
Sayed, Tarek and Walid Abdelwahab. "Comparison of Fuzzy and Neural Classifiers for Road Accidents Analysis." Journal of Computing in Civil Engineering 12.1 (1998): 42-46.
Schwartz, W.L. et al. "Guidebook on Methods to Estimate Non-Motorized Travel: Supporting Documentation." Federal Highway Administration. FHWA-RD-98-166 (1999). 15 June 2001.
<http://www.walkinginfo.org/task_orders/to_12/tol2/vol2/title.htm>
Shalaby, Amer S. "Investigating the Role of Relative Level-Of-Service Characteristics in Explaining Mode Split for the Work Trip." Transportation Planning and Technol. 22 (1998): 125-148.
Spear, D. Bruce. "Applications of New Travel Demand Forecasting Techniques to Transportation Planning: A Study of Individual Choice Models." Federal Highway Administration. Mar. 1977. 12 Jun. 2001 <http://ntl.bts.gov/DOCS/SICM.html>
"SPSS Regression Model 9.0." SPSS Inc., U.S.: SPSS Inc., 1999.
Yang, Hai et al. "Exploration of Driver Route Choice with Advanced Traveler Information Using Neural Network Concepts." California PATH Program (1993): UCB-ITS-PRR-93-13.
Zadeh, Lotfi A. "Making Computers Think Like People." IEEE Spectrum Aug. (1984): 26-32.

APPENDICES

4-A) Sample Calculation of the probability of an event occurring (Logit Model).
4-B) Results of the Logit Mode Choice Models for the work/school trip.
4-C) Results of the Logit Mode Choice Models for the recreational trip.
4-D) Sample Calculation of the Kappa statistic.
5-A) Results of the Logit Model for cross-comparisons (II).

APPENDIX 4-A

Sample Calculation of the probability of an event occurring (Logit Model).

Assuming the estimated coefficients of a model are:

Variable | Coefficient
Constant | -10.552
HH income | -0.4469
Vph | -1.6278
License | 10.842
Total trip | -0.4589
Travel time | 2.8572
Estimated distance | -0.1857
Total travel costs | 2.3014

The resulting model can be written as an equation:

$$Z = -10.552 - 0.4469\,(\text{HH\_income}) - 1.6278\,(\text{Vph}) + 10.842\,(\text{License}) - 0.4589\,(\text{Total trip}) + 2.8572\,(\text{Travel time}) - 0.1857\,(\text{Estimated distance}) + 2.3014\,(\text{Total travel costs})$$

$$\text{Prob(event)} = \frac{1}{1 + e^{-Z}}, \qquad Z = B_0 + B_1 X_1 + \dots + B_7 X_7$$

where
B_0 = constant of the model
B_i = coefficient of explanatory variable i

Given an observation with the following information:

Coefficient | Explanatory variable | Data value
B_1 | HH income | 2 ($30,000 to $59,999)
B_2 | Vph | 2 vehicles
B_3 | License | 1 (has license)
B_4 | Total trip | 6 trips
B_5 | Travel time | 0.42 hour
B_6 | Estimated distance | 4.934 km
B_7 | Total travel costs | $0.44

Then, the probability of choosing transit as the travel mode is calculated by:

$$B_0 + B_1 X_1 + \dots = -10.552 - 0.4469\,(2) - 1.6278\,(2) + 10.842\,(1) - 0.4589\,(6) + 2.8572\,(0.42) - 0.1857\,(4.934) + 2.3014\,(0.44) = -4.85142$$

$$\text{Prob(transit)} = \frac{1}{1 + e^{-(-4.85142)}} = 0.0078 = 0.78\%$$

$$\text{Prob(automobile)} = 1 - \text{Prob(transit)} = 1 - 0.0078 = 99.2\%$$

As a result, the probability of taking transit is lower than the probability of driving. Thus, automobile is chosen as the travel mode.

APPENDIX 4-B

Results of the Logit Mode Choice Models for the Work/School Trip.
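The sample calculation of Appendix 4-A can be reproduced with a short script. This is a sketch that uses the coefficients and data values exactly as printed in Appendix 4-A; the dictionary keys and the function name `logit_probability` are illustrative. Evaluating the printed inputs this way gives a linear predictor of about -5.32 rather than the -4.85142 reported, so one of the printed data values or coefficients may differ from the one actually used; the conclusion that automobile is the predicted mode is unaffected.

```python
import math

CONSTANT = -10.552
COEFFICIENTS = {
    "hh_income": -0.4469,
    "vph": -1.6278,
    "license": 10.842,
    "total_trip": -0.4589,
    "travel_time": 2.8572,
    "est_distance": -0.1857,
    "total_cost": 2.3014,
}

# Observation as printed in Appendix 4-A.
OBSERVATION = {
    "hh_income": 2,        # $30,000 to $59,999
    "vph": 2,              # vehicles per household
    "license": 1,          # has license
    "total_trip": 6,       # trips
    "travel_time": 0.42,   # hours
    "est_distance": 4.934, # km
    "total_cost": 0.44,    # dollars
}


def logit_probability(constant, coefficients, values):
    """Return the linear predictor Z and Prob(event) = 1 / (1 + exp(-Z))."""
    z = constant + sum(coefficients[name] * values[name] for name in coefficients)
    return z, 1.0 / (1.0 + math.exp(-z))


z, p_transit = logit_probability(CONSTANT, COEFFICIENTS, OBSERVATION)
print(f"Z = {z:.4f}")
print(f"Prob(transit) = {p_transit:.4f}, Prob(automobile) = {1 - p_transit:.4f}")
```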
Part A
Model 1 - done by the Enter Method (20 variables)

Part B
Model 2 - the smallest possible number of variables (Manually)
Model 3 - the smallest possible number of variables (Forward Wald)
Model 4 - the smallest possible number of variables (Forward Likelihood)
Model 5 - the smallest possible number of variables (Backward Wald)
Model 6 - the smallest possible number of variables (Backward Likelihood)

Part A

Appendix 4B-1 Model 1: Enter Method (with all 20 variables)

Variable | Coefficient | Wald | Significance
Constant | -8.0175 | 7.3602 | 1.1866
Household municipal | -0.1262 | 17.928 | 0.0000*
PPH_All | 0.0291 | 0.0278 | 0.8675
PPH_16 | 0.4959 | 6.2896 | 0.0121*
HH_inc | 0.0597 | 0.2359 | 0.6272
H_type | 0.0307 | 0.1388 | 0.7095
Vph | -1.7824 | 91.515 | 0.0000*
Bph | 0.1312 | 2.2481 | 0.1338
H_tot_tr | -0.0466 | 1.8530 | 0.1734
Age | -0.0276 | 7.221 | 0.0072*
Sex (1) | -0.5102 | 5.8314 | 0.0157*
Sch_stat | -0.1421 | 1.3184 | 0.2509
Emp_stat | -0.3556 | 8.4606 | 0.0036*
Non-work status | 0.6283 | 8.4515 | 0.0036*
License | 8.1652 | 1.2506 | 0.2634 a
Total trip | -0.1575 | 4.4001 | 0.0359*
Travel time | 2.3597 | 33.083 | 0.0000*
Estimated distance | -0.1333 | 65.409 | 0.0000*
Original municipal | 0.1207 | 14.865 | 0.0001*
Destination municipal | 0.0857 | 14.055 | 0.0002*
Total travel costs | 0.9456 | 72.025 | 0.0000*

Model Chi-Square: 758.229 (20 df)
Prediction Success Rate: Transit 61.11%, Automobile 97.76%, Overall 92.68%
Kappa statistic: 0.499
* Significant at a 95% level.
a A large coefficient of standard error

Logit Models with Variable "Licenses" and without "Licenses"

Measure | With variable Licenses | Without variable Licenses
Model Chi-Square | 758.229 (20 df) | 702.881 (19 df)
Prediction Success Rate, Transit | 61.11% | 54.63%
Prediction Success Rate, Autodr. | 97.76% | 97.32%
Prediction Success Rate, Overall | 92.68% | 91.40%
Kappa statistic | 0.4993 | 0.4313

Part B

Appendix 4B-2 Model 2: Manual Method (with 14 variables)

Variable | Coefficient | Wald | Significance
Constant | -7.9304 | 1.1684 | 0.2797
Household municipal | -0.1193 | 16.4938 | 0.0000*
PPH_16 | 0.4430 | 12.8388 | 0.0003*
Vph | -1.7647 | 108.482 | 0.0000*
Age | -0.0314 | 10.644 | 0.0011*
Sex (1) | -0.4490 | 5.8811 | 0.0153*
Emp_stat | -0.3118 | 8.0375 | 0.0046*
Non-work status | 0.6362 | 9.5931 | 0.0020*
License | 8.0473 | 1.2139 | 0.2706 a
Total trip | -0.2167 | 14.703 | 0.0001*
Travel time | 2.3313 | 32.120 | 0.0000*
Estimated distance | -0.1352 | 69.977 | 0.0000*
Original municipal | 0.1189 | 14.565 | 0.0001*
Destination municipal | 0.0844 | 13.721 | 0.0002*
Total travel costs | 0.9722 | 81.108 | 0.0000*

Model Chi-Square: 752.783 (14 df)
Prediction Success Rate: Transit 58.33%, Automobile 97.91%, Overall 92.43%
Kappa statistic: 0.479
* Significant at a 95% level.
a A large coefficient of standard error

Logit Models with Variable "Licenses" and without "Licenses"

Measure | With variable Licenses | Without variable Licenses
Model Chi-Square | 752.783 (14 df) | 699.960 (13 df)
Prediction Success Rate, Transit | 58.33% | 55.56%
Prediction Success Rate, Autodr. | 97.91% | 97.47%
Prediction Success Rate, Overall | 92.43% | 91.66%
Kappa statistic | 0.4791 | 0.4429

Appendix 4B-3 Model 3: Forward Wald Method

Variable | Coefficient | Wald | Significance
Constant | 0.8661 | 26.966 | 0.0000*
Vph | -1.8875 | 221.61 | 0.0000*

Model Chi-Square: 344.623 (1 df)
Prediction Success Rate: Transit 19.44%, Automobile 99.55%, Overall 88.45%
Kappa statistic: 0.167
* Significant at a 95% level.
Appendix 4B-4 Model 4: Forward Likelihood Method

Variable | Coefficient | Wald | Significance
Constant | -8.5240 | 1.6382 | 0.2006
PPH_16 | 0.3908 | 10.472 | 0.0012*
Vph | -1.7080 | 107.37 | 0.0000*
Age | -0.0245 | 7.5309 | 0.0061*
Sex (1) | -0.4103 | 4.1890 | 0.0407*
License | 8.1078 | 1.2434 | 0.2648 a
Total trip | -0.1998 | 13.545 | 0.0002*
Travel time | 2.2954 | 32.768 | 0.0000*
Estimated distance | -0.1341 | 72.709 | 0.0000*
Destination municipal | 0.0759 | 13.674 | 0.0002*
Total travel costs | 0.9389 | 81.919 | 0.0000*

Model Chi-Square: 725.754 (10 df)
Prediction Success Rate: Transit 55.56%, Automobile 98.21%, Overall 92.30%
Kappa statistic: 0.463
* Significant at a 95% level.
a A large coefficient of standard error

Appendix 4B-5 Model 5: Backward Wald Method

Variable | Coefficient | Wald | Significance
Constant | -7.9304 | 1.1684 | 0.2797
Household municipal | -0.1193 | 16.494 | 0.0000*
PPH_16 | 0.4430 | 12.839 | 0.0003*
Vph | -1.7647 | 108.48 | 0.0000*
Age | -0.0314 | 10.644 | 0.0011*
Sex (1) | -0.4990 | 5.8811 | 0.0153*
Emp_stat | -0.3118 | 8.0375 | 0.0046*
Non-work status | 0.6362 | 9.5931 | 0.0020*
License | 8.0473 | 1.2139 | 0.2706 a
Total trip | -0.2167 | 14.703 | 0.0001*
Travel time | 2.3313 | 32.120 | 0.0000*
Estimated distance | -0.1352 | 69.977 | 0.0000*
Original municipal | 0.1189 | 14.565 | 0.0001*
Destination municipal | 0.0844 | 13.721 | 0.0002*
Total travel costs | 0.9722 | 81.108 | 0.0000*

Model Chi-Square: 752.783 (14 df)
Prediction Success Rate: Transit 58.33%, Automobile 97.91%, Overall 92.43%
Kappa statistic: 0.479
* Significant at a 95% level.
a A large coefficient of standard error
Appendix 4B-6 Model 6: Backward Likelihood Method

Variable | Coefficient | Wald | Significance
Constant | -8.9284 | 1.4788 | 0.2240
Household municipal | -0.1193 | 16.494 | 0.0000*
PPH_16 | 0.4430 | 12.839 | 0.0003*
Vph | -1.7647 | 108.48 | 0.0000*
Age | -0.0314 | 10.644 | 0.0011*
Sex (1) | -0.4990 | 5.8811 | 0.0153*
Emp_stat | -0.3118 | 8.0375 | 0.0046*
Non-work status | 0.6362 | 9.5931 | 0.0020*
License | 8.0473 | 1.2139 | 0.2706 a
Total trip | -0.2167 | 14.703 | 0.0001*
Travel time | 2.3313 | 32.120 | 0.0000*
Estimated distance | -0.1352 | 69.977 | 0.0000*
Original municipal | 0.1189 | 14.565 | 0.0001*
Destination municipal | 0.0844 | 13.721 | 0.0002*
Total travel costs | 0.9722 | 81.108 | 0.0000*

Model Chi-Square: 752.783 (14 df)
Prediction Success Rate: Transit 58.33%, Automobile 97.91%, Overall 92.43%
Kappa statistic: 0.479
* Significant at a 95% level.
a A large coefficient of standard error

APPENDIX 4-C

Results of the Logit Mode Choice Models for the Recreational Trip.

Part A
Model 1 - done by the Enter Method (20 variables)

Part B
Model 2 - the smallest possible number of variables (Manually)
Model 3 - the smallest number of variables (Forward Wald)
Model 4 - the smallest number of variables (Forward Likelihood)
Model 5 - the smallest number of variables (Backward Wald)
Model 6 - the smallest number of variables (Backward Likelihood)

Part A

Appendix 4C-1 Model 1: Enter Method (with all 20 variables)

Variable | Coefficient | Wald | Significance
Constant | -12.403 | 1.2483 | 0.2639
Household municipal | 0.0373 | 0.6923 | 0.4054
PPH_All | -0.3664 | 1.0036 | 0.3164
PPH_16 | 0.8899 | 5.7256 | 0.0167*
HH_inc | -0.4083 | 4.4214 | 0.0355*
H_type | 0.0487 | 0.2042 | 0.6514
Vph | -1.6689 | 31.201 | 0.0000*
Bph | -0.2491 | 2.3072 | 0.1288
H_tot_tr | -0.0100 | 0.0189 | 0.8908
Age | 0.0018 | 0.0245 | 0.8757
Sex (1) | 0.1160 | 0.1273 | 0.7212
Sch_stat | -0.1041 | 0.2533 | 0.6148
Emp_stat | 0.0257 | 0.0181 | 0.8930
Non-work status | 0.0032 | 0.0006 | 0.9801
License | 10.635 | 0.9340 | 0.3338 a
Total trip | -0.4105 | 9.6525 | 0.0019*
Travel time | 3.0859 | 46.809 | 0.0000*
Estimated distance | -0.1756 | 69.296 | 0.0000*
Original municipal | 0.0250 | 0.3190 | 0.5722
Destination municipal | 0.0195 | 0.3621 | 0.5473
Total travel costs | 2.2159 | 111.35 | 0.0000*

Model Chi-Square: 799.980 (20 df)
Prediction Success Rate: Transit 77.94%, Automobile 98.98%, Overall 97.48%
Kappa statistic: 0.6722
* Significant at a 95% level.
a A large coefficient of standard error

Logit Models with Variable "Licenses" and without "Licenses"

Measure | With variable Licenses | Without variable Licenses
Model Chi-Square | 779.98 (20 df) | 738.66 (19 df)
Prediction Success Rate, Transit | 77.94% | 70.59%
Prediction Success Rate, Autodr. | 98.98% | 98.87%
Prediction Success Rate, Overall | 97.48% | 96.86%
Kappa statistic | 0.6722 | 0.5977

Part B

Appendix 4C-2 Model 2: Enter Method (with 8 variables)

Variable | Coefficient | Wald | Significance
Constant | -10.782 | 0.9939 | 0.3188
PPH_16 | 0.3841 | 3.1563 | 0.0756
Household income | -0.5079 | 8.2061 | 0.0042*
Vph | -1.8509 | 39.508 | 0.0000*
License | 10.6179 | 0.9655 | 0.3258 a
Personal total trip | -0.4479 | 25.884 | 0.0000*
Travel time | 2.9868 | 47.916 | 0.0000*
Estimated distance | -0.1826 | 83.372 | 0.0000*
Total travel costs | 2.2794 | 129.01 | 0.0000*

Model Chi-Square: 786.784 (8 df)
Prediction Success Rate: Transit 77.9%, Automobile 99.0%, Overall 97.5%
Kappa statistic: 0.6722
* Significant at a 95% level.
a A large coefficient of standard error

Logit Models with Variable "Licenses" and without "Licenses"

Measure | With variable Licenses | Without variable Licenses
Model Chi-Square | 786.784 (8 df) | 738.66 (7 df)
Prediction Success Rate, Transit | 77.94% | 76.47%
Prediction Success Rate, Autodr. | 98.98% | 98.87%
Prediction Success Rate, Overall | 97.48% | 97.27%
Kappa statistic | 0.6722 | 0.6500

Appendix 4C-3 Model 3: Forward Wald Method

Variable | Coefficient | Wald | Significance
Constant | -3.8118 | 660.65 | 0.0000*
Total travel costs | 1.1407 | 186.14 | 0.0000*

Model Chi-Square: 218.967 (1 df)
Prediction Success Rate: Transit 2.94%, Automobile 98.87%, Overall 92.03%
Kappa statistic: 0.0239
* Significant at a 95% level.
Appendix 4C-4 Model 4: Forward Likelihood Method

Variable | Coefficient | Wald | Significance
Constant | -10.552 | 0.9737 | 0.3238
HH income | -0.4469 | 6.5071 | 0.0107*
Vph | -1.6278 | 38.033 | 0.0000*
License | 10.842 | 1.0297 | 0.3102 a
Total trip | -0.4589 | 27.028 | 0.0000*
Travel time | 2.8572 | 46.222 | 0.0000*
Estimated distance | -0.1857 | 86.165 | 0.0000*
Total travel costs | 2.3014 | 130.54 | 0.0000*

Model Chi-Square: 783.818 (7 df)
Prediction Success Rate: Transit 75.00%, Automobile 99.21%, Overall 97.48%
Kappa statistic: 0.6637
* Significant at a 95% level.
a A large coefficient of standard error

Appendix 4C-5 Model 5: Backward Wald Method

Variable | Coefficient | Wald | Significance
Constant | -11.641 | 1.1052 | 0.2931
Household municipal | 0.0586 | 4.4041 | 0.0359*
PPH_16 | 0.4665 | 4.9571 | 0.0260*
HH income | -0.4243 | 5.3973 | 0.0202*
Vph | -1.7168 | 34.948 | 0.0000*
Bph | -0.3329 | 5.5879 | 0.0181*
License | 10.403 | 0.8860 | 0.3466 a
Total trip | -0.4377 | 23.991 | 0.0000*
Travel time | 3.0474 | 48.947 | 0.0000*
Estimated distance | -0.1778 | 76.182 | 0.0000*
Total travel costs | 2.2631 | 123.41 | 0.0000*

Model Chi-Square: 795.933 (10 df)
Prediction Success Rate: Transit 75.00%, Automobile 98.87%, Overall 97.17%
Kappa statistic: 0.6369
* Significant at a 95% level.
a A large coefficient of standard error

Appendix 4C-6 Model 6: Backward Likelihood Method

Variable | Coefficient | Wald | Significance
Constant | -11.641 | 1.1052 | 0.2931
Household municipal | 0.0586 | 4.4041 | 0.0359*
PPH_16 | 0.4665 | 4.9571 | 0.0260*
HH income | -0.4243 | 5.3973 | 0.0202*
Vph | -1.7168 | 34.948 | 0.0000*
Bph | -0.3329 | 5.5879 | 0.0181*
License | 10.403 | 0.8860 | 0.3466 a
Total trip | -0.4377 | 23.991 | 0.0000*
Travel time | 3.0474 | 48.947 | 0.0000*
Estimated distance | -0.1778 | 76.182 | 0.0000*
Total travel costs | 2.2631 | 123.41 | 0.0000*

Model Chi-Square: 795.933 (10 df)
Prediction Success Rate: Transit 75.00%, Automobile 98.87%, Overall 97.17%
Kappa statistic: 0.6369
* Significant at a 95% level.
a A large coefficient of standard error
APPENDIX 4-D

Sample Calculation of the Kappa statistic.

Prediction Results of a Mode Choice Model (Classification Threshold = 0.5)

Observed | From SPSS Model: Transit | From SPSS Model: Automobile | Total | % Correct
Transit | 66 | 42 | 108 | 61.1%
Automobile | 15 | 656 | 671 | 97.8%
Total | 77 | 702 | 779 | 92.7%

$$\text{Kappa statistic} = \frac{P - P_e}{1 - P_e}$$

where
P = overall percent agreement;
P_e = overall percent agreement expected by chance.

P = (66 + 656) / 779 = 0.9268
P_e = (108/779) (66/779) + (671/779) (656/779) = 0.8539
Kappa statistic = (0.9268 - 0.8539) / (1 - 0.8539) = 0.4993

APPENDIX 5-A

Results of the Logit Model for Cross Comparisons (II)

Appendix 5A-1 A Logit Model Based on the Best Neurofuzzy Mode Choice Model (Work/School Trip)

Variable | Coefficient | Wald | Significance
Constant | -12.492 | 2.8786 | 0.0898
Nw_stat | 0.2926 | 3.9134 | 0.0479*
License | 9.5092 | 1.6688 | 0.1964 a
Travel time | 2.6859 | 64.243 | 0.0000*
Estimated distance | -0.1751 | 142.59 | 0.0000*
Total travel costs | 1.1227 | 139.30 | 0.0000*

Model Chi-Square: 513.973 (5 df)
Prediction Success Rate: Transit 25.9%, Automobile 96.6%, Overall 86.8%
Kappa statistic: 0.1897
* Significant at a 95% level.
a A large coefficient of standard error

Appendix 5A-2 A Logit Model Based on the Best Neurofuzzy Mode Choice Model (Recreational Trip)

Variable | Coefficient | Wald | Significance
Constant | -14.854 | 2.1292 | 0.1445
Vph | -2.0329 | 74.013 | 0.0000*
Age | 0.0150 | 3.2101 | 0.0732
License | 11.601 | 1.3036 | 0.2536 a
Travel time | 2.9368 | 53.797 | 0.0000*
Estimated distance | -0.1580 | 76.025 | 0.0000*
Total travel costs | 2.0598 | 125.86 | 0.0000*

Model Chi-Square: 743.771 (6 df)
Prediction Success Rate: Transit 67.7%, Automobile 99.3%, Overall 97.1%
Kappa statistic: 0.604
* Significant at a 95% level.
a A large coefficient of standard error
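The Kappa calculation of Appendix 4-D can also be sketched in code. The version below assumes the standard Cohen's kappa definition, in which the chance agreement P_e is built from the row totals and the column totals of the classification table, and it uses only the four cell counts printed in Appendix 4-D. The function name `cohen_kappa` is illustrative. Computed this way from the four cells, the column totals are 81 and 698 and kappa is about 0.66, which differs from the 0.4993 printed in the appendix; the printed P_e expression and totals may have followed a different convention or contain transcription slips.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square table: rows are observed, columns are predicted."""
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the main diagonal.
    p = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of matching row and column proportions.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return p, p_e, (p - p_e) / (1.0 - p_e)


# Cell counts from Appendix 4-D (order: transit, automobile).
TABLE = [
    [66, 42],    # observed transit
    [15, 656],   # observed automobile
]

p, p_e, kappa = cohen_kappa(TABLE)
print(f"P = {p:.4f}, P_e = {p_e:.4f}, kappa = {kappa:.4f}")
```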
Comparison of neural classifiers and conventional approaches to mode choice analysis Chow, Stella Yu Wai 2002
Item Metadata
Title | Comparison of neural classifiers and conventional approaches to mode choice analysis |
Creator |
Chow, Stella Yu Wai |
Date Issued | 2002 |
Extent | 5965490 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-08-12 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0063490 |
URI | http://hdl.handle.net/2429/12050 |
Degree |
Master of Applied Science - MASc |
Program |
Civil Engineering |
Affiliation |
Applied Science, Faculty of Civil Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2002-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |