COINTEGRATION AND STATIONARITY ANALYSIS OF JAPANESE SPECULATIVE LAND AND STOCK MARKETS: 1982 - 1993

by

HEIDI M.C. KELLY

B.Comm., Dalhousie University, 1989

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
THE FACULTY OF COMMERCE AND BUSINESS ADMINISTRATION

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1994
© Heidi M.C. Kelly, 1994

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

(Signature)
Department of [illegible]
The University of British Columbia
Vancouver, Canada
Date [illegible]

ABSTRACT

There was a remarkable downturn in the stock and land markets of Japan at the end of 1989 and the beginning of 1990. This thesis examines speculative stock and land market indices, explores relationships between these indices, and determines if the downturn had any effect on such relationships. The two sets of data used are measures of the Japanese speculative stock market (the Nikkei and Topix, Tokyo Stock Exchange market indices) and measures of the Japanese speculative land market (golf course membership price indices for the country as a whole, the eastern part of Japan, the western part of Japan, and Tokyo).
Preliminary analysis of the data suggests the existence of three similarities: first, between the two stock market indices; second, amongst the four golf course membership price indices; and, third, between the set of stock market indices and the set of golf course membership price indices. The graphs of the data, the effects of transformations, the lack of outliers, the lack of seasonality, and the distributions of the data are remarkably alike. However, a more technical look at the data supports the opposing point of view. ARIMA modelling shows there exists surprisingly little structural similarity either within or between the two sets of data (i.e., the speculative stock and land market indices). In addition, cointegration test results provide little evidence of the expected relationship between these data sets.

When accounting for the effect of the downturn on the data, evidence of linear relationships does exist. Cointegration tests using data before the downturn provide evidence of linear relationships between the two data sets (particularly between the stock market indices and the country composite index for golf course membership prices). However, examination of data from after the downturn shows that the downturn seems to have changed the data in such a way as to remove these previously existing linear relationships between the two sets of indices. Cointegration tests show no linear relationships appear to exist within either data set before the downturn (i.e., between the two stock market indices or between any pairwise combination of the four golf course membership price indices), but there is evidence of some such relationships after the downturn.
The conclusion of this paper is that the downturn actually changed the relationships within each set of data (i.e., between different measures of the speculative stock market and between different measures of the speculative land market) and between the two sets of data (i.e., the relationship between the Japanese stock and land markets).

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENT

CHAPTER 1 INTRODUCTION
1.1 Why is Japan of interest?
1.2 Why look at Japan's stock market?
1.3 Why would it be believed that stock prices and land prices are related?
1.4 Why would golf course membership prices be considered?

CHAPTER 2 A TIME OF CHANGE
2.1 Dateline
2.2 Before the downturn
2.3 The downturn
2.4 After the downturn

CHAPTER 3 DATA DESCRIPTION
3.1 Introduction
3.2 Choosing the time interval of the data
3.3 Stock market indices
   3.3.1 Background
   3.3.2 Nikkei
   3.3.3 Topix
   3.3.4 Nikkei vs. Topix
3.4 Golf course membership price indices
   3.4.1 The problems with land price measurement
   3.4.2 Golf course membership price indices

CHAPTER 4 DO OUTLIERS EXIST?
4.1 What will happen during this analysis?
4.2 Main results and conclusions
4.3 Background
4.4 Actions taken
4.5 Detailed analysis
4.6 Description of analysis
4.7 Evaluative discussion

CHAPTER 5 WHAT CAN BE SAID OF THE DISTRIBUTIONS OF THE DATA?
5.1 Main results and conclusions
5.2 Background
   5.2.1 The purpose of the logarithmic transformation
5.3 Actions taken
5.4 Detailed analysis
   5.4.1 Nikkei
   5.4.2 Topix
   5.4.3 Golf - all
   5.4.4 Golf - east
   5.4.5 Golf - west
   5.4.6 Golf - tokyo
5.5 Description of analysis
   5.5.1 Nikkei
   5.5.2 Topix
   5.5.3 Golf - all
   5.5.4 Golf - east
   5.5.5 Golf - west
   5.5.6 Golf - tokyo
5.6 Evaluative discussion
   5.6.1 Nikkei
   5.6.2 Topix
   5.6.3 Golf - all
   5.6.4 Golf - east
   5.6.5 Golf - west
   5.6.6 Golf - tokyo

CHAPTER 6 DOES SEASONALITY EXIST?
6.1 Main results and conclusions
6.2 Background
6.3 Actions taken
6.4 Detailed analysis
6.5 Description of analysis
6.6 Evaluative discussion
   6.6.1 Possible problems with method

CHAPTER 7 STATIONARITY
7.1 Main results and conclusions
7.2 Background
7.3 Actions taken
7.4 Detailed analysis
   7.4.1 Nikkei
   7.4.2 Topix
   7.4.3 Golf - all
   7.4.4 Golf - east
   7.4.5 Golf - west
   7.4.6 Golf - tokyo
7.5 Description of analysis
   7.5.1 Stock Indices
   7.5.2 Golf Indices
7.6 Evaluative discussion
   7.6.1 Stock Indices
   7.6.2 Golf Indices

CHAPTER 8 DECIDING THE METHODS OF ANALYSIS
8.1 What is known so far?
8.2 How are the structures of the indices to be determined?
8.3 What type of relationship is under scrutiny?
   8.3.1 Why is the analysis restricted to such a specific relationship?
   8.3.2 Is regression appropriate?
   8.3.3 Is cointegration appropriate?
8.4 How can it be determined if the downturn affected the relationship?

CHAPTER 9 ARE THE STRUCTURES OF THE DATA SERIES SIMILAR?
9.1 Main results and conclusions
9.2 Background
   9.2.2 Brief description of the ARIMA model
   9.2.3 Difference stationary vs. trend stationary
   9.2.4 Autocorrelation functions and ARIMA processes
9.3 Actions taken
9.4 Detailed analysis
   9.4.1 Nikkei
   9.4.2 Topix
   9.4.3 Golf - all
9.5 Description of analysis
   9.5.1 Nikkei
   9.5.2 Topix
   9.5.3 Golf - all
   9.5.4 Golf - east
9.6 Evaluative discussion
   9.6.1 Nikkei
   9.6.2 Topix
   9.6.3 Golf - all
   9.6.4 Golf - east
   9.6.5 Golf - west

CHAPTER 10 DOES A LONG-TERM LINEAR RELATIONSHIP EXIST?
10.1 Main results and conclusions
10.2 Background
   10.2.1 What is cointegration?
   10.2.2 How does one test for cointegration?
   10.2.3 Is the pot calling the kettle black?
   10.2.4 What assumptions are necessary to use cointegration tests?
10.3 Actions taken
10.4 Detailed analysis
10.5 Description of analysis
10.6 Evaluative discussion

CHAPTER 11 EFFECTS OF DOWNTURN
11.1 Main results and conclusions
11.2 Background
11.3 Actions taken
11.4 Detailed analysis
11.5 Description of analysis
11.6 Evaluative discussion
   11.6.1 Relationships within the group of stock indices and within the group of golf indices
   11.6.2 Relationships between the group of stock indices and the group of golf indices

CHAPTER 12 WHAT HAS THIS ANALYSIS DISCOVERED ABOUT THE DATA?
12.2 What is now known about the data?
12.3 What is now known about the structure of the data?
12.4 What is now known about the relationship between weekly stock and land indices?
12.5 What is now known about the effects of the downturn on the data?
12.6 What further analysis could be done?

BIBLIOGRAPHY
Appendix 1 - List of Acronyms and Abbreviations Used
Appendix 2 - List of Symbols Used

LIST OF TABLES

Table 1. Total land area and annually traded land area by use
Table 2. Calculations used in determining outliers; part one
Table 3. Calculations used in determining outliers; part two
Table 4. Unit root tests on the Nikkei and its transformations
Table 5. Autocorrelation functions of the Nikkei and its transformations
Table 6. Unit root tests on the Topix and its transformations
Table 7. Autocorrelation functions of the Topix and its transformations
Table 8. Unit root tests on Golf - all and its transformations
Table 9. Autocorrelation functions of Golf - all and its transformations
Table 10. Unit root tests on Golf - east and its transformations
Table 11. Autocorrelation functions of Golf - east and its transformations
Table 12. Unit root tests on Golf - west and its transformations
Table 13. Autocorrelation functions of Golf - west and its transformations
Table 14. Unit root tests on Golf - tokyo and its transformations
Table 15. Autocorrelation functions of Golf - tokyo and its transformations
Table 16. Parameter estimates for ARIMA models fitted to Nikkei
Table 17. Acf of residuals for ARIMA models fitted to Nikkei
Table 18. Parameter estimates for ARIMA models fitted to the logarithm of the Nikkei
Table 19. Acf of residuals for ARIMA models fitted to the logarithm of the Nikkei
Table 20. Parameter estimates for ARIMA models fitted to the Topix
Table 21. Acf of residuals for ARIMA models fitted to the Topix
Table 22. Parameter estimates for ARIMA models fitted to the logarithm of the Topix
Table 23. Acf of residuals for ARIMA models fitted to the logarithm of the Topix
Table 24. Parameter estimates for ARIMA models fitted to Golf - all
Table 25. Acf of residuals for ARIMA models fitted to Golf - all
Table 26. Parameter estimates for ARIMA models fitted to the logarithm of Golf - all
Table 27. Acf of residuals for ARIMA models fitted to the logarithm of Golf - all
Table 28. Parameter estimates for ARIMA models fitted to Golf - east
Table 29. Acf of residuals for ARIMA models fitted to Golf - east
Table 30. Parameter estimates for ARIMA models fitted to the logarithm of Golf - east
Table 31. Acf of residuals for ARIMA models fitted to the logarithm of Golf - east
Table 32. Parameter estimates for ARIMA models fitted to Golf - west
Table 33. Acf of residuals for ARIMA models fitted to Golf - west
Table 34. Parameter estimates for ARIMA models fitted to the logarithm of Golf - west
Table 35. Acf of residuals for ARIMA models fitted to the logarithm of Golf - west
Table 36. Parameter estimates for ARIMA models fitted to Golf - tokyo
Table 37. Acf of residuals for ARIMA models fitted to Golf - tokyo
Table 38. Parameter estimates for ARIMA models fitted to the logarithm of Golf - tokyo
Table 39. Acf of residuals for ARIMA models fitted to the logarithm of Golf - tokyo
Table 40. Cointegration test results for original data
Table 42. Cointegration test results for before the downturn
Table 43. Cointegration test results for after the downturn

LIST OF FIGURES

Figure 1. Nikkei
Figure 2. Topix
Figure 3. Golf - all
Figure 4. Golf - east
Figure 5. Golf - west
Figure 6. Golf - tokyo
Figure 7. Golf - all four indices
Figure 8. Histogram of the Nikkei
Figure 9. Normal probability plot of the Nikkei
Figure 10. Histogram of the first differences of the Nikkei
Figure 11. Normal probability plot of the first differences of the Nikkei
Figure 12. Histogram of the logarithm of the Nikkei
Figure 13. Normal probability plot of the logarithm of the Nikkei
Figure 14. Histogram of the first differences of the logarithm of the Nikkei
Figure 15. Normal probability plot of the first differences of the logarithm of the Nikkei
Figure 16. Histogram of the Topix
Figure 17. Normal probability plot of the Topix
Figure 18. Histogram of the first differences of the Topix
Figure 19. Normal probability plot of the first differences of the Topix
Figure 20. Histogram of the logarithm of the Topix
Figure 21. Normal probability plot of the logarithm of the Topix
Figure 22. Histogram of the first differences of the logarithm of the Topix
Figure 23. Normal probability plot of the first differences of the logarithm of the Topix
Figure 24. Histogram of Golf - all
Figure 25. Normal probability plot of Golf - all
Figure 26. Histogram of the first differences of Golf - all
Figure 27. Normal probability plot of the first differences of Golf - all
Figure 28. Histogram of the logarithm of Golf - all
Figure 29. Normal probability plot of the logarithm of Golf - all
Figure 30. Histogram of the first differences of the logarithm of Golf - all
Figure 31. Normal probability plot of the first differences of the logarithm of Golf - all
Figure 32. Histogram of Golf - east
Figure 33. Normal probability plot of Golf - east
Figure 34. Histogram of the first differences of Golf - east
Figure 35. Normal probability plot of the first differences of Golf - east
Figure 36. Histogram of the logarithm of Golf - east
Figure 37. Normal probability plot of the logarithm of Golf - east
Figure 38. Histogram of the first differences of the logarithm of Golf - east
Figure 39. Normal probability plot of the first differences of the logarithm of Golf - east
Figure 40. Histogram of Golf - west
Figure 41. Normal probability plot of Golf - west
Figure 42. Histogram of the first differences of Golf - west
Figure 43. Normal probability plot of the first differences of Golf - west
Figure 44. Histogram of the logarithm of Golf - west
Figure 45. Normal probability plot of the logarithm of Golf - west
Figure 46. Histogram of the first differences of the logarithm of Golf - west
Figure 47. Normal probability plot of the first differences of the logarithm of Golf - west
Figure 48. Histogram of Golf - tokyo
Figure 49. Normal probability plot of Golf - tokyo
Figure 50. Histogram of the first differences of Golf - tokyo
Figure 51. Normal probability plot of the first differences of Golf - tokyo
Figure 52. Histogram of the logarithm of Golf - tokyo
Figure 53. Normal probability plot of the logarithm of Golf - tokyo
Figure 54. Histogram of the first differences of the logarithm of Golf - tokyo
Figure 55. Normal probability plot of the first differences of the logarithm of Golf - tokyo
Figure 56. Autocorrelation functions of Nikkei and Topix
Figure 57. Autocorrelation functions of the Golf indices
Figure 58. First differences of the Nikkei
Figure 59. Logarithm of the Nikkei
Figure 60. First differences of the logarithm of the Nikkei
Figure 61. First differences of the Topix
Figure 62. Logarithm of the Topix
Figure 63. First differences of the logarithm of the Topix
Figure 64. First differences of Golf - all
Figure 65. Logarithm of Golf - all
Figure 66. First differences of the logarithm of Golf - all
Figure 67. First differences of Golf - east
Figure 68. Logarithm of Golf - east
Figure 69. First differences of the logarithm of Golf - east
Figure 70. First differences of Golf - west
Figure 71. Logarithm of Golf - west
Figure 72. First differences of the logarithm of Golf - west
Figure 73. First differences of Golf - tokyo
Figure 74. Logarithm of Golf - tokyo
Figure 75. First differences of the logarithm of Golf - tokyo

ACKNOWLEDGEMENT

My heartfelt appreciation goes to my parents, who have given the world to me. I am grateful to all those who have helped me by offering their expertise and patience. My advisors, of course, deserve special mention: Dr Piet de Jong who assisted me on so many occasions; Dr William Ziemba who was liberal in providing both his knowledge and access to data; and Dr Anthony Boardman who gave his time so generously.
CHAPTER 1 INTRODUCTION

Japan has moved from a ruined economy at the end of World War II to one of the most successful in the world today. As a result, more and more interest has focused on Japan: its culture, its business practices, and its economy. This paper considers the Japanese economy through analyzing its stock market and land market. In particular, the focus of this analysis is (a) to model indices representative of the stock and land markets to determine their structure and similarities, (b) to discern any existing long-term linear relationship between these indices, and (c) to determine what effect, if any, the Japanese stock market downturn at the end of 1989 and the beginning of 1990 had on this relationship.

This paper looks at stock and land prices in Japan by examining two stock market indices (the Nikkei and the Topix) and four measures of speculative land prices (golf course membership price indices for the country as a whole, the eastern part of Japan, the western part of Japan, and Tokyo). It will be shown that there is surprisingly little similarity between the stock market and land market in Japan, with this being particularly true after the downturn removes what little degree of relationship previously existed. The downturn actually changed the relationship both within each group of indices and between the two groups of indices.

The first chapter will explain why this topic is of interest. The second and third chapters will provide some relevant background and describe the data to be used. Chapters 4 through 7 will begin the analysis by establishing some basic facts about the data which will both help in the understanding of the data and aid in the more complicated analysis which will follow. These early chapters will look for outliers (Chapter 4), examine the distributions of the data (Chapter 5), determine if seasonality exists in the data (Chapter 6), and determine if stationarity exists (Chapter 7).
The remaining chapters will describe what methods of analysis will be used (Chapter 8), examine the structures of the indices (Chapter 9), examine the data for linear relationships (Chapter 10), and examine the effect of the downturn on these relationships (Chapter 11). The final chapter will summarize the results and comment on some interesting related issues.

1.1 Why is Japan of interest?

It is important for each country to understand as much as possible about the economies of other countries. Distance has come to have little meaning given technological advances which allow instantaneous communication around the world. The economies of the world are becoming more and more interrelated, and interaction between financial markets is the greatest it has ever been. The stock market downturn of October 1987 showed the extent to which this is true, forcing people to realize that developments in one market affect other markets (Powell 1988, 177). Given this state of affairs, no country can remain isolated and hope to be successful. Each country must learn as much as possible about other economies and how they interact.

Japan is of particular interest due to its economic strength and its influence on world economics. Japan's economic strength is shown, for example, by its being the only industrialized country in 1992 to have a budget surplus (according to the article "The Japanese government is running a budget surplus" in The Economist, April 11-17, 1992). Japan's influence on world economics in recent years is well established; in fact, in the spring of 1986 Japan became the world's leading creditor (Burstein 1988, 62). Given the pessimistic and uncertain condition of the economies in most countries today, it is no surprise that more and more research is being focused on Japan.

1.2 Why look at Japan's stock market?

There are three reasons why the Japanese stock market is worth analysis.
Firstly, stock market performance is a prime economic indicator in industrialized economies.

Secondly, the large size of Japan's market makes it significant. The increasing degree of interaction between the world's economies and stock markets discussed earlier makes this large size particularly important for all countries. To conceptualize the size of the Tokyo Stock Exchange (TSE), consider that the value of the TSE's stocks surpassed that of the New York Stock Exchange in 1987 (Burstein 1988, 38). At its peak in December 1989, the Japanese stock market contained 42% of the capitalization of all the markets in the world (Wood 1992, 8).

Thirdly, the stock market in Japan has seen some incredible movements in the recent past: steady increases to an almost unbelievable high followed by steady decreases. (Graphs of the Nikkei and Topix are shown in sections 3.3.2 and 3.3.3 respectively.) This raises questions as to the causes for such changes, interactions of these changes with other aspects of the economy, and the possibility of predicting such changes. Analysis will help answer these questions.

One relationship known to exist in many countries is a relationship between the stock market and the land market. There have been problems, however, establishing the existence of such a relationship in Japan due to difficulties in obtaining an accurate and timely measure of land prices.

1.3 Why would it be believed that stock prices and land prices are related?

Land prices in Japan have seen the same general trend as the stock market: a period of steady increase followed by one of steady decrease. One explanation for this apparent relationship is that a large share of the wealth of a Japanese company may be found in the value of that company's land assets. Thus, as the value of the land holdings changes, so, too, does the value of the company.

The values of stocks and land are further related through the Japanese banking system.
This relationship is interesting but complicated, and to do its explanation justice would mean too much of a diversion from the main concerns of this paper; however, a brief exposition is in order.

Japanese banks have an interest in the price of land since a large number of loans collateralized by property have been made in the past. The banks' stake in land prices is increased by the many loans made to the property and construction sectors. As a result of these connections, a significant decline in the value of property translates into a worrisome decrease in bank security. Although currently changing for the better, the accounting practices followed by the banks and the prevailing financial culture in Japan both obscure and compound the dilemma.

Bank interests and the value of the stock market are also related. The stock market is obviously affected by banks since banks comprise 25% of the market (Wood 1992, 10). Banks are also affected by stock market movements since Japanese banks have large stock holdings and include in their capital the profits and losses due to stock market movements. These profits and losses may be included in a bank's capital whether they are realized or unrealized. This causes even more trouble than one would imagine since Japanese banks must increase their capital in order to meet new capital-adequacy standards agreed to by the Basel-based Bank for International Settlements (Wood 1992, 24-25). This goal will become harder and harder to reach if the stock market continues to decline. In fact, if the Nikkei declines to 12,000 "the capital gains on the huge stock portfolios of the banks ... will be wiped out" (Wood 1992, 9). This does not bode well for the health of Japanese banks.

Thus, both market observations and economic explanations suggest a relationship between stock prices and land prices to be a sensible proposition.
However, this relationship is difficult to explore because of the inadequacy of land price information available for Japan (this will be discussed more fully in section 3.4.1). A substitute for land prices, however, would allow further analysis in this area. The substitute chosen for this study is golf course membership prices, which are a measure of speculative land prices (Ziemba and Schwartz 1992a, 147).

1.4 Why would golf course membership prices be considered?

It has been shown that there is reason to believe speculation on stock and speculation on land are related. As discussed earlier, speculation on land is to be represented in this paper by golf course membership prices. The use of golf course membership prices is not without precedent. Analysis has been done to determine if one of stock or land prices can be used to predict the other (see, for example, Ziemba and Schwartz 1992a). Even The Economist hypothesized about the existence of a relationship between golf courses and the stock market. This paper looks at this question by searching for evidence of a long-term linear relationship between these two variables.

CHAPTER 2 A TIME OF CHANGE

No data analysis can take place in a vacuum. All data must be considered within its existing framework; only then can legitimate conclusions be drawn from analysis of the data. Thus, this chapter contains a short summary of some relevant and interesting facts regarding Japan.

2.1 Dateline

This section provides a few chronological highlights of Japanese stock market history relevant to the data under consideration in this paper.
1949   Tokyo Stock Exchange re-opened after closing at the end of the war; Nikkei had first-day average on May 16 of 176.21 (Viner 1988, 55)
1961   second section added to TSE
1968   on its opening day, 4 July, the Topix was set equal to 100
1969   TSE changed the index it calculates from the Nikkei to the Topix
1970s  market became more open to the Japanese consumer (Sakakibara and Feldman 1990, 39)
1975   the Nikkei Keizai Shimbun Company arranged with the Dow Jones Company to publish the Nikkei-Dow Jones Average (Viner 1988, 81)
1985   Nikkei-Dow Jones Stock Average renamed the Nikkei Stock Average
1986   TSE permitted non-Japanese brokerage firms to become members (Viner 1988, 51)
1989   stock market decline began after peak reached in December (December 10, Topix peak of 2,873.32; December 24, Nikkei peak of 38,915.90)
1990   golf course membership price indices reached peak and began decline (March 4 peak for eastern Japan of 788.58 and for Tokyo of 629.02; March 11 peak for all of Japan of 948.17 and for western Japan of 1,275.57)

2.2 Before the downturn

Japan's land and stock prices increased to phenomenal levels before the downturn. Ziemba and Schwartz (1992b) provide examples and comparisons which do an excellent job of explaining just what these high levels meant. For example, on page 119 they state "essentially half the world's land value at current, 1987-1990, prices is accounted for by Japanese land!"

As long ago as 1961 it was noted that the growth in land prices far exceeded that of commodity prices. For example, by 1947, Tokyo's price index had risen to 1,364 while the nation's land index had risen to 6,658 (with both having a mid-1930s level of 100) (Matsumura 1961, 491). This disproportionate rise in land prices was maintained until this last decade.

Many experts put forward arguments suggesting these high levels experienced were sustainable. However, many arguments were also provided for the opposite point of view!
It is important to note that companies and equities in Japan and North America cannot be directly compared due to some significant differences in the two types of systems. For example, Japan's stock market has more crossholdings than other countries. Before World War II, the Japanese economy was structured around zaibatsu which were "giant clusters of companies from many different industries. Each conglomerate was formally tied together by a common holding company and interlocking directorships" (Wright and Pauli 1987, 32). After the war, anti-monopoly regulations would not allow this, but the zaibatsu re-grouped "through mutual shareholdings: [for example,] instead of having one holding company own 51% or more of the shares in 51 companies, these same firms would buy up 1% of each other" (Bronte 1982, 170). These shares are long-term investments and are not traded. This allows the companies in these new groups, called keiretsu, to maintain the strategies of the zaibatsu, thereby making them incomparable to the companies of most other industrialized countries.

2.3 The downturn

The October 1987 crash was not a major crash for Japanese stocks. In that case, for example, the Topix decreased only 15.8% on October 20 and then partially recouped this loss in a 9% increase the next day (Tse 1991, 94).

This paper uses the word "downturn" to denote the period of time during which the general trend in the stock and land indices changed from long-term upward movement to long-term downward movement. This took place during the end of 1989 and the beginning of 1990. The peaks for the stock indices were in December 1989 (December 24th for the Nikkei and December 10th for the Topix) and the peaks for the golf course membership price indices were in March 1990 (March 4th for eastern Japan and Tokyo, and March 11th for the country's composite and western Japan).
The peaks reached were incredibly high; for example, even though the United States has 25 times more land than Japan (Ziemba and Schwartz 1992b, 119), Wood says that the value at the end of 1989 of all the land in Japan was four times that of America (1992, 50).

2.4 After the downturn

Even shortly after the downturn, land values in Japan were high in comparison with other countries. In 1991, the value of the land of the Imperial Palace in Tokyo was estimated to be equal to the value of all Canadian land (Wood 1992, 50).

The results of the downturn were significant; one need only look at the graphs of the data in Chapter 3 in order to see the upward movement quickly become downward movement. On the one hand, the downturn may be just one more in a long list of events and therefore can be used to better predict relationships which existed before, during, and after the downturn. On the other hand, the downturn may have changed these very relationships. Chapter 11 of this paper compares the relationship between stock and land (a) before and (b) after the downturn and discovers that the latter explanation is the case. There is some evidence of a long-term linear relationship between stock and land before the downturn but this evidence does not exist after the downturn, suggesting the downturn actually changed the relationship between these two variables.

CHAPTER 3 DATA DESCRIPTION

3.1 Introduction

Both of the main categories of data used in this paper, stock market indices and golf course membership price indices, will be discussed in detail in this chapter. These data were obtained from NEEDS (the Nikkei Economic Electronic Database System) and the Nihon Keizai Shimbun, Inc. respectively.

Some other data sources available for Japanese financial data are the Japan Real Estate Institute for land indices, the Daiwa Securities Co., Ltd.
used by Hamao (1991); the Nissho Monthly Stock Price file and Nissho Monthly Stock Returns file used in Kato and Schallheim (1990); and numerous others described in Roehl (1985). Also, a particularly useful source is the Directory of Japanese Databases 1990 prepared by the U.S. Department of Commerce; this gives in-depth descriptions of Japanese databases accessible in the United States, including the subjects covered, language used, how to access the databases, and more.

3.2 Choosing the time interval of the data

The time frame chosen for an analysis should be just long enough to capture the variation of interest. Too long a time frame would overlook important information; for example, analysis of weekly data would give no information regarding daily variation. On the other hand, having a time frame which is too short means much extra information is included which may obscure the variation of interest; for example, analysis of daily data would make biannual variation difficult to detect.

Weekly measures are used in this paper. It is felt that a longer time frame would not capture the movements of the indices to an adequate degree, and a shorter time frame would unnecessarily complicate matters by including the confusion of the many anomalies which are accepted as being present in stock market data.

Kato, Ziemba, and Schwartz (1990) document their own and other studies which indicate that stock market indices calculated throughout the day and on a daily basis have some variability which is predictable. The intra-day effect refers to the stock market, on average, having patterns of changes over the duration of the day. For example, all days of the week other than Tuesday show an increase in the first part of the day. Kato, Ziemba, and Schwartz also show that the average percent change in daily stock market indices of the TSE is different depending on the day of the week; this is called the day of the week effect.
For example, on average, losses are experienced on Mondays and Tuesdays and gains on Wednesdays. The use of weekly data avoids the many complications these anomalies would create for analysis.

Patterns in stock market data have also been investigated for longer time frames (see Chapter Four of Ziemba and Schwartz (1992b) for detailed descriptions and references): the monthly effect, in which certain months of the year have consistently higher or lower returns than others; the January effect, in which higher returns are found in January than in other months; the holiday effect, in which the period before a holiday has higher returns than other days; and the turn of the month effect, in which predictable patterns emerge at the end and beginning of each month. The effect of these types of anomalies on the weekly data to be analyzed in this paper is evaluated in Chapter 6.

Time frames longer than a week (such as monthly, quarterly, or yearly measures) would show a more limited picture of the quickly changing Japanese stock market than is desired for the purposes of this analysis. Weekly measures were chosen for the stock market indices used in this analysis in order to avoid the majority of the anomalies found in analysis of smaller stock market time frames while still capturing the movement of the stock market sufficiently.

The same degree of research has not been done regarding anomalies in the golf course membership price indices, and so no comments are made in this area. However, weekly data are again considered the most appropriate measure to capture the desired variability in golf course membership prices.

3.3 Stock market indices

3.3.1 Background

There are eight stock exchanges in Japan: Tokyo, Osaka, Nagoya, Kyoto, Hiroshima, Fukuoka, Niigata, and Sapporo.
The Tokyo Stock Exchange (TSE) is, without a doubt, the largest and is accepted as being representative of stock movements in Japan; for example, in the late 1980s the TSE accounted for more than 95% of the total Japanese market value (Hamao 1989, 21). There are many methods by which the TSE may be measured; the most recognized of these are the Nikkei and the Topix, which are described later in this chapter (and are referred to as a group in this paper as simply the stock indices).

The Tokyo Stock Exchange is divided into the First Section (TSE-I or ichibu) and the Second Section (TSE-II or nibu). The TSE-I includes companies which are large and which have a long trading history. The TSE-II includes newly listed stocks and smaller companies.

3.3.2 Nikkei

The Nikkei 225 Index or Nikkei 225 Stock Average (referred to in this paper as simply the Nikkei) is a share price-weighted index calculated by adding the prices of 225 prespecified stocks traded on the first section of the TSE and "dividing by a divisor that changes over time due to stock splits, rights issues, etc." (Schwartz and Ziemba 1991, 23-24).

The weekly Nikkei was taken from NEEDS (Nikkei Economic Electronic Database System). This weekly index is based on the daily close figures on the last trading day of the week of the TSE. The last trading day of the week is now normally Friday, although in the past the TSE did trade on selected Saturdays. It is not necessarily a Friday, however; for example, the Wednesday close figure would be used if the market was not open for trading on Thursday and Friday due to holidays.

The actual data series used in this analysis starts on 31 July 1983 and ends on 11 April 1993 (507 observations), and is shown in figure 1. Although the dates as given by NEEDS are somewhat counter-intuitive, they are used in this paper in order to avoid confusion for readers who compare these results with other studies using the NEEDS data.
All the dates given are Sundays. As described above, the closing values from the last day of trading for the week are used to calculate the index. This index is then recorded against the Sunday immediately preceding that trading week. For example, the Nikkei at the close of trading on Friday, 2 October 1992 was 17,324.07; this value is the weekly index given for the previous Sunday, 27 September 1992.

Figure 1. Nikkei.

The data series is missing the index for the date 28 December 1986 because the TSE was closed for the entire week of Sunday, 28 December 1986 through Sunday, 4 January 1987. The average of the week before and the week after was substituted: the 21 December 1986 index was 18701.30 and the 4 January 1987 index was 18810.36, giving an average of 18755.83 for 28 December 1986.

Some concerns regarding the use of the Nikkei as a measure of the TSE are:

(1) The 225 stocks used in the calculations were chosen at the inception of the Nikkei during the mid-1950s and thus the Nikkei reflects the composition of the economy at that time, not the present economy (Bronte 1982).

(2) The Nikkei does not accurately reflect the whole market due to the small number of stocks included in its calculation (Schwartz and Ziemba 1991, 24).

(3) Capitalization of individual firms is not considered in the Nikkei, so that a sharp fluctuation in the share price of a small firm can suggest the whole market is moving (Bronte 1982).

These objections to the Nikkei can be avoided in the calculation of other measures. However, no calculation method can meet every criterion, so there will always be objections to any measure proposed.

3.3.3 Topix

The "Topix" refers to the Tokyo Stock Price Index and is calculated by the Tokyo Stock Exchange.
It is calculated as a value-weighted average of all the stocks on the first section of the TSE, adjusted to take into account "any corporate activities that affect the current market value other than price changes, such as new listings, assignment of stocks from the second section to the first section and vice versa, [etc.]" (Schwartz and Ziemba 1991, 24). The adjustment is not affected by corporate decisions "that entail no change in the market value of shares of the company" (Schwartz and Ziemba 1991, 27) such as stock splits, stock dividends, etc. The Topix was set at 100 on 4 January 1968. Further detail on the Topix may be found in Chapter 3 of Jonathan Isaacs' book Japanese Equities Markets (1990).

The Topix data were obtained from NEEDS. This weekly index is based on averages of daily closing prices on the TSE over the week. The actual data series used in this analysis starts on 31 July 1983 and ends on 11 April 1993 (507 observations), and is shown in figure 2. Again, this paper uses the dates as given by NEEDS in order to avoid confusion, even though these dates are somewhat counter-intuitive. As with the Nikkei, the Topix dates are all Sundays, and the index given for a Sunday is the average daily close for the following week. For example, the weekly figure given for 3 January 1993 is 1296.72; this is the average of the daily close figures for Monday, 4 January through Friday, 8 January (1305.81, 1298.13, 1291.87, 1298.25, and 1289.52).

The Topix data series is missing the index for the date 28 December 1986 because the TSE was closed for the entire week of Sunday, 28 December 1986 through Sunday, 4 January 1987. The average of the week before and the week after was substituted: the 21 December 1986 index was 1561.51 and the 4 January 1987 index was 1584.32, giving an average of 1572.92 for 28 December 1986.

Figure 2. Topix.

The Topix was established in order to make up for some of the faults in the Nikkei.
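The weekly Topix average and the substitution used for the missing week of 28 December 1986 (for both the Topix and the Nikkei) can be checked with a short sketch. The function names are illustrative, and rounding halves upward to two decimal places is an assumption about how the published figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_two_places(value):
    """Round to two decimals, halves upward (an assumed rounding convention)."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def weekly_average(daily_closes):
    """Weekly index as the mean of the daily closing values in that week."""
    return to_two_places(sum(daily_closes) / len(daily_closes))


def fill_missing_week(week_before, week_after):
    """Substitute a missing week with the mean of its two neighbouring weeks."""
    return to_two_places((week_before + week_after) / 2)


# Daily Topix closes, Monday 4 January to Friday 8 January 1993 (from the text).
closes = [Decimal(v) for v in ("1305.81", "1298.13", "1291.87", "1298.25", "1289.52")]

topix_week = weekly_average(closes)  # recorded against Sunday 3 January 1993
topix_gap = fill_missing_week(Decimal("1561.51"), Decimal("1584.32"))
nikkei_gap = fill_missing_week(Decimal("18701.30"), Decimal("18810.36"))

print(topix_week, topix_gap, nikkei_gap)
```

Decimal arithmetic is used here only so that the two-decimal rounding is exact; the figures reproduce the 1296.72, 1572.92, and 18755.83 quoted in the text.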
However, the Topix is subject to distortions when high priced issues sharply fluctuate (Bronte 1982). Again, no index will be considered perfect by everybody, and thus the researcher must choose the most appropriate one. A knowledge of the strong and weak points of the available indices means this choice will be an informed one.

3.3.4 Nikkei vs. Topix

Choosing between the Nikkei and the Topix must be done based on the individual's personal preferences and the use to which the data will be put. The "most widely used market indicator for the TSE is the Nikkei Stock Average based on 225 issues" (Tse 1991, 94). The calculation of the Topix, on the other hand, uses a larger number of stocks and is weighted according to capitalization, and is therefore considered more representative of the whole market (Viner 1988, 82); in fact, the Tokyo Stock Exchange itself now uses the Topix. Both indices are used in this paper in order to provide the most complete analysis possible and to allow some comparison between the two indices.

3.4 Golf course membership price indices

As discussed in Chapter 1, it is reasonable to speculate on a relationship between the stock market and the land market. Since no appropriate measure of land prices is available for Japan, a measure of speculative land prices (a golf course membership price index) is used in this analysis.

3.4.1 The problems with land price measurement

Once it is decided that land prices are of interest, one must decide what measurement of this variable is appropriate. Would the price determined by land traded be appropriate? The price of land based on actual or imputed rent paid? The price determined by government assessment? Each of these methods will be shown to be unacceptable for the weekly analysis to be done in this paper.

One reason the price determined by land traded is unsatisfactory is that the amount of land traded in Japan is small compared to the amount of land owned.
A partial explanation of this lies in Japanese culture, which relates personal status with possession of land (Wood 1992, 49). For example, the capital finance account of the national accounts shows that in the 1985 calendar year net transactions of land in the household sector were less than one percent of the land held (Noguchi 1990, 49). Based on table 1, from Noguchi (1990, 49), the percentage of land traded based on area is equally small in other sectors:

Table 1. Total land area and annually traded land area by use.

Use                                    Area (a)      Annually traded area (b)   b/a
                                       (10,000ha)    (10,000ha)                 (%)
Japan total - all uses                 3,778         22.5                       0.60
Three metropolitan regions               393          2.3                       0.58
Other regions                          3,385         20.3                       0.60
Residential use                           94          2.1                       2.23
Houses other than residential areas       57          0.95                      1.67
Farm land                                549          7.3                       1.33
Forests                                2,529          9.6                       0.38

Source: National Land Agency, Kokudo Riyo Hakusho (Land utilization White Paper), 1987.

The price of land based on actual or imputed rent paid is not suitable for this analysis because in Japan it is not representative of land value. Tradition has limited rent increases to 5% per year (according to the article "Unfilled" in the July 11-17, 1992, issue of The Economist). This means that data based on rents would not be representative of the actual value of the land, particularly during times of considerable land value change such as has been experienced in recent times. For example, real urban residential land values in Japan have risen 3000% since 1952 while real housing rents increased only 200% to 400% (Rose 1990, 3).

The price determined by government assessment is also not an acceptable measurement of land value for the weekly analysis to be done in this paper. The National Land Agency publishes land indices biannually which are calculated using actual sales prices in over 25,000 sites throughout Japan (Wood 1992, 51). The illiquidity of the Japanese land market makes one question how well this small number of sales represents the value of land in general.
How representative can they be considering that "when there are not transactions, the local assessor simply reports no price change" (Wood 1992, 51)? Another area of concern is timeliness. Due to the time over which these values are collected, calculated, and published, the resulting indices are considered "notoriously lagging" (Wood 1992, 51) and ineffectual. These problems indicate that the published biannual indices would be of questionable value in this analysis of the quickly changing stock market, for which it has already been established that weekly intervals are desirable.

Finding no appropriate measure of land values does not mean this analysis has reached a dead end. All that is necessary is a suitable proxy which adequately represents the movement in land values. This paper uses a measure of speculative land: golf course membership price indices.

3.4.2 Golf course membership price indices

Golf course membership price indices may be considered a good substitute for the price of land for many reasons. Since golf course memberships are much less expensive than land, many more transactions take place. The secondary trading of these memberships "constitutes an active market in Japan ... [in which they] are traded like securities" (Wood 1992, 60). This means golf course membership prices are relatively easy to estimate. Golf course memberships should not be considered an insubstantial market force; before the downturn, the value of golf courses in Japan (according to Dr William T. Ziemba) was more than that of the Australian stock market!

Nihon Keizai Shimbun, Inc. provides indices for golf course membership prices. This paper looks at four such indices, which are referred to as a group in this paper as simply the golf indices. The calculations for the "Golf - east" index include the administrative divisions of Tokyo, Kanagawa, Chiba, Ibaraki, Shizuoka, Saitama, Tochigi, Gunma, and Aichi.
The calculations for the "Golf - west" index include the administrative divisions of Gifu, Mie, Osaka, Hyogo, Kyoto, Shiga, Nara, Wakayama, Yamaguchi, and Kyushu. The "Golf - all" index is a composite for the country and is calculated using all the administrative divisions listed for east and west. Tokyo is the center for economic activity in Japan; thus the index "Golf - tokyo" is included in this analysis. These four data streams are graphed in figures 3 through 6. All four indices are shown again in figure 7 (using the same scale on the y-axis) for comparative purposes.

Figure 3. Golf - all.

Figure 4. Golf - east.

Figure 5. Golf - west.

Figure 6. Golf - tokyo.

Figure 7. Golf - all four indices.

All four golf indices used in this analysis are weekly time series starting on 10 January 1982 and ending on 11 April 1993 (588 observations). The indices are calculated using the following equation:

$$\text{index} = \frac{\sum \left( \dfrac{\text{bid} + \text{asked}}{2} \right)}{n}$$

(The average of the price bid and the price asked is calculated for each golf course; these averages are summed over all n golf courses to be included; and the result is then divided by n.)

All four golf indices are missing data for the following dates: 2 January 1983, 1 January 1984, 8 January 1984, 12 December 1984, 6 January 1985, 29 December 1985, 5 January 1986, 28 December 1986, 4 January 1987, 3 January 1988, 1 May 1988, 31 December 1988, 30 December 1989, 17 March 1990, 5 May 1990, 29 December 1990, 4 January 1992, 2 January 1993. These missing values were replaced by the average of the index for the week before and the week after the missing value.

CHAPTER 4 DO OUTLIERS EXIST?

4.1 What will happen during this analysis?
Chapters 4 through 11 describe the analysis done and will usually follow the same general format. While this format may seem fractured, its purpose is to clearly separate each step in the analytical process. This will both make these analytical steps easy to follow and allow readers the discretion to determine which aspects of the analysis are of interest to them.

The first section, Main results and conclusions, gives a short statement summarizing the results of the analysis conducted in the chapter. The second section, Background, provides some basic theory and reminders of issues relevant to the analysis to take place in the chapter. The third section, Actions taken, lists the specific steps taken in the analysis. Detailed analysis, the fourth section, gives the numerical outcome of the analytical steps taken. These numerical results are then summarized and described in the fifth section of each chapter, Description of analysis. The sixth and final section of each chapter is the Evaluative discussion, in which the writer's interpretation, comments, and conclusions are given.

The analysis starts by getting some of the usual calculations out of the way. The data are checked for outliers and seasonality. The issue of transforming to stationarity is also addressed. Chapter 8 then describes how this and other information is used to select the methods best able to address the issues in question in this paper. Then, in Chapters 9 to 11, these methods are carried out and progressively more is discovered about the data. The final chapter, Chapter 12, draws together the most interesting and important results, conclusions, problems, and limitations of the entire paper.

4.2 Main results and conclusions

None of the data streams has easily identified outliers, thus no observations are excluded from further analysis.
4.3 Background

Outliers are observations which are excluded from the analysis by reason of their being non-repetitive deviations from the general pattern of interest.

One way of looking at a time series is to assume that some underlying process is generating the numbers. The time series seen in reality, then, is a sample of this process. The purpose of analysis is to determine a reasonable model which approximates the underlying process in such a way as to provide some insight into the sample and its structure. Looking at the world in this manner means that outliers would be deviations in the sample which are caused by events not related to the underlying process. Detection and examination of possible outliers will hopefully allow the analyst to determine the cause of the deviation and thereby determine the appropriateness of its inclusion in the sample. Removal of true outliers from the analysis allows the analyst to have a more accurate sample of the underlying process and thereby increases the chances of closely modelling the situation. Some examples of causes for outlying observations would be incorrect measurement, transcription error, or arithmetic error.

Outliers can have a substantial effect on many different aspects of data analysis. Outliers can have a large impact on any regression done, for example, since any observation which is greatly different from the rest of the sample will be an influential observation in least squares calculations.

Many problems exist in any attempt to determine the existence of outliers. Some analysts believe that all outliers have to be determined before any analysis is done. They believe that if this is not the case, the outliers will be chosen in a manner determined by the analysis already done and will therefore be influenced by any prejudice the analyst may have as to the results desired (i.e., the choosing of outliers would not be impartial).
Other analysts maintain that it is inevitable that the analyst will become more familiar with the data as the analysis progresses and, in fact, the analysis itself may highlight the existence of outliers. The problem with this, of course, is that observations perceived to be outliers are examined in depth while other observations receive little or no examination. The outcome is often a bias in which the arguments for discarding outliers are accepted simply because the observations appear to be outliers. Fisher (1962) provides an interesting discussion of these and related aspects of data manipulation.

4.4 Actions taken

Two techniques were used to determine the presence of outliers. First, an examination was made of the graphs of the data looking for any sudden gap or change from the general shape of the data. Second, the data were checked for unusually large or small indices; the probability is small that an observation will fall outside the range of three standard deviations from the mean (roughly 0.3% if the data are normally distributed), thus any observations which fall outside this range are possible outliers.

4.5 Detailed analysis

Table 2. Calculations used in determining outliers; part one.

Series          Mean         s           Mean - 3s     Mean + 3s
Nikkei          21,113.30    7,918.77    -2,643.01     44,869.61
Topix            1,633.28      597.41      -158.95      3,425.51
Golf - all         386.04      224.69      -288.03      1,060.11
Golf - east        353.75      192.69      -224.32        931.82
Golf - west        447.03      300.34      -453.99      1,348.05
Golf - tokyo       299.41      160.98      -183.53        782.35

where the mean is the mean of the sample and s is the standard deviation of the sample.

Table 3. Calculations used in determining outliers; part two.

Series          Minimum                Maximum                  Observations
                Date       Index       Date       Index         outside range?
Nikkei          83/08/07   8,920.8     89/12/24   38,915.90     no
Topix           83/08/07     657.3     89/12/10    2,873.32     no
Golf - all      82/01/10     100       90/03/11      948.17     no
Golf - east     82/01/10     100       90/03/04      788.58     no
Golf - west     82/01/10     100       90/03/11    1,275.57     no
Golf - tokyo    82/01/24      99.747   90/03/04      629.02     no

4.6 Description of analysis

The graphs of the series may be found in Chapter 3.
The Nikkei (figure 1) and Topix (figure 2) show extremely similar movements. There is a general upward movement until December 1989, followed by a general downward movement. Throughout these movements, however, there are smaller periods of upward and downward motion. By the end of the time period involved in this analysis (April 1993) brief upward activity is showing.

The graph of all four golf indices together (figure 7) highlights the similarities of the golf indices. There are numerous years of little action followed by a quick rise to a peak in the beginning of 1987. Some of the height of this peak is lost and then there is another period of quick upward motion leading to a higher peak in March of 1990. This second peak is followed by a fairly consistent period of decrease. One exception to this general pattern is Golf - west, which has only one peak (coinciding with the second peaks of the other series).

Table 2 calculates a range bounded by three sample standard deviations below the sample mean and three sample standard deviations above the sample mean. Table 3 lists the minimum and maximum values of each series; none of these fall outside the range of three sample standard deviations determined in table 2.

4.7 Evaluative discussion

The graphs of the Nikkei and Topix show numerous fluctuations and reversals of direction (i.e., upwards vs. downwards). Outliers are unusual, non-repetitive events; thus it is the very prevalence of these fluctuations and reversals which disqualifies them as outliers.

The four golf indices do not show the same degree of fluctuation, although the two peak pattern exists. One point of this analysis is to determine how closely the stock and golf indices are related, particularly during an unusual pattern such as the two peaks. Removing either of these two peaks is therefore counterproductive to the aims of the analysis. Thus, inspection of the graphs of the data indicates no outliers exist.
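The three standard deviation check can be reproduced from the summary figures alone. The following is a minimal sketch using the means, standard deviations, minima, and maxima reported in tables 2 and 3; the function and variable names are illustrative and are not taken from the original analysis.

```python
# Per series: (sample mean, sample standard deviation, minimum, maximum),
# copied from tables 2 and 3.
summary = {
    "Nikkei":       (21113.30, 7918.77, 8920.8, 38915.90),
    "Topix":        (1633.28, 597.41, 657.3, 2873.32),
    "Golf - all":   (386.04, 224.69, 100.0, 948.17),
    "Golf - east":  (353.75, 192.69, 100.0, 788.58),
    "Golf - west":  (447.03, 300.34, 100.0, 1275.57),
    "Golf - tokyo": (299.41, 160.98, 99.747, 629.02),
}


def three_sigma_bounds(mean, sd):
    """Range from three standard deviations below to three above the mean."""
    return mean - 3 * sd, mean + 3 * sd


def has_value_outside(mean, sd, minimum, maximum):
    """True if the series minimum or maximum lies outside the three-sigma range."""
    lower, upper = three_sigma_bounds(mean, sd)
    return minimum < lower or maximum > upper


flags = {name: has_value_outside(*values) for name, values in summary.items()}

for name, (mean, sd, _, _) in summary.items():
    lower, upper = three_sigma_bounds(mean, sd)
    print(f"{name:<13} {lower:>10.2f} {upper:>10.2f}  outside range? {flags[name]}")
```

Because only the extremes of each series need to be compared with the bounds, the check does not require the full data; every flag is False, in agreement with the last column of table 3.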
It is interesting to note that by the end of the time period involved in this analysis (April 1993) a brief upward movement in both stock indices can be seen, while no such movement is apparent in the golf indices. Is this to be one of the short-term fluctuations so prevalent in these series, or is it the beginning of a longer term recovery in the market? Only time will be able to answer this question.

As can be seen from tables 2 and 3, no observations fall outside three sample standard deviations from the sample mean, indicating there are no observations unusually large or small enough to indicate the existence of outliers.

Neither method indicates the presence of outliers, which is not surprising considering the series involved. The indices are calculated using well regulated formulae which have not been changed since their inception, thus there are no outliers due to changes in the method of collection or calculation. The data are also calculated and carefully watched by the many organizations which depend heavily on them, thereby providing yet another level of accuracy checks.

I feel it is better to err on the side of caution when deciding which observations to exclude as outliers. This paper's intention is to gain a better understanding of the similarities of stock and golf prices and their relationship. It is the unusual aspects of the series which are the most interesting; removing these observations would defeat the purpose of the analysis and make the conclusion of no relationship something of a self-fulfilling prophecy.

CHAPTER 5 WHAT CAN BE SAID OF THE DISTRIBUTIONS OF THE DATA?

5.1 Main results and conclusions

Similarities exist both within the stock indices and within the golf indices. The patterns and distributions of the data are consistent, and the transformations investigated do not alter this consistency.
This indicates that, according to this level of analysis, the data streams within each set (i.e., within the set of stock indices and within the set of golf indices) seem to be capturing basically the same information. At this point there is little differentiation between the two stock time series and between three of the golf time series (with Golf - west being the exception). However, as the analysis continues in later chapters these similarities will be further investigated and found to be somewhat superficial and misleading.

The Nikkei and the Topix follow the same general pattern, and the distributions of these data streams and their logarithms show no coherent pattern. Taking the first differences of both the original data and their logarithmic transformations creates similar unimodal distributions.

Graphs of both the golf data and their logarithms exhibit similar patterns, except that Golf - west has only one peak as compared to the two peaks found in the other data sets. The histograms of the golf data and their logarithms show a concentration at the lower values and a multi-modal structure. Taking first differences transforms the distribution of the data (and their logarithms) to be unimodal with long tails.

As described below, a simplifying assumption used frequently in statistical analysis is that the data are normally distributed, or that the data approximate a normal distribution. The analysis in this chapter identifies the transformations of the data which satisfy this assumption most closely. The first differences and the first differences of the logarithm of both the Nikkei and the Topix do reasonable jobs of transforming these data to approximations of the normal distribution, with the first differences of the logarithm of the Topix giving the best approximation to normality. Of the four golf data sets (all, east, west, and tokyo), only the first differences of the logarithm of Golf - west approximate the normal distribution.
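The four versions of each series compared in this chapter (the original data, its first differences, its logarithm, and the first differences of its logarithm), together with the points of a normal probability plot, can be sketched as follows. The index values are hypothetical and for illustration only, and the normal scores use Blom's approximation, which is an assumption here; the software used for the plots in this chapter may compute its scores differently.

```python
import math
from statistics import NormalDist


def first_differences(series):
    """Change from each observation to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def log_series(series):
    """Natural logarithm of each observation."""
    return [math.log(value) for value in series]


def normal_scores(n):
    """Approximate expected standard normal order statistics (Blom's formula)."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def probability_plot_points(series):
    """Pairs of (normal score, ordered observation) for a normal probability plot."""
    return list(zip(normal_scores(len(series)), sorted(series)))


# Hypothetical weekly index values (not data from this paper).
index = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 115.0]

candidates = {
    "original": index,
    "first differences": first_differences(index),
    "logarithm": log_series(index),
    "first differences of logarithm": first_differences(log_series(index)),
}

plot_points = {name: probability_plot_points(values) for name, values in candidates.items()}
```

If a candidate series is close to normal, its plot points fall close to a straight line; judging that straightness remains a visual decision, as in the rest of this chapter.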
5.2 Background

Many statistical techniques operate under the assumption that the population from which the sample (i.e., index) is drawn has a normal distribution, which implies that the sample data themselves should have, or should at least approximate, a normal distribution.

An important aspect of the assumption of normality is that if data are normally distributed the first two moments of the data (i.e., mean, variance, and covariances) are all that is needed to completely describe the data. This means that when discussing the stationarity of data (in Chapter 7), only the first two moments need be considered for those data streams found to approximate a normal distribution.

The second important aspect of normality is that in the presence of normality, data which are uncorrelated are also independent. (Two time series which are uncorrelated have no linear relationship, while two data streams which are independent have no relationship at all. Correlation is easy to determine, but independence is not; in the presence of normality, however, independence can be shown simply by showing the lack of correlation.) The main area of this analysis which uses this aspect of normality is the autoregressive integrated moving average (ARIMA) methodology (Chapter 9). As will be described in more detail in Chapter 9, the presence of normality means that the ARIMA methodology may be considered appropriate for most stationary series, making the search for a more appropriate model unnecessary.

There are two visual methods to assist the analyst in deciding whether data may be considered to approximate a normal distribution. The first is to compare the histogram of the data to that of a normal distribution and decide whether the histogram of the data closely enough approximates the normal bell-shaped curve.
The second method is a normal probability plot, in which the sample (i.e., the data to be analyzed) is plotted against the values one would get, on average, by drawing a sample of the same size from a normal distribution (with a mean of zero and a standard deviation of 1). The resulting normal probability plot would show a straight line if the sample is normally distributed. Thus, the analyst must determine if the normal probability plot has a line straight enough to convince him or her that the assumption of normality is appropriate. If the sample is considered nonnormal, it may be possible to transform the data to normality.

5.2.1 The purpose of the logarithmic transformation

A transformation frequently used, particularly with economic data, is taking the natural logarithm (to the base e) of the data. Taking the logarithm allows relative changes to be compared over a wide spectrum of levels in the data. If, for example, a rich professor with a salary of $90,000 wins a lottery in which the prize is 30% of the winner's salary, the professor would then have a total income for the year of $117,000. If, on the other hand, a poor student with an annual salary of $15,000 wins the same lottery, the additional money received would be $4,500.

There are two ways to think of these two prizes. One could consider the winnings of $27,000 and $4,500 and decide they are very dissimilar; $22,500 is no small difference! In this case, any analysis of this example data would use the original data with no logarithmic transformation. On the other hand, it may be that the two amounts are considered equivalent since the relative worth of the money may be the same for both women: a relative addition of 30%. Taking the logarithm of the data compensates for the different amounts received and makes the prizes equal by dealing with them as relative changes (i.e., the 30% of income won in the lottery).
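The arithmetic of the lottery example can be checked directly. This is a small sketch in which the variable names are illustrative; it compares the change in income in dollars with the change in the natural logarithm of income.

```python
import math

salaries = {"professor": 90_000, "student": 15_000}
prize_rate = 0.30  # the prize is 30% of the winner's salary

results = {}
for person, salary in salaries.items():
    total_income = salary * (1 + prize_rate)
    dollar_change = total_income - salary
    log_change = math.log(total_income) - math.log(salary)
    results[person] = (dollar_change, log_change)
    print(f"{person}: dollar change = {dollar_change:.0f}, log change = {log_change:.4f}")
```

The dollar changes differ by $22,500, while the change in the logarithm is the same for both winners, namely ln(1.3), or about 0.262. This is the sense in which the logarithmic transformation treats the two prizes as equal.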
If this interpretation is accepted, it would be suitable to use the logarithmic transformation of the data for analysis of this example. 5.3 Actions taken For each of the data series of interest, both its histogram and its normal probability plot are examined. The "data series of interest" include the original data, the first differences of the original data, the logarithm of the original data, and the first differences of the logarithm of the original data. The logarithmic transformation is examined since this transformation is so prevalent in economic research. The first differences of both the original data and the logarithm of the original data are investigated, since these transformations are useful when considering stationarity, as will be discovered at a later stage in this analysis (Chapter 7). 43 5.4 Detailed analysis In the histograms which follow, each point on the graph may represent several observations. For example, the histogram of the Nikkei in figure 8 includes the phrase "each point represents 2 observations." In this case, a "set" of observations would contain two observations and would be indicated by one point in the graph. Each histogram may have a similar statement; if no such statement exists then each point represents one observation. Within the normal probability plots, an asterisk (*) represents one observation, any numeral between two and nine indicates the number of observations being represented by that numeral (i.e., a "5" represents five observations), and an addition sign (+) indicates ten or more observations are being represented. 5.4.1 Nikkei Each point represents 2 observations +12000 + 18000 + 24000 30000 + 36000 42000 Figure 8. Histogram of the Nikkei. 40000+ Nikkei 30000+ 20000+ *222* * * *542 6875 9+4 + 5 + + 3 + +9 5++9 + + 97 + + + ++2 59 3 +++ + *222345678+9 10000+ -2.4 -1. 2 0. 0 1. Normal Probability Score Figure 9. Normal probability plot of the Nikkei. 
Figure 10. Histogram of the first differences of the Nikkei. (Each point represents 4 observations.)
Figure 11. Normal probability plot of the first differences of the Nikkei.
Figure 12. Histogram of the logarithm of the Nikkei. (Each point represents 2 observations.)
Figure 13. Normal probability plot of the logarithm of the Nikkei.
Figure 14. Histogram of the first differences of the logarithm of the Nikkei. (Each point represents 4 observations.)
Figure 15. Normal probability plot of the first differences of the logarithm of the Nikkei.

5.4.2 Topix
Figure 16. Histogram of the Topix.
Figure 17. Normal probability plot of the Topix.
Figure 18. Histogram of the first differences of the Topix. (Each point represents 5 observations.)
Figure 19. Normal probability plot of the first differences of the Topix.
Figure 20. Histogram of the logarithm of the Topix. (Each point represents 2 observations.)
Figure 21. Normal probability plot of the logarithm of the Topix.
Figure 22. Histogram of the first differences of the logarithm of the Topix. (Each point represents 5 observations.)
Figure 23. Normal probability plot of the first differences of the logarithm of the Topix.

5.4.3 Golf - all
Figure 24. Histogram of Golf - all. (Each point represents 3 observations.)
Figure 25. Normal probability plot of Golf - all.
Figure 26. Histogram of the first differences of Golf - all. (Each point represents 10 observations.)
Figure 27. Normal probability plot of the first differences of Golf - all.
Figure 28. Histogram of the logarithm of Golf - all. (Each point represents 5 observations.)
Figure 29. Normal probability plot of the logarithm of Golf - all.
Figure 30. Histogram of the first differences of the logarithm of Golf - all. (Each point represents 12 observations.)
Figure 31. Normal probability plot of the first differences of the logarithm of Golf - all.

5.4.4 Golf - east
Figure 32. Histogram of Golf - east. (Each point represents 4 observations.)
Figure 33. Normal probability plot of Golf - east.
Figure 34. Histogram of the first differences of Golf - east. (Each point represents 12 observations.)
Figure 35. Normal probability plot of the first differences of Golf - east.
Figure 36. Histogram of the logarithm of Golf - east. (Each point represents 3 observations.)
Figure 37. Normal probability plot of the logarithm of Golf - east.
Figure 38. Histogram of the first differences of the logarithm of Golf - east. (Each point represents 10 observations.)
Figure 39. Normal probability plot of the first differences of the logarithm of Golf - east.

5.4.5 Golf - west
Figure 40. Histogram of Golf - west. (Each point represents 6 observations.)
Figure 41. Normal probability plot of Golf - west.
Figure 42. Histogram of the first differences of Golf - west. (Each point represents 10 observations.)
Figure 43. Normal probability plot of the first differences of Golf - west.
Figure 44. Histogram of the logarithm of Golf - west. (Each point represents 3 observations.)
Figure 45. Normal probability plot of the logarithm of Golf - west.
Figure 46. Histogram of the first difference of the logarithm of Golf - west. (Each point represents 6 observations.)
Figure 47. Normal probability plot of the first differences of the logarithm of Golf - west.

5.4.6 Golf - tokyo
Figure 48. Histogram of Golf - tokyo. (Each point represents 6 observations.)
Figure 49. Normal probability plot of Golf - tokyo.
Figure 50. Histogram of the first differences of Golf - tokyo. (Each point represents 12 observations.)
Figure 51. Normal probability plot of the first differences of Golf - tokyo.
Figure 52. Histogram of the logarithm of Golf - tokyo. (Each point represents 3 observations.)
Figure 53. Normal probability plot of the logarithm of Golf - tokyo.
Figure 54. Histogram of the first differences of the logarithm of Golf - tokyo. (Each point represents 9 observations.)
Figure 55. Normal probability plot of the first differences of the logarithm of Golf - tokyo.

5.5 Description of analysis
(For this section, the logarithmic transformation will be referred to as the "log" and the normal probability plot will be referred to as the "npp.")

5.5.1 Nikkei
The histogram of the Nikkei (figure 8) has numerous humps and seems higher in the lower half than in the upper half. Its npp (figure 9) has curvature at both ends. The first differences' histogram (figure 10) has only one mound and has long tails, with the lower tail being longer than the upper one. The npp (figure 11) has an almost straight line except for an upward hump in the first half. When the log is taken of the original data, the histogram (figure 12) has numerous mounds and the npp (figure 13) again shows curvature at both ends. The histogram (figure 14) of the first differences of the log of the data has one mound with long tails.
Its npp (figure 15) has a line which is not quite straight due to an upward hump in the first half.

5.5.2 Topix
The histogram of the Topix (figure 16) shows no coherent shape. Its npp (figure 17) has curvature at both ends. The first differences of the Topix have a histogram (figure 18) which is a single mound with long tails. The associated npp (figure 19) resembles a somewhat straight line with a hump in the lower half. The log of the Topix has a histogram (figure 20) with many mounds and an npp (figure 21) with curvature at both ends. The histogram of the first differences of the log of the Topix (figure 22) has one mound and heavy tails. There is one point (i.e., a set of five observations) which is separated from the others to the right of the histogram. The associated npp (figure 23) is approximately straight, again with one set not following this general pattern.

5.5.3 Golf - all
The histogram of Golf - all (figure 24) has two humps, one at the lowest values and one at the middle. There is no tail for the lower values and a long tail for the upper values. Its npp (figure 25) has a long section at the lower values which is parallel to the x-axis and has some curvature at the higher values. The histogram of the first differences of Golf - all (figure 26) has a single thin mound and long tails, with its right tail more drawn out than its left. The associated npp (figure 27) shows a straight line except that the middle portion has a downward hump. The log of Golf - all has a bimodal distribution in its histogram (figure 28) and an elongated "S"-shape in its npp (figure 29). The first differences of the log of Golf - all show a histogram (figure 30) with a single mound with the right tail heavier than the left. The associated npp (figure 31) has a middle section which is parallel to the x-axis.

5.5.4 Golf - east
The histogram of Golf - east (figure 32) has more than one mound, with more observations in the lower portion than in the upper portion of the graph.
Its npp (figure 33) has a flat section parallel to the x-axis at the left hand side, followed by a straight line at a 45 degree angle. The first differences of Golf - east have a histogram (figure 34) with one thin and high mound and heavy tails. The associated npp (figure 35) has a flat section in the center which is parallel to the x-axis. The log of Golf - east has a bimodal histogram (figure 36) and its npp (figure 37) has an elongated "S"-shape. The first differences of the log of Golf - east have a histogram (figure 38) with one thin mound and long, flat tails. The associated npp (figure 39) has a long flat section at its center which is parallel to the x-axis.

5.5.5 Golf - west
The histogram of Golf - west (figure 40) has one mound at the extreme left-hand side and one other mound at the middle value range. It has no low value observations below the first mound, but has a long tail on the right hand side. Its npp (figure 41) has a long flat section parallel to the x-axis at the left hand side followed by the expected 45 degree angle straight line with a bit of curvature at the extreme right hand side. The histogram (figure 42) of the first differences of Golf - west has one tall thin mound and spread out tails. The associated npp (figure 43) has a horizontal flat section in the middle. The histogram of the log of Golf - west (figure 44) shows no coherent shape and the npp (figure 45) is curved at both ends. The histogram (figure 46) of the first differences of the log of Golf - west shows a single, centered mound with long tails on either side. The npp (figure 47) follows a 45 degree straight line except that there is a section in the middle which shows a slight tendency towards being horizontal.

5.5.6 Golf - tokyo
The histogram of Golf - tokyo (figure 48) has one large hump at the extreme left hand side. Its center and right hand side remain low and somewhat constant. The left hand side of the npp (figure 49) of Golf - tokyo is flat and parallel to the x-axis.
Its center then follows the expected 45 degree angle, with the right hand side then curving towards being horizontal again. The first differences of Golf - tokyo have a histogram (figure 50) with one high, symmetrical mound and a particularly drawn out right hand tail. The associated npp (figure 51) is basically horizontal for the first two-thirds and then becomes almost vertical. The log of Golf - tokyo has a bimodal histogram (figure 52) with little concentration in the center of its distribution. Its npp (figure 53) has a middle section which is almost vertical. The first differences of the log of Golf - tokyo have a histogram (figure 54) with one symmetrical hump and a drawn out right hand tail. The associated npp (figure 55) has a middle section which is parallel to the x-axis.

5.6 Evaluative discussion
(For this section, the logarithmic transformation will be referred to as the "log" and the normal probability plot will be referred to as the "npp.")

5.6.1 Nikkei
The histogram of the Nikkei clearly does not resemble the shape of a normal distribution and the curvature at the ends of its npp also indicates the distribution is nonnormal. The first differences of the Nikkei have a histogram whose one hump is reasonably symmetric with a general bell-curve shape. The npp is not quite straight, but it is certainly closer to a straight line than was found with the original data. Taking first differences of the Nikkei does an adequate job of transforming the data to resemble a normal distribution. The numerous mounds in the histogram of the log of the Nikkei and the resulting non-bell-shaped curve indicate nonnormality. The curvature at both ends of the npp of the log of the data is another sign of nonnormality. The histogram of the first differences of the log of the Nikkei shows one mound which resembles the bell-shaped curve which would be expected with a normal distribution.
The npp of the first differences of the log of the Nikkei shows a shape similar to that of the first differences, although the former is a bit straighter. The first differences of the log of the Nikkei do an adequate job of transforming the data to resemble a normal distribution. The first differences of the log do a better job in this respect than do the first differences.

5.6.2 Topix
The histogram of the Topix clearly does not have any resemblance to a normal distribution and the curvature at both ends of its npp supports this conclusion of nonnormality. The first differences of the Topix, however, have a histogram with just a single mound. The associated npp resembles a straight line except for a small hump in the lower half. The first differences of the Topix can be considered a reasonable approximation of a normal distribution. The histogram of the first differences of the Topix is similar in shape to that of the first differences of the Nikkei; the associated npp of the Topix, however, is somewhat straighter than that of the Nikkei. The histogram of the log of the Topix does not resemble a normal distribution, nor is its npp a straight line, indicating that the log of the Topix should not be considered to be normally distributed. The first differences of the log of the Topix have a histogram with a shape that is similar to the normal distribution's except that there is one point separated and to the right of the general shape. The associated npp is a straight line but, again, has one point which does not follow the general pattern. The first differences of the log of the Topix are a reasonable approximation of a normal distribution. The first differences of the log do a slightly better job in this respect than do the first differences. The first differences of the log of the Topix also do a slightly better job at approximating normality than do the first differences of the Nikkei.
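The visual judgement of "straightness" applied to the normal probability plots can be complemented by a numerical summary. The sketch below is an illustration only and is not the procedure used in this thesis: it computes approximate normal scores with Blom's plotting positions (an assumed choice) and reports the correlation between the ordered sample and those scores, which approaches 1 as the plot approaches a straight line. The simulated index and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def normal_scores(n):
    """Approximate expected standard-normal order statistics
    using Blom's plotting positions (i - 0.375) / (n + 0.25)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def npp_correlation(sample):
    """Correlation between the sorted sample and its normal scores.
    Values near 1 correspond to a nearly straight normal probability plot."""
    xs = sorted(sample)
    zs = normal_scores(len(xs))
    mx = sum(xs) / len(xs)
    mz = sum(zs) / len(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / math.sqrt(sxx * szz)

def log_first_differences(series):
    """First differences of the natural logarithm of a positive series."""
    logs = [math.log(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical data: a simulated index whose log changes are normal.
rng = random.Random(0)
index = [100.0]
for _ in range(500):
    index.append(index[-1] * math.exp(rng.gauss(0.0, 0.02)))

print("levels:", npp_correlation(index))
print("first differences of log:", npp_correlation(log_first_differences(index)))
```

Applied to each of the four versions of a series (original, first differences, log, and first differences of the log), such a statistic would give a rough ranking of how closely each version approximates normality, alongside the histograms and plots.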
5.6.3 Golf - all
Both the histogram with its bimodal shape and the npp with its nonlinearity indicate that Golf - all should not be considered to be represented by a normal distribution. The histogram of the first differences of Golf - all has a thin hump and tails that are too heavy to represent a normal distribution. However, it has only one mound and its general shape bears some resemblance to the normal distribution and, so, is more similar to a normal distribution than is that of the original data. The middle portion of the npp is parallel to the x-axis, suggesting that the first differences are not a good approximation to a normal distribution. The bimodal histogram and elongated "S"-shape in the npp indicate that the log of Golf - all should not be considered normally distributed. The histogram for the first differences of the log of Golf - all has a mound which is thinner than would be that of a normal distribution. The npp also resembles that of the first differences with its flat middle section, although this flat section is somewhat longer than that of the first differences. This transformation, unfortunately, also cannot be considered a particularly good approximation to the normal distribution.

5.6.4 Golf - east
Golf - east should not be considered normal because it has too many low value observations in proportion to the higher valued ones. This fact is shown in the large hump at the leftmost side of the histogram and also in the flat section at the left-hand side of the npp. The first differences of Golf - east should not be considered normal, either, because their observations are concentrated too much in the center of the range of values. Evidence of this is given by the high thin tower of the histogram and the flattened middle section of the npp. The histogram of the first differences of Golf - east shows tails which are too heavy to be considered to approximate those of the normal distribution.
The log of Golf - east should not be considered normal as not enough of its observations are concentrated at the center of the distribution (as would be the case with a normal distribution). This fact is demonstrated both in the histogram, with its two humps at the extremes and a valley in the middle, and in its npp, which has a middle section parallel to the y-axis. The first differences of the log of Golf - east are very similar to the first differences of the original data (a tall thin hump in the middle of the histogram and a flat middle section in the npp) and should not, therefore, be considered normally distributed. In addition, as with the first differences, the histogram of the first differences of the log of Golf - east shows tails which are too heavy to approximate those of a normal distribution.

5.6.5 Golf - west
Golf - west is concentrated in the lower value range too much to make it resemble a normal distribution. This is seen both in the large mound at the left of the histogram, followed by a long, drawn out right hand side, and in the flat horizontal section at the left hand side of the npp. The first differences of Golf - west have too many observations concentrated at the center. Their histogram does not have the gentle slopes and gradually thinning tails seen in a bell-shaped normal probability histogram but rather a tower that is thinner and therefore higher, and tails on both sides which are flat and long. Although this histogram has the desired one mound in the center, these data are not a good approximation to the normal distribution. The log of Golf - west cannot be considered to resemble a normal distribution as there is no concentration of observations at the mean, as is shown by the lack of a single mound or, in fact, any concentration at all in the center of its histogram. The first differences of the log of Golf - west have a histogram whose single mound is not as steep and sharp as those seen to date.
In fact, the related npp can be considered to be a straight line. Thus, the first differences of the log of Golf - west can be considered a good approximation of the normal distribution.

5.6.6 Golf - tokyo
The histogram of Golf - tokyo in no way resembles the normal bell-shaped curve and its npp does not show a straight line; thus Golf - tokyo should not be considered normally distributed. The first differences of Golf - tokyo have a histogram which differs from that of the normal distribution both by having its hump too tall and thin and by having its tails too heavy. The associated npp contains a horizontal section in the middle. The first differences' distribution should not be considered to approximate a normal distribution. The log of Golf - tokyo cannot be considered normal because it is lacking the necessary central concentration of observations, as is shown both by the relatively small number of observations falling in the center of the histogram and by the vertical section in the middle of its npp. The first differences of the log of Golf - tokyo have a distribution similar to that of the first differences of the original data and should not, therefore, be considered to approximate a normal distribution.

CHAPTER 6
DOES SEASONALITY EXIST?

6.1 Main results and conclusions
Graphs of both the original data series and their autocorrelation functions show no evidence to indicate the presence of seasonality. Seasonality, in this case, includes various anomalies which could have an effect on this weekly data: monthly effects, the January effect, and so on.

6.2 Background
The first question to be answered in this chapter is whether the data show signs of seasonality (i.e., part of the structure of the series being predictable based on the time of year). As listed in section 3.2, many anomalies are believed to exist in stock market data.
The second question to be answered in this chapter, therefore, is whether or not these anomalies have a significant impact on the data to be used in this paper. The first type of anomaly to be discussed comprises those associated with time frames larger than a week, such as the effects every January and before holidays. These anomalies are spaced one year apart; thus, if their effect is significant, there would be a high correlation between the observations on the associated dates each year. In weekly data, these associated dates are 52 weeks apart. Their significance, therefore, can be determined by the degree of autocorrelation at lag 52. The second type of anomaly to be discussed, although not strictly an issue of seasonality, is the day of the week effect. In this case, the behaviour of the stock market, on average, is predictably higher or lower on specific days than it is for the average for the week.

6.3 Actions taken
The graphs of all the original data series and their autocorrelation functions are visually examined to determine if there is any evidence of seasonality.

6.4 Detailed analysis
Figure 56. Autocorrelation functions of Nikkei and Topix.
Figure 57. Autocorrelation functions of the Golf indices.

6.5 Description of analysis
Graphs of the original data series may be found in Chapter 3, and a verbal description of these graphs may be found in section 4.5. None of these graphs shows any discernible cyclical pattern. The autocorrelation functions of the series (figures 56 and 57) are straight lines decreasing as the lag increases.

6.6 Evaluative discussion
The graphs of the data do not display any evidence of a repeated seasonal pattern. The autocorrelation functions show a continuous decline with no significant variation, thus demonstrating no evidence of seasonality.
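The lag-52 check described in this chapter can be illustrated with a short sketch. The code below is a hypothetical example in Python, not the software used for the thesis: it computes the sample autocorrelation at a given lag and applies it to a simulated weekly series that is deliberately constructed to contain an annual cycle, so that the contrast with a series lacking seasonality is visible.

```python
import math
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom

# Hypothetical weekly series (11 years) with a built-in 52-week cycle.
rng = random.Random(1)
weeks = 52 * 11
seasonal = [math.sin(2 * math.pi * t / 52) + rng.gauss(0.0, 0.3)
            for t in range(weeks)]

# Inspect lag 52 and the lags surrounding it, as suggested in the text.
for lag in (26, 50, 51, 52, 53, 54):
    print(lag, round(autocorrelation(seasonal, lag), 3))
```

A series with a genuine annual pattern shows a pronounced peak in the autocorrelation function at or near lag 52; a function that only declines steadily through that region gives no such indication.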
In particular, none of the autocorrelation functions has a high autocorrelation (or in fact any increase) for observations 52 weeks apart (or at the surrounding lags), indicating none of the anomalies have a significant impact on this weekly data.

6.6.1 Possible problems with method
There are some difficulties with the procedure used in this chapter. Firstly, events which take place on specific dates will occur on different days of the week in successive years. For example, if a particular weekly Nikkei index (which uses the closing value on the last day of trading in the week as described in section 3.3.2) includes a holiday effect in one year, then the Nikkei of the related week for the next year would still be calculated on the last day of the week but the same effect of the holiday anomaly will not be included since it will be taking place on the next day of the week (if neither is a leap year). This anomalous effect, then, would not be consistently included every year in the calculation of the indices. Secondly, with the relative timing of the anomalous influences changing over the years, their effects may be caught in different weeks over the years. For this reason, the lags immediately surrounding 52 weeks are of interest. Thirdly, the day of the week effect's most direct influence on the indices must be considered. The Nikkei is calculated using information from the last trading day of the week, which is now usually Friday. However, due to holidays, there are some instances in which Friday is not the last trading day. This means that weeks ending on different trading days will be influenced by different day of the week effects. For example, if Thursday is the last trading day for one week, then instead of the index including, as usual, whatever effect Friday is known to have, it will instead include the effect Thursday exerts. Thus, the two weeks would not, strictly, be comparable.
This dilemma does not exist to the same extent for the Topix as this index is calculated using the average of the entire week. Day of the week effects would have an influence if, for example, Friday and its related anomalous effect are not included in this average; but this impact is much smaller than that for the Nikkei. The direct day of the week effect described in the third problem above has an aspect which is not easily discounted. In the past the TSE did trade on selected Saturdays. Ziemba and Schwartz (1992b, 166) give the history of the Saturday trading days as follows: until the end of 1972, trading took place on all Saturdays; from then until July 1983, no trading was done on the third Saturday of the month; from August 1983 until July 1986, the market was closed on the second Saturday of the month as well; from then until the end of January 1989, trading took place only on the first, fourth, and fifth (if it existed) Saturdays of each month; after January 1989 no trading took place on any Saturday. Ziemba and Schwartz (1992b, 169-70) report that returns on Saturdays are more positive than those on the other days of the week. In particular, the mean daily return (4 April 1978 to 18 June 1987) for Saturdays (0.1678) was higher than for Fridays (0.0605) (Table 3.4 of Ziemba and Schwartz 1992b, 170). The data analyzed in this paper begin in 1982, and thus cover all but the earliest of the Saturday trading schedules listed above; however, the effect of this Saturday trading is not included in this analysis. Although the research discussed in section 3.2 regards anomalies in the stock market indices, one may consider the golf indices to be similarly influenced since they are also based on speculative trading. None of the above problems has a significant effect on this analysis.
The autocorrelation function can be determined for the weeks surrounding lag 52; in this way, the analysis deals with the second type of problem (effect included in different weeks over the years). The first and third problems (regarding the effect of no trading on holidays) are not directly accounted for in this analysis. However, when one considers the small number of times holidays affect the indices relative to the large total number of observations, it is clear that any related problem will not create any significantly large influence on the analytical results. (Ziemba and Schwartz (1992b, 223) report that there are thirteen holidays per year in Japan.) This is not to say that these anomalies do not exist or do not have any influence, but rather that their existence and influences are not significant for the large number of weekly observations used in this analysis.

CHAPTER 7
STATIONARITY

7.1 Main results and conclusions
The concept of difference stationarity applies well to both the Nikkei and Topix, with first differencing required to reach stationarity. The same is true of the logarithmic transformations of these two time series. There is some evidence that the first differences of the logarithmic transformations are more appropriate since the taking of logarithms seems to reduce possible variance changes over time. After first differencing is done in these cases, there appears to be a constant mean, no unit roots, and autocorrelation functions which show no pattern of slow decline. Stationarity does not apply so clearly to the golf time series. For both the untransformed data and the logarithmic transformations of the data, the unit root tests indicate first differencing induces stationarity. Nonstationarity, however, is indicated by the autocorrelation functions of the data. The graphs do little to clear up this contradiction.
Thus, there is some evidence to suggest that models assuming difference stationarity (including ARIMA models) may not be appropriate for the golf time series.

7.2 Background
In fitting an autoregressive integrated moving average (ARIMA) model to the data, the first step is to ensure difference stationarity. Difference stationarity is generally assumed, in which case differencing is done until stationarity is reached. (Section 9.2.3 describes difference stationarity.) Simply stated, strict stationarity exists if the joint probability distribution of the series does not depend on time. Weak, or second-order, stationarity exists when the first two moments (i.e., the mean, variance, and covariances) do not depend on time. Weak and strict stationarity are the same thing in the case of a normal distribution (discussed in Chapter 5) since normally distributed data are completely described by their first two moments. Box and Jenkins determined the degree of differencing necessary to reach stationarity through examination of the sample autocorrelation function. Nonstationarity is suggested if the autocorrelation function is large and declines slowly. This method, however, often leads to overdifferencing (Mills 1990, 121). A more formal test is available for determining the degree of differencing: the unit root test. If a model is stationary, then all corresponding roots lie inside the unit circle. If any roots lie outside the unit circle, the time series would change exponentially over time and would therefore be nonstationary; this would be immediately obvious in examining the graph of the data. One alternative remains, however: a unit root. In this case the model describing the data would not be stationary, but this fact would not be obvious through an examination of the graph of the data. The simplest example of a unit root test is that done for the model $y_t = a y_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is a stationary noise term.
Least squares regression is used to estimate $a$, and the estimate $\hat{a}$ is then tested to determine if it is significantly different from one. The distribution of the usual test statistic ($\hat{a} - 1$ divided by the standard error of $\hat{a}$) is not a t-distribution, however; instead, the test statistic must be compared to specifically calculated critical values. Dickey and Fuller were the first to recognize this and develop appropriate critical values; thus this type of test is called a Dickey-Fuller unit root test. This basic test can be expanded to include lagged differences of $y_t$ in order to capture higher order autoregression; this is called an Augmented Dickey-Fuller test and is the test used in this analysis.

The Augmented Dickey-Fuller test assumes the data can be represented by a pure, finite autoregressive process. This assumption is obviously restrictive if the moving average aspect of the model cannot be re-expressed as a finite autoregressive process of a reasonable order; in this case, the unit root test can be very misleading. More specifically, the degree of the autoregressive re-expression must be of low enough order to have been captured in the lagged structure of the unit root test. In reality, this is often not the case, and, currently, alternative tests which deal with this difficulty are being developed. The result of this problem with the Augmented Dickey-Fuller unit root test is that the test has low power; it will often say a unit root exists (due to the existence of an uncaptured moving average component in the data) when in reality there is no unit root.

7.3 Actions taken

The plots of the data, their autocorrelation functions (acf), and the results of unit root tests are examined to make judgements as to the appropriateness of assuming stationarity.
Specifically, the unit root tests in this analysis were based on the Augmented Dickey-Fuller test specified in the computer software package Shazam:

Model: $y_t = \alpha_0 + (1 + \alpha_1) y_{t-1} + \alpha_2 t + \sum_{j=1}^{p} \gamma_j \nabla y_{t-j} + \varepsilon_t$

where $\nabla y_t = y_t - y_{t-1}$, $\varepsilon_t$ is Gaussian white noise, and $p$ is the number of lags included.

$H_0$: unit root; $\alpha_1 = 0$
$H_a$: no unit root; $\alpha_1 < 0$

The null hypothesis ($H_0$) is rejected if the test statistic calculated is less than the associated critical value.

7.4 Detailed analysis

In this section, "d=1" indicates first differences; both "ln" and "log" refer to the logarithmic transformation; and "Std Err" stands for standard error.

7.4.1 Nikkei

Figure 58. First differences of the Nikkei.

Figure 59. Logarithm of the Nikkei.

Figure 60. First differences of the logarithm of the Nikkei.

Table 4. Unit root tests on the Nikkei and its transformations.

Series              Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Nikkei, original    -1.1752          -3.13                yes          no
Nikkei, d=1         -3.9470          -3.13                no           yes
Nikkei, log         -1.0238          -3.13                yes          no
Nikkei, log, d=1    -4.2146          -3.13                no           yes

Table 5. Autocorrelation functions of the Nikkei and its transformations.
Series              Lags    Autocorrelations                                               Std Err
Nikkei              1-12    1.00 0.99 0.98 0.98 0.97 0.96 0.96 0.95 0.94 0.94 0.93 0.92    0.04
                    13-24   0.92 0.91 0.91 0.90 0.89 0.89 0.88 0.88 0.87 0.86 0.86 0.85    0.21
Nikkei, d=1         1-12    0.07 0.13 0.07 -.01 0.01 0.03 0.09 -.01 0.09 -.09 -.09 -.05    0.04
                    13-24   -.02 -.02 .00 0.03 -.04 -.01 -.01 0.02 0.05 0.13 0.05 0.01     0.05
Nikkei, log         1-12    0.99 0.99 0.98 0.98 0.97 0.96 0.96 0.95 0.95 0.94 0.93 0.93    0.04
                    13-24   0.92 0.91 0.91 0.90 0.89 0.89 0.88 0.87 0.87 0.86 0.86 0.85    0.21
Nikkei, log, d=1    1-12    0.05 0.08 0.06 -.02 -.01 0.01 0.09 0.02 0.07 -.07 -.06 0.00    0.04
                    13-24   0.01 -.00 0.05 0.07 -.03 0.00 -.04 -.01 0.02 0.08 0.04 -.03    0.05

7.4.2 Topix

Figure 61. First differences of the Topix.

Figure 62. Logarithm of the Topix.

Figure 63. First differences of the logarithm of the Topix.

Table 6. Unit root tests on the Topix and its transformations.

Series              Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Topix, original     -1.2154          -3.13                yes          no
Topix, d=1          -4.1433          -3.13                no           yes
Topix, log          -1.2128          -3.13                yes          no
Topix, log, d=1     -4.1443          -3.13                no           yes

Table 7. Autocorrelation functions of the Topix and its transformations.
Series              Lags    Autocorrelations                                               Std Err
Topix               1-12    1.00 0.99 0.98 0.98 0.97 0.96 0.96 0.95 0.94 0.93 0.92 0.92    0.04
                    13-24   0.91 0.90 0.90 0.89 0.88 0.88 0.87 0.86 0.86 0.85 0.84 0.83    0.21
Topix, d=1          1-12    0.22 0.13 0.06 -.02 0.00 0.04 0.09 0.05 0.08 -.05 -.11 -.10    0.04
                    13-24   -.01 -.04 0.03 0.07 0.00 0.03 -.02 -.01 -.03 0.10 0.09 -.01    0.05
Topix, log          1-12    0.99 0.99 0.98 0.97 0.97 0.96 0.95 0.95 0.94 0.93 0.92 0.92    0.04
                    13-24   0.91 0.90 0.90 0.89 0.88 0.87 0.87 0.86 0.85 0.84 0.84 0.83    0.21
Topix, log, d=1     1-12    0.25 0.09 0.06 -.04 -.02 0.02 0.11 0.07 0.05 -.03 -.09 -.06    0.04
                    13-24   0.01 -.02 0.08 0.10 0.03 0.04 -.03 -.04 -.04 0.08 0.07 -.05    0.05

7.4.3 Golf - all

Figure 64. First differences of Golf - all.

Figure 65. Logarithm of Golf - all.

Figure 66. First differences of the logarithm of Golf - all.

Table 8. Unit root tests on Golf - all and its transformations.

Series                  Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Golf - all, original    -1.59            -3.12                yes          no
Golf - all, d=1         -3.27            -3.12                no           yes
Golf - all, log         -0.49            -3.12                yes          no
Golf - all, log, d=1    -4.38            -3.12                no           yes

Table 9. Autocorrelation functions of Golf - all and its transformations.
Series                  Lags    Autocorrelations                                               Std Err
Golf - all              1-12    1.00 1.00 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.96 0.96    0.04
                        13-24   0.95 0.95 0.94 0.94 0.94 0.93 0.93 0.92 0.91 0.91 0.90 0.90    0.20
Golf - all, d=1         1-12    0.81 0.67 0.49 0.32 0.22 0.19 0.14 0.17 0.16 0.17 0.17 0.16    0.04
                        13-24   0.17 0.17 0.18 0.18 0.18 0.17 0.18 0.19 0.19 0.18 0.17 0.15    0.09
Golf - all, log         1-12    1.00 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.97 0.96 0.96    0.04
                        13-24   0.95 0.95 0.95 0.94 0.94 0.93 0.93 0.92 0.92 0.92 0.91 0.91    0.20
Golf - all, log, d=1    1-12    0.80 0.68 0.50 0.34 0.26 0.25 0.22 0.26 0.26 0.25 0.23 0.21    0.04
                        13-24   0.20 0.20 0.19 0.20 0.20 0.19 0.19 0.19 0.19 0.17 0.16 0.14    0.09

7.4.4 Golf - east

Figure 67. First differences of Golf - east.

Figure 68. Logarithm of Golf - east.

Figure 69. First differences of the logarithm of Golf - east.

Table 10. Unit root tests on Golf - east and its transformations.

Series                   Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Golf - east, original    -0.6681          -3.12                yes          no
Golf - east, d=1         -5.3482          -3.12                no           yes
Golf - east, log         -0.2601          -3.12                yes          no
Golf - east, log, d=1    -4.9548          -3.12                no           yes

Table 11. Autocorrelation functions of Golf - east and its transformations.
Series                   Lags    Autocorrelations                                               Std Err
Golf - east              1-12    1.00 0.99 0.99 0.99 0.98 0.98 0.97 0.97 0.96 0.96 0.95 0.95    0.04
                         13-24   0.95 0.94 0.94 0.93 0.93 0.92 0.92 0.91 0.90 0.90 0.89 0.89    0.20
Golf - east, d=1         1-12    0.77 0.57 0.34 0.13 0.04 0.02 0.00 0.05 0.06 0.06 0.07 0.06    0.04
                         13-24   0.06 0.07 0.08 0.09 0.11 0.11 0.11 0.11 0.11 0.09 0.08 0.07    0.07
Golf - east, log         1-12    1.00 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.96 0.96 0.96    0.04
                         13-24   0.95 0.95 0.94 0.94 0.93 0.93 0.93 0.92 0.92 0.91 0.91 0.90    0.20
Golf - east, log, d=1    1-12    0.77 0.64 0.44 0.27 0.17 0.16 0.12 0.17 0.17 0.16 0.15 0.12    0.04
                         13-24   0.11 0.12 0.12 0.14 0.13 0.14 0.13 0.13 0.13 0.11 0.10 0.09    0.08

7.4.5 Golf - west

Figure 70. First differences of Golf - west.

Figure 71. Logarithm of Golf - west.

Figure 72. First differences of the logarithm of Golf - west.

Table 12. Unit root tests on Golf - west and its transformations.

Series                   Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Golf - west, original    -1.6368          -3.12                yes          no
Golf - west, d=1         -3.3800          -3.12                no           yes
Golf - west, log         -0.8885          -3.12                yes          no
Golf - west, log, d=1    -3.0624          -3.12                yes          no
Golf - west, log, d=2    -6.3910          -3.12                no           yes

Table 13. Autocorrelation functions of Golf - west and its transformations.
Series                   Lags    Autocorrelations                                               Std Err
Golf - west              1-12    1.00 1.00 0.99 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.96    0.04
                         13-24   0.96 0.95 0.95 0.94 0.94 0.93 0.93 0.92 0.92 0.91 0.91 0.90    0.20
Golf - west, d=1         1-12    0.78 0.74 0.69 0.62 0.49 0.44 0.38 0.35 0.32 0.29 0.30 0.31    0.04
                         13-24   0.32 0.30 0.30 0.30 0.29 0.25 0.25 0.31 0.23 0.25 0.25 0.22    0.11
Golf - west, log         1-12    1.00 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.97 0.96 0.96    0.04
                         13-24   0.96 0.95 0.95 0.94 0.94 0.94 0.93 0.93 0.92 0.92 0.92 0.91    0.20
Golf - west, log, d=1    1-12    0.74 0.72 0.68 0.65 0.57 0.52 0.46 0.46 0.42 0.38 0.38 0.38    0.04
                         13-24   0.38 0.37 0.36 0.36 0.37 0.31 0.31 0.36 0.31 0.32 0.32 0.29    0.12
Golf - west, log, d=2    1-12    -.45 0.03 -.01 0.12 -.08 0.00 -.09 0.07 0.00 -.09 0.02 0.00    0.04
                         13-24   .00 0.01 -.03 .00 0.11 -.09 -.12 0.21 -.13 0.01 0.08 -.04      0.05

7.4.6 Golf - tokyo

Figure 73. First differences of Golf - tokyo.

Figure 74. Logarithm of Golf - tokyo.

Figure 75. First differences of the logarithm of Golf - tokyo.

Table 14. Unit root tests on Golf - tokyo and its transformations.

Series                    Test Statistic   Critical Value 10%   Unit Root?   Stationary?
Golf - tokyo, original    -7.7636          -3.12                yes          no
Golf - tokyo, d=1         -3.8745          -3.12                no           yes
Golf - tokyo, log         -0.7272          -3.12                yes          no
Golf - tokyo, log, d=1    -4.5478          -3.12                no           yes

Table 15. Autocorrelation functions of Golf - tokyo and its transformations.
Series                    Lags    Autocorrelations                                               Std Err
Golf - tokyo              1-12    1.00 1.00 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.96 0.96 0.95    0.04
                          13-24   0.95 0.94 0.94 0.93 0.93 0.92 0.91 0.91 0.90 0.90 0.89 0.88    0.20
Golf - tokyo, d=1         1-12    0.60 0.48 0.26 0.18 0.10 0.14 0.08 0.16 0.14 0.13 0.14 0.09    0.04
                          13-24   0.09 0.09 0.12 0.11 0.10 0.11 0.08 0.05 0.08 0.06 0.05 0.05    0.07
Golf - tokyo, log         1-12    1.00 0.99 0.99 0.99 0.99 0.98 0.98 0.98 0.97 0.97 0.96 0.96    0.04
                          13-24   0.96 0.95 0.95 0.94 0.94 0.94 0.93 0.93 0.92 0.92 0.91 0.91    0.20
Golf - tokyo, log, d=1    1-12    0.59 0.52 0.35 0.26 0.20 0.24 0.17 0.24 0.23 0.21 0.20 0.14    0.04
                          13-24   0.16 0.16 0.18 0.17 0.16 0.16 0.14 0.11 0.11 0.08 0.09 0.10    0.07

7.5 Description of analysis

7.5.1 Stock Indices

The results of this aspect of the analysis of the Nikkei and Topix are so similar that, for the most part, a single description serves for both series.

The graphs of the data (found in Chapter 3) show a definite pattern over time: a period of increase followed by a period of decrease. The Augmented Dickey-Fuller unit root tests provide evidence of unit roots in both cases. The associated autocorrelation functions begin with a value of 1.00 at lag one and slowly decline over twenty-four lags to values of 0.85 and 0.83 for the Nikkei and Topix respectively.

The first differences of the data show small departures from zero at the beginning with bigger departures in the middle and end. The unit root tests provide no evidence of unit roots for these cases. The autocorrelation function for the first differences of the Nikkei has a value at the first lag which is less than two standard errors from zero, and the remainder of the twenty-four lags shown contain six values which exceed two standard errors from zero. The autocorrelation function for the first differences of the Topix has values for the first two lags which are greater than two standard errors from zero, with three other such values scattered through the first twelve lags.
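The comparisons against two standard errors described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the thesis (whose figures come from the software output): the function names are hypothetical, and the standard errors use Bartlett's large-lag approximation, which is an assumption, since the text does not state which formula underlies the "Std Err" columns.

```python
import math
import random


def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_1, ..., r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-zero sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def bartlett_se(acf, n):
    """Approximate standard error of each r_k (assumed formula).

    se(r_k) = sqrt((1 + 2 * sum_{j<k} r_j^2) / n), so se(r_1) = 1 / sqrt(n).
    """
    errors = []
    cumulative = 0.0
    for r in acf:
        errors.append(math.sqrt((1.0 + 2.0 * cumulative) / n))
        cumulative += r * r
    return errors


def significant_lags(series, max_lag=24):
    """Lags whose autocorrelation is more than two standard errors from zero."""
    acf = sample_acf(series, max_lag)
    se = bartlett_se(acf, len(series))
    return [k for k, (r, s) in enumerate(zip(acf, se), start=1)
            if abs(r) > 2.0 * s]


if __name__ == "__main__":
    # Simulated data only: a random walk (levels) and its increments.
    rng = random.Random(0)
    steps = [rng.gauss(0.0, 1.0) for _ in range(600)]
    walk, level = [], 0.0
    for step in steps:
        level += step
        walk.append(level)
    print("levels:     ", significant_lags(walk))
    print("differences:", significant_lags(steps))
```

Under this approximation the standard error at lag one is 1/sqrt(n). If the series contain on the order of 600 weekly observations (an assumption based on the 1982 - 1993 sample period), that is about 0.04, which is in line with the first "Std Err" entry reported in the tables.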
The graphs of the logarithm of the data show the same general pattern as is found in the original data: a period of upward movement followed by a period of downward movement. The unit root tests provide evidence of unit roots in these cases. The autocorrelation functions for these series begin with a value of 0.99 at lag one and slowly decline over twenty-four lags to values of 0.85 and 0.83 for the Nikkei and Topix respectively.

The first differences of the logarithm of the data show deviations from zero which are fairly consistent except for a few short periods of exceptionally large deviations. The unit root tests provide no evidence of unit roots in these cases. The associated autocorrelation function for the Nikkei contains only one lag for which the autocorrelation is greater than two standard errors from zero. The autocorrelation function for the Topix has four such values, including those of the first two lags.

7.5.2 Golf Indices

In the interest of brevity, this section describes the typical situation for the golf data series and then the departures from this pattern.

The graphs of the golf data and the logarithms of the data show a period of increase including two peaks followed by a period of decrease. Golf - west and its logarithm are the exceptions, as they have only one peak.

The graphs of the first differences of the golf data show very little variation from zero in the first half of the graph; this is followed by large spikes both upward and downward in the middle of the graphs; and the last half of the graphs shows a greater degree of deviation from zero than found in the first half of the data. The graphs of the first differences of the logarithm have the same spikes in the middle time period and more deviation from zero after these spikes; however, the change in the degree of variation is much less marked than that of the first differences.
Golf - west is again the exception; the graphs of both its first differences and the first differences of its logarithm show mainly positive deviations from zero for the first two-thirds of the graph.

The Augmented Dickey-Fuller unit root tests for the untransformed and logarithmically transformed golf data suggest a model entailing unit roots would be applicable. There is no evidence of this when first differences are taken of the original data and their logarithms. The exception here is that the unit root tests with Golf - west show evidence of a unit root model being suitable for the first differences of the logarithm of the data; there is no such evidence when second differences of the logarithm are taken.

The autocorrelation functions of the original data and their logarithms show the same pattern: a lag one value of 1.00 followed by a slow decline over twenty-four lags to approximately 0.90. The autocorrelation functions of the first differences of the data and their logarithms begin with at least the first twelve lags being greater than or equal to two standard errors from zero; one exception is the first differences of Golf - east, for which only the first four lags meet this criterion. These autocorrelations are quite high at lag one (e.g., 0.81, 0.77, 0.78, 0.60) and rarely go below one standard error from zero in twenty-four lags. For Golf - west, the first twenty-four lags of these two autocorrelation functions (of the first differences of the original data and of its logarithm) do not go below two standard errors from zero; the second differences of its logarithm have much lower autocorrelations (although the first lag autocorrelation is -0.45).

7.6 Evaluative discussion

7.6.1 Stock Indices

For both the Nikkei and the Topix, the graphs of the data and their logarithms show a period of upward trend followed by a period of downward trend.
Since the means of these series obviously vary over time, these time series should not be considered as appropriately represented by a model requiring stationarity. This deduction is supported by the large and slowly declining autocorrelation functions and by the Augmented Dickey-Fuller unit root test results, which show evidence of unit roots.

The first differences of the data and the first differences of the logarithm of the data do not show any obvious trend. The graphs of the first differences indicate the variances of the data may be increasing over time. The graphs of the first differences of the logarithm of the data do not show this to the same degree, which indicates these series may be more appropriate. However, based on the evidence of the autocorrelation functions and the lack of unit roots indicated by the unit root tests, both these series may be considered adequately approximated by difference stationarity.

7.6.2 Golf Indices

For all four golf time series, the graphs of the data and their logarithms show the mean varying over time; thus these data should not be considered approximated by a stationary model. This conclusion is also reached based both on the large and slowly declining autocorrelation functions and on the results of the unit root tests.

The results for the first differences and the first differences of the logarithmic transformations are not as easily interpreted. The unit root tests indicate stationarity. However, the opposite conclusion is supported by the autocorrelation functions of the data. The graphs of the data are inconclusive: Golf - all, Golf - east, and Golf - tokyo have graphs which could be interpreted as showing stationary structure except for a huge spike in the middle of their graphs which indicates possible nonstationarity. The autocorrelation functions remain uncomfortably high and may be interpreted as slowly declining.
The graphs of the first differences show a somewhat lower degree of variation in the first half of the data than in the second half. This difference is not as noticeable, however, in the graphs of the first differences of the logarithm of the data, indicating that the first differences of the logarithm of the golf data may be more appropriate.

Once again, Golf - west does not follow the pattern of the other golf time series. The unit root test on the first differences of the logarithm of Golf - west has a test statistic which is greater than the critical value; however, the test statistic is close to the critical value (-3.06 vs. -3.12) and may therefore be considered marginal evidence for stationarity. In addition, the graphs of the first differences of Golf - west and its logarithm do not show the same distinctive spike as do the other golf indices; however, neither is the pattern stable over time, the first two-thirds being largely positive while the last third is largely negative. The autocorrelation functions of the first differences of Golf - west and its logarithm have a pattern similar to the other golf indices: somewhat large and slowly declining. In considering all three tests for Golf - west, the decision on the appropriateness of difference stationarity is a difficult one.

The contradictions arising in the results of the three different methods of determining stationarity should cause some concern. Thus, it would be wise to proceed with caution in describing the golf indices using models which assume difference stationarity. The results of fitting such a model would be useful as they would provide further evidence for making the decision regarding difference stationarity; this is done by fitting autoregressive integrated moving average (ARIMA) models to the data in Chapter 9.

CHAPTER 8 DECIDING THE METHODS OF ANALYSIS

We are now at the point when the analysis starts to get particularly interesting.
The preceding chapters decided the presence (or absence) of some important qualities in the data. Using this information, specific sets of assumptions may be made so that the analysis done in later chapters is possible and meaningful. This chapter discusses which methods of analysis are appropriate and useful in the quest to learn more about the stock and golf data and a possible relationship between the two.

In the following chapters, the goals of this paper discussed in the introduction are met as the following questions are answered: (a) what similarities exist in the structure of the stock market indices and the golf course membership price indices; (b) is there a long-term linear relationship between these indices; and (c) did the downturn change this relationship?

It will be discovered that (a) similarities of structure are found among the time series in two instances: between the two stock market indices, and between Golf - all and Golf - east; (b) the only evidence of a linear relationship over the entire time period of analysis exists between the logarithm of the Nikkei and the logarithm of Golf - tokyo; (c) the downturn does affect the data both by creating linear relationships within the group of stock indices and within the group of golf indices, and by removing any linear relationships between the stock and golf data which may have existed before the downturn.

8.1 What is known so far?

The presence of outliers and/or seasonality adds complexity to all analyses. It is convenient to decide whether these factors are present in the data before the first model is proposed. In this way, the simplest analysis methods possible can then be employed. It has been determined in previous chapters (Chapters 4 and 6) that there is no evidence to suggest either outliers or seasonality; thus analysis may continue without concern for these complications.
In general, a better method for dealing with the complications of outliers and seasonality would be to include all aspects of the data in the model building process and thus do all estimating at once. This was not done in this analysis, however, since it was fairly easy to tell by looking at the graphs of the data (before any calculations) that seasonality and outliers were most probably not involved. Thus, it was deemed easier to formally eliminate these complications at an early stage in order to leave the remaining analysis as streamlined as possible.

It was learned in Chapter 5 that the first differences of the stock data (the Nikkei and Topix) and the first differences of the logarithmic transformations of these data may be considered to fulfil the normality assumption. It was also determined that the first differences of the logarithm of Golf - west could be considered to approximate a normal distribution. However, the remaining golf data cannot be considered normally distributed due to their heavy tails; consideration must be given to any effect this departure from normality may have on the results of any analysis which assumes normality.

Chapter 7 shows evidence of difference stationary structures for the Nikkei and Topix and their logarithms. However, difference stationarity with regard to the four golf indices is questionable, with different tests giving conflicting results.

8.2 How are the structures of the indices to be determined?

The autoregressive integrated moving average (ARIMA), or Box and Jenkins, approach to modelling time series is well defined and accepted. The data are differenced until stationarity exists, then the degrees of the autoregressive (AR) and the moving average (MA) components of the resulting stationary series are found.

Developing ARIMA models for the data series will provide insight into the structure of each series on an individual basis.
These results may then be compared in order to discover if any similarities exist between the two stock market indices, between the four golf course membership price indices, and between the stock indices and the golf indices. This is done in Chapter 9.

ARIMA modelling assumes the data to be difference stationary but, as discussed earlier, this is a problem for the golf indices. If the results of the ARIMA modelling prove muddled, this would provide further evidence against stationarity. If, however, ARIMA modelling applies well to the golf data, this could be considered additional evidence for stationarity.

8.3 What type of relationship is under scrutiny?

Chapter 10 looks for evidence of long-term linear relationships both within and between the stock market indices and golf course membership price indices. It is important to be clear about what type of relationship is in question. This analysis cannot conclude there is no relationship between the stock and golf indices; it can only determine if there exists a long-term linear relationship. This means that even if no long-term linear relationship is found, there may still exist some other type of relationship (for example, a nonlinear relationship).

8.3.1 Why is the analysis restricted to such a specific relationship?

Research is done to add to the body of knowledge existing for the topic of interest. The first investigation done on any subject should always be the simplest possible analysis. The logical way for research to progress is to begin with the most straightforward explanation and method and only then to continue, if necessary, to try to understand the data using more complicated procedures. This process comes to a halt as soon as a model is found which adequately describes the complexities involved in the data. The result of this process is that the simplest model possible is discovered and used.
This means no unnecessary complications are included, which allows the analytical results to be more readily understood and more meaningful. All avoided complexity increases our ability to interpret, understand, and use results.

In the case of the stock and golf indices under investigation in this paper, the simplest relationship which could exist is a linear relationship. This possibility, therefore, should be the first to be examined, as is done in this paper.

Nonlinear relationships may exist between the indices under analysis. However, little theory has been developed for nonlinear time series and nonlinear cointegration. As theory is developed in these areas, it could be applied to these indices.

The time frame of interest here is the long term. The possibility to be explored is that the stock and golf indices may have some relationship by which the movements in one are related to the movements in the other. Movements, in this case, refer to a long-run equilibrium relationship. Determining the existence of such a relationship will help support or deny many economic theories.

Any short-term relationship discovered to exist between these variables would also be interesting in that it may make the understanding or prediction of the indices easier for a short period of time. However, this is another topic and of questionable use since the duration of such a relationship would be unknown.

8.3.2 Is regression appropriate?

Given that we are testing for a linear relationship between two variables, the easiest and most well known method of analysis which comes to mind is ordinary least squares (OLS) regression. In OLS regression for two variables, one variable ($y_t$) is dependent on the other variable ($x_t$) in the following manner: $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$. Regression is not appropriate, however, when stationarity does not exist.
Without stationarity some characteristics of the OLS regression (such as the use of the t-test to determine the significance of coefficients) are not valid. In fact, Granger and Newbold (1974) found that if a regression is run in the presence of nonstationarity, the results are spurious. Mills (1990, 268) describes this well by saying such regressions "frequently have high $R^2$ statistics yet also typically display highly autocorrelated residuals ...[and] conventional significance tests are seriously biased towards rejection of the null hypothesis of no relationship, and hence towards acceptance of a spurious relationship."

The simplest example to demonstrate this point is to consider two independent variables $y_t$ and $x_t$, where $y_t = \zeta t + \varepsilon_t'$ and $x_t = \gamma t + \varepsilon_t''$ (with $t$ being time and the $\varepsilon_t$ being noise terms). Neither $y_t$ nor $x_t$ has a stationary structure since their means are obviously not independent of time. Now consider a regression done using these two variables: $y_t = \beta x_t + \varepsilon_t$. Since $y_t$ and $x_t$ are independent, no linear relationship exists between them and, thus, $\beta = 0$. However, standard regression methodology would estimate the regression coefficient, $\beta$, to be $\zeta/\gamma$! In fact, this result can be reached simply by using substitution:

From $x_t = \gamma t + \varepsilon_t''$ one gets $t = (1/\gamma)(x_t - \varepsilon_t'')$.

Substitute this value for $t$ into the equation for $y_t$:

$y_t = \zeta t + \varepsilon_t'$
$y_t = \zeta (1/\gamma)(x_t - \varepsilon_t'') + \varepsilon_t'$
$y_t = (\zeta/\gamma)(x_t - \varepsilon_t'') + \varepsilon_t'$
$y_t = (\zeta/\gamma) x_t - (\zeta/\gamma) \varepsilon_t'' + \varepsilon_t'$
$y_t = (\zeta/\gamma) x_t + \text{an error term}$

Thus, a regression would estimate the coefficient $\zeta/\gamma$! In summary, the common trend (nonstationarity) in $y_t$ and $x_t$ has created spurious regression results.

Regression was not developed to deal with nonstationarity and therefore does not work in the usual way when nonstationarity is present. Since none of the indices in this analysis have a stationary structure, clearly this is a case where regression is not appropriate.

8.3.3 Is cointegration appropriate?
Cointegration exists if two processes with nonstationary structures can be linearly combined to create a data series with a stationary structure. In more general terms, if two variables with nonstationary structures are cointegrated then there exists a long-term linear relationship between them.

The theory of cointegration was specifically designed to deal with the possible presence of nonstationarity. This is clearly different from the theory of OLS regression. The two methodologies are meant to be used in two very different circumstances: cointegration deals with a symmetric relationship between two variables, whereas regression deals with a dependent relationship between two variables.

In summary, cointegration determines the existence of a long-term linear relationship in the presence of nonstationarity and is therefore an appropriate technique to use in addressing the second of the goals of this paper: to determine if there is a long-term linear relationship between the stock and golf indices used in this analysis. This is done in Chapter 10.

8.4 How can it be determined if the downturn affected the relationship?

This question is important regardless of whether or not a long-term linear relationship is found in Chapter 10. If a relationship exists, the question is whether the downturn changed or removed this relationship. If no evidence of a relationship is found, the question is whether a relationship existed before and/or after the downturn.

The method used to answer the last of the major questions in this analysis (the effect of the downturn) is simply to break the data into three sets (before the downturn, during the downturn, and after the downturn), test for cointegration using the data before the downturn, and then test for cointegration using the data after the downturn, in order to determine the presence or absence of long-run linear relationships during these separate time periods.

CHAPTER 9 ARE THE STRUCTURES OF THE DATA SERIES SIMILAR?
9.1 Main results and conclusions

This chapter provides evidence for a number of interesting conclusions: (a) the structures of the logarithmic transformations of the Nikkei and Topix are not the same; (b) Golf - all and Golf - east have the same structure, while Golf - west and the logarithm of Golf - tokyo do not share any common structure with the other golf indices; (c) there is no similar structure to be found between the stock indices and the golf indices; and (d) autoregressive integrated moving average (ARIMA) modelling proves appropriate for only half of the indices and their transformations, indicating the assumption of difference stationarity may be unsuitable in some cases.

ARIMA modelling proves inappropriate for both the Nikkei and Topix, indicating that these series may not have a difference stationary structure. The logarithms of the Nikkei and Topix, on the other hand, are modelled well as ARIMA(1,1,1) and ARIMA(1,1,0) respectively. This means there is no reason to doubt the difference stationary structure of these logarithmic transformations.

The structures of the logarithmic transformations of the Nikkei and Topix are not the same; the transformed Nikkei has both AR and MA components, while the transformed Topix has only an AR component. This is very interesting as it means the two stock market indices should not be treated as interchangeable, as is sometimes done at present.

Golf - all, Golf - east, and both their logarithmic transformations have the same structure: ARIMA(0,1,6). The fact that these data admit an ARIMA structure provides evidence supporting the assumption of difference stationarity. However, the high order of these models (and the high lag 8 autocorrelation of the residuals for all four models) suggests that the appropriateness of ARIMA methodology in these cases is questionable.

Golf - west, Golf - tokyo, and both their logarithmic transformations could not be modelled well using the ARIMA structure.
This indicates these indices may not have a difference stationary structure. The best ARIMA models found were all of high order and varied between containing MA components, AR components, and both. The autocorrelation functions of the associated residuals all include high values, which provides evidence that these models are not adequately capturing the correlation structure of the data.

In comparing the four golf time series, it is interesting to note that the only two series whose variation is captured using an ARIMA process (Golf - all and Golf - east) have the same underlying structure, an ARIMA(0,1,6). All four golf time series seem very similar when compared using graphical analytical tools (as was done in Chapter 5). The surprising fact that a consistent structure was found for only two of these variables indicates the importance of more sophisticated analysis in understanding time series.

In comparing the stock time series with the golf time series, it can be seen that these groups of series do not have a similar ARIMA structure. This is a surprising result and is important since land and stock are sometimes considered closely related.

One last observation should be discussed. The residuals of many of the ARIMA models include high order autocorrelations which are greater than two standard errors from zero. This suggests there may be some long-term persistence in this data. Determining if this persistence exists and modelling it using fractional differencing would be an interesting topic for further investigation. It may be that allowing for long-term persistence will result in models showing the series to be more similar.

9.2 Background

This thesis works with two separate stock market time series and four golf course membership price time series. In trying to understand these series and any relationship between them, two logical questions arise.
Firstly, do the time series within each group have similar structures (i.e., are the two stock market series similar to one another in structure, and are the four golf course membership price series similar to one another in structure)? Secondly, are the structures of the stock indices and the golf indices alike?

The first step in answering the above two questions is to determine the structure of each individual series. As discussed in section 4.3, one way of looking at a data series is to assume that some underlying process is generating the numbers and that the time series which arises in reality is a sample of this process. The purpose of analysis is to use this sample in an attempt to discover a reasonable model which approximates the underlying process. Thus, in this chapter, the time series are modelled in order to gain some insight into the structure of the sample and its underlying process.

The second step in answering the two main questions of this chapter is then to compare the resulting models and thereby compare the series. If two indices are found to have similar models, this indicates the two time series have similar structures. The same logic holds for comparing the models of the two stock indices, comparing the models of the four golf indices, and comparing the group of stock time series with the group of golf time series. These steps are carried out in this chapter and the resulting conclusions described.

9.2.1 Is the ARIMA model appropriate?

Of the many models available for time series analysis, the data in this paper is modelled using the ARIMA (autoregressive integrated moving average) model, originally developed by Box and Jenkins (1970). This is a well developed method of time series analysis and has been found to be appropriate for many economic time series. The ARIMA model allows for differencing to stationarity, with the remaining structure containing possible autoregressive and moving average aspects.
Combinations of past and current observations and error terms are used to define the structure of the time series.

The assumptions underlying the ARIMA model are based on Wold's decomposition and normality. Wold's decomposition says that "every weakly stationary, purely nondeterministic, stochastic process ... can be written as a linear combination (or linear filter) of a sequence of uncorrelated random variables" (Mills 1990, 67). This sequence of uncorrelated random variables is infinite, however, making it impossible to estimate. For the sequence to be reduced to a finite number of terms of reasonable order (as in ARIMA theory), the uncorrelated random variables of Wold's decomposition must also be assumed to be independent; this allows the theory of the ARIMA methodology to work. Thus, it can be seen that normality (discussed in Chapter 5) is very convenient, since, in the presence of normality, uncorrelated and independent are equivalent.

Unfortunately, these assumptions do not all hold perfectly for the stock and golf time series under analysis. Stationarity was discussed in Chapter 7, where it was determined that difference stationarity is appropriate for both the original and logarithmically transformed stock indices, but questionable for those of the golf indices. Nondeterministic means that "any linearly deterministic components have been subtracted" from the data (Mills 1990, 67); thus, the second condition would be true if the data is such that difference stationarity is appropriate. (See section 9.2.3 for a description of the concept of difference stationarity.) Finally, the normality of the data was discussed in Chapter 5; the first differences and first differences of the logarithms of the stock data and the first differences of the logarithm of Golf - west were shown to approximate normality.

The stock indices seem to comply with the assumptions of the ARIMA methodology; thus it is appropriate to perform an ARIMA analysis on this data.
However, it is reasonable to question whether ARIMA modelling would be a useful exercise in the case of the golf indices, since there is doubt regarding the validity of the assumptions.

ARIMA modelling is important, however, for two main reasons. Firstly, even though the assumptions do not hold perfectly, the ARIMA modelling process may still provide some insight into the structure of the data. Secondly, the success of the modelling exercise will help in determining the degree to which the assumptions are valid. In other words, the success (or failure) of the ARIMA methodology will provide further evidence for accepting (or rejecting) the questionable assumption of difference stationarity.

In particular, the presence of difference stationarity is important to establish, as it is also an assumption which will be used in Chapters 10 and 11 when tests for cointegration are done. For this reason, this chapter de-emphasizes the assumption of normality and emphasizes the possibility that the lack of difference stationarity is the culprit when ARIMA modelling proves inappropriate.

9.2.2 Brief description of the ARIMA model

Since the ARIMA or Box-Jenkins methodology is so well known, this section provides only a short summary.

In applying an autoregressive integrated moving average (ARIMA) model, the first step is to difference the data d times until a stationary structure is achieved. The autoregressive and moving average aspects of the data (of orders p and q respectively) are then estimated. When the application of the model is complete, all the covariance structure in the data should be described, thus leaving residuals which are white noise.

The ARIMA methodology requires differencing to achieve stationarity. If the resulting d equals 0, then stationarity exists to begin with and therefore no differencing is required.
For d > 0, it is assumed the original data is nonstationary in a manner for which stationarity can be brought about by differencing (i.e., the data is difference stationary).

It is from this first step (differencing to stationarity) that the "integrated" part of the model title derives. "Integrated" refers to the fact that one must add together (or integrate) the differenced values of the resultant series in order to compare them directly to the original (undifferenced) data. Integration and its degree will be specified in this paper as I(0) for stationarity (data is not differenced), I(1) for first order integration (first differencing required to reach stationarity), I(2) for second order integration (second differencing required to reach stationarity), and so on.

The second step in applying the ARIMA model is to apply to the differenced data the stationary modelling technique known as ARMA(p,q), which is an autoregressive moving average process with an autoregressive (AR) order of p and a moving average (MA) order of q:

$$y_t - a_1 y_{t-1} - \dots - a_p y_{t-p} = \varepsilon_t - b_1 \varepsilon_{t-1} - \dots - b_q \varepsilon_{t-q}.$$

Under certain conditions, backsubstitution can be used to rewrite this model as either a pure AR or a pure MA. This backsubstitution means, of course, that any model found may be rewritten in many different ways. When choosing between these alternate models, it is best to keep the model as simple as possible by minimizing the sum of the orders p+q (i.e., parsimonious parameterization).

Finally, diagnostic checking must take place to determine if the model resulting from the above steps is appropriate. This can be done in two ways. First, the residuals of the model should be noise if the model chosen is correct; this may be determined by looking at the autocorrelation function of the residuals. (Section 9.2.4 discusses this.)
Second, models with higher p and q may be fit; these additional model variables will have insignificant coefficients if the original model is a suitable one for the data.

9.2.3 Difference stationary vs. trend stationary

As discussed above, the ARIMA model assumes difference stationarity (i.e., the structure is such that differencing brings about stationarity). This section describes more specifically the meaning of difference stationarity. Trend stationarity, which is another model of how data may be structured, is also described. If data is such that trend stationarity is appropriate, then the ARIMA model is not suitable; thus it is important to be aware of this concept in order to make well informed decisions about what assumptions are appropriate for the data under analysis. In addition, the effect of external shocks is very dissimilar in the cases of difference stationarity and trend stationarity.

For difference stationarity, the original series is differenced to get $\nabla y_t = \beta + \varepsilon_t$, where $\varepsilon_t$ is a zero mean series with a stationary structure admitting an ARMA representation. A simple example of a difference stationary structure would be a random walk with drift, $\nabla y_t = \beta + \varepsilon_t$, where $\varepsilon_t$ is noise. Through backsubstitution, this model can be written as

$$y_t = y_0 + t\beta + \sum_{i=1}^{t} \varepsilon_i.$$

This equation shows that both the variance and autocovariances of $y_t$ depend on t, thus showing that $y_t$ does not have a stationary structure. ($y_t$ does, however, have a difference stationary structure since first differencing induces stationarity.) Another important characteristic to note is that each value of $y_t$ is influenced by past observations due to the accumulation of nondiscounted errors ($\sum \varepsilon_i$).

Another type of stationarity is trend stationarity. In this case, $y_t$ can be divided into a trend aspect and a zero mean series with a stationary structure ($u_t$).
An example would be an equation including linear trend, $y_t = \alpha + \beta t + u_t$, which can be rewritten as $\nabla y_t = \beta + \nabla u_t$. The variance of $y_t$ is constant and past events have no effect on present or future observations.

Trend stationarity is a popular model because the trend can be easily removed (using regression on time in the above example), thereby leaving only the related stationary residuals to be explained. If the data is better modelled by trend stationarity than difference stationarity, ARIMA modelling is obviously not appropriate (since differencing will not induce stationarity). It is not easily determined whether a particular set of data is most appropriately modelled by difference or trend stationarity, and making the wrong assumption can often leave analytical results uninterpretable or, even worse, spurious.

It is important to recognize that difference stationarity and trend stationarity lead to extremely dissimilar interpretations of the effect external shocks have on the data. As an example, assume the government takes some particular action which changes interest rates. This policy will have a permanent long-term effect on future interest rates if interest rates have a difference stationary structure, while it would have only a transitory effect if interest rates have a trend stationary structure.

9.2.4 Autocorrelation functions and ARIMA processes

Fitting an ARIMA model is done in an attempt to describe the variation in the data; thus autocorrelation functions (acf) are useful in ARIMA modelling. The description of the data's variability given by its acf is used in determining the most effective ARIMA model. In addition, the acf of the residuals of a given ARIMA model is useful in determining if the model has adequately described the data's variability.

A pure AR(p) process has an acf which quickly (geometrically) declines.
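The geometric decline of an autoregressive acf can be illustrated with the simplest case. The following is a minimal sketch, assuming a stationary AR(1) process for which the theoretical autocorrelation at lag k is the coefficient raised to the power k; the function name `ar1_acf` and the coefficient value 0.5 are illustrative assumptions, not taken from the analysis in this thesis.

```python
def ar1_acf(a1, max_lag):
    """Theoretical autocorrelations rho_1..rho_max_lag of a stationary AR(1)
    process y_t = a1 * y_{t-1} + e_t, for which rho_k = a1 ** k."""
    if not -1 < a1 < 1:
        raise ValueError("an AR(1) process is stationary only for |a1| < 1")
    return [a1 ** k for k in range(1, max_lag + 1)]


# Illustrative coefficient (an assumption, not an estimate from the thesis data).
rho = ar1_acf(0.5, 4)

# Each autocorrelation equals the previous one multiplied by a1,
# which is what "geometric decline" means here.
ratios = [rho[k] / rho[k - 1] for k in range(1, len(rho))]

print(rho)
print(ratios)
```

For higher order AR processes the decline is a mixture of such geometric terms rather than a single one, so this sketch should be read only as the simplest instance of the pattern.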
A pure MA(q) process has significant autocorrelations for lags less than or equal to q, and autocorrelations which are approximately zero for lags greater than q. A mixed ARMA process will have an acf which is infinite in extent and declines as the lag k increases.

The autocorrelation function of the residuals of a given model provides an excellent means of determining how successful the model has been. Residuals have autocorrelations close to zero (i.e., they approximate noise) if the ARIMA model has been effective in describing the data's covariance structure.

9.3 Actions taken

The ARIMA modelling process has three steps: (a) identifying p, d, and q; (b) estimating the model parameters; (c) diagnostic checking. This section describes the implementation of these steps for each data series under analysis. Because the Box-Jenkins methodology is well known, only a short summary of these actions is given.

The appropriate degree of differencing for the data series was decided when discussing stationarity in Chapter 7. Reasonable orders of p and q for the data were estimated using the data's autocorrelation function. The related parameters were then estimated and the autocorrelation function of the resulting residuals compared to that of noise.

For all indices, the first model chosen did not produce residuals which approximated noise; thus, higher orders of p and q were chosen and the related models estimated. This process was repeated until a model was found whose residuals approximated noise. This candidate model then went through further diagnostic checking: higher order models were estimated, with the candidate model being considered appropriate if the additional coefficients were not significantly different from zero. (The significance of the model coefficients was determined using t-tests in the usual manner.)

9.4 Detailed analysis

9.4.1 Nikkei

Table 16. Parameter estimates for ARIMA models fitted to Nikkei.
Model         Parameter  Estimate  t-statistic
ARIMA(1,1,1)  a1          0.75222        4.130
              b1          0.66437        3.249
ARIMA(2,1,1)  a1          0.29692        0.959
              a2          0.11549        2.139
              b1          0.24032        0.772
ARIMA(1,1,2)  a1          0.31681        1.152
              b1          0.26236        0.960
              b2         -0.12700       -2.530

Table 17. Acf of residuals for ARIMA models fitted to Nikkei.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.07 0.13 0.07 -.01 0.01 0.03 0.09 -.01 0.09 -.09 -.09 -.05  0.04
              13-24  -.02 -.02 0.00 0.03 -.04 -.01 -.01 0.02 0.05 0.13 0.05 0.01  0.05
ARIMA(1,1,1)  1-12   -.03 0.07 0.01 -.06 -.02 0.01 0.08 -.02 0.10 -.09 -.09 -.05  0.04
              13-24  -.01 -.01 0.00 0.04 -.05 -.02 -.02 0.02 0.04 0.12 0.04 0.00  0.05

Table 18. Parameter estimates for ARIMA models fitted to the logarithm of the Nikkei.

Model         Parameter  Estimate  t-statistic
ARIMA(1,1,1)  a1          0.76814        3.020
              b1          0.70963        2.566
ARIMA(2,1,1)  a1          0.37899        0.762
              a2          0.05915        1.007
              b1          0.33224        0.667
ARIMA(1,1,2)  a1          0.37182        0.836
              b1          0.32702        0.737
              b2         -0.06720       -1.241

Table 19. Acf of residuals for ARIMA models fitted to the logarithm of the Nikkei.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.05 0.08 0.06 -.02 -.01 0.01 0.09 0.02 0.07 -.07 -.06 0.00  0.04
              13-24  0.01 -.00 0.05 0.07 -.03 0.00 -.04 -.01 0.02 0.08 0.04 -.03  0.05
ARIMA(1,1,1)  1-12   -.01 0.03 0.02 -.05 -.03 -.01 0.08 0.01 0.06 -.08 -.07 0.00  0.04
              13-24  0.01 0.00 0.04 0.07 -.04 0.00 -.05 -.01 0.02 0.08 0.04 -.03  0.05

9.4.2 Topix

Table 20. Parameter estimates for ARIMA models fitted to the Topix.

Model         Parameter  Estimate  t-statistic
ARIMA(1,1,0)  a1          0.21981        5.056
ARIMA(0,1,1)  b1         -0.18349       -4.192
ARIMA(1,1,1)  a1          0.53534        3.442
              b1          0.33085        1.910
ARIMA(2,1,0)  a1          0.19917        4.486
              a2          0.09180        2.059
ARIMA(3,1,0)  a1          0.19781        4.433
              a2          0.08892        1.956
              a3          0.01533        0.342
ARIMA(0,1,2)  b1         -0.18896       -4.268
              b2         -0.11983       -2.700
ARIMA(0,1,3)  b1         -0.20080       -4.513
              b2         -0.13877       -3.077
              b3         -0.07846       -1.757

Table 21. Acf of residuals for ARIMA models fitted to the Topix.
Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.22 0.13 0.06 -.02 0.00 0.04 0.09 0.05 0.08 -.05 -.11 -.10  0.04
              13-24  -.01 -.04 0.03 0.07 0.00 0.03 -.02 -.01 -.03 0.10 0.09 -.01  0.05
ARIMA(1,1,0)  1-12   -.02 0.08 0.04 -.03 0.00 0.02 0.08 0.01 0.08 -.04 -.09 -.08  0.04
              13-24  0.02 -.05 0.03 0.07 -.02 0.03 -.03 0.00 -.06 0.10 0.08 -.04  0.05
ARIMA(0,1,1)  1-12   0.02 0.12 0.04 -.03 0.00 0.02 0.09 0.02 0.08 -.04 -.09 -.09  0.04
              13-24  0.01 -.04 0.03 0.07 -.02 0.03 -.03 0.01 -.05 0.10 0.08 -.03  0.05
ARIMA(2,1,0)  1-12   0.00 0.00 0.02 -.05 -.01 0.02 0.08 0.02 0.08 -.04 -.10 -.08  0.04
              13-24  0.02 -.04 0.03 0.07 -.02 0.03 -.03 -.01 -.06 0.11 0.08 -.05  0.05
ARIMA(0,1,2)  1-12   0.01 0.01 0.07 -.03 -.01 0.03 0.08 0.02 0.08 -.04 -.10 -.08  0.04
              13-24  0.02 -.05 0.03 0.07 -.02 0.03 -.02 -.01 -.06 0.11 0.08 -.05  0.05

Table 22. Parameter estimates for ARIMA models fitted to the logarithm of the Topix.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,1)  b1         -0.22892       -5.286
ARIMA(1,1,0)  a1          0.24760        5.736
ARIMA(1,1,1)  a1          0.38103        2.301
              b1          0.14237        0.804
ARIMA(2,1,0)  a1          0.23968        5.381
              a2          0.03220        0.720

Table 23. Acf of residuals for ARIMA models fitted to the logarithm of the Topix.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.25 0.09 0.06 -.04 -.02 0.02 0.11 0.07 0.05 -.03 -.09 -.06  0.04
              13-24  0.01 -.02 0.08 0.10 0.03 0.04 -.03 -.04 -.04 0.08 0.07 -.05  0.05
ARIMA(1,1,0)  1-12   -.01 0.02 0.05 -.05 -.01 0.00 0.10 0.04 0.05 -.03 -.07 -.05  0.04
              13-24  0.03 -.05 0.07 0.09 -.01 0.04 -.03 -.02 -.06 0.08 0.08 -.07  0.05
ARIMA(0,1,1)  1-12   0.01 0.08 0.05 -.04 -.01 0.00 0.10 0.04 0.05 -.03 -.07 -.06  0.04
              13-24  0.03 -.04 0.07 0.09 0.00 0.05 -.04 -.02 -.06 0.08 0.07 -.07  0.05
ARIMA(1,1,1)  1-12   0.00 -.01 0.04 -.06 -.02 0.00 0.10 0.04 0.05 -.03 -.08 -.05  0.04
              13-24  0.03 -.05 0.07 0.09 -.01 0.04 -.04 -.03 -.06 0.09 0.08 -.08  0.05

9.4.3 Golf - all

Table 24. Parameter estimates for ARIMA models fitted to Golf - all.
Model         Parameter  Estimate  t-statistic
ARIMA(0,1,6)  b1         -0.81523       -20.25
              b2         -0.84346       -16.35
              b3         -0.73949       -12.56
              b4         -0.48628        -8.21
              b5         -0.26670        -5.19
              b6         -0.25610        -6.33
ARIMA(0,1,7)  b1         -0.80367       -19.37
              b2         -0.82494       -15.63
              b3         -0.71204       -11.47
              b4         -0.43649        -6.55
              b5         -0.19975        -3.22
              b6         -0.20159        -3.83
              b7          0.08069         1.95

Table 25. Acf of residuals for ARIMA models fitted to Golf - all.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.81 0.67 0.49 0.32 0.22 0.19 0.14 0.17 0.16 0.17 0.17 0.16  0.04
              13-24  0.17 0.17 0.18 0.18 0.18 0.17 0.18 0.19 0.19 0.18 0.17 0.15  0.09
ARIMA(0,1,6)  1-12   -.02 0.01 -.02 0.00 0.01 0.01 0.02 0.11 -.02 0.04 0.01 0.00  0.04
              13-24  0.08 -.02 0.05 0.05 0.05 -.01 -.01 0.07 0.03 0.05 0.02 0.03  0.04

Table 26. Parameter estimates for ARIMA models fitted to the logarithm of Golf - all.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,6)  b1         -0.76043       -18.78
              b2         -0.85298       -16.98
              b3         -0.74417       -12.84
              b4         -0.48474        -8.31
              b5         -0.27519        -5.50
              b6         -0.23972        -5.90
ARIMA(0,1,7)  b1         -0.74439       -17.90
              b2         -0.83419       -16.23
              b3         -0.70941       -11.59
              b4         -0.43301        -6.59
              b5         -0.21127        -3.46
              b6         -0.18594        -3.62
              b7          0.07631         1.84

Table 27. Acf of residuals for ARIMA models fitted to the logarithm of Golf - all.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.80 0.68 0.50 0.34 0.26 0.25 0.22 0.26 0.26 0.25 0.23 0.21  0.04
              13-24  0.20 0.20 0.19 0.20 0.20 0.19 0.19 0.19 0.19 0.17 0.16 0.14  0.09
ARIMA(0,1,6)  1-12   -.02 0.00 -.01 -.01 0.01 0.03 0.02 0.14 0.03 0.03 0.03 0.01  0.04
              13-24  0.04 0.01 0.02 0.08 0.03 0.02 -.01 0.03 0.06 0.03 0.04 0.01  0.04

Table 28. Parameter estimates for ARIMA models fitted to Golf - east.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,6)  b1         -0.81203       -19.80
              b2         -0.78748       -14.86
              b3         -0.65752       -10.93
              b4         -0.36825        -6.08
              b5         -0.13795        -2.62
              b6         -0.16865        -4.09
ARIMA(0,1,7)  b1         -0.79663       -19.22
              b2         -0.77413       -14.56
              b3         -0.61529        -9.96
              b4         -0.30524        -4.63
              b5         -0.06457        -1.05
              b6         -0.08440        -1.59
              b7          0.09343         2.26

Table 29. Acf of residuals for ARIMA models fitted to Golf - east.
Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.77 0.57 0.34 0.13 0.04 0.02 0.00 0.05 0.06 0.06 0.07 0.06  0.04
              13-24  0.06 0.07 0.08 0.09 0.11 0.11 0.11 0.11 0.11 0.09 0.08 0.07  0.07
ARIMA(0,1,6)  1-12   -.02 0.00 -.02 -.04 -.02 -.02 -.04 0.10 0.00 -.01 0.04 -.02  0.04
              13-24  0.00 0.02 0.01 0.02 0.02 0.03 0.01 -.01 0.07 -.01 0.01 0.02  0.04

Table 30. Parameter estimates for ARIMA models fitted to the logarithm of Golf - east.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,6)  b1         -0.72341       -17.72
              b2         -0.81970       -16.39
              b3         -0.68492       -11.93
              b4         -0.44949        -7.79
              b5         -0.22398        -4.50
              b6         -0.20620        -5.04
ARIMA(0,1,7)  b1         -0.70655       -16.97
              b2         -0.80528       -15.87
              b3         -0.64842       -10.77
              b4         -0.40790        -6.37
              b5         -0.16963        -2.82
              b6         -0.15690        -3.10
              b7          0.06698         1.61

Table 31. Acf of residuals for ARIMA models fitted to the logarithm of Golf - east.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.77 0.64 0.44 0.27 0.17 0.16 0.12 0.17 0.17 0.16 0.15 0.12  0.04
              13-24  0.11 0.12 0.12 0.14 0.13 0.14 0.13 0.13 0.13 0.11 0.10 0.09  0.08
ARIMA(0,1,6)  1-12   -.02 -.01 -.01 -.03 0.00 0.01 0.00 0.10 0.06 -.01 0.05 -.02  0.04
              13-24  -.01 0.05 -.01 0.07 0.00 0.04 0.01 0.00 0.07 -.01 0.01 0.02  0.04

Table 32. Parameter estimates for ARIMA models fitted to Golf - west.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,9)  b1         -0.41805       -10.22
              b2         -0.51829       -12.10
              b3         -0.65543       -14.34
              b4         -0.69289       -13.89
              b5         -0.47196        -8.75
              b6         -0.45610        -9.15
              b7         -0.34478        -7.54
              b8         -0.29193        -6.80
              b9         -0.18645        -4.56

Table 33. Acf of residuals for ARIMA models fitted to Golf - west.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.78 0.74 0.69 0.62 0.49 0.44 0.38 0.35 0.32 0.29 0.30 0.31  0.04
              13-24  0.32 0.30 0.30 0.30 0.29 0.25 0.25 0.31 0.23 0.25 0.25 0.22  0.11
ARIMA(0,1,9)  1-12   0.01 0.01 0.00 0.00 -.01 0.02 0.03 0.03 0.06 0.07 0.01 0.03  0.04
              13-24  0.09 0.03 0.08 0.06 0.06 -.09 -.09 0.22 -.07 0.08 0.13 0.01  0.04

Table 34. Parameter estimates for ARIMA models fitted to the logarithm of Golf - west.
Model         Parameter  Estimate  t-statistic
ARIMA(5,1,0)  a1          0.38978        9.436
              a2          0.27736        6.308
              a3          0.16786        3.739
              a4          0.14172        3.223
              a5         -0.10657       -2.580

Table 35. Acf of residuals for ARIMA models fitted to the logarithm of Golf - west.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.75 0.72 0.68 0.65 0.57 0.52 0.46 0.46 0.42 0.38 0.38 0.38  0.04
              13-24  0.38 0.37 0.36 0.36 0.37 0.32 0.31 0.36 0.31 0.32 0.32 0.29  0.12
ARIMA(5,1,0)  1-12   -.01 0.00 0.03 0.02 0.03 -.05 -.10 0.03 -.01 -.11 -.01 0.01  0.04
              13-24  0.02 0.04 0.02 0.03 0.10 -.09 -.10 0.15 -.04 0.06 0.12 -.01  0.04

Table 36. Parameter estimates for ARIMA models fitted to Golf - tokyo.

Model         Parameter  Estimate  t-statistic
ARIMA(0,1,4)  b1         -0.50969       -12.37
              b2         -0.51091       -11.29
              b3         -0.25153        -5.58
              b4         -0.11470        -2.78
ARIMA(3,1,0)  a1          0.50637        12.34
              a2          0.26302         5.87
              a3         -0.13851        -3.38

Table 37. Acf of residuals for ARIMA models fitted to Golf - tokyo.

Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.60 0.48 0.26 0.18 0.10 0.14 0.08 0.16 0.14 0.13 0.14 0.09  0.04
              13-24  0.09 0.09 0.12 0.11 0.10 0.11 0.08 0.05 0.08 0.06 0.05 0.05  0.07
ARIMA(0,1,4)  1-12   0.00 0.02 0.02 0.04 0.02 0.11 -.10 0.11 0.05 0.02 0.09 -.02  0.04
              13-24  0.01 0.02 0.08 0.01 0.01 0.07 0.02 -.04 0.06 0.01 0.01 0.04  0.04
ARIMA(3,1,0)  1-12   0.00 0.00 0.02 -.04 -.04 0.10 -.09 0.11 0.04 0.00 0.09 -.02  0.04
              13-24  -.01 0.01 0.07 0.01 0.01 0.07 0.01 -.04 0.06 0.01 0.00 0.04  0.04

Table 38. Parameter estimates for ARIMA models fitted to the logarithm of Golf - tokyo.

Model         Parameter  Estimate  t-statistic
ARIMA(1,1,3)  a1          0.85157        15.88
              b1          0.43535         6.18
              b2         -0.14265        -2.88
              b3          0.16586         3.04

Table 39. Acf of residuals for ARIMA models fitted to the logarithm of Golf - tokyo.
Model         Lags   Residual autocorrelations                                    Std. error
ARIMA(0,1,0)  1-12   0.59 0.52 0.35 0.26 0.20 0.24 0.17 0.24 0.23 0.21 0.20 0.14  0.04
              13-24  0.16 0.16 0.18 0.17 0.16 0.16 0.14 0.11 0.11 0.08 0.09 0.10  0.07
ARIMA(1,1,3)  1-12   0.02 -.01 0.03 -.07 -.08 0.07 -.10 0.07 0.06 0.02 0.05 -.05  0.04
              13-24  -.01 0.01 0.06 0.03 0.01 0.06 0.03 -.02 0.03 -.04 0.00 0.05  0.04

9.5 Description of analysis

In this section, "significant" and "insignificant" refer to significance at the 0.05 level, "d=1" indicates first differences, and "log" refers to the logarithmic transformation. The autocorrelation functions discussed include the first twenty-four lags.

9.5.1 Nikkei

9.5.1.1 Nikkei, d=1

The first differences of the Nikkei have a lag 2 autocorrelation which is greater than three standard errors from zero, and many others greater than two standard errors from zero.

The ARIMA(1,1,1) model has significant coefficients and residuals with an autocorrelation function whose lags 7, 9, 10, 11 and 22 are greater than or equal to two standard errors from zero.

Both the first differences of the data and the ARIMA(1,1,1) residuals have autocorrelations of lag 22 which are higher than two standard errors from zero.

The models included for diagnostic checking, ARIMA(2,1,1) and (1,1,2), both contain coefficient estimates which are insignificant.

9.5.1.2 Nikkei, log, d=1

The first differences of the logarithm of the Nikkei have a lag 2 autocorrelation which is two standard errors from zero and a lag 7 autocorrelation which is greater than two standard errors from zero.

The ARIMA(1,1,1) model has significant coefficients and residuals with an autocorrelation function whose lags 7 and 10 are equal to two standard errors from zero.

The models included for diagnostic checking, ARIMA(2,1,1) and (1,1,2), both contain insignificant coefficients.
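The comparison of residual autocorrelations against multiples of their standard errors, used throughout this section, can be sketched as follows. This is a minimal illustration rather than the procedure actually used to produce the tables: the function name `flag_lags` is hypothetical, the values are the ARIMA(1,1,1) residual autocorrelations for the logarithm of the Nikkei from Table 19 expressed in hundredths (to avoid floating point rounding at the boundary), and it is assumed that the two standard errors reported per model apply to lags 1-12 and lags 13-24 respectively.

```python
def flag_lags(acf_hundredths, se_hundredths, first_lag=1, multiple=2):
    """Return the lags whose autocorrelation is at least `multiple`
    standard errors from zero. All values are integers in hundredths."""
    threshold = multiple * se_hundredths
    return [first_lag + i
            for i, r in enumerate(acf_hundredths)
            if abs(r) >= threshold]


# Table 19, ARIMA(1,1,1) residuals for the logarithm of the Nikkei (hundredths).
lags_1_12 = [-1, 3, 2, -5, -3, -1, 8, 1, 6, -8, -7, 0]    # assumed std. error 0.04
lags_13_24 = [1, 0, 4, 7, -4, 0, -5, -1, 2, 8, 4, -3]     # assumed std. error 0.05

flagged = flag_lags(lags_1_12, 4) + flag_lags(lags_13_24, 5, first_lag=13)
print(flagged)
```

Under these assumptions the flagged lags are 7 and 10, in agreement with the description given in section 9.5.1.2.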
9.5.2 Topix

9.5.2.1 Topix, d=1

The autocorrelations of lags 1 and 2 of the first differences of the Topix are greater than three standard errors from zero, and five of the remaining twenty-two lags of the autocorrelation function have values equal to or greater than two standard errors from zero.

The two first order models, ARIMA(1,1,0) and (0,1,1), both have significant coefficients. The residuals of these first order AR and MA models have lag 2 autocorrelations which are two and three standard errors from zero respectively. For both these models, the residuals' autocorrelations of lags 7, 9, 11, 12, 22 and 23 are greater than or equal to two standard errors from zero.

ARIMA(2,1,0) and (0,1,2) both have significant coefficients and both have residuals whose autocorrelations for lags 7, 9, 11, 12, and 22 are greater than or equal to two standard errors from zero.

The models included for diagnostic checking, ARIMA(1,1,1), (3,1,0), and (0,1,3), all include insignificant coefficients.

9.5.2.2 Topix, log, d=1

The autocorrelation of lag 1 of the first differences of the logarithm of the Topix is greater than three standard errors from zero, and many other lags are also high.

The two models which have significant coefficients are ARIMA(1,1,0) and (0,1,1). The MA model's residuals have a lag 2 autocorrelation which is two standard errors from zero, and lags 7 and 16 which are greater than two standard errors from zero. The AR model's residuals have only two autocorrelations which are greater than two standard errors from zero (lags 7 and 16).

The models included for the purposes of diagnostic checking, ARIMA(1,1,1) and (2,1,0), contain insignificant coefficients.

9.5.3 Golf - all

9.5.3.1 Golf - all, d=1

The first differences of Golf - all have an autocorrelation function which begins very high and declines somewhat to a moderately high level.
The ARIMA(0,1,6) model has coefficients which are all significant; its residuals' autocorrelation function has two lags (lags 8 and 13) which are greater than or equal to two standard errors from zero.

The ARIMA(0,1,7) and ARIMA(1,1,6) models have coefficients which are not significant. (The ARIMA(1,1,6) results are not shown in the interest of brevity.)

9.5.3.2 Golf - all, log, d=1

The first differences of the logarithm of Golf - all have an autocorrelation function which begins very high and declines to a moderately high level.

The ARIMA(0,1,6) model has significant coefficients, and its residuals' autocorrelation function has lags 8 and 16 which are greater than three and two standard errors from zero respectively.

The ARIMA(0,1,7) model includes one insignificant coefficient.

9.5.4 Golf - east

9.5.4.1 Golf - east, d=1

The autocorrelation function of the first differences of Golf - east has very high autocorrelations for the first four lags, with the remaining lags all having values less than two standard errors from zero.

The ARIMA(0,1,6) model has coefficients which are all significant and its residuals' autocorrelation function includes one lag (lag 8) which is greater than two standard errors from zero.

The ARIMA(0,1,7) model contains insignificant coefficients.

9.5.4.2 Golf - east, log, d=1

The autocorrelation function of the first differences of the logarithm of Golf - east has very high autocorrelations for low lags.

The ARIMA(0,1,6) model has coefficients which are all significant and its residuals' autocorrelation function includes one lag (lag 8) which is greater than two standard errors from zero.

The ARIMA(0,1,7) model contains an insignificant coefficient.

9.5.5 Golf - west

9.5.5.1 Golf - west, d=1

The autocorrelation function of the first differences of Golf - west begins with very high autocorrelations for low lags and declines over the lags to a moderately high level.
The ARIMA(0,1,9) model has coefficients which are all significant and its residuals' autocorrelation function contains many values in the second dozen lags which are greater than or equal to two standard errors from zero.

9.5.5.2 Golf - west, log, d=1

The autocorrelation function of the first differences of the logarithm of Golf - west begins with very high autocorrelations for low lags and declines over the lags to a moderately high level.

The ARIMA(5,1,0) model has coefficients which are all significant and its residuals' autocorrelation function contains many values which are greater than two standard errors from zero.

9.5.6 Golf - tokyo

9.5.6.1 Golf - tokyo, d=1

The autocorrelation function of the first differences of Golf - tokyo begins with very high autocorrelations for low lags and declines over the lags to be less than one standard error from zero.

Both the ARIMA(0,1,4) and (3,1,0) models have coefficients which are all significant. Their residuals' autocorrelation functions contain many values which are greater than or equal to two standard errors from zero.

9.5.6.2 Golf - tokyo, log, d=1

The autocorrelation function of the first differences of the logarithm of Golf - tokyo begins with very high autocorrelations for low lags and declines over the lags to be less than two standard errors from zero.

The ARIMA(1,1,3) model has coefficients which are all significant. The related residuals' autocorrelation function contains two values (lags 5 and 7) which are greater than or equal to two standard errors from zero.

9.6 Evaluative discussion

In this section, "significant" and "insignificant" refer to significance at the 0.05 level, "d=1" indicates first differences, and "log" refers to the logarithmic transformation. The autocorrelation functions discussed include the first twenty-four lags. Difference stationarity is frequently discussed in this section; see section 9.2.3 for a discussion of this concept.
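Because the evaluation that follows leans on the assumption of difference stationarity, a small numerical reminder of the distinction drawn in section 9.2.3 may be helpful. The sketch below is illustrative only: the function names, the starting level of 100, the drift of 2, and the single shock of size 5 are all assumptions chosen for the example, and the stationary component of the trend stationary series is taken to be plain noise (the simplest case).

```python
def random_walk_with_drift(y0, drift, shocks):
    """Difference stationary: y_t = y_{t-1} + drift + e_t, so shocks accumulate."""
    path, level = [], y0
    for e in shocks:
        level = level + drift + e
        path.append(level)
    return path


def linear_trend(alpha, beta, shocks):
    """Trend stationary: y_t = alpha + beta * t + u_t, so shocks do not accumulate."""
    return [alpha + beta * t + u for t, u in enumerate(shocks, start=1)]


shocks = [0, 0, 5, 0, 0, 0]      # a single external shock at t = 3
no_shocks = [0] * len(shocks)

# Difference between the shocked and unshocked paths under each structure.
ds_gap = [a - b for a, b in zip(random_walk_with_drift(100, 2, shocks),
                                random_walk_with_drift(100, 2, no_shocks))]
ts_gap = [a - b for a, b in zip(linear_trend(100, 2, shocks),
                                linear_trend(100, 2, no_shocks))]

print(ds_gap)   # the shock persists in every later observation
print(ts_gap)   # the shock affects only the period in which it occurs
```

With a more general stationary component the effect in the trend stationary case would die out gradually rather than at once, but it would still be transitory, whereas under difference stationarity it remains permanent.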
9.6.1 Nikkei

9.6.1.1 Nikkei, d=1

The first differences of the Nikkei have an autocorrelation at lag 2 which is greater than three standard errors from zero, and numerous other high autocorrelations. This indicates that the series does not approximate noise and analysis must continue to determine a model which more accurately describes the variation of the data.

The first differences of the Nikkei are modelled best by the ARIMA(1,1,1) model. This model has significant coefficients; however, its residuals' autocorrelation function contains high autocorrelations in the latter half of the first dozen lags, indicating the model may not have adequately captured the variation in the data.

The fact that both the first differences of the data and the ARIMA(1,1,1) residuals have high autocorrelations for lag 22 indicates that there may be some structure to the variation of the data which is not being accounted for by the analysis done to this point.

Higher order models need not be considered further since both ARIMA(2,1,1) and ARIMA(1,1,2) contain insignificant coefficients.

9.6.1.2 Nikkei, log, d=1

The first differences of the logarithm of the Nikkei have an autocorrelation at lag 2 which is two standard errors from zero and one at lag 7 which is greater than two standard errors from zero. Since such a low order lag is significant, the analysis continues to determine if a more suitable model exists.

Taking the logarithm of the Nikkei does not seem to change the structure of this series since it is modelled best by the ARIMA(1,1,1) model, as was the untransformed series. All coefficients estimated for this model are significant. Its residuals' autocorrelation function contains two autocorrelations equal to two standard errors from zero; however, these lags are in the second half of the first dozen lags, and thus these residuals may be appropriately considered to approximate noise.
Diagnostic checking was done by fitting the higher order models ARIMA(2,1,1) and ARIMA(1,1,2); both these models have insignificant coefficients for the additional variables, indicating the ARIMA(1,1,1) has done an adequate job in describing the variation in the logarithm of the Nikkei.

The logarithm of the Nikkei is well described by the ARIMA(1,1,1) model. The good fit of this ARIMA model indicates that both the logarithmic transformation and the assumption of difference stationarity are appropriate.

9.6.2 Topix

9.6.2.1 Topix, d=1

The first differences of the Topix do not seem to approximate noise, since their first and second order autocorrelations are greater than three standard errors from zero and other lags are also high. Thus, further analysis must be done to determine a model which more accurately describes the variation of the data.

Models with significant parameters are ARIMA(1,1,0), (0,1,1), (2,1,0), and (0,1,2). The high second order correlations indicate the residuals of (1,1,0) and (0,1,1) do not approximate noise. The second order models have residuals showing four high autocorrelations in the latter half of the first dozen lags; these models are better at describing the data's variation than are the first order models, but still do not seem to do an adequate job. Higher order models need not be considered since the models ARIMA(1,1,1), (3,1,0), and (0,1,3) include insignificant coefficients.

It is interesting to note that for all four of the Topix models described, the autocorrelation functions of the residuals at lag 22 are greater than or equal to two standard errors from zero. This seems to indicate some structure to the variation of the data has not been described by the modelling done in this analysis.
9.6.2.2 Topix, log, d=1

The first differences of the logarithm of the Topix do not seem to approximate noise, since their first order autocorrelation is greater than three standard errors from zero and many other higher lag autocorrelations are also high.

Both first order models, ARIMA(1,1,0) and ARIMA(0,1,1), have significant coefficients. The MA model (0,1,1), however, has residuals with a lag 2 autocorrelation of two standard errors from zero, which causes some concern that the model is not capturing the data's variation. The AR model (1,1,0) has residuals whose autocorrelations approximate those of noise, with only one lag in the first dozen higher than two standard errors from zero (lag 7).

Diagnostic checking was done by fitting an ARIMA(2,1,0) and an ARIMA(1,1,1); in both models the additional variable had an insignificant coefficient, indicating that the ARIMA(1,1,0) model is appropriate.

The logarithm of the Topix is well described by the ARIMA(1,1,0) model. The good fit of this ARIMA model indicates that both the logarithmic transformation and the assumption of difference stationarity are appropriate.

9.6.3 Golf - all

9.6.3.1 Golf - all, d=1

The first differences of Golf - all have very high autocorrelations and do not approximate noise; thus, further analysis is needed to determine the correlation structure of this variable.

The best ARIMA model for this data has a high order: ARIMA(0,1,6). (Lower order models were analyzed and deemed inappropriate.) The fit of the ARIMA(0,1,6) model can be seen from its coefficients all being significant and its residuals' autocorrelation function approximating noise (the only two high autocorrelations are for high lags: lags 8 and 13). The good fit of this ARIMA model provides evidence that the assumption of difference stationarity is appropriate in this case, although the high order of the model casts some doubt.
Diagnostic checking shows that this model is appropriate, since the ARIMA(0,1,7) and ARIMA(1,1,6) models both contain insignificant coefficients. (The ARIMA(1,1,6) is not shown in the interest of brevity.)

9.6.3.2 Golf - all, log, d=1

The first differences of the logarithm of Golf - all have high autocorrelations and thus do not approximate noise. Further analysis is needed to more fully determine the correlation structure of this variable.

The ARIMA(0,1,6), another high order model, adequately describes the logarithm of Golf - all. Its coefficients are all significant and its residuals have only two large autocorrelations at high lags (lags 8 and 16). It is interesting to note that the residuals of the ARIMA(0,1,6) model for both Golf - all and the logarithm of Golf - all have an autocorrelation at lag 8 which is greater than two standard errors from zero. The good fit of this ARIMA model provides evidence that the assumption of difference stationarity is appropriate in this case, although the high order of the model casts some doubt.

Diagnostic checking substantiated the appropriateness of this model. The ARIMA(0,1,7) model has an insignificant coefficient. The second model included in this diagnostic checking procedure is the ARIMA(1,1,6) model, but its results cannot be calculated meaningfully and are therefore not shown; this fact further supports the chosen model.

9.6.4 Golf - east

9.6.4.1 Golf - east, d=1

The high autocorrelations of the first differences of Golf - east indicate further modelling is needed to describe the data adequately.

This series is best described by the high order model ARIMA(0,1,6). Its coefficients are all significant and the only high autocorrelation its residuals contain is for a high lag (lag 8). The good fit of this ARIMA model provides evidence that the assumption of difference stationarity is appropriate in this case, although the high order of the model casts some doubt.
The appropriateness of this model is confirmed with diagnostic checking. The ARIMA(0,1,7) model contains insignificant coefficients. The ARIMA(1,1,6) model has coefficients which are all significant; it is not chosen, however, since the most parsimonious model is desired (its results are not shown in the interest of brevity).

9.6.4.2 Golf - east, log, d=1

The high autocorrelations of the first differences of the logarithm of Golf - east indicate further modelling is needed to describe the data adequately.

The logarithm of Golf - east is best described by the high order model ARIMA(0,1,6). Its coefficients are all significant and the only high autocorrelation its residuals contain is for a moderately high lag (lag 8). It is interesting to note that the residuals of the ARIMA(0,1,6) model for Golf - all and Golf - east and both their transformations have an autocorrelation at lag 8 which is greater than two standard errors from zero. This consistency indicates there may be some structure to the data which is not being captured by the ARIMA models. Otherwise, the good fit of this ARIMA model provides evidence that the assumption of difference stationarity is appropriate in this case, although the high order of the model casts some doubt.

The appropriateness of this model is confirmed with diagnostic checking. The ARIMA(0,1,7) model contains an insignificant coefficient. The second model included in this diagnostic checking procedure is the ARIMA(1,1,6) model, but its results cannot be calculated meaningfully and are therefore not shown; this fact further supports the chosen model.

9.6.5 Golf - west

9.6.5.1 Golf - west, d=1

The high autocorrelations of the first differences of Golf - west indicate further modelling is needed to describe the variation of the data.

No ARIMA model is found which adequately describes this series. The best model found is the high order model ARIMA(0,1,9).
Its coefficients are all significant, but its residuals' autocorrelation function contains many values in the second dozen lags which are greater than or equal to two standard errors from zero. This indicates there is some structure to the variation of this series which is not being modelled by the standard ARIMA methodology. The assumption of difference stationarity is questionable.

9.6.5.2 Golf - west, log, d=1

The high autocorrelations of the first differences of the logarithm of Golf - west indicate further modelling is needed to describe the variation of the data.

No ARIMA model is found which adequately describes this series. The best model found is the high order model ARIMA(5,1,0). Its coefficients are all significant, but its residuals' autocorrelation function contains many values which are greater than two standard errors from zero. This indicates there is some structure to the variation of this series which is not being modelled by the standard ARIMA methodology. The assumption of difference stationarity is questionable.

9.6.6 Golf - tokyo

9.6.6.1 Golf - tokyo, d=1

The high autocorrelations of the first differences of Golf - tokyo indicate further modelling is needed to describe the variation of the data.

No ARIMA model is found which adequately describes this series. The best models found are ARIMA(0,1,4) and ARIMA(3,1,0). Their coefficients are all significant, but their residuals' autocorrelation functions contain many values which are greater than or equal to two standard errors from zero. This indicates there is some structure to the variation of this series which is not being modelled by the standard ARIMA methodology. The assumption of difference stationarity is questionable.

9.6.6.2 Golf - tokyo, log, d=1

The high autocorrelations of the first differences of the logarithm of Golf - tokyo indicate further modelling is needed to describe the variation of the data.

No ARIMA model is found which adequately describes this series.
The best model found is the high order model ARIMA(1,1,3). Its coefficients are all significant, and its residuals' autocorrelation function contains only two values which are greater than or equal to two standard errors from zero; however, these values are found at somewhat low lags (lags 5 and 7) and therefore cause some concern as to the appropriateness of the model. This indicates there may be some structure to the variation of this series which is not being modelled by the standard ARIMA methodology. The assumption of difference stationarity is questionable.

CHAPTER 10

DOES A LONG-TERM LINEAR RELATIONSHIP EXIST?

10.1 Main results and conclusions

Of all the pairwise combinations of variables possible between the stock indices and the golf indices, only one showed any evidence of a long-term linear relationship (the combination of the logarithm of the Nikkei and the logarithm of Golf - tokyo). In addition, the two stock indices showed no evidence of a long-term linear relationship, nor did any combination of two golf indices.

The analysis done in earlier chapters of this thesis indicates that the two stock indices and the four golf indices are very similar, and there is reason to suspect a relationship would exist between the stock indices and the golf indices. However, first the results of the ARIMA modelling of Chapter 9 and now the cointegration results of this chapter provide evidence which contradicts these expectations.

It must be remembered that the data covers only a portion of history (from the middle of 1983 to the middle of 1993) and the analysis therefore examines only a portion of time. A ten year time frame should be long enough to capture the essence of the data; however, it may be the case that a relationship would be found if data from World War II to the present were examined. In addition, the analytical time frame includes the downturn which took place at the end of 1989 and the beginning of 1990.
The downturn was an extraordinary event whose influence cannot be ignored; thus the next chapter (Chapter 11) determines the effect of this downturn.

10.2 Background

10.2.1 What is cointegration?

It was determined in section 8.3.3 that cointegration was the appropriate manner in which to test for linear relationships in the stock and golf data. Cointegration was described at that time, but will be described here in more technical detail.

Cointegration describes the existence of a long-term linear relationship between nonstationary processes. More specifically, cointegration exists when a linear combination of multiple I(d) (d ≠ 0) processes is stationary. Since the relationship of interest in this analysis is between two I(1) indices, the following description of cointegration will include only this case. (A long-term linear relationship will be referred to as simply a "relationship" for the remainder of this chapter in order to simplify the text.)

If y_t and x_t are two I(1) processes and a linear combination of them, z_t = y_t - αx_t, is I(0), then y_t and x_t are said to be cointegrated. This relationship is a symmetric one: if (y_t - αx_t) is stationary, then so too will be (x_t - θy_t).

10.2.2 How does one test for cointegration?

The presence of cointegration is tested by running a cointegrating regression and testing if the resulting residuals have a stationary structure.

In the simplest case, the cointegrating regression takes the form y_t = αx_t + u_t. (Remember that since this relationship is symmetric, the cointegrating regression could also be x_t = θy_t + v_t.) If y_t and x_t are cointegrated, the residuals (u_t) would have a stationary structure and be equivalent to the z_t described in the previous section (section 10.2.1). Therefore, the structure of the residuals of the cointegrating regression is tested for stationarity using the Augmented Dickey-Fuller test as described in section 7.2.
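The two steps just described (a cointegrating regression followed by a unit root test on its residuals) can be sketched as follows. This is a simplified illustration under stated assumptions, not the Shazam procedure used in this thesis: the regression includes only a constant, the residual test is a basic Dickey-Fuller regression without the augmenting lags or trend, the function names are hypothetical, and the data are simulated so that cointegration holds by construction. The value -3.4959 is the 10% critical value quoted in the tables of this chapter and is borrowed purely for illustration; a different specification has its own critical values.

```python
import math
import random

def ols_slope_intercept(x, y):
    """Simple OLS of y on x with a constant; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def df_t_statistic(u):
    """t-statistic on rho in the regression  du_t = rho * u_{t-1} + e_t."""
    lagged = u[:-1]
    diff = [b - a for a, b in zip(u, u[1:])]
    sll = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diff)) / sll
    resid = [d - rho * v for v, d in zip(lagged, diff)]
    s2 = sum(e * e for e in resid) / (len(diff) - 1)  # residual variance
    return rho / math.sqrt(s2 / sll)

def cointegration_statistic(x, y):
    """Step 1: cointegrating regression. Step 2: unit root test on residuals."""
    a, b = ols_slope_intercept(x, y)
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    return df_t_statistic(residuals)

# Hypothetical data: x is a random walk, y = 2x + stationary noise,
# so the pair is cointegrated by construction.
random.seed(1)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + random.gauss(0.0, 1.0))
y = [2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

stat = cointegration_statistic(x, y)
# A statistic below the critical value is read as evidence of cointegration.
print(stat, stat < -3.4959)
```

The decision rule mirrors the one used in the tables of this chapter: a test statistic less than (more negative than) the critical value rejects a unit root in the residuals and is taken as evidence of cointegration.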
If y_t and x_t are cointegrated, the ordinary least squares (OLS) estimate of β obtained from the cointegrating regression (y_t = βx_t + e_t) would be consistent, with "excellent large-sample properties" (Kennedy 1992, 254). In large samples, as are being used in this analysis, this OLS parameter estimate of β will be an estimator of the cointegrating parameter α (where α refers to z_t = y_t - αx_t). This estimate of α would not, however, be asymptotically normal and, thus, the test statistic for determining the significance of α must be compared to critical values specifically calculated. In fact, these critical values are different from those used in the unit root test described in section 7.2 (Kennedy 1992, 267).

10.2.3 Is the pot calling the kettle black?

In Chapter 8, it was decided that regression is not an appropriate methodology to determine the existence of a linear relationship in this data. The argument was based on the fact that if a regression were run in the presence of nonstationarity (as is the case for the stock and golf indices), the results would look "good" (e.g., have an extremely high R²) but would have biased conventional significance tests which lead to the acceptance of spurious relationships. In other words, regression would indicate the presence of a linear relationship even if (as is shown by the following cointegration tests) no such relationship exists.

Cointegration is appropriate for determining linear relationships between two time series in the presence of nonstationarity. However, this methodology includes a cointegrating regression. Why is regression appropriate in this instance when it is not appropriate in the usual OLS analysis?

Regression done in the presence of nonstationarity results in some of the usual OLS characteristics being invalid (e.g., R²). The use made of the cointegrating regression takes these restrictions into account and is therefore valid and appropriate.
On the other hand, classical OLS analysis (i.e., R², significance tests, and so on) used in the presence of nonstationarity does not take account of the restrictions involved and thus uses the regression analysis in an inappropriate manner; this can lead to meaningless and spurious results.

More generally, cointegration and all its related tests were specifically designed to deal with the possible presence of nonstationary data, whereas this is definitely not the case with ordinary least squares regression analysis. The two methodologies are meant to be used in two very different circumstances: cointegration deals with a symmetric relationship between two variables, whereas regression deals with a dependent relationship between two variables.

10.2.4 What assumptions are necessary to use cointegration tests?

Cointegration assumes that (1) the relationship being investigated is linear (as is the case in this paper) since little theory exists for nonlinear cointegration, (2) difference stationarity exists, and (3) the variables are of the same degree of integration (usually I(1)).

The second assumption is somewhat questionable given the stationarity results for the golf indices found in Chapter 7 (some evidence supporting a lack of stationarity) and the difficulty encountered in fitting ARIMA models to some of the data in Chapter 9.

The third assumption has been established in Chapter 7 where it was decided that the data most closely approximates stationary structures after first differences are taken. The analysis continues with testing for cointegration to (a) determine the presence of relationships between the stock and golf indices, and (b) provide further evidence regarding the possible stationary structures of the data under analysis.

10.3 Actions taken

The analysis done previously in this paper has been done on each time series individually. This chapter, however, analyzes combinations of the data.
The time period to be considered, therefore, must be that for which data is available for both groups. The stock indices cover the time period of 31 July 1983 to 11 April 1993, and the golf indices cover 10 January 1982 to 11 April 1993. The data used in this chapter, therefore, covers the time period of 31 July 1983 to 11 April 1993.

The cointegration tests included in this chapter were done using the statistical package Shazam. The cointegrating regression equation used provides for the presence of a constant, a trend, and autocorrelated errors. The residuals of this regression equation are tested for stationary structure using the Augmented Dickey-Fuller unit root test as described in section 7.3. If cointegration exists between the two variables being tested, there should be no unit root structure found for the residuals of the related cointegrating regression; if cointegration does not exist, such a unit root structure should be found.

10.4 Detailed analysis

Table 40. Cointegration test results for original data.
Original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -2.2715           -3.4959              no
nikkei  east        -2.4364           -3.4959              no
nikkei  west        -1.9256           -3.4959              no
nikkei  tokyo       -2.5409           -3.4959              no
topix   all         -2.0963           -3.4959              no
topix   east        -2.3045           -3.4959              no
topix   west        -1.6818           -3.4959              no
topix   tokyo       -2.4146           -3.4959              no

nikkei  topix       -2.0708           -3.4959              no
all     east        -1.9029           -3.4959              no
all     west        -1.8462           -3.4959              no
all     tokyo       -2.6925           -3.4959              no
east    west        -1.8439           -3.4959              no
east    tokyo       -2.5902           -3.4959              no
west    tokyo       -2.4553           -3.4959              no

Logarithm of original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -2.2586           -3.4959              no
nikkei  east        -2.2943           -3.4959              no
nikkei  west        -1.8776           -3.4959              no
nikkei  tokyo       -3.7933           -3.4959              yes
topix   all         -2.0686           -3.4959              no
topix   east        -2.2091           -3.4959              no
topix   west        -2.0014           -3.4959              no
topix   tokyo       -2.5683           -3.4959              no

nikkei  topix       -2.4718           -3.4959              no
all     east        -1.6070           -3.4959              no
all     west        -1.7060           -3.4959              no
all     tokyo       -1.7908           -3.4959              no
east    west        -1.6513           -3.4959              no
east    tokyo       -1.9334           -3.4959              no
west    tokyo       -1.2059           -3.4959              no

10.5 Description of analysis

For both the original data and the logarithmic transformations of the original data, all the combinations of the stock and golf indices have test statistics from the cointegration test which are greater than the 10% critical value, with one exception. The exception is that the logarithmic transformations of the Nikkei and Golf - tokyo show a test statistic for the cointegration test which is less than the 10% critical value.

10.6 Evaluative discussion

All combinations of the stock and golf indices were regressed using the cointegrating regression described in section 10.2 (Background). In all but one case, the unit root test (described in section 7.2) indicates the presence of a unit root structure in the resulting residuals, which, in turn, indicates that cointegration between the variables does not exist.
Thus, with one exception, there is no evidence to indicate the presence of a long-term linear relationship between any of the stock market indices and the golf course membership price indices under analysis in this paper.

The exception is the cointegration test on the linear combination of the logarithmic transformations of the Nikkei and Golf - tokyo, which provides evidence to suggest cointegration does exist between these two variables. Since both these indices are calculated based on events which take place in Tokyo, it seems reasonable that these variables be related. However, this explanation should also apply to the Topix and Golf - tokyo, which showed no evidence of cointegration. Why, then, the different results? This is exactly the type of question which this thesis hopes to bring to light. Specific differences must be determined and their explanations sought in order for a more complete understanding of the data and the dynamics behind the data to be reached.

It is important to remember that the cointegration test can only suggest evidence of cointegration. As with any statistical test, the results cannot prove that cointegration exists; rejecting the null hypothesis of a unit root in the residuals (i.e., of no cointegration) only provides evidence against that null. In addition, one must remember that difference stationarity is an assumption upon which this test is based, and the analysis done in both Chapters 7 and 9 indicates there is reason to be suspicious of this assumption in the case of the logarithm of Golf - tokyo. Of course, this argument works both ways; cointegration may, in fact, exist for another combination and the test is not picking this up due to some violation of the assumptions of the test.

Another point to keep in mind when interpreting any analysis is the existence of Type I and Type II error. The large number of tests executed throughout this analysis virtually guarantees that some tests will provide evidence for a conclusion which is contradictory to what exists in reality.
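The scale of this multiple-testing concern can be gauged with a rough calculation. The sketch below assumes, purely for illustration, that the 30 tests reported in Table 40 are independent and that each has a 10% chance of falsely rejecting a true null hypothesis; the tests in this thesis share series and are therefore not independent, so the figure is only indicative. The function name is hypothetical.

```python
def prob_at_least_one_false_rejection(n_tests, level):
    """Chance of one or more false rejections among n independent tests.

    Assumption: the tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - level) ** n_tests

# 30 tests (as in Table 40), each at the 10% level.
p = prob_at_least_one_false_rejection(30, 0.10)
print(round(p, 4))  # 0.9576
```

Under these assumptions, at least one spurious "yes" would be expected in roughly 96% of such tables, which is why a single rejection among thirty tests should be interpreted with caution.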
Yet another caution which must be kept in mind is the difficulty encountered by the Augmented Dickey-Fuller unit root test when dealing with moving average processes (which would include much of the data under analysis), as discussed in section 7.2.

At first glance these results may seem unbelievable given the similarities seen in the data streams in previous chapters. It was the attempt at ARIMA modelling in Chapter 9 which began to indicate a substantial degree of differentiation among the series.

In addition, it must be remembered that the results of this chapter indicate only that long-term linear relationships do not exist. Some other type of relationship may well exist.

The general lack of cointegration, in simple terms, means that the data streams do not have a linear combination for which the residuals remain stable over time. However, it is not reasonable to assume that two variables which are not cointegrated will definitely drift apart over time (since, again, another type of relationship may well exist). It can only be said that the variables might drift apart.

The most interesting use of the results of this chapter's analysis comes when they are compared to the results of the next chapter. The conclusion at this point is that there is a general lack of linear relationship over time between the stock and golf time series. The next question to answer is: did the downturn have an effect on this lack of relationship?

CHAPTER 11

EFFECTS OF DOWNTURN

To simplify the text of this chapter, the term "relationship" will refer to a long-term linear relationship unless otherwise specified.

11.1 Main results and conclusions

There is evidence to suggest that the downturn did have an effect on the relationship between the stock and golf indices: it seems to have removed any long-term linear relationship which may have previously existed.
The downturn had the opposite effect on relationships within each group: it changed the time series so as to create linear relationships which did not exist prior to the downturn.

The cointegration tests done on the data before the downturn provide evidence of a linear relationship between both stock time series and Golf - all, and marginal evidence of a linear relationship between the logarithms of the Topix and Golf - tokyo. The cointegration tests done on the data after the downturn provide no evidence of a long-term linear relationship between any combination of a stock and a golf index (or their logarithms).

Before the downturn there was no evidence that either group of data (stock or golf) contained any two indices which were linearly related. After the downturn, the two stock indices seem to have a linear relationship, as do two of the golf indices: Golf - all and Golf - west. The downturn seems to have changed the indices in such a way as to make them more linearly related within each group.

11.2 Background

No additional background is necessary as cointegration is fully described in the background section of Chapter 10.

11.3 Actions taken

The "downturn" under analysis took place at the end of 1989 and the beginning of 1990, as discussed in section 2.3. In order to analyze the effects of this downturn, the data was divided into three sections: before the downturn, during the downturn, and after the downturn. The data before and after the downturn are examined in an attempt to determine if any linear relationships existed between or within the stock and golf indices during each time period and, in addition, to compare these two time periods in order to determine if the downturn had any effect on these relationships.

The duration of the most extreme effects of the downturn for each time series was determined to have taken place during a period of four weeks around the week with the highest index (i.e., two weeks on either side).
Since comparisons are to be made between the stock and golf indices, the dates included in the data sets for before and after the downturn must be the same for all time series. Therefore, it is the earliest peak which determines the last date included in the data set for before the downturn, and it is the latest peak which determines the first date included in the data set for after the downturn. Using table 41, then, it can be seen that the last date included in the data before the downturn is 26 November 1989, and the first date included in the data after the downturn is 1 April 1990.

Table 41. Calculations for determining downturn data to exclude.

                Maximum Index    - 3 weeks    + 3 weeks
Nikkei            89/12/24       89/12/03     90/01/14
Topix             89/12/17       89/11/26     90/01/07
Golf - all        90/03/11       90/02/18     90/04/01
Golf - east       90/03/04       90/02/11     90/03/25
Golf - west       90/03/11       90/02/18     90/04/01
Golf - tokyo      90/03/04       90/02/11     90/03/25

The data set for before the downturn contains 330 observations (for each of the six time series under analysis) which run from 31 July 1983 through 26 November 1989 (inclusive), and the data set for after the downturn likewise contains 159 observations which run from 1 April 1990 through 11 April 1993 (inclusive).

For both before and after the downturn, each combination possible between a stock index and a golf index is tested for cointegration using the test described in section 7.3. The same test is also done between indices within each group (e.g., within the group of golf indices, Golf - all and Golf - east are tested for cointegration).

11.4 Detailed analysis

Table 42. Cointegration test results for before the downturn.
Original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -3.4035           -3.4959              no
nikkei  east        -2.7053           -3.4959              no
nikkei  west        -3.0460           -3.4959              no
nikkei  tokyo       -2.5565           -3.4959              no
topix   all         -3.3569           -3.4959              no
topix   east        -3.1713           -3.4959              no
topix   west        -2.7449           -3.4959              no
topix   tokyo       -3.4443           -3.4959              no

nikkei  topix       -1.3921           -3.4959              no
all     east         0.7940           -3.4959              no
all     west        -2.1087           -3.4959              no
all     tokyo       -0.1390           -3.4959              no
east    west        -2.1076           -3.4959              no
east    tokyo       -1.9679           -3.4959              no
west    tokyo        0.6972           -3.4959              no

Logarithm of original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -3.3737           -3.4959              no
nikkei  east        -2.7672           -3.4959              no
nikkei  west        -1.7227           -3.4959              no
nikkei  tokyo       -2.8945           -3.4959              no
topix   all         -3.6688           -3.4959              yes
topix   east        -3.2939           -3.4959              no
topix   west        -1.5430           -3.4959              no
topix   tokyo       -3.3246           -3.4959              no

nikkei  topix       -2.4923           -3.4959              no
all     east         0.0209           -3.4959              no
all     west        -1.6776           -3.4959              no
all     tokyo       -0.4368           -3.4959              no
east    west        -1.8783           -3.4959              no
east    tokyo       -2.0355           -3.4959              no
west    tokyo       -0.3974           -3.4959              no

Table 43. Cointegration test results for after the downturn.
Original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -2.9053           -3.4959              no
nikkei  east        -2.8870           -3.4959              no
nikkei  west        -2.8730           -3.4959              no
nikkei  tokyo       -2.8179           -3.4959              no
topix   all         -2.1151           -3.4959              no
topix   east        -2.7588           -3.4959              no
topix   west        -2.0496           -3.4959              no
topix   tokyo       -2.1761           -3.4959              no

nikkei  topix       -3.3014           -3.4959              no
all     east        -2.9807           -3.4959              no
all     west        -4.0452           -3.4959              yes
all     tokyo       -1.8103           -3.4959              no
east    west        -2.4764           -3.4959              no
east    tokyo       -2.3034           -3.4959              no
west    tokyo       -1.4377           -3.4959              no

Logarithm of original data:

Variables        Test Statistic   Critical Value 10%   Cointegrated?
nikkei  all         -2.2371           -3.4959              no
nikkei  east        -2.2525           -3.4959              no
nikkei  west        -2.2671           -3.4959              no
nikkei  tokyo       -2.2371           -3.4959              no
topix   all         -2.1796           -3.4959              no
topix   east        -2.1784           -3.4959              no
topix   west        -2.1943           -3.4959              no
topix   tokyo       -2.2364           -3.4959              no

nikkei  topix       -3.4666           -3.4959              no
all     east        -1.8286           -3.4959              no
all     west        -3.7369           -3.4959              yes
all     tokyo       -2.0547           -3.4959              no
east    west        -2.1176           -3.4959              no
east    tokyo       -3.0359           -3.4959              no
west    tokyo       -1.7558           -3.4959              no

11.5 Description of analysis

Before the downturn, the cointegration tests done on the combinations of the stock and golf time series (and their logarithms) produce test statistics which are all greater than the critical value at 10%. The one exception to this is the cointegration test done on the logarithms of the Topix and Golf - all, which generates a test statistic which is less than the critical value.

After the downturn, only two cointegration tests done on the combinations of the stock and golf time series (and their logarithms) produce test statistics which are less than the critical value at 10%: Golf - all and Golf - west, and the logarithmic transformations of Golf - all and Golf - west.

11.6 Evaluative discussion

The same provisos must be given now as were given at the end of Chapter 10 (section 10.6).
The results of these tests are influenced by the degree to which the assumptions are met; this will be discussed for each relevant variable when the interpretation of the test results is considered below. In general, one must also remember the existence of both Type I and Type II error, the possibility that a relationship exists other than the linear relationship under analysis in this chapter, and the difficulties the Augmented Dickey-Fuller test has in dealing with moving average processes (such as much of the data under analysis).

11.6.1 Relationships within the group of stock indices and within the group of golf indices

11.6.1.1 Before the downturn

Before the downturn no evidence was found for any linear relationship between the Nikkei and the Topix, nor for any pairwise combination of the golf indices.

11.6.1.2 After the downturn: Golf - all and Golf - west

After the downturn, evidence was found for within-group relationships in two instances: between Golf - all and Golf - west, and between the logarithmic transformations of these variables. As with any statistical test, the confidence placed in the result must be tempered by the degree to which the data conform to the assumptions of the test. The assumption of difference stationarity is questionable in the case of these golf indices (as will be summarized in the following two paragraphs). The doubts surrounding the assumption of difference stationarity for Golf - all and Golf - west mean the results of the cointegration test between these two variables are questionable.

For Golf - all, Chapter 7 shows that the three techniques used to determine stationarity provide contradictory results.
For Golf - all and its logarithm the unit root tests indicate stationarity after first differencing; the spikes in the graphs may or may not be taken to invalidate the conclusion of stationarity which could otherwise be reached; and the autocorrelation functions indicate the data does not have a difference stationary structure. The ARIMA modelling results of Chapter 9 did little to clear up this issue: Golf - all and its logarithm both modelled well using ARIMA models, which would indicate the assumption of difference stationarity is appropriate; however, the high order of the model provides evidence to the contrary.

For Golf - west the determination of a difference stationary structure is no easier. The unit root test for the first differences of Golf - west indicated this data to have a stationary structure, and the unit root test for the first differences of the logarithm of Golf - west showed only marginal evidence of stationarity. The graphs of the first differences of Golf - west and its logarithm show evidence of nonstationarity (with patterns which change over time). The autocorrelation functions for these transformations were high and slowly declining, which indicates the presence of nonstationarity. The ARIMA results of Chapter 9 do not clear up this issue since no transformation of Golf - west modelled well as an ARIMA process.

11.6.1.3 After the downturn: Nikkei and Topix

There are two other combinations which deserve mention: the Nikkei and the Topix, and the logarithmic transformations of these variables. The test statistics for these combinations are close to the critical value, and may be interpreted as marginal evidence of cointegration. The existence of a linear relationship between the logarithms of the Nikkei and Topix is more easily accepted and more easily considered valid, as there is some additional evidence that the logarithms have stationary structures.
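The autocorrelation diagnostic referred to in this section (high, slowly declining autocorrelations as a sign of nonstationarity; low, patternless autocorrelations after first differencing as a sign of stationarity) can be illustrated with a minimal sketch. The series below is a simulated random walk rather than any of the thesis data, and the names `sample_acf` and `first_difference` are hypothetical helpers introduced only for this illustration.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    c0 = sum(v * v for v in centred)
    acf = []
    for lag in range(max_lag + 1):
        ck = sum(centred[t] * centred[t - lag] for t in range(lag, n))
        acf.append(ck / c0)
    return acf


def first_difference(series):
    """Return the series of period-to-period changes."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


random.seed(0)
# Hypothetical nonstationary series: a random walk (cumulative sum of shocks).
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

acf_levels = sample_acf(walk, 10)
acf_diffs = sample_acf(first_difference(walk), 10)
print("levels:     ", [round(r, 2) for r in acf_levels[:4]])
print("differences:", [round(r, 2) for r in acf_diffs[:4]])
```

For a series of this kind the autocorrelations of the levels stay close to one over the first several lags, while those of the first differences are close to zero. It is the contrast between these two patterns, not any single value, that the graphical diagnostic relies on.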
This thesis has shown that a simple level of analysis, such as visual evaluation of graphs of data and investigation of distributions, can be deceiving. It is, however, interesting to note that the two series these basic methods found the most similar are ones for which more sophisticated analysis finds evidence of a linear relationship: the Nikkei and Topix. Although it may be known that one should be wary of conclusions based solely on "basic" analysis as discussed above, it is reassuring when this interpretation agrees with, and is backed up by, more technical analysis.

The validity of the cointegration test for the Nikkei and Topix is high since, of all the time series examined, these two indices best follow the assumption of difference stationarity which is necessary for cointegration testing (see section 10.2.4). Chapter 7 discusses how the two stock variables transformed to stationarity upon first differencing with all three techniques showing the same conclusion - constant means, no unit roots, and low autocorrelation functions which show no pattern of slow decline. In addition, the logarithmic transformations of both modelled well as low order ARIMA processes in Chapter 9.

11.6.2 Relationships between the group of stock indices and the group of golf indices

Before the downturn there is evidence of some linear relationships existing between the stock market indices and the golf course membership price indices. After the downturn, however, all such evidence ceases to exist.

Before the downturn, only the results of the cointegration test on the logarithms of the Topix and Golf - all indicate evidence of a long-term linear relationship. This result should not be considered absolutely reliable since the assumptions of the cointegration test are not wholly met by one of the variables. As discussed in section 11.6.1, the assumption of difference stationarity is reasonable for the logarithm of the Topix but questionable for Golf - all.
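A residual-based cointegration test of the general kind discussed in this chapter can be sketched in two steps: regress one series on the other, then apply a Dickey-Fuller regression to the residuals and compare the resulting t-statistic with the critical value. The sketch below is an illustration under stated assumptions, not a reproduction of the procedure used in this thesis: it uses no lagged differences and no trend term, the data are simulated, and the helper names (`ols_simple`, `dickey_fuller_t`, `cointegration_stat`) are hypothetical. The critical value is simply the 10% value tabulated in this chapter; whether it is the appropriate value for this simplified regression is not something the sketch establishes.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid


def dickey_fuller_t(e):
    """t-statistic on rho in diff(e)_t = rho * e_(t-1) + u_t (no lags, no constant)."""
    lagged = e[:-1]
    diffs = [e[t] - e[t - 1] for t in range(1, len(e))]
    sll = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / sll
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diffs))
    se = math.sqrt(ssr / (len(diffs) - 1) / sll)
    return rho / se


def cointegration_stat(x, y):
    """Step 1: levels regression. Step 2: Dickey-Fuller t-statistic on its residuals."""
    _, _, resid = ols_simple(x, y)
    return dickey_fuller_t(resid)


CRITICAL_10 = -3.4959  # 10% critical value as tabulated in this chapter

random.seed(1)
n = 500
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0.0, 1.0))
# y shares the stochastic trend of x, so y - 2x is stationary by construction.
y = [2.0 * v + random.gauss(0.0, 1.0) for v in x]

stat = cointegration_stat(x, y)
print(round(stat, 2), "cointegrated" if stat < CRITICAL_10 else "not cointegrated")
```

Because the simulated pair is cointegrated by construction, the residuals behave like white noise and the statistic falls far below the critical value. With real data the statistic is also sensitive to the lag length and to the deterministic terms included, which is one reason the provisos of section 10.6 apply.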
Although the test results before the downturn show only one combination between the two groups of data with a test statistic less than the 10% critical value, they show many combinations whose test statistics are close to the critical value. These combinations could be interpreted as having marginal evidence of cointegration: Nikkei and Golf - all, logarithm of the Nikkei and logarithm of Golf - all, Topix and Golf - all, Topix and Golf - tokyo, and logarithm of the Topix and logarithm of Golf - tokyo.

Since all tests before the downturn involving the Topix and Golf - all (and their logarithms) show either significant or marginal evidence of cointegration, it is reasonable to conclude that strong evidence does exist of a relationship between these two variables. Both the combination of the Nikkei and Golf - all and the combination of their logarithmic transformations show marginal evidence of a linear relationship. This information provides additional evidence of a relationship between the stock indices and Golf - all.

Both the Topix and Golf - tokyo and their logarithmic transformations show marginal evidence of cointegration before the downturn. Again, this evidence must be interpreted somewhat cautiously, as the assumptions upon which the cointegration tests are based are doubtful for Golf - tokyo. (The assumptions for the logarithm of the Topix were discussed above and were determined to have been well fulfilled.) The tests for stationarity done in Chapter 7 for Golf - tokyo had the same inconclusive results as those described for Golf - all in section 11.6.1.2. The fact that Golf - tokyo did not model well as an ARIMA process (Chapter 9) provides further evidence that Golf - tokyo may not have a difference stationary structure. It is because of these concerns that the evidence of a long-run relationship between these two variables is not strong, although some evidence does exist.
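The distinction drawn in this section between significant, marginal, and no evidence of cointegration can be written as a small decision rule. The sketch below is illustrative only: the thesis states no numeric cut-off for "marginal", so the band of 0.2 above the critical value (`MARGINAL_BAND`) is a hypothetical choice made for the example, as is the function name `classify`. The four test statistics are taken from the results before the downturn.

```python
CRITICAL_10 = -3.4959  # 10% critical value from the tables in this chapter
MARGINAL_BAND = 0.2    # hypothetical tolerance; the thesis gives no numeric cut-off


def classify(stat, critical=CRITICAL_10, band=MARGINAL_BAND):
    """Label a cointegration test statistic relative to the critical value."""
    if stat < critical:
        return "cointegrated"
    if stat < critical + band:
        return "marginal"
    return "no evidence"


# Test statistics for four pairs before the downturn, as reported in this chapter.
before = {
    "nikkei / all": -3.4035,
    "nikkei / east": -2.7053,
    "topix / tokyo": -3.4443,
    "log topix / log all": -3.6688,
}
labels = {pair: classify(stat) for pair, stat in before.items()}
for pair, label in labels.items():
    print(f"{pair:22s} {label}")
```

A statistic more negative than the critical value rejects the hypothesis of no cointegration; one that is close to, but above, the critical value is treated here as marginal. Any such band is a matter of judgment and should be read together with the doubts about difference stationarity raised above.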
In summary, before the downturn there is strong evidence to suggest that the stock indices and Golf - all have a long-term linear relationship. There is also marginal evidence to suggest that the Topix and Golf - tokyo have a long-term linear relationship. After the downturn, however, no series shows even marginal evidence of cointegration.

CHAPTER 12
WHAT HAS THIS ANALYSIS DISCOVERED ABOUT THE DATA?

12.1 Summary

This paper looks at the relationship between stock prices and land prices in Japan. It does this by examining two measures of the Japanese speculative stock market (the Nikkei and Topix, Tokyo Stock Exchange market indices) and four measures of the Japanese speculative land market (golf course membership price indices for the country as a whole, the eastern part of Japan, the western part of Japan, and Tokyo). This analysis highlights the differences in these time series, the small degree of linear relationship between them, and the effect the downturn had on this relationship.

The Nikkei and the Topix should not be considered equivalent or interchangeable, nor should the four golf course membership price indices, as their structures are not the same. No linear relationships appear to exist within these groups before the downturn which took place at the end of 1989 and the beginning of 1990 (i.e., between the two stock market indices or between any pairwise combination of the four golf course membership price indices), but there is evidence of some such relationships after the downturn.

Looking at data including before, during, and after the downturn, there is almost no evidence of linear relationships between the stock market indices and the golf course membership price indices. By accounting for the effect of the downturn on the data, however, evidence of linear relationships does exist.
When examining only data from before the downturn, evidence of linear relationships between the two data sets is found (particularly between the stock market indices and the country composite index for golf course membership prices). However, examination of data from after the downturn shows that the downturn seems to have changed the data in such a way as to remove these previously existing linear relationships between the two groups of indices.

12.2 What is now known about the data?

Graphical analysis of the data under scrutiny would suggest the existence of three similarities: first, between the two stock market indices (the Nikkei and Topix); second, amongst the four golf course membership price measurements (all, east, west, and tokyo); and, third, between the set of stock market series and the set of golf course membership price indices. The graphs of the data, the effects of transformations, the lack of outliers, the lack of seasonality, and the distributions of the data are remarkably alike for both the stock and golf time series. In fact, the similarity within each set is rather astounding. In addition, one would be led to believe that some relationship may exist between these two sets of data on the basis of their perceived similarities in these areas.

However, a more technical look at the data reveals the opposite: there exist surprising differences within each data set and there is little evidence of the expected relationship between these data sets. These findings will be discussed more fully in the sections to follow.

12.3 What is now known about the structure of the data?

The ARIMA methodology worked well for the logarithms of the stock market indices, although differing structures were found: the logarithm of the Nikkei admitted to an ARIMA(1,1,1) structure while the logarithm of the Topix was well represented with the ARIMA(1,1,0) model.
The assumption of difference stationarity is justified for the logarithms of the two stock indices based on test results to that effect and on the series modelling well as ARIMA processes. This difference stationarity means that a disturbance at one point would have effects which last for all following observations.

The golf course membership price indices, on the other hand, did not fare so well. No conclusion can be reached for any of the four golf indices regarding difference stationarity due to conflicting test results. Only the country composite and the composite for the eastern part of the country were capable of being well modelled as ARIMA processes, although the high order of the models is cause for concern: both these series and their logarithms model well as ARIMA(0,1,6) processes. In contrast, the other two golf course membership price indices (the composites for the western part of Japan and for Tokyo) did not model well as ARIMA processes, thereby providing additional evidence against their having a difference stationary structure.

12.4 What is now known about the relationship between weekly stock and land indices?

Taking into account the data from the entire time frame under analysis (1983 to 1993), there is little evidence of a linear relationship either within or between the groups of stock market and golf course membership price indices. These results are important because they contradict the conclusions which would be found if regressions were done on this data. Regression is a common methodology used to test for linear relationships, but when used in the presence of nonstationarity it leads to the conclusion of strong linear relationships where, in reality, none may exist. (All the stock and golf time series have nonstationary structures.)
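The danger of regressing one nonstationary series on another can be seen in a small simulation. The sketch below generates pairs of random walks that are independent by construction and compares the average R-squared from regressions in levels with the average R-squared from regressions in first differences. It uses simulated data only; the sample size, number of replications, seed, and helper names (`r_squared`, `random_walk`, `diff`) are assumptions made for illustration.

```python
import random


def r_squared(x, y):
    """R-squared from an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def random_walk(n, rng):
    """Cumulative sum of n independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First differences of a series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


rng = random.Random(42)
levels_r2, diffs_r2 = [], []
for _ in range(200):
    # Two walks drawn independently: any apparent fit in levels is spurious.
    x, y = random_walk(200, rng), random_walk(200, rng)
    levels_r2.append(r_squared(x, y))
    diffs_r2.append(r_squared(diff(x), diff(y)))

mean_levels = sum(levels_r2) / len(levels_r2)
mean_diffs = sum(diffs_r2) / len(diffs_r2)
print(f"mean R^2, levels:      {mean_levels:.3f}")
print(f"mean R^2, differences: {mean_diffs:.3f}")
```

Although the two series in each pair are unrelated, the levels regressions on average explain a sizeable share of the variation, while the regressions on first differences explain almost none. This is the sense in which regression on nonstationary data can suggest relationships that do not exist, and it is why the cointegration approach is used here instead.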
The lack of relationship is rather surprising given that the series seem so similar when compared using the more basic techniques of the beginning of this paper: examination of graphs and distributions of the data. It is only the more sophisticated analysis done in the latter portion of this paper (ARIMA modelling, difference stationarity, and cointegration) which brings to light the large degree of dissimilarity in the data. These results are important because they indicate that no linear relationship existed between stock and land prices in Japan over the time period between 1983 and 1993.

12.5 What is now known about the effects of the downturn on the data?

A downturn took place in the stock and land markets of Japan at the end of 1989 and the beginning of 1990. Before this time, the price of stock and land steadily increased to astronomical heights, but after the downturn the price of both experienced a rather consistent decline. This pattern can easily be seen when looking at graphs of the stock market and golf course membership price indices under analysis in this thesis. One wonders what effect such a cataclysmic event would have on the indices and the relationships between them.

This analysis shows that the downturn did have a rather large and, again, surprising effect on the stock and golf indices. By examining the data before the downturn separately from the data after the downturn, some linear relationships do, in fact, come to light.

Before the downturn no evidence was found for any linear relationship between the Nikkei and the Topix, nor for any pairwise combination of the golf indices. After the downturn, however, evidence was found to indicate such a relationship between the two stock market indices and between two of the golf course membership price indices (the country composite and the composite for the western part of Japan).
These results are important because they help in understanding the similarities and dissimilarities both among the group of stock market indices and among the group of golf course membership price indices. These results also indicate the downturn affected the relationship between the indices representing each market.

Before the downturn there is evidence to suggest long-term linear relationships between the group of stock indices and the group of the golf course membership price indices. After the downturn, however, there is no evidence of any such relationships. These results indicate the downturn affected the relationship between stock and land prices in Japan.

Given that there was a relationship between the stock and golf indices before the downturn, could this relationship have been used to forecast the downturn which took place? The results of this analysis say this is not possible. Since the downturn itself changed the very relationship between the stock and golf indices, the downturn could not have been predicted based on any previous relationship between these indices.

12.6 What further analysis could be done?

There are two areas where further analysis would prove interesting. The first area concerns possible additional areas of analysis on the data used in this paper. The second area pertains to analysis to be done on related data.

This paper has examined the data for the existence of linear relationships. It is possible that nonlinear relationships exist; this could be tested in the future as theory is developed to deal with nonlinear time series and nonlinear cointegration.

Standard ARIMA methodology does not account for the covariance structure observed for large lags in the stock market and golf course membership price indices. Further modelling using fractional differencing may allow for this long-term persistence to be captured. After this is done, the conclusions regarding the difference stationarity of the data may well change.
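Fractional differencing generalizes the integer differencing of ARIMA models by allowing the order d to take non-integer values, with the filter (1 - B)^d expanded as a power series in the backshift operator B. The sketch below computes the first few coefficients of that expansion and applies the truncated filter to a series. It is offered only as an illustration of the idea: the function names `frac_diff_weights` and `frac_diff` are hypothetical, and truncating the filter at the start of the sample is a simplifying assumption.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of the expansion of (1 - B)**d."""
    weights = [1.0]
    for k in range(1, n_weights):
        # Recursion from the binomial series: w_k = -w_(k-1) * (d - k + 1) / k
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - B)**d, truncating the filter at the start of the sample."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]


print(frac_diff_weights(1.0, 4))  # d = 1: ordinary first differencing
print(frac_diff_weights(0.5, 4))  # d = 0.5: weights die out slowly
```

With d = 1 the weights vanish after the first lag, which is ordinary first differencing. With a fractional d the weights decline slowly and never vanish, so distant observations continue to contribute; this is how such models accommodate the long-lag covariance structure mentioned above.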
This paper concentrated on looking at relationships between the data and the effects of the downturn. As a result, some interesting aspects of the data received only limited attention. It would be interesting, for example, to have a more technical look at outliers and seasonality, and the possibility of level shifts and/or variance changes over time.

The Dickey-Fuller unit root test was used in the cointegration testing done in this paper. As more comprehensive tests become available, it would be interesting to see how their results compare with those of the Dickey-Fuller tests done on Japan's stock market and golf course membership price data.

In terms of looking at related data and topics, it would be interesting to look in more detail at Japan's stock and land markets; for example, analysis done using a proxy for land prices other than golf course membership prices. Even more interesting would be analysis which compares the effects of the downturn found in this paper with the effects of similar downturns which took place in other countries. Given that the downturn had such an effect on the relationship between Japanese stock and land markets, what other economic relationships may have been changed is yet another important question.

BIBLIOGRAPHY

Box, G.E.P. and G.M. Jenkins. 1970. Time Series Analysis: Forecasting and Control. San Francisco: Holden Day.
Bronte, Stephen. 1982. Japanese Finance: Markets & Institutions. London: Euromoney Publications Limited.
Burstein, Daniel. 1988. YEN! Japan's New Financial Empire and Its Threat to America. New York: Simon and Schuster.
Fisher, Franklin M. 1962. A Priori Information and Time Series Analysis: Essays in Economic Theory and Measurement. Amsterdam: North-Holland Publishing Company.
Granger, C.W.J. and P. Newbold. 1974. "Spurious Regressions in Econometrics." Journal of Econometrics 2: 111-120.
Hamao, Yasushi. 1989. "Japanese Stocks, Bonds, Bills, and Inflation, 1973-87." The Journal of Portfolio Management 15 Winter 1989: 20-26.
Hamao, Yasushi. 1991. "Japanese Financial Markets: An Overview." Ziemba, Bailey, and Hamao 3-14.
Isaacs, Jonathan. 1990. Japanese Equities Markets. London: Euromoney Publications PLC.
Kato, Kiyoshi and James A. Schallheim. 1990. "Seasonal and Size Anomalies in the Japanese Stock Market." Elton and Gruber 225-247.
Kato, Kiyoshi, William T. Ziemba, and Sandra L. Schwartz. 1990. "Day of the Week Effects in Japanese Stocks." Elton and Gruber 249-281.
Kennedy, Peter. 1992. A Guide to Econometrics. 3rd ed. Cambridge, Massachusetts: The MIT Press.
Matsumura, Yutaka. 1961. Japan's Economic Growth 1945-60. Tokyo: Tokyo News Service, Ltd.
Mills, Terence C. 1990. Time Series Techniques for Economists. New York: Cambridge University Press.
Noguchi, Yukio. 1990. "Japan's Land Problem." Trans. Mamoru Ishikawa. Japanese Economic Studies, Journal of Translations 18.4 Summer 1990: 48-64.
Powell, Jim. 1988. The Gnomes of Tokyo. New York: Dodd, Mead & Company.
Roehl, Tom. 1985. "Data Sources for Research in Japanese Finance." Journal of Financial and Quantitative Analysis 20: 273-276.
Rose, Louis A. 1990. Land Values and Housing Rents in Urban Japan. Working Paper No. 90-27 August, 1990 at the Department of Economics, University of Hawaii.
Sakakibara, Eisuke and Robert Alan Feldman. 1990. "Japanese Financial System in Comparative Perspective." Elton and Gruber 27-55.
Schwartz, Sandra L. and William T. Ziemba. 1991. "The Japanese Stock Market: 1949-1991." Ziemba, Bailey, and Hamao 23-44.
Tse, Y. K. 1991. "Price and Volume in the Tokyo Stock Exchange." Ziemba, Bailey, and Hamao 91-119.
United States, Department of Commerce, Japanese Technical Literature Program, and National Technical Information Service, Office of International Affairs. Directory of Japanese Databases 1990. PB 90-163080.
Viner, Aron. 1988. Inside Japanese Financial Markets. Homewood, Illinois: Dow Jones-Irwin.
Wood, Christopher. 1992. The Bubble Economy: The Japanese Economic Collapse. London: Sidgwick & Jackson.
Wright, Richard W. and Gunter A. Pauli. 1987. The Second Wave: Japan's Global Assault on Financial Services. London: Waterlow Publishers.
Ziemba, William T. 1991. "The Chicken or the Egg: Land and Stock Prices in Japan." Ziemba, Bailey, and Hamao 45-68.
Ziemba, William T., Warren Bailey, and Yasushi Hamao, eds. 1991. Japanese Financial Market Research. New York: Elsevier Science Publishing Company Inc.
Ziemba, William T. and Sandra L. Schwartz. 1992a. Power Japan: How and Why the Japanese Economy Works. Chicago: Probus Publishing Company.
Ziemba, William T. and Sandra L. Schwartz. 1992b. Invest Japan: The Structure, Performance and Opportunity of Japan's Stock and Funds Markets. Chicago: Probus Publishing Company.

Appendix 1 - List of Acronyms and Abbreviations Used

AR(p)          - autoregressive process of order p
ARIMA(p,d,q)   - autoregressive integrated moving average
ARMA(p,q)      - autoregressive moving average
ACF            - autocorrelation function
I(d)           - integration of order d
MA(q)          - moving average process of order q
NEEDS          - Nikkei Economic Electronic Database System
Nikkei         - Nikkei 225 Index or Nikkei 225 Stock Average
NSA            - Nikkei Stock Average
OLS            - ordinary least squares (regression)
Topix          - Tokyo Stock Price Index
TSE            - Tokyo Stock Exchange

Appendix 2 - List of Symbols Used

a1,...,ap      - coefficients in an autoregressive model
b1,...,bq      - coefficients in a moving average model
d              - dth degree of differencing
k              - when used with ACF and PACF, indicates lag
μ              - mean
σ              - standard deviation
σ²             - variance
Title | Cointegration and stationarity analysis of Japanese speculative land and stock markets: 1982-1993 |
Creator | Kelly, Heidi M. C. |
Date Issued | 1994 |
Description | There was a remarkable downturn in the stock and land markets of Japan at the end of 1989 and the beginning of 1990. This thesis examines speculative stock and land market indices, explores relationships between these indices, and determines if the downturn had any effect on such relationships. The two sets of data used are measures of the Japanese speculative stock market (the Nikkei and Topix, Tokyo Stock Exchange market indices) and measures of the Japanese speculative land market (golf course membership price indices for the country as a whole, the eastern part of Japan, the western part of Japan, and Tokyo). Preliminary analysis of the data suggests the existence of three similarities: first, between the two stock market indices; second, amongst the four golf course membership price indices; and, third, between the set of stock market indices and the set of golf course membership price indices. The graphs of the data, the effects of transformations, the lack of outliers, the lack of seasonality, and the distributions of the data are remarkably alike. However, a more technical look at the data supports the opposing point of view. ARIMA modelling shows there exists surprisingly little structural similarity either within or between the two sets of data (i.e., the speculative stock and land market indices). In addition, cointegration test results provide little evidence of the expected relationship between these data sets. When accounting for the effect of the downturn on the data, evidence of linear relationships does exist. Cointegration tests using data before the downturn provide evidence of linear relationships between the two data sets (particularly between the stock market indices and the country composite index for golf course membership prices).
However, examination of data from after the downturn shows that the downturn seems to have changed the data in such a way as to remove these previously existing linear relationships between the two sets of indices. Cointegration tests show no linear relationships appear to exist within either data set before the downturn (i.e., between the two stock market indices or between any pairwise combination of the four golf course membership price indices), but there is evidence of some such relationships after the downturn. The conclusion of this paper is that the downturn actually changed the relationships within each set of data (i.e., between different measures of the speculative stock market and between different measures of the speculative land market) and between the two sets of data (i.e., the relationship between the Japanese stock and land markets). |
Extent | 5847824 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-03-05 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0087538 |
URI | http://hdl.handle.net/2429/5574 |
Degree | Master of Science - MSc |
Program | Business Administration |
Affiliation | Business, Sauder School of |
Degree Grantor | University of British Columbia |
Graduation Date | 1994-11 |
Campus | UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |