UBC Theses and Dissertations


Filtering and parameter estimation for electricity markets Molina-Escobar, Alberto 2009


Filtering and parameter estimation for electricity markets

by Alberto Molina-Escobar

B.Sc., Universidad Nacional Autónoma de Mexico, 1996
M.S., Universidad Nacional Autónoma de Mexico, 2000

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in The Faculty of Graduate Studies (Mathematics)

The University Of British Columbia
October, 2009

© Alberto Molina-Escobar 2009

Abstract

The growing complexity of energy markets requires the introduction of increasingly sophisticated tools for the analysis of market structures and for the modeling of the dynamics of spot market and forward prices. In order for market participants to use these markets in an efficient way, it is important to employ good mathematical models of these markets. This has proved to be particularly difficult for electricity, where markets are complex and exhibit a number of unique features, mainly due to the problems involved in storing electricity.

In this thesis we propose three models for electricity prices. All are multifactor models, that is, as well as an observable spot price they assume the existence of an unobservable 'long term mean' process. The introduction of such additional processes helps to explain the relation between spot and futures prices.

In the first part of the thesis we introduce a two factor Gaussian model for prices. Using the Kalman filter, and based on both spot and forward prices, we successfully estimate parameters for simulated data. We then estimate parameters for the German EEX market, and compare our fitted model with the observed prices. We find that this model does capture some features of the EEX market, but it fails to exhibit the price spikes which are a prominent feature of true spot prices. We therefore introduce a second model, which includes jumps.
The inclusion of jumps has the potential to give a better explanation of the behavior of electricity prices, but it creates difficulties in the estimation of parameters. This is because the model noise is non-Gaussian, so the Kalman filter cannot be applied satisfactorily. We implement the particle filter, adopting the Liu & West approach, for the jump model. This method allows us to identify the hidden process in the model, and to estimate a small number of parameters.

The third model is a new model for electricity prices based on the inverse Box-Cox transformation. This model is non-linear with Gaussian noise, and can generate price spikes using fewer parameters than a multi-factor jump-diffusion model. In this context, we successfully apply the unscented Kalman filter to estimate the parameters.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1  Introduction
    1.1  Commodity markets
    1.2  Electricity markets
    1.3  The relationship between spot and futures prices
    1.4  Previous work
2  Filtering
    2.1  State space formulation
    2.2  The Kalman filter
    2.3  The unscented Kalman filter
    2.4  Particle filter
    2.5  Parameter estimation via maximum likelihood
    2.6  Parameter estimation via Bayesian methods
3  MROU model
    3.1  Double mean-reversion model
    3.2  Radon-Nikodym theorem for Ornstein-Uhlenbeck processes
    3.3  Future price
    3.4  Formulation in Kalman filter terms
    3.5  Empirical results
        3.5.1  Simulated data
        3.5.2  The German electricity market
4  MROU with jumps
    4.1  Description of the model
    4.2  Valuation of electricity futures
    4.3  Particle filter setup
    4.4  Simulated data with known parameters
    4.5  Likelihood function estimation
    4.6  Sequential parameters
    4.7  Empirical results
5  NLMROU model
    5.1  The model
    5.2  Future price
    5.3  Unscented Kalman filter setup and estimation procedure
    5.4  Simulation results
    5.5  Parameter estimation based on historical data
6  Conclusions
    6.1  Future work
Bibliography
Appendix A
Appendix B

List of Tables

1.1  Models for electricity prices
3.1  The data are taken from Hikspoors and Jaimungal (columns 1 and 4), and Nomikos & Soldatos (columns 2 and 3)
3.2  Five different maximization runs, on the same set of simulated data
3.3  Estimation using one futures contract (average of 50 simulations)
3.4  Estimation using two futures contracts (n = 300)
3.5  Estimated values for f(t) by least-squares fitting
3.6  Estimated values for the EEX market using $S_t$ and $F(t, T_1)$
3.7  Estimated values for the EEX market using $S_t$, $F(t, T_1)$ and $F(t, T_2)$
3.8  The table shows the first four moments of the logarithmic deseasonalized price returns of observed data and the average of 50 simulated trajectories
4.1  Sample of 8 estimated values for $\lambda_X$
4.2  Individual estimates for parameters in MROU with jumps model
5.1  Five different maximization runs with 800 observations
5.2  Estimation using one futures contract (n = 800)
5.3  Estimated values for the EEX market using $S_t$ and $F(t, T_1)$
5.4  The table shows the first four moments of the logarithmic deseasonalized price returns of observed data and the average of 50 simulated trajectories
List of Figures

1.1  Classification of commodity markets
1.2  World net electricity consumption 2004-2030
1.3  If the total load is low, the plants with the lowest variable production costs are used (nuclear, hydro); if the total load is high, gas or oil fired plants with high fuel cost are running additionally, producing a huge effect on the price
1.4  Seasonal patterns by hours and by week for the German market
1.5  Average daily spot price in German market for years 2002-2007
1.6  The factors exerting a major influence on electricity wholesale price
2.1  A graphical representation of the particle filter with importance sampling and resampling
3.1  Sampling distributions with (column 4) where we add 10% deviation from Hikspoors & Jaimungal's parameters (column 1)
3.2  The upper graph shows the simulated spot price $S_t$ and the future price $F(t, T_1)$ with maturity of one month. The lower graph is the long-term mean process $L_t$
3.3  Electricity spot and nearby monthly futures price in German market
3.4  The upper graph shows the log spot-price of the EEX market and the seasonal component $h(t)$, and the lower graph the deseasonalized series $X_t = \log S_t - h(t)$
3.5  The upper graph shows the spot price $S_t$ with $\exp\{h(t)\}$, and the lower graph the deseasonalized spot price $\exp\{X_t\}$
3.6  Simulation of spot and future prices (upper graph) and long term mean process (lower graph) using estimated values for part 1
4.1  Plot of the true state $L$ and estimate of the particle filter
4.2  The log-likelihood for different $\lambda_X$ values
4.3  The graph shows a simulated trajectory of the MROU model with jumps
4.4  The graph shows the upward and downward jumps for the simulated trajectory of the MROU model with jumps
4.5  Sample particle filter trajectories for the estimate of $\lambda_X$
4.6  Sample particle filter trajectories for the estimate of $\eta_u$ and $\eta_d$ using Liu and West approach
4.7  Sample particle filter trajectories for the estimate of $\alpha_X$ and $\lambda_L$ using Liu and West approach
5.1  The graph shows a simulated trajectory of the NLMROU model
5.2  Plot of the true and estimated processes $L$ and $X$ of the NLMROU model (first 80 observations)
5.3  Simulation of spot price and future prices using estimated values for the whole data set

Acknowledgements

First and foremost I must acknowledge my thesis supervisor, Martin Barlow. Throughout my time as a graduate student Martin has provided me with unending guidance in research, constant encouragement and many years of financial support. Without his undying support this thesis would not have been possible. I am also grateful for the helpful comments of Rachel Kuske, Ulrich Horst, Joel Friedman and Kevin Murphy. Many thanks also to Arnaud Doucet for his feedback and recommendations, which helped me improve my project. I would also like to thank Gabriel Mititica for his help and encouragement. Thank you also to CONACyT for its financial support during the first years of my program.

To my wife and daughter.

Chapter 1 Introduction

In this thesis we are interested in the commodity futures markets, and in particular in electricity futures markets.

1.1  Commodity markets

In this section we describe some of the characteristics that are unique to commodities markets. Figure 1.1 shows the three fundamental groups of commodities.

Commodities
    Agricultural
        Vegetable goods: corn, coffee, cotton
        Animal goods: live cattle, pork bellies, lean hogs
    Metals
        Industrial: copper, aluminium, lead
        Precious: gold, silver, platinum
    Energy (upstream and downstream): crude oil, heating oil, natural gas, electricity

Figure 1.1: Classification of commodity markets.
Commodities markets exhibit some characteristics that are not present in financial markets, owing to physical constraints and to variations in demand caused by changes in consumption. The commodity spot price is defined by the intersection of supply and demand curves. Thus the spot price can be affected by changes in consumption, production or inventory. Unlike financial assets, which are traded for investment purposes, commodities are traded in order to be consumed or used in an industrial process, with the partial exception of some precious metals. This close link with the real economy causes commodities prices to have seasonal behavior and also mean-reversion [31]. This is one reason why many of the standard financial theories may not be applicable to commodities markets.

Figure 1.2: World net electricity consumption 2004-2030.

1.2  Electricity markets

Among the various commodities the energy market is the most recent market to be transformed. Since the early 1990s, electricity markets have been and continue to be developed as a result of the deregulation of electricity markets worldwide. In many regions the market structure has moved from a monopolistic to a competitive one. Traditionally, there was only one company or government agency that produced, moved, distributed, and sold electric power and services. This transformation has already taken place in the Americas (parts of Canada and the US, Argentina, Chile, Peru, Paraguay and Colombia), in Europe (Norway, Finland, Denmark, Germany, France, Netherlands, Spain, Poland, and Romania) and in the Asia/Pacific region (Australia, New Zealand and Japan).

In theory, deregulating the electricity market should increase the efficiency of the industry by producing electricity at lower costs and passing those cost savings on to customers [39]. Electricity is a growing market.
In 1973 electricity consumption accounted for 11% of the total world energy demand and has grown to 18% today. The absolute growth rate of electricity consumption in the future is estimated at an average of 2.4% per year. The projected growth in electricity consumption is shown in Figure 1.2.

Footnote 1: (EIA) System for the Analysis of Global Energy Markets (2007).

Electricity is considered a secondary energy source, which means it is created from the conversion of other sources of energy, such as coal, oil, natural gas, nuclear power, or hydropower, all of which are referred to as primary energy sources.

To understand the behavior of electricity prices we have to note that electricity possesses a unique feature: it is very difficult and expensive to store and quite difficult to transmit from one region to another. As a result, the spot price of electricity is set by the short-term supply-demand equilibrium, and supply and demand must be in balance at each time. Figure 1.3 displays a schematic supply-demand curve.

Figure 1.3: If the total load is low, the plants with the lowest variable production costs are used (nuclear, hydro); if the total load is high, gas or oil fired plants with high fuel cost are running additionally, producing a huge effect on the price. (Axes: demand and capacity.)

Supply and demand are affected by many factors that influence the seasonality and volatility of prices. For example, supply may be affected by transmission constraints (breakdowns) or by fluctuations in fuel prices (oil, gas). Demand exhibits seasonal fluctuations, which are due to climate conditions. In addition, electricity demand is not uniform through the week. It peaks during weekday working hours and is low during nights, holidays and weekends due to lower industrial activity, see Figure 1.4. Unexpected weather conditions can also cause abrupt and dramatic disruptions, producing jumps and spikes in the spot price. Finally, the constraints on transmission mean that power markets are geographically distinct. In some markets (such as Alberta or Norway), demand is higher in the winter months due to the use of power for heating. In other markets, such as California, power usage peaks in the summer due to the use of electricity for air conditioners. Figure 1.6 shows the factors that influence the determination of electricity prices (footnote 2).

Figure 1.4: Seasonal patterns by hours and by week for the German market. (a) Hourly average price from Jan-Dec 2002. (b) Daily average log-price throughout the week from 12/2002 to 05/2005.

Figure 1.5: Average daily spot price in German market for years 2002-2007.

Figure 1.6: The factors exerting a major influence on electricity wholesale price. The diagram groups factors of supply, factors of demand, and factors of both supply and demand; the long-term factors listed are the economic cycle, political decisions, and capacity expansion or closures.

Footnote 2: Source: (RWE AG) Shares of Primary Energy Sources in Total Electricity Generation in Europe (2008).

The most unusual feature of electricity spot prices is the presence of "price spikes", a phenomenon which does not have any parallel in other commodity markets. See for example Figure 1.5, which shows that on some days in July 2006 the spot price in Germany reached 300 €/MWh, compared with a normal daily price of 30-50 €/MWh.
If such an event occurred for a conventional commodity, such as copper, holders of the material would be able to make substantial profits by selling the commodity during the spike and then repurchasing it at a normal price a few days later. Because electricity cannot easily be stored, this is not possible.

1.3  The relationship between spot and futures prices

The relationship between the spot price and the futures (forward) prices is important for risk management and option pricing theory [30]. Across all commodities, ranging from agricultural products to pure financial assets, certain common principles of futures valuation and futures price behavior apply.

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a complete probability space endowed with the natural filtration $\{\mathcal{F}_t\}_{t \ge 0}$. In the financial world, the relation between spot and future prices, under the risk-neutral measure $Q$, is given by the formula

$$F(t,T) = \mathbb{E}^{Q}\left( S_T \exp\left\{ \int_t^T r_u \, du \right\} \right), \tag{1.1}$$

where $r$ is the risk-free interest rate. The proof is based on the no-arbitrage argument (see [9]), which proceeds by comparing returns on a portfolio consisting of the future contract with one consisting of cash and the commodity.

However, unlike financial assets, storage of commodities is costly. Consequently, physical ownership of the commodity carries an associated flow of services. On the one hand, the owner enjoys the benefit of direct access, which is important if the commodity is to be consumed. On the other hand, postponing consumption and storing the commodity means that storage expenses have to be paid. The net flow of these services per unit of time is called the convenience yield $c_t$. Since the convenience yield is the result of subtracting the cost of storage from the benefit attached to the physical commodity, it can be positive or negative at different times. (A positive convenience yield implies an instantaneous benefit from holding the commodity, a negative one an instantaneous cost.)
Again, by a no-arbitrage argument, the relationship between the spot and forward price is given by

$$F(t,T) = \mathbb{E}^{Q}\left( S_T \exp\left\{ \int_t^T (r_u - c_u)\, du \right\} \right), \tag{1.2}$$

where $c_u$ is the instantaneous forward convenience yield [31]. Note that the convenience yield plays the same role as dividends play for stocks.

Some authors have argued that, as a consequence of the non-storability of electricity, the notion of convenience yield is irrelevant in power markets. Therefore the relation between spot and futures (forward) prices cannot be established through the no-arbitrage argument (see [14, 32, 33]). For example, Geman and Roncoroni comment in [32]:

"Our view is that a convenience yield does not really make sense in the context of electricity: since there is no available technique to store power (outside of hydro), there cannot be a benefit from holding the commodity, nor a storage cost. Hence, the spot price process should contain by itself most of the fundamental properties of power."

Other financial theories view the futures (forward) price $F(t,T)$ and the expected future spot price $\mathbb{E}_t(S_T)$ as related but not identical. The difference is the risk premium, i.e.

$$F(t,T) = \mathbb{E}_t(S_T) + \pi(t,T). \tag{1.3}$$

The full specification is not straightforward to establish. The theory of a positive risk premium is termed normal backwardation. The opposite situation, where the futures price is set above the expected future spot price (a negative risk premium), is called contango.

An alternative approach is the actuarial one, which values a forward contract as its discounted expected real world payoff, see [44]. This is the approach we will adopt in this thesis: we will assume that the relation between spot and futures prices is given by (1.2), and that the risk free interest rate $r$ and convenience yield $c$ are constant.
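As a small worked illustration of the constant-rate assumption just stated (a sketch only, writing $\mathbb{E}$ for the expectation appearing in (1.2) and using hypothetical numbers), the exponential factor becomes deterministic and can be taken outside the expectation:

```latex
F(t,T)
  = \mathbb{E}\left( S_T \exp\left\{ \int_t^T (r - c)\, du \right\} \right)
  = e^{(r-c)(T-t)} \, \mathbb{E}(S_T).
% Hypothetical values: r = 0.04, c = 0.01, T - t = 0.5 years
% give the factor e^{0.015} \approx 1.0151.
```

Under this reading, the futures price is the expected spot price at maturity scaled by a factor that depends only on $r - c$ and the time to maturity.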
1.4  Previous work

The main motivation for the development of models for electricity prices is the need for such models by market participants. For example, a power company has the choice of selling its power either on the spot or forward market, and would wish to make the optimum choice. In addition there is the need to price derivatives such as forwards, options and swaps. Hence the model should be sufficiently sophisticated for realistic modelling but sufficiently simple for pricing of derivatives. This issue is very important for computing risk measures, testing hedging strategies and evaluating investment policies.

Various approaches have been developed to describe the stochastic price process in energy markets. There are significant parallels between commodity markets and interest rate markets. For commodity markets, the traded assets are both the spot and various forward or future contracts. For interest rates, the main traded assets are futures (represented by different types of bonds), while the spot or instantaneous rate of interest plays a more minor role. Given these parallels, it is natural to use interest rate theory as a base for electricity price models.

In general, interest rate models can be separated into two categories: short-rate models and forward-rate models. The short-rate models describe the evolution of the instantaneous interest rate as a stochastic process, and the forward-rate models capture the dynamics of the whole forward curve (Heath-Jarrow-Morton models). These interest rate models are then applied to arrive at arbitrage-free pricing of bonds or other derivative products.

The same division of models arises for power prices, where the models may be broadly divided into two groups (footnote 3): statistical models (spot price based models) and fundamental models (forward based models).

Footnote 3: There is another approach, based on econometric time series models, that we will not consider in this work (see [56, 63]).

For the forward based models, the futures prices are the main objects of study, and the dynamics of the whole futures price curve is modeled using the Heath-Jarrow-Morton [42] theory for interest rates. See for example Clewlow and Strickland (1999) [19] and Manoliu and Tompaidis (2002) [61], and for more recent papers see Borovkova (2006) [10] and Koekebakker and Ollmar (2005) [55].

A general discussion of HJM-type models in the context of power futures is given in Benth and Koekebakker (2008) [8]. They dedicate a large part of their analysis to the relation of spot, forward and swap-price dynamics, derive no-arbitrage conditions in power future markets, and conduct a statistical study comparing a one-factor model with several volatility specifications using data from the Nord Pool market. The disadvantage of such approaches is that futures prices do not reveal information about price behavior on a daily timescale and provide a poor approximation to the complex observed spot behavior in power markets.

In this thesis, following most of the literature, and the philosophy outlined by Geman and Roncoroni above, we will consider spot based models. In principle these models should provide a reliable description of the evolution of electricity prices. Moreover, these models are versatile in the sense that it is relatively simple to add characteristics to an existing family or class of models, for example by adding a seasonality function.

Securities (stocks) are usually modeled by geometric Brownian motion with drift, $S_t = S_0 \exp\{a t + \sigma W_t\}$, as in the famous Black-Scholes model. This model is not suitable for commodities, since 'mean reversion' is a typical feature of these markets [30, 31, 66].
The simplest stochastic process with mean-reverting behavior is the Ornstein-Uhlenbeck process [66]. Here the process $X_t$ is a diffusion process satisfying the stochastic differential equation

$$dX_t = -\lambda (X_t - a)\, dt + \sigma\, dW_t, \tag{1.4}$$

where $W_t$ is a standard Brownian motion, $\sigma$ the volatility of the process, and $\lambda$ the velocity with which the process reverts to its long term mean $a$. Many electricity price models use this process or variants as a basic building block. For example, Lucia and Schwartz (2002) [59] give models of the form

$$S_t = h(t) + X_t \tag{1.5}$$

or

$$S_t = \exp\{h(t) + X_t\}, \tag{1.6}$$

where $S_t$ is the spot price, $X_t$ is an Ornstein-Uhlenbeck process, and $h(t)$ is a deterministic component, intended to account for seasonal and weekly effects. Benth et al. (2008) [6] called models like (1.5) 'arithmetic models' and (1.6) 'geometric models', i.e. geometric models represent the logarithmic prices by a sum of processes.

The incorporation of a deterministic component of this kind is an important feature of nearly all spot price models. Spot prices are higher on weekdays than on weekends, due to higher demand, so a correction $h(t)$ which compensates for this is essential. See for example Figure 1.4b.

Spot price models can be divided into single factor and multi-factor models. For single factor models the spot price is itself a Markov process, while in multi-factor models the spot price is a function $S_t = g(X_t^1, \ldots, X_t^k)$ of a multidimensional Markov process. Here $g : \mathbb{R}^k \to \mathbb{R}_+$, and as $g$ is not one-to-one these models have unknown or hidden components.

As well as the model of Lucia and Schwartz mentioned above, other single factor models are in Cartea and Figueroa (2005) [16], Barlow (2002) [3], Kanamura and Ohashi (2007) [49], and Geman and Roncoroni (2006) [32]. Many of these models, unlike that of Lucia and Schwartz, include mechanisms to take account of price spikes.
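To make the building block (1.4) and the geometric form (1.6) concrete, the following is a minimal simulation sketch, not the thesis's own code. It assumes daily time steps, uses the exact Gaussian transition of the Ornstein-Uhlenbeck process, and invents a seasonal function `seasonal_component` (annual cosine plus a weekend discount) purely for illustration; all parameter values are hypothetical.

```python
import numpy as np


def simulate_ou(x0, lam, a, sigma, dt, n_steps, rng):
    """Simulate dX = -lam * (X - a) dt + sigma dW on a regular grid.

    Uses the exact Gaussian transition of the Ornstein-Uhlenbeck process,
    so there is no discretization bias for any step size dt.
    """
    decay = np.exp(-lam * dt)
    step_std = sigma * np.sqrt((1.0 - decay**2) / (2.0 * lam))
    shocks = rng.standard_normal(n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = a + (x[k] - a) * decay + step_std * shocks[k]
    return x


def seasonal_component(t_days):
    """Hypothetical h(t): log price level, annual cycle and weekend discount."""
    t_days = np.asarray(t_days, dtype=float)
    annual = 0.10 * np.cos(2.0 * np.pi * t_days / 365.0)
    weekend = np.where(np.mod(t_days, 7.0) >= 5.0, -0.20, 0.0)
    return np.log(40.0) + annual + weekend


rng = np.random.default_rng(seed=0)
n_days = 730
t_days = np.arange(n_days + 1)

# Illustrative parameter values only (not estimates from the thesis).
lam, a, sigma = 0.3, 0.0, 0.15
x = simulate_ou(x0=0.0, lam=lam, a=a, sigma=sigma, dt=1.0, n_steps=n_days, rng=rng)

# Geometric model (1.6): the log price is h(t) + X_t, so prices stay positive.
spot = np.exp(seasonal_component(t_days) + x)

# The stationary variance of X is sigma^2 / (2 * lam).
stationary_var = sigma**2 / (2.0 * lam)
```

The path fluctuates around the seasonal level without spikes, which is the limitation that motivates the jump and nonlinear mechanisms in the models discussed next.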
One of the simplest spike mechanisms is that of Cartea and Figueroa [16], which adds a jump term to the Ornstein-Uhlenbeck process:

$$\log S_t = h(t) + Y_t, \qquad dY_t = -\alpha Y_t\, dt + \sigma\, dW_t + J\, dN_t, \tag{1.7}$$

where $W_t$ is a Brownian motion, $h(t)$ is assumed to capture the seasonal patterns of the spot price, and the third term $J\,dN_t$ enables the process to have discrete random spikes: these are a combination of a Poisson process, which determines the jump frequency, and a jump-size distribution, which gives the jump magnitude conditional on a jump occurring. In (1.7) the process $dN_t$ is approximated by a Bernoulli process with parameter $l\,dt$, and $J$ is log-Normal, i.e. $\log J \sim N(-\sigma_J^2/2, \sigma_J^2)$. Cartea and Figueroa apply this one-factor mean-reverting jump diffusion model, adjusted to incorporate seasonality effects, to the England and Wales market, and derive the corresponding forward price in closed form.

However, the rather short period for which electricity prices were available and the small number of spikes caused difficulties with parameter estimation. Such models require a high speed of mean reversion in order to reduce the spot price following a large positive jump, and this has the effect of removing too much variability in the series over the non-jump time periods.

Barlow [3] introduces a nonlinear Ornstein-Uhlenbeck model for spot power prices. The price is obtained by matching the demand level with a deterministic supply function, which must be nonlinear to account for price spikes. He proposes the inverse function of the Box-Cox transformation:

$$S_t = \begin{cases} f_\alpha(X_t), & 1 + \alpha X_t > \epsilon_0, \\ \epsilon_0^{1/\alpha}, & 1 + \alpha X_t \le \epsilon_0, \end{cases} \qquad dX_t = -\lambda (X_t - a)\, dt + \sigma\, dW_t,$$

where

$$f_\alpha(x) = (1 + \alpha x)^{1/\alpha}, \quad \alpha \ne 0, \qquad f_0(x) = e^{x}.$$

When $\alpha = 0$, an exponential Ornstein-Uhlenbeck process is retrieved for $S_t$. The case $\alpha = 1$ yields a regular Ornstein-Uhlenbeck process. The model has been estimated by maximum likelihood on the Alberta and California markets.
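The price map in Barlow's model can be sketched in a few lines. This is an illustrative reading of the truncated inverse Box-Cox transform described above, not code from the thesis or from [3]; the function name `barlow_price` and the values of `alpha` and `eps0` are assumptions chosen to show how a negative exponent turns moderate moves in the state into spikes.

```python
import numpy as np


def barlow_price(x, alpha, eps0):
    """Map the state X_t to a spot price with the inverse Box-Cox transform.

    For alpha != 0 the price is (1 + alpha * x) ** (1 / alpha) while
    1 + alpha * x > eps0, and eps0 ** (1 / alpha) otherwise. For alpha == 0
    the map reduces to exp(x).
    """
    x = np.asarray(x, dtype=float)
    if alpha == 0.0:
        return np.exp(x)
    base = np.maximum(1.0 + alpha * x, eps0)
    return base ** (1.0 / alpha)


# Illustrative values only: with alpha < 0 a modest move in X near 1 / |alpha|
# produces a very large move in price, capped at eps0 ** (1 / alpha).
states = np.array([0.0, 0.5, 0.9, 0.99, 2.0])
prices = barlow_price(states, alpha=-1.0, eps0=0.001)
```

With `alpha = -1` the states 0.9 and 0.99 differ by less than 0.1 but map to prices that differ by a factor of ten, which is how a Gaussian state process can generate spike-like prices without a separate jump term.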
Another paper sharing the same theoretical idea is Kanamura and Ohashi [49]. Instead of using the inverse function of the Box-Cox transformation they assume that the supply curve has a 'hockey stick' shape. Setting $X_t = D_t - \bar{D}_t$, where $D_t$ is demand and $\bar{D}_t$ describes its seasonal component,

$$S_t = \begin{cases} a_1 + b_1 D_t, & D_t \le D_0, \\ a_2 + b_2 D_t, & D_t > D_0, \end{cases} \qquad dX_t = -\lambda X_t\, dt + \sigma\, dW_t.$$

This model captures the observed mean-reverting behavior of electricity markets and it accounts very well for the observed price spikes, allowing for a better fit to market data. But the assumption of a deterministic supply function is probably too restrictive, since it implies that spikes can only be produced by surges in demand.

Geman and Roncoroni [32] built a jump-reversion model for electricity spot prices. The model assumes that the natural logarithm of the power price, $E(t)$, is described by the stochastic differential equation

$$dE(t) = [h(t) + \theta(\mu(t) - E(t))]\, dt + \sigma\, dW(t) + f(E(t))\, dJ(t), \tag{1.8}$$

where $h(t)$ is a deterministic seasonality function, $\theta$ is the mean reversion speed, $\sigma$ is a constant instantaneous volatility, and $\mu(t)$ is the mean reversion level. The process reverts to a deterministic mean level rather than the stochastic pre-spike value. The last term in equation (1.8) represents the discontinuous part of the model featuring price spikes. This effect is characterized by three quantities defining occurrence, direction, and size of jumps. $f$ is a function which is $\pm 1$ depending on the level of the spot price relative to a threshold $\tau(t)$:

$$f(E(t)) = \begin{cases} +1, & \text{if } E(t) < \tau(t), \\ -1, & \text{if } E(t) \ge \tau(t). \end{cases}$$

The process $J(t)$ is a time-inhomogeneous compound Poisson process with intensity function

$$\lambda(t) = k \left( \frac{2}{1 + |\sin[\pi (t - 7)/6]|} - 1 \right),$$

where the expected maximum number of jumps per year is represented by $k$. Jump sizes are modeled by a sequence of independent and identically distributed truncated exponential variables.
This model generates trajectories similar to those observed in the electricity market, and it also gives a good fit of the empirical moments of order 1, 2 and 4, i.e. mean, variance and kurtosis (footnote 4).

Footnote 4: The kurtosis of a random variable $X$ with mean $m$ and variance $\sigma^2$ is defined by $\kappa = \mathbb{E}((X - m)^4)/\sigma^4$. When $\kappa$ is much greater than 3, the density in the tail is higher than that which prevails for a Gaussian distribution.

Neither of the last two models includes the convenience yield as a factor, nor considers the valuation of futures contracts or any other kind of derivative.

The single factor models are quite tractable and their parameters are relatively easy to estimate. However they have a serious limitation: they do a poor job of explaining the relation between spot and futures prices, see [3] and [12]. This limitation can be avoided if changes in spot prices are allowed to depend on more than one factor.

The copper mine example of Brennan and Schwartz (1985) [11] assumed that the spot price followed a geometric Brownian motion and incorporated a convenience yield into the model, assuming it was proportional to the spot price:

$$dS = \mu S\, dt + \sigma S\, dz, \qquad C(S,t) = cS.$$

The idea of a constant convenience yield only holds under restrictive assumptions, since the theory of storage is rooted in an inverse relationship between the convenience yield and the level of inventories. Gibson and Schwartz (1990) [35] take an important step towards a more realistic model of the economy by introducing a stochastic convenience yield rate. The spot price $S_t$ of the commodity is described by a geometric Brownian motion and the convenience yield rate $\delta_t$ is described by an Ornstein-Uhlenbeck process with equilibrium level $\alpha$ and rate of mean-reversion $\kappa$:

$$\begin{aligned} dS_t &= (\mu - \delta_t) S_t\, dt + \sigma_1 S_t\, dz_1, \\ d\delta_t &= \kappa(\alpha - \delta_t)\, dt + \sigma_2\, dz_2, \\ dz_1\, dz_2 &= \rho\, dt. \end{aligned}$$

Significant contributions have been made by Schwartz (1997) [73].
He reviewed one and two factor models and developed a three factor model under stochastic convenience yield and interest rates. Including the interest  13  Chapter 1. Introduction as a third factor makes forward and futures prices different.  dS  =  (rt  d5  =  dz ii(a’—c5t)dt+u , 2  dr  =  a(m  dz 1 dz 2  —  6)Sdt + uiSdzi,  —  dz cr , rt)dt + 3  pidt, dzdz 3  =  dt, dz 2 p dz 1  =  dt. 3 p  This model was originally developed for two commercial commodities (copper and oil). He used the Kalman filter algorithm to estimate the parameters in the models. In the paper by Lucia and Schwartz [59], where they analyze the Nordic Power market, the spot price is modeled by =  dX  =  —.AXdt+uxdWx,  dY  =  ,udt+ciydWy,  dWxdWy  =  pdt.  The function h(t) is deterministic, and it is intended to capture the pre dictable component in the spot price, i.e. seasonal effects. This function distinguishes between weekdays, and includes a monthly seasonal compo nent employing dummy variables. The idea of this model is to have a nonstationary process for the long-term equilibrium price level Y and short-term mean-reverting component X. They estimated all the parameters simulta neously by nonlinear least squares methods. The multi-factor models described so far do not capture one of the most characteristic feature of power prices, jumps or spikes. Several authors, Deng (2000) [24], Villaplana (2003) [80], and Xiong (2004) [82] extend such models to even more factors with both diffusion and jumps. In the work of Villaplana power prices are modeled according to non-observable state variables that account for the short-term movements and long-term trends in electricity  14  Chapter 1. Introduction prices. inSt  =  h(t)+X+Y ‘cxXtdt + uxdWi + JdN())  dX  dY  =  2 —ky(i—)dt+uydW  dW 1 dW 2  =  pdt.  —  JddN(\d)  The jump components are characterized by N(\), and N(Ad), i.e. 
Poisson processes with intensities $\lambda_u$ and $\lambda_d$ respectively, and by random jumps of size $J_u$ and $J_d$ with some specified distribution (Gaussian/Exponential).

Deng (2000) and Villaplana (2003) set their models in the affine jump diffusion (AJD) framework, which enables them to use transform results of Duffie et al. (2000) [27] to derive tractable closed-form solutions for a variety of contracts. Deng proposes more sophisticated mean-reverting jump diffusion models with deterministic/stochastic volatility and regime switching, which may be a good way of addressing the dramatic changes in spot prices. However the trajectories produced by the model are fairly different from the ones observed in the market.

Cartea and Villaplana (2008) [17] build a model for wholesale power prices explained by two state variables (demand and capacity) and calculate the forward premium. Writing $D_t$, $C_t$ for the demand and capacity, they model $D$ and $C$ by

\[ D_t = f_D(t) + X_t^D, \]
\[ C_t = f_C(t) + X_t^C, \]

where $f_D$, $f_C$ are deterministic functions, and $X^D$, $X^C$ are independent Ornstein-Uhlenbeck processes. They then take the spot price as given by

\[ S_t = \beta \exp\{\alpha D_t + \gamma C_t\}. \]

They perform empirical research embracing PJM, England and Wales, and Nord Pool markets. They find that, depending on the market and the period under study, the volatility of capacity and the market price of capacity risk could put either upward or downward pressure on forward prices. They also find that the forward premium follows a seasonal pattern, being positive in the months of high volatility of demand and close to zero or even negative in the months of low volatility of demand.

Inspired by Cartea and Villaplana (2008), Lyle and Elliott (2009) [60] present a hybrid model that uses a supply-demand approach to price electricity derivatives. They assume that the system demand $D(t)$ is given by

\[ D(t) = f(t) + \phi(t), \]

where $\phi(t)$ is an Ornstein-Uhlenbeck process and $f(t)$ a deterministic function.
For the supply side, they model the curve $S(t, P)$ which gives the supply at time $t$ if the price is $P$. They consider curves of the form

\[ S(t, P) = a S_b(t) + b \log(cP + \varepsilon), \]

where $S_b(t)$ is the base portion of the system supply and $a$, $b$, $c$, and $\varepsilon$ are positive constants. They consider two different models for $S_b(t)$: a mean-reverting model, and a Markov chain model. Equating supply and demand, the equilibrium price is given by

\[ P(t) = \frac{1}{c}\left( \exp\left\{ \frac{D(t) - a S_b(t)}{b} \right\} - \varepsilon \right). \]

Using these equations Lyle and Elliott are able to obtain closed-form solutions for European options. They test the model on Alberta price data, calculating the first four empirical moments. The model gives a good fit for the mean and standard deviation but not for the skewness and kurtosis.

Benth et al. (2007) [7] propose a non-Gaussian Ornstein-Uhlenbeck process which takes into account seasonality and price spikes. Their model is

\[ S(t) = h(t) + X(t), \]

where $S(t)$ is the spot price, $h(t)$ is a deterministic periodic function and $X(t)$ is a sum of independent Lévy-driven Ornstein-Uhlenbeck components:

\[ X(t) = \sum_{i=1}^{n} Y_i(t), \qquad dY_i(t) = -\alpha_i Y_i(t)\,dt + \sigma_i(t)\,dL_i(t), \qquad Y_i(0) = y_i. \]

Here the processes $L_i(t)$ are independent, possibly time inhomogeneous, pure jump Lévy processes with $E(|L_i(1)|) < \infty$. $L_i(t)$ can be written in terms of their jump measures $N_i(dt, dz)$, $i = 1, 2, \ldots, n$:

\[ L_i(t) = \int_0^t \int_0^\infty z\, N_i(ds, dz). \]

The deterministic predictable compensator of $N_i(ds, dz)$, which is the jump measure of $L_i(t)$, is of the form

\[ \nu_i(dt, dz) = \rho_i(t)\,dt\,\nu_i(dz). \]

Here $\rho_i(t)$ is a deterministic function that controls the seasonal variation of the jump intensity, $\sigma_i(t)$ controls the seasonal variation of the jump sizes, $\alpha_i$ is the rate of mean reversion, and $L_i(t)$ controls the variation of the price, such as the daily volatile variation and price spikes. Benth et al. (2007) provide closed-form and semi-closed form solutions for forwards and options on forwards.
This model, coupled with a good description of price seasonality, provides a precise characterization of electricity spot price behavior. In addition, due to its arithmetic structure, it is analytically tractable when it comes to futures and other derivatives pricing. Although the model seems to capture the stylized facts of spot price markets, such as mean reversion, seasonality and price spikes, the authors did not make a precise statistical analysis of the quality of the model. However, they suggested the particle filter as a possible method to estimate the parameters in the model. Parameter estimation for this model would appear to be a significant challenge.

Hikspoors and Jaimungal (2007) [44] consider two models for oil prices. The first is a two factor version of the model of Schwartz:

\[ S_t = \exp\{h(t) + X_t\}, \]
\[ dX_t = \lambda_X (Y_t - X_t)\,dt + \sigma_X\,dW_t, \tag{1.9} \]
\[ dY_t = \lambda_Y (\phi - Y_t)\,dt + \sigma_Y\,dZ_t, \]
\[ dW_t\,dZ_t = \rho\,dt. \]

(In fact, in order to value spread options, they consider two different commodities satisfying (1.9).) The second is a modification of (1.9) with an additional jump component to handle price spikes:

\[ S_t = \exp\{h(t) + X_t + J_t\}, \tag{1.10} \]

where $X_t$ is given by (1.9), and

\[ dJ_t = -\kappa J_{t-}\,dt + dQ_t, \tag{1.11} \]

with $Q$ a compound Poisson process, and $t-$ denotes the instant immediately before time $t$.

Through measure changes induced by a pseudo-numeraire, they obtained, for both models, exchange option and futures prices in closed form under both the real-world and risk-neutral measures. They also consider the problem of model calibration. For the jump model (1.10) they suggest a modification of the procedure of [20]. This is to identify the price jumps by searching for days with a price change of more than 3 times the standard deviation of the daily price change. (This procedure is then run several times to produce a 'despiked' series.)
They do not apply this method to electricity prices, but do estimate parameters for their first model for oil, from a period of about 3 years of data. They consider parameters built under the real world probability $\mathbb{P}$, and a risk neutral probability $\mathbb{Q}$. Let us write a bar on the model parameters to denote parameters with respect to $\mathbb{Q}$. If $\mathbb{P} \sim \mathbb{Q}$ on the filtration generated by both $X$ and $Y$, then one has

\[ \bar\sigma_X = \sigma_X, \qquad \bar\sigma_Y = \sigma_Y, \qquad \bar\rho = \rho. \]

The parameter estimation proceeded in a number of steps:

1. Using least squares, the risk neutral parameters were estimated from the forward data, as was the unknown process $Y$.

2. Given $(X_t, Y_t)$, the remaining real world parameters $\lambda_X$, $\lambda_Y$ were estimated by regression.

One surprising feature of the estimates is that they obtain $\rho = -0.96$, so that the long-term process adds little extra randomness.

The advantage of two or more factor models is that they allow for a good mathematical description of the problem. Furthermore they have a better fit to historical data and provide a better relation between spot and futures prices, see [73].

Many of the models in the literature incorporate 'regime switching' components. For example, the basic model of Nomikos and Soldatos (2008) [64] for Nord Pool prices is similar to that of Hikspoors and Jaimungal:

\[ S_t = h(t) + \exp\{X_t + Y_t\}, \tag{1.12} \]

where $X_t$ is an Ornstein-Uhlenbeck type process, and $Y_t$ is a jump process intended to take account of price spikes. However, both $X$ and $Y$ incorporate regime dependent terms. $X_t$ is given by

\[ dX_t = \kappa_1 (\mu(R_t) - X_t)\,dt + \sigma_X\,dW_t, \tag{1.13} \]

where $R$ is the water reservoir level, and is assumed to follow a two-state Markov chain with state space {wet, dry}. (Much of the Nord Pool electricity production is by hydro.) $Y_t$ is driven by a 'jump process', and satisfies

\[ dY_t = -\kappa_2 Y_t\,dt + dL_t. \tag{1.14} \]

Here $L$ is a jump process, with rates and jump distribution dependent on the season.

Davison et al. (2002) [22] propose a hybrid model based on the ratio $\alpha(t)$ between demand and capacity.
At each time step $t$ the spot price $S_t$ is drawn from a distribution $P(t)$ which is a mixture of Gaussian distributions given by

\[ P(t) = (1 - \epsilon(\alpha(t)))\,P_L(t) + \epsilon(\alpha(t))\,P_H(t). \]

Here $P_H(t)$ is the price-spike distribution, $P_L(t)$ is the low-price distribution, and $\epsilon$ is a function of $\alpha$, the relative demand-capacity ratio. The function $\epsilon$ plays the role of a switching variable that determines whether the price is to be drawn from $P_L(t)$ or $P_H(t)$, i.e. it gives the probability of a spike. They assumed

\[ \epsilon(t) = \tfrac{1}{2}\tanh(20\,(\alpha(t) - 0.85)) + \tfrac{1}{2}, \]

where the constants are determined by historical PJM spot prices. The distributions $P_L$ and $P_H$ are taken to be Gaussian and the function $\epsilon(t)$ is deterministic. It appears that at each time step $t$ independent samples are taken from the distributions $P_L$, $P_H$. The choice of independent samples from $P_L$ would lead to highly oscillating prices, and in Anderson and Davison (2008) [1] they replace $P_L$ by a Brownian motion. To test the model, they simulated trajectories and compared statistical moments. Applying the Kolmogorov-Smirnov test, they concluded that their model is able to simulate data that are from a similar distribution to the observed prices.

Erlwein et al. (2008) [29] develop and analyze an exponential Ornstein-Uhlenbeck process with an added jump process, based on a hidden Markov model (HMM) setting. The jump component is a Poisson process where the mean and variance are controlled by a discrete time HMM. That is, the spot price, which is partially observed (the underlying economic state is hidden), is given by

\[ S_t = D_t \exp\{X_t\}, \]

where $D_t$ is a deterministic function and

\[ dX_t = \alpha(Z_t)(\beta(Z_t) - X_t)\,dt + \sigma(Z_t)\,dW_t + J_t\,dN_t. \]

Here $Z_t$ is a Markov chain with 2 or 3 states, and the jump sizes $J_t$ are conditionally Gaussian distributed, i.e.

\[ J_t \mid Z_t \sim N(\mu_J(Z_t), \sigma_J^2(Z_t)). \]

They apply the EM algorithm to estimate the parameters in the model using data from the Nord Pool market.
The model captures some of the spikes present in the real data for the 2 and 3-state Markov chain. A puzzling feature of this paper is that the estimates for the transition matrix of $Z_t$ suggest that the $Z_t$ are close to i.i.d. random variables. In the same theoretical framework a continuous-time process is derived by Kholodnyi (2001) [52], where self-reversing non-Markovian spikes are added to a Markovian regular price process.

One sees in the literature the need to balance two competing demands. Simple models, particularly simple one factor models, have relatively few parameters, and these parameters may be relatively easy to estimate. However, these models generally fail to capture one or more features of real markets, of which the most difficult are the existence of price spikes, and the relation between spot and futures prices.

The need for a better fit with data leads to more complicated models, usually with hidden variables, and sometimes with multiple regimes. However, introducing more factors requires introducing more parameters into the model. Parameter estimation then becomes a significant challenge. Benth et al. (2008) [6] remark: "The question of estimating such models on data is not an easy one... For multi-factor models this may be an even more challenging problem, involving highly sophisticated estimation techniques". Such parameter estimation is, of course, an essential preliminary to the valuation of options or derivatives based on the commodity. Table 1.1 summarizes some of the statistical models.

Although the standard statistical procedure for estimation of a partly unobserved process involves filtering methods, few of the papers in the literature use those techniques. An exception is Culot et al. (2006) [21]. They consider a model of the form

\[ \log S_t = h(t) + X_t^{s} + \gamma^T X_t^{l}, \]

where $h(t)$ is deterministic, and $X_t^{s}$ and $X_t^{l}$ are spikes and long-term factors.
The spike process $X^{s}$ is an $m$ state Markov regime switching process, while $X^{l}$ is a 3-dimensional Ornstein-Uhlenbeck type process given by $X_t^{l} = (X_t^{(1)}, X_t^{(2)}, X_t^{(3)})$, where

\[ dX_t^{(i)} = -k_i X_t^{(i)}\,dt + \sigma_i\,dW_t^{(i)}, \]

and $W^{(i)}$ are independent Brownian motions. After estimating the jump term $X^{s}$ and subtracting this from the series, the authors combine Kalman filtering techniques with maximum likelihood estimation, using both spot and forward prices, to estimate the parameters for $X^{l}$.

Another exception is Kellerhals (2001) [51]. He developed a model for short-term electricity forwards. The suggested stochastic volatility model uses the non-tradeable spot price $S_t$ of electricity and its variance rate $v_t$ as state variables. The stochastic specification of the processes is given by

\[ dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_t, \]
\[ dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dZ_t, \]
\[ dW_t\,dZ_t = \rho\,dt. \]

Using maximum likelihood estimation based on Kalman filtering he reports empirical results on electricity data from the California market.

Table 1.1: Models for electricity prices (spot price based models).

Gibson / Schwartz (1990):
\[ dS_t = (\mu - \delta_t)S_t\,dt + \sigma_1 S_t\,dz_1, \quad d\delta_t = \kappa(\alpha - \delta_t)\,dt + \sigma_2\,dz_2, \quad dz_1\,dz_2 = \rho\,dt. \]

Schwartz (1997):
\[ dS_t = (r_t - \delta_t)S_t\,dt + \sigma_1 S_t\,dz_1, \quad d\delta_t = \kappa(\alpha - \delta_t)\,dt + \sigma_2\,dz_2, \quad dr_t = a(m - r_t)\,dt + \sigma_3\,dz_3, \]
\[ dz_1\,dz_2 = \rho_1\,dt, \quad dz_2\,dz_3 = \rho_2\,dt, \quad dz_1\,dz_3 = \rho_3\,dt. \]

Lucia / Schwartz (2002):
\[ S_t = h(t) + X_t + Y_t, \quad dX_t = -\lambda X_t\,dt + \sigma_X\,dW_X, \quad dY_t = \mu\,dt + \sigma_Y\,dW_Y, \quad dW_X\,dW_Y = \rho\,dt. \]

Barlow (2002):
\[ S_t = \begin{cases} (1 + \alpha X_t)^{1/\alpha}, & 1 + \alpha X_t > \epsilon_0, \\ \epsilon_0^{1/\alpha}, & 1 + \alpha X_t \le \epsilon_0, \end{cases} \qquad dX_t = -\lambda(X_t - a)\,dt + \sigma\,dW_t. \]

Villaplana (2003):
\[ \ln S_t = h(t) + X_t + Y_t, \quad dX_t = -\kappa_X X_t\,dt + \sigma_X\,dW_1 + J_u\,dN(\lambda_u) - J_d\,dN(\lambda_d), \]
\[ dY_t = -\kappa_Y(\mu - Y_t)\,dt + \sigma_Y\,dW_2, \quad dW_1\,dW_2 = \rho\,dt. \]

Geman / Roncoroni (2006):
\[ S(t) = \exp\{E(t)\}, \quad dE(t) = [h(t) + \theta(\mu(t) - E(t^-))]\,dt + \sigma\,dW(t) + f(E(t^-))\,dJ(t). \]

Hikspoors / Jaimungal (2007):
\[ S_t = \exp\{h(t) + X_t + J_t\}, \quad dX_t = \lambda_X(Y_t - X_t)\,dt + \sigma_X\,dW_t, \quad dY_t = \lambda_Y(\phi - Y_t)\,dt + \sigma_Y\,dZ_t, \quad dJ_t = -\kappa J_{t-}\,dt + dQ_t. \]

Benth et al. (2007):
\[ S(t) = h(t) + X(t), \quad X(t) = \sum_{i=1}^{n} Y_i(t), \quad dY_i(t) = -\alpha_i Y_i(t)\,dt + \sigma_i(t)\,dL_i(t), \quad Y_i(0) = y_i. \]
Given observations $v_1, v_2, \ldots$ which derive from a hidden state process $u_1, u_2, \ldots$, one may distinguish 'online' methods from 'batch' methods. The online methods provide an estimate $\hat u_t$ which can easily be updated given an additional observation $v_{t+1}$, while the batch methods estimate the whole series $(u_1, \ldots, u_t)$ from $(v_1, \ldots, v_t)$.

Following most of the literature, in this thesis we have used online methods (such as the Kalman filter and particle filter), rather than batch methods such as the EM algorithm. One motivation for doing this is the need of market participants to update their models in real time.

As is clear from the survey above, many authors have attempted to design models which capture the typical properties of electricity prices, namely seasonalities, spikes, and stochastic mean-reversion, but none of the models proposed so far has commanded wide assent. In this thesis we consider some relatively simple multi-factor models, with a relatively small number of parameters. A second emphasis is the use of both spot and forward prices for parameter estimation, and a third is the use of filtering techniques to estimate the hidden processes, and hence the model parameters.

We will introduce three different spot price models from which we can also extract the futures prices. All of these models are capable of capturing some of the features of the spot price dynamics and imply certain dynamics for futures prices. The first model (MROU model) is a Gaussian two-factor model where the spot price is a mean-reverting process which reverts to a stochastic mean, itself fluctuating as an Ornstein-Uhlenbeck process.

Since the presence of spikes is a fundamental feature of electricity prices, and any relevant spot price model should take this feature into account, we introduce a second model, an extension of the MROU model with a jump component (MROU with jumps).
However, the inclusion of the jump component introduces two kinds of problems in parameter estimation. The first is that the inclusion of jumps adds several new parameters, to describe the jump frequency and distribution. The second is that the jump models are non-Gaussian, and the best known filtering technique for these models, the particle filter, is not easily adapted to handle parameter estimation.

To avoid this problem, we introduce the third model (NLMROU model), based on the MROU model, which produces spikes while introducing only one more parameter.

In general terms, a statistical model is good if it successfully captures the main features of the observed data. Various statistical tests (such as the Kolmogorov-Smirnov goodness of fit) can be used to test and compare statistical models. From this viewpoint, the theory of electricity prices is rather undeveloped. No systematic comparison of the various models in the literature has been made. One reason for this is that in many cases authors have proposed models, but have not yet developed techniques for parameter estimation. In this thesis, we have followed other workers in this area in using fairly simple tests for our models. We have compared moments and sample paths of simulated and real data. Even these simple tests indicate that our models do not capture all the features of real prices.

Chapter 2

Filtering

2.1 State space formulation

A state space model is a representation of the joint dynamic evolution of an observable random vector $v_t$ and an unobservable state vector $u_t$. It is based on two important sets of system equations: the measurement equation and the transition equation. The transition equation describes the evolution of the state vector and the measurement equation reflects how the state interacts with the vector of observations. The evolution of the state is assumed to be autonomous, that is, it does not depend on the measurement equation.
We consider the non-linear and non-Gaussian state-space model, which is represented in the following general form: for $t = 1, 2, \ldots$,

\[ u_t = f_t(u_{t-1}, q_{t-1}) \qquad \text{(transition equation)}, \tag{2.1} \]
\[ v_t = h_t(u_t, r_t) \qquad \text{(measurement equation)}, \tag{2.2} \]

where $f_t : \mathbb{R}^{n_u} \times \mathbb{R}^{n_q} \to \mathbb{R}^{n_u}$ and $h_t : \mathbb{R}^{n_u} \times \mathbb{R}^{n_r} \to \mathbb{R}^{n_v}$ are vector functions, which are assumed to be known and possibly non-linear. The process and measurement noises, $q_t$ and $r_t$ respectively, are independent with known but arbitrary densities. In addition, we assume that the initial distribution of $u_0$ is available, that is $p(u_0) := p(u_0 \mid v_0)$.

Associated with a state-space model is the problem of estimating the unobservable state using a set of observations. To do so, from a Bayesian perspective we need to estimate the filtering density $p(u_t \mid v_{1:t})$, where $v_{1:t} = \{v_1, v_2, \ldots, v_t\}$ is the past history of the observed process up to time $t$. If possible, we wish to do this recursively, so that $p(u_t \mid v_{1:t})$ can be calculated by updating the estimate $p(u_t \mid v_{1:t-1})$ with the new observation $v_t$. The estimate of the filtering density can be obtained in two stages (prediction and updating) as follows.

Applying the Chapman-Kolmogorov equation we can write the time update iteration as

\[ p(u_t \mid v_{1:t-1}) = \int p(u_t \mid u_{t-1}, v_{1:t-1})\, p(u_{t-1} \mid v_{1:t-1})\, du_{t-1} = \int p(u_t \mid u_{t-1})\, p(u_{t-1} \mid v_{1:t-1})\, du_{t-1}, \tag{2.3} \]

by using in the last equation the Markov property. Now, after the observation $v_t$ is available we use the Bayes rule to have

\[ p(u_t \mid v_{1:t}) = \frac{p(v_{1:t} \mid u_t)\, p(u_t)}{p(v_{1:t})} = \frac{p(v_t, v_{1:t-1} \mid u_t)\, p(u_t)}{p(v_t, v_{1:t-1})} \]
\[ = \frac{p(v_t \mid v_{1:t-1}, u_t)\, p(v_{1:t-1} \mid u_t)\, p(u_t)}{p(v_t \mid v_{1:t-1})\, p(v_{1:t-1})} = \frac{p(v_t \mid v_{1:t-1}, u_t)\, p(u_t \mid v_{1:t-1})\, p(v_{1:t-1})\, p(u_t)}{p(v_t \mid v_{1:t-1})\, p(v_{1:t-1})\, p(u_t)} \]
\[ = \frac{p(v_t \mid u_t)\, p(u_t \mid v_{1:t-1})}{p(v_t \mid v_{1:t-1})}, \tag{2.4} \]

where the denominator could be written as

\[ p(v_t \mid v_{1:t-1}) = \int p(v_t \mid u_t)\, p(u_t \mid v_{1:t-1})\, du_t. \tag{2.5} \]

Unfortunately, in general there do not exist closed-form expressions for equations (2.3) and (2.4).
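The main exception to this is where (2.1) and (2.2) are linear and the noise processes $q_t$, $r_t$ are Gaussian, and in this case the solution is given by the Kalman filter [48].

2.2 The Kalman filter

Equations (2.1) and (2.2) reduce to the following special case where a linear Gaussian state-space model is considered. To include a more general case we have added two additive components, $C$ and $A$, to the transition and measurement equations respectively. We have

\[ u_t = C + D\,u_{t-1} + q_{t-1}, \tag{2.6} \]
\[ v_t = A + B\,u_t + r_t, \tag{2.7} \]

where $u_t$, $C$ and $q_t$ are $n_u \times 1$, $D$ is $n_u \times n_u$, $v_t$, $A$ and $r_t$ are $n_v \times 1$, $B$ is $n_v \times n_u$, and the process and measurement noises are normally distributed:

\[ \begin{pmatrix} q_t \\ r_t \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_q & 0 \\ 0 & \Sigma_r \end{pmatrix} \right). \tag{2.8} \]

The initial condition becomes

\[ u_0 \sim N(\bar u_0, \Sigma_0), \tag{2.9} \]

and the matrices $C$, $D$, $A$, $B$, $\Sigma_q$, and $\Sigma_r$ are assumed to be known. Here $N(\mu, \Sigma)$ denotes a Gaussian density with mean $\mu$ and covariance $\Sigma$, that is:

\[ N(\mu, \Sigma) := |2\pi\Sigma|^{-1/2} \exp\{-\tfrac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\}, \tag{2.10} \]

where $|\cdot|$ denotes the determinant.

For the model above (2.6)-(2.9), it follows that the transition density $p(u_{t+1} \mid u_t)$ and the measurement density $p(v_t \mid u_t)$ are normal. It can be shown that this implies that the prediction and filtering densities are also normal, see [77] for details. We have

\[ p(u_t \mid v_{1:t-1}) = N(\hat u_{t|t-1}, \Sigma_{t|t-1}), \tag{2.11} \]
\[ p(u_t \mid v_{1:t}) = N(\hat u_{t|t}, \Sigma_{t|t}), \tag{2.12} \]
\[ p(v_t \mid v_{1:t-1}) = N(\hat v_{t|t-1}, F_{t|t-1}), \tag{2.13} \]

where the conditional means $\hat u_{t|t-1}$, $\hat u_{t|t}$, $\hat v_{t|t-1}$ and conditional covariances $\Sigma_{t|t-1}$, $\Sigma_{t|t}$ and $F_{t|t-1}$ are computed by the following pseudo code of the Kalman filter.

Algorithm 1 (Kalman filter)

• Step 1, Initialization
Set $\hat u_{0|0} = \bar u_0$, $\Sigma_{0|0} = \Sigma_0$, and set $t = 1$.

• Step 2, Prediction
Compute
\[ \hat u_{t|t-1} = C + D\,\hat u_{t-1|t-1}, \qquad \Sigma_{t|t-1} = D\,\Sigma_{t-1|t-1}\,D' + \Sigma_q. \tag{2.14} \]

• Step 3, Innovation
Define
\[ e_t = v_t - \hat v_{t|t-1} \quad \text{with} \quad \hat v_{t|t-1} = A + B\,\hat u_{t|t-1}, \tag{2.15} \]
and compute
\[ F_{t|t-1} = B\,\Sigma_{t|t-1}\,B' + \Sigma_r. \tag{2.16} \]

• Step 4, Updating
Compute
\[ K_t = \Sigma_{t|t-1}\,B'\,F_{t|t-1}^{-1}, \tag{2.17} \]
\[ \hat u_{t|t} = \hat u_{t|t-1} + K_t\,e_t, \tag{2.18} \]
\[ \Sigma_{t|t} = \Sigma_{t|t-1} - \Sigma_{t|t-1}\,B'\,F_{t|t-1}^{-1}\,B\,\Sigma_{t|t-1}. \tag{2.19} \]

• Step 5, Looping
If $t < n$, set $t = t + 1$ and go to Step 2; else stop.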
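The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the thesis: the function name `kalman_filter`, the argument names and the toy data are assumptions made for the example, and the matrices follow the notation of (2.6)-(2.9).

```python
import numpy as np

def kalman_filter(v, A, B, C, D, Sigma_q, Sigma_r, u0, Sigma0):
    """Sketch of Algorithm 1 for u_t = C + D u_{t-1} + q_{t-1}, v_t = A + B u_t + r_t.

    v has shape (T, n_v). Returns the filtered means and covariances.
    """
    v = np.asarray(v, dtype=float)
    T = v.shape[0]
    n = len(u0)
    u_filt = np.zeros((T, n))
    S_filt = np.zeros((T, n, n))
    # Step 1: initialization
    u = np.asarray(u0, dtype=float)
    S = np.asarray(Sigma0, dtype=float)
    for t in range(T):
        # Step 2: prediction, equation (2.14)
        u_pred = C + D @ u
        S_pred = D @ S @ D.T + Sigma_q
        # Step 3: innovation, equations (2.15)-(2.16)
        e = v[t] - (A + B @ u_pred)
        F = B @ S_pred @ B.T + Sigma_r
        # Step 4: updating, equations (2.17)-(2.19)
        K = S_pred @ B.T @ np.linalg.inv(F)
        u = u_pred + K @ e
        S = S_pred - K @ B @ S_pred
        u_filt[t], S_filt[t] = u, S
    return u_filt, S_filt

# Toy check (hypothetical data): a constant hidden level observed with unit noise.
# With a very diffuse prior the filtered mean should approach the sample mean.
one, zero = np.eye(1), np.zeros(1)
v_demo = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
u_demo, S_demo = kalman_filter(v_demo, A=zero, B=one, C=zero, D=one,
                               Sigma_q=np.zeros((1, 1)), Sigma_r=one,
                               u0=zero, Sigma0=1e6 * one)
print(u_demo[-1], S_demo[-1])
```

In the toy run the state does not move ($D = 1$, $\Sigma_q = 0$), so the filter reduces to recursive averaging of the observations, which gives a simple sanity check on the recursion.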
In the linear Gaussian case, the Kalman filter has strong optimality properties. (2.11)-(2.13) give the maximum likelihood estimator of $u_t$ given $v_{1:t}$, and this is also the minimum mean square error estimator. That is, in the mean square sense no other algorithm can perform better than the Kalman filter in the Gaussian environment; see [40] for details.

In cases where the measurement or the transition equation are nonlinear, sub-optimal solutions such as the extended Kalman filter (EKF) and the unscented Kalman filter (UKF) are commonly used to solve the problem. The extended Kalman filter simply linearizes all nonlinear transformations and substitutes a Jacobian matrix for the linear transformations in the Kalman filter equations [41]. Although it is easy to implement, it has a number of limitations, especially if the system nonlinearities are severe or the true distribution is multimodal or highly skewed, see [75]. The unscented Kalman filter gives a more accurate performance for nonlinear equations, but does rely on the noise being Gaussian.

2.3 The unscented Kalman filter

An alternative filter with performance superior to the extended Kalman filter is the unscented Kalman filter.

Unlike the extended Kalman filter, it does not approximate the nonlinear functions of the process and the observation; it uses the true nonlinear models to approximate the distribution of the state variable $u_t$ by applying an 'unscented transformation' to it. The unscented transformation uses the so-called sigma-points that capture the mean and covariance of the original distributions and, when propagated through the true nonlinear system, capture the posterior mean and covariance accurately to third order. For more details see [47, 75]. Unlike the particle filter considered in the next section, which requires a large number of points, the unscented transform only requires $2n + 1$ points to capture the mean and covariance of a probability distribution in $\mathbb{R}^n$.
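To make the role of the $2n + 1$ sigma points concrete, the following sketch builds them with the usual scaled parametrisation ($\alpha$, $\beta$, $\kappa$) and pushes them through a function. The function names `sigma_points` and `unscented_transform` and the default parameter values are assumptions for illustration only, and the additive noise covariance is deliberately left out.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1e-3, beta=2.0, kappa=0.0):
    """Return the 2n+1 sigma points with their mean and covariance weights."""
    n = mean.shape[0]
    lam = alpha**2 * (n + kappa) - n
    # Columns of the Cholesky factor of (n + lam) * cov are the offsets.
    L = np.linalg.cholesky((n + lam) * cov)
    pts = np.vstack([mean, mean + L.T, mean - L.T])   # shape (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def unscented_transform(g, mean, cov, **params):
    """Approximate the mean and covariance of g(u) for u ~ N(mean, cov)."""
    pts, wm, wc = sigma_points(mean, cov, **params)
    y = np.array([g(p) for p in pts])                 # propagate each point
    y_mean = wm @ y
    d = y - y_mean
    y_cov = (wc[:, None] * d).T @ d                   # noise covariance not added
    return y_mean, y_cov

# For a linear map the transform should reproduce the exact moments.
mean0 = np.array([1.0, -2.0])
cov0 = np.array([[2.0, 0.3], [0.3, 0.5]])
M = np.array([[1.0, 2.0], [0.0, -1.0]])
b = np.array([0.5, 0.5])
y_mean, y_cov = unscented_transform(lambda x: M @ x + b, mean0, cov0)
print(y_mean, y_cov)
```

Checking the transform on a linear map is a convenient test because the exact answer, $M\mu + b$ and $M\Sigma M'$, is known in closed form.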
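Let us consider a simplified version of the UKF formulation, where we assume that both the transition and measurement noises are additive Gaussian, that is,

\[ u_t = f(u_{t-1}) + q_{t-1}, \tag{2.20} \]
\[ v_t = h(u_t) + r_t, \tag{2.21} \]

where $q_{t-1} \sim N(0, \Sigma_q)$ and $r_t \sim N(0, \Sigma_r)$. Here $u_t \in \mathbb{R}^n$, $v_t \in \mathbb{R}^m$. The algorithm can be described as follows:

Algorithm 2 (Unscented Kalman filter)

• Step 1, Initialization
Set
\[ \hat u_0 = E(u_0) \quad \text{and} \quad \Sigma_0 = E[(u_0 - \hat u_0)(u_0 - \hat u_0)']. \]
Set $t = 1$.

• Step 2, Unscented transformation
Compute the sigma points and weights.

For $i = 0$:
\[ \chi_{t-1}(0) = \hat u_{t-1}, \qquad W_0^{(m)} = \frac{\lambda}{n + \lambda}, \qquad W_0^{(c)} = \frac{\lambda}{n + \lambda} + (1 - \alpha^2 + \beta). \]

For $i = 1, \ldots, n$:
\[ \chi_{t-1}(i) = \hat u_{t-1} + \left(\sqrt{(n + \lambda)\Sigma_{t-1}}\right)_i, \qquad W_i^{(m)} = W_i^{(c)} = \frac{1}{2(n + \lambda)}. \]

For $i = n + 1, \ldots, 2n$:
\[ \chi_{t-1}(i) = \hat u_{t-1} - \left(\sqrt{(n + \lambda)\Sigma_{t-1}}\right)_{i-n}, \qquad W_i^{(m)} = W_i^{(c)} = \frac{1}{2(n + \lambda)}. \]

Here the subscripts $i$ and $i - n$ correspond to the $i$th and $(i-n)$th columns of the square-root matrix (Cholesky factorization). $\lambda = \alpha^2(n + \kappa) - n$ is a scaling parameter. $\alpha$ determines the spread of the sigma points around $\hat u_{t-1}$ and is usually set to a small positive value. $\kappa$ is a secondary scaling parameter which is usually set to 0, and $\beta$ is used to incorporate prior knowledge of the distribution of $u$.

• Step 3, Time update
For $i = 0, 1, \ldots, 2n$,
\[ \chi_{t|t-1}(i) = f(\chi_{t-1}(i)), \]
\[ \hat u_{t|t-1} = \sum_{i=0}^{2n} W_i^{(m)}\,\chi_{t|t-1}(i), \]
\[ \Sigma_{t|t-1} = \sum_{i=0}^{2n} W_i^{(c)}\,(\chi_{t|t-1}(i) - \hat u_{t|t-1})(\chi_{t|t-1}(i) - \hat u_{t|t-1})' + \Sigma_q, \]
\[ V_{t|t-1}(i) = h(\chi_{t|t-1}(i)), \]
\[ \hat v_{t|t-1} = \sum_{i=0}^{2n} W_i^{(m)}\,V_{t|t-1}(i), \tag{2.22} \]
\[ \Sigma_{v_t v_t} = \sum_{i=0}^{2n} W_i^{(c)}\,(V_{t|t-1}(i) - \hat v_{t|t-1})(V_{t|t-1}(i) - \hat v_{t|t-1})' + \Sigma_r, \tag{2.23} \]
\[ \Sigma_{u_t v_t} = \sum_{i=0}^{2n} W_i^{(c)}\,(\chi_{t|t-1}(i) - \hat u_{t|t-1})(V_{t|t-1}(i) - \hat v_{t|t-1})'. \]

• Step 4, Measurement update
Calculate
\[ K_t = \Sigma_{u_t v_t}\,\Sigma_{v_t v_t}^{-1}, \tag{2.24} \]
\[ \hat u_t = \hat u_{t|t-1} + K_t(v_t - \hat v_{t|t-1}), \tag{2.25} \]
\[ \Sigma_t = \Sigma_{t|t-1} - K_t\,\Sigma_{v_t v_t}\,K_t'. \tag{2.26} \]

• Step 5, Looping
If $t < n$, set $t = t + 1$, update $\hat u_t$ and $\Sigma_t$ and go to Step 2; else stop.

2.4 Particle filter

A different approach to filtering has recently become popular [26, 79]. In this approach, we use Monte Carlo simulations instead of Gaussian approximations for $p(u_t \mid v_{1:t})$, as in the Kalman filter.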
This method allows for a complete representation of the filtering distribution, so that any statistical estimate can be easily calculated. This filter has the advantage that it allows one to deal with fundamentally non-Gaussian situations.

The idea is based on the importance sampling technique [70, 76]. Recall that $u_{0:t} = \{u_0, u_1, \ldots, u_t\}$ is the (unknown) true state of the system, and $v_{1:t} = \{v_1, v_2, \ldots, v_t\}$ are the observations. Suppose we wish to calculate the conditional expectation

\[ E(f(u_{0:t}) \mid v_{1:t}) = \int f(u_{0:t})\, p(u_{0:t} \mid v_{1:t})\, du_{0:t}. \tag{2.27} \]

In most non-Gaussian or non-linear situations, the true distribution $p(u_{0:t} \mid v_{1:t})$ will be impossible to calculate. Importance sampling works by instead sampling from a proposal distribution $q(u_{0:t} \mid v_{1:t})$, which can be easily sampled from. The support of $q(u_{0:t} \mid v_{1:t})$ is assumed to cover that of $p(u_{0:t} \mid v_{1:t})$. We can write

\[ E(f(u_{0:t}) \mid v_{1:t}) = \int f(u_{0:t})\, \frac{p(u_{0:t} \mid v_{1:t})}{q(u_{0:t} \mid v_{1:t})}\, q(u_{0:t} \mid v_{1:t})\, du_{0:t} \]
\[ = \int f(u_{0:t})\, \frac{p(v_{1:t} \mid u_{0:t})\, p(u_{0:t})}{p(v_{1:t})\, q(u_{0:t} \mid v_{1:t})}\, q(u_{0:t} \mid v_{1:t})\, du_{0:t} \]
\[ = \int f(u_{0:t})\, \frac{w_t(u_{0:t})}{p(v_{1:t})}\, q(u_{0:t} \mid v_{1:t})\, du_{0:t}, \tag{2.28} \]

where

\[ w_t(u_{0:t}) = \frac{p(v_{1:t} \mid u_{0:t})\, p(u_{0:t})}{q(u_{0:t} \mid v_{1:t})} \tag{2.29} \]

is defined as the filtering non-normalized weight at step $t$. Now

\[ E(f(u_{0:t}) \mid v_{1:t}) = \frac{1}{p(v_{1:t})} \int f(u_{0:t})\, w_t(u_{0:t})\, q(u_{0:t} \mid v_{1:t})\, du_{0:t} \]
\[ = \frac{\int f(u_{0:t})\, w_t(u_{0:t})\, q(u_{0:t} \mid v_{1:t})\, du_{0:t}}{\int \frac{p(v_{1:t} \mid u_{0:t})\, p(u_{0:t})}{q(u_{0:t} \mid v_{1:t})}\, q(u_{0:t} \mid v_{1:t})\, du_{0:t}} \]
\[ = \frac{\int f(u_{0:t})\, w_t(u_{0:t})\, q(u_{0:t} \mid v_{1:t})\, du_{0:t}}{\int w_t(u_{0:t})\, q(u_{0:t} \mid v_{1:t})\, du_{0:t}} \]
\[ = \frac{E_q(f(u_{0:t})\, w_t(u_{0:t}))}{E_q(w_t(u_{0:t}))} = E_q(f(u_{0:t})\, \tilde w_t(u_{0:t})), \]

where

\[ \tilde w_t(u_{0:t}) = \frac{w_t(u_{0:t})}{E_q(w_t(u_{0:t}))} \tag{2.30} \]

is defined to be the filtering normalized weight at step $t$.

Now let $u_{0:t}^{(i)}$, $i = 1, 2, \ldots, n_p$, be a Monte-Carlo sample from $q(u_{0:t} \mid v_{1:t})$. Then by the law of large numbers

\[ E(f(u_{0:t}) \mid v_{1:t}) \approx \sum_{i=1}^{n_p} \tilde w_t^{(i)}\, f(u_{0:t}^{(i)}), \tag{2.31} \]

where now

\[ \tilde w_t^{(i)} = \frac{w_t(u_{0:t}^{(i)})}{\sum_{j=1}^{n_p} w_t(u_{0:t}^{(j)})}. \tag{2.32} \]

Note that (2.31) is correctly normalized: if $f \equiv 1$ then both sides of (2.31) equal 1.

Thus, provided we can sample from the proposal distribution $q(u_{0:t} \mid v_{1:t})$, and calculate the weights $w_t(u_{0:t})$ given by (2.29), then we can estimate $E(f(u_{0:t}) \mid v_{1:t})$. Note also that if we can only calculate $w_t(\cdot)$ up to a multiplicative constant $c$, which can depend on $v_{1:t}$, then this constant drops out when we calculate $\tilde w_t^{(i)}$ in (2.32), and so the procedure will still work.

Suppose that the importance density is chosen to factorize, so that

\[ q(u_{0:t} \mid v_{1:t}) = q(u_t \mid u_{0:t-1}, v_{1:t})\, q(u_{0:t-1} \mid v_{1:t-1}). \tag{2.33} \]

Then, writing $w_t^{(i)} = w_t(u_{0:t}^{(i)})$,

\[ w_t^{(i)} = \frac{p(v_{1:t} \mid u_{0:t}^{(i)})\, p(u_{0:t}^{(i)})}{q(u_{0:t}^{(i)} \mid v_{1:t})} = \frac{p(v_t, v_{1:t-1} \mid u_{0:t-1}^{(i)}, u_t^{(i)})\, p(u_t^{(i)} \mid u_{0:t-1}^{(i)})\, p(u_{0:t-1}^{(i)})}{q(u_t^{(i)} \mid u_{0:t-1}^{(i)}, v_{1:t})\, q(u_{0:t-1}^{(i)} \mid v_{1:t-1})}. \]

The measurement equation (2.2) implies that $v_t$ depends on $u_{0:t}$ only through $u_t$, and that $v_1, v_2, \ldots, v_t$ are conditionally independent given $u_{0:t}$. Therefore

\[ p(v_t, v_{1:t-1} \mid u_{0:t-1}^{(i)}, u_t^{(i)}) = p(v_t \mid u_t^{(i)})\, p(v_{1:t-1} \mid u_{0:t-1}^{(i)}). \]

Further, (2.1) implies that $u_t$ is Markov, so

\[ p(u_t^{(i)} \mid u_{0:t-1}^{(i)}) = p(u_t^{(i)} \mid u_{t-1}^{(i)}). \]

Combining these equations,

\[ w_t^{(i)} = \frac{p(v_{1:t-1} \mid u_{0:t-1}^{(i)})\, p(u_{0:t-1}^{(i)})}{q(u_{0:t-1}^{(i)} \mid v_{1:t-1})} \cdot \frac{p(v_t \mid u_t^{(i)})\, p(u_t^{(i)} \mid u_{t-1}^{(i)})}{q(u_t^{(i)} \mid u_{0:t-1}^{(i)}, v_{1:t})} = w_{t-1}^{(i)}\, \frac{p(v_t \mid u_t^{(i)})\, p(u_t^{(i)} \mid u_{t-1}^{(i)})}{q(u_t^{(i)} \mid u_{0:t-1}^{(i)}, v_{1:t})}. \tag{2.34} \]

Thus, the importance weights defined in (2.34) can be updated in a simple way at each time step. (2.31) implies that the filtering density $p(u_t \mid v_{1:t})$ can be approximated by

\[ p(u_t \mid v_{1:t}) \approx \sum_{i=1}^{n_p} \tilde w_t^{(i)}\, \delta_{u_t^{(i)}}(u_t), \tag{2.35} \]

where $\delta_{u}(\cdot)$ denotes the point mass at $u$. The success of this operation depends on how close the proposal distribution is to the posterior and whether the resulting point-mass approximation is an adequate representation of the distribution of interest.

Although sequential importance sampling poses only one restriction on the importance density, equation (2.33), with the number of choices being unlimited, the design of the appropriate proposal function is, in fact, one of the most critical issues in importance sampling algorithms [79].
Poor choice leads to poor approximation in (2.35), and to poor algorithm performance in general. See [26, 69] for more details and variants of the particle filter.

One major problem with this algorithm is that the variance of the weights increases steadily over time. If one starts with a fixed number $n_p$ of particles, then in practice after a while nearly all the mass of the distribution in (2.35) is concentrated at one particle. Not surprisingly, this leads to poor algorithm performance. In order to solve this, we resample the points to create copies of particles with large importance weights and to remove those with negligible importance weights. This ensures that there are sufficient particles exploring regions of high probability in the next time step [37]. Various methods have been suggested for this [2, 13, 58]. In the particle filter literature four basic resampling algorithms can be identified:

1. Multinomial resampling
Generate $n_p$ ordered uniform random numbers
\[ a_k = a_{k+1}\,\tilde a_k^{1/k}, \qquad a_{n_p} = \tilde a_{n_p}^{1/n_p}, \qquad \text{with } \tilde a_k \sim U[0, 1), \]
and use them to select the new particles $u^{*(k)}$ according to the multinomial distribution. That is,
\[ u^{*(k)} = u^{(F^{-1}(a_k))} = u^{(i)} \quad \text{with } i \text{ s.t. } a_k \in \left[ \sum_{s=1}^{i-1} \tilde w^{(s)}, \sum_{s=1}^{i} \tilde w^{(s)} \right), \]
where $F^{-1}$ denotes the generalized inverse of the cumulative probability distribution of the normalized particle weights.

2. Stratified resampling
Generate $n_p$ ordered random numbers
\[ a_k = \frac{(k - 1) + \tilde a_k}{n_p}, \qquad \text{with } \tilde a_k \sim U[0, 1), \]
and use them to select $u^{*(k)}$ according to the multinomial distribution.

3. Systematic resampling
Generate $n_p$ ordered numbers
\[ a_k = \frac{(k - 1) + \tilde a}{n_p}, \qquad \text{with } \tilde a \sim U[0, 1), \]
and use them to select $u^{*(k)}$ according to the multinomial distribution.

4. Residual resampling
Allocate $n_i' = \lfloor n_p \tilde w^{(i)} \rfloor$ copies of particle $u^{(i)}$ to the new distribution. Additionally, resample $m = n_p - \sum_i n_i'$ particles from $\{u^{(i)}\}$ by making $n_i''$ copies of particle $u^{(i)}$, where the probability for selecting $u^{(i)}$ is proportional to $w'^{(i)} = n_p \tilde w^{(i)} - n_i'$, using one of the resampling schemes mentioned earlier.

An illustration of the generic particle filter is shown in Figure 2.1.
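As a concrete illustration of schemes 2 and 3, the sketch below generates the ordered numbers $a_k$ and maps each one to a particle index through the cumulative sum of the normalized weights. The function names and the use of a NumPy random generator are assumptions made for this example, and the weights are assumed to be already normalized.

```python
import numpy as np

def _select(weights, a):
    """Map each ordered number a_k to the index i with a_k in [cdf_{i-1}, cdf_i)."""
    cdf = np.cumsum(np.asarray(weights, dtype=float))
    cdf[-1] = 1.0                                   # guard against round-off
    idx = np.searchsorted(cdf, a, side="right")
    return np.minimum(idx, len(cdf) - 1)

def stratified_resample(weights, rng):
    """One independent uniform draw inside each of the n_p strata."""
    n = len(weights)
    a = (np.arange(n) + rng.uniform(size=n)) / n
    return _select(weights, a)

def systematic_resample(weights, rng):
    """A single uniform draw shared by all n_p strata."""
    n = len(weights)
    a = (np.arange(n) + rng.uniform()) / n
    return _select(weights, a)

# Hypothetical weights: particle 0 carries half of the mass.
rng = np.random.default_rng(0)
w_demo = [0.5, 0.25, 0.125, 0.125]
idx_sys = systematic_resample(w_demo, rng)
idx_str = stratified_resample(w_demo, rng)
print(idx_sys, idx_str)
```

The returned indices say which of the old particles are copied into the new population; after resampling, the weights are reset to $1/n_p$.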
The whole particle filter algorithm can be implemented in the following way:

[Figure 2.1: A graphical representation of the particle filter with importance sampling and resampling.]

Algorithm 3 (Generic Particle Filter)

• Step 1, Initialization
For time step $t = 0$, and for each $i$ between 1 and $n_p$ (number of particles), take
\[ u_0^{(i)} \sim p(u_0), \]
where $p(\cdot)$ is the initial distribution. Also take
\[ w_0^{(i)} = \frac{1}{n_p} \qquad \text{(importance weights)}. \]

While $1 \le t \le n$ (number of observations):

• Step 2, Prediction
For each index $i$ sample
\[ u_t^{(i)} \sim q(u_t \mid u_{t-1}^{(i)}, v_{1:t}). \]

• Step 3, Importance sampling
Calculate the probabilities
\[ p(v_t \mid u_t^{(i)}) \quad \text{(likelihood distribution)}, \]
\[ p(u_t^{(i)} \mid u_{t-1}^{(i)}) \quad \text{(prior distribution)}, \tag{2.36} \]
\[ q(u_t^{(i)} \mid u_{t-1}^{(i)}, v_{1:t}) \quad \text{(proposal distribution)}, \]
and the associated weights for each $i$
\[ w_t^{(i)} = w_{t-1}^{(i)}\, \frac{p(v_t \mid u_t^{(i)})\, p(u_t^{(i)} \mid u_{t-1}^{(i)})}{q(u_t^{(i)} \mid u_{t-1}^{(i)}, v_{1:t})}. \]

• Step 4, Normalizing
Normalize the weights
\[ \tilde w_t^{(i)} = \frac{w_t^{(i)}}{\sum_{j=1}^{n_p} w_t^{(j)}}. \]

• Step 5, Resampling
Resample the points $u_t^{(i)}$ and reset $w_t^{(i)} = \tilde w_t^{(i)} = 1/n_p$.

• Step 6, Looping
Increment $t$ and go back to Step 2. Stop at the end of the While loop.

2.5 Parameter estimation via maximum likelihood

Up to this point we have assumed that the functions $f$ and $h$, together with the distributions of the noises $q_t$, $r_t$, are fully known. We have discussed the filtering problem, that is, how to estimate $u_{0:t}$ from observations $v_{1:t}$. But in many applications, and in particular nearly all financial applications, $f$, $h$, $q_t$, and $r_t$ will depend on unknown parameters. In this case the structure of the nonlinear state-space model becomes

\[ u_t = f_t(u_{t-1}, q_{t-1}, \theta) \qquad \text{(transition equation)}, \tag{2.37} \]
\[ v_t = h_t(u_t, r_t, \theta) \qquad \text{(measurement equation)}, \tag{2.38} \]

where $\theta \in \Theta \subset \mathbb{R}^{n_\theta}$ denotes the parameters in the model.

Given the above structure, in this section we address the problem of estimating the parameters $\theta$ from the observed data.
The parameter estimation problem for state-space models has generated a lot of interest over the past few years and many techniques have been proposed to solve it. These methods can be broadly classified as Maximum Likelihood or Bayesian.

Using a Maximum Likelihood formulation, the estimate of $\theta$ is the maximizing argument of the likelihood of the observed data, i.e.
$$\hat\theta = \arg\max_{\theta \in \Theta}\ p_\theta(v_1, v_2, \ldots, v_t), \qquad (2.39)$$
where $p_\theta(v_1, v_2, \ldots, v_t)$ denotes the joint density of the observations up to time $t$. In a more convenient form we can rewrite (2.39) as
$$\hat\theta = \arg\max_{\theta \in \Theta}\ \mathcal{L}_\theta(V_{1:t}), \qquad \mathcal{L}_\theta(V_{1:t}) = \log p_\theta(V_{1:t}), \qquad (2.40)$$
where $V_{1:t} = \{v_1, v_2, \ldots, v_t\}$. The joint density can be written as the product of the conditional densities:
$$p_\theta(V_{1:t}) = \prod_{k=1}^{t} p_\theta(v_k \mid V_{1:k-1}), \qquad (2.41)$$
where $p_\theta(v_1 \mid V_0) = p_\theta(v_1)$. Thus the log-likelihood function is
$$\mathcal{L}_\theta(V_{1:t}) = \sum_{k=1}^{t} \log p_\theta(v_k \mid V_{1:k-1}). \qquad (2.42)$$

The material presented up to here has dealt with the state-space model using a quite general formulation. The algorithm described above is in principle the full answer to the problem of parameter estimation. Now, from a classical approach we can use some numerical optimization search procedure (Newton's method, Nelder-Mead) on (2.42) in order to calculate the maximum likelihood estimate $\hat\theta$.

Maximum likelihood estimation (MLE) of $\theta$ is particularly simple in the linear Gaussian state-space model (2.6)-(2.7), since the density function $p_\theta(v_k \mid V_{1:k-1})$ is the normal distribution with mean $\hat v_{k|k-1}$ and covariance matrix $F_{k|k-1}$ given by equations (2.15) and (2.16) respectively. Thus, writing $d$ for the dimension of $v_k$,
$$p_\theta(v_k \mid V_{1:k-1}) = (2\pi)^{-d/2}\,|F_{k|k-1}|^{-1/2} \exp\Big\{-\tfrac12 (v_k - \hat v_{k|k-1})'\, F_{k|k-1}^{-1}\, (v_k - \hat v_{k|k-1})\Big\}$$
and, up to an additive constant that does not depend on $\theta$, the log-likelihood function becomes
$$\mathcal{L}_\theta(V_{1:t}) = -\frac12 \sum_{k=1}^{t} \Big[\log |F_{k|k-1}| + (v_k - \hat v_{k|k-1})'\, F_{k|k-1}^{-1}\, (v_k - \hat v_{k|k-1})\Big], \qquad (2.43)$$
where $|F_{k|k-1}|$ denotes the determinant of $F_{k|k-1}$. Thus finding the MLE is quite straightforward for the Kalman filter.
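As a rough illustration of how (2.42)-(2.43) is evaluated in practice, the sketch below runs a Kalman filter over the data and accumulates the Gaussian log-density of each one-step prediction error. It is a minimal sketch under stated assumptions, not the code used in this thesis: the function name `kalman_loglik` is hypothetical, the model is written as $u_t = C + D u_{t-1} + q$, $v_t = A + B u_t + r$ with time-invariant matrices, and the constant $-\tfrac{d}{2}\log 2\pi$ is kept so that the result is the full log-likelihood.

```python
import numpy as np

def kalman_loglik(v, C, D, Sq, A, B, Sr, u0, P0):
    """Gaussian log-likelihood of observations v via the Kalman filter.

    Assumed model (time-invariant for simplicity):
        u_t = C + D u_{t-1} + q,  cov(q) = Sq
        v_t = A + B u_t     + r,  cov(r) = Sr
    u0, P0 are the mean and covariance of the initial state.
    """
    u = np.asarray(u0, dtype=float)
    P = np.asarray(P0, dtype=float)
    d = B.shape[0]                       # dimension of each observation
    loglik = 0.0
    for vk in v:
        # prediction step
        u = C + D @ u
        P = D @ P @ D.T + Sq
        # prediction error and its covariance F
        e = vk - (A + B @ u)
        F = B @ P @ B.T + Sr
        Finv_e = np.linalg.solve(F, e)
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (d * np.log(2.0 * np.pi) + logdet + e @ Finv_e)
        # update step; K = P B' F^{-1} (F and P are symmetric)
        K = np.linalg.solve(F, B @ P).T
        u = u + K @ e
        P = P - K @ B @ P
    return loglik
```

Maximizing this function over the parameters that enter `C`, `D`, `Sq`, `A`, `B` and `Sr` with a numerical optimizer gives the maximum likelihood estimate. Using `solve` and `slogdet` rather than an explicit inverse and determinant is a numerical-stability choice made for this sketch.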
In the general nonlinear case this approach is non-trivial, since the distribution $p_\theta(v_1, v_2, \ldots, v_t)$ is generally not available in closed form. However, if we are using the UKF algorithm, then one can approximate the true (non-Gaussian) distribution $p_\theta(v_k \mid V_{1:k-1})$ by a Gaussian distribution $N(\hat v_{k|k-1}, \Sigma_{v_k v_k})$, where $\hat v_{k|k-1}$ and $\Sigma_{v_k v_k}$ are given by equations (2.22) and (2.23). Hence one obtains
$$\mathcal{L}_\theta(V_{1:t}) \approx -\frac12 \sum_{k=1}^{t} \Big[\log |\Sigma_{v_k v_k}| + (v_k - \hat v_{k|k-1})'\, \Sigma_{v_k v_k}^{-1}\, (v_k - \hat v_{k|k-1})\Big].$$
Since the UKF algorithm is based on the equations of a Kalman filter, the maximization of the log-likelihood can be done exactly as in the Kalman filter.

For the general nonlinear case, with non-Gaussian noise, we have seen that the particle filter provides a technique for filtering with known parameters. In general, we can approximate (2.42) using equation (2.5). Given the likelihood at step $k$,
$$p_\theta(v_k \mid V_{1:k-1}) = \int p_\theta(v_k \mid u_k)\, p_\theta(u_k \mid V_{1:k-1})\, du_k,$$
this can be written as
$$p_\theta(v_k \mid V_{1:k-1}) = \int \frac{p_\theta(v_k \mid u_k)\, p_\theta(u_k \mid V_{1:k-1})}{q_\theta(u_k \mid u_{k-1}, V_{1:k})}\, q_\theta(u_k \mid u_{k-1}, V_{1:k})\, du_k,$$
and given that by construction the $u_k^{(j)}$ are distributed according to $q$, we can write the Monte Carlo approximation
$$p_\theta(v_k \mid V_{1:k-1}) \approx \frac{1}{n_p} \sum_{j=1}^{n_p} w_k^{(j)}. \qquad (2.44)$$
Thus, we can estimate $\mathcal{L}(\theta)$ by
$$\sum_{k=1}^{t} \log\Big(\frac{1}{n_p} \sum_{j=1}^{n_p} w_k^{(j)}\Big). \qquad (2.45)$$

While this does give an approximation to the log-likelihood, this approach has several problems if we try to use it to obtain the MLE. For a fixed $\theta$, equation (2.45) gives a random variable, where the randomness comes from the particle filter. See Chapter 4 below for an account of the difficulties this causes.

An alternative approach to maximizing the likelihood is to employ the Expectation Maximization (EM) algorithm [74]. The objective of the algorithm is to maximize the likelihood of the observed data (2.42) in the presence of the hidden variables $U_{0:t} = \{u_0, u_1, \ldots, u_t\}$. The basic idea is that if we could observe $U_{0:t}$, in addition to the observations $V_{1:t}$, then we would consider $\{U_{0:t}, V_{1:t}\}$ as the complete data with the joint density
$$p_\theta(U_{0:t}, V_{1:t}) = p_\theta(u_0) \prod_{k=1}^{t} p_\theta(u_k \mid u_{k-1}) \prod_{k=1}^{t} p_\theta(v_k \mid u_k) \qquad (2.46)$$
and seek the maximum log-likelihood estimate of $\theta$ via
$$\hat\theta = \arg\max_{\theta \in \Theta}\ \mathcal{L}_\theta(V_{1:t}, U_{0:t}), \qquad \mathcal{L}_\theta(V_{1:t}, U_{0:t}) = \log p_\theta(V_{1:t}, U_{0:t}). \qquad (2.47)$$

The EM algorithm for maximizing $\mathcal{L}_\theta(V_{1:t}, U_{0:t})$ is a two-step procedure.

1. (E-Step) Compute the expected value of $\mathcal{L}_\theta(V_{1:t}, U_{0:t})$ over the hidden (missing) data $U_{0:t}$, based on the current value of the parameters $\theta^{(i)}$ and the observations $V_{1:t}$:
$$Q(\theta \mid \theta^{(i)}) = E\big(\mathcal{L}_\theta(V_{1:t}, U_{0:t}) \mid V_{1:t}, \theta^{(i)}\big) = \int \log p_\theta(U_{0:t}, V_{1:t})\, p_{\theta^{(i)}}(U_{0:t} \mid V_{1:t})\, dU_{0:t}.$$

2. (M-Step) Update the parameter estimate by maximizing $Q(\theta \mid \theta^{(i)})$ with respect to $\theta$:
$$\theta^{(i+1)} = \arg\max_{\theta}\ Q(\theta \mid \theta^{(i)}). \qquad (2.48)$$

We repeat this two-step process until a fixed point of $Q$ is obtained.

Unfortunately, there are very few situations where an exact and tractable solution exists for these two steps. One exception is the linear Gaussian state-space model; the algorithm is described in [71]. Work applying the EM algorithm to nonlinear dynamical systems in the form of (2.20) and (2.21) is reported in [23, 34, 41].

To use the EM algorithm for a general nonlinear state-space model, sequential Monte Carlo (particle filter) methods are employed to approximate $Q(\theta \mid \theta^{(i)})$ numerically. For the maximization step there is no standard method to solve this problem, and so it is necessary to proceed on a case-by-case basis. In one general approach we calculate gradients with respect to $\theta$, and use a gradient-based search procedure to find the maximum. Recent works that use this technique are [36, 65, 81]. None of these methods is simple.

2.6  Parameter estimation via Bayesian methods

Because of these difficulties one would like an alternative to the MLE approximation. One Bayesian approach, described in [43, 53, 57], is to concatenate the state vector $u_t$ with the unknown parameter $\theta$, and to introduce an artificial dynamic on the parameter.
That is, we replace $\theta$ by $\theta_t$ and define a new state vector
$$y_t = \begin{pmatrix} u_t \\ \theta_t \end{pmatrix}, \qquad (2.49)$$
where $\theta_t \sim p(\theta_{t-1} \mid V_{1:t})$. Then one applies the particle filter to this augmented state-space. However, this method has a number of difficulties [50]. First, at a theoretical level it is not altogether satisfactory to replace a fixed parameter $\theta$ by a random evolution process $\theta_t$. Secondly, there are various practical problems associated with the choice of the artificial dynamic of $\theta$. Thirdly, there are problems with the performance of the algorithm. These difficulties have been discussed in the literature, for example in [18, 50].

Under the Bayesian framework, new and more sophisticated methods are being proposed. See [50] for an overview of particle filter methods for parameter estimation in nonlinear non-Gaussian state-space models.

Chapter 3

MROU model

In this chapter we present a Gaussian two-factor model, known as the Mean-Reverting to Ornstein-Uhlenbeck (MROU) model, for the spot price and the convenience yield. It captures some of the characteristics of the power market described above, and the dynamics of the futures prices.

3.1  Double mean-reversion model

There are many parallels between interest-rate models and the modeling of commodity prices, so many models originally developed for stock and interest rate markets have been applied to the energy market. For the valuation of electricity futures contracts we implement a two-factor mean-reverting model originally proposed in [5], and considered previously in [44] for oil commodity prices. Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{Q})$ be a complete filtered probability space, where $\mathbb{Q}$ is the risk-neutral measure. If $S_t$ is the spot price then
$$S_t = \exp\{X_t + h(t)\}, \qquad (3.1)$$
$$dX_t = -\lambda_X (X_t - L_t)\, dt + \sigma_X\, dW_t^1, \qquad (3.2)$$
$$dL_t = -\lambda_L (L_t - \bar L)\, dt + \sigma_L\, dW_t^2. \qquad (3.3)$$
Here $X_t$ is the observed deseasonalized log spot price, and $L_t$ is a non-observed long-term mean process.
We assume that both processes are given under the risk-neutral measure and that the two Brownian motions, $W^1$ and $W^2$, satisfy $d\langle W^1, W^2\rangle_t = \rho\, dt$. $h(t)$ is an arbitrary deterministic function that accounts for seasonality. The difference between this model and the Gibson and Schwartz model is that here both the log spot price and the convenience yield follow an Ornstein-Uhlenbeck type process.

The seasonal component $h(t)$ combines trend and seasonality. Usually it consists of a sum of sinusoidal functions, which incorporate predictable daily and annual periodicity, and dummy variables, which incorporate predictable workday/weekend and holiday effects. In this thesis, we consider a sum of two cosine functions with distinct periods and a linear trend, that is
$$h(t) = \eta + \beta_0 t + \sum_{i=1}^{2} \beta_i \cos\Big(\frac{\tau_i + 2\pi t}{m_i}\Big), \qquad (3.4)$$
where the $m_i$ represent the seasonality periods and the parameter vector $(\eta, \beta_0, \beta_1, \beta_2, \tau_1, \tau_2)$ needs to be estimated. Seasonality has been discussed extensively in the literature, see for example Lucia and Schwartz [59], Cartea and Figueroa [16], and Benth et al. [7]. Although there are several ways of deseasonalising the data, usually the seasonal component is estimated by means of non-linear regression methods. That is, if $t = t_1, t_2, \ldots, t_n$ we estimate the seasonality function by fitting $h(t)$ to the log-prices using least squares estimation:
$$(\hat\eta, \hat\beta_0, \hat\beta_1, \hat\beta_2, \hat\tau_1, \hat\tau_2) = \arg\min_{(\eta, \beta_0, \beta_1, \beta_2, \tau_1, \tau_2)} \sum_{j=1}^{n} \big(h(t_j) - \log S_{t_j}\big)^2.$$
Write $\hat h(t)$ for the estimate. The deseasonalized log-spot price is given by
$$X_t = \log S_t - \hat h(t).$$

3.2  Radon-Nikodym theorem for Ornstein-Uhlenbeck processes

Before we continue with parameter estimation for the MROU model, we consider a simple case, namely the problem of parameter estimation for the Ornstein-Uhlenbeck process $OU(\lambda, a, \sigma)$ defined by
$$dX_t = -\lambda (X_t - a)\, dt + \sigma\, dW_t. \qquad (3.5)$$
If one observes the whole process $x = \{X_t,\ 0 \le t \le T\}$, the parameter $\sigma$ can be estimated exactly, using the quadratic variation of $X$, but $\lambda$ and $a$ cannot.
The simplest statistical problem is when one has two alternatives:
$$H_0:\ \lambda = \lambda_0,\ a = a_0 \qquad \text{vs.} \qquad H_1:\ \lambda = \lambda_1,\ a = a_1.$$
Denoting by $\mathbb{P}_0$, $\mathbb{P}_1$ the probability measures for (3.5) associated with $H_0$ and $H_1$, the Radon-Nikodym theorem gives the optimal test for this in terms of the likelihood ratio. If
$$Z_T = \frac{d\mathbb{P}_1}{d\mathbb{P}_0} = \frac{L(x \mid H_1)}{L(x \mid H_0)},$$
then the test takes the form

Reject $H_0$ if $Z_T \ge c(\alpha)$,

where $c(\alpha)$ is given by
$$\mathbb{P}_0(\text{reject } H_0) = \mathbb{P}_0\big(Z_T \ge c(\alpha)\big) = \alpha.$$
The power of the test is given by
$$p = \mathbb{P}_1(\text{accept } H_1) = \mathbb{P}_1\big(Z_T \ge c(\alpha)\big).$$
Using the result in [44] (Thm. 3.2) we obtain
$$\log(Z_T) = \int_0^T (c_0 + c_1 X_s)\, dW_s - \frac12 \int_0^T (c_0 + c_1 X_s)^2\, ds,$$
where $W$ is a Brownian motion under $\mathbb{P}_0$. The constants $c_0$ and $c_1$ satisfy
$$c_0 = \frac{\lambda_1 a_1 - \lambda_0 a_0}{\sigma}, \qquad c_1 = \frac{\lambda_0 - \lambda_1}{\sigma}. \qquad (3.6)$$
We used Monte Carlo simulation to find the power of the test for these sample tests; see Table 3.1.

| | H-J 1 | N-S 1 | N-S 2 | H-J 2 |
|---|---|---|---|---|
| $\lambda_0$ | 0.73 | 5.78 | 5.78 | 0.73 |
| $a_0$ | 4.21 | 4.83 | 4.83 | 4.21 |
| $\lambda_1$ | 0.15 | 5.78 | 5.78 | 0.66 |
| $a_1$ | 3.27 | 5.43 | 5.43 | 4.60 |
| $\sigma$ | 0.63 | 1.03 | 1.03 | 0.63 |
| $T$ | 3 | 1 | 0.5 | 25 |
| $\alpha$ | 0.10 | 0.05 | 0.10 | 0.10 |
| $p$ | 100% | 95% | 86% | 90% |

Table 3.1: The data are taken from Hikspoors & Jaimungal (columns 1 and 4), and Nomikos & Soldatos (columns 2 and 3). Given $\lambda_0$, $\lambda_1$, $a_0$, $a_1$, $\sigma$ and $T$ we calculated $c_0$ and $c_1$ from (3.6). We performed $n = 2000$ simulations of a standard $OU(\lambda, a, \sigma)$ under $\mathbb{P}_0$ and $\mathbb{P}_1$ to estimate the power of the test.

Using these simulations, we can estimate how much data is needed to obtain reliable parameter estimation for the OU process. We began by looking at some parameter values for OU processes found in the literature. In [44] Hikspoors & Jaimungal study the MROU model (1.9). Having used futures data to estimate the parameters for $X_t$ and $Y_t$ in the risk-neutral measure, and also to estimate the process $Y_t$ itself, they then estimate parameters for $Y_t$ in the real-world measure. The column H-J 1 of Table 3.1 gives the values of those parameters.
Considering these values, we find that the distributions under $H_0$ and $H_1$ are significantly different when taking 3 years of observations, i.e. we distinguish $(\lambda_0, a_0)$ from $(\lambda_1, a_1)$ almost perfectly.

Nomikos & Soldatos [64] consider a regime switching model for the Nord Pool market, see (1.12)-(1.14). The parameters $a_i$, $i = 0, 1$, differ according to whether the weather is 'wet' or 'dry'. We can ask how long a period of observations is necessary to distinguish reliably between $a = a_0$ and $a = a_1$ in this situation. Simulation results show that if $T = 1$ then this is possible at the 95% confidence level, but if $T = 0.5$ then the power of the test is only 86% at $\alpha = 0.1$.

If we imagine the data to be split into fixed periods in which the regime is constant, and the task is to distinguish which regime holds during the period, then one can do so reliably if the period is 1 year, but shrinking the period to 6 months will give rise to an error probability of about 10%.

In the fourth column we consider two sets of parameters which differ by about 10%, the first set being similar to the values of $(\lambda_0, a_0)$ in H-J 1. In this case we see that even 25 years of data is not enough to reliably distinguish between the two parameter sets. Figure 3.1 plots the distribution of $\log Z_T$ under the two hypotheses: a substantial overlap in the distributions is apparent.

Figure 3.1: Sampling distributions (upper panel: null hypothesis distribution; lower panel: alternative hypothesis distribution) for column 4, where we add a 10% deviation from Hikspoors & Jaimungal's parameters (column 1).

It would be interesting to perform an analysis using the Radon-Nikodym theorem for hypothesis testing with the MROU model.
Calculating the likelihood $Z_T$ with respect to the $\sigma$-field $\sigma(X_s, L_s,\ 0 \le s \le T)$ is straightforward, see [44] (Thm. 3.2). However, we need to calculate $E(Z_T \mid X_s,\ 0 \le s \le T)$, and this would require calculating conditional expectations of the form
$$E\Big(\exp\Big\{\int_0^T (c_0 + c_1 L_s)\, dW_s - \frac12 \int_0^T (c_0 + c_1 L_s)^2\, ds\Big\}\ \Big|\ X_s,\ 0 \le s \le T\Big).$$
This does not seem to be an easy problem. However, from the results for the fourth column, it seems clear that many decades of data would be needed for an accurate estimate of the parameters for the hidden process $L$.

Financial time series have one significant property which makes them unique in a statistical sense. As well as the data itself (e.g. the spot prices), we also have available prices of various derivatives of the product. These provide a substantial amount of extra information. In the context of the MROU model, we will therefore use futures as well as spot prices in our parameter estimation.

3.3  Future price

This model is a special case of the Affine Jump-Diffusion (AJD) model, so to obtain a closed-form formula for the price of futures contracts we use the results of Duffie et al. (2000) [27], see Appendix A. The model can be rewritten as
$$d\begin{pmatrix} X_t \\ L_t \end{pmatrix} = \left[\begin{pmatrix} 0 \\ \lambda_L \bar L \end{pmatrix} + \begin{pmatrix} -\lambda_X & \lambda_X \\ 0 & -\lambda_L \end{pmatrix} \begin{pmatrix} X_t \\ L_t \end{pmatrix}\right] dt + \begin{pmatrix} \sigma_X \sqrt{1 - \rho^2} & \rho\,\sigma_X \\ 0 & \sigma_L \end{pmatrix} \begin{pmatrix} dW_t^1 \\ dW_t^2 \end{pmatrix}. \qquad (3.7)$$
Thus, with $U_t = [X_t, L_t]'$, we have
$$dU_t = (K_0 + K_1 U_t)\, dt + d\tilde W_t,$$
where $\tilde W$ is a Brownian motion with covariance $H_0$ and
$$K_0 = \begin{pmatrix} 0 \\ \lambda_L \bar L \end{pmatrix}, \qquad K_1 = \begin{pmatrix} -\lambda_X & \lambda_X \\ 0 & -\lambda_L \end{pmatrix}, \qquad H_0 = \begin{pmatrix} \sigma_X^2 & \rho\,\sigma_X \sigma_L \\ \rho\,\sigma_X \sigma_L & \sigma_L^2 \end{pmatrix}. \qquad (3.8)$$
The functions $H_1$ and $l_0$ given in Appendix A satisfy $H_1 = 0$ and $l_0 = 0$. Duffie et al. gave expressions for various functionals of $U$, and in particular for
$$\psi(u, t, T, U_t) := E[\exp\{u \cdot U_T\} \mid U_t]. \qquad (3.9)$$
By setting $u = (1, 0)'$ in (3.9) one can obtain the future price
$$F(t, T) := \psi((1, 0)', t, T, U_t)\, \exp\{h(T)\} \qquad (3.10)$$
$$\phantom{F(t, T) :}= E[\exp\{X_T\} \mid U_t]\, \exp\{h(T)\}. \qquad (3.11)$$
By equation (A-3) in Appendix A, the future price is of the form
$$E[\exp\{X_T + h(T)\} \mid U_t] = \exp\{h(T)\}\, \exp\{M((1,0)', t, T) + N_1(1, t, T)\, X_t + N_2(0, t, T)\, L_t\}, \qquad (3.12)$$
where $M((1,0)', t, T)$, $N_1(1, t, T)$ and $N_2(0, t, T)$ satisfy the following equations:
$$\dot N_1 = \lambda_X N_1, \qquad (3.13)$$
$$\dot N_2 = -\lambda_X N_1 + \lambda_L N_2, \qquad (3.14)$$
$$\dot M = -\lambda_L \bar L N_2 - \tfrac12\big(\sigma_X^2 N_1^2 + \sigma_L^2 N_2^2\big) - \rho\,\sigma_X \sigma_L N_1 N_2, \qquad (3.15)$$
with the boundary conditions
$$M((1,0)', T, T) = 0, \qquad N_1(1, T, T) = 1, \qquad N_2(0, T, T) = 0.$$
Solving this problem (see Appendix B for the calculations), we have
$$N_1(t, T) = e^{\lambda_X (t - T)}, \qquad (3.16)$$
$$N_2(t, T) = e^{\lambda_L (t - T)}\, m - e^{\lambda_X (t - T)}\, m, \qquad (3.17)$$
$$M(t, T) = \big(e^{\lambda_X (t-T)} - 1\big) m_1 + \big(e^{\lambda_L (t-T)} - 1\big) m_2 + \big(e^{2\lambda_X (t-T)} - 1\big) m_3 + \big(e^{(\lambda_X + \lambda_L)(t-T)} - 1\big) m_4 + \big(e^{2\lambda_L (t-T)} - 1\big) m_5, \qquad (3.18)$$
where
$$m = \frac{\lambda_X}{\lambda_X - \lambda_L}, \qquad m_1 = \frac{\lambda_L \bar L\, m}{\lambda_X}, \qquad m_2 = -\bar L\, m,$$
$$m_3 = -\Big(\frac{\sigma_X^2}{4\lambda_X} + \frac{\sigma_L^2 m^2}{4\lambda_X}\Big) + \frac{m\,\rho\,\sigma_X \sigma_L}{2\lambda_X}, \qquad m_4 = \frac{\sigma_L^2 m^2 - m\,\rho\,\sigma_X \sigma_L}{\lambda_X + \lambda_L}, \qquad m_5 = -\frac{\sigma_L^2 m^2}{4\lambda_L}.$$
Thus, the price of the futures contract is given by
$$F(t, T) = \exp\{M(t, T) + N_1(t, T)\, X_t + N_2(t, T)\, L_t\}\, \exp\{h(T)\}, \qquad (3.19)$$
where the functions $N_1(t, T)$, $N_2(t, T)$ and $M(t, T)$ are as in (3.16)-(3.18). The deseasonalized log-future price is given by $\log \tilde F(t, T)$, where
$$\tilde F(t, T) = F(t, T)\, \exp\{-h(T)\}. \qquad (3.20)$$

3.4  Formulation in Kalman filter terms

We assume that data are available in the form of the spot price $S_t$ and various futures or forward prices $F(t, T_i)$, $i = 1, 2, \ldots, m$. Most data sets are available in the form of daily prices. Spot prices are traded for all hours of the week, but financial markets, which trade future contracts, are only open Monday to Friday. Since in any case, due to low consumption at the weekend (see Figure 1.4b), spot prices on Saturdays and Sundays exhibit a different behavior to the rest of the week, we will disregard weekends and also holidays, and just consider weekday data. Given this data, we wish to estimate the parameters $\lambda_X$, $\lambda_L$, $\sigma_X$, $\sigma_L$, $\bar L$ and $\rho$.
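The Kalman filter method has been applied previously to electricity models [14, 51]. To use the Kalman filter we need a discrete-time set of equations, so we replace (3.1)-(3.3) with the forward Euler approximation:
$$S_t = e^{X_t + h(t)}, \qquad (3.21)$$
$$X_t - X_{t-1} = -\lambda_X (X_{t-1} - L_{t-1})\, \Delta t + \sqrt{1 - \rho^2}\, \sigma_X\, \Delta W_t^1 + \rho\, \sigma_X\, \Delta W_t^2, \qquad (3.22)$$
$$L_t - L_{t-1} = -\lambda_L (L_{t-1} - \bar L)\, \Delta t + \sigma_L\, \Delta W_t^2. \qquad (3.23)$$
Here we have made a slight abuse of notation, in writing $X_t$, $X_{t-1}$ for the successive values of $X$. More precisely, in (3.22) we should write
$$X_{t_i} - X_{t_{i-1}} = -\lambda_X (X_{t_{i-1}} - L_{t_{i-1}})\, \Delta t + \sqrt{1 - \rho^2}\, \sigma_X\, \Delta W_{t_i}^1 + \rho\, \sigma_X\, \Delta W_{t_i}^2,$$
but to avoid two levels of subscripts we have used the form (3.22)-(3.23). Here $\Delta t = 1/250$ (250 being the number of trading days in a year), and $\Delta W^1$, $\Delta W^2$ are independent Gaussian random variables with mean 0 and variance $\Delta t$.

To apply the Kalman filter, the model must be expressed in its state-space form. Taking the state variable as $u_t = (X_t, L_t)'$ and a discretization of the time $t = t_1, t_2, \ldots, t_n$ with $\Delta t = t_i - t_{i-1}$, the transition equation becomes
$$u_t = C_t + D_t u_{t-1} + q_{t-1}, \qquad (3.24)$$
where
$$C_t = \begin{pmatrix} 0 \\ \lambda_L \bar L\, \Delta t \end{pmatrix}, \qquad D_t = \begin{pmatrix} 1 - \lambda_X \Delta t & \lambda_X \Delta t \\ 0 & 1 - \lambda_L \Delta t \end{pmatrix},$$
and the process noise covariance matrix is
$$\mathrm{cov}(q_{t-1}) = \Sigma_q = \begin{pmatrix} \sigma_X^2 & \rho\,\sigma_X \sigma_L \\ \rho\,\sigma_X \sigma_L & \sigma_L^2 \end{pmatrix} \Delta t. \qquad (3.25)$$
The measurement equation is given by the functions $M$, $N_1$ and $N_2$ calculated according to equations (3.16)-(3.18):
$$v_t = \begin{pmatrix} X_t^* \\ Z_{t,T}^* \end{pmatrix} = A_t + B_t \begin{pmatrix} X_t \\ L_t \end{pmatrix} + r_t. \qquad (3.26)$$
We write $X_t^*$ for the observed deseasonalized log-spot price, and $Z_{t,T}^*$ for the observed deseasonalized log price at time $t$ of a future contract with maturity $T > t$. We will assume that there is some noise in the measurement of $X_t$ and $Z_{t,T}$, so that
$$X_t^* = X_t + \epsilon_t^S, \qquad (3.27)$$
$$Z_{t,T}^* = \log \tilde F(t, T) + \epsilon_t^Z. \qquad (3.28)$$
Here $\epsilon_t^S$ are iid $N(0, \sigma_S^2)$ random variables and $\epsilon_t^Z$ are iid $N(0, \sigma_Z^2)$ random variables. We have two reasons for making this assumption about non-zero noise. First, at a fundamental level, it is reasonable to allow for some pricing errors due to large bid-ask spreads. (This may be particularly relevant in the futures markets, which are not always heavily traded.)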
Secondly, the Kalman filter involves matrix inversion, see (2.17), where $F_{k|k-1}^{-1}$ has to be computed. If the model has degeneracy, then severe numerical problems can arise. Adding the noise in (3.27) and (3.28) avoids this difficulty.

Using (3.19) we can write (3.28) as
$$Z_{t,T}^* = M(t, T) + X_t N_1(t, T) + L_t N_2(t, T) + \epsilon_t^Z. \qquad (3.29)$$
The measurement equation is therefore given by
$$v_t = \begin{pmatrix} X_t^* \\ Z_{t,T_1}^* \\ \vdots \\ Z_{t,T_m}^* \end{pmatrix} = A_t + B_t \begin{pmatrix} X_t \\ L_t \end{pmatrix} + r_t, \qquad (3.30)$$
where
$$A_t = \begin{pmatrix} 0 \\ M(t, T_1) \\ \vdots \\ M(t, T_m) \end{pmatrix}, \qquad B_t = \begin{pmatrix} 1 & 0 \\ N_1(t, T_1) & N_2(t, T_1) \\ \vdots & \vdots \\ N_1(t, T_m) & N_2(t, T_m) \end{pmatrix}, \qquad (3.31)$$
and the measurement noise $r_t$ has covariance matrix
$$\mathrm{cov}(r_t) = \Sigma_r = \begin{pmatrix} \sigma_S^2 & 0 & \cdots & 0 \\ 0 & \sigma_{Z_1}^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{Z_m}^2 \end{pmatrix}. \qquad (3.32)$$

Simulated trajectories of the MROU model using equations (3.24) and (3.26) with parameters $\lambda_X = 130$, $\lambda_L = 3$, $\sigma_X = 5$, $\sigma_L = 0.5$, $\bar L = 3.5$, $\rho = 0.3$ and $\Delta t = 1/250$ can be seen in Figure 3.2. The long-term mean process $L$ reverts towards the mean $\bar L$ and, as expected, the spot price $S_t$ and the future price $F(t, T_1)$ mimic the long-term mean process but with different volatility.

The observation and state equation matrices $C_t$, $D_t$, $\Sigma_q$, $A_t$, $B_t$ and $\Sigma_r$ depend on the unknown parameters of the model. Based on this state-space formulation the parameters that we need to estimate are
$$\theta = \{\lambda_X, \lambda_L, \sigma_X, \sigma_L, \bar L, \rho, \sigma_S, \sigma_Z\}.$$
Note that if we use two different maturity contracts then $m = 2$, and we will have $\sigma_{Z_1}$ and $\sigma_{Z_2}$ in the parameter vector $\theta$.

The log-likelihood function $\mathcal{L}_\theta(V_{1:t})$ for the linear Gaussian state-space model is given by equation (2.43). This function can be maximized with respect to $\theta$ using an appropriate numerical optimization procedure. $\mathcal{L}_\theta(V_{1:t})$ only depends on the prediction error $e_t$ and its covariance matrix $F_{t|t-1}$. Both in turn are outputs of the Kalman filter, equations (2.14) and (2.16). Thus the maximum likelihood estimate of $\theta$ can be obtained as follows:

Figure 3.2: The upper graph shows the simulated spot price $S_t$ and the future price $F(t, T_1)$ with maturity of one month. The lower graph is the long-term mean process $L$.

Algorithm 4 (Kalman Filter optimization)

• Step 1
Choose an initial value for $\theta$, say $\theta_0$.

• Step 2
Run the Kalman Filter (Algorithm 1) and use the sequences $e_t$ and $F_{t|t-1}$ to compute the log-likelihood $\mathcal{L}_\theta(V_{1:t})$ by (2.43).

• Step 3
Employ an optimization procedure that repeats Steps 1-2 with updated values of $\theta$ until a maximizer $\hat\theta$ of (2.43) has been found.

Some practical problems arise with the optimization procedure, and the performance and accuracy of the Kalman Filter are affected, since the problem may be poorly scaled. An optimization problem is poorly scaled if changes in the decision variables produce large changes in the objective function for some components and not for others [25]. We solved this problem by rescaling the variables.

There are several numerical search algorithms available to maximize the log-likelihood (Step 3). Barlow et al. [4] used the Nelder-Mead method (see footnote 6) to minimize the negative log-likelihood $-\mathcal{L}_\theta(V_{1:t})$. Here we decided to apply a quasi-Newton algorithm, the so-called BFGS method (see footnote 7), which is similar to the method used by [51], to get the initial point, and then used the Nelder-Mead method to find the optimum.

3.5  Empirical results

In this section we report some empirical results based on simulated and real data to examine the Kalman filter method applied to the model described in the previous section.

3.5.1  Simulated data

We first ran the algorithm on deseasonalized simulated data. To analyze the algorithm performance, we first simulated series with 1000 and 300 observations respectively using equations (3.24) and (3.26), considering only the nearest monthly futures contract in the log-future price, i.e. $Z_{t,T_1}$ with $T_1 = 30$ days. In this case we have an 8-dimensional parameter space.
We started the maximization procedure with different initial values each time. Examples of some runs are given in Table 3.2. Since we are searching for the maximum likelihood, we did 25 runs and took the one which gave the largest value of $\mathcal{L}(\hat\theta)$. Repeating the same procedure with 50 different series we obtained the results given in Table 3.3.

Footnote 6: The Nelder-Mead method is used to minimize a function of multiple variables without derivatives [see Nelder and Mead (1965)].
Footnote 7: Details and derivation of the Broyden-Fletcher-Goldfarb-Shanno method in the context of filtering can be found in [28].

| $n = 300$ | True value | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|---|
| $\lambda_X$ | 130 | 129.032 | 127.923 | 126.335 | 138.586 | 130.232 |
| $\lambda_L$ | 3 | 3.478 | 3.301 | 2.348 | 2.788 | 2.985 |
| $\sigma_X$ | 5 | 4.917 | 4.658 | 4.845 | 4.795 | 5.123 |
| $\sigma_L$ | 0.5 | 0.522 | 0.453 | 0.489 | 0.522 | 0.509 |
| $\bar L$ | 3.5 | 3.495 | 3.509 | 3.503 | 3.499 | 3.497 |
| $\rho$ | 0.3 | 0.318 | 0.247 | 0.313 | 0.382 | 0.311 |
| $\sigma_S$ | 0 | 0.004 | 0.005 | 0 | 0.000 | 0 |
| $\sigma_{Z_1}$ | 0 | 0.009 | 0 | 0.002 | 0.000 | 0 |
| Log-likelihood | | -620.952 | -629.702 | -643.223 | -631.359 | -601.72 |
| CPU time | | 163.141 | 152.183 | 192.954 | 155.216 | 123.2167 |

Table 3.2: Five different maximization runs, on the same set of simulated data.

| | True value | Estimator ($n = 1000$) | Std. ($n = 1000$) | Estimator ($n = 300$) | Std. ($n = 300$) |
|---|---|---|---|---|---|
| $\lambda_X$ | 130 | 129.369 | 1.055 | 129.042 | 4.221 |
| $\lambda_L$ | 3 | 2.954 | 0.237 | 2.968 | 0.643 |
| $\sigma_X$ | 5 | 4.998 | 0.098 | 4.862 | 0.127 |
| $\sigma_L$ | 0.5 | 0.493 | 0.004 | 0.489 | 0.023 |
| $\bar L$ | 3.5 | 3.507 | 0.001 | 3.502 | 0.009 |
| $\rho$ | 0.3 | 0.308 | 0.018 | 0.301 | 0.077 |
| $\sigma_S$ | 0 | 0.0007 | 0.0003 | 0.002 | 0.001 |
| $\sigma_{Z_1}$ | 0 | 0.003 | 0 | 0.001 | 0 |

Table 3.3: Estimation using one futures contract (average of 50 simulations).

We can see that the estimation results recover the true values reasonably well in both cases. As expected, the standard deviation increases for all estimators when we reduce the number of days. Note that the estimator is close to zero for the variables $\sigma_S$ and $\sigma_{Z_1}$. This is not surprising, since we are estimating the true model that generated the data and the noise in the model only comes from two sources.
Next, we simulated 25 new series with 300 data points, but this time we included log-futures prices with two different maturities, $T_1 = 30$ and $T_2 = 60$. In this case we have an extra parameter to estimate, $\sigma_{Z_2}$. In Table 3.4 we summarize the results, repeating the same procedure as before.

| | True value | Estimator ($n = 300$) | Std. ($n = 300$) |
|---|---|---|---|
| $\lambda_X$ | 130 | 129.801 | 0.436 |
| $\lambda_L$ | 3 | 3.000 | 0.000 |
| $\sigma_X$ | 5 | 4.845 | 0.178 |
| $\sigma_L$ | 0.5 | 0.499 | 0.001 |
| $\bar L$ | 3.5 | 3.505 | 0.017 |
| $\rho$ | 0.3 | 0.294 | 0.102 |
| $\sigma_S$ | | 0.010 | 0.009 |
| $\sigma_{Z_1}$ | | 0.001 | 0 |
| $\sigma_{Z_2}$ | | 0 | 0 |

Table 3.4: Estimation using two futures contracts ($n = 300$).

Thus we obtained good approximations for all the parameters using only a short data series. This could be useful given the scarcity of data in the electricity markets. One example is [16], where the authors comment that there is too little data for parameter estimation in the UK market. We remark that if we tried to estimate the model parameters using the spot price alone, it would require many decades of data to make accurate estimates.

Note that we used an alternative Kalman filter formulation (U-D filtering), since numerical problems arise when we include more parameters to be estimated. During one of the recursions of the filter the covariance matrix failed to be positive semi-definite, and consequently the estimated parameters differed from the true values. This problem arose because the covariance matrices, among them $\Sigma_q$, were ill-conditioned. Since we are minimizing a fixed function, it is legitimate to discard runs which fail in this fashion. For further details about the U-D filter refer to [15, 38, 75].

3.5.2  The German electricity market

We now wish to apply these techniques to real data. While there are many markets in which electricity is traded, the data we require (that is, both spot and futures prices) are in many cases not available. For example, the Alberta Power Pool makes spot prices available, but forward prices are known only to market participants.
One market for which both sets of data are available is the German EEX market. The European Energy Exchange (EEX) is Germany's energy exchange. It is one of the biggest power markets in Europe. EEX emerged in 2002 from the merger of the LPX Leipzig Power Exchange and the EEX European Energy Exchange Frankfurt. Both exchanges initially started spot trading for physical contracts in 2000. In 2001 EEX Frankfurt also introduced trading of standardized financial contracts.

Commonly traded products in the power markets are baseload, peakload and hourly contracts. In the German market the times for peakload are defined as weekdays between 8:00am and 8:00pm. In the futures market, contracts on both baseload and peakload are traded. The usual delivery periods are one month, one quarter and one year. In the spot market of the EEX, baseload, peakload and hourly contracts up to the next weekday are traded.

The estimates are based on historical daily average spot prices and monthly baseload futures prices covering the period from July 2002, when the EEX and LPX markets merged, until the end of June 2007, almost five years of historical data. This data set contains prices for 1267 days.

Figure 3.3 depicts the price trajectories of the spot and the nearby monthly futures prices between July 1, 2002 and June 29, 2007 for the EEX market. From the graph we note that there is strong mean reversion, and that the spot prices show extreme spikes as well as high volatility which changes rapidly over short time periods. Moreover, there is a linear drift over the years in the spot and futures prices.

Figure 3.3 shows much greater volatility, and more price spikes, for the period Jan 2005-June 2007 than in the earlier period, Jul 2002-Dec 2004. We therefore split the data into two parts (7/1/02-12/31/04, 1/1/05-6/29/07), and repeated the runs on each of these.
Following [8], we removed the seasonality by representing it as a linear combination of cosines including a trend, a weekly, and a yearly cycle, of the form
$$h(t) = \eta + \beta_0 t + \beta_1 \cos\Big(\frac{\tau_1 + 2\pi t}{m_1}\Big) + \beta_2 \cos\Big(\frac{\tau_2 + 2\pi t}{m_2}\Big), \qquad (3.33)$$
where the periods $m_1$ and $m_2$ correspond to the weekly and yearly cycles, the yearly period being 250.

Figure 3.3: Electricity spot and nearby monthly futures price in the German market.

Assuming 250 trading days in a year, we estimate the seasonality by fitting the function $h(t)$ to the log-price series by ordinary least squares. The results can be seen in Table 3.5 and Figures 3.4 and 3.5. We see weaker seasonality in the EEX market than in the more hydro-dependent Nordic electricity market. The seasonal component is highest in winter.

| Parameter | $\hat\eta$ | $\hat\beta_0$ | $\hat\beta_1$ | $\hat\tau_1$ | $\hat\beta_2$ | $\hat\tau_2$ |
|---|---|---|---|---|---|---|
| Est. value | 3.2965 | 0.0005 | -0.0695 | 23.5235 | 0.0224 | 0.8218 |

Table 3.5: Estimated values for $h(t)$ by least-squares fitting.

Based on equations (3.24) and (3.26) we estimated the model using the spot price and one futures contract with one month maturity, for both periods. We also estimated the model using two futures contracts with one and two month maturities. The results of the parameter estimation are shown in Tables 3.6 and 3.7.

Figure 3.4: The upper graph shows the log spot price of the EEX market, $\log(S_t) = X_t + h(t)$, and the fitted seasonal component $h(t)$; the lower graph shows the deseasonalized series $X_t = \log S_t - h(t)$.

For all the periods the estimates of $\rho$ are quite small, the largest value being 0.115 for Part 1 when estimated using one future contract. This may be compared with the estimate $-0.96$ obtained in [44] in the context of the oil market.
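The least-squares fit of the seasonal function described earlier in this section can be sketched as follows. This is an illustrative sketch, not the procedure used in the thesis: it assumes known periods of 250 and 5 trading days for the yearly and weekly cycles, and it swaps the nonlinear fit for an equivalent linear regression, using the identity $\beta\cos(\phi + \omega t) = \beta\cos\phi\,\cos\omega t - \beta\sin\phi\,\sin\omega t$. The function name `fit_seasonality` is hypothetical.

```python
import numpy as np

def fit_seasonality(t, log_price, periods=(250.0, 5.0)):
    """Least-squares fit of a linear trend plus one cosine wave per period.

    Each wave beta * cos((tau + 2*pi*t) / m) is expanded into a cos and a
    sin column, so the fit is linear in the coefficients.
    Returns (coefficients, fitted seasonal component).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(log_price, dtype=float)
    cols = [np.ones_like(t), t]                 # constant and linear trend
    for m in periods:                           # assumed seasonal periods
        cols.append(np.cos(2.0 * np.pi * t / m))
        cols.append(np.sin(2.0 * np.pi * t / m))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef
```

The deseasonalized log-spot series is then `log_price - fitted`. The amplitude and phase of each wave can be recovered from its pair of cos and sin coefficients if the parametrization of (3.33) is wanted.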
Unlike the case of simulated data, the parameter estimates using two futures prices differ somewhat from those obtained with just one future. The explanation is presumably that the model does not perfectly describe the real data.

Figure 3.5: The upper graph shows the spot price $S_t$ with $\exp\{\hat h(t)\}$, and the lower graph the deseasonalized spot price $\exp\{X_t\}$.

In the absence of noise, the model (3.1)-(3.3) gives exponential decay toward the long run mean $\bar L$. We can therefore interpret
$$\tau_X = \frac{\log 2}{\lambda_X}, \qquad \tau_L = \frac{\log 2}{\lambda_L}$$
as the 'half lives' of the processes $X$ and $L$ (measured in years). For the $X$ process the estimates in Tables 3.6 and 3.7 give a half life of 1.5 to 4 days, while for the $L$ process the half life estimates vary from about 6 to 12 months. Thus the estimated processes $X$ and $L$ do play a satisfactory role in separating out short and long-term fluctuations in the spot price.

| | Whole | Part 1 | Part 2 |
|---|---|---|---|
| $\lambda_X$ | 65.030 | 42.439 | 77.509 |
| $\lambda_L$ | 0.544 | 1.319 | 1.060 |
| $\sigma_X$ | 1.719 | 1.172 | 2.013 |
| $\sigma_L$ | 0.517 | 0.410 | 0.586 |
| $\bar L$ | 3.108 | 3.105 | 3.156 |
| $\rho$ | 0.101 | 0.115 | 0.065 |
| $\sigma_S$ | 0.177 | 0.193 | 0.167 |
| $\sigma_{Z_1}$ | 0.000 | 0.000 | 0.000 |
| CPU time | 613.88 | 485.99 | 312.33 |

Table 3.6: Estimated values for the EEX market using $S_t$ and $F(t, T_1)$.

| | Whole | Part 1 | Part 2 |
|---|---|---|---|
| $\lambda_X$ | 98.507 | 116.807 | 98.227 |
| $\lambda_L$ | 0.441 | 1.231 | 0.579 |
| $\sigma_X$ | 3.161 | 3.191 | 2.965 |
| $\sigma_L$ | 0.537 | 0.337 | 0.507 |
| $\bar L$ | 3.169 | 3.168 | 3.222 |
| $\rho$ | 0.021 | 0.019 | 0.114 |
| $\sigma_S$ | 0.000 | 0.001 | 0.047 |
| $\sigma_{Z_1}$ | 0.089 | 0.074 | 0.065 |
| $\sigma_{Z_2}$ | 0.000 | 0.006 | 0.024 |
| CPU time | 1440.58 | 1208.56 | 703.68 |

Table 3.7: Estimated values for the EEX market using $S_t$, $F(t, T_1)$ and $F(t, T_2)$.

The long run standard deviation of the Ornstein-Uhlenbeck process $L$ is $(\sigma_L^2 / 2\lambda_L)^{1/2}$. Using the numerical values for the whole period from Table 3.7, this gives a standard deviation of 0.57.
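The half-life and long-run standard deviation formulas above are simple enough to check numerically. The helper names below are hypothetical, and the conversion from years to days assumes 250 trading days per year, as elsewhere in this chapter.

```python
import math

def half_life_days(lam, trading_days=250):
    """Half life log(2)/lambda, converted from years to trading days."""
    return math.log(2.0) / lam * trading_days

def long_run_std(sigma, lam):
    """Stationary standard deviation (sigma^2 / (2*lambda))^(1/2) of an OU process."""
    return sigma / math.sqrt(2.0 * lam)

# Whole-period estimates for the L process from Table 3.7
print(round(long_run_std(0.537, 0.441), 2))
# Largest and smallest lambda_X estimates from Tables 3.6 and 3.7
print(half_life_days(116.807), half_life_days(42.439))
```

These reproduce the long-run standard deviation of 0.57 for $L$ and half lives of roughly 1.5 and 4 trading days for $X$.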
The corresponding quantity for the X process is 0.23, so both components contribute significantly to the long run variance of the log-spot price.

In order to investigate whether the estimated parameters make sense, we simulated a path of the spot price, futures price and long-term mean process from the model, using the estimated values from Table 3.6, Part 1; see Figure 3.6. We now compare those simulations with Figure 3.5, which gives the real deseasonalized data.

Figure 3.6: Simulation of spot and future prices (upper graph) and long-term mean process (lower graph) using estimated values for Part 1. [Upper panel: spot $S_t$ and future $F(t, T_1)$; lower panel: long-term process $L_t$; horizontal axis: days.]

The generated trajectories differ somewhat from those observed in the EEX market. The most notable difference is the absence of price spikes: the simulated data all lie in the range €10 to €40, while the real deseasonalized data has about 6 spikes with prices above €70. The absence of such spikes is not surprising, since the model contains no mechanism for generating them.

The second feature is that, as with the real data, the spot and futures prices do tend to follow each other. This is not surprising, since both prices relate to the same commodity. Since the long run process L is available for the simulated data, we have also plotted it; see the lower graph in Figure 3.6. Comparing this with the future price in the upper graph, we note that for these parameters the dominant effect on the future price comes from the oscillations of the long-term mean.

The graphs in Figure 3.6 and Table 3.7 together show that while this model does capture some features of the real data, it has significant defects: it does a poor job of capturing the extreme events (spikes or jumps) which appear in the real market.
Another test for the appropriateness of the model is to compare empirical moments for the real data with those from simulated data. If we compare the first four empirical central moments of the log-returns (footnote 8) of the simulated and real spot prices, we can see that there is a good fit for the mean value and for the standard deviation; see Table 3.8. The empirical distribution has fatter tails than the normal distribution (kurtosis > 3), indicating a higher occurrence of extreme events, i.e. jumps.

              Real data (1)   Sim. data (1)   Real data (2)   Sim. data (2)
Mean          -0.0019         -0.0002         0.0000          0.0002
Std Dev.      0.2214          0.1770          0.2066          0.20402
Skewness      -1.3125         -0.0037         -0.0800         -0.0160
Kurtosis      29.8250         2.9855          8.1288          2.9313
n data        634             634             631             631

Table 3.8: The table shows the first four moments of the logarithmic deseasonalized price returns of observed data and the average of 50 simulated trajectories.

Footnote 8: Log-returns for a sequence of prices $S_i$ are defined as $r_i = \ln(S_{i+1}/S_i)$.

Chapter 4

MROU with jumps

In this chapter we consider a jump-diffusion model. This model is similar to the ones considered by Hikspoors and Jaimungal [44] and Nomikos and Soldatos [64]. [44] considers a model of the form

$$S_t = \exp\{h(t) + X_t + J_t\},$$  (4.1)

where $X$ is the first component of a pair $(X_t, Y_t)$ satisfying (1.9), and $J$ is an independent jump process, see equation (1.11). In one respect our model represents a simplification of the model in [44], in that there is only one Gaussian factor. However, unlike the model given by (4.1), the Gaussian and jump components in our model are not easily separated into independent processes.

There are several reasons for considering a jump-diffusion model. First, actual spot prices do exhibit spikes (see Figure 3.3), and adding jumps to the process is one way of modeling this. Second, data for spot prices show that the fourth central moment (kurtosis) of the log-returns is much bigger than 3 (see Table 3.8).
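The moment comparison of Table 3.8 relies on the mean, standard deviation, skewness and kurtosis of the log-returns $r_i = \ln(S_{i+1}/S_i)$. A minimal sketch of that computation is given below, assuming NumPy. It uses population (biased) moments, which is an assumption, since the text does not say which estimator was used, and the function name is hypothetical.

```python
import numpy as np

def log_return_moments(prices):
    """Mean, standard deviation, skewness and kurtosis of the log-returns.

    Population (biased) moments are used; the kurtosis is not
    excess kurtosis, so a normal sample gives a value near 3.
    """
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(np.log(prices))      # r_i = ln(S_{i+1} / S_i)
    mean = returns.mean()
    centred = returns - mean
    variance = np.mean(centred ** 2)
    std = np.sqrt(variance)
    skewness = np.mean(centred ** 3) / std ** 3
    kurtosis = np.mean(centred ** 4) / variance ** 2
    return mean, std, skewness, kurtosis
```

A handful of large returns is enough to push the kurtosis far above 3, which is the pattern seen in the real-data columns of Table 3.8.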
Diffusion models such as the MROU model in Chapter 3 tend to give a kurtosis close to 3.

4.1  Description of the model

Let $(\Omega, \mathcal{F}, Q)$ be a complete filtered probability space. The dynamics of the state variables are given by the following stochastic differential equations:

$$S_t = \exp\{h(t) + X_t\},$$  (4.2)
$$dX_t = -\lambda_X (X_t - L_t)\,dt + \sigma_X\,dW_t^1 + J^u\,dN_t^u - J^d\,dN_t^d,$$  (4.3)
$$dL_t = -\lambda_L (L_t - \bar L)\,dt + \sigma_L\,dW_t^2.$$  (4.4)

As before $S_t$ is the spot price. Here the Brownian motions $W^1$ and $W^2$ are independent. The jump behavior of $X_t$ is governed by two types of jumps: upward jumps and downward jumps. The upward jumps $J^u$ are exponentially distributed with positive mean $1/\eta_u$, and the downward jumps $J^d$ are also exponentially distributed with mean $1/\eta_d$. In this model, $N^u$ and $N^d$ are two independent Poisson processes with arrival rates $\lambda_u$ and $\lambda_d$ respectively. The function $h(t)$ denotes a deterministic seasonality function.

4.2  Valuation of electricity futures

This model also belongs to the class of affine jump-diffusion (AJD) processes. We can rewrite equations (4.3) and (4.4) according to equation (A-1) in Appendix A to obtain:

$$d\begin{pmatrix} X_t \\ L_t \end{pmatrix} = \left[\begin{pmatrix} 0 \\ \lambda_L \bar L \end{pmatrix} + \begin{pmatrix} -\lambda_X & \lambda_X \\ 0 & -\lambda_L \end{pmatrix}\begin{pmatrix} X_t \\ L_t \end{pmatrix}\right] dt + \begin{pmatrix} \sigma_X & 0 \\ 0 & \sigma_L \end{pmatrix}\begin{pmatrix} dW_t^1 \\ dW_t^2 \end{pmatrix} + \begin{pmatrix} J^u\,dN_t^u - J^d\,dN_t^d \\ 0 \end{pmatrix}.$$

Again defining $U_t = (X_t, L_t)'$, the expressions for $K_0$, $K_1$, $H_0$ and $H_1$ remain the same as for the MROU, see equation (3.8). However we now have

$$l_0 = \lambda_u + \lambda_d.$$  (4.5)

To obtain the formula for the future contracts $F(t, T)$ we need to calculate the jump transform function in order to include the jumps in the ODE for $M(t, T)$. The other two equations, for $N_1(t, T)$ and $N_2(t, T)$, are the same: see equations (3.16) and (3.17). The density of the distribution of jumps of X is given by

$$\nu_X(x) = \begin{cases} \dfrac{\lambda_u}{\lambda_u + \lambda_d}\,\eta_u e^{-\eta_u x}, & x > 0, \\[2mm] \dfrac{\lambda_d}{\lambda_u + \lambda_d}\,\eta_d e^{\eta_d x}, & x < 0. \end{cases}$$

The transform for the jumps is given by (A-4),

$$\theta(N_1, N_2) = \int_{\mathbb{R}^2} e^{N_1 z_1 + N_2 z_2}\,d\nu(z_1, z_2),$$  (4.6)

where $\nu(\cdot)$ is the jump distribution on $\mathbb{R}^2$, and $N_1$, $N_2$ are such that the integral (4.6) converges.
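However, since L does not have jumps, $\nu$ is concentrated on the subspace $z_2 = 0$ and we have

$$\theta(N_1, N_2) = \int e^{N_1 z}\,d\nu_X(z),$$

wherever this integral converges, that is, wherever $-\eta_d < N_1 < \eta_u$. Then

$$\int_{-\infty}^{\infty} \exp\{N_1 z\}\,\nu_X(z)\,dz = \frac{\lambda_u}{\lambda_u + \lambda_d}\int_0^{\infty} \eta_u e^{-(\eta_u - N_1) z}\,dz + \frac{\lambda_d}{\lambda_u + \lambda_d}\int_{-\infty}^{0} \eta_d e^{(\eta_d + N_1) z}\,dz = \frac{\lambda_u}{\lambda_u + \lambda_d}\,\frac{\eta_u}{\eta_u - N_1} + \frac{\lambda_d}{\lambda_u + \lambda_d}\,\frac{\eta_d}{\eta_d + N_1}.$$  (4.7)

Therefore

$$\frac{\partial M}{\partial t} = -\frac{1}{2}\left(\sigma_X^2 N_1^2 + \sigma_L^2 N_2^2\right) - \lambda_L \bar L N_2 - \lambda_u\left(\frac{\eta_u}{\eta_u - N_1(t, T)} - 1\right) - \lambda_d\left(\frac{\eta_d}{\eta_d + N_1(t, T)} - 1\right),$$

with the boundary condition $M((1, 0)', T, T) = 0$. The equations for $N(t, T)$ are, as before,

$$\frac{\partial N_1}{\partial t} = \lambda_X N_1, \qquad \frac{\partial N_2}{\partial t} = \lambda_L N_2 - \lambda_X N_1.$$

We can solve the system with the corresponding boundary conditions to obtain

$$M(t, T) = \sum_{i=1}^{5} m_i\left(e^{\kappa_i (t - T)} - 1\right) + \frac{\lambda_u}{\lambda_X}\ln\left(\frac{\eta_u - e^{\lambda_X (t - T)}}{\eta_u - 1}\right) + \frac{\lambda_d}{\lambda_X}\ln\left(\frac{\eta_d + e^{\lambda_X (t - T)}}{\eta_d + 1}\right),$$  (4.8)

where each exponent $\kappa_i$ is one of $\lambda_X$, $\lambda_L$, $2\lambda_X$, $2\lambda_L$ and $\lambda_X + \lambda_L$, and where the constants $m_1, \ldots, m_5$ and the solutions for $N_1(t, T)$ and $N_2(t, T)$ are given by equations (3.16) and (3.17). See Appendix B for the calculation of $M(t, T)$.

Thus, the expression for the future prices is given by

$$F(t, T) = \exp\{h(T)\}\exp\{M(t, T) + N_1(t, T) X_t + N_2(t, T) L_t\},$$  (4.9)

where the functions $M(t, T)$, $N_1(t, T)$ and $N_2(t, T)$ are given by equations (3.16), (3.17), and (4.8).

Note that (4.8) requires that $1/\eta_u < 1$. This restriction is to be expected, since the future price is given by

$$F(t, T) = E(S_T \mid \mathcal{F}_t) = E\left(e^{h(T) + X_T} \mid \mathcal{F}_t\right).$$  (4.10)

Since $X_T$ contains, in general, terms involving jumps with an exponential distribution, the expectation in (4.10) will diverge if the upper tail of the jump distribution is sufficiently large. See [44].

4.3  Particle filter setup

Since this model is non-Gaussian, due to the jump process $J^u\,dN^u - J^d\,dN^d$, we cannot employ the Kalman filter to estimate the parameters $\lambda_u$, $\lambda_d$, $\eta_u$, $\eta_d$. We therefore wish to employ the particle filter, which is in principle able to handle quite general distributions.

As for the MROU model, we assume the data is given by

$$(X_t^*, Z_{t,T_1}), \qquad t = 1, 2, \ldots, n.$$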
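(For simplicity we consider just one future price.) Here $X_t^*$ is the deseasonalized log-spot price, and $Z_{t,T_1}$ the deseasonalized log-future price $\log F(t, T_1)$.

The first step is to discretise the model. Let $t_0, t_1, \ldots, t_n$ be the times, and $\Delta t = t_i - t_{i-1}$. As before, for simplicity we abuse notation slightly, and write $X_t$ for $X_{t_i}$. Then we have, using the forward Euler approximation:

$$X_t - X_{t-1} = -\lambda_X (X_{t-1} - L_{t-1})\Delta t + \sigma_X \Delta W_t^1 + J^u \Delta N_t^u - J^d \Delta N_t^d,$$
$$L_t - L_{t-1} = -\lambda_L (L_{t-1} - \bar L)\Delta t + \sigma_L \Delta W_t^2.$$

Here $J^u$, $J^d$ are exponential random variables with parameters $\eta_u$, $\eta_d$ respectively, $\Delta W_t^1 \sim N(0, \Delta t)$, $\Delta W_t^2 \sim N(0, \Delta t)$, and $\Delta N^u$, $\Delta N^d$ are Poisson random variables with parameters $\lambda_u \Delta t$, $\lambda_d \Delta t$ respectively. Since

$$P(\Delta N^u \ge 2) = 1 - e^{-\lambda_u \Delta t} - \lambda_u \Delta t\,e^{-\lambda_u \Delta t} = O((\Delta t)^2),$$

the probability of two or more jumps in one day is small. We therefore approximate $\Delta N^u$ by a Bernoulli random variable with parameter $\lambda_u \Delta t$, that is we take

$$P(\Delta N^u = 1) = \lambda_u \Delta t, \qquad P(\Delta N^u = 0) = 1 - \lambda_u \Delta t,$$

with a similar approximation for $\Delta N^d$. The transition equation is therefore

$$U_t = \begin{pmatrix} X_t \\ L_t \end{pmatrix} = \begin{pmatrix} 0 \\ \lambda_L \bar L \Delta t \end{pmatrix} + \begin{pmatrix} 1 - \lambda_X \Delta t & \lambda_X \Delta t \\ 0 & 1 - \lambda_L \Delta t \end{pmatrix}\begin{pmatrix} X_{t-1} \\ L_{t-1} \end{pmatrix} + \begin{pmatrix} \sigma_X \sqrt{\Delta t}\,\xi_t^1 + J^u \Delta N_t^u - J^d \Delta N_t^d \\ \sigma_L \sqrt{\Delta t}\,\xi_t^2 \end{pmatrix}.$$  (4.11)

As before, we assume the measurements are subject to noise, so the measurement equation is

$$V_t = \begin{pmatrix} X_t^* \\ Z_{t,T} \end{pmatrix} = \begin{pmatrix} 0 \\ M(t, T) \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ N_1(t, T) & N_2(t, T) \end{pmatrix}\begin{pmatrix} X_t \\ L_t \end{pmatrix} + \begin{pmatrix} \sigma_S\,\omega_t^1 \\ \sigma_Z\,\omega_t^2 \end{pmatrix},$$  (4.12)

where $\xi_t^1$, $\xi_t^2$, $\omega_t^1$ and $\omega_t^2$ are standard normal random variables.

For the particle filter we initially assume the parameter $\theta$ is known, and set it up in the following way. We write $u_t^{(i)}(k)$ for the kth component ($k = 1, 2$) of the ith particle ($i = 1, 2, \ldots, n_p$) at time t. A key part of the implementation of the particle filter is the choice of the proposal density $q(u_t \mid u_{t-1}, v_t)$. One choice would be to simply take $q(\cdot)$ to be the transition density $p(u_t \mid u_{t-1})$ arising from (4.11). However, this choice is not likely to be optimal. Most of the time, we will have $\Delta N^u = \Delta N^d = 0$, so most particles will not make a jump at time t. If the observed process $X_t^*$ does make a jump at time t, then most particles will be left behind by this jump.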
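It is therefore better to choose $q(u_t \mid u_{t-1}, v_{1:t})$ to exploit the information available in $v_t$. (We remark that the optimal choice of $q(u_t \mid u_{t-1}, v_{1:t})$ would be the distribution $p(u_t \mid u_{t-1}, v_{1:t})$, but it is not feasible to calculate this distribution.) We therefore choose $q(\cdot)$ so that the particles are propagated according to the following equations:

$$u_t^{(i)}(1) = X_t^* + \sigma_S\,\xi_t^1,$$  (4.13)
$$u_t^{(i)}(2) = u_{t-1}^{(i)}(2) - \lambda_L\left(u_{t-1}^{(i)}(2) - \bar L\right)\Delta t + \sigma_L \sqrt{\Delta t}\,\xi_t^2.$$  (4.14)

Note this means that the second component of $u_t$ just follows the evolution equation of L, but we use the new data available in $X_t^*$ to move the first coordinate of the particles close to $X_t^*$. Here $\xi_t^i$ are independent $N(0, 1)$ random variables ($i = 1, 2$).

We now calculate the likelihood and prior densities. We have

$$p(v_t \mid u_t^{(i)}) = p(X_t^*, Z_{t,T} \mid u_t^{(i)}(1), u_t^{(i)}(2)) = p(X_t^* \mid u_t^{(i)}(1))\,p(Z_{t,T} \mid u_t^{(i)}),$$  (4.15)

where

$$X_t^* \mid u_t^{(i)}(1) \sim N(u_t^{(i)}(1), \sigma_S^2) \quad\text{and}\quad Z_{t,T} \mid u_t^{(i)} \sim N(m_Z, \sigma_Z^2),$$  (4.16)

with $m_Z = M(t, T) + N_1(t, T)\,u_t^{(i)}(1) + N_2(t, T)\,u_t^{(i)}(2)$.

Now, for the prior density or transition density,

$$p(u_t^{(i)} \mid u_{t-1}^{(i)}) = p(u_t^{(i)}(1), u_t^{(i)}(2) \mid u_{t-1}^{(i)}(1), u_{t-1}^{(i)}(2)) = p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), u_{t-1}^{(i)}(2))\,p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2)).$$  (4.17)

Here

$$u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2) \sim N(m_L, S_L),$$  (4.18)

with $m_L = \bar L(1 - e^{-\lambda_L \Delta t}) + u_{t-1}^{(i)}(2)\,e^{-\lambda_L \Delta t}$ and $S_L = (\sigma_L^2 / 2\lambda_L)(1 - e^{-2\lambda_L \Delta t})$, since L is an Ornstein-Uhlenbeck process.

Using the Bernoulli approximation for $\Delta N^u$ and $\Delta N^d$, and neglecting the $O((\Delta t)^2)$ probability of both an upward and a downward jump in the same period,

$$p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = (1 - \lambda_u \Delta t - \lambda_d \Delta t)\,f_0(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) + \lambda_u \Delta t\,f_u(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) + \lambda_d \Delta t\,f_d(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)).$$  (4.19)

Here $f_0$ denotes the density corresponding to the jump-free case, and $f_u$ and $f_d$ correspond to the case of a single upward and downward jump respectively. Set

$$\mu_X = (1 - \lambda_X \Delta t)\,u_{t-1}^{(i)}(1) + \lambda_X \Delta t\,u_{t-1}^{(i)}(2) \quad\text{and}\quad s^2 = \sigma_X^2 \Delta t;$$

then

$$f_0(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = (2\pi s^2)^{-1/2}\exp\{-(u_t^{(i)}(1) - \mu_X)^2 / 2s^2\}.$$  (4.20)

$f_u$ and $f_d$ are obtained by the convolution of $f_0$ with the distributions of $J^u$ and $-J^d$. So

$$f_u(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = \int_0^{\infty} (2\pi s^2)^{-1/2}\,e^{-(u_t^{(i)}(1) - y - \mu_X)^2 / 2s^2}\,\eta_u e^{-\eta_u y}\,dy.$$  (4.21)

Now, completing the square in y,

$$\frac{(u_t^{(i)}(1) - y - \mu_X)^2}{2s^2} + \eta_u y = \frac{\left(y - (u_t^{(i)}(1) - \mu_X - \eta_u s^2)\right)^2}{2s^2} + \eta_u (u_t^{(i)}(1) - \mu_X) - \frac{\eta_u^2 s^2}{2}.$$

Hence, writing $A = u_t^{(i)}(1) - \mu_X - \eta_u s^2$,

$$f_u(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = \eta_u e^{\eta_u^2 s^2 / 2} e^{-\eta_u (u_t^{(i)}(1) - \mu_X)} \int_0^{\infty} (2\pi s^2)^{-1/2} e^{-(y - A)^2 / 2s^2}\,dy = \eta_u e^{\eta_u^2 s^2 / 2} e^{-\eta_u (u_t^{(i)}(1) - \mu_X)} \int_{-\infty}^{A/s} (2\pi)^{-1/2} e^{-t^2/2}\,dt = \eta_u e^{\eta_u^2 s^2 / 2} e^{-\eta_u (u_t^{(i)}(1) - \mu_X)}\,\Phi(A/s),$$

where $\Phi(\cdot)$ is the normal cumulative distribution function. So

$$f_u(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = \eta_u e^{\eta_u^2 s^2 / 2} e^{-\eta_u (u_t^{(i)}(1) - \mu_X)}\,\Phi\left(\frac{u_t^{(i)}(1) - \mu_X - \eta_u s^2}{\sigma_X (\Delta t)^{1/2}}\right).$$  (4.22)

Let $u' = -u_t^{(i)}(1)$ and $\mu_X' = -\mu_X$. To get the distribution $f_d$ we use the calculations for $f_u$:

$$f_d(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = \int_0^{\infty} (2\pi s^2)^{-1/2}\,e^{-(u' - y - \mu_X')^2 / 2s^2}\,\eta_d e^{-\eta_d y}\,dy.$$

So

$$f_d(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)) = \eta_d e^{\eta_d^2 s^2 / 2} e^{\eta_d (u_t^{(i)}(1) - \mu_X)}\,\Phi\left(\frac{\mu_X - u_t^{(i)}(1) - \eta_d s^2}{\sigma_X (\Delta t)^{1/2}}\right).$$  (4.23)

Combining (4.19), (4.20), (4.22), and (4.23) gives $p(u_t^{(i)}(1) \mid u_{t-1}^{(i)})$.

Finally, the proposal density is

$$q(u_t^{(i)} \mid u_{t-1}^{(i)}, v_t) = p(u_t^{(i)}(1), u_t^{(i)}(2) \mid u_{t-1}^{(i)}(1), u_{t-1}^{(i)}(2), v_t) = p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), v_t)\,p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2)),$$  (4.24)

with $p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2))$ given as in (4.18) and

$$p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), v_t) = (2\pi\sigma_S^2)^{-1/2}\exp\{-(u_t^{(i)}(1) - X_t^*)^2 / 2\sigma_S^2\}.$$  (4.25)

Combining (4.15), (4.17), and (4.24) we calculate the associated weights

$$w_t^{(i)} = \frac{p(v_t \mid u_t^{(i)})\,p(u_t^{(i)} \mid u_{t-1}^{(i)})}{q(u_t^{(i)} \mid u_{t-1}^{(i)}, v_t)} = \frac{p(X_t^* \mid u_t^{(i)}(1))\,p(Z_{t,T} \mid u_t^{(i)})\,p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1))\,p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2))}{p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), v_t)\,p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2))} = p(Z_{t,T} \mid u_t^{(i)})\,p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1)).$$

The factor $p(X_t^* \mid u_t^{(i)}(1))$ cancels against the proposal density (4.25), since both are the $N(0, \sigma_S^2)$ density evaluated at $u_t^{(i)}(1) - X_t^*$. Here the first remaining term is given by (4.16), and the second by combining (4.19), (4.20), (4.22), and (4.23).

4.4  Simulated data with known parameters

We tested the implementation of the particle filter presented in Section 4.3 for the MROU jump model to estimate the long-term mean process L. We simulated a series of 100 data points according to (4.3), (4.4), and (4.9), taking [parameter values lost in extraction]. Figure 4.1 shows the particle filter estimates of the state using 100 particles.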
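The one-step transition density of the first state component, built from (4.19), (4.20), (4.22) and (4.23), can be sketched as below. This is an illustrative implementation under the Bernoulli approximation, assuming NumPy; the function names and argument order are hypothetical. It is written for moderate arguments only: for values of `u` hundreds of units away from the conditional mean the exponential factors could overflow.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)

def std_normal_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * _erfc(-np.asarray(z, dtype=float) / math.sqrt(2.0))

def transition_density(u, u_prev, l_prev, dt, lam_x, sig_x,
                       lam_u, lam_d, eta_u, eta_d):
    """Mixture density (4.19) of X_t given the previous particle (X, L)."""
    mu = (1.0 - lam_x * dt) * u_prev + lam_x * dt * l_prev  # jump-free mean
    s = sig_x * math.sqrt(dt)                               # jump-free std
    a = np.asarray(u, dtype=float) - mu
    # (4.20): no jump
    f0 = np.exp(-0.5 * (a / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
    # (4.22): one upward exponential jump convolved with the Gaussian
    fu = (eta_u * np.exp(0.5 * (eta_u * s) ** 2 - eta_u * a)
          * std_normal_cdf((a - eta_u * s ** 2) / s))
    # (4.23): one downward exponential jump convolved with the Gaussian
    fd = (eta_d * np.exp(0.5 * (eta_d * s) ** 2 + eta_d * a)
          * std_normal_cdf((-a - eta_d * s ** 2) / s))
    p_up, p_down = lam_u * dt, lam_d * dt
    return (1.0 - p_up - p_down) * f0 + p_up * fu + p_down * fd
```

Because each of the three components is a proper density and the mixture weights sum to one, the result integrates to one over `u`, which is a convenient sanity check before using it inside the particle weights.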
Figure 4.1: Plot of the true state L and estimate of the particle filter.

As is clear from Figure 4.1, the particle filter does a good job of estimating the state for this example even using just a few particles, provided the parameters are known.

4.5  Likelihood function estimation

Using the particle filter for a fixed parameter $\theta$, one can obtain an estimate of the likelihood function $\mathcal{L}_\theta(V_{1:t})$ by (2.45). However, as we already mentioned above, severe difficulties arise when one tries to optimize this function since, for each value of $\theta$, one is using a different randomization.

Figure 4.2: The log-likelihood for different $\lambda_X$ values.

To investigate the problem, we fixed all the parameters except $\lambda_X$ at their correct values (the correct value of $\lambda_X$ being 110), and estimated $\mathcal{L}_{\lambda_X}(V_{1:t})$ for simulated data. Taking 1000 points and 2000 particles, we estimated $\mathcal{L}_{\lambda_X}(V_{1:t})$ for $\lambda_X$ in integer steps between 102 and 120. As is clear from Figure 4.2, there is little prospect of satisfactory use of a hill-climbing algorithm given the level of noise. We repeated this estimation with 15000 particles and obtained a similar curve, but with oscillations roughly 10 times smaller. These oscillations, however, were still large enough that a hill-climbing algorithm would have difficulty locating the optimum.

It seems likely that increasing the number of particles by a factor of k will increase the accuracy of the estimates of the log-likelihood by [factor illegible in source]. If so, one might need $10^6$ to $10^8$ particles in order to use a hill-climbing approach even in one dimension.

4.6  Sequential parameters

Our goal is to estimate the long-term mean process together with the 11-dimensional parameter vector $\theta$ given by

$$\theta = \{\lambda_X, \lambda_L, \sigma_X, \sigma_L, \bar L, \lambda_u, \lambda_d, \eta_u, \eta_d, \sigma_S, \sigma_Z\}$$

according to the information available at a given time t, i.e. $V_{1:t}$. As we explained in Section 2.6, from a Bayesian point of view we concatenate the state vector and the parameters and apply the filter to this augmented state.
Then we define

$$Y_t = \begin{pmatrix} U_t \\ \theta_t \end{pmatrix},$$  (4.26)

where $\theta_t$ is the particle approximation to $\theta$. It is useful to add some noise to the transition for $\theta_t$ (footnote 9):

$$\theta_t = \theta_{t-1} + \zeta_{t-1},$$  (4.27)

where $\{\zeta_{t-1}\}_{t \ge 1}$ is a small artificial noise with a variance $A_t$ that decreases with t.

Footnote 9: If not, we only sample particles in $\theta$ space at time 1 and never modify their locations; then after a few time steps $p(\theta \mid V_{1:t})$ is approximated by a single particle.

A wide variety of choices of $\zeta_t$ and $A_t$ are possible. It is clear that some care has to be taken in the choice of $A_t$. If $A_t$ is too large, then the particles $\theta_t$ will oscillate too much to give a satisfactory estimate of $\theta$ (the larger the covariance, the more quickly older data are discarded). If $A_t$ is too small, then unless the initial value is nearly correct, $\theta_t$ will not move enough to reach the correct value.

The literature contains a number of suggestions on how $A_t$ should be chosen. In [43] the author suggested taking $\zeta_t$ to be white noise. Others proposed to set $A_t$ as a diagonal matrix annealing to zero; see [41, 53, 68] for more options.

First we take a fairly simple approach. We will take $A_t$ to be diagonal, and write $\theta_{t,j}$ for the j-th component of $\theta_t$, and $A_{t,j}$ for the j-th element in the diagonal of $A_t$. We set $A_{t,j}$ according to

[equation (4.28), not recoverable from the source; it defines $A_{t,j}$ in terms of $b_j$, $c_j$ and t]  (4.28)

where $b_j$ and $c_j$ are constants. Note that $\sum_t \operatorname{var}(\zeta_{t,j}) = \infty$. The 'annealing' given by (4.28) is not fast enough to make $\theta_{t,j}$ converge almost surely.

Later we use the Liu and West [57] approach. They suggest approximating the distribution $p(\theta \mid V_{1:t})$ by a mixture of Gaussian distributions, that is

$$p(\theta \mid V_{1:t}) \approx \sum_{i=1}^{n_p} w_t^{(i)}\,n(\theta \mid m_t^{(i)}, h^2 A_t).$$  (4.29)

The quantity $m_t^{(i)} = a\,\theta_t^{(i)} + (1 - a)\,\bar\theta_t$ is the kernel location for the i-th component of the mixture, where

$$\bar\theta_t = \sum_{i=1}^{n_p} w_t^{(i)}\,\theta_t^{(i)},$$  (4.30)

and the matrix $A_t$ is an estimate of the posterior variance-covariance matrix, i.e.

$$A_t = \sum_{i=1}^{n_p} w_t^{(i)}\,(\theta_t^{(i)} - \bar\theta_t)(\theta_t^{(i)} - \bar\theta_t)'.$$  (4.31)
The constants h and a, which measure the extent of the shrinkage and the degree of over-dispersion of the mixture, are given by $h^2 = 1 - ((3\delta - 1)/2\delta)^2$ and $a = \sqrt{1 - h^2}$, where the discount factor $\delta$ ranges between 0.95 and 0.99. In this case the artificial dynamic of the parameter is given by

$$\theta_{t+1}^{(i)} = m_t^{(i)} + h A_t^{1/2}\,\varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim n(0, I),$$  (4.32)

where $n(x \mid \mu, \sigma^2)$ denotes the density of the Gaussian distribution. With these specifications the likelihood, prior and proposal densities for $y_t$ now depend on $\theta$. So we have

$$p(v_t \mid y_t^{(i)}) = n(X_t^* \mid u_t^{(i)}(1), \sigma_S^2) \times n(Z_{t,T} \mid M_t(\theta) + N_{1,t}(\theta)\,u_t^{(i)}(1) + N_{2,t}(\theta)\,u_t^{(i)}(2), \sigma_Z^2),$$  (4.33)

$$p(y_t^{(i)} \mid y_{t-1}^{(i)}) = p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), u_{t-1}^{(i)}(2), \theta_{t-1}^{(i)}) \times p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2), \theta_{t-1}^{(i)}) \prod_j n(\theta_{t,j}^{(i)} \mid \theta_{t-1,j}^{(i)}),$$  (4.34)

$$q(y_t^{(i)} \mid y_{t-1}^{(i)}, v_t) = n(u_t^{(i)}(1) \mid X_t^*, \sigma_S^2) \times p(u_t^{(i)}(2) \mid u_{t-1}^{(i)}(2), \theta_{t-1}^{(i)}) \prod_j n(\theta_{t,j}^{(i)} \mid \theta_{t-1,j}^{(i)}).$$  (4.35)

Substituting (4.33)-(4.35) into the weights formula gives

$$w_t^{(i)} = \frac{p(v_t \mid y_t^{(i)})\,p(y_t^{(i)} \mid y_{t-1}^{(i)})}{q(y_t^{(i)} \mid y_{t-1}^{(i)}, v_t)} = n(Z_{t,T} \mid M_t(\theta) + N_{1,t}(\theta)\,u_t^{(i)}(1) + N_{2,t}(\theta)\,u_t^{(i)}(2), \sigma_Z^2) \times p(u_t^{(i)}(1) \mid u_{t-1}^{(i)}(1), u_{t-1}^{(i)}(2), \theta_{t-1}^{(i)}).$$  (4.36)

The last density in (4.36) is given by (4.17).

4.7  Empirical results

In this section we report some empirical results based on simulated data to show the performance of the method. The true parameters were fixed to be:

$\lambda_X = 110$, $\sigma_X = 3$, $\lambda_L = 5$, $\sigma_L = 1$, $\bar L = 3.2$, $\lambda_u = 5$, $\eta_u = 1.5$, $\lambda_d = 1$, $\eta_d = 3$.

For simplicity we took $\sigma_S = 0.1$ and $\sigma_Z = 0.1$ to be fixed and assumed to be known. Figures 4.3 and 4.4 show a simulation of a MROU with jumps process using the parameter values above. The upper graph in Figure 4.3 shows the simulated spot price and the future price with maturity of one month. The lower graph is the long-term mean process L. Figure 4.4 shows the upward and downward jump processes. For this data set the skewness and kurtosis are -1.8895 and 21.069 respectively.
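A trajectory of the kind shown in Figures 4.3 and 4.4 can be generated from the Euler scheme with Bernoulli jump indicators described in Section 4.3. The sketch below is illustrative only and assumes NumPy; the function name is hypothetical, and the time step $\Delta t = 1/250$ used in the usage lines is an assumption, since the step used for this experiment is not stated in the text.

```python
import numpy as np

def simulate_mrou_jumps(n, dt, lam_x, lam_l, sig_x, sig_l, l_bar,
                        lam_u, lam_d, eta_u, eta_d, x0, l0, rng):
    """Euler path of (X, L) for the MROU model with jumps.

    Jump counts per step are approximated by Bernoulli variables with
    probabilities lam_u*dt and lam_d*dt; jump sizes are exponential
    with means 1/eta_u (upward) and 1/eta_d (downward).
    """
    x = np.empty(n + 1)
    l = np.empty(n + 1)
    x[0], l[0] = x0, l0
    sqrt_dt = np.sqrt(dt)
    for t in range(1, n + 1):
        up = rng.random() < lam_u * dt
        down = rng.random() < lam_d * dt
        jump = up * rng.exponential(1.0 / eta_u) - down * rng.exponential(1.0 / eta_d)
        x[t] = (x[t - 1] - lam_x * (x[t - 1] - l[t - 1]) * dt
                + sig_x * sqrt_dt * rng.standard_normal() + jump)
        l[t] = (l[t - 1] - lam_l * (l[t - 1] - l_bar) * dt
                + sig_l * sqrt_dt * rng.standard_normal())
    return x, l

# True parameter values of Section 4.7; dt = 1/250 is an assumption.
rng = np.random.default_rng(1)
x, l = simulate_mrou_jumps(1200, 1.0 / 250, 110.0, 5.0, 3.0, 1.0, 3.2,
                           5.0, 1.0, 1.5, 3.0, x0=3.2, l0=3.2, rng=rng)
spot = np.exp(x)  # deseasonalized spot price S = exp(X)
```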
Considering the first approach (4.28) and using 1000 particles, we ran the particle filter to estimate all the parameters, but the filter failed to obtain good results for most of them. Difficulties of this kind in a high-dimensional situation are not surprising. Even in the case of the Kalman filter, where we optimized a deterministic function, the hill-climbing algorithm sometimes failed to find a point close to the true maximum. The particle filter replaces the hill-climbing point with a cloud of particles, where the weights $w_t^{(i)}$ are higher when the particles are close to the true value. This should mean that the particle cloud drifts towards the true value of the parameters, but it seems intuitively unlikely that it will perform as well as a hill-climbing algorithm.

A known weakness of optimization algorithms is that the higher the number of parameters, the worse the performance of the algorithm. This means that a one-parameter optimization should perform best. To test this, we allowed each of the parameters to vary in turn. Thus in each run we fixed all but one parameter at its correct value, and ran the particle filter with just one $\theta_{t,j}$ varying. We took the number of particles to be 1000, and the number of time steps to be 1200. We chose $A_{t,j}$ as in (4.28), with $c_j = 100$.

Figure 4.5 shows the dynamics under the artificial noise (4.28) of the parameter $\theta_{t,j} := \lambda_X$ for $t = 1, 2, \ldots, 1200$ using 1000 particles, together with its true value (110). To test the particle filter, we started the parameter some way away from its correct value. The noise is decreasing quite slowly, and hence the variance of the process remains large.

Figure 4.3: The graph shows a simulated trajectory of the MROU model with jumps. [Upper panel: spot $S_t = \exp(X_t)$, with $dX_t = -\lambda_X (X_t - L_t)\,dt + \sigma_X\,dW_t^1 + J^u\,dN_t^u - J^d\,dN_t^d$, and future $F(t, T_1) = E_t(\exp(X_{T_1}))$; lower panel: long-term mean $L_t$, with $dL_t = -\lambda_L (L_t - \bar L)\,dt + \sigma_L\,dW_t^2$; horizontal axis: days.]
Figure 4.5 suggests that there is little "push" in the particle filter towards the correct value, and this is confirmed by Table 4.1, which shows a fairly wide dispersion of estimates around the true value, for different runs of the filter on the same data set. A similar pattern arises for the other parameters.

We ran the particle filter 25 times for each parameter. Let $\theta_{t,j}(k)$ be the value of $\theta_{t,j}$ in the kth run, where $k = 1, 2, \ldots, 25$. For each run k, we estimated $\theta_j$ by taking its average over the last 400 observations, that is,

$$\hat\theta_j(k) = \frac{1}{400}\sum_{t=801}^{1200} \theta_{t,j}(k).$$

We then calculated the mean and standard deviation of $\hat\theta_j(k)$, $k = 1, 2, \ldots, 25$. Table 4.2 summarizes the outcome of the whole procedure. The first column is the true value of the parameters. The second column is the initial value of $A_{t,j}$, that is, b in (4.28). The third and fourth columns give the mean and standard deviation of the estimates ($\hat\theta_j(k)$, $k = 1, 2, \ldots, 25$).

Figure 4.4: The graph shows the upward and downward jumps for the simulated trajectory of the MROU model with jumps. [Upper panel: upward jumps $J^u$; lower panel: downward jumps $J^d$, exponential with parameter $\eta_d$; horizontal axis: days.]

Figure 4.5: Sample particle filter trajectories for the estimate of $\lambda_X$. [Horizontal axis: time step.]

               True   r1       r2       r3       r4      r5      r6       r7       r8
$\lambda_X$    110    126.13   161.49   102.76   87.72   76.41   105.29   124.15   83.71

Table 4.1: Sample of 8 estimated values for $\lambda_X$.

Parameter      True value   b     mean value   std.
$\lambda_X$    110          30    99.29        19.64
$\sigma_X$     3            1     3.26         0.048
$\lambda_L$    5            2     8.147        0.729
$\sigma_L$     1            1     1.16         0.017
$\bar L$       3.2          1     3.37         0.037
$\eta_u$       1.5          0.5   2.444        0.754
$\lambda_u$    5            0.5   8.301        2.983
$\eta_d$       3            1     14.87        6.917
$\lambda_d$    1            0.5   4.218        1.817

Table 4.2: Individual estimates for parameters in MROU with jumps model.

As we can see from Table 4.1, the performance results are rather mixed.
In some cases the particle filter was able to obtain a reasonable estimate for the parameter, but in others, particularly for the parameters associated with the jumps, the estimates are far from the true value. The particle filter literature suggests that even with a very large number of particles one may not be able to obtain accurate results.

Next we tested the Liu and West approach. We used 4000 particles to approximate the distribution of interest. A problem that we noticed is that the estimated posterior variance-covariance matrix $A_t$ collapses to zero after a few hundred iterations. We solved this problem by choosing an efficient resampling scheme that kept the variance in the particle filter algorithm low. We found that residual resampling kept the covariance matrix positive.

We ran two examples. In the first one we fixed all but one parameter at its correct value, and ran the algorithm choosing a reasonable initial point for the free parameter. We obtained good results even for the jump parameters. (For simplicity we again consider $\sigma_S = 0.1$ and $\sigma_Z = 0.1$.) The results are displayed in Figure 4.6. It is interesting to note that the algorithm detects precisely the parameters $\lambda_u$, $\lambda_d$, $\eta_u$, $\eta_d$ associated with extreme events (spikes).

For the second example, we took the parameter vector to estimate as

$$\theta = \{\lambda_X, \lambda_L, \sigma_X, \sigma_L, \bar L\}$$

while the rest of the parameters were fixed to their true values. Again, we started the algorithm choosing a reasonable initial point for the free parameters $\theta$ as well as a small initial variance. Figure 4.7 presents part of the results.

From this example, we noticed that the algorithm gave accurate estimates for the parameters $\bar L$, $\sigma_L$, and $\sigma_X$. More difficulties arose when estimating the speeds of mean reversion of the short and long-term processes, $\lambda_X$ and $\lambda_L$. They are slightly under- and over-estimated respectively.
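The kernel-shrinkage step of the Liu and West approach, equations (4.29) to (4.32), can be sketched as follows. This is a minimal illustration assuming NumPy, not the implementation used for the experiments: it covers only the shrinkage and jitter of the parameter particles, leaves out resampling and the state update, and uses hypothetical function names. A tiny diagonal term is added before the Cholesky factorization as a numerical safeguard, which is also an assumption.

```python
import numpy as np

def shrinkage_constants(delta):
    """Liu-West constants a and h for a discount factor delta."""
    a = (3.0 * delta - 1.0) / (2.0 * delta)
    h = np.sqrt(1.0 - a ** 2)
    return a, h

def liu_west_jitter(theta, weights, delta, rng):
    """Shrink parameter particles towards their mean and add kernel noise.

    theta:   array of shape (n_particles, n_params)
    weights: normalized particle weights, shape (n_particles,)
    Returns the new particles, the kernel locations m, and the
    weighted covariance estimate A.
    """
    a, h = shrinkage_constants(delta)
    theta_bar = weights @ theta                        # (4.30)
    centred = theta - theta_bar
    cov = (weights[:, None] * centred).T @ centred     # (4.31)
    m = a * theta + (1.0 - a) * theta_bar              # kernel locations
    dim = theta.shape[1]
    chol = np.linalg.cholesky(h ** 2 * cov + 1e-12 * np.eye(dim))
    noise = rng.standard_normal(theta.shape) @ chol.T  # N(0, h^2 A) draws
    return m + noise, m, cov                           # (4.32)
```

Since $a^2 + h^2 = 1$, the shrunk locations have covariance $a^2 A_t$ and the added noise contributes $h^2 A_t$, so the jittered cloud keeps the mean and covariance of the original one instead of becoming over-dispersed.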
Overall, the algorithm provided more precise estimates, but difficulties arose for the parameters related to the jump process. Further, the algorithm required a significant amount of tuning, i.e. choosing the initial value and variance of the artificial noise. However, the graphs in Figures 4.6 and 4.7 indicate a significantly better performance than that obtained by using (4.28). Given these difficulties, we did not feel confident that the algorithm would perform satisfactorily on real data.

Figure 4.6: Sample particle filter trajectories for the estimates of the jump parameters using the Liu and West approach. [Horizontal axis: time step.]

Figure 4.7: Sample particle filter trajectories for the estimates of $\sigma_X$ and a second parameter (symbol illegible in the source) using the Liu and West approach. [Horizontal axis: time step.]

Chapter 5

NLMROU model

Our difficulties with the particle filter led us to look for more tractable models which have the potential for explaining price spikes. While the incorporation of jumps is the most natural way to account for price spikes in the spot price, other explanations have been offered. Barlow [3] proposed a non-linear diffusion model, which can produce price spikes similar to those observed in real data. This single factor model is unlikely to provide a good explanation for the observed relation between spot and future prices. Here we present a two-factor model of the same kind. The model is estimated using data from the European Energy Exchange.

5.1  The model

The nonaffine term structure two-factor model for futures prices is known as the Non-Linear Mean-Reversion Ornstein-Uhlenbeck model (NLMROU). It uses the inverse of the Box-Cox transformation to generate price spikes that fit the data observed in the power market.

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, Q)$ be a filtered probability space.
The dynamic of the spot price under the risk-neutral measure Q is the following:

$$S_t = f_\alpha(X_t)\,h(t),$$

where $X_t$ is an Ornstein-Uhlenbeck process which reverts to a stochastic mean $L_t$, itself an Ornstein-Uhlenbeck process, and $f_\alpha$ is the inverse of the Box-Cox transformation. This transformation was introduced in the context of electricity markets in [3]. The deterministic component $h(t)$ incorporates the seasonality effects in the model. More precisely,

$$S_t = h(t) \times \begin{cases} (1 + \alpha X_t)^{1/\alpha}, & 1 + \alpha X_t > \varepsilon_0, \\ \varepsilon_0^{1/\alpha}, & 1 + \alpha X_t \le \varepsilon_0, \end{cases}$$  (5.1)
$$dX_t = -\lambda_X (X_t - L_t)\,dt + \sigma_X\,dW_t^1,$$  (5.2)
$$dL_t = -\lambda_L (L_t - \bar L)\,dt + \sigma_L\,dW_t^2,$$  (5.3)

where the two Brownian motions $W^1$ and $W^2$ have correlation $\rho$, and $\alpha \le 0$. If $\alpha = 0$ then $S_t$ is the MROU model, see Chapter 3, with a cutoff at $\varepsilon_0$. If $\alpha < 0$, the function $(1 + \alpha X_t)^{1/\alpha}$ increases more rapidly than an exponential function. An important advantage of this approach compared to other methods for producing spikes in the spot price process is that it adds just one more parameter, $\alpha$, to the model to be estimated.

The deterministic seasonality function is the one described by equation (3.33), and its components are estimated by least-squares fitting exactly as in Chapter 3. We denote the deseasonalized spot price by $\tilde S_t$, that is

$$\tilde S_t = S_t / h(t),$$

where as before $h(t)$ is the estimated seasonal correction.

5.2  Future price

Based on the risk neutralized process (5.1)-(5.3) we calculate the Future price. Assuming a deterministic interest rate, the Future price is the expected future spot price under the risk neutral measure Q, i.e.

$$F(t, T) = h(T)\,E^Q(f_\alpha(X_T) \mid X_t = x, L_t = l) = h(T)\int_{-\infty}^{\infty} f_\alpha(w)\,(2\pi\sigma^2(s))^{-1/2}\exp\left\{-\frac{(w - \mu(s, x, l))^2}{2\sigma^2(s)}\right\}dw,$$  (5.4)

where $\mu(s, x, l)$ and $\sigma^2(s)$ are the mean and variance of $X_T$ respectively. Taking $y = (w - \mu(s, x, l))/\sigma(s)$, and considering $\alpha < 0$,

$$F(t, T) = h(T)\int_{-\infty}^{\infty} f_\alpha(\mu(s, x, l) + \sigma(s) y)\,(2\pi)^{-1/2} e^{-y^2/2}\,dy$$  (5.5)
$$= h(T)\left[\int_{-\infty}^{d_1} \left(1 + \alpha(\mu(s, x, l) + \sigma(s) y)\right)^{1/\alpha}(2\pi)^{-1/2} e^{-y^2/2}\,dy + \varepsilon_0^{1/\alpha}\int_{d_1}^{\infty} (2\pi)^{-1/2} e^{-y^2/2}\,dy\right]$$  (5.6)
$$= h(T)\left[\int_{-\infty}^{d_1} \left(1 + \alpha(\mu(s, x, l) + \sigma(s) y)\right)^{1/\alpha}(2\pi)^{-1/2} e^{-y^2/2}\,dy + \varepsilon_0^{1/\alpha}\,\Phi(-d_1)\right],$$  (5.7)

with

$$d_1 = \frac{\varepsilon_0 - 1}{\sigma(s)\,\alpha} - \frac{\mu(s, x, l)}{\sigma(s)},$$

and $\Phi(\cdot)$ denotes the cumulative normal distribution function.

Using the fact that $E(e^Y) = e^{\mu + \sigma^2/2}$ with $Y \sim N(\mu, \sigma^2)$, we are able to calculate the mean $\mu(s, x, l)$ and variance $\sigma^2(s)$ of $X_T$ in (5.4). Let $\theta_X(s) = e^{-\lambda_X s}$, $\theta_L(s) = e^{-\lambda_L s}$, $\alpha_0 = \lambda_X / (\lambda_L - \lambda_X)$ and $s = T - t$.

From previous calculations, see equation (3.9) and Appendix (6.1),

$$E(\exp\{(u_1, u_2)(X_T, L_T)'\} \mid (X_t, L_t)') = \exp\{M(s) + N_1(s) X_t + N_2(s) L_t\},$$

where $N_1(s) = u_1\theta_X(s)$, and $N_2(s)$ and $M(s)$ are the remaining affine coefficients from (3.9); their explicit expressions are not reproduced here. Taking $u = (u_1, 0)'$,

$$E(\exp\{u_1 X_T\} \mid (X_t = x, L_t = l)') = \exp\left\{u_1\left(M_{11}(s) + N_{11}(s)\,x + N_{21}(s)\,l\right) + \frac{u_1^2}{2} M_{12}(s)\right\},$$  (5.8)

where

$$M_{11}(s) = -\frac{\lambda_L \bar L \alpha_0}{\lambda_X}(\theta_X(s) - 1) + \frac{\lambda_L \bar L \alpha_0}{\lambda_L}(\theta_L(s) - 1),$$
$$N_{11}(s) = \theta_X(s), \qquad N_{21}(s) = \alpha_0(\theta_X(s) - \theta_L(s)),$$

and

$$M_{12}(s) = -\frac{\sigma_X^2 + \sigma_L^2\alpha_0^2 + 2\rho\sigma_X\sigma_L\alpha_0}{2\lambda_X}(\theta_X^2(s) - 1) - \frac{\sigma_L^2\alpha_0^2}{2\lambda_L}(\theta_L^2(s) - 1) + \frac{2\sigma_L^2\alpha_0^2 + 2\rho\sigma_X\sigma_L\alpha_0}{\lambda_X + \lambda_L}(\theta_X(s)\theta_L(s) - 1).$$

Therefore the mean and variance of $X_T$ are respectively

$$\mu(s, x, l) = M_{11}(s) + N_{11}(s)\,x + N_{21}(s)\,l$$

and

$$\sigma^2(s) = M_{12}(s).$$

Unfortunately the integral in (5.7) does not admit a closed-form solution. However, this is not a significant obstacle, since these integrals can be evaluated quickly by numerical methods. The Futures prices have been calculated for this model, though in a rather non-explicit fashion; see [78].

5.3  Unscented Kalman filter setup and estimation procedure

This model is non-linear but has Gaussian noise, so an appropriate technique is to use the Extended Kalman filter (EKF) or the Unscented Kalman filter (UKF), see Section 2.3.
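Returning briefly to the futures price of Section 5.2: the Gaussian expectation of the inverse Box-Cox transform in (5.5) has no closed form, but it can be approximated with a few lines of quadrature. The sketch below uses Gauss-Hermite quadrature from NumPy, which is one possible numerical method and not necessarily the one used in the thesis. The function names and the default cutoff `eps0` are hypothetical, and the mean `mu` and standard deviation `sigma` of $X_T$ are taken as inputs. The rule is accurate when the cutoff lies far in the tail of the distribution; when the cutoff is close to the mean, splitting the integral at $d_1$ as in (5.7) is the safer choice.

```python
import numpy as np

def inverse_box_cox(x, alpha, eps0=1e-3):
    """f_alpha of (5.1): (1 + alpha*x)^(1/alpha), with a cutoff at eps0.

    For alpha = 0 the exponential limit exp(x) is returned.
    """
    x = np.asarray(x, dtype=float)
    if alpha == 0.0:
        return np.exp(x)
    # max(1 + alpha*x, eps0)^(1/alpha) reproduces both branches of (5.1)
    return np.maximum(1.0 + alpha * x, eps0) ** (1.0 / alpha)

def futures_price(mu, sigma, alpha, h_T=1.0, eps0=1e-3, n_nodes=64):
    """Approximate h(T) * E[f_alpha(mu + sigma*Y)] for Y ~ N(0, 1)."""
    # Nodes and weights for the weight function exp(-y^2 / 2)
    y, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    values = inverse_box_cox(mu + sigma * y, alpha, eps0)
    return h_T * np.sum(w * values) / np.sqrt(2.0 * np.pi)
```

For $\alpha < 0$ the transform is convex away from the cutoff, so the futures price computed this way lies above $f_\alpha(\mu)$, and the gap widens as $\sigma$ grows.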
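The starting point of our inferential procedure involves an Euler discretization. The model is thus evaluated at a set of discrete times $\{t_i : i = 0, 1, \ldots, n\}$ such that $\Delta t = t_i - t_{i-1}$. Writing $X_t$ for $X_{t_i}$ as before, the Euler scheme for equations (5.1)-(5.3) can be written as:

$$\tilde S_t = f_\alpha(X_t),$$  (5.9)
$$X_t = X_{t-1} - \lambda_X (X_{t-1} - L_{t-1})\Delta t + \sigma_X\sqrt{1 - \rho^2}\,\Delta W_t^1 + \rho\sigma_X\,\Delta W_t^2,$$  (5.10)
$$L_t = L_{t-1} - \lambda_L (L_{t-1} - \bar L)\Delta t + \sigma_L\,\Delta W_t^2.$$  (5.11)

In order to apply the Unscented Kalman filter, we use the state-space representation for the NLMROU model. We define the state equation as

$$\begin{pmatrix} X_t \\ L_t \end{pmatrix} = \begin{pmatrix} 0 \\ \lambda_L \bar L \Delta t \end{pmatrix} + \begin{pmatrix} 1 - \lambda_X \Delta t & \lambda_X \Delta t \\ 0 & 1 - \lambda_L \Delta t \end{pmatrix}\begin{pmatrix} X_{t-1} \\ L_{t-1} \end{pmatrix} + \begin{pmatrix} \sigma_X\sqrt{1 - \rho^2}\,\Delta W_t^1 + \rho\sigma_X\,\Delta W_t^2 \\ \sigma_L\,\Delta W_t^2 \end{pmatrix}.$$  (5.12)

The observations we have available, after seasonal correction, are the spot and future prices $(\tilde S_t, \tilde F(t, T))$. However, the UKF allows us to take as our measurement any transformation of these observations. To reduce non-linearity in the model we therefore took our measurement equation as

$$V_t = \begin{pmatrix} \log \tilde S_t \\ \log \tilde F(t, T) \end{pmatrix} = \begin{pmatrix} \log f_\alpha(X_t) \\ \log\left\{\int_{-\infty}^{d_1} \left(1 + \alpha(\mu(s, X_t, L_t) + \sigma(s) y)\right)^{1/\alpha}(2\pi)^{-1/2} e^{-y^2/2}\,dy + \varepsilon_0^{1/\alpha}\,\Phi(-d_1)\right\} \end{pmatrix} + \begin{pmatrix} \sigma_S\,\omega_t^1 \\ \sigma_Z\,\omega_t^2 \end{pmatrix},$$  (5.13)

where $\tilde S_t$ denotes the deseasonalized spot price, $\tilde F(t, T)$ is the deseasonalized future price at time t with maturity T, and $\omega_t^1$, $\omega_t^2$ are standard normal measurement noises. In this case the noise covariances are given by

$$q = \operatorname{cov}\begin{pmatrix} \sigma_X\sqrt{1 - \rho^2}\,\Delta W_t^1 + \rho\sigma_X\,\Delta W_t^2 \\ \sigma_L\,\Delta W_t^2 \end{pmatrix} = \begin{pmatrix} \sigma_X^2\Delta t & \rho\sigma_X\sigma_L\Delta t \\ \rho\sigma_X\sigma_L\Delta t & \sigma_L^2\Delta t \end{pmatrix}$$

and

$$r = \operatorname{cov}\begin{pmatrix} \sigma_S\,\omega_t^1 \\ \sigma_Z\,\omega_t^2 \end{pmatrix} = \begin{pmatrix} \sigma_S^2 & 0 \\ 0 & \sigma_Z^2 \end{pmatrix}.$$

We set up the specific characteristics of the state-space model for the spot and future prices using the transition and measurement equations (5.12) and (5.13). The UKF scaling parameters were set to $\alpha = 0.001$, $\beta = 2$, and $\kappa = 0$ (here $\alpha$ is the UKF spread parameter, not the Box-Cox exponent). Based on this state-space formulation we are able to run the unscented Kalman filter algorithm in order to estimate the parameter set

$$\theta = \{\lambda_X, \lambda_L, \sigma_X, \sigma_L, \bar L, \rho, \alpha, \sigma_S, \sigma_Z\}$$

by means of maximum likelihood according to Section 2.5.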
We run Algorithm 4 in Section 3.4, but using the UKF instead of the Kalman filter in (Step 2), to obtain $\hat\theta$.

The log-likelihood function is calculated as

$$\mathcal{L}_\theta(v_{1:n}) = -\frac{1}{2}\sum_{k=1}^{n}\left[\log\bigl|\Sigma_{v_k v_k}\bigr| + (v_k - \hat v_{k|k-1})'\,\Sigma_{v_k v_k}^{-1}\,(v_k - \hat v_{k|k-1})\right] \tag{5.14}$$

where $\Sigma_{v_k v_k}$ and $\hat v_{k|k-1}$ are given by (2.23) and (2.22).

5.4 Simulation results

Assuming the parameters are known, and using the Euler scheme discretization in (5.9)-(5.11), we simulated a path of the deseasonalized spot and future price with $n = 800$ observations of the NLMROU model using $\lambda_X = 100$, $\lambda_L = 3$, $\sigma_X = 1.7$, $\sigma_L = 0.4$, $\mu_L = 1.9$, $\rho = 0$, $\alpha = -0.4$ and $\Delta t = 1/250$. Figure 5.1 shows the trajectories. As we can see in Figure 5.2 the UKF is able to recover the 'hidden' states. For simplicity we fixed the parameters $\rho = 0$, $\sigma_S = 0.1$ and $\sigma_Z = 0.01$. The optimization method was repeated 25 times with random re-initialization for each run to obtain

$$\hat\theta = \arg\max_{\theta \in \Theta}\,\mathcal{L}_\theta(v_{1:n}), \tag{5.15}$$

and we proceeded in the same way with 50 different trajectories to obtain the following results; see Tables 5.1 and 5.2. This shows that the algorithm is able to obtain quite good estimates of the parameters for simulated data.

5.5 Parameter estimation based on historical data

For our empirical analysis we again use the data from the European Energy Exchange (EEX) in Leipzig, Germany. In our analysis we considered the spot price of the EEX baseload index and monthly baseload futures prices. The spot price is an equally weighted average of all 24 hourly spot prices for that particular day. Holidays and weekends have been removed from the data set.

[Figure 5.1: The graph shows a simulated trajectory of the NLMROU model. Panels: spot $S = f(X)$; future $F(t,T) = E(S_T \mid S_t)$; MROU processes $X$ and $L$. Horizontal axis: days.]
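The simulation described in Section 5.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used for Figure 5.1: the model parameters are those quoted in the text, but the function name, the random seed, the initial state (both factors started at the long-run level) and the floor parameter `eps0` are hypothetical, and the transform is assumed to be the truncated power function that appears in the futures price formula of Section 5.2.

```python
import numpy as np


def simulate_nlmrou(n=800, dt=1.0 / 250, lam_x=100.0, lam_l=3.0,
                    sig_x=1.7, sig_l=0.4, mu_l=1.9, rho=0.0,
                    alpha=-0.4, eps0=0.1, seed=0):
    """Euler scheme for the two factors, followed by the nonlinear spot map."""
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1)
    l = np.empty(n + 1)
    x[0] = l[0] = mu_l                       # assumed starting point
    dw = rng.normal(scale=np.sqrt(dt), size=(n, 2))  # Brownian increments
    for i in range(n):
        shock_x = sig_x * (np.sqrt(1.0 - rho ** 2) * dw[i, 0] + rho * dw[i, 1])
        x[i + 1] = x[i] + lam_x * (l[i] - x[i]) * dt + shock_x
        l[i + 1] = l[i] + lam_l * (mu_l - l[i]) * dt + sig_l * dw[i, 1]
    # Deseasonalized spot: (1 + alpha*x)^(1/alpha), with the base floored at eps0.
    spot = np.maximum(1.0 + alpha * x, eps0) ** (1.0 / alpha)
    return x, l, spot


if __name__ == "__main__":
    x, l, spot = simulate_nlmrou()
    print(spot[:5])
    print(spot.max())
```

Because the exponent $1/\alpha$ is negative for $\alpha = -0.4$, upward excursions of $X$ are magnified into spikes in the spot series, while the floor on the base caps the price.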
Parameter                  True value   Run 1      Run 2      Run 3      Run 4      Run 5
$\lambda_X$                100          102.562    101.162    105.627    104.041    106.12
$\lambda_L$                3            3.523      3.466      2.871      3.540      3.618
$\sigma_X$                 1.7          1.582      1.530      1.845      1.728      1.652
$\sigma_L$                 0.4          0.419      0.457      0.420      0.380      0.379
$\mu_L$                    1.9          1.951      1.952      1.875      1.803      1.974
$\alpha$                   -0.4         -0.402     -0.382     -0.461     -0.389     -0.406
$-\mathcal{L}(\hat\theta)$              -186.276   -179.261   -197.417   -186.248   -191.87

Table 5.1: Five different maximization runs with 800 observations.

Our data comprise almost five years of baseload day prices from July 1, 2002 to June 29, 2007, totaling 1267 observations. The dynamics of the spot prices for the considered period are shown in Figure 3.3.

[Figure 5.2: Plot of the true and estimated processes $L$ and $X$ of the NLMROU model (first 80 observations).]

Parameter     True value   Estimator   Std.
$\lambda_X$   100          102.4069    4.340
$\lambda_L$   3            2.964       0.795
$\sigma_X$    1.7          1.786       0.216
$\sigma_L$    0.4          0.3954      0.0921
$\mu_L$       1.9          1.8968      0.0216
$\alpha$      -0.4         -0.4009     0.038

Table 5.2: Estimation using one futures contract ($n = 800$).

To estimate the parameters of the deterministic part

$$h(t) = r + \beta_0 t + \beta_1\cos\left(\frac{\tau_1 + 2\pi t}{250}\right) + \beta_2\cos\left(\frac{\tau_2 + 2\pi t}{5}\right),$$

we ran the least squares method as we did in Chapter 3. See Table 3.5 for the estimated values of $r$, $\beta_0$, $\beta_1$, $\beta_2$, $\tau_1$ and $\tau_2$.

In the following we will analyze three time series of the EEX market: the periods July 1, 2002 - December 31, 2004 and January 1, 2005 - June 29, 2007, and the whole series. The seasonalities have been removed.

For simplicity and to reduce the number of parameters to estimate, we fixed the noise parameters to be constants: $\sigma_S = 0.1$ and $\sigma_Z = 0.01$. In view of the low correlation found between the processes $X$ and $L$ in Chapter 3, we took the correlation $\rho = 0$. The results on the parameter estimates are shown in Table 5.3.
Parameter     Part 1 estimator   Part 2 estimator   Whole series estimator
$\lambda_X$   127.599            116.036            121.557
$\lambda_L$   0.11178            0.753711           0.203681
$\sigma_X$    1.25847            0.756485           1.15815
$\sigma_L$    0.0834349          0.0730882          0.0901108
$\mu_L$       1.86049            1.77443            1.79335
$\alpha$      -0.438183          -0.489661          -0.414042

Table 5.3: Estimated values for the EEX market using $S_t$ and $F(t, T_1)$.

The parameters $\lambda_X$ and $\lambda_L$ relate, as in Chapter 3, to the 'half-life' of the mean reversion of the short-term and long-term processes. We obtained similar results for $\tau_X = \log 2/\lambda_X$, showing a half-life of 1-2 days. The estimated half-lives for $L$ differ significantly for the two periods, being about 6 years for the first period, and about 8 months for the second. While the MROU and NLMROU models give similar estimates for the speed of mean reversion of the short-term component $X$, the NLMROU model gives significantly slower mean reversion for the long-run process $L$.

The estimates of the nonlinearity parameter $\alpha$ are quite similar for the two periods. As in the case of the MROU process, comparison of the variances $V_L = \sigma_L^2/2\lambda_L$ and $V_X = \sigma_X^2/2\lambda_X$ shows that the dynamics of both $X$ and $L$ contribute significantly to the long-run variation of $S$. As before, the long-run process gives a somewhat greater contribution. We obtain the following results:

            $\sqrt{V_L}$   $\sqrt{V_X}$
Period 1    0.18           0.08
Period 2    0.06           0.05

Note that because we are applying a different function ($f(\cdot)$ rather than $\exp\{\cdot\}$) to $X$ to obtain the spot price, it does not make sense to directly compare the values of $\sigma_X$ and $\sigma_L$ with those obtained in Chapter 3.

In order to investigate whether the estimated parameters make sense, we simulated a path of the spot price, future prices and the long- and short-term mean processes that describe the model, using the estimated values from Table 5.3 for the whole data set; see Figure 5.3. Empirical moments of the EEX spot price versus simulated moments (averaged over 50 simulation paths) are shown in Table 5.4.
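The half-lives and long-run standard deviations discussed above follow directly from the estimates in Table 5.3. The short Python sketch below shows the arithmetic for the two sub-periods. It assumes that time is measured in years with 250 trading days per year, in line with the step $\Delta t = 1/250$ used in the simulations; the dictionary layout and function names are hypothetical.

```python
import math

TRADING_DAYS_PER_YEAR = 250  # assumption, matching the step 1/250 used earlier

# Estimates copied from Table 5.3 (Part 1 and Part 2).
ESTIMATES = {
    "Part 1": {"lam_x": 127.599, "lam_l": 0.11178,
               "sig_x": 1.25847, "sig_l": 0.0834349},
    "Part 2": {"lam_x": 116.036, "lam_l": 0.753711,
               "sig_x": 0.756485, "sig_l": 0.0730882},
}


def half_life_years(lam):
    """Half-life log(2)/lambda of a mean-reverting factor, in years."""
    return math.log(2.0) / lam


def long_run_std(sigma, lam):
    """Square root of the long-run variance sigma^2 / (2*lambda)."""
    return sigma / math.sqrt(2.0 * lam)


def summarize(p):
    return {
        "half_life_x_days": half_life_years(p["lam_x"]) * TRADING_DAYS_PER_YEAR,
        "half_life_l_years": half_life_years(p["lam_l"]),
        "sqrt_v_x": long_run_std(p["sig_x"], p["lam_x"]),
        "sqrt_v_l": long_run_std(p["sig_l"], p["lam_l"]),
    }


if __name__ == "__main__":
    for name, params in ESTIMATES.items():
        print(name, summarize(params))
```

The same two functions can be applied to the whole-series column of Table 5.3 by adding a third entry to the dictionary.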
Comparing the real spot price with the price produced by the model, we see that the model is able to produce significant price spikes with values similar to those for the real data. As with the real data, the spikes tend to bunch together. As well, the model exhibits periods of high variance; compare for example the period 700-950 in Figure 3.5 and 750-950 in Figure 5.3. The estimated parameters for the long-run mean process give quite slow mean reversion, and Figure 5.3 shows a process of this kind.

This model will tend to generate more spikes when the $L$ process is large, and one sees this at the end of the simulation, when the simulated spot has many small spikes in the range from €150 on. Similar features are seen in the real data in the period 800 to 1100 in Figure 3.5.

The model does however have some defects. The first is that the skewness of the log returns is close to zero, see Table 5.4 (this is likely to be a feature of any model without jumps). Thus this model is not able to capture the skewness which is present in the real data. It also appears that the model underestimates the kurtosis, compared with the real data.

In spite of these problems, this model appears to offer a significant improvement over the MROU model at a quite moderate 'cost', the cost being in terms of additional parameters and complications of parameter estimation. Although the model is far from perfect, its performance suggests that it is well worth considering more elaborate models of this type in the search for a good description of electricity prices.

[Figure 5.3: Simulation of spot and future prices using estimated values for the whole data set. Panels: spot $S = f(X)$; future $F(t,T) = E(S_T \mid S_t)$; MROU processes $X$ and $L$. Horizontal axis: days.]

            Real data (1)   Sim. data (1)   Real data (2)   Sim. data (2)
Mean        0.0019          0.0000          0.0000          -0.00037
Std Dev.    0.2214          0.60029         0.2066          0.46088
Skewness    -1.3125         -0.02066        -0.0800         0.04927
Kurtosis    29.8250         4.77539         8.1288          3.99024
n data      634             634             631             631

Table 5.4: The table shows the first four moments of the logarithmic deseasonalized price returns of observed data and the average of 50 simulated trajectories.

Chapter 6
Conclusions

In this thesis, based on the specific properties of electricity, we have proposed three process models that incorporate various features of power prices. We calibrated these models, using both spot and futures prices, to artificial and historical data, applying three filtering methods.

The first model is a two-factor linear Gaussian model. This models the log-spot price as a mean-reverting process, where the mean reversion is to a second (unobservable) "long run mean" process. This second process is modeled by an Ornstein-Uhlenbeck process. In this thesis (unlike the work reported in [4]) we used both the spot and future prices to estimate parameters. The state-space formulation of this model is suitable for the application of Kalman filtering techniques, and we used a maximum likelihood estimator based on the Kalman filter. This worked well for simulated data, and we then applied it to estimate parameters for the German EEX market. Simulations suggest that this model, with the fitted parameters, does fit some features of the real data. However, it definitely fails to exhibit some features of the real data, such as the jumps or price spikes seen in Figure 3.3.

This defect in the first model led us to consider a second model, which incorporates jumps. We kept the same basic form of a log-spot price and a long-term mean process, but added jumps to the log-spot price. For simplicity we took the distribution of both the upward and downward jumps to be exponential, with possibly different rates and parameters. The standard Kalman filter cannot be applied satisfactorily in this case, since the model is non-Gaussian.
One alternative, which requires rather weak assumptions on the distributions involved, is the particle filter. We developed code to use the particle filter for the second (jump diffusion) model based on the kernel approximation of the posterior suggested by Liu & West (2001). An empirical application on simulated data is presented to study the performance of the implemented algorithm. In general we observed that while the particle filter can work satisfactorily in estimating the unknown parameters, it is very sensitive to the particular form of dynamics for the artificial parameters used in the parameter estimation. In view of this it is hard to apply to real data.

The third model we presented is an extension of the nonlinear Ornstein-Uhlenbeck model (NLOU) proposed in Barlow [3]. While Barlow used only one factor to describe the dynamics of the spot price, in this thesis we consider a two-factor model and the same nonlinear transformation to model the spot price. The model captures the mean-reversion, jump and spike behavior observed in real markets. The model has the advantage over jump-diffusion models that it is Gaussian. Hence we can use the unscented Kalman filter algorithm to estimate the NLMROU model. We calibrated the model to the daily EEX market, obtaining similar simulated trajectories with the estimated parameters.

6.1 Future work

1. In Chapter 5 we analyzed a NLMROU model of the form

$$S_t = h(t) f(X_t)$$

where $(X_t, L_t)$ is given by (5.2)-(5.3). The original model in [3] was justified by considerations of supply and demand curves, and so it might be more realistic to consider a model of the form

$$S_t = f(X_t + h(t)) \tag{6.1}$$

where $h(t)$ is a deterministic seasonal correction. This model would have the merit of generating spikes during periods of high demand, without the need to consider different regimes, as is done in [45, 72].
The main obstacle to parameter estimation for (6.1) is that, since $\alpha$ is unknown, it is no longer possible to estimate $h(t)$ by a least squares method. One possible approach is an iterative method. One would first apply the UKF to the uncorrected series $(S_t, F(t,T))$, to obtain an initial estimate $\hat\alpha_1$. One then applies least squares to the series $f^{-1}(S_t)$, to obtain an estimate $\hat h_1$. Given $\hat h_1$, the spot component of the measurement equation is $v^{(1)}(t) = \log f(X_t + \hat h_1(t))$, which can be calculated as in Section 5.2. Using this, and a similar expression for the futures component, one can then apply the UKF to obtain a second round of parameter estimates, and in particular an improved estimate $\hat\alpha_2$ for $\alpha$. Iterating, one would hope to obtain estimates for $\alpha$ and $h(t)$. This procedure seems feasible to implement, but its convergence properties and stability are at this point not clear.

2. Improved parameter estimation using the particle filter. In this thesis we applied the Bayesian approach where an augmented state variable which includes the parameters is processed by the particle filter for the NLMROU model. We adopted the on-line estimation of parameters and state developed by Liu & West (2001). The main feature of this approach is that the variance matrix A shrinks step by step and finally converges towards 0. Hence the parameters could converge towards a wrong value, because A converges towards 0 before the true parameter value is reached. This problem cannot be avoided when we do not have prior knowledge about the parameters. That is why the method is very sensitive to the initial values of the added noise in the parameters. Along the same lines, another method that we can use is the 'practical filter' proposed by Polson et al. (2008) [46, 67]. Their approach is based on approximating the target posterior by a mixture of fixed-lag smoothing distributions.
According to the authors, unlike the particle filter approaches, it provides independent samples from the target distribution, does not suffer from particle degeneracies, and handles outliers and high dimensional problems well. Meyer-Brandis & Tankov (2008) [62] comment: “Sequential filtering makes less sense when the complete series is available for estimation”. Although computationally more intensive, the difficulties encountered in implementing the particle filter suggest this may be the correct approach. An approach used by Olsson et al. (2008) [65] and Wills et al. (2008) [81] is an off-line method performing maximum likelihood estimation via the EM algorithm. An essential component in the E-step is to approximate the 'smoothing distribution', that is $\{p_\theta(u_t \mid v_{1:n});\ t = 1, \ldots, n\}$. In the general nonlinear and non-Gaussian case various schemes have been proposed. The fixed-lag approximation is the simplest approach and it was first proposed in [54]. In [65] they apply the particle filter technique to smooth additive functionals based on a fixed-lag smoother. The method exploits the forgetting properties of the conditional hidden chain and is not affected by the degeneracy of the particle trajectories.

Approaches of this kind are well worth investigating for jump models of the type considered in Chapter 4.

Bibliography

[1] L. Anderson and D. Davison. A hybrid system-econometric model for electricity spot prices: Considering spike sensitivity to forced outage distributions. IEEE Transactions on Power Systems, 23(3):927-937, 2008.
[2] M.S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for on-line non-linear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Processing, 50(2):174-188, 2002.
[3] M. Barlow. A diffusion model for electricity prices. Mathematical Finance, 12:287-298, 2002.
[4] M. Barlow, M. Gusev, and M. Lai. Calibration of multifactor models in electricity markets.
International Journal of Theoretical and Applied Finance, 7(2):101-120, 2004.
[5] D.R. Beaglehole and M.S. Tenney. General solutions of some interest rate contingent claim pricing equations. Journal of Fixed Income, 1:69-83, 1991.
[6] F.E. Benth, J.S. Benth, and S. Koekebakker. Stochastic Modelling of Electricity and Related Markets. World Scientific, 2008.
[7] F.E. Benth, J. Kallsen, and T. Meyer-Brandis. A non-Gaussian Ornstein-Uhlenbeck process for electricity spot price modeling and derivatives pricing. Applied Mathematical Finance, 14(2):153-169, 2007.
[8] F.E. Benth and S. Koekebakker. Stochastic modeling of financial electricity contracts. Energy Economics, 30(3):1116-1157, 2008.
[9] T. Bjork. Arbitrage Theory in Continuous Time, 2nd Ed. Oxford University Press, 2004.
[10] S. Borovkova and H. Geman. Analysis and modelling of electricity futures prices. Studies in Nonlinear Dynamics & Econometrics, 10(3):Article 6, 2006.
[11] M.J. Brennan and E. Schwartz. Evaluating natural resource investments. Journal of Business, 1985.
[12] M. Burger, B. Klar, A. Mueller, and G. Schindlmayr. A spot market model for pricing derivatives in electricity markets. Journal of Quantitative Finance, 4:109-122, 2004.
[13] O. Cappé, E. Moulines, and T. Ryden. Inference in Hidden Markov Models. Springer, 2006.
[14] R. Carmona and M. Ludkovski. Spot convenience yield models for the energy markets. AMS Mathematics of Finance, 351:65-80, 2004.
[15] C. Carraro. Square root Kalman algorithms in econometrics. Computer Sciences in Economics and Management, 1:41-51, 1988.
[16] A. Cartea and G. Figueroa. Pricing in electricity markets: a mean reverting jump diffusion model with seasonality. Applied Mathematical Finance, 12(4):313-335, 2005.
[17] A. Cartea and P. Villaplana. Spot price modeling and the valuation of electricity forward contracts. Journal of Banking and Finance, 32(12):2502-2519, 2008.
[18] T. Chen, J. Morris, and E. Martin.
Particle filters for state and parameter estimation in batch process. Journal of Process Control, pages 665-673, 2005.
[19] L. Clewlow and C. Strickland. Valuing energy options in a one factor model fitted to forward prices. Working Paper, University of Sydney.
[20] L. Clewlow and C. Strickland. Energy Derivatives: Pricing and Risk Management. Lacima Group, 2000.
[21] M. Culot, V. Goffin, S. Lawford, S. de Menten, and Y. Smeers. An affine jump diffusion model for electricity. Working paper, 2006.
[22] D. Davison, L. Anderson, B. Marcus, and K. Anderson. Development of a hybrid model for electrical power spot prices. IEEE Transactions on Power Systems, 17(2):257-264, 2002.
[23] J.F.G. De Freitas, M. Niranjan, and A.H. Gee. Dynamic learning with the EM algorithm for neural networks. Journal of VLSI Signal Processing Systems, 26:119-131, 2000.
[24] S. Deng. Pricing electricity derivatives under alternative stochastic spot price models. 33rd Hawaii International Conference on System Sciences, 4:4025-4034, 2000.
[25] J.E. Dennis Jr. and R.B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM, 1996.
[26] A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001.
[27] D. Duffie, J. Pan, and K. Singleton. Transform analysis and asset pricing for affine jump-diffusions. Econometrica, 68(6):1343-1376, 2000.
[28] J. Durbin and S.J. Koopman. Time Series Analysis by State Space Methods. Oxford University Press, 2001.
[29] C. Erlwein, F. Benth, and R.S. Mamon. HMM filtering and parameter estimation of electricity spot price model. Revised for Energy Economics, 2009.
[30] A. Eydeland and K. Wolyniec. Energy and Power Risk Management: New Developments in Modeling, Pricing, and Hedging. John Wiley & Sons, 2003.
[31] H. Geman. Commodities and Commodity Derivatives. John Wiley & Sons, 2005.
[32] H. Geman and A. Roncoroni.
Understanding the fine structure of electricity prices. Journal of Business, 79(2):1225-1261, 2006.
[33] H. Geman and O. Vasicek. Forwards and futures on non storable commodities. RISK, August:93-97, 2001.
[34] Z. Ghahramani and S. Roweis. Learning nonlinear dynamical systems using an EM algorithm. Advances in Neural Information Processing Systems, 11:599-605, 1999.
[35] R. Gibson and E. Schwartz. Stochastic convenience yield and the pricing of oil contingent claims. Journal of Finance, 45(3):959-976, 1990.
[36] B. Gopaluni, T.B. Schon, and A. Wills. Particle filter approach to nonlinear system identification under missing observations with a real application. Proceedings of IFAC Symposium on System Identification, 2009.
[37] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to non-linear/non-Gaussian Bayesian state estimation. IEE Proceedings-F Radar Sonar Navig., 140(2):107-113, 1993.
[38] M. Grewal and A.P. Andrews. Kalman filtering: Theory and practice using MATLAB. Wiley, 2001.
[39] J.M. Griffin and S.L. Puller. Electricity deregulation: choices and challenges. Series in the Economics of Public Policy, 2005.
[40] J. Hamilton. Time Series Analysis. Princeton University Press, 1994.
[41] S. Haykin. Kalman Filtering and Neural Networks. John Wiley & Sons, 2001.
[42] D. Heath, R. Jarrow, and A. Morton. Bond pricing and the term structure of interest rates: A new methodology for contingent claim valuation. Econometrica, 60:77-105, 1992.
[43] T. Higuchi. Self-organizing time series model. Sequential Monte Carlo Methods in Practice (Doucet, de Freitas and Gordon, Eds), pages 429-444, 2001.
[44] S. Hikspoors and S. Jaimungal. Energy spot price models and spread options pricing. International Journal of Theoretical and Applied Finance, 10(7):1111-1135, 2007.
[45] R. Huisman and R. Mahieu. Regime jumps in electricity prices. Energy Economics, 25:425-434, 2003.
[46] M.S. Johannes, N.G. Polson, and J.R. Stroud.
Optimal filtering of jump diffusions: Extracting latent states from asset prices. Review of Financial Studies, 22(7):2559-2599, 2009.
[47] S.J. Julier and J.K. Uhlmann. Unscented filtering and nonlinear estimation. Proceedings of the IEEE, 92(3):401-421, 2004.
[48] R.E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME - Journal of Basic Engineering, 82(D):35-45, 1960.
[49] T. Kanamura and K. Ohashi. A structural model for electricity prices with spikes: Measurement of spike risk and optimal policies for hydropower plant operation. Energy Economics, 29(5):1010-1032, 2007.
[50] N. Kantas, A. Doucet, S.S. Singh, and J.M. Maciejowski. Overview of Sequential Monte Carlo methods for parameter estimation on general state space models. 15th IFAC Symposium on System Identification (SYSID), In Press, 2009.
[51] B.P. Kellerhals. Financial Pricing Models in Continuous Time and Kalman Filtering. Springer, 2001.
[52] V.A. Kholodnyi. A non-Markovian process for power prices with spikes and valuation of European contingent claims on power. Preprint, 2001.
[53] G. Kitagawa. Self-organizing state space model. Journal of the American Statistical Association, 93(443):1203-1215, 1998.
[54] G. Kitagawa and S. Sato. Monte Carlo smoothing and self-organizing state-space models. Sequential Monte Carlo Methods in Practice (Doucet, de Freitas and Gordon, Eds), 2001.
[55] S. Koekebakker and F. Ollmar. Forward curve dynamics in the Nordic electricity market. Managerial Finance, 31(6):73-94, 2005.
[56] A. Leon and A. Rubia. Testing for weekly seasonal unit roots in the Spanish power pool. Modelling prices in competitive electricity markets, D. W. Bunn Ed, pages 177-189, 2004.
[57] J. Liu and M. West. Combined parameter and state estimation in simulation-based filtering. Sequential Monte Carlo Methods in Practice (Doucet, de Freitas and Gordon, Eds), pages 197-223, 2001.
[58] J.S. Liu and R. Chen.
Sequential Monte Carlo methods for dynamic systems. Journal of the American Statistical Association, 93:1032-1044, 1998.
[59] J. Lucia and E. Schwartz. Electricity prices and power derivatives: evidence from the Nordic power exchange. Review of Derivatives Research, 5:5-50, 2002.
[60] M.R. Lyle and R.J. Elliott. A simple hybrid model for power derivatives. Energy Economics, 31, 2009.
[61] M. Manoliu and S. Tompaidis. Energy futures prices: term structure models with Kalman filter estimation. Applied Mathematical Finance, 9:21-43, 2002.
[62] T. Meyer-Brandis and P. Tankov. Multi-factor jump-diffusion models of electricity prices. International Journal of Theoretical and Applied Finance, 11(5):503-528, 2008.
[63] A. Misiorek, S. Trueck, and R. Weron. Point and interval forecasting of spot electricity prices: Linear vs. non-linear time series models. Studies in Nonlinear Dynamics and Econometrics, 10, 2006.
[64] N.K. Nomikos and O. Soldatos. Using affine jump diffusion models for modelling and pricing electricity derivatives. Applied Mathematical Finance, 15(1):41-71, 2008.
[65] J. Olsson, O. Cappé, R. Douc, and E. Moulines. Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli, 14, 2008.
[66] D. Pilipovic. Energy Risk: Valuing and Managing Energy Derivatives. McGraw-Hill, 1998.
[67] N.G. Polson, J.R. Stroud, and P. Muller. Practical filtering with sequential parameter learning. Journal of the Royal Statistical Society, Series B, 70(2):413-428, 2008.
[68] D. Raggi. Adaptive MCMC methods for inference on affine stochastic volatility models with jumps. The Econometric Journal, 8(2):235-250, 2005.
[69] B. Ristic, S. Arulampalam, and N. Gordon. Beyond the Kalman filter. Artech House, 2004.
[70] C.P. Robert and G. Casella. Monte Carlo Statistical Methods, 2nd Ed. Springer-Verlag, 2004.
[71] S. Roweis and Z. Ghahramani. A unifying review of linear Gaussian models.
Neural Computation, 11:305-345, 1999.
[72] G. Schindlmayr. A regime-switching model for electricity spot prices. Working Paper, 2005.
[73] E.S. Schwartz. The stochastic behavior of commodity prices: implications for valuation and hedging. Journal of Finance, 52(3):923-973, 1997.
[74] R.H. Shumway and D.S. Stoffer. Time Series Analysis and Its Applications: With R Examples, 2nd Ed. Springer, 2006.
[75] D. Simon. Optimal State Estimation. John Wiley & Sons, 2006.
[76] A.F.M. Smith and A.E. Gelfand. Bayesian statistics without tears: A sampling-resampling perspective. The American Statistician, 46(2):84-88, 1992.
[77] H. Tanizaki. Nonlinear Filters: Estimation and Applications. Springer-Verlag, 1996.
[78] J. Teichmann. A note on nonaffine solution of term structure equations with applications to power exchanges. Mathematical Finance, 15(1):191-201, 2005.
[79] R. van der Merwe, A. Doucet, and N. de Freitas. The unscented particle filter. Technical Report CUED, 380, 2000.
[80] P. Villaplana. Pricing power derivatives: a two-factor jump-diffusion approach. Business Economics Series (Working Paper), 2003.
[81] A. Wills, T. Schon, and B. Ninness. Parameter estimation for discrete-time nonlinear systems using EM. Proceedings of the 17th IFAC World Congress, Seoul, Korea, 2008.
[82] L. Xiong. Stochastic models for electricity prices in Alberta. University of Calgary, M.Sc. thesis, 2004.

Appendix A

An Affine Jump-Diffusion process (AJD) is a jump-diffusion process for which the drifts, covariances and jump intensities are linear in the state vector $U$. Duffie et al. (2000) [27] show that AJD processes are analytically tractable in general.

Let $U$ be a strong Markov process with realizations $\{U_t,\ t \in [0,\infty)\}$ in some state space $D \subset \mathbb{R}^n$, which solves the following stochastic differential equation

$$U_t = U_0 + \int_0^t \mu(U_s, s)\,ds + \int_0^t \sigma(U_s, s)\,dW_s + \sum_{i=1}^{m} Z^i_t. \tag{A-1}$$

The jump behavior of $U$ is governed by $m$ types of jump processes.
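Each jump type $Z^i$ is a pure jump process with a stochastic arrival intensity $\lambda^i(U_t, t)$ for some $\lambda^i : (D, t) \to \mathbb{R}$ and jump amplitude distribution $\nu^i_t$ on $\mathbb{R}^n$, where $\nu^i_t$ only depends on time $t$. The functions $\mu : (D, t) \to \mathbb{R}^n$ and $\sigma : (D, t) \to \mathbb{R}^{n \times n}$ are assumed to be Lipschitz continuous in order to guarantee that (A-1) has a unique solution. The process $W$ is a standard Brownian motion in $\mathbb{R}^n$. The process $U$ defined by (A-1) is said to be an affine jump-diffusion process if

$$\mu(U, t) = K_0(t) + K_1(t)U,$$

$$\sigma(U, t)\sigma'(U, t) = H_0(t) + \sum_{k=1}^{n} H_1^{(k)}(t)\,U_k,$$

$$\lambda^i(U, t) = l_0^i(t) + l_1^i(t)\cdot U,$$

where for each $t \in [0, \infty)$, $K_0(t) \in \mathbb{R}^n$, $K_1(t) \in \mathbb{R}^{n \times n}$, $H_0(t) \in \mathbb{R}^{n \times n}$ and is symmetric, and $H_1(t) \in \mathbb{R}^{n \times n \times n}$. Also, for $k = 1, \ldots, n$, $H_1^{(k)}(t)$, defined to be the matrix obtained by fixing the third index of $H_1(t)$ to be $k$, is in $\mathbb{R}^{n \times n}$ and is symmetric. Finally $l_0^i(t) \in \mathbb{R}$ and $l_1^i(t) \in \mathbb{R}^n$.

Notice that given an initial condition $U_0$, the tuple $(K_0, K_1, H_0, H_1, l_0, l_1, \theta)$ can be used to determine a transform $\Psi : \mathbb{C}^n \times [0, \infty) \times [0, \infty) \times D \to \mathbb{C}$ of $U_T$ conditional on $U_t$, $0 \le t \le T$, defined by

$$\Psi(u, t, T, U_t) := E[\exp\{u \cdot U_T\} \mid U_t], \tag{A-2}$$

where $E$ denotes the expectation under the distribution of $U_T$ determined by $(K_0, K_1, H_0, H_1, l_0, l_1, \theta)$. If we suppose $(K_0, K_1, H_0, H_1, l_0, l_1, \theta)$ is well-behaved at $(u, T)$, then the transform $\Psi$ of $U_t$, $0 \le t \le T$, defined by (A-2) exists and is given by

$$\Psi(u, t, T, U_t) = \exp\{M(u, t, T) + N(u, t, T) \cdot U_t\}. \tag{A-3}$$

Here $M(\cdot)$ and $N(\cdot)$ satisfy the following complex-valued Riccati equations:

$$\frac{\partial M(u, t, T)}{\partial t} = -A(N(u, t, T), t), \qquad M(u, T, T) = 0,$$

$$\frac{\partial N(u, t, T)}{\partial t} = -B(N(u, t, T), t), \qquad N(u, T, T) = u,$$

where, for any $c \in \mathbb{C}^n$,

$$A(c, t) = K_0(t) \cdot c + \frac{1}{2}c'H_0(t)c + \sum_{i=1}^{m} l_0^i(t)\bigl(\theta^i(c) - 1\bigr),$$

$$B(c, t) = K_1(t)'c + \frac{1}{2}c'H_1(t)c + \sum_{i=1}^{m} l_1^i(t)\bigl(\theta^i(c) - 1\bigr).$$

Here $\theta^i(c)$ is the "jump transform" for the $i$-th jump. It is given by

$$\theta^i(c) = \int_{\mathbb{R}^n} \exp\{c \cdot z\}\,d\nu^i(z) \tag{A-4}$$

whenever the integral is well defined.

We define the extended transform $\Phi : \mathbb{R}^n \times \mathbb{C}^n \times [0, \infty) \times [0, \infty) \times D \to \mathbb{C}$ of $U_T$ conditional on $U_t$ by

$$\Phi(v, u, t, T, U_t) = E[(v \cdot U_T)\exp\{u \cdot U_T\} \mid U_t]. \tag{A-5}$$

Given sufficient regularity the "extended transform" $\Phi$ can be computed by differentiation of the transform $\Psi$. Hence

$$\Phi(v, u, t, T, U_t) = \Psi(u, t, T, U_t)\{C(t) + D(t) \cdot U_t\} \tag{A-6}$$

where $\Psi$ is given by (A-3), and $C(\cdot)$ and $D(\cdot)$ satisfy

$$\frac{\partial C}{\partial t} = -K_0(t)'D - N'H_0(t)D - \sum_{i=1}^{m} l_0^i(t)\,\nabla\theta^i(N)D, \tag{A-7}$$

$$\frac{\partial D}{\partial t} = -K_1(t)'D - N'H_1(t)D - \sum_{i=1}^{m} l_1^i(t)\,\nabla\theta^i(N)D, \tag{A-8}$$

with the boundary conditions

$$C(T) = 0, \qquad D(T) = v. \tag{A-9}$$

Here $\nabla\theta^i(c)$ is the gradient of $\theta^i(c)$ with respect to $c \in \mathbb{C}^n$.

Appendix B

In this appendix we solve the ODEs (3.13)-(3.15) which arise in the calculation of futures prices in the AJD model. We begin with the ODEs arising from the MROU model. Recall that

$$\frac{dN_1}{dt} = \lambda_X N_1, \tag{B-1}$$

$$\frac{dN_2}{dt} = -\lambda_X N_1 + \lambda_L N_2, \tag{B-2}$$

$$\frac{dM}{dt} = -\lambda_L\mu_L N_2 - \frac{1}{2}\bigl(\sigma_X^2 N_1^2 + \sigma_L^2 N_2^2\bigr) - \rho\sigma_X\sigma_L N_1 N_2. \tag{B-3}$$

We begin with (B-1). We have $N_1 = N_1(t, T)$ and

$$N_1(u_1, T, T) = u_1. \tag{B-4}$$

If we fix $T$, so regard $N_1$, $N_2$ as functions of $t$ only, then (B-1)-(B-3) are ODEs and (B-1) has solution

$$N_1 = Ae^{\lambda_X t}.$$

Since, by (B-4), $A = u_1 e^{-\lambda_X T}$, therefore

$$N_1(u_1, t, T) = u_1 e^{\lambda_X(t-T)}. \tag{B-5}$$

We now treat (B-2):

$$\frac{dN_2}{dt} = -\lambda_X N_1 + \lambda_L N_2, \qquad N_2(u_2, T, T) = u_2. \tag{B-6}$$

Substituting for $N_1$, and multiplying by a factor $\mu$, we obtain

$$\mu\frac{dN_2}{dt} - \mu\lambda_L N_2 = -\lambda_X u_1 e^{\lambda_X(t-T)}\mu. \tag{B-7}$$

So

$$\frac{d(\mu N_2)}{dt} = \mu\frac{dN_2}{dt} + N_2\frac{d\mu}{dt} = \mu\frac{dN_2}{dt} - \mu\lambda_L N_2 \quad\text{provided}\quad \frac{d\mu}{dt} = -\lambda_L\mu, \quad\text{i.e.}\quad \mu = e^{-\lambda_L t}.$$

Substituting for $\mu$ in equation (B-7), we obtain

$$\frac{d(e^{-\lambda_L t}N_2)}{dt} = -\lambda_X u_1 e^{\lambda_X(t-T)}e^{-\lambda_L t} \tag{B-8}$$

$$= -\lambda_X u_1 e^{t(\lambda_X - \lambda_L)}e^{-\lambda_X T}, \tag{B-9}$$

so

$$e^{-\lambda_L t}N_2 = -\frac{\lambda_X u_1}{\lambda_X - \lambda_L}\,e^{t(\lambda_X - \lambda_L)}e^{-\lambda_X T} + C, \tag{B-10}$$

$$N_2 = -\frac{\lambda_X u_1}{\lambda_X - \lambda_L}\,e^{\lambda_X(t-T)} + Ce^{\lambda_L t}. \tag{B-11}$$

Now, solving for $C$,

$$N_2(u_2, T, T) = u_2 = -\frac{\lambda_X u_1}{\lambda_X - \lambda_L} + Ce^{\lambda_L T}, \tag{B-12}$$

therefore

$$C = u_2 e^{-\lambda_L T} + \frac{\lambda_X u_1}{\lambda_X - \lambda_L}\,e^{-\lambda_L T}. \tag{B-13}$$

Substituting $C$ in equation (B-11) gives

$$N_2(u, t, T) = -\frac{\lambda_X u_1}{\lambda_X - \lambda_L}\,e^{\lambda_X(t-T)} + e^{\lambda_L t}\left(u_2 e^{-\lambda_L T} + \frac{\lambda_X u_1}{\lambda_X - \lambda_L}\,e^{-\lambda_L T}\right) \tag{B-14}$$

$$= u_2 e^{\lambda_L(t-T)} + \frac{\lambda_X u_1}{\lambda_X - \lambda_L}\bigl(e^{\lambda_L(t-T)} - e^{\lambda_X(t-T)}\bigr). \tag{B-15}$$

Finally we solve (B-3):

$$\frac{dM}{dt} = -\lambda_L\mu_L N_2 - \frac{1}{2}\bigl(\sigma_X^2 N_1^2 + \sigma_L^2 N_2^2\bigr) - \rho\sigma_X\sigma_L N_1 N_2, \qquad\text{with } M((u_1, u_2), T, T) = 0. \tag{B-16}$$

Let $\alpha = -\lambda_X/(\lambda_X - \lambda_L)$.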
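Replacing the solution for $N_1$ and $N_2$ given by equations (B-5) and (B-15) in (B-16), and noting that $N_2 = (u_2 - \alpha u_1)e^{\lambda_L(t-T)} + \alpha u_1 e^{\lambda_X(t-T)}$, we have

$$\frac{dM}{dt} = -\lambda_L\mu_L\bigl[(u_2 - \alpha u_1)e^{\lambda_L(t-T)} + \alpha u_1 e^{\lambda_X(t-T)}\bigr] - \frac{1}{2}\bigl(\sigma_X^2 + \alpha^2\sigma_L^2 + 2\alpha\rho\sigma_X\sigma_L\bigr)u_1^2\,e^{2\lambda_X(t-T)}$$

$$\qquad - \frac{1}{2}\sigma_L^2(u_2 - \alpha u_1)^2 e^{2\lambda_L(t-T)} - \bigl(\alpha\sigma_L^2 + \rho\sigma_X\sigma_L\bigr)u_1(u_2 - \alpha u_1)\,e^{(\lambda_X + \lambda_L)(t-T)}.$$

Integrating both sides gives

$$M = -\frac{\lambda_L\mu_L\alpha u_1}{\lambda_X}\,e^{\lambda_X(t-T)} - \mu_L(u_2 - \alpha u_1)\,e^{\lambda_L(t-T)} - \frac{(\sigma_X^2 + \alpha^2\sigma_L^2 + 2\alpha\rho\sigma_X\sigma_L)u_1^2}{4\lambda_X}\,e^{2\lambda_X(t-T)}$$

$$\qquad - \frac{\sigma_L^2(u_2 - \alpha u_1)^2}{4\lambda_L}\,e^{2\lambda_L(t-T)} - \frac{(\alpha\sigma_L^2 + \rho\sigma_X\sigma_L)u_1(u_2 - \alpha u_1)}{\lambda_X + \lambda_L}\,e^{(\lambda_X + \lambda_L)(t-T)} + c.$$

Since $M((u_1, u_2), T, T) = 0$,

$$M = -\frac{\lambda_L\mu_L\alpha u_1}{\lambda_X}\bigl(e^{\lambda_X(t-T)} - 1\bigr) - \mu_L(u_2 - \alpha u_1)\bigl(e^{\lambda_L(t-T)} - 1\bigr) - \frac{(\sigma_X^2 + \alpha^2\sigma_L^2 + 2\alpha\rho\sigma_X\sigma_L)u_1^2}{4\lambda_X}\bigl(e^{2\lambda_X(t-T)} - 1\bigr)$$

$$\qquad - \frac{\sigma_L^2(u_2 - \alpha u_1)^2}{4\lambda_L}\bigl(e^{2\lambda_L(t-T)} - 1\bigr) - \frac{(\alpha\sigma_L^2 + \rho\sigma_X\sigma_L)u_1(u_2 - \alpha u_1)}{\lambda_X + \lambda_L}\bigl(e^{(\lambda_X + \lambda_L)(t-T)} - 1\bigr).$$

Now, taking $(u_1, u_2) = (1, 0)$ and defining $m = -\lambda_X/(\lambda_X - \lambda_L)$ we obtain

$$M(t, T) = m_1\bigl(e^{\lambda_X(t-T)} - 1\bigr) + m_2\bigl(e^{\lambda_L(t-T)} - 1\bigr) + m_3\bigl(e^{2\lambda_X(t-T)} - 1\bigr) + m_4\bigl(e^{2\lambda_L(t-T)} - 1\bigr) + m_5\bigl(e^{(\lambda_X + \lambda_L)(t-T)} - 1\bigr), \tag{B-17}$$

where

$$m_1 = -\frac{\lambda_L\mu_L m}{\lambda_X}, \qquad m_2 = \mu_L m, \qquad m_3 = -\left(\frac{\sigma_X^2}{4\lambda_X} + \frac{m^2\sigma_L^2}{4\lambda_X} + \frac{m\rho\sigma_X\sigma_L}{2\lambda_X}\right),$$

$$m_4 = -\frac{m^2\sigma_L^2}{4\lambda_L}, \qquad m_5 = \frac{m^2\sigma_L^2}{\lambda_X + \lambda_L} + \frac{m\rho\sigma_X\sigma_L}{\lambda_X + \lambda_L}.$$

We now turn to the case of the jump model. Here $N_1$ and $N_2$ are as before, but $M$ has two more components due to the jumps. We can write $M(t) = M_0(t) + M_1(t)$, where $M_0(t)$ is given by (B-17) and, for upward jumps with intensity $l_u$ and exponentially distributed sizes with rate $\eta_u$, $M_1(t)$ satisfies:

$$\frac{dM_1}{dt} = -l_u\frac{N_1}{\eta_u - N_1}, \qquad M_1(u, T, T) = 0. \tag{B-18}$$

Integrating both sides we have

$$M_1 = -l_u\int\frac{u_1 e^{\lambda_X(t-T)}}{\eta_u - u_1 e^{\lambda_X(t-T)}}\,dt \tag{B-19}$$

$$= -l_u\int\frac{dt}{-1 + \dfrac{\eta_u e^{\lambda_X T}}{u_1}\,e^{-\lambda_X t}}. \tag{B-20}$$

Applying the formula

$$\int\frac{dx}{a + be^{cx}} = \frac{1}{ac}\bigl[cx - \ln(a + be^{cx})\bigr], \tag{B-21}$$

with $a = -1$, $b = (\eta_u e^{\lambda_X T})/u_1$ and $c = -\lambda_X$ we get

$$M_1 = l_u\left[t + \frac{1}{\lambda_X}\ln\bigl(-1 + ((\eta_u e^{\lambda_X T})/u_1)\,e^{-\lambda_X t}\bigr)\right] + C. \tag{B-22}$$

Since $M_1(u, T, T) = 0$,

$$C = -l_u\left[T + \frac{1}{\lambda_X}\ln\left(-1 + \frac{\eta_u}{u_1}\right)\right]. \tag{B-23}$$

Therefore

$$M_1 = l_u(t - T) + \frac{l_u}{\lambda_X}\ln\left(-1 + \frac{\eta_u}{u_1}\,e^{-\lambda_X(t-T)}\right) - \frac{l_u}{\lambda_X}\ln\left(-1 + \frac{\eta_u}{u_1}\right)$$

$$= l_u(t - T) + \frac{l_u}{\lambda_X}\ln\left(\frac{\eta_u e^{-\lambda_X(t-T)} - u_1}{\eta_u - u_1}\right)$$

$$= \frac{l_u}{\lambda_X}\ln\left(\frac{\eta_u - u_1 e^{\lambda_X(t-T)}}{\eta_u - u_1}\right).$$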
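As a numerical sanity check on the closed-form solutions of this appendix, the following Python sketch evaluates $N_1$ and $N_2$ and verifies by central finite differences that they satisfy the linear system $dN_1/dt = \lambda_X N_1$, $dN_2/dt = -\lambda_X N_1 + \lambda_L N_2$ with terminal values $u_1$, $u_2$. The function names, step size and parameter values are hypothetical and chosen only for illustration; the check assumes the ODEs have exactly this form.

```python
import numpy as np


def n1(t, T, u1, lam_x):
    """Closed form N_1 = u_1 * exp(lam_x * (t - T))."""
    return u1 * np.exp(lam_x * (t - T))


def n2(t, T, u1, u2, lam_x, lam_l):
    """Closed form N_2 built from the two exponentials in (t - T)."""
    a = lam_x * u1 / (lam_x - lam_l)
    return (u2 * np.exp(lam_l * (t - T))
            + a * (np.exp(lam_l * (t - T)) - np.exp(lam_x * (t - T))))


def ode_residuals(t, T, u1, u2, lam_x, lam_l, h=1e-6):
    """Finite-difference derivative minus the right-hand side of each ODE."""
    d1 = (n1(t + h, T, u1, lam_x) - n1(t - h, T, u1, lam_x)) / (2.0 * h)
    d2 = (n2(t + h, T, u1, u2, lam_x, lam_l)
          - n2(t - h, T, u1, u2, lam_x, lam_l)) / (2.0 * h)
    v1 = n1(t, T, u1, lam_x)
    v2 = n2(t, T, u1, u2, lam_x, lam_l)
    return d1 - lam_x * v1, d2 - (-lam_x * v1 + lam_l * v2)


if __name__ == "__main__":
    # Illustrative values only: lam_x = 100, lam_l = 3, maturity T = 1.
    print(ode_residuals(t=0.99, T=1.0, u1=1.0, u2=0.5, lam_x=100.0, lam_l=3.0))
```

Both residuals should be close to zero up to discretization error, and the same pattern can be extended to the function $M$ by differencing it against the right-hand side of its ODE.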
