UBC Faculty Research and Publications

Verification of Mesoscale Numerical Weather Forecasts in Mountainous Terrain for Application to Avalanche.. Roeger, Claudia; Stull, Roland B.; McClung, David; Hacker, Joshua P.; Deng, Xingxiu; Modzelewski, Henryk 2003-12-31

Full Text

1140 WEATHER AND FORECASTING VOLUME 18
© 2003 American Meteorological Society

Verification of Mesoscale Numerical Weather Forecasts in Mountainous Terrain for Application to Avalanche Prediction

CLAUDIA ROEGER AND ROLAND STULL
Department of Earth and Ocean Sciences, The University of British Columbia, Vancouver, British Columbia, Canada

DAVID MCCLUNG
Department of Geography, The University of British Columbia, Vancouver, British Columbia, Canada

JOSHUA HACKER
National Center for Atmospheric Research, Boulder, Colorado

XINGXIU DENG AND HENRYK MODZELEWSKI
Department of Earth and Ocean Sciences, The University of British Columbia, Vancouver, British Columbia, Canada

(Manuscript received 18 June 2002, in final form 30 May 2003)

ABSTRACT

Two high-resolution, real-time, numerical weather prediction (NWP) models are verified against case study observations to quantify their accuracy and skill in the mountainous terrain of western Canada. These models, run daily at the University of British Columbia (UBC), are the Mesoscale Compressible Community (MC2) Model and the University of Wisconsin Nonhydrostatic Modeling System (NMS). The main motivations of this work are: 1) to extend the lead time of avalanche forecasts by using NWP-projected meteorological variables as input to statistical avalanche threat models; and 2) to create another tool to help avalanche forecasters in their daily decision-making process.

Observation data from the Whistler/Blackcomb ski area in the British Columbia (BC) Coast Mountains and from Kootenay Pass in the Columbia Mountains of southeast BC are used to verify the forecasts. The two models are run with grid spacings of 3.3 km (MC2) and 10 km (NMS) over Whistler/Blackcomb, and with 2, 10 (MC2), and 30 km (NMS) over Kootenay Pass. The quality of the forecasts is measured using standard statistical methods for those variables that are important for avalanche forecasting.
It is found that the raw model output has biases that can be easily removed using Kalman filter predictor postprocessing. The resulting automatically corrected forecasts have quite small absolute errors in temperature (0.7°C). It is also found that the coarser-resolution NMS model produces comparable results to the finer-resolution MC2 model for precipitation at Kootenay Pass. These objective forecast errors are of the same order of magnitude as the meteorological observation (sampling/representativeness) errors in the snowy, windy mountainous terrain, resulting in forecasts that have value in extending the range of avalanche forecasts for locations such as Kootenay Pass, as discussed in a recent study by Roeger et al.

Corresponding author address: Roland Stull, Atmospheric Science Programme, Dept. of Earth and Ocean Sciences, 6339 Stores Rd., Vancouver, BC V6T 1Z4, Canada.
E-mail: rstull@eos.ubc.ca

1. Introduction

Forecast verification, as discussed in this paper and understood in meteorological literature, is concerned with measuring the quality of a forecast. In general, "the process and practice of determining the quality and value of forecasts" is called forecast evaluation (Murphy and Daan 1985). Two types of forecast evaluation with different goals can be distinguished: empirical evaluation (verification) with the goal to determine the quality of a forecast; and decision–theoretic (or operational) evaluation, which is important to relate the value of a forecast to its users. Work in this latter area has been concerned with the development of measures of the monetary value of forecasts. For avalanche forecasting, the value of the forecast depends highly on the quality of the forecasts.

Along with accuracy, skill is also an important measure of the quality of a forecast. In this context, accuracy is the ability of a forecast to match the observation and the extent to which a forecast agrees with the measurement (Roeger et al. 2001).
A forecast of good quality may also show skill, which is the degree of correctness above some reference baseline, such as a climatological average. Thus, by determining the accuracy and skill of a forecast, one can improve it and use it with confidence in the future.

DECEMBER 2003 ROEGER ET AL. 1141

Although these theoretical ideas about weather forecast verification are well known, not many verification results are actually published or are easily accessible for mesoscale models in complex terrain. While most weather forecast centers have their own routine model verification schemes using several statistical methods for their global- and regional-scale models, results of those verifications often appear only in technical reports or internal Web pages. Therefore, it was not possible to compare our results to other results. Similarly, we know through personal communications with colleagues at workshops and conferences that attempts are being made to apply mesoscale models to complex terrain, but these ideas are young and few verification results have been published except for a handful of "golden" case study days. We encourage the routine use and verification of mesoscale models in complex terrain.

The accuracy of the Mesoscale Compressible Community (MC2) Model and the University of Wisconsin Nonhydrostatic Modeling System (NMS) is determined with statistical verification against both manual and automatic surface weather observations for continuous as well as categorical variables at two avalanche sites in British Columbia (BC), Canada. Numerical weather forecasts depend highly on the initial conditions and the topography estimation in mountainous terrain, and hence, the resolution of the model grid.
To estimate this dependence, which is especially important in mountainous terrain, numerical weather prediction (NWP) models are run at the University of British Columbia (UBC) with slightly different initial conditions and with different grid resolutions for the same forecast period. This is done to estimate the improvement using a higher-resolution grid and to reveal the effect of different topography approximations from each model.

Snow avalanche forecasting is a complex problem, based on the interaction of weather, terrain, and the snowpack. It is defined as the prediction of current and future snow instability in space and time relative to a given triggering level. The goal of avalanche forecasting is to minimize the uncertainty about instability introduced by the temporal and spatial variability of the snow cover (including terrain influences), any incremental changes in snow and weather conditions, and any variations in human perception (McClung 2000).

Statistical avalanche prediction refers to the organization of a database of previously measured parameters, including avalanche occurrences, for use with a computer to help compare current or expected future conditions with past ones. There are many different parameters that contribute toward snowpack instability, but primary emphasis is on meteorological data (McClung and Schaerer 1993), not only because they are usually measured by instruments at regular intervals and therefore are relatively easy to get, but also because snow avalanche forecasting is a multiscale problem (LaChapelle 1980; McClung and Schaerer 1993; McClung 2000). Office-based forecasters often need to predict avalanches for an entire mountain range or parts of ranges, for which high-quality meteorological information is more relevant and can assume greater importance than local snow-stability information.

Precipitation and temperature are the key variables for dry or wet avalanche forecasting, respectively.
Dry avalanches are most often slab avalanches that occur due to an initial failure underneath a wind-packed layer of snow. This slab may be of several centimeters to more than a meter in thickness, and its fracture line can reach over entire mountain slopes. Once in motion, the slab breaks into blocks and particles, which (if the original snow is very dry) may result in the separation of a dust cloud with very low density.

Wet avalanches are often loose avalanches, usually triggered by heavy melt due to warming. Loose snow avalanches, as opposed to the cohesive nature of slab avalanches, start from a point at or near the surface snow and spread out in a triangular pattern as they move down the slope. For this avalanche type the snow must have low cohesion. The cohesion of snow decreases with increasing water content; namely, wet snow has less cohesion than dry snow. Warm-up-related avalanching can abruptly occur when the air temperature warms to 0°C in the initiation areas (McClung and Schaerer 1993).

Due to the great variety of climate zones in Canada, the demand for avalanche prediction is at the mesoscale (horizontal scales of 2–1000 km), which requires more accurate finescale prediction than for synoptic-scale forecasts (1000–20 000 km). The avalanche hazard is concentrated in local areas where people and facilities are present in mountainous regions (McClung 1995). Any avalanche model is dominated by the interaction of weather with terrain and the physical processes in the snow cover, which leads to avalanche formation. Therefore, detailed networks of meteorological and snowpack measurements combined with avalanche observations are necessary for good avalanche forecasts (Foehn 1998).

The comparison of output variables from NWP models and input variables for statistical avalanche forecasting models (AFM) shows that many of the NWP variables can be applied directly in an AFM or can easily be derived.
The remaining AFM variables are usually measured in the field and cannot be obtained directly from standard weather forecasts. But they can be estimated or approximated with empirical relationships. When weather forecasts are reasonably accurate on the local scale and they are included in avalanche forecasting models, the two fields may be combined successfully, allowing the prediction of future snowpack instabilities and avalanches (Roeger et al. 2001).

This paper contains comprehensive results of the weather forecast verification and the methods used. The avalanche sites and numerical model characteristics are identified in section 2, and statistical verification methods are discussed in section 3. Verification results for the key avalanche-prediction variables of wind, precipitation, and temperature are presented in section 4, with conclusions in section 5.

FIG. 1. Map of southwestern BC, and northern Washington (WA), indicating the site locations. WB: Whistler/Blackcomb, KP: Kootenay Pass.

TABLE 1. Parameters used from each station at Whistler/Blackcomb and Kootenay Pass and their type of observation (M: manual; R: remote); asl: above sea level.

Site                Weather station               Parameters
Whistler/Blackcomb  Catskinner (1550 m asl)       Temperature (R); Precipitation (M)
Whistler/Blackcomb  Horstman Hut (2240 m asl)     Temperature (R); Wind speed (R); Wind direction (R)
Whistler/Blackcomb  Whistler Alpine (1825 m asl)  Temperature (R); Wind speed (R); Wind direction (R)
Whistler/Blackcomb  Pig Alley (1650 m asl)        Precipitation (M)
Kootenay Pass       Kootenay Pass (1780 m asl)    Temperature (R); Precipitation (R, M); Wind speed (M); Wind direction (M)
Kootenay Pass       Stagleap (2140 m asl)         Temperature (R); Wind speed (R); Wind direction (R)

2. Data

Data from two different sites are used. The ski area Whistler/Blackcomb (50.05°N, 122.9°W) in the Coast Mountains in BC represents a maritime mountain climate, which is characterized by relatively heavy snowfall and relatively mild temperatures, resulting in deep snow covers and the possibility of rain at any time during the winter.
Kootenay Pass (49.05°N, 117.0°W) in the southern Selkirk Mountains (Columbia Mountain Range) of southeastern BC represents a transitional climate zone, midway between a maritime and a continental climate (McClung and Schaerer 1993; Armstrong and Armstrong 1987). While a continental snow climate is characterized by relatively low snowfall (shallow snow covers), cold temperatures, and a location considerably inland from coastal areas, the transitional snow climate zone shows higher precipitation amounts resulting in middeep snow covers and temperatures cold enough for only snow events during midwinter, but also mostly located inland from coastal areas. Figure 1 shows a map with the two sites indicated.

In addition to two different climate zones, these sites represent two different types of operations (ski area versus highway operation) affected by avalanches. The ski area Whistler/Blackcomb is concerned about avalanches that may start on or above ski runs. While most ski runs experience relatively low avalanche danger due to constant grooming and skier traffic throughout the season, steep less-trafficked slopes higher above may require systematic avalanche control programs. To avoid large hazardous avalanches, some ski runs must be closed regularly so that smaller avalanches can be triggered intentionally. These closures should be short in time and locally limited. The time of concern is during ski hours, roughly between 0900 and 1600 local time.

The highway operation at Kootenay Pass is concerned with avalanches large enough to cover parts of the highway. Avalanche mitigation efforts primarily consist of either hand or artillery control. Because large avalanches need to be avoided due to the high costs associated with highway closures, avalanche control is a 24 h day⁻¹ concern.

With data from six meteorological observation stations, as described in detail later, a wide range of different locations is covered.
Elevation of the six stations varies from 1550 m (Catskinner) to 2240 m (Horstman Hut); their surrounding topography varies from a partly sheltered location at midmountain (Pig Alley) to a location on top of a mountain ridge, well exposed to the wind (Stagleap). By using data from these six stations the behavior of the models in complex terrain is tested. Of most interest is model performance and output quality for different elevations and topographical characteristics. This information is critical for both model developers and end users.

a. Meteorological observations

Observation data from Whistler/Blackcomb were from automatic weather stations as well as manual observations taken by ski-patrol avalanche forecasters. Remote, automatic weather stations record weather conditions hourly or every 15 min, depending on the station. Manual observations are done twice daily. Table 1 lists the parameters used from each weather station. Except for precipitation, which is observed manually, data from remote stations have been available for two winters, 1998/99 and 1999/2000. Precipitation was verified with data from 1999/2000.

FIG. 2. Map of ski area Whistler/Blackcomb with weather stations indicated. Not to scale. Distance between the two peaks is ≈6500 m.

FIG. 3. Topographic map of Whistler/Blackcomb with locations of weather stations and other points of interest.

FIG. 4. Topographic map of Kootenay Pass area with the highway and locations of weather stations indicated.

Precipitation rate (mm h⁻¹) was collected hourly from gauge measurements (remote observations) and twice daily with snow-measurement boards (manual observations: solid precipitation) at Kootenay Pass. At Whistler/Blackcomb, snow measurements were assessed with manual snowboards at the weather stations Pig Alley and Catskinner.

For better orientation Fig. 2 shows a drawing map (not to scale) of the ski area Whistler/Blackcomb with the weather stations and their altitude indicated.
Figure 3 shows a topographic map of the area with the scale given. The distance between Blackcomb Peak and Whistler Peak is 6.6 km. From Whistler Village in the valley, the distances are 6.8 km to Blackcomb Peak and 5.9 km to Whistler Peak.

Catskinner (1550 m asl) and Horstman Hut (2240 m asl) are on Blackcomb Mountain (Fig. 2). Precipitation is measured at Catskinner, which is on the southwest side at midmountain elevation, neither particularly sheltered nor exposed. (Note that an ideal site would have had sheltered locations for precipitation, and well-exposed locations for wind; however, such designs are difficult to implement in some places due to the lay of the land and the forest cover.) Horstman Hut is located on the NW–SE-aligned ridge, northwest of Blackcomb Peak. The station is well exposed to winds from all directions.

At Whistler Mountain, temperature, wind speed, and wind direction were measured at Whistler Alpine at the ski-patrol building near the Roundhouse Lodge (1825 m asl). Since the site is in open terrain, neither sheltered nor particularly exposed, it is an ideal location for wind data verification. The avalanche forecasters also use wind data from other stations, representing more specific locations on the mountain.

The field site for precipitation at Whistler (Pig Alley) is at 1650 m asl elevation in a central location of the ski area. The site is surrounded by trees, but fairly open so that the trees have little influence on the measurements. The location has proven to be representative of snow amount at midelevation (J. Tindle, avalanche forecaster, 1999, personal communication).

Observation data from the Kootenay Pass site are described in detail in Roeger et al. (2001). The operation consists of two weather stations collecting manual and remote data: Kootenay Pass and Stagleap. A topographic map of the area with the scale is given in Fig. 4.
The manual observation site at the summit of Kootenay Pass is located at 1780 m asl elevation in an open area surrounded by trees. It is fairly sheltered and therefore wind observations here might be too slow, with direction that is less meaningful. Precipitation measurements are representative for the area, and temperatures are typical for this elevation. Temperature is measured at shelter height above the ground or snow surface. Stagleap is a remote weather station at the top of a ridge (2140 m asl) and well exposed to the wind. Wind speeds are therefore typical for this mountain ridge and elevations, but are not representative for some avalanche starting zones at midmountain elevation, especially on the lee side. At this station, winds are measured remotely (anemometer) atop a 10-m-high tower. Data from both areas are gathered according to the guidelines from the Canadian Avalanche Association (CAA 1995).

FIG. 5. Forecast grids of MC2 and NMS models run at UBC. The NMS 30-km grid covers the same domain as the MC2 30-km grid.

In general, dysfunctional measurement devices can be a problem especially related to winter weather, for example, frozen anemometers due to significant riming effects. At Kootenay Pass, information about the working condition of the instruments and a first check of the measured value (within a certain range depending on the variable) is done automatically with the measurements (true/false signal). At Whistler/Blackcomb, the avalanche forecasters regularly check their remote, automatic measurements with additional manual measurements as well as examine the significant values by eye. Therefore, the avalanche forecasters know which data are reliable, and a certain standard is maintained at both sites. For this project, all measured data were again examined in detail and only correctly measured data (within its range of uncertainty related to the measurement itself) have been chosen for verification.
As a result of this process, 10% of the data (some temperatures at Whistler/Blackcomb, and some winds at Kootenay Pass) were rejected based on observer discretion and evaluation, in order to ensure reliable results.

b. Meteorological forecasts

The two research NWP models used here are the MC2 (version 4.8), refined by Environment Canada's Numerical Prediction Research group (RPN), and the University of Wisconsin NMS. Both were run in real time for this verification study, making daily forecasts on multiple grids out to 48 h into the future with no manual ex post facto tuning, in order to simulate operational conditions.

The MC2 model (Benoit et al. 1997, 2002) utilizes nonhydrostatic, fully compressible, non-Boussinesq dynamics, and is discretized on an Arakawa C grid using semi-Lagrangian numerics and semi-implicit time differencing. The coordinate system is polar stereographic in the horizontal, and modified Gal-Chen in the vertical. The top boundary utilizes an absorbing layer, while lateral boundaries are nested with a "sponge" region. Bottom-boundary fluxes of heat, moisture, and momentum are parameterized using bulk-transfer and similarity algorithms between a force–restore soil layer and a 1.5-order closure turbulence scheme with turbulence kinetic energy prediction in a diffusive boundary layer. Cumulus convection is parameterized using the Zhang and Fritsch method; mixed-phase microphysics with the Sundqvist scheme; and radiation with Fouquart–Bonnel and Garand schemes (see Benoit et al. 1997 for details). Surface conditions (vegetation, snow cover, sea surface temperature, albedo, etc.) are from climatology fields from the Canadian Meteorological Centre (CMC) of Environment Canada.
Benoit and colleagues (1997) have used the model both for real-time operational forecasts over all of North America, and for very fine resolution (3-km horizontal grid spacing) forecasts over the Alps for the Mesoscale Alpine Experiment (MAP; see Benoit 2002).

MC2 was run at UBC with horizontal gridpoint spacings of 90, 30, 10, 3.3, and 2 km, where the finer grids in small domains were one-way nested inside coarser, larger-domain grids (see Fig. 5). The two highest resolutions (smallest grid spacing) have been used for verification, and these grids have 35 layers in the vertical. These horizontal grid spacings are 3.3 and 10 km over Whistler/Blackcomb, and 2 and 10 km over Kootenay Pass. Two resolutions were used in order to compare the specific improvements related to increasing resolution. The 10-km grid has X × Y × Z = 85 × 60 × 19 grid points, the 3.3-km grid has 141 × 141 × 35 grid points, and the number of grid points of the 2-km grid is 60 × 60 × 35 (all resolutions are true at 60°N).

The NMS model was developed primarily by G. Tripoli at the University of Wisconsin (Tripoli 1992). It uses a nonhydrostatic, quasi-compressible, non-Boussinesq formulation on local spherical horizontal coordinates and Gal-Chen vertical coordinates. Dynamics utilize an enstrophy-conserving second-order leapfrog scheme on an Arakawa C grid, while the thermodynamics use a flux-conservative sixth-order Crowley scheme. The upper boundary has an absorbing layer, while radiative lateral boundaries are used.
A multilayer soil model is used with a Tremback and Kessler parameterization, with Louis surface layer similarity, 1.5-order turbulent kinetic energy (TKE) turbulence closure, a cumulus convection scheme by Kuo and Anthes, mixed-phase microphysics of Flatau et al., and radiation parameterization of Chen and Cotton (see Tripoli 1992 for details). Tripoli has used this model to simulate convection and banding in hurricanes, for daily real-time forecasts for the midwestern United States, and to forecast and simulate winter convection snowbands over Lake Michigan (Kristovich et al. 2000).

FIG. 6. Highly simplified scheme of interpolation (nonlinear). Two verification points within the same grid cell have different values. The value 6.7 derives from the interpolation between its nine surrounding grid cells, sketched with the solid line. The dashed lines show interpolation for the value 7.5.

NMS was run at UBC for two-way interactive nests with 90-, 30-, and 10-km grid spacing. For verification, 10 km was used for Whistler/Blackcomb and 30 km for Kootenay Pass (because the latter location was outside the operational 3.3-km domain, which was limited by computer power). The number of grid points is 50 × 68 × 24 for the 10-km gridpoint spacing and 68 × 80 × 28 for the 30-km spacing. The vertical domain is also nested (viz., the top of the finest-mesh grid is below the top of the coarser grids). For each weather station, forecast values from the surrounding four or nine grid points have been interpolated to calculate the forecast for the exact location. Figure 6 shows a highly simplified scheme of interpolation (nonlinear), which explains why two verification points have different values although they may be located within the same grid cell.

Initial and boundary conditions for MC2 and NMS coarse grids (90-km grid spacing) are from Eta Model forecasts (U.S. National Centers for Environmental Prediction), valid every 3 h from 0 to 48 h.
In turn, forecasts from MC2 and NMS coarse meshes provide the boundary conditions for the embedded finer meshes.

For the MC2 model, both the raw NWP forecasts and forecasts improved by the Kalman-predictor postprocessing correction method (Bozic 1979) were compared with observation data. The Kalman-predictor correction is an automatic postprocessing method (a type of model-output statistics) that uses the observation and the original forecast from the day before to calculate the model error. It then predicts the model error for the next day and uses it to correct the forecasts. This recursive, adaptive method "learns" on the fly (see description in appendix B) and does not need an extensive, static database to be trained. It can be used for every forecast where observation data are also available. For days of missing observations, it uses the unaltered corrections from the day before. The Kalman-predictor correction method applied to output from both the NWP models has been tested for all parameters to measure its overall improvement compared to the raw model output.

For the verification of temperature, wind speed, and wind direction, the forecasts were divided into two forecast time periods. The first includes forecasts from 0 to 24 h. The second covers forecasts that are valid 24–48 h into the future. For precipitation, only 0–24-h forecasts could be verified because of gaps in the MC2 forecasts during the 2- and 3.3-km grid test period at the beginning of this project.

3. Evaluation methods

a. Evaluation methods for continuous variables

For continuous variables, standard statistical methods as well as graphical techniques have been used. Emphasis was on robust and resistant mathematical measures. The mathematical measures include interquartile range (IQR) for information about the variation/spread of the dataset, and the median (0.5 quantile, q0.5) as a single representative number for the dataset.
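The contrast between resistant measures (median, IQR) and non-resistant ones (mean, standard deviation) can be illustrated with a minimal Python sketch. The function name `summarize` is hypothetical, the quartiles use the default method of Python's `statistics.quantiles` (the paper does not state which quantile definition was used), and the sample series is the paper's own outlier example from this section.

```python
import statistics

def summarize(data):
    """Return resistant (median, IQR) and non-resistant (mean, stdev) summaries.

    Assumption: quartiles follow the default 'exclusive' method of
    statistics.quantiles; other quantile definitions give slightly
    different IQR values.
    """
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }

# Series A contains one outlier (123); series B is the same without it.
A = [12, 14, 13, 15, 12, 14, 13, 123]
B = A[:-1]

for name, series in (("A", A), ("B", B)):
    stats = summarize(series)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The single outlier roughly doubles the mean and inflates the standard deviation, while the median and IQR change only slightly.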
Descriptive statistical parameters (mean M, standard deviation s, variance y) have been calculated as well, but they may be neither robust nor resistant. Robustness and resistance are two aspects of insensitivity to assumptions about the nature of a set of data. Robust methods are generally not sensitive to particular assumptions about the overall nature of the data (e.g., it is not necessary to assume that the data have a Gaussian distribution). A resistant method is not strongly influenced by outliers. As an example, the data series A = [12; 14; 13; 15; 12; 14; 13; 123] contains an outlier (123) about which we do not know if it is a correct measurement (physically possible) or even a typo. The data series B is the same but without the outlier: B = [12; 14; 13; 15; 12; 14; 13]. While the mean M_A = 27 is strongly affected by the outlier (the mean M_B = 13.3), the median q0.5,A = 13.5 is not (median q0.5,B = 13); hence, the median is resistant.

A summary of statistical verification equations is given in appendix A. For information about the linear relationship between two datasets, the correlation coefficient [Pearson product–moment; Eq. (A1)] was used. Basic absolute measures for ordinal predictands are the mean error [ME; Eq. (A2)], the mean absolute error [MAE; Eq. (A3)], the mean square error [MSE; Eq. (A4)] and the root-mean-square error [RMSE; Eq. (A5)].

TABLE 2. Indicators for phase and magnitude errors.

Precipitation                              Temperature
Storm duration (h)                         Time of max temperature
Start time of storm cycle                  Time of min temperature
Time of max precipitation rate             Max temperature (°C)
Accumulated precipitation (mm)             Min temperature (°C)
Max precipitation rate (mm (3 h)⁻¹)
Avg precipitation rate (mm h⁻¹)

b. Evaluation methods for categorical variables

For nominal predictands, contingency tables were used for measurements of accuracy (see Table A1 in appendix A, illustrating a 2 × 2 contingency table).
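Looking back at the continuous-variable measures named in section 3a (ME, MAE, MSE, RMSE), the following Python sketch shows how they can be computed from paired forecasts and observations. Appendix A is not reproduced in this section, so the standard textbook definitions are assumed, with the error taken as forecast minus observation; the function name `error_measures` is hypothetical.

```python
import math

def error_measures(forecasts, observations):
    """Compute ME, MAE, MSE and RMSE for paired forecasts and observations.

    Assumption: error = forecast - observation, so a positive ME means
    the model overforecasts on average.
    """
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need two non-empty series of equal length")
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "ME": sum(errors) / n,                    # mean error (bias)
        "MAE": sum(abs(e) for e in errors) / n,   # mean absolute error
        "MSE": mse,                               # mean square error
        "RMSE": math.sqrt(mse),                   # root-mean-square error
    }

# Illustrative values only (not data from the paper).
scores = error_measures([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0])
print(scores)
```

ME can be close to zero even when individual errors are large, because positive and negative errors cancel; MAE and RMSE do not have this property, which is why they are reported alongside ME.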
Our measurements include the hit rate (H), the probability of detection (POD), the false-alarm ratio (FAR), and the bias ratio (BIAS). These quantities are given as Eqs. (A6)–(A9).

The hit rate (or the percentage of forecast correct) is the ratio of correct forecast events to the total number of events. The worst possible hit rate is zero. A value of 1 would represent a "perfect forecast."

The bias ratio is the comparison of the average forecast with the average observation. It is the ratio of the "yes" forecasts to the number of yes observations. The value BIAS = 1 indicates that the event was forecast correctly the same number of times that it was observed. Bias ratios greater than 1 indicate that the event was forecast more often than it was observed (overforecasting). Conversely, bias ratios less than 1 indicate underforecasting. The bias is not an accuracy measure because it says nothing about the correspondence between the forecasts and observations of the event on particular occasions (Wilks 1995).

Equations (A10) and (A11) show the Heidke skill score (HSS) and the true skill score (TSS). They are derived by contingency table analysis as well.

The Heidke skill score is based on the actual forecast hit rate relative to the hit rate expected for random forecasts, which is used as a baseline or reference accuracy measure. Forecasts equivalent to the reference forecasts receive 0 scores. Negative scores are given to forecasts that are worse than the reference forecasts. Perfect forecasts receive a Heidke score of 1 (Wilks 1995).

TSS is a measure of true forecast skill. In short, the true skill score is the POD, adjusted by the POFD (probability of false detection); namely, TSS = POD − POFD. It was originally proposed by Peirce (1884), then known as the Hanssen–Kuipers discriminant or Kuipers' performance index (Murphy and Daan 1985), or referred to as the true skill score as discussed in Flueck (1987) (Wilks 1995).
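The categorical scores named above can be sketched in Python for a 2 × 2 contingency table. Equations (A6)–(A11) are not reproduced in this section, so the sketch assumes the standard definitions given in Wilks (1995); the cell labels `a` (hits), `b` (false alarms), `c` (misses), `d` (correct negatives) and the function name `contingency_scores` are illustrative choices.

```python
def contingency_scores(a, b, c, d):
    """Scores for a 2x2 contingency table (standard definitions assumed).

    a: event forecast and observed (hits)
    b: event forecast but not observed (false alarms)
    c: event observed but not forecast (misses)
    d: event neither forecast nor observed (correct negatives)

    Raises ZeroDivisionError if the event was never forecast, never
    observed, or always observed, because some ratios are then undefined.
    """
    n = a + b + c + d
    pod = a / (a + c)       # probability of detection
    pofd = b / (b + d)      # probability of false detection
    return {
        "H": (a + d) / n,               # hit rate (fraction correct)
        "POD": pod,
        "FAR": b / (a + b),             # false-alarm ratio
        "BIAS": (a + b) / (a + c),      # yes forecasts / yes observations
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "TSS": pod - pofd,              # TSS = POD - POFD
    }

# Illustrative counts only (not data from the paper).
print(contingency_scores(a=30, b=10, c=20, d=40))
```

With these counts BIAS is below 1, which corresponds to underforecasting in the sense described above: the event was observed 50 times but forecast only 40 times.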
It is similar to the Heidke skill score but the random forecast that is taken into account is constrained to be unbiased. Similarly, a value of 1 represents a perfect forecast, 0 is random/neutral, and negative values indicate forecasts that are inferior to a random forecast.

c. Time series analysis

In order to assess phase errors, time series analyses were performed on two variables: precipitation and temperature. Here, the main interest was to look at specific storm cycles to see how the models perform in terms of the timing and amount of precipitation. Precipitation-event timing is the key for dry avalanche forecasting. Temperature was also chosen for this analysis because it is the parameter with the most complete and continuous time series for the field sites studied here, and it is the key to wet avalanche forecasting (see section 1).

First, cross correlation as a function of phase lag was calculated using the statistical package Systat. The inputs are the two time series that one would like to compare. The output gives the correlation values for each phase lag and the standard error. Significantly correlated time series can be identified by comparing the correlation with the standard error of the time series. Two time series are significantly correlated when their correlation exceeds 2 times the standard error. Therefore, two time series have a phase lag when their correlation exceeds 2 times the standard error for any lag not equal to 0. A case where the correlation does not exceed 2 times the standard error for any lag would indicate that the two time series are not significantly correlated. A time lag refers to the time period between two forecasts, which is 3 h for precipitation and 1 h for temperature.

Second, a more descriptive analysis was done for each storm. The time difference between precipitation and temperature peaks as well as their difference in magnitude were compared subjectively. Table 2 contains a list of the different phase and magnitude indicators.

4. Results

a. Precipitation rate

Contingency table analysis was used as a verification method, as outlined in section 3b. First, two categories (precipitation yes/no) were chosen. Second, precipitation rate was divided into seven categories (Table 3), depending on the type of precipitation and the type of observations. The categories intense [12.5–80 mm h⁻¹ or 37.5–250 mm (3 h)⁻¹] and extreme [>80 mm h⁻¹ or >250 mm (3 h)⁻¹] did not occur during these winters and are therefore not mentioned any further. Heavy precipitation [1.7–12.5 mm h⁻¹ or 5–37.5 mm (3 h)⁻¹] was forecast and observed only once or twice at each station. For such a small number of events, the correct or incorrect forecast may be coincidence and gives no meaningful information. Therefore, the bias of this precipitation category is not included. The remaining four categories used for the verification of precipitation rate are: none, very light, light, and moderate (Table 3).

TABLE 3. Precipitation rate in categories (water equivalent).

Category      mm h⁻¹       mm (3 h)⁻¹    mm (12 h)⁻¹
None          0            0             0
Very light    0–0.4        0–1.25        0–5
Light         0.4–0.8      1.25–2.5      5–10
Moderate      0.8–1.7      2.5–5         10–20
Heavy         1.7–12.5     5–37.5        20–150
Intense       12.5–80      37.5–250      150–1000
Extreme       >80          >250          >1000

TABLE 4. Results from contingency table analysis of precipitation at Kootenay Pass, Nov–Dec 1999, remote and manual observations. NMS, MC2 original, and MC2 Kalman-predictor-corrected forecast. Corr. = corrected. The first four score columns refer to precipitation vs nonprecipitation; the last column is the hit rate for precipitation rate in categories.

                                   Hit rate   Bias   HSS    TSS    Hit rate (categories)
Remote observations (liquid and solid precipitation)
MC2 10-km grid    Original         0.75       0.73   0.47   0.46   0.56
                  Kalman-corr.     0.76       1.08   0.51   0.52   0.56
MC2 2-km grid     Original         0.73       0.86   0.43   0.42   0.56
                  Kalman-corr.     0.74       1.10   0.48   0.49   0.53
NMS 30-km grid    Original         0.73       0.75   0.43   0.44   0.56
Manual observations (solid precipitation)
MC2 10-km grid    Original         0.79       0.83   0.59   0.60   0.55
MC2 2-km grid     Original         0.73       0.95   0.46   0.46   0.51
NMS 30-km grid    Original         0.73       0.71   0.45   0.54   0.53
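The two verification steps described above (yes/no scores, then a hit rate over the Table 3 categories) can be illustrated with a short sketch. It uses the 3-h thresholds of Table 3 and the definitions of H, BIAS, HSS, and TSS from appendix A. The function names, the sample data, and the rule that a value lying exactly on a boundary falls into the lower category are assumptions made for illustration only; they are not taken from the study.

```python
from bisect import bisect_left

# Upper bounds of the Table 3 categories in mm (3 h)^-1, water equivalent.
# Assumption: a value exactly on a boundary is assigned to the lower category.
BOUNDS_3H = [0.0, 1.25, 2.5, 5.0, 37.5, 250.0]
NAMES = ["none", "very light", "light", "moderate",
         "heavy", "intense", "extreme"]


def category(amount_3h):
    """Return the Table 3 category name for a 3-h precipitation amount."""
    return NAMES[bisect_left(BOUNDS_3H, amount_3h)]


def contingency_counts(forecast, observed):
    """Count A (hits), B (false alarms), C (misses), D (correct negatives)."""
    a = b = c = d = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f > 0.0, o > 0.0
        if f_yes and o_yes:
            a += 1
        elif f_yes:
            b += 1
        elif o_yes:
            c += 1
        else:
            d += 1
    return a, b, c, d


def scores(a, b, c, d):
    """Two-category scores following appendix A (no zero-division guards)."""
    n = a + b + c + d
    return {
        "H": (a + d) / n,
        "BIAS": (a + b) / (a + c),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "TSS": a / (a + c) - b / (b + d),
    }


def category_hit_rate(forecast, observed):
    """Fraction of forecast/observation pairs that share a Table 3 category."""
    pairs = list(zip(forecast, observed))
    return sum(category(f) == category(o) for f, o in pairs) / len(pairs)


# Made-up 3-h amounts (mm), purely for illustration.
forecast = [0.0, 0.5, 2.0, 0.0, 3.0, 0.0, 1.0, 0.0]
observed = [0.0, 1.0, 0.0, 0.4, 4.0, 0.0, 2.0, 0.0]
counts = contingency_counts(forecast, observed)
print(counts, scores(*counts), category_hit_rate(forecast, observed))
```

With more than two categories the hit rate drops even when the yes/no forecast is unchanged, because a forecast must now also match the intensity class to count as a hit.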
As mentioned before, only 0–24-h forecasts could be verified for precipitation.

Details of verification results for only the remote observations (total precipitation: liquid + solid) from Kootenay Pass were published in Roeger et al. (2001), and are briefly summarized here (see Table 4). Both models (24-h forecast) underforecast precipitation events, which means that precipitation was observed more often than it was forecast. The best value for the bias ratio is achieved from the MC2 2-km grid (0.90). The MC2 10-km grid and the NMS 30-km grid achieved 0.71 and 0.75, respectively.

The hit rate is close to 0.75 for all forecasts (H = 0.74 for the MC2 10 km; H = 0.72 for the MC2 2 km; and H = 0.73 for the NMS 30 km), which shows that in almost 75% of all cases, precipitation events were forecast as such and nonprecipitation events were forecast as such. Given this fairly high hit rate and the bias ratio lower than one for both models, we conclude that most precipitation events that were forecast did indeed occur, but on the other hand, precipitation also occurred that was not forecast.

Both models show some skill, with skill scores (HSS and TSS) of 0.4–0.5 (see Table 4 for details), but could be improved. The 2-km grid shows no improvement for the skill scores compared to the 10-km grid, but the bias ratio is somewhat better (closer to one) than the MC2 10-km grid. For most cases, the NMS model with the significantly lower resolution produces comparable results to the MC2 model with the higher-resolution grids.

All statistical results from the original MC2 forecasts were improved when the original forecasts were automatically corrected with the Kalman-predictor correction method (see section 2b). Results are included in Table 4. Whereas only minor improvements are shown for the hit rate (from 0.75 to 0.76 for the 10-km grid and from 0.73 to 0.74 for the 2-km grid) and the skill scores (10%–15% improvement), the bias ratio was significantly better using this method.
However, the trend for precipitation rate goes in the opposite direction: precipitation events are overforecast with the Kalman-predictor-corrected forecast, whereas they are underforecast by the original forecast at 24 h. For the MC2 10 km, bias ratios are 1.08 (Kalman corrected) versus 0.73 (raw forecast). The MC2 2 km achieved values of 1.10 (Kalman corrected) versus 0.86 (raw forecast) for the bias ratios.

New results from verification with manual observations (solid precipitation in mm water equivalent) for Kootenay Pass are shown in Fig. 7 and listed in Table 4, again for two categories (precipitation yes/no) and 24-h forecast period. The MC2 10-km grid gives the best results for both of the skill measurements, but not for the bias ratio. The hit rate is fairly high for all models (0.73–0.79), similar to the results with remote observations. Also similar to those results, the bias ratio is highest for the MC2 2-km grid, with a value of 0.95 very close to a perfect forecast. The NMS model has the lowest bias ratio.

When the precipitation rate was divided into more than two categories (see Table 3), the hit rates with manual observations are 0.55, 0.51, and 0.53 for the MC2 10-km, MC2 2-km, and NMS 30-km grids, respectively.

At Whistler/Blackcomb, the results of all three models are similar to each other for solid precipitation (mm water equivalent), as given in Figs. 8 and 9 for Pig Alley (Whistler Mountain) and Catskinner (Blackcomb Mountain); only precipitation versus nonprecipitation is compared. The NMS model shows better results for the skill score measurements at both stations, most distinct at Catskinner. At Pig Alley, the bias ratio is better with the MC2 models; both MC2 grids have a perfect value of 1. The bias ratio of the NMS model is also very close to 1 (0.95). At Catskinner, the NMS model achieves a similar high value (0.94), whereas the bias ratio of both MC2 grids is much lower.

FIG. 7. Verification results for solid precipitation rate at Kootenay Pass, manual observations, Nov 1999–Apr 2000. Results from contingency table analysis. Perfect forecasts have a value of 1.

FIG. 8. Verification results for solid precipitation rate at Pig Alley, Nov 1999–Apr 2000. Results from contingency table analysis. Perfect forecasts have a value of 1.

FIG. 9. Same as in Fig. 8 but for Catskinner.

TABLE 5. FAR and H for solid precipitation at Pig Alley and Catskinner.

                   FAR                        H
                   Pig Alley   Catskinner     Pig Alley   Catskinner
MC2 10-km grid     …
MC2 3.3-km grid    …
NMS 10-km grid     …

Between the two MC2 models, no improvement from the 10-km grid to the 3.3-km grid can be seen. The 10-km grid has better values in all statistics except for the bias ratio at Pig Alley, which has a perfect value of 1 for both MC2 grids (see Figs. 8 and 9).

The false-alarm ratio (viz., when precipitation was forecast but not observed; Table 5) is lowest (best) for the NMS 10-km grid at both mountains. The MC2 grids have a FAR about twice as high as the NMS model. This implies that nonprecipitation events were forecast as precipitation events by the MC2 model, which agrees with the hit rate of these two grids. Because the NMS model only slightly underforecasts precipitation events (bias ratio = 0.95) and the FAR is only 0.12, this model shows no trend toward one event, but underforecasts both categories.

At Catskinner, the values for the MC2 models, together with the hit rate, suggest that most of the nonprecipitation events were forecast as such, but precipitation events were not always predicted. The bias ratio shows the same result. Since the NMS model has a low false-alarm ratio and a bias ratio below one, but close to one, this model predicts precipitation events better than the MC2 model.
It does not capture all nonprecipitation events as such, because the hit rate is not very close to 1 (0.82).

Figures 10 and 11 show the bias ratio when the solid precipitation rate is divided into four categories. At Pig Alley, the MC2 model achieves a perfect forecast of 1 for no precipitation, but it performs poorly for light precipitation [1.25–2.5 mm (3 h)⁻¹], which is highly underforecast. Comparing the MC2 10-km grid with the 3.3-km grid shows improvement from the lower to the higher resolution in all categories at Pig Alley. At Catskinner, the MC2 3.3-km grid does a slightly better job than the 10-km grid in certain categories [very light: 0–1.25 mm (3 h)⁻¹ and moderate: 2.5–5 mm (3 h)⁻¹]. The NMS model shows values very close to one for the bias ratio in the categories very light [0–1.25 mm (3 h)⁻¹] and light [1.25–2.5 mm (3 h)⁻¹] at Catskinner, but performance drops off considerably for heavier precipitation rates. At Pig Alley, the NMS model has acceptable results for the first three categories, but performs poorly for moderate precipitation [2.5–5 mm (3 h)⁻¹].

The hit rate is highest for the MC2 10-km grid at both stations. Values are given in Table 5. Pig Alley shows lower values than Catskinner. Generally, H = 0.72 and 0.65 are good, considering the default (equi-likelihood; no skill) of 0.20 for this statistical measurement.

FIG. 10. Bias ratio: Solid precipitation rate in four categories. Pig Alley, Nov 1999–Apr 2000, 24-h forecasts. Dashed line shows a perfect forecast.

FIG. 11. Same as in Fig. 10 but for Catskinner.

b. Wind speed

TABLE 6. Wind speed categories (km h⁻¹) according to CAA (1995).

Category    Wind speed (km h⁻¹)
Calm        0–1
Light       1–25
Moderate    25–40
Strong      40–60
Extreme     >60

FIG. 12. Wind speed distribution at Whistler Alpine, 24-h forecast, Nov 1999–Jan 2000.

Wind speed was verified with categories according to the Canadian Avalanche Association (CAA 1995), as given in Table 6. Wind speed is generally underpredicted at both study areas.
For this variable, results from the NMS model are not as good as from the MC2 model. At Whistler/Blackcomb, hit rates are very high, from 0.80 (NMS) to 0.88 (MC2), at Whistler Alpine, but much lower at Horstman Hut (0.33–0.38 in 1999/2000 and 0.51–0.52 in 1998/99). For both stations, the higher grid resolution (3.3 km) shows no significant improvement compared to the next lower resolution (10 km). Figure 12 shows the wind speed distribution of the 24-h forecasts from the MC2 10-km, MC2 3.3-km, and NMS 10-km grid at Whistler Alpine. Figure 13 shows the wind speed distribution at Horstman Hut from the MC2 model, with both resolutions (3.3- and 10-km grid) segregated into 0–24-h and 24–48-h forecast periods. All models lack realistic variability. Only light and calm winds are predicted. Light winds are highly overforecast, whereas higher wind speeds are not captured at all. Significant differences cannot be seen, either between the two grid resolutions or between the two forecast periods.

FIG. 13. Wind speed distribution at Horstman Hut, MC2 forecast, Oct 1999–Jan 2000.

TABLE 7. Wind speed: H for Horstman Hut: Results from 1998/99 and 1999/2000. (Kalman correction was tested only for 1998/99 forecasts here.)

                                   Feb–May 1999        Oct 1999–Jan 2000
                                   24 h     48 h       24 h     48 h
MC2 10-km grid    Original         0.52     0.52       0.36     0.38
                  Kalman-corr.     0.59     0.60       –        –
MC2 3.3-km grid   Original         0.52     0.51       0.33     0.35
                  Kalman-corr.     0.62     0.61       –        –

FIG. 14. Wind speed distribution at Horstman Hut, MC2 original (O) vs Kalman-predictor corrected (K) 24-h forecast, Feb–May 1999.

TABLE 8. Results for wind speed as continuous variable, Pearson correlation coefficient r, MAE and ME in km h⁻¹. MC2 original (O) vs Kalman-predictor-corrected (K) forecast, Horstman Hut, Feb–May 1999.

                     r               MAE             ME
                     O      K        O      K        O      K
24 h
MC2 10-km grid       0.50   0.78     19.4   8.5      18.7   3.0
MC2 3.3-km grid      0.60   0.78     19.0   8.5      18.5   3.0
48 h
MC2 10-km grid       0.65   0.80     19.…
MC2 3.3-km grid      0.70   0.80     …

Good improvements can be seen with the Kalman-predictor correction method.
Figure 14 shows an example of 24-h forecasts at Horstman Hut. Moderate, strong, and extreme wind events are captured as well. The overall improvement of hit rate with this automatic postcorrection method is shown in Table 7. The H values for verification of the Kalman-corrected forecasts range between 0.59 and 0.62, compared to the results from the original forecasts of 0.51–0.52. Table 8 shows error reduction and higher correlation coefficients compared to the original MC2 forecasts (wind speed analyzed as continuous variable). The Pearson correlation coefficient is increased from 0.50–0.70 (original MC2 forecasts) to 0.78–0.80 (Kalman-predictor-corrected forecasts). Mean absolute errors are reduced from 18.7–19.4 km h⁻¹ (original forecast) to 8.0–8.5 km h⁻¹ (corrected forecast). Even more significant are the improvements of the mean error. Values between 18.1 and 18.7 km h⁻¹ from the original MC2 forecasts are reduced to 2.9 and 3.0 km h⁻¹ with the Kalman-predictor correction method. This shows that the Kalman-predictor correction method is of high value for wind speeds.

Comparing the results of the 24-h forecast period with results of the 48-h forecast period shows slightly higher correlation coefficients for the 48-h period. No real difference in MAE and ME can be seen.

The results from Kootenay Pass are discussed in detail in Roeger et al. (2001). In summary, wind speed is also underforecast at this study site. Figure 15 (Stagleap) shows that the wind speed distribution is similar to Whistler/Blackcomb; namely, there is also a lack of variability, with prediction of only light and calm winds, and overforecasts of light winds. However, the MC2 2-km grid does significantly better than the MC2 10-km grid for this location. The original MC2 forecasts are also highly improved with the Kalman-predictor correction method, as shown in Fig. 16.

The results in Fig. 17 suggest that the topography approximation plays an important role in model performance.
The plot gives median values of absolute error (AE; differences between observation and forecast) and their spread (lower and upper quartile). The NMS model performs worst at Stagleap, where the grid spacing is 30 km. At Whistler Alpine, where the NMS model has the 10-km grid, it has comparable results to the MC2 model (lower median absolute error but larger spread). Both MC2 grids have evidently higher median errors at Stagleap than at Whistler Alpine, where the anemometer measurements are not as strongly influenced by local terrain, that is, where the topography approximation of the models is not as significant.

FIG. 15. Wind speed distribution at Stagleap, remote observations, 24-h forecast, Nov 1999–Jan 2000.

FIG. 16. Wind speed distribution at Stagleap, MC2 original vs Kalman-predictor corrected 24-h forecast, Jan–Apr 2000.

FIG. 17. Median values of absolute differences between observation and forecast and their spread for Stagleap, Jan–Apr 2000, and Whistler Alpine, Nov 1999–Jan 2000 and Feb–Apr 1999.

c. Wind direction

FIG. 18. Wind rose for Whistler Alpine, MC2 24-h forecast, Feb–Apr 1999.

Wind direction has been verified with contingency table analysis in eight categories (45° angle section: N, NE, E, SE, S, SW, W, NW) or in four categories (90° angle section: N, E, S, W). The different models and grids were compared with wind roses, which represent the prevailing wind as a percentage of time/observations that the wind blows from different directions, as well as the bias ratio for each wind direction and the hit rate. Figure 18 shows the wind rose for Whistler Alpine. Prevailing winds are from the south. The bias ratio for wind direction divided into the four main aspects as well as the percentage of occurrence is given in Fig. 19 for Whistler Alpine. It can be seen that the 3.3-km grid has better results than the 10-km grid for southerly and westerly winds, which together make 69% of all observations.
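As an illustration of the four- and eight-category verification of wind direction, the sketch below bins directions into sectors centred on the compass points and then computes a per-sector bias ratio (number of forecasts in a sector divided by number of observations in that sector, i.e., the BIAS of appendix A applied to each direction in turn) and an overall hit rate. The function names, the sample directions, and the assignment of a direction lying exactly on a sector edge to the clockwise neighbour are assumptions for illustration; the study does not state these details.

```python
from collections import Counter

LABELS = {
    4: ["N", "E", "S", "W"],
    8: ["N", "NE", "E", "SE", "S", "SW", "W", "NW"],
}


def sector(direction_deg, n_sectors=4):
    """Map a direction in degrees (0 = north, clockwise) to a sector label."""
    width = 360.0 / n_sectors
    # Shift by half a sector so that north is centred on 0 deg, then wrap.
    index = int(((direction_deg + width / 2.0) % 360.0) // width) % n_sectors
    return LABELS[n_sectors][index]


def direction_bias(forecast_deg, observed_deg, n_sectors=4):
    """Per-sector bias ratio; sectors that were never observed are skipped."""
    f = Counter(sector(x, n_sectors) for x in forecast_deg)
    o = Counter(sector(x, n_sectors) for x in observed_deg)
    return {s: f[s] / o[s] for s in o}


def direction_hit_rate(forecast_deg, observed_deg, n_sectors=4):
    """Fraction of pairs whose forecast and observed sectors agree."""
    pairs = list(zip(forecast_deg, observed_deg))
    hits = sum(sector(f, n_sectors) == sector(o, n_sectors) for f, o in pairs)
    return hits / len(pairs)


# Made-up directions (degrees), purely for illustration.
forecast_deg = [170, 190, 200, 260, 10, 185]
observed_deg = [180, 175, 265, 280, 350, 100]
print(direction_bias(forecast_deg, observed_deg))
print(direction_hit_rate(forecast_deg, observed_deg))
```

The half-sector shift matters because the northerly sector straddles 0°; without it, directions such as 350° and 10° would fall into different categories.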
North winds are badly captured by both model resolutions, but with 2% occurrence this result is not meaningful. However, the 10-km grid is more accurate because the overall H values are higher (0.57) than from the 3.3-km grid (0.44; Table 9).

At Horstman Hut, the bias ratios do not show significantly better performance from the 3.3-km grid (Figs. 20 and 21). The H values suggest that the 10-km grid performs better than the 3.3-km grid, because the 10-km grid has a higher hit rate overall (see Table 9). Comparing the 24-h forecast period with the 48-h forecast period (Fig. 20 vs Fig. 21, Table 9) shows subtle differences for all aspects, with the 24-h forecast being better than the 48-h forecast at Horstman Hut (48-h forecasts for Whistler Alpine have not been verified here).

FIG. 19. Bias for wind direction in four categories. Whistler Alpine, MC2 24-h forecast, Feb–Apr 1999. Perfect forecasts have a bias of 1.

FIG. 20. Bias for wind direction in four categories. Horstman Hut, MC2 original vs Kalman-predictor-corrected 24-h forecast, Feb–Apr 1999. Perfect forecasts have a value of 1.

FIG. 21. Same as in Fig. 20 but for 24–48-h forecast period.

TABLE 9. Wind direction: Results from contingency table analysis: H for Whistler Alpine and Horstman Hut, Feb–Apr 1999.

                     Whistler Alpine   Horstman Hut original   Horstman Hut Kalman-corr.
24 h
MC2 10-km grid       0.57              0.61                    0.71
MC2 3.3-km grid      0.44              0.47                    0.50
48 h
MC2 10-km grid       –                 0.52                    0.68
MC2 3.3-km grid      –                 0.48                    0.68

These two figures also show an improvement for this variable at Horstman Hut using the Kalman-predictor correction method. Northerly and easterly winds are not improved, but the bias ratio for southerly winds with the highest percentage of occurrence (76%) is better. Similarly, westerly winds are highly overpredicted by the original forecasts, but refined to a large extent with the Kalman prediction. However, westerly winds occur only 4% or 5% of all times at this location. The wind rose for Horstman Hut is given in Fig. 22.
Improvement for both grids can be seen for southerly and westerly aspects.

At Stagleap (Kootenay Pass study site), prevailing winds are generally from the west (SW: 25%, W: 27%, and NW: 21%), which is mainly due to the general flow pattern (midlatitudes in Northern Hemisphere) but may also be partly influenced by the east–west alignment of the ridge. The wind rose is given in Fig. 23. Figure 24 gives the bias ratio for the four aspects with percentage of occurrence for the MC2 10-km and 2-km grid, and with their equivalent Kalman-predictor correction. Improvement for both grids can be seen for all aspects with the Kalman correction. The H values have also increased: the MC2 10-km grid originally has H = 0.55 versus 0.61 with the correction method; the 2-km grid has 0.53 (original) versus 0.57 (corrected). More details are published in Roeger et al. (2001).

FIG. 22. Wind rose for Horstman Hut, MC2 original vs Kalman-predictor-corrected 24-h forecast. Feb–Apr 1999.

FIG. 23. Wind rose for Stagleap, MC2: Jan 2000; NMS: Nov 1999–Jan 2000.

FIG. 24. Bias for wind direction in four categories. Stagleap, MC2 original vs Kalman-predictor corrected 24-h forecast, Jan–Apr 2000. Perfect forecasts have a value of one.

d. Temperature

Temperature forecasts (for the specific hour) are generally very good. All models and grids achieve high correlation between forecast and observation values. Predicted temperature is generally too high with mean absolute errors between 1° and 3°C at Kootenay Pass and 2°–6°C at Whistler/Blackcomb. Figures 25, 26, and 27 show MAE results from Whistler Alpine, Horstman Hut, and Catskinner. The higher-resolution MC2 grid performs better than its lower-resolution grid in all cases. The NMS model has lower errors except for Catskinner (with its remote observations). The temperature MAE results of the 24-h forecast are better than those of the 48-h forecast for all cases.

FIG. 25. Mean absolute error for temperature at Whistler Alpine, MC2: Feb–Apr 1999 and Nov 1999–Jan 2000, respectively; NMS: Nov 1999–Mar 2000.

Correlation coefficients are graphically shown in Figs. 28, 29, and 30. The highest correlation coefficient is achieved by the NMS model. It is above 0.8 in almost all cases. The results of the MC2 model are similarly high in some cases, but can be lower than 0.6 in other cases. The 3.3-km grid has better results than the 10-km grid except with observations from 1998/99.

For both years, the MC2 original forecasts at Catskinner were automatically corrected with the Kalman prediction. The correlation coefficient is significantly improved, as illustrated in Fig. 31, which shows results from 1999/2000. The mean absolute error is also significantly reduced, in several cases by more than 40%. Results from 1999/2000 are shown in Fig. 32.

Summarized results from Kootenay Pass (Roeger et al. 2001) give correlation coefficients and mean absolute errors that are in the same range as those for Whistler/Blackcomb. Figure 33 shows the correlation coefficient for temperature at Kootenay Pass. The NMS model does not perform as well as the MC2 model for this study site, but results are still good. Again, Kalman-predictor-corrected MC2 forecasts are significantly better than their original forecasts. Correlation coefficients are increased and achieve values up to 0.97 (see Fig. 34). Mean absolute errors show up to 50% error reduction by using the Kalman corrector.

e. Results from time series analysis

Time series analysis could be done only for Kootenay Pass since it is the only station with sufficient precipitation records during any one storm cycle. At Whistler/Blackcomb, manual observations give values only twice a day, which is not enough data for storms that last only 1–2 days. Temperature was chosen as a second variable for time series analyses.
These two are the most significant variables for dry and wet avalanche forecasting (as explained in section 1).

FIG. 26. Mean absolute errors for temperature at Horstman Hut, MC2: Feb–Jun 1999 and Oct 1999–Jan 2000, respectively; NMS: Oct 1999–Mar 2000.

FIG. 27. Mean absolute errors for temperature at Catskinner, MC2: Feb–Apr 1999, Nov–Dec 1999 (remote) and Nov 1999–Jan 2000 (manual), respectively; NMS: Nov 1999–Mar 2000 (remote and manual).

FIG. 28. Correlation coefficient for temperature at Whistler Alpine, MC2: Feb–Apr 1999 and Nov 1999–Jan 2000, respectively; NMS: Nov 1999–Mar 2000.

FIG. 29. Correlation coefficients for temperature at Horstman Hut, MC2: Feb–Jun 1999, Oct 1999–Jan 2000, respectively; NMS: Oct 1999–Mar 2000.

Eight storms have been chosen, according to their precipitation patterns. Figure 35 shows the time series for these eight storms, which are named according to their start date.

Cross-correlation analysis showed no obvious time lag between any forecast model and any forecast period with the observations for Kootenay Pass for both variables. Almost every correlation that was significant was for 0 time lag. For precipitation, only 6 out of 21 cases were found with significant correlation at non-0 time lags; 5 of them at time lag units of −1 or +1, only 1 of them at time lags +2 and +3 (where each lag unit equals 3 h). For temperature, only 1 case (out of 28) shows significant correlation at nonzero time lag. This suggests that the timing between forecast and observation is correct for the analyzed eight storms.

The more descriptive time series analysis (not shown here) confirmed that all forecasts underpredict precipitation amount, and most of them underpredict precipitation intensity. The NMS 24-h forecast and the original MC2 10-km grid 24- and 48-h forecasts show extreme values of 41%–50% for the difference in accumulated precipitation. The Kalman-predictor correction method improves precipitation amount for both MC2 forecast grids.
The corrected forecasts underpredict accumulated precipitation by 13% and 21%, respectively. For the NMS model, no time trend is obvious, unlike the MC2 model that starts storms too late, and continues them too long.

The NMS model as well as the original forecast from the MC2 model overforecast temperature magnitude. The Kalman-corrected forecasts reduce this difference but also indicate a reverse trend toward underforecasting. These conclusions are valid for the analyzed eight storms only. To justify those conclusions as a general behavior of the models, more data are needed.

An obvious phase shift in forecasting maximum and minimum temperature cannot be identified. For maximum temperature, all forecasts seem to predict it too early. However, a similar statement cannot be made for minimum temperature, and therefore no conclusions regarding the timing are possible.

FIG. 30. Correlation coefficients for temperature at Catskinner, MC2: Feb–Apr 1999, Nov–Dec 1999 (remote) and Nov 1999–Jan 2000 (manual), respectively; NMS: Nov 1999–Mar 2000 (remote and manual).

FIG. 31. Correlation coefficient for Kalman-predictor-corrected temperature (°C). Catskinner, Nov–Dec 1999.

FIG. 32. MAE for Kalman-predictor-corrected temperature (°C). Catskinner, Nov–Dec 1999.

FIG. 33. Correlation coefficient for temperature, Kootenay Pass, Nov 1999–Jan 2000.

FIG. 34. Correlation coefficient for Kalman-predictor-corrected temperature (°C), Kootenay Pass, Nov 1999–Jan 2000.

FIG. 35. Three-hour precipitation for the eight chosen storms. Example shows observations and NMS 24-h forecast.

5. Conclusions and outlook

Detailed verification of two high-resolution, real-time, numerical weather prediction (NWP) models was performed with case study observations from two winters: 1998/99 and 1999/2000. The main goal of this research project was to assess the accuracy and the bias of the weather predicted by two models with regard to potential applications such as avalanche forecasting. Verification was against standard meteorological variables from surface observations. Two winters are a relatively short time period, and it should be kept in mind that the interpretation of the results outlined here can only represent the weather of these two winters.

While this project focused on detailed quantitative verification, some possible explanations for the performance of the two models are suggested here. Comparing MC2 versus NMS, similarly good results were obtained from both. Differences based on grid resolution can be seen between the two study locations. At the Kootenay Pass area, where the NMS has a low grid resolution of 30 km, it does not perform as well as the MC2 model in wind speed (Stagleap) and temperature (Kootenay Pass), although the differences are fairly subtle at the latter station. At Whistler/Blackcomb, where the NMS grid resolution is 10 km, its results are at least as good as the results from the MC2 model. For temperature and partly for precipitation rate, the NMS model performs better than the MC2 model at this study area.

Comparing the two MC2 model resolutions shows somewhat better results for precipitation rate from the finer grid spacing at Kootenay Pass. At Whistler/Blackcomb, the 3.3-km grid has a higher hit rate but the 10-km grid has better bias ratios. Overall, no significant improvement can be seen from the lower to the higher resolution for this parameter. For wind speed at Stagleap and for temperature at Whistler/Blackcomb, the 2-km or 3.3-km grid performs significantly better than the 10-km grid.
The results for wind direction show better bias ratios from the 3.3-km grid at Whistler Alpine, but the 10-km grid is more accurate at this location.

The 24-h forecasts are overall more accurate than 48-h forecasts for the events and locations studied here. Results are slightly better for wind speed (correlation coefficients) and wind direction, and significantly better for temperature with the shorter forecast period, as mentioned earlier. No comparison could be done for precipitation (see section 2b).

Time series analysis showed that the timing between forecast and observation is correct for the analyzed eight storms (within the 3-h time resolution of the forecasts). For more general conclusions about the models' behavior regarding correct timing, more storms from several years as well as summer storms should be investigated.

The results also show that the Kalman-predictor correction method is highly suitable for all tested variables. The verification results were improved at all study locations with this automated correction method. This method is a very successful tool in improving the original forecast and should be further developed to use in real time.

In general, precipitation events are underforecast. The results from Kootenay Pass and Catskinner show that most precipitation events that were forecast also occurred, but on the other hand, additional precipitation events also occurred that were not forecast. This, as well as underpredicted precipitation intensity (results from time series analysis), may be dangerous for the application in avalanche forecasting because this may result in an unexpected increase of avalanche risk.
At Pig Alley, the FAR values together with the hit rate suggest that nonprecipitation events were forecast as precipitation events by the MC2 model, which would at least mean that avalanche forecasters are "on the safe side." The NMS model, with a low false-alarm ratio and a bias ratio below one, predicts precipitation events better than the MC2 model at this location.

The difference in temperature mean absolute errors between Kootenay Pass (1°–3°C) and Whistler/Blackcomb (2°–6°C) may be due to an incorrect elevation approximation at Whistler/Blackcomb. The reason is that although the forecast is made for the correct longitude and latitude, the elevation of the model could be off because the model smooths the topography within its grid resolution. This can have a large effect on the temperature field in locations of steep topography. Biases of temperature at Kootenay Pass may also be influenced by poor integration of the model with continental air masses, but this idea has not been further investigated. Temperatures are generally predicted as too warm, but the small MAE values around 0.7°C, achieved with the Kalman-corrected MC2 forecast, suggest that this forecast can be used in further applications, such as avalanche forecasting.

The difference in hit rate of wind speed between Whistler Alpine and Horstman Hut is due to the different distribution of observed wind speeds. At Whistler Alpine, which shows significantly higher hit rates than Horstman Hut, light winds were observed in 94% of all cases, while predictions of the models vary between 86% and 93%.
At Horstman Hut, the distribution of the observed wind speeds is quite different: 3% calm, 37% light, 30% moderate, 22% strong, and even 8% extreme. The models, however, predict 94%–97% light winds. Therefore, the models either have a systematic error of not capturing wind speeds greater than light, or they have similar topography approximations for the two stations that both differ from reality.

Underpredicted wind speeds at Stagleap may occur because of the local topography. The weather station is located on top of an east–west aligned ridge and is therefore fairly wind-exposed. The topography approximation of the models might not capture this. In addition, a systematic error is possible because the wind speed is also underpredicted at Whistler Alpine.

Another reason why we think that the topography approximation of the models plays an important role lies in the results shown in Fig. 17. At Whistler Alpine, where the anemometer measurements are not as strongly influenced by local terrain (i.e., where the topography approximation of the models is not as significant), both MC2 grids have lower median errors than at Stagleap.

This implies that the different topography approximations of the two models significantly affect the results. This effect is somewhat larger than the effect of increased grid resolution with the MC2 model. A higher resolution should improve the results because the topography is captured more accurately. However, for British Columbia, improved forecasts for all parameters might not be realized for finer grids (i.e., for better representations of topographic effects), because a limiting factor is the dearth of weather observations upstream (west) of BC. This "data void" over the NE Pacific must be remedied before more accurate forecasts are possible. Boundary effects (boundary value problems due to a closed domain in the numerical models) might be another reason for the bias.
A third factor might be the numerical approximations made by the MC2 developers to improve execution speed.

It was shown that each model has different strengths and weaknesses. Neither model is best for all variables. This indicates that, in general, a single model should not be used for all variables. An ensemble forecast that combines several models may do a better job than a single model when all parameters are considered.

While this verification project focused on basic steps in model verification, many more meteorological features are yet to be verified with measurements. For example, for snow avalanching it is of high importance to have information about the extent, timing, and magnitude of temperature inversions and cold frontal passages, both of which have a significant effect on snowpack stability. Therefore, not only temperature but also temperature change is very important, and it is one of the significant variables of numerical avalanche models.

Although not shown in this paper, Roeger et al. started to take the next step by using this numerical forecast output as input to a statistical avalanche threat model. So far, the resulting 24-h forecasts of avalanche threat seem to be as skillful as traditional 6–12-h avalanche forecasts based only on weather observations.

Thus, we recommend that NWP forecasts be used to increase the lead time for avalanche forecasts. With more advance warning, avalanche and resource managers can take mitigation action to better protect lives and property, and reduce avalanche closures of key transportation corridors.

Acknowledgments. This research was sponsored by Canadian Mountain Holidays, the Natural Sciences and Engineering Research Council of Canada (NSERC), Forest Renewal BC, Environment Canada, the Canadian Foundation for Climate and Atmospheric Sciences, and BC Hydro. Claudia Roeger was supported by the German Academic Exchange Service (DAAD).
The data for this study were provided by the Ministry of Transportation and Highways (MoTH) of British Columbia, and the ski resort Intrawest Whistler/Blackcomb. Research coordination, infrastructure, and computer support were provided by the Geophysical Disaster Computational Fluid Dynamics Centre at UBC. We would like to thank John Tweedy and Ted Weick from MoTH, as well as the avalanche forecasters from Whistler/Blackcomb for their great help. We are extremely grateful for the support from all these organizations.

APPENDIX A

Equations for Statistical Analysis

Pearson correlation coefficient r:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{A1}$$

where $x_i$ are forecast data values, $y_i$ are observed data values, $\bar{x}$ is the mean forecast value, $\bar{y}$ is the mean observed value, and $n$ is the number of data pairs.

Mean error (ME):

$$\mathrm{ME} = \bar{x} - \bar{y}, \tag{A2}$$

mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{n} \sum_{k=1}^{n} |x_k - y_k|, \tag{A3}$$

mean-square error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{k=1}^{n} (x_k - y_k)^2, \tag{A4}$$

root-mean-square error (RMSE):

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}. \tag{A5}$$

TABLE A1. Contingency table definition, where A to D are the counts of events in each category, out of N total events.

                    Observation
                    Yes     No
    Forecast  Yes   A       B
              No    C       D

Contingency table analysis equations for the 2 × 2 table of Table A1. The range and the value for a perfect forecast are given with each score.

Hit rate H (range 0 to 1; perfect forecast 1):

$$H = \frac{A + D}{N} \tag{A6}$$

Probability of detection (POD) (range 0 to 1; perfect forecast 1):

$$\mathrm{POD} = \frac{A}{A + C} \tag{A7}$$

False-alarm ratio (FAR) (range 0 to 1; perfect forecast 0):

$$\mathrm{FAR} = \frac{B}{A + B} \tag{A8}$$

BIAS (range 0 to ∞; perfect forecast 1):

$$\mathrm{BIAS} = \frac{A + B}{A + C} \tag{A9}$$

Heidke skill score (HSS) (range −1 to +1; perfect forecast 1):

$$\mathrm{HSS} = \frac{2(AD - BC)}{(A + C)(C + D) + (A + B)(B + D)} \tag{A10}$$

True skill score (TSS) (range −1 to +1; perfect forecast 1):

$$\mathrm{TSS} = \frac{AD - BC}{(A + C)(B + D)} = \frac{A}{A + C} - \frac{B}{B + D} \tag{A11}$$

APPENDIX B

Kalman Filter Basics

Kalman filtering is used as an adaptive, recursive method (Bozic 1979) to optimally estimate the bias and reduce rms error between raw, noisy NWP forecasts and noisy verification observations. It is recursive because it carries only a filtered summary of the past input signals, into which it can incorporate new inputs to create a modified filter.
It is adaptive in the sense that any changes in the stationarity of the input signals are quickly incorporated into the modified filter, causing information about the old filter to be gradually lost with succeeding time steps. These attributes are desirable because the filter adapts to changing climate, changing seasons, or even changing NWP model versions without requiring one to first accumulate a large database of historical data.

Kalman (1960) and Kalman and Bucy (1961) showed how this approach can also be used as a statistical predictor to estimate future forecast bias. This is particularly useful for real-time weather forecasting, where the raw NWP projection can be corrected with the Kalman-estimated bias projection to create a corrected forecast. We used this objective, linear approach during postprocessing of each model forecast, in place of traditional model-output statistics (MOS). This operation is fully automated with no manual tweaking or bogusing.

Let $e_k$ be the bias between the forecast and the verifying observation valid today (for time step $k$), such as for temperature [$e_k = T_k(\mathrm{fcst}) - T_k(\mathrm{obs})$] at one weather station location. This $e_k$ is the signal that we would like to predict (i.e., estimate) for tomorrow (at $k + 1$). Kalman designed his filter/predictor for a first-order autoregressive system of the form $e_{k+1} = a e_k + w_k$, where $w_k$ is a Gaussian-distributed random term of variance $\sigma_w^2$. The meteorological interpretation of this "signal model" or "system model" is that a portion ($a$) of the future bias of the weather forecast is successfully described by persistence of the current bias, but with the addition of a random term that is related to the fundamental deterioration of weather predictability with increasing lead time.
This system model applies not only to the actual system, but to our estimate of the system. Similarly, the input observations are assumed to be noisy (with random error $v_k$) and can be described by $e_k = c \hat{e}_k + v_k$, where the error variance is $\sigma_v^2$, the factor $c$ indicates the relationship between the filtered expected value and the actual observation, and the hat indicates expected value. Meteorologically, the random error can be due to subgrid terrain influences, spurious numerical artifacts, inadequacies of the physical parameterizations, and errors in the observations themselves.

FIG. B1. Flow diagram for the Kalman predictor.

The flow diagram in Fig. B1 illustrates that the Kalman approach is basically an optimal predictor-corrector method. The prediction of tomorrow's bias uses the bias from today, which is assumed to persist with the loss of skill associated with predictability. The difference between today's observed bias and the reliable (nonrandom) portion of today's bias that was estimated yesterday, when weighted by a factor $b$ called the Kalman gain so as to project to tomorrow, gives the correction that was "learned" from previous errors. This correction is added to the prediction, to give the final estimate of the bias for tomorrow. We use this bias estimate to adjust our raw numerical forecasts. Also, this bias estimate is saved for one day (i.e., the time delay operator), to be recycled into the Kalman algorithm to estimate the bias for the subsequent day.
This cycle repeats every day, as counted by index $k$.

Combining the previous equations yields the resulting predictor equation (Bozic 1979):

$$\hat{e}_{k+1|k} = a \hat{e}_{k|k-1} + b_k \left[ e_k - c \cdot \hat{e}_{k|k-1} \right], \tag{B1}$$

where the Kalman gain $b$ is found from

$$b_k = a c \, p_{k|k-1} \left[ c^2 p_{k|k-1} + \sigma_v^2 \right]^{-1}, \tag{B2}$$

and where $p$ is the prediction mean-square error from

$$p_{k+1|k} = a^2 p_{k|k-1} - a c \, b_k \, p_{k|k-1} + \sigma_w^2. \tag{B3}$$

Subscripts such as $k|k-1$ indicate the value for today (index $k$) as estimated from yesterday's ($k - 1$) value. The parameters $a$ and $c$ are found from the covariance matrices of bias, and the whole system is started on the first day using a zero initial-bias estimate. Within the first several days of operation of this postprocessing system, (B1) approaches the best estimate of forecast bias. This approach is used separately for each weather station in this study; namely, different parameters $a$, $c$, $b$, and $p$ can evolve for the different stations.

REFERENCES

Armstrong, R. L., and B. R. Armstrong, 1987: Snow and avalanche climates of the western United States: A comparison of maritime, intermountain and continental conditions. Avalanche Formation, Movement and Effects: Proceedings of a Symposium Held at Davos, B. Salm and H. Gubler, Eds., International Association of Hydrological Sciences (IAHS) Publ. 162, 686 pp. [Available from IAHS Press, Centre for Ecology and Hydrology, Wallingford, Oxfordshire OX10 8BB, United Kingdom.]

Benoit, R., M. Desgagne, P. Pellerin, S. Pellerin, and Y. Chartier, 1997: The Canadian MC2: A semi-Lagrangian, semi-implicit wideband atmospheric model suited for finescale process studies and simulation. Mon. Wea. Rev., 125, 2383–2415.

Benoit, R., and Coauthors, 2002: The real-time ultrafinescale forecast support during the special observing period of the MAP. Bull. Amer. Meteor. Soc., 83, 85–109.

Bozic, S. M., 1979: Digital and Kalman Filtering. John Wiley & Sons, 153 pp.

CAA, 1995: Observation Guidelines and Recording Standards for Weather, Snowpack and Avalanches.
Canadian Avalanche Association (CAA), 99 pp. [Available from The Canadian Avalanche Centre, P.O. Box 2759, Revelstoke, BC, Canada, V0E 2S0.]

Flueck, J. A., 1987: A study of some measures of forecast verification. 10th Conf. on Probability and Statistics in Atmospheric Sciences, Edmonton, AB, Canada, Amer. Meteor. Soc., 64–68. [Available from American Meteorological Society, 45 Beacon Street, Boston, MA 02108-3693.]

Foehn, P. M. B., 1998: An overview of avalanche forecasting models and methods. Norwegian Geotechnical Institute Publ. 203, 256 pp. [Available from the Norwegian Geotechnical Institute, Sognsveien 72, 0806 Oslo, Norway.]

Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, J. Basic Eng., 82, 35–45.

Kalman, R. E., and R. S. Bucy, 1961: New results in linear filtering and prediction theory. Trans. ASME, J. Basic Eng., 83, 95–108.

Kristovich, D. A. R., and Coauthors, 2000: The Lake-Induced Convection Experiment and the Snowband Dynamics Project. Bull. Amer. Meteor. Soc., 81, 519–542.

LaChapelle, E. R., 1980: The fundamental processes in conventional avalanche forecasting. J. Glaciol., 26, 75–84.

McClung, D. M., 1995: Computer assistance in avalanche forecasting. Proc. Int. Snow Science Workshop, Snowbird, Salt Lake City, UT, American Avalanche Institute, 310–313. [Available from Dave McClung, Dept. of Geography, UBC, 1984 West Mall, Vancouver, BC V6T 1Z2, Canada.]

McClung, D. M., 2000: Predictions in avalanche forecasting. Ann. Glaciol., 31, 377–381.

McClung, D. M., and P. Schaerer, 1993: Avalanche prediction II: Avalanche forecasting. The Avalanche Handbook. The Mountaineers, 272 pp.

Murphy, A. H., and H. Daan, 1985: Forecast evaluation. Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, Inc., 379–437.

Peirce, C. S., 1884: The numerical measure of the success of predictions. Science, 10, 453–454.

Roeger, C., D. McClung, R. Stull, J. Hacker, and H.
Modzelewski, 2001: A verification of numerical weather forecasts for avalanche prediction. Cold Reg. Sci. Technol., 33, 189–205.

Tripoli, G. J., 1992: A nonhydrostatic mesoscale model designed to simulate scale interaction. Mon. Wea. Rev., 120, 1324–1359.

Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. International Geophysical Series, Vol. 59, Academic Press, 468 pp.
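
As an illustration of the Appendix A scores, the following is a minimal Python sketch and not the code used in this study. The function names `continuous_scores` and `contingency_scores` are hypothetical, as is the choice to return NaN when a denominator is zero. The count arguments a, b, c, and d follow the cell labels A to D of Table A1.

```python
import math


def continuous_scores(fcst, obs):
    """Return r, ME, MAE, MSE, and RMSE (A1-A5) for paired forecasts and observations."""
    n = len(fcst)
    if n == 0 or n != len(obs):
        raise ValueError("need two non-empty samples of equal length")
    x_bar = sum(fcst) / n  # mean forecast value
    y_bar = sum(obs) / n   # mean observed value
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(fcst, obs))
    ss_x = sum((x - x_bar) ** 2 for x in fcst)
    ss_y = sum((y - y_bar) ** 2 for y in obs)
    denom = math.sqrt(ss_x * ss_y)
    mse = sum((x - y) ** 2 for x, y in zip(fcst, obs)) / n
    return {
        # Assumption: r is reported as NaN when either sample has no variance.
        "r": cov / denom if denom > 0 else float("nan"),
        "ME": x_bar - y_bar,
        "MAE": sum(abs(x - y) for x, y in zip(fcst, obs)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
    }


def contingency_scores(a, b, c, d):
    """Return the 2 x 2 scores (A6-A11) from the counts A, B, C, D of Table A1."""
    n = a + b + c + d

    def ratio(num, den):
        # Assumption: undefined scores (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "H": ratio(a + d, n),
        "POD": ratio(a, a + c),
        "FAR": ratio(b, a + b),
        "BIAS": ratio(a + b, a + c),
        "HSS": ratio(2 * (a * d - b * c), (a + c) * (c + d) + (a + b) * (b + d)),
        "TSS": ratio(a * d - b * c, (a + c) * (b + d)),
    }
```

For example, `contingency_scores(30, 10, 5, 55)` describes 100 made-up events in which precipitation was forecast 40 times and observed 35 times, so BIAS exceeds one (overforecasting). The counts are illustrative only and are not taken from the verification data.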


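The daily cycle of the Kalman predictor in Appendix B, Eqs. (B1)–(B3), can be sketched in Python as follows. This is a simplified illustration under stated assumptions and not the postprocessing code used in this study. The function name `kalman_bias_predictor` is hypothetical. The parameters a, c, and the two noise variances are passed in as fixed values, whereas the study derives a and c from the covariance matrices of bias. The starting value `p0` of the prediction mean-square error is also an assumption, because the text specifies only the zero initial-bias estimate.

```python
def kalman_bias_predictor(biases, a, c, var_w, var_v, p0=1.0, e0=0.0):
    """Return the predicted bias for the following day after each daily observed bias.

    biases : observed biases e_k = forecast - observation, one per day
    a, c   : system and observation factors (assumed fixed in this sketch)
    var_w  : variance of the system noise w_k
    var_v  : variance of the observation noise v_k
    p0     : assumed initial prediction mean-square error p_{1|0}
    e0     : initial bias estimate (zero, as in the text)
    """
    e_hat = e0  # bias estimate for today made yesterday, e_hat_{k|k-1}
    p = p0      # prediction mean-square error p_{k|k-1}
    predictions = []
    for e_k in biases:
        b = a * c * p / (c * c * p + var_v)         # Kalman gain, Eq. (B2)
        e_next = a * e_hat + b * (e_k - c * e_hat)  # predictor, Eq. (B1)
        p_next = a * a * p - a * c * b * p + var_w  # error update, Eq. (B3)
        e_hat, p = e_next, p_next                   # one-day time delay
        predictions.append(e_hat)
    return predictions
```

With the sign convention $e_k = T_k(\mathrm{fcst}) - T_k(\mathrm{obs})$, a corrected forecast would subtract the predicted bias from the next raw forecast. For a constant observed bias of 2.0 with a = c = 1, the predictions rise from the zero initial estimate toward 2.0 over the first days, consistent with the behavior described for (B1).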