UBC Faculty Research and Publications

Pattern-Based Evaluation of Coupled Meteorological and Air Quality Models. Beaver, Scott; Tanrikulu, Saffet; Palazoglu, Ahmet; Singh, Angadh; Soong, Su-Tzai; Jia, Yiqin; Tran, Cuong; Ainslie, Bruce; Steyn, Douw G. 2010


Pattern-Based Evaluation of Coupled Meteorological and Air Quality Models

SCOTT BEAVER AND SAFFET TANRIKULU
Bay Area Air Quality Management District, San Francisco, California

AHMET PALAZOGLU AND ANGADH SINGH
University of California, Davis, Davis, California

SU-TZAI SOONG, YIQIN JIA, AND CUONG TRAN
Bay Area Air Quality Management District, San Francisco, California

BRUCE AINSLIE AND DOUW G. STEYN
The University of British Columbia, Vancouver, British Columbia, Canada

(Manuscript received 13 January 2010, in final form 8 April 2010)

ABSTRACT

A novel pattern-based model evaluation technique is proposed and demonstrated for air quality models (AQMs) driven by meteorological model (MM) output. The evaluation technique is applied directly to the MM output; however, it is ultimately used to gauge the performance of the driven AQM. This evaluation of AQM performance based on MM performance is a major advance over traditional evaluation methods. First, meteorological cluster analysis is used to assign the days of a historical measurement period among a small number of weather patterns having distinct air quality characteristics. The clustering algorithm groups days sharing similar empirical orthogonal function (EOF) representations of their measurements. In this study, EOF analysis is used to extract space–time patterns in the surface wind field reflecting both synoptic and mesoscale influences. Second, simulated wind fields are classified among the determined weather patterns using the measurement-derived EOFs. For a given period, the level of agreement between the observation-based clustering labels and the simulation-based classification labels is used to assess the validity of the simulation results. Mismatches occurring between the two sets of labels for a given period imply inaccurately simulated conditions. Moreover, the specific nature of a mismatch can help to diagnose the downstream effects of improperly simulated meteorological fields on AQM performance.
This pattern-based model evaluation technique was applied to extended simulations of fine particulate matter (PM2.5) covering two winter seasons for the San Francisco Bay Area of California.

Corresponding author address: Saffet Tanrikulu, Bay Area Air Quality Management District, 939 Ellis St., San Francisco, CA 94109. E-mail: stanrikulu@baaqmd.gov

Journal of Applied Meteorology and Climatology, Volume 49, October 2010. DOI: 10.1175/2010JAMC2471.1. © 2010 American Meteorological Society

1. Introduction

Photochemical air quality model (AQM; Russell and Dennis 2000) simulations are increasingly used for regulatory purposes (Fine et al. 2003). They provide technical information to support air quality planning decisions. The resulting policies can affect billions of dollars worth of public health and economic activity annually (Yang et al. 2005). Because of the large stakes involved, policy makers require confidence that simulation results are valid. For use in policy making, AQM simulations must go beyond merely reproducing observed pollutant levels. They must additionally represent atmospheric processes with sufficient fidelity to allow inferences about dominant pollutant buildup mechanisms. Understanding the major buildup pathways for regulated pollutants provides the only sound basis for optimizing emission control strategies. Thus, modelers must additionally evaluate AQM inputs such as meteorological fields, pollutant and precursor emissions inventories, and land use, as well as the inner workings of the model to track ambient conditions.

AQMs need to accurately reproduce dominant pollutant buildup pathways across the full range of meteorological conditions experienced during the modeled period. The atmospheric processes responsible for pollutant buildup are represented in gridded meteorological fields input to AQMs. These fields are often prepared using separate meteorological models (MMs). Thus, biases in the MM output are propagated through the AQM.
If MM biases can be identified, they can provide strong signals to assess the accuracy and diagnose the shortcomings of an AQM.

Traditionally, simulated variables such as wind speed and direction, temperature, and humidity are directly compared against routine measurements (e.g., Seaman 2000) in an attempt to gauge the accuracy of the MM output. Such an operational evaluation (Tesche 2002) is simple to implement: error and bias statistics are computed between corresponding modeled variables and measured parameters that are paired in space and time. These paired comparison statistics assay the magnitude and sign of the discrepancy between simulation and observation. Large errors and/or biases indicate MM outputs that are unacceptable for use in AQMs. Smaller error and bias levels, however, do not guarantee the modeled meteorological fields are acceptable for use in AQM simulations. That is, error and bias statistics are necessary but insufficient for evaluating MM outputs as AQM inputs. For example, two ozone episodes from the same summer were modeled for the Central California Ozone Study (CCOS). Operational evaluations of the simulated meteorological fields indicated similar MM performance for both episodes (Wilczak et al. 2005; Tesche et al. 2004). These fields drove otherwise identical AQM simulations. AQM performance was adequate for one episode but poor for the other.

Various technical issues limit the robustness of operational evaluation statistics. A commonly cited problem is incommensurability (Swall and Foley 2009), also known as change of support (Wilke 2003): measurements, which are point estimates, are not directly comparable with modeled quantities, which are volume averages. Also, decreasing the model grid size often fails to yield better operational performance statistics (Rife and Davis 2005, and references therein). Most surface meteorological networks are insufficiently dense to sample the localized air flows represented in a finely gridded MM (Gego et al. 2005).
Additionally, error and bias statistics cannot account for stochastic fluctuations that are absent from deterministic model outputs (Hanna and Yang 2001). Finally, point-by-point operational evaluation cannot distinguish between atmospheric features that are missing altogether in the simulated fields, as opposed to those that are present but dislocated in time and/or space.

Evaluation methods based on space–time patterns (Casati et al. 2008) often provide more physical insight than operational evaluation. Generally, a statistical method is applied to extract patterns from either a spatial field or a time series for both simulated and observed values. The extracted patterns are then compared between simulation and observation. This framework to compare patterns avoids the direct pairing of observed point estimates with simulated quantities. Incommensurability issues are largely avoided. Also, the patterns are estimated using multiple data points (from either a spatial field or time series). This approach contrasts markedly with operational evaluation, which pairs single data points in space or time. Thus, simulated quantities may be more robustly compared against measurements by using a pattern-based approach instead of paired statistics. Moreover, evaluation explicitly based on space–time patterns is likely to characterize a model's ability to reproduce physically relevant atmospheric features. Alternatively, operational evaluation is purely empirical and may lack physical meaning.

There are many types of model evaluation based on space–time patterns. Spectral decomposition can determine whether important time scales are sufficiently represented in a simulation. Decompositions can be performed using linear filtering (Rao et al. 1997; Gilliam et al. 2006) or wavelets (Li and Shue 2004). Also, joint distributions between different fields (e.g., wind and temperature) can be estimated.
Comparing simulated and observed joint distributions can indicate how well the temporal coherence of the respective fields has been simulated (Mueller 2009). Spatial patterns are commonly isolated from fields of model output and observations using empirical orthogonal functions (EOFs; Ludwig et al. 1995), also known as principal component analysis (PCA; Rohli et al. 2004). These spatial patterns can then be compared qualitatively and/or quantitatively to evaluate the simulated fields. Cluster analysis (Ainslie and Steyn 2007) and other data partitioning methods (Cannon et al. 2002) are also useful for extracting patterns in space and/or time. In practice, multiple space–time statistical techniques may be combined to focus on specific scales at which conceptually important phenomena occur.

Ideally, coupled MM–AQM evaluation should explicitly account for MM shortcomings that may degrade AQM performance. In practical terms, such an evaluation technique would save resources, as meteorological fields unsuitable as AQM inputs could be identified directly. This foresight would avoid the costs of running and evaluating AQM simulations destined to perform poorly. An evaluation technique that predicts AQM performance based on MM performance is also attractive from a scientific standpoint. Empirical relationships linking the weather and air quality may aid conceptual model development for air pollution meteorology (Christakos 2003).

This paper introduces a novel model evaluation technique for a coupled MM–AQM in which the MM output is used as AQM input. It is based on comparing statistically extracted space–time weather patterns embedded in meteorological observations and MM output. The method achieves two important goals. First, it predicts how inaccuracies for MM-generated meteorological fields may degrade AQM performance. Second, it evaluates coupled MM–AQM performance across different weather patterns.
It can identify and diagnose representative weather patterns for which systematically poor MM performance consistently degrades AQM performance.

2. Proposed pattern-based evaluation framework

We propose a two-stage pattern-based framework for coupled MM–AQM evaluation. First, actual meteorological conditions are categorized among, or binned into, a set of statistically defined weather patterns. Each weather pattern should reflect different spatial and temporal distributions for the analyzed meteorological parameters and should also be associated with distinct air pollution characteristics. The historical period from which the weather patterns are identified must include, but can extend beyond, the simulation period for which model evaluation will be performed. Second, MM outputs are classified into the previously identified, measurement-derived weather patterns. For a given period, agreement between the observation-based categorization and the simulation-based classification implies model validity. A mismatch in the labeling of some period between simulation and observation implies that the distribution of simulated quantities is inconsistent with observation. The nature of any mismatch allows inference as to how AQM performance may be degraded by MM shortcomings. Simulated pollutant levels are likely to resemble those associated with the mistakenly simulated weather pattern indicated by the classification, instead of the observed weather pattern reflected by the measurements. Longer durations for such mismatches will likely result in increasingly severe AQM performance degradation.

Labelings of the observed and simulated meteorological conditions are implemented using unsupervised (data driven) cluster analysis and supervised classification, respectively (Jain 2000). Clustering both identifies the measurement-derived weather patterns and labels their times of occurrence.
Each weather pattern, or cluster, is associated with a distinctly parameterized statistical model that best describes the distribution of its assigned data. Classification determines which measurement-derived weather pattern most closely matches the modeled meteorological fields for a given period. The classification is performed using a statistical calculation analogous to that used by the clustering algorithm that defined the weather patterns. Here, both clustering and classification are based on EOF analysis of wind fields, as described in section 3.

Clustering of meteorological parameters can readily establish the links between measured air quality and observed meteorological conditions. These links allow prediction of AQM performance based on MM output. However, the clustering must be applied to appropriate data to identify atmospheric features at scales relevant to air quality over the modeling domain. These scales may generally include the planetary, synoptic, meso-, and microscales. This model evaluation framework assumes that nonmeteorological inputs to the AQM such as emissions, chemistry, and land use are reasonably accurate. Otherwise, it may not be possible to determine the causes of AQM performance issues based on an evaluation of meteorological fields alone.

To illustrate the utility of pattern-based model evaluation, consider a simple two-pattern example. Suppose that actual conditions are clustered as stagnant, but simulated conditions are classified as windy and turbulent. In this case, the AQM would be expected to underestimate pollutant levels. On the other hand, suppose that conditions are in fact windy but are simulated as stagnant. In that case, the AQM would be expected to overestimate pollutant levels. Continuing with the same example, consider the case in which actual conditions are stagnant for an entire week. If only one day of the week is mistakenly simulated as windy, then AQM performance may not be severely degraded.
If the mismatch occurs for the entire week, however, the AQM performance would be expected to be worse.

3. Theory

a. EOF analysis of wind fields

In this study, clustering of meteorological observations and classification of MM outputs are both based on EOF analysis (Lorenz 1956), also known as PCA (Jolliffe 2002). This statistical approach can extract features from meteorological parameters measured over space and time. Here, EOFs are estimated from hourly u and v wind components measured from a network of s surface weather stations. In terms of PCA, the model is applied in the S mode (Serrano et al. 1999) with the parameters (u or v at a specific station) treated as "variables" and the sampling times treated as "cases." The u and v components for each station are scaled by dividing by the mean observed wind speed for that station. This scaling weights each weather station roughly equally in the EOF analysis without distorting the wind directions. To account for the autocorrelation (Shumway and Stoffer 2005) in the hourly wind measurements, replicated values at 1- and 2-h delays are concatenated to the original values. Scaled values for each hour h are stacked into "measurement vectors" $x(h)$ ($1 \times 6s$) as follows, where the subscript is an index over the s weather stations,

$$x(h) = \left[\, u_1(h),\ v_1(h),\ u_1(h-1),\ v_1(h-1),\ u_1(h-2),\ v_1(h-2),\ \ldots,\ u_s(h),\ v_s(h),\ u_s(h-1),\ v_s(h-1),\ u_s(h-2),\ v_s(h-2) \,\right]. \tag{1}$$

The EOFs for any set of measurement vectors $x(h)$ are estimated by applying singular value decomposition (SVD). Each EOF is associated with a singular value $\sigma_i$ that is proportional to the amount of variance in the set of measurement vectors explained by that EOF. The EOFs are rank ordered by decreasing level of variability explained. The first EOF has the lowest order (1), has the largest singular value, and explains more variance than any other EOF. Of a possible $6s$ EOFs, only the first $n_{\max} \ll 6s$ EOFs are retained and stacked into the columns of $P$ ($6s \times n_{\max}$).
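The construction of the measurement vectors in (1) and the truncated SVD can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the function names `build_measurement_vectors` and `estimate_eofs` are hypothetical, the inputs are assumed to be gap-free hourly arrays of shape (hours, stations), and the data are not mean-centered because the text does not say whether centering is applied.

```python
import numpy as np

def build_measurement_vectors(u, v, n_lags=2):
    """Stack scaled u, v and their 1- and 2-h delays per station, as in Eq. (1).

    u, v: arrays of shape (n_hours, s). Returns shape (n_hours - n_lags, 2*(n_lags+1)*s).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # Scale each station by its mean wind speed (preserves wind direction).
    mean_speed = np.hypot(u, v).mean(axis=0)
    u_s, v_s = u / mean_speed, v / mean_speed
    n_hours, s = u.shape
    columns = []
    for j in range(s):                    # station index
        for lag in range(n_lags + 1):     # h, h-1, h-2
            # Rows correspond to hours h = n_lags, ..., n_hours - 1.
            columns.append(u_s[n_lags - lag:n_hours - lag, j])
            columns.append(v_s[n_lags - lag:n_hours - lag, j])
    return np.column_stack(columns)

def estimate_eofs(X, n_max):
    """Return the first n_max EOFs (columns of P) and the percentage of
    variability they explain, computed from the singular values."""
    _, sing, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_max].T                      # shape (6s, n_max)
    pct = 100.0 * sing[:n_max].sum() / sing.sum()
    return P, pct
```

For a 12-station network, `build_measurement_vectors` would return 72 columns, and `estimate_eofs(X, 14)` would return a 72 × 14 matrix of EOFs together with the percentage they explain.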
The percentage of the variability in the decomposed data that is explained by the first $n_{\max}$ EOFs is calculated from the singular values,

$$\%\ \text{variance explained} = \frac{\sum_{i=1}^{n_{\max}} \sigma_i}{\sum_{i=1}^{6s} \sigma_i} \times 100. \tag{2}$$

The EOFs are orthogonal, and, when applied to time series values, reflect wind field variability within distinct frequency bands (Galin 2007). Atmospheric processes occurring at lower frequencies generally have larger spatial scales (Steyn et al. 1981). Because the EOFs are a spectral decomposition of the wind field time series, they are also associated with any atmospheric processes that are coherent (correlated in time) with the wind field.

For regional study domains in which large-scale influences dominate the weather, the EOF rank-ordering is similar to the classical concept of wavenumber for numerical weather modeling. Relatively lower ranked EOFs tend to represent features at relatively larger scales. Synoptic influences generally affect all stations in a region and thereby tend to contribute the largest amounts of variability to the measurement vectors. Thus, synoptic influences tend to be represented by the lower-order EOFs. More localized atmospheric features affect subsets of the stations and thereby tend to contribute less to the overall variability in the wind field. These mesoscale influences tend to be represented by the middle-order EOFs. Microscale influences and stochastic fluctuations may affect each weather station uniquely. They tend to be represented in the higher-order EOFs, which explain small amounts of variability. Microscale structures in the boundary layer are of little interest for model validation purposes because MMs typically do not represent such fine scale processes. Thus, the user should attempt to select $n_{\max}$ to retain the lower-order (synoptic) and middle-order (mesoscale) EOFs and discard the higher-order (microscale) EOFs.

b. EOF-based cluster analysis

The nontraditional nonhierarchical clustering algorithm of Beaver and Palazoglu (2006a) is applied to measurement vectors $x(h)$ to produce k clusters of days, or weather patterns. Each cluster c is represented by a distinctly parameterized set of $n_{\max}$ EOFs appearing as the columns of matrix $P_c$ ($6s \times n_{\max}$). The parameter $n_{\max}$ is determined by trial and error such that all clusters sufficiently reflect the various synoptic and mesoscale phenomena represented in their assigned measurements. The clustering algorithm is constrained to always assign the 24 h from a given day (midnight to midnight, local time) to the same cluster. This blocking of the hourly cluster assignments into 24-h windows serves as a simple low-pass filtering to generate daily labels by clustering hourly measurements. Measurement vectors $x(h)$ for each day d appear as the rows of data block $X(d)$ ($24 \times 6s$).

Initially, each cluster is randomly seeded with the daily data blocks. Then, the days are reassigned iteratively to produce an optimized set of clusters. On each iteration, an EOF model $P_c$ is estimated for each cluster c from its assigned data blocks $X(d)$ vertically concatenated into the rows of supramatrix $X_c$ ($24N_c \times 6s$), where $N_c$ is the number of days assigned to cluster c. The scalar sum-of-squares error totaled across 24 h, $e_c(d)$, is computed for fitting the block of data for each day d into the EOF model for each cluster c,

$$e_c(d) = \left[\, \left\lVert X(d)\left(I - P_c P_c^{T}\right) \right\rVert_F \,\right]^2 = \sum_{h=t(d)}^{t(d)+23} \left[\, \left\lVert x(h)\left(I - P_c P_c^{T}\right) \right\rVert \,\right]^2. \tag{3}$$

The notation indicates squared Frobenius (matrix Euclidean) and L2 (vector Euclidean) norms, $I$ is an identity matrix, and $t(d)$ is the first hour (midnight local time) of day d. Then, each day d is reassigned to the cluster c satisfying $\arg\min_c e_c(d)$. The iterative procedure continues until no further reassignments are possible.

The nonhierarchical algorithm usually converges to a local minimum for a given value of k, the number of clusters.
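The iteration just described (random seeding, per-cluster EOF estimation, reassignment of each day to the cluster with the smallest error in (3), repeated until labels stop changing) can be sketched as below. This is only an illustrative single run under stated assumptions, not the algorithm of Beaver and Palazoglu (2006a): the function names, the seeding scheme, the `max_iter` cap, and the handling of empty clusters are all hypothetical choices, and the ensemble procedure and confidence-based labeling are not included.

```python
import numpy as np

def fit_eofs(Xc, n_max):
    """EOF model for one cluster: first n_max right singular vectors as columns."""
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_max].T

def residual_error(Xd, P):
    """Squared Frobenius norm of X(d)(I - P P^T), as in Eq. (3)."""
    R = Xd - Xd @ P @ P.T
    return float(np.sum(R * R))

def cluster_days(blocks, k, n_max, seed=0, max_iter=100):
    """One randomly initialized run. blocks: list of (24, 6s) daily arrays."""
    rng = np.random.default_rng(seed)
    n_days = len(blocks)
    # Assumed seeding: shuffle the days and deal them round-robin into k clusters.
    labels = rng.permutation(n_days) % k
    models = []
    for _ in range(max_iter):
        models = []
        for c in range(k):
            members = [blocks[d] for d in range(n_days) if labels[d] == c]
            # An empty cluster gets no model and can never be selected.
            models.append(fit_eofs(np.vstack(members), n_max) if members else None)
        errors = np.array([[np.inf if P is None else residual_error(Xd, P)
                            for P in models] for Xd in blocks])
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                          # no further reassignments
        labels = new_labels
    return labels, models
```

Because each run can stop at a different local minimum, a single call is only a building block; different `seed` values would be used to generate an ensemble of runs.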
The procedure of Beaver and Palazoglu (2006b) is applied to an ensemble of randomly initialized runs of the clustering algorithm. This randomized resampling approach, similar to bootstrapping, yields a final ensemble-averaged solution with an appropriate number of clusters that is near the global minimum of the solution space. Most days are assigned a single cluster label having a high level of confidence. Some transitional days sharing properties of two clusters may be doubly assigned with moderate confidence. A small proportion of the days cannot be assigned to any cluster with reasonable confidence and remain unlabeled.

Properly identified weather patterns should be associated with distinct aloft conditions, despite no aloft data having been input to the clustering algorithm. The corresponding aloft conditions for each cluster are determined by compositing weather maps across the days assigned to that cluster. Cluster-averaged precipitation and surface temperature fields can further characterize the weather patterns.

c. Multiscale classification of model outputs using EOFs

Once the weather patterns are established by clustering, MM outputs can be classified among these known categories. Simulated wind values are first interpolated from model grid points to the corresponding locations of the weather stations that provided the clustered measurements. The simulated u and v components are scaled by the modeled mean wind speed for that location. This scaling reduces discrepancies in wind speed between model and observation while still preserving the simulated wind field spatial structure. The processed model output is arranged into vectors analogously to (1). Then, the sum-of-squares errors for fitting each day of model output into each cluster's EOFs are calculated analogously to (3). Each day of model output is classified into the cluster having the EOFs that represent those simulated winds with the smallest sum-of-squares error.

The classification can be performed using different subsets of the EOFs.
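As a minimal sketch of this classification step, the function below computes the error in (3) for one day of processed model output against each cluster's EOFs, using only the leading n columns of each EOF matrix, and returns the pattern with the smallest error. The function name `classify_day` and the dictionary layout of the EOF models are assumptions made for illustration.

```python
import numpy as np

def classify_day(Xd, eof_models, n):
    """Classify one (24, 6s) day of scaled model output.

    eof_models: dict mapping pattern name -> P_c with EOFs as columns.
    n: number of leading EOFs used (the EOF model order).
    Returns (best_pattern, errors_by_pattern).
    """
    errors = {}
    for name, P in eof_models.items():
        Pn = P[:, :n]                      # keep only the first n EOFs
        R = Xd - Xd @ Pn @ Pn.T            # residual after projection
        errors[name] = float(np.sum(R * R))
    best = min(errors, key=errors.get)
    return best, errors

# Toy example with two hypothetical patterns in a 4-variable space.
models = {"A": np.eye(4)[:, [0, 1]], "B": np.eye(4)[:, [2, 3]]}
day = np.tile([0.0, 2.0, 1.0, 0.0], (24, 1))
label_order1, _ = classify_day(day, models, 1)
label_order2, _ = classify_day(day, models, 2)
```

In the toy example the same day is labeled differently at the two orders, which mirrors how a day of model output can match different measurement-derived patterns at different scales.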
Here, classifications are always performed using the first n EOFs, where $n \le n_{\max}$ is said to be the EOF model order. This hierarchical EOF model structure is used because finer-scale patterns are not well defined without being superimposed on their larger-scale settings. Classification using EOF model order n is achieved analogously to (3), except using only the first n columns of $P_c$. A given day of model output may be classified into different clusters at different EOF model orders. Such behavior indicates simulated conditions corresponding to different measurement-derived weather patterns at different scales. The clustering, on the other hand, only needs to be performed once using $n_{\max}$ EOFs, as determined during execution of the algorithm. The clustering estimates a total of $n_{\max}$ EOFs for each cluster that are by definition consistent across all scales. This consistency across scales for the measurement-based clusters reflects how distinct synoptic regimes set the stage for distinct mesoscale air flows to develop.

4. Case study

a. Description of study domain

The proposed pattern-based model evaluation technique is demonstrated for the San Francisco Bay Area (SFBA) of California (Fig. 1) for the core fine particulate matter (PM2.5) season of December–January. Exceedances of the 24-h PM2.5 National Ambient Air Quality Standard (NAAQS) of 35 μg m⁻³ occurred mostly during these months. During these winter episodes, synoptic-scale stability and subsidence often trapped PM2.5 and its precursors close to the ground. At the mesoscale, terrain-induced air flows defined the source–receptor relationships, limited pollutant dispersion, and controlled the PM2.5 spatial distribution. The SFBA is ideal for demonstrating pattern-based model evaluation at both the synoptic scale and mesoscale.

The SFBA is part of the larger central California domain, which also includes the Sacramento Valley (SV) and the San Joaquin Valley (SJV).
This pair of large, inland valleys together forms the Central Valley (CV). The SFBA, the SV, and the SJV have major connections at the Delta region to the east of the Bay. Air flows between the SFBA and the CV occurred through the narrow Carquinez Strait, the only major gap in the rims surrounding the CV. These three basins shared similar air quality characteristics because of similar emissions, coupled meteorological conditions, and interconnected terrain. During episodic winter conditions, the SV and/or the SJV were often upwind of the SFBA.

b. Summary of previous cluster analysis results

Cluster analysis was applied to SFBA surface wind measurements from 12 winter seasons (November–March) from 1 January 1996 to 31 March 2007 (Beaver et al. 2010). Clustering 1754 days robustly identified the relevant weather patterns impacting PM2.5 levels. The clustered surface wind measurements were from a network of 12 SFBA weather stations shown in Fig. 1. Based on previous experience, microscale structures and noise were assumed to account for around 10% to 15% of the SFBA wind field variability. Thus, the cluster analysis was performed using $n_{\max} = 14$ to explain around 85%–90% of the variability in each cluster that represented mostly synoptic and mesoscale influences.

The clustering identified five weather patterns having distinct PM2.5 characteristics. The synoptic-scale conditions were resolved by compositing gridded pressure level data up to the 500-hPa pressure level. The 500-hPa composite National Centers for Environmental Prediction (NCEP) reanalysis (http://www.esrl.noaa.gov/psd/) geopotential height fields (not shown) were used to name the clusters. The type of synoptic features impacting the SFBA, their relative strengths of forcing on the surface winds, and cluster names are indicated in Table 1.
Interregional surface airflow patterns for these clusters are shown in Fig. 2. Distinct wind field patterns in the CV, outside of the clustered domain, provided further evidence that the weather patterns are real. Each cluster was also verified to exhibit a distinct surface temperature pattern (not shown). These five weather patterns are fully described in Beaver et al. (2010).

FIG. 1. SFBA and partial CV study domain showing surface wind stations used in cluster analysis, PM2.5 monitors, the Arbuckle weather station, and important geographic features.

Three clusters (named R1, R2, and R3) were associated with anticyclonic conditions and elevated PM2.5 levels; "R" denotes upper-level high pressure ridges. Over 80% of the SFBA PM2.5 24-h exceedances occurred under R2. The rest occurred mostly under R3. Episodic weather patterns R2 and R3 both had ridges of aloft high pressure positioned over the SFBA, resulting in weak large-scale pressure gradients. Light, shallow, easterly air flows developed around the SFBA. Cluster R3 had the weakest large-scale forcing and the lowest SFBA wind speeds. It was also the only weather pattern with diurnally reversing wind directions. Both episodic weather patterns exhibited near-calm conditions throughout the CV. Like the episodic weather patterns, R1 also had easterly surface winds through the SFBA. Unlike the episodic patterns, however, the R1 airflow pattern was driven by a strong large-scale pressure gradient. The R1 flow pattern was relatively deep, and both moderate mechanical mixing (vertical dispersion) rates and mixing depths resulted in moderate PM2.5 levels. Strong winds entered the SV from the north and flowed southward along the SV major axis. Unlike the episodic weather patterns, the aloft ridge for R1 was positioned offshore instead of over the SFBA. Two other cyclonic weather patterns were named V for ventilated and S for stormy. Both exhibited strong large-scale pressure gradients, strong marine winds entering the SFBA from the west, and low PM2.5 levels. Nearly all SFBA precipitation occurred under S.

TABLE 1. Names, number of occurrences, number of NAAQS exceedance days (any SFBA monitor exceeds 35 μg m⁻³), and qualitative characteristics of five clusters.

Name  No. days total  No. exceedance days  PM2.5 levels  Synoptic (500-hPa level) feature    Strength of synoptic forcing
R1    219             7                    Moderate      Offshore high pressure ridge        Strong
R2    422             145                  Highest       Shoreline high pressure ridge       Weak
R3    279             25                   High          Inland high pressure ridge          Weakest
V     413             6                    Low           Trough (ventilated)                 Strong
S     489             6                    Low           Storm–cyclone (zonal flow aloft)    Strong

FIG. 2. Mean 0900 PST 1-h surface wind fields for five clusters. Arrow lengths are proportional to wind speed. Arrows point along direction of wind. Arrow tails are positioned at stations.

c. Description of MM–AQM simulations

Mesoscale meteorological and photochemical simulations were performed for a subset of the 1996–2007 winter cluster analysis study period. The modeling domain included the SFBA, the SV, the SJV, and remote regions over the Pacific Ocean and the Sierra Nevada. Simulations for 1 December–2 February were performed for both the 2000/01 and 2006/07 winters, for 128 days total. Meteorological fields were prepared using the fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5) with 4-km horizontal grid size and 30 vertical layers. Then, PM2.5 levels were simulated using the Community Multiscale Air Quality (CMAQ) model with the Statewide Air Pollution Research Center, version 1999 (SAPRC99), chemical mechanism and the Models-3 AERO3 aerosol module with the Regional Acid Deposition Model aqueous chemistry mechanism (AE3-aq). Emissions only varied by day of week, with significant weekday–weekend differences, and by winter season.

CMAQ performance was evaluated for three key monitoring locations.
Gravimetric samplers analyzed using the federal reference method (FRM) provided daily 24-h PM2.5 measurements at Concord and San Jose. Beta attenuation method (BAM) instruments provided daily 24-h PM2.5 measurements at Livermore and San Jose. San Jose PM2.5 level was taken as the average of the FRM and BAM measurements. Observations were compared against the minimally deviating simulated value within a 3 × 3 array of first-layer grid cells centered around the monitor. Pairing the observations with simulated values in adjacent grid cells helped account for the sharp PM2.5 gradients over the complex terrain.

5. Results

a. MM evaluation

Simulated hourly winds were interpolated from the MM5 output to locations corresponding to the surface weather stations used in the clustering. These simulated winds were used to classify each day among the five weather patterns described in section 4b. Classifications were performed using EOF model orders 1–14.

Table 2 shows the correspondence of each pair of clustering (observation) and classification (simulation) labels using selected lower- and middle-order EOFs. The selected lower-order EOF classification used the first 3 EOFs and reflected mostly large-scale (synoptic) variability. The selected middle-order EOF classification used the first 11 EOFs and reflected more localized (mesoscale) circulations. In reality, there is a continuum of scales represented across the 14 EOF model orders. Representative results for two model orders (3 and 11) near opposite ends of this spectrum demonstrate the multiscale capabilities of the EOF-based evaluation technique.

Regardless of actual conditions (cluster label), MM5 was generally unable to reproduce the R3 pattern. The cluster analysis assigned 23 days from the simulation period to this pattern. Of the 128 simulated days, only 5 and 6 (column sums in Table 2) were simulated as R3 (correctly or otherwise) at the selected lower and middle EOF model orders, respectively. Many of the R3 days were incorrectly simulated as either R1 or S, both windy patterns.
This mismatch suggested CMAQ would underestimate PM2.5 levels for most of the R3 days because simulated wind speeds were too high.

MM5 also had trouble simulating R2. At the selected lower model order, under half (19 of 51) of the R2 days were correctly simulated. Around half (24 of 51) of these R2 days were mistakenly simulated as R1, a windy pattern. At the selected middle EOF model order, MM5 performance was further degraded. More R2 days (34) were mistakenly simulated as R1, and fewer R2 days (10) were correctly simulated. MM5 performance for this episodic cluster was more degraded at finer scales. These mismatches suggested that CMAQ would underestimate PM2.5 levels for the majority of the R2 days because they were simulated as high wind pattern R1. This systematic MM5 bias to mistakenly simulate observed episodic R2 conditions as nonepisodic R1 conditions is termed the R2–R1 mismatch.

TABLE 2. Numbers of assigned days for observed clusters (first two columns). Numbers of simulated days from each cluster classified to each pattern, using lower- and middle-order EOFs. Sums across columns for the classification tables (center and right groups of columns) may not match value in same row under "observed clusters" because any doubly assigned days were counted as full matches to both patterns.

Observed clusters   First 3 EOFs (lower order)   First 11 EOFs (middle order)
Name  No. days      R1  R2  R3  V   S            R1  R2  R3  V   S
R1    31            26  0   0   0   4            28  0   0   0   4
R2    51            24  19  1   0   7            34  10  1   0   7
R3    23            12  1   4   2   6            14  0   4   3   4
V     1 5 00442 15 5
S     22            1   1   0   0   22           2   0   0   1   20

Across all scales, both R1 and S were usually simulated correctly. MM5 had difficulty simulating V on more than half of its occurrences. At lower order, V was likely to be confused with either R1 or S. At middle order, however, V was generally only confused with S.
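A correspondence table of the kind shown in Table 2 can be tallied from the two sets of daily labels. The sketch below is illustrative only: the function names and the representation of each observed day as a tuple of zero, one, or two labels are assumptions, and the rule that a doubly assigned day adds a full count to both of its rows is one reading of the Table 2 caption.

```python
from collections import Counter

def cross_tabulate(observed, simulated, patterns):
    """Rows: observed cluster; columns: simulated classification.

    observed: list of tuples of labels per day (() for unlabeled days,
              two labels for doubly assigned days).
    simulated: list with one classification label per day.
    """
    counts = Counter()
    for obs_labels, sim in zip(observed, simulated):
        for obs in obs_labels:             # doubly assigned days count in both rows
            counts[(obs, sim)] += 1
    return [[counts[(o, p)] for p in patterns] for o in patterns]

def agreement_rate(observed, simulated):
    """Fraction of labeled days whose simulated label matches an observed label."""
    labeled = [(o, s) for o, s in zip(observed, simulated) if o]
    return sum(s in o for o, s in labeled) / len(labeled)

# Hypothetical five-day example, not data from the study.
patterns = ["R1", "R2", "R3", "V", "S"]
observed = [("R2",), ("R2", "R3"), ("R1",), (), ("S",)]
simulated = ["R1", "R3", "R1", "S", "S"]
table = cross_tabulate(observed, simulated, patterns)
rate = agreement_rate(observed, simulated)
```

Off-diagonal entries identify which observed pattern was mistaken for which simulated pattern, which is the information used in the text to anticipate the sign of the resulting AQM bias.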
Simulating V as S was not likely to degrade CMAQ performance significantly, because both cyclonic weather patterns had low PM2.5 levels.

Figure 3 shows the time series for the clustering and classification labels. Results of the classification across all 14 EOF model orders are shown for 2000/01 only. The time series for the model evaluation results indicated one significant problem with the timing of MM5. The clustering indicated that the transition R2 → S occurred over 8–9 January. MM5, however, produced this same transition two days in advance of the observed transition. This mistiming suggested that CMAQ would produce a premature decrease in simulated PM2.5 levels before 8–9 January.

For most EOF model orders, the classification label varied smoothly with EOF model order (vertical dimension in Fig. 3, bottom panel). For example, R2 days were often simulated as R2 for lower model orders, but they were simulated as R1 for middle orders. Classifications using just the lowest lower-order EOF (1) and including the highest middle-order EOFs (13–14) were often inconsistent with those using intermediate EOF model orders (2–12).

b. AQM evaluation

The coupled MM–AQM evaluation technique generally indicated that episodic conditions (R2 and R3) were inaccurately simulated whereas nonepisodic conditions (R1, V, and S) were reasonably accurately simulated. Thus, CMAQ performance varied considerably between episodic and nonepisodic conditions. Days clustered into R1, V, or S exhibited simulated 24-h PM2.5 levels at Concord, Livermore, and San Jose with mean biases (model minus observation; negative biases indicated model underestimation) and errors of −1.6 ± 6.4, −4.2 ± 8.8, and 0.0 ± 9.3 μg m⁻³, respectively, relative to the measurements over 2000/01 and 2006/07. In comparison, days clustered into R2 or R3 had considerably poorer statistics of −7.3 ± 12.2, −13.7 ± 15.6, and −7.3 ± 14.8 μg m⁻³, respectively. Despite significant biases, the observed and simulated PM2.5 levels were well correlated.
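The statistics quoted above can be illustrated with a short sketch. The mean bias follows the stated sign convention (model minus observation). How the accompanying error term was computed is not stated in this section, so the sample standard deviation of the differences is used here purely as an assumption. The function names and the four paired values are hypothetical and are not data from the study.

```python
import math
import statistics


def mean_bias(simulated, observed):
    """Mean of (model minus observation); negative means underestimation."""
    return statistics.mean([s - o for s, o in zip(simulated, observed)])


def bias_spread(simulated, observed):
    """Sample standard deviation of the differences (assumed error measure)."""
    return statistics.stdev([s - o for s, o in zip(simulated, observed)])


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical 24-h PM2.5 levels in μg m⁻³ (illustrative only).
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [8.0, 17.0, 26.0, 35.0]

bias = mean_bias(simulated, observed)
spread = bias_spread(simulated, observed)
r = pearson_r(simulated, observed)
```

Here the simulated values are an exact linear function of the observed ones, so the correlation is perfect even though every day is underestimated. Correlation therefore reflects timing and relative variation, while bias captures the systematic offset.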
Pearson correlation coefficients between simulated and observed PM2.5 levels for 2000/01 and 2006/07 at Concord, Livermore, and San Jose were 0.81, 0.69, and 0.82, respectively.

FIG. 3. Time series for cluster and classification labels for 2000/01 simulation period. Each square indicates label(s) for a single day using a given EOF model order. Squares are broken into pairs of triangles for doubly assigned days. (top) Cluster (observation) labels using n_max = 14; (bottom) classification (simulation) labels for 1 ≤ n ≤ n_max stacked vertically. Patterns R1, R2, and R3 are indicated by grayscale shading. Patterns V, S, and unlabeled days are shown as white with no marker, a dot, and an x, respectively.

Time series for the CMAQ simulation results are shown in Fig. 4 for 2000/01. This simulation period included four complete episodes of elevated PM2.5 interspersed with relatively unpolluted conditions. Three episodes occurred under persistent R2 conditions, as determined by the clustering (see cluster labels on Fig. 3): 1–7 December; 26 December–8 January; and 17–22 January. As indicated in Fig. 3, each of these episodes exhibited varying degrees of the R2–R1 mismatch. Meteorological conditions for these episodes were often simulated correctly at lower EOF model orders; however, MM5 mistakenly simulated many of the R2 days as R1 at middle EOF model orders. Moreover, the R2–R1 mismatch occurred for multiple consecutive days during each episode. Therefore, as expected, CMAQ-simulated PM2.5 levels were underestimated, and in many cases, severely so. The third persistent R2-type episode (17–22 January) exhibited the most severe R2–R1 mismatch, with mismatches occurring for most days at most EOF model orders.
CMAQ-simulated PM2.5 levels were most severely underestimated for this episode. Also, the second persistent R2-type episode exhibited the R2–R1 mismatch at most scales during 31 December–2 January. Simulated PM2.5 levels for this episode were more severely underestimated during this period of intensified R2–R1 mismatch as compared to the straddling periods having mismatches at fewer scales. During the persistent R2-type episodes, the simulated meteorological conditions were most accurate during 4–6 January. These days were correctly simulated as R2 for most EOF model orders, except for orders 13–14, which often resulted in inconsistent classifications. The downward bias in simulated PM2.5 levels was less severe for most locations during this period.

Figure 5 provides a snapshot for a day exhibiting the R2–R1 mismatch. Simulated PM2.5 levels and winds are shown for the central California modeling domain on R2 day 27 December 2000. In reality, this day had near-calm winds and high PM2.5 levels throughout the CV. The simulation, however, produced winds that were too strong in the northern SV. The simulated surface airflow pattern (Fig. 5) most strongly resembled that of R1 (see Fig. 2).

Diagnosis of the R2–R1 mismatch focused on surface locations in the southern SV. Winds here were important for several reasons. First, clusters R1 and R2 were most strongly differentiated in the SV (see Fig. 2). Second, previous research has suggested that the complex SV surface flows are more sensitive to small changes in the large-scale pressure gradient driving flow through the Carquinez Strait than for the other central California basins (Bao et al. 2008). Third, the southern portion of the SV is connected with the SFBA, and direct pollutant exchange may have occurred here.

Figure 6 shows the time series for simulated and observed hourly wind speed and direction at Arbuckle (see Fig. 1) for the first two persistent R2-type episodes. Similar behavior was observed for the third persistent R2-type episode (not shown).
The Arbuckle station was representative of the southwestern SV during these episodes. When the R2–R1 mismatch occurred, simulated wind speeds were too high. The model also did not appear to reproduce the timing of the observed wind speed minima that often occurred overnight. Additionally, the observations indicated diurnally shifting flows with overnight westerly winds. MM5 winds were persistently from the northwest. During 4–6 January, when the R2–R1 mismatch was minimal, the simulated Arbuckle winds tracked the observed winds reasonably well. A similar pattern appeared in the southeastern SV (not shown), except that the observed overnight flows were easterly.

A different type of episode developed over 17–25 December, during which the sequence R1 → R2 → R3 → R1 occurred. This more transient type of episode having evolving large-scale conditions over an 8-day period was reasonably well modeled by MM5. Except for the R3 days, which were almost never simulated properly, the lower-order classification labels matched the cluster labels. Mismatches occurred at middle EOF orders. CMAQ performance for this episode was less degraded than for the other episodes that occurred under persistent R2 conditions.

The mismatch in timing for the R2 → S transition observed to occur over 8–9 January appeared to significantly degrade CMAQ performance. As expected, simulated PM2.5 levels at many locations began to decrease in advance of the observations. The PM2.5 levels were severely underestimated on 7 January.

FIG. 4. Simulated (plus signs) and observed (squares) 24-h PM2.5 levels at three SFBA monitoring locations for 2000/01 winter. Four highlighted episodes are of two classes: persistent R2 (diagonal hatch) and more transient R1 → R2 → R3 → R1 (vertical hatch). Horizontal lines are at 24-h PM2.5 NAAQS exceedance threshold (35 μg m⁻³).

6. Discussion

a. MM diagnosis

The predominant classification mismatches signaled a classic deficiency of MM5 to overemphasize the large-scale pressure gradient and underemphasize localized air flows (e.g., Hogrefe et al. 2001). MM5 generally had difficulty reproducing the conditions associated with weak synoptic forcing having an aloft ridge over the SFBA. The model was unable to produce R3, the pattern with the weakest synoptic forcing, regardless of actual conditions. The model also had difficulty reproducing R2, the pattern with the second-weakest synoptic forcing. Anticyclonic conditions (R1, R2, and R3) were generally simulated as R1, the anticyclonic pattern having the strongest synoptic forcing and moderate PM2.5 levels. Here, R1, V, and S shared strong large-scale pressure gradients and regionally high winds, and all three lacked strong stability. The R1 days were usually simulated correctly. The cyclonic patterns (V and S) were typically simulated as cyclonic, although the V–S mismatch was common. The V–S mismatch was not very important for air quality applications because both patterns were windy and well ventilated; however, this finding may be important for precipitation applications because V is dry and S is rainy.

The clustering indicated that PM2.5 episodes in the SFBA resulted upon transitions from cyclonic toward anticyclonic regimes. MM5 could distinguish between anticyclonic (R1, R2, and R3) and cyclonic (V and S) regimes having moderate-to-high and low PM2.5 levels, respectively. Thus, the simulated meteorological conditions should be able to distinguish between days with moderate to high PM2.5 levels and days with low PM2.5 levels. These simulated fields would be expected to reproduce the timing of PM2.5 episodes when used to drive an AQM. One exception would be the prematurely simulated R2 → S transition observed to occur over 8–9 January 2001.
Mismatches among the anticyclonic weather patterns would be expected to result in appreciable downward biases for CMAQ-simulated peak PM2.5 levels that occurred under episodic patterns R2 and R3.

FIG. 5. Simulated surface layer 24-h PM2.5 levels and 24-h wind field for 27 Dec 2000. Arrows point along direction of wind. The PM2.5 levels are indicated by grayscale. California boundaries and CV municipal boundaries for Sacramento, Stockton, Modesto, Fresno, and Bakersfield (north to south) are shown for reference. The dashed box indicates the extent of Fig. 1.

The R2–R1 mismatch was an important systematic bias identified for MM5 because the R2 conditions accounted for most SFBA PM2.5 episodes. This systematic bias occurred only under certain conditions, so it was not obvious using traditional operational evaluation techniques. The observed R1 and R2 patterns were clearly distinguished by the EOF-based clustering; however, simple analyses of weather maps or surface wind fields may not have distinguished these conditions. The EOFs reflected atmospheric processes coherent with the surface wind field, and therefore helped to reveal three-dimensional features differentiating the related R1 and R2 patterns. The clustering of surface observations clearly indicated differences in the positions of the ridges aloft. For R1 the ridge was positioned offshore, whereas for R2 the ridge was positioned directly over the SFBA. MM5, on the other hand, appeared relatively insensitive to differences in the boundary conditions between R1 and R2. Both R1 and R2 were observed to produce persistent easterly surface winds through the SFBA. These airflow patterns appeared similar based on simple wind field analyses of the SFBA measurements.
The EOF-based clustering, however, distinguished R1 as a relatively deep flow generated by the large-scale pressure gradient and R2 as a relatively shallow flow generated by terrain and surface heating effects.

The performance of MM5 was scale dependent, especially under conditions with pronounced terrain-induced airflow features. At the synoptic scale (lower-order EOFs), MM5 was able to reproduce the effects of the strong ridging pattern R2 about half of the time. Thus, the model was often able to replicate the bulk easterly SFBA surface air flows associated with the ridge. At the mesoscale (middle-order EOFs), however, MM5 simulated conditions that were not strongly conducive to PM2.5 buildup. SFBA surface winds were correctly simulated as persistently from the east; however, the R2–R1 mismatch implied that wind speeds, mixing rates, and therefore overall pollutant dispersion rates were too high, especially in the SV. The scale dependency was less prevalent for the weather patterns with strong synoptic forcing. At lower orders, V was mistakenly simulated as R1 or S, the other patterns with strong synoptic forcing. At middle orders, however, V was only mistakenly simulated as S, the other pattern with westerly marine surface winds. At middle orders that reflect mesoscale influences, R1 and V were not confused because they exhibited opposite directions of bulk surface flow through the SFBA (easterly and westerly, respectively).

At the very lowest (1) and highest middle (13–14) EOF model orders, the model evaluation technique itself did not perform well. The first EOF typically represented 40%–50% of the variability in the MM5 output. The simulated conditions likely were insufficiently represented using this lone EOF. Classification using the highest middle-order EOFs (13–14) generated many labels that did not vary smoothly with EOF model order. These highest middle-order EOFs were likely explaining highly localized conditions that were not strongly connected to the organized flows that determined PM2.5 source–receptor relationships.
Also, these highest middle-order EOFs may have represented stochastic fluctuations and/or microscale structures in the ambient conditions that were not represented by MM5. The poor performance of the model evaluation technique at extreme EOF model orders (1 and 13–14) represented effects of underfitting and overfitting, respectively, by the EOF models when classifying MM5 outputs.

FIG. 6. Time series for observed (solid line with plus signs) and simulated (dashed line with circles) hourly 1-h (top) wind speed and (bottom) direction at Arbuckle (see Fig. 1). Two periods exhibiting R2–R1 mismatch from 2000/01 winter are separated by a gray patch: 1–7 Dec and 26 Dec–8 Jan. Hashed vertical lines appear at midnight PST beginning each day.

b. AQM diagnosis

The predictive capability of the proposed MM–AQM evaluation technique provided a number of insights beyond those revealed by operational evaluation. As expected, episodic PM2.5 levels were usually underestimated by CMAQ. This effect was pronounced for episodes occurring under persistent R2 conditions, as manifested by the R2–R1 mismatch. The scale dependency of MM5 to exhibit the R2–R1 mismatch appeared to explain the degree of degraded CMAQ performance. Periods with R2–R1 mismatches occurring across more EOF model orders exhibited poorer CMAQ performance. Also, longer durations for the R2–R1 mismatch produced larger downward biases for the simulated PM2.5 levels. Presumably, a lack of simulated pollutant buildup helped cause these significantly underestimated PM2.5 levels, especially during persistent R2 conditions. For a different type of episode with stronger synoptic forcing, CMAQ performed reasonably well. With a single exception, the timing of the episodes was reproduced accurately.
A mistiming occurred for a storm observed to pass over 8–9 January 2001 that was simulated two days in advance. The otherwise accurate model timing was reflected by the high correlation coefficients between observed and simulated PM2.5 levels, despite often severe biases.

During the R2–R1 mismatch, the simulated wind speeds in the SV were too high. Surface winds appeared to be reasonably well simulated within the SFBA; however, the inaccurately simulated conditions upwind of the SFBA in the SV appeared to considerably degrade CMAQ performance. Surface wind speeds in the SV were too high, suggesting artificially high simulated mixing rates under the stable and subsiding conditions. During these episodes, the SJV may also have been upwind of the SFBA. SJV winds appeared to be simulated reasonably well during the R2–R1 mismatch.

Beyond inaccurate wind speeds, the R2–R1 mismatch indicated that MM5 produced the wrong type of low-level airflow features. The observed overnight westerly flows at Arbuckle (Fig. 6) represented terrain-induced downslope (drainage) flows under clear-sky anticyclonic conditions. The diurnally shifting observed wind directions further evidenced the localized nature of this flow pattern. No aloft measurements were available over the SV; however, the observed overnight downslope flow pattern was presumably relatively shallow. A similar pattern along the eastern SV slopes also suggested overnight downslope flows. These observed downslope flows over the CV rims have been previously linked with SFBA exceedances (Beaver et al. 2010). MM5 winds, however, were persistently from the northwest. During the overnight hours, MM5 was producing low-level down-valley flows channeled along the SV major axis when in fact downslope flows converged toward the valley floor. The simulated northerly flows over the SV extended from the surface through the tenth model layer, or around 800 m AGL. The simulated down-valley flows were likely deeper and had higher mixing rates than the observed downslope flows.
The model likely created too much dispersion in the SV, which subsequently affected downwind SFBA locations. Also, MM5 appeared to be unable to simulate the overnight calm conditions in the SV. This likely allowed for insufficient air mass aging in CMAQ, inhibiting buildup for dominant secondary PM2.5 components such as ammonium nitrate. The underestimation of PM2.5 levels during the R2–R1 mismatch may have also resulted from inaccurately simulated stability; R1 was far less stable than R2, allowing additional vertical dispersion of pollutants.

A second type of episode occurred under somewhat stronger synoptic forcing than episodes occurring under persistent R2 conditions. Localized terrain-induced flows were not as prevalent for this second type of episode. Thus, CMAQ performance was relatively improved because of the ability of MM5 to better handle these more synoptically forced surface air flows. Also, for episodes occurring under persistent R2 conditions, the R2 days with correct lower-order EOF classifications had somewhat improved CMAQ performance. One brief period (4–6 January 2001) having persistent R2 conditions was simulated correctly for both the lower- and most middle-order EOF classifications. MM5 reproduced diurnally shifting winds in the SV, and coupled MM5–CMAQ performance was better than for any other period exhibiting persistent R2 conditions. Nonepisode days typically occurred under patterns R1, V, and S. They had strong large-scale pressure gradients and reasonable performance for both the MM5 and CMAQ. The reasonable CMAQ performance to simulate the moderate PM2.5 levels associated with R1 suggested that the emissions inventory and chemical mechanism were reasonably accurate. Thus, overall, the most important factor for explaining degraded CMAQ performance appeared to be the inability of MM5 to produce terrain-induced flows over the complex central California terrain during weak synoptic forcing events.

7. Conclusions

A pattern-based method for coupled MM–AQM evaluation has been developed.
It was tested for PM2.5 simulations over the SFBA for two winter PM2.5 seasons. An EOF-based clustering of surface winds was performed using SFBA measurements. Five major weather patterns reflecting both synoptic and mesoscale variability impacting PM2.5 levels were identified. MM5 outputs for the two winter seasons were classified among these five measurement-derived weather patterns. For each day of the simulation period, the labels for the observed winds and MM5-simulated winds were compared. The effects of the MM5 classification mismatches were used to diagnose degraded CMAQ performance.

In general, MM5 had difficulty reproducing the meteorological conditions associated with weak synoptic forcing events. CMAQ performance was especially degraded for episodes having persistent ridges of aloft high pressure over the study domain, leading to stagnating surface conditions. For such episodes, the model often incorrectly produced winds driven by the large-scale pressure gradient instead of by localized mechanisms. (This discrepancy was termed the R2–R1 mismatch.) A key shortcoming of MM5 appeared to be its inability to simulate overnight downslope flows over the complex central California terrain, especially in the SV. It was interesting to find that the CMAQ performance for the SFBA appeared to be limited by degraded MM5 performance in the upwind SV. Episodes having somewhat stronger synoptic forcing were better simulated by MM5, and CMAQ-estimated PM2.5 levels were in closer agreement with observations. The timing of most episodes was properly simulated because MM5 could usually distinguish between anticyclonic and cyclonic conditions.

The above MM5 shortcoming is consistent with a well-known deficiency of this model. It typically provides too much synoptic push through complex terrain during periods of weak large-scale pressure gradients and light localized winds.
The pattern-based evaluation technique was quite valuable to identify and diagnose the impact of this general MM5 shortcoming for a specific application. Identification of the MM5 bias would have been difficult using traditional methods. First, the meteorology-dependent bias was not obvious from operational evaluation statistics averaged across entire winter seasons. Second, the MM5 bias was not apparent using only local surface analyses. The systematic deficiency involved inaccurately simulated three-dimensional structures in the boundary layer. The evaluation of CMAQ performance based on MM5 performance was only possible because other CMAQ inputs such as emissions and chemistry appeared to be reasonable.

Identification and diagnosis of systematic model biases are critical for transferring knowledge between modelers and model developers. Such knowledge transfer is of paramount importance for collaboratively improving model performance to meet the needs of air quality planners.

REFERENCES

Ainslie, B., and D. G. Steyn, 2007: Spatiotemporal trends in episodic ozone pollution in the Lower Fraser Valley, British Columbia, in relation to mesoscale atmospheric circulation patterns and emissions. J. Appl. Meteor. Climatol., 46, 1631–1644.
Bao, J. W., S. A. Michelson, P. O. G. Persson, I. V. Djalalova, and J. M. Wilczak, 2008: Observed and WRF-simulated low-level winds in a high-ozone episode during the Central California Ozone Study. J. Appl. Meteor. Climatol., 47, 2372–2394.
Beaver, S., and A. Palazoglu, 2006a: Cluster analysis of hourly wind measurements to reveal synoptic regimes affecting air quality. J. Appl. Meteor. Climatol., 45, 1710–1726.
Beaver, S., and A. Palazoglu, 2006b: A cluster aggregation scheme for ozone episode selection in the San Francisco, CA Bay area. Atmos. Environ., 40, 713–725.
Beaver, S., A. Palazoglu, A. Singh, S.-T. Soong, and S. Tanrikulu, 2010: Identification of weather patterns impacting 24-h average fine particulate matter pollution. Atmos. Environ., 44, 1761–1771.
Cannon, A. J., P. H. Whitfield, and E. R. Lord, 2002: Synoptic map-pattern classification using recursive partitioning and principal component analysis. Mon. Wea. Rev., 130, 1187–1206.
Casati, B., and Coauthors, 2008: Forecast verification: Current status and future directions. Meteor. Appl., 15, 3–18.
Christakos, G., 2003: Critical conceptualism in environmental modeling and prediction. Environ. Sci. Technol., 37, 4685–4693.
Fine, J., L. Vuilleumier, S. Reynolds, P. Roth, and N. Brown, 2003: Evaluating uncertainties in regional photochemical air quality modeling. Annu. Rev. Environ. Resour., 28, 59–106.
Galin, M. B., 2007: Study of the low-frequency variability of the atmospheric general circulation with the use of time-dependent empirical orthogonal functions. Izv. Atmos. Oceanic Phys., 43, 15–23.
Gego, E., C. Hogrefe, G. Kallos, A. Voudouri, J. S. Irwin, and S. T. Rao, 2005: Examination of model predictions at different horizontal grid resolutions. Environ. Fluid Mech., 5, 63–85.
Gilliam, R. C., C. Hogrefe, and S. T. Rao, 2006: New methods for evaluating meteorological models used in air quality applications. Atmos. Environ., 40, 5073–5086.
Hanna, S. R., and R. Yang, 2001: Evaluations of mesoscale models' simulations of near-surface winds, temperature gradients, and mixing depths. J. Appl. Meteor., 40, 1095–1104.
Hogrefe, C., and Coauthors, 2001: Evaluating the performance of regional-scale photochemical modeling systems: Part I–meteorological predictions. Atmos. Environ., 35, 4159–4174.
Jain, A. K., 2000: Statistical pattern recognition: A review. IEEE Trans. Pattern Anal. Mach. Intell., 22, 4–37.
Jolliffe, I. T., 2002: Principal Component Analysis. 2nd ed. Springer-Verlag, 497 pp.
Li, S. T., and L. Y. Shue, 2004: Data mining and policy making in air pollution management. Expert Syst. Appl., 27, 331–340.
Lorenz, E. N., 1956: Empirical orthogonal functions and statistical weather prediction. Scientific Rep. 1, Statistical Forecasting Project, Massachusetts Institute of Technology Defense Document Center 110268, 49 pp.
Ludwig, F. L., J. Y. Jiang, and J. Chen, 1995: Classification of ozone and weather patterns associated with high ozone concentrations in the San Francisco and Monterey Bay Areas. Atmos. Environ., 29, 2915–2928.
Mueller, S. F., 2009: Model representation of local air quality characteristics. J. Appl. Meteor. Climatol., 48, 945–961.
Rao, S. T., I. G. Zurbenko, R. Neagu, P. S. Porter, J. Y. Ku, and R. F. Henry, 1997: Space and time scales in ambient ozone data. Bull. Amer. Meteor. Soc., 78, 2153–2166.
Rife, D. L., and C. A. Davis, 2005: Verification of temporal variations in mesoscale numerical wind forecasts. Mon. Wea. Rev., 133, 3368–3381.
Rohli, R. V., M. M. Russo, A. J. Vega, and J. B. Cole, 2004: Tropospheric ozone in Louisiana and synoptic circulation. J. Appl. Meteor., 43, 1438–1451.
Russell, A., and R. Dennis, 2000: NARSTO critical review of photochemical models and modeling. Atmos. Environ., 34, 2283–2324.
Seaman, N. L., 2000: Meteorological modeling for air-quality assessments. Atmos. Environ., 34, 2231–2259.
Serrano, A., J. A. Garcia, V. L. Mateos, M. L. Cancillo, and J. Garrido, 1999: Monthly modes of variation of precipitation over the Iberian Peninsula. J. Climate, 12, 2894–2919.
Shumway, R. H., and D. S. Stoffer, 2005: Time Series Analysis and Its Applications. Springer-Verlag, 571 pp.
Steyn, D. G., T. R. Oke, J. E. Hay, and J. L. Knox, 1981: On scales in meteorology and climatology. McGill Climatol. Bull., 30, 1–8.
Swall, J. L., and K. M. Foley, 2009: The impact of spatial correlation and incommensurability on model evaluation. Atmos. Environ., 43, 1204–1217.
Tesche, T. W., 2002: Operational evaluation of the MM5 meteorological model over the continental United States: Protocol for annual and episodic evaluation. U.S. Environmental Protection Agency, Office of Air Quality Planning and Standards, 51 pp.
Tesche, T. W., D. E. McNally, and J. G. Wilkinson, 2004: Evaluation of the 16-20 September 2000 ozone episode for use in 1-hr SIP development in the California Central Valley. California Air Resources Board, 93 pp.
Wilczak, J., J. Bao, S. Michelson, S. Tanrikulu, and S.-T. Soong, 2005: Simulation of an ozone episode during the Central California Ozone Study. Part I: MM5 meteorological model simulations. Proc. 13th Conf. on the Application of Air Pollution Meteorology, Pittsburgh, PA, Air and Waste Management Association, Paper J.2.1.
Wilke, C. K., 2003: Hierarchical models in environmental science. Int. Stat. Rev., 71, 181–199.
Yang, T., K. Matus, S. Paltsev, and J. Reilly, 2005: Economic benefits of air pollution regulation in the USA: An integrated approach. Rep. 113, revised January 2005, Massachusetts Institute of Technology, 29 pp.

