UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Stochastic modeling of space-time processes : an air pollution problem Kalenderski, Stoitchko 2009

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.

Item Metadata

Download

Media
24-ubc_2009_fall_kalenderski_stoitchko.pdf [ 799.02kB ]
Metadata
JSON: 24-1.0052379.json
JSON-LD: 24-1.0052379-ld.json
RDF/XML (Pretty): 24-1.0052379-rdf.xml
RDF/JSON: 24-1.0052379-rdf.json
Turtle: 24-1.0052379-turtle.txt
N-Triples: 24-1.0052379-rdf-ntriples.txt
Original Record: 24-1.0052379-source.json
Full Text
24-1.0052379-fulltext.txt
Citation
24-1.0052379.ris

Full Text

STOCHASTIC MODELING OF SPACE - TIME PROCESSES. AN AIR POLLUTION PROBLEM  by STOITCHKO KALENDERSKI  A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY  in THE FACULTY OF GRADUATE STUDENTS (Atmospheric Sciences)  THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) April 2009  © Stoitchko Kalenderski, 2009  ABSTRACT  This study presents an interdisciplinary approach to an air pollution problem that takes into account the physical bases that govern the processes of interest under the framework of a Bayesian spatiotemporal model. Based on this approach we have developed a physically motivated stochastic model that decomposes the ground-level pollutant concentration field in three components, namely: transport, local production, and largescale mean trend mostly dominated by the emission rates. The model highlighted the importance of simultaneous considering different aspects of an air pollution problem. This approach is novel in the field of environmental spatial statistics and gives a different perspective on the subject. Based on the advection equation we have modeled the transport component. The advection equation implies that two important factors have to be considered when modeling the transport of an air pollutant namely: wind field and the pollutant gradients. The equation also shows that these two factors should be considered concurrently. The local production component ignores contributions from neighboring sites and takes into account only specific local meteorological conditions. This approach reflects the fact that physics and chemistry do not depend on location and time. As a result the local production spatial structure is not modeled directly, but is inherent in the covariate fields. To specify the large-scale mean process, which is mostly dominated by the emissions rates, we used the first eigenfunction of a Principal Component Analysis performed on wind pre-filtered dataset. This new approach allowed us to capture very complex spatial patterns and to avoid using emissions inventory data. The emissions rates reveal the different capacities of a pollutant production thoughout the domain of interest.  ii  The proposed models were applied to a Lower Fraser Valley dataset and were able to quantify the transport and local contributions to ozone fields. The models also capture a number of key features of ozone behavior and show excellent agreement with the observed data and outperformed considerably simpler univariate time series models by a factor of 2 to 3.  iii  TABLE OF CONTENTS  ABSTRACT .................................................................................................. ii TABLE OF CONTENTS ............................................................................iv LIST OF TABLES ..................................................................................... vii LIST OF FIGURES .....................................................................................ix NOTATIONS, ABBREVIATIONS AND CONVENTIONS ....................xi ACKNOWLEDGEMENTS ..................................................................... xiii DEDICATION ............................................................................................ xv 1  INTRODUCTION .............................................................................. 1 1.1  Modeling an air pollution problem .........................................................1  1.2  Research problems, motivation and objectives ......................................3  1.3  2  1.2.1  Research problems ........................................................................3  1.2.2  Motivation ......................................................................................3  1.2.3  Research objectives .......................................................................5  Organization of dissertation .....................................................................5  MODEL DEVELOPMENT .............................................................. 7 2.1  Statistical methods for spatial-temporal data ........................................7 iv  2.1.1  Regression models .........................................................................7  2.1.2  Extreme value approaches ...........................................................8  2.1.3  Space-time models .........................................................................9  2.2  Model requirements and modeling strategy .........................................11  2.3  A hierarchical spatiotemporal dynamic model ....................................12 2.3.1  Data model ...................................................................................12  2.3.2  Process model ..............................................................................13  2.3.2 2.4  3  2.3.2.1  Large-scale spatiotemporal trends ............................14  2.3.2.2  Transport model ..........................................................16  2.3.2.3  Local production model ..............................................20  Parameters model .......................................................................22  Prediction ..................................................................................................23  SIMULATION STUDY ................................................................... 25 3.1  Boundary sites ..........................................................................................25  3.2  The autoregressive process......................................................................28  3.3  Spatial data coverage ...............................................................................29  3.4  Prior information .....................................................................................31  3.3  Results .......................................................................................................32 3.5.1  Parameters inference ...................................................................33  v  3.5.2 3.3  4  Model performance .....................................................................42  Discussion .................................................................................................50  APPLICATION TO GROUND LEVEL OZONE CONCENTRATION DATA ..................................................................................... 52 4.1  The Metro Vancouver data .....................................................................52  4.2  Background chemistry of ozone ............................................................55  4.3  Exploratory Data Analysis ......................................................................57  4.4  Results ......................................................................................................63  4.5  5  4.4.1  Parameter inference ....................................................................63  4.4.2  Model performance .....................................................................67  4.4.3  Model comparison .......................................................................75  Discussion .................................................................................................77  CONCLUSIONS AND FUTURE RESEARCH ............................ 79  REFERENCES ........................................................................................... 82 APPENDIX A .............................................................................................. 89 APPENDIX B ............................................................................................ 100  vi  LIST OF TABLES Summary of the noninformative prior specification ...................................... 31 Summary of the informative prior specification ............................................ 32 Summary of the posterior parameter inference with different number ݊ of available observations based on model ‫݈݁݀݋ܯ‬ଵ and noninformative prior (Table 3.1) ............................................................................ 34 Table 3.4 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ଵ using different number ݊ of available observations and informative prior (Table 3.1) ............................................................................................. 35 Table 3.5 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ଶ , using different number ݊ of available observations and noninformative prior (Table 3.1) ...................................................................................... 38 Table 3.6 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ଶ , using different number ݊ of available observations and informative prior (Table 3.2) ............................................................................................. 38 Table 3.7 SD statistics for spatial varying autoregressive process ߩ ............................. 38 Table 3.8 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ଷ , using different number ݊ of available observations and noninformative prior (Table 3.1) ...................................................................................... 39 Table 3.9 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ଷ , using different number ݊ of available observations and informative prior (Table 3.2) ............................................................................................. 40 Table 3.10 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ସ , using different number ݊ of available observations and noninformative prior (Table 3.1) ...................................................................................... 40 Table 3.11 Summary of posterior parameter inference base on model ‫݈݁݀݋ܯ‬ସ , using different number ݊ of available observations and informative prior (Table 3.2) ............................................................................................. 41 Table 3.12 Summary of the models performance and relative importance of the transport effect. * denotes that informative prior was used for calculations ............................................................................................................. 43 Table 4.1 Summary of posterior parameter inference ................................................... 64 Table 4.2 Estimation of boundary parameter................................................................. 64 Table 4.3 Summary of the overall models performance and the relative importance of the transport effect ........................................................................... 67 Table 4.4 Comparison of model performance [ppb]...................................................... 76 Table B.1 Ozone summaries for three years studied ...................................................... 99 Table B.2 Meteorological data summaries for three years studied ................................ 99 Table B.3 Model ‫݈݁݀݋ܯ‬ଵ parameter and performance statistics summaries for three years studied ......................................................................................... 100 Table 3.1 Table 3.2 Table 3.3  vii  Table B.4 Table B.5 Table B.6  Model ‫݈݁݀݋ܯ‬ଶ parameter and performance statistics summaries for three years studied ......................................................................................... 101 Model ‫݈݁݀݋ܯ‬ଷ parameter and performance statistics summaries for three years studied ......................................................................................... 102 Model ‫݈݁݀݋ܯ‬ସ parameter and performance statistics summaries for three years studied ......................................................................................... 103  viii  LIST OF FIGURES Figure 3.1 Selected grid cells for the four datasets. Cells are counted from the left bottom corner .......................................................................................... Figure 3.2 Sequences of 5000 iterations from the posterior distributions of the model parameters and the histograms of the last 2000 iterations................................................................................................................ Figure 3.3 True and interpolated surfaces of based on informative prior and 32. Shading interval is uniform ............................................................. Figure 3.4 Left: sequences of 5000 iterations from the posterior distributions of the randomly selected grid cells without observations O(9,20) and with observations O(25,20) ; Center: histogram of the last 2000 iterations; Right: autocorrelation function ..................................................... Figure 3.5 Posterior mean time series for grid cells without observations 9,· and for grid cells with observation 25,· ........................................ Figure 3.6 Model-estimated mean field of the true pollutant concentration (grey scale), and observed values (numbers). * - denotes the grid cells used for fitting the model ..................................................................... Figure 3.7 Contour plots of pollutant variance field. * - denotes the grid cells used for fitting the model ............................................................................... Figure 3.8 Model estimated mean transport contributions map. . The observed wind field is superimposed on the plots ........................................................ Figure 3.9 Scatter plot for grid cell with observations 25,· and for grid cell without observation 2,· ............................................................................ Figure 4.1 Interpolated 8-hour average wind field for 12 May 2004 .............................. Figure 4.2 Locations of ozone observation sites “*” and prediction grid “+”. Numbers are station identifiers ...................................................................... Figure 4.3 Boxplot of the domain daily maximum 8-hour ozone levels by month ............................................................................................................. Figure 4.4 Time series of observed daily maximum 8-hour zone concentrations at 20 monitoring sites from April 15th to September 30th ............................ Figure 4.5 The first EOF modes of ozone data ............................................................... Figure 4.6 Normal Q-Q plot of residuals on original scale (top panels), square root scale (middle panels), and logarithmic scale (bottom panels)................ Figure 4.7 Partial autocorrelation functions of the first nine time series ....................... Figure 4.8 Estimated surfaces of autoregressive process · for models and ........................................................................................ Figure 4.9 Sequences of 5000 iterations from the posterior distributions of the model parameters and the histograms of the last 2000 iterations................................................................................................................  30  36 39  44 45  46 47 48 49 53 54 57 58 60 62 62 64  66  ix  Figure 4.10 Left: sequences of 5000 iterations from the posterior distributions of the randomly selected grid cells without observations O(125,20) and with observations O(71,20) ; Center: histogram of the last 2000 iterations; Right: autocorrelation function ..................................................... Figure 4.11 Time series for cells (75) with observations (station 30)............................... Figure 4.12 Model-estimated mean field of the true pollutant concentration as residuals. The observed values are super imposed on the plots .................... Figure 4.13 Contour plots of pollutant variance field. * - denotes the monitoring sites locations ................................................................................................ Figure 4.14 Model estimated mean transport contributions map. The observed wind field is superimposed on the plots ........................................................ Figure 4.15 Scatterplot of observed (x-axis) and modeled (y-axis) concentrations for models , , , and on the scale of residuals ............................................................................................  68 69 71 72 73  74  x  NOTATIONS, ABBREVIATIONS AND CONVENTIONS  Whenever possible, we follow the widely use conventions throughout the thesis. A random variable or a matrix are denoted by upper-case Roman letter e.g. X. Bold upper-case Roman letter are used for random vectors e.g. . Components of a vector  are denoted  by (X1, X2 …) whereas components of sequence will tend to be denoted by x(1), x(2),…. . The vector  ,…,  ,  ,…  is the vector  with its ith component  removed, i = 1, … , n. Greek letters are used to denote parameters and parameter vectors e.g. , . For a random vector X, [X] represent joint probability density function and [X|Y] represent conditional density of X given the vector Y.  ~  distributed as almost equal to proportional to Kronecker product Identity matrix vector of ones vector of zeros diagonal matrix with elements of the vector ,  on the main diagonal  univariate normal distribution with mean , and variance ,  p-variate Multivariate Normal distribution with mean vector  and covaxi  riance matrix . process , MRF  a physical variable of interest univariate inverse gamma distribution with shape  and rate  Markov random field Estimated values Prior values  xii  ACKNOWLEDGEMENTS  First of all, I would like to thank my two research supervisors, Dr. D.G. Steyn and Dr. J.V. Zidek for their encouragement, guidance and intellectual support. Their insightful suggestions and comments greatly helped me to clarify my thoughts and extend the research in many aspects. I especially thank Dr. Steyn for helping on several times with revisions of this thesis and helping me to master basic writing skills in English. I am grateful to him for giving me the opportunity to participate in Greenapple research project, and for funding my research from his grants. I am also grateful to Dr. Zidek for his kindness and for giving me the opportunity to participate in the PIMS summer school, University of Washington July 9-13, 2007. I am pleased to thank the other two members of my committee, Dr. P. Austin and Dr. S. Galmarini for their help and encouragements on each stage of my research and especially for their constructive comments on the final draft of this thesis. Their help and support is greatly appreciated. Additionally, I gratefully acknowledge the MatLab-code provided by Dr. N.J. McMillan from the Battelle Memorial Institute, USA, which helped me to organize the implementation of the Markov Chain Monte Carlo algorithm used in this study. I am grateful to the Metro Vancouver Air Quality Division for providing me (through an agreement with my research supervisor, and assisted by Al Percival) their entire air pollutant and meteorological data set covering the 24 year period from 1982 to 2006. Without these data, my research would not have been possible. I also wish to acknowledge my friends and fellow students, who have accompanied me throughout these four years at the university. I particularly want to thank Dr.  xiii  Bruce Ainslie and Dr. Christian Reuter for their help, support and the many meaningful discussions we have during these pas years. Finally, I would like to thank my family who always supported me unconditionally. Their encouragement and love gave me strength and confidence which helped me to overcome many obstacles and to get the job done.  xiv  DEDICATION  For my father, who would be so proud.  xv  CHAPTER 1 INTRODUCTION  In studies of air pollution, a central question is the spatial prediction and forecasting of pollutant concentration. We need to know pollutant concentrations because in general the severity of the effect depends on the concentration level of the pollutant. Because direct measurements are costly and difficult, modeling is an often-used approach to determining concentration levels. There are several approaches for modeling an air pollution problem. In the following section a brief review of these approaches is given.  1.1  Modeling an air pollution problem  Air pollution models are generally categorized into two types depending on the methods used (Berliner, 2003), namely: deterministic (such as transport dispersion and photochemical), and statistical (such as regression and kriging). Deterministic air pollution models are based on physical and/or chemical relations between the pollutant concentration and other factors that characterize the atmosphere. While these models are fully deterministic in the sense that they solve the equations of physics and chemistry (primarily the Navier-Stokes and continuity equations), they also contain elements that are implicitly statistical since the deterministic relations used in these models are spatially and temporally filtered so that they apply to grid-averaged quantities. We only want to know the average state of the atmosphere. It is neither feasible nor desirable to consider in detail all of the small scale motions (turbulence) that are  1  present in the atmosphere (Randall, 2006). The averaged deterministic equations are mathematically unclosed without further approximations, which is why the deterministic models also explicitly use statistical methods, for example in the way we model turbulence and diffusion at subfilter scales. Fully deterministic approaches involve a large number of equations and parameters. By employing a statistical approach we are able to greatly reduce the size and complexity of the problem. What we lose in statistical models is the ability to resolve fields in time and space at scales bellow those defined by the measurement grid. What we gain in deterministic models is the ability to examine physical and chemical processes in detail. Mixed deterministic-statistical approaches retain advantages of both deterministic and statistical approaches. It should also be noted that deterministic models are initialized by emissions and meteorological data and can be used to predict concentrations at selected locations on a computational grid. The computational grid can be either Eulerian or Lagrangian but a hybrid approach is also possible. Additionally, the implementation of the chemistry in these models introduces significant uncertainty because of deeply parameterized chemistry. Jimenez et al. (2003) provide brief descriptions of the main chemical mechanisms used in air quality models. A few examples of combined meteorological and chemical models are: MAQSIP – Multiscale Air Quality Simulation Platform (Odman and Ingram, 1996); CMAQ – Community Multiscale Mode for Air Quality (Byun et al., 1998); EMEP – European Monitoring and Evaluation Program (Simpson, 1992); EURAD – European Air Dispersion Model (Hass, 1991). The list given here is far from exhaustive: see Russell and Dennis, 2000; Holmes and Morawska, 2006, for a review. In a situation where scientific understanding of physical and chemical processes is incomplete, statistical models are often more appropriate than deterministic models. The statistical models are not as detailed as the deterministic models but can be more efficient and are still able to address very important questions. Furthermore, statistical approaches  2  have one important advantage over (deterministic) dynamic forecasting methods: their capacity to produce probabilistic forecasts and explicitly account for uncertainty in general. Its ability to evaluate uncertainty, which is inevitable in complex systems, is an extra desirable characteristic of such model. A concise overview of statistical methods relevant to environmental problems is given in the next chapter. In my study I intend to adopt a middle-ground modeling approach, one that is intermediate between purely deterministic and purely stochastic. The approach I will take is essentially stochastic but is able to easily incorporate physical and chemical reasoning into the statistical analysis. The spectrum of incorporation of physics could range from a qualitative use of physical arguments to strong reliance on numerical models. While this approach is not new, the review of Berliner (2003) shows that initial applications have been quite preliminary. The present work is a substantial elaboration and extension of the approach.  1.2  Research problems, motivation and objectives  1.2.1  Research problems  In addition to the evaluation of concentration levels, another central question of air pollution, which has not been considered in the context of space-time statistical modeling is the quantification of the relative importance of regional transport and local production of an air pollutant. In my dissertation I want to address this question simultaneously with the other two major problems of air pollution: predicting pollutant levels and estimating the associated uncertainty. Obtaining a quantitative estimate of the transport and local contributions together with the prediction of the concentration levels and evaluating un-  3  certainty will be the main objectives of a stochastic space-time model I will develop below.  1.2.2  Motivation  In many studies, the quantitative estimate of the relative importance of pollutant transport and local production has been of great help in finding solutions to an air pollution problem. Such studies often lead to practical methods for solving air pollution problems once scientific understanding of the nature of the problem has been achieved. Sometimes intuitive approaches can be very misleading if, for example, the analysis is conducted at a large scale while there are smaller scale effects (e.g. of topography or small water bodies) which are important for local weather systems but are not considered in the analysis. Alternatively, local solutions are often put in place while the problem starts thousands of kilometers away. The quantitative estimate of a pollutant transport and local production is crucial for effective control strategies. For example, a small local contribution relative to the transport suggests that risk management plan based on local emission control policies will not likely be effective. I believe that a large part of the reason why atmospheric scientists have not investigated these problems in the context of the spatiotemporal modeling approach is their unfamiliarity with the capabilities this approach provides. Statistical analysis that accounts for both spatial and temporal dependence has seldom been addressed in the literature (Le et al, 2001; Guttorp et al, 1994; Cressie and Huang, 1999). However, these studies usually do not incorporate meteorological variables like wind speed and direction, temperature, solar radiation and pressure into the spatio-temporal covariance structure (Huang and Hsu, 2004). In addition, statistical approaches to this problem do not require preceding knowledge about the emission rates of anthropogenic and natural precursors for the pollutant, which are necessary if we try to address the problem using deterministic models. Requir-  4  ing such information could be problematic. Emission rates are not always available; it is generally very expensive to collect and update information about them; and finally they are not constant and are subject to change in time and space (Yin et al., 2004). An important motivation for the present investigation is the fact that uncertainty, which is an intrinsic element of any modeling approaches and atmospheric system is seldom addressed. Quantification of uncertainty is needed to mitigate the risk or our ignorance and to evaluate the degree of reliability of a forecast. Since the atmosphere is a dynamically chaotic system (Lorenz, 1963) and always incompletely observed, uncertainty is a central issue here at each level of the analysis. Finally, sometimes it is necessary to predict pollutant concentration levels at locations where there are no monitoring stations. This is because very often locations of measuring sites are selected based on criteria that have very little to do with science (see Wikle for relevant discussion, 1996) in the first place and second because of the effects of constantly increasing urbanization on very different aspects of the ecological system. The class of statistical models we are considering provides estimates of the pollutant levels throughout the domain of interest at each particular point in time.  1.2.3    Research objectives  Developing a non-stationary space-time model for forecasting ground-levels concentrations of an air pollutant at local to regional scales.    Quantitative assessment of transport and local production of an air pollutant.    Understanding and estimations of inherent uncertainty as well as model uncertainty and ultimately uncertainty in model predictions.  5  1.3  Organization of dissertation  This dissertation is composed of five chapters. Chapter 2 is the most theoretical part of the dissertation. It summarizes the statistical methods used for modeling spatiotemporal data, specifies the model requirements and modeling strategy, and formulates a general hierarchical model for describing environmental datasets. In Chapter 3 we analyze a simulated dataset and investigate how parameter inference, model performance, and transport effect depend on: the way we treat the domain boundary sites; whether we allow the time autoregressive process to vary spatially or not; spatial data availability; and prior information. Chapter 4 shows how to apply the proposed general model to ground-level ozone data. This chapter also provides background chemistry for ozone and extended exploratory data analysis (EAD). A general discussion and future issues to explore are given in Chapter 5.  6  CHAPTER 2 MODEL DEVELOPMENT  2.1  Statistical Methods for spatial-temporal data  This section is a brief overview of the statistical methods commonly used in many air pollution studies. We need statistical methods generally to acknowledge our ignorance (and possibly uncertainty) and draw conclusions regarding underling physical and sampling processes (see Le and Zidek p.53, 2006, for more reasoning and examples). In general, these methods can be grouped in three broad categories: regression modeling, extreme value approaches, and space-time models (Thompson et al., 2001). Within each category there are further distinctions depending for example on: the way meteorological data are introduced (directly or via dimension reduction), whether parametric or non parametric techniques are used, and the way space-time dynamics are incorporated.  2.1.1  Regression models  Regression based models are the most frequently used statistical models in air pollution literature, with widely varying degrees of complexity. These models, in turn, can be further, grouped in three major classes: linear regression, regression tree, and non-linear regression. Linear regression models are the simplest form of regression models but have limited flexibility to capture interactions and non-linearities between meteorological and pollutant quantities. Linear models have been used for forecasting surface ozone concen-  7  tration as well as other pollutants by: Gardner and Dorling (2000), Robeson and Steyn (1989), and McKendry (2002). Classification and Regression Trees (Breiman et al., 1984) are able to represent non-linear relationship between meteorological variables and pollutant concentrations. Regression trees have been used to model daily maximum ozone concentrations (Burrows et al., 1995). However, the trend component cannot be captured directly in such models. Finally, non-linear regression can be appropriate when a non-linear relationship is required by the nature of the physical problem. Cobourn and Hubbard (1999) develop a nonlinear regression model for predicting daily maximum 1-h ozone concentrations. The terms in the model were developed from exploratory analysis using graphical techniques and nonparametric regression to discover the salient relationships involving ozone and many candidate predictor variables.  1.2.2  Extreme value approaches  Due to the inherent averaging in, regression models, they are often poor predictors of extreme values (NRC, 1991). An alternative approach for modeling rare events is to use extreme value theory (Gumbel, 1953). Extreme value approaches are particularly useful in the context of modeling threshold exceedances. One advantage of the extreme value approach is that one can easily compute a forecast probability of exceedance for a day, given its (forecasted) meteorology and current-day maximum ozone. This enables local governments to release ozone alert forecasts (Thompson et al., 2001). Tobías and Scotto (2005) were able to model ozone exceedances of the thresholds established by the air quality guidelines and obtain n year return level (i.e. the level which is exceeded on average once in n years), using a Generalized Pareto Distribution (GPD). This model was carried out by considering a single long time-series without relating ozone concentration to meteorological condition. Cox and Chu (1996) developed an extreme value model that  8  takes into account ozone-meteorology relationship. The proposed model is based on Weibull hazard model applied to an ozone value.  2.1.3  Space-time models  In atmospheric sciences, the most frequently used methods for analyzing space-time processes are the “eigentechniques” such as Empirical Orthogonal Function/Principal Component Analysis (EOF/PCA: Lorenz, 1956), Principle Oscillation Pattern (POP), Singular Spectrum Analysis (SSA), and Canonical Correlation Analysis (CCA). See von Storch and Zwiers (1999) for an overview of these methods. Recent developments in neural network modeling have further led to the nonlinear generalization of these classical methods NLPCA, NLCCA and NLSSA (Hsieh, 2004). These dimension-reduction linear techniques are able to extract dominant patterns of variability from a stochastic process. However, when applying these methods, there are several concerns that deserve particular attention. First, as with any data reduction technique they focus but also filter, and the filtering is what makes this approach incomplete at best and misleading at worst. In some cases the reduced (filtered) dataset may not contain important patterns for the analysis which originally were presented in the data and so leading to wrong conclusions. Second, it is very important that observations on an irregular grid are representative of the domain of interest (Karl et at., 1982). Third, singular value decomposition methods (“eigentechniques”) can be used for predictive purposes (Fuentes et al., 2006). Another approach, still within the framework of space-time models, which focus on the prediction in space and time is dynamical space-time models. These models are built on the extension of established spatial (Cressie, 1993; Wikle and Cressie, 1999) or temporal (Box and Jenkins, 1976) methods such as: kriging and autoregressive models. The dynamical space-time models treat jointly space and time variations and typically decompose the random field of interest into a mean component modeling the trend 9  (smooth variability) and a residual component modeling the higher frequency fluctuations around that trend. An alternative approach is to decompose the random field into collection of spatially correlated time series (i.e. a multivariate time series model, Bennet, 1979) or collection of temporally correlated space patterns (Simard and Marcotte, 1993). These two approaches can be considered as a two-stage method where the results from the conditional analyses in the first stage are combined in the second stage. This approach is appealing because of its simplicity but do not takes into account the interaction between the spatial and temporal component. The dynamics (causality) in a space-time model is incorporated directly or indirectly. The direct modeling of causality requires specifying a valid non-separable covariance function. There are several methods for constructing non-separable covariance function for spatio-temporal data some of which are in time domain (Gneiting, 2002) and the other in spectral domain (Cressie and Huang, 1999; Fuentes et al., 2008). See Schabenberger and Gotway p.435, 2005, for an overview. The indirect modeling of causality requires specifying an autoregressive model, that is, specifying lagged relationships in time and space. In the time series context (Box and Jenkins, 1976) ARIMA models are very popular. The autoregressive models are also widespread in the spatial data context, popular approaches include Simultaneous Autoregressive (SAR) Models (Whittle, 1954) and Conditional Autoregressive (CAR) models (Besag, 1974). The above autoregressive models can be generalize for space-time data to space-time autoregressive-movingaverage (STARMA) models (Cressie p.449, 1993). The most widely known and used subclass of dynamic models are the Dynamic Linear Models (DLM), or state-space models (West and Harrison, 1997). A DLM framework is a common approach used in environmental problems, which allows updating uncertainties about model parameters as observations are made. The Kalman filter and state-space smoothing algorithms are used to carry out the computations (Carter and Kohn, 1994). The DLM can be seen as a generalization of the regression models that al-  10  lows time varying coefficients. Recently, the DLM has been applied successfully to multidimensional (space) time series including: Huerta et al., 2004; Stroud et al., 2001. Recently, several dynamic spatiotemporal models have been reported in literature with application to air pollution problems: Sahu et al., 2007; McMillan et al., 2005; Huang and Hsu, 2004; Huerta et al., 2004; Wikle and Cressie, 1999.  2.2  Model requirements and modeling strategy  To incorporate the physical nature of environmental processes and to meet the objectives specified in Chapter 1, we require development of statistical methods that take into account:   Dynamical nature of the modeled processes    Interaction between the spatial and temporal components    Dependence on other physical processes    Available deterministic relationships    Uncertainties  These complexities also require a sophisticated modeling strategy. The Bayesian hierarchical modeling paradigm provides a natural framework for achieving these goals (Berliner, 2003). The Bayesian Hierarchical Modeling (BHM) approach has several advantages which make it very attractive. First, it allows separation of statistical modeling of observations from stochastic modeling of the underlying physical processes. Second, since the outputs of a Bayesian model are probability distributions, the approach focuses on uncertainty analysis. Third, prior information concerning previous experience or scientific understanding can be easily incorporated into the analysis. The essence of BHM is con-  11  structing a high-dimensional joint probability distribution for all unknowns as the product of comparatively simple conditional probability models. Developing BHM usually goes through organizing variables in three levels (Wikle et. al, 1998): Data model: distribution of observations, conditional on process and model parameters. Process model: distribution of process of interest, conditional on parameters. Parameter model: prior distribution of all model parameters. Then BHM takes the following form: |  , |  ,  |  x  x  where the bracket notation [.] represents a probability distribution.  2.3  A hierarchical spatiotemporal dynamic model  2.3.1 Data model  At this first stage of the hierarchy we usually assume that the data are conditionally independent which leads to relatively easy construction of the likelihood function and simplified calculations. The general model considered here is for spatiotemporal data recorded at n locations ,  1, … , , in a gridded spatial domain  equally spaced time points. Let observation vector at time point ,  ,  ,…,  ,  over a period of T ′ denote the n-dimensional  1, … , . Then given the true process of interest, the  data model takes the form: (2.1)  12  ,  where  ,…,  ,  ′ is an unobserved true process whose components  represent the average pollutant concentration for each particular grid cell. ber of grid cells in the domain and  is a  is the num-  mapping matrix that maps averaged pol-  lutant concentration throughout the domain to the observed concentration levels. Thus, the  matrix elements are zeros and ones and a non-zero element maps the ith monitoring  site to the jth grid cell (Wikle, 1998). The  is a white noise process that represents sub-  grid-scale variations and measurement error (nugget effect). Thus we assume that the components of  are independent and identically distributed normal random variables  with zero mean and unknown variance, that is,  ~  0,  . Additionally, the model  assumes that the mean value of all observations within a grid cell is equal to that grid cell average concentration level represented by the true process. The above assumptions also signify that the spatiotemporal interactions in the data are presented by the true process, .  2.3.2  Process model  ,  A natural approach for modeling  is by using scientific arguments. Using physical  reasoning the pollutant concentration field is further decomposed into three components namely: large-scale spatiotemporal trends, local production and transport (advection). That is: ,  where  ,  ,  ,  ,  ,  (2.2)  represent the unexplained variability at this stage of the hierarchical model.  In addition, it is known that meteorological variables are significant predictors of a pollutant concentration (Burrows et. al, 1994; McKendry, 2002). Thus, grouping meteorology in two categories of variables related to the above decomposition will enable us 13  to quantify the significance of these unobservable processes. The first category (  ) rele-  vant to the pollutant local production may include for example: surface temperature, solar radiation, and humidity. The second category (  ) relevant to the transport may include:  wind speed and wind direction. Other important meteorological predictors are those related to airmass temperature and convective stability (Burrows et. al, 1995). However, these data are not easily available and for example require coupling the pollution model to a weather prediction system. The proposed decomposition of the pollutant field identifies the mechanisms (sources of heterogeneity) that contribute to a non-stationary covariance structure. This approach allows development of physically motivated submodels for each of the components conditioned on different categories of meteorological variables.  2.3.2.1  Large-scale spatiotemporal trends  I propose a deterministic additive model (multiplicative on the original scale if  ,  are  on the logarithmic scale) to capture the data mean structure defined by the equation ,  (2.3)  in vector form x  x  (2.4)  The additive form of the model is physically justifiable since the physical phenomena behind the large-scale temporal and spatial trends are independent. The large scale temporal variability,  , mostly dominated by the seasonal and diurnal cycles is caused  by the planetary motion. On the other hand the large-scale spatial pattern,  , is mostly  dominated by the emissions rates of the pollutants or their precursors throughout the do-  14  main of interest. Heterogeneous emissions rates are one of the mechanisms that induce heterogeneity in the pollutant process and can explain different capacities of sub regions to produce pollutant concentration all other conditions being unchanged. We assume that ,  the emissions are relatively constant in time (i.e.  ), which is in contrast to  the other two types of heterogeneity modeled by transport and local production components of the model. How we model the large-scale temporal trend,  , depends on what kind of data  we have at hand. There are several approaches to model  including: Box-Jenkins  models (Box and Jenkins, 1976); linear combination of different basic functions such as: cubic spline (Huang and Hsu, 2004); sine and cosine components (Huerta et al. 2004); indicator functions. However, using the difference operator in a Box-Jenkins model for removing seasonality and large scale trends can be dangerous. For complex signals where different scales and types/interactions of signals are presented, differencing could deform the remaining signal in unpredictable ways and lead to wrong interpretation and conclusions. For example, to remove the linear trend in a non stationary random time series one may decide to use first differencing, since 1 but if the signal is more complex such as rencing leads to  2 cos  sin  , sin  ⁄  . The first diffe-  , a residual signal that has differ-  ent phase, amplitude (smaller as k is large) and period than the one of the original signal. The hourly mean data are the most difficult to model since they contain several deterministic signals such as: annual trends, diurnal and seasonal cycles and unknown short term fluctuations. In Chapter 4 I give an example of how the large-scale temporal variability,  , for the Metro Vancouver dataset can be modeled.  The large-scale spatial pattern due to varying emission rates can have a very complex structure. That structure sometimes cannot be captured easily by traditional methods 15  such as polynomial function of latitude and longitude or Markov Random Fields (MRF) (Wikle et al. 1998; Royle and Berliner, 1999). Alternatively, the emissions structure can be estimated from the data by the first spatial pattern of a PCA. Calculating the spatial patterns directly from observations will include heterogeneity induced by the transport (wind). This problem can be relatively easily eliminated by selecting only observations that correspond to very light winds (e.g.  3  / ) into the analysis.  The important point here is that we take the mean field as a measure of the emissions rate structure and we use the first  mode in our analysis since this eigenfunc-  tion resembles the mean surface. The mean field is obtained by averaging out the different meteorological conditions on which the pollutant concentration levels depend. As a result averaging eliminates non-stationary mechanisms which are responsible for local production and what remains is a spatial pattern caused by the varying emissions rates. One can use the mean field to estimate the large-scale emissions structure as well but we think that it is safer to use  which allow us to not ignore some other potential sources  of non-stationarity. We should also notice that the emission rates alternatively can be obtained from emissions inventory data. The view outlined above leads us to the following model for the mean spatial structure (2.5)  where  is the proportionality parameter that should be estimated from the data.  2.3.2.2 Transport model  Again, we can use physical arguments to build a statistical model to evaluate the regional transport of an air pollutant. Specifically, since the mean wind flow is responsible for the redistribution of the pollutant by advection, we can use the advection equation, which 16  relates the local changes of a pollutant concentrations levels to the mean wind flow. That is: , ;  , ;  U x,y;t  , ;  U x,y;t  (2.6)  where u,v represent the east-west and north-south components respectively of the mean wind flow and  ,  are the corresponding directions.  Following an approach originating in the study of dynamic systems modeling with random perturbations (see for example Pontriagin et al. (Понтрягин и др.),1933; Christakos, 1991) we can easily approximate (2.6) using finite difference methods. For example, we can construct an explicit scheme for this equation; using first central difference approximation for the first space derivative and first forward difference for the time derivative (Kalnay, 2003) and we get , ,  , ,  1  u  , ,  1  v  , ,  1  ,  1, ,  1  1 ,  1  ,  1, ,  1  1 ,  1  (2.7)  Alternatively, if we use a fully implicit (backward) time scheme we get , ,  , ,  1  u v  where  ,  , , , ,  1, , ,  1 ,  1, , ,  1 ,  (2.8)  represent the grid cell coordinates. The first approximation (2.7) suggests  lagged spatial-time interaction on the other hand, the second one approximation (2.8) suggests simultaneous spatial-time interaction. Similarly, both (2.7) and (2.8) suggest an autoregressive term. In the context of the advection equation (2.6) the autocorrelation can  17  be seen as non-wind transport (persistence) from the previous day. The time autocorrelation which naturally arises from (2.6) has been employed in modeling of numerous air pollution problems: Sahu et al., 2007; McMillan et al., 2005 etc. The explicit scheme (2.7) suggests the following stochastic transport model that can be used to explain much of the spatiotemporal structure of the  , ,  , ,  1  u  , ,  1  v  , ,  1  1, ,  1  1 ,  1  ,  ,  1, , ,  1 ,  process  1 1 ,  (2.9)  2, … ,  where spatial autoregressive parameters  ,  depend on the lattice specification,  have no dimensions, and are characterized by the spatial pollutant dispersion in east-west and north-south directions of the region. For a small flat region it is reasonable to assume that ’s have no spatial structure.  - denotes the time autoregressive parameter.  The vector form of the model (9) is given by ,  where H is a four, off diagonal  ;  matrix of regression coefficients, parameterized to  allow the first-order lagged nearest-neighbor structure described above, ters vector, and at  ,  (2.10)  are  is the parame-  -vectors of the u and v components of the wind field  grid locations. The fully implicit scheme (2.8) suggests the following stochastic transport model , ,  , ,  1  (2.11)  18  u  , ,  v  , ,  1, , ,  1, ,  1 ,  , ,  ,  2, … , .  The vector form of model (11) is given by ,  ;  (2.12)  or using (2.2), we obtain (2.13)  where at  and  are  -vectors of the local production component and the mean process  grid locations. Again, H is a four, off diagonal  matrix of regression coeffi-  cients, but this time parameterized to allow the first-order simultaneous nearest-neighbor structure described above. Which model specification is preferable depends on the data at hand. For hourly data the lagged model (2.10) is more appropriate. However, for daily maximum 8-hour average data the simultaneous model (2.12) is preferable. Given that we have the wind field, it appears to be more realistic to associate today’s transport effect at given location with today’s values at its neighboring locations than yesterday’s values. A possible justification for such an assumption would be that the time scale of daily maximum 8-hour ozone levels is hours (as is the ground-level horizontal ozone transport effect) but not days.  19  2.3.2.3 Local production model  In this section we will consider a local production model for pollutants whose generating mechanism depends mostly on the meteorology. We would not need such a model if we were dealing with a primary air pollutant. In this case our focus will be the emissions model. However, if we model a secondary air pollutant such as ozone we will need to develop a model conditional on the local meteorology. What are reasonable requirements the local production model should meet? At the first place the model should ignore dynamic contributions from the neighboring sites and take into account only specific local meteorological conditions, that is, we should allow covariates to be functions of location, and regression coefficients should not depend on location. On one hand the spatially dependent regression coefficients will account for part of the spatial heterogeneity induced by the transport processes. In this way, our local production model will explain part of the pollutant transport and will therefore not be a fully local term. We want this kind of heterogeneity to be modeled only by the transport model and only there. On the other hand this will put physical/chemical meaning in our regression coefficients, which is a desirable feature of a local production model since the physics and chemistry do not depend on the location. Furthermore, if we decide to use regionally averaged covariate fields it will mean that we treat the response process (pollutant concentration) differently from its covariates (different meteorological fields). This is a scale problem since the averaged covariates are surrogates at a coarser scale of the responses that occur at a finer scale. An important feature of this approach is that the local production spatial structure is not modeled directly, but inherent in the covariates field. This requirement is also reasonable since local production is strongly dependent on the meteorological fields. Taking into account the above reasoning, we propose the following regression model for modeling the local production:  20  ,  ,  (2.14)  or in vector form (2.15)  where  is  x design matrix of the exploratory variables at time point , and  is a -  vector of random regression parameters. Physically, these parameters signify the relative importance of the different meteorological fields that are relevant to local pollutant production. In particular, in Chapter 4 we will use the following meteorological fields to estimate ozone local production: temperature, relative humidity and solar radiation. For simplicity the proposed model is linear even if the local production processes are not linear. However, it has been shown in several studies (Gardner and Dorling, 2000) that linear regression models produce very similar results to those of non linear models. Substituting, the latent processes  and  in (2.2) by (2.10) and (2.15) we ob-  tain the following process model: ,  ;  2, … ,  where  ,  (2.16)  is a collection of independent unstructured error terms specifically assumed to  follow: ~  0,  The autoregressive model (2.16) requires an initial condition for  (2.17)  to be specified. We  assume the following model:  21  (2.18)  We should also take care in dealing with points on the boundary of the region. In such cases there will not be observations on all sides of a given point. In Chapter 3 we will consider two examples about how we can deal with this problem. Finally, because identifiability of the variance components of  in (2.2) the cova-  riance structure of the proposed submodels in this section is free of a nugget effect, that is, we do not introduce error terms in the equations (2.10) and (2.15), otherwise it is assumed that the overall nugget effect can be decomposed in the same manner as the true process,  , in known proportions.  2.3.3 Parameters model  To complete the hierarchical model we need to specify the prior distribution of the model parameters such as: variance parameters describing measurement error and uncertainty in each level of the hierarchy, meteorological regression parameters, and spatial autoregressive parameters, that is,  ,  , , ,  . In general, we will assume an independent pa-  rameters model i.e. , , ,  ,  .  (2.19)  the selected priors should lead to meaningful proper posteriors and allow us to incorporate relevant physics into the model. We specify the following conjugate priors: ~  ,  , ~  ~ ,  ,  ,  , ~  ~ ,  .  ,  (2.20)  22  The final stage of the hierarchy requires numerical values to be specified for all unknowns ,  2.4  , ,  , ,  ,  ,  ,  ,  .  Prediction  ,…,  Let  , so that, the  and  over the study period. We define pollutant levels and  represents the unobserved true process of interest in a similar fashion, where  denotes observed  denotes the observed meteorological fields. The joint posterior  distribution of the all unknowns in the hierarchical model specified by (2.1), (2.16) and (2.19) is given by: , , , ,  ,  , , |  |  ,  , ,  , , , , ,  ,  ,  | , | ,  , ,  (2.21)  .  The full conditional distribution of the unknown variates can now be obtained by collecting the terms of (2.20) that correspond to the variate of interest (Gilks et al., 1996). We also use the joint posterior distribution (2.21) to make k-step ahead predictions. Let  denotes the collection of all unknowns, such that  , , , ,  ,  . The  one-step-ahead prediction distribution is given by  | , ,  |  | , ,  (2.22)  23  |  where the likelihood term  1 is given. Using the squared error loss the optimal one-step-ahead  the time moment forecast is given by from  is given by (2.16) assuming that the meteorology at  |  | ,  . We approximate  | ,  by drawing samples  and forming the sample average (Banerjee, 2004).  24  CHAPTER 3 SIMULATION STUDY  In this chapter, we examine the performance of the model presented in Chapter 2 using different scenarios while testing the model code implementation. Of particular interest is how the parameter inference and spatial interpolation depend on: the way we complete the gradients in (2.6) on the boundary of the region; whether we allow the time autoregressive process to vary spatially; available data; prior information.  3.1 Boundary sites  Let us consider the following first-order neighbor structure which arises from the gradients approximation in (2.7) or (2.8)  .  Apparently, if the point  is on the boundary of the domain of interest one or two of its  neighbors will be missing. This requires special treatment of the boundary sites. In this simulation study we consider two different methods for dealing with this problem. In the first approach, we use only available observations and complete the gradients by using only the data from the central point  . For example, if  is located at  the left bottom corner of the domain there will be two missing neighbor sites one to the west and one to the south. In this case the East-West gradient will given approximate by 25  ,·  ,· and the North-South gradient will given approximate by  ,·  ,· , Royle et al., 1998. In the second approach, following McMillan (2005) we can extend the domain by introducing boundary sites and their associated processes. That is, we consider a collection of four boundary processes of the domain. The  ·  for each time point and for each side  1, … ,4  are x1 vectors whose elements correspond to the sites needed to stochastically which allows us to ac-  complete the boundary gradients. We model the  count for uncertainty and dependence between the neighboring boundary sites; see Wikle et al. (2003) for general discussion. In this case, the process model given by (2.16) should be extended by adding the following term  (3.1)  and process model will take the form ,  ;  ,  (3.2)  2, … ,  and ,  for  1. The  are xN matrices which maps the  (3.3)  to the corresponding sites with  incompleted gradients. For example, if we count the grid cells from the bottom left corner of the grid. The north boundary matrix  is given by  26  0  …  0  …  0  1  0  …  0  1 0  and the west boundary matrix  is given by  10 … 00 …  00 00  … …  00 … 00 … 01 …  00 00 00  … 0 … 0 … 0  00 …  00  … 0  0 0  . 00 … 00 00 … 00  … 0 … 0  00 … 00 … 00 …  01 00 00  … 0 … 0 … 0  00 …  00  … 0  hierarchically as well  We model the boundary processes ,  1, … ,4  0,  ~  2, … ,  , ~  (3.4)  0, ,  ~  0,  .  27  The final stage of the model requires specification of the prior distributions of the ,  variance parameters ,  , and  , and  as well as of the autoregressive parameters  . Again, we assume that all of these parameters are independent. For the  variance parameters we use the conjugate inverse gamma prior, that is  ~  ,  ~  ,  ~  , ,  1, … , 4  (3.5)  .  To simplify matters we assume the same variances for the boundary rows i.e. ,  ,  ,  and for the boundary columns i.e.  . For the autoregressive parameters we use the conjugate normal prior, that is ~  ,  ,  1, … , 4;  and simplifying assumptions  0,  1,  and  2.  (3.6)  and  .  3.2 The autoregressive process  In this chapter we also investigate how the model performance depends on the autoregressive process . Initially we do not allow this process to vary spatially or temporally, that is, ,  and ~  ,  .  (3.7)  Next, we will allow spatially varying autoregressive processes but do not allow spatial dependence among their components i.e. we will allow only independent local adjustments. Formally, this can be written as  28  ~  ,  .  (3.8)  These assumptions seem to be realistic since we expect the large-scale spatio-temporal effects to be captured by the deterministic mean process, , and eventually we need only independent local adjustments to capture within grid-scale variability. Alternatively, following Wikle et al. (1998) we can model the spatially varying autoregressive parameters , as a conditionally specified Gaussian MRF with first-order spatial dependence , |  , : ,  ,  ~  , 1,  1, ,  where  is the MRF mean,  and  1,  1,  1  ,  , 1  ,  1 1  (3.9)  ,  are MRF spatial dependence parameters, and  is  the homogeneous variance.  3.3  Spatial data coverage  5x8 (so  For our simulations, we consider a regular lattice of  40) and  30  equally spaced time points. We simulate a realization of the process given by , ~  ;  0,  ,  2, … ,  ~  0,  .  (3.10)  29  The difference between the process model (2.16) and (3.10) is that (3.10) does not in, . This difference is not essential since the  clude large-scale spatiotemporal trends,  statistical inferences are valid for both processes. We compile four different datasets based on the above realization and fit the hierarchical model (2.1), (2.16), and (2.20) given in the previous chapter. The number of grid cells where data are available  (this is  termed spatial data availability) varies from 20% to 80% corresponding to sequence of 8, 16, 24, 32 grid cells. The spatial sampling locations are selected at random and are shown ,  in Fig. 3.1. The meteorological fields  ,  were generated from Gaussian processes  centered at the observed mean values of the dataset which will be described in Chapter 4 and using corresponding sample variances. n= 8  n= 16  36  36  25  30  25  37  26  38 30  24 10  11  9  12  1  10  1  11  25  36  26 19  9  10  11  1  Figure 3.1  3  12  32  23  24  12  3  n= 24 35  31  n= 32 40  33  31  32  25  22  23  24  14  15  16  37  38  29  30  9 1  26  10  35  36  37  38  27  28  29  30  31  32  19  20  21  22  23  24  11  12  13  14  15  16  6  7  3  40  Selected grid cells for the four datasets. Cells are counted from the left  bottom corner.  30  3.4  Prior information  To investigate the effect of having different degrees of prior knowledge, we consider two prior specifications for the parameters. For that purpose we specify the hyperparameter values in the prior distributions, (2.20), (3.5), and (3.6) to convey different degrees of information. In our first specification given in Table 3.1 we consider proper noninformative priors. Noninformative prior distributions are intended to allow Bayesian inference for parameters about which we have no previous information.  Table 3.1  Summary of the noninformative prior specification.  Parameter  ,  ,  ,  Prior Mean  Prior Variance  30  100  11,  300  30  100  11,  300  1  10  2.1,  1.1  0.1  10  2.001,  Prior Parameters  0.1  0,0,0  0,  10000  0,0  0,  10000  0,  10000  0,  1000  0  1000  In the second specification, the informative prior, we use the same priors for the boundary related parameters, that is,  ,  ,  ,  ,  , and  , but priors for  31  ,  , , ,  are centered very close to the true values. Specifically, Table 3.2 gives the  specification we use for non-spatially varying autoregressive process, . Table 3.2  Summary of the informative prior specification.  Parameter  ,  ,  ,  Prior Mean  Prior Variance  1  10  2.1,  1.1  1  10  2.1,  1.1  1  10  2.1,  1.1  0.1  10  2.001,  0.1  0.4, 0.2, 0.01  10  0.1, 0.15  10  0.4  10  0  3.5  Prior Parameters  1000  0,  1000  Results  We consider four different forms of the hierarchical model described in the previous chapter to fit the simulated realization of the process defined by equations (3.10). : the time autoregressive parameter  i)  is not allowed to vary spatially and  we use only available observations to evaluate the gradients on the boundaries. That is, we use equation (2.16) as the second stage in our hierarchical model. ii)  : the time autoregressive parameter  is allowed to vary spatially and we  use only available observations to evaluate the gradients on the boundaries. That is, we use equation (2.16) again as the second stage in our hierarchical model but in this case the time autoregressive parameter  is replace by  x  diagonal ma-  trix.  32  iii)  : the time autoregressive parameter  is allowed to vary spatially and we  extend the domain by introducing boundary sites and their associated processes. That is, we use equation (3.2) and (3.3) as the second stage in our hierarchical model. In this case the autoregressive parameter and  trix. Note that iv)  is again an  x  diagonal ma-  are special cases of model  : the time autoregressive parameter  is not allowed to vary spatially and  we extend the domain by introducing boundary sites and their associated processes. That is, we again use equation (3.2) and (3.3) as the second stage in our hierarchical model but in this case, the autoregressive parameter  is a scalar  quantity. All results were obtained via MCMC computer simulations. For simplicity, only the full conditional distributions required for the Gibbs Sampler for the most complicated , are presented in Appendix A.  model,  3.5.1  Parameter inference  Tables 3.3, and 3.4 present a summary of the parameters inferred by the model using two different prior specifications. The results in these tables suggest the following conclusions:   The regression parameters  ,  ,  are reasonably well estimated for all four  cases, especially  which is essentially equal to the true value. The uncertainty  about parameters  ,  declines with the increase of the available observations,  and for all datasets the estimates of these parameters are not very sensitive to the prior specification. Figure 3.2 shows MCMC samples from the posterior distributions of these parameter under noninformative prior and  16.  33    The mean posterior estimates for the spatial autoregressive parameters  ,  show poor performance and relatively large standard deviations for a small num8, 16 in both cases of prior specification.  ber of available observations  However, the situation substantially improves as the number of the observations increases.   The time autoregressive parameter  is the most difficult parameter to estimate.  The situation does not improve as the number of the available grid cells increases and there is only slight improvement when an informative prior specification is used. This could indicate poor modeling assumptions which we will try to address in our other model versions  , and  that include a spatially varying  autoregressive process.   and  The inferences about  under the noninformative prior specification are  very poor. The data variability is split more or less equally between the first two stages of the hierarchical model indicating poor overall model performance. However, there is significant improvement when we use an informative prior specification. Table 3.3 Summary of the posterior parameter inference with different numbers of available observations based on model and noninformative prior (Table 3.1).  Parameter  True Value  16  8  24  32  Mean  s.d.  Mean  s.d.  Mean  s.d.  Mean  s.d.  0.4  0.41  0.11  0.39  0.05  0.44  0.04  0.43  0.03  0.2  0.2  0.03  0.21  0.02  0.21  0.01  0.21  0.01  0.01  0.01  0.01  0.01  0.00  0.01  0.00  0.01  0.00  0.1  0.02  0.05  0.06  0.03  0.05  0.02  0.08  0.02  0.15  0.06  0.05  0.09  0.03  0.13  0.02  0.14  0.02  -0.4  -0.54  0.04  -0.53  0.04  -0.52  0.03  -0.52  0.02  10  9.08  1.188  6.76  0.70  5.89  0.55  5.27  0.46  NA  10.034  1.282  7.82  0.74  7.42  0.62  7.24  0.54  34  Table 3.4 Summary of posterior parameter inference base on model using different numbers of available observations and informative prior (Table 3.1).  Parameter  True Value  8  16  24  32  Mean  s.d.  Mean  s.d.  Mean  s.d.  Mean  s.d.  0.4  0.39  0.07  0.38  0.04  0.43  0.03  0.41  0.02  0.2  0.2  0.02  0.20  0.01  0.20  0.01  0.20  0.01  0.01  0.01  0.00  0.01  0.00  0.01  0.00  0.01  0.00  0.1  0.00  0.04  0.06  0.02  0.06  0.02  0.08  0.02  0.15  0.09  0.04  0.12  0.02  0.13  0.02  0.14  0.01  -0.4  -0.51  0.04  -0.49  0.03  -0.47  0.03  -0.48  0.02  10  0.86  0.67  1.29  0.80  0.80  0.42  0.7  0.32  NA  11.17  1.34  9.45  1.14  10.1  0.77  10.05  0.63  Figure 3.2 displays the MCMC samples from the posterior distributions of all model grid cells equal  parameters in the case of noninformative prior and number of available 16. We can see that the Markov chains exhibit the same qualitative  behavior after a few iterations, which is an indication of convergence. Furthermore, Figure 3.2 also presents the MCMC histograms that approximate the marginal posterior distributions of the model parameters. It is clear from the presented histograms that the model parameters are concentrated around their true values and the algorithm correctly identifies the parameters.  35  500 4000  0.05  0.15  0  1  4000  0.3  0.4  0.5  0.6  300 100 0 0  1  2000  4000  0.010  0.020  iteration  4000  0 4000  -0.60  -0.50  -0.40    500 0  0 100  20  40  300  60  500 0 100 2000  2000 iteration  300  100 200 300 400 0 0  0  3  iteration  0.24  100 200 300 400  0.2 300 100 0 0.000  0.20 2  -0.6 -0.4 -0.2 0.0  0.08  4000  0.16  iteration  0.04 0.00  2000  0.10  0.10 0.00 0.2  iteration  0  0.00  2  0.20  600 200 0 4000  2000 iteration  400  0.5 0.3 0.1  2000  0  0.00 -0.05  iteration  0  100  0.10  300 0 100 2000  300  500  0.15 0.05 -0.05 0  5  6  7 2 z  8  9  10  0  2000 iteration  4000  6  7  8  9  10 11  2 o  Figure 3.2 Sequences of 5000 iterations from the posterior distributions of the model parameters and the histograms of the last 2000 iterations.  36  Next, we will concentrate on parametric inference for the model has significantly more parameters than models  and  .  , which results in a  slower Gibbs sampler and a considerable rise in computation cost. Tables 3.5, and 3.6 parameters using two differ-  present a summary of the inference about the model ent prior specifications. The results in these tables suggest:   Again, the regression parameters  ,  ,  are reasonably well estimated for all  cases. The uncertainty about these parameters declines with an increase in the available observations, and for all datasets the estimates of these parameters are not very sensitive to the prior specification.   The spatial autoregressive parameters  ,  are very sensitive to the number of  available observations in both cases of prior specification. The increasing number of observations substantially improves parameter inference.   The variances  and  are the parameters that are most strongly influenced by  the prior specification and are not very sensitive to the number of available observations. Figure 3.3 displays the true and interpolated surfaces of the spatially varying autoregressive process, , for informative prior and  32. A visual inspection of these two  surfaces shows that they are closely similar (spatial pattern). To quantify the differences between the true and the model-estimated surfaces of the process  we use a statistic Sur-  face Distance (SD) defined as the mean sum of squared differences between the observed and estimated values 1  where  is the observed and  estimated values of the  at  . Table 3.7 reports the  SD statistics for . The SD statistic appears to decrease with increasing available data in a non monotonic fashion.  37  Table 3.5 Summary of posterior parameter inference based on model , using different number of available observations and noninformative prior (Table 3.1). 16 24 32 8 True Parameter Value Mean s.d. Mean s.d. Mean s.d. Mean s.d. 0.4  0.5  0.28  0.36  0.13  0.36  0.11  0.42  0.08  0.2  0.27  0.1  0.23  0.04  0.23  0.03  0.22  0.03  0.01  0.0  0.01  0.01  0.0  0.01  0.0  0.01  0.0  0.1  0.02  0.02  0.03  0.02  0.08  0.2  0.08  0.2  0.15  0.01  0.02  0.01  0.01  0.09  0.3  0.14  0.2  10  9.24  1.17  6.78  0.70  6.16  0.58  5.69  0.49  NA  9.27  1.13  6.94  0.71  6.74  0.60  6.29  0.51  Table 3.6 Summary of posterior parameter inference based on model , using different number of available observations and informative prior (Table 3.2). 16 24 8 32 True Parameter Value Mean s.d. Mean s.d. Mean s.d. Mean s.d.  Table 3.7 SD  0.4  0.54  0.22  0.39  0.11  0.39  0.09  0.42  0.08  0.2  0.22  0.06  0.21  0.04  0.2  0.03  0.22  0.02  0.01  0.0  0.01  0.01  0.0  0.01  0.0  0.01  0.0  0.1  0.06  0.04  0.06  0.02  0.09  0.02  0.08  0.02  0.15  0.0  0.02  0.01  0.01  0.11  0.02  0.13  0.02  10  2.49  2.57  1.50  1.24  0.84  0.58  1.38  0.85  NA  8.17  2.58  8.04  1.44  9.15  0.82  8.62  1.04  SD statistics for spatial varying autoregressive parameters . 8  16  24  32  0.07  0.06  0.02  0.01  0.04  0.09  0.02  0.01  0.09  -  -  0.008  0.09  -  -  0.008  38  4 y 3 2 1  1  2  y 3  4  5  Interpolated sutface of   5  True surface of   2  4  6  8  2  4  x  6  8  x  Figure 3.3 True and interpolated surfaces of 32. The shading interval is uniform.  based on an informative prior and  The following Tables 3.8, 3.9, 3.10 and 3.11 summarize the results of parameter inference for models  and  eration in all analysis to datasets  . For computational reasons we restrict consid8 and  32. The results suggest similar conclu-  sions about model parameters that have been made for the models  and  Table 3.8 Summary of posterior parameter inference based on model , using different numbers of available observations and noninformative prior (Table 3.1). 8 32 True Parameter Value Mean s.d. Mean s.d. 0.4  0.78  0.23  0.48  0.08  0.2  0.19  0.07  0.19  0.03  0.01  0.01  0.01  0.01  0.0  0.1  0.03  0.01  0.12  0.02  0.15  0.02  0.01  0.15  0.01  (-5.2,-0.42,1.0)  (2.44,0.43,0.23)  (0.07,-0.3,1.0)  (0.88,0.47,0.22)  (1.92,-0.51,1)  (1.93,0.5,0.22)  (1.28,-0.27,1.0)  (1.45,0.44,0.24)  8.45  1.05  5.33  0.48  8.65  1.1  6.76  0.52  ,  (0.11, 0.07)  (0.16,0.09)  (0.1,0.08)  (0.14, 0.09)  ,  (0.08, 0.08)  (0.1,0.1)  (0.09,0.08)  (0.12, 0.1)  1  1.48  1  0.52  39  Table 3.9 Summary of posterior parameter inference based on model , using different numbers of available observations and informative prior (Table 3.2). 32 8 True Parameter Value Mean s.d. Mean s.d. 0.4  0.7  0.17  0.48  0.07  0.2  0.16  0.06  0.18  0.02  0.01  0.0  0.01  0.01  0.0  0.1  0.09  0.03  0.11  0.01  0.15  0.08  0.04  0.15  0.01  (-0.15,-0.26, 1)  (1.21,0.46,0.24)  (0.12,-0.31,1)  (1.04,0.46,0.23)  (1.16,-0.35, 1)  (1.38,0.45,0.22)  (1.86,-0.25,1)  (1.85,0.46,0.24)  0.83  0.64  0.56  0.29  8.12  1.11  9.36  0.56  ,  0.08, 0.07  (0.11, 0.09)  (0.1,0.08)  (0.17,0.09)  ,  0.09,0.08  (0.17,0.1)  (0.08,0.08)  (0.09,0.1)  1  1.32  1  0.58  Table 3.10 Summary of posterior parameter inference based on model , using different numbers of available observations and noninformative prior (Table 3.1). 8 32 True Parameter Value Mean s.d. Mean s.d. 0.4  0.44  0.05  0.45  0.02  0.4  0.37  0.06  0.4  0.03  0.2  0.23  0.03  0.22  0.01  0.01  0.01  0.0  0.01  0  0.1  0.06  0.02  0.8  0.01  0.15  0.12  0.04  0.16  0.01  (-0.39,-0.18,1)  (1.2,0.44,0.24)  (-0.3,-0.16,1)  (0.59,0.44,0.23)  (-2.5,-0.41,1)  (1.94,0.42,0.24)  (-3.09,-0.61,1)  (3.41,0.43,0.24)  8.42  1.07  4.80  0.4  9.45  1.26  6.83  0.52  40  ,  (0.09,0.08)  (0.12, 0.1)  (0.1,0.08)  (0.15, 0.1)  ,  (0.1,0.08)  (0.19,0.1)  (0.09,0.08)  (0.08,0.11)  2  2.19  1  1.21  Table 3.11 Summary of posterior parameter inference based on model , using different numbers of available observations and informative prior (Table 3.2). 8 32 True Parameter Value Mean s.d. Mean s.d. 0.4  0.43  0.04  0.42  0.02  0.4  0.37  0.04  0.4  0.02  0.2  0.22  0.02  0.21  0.01  0.01  0.01  0  0.01  0  0.1  0.07  0.02  0.08  0.01  0.15  0.14  0.03  0.16  0.01  (0.01,-0.23,1)  (1.08,0.44,0.24)  (-0.67, -0.11, 1)  (0.71,0.45,0.23)  (-1.84,-0.4,1)  (2.02,0.42,0.23)  (-2.13,-0.6,0.9)  (3.86,0.46,0.24)  0.81  0.62  0.57  0.27  9.46  1.26  9.11  0.57  ,  (0.09, 0.08)  (0.11,0.09)  (0.11,0.08)  (0.09,0.08)  ,  (0.1, 0.08)  (0.2,0.1)  (0.09,0.08)  (0.1,0.15)  2  2.04  1  1.3  10  41  3.5.2  Model performance  We now investigate the spatial-temporal performance of the proposed models ,  ,  and  and how this depends on different datasets and  priors. Additionally, the proposed models also allow us to investigate the local transport effect on concentration levels of the true process. To quantify the agreement between observed and estimated concentration levels we calculate the root mean squared error (RMSE) and to assess the relative importance of the transport effect we use the ratio of the variances of the transport term to the total variance. Table 3.12 reports the model performance and relative importance of the transport effect under different datasets and priors. We note that the average RMSE decreases with an increase in the available observations and, by contrast, the average transport effect increases with an increase in the available observations. Using informative prior specification improves model performance in all cases. This result could be connected to the observed improvement in the parameter inferences when we use informative prior specification and in particularly the variance parameters, which are the most sensitive parameters to the prior specification. Another important point is that the spatially varying autoregressive models and  and  do not perform better than models  with smaller datasets. The reason for this could be the poor modeling as-  sumption; the way we generate our datasets and/or the natural requirement that more parameters require lager dataset to be estimated. To clarify the latter we performed an additional run of the model  on the larger dataset (  8 and  100). As a result,  we observe improvement of 14% and better in model performance. This shows that these models are not suitable for small datasets. The model performance might eventually improves if we consider some form of spatial interaction among the time autoregressive parameters such as MRF. This option was not investigated here.  42  Table 3.12 Summary of the model’s performance and relative importance of the transport effect. * denotes that informative prior was used for calculations. 8 16 24 32 Model RMSE Tr. RMSE Tr. RMSE Tr. RMSE Tr. 3.50  3.7%  3.19  5.4%  2.79  7.5%  2.34  9%  3.35  10.3%  2.94  11.7%  2.36  16%  1.7  17.6%  14.65  2.4%  13.95  3.4%  3.64  9.6%  2.89  10.9%  9.43  7.2%  5.67  16.2%  3.34  22%  2.57  16.3%  19.99  3.4%  -  -  -  -  2.55  11.2%  17.27  16.5%  -  -  -  -  1.96  22%  3.34  8%  -  -  -  -  2.23  9%  3.2  12.9%  -  -  -  -  1.61  16.9  Figure 3.4 (center) presents MCMC histograms of the response process from randomly selected grid cells with observations (grid cell number 25) and without observations (gird cell number 9), based on model point  (  8, noninformative prior) at time  20. The posterior distribution of the grid cells that do not have observations is  wider indicating that much more uncertainty is associated with response estimates where no data are presented. Furthermore, in this case the autocorrelation (right panel) is higher which implies slow convergence of the Gibbs Sampler. The local impact of data availability in time is further investigated in Figure 3.5 where the posterior time series of the true processes are presented for the same locations. The 95% credibility lines represent the evolution of the uncertainty in time. It is clear that the credibility interval is wider at each time point for the grid cell that has no observations within the grid cell. Another point is that the 95% credibility interval for the grid cell 9 is relatively constant in time and does not follow closely the time variations of the true process (the solid line). The posterior time series for grid cell 9 also fails to  43  O(9,20) 1.0  O(9,20) 0.8 0.6 0.4 0.2 0.0  0  0  10  100  20  200  30  40  300  50  400  60  O(9,20)  0  1000 2000 3000 4000 5000  5  10  15  20  25  0  5  10  iteration  15  20  25  30  25  30  lag  O(25,20)  O(25,20)  0  0.8 0.6 0.0  0  10  100  20  0.2 0.4  200  30  300  40  1.0  O(25,20)  0  1000 2000 3000 4000 5000 iteration  5  10  15  0  5  10  15  20  lag  Figure 3.4 Left: sequences of 5000 iterations from the posterior distributions of the randomly selected grid cells without observations O(9,20) and with observations O(25,20) ; Center: histogram of the last 2000 iterations; Right: autocorrelation function.  capture most part of the variability of the true process even though on average, these two time series are very close. By contrast, the posterior time series for grid cell 25 that has observations within the grid cell follows very closely the observed time series as well as the 95% credibility interval. We also should note that the credibility interval is narrow in this case and in both cases the observed time series is within the 95% credibility interval.  44  O(20,); Grid cell without observations.  10  30  fitted data raw data 95% credibility interval  0  5  10  15 time  20  25  30  25  30  35  O(25,); Grid cell with observations.  5  15  25  fitted data raw data 95% credibility interval  0  5  10  15 time  20  Figure 3.5 Posterior mean time series for grid cells without observations for grid cells with observation 25,· .  9,· and  Model based interpolated maps of the true process (Figure 3.6) are given for the randomly selected time period (time interval 20 to 25). We found good agreement with the observations superimposed on the plots and that the model was able to capture correctly the space-time dynamic evolution. The RMSEs for the period are 3.55, 3.04, 2.70, 3.80, 3.05, and 2.78 respectively. Similar results were obtained for the other datasets 16, 24, 32 and models for the whole period of  ,  ,  (not shown). The average RMSE  30 equally spaced time points are reported in Table 3.12.  45  16.1  19.5  18.3  22.9  14.6  19.2  10.8  18.2  15.1  14.3  15.5  11.1  11.5  23.7  13.1  11.6  9.3  *  *  16 14  14.2  17.9  6  17.6  17.6  15.1  14.6  21  14.7  9.3  25.1  19.8  2  4  6  x  10  12.1  11.7  13.1  9  15  22.1  17.6  15.9  15.3  12.8  8.2  *  14.1  11.2  20.7  26.5  12  27.3  20  *  12  17.9  17  12.9  9.6  20.4  16.1  16.6  19.6  7.1  12.9  5.6  19.8  14  11.3  *  *  *  19.3  2  16.8  4  13.2  17.5  14.3  8.2  16.2  6  x  16.2 12.9  17.6  4  13.3  15.6  10.8  15.9  15.2  19.7  12.5  16.7  13.2  0.8  11.3  12.6  22.3  *  * 12.9  *  15.7  5  11.6  4  15.8  6.8  9.6  22.8  10.1  19.8  17.8  17.1  3  17.5  16.6  12.2  8.7  10.7  12.8  11.7  18.4  11.5  13.9  9.2  18.3  19.9  12.8  9.5  13.3  14.6  14.2  18.3  *  14.7  9.9  21.5  15  *  20 18  *  16 14  2  4  x  6  12.9  16.4  18.3  15.5  12.5  *  20 18  *  9.9  15.7  22.6  18.8  12  14.6  8.8  12.8  17.4  23.2  17.7  10.7  23.8  17.1  4  6  x  16 14  *  12 10  8  8  23.1  19.4  11  13.9  *  15.8  16.9  *  8.9  21.1  18  *  14.8  12.7  19  17.3  9.3  10.9  11.4  19.4  14.4  11  15.9  16.4  22.8  15  11.9  16.3  15.8  14  16  15.9  15.8  11.5  13.8  14.7  10.2  23.1  13.6  12  * 15.7  27  4  15.8  *  *  *  11.5  3  17.4  *  8  * 6.1  2  y  18.5  6  x  2  16.5  12.2  1  19.4  14  Mean: time 25  2  5  21.2  *  11.2  8  1  y  11.4  19.6  16  12 17.1  Mean: time 24 *  *  22.7  10 12.8  14.8  15.7  15  * 13.3  18  10.4  5  15.3  5.3  19.9  15.3  4  17  8.1  18.5  12.3  3  13.5  13.1  *  24.1  16.8  2  15.2  11.7  11.3  2  y  16.6  *  *  Mean: time 23  19.4  *  20.2  10.7  1  5 4 3  y  1  2  11.2  *  18.6  20.6  8  * 8  *  17.2  Mean: time 22 16.7  17.9  11.1  12 10.4  *  15.6  13  18  *  24  4  4.6  *  18  20  6.3  *  5  22  20.7  3  *  10.6  y  17.1  2  14.9  4  14.9  3  *  13.2  1  13.1  Mean: time 21  2  y  19.1  *  1  5  Mean: time 20  *  16.3  *  17  *  * 14.2  2  15.9  20.7  4  x  6  *  16 14 12  8  Figure 3.6 Model-estimated mean field of the true pollutant concentration (grey scale), and observed values (numbers). * - denotes the grid cells used for fitting the model.  Figure 3.7 shows the corresponding standard deviation contour plots. The plots illustrate, as expected, that the uncertainty is lowest near the grid cells that have observations. Since the standard deviation is very sensitive to the data availability these plots are  46  not very suitable for characterizing the domain variability. Large-scale variability patterns might be very different from those presented in Figure 3.7.  15  *  4  15  *  *  10  *  3  y  3  y  *  15  10  10  *  10  5  Var: time 21 15  10  *  10  4  5  Var: time 20  * 10  5*  2  2  10  *  *5  *  *  2  3  4  1  2  3  4  x  5  6  7  8  15  * 1  *  5  6  7  8  Var: time 23 5  5  Var: time 22  x  *  4  *  15  3  y  10  15  10  *  *  10  10  *  15  3  *  y  4  15  10  10  *  1  15  10  1  10  *  * 10  1  2  3  4  *  *  2  3  4  x  5  6  7  8  * 1  5 4  7  8  *  10  *  3  y  15  *  5*  *  3  4  *  1  2  3  4  x  5  6  7  8  * 1  10  1  *  15  1  10  5*  10 15  *  6  10  2  3  y  * 10  * *  2  *  10  10  15  5  15  *  x  Var: time 25  10  4  5  Var: time 24 *  15  10  *5  15  *  10 15  10  *  2  *  1  1  2  10  *5  2  x  5  6  7  8  Figure 3.7 Contour plots of pollutant variance field. * - denotes the grid cells used for fitting the model. The model’s ability to estimate the transport effect is demonstrated in Figure 3.8. In general the effect is small and the wind can be associated with positive, negative and  47  no local transport contributions to the pollutant concentration levels. The sign of the transport effect is determined by the pollutant concentration gradients and the magnitude of the effect depends on wind speed and wind direction as well magnitude of gradients. Again, we do not try to find large-scale transport patterns since the data are randomly generated. The average overall transport effect is reported in Table 3.12.  5  Mean: time 21  5  Mean: time 20  0.6  2  4  x  6  4 3  y  -0.5  0.2  2  0.0  0.4  0.0 -0.2  1  3 2  y  0.5  1  4  1.0  8  2  x  6  8  0.2  0.6  5  0.4  0.4 4  5  Mean: time 23  4  Mean: time 22  4  0.2 0.0  3  y  3  y  0.0 -0.2 2  2  -0.2 -0.4  -0.6 2  4  x  6  -0.6  1  1  -0.4  8  2  x  6  8  5  0.2  4  Mean: time 25  5  Mean: time 24  4  0.0 -0.2  3  y  0.0  3  y  4  0.5  2  2  -0.4 -0.5 -1.0 2  4  x  6  8  1  1  -0.6 -0.8 2  4  x  6  8  Figure 3.8 Model estimated mean transport contributions map. . The observed wind field is superimposed on the plots.  48  We have also performed the usual diagnostic plots for checking model adequacy. Figure 3.9 shows the scatter plot of the estimated values against observed values for the two different types of grid cells. These results agree with our previous results and conclusions reported in Table 3.12 and Figure 3.5. We have observed similar behavior for all datasets and models used in this simulation study.  20 15 0  5  10  Model-estimates  15 10 0  5  Model-estimates  20  25  O(2,)  25  O(25,)  0  5  10 15 20 Observed value  25  0  Figure 3.9 Scatter plot for grid cell with observations out observation 2,· .  5  10 15 20 Observed value  25  25,· and for grid cell with-  49  3.6  Discussion  In this simulation study we investigated different aspects of the general model proposed in Chapter II including: how we treat the boundary sites; the way we incorporate uncertainty by allowing spatial varying autoregressive parameters or not; the data availability; and prior specification. We found that:   In general the model parameters can be divided into three groups independent of differences in model specification. In the first group we have well estimated parameters independent of data availability and prior specification, such as . In the second group we have parameters whose estimates can be substantially improved depending on data availability and prior specification, such as , and  .  .  The third group includes parameters that are not very sensitive to prior specification or data availability, such as .   Model performance and the transport effect depend significantly on spatial data availability (the ratio of the number of the cells without observations to the number of the cells with observations, i.e. / ) and prior specification especially for the models that allow spatial varying autoregressive parameters, , and  . The reason for the relatively poor performance of some  of the models could be the length of the data records; poor model assumptions; or the way we generate the datasets. Therefore, it would be worthwhile to investigate further simulation studies with larger datasets and other ways for incorporating uncertainty.   There is a considerable difference between grid cells that have or do not have observations within the cells. The grid cells that do not have observations are associated with much more uncertainty. This would explain the strong dependence of model performance on spatial data availability. Increasing the number  50  of the cells with observations will decrease the overall uncertainty and vice versa, and ultimately model performance.   The Gibbs Sampler slows significantly as the complexity of the model increases.  Additional runs of the models have been carried out to investigate the sensitivity of the results on deliberately misleading prior specifications. That is, we incorrectly specified the prior information for different groups of model parameters  , ,  in factor of  3-4. In all cases the result was very similar to the one obtained using correctly specified priors.  51  CHAPTER 4 APPLICATION TO GROUND LEVEL OZONE CONCENTRATION DATA 4.1  The Metro Vancouver data  The dataset considered here consists of one-hour-average оzone measurements provided the Lower Fraser Valley (LFV) Air Quality Monitoring Network, covering the time period of 24 years from 1982 to 2006 at 38 stations (Lower Fraser Valley Ambient Air Quality Report, 2007). However, we model daily maximum 8-hour average ozone concentration data obtained from hourly ozone data by calculating the maximum of rolling 8hour averages for each day. The 8-hour average values are of particular interest because that is the averaging basis for the Canadian-Wide Standards for ozone, based on the adverse effect of ozone and other pollutants on human health and the environment (EPA AQCD, 2006). The spatial domain of study is restricted to a rectangular region with size of about 146x61 km2 covering the Lower Fraser Valley. We also restrict the time period of analysis the most recent 3 years available (2004, 2005, and 2006). That period includes one year of high concentrations (2006), one of low (2005) and one of intermediate (2004), though the analysis of data from other years is not excluded. The ozone season considered here starts from 15 April through 30 September, 169 days in all (McMillan et al., 2004). All hourly measurements have a quality control flag and in this analysis we use only high quality data from stations with at least 90% data availability. These criteria should optimize model spatial interpolation performance and partially eliminate any uncertainty in our results that might be attributed to measurement error or more generally inadequate data quality.  52  In addition, we have meteorological data from 13 weather stations located at some ozone monitoring sites, since our modeling approach requires meteorological fields as covariates. We interpolate the covariates fields to a regular grid specified over the entire domain of study for each time point using ordinary kriging. Once we obtain the covariate fields we calculate the daily rolling 8-hour averages of meteorological covariates at each grid cell, the daily 8-hour average values that are used as input in analysis. For grid cells that have no ozone observations within we select values that correspond to the domain daily average maximum 8-hour ozone concentration. Figure 4.1 displays the resulting interpolated 8-hour average wind field for 12 May, 2004. Alternatively, we can use the output of a Numerical Weather Prediction (NWP) system to obtain the meteorological fields. We should also note that, first: the wind field is a vector variable described by wind speed and direction that was decomposed into two orthogonal components and then interpolated over the grid (Stull, 1988); second: not all stations were operational at the same time; third: all of the data points have a quality control flag that indicates possible  2  4  6  8  problems that we do not believe affect the conclusions drawn from our analysis.  0  Figure 4.1  5  10  15  Interpolated 8-hour average wind field for 12 May 2004  53  Figure 4.2 provides a geographical summary of the spatial domain and the location of the 20 air quality sites that had available data for the time period of study. The figure also shows the region over which we define the interpolation grid for applying the spatiotemporal model and that the spatial data coverage is approximately 11%.  26 * 06 01 * 04 14 09 32 02 * * * * * 20 * * 18 * 31 * 13 17 * 15 * *  29 *  30 * 12 * 27 *  49  49.1  Latitude 49.25  49.4  25 *  -123.2  -123  33 *  -122.8 -122.6 -122.4 -122.2 Longitude  -122  -121.8 -121.6  Figure 4.2 Locations of ozone observation sites “*” and prediction grid “+”. Numbers are station identifiers.  Different aspects of ozone behavior in the LFV have been studied by many authors including: McKendry, 2002; Le and Zidek, 2006; Robeson and Steyn, 1989, 1990; Pryor et. al, 1994; Pryor and Steyn, 1995; Joe et. al, 1996; Vingarzan and Taylor, 2003; Burrows et. al, 1995; Ainslie and Steyn, 2007. The range of methods used in these studies vary from simple time series and regression models to complex hierarchical Bayesian models and Artificial Neural Networks. Restricting the time domain to a particular season is appropriate for the time scale of an operational model and was done in several previous studies: McMillan et al., 2005;  54  Huang and Hsu, 2004; Huerta et al., 2004; Wikle and Cressie, 1999. In addition, it keeps the problem computationally tractable and facilitates the modeling approach since the seasonal and longer scale cycles appear as a linear trend.  4.2  Background chemistry of ozone  Different aspect of ozone as an air pollutant have been studied: hourly surface level concentrations (e.g. Gardner and Dorling, 2000), daily maximum concentration (e.g. Burrows et. al, 1995), annual cycle (e.g. Vingarzan and Taylor, 2003), threshold exceedance (e.g. Smith and Huang, 1993), correlated variables (e.g. McKendry, 2002), evaluation of an air quality model (e.g. Riccio et. al, 2005), systematically missing data (e.g. Le and Zidek, 2006), and trends assessment (e.g. Rao and Zurbenco, 1994, Sahu et al., 2007). Ozone is secondary pollutant which forms by the photochemical destruction of nitrogen dioxide (NO2) in the presence of sunlight. As a result of the reaction, nitric oxide (NO) and an atom of oxygen (O) are created. NO2+hν -> NO+O Here the energy of the solar radiation is presented by the second term; hν, which is the product of Planck’s constant, h, and the frequency, ν, of solar electromagnetic radiation. Next, the oxygen atom combines with molecular oxygen (O2) in the presence of a third molecule that absorbs the excess energy to produce ozone (O3) O+O2 +M ->O3+M In the absence of volatile organic compounds (VOCs) ozone reacts with nitric oxide to reform nitrogen dioxide O3 +NO -> NO2+O2  55  The presence of VOCs and CO amplifies ozone production by removing NO and creating additional NO2. When sufficient amounts of NO2 and VOCs are present in appropriate weather conditions, ozone formation exceeds dissociation, causing it to build up in the atmosphere. Ozone is also transported to the troposphere from the stratosphere. The sources of tropospheric zone precursors, NOx and VOCs, in general can be grouped in two major categories namely: biogenic (natural) and anthropogenic. The natural NOx sources include stratospheric intrusions, lightning, soils, and wildfires and the two largest anthropogenic NOx sources are electric power generation plants and motor vehicles. Uncertainties in natural NOx emissions are much larger than for anthropogenic NOx emissions. Naturally emitted VOCs are produced by vegetations and the largest anthropogenic sources are industrial processes and transportation. On the regional and global scales, emissions of VOCs from vegetation are much larger than those from anthropogenic sources. On an urban scale, uncertainties in both biogenic and anthropogenic VOCs emission inventories prevent determination of the relative contributions of these two categories with much precision (EPA AQCD, 2006). Ozone may exhibit significant temporal variations on different scales, for example, seasonal cycles caused by increases in biogenic emissions during the summer when the temperature and incident sunlight are highest. Temperature and sunlight are also responsible for an ozone diurnal cycle, while climate changes are responsible for long term trends (EPA AQCD, 2006). Additionally, ozone may exhibit significant spatial variation as well. For example, in urban areas, it is well known that ozone concentration is lowest in the center and increases in the downwind direction. The reason is that efficiency of ozone production per NOx oxidized is highest in areas where NOx concentrations are lowest (suburban/rual areas) and decreases with increasing NOx concentration (urban centers). The atmospheric chemistry of tropospheric ozone is a complex, non-linear function of many factors, which  56  goes through several stages. A comprehensive overview of the problem is given in Seinfeld and Pandis, 1998.  4.3  Exploratory Data Analysis (EDA)  In order to gain some understanding of the spatial and temporal patterns presented in the data, we have conducted a preliminary EDA. Figure 4.3 presents boxplots of the domain daily maximum 8-hour ozone values by month. It is clear that there is a temporal trend over the period of study. The domain-average levels steadily decrease from the beginning  0  Daily maximum 8-hr ozone 20 40 60  of the period (April 15) to the end of the period (September 30).  April  May  June  July  August  September  Month  Figure 4.3  Boxplot of the domain daily maximum 8-hour ozone levels by month.  57  This decreasing trend is also seen in the daily maximum 8-hour ozone time series (Figure 4.4), suggesting that a simple linear trend would be a reasonable large-scale tem, for the data in this case.  Daily maximum 8-hr ozone [ppb] 0 20 40 60 80  poral model,  1 May  1 Jun  1 Jul  1 Aug  1 Sep  30 Sep  Day  Figure 4.4 Time series of observed daily maximum 8-hour zone concentrations at 20 monitoring sites from April 15th to September 30th.  A Principal Component Analysis (PCA) was employed to investigate the largescale spatial features that are mostly dominated by the spatial variability of emissions rates of ozone precursors throughout the domain of interest. Since EOFs are defined to be orthogonal and the PCs are defined to be uncorrelated they are an efficient means of isolating coherent modes of variability in large data sets (McPhaden and Hayes, 1991). To perform PCA we use an iterative Bayesian model that is able to handle the missing values problem. The model is implemented in an R package namely: pcaMethods, (Shigeyuki et al., 2003).  58  The top panel of Figure 4.5 shows the first empirical orthogonal function (EOF) of ozone data. This spatial pattern explains 85.7% of the total variance and is similar in structure to the mean ozone field (not shown). The bottom panel of Figure 4.5 plots the first EOF mode of wind prefiltered ozone data. By this we mean we used only ozone values that correspond to very light winds less than or equal to 3 km/h; similar results were obtain for 2 km/h. In addition, this eigenfunction describes 68.37% of the total variance and clearly indicates the considerable impact caused by the transport effect on the mean field. Both panels in Figure 4.5 present a complex large-scale spatial pattern which is very difficult to reproduce by a continuous function of some covariates, e.g. longitude and latitude (Wikle 1998). That is why we choose to use the first EOF mode of wind prefiltered ozone data to model the large-scale spatial trend,  , as was discussed in Chap-  ter 2.  59  49.5  The first EOF mode of ozone data -9  Latitude 49.1 49.3  -10  -10  .5  -12.5 0 -1  -9  -13 -12  -9.5  -12  -11.5  -13  -13  48.9  - 11  -11  .5  -123.0  -122.5 Longitude  -122.0  -121.5  49.5  The first EOF mode of the wind prefiltered ozone data (wind <3 km/h) -4  -4  Latitude 49.1 49.3  5 -5.  -5  -5.5  -6  -6 -7.5  -7  -4. 5  48.9  -8  -7  -5  -6.5  -8 -123.0  Figure 4.5  -122.5 Longitude  -122.0  -121.5  The first EOF modes of ozone data.  Now, we have sufficient information to specify the larger-scale spatiotemporal model (2.4), which in our case takes the form ;  x  1, … , .  Least-square (LS) methods were used for inferring the parameters  (4.1)  ,  ,  , with esti-  mated values 5.50, 0.01, 0.28 . After removing the large-scale signals from the observations the residuals  ,  ̂  ,  ,  1, … , ;  1 … , where  is the number  60  of ozone monitoring sites, were used for fitting the Bayesian hierarchical ,  ,  , and  models  specified in Chapter 3.  We investigate the normality of the residuals using histograms and normal quantile-quantile (Q-Q) plots on three measurement scales: original, square root, and logarithmic. Figure 4.6 displays the normal Q-Q plots of four randomly selected ozone monitoring sites on the three scales. It is easy to see that in all cases the normality assumption is reasonable and there is a slight improvement by using variance stabilizing scales (square-root and logaritmic), seen in the middle and bottom panels. However, in some cases, e.g. station 1, the logarithmic transformation appears to have over compensated for the skewness. The histogram plots (not shown here) confirm this observation. This makes us believe that the square-root scale would be the most appropriate scale to work with on in our analysis in prospect of stabilizing the variance and symmetry. This result is in agreement with other works in modeling ozone concentration levels; see for example, Sahu et al., 2007; Carroll et at., 1997; Huerta et al., 2004; Huang and Hsu, 2004. The partial autocorrelation function for the residuals of the first nine (ordered on station identifiers) time series are plotted on Figure 4.7. The result indicates clearly the cut off at lag one of the partial autocorrelation function that suggest the first order AR process would be appropriate for each station. This result is in accord with the statistical model suggested in Chapter 2, and built on physical arguments. In the suggested model, the time autoregressive term appears naturally from the advection equation.  61  Station 01  Station 12  Station 09  0 1 2  Station 01  0 0 1 2  -2  -0.5  -0.5 -2  0 1 2  0 1 2  Station 09  0.5  Station 12  -2.0 0 1 2  -2 -2  Station 01  -1.0 0.0 -2  2  3 0 1 2  0 1 2  Station 09  1 -2  Station 20  -2  -1  -3 0 1 2  0 1 2  Station 12  -1  2 0 -2 -2  30 -2  1  Station 20  -10 10  20 -2  0.0  0 1 2  -1.0  -2  -10  -20  -20  0  0  20  50  Station 20  -2  0 1 2  -2  0 1 2  Figure 4.6 Normal Q-Q plot of residuals on original scale (top panels), square root scale (middle panels), and logarithmic scale (bottom panels). Station 02  Station 04  5  10  15  20  -0.1  0.2  -0.1 0.1  -0.1 0.1 0.3  0.3  Station 01  5  15  20  10  15  20  15  20  5  15  20  0.2  0.5  -0.2 0.0  -0.1 20  10  Station 17  0.2  0.5  15  20  0.2 10  Station 15  -0.1 0.2  Figure 4.7  10  15  -0.1 5  Station 14  5  10  Station 13  -0.2 0.1  0.1 -0.2  5  5  Station 09 0.4  Station 06  10  5  10  15  20  5  10  15  20  Partial autocorrelation functions of the first nine time series.  62  4.4  Results  4.4.1  Parameter inference  A Gibbs sampler was applied to estimate the marginal posterior distributions for model parameters. A summary of the results from the Gibbs sampler are presented in Table 4.1. The results suggest the following conclusions:   ,  The estimates of the regression parameters  ,  are similar for all models.  The temperature ( ) is the meteorological parameter that has the most significant influence on average grid cell ozone concentration.   , and  The autoregressive process for models  , , has the second  most significant influence on ozone levels. Additionally, the spatially varying autoregressive processes for models  and  are shown in Figure 4.8.  A visual inspection of these two surfaces shows that they have substantial spatial variation, which supports the use of a model with spatially varying autoregressive coefficients.   The spatial autoregressive parameters ( ,  ) have only a minor effect on ozone  concentration in all cases. However, the zonal (east-west) component bigger than the meridional (north-south) component and  is much  for models  which is in agreement with the prevailing westerly summertime  winds in LFV (Ainslie and Steyn, 2007)   The estimates of the variance components  and  explained by the second stage error process ( (  ), except  show that more variance is  ) than by the data error process  .  Parameter inferences related to the boundary process that is included in models , and  are reported in Table 4.2.  63  Summary of posterior parameter inference.  Table 4.1  Parameter Mean  s.d.  Mean  s.d.  Mean  s.d.  Mean  s.d.  Temperature  0.429  0.031  0.604  0.03  0.586  0.032  0.585  0.034  Rel hum  -0.135  0.005  -0.168  0.004  -0.165  0.005  -0.166  0.006  Solar rad  0.001  0.001  -0.002  0.001  -0.003  0.001  -0.003  0.001  E-W space  -0.006  0.005  0.013  0.001  0.01  0.002  -0.004  0.003  N-S space  -0.001  0.006  -0.016  0.003  -0.013  0.002  0.001  0.005  Auto reg.  0.48  0.02  0.479  0.016  Data var.  33.33  1.133  20.26  3.137  25.51  2.68  24.86  6.116  Process var.  24.1  1.51  25.63  1.53  26.73  1.46  30.61  1.503  Table 4.2.  Estimation of boundary parameter.  Parameter Mean  s.d.  Mean  s.d.  (1.55, -0.34, 1)  (1.02, 0.47, 0.23)  (2.53, -0.34, 1)  (1.16, 0.44, 0.22)  (0.66, -0.4, 1)  (1.1, 0.42, 0.24)  (0.35, 0.12, 1)  (0.34, 0.53, 0.25)  ,  (0.11, 0.07)  (0.35, 0.09)  (0.64, 0.07)  (1.34, 0.08)  ,  (0.26, 0.08)  (1.22, 0.12)  (1.66, 0.08)  (3.06, 0.13)  3  4.35  2  1.04  2 1  Latitude 49.2  Latitude 49.2  49.4  Model M3  49.4  Model M2  0  49.0  49.0  -1  -123.0  -122.5 -122.0 Longitude  -121.5  -2 -123.0  -122.5 -122.0 Longitude  Figure 4.8 Estimated surfaces of the autoregressive process and .  -121.5  · for models  64  The parameters’ MCMC plots for model  on the original data scale are  presented in Figure 4.9. It appears that we have good convergence of the Gibbs sampler for all model parameters and the histograms of the posterior for the most of the parameters are concentrated in small intervals, except for the variance parameters,  and  ,  which have a wider spread. We have observed similar MCMC results for the other three models as well (not shown here). In Chapter 3 we found that inference about the variance parameter can be very sensitive to the choice of the prior specification. Based on this, we performed additional runs of the models using different prior specifications. However, the results were not very sensitive to these changes in priors.  65  4000  0.012  0.016  0.020  500 300 0  1  4000  250 150  -0.15  0 0.50  0.60  0.70  4000  -0.002  0.001  3  -0.150  150 0  2000 iteration  4000  22  24  26  28  30  2 z  0  5  0 50  10  150  20  250  30  iteration  -0.165  250  80 100 40 20 -0.005  -0.180  2  60  100 200 300 0 4000  2000 iteration  400  0.001 -0.002 -0.005  2000  0  1  iteration  0  -0.006  0 50  2000  -0.012 2  0 0  -0.018  iteration  100 200 300 400  0.6 0.4 0.2 0.0  4000  -0.05 0.00  iteration  2000  50  2000  0 100  0  0.0  100  0.2  0.4  300  0.6  0.8  500  0.8 0.6 0.4 0.2 0.0 0  0  2000 iteration  4000  22  24  26  28  2 o  Figure 4.9 Sequences of 5000 iterations from the posterior distributions of the model parameters and the histograms of the last 2000 iterations. 66  4.4.2  Model performance  We characterize the predictive model performance by using root mean squared predictive error (RMSPE) as in Chapter 3 and we characterize overall model performance using average RMSE. The difference between RMSPE and RMSE is that in the first case we use one-step ahead model prediction values (2.22) and in the second case we use the model estimated concentration levels. These measures answer two different questions namely: how good is the forecast and how well does the model fit the data. For estimating the transport effect again we use the ratio of the variances of the transport term to the total variance. All figures reported in this section are based on the model  .  Table 4.3 reports the models overall performance and relative importance of the transport effect. We note that the average RMSEs stay at a very similar levels for all and  models. However, RMSPEs for models  that allow the time autore-  gressive processes to vary spatially are smaller than for models and  well as that transport effect for models and  and  as  is larger than for the models  .  Table 4.3 Summary of the overall models performance and the relative importance of the transport effect. Model  RMSE [ppb]  RMSPE [ppb]  Tr. [%]  3.35  6.42  1%  3.54  6.01  4.2%  3.43  6.19  2.5%  3.34  6.44  0.4%  Next, we continue investigating the model performance on different scales. At the local scale, time point and grid cell level, our observations (Figure 4.10) indicate that higher uncertainty is associated with grid cells where no data are present. These results  67  agree with our previous findings from Chapter 3. However, when we compare the MCMC results in Figure 3.4 and Figure 4.10 (left and right panels) it seems that autocorrelations among the Gibbs samples are smaller in the real data case. Although not shown, this behavior was observed throughout most of the prediction domain. O(71,20)  O(71,20) 0.8 0.6 0.4 0.2 0.0  0  -15  -10  100  -5  200  0  300  5  400  1.0  O(71,20)  0  1000  2000 3000  4000 5000  -10  -5  0  5  10  0  5  10  iteration  15  20  25  30  25  30  lag  O(125,20)  O(125,20)  0.2 0.0  0  -20 -10  200  0  0.4  10  400  0.6  20  0.8  600  30  1.0  O(125,20)  0  1000  2000 3000 iteration  4000 5000  -20  -10  0  10  20  0  5  10  15  20  lag  Figure 4.10 Left: sequences of 5000 iterations from the posterior distributions of the randomly selected grid cells without observations O(125,20) and with observations O(71,20) ; Center: histogram of the last 2000 iterations; Right: autocorrelation function.  Figure 4.11 examines the model performance and the transport effect throughout the time period of study at a randomly selected location where data are present most of the time. That is, we investigate the model performance on domain-wide time scale and on the local scale in space. The top panel of the figure shows the observed ozone time series together with the estimated and one-step ahead predictions. We find the model in excellent agreement with the observed data and very good agreement between the pre-  68  dicted and observed values. In both cases the observed time series is within the 95% pointwise credibility interval (CI). In the bottom panel of the figure we present the time series of the raw data, wind field and transport effect. As supported by our simulation analysis, the impact of the wind field on the ozone concentration levels can be positive, negative or zero. In addition, we found the difference in time evolution between grid cells with and without observation similar to the results in Chapter 3. Since these results are  20  Ozone [ppb] 40 60  80  raw data one-step ahead forecast fitted data 95% CI  0  50  100  150  100  150  30  Day  25 10 15 20 [km/h],[%] -5  0  20  5  Ozone [ppb] 40 60  80  raw data [ppb] transport [%] wind field [km/h]  0  50 Day  Figure 4.11  Time series for cells (75) with observations (station 30).  69  similar, we do not discuss them at length here. However, it should be noted that direct comparison between the results from the simulation study and real case results are not fully correct since the two lattices are quite different in size, 153 and 40 cells respectively. The reason I have used a smaller grid in the simulation study was purely practical. A real case model run takes about 5 to 7 hours and the Gibbs’ sampler output is about 3.5 GB. In the simulation study I have to perform 32 model runs and to make the model development process feasible I chose to use a smaller grid. To investigate model performance on the domain scale both in time and space we used a sequence of model based spatial maps. Given the large number of time points over which the model was applied, we restrict our consideration to the period of 6 consecutive days, from 7 August to 12 August. This period comprises the most significant area-wide ozone episode observed in the region for the entire period of study (see Figure 4.4). Figure 4.12 displays 6 interpolated surfaces of the true process based on the model as residuals. The model has correctly reproduced the observed space-time dynamic evolution and the model estimates were in close agreement with the observed values, superimposed on the plots. The RMSEs and RMSPEs in units of ppb for the period are 5.15, 5.02, 3.34, 6.33, 4.33, 6.81 and 9.04, 7.78, 5.65, 8.88, 9.12, 13.17 respectively. The average RMSE and RMSPE for the whole period are reported in Table 4.3. It should also be noted that we obtained similar results (not shown) from the other models. However, not surprisingly the spatial maps obtained with models than the interpolated surfaces obtained with the models  and and  are smoother . The latter  models allow the time autoregressive process to vary spatially which would account for more realistic representation of the spatial variability.  70  7 Aug  20 -15.5  10  -19.4 -18.1  -14.9  0  -20.6 -16.3  49.0 -123.0  -122.5  -20 -122.0  49.3  49.4  -10 -19.3  49.2  30  Latitude  -18.3  40  50 0.  5.1 -4.7 2.9 -7.3 -0.9 5.41.1 5.7 6.2 3.7  -121.5  10  8.8  6.8 8.7  -123.0  -122.5  6.8  0  4.2 7.5  49.0  -20 -122.0  49.3  49.4  -10 9.6  49.2  10  Latitude  6.3  50 5.  1.9 -11.3 -13.1 1.5 15.75.19.2 3  -121.5  -123.0  -122.5  -20 -122.0  -121.5  12 Aug  20 10 31.8  0 -10  25.2  -20 -122.0  Longitude  -121.5  49.3  30  -2 25.3 -0.6 17.2 23.1 22.7 14.1 24.4 26.5 9.7 21.5 19.8 24.1 23.4 28  Latitude  3.3  40  49.1 49.2  40  18.2  49.4  50  50 51  25.2 7.9 3.2 21.3 36.6 25.228 12.6 20.7  20 28.1  12  10  12.3 21.9  24.3  49.3  0  27.9 35.5  -10 48.1  -123.0  40 30  49.0  49.4 49.3  Latitude  -10 17.3  11 Aug  49.1 49.2  0  15.3  Longitude  6.6  49.0  10 9.8  13.7  Longitude  -122.5  20 15.4  -1.8 -2.3 1.9  40 30  5.9  -4.7  49.1  20  4.3  -2.4  -123.0  -121.5  -0.5  49.0  49.4 49.3 49.2  40 30  -2.7  49.1  Latitude  50  -122.5  -20 -122.0  10 Aug 2.4  -123.0  -10 6.8  9 Aug  -1.7  0  9.8  Longitude  1.5  -4.1  20 9.4  4.6  Longitude  -1.8 -9.6 -11 -6.3 6.50.33.9 -1  40 30  7.9  49.1  -15.4 -16 -14.5 -24.1 -12.8 -3-13.8 21.8 -6.9  -7.1  49.0  49.4 49.3 49.2  50 -16  -17  49.1  Latitude  8 Aug  -8.6  -122.5  -20 -122.0  -121.5  Longitude  Figure 4.12 Model-estimated mean field of the true pollutant concentration as residuals. The observed values are superimposed on the plots.  Figure 4.13 displays corresponding standard deviation contour plots. The standard errors are smaller near the location of the observed data and basically trace the observed locations. These plots are not suitable for characterizing the domain variability for rea-  71  sons described in Chapter 3. Large-scale spatial variability is captured by the EOFs, and could be very different from those presented in Figure 4.13.  100  *  49.1  *  *  *  *  0  *  49.4 49.3  1 00  *  10 0  *  *  *  100  **  100  *  ** * * * *  *  -122.5  Longitude  -122.0  1 00  *  -121.5  *  * 10  *  -123.0  *  *  0  *  -123.0  -122.5  Longitude  *  * *  *  10 0  *  *  * 10  10  *  0  -123.0  0  *  49.4  ** **  49.3  *  100  49.2  **  100  * **  *  ** * * *  *  *  *  *  Longitude  *  * *  -122.0  -121.5  -123.0  -122.5  Longitude  11 Aug  20  100  *  *  Figure 4.13 locations.  -122.5  1 00  Longitude  0  *  * -123.0  49.4  Latitude  *  -122.0  49.3  10  *  20 0  49.1  *  *  *  49.2  0  49.4 49.3 49.2  *  ** **  100  49.1  Latitude  0  *  10 0  *  -121.5  20  200  10  *  -122.0  12 Aug *  *  10 0  * *  * *  *  100  * -122.5  -121.5  100  49.1  *  *  Latitude  49.4 49.3  10 0  49.1  49.2  100  -122.0  10 Aug *  100  Latitude  9 Aug *  *  0  *  ** **  49.2  *  100  49.1  *  **  *  Latitude  49.4 49.3  10 0  *  10  49.2  100  8 Aug  10  Latitude  7 Aug *  * *  *  * *  20 0  ** * * *  * *  0  *  10 0  * *  20 0  * * * *  -121.5  -123.0  -122.5  Longitude  -122.0  -121.5  Contour plots of pollutant variance field. * - denotes the monitoring sites  Figure 4.14 shows model-based interpolated transport effect maps. The domain mean transport effect is small and positive most of the time except day 3 when the effect is again small but negative. In the beginning of the ozone episode we have light easterly 72  winds with an average speed of about 3 km/h, and the smallest temperature for the ozone episode being analyzed. The ozone levels continue to increase as the temperature increases  (see  Figure  4.12)  but  the  transport  effect  -5  49.4 49.3  5 0  49.2  49.2  0  Latitude  49.3  5  10  -5 -10  49.0  49.0  -10  -122.5  -122.0  Longitude  -121.5  -123.0  -122.5  -122.0  Longitude  -121.5  49.2  0  49.1  -5  49.4  10  49.3  5 0  49.2  5  Latitude  10  -5  49.1  49.4  10 Aug  49.3  9 Aug  Latitude  constant.  49.1  49.4  10  -123.0  -10  49.0  49.0  -10  -123.0  -122.5  -122.0  Longitude  -121.5  -123.0  -122.5  -122.0  Longitude  -121.5  49.1 49.2  0 -5  49.4  5 0 -5 -10  49.0  49.0  -10  10  49.3  5  Latitude  10  49.1 49.2  49.4  12 Aug  49.3  11 Aug  Latitude  relatively  8 Aug  49.1  Latitude  7 Aug  remains  -123.0  -122.5  -122.0  Longitude  -121.5  -123.0  -122.5  -122.0  Longitude  -121.5  Figure 4.14 Model estimated mean transport contributions map. The observed wind field is superimposed on the plots.  73  At the end of the episode the winds are light again: average wind speed about 2 km/h, but this time the winds are westerly. One further remark is that the maximum domain average temperature (29.05  ) occurs on day 5 when the average wind speed is the highest for  the episode, about 6 km/h, but the maximum ozone levels were observed on day 6 when we have light winds and still high temperatures (28.8  ). This result suggests that the  highest ozone levels are not generally associated with the higher wind speeds. This is in agreement with previous publications: Gardner and Dorling, 2000. Figure 4.15 shows scatterplots of modeled and observed concentrations during the period of study at grid cell 75, obtained from various models. It can be seen that we have similar association between observed and modeled values for all models.  0 -20  -20  0  10 20 30  Model M 3  10 20 30  Model M 1  -20  -10  0  10  20  30  -20  -10  10  20  30  20  30  0 -20  -20  0  10 20 30  Model M 4  10 20 30  Model M 2  0  -20  Figure 4.15 models  -10  0  10  20  30  -20  -10  0  10  Scatterplot of observed (x-axis) and modeled (y-axis) concentrations for , , , and on the scale of residuals.  74  4.4.3  Model comparison  Our model provides an opportunity for partial comparison with the results obtained through simpler time series models that were employed in many previous ozone studies (Gardner and Dorling, 2000; McKendry, 2002; Robeson and Steyn, 1989, 1990; Pryor et. al, 1994; Pryor and Steyn, 1995). As a benchmark we used the two best models employed by Robson and Steyn (1989) namely: Temperature Persistent (TP) model and an ARIMA model. The TP model is a linear model that relates the ozone daily maximum value to the daily maximum air temperature ( ) and the previous day’s daily maximum ozone concentration. That is: (4.2)  where  is the measurement error or unexplained residual term. This model can be seen  as a special univariate case of the model (2.16) that takes into account partly the local production term (the regression part of (4.2)) and non-wind transport (autoregressive term in (4.2), persistence). As in Robson and Steyn (1989) we analyze data from two monitoring stations, namely: station 09 (T09 in Robeson and Steyn, 1989) at Rocky Point Park and station 12 (T12) at Chilliwack Airport (see Fig. 4.2). These two stations crudely represent two distinct atmospheric situations: an urban (station T09) and a rural (station T12). Model performance statistics are summarized in Table 4.4 for comparative purposes. The comparison between TP and ARIMA model performance confirm the results in Robson and Steyn (1989). However, both models perform worse than all models introduced in Chapter 3. The models  ,  ,  , and  outperform the ARIMA and  TP model by factor of 2-3. The average model RMSE of the spatiotemporal models is 3.7 ppb and the average RMSE of the univariate time series models is 8.6 ppb.  75  Table 4.4 Comparison of model performance [ppb]. Station T9  Model* (1)  (2)  (3)  Observed mean  -29.41-  Observed standard deviation  -10.8-  (4)  (5)  (6)  Predicted mean  29.6  29.9  29.7  28.9  29.5  29.46  Predicted standard deviation  13.3  7.0  7.06  8.08  9.8  7.93  RMSE  3.71  4.14  3.8  3.0  8.85  7.56  (4)  (5)  (6)  Station T12  Model (1)  (2)  (3)  Observed mean  -33.96-  Observed standard deviation  -13.4-  Predicted mean  33.3  33.5  33.52  33.05  34.0  34.01  Predicted standard deviation  9.9  10.17  10.31  11.03  13.0  10.93  RMSE  4.08  4.36  4.06  3.22  10.25  7.55  *(1) , (2) , (3) and persistence based model  , (4)  , (5) ARIMA(1,1,0), (6) Temperature  76  4.5  Discussion  In this chapter we have applied several space-time hierarchical models specified in the Chapter 3 to the LFV dataset. The hourly data, both meteorological and ozone were aggregated to daily rolling 8-hour averaged values. The meteorological data were interpolated using ordinary kriging over the entire domain of study for each particular day. However, only the daily maximum 8-hour average ozone values and corresponding meteorological fields were used in the analysis. Using 8-hour average values instead of hourly values simplifies (linearizes) the problem since we do not need to model small scale time signals (such as ozone diurnal cycle) present in the data. Based on EDA we have specified the large-scale mean process. We modeled the mean space and time trend as an additive deterministic model. The time trend component was identified as a linear model and the large-scale spatial component we specified to be proportional to the first EOF mode of wind prefiltered ozone data. For this particular case we found the first eigenfunction to be the most appropriate choice to model a complex large-scale spatial pattern. The models were able to capture a number of key ozone features such as: the strong dependence of ozone levels on temperature and persistence; the highest ozone levels are not generally associated with the strong winds; the wind’s ability to transport pollutants. Additionally, we have investigated the models performance and ozone transport effect on different scales. The results show that the different models have similar performance and that the estimated transport effect could vary considerably. As we found in Chapter 3, the transport effect strongly depends on spatial data availability as well. For example, increasing the data availability from 20% to 80% could lead to an increase of 4 times of the estimated transport effect (Table 3.12). Hence, in this particular case, LFV dataset, where the data availability is only 11% the estimated overall transport effect (1%-4%) would be very erroneous.  77  Another characteristic of the models we used in this chapter is that we can attach uncertainty to all of these predictions and estimates. This uncertainty can be, derived directly from the model posterior distributions. As in Chapter 3 the uncertainty associated with the ozone level of a grid cell strongly depends on whether the data are presented or not present within and around that particular grid cell (Fig. 4.13). The uncertainty could decreases significantly if multiple monitoring sites are presented within a grid cell. This chapter has been based on data from 2004. In order to broaden the analysis, we applied the model to data from 2005 and 2006. Model results and some summaries of the observed ozone and meteorological fields for years 2005 and 2006 are presented in Appendix B. We draw the same general conclusions we have made for 2004 as well as for 2005 and 2006. Additionally, we should point out that the model’s performance remains relatively constant through the years as well as the most of the parameter estimates but there are some parameters that can vary significantly. The interannual variability of some of the model parameters is a result of the observed changes in the meteorological fields and ozone concentration levels. Investigation of the interannual variability (ozone trend) is beyond the scope of this study.  78  CHAPTER 5 CONCLUSIONS AND FUTURE RESEARCH  In this statistical study we have highlighted the importance of simultaneouly considering different aspects of an air pollution problem as well as taking into account the physical bases that govern the processes of interest. We have developed a physically motivated stochastic model that decomposes the ground-level pollutant concentration field in three components, namely: transport, local production, and large-scale mean trend mostly dominated by the emission rates. The model is novel in the field of environmental spatial statistics and gives a different perspective on modeling in air pollution problem. The transport component is modeled based on the advection equation. It was shown that different approximation schemes lead to different spatial autoregressive models. In our analysis we have used a lagged nearest-neighbor structure to model spatial dependency caused by the wind field. This is one of many possible parameterizations that could be investigated including simultaneous or staggered schemes. In Chapter 2 we showed that in modeling daily maximum 8-hour ozone data the simultaneous scheme is more appropriate. We have ignored the downward transport effect since: our dataset does not provide vertical wind fields and second intrusions of stratospheric ozone that reach the surface are rare events, though is possible that surface concentrations can be affected by mixing air between the surface layer and the free troposphere above. It is important to emphasize two factors when modeling the transport of an air pollutant: wind field and the pollutant gradients should be considered and they should be considered concurrently. Excluding small-scale phenomena such as diffusion, means it is not possible to have transport if there are no gradients in the pollutant field and/or there is  79  no wind. This fact is a consequence of the advection equation and has been employed by the proposed models. The local production component is implemented as a vector regression model with fixed in time and space coefficients. Adopting such an approach allows us to ignore contributions from the neighboring sites and takes into account only specific local meteorological conditions. This puts physical/chemical meaning in our regression coefficients, which is a desirable feature of a local production model since physics and chemistry do not depend on location. Therefore, the local production spatial structure is not modeled directly, but is inherent in the covariates field. The meteorological fields used in this study were obtained through a kriging method that interpolated meteorological data over the grid. That is, the meteorological variables, which are required for the model as input, are not modeled dynamically. This together with the poor representativeness of the meteorological data could lead to poorer model performance. In forecasting mode the model requires dynamic meteorological fields as starting input in order to predict ozone concentrations. Alternatively, the meteorological fields can be obtained from weather forecast data. The large-scale mean process is specified based on the results obtained from EDA. In EDA we have used wind prefiltered ozone data to determine the first EOF mode that is used as large-scale spatial trend mostly dominated by the emissions rates. This is a new approach that allows us to avoid using emissions inventory data. However, the approach is only applicable if rich datasets are available. Further study is needed to improve the methods for estimation of the emission rates maps as well as to investigate the dependence of ozone transport on the details of wind fields. The proposed models were able to capture a number of key features of ozone behavior and show excellent agreement with the observed data. The models outperformed the considerably simpler univariate time series models by a factor of 2 to 3. Additionally, the Bayesian framework allowed us to easily incorporate the scientific knowledge and to  80  estimate grid-cell average values, conditioned on point-referenced observed data. recently, Berliner (2003) emphasized that stochastic and deterministic models are not separate items, but should be considered ends of a continuum. In this work we have demonstrated this by providing a stochastic model that captures deterministic elements of the problem. More specific discussions and conclusions are provided in the discussion subsections of Chapter 3 and 4. Finally, more or less any aspect of this approach can be further developed. This includes: further exploring ground-level ozone modeling on point-level spatial resolution that takes into account pollutant transport; considering simultaneous nearest neighbor structure or other approximation schemes for modeling the transport component; improving the methods for estimation of the emission rates; coupling the proposed model with the output of a Numerical Weather Prediction (NWP) system; improving the way relative importance of the transport effect is calculated.  81  REFERENCES Ainslie B. and Steyn D.G., 2007. Spatio-Temporal Trends in Episodic Ozone Pollution in the Lower Fraser Valley, B.C. in Relation to Meso-scale Atmospheric Circulation Patterns and Emissions. In Press. Journal of Applied Meteorology. (October, 2006). Banerjee S., Carlin B.P., and Gelfand A.E., 2004. Hierarchical Modeling and Analysis for Spatial Data. Chapman and Hall/CRC, Boca Raton, FL. Bennett R.J., 1979, Spatial time series: Analysis-forecasting-control, Pion, London, 674p. Berliner L.M., 2003. Physical-statistical modeling in geophysics, Journal of Geophysical Research, vol. 108, no. d24, 8776, doi:10.1029/2002jd002865.  Besag J., 1974, Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society B, 36:192-225. Box G. and Jenkins G., 1976. Time series analysis forecasting and control, Holden-Day, Inc. Breiman L., Friedman J.H., Olshen R.A. and Stone C.J., 1984. Classification and Regression Trees. Wadsworth and Brooks/Cole, Monterey. Bretthorst L., 1988. Bayesian Spectral Analysis and Estimation. New York, Springer. Burrows W.R., Benjamin M., Beauchamp S., Lord E.R., McCollor D. and Thomson B., 1995. CART decision tree statistical analysis and prediction of summer season maximum surface ozone for the Vancouver, Montreal, and Atlantic Regions of Canada. Journal of Applied Meteorology 34, 1848-1862. Byun, D., Young, J., Gipson, J., Godowitch, J., Binkowski, F., Roselle, S., Benjey, B., Pleim, J., Ching, J., Novak, J., Coats, C., Odman, T., Hanna, A., Alapaty, A., Mathur, R., McHenry, J., Sankar, U., Fine, S., Xiu, A. and Jang, C., 1998, Description of the Models3 Community Multiscale Air Quality (CMAQ) model. Proceedings of the American Meteorological Society 78th Annual Meeting, January 11–16, Phoenix, AZ. Carroll R.J., Chen R., George E.I., Li T.H., Newton H.J., Schmiediche H., Wang N., 1997. Ozone exposure and population density in Harris county, Texas. Journal of the American Statistical Association 92: 392–415.  82  Carter C.K. and Kohn R., 1994. On gibbs sampling for state space models. Biometrika, 81(3). Cobourn W.G. and Hubbard M.C., 1999. An enhanced ozone forecasting model using air mass trajectory analysis, Atmospheric Environment, Vol. 33, pp. 4663–4674. Cox W. and Chu S., 1996. Assessment of interannual ozone variation in urban areas from a climatological perspective. Atmospheric Environment, Vol. 30, p. 2615-2625. Cressie N. and Huang H-C., 1999. Classes of nonseparable, spatio-temporal stationary covariance functions. Journal of the American Statistical Association 94: 1330–1340. Cressie N., 1993. Statistics for Spatial Data, John Wiley & Sons, INC. Christakos G., 1991. On certain classes of spatiotemporal random fields with applications to space - time data processing. IEEE Transactions on Systems, Man, and Cybernetics, 21, 861 - 875. Fuentes M., Chen L., and Davis J.M., 2008. A class of nonseparable and nonstationary spatial temporal covariance functions. Environmetrics 2008; 19: 487–507 Fuentes M., Guttorp P., and Sampson P., 2006. Using transforms to analyze space-time processes. Chapter in Finkelsted et al.(eds), pp. 77-150.CRC/Chapman Hall.  Gardner M.W., Dorling S.R., 2000. Statistical surface ozone models: an improved methodology to account for non-linear behavior, Atmospheric Environment, Vol. 34, p. 2134. Gelman A. and Rubin D.B., 1992. Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-511. Gilks W.R., Richardson S., and Spiegelhalter D.J., 1996. Markov Chain Monte Carlo in Practice. Chapman and Hall/CRC, Boca Raton, FL. Gregory P.C., 2005. Bayesian Logical Data Analysis for Physical Sciences: A Comparative Approach with Mathematica Support. Cambridge University Press, 589 pp. Gneiting T., 2002. Nonseparable covariance function for space-time data. Journal of the American Statistical Association 97: 590–600. Gumbel E.J., 1958. Statistics and Extremes, Columbia University Press, New York.  83  Guttorp P., Meiring W., and Sampson P.D., 1994. A space–time analysis of ground-level ozone data. Environmetrics 5: 241–254. Government of British Columbia, Ministry of environment, 2005. Lower Fraser Valley Ambient Air Quality Report. http://wlapwww.gov.bc.ca/sry/p2/air_quality/index.html Hass H., 1991. Description of the EURAD chemistry transport module (CTM) version 2. In: Ebel, A., Neubauer, F.M., Speth, P. (Eds.), Report 83. Institute of Geophysics and Meteorology, University of Cologne, Cologne, Germany. Holmes N.S. and Morawska L., 2006. A review of dispersion modeling and its application to the dispersion of particles: An overview of different dispersion models available. Atmospheric Environment, Volume 40, Issue 30, p. 5902-5928. Hsieh W.W., 2004. Nonlinear multivariate and time series analysis by neural network methods. Reviews of Geophysics, 42, RG1003, doi:10.1029/2002RG000112 Huang H-C. and Hsu N-J., 2004, Modeling transport effects on ground-level ozone using a non-stationary space–time model, Environmetrics, 15: 251–268 (DOI: 10.1002/env. 639). Huerta G., Sanso B., and Stroud J.R., 2004. A spatiotemporal model for Mexico City ozone levels, Appl. Statist., 53, Part 2, pp. 231–248. Jimenez P., Baldasano, J. M., Dabdub D., 2003. Comparison of photochemical mechanisms for air quality modeling. Atmospheric Environment. Vol.37: p.4179-4194. Joe H., Steyn D.G. and Susko E., 1996. Analysis of trends in tropospheric ozone in the Lower Fraser Valley, British Columbia. Atmospheric Environment, Vol. 30, No. 20, p. 3413-3423. Kalnay, E. (2003), Atmospheric modeling, data assimilation and predictability, Cambridge: Cambridge University Press Karl T.R., Koscielny A.J., and Diaz H.F., 1982. Potential Errors in the Application of Principal Component (Eigenvector) Analysis to Geophysical Data, Journal of Applied Meteorology Volume 21, Issue 8. Le N.D. and Zidek J.V., 2006. Statistical analysis of environmental space-time processes. New York: Springer.  84  Le N.D., Li S., Zidek J. V., 2001. Spatial Prediction and Temporal Backcasting for Environmental Fields Having Monotone Data Patterns, The Canadian Journal of Statistics / La Revue Canadienne de Statistique, Vol. 29, No. 4., pp. 529-554. Lindley D.V. and Smith A.F., 1972. Bayes estimates for the linear model. Journal of the Royal Statistical Society B, 34A. Lorenz E.N., 1963. Deterministic nonperiodic flow, Journal of Atmospheric Sciensies, Vol. 20. Lorenz E. N., 1956. Empirical orthogonal functions and statistical weather prediction. Sci. Rep. No. 1, Statistical Forecasting Project, M.I.T., Cambridge, MA, 48 pp. McKendry I.G., 2002. Evaluation of Artificial Neural Networks for fine particulate pollution (PM10 and PM2.5) forecasting, J Air & Waste Manage Assoc 52, 1096–1101. McMillan N., Bortnick S.M., Irwin M.E., Berliner L.M., 2005. A hierarchical Bayesian model to estimate and forecast ozone through space and time. Atmospheric Environment 39, pp. 1373–1382. McPhaden M. J. and Hayes S.P., 1991. On the Variability of Winds, Sea Surface Temperature, and Surface Layer Heat Content in the Western Equatorial Pacific. Journal of Geophysical Research, 96, supplement, 3331-3342. NRC (National Research Council), 1991, Rethinking the Ozone Problem in Urban and Regional Air Pollution. National Academy Press, Washington. Odman M.T., Ingram C.L., 1996. Mulitscale Air Quality Simulation Platform (MAQSIP). Source Code Documentation and Validation. MCNC Report No. ENV-96TR002-v 1.0. MCNC, Research Triangle Park, North Carolina. Понтрягин Л.С., Андронов А.А., Витт А.А., 1933. О статистическом рассмотрении динамических систем. — Журнал Экспериментальной и Теоретической Физики, т. 3, вып. 3, с. 165—180. Андронов А. А. Собр. трудов, с. 142—160. Pryor S.C., McKendry I.G., and Steyn D.G., 1994. Synoptic-scale meteorological variability and surface ozone concentrations in Vancouver, BC. Journal of Applied Meteorology, Vol. 34, No. 8, p. 1824-1833. Pryor S.C. and Steyn D.G., 1995. Hebdomadal and diurnal cycles in ozone time series from the Lower Fraser Valley, BC. Atmospheric Environment, Vol. 29, No. 9, p. 10071019.  85  Randall D.A., 2006. Reynolds Averaging. http://kiwi.atmos.colostate.edu/group/ dave/pdf/Reynolds_Averaging.pdf Rao S.T., Zurbenko I.G., 1994. Detecting and Tracking Changes in Ozone Air Quality. J Air & Waste Manage Assoc 44, 1089–1092. Riccio A., Barone G., Chianese E., and Giunta G., 2006, A hierarchical Bayesian approach to the spatio-temporal modeling of air quality data. Atmospheric Environment, Vol. 40, 554–566. Robeson, S.M. and Steyn D.G., 1989. A Conditional Probability Density Function for Ozone Air Quality Data," Atmospheric Environment, 23 (3), p. 689-692. Robeson S.M. and Steyn D.G., 1990. Evaluation of Statistical Forecast Models for Daily Maximum Ozone Concentrations. Urban Atmosphere, 24b, p. 303-312. Royle, J. A., Berliner, L.M., Wikle C. K., and Milliff R., 1998. Ahierarchical spatial model for constructing wind fields from scatterometer data in the Labrador Sea. Case Studies in Bayesian Statistics, edited by C. Gatsonis et al., pp. 367– 382, Springer-Verlag, New York. Royle, J. A. and Berliner, L.M., 1999. A Hierarchical Approach to Multivariate Spatial Modeling and Prediction. Journal of Agricultural, Biological, and Environmental Statistics. Vol. 4, No. 1, pp. 29–56. Russell A. and Dennis R., 2000. NARSTO critical review of photochemical models and modeling. Atmospheric Environment, 2000, Vol.34, p. 2283-2324. Schabenberg O. and Gotway C.A., 2005. Statistical Methods for Spatial Data Analysis. Chapman and Hall/CRC, Boca Raton, FL. Sahu S.K., Gelfand A.E., and Holland D.M., 2007. High-Resolution Space–Time Ozone Modeling for Assessing Trends. Journal of the American Statistical Association December 2007, Vol. 102, No. 480 Seinfeld J.H., Pandis S.N., 1998, Atmospheric chemistry and physics, John Wiley & Sons, INC. Shigeyuki O., Masa-aki S., Ichiro T., Morito M., Ken-ichi M., and Shin I., 2003. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics, 19(16):2088-2096. Smith R.L. and Huang L., 1993. Modeling high threshold exceedances of urban ozone. 86  National Institute for Statistical Science Technical Report #6. Simard Y. and Marcotte D., 1993. Assessing similarities and differences among maps: A study of temporal changes in distribution of northern shrimp (pandalus borealis) in the gulf of St. Lawrence, in A. Soares, ed., Geostatistics Troia ’92, Vol. 2: Kluwer Academic Publ., Dordrecht, p. 865–874. Simpson D., 1995. Biogenic emissions in Europe 2: Implications for ozone control strategies. Journal of Geophysical Research 100, 22891-22906. Stroud J. R., Muller P., and Sanso B., 200. Dynamic models for spatiotemporal data. Journal of Royal Statistical Society, Series B, 63, 673- 689. Stull B.R., 1988. An Introduction to Boundary Layer Meteorology. Kluwer Academic Publishers. Thompson M.L., Reynolds J., Cox L.H., Guttorp P. and Sampson P.D., 2001. A review of statistical methods for the meteorological adjustment of tropospheric ozone. Atmospheric Environment 35, pp. 617–630. Tobías A. and Scotto M.G., 2005. Prediction of extreme ozone levels in Barcelona, Spain, Environmental Monitoring and Assessment , Volume 100, Numbers 1-3. Vingarzan R. and Taylor B., 2003. Trend analysis of ground level ozone in the greater Vancouver/Fraser Valley area of British Columbia, Atmospheric Environment, Vol. 37, p. 2159–2171. von Storch H. and Zwiers F., 2000. Statistical Analysis in Climate Research, Cambridge University Press, 476 pp. West M. and Harrison P. J., 1997. Bayesian Forecasting and Dynamic Models. Springer: New York, 2nd ed. Wikle C.K. (1996), Spatao-temporal statistical models with application to atmospheric processes, Ph.D. dissertation, Iowa State University, Ames, IA. Wikle C.K., Berliner M. and Cressie N., 1998, Hierarchical Bayesian space-time models. J. Environ. Ecol. Statist., 5, 117–154. Wikle C.K., Cressie N., 1999. A dimension reduction approach to space-time Kalman filtering. Biometrika 86, 815–829. Whittle P., 1954. On stationary processes in the plan. Biometrica, 41:434-449 87  Yin D., Jiang W., Roth H., and Giroux E., 2004. Improvement of biogenic emissions estimation in the Canadian Lower Fraser Valley and its impact on particulate matter modeling results. Atmospheric Environment, Vol.38(4), p. 507-521  88  APPENDIX A Full conditional distributions for Gibbs sampling  The Gibbs sampling is the simplest and the most easily implemented algorithm for doing MCMC computer simulations. To implement a Gibbs sampler we need to know the full conditional distributions of all unknowns. However, it turns out that we only need to know the kernel part of these distributions and do not need to know the normalizing constants which often does not have analytical expression. That is why the full conditional distributions derived here are only known up to a constant factor. ’s full conditional distribution follows from (2.1) and (3.2): |·  | ,  ,  , , ,  ,  ,  ,  ,  , , ,  ,  ,  ,  ,  exp ∑ ∑  x  ∑ ∑  x exp  ∑  2  ∑  2, … ,  1. Thus |· ~ Λ  ,  1  1  1  2  2  2  where  and 89  1  1  2  2  1  ∑4  1  1  ∑  We need to “initialize” the time series. The full conditional distribution of |·  | ,  ,  , ,  ,  ,  , , ,  ,  ,  is given by ,  ,  exp ∑ ∑  x  ∑ ∑  x exp ∑  2  ∑  Thus |· ~ Λ  1  1  2  2  ,  where 1  1  and ∑  ∑  Similar for |·  | ,  ,  , , ,  ,  ,  ,  ,  exp ∑ x  ∑  90  exp ∑  2  Thus |· ~  ,  where  1  1  2  2  and 1  1  ’s full conditional distribution follows from (3.2) and (3.4): ·  ,  , , ,  ,  ,  ,  ,  exp ∑ ∑  x exp  2  1 2  1  2  1  ∑  Thus |· ~  ,  where  1  1  2  2  and ∑  1, … , 4  91  ’s full conditional distribution follows from (3.4): |·  ,  ,  ,  ,  ,  ,  ,  exp 1  1  1  1  exp  2  1 2  2  2 2  2  1  1  Thus |· ~  , 1  1  2  2  where 2 2 2  and 1 1  1, … , 4; For  1  t = 2, … T-1  1 we have |·  ,  ,  ,  ,  ,  exp 1  1  2  exp  1  1  2  1 1  1 2  2  1  2 0  0  2 2  1  Thus  92  |· ~  , 1  1  2  2  where 2 2 2  0  and 1 2  1, … , 4 Similar for |·  ,  ,  ,  ,  exp 1  1  2  exp  1  Thus |· ~  ,  where  1  1  2  2  and 1 1  1, … , 4 · ’s  full conditional distribution follows from (3.4) and (3.6): T  |·  ,  ,  ,  ,  93  ∑  exp Thus  |· ~  1,  1  1  2  2  ∑  2  and ∑  1, … , 4 T  |·  ,  ,  ,  ,  ∑  exp Thus  |· ~  2,  1  1  2  2  ∑  2  1  1  and ∑  1, … , 4 |·  ,  ,  exp Thus  |· ~  1  1  1  2  2  2  0  1 0  2  0  1  ,  1  1  2  2  0  94  1, … , 4 ’s full conditional distribution follows from (2.20) and (3.2): |·  , ,  | ,  ,  ,  , , ,  ,  ,  ,  ,  exp ∑  ∑  ∑  ∑ ∑  x ∑  exp  ∑  2  ∑  ∑  Thus  |· ~  ,  1  1  2  2  ∑  1  and ∑  ∑  ∑  ’s full conditional distribution follows from (2.20) and (3.2): |·  , , ,  | ,  ,  ,  ,  ,  exp ∑  ∑ x  ∑  95  ∑  exp  ∑  2 ∑  Thus  |· ~  ,  1  1  2  2  ∑  1  2  1  and ∑  ∑  ’s full conditional distribution follows from (2.20) and (3.2): |·  , ,  | ,  , ,  , , ,  ,  , ,  ,  exp ∑  ∑  ∑  ∑ ∑  x ∑,  exp  ∑,  ∑,  ∑  2  ∑  ∑  | · we use design matrixes  in the above presentation of and  rametric matrixes  ∑  ;  , i.e.  ,  ;  and  instead of the pa-  ,  .  Thus  |· ~ 1  1  2  2  ∑4,j  1 1  1  2  ∑  , 2  1  ∑4,j  1  ∑4,j  96  and ∑  χ  ∑  ∑  ’s full conditional distribution follows from (2.1) and (2.20): |·  |  ,  | ,  ,  exp  exp  ∑  ∑  exp  Thus |·  1 2  ,  2  ’s full conditional distribution follows from (2.1) and (2.20):  |·  |  ,  , ,  , ,  , , ,  ,  , ,  ,  exp x  ∑  exp ∑  ∑ ∑ ∑  x  Thus |·  ∑  , ∑  ∑ ∑  x  ∑  97  ’s full conditional distribution follows from (3.4) and (3.5): |·  |  ∏  ,  ∏  , exp  ∑  x  ∑  exp  ∑  ∑ ∑  exp  ∑  Thus ∑  |·  1 2  ,  2  ’s full conditional distribution follows from (3.4) and (3.5): |·  |  ,  ∏  |  ,  ,  ,  exp x  ∑  exp  ∑  exp x  Thus |·  ,  ∑ 1, … ,4  98  ’s full conditional distribution follows from (3.4) and (3.5): |·  |  ,  |  , exp  x  exp exp  Thus |·  2  ,  ;  1, … ,4  99  APPENDIX B Extension to three of data This section provides model results and summaries of the observed ozone and meteorological fields for the years 2004, 2005 and 2006. In Chapter 4 we presented model results for the year 2004 only. However, we broaden the analysis and apply the model for two more years: 2005 and 2006. The extended period includes one year of high (2006), one of low (2005) and one of intermediate (2004) ozone concentrations, as summarized in Table B.2. Tables B.1 provide summary statistics for meteorological conditions during ozone season in all three years. Tables B.3 to B.6 provide summaries of model performance and parameters in all three years, following from Table 4.1 to 4.4.  Table B.1 Meteorological data summaries for three years studied 2004 2005 Data summaries Mean s.d. Mean s.d.  2006 Mean  s.d.  Temperature  19.87  0.8  19.36  0.8  19.56  0.9  Relative Humidity  58.5  3.1  57.73  3.7  53.48  3.8  Solar Radiation  407.68  25.3  387.34  24.1  416.52  24.9  Wind Speed East-West  5.37  3.3  5.28  3.4  5.88  3.1  Wind Speed North-South  2.21  2.1  2.18  1.8  2.14  2.0  100  Table B.2 Ozone summaries for three years studied Percentiles  2004  2005  2006  50%  29.7  27.4  29.7  75%  36.6  33.8  36.5  90%  42.8  39.8  43.5  95%  50.7  44.1  48.1  97.5%  57.4  49.5  54.8  parameter and performance statistics summaries for  Table B.3 Model three years studied. 2004  2005  2006  Parameter Mean  s.d.  Mean  s.d.  Mean  s.d.  0.429  0.03  0.391  0.03  0.145  0.03  -0.135  0.01  -0.121  0.001  -0.097  0.01  0.001  0.001  0.001  0.001  0.007  0.001  0.003  0.001  0.006  0.001  0.007  0.001  0.001  0.001  0.003  0.001  0.001  0.01  0.54  0.02  0.55  0.01  0.619  0.02  33.33  1.33  25.92  1.024  45.15  1.47  24.1  1.46  17.93  1.063  13.16  1.04  Model  2004  2005  2006  RMSE  4.26  3.802  5.458  RMSPE  6.58  5.79  6.633  Tr.  1%  1%  1%  101  Table B.4 Model three years studied.  parameter and performance statistics summaries for 2004  2005  2006  Parameter Mean  s.d.  Mean  s.d.  Mean  s.d.  0.462  0.027  0.388  0.035  0.179  0.033  -0.141  0.005  -0.124  0.006  -0.107  0.006  -0.002  0.001  0.001  0.001  0.007  0.001  0.017  0.002  -0.001  0.004  0.008  0.002  0.011  0.002  0.001  0.008  0.014  0.003  35.206  3.795  27.604  3.445  47.682  2.826  18.43  1.118  16.567  1.211  10.255  0.946  Model performance  2004  2005  2006  RMSE  4.48  3.91  5.73  RMSPE  6.32  5.75  6.68  Tr.  8%  1.5%  2.1%  102  Table B.5 Model years studied. Parameter  parameter and performance statistics summaries for three 2004  2005  2006  Mean  s.d.  Mean  s.d.  Mean  s.d.  0.463  0.03  0.571  0.03  0.191  0.03  -0.143  0.001  -0.167  0.01  -0.107  0.01  0.001  0.001  -0.002  0.001  0.006  0.001  0.009  0.001  0.011  0.001  0.005  0.001  0.004  0.001  0.01  0.001  0.009  0.001  33.70  6.51  25.10  8.907  46.55  6.378  22.56  1.09  28.46  1.597  10.84  0.897  (0.02,0.9,0.9) (.03, .06, .18) ( .02, .93,1.0) ( .01, .03, .2) (.10, .741, .9) (.13, .31, .18) (-0.01, 0.9, 1) (.001,.02,.18) (.01, .98, 1.0) (.01, .02, .2) (0.01,1.0, 0.9) (.01, .02, .18) ,  (0.001, 0.07) (0.03, 0.08) (0.007, 0.06) (0.03, 0.09) (0.001, 0.07) (0.02, 0.08)  ,  (0.003, 0.067) (0.1, 0.09) 10.91  3.53  (0.004, 0.06)  (0.1, 0.08) (0.002, 0.068) (0.07, 0.09)  2.97  2.3  2.10  Model performance  2004  2005  2006  RMSE  4.29  3.36  5.68  RMSPE  6.19  6.27  6.69  Tr.  3.3%  5.3%  2%  2.38  103  Table B.6 Model years studied. Parameter  parameter and performance statistics summaries for three 2004  2005  2006  Mean  s.d.  Mean  s.d.  Mean  s.d.  0.585  0.034  0.379  0.03  0.159  0.02  -0.166  0.006  -0.12  0.001  -0.1  0.001  -0.003  0.001  0.001  0.001  0.007  0.001  -0.004  0.003  0.001  0.001  0.004  0.001  0.001  0.005  0.001  0.001  0.002  0.001  0.479  0.016  0.546  0.019  0.61  0.014  24.86  6.116  25.63  1.61  44.54  2.073  30.61  1.503  18.89  1.02  14.25  0.925  (2.53,-0.34, 1) (1.16, .44,.22) (.012,.96,.9) (.01,.02,.18) (0.01,0.96, 1) (.01, .02, .18) (0.35, 0.12, 1) (.34, .53, .25) (.001,.98,.99) (0, 0.02,0.19) (0.001,0.99,1) (.001,.01,.18) ,  (0.64, 0.07)  (1.34, 0.08) (0.001, 0.07) (0.02, 0.08) (0.001, 0.07) (0.02, 0.07)  ,  (1.66, 0.08)  (3.06, 0.13) (0.002, 0.067) (0.06, 0.08) (0.002, 0.066) (0.06, 0.07)  2  1.04  2.625  1.89  5.42  Model performance  2004  2005  2006  RMSE  3.34  3.75  5.387  RMSPE  6.44  5.79  6.634  Tr.  0.4  1%  %1  2.02  104  We draw the same general conclusions from model runs in 2005 and 2006 as we deid for 2004. Additionally, we should point out that the model’s performance remains relatively constant through the years as well as the most of the parameter estimates but there are some parameters that can vary significantly. The interannual variability of some of the model parameters is a result of the observed differences in the meteorological fields and ozone concentration levels. Investigation of the interannual variability (long term ozone trend) is beyond the scope of this study.  105  

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            data-media="{[{embed.selectedMedia}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0052379/manifest

Comment

Related Items