SPATIAL A N D T E M P O R A L ANALYSIS OF A M B I E N T H O U R L Y PMlO IN V A N C O U V E R by K A T H Y H . Q. L I B.Sc.(Mathematics), Shandong University, China, 1984 M.Sc.(Mathematics), Shandong University, China, 1987 Ph.D.(Oceanography), Inst, of Oceanology, Chinese Academy of Science, 1990 A T H E S I S S U B M I T T E D I N P A R T I A L F U L F I L L M E N T O F T H E R E Q U I R E M E N T S F O R T H E D E G R E E O F M A S T E R O F S C I E N C E in T H E F A C U L T Y O F G R A D U A T E S T U D I E S D E P A R T M E N T O F S T A T I S T I C S We accept this thesis as conforming to the required standard T H E U N I V E R S I T Y . O F B R I T I S H C O L U M B I A August, 1998 © H u i q i n g L i , 1998 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of The University of British Columbia Vancouver, Canada Date t\^^ I If- . DE-6 (2/88) ABSTRACT Fine particulates (PMlO, PM2.5) have attracted many studies from a variety of disciplines. Because of the composition of higher proportion of various toxic metals and acidic sulfur species, fine particulate pollution is of specific concern to public health. They can contribute to the development of multiple adverse health effects. Many communities have put the particulate monitoring, evaluation and emission reduction as the prior issues in their air quality management plans. A current three-year Harvard-UBC study supported by the Environmental Protec-tion Agency of the United States focuses on the development of new statistical methods and models for estimating population-level exposures to PMlO and PM2.5 in the greater Boston and Vancouver areas. That study will among other things require the spatial in-terpolation of Vancouver PMlO field, for example, down to the Census Tract levels. This thesis reports preliminary findings about the spatial and temporal variations of ambient hourly PMlO collected in the Greater Vancouver Regional District and nearby Fraser Val-ley. The spatial interpolator to be used will require independent response over time. To ensure such as assumption leads to the removal of temporal autocorrelation and whiten residuals for interpolation. The results from descriptive analysis of the raw data exhibit obvious consistency of monthly, weekly, week-day and day-hour patterns from site-to-site. In the study region motor vehicle emissions are the major source of PMlO pollution and the week-day and day-hour effects capture the traffic pattern. Poor air quality episodes at some stations are associated with anomalous weather conditions, such as coastal fogs, stagnant air conditions. Our space-time model for PMlO includes a deterministic trend and stochastic resid-ual. We introduce a spatially homogeneous trend that comprise the above-mentioned site-to-site consistent temporal effects. For the detrended residuals, we find that the ii ARMA(2,1) or AR(3) model seems to capture the short-term autocorrelation structure satisfactorily. The coherent and slight spatial variation of the coefficients of ARMA(2,1) or AR(3) models suggests us to employ a single ARMA(2,1) or AR(3) model for all sites to eliminate the time-direction correlation and get a parsimonious description of our data. There remains some daily autocorrelation (lag-24 hours) in ARMA(2,1) or AR(3) residuals which can not be described by linear time series models. Nevertheless this corre-lation is not more than 0.06, and does not seem to cause concern in spatial interpolation of these residuals. It is suggested to incorporate the meteorological variables into models in future studies that might explain this daily serial correlation. It is noteworthy that there is little spatial correlation in station residuals after removing the trend and the same temporal autocorrelation structure represented by either a single ARMA(2,1) or AR(3) model. This low inter-site correlation may suggest that any given site in the study region picks up most of the variations of PMlO and station residuals contain little useful additional information for the construction of PMlO field. At the same time this analysis may suggest that the current network (with a scale of 7km) may not have sufficiently high spatial resolution to reflect the local important spatial gradients of PMlO. For certain purposes, a finer monitoring network (down to the scale, say 2km) may be required to determine the fine spatial structure of PMlO. iii T A B L E OF CONTENTS Abstract ii Table of Contents iv List of Tables vi List of Figures viii Acknowledgements xi 1 Introduction 1 2 Preliminary Analysis of Data Set 6 2.1 Data Set 7 2.2 Measures of Air Quality 9 2.3 Long-term and Short-term Temporal Patterns of PMlO 12 3 Space-Time Models for PMlO 15 3.1 A Log Transformation 16 3.2 Model 18 3.3 Trend and Its Estimation 18 3.4 Temporal Correlation in Detrended Residuals 22 3.4.1 Site-specific Time Series Model 24 3.4.2 A Single ARMA(2,1) Model 35 4 Some Spatial Analyses Related to PMlO 38 iv 4.1 Cluster Analysis 39 4.2 Inter-site Correlation 42 5 Discussion 48 Bibliography 51 v LIST OF TABLES 2.1 P M l O Monitor ing Sites in Vancouver : 8 2.2 Percentage of Missing Da ta at Stations and Hours 8 2.3 Percentages of "0", "< 5","> 30", "> 50" in the Da ta 10 2.4 Exceedances of 24-hour A i r Quali ty Objective 11 3.1 Standard Deviations at Stations and Hours (Original Data) 17 3.2 G L S Estimates of Trend Parameters 21 3.3 Correlations between Successive Hours (Detrended Residuals) 23 3.4 Estimates of AR(3 ) Coefficients at A l l Stations. . 24 3.5 Orders of A R Model Selected by A I C Criterion 28 3.6 B I C for Different Order A R Processes. 29 3.7 Estimates of A R M A ( 2 , 1 ) Coefficients at A l l Stations 31 3.8 Portmanteau Statistics for the A R M A ( 2 , 1 ) Model , K = 1 5 34 3.9 Portmanteau Statistics for the A R ( 3 ) Model , K = 1 5 35 3.10 Means and Standard Deviations for Log-transformed, Detrended, the Single AR(3 ) and A R M A ( 2 , 1 ) Residual Series 37 4.1 Between-site Geographical Distance 42 4.2 Between-site Correlation of Log-transformed P M l O 43 4.3 Between-site Correlation of Residuals after Site-consistent v i A R M A ( 2 , 1 ) Model 44 4.4 Between-site Correlation of Residuals after Site-specific A R M A ( 2 , 1 ) Model 45 4.5 Between-site Correlation of Residuals after Site-consistent AR(3 ) Model 45 4.6 Between-site Correlation of Residuals after Site-specific AR(3 ) Mode l 46 v i i LIST OF FIGURES 2.1 Geographical Locations of P M l O Moni tor ing Sites in G V R D 56 2.2 Exceedance Days of 24-hour A i r Quali ty Objective at Ki ts i lano and Richmond 57 2.3 Exceedance Days of 24-hour A i r Qual i ty Objective at Abbotsford and Chil l iwack 58 2.4 Monthly Effects of P M l O , Year 1996 59 2.5 Weekly Effects of P M l O , Year 1996 60 2.6 Week-day Effects of P M l O , Year 1996 61 2.7 Day-hour Effects of P M l O , Year 1996 62 2.8 Monthly and Weekly Effects of P M l O , Years 1995 and 1994 63 2.9 Week-day and Day-hour Effects of P M l O , Years 1995 and 1994 64 3.1 Histograms of P M l O at Original , Square-root and Log Transformed Scales, Stations 1-5 65 3.2 Histograms of P M l O at Original , Square-root and Log Transformed Scales, Stations 6-10 66 3.3 Time Series Plots of P M l O , Year 1996 67 3.4 T ime Series Plots of P M l O , Year 1995 68 3.5 T ime Series Plots of P M l O , Year 1994 69 3.6 Autocorrelation and Par t ia l Autocorrelation Coefficients of v i i i Detrended Residuals, Stations 1-5 70 3.7 Autocorrelation and Partial Autocorrelation Coefficients of Detrended Residuals, Stations 6-10 71 3.8 Autocorrelation and Partial Autocorrelation Coefficients of Site-specific AR(3) Residuals, Stations 1-5 72 3.9 Autocorrelation and Partial Autocorrelation Coefficients of Site-specific AR(3) Residuals, Stations 6-10 73 3.10 Autocorrelation and Partial Autocorrelation Coefficients of Site-specific ARMA(2,1) Residuals, Stations 1-5 .74 3.11 Autocorrelation and Partial Autocorrelation Coefficients of Site-specific ARMA(2,1) Residuals, Stations 6-10 75 3.12 Autocorrelation and Partial Autocorrelation Coefficients of ARMA(2,1) Residuals for Seasonally Differenced Series Stations 1-5 76 3.13 Autocorrelation and Partial Autocorrelation Coefficients of ARMA(2,1) Residuals for Simply Differenced Series Stations 1-5 77 3.14 Autocorrelation and Partial Autocorrelation Coefficients of Site-consistent AR(3) Residuals, Stations 1-5 78 3.15 Autocorrelation and Partial Autocorrelation Coefficients of Site-consistent AR(3) Residuals, Stations 6-10 79 3.16 Autocorrelation and Partial Autocorrelation Coefficients of Site-consistent ARMA(2,1) Residuals, Stations 1-5 80 3.17 Autocorrelation and Partial Autocorrelation Coefficients of ix Site-consistent A R M A ( 2 , 1 ) Residuals, Stations 6-10 81 3.18 T ime Series Plots of Log(PMlO) , Year 1996 82 3.19 T ime Series Plots of Detrended Residuals, Year 1996 83 3.20 Time Series Plots of Single A R M A ( 2 , 1 ) Residuals, Year 1996 84 3.21 Time Series Plots of Single AR(3 ) Residuals, Year 1996 85 4.1 Group Structure among the Monitor ing Stations 86 4.2 Between-site Correlations vs Geographical Distances 87 x A C K N O W L E D G E M E N T S I would like to thank Prof. James Zidek and Dr. Nhu Le for their guidance and assistance throughout the past year. I would like to express my sincere gratitude to them for their careful reading and helpful comments on this thesis. I would also like to thank Prof. John Petkau for his encouragement throughout the program and for providing helpful information related to this thesis work. Many thanks to Dr. Li Sun and Rong Zhu for their substantial help during the course of this work. This study was financially supported by the Natural Science and Engineering Re-search Council of Canada as well as by the Environmental Protection Agency of the United States through a grant to Harvard University. Special thanks go to Mr. Ernie Tradewell (Quality Assurance Scientist) of B. C. Ministry of Environment, Lands and Parks for providing the PMlO data. Thanks are also given to Mr. Ale Percival and Mr. Surjit Nizzar from the Greater Vancouver Regional District for providing field information about PMlO monitoring network. xi Chapter 1 Introduction Airborne particulates gain access to the human body through the lungs. The deposition of inhaled particles on the lung airways and in the alveoli depends largely on the particle size [Gerrity et al., 1979]. Because of their weight, particulates larger than 10 fim in diameter settle to the ground quickly. If we do inhale them, they tend to collect in our throat and nose, and are eliminated from our body by sneezing, coughing or through the digestive system. But the smaller particles, with diameter of 10 fim or less, or PMio as are commonly called, are easily inhaled into the upper respiratory tract of humans, where they can remain for weeks to months, causing breathing disorders. Even smaller particles, those with diameter of 2.5 fxm or less, called PM2.5, are of more concern since they penetrate deeper into the lungs and there discharge any of the toxins of which they are comprised. Elevated levels of PMw are responsible for multiple adverse health effects. These include increases in hospitalization for lung and heart problems [Bates and Sizto, 1987; Pope, 1990], increases in hospital emergency room visits [Bates et al., 1990], increases in days of restricted activity in adults and school absenteeism in children [Ostro, 1987; Michael and Pope, 1992], increased reporting of symptoms that indicate mild respiratory illness and small reduction in measured levels of lung function [Dockery, et al. 1989]. Be-cause of the mutagenic and carcinogenic chemical compounds in particulates, inhalation of particles may also increase the risk of developing respiratory tract cancers. The most troubling finding from many health studies is that increases in particle concentration 1 are associated with increases in mortality [Schwartz and Marcus, 1990; Schwartz and Dockery, 1992 a,b]. These include deaths from lung disease. However deaths from heart disease are also linked to increases in particle concentration. Certain groups within the general population who are more susceptible to the effects of exposure to particulates include the elderly, those with pre-existing lung and heart diseases, and probably the children. A recent study on Vancouver Island [Vedal et al., 1998] suggests that children with diagnosed asthma are more sensitive to these effects than are other children. It is also shown that increases in particle concentration result in reduced visibility [Sloane and White, 1986; Mathai, 1990]. The size and composition of particles, along with the relative humidity of the air can affect the scattering and absorption of light, both important factors in determining the visibility. Particles with diameter between 0.1 /xm and 1.0 fim are particularly effective in affecting visibility because of their comparable size to the wavelength of visible light. Moreover, particles in this size range (aerosols) are not efficiently removed from the atmosphere, and under stagnant air conditions can accumulate and result in regional haze episodes. PMio is invisible to the naked eye. The sources of P M i 0 vary from community to community and from season to season. Some P M 1 0 is released directly from sources into the air. These are called "primary particulates". Other P M W is formed from physical and chemical reactions mainly involving gases emitted into the air. These "secondary particulates" are mainly sulphates, nitrates and organic carbon compounds. Most of the mass of secondary particles reside in the PM 2.5 fraction. In the Lower Fraser Valley of British Columbia, diesel vehicles (trucks and buses), bulk shipping terminals, the wood products industry, gasoline vehicles and road dust are the largest sources of primary PMio ; while diesel and gasoline vehicles are thought to be the major sources of P M 2 . 5 . Studies limited to the Lower Fraser Valley indicate that secondary particles comprise up to 50% of the very fine PMio particles collected during the summertime [Lowenthal et 2 al., 1994]. Fine particulate pollution is of specific concern because it contains a higher pro-portion of various toxic metals and acidic sulfur species [Ozkaynak and Spengler, 1985]. In recognition of the health risks associated with particulate matter, many communities have put the particle monitoring, evaluation and emission reduction as the priority is-sues in their air quality management plan. In a current three-year Harvard-UBC study sponsored by the US EPA, new statistical methods for estimating population exposures to PMW and P M 2 . 5 pollution in the greater Boston and Vancouver areas are being developed. As one of the early products of this project, this thesis presents a spatial and temporal analysis of hourly ambient PMW concentrations collected from monitoring network in Vancouver and the nearby Fraser Valley area for the years 1994 to 1996. Unlike other pollutants such as ozone which derives from photochemical processes with a very large spatial scale and broad regional continuity, ambient particle concen-trations can vary substantially over relatively small areas depending on the influence of point sources. Even a relatively dense monitoring network may not reflect the small-scale spatial gradients in particle concentrations as our analysis below will show. However even with complete knowledge of the particulate field, the population level of exposure is still difficult to infer. One of the main reasons for this is that people spend the majority of their time indoors and ambient particle concentrations are only weakly associated with indoor concentration, especially in colder seasons when windows are kept closed most of the time. In any case, the concentration field is not completely known and siting of monitors in B.C and Greater Vancouver Regional District (GVRD) area is not based on the spatial distribution of the population, and therefore may not adequately reflect the PMW concentration to which much of the population is exposed. Nevertheless the density of monitoring sites in this region is as high as any network anywhere in the world and these monitors do provide a picture of the range of likely P M 1 0 exposures in the 3 GVRD. Although there have been many epidemiological studies about the health impact of particulate pollution, little seems to be known about the spatial and temporal vari-ations of small particles. This may be because it is only in recent years that airborne particulates have been systematically monitored, especially on a continuous basis, in various parts of the world. The objective of this study is to understand and develop models for the space and time correlation structure of PMW, and this will enable us to do spatial interpolation in the study region. Statistical models in the literature for the analysis of spatial-temporal fields have involved interpolation of both a space-time field or trend and the analysis of the residuals from the mean field. A method developed by Le and Zidek (1992) for spatially predicting or interpolating concentration has been successfully used for interpolating ozone and other air pollutants. It has even been refined so that vectors of such concentrations can be interpolated (Brown, Le and Zidek, 1994). There are two important assumptions about the response field described by this model, that is, independent responses over time with a joint Gaussian response distribution. One of the focuses of our research is making the PMW data ideal for the application of available interpolation technique. To be specific, in this research we attempt to decompose the PMW into trend and random components, the latter modeled adequately by time series models to remove the temporal autocorrelation before doing spatial interpolation. Our analysis shows remarkable consistency of temporal variation across the sites. The consequences of using the same temporal structure for all the monitoring sites or using site-specific ones will be discussed. For our ambient PMW data, it does not show strong spatial variation across the monitoring sites. Thus the model developed by Le and Zidek may still be valuable for the interpolation of PMio in this area. 4 This thesis consists of five chapters. In Chapter 2 we describe the data and carefully explore the underlying structure of the data. Chapter 3 deals with the space-time model which consists of first taking a log transformation and then decomposing the transformed data into trend and residual components. We discuss the temporal structure in the detrended residuals. As the spatial interpolation of the residuals and relevant results will be discussed elsewhere, we only give a brief description about the spatial structure of PMW in Chapter 4. A summary about the study will be given in the last chapter. 5 Chapter 2 Preliminary Analysis of Data Set In this chapter, we analyze monitoring data provided by B. C. Ministry of Envi-ronment. P M \ Q levels throughout the province have being monitored for a number of years. While the earliest monitoring data dates back to 1985 (at Port Alberni, Vancouver Island), large-scale monitoring began in 1989. Two different techniques are currently used to measure PMW in the provincial monitoring network. The traditional manual method measures concentration typically once every six days. Three types of manual samplers, the SSI high-volume sampler, the Partisol™ sampler and the Dichotomous sampler are presently used in the provin-cial network. The manual samplers operate under the following principle. Ai r is drawn through a pre-weighed filter for a 24-hour period at a known flow-rate. The filter is then removed and sent for analysis. The increase in filter mass, together with the sampling period and sampling flow-rate are used to calculate the 24-hour PMW concentration. Additional analyses can be performed on the filter to determine the chemical compo-sition of the sample. The continuous method provides real-time measurements and is finding growing application. The continuous sampler used in the provincial network is TEOM™ which works in the following way. Heated air is drawn through a small filter which sits on the end of a hollow tube. This tube is fixed at the other end, and therefore free to oscillate much like a tuning fork. As particles collect on the filter, the mass of the filter changes, and this affects the rate at which the tube oscillates. The oscillation frequency of the tube is used to estimate ambient PMW levels. The TEOM™ sampler 6 is expensive relative to manual ones, and the basic setup of this instrument does not allow for chemical analyses to be done on the collected sample, as in the case of man-ual samplers. Yet it can provide continuous measurements of P M U and requires filter changes only once every two to four weeks. 2.1 Data Set Initially the G V R D monitored ambient levels of PMW by manual samplers at three locations. They are situated in downtown Vancouver, Kitsilano Vancouver and in Rocky Point Park of Port Moody area. Since 1994, continuous samplers have been installed at other locations of the G V R D . These PMi0 samplers provide real-time tracking of PMW concentration on an hour-by-hour basis. The data set for our study includes continuous hourly ambient PMW measure-ment collected by the TEOM™ sampler at 11 monitoring stations in Vancouver and the nearby Fraser Valley for the years 1994 to 1996, although different stations began operation at different times. Figure 2.1 shows the locations of the first 10 stations (ex-cept Hope) in this area. Each station is labelled with a code number. In Table 2.1, we list the geographical locations of the stations, their corresponding code numbers, and the time they commenced operation. A summary of missingness of data is given in Table 2.2. At most stations, little data are missing except for a few stations where continuous monitoring started late or was interrupted over the years. For our analysis, we include the first 10 stations for the years 1995 and 1996; for 1994, only Stations 1, 2, 4, 5, 7 are considered. With our selection of data, we can ignore missing values and when necessary temporarily fill in values with averages for the same hours over the days for each site. 7 Table 2.1. PM10 Monitoring Sites in Vancouver Monitor Site Latitude Longitude Area Starting Time Code 0310162-T9 49.16.51 122.50.53 Port Moody 93/11/01 1 0310175-T2 49.15.45 123.09.45 Kitsilano Vancouver 93/12/13 2 0310177-T4 49.16.45 122.58.11 Burnaby North 94/05/13 3 E206271-T15 49.07.58 122.41.36 Surrey East 94/01/06 4 E207417-T17 49.08.31 123.06.28 Richmond South 93/10/27 5 E207418-T18 49.13.00 122.59.00 Burnaby South 94/03/29 6 E207723-T13 49.09.30 122.54.03 North Delta 93/12/17 7 E209178-T27 49.05.46 122.33.59 Langley 95/03/01 8 E217029-T28 49.02.58 122.17.33 Abbotsford 94/07/19 9 E220891-T12 49.09.04 121.56.49 Chilliwack 95/03/01 10 E223756-T29 49.22.12 121.29.58 Hope 96/12/03 11 Table 2.2. Percentage of Missing Data at Stations and Years Stations Year 1 2 3 4 5 6 7 8 9 10 11 1996 1.8 1.3 1.1 0.4 2.7 0.4 1.4 0.8 17.2 1.6 92.8 1995 7.7 4.5 5.3 6.2 5.4 4.9 14.4 16.6 7.3 16.7 100 1994 4.3 4.9 36.9 4.5 1.5 25.3 2.9 100 55.9 100 100 1993 97.9 96.6 100 100 87.9 100 97.8 100 100 100 100 8 2.2 Measures of Air Quality PMio concentrations is expressed in micrograms of particles per cubic meter of air sampled ([ig/m3). As a measure of the air quality in a region, contaminant levels are compared to air quality objectives. Air quality objectives are developed by environmen-tal and health authorities to provide guidance for environmental protection decisions. BC Environment has set an air quality objective of 50 ng/m3 for P M i 0 during a 24 hour period and 30 fj.g/m3 for the annual average. Exceedances of the air quality objective indicate reduced protection against associated health effects. Hence, one measure of PMio air quality is the number or frequency of exceedances of the air quality objective. In Table 2.3, we give a summary of P M i 0 concentration levels, which includes annual means, annual maximums, and other relevant statistics. For all stations, the annual means never exceed 20 fig/m3, well below the annual objective. During 1996, Stations 5 (Richmond) and 9 (Abbotsford) had the highest annual averages, 16.2 and 15.2 /j,g/m3 respectively, and the P M i 0 concentrations at these two stations seem to be more variable than at other stations. We also observe a slight decline of P M i 0 from 1995 to 1996. The annual maximum in the table stands for the highest level of hourly PMio concentration over a year. Although the maximum values are surprisingly high , ranging from 80 to 350 /i#/m3, most of these records occur in late evenings. This makes it difficult to assign causes for such high P M 1 0 concentrations. We will combine these findings with those from the analyses below. The number of exceedances of the 24-hour air quality objective (that is the number of days in a year when the daily average exceeds certain specified concentration levels) are summarized in Table 2.4. 9 Table 2.3. Percentages of "0 " , "< 5" , "> 30", "> 50" in the Data (Units: fig/m3) Station Codes Year 1996 1 2 3 4 5 6 7 8 9 10 "0" 0.3 0.2 0.5 0.4 0.1 0.5 0.3 0.4 0.6 0.4 "< 5" 5.9 5.8 12.9 9.5 4.9 9.0 8.5 9.4 9.6 9.7 "> 30" 5.9 4.5 3.7 3.5 8.3 5.5 5.6 3.5 8.2 6.3 "> 50" 0.4 0.5 0.4 0.4 2.5 0.8 0.6 0.4 1.7 1.4 Annu. Mean 14.6 14.0 12.0 12.5 16.2 13.6 13.6 12.47 15.2 13.7 Annu. Median 13 12 10 11 13 11 11 10 12 11 Annu. Max. 77 153 98 109 270 91 138 94 342 162 Standard Dev. 8.83 8.72 8.43 8.32 14.82 9.52 9.37 8.17 13.96 11.30 Year 1995 "0" 0.2 0.1 0.2 0.2 0.1 0.2 0.3 0.3 0.3 0.2 "< 5" 4.5 4.0 10.0 6.5 4.7 6.3 7.2 7.1 6.1 6.0 "> 30" 9.6 6.8 3.7 3.9 7.0 7.8 8.7 4.8 14.0 9.1 "> 50" 1.3 0.8 0.3 0.3 0.9 1.4 1.6 0.2 3.2 1.7 Annu. Mean 16.9 15.3 12.8 13.5 14.8 15.2 15.6 13.7 18.4 15.7 Annu. Median 15 13 11 12 12 13 13 12 15 13 Annu. Max. 101 127 76 84 163 121 120 75 200 178 Standard Dev. 10.50 9.48 8.24 8.03 9.87 10.76 11.01 8.38 15.27 11.75 Note: In this table, the row of "0" represents the percentage of 0 recordings i n the da t a set; the "< 5" row represents the percentage of P M 1 0 measurements that are less than 5 fig/m3, and so on . 10 Table 2.4. Exceedances of 24-hour Air Quality Objective Stations Year 24-hour Average 1 2 3 4 5 6 7 8 9 10 1995 > bO^g/m3 0 0 0 0 0 0 0 0 4 4 > 40ng/m3 1 0 0 0 1 2 1 0 9 8 1996 > 50fig/m3 0 0 0 0 4 0 0 0 3 2 > 40fig/m3 0 2 0 1 7 1 1 0 5 5 We now concentrate on the 1996 data. Exceedances of the 24-hour objective oc-curred at Richmond, Abbotsford and Chilliwack in this year with Kitsilano station having 2 days when their daily averages were close to the objective limit. The time series plots of PMW for days with exceedances at these four stations are displayed in Figures 2.2 and 2.3. At Kitsilano and Richmond which are close to the shore, the highest levels of PMio were both observed near the end of October. It is difficult to believe that an artificial disturbance could have been involved at the same time (around midnight) at both stations. Thus these high values reflect the surrounding air conditions. The causes of this episode of poor air quality may be associated with stagnant air conditions and overnight coastal fogs which covered the coastal portion of the region [GVRD Air Quality Annual Report, 1996]. Exceedances in Chilliwack happened in January. Strong outflow winds in the eastern end of the region can create sandstorm conditions which may elevate the particle concentration in the air. The situation at Abbotsford seems to be complicated. Exceedances occurred in February, July and September respectively. There may exist local sources for high PMW concentration in this area. The weather conditions also contribute. Prevailing winds in the Lower Fraser Valley are from east to west in all months of the year. However, 11 in the summer, during warm and sunny days, on-shore winds tend to push the air and any accompanying pollutants inland during the day. At night the airflow reverses, bringing pollutants back to the coast. The late evening high levels of P M 1 0 recorded at Abbotsford may correspond to the transient phase of air flow reversion. 2.3 Long-term and Short-term Temporal Patterns of PMIO Figures 2.4 - 2.7 show the monthly, weekly, week-day, and day-hour effects of the 1996 PMio levels for different stations and those corresponding to 1994 and 1995 levels are displayed in Figures 2.8-2.9. Here the monthly effect represents the monthly mean which is adjusted by removing the annual mean first. We observe consistent monthly variation across the stations, especially for 1996. It was once believed that the highest PMio concentrations are typically observed during the winter months [Fine Particulate (PM 1 0 ) Levels, 1990-1995, Ministry of Environment, Lands and Parks, B.C., 1997]. This data set shows instead obvious summer peaks. For 1996, it is interesting that the July peak is much higher than that of February. Around April, P M 1 0 levels decline. The April - December pattern may be related to weather conditions. But the mechanism for the decrease in April is not clear. A report from the GVRD Air Quality Department mentions that the provincial government implemented a testing program in 1996 for heavy duty vehicles (e.g. smoking trucks). Information from a follow-up inquiry indicates that in Feb, 1996, about 2000 trucks were chosen randomly for testing and fines were issued for those with high emission indices. This action may be hypothetically associated with the decline of P M i 0 levels from February. The weekly mean analysis is an alternative to that for monthly means to detect fine temporal features that might be lost in the monthly means. Again we observe consistent patterns across the stations. In Vancouver, the weather pattern usually goes from a few days of sunny skies followed by consecutive cloudy and rainy days. The weekly effects 12 of PMio m a y capture some of this pattern. In our analysis below, we adopt the weekly effects into model building. For week-day effects, the means are taken after removing annual mean and the weekly mean. The patterns are fairly consistent from station-to-station. PMW levels increase from Monday to Friday and fall during the weekend. The day-hour means are found after removing the annual mean, the weekly mean and the week-day mean. We observe that for all stations, the PMW concentration rises from 5am to 10am; from 10am to 10pm, PMio remains relatively stable for most of the stations and it goes down from 11pm to 4am. The patterns of the week-day and day-hour effects might be explained by traffic pattern. Special features for some stations are obvious in the plots. For Stations 2 and 5, located at Vancouver's Kitsilano and Richmond respectively, we observe two peaks, one in late morning and the other in late evening with relatively low levels during the daytime. For the Richmond station, the evening level is fairly strong. It is difficult to know whether these elevated levels were because of real emission sources or the influ-ence of environment surrounding the TEOM™ samplers. The Vancouver International Airport is located in Richmond and perhaps the noise from airplanes has impact on the operation of the TEOM™ sampler which involves an oscillation mechanism. Also weather conditions and associated air flow patterns or coastal fogs could have influence on the daily distribution of PMio level as discussed in Section 2.2. From the above exploratory analysis, we observe consistent weekly, week-day and day-hour effects from station-to-station. The magnitude of temporal variation is around 30 /ig/m 3 for the weekly effect, 6 /ig/m3 for the week-day effect and 8 /ig/m 3 for the day-hour effect. To our knowledge, there are no heavy industries in the study area. The emission inventory done by the GVRD in 1990 shows that the largest category of emission source 13 consisted mostly of mobile sources, accounting for about 85 % of the total emissions of all five common pollutants—carbon monoxide (COx), volatile organic compounds (VOC's), nitrogen oxides (NOx), sulphur oxides (SOx) and particulate matter (PM) [GVRD Web-site, Air Quality]. Thus traffic pollution is the main source for PMW ambient concentration although there may exist local point sources for some stations. Weather variation (temperature, humidity, air current) may impact the PMw levels as well, for example, rainfall will clean the troposphere and reduce the pollution level; high temperature will increase the activity of particles and air circulation may transport and redistribute the PMW concentrations. Monthly or weekly effects may capture the influence of weather on PMW, while the week-day and day-hour patterns reflect the traffic impact. The consistency of patterns across stations leads us to introduce a common trend for all the stations, and this trend can be estimated by combining the weekly, week-day and day-hour effects. 14 Chapter 3 Space-Time Models for PM10 An environmental monitoring data derives from multivariate responses at a num-ber of sites over a period of time. Typically, because of the high cost of establishing and maintaining a monitoring program, the number of monitoring sites is small. This leads to the need for the reliable interpolation of responses at non-monitored sites when looking for the association between levels of health impact and levels of air pollutants at subregional levels, such as census subdivisions or census tracts. Whenever data from space-time fields are available, the spatial and temporal structure of the data becomes of central concern. The identification of an appropriate statistical model for such data is a fundamental requirement for interpolation at non-monitored sites, and also for the eval-uation of existing networks and for the efficient design of new networks. Non-parametric studies are quite useful for exploring the basic data structure as we have done in Chapter 2, but the ultimate goal is often a parametric model. It is essential that the parametric model adequately represents the structure in the data. The model built in this section is based on our exploratory analysis of raw data, and for the first step, only the data from the year 1996 is considered. Because of the fairly strong hourly variability of P M 1 0 , we do not make any temporal aggregation of the data. In environmental sciences including such fields as hydrology, meteorology, air pol-lution, and other subject areas, there have been many approaches devoted to the ex-tension of traditional geostatistical methods to the space-time domain [Le and Petkau, 1988; Haslett, 1989; Sampson and Guttorp, 1992; Mardia and Goodall, 1993; Handcock 15 and Wallis, 1994; Guttorp, et al. 1994; Host, et al. 1995; Carroll, et al. 1997]. The space-time variability of study fields is characterized by a covariance function that often exhibits different spatial behavior at different times. It is difficult to identify valid co-variance models to incorporate the space, time and space-time interaction components. Most of the models involve the analyses of both a space-time trend and the residu-als. Our study follows this commonly-used approach. In this chapter, we focus on the analysis of the temporal correlation structure of trend-adjusted residuals. 3.1 A Log Transformation Space-time fields having joint Gaussian distributions admit particularly simple analyses. Thus in this section, we first examine the normality of the data. Let pm(x, t) represent the PMW level at station x and time t. The original PMW frequency plot at each station shows a skewed distribution. The standard deviation exhibits hour-to-hour and station-to-station variation (see Table 3.1). Similar phe-nomena are observed for other air pollutants such as ozone. Consequently Guttorp et al. (1994) and Carroll et al. (1997) use the square root transformation for ground-level ozone data. Figures 3.1 and 3.2 show the histogram of PMW at original, square root and log scales for all the 10 stations. Comparing the two commonly-used transformations, we take the log transformation for our P M 1 0 data to stabilize the variances and make the distributions of PMW approximately normal, at least marginally. From Table 3.1, it is interesting to note that in the original data, the early morning and late evening standard deviation of PMW at most of the stations is much lower than that of the daytime. This observation coincides with our intuition, since we expect that the factors which impact PMW levels, such as traffic, will play a much more active role during the daytime. 16 Table 3.1 Standard Deviations at Stations and Hours (Original Data, fig/m3) Stations Hour 1 2 3 4 5 6 7 8 9 10 1 6.8 8.0 6.3 7.6 13.6 7.6 7.7 6.7 21.9 9.4 2 6.0 7.1 5.7 6.6 10.3 6.7 6.8 6.1 12.2 9.5 3 5.5 6.4 5.6 6.3 8.5 6.4 6.2 5.5 12.7 9.2 4 5.5 5.5 5.5 6.2 7.6 5.7 6.2 5.3 13.6 10.3 5 5.4 5.5 5.1 5.9 8.4 5.7 6.1 5.1 12.4 10.9 6 6.2 6.0 6.0 7.7 9.0 5.7 8.0 7.6 13.2 12.2 7 7.6 6.8 6.0 8.9 11.6 7.0 8.1 8.4 12.8 9.9 8 7.9 8.4 7.4 8.9 13.7 10.0 8.9 8.3 12.2 10.2 9 8.6 9.9 8.1 8.1 11.6 10.9 9.6 7.3 12.4 10.8 10 8.7 9.9 7.9 8.3 10.2 11.8 8.9 6.2 12.6 11.9 11 9.1 9.1 9.2 8.4 8.4 12.2 9.9 7.6 8.4 12.7 12 10.2 7.6 10.6 8.9 8.9 11.0 9.9 7.1 7.9 12.4 13 10.5 7.9 10.2 8.1 9.0 11.5 9.4 8.0 8.4 11.3 14 10.1 8.8 9.9 8.1 7.8 11.7 9.8 7.5 8.4 10.7 15 9.8 7.1 9.8 8.1 8.0 10.9 10.0 7.9 8.4 10.9 16 9.9 6.8 10.3 9.3 10.0 10.8 9.7 7.2 7.8 11.5 17 10.3 8.1 10.9 9.7 12.2 10.4 9.4 8.5 7.9 11.5 18 10.1 8.5 9.4 9.0 15.1 11.0 10.3 8.0 9.6 10.8 19 10.0 8.5 8.9 8.7 17.5 9.8 10.7 9.0 11.8 11.4 20 9.7 9.8 8.3 8.9 19.9 9.0 10.5 10.5 12.6 12.0 21 8.8 10.9 8.6 9.0 26.0 8.2 11.1 12.2 14.4 12.1 22 8.7 12.0 7.7 8.8 25.7 7.8 11.1 11.0 12.4 11.8 23 7.9 10.6 6.7 8.7 23.9 7.6 9.6 8.6 23.1 12.1 17 3.2 Model Let Y(x, t) be the log of hourly PMi0, i. e. Y(x, t) = \og(pm(x, t)). In the original data, there are zero PMW levels recorded. However it seems implausible that the actual PMW level is absolutely zero at any time. Therefore those zero recordings seem likely to have been just measurement anomalies. Also we have the so-called below detection limit recordings in the data, denoted by "< 0". When taking the log transformation, these records are changed to l's, the lowest PMW level in the data. However the impact of this change is negligible since there are only few such "< 0" recordings in the data, plus the zeros, in all not more than 50 out of a total of 8784 hourly measurements over the year for all the sites. Our spatial-temporal model for log-transformed hourly ambient PMW is of simple form, Y(x,t) = T(t) + E(x,t) • (1) where T(t) represents the site-consistent deterministic part that incorporates the weekly, week-day and day-hour effects, and we have left the spatial-temporal correlation struc-ture due to other factors in the random part E(x,t). For many environmental studies, the trend involves variability over both space and time. In the above model, we have chosen a trend that is spatially homogeneous. There is no explicit component for spatial effect. This implies that the diurnal patterns are effectively the same across the sites. Our data analysis in Chapter 2 justifies our selecting this model, as the current data display remarkable consistency of temporal variation over sites. 3.3 Trend and Its Estimation The adoption of a common trend for all the stations is supported by the observed 18 consistency of weekly, week-day and day-hour effects of P M W concentrations across the stations. We propose the following model for T(t): where /J, is the overall mean level, and H H O U R , D D A Y and Wweek represent the day-hour, week-day and weekly effects respectively. Up to now we have not included any meteorological variables into the model. Undoubtedly P M W levels relate to weather conditions to some extent. As we mentioned in Chapter 2, the two major factors that influence ambient P M W levels in the study area are traffic pollution and weather conditions. The analyses done in Section 2.3 suggest that the weekly variation of P M W is associated with meteorological states, while the week-day and day-hour variations represent the traffic impact satisfactorily. Thus in the T(t) model, these two factors are involved even though not directly or completely. As our model will be used to predict or interpolate P M \ Q values at non-monitored sites, for example, at the centroids of census tracts in Vancouver area, any variables that are included in T(t) must also be observable or predictable across the area. The model we used for T(t) avoids the need to interpolate the meteorological variables and the difficulty of building appropriate models for the interpolation of these variables. Estimates of the parameters in model (2) are based on the observed P M W concen-trations at the 10 stations. We merged the data into a single vector series and fitted a linear regression model for the parameters. Specifically, let Y- = (yi,i,y2,u •••,yio,i),i = 1,2,...,8784,F' = (Y{X,...X7S4) and 0 = (//, Hu H23, D6, Wu W52). Here the Yi represents the log-transformed P M 1 0 values from the 10 stations for hour i, so the column vector Y is of length 87840. The vector 3 consists of all the parameters with the constraints that i / 2 4 = 0, D7 = 0 and W53 = 0. Then the linear model for 3 is T(t) — fl + H^our + Dday + Wweek (2) Y = X3 + error (3) 19 where X is the model matrix composed of indicator variables. Obviously, because of spatial and temporal correlation in the P M i o data, there are dependences in the errors. The covariance matrix of the error vector is of size 87840 by 87840. We write it as follows / ^ \ cov(error) = V 8^784 On the diagonal, each E* is of size 10 by 10 and is the spatial covariance matrix at hour i. These diagonal blocks may vary from hour-to-hour. Other elements of cov(error) represent the temporal correlation structure. In principle it would be possible to get a minimum variance estimator of 8 by considering the structure of cov(error) and using generalized least squares (GLS). Here we avoid the complicated problem of estimating the covariance matrix. Keeping in mind that the model is used to estimate the general trend, the consistency of P M i o variation in time across the stations and the large number of observations guarantee a reasonable estimate of such a trend, no matter which method is used. For confirmation, we obtain two estimates of 8, one is the ordinary least square (OLS) estimator obtained by simply ignoring the dependence in the data; the other is the generalized least square estimator with simplified diagonal covariance matrix / cov(error) = E 0 0 0 0 E \ obtained by assuming stationary spatial correlation and ignoring the temporal correla-tion. Two resulting estimates for 8 are in fact very similar; the generalized least square method slightly reduces the standard errors of the estimation. All the parameters in the model are significant, presumably in part because of the big sample size (87840). 20 Table 3.2 gives the parameter estimates obtained by using the generalized least square method. The estimate of the overall mean is fi = 1.69. In the next section, we wi l l analyze the random part E(x,t) of model (1), which captures al l the irregular variations not explained by the trend. Table 3.2. GLS Estimates of Trend Parameters Hour 1 2 3 4 5 6 7 8 9 10 11 12 Hi -0.09 -0.15 -0.21 -0.26 -0.25 -0.19 -0.07 0.05 0.10 0.09 0.08 0.03 Hour 13 14 15 16 17 18 19 20 21 22 23 24 Hi 0.01 -0.01 0.01 0.0 0.0 0.05 0.11 0.16 0.18 0.15 0.09 0 Day 1 2 3 4 5 6 7 * * * * * Di 0.09 0.17 0.22 0.24 0.24 0.11 0 * * * * * Week 1 2 3 4 5 6 7 8 9 10 11 12 Wi 0.18 0.30 0.43 0.41 0.99 0.54 0.96 0.09 0.80 0.38 0.66 0.54 Week 13 14 15 16 17 18 19 20 21 22 23 24 Wi 0.57 0.41 0.44 0.35 0.24 0.42 0.62 0.37 0.44 0.90 0.67 0.71 Week 25 26 27 28 29 30 31 32 33 34 35 36 Wi 0.78 0.62 0.58 1.09 0.70 1.27 0.74 1.17 0.81 0.83 0.91 0.38 Week 37 38 39 40 41 42 43 44 45 46 47 48 Wi 0.75 0.62 0.96 0.84 0.81 0.33 0.39 0.82 0.41 0.34 0.68 0.15 Week 49 50 51 52 53 * * * * * * * Wi 0.09 0.17 0.30 0.26 0 * * * * * * * 21 3.4 Temporal Correlation in Detrended Residuals The random part E(x, t) of our model contains both spatial and temporal correla-tions. In their study of ground-level ozone data, Carroll et al. (1997) adopt a stationary covariance structure model for E(x, t). Such stationarity seems unwarranted for PMW as the sample correlations between successive hours of the day can vary from hour-to-hour. As shown in Table 3.3, for most of the stations, the serial correlation between successive hours (lag-1 autocorrelation) of the day seems to be higher in late evening and early morning. The maximum lag-1 correlation is 0.85, while the minimum is only 0.55. It is necessary to construct time series models to explain the autocorrelation structure in E(x,t). For space-time interaction, a simplying assumption is that the spatial and tem-poral correlation structures are separable, i.e., the temporal correlation structure is the same for all the monitoring sites. In our analysis, we first use spatially varying time series model for each site. The results suggest that for most of the sites, the mixed autoregres-sive (order 2) and moving average (order 1) process explains the short term temporal correlation satisfactorily. We observe the spatial variations of these ARMA(2,1) model coefficients, suggesting some interaction between space and time correlation structures; nevertheless this interaction is small since the coefficients only change slightly in space. As the residuals obtained by eliminating the time direction autocorrelation will be used for future spatial interpolation, we will encounter the problem of interpolating a set of estimated temporal correlation models, i.e., interpolating the triple ARMA(2,1) coefficients from monitored sites to non-monitored sites if we use site-specific ARMA(2,1) models. This poses a substantial problem owing to the complex sampling distributions of these parameter estimates. To avoid it, here we fit a single ARMA(2,1) model for all the sites. This simplifies the interpolation, but may leave some temporal autocorrelation in the residuals for some sites. If the magnitude of left-over temporal correlations is small, 22 Table 3.3. Correlations between Successive Hours (Detrended Residuals) Stations Hour 1 2 3. 4 5 6 7 8 9 10 1 0.70 0.85 0.81 0.84 0.84 0.80 0.85 0.83 0.78 0.81 2 0.81 0.79 0.82 0.78 0.82 0.82 0.78 0.85 0.82 0.83 3 0.76 0.78 0.76 0.81 0.82 0.80 0.77 0.73 0.75 0.83 4 0.63 0.76 0.79 0.80 0.84 0.76 0.80 0.68 0.80 0.83 5 0.76 0.73 0.80 0.77 0.78 0.77 0.83 0.71 0.82 0.82 6 0.73 0.73 0.80 0.80 0.77 0.79 0.81 0.70 0.80 0.83 . 7 0.75 0.69 0.74 0.77 0.79 0.77 0.74 0.68 0.80 0.80 8 0.67 0.74 0.71 0.76 0.78 0.68 0.76 0.57 0.62 0.80 9 0.66 0.80 0.74 0.70 0.80 0.66 0.76 0.55 0.64 0.83 10 0.69 0.81 0.80 0.68 0.71 0.77 0.71 0.64 0.55 0.78 11 0.71 0.76 0.74 0.74 0.77 0.72 0.72 0.73 0.59 0.74 12 0.73 0.70 0.70 0.74 0.73 0.71 0.75 0.71 0.62 0.76 13 0.75 0.74 0.74 0.70 0.71 0.71 0.75 0.66 0.62 0.78 14 0.74 0.70 0.70 0.75 0.80 0.66 0.79 0.71 0.65 0.74 15 0.67 0.76 0.74 0.74 0.77 0.70 0.74 0.62 0.67 0.75 16 0.70 0.72 0.74 0.76 0.75 0.73 0.68 0.66 0.73 0.79 17 0.75 0.75 0.77 0.72 0.73 0.81 0.79 0.65 0.73 0.76 18 0.80 0.70 0.79 0.79 0.75 0.74 0.78 0.74 0.70 0.80 19 0.80 0.74 0.77 0.81 0.77 0.69 0.77 0.73 0.79 0.73 20 0.79 0.80. 0.76 0.79 0.81 0.75 0.76 0.78 0.75 0.80 21 0.68 0.79 0.80 0.79 0.82 0.74 0.85 0.79 0.77 0.85 22 0.82 0.82 0.78 0.80 0.81 0.82 0.84 0.82 0.75 0.80 23 0.80 0.80 0.77 0.83 0.81 0.76 0.84 0.82 0.79 0.83 23 and not of concern in spatial interpolation, then the single ARMA(2,1) model may be used to remove the temporal autocorrelation from all the sites. In the study of daily ozone data in Southern Ontario, Sun (1994) uses a single AR(1) model for all sites before doing the interpolation. Unless appropriate models or techniques have been developed for the interpolation of spatially varying temporal structures, the selection of single structure model seems to be an accepted alternative. 3.4.1 Site-specific Time Series Models Figures 3.6 and 3.7 show the empirical autoregressive correlation coefficients (ACF) and partial autoregressive correlation coefficients (PACF) of the detrended residuals for all the 10 stations. The ACF plots suggest that there still remains daily periodic structure. This may reflect the departure of daily variation of individual site from the mean pattern. The PACF plots give a clearer picture of the correlation structure. We observe relatively strong short-term (1-3 hours) lagged correlations and a non-negligible correlation at lag 24 hour. Guttorp et al. (1994) discover similar correlation pattern for the hourly mean-adjusted ground-level ozone data. They explained that the lag 24 hour correlation could correspond to the weather systems passing over the area. Time series analysis involves both model identification and parameter estimation, with the former being thought as the more difficult part. Because there is rarely a direct physical motivation for the selection of a model, the choices must be based on the data. Many practitioners usually follow the Box-Jenkins approach to time series modelling. To a very large extent this method is based on making inferences from the patterns of the sample autocorrelation and partial autocorrelation functions of the series. An alternative procedure for selecting appropriate model is the use of penalized log-likelihood measure. AIC [Akaike,1973] and BIC [Schwarz,1978] are two automatic model selection criteria which balance the reduction of estimated error variances with the number of parameters 24 being fitted. There are many classes of time series models to choose from. The autocorrelation function of a moving average (MA) process usually cuts off at some lag, but the sample ACF plots in Figures 3.6 and 3.7 do not exhibit this special feature, suggesting that the selection of simple MA process to describe the detrended series is not appropriate. In the analysis below, we will arrive at an appropriate model to represent the temporal structure in detrended residuals from among the possible choices of AR, ARMA and ARIMA processes. A R Model: An autoregressive (AR) model of order p for a time series xt is defined by xt = aixt-i + a2xt-2 + ••• + apxt-p + et where (ffi,^--) is a sequence of independent JV(0, af) random variables. This model may be applied in situations where it is reasonable to assume that the present value of a time series depends on the immediate past values. To use this model, the order parameter p must be specified in advance. It is usually difficult to assess the order of an AR process from the sample ACF alone. For a first-order AR process, the theoretical ACF decreases exponentially. But for higher-order processes the ACF is a mixture of damped exponential or sinusoidal functions and is difficult to identify. The use of partial autocorrelation function helps to determine the order of an AR process. When fitting an AR(p) model, the last coefficient ap measures the excess correlation at lag p which is not accounted for by an AR(p-l) model. It is called "the pth partial autocorrelation coefficient". Sample partial autocorrelation coefficients are estimated by fitting AR processes of successively higher order and taking the fcth partial autocorrelation coefficient as the last coefficient when an AR(k) model is fitted, k = 1, 2, ...,p. 25 If an A R ( p ) model represents a process adequately, then the sample P A C F of the process wi l l terminate at lag p. The P A C F plots in Figures 3.6 and 3.7 suggest that an AR(3 ) model may be appropriate to describe the detrended residuals E(x,t). For each station, we fit the model E(x, t) = aiE(x, t-l)+ a2E(x, t - 2) + a3E(x, t - 3) + e(x, t). (4) where ai, a2 and a3 are spatially varying coefficients of the AR(3 ) model. If the AR(3 ) model captures the temporal structure well, then each site-residual e(x, t) is approxi-mately a white noise process and we do not expect any significant serial correlations in e(x, t). For an autoregressive model A R ( p ) , the equations used to estimate the unknown coefficients a\, ap are linear. Here we present two algorithms for the estimation of these parameters. • Yule-Walker Equations: Let 7k be the autocovariance at lag k for an A R ( p ) process xt, then the A R ( p ) coefficients a i , ap satisfy the Yule-Walker equations (after G .U .Yu le and Sir Gilber t Walker): v zZlk-i&k = li, i = l,2,...,p k=l B y replacing the jk in the above equations with the sample estimate 7^ and solving the resulting equations, we can get the estimated coefficients o?i, dp. • Burg's Algor i thm: Burg's approach [Burg, 1967] is based on estimating the fcth partial correlation coefficient by minimizing the sum of forward and backward prediction errors: n SS(ak,k) = zZ (ixt ~ ai,kXt-i - . . . - aKkxt-kf + [xt-k ~ ahkxt_k+1 - . . . - ak,kxtf) t=k+l 26 Here the parameters are defined as following. Let denote the estimate of the i t h autoregressive coefficient in an A R ( k ) model. If we have the estimates a i ] f c _ i , dk-i,k-\ and the estimated error variance o\_y assuming an A R ( k - l ) model, then estimates for an A R ( k ) model are i k - E*=i aj,fc-i7/-fc 0-k,k — 5 aj,k = aj,k-i - a-k,kO-k-j,k-i, 1 < j < k - 1 °l = c r k- i ( 1 -al,k) This recursive method is referred to as the Levinson or Levinson-Durbin algorithm. It allows one to obtain estimates for a kth order model from the estimates of the [k — l ) t h model. The estimates of coefficients of our AR(3 ) model v ia Yule-Walker Equations and Burg's Algor i thm give the same result presumbly because of the large number of observations. Table 3.4 summarizes the AR(3 ) coefficient estimates by Yule-Walker Equations. Table 3.4. Estimates of AR(3) Coefficients at Al l Stations Stations parameter 1 2 3 4 5 6 7 8 9 10 0.60 0.64 0.65 0.63 0.76 0.61 0.68 0.60 0.61 0.63 Oil 0.12 0.10 0.09 0.10 -0.02 0.15 0.09 0.09 0.13 0.14 0.07 0.06 0.07 0.10 0.07 0.03 0.05 0.09 0.04 0.08 0.13 0.13 0.15 0.13 0.14 0.15 0.13 0.16 0.19 0.13 For al l stations the lag-1 A R coefficient dominates the rest. We observe coherent variations of these coefficients across the stations. However, something unusual is found 27 at the Richmond station (Station 5); its order 1 coefficient attains the spatial maximum while its coefficient of order 2 is negative, different from all other stations. However this negative coefficient is very small and could be random fluctuations. Figures 3.8 and 3.9 present the A C F and P A C F plots of the residuals e(x, t) for the 10 stations. Even though we cannot see significant correlation in the A C F plots, we st i l l get correlation at some short-term lags and around lag 24 hours in the P A C F plots; nevertheless the magnitude of these correlations is not more than 0.06. A majority of points lie wi thin the 95% confidence interval band, suggesting the residuals e(x,t) approximate white noise process. The above model selection is based on the estimated A C F and P A C F functions. In the analysis, we also tried the A I C and B I C criteria to select the appropriate order of fitted A R process. For the present case of an order p A R model, the A I C criterion can be written as AIC(p) = —2\og(maximized — likelihood) + 2p In this equation, the first term gives the penalty due to badness of fit whilst the 2p penalizes the selection of too high a model order. The A I C procedure chooses that order p for which the A I C attains its minimum. The orders selected by minimizing the A I C are given in Table 3.5. Table 3.5. Orders of A R Model Selected by AIC Criterion Stations 1 2 3 4 5 6 7 8 9 10 order 35 32 36 32 31 35 33 25 26 27 0.13 0.13 0.15 0.13 0.14 0.15 0.13 0.16 0.19 0.13 The orders found by employing the A I C seem excessively large. In fact, simulation 28 studies [Jones,1975; Shibata, 1976] suggest that the A I C tends to choose too many parameters for autoregressive processes. The B I C is another criterion which does not have the over-fitting property of the A I C . The B I C replaces the term 2p in the A I C criterion by (p + p log(A r ) ) , where N is the number of observations used to fit the model (Chatfield, 1995); this criterion penalizes models wi th large number of parameters more severely than the A I C . Table 3.6. BIC for Different Order A R Processes Stations order 1 2 3 4 5 6 7 8 9 10 1 325.1 270.1 259.3 360.5 162.6 343,9 205.5 337.9 336.6 500.7 2 96.9 112.6 110.1 143.8 162.9 74.3 85.9 168.2 146.8 187.1 3 61.2 95.0 77.8 73.9 127.3 75.7 76.6 105.9 144.5 143.2 4 62.0 105.1 82.2 70.8 137.2 67.1 57.7 71.8 132.3 137.5 5 63.9 115.2 87.3 72.2 145.5 76.9 65.8 76.3 113.4 135.4 The calculation of the B I C for different orders (Table 3.6) shows that for the first five stations, the minimum B I C is attained at order 3, for Stations 6,7,and 8 at order 4 and for the last two stations at order 5. Thus the B I C selection tends to confirm the selection made by using the sample A C F and P A C F for at least half of the stations. For the remaining stations, even when we fitted the AR(4 ) or A R ( 5 ) models, we did not observe much difference in the residuals from that of AR(3 ) model, reflecting the small partial autocorrelation of lag-4 and lag-5 in Figures 3.6 and 3.7. From the above analyses, we conclude the AR(3 ) model adequately represents the short-term correlation in the detrended residuals. 29 A R M A Model: Another useful class of models for time series is formed by combining the moving average and autoregressive processes. A n A R M A ( p , q ) model which contains p A R terms and q M A terms is given by Xt — Ct\Xt-l — ... — CtpXt-p = Et — 0\Et-\ — . . . — OqSt-q This model is often written in the form (j)p(B)xt = 9q(B)et, where B is the backward shift operator such that Bdxt = xt-d, and (f>p(B) = 1 - axB - ... - apBp, 9q(B) = 1-0^- ... - 6qBq. The importance of the A R M A process lies in the fact that a stationary time series may often be described by an A R M A model involving fewer parameters than a pure M A or A R process by itself. As we do not expect more parameters to be involved than we saw in our A R analyses, only the following models are fitted to the detrended residuals: A R M A ( 1 , 1 ) ; A R M A ( 2 , 1 ) ; A R M A ( 1 , 2 ) . Among the three models, A R M A ( 2 , 1 ) seems to do the best for most of the stations. Thus the model we selected to fit E(x,t) by using A R M A process is E(x, t) - axE(x, t-1)- a2E(x, t - 2) = e(x, t) - O^x, t-1). (5) The A C F and P A C F distributions (Figures 3.10 and 3.11) of the residuals left after fitting the A R M A ( 2 , 1 ) model indicate that at most stations, A R M A ( 2 , 1 ) removes almost al l the short-term correlations, but the correlation around lag-24 st i l l exists. B y way of comparison, the mixed A R M A ( 2 , 1 ) process seems better able than the single AR(3 ) models to capture short-term correlation structure. The estimates of parameters in model (5) are summarized in Table 3.7. 30 Table 3.7. Estimates of ARMA(2,1) Coefficients at Al l Stations Stations parameter 1 2 3 4 5 6 7 8 9 10 Oil 1.17 0.99 1.18 1.18 1.63 0.90 1.22 1.25 1.20 1.20 a.2 -0.25 -0.11 -0.26 -0.25 -0.65 -0.04 -0.30 -0.31 -0.28 -0.26 0i 0.57 0.34 0.53 0.56 0.88 0.30 0.54 0.65 0.59 0.57 °? 0.13 0.13 0.15 0.12 0.14 0.15 0.13 0.16 0.19 0.13 Compared with the estimated AR(3 ) coefficients, the spatial variabili ty of the A R M A ( 2 , 1 ) coefficients is coherent as well. Bu t the magnitude of this variabili ty is greater than that of the AR(3 ) model coefficients. S t i l l the coefficient at the Richmond station attains spatial maximum for a\ across al l sites. A R I M A Model: Even though the A R M A ( 2 , 1 ) or AR(3 ) model adequately explains quite well the short term temporal correlation of the detrended residuals, we st i l l observe correlations at lag 24, 48, etc. There remains some periodic structure in the residuals. A more complicated model that accommodates both the short-term and daily correlations is the seasonal A R I M A model. A n A R I M A (p,d,q) model which fits A R M A ( p , q ) models to the differenced series of xt can be defined by 4>p{B)vdxt = 9q{B)et This class of model includes A R , M A and A R M A models as the special cases. Box and Jenkins (1970) have generalized the A R I M A model to deal wi th seasonality and define 31 a general seasonal A P J M A model as </>p(B)$p{Ba)wt = eg(B)eQ(Bs)et where </>p, <3>p, 0 g , QQ are polynomials of order p, P, q, Q respectively while wt = V d V f xt. The variables wt are formed from the original series xt not only by simple differencing (\/xt = xt — xt-i) but also by seasonal differencing (\/sxt =.xt — xt-s) to remove seasonality. For our hourly data, it seems reasonable on heuristic grounds to choose the pa-rameter for seasonal differencing as s = 24. We first fit an A R M A (p,q) model to the seasonal differenced series of E(x, t) for each station, i . e., we take D = 1 and d = 0 and wt = V2AE(X, t) = E(x, t) - E(x, t - 24). The parameters p and q are chosen not greater than 3. Unfortunately, no matter which order of A R M A model is fitted to the obtained seasonal differenced series wt, the autocorrelations of the residuals at lags 24, 48, etc. are artificially inflated (Figure 3.12). We further take the simple difference, i . e., we take D = l and d = l , and wt = V V 2 4 E(x, t) = E(x, t) - E(x, t-1)- E(x, t - 24) + E(x, t - 25). Similar phenomena obtain for the residuals to those seen above. Thus the seasonal A R I M A process seems unable to explain the correlations at lag 24, 48, etc. A n A R M A ( 2 , 1 ) model for the simply differenced series wt = sjE(x,t) = E(x,t) — E(x,t — 1) can only describe short-term correlations (Figure 3.13). Bu t in comparison with the A R M A ( 2 , 1 ) for the original E(x,t) process, the latter may st i l l be a better choice. 32 We also tried the linear regression model E(x, t) = axE{x, t-l) + a2E(x, t - 2) + a3E(x, t - 3) + a2AE(x, t - 24) + e(x, t) (6) This model can be regarded as a special AR(24) process obtained by forcing the coeffi-cients 0:4,0:5, ...,0:23 to be zero. It removes some correlation at lag 24, but it brings in new correlation to short-term lags. The non-stationarity in the detrended residual series is not strong, but it cannot be fitted by any linear models of the time series itself. Instead it is desirable to use explanatory or regression variables to explain the daily correlation. The weather system passing over the study region may be responsible for this daily correlation as suggested by Gut torp et al.(1994). In our spatial-temporal model of PMW, we do not include any meteorological variables which are not available for the study now. It is suggested to add these variables into the model when relevant data become available and appropriate models for the meteorological fields are built up. For the current study, we keep the models (1) and (2), and leave the daily correlation in the residuals. A s this correlation is not high (not more than 0.06), it does not seem to be our concern in spatial interpolation. Residual Analysis: When a model has been fitted to a time series, we need to check that the model does provide a good description of the data. So far we have fitted A R , A R M A , A R I M A models to the trend-adjusted series. Because of the non-stationarity of the series, al l these models have the deficiency that they can not explain the long-term correlations although both the AR(3 ) and A R M A ( 2 , 1 ) processes seem to adequately represent short-term correlations in the series. To validate our models, we use analyses based on the left-over residuals e(x, i ) . One technique is to examine the autocorrelation and partial autocorrelation of the residuals. 33 Let Tit be the autocorrelation of the residuals e(x, t) at lag K. If the model is appropriate, then TJt would be uncorrelated and approximately normal variables wi th mean zero and variance 1/N for reasonably large value of N. S-plus software automatically provides the 95% confidence interval band for the % in the A C F and P A C F plots. B y examining the A C F and P A C F plots of the residuals after AR(3 ) or A R M A ( 2 , 1 ) models, we conclude that there are no exceptionally large errors from fitting either AR(3 ) or A R M A ( 2 , 1 ) to the detrended residuals. B y comparison, A R M A ( 2 , 1 ) may be more valid to describe the short term correlation, as there are fewer points lying outside the confidence band. In addition to examining the % ' s individually, it is useful to base a diagnostic on the autocorrelations taken as a whole. Suppose we have fitted an A R M A ( p , q ) model to a series. It is shown by Box and Pierce (1970) that if the model is adequate, then i=l is approximately distributed as x2(K — p — q), where K, the maximum number for calculating the autocorrelation is typically chosen in the range 10 to 30. This statistic is often called the Portmanteau goodness of fit test statistic. Inflated values of Q wi l l indicate lack of fit of the models. W i t h the fitted AR(3 ) and A R M A ( 2 , 1 ) models, we calculate the portmanteau statistic for K — 15. The results are displayed in Table 3.8 and 3.9. Table 3.8. Portmanteau Statistics for the ARMA(2,1) Model, K=15 Stations 1 2 3 4 5 6 7 8 9 10 df 12 12 12 12 12 12 12 12 12 12 statistic 18.5 11.9 19.2 17.8 39.5 15.8 22.7 11.2 39.7 29.6 p-value 0.14 0.53 0.12 0.17 0 0.26 0.05 0.59 0 0 34 Table 3.9. Portmanteau Statistics for the AR(3) Model, K=15 Stations 1 2 3 4 5 6 7 8 9 10 df 12 12 12 12 12 12 12 12 12 12 statistic 36.3 8.1 20.8 33.0 38.4 21.7 41.5 57.1 60.4 51.6 p-value 0 0.83 0.08 0 0 0.06 0 0 0 0 For the AR(3 ) model, the test statistic at most stations is significant. This tells us that there is some doubt about the adequacy of this model. Instead, for the A R M A ( 2 , 1 ) model, the test statistic is not significant ( 5% level) at most stations, except for Station 5 (Richmond), Station 9 (Abbotsford) and Station 10 (Chill iwack). Therefore A R M A ( 2 , 1 ) model seems better able to fit the data. This is consistent wi th our examination of the sample A C F and P A C F of the residuals. 3.4.2 A Single ARMA(2,1) Model A s we mentioned above, the residuals left after removing the temporal autocorre-lation wi l l be used for spatial interpolation. If we use site-specific time series models, we wi l l face the problem of interpolating these temporal models as well. To avoid this problem, we chose a single A R M A ( 2 , 1 ) model for al l the sites. The three coefficients we used for this A R M A (2,1) model are ax = 1.19, a2 = - 0 . 27 and Qx = 0.55. Doing our analysis this way, we actually assume separable structure of space and time correlations. The spatial variations of site-specific A R M A ( 2 , 1 ) coefficients does not formally support this compromise selection. Bu t because the coefficients vary only slightly from site-to-site, using the single A R M A (2,1) for a l l sites does not seem to produce significantly different residuals, compared wi th the results from site-specific 35 A R M A ( 2 , 1 ) models. We examine the serial correlations after fitting just one A R M A ( 2 , 1 ) model for al l sites. Figures 3.16 and 3.17 display the A C F and P A C F of the left over residuals. Station 5 (Richmond) has the highest lag-1 autocorrelation, the magnitude being about 0.1. For al l sites, the correlation around lag-24 is not more than 0.06. The residuals after fitting just one AR(3) model wi th coefficients a\ = 0.64, a2 — 0.10 and a 3 = 0.07 for al l the sites exhibit similar behaviour as that after fitting the same A R M A ( 2 , 1 ) model (Figures 3.14 and 3.15). S t i l l Station 5 gets the highest left-over lag-1 autocorrelation. From our analysis, we can see that except for Station 5, either the single A R M A ( 2 , 1 ) model or single AR(3 ) model does not result in the residuals much difference from site-specific A R M A (2,1) models or AR(3 ) models. Table 3.10 summaries the means and standard deviations for the log-transformed, detrended, the single AR(3 ) and A R M A ( 2 , 1 ) residual series. B y taking out the common trend, the mean of each site-residual is almost zero. Thus in our time series analyses, we did not adjust the mean. In the next chapter, we wi l l further compare the spatial covariance matr ix across the sites for the residuals by fitting site-specific and site-consistent A R M A ( 2 , 1 ) (or AR(3) ) models. 36 Table 3.10. Means and Standard Deviations for Log-transformed, Detrended, the Single AR(3) and ARMA(2,1) Residual Series (units: fxg/m3) Stations Series 1 2 3 4 5 6 7 8 9 10 Log-tran. Mean 2.50 2.47 2.26 2.33 2.56 2.39 2.40 2.32 2.49 2.37 St.Dev 0.64 0.61 0.71 0.66 0.66 0.70 0.67 0.66 0.70 0.72 Detrend. Mean 0.09 0.05 -0.16 -0.08 0.14 -0.03 -0.01 -0.09 0.08 -0.04 St.Dev 0.54 0.56 0.60 0.56 0.60 0.60 0.58 0.58 0.63 0.62 A R 3 Mean 0.02 0.01 -0.03 -0.02 0.03 -0.01 0.00 -0.02 0.02 -0.01 St.Dev 0.36 0.36 0.38 0.35 0.37 0.39 0.36 0.40 0.43 0.37 A R M A 2 1 Mean 0.02 0.01 -0.03 -0.01 0.03 -0.01 0.00 -0.02 0.01 -0.01 St.Dev 0.36 0.36 0.38 0.35 0.37 0.39 0.36 0.40 0.43 0.36 37 Chapter 4 Some Spatial Analyses Related To PM10 In the previous chapter, we have decomposed PMW into trend and residual compo-nents and modelled the temporal correlation structure in the detrended residuals. Now we wi l l look at the spatial patterns in the data. A s a detailed covariance model w i l l be presented elsewhere, we only give a brief investigation about the spatial correlation structure here. B y removing the temporal effects from the data, we can view the data as replicate measurements in space. Estimates of correlations and variances then can be computed by using their counterparts based on the "samples" of these replicates. The spatial cor-relation in the original series is high and exponentially decreasing wi th the geographical distance. Bu t after removing the common trend and A R M A ( 2 , 1 ) temporal correlation structure, we get very low spatial correlations in the residuals. In fact the current am-bient PMW data suggests that the spatial correlation in the data is relatively small compared to the temporal correlation. Consequently we infer that in the study region any given station picks up most of the variations of PMW. A t the same time, this anal-ysis suggests that the current monitoring network may not have sufficiently high spatial resolution to capture important local spatial correlation. Thus for certain purposes a finer scale monitoring network may be needed. O n average, the spatial variation of PMW is relatively small based on the data collected from the current network. Nevertheless there are differences among the sta-tions. In Section 4.1, we first investigate the grouping structure of the stations by using 38 a clustering method. We believe that spatially varying geographical and meteorological factors are responsible for the different behaviour at the stations. 4.1 Cluster Analysis The current G V R D monitoring network, extending about 30 km from south to north and 100 k m from west to east, is located wi thin the area of the Lower Fraser Valley. One of the unique features of the Lower Fraser Valley is its geography. Mountains and ocean combine to form a beautiful natural setting which attracts visitors from all over the world. But this unique setting is also one of the causes of poor air quality episodes in the region. The Coast Mountains to the north, the Cascade Mountains to the southeast and the Georgia Strait to the west al l inhibit the free air movement in the region. Together these natural features form the walls of the Lower Fraser Valley air basin; walls that contain the air and often prevent the dispersion of pollutants [ G V R D Website, A i r Quality]. Loca l meteorological conditions are another major influence on the air quality levels in the region. Factors which impact the air pollutant concentration include: wind, temperature, sunshine, precipitation and stability. The prevailing winds in the Lower Fraser Valley are from east to west in al l months of the year. Bu t in the summer, on-shore winds tend to push the air inland during the day, and at night the airflow reverses [ G V R D Website, A i r Quality]. A n y pollutants in the air wi l l transport back and forth accompanying the airflow pattern. Temperature inversion is also a common occurrence in the Lower Fraser Valley, when a layer of cold air is trapped next to the ground by a layer of warm air. Inversion prevents ventilation of air by forming a " l id" covering the air basin. Poor ventilation results in a bad air quality situation when successive days of industrial and motor vehicle emissions are bui ld up. Undoubtedly the above mentioned geographical and meteorological features wi l l 39 affect the PMW levels at different locations of the region. It is expected that the PMW concentrations at coastal locations will differ from that at the interior and far eastern points. By grouping these different locations based on their PMW levels, we may as-sociate the local PMW concentration with the geographical and meteorological factors, and reflect the underlying physical processes in the region. The descriptive analysis in Chapter 2 already exhibits the similarity between Stations 2 (Kitsilano) and 5 (Rich-mond) in their common anomalous behaviour. These two locations are close to the ocean. Furthermore the clustering analysis done below will provide insight about the spatial grouping of the locations. Cluster analysis seeks the group structure amongst the objects or cases. Hierar-chical clustering techniques proceed by either a series of successive mergers or a series of successive divisions. Agglomerative hierarchical algorithm starts with the individual ob-jects. There are initially as many clusters as objects. The most similar objects are first grouped, and these initial groups are merged according to their similarities. Eventually, as similarity decreases all subgroups are fused into a single cluster. To produce a simple group structure from a complex data set, any method will require a measure of "similarity" between two objects and a linkage method to merge subgroups. In our analysis, we adopt the commonly-used Euclidean distance for contin-uous observations to measure the dissimilarity between any two cases. By choosing the compact linkage (also called complete linkage) where the distance between clusters is de-termined by the maximum distance between cluster elements, we can avoid the chaining effect which may be induced by other methods, such as the single linkage in which groups are formed by merging nearest neighbors measured by the smallest distance between the elements (Johnson and Wichern, 1982). The purpose of hierarchical cluster analysis is descriptive. There is no assumption about the distribution of the data in hierarchical clustering. For most clustering meth-40 ods, sources of error and variation are not formally considered in the procedures. This means that a clustering method wi l l be sensitive to outliers or noisy observations. For this reason, we perform the cluster analysis on log-transformed P M 1 0 . The results of clustering can be graphically displayed in the form of dendrogram or tree diagram. Figure 4.1 show the group structure of monitoring stations, based on the 1996 and 1995 data respectively. W i t h 1996 data, the stations are partitioned into three groups, wi th Group 1 containing Stations 9 and 10 which are far to the east of the area, Group 2 with Stations 2 and 5 which are close to the shore and Group 3 al l the remaining interior sites. Clustering based on 1995 data is almost consistent with that of 1996 except for Station 10 which joins into the third group, leaving Station 9 as a single point group. The group structure we obtain is quite consistent with the geographical and me-teorological patterns over the region. The dissimilarity between Group 1 and Group 2 is obvious. Geographically, Stations 2 and 5 have their west-sides to the ocean. A i r pollution in this area is heavily influenced by coastal conditions. For example, overnight coastal fogs present in this area wi l l inhibit the dispersion of pollutant, resulting in high temporary concentrations in the mornings. Stations 9 and 10 in Group 1 are to the east-ern end of the region; in winter strong outflow winds can create sand storms that may not reach the coastal stations or other interior sites. The eastern Cascade Mountains also contribute to poor air quality episodes; during the summer, pollutants brought by the inland winds wi l l bui ld up because of these mountains walls. For interior sites there are no physical mechanism for the accumulation of pollutants, except for possible local point emissions. 41 4.2 Inter-site Correlation In this section, we wi l l investigate inter-site spatial correlations based on ambient P M i o measurements at the stations. As spatial interpolation wi l l be done through residuals obtained by removing trend and a common temporal correlation structure, the spatial correlation structure in the residuals w i l l be necessary to assure the success of interpolation. It is of interest to notice that levels of spatial correlation have been reduced considerably by removing the above-mentioned time-effects from each site. We wi l l address this issue in the section as well. W i t h the current monitoring network, the maximum geographical distance between sites is 92 k m (between Sites 2 and 10) while the minimum is 7 k m (between Sites 3 and 6). The stations are located in a rectangular region, or a belt from the ocean eastward to the interior of Lower Fraser Valley. Table 4.1. Between-site Geographical Distance (units: km) Sites 1 2 3 4 5 6 7 8 9 10 1 0 23.7 9.1 20.6 25.2 12.5 14.6 30.0 49.4 69.2 2 23.7 0 14.6 38.2 14.5 14.4 23.0 48.6 69.7 92.1 3 9.1 14.6 0 26.7 18.9 7.3 14.8 36.9 57.3 78.1 4 20.6 38.2 26.7 0 31.1 23.8 15.8 10.4 31.6 56.0 5 25.2 14.5 18.9 31.1 0 12.7 15.6 40.9 62.1 87.1 6 12.5 14.4 7.3 23.8 12.7 0 9.1 34.2 55.3 78.1 7 14.6 23.0 14.8 15.8 15.6 9.1 0 26.1 47.3 71.5 8 30.0 48.6 36.9 10.4 40.9 34.2 26.1 0 21.2 46.9 9 49.4 69.7 57.3 31.6 62.1 55.3 47.3 21.2 0 28.4 10 69.2 92.1 78.1 56.0 87.1 78.1 71.5 46.9 28.4 0 42 In contrast to the "single snapshot" approach of geostatistics, environmental mon-itoring data are obtained over time and so provide a sequence of replicates from which to compute the spatial covariance. The inter-site correlations of log-transformed PMW are in Table 4.2. Without taking out the temporal autocorrelation, these inter-site correlations include both spatial and temporal correlation. Table 4.2. Between-site Correlation of Log-transformed PM10 Sites 1 2 3 4 5 6 7 8 9 10 1 1.00 0.56 0.72 0.67 0.51 0.70 0.66 0.61 0.53 0.58 2 0.56 1.00 0.56 0.56 0.64 0.63 0.57 0.51 0.46 0.45 3 0.72 0.56 1.00 0.67 0.47 0.73 0.69 0.59 0.48 0.55 4 0.67 0.56 0.67 1.00 0.53 0.72 0.76 0.73 0.57 0.56 5 0.51 0.64 0.47 0.53 1.00 0.54 0.53 0.51 0.45 0.46 6 0.70 0.63 0.73 0.72 0.54 1.00 0.75 0.62 0.51 0.55 7 0.66 0.57 0.69 0.76 0.53 0.75 1.00 0.66 0.55 0.54 8 0.61 0.51 0.59 0.73 0.51 0.62 0.66 1.00 0.54 0.59 9 0.53 0.46 0.48 0.57 0.45 0.51 0.55 0.54 1.00 0.55 10 0.58 0.45 0.55 0.56 0.46 0.55 0.54 0.59 0.55 1.00 From the above table, we see that the magnitude of correlation is exponentially decreasing wi th the geographical distance between sites (Figure 4.2). The highest cor-relation is 0.76 (between Sites 4 and 7); the minimum is 0.45 (between Sites 2 and 10 or Sites 5 and 9); and the average is about 0.58. For spatial analysis, isotropy is a desirable property of a study field. "Isotropy" means that the spatial correlation between two sta-tions is invariant with respect to the relative positions of the two stations. The current network contains only 10 stations, located wi thin a relatively narrow band. It is not extensive enough to enable us to infer the isotropic feature of PMW field from the l im-ited information provided by these 10 stations. However by simply checking the above 43 correlation matrix, we find that the between-site correlation depends on the positions of the stations. For example, Station 2 is almost the same distance (15 km) from Stations 3 and 5, but their correlations have a difference of 0.1; the same situation obtains at Station 7 which has a similar distance from Stations 1 and 4 but different correlations. The dominant west-east meteorological conditions suggest that isotropy is not a valid assumption in this area. In Chapter 3, we have constructed the site-specific A R M A ( 2 , 1 ) (or AR(3) ) model to remove the serial correlation from each site. It was also suggested that we use site-consistent A R M A ( 2 , 1 ) (or AR(3) ) models to avoid the difficulty of interpolating spatially varying time series models. These two approaches wi l l produce different residuals at each site. The between-site correlations for these two kinds of residuals are presented in Tables 4.3-4.6 for comparison. Table 4.3. Between-site Correlation of Residuals after Site-consistent ARMA(2,1) Model Sites 1 2 3 4 5 6 7 8 9 10 1 1.00 0.16 0.25 0.22 0.11 0.24 0.19. 0.17 0.11 0.09 2 0.16 1.00 0.17 0.15 0.23 0.22 0.17 0.12 0.09 0.05 3 0.25 0.17 1.00 0.18 0.09 0.23 0.22 0.14 0.08 0.09 4 0.22 0.15 0.18 1.00 0.12 0.23 0.26 0.27 0.13 0.10 5 0.11 0.23 0.09 0.12 1.00 0.14 0.17 0.12 0.10 0.07 6 0.24 0.22 0.23 0.23 0.14 1.00 0.30 0.14 0.10 0.08 7 0.19 0.17 0.22 0.26 0.17 0.30 1.00 0.18 0.15 0.10 8 0.17 0.12 0.14 0.27 0.12 0.14 0.18 1.00 0.13 0.13 9 0.11 0.09 0.08 0.13 0.10 0.10 0.15 0.13 1.00 0.13 10 0.09 0.05 0.09 0.10 0.07 0.08 0.10 0.13 0.13 1.00 44 Table 4.4. Between-site Correlation of Residuals after Site-specific ARMA(2,1) Model Sites 1 2 3 4 5 6 7 8 9 10 1 1.00 0.16 0.25 0.22 0.10 0.24 0.19 0.17 0.12 0.09 2 0.16 1.00 0.16 0.14 0.22 0.22 0.16 0.12 0.10 0.05 3 0.25 0.16 1.00 0.17 0.08 0.23 0.21 0.14 0.08 0.08 4 0.22 0.14 0.17 1.00 0.11 0.23 0.25 0.28 0.13 0.10 5 0.10 0.22 0.08 0.11 1.00 0.13 0.16 0.12 0.10 0.06 6 0.24 0.22 0.23 0.23 0.13 1.00 0.29 0.15 0.10 0.08 7 0.19 0.16 0.21 0.25 0.16 0.29 1.00 0.18 0.15 0.09 8 0.17 0.12 0.14 0.28 0.12 0.15 0.18 1.00 0.15 0.14 9 0.12 0.10 0.08 0.13 0.10 0.10 0.15 0.15 1.00 0.13 10 0.09 0.05 0.08 0.10 0.06 0.08 0.09 0.14 0.13 1.00 Table 4.5. Between-site Correlation of Residuals after Site-consistent A R ( 3 ) Model Sites 1 2 3 4 5 6 7 8 9 10 1 1.00 0.15 0.24 0.22 0.10 0.23 0.19 0.16 0.11 0.09 2 0.15 1.00 0.17 0.14 0.23 0.22 0.16 0.12 0.09 0.05 3 0.24 0.17 1.00 0.17 0.09 0.22 0.21 0.14 0.08 0.08 4 0.22 0.14 0.17 1.00 0.11 0.23 0.25 0.27 0.13 0.10 5 0.10 0.23 0.09 0.11 1.00 0.14 0.17 0.12 0.10 0.07 6 0.23 0.22 0.22 0.23 0.14 1.00 0.29 0.13 0.10 0.08 7 0.19 0.16 0.21 0.25 0.17 0.29 1.00 0.18 0.15 0.10 8 0.16 0.12 0.14 0.27 0.12 0.13 0.18 1.00 0.13 0.13 9 0.11 0.09 0.08 0.13 0.10 0.10 0.15 0.13 1.00 0.13 10 0.09 0.05 0.08 0.10 0.07 0.08 0.10 0.13 0.13 1.00 45 Table 4.6. Between-site Correlation of Residuals after Site-specific AR(3) Model Sites 1 2 3 4 5 6 7 8 9 10 1 1.00 0.16 0.25 0.22 0.10 0.24 0.19 0.17 0.12 0.09 2 0.16 1.00 0.16 0.14 0.22 0.22 0.16 0.12 0.09 0.05 3 0.25 0.16 1.00 0.17 0.08 0.23 0.21 0.14 0.09 0.08 4 0.22 0.14 0.17 1.00 0.11 0.23 0.24 0.28 0.13 0.10 5 0.10 0.22 0.08 0.11 1.00 0.13 0.16 0.12 0.10 0.06 6 0.24 0.22 0.23 0.23 0.13 1.00 0.29 0.14 0.10 0.08 7 0.19 0.16 0.21 0.24 0.16 0.29 1.00 0.18 0.15 0.09 8 0.17 0.12 0.14 0.28 0.12 0.14 0.18 1.00 0.15 0.13 9 0.12 0.09 0.09 0.13 0.10 0.10 0.15 0.15 1.00 0.13 10 0.09 0.05 0.08 0.10 0.06 0.08 0.09 0.13 0.13 1.00 We observe that the results are quite similar. This suggests that using a site-consistent A R M A ( 2 , 1 ) (or AR(3) ) model does not result in much difference in the spatial structure of the residuals from that of site-specific A R M A ( 2 , 1 ) (or AR(3) ) models. There is a considerable reduction in spatial correlation achieved by taking out the trend and the site-consistent A R M A ( 2 , 1 ) (or AR(3) ) temporal structure from each site. Now the maximum inter-site correlation in the residuals is only 0.29. The decrease in spatial correlation following the approach of first eliminating the serial correlation seems unavoidable. Some studies following this approach [ c.f. Guttorp, et al . (1994); Sun (1994)] only address spatial covariance estimates and interpolation. They ignore the magnitude of the spatial correlation. A s the sample covariance matrix is positive definite, in principle we can interpolate at any non-monitored sites, even when the between-site correlation is small. In this study, two points need to be addressed concerning the low correlation 46 in the residuals. First, the low correlation may help us to realize that, based on the ambient PMW collected from current monitoring network, the spatial variation of PMW is relatively small over the study region, as most of the variation can be explained by the same temporal structure across the sites. That leads us to believe that the current network does not reflect the spatial gradients of PMW. To get information at higher resolution would require intensive monitoring down to the scale of, say, under 2 km. Another point concerns interpolation based on current data and the model. With this low correlation, how much is the contribution of the interpolation to the construction of ambient PMW field in the study region? This issue will be discussed in future work. 47 Chapter 5 Discussion This thesis reports preliminary findings in a study involving investigations at both Harvard and U B C . That study which was commissioned by the Environmental Protec-tion Agency of the United States wi l l among other things require the spatial interpo-lation of Vancouver PMW field. The results of carrying out that interpolation wi l l be reported elsewhere, so we do not give here a detailed investigation of the spatial corre-lation structure in the current study. The spatial interpolator to be used wi l l require however independence over time. Ensuring the validity of this assumption constitutes one of the principal by-products of the present analysis that leads to the removal of temporal autocorrelation and whitened residuals for interpolation. B y analyzing the raw ambient hourly P M 1 0 data collected in the G V R D and nearby Fraser Valley area, we found remarkable consistency of monthly, weekly, week-day and day-hour variation patterns from site-to-site. W i t h i n the study area, motor vehicle emissions are the major source for PMW pollution and the week-day and day-hour effects well represent the traffic pattern. Anomalous weather conditions, such as coastal fogs, stagnant air condition, etc., may cause poor air quality episodes at some stations (e. g., Richmond, Abbotsford, Chil l iwack). In our space-time model, in contrast wi th the square-root transformation by Gut -torp et al . (1994) and Carrol l et al . (1997) in their hourly ground-level ozone studies, we chose the log transformation to stabilize the variances and make the distribution of PMW approximately normal. For spatial-temporal fields, many models involve the 48 analysis of both a space-time trend and the left over residuals. Our efforts followed this commonly-used approach. Usually the trend varies spatially and temporally over the study region. In our area, the preliminary analysis of PMW data suggests consistent temporal variations over the sites. The spatial-temporal model we used has a deter-ministic trend component which captures this site-to-site consistent temporal variation pattern. The other component of the model includes al l the irregular variations not ex-plained by the trend, thus containing both spatial and temporal correlations. To model such a random field, a variety of approaches have been reported. Haslett and Raftery (1989) as well as Handcock and Wall is (1994) eliminate the time-direction correlation before they model the spatial structure. In the ground-level hourly ozone data study, Carro l l et al . (1997) choose a space-time covariance structure where the logarithm of the covariance is a linear function of distance and slope and intercept changing wi th time-lag only. In our PMW analysis, that correlation structure for ozone seems inappropriate. We first fitted site-specific time series model to eliminate the temporal correlation from each site. The A R M A ( 2 , 1 ) (or AR(3) ) model seems to capture the short-term autocorre-lation structure satisfactorily. B y noting the coherent and slight spatial variations of the coefficients of A R M A ( 2 , 1 ) (or AR(3) ) over sites, we attempt to use a single A R M A ( 2 , 1 ) (or AR(3) ) model for al l sites. The disadvantage in doing this is that we may leave temporal correlation in the residuals for some stations. The most singular station seems to be Richmond. That station exhibits different behaviour from the other stations, in particular high late evening P M 1 0 levels. We may isolate this station from other stations to construct a special model for it. Even though the A R M A ( 2 , 1 ) (or AR(3) ) model describes the short-term autocor-relation in detrended residuals quite well, we do st i l l get serial correlation at lag 24-hour, 48-hour, etc. Gut torp et a l . (1994) observe the same phenomenon wi th respect to hourly ozone data. Weather conditions may be responsible for this daily correlation. To address 49 this problem in future studies, we plan to incorporate meteorological variables into the models. It is very interesting to notice that after removing the common trend and the same A R M A ( 2 , 1 ) (or AR(3) ) temporal structure from the original data, the spatial correlation between sites is considerably reduced. There is li t t le spatial correlation in station residuals wi thin the scale (7 km) of the current network. The decrease in spatial correlation following the elimination of serial correlation seems to be unavoidable. A s we used the same temporal structure for a l l the sites, this low inter-site correlation may suggest that the information from a single site picks up most of the variations of PMW in the study region. Thus station residuals contain lit t le useful additional information for the construction of PMW field. So to determine the fine structure of the PMW spatial correlation wi l l require field studies at high resolution on a scale of, say 2 km. 50 Bibliography Akaike, H . (1973), "Information Theory and an Extension of the M a x i m u m Likel ihood Principle", in The Second International Symposium on Information Theory, eds. by Petrov, B . N . and Csake, F . , Akademiai Kiado , Hungary, 267-281. Bates, D . V . and Sizto, R . (1987), " A i r Pol lut ion and Hospital Admissions in Southern Ontario: The A c i d Summer Haze Effect", Environmental Research, 43, 317-331. Bates, D . V . , Baker-Anderson, M . and Sizto, R . (1990), "Asthma Attack Periodicity: A Study of Hospital Emergency Visi ts in Vancouver", Environmental Research, 51, 51-70. B . C . Minis t ry of Environment, Lands and Parks (1997), " A i r Qual i ty Report for Br i t i sh Columbia, Fine Particulate (PM10) Levels (1990-1995)". Box, G . E . P. and Pierce, D . A . (1970), "Distr ibution of Residual Autocorrelations in Autoregressive-integrated Moving Average Time Series Models", Journal of the Amer i -can Statistical Association, Vol.64, 1509-1526. Box, G . E . P. and Jenkins, G . M . (1976), "Time Series Analysis, Forecasting and Con-trol" , San Francisco: Holden-Day. Brown, P. J . , Le, N . D . and Zidek, J . V . (1994), "Multivariate Spatial Interpolation and Exposure to A i r Pollutants", Canadian Journal of Statistics, Vol.22, No.4, 489-509. Carrol l , R . J . , Chen, R. , George, E . I., L i , T . H , Newton, H . J . , Schmiediche, H . and Wang, N . (1997), "Ozone Exposure and Population Density in Harris County, Texas", Journal of the American Statistical Association, Vol.92, No.438, 392-415. Ji: Chatfield, C . (1996), " The Analysis of T ime Series, A n Introduction", London: Chap-man &c Ha l l . Clark, I. (1979), "Practical Geostatistics", London: Appl ied Science Publishers. Cressie, N . (1986), "Kr ig ing Non-stationary Data" , Journal of the American Statistical Association, Vol.81, No.395, 625-634. Dockery, D . W . , Speizer, F . E . , Stram, D . O. , Ware, J . H , Spengler, J . D . and Ferris, B . 51 G . (1989), "Effects of Inhalable Particles on Respiratory Health of Chi ldren" , American Review of Respiratory Disease, 139, 587-594. Gerrity, T . R. , Lee, P. S., Hass, E . J . , Marinel l i , A . , Werner, P., and Lourenco, R. V . (1979), "Calculated Deposition of Inhaled Particles in the Airway Generations of Normal Subjects" , Journal of Appl ied Physiology, 49, 867-873. Gooijer, J . G . de, Abraham, B . , Gould , A . and Robinson, L . (1985), "Methods for Determining the Order of an Autoregressive-Moving Average Process: A Survey ", In-ternational Statistics Review, Vol.53, N o . l , 301-329. Guttorp, P., Meir ing, W . and Sampson, P. D . (1994), " A Space-Time Analysis of Ground-Level Ozone Data" , Environmetrics, Vol .5 , 241-254. G V R D (1994), "Fine particulate and Visibi l i ty , Great Vancouver Regional Distr ict ." G V R D (1997), "Ambient A i r Quali ty Annua l Report 1996, Great Vancouver Regional Distr ict ." Haas, T . C . (1990), "Log-normal and Moving Window Methods of Est imat ing A c i d Deposition", Journal of the American Statistical Association, Vol.85, No.412, 950-963. Handcock, M . S. and Wall is , J . R. (1994), " A n Approach to Statistical Spatial-Temporal Model ing of Meteorological Fields", Journal of the American Statistical Association, Vol.89, No.426, 368-378. Haslett, J . and Raftery, A . E . (1989), "Space-time Model l ing wi th Long-memory De-pendence: Assessing Ireland's W i n d Power Resource", Appl ied Statistics, Vol.38, N o . l , 1-50. Host. G . , Omre, H . and Switzer, P. (1995), "Spatial Interpolation Errors for Moni tor ing Data" , Journal of the American Statistical Association, Vol.90, No.431, 853-861. Johnson, R . A . and Wichern, D . W . (1982), "Appl ied Mult ivariate Statistical Analysis" , Englewood Cliffs, N.J . :Prentice-Hall . Jones, R. H . (1975), "Fi t t ing Auto-regressions", Journal of the American Statistical Association, Vol.70, No.351, 590-592. Koehler, A . B . and Murphree, E . S. (1988), " A Comparison of the Akaike and Schwarz Cri ter ia for Selecting Model Order", Appl ied Statistics, Vol.37, No.2, 187-195. 52 Laslett, G . M . (1994), "Kr ig ing and Splines: A n Empir ica l Comparison of Their Predic-tive Performance in Some Applicat ions", Journal of the American Statistical Association, Vol.89, No.426, 391-400. Le, N . D . and Petkau, A . J . (1988), "The Variabi l i ty of Rainfal l Ac id i ty Revisi ted", Canadian Journal of Statistics, Vol.16, N o . l , 15-38. Le, N . D . and Zidek, J . V . (1992), "Interpolation with Uncertain Spatial Covariance: A Bayesian Alternative to Kr ig ing" , Journal of Multivariate Analysis, 43, 351-374. Lowenthal, D . H . , Wittorff, D . and Gertler, A . W . (1994), " C M B Source Apportionment Dur ing R E V E A L (Regional Vis ib i l i ty Experiment in the Lower Fraser Val ley)" , F ina l Report Submitted to the A i r Resources Branch, Minis t ry of Environment, Lands and Parks. Mardia , K . V . and Goodal l , C . R . (1993), " Spatial-Temporal Analysis of Mult ivariate Environmental Moni tor ing data", in Pa t i l , G . P. and Rao, C . R . (editors), Mult ivariate Environmental Statistics, Nor th Holland, Amsterdam, 1993, 347-386. Matha i , C . V . (1990), "Vis ib i l i ty and Fine Particles. A Summary of the A & W M A / E P A International Specialty Conference", Journal of A i r Waste Management Association, Vol.40, N o . l l , 1486-1494. Michael , R . R . and Pope, C . A . III. (1992), "Elementary School Absences and P M 1 0 Pol lut ion in Utah Val ley", Environmental Research, 58, 204-219. Oehlert, G . W . (1993), "Regional Trends in Sulfate Deposition", Journal of the American Statistical Association, Vol.88, No.422, 390-399. Ostro, B . D . (1987), " A i r Pol lut ion and Morbidi ty Revisited: A Specification Test", Journal of Environmental Economics and Management, 14, 87-98. Ozkaynak, H . and Spengler, J . D . (1985), "Analysis of Health Effects Resulting from Population Exposures to A c i d Precipitation Precursors " , Environmental Health Per-spective, 63, 45-55. Ozkaynak, H . and Thurson, G . D . (1987), "Associations Between 1980 U.S . Morta l i ty Rates and Alternative Measures of Airborne Particle Concentration", Risk Analysis, Vol.7, No.4, 449-461. Pope, C . A . III. (1990), "Respiratory Hospital Admissions Associated wi th P M 1 0 Pol -lution in Utah , Salt Lake, and Cache Valleys", Archives of Environmental Health, 46, 53 89-97. Pope, C . A . I l l , Dockery, D . W . , Spengler, J . D . and Raizenne, M . E . (1991), "Respira-tory Health and P M I O : A Dai ly Time-series Analysis" , American Review of Respiratory Disease, 1991b, 149, 1123-1128. Sampson, P. D . and Guttorp, P. (1992), "Non-parametric Est imation of Non-stationary Spatial Covariance Structure", Journal of the American Statistical Association, Vol.87, No.417, 108-119. Schwartz, J . and Marcus, A . (1990) " Morta l i ty and A i r Pol lut ion in London: A Time Series Analysis" , American Journal of Epidemiology, 131, 185-194. Schwartz, J . (1991), "Particulate A i r Pol lut ion and Dai ly Morta l i ty in Detroit", Env i -ronmental Research, 56:204-213. Schwartz, J . and Dockery, D . W . (1992a), "Particulate A i r Pol lut ion and Dai ly Morta l i ty in Steubenville, Ohio" , American Journal of Epidemiology, 135, 12-19. Schwartz, J . and Dockery, D . W . (1992b), "Increased Morta l i ty in Philadelphia Associ-ated with Dai ly A i r Pol lut ion Concentrations", American Review of Respiratory Disease, 145, 600-604. Schwarz, G . (1978), "Estimating the Dimension of a Mode l " , The Annals of Statistics, Vol.6, No.2, 461-464. Shibata, R . (1976), "Selection of the Order of an Autoregressive Mode l by Akaike's Information Cri ter ion", Biometrika, 63, 117-126. Sloane, C . S. and White , W . H . (1986), "Vis ib i l i ty : an Evolving Issue", Environmental Science and Technology, Vol.20, No.9, 760-766. Spedding, D . J . (1974), " A i r Pol lut ion" , Oxford: Clarendon. Sun, W . (1994), "Bayesian Multivariate Interpolation wi th Missing Da ta and its A p p l i -cations", P h . D dissertation, University of Br i t i sh Columbia, Dept. of Statistics. Vedal, S. (1995), "Health Effects of Inhalable Particles; Implication for Br i t i sh Columbia" , Prepared for the A i r Resources Branch, Br i t i sh Columbia Minis t ry of Environment, Lands and Parks. Vedal, S., Petkau, J . , Whi te , R. and Bla i r , J . (1998), "Acute Effects of Ambient Inhalable 54 Particles in Asthmatic and Nonasthmatic Chi ldren" , American Journal of Respiratory and Cr i t i ca l Care Medicine, 157, 1034-1043. Venables, W . N . and Ripley, B . D . (1996), "Modern Appl ied Statistics with S-Plus", New York: Springer-Verlag. Weisberg, S. (1985), "Appl ied Linear Regression", New York: Wiley. 55 •J 56 Kitsilano, 0*311996 Richmond, Aug.61996 Figure 2.2. Exceedance Days of 24-hour A i r Quali ty Objective at Ki ts i lano and Richmond 57 Abbotsford, Feb.131996 Abbotsford, Jul.241996 10 20 10 20 Hour Hour Figure 2.3. Exceedance Days of 24-hour Air Quality Objective at Abbotsford and Chilliwack 58 Month-Effects, 1996 Week-Effects, 1996 Week-day-Effects, 1996 i i i 1 1 1 r 1 2 3 4 5 6 7 Week-day Figure 2.6. Week-day Effects of P M I O , Year 1996 61 Month-Effects, 1994 Month-Effects, 1995 Week-Effects, 1994 0 10 20 30 40 50 Week Week-Effects, 1995 0 10 20 30 40 50 Week Figure 2.8. Monthly and Weekly Effects of PMIO Years 1995 and 1994 63 Week-day-Effects, 1994 Week-day-Effects, 1995 Figure 2.9. Week-day and Day-hour Effects of P M I O Years 1995 and 1994 64 Station 1.PM10 Station 1,sqrt(PM10) Station 1,log(PM10) m H I Station 2, PM10 Station 2, sqrt(PM10) Station 2, log(PM10) 100 150 6 8 10 12 Station 3, PM10 Station 3, sqrt(PM10) Station 3, log(PM10) 20 40 60 80 100 0 2 4 6 v.«!l« ~ -BUM 1 2 Station 4, PM10 Station 4, sqrt(PM10) Station 4, log(PMIO) 0 20 40 60 80 100 Station 5, PM10 50 100 150 200 250 6 8 Station 5, sqrt(PM10) Station 5, log(PM10) o 1 5 6 Figure 3.1. Histograms of P M I O at Original , Square-root and Log Transformed Scales, Stations 1-5 65 Station 6, PM10 Station 6, sqrt(PMIO) Station 6, log(PMIO) Station 7, PMIO 100 Station 8, PM10 20 40 60 Station 7, sqrt(PM10) 0 2 4 6 10 12 Station 8, sqrt(PMIO) Station 7, log(PM10) 1 2 3 Station 8, log(PMIO) 2 3 4 Station 9, PM10 100 200 300 Station 10, PM10 100 150 Station 9, sqrt(PM10) Station 10, sqrt(PM10) 8 10 12 Station 9, log(PM10) i l l 0 1 2 3 Station 10, log(PMIO) Figure 3.2. Histograms of P M I O at Original , Square-root and Log Transformed Scales, Stations 6-10 66 Figure 3.3. T ime Series Plots of P M I O , Year 1996 67 Station 1 Station 2 Station 1 Series: Station[, 1] Series :Station[,1] .IilllJIJiJllJ.llllJ.lli : t].lll^,.. ; „ , , IS 6 fl 6 CM 6 o Ji;ati .Lj^.^L . .t!\! j . .! i! . j . i^ij. t j . . i j . ( . .cij . i j . . ; j:i . . . i^ij . .: .vi«-.viLij J::::.I.;.C.... 0 0 20 40 60 0 20 40 60 Lag Series: Station[, 2] Lag Series: Station[, 2] III llUlllllllUlilJllllU.. - i U l U u _ ID d fl d CM d 0 \ , r-.-M-nvM-hl+ :a J:c;:;;::;:: 0 20 40 Lag Series: Station[, 3] 60 0 0 20 40 Lag Series: Station[, 3] 60 . i j l l l j l i J l l l l i l l i l l i i J l l lu ^ <fl 6 fl d CM d o k 0 20 40 Lag Series: Station[, 4] 60 o 00 0 20 40 ' Lag Series: Station[, 4] 60 III l l l H l l l U l l l l M . i«;j!f[yf)!|!H:;|!i! ;!;' 0 <0 d fl. d CM d o I.L- : 0 20 40 • 60 0 0 20 40 60 Iii Lay Series: Station[, 5] illjlilimjiiillllljihw i¥H!;ii!iiHi«n!f|y^!f|y;|!fn;f|!|!i ' a 0 <o d fl d CM d 0 d Lag Series: Station[, 5] 0 20 40 60 0 20 40 60 Lag Lag Figure 3.6. Autocorrelation and Partial Autocorrelation Coefficients of Detrended Residuals, Stations 1-5 70 Series: Stationf, 6] Series: Station[, 6] ID 6 d CM d 1 0 0 0 20 40 60 Lag Lag Series: Station], 7] Series: Station[, 7]. " M T I I T I T " " — - • " H V M t M ' f ' f M 1 Lag Series: Station[, 8] Series: Station[, 8] ' ' " T i v r i T l v r n ' r < " Series: Station[, 9] Lag Series: Station[, 9] 20 40 60 Lag Series: Station[, 10] Lag 20 40 60 Lag Series: Station[, 10] Figure 3.7. Autocorrelation and Par t ia l Autocorrelation Coefficients of Detrended Residuals, Stations 6-10 71 Series: Station! 1] Series: Station[, 1] Lag Lag Series: Station[, 2] 20 40 Lag Series: Station[, 3] Series: Station[, 2] L.. . . .1 J . . 1 . ( iT. i ., Ii t l ; i n : : r 11 1 ' ' . ' 'ii n II Lag Series: Station[, 3] Ii . Li II n l 1 il. i Ii, .1, i , I i. ,1 1. | ' | i 1 'ijii NJ ' " ' " 1 |i '| II1 Lag Series: Station[, 4] 20 40 Lag Series: Station[, 5] Series: Station[, 4] i l i i . ' u L i . i [1 i i , ii . iFhi i. i t - 1 'i • • | ' | |i |i n n il ||| jl | | | | 0 20 40 Lag Series: Station[, 5] 60 L 1 . ii i . . ll l l, i f 1 i i ih ,h i II 'III' ' H l| i|i 'i i I'll . ' i Figure 3.8. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-specific AR(3 ) Residuals, Stations 1-5 72 Series: Station[, 6] Series: Station[, 6] Lag Series: Station[, 7] Lag Series: Station[, 8] Lag Series: StationJ, 9] 20 40 Lag Series :Station[, 10] • 1 i.ii. ii ., n l i l M i l . . 1 . ,,1 ....! :.. II. I1 i 1 i M ii r T T r Lag Series: Station[, 7] 111. I., 1 ll.fll III 1 il . . . i .. 1 '1 1 1 II 1 1 H i M ' | | | ' ' i n Lag Series; Station[, 8] 1 J 1 lllllll,.ll..1l .1 , i . . . i . i 1 f , , , J '1 M M | [ ' j ' || | ' l I f ' ||' 1 0 20 40 60 Lag Series: Station[, 9] i i.ll l.il.l.ll ii I . 1 , if t . , i , i 1 .1 i l i | i • • INN i i ' | |l | * || 1 i| | 20 40 Lag Series: Station[, 10] Ml i " I 1 MIT 1 'I | Ug Figure 3.9. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-specific AR(3 ) Residuals, Stations 6-10 73 Series: Station], 1] Series: Station[, 1] I I.I , .1, 1 111., I . I ll 1 Jl II 1 11 111"]1 • i n " " i i ' | r i Series: Station[, 2] Series: Station[, 3] Series: Station[, 2] . ,._L J „ i . 1 f, 1 it , . 1, t t ,1 1 .... 1 "II 11 " 'II l'| I T I'l 1 II 20 40 Lag Series: Station[, 3] " r i i i i ' i Lag Series: Station[, 4] Lag Series: Station[, 4] I ' l - ' i l ' l l i - i ' ' - ' 1 i I ' l " ! ! ! ! ' Series: Station[, 5] Lag Series: Station[, 5] : i ! j i : i i i ! : j L . y . l . . ^ ' j . i ! i l i l ^ Lag Lag Figure 3.10. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-specific A R M A ( 2 , 1 ) Residuals, Stations 1-5 74 Series: Station[, 6] Series: Station[, 6] Lag Series: Station[, 7] i n . . I-I .. .ii i L k . 111 • 111 ' I N 'I Nil Series: Station[, 7] TTT I i i Lag Series: Station[, 8] Lag Series: Station[, 8] i 1 h ,1 , , H i . i M i i , ,i 11 'm i i ' T ' i' j III r • II i'i |[' i Series: Station[, 9] 20 40 60 Lag Series: Station! 10] 20 40 60 U g Series: Station[, 9] I 1 , 1 rr i "rr i. , II . i i i 1 1 1 ' i 1 ' H ' i I'ji 1 i i ] 11| i Lag Series: Station[, 10] l,,il,,,i,„i I I Lag Figure 3.11. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-specific A R M A ( 2 , 1 ) Residuals, Stations 6-10 75 Series :Station[, 1] Series: Station], 1] 20 40 Lag Series: Station[, 2] Series: Station[, 3] Ug Series: Station[, 4] 40 60 Ug Series: Station[, 5] 20 40 Lag Series: Station[, 2] Lag Series: Station[, 3] • • i t — . — - r — , . , . . . . . , . . . „ . , . „ . . . . . . . . . . . . „ , . . Lag Series: Station[, 4] Series: Station[, 5] o T-+-T 0 CM 0 0 0 20 40 60 0 20 40 60 Lag Figure 3.12. Autocorrelation and Par t ia l Autocorrelation Coefficients of Residuals of A R M A ( 2 , 1 ) for Seasonally Differenced Series Stations 1-5 76 Series: Station[, 1] 20 40 60 Lag Series: Station[, 2] Series: Station[, 1] i | i i i im . .n l , ' . . , i i | I'Mi 1 |"111 n i i ' Series: Station!, 2] 1111111 • Series: Station!, 3] Lag Series': Station!, 3] .Ll. . r I... II ll 1 llil II.I , 1. . . 1 , 1 1 1 1 1 "1 I I H I ' H '| HI Series: Station!, 4] Lag Series: Station!, 4] HI: Series: Station!, 5] • Lag Series: Station!, 5] .' " ' . I 1 III II 'i Figure 3.13. Autocorrelation and Par t ia l Autocorrelation Coefficients of Residuals of A R I M A ( 2 , 1 ) for Simply Differenced Series Stations 1-5 77 Series: Station], 1] Series: Station[, 1] Series: Station!, 2] 20 40 Lag, Series: Station!, 3] 20 40 Lag Series: Station!, 4] Series: Station!, 5] 1, If , l l , , l l 1 [Ill, 1 I ,1 i, i i 1 jl ' I] ' 11 rj i • 1 1 ' II | i ' i Lag Series: Station!, 2] " l n i l ' V 1 " 1 ' 1 ' j i I ; ! I I 20 40 Lag Series: Station!, 3] .1 i , i... 1 I, . 1,1 1, l.llill il , li.l.l, , , n . ,1 1. |M | i 1 'l"i j j l ' l M l ' 1 | | | | | | Series: Station!, 4] J l.li hi hi, i tl 1 1. h . ifl.i L i t I III I 1 111 1' " I I .'1 -II 11 111 20 40 60 Lag Series: Station!, 5] i v : vn " ' i i i . i 11 II 11•t111 'ii1 Lag Figure 3.14. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-consistent AR(3) Residuals, Stations 1-5 78 Series: Station], 6] Series: Station[, 6] iiilL -iUi in;Jlj'ri LZJI'il'i-i1 Series: Station[, 7] Series: Station[, 8] Series: Station[, 7] L,..i .illl 1.. 1 li.fll Illl 1 h . . 1 .. i 1 1 1 'II" H'lj |l i \ r |y i| |i>| . 4 0 Lag Series: Station[, 8] I M I ,. II II II ,1 .l.!...l.i!.[|i!.l|!...„.!.LJ.:.l..!l!!.JI.!' Lag Series: Stationf, 9] Lag Series: Station[, 9] i!ZiL'[ ' : n L''!l[ . : i luiZ Series: Station!, 10] Lag Series: Station!, TO] r r MIL 1 I I-. ll Lag Figure 3.15. Autocorrelation and Part ia l Autocorrelation Coefficients of Site-consistent AR(3 ) Residuals, Stations 6-10 79 Series: Station[, 1] Series: Station!, 1] Lag Series: Station!, 2] • 20 40 Lag Series: Station!, 3] Lag Series: Station!, 41 20 . 4 0 Lag Series: Station!, 5] 20 40 60 Lag l i l i ..Ii MINI,, I .'i • 11" 11' • M ' • I' i •"1 • 11 • • M Lag Series: Station!, 2] .L.j , " • ; i t r J , i. ., . Ii M ,1 i . . , m n i n" 'ii ,n • • \ > i'i i II ' • f Lag Series: Station!, 3] III 1 . , .ill 11,1 1 i„ 1 i m ' 1 1 ' 11 • 1 p i1 " 1" ' 111 '] III • Lag Series: Station!, 4] f i l l , III, 1,1 f i i i . n . ifl.i i, 1 1 -i i i ' i i 1 i | r | i " i 1 ' i I I p i T | , I 20 . 40 ' 60 Lag Series: Station!, 5] ll. ].L 1 , .1 1 , itl II 1 1 II, ., , 11 1 ll "1 1 II ll'l'l i i |i'll "I-MM |l I Figure 3.16. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-consistent A R M A ( 2 , 1 ) Residuals,. Stations 1-5 80 Series: Station[, 6] Series: Station!, 6] Series: Station!, 7] Series: Station!, 8] 20 40 Lag . Series: Station[, 9] 20 40 , 60 Lag Series: Station!, 10] I , .1 ll i •. 1 1 , , . . • II n: . . • i . J i iM i |r 1 , |[ 1 1 'MHI' -20 40' Lag Series: Station!, 7] TTT M T Lag Series: Station[, 8] . , i . ., ll 1 . ,II ,i i . 11 ii • i i , ,1 ........ .... ...11 ...... 4. ir ' T * . if 40 .;, Lag Series: Station[, 9] , 1 1 1 .1 1 ll ll ll. .i , 1 ,11 f, , , i . L l , i i n i i ' '| ' . mi iii-i |i | > il 1 1| I Series: Station!, 10] T T T r • i i |. Figure 3.17. Autocorrelation and Par t ia l Autocorrelation Coefficients of Site-consistent A R M A ( 2 , 1 ) Residuals, Stations 6-10 81 Station 1 Station 2 0 2000 4000 6000 6000 0 2000 - 4000 6000 .8000 Hour Hour Station 5 Station 6 0 2000 4000 6000 8000 0 2000 4000 ... 6000 8000 Hour Hour Station 7 Station 8 ' 0 2000 4000 6000 ' 8000 0 2000 4000 6000 ,8000 0 2000 4000 6000 8000 0 2000 4000 6000 8000 Hour Hour Figure 3.18. Time Series Plots of L o g ( P M l O ) , Year 1996 82 Station 1 Station 2 0 2000 4000 6000 8000 0 2000 4000 6000 8000 Hour Hour Figure 3.19. T ime Series Plots of Detrended Residuals, Year 1996 83 Station 1 Station 2 Figure 3.20. T ime Series Plots of Single A R M A ( 2 , 1 ) Residuals, Year 1996 84 Station 1 Station 2 Figure 3.21. T ime Series Plots of Single AR(3 ) Residuals, Year 1996 85 Clusters based on Log-transformed 1996 Data, link=compact 0) o co OJ 10 (0 Clusters based on Log-transformed 1995 Data, link=compact 0) CM w ^ oo Figure 4.1. Group Structure among the Moni tor ing Stations 86 CO q -Cd T -D TJ 00 W 0 0 DC CO (0 DC 0 < ion 0 lat CM <D i_ 6 0 0 20 40 60 80 Geographical Distance • • • k' . • 20 40 60 80 Geographical Distance Figure 4.2. Between-site Correlations vs Geographical Distances (km) 87
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Spatial and temporal analysis of ambient hourly PM10...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Spatial and temporal analysis of ambient hourly PM10 in Vancouver Li, Kathy H.Q. 1998
pdf
Page Metadata
Item Metadata
Title | Spatial and temporal analysis of ambient hourly PM10 in Vancouver |
Creator |
Li, Kathy H.Q. |
Date Issued | 1998 |
Description | Fine particulates (PM10, PM2.5) have attracted many studies from a variety of disciplines. Because of the composition of higher proportion of various toxic metals and acidic sulfur species, fine particulate pollution is of specific concern to public health. They can contribute to the development of multiple adverse health effects. Many communities have put the particulate monitoring, evaluation and emission reduction as the prior issues in their air quality management plans. A current three-year Harvard-UBC study supported by the Environmental Protection Agency of the United States focuses on the development of new statistical methods and models for estimating population-level exposures to PM10 and PM2.5 in the greater Boston and Vancouver areas. That study will among other things require the spatial interpolation of Vancouver PM10 field, for example, down to the Census Tract levels. This thesis reports preliminary findings about the spatial and temporal variations of ambient hourly PM10 collected in the Greater Vancouver Regional District and nearby Fraser Valley. The spatial interpolator to be used will require independent response over time. To ensure such as assumption leads to the removal of temporal autocorrelation and whiten residuals for interpolation. The results from descriptive analysis of the raw data exhibit obvious consistency of monthly, weekly, week-day and day-hour patterns from site-to-site. In the study region motor vehicle emissions are the major source of PM10 pollution and the week-day and day-hour effects capture the traffic pattern. Poor air quality episodes at some stations are associated with anomalous weather conditions, such as coastal fogs, stagnant air conditions. Our space-time model for PM10 includes a deterministic trend and stochastic residual. We introduce a spatially homogeneous trend that comprise the above-mentioned site-to-site consistent temporal effects. For the detrended residuals, we find that the ARMA(2,1) or AR(3) model seems to capture the short-term autocorrelation structure satisfactorily. The coherent and slight spatial variation of the coefficients of ARMA(2,1) or AR(3) models suggests us to employ a single ARMA(2,1) or AR(3) model for all sites to eliminate the time-direction correlation and get a parsimonious description of our data. There remains some daily autocorrelation (lag-24 hours) in ARMA(2,1) or AR(3) residuals which can not be described by linear time series models. Nevertheless this correlation is not more than 0.06, and does not seem to cause concern in spatial interpolation of these residuals. It is suggested to incorporate the meteorological variables into models in future studies that might explain this daily serial correlation. It is noteworthy that there is little spatial correlation in station residuals after removing the trend and the same temporal autocorrelation structure represented by either a single ARMA(2,1) or AR(3) model. This low inter-site correlation may suggest that any given site in the study region picks up most of the variations of PM10 and station residuals contain little useful additional information for the construction of PM10 field. At the same time this analysis may suggest that the current network (with a scale of 7km) may not have sufficiently high spatial resolution to reflect the local important spatial gradients of PM10. For certain purposes, a finer monitoring network (down to the scale, say 2km) may be required to determine the fine spatial structure of PM10. |
Extent | 4122456 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-05-26 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0088493 |
URI | http://hdl.handle.net/2429/8208 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1998-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1998-0517.pdf [ 3.93MB ]
- Metadata
- JSON: 831-1.0088493.json
- JSON-LD: 831-1.0088493-ld.json
- RDF/XML (Pretty): 831-1.0088493-rdf.xml
- RDF/JSON: 831-1.0088493-rdf.json
- Turtle: 831-1.0088493-turtle.txt
- N-Triples: 831-1.0088493-rdf-ntriples.txt
- Original Record: 831-1.0088493-source.json
- Full Text
- 831-1.0088493-fulltext.txt
- Citation
- 831-1.0088493.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0088493/manifest