UBC Theses and Dissertations


The relationship between asthma and pollution levels in Prince George, British Columbia Li, Bing 1989


THE RELATIONSHIP BETWEEN ASTHMA AND POLLUTION LEVELS IN PRINCE GEORGE, BRITISH COLUMBIA

By Bing Li

B.Sc., Beijing Institute of Technology, 1982
M.Sc., Beijing Institute of Technology, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1989
© Bing Li, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia, Vancouver, Canada

Abstract

Three methods are used to analyse the relationships between asthma and pollution levels in Prince George. The first is the spectral analysis approach, which concentrates on the spectra of the pollution level series and the asthma count series, their cross spectra, their coherency, and a measure of the linear relationship between them. The other two methods are what we term the "generalized harmonic process" and the "generalized autoregressive approach", which generalize the traditional harmonic process and autoregressive model and allow us to analyse time series under less restrictive distributional assumptions, for example discrete time series. The analyses suggest a weak positive association between asthma and the pollution levels of total reduced sulphates.
Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgement
1 Introduction
1.1 Description of the Data Set
1.2 Overview
2 Methodology
2.1 The Spectral Analysis Approach
2.1.1 Prewhitening of the X and Y Series
2.1.2 Cross Spectrum and Coherency
2.1.3 A Measure of Linear Relationship between the Series X and Y
2.2 The Generalized Harmonic Process (GHP) Approach
2.2.1 The GHP Model
2.2.2 The Spectra of GHP and Their Null Distributions
2.2.3 Spectra Based on Likelihood Ratio Tests Against Periodicities
2.2.4 Application of the GHP Model
2.3 The Generalized Autoregressive Model (GAR) Approach
2.3.1 The GAR Model
2.3.2 Application of the GAR Model
3 Relationships of the Health Series and the TRS Series
3.1 The Spectral Analysis Approach
3.2 The Generalized Harmonic Process Approach
3.2.1 The Spectra of ER and AD
3.2.2 ER versus TRS
3.2.3 AD versus TRS
3.3 The Generalized Autoregressive Model Approach
3.3.1 ER versus TRS
3.3.2 AD versus TRS
4 Relationships of the Health Series and the TSP Series
4.1 The Spectral Analysis Approach
4.2 The Generalized Harmonic Process Approach
4.2.1 ER versus TSP
4.2.2 AD versus TSP
4.3 The Generalized Autoregressive Model Approach
4.3.1 ER versus TSP
4.3.2 AD versus TSP
5 Final Remarks
Bibliography

List of Tables

Table 1. Sample Means and Variances
Table 2. Week Pattern of ER and AD
Table 3. P-values for the Tests for Cycles
Table 4. Tests of Significance
Table 5. Tests of Significance
Table 6. $\hat\rho^2$ for ER and AD versus TSP
Table 7. Test of Significance for X terms
Table 8. Test of Significance for X terms
Table 9. Test of Significance for X terms
Table 10. Test of Significance for X terms
Table 11. Test of Significance for X terms
Table 12. Test of Significance for X terms
Table 13. Test of Significance for X terms
Table 14. Test of Significance for X terms
Table 15. Test of Significance for X terms
Table 16. Test of Significance for X terms
Table 17. Test of Significance for X terms
Table 18. Test of Significance for X terms
Table 19. Test of Significance for X terms
Table 20. Test of Significance for X terms
Table 21. Test of Significance for X terms
Table 22. Test of Significance for X terms
Table 23. Test of Significance for X terms
Table 24. Test of Significance for X terms
Table 25. Test of Significance for X terms
Table 26. Test of Significance for X terms
Table 27. Test of Significance for X terms
Table 28. Test of Significance for X terms
Table 29. Test of Significance for X terms
Table 30. Test of Significance for X terms

List of Figures

Figure 1. Histograms of ER and AD
Figure 2. Comparison of Whitened vs Unwhitened Spectra
Figure 3. Coherencies between Whitened Series
Figure 4. Proposed Spectra of ER
Figure 5. Proposed Spectra of AD
Figure 6. Comparison between the Whitened and Unwhitened Spectra of ER
Figure 7. Comparison between the Whitened and Unwhitened Spectra of AD
Figure 8. Comparison between the Whitened and Unwhitened Spectra of TSP
Figure 9. Coherencies between Whitened ER and Whitened TSP
Figure 10. Coherencies between Whitened AD and Whitened TSP

Acknowledgement

I would like to thank Dr. J. Petkau, my supervisor, for his very helpful comments, suggestions and advice on this thesis as well as on many other aspects of my study at UBC, and for patiently reviewing the draft. I would also like to thank Dr. P. De Jong for reading the thesis and making valuable suggestions. I am very grateful to the Department of Statistics for providing me with financial support.
1 Introduction

1.1 Description of the Data Set

One part of the data set to be analyzed consists of the counts of the number of admissions for asthma to the single hospital in Prince George, which will be referred to as AD, and those of visits to the emergency room of that hospital, which will be referred to as ER. These counts were extracted from the hospital records for the period April 1, 1984 to March 31, 1986. The other part of the data is the levels of total suspended particulates (TSP) and the levels of total reduced sulphates (TRS) in Prince George for the same period. The TRS levels were measured at six monitoring stations on a daily basis, whereas the TSP levels were measured at five of the above stations, and are available only every sixth day. We estimated the missing values in TSP and TRS via the EM algorithm of Dempster, Laird and Rubin [2]. To reduce the pollutant series to single series describing the levels experienced in Prince George, we then take the average of TRS across the six stations and of TSP across the five stations. There are no missing data in the ER and AD series.

There has long been public concern in Prince George that the air pollution in the city may be influencing the health of the residents, since the pollution levels often exceed the provincial air quality standards. The objective of the work reported in this thesis was to study the relationships between the two health series, AD and ER, and the two pollution series, TRS and TSP, to examine whether any associations are apparent between human health and the ambient levels of air pollution in Prince George.

The hospital admission counts for asthma on a specific day can be viewed as the outcome of a large number of Bernoulli experiments (one for each of the residents of Prince George), each of which has a very small corresponding probability of success (being admitted to the hospital for asthma).
So, as a starting point, it seems reasonable to treat the admission counts as the observed values of Poisson random variables. A similar argument can be made for the emergency room visit counts. We can see from Table 1 that the sample means and variances of ER and AD are quite close, which agrees with the Poisson assumption.

Table 1: Sample Means and Variances

      Sample Mean   Sample Variance
ER    0.75          0.86
AD    0.55          0.51

The histograms of ER and AD are plotted in Figure 1; these also suggest that a Poisson model may be reasonable.

1.2 Overview

We shall use three different methods to study these relationships. Section 2 discusses the theoretical aspects of these methods. The results of their application to studying the above relationships are presented in Sections 3 and 4. Since TRS and TSP are collected differently, the relationships between the health series and the TRS series are examined in Section 3, while those between the health series and the TSP series are examined in Section 4. In Section 5 we summarize the results of these analyses and briefly discuss other possible methods of analysis. The methods used here differ from those employed in [3] and [4], and the results obtained complement those reported in these earlier studies.

2 Methodology

The first method to be applied is the traditional spectral analysis approach, carried out as a preliminary and primarily descriptive analysis. It concentrates on the spectra of each pair of X and Y series, their cross spectrum, their coherency, and finally a measure of the linear relationship between them. The other two methods are combinations of ideas and techniques from the methodology of generalized linear models (GLIM; see [5]) and time series analysis.
One method will be referred to as the generalized harmonic process (GHP), and combines GLIM with spectral analysis; the other method will be referred to as the generalized autoregressive model (GAR), and combines GLIM with autoregressive models.

2.1 The Spectral Analysis Approach

In this section, we describe the details of a method of measuring the strength of the linear relationship between two time series $\{X_t, t = 1, 2, \ldots\}$ and $\{Y_t, t = 1, 2, \ldots\}$, which will be referred to as the X and Y series.

2.1.1 Prewhitening of the X and Y Series

The relationship of interest between the X and Y series is that which excludes any association due to the deterministic patterns each series may have. Therefore it seems reasonable to "whiten", that is, to remove the dependence within each of the X and Y series before studying the relationship between the two series. This might be done, for example, by first fitting the X and Y series with AR models, and then considering the relationship between their residuals. A possible concern is that such a whitening process might remove more than just the association due to the deterministic patterns of each series, or even that it might remove all of the relationship between the X and Y series. The following theorem indicates that this is not the case.

If a time series $\{X_t\}$ is stationary, then it can be represented as (see [6], page 246):

$$X_t = \int_{-\pi}^{\pi} e^{it\omega}\, dZ_X(\omega),$$

where the stochastic process $Z_X(\omega)$ will be referred to as the spectral representation of the series $\{X_t\}$. If we have two series $\{X_t\}$ and $\{Y_t\}$, then at each frequency $\omega$, the correlation coefficient between the two random variables $dZ_X(\omega)$ and $dZ_Y(\omega)$, that is, $\mathrm{Cor}(dZ_X(\omega), dZ_Y(\omega))$, is referred to as the "complex coherency between series X and Y at frequency $\omega$" and denoted as $w_{XY}(\omega)$. The absolute value of $w_{XY}(\omega)$ is called the "coherency" between series X and Y at frequency $\omega$.
Theorem (see [6], page 661): Let $\{X_t\}$, $\{Y_t\}$, $\{X'_t\}$, $\{Y'_t\}$ be time series, stationary up to the second order, with

$$X'_t = \sum_{u=-\infty}^{+\infty} a_u X_{t-u} \quad\text{and}\quad Y'_t = \sum_{u=-\infty}^{+\infty} b_u Y_{t-u}.$$

Let $Z_X(\omega)$, $Z_Y(\omega)$, $Z_{X'}(\omega)$, $Z_{Y'}(\omega)$ be their spectral representations. Then

$$|w_{XY}(\omega)| = |w_{X'Y'}(\omega)|.$$

The theorem can be interpreted as "coherency is invariant under linear transformations". Therefore, in the sense of coherency, prewhitening the two time series as described at the beginning of this section would not remove the relationship between them, provided that both of them are stationary and the prewhitening procedure is linear.

Based on this consideration, we will use the following whitening process. We first estimate the spectra of the X and Y series (see [6], pages 398-399). Evaluate the estimated covariance function, $\hat R(s)$, as

$$\hat R(s) = \frac{1}{N}\sum_{t=1}^{N-|s|} (X_t - \bar X)(X_{t+|s|} - \bar X);$$

the estimated autocorrelation function, $\hat r(s)$, is then given by

$$\hat r(s) = \hat R(s)/\hat R(0).$$

The spectra are then estimated by:

$$\hat h(\omega) = \frac{1}{2\pi}\sum_{s=-(N-1)}^{N-1} M(s)\,\hat r(s)\cos(\omega s), \qquad (1)$$

where $M(s)$ is the "weight" assigned to $\hat r(s)$ for each $s$, which is often referred to as the "lag window" (see [6], page 434). In our analyses, we will use the Parzen lag window (see [6], page 443):

$$M(s) = \begin{cases} 1 - 6(s/M)^2 + 6(|s|/M)^3, & |s| \le M/2, \\ 2(1 - |s|/M)^3, & M/2 \le |s| \le M, \\ 0, & |s| > M, \end{cases}$$

with some integer $M$.

We fit each of the X and Y series with an AR model, using maximum likelihood to estimate the parameters and the Akaike information criterion to determine the order. The residual from the fit is the "whitened" series. This whitening "removes" the autocorrelation within each series but, according to the above theorem, does not disturb the intercorrelation between the X and Y series if they are stationary. Comparing the estimated spectra for each pair of whitened and unwhitened series clearly illustrates the effects of the prewhitening.

2.1.2 Cross Spectrum and Coherency

Next, we wish to study the relationship between the whitened X and Y series.
In the following discussion, unless specifically mentioned otherwise, the whitened X and Y series will be referred to simply as the X and Y series. First, we estimate the cross spectrum between the X series and the Y series (see [6], page 693):

$$\hat h_{XY}(\omega) = \frac{1}{2\pi}\sum_{s=-(N-1)}^{N-1} M(s)\,\hat r_{XY}(s)\,e^{-i\omega s};$$

here $\hat r_{XY}(s)$ is the estimated cross-correlation function,

$$\hat r_{XY}(s) = \frac{\hat R_{XY}(s)}{\sqrt{\hat R_{XX}(0)\,\hat R_{YY}(0)}},$$

where $\hat R_{XY}(s)$ is the estimated cross covariance function,

$$\hat R_{XY}(s) = \frac{1}{N}\sum_t (X_t - \bar X)(Y_{t+s} - \bar Y),$$

with the summation extending from $t = 1$ to $N - s$ for $s \ge 0$ and from $t = 1 - s$ to $N$ for $s < 0$, and $M(s)$ is the Parzen window centered at the maximum value of $\hat r_{XY}$. The coherency between time series X and Y is then estimated by:

$$|\hat w_{XY}(\omega)| = \frac{|\hat h_{XY}(\omega)|}{\sqrt{\hat h_{XX}(\omega)\,\hat h_{YY}(\omega)}}. \qquad (2)$$

The introduction of the coherency between two time series allows us to have a measure of the strength of the linear relationship between the two series, as described in the next subsection.

2.1.3 A Measure of Linear Relationship between the Series X and Y

Although coherency gives the correlation between $dZ_X(\omega)$ and $dZ_Y(\omega)$, it corresponds to specific frequencies, and is not a direct measure of the linear relationship between the series X and Y. However, based on coherency and spectra, such a measure of linear relationship between two stationary time series is provided by the following theorem.

Theorem (see [6], page 675): Suppose $X_t$ and $Y_t$ are stationary up to the second order, and $Y_t$ can be represented as a linear transformation of $X_t$ of the following form:

$$Y_t = \sum_{u=-\infty}^{+\infty} a_u X_{t-u} + \varepsilon_t,$$

where $\varepsilon_t$ and $\{X_t\}$ are uncorrelated. Then we have:

$$\sigma_Y^2 = \int_{-\pi}^{\pi} h_{YY}(\omega)\,|w_{XY}(\omega)|^2\, d\omega + \sigma_\varepsilon^2,$$

where $\sigma_Y^2 = \mathrm{var}(Y_t)$ and $\sigma_\varepsilon^2 = \mathrm{var}(\varepsilon_t)$.

This result provides a decomposition of the total variance of $Y_t$ into the variance explained by the regression of $Y_t$ on $X_t$ and the variance of the residuals. So the ratio $\rho^2 = 1 - \sigma_\varepsilon^2/\sigma_Y^2$ represents the extent to which $Y_t$ and $X_t$ are linearly related in the above sense.
More specifically, it represents the proportion of variation in the series Y explained by its "best possible" linear relationship (with infinitely many parameters) with the series X. It can be estimated by $\hat\rho^2 = 1 - \hat\sigma_\varepsilon^2/\hat\sigma_Y^2$, where the estimates are obtained by substituting the estimated spectra and coherency into the decomposition above (see [6], page 675). In our analysis, $\hat\rho^2$ provides us with a rough idea of the strength of the linear relationship between two time series.

2.2 The Generalized Harmonic Process (GHP) Approach

In this section we introduce an approach that can be applied to detect periodicities for data modelled with the exponential family class of distributions. These cycles can then be incorporated into regression models relating two series. Due to its importance for application to the Prince George data, emphasis in the development which follows is given to the Poisson case.

2.2.1 The GHP Model

Assume that we have independent observations $Y_1, \ldots, Y_N$, and that $Y_t$ has probability density function proportional to

$$\exp[Y_t\, g(\theta_t) + a(\theta_t) + b(Y_t)]. \qquad (3)$$

Then $\{Y_t\}$ is said to be generated by a GHP if

$$g(\theta_t) = \lambda + \sum_{i=1}^{k} (A_i \sin\omega_i t + B_i \cos\omega_i t)$$

for some $\lambda$, $\{A_i\}$, $\{B_i\}$, and a prespecified set of $\{\omega_i\}$.

Example 1: If $Y_t \sim \mathrm{Poisson}(\lambda_t)$, then $g(\lambda_t) = \log\lambda_t$, and

$$\log\lambda_t = \lambda + \sum_{i=1}^{k} (A_i \sin\omega_i t + B_i \cos\omega_i t).$$

We call it the Poisson harmonic process.

Example 2: If $Y_t \sim \mathrm{Binomial}(n, p_t)$ with common $n$, then $g(p_t) = \log\left(\frac{p_t}{1-p_t}\right)$, and

$$\log\frac{p_t}{1-p_t} = \lambda + \sum_{i=1}^{k} (A_i \sin\omega_i t + B_i \cos\omega_i t).$$

We call it the binomial harmonic process.

Example 3: If $Y_t \sim N(\mu_t, 1)$, then $g(\mu_t) = \mu_t$, and

$$\mu_t = \lambda + \sum_{i=1}^{k} (A_i \sin\omega_i t + B_i \cos\omega_i t). \qquad (4)$$

This is the traditional harmonic process.

2.2.2 The Spectra of GHP and Their Null Distributions

Once we have the model (3) and relation (4), we can use GLIM to get the MLE's $\hat\lambda$, $\hat A_i$'s and $\hat B_i$'s for $\lambda$, $A_i$'s and $B_i$'s. Following the traditional approach, we can plot the scaled $\hat A_i^2 + \hat B_i^2$ as the "power" associated with the frequency $\omega_i$ (see [6], page 395).
The following theorem gives their limiting null distribution in the Poisson case. The result is very similar to that for traditional spectra, except that the current results are asymptotic. Entirely similar arguments can be applied to other distributions in the exponential family (3), such as the binomial, to obtain limiting null distributions for spectra.

A set of "fundamental frequencies" is given by

$$\{\omega_i : \omega_i = 2\pi i/N,\ i = 1, \ldots, [N/2]\},$$

where $N$ is the number of observations. The set of frequencies thus defined has the orthogonality properties (see [6], page 392).

Theorem: Let $\{\omega_1, \ldots, \omega_k\}$ be a set of fundamental frequencies, and $\hat\lambda$, $\hat A_i$'s and $\hat B_i$'s be the MLE's for $\lambda$, $A_i$'s and $B_i$'s under the model (3). Then, under the null hypothesis

$$H_0: A_i = 0,\ B_i = 0,\quad i = 1, \ldots, k,$$

the $\sqrt{N\bar Y/2}\,\hat A_i$'s and $\sqrt{N\bar Y/2}\,\hat B_i$'s are asymptotically independently and identically distributed (iid) as $N(0, 1)$, and therefore the $(N\bar Y/2)(\hat A_i^2 + \hat B_i^2)$'s are asymptotically iid $\chi^2_2$.

The intuition behind this spectrum is similar to that of the traditional one, that is, the "power" assigned to each frequency.

Proof: It is easy to verify that under model (3), the Fisher information matrix for the MLE's is $I(\theta) = W'MW$, where

$$W = \begin{pmatrix} 1 & \sin\omega_1 1 & \cos\omega_1 1 & \cdots & \sin\omega_k 1 & \cos\omega_k 1 \\ 1 & \sin\omega_1 2 & \cos\omega_1 2 & \cdots & \sin\omega_k 2 & \cos\omega_k 2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & \sin\omega_1 N & \cos\omega_1 N & \cdots & \sin\omega_k N & \cos\omega_k N \end{pmatrix}$$

and $M = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$. Under $H_0$, $M = \mathrm{diag}(\lambda, \ldots, \lambda)$, and because of the orthogonality properties of the fundamental frequencies, $W$ has mutually orthogonal columns with $\sum_{t=1}^{N} W_{t1}^2 = N$ and $\sum_{t=1}^{N} W_{tj}^2 = N/2$ for $j > 1$, so that

$$I(\theta_0) = W'MW = \begin{pmatrix} N\lambda & 0 \\ 0 & \frac{N\lambda}{2} I_{2k\times 2k} \end{pmatrix}.$$

Setting $\theta = (A_1, B_1, \ldots, A_k, B_k)'$, under $H_0$ we have

$$\sqrt{\frac{N\lambda}{2}}\,\hat\theta \xrightarrow{d} N(0, I_{2k\times 2k}).$$

The theorem follows because under $H_0$, $\bar Y$ is a consistent estimator of $\lambda$.

2.2.3 Spectra Based on Likelihood Ratio Tests Against Periodicities

In this subsection we introduce two other methods for detecting periodicities, each based on a series of hypothesis tests associated with a prespecified set of frequencies.
The null hypotheses are that the series has a mean not varying with time; the alternative hypotheses are that the series has a cycle of a specified frequency. A likelihood ratio test can then be constructed with respect to each of the frequencies, and these likelihood ratio tests lead directly to spectra.

Proposal 1: Consider the following series of hypothesis tests:

$$H_0^{(p)}: \lambda_1 = \cdots = \lambda_n = \lambda$$
$$H_1^{(p)}: \lambda_i = \lambda_{i+p},\quad i = 1, \ldots, n-p.$$

For any value of $p$, the likelihood ratio statistic is:

$$R_p = 2\log L(H_1^{(p)}) - 2\log L(H_0^{(p)}),$$

where

$$\log L(H_1^{(p)}) = \sum_{j=1}^{p}\sum_{i=1}^{n/p} Y_{(i-1)p+j}\log\hat\lambda_j - \frac{n}{p}\sum_{j=1}^{p}\hat\lambda_j$$

and

$$\hat\lambda_j = \frac{p}{n}\sum_{i=1}^{n/p} Y_{(i-1)p+j}.$$

Since $R_p \sim \chi^2_{p-1}$, we can compute the P-value associated with each period $p$. If the P-value at some period $p$ is below a critical level $\alpha$, this suggests a period $p$ in the process. Alternatively, we can plot 1 - P-value against period, and large values (close to 1) in this "quantile spectrum" suggest a periodicity.

Proposal 2: Similarly, consider the following series of hypothesis tests:

$$H_0^{(\omega)}: \lambda_1 = \cdots = \lambda_n = \lambda$$
$$H_1^{(\omega)}: \log\lambda_t = \lambda + A\sin\omega t + B\cos\omega t.$$

Associated with each $\omega$,

$$R(\omega) = 2\log L(H_1^{(\omega)}) - 2\log L(H_0^{(\omega)}) \sim \chi^2_2,$$

where $\log L(H_0^{(\omega)})$ is the same as before, and

$$\log L(H_1^{(\omega)}) = \sum_{t=1}^{n} Y_t(\hat\lambda + \hat A\sin\omega t + \hat B\cos\omega t) - \sum_{t=1}^{n}\exp(\hat\lambda + \hat A\sin\omega t + \hat B\cos\omega t).$$

Then we can plot $R(\omega)$ against $\omega$.

2.2.4 Application of the GHP Model

The spectrum methods introduced above provide an idea of the "importance" of each frequency. The "important" frequencies can then be included in regression models for Y against X. The applications which follow in Sections 3 and 4 involve the Poisson case and the following procedure will be employed.
First, based on the GHP spectrum introduced in Section 2.2.2, we select the set of frequencies corresponding to small P-values from the associated $\chi^2_2$ statistics, and combine the terms corresponding to these frequencies with the X terms in the following model:

$$\log\lambda_t = \lambda + \sum_{i=1}^{k}(A_i\sin\omega_i t + B_i\cos\omega_i t) + \sum_{i=0}^{m}\beta_i X_{t-i}; \qquad (5)$$

here $m$ is a relatively large integer believed sufficient to cover the past with which the present could be associated. Model reduction is then based on retaining in the above model only those terms with coefficients which differ significantly from zero (P-value < $\alpha'$). This process is repeated in stages, leading to a final model of the same form as (5), in which $\{A_i\}$, $\{B_i\}$ and $\{\beta_i\}$ are the coefficients surviving the model reduction. At each stage of model reduction, the changes in deviance are also investigated to make sure the reduction is legitimate. Tests for the "importance" of the X terms, and specification of the relationship between $\{X_t\}$ and $\{Y_t\}$, if any, are then based on this final model.

2.3 The Generalized Autoregressive Model (GAR) Approach

Traditional AR models are most suitable for time series with continuous marginal distributions. The GAR model is developed to describe time series with the exponential family class of marginal distributions. This allows us to model time series with discrete marginal distributions.

2.3.1 The GAR Model

A sequence of $N$ observations $Y_1, \ldots, Y_N$ is said to follow a GAR model of order $p$ if the observations have the following conditional distributions:

$$f(Y_t \mid Y_t^p) = \exp\{Y_t\, c[\theta(Y_t^p)] + d[\theta(Y_t^p)] + S(Y_t)\},$$

where

$$c[\theta(Y_t^p)] = \beta_0 + \beta' Y_t^p, \quad Y_t^p = (Y_{t-1}, \ldots, Y_{t-p})', \quad \beta = (\beta_1, \ldots, \beta_p)'.$$

Example 1: When the conditional distributions of the Y's are Poisson, we have

$$\theta(Y_t^p) = E(Y_t \mid Y_t^p), \quad\text{and}\quad c[\theta(Y_t^p)] = \log[\theta(Y_t^p)] = \beta_0 + \beta' Y_t^p.$$

We call it a log-AR model.

Example 2: When the conditional distributions of the Y's are binomial with common known total count $n$, we have

$$n\theta(Y_t^p) = E(Y_t \mid Y_t^p), \quad\text{and}\quad c[\theta(Y_t^p)] = \log\frac{\theta(Y_t^p)}{1 - \theta(Y_t^p)} = \beta_0 + \beta' Y_t^p.$$

We call it a logit-AR model.
Example 3: When the conditional distributions of the Y's are normal with common known $\sigma = 1$, we have

$$\theta(Y_t^p) = E(Y_t \mid Y_t^p), \quad\text{and}\quad c[\theta(Y_t^p)] = \theta(Y_t^p) = \beta_0 + \beta' Y_t^p.$$

This is the traditional AR model.

2.3.2 Application of the GAR Model

Similar to Section 2.2.4, we can combine the GAR model with regression against X. In the Poisson case, we start with the following model:

$$\log\lambda_t = \alpha_0 + \sum_{i=1}^{p}\alpha_i Y_{t-i} + \sum_{i=0}^{q}\beta_i X_{t-i}; \qquad (6)$$

here $p$ and $q$ are relatively large integers believed sufficient to cover the past values of X and Y with which the present Y could be associated. A model reduction procedure similar to that of Section 2.2.4 can then be applied.

3 Relationships of the Health Series and the TRS Series

3.1 The Spectral Analysis Approach

The prewhitening is done by fitting each series with an AR model. Provided the fit is adequate, the residuals will be close to "white noise". The AR fitting is based on maximum likelihood, and the model reduction procedure is based on Akaike's information criterion. The residual autocorrelation function, the residual sum of squares, and the spectrum of the residual series are investigated to check the goodness-of-fit. We present only the resulting model and the spectrum of residuals.

In the following models, $Y_t$ denotes an observation centered at the maximum likelihood estimate (MLE) $\hat\mu$ of the mean of the original series; that is, $\hat\mu$ has been subtracted from each observation.

For ER, $\hat\mu = 0.75$, and the fitted model is:

$$Y_t = 0.079Y_{t-1} - 0.071Y_{t-11} + 0.11Y_{t-15}.$$

For AD, $\hat\mu = 0.54$, and the fitted model is:

$$Y_t = 0.12Y_{t-2} + 0.083Y_{t-8} + 0.068Y_{t-9} - 0.062Y_{t-11} - 0.054Y_{t-13} + 0.071Y_{t-19}.$$

For TRS, $\hat\mu = 3.37$, and the fitted model is:

$$Y_t = 0.37Y_{t-1} + 0.12Y_{t-2} + 0.059Y_{t-5} - 0.097Y_{t-12} - 0.053Y_{t-16}.$$

The above models show how the present data are linearly associated with the past data, thereby describing the dynamics within each series. We do not see a similar pattern among the three series.
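As a rough illustration of how a fitted model of this kind yields a whitened series, the sketch below applies the ER coefficients quoted above to a series of counts and returns the residuals. The function name `whiten` and the toy input are hypothetical and introduced only for illustration; the coefficients here are taken as given rather than re-estimated by maximum likelihood.

```python
# Sketch only: residuals e_t = Y_t - sum_j phi_j * Y_{t-j} of a sparse AR model,
# where Y_t is the observation centered at the estimated mean.
# `whiten` and the toy data are hypothetical; the coefficients are the ER fit quoted above.

ER_MEAN = 0.75
ER_COEFS = {1: 0.079, 11: -0.071, 15: 0.11}  # lag -> AR coefficient


def whiten(series, mean, coefs):
    """Return the AR residuals for every t that has all required lags available."""
    centered = [y - mean for y in series]
    max_lag = max(coefs)
    residuals = []
    for t in range(max_lag, len(centered)):
        fitted = sum(phi * centered[t - lag] for lag, phi in coefs.items())
        residuals.append(centered[t] - fitted)
    return residuals


# Toy daily counts (made up, not the Prince George data).
toy_counts = [0, 1, 0, 2, 1, 0, 0, 1, 3, 0, 1, 0, 0, 2, 1, 0, 1, 1, 0, 2]
toy_residuals = whiten(toy_counts, ER_MEAN, ER_COEFS)
```

The first 15 observations are used only as lagged predictors, so the residual series is shorter than the input by the largest lag in the model.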
The whitened and unwhitened spectra for ER, AD and TRS are estimated using formula (1) with $M = 60$ and are plotted in Figure 2, with solid curves representing the whitened spectra and dotted curves representing the unwhitened spectra. In all three plots, the solid curves are much flatter than the dotted curves. This illustrates the effect of prewhitening the series.

The coherencies between ER and TRS and between AD and TRS are estimated using formula (2) and are plotted in Figure 3; the $\hat\rho^2$ measure described in Section 2.1.3 equals 0.04 and 0.06 for ER vs TRS and AD vs TRS respectively. Both values are quite small, which means that the linear relationships between ER and TRS and between AD and TRS are both quite weak.

3.2 The Generalized Harmonic Process Approach

In this section we first study the dynamics within each health series and then incorporate the apparent dynamics into the regression of each of the health series against the TRS pollution series.

3.2.1 The Spectra of ER and AD

We calculate the three spectra introduced in Sections 2.2.2 and 2.2.3 for both ER and AD. The fundamental frequencies we selected for the GHP spectrum and the Proposal 2 spectrum are $\{\omega_k = 2\pi k/N : k = 13, 26, 39, \ldots, 364\}$. The quantile spectrum is plotted against the period $T = 2, \ldots, 16$ days.

Figure 4 presents the GHP, Proposal 2 and Proposal 1 spectra for ER. From Figures 4a and 4b we can see a high peak at the frequency $\omega = 0.89$; the associated period is $T = 2\pi/0.89 = 7.02 \approx 1$ week. From Figure 4c we can see that this spectrum has high peaks at both $T = 7$ and $T = 14$, which also indicates a period of one week. This is confirmed by Table 2, where we can see that there are more emergency room visits on weekends than on weekdays.

Figure 5 presents the GHP, Proposal 2 and Proposal 1 spectra for AD. Figure 5a and Figure 5b both show a relatively high value at frequency $\omega = 0.56$; the associated period is $T = 11$, which seems to suggest a period of 11 days.
However, this is not supported by Figure 5c, where we do not see any indication of such a period. Although it seems hard to give an intuitive interpretation to this period, the corresponding trigonometric function does, as will be seen later, capture quite a bit of the variation in the data and provides a reasonable predictor for modelling the "dynamics" inside AD.

Table 2: Week Patterns of ER and AD

                       Sun.  Mon.  Tue.  Wed.  Thu.  Fri.  Sat.
Average of ER counts   0.98  0.81  0.63  0.49  0.54  0.76  1.07
Average of AD counts   0.55  0.64  0.53  0.57  0.48  0.50  0.56

We can also construct tests for the significance of the cycles using the GHP spectrum. According to the Theorem in Section 2.2.2, we know that the statistics in the GHP spectrum are asymptotically iid $\chi^2_2$; that is, exponential with mean $\theta = 2$. Therefore the largest peak in the spectrum should be distributed asymptotically as the largest order statistic for a random sample of $N$ exponentials, and similarly for the second and third largest peaks. Comparison to these null distributions leads to the results summarized in Table 3:

Table 3: P-Values for the Tests for Cycles

              ER      AD
First Peak    0.0002  0.024
Second Peak   0.33    0.25

The third and following order statistics also have large P-values, indicating that they do not correspond to significant cycles. ER and AD seem to have only one dominating cycle each.

3.2.2 ER versus TRS

Based on the GHP spectrum calculated above, we can then do model reduction to obtain important frequencies and then incorporate them into the regression versus TRS as described in Section 2.2.4. Starting with model (5) with $m = 30$, and using $\alpha = \alpha' = 0.10$ in the model reduction procedure, we are led to the following model:

$$\log\lambda_t = -.33 + .13\sin(.11t) + .12\sin(.89t) - .15\sin(2.13t) - .11\cos(.11t) + .13\cos(.34t) + .21\cos(1.12t) + .13\cos(1.68t) + .016X_{t-5} - .022X_{t-11} + .017X_{t-14} - .025X_{t-16} + .015X_{t-20}.$$

The z-scores for the coefficients of the X terms are: 1.91, -1.98, 1.83, -2.16, 1.76.
The deviance is 803.5, with 697 degrees of freedom (P-value = 0.0031). This indicates a lack of fit, which could probably be removed by introducing an overdispersion parameter. Notice that TRS enters the fitted relationship via some quite large lags; for example, a lag of 20. This peculiar phenomenon may be due to the fact that, since $\alpha' = 0.1$, roughly 10% of the predictors would survive the model reduction even if there were no significant predictors. However, without further empirical knowledge of the relationship between ER and TRS, we have no way to determine which of the surviving lags are of essential importance. Because of this and the lack of fit, this model provides only a rough idea of how ER is related to TRS.

Also observe the alternating sequence of signs on the TRS terms. It does not seem unreasonable to assume that changes in air pollution levels have an effect on ER. Therefore we refit a simpler model in which $X_{t-5}$, $X_{t-11}$, $X_{t-14}$ and $X_{t-16}$ enter only through $\Delta_1 = X_{t-5} - X_{t-11}$ and $\Delta_2 = X_{t-14} - X_{t-16}$. This results in the following fit:

$$\log\lambda_t = -.37 + .13\sin(.11t) + .13\sin(.89t) - .15\sin(2.13t) - .11\cos(.11t) + .13\cos(.34t) + .21\cos(1.12t) + .13\cos(1.68t) + .018\Delta_1 + .019\Delta_2 + .014X_{t-20}. \qquad (7)$$

This refitting results in an increase in the deviance of 0.8, with 2 degrees of freedom; this suggests the reduction is appropriate. The results of tests for the importance of the X and $\Delta$ terms based on the above model are summarized in Table 4.

Table 4: Tests of Significance

Test               Δdev   df   P-value
for $\Delta_1$     7.4    1    0.007
for $\Delta_2$     6.5    1    0.01
for $X_{t-20}$     3.1    1    0.08

We can see that the P-values for the $\Delta$ terms are quite small, which suggests that the changes in TRS levels could be an important factor.

3.2.3 AD versus TRS

In exactly the same fashion we obtain the GHP model for AD vs TRS:

$$\log\lambda_t = -.70 - .12\sin(0.56t) - .12\sin(1.34t) - .16\sin(2.24t) + .20\cos(.56t) + .020X_{t-4}.$$
The z-score for the coefficient of $X_{t-4}$ is 2.25 and the deviance is 697 with 720 degrees of freedom (P-value = 0.724). Withholding $X_{t-4}$ increases the deviance by 4.5 (P-value = 0.034), which suggests an association of AD with TRS at a lag of 4 days.

The above analyses seem to suggest positive associations of both ER and AD with TRS. Emergency room visits seem to be more affected by changes in pollution levels than by the actual values, although the peculiar structure of the relationship suggested by model (7), the lack of empirical knowledge and the lack of fit do not allow us to say so with high confidence. On the other hand, hospital admissions seem to be affected by a high level of TRS at a lag of 4 days.

3.3 The Generalized Autoregressive Model Approach

In this section we will apply the GAR model to study the relationships between the health series and the TRS series.

3.3.1 ER versus TRS

We start with model (6) with the order of autoregression $p = 20$, and the order of regression against TRS $q = 30$. The model reduction rule is $\alpha = 0.10$. Restriction to the order $q = 30$, corresponding approximately to a month, assumes that pollution levels more than a month ago would not affect the present counts of emergency room visits. Restriction to the order $p = 20$ assumes that the counts more than 20 days ago would not affect the present counts. The following model results:

$$\log\lambda_t = -.49 + .091Y_{t-1} + .14Y_{t-15} + .016X_{t-5} - .022X_{t-11} + .020X_{t-14} - .027X_{t-16} + .018X_{t-20}.$$

The deviance is 827.78, with 702 degrees of freedom (P-value = .0004). The z-scores for the coefficients of the X terms are -2.32 and 2.05. This indicates a lack of fit, which could probably be removed by introducing an overdispersion parameter. Notice that the fitted relationship between ER and TRS is quite similar to that obtained via the GHP approach, except that the dynamics within ER are accounted for by regression on the past values of ER rather than on trigonometric functions.
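To make the structure of this fitted log-AR model concrete, the following sketch evaluates the conditional mean it implies for day t from past ER counts and past TRS levels. The coefficients are those quoted above; the function name `expected_count` and the zero-based list indexing are assumptions made for this illustration only.

```python
import math

# Sketch only: evaluate the fitted log-AR (GAR) model for ER versus TRS quoted above.
# `expected_count` is a hypothetical helper; y holds ER counts and x holds TRS levels,
# both indexed by day starting from 0 (an indexing convention assumed here).

INTERCEPT = -0.49
Y_COEFS = {1: 0.091, 15: 0.14}                                      # lag -> coefficient on past ER
X_COEFS = {5: 0.016, 11: -0.022, 14: 0.020, 16: -0.027, 20: 0.018}  # lag -> coefficient on past TRS


def expected_count(y, x, t):
    """Return lambda_t = exp(linear predictor); requires t >= 20 so that all lags exist."""
    if t < max(max(Y_COEFS), max(X_COEFS)):
        raise ValueError("t is too small for the largest lag in the model")
    eta = INTERCEPT
    eta += sum(a * y[t - lag] for lag, a in Y_COEFS.items())
    eta += sum(b * x[t - lag] for lag, b in X_COEFS.items())
    return math.exp(eta)
```

Because the TRS coefficients alternate in sign and nearly cancel, a constant TRS level moves the predicted count very little under this model; it is differences between lagged levels that matter most.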
Again, given the peculiar structure of the model with some large lags and the fact that there is a lack of fit, we are not very confident about the conclusions based on this model. Proceeding as in the GHP approach, we consider reduction to the effects of changes in the TRS levels and refit. The following model results:

$$\log\lambda_t = -.53 + .091Y_{t-1} + .14Y_{t-15} + .018\Delta_1 + .022\Delta_2 + .016X_{t-20}. \qquad (8)$$

Eliminating two parameters in this fashion increases the deviance by only 0.8, which again suggests the reduction is appropriate. The results of tests for the significance of the $\Delta$ and $X$ terms are summarized in Table 5.

Table 5: Tests of significance

Test              $\Delta$dev   df   P-value
for $\Delta_1$    7.3           1    0.007
for $\Delta_2$    8.1           1    0.004
for $X_{t-20}$    4.0           1    0.05

The small P-values for the $\Delta$ terms seem to suggest that changes in TRS levels could be important factors.

3.3.2 AD versus TRS

In exactly the same way as above, we obtain the following model for AD vs TRS:

$$\log\lambda_t = -.81 + .19Y_{t-2} + .13Y_{t-8} + .020X_{t-4}.$$

The deviance is 699.5, with 718 degrees of freedom (P-value = 0.682). The z-score for the coefficient of $X_{t-4}$ is 2.13; withholding $X_{t-4}$ increases the deviance by 4.1 (P-value = 0.043). The small P-value suggests an association of AD with TRS at a lag of 4 days.

We can see that the two methods, GHP and GAR, yield very similar results.

4 Relationships of the Health Series and the TSP Series

Because the TSP levels were measured only every sixth day, straightforward application of the three methods is not possible. We shall study the relationship between the TSP levels and the health series corresponding to the same day, 1 day after, ..., 5 days after the days when TSP levels were measured. We refer to the relation between $\{Y_{6t+k}\}$ (referring to either ER or AD) and $\{X_{6t}\}$ as the lag $k$ relation, where $k = 0, 1, \ldots, 5$. Therefore we shall be dealing with a total of 13 different time series: 6 for ER, 6 for AD, and 1 for TSP.
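The lag k pairing described above amounts to subsampling each daily health series at the TSP measurement days shifted by k days. The sketch below illustrates that indexing in Python. The function name `lag_k_series`, the zero-based day index and the `first_tsp_day` offset are hypothetical choices for this example, since the thesis does not describe its implementation.

```python
def lag_k_series(daily_counts, k, first_tsp_day=0, step=6):
    """Return the counts observed k days after each TSP measurement day.

    daily_counts  : list of daily counts (ER or AD), one entry per day
    k             : lag in days, 0 <= k < step
    first_tsp_day : index of the first TSP measurement day (assumed, zero-based)
    step          : spacing of TSP measurements in days (every sixth day)
    """
    if not 0 <= k < step:
        raise ValueError("k must satisfy 0 <= k < step")
    return daily_counts[first_tsp_day + k::step]

# Stand-in data: the day index itself plays the role of the daily count.
daily = list(range(24))
series = {k: lag_k_series(daily, k) for k in range(6)}
for k, values in series.items():
    print(k, values)
```

Applying this to both ER and AD gives 6 sub-series each which, together with the TSP series itself, make up the 13 series referred to above.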
4.1 The Spectral Analysis Approach

The AR models for prewhitening the above 13 time series are obtained in the same way as before. In the following results, the unit of time is 6 days; for example, $Y_{t-6}$ denotes the value $6 \times 6 = 36$ days prior to $Y_t$. The full model allows 12 lags, and the reduction procedure is as described in Section 2.1.1.

Applying the prewhitening procedure to the 6 ER series leads to the following results:

- with lag 0: $\hat{\mu} = 0.74$, and the AR model is: $Y_t = -0.18Y_{t-6} + 0.18Y_{t-9} + \epsilon_t$;
- with lag 1: $\hat{\mu} = 0.87$, and the AR model is: $Y_t = -0.15Y_{t-3} - 0.20Y_{t-5} - 0.18Y_{t-10} + \epsilon_t$;
- with lag 2: $\hat{\mu} = 0.73$, and the AR model is: $Y_t = +0.18Y_{t-1} + \epsilon_t$;
- with lag 3: $\hat{\mu} = 0.76$, and the AR model is: $Y_t = 0.19Y_{t-7} - 0.14Y_{t-8} - 0.15Y_{t-12} + \epsilon_t$;
- with lag 4: $\hat{\mu} = 0.69$, and the AR model is: $Y_t = 0.16Y_{t-[\text{illegible}]} - 0.11Y_{t-2} - 0.22Y_{t-11} + \epsilon_t$;
- with lag 5: $\hat{\mu} = 0.76$, and the AR model is: $Y_t = -0.12Y_{t-12} + \epsilon_t$.

The whitened and unwhitened spectra for ER with lag 0, ..., lag 5 are plotted in Figure 6, with solid curves representing the whitened spectra and dotted curves the unwhitened spectra. As in Section 3.1, the solid curves are flatter than the dotted ones, illustrating the effect of prewhitening the series.

Applying the same procedure to the 6 AD series leads to the following results:

- with lag 0: $\hat{\mu} = 0.62$, and the AR model is: $Y_t = -0.17Y_{t-6} + 0.23Y_{t-9} + 0.19Y_{t-12} + \epsilon_t$;
- with lag 1: $\hat{\mu} = 0.50$, and the AR model is: $Y_t = -0.11Y_{t-3} + 0.15Y_{t-12} + \epsilon_t$;
- with lag 2: $\hat{\mu} = 0.61$, and the AR model is: $Y_t = -0.17Y_{t-1} - 0.23Y_{t-4} + \epsilon_t$;
- with lag 3: $\hat{\mu} = 0.53$, and the AR model is: $Y_t = +0.15Y_{t-9} + 0.20Y_{t-10} - 0.28Y_{t-12} + \epsilon_t$;
- with lag 4: $\hat{\mu} = 0.49$, and the AR model is: $Y_t = -0.17Y_{t-7} + 0.13Y_{t-11} - 0.18Y_{t-12} + \epsilon_t$;
- with lag 5: $\hat{\mu} = 0.55$, and the AR model is: $Y_t = -0.12Y_{t-2} - 0.21Y_{t-5} - 0.15Y_{t-6} - 0.22Y_{t-9} + \epsilon_t$.

The whitened and unwhitened spectra for AD with lag 0, ..., lag 5 are plotted in Figure 7.
Again, the flatter solid curves show the effect of prewhitening.

Finally, for TSP, $\hat{\mu} = 49.55$, and the AR model is:

$$Y_t = 0.11Y_{t-9} + \epsilon_t.$$

The whitened and unwhitened spectra for TSP are plotted in Figure 8.

The coherencies between ER lag 0 and TSP, ..., ER lag 5 and TSP are plotted in Figure 9; those between AD lag 0 and TSP, ..., AD lag 5 and TSP are plotted in Figure 10. The corresponding values of $\rho^2$ are provided in Table 6.

Table 6: $\rho^2$ for ER and AD versus TSP

Lag   0       1       2       3       4       5
ER    0.099   0.152   0.120   0.135   0.157   0.152
AD    0.217   0.081   0.158   0.134   0.167   0.188

Again, we can see that the linear relationships between ER and TSP and between AD and TSP are quite weak.

4.2 The Generalized Harmonic Process Approach

Since TSP is available only every sixth day, we cannot use model (5) directly. Instead, important frequencies in ER or AD are suggested by their GHP spectra and are then incorporated into the regression against TSP. The frequencies are extracted from the spectrum of the full ER and AD data. To take into account the fact that one time unit here is equivalent to 6 time units in the full data, we have the following model:

$$Y_t = A + \sum_{\omega}\{A_\omega\sin[\omega(6t-5)] + B_\omega\cos[\omega(6t-5)]\} + \sum_{j=0}^{m}\beta_j X_{t-j} + \epsilon_t \qquad (9)$$

where the first summation is over the important frequencies selected from the GHP spectra by the selection rule: P-value < $\alpha$. Then, based on (9), we carry out model reduction according to the rule: P-value < $\alpha'$. Often, none of the X terms survived the reduction, suggesting they are not important. For the sake of completeness, we forced the X terms into the model and tested each of the coefficients via the change in deviance. The model reduction is implemented using $\alpha = \alpha' = 0.10$; but we now take $m = 5$ because this corresponds to the use of $m = 30$ before.

4.2.1 ER versus TSP

Applying the procedure to the 6 ER series leads to the following results:

- with lag 0: None of the X terms survive the reduction ($\Delta$dev = 4.85, degrees of freedom = 6).
The model with the X terms forced is:

$$Y_t = -0.77 + 0.29\sin[0.34(6t-5)] + 0.35\sin[2.13(6t-5)] - 0.31\cos[1.68(6t-5)] + 0.0048X_t - 0.0019X_{t-1} + 0.0047X_{t-2} - 0.0024X_{t-3} - 0.0020X_{t-4} + 0.0052X_{t-5}.$$

The deviance is 128.66, with 106 degrees of freedom (P-value = 0.066). The z-scores for the coefficients of the X terms are: 1.15, -0.44, 1.17, -0.54, -0.45, 1.29. Tests of significance for the X terms are summarized in Table 7.

Table 7: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.28    0.20        1.33        0.30        0.21        1.59
P-value       0.26    0.65        0.25        0.58        0.65        0.21

- with lag 1: None of the X terms survive the reduction ($\Delta$dev = 7.47, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.85 + 0.29\sin[0.34(6t-5)] + 0.47\cos[0.90(6t-5)] - 0.0039X_t + 0.0040X_{t-1} + 0.0021X_{t-2} - 0.0061X_{t-3} + 0.0017X_{t-4} + 0.0061X_{t-5}.$$

The deviance is 117.45, with 107 degrees of freedom (P-value = 0.402). The z-scores for the coefficients of the X terms are: -0.99, 1.14, 0.57, -1.34, 0.41, 1.67. Tests of significance for the X terms are summarized in Table 8.

Table 8: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.00    1.26        0.31        1.86        0.16        2.65
P-value       0.32    0.26        0.58        0.17        0.69        0.10

- with lag 2: The model with the X terms forced is:

$$Y_t = -1.23 + 0.39\sin[0.34(6t-5)] + 0.38\sin[0.90(6t-5)] - 0.30\sin[2.13(6t-5)] - 0.29\cos[0.11(6t-5)] + 0.40\cos[1.12(6t-5)] - 0.0078X_t + 0.0024X_{t-1} + 0.0033X_{t-2} + 0.0062X_{t-3} + 0.0082X_{t-4} + 0.0024X_{t-5}.$$

The deviance is 123.97, with 104 degrees of freedom (P-value = 0.088). The z-scores for the coefficients of the X terms are: -1.72, 0.59, 0.88, 1.51, 1.87, 0.52. Tests of significance for the X terms are summarized in Table 9.
Table 9: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    3.11    0.35        0.77        2.22        3.36        0.27
P-value       0.08    0.55        0.38        0.14        0.07        0.60

- with lag 3: None of the X terms survive the reduction ($\Delta$dev = 4.21, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.30 + 0.33\sin[0.34(6t-5)] - 0.35\cos[0.34(6t-5)] - 0.31\cos[2.13(6t-5)] - 0.0041X_t + 0.0058X_{t-1} + 0.0018X_{t-2} - 0.0005X_{t-3} - 0.0045X_{t-4} + 0.0000X_{t-5}.$$

The deviance is 123.93, with 106 degrees of freedom (P-value = 0.113). The z-scores of the coefficients of the X terms are: -0.99, 1.49, 0.44, -0.12, -0.99, 0.01. Tests of significance for the X terms are summarized in Table 10.

Table 10: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.02    2.14        1.93        0.15        1.00        0.00
P-value       0.31    0.14        0.66        0.90        0.32        0.99

- with lag 4: The model with the X terms forced is:

$$Y_t = 0.50 - 0.33\cos[0.34(6t-5)] + 0.28\cos[2.13(6t-5)] - 0.0016X_t - 0.0022X_{t-1} + 0.0008X_{t-2} - 0.0005X_{t-3} - 0.0111X_{t-4} - 0.0052X_{t-5}.$$

The deviance is 124.84, with 107 degrees of freedom (P-value = 0.115). The z-scores of the coefficients of the X terms are: -0.37, -0.49, 0.20, -0.12, -2.17, -1.09. Tests of significance for the X terms are summarized in Table 11.

Table 11: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.13    0.24        0.03        0.01        5.17        1.23
P-value       0.72    0.62        0.86        0.92        0.02        0.27

- with lag 5: None of the X terms survive the reduction ($\Delta$dev = 6.27, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.23 + 0.0016X_t - 0.0033X_{t-1} + 0.0064X_{t-2} + 0.0028X_{t-3} + 0.0039X_{t-4} + 0.0064X_{t-5}.$$

The deviance is 118.68, with 109 degrees of freedom (P-value = 0.248). The z-scores of the coefficients of the X terms are: 0.38, -0.78, 1.62, 0.67, 0.94, 1.59. Tests of significance for the X terms are summarized in Table 12.
Table 12: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.14    0.62        2.54        0.44        0.85        2.42
P-value       0.71    0.43        0.11        0.51        0.36        0.12

None of the above analyses suggests any clear positive association between ER and TSP. In fact, the only P-value smaller than 0.05 is in Table 11 and corresponds to a negative coefficient. The other two small P-values, 0.08 and 0.07, appear in Table 9 and correspond to a negative and a positive coefficient respectively. But those are only 3 out of 36, and may very well be due to random variation since, roughly speaking, even if there were no relationship, around 4 of these 36 P-values (about 10%) would be expected to fall below 0.10.

4.2.2 AD versus TSP

Similarly, examination of AD versus TSP leads to the following results:

- with lag 0: None of the X terms survive the reduction ($\Delta$dev = 5.94, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.51 + 0.34\sin[2.46(6t-5)] - 0.0055X_t - 0.0049X_{t-1} + 0.0011X_{t-2} - 0.0018X_{t-3} + 0.0051X_{t-4} + 0.0060X_{t-5}.$$

The deviance is 108.74, with 108 degrees of freedom (P-value = 0.462). The z-scores of the coefficients of the X terms are: -1.15, -1.07, 0.25, -0.38, 1.18, 1.36. Tests of significance for the X terms are summarized in Table 13.

Table 13: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.38    1.21        0.06        0.15        1.34        1.77
P-value       0.24    0.27        0.80        0.70        0.25        0.18

- with lag 1: None of the X terms survive the reduction ($\Delta$dev = 2.39, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.63 + 0.41\sin[2.24(6t-5)] + 0.34\sin[2.80(6t-5)] + 0.42\cos[1.34(6t-5)] - 0.35\cos[2.46(6t-5)] + 0.0037X_t + 0.0036X_{t-1} + 0.0036X_{t-2} + 0.0017X_{t-3} - 0.0008X_{t-4} + 0.0042X_{t-5}.$$

The deviance is 96.40, with 105 degrees of freedom (P-value = 0.714). The z-scores of the coefficients of the X terms are: 0.75, 0.73, 0.68, 0.33, -0.15, 0.82. Tests of significance for the X terms are summarized in Table 14.
Table 14: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.55    0.52        0.46        0.10        0.02        0.66
P-value       0.46    0.47        0.50        0.75        0.88        0.42

- with lag 2: None of the X terms survive the reduction ($\Delta$dev = 2.43, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.50 + 0.41\sin[0.56(6t-5)] + 0.35\sin[1.23(6t-5)] - 0.41\sin[2.46(6t-5)] - 0.37\cos[0.56(6t-5)] + 0.0028X_t - 0.0013X_{t-1} - 0.0053X_{t-2} + 0.0027X_{t-3} + 0.0004X_{t-4} - 0.0023X_{t-5}.$$

The deviance is 95.03, with 105 degrees of freedom (P-value = 0.747). The z-scores of the coefficients of the X terms are: 0.61, -0.25, -1.07, 0.54, 0.07, -0.47. Tests of significance for the X terms are summarized in Table 15.

Table 15: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.37    0.06        1.19        0.29        0.00        0.22
P-value       0.54    0.80        0.28        0.59        0.95        0.64

- with lag 3: None of the X terms survive the reduction ($\Delta$dev = 3.18, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.09 - 0.35\cos[1.34(6t-5)] + 0.0028X_t + 0.0009X_{t-1} + 0.0012X_{t-2} - 0.0003X_{t-3} + 0.0074X_{t-4} - 0.0015X_{t-5}.$$

The deviance is 114.70, with 108 degrees of freedom (P-value = 0.311). The z-scores of the coefficients of the X terms are: 0.61, 0.20, -0.24, -0.05, 1.59, -0.30. Tests of significance for the X terms are summarized in Table 16.

Table 16: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.36    0.04        0.06        0.00        2.41        0.09
P-value       0.55    0.84        0.81        0.96        0.12        0.76

- with lag 4: None of the X terms survive the reduction ($\Delta$dev = 3.85, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.63 - 0.40\cos[0.56(6t-5)] + 0.0047X_t - 0.0017X_{t-1} - 0.0047X_{t-2} + 0.0033X_{t-3} + 0.0012X_{t-4} - 0.0063X_{t-5}.$$

The deviance is 94.27, with 108 degrees of freedom (P-value = 0.824). The z-scores of the coefficients of the X terms are: 0.97, -0.34, -0.85, 0.65, 0.23, -1.09. Tests of significance for the X terms are summarized in Table 17.
Table 17: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.91    0.06        1.02        0.42        0.05        1.26
P-value       0.34    0.80        0.31        0.52        0.82        0.26

- with lag 5: None of the X terms survive the reduction ($\Delta$dev = 4.16, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.47 + 0.0028X_t - 0.0004X_{t-1} - 0.0054X_{t-2} + 0.0022X_{t-3} + 0.0028X_{t-4} - 0.0081X_{t-5}.$$

The deviance is 97.86, with 109 degrees of freedom (P-value = 0.769). The z-scores of the coefficients of the X terms are: 0.59, -0.08, -0.97, 0.43, 0.57, -1.38. Tests of significance for the X terms are summarized in Table 18.

Table 18: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.34    0.01        0.99        0.18        0.31        2.04
P-value       0.56    0.94        0.32        0.67        0.57        0.15

None of the above analyses (Tables 13-18) suggests any positive association between AD and TSP; in fact, all the P-values are larger than 0.1.

In neither the analyses of ER vs TSP nor those of AD vs TSP do we see any clear indication of positive association between the health series and the TSP series.

4.3 The Generalized Autoregressive Model Approach

We shall use model (6) in Section 2.2.3 with $p = 3$, $q = 5$, since in the present case this is approximately equivalent to the case $p = 20$, $q = 30$ used when dealing with TRS. Again, in many of the following fits, none of the X terms survive the model reduction. We shall force the X terms into the model and test each via the change in deviance.

4.3.1 ER versus TSP

Applying the procedure to the 6 ER series leads to the following results:

- with lag 0: None of the X terms survive the reduction ($\Delta$dev = 4.16, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.22 - 0.30Y_{t-1} + 0.0022X_t - 0.0018X_{t-1} + 0.0023X_{t-2} - 0.0020X_{t-3} - 0.0014X_{t-4} + 0.0028X_{t-5}.$$

The deviance is 135.58, with 108 degrees of freedom (P-value = 0.374). The z-scores of the coefficients of the X terms are: 0.56, -0.43, 0.60, -0.47, -0.34, 0.73.
Tests of significance for the X terms are summarized in Table 19.

Table 19: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.29    0.19        0.36        0.22        0.12        0.52
P-value       0.59    0.67        0.55        0.64        0.73        0.47

- with lag 1: The model with the X terms forced is:

$$Y_t = -0.22 - 0.26Y_{t-3} - 0.0020X_t + 0.0055X_{t-1} + 0.0006X_{t-2} - 0.0073X_{t-3} + 0.0022X_{t-4} + 0.0061X_{t-5}.$$

The deviance is 119.21, with 108 degrees of freedom (P-value = 0.217). The z-scores of the coefficients of the X terms are: -0.51, 1.60, 0.17, -1.70, 0.54, 1.65. Tests of significance for the X terms are summarized in Table 20.

Table 20: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.27    2.46        0.03        3.09        0.29        2.62
P-value       0.61    0.12        0.86        0.08        0.59        0.11

- with lag 2: None of the X terms survive the reduction ($\Delta$dev = 7.57, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.08 + 0.28Y_{t-1} - 0.0049X_t + 0.0051X_{t-1} + 0.0053X_{t-2} + 0.0053X_{t-3} + 0.0032X_{t-4} - 0.0036X_{t-5}.$$

The deviance is 139.50, with 108 degrees of freedom (P-value = 0.022). The z-scores of the coefficients of the X terms are: -1.11, 1.23, 1.35, 1.29, 0.76, -0.79. Tests of significance for the X terms are summarized in Table 21.

Table 21: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.28    1.48        1.76        1.62        0.56        0.64
P-value       0.26    0.22        0.18        0.20        0.45        0.42

- with lag 3: None of the X terms survive the reduction ($\Delta$dev = 3.62, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.046 - 0.0034X_t + 0.0055X_{t-1} - 0.0007X_{t-2} - 0.0008X_{t-3} - 0.0031X_{t-4} - 0.0018X_{t-5}.$$

The deviance is 137.65, with 109 degrees of freedom (P-value = 0.033). The z-scores of the coefficients of the X terms are: 0.81, 1.46, -0.19, -0.22, -0.74, -0.44. Tests of significance for the X terms are summarized in Table 22.
Table 22: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.67    2.06        0.04        0.05        0.56        0.19
P-value       0.41    0.15        0.84        0.83        0.45        0.66

- with lag 4: The model with the X terms forced is:

$$Y_t = +0.12 + 0.22Y_{t-1} - 0.0017X_t - 0.0003X_{t-1} + 0.0007X_{t-2} - 0.0006X_{t-3} - 0.0093X_{t-4} - 0.0038X_{t-5}.$$

The deviance is 129.03, with 108 degrees of freedom (P-value = 0.082). The z-scores of the coefficients of the X terms are: -0.38, -0.06, 0.17, -0.13, -1.84, -0.80. Tests of significance for the X terms are summarized in Table 23.

Table 23: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.15    0.00        0.03        0.02        3.69        0.67
P-value       0.70    0.95        0.87        0.90        0.05        0.41

- with lag 5: None of the X terms survive the reduction ($\Delta$dev = 6.27, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.23 + 0.0064X_t - 0.0033X_{t-1} + 0.0063X_{t-2} + 0.0028X_{t-3} + 0.0039X_{t-4} + 0.0063X_{t-5}.$$

The deviance is 118.68, with 109 degrees of freedom (P-value = 0.258). The z-scores of the coefficients of the X terms are: 0.38, -0.78, 1.62, 0.67, 0.94, 1.59. Tests of significance for the X terms are summarized in Table 24.

Table 24: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.15    0.63        2.55        0.45        0.85        2.43
P-value       0.70    0.43        0.11        0.50        0.36        0.12

None of the above analyses (Tables 19-24) suggests any clear positive association between ER and TSP. The only small P-value is the 0.05 in Table 23, which corresponds to a negative coefficient. However, as argued before, this may well be due to random variation.

4.3.2 AD versus TSP

Similarly, examination of AD versus TSP leads to the following results:

- with lag 0: None of the X terms survive the reduction ($\Delta$dev = 4.37, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.47 - 0.0041X_t - 0.0045X_{t-1} - 0.0005X_{t-2} - 0.0004X_{t-3} + 0.0045X_{t-4} + 0.0048X_{t-5}.$$

The deviance is 112.47, with 109 degrees of freedom (P-value = 0.391).
The z-scores of the coefficients of the X terms are: -0.89, -0.99, -0.13, -0.09, 1.04, 1.11. Tests of significance for the X terms are summarized in Table 25.

Table 25: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.82    1.01        0.02        0.01        1.05        1.19
P-value       0.36    0.31        0.89        0.93        0.31        0.28

- with lag 1: None of the X terms survive the reduction ($\Delta$dev = 1.51, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -1.27 + 0.0032X_t + 0.0040X_{t-1} + 0.0016X_{t-2} + 0.0001X_{t-3} - 0.0003X_{t-4} + 0.0030X_{t-5}.$$

The deviance is 112.17, with 109 degrees of freedom (P-value = 0.398). The z-scores of the coefficients of the X terms are: 0.67, 0.84, 0.33, 0.01, -0.06, 0.59. Tests of significance for the X terms are summarized in Table 26.

Table 26: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.44    0.69        0.10        0.00        0.00        0.35
P-value       0.51    0.41        0.75        0.99        0.95        0.55

- with lag 2: None of the X terms survive the reduction ($\Delta$dev = 0.93, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.22 + 0.0010X_t - 0.0002X_{t-1} - 0.0034X_{t-2} - 0.0001X_{t-3} - 0.0004X_{t-4} - 0.0026X_{t-5}.$$

The deviance is 114.19, with 109 degrees of freedom (P-value = 0.348). The z-scores of the coefficients of the X terms are: 0.23, -0.03, -0.73, -0.02, -0.09, -0.54. Tests of significance for the X terms are summarized in Table 27.

Table 27: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.05    0.00        0.55        0.00        0.01        0.30
P-value       0.82    0.97        0.46        0.99        0.93        0.59

- with lag 3: The model with the X terms forced is:

$$Y_t = -1.09 + 0.0029X_t + 0.0016X_{t-1} - 0.0015X_{t-2} - 0.0008X_{t-3} + 0.0008X_{t-4} - 0.0014X_{t-5}.$$

The deviance is 118.40, with 109 degrees of freedom (P-value = 0.253). The z-scores of the coefficients of the X terms are: 0.66, 0.35, -0.31, -0.16, 1.78, -0.28. Tests of significance for the X terms are summarized in Table 28.
Table 28: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.42    0.12        0.10        0.03        3.02        0.08
P-value       0.52    0.72        0.76        0.87        0.08        0.78

- with lag 4: None of the X terms survive the reduction ($\Delta$dev = 4.16, degrees of freedom = 6). The model with the X terms forced is:

$$Y_t = -0.47 + 0.0028X_t - 0.0004X_{t-1} - 0.0054X_{t-2} + 0.0022X_{t-3} + 0.0028X_{t-4} - 0.0081X_{t-5}.$$

The deviance is 97.86, with 109 degrees of freedom (P-value = 0.769). The z-scores of the coefficients of the X terms are: 0.59, -0.08, -0.97, 0.43, 0.57, -1.38. Tests of significance for the X terms are summarized in Table 29.

Table 29: Test of Significance for X Terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    0.34    0.01        0.99        0.18        0.31        2.04
P-value       0.56    0.94        0.32        0.67        0.57        0.15

- with lag 5: The model with the X terms forced is:

$$Y_t = 0.32 - 0.36Y_{t-2} - 0.0071X_t - 0.0091X_{t-1} + 0.0033X_{t-2} - 0.0059X_{t-3} + 0.0017X_{t-4} - 0.0010X_{t-5}.$$

The deviance is 107.44, with 108 degrees of freedom (P-value = 0.497). The z-scores of the coefficients of the X terms are: -1.72, -1.35, -1.75, 0.73, -1.12, 0.36. Tests of significance for the X terms are summarized in Table 30.

Table 30: Test of Significance for X terms

              $X_t$   $X_{t-1}$   $X_{t-2}$   $X_{t-3}$   $X_{t-4}$   $X_{t-5}$
$\chi^2_1$    1.94    3.35        0.52        1.32        0.13        0.05
P-value       0.16    0.07        0.47        0.25        0.72        0.83

None of the above analyses suggests any clear positive association between AD and TSP. There are two moderately small P-values: 0.08 in Table 28 and 0.07 in Table 30, corresponding to a positive and a negative parameter respectively. Again, as argued before, these may well be due to random variation.

Both GHP and GAR seem to reach the same conclusion: the available data suggest no clear positive association between the health series and the TSP series. Of course, this might be due primarily to the fact that TSP is collected only every sixth day. This makes the effective length of the series one-sixth the length of that for TRS.
Relationships between the health series and TSP would therefore have to be much stronger to be identified with the same level of confidence.

5 Final Remarks

Analyses via both GHP and GAR seem to suggest weak positive associations of both AD and ER with TRS. The hospital admission counts seem to be affected by TRS levels 4 days earlier. The emergency room visit counts seem to be affected by changes in the levels of TRS, as well as by the level of TRS. However, without subject matter knowledge to support the peculiar structure of models (7) and (8), the second statement should perhaps be considered a conjecture rather than a conclusion. On the other hand, the results of Section 4 provide no clear indication of positive association between either ER and TSP or AD and TSP. This suggests either that TSP is less influential (on asthma) or that the way TSP is collected might disguise the association, as was discussed in Section 4.

All three methods used in this study focus on two points: taking account of the dynamics within each series, and studying the relationship between two series of data. The spectral analysis approach provides a preliminary exploration of the frequency pattern of each series as well as an approximate measure of the strength of the linear relationship between two time series. However, since the health data are discrete, this approach may not be entirely appropriate. The generalized harmonic process approach and the generalized autoregressive model approach both allow us to model discrete time series and their relationships with air pollution covariates. Both approaches allow identification of the form of the relationships and provide statistical inferences such as tests of the importance of the covariates. The GHP approach focuses on capturing seasonality in the data, whereas the GAR approach focuses on describing the "memory", or "inertia", of the data.

Other approaches could be applied to this data analysis.
For example, one could consider the daily counts as a binary time series consisting of only 0 (a zero count) and 1 (a positive count). Since the actual counts are mostly 0 and 1, very little information would be lost, and the methodology of binary time series could then be applied (see [1]). One could also apply the "parameter-driven model" discussed in [7]. Here, the counts are conditioned on a "latent process", and the dynamics within the count series are accounted for in the correlation structure of the latent process. In dealing with TSP, another approach would be to sum or average the asthma counts within each six-day period, and then study the relation between the TSP levels and these summed or averaged counts.

In future studies, we suggest that the TSP level be recorded every day, instead of only every sixth day, for the reason described in Section 4. In addition, a larger set of ER and AD data should be collected, possibly by including more hospitals and a longer time period.

Bibliography

[1] Kedem, Benjamin. (1980). "Binary Time Series". Marcel Dekker, Inc.
[2] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society (B), 39, 1-38.
[3] Knight, K., Leroux, B., Millar, J., Petkau, J. (1988). Air pollution and human health: A study based on hospital admissions data from Prince George, British Columbia. Report prepared under Contract No. 1754 with the Health Protection Branch, Department of National Health and Welfare, Ottawa, Ontario.
[4] Knight, K., Leroux, B., Millar, J., Petkau, J. (1988). Air pollution and human health: A study based on emergency room visits data from Prince George, British Columbia. Report prepared under Contract No. 1977 with the Health Protection Branch, Department of National Health and Welfare, Ottawa, Ontario.
[5] McCullagh, P. and Nelder, J.A. (1983). "Generalized Linear Models". Chapman and Hall, London.
[6] Priestley, M.B. (1981). "Spectral Analysis and Time Series". Academic Press, London.
[7] Zeger, S.L. (1988). A Regression Model for Time Series of Counts. Biometrika, 75, 4, pp. 621-9.

Figure 1. Histograms of ER and AD. Panels: (a) Histogram of ER; (b) Histogram of AD.

Figure 2. Comparison of Whitened vs Unwhitened Spectra. Panels: (a) For ER; (c) For TRS. Horizontal axis: frequency.

Figure 3. Coherencies between Whitened Series.

Figure 4. Proposed Spectra of ER. Panels: (a) Spectrum of ER via GHP (horizontal axis: frequency); (b) Proposal 2 Spectrum of ER (horizontal axis: frequency); (c) Proposal 1 Spectrum of ER (horizontal axis: days).

Figure 5. Proposed Spectra of AD. Panels: (a) Spectrum of AD via GHP (horizontal axis: frequency); (b) Proposal 2 Spectrum of AD (horizontal axis: frequency); (c) Proposal 1 Spectrum of AD (horizontal axis: days).

Figure 6. Comparison between the Whitened and Unwhitened Spectra of ER. Panels: (a) Lag 0; (b) Lag 1; (c) Lag 2; (d) Lag 3; (e) Lag 4; (f) Lag 5. Horizontal axis: frequency.

Figure 7. Comparison between the Whitened and Unwhitened Spectra of AD. Panels: (a) Lag 0; (b) Lag 1; (c) Lag 2; (e) Lag 4. Horizontal axis: frequency.

Figure 8. Comparison between the Whitened and Unwhitened Spectra of TSP. Horizontal axis: frequency.

Figure 9. Coherencies between Whitened ER and Whitened TSP. Panels: (a) Lag 0; (e) Lag 4. Horizontal axis: frequency.

Figure 10. Coherencies between Whitened AD and Whitened TSP. Panels: (a) AD with Lag 0; (e) AD with Lag 4. Horizontal axis: frequency.
