@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en, "Statistics, Department of"@en ; edm:dataProvider "DSpace"@en ; ns0:degreeCampus "UBCV"@en ; dcterms:creator "Yee, Irene Mei Ling"@en ; dcterms:issued "2010-09-09T21:11:39Z"@en, "1988"@en ; vivo:relatedDegree "Master of Science - MSc"@en ; ns0:degreeGrantor "University of British Columbia"@en ; dcterms:description "A Poisson process is a common model for count data. However, a global Poisson model is inadequate for sparse data such as the marked salmon recovery data, which have huge extraneous variation and noise. An empirical Bayes model, which enables information to be aggregated to overcome the lack of information from data in individual cells, is thus developed to handle these data. The method fits a local parametric Poisson model to describe the variation at each sampling period and incorporates this approach with a conventional local smoothing technique to remove noise. Finally, the overdispersion relative to the Poisson model is modelled by mixing these locally smoothed Poisson models in an appropriate way.
This method is then applied to the marked salmon data to obtain the overall patterns and the corresponding credibility intervals for the underlying trend in the data."@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/28360?expand=metadata"@en ; skos:note "LOCAL PARAMETRIC POISSON MODELS FOR FISHERIES DATA

by

IRENE MEI LING YEE
B.Sc., THE UNIVERSITY OF BRITISH COLUMBIA, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Statistics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September, 1988
© IRENE MEI LING YEE, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics
The University of British Columbia
Vancouver, Canada
Date: Sept., 1988

DE-6 (2/88)

ABSTRACT

A Poisson process is a common model for count data. However, a global Poisson model is inadequate for sparse data such as the marked salmon recovery data, which have huge extraneous variation and noise. An empirical Bayes model, which enables information to be aggregated to overcome the lack of information from data in individual cells, is thus developed to handle these data. The method fits a local parametric Poisson model to describe the variation at each sampling period and incorporates this approach with a conventional local smoothing technique to remove noise.
Finally, the overdispersion relative to the Poisson model is modelled by mixing these locally smoothed Poisson models in an appropriate way. This method is then applied to the marked salmon data to obtain the overall patterns and the corresponding credibility intervals for the underlying trend in the data.

TABLE OF CONTENTS

ABSTRACT ... ii
TABLE OF CONTENTS ... iii
LIST OF TABLES ... v
LIST OF FIGURES ... vi
ACKNOWLEDGEMENTS ... x
1. INTRODUCTION ... 1
2. BACKGROUND REVIEW ... 4
 2.1 The Benchmark Data Set ... 4
 2.2 Statistical Techniques ... 6
  2.2.1 Negative binomial and mixed Poisson regression ... 6
  2.2.2 Using SABL to decompose time series data ... 8
  2.2.3 Time series analysis of a contagious process ... 9
  2.2.4 Smoothing techniques ... 11
   i. Estimating smooth functions by the local scoring algorithm ... 11
   ii. Bayesian nonparametric smoothing method for locally regular processes ... 14
  2.2.5 Empirical Bayes (EB) and Hierarchical Bayes (HB) analyses ... 17
   i. Introduction ... 17
   ii. Empirical Bayes (EB) analysis ... 18
   iii. Hierarchical Bayes (HB) analysis ... 19
   iv. Comparison of EB and HB ... 21
3. DATA EXPLORATION ... 22
 3.1 Introduction ... 22
 3.2 Benchmark Release and Recovery Data Sets ... 23
  3.2.1 Missing values in the two benchmark data subsets ... 23
   i. Release data ... 23
   ii. Recovery data ... 24
  3.2.2 The structure of observed recoveries ... 24
 3.3 Other Related Information and Data Sets ... 27
4. EMPIRICAL BAYES APPROACH FOR MODELLING COUNT DATA ... 30
 4.1 Introduction ... 30
 4.2 Local Parametric Poisson Models with Smoothing Techniques ... 31
 4.3 Inference on the Parameters of Interest ... 39
5. APPLICATION ... 45
 5.1 Introduction ... 45
 5.2 Problems Encountered when Modelling the Salmon Recovery Data ... 46
  5.2.1 Missing values ... 46
  5.2.2 The edge effect ... 48
 5.3 Fitting the Proposed Models to the Selected Data Sets ... 48
 5.4 Conclusion ... 56
REFERENCES ... 58
APPENDIX.
The Mark Recovery Program (MRP) Database ... 60

LIST OF TABLES

TABLE
3.1 The tag codes found in the benchmark data subset ... 64
3.2 Summary of data fields for the benchmark release data subset ... 65
3.3 Summary list of chinook data fields for the benchmark rollup recovery subset ... 66
3.4 Table for computing \"Period\" from statistical week (MMW) ... 67
3.5 List of catch region codes and names ... 68
3.6 A summary of catch regions with observed recoveries ... 69
3.7 A sample list of coho release replicates that are classified according to size ... 70

LIST OF FIGURES

FIGURE
3.1a Size of chinook release for tag codes from the benchmark release data subset ... 71
3.1b Size of coho release for tag codes from the benchmark release data subset ... 71
3.2 Chinook observed recoveries over the recovery period considered (tag code: 021827, brood year: 1979, recovery years: 1981 to 1984) ... 72
3.2a Commercial and sport observed recoveries ... 72
3.2b Commercial observed recoveries ... 72
3.2c Sport observed recoveries ... 72
3.3 Coho observed recoveries over the recovery period considered (tag code: 081842, brood year: 1979, recovery years: 1981 to 1982) ... 73
3.3a Commercial observed recoveries ... 73
3.3b Sport observed recoveries ... 73
3.4a Plot of cumulative sum of chinook commercial observed recoveries over time (tag code: 021827) ... 74
3.4b Plot of cumulative sum of coho commercial observed recoveries over time (tag code: 081842) ... 74
3.5 Plots of chinook commercial observed recoveries over the sampling period (tag code: 021827, brood year: 1979) ... 75
3.6 Plots of coho commercial observed recoveries over the sampling period (tag code: 081842, brood year: 1979) ... 76
5.1a Zeta(t) for coho (Hatchery: Quinsam, brood year: 1979, size at release: medium) ... 77
5.1b Transformed zeta(t) (power = 0.25) ... 77
5.1c Trend for the zeta(t) ... 77
5.2a Zeta(t) for coho
(Hatchery: Capilano, brood year: 1980, size at release: medium) ... 78
5.2b Trend for the zeta(t) ... 78
5.3a Zeta(t) for chinook tag code 021827 ... 79
5.3b Trend for the zeta(t) ... 79
5.4 The trends of zeta's for Quinsam coho from different brood years ... 80
5.5 The trends of zeta's for Capilano coho from different brood years ... 81
5.6 The trends of zeta's for the three chinook tag codes ... 82
5.7 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Quinsam, size at release: large) ... 83
5.8 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Quinsam, size at release: medium) ... 84
5.9 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Quinsam, size at release: small) ... 85
5.10 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Capilano, size at release: large) ... 86
5.11 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Capilano, size at release: medium) ... 87
5.12 The estimated recovery intensity of coho for each of the 4 catch regions (Hatchery: Capilano, size at release: small) ... 88
5.13 The estimated recovery intensity of chinook for each of the 3 trolling regions ... 89
5.14 Estimated recovery intensities of coho and the corresponding 95% credibility intervals (Hatchery: Quinsam, brood year: 1979, size at release: medium) ... 90
5.15 Estimated recovery intensities of coho and the corresponding 95% credibility intervals (Hatchery: Capilano, brood year: 1980, size at release: medium) ... 91
5.16 Estimated recovery intensities of chinook and the corresponding 95% credibility intervals (tag code: 021827, brood year: 1979) ... 92

ACKNOWLEDGEMENTS

I would like to thank Dr. James Zidek for his guidance, assistance and encouragement in producing this thesis. I would like to express my gratitude to Dr. Mohan Delampady for his helpful comments and careful reading of this work.
I am indebted to Dr. Jon Schnute from the Pacific Biological Station for providing the data sets and offering advice related to these data. I am also grateful to the technical staff members at the Station, in particular Brian Kuhn, for producing ancillary data sets when their importance emerged in the course of the analysis. I am very grateful for the comments and suggestions from Dr. Keith Knight and Dr. Jian Liu. I would also like to thank my parents and my brothers for their support and encouragement throughout the years. The financial support of the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged.

1. INTRODUCTION

This thesis develops an empirical Bayes model for marked salmon data collected over time. The method, which employs a hierarchical prior distribution, is used because the data are sparse and the empirical Bayes approach enables information to be aggregated to overcome the lack of information from data in individual cells. The novelty of our approach lies in our use of locally parametric Poisson models and smoothing techniques to obtain estimates of the underlying trend in the tagged salmon data.

Over the years, data on the return of tagged salmon have been collected and prepared for the Mark Recovery Program (MRP) database. This database, which is described in detail in the Appendix, consists of the release data on tagged and untagged salmon, the data on individual marked salmon observed when returning from the ocean for spawning, and data on the sampling periods for each of the fishing regions. Since the database contains a vast amount of information, only selected sample data sets, such as the benchmark data sets, are analyzed here. Other data sets are also formatted like the benchmark data because this benchmark is well documented. With these data, various questions can be posed and investigated.
One topic of interest is the relationship between the size of smolts at release and their return rate as measured by observed marked salmon counts. Another is the comparison of marked salmon counts from different brood years and fishing regions. In this study, only the marked recoveries are examined. In addition, only two species of salmon, chinook and coho, are considered.

In tackling the two problems of interest described above, we first develop a model for the observed fish counts. The Poisson model is a conventional choice for count data. However, it will not be adequate for 'noisy' data with large sampling variation. Our solution to this problem adopts a local Poisson model to describe the variation at each sampling period. Noise is removed by local smoothing. Finally, the overdispersion relative to the Poisson model is modelled by mixing these locally smoothed Poisson models in an appropriate way.

In Chapter 2, a brief description of the benchmark data sets is given. In addition, some relevant recent studies are summarized for completeness and later comparison or use. We discuss modelling Poisson processes with overdispersion, time series techniques for evaluating long-term trend effects, models for handling contagious or self-inhibiting processes, a local smoothing procedure for obtaining nonlinear regression estimates, and a Bayesian nonparametric smoothing method for modelling locally regular processes. Finally, the empirical Bayes method with hierarchical priors, the basis of this thesis, is reviewed.

To obtain insight for further investigation, the data are carefully examined in Chapter 3. The results indicate that some data pooling might be desirable to partially integrate the separate models for the marked recoveries observed in each catch region.
The data from commercial fisheries appear to be more reliable and consistent than those from sport fisheries and escapement; thus, only the commercial data are used in the modelling stage of our analysis. The local parametric Poisson models are developed in Chapter 4. Smoothing techniques are also developed there for removing noise and estimating long-term trends in the data. The main inferences are estimates of the Poisson intensity functions and the calculation of their corresponding credibility intervals. Finally, in Chapter 5, the proposed models are fitted to selected coho and chinook data sets. A summary of the estimates and the corresponding credibility intervals is given. The problems of missing values and edge effects are also addressed there.

2. BACKGROUND REVIEW

2.1 The Benchmark Data Set

The benchmark data set is established at the Pacific Biological Station (PBS), a research branch of the Canadian Department of Fisheries and Oceans (DFO) in Nanaimo. The tag codes in this benchmark, which are obtained from the MRP database, form a sample data set for statistical analyses and for exchanging data with other agencies. Complete documentation, including the selected formats and information related to the tag codes, is available in the 1986 report 'A Canadian MRP Data Benchmark'. The benchmark consists of release data on tagged and associated juvenile salmon, and recovery data on adult marked salmon for selected tag codes. Data related to the sampling periods for recovering marked salmon are obtained from the MRP database directly. The following is a brief description of these data.

The benchmark release data contain a code for each release group of juvenile salmon. In particular, the origin, age, and average size of the fish, the number of tagged and associated fish, as well as the site and date of the release, are included for each group. When adult fish are recovered, not every fish is inspected for marks.
Different recovery methods have different sampling and reporting procedures associated with them. These recovery methods are mainly of three types: i. commercial fisheries, ii. sport fisheries, and iii. escapement (fish that are not captured by any fishery). Each benchmark recovery data record includes the code found on each recovered and tagged salmon, and the time, region and method of recovery. Times are usually recorded as year, month and statistical week (about 5 per calendar month). The recovery regions are geographic catch regions divided according to each of the fishing methods: troll, net, and sport. For each marked recovery there is one record, except for escapement data. Thus, redundant sample information may appear on numerous records of individual tagged fish from the same sample. Fortunately, the fields of each data record are organized in such a way that data for an entire fish sample can easily be obtained. The period for observing marked salmon is different in each catch region; thus, the data on sampling periods (described later in Section 3.2) are important for determining whether a record is missing because of no sampling, or simply because there is no recovery during that period. Therefore, together with the recovery data, the distribution of marked recoveries in each catch region and the abundance of each group of tagged salmon can be observed over time. With all these data, many related questions can be tackled.

2.2 Statistical Techniques

2.2.1 Negative binomial and mixed Poisson regression

In this subsection we describe, for completeness and comparison, a model which bears some resemblance to that adopted in this thesis. However, it seems less flexible than ours and so has been set aside during the current investigation. Suppose the response variable Y, a count, and a vector x of explanatory variables are specified.
In general, let U | V denote the conditional distribution of U given V, where U and V are any two random variables with a joint distribution. Then a Poisson model for the response is as follows: Y | x is Poisson distributed with mean $\mu(x)$, where $\mu(x)$ is to be estimated. Very often data exhibit extra-variation or overdispersion relative to the proposed Poisson model. For count data with no covariates, the negative-binomial distribution is a popular choice for handling the extra-Poisson variation. To handle covariates, this result can be generalized to

$P(Y = y \mid x) = \dfrac{\Gamma(y + a^{-1})}{y!\,\Gamma(a^{-1})} \left( \dfrac{a\mu(x)}{1 + a\mu(x)} \right)^{y} \left( \dfrac{1}{1 + a\mu(x)} \right)^{a^{-1}}, \quad y = 0, 1, \ldots,$  (2.1)

where $a > 0$ is called the index or dispersion parameter. The mean and variance of Y given x are $E(Y \mid x) = \mu(x)$ and $\mathrm{Var}(Y \mid x) = \mu(x) + a\,\mu(x)^2$. Note that (2.1) yields the Poisson model if $a \to 0$.

Lawless (1987) studies these negative-binomial models and examines their properties in detail. He reviews the maximum likelihood and moment estimation procedures for estimating the dispersion parameter and the regression parameters. In addition, he compares the asymptotic covariance structures, efficiency and robustness of the parameters estimated by these two methods. Since Poisson regression models are very useful, a test of the Poisson hypothesis is often of interest. One method is to test $a = 0$ within the negative-binomial model. Lawless suggests some useful statistics, such as the likelihood ratio and the standardized dispersion statistic, for testing this hypothesis. He also gives a note of caution that the result of any test depends on the size of the sample and on $\mu(x)$.

2.2.2 Using SABL to decompose time series data

A method which is extensively used in this thesis will now be described. Suppose observations of a time series are taken at equally spaced time-points and the problem of interest is that of determining the long-term trend in the deseasonalized series.
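Returning for a moment to the negative-binomial model of Section 2.2.1: the moment properties claimed for (2.1), namely mean $\mu(x)$, variance $\mu(x) + a\mu(x)^2$, and the Poisson limit as $a \to 0$, can be verified numerically. The sketch below is plain Python with arbitrary illustrative values $\mu = 3$ and $a = 0.5$; it is an illustration only, not code from the thesis.

```python
import math

def neg_binomial_pmf(y, mu, a):
    """P(Y = y) under the negative-binomial model (2.1):
    mean mu, variance mu + a*mu**2; a -> 0 recovers the Poisson."""
    inv_a = 1.0 / a
    # Gamma(y + 1/a) / (y! Gamma(1/a)) computed via log-gamma for stability
    log_coef = math.lgamma(y + inv_a) - math.lgamma(y + 1) - math.lgamma(inv_a)
    log_p = log_coef + y * math.log(a * mu / (1.0 + a * mu)) \
            - inv_a * math.log(1.0 + a * mu)
    return math.exp(log_p)

def poisson_pmf(y, mu):
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

mu, a = 3.0, 0.5                       # arbitrary illustrative values
pmf = [neg_binomial_pmf(y, mu, a) for y in range(200)]
mean = sum(y * p for y, p in enumerate(pmf))
var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
print(round(mean, 3), round(var, 3))   # should be close to mu = 3 and mu + a*mu^2 = 7.5
```

With overdispersion parameter $a = 0.5$ the variance (7.5) is well above the Poisson variance (3), which is exactly the extra-Poisson variation the model is meant to absorb.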
Nicholls, Heathcote and Cunningham (1987) suggest a method, implemented in a software package called SABL, that deseasonalizes the data, possibly after a transformation, without actually modelling the seasonal components. This method decomposes the series into three additive components by means of a minimization criterion and robust data smoothing techniques. The results at time t are the 'trend' ($T_t$), 'seasonal' ($S_t$) and 'irregular' ($I_t$) components. Let $Y^*_t$ denote the transformed response at time t. Then

$Y^*_t = T_t + S_t + I_t$.

Nicholls et al. (ibid.) explain that, to construct the additive model, the original data must be transformed so as to minimize the interaction between the trend and the seasonal components. This criterion is reasonable since, if their interaction were not at its minimum, then, for example, if the trend were increasing, the seasonal component might also increase. With robust smoothing techniques based on moving medians, the trend and seasonal components can be determined iteratively. These robust estimates will not be affected by outliers because such outliers will be incorporated in the irregular component of the series. (To give more flexibility to users, SABL allows them to choose a particular transformation, and to select the widths of the smoothing windows for the trend and seasonal components.) After the decomposition, the seasonally adjusted series is obtained by simply subtracting off the seasonal component to give

$\tilde{Y}_t = T_t + I_t$.

This series can be converted back to the original response scale by applying the inverse transformation to $\tilde{Y}_t$. Once the trend and irregular components are computed, they may be plotted for visual inspection so that one can model the trend. The model can then be validated by other time series procedures, such as the Box-Jenkins autoregressive moving average (ARMA) technique.
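The trend/seasonal/irregular decomposition just described can be sketched in a few lines. The toy version below uses moving medians as the robust smoother; it illustrates the general idea only and is not the SABL program itself (the series, window width and period are arbitrary choices).

```python
import statistics

def running_median(x, w):
    """Median over a centred window of width w (odd): a robust smoother."""
    h = w // 2
    return [statistics.median(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def decompose(y, period, trend_w=13, n_iter=2):
    """Additive decomposition y_t = T_t + S_t + I_t in the spirit of SABL,
    computed iteratively with moving medians (a sketch, not the SABL code)."""
    n = len(y)
    S = [0.0] * n
    for _ in range(n_iter):
        # robust trend of the deseasonalized series
        T = running_median([y[i] - S[i] for i in range(n)], trend_w)
        # seasonal component: median of detrended values at each seasonal position
        detr = [y[i] - T[i] for i in range(n)]
        pos_med = [statistics.median(detr[p::period]) for p in range(period)]
        m = statistics.mean(pos_med)            # centre the seasonal at zero
        S = [pos_med[i % period] - m for i in range(n)]
    I = [y[i] - T[i] - S[i] for i in range(n)]  # outliers land here
    return T, S, I

# toy series: linear trend + period-4 seasonal + a single outlier at t = 20
y = [0.5 * t + [2.0, -1.0, -1.0, 0.0][t % 4] for t in range(40)]
y[20] += 25.0
T, S, I = decompose(y, period=4, trend_w=9)
print(round(I[20], 1))
```

Because the medians are insensitive to the single spike, the spike is pushed almost entirely into the irregular component, which is the robustness property the text appeals to.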
2.2.3 Time series analysis of a contagious process

An alternative model to ours is described in this subsection and one of its deficiencies is noted. However, it promises to have some value and will be investigated further in future work. Holden (1987) developed a model for rare events, such as the daily aircraft hijackings in the US between 1968 and 1972. The proposed model is for stationary processes. It incorporates the assumption that the contagiousness of an event eventually declines to zero, and that the rate of occurrences levels off over a long period, with occasional, temporary peaks when an occurrence excites the process. With modification, the model can also incorporate the effects of exogenous time series. The model is potentially applicable to the commercial marked salmon data of a given tag code observed in a catch region, since there is a long period of no recovery during the winter season. However, the levelling-off phase of epidemics is not fully reflected in our data because of the definition of the yearly sampling 'periods' (see Table 3.4). During the sampling season there are only occasional recoveries, which can be thought of as rare events. But an important similarity is that the observed recoveries are serially correlated. Thus, we conclude that Holden's model might be adapted for modelling the salmon data in spite of its deficiencies with respect to our data.

Holden assumes that the observed sequence of daily counts, $\{N_t\}$, is a sequence of Poisson variates with means given by some sequence, $\{\lambda_t\}$, which incorporates the stimulating effects of previous incidents. The linear contagion model for rare events is given by

$\lambda_t = \nu + \sum_{i=1}^{\infty} W_i N_{t-i}$,  (2.2)

where

$W_i > 0 \quad (i = 1, 2, \ldots)$  (2.3)

and t is an integer. For the discrete-time process (2.2), $N_t$ conditioned on the history $\{N_u,\ u < t\}$ has a Poisson distribution with mean $\lambda_t$, and $\sum_{i=1}^{\infty} W_i$ is required to be less than one to ensure that $E(\lambda_t)$ is finite.
The quantity $\nu$ is the rate at which events are generated by factors other than contagion (assumed constant). The lag structure of the $W_i$ ($i = 1, 2, \ldots$) in (2.3) describes the contagiousness of an event i periods after its occurrence. To get a finite number of parameters, simply set $W_i$ to zero after some maximum lag, or assume that $W_i$ has a specified functional form, such as the lag weights associated with a given ARMA process. Then the time-series techniques suggested by Box-Jenkins may be used to obtain the parameter estimates.

2.2.4 Smoothing techniques

i. Estimating smooth functions by the local scoring algorithm

To provide additional perspective on the approach taken in this thesis, a very recently proposed method, similar to our own in spirit, will now be described. For likelihood-based regression models with response variable Y, such as normal linear regression, one usually assumes a linear form in the covariates $X_1, X_2, \ldots, X_p$. A set of n independent realizations of these random variables will be denoted by $(y_1, x_{11}, \ldots, x_{1p}), \ldots, (y_n, x_{n1}, \ldots, x_{np})$. Hastie and Tibshirani (1986) propose the class of generalized additive models, which replace the linear form $\sum_j \beta_j X_j$ by a sum of smooth functions $\sum_j s_j(X_j)$. The $s_j(\cdot)$ are unspecified functions that are estimated using a scatterplot smoother in an iterative procedure called the local scoring algorithm.

Any regression procedure can be viewed as a method for estimating $E(Y \mid X_1, X_2, \ldots, X_p)$. The additive model assumes the following form for this conditional expectation:

$E(Y \mid X_1, X_2, \ldots, X_p) = s_0 + \sum_{j=1}^{p} s_j(X_j)$,  (2.4)

where the smooth $s_j(\cdot)$'s are standardized so that $E(s_j(X_j)) = 0$. These functions are estimated one at a time using a scatterplot smoother. A simple class of scatterplot smoother estimates are the local average estimates,

$s(x_i) = \mathrm{Ave}_{j \in N_i} \{ y_j \}$,
where Ave represents some averaging operator, like the mean, and $N_i$ is a neighborhood of $x_i$ (a set of indices of points whose x values are close to $x_i$). The type of neighborhoods considered in Hastie and Tibshirani's paper are symmetric nearest neighborhoods. Associated with this is the span or window size w, which is the proportion of points contained in each neighborhood. Other more complicated estimates of $E(Y \mid X)$ can be used, such as kernel or spline smoothers.

The span w is selected to trade off between the bias and the variability of the estimate. A data-based criterion is derived for this selection. Let $s_w^{(i)}(x_i)$ be the smoother of span w at $x_i$, having removed $(x_i, y_i)$ from the sample. Then the cross-validation sum of squares (CVSS) is defined by

$\mathrm{CVSS}(w) = (1/n) \sum_{i=1}^{n} \left( y_i - s_w^{(i)}(x_i) \right)^2$.  (2.5)

The optimal span w is that which gives the smallest value of CVSS(w). This criterion effectively weighs bias and variance based on the sample. Note that E(CVSS(w)) can be shown to be approximately equal to the integrated prediction squared error (PSE), $\mathrm{PSE} = E(Y - s(X))^2$, so that CVSS is approximately unbiased for the expected prediction squared error. In addition to these desirable properties, CVSS is recommended because it is computationally efficient for obtaining the optimal value of w.

For the additive model in (2.4), the local scoring algorithm estimates the $s_j(\cdot)$'s by iteratively smoothing the adjusted dependent variable on X, and it requires a choice of span, which can be estimated using the CVSS(w) in (2.5). Theoretically, this technique can be viewed as an empirical method of maximizing the expected log-likelihood, or equivalently, of minimizing the Kullback-Leibler distance to the true model. It is called local scoring because the Fisher scoring update is computed using a local estimate of the score.
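The span-selection criterion (2.5) can be illustrated with a small sketch: a symmetric nearest-neighbour mean smoother and a leave-one-out CVSS, applied to an arbitrary noisy sine series (not thesis data, and not the Hastie-Tibshirani implementation).

```python
import math, random

def nn_smooth(x, y, i, k, exclude_i=False):
    """Local-average estimate at x[i]: mean of y over the k nearest
    neighbours of x[i] (a simple symmetric nearest-neighbour smoother)."""
    idx = sorted(range(len(x)), key=lambda j: abs(x[j] - x[i]))
    if exclude_i:
        idx = [j for j in idx if j != i]
    return sum(y[j] for j in idx[:k]) / k

def cvss(x, y, w):
    """Cross-validation sum of squares (2.5) for span w (fraction of points):
    each point is predicted with itself removed from the sample."""
    n = len(x)
    k = max(1, int(w * n))
    return sum((y[i] - nn_smooth(x, y, i, k, exclude_i=True)) ** 2
               for i in range(n)) / n

# noisy sine: small spans undersmooth (variance), large spans oversmooth (bias)
random.seed(0)
x = [i / 50 for i in range(50)]
y = [math.sin(2 * math.pi * xi) + random.gauss(0, 0.2) for xi in x]
spans = [0.04, 0.1, 0.2, 0.4, 0.8]
best = min(spans, key=lambda w: cvss(x, y, w))
print(best)   # the CVSS-optimal span among the candidates
```

The very wide span 0.8 averages over most of the sine period and is heavily biased, so the criterion selects one of the narrower spans, which is the bias-variance trade-off the text describes.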
ii. Bayesian nonparametric smoothing method for locally regular processes

A potential refinement of the approach adopted in this thesis is described by Ma (1986), who improves on the Bayesian nonparametric approach proposed by Weerahandi and Zidek (1988) for smoothing stochastic processes. The processes of concern are of the form R = S + N, where S is a smooth function and N is an independent noise process. R is assumed to be observed at a sequence of time-values, $t_i$, $i = 1, \ldots, n$, and S is assumed to be locally regular, that is, expandable in a Taylor series to the pth term about $t = t_{n+1}$. Then an a priori structural model for the data is

$R = X\beta + \varepsilon$,  (2.6)

where $R = (R_1, \ldots, R_n)'$ is a vector of n observations; $X = (1, X^1, \ldots, X^p)$ is an n by (p+1) matrix, where 1 is a vector of ones and $X^j = ([t_1 - t_{n+1}]^j / j!, \ldots, [t_n - t_{n+1}]^j / j!)'$; $\beta = (\beta_0, \beta_1, \ldots, \beta_p)'$ is a vector of coefficients, where $\beta_0 = S(t_{n+1})$ and $\beta_j = D^j S(t_{n+1})$, with D the operator of differentiation; and $\varepsilon = \eta + N$ is the error term, where both $\eta$ and N are vectors; specifically, $\eta$ is the remainder of the Taylor expansion of $S(t_i)$ and N is the noise, with variance $\sigma^2$. One further assumption underlying this approach is that the expansion errors and all other a priori uncertainty about R, $\beta$ and the smoothing parameter c, related to the variance of the noise $\varepsilon$, have a joint multivariate normal distribution. The \"smoothing parameter\" c controls the degree of smoothness of the estimated R. The main objective of Ma's study (ibid.) is a simple method to compute an estimate of c and to obtain $\hat{R}$, a smooth estimate of R. His method estimates c, and computes a measure of accuracy for any given $\hat{R}$, called the predictive squared error, PSE (to be defined later), for each fixed order p, which reflects the degree of local regularity of S. The value of p that has the minimum PSE is chosen to be the optimal value.
For each fixed p, the parameter c can be estimated by cross-validation, which chooses the value of c that minimizes the cross-validated sum of squares (a similar method is described in Section 2.2.4.i). Ma (ibid.) develops a simpler alternative called the backfitting method and compares it to cross-validation. His new method is recommended for obtaining c because it is easier to implement and computationally more efficient than the cross-validation approach. Ma's method may be described as follows. Suppose, in equation (2.6), S has p + 1 derivatives. Then the a priori model is of the form

$R_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \beta_{p+1} x_{i,p+1} + \varepsilon_i$,  (2.7)

where $R_i$ is the ith component of the vector R, $\varepsilon_i$ is the ith component of the error vector $\varepsilon$, and $x_{ij} = (t_i - t_{n+1})^j / j!$ for $j = 1, \ldots, p+1$. The backfitting method uses the fact that $c = \delta^2 / \sigma^2$, where $\delta^2$ is the prior variance of $D^{p+1} S / (p+1)!$ and $\sigma^2$ is the variance of the noise N. Then, for the order p, c is estimated by

$\hat{c}_p = \hat{\delta}^2 / \hat{\sigma}^2$,  (2.8)

where $\hat{\delta}^2$ is based on the sample variance of the fitted coefficient $\hat{\beta}_{p+1}$.

2.2.5 Empirical Bayes (EB) and Hierarchical Bayes (HB) analyses

i. Introduction

Let $f(x \mid \theta)$ denote the density of the data x given the parameter $\theta$, and let F denote the prior distribution of $\theta$. The marginal density of x is

$m(x) = \int_{\Theta} f(x \mid \theta)\, dF(\theta)$.  (2.10)

Then, providing $m(x) \neq 0$ and F has density $\pi$, the posterior density of $\theta$ given x is

$\pi(\theta \mid x) = f(x \mid \theta)\, \pi(\theta) / m(x)$.  (2.11)

When no prior information about $\theta$ is available, what is needed in such situations is a noninformative prior, by which is meant a prior that contains no information about $\theta$. A reasonable choice of such a prior is one that gives equal weight to all possible values of $\theta$. A typical noninformative prior density is $\pi(\theta) = 1$, the uniform density on $\mathbb{R}$. Given the prior, the analysis can proceed in a conventional Bayesian fashion.

ii. Empirical Bayes (EB) analysis

Assume $\pi(\theta)$ has a given functional form, and choose the density of this given form which most closely matches the prior beliefs. We assume $\pi \in \Gamma$ with

$\Gamma = \{ \pi : \pi(\theta) = g(\theta \mid \lambda),\ \lambda \in \Lambda \}$.  (2.12)

Here g is a specified function. Then the choice of prior reduces to
The Type II maximum likelihood estimate (ML-II estimate) of n is such that m(x |?T) = sup m(x | T T ) , TC e r where m(x |n) = J* f(x |©) rr(©)d©, and r is the set described in (2.12). The marginal density of x given n, m(x In), reflects the plausibility of n with the data in hand. This function is clearly maximized by choosing n to be concentrated where f(x |©) is maximized (as a function of ©). Thus, i t is reasonable to consider m(x |rr) as a likelihood function for n. Then sup m(x 17T) = sup mCx |g(© |X)!> x 0: t(N= A. IX.,^ 3.,0 = , (4.1) v v. v. v n . ! , -K/ ft. f(X.|f3.,C) = - ~ e \\ (4.2) 32 r - C / ft. t{ft. IC) = — ^ ~ e v . (4.3) Note that because of our choice of a noninformative prior for ft^f the moments of HftAC) do no exist. We can rewrite (4.3) as follows f ( ^ i o = - i - ( . -«V*> ( ^ / c r 2 ] . That is, iiftAC) = ~-g(ft./K)r (4.4) where g (^. /C) = exp(-f?./C) iftJK )\"2 . We can easily identify C in equation (4.4) as the scale parameter of the density f(^3 IC) and 1/C as the precision parameter. Thus, C indicates the spread of the ft population from which the ft As are picked, and 1/C expresses the degree of equality among the ft As. In particular, the ft. 's are identical when C is zero and the ft. 's are very different when C is large. Now the joint density of V and ft given C is f ( \\ , P . K > = f ( \\ l ^ , C ) f(^.IC) 33 That Is, -(X.+ C ) / ft = — £ — e 1 1 . (4.5) Then the prior for V given C is 00 f ( \\ K ) = I f(X.,^. IC) oV3.. That is, 0 0 -(X. + C ) / ^ f(X. IC) = C J &ft. • (4.6) v •* „ 3 v. t o c \\After the change of variable, u = 1/ f?., equation (4.6) becomes -(x.+ C ) u f(X .IC) = C | u e v du. uu I -or f(X. IC) = - - • (4.7) (X.+ C) It is easy to show that £ ( / ? J C ) and f ( X J C ) are both unimodal functions with respect to C and their unique modes are at C = 2fti and C = x^, respectively. Thus, a priori, most of the ft As are concentrated near C/2. 
We now determine the joint density of $N_i$ and $\lambda_i$ given C, which is

$f(N_i, \lambda_i \mid C) = f(N_i \mid \lambda_i)\, f(\lambda_i \mid C)$,

that is,

$f(N_i, \lambda_i \mid C) = \dfrac{e^{-\lambda_i} \lambda_i^{n_i}}{n_i!} \cdot \dfrac{C}{(\lambda_i + C)^2}$.  (4.8)

After integrating out $\lambda_i$, we obtain

$f(N_i \mid C) = \int_0^\infty \dfrac{e^{-\lambda_i} \lambda_i^{n_i}}{n_i!} \cdot \dfrac{C}{(\lambda_i + C)^2}\, d\lambda_i$.  (4.9)

Finally, to obtain the conditional posterior joint distribution of $\lambda_i$ and $\beta_i$, we require an estimate of $C = C(t)$. The value of C can be estimated by maximizing an expression which involves the integral in (4.9). Then, using (4.1), (4.5), (4.9) and $\hat{C}$, the above-mentioned estimate, the posterior is

$f(\lambda_i, \beta_i \mid N_i, \hat{C}) = \dfrac{f(N_i \mid \lambda_i, \beta_i, \hat{C})\, f(\lambda_i, \beta_i \mid \hat{C})}{f(N_i \mid \hat{C})}$,

that is,

$f(\lambda_i, \beta_i \mid N_i, \hat{C}) = \dfrac{\dfrac{e^{-\lambda_i} \lambda_i^{n_i}}{n_i!} \cdot \dfrac{\hat{C}}{\beta_i^3}\, e^{-(\lambda_i + \hat{C})/\beta_i}}{\displaystyle\int_0^\infty \dfrac{e^{-\lambda_i} \lambda_i^{n_i}}{n_i!} \cdot \dfrac{\hat{C}}{(\lambda_i + \hat{C})^2}\, d\lambda_i}$,  (4.10)

where $\beta_i$ and $\lambda_i$ are non-negative. The integral in the denominator of (4.10), which is just (4.9), is essentially a constant once C is estimated. It remains to estimate C using (4.9). For clarity, the subscripts in (4.9) are dropped and we let $P = n!$. Then (4.9) becomes

$g(C) = P^{-1} \int_0^\infty e^{-\lambda} \lambda^n\, C\, (\lambda + C)^{-2}\, d\lambda$,  (4.11)

where $\lambda > 0$, and n and P are positive integers. To prove that g(C) has a unique maximum, we appeal to a lemma of Brewster and Zidek (1974). First, suppose W(x) is a continuous non-negative function whose domain is either $(0, \infty)$ or $(-\infty, \infty)$, and it is strictly bowl-shaped. Thus, W is differentiable almost everywhere. In addition, assume that, whenever necessary for integrals involving W, the interchange of integral and derivative is permissible. The lemma is:

If f is a density on $[0, \infty)$ [respectively $(-\infty, \infty)$] and $\{ c^{-1} f(x/c) : c > 0 \}$ [respectively $\{ f(x - c) : -\infty < c < \infty \}$] has the monotone likelihood ratio property (MLRP), then $c \to \int x\, W'(cx)\, f(x)\, dx$ [respectively $c \to \int W'(x + c)\, f(x)\, dx$] has at most one sign change, and $c \to \int W(cx)\, f(x)\, dx$ [respectively $c \to \int W(x + c)\, f(x)\, dx$] is strictly bowl-shaped (or monotone).

We first show that g(C) in (4.11) satisfies the conditions stated in this lemma.
From (4.11), we have W(λ) = P^{-1} e^{-λ} λ^n, where n and P are positive integers and λ > 0. It can easily be shown that W(λ) is strictly bowl-shaped (opening downward), and that f(λ | C) = C/(λ + C)² is a scale density which can be written as

  f(λ | C) = C^{-1} f_0(λ/C),

where f_0(y) = f(y | 1) = (1 + y)^{-2}. In addition, { f(λ | C) } has the MLRP since if λ_1 < λ_2 and C_1 < C_2, then

  f(λ_1 | C_1) f(λ_2 | C_2) − f(λ_2 | C_1) f(λ_1 | C_2) > 0.

Since g(C) = ∫ W(λ) f(λ | C) dλ = ∫ W(Cy) f_0(y) dy satisfies all the conditions in the lemma, we conclude that C → ∫ y W′(Cy) f_0(y) dy has at most one sign change and that g(C) is strictly bowl-shaped (opening downward), so g(C) has a unique maximum.

For the replicated data, the joint density of the counts and the β_i's given C is

  Π_{i=1}^4 f(N_i | β_i, C) f(β_i | C).   (4.15)

Then we integrate the β_i's in (4.15) over their entire domain to get

  ∫_0^∞ ··· ∫_0^∞ Π_{i=1}^4 f(N_i | β_i, C) f(β_i | C) dβ_1 ··· dβ_4 = f(N_1, N_2, N_3, N_4 | C).   (4.16)

Using the result in equation (4.13) and the assumptions of conditional independence, we have

  f(N_1, N_2, N_3, N_4 | C)
   = ∫_0^∞ ··· ∫_0^∞ Π_{i=1}^4 f(N_i | λ_i) f(λ_i | C) dλ_i
   = ∫_0^∞ ··· ∫_0^∞ Π_{i=1}^4 [ Π_{j=t−1}^{t+1} Π_{k=1}^{2} e^{-λ_i} λ_i^{n_{ijk}} / n_{ijk}! ] · C/(λ_i + C)² dλ_i,

that is,

  f(N_1, N_2, N_3, N_4 | C) = Π_{i=1}^4 ∫_0^∞ [ e^{-6λ_i} λ_i^{n_{i++}} / P_i ] · C/(λ_i + C)² dλ_i,   (4.17)

where

  n_{i++} = Σ_{j=t−1}^{t+1} Σ_{k=1}^{2} n_{ijk}   and   P_i = Π_{j=t−1}^{t+1} Π_{k=1}^{2} n_{ijk}! .

Note that f(N_1, N_2, N_3, N_4 | C) given in equation (4.17) is a product of f(N_i | C) factors, each of which is similar to the result for the unreplicated case given in equation (4.9). Though the function in equation (4.9) itself is unimodal, the analogous result in the present case has not been established. The shape of the product integral in equation (4.17), evaluated for numerous different combinations of the n_{ijk}'s at various values of C, suggests the result is true. Numerical methods are used to compute the estimate of C that maximizes the function in equation (4.17), in spite of the potential risk of multi-modality.
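The unimodality in C that the thesis checks numerically can be probed with a short script. This is an illustration, not part of the thesis: the window counts below are hypothetical, the integral is one factor of (4.17) evaluated by a simple trapezoidal rule, and a single sign change in the first differences over a C grid indicates a unique interior maximum.

```python
import math

def trapezoid(f, a, b, n=4000):
    # Plain trapezoidal rule on [a, b].
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        s += f(a + k * h)
    return s * h

def window_marginal(counts, C, w=6):
    # One factor of (4.17): (1/P) * integral of e^(-w*lam) lam^(n++) C (lam+C)^(-2) dlam,
    # where w = 3 periods x 2 replicates = 6 and n++ is the window total.
    npp = sum(counts)
    logP = sum(math.lgamma(n + 1) for n in counts)
    def f(lam):
        if lam <= 0.0:
            return 0.0
        return math.exp(-w * lam + npp * math.log(lam) - logP) * C / (lam + C) ** 2
    return trapezoid(f, 0.0, 5.0 * npp / w + 10.0)

counts = [3, 0, 2, 1, 0, 4]                  # hypothetical 3 x 2 window of counts
grid = [0.05 * k for k in range(1, 201)]     # C from 0.05 to 10
vals = [window_marginal(counts, C) for C in grid]
diffs = [b - a for a, b in zip(vals, vals[1:])]
signs = [d > 0 for d in diffs]
changes = sum(1 for s1, s2 in zip(signs, signs[1:]) if s1 != s2)
print('sign changes of the gradient along the grid:', changes)
```

A value of 1 here means the profile rises once and then falls, i.e. the grid shows a single interior maximum for this particular count configuration, in line with the thesis's numerical findings.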
The smoothing window used for estimating C is small, so the Ĉ(t) obtained may still contain a fair amount of noise and sampling variation. A numerical method called SABL, described in section 2.2.2, can be used to decompose this C-series into trend, seasonal and irregular components. The seasonal and irregular components, which are usually of much smaller magnitude than the trend, reflect some of the sampling variation and noise still in the C-series. The trend is smoother than the C-series and is closely related to the trends of the λ_i(t)'s for the 4 processes, so we will use it in this study as a general summary of the data in the domains for which C is computed.

With the smoothed version of Ĉ(t), we can compute the posterior point estimates for λ_i at each t by maximizing an expression similar to the integrand of (4.9) with respect to λ_i. So for each i = 1, ..., 4 and each time t,

  max_{λ_i} f(λ_i | N_i, Ĉ) = max_{λ_i} [ e^{-6λ_i} λ_i^{n_{i++}} / P_i ] · Ĉ/(λ_i + Ĉ)² ,   (4.18)

where

  n_{i++} = Σ_{j=t−1}^{t+1} Σ_{k=1}^{2} n_{ijk}   and   P_i = Π_{j=t−1}^{t+1} Π_{k=1}^{2} n_{ijk}! .

This maximization problem is easier to handle when we take the natural logarithm of (4.18); setting the derivative of the log to zero yields a quadratic equation in λ_i. The maximum can be explicitly evaluated in this case and is found to be at

  λ_i* = [ (n_{i++} − 2 − 6Ĉ) + sqrt( (n_{i++} − 2 − 6Ĉ)² + 24 n_{i++} Ĉ ) ] / 12 .   (4.19)

The λ_i* in equation (4.19) is only 1/6 of the actual λ_i, so it is multiplied by 6 (2 for the number of replicates times 3 for the window size). Now let λ̂_i = 6λ_i*, which is comparable in magnitude with the observations of the i-th process.

It would be desirable to present a 100(1 − α)% credibility interval for λ_i at each t. Each interval is a subset S of the parameter space such that the posterior probability that λ_i lies in S is 1 − α. This amounts to choosing a pair of lower and upper limits (a, b) such that

  F(a, b) = ∫_a^b f(λ_i | N_i, Ĉ) dλ_i = 1 − α,   (4.20)

where a ≥ 0 and b > 0, and
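The closed-form mode (4.19) can be checked against the first-order condition of the log posterior. A minimal sketch, with a hypothetical window total n_{i++} and smoothed Ĉ:

```python
import math

def lambda_star(n_pp, C):
    # Positive root of the quadratic obtained from
    # d/dlam [ -6 lam + n_pp log(lam) - 2 log(lam + C) ] = 0, as in (4.19).
    a = n_pp - 2 - 6 * C
    return (a + math.sqrt(a * a + 24.0 * n_pp * C)) / 12.0

n_pp, C = 10, 1.5                          # hypothetical window total and smoothed C(t)
ls = lambda_star(n_pp, C)
grad = -6.0 + n_pp / ls - 2.0 / (ls + C)   # log-posterior derivative at the candidate mode
lam_hat = 6.0 * ls                         # rescale by 6 = 2 replicates x 3 periods
print(ls, grad, lam_hat)
```

The derivative vanishing at lambda_star confirms that the explicit root really is the stationary point of the log posterior, and the final rescaling by 6 matches the thesis's adjustment.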
f(λ_i | N_i, Ĉ) is proportional to e^{-6λ_i} λ_i^{n_{i++}} Ĉ (λ_i + Ĉ)^{-2} / P_i, with n_{i++} and P_i as defined above.

In choosing a credibility interval for λ_i at time t, the usual approach is to minimize its length. This may be done by using the highest posterior density (HPD) criterion, which is to include in the interval only those points with highest posterior density, that is, the 'most likely' values of λ_i. To evaluate the HPD credibility interval in equation (4.20), we set up the following procedure:

1) set the lower limit a = 0;
2) create a subroutine which, for a given time t, computes the maximum of f(λ_i | N_i, Ĉ) as a function of λ_i at, say, λ_i = m;
3) set x = (a + m)/2 and evaluate f(λ_i | N_i, Ĉ) at λ_i = x;
4) create a subroutine which finds the value of b such that f(λ_i = b | N_i, Ĉ) = f(λ_i = x | N_i, Ĉ);
5) numerically integrate P(x, b) = ∫_x^b f(λ_i | N_i, Ĉ) dλ_i, where x and b are the values from steps (3) and (4). If this value is approximately 1 − α, then stop. If not, and if:
 i) the value is larger than 1 − α, the interval is too wide: set a = x (m remains unchanged) and go back to steps (3) to (5);
 ii) the value is smaller than 1 − α, the interval is too narrow: set m = x (a remains unchanged) and return to steps (3) to (5).

It can happen that the integral P(a, b) evaluated from a = 0 to the point where f(λ_i | N_i, Ĉ) has its maximum is very small, so that the above procedure cannot be used. In such situations, we abandon the HPD criterion and find b such that P(0 < λ_i < b) = 1 − α.

When Ĉ is arbitrarily small, we are sure that there is no recovery in the i-th catch region. The result is obtained to a good approximation by taking the limit of G(a, b, C) in (5.4) as C → 0. The proof of the result lim_{C→0} G(a, b, C) = 1 is shown in the following. First, note that

  lim_{C→0} G(a, b, C) = lim_{C→0} [ ∫_0^b e^{-λ_i} (λ_i + C)^{-2} dλ_i ] / [ ∫_0^∞ e^{-λ_i} (λ_i + C)^{-2} dλ_i ] .   (5.5)

Letting u_i = λ_i + C, (5.5) becomes

  lim_{C→0} [ ∫_C^{b+C} e^{-(u_i − C)} u_i^{-2} du_i ] / [ ∫_C^∞ e^{-(u_i − C)} u_i^{-2} du_i ] .   (5.6a)

The expression in (5.6a) is just

  lim_{C→0} [ ∫_C^{b+C} e^{-u_i} u_i^{-2} du_i ] / [ ∫_C^∞ e^{-u_i} u_i^{-2} du_i ] .   (5.6b)
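Steps (1) to (5) amount to a bisection on the posterior density level. A sketch of the procedure, assuming the unnormalised posterior kernel of (4.18); the function names and the inputs are illustrative, not from the thesis:

```python
import math

def hpd_interval(n_pp, C, alpha=0.05, tol=1e-4):
    # Bisection on the posterior density level, following steps (1)-(5).
    # h is the unnormalised posterior kernel of (4.18).
    def h(l):
        if l <= 0.0:
            return 0.0
        return math.exp(-6.0 * l + n_pp * math.log(l) - 2.0 * math.log(l + C))
    def integral(lo, hi, steps=4000):
        w = (hi - lo) / steps
        s = 0.5 * (h(lo) + h(hi))
        for k in range(1, steps):
            s += h(lo + k * w)
        return s * w
    a4 = n_pp - 2 - 6 * C
    mode = (a4 + math.sqrt(a4 * a4 + 24.0 * n_pp * C)) / 12.0   # step (2), via (4.19)
    hi0 = mode + 30.0
    total = integral(0.0, hi0)
    xlo, xhi = 0.0, mode                  # step (1): lower limit starts at 0
    x = b = mode
    for _ in range(60):
        x = 0.5 * (xlo + xhi)             # step (3)
        bl, bh = mode, hi0                # step (4): solve h(b) = h(x) past the mode
        for _ in range(80):
            bm = 0.5 * (bl + bh)
            if h(bm) > h(x):
                bl = bm
            else:
                bh = bm
        b = 0.5 * (bl + bh)
        mass = integral(x, b) / total     # step (5)
        if abs(mass - (1.0 - alpha)) < tol:
            break
        if mass > 1.0 - alpha:
            xlo = x                       # interval too wide: raise the lower limit
        else:
            xhi = x                       # interval too narrow: lower it
    return x, b

a_hat, b_hat = hpd_interval(10, 1.5)      # hypothetical window total and C(t)
print(a_hat, b_hat)
```

Because the enclosed mass decreases continuously as the cut level rises, the bisection converges to a pair (a, b) with equal posterior density at the endpoints and mass approximately 1 − α between them.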
Since in equation (5.6b) both the numerator and the denominator are analytic functions of C, they are differentiable. Further, they are both infinite in the limit; therefore, we can apply L'Hospital's rule to (5.6b) and obtain

  lim_{C→0} G(a, b, C)
   = lim_{C→0} [ e^{-(b+C)} (b + C)^{-2} − e^{-C} C^{-2} ] / [ −e^{-C} C^{-2} ]
   = lim_{C→0} [ 1 − C² e^{C} e^{-(b+C)} (b + C)^{-2} ]
   = 1.

Sample plots of these credibility intervals for coho and chinook and their corresponding λ-estimates are portrayed in Figures 5.14 to 5.16. We must be careful in interpreting these credibility intervals. The bands in Figures 5.14 to 5.16 are not simultaneous interval estimates; they merely indicate the pointwise credibility interval for the recovery intensity, λ(t), at each period t. Note that during periods when a region has substantial recoveries, the ratio of the width of these intervals to λ is on average about two or three times smaller than in regions with much smaller recoveries. This indicates that large recoveries are in general more informative for estimating the recovery intensities and computing the corresponding credibility intervals, just as intuition would suggest.

5.4 Conclusion

To identify an appropriate model for the salmon recovery data, we have examined the raw data in detail. We learned that the salmon recovery data set has missing values, huge extraneous variation and noise. An empirical Bayes model was thus developed to handle these data. The method fitted local parametric models, Poisson models to be precise, and incorporated this approach with a conventional smoothing technique to obtain the overall recovery patterns and the corresponding credibility intervals for the underlying salmon recoveries.
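The limit lim_{C→0} G = 1 can also be seen numerically. A sketch, with an arbitrary upper limit b = 0.5 and a geometric grid to resolve the near-singular factor (λ + C)^{-2}:

```python
import math

def G(b, C, cells=100000, upper=60.0):
    # G(0, b, C) = [int_0^b e^(-lam) (lam+C)^(-2) dlam] / [int_0^inf e^(-lam) (lam+C)^(-2) dlam].
    # After u = lam + C both integrals have integrand e^(-(u - C)) u^(-2),
    # which spikes at u = C, so a geometric grid with the midpoint rule is used.
    def integral(lo, hi):
        r = (hi / lo) ** (1.0 / cells)
        s, u = 0.0, lo
        for _ in range(cells):
            un = u * r
            mid = 0.5 * (u + un)
            s += math.exp(-(mid - C)) / (mid * mid) * (un - u)
            u = un
        return s
    return integral(C, b + C) / integral(C, upper)

g_small = G(0.5, 0.01)   # C near zero: almost all posterior mass falls below b
g_large = G(0.5, 1.0)    # moderate C: substantially less mass below b
print(g_small, g_large)
```

As C shrinks, both integrals are dominated by the 1/C spike at the origin, so their ratio tends to 1 regardless of b, which is the content of the limit argument above.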
The resulting conclusions are:
1) there are huge variations in the pattern of recovery intensities for both species of salmon from different brood years;
2) in all catch regions, the overall recovery trends for coho from the two hatcheries are different in shape;
3) no overall difference is observed for the three sizes of coho;
4) in general, the Quinsam coho recovery is usually largest in the Johnstone Strait Net area, while for Capilano it is largest in the Southwest Vancouver Island Troll area; and
5) among the 3 intervals of recovery for the chinook, the recovery rate is always largest during the first two periods.

Our method has demonstrated its usefulness in aggregating information from individual sampling periods to overcome the problem of sparse data. However, we suggest that further investigations be carried out, including 1) extending our method to include general parametric models, and 2) finding a weighting scheme for smoothing so that the weight given to a point depends on its distance from the 'period' of interest.

REFERENCES

Becker, R. A. and Chambers, J. M. (1984). S: An Interactive Environment for Data Analysis and Graphics. California: Wadsworth.

Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis. New York: Springer-Verlag.

Brewster, J. F. and Zidek, J. V. (1974). Improving on equivariant estimators. Annals of Statistics, 2, 21-38.

A Canadian MRP Data Benchmark (1986). Fisheries Research Branch, Department of Fisheries and Oceans, Canada.

English, K. K. (1985). The contribution of hatchery produced chinook and coho to west coast fisheries: preliminary analysis. Department of Fisheries and Oceans, Canada.

Hastie, T. and Tibshirani, R. (1986). Generalized Additive Models. Statistical Science, 1, 297-318.

Holden, R. T. (1987). Time Series Analysis of a Contagious Process. J. Amer. Statist. Assoc., 82, 1019-1026.

Lawless, J. F. (1987). Negative binomial and mixed Poisson regression.
The Canadian Journal of Statistics, 15, 209-225.

Ma, H., Joe, H. and Zidek, J. (1986). A Bayesian Nonparametric Univariate Smoothing Method, with Applications to Acid Rain Data Analysis. University of British Columbia Statistics Department Technical Report No. 47.

Nicholls, D. F., Heathcote, C. R. and Cunningham, R. B. (1986). The Evaluation of Long Term Trend I. Austral. J. Statist., 28, 294-313.

Salmon Stock Interpretation Unit (1984). The mark recovery program as an assessment tool for the hatchery chinook and coho salmon enhancement program. Fisheries Research Branch, Department of Fisheries and Oceans, Canada.

Salmon Stock Interpretation and Assessment Unit (1986). Development of a Pacific coastal database for assessing the contribution of B.C. hatchery chinook and coho salmon production to the Canadian commercial catch. Fisheries Research Branch, Department of Fisheries and Oceans, Canada.

Weerahandi, S. and Zidek, J. V. (1988). Bayesian nonparametric smoothers for regular processes. The Canadian Journal of Statistics, 16, 61-74.

APPENDIX: The Mark Recovery Program (MRP) Database

A two-phase marking program, which includes tagging fish in the hatchery and recovering tags in the fishery, is currently the best known method for providing information to assess the benefits of artificially reared fish. The most popular marking technique for hatchery coho and chinook is the use of coded wire tags (CWTs), which are usually inserted in the snouts of juvenile fish as an indicator of the fish's origin. In addition, the adipose fins of these tagged fish are clipped to allow detection of them later as adults. Two main types of data are thus available from this marking program: the hatchery release data and the recovery data. The first type is of two broad categories: 1) fish released with a CWT and clipped adipose fin (marked), and 2) fish released without a CWT (unmarked).
The second type of data includes fishery data, giving recoveries from commercial and sport fishing, and escapement data, which are recoveries not from any fishery.

In each category of the release data, there are time variables, such as the brood year (the year in which the eggs were spawned) and the date of release. There are also geographic variables, which include stock site (location from which eggs are taken), hatchery site and release site. In addition, an average size (gram/fish) and the actual number of releases are given for each released group of fish. The above variables represent only a small subset of the release data; many other variables related to the survival of fish are also available. Note that in order to calculate the total contribution of hatchery fish to fisheries, each hatchery release group must be represented by a group of marked fish. A method has been developed to determine the marked release group that would represent the release of unmarked juveniles. Thus, the different variables in the release data are also important for associating marked and unmarked releases.

In the fishery data, there is information about recoveries of marked fish and the corresponding sampling program. Whenever possible, the recovery time is recorded as year, month and statistical week (about 5 per calendar month), and the recovery region corresponding to the fishing method is also recorded. Not every fish caught by a commercial fishery is inspected for a tag, since this is costly. Thus, a sample is taken for mark inspection, and only those marks detected in the sample are recorded as data. The sport recovery data usually come from voluntary returns of salmon heads by fishermen, so these recoveries depend very much on the fishermen's awareness of the clipped adipose fins. The sport catch size is also obtained differently from the commercial catch, which is estimated based on sales slips.
The escapement data include tagged fish escapement to the hatchery or escapement to rivers or lakes near the hatchery. Detailed recovery information on a single tagged fish is difficult to obtain, so only yearly bulk data are available.

A brief review of the life cycle of chinook and coho shows how recoveries from a particular brood are distributed over time. A coho's or chinook's life begins as an egg in the year of spawning, the 'brood year'. Eggs are hatched, and juvenile fish are reared until the 'release year', when they are allowed to leave the hatchery and begin life in the ocean. Eventually, in the 'recovery year', adults are captured by the fishery or escape to their spawning ground. Therefore, if the brood year is defined to be year 0, then the following table from the Salmon Stock Interpretation and Assessment Unit (1986) report is a summary for the majority of coho and chinook.

Year   chinook                  coho
0      brood year               brood year
1      release year             release year (fry)
2      -                        release year (smolts)
3      recovery year (age 3)    recovery year (age 3)
4      recovery year (age 4)    -
5      recovery year (age 5)    -

In some cases, some fish of both species are recovered in year 2 as jacks, and chinook are sometimes recovered in years 6 and 7.

These release and recovery data of tagged and untagged salmon have been collected over the years, but they were unorganized and scattered among different agencies, so it was difficult to carry out an analysis on a complete set of data. In 1983, the Canadian Department of Fisheries and Oceans (DFO) decided to construct a Mark Recovery Program (MRP) database on the VAX computer at the Pacific Biological Station (PBS) in Nanaimo. As a result, many interesting questions can now be addressed. When this database is completed, valuable information will be available for assessing the coho and chinook salmon enhancement program.

Table 3.1. The tag codes found in the benchmark data subset.
The codes used in the plots map to the original tag codes as follows.

chinook (species Hart code 124), plot codes 1 to 9:
  020408  020409  021615  021635  021661  021827  021829  022202  022405

coho (species Hart code 115), plot codes 1 to 27:
  081810  081811  081812  081813  081841  081842  081843  081844  081845
  082001  082002  082003  082004  082005  082006  082007  082008  082009
  082019  082020  082021  082022  082023  082024  082025  082026  082027

Table 3.2. Summary of data fields for the benchmark release data subset.

field  description                 number of zeros  % of zeros
1      tag code                    0                0
2      species Hart code           0                0
3      brood year                  0                0
4      run type code               27               75.00
5      day first released          29               80.56
6      month first released        29               80.56
7      year first released         29               80.56
8      day last released           0                0
9      month last released         0                0
10     year last released          0                0
11     number tagged               0                0
12     adipose only                0                0
13     unclipped                   0                0
14     total released              0                0
15     number of days held         5                13.89
16     size code                   0                0
17     size at release             0                0
18     percentage tag loss         0                0
19     expected survival           36               100
20     stage code                  0                0
21     study type                  0                0
22     hatchery code               0                0
23     release site code           0                0
24     stock site code             0                0
25     agency code                 0                0
26     co-ordinator code           0                0
27     production area code        0                0
28     province/state code         0                0
29     years with recoveries       0                0
30     release type                0                0
31     total associated release    0                0

Table 3.3. Summary list of chinook data fields for the benchmark rollup recovery subset. (There are four possible recovery methods: troll, net, sport (S), or escapement (E). The letter in square brackets indicates that only one of these methods applies to the field. Those fields without any letters apply to all methods. The number and percentage of NAs were calculated according to the number of records corresponding to a particular catching method.)
* NAs = missing values

field  description                              number of NAs    % of NAs
1      tag code                                 0                0
2      recovery year                            0                0
3      gear                                     0                0
4      catch region                             0                0
5      brood year                               0                0
6      non-tag indicator                        0                0
7      species Hart code                        0                0
8      statistical week                         0                0
9      average fork length (mm)                 1361             98.27
10     average hyperal length (mm)              1385             100
11     average total length (mm)                1385             100
12     average dress weight (kg)                1385             100
13     average round weight (kg)                1385             100
14     % immature female                        1385             100
15     % mature female                          1385             100
16     % immature male                          1385             100
17     % mature male                            1385             100
18     % unknown sexual maturity                35               2.53
19     recovery site code [E]                   0 out of 35      0
20     recovery site number [E]                 0 out of 35      0
21     run type [E]                             0 out of 35      0
22     sample age type [E]                      0 out of 35      0
23     number of observed recoveries            0                0
24     catch or escapement                      67               4.84
25     sample size                              67               4.84
26     sum of known tags                        5                0.36
27     number of no-pins                        138              9.96
28     number of lost-pins                      347              25.05
29     number with no data                      614 out of 1350  45.48
30     number of sport marks observed [S]       68 out of 89     76.40
31     est. marks in the est. sport catch [S]   68 out of 89     76.40
32     number observed sport recoveries [S]     0                0
33     sum of escapement non-tags [E]           3 out of 35      8.57

Table 3.4. Table for computing 'Period' from statistical week.

Week  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
1     40   40   1    5    10   14   18   23   27   31   36   40
2     40   40   2    6    11   15   19   24   28   32   37   40
3     40   40   3    7    12   16   20   25   29   33   38   40
4     40   40   4    8    13   17   21   26   30   34   39   40
5     40   0    0    9    0    0    22   0    0    35   0    0

'Period' is a number ranging between 1 and 40 representing a one-week time period during which salmon fishing may occur.

Table 3.5. List of catch region codes and names.

new code  old code  name
1         1         NW Vancouver Is. Troll
2         2         SW Vancouver Is. Troll
3         3         Washington/Oregon Troll
4         4         Georgia Strait Troll
5         5         Central Troll
6         6         Northern Troll
7         7         Alaska Troll
8         14        Juan de Fuca Troll
9         15        NW Vanc. Is. and Central Troll
10        17        NW Vanc. Is. and SW Vanc. Is. Troll
11        18        Northern and Central Troll
12        34        Georgia Strait and Central Troll
13        53        Georgia Strait and SW Vanc. Is. Troll
14        56        North Central Troll
15        57        South Central Troll
16        8         Fraser Gillnet
17        9         Northern Net
18        10        Georgia Strait Net
19        11        Johnstone Strait Net
20        12        Central Net
21        13        Juan de Fuca Net
22        19        Johnstone Strait and Central Net
23        20        NW Vancouver Is. Net
24        21        SW Vancouver Is. Net
25        33        Northern and Central Net
26        36        Yukon Net
27        37        Juan de Fuca and Georgia Strait Net
28        45        Johnstone Strait and Georgia Strait Net
29        46        Fraser Gillnet and Georgia Strait Net
30        47        Alaska Net
31        48        British Columbia Net
32        58        Fraser Seine Net
33        25        Northern Sport
34        26        Central Sport
35        27        Washington Sport
36        28        Georgia Strait Sport
37        29        Freshwater Sport
38        99        Canadian Escapement

Table 3.6. A summary of catch regions with observed recoveries. (New catch region codes from Table 3.5 are used here.)

Description               COHO                           CHINOOK
# Troll catch regions     10                             6
catch region codes        1,2,4,5,6,9,10,11,14,15*       1,4,6,11,14,15
# Net catch regions       9                              6
catch region codes        16,17,18,19*,20,21,22,23,24    17*,18,19,20,22,24
# Sport catch regions     4                              4
catch region codes        34,35,36*,37                   33,34,35,36*
# tags considered         153                            69
# records examined        2989                           940

* : Catch region with more than a hundred observed recoveries.

Table 3.7. A sample list of coho release replicates that are classified according to size. (All the replicates are in groups of three.)

release date  tag code  release site  brood year  size    total rel.  total obs.  % rec.  total esc.
10/5/81       081855    Quinsam       1979        small   7189        123         1.71    87  59
                                                          7191        130         1.81    86  62
                                                          7192        111         1.54    71
10/5/81       081856    Quinsam       1979        medium  7192        144         2.00    108  58
                                                          7210        114         1.58    88  61
                                                          7193        115         1.60    86
10/5/81       081857    Quinsam       1979        large   7202        148         2.05    110  60
                                                          7192        146         2.03    116  63
                                                          7207        134         1.86    99
26/5/81       081910    Capilano      1979        small   4098        135         3.29    49  11
                                                          4093        115         2.81    48  12
                                                          3845        103         2.68    42
26/5/81       081913    Capilano      1979        medium  3983        123         3.09    51  14
                                                          4038        127         3.15    54  42
                                                          4208        139         3.30    67
26/5/81       081943    Capilano      1979        large   3516        91          2.59    46  44
                                                          3570        102         2.86    57  45
                                                          3565        81          2.27    46

Figure 3.1a. Size of chinook release for tag codes from the benchmark release data subset.
Figure 3.1b. Size of coho release for tag codes from the benchmark release data subset.
Figure 3.2. Chinook observed recoveries over the recovery period considered. (tag code: 021827; brood year: 1979; recovery years: 1981 to 1984)
  Figure 3.2a. Commercial and sport observed recoveries.
  Figure 3.2b. Commercial observed recoveries.
  Figure 3.2c. Sport observed recoveries.
Figure 3.3. Coho observed recoveries over the recovery period considered. (tag code: 081842; brood year: 1979; recovery years: 1981 to 1982)
  Figure 3.3a. Commercial observed recoveries (periods over the 2 recovery years).
  Figure 3.3b. Sport observed recoveries (months over the 2 recovery years).
Figure 3.4a. Plot of cumulative sum of chinook commercial observed recoveries over time (periods over the 4 recovery years).
Figure 3.4b. Plot of cumulative sum of coho commercial observed recoveries over time (periods over the 2 recovery years).
Figure 3.5. Plots of chinook commercial observed recoveries over the sampling period. (tag code: 021827; brood year: 1979; panels: Northwest Vancouver Island Troll and North Central Troll, against adjusted periods)
Figure 3.6. Plots of coho commercial observed recoveries over the sampling period. (tag code: 081842; brood year: 1979; panels: Southwest Vancouver Island Troll and Georgia Strait Troll, against periods)
Figure 5.1a. Zeta(t) for coho. (Hatchery: Quinsam; brood year: 1979; size at release: medium)
Figure 5.1b. Transformed zeta(t) (power = 0.25).
Figure 5.3a. Zeta(t) for chinook, tag code 021827.
Figure 5.4. The trends of zeta's for Quinsam coho from different brood years. (size at release: small)
Figure 5.5. The trends of zeta's for Capilano coho from different brood years. (size at release: small)
Figure 5.6. The trends of zeta's for the three chinook tag codes: 021661, 021829, 021827 (against adjusted periods).
Figure 5.7. The estimated recovery intensity of coho for each of the 4 catch regions. (Hatchery: Quinsam; size at release: large; brood years 1978 and 1979; panels: SW Vancouver Island Troll, Georgia Strait Troll, South Central Troll, Johnstone Strait Net)
Figure 5.8. The estimated recovery intensity of coho for each of the 4 catch regions. (Hatchery: Quinsam; size at release: medium; same panels as Figure 5.7)
Figure 5.9. The estimated recovery intensity of coho for each of the 4 catch regions. (Hatchery: Quinsam; size at release: small; same panels as Figure 5.7)
Figure 5.12. The estimated recovery intensity of coho for each of the 4 catch regions. (Hatchery: Capilano; size at release: small; brood years 1979 and 1980; same panels as Figure 5.7)
Figure 5.13. The estimated recovery intensity of chinook for each of the 3 trolling regions, against adjusted periods. (panel shown: NW Vancouver Island Troll)
Figure 5.14. Estimated recovery intensities of coho and the corresponding 95% credibility intervals. (Hatchery: Quinsam; brood year: 1979; size at release: medium; panels: SW Vancouver Island Troll, Georgia Strait Troll, Johnstone Strait Net)
Figure 5.15. Estimated recovery intensities of coho and the corresponding 95% credibility intervals. (Hatchery: Capilano; brood year: 1980; size at release: medium; panels: SW Vancouver Island Troll, Georgia Strait Troll, South Central Troll, Johnstone Strait Net)
Figure 5.16. Estimated recovery intensities of chinook and the corresponding 95% credibility intervals. (tag code: 021827; brood year: 1979; panel: South Vancouver Island Troll, against adjusted periods)"@en ; edm:hasType "Thesis/Dissertation"@en ; edm:isShownAt "10.14288/1.0097859"@en ; dcterms:language "eng"@en ; ns0:degreeDiscipline "Statistics"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "University of British Columbia"@en ; dcterms:rights "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en ; ns0:scholarLevel "Graduate"@en ; dcterms:title "Local parametric poisson models for fisheries data"@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/28360"@en .