OPPORTUNITIES FOR MANAGEMENT CREATED BY SPATIAL STRUCTURE:
A CASE STUDY OF FINNISH REINDEER

By

JAMES MEYER BERKSON

B.A., The University of California, San Diego

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
(Department of Zoology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
January 1988
© James Meyer Berkson, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Zoology
The University of British Columbia
1956 Main Mall
Vancouver, Canada
V6T 1Y3

ABSTRACT

This study examines opportunities for renewable resource management when population data are collected by spatial subdivisions. In particular I look at potential applications for the design of management experiments, the distribution of monitoring resources, and the improvement of parameter estimation.

Methods are developed to rank possible groupings of subdivisions for use as experimental units. Factors external to the experiment can cause differences between experimental units. Selecting subdivisions that have reacted similarly in the past to external factors could minimize the risk of external factors creating differences in experimental units.
Methods are developed to identify subdivisions that could provide information about similar subdivisions when monitoring resources are low or when stratified sampling is being used. The use of these subdivisions as "index units" could notify managers of extremely good or bad years in a large number of subdivisions.

Two methods developed by Walters (1986) provide innovative estimation techniques that can be used with subdivided populations. A Bayesian approach allows parameter estimates to be adjusted using a known distribution. Another approach allows similar subdivisions to be estimated jointly more accurately than would be possible individually.

Not all renewable resource data sets provide reliable information for use with these applications. Data sets where there is little common variation, high levels of autocorrelation in the noise, or even modest amounts of measurement error are inappropriate for most methods. A series of steps is introduced for managers to test the reliability of the methods on their particular data sets.

Data on Finnish reindeer (Rangifer tarandus tarandus) are used throughout the thesis to illustrate the methods. The reindeer data appear to be appropriate for these methods when tested using the steps developed. Possible experimental units and index units for monitoring are identified. Walters' (1986) methods of parameter estimation are used on the data set as well.

The reindeer data show that subdivisions with similar external effects were located close to one another.
This pattern was at least partially caused by the existence of extremely bad years occurring within geographic regions. The reindeer subdivisions are very highly managed and provide little evidence of any kind of density dependence. Managers could potentially benefit by conducting experiments to test the biological limits of the population growth rates and carrying capacities within subdivisions.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER 1. INTRODUCTION
    Background
    Experimental Design
    Monitoring Allocation
    Parameter Estimation
    Case Study
    Organization of the Thesis

CHAPTER 2. IDENTIFYING SUBDIVISIONS WITH SIMILAR RESIDUALS
    Introduction
    Simulations
        Methods
        Results
        Discussion
    Case Study
        Methods
        Results
        Discussion

CHAPTER 3. IDENTIFYING SIMILAR SUBDIVISIONS IN THE PRESENCE OF COMPLEXITY
    Introduction
    Simulations
        Methods
        Results
        Discussion
    Case Study

CHAPTER 4. OPPORTUNITIES FOR EXPERIMENTAL DESIGN AND MONITORING ALLOCATION
    Introduction
    Methods
        Experimental Design
            Brute force ranking of units for small experiments
            Hierarchical clustering to rank units for larger experiments
        Monitoring Allocation
    Results
        Experimental Design
        Monitoring Allocation
    Discussion

CHAPTER 5. IMPROVING PARAMETER ESTIMATION OF SUBDIVIDED POPULATIONS
    Introduction
    Bayesian Approach
        Methods
        Results
    Estimation of common external effects
        Simulations
            Methods
            Results
        Case Study
            Methods
            Results
    Discussion

CHAPTER 6. CONCLUDING REMARKS

LITERATURE CITED

LIST OF TABLES

Table 1.1 List of the 56 reindeer management districts in northern Finland
Table 3.1 List of reindeer herds and years that exhibit the 50 lowest standardized residual values
Table 4.1 Ten closest groupings of reindeer subdivisions based on residual similarity
Table 4.2 Percentage of simulations resulting in correct identification of quads and pairs from cluster analysis
Table 4.3 Three sets of herds used for further ranking of experimental units
Table 4.4 Top candidates for experimental units within sets
Table 4.5 Groups of herds remaining at the 20th split for each half of the time series
Table 4.6 Ten most similar pairs during each half of the time series
Table 4.7 Potential key indicator subdivisions
Table 5.1 Mean growth rate values by herd
Table 5.2 Ricker "a" values as estimated before and after the Bayesian method
Table 5.3 Results of individual vs joint estimation simulations
Table 5.4 Estimates of Ricker "b" parameter by individual and joint parameter analysis

LIST OF FIGURES

Figure 1.1 Map of the 56 reindeer management districts in northern Finland
Figure 2.1 Performance of 3 similarity statistics as individual variation increased
Figure 2.2a Surface view of the probability of successful pairings versus individual and common variation
Figure 2.2b Contour view of the probability of successful pairings versus individual and common variation
Figure 2.3 Mean correlation as an indicator of probability of correct pairings
Figure 2.4 Distribution of cross correlation coefficients between pairs of Finnish reindeer herds
Figure 2.5 Distribution of rank correlation values comparing the first half of the time series to the second half
Figure 2.6 Distribution of rank correlation values comparing the first half of the time series to the entire time series
Figure 2.7 Distribution of rank correlation values comparing the second half of the time series to the entire time series
Figure 3.1 Effect of autocorrelation on the probability of correct pairings
Figure 3.2 Effect of measurement error on the probability of correct pairings
Figure 3.3 Effect of extremely bad years on the probability of correct pairings
Figure 3.4 Distribution of autocorrelation in reindeer herd residuals
Figure 3.5 Comparison of reindeer standardized residuals to a normal distribution
Figure 3.6 Comparison of reindeer trimmed, standardized residuals to a normal distribution
Figure 3.7 Map of reindeer herds with extremely low residual values during common years (3 years illustrated)
Figure 4.1 Five of the ten most similar pairs of subdivisions
Figure 4.2 Amount of time required to find the 10 top candidates for experimental units
Figure 4.3 Results of divisive clustering of the reindeer herds
Figure 4.4 Results of the first 15 splits of divisive clustering
Figure 4.5 Major clusters of herds remaining at the 15th split of divisive clustering
Figure 4.6 Best candidate for 6 unit experiments within each of the 3 sets
Figure 4.7 Potential indicator herds at the level of 0.65 shown with the herds that they represent
Figure 5.1 Distribution of the mean growth rate estimated for the reindeer herds
Figure 5.2 Result of simulations comparing individual to joint estimation with high level of common variation

ACKNOWLEDGEMENTS

Many people provided me with assistance in the completion of this thesis. Dr. Carl Walters introduced the topic to me and helped with my project orientation. Dr. Carl Walters and Dr. Don Ludwig assisted me with its organization and were very helpful in discussions. I particularly wish to thank my entire Supervisory Committee: Dr. Carl Walters, Dr. Don Ludwig, Dr. C.S. Holling, Dr. A.R.E. Sinclair, and Dr. N.J. Wilimovsky. Their suggestions and help were always appreciated. John Eadie, Dr. Don Ludwig, Locke Rowe, and Peter Watts had the thankless task of reading an early draft of this thesis.
I appreciated all of the help provided by the UBC statistics department, particularly the staff of SCARL and Professors Schulzer and Zamar. Brad Anholt, Alistair Blachford, Pete Cahoon, Colin Daniels, John Eadie, Dr. Chris Foote, Dr. Lee Gass, Linda Glennie, Bob Gregory, Gordon Haas, Don Hall, Debbie McClennan, Teresa Patterson, Rob Powell, John Richardson, Don Robinson, Locke Rowe, Arlene Tompkins, Andrew Trites, and Peter Watts all honored me with their friendship by providing helpful comments, needed encouragement, and unwarranted abuse.

Financial support was provided by the University of British Columbia in the form of a research assistantship, a university graduate fellowship and a teaching assistantship. Don Hall formatted the equations in Chapter 5 for proper presentation.

Finally, I wish to thank my family: Dick, Katie, Ben, Justin, Garrett, and particularly my wife Denise, for always being there when I need them. I would also like to thank Denise for proofreading the various drafts of this thesis and for keeping me calm and productive.

CHAPTER 1

INTRODUCTION

The management of renewable natural resources, such as wildlife and fisheries, has been attempted on a very localized basis, with each biological subpopulation, herd, or stock treated as a unique entity with regard to its production parameters and pattern of natural variation. Recently, that approach has been questioned (Walters 1986), and it has been suggested that coordination of management activities (monitoring, experimental policy tests) across units may provide substantially better harvests.

Until recently, the resource management literature has not addressed a systematic approach for utilizing information provided by subdivided populations.
Textbooks and reviews in the following fields have not discussed this approach: Fisheries (Gulland 1983, May 1984, Nelson and Johnson 1983, Ricker 1975); Wildlife Management (Bailey 1984, Giles 1969, Shaw 1985); Wildlife Ecology (Cox 1969, Robinson and Bolen 1984, Wakeley 1982, Watt 1968); Resource Modeling (Grant 1986, Starfield and Bleloch 1986). The systematic utilization of information from subdivisions was first introduced by Walters (1986).

The work presented in this thesis will expand on the work of Walters (1986). In particular, new methods will be introduced to improve experimental design and monitoring allocation. Examples of recent improvements in parameter estimation will be presented.

BACKGROUND

A main objective of renewable resource managers is to increase the commercial value of their resource. Fisheries and wildlife managers often try to harvest a maximum number of animals each year without depleting the overall population. Before management policies, such as harvest rates, can be established, managers require information about the relationship between population size and potential population size increases. This relationship can be better understood with the aid of a population model to predict future population size based on present population size. The population model is an equation involving present population size and a number of unknown constants, or parameters. A population growth rate is an example of a production parameter. Managers estimate values for the parameters that are the most appropriate or, in other words, provide the best fit for the past population data. Managers constantly look for methods to improve the estimation of model parameters.

Models do not fit population data exactly. There always is a difference between values the model predicts and values observed in nature. The difference between a predicted and observed value is called a residual.
Managers are also interested in the information unaccounted for by the model, i.e. in the pattern of natural variation represented by the residuals.

With information on population parameters and natural variation, managers develop several kinds of policies. They develop harvest policies to establish harvest rates and harvest times. They develop other policies in order to collect information, or data, about the population size. Resources available for monitoring populations are often limited, requiring managers to establish monitoring priorities. The distribution of monitoring resources is commonly called monitoring allocation. Managers may also want to test new policies on portions of their population, such as the testing of higher harvest rates. These tests can be thought of as experiments and require proper design.

Managers often develop and implement policies for individual herds or stocks of the species in their charge, providing biologically distinct units for management. In other cases, management units are not based on biological distinctions but instead on arbitrary factors. In many cases a grid is placed over a map of the population range, as in the case of Pacific groundfish stocks in Canada (Tyler and McFarlane 1985). Management units which are geographically distinct but within the same population range can also be called subdivisions. In the case of separate herds as well as separate regions laid out by a grid, managers subdivide the population for management purposes. Often data providing information on population size are collected within the population subdivisions.

Where management is localized to one unit, managers will have a single series of animal population estimates over time. With this, managers can attempt to find a relationship between population size at one time and population size at some time in the future.
Model parameters can be estimated for this relationship. In addition, managers will have a time series of residuals providing information on natural variation. Although patterns or cycles of population behavior can be discovered within one time series of residuals (Box and Jenkins 1976, Chatfield 1980), in many cases these procedures provide information of little use to managers.

Where management is coordinated among population subdivisions, managers will have available a time series of population data from each subdivision. Models can be fit to population data within each subdivision. New procedures can be developed to improve the accuracy of parameter estimation. Also, a time series of residuals is available from each subdivision. Residuals represent variation not explained by the model, or effects external to the model. Residuals can be compared between subdivisions to identify subdivisions that share common external effects. Policies which take into account similarity in residuals among subdivisions provide a more systematic approach to the management of subdivided populations than is currently being practiced.

EXPERIMENTAL DESIGN

Managers can gain increased information about their populations by using the approach of adaptive management (Holling 1978). Adaptive management recommends regular and active "probing experiments" in order to provide information to management. Managers can manipulate harvest rates in order to learn how a population will grow given a broad range of population sizes. With this information, managers can set population sizes in order to maximize annual harvests. Without active probing, a population is kept at a constant level and managers miss potential opportunities to improve harvests.

Another example of active probing is the testing of new management policies in a management experiment.
When the population is subdivided, new policies can be tested on some experimental units while other units maintain control policies. Managers can then assess the benefits and costs of the experimental policy relative to the control baseline.

Experimental policy tests have often been conducted in fisheries and wildlife management. An example of this is the wolf control project of the British Columbia Wildlife Branch (Atkinson and Janz 1986). Ungulate survival rates have been compared between regions where wolves have been removed and regions where wolves remain. Overall management policy has been made based on the results of the experiments (Hatter and Janz 1986). Results of many of these experiments are unpublished or published as internal technical reports within individual government agencies.

Management experiments can be very difficult to interpret because subdivisions are not identical to each other as they would be in a laboratory experiment. Suppose two units are being used for a management experiment, one as a control with an existing management policy and one as a treatment with an experimental policy. If a change in behavior is observed between the control and the treatment, one cannot say with certainty whether the change is due to the experimental policy or to some other difference in external factors between the subdivisions.

There are a number of possible criteria that managers can use to select experimental units. Managers may wish to use subdivisions which are not economically important so as not to interfere with important commerce. Managers may wish to use subdivisions with a broad range of environmental conditions to ensure that the experimental policy behaves similarly under a broad range of conditions.
This thesis suggests another possible criterion for selecting units as controls and treatments in management experiments: select subdivisions that are likely to have similar responses to external factors, in order to minimize differences in external effects. Subdivisions that have responded similarly to external factors in the past are more likely to respond similarly in the future. If pairs of subdivisions can be compared in terms of their external effects, similar subdivisions can be identified for use in management experiments as control and treatment units.

MONITORING ALLOCATION

In many cases managers may wish to monitor some subdivisions more extensively than others. In the case where limited resources do not permit the monitoring of all subdivisions, managers must select the subdivisions to monitor. Managers using improved stratified sampling plans (Cochran 1977, Murthy 1967) will also have to single out subdivisions to receive additional monitoring effort.

Many possible criteria can be used to select the subdivisions to be monitored the most extensively. Managers may want to precisely monitor only commercially important subdivisions. Subdivisions that are easily accessible may be monitored the most intensely. Other criteria include concentrating on subdivisions whose dynamics are poorly understood, or on subdivisions with a population in steep decline. Tanner (1978) suggests selecting units at random for intensive sampling within each stratum.

This thesis suggests an alternative criterion for selecting subdivisions for extensive monitoring: select subdivisions which are similar to other subdivisions in terms of external effects.
In this way extremely good or bad years could be identified for a large number of subdivisions, provided similarity in response to external factors in the past is a good indicator of similarity in the future. The subdivisions being monitored more extensively can be called "index units". If pairs of subdivisions can be compared in terms of their external effects, representative subdivisions can be identified for use as index units.

PARAMETER ESTIMATION

Additional information for management can be gained by estimating parameters using population data collected within subdivisions. The literature on resource management other than Walters (1986) does not address this issue (Grant 1986, Gulland 1983, May 1984, Ricker 1975). Parameter estimation can potentially be improved by using joint estimation procedures with data from subdivisions.

I will examine two of the techniques developed by Walters (1986). These techniques are presented in Walters' book without using sample data sets as examples. Resource managers may find the methods easier to understand and implement after seeing an example of their use.

In one approach, parameter estimates across subdivisions can be adjusted assuming they were randomly "drawn by nature" from a known distribution. According to Walters, this Bayesian method can improve the estimation of parameters assumed to have been drawn at random from a normal distribution.

In the second approach, parameters are estimated jointly for a group of subdivisions, along with the shared external effects of each year. This procedure works better than individual parameter estimation when shared external effects are large (Walters 1986).

CASE STUDY

Methods for improving experimental design, monitoring allocation, and parameter estimation will be developed and tested in this thesis.
These methods will then be used on an actual data set. Population data on reindeer (Rangifer tarandus tarandus) from northern Finland will be used as the case study. The population is subdivided into 56 herds for management purposes (Table 1.1 and Figure 1.1). All herds are located north of the Kiiminki river. The Union of the Reindeer Raising Districts is responsible for the management of about 366,000 reindeer (Anonymous 1987). Reindeer within the 56 management districts graze on natural pastures.

Table 1.1 List of the 56 reindeer management districts in Finland

Herd #  Name                Herd #  Name
 1      Paistunturi         29      Palojarvi
 2      Kaldoaivi           30      Orajarvi
 3      Naatamo             31      Kolarin alanen
 4      Muddusjarvi         32      Jaasko
 5      Vatsari             33      Narkaus
 6      Ivalo               34      Niemela
 7      Hammastunturi       35      Timisjarvi
 8      Sallivaara          36      Tolva
 9      Muotkatunturi       37      Livo
10      Nakkala             38      Isosydanmaa
11      Kasivarsi           39      Mantyjarvi
12      Muonio              40      Kuukas
13      Kyro                41      Alakitka
14      Kuivasalmi          42      Akanlahti
15      Alakyla             43      Hossa-Irni
16      Sattasniemi         44      Kallioluoma
17      Oraniemi            45      Oivanki
18      Syvajarvi           46      Jokijarvi
19      Pyhajarvi           47      Taivalkoski
20      Lappi               48      Pudasjarvi
21      Kemin-Sompio        49      Oijarvi
22      Sallan pohjoinen    50      Livo
23      Salla               51      Pintamo
24      Hirvasniemi         52      Kiiminki
25      Kallio              53      Kollaja
26      Vanttaus            54      Ikonen
27      Poikajarvi          55      Naljanka
28      Lohijarvi           56      Halla

Figure 1.1 Map of the 56 reindeer management districts in northern Finland

Reindeer are kept within their district by means of herding and also fencing where feasible. Any Finnish citizen living within a management district can own reindeer. Management decisions within each district are made by a board of directors from the district in concert with the Union of Management Districts. There are 7500 reindeer owners. Eight hundred families get their principal income from owning reindeer (Anonymous 1987).

Data on the reindeer populations were provided by Dr. Carl Walters of U.B.C.
from a database prepared by Dr. Timo Helle, Rovaniemi, Finland. Data include the number of males, females, and calves censused along with the number of males, females, and yearlings harvested. Census data are available for all years in all areas from 1961 through 1983.

ORGANIZATION OF THE THESIS

This thesis will develop, test, and use methods to improve experimental design, monitoring allocation, and parameter estimation of subdivided populations. Each chapter will develop methods, test them using Monte Carlo simulations where appropriate, and use them on the Finnish reindeer data.

Chapter 2 will introduce methods to identify subdivisions with similar time series of residuals along with the conditions necessary for these methods to work effectively. Managers will be given suggestions about how to test the reliability of these methods for any given data set.

Chapter 3 will test the methods further by adding many real world complications to the system. The presence or absence of complications in actual data sets will help managers judge the reliability and appropriateness of the methods.

Chapter 4 looks specifically at experimental design and monitoring allocation. Methods to select subdivisions to be used as controls and treatments are developed. Methods to select subdivisions as index units for more intensive monitoring are developed.

Chapter 5 tests and uses two of the methods developed by Walters (1986) to improve parameter estimation. It will be shown that in many cases, these methods are superior to independent parameter analysis.

CHAPTER 2

IDENTIFYING SUBDIVISIONS WITH SIMILAR RESIDUALS

INTRODUCTION

The preceding chapter mentioned several opportunities available to managers of subdivided populations. Three main types of possible improvement are experimental design, monitoring allocation, and parameter estimation.
All three of these approaches require the ability to identify subdivisions that react similarly to external effects. This requirement can be met first by using models that account for major sources of variation (such as population size) within each subdivision, and second by comparing the time series of residuals among subdivisions, where the residuals are measured around the models.

This chapter will develop and test methods to identify similar subdivisions in terms of external effects. The methods will not work well in all cases. Managers will be given suggestions on how to judge the potential reliability of these methods on their individual data sets. Finally, the methods will be demonstrated with the Finnish reindeer data.

SIMULATIONS

Simulation Methods

To develop and test methods for identifying similar subdivisions from time series residuals, a population model must first be introduced. A population model is a mathematical tool used to relate the population size in one year with the population size in another year. A simple model is used as an example. Population size is assumed to be available annually. Suppose that population size changes in the following way:

$$N_{t+1,i} = N_{t,i} (1 - H_{t,i}) \lambda_i$$

where
$N_{t,i}$ = population size at time t in subdivision i
$H_{t,i}$ = harvest rate at time t in subdivision i (fraction of the population harvested)
$\lambda_i$ = mean population growth rate in subdivision i

Population growth rates are independent of population density in this model. Values of $\lambda_i$ can be estimated by fitting population data from each subdivision to the model (Draper and Smith 1981).
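As a minimal sketch of this fitting step, the growth rate of one subdivision could be estimated as follows. The thesis does not state the fitting criterion, so the sketch assumes ordinary least squares through the origin, regressing next year's population on this year's post-harvest population. The function name `estimate_growth_rate` and the example series are hypothetical.

```python
def estimate_growth_rate(pop, harvest_rate):
    """Least-squares estimate of lambda for N[t+1] = N[t] * (1 - H[t]) * lambda.

    Assumption: regression through the origin of N[t+1] on the
    post-harvest population N[t] * (1 - H[t]); the fitting criterion
    is not specified in the text.
    """
    # Post-harvest population for every year that has a following census.
    x = [n * (1.0 - h) for n, h in zip(pop[:-1], harvest_rate)]
    y = pop[1:]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical noise-free series generated with lambda = 1.4 and H = 0.28.
pop = [1000.0]
for _ in range(5):
    pop.append(pop[-1] * (1.0 - 0.28) * 1.4)
lam_hat = estimate_growth_rate(pop, [0.28] * 5)
```

With noise-free data the estimate recovers the growth rate used to generate the series; with real census data it is a weighted average of the observed annual growth rates, with larger populations weighted more heavily.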
Once the parameters ($\lambda_i$) are estimated, a time series of residuals can be easily calculated in the following way:

$$R_{t+1,i} = \text{actual population size} - \text{expected population size} = N_{t+1,i} - N_{t,i} (1 - H_{t,i}) \hat{\lambda}_i$$

where $\hat{\lambda}_i$ = the estimated mean growth rate for subdivision i

A simulation model was designed to compare the abilities of three statistics to correctly identify similar time series of residuals. Monte Carlo simulations are frequently used to test the effectiveness of complicated methods of analysis (Ludwig and Walters 1981, Walters 1985).

Techniques to measure similarity have been used in the field of numerical taxonomy (Sokal and Sneath 1963, Sneath and Sokal 1973) in which subjects are compared in terms of a list of characteristics. For this work subjects were compared in terms of a list of differences in population responses over time, i.e. in terms of a list of residuals from a population model fitted to each time series. The three statistics tested were the coefficient of association, the rank correlation coefficient and the correlation coefficient. The statistics were calculated between residuals of pairs of subdivisions.

The coefficient of association is a simple statistic which represents the percentage of the time that two columns of numbers have the same sign. Magnitude is not taken into account. This particular statistic is also named the simple matching coefficient (Zubin 1938, Sokal and Michener 1958). The pair of subdivisions with the highest value for the coefficient of association would be considered to be the most similar.

A more complex statistic to compare time series of residuals is the cross correlation coefficient. This statistic was first introduced as the inverted factor technique (Stephenson 1936).
Magnitudes of the residuals are considered as well as signs.

There is a variety of statistics with complexities intermediate between the correlation coefficient and the coefficient of association. One is the non-parametric rank correlation coefficient. To calculate this statistic, the residuals within each time series are ranked from lowest to highest, from 1 to n (where n is the length of the time series). The correlation coefficient between two subdivisions is then calculated from the pairs of ranks.

Monte Carlo simulations were designed to test which of these three statistics worked best under various conditions. A simple four subdivision model was created. (A more complex and realistic series of models will be used in the next chapter.) The four subdivisions annually behaved according to:

$$N_{t+1,i} = N_{t,i}\,(1 - H_{t,i})\,\lambda_i\,\omega_{t,i}$$

with all variables defined as above and $\omega_{t,i}$ = noise term as defined below.

Two components of noise made up $\omega_{t,i}$ for each subdivision each year. One component was an "individual" component of noise picked randomly each year for each subdivision. The second component of noise was a "common" term picked randomly each year for each of two pairs of subdivisions. Subdivisions 1 and 2 shared the same common component of noise, and subdivisions 3 and 4 shared a second common component. This design resulted, by definition, in subdivisions 1 and 2 being similar, and different from 3 and 4. The noise terms were calculated as:

$$\omega_{t,1} = \exp(\gamma_{t,1} + \eta_{t,1-2} + \kappa_1)$$
$$\omega_{t,2} = \exp(\gamma_{t,2} + \eta_{t,1-2} + \kappa_2)$$
$$\omega_{t,3} = \exp(\gamma_{t,3} + \eta_{t,3-4} + \kappa_3)$$
$$\omega_{t,4} = \exp(\gamma_{t,4} + \eta_{t,3-4} + \kappa_4)$$

where:
$\omega_{t,i}$ = noise term at time t for subdivision i
$\gamma_{t,i}$ = individual noise term at time t for subdivision i
$\eta_{t,1-2}$ = common noise term at time t for subdivisions 1 and 2
$\eta_{t,3-4}$ = common noise term at time t for subdivisions 3 and 4
$\kappa_i$ = correction factor for the mean to be 1.0 for subdivision i

All noise terms were drawn from a normal distribution with mean of 0. A range of combinations of standard deviations was tried. Residuals were calculated by fitting the model described earlier to each subdivision, then subtracting the expected from the observed. For a large set of Monte Carlo simulations of the model, the results were tested to see if the values of the three statistics were the highest for the pairs 1 and 2, and 3 and 4, out of all possible pairwise combinations. One hundred test data sets were generated for each combination of individual and common standard deviations.

Parameters used in the model were representative of values estimated for the Finnish reindeer data. Mean annual population growth rate was set to 1.40, the annual harvest rate was set to 0.28, and the initial population size was set to 1000. Each simulation ran for 23 years.

The individual standard deviation and the common standard deviation each varied from 0.02 to 0.4, in steps of 0.02. All combinations were tried. For each combination of individual and common standard deviations, the model was run 100 times in order to estimate the probability of correct pairings. A mean cross correlation coefficient was calculated by taking the average of all possible pairwise cross correlation coefficients over the 100 runs.

Simulation Results

The simulations showed that the correlation coefficient did a better job of correctly identifying similar subdivisions (subdivision pair 1 and 2 and pair 3 and 4) than did the other two statistics. The success rate increased as the amount of common variation increased. The coefficient of association was the least reliable.
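The Monte Carlo design described under Simulation Methods can be sketched as follows. This is a minimal illustration rather than the original program: it assumes least squares through the origin for the fit, assumes the correction factor takes the lognormal form minus half the total noise variance, and counts a run as successful when the two designed pairs have the two highest values of the chosen statistic. All function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(x):
    """Rank values from 1 (lowest) to n (highest); ties are not handled."""
    order = sorted(range(len(x)), key=lambda k: x[k])
    r = [0] * len(x)
    for rank, k in enumerate(order, start=1):
        r[k] = rank
    return r


def rank_correlation(x, y):
    """Correlation coefficient calculated from the pairs of ranks."""
    return pearson(ranks(x), ranks(y))


def association(x, y):
    """Fraction of years in which the two residuals have the same sign."""
    same = sum(1 for a, b in zip(x, y) if (a >= 0) == (b >= 0))
    return same / len(x)


def simulate_residuals(sd_ind, sd_com, rng, years=23, lam=1.40, h=0.28, n0=1000.0):
    """Residual series for 4 subdivisions; 0-1 and 2-3 share common noise."""
    # Assumed correction factor: makes the lognormal noise term have mean 1.0.
    kappa = -(sd_ind ** 2 + sd_com ** 2) / 2.0
    pops = [[n0] for _ in range(4)]
    for _ in range(years - 1):
        common = [rng.gauss(0.0, sd_com), rng.gauss(0.0, sd_com)]
        for i in range(4):
            w = math.exp(rng.gauss(0.0, sd_ind) + common[i // 2] + kappa)
            pops[i].append(pops[i][-1] * (1.0 - h) * lam * w)
    out = []
    for p in pops:
        x = [n * (1.0 - h) for n in p[:-1]]
        lam_hat = sum(a * b for a, b in zip(x, p[1:])) / sum(a * a for a in x)
        out.append([b - a * lam_hat for a, b in zip(x, p[1:])])
    return out


def success_rate(stat, sd_ind, sd_com, runs=100, seed=1):
    """Share of runs in which the designed pairs score highest under `stat`."""
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    wins = 0
    for _ in range(runs):
        res = simulate_residuals(sd_ind, sd_com, rng)
        scores = {p: stat(res[p[0]], res[p[1]]) for p in pairs}
        top_two = sorted(pairs, key=lambda p: scores[p], reverse=True)[:2]
        wins += set(top_two) == {(0, 1), (2, 3)}
    return wins / runs
```

Calling, for example, `success_rate(pearson, 0.10, 0.15)` and the same with `rank_correlation` or `association` gives one point of a comparison like the one reported here; exact values depend on the seed and on the assumptions listed above.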
Figure 2.1 shows how the similarity measures performed for a range of individual standard deviations, for a constant value of the common standard deviation of 0.15. The ordering of reliabilities held for other values of the common standard deviation as well. The rank correlation coefficient performed nearly as well as the correlation coefficient.

The probability of successful pairings (identifying both pairs created to be similar) depended on the relative proportion of individual variation to common variation as well as the overall magnitude of the variation. There was a greater probability of successful pairings as common variation increased relative to individual variation.

Magnitude of the components of variation was an important factor in correctly identifying pairs. When the ratio of common to individual variation was held at 2 to 1, successful pairings increased as the overall magnitude of variation increased. Similarly, when the ratio of individual to common variation was held at 2 to 1, successful pairings decreased as the overall magnitude of variation increased.

Figure 2.1 Performance of 3 similarity statistics as individual variation increased (x axis: individual standard deviation; y axis: probability of successful pairings; series: correlation, rank correlation, coefficient of association)

The results of the simulations are shown as a surface drawing in Figure 2.2a and as a contour drawing in Figure 2.2b. The probability of successful pairings ranged from as high as 1.0 in cases where a high amount of variation was common to as low as 0.1 or less in cases where most of the variation was individual.

The mean correlation coefficient was a fairly good indicator of the probability of successful pairings.
The mean cross correlation over 100 runs increased as the probability of correct pairings increased (Figure 2.3). The mean cross correlation value of individual runs varied greatly around the overall mean value. There were occasional high values of the mean correlation coefficient in simulation runs which identified incorrect pairings.

Discussion of Simulation Results

The simulations showed that the cross correlation coefficient was the most reliable at identifying subdivisions that have responded similarly to external factors. Unfortunately, the correlation did not always correctly identify similar pairs of subdivisions. Managers need some way to judge the likely reliability of the methods on any given data set before using these methods to select units for experimentation or units to monitor. The results of the simulations provide possible ways to judge or improve the reliability of methods to identify subdivisions with similar external effects.

Figure 2.2a Surface view of the probability of successful pairings versus individual and common variation

Figure 2.2b Contour view of the probability of successful pairings versus individual and common variation

Figure 2.3 Mean correlation as an indicator of probability of correct pairings (x axis: mean cross correlation coefficient over 100 runs; y axis: probability of correct pairings)

The correlation coefficient produced correct results more often than the other two statistics because it used more of the available information and the noise was normally distributed. Only the correlation coefficient took advantage of all of the information available in the data: the signs and magnitudes. In addition, the components of noise added in the simulations came from a normal distribution.
The correlation coefficient should work best under conditions of normal noise. It is important to note that, because of bias in parameter estimation, the residuals were not identical to the noise that had originally been added (Walters 1985).

The probability of the correlation identifying similar pairs increased when pairs of subdivisions were created to be more similar. Identification methods became useless when common variation was small relative to individual variation. In these cases the pairs selected as most similar were most similar by chance rather than by causation.

The simulations imply that managers cannot always use these methods to accurately identify similar subdivisions. If managers could estimate the common and individual standard deviations found in their data sets, they could roughly judge the reliability of this method. However, this is not possible. The simulations contained, by design, one level of common variation shared between pairs of subdivisions. Real world data sets contain many levels of common variation: each possible pair of subdivisions shares a different level of common variation; each group of three shares a different level of common variation; all of the subdivisions together will share some level of common variation. There is no way to identify which of these levels of variation should represent a single measure of common variation as used in the model.

Managers can do several things to improve and to gauge the reliability of assessments of similarity. First, if managers can group the data from their subdivisions based on known, common, external factors, they can increase the reliability of the methods. As an example, suppose that all of the reindeer herds in northern Finland are kept outside in the winter, foraging through the snow and ice, while all of the southern herds are kept in heated barns and supplementally fed.
By splitting up the two distinct groups before analysis, managers should increase the ratio of common to individual variation within each group and thereby increase the reliability of the method.

Secondly, the mean cross correlation coefficient can act as a rough indicator of the ratio of common to individual noise. As an example, if a group of subdivisions contained only individual noise, the expected mean cross correlation coefficient would be 0. As the amount of common noise increases relative to individual noise, the mean cross correlation coefficient should increase as well. The simulations showed that the mean cross correlation coefficient was, on average, a good indicator of the probability of successful pairings.

Thirdly, the data can be broken down into smaller segments and identification methods can be tested on each segment. In this way managers can see if the same patterns of similarity occur over time. Additionally, managers can see if overall patterns of similarity are due to one or two extreme years.

CASE STUDY

The reindeer data are used as an example of evaluating similarity of subdivisions. The reliability of identifying similar reindeer herds based on residuals must first be gauged before the data can be used with methods in Chapters 4 and 5 to improve monitoring allocation, experimental design, or parameter estimation.

Case Study Methods

The first step, as described earlier, was to group subdivisions based on known common external factors. I have little information about the primary causes of external noise in this population, and without this information this step could not be completed except to group the subdivisions into general geographic units (e.g., far north, central, southern, etc.).
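The use of the mean cross correlation coefficient as a rough indicator of common relative to individual noise can be made concrete for the simulation design of this chapter. The following is a sketch under stated assumptions, not a result from the simulations: it applies to the log-scale noise itself rather than to fitted residuals, writes the individual and common noise components as $\gamma$ and $\eta$, and introduces $\sigma_I$ and $\sigma_C$ as names for their standard deviations. For two subdivisions sharing a common component,

$$\rho_{\text{pair}} = \frac{\operatorname{Cov}(\gamma_{t,1} + \eta_{t},\ \gamma_{t,2} + \eta_{t})}{\sigma_I^2 + \sigma_C^2} = \frac{\sigma_C^2}{\sigma_I^2 + \sigma_C^2}$$

and for two subdivisions sharing no common component the expected correlation is 0. In the four subdivision design, two of the six possible pairs share a common component, so the expected mean cross correlation is approximately

$$\bar{\rho} \approx \frac{2\,\rho_{\text{pair}} + 4 \cdot 0}{6} = \frac{\rho_{\text{pair}}}{3}$$

For example, with $\sigma_I = 0.10$ and $\sigma_C = 0.15$, $\rho_{\text{pair}} = 0.0225 / 0.0325 \approx 0.69$ and $\bar{\rho} \approx 0.23$. With only individual noise ($\sigma_C = 0$) the expression reduces to 0, matching the example given in the discussion. Residuals from a fitted model will depart somewhat from these values because of estimation bias.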
The second step was to calculate the mean cross correlation coefficient among all possible pairs of subdivisions. The model used earlier for simulations was fit to all 56 subdivisions, resulting in 56 estimates of mean growth rate as well as 56 series of 22 residuals over time. Cross correlation coefficients were calculated between all possible pairs of subdivisions. The distribution of cross correlation coefficients and the mean cross correlation coefficient were also calculated.

The third step involved comparing similarity patterns over time. A simple test to compare the similarity patterns was designed. The data set was split into two halves, years 1 through 11 and years 12 through 23. Residuals were calculated for each herd for the two time periods as had been done earlier. Cross correlation coefficients were calculated between each herd and all other herds, one by one, for the two time periods. For each herd, all correlations between it and all other herds were ranked separately for the two time periods.

If the herds had shown the same structure of similarity (pattern of similarity in terms of common external effects between subdivisions) during both time periods, the two columns of rankings would be the same. A rank correlation coefficient was calculated comparing each herd's similarity rankings for the two time periods. This produced a distribution of 56 rank correlation coefficients, one for each herd. Each half of the time series was also compared to the entire time series using the same method.

Case Study Results

The results of the case study analysis showed that common variation was very apparent in the data set. However, the structure of similarity varied over the time series.

The distribution of cross correlation coefficients is shown in Figure 2.4.
The mean value is 0.27. The distribution of cross correlation coefficients is centered above 0, showing the presence of common variation.

When the first half of the time series was compared to the second half, the average rank correlation among herds was only 0.16. If the herds had shown no relationship the average value would be 0.0. The distribution is shown in Figure 2.5. The distribution of the rank correlations comparing the first half of the time series to the overall time series (Figure 2.6) produced an average of 0.80. The distribution between the second half of the time series and the overall time series (Figure 2.7) produced an average of 0.59.

Figure 2.4 Distribution of cross correlation coefficients between pairs of Finnish reindeer herds (x axis: cross correlation coefficient; y axis: number in range; Mean = 0.258, StDev = 0.259; a label of -0.2 means the interval -0.3 < X < -0.2)

Figure 2.5 Distribution of rank correlation values comparing the first half of the time series to the second half (x axis: rank correlation value; y axis: number of herds in range; a label of 0 means -0.1 < X < 0.0)

Figure 2.6 Distribution of rank correlation values comparing the first half of the time series to the entire time series (x axis: rank correlation value; y axis: number of herds in range; Mean = 0.80; a label of 0 means -0.1 < X < 0.0)

Figure 2.7 Distribution of rank correlation values comparing the second half of the time series to the entire time series (x axis: rank correlation value; y axis: number of herds in range; a label of 0 means -0.1 < X < 0.0)

Case Study Discussion

Overall, the results encourage the use of clustering and shared parameter estimation methods with the reindeer data. The mean cross correlation coefficient is greater than 0, and both the first and the second half of the time series contributed greatly to the overall structure of similarity found in the data.

The structure of similarity was clearly very different between the first 11 years and the last 12 years of the time series. This could be due to changing environmental conditions, but it is more likely due to changing management policies among herds. If more were known about management actions over the time period, this information could be included in the model as explainable factors, rather than in the residuals.

The process of identifying similar herds would be much easier if additional information were available about management policies or about the principal causes of noise by herd and by year. Identifying similarity based on known differences in causation would be preferred over trying to guess at it with residuals. In many cases, as in my analysis of the reindeer data, managers may not have this information.

CHAPTER 3

IDENTIFYING SIMILAR SUBDIVISIONS IN THE PRESENCE OF COMPLEXITY

INTRODUCTION

In Chapter 2, the correlation coefficient was used to identify subdivisions with similar residuals. The reliability of this statistic depended on the amount of shared variation among subdivisions. Several suggestions were offered to gauge the usefulness of the measure with any particular data set. However, the analysis was based on a very simple model. In reality, many additional complexities exist.
In this chapter the ability to identify similar subdivisions will be tested in the presence of three additional factors: autocorrelation in residuals, measurement error, and the presence of extremely bad years. One of these factors will aid in the identification of similar subdivisions; two will hinder it. The presence or absence of these factors in actual data sets can encourage or discourage the identification of similarity of subdivisions.

Autocorrelation occurs when the noise in any given year is correlated with the noise in previous years. If any of the unmodeled components involved in population change or variation operate on a cycle or other nonrandom pattern over time, autocorrelation is present in the residuals of any model that does not correctly represent the components. Some environmental phenomena, for example oceanographic variations (Mysak 1986), are thought to occur in cycles. The factors that create noise, such as predators, prey, and habitat conditions, are in many cases dependent on previous states.

Measurement error occurs when the population size observed or estimated is not the actual population size but instead departs from the actual by some random factor each year. In many cases populations are difficult to census effectively. In these cases measurement error would be high.

In the model of Chapter 2, noise terms were drawn from a normal distribution. This may not be the case in actual data sets. Bad years may occur in data sets more frequently, or more extremely, than good years. The maximum rate of population increase is a biological limit. A population of reindeer can never triple in a good year. There is a different limit on the maximum rate of decrease; in an extremely bad year, a population could go extinct. Because of this, populations may be subject to a skewed noise distribution.
In this chapter Monte Carlo simulations will be used to estimate the effects of these complexities on methods to identify similarity of subdivisions. The presence or absence of these elements of complexity will encourage or discourage the use of such methods. This will provide managers with an additional way to judge the appropriateness of these methods for individual data sets. The reindeer data set will be used as an example of the methods.

SIMULATIONS

Simulation Methods

The population model used in Chapter 2 will be the basis for the models used in this chapter. Simulations will be used to estimate the probability of correctly identifying pairs of populations with shared effects.

Autocorrelated noise was generated in the following way: noise in the first year of each simulation was generated as it was in Chapter 2:

$$\omega_{1,i} = \exp(\gamma_{1,i} + \eta_{1,i-j} + \kappa_i)$$

In following years a term relating the noise at time t to that at time t - 1 was added:

$$\gamma_{t,i} = \alpha\,\gamma_{t-1,i} + \epsilon\,\phi_{t,i}$$

and similarly:

$$\eta_{t,i-j} = \alpha\,\eta_{t-1,i-j} + \epsilon\,\nu_{t,i-j}$$

where:
$\gamma$, $\eta$, $\kappa$ = individual noise, common noise, and correction factor, as in Chapter 2
$\alpha$ = autocorrelation coefficient
$\epsilon$ = square root of $(1 - \alpha^2)$
$\phi_{t,i}$ = random normal deviate for time t in subdivision i
$\nu_{t,i-j}$ = random normal deviate for time t for subdivision pair i, j

As $\alpha$ is increased, the amount of autocorrelation is increased. The individual standard deviation was set to 0.10 and the common standard deviation was set to 0.15. The level of $\alpha$ ranged from -0.4 to 0.4. For each combination of values, 1000 trials were performed in groups of 100.

Measurement error was modeled using the simple model from Chapter 2 with one slight modification:

$$\tilde{N}_{t,i} = N_{t,i}\,\exp(e_{t,i})$$

where:
$\tilde{N}_{t,i}$ = observed population at time t in subdivision i
$N_{t,i}$ = actual population at time t in subdivision i, calculated as in Chapter 2
$e_{t,i}$ = random normal deviate for time t for subdivision i

For tests of measurement error effects, the individual standard deviation was set to 0.10 and the common standard deviation was set to 0.15. The value for the standard deviation of the measurement error was then varied from 0.00 to 0.45. For each standard deviation value, 1000 simulations were run.

Simulating the presence of extremely bad years required two steps each simulated year. First, a random number was drawn from a uniform distribution in order to determine whether an extremely bad year had occurred. If the random number was less than or equal to the probability of a bad year occurring, a bad year occurred and the noise term for that year was set to a predetermined low level. If the random number was greater than the probability of a bad year occurring, a bad year did not occur, and the noise term for that year was drawn from a normal distribution as in the other simulations. All bad years received a value of -0.30 for their noise term. This procedure took place for both common noise terms and individual noise terms.

The probability of a bad year was varied from 0.00 to 0.30, while holding the individual standard deviation at 0.15 and the common standard deviation at 0.10. Five hundred simulations were run for each value of the probability of a bad year.

Simulation Results

Autocorrelation and measurement error both decreased the probability of correctly identifying similar subdivisions. Autocorrelation had a slight effect while measurement error had more drastic effects. The presence of extremely bad years increased the probability of correctly identifying similar subdivisions.

Autocorrelation, whether positive or negative, lowered the chances of recognizing a correct pairing (Figure 3.1).
Each point on the plot represents the mean autocorrelation value of 100 runs on the x axis versus the probability of correct pairings on the y axis. With an autocorrelation value of 0.0 the probability of a correct pairing was 0.83. With an autocorrelation of 0.25, the probability dropped to 0.76. The sign of the autocorrelation did not affect the results. This pattern held for other combinations of individual and common variation as well.

Figure 3.1 Effect of autocorrelation on the probability of correct pairings (x axis: mean autocorrelation value; y axis: probability of correct pairings)

Measurement error greatly decreased the probability of correctly identifying pairs (Figure 3.2). Without measurement error, the probability of a correct pairing was 0.82. With a standard deviation of measurement error of only 0.10, the probability of correct pairing dropped to 0.53. The probability continued to drop off quickly before it leveled off at approximately 0.20. This pattern also held for various values of common and individual variation.

Figure 3.2 Effect of measurement error on the probability of correct pairings

The addition of extremely bad years to the model improved the probability of correctly identifying pairs (Figure 3.3). As the probability of an extremely bad year rose from 0.0 to 0.1, the probability of a correct grouping rose from 0.5 to 0.6. This pattern occurred for a variety of values of common and individual variation.
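The three modifications tested in these simulations can be sketched as small noise generators. This is a minimal illustration under stated assumptions: the function names are hypothetical, the normal deviates in the autocorrelated series are assumed to have the same standard deviation as the first-year noise, and a strict inequality is used for the bad-year draw so that a probability of exactly 0 never triggers a bad year.

```python
import math
import random


def ar1_series(alpha, sd, years, rng):
    """Autocorrelated noise: x[t] = alpha * x[t-1] + sqrt(1 - alpha^2) * z[t],
    where z[t] is a normal deviate with standard deviation sd."""
    eps = math.sqrt(1.0 - alpha ** 2)
    x = [rng.gauss(0.0, sd)]
    for _ in range(years - 1):
        x.append(alpha * x[-1] + eps * rng.gauss(0.0, sd))
    return x


def observe(true_pop, sd_meas, rng):
    """Observed population = actual population * exp(measurement deviate)."""
    return [n * math.exp(rng.gauss(0.0, sd_meas)) for n in true_pop]


def bad_year_noise(p_bad, sd, rng, bad_value=-0.30):
    """One year's noise term: a fixed low value in a bad year, otherwise a
    normal deviate as in the other simulations."""
    if rng.random() < p_bad:
        return bad_value
    return rng.gauss(0.0, sd)


rng = random.Random(0)
individual = ar1_series(alpha=0.25, sd=0.10, years=23, rng=rng)
common = ar1_series(alpha=0.25, sd=0.15, years=23, rng=rng)
```

The scaling by the square root of (1 - alpha squared) keeps the long-run variance of the series equal to that of the uncorrelated case, so changing the autocorrelation does not also change the overall amount of noise.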
Discussion of Simulation Results

The results of the simulations in this chapter provide additional ways for managers to gauge the reliability of estimating similarity of subdivisions. Chapter 2 concluded that subdivisions sharing little common variation would not provide reliable similarity information. This chapter adds that populations that have large amounts of autocorrelation in their noise terms, or even modest amounts of measurement error, would not provide reliable results either. On the other hand, the use of correlations to identify similar subdivisions is encouraged for populations where extremely bad years affect groups of subdivisions.

Figure 3.3 Effect of extremely bad years on the probability of correct pairings (x axis: probability of bad year; y axis: probability of correct pairings)

The presence of autocorrelation in noise terms violates fundamental assumptions implied for the standard correlation coefficient to be a valid measure of similarity. For this reason the correlation coefficient does not perform as well in the presence of autocorrelation. Subdivisions in the real world will not all have the same level of autocorrelation present in their residuals. The distribution and average level of autocorrelation can be used as an indicator of the overall effect on reliability of the analysis. From the simulations it appears that small levels of autocorrelation have little effect on reliability.

Measurement error acts to mask the true population size. This additional component of variation, by definition, interacts with the true variation present in the data. With even modest amounts of measurement error, common noise becomes undetectable due to the additional, individual, multiplicative term.
Populations with inaccurate censuses would be poor candidates for the correct identification of similar subdivisions.

The addition of extremely bad years to subdivision pairs increases the amount of common variation present. If bad years occur in individual subdivisions only, individual variation increases and the probability of correct pairings decreases. If managers discover extremely bad years that are common to groups of subdivisions, the prospects of correctly identifying similar subdivisions are improved.

CASE STUDY

The reindeer data set was examined in regard to each of the three factors discussed above.

Autocorrelation coefficients were calculated using the residuals of the 56 herds from Chapter 2. Autocorrelation for each of the 56 time series of residuals was estimated by calculating the correlation coefficient between the residual values at each time step (time = t) and values one time step ahead (time = t+1). The mean value was -0.21 and the distribution is shown in Figure 3.4. On average this is a small amount of autocorrelation and should have little effect.

Measurement error is assumed to be virtually non-existent in this data set. Reindeer managers are able to count all animals by herding them into counting corrals (Anon. 1987).

In order to determine the distribution of the residuals across herds, the residuals calculated in Chapter 2 were first standardized. The standard deviation of the residuals of each herd was calculated, then divided into each residual in that herd. The distribution of standardized residuals was then compared to a normal distribution (Figure 3.5) for all residuals aggregated across herds.
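The two calculations just described, lag-one autocorrelation of a herd's residuals and standardization of residuals by the herd's standard deviation, can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation (divisor n - 1) is an assumption, since the text does not state which form was used.

```python
import math


def lag1_autocorrelation(res):
    """Correlation between residuals at time t and at time t + 1."""
    x, y = res[:-1], res[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(res):
    """Divide each residual by the standard deviation of the herd's residuals
    (sample standard deviation, an assumed choice)."""
    n = len(res)
    mean = sum(res) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in res) / (n - 1))
    return [r / sd for r in res]


# Hypothetical residual series that alternates in sign every year.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1 = lag1_autocorrelation(alternating)
z = standardize([2.0, -2.0, 2.0, -2.0])
```

A perfectly alternating series gives a lag-one autocorrelation of -1, the extreme of the negative values seen on average in the reindeer residuals; applying both functions herd by herd and pooling the standardized values gives the aggregate distribution compared with the normal curve.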
The residuals did not fit a normal distribution using a Chi Square test ($\chi^2$ = 101.4, p < 0.01, df = 27). The distribution appeared to be skewed. The points were not symmetrical about the mean of 0 ($\chi^2$ = 7.48, p < 0.01, df = 1).

Figure 3.4 Distribution of autocorrelation in reindeer herd residuals (x axis: autocorrelation value; y axis: number of herds in range; a label of 0 means -0.1 < X < 0.0)

Figure 3.5 Comparison of reindeer standardized residuals to a normal distribution (x axis: standardized residual value; y axis: number in range; series: actual data, normal values)

The 50 lowest points were then removed, and the distribution was restandardized and compared to a normal distribution using a Chi Square test. This time the distribution fit better than before (Figure 3.6) ($\chi^2$ = 45.95, p < 0.05, df = 27). With the removal of the 50 lowest points, and the restandardization, the residuals were symmetrical about the mean of 0 ($\chi^2$ = 0.068, p < 0.75, df = 1). These results strongly suggest the presence of extremely bad years in the data.

Bad years affected regions of the population range (Table 3.1). Figure 3.7 illustrates regions displaying low residuals during three of the extremely bad years (from Table 3.1). Low residuals were evident during year 10 in the central and southern regions primarily, year 6 in the northwestern region, and year 14 in the northeastern region. Twenty of the fifty lowest residuals occurred during one extremely bad year in twenty different herds. Only 3 of the fifty points were associated with only one herd in a unique year.

The analysis of the case study data set looks very encouraging for the application of methods to be presented in Chapters 4 and 5.
Autocorrelation was low, measurement error was assumed to be virtually non-existent, and extremely bad years occurred in common among subdivisions. These results, together with the results of Chapter 2, suggest that the use of correlations of residuals to identify similar reindeer herds is appropriate as the basis for upcoming applications.

Figure 3.6 Comparison of reindeer trimmed, standardized residuals to a normal distribution

Table 3.1 List of reindeer herds and years that exhibit the 50 lowest standardized residual values

Year (19--)   Year   Herd #                                                                             Total # of herds
63            2      51, 52                                                                             2
64            3      8                                                                                  1
66            5      46, 48                                                                             2
67            6      3, 4, 5, 6, 7, 9                                                                   6
69            8      11, 12                                                                             2
71            10     18, 19, 22, 23, 24, 25, 26, 29, 34, 35, 36, 37, 39, 40, 41, 42, 45, 50, 54, 56     20

Years 14, 17, 19, and 21 (1975, 1978, 1980, and 1982), herd # in row order: 8, 45, 20, 13, 9, 27, 10, 28, 12, 30, 14, 31, 15, 32, 16, 23, 32

Figure 3.7 Map of reindeer herds with extremely low residual values during common years (3 years illustrated: year 6, year 10, year 14)

Chapter 4 will explore how subdivisions identified as being similar can aid the processes of experimental design and monitoring allocation.

CHAPTER 4

OPPORTUNITIES IN EXPERIMENTAL DESIGN AND MONITORING ALLOCATION

INTRODUCTION

In the preceding two chapters, methods were developed to identify subdivisions with similar responses to external factors and to estimate the reliability of these methods. The methods were judged to be applicable to the reindeer data. In addition, the reindeer data provided evidence of similar effects occurring within regions of the population range. The rest of this thesis will identify applications for this information. This chapter will examine methods to aid in designing management experiments and monitoring allocation.

In designing management experiments, managers must decide which subdivisions to use as controls and treatments. Managers in the past have selected experimental units based primarily on convenience and practicality.
Chapter one pointed out that until recently, there were no systematic approaches for coordinating information among subdivisions. This thesis introduces one possible criterion for selecting experimental units: select subdivisions which respond similarly to external factors. This would minimize the chance of differences not caused by the experimental policy occurring between subdivisions. This chapter will develop methods to rank subdivisions based on similarity of residuals, with top ranking subdivisions being obvious candidates for experimentation.

Monitoring allocation decisions involve many important tradeoffs. In many cases managers will need to monitor a subsample of subdivisions due to limited monitoring resources. In other cases managers may wish to monitor some subdivisions more heavily than others in a stratified sampling design to increase the amount of information obtained. Managers must decide which subdivisions to monitor more heavily, that is, which to use as the monitoring index units. Many possible criteria could be considered in making this decision. A systematic approach in the literature suggests selecting index units at random within strata (Tanner 1978). An alternative approach suggested in this thesis is to select, as index units, subdivisions that are highly similar to others in their response to external factors. This way managers could potentially spot extremely bad or good years in a large number of subdivisions for each one monitored. Methods will be developed to select subdivisions which should be used as index units.

Methods developed in this chapter will be illustrated with the reindeer data. In this case the data cluster nicely into a few good sets for experimentation and monitoring.
METHODS

Methods for Experimental Design

To select appropriate experimental units, groups of reindeer herds were ranked in terms of the similarity of their residuals from the simple population model of chapter 2. A list of the ten best experimental units (herds) was created for experiment sizes of 2, 3, 4, and 5 units. For an experiment size of two units, cross correlation coefficients for all possible pairings of herds were ranked.

Brute force ranking of units for small experiments

In comparing each possible group for a 3 unit experiment, three correlation coefficients were involved: the correlations between herds 1 and 2, 1 and 3, and 2 and 3. To rank all possible groups of three herds, the three correlations involved in each group had to be aggregated into one statistic. One way to do this aggregation would be to average the three correlations. Instead, a more conservative ranking of the possible groups was based on the minimum of the 3 correlations. With this method groups were ranked based on the lowest pairwise similarity measure found in the group. This ruled out the possibility that a high rank would be given to a group where two of the three correlations were very high while the third was very low. Groups of four and above were also ranked by minimum correlation value.

As the required experiment size increased, the number of possible groups to test also increased. For an experiment size of two, all possible groups involving 2 herds (1,540 groups) had to be tested. For an experiment size of three, all possible groups of 3 herds (27,720) had to be tested. For an experiment size of six, 32,468,436 possible groups had to be tested. If the number of potential subdivisions to test was decreased, testing and ranking would be possible for even larger experiments.
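The brute force ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used in the thesis: the function names (`pearson`, `rank_groups`, `n_groups`) and the dictionary layout of the residuals are hypothetical, and the similarity measure is taken to be the ordinary Pearson correlation between residual time series.

```python
from itertools import combinations
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length residual series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_groups(residuals, size, top=10):
    """Rank every group of `size` herds by its minimum pairwise correlation.

    residuals: dict mapping herd number -> list of residuals (assumed layout).
    Returns the `top` best (minimum correlation, group) tuples.
    """
    herds = sorted(residuals)
    corr = {(a, b): pearson(residuals[a], residuals[b])
            for a, b in combinations(herds, 2)}
    scored = []
    for group in combinations(herds, size):
        # Conservative score: the weakest pairwise link in the group.
        worst = min(corr[pair] for pair in combinations(group, 2))
        scored.append((worst, group))
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:top]


def n_groups(n_herds, size):
    """Number of groups that the brute force search has to test."""
    return comb(n_herds, size)
```

Calling `n_groups(56, 2)`, `n_groups(56, 3)`, and `n_groups(56, 6)` reproduces the group counts quoted in the text, which shows how quickly the search grows with experiment size.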
The amount of time necessary to find the most similar groups of 56 subdivisions was determined and compared to the time necessary to find the most similar groups of 20 subdivisions.

Hierarchical clustering to rank units for larger experiments

In order to decrease the number of groups to test, highly dissimilar groups of subdivisions could be ruled out before testing is begun. Subdivisions could be clustered based on similarity in residuals and then ranked within each cluster, thus reducing the number of potential groups to test.

Hierarchical cluster analysis identifies groups of subdivisions based on a numerical measure, such as correlation coefficients. Individuals are continually added to or subtracted from clusters, allowing groups to be identified both at the coarse grain and at the fine grain (Everitt 1980). Within hierarchical methods, two types of analyses exist: agglomerative clusterings, which "fuse" the individuals into groups in a series of progressive steps; and divisive clusterings, which partition the overall group into finer groups until only individuals remain.

Monte Carlo simulations were designed to test which method would work best at correctly grouping subdivisions known to be similar. The method that worked the best would be used to cluster the reindeer herds. The model used in Monte Carlo tests was similar to the model in Chapter 2, except each subdivision had three components of variation instead of two. Eight subdivisions were created, each having an individual component of variation. Within the eight, each group of four (subdivisions 1-4 and 5-8) had a common component of variation. Pairs within each group of four (subdivisions 1-2, 3-4, 5-6, and 7-8) each had a common component of variation as well.
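A minimal sketch of the three-component structure just described is given below. It assumes, for illustration only, that each year's residual is simply the sum of a quad effect, a pair effect, and an individual effect, each drawn from a zero-mean normal distribution; the full population model of Chapter 2 is not reproduced, and the function name `simulate_residuals` is hypothetical.

```python
import random


def simulate_residuals(n_years, sd_individual, sd_pair, sd_quad, seed=None):
    """Simulate residual series for 8 subdivisions with nested common effects.

    Assumption: residual = quad effect + pair effect + individual effect.
    Subdivisions 1-4 and 5-8 share a quad effect; 1-2, 3-4, 5-6, and 7-8
    share a pair effect.
    """
    rng = random.Random(seed)
    series = {s: [] for s in range(1, 9)}
    for _ in range(n_years):
        quad = [rng.gauss(0, sd_quad) for _ in range(2)]
        pair = [rng.gauss(0, sd_pair) for _ in range(4)]
        for s in range(1, 9):
            q = (s - 1) // 4  # index of the quad containing subdivision s
            p = (s - 1) // 2  # index of the pair containing subdivision s
            series[s].append(quad[q] + pair[p] + rng.gauss(0, sd_individual))
    return series
```

Correlations between the simulated series can then be passed to a clustering routine to count how often the designed pairs and quads are recovered.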
The goal of the model was to see if the clustering methods could correctly identify the pairs designed to be similar (1-2, 3-4, 5-6, 7-8) and the groups of four, or quads, designed to be similar (1-4, 5-8).

The agglomerative method used was the "nearest neighbor" or single link method. In the beginning, eight groups, each containing one subdivision, were used. The two most similar groups were joined. New similarity values between this new group and all other groups were calculated in order to simplify the similarity matrix (the highest values of similarity between members of the new group and all other groups were stored). Of the seven remaining groups, the two most similar were joined and the similarity matrix was again simplified. This process continued until only 1 group remained (containing all 8 subdivisions).

The divisive method began with one group which contained all 8 subdivisions. An average correlation was calculated for each subdivision within the group, using all correlations between that subdivision and the remaining seven. The subdivision with the lowest average correlation (the most dissimilar subdivision) became the first member of a new group. The new group gained additional members by adding subdivisions that were more similar to the new group than they were to the old group. When no additional subdivisions fit this criterion, the process was repeated on the two newly created sub-groups. This process continued until 8 groups existed, each containing 1 subdivision. Both the agglomerative and the divisive methods used are reviewed in Everitt (1980).

In each run of the Monte Carlo simulation, residuals and correlations were calculated between the eight subdivisions. The clustering proceeded using the matrix of correlation coefficients.
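The two clustering procedures can be sketched as follows. This is a simplified illustration, not the thesis program: `single_link` performs the full nearest-neighbor agglomeration, while `divisive_split` performs only one division of a group (it would be applied again to each sub-group to continue the process). The similarity dictionary keyed by `(a, b)` with `a < b`, the choice to move one subdivision at a time, and both function names are assumptions made for the example.

```python
def single_link(sim):
    """Agglomerative nearest-neighbor clustering.

    sim: dict {(a, b): similarity} with a < b (assumed layout).
    Returns the merges in order as (group, group, similarity) tuples.
    """
    items = sorted({x for pair in sim for x in pair})
    groups = [frozenset([i]) for i in items]

    def link(g, h):
        # Similarity between groups = highest similarity between members.
        return max(sim[(min(a, b), max(a, b))] for a in g for b in h)

    merges = []
    while len(groups) > 1:
        score, g, h = max(
            ((link(g, h), g, h)
             for i, g in enumerate(groups) for h in groups[i + 1:]),
            key=lambda t: t[0],
        )
        groups = [x for x in groups if x not in (g, h)] + [g | h]
        merges.append((sorted(g), sorted(h), score))
    return merges


def divisive_split(group, sim):
    """Split one group (two or more members) into a new group and an old group."""
    def s(a, b):
        return sim[(min(a, b), max(a, b))]

    def mean_sim(x, others):
        return sum(s(x, o) for o in others) / len(others)

    old = list(group)
    # The most dissimilar member starts the new group.
    seed = min(old, key=lambda x: mean_sim(x, [o for o in old if o != x]))
    old.remove(seed)
    new = [seed]
    while len(old) > 1:
        gains = {x: mean_sim(x, new) - mean_sim(x, [o for o in old if o != x])
                 for x in old}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # nobody is closer to the new group than to the old one
        old.remove(best)
        new.append(best)
    return sorted(new), sorted(old)
```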
The program kept track of how many pairs and quads were correctly identified in the process. The simulations were run for a number of combinations of standard deviations. For each combination of the three standard deviations, the model was run 100 times.

The reindeer subdivisions were clustered using the more dependable of the two methods based on the simulations. Possible experimental units were tested and ranked within several clusters, greatly reducing the number of possible groups to test. Possible groups within clusters were ranked using the minimum correlation method described above. The top groups within each cluster were combined and ranked in order to identify the best potential experimental units among all clusters. The results using cluster analysis were compared to the results produced earlier without the aid of cluster analysis.

In order to see if the similarity relationships changed over time, cluster analysis was performed on each half of the time series. The ten most similar pairs of herds for each half of the time series were compared.

Methods for Monitoring Allocation

To identify possible index units, cross correlation coefficients among reindeer herds were systematically compared to a pre-set level. A list was made showing all herds with a pairwise cross correlation value greater than a pre-set level, together with the herds with which this value was shared. This was done for correlation values of 0.65, 0.75, and 0.85. From these lists it was easy to pick out the herds that could be index units in order to represent the maximum number of other herds.

RESULTS

Experimental Design

Lists of the groups containing the most similar reindeer herds were produced as candidates for management experiments. Divisive clustering limited the number of potential groupings to test, cut down the time required substantially, and identified all highly similar groups.
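The listing step for index units can be sketched as below. The thesis picked index units from the lists by inspection; the greedy rule in `greedy_index_units` is an assumption added here purely for illustration, as are the function names and the `(a, b)`-keyed correlation dictionary. The threshold is applied as "greater than or equal to" the pre-set level.

```python
def partners_above(corr, level):
    """List, for each herd, the herds it is correlated with at or above `level`.

    corr: dict {(a, b): correlation} for herd pairs (assumed layout).
    """
    out = {}
    for (a, b), r in corr.items():
        if r >= level:
            out.setdefault(a, set()).add(b)
            out.setdefault(b, set()).add(a)
    return {h: sorted(p) for h, p in out.items()}


def greedy_index_units(partners):
    """Greedy choice (an assumption, not the thesis procedure): repeatedly pick
    the herd that represents the most herds not yet covered."""
    uncovered = set(partners)
    chosen = {}
    while uncovered:
        best = max(sorted(uncovered),
                   key=lambda h: len(uncovered & set(partners[h])))
        represented = uncovered & set(partners[best])
        chosen[best] = sorted(represented)
        uncovered -= represented | {best}
    return chosen
```

Lowering `level` lengthens the partner lists, so fewer index units are needed per herd represented, at the cost of weaker similarity between an index unit and the herds it stands for.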
In general, similar herds were found to be spatially adjacent. Both the clusters of herds and the groupings of highly similar herds changed over time.

Results of the analysis (using all possible groups of the 56 reindeer herds) are presented in Table 4.1. Five of the top ten candidates for two unit experiments are shown in Figure 4.1. Most of the groups are made up of herds located close to one another. The 10 most similar groupings of 4 subdivisions are made up of various combinations of a small number of subdivisions that are highly similar.

Table 4.1 Ten closest groupings of reindeer subdivisions based on residual similarity

Groups of 2
Rank  Herd 1  Herd 2  Correlation
 1      37      42      0.91
 2      24      26      0.88
 3      37      39      0.87
 4      18      27      0.87
 5       3       4      0.85
 6      36      42      0.85
 7      37      50      0.84
 8      24      41      0.84
 9      27      32      0.84
10      25      26      0.83

Groups of 3
Rank  Herd 1  Herd 2  Herd 3  Minimum Pairwise Correlation
 1      37      42      50      0.82
 2      37      39      42      0.82
 3      36      37      42      0.82
 4      18      27      32      0.80
 5      37      42      56      0.80
 6      26      39      42      0.80
 7      35      39      42      0.79
 8      35      37      42      0.78
 9      35      37      39      0.78
10      25      26      39      0.78

Groups of 4
Rank  Herd 1  Herd 2  Herd 3  Herd 4  Minimum Pairwise Correlation
 1      35      37      39      42      0.78
 2      26      37      39      42      0.77
 3      36      37      42      50      0.77
 4      26      37      42      50      0.76
 5      34      37      39      42      0.76
 6      26      34      37      42      0.75
 7      26      34      39      42      0.75
 8      26      34      37      39      0.75
 9      26      35      37      42      0.75
10      26      35      39      42      0.75

Figure 4.1 Five of the ten most similar pairs of subdivisions

It took considerably less time to check all possible groupings from a total of 20 subdivisions than from a total of 56 (Figure 4.2) using a Compaq XT computer.

Results of the Monte Carlo simulations of clustering performance showed that the divisive method worked better than the agglomerative method in all cases (Table 4.2). Both methods produced the same pattern of results. As the individual component of variation was increased, both pairs and quads became more difficult to identify.
As the pair component of variation increased, pairs were more easily identified and quads became more difficult to identify. As the quad component of variation increased, quads were more easily identified and so were pairs.

Figure 4.2 Amount of time required to find the 10 top candidates for experimental units (x axis: number in group; y axis: log of minutes needed to find best group; series: 56 subdivisions, 20 subdivisions; annotated values: 1130.0 minutes, 1.9 minutes)

Table 4.2 Percentage of simulations resulting in correct identification of quads and pairs using cluster analysis (rows: quads identified; columns: pairs identified)

Individual Standard Deviation = 0.15, Pair Common Standard Deviation = 0.05, Quad Common Standard Deviation = 0.05
          Agglomerative, pairs identified    Divisive, pairs identified
Quads       0    1    2    3    4              0    1    2    3    4
  0        48   36    3    1    0             38   33   14    2    0
  1         2    4    3    2    0              3    4    2    1    0
  2         1    0    0    0    0              1    0    1    1    0

Individual Standard Deviation = 0.05, Pair Common Standard Deviation = 0.15, Quad Common Standard Deviation = 0.05
          Agglomerative, pairs identified    Divisive, pairs identified
Quads       0    1    2    3    4              0    1    2    3    4
  0         0    0    0    0   49              0    0    0    0   47
  1         0    0    0    0   35              0    0    0    0   34
  2         0    0    0    0   16              0    0    0    0   19

Individual Standard Deviation = 0.05, Pair Common Standard Deviation = 0.15, Quad Common Standard Deviation = 0.10
          Agglomerative, pairs identified    Divisive, pairs identified
Quads       0    1    2    3    4              0    1    2    3    4
  0         0    0    0    0   25              0    0    0    0   26
  1         0    0    0    0   35              0    0    0    0   35
  2         0    0    0    0   40              0    0    0    0   39

Individual Standard Deviation = 0.05, Pair Common Standard Deviation = 0.05, Quad Common Standard Deviation = 0.15
          Agglomerative, pairs identified    Divisive, pairs identified
Quads       0    1    2    3    4              0    1    2    3    4
  0         0    0    0    0    0              0    0    0    0    0
  1         0    0    0    0    0              0    0    0    0    0
  2         0    2   13   25   60              0    0    6   17   75

Individual Standard Deviation = 0.05, Pair Common Standard Deviation = 0.10, Quad Common Standard Deviation = 0.15
          Agglomerative, pairs identified    Divisive, pairs identified
Quads       0    1    2    3    4              0    1    2    3    4
  0         0    0    0    0    0              0    0    0    0    0
  1         0    0    0    0    3              0    0    0    0    3
  2         0    0    0    1   96              0    0    0    0   97

Figure 4.3 shows the pattern of divisive clustering of the reindeer herds. The X axis lists all 56 herds by their numeric codes. The spikes between herds represent splits caused by the clustering procedure. The height of the spike represents the order of the split. The lowest spikes appear between the most closely related herds. The Y axis lists the number of the split. The highest spike occurs at split 1, the lowest occurs at split 55.

Figure 4.3 Results of divisive clustering of the reindeer herds (x axis: herd numbers, with Sets 1, 2, and 3 marked beneath the axis)

After the first 15 splits in the process, several large groups of herds remained (Figure 4.4). Figure 4.5 illustrates 5 of the major clusters that remained at the fifteenth split. The groups for the most part are each located within geographic regions. Of the five clusters illustrated in the figure, one cluster occurs primarily in the southeast, one occurs on the central western coast, one occurs in the north, and two occur in the center of the reindeer range.

Figure 4.4 Results of the first 15 splits of divisive clustering (x axis: herd numbers, with Sets 1, 2, and 3 marked beneath the axis)

Figure 4.5 Major clusters of herds remaining at the 15th split of divisive clustering

The reindeer herds were clustered in order to find groups of herds that have reacted similarly to external factors.
This information was used for selecting candidates for larger experiments and to find potential groupings in alternative regions. Three sets of herds were formed (Table 4.3) based on the clusters evident at the 15th split. The first set consisted of 20 herds that remained undivided by the 15th split. The second set consisted of a cluster of 9 herds as well as additional herds located in close geographic proximity, or in close proximity in terms of clustering, to the 9 herds. Most of these herds are in the central region of the reindeer range. The third set consisted of the remaining 18 herds, most of which occur in the northern portion of the range. The three sets are noted in Figures 4.3 and 4.4.

Table 4.3 Three sets of herds used for further ranking of experimental units

Set 1: 19, 22, 23, 24, 25, 26, 33, 34, 35, 36, 37, 39, 40, 41, 42, 46, 47, 50, 54, 56 (total # of herds: 20)
Set 2: 10, 11, 13, 15, 16, 18, 27, 28, 29, 30, 31, 32, 43, 45, 48, 49, 51, 53 (total # of herds: 18)
Set 3: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 14, 17, 20, 21, 38, 44, 52, 55 (total # of herds: 18)

The most similar groups of herds within the three sets were identified and compared to produce overall lists for experiment sizes of 2, 3, 4, 5, and 6 units (Table 4.4). The groups identified as most similar within the three sets matched the groups identified for the overall population for group sizes 2 and 3 (Table 4.1). The top groups within set 1, the southeastern set, had a higher level of similarity than the top groups from other sets. Figure 4.6 shows, for each of the three sets, the most similar herds for a 6 unit experiment.

Table 4.4 Top candidates for experimental units within sets

Groups of 2
Set   Herd 1  Herd 2  Min Corr
 1      37      42      0.91
 1      24      26      0.88
 1      37      39      0.87
 2      18      27      0.86
 2      27      32      0.84
 2      18      32      0.80
 3       3       4      0.85
 3      20      21      0.82
 3       4       5      0.72

Groups of 3
Set   Herd 1  Herd 2  Herd 3  Min Corr
 1      37      42      50      0.82
 1      37      39      42      0.82
 1      36      37      42      0.82
 2      18      27      32      0.80
 2      27      31      32      0.66
 2      27      29      32      0.66
 3       1       3       4      0.66
 3       1       4      14      0.56
 3       6      20      21      0.55

Groups of 4
Set   Herd 1  Herd 2  Herd 3  Herd 4  Min Corr
 1      35      37      39      42      0.78
 1      26      37      39      42      0.77
 1      36      37      42      50      0.77
 2      13      27      31      32      0.60
 2      27      29      31      32      0.60
 2      13      18      27      32      0.59
 3       1       3       4      14      0.53
 3       1       9      12      14      0.51
 3       9      14      20      21      0.48

Groups of 5
Set   Herd 1  Herd 2  Herd 3  Herd 4  Herd 5  Min Corr
 1      26      34      37      39      42      0.75
 1      26      35      37      39      42      0.75
 1      36      37      42      50      56      0.72
 2      18      27      29      31      32      0.58
 2      13      18      27      31      32      0.58
 2      15      18      27      29      32      0.53
 3       1       3       4       9      14      0.48
 3       3       4       9      14      20      0.45
 3       4       6       7       9      20      0.41

Groups of 6
Set   Herd 1  Herd 2  Herd 3  Herd 4  Herd 5  Herd 6  Min Corr
 1      26      34      35      37      39      42      0.70
 1      25      26      34      37      39      42      0.70
 1      23      24      25      26      42      50      0.69
 2      10      15      16      18      29      32      0.44
 2      27      28      29      30      31      32      0.38
 2      15      27      28      29      30      32      0.38
 3       1       3       4       9      14      20      0.38
 3       3       4       6       9      14      20      0.37
 3       4       6       7       9      14      20      0.37

Figure 4.6

Cluster analysis produced different patterns of similarity between the two halves of the reindeer time series (Table 4.5). Several small groupings were similar in the two halves, but these were the exception. Clusters formed from each half of the time series appeared over wider ranges of area compared to clusters found using the entire time series.

Table 4.5 Clusters of herds remaining at the 20th split for each half of the time series

Years 1-12
Cluster 1: 16, 18, 22, 23, 24, 25, 26, 27, 29, 32, 34, 35, 36, 37, 39, 40, 41, 42, 50, 56
Cluster 2: 1, 3, 4, 5, 9
Cluster 3: 12, 15
Cluster 4: 38, 48
Cluster 5: 19, 31, 53
Cluster 6: 14, 55
Cluster 7: 6, 13, 20, 21
Cluster 8: 33, 45, 46, 54
Cluster 9: 7, 17, 44

Years 13-23
Cluster 1: 13, 15, 18, 19, 27, 28, 30, 32, 34
Cluster 2: 1, 3, 4, 7, 14, 16
Cluster 3: 6, 55
Cluster 4: 35, 45
Cluster 5: 20, 21, 22, 47
Cluster 6: 36, 40
Cluster 7: 31, 39, 56
Cluster 8: 2, 8, 10, 12, 41, 46
Cluster 9: 17, 23, 24, 25, 26, 29, 33, 42, 54

Lists of the 10 most similar pairs varied greatly between the two halves of the time series (Table 4.6). Herds from the southeast region comprised the majority of the similar pairs calculated with the first half of the time series, while herds from the central region comprised the majority of the similar pairs calculated with the second half of the time series.

Table 4.6 Ten most similar pairs during each half of the time series

Years 1-12                     Years 13-23
Herd 1  Herd 2  Corr           Herd 1  Herd 2  Corr
   3       4    0.95             27      30    0.95
  37      39    0.94             18      30    0.94
  36      42    0.94             26      54    0.94
  37      50    0.92             23      24    0.93
  18      21    0.92             29      54    0.93
  40      42    0.91             28      34    0.93
  39      50    0.91             18      32    0.93
  24      41    0.91             18      27    0.92
  33      54    0.91             37      42    0.92
  39      40    0.90             30      32    0.92

Monitoring Allocation

As the level of correlation required of index units decreased, the number of potential index units increased and so did the number of herds which could potentially be represented (Table 4.7). Nine herds had at least one pairwise cross correlation coefficient greater than or equal to 0.85. There were several ways to select index units from this group of nine. If the goal was to maximize the number of herds being represented, four index units would be selected to represent the remaining 5 herds. Forty three herds were involved in at least one pairwise relationship at the level of 0.65. In this case 7 herds would represent 29 others. Figure 4.7 shows four of the potential index units at the level of 0.65 and the herds that they would be representing. Index units would primarily represent other herds located within the same geographic region.

Table 4.7 Potential index units

Correlation > 0.85
Index Unit Herd #   Herds Represented
   3                4
  18                27
  24                26
  37                39, 42

Correlation > 0.75
Index Unit Herd #   Herds Represented
   3                4
  20                21
  26                23, 24, 25, 34, 35
  32                15, 18, 27
  34                19, 23

Correlation > 0.65
Index Unit Herd #   Herds Represented
   4                1, 3, 5
   9                17
  10                16
  20                21
  22                47
  32                15, 18, 27, 29, 30, 31
  42                19, 23, 24, 25, 26, 34, 35, 36, 37, 39, 40, 41, 50, 51, 54, 56

DISCUSSION

Reindeer herds that showed the highest similarity were ones that were influenced by the presence of extremely bad years. The changing pattern of similarity over time corresponds to the changing location of bad years over time.
The first half of the time series produced similarity among subdivisions in the southeastern portion of the reindeer range. From chapter 3, Table 3.1, an extremely bad year affected many of these units in year 10. The second half of the time series produced similarity among subdivisions in the central portion of the range, where bad years occurred in years 14 and 21. These results are not surprising, considering that the correlation coefficient used to estimate similarity among time series of residuals is highly sensitive to outliers, as shown in the simulations in chapter 3.

Managers wishing to select similar subdivisions for experimentation or monitoring will in most cases want to use subdivisions that have faced common extreme years in the past and would likely face common extreme years in the future. Other managers may not be as interested in extremes and may be more interested in year to year variation. To find similar subdivisions with less emphasis on extremes, correlations could be calculated between square roots or logs of the time series of residuals.

Figure 4.7 Potential index herds at the level of 0.65 shown with the herds that they represent. (For each shading pattern, index herd shown with shading only in center of area.)

Environmental and management actions are effective at different spatial scales over time. Given enough time, the most similar subdivisions will likely appear next to or near each other. This was true of the reindeer herds when the entire time series was analyzed. Smaller time series will contain the effects of fewer environmental and management actions and may not show the same degree of regionalization of similarity.

Many of the most important groupings appeared during halves of the time series, but others were not evident. This is consistent with the results from chapter 2.
The first half of the time series showed a different pattern of similarity than the second half, but both halves were important to the overall picture. In using clustering methods, managers should presumably use the longest available time series. However, portions of the time series that contain external effects that are known to no longer be applicable should be excluded, unless these factors can be built into the model. For example, if reindeer herds in the southern region began to be supplementally fed halfway through the time series, either this information should be built into the model or only the second half of the time series should be used.

Before managers can select index units, they must decide on a level of similarity needed between the index units and the subdivisions being represented. There is an important tradeoff involved in doing this. The higher the level of similarity desired, the fewer subdivisions will be available as index units and the fewer subdivisions can be represented. Managers will have to decide whether they want index units representing a small number of closely related subdivisions or whether they want index units representing a large number of subdivisions that are not as closely related.

CHAPTER 5

IMPROVING PARAMETER ESTIMATION OF SUBDIVIDED POPULATIONS

INTRODUCTION

The preceding chapter illustrated potential improvements in monitoring allocation and experimental design by collecting data within subdivisions. Walters (1986) was the first to introduce subdivided population parameter estimation procedures. The theory behind a number of parameter estimation procedures is laid out in Walters' book without examples. This chapter will provide examples of the use of two of the procedures with the reindeer data. In the first procedure, parameters assumed to belong to a known distribution can be better estimated using a Bayesian approach.
In the second procedure, the accuracy of parameter estimates can be improved for subdivisions that have shown similar responses to external factors. This chapter will explore the use of these two methods on the Finnish reindeer data.

BAYESIAN APPROACH

In many cases parameter values may be assumed to have been "drawn by nature" from a known distribution. In the simplest case, parameters from population subdivisions may be assumed to have been drawn from a single normal distribution. This distribution would have one overall population mean and one overall variance. This distribution can be used to fine tune parameter estimates from each subdivision, particularly poorly estimated parameters.

Bayesian Methods

The methods for this approach are described on pages 299-302 in Walters (1986). If we assume that the actual parameter values for the subdivisions, $\beta_i$, were "drawn by nature" from a single normal distribution with mean $\bar\beta$ and variance $\sigma^2_\beta$, Bayes theorem can be used to find the most probable value $\tilde\beta_i$ for each replicate $\beta_i$, given its estimate $\hat\beta_i$, with the result being:

$$\tilde\beta_i = W_i \hat\beta_i + (1 - W_i)\,\hat{\bar\beta}$$

where $W_i$ is a weighting associated with each subdivision, based on the variance $\sigma^2_i$ of its estimate:

$$W_i = \frac{\sigma^2_\beta}{\sigma^2_\beta + \sigma^2_i}$$

and $\hat{\bar\beta}$ is a weighted estimate of the mean $\bar\beta$ around which nature's "sample" $\beta_i$ were drawn:

$$\hat{\bar\beta} = \frac{\sum_i W_i \hat\beta_i}{\sum_i W_i}$$

The overall population variance can be calculated through an iterative procedure. A trial estimate of $\sigma^2_\beta$ can be used to calculate values of $W_i$ and $\hat{\bar\beta}$. A new value of $\sigma^2_\beta$ can then be calculated from these values over the $N$ subdivisions, following Walters (1986). This process is repeated until the successive estimates stop changing or approach zero.

With this method, parameter estimates are adjusted using the assumed natural distribution. Parameter estimates with large variances are adjusted towards the population average. Parameter estimates with small variances change little if at all.
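The adjustment step can be sketched in Python as follows. The weights and weighted mean in `posterior` follow the standard result for a normal population distribution combined with normally distributed estimation error. The variance update in `fit_pop_var` is a simple moment-style rule adopted here only as an assumption so that the iteration can run; it should not be read as the update formula in Walters (1986). Both function names are hypothetical.

```python
def posterior(estimates, variances, pop_var):
    """Shrink each estimate toward a weighted mean.

    estimates: parameter estimates, one per subdivision
    variances: sampling variance of each estimate
    pop_var:   assumed variance of the population distribution (> 0)
    """
    w = [pop_var / (pop_var + v) for v in variances]
    mean = sum(wi * b for wi, b in zip(w, estimates)) / sum(w)
    adjusted = [wi * b + (1 - wi) * mean for wi, b in zip(w, estimates)]
    return adjusted, mean


def fit_pop_var(estimates, variances, n_iter=100, tol=1e-12):
    """Iterate a trial population variance until it stops changing.

    Assumption: the update subtracts each sampling variance from the squared
    deviation about the weighted mean and averages over subdivisions.
    """
    n = len(estimates)
    centre = sum(estimates) / n
    pop_var = max(sum((b - centre) ** 2 for b in estimates) / n, 1e-12)
    for _ in range(n_iter):
        _, mean = posterior(estimates, variances, pop_var)
        new = sum((b - mean) ** 2 - v for b, v in zip(estimates, variances)) / n
        new = max(new, 1e-12)  # keep the trial variance positive
        if abs(new - pop_var) < tol:
            return new
        pop_var = new
    return pop_var
```

With a population variance that is large relative to the sampling variances, every weight is close to 1 and the adjusted estimates barely move; a poorly estimated parameter (large sampling variance) gets a small weight and is pulled toward the mean.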
If the population variance is large relative to the subdivision variances, a large degree of parameter variation over the population range exists. If the population variance is small relative to the subdivision variances, there is a large degree of uncertainty involved in the parameter estimation of each subdivision. This is important information for managers to have with regard to experimental design and monitoring allocation.

To provide an example of this method, the mean growth rate calculated earlier for the reindeer herds was assumed to be normally distributed. Using the growth rate estimates and their variances calculated earlier, $\bar\beta$ and $\sigma^2_\beta$ were estimated.

The Ricker model (Ricker 1954, Ricker 1975) was also fit to the reindeer data in order to add a density dependence parameter:

$$N_{t+1} = S_t \exp(a - b S_t)$$

where
$N_{t+1}$ = population at time t+1
$S_t$ = population after harvest at time t
a, b = parameters to be estimated

Although the carrying capacity parameter that represents the density dependence ("b") would not in theory be identically distributed among all subdivisions, the intrinsic rate of increase parameter ("a") could be. The Ricker "a" parameter estimates were used to estimate $\bar\beta$ and $\sigma^2_\beta$, and to recalculate values of $\beta_i$.

Results of Bayesian Approach

The Bayesian process changed the values estimated for the Ricker "a" parameter slightly but had little effect on the mean growth rate term. Original estimates of the log of the mean annual growth rate (from Chapter 2) produced a value for $\bar\beta$ equal to 0.28, corresponding to an annual growth rate of 1.33, and a value for $\sigma^2_\beta$ equal to 2.24E-03. Estimates of the annual growth rate from the subdivisions ranged from 1.20 to 1.53 (Table 5.1). Figure 5.1 shows the distribution of the annual growth rate with the exception of the one outlier value of 1.53.
The value of $\sigma^2_\beta$ was considerably larger than the estimates of $\sigma^2_i$ for each subdivision. For this reason the updated parameter estimates, $\tilde\beta_i$, were identical to the original values to the fourth decimal point.

Table 5.1 Mean growth rate values by herd, where $\hat\beta_i$ was estimated as the log of the mean annual growth rate

Herd  $\hat\beta_i$  $\sigma^2_i$  exp($\hat\beta_i$)
 1    0.31   1.21E-05   1.36
 2    0.28   3.78E-05   1.32
 3    0.28   2.28E-05   1.32
 4    0.23   1.02E-05   1.26
 5    0.19   1.60E-05   1.21
 6    0.30   1.82E-05   1.36
 7    0.27   1.90E-04   1.30
 8    0.25   1.37E-05   1.28
 9    0.28   2.53E-05   1.32
10    0.25   3.46E-06   1.28
11    0.20   1.12E-06   1.22
12    0.19   7.34E-06   1.21
13    0.27   4.45E-06   1.31
14    0.28   4.08E-06   1.31
15    0.26   6.03E-07   1.29
16    0.31   8.52E-06   1.37
17    0.31   1.65E-05   1.36
18    0.26   4.20E-06   1.30
19    0.32   5.10E-06   1.37
20    0.26   4.12E-05   1.29
21    0.24   1.10E-05   1.27
22    0.21   1.15E-05   1.24
23    0.24   9.61E-06   1.27
24    0.25   1.31E-05   1.28
25    0.27   7.51E-06   1.31
26    0.21   5.95E-06   1.23
27    0.26   3.76E-06   1.30
28    0.25   1.10E-06   1.29
29    0.25   2.52E-06   1.28
30    0.23   9.94E-07   1.26
31    0.25   4.36E-06   1.28
32    0.24   5.06E-06   1.27
33    0.26   3.31E-06   1.30
34    0.23   1.93E-06   1.26
35    0.22   6.60E-06   1.25
36    0.23   5.95E-06   1.26
37    0.27   7.07E-06   1.31
38    0.33   1.41E-06   1.39
39    0.30   2.65E-06   1.34
40    0.29   2.07E-06   1.34
41    0.19   9.33E-07   1.21
42    0.22   1.71E-06   1.25
43    0.27   7.85E-07   1.31
44    0.27   8.64E-06   1.31
45    0.23   3.81E-07   1.25
46    0.34   1.27E-06   1.40
47    0.36   2.04E-06   1.43
48    0.35   1.61E-06   1.43
49    0.33   7.74E-07   1.39
50    0.31   1.30E-06   1.37
51    0.36   4.42E-07   1.43
52    0.32   1.88E-06   1.38
53    0.28   5.89E-07   1.32
54    0.30   2.16E-06   1.35
55    0.43   1.02E-05   1.53
56    0.30   9.24E-08   1.34

Figure 5.1 Distribution of the mean growth rate estimated for the reindeer herds (x axis: mean growth rate value; y axis: number of herds in range)

The Ricker "a" parameter estimates produced a value for $\bar\beta$ equal to 0.66 and a value for $\sigma^2_\beta$ equal to 7.98E-02. These values generated estimates of $\beta_i$ only slightly different than those originally estimated
(Table 5.2). Parameter estimates with large variances were adjusted in the direction of the population average. Parameter estimates with small variances remained unchanged. Parameter values ranged from 0.10 to 1.36. Many of these values are unrealistic for reindeer populations. For example, a Ricker "a" value of 1.36 corresponds to a maximum annual growth rate of 3.90, which is impossible for reindeer.

Table 5.2 Ricker "a" values as estimated before and after Bayesian process

Herd   $\hat{\theta}_i$   $\sigma^2_i$   new $\hat{\theta}_i$
  1    0.85    8.94E-04    0.85
  2    1.25    2.34E-03    1.23
  3    0.93    1.86E-03    0.93
  4    0.77    8.76E-04    0.77
  5    0.56    2.56E-04    0.56
  6    0.67    5.57E-04    0.67
  7    0.63    4.54E-03    0.82
  8    0.54    1.10E-03    0.54
  9    0.59    5.71E-04    0.59
 10    0.70    1.23E-03    0.70
 11    0.28    8.37E-05    0.28
 12    0.44    5.52E-04    0.44
 13    0.63    1.17E-03    0.63
 14    0.96    1.60E-03    0.95
 15    0.45    8.88E-04    0.46
 16    0.69    4.70E-04    0.69
 17    0.82    1.47E-03    0.82
 18    1.00    2.71E-03    0.99
 19    0.89    1.32E-03    0.88
 20    1.38    2.20E-03    1.36
 21    0.80    1.79E-03    0.80
 22    0.55    7.61E-04    0.55
 23    0.99    1.39E-03    0.99
 24    1.18    2.90E-03    1.16
 25    0.86    4.18E-03    0.85
 26    0.77    3.23E-03    0.77
 27    0.80    4.03E-03    0.79
 28    0.56    1.10E-03    0.56
 29    1.06    6.45E-04    1.06
 30    0.68    9.67E-04    0.68
 31    0.99    1.49E-03    0.98
 32    0.68    2.36E-03    0.68
 33    0.86    1.44E-03    0.85
 34    0.67    3.25E-03    0.67
 35    0.51    2.35E-03    0.51
 36    0.76    1.23E-02    0.74
 37    0.43    2.61E-03    0.44
 38    0.53    9.74E-05    0.53
 39    0.43    3.16E-04    0.43
 40    0.56    2.28E-04    0.56
 41    0.60    2.63E-03    0.60
 42    0.26    2.19E-03    0.27
 43    0.24    8.96E-05    0.24
 44    0.47    1.61E-04    0.47
 45    0.29    4.39E-05    0.29
 46    0.26    1.79E-04    0.26
 47    0.77    3.52E-03    0.76
 48    0.30    1.68E-03    0.31
 49    0.44    1.78E-05    0.44
 50    0.08    1.64E-03    0.10
 51    0.14    8.79E-05    0.14
 52    0.29    5.01E-05    0.29
 53    0.21    1.14E-04    0.21
 54    0.44    9.67E-04    0.44
 55    0.51    5.70E-05    0.51
 56    0.27    1.12E-07    0.27

ESTIMATION OF COMMON EXTERNAL EFFECTS

Another possible parameter estimation method involves estimating common external effects over time between subdivisions. This is described on pages 303-306 in Walters (1986) for use with the Ricker model.

The easiest way to fit a Ricker model to a number (N) of subdivisions would be to fit the model individually to each subdivision. This would produce N values of the Ricker "a" parameter and N values of the Ricker "b" parameter, for a total of 2 × N estimated parameters. A joint estimation procedure could attempt to estimate the parameters for all subdivisions together.

In Walters' procedure, all subdivisions are assumed to share common external effects (noise) at each time step. The amount of common noise shared by all subdivisions in each year is estimated as an additional parameter. When common noise is large, parameter estimation should be more accurate. If data were available over a time series for T+1 years, a total of 2 × N + T parameters would be estimated using this procedure. The increased number of parameters to be estimated causes a deterioration in estimation performance, which will make the method less accurate than individual area fitting unless the shared effects are large.

With individual parameter estimation, the following equation is fit to population data from each subdivision:

$$N_{i,t} = S_{i,t-1} \exp\left(a_i + b_i S_{i,t-1} + w_{i,t}\right)$$

where
$N_{i,t}$ = the population in subdivision i at time t
$S_{i,t-1}$ = the population in subdivision i at time t-1 after the harvest
$a_i$ = Ricker a parameter for subdivision i
$b_i$ = Ricker b parameter for subdivision i
$w_{i,t}$ = normally distributed process error

The joint estimation procedure assumes that the noise term consists of two components:

$$w_{i,t} = w_t + w'_{i,t}$$

where
$w_{i,t}$ = noise in subdivision i at time t
$w_t$ = common effect shared by all replicates during year t
$w'_{i,t}$ = an independent effect due to conditions encountered in subdivision i

In order to calculate parameter values the following matrix equation is used:

$$\begin{bmatrix} A & B \\ B' & NI \end{bmatrix} \begin{bmatrix} b \\ w \end{bmatrix} = \begin{bmatrix} d \\ W \end{bmatrix}$$

where
b = the b parameter estimates (1 for each subdivision)
w = the mean noise estimates (1 for each year: 1 to T)
A = a diagonal matrix having the elements $A_{ii} = \sum_{t=1}^{T} (S_{i,t-1} - \bar{S}_i)^2$
B = a matrix having the elements $B_{it} = (S_{i,t-1} - \bar{S}_i)$
NI = the T x T identity matrix multiplied by the number of subdivisions
d = an array having the elements $d_i = \sum_{t=1}^{T} (S_{i,t-1} - \bar{S}_i)(Y_{i,t} - \bar{Y}_i)$
W = an array having the elements $W_t = \sum_{i=1}^{N} (Y_{i,t} - \bar{Y}_i)$

where
$\bar{S}_i$ = the arithmetic mean of $S_{i,t-1}$ over time for subdivision i
$Y_{i,t} = \ln(N_{i,t}/S_{i,t-1})$
$\bar{Y}_i$ = the mean of $Y_{i,t}$ over time for subdivision i

The estimates of $a_i$ are found by:

$$\hat{a}_i = \bar{Y}_i - \hat{b}_i \bar{S}_i$$

The key parameters for setting optimum harvest population levels for the subdivisions are the $b_i$ (Walters 1986). The variances for these parameters are calculated in the following way:

$$\mathrm{Var}(\hat{b}_i) = s^2 \left[ \left( A - \tfrac{1}{N} BB' \right)^{-1} \right]_{ii}$$

where A and B are matrices as defined above,

$$(BB')_{ij} = \sum_{t=1}^{T} (S_{i,t-1} - \bar{S}_i)(S_{j,t-1} - \bar{S}_j)$$

$$s^2 = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} \left( Y_{i,t} - \hat{a}_i - \hat{b}_i S_{i,t-1} - \hat{w}_t \right)^2}{NT - 2N - T + 1}$$

Additional information on this procedure is provided in Walters (1986).

The individual and joint procedures were compared using Monte Carlo simulations. Parameters were estimated using the reindeer data with both the individual and joint procedures.

Simulations

Simulation Methods

Data were generated for a series of subdivisions using the Ricker model with known parameters and with common and individual noise added as in Chapter 2. The data were then fit to a Ricker curve both individually and with the joint estimation procedure. The two procedures were compared as the amount of common noise increased and as the number of subdivisions increased.

Simulations were run for two combinations of noise: common standard deviation (s.d.) = 0.25 with individual s.d. = 0.05; and common s.d. = 0.05 with individual s.d. = 0.25. Simulations were run for 4 subdivisions and for 8 subdivisions. For each combination of noise components and number of subdivisions created, 100 simulations were run.

At the end of each simulation, a number of conditions were compared involving the "b" parameter.
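The joint estimation equations given earlier in this section can be sketched in code. The following is a minimal illustration, assuming NumPy is available; the function name `fit_joint_ricker`, the array layout, and the simulated values (stock sizes in arbitrary scaled units) are hypothetical choices made for this sketch and are not taken from Walters (1986) or from the program used in this thesis.

```python
import numpy as np

def fit_joint_ricker(S, Y):
    """Joint least-squares fit of Y[i,t] = a[i] + b[i]*S[i,t] + w[t] + error.

    S, Y are (N subdivisions) x (T years) arrays, where S[i, t] is the
    post-harvest stock that produced Y[i, t] = ln(N_t / S_{t-1}).
    """
    N, T = S.shape
    Sc = S - S.mean(axis=1, keepdims=True)   # matrix B: deviations from each mean
    Yc = Y - Y.mean(axis=1, keepdims=True)
    A = np.diag((Sc ** 2).sum(axis=1))       # diagonal matrix A
    d = (Sc * Yc).sum(axis=1)                # array d, one element per subdivision
    W = Yc.sum(axis=0)                       # array W, one element per year
    M = np.block([[A, Sc], [Sc.T, N * np.eye(T)]])
    sol = np.linalg.solve(M, np.concatenate([d, W]))
    b, w = sol[:N], sol[N:]
    a = Y.mean(axis=1) - b * S.mean(axis=1)
    resid = Y - a[:, None] - b[:, None] * S - w[None, :]
    s2 = (resid ** 2).sum() / (N * T - 2 * N - T + 1)
    var_b = s2 * np.diag(np.linalg.inv(A - Sc @ Sc.T / N))
    return a, b, w, var_b

# Hypothetical data: 4 subdivisions, 20 years, large common noise.
rng = np.random.default_rng(0)
N_SUB, N_YEARS = 4, 20
b_true = np.array([-0.03, -0.04, -0.05, -0.06])
S_sim = rng.uniform(2.0, 10.0, size=(N_SUB, N_YEARS))
common = rng.normal(0.0, 0.25, size=N_YEARS)
individual = rng.normal(0.0, 0.05, size=(N_SUB, N_YEARS))
Y_sim = 0.5 + b_true[:, None] * S_sim + common[None, :] + individual

a_hat, b_hat, w_hat, var_b = fit_joint_ricker(S_sim, Y_sim)
print("b estimates:", np.round(b_hat, 3))
print("standard errors:", np.round(np.sqrt(var_b), 3))
```

Because the stock deviations in each row sum to zero, the estimated common effects also sum to zero over years, so any constant part of the common noise is absorbed into the "a" estimates rather than the "b" estimates.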
The program kept track of whether the individual point estimate or the joint point estimate was closer to the true parameter value. It also kept track of whether each true parameter value was in the range of the point estimate plus or minus its standard error for both the individual and the joint estimates. In order to compare the relative uncertainty of each estimate, each point estimate was divided by its standard error. The program calculated and stored whether the individual or joint method produced the largest value of this term.

Simulation Results

Results of the simulations are listed in Table 5.3. Joint parameter estimation became more effective as the number of subdivisions increased and the amount of common variation increased, as stated by Walters (1986). Figure 5.2 shows the results of 100 simulations with common standard deviation = 0.25, individual standard deviation = 0.05, and 8 subdivisions being analyzed. In this case the joint estimation procedure was much better than the individual procedure.

The method which produced the better point estimate depended on the amount of common noise present. Individual parameter estimation produced point estimates closer to the true value in cases of little common noise. True parameter values were more often in the range of the joint estimation point values plus or minus their standard errors. The method that produced the highest values of the point estimates divided by their standard errors also produced the closest point estimates. This value could potentially indicate the appropriate method with real data sets.

Table 5.3 Results of individual vs joint estimation simulations

4 subdivisions, σ common = 0.05, σ ind. = 0.25
                                                                                Ind.    Joint
% of simulations with closest point estimate                                    59.5    40.5
% of simulations with true value within range of pt. estimate ± stan. error    57.3    62.3
% of simulations with highest value of pt. estimate ÷ standard error           67.0    33.0

8 subdivisions, σ common = 0.05, σ ind. = 0.25
                                                                                Ind.    Joint
% of simulations with closest point estimate                                    55.0    45.0
% of simulations with true value within range of pt. estimate ± stan. error    57.8    59.6
% of simulations with highest value of pt. estimate ÷ standard error           63.0    37.0

4 subdivisions, σ common = 0.25, σ ind. = 0.05
                                                                                Ind.    Joint
% of simulations with closest point estimate                                    14.0    86.0
% of simulations with true value within range of pt. estimate ± stan. error    57.5    67.8
% of simulations with highest value of pt. estimate ÷ standard error           16.5    83.5

8 subdivisions, σ common = 0.25, σ ind. = 0.05
                                                                                Ind.    Joint
% of simulations with closest point estimate                                    12.0    88.0
% of simulations with true value within range of pt. estimate ± stan. error    57.6    67.4
% of simulations with highest value of pt. estimate ÷ standard error           12.0    88.0

Figure 5.2 Result of simulations comparing individual to joint estimation with high level of common variation (x-axis: Parameter estimate, true value = -0.00003; series: Joint Estimation, Ind. Estimation)

Case Study

Case Study Methods

The Ricker equation was fit to the population data from each herd using both the individual estimation procedure and the joint estimation procedure. Results from chapter four showed that common variation within a group decreased as the number of herds within the group increased. Since common variation and number of herds could not both be maximized, intermediate values of both were selected. Joint estimates were calculated using each of the three sets listed in chapter four (Table 4.3).

Case Study Results

Values of the "b" parameter from both individual and joint estimation are listed in Table 5.4. The joint parameter variances were all less than the corresponding individual variances. The joint estimates were preferred for two reasons: First, the clusters selected shared common variation.
Second, the value of the joint parameter estimate divided by its standard error was greater in most cases than the corresponding individual statistic. For the first set, involving primarily southern herds, 16 of the 20 values were higher; for the second set, involving primarily central herds, 13 of the 18 values were higher; and for the third set, involving primarily northern herds, 12 of the 18 values were higher. Cluster analysis produced the same ranking of these sets based on their similarity.

Table 5.4 Estimates of Ricker "b" parameter by individual and joint parameter analysis

Herd   Joint Estimate   Ind. Estimate   Joint Variance   Ind. Variance
  1    .95E-04    .83E-04    .73E-09    .72E-09
  2    .16E-03    .16E-03    .10E-08    .14E-08
  3    .59E-03    .60E-03    .26E-07    .35E-07
  4    .12E-03    .13E-03    .20E-08    .17E-08
  5    .15E-03    .15E-03    .19E-08    .22E-08
  6    .20E-03    .17E-03    .36E-08    .54E-08
  7    .26E-03    .25E-03    .26E-08    .12E-07
  8    .11E-03    .76E-04    .17E-08    .24E-08
  9    .89E-04    .99E-04    .13E-08    .23E-08
 10    .61E-04    .59E-04    .35E-09    .63E-09
 11    .17E-04    .90E-05    .11E-09    .13E-09
 12    .55E-04    .51E-04    .95E-09    .10E-08
 13    .95E-04    .15E-03    .17E-08    .37E-08
 14    .15E-03    .17E-03    .51E-08    .27E-08
 15    .10E-03    .61E-04    .34E-08    .32E-08
 16    .16E-03    .13E-03    .11E-08    .31E-08
 17    .23E-03    .21E-03    .47E-08    .60E-08
 18    .16E-03    .19E-03    .22E-08    .36E-08
 19    .93E-03    .81E-03    .35E-07    .76E-07
 20    .14E-03    .17E-03    .95E-09    .11E-08
 21    .46E-04    .52E-04    .35E-09    .36E-09
 22    .49E-04    .72E-04    .30E-09    .13E-08
 23    .10E-03    .14E-03    .55E-09    .14E-08
 24    .26E-03    .40E-03    .33E-08    .10E-07
 25    .15E-03    .16E-03    .16E-08    .51E-08
 26    .41E-03    .65E-03    .26E-07    .78E-07
 27    .87E-04    .12E-03    .20E-08    .37E-08
 28    .34E-03    .25E-03    .22E-07    .23E-07
 29    .17E-03    .21E-03    .21E-08    .17E-08
 30    .49E-03    .42E-03    .32E-07    .28E-07
 31    .58E-03    .58E-03    .16E-07    .23E-07
 32    .31E-03    .45E-03    .23E-07    .53E-07
 33    .26E-03    .34E-03    .63E-08    .11E-07
 34    .39E-03    .30E-03    .16E-07    .27E-07
 35    .33E-03    .19E-03    .68E-08    .23E-07
 36    .35E-03    .30E-03    .12E-07    .38E-07
 37    .28E-03    .10E-03    .69E-08    .25E-07
 38    .23E-03    .15E-03    .19E-07    .67E-08
 39    .42E-03    .15E-03    .13E-07    .27E-07
 40    .43E-03    .25E-03    .97E-08    .15E-07
 41    .26E-03    .26E-03    .19E-07    .22E-07
 42    .22E-03    .18E-04    .12E-07    .22E-07
 43    .82E-05    .17E-04    .17E-08    .16E-08
 44    .17E-03    .14E-03    .59E-08    .61E-08
 45    .71E-04    .31E-04    .26E-08    .16E-08
 46    .12E-03    .11E-03    .15E-07    .23E-07
 47    .34E-03    .23E-03    .12E-07    .19E-07
 48    .41E-04    .28E-04    .65E-08    .93E-08
 49    .18E-03    .11E-03    .52E-08    .43E-08
 50    .11E-03    .14E-03    .11E-07    .15E-07
 51    .74E-04    .92E-04    .36E-08    .13E-08
 52    .28E-04    .57E-04    .19E-07    .98E-08
 53    .58E-04    .76E-04    .11E-07    .84E-08
 54    .80E-03    .35E-03    .91E-07    .20E-06
 55    .13E-03    .91E-04    .64E-08    .75E-08
 56    .57E-04    .36E-04    .15E-08    .18E-09

Discussion

When the appropriate assumptions are valid, both the Bayesian method and the estimation of common external effects method can provide improved parameter estimates within subdivisions compared to individual estimation procedures. These are only two of the possible methods which can be used to improve parameter estimation for subdivided populations. Another possible approach would involve a joint estimation procedure where one parameter is estimated common to all subdivisions, while a second parameter is estimated uniquely for each subdivision. Additional simulations are needed to better understand the uses of these and other methods. Additional information will be gained when these methods are used on other data sets.

As these methods show, the Finnish reindeer herds were very highly managed. With the Bayesian approach, variances around the mean growth rates within herds were very small relative to the overall population level. Managers probably attempted to keep these rates constant.
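The reason small within-herd variances leave the Bayesian estimates nearly unchanged can be seen in a minimal sketch. It assumes that the adjustment is the usual precision-weighted average of a normal prior and a normal estimate; this formula is an assumption made for illustration, not a statement of Walters' (1986) exact procedure, although it reproduces the adjusted Ricker "a" values for herds 2 and 20 in Table 5.2. The function name `shrink_estimate` is hypothetical.

```python
import math

def shrink_estimate(theta_hat, var_i, pop_mean, pop_var):
    """Precision-weighted average of a herd estimate and the population mean.

    Assumed form: the herd estimate gets weight pop_var / (pop_var + var_i),
    so a small within-herd variance leaves the estimate almost unchanged.
    """
    weight = pop_var / (pop_var + var_i)
    return weight * theta_hat + (1.0 - weight) * pop_mean

# Population mean and variance of the Ricker "a" estimates, from the text.
POP_MEAN, POP_VAR = 0.66, 7.98e-2

herd_2 = shrink_estimate(1.25, 2.34e-3, POP_MEAN, POP_VAR)
herd_20 = shrink_estimate(1.38, 2.20e-3, POP_MEAN, POP_VAR)
print(round(herd_2, 2), round(herd_20, 2))

# A Ricker "a" of 1.36 implies a maximum annual growth rate of exp(1.36).
print(round(math.exp(1.36), 2))
```

With the within-herd variances one or two orders of magnitude below the population variance, the weight on the herd estimate stays above about 0.95, which is why most entries in Table 5.2 change by 0.02 or less.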
Higher growth rates were found in the southern herds, possibly due to the culling of reproductively inactive animals. Managers kept most of the populations at a constant size by harvesting the surplus animals each year. Managers of herds with smaller population growth rates may want to experiment to see if higher population growth rates can be achieved in order to provide greater surpluses for harvest each year.

The Ricker model ended up having little meaning when applied to the reindeer herds, because very little, if any, density dependence existed in this data set. Values for the Ricker curve show several different kinds of curve fits. In some cases, the carrying capacity term was negative, showing a population growth rate that increased as the population grew. This was true of many of the southern herds. In other cases the parameters produced a curve that increased at low population levels much faster than biologically possible in order to level out through the actual data points. The Ricker a values were biased upwards even after the Bayesian modifications.

Carrying capacities are a factor in determining the maximum number of animals that can be harvested each year. Managers of the reindeer herds have provided little of the informative variation needed to estimate carrying capacities. Managers may wish to experiment by allowing their populations to increase (and/or decrease) in order to provide informative variation within their population data. This would allow carrying capacities to be estimated and populations to be increased to their maximum potential.

The actions of management should strongly influence the structure of similarity found by the methods used in chapter 4. Adjoining subdivisions will have similar residuals due to similar environmental conditions and similar management policies.
Management policies were similar within regions, as evidenced by the high growth rate policy of managers in the south.

CHAPTER 6

CONCLUDING REMARKS

The research presented in this thesis discusses some ways that resource managers could improve the management of subdivided populations. The systematic approach for coordinating policies among subdivisions is new to resource management. This thesis expands upon the ideas introduced by Walters (1986). In this chapter, I briefly review the primary results of each chapter and discuss future research directions.

The simulations in Chapter 2 showed that subdivisions sharing similar external effects can be correctly identified under proper conditions. Subdivisions sharing high levels of common variation relative to individual variation have a greater probability of being correctly identified. The correlation coefficient was more reliable than either of the other statistics tested. The mean cross correlation coefficient was a good indicator on average of the relative amounts of common to individual variation. With this information, managers can begin to roughly gauge the expected effectiveness of these methods on their data sets.

Results of the case study in Chapter 2 showed that the reindeer data contained common variation. The mean cross correlation was greater than zero. Unfortunately, no information was available on the causes of variation within the population which would have enabled the subdivisions to be grouped according to similar external factors. The pattern of similarity changed between the first half and the second half of the time series.
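The link between the mean cross correlation and the relative amounts of common and individual variation can be illustrated with a short simulation. This is a sketch under stated assumptions: residuals are the sum of a common normal effect and an independent normal effect of equal variance in every subdivision, in which case the expected pairwise correlation is the common variance divided by the sum of the common and individual variances. The function names and the 200-year series length are hypothetical choices made only to keep sampling error small.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_cross_correlation(series):
    """Average correlation over all pairs of subdivision residual series."""
    pairs = list(combinations(series, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)

def simulate_residuals(n_sub, n_years, sd_common, sd_ind, rng):
    """Residuals = common yearly effect + independent subdivision effect."""
    common = [rng.gauss(0.0, sd_common) for _ in range(n_years)]
    return [[c + rng.gauss(0.0, sd_ind) for c in common] for _ in range(n_sub)]

rng = random.Random(1)
high = mean_cross_correlation(simulate_residuals(8, 200, 0.25, 0.05, rng))
low = mean_cross_correlation(simulate_residuals(8, 200, 0.05, 0.25, rng))
print(f"high common noise: {high:.2f}, low common noise: {low:.2f}")
```

Under these assumptions the expected values are about 0.96 for the first case and about 0.04 for the second, so a mean cross correlation near zero signals that little common variation is available to the methods.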
The simulations in Chapter 3 showed that the presence or absence of autocorrelation, measurement error, and extremely low residuals could aid managers in gauging the effectiveness of these methods. High levels of autocorrelation hindered the ability to correctly identify similar subdivisions. Measurement error at any level decreased the probability of correctly identifying similar subdivisions. If extremely bad years represented by low residuals were present in common among subdivisions, the ability to correctly identify similar subdivisions increased.

The reindeer herds displayed low levels of autocorrelation, and little, if any, measurement error. Extremely low residuals occurred within the same year to herds located within regions of the population range. All three of these results encourage the use of the methods to identify similar subdivisions.

Chapter 4 introduced methods to select control and treatment units for management experiments and index units for monitoring. In order to select the best candidates for management experiments, all possible groups were ranked based on the minimum correlation coefficient within each group. Groups with the highest minimum correlation shared the most similar external effects and would be the best candidates for management experiments. For large experiments hierarchical clustering was suggested as a way to reduce the number of possible groups by ruling out highly dissimilar groups. In order to select index units, all subdivision pairs with correlation values above a pre-set level can be listed, then divided into index units and herds to be represented. The higher the level of similarity desired, the fewer herds can act as index units and the fewer herds can be represented.

Possible experimental units and index units were identified for the reindeer herds using the new methods.
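The two selection rules summarized above can be sketched briefly. The correlation matrix below is hypothetical and the function names are illustrative; the sketch only shows ranking every possible group by its minimum pairwise correlation and listing the pairs above a pre-set level, not the full procedure of Chapter 4.

```python
from itertools import combinations

def rank_groups(corr, size):
    """Rank all groups of `size` subdivisions by minimum pairwise correlation."""
    ranked = []
    for group in combinations(range(len(corr)), size):
        min_r = min(corr[i][j] for i, j in combinations(group, 2))
        ranked.append((min_r, group))
    ranked.sort(reverse=True)  # highest minimum correlation first
    return ranked

def similar_pairs(corr, threshold):
    """List subdivision pairs whose correlation is at or above the threshold."""
    n = len(corr)
    return [(i, j) for i, j in combinations(range(n), 2) if corr[i][j] >= threshold]

# Hypothetical correlations among the residuals of four subdivisions.
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]

best_min, best_group = rank_groups(corr, 3)[0]
print("best group of three:", best_group, "minimum correlation:", best_min)
print("pairs at or above 0.75:", similar_pairs(corr, 0.75))
```

In this example subdivision 0 appears in every pair above the threshold, so it would be a natural candidate for an index unit representing subdivisions 1 and 2.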
Similar herds were located close to one another. Many of the larger groups were found to be within regions that experienced extremely low residuals in the same year (from chapter 3).

The idea of selecting experimental units based on shared external effects would potentially benefit many fields in addition to resource management. In many types of experiments, units for controls and treatments are not and cannot be identical. Selecting units with similar external effects minimizes the chance that factors other than the experimental manipulation could cause differences between the controls and treatments. For ecological studies where time series data are available by region, these methods provide an additional level of information to aid in the selection of experimental units.

Chapter 5 provided examples of two of the methods for improved parameter estimation as described by Walters (1986). In Walters' book the methods were not used on sample data sets. It is hoped that the examples of the methods provided in this chapter will make it easier for managers to use these methods on their own data sets.

In addition, the reindeer herds produced some interesting results. There was little to no evidence of density dependence within the reindeer time series data. The herds were so highly managed that in most cases the population sizes were not allowed to vary and provided little informative variation. This was also confirmed by the low variances around the mean growth rate parameters. The Bayesian method changed the parameter estimates very little because the overall population variance was so much larger than the variances around individual herds.

Reindeer managers appear to harvest the surplus animals each year and keep the populations at a relatively constant level. The number harvested is dependent on the mean growth rate as well as the total population size within each herd.
For managers to improve their harvests, they should increase the mean growth rate and/or increase the total population size of each herd.

The mean growth rate among subdivisions varies greatly (from chapter 5). Some of the southern herds maintained mean growth rates as high as 1.4 to 1.5. Managers of less productive herds may wish to experiment to see if they can increase the productivity of their herds by changing their composition. By replacing the older, less fertile animals with additional young females the mean growth rates for the herd can be increased.

In order to determine if the population size can be increased, management experiments can attempt to establish the carrying capacity of the environment. The population size could then potentially be increased to safe levels just below the carrying capacity.

This thesis has provided examples of the benefits available to resource managers by adopting coordinated policies among population subdivisions. New methods were presented in the areas of experimental design and monitoring allocation, and examples of recent methods were presented in the area of parameter estimation. Many fisheries and wildlife data sets exist where the data have been collected by region, yet the data are either analyzed independently for each region or combined and analyzed for the entire population. Chapter 5, along with Walters' book (1986), suggests that managers may do better by trying joint parameter estimation. Although many wildlife and fisheries managers conduct experimental policy tests, units are selected arbitrarily. Chapter 4 introduces methods for selecting experimental units which may improve the chance of observing effects of the experimental policy. Wildlife and fisheries managers often select units to monitor more intensively either at random or out of convenience. Chapter 4 introduces methods to select units that will more closely represent others.
There are, however, limits to these methods. Chapters 2 and 3 point out that many data sets are inappropriate for these methods.

The systematic study of subdivided populations is a relatively new field. Research in this area is based on one important fact: each subdivision provides information not only about itself but about other subdivisions as well. All of the methods in this thesis attempt to take advantage of this additional information. Researchers will learn a great deal as concepts and methods are introduced in the field and actually implemented on real populations. New parameter estimation techniques will be created and others will undergo additional testing. These methods will provide uses in other fields such as the design of experiments in ecological studies.

Literature Cited

Anonymous. 1987. Suomen Porotalous. The Reindeer Industry of Finland. Paliskuntain Yhdistys. Rovaniemi, Finland. 5 pages.

Atkinson, K. and D.W. Janz. 1986. Effect of wolf control on Black-Tailed deer in the Nimpkish Valley on Vancouver Island. Wildlife Working Report Number WR-19. Ministry of the Environment. Wildlife Branch. Nanaimo, B.C. 31 pages.

Bailey, J.A. 1984. Principles of Wildlife Management. John Wiley and Sons. New York. 373 pages.

Box, G.E.P. and G.M. Jenkins. 1976. Time Series Analysis: forecasting and control. Revised edition. Holden-Day. San Francisco. 575 pages.

Chatfield, C. 1980. The Analysis of Time Series: An Introduction. Second edition. Chapman and Hall. London. 268 pages.

Cochran, W.G. 1977. Sampling techniques. Third Edition. John Wiley and Sons. New York. 426 pages.

Cox, G.W. (ed.) 1969. Readings in Conservation Ecology. Meredith Corporation. New York. 595 pages.

Draper, N.R. and H. Smith. 1981. Applied Regression Analysis. Second Edition. John Wiley and Sons. New York. 709 pages.

Everitt, B. 1980. Cluster analysis. Second Edition. Halsted Press. John Wiley and Sons. New York. 136 pages.

Giles, R.H. Jr. (ed.) 1969. Wildlife Management Techniques. Third Edition. The Wildlife Society. Washington, D.C. 632 pages.

Grant, W.E. 1986. Systems analysis and simulation in wildlife and fisheries sciences. John Wiley and Sons. New York. 338 pages.

Gulland, J.A. 1983. Fish Stock Assessment. John Wiley and Sons. Chichester. 223 pages.

Hatter, I. and D. Janz. 1986. Rationale for wolf control in the management of the Vancouver Island predator-ungulate system. Wildlife Bulletin No. B. Ministry of the Environment. Wildlife Branch. Victoria, B.C. 33 pages.

Holling, C.S. (ed.) 1978. Adaptive environmental assessment and management. Wiley. Chichester, New York. 377 pages.

Ludwig, D. and C.J. Walters. 1981. Measurement errors and uncertainty in parameter estimates for stock and recruitment. Can. J. Fish. Aquat. Sci. 38:711-720.

May, R.M. (ed.) 1984. Exploitation of Marine Communities. Springer-Verlag. Berlin. 370 pages.

Murthy, M.N. 1967. Sampling theory and methods. Statistical Publishing Society. Calcutta, India. 684 pages.

Mysak, L.A. 1986. El Nino, interannual variability and fisheries in the northeast Pacific Ocean. Can. J. Fish. Aquat. Sci. 43(2):464-497.

Nelson, L.A. and D.L. Johnson (eds.) 1983. Fisheries Techniques. American Fisheries Society. Bethesda, Maryland. 468 pages.

Ricker, W.E. 1954. Stock and recruitment. J. Fish. Res. Board Can. 11:559-623.

Ricker, W.E. 1975. Computation and interpretation of biological statistics of fish populations. Bulletin of Fish. Res. Board Can. No. 191. 382 pages.

Robinson, W.L. and E.G. Bolen. 1984. Wildlife Ecology and Management. Macmillan. New York. 478 pages.

Shaw, J.H. 1985. Introduction to Wildlife Management. McGraw-Hill. New York. 316 pages.

Sneath, P.H.A. and R.R. Sokal. 1973. Numerical Taxonomy: the principles and practice of numerical classification. Freeman and Co. San Francisco. 573 pages.

Sokal, R.R. and C.D. Michener. 1958. A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull. 38:1409-1438.

Sokal, R.R. and P.H.A. Sneath. 1963. Principles of Numerical Taxonomy. W.H. Freeman and Company. San Francisco. 359 pages.

Starfield, A.M. and A.L. Bleloch. 1986. Building Models for Conservation and Wildlife Management. Macmillan. New York. 253 pages.

Stephenson, W. 1936. The inverted factor technique. Brit. J. Psychol. 26:344-361.

Tanner, J.T. 1978. Guide to the Study of Animal Populations. The University of Tennessee Press. Knoxville. 186 pages.

Tyler, A.V. and G.A. McFarlane (eds.) 1985. Groundfish stock assessments for the west coast of Canada and recommended yield options for 1985. Canadian Manuscript Report of Fish. and Aquat. Sci. No. 1813. 353 pages.

Wakeley, J.S. (ed.) 1982. Wildlife Population Ecology. The Pennsylvania State University Press. University Park. 385 pages.

Walters, C.J. 1985. Bias in the estimation of functional relationships from time series data. Can. J. Fish. Aquat. Sci. 42:147-149.

Walters, C.J. 1986. Adaptive management of renewable resources. Macmillan. New York. 374 pages.

Watt, K.E.F. 1968. Ecology and Resource Management. McGraw-Hill. New York. 450 pages.

Zubin, T. 1938. A technique for measuring like-mindedness. J. Abnorm. Soc. Psychol. 33:508-516.
Opportunities for management created by spatial structures : a case study of Finnish reindeer Berkson, James Meyer 1988
Item Metadata
Title | Opportunities for management created by spatial structures : a case study of Finnish reindeer |
Creator |
Berkson, James Meyer |
Publisher | University of British Columbia |
Date Issued | 1988 |
Description | This study examines opportunities for renewable resource management when population data are collected by spatial subdivisions. In particular I look at potential applications for the design of management experiments, the distribution of monitoring resources, and the improvement of parameter estimation. Methods are developed to rank possible groupings of subdivisions for use as experimental units. Factors external to the experiment can cause differences between experimental units. Selecting subdivisions that have reacted similarly in the past to external factors could minimize the risk of external factors creating differences in experimental units. Methods are developed to identify subdivisions that could provide information about similar subdivisions when monitoring resources are low or when stratified sampling is being used. The use of these subdivisions as "index units" could notify managers of extremely good or bad years in a large number of subdivisions. Two methods developed by Walters (1986) provide innovative estimation techniques that can be used with subdivided populations. A Bayesian approach allows parameter estimates to be adjusted using a known distribution. Another approach allows similar subdivisions to be estimated jointly more accurately than would be possible individually. Not all renewable resource data sets provide reliable information for use with these applications. Data sets where there is little common variation, high levels of autocorrelation in the noise, or even modest amounts of measurement error are inappropriate for most methods. A series of steps is introduced for managers to test the reliability of the methods on their particular data sets. Data on Finnish reindeer (Rangifer tarandus tarandus) are used throughout the thesis to illustrate the methods. The reindeer data appear to be appropriate for these methods when tested using the steps developed. 
Possible experimental units and index units for monitoring are identified. Walters' (1986) methods of parameter estimation are used on the data set as well. The reindeer data show that subdivisions with similar external effects were located close to one another. This pattern was at least partially caused by the existence of extremely bad years occurring within geographic regions. The reindeer subdivisions are very highly managed and provide little evidence of any kind of density dependence. Managers could potentially benefit by conducting experiments to test the biological limits of the population growth rates and carrying capacities within subdivisions. |
Subject |
Renewable natural resources Reindeer Animal populations |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-08-25 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0097656 |
URI | http://hdl.handle.net/2429/27799 |
Degree |
Master of Science - MSc |
Program |
Zoology |
Affiliation |
Science, Faculty of Zoology, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |