Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Discrete transect designs for estimating animal population Li, Leah Min 1994

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


831-ubc_1994-0284.pdf [ 1.04MB ]
JSON: 831-1.0087533.json
JSON-LD: 831-1.0087533-ld.json
RDF/XML (Pretty): 831-1.0087533-rdf.xml
RDF/JSON: 831-1.0087533-rdf.json
Turtle: 831-1.0087533-turtle.txt
N-Triples: 831-1.0087533-rdf-ntriples.txt
Original Record: 831-1.0087533-source.json
Full Text

Full Text

DISCRETE TRANSECT DESIGNS FORESTIMATING ANIMAL POPULATIONbyLeah Mm LiB.Sc. (Mathematics), Fudan University, 1987A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinTHE FACULTY OF GRADUATE STUDIESDepartment of StatisticsWe accept this thesis as conformingto the required standard/J7THE UNIVERSITY OF BRITISH COLUMBIAApril 1994©Leah Mm Li, 1994In presenting this thesis in partial fulfilment of the requirements for an advanceddegree at the University of British Columbia, I agree that the Library shall make itfreely available for reference and study. I further agree that permission for extensivecopying of this thesis for scholarly purposes may be granted by the head of mydepartment or by his or her representatives. It is understood that copying orpublication of this thesis for financial gain shall not be allowed without my writtenpermission._________________________Department of_________________The University of British ColumbiaVancouver, CanadaDateDE.6 (2188)AbstractWhen studying the animal population in the forest, one sometimes collects dataalong a path which is called a transect. To estimate the size of an animal population,one travels along the transects and counts all animals sighted along the route.The goal of this project is to derive estimators of population size and to determineoptimal transect designs.11ContentsAbstract iiTable of Contents iiiList of Figures ivAcknowledgements v1 Introduction 12 Mathematical Background 73 The Optimal Estimator for Fixed Design 104 Optimal Design 205 Examples 365.1 Disjoint Transects 365.2 Two Overlapping Transects 425.3 The Case with Startup Cost 48References 71111List of FiguresFigure 1.1 An area with 6 subregions 3Figure 5.1.1 The optimal values of g 40FigureFigureFigureFigureFigureFigureFigureFigureFigureFigure5. 5.3.9 The optimal values of g4 with u2 = 0.2 67F1 with b — c> 1/4 41F1 with b—c.<1/4 42Regions studied in Theorem 5.4 52The optimal values of g with a2 = 0 62The optimal values of g with a2 = 0 63The optimal values of g3 with u2 = 0 63The optimal values of g4 with u2 = 0 64The optimal values of g1 with u2 = 0.2 66The optimal values of g with = 0.2 66The optimal values of g3 with = 0.2 67Figure 5.3.10 The optimal values of g with a2 = 0.2 68Figure 5.3.11 The optimal values of g with u = 0.6 68Figure 5.3.12 The optimal values of g with u2 = 0.6 69Figure 5.3.13 The optimal values of g3 with cr2 = 0.6 69Figure 5.3.14 The optimal values of g4 with u’2 = 0.6 70Figure 5.3.15 The optimal values of g with u2 = 0.6 70ivAcknowledgementsI would like to thank my supervisor, Nancy Heckman, for her guidance, support,inspiration, and encouragement throughout the development of this thesis. I had verygood fortune to study under her supervision.I would also like to thank Bent Jorgensen, for his careful reading and valuable comments.I am grateful as well to all the graduate students, faculty, and staff who made mygraduate study a rewarding experience.v1 IntroductionMany research programs for the management and conservation of wildlife will require,at some point, information on the number of animals in a region. Since counting allanimals in a large area may be expensive and impracticable, the population size isusually estimated from the number of animals counted in sampled subregions.The transect method has become a popular technique for estimating animal population size. It is being widely used either by individuals or groups. In transect sampling,the observer collects data while traveling along randomly located paths in the region.The transects are the strips running the length of the population area. The widthand the length of the strip can be decided in advance according to the factors underconsideration. The observer counts all animals sighted within the strip.We can travel in a few ways such as by light aircraft, by vehicles, or on foot. Themethod chosen depends on many factors like the cost of counting the sample and thesize of the area involved. Another important factor is the nature of the area such aswhether it is flat or mountainous and accessible or inaccessible.Light aircraft are flOW very widely used for wildlife censuses and surveys. They cancover large areas quickly by taking photographs from the air and are the only meansfor censusing in areas where access on the ground is difficult. Their use only becomeslimited when the vegetation is so thick that animals cannot be seen from the air.Counting from vehicles is practicable and economical. It gives excellent results insmall to medium sized areas that can be traveled by vehicles. It is limited when groundaccess is difficult or the area to be covered is very large.Foot counts can be used for a small sized area which is flat and accessible.For a discussion of estimation of animal population size, see Seber [7] and [8].We now consider a discretized version of the problem by dividing the whole area intoa finite grid and defining each transect to be some collection of units of the grid. We1want to estimate the population size by counting the number of animals in a randomselection of these discrete units of the grid.We denote the set of N(finite) subregions in the grid as X = {xi, x2,. . . , XN}. LetZ = (Z(xi),.. . , Z(xj))’ E R”, with Z(x) being the number of animals in the ithsubregion. For a given weight function on X, 4 = ((x1), . . . , (xv)) e RN, we arenow interested in estimatingNI = q(x)Z(x)based on observations from the randomly chosen subsets of X. (Here I includes both observed and unobserved components of Z). For instance, if we are interested in estimatingthe total number of animals in the region, we take q(x) = 1 for all x E X.Definition 1.1 A set T(a) is called a Transect if it is a nonempty subset of X, for ain an index set A = {ai, a2,. . . , ak}Typically, A does not represent all subsets of X, but rather a collection of subsetschosen by the investigator. The researcher may choose the direction of the strips andthe strip width, depending on physical features of the area and on the pattern of animaldistribution. For example, if there is a river running east-west and we believe thatanimals are distributed along the river, then we choose transects running in a northsouth direction. Otherwise, if we choose east-west transects, our transects will containeither many or very few animals. We assume that A is fixed.Now we consider an example of an area with six subregions given in the followingFigure:2Figure 1.1: An area with 6 subregionsIf we wanted to use oniy east-west transects, we would let a1 = {1,2,3}, a2 ={4,5,6}, and A = {a1,2}. Then T(ai) = {x1,23}and T(a2) = {x4,56}. If wechoose a1 = {l,4}, a2 {2,5}, a3 = {3,6}, and A = {ai,a2,3} then T(ai) {x1,4},T(a2) = {x2,x5}, and T(a3) = {x3,x6}. These transects are the north-south transects.Let G be a probability measure on A, such that ZLjG{a} = 1. We select transectsindependently and randomly, according to the design measure G. It would be preferableto study simultaneous transect choice, that is, to study probability measures on A =A x A x ... x A. However, this is a far more complicated problem and is not addressedin this work.We assume that no animal is counted more than once and that animals tend tostay in their territories when observed. Thus we avoid the possibility of duplication incounting. Then the accuracy of the observations will depend on the visibility of animals,which depends on things such as animal size, vegetation covers available, and activityof the animal at the particular time of the year. Suppose the process is observed witherror. Let = (e(x1),E(x2) e(xJ\r))t, where e(x) is the observation error in the ith3subregion, so the observation from the ith subregion is (x) = Z(x)+€(x). We assumeZ and are independent, and denote Z = Z + . Let aj, i = 1,2,. . . , n be elements of anindependent random sample from A, each with distribution G. The a1’s are independentof Z and . We assume E() = 0, E(Z) = , and and 6 are the covariance matricesof Z and , where and 6 are strictly positive definite. We also assume ji, 6 and areknown based on prior information.We restrict ourselves to an affine estimator, I, which is design-unbiased, that is,with E(IZ) = I, where the conditional expectation is calculated by averaging over allpossible transect choices. In this sense, I is an unbiased estimator of I, no matter whatthe value of Z. Such an estimator is possible only if our design is random or if we censusthe entire population. Thus, it is essential that we choose values of a from A at random.We write the affine estimate of I based upon the randomly chosen transects T(a)’s inthe form1 = l(a1,. ..= n1 c(x)((x)— (x)) + 7(ai,... , a,) +i1 XET(a:)= fli CaBZ + 7(ai,. . . , a) +where ZK = Z— = Z + € — . Thus Z* has mean 0 and covariance matrix + 6. Thevector c contains the weights a(X), used for observations along transect T(a). Bcis a matrix of 0’s and l’s of dimension d by N, where d indicates the cardinality ofT(a). BOZ* equals the vector of Z(x)—(x) values with x E T(a).We denote the cost of sampling along the transect T(a) by S(a). Typically S(a)would be related to the length of the transect and the difficulty of traversing the associated subregion.4Definition 1.2 We use the notationS = S(a)to denote the total cost of sampling along the n randomly chosen transects.The expected value of the total cost isE(Sn) = n S(a)G{a}.aEAWe are interested in finding an estimate and design measure that are in some senseoptimal. This work extends that of Heckman and Rice [3], who did not include cost.Ideally we want estimates with high precision that are obtained at low cost. Since biasindicates systematic error, it is reasonable to restrict ourselves to estimators that areunbiased. We formulate our problem as follows.Let c = (Cait, Ca2t, . , Caj) be the unknown vector of weights. For a given collectionof subsets, that is for a given A, find c and G to minimizeE(I—I)2+E(S (1.1)subject toE(1IZ) = I. (1.2)The expected values are taken with respect to both the random transect choice andthe random processes Z and e.We split this optimization problem into two steps by first fixing the probabilitymeasure G and minimizing (1.1) over c satisfying (1.2), then minimizing the resultingvalue of (1.1) over G.5In Chapter 2, some mathematical definitions, notation and results are provided.They will be frequently used in the following sections. In Chapter 3 we will show thatfor given design measure G, we can easily calculate the closed form of the optimalestimator. In Chapter 4 we will also show that an optimal design measure exists.In general, the optimal design may not be unique, and the explicit formula for theoptimal design is often difficult to write out. In Chapter 5 we study several special casesfor which we can find the closed form for optimal design, and also prove the uniquenessfor the optimal design in some situatiolls.62 Mathematical BackgroundIll Chapter 3 and Chapter 4 we will show that the optimization problem (1.1) and (1.2)is a convex programming problem. We now discuss the general solution for this type ofproblem.First we discuss the conditions for the existence of local minimizers for the optimization problem (we call it Problem A):minimize f(x) over x E Rsubjecttol(x)Oandh(x)=O,fori=1,...,m,j=1,...,p.We define the Lagrangian function L byp mj=1 i=1(Here the n is used for general R, it has nothing to do with the sample size n we usefrequently throughout this paper).Definition 2.1 Linear Independence Assumption: We say that the linear independence assumption condition holds at x E R if x satisfies the constraints of Problem(A), and the vectorsVl(x),i : l(x) = 0 and Vh3(x),j = 1,2,...are linearly independent. (The V denotes differentiation with respect to x).A general statement of the necessary conditions for x to be an optimal solution toProblem (A) is given by McCormick [5].Theorem 2.1 Suppose that f(.), l(.), and h(.) are continuously differentiable functions. If x is a local solution to (A) and the Linear Independence Assumption holds atx, then there exists= (,i, It2,. . , urn), ) = (A1 ‘2,. . . , A7,) such thatttO,fori=1,2 rn,71ul(x)=O, fori=1,2,...,m, andVL(x,pt,t)= 0.We now discuss the sufficient conditions for x to be an optimal solution to Problem(A).Definition 2.2 A set of points B R is a convex set if and only if forx1,x2 e B andwe have\x1+(1—))xB.Definition 2.3 A function f is convex on a convex set B if and only if for x1,x2 e Band 0 <) < 1, we havef(xi + (1 — )X2) <f(xi) + (1— A)f(x2).Definition 2.4 Problem (A) is called a convex programming problem if the functionsf(.) and {l(.)} are convex and the functions {h(.)} are linear.The following Theorem due to Kuhn aild Tucker [4] gives the sufficient conditionsfor x to be an optimal solution to the convex programming Problem (A).Theorem 2.2 Suppose that Problem (A) is a convex programming problem and f(.),l(.), and h(.) are continuously differentiable functions. Suppose at some point x satisfying the constraints of Problem (A), there exist multipliers and A, such that0, for i= 1, 2, ... , m,tl(x)=0, fori=1,2,...,m, andVL(x,1t,A)= 0.Then x is a solution to the problem.8In practice, we can also solve the Problem (A) via computer programming. Forexample the FORTRAN subroutine NCONF is available in the IMSL MATH/LIBRARYwhich is a collection of FORTRAN subroutines and functions. We can write subroutinesto calculate f(x), h3(x) and l(x) for a given x and input an initial xo. Then NCONFiterates to find a solution to Problem (A). Since NCONF uses a gradient search method,f, h, and i must be differentiable.93 The Optimal Estimator for Fixed DesignThe object of this chapter is to find the optimal estimator for a fixed design measureG. The explicit form for the optimal I is given in Theorem 3.3.Notation and Preliminary ResultsThe definition of I depends on the function FG, defined as follows:Definition 3.1 Given G, we letPG(x) = G{a : x E T(a)}denote the probability that x is in a transect chosen randomly according to the designmeasure G.We first define a matrix that appears in our optimal estimator. Lemma 3.1 describesits property. Theorem 3.1 gives (3.1), the equivalent formula of (1.2) and states thatthe optimal I has y = 0. Theorem 3.2 shows the set of ca’s satisfying the constraint(3.1) is nonempty, and provides important properties of a matrix MG which is used tocalculate the optimal I. Theorem 3.3 proves the existence and the ulliqueness of theoptimal I for the fixed design and gives its closed form.Definition 3.2 We define the notationAa Ba.(Z+5)Ba.twhich is the covariance matrix of BajZ, the vector of Z(x) values for x E T(a).Lemma 3.1 Aaj is strictly positive definite for all a E A.Proof: Recall that d is the cardinality of T(a). Since Z + 6 is strictly positive definite,for v E RdiVtAaV VtBa( + 5)BaitV > 0.10The equality holds if and only ifBatv = 0.If Bav = 0, since Ba is full rank and BajBa = I, thenV = Ba.Ba.tV = BaO = 0.So the equality holds if and only if v = 0. Hence Aa2 is strictly positive definite. ITheorem 3.1 Fix G. Then we have the following results:a. (1.2) is equivalent to the following two conditions:BcG{a} = (3.1)aEAE(7(ci,c2,.. .,c)) = 0.b. In minimizing (1.1) subject to (1.2), we have y 0 a.e. GTh(GxGx... xG).Proof of (a): To prove (a), we write E(IIZ) = I asE (_‘ +7+ =that isCatBa(Z —1u)G{a} + E(7) = 4t(Z—aEAfor all values of Z. So we haveBaCaG{a} =aEAE(7(c1,a2..= 0.11Proof of (b): We wish to minimizeE(l — 1)2 + E(S) = E(I — + p — 1)2 + E(S)= E(I — )2 + E(I — t)2- 2E[(I--4i)} + E(S).However, the cross-product term isE[(1 — tIL)(j — = E[E((I — t)(I —4tp)Z)]= E[(I — qstp)E(I—4tZ)]= E(I — #t)2Thus, we write (1.1) as the following:E(I — 1)2 + E(S) = E(1 — t)2 — E(I — t)2 + E(S). (3.2)So for fixed G, we only need to minimizeE(1 — t)2 = E(n’(ctBZ* + 7))2= E(n’ctBZ*)2+ E(72)+ 2E(n’7 ctBZ*).Again, the cross-product term is zero, that isE(n’7 cBZ*) = n1E(7CtBaZ*)= :1E12andE(n1CtBciZ*)2 _2+n2i=1 jinE(c1tBZ*)+ n’(n — 1)E(CoitBciZ*Cc2tBaZ*).We calculatekE(citBiZ*)2=G{az}E(caiBajZ*)2= G{ai}ca1tBa(YZ + 8)BatCa.=G{aj}catAaca,,andk kE(CyitBy1Z*Cc2tBa*) = G{aj}G{aj}E(catBatZ*cajiBajZ*)i=1 j=1k k= G{ajG{a3}CajtBa ( + li)BajtCa.i=1 j=1By (3.1), we havek kE(citBa1Z*Co2tBo*) = ( caitBaiG{ai})( + 5)( BatCa3G{a3})i=1 j=1== E(qSt(Z — p))2 += E(I—t)2 + q5ç.Then we obtainE(1 — t)2 = n1kG{aj}ca1tA ca+(1 )E(I_tp)2+(1 — )tS+E(72). (3.3)This is the sum of four parts; one is a function of Ca’5, two are constants, and theother is E(72). E(’y2) does not depend on the Ca’5. Therefore to minimize (1.1) we take13E(72) = 0. From (a) we know that E(7) = 0, so -y is a constant and must be 0 a.e. G’.IFrom (3.2) and (3.3), we know that minimizing E(I — 1)2 + E(S) is equivalent tominimizingk kH(c, G) = n G{aj}catA caj + n S(a)G{a}.To minimize H for fixed G, we only need to minimize the first term.Theorem 3.2 Fix G. Then we have the following results:a. There exist Ca’5 satisfying (3.1) if and only if PG(x) = 0 implies (x) = 0.b. Let MG = aEA BatAa_1G{ }. Then MG is non-negative definite. MG is strictlypositive definite if and only if PG(x) > 0 for all x.Proof of (a): To prove (a), first we suppose that there exist Ca’5 satisfying (3.1) andx with PG(x) = 0. We will show that q(xj) = 0.If x e T(a), since G{a : x e T(a)} = 0, we have G{a} = 0. If x T(a), then theith component of BatCa is zero. Thereforeçb = çS(x) = (BatCa)jG{a} = 0.aEATo prove the converse, suppose that PG(x) = 0 implies q(x) = 0. We show thatthere exist Ca’5 satisfying (3.1).Let ca*(x) = q(x)/PG(x). Since PG(x) = 0 implies g(x) = 0, ca*(x) is well-defined ifwe take 0/0 to be zero. For such Ca*,(BatCa*)iG{a}= E T(a)}G{a}aEA aEA= {(x)/PG(x)}G{a : x E T(a)}=14So there exist Ca*S satisfying (3.1).Proof of (b): Since Aa is strictly positive definite, then for v E RNvtMGv = vtBatAa_lBavG{a}aEA>0.Now we want to show that VtMGV = 0 for some v 0 if and oniy if Po(x) = 0 forsome x.For any v, VtMGV = 0 if and only ifvtBatAa_lBavG{a} = 0for all a E A. If G{a} = 0, then BavG{a} = 0. If G{a} > 0, we have BaY 0. SoVtMGV = 0 if and only ifBavG{a} = 0for all a, that is if and only ifvI{x E T(a)}G{a} = 0for all i and a. Here v = (vi, v2,. . , VN).If VMGV = 0 and v 0, there exists v 0 and x such thatI{x E T(a)}G{a} = 0for all a. SoPG(X) = G{a : x E T(a)} = 0.Conversely suppose there exists x such that PG(x) = 0. Let v be a vector with ithcomponent one and all the others zeros. Then vtMGv = 0. I15The Optimal EstimatorBased on these Lemmas we have introduced, we now state the main result:Theorem 3.3 Fix G, and suppose FG(x) > 0 for all x. Then an optimal affine estimator I exists, is unique almost everywhere and is given by:1 = n_ltMl B(B(E + +I()withE(1 — 1)2 + E(S) =n’tMG’# —n1(+ 6) + + E(S).The unique Ca is given byCa = Aa’BaMG’cf’.In addition, if G{a} and G{a3} are non-zero, for some i andj, and if x e T(a)flT(a),then we havecov(Z(x),1(a)cr = a)) = cov((x),1(a) = as). (3.4)Comment: If the optimal estimator and design can be found using the method of Lagrangian multipliers, then the optimal estimator and design are characterized by Equation (3.4). In Example 2, two overlapping transects case, we can find the optimalestimator for the fixed design using Equation (3.4).Proof: From Theorem 3.1(b), to minimize (1.1) subject to (1.2), we take y = 0. So we16consider estimators of the formI = CjtBajZ* + 4’p.Now we minimize H. By Theorem 3.2(a), the set of Ca’S satisfying the constraint (3.1)is nonempty. For a with G{a} = 0, the value of Ca does not contribute to the constraint(3.1), nor to the function to be minimized. Therefore, we let A0 = {a : G{a} 0}and calculate the optimal Ca’S for a e A0. Without loss of generality we can writeA0 = {a1,a2, . , a}, for r k. Let*t_( t t tC— iCai , Ca2 , . , CarFor fixed G, E(S) is fixed. We therefore wish to minimize the functionf(c*) = catAacaG{a}aEA0subject to the linear constraintBatcaG{a} = 4’.aEA0First we will prove f is a strictly convex function by showing72f/vc* is positivedefinite. Since Aai is positive definite, then the matrix2Aai G{ai }_____— 2AaG{\7C*22Aar G{ar }is positive definite, so we know that f is a strictly convex function. Hence if the minimizer of f exists, then it is uniqile. Furthermore, a local minimum of f will be a globalminimum.We find the minimizing c by using a Lagrangian multiplier vector ) e RN, and wedifferentiate the following:17CatAacaG{i} — G{a}ctBA + t\aEA0 aEAowith respect to Ca for each a in A0, and set the results equal to zero. We then have thefollowing:1Ca = Aa’BaA. (3.5)We use the constraint (3.1) to solve for A and obtainBatCaG{a}aEA= BatAa’BaAG{a}aEA= MGA.Since PG(x) > 0 for all x, by Theorem 3.2(b) MG is positive definite, so MG1 exists.ThereforeA = 2MG1#. (3.6)Combining (3.5) and (3.6), we obtainCa = Aa’BaMG’çb. (3.7)We can easily calculate the optimal estimator which is the following:I = tMG_lBtA_lBZ* += fl_ltM_l BjtAj_lBajZ* +Substituting (3.7) into (3.3) and using (3.2) yieldsE(I — 1)2 + E(S) = cStM_BtA_lAA_BM_l#G{a}aEA—n1qtS+ (1 —n1)t6+ E(S)= n’4tMG’qS —n1t(+ 6)# + #t64 + E(S).18To verify (3.4) we suppose that G{a} and G{a3} are non-zero, for some i and j,and xl T(a) fl T(a3). For some 1’ and 1”, we have2(xl) (Bai)1, (Ba3Z)111.Applying (3.5) we obtainBaA = 2AaCa.SoA1 = (BaA)1,= 2(Aa Ca)1,.But(Aacaj1, = cov((Ba)1i,CajBajZ*)= cov((xi),1(a) = a).By a similar approach, we may also obtain )q = 2cov(Z(xi), I(a)c = a3). This proves(3.4). I194 Optimal DesignIntroductionWe now proceed to find the optimal design measure G assuming that q(x) > 0 for allx. In Chapter 3, we showed that the unique optimal affine estimator for a given designmeasure exists if and only if PG(x) > 0 for all x. We therefore define the following:Definition 4.1 We denote 1’ to be the set of all probability measures on A such thatPG(x) > 0for all x.Thus from Theorem 3.3, we want to minimizeF(G) =over G F. The following Lemmas and Theorem will show that an optimal designexists. In Lemma 4.1, we show that F is a convex set. In Lemma 4.3 and Lemma 4.4we show that F is continuous and convex on F. Unfortunately, it is easy to see thatF is not a closed set, so more work is required to show that an optimal design exists.Lemma 4.5 provides us with the necessary result. Theorem 4.1 states that the optimaldesign exists. Theorem 4.2 gives a property of the optimal design and estimator. Nowwe discuss the property of the set F.The Properties of the set F and the function FIn this section we will show that F is a convex set, and function F is continuous and alsoconvex on F. Therefore the problem of finding the optimal G is a convex programmingproblem which we have already defined in Chapter 2. Now we prove some importantresults.20Lemma 4.1 F is a convex set.Proof: Recall Definition 2.2 in Chapter 2. We must showAG1 + (1 — A)G2 e Ffor allO A land G1,G2 EF.Let GA = AG1 + (1 — A)G2. Then it is easy to see that GA is still a probabilitymeasure on A. Assume GA F, then there exists some x E X, such thatPG(x) = 0.This can be written as followsPGA(x) = APG1(x)+(l—A)PG2x=0.Since A 0 and 1 — A 0, so we must have eitherPG1(x) = 0orPG2(x) = 0which is a contradiction to our assumption of G1,G2 F. ITo study F, it is convenient to view G as an element in Rk and it is useful to definea matrix norm. Let G be associated with the vector g = (gi,g,. g)t such thatG{aj = gj. We will sometimes write F(G) as F(g) and MG as Mg. Now we define thenorm . for a matrix.Let A be an rn x n matrix with entries, We define the norm AW of A bylAW = max ajl.21The function II II: Rm< —* R satisfies(a) lAW > 0 for all A Rm’.(b) lAW 0 if and oniy if = 0 for all 1 <i < Tn, 1 j n.(c) IIaAW = al lIAR for all a E R, A E Rrnxn.(d) IA -i- BIl < IAII + IBII (This is called the triangular inequality).The function II II has the following property:IIABII < IIAII IBIIfor A E Rm<fl, B e RnXT. (See Graham [2] for a discussion of these properties).For y= (yl,y2,. . . ,yk) E Rk, the matrix M is defined ask=where R = Ba3tAai_lBa E RNXN. Now we prove the following lemma which will beused in the proof of Lemma 4.3.Lemma 4.2 Suppose G F, and G is associated with a vector g. For y Rk with IIIIsufficiently small, the matrix Mgy is positive definite.Proof: From Theorem 3.2, we know that Mg is positive definite. Let v E Rc withv 0 andm = mm vtMgv> 0.IIvII=1Then we can writefy(V) vtMg+yv= vt(Mg+My)vk= VtMGV + :: vtRjvyj.When IlvIl = 1, we havejvtR I lRII M22forM = maxWRII. So if we take y such thatBW< 2Mk’then< 2MkThus we obtainkf(v)= 1>0.Now for vW 0, thenfy(j)Thereforev(Mg + M)v vW2 > 0.So Mg+y is positive definite. INow we prove that F has the following properties:Lemma 4.3 F is a continuous function on F.Proof: For any G F, we want to show that F is continuous at G. It is enoughto show that for any > 0, there exists 6 = 6(g, ) > 0, such that whenever y =(yl, Y2,.. . , yk) e R’ satisfies 6, we haveF(g+y)—F(y) <.Since Mg is positive definite, from Lemma 4.2 we know that Mg+y is also positivedefinite for small, and so Mg+y’ exists. We writeF(g + y) — F(g) n’t(M — M’) + nS(a)y23= n#t((Mg + M) — + n S(a)y= n’t(Mg + M)(—MM1+ n S(a)y.Recall that we have defined M = By using the properties of matrixnorm we have the followingF(g + y) - F(g) fl’ j#My 1 (Mg + M) 1+ n S(a)y-i 2 IIRi yjj M’ (Mg + M)+n S(a)y< Ciy (Mg + M)W +where C1 = n’ 2WM’W and C2 = n>1S(a). Now we need to showthat I(Mg + M)W is bounded.For simplicity, we defineA(y) = Mg + M (aij(y)) E RNXN.So we obtainA(O) = MgA(y)= detA(y)UwhereAii(y)... A1N(y)UA(y)=AN1(y) ... ANN(Y)where A(y) is the cofactor of the element ajj(y). A(y) is defined to be (—1) timesthe determinant obtained by omitting the ith row and jth column of the matrix A(y).24The detA(y) is a continuous function of y. So we havedetA(y) = det(A(O) + M) —* detA(O) as —*For WIJ i for some we can writedetA(y)— detA(O) < detA(O)LSodetA(y) = detA(y) — detA(O) + detA(O)detA(O)— detA(y)— detA(O)I> detA(O) — detA(O)= detA(O)LWe write1=detA(y) UA(y)detA(y) UA(y) — UA0W + detA(y)whereA11(y) — A11(O)... A1N(y) — A1N(O)UA(y) — UA(o) =AN1(y)— AN1(O) ... ANN(Y) — ANN(O)Since A(y) — A(O) —* 0 as y —* 0 for all i and j, without loss of generality we haveA(y)— A(O)I 1,for W3’W <5. This leads to the followingUA(y) — UA(o)W N.Then we havedetA(O) (N + UA(o)W) = C3.25ThereforeF(g + y) — F(g) CiyC3+ C2jy <eif we take 62 = e/(C1C3+ C2) and < min{61,62}. ILemma 4.4 F is a convex function of G F.Proof: To prove this, we need to showf(A) = AF(G0)+ (1 — A)F(G1— F(AG0+ (1 — A)G1) 0for any G0, G1 E PandA e(0,1).Since we havef(o)=f(1)=0,then it is enough to show that f is a concave function for A (0, 1),that isf”(A) <0for A e (0, 1). Suppose G E F, g is the associated vector in Rc, h E Rc, and 6 E R. Todifferentiate f, we use the fact that since Mg is positive definite, then Mg + 6Mh is alsopositive definite for 6 small. Therefore M&h exists. So we can writeM4sh = (Mg+Msh)1= (Mg+6Mh)’= (I+6Mg_1Mh)_lMg_1= Mg’— 6Mg’MhMg + 0(62).So we haveM’g+sh Mg’Mg’MhMg’26as S —* 0. We denote MAgo+(1_)gi by M. From Lemma 4.1, we know that F is aconvex set. Then M is invertible. To calculate f’, we use the above calculations withg = \go + (1— )gi and h = go — g E RAT. Then as 6 —* 0, we obtainkf)) = F(go) — F(gi) + n4tM_lMgo_g1M lcS — nS(a)(Go{a} — Gi{aj).Furthermore, we can writeMjshMhMsh =(Mg’SMgMhMg’+O(62)) h(Mg’SMg).We expand the right hand side of the above formula, and by a similar argument we canobtain the following:MshMhMsh— Mg’MhMg1S, 2Mg1MhMg’as S — 0. Therefore we havef”) = _2n1#tM_lMgo_gi M’Mg0_g1I[’.Since Iit is positive definite, we obtainf”(\) i; 0.Therefore F is convex function of G E F. IThe Main ResultsThe task of this section is to show that the optimal design exists. First we give animportant result which will be used later ill the proof of Theorem 4.1.Lemma 4.5 Let {G1} be a sequence in F such thatlim minG1{a : x E T(a)} = 0.xIf g(x) >0 for all x, thenlimsupqfitMG’qS = Do. (4.1)l—oo27Proof: To prove (4.1), it is enough to show that there exists a subsequence of {Gi},{ G1”}, such thatlim tM 4, =1-ooSince Gj{a} is in [0, 1], and A is a finite set, there exists a subsequence {G1} and Gsuch thatG1i{a}—*G{a} for a11aA,where G is a probability measure on A. Therefore, since X is a finite set,PG1,(x) —* PG(x) for all x E X.Again, since X is a finite set, by assumption, there exists x E X such that PG(x) = 0.Then G{a} = 0 whenever x T(a). If x T(a) then (BatAa_lBa)ij = 0 for any j.Therefore(IYIG11) = >(BatAa_1B G,{a})ijaEA‘S (BatAa’Ba)jG{a}aEA=0.Since we can reorder the set X without loss of generality, then we can assume PG(x) = 0for all 1 i m and PG(x) > 0 for rn + 1 i k. So for some matrix A(N_m)x(N_m),°mxm °mx(N—m)MG1, —* M as 1 —* cc.°(N-m)xm A(N_m)x(N_m)By an argument similar to that in Theorem 3.2 (b), we can show that A(N_m)x(N_m)is strictly positive definite. Therefore, M has exactly in eigenvalues equal to 0, withcorresponding eigenvectors spanning the rn-dimensional subspace of RN,{(x1,2. . .,Xm,O,. . .,0)t : Xi E R}.We now study MG11. Let ).ij’s and v1’s be the eigellvalues and eigenvectors of MG1, withjv1’j = 1 for i = 1,2, . . . , N. First we show that all the Asj’s lie in a bounded set.28Since MG1, —* M, MG1IV —* Mv uniformly for v in compact set. Then we havesup MG1,v — Mv —* 0 as — 00.V:IIVII=1Sosup jILIG11V j — iWv1j —* 0 as 1’ —* cc.1<i<NButMG1IV1I =so we havesup \iivii— Mvi’W —÷ 0.1<i<NBy the Cauchy- Schwarz inequalityvl,t ()qivjij — Mviii) v1i \jiv1’— Mv1’Since vi’lI = 1, thensup — vi,ttMvi,, 0.1<i<NSincevl,tIV[vl, sup.1111vtiWv= maxwhere ‘max is the largest eigenvalue of M, then it follows that the \j’s are bounded.Using the spectral decompositions for MG1, and M, we obtainMG1, P11D,PtM=PDP29where p11tp, = j PtV = I, and D1 and D are diagonal matrices. Since F1’ =(viii,vii2,... ,ViIN) and Jvi’j = 1, all of the components of the matrix P1 are in [—1, 11.So there exists a subsequence {P1} of {P11} and some F withurn P11 = P.l”—ooThus plp* = limji’p1,,tp1 = I. Since we knowA1, 0 0 0o A112 0 0D1 =o o 0 Al’Nand A11 are in a bounded set, there exists a subsequence {D1111} of {D11} and D*, suchthatD111 D*.Therefore7V1G11, = p*D*p*tButMG11,, —f M = PDPt.Suppose we have ordered the diagonal elements of D and D*. Because of the uniquenessof the eigenvalues of M, we haveD=D*.From the discussion of M’s eigenvalues and eigenvectors, we know that the firstm elements of D are 0 and the first m columns of P span {(x1,X2, . . Xm, 0,... ,Therefore, the first m columns of F” span {(x1,X2, . . Xm, 0,.. . , O)t} and, for i =1,2,...(Ft#) ,. (p*i#). = 030and—* 0.Finally we showqSt ]41—+ 00 as 1”’ —÷ o.Nowtvl3’,,, ( —1 ],,, t) 4= (p11I,t)tDl,, 1(P1111t)N nfl ti’, i2—L’l” p)J— : “l”im nT-) tl\12II1-lll çi.L\i=1-+ 00.ITheorem 4.1 Suppose q(x) > 0 for all x. Then there exists G* F that minimizes F.Proof: By Lemma 4.3, F is a continuous function of the G{a}’s, i 1,2,. . . , k.However, I’ is not a closed set. For all x E X, we can find a sequence {G1} E F, withP0(x)—0asl-—*co. SowecannothaveGi—*GeF.We define F = {G F : mint PG(x) e}. It is a closed bounded set in Rk. SoinfGj-’ F(G) = F(G) for some E F.Now we show that there exists an e0 > 0 such thatinf F(G) = inf F(G) = F(G0).GEF GEFEQIf not, there exists a sequence {G1} with F(G1) decreasing andurn mm PG1(x) = 0.l—+oo31This contradicts Lemma 4.4 . IThus we reach the important conclusion that the optimal design measure exists onthe nonclosed set F.For a given G, the weights in the optimal estimator are functions of G, written as CG.So if we can find the design G* which minimizes F, then, by Theorem 3.3, the optimalweights, CG* can be calculated explicitly. Let d= > d. Then we haveH(cG*,G*)= minminH(c,G)H(CG,G)H(c,G)for any c e Rd and G e F. Therefore, (c*, G*) minimizes H.To minimize H subject to the constraints (3.1), we use the results in Chapter2. Here again G is associated with the vector g. Now in our case we have xt =(gi,g,.. . ,gk, c) Rk*, where k* = d + k. The constraint functions are:1(x) for i = 1,2,...,k,h(x) = [Btc—for j = 1,2,..hN+l(x) = — 1.As noted before, to use the Lagrangian method we need to check the Linear Independence Assumption at a point x. Now we define a matrix D = D(x). Its rows aretransposes of the vectors Vl(x),i : l(x) = 0, and Vh3(x),j = 1,2,. ..,N + 1. TheLinear Independence Assumption holds at x if and only if the rows of D are indepen—dent. Thus to check the linear independence we only need to check if the matrix is fullrank.32For a given xt = (gt, Ct), let 1 be the number of zero gj’s. Then we write the followingmatrixQi °lxdQ2 Qwhere Qi Rlxc is a matrix obtained from ‘kxk in the following way: remove rowi if g 0. The remaining rows make up Qi. The matrices Q2 R(N+l) andQ e are as follows:BaitCai . .. BaCaQ2—1andG{a1}Bait... G{ak}BaktNow we have the following result about the optimal design. Equation (4.2) showsa relationship between cost of sampling and variability of the estimator for certaintransect.Theorem 4.2 Suppose that I = n CBajZ* + gt and G is a local solution tothe problem of minimizing H subject to Equation (3.1), G{a} = 1, and G{a} 0,for i = 1,2,..., k. Suppose the matrix D is of full rank N + 1 + 1. (1 is the number ofzero G{a}’s). IfG{a}G{a} >0Var(I(cc = a) —n2S(a) = Var(1(oo = a) —n2S(a). (4.2)Proof: The function to be minimized and the constraint functions are continuouslydifferentiable at xt = (gt, Ct). By assumption, D is full rank N + 1 + 1. So the LinearIndependence Assumption holds at x. By Theorem 2.1, x is a critical point of theLagrangian equationk k kL(1t,. ., 1k, A1,.. . , Ak, /3) = n ca1tAacaG{ai} + n S(:)G{a} — jG{a}+2(t— G{aj}caBa2)A+ /3(ZG{aj} — 1).33Differentiating the Lagrangian equation with respect to G{a}, and equating to zero, weobtainflhCa.tAaCa + nS(a) — — 2CatBcL.) + /3 = 0. (4.3)Differentiating with respect to Cat givesG{aj}(2n’Aajca — 2Ba.X) = 0.For G{a} 0 the above can be written asBa). = fl’AaCa.By Theorem 2.1 we must have jj = 0. We substitute j = 0 and Baa). into Equation(4.3) and obtain= flhCatAaCa.— nS(a)= n’Var(I(co = a) — nS(a).So, when G{a} 0,Var(I(ca = a) —n2S(a) = n3.IIt is important to note that if G{aj = 0, for some i, then the minimizing G is onthe boundary of F. In this case, Equation (4.2) will not be valid anymore.In Chapter 5 we will have an example where the optimal design is on the boundary,that is the probability of choosing a particular transect is zero.In general, the optimal estimate and design must be calculated numerically. Wehave already proved that the optimal estimate of I and the optimal transect designexist. They may not be unique. But all these solutions will minimize the sum ofE(I — 1)2 and E(S).There is no available formula for the optimal design in general, although it is simpleto calculate numerically in specific cases by using computer programming. Recall we34have mentioned in Chapter 2, the FORTRAN subroutine NCONF can work quite wellfor this case. The FORTRAN subroutine can be used to minimize H with respect tox = (gt, Ct), or to minimize F with respect to gt and then calculate the optimal weightsby using Theorem 3.3. In the latter case if we input F(g), h(g) gj — 1, lj(g) = gj,and the initial g, then by using the subroutine we will get the output: the optimal g.We prefer minimizing F because the space we work on has lower dimension than whenwe minimize H.In some cases, we can still use the equations containing conditional variances andcovariances, given in Theorem 3.3 and Theorem 4.2, to solve for the optimal g and theoptimal weights. In the next chapter we apply the results in some special cases.355 ExamplesAs mentioned in Chapter 4, we cannot find general formulae for the optimal design. Alsothe optimal G may not be unique. So it is hard to understand how the cost functionS(.) and variability measures ( and (5)affect the optimal design.In general, it is reasonable to choose transects which are cheap to sample, and whichhave high variability. A large prior variance on Z(x) indicates vague prior information.In this chapter we study several examples in order to quantify the balance betweencost and variability. We study the case with k disjoint transects, the case with twooverlapping transects, and a case with a very specific physical sampling plan. Sections5.1 and 5.2 deal with the case of k disjoint transects and two overlapping transects. Inthese two cases we have the result that the optimal design is unique. In Section 5.3,we consider a sampling plan where there is an initial startup cost for each path. Weconsider the cost as two parts: the cost for setting up a sample location and the costfor traveling in the forest. In this case, we have explicit formulae for the optimal designfor some values of the cost function and the variability measures.5.1 Disjoint TransectsWe consider the case that the set of possible transects is a partition of the whole region.Even in this simple case, the optimal G cannot be found explicitly. However we can stillstudy some of the optimal G’s properties.Definition 5.1 The transects T(a), i = 1,2,..., Ic are called disjoint if T(a)flT(a) =O when i j, and ui T(a) = X.For simplicity, we defineVar(rT(a)36= Var(tBaBaZ*)= cI;ItBatAaBaicb.Theorem 5.1 Assume q(x) > 0 for all x. If the k transects are disjoint, then thereis a unique optimal estimate and a unique optimal design G F, with G{a} > 0 fori = 1,2,. . . , k. The optimal estimator hasCa = G{aj}1Baqfand the optimal G satisfiesnG{a}2 — nS(a) = /9where 9 is a Lagrangian multiplier.Proof: We order the set X so that T(ai) = {xi,x2,... ,xL1}, T(a2) = {XL1+1,. . .etc, with Lk = N. We assume L0 = 0, so the cardinality of the ith transect is d =L — L_1. For a given G, we haveAa1G{ai} 0o Aa21G{a}MG=o o... Aak’G{ak}Where Aag is a d x d matrix. For disjoint transects, PG(x) > 0 for all x impliesG{a}>0fori= 1,2,••,k. ThenM0’exists,andAai X G{ai}’ 0MG’= 0 Aa2 x G{a2}’0 0 Aak x G{ak}By Theorem 3.3, we obtainCa. = Aa.’Ba.MG’37—1 —1= Aaj [OdXL_,G{a} Aai, OdX(L_L)]çf—1= G{a} [OdxL_,‘ct2xd, OdX(L_L)]çb= G{aj}’Ba#.To find the optimal design G, we minimizeF(G) =kG{aj} tBaAaBa + nkS(a)G{a}= nG{a} +nS(a)G{a})over F. We know from Theorem 4.1 that the minimizing G exists. We now show it isunique by showing that F is strictly coilvex.We have proved in Lemma 3.1 that Aaj is strictly positive definite. Since Bac equalsthe vector of q(x) values with x e T(a), by the assumption that q(x) > 0 for all x, wehave Ba.# 0. So b, > 0.We calculate the followingb1G{ai}32__________—1 G{a2}3702bk3 2 2 .. .Since b/G{aj > 0, SO 7 F/7G is strictly positive definite. Thus F is a strictlyconvex function. The minimum is therefore achieved uniquely for G E F.Using Lagrangian multipliers with the constraint G{aj = 1, the Lagrangianfunction isk kL= nG{a} + nS(a)G{a}) + G{a} — 1),where 3 is the Lagrangian multiplier. By Theorem 2.1, we know that if G is the optimalsolutioll, then ãL/öG{a} = 0, i = 1, 2,... , k, that isnG{aj2 — nS(a) =38where 3 is the Lagrangian multiplier. IRemark: In this form it is easy to see that the optimal G{a} is a decreasing functionof S(a) for given b. Also for fixed S(a), G{a} is an increasing function of b. This is afairly obvious statement; the more expensive the sampling of transect T(a), the lowerthe probability that we choose the transect i; the more variable the observations alongtransect i, the more important it is to sample that transect.Two Disjoint TransectsTo better understand the effects of cost and variability on the optimal design, we considera simple case with only two disjoint transects. For a given A = {ai,a2}, supposethat T(ai) fl T(a2) = 0, T(ai) U T(a2) = X, and T(a1) = {x1,2... ,XL}, T(a2) ={XL+1, , xjy}. Let G{ai} = g and G{a2} = 1 — g. Here PG(x) > 0 for all x impliesg E (0, 1). From Theorem 5.1, the optimal weights are:1 (X2)Ca1q(xL)c(xL+1)1Ca21_gand the optimal g satisfiesb1__________— nS(ai)= 2 — nS(a2).ng n(1—g)Let LS = S(ai) — S(a2). If L\S = 0, then the optimal g is1= 1 + (b2/1)1239If zS 0, thenwhereandb c—-i+ 2+1_0,g (1—g)b =b1/(ri2tS)c =b2/(nAS).Thus for given b and S(a), we can find the unique optimal g e (0, 1) by solving afourth degree polynomial. Once we have the optimal g, it is easy to calculate theoptimal weights c and estimator I.The optimal values of g for different values of b and c are given in the followingfigure:1•co000Figure 5.1.1: The optimal values of gFigure 5.1.1 also supports the remark: for fixed c, if the variability of observationsalong transect 1 is large, then the probability of sampling this transect is large. Also4’40if the cost of sampling transect 1 is much larger than sampling transect 2, then theprobability of sampling transect 1 will be lower.Although an explicit formula for the optimal g would be very complicated, we caneasily determine that g < 1/2, = 1/2, or > 1/2, if b — c < 1/4, 1/4, or > 1/4respectively. To show this we will graph a function F1, which is related to F. WriteF(g) =+1b:3@18@2X1— + + nSg + nS(a2)rig n(1—g)b1______________flS(n2Sg +n2S(1— g) + g) + nS(a2)= nS(+g+)+nS(a2)nL.SF1(g) ± nS(a2).Minimizing F is equivalent to minimizing F1. Since F’1(1/2) = 4(c— b) + 1 and F1’ isalways positive, if b— c > 1/4, then F(1/2) < 0 and the optimal g is greatç than 1/2.Figure 5.1.2: F1 with b—c> 1/4F‘1g1410.5 1Figure 5.1.2 and 5.1.3 show the shapes of F1 for these two cases.Furthermore, for b— c = 1/4, the optimal g = 1/2, that is the probabilities ofsampling the two transects are the same. This means when cost functions and thevariability measures satisfy b— c = 1/4, then the probability of sampling either transectis the same.5.2 Two Overlapping TransectsNow we deal with the case with two overlapping transects. As in Section 5.1, theoptimal g cannot be found explicitly in this case. Again we discuss some properties ofthe optimal g. Theorem 5.2 shows that the optimal design is unique.If b — c < 1/4, thenF1(1/2) > 0 and the optimal g is less than 1/2.Figure 5.1.3: F1 with b — c < 1/4F10 g42Definition 5.2 Two transects T(a) and T(a) are called overlapping if T(a)flT(a)0.Suppose that A = {a1,a2}, T(a1) = {x1,x2,••• ,XL}, and T(a2) = {Xr+i, Xr+2, ,XN},where 1 < r < L N, so these two transects overlap. We partition X into three disjointsets asX1 = {x1,2. ,Xr}X2 = {Xr+1,Xr+2, ,XL}x3 = {XL+1,XL+2,... ,XN}.So T(ai) = Xi UX2, and T(a2) = X2 UX3. Set X2 is the overlapping part. Correspondingly, we partition all the vectors and matrices as follows:— Ii,. t J. t I tY — kWi W2 ,Y3Z—I *t *t *t1 , 2 , 3Calt = (c11t,2)t I t tCa2 = iC22 C2311 12 13= 21 >22 2331 32 33and11 12 13621 622 623631 632 633For example:= (() (x2),. . . ,q5(xr)), Z2 = (Z(r+i), Z*(Zr+2), , Z*(XL)),and is the covariance matrix of Z and Z2. For simplicity, we write I’ = + 6 with= + 6jj. So our optimal estimators are of the forml(ai) = 11Z +c12Z2+ q43l(a2) =c22tZ*+c23tZ3*+ q.Suppose G{ai} = g and G{a2} = 1 — g. Here PG(x) > 0 for all x implies g E (0, 1).First we defineTV = —for i = 1,3. The optimal design will depend on V1, V3, and S(ai) — 5(a3). Next weshow that the V1’s are Bayes risks.Lemma 5.1 Suppose that Z is multivariate normal. Then under squared error loss,V1 and V3 are Bayes risks of estimating, respectively, 1tZ* and 3tZ* based on Z2.Proof: First we show thatI jtT T —1 *= Pi ‘112’22 2is the Bayes estimate of qitZl* based on Z2’.To show this, we must show that w = IJ22_1I12141 minimizes the following:f(w) = E(qf1tZ — wtZ)2= E(1tZ)2— 2E(qS1tZwtZ) + E(wtZ)2= #tlIEc— 2ç1II2w+ wlPw.Differentiating f with respect to w and equating to 0 yields= —2W21q1+ 2P2w =0.So the minimizing w isW = I’221’21#1.So the Bayes estimate for 41tZ isIl = wZ2.Now we can calculate the Bayes risk of estimating#1tZi* from Z2*.44E(qf1tZ — l)2 = E(tZ)2— E(I)= E(#1tZ)2—=—‘1’12’112221)4iTherefore, V1 is the Bayes risk of estimating#1tZ* from Z2*.Similarly, we can show that V3 is the Bayes risk of estimating qfitZ3* from Z2*. •Theorem 5.2 Suppose that A = {ai,a2} and T(a1) and T(a2) are two overlappingtransects. Then, for a fixed g E (0, 1), there is a unique optimal estimate withc11 =C12= #2 + 22 1#— (1;C22= #2 + I22 ‘I’21#i— g11221123#3(1—g)C23 = #37(1 — g).Furthermore, a unique optimal g E (0, 1) exists. If /S = 0, then the optimal g is1g=1+(V3/V1)’2If S 0, then the optimal g satisfiesv-—+ +1=0g2 (1—g)2where u = Vi/(n2S) and v =V3/(n2S).Proof: First we find the optimal weights cj’s for a given g. For g E (0, 1), the optimal estimator exists and is unique by Theorem 3.3. Since the optimal weights satisfyconstraint 3.1,(c,,t,c12 Ot)g (ot c22,c23t)(l— g) = (#it, #2, #3t).45Then we obtainc11 = (5.1)gc12 + (1 — g)c22 = 4’2 (5.2)C23 = 13/(1 — g) (5.3)Since x E T(ai) fl T(a) if and only if x E X2, then by Equation (3.4), we havecov(Z2*,l(cc= ai) = cov(Z2*,l(a)c = a2),that iscov(Z2*,c11Zi +c12tZ2*) = cov(Z2*,c22Z +c23tZ3*).Calculating the above yieldsk112C1 + I’22c1 = q122c+ ‘P23c. (5.4)We combine the formulae (5.1)-(5.4), and obtain the expressions for c12 and c22.To find the optimal g, we suppose q(x) > 0 for all x. From Theorem 4.1, there existsan optimal g (0, 1). Now we want to minimizeF(g) = _l[c1tAcg+Ca2tA(1— g)] + nS(ai)g + nS(a2)(1 — g)= +V3+ 2332#2 + 2222 + 22t211 +g 1—gltJ21ti2_1lJ,#+ 3t32P2’Ji34+243tW2IJhlPicS]+nS(ai)g + nS(a2)(1 — g)+ 1—g +C] + nS(ai)g + nS(a2)(1 — g).Since V1 and V3 are the Bayes risks of estimating, respectively, qSitZl* and 4,3tZ* fromZ2*, they are positive. Now we write F asF(g) = _1[ +] + nSg + nS(a2)+ n’C= flZS[n2g/S + n2(1— g)/S + g] + nS(a2)+ n C.46So minimizing F is equivalent to minimizingVg 1—gFor given V1, V3, S(ai) and S(a2), we can find the unique g E (0, 1) by solving afourth degree polynomial, which is obtained by differentiating F2 and equating the resultto zero. Thus, we solveU V2+1g (1—g)Since2u 2vF2”(g)=—-+ 3>0,g (1—g)there is a unique optimal g in (0, 1).If /S = 0, then the optimal g solvesV1(1—= V3g2.Then the optimal g is1= 1 + (V3/1)2IRecall F1 in Section 5.1. F2 has same form as F1. So the graph of the optimal valuesof g for given values of u and v will be exactly the same as in Figure 5.1.2 except weuse u and v instead of b and c for our overlapping transects case.Although in this case an explicit formula for the optimal g would also be very complicated, as in the previous example, we can still determine that g < 1/2, = 1/2, or> 1/2 if u — v < 1/4, = 1/4 or > 1/4 respectively. The calculations are identical tothose in Section 5.1.Thus, when u is much larger than v, that is, when V1 is much greater than V3, weare more likely to sample transect T(ai). A large value of V1 indicates that #1tZi* isdifficult to estimate from Z2*. So if V1 is large, it is important to observe Z1*.475.3 The Case with Startup CostMany sampling designs involve collecting data along transects with starting points atvery different locations. In this case, there are two components to sampling cost. Thefirst component, which we call the startup cost, is the cost incurred by beginning sampling at a particular location. This cost usually involves travel expenses and sometimes,the hirillg and training of field assistants. The second component of cost reflects thedirect cost of sampling along the path. We therefore consider that the cost containstwo parts: c represents the price of setting up a sampling location, and y represents theprice of traveling one unit distance at that location. We assume c > 0 and y > 0, andthat neither c nor y depends on the transect.We now consider a simple case. Suppose we can start at two separate locations Aand B. From location A we can either travel one or two units distance, and from B wecan only travel one unit distance, counting all the animals sighted along the route.The physical description of the sampling region thus determilles our choices of transects. We can start at either place to travel one unit of distance, start at location A totravel two units of distance, start at both locations to travel one unit of distance fromeach location, or travel the whole area. Our optimal strategy will depend on the costsinvolved and the accuracy of our observations. It is reasonable to consider that settingup a samplillg location is more expensive than traveling a unit distance in the forestalthough we do not assume that in our calculation. The question arises if we shouldspend more money to visit the whole area, or if we should just travel in one or twosubregions.The region is divided into three subregions, x1, x2, and x3. Let x1 and x2 be,respectively, the regions for the first unit of distance and the second unit of distancewhen we start at A. Let x3 be the area we cover when we start at B and travel throughthe one unit of distance. Then there are five possible transects.48Suppose A =as follows:The transects and the corresponding costs are givenT{ai}T{a2}T{a3}T{a4)T{as}= {x1},= {x3},= {x1,2},= {xi,x2,3}= {xi,x3}8(ai) = c + yS(a2)=c+y8(a3) =c+2yS(a4) =2c+3yS(a5) = 2c + 2y.We write gj = G{a} for i = 1, 2, 3,4, 5, and g = (gl,g2,g3,g4,gs)t.Since A andB are separate locations, we assume the numbers of animals in regions x1 and x2 areuncorrelated with the number of animals in region x3. Also we assume the measurementerrors from three subregions are uncorrelated. To simplify calculations, we also assumethat the variances of Z(x) and j do not depend on i. Therefore, we suppose12 0°12 0o o 272 0 06= 0 2 0o o 72Then straightforward calculation shows that the matrix MG may be written as followsMOMG=O (g2+g4+g5)andINow we letr = 2 +7249whereM = (gi +gs) +2_122(g3+g4 r2_ui2(93 +94)r2_c7122 (g + g4) r2_j122 (93 + g4)Suppose that n = 1 and= (1, 11)tThen F can be writtenF(g)= MG’#+ZS(aj)th1 + 191 + g3 + g4 +95 g3 + g4 g2 + 94+95U1221 1 1g3 + 94 — gi +g + g4 + 95)+ 2U12+ 93 + g4 +95+c(gi +g2 +93 +2g4 +2gs)+y(gi +92 + 293 +394 + 2g5).Denote*U12 =rwhich is the correlation between 2(x1) and Z(x2),CS = —,randt=.It is obvious that s and t are both positive. For simplicity, we write= 91+93+94+95P2 = 93+94P3= g2+g4+g5.50Then we can rewrite the problem as minimizingF*(g)=subject toand gO for i=1,2,...,5.For specific values of u12, s, and t, the optimal gj’s can be calculated easily usingthe FORTRAN subroutine NCONF. Here we write a subroutine to calculate F*(g),h(g) = gj — 1 and l(g) = gj for a given g, then input an initial value go. Then wecan get the solution for this problem.The case with u2 = 0The following Theorems apply to the case where = = 0, that is, the numbers ofanimals in these three subregions are uncorrelated. This simplification of the problemallows us to study how the costs and variabilities affect the optimal design. In Theorem5.3, we show that the optimal design has g5 = 0, that is we never sample {x1,x3}. Todiscuss the values of g, g2, g3 and g4, we denote,‘__1__1/2\s + t/ 2 N112s+2tJtiN’12In Theorem 5.4 (a), (b) and (c), we give explicit formulae for the optimal gj’s. Theseformulae depend on whether ci and /3 lie in regions (a), (b) or (c) of Figure 5.3.1. Wecannot calculate formulae for the optimal gj’s for all points in region (d). However, wedo find formulae for the subregion < 1/2 (see Theorem 5.4(d)).51Figure 5.3.1: Regions studied in Theorem 5.4For the case 12 = 0, the problem becomes minimizing1 1F(g) = —+—+P1 P2 +P3)+t(P1+P23)subject toand 9iO for i=1,2,...,5.We call this constrained minimization problem (B). We write the setC={g:Pi>0,P2>0,P30,g0,i=1,2,3,4,5, and g=1}= {g : g3 +g4 >0,92 +g4 +95 >0, g 0, i = 1,2,3,4,5, and = 1}.Recall in Chapter 2, Theorem 2.1 requires that the Linear Independence Assumptionholds at the optimal g. This is a simple case with lj(g) = g, 0 and h(g) = g—1 =0. Thus we know that Vl(g) is a vector with ith component 1 and the others 0’s: Vh(g)is a vector with all components equal to 1. So for any g corresponding to a probabilitymeasure, the gj’s cannot be all 0, so the vectors, Vl(g),i g, = 0, Vh(g), are linearlyCa52independent. Therefore the Linear Independence Assumption always holds for thisproblem. On the other hand, Theorem 2.2 requires the convexity of C and F*. This iseasy to see since C is just a simple case of F, and from Lemma 4.4 F* is also a convexfunction on C. Now we can apply Theorems 2.1 and 2.2 to our problem.The Lagrangian function isSuppose g is an optimal design. From Theorem 2.1, we know that there exist \ and ‘ssuch that, for this g;ui 0 for i = 1,2, 3, 4,5 (5.5)= 0 for i = 1,2, 3, 4,5 (5.6)8L 1gj = 1, (5.7)——+s+t+X—j=0 (5.8)(5.9)(5.10)8F2F2]32 (5.11)app2(5.12)Conversely suppose there exists g E C and A and tj’s satisfying Equations (5.5)—(5.12).Then g is an optimal solution to the problem. This is the result of Theorem 2.2.We now use these equations to calculate the optimal gj’s.Theorem 5.3 If g is a solution of(B), then we must haveg5 0.53Proof: Suppose there is an optimal g with g5 0. Equation (5.6) implies that = 0.Substitution of [L5 = 0 into Equation (5.12) yields= + — 2.s — 2t,and substitution into Equations (5.8)—(5.11) yields1ui =1=111 1[L3 = —————Sn2 r2r3 F1[t4 =By Equation (5.5) we must have——s—t > 02 ——s—t > 0————S > 02 2 ——+t 0,or equivalently5+1;Since it is obvious that P1 > P2, we have 1/P2 1/P2. However, from the above,s+t >t54which leads to a contradiction. IThe optimal designs are given in the following:Theorem 5.4 (a) g =(0,1)t is a solution of(B) if and only ifa 1.(b) If g = (O,O,g3,g)t, g3 0, g4 0, is an optimal design, thena<1andg3 = 1 ag4 = a.Conversely, if a < 1, and 3 1, then g as given is an optimal design.(c) If g = (O,g2,g340)t, g2 0, g3 0, g4 0 is an optimal design, then13<1a+i3> 1andg2 = 1—/3g3 = 1—ag4 = a+/3—1.Conversely, if a < 1, /3 < 1, and a + 3> 1, then g as given is an optimal design.(d) If g = (gi,g2,30,0)t, g1 0, g #0, g3 0 is an optimal design, then< 1/255and1gi =1g2 =g3 = 77.Conversely, if < 1/2, then g as given is an optimal design.tProof of (a): We note that for g = (0,0,0,1,0), P7 P2 = P3 = 1. Suppose that gis an optimal design. Then by Theorem 2.1, there exist ) and jj’s satisfying Equations(5.5)—(5.12). Since g4 = 1, Equation (5.6) implies that [L4 0. Substituting L4 = 0 andP1 = P2 = F3 = 1 into Equation (5.11) yields= 3 — 2s — 3t,and substitution into Equations (5.8)—(5.10) and (5.12) yields= 2—s---2tIt2 = 2—s—2tIt3 = 1—s—tIL5 = 1_t.By Equation (5.5), we must have2 — s — 2t 01 —s—t>01 — t 0,or equivalentlys+t<1.56Now we suppose that s+t 1. By the calculations above, there exist A and [Li’s satisfyingEquations (5.5)—(5.12) with g = (0,0,0, 1, 0). Thus, by Theorem 2.2, g = (0,0,0, 1, 0)tis an optimal design. Therefore g = (0,0, 0, 1 0)t is a solution of (B) if and only ifa>1.Proof of (b): We consider g = (0, 0, g3, g4, 0)t where g3 and g4 satisfy 93 0, and0. Then P1 P2 = 1, and P3 = g4. Suppose that g is an optimal design. Thenby Theorem 2.1 there exist A and t’s satisfying Equations (5.5)—(5.12). Since g 0,Equation (5.6) implies j = 0. Substituting u4 = 0, P1 = P2 = 1 and P3 = = 1 —into Equation (5.11) yieldsA = 2+12 — 2s — 3t. (5.13)(1—g3)Since g3 0, Equation (5.6) implies ,u = 0. Substituting A, P1 = P2 = 1 and u3 = 0into Equation (5.10), and also using Equation (5.13) yields1 1/2g3== 1—a,and thereforeg4 = 1—g3= a,for 0 < a < 1. It is obvious that 93, g4 e (0, 1). Using these results we can simplify Ain Equation (5.13) toA = 2 — s — 2t,and substitution into Equations (5.8), (5.9) and (5.12) yields[Li = 1—tIt2 = 2—s—2t[L5 = 1_i.57By Equation (5.5), we must have2—s—-2t > 01—t 0.This is equivalent to8>1.Now we suppose that c < 1 and > 1. By the argument above, there exist A and nj’ssatisfying Equations (5.5)—(5.12) with g = (0, 0, 1 c, c, . Thus by Theorem 2.2g = (0, 0, 1 — , a, 0)t is an optimal design.Proof of (c): We consider g = (0,g2,g34)t, in which case Pi = P2 = g3 + g4and F3= g2 + g4. Suppose g is a solution of (B) and satisfies g L 0, g3 0, and= 1— g— 9 0. Then by Theorem 2.1 there exist A and tj’s satisfying Equations(5.5)—(5.12). Since g4 0, from Equation (5.6) we have [t4 = 0. Substituting = 0into Equation (5.11) yieldsA ==Since g 0 , Equation (5.6) implies i’2 = 0. Substituting A, and /22 = 0 into Equation(5.9) yields2 2(93+94)s+2t’and sog2 = l—(g3+g4)= i—(_2_h/2+ 2t1= 1-a.58Since g3 0, Equation (5.6) implies t3 = 0. Substituting \, P1 = P2 and ,u3 = 0 intoEquation (5.10) yields2 1(g2+g4)=and thus/__1__1/2g4=( ) —g2\S + tj=From Equation (5.7) we haveg3 = 1—g2—g4= 1—oSubstituting these results into Equations (5.8) and (5.12) we obtain1 1 S=1 s[L5 =LSince we must have g, > 0, for i = 2, 3,4, we requirec<19<1a+i3> 1.Now we suppose that c < 1, < 1 and a+/3 > 1. By the arguments above, there exist \and j’s satisfying Equations (5.5)—(5.12) with g = (0, 1 — /3, 1 — a, /3 + a — 1, 0)t• Thusby Theorem 2.2, g = (0, 1 — /3, 1 — a, /3 + a — 1, 0)t is an optimal design.Proof of (d): We consider g = (gi,g2,0,0)t, in which case P1 = gl+g3, P2 = g3, andP3= g2. Suppose g is an optimal design and satisfies g 0, g 0 and g3 0. Then59by Theorem 2.1 there exist ) and ij’s satisfying Equations (5.5)—(5.12). Since g3 0,Equation (5.6) implies /13 = 0. Substituting p 0 into Equation (5.10) yieldsSince g 0, Equation (5.6) implies = 0. Substituting = 0 and \ into Equation(5.8) and using the fact that P2 = g3, we obtain/1hI2g3=J) =1,which we require to be less than 1. Since g 0, Equation (5.6) implies /12 = 0.Substituting /12 = 0 and \ into Equation (5.9) and using the fact that P1 = 1—P2 = g3, and P3 = g, we obtain1g2 =Thus, g = 1—— g3 = 1/2 — . Since we assume g > 0, we require i < 1/2.Substituting the g’s and ) into Equations (5.11) and (5.12) yields/14 = /15= —4+s+t.By Equation (5.5), we require s + t 4. But this follows since t = (1/)2 > 4. Thusg = (1/2_,,710)tis an optimal design.Now suppose that < 1/2. By the calculations above, there exist .\ aild jj’s satisfying (5.5)—(5.12) with g = (1/2 — i, 1/2, i, 0, 0)t• Thus by Theorem 2.2 we know thatg = (1/2 —i,1/2,i,0,0)tis an optimal design. IThe conclusions that can be drawn from the above theorems are that, if the observations in the three subregions are uncorrelated, we will never sample the transect{x1,x3}. Sampling the transect {x1,x2} yields the same amount of information as sampling {x1,x3}, but at a smaller cost (c + 2y compared to 2c + 2y). We should look60at all subregions when both .s and t are small. We will never sample the whole regionwhenever (s, t) is in the set K, which is defined as followsK = {(s, t) : 1/(s + t)2 + (2/(s + 2t))’72 < 1}.The following figures show the five components of an optimal design as a functionof s and t. For values of s and t between 0 and 8, we calculate the optimal values of gjusing FORTRAN subroutine NCONF. We omit the plot of g5, since we have shown inTheorem 5.3 that g 0.Recall that s is the startup cost/the variance of Z(x) and t is the “per unit of travel”cost/ the variance of (x). Figure 5.3.2 shows the optimal gi, the probability of choosing{x1}. It appears that the probability of sampling {x1} is always 0 when t <4, that iswhen r > 1/2. Fort > 4 from Theorem 5.4 (d) we know that 91 = 1/2—17 = 1/2—l/t’/.Figure 5.3.2 illustrates this fact. From Theorem 5.4 (a)—(c), we know that g’ = 0 whens and t are small. This, too can be seen in the figure.Figure 5.3.3 shows the the optimal g2, the probability of choosing {x3}. It seemsthat the probability of choosing {x3} is always 0 when values of s and t are both small.From Theorem 5.4 (a) and (b) we know that g = 0 when a 1 or /3 1, that is, whens + 2t 2. Also from Figure 5.3.3 and Theorem 5.4, we see that g is non-decreasing insalldt,andg1/2fort>4.Figure 5.3.4 shows the optimal 93, the probability of choosing {x1,x2}. It seemsthat the probability of choosing {x1,2} is always 0 when values of s and t are bothsmall. Furthermore, from Theorem 5.4 (b) and (c), we see that g3 = 1 — a, an increasingfunction of s and t, for a + /3 > 1. This corresponds to moderate values of s and t. ByTheorem 5.4 (d), g3 = t1’2 for t > 4. This is clearly seen in Figure 5.3.4. Theorem 5.4(a) states that g3 = 0 if a 1, that is, if .s + t 1.Figure 5.3.5 shows the optimal g4, the probability of choosing {x1,23}, that is,the whole area. It appears that probability of choosing the whole area is 1 when values61of s and t are both small. From Theorem 5.4 (a) we know g4 = 1 when 1, that iswhen s + t < 1. Then when s and t increase, the probability of choosing the whole areagoes to 0. This can been seen in the figure. Both the theorem and the figure indicatethat we should visit the whole region when the costs are low. If the costs are high, weshould not visit the whole area. In particular, by Theorem 5.4 (d), g4 = 0 if t 4.Figure 5.3.2: The optimal values of g with a2 00CC%4C00062CD CD 0 CD C,) 0 q *CD C.;’CD 0 CD 0 t.) I:0O.20.40.6°.81q00.20,40100Figure 5.3.5: The optimal values of g4 with °2 =OD•0(0•00c9The case with u 0In this case, the observations in regions x1 and x2 are correlated. (The correlationbetween (x1) and 2x) is equal to°2) Recall when a12 = 0, then we will neversample {x1,x3} because it costs more but we get the same amount of information assampling {x1,2}. But for 0, the situation is quite different. Sampling {x1,3},although costly, yields information about x2 when the observations in x1 and x2 arecorrelated. So the transect T(as) = {x1,x3} will be chosen sometimes because samplingtransect T(a5) will yield more information than sampling T(a3) = {x1,x2}.Although we do not have explicit expressions for the optimal designs for this case, wecan always calculate the results using numerical methods. The following figures showthe five components gi through g5, of an optimal design as a function of s and t fordifferent o2’s. These values were calculated using NCONF.Figures 5.3.6—5.3.10 display the five components of an optimal design as a function0064of s and t when u’2 = 0.2. Here g5 is included since it is no longer 0 for this case.Figures 5.3.11—5.3.15 are similar to Figures 5.3.6—5.3.10 except 12 = 0.6.Comparing Figures 5.3.6 and 5.3.11 with Figure 5.3.2, we can see the probabilityof choosing {x1} increases with u2. That is because, for fixed cost, when the correlation between the observation in regions x and x2 increases, sampling x1 yields moreinformation about x2. It is therefore reasonable to have a larger probability of sampling{Xi}.Comparing Figures 5.3.7 and 5.3.12 with Figure 5.3.3, we can see when‘2 increasesfrom 0 to 0.2 to 0.6, the probability of choosing {x3} decreases. This might be because,for fixed costs, when the correlation of the observations in regions x1 and x2 increases,sampling x1 yields more information than sampling x3.Comparing Figure 5.3.10 to Figure 5.3.15, and also noting the fact g5 0 when= 0, we can conclude that the probability of choosing {x1,x3} increases asincreases, at least for small .s and some t’s. Recall small values of s corresponding tolow startup costs, in which case, the cost of sampling {x1,x3} is comparable to that ofsampling {x, x2}. Similar to the previous discussion, because of the correlation of theobservations in regions x and x2, sampling the transect {x1,x3} yields more informationthan sampling {x1,x2}. Thus it is worthwhile to sample {x1,x3} even though it is morecostly than sampling {x1,2}.From these figures, we cannot find any obvious relationship for g3 and g4 when u’2changes from 0 to 0.2 to 0.6.The way that .s and t affect the optimal gj is quite similar to the case when o2 = 0except for g5. In the uncorrelated case (°2 = 0), the optimal g5 is equal to 0, and thusdoes not change with s and t. But for the dependent case (a2 0), the optimal g5depends on the values s and t.65CD CR CD C — CD (1) C I:CD CR CD C CD C q—*00.20A0.60.81,,q ‘1 C.T 0 ct (l 0 q *aq Cl’ (D 0 Ct, 0, 0.300.°.8100. CD CR CD 0 CD IJ 0 q*Ct’ CR CD 0 CD Cl) 0 C” q ,-.*cP— CD CD 0 — CD C’) 0 (3aq CD CD 0 0 I:, CD c;1 3 cy’CD 0 CD C’) 0 C,’—*c,q CD 3 CD 0 e÷.CD 0—‘*[1] Bertsekas, D.P. Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.[2] Graham, A. Nonnegative Matrices and Application Topics in Linear Algebra. EllisHorwood Ltd. 1987.[3] Heckman, N. and Rice, J. “Linear Transects of Two-dimensional Random Fields:Estimation and Design”, Technical Report, Department of Statistics, UBC, No.132, 1993.[4] Kuhn, H.W. and Tucker, A.W. “Nonlinear Programming”, Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, J. Neyman(ed.), 481—493, University of California Press, Berkeley, 1951.[5] McCormick, G.P. “Optimality Criteria in Nonlinear Programming”, SIAM—AMSProceedings, Vol. 9, 27—38, 1982.[6] Rao, C.R. Linear statistical inference and its applications, Wiley, New York, 1973.[7] Seber, G.A.F. The estimation of animal abundance and related parameters,C.Griffin, London, 1982.[8] Seber, G.A.F. “A Review of Estimating Animal Abundance”, Biometrics, Vol. 42,267—292, 1986.71


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items