"Applied Science, Faculty of"@en . "Electrical and Computer Engineering, Department of"@en . "DSpace"@en . "UBCV"@en . "Borowski, Nick"@en . "2011-05-03T18:12:53Z"@en . "1971"@en . "Master of Applied Science - MASc"@en . "University of British Columbia"@en . "Three recent random search algorithms are compared on the basis of efficiency, and on the basis of sensitivity to noise, scaling and the number of variables. A general discussion of random search methods points out their advantages and disadvantages in relation to deterministic methods. A new random vector generator is described in the appendix."@en . "https://circle.library.ubc.ca/rest/handle/2429/34218?expand=metadata"@en . "A COMPARISON OF THREE RANDOM SEARCH METHODS by NICK BOROWSKI B.A.Sc., U n i v e r s i t y of B r i t i s h Columbia, 1961 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE i n the Department of E l e c t r i c a l Engineering We accept t h i s thesis as conforming to the required standard Research Supervisor \u00C2\u00BB Members of the Committee Head of the Department Members of the Department of E l e c t r i c a l Engineering THE UNIVERSITY OF BRITISH COLUMBIA January, 1971 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f the r e q u i r e m e n t s f o r an advanced deg ree a t t he U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r a g r ee t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t he Head o f my Depar tment o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Depar tment o f The U n i v e r s i t y o f B r i t i s h C o l u m b i a V a n c o u v e r 8, Canada ABSTRACT Three recent random search algorithms are compared on the basis of e f f i c i e n c y , and on the basis of s e n s i t i v i t y to noise, s c a l i n g and the number of v a r i a b l e s . A general d i s c u s s i o n of random search methods points out t h e i r advantages and disadvantages i n r e l a t i o n to d e t e r m i n i s t i c methods. A new random vector generator i s described i n the appendix. TABLE OF CONTENTS Page ABSTRACT II TABLE OF CONTENTS I l l LIST OF ILLUSTRATIONS IV 1. INTRODUCTION 1 Mathematical Programming Problems, 1 Random Search Methods> 1 Scope and Aim of Thesis> 3 2. DESCRIPTION OF RANDOM SEARCH ALGORITHMS 6 Adaptive Random Search, 6 . Adaptive Step Size Random Search, 10 Kj e l l s trout's Random Search, 15 3. RESULTS AND COMPARISONS 22 Average E f f i c i e n c y , 22 E f f e c t of Scaling, 29 Ef f e c t of High Dimensions, 29 E f f e c t of Noise, 32 Summary of Numerical Experiments, 34 4. GENERAL DISCUSSION OF RANDOM SEARCH METHODS 39 Other Random Search Algorithms*39 E f f i c i e n c y of Random Search Methods, 40 Advantages of Random Search Methods, 43 Suggestions for Further Research, 44 5. CONCLUSION 47 APPENDIX 48 REFERENCES \u00E2\u0080\u00A2 56 LIST OF ILLUSTRATIONS Page 1. Flowchart of Adaptive Random Search 8 2. Flowchart of Adaptive Step Size Random Search ... 11 3. 
Performance Check on Adaptive Step Size Random Search Hypersphere of Dimension n 14 4. Flowchart of Kjellstrom 's Random Search 17 5. Performance Check on Kjellstrom's Random Search - Test Functions II and I I I 19 6. Performance Check on Kjellstrom's Random Search -The Moon Problem 20 7. Sample Data - Adaptive Random Search and Test Function I I I 23 8. Comparison of E f f i c i e n c y - Test Function I 25 9. Comparison of E f f i c i e n c y - Test Function II 27 10. Comparison of E f f i c i e n c y - Test Function I I I 28 11. Comparison of E f f i c i e n c y - Test Function IV 30 12. E f f e c t of Scaling - Test Function V 31 13. E f f e c t of High Dimensions - Test Function VI 33 14. E f f e c t of Noise - Test Function VII 35 15. E f f e c t of Noise - Test Function VIII 36 16. D e f i n i t i o n of Variables f o r the Derivation of the P r o b a b i l i t y Density Function 50 17. Generation of Random Numbers by a Refined Acceptance -Rejection Method 50 18. Sample Histogram from the Random Vector Generator 54 ACKNOWLEDGEMENT I wish to aknowledge the encouragement and assistance of my supervisor, Dr. G. F. Schrack. In p a r t i c u l a r , he pointed out to me a number of relevant t e c h n i c a l papers which I might otherwise have missed. Further, I wish to acknowledge the f i n a n c i a l support of the National Research Council of Canada and the Un i v e r s i t y of B r i t i s h Columbia. 1 1. INTRODUCTION Mathematical Programming Problems This thesis deals with a certain class of techniques for the num-e r i c a l solution of a mathematical programming problem, which may be stated as follows: Minimize Q(x) with respect to x, subject to the constraints h (x) = 0 i = 1,2,. . . . , m, 0 i = 1,2,...,p where x denotes a vector i n n-dimensional Euclidean space with components x^, X ^ J - . - J X ^ . In the context of optimization problems Q(x) i s called the quality function, and i s a measure of performance of a system with parameters x.. I t i s i n general nonlinear, as are the functions h_^(x) and g^(x) . This problem occurs i n diverse areas of applied science, such as optimum network design [22,41], adaptive control devices [19,26] and the optimization of chemical processes. The problem of finding the optimal control function may also be formulated i n this manner. For example, i n one such formulation the parameter vector x represents a point i n i n i t i a l co-state space [21], and i n another i t represents the co e f f i c i e n t s of a polynomial approximation to the control function. Random Search Methods Searches for the solution of the mathematical programming problem may be dichotomized i n several ways. Thus a pa r t i c u l a r search method may be c l a s s i f i e d as being either direct or i n d i r e c t , simultaneous or sequential, deterministic or random. A direct search does not require the evaluation of derivatives and sometimes not even thei r existence. Sequential methods use 2 information gained i n previous steps of the search to guide future steps, whereas simultaneous methods make no use of such information. The d e f i n i t i o n of \"random search\" used i n this thesis i s that of Schumer [38]: \"The search technique i s deterministic i f , when the search starts under i d e n t i c a l conditions, i t always follows the i d e n t i c a l route to the i d e n t i c a l outcome. If on the other hand, i t does not always follow i d e n t i c a l paths, the method i s said to be a randomized search method\". 
I t should be noted that two assumptions are made. F i r s t , i f the random numbers required for the search are obtained from a pseudo-random number generator i t must be r e - i n i t i a l i z e d d i f f e r e n t l y for each run. Second, random variables do not enter into the problem i t s e l f , only into the method of solution. Methods of dealing with the case i n which random variables form part of the problem to be solved come under the heading of stochastic programming and are not considered here. Most random methods incorporate various degrees of determinism i n that they use previous information to guide the next t r i a l step. I f no such determinism enters into the search i t i s called pure random search, which means the same as simultaneous random search. On the other hand, i f the results of previous steps completely determine the next t r i a l step the method i s obviously deterministic. Between these extremes l i e s the spectrum of sequential random search methods. Pure random searches are useful for the location of global extrema whereas sequential random searches are gen-e r a l l y more e f f i c i e n t for l o c a l extrema. The large majority of search methods found i n the l i t e r a t u r e at the present time i s deterministic, but a number of random search methods have also appeared i n recent years, beginning with Brook's paper i n 1958 [5]-Since then at least a dozen algorithms have been published and a number of 3 theses have been written on the subject [2,8,9,11,12,15,19,21,22,24,25,26, 31,35,38,41]. However, with some of these algorithms, l i t t l e attempt has been made to ascertain th e i r advantages and disadvantages. Sometimes only convergence to) the solution i s proved, and i t i s then stated that the method i s efficient,without benefit of supporting data. Scope and Aim of Thesis Three sequential random search algorithms w i l l be compared against each other and, i n less d e t a i l , against deterministic methods. They were designed for d i g i t a l computer implementation. The choice of these p a r t i c u l a r methods was not completely arb i t r a r y . They were chosen because they are r e l a t i v e l y recent and exemplify the variety of approaches that may be used. The comparisons w i l l be more ind i c a t i v e than d e f i n i t i v e i n nature because the result of any comparison of search methods depends to a large extent on the p a r t i c u l a r test function that i s used, and on the values assigned to the various parameters that occur i n most algorithms. In fact there i s no generally v a l i d basis of comparison for search algorithms because their intended application i s also an important factor i n deciding on the super-i o r i t y of one algorithm over another. However, even a very rough quantitative comparison i s preferable to no such comparison at a l l . The convergence of an algorithm may be studied i n terms of location convergence or value convergence. The former refers to the convergence of x to x*, the optimum parameter vector, while the l a t t e r refers to the con-vergence of Q(x) to Q(x*), the optimum value of the quality function. I t i s always assumed here that x converges to x* as Q(x) converges to Q(x*)> This assumption evidently holds for unimodal functions i n the absence of noise. Thus, the convergence rate of any one method r e l a t i v e to a given test function i s measured only i n terms of the rate at which Q(x) converges to Q(x*). 
This rate i s determined by the number of function evaluations required to reduce Q(x) by a given r a t i o . Other properties of the algorithms w i l l also be considered, such as the e f f e c t s of poor s c a l i n g and high dimensions. R e l a t i v e l y simple test functions have been composed for the purpose of i s o l a t i n g these e f f e c t s . Though not s p e c i f i c a l l y designed for the case where noise i s present, the algorithms w i l l also be compared on the basis of such problems. None of these algorithms i s s p e c i f i c a l l y equipped to deal with e q u a l i t y c o n s t r a i n t s . If such were present they would f i r s t have to be con-verted i n t o some other equivalent form. Thus, the above eq u a l i t y constraints may be approximated by the i n e q u a l i t y constraints |h.(x)| < \u00C2\u00A3. where are s u i t a b l y small p o s i t i v e numbers. A l t e r n a t i v e l y , the q u a l i t y function Q(x) may be replaced by the penalty function P(x). a P(x) 2 Q(x) + J h . 2 (x) Both a l t e r n a t i v e s have t h e i r advantages and disadvantages. For example, penalty functions introduce steep curving v a l l e y s into the surface whose minimum i s to be found, but they do not require an i n i t i a l f e a s i b l e point to s t a r t the search. If i n e q u a l i t y constraints are present, however, an i n i t i a l f e a s i b l e point must be given. In the following comparisons no equality constraints are present, or they may be assumed incorporated into the problem as penalty functions. For the i n e q u a l i t y constrained test function an i n i t i a l f e a s i b l e point i s given. A c t u a l l y , two of these algorithms are not s p e c i f i c a l l y equipped to deal with i n e q u a l i t y constraints e i t h e r . A simple common expedient i s used i n such cases. Wherever the i n e q u a l i t y constraints are not s a t i s f i e d Q(x) i s assigned a s u f f i c i e n t l y large value so that a step outside the f e a s i b l e region w i l l always r e s u l t i n a f a i l u r e of that step to produce an improvement. T h i s e f f e c t i v e l y confines the search to the f e a s i b l e region. The comparison of the algorithms i s the primary aim of the t h e s i s . A secondary aim i s to answer the question: Why use random search methods when there i s already a large number of d e t e r m i n i s t i c methods a v a i l a b l e f o r these problems? From the r e s u l t s of these comparisons as w e l l as from a study of the e x i s t i n g l i t e r a t u r e on random search methods some answer to t h i s question w i l l be a r r i v e d at. 6 2. DESCRIPTION OF RANDOM SEARCH ALGORITHMS A l l three algorithms have been reconstructed from t h e i r d e scriptions and s i m p l i f i e d flow charts as published i n the papers by Matyas [26], Schumer and S t e i g l i t z [39], and K j e l l s t r o m [22]. Where parameters have been found to be merely delimited or l e f t altogether unspecified reasonable estimates of t h e i r best values have been used. The e f f i c i e n c y of each reconstructed algorithm i s then checked against that of the o r i g i n a l v e r s i o n by using the same tes t functions as were used by the author of that p a r t i c u l a r algorithm. M u l t i p l e stopping rules are used throughout. In order to guard against unexpectedly large computing times each algorithm i s designed to stop the search a f t e r a predetermined number of function evaluations. 
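Before describing the individual algorithms it is convenient to fix these conventions in a small sketch. The wrapper below assigns a prohibitively large value to infeasible points, as described above for the inequality constraints (taken here in the form g_i(x) <= 0), and counts function evaluations so that the search can be stopped after a preassigned budget. The class name, the particular large constant and the default budget are illustrative choices, not part of any of the three algorithms.

```python
class QualityFunction:
    """Wraps a quality function Q with the conventions described above."""

    BIG = 1.0e10   # stands for "infinity": returned for infeasible points

    def __init__(self, Q, inequality_constraints=(), max_evals=2000):
        self.Q = Q
        self.g = tuple(inequality_constraints)   # constraints g_i(x) <= 0
        self.max_evals = max_evals
        self.evals = 0

    def __call__(self, x):
        self.evals += 1
        if any(g_i(x) > 0.0 for g_i in self.g):
            return self.BIG      # a step outside the feasible region always fails
        return self.Q(x)

    def budget_exhausted(self):
        return self.evals >= self.max_evals
```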
The second stopping rule that is common to all algorithms depends on the value of the quality function Q(x). In addition to these, each algorithm contains one or more stopping rules peculiar to that particular method. The stopping rules remain unchanged from problem to problem.

Adaptive Random Search (ARS)

The earliest of the three methods considered, this is an arbitrary common-sense method. Nevertheless, it is capable of adapting both step size and step direction according to the success or failure of previous steps. Its convergence has been proven [26,27], but the rate of this convergence is open to question. The algorithm has previously been tested or discussed by several writers [9,28,29,35]. A stepwise description follows.

Starting from u, a feasible point in parameter space, a random step δ is generated according to

   δ^k = b^k + T^k ξ^k

where k denotes the k-th iteration, b^k is a bias vector and ξ^k is a normal random vector with covariance matrix equal to the identity matrix. The transformation matrix T^k is an (n x n) matrix, in this case restricted to be diagonal with all elements equal to σ, the standard deviation:

   T = diag(σ)

The new trial point is then

   x^k = u^k + δ^k

The quality function Q(x) is evaluated at this point, and a comparison with Q*, the best value of Q(x) so far, determines success or failure of this step. When the inequality

   Q(x^k) < Q* - ε

is satisfied, and no constraints are violated, the step is a success and the variables are updated as follows:

   b^(k+1) = c_s b^k + d_s δ^k
   T^(k+1) = a_s T^k
   u^(k+1) = u^k + δ^k

If the inequality is not satisfied, or if a constraint is violated, the step is a failure and the variables are updated as follows:

   b^(k+1) = c_f b^k + d_f δ^k
   T^(k+1) = a_f T^k
   u^(k+1) = u^k

The threshold ε is a suitably small positive number. The constants must satisfy the inequalities

   0 ≤ c_s ≤ 1,  d_s > 0,  c_s + d_s > 1,  0 ≤ c_f ≤ 1.

(Figure 1: Flowchart of Adaptive Random Search. The flowchart initializes Q* ← Q(u) and the evaluation counter K ← 0, stops with output Q*, u when K exceeds the maximum computing effort M, and sets the threshold to ε = 0.0001|Q*|.)

The following values have been assigned in this implementation of the algorithm:

   c_s = c_f = 0.75
   d_s = 0.5,  d_f = -0.25
   a_s = 1.1,  a_f = 0.9

As it was decided to make the standard deviation σ a variable, it was found necessary to set a lower limit on the values that σ can take, in order to prevent the steps from becoming so small that further improvements become impossible. An upper limit for δ, as suggested in the original paper, has not been incorporated because the algorithm, as implemented, can quickly recover from an overly large step size; the introduction of additional arbitrary parameters is therefore unnecessary.

The constants c_s and c_f have the effect of reducing the influence of previous steps as they recede farther into the past history of the search. They are therefore called rejection coefficients. The constants d_s and d_f, on the other hand, modify the bias b for the next step according to the success or failure of the present step.
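To make the description above concrete, the following sketch implements these update rules. It is a minimal illustration, not a reproduction of Matyas' program: the treatment of constraints is left to a wrapped quality function (such as the wrapper sketched above), the stopping rules are reduced to a simple evaluation budget, and the parameter defaults are the values listed above.

```python
import numpy as np

def adaptive_random_search(Q, u0, sigma0=1.0, sigma_min=1e-6, max_evals=2000,
                           c_s=0.75, d_s=0.5, a_s=1.1,
                           c_f=0.75, d_f=-0.25, a_f=0.9, rng=None):
    """Minimal sketch of the ARS update rules described above.

    Q must already handle constraint violations (e.g. by returning a very
    large value); the only stopping rule kept here is the evaluation budget.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = np.asarray(u0, dtype=float)
    b = np.zeros_like(u)                      # bias vector b
    sigma = sigma0                            # T = diag(sigma)
    Q_best = Q(u)
    for _ in range(max_evals):
        eps = 1e-4 * abs(Q_best)              # success threshold, as in the flowchart
        delta = b + sigma * rng.standard_normal(u.size)   # delta = b + T*xi
        Q_trial = Q(u + delta)
        if Q_trial < Q_best - eps:            # success: move, reinforce bias, enlarge steps
            u = u + delta
            Q_best = Q_trial
            b = c_s * b + d_s * delta
            sigma = a_s * sigma
        else:                                 # failure: stay, reverse bias, shrink steps
            b = c_f * b + d_f * delta
            sigma = max(a_f * sigma, sigma_min)   # lower limit on sigma, as noted above
    return u, Q_best
```

In this sketch the coefficients c_s and c_f damp the accumulated bias, while d_s and d_f feed the most recent step back into it.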
Since they incorporate new experience i n t o the search e f f o r t they are c a l l e d detection c o e f f i c i e n t s . The i t e r a t i v e adaptation of step length and d i r e c t i o n makes t h i s i n fact a learning algorithm [29]. Further d e t a i l s are given i n the flowchart ( f i g . 1). In a l l flow-charts the v a r i a b l e K i s used as a counter for the number of function evaluations, and M i s the maximum value of K before the search i s stopped. L i t t l e data i s a v a i l a b l e for checking the performance of t h i s implementation against the o r i g i n a l . In fac t only a quadratic i n two va r i a b l e s i s given [26]: 2 2 Q(x) = 0.26x 1 + 0.26x 2 - 0.48x^2 St a r t i n g from the point (15,30) an average of 49 function evaluations were required by th i s author's version of the algorithm to f i n d a point where Q(x) was les s than 0.2. About 25 of these were s u c c e s s f u l . The r e s u l t s compare favourably with corresponding figures given by Matyas, who required 93 function evaluations, 55 of which were s u c c e s s s f u l . For t h i s test function, then, the present ve r s i o n i s somewhat more e f f i c i e n t than the o r i g i n a l . However, almost any method performs reasonably w e l l on such a simple problem and therefore no conclusion i s warranted on how c l o s e l y the performance of the present version resembles that of the o r i g i n a l i n more d i f f i c u l t problems. Adaptive Step Size Random Search (ASSRS) This algorithm, proposed by Schumer and S t e i g l i t z [39], i s based on a more d e t a i l e d mathematical analysis than ei t h e r of the other two. Even so, the p r a c t i c a l version contains several unspecified parameters to which values had to be assigned somewhat a r b i t r a r i l y . The algorithm was designed for unimodal, w e l l scaled, unconstrained functions with many vari a b l e s . Given an i n t t i a l point u i n n-dimensional parameter space and an i n i t i a l step s i z e s, a t r i a l point x^\"^ i s computed according to x ( 1 ) = u + sr Id e a l l y the random vector r i s uniformly d i s t r i b u t e d on the surface of the unit h-sphere, but i n pr a c t i c e an approximation i s used. If the in e q u a l i t y Q(x ( 1 ))M or sn set j-f-1 Yes analysis of random walk COMPUTE y = cov(x) SELECT TRIPLETS j , k, 1 COMPUTE y \" 1 , y.\" 1, y., A. 2 2 2 SORT AND RELABEL POINTS Q(x 1)$Q(x 2)$Q(x 3)^... u+-x , L-<-0 q+-MEDIAN VALUE OF Q(x) No K>M Yes WRITE q,u Figure 4 Flowchart of Kjellstrom's Random Search 18 by s e t t i n g to zero a l l elements o f u \" whose double subscripts are not combinations of the numbers j , k and 1. The above inequality i s then scaled by the factor 5/(n+2) r e s u l t i n g i n the new inequality t -1 ,. x 1J \u00E2\u0080\u00A2 x<;5 which describes the three-dimensional e l l i p s o i d r e s u l t i n g from t h i s cut. This r e l a t i o n also defines u.., the covariance matrix of the cut [ l O ] . Next y i s factored into two triangular matrices A . and 3 3 3 V. = A.Afc 2 3 3 th and A^ i s stored to guide the random walk i n the j \u00E2\u0080\u0094 subspace during the next i t e r a t i o n . This means that n such cuts and matrices w i l l be computed. The stored values of Q(x) are then sorted i n ascending order. That value of x which corresponds to the lowest value of Q(x) becomes the new s t a r t i n g point u for the next random walk. 
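The essential mechanism of this stage, letting the second-moment statistics of previously accepted points shape the Gaussian steps of the next walk, can be sketched as follows. The sketch uses the Cholesky factor of the full sample covariance instead of the separate three-dimensional cuts described above, and the function and variable names are illustrative only; it shows the idea, not Kjellstrom's algorithm.

```python
import numpy as np

def shaped_walk(Q, u, accepted, q, n_steps=100, rng=None):
    """Illustrative random-walk stage: Gaussian steps shaped by the Cholesky
    factor of the sample covariance of previously accepted points.

    u        : best point found so far (start of the walk)
    accepted : (m, n) array of points accepted during earlier walks
    q        : current acceptance level; only points with Q(x) < q are kept
    """
    rng = np.random.default_rng() if rng is None else rng
    cov = np.cov(np.asarray(accepted, dtype=float), rowvar=False)
    A = np.linalg.cholesky(cov + 1e-12 * np.eye(len(u)))   # mu = A A^T, as in the factorization above
    kept = []
    x = np.asarray(u, dtype=float)
    for _ in range(n_steps):
        trial = x + A @ rng.standard_normal(len(u))        # step shaped by the factor A
        if Q(trial) < q:                                   # stay inside the current allowable region
            kept.append(trial)
            x = trial
    return np.array(kept)
```

In the full method a separate factor A_j is computed for each three-dimensional cut.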
The new allowable region i s the set of a l l points x s a t i s f y i n g the inequality constraints and the ine q u a l i t y Q(x) n 1 6 = 10 (stands f o r \u00C2\u00B0\u00C2\u00B0) i f g(x)\u00C2\u00A3n n 2 g(x) = E x 1 . x\u00C2\u00B0 = (-1.2,-1.2,...,-1.2) x* = (1,1,.,-.,1) o o 500 1000 1500 2000 Function Evaluations Figure 9 Comparison of E f f i c i e n c y ^.Test Function II .'l\> This function [22] has already been re f e r r e d to i n the d e s c r i p t i o n of K j e l l s trom's algorithm. The value of n i s here set to 6. The search o * proceeds halfway around a hyperspherical forbidden region from x to x , always keeping close to the boundary of the f e a s i b l e region; hence the name. Results (Fig. 11) show that KRS i s again the most e f f i c i e n t method. ARS performs poorly, and ASSRS breaks down completely, as expected, because i t requires the v a l l e y i n the performance surface to be smooth. E f f e c t of Scaling A simple test f u nction has been composed to demonstrate the extent to which the various algorithms depend on the constant-Q surfaces being hyperspherical. Test Function V: The H y p e r e l l i p s o i d Q(x) = O . l x 2 + E x 2 1 2 x x\u00C2\u00B0 = (1,1,...,1) x* = (0,0,...,0) This function d i f f e r s from Test Function I only i n the c o e f f i c i e n t 2 f o r x^. The 0.1 c o e f f i c i e n t has the e f f e c t of elongating the hypersphere i n the x^ d i r e c t i o n so that the r a t i o between major and minor axes i s l;v / To(Fig. 12). For ease of comparison the corresponding curves for the hypershpere have been reproduced on the same graph with dotted l i n e s . It i s seen that ASSRS i s by f a r the most s e n s i t i v e to s c a l i n g , the other two being almost unaffected. E f f e c t of High Dimensions The r e s u l t s of Schumer and S t e i g l i t z suggest that random search becomes r e l a t i v e l y more e f f i c i e n t as the number of v a r i a b l e s increases. 4 4 32 To see how the algorithms under i n v e s t i g a t i o n compare i n t h i s respect tests were c a r r i e d out on the hypersphere with n = 10. Test Function VI: The Hypershpere i n Ten Dimensions 10 Q(x) = I x. 1 1 x\u00C2\u00B0 = (1,1,...,1) x* = (0,0,...,0) From the r e s u l t s ( F i g . 13) i t i s apparent that the e f f i c i e n c y of KRS decreases most r a p i d l y with an increase i n dimensions. As n doubles the number of function evaluations required f or a predetermined accuracy also doubles approximately f o r ASSRS, and for ASS i n th e ' l i n e a r region, but more than quadruples for KRS. E f f e c t of Noise If the q u a l i t y function must be evaluated by experiment, for example, a c e r t a i n amount of noise i s usually unavoidable due to random errors of measurement and the influence of the environment. I t i s therefore i n t e r e s t i n g to check how these random search algorithms are af f e c t e d by small random errors i n 0(x). Such problems are properly the subject of s t o c h a s t i c programming methods [42]. Test Function VII: The Hypersphere with Gaussian Noise 5 Q(x) =(Z x ) (1+0.01 \u00C2\u00A3 ) 1 x\u00C2\u00B0 = (1,1,...,1) x* =' (0,0,'. . . ,0) This i s again Test Function I, except that a noise term has been added. Test Function VI (n-10) Test Function I (n=5) Function Evaluations Figure 13 E f f e c t of High Dimensions - Test Function VI 34 As before, \u00C2\u00A3^ denotes a.normal random v a r i a b l e with mean 0 and standard deviation 1. Results ( F i g . 
14) show that t h i s 1% noise l e v e l i s of l i t t l e consequence for a l l three algorithms. Location convergence,not shown here, was not noticeably affected e i t h e r . Yet for any method r e q u i r i n g gradients to be determined by taking f i n i t e d i f f e r e n c e s , t h i s amount of noise would very l i k e l y reduce the e f f i c i e n c y d r a s t i c a l l y [16, 18]. Test Function VIII: The Hypersphere with Uniform Noise ' ' ' 5 2 Q(x) = E x. + 0.05y. 1 1 x\u00C2\u00B0 = (1,1,...,1) x* = (0,0,...,0) The random v a r i a b l e y i s uniformly d i s t r i b u t e d on the i n t e r v a l (-1,1), and the maximum noise amplitude i s constant at 1% of the s t a r t i n g value of Q ( x ) . In t h i s test function then, and only i n t h i s one, Q(x) can become negative. The stopping c r i t e r i o n was there-fore set to | Q ( X ) | < Q J . instead of Q ( x ) < Q \u00C2\u00A3 , as i n a l l other problems. This change has the e f f e c t of postponing the moment of breakdown of the algorithm, thus improving l o c a t i o n convergence s l i g h t l y . Results i n d i c a t e ( F i g . 15) that the algorithms break down when the noise i s of about the same magnitude as the value of the q u a l i t y function without the noise. KRS and ASSRS stopped because the step s i z e had become too small, whereas ARS continued on i n e f f e c t u a l l y u n t i l the preassigned computing e f f o r t had been exhausted. Nevertheless, up to the point of breakdown, the e f f i c i e n c y was not appreciably a f f e c t e d . Location convergence, as expected, was extremely poor. Summary of Numerical Experiments On the basis of these comparisons i t i s concluded that KR.S Test Function VII Test Function I 500 1000 1500 2000 Function Evaluations Figure-14 E f f e c t of Noise - Test Function VII Ln 4 2 Test Function VIII Test Function I Method Breaks Down 2000 Figure 15 E f f e c t of Noise - Test Function VIII i s the best general random search method when there are fewer than about 10 parameters to be optimized. This l i m i t a t i o n to low dimensions would become even more apparent i f computation times were taken into account. This i s due to the large number of matrix m u l t i p l i c a t i o n s and inversions that are involved. The next best method appears to be ARS. I t performs poorly, however, when there are sharp, discontinuous v a l l e y s i n the performance surface, and i t i s very s e n s i t i v e to parameter changes. The l a t t e r c h a r a c t e r i s t i c might be an advantage i f the e f f e c t s of the parameter changes were known i n advance for then the algorithm could be adjusted to the problem at hand. ASSRS i s l e a s t desirable as a general purpose minimization algorithm because i t i s severely r e s t r i c t e d i n scope, being able to handle e f f i c i e n t l y only r e l a t i v e l y w e l l scaled problems, such as Test Functions I, V, VI and VII. In t h i s narrow f i e l d , however, i t i s the most e f f i c i e n t method. A l l three random search methods have shown themselves to be immune to r e l a t i v e l y small amounts of noise, which corroborates the findings of Brooks, Weinberger, Pensa, and others. In this respect they have a d e f i n i t e advantage over most de t e r m i n i s t i c methods, e s p e c i a l l y those r e q u i r i n g computation of d e r i v a t i v e s by taking f i n i t e d i f f e r e n c e s [16, 18, 41.]. 
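A rough calculation makes this plausible. A forward difference (Q(x + h e_i) - Q(x))/h amplifies any error in the two function values by the factor 1/h, so with the 1% multiplicative noise of Test Function VII the error in each estimated gradient component is of order 0.01 Q(x)/h, which near the optimum easily exceeds the component itself. The short sketch below illustrates this; the step size h and the evaluation point are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def Q_noisy(x):
    """Test Function VII: the hypersphere with 1% multiplicative Gaussian noise."""
    return np.sum(x**2) * (1.0 + 0.01 * rng.standard_normal())

def forward_difference_gradient(f, x, h=1e-3):
    f0 = f(x)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f0) / h
    return grad

x = np.full(5, 0.1)                                # a point fairly close to the optimum
print(2.0 * x)                                     # noise-free gradient: every component is 0.2
print(forward_difference_gradient(Q_noisy, x))     # noise of order 0.01*Q/h ~ 0.5 swamps the signal
```

The three random search algorithms, by contrast, only compare function values, and are disturbed only when the noise becomes comparable to the remaining decrease in Q(x).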
In those problems not contaminated with noise, however, de t e r m i n i s t i c methods were found to be more e f f i c i e n t than random methods. Results of Test Functions I, V and VII show that KRS, l i k e ASSRS, has l i n e a r convergence for quadratic q u a l i t y functions, with or without small amounts of noise. In other words, the r e l a t i o n between log Q(x) and K, the number of function evaluations, i s very nearly a s t r a i g h t l i n e . Even Matyas' ARS e x h i b i t s t h i s property f o r as long as the standard deviation -O i s far from i t s lower l i m i t a . I t suggests 38 the l i n e a r i t y may be c h a r a c t e r i s t i c of a large c l a s s of random search methods when applied to quadratics. 39 4- GENERAL DISCUSSION OF RANDOM SEARCH METHODS Other Random Search Algorithms As stated above, the three algorithms compared here are only a sampling of the available random search methods. A few additional methods are l i s t e d below, i n chronological order, further i l l u s t r a t i n g the variety of approaches that have been used. Brooks [5,6,17] experimented with a number of simple random search methods, both simultaneous and sequential. Matyas' ARS may be considered an elaboration of hi s \"creeping\" or sequential random search. Karnopp's [l9,20] \"narrow range scan with moving mean\" and Gall's [ l 5 ] \"exponential random search\" are two more sequential methods which may be considered descendents of creeping random search. Karnopp has further suggested use of histograms for gathering global information about the q u a l i t y function, and he investigated the problem of choosing between global and l o c a l search methods. Weinberger's [41] \"multistage random search\" uses the concept of the hypervolume of a suitably defined allowable region i n parameter space. As the hyparvolume i s reduced i n stages the allowable region con-verges to a small neighborhood i n parameter space which has a certain p r o b a b i l i t y of containing the optimum. Other methods are the \"Variable Structure Method\" [28], \"Gradient Biased Random Search\" [35] and. a method due- to Lawson and Lavi that may be described as a sophisticated version of ASSRS [24]. .Certain universal adaptive optimization codes make use of one or more of the simpler random search methods [14]. A number of random search algorithms have appeared in.foreign language journals, especially Russian technical journals. Some of the authors are Rastrigin, Chichinada^ Zakharov, Vaysbord, Yudin and Matyas. 40 One of the l a t e s t papers on random search methods proposes a new method of finding optimal control functions. The time function i s approxi-mated by a f i n i t e , ordered sequence of independent parameters. These para-meters are then optimized by using a random search method i d e a l l y suited for the hybrid computer. I t i s not the only method of finding optimal control functions by random search, but i t i s the most dir e c t and the least r e s t r i c t i v e method [23]. The E f f i c i e n c y of Random Search Methods. Random search methods are frequently described as being e f f i c i e n t [ l 9 , 15> 41, 38]. Unless the reader pays close attention to the context i n which such statements are made he may very e a s i l y conclude that, i n some cases at lea s t , random search methods are more e f f i c i e n t than the best available deterministic method. Such a case has not been proven to e x i s t . 
Karnopp [19] gives a lengthy proof to show that h i s proposed random search method i s more e f f i c i e n t than a certain deterministic method. This \"deterministic\" method i s actually a complicated extension of the gradient method so as to make i t applicable to multimodal functions. For t h i s purpose random s t a r t i n g points are used. S t r i c t l y speaking, therefore, he i s simply comparing one random search method with another. Gradient methods by themselves simply do not apply to multimodal functions. I t i s therefore not a v a l i d comparison of random and deterministic methods. Gal l describes his exponential random search as simple and e f f i c i e n t , but does not make a quantitative comparison with any deterministic a l t e r n a t i v e . In the same work [15] he then shows that the number of function evaluations required by his method varies with the dimension approximately as 2n~\"'\", provided n i s less than about 5. This exponential dependence on n does 41 not indicate e f f i c i e n c y f o r high n. Furthermore, he concludes that \"the e f f i c i e n c y of the simpler methods could stand improvement\". E f f i c i e n c y i n t h i s context, therefore, means the method i s not so i n e f f i c i e n t as to render i t impracticable. Weinberger compares the e f f i c i e n c y of h i s multistage random search with two deterministic methods, the dichotomous method and a form of gradient method. The dichotomous method i s known to be i n e f f i c i e n t f o r multivariable problems, and the pa r t i c u l a r gradient method considered requires the evaluation of t h i r d derivatives, which again makes t h i s method quite i n e f f i c i e n t [41]. Current conjugate d i r e c t i o n methods require only f i r s t derivatives to be evaluated. This comparison therefore proves only that some random search methods are more e f f i c i e n t than some i n e f f i c i e n t deterministic methods. Schumer [38,39] has compared the ef f i c i e n c y of ASSRS to the ef f i c i e n c y of the Newton-Raphson method. When comparison was r e s t r i c t e d to problems with hyperspherical constant-Q surfaces, he found that f o r high dimensions and for l i m i t e d accuracy ASSRS i s the more e f f i c i e n t algorithm. The most e f f i c i e n t method i n that case, however, i s the gradient method. Following Rastrigin, he has also shown that fixed-step size random search i s more e f f i c i e n t than fixed step size gradient search, provided that consideration i s r e s t r i c t e d to the hypersphere and the step size i s chosen small enough. This r e s u l t , however, i s not very surprising nor p r a c t i c a l l y very s i g r i f i c a n t . For such w e l l scaled problems a gradient method would be designed to take the optimum step s i z e , i . e . , i t would converge after one li n e a r search i n the gradient d i r e c t i o n . Other authors [4,36] consider random methods i n e f f i c i e n t and 42 evidently also unimportant. They relegate such methods to i n c i d e n t a l aspects of the search, such as applying dither to gradient methods and exploration around an apparent optimum to determine whether or not i t i s indeed an optimum. S t i l l others concede the general i n e f f i c i e n c y of random methods, but f i n d that they have certain compensating advantages [22]. I t i s also the conclusion reached by t h i s author. 
In addition to the evidence of numerical experiments there i s a good p l a u s i b i l i t y argument to support t h i s point of view. I f the q u a l i t y function can be described by a r e l a t i v e l y simple mathematical expression i t means that a great deal i s known about the function. Methods that make e f f i c i e n t use of t h i s information may therefore be expected to be more e f f i c i e n t than those making very l i t t l e or no use of such information. Random methods become \" e f f i c i e n t \" only when the function i s so badly behaved that deterministic methods become very i n e f f i c i e n t or break down altogether. 43 Advantages of Random Search Methods. The main advantage of random methods i s that they are applicable to a larger class of functions than any other method. This i s due to the fact that they require a minimum of r e g u l a r i t y conditions on the q u a l i t y function, which may be discontinuous, multimodal, moderately noisy, slowly time-varying [ l 9 ] , or otherwise badly behaved. This makes possible the use of more r e a l i s t i c , hence more complicated, performance c r i t e r i a , such as Gall's max-ranklng matrix [ l 5 ] . Simultaneous random search makes possible the i n v e s t i g a t i o n of global properties of functions. For example, t h i s method may be used to locate regions i n parameter space which are l i k e l y to contain the global optimum and to provide some measure of the r e l a t i v e extent of these regions. Sometimes t h i s i s the most d i f f i c u l t part of the optimization problem, the l o c a l search being merely a refinement of the approximate resu l t given by the global method [ l 9 J . Another advantage of random methods i s t h e i r f l e x i b i l i t y . Random search methods may be designed for digital,, analogue or hybrid computers, depending upon the problem at hand. The search method may be r e l a t i v e l y complicated [22], as when a large computer i s available, or i t may be simple enough to be suitable for an inexpensive r e a l time optimizer such as described by Matyas. Various kinds of a p r i o r i knowledge about the q u a l i t y function may be incorporated into the algorithm. Thus, the d i s -t r i b u t i o n of t r i a l points may be chosen so that certain regions of parameter space are searched more int e n s i v e l y than others. In fact, 44 the very absence of any inherent structure of random methods allows the designer of the algorithm to structure i t according to the available hardware, the available a p r i o r i information, and the intended application of the. solution. Random methods also have a number of r e l a t i v e l y minor advantages. The boundaries of the search space generally need not be specified exactly, most random search methods are r e l a t i v e l y easy to program, and some methods can solve inequality constrained problems with l i t t l e more d i f f i c u l t y than unconstrained problems [22]. Suggestions f o r Further Research This study suggests that ARS i s the algorithm most amenable to further improvement. I t may be possible to establish guidelines for the choice of the c^ and d^ parameters according to the properties of the function to be minimized. The transformation matrix T may be computed i n such a way as to make more e f f i c i e n t use of the previous hist o r y of the search. Computation of T from the e l l i p s o i d of concentration of successful t r i a l points may be investigated i n t h i s connection. 
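One possible reading of this suggestion is sketched below: T is taken as the Cholesky factor of the sample covariance of recent successful steps, so that new trial steps are stretched along the directions in which successes have clustered. This is only an illustration of the idea, not a tested modification of ARS, and the regularization constant is an arbitrary choice.

```python
import numpy as np

def transformation_from_successes(successful_steps, floor=1e-6):
    """Build T from the scatter ("ellipsoid of concentration") of recent
    successful steps delta, instead of the diagonal T = diag(sigma) used in ARS."""
    S = np.asarray(successful_steps, dtype=float)                 # shape (m, n)
    cov = np.cov(S, rowvar=False) + floor * np.eye(S.shape[1])    # regularized sample covariance
    return np.linalg.cholesky(cov)                                # T such that T T^T = cov

# New trial steps would then be generated as  delta = b + T @ xi,  xi ~ N(0, I).
```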
There i s considerable s i m i l a r i t y between ASSRS and gradient methods. Each method i s the most e f f i c i e n t i n i t s class when the constant-Q surfaces are hyperspherical, and both deteriorate ra p i d l y i n e f f i c i e n c y as these surfaces become progressively less spherical. A closer comparison between these methods may y i e l d additional insight into t h e i r convergence properties. The e f f i c i e n c i e s of these methods could be compared on the basis of test functions s i m i l a r to Test Function V. The relevant parameters to be varied are the dimension and the e c c e n t r i c i t y of hyperellipsoid. Such a comparison may show that ASSRS i s always less e f f i c i e n t than a good 45 p r a c t i c a l version of the gradient method when i t applies. \" I t i s an exercise of the imagination to devise schemes for using e s s e n t i a l l y l o c a l information for speeding convergence to a l o c a l maximum\"[19]\u00E2\u0080\u00A2 More comparisons are desirable f o r new or e x i s t i n g random search algorithms and t h e i r prospective competitors, both deterministic and random. Such comparisons can guide the user i n the choice of an algorithm f o r a given optimization problem. Though i t i s easy to design an algorithm that works, i t i s more d i f f i c u l t to design one that works e f f i c i e n t l y . Random search methods may be considered as strategies f o r using a machine to do by brute force what cannot be accomplished by the more economical a n a l y t i c a l methods, or methods derived from these. Because the c a p a b i l i t i e s of the machine are l i m i t e d as w e l l , analysis i s s t i l l required to devise e f f i c i e n t strategies. Various passages i n the l i t e r a t u r e [ l 9\u00C2\u00BB34]> as well as p l a u s i b i l i t y arguments, suggest that p r i n c i p l e s and methods of other branches of science, such as Pattern Recognition and Information Theory, may be brought to bear on the problem of designing e f f i c i e n t random search algorithms. Such approaches may be p a r t i c u l a r l y useful i n connection with the d i f f i c u l t problems presented by performance surfaces which are m u l t i -modal or have steep, curving va l l e y s . In t h i s connection i t i s i n t e r e s t i n g to note that close analogies have been drawn between sequential random search and the theory of evolution [3, .5,12,19]. Each species may be thought to be a l o c a l optimum, each mutation Being a new t r i a l point, and the s u r v i v a l of l i f e being the performance index. Analogies have also been found between random search, knowledge processes and the evolution of s o c i a l i n s t i t u t i o n s [ 3 ,7 ] . Such analogies indicate that random search i s a natural method of solving problems with l i t t l e a p r i o r i knowledge about the qua l i t y function. One could possibly obtain useful ideas f o r the construction of random search algorithms by a study of these natural random search processes. 47 .5. CONCLUSION Among the three reconstructed algorithms compared in this study Kjellstrom's Random Search performed best provided that the number of variables was small. In the special case of spherical constant-Q surfaces ASSRS was most effic i e n t . ARS was neither noticeably superior nor i n -ferior i n any case, but i t has the greatest potential for further improve-ment. 
The ASSRS algorithm used here i s somewhat more efficient than the original because of a small change that has been introduced deliberately. It was verified that use of a truly uniform random vector distribution produced no appreciable improvement i n the efficiency of the method. A special algorithm was designed to generate these uniformly distributed random vectors. ASSRS and KRS were found to have certain linear convergence properties. A l l three algorithms were found to be immune to small amounts of noise. But i n relation to deterministic methods they were a l l rather ine f f i c i e n t . A general discussion of random search methods points out the variety of approaches that have been used i n the construction of such algorithms. The advantages of random search methods were discussed and suggestions were made for further research. The conclusion reached is that random search methods are more widely applicable than deterministic methods and more flexible to the problem at hand. As computing machines become progressively faster, and as more efficient algorithms are developed, i t may be expected that random search methods w i l l be more widely used, especially where regularity conditions are minimal. 48 Appendix Random Vector Generator This i s an algorithm for the generation of random vectors uniformly d i s t r i b u t e d over the surface of the unit n-sphere. Such vectors are assumed to be a v a i l a b l e i n the OSSRS algorithm upon which ASSRS i s based. However, i n ASSRS the components of th i s random vector are approximated by making each component a random v a r i a b l e uniform on the i n t e r n a l (-1,1) and then normalizing, which does not r e s u l t i n a uniform d i s t r i b u t i o n . The f i r s t step i s to derive the p r o b a b i l i t y density f u n c t i o n of the components r_^ of the random vector r . Because of the uniformity of the d i s t r i b u t i o n a l l n components have the same density function. The following d e r i v a t i o n proceeds by analogy to three dimensional geometry and i s simpler than that of R a s t r i g i n [37]. A ring-shaped element of surface area dS^ on a sphere of radius P 3 and at a distance r^ on the x^-axis i s given by the formula ds 3 = s ^ where S 2 = 2 ITP 2 P 2 = p 3 s i n (j) r3 = p3 C O S ^ S 2 i s inte r p r e t e d as the \"surface area\" of the two dimensional \"sphere\" which r e s u l t s from the i n t e r s e c t i o n of the sphere 3 2 2 Zx = p 1 and the plane X 3 = r3 49 In t h i s case the \"surface area\" i s a c t u a l l y the circumference of the c i r c l e of radius p^. Generalizing to n dimensions, a s i m i l a r element of surface area d S n on a hypersphere of radius p n and at a distance r n on the x n ~ a x i s i s defined by the formula dS - S , p d A n n-1 n v where S , i s again int e r p r e t e d as the surface area of the (n-1) -dimensional n-1 \u00C2\u00B0 hypersphere which r e s u l t s from the i n t e r s e c t i o n of the hypersphere 2 i i n \u00C2\u00A3 x_. = p1 and the hyperplane x = r n n The formula f o r the surface of an n-dimensional hypersphere i s given by [10] s . , 2 r n ( i / 2 ) f t . n - 1 n r (n/2) To complete the analogy one requires the d e f i n i t i o n s p n - l =Pn s i n *\u00E2\u0080\u00A2 r n = p n coscb In f i g u r e 16, which i l l u s t r a t e s the d e f i n i t i o n of these v a r i a b l e s , only two coordinate axes are shown. Any other axes, for n^3, w i l l be orthogonal to each other and to the plane of the page. 
If random points are uniformly d i s t r i b u t e d over the surface of t h i s hypersphere of radius then the p r o b a b i l i t y P () \u00C2\u00B0f a n v point f a l l i n g i n the element of surface area dS i s p r e c i s e l y the r a t i o dS , i . e . , n n S n p (cb) dd> = dS n n S n S _ p deb n-1 n r S n x , - AXIS n-1 Figure 17 Generation of Random Numbers by a Refined Acceptance-Rejection Method. 51 therefore (1) Pn() = G n s i n 1 1 0 \u00C2\u00A3\u00C2\u00A3TT , n>l where G = rin/2) n r d / 2 ) r ( \u00E2\u0080\u0094 ^ This r e s u l t agrees with that of R e s t r i g i n . In order to conserve computer time f o r the implementation of the algorithm i t i s convenient at th i s point to change v a r i a b l e s from d i r e c t i o n angles to d i r e c t i o n cosines. Let X = cos then dX = -sindcf> It i s desired to express p (X), the p r o b a b i l i t y density function of n d i r e c t i o n cosines, i n terms of Pn(), the p r o b a b i l i t y density function of d i r e c t i o n angles. Reasoning from f i r s t p r i n c i p l e s one may set P n ( \u00C2\u00BB |d<}>| = P n ( x ) j d x | This r e l a t i o n expresses the condition that the p r o b a b i l i t y of f i n d i n g a random d i r e c t i o n angle i n the i n f i n i t e s i m a l l y small i n t e r v a l (+d n n (2) P n ( x ) = G n ( l - X 2 ) ( n ~ 3 ) / 2 - U X * + - - 1 , n>l A more general and rigorous procedure f o r changing the v a r i a b l e s of p r o b a b i l i t y density functions i s given i n [ 1 0 ] . The problem now i s to generate random vectors r the components of which have density function p n ( X ) . A s i m i l a r analogy w i l l now be used i t e r a t i v e l y . When generating a uniformly d i s t r i b u t e d random vector i n three dimensions, r ^ i s chosen from a uniform d i s t r i b u t i o n on the i n t e r v a l (-1,1), which amounts to s e t t i n g n=3 i n formula (2). Then another point i s chosen on the circumference of that c i r c l e , with radius P 2 > which has been determined by the choice of r ^ . The d i s t r i b u t i o n of points on the circumference of t h i s c i r c l e i s again uniform. Choice of the second component amounts to a r e p e t i t i o n of the same process as choice of the t h i r d component r ^ , except that the dimension has been reduced by one and the radius of the sphere has been reduced by the 2 1/2 fa c t o r sincb = (1-X ) I t i s noted also that the repeated process i s c a r r i e d out i n the subspace again defined by the i n t e r s e c t i o n of the sphere of radius and the plane x ^ r ^ , j u s t as i n the f i r s t analogy. Choice of also determines r ^ from the requirement of unit length f or Generalizing to n dimensions, the algorithm for computing the n components r ^ i s as follows: 1) Set p = 1 n \u00E2\u0080\u00A2 2) Set i = n 3) Choose X. from p. I l (X) A) Scale r. = p.X. l i i 5) Proj ect p. n = p. J i - l l ( i - x2 ) 1 / 2 l 6) Decrement i 7) If i > l go to step (3) 8) Set X = +1, with equal p r o b a b i l i t y 9) Set r x = p 1 X 1 In t h i s context X^ denotes a r e a l i z a t i o n of the random v a r i a b l e with p r o b a b i l i t y density function p^(X). 53 This procedure y i e l d s random vectors of u n i t length, independently of the functions p^(X), provided only that | ] $ 1 . I N I ^ r 2 1 2 2 2 = X +X \ (1-X )+ n n-1 n + x J ( l - X 2 , ) ( l - X 2 , ) . . . 
( l - X 2 ) \u00E2\u0080\u009E n-1 _ n = X + Z [XT n (1-X7)] J=l J 1=J+1 Taking into account the d e f i n i t i o n of X^ the second term may be telescoped 2 to 1-X , r e s u l t i n g i n n l l r l l - 1 . As t h i s procedure by analogy does not c o n s t i t u t e a proof, the question about whether or not r i s uniformly d i s t r i b u t e d on the hypersphere was v e r i f i e d experimentally. A histogram was constructed of the projections of these random vectors onto some unit reference vector. As the number of sample random vectors increases without l i m i t , t h i s histogram must tend to the above density function when correspondingly scaled. Experiments with several randomly chosen reference vectors i n 5,6 and 7-dimensional spaces indicated that the d i s t r i b u t i o n s were indeed uniform. A sample histogram i s shown i n f i g u r e 18, i n t h i s case the random vectors are projected onto one of the coordinate axes. During the implementation of t h i s algorithm the t h i r d step, choosing X_^ from p^(X), proved troublesome. Since p_^(X), cannot be integrated and solved f o r X the Inverse Method of random number generation could not be applied [1]. Instead, a r e f i n e d version of the Acceptance-Rejection method has been used. The area under the curve p^(X) was covered with non-overlapping rectangles, which i n turn were uniformly covered with randomly chosen points. Those points f a l l i n g under the curve were accepted and the rest were rejected XXX Data Points of Histogram Density Function p 7(X) X RXIS Figure 18 Sample Histogram, of 100 000 7-Dimensional Random Vectors Projected onto x^-Axis i_ri -P-[ f i g . 17]. For large n t h i s method i s more e f f i c i e n t than a simple Acceptance-Rejection method because P n(X) tends to the impulse function as n tends to i n f i n i t y . Also, f o r n$3 the whole procedure was abandoned i n favour of one of the simpler methods a v a i l a b l e f or three dimensional d i s t r i b u t i o n s , thus avoiding the problems posed by P2(X), which i s i n f i n i t e at +1. The algorithm i s somewhat les s e f f i c i e n t than the method des-cribed by Muller [30], which simply scales a random vector from a r a d i a l l y symmetric n-dimensional Gaussian d i s t r i b u t i o n for unit length. The algorithm was used i n t h i s work to generate random vectors i n up to 30 dimensions. No analysis of the propagation of truncation errors was c a r r i e d out, however. These may be s i g n i f i c a n t for high n. The r e l a t i v e l y simple geometry of the hypersphere suggests that i t may be pos s i b l e to s i m p l i f y t h i s algorithm. 56 REFERENCES 1. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover Publications, Inc. (1965). 2. G. A. Bekey, M. H. Gran, A. E. Sabroff and A. Wong, \"Parameter Op-timizat i o n by Random Search Using Hybrid Computer Techniques\", Proc. AFIPS 1966 F a l l Joint Computer Cong., v o l . 29, pp. 191-200. 3. H. J. Bremermann, M. Rogson and S. Salaff, \"Search by Evolution\", Proc. Second Cybernetic Sciences Symposium: Biophysics and Cybernetic Systems, pp. 157-167 (1965). 4. N. S. Bromberg, \"Maximization and Minimization of Complicated Multivariable Functions\", AIEE Trans. Com. and E l e c , v o l . 58, pp. 725-730 (1962). 5. S. H. Brooks, \"Discussion of Random Search Methods for Locating Surface Maxima\", Op. Res., v o l . 6, pp. 244-251 ( l 9 5 8 ) . 6. \"A Comparison of Maximum Seeking Methods\", Op. Res., v o l . 7, pp. 430-457 (1959). 7. D. 
T. Campbell, \"Blind Variation and Selective Survival as a General Strategy i n Knowledge Processes\"-, Self-Organizing Systems (M.G. Yovits and S. Cameron, eds.), Pergamon Press, I960. 8. V. K. Chichinadze, \"Random Search to Determine the Extremum of Functions of Several Variables\", Eng. Cybernetics, 1967, pp. 115-123. 9. L. D. Cockrell, \"On Adaptive Random Search Techniques\", IEEE Proc. Seventh Symp. Adaptive Processes> UCLA, 1968. 10. H. Cramer, Mathematical Methods of S t a t i s t i c s , Princeton Univ. Press, 1945. 11. B.M.E. De S i l v a , \"The Application of Nonlinear Programming to the Automated 57 Minimum Weight Design of Rotating Discs\", Optimization (R. Fletcher, ed.), Academic Press Inc., 1969\u00C2\u00AB 12. R. E. Favreau and R. Franks, \"Random Optimization by Analog Techniques\", Proc. Second Int. Analog Computation Meeting, Presses Academiques Europeenes, 1959-13. R. Fletcher and M. J. D. Powell, \"A Rapidly Convergent Descent Method for Minimization\", Comp. J., v o l . 6, pp. 163-168 (1963). 14. M. M. Flood and A. Leon, \"A Universal Adaptive Code for Optimization\", Recent Advances i n Optimization Techniques (A. Lavi and T. P. Vogl, eds.), Wiley and Sons, Inc., 1966. 15. D. A. G a l l , \"A P r a c t i c a l Approach to the Optimal Control of Systems with Applications to the Control of a Submarine i n a Random Sea\", Ph. D. Thesis, MIT, 1964. 16. L. S. Gurin and L. A. Rastrigin, \"Convergence of the Random Search Method i n the Presence of Noise\", Automation and Remote Control, v o l . 26, pp. 1505-1511 (1965). 17. R. Hooke and J. A. Jeeves, \"Comments on Brooks' Discussion of Random Search Methods\", Op. Res., v o l . 6, pp. 881-882 ( 1958) . 18. J. M. Idelsohn, \"10 Ways to. Find the Optimum\", Control Eng., v o l . 11, pp. 97-102 (1964). 19. D. C. Karnopp, \"Search Theory Applied to Parameter Scan Optimization Problems\", Ph.D. Thesis, MIT, 1961. 2 0 . \"Random Search Techniques for Optimization Problems\", Automatica, v o l . 1, pp. 111-121 (1963). 21 . W. P. Kavanaugh, E. C. Stewart and D. H. Brocker, \"Optimal Control of S a t e l l i t e A l t i t u d e Acquisition by a Random Search Algorithm on Hybrid Computers\", Proc. AFIPS 1968, pp. 443-452. 58 22. G. K j e l l s t r o m , \"Network Optimization by Random V a r i a t i o n of Component Values\", Ericson Technics, no. 3, 1969. 23. G.A. Korn and H. Kosako, \"A Proposed Hybrid-Computer Method for Functional Optimization\", IEEE Trans. Computers, Feb. 1970. 24. G.P. Lawson and A. L a v i , \"Monte Carlo Programming: Random Searching i n the Solution of Parameter Optimization Problems',' 1970. 25. A. Leon, \"A C l a s s i f i e d Bibliography on Optimization\", Recent Advances i n Optimization Techniques (A. Vavi and T.P. Vogl. eds.), 1966. 26. J . Matyas, \"Random Optimization\", Automation and Remote Control, v o l . 26, pp. 244-251 (1967). 27. \"Das z u f a e l l i g e Optimierungsverfahren und seine Konvergenz\", F i f t h Intern. Analogue Computation Meetings, Proceedings,' v o l . 1, pp. 540-544 (1967). 28. J.M. Mendel and K.S. Fu (eds.), Adaptive, Learning and Pattern Recognition Systems (chapter 7), Mathematics i n Science and Engineering, v o l . 66, Acedemic Press, 1970. 29. J.M. Mendel and J . J . Zapalac, \"The A p p l i c a t i o n of Techniques of A r t i f i c i a l I n t e l l i g e n c e to Control System Design\", Advances i n Control Systems, v o l . 6, Academic Press, 1968. 30. M.E. Muller, \"A Note on a Method for Generating Points Uniformly on n-Dimensional Spheres\", Comm. 
ACM, vol. 2, pp. 19-20 (1959).
31. J. K. Munson and A. I. Rubin, "Optimization by Random Search on the Analog Computer", IRE Trans. EC-8, pp. 200-203 (1959).
32. B. A. Murtagh and R. W. H. Sargent, "Computational Experience with Quadratically Convergent Minimization Methods", Comp. J., vol. 13, no. 2 (1970).
33. J. A. Nelder and R. Mead, "A Simplex Method for Function Minimization", Comp. J., vol. 7, pp. 308-312 (1965).
34. Y. I. Neymark and R. G. Strongin, "Information Approach to Finding the Extremum of Functions", Eng. Cybernetics, 1966, no. 1.
35. A. F. Pensa, "Gradient Biased Random Search", Ph.D. Thesis, Pennsylvania State University, 1969.
36. D. A. Pierre, Optimization Theory with Applications, Wiley and Sons, 1969.
37. L. A. Rastrigin, "The Convergence of the Random Search Method in the Extremal Control of a Many Parameter System", Automation and Remote Control, vol. 24, pp. 1337-1342 (1963).
38. M. A. Schumer, "Optimization by Adaptive Random Search", Ph.D. Thesis, Princeton University, 1968.
39. M. A. Schumer and K. Steiglitz, "Adaptive Step Size Random Search", IEEE Trans. AC-13, pp. 270-276 (1968).
40. E. M. Vaysbord, "Asymptotic Estimates of the Rate of Convergence of Random Search", Eng. Cybernetics, 1967, no. 4.
41. M. R. Weinberger, "Multistage Random Search and Automatic Network Synthesis", Ph.D. Thesis, MIT, 1966.
42. D. J. Wilde, Optimum Seeking Methods, Prentice Hall, 1964.