AN EMPIRICAL STUDY OP LOCALLY PSEUDO-RANDOM SEQUENCES by Alan Rodney Dobell A., The University of B r i t i s h Columbia, 1959 A Thesis Submitted i n P a r t i a l Fulfilment of The Requirements for the Degree of Master of Arts i n the Department of Mathematics We accept t h i s thesis as conforming to the required standard The University of B r i t i s h Columbia A p r i l , 1961 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an a d v a n c e d d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t h e Head o f my Department o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . D e p a r t m e n t o f Mathematics The U n i v e r s i t y o f B r i t i s h C o l u m b i a , V a n c o u v e r $ , C a n a d a . Date April 12, 1961. . i i Abstract In Monte Carlo calculations performed on el e c t r o n i c com-puters I t i s advantageous to use an arithmetic scheme to generate sets of numbers with "approximately" the properties of a random sequence. For many applications the l o c a l c h a r a c t e r i s t i c s of the r e s u l t i n g sequence are of i n t e r e s t . In t h i s thesis the concept of a pseudo-random sequence i s set out, and arithmetic methods for t h e i r generation are d i s -cussed. A b r i e f survey of some standard s t a t i s t i c a l tests of randomness i s offered, and the r e s u l t s of empirical tests f o r l o c a l randomness performed on the ALWAC III-E computer at the University of B r i t i s h Columbia are recorded. I t i s demonstrated that many of the standard generating schemes do not y i e l d sequences with suitable l o o a l properties, and could therefore be responsible f o r misleading r e s u l t s i n some applications. A method appropriate f o r the generation of short blocks of num-bers with approximately the properties of a randomly selected set i s proposed and tested, with s a t i s f a c t o r y r e s u l t s . i i i TABLE OP CONTENTS Chapter 1. Introduction Page 1 Chapter 2. S t a t i s t i c a l Tests 9 Chapter 3. Methods of Generation 18 Chapter J i . Locally Pseudo-Random Sequences 29 Chapter Summary and Conclusions 37 Bibliography 38 Tables. Table I To Follow Page 31 Table II To Follow Page 33 Table III To Follow Page 3I1 Table IV To Follow Page 3k Acknowledgementa I wiah to acknowledge the assiatance extended to me by Dr. L. Schwartz, who read and c r i t i c i z e d t h i s thesis i n draft form, by the s t a f f of the Computing Centre at the University of B r i t i s h Columbia, with whom I have had the p r i v i l e g e of working, and i n p a r t i c u l a r by i t s director, Professor Thos. E. H u l l . For any merits t h i s thesis may f i n a l l y claim, his enthusiasm and ready counsel are primarily responsible. 1 Introduction Solutions to many problems require random numbers. Simu-l a t i o n techniques and Monte Carlo methods often depend i n an es s e n t i a l way on the fact that i f P i s a p r o b a b i l i t y d i s t r i -bution function, with inverse. P"1, and X i s a random variable uniformly distributed on [o, i ] , then F""1(X) i s a random variable with d i s t r i b u t i o n function P. (See, e.g . f j p ] , [73.) Applications i n mathematical s t a t i s t i c s often depend i n an e s s e n t i a l way on t h e o r e t i c a l properties of random numbers. (See, e.g.,[8]) A concern basic to both applications, and of importance i n other problems of applied analysis, i s that of guaranteeing adequate supplies of numbers apparently Indepen-dently drawn from a population uniformly distributed on the unit i n t e r v a l . That i s , the problem Is to provide numbers x drawn i n such a way that Prob. (x * a) = a f o r 0 i a 1 independently of a l l preceding or succeeding numbers. The pur-pose of this essay i s to describe and examine some methods by which numbers with "approximately'* t h i s property may be supplied i n a p r a c t i c a l way using electronic computers. Let us e s t a b l i s h at the outset that a number i s to be i n t e r -preted as a point on the r e a l l i n e . This understanding w i l l serve to d i s t i n g u i s h the following study from work dealing with random sampling d i g i t s . Random d i g i t s have long been of concern to the s t a t i s t i c i a n interested i n actual sampling procedures, or i n the t h e o r e t i c a l 2 procedure known as d i s t r i b u t i o n sampling. Discussion of the problem of supplying suitable random sampling d i g i t s seems to have begun with Kendall and Bab ing ton-Smith C241 [25]# although Tippett's tables were published e a r l i e r £*36J and some th e o r e t i c a l sampling techniques had been used e a r l i e r yet.|f35l This work on random d i g i t s i s n a t u r a l l y very closely related to the problem of random number generation; the two problems are not, however, equivalent. Thus, although we s h a l l r e f e r to a r t i c l e s on random sampling d i g i t s , and s h a l l use modified forms of tests proposed f o r random d i g i t s , we s h a l l r e t a i n the d i s t i n c t i o n , and phrase our discussion e n t i r e l y i n terms of points on the r e a l l i n e . In terms of points x, the t h e o r e t i c a l requirements imposed on a random set of numbers may be summarized i n the following standard d e f i n i t i o n s . Defn. 1; The set of numbers x., x_, x w i l l be 1 2 n said to be random i f i_t represents an observation on a vector random variable X^n^ = (X-^ , x2, . . .» xn) with a j o i n t cumulative d i s t r i b u t i o n function M n P n = P n ( x n , x p, x ) = 7T p l ( x ) 1 1 i = l where p l i s some univariate d i s t r i b u t i o n function. ' Defn. 2 t The set of numbers x,, x 0, x w i l l be said to be p e r f e c t l y random i f the d i s t r i b u t i o n function P 1 of D e f i n i t i o n 1 i s the d i s t r i b u t i o n function rQ i f x ^ 0 P 1(x) = j x i f _ 0 £ x ^ l V 1 i f x > 1 3 and x± £ fo, l l . This d e f i n i t i o n sets out the properties required i n a n a l y t i -c a l work; we must modify i t somewhat f o r our purposes. We must recognize i n i t i a l l y that the numbers with which we are oonoerned w i l l be represented i n a f i n i t e word length computer by a f i n i t e number of d i g i t s . Thus only a f i n i t e num-ber (depending on the computer) of d i s t i n c t configurations i s possible; i f the radix i s denoted r , and the word length k, then the number of d i s t i n c t elements i n the set S of d i s t i n g u i -shable numbers i s just r ^ . We can deal only with discrete approximations, uniform over the set S, to the d i s t r i b u t i o n functions mentioned i n D e f i n i t i o n 2. It w i l l be the d e f i n i t i o n i n terms of these discrete approximations which i s meant when reference i s made i n the following to properties of randomness. But i n any case one has In general no p r i o r knowledge of the d i s t r i b u t i o n function F — the only procedure which can determine whether an observed set of numbers may be said to s a t i s f y D e f i n i t i o n s 1 or 2 i s a s t a t i s t i c a l t e s t i n g procedure. And any f i n i t e t e sting procedure i s imperfect: sets of numbers which actually s a t i s f y D e f i n i t i o n 2 may not s a t i s f y standard t e s t s ; sets of numbers passing any given class of tests may yet not be random i n the sense of the d e f i n i t i o n . This d i f f i c u l t y w i l l not concern us i n the following. Be-cause the requirements i n application are f o r sets of numbers with c e r t a i n sp e c i f i e d properties, our objective w i l l be simply to construct methods which pass tests f o r these properties. Because applications are set up generally to mirror the condi-t i o n of t h e o r e t i c a l randomness, a set of numbers w i l l be suitable i n application i f s p e c i f i c properties i t displays do not deviate "too markedly" from those expected of a random set. Therefore we may apply the machinery of s t a t i s t i c a l t e sts of randomness to determine whether a p a r t i c u l a r set of numbers w i l l be suitable i n a given a p p l i c a t i o n . These considerations lead to the following d e f i n i t i o n of a pseudo-random sequence. Defn.; A pseudo-random sequence i s a set of numbers which passes the tests i n a class C of tests of randomness. Our objective i s to construct pseudo-random sequences. The choice of a p a r t i c u l a r class C of tests i s a problem to which s a t i s f a c t o r y answers cannot be given i n general. No f i n i t e set of numbers oan s a t i s f y a l l plausible tests of random-ness j on examination, every f i n i t e sequence w i l l display some p e c u l i a r i t i e s which would cause i t to be rejected as a randomly generated set. The choice of tests can be f i n a l l y determined only by the use to which the sequence i s to be put, and the properties which are e s s e n t i a l i n that use. An i n i t i a l discus-sion of th i s question, and of the way i n which the tests w i l l be used, i s given i n Chapt. 2. Having i n hand a procedure for determining whether an observed sequence i s pseudo-random, the problem i s to construct methods which y i e l d sequences l i k e l y to s a t i s f y the test c r i t e r i a . The f i r s t suggestion may be to make use of mechanical 5 processes which are believed a p r i o r i to s a t i s f y the conditions of D e f i n i t i o n 2 processes such as the drawing of cards from a "properly shuffled deck" of numbered cards, the f l i p p i n g of "true" coins, the r o l l i n g of " f a i r " dice. Tables of random d i g i t s have been constructed by such means, and the methods are s t i l l proposed. (See the review by Tompkins £ l 2 l ] of icosahedral dice.) They are, however, both too slow and too limited i n scope to be of use i n computer applications. In the same s p i r i t i s the suggestion that tables could be constructed from numbers appearing on census returns, waybills, etc. The tables of Tippett [ 3 6 ] , and Horton and Smith flOli] used t h i s method, with a randomizing transformation described i n [lOli]. For more extensive tables, some methods u t i l i z e p hysical pro-cesses which are expected, on the basis of phys i c a l theory, to y i e l d completely random output. It was by thi s means that the tables of Kendall and Babing.ton-Smith [ 2 6 ] and the RAND Corpora-t i o n [ 3 3 j were constructed. It might be expected that i f electronic equipment can be constructed to produce random output, such equipment could e a s i l y be wired into a computer, y i e l d i n g random numbers on demand. Apart from the important fa c t that such equipment would be expen-sive, there are major disadvantages. i ) The equipment has a tendency to degenerate to a syste-matic output; (see T3J on the RAND experience) and would therefore be expensive to maintain i n the 'random' condition. i i ) A c a l c u l a t i o n could not be checked or re-run. The output cannot be duplicated. 6 i i i ) No attempt can be made i n the c a l c u l a t i o n to avoid the deviations from mean behaviour which ine v i t a b l y occur i n a random sequence. That i s , although the sequence may be random, i t need not be pseudo-random. Not a l l random sequences would be suitable i n applica-t i o n ; the deviations mentioned above may be responsible f o r misleading or anomalous r e s u l t s . Rather than incorporating special equipment into the computer, we might u t i l i z e e x i s t i n g tables to input random numbers as required. This, even with the most e f f i c i e n t input,equipment, i s too slow to be f e a s i b l e . A l t e r n a t i v e l y , to attempt to store tables i n the computer memory would require a p r o h i b i t i v e l y large amount of memory, and i n many cases would s t i l l be unduly slow. Both measures have the additional disadvantage that the volume of numbers required for some calculations could well exceed the size of the largest e x i s t i n g tables. Thus, neither published tables nor external procedures are sa t i s f a c t o r y f o r standard computer appl i c a t i o n s . Recognizing the disadvantages of the above proposals, i t has been suggested that suitable pseudo-random sequences could be supplied by an arithmetic method of generation. Of course, such methods do not s a t i s f y the conditions of a random method of generation; they are, i n fact, completely determined by one or two previous elements of the sequence. As John von Neumann says ([l2i+~) page 36) "any one who considers using arithmetical methods of producing random d i g i t s i s , of course, i n a state of s i n . " But the point i s that we are not concerned with the conditions of randomness — we require only a pseudo-random , 7 sequence. Whioh means that we look no f a r t h e r than the sequence i t s e l f f o r our c r i t e r i a of s u i t a b i l i t y ; we care not how i t was produced. In Chapter 3 we outline several proposed methods f o r generating arithmetically a pseudo-random sequence, and r e l a t e the r e s u l t s of s t a t i s t i c a l t e sts performed on these. In that discussion i t w i l l be noted that the number sequences so gener-ated are p e r i o d i c ; the problem of assuring a s u f f i c i e n t l y long period i s successfully treated by methods of number theory, and a l l the methods proposed are such as y i e l d a maximal period f o r a given computer. Applications arise, however, i n which a shorter block of random numbers appears as an i n t e g r a l part of the c a l c u l a t i o n s . This s i t u a t i o n suggested the present study; i t appeared not at a l l obvious that a sequence which suitably passed a class of tests would y i e l d short subblocks which i n themselves would pass the same class of t e s t s . In Chapter it we outline empirical work which was undertaken to test whether a sequence with maxi-mal period i s most suitable f o r use i n such a s i t u a t i o n . Our conclusion, based on t h i s experimental work, i s that i t i s not. It seems rather that a sequence with shorter period displays more suitable properties over short blocks of numbers. This r e s u l t again points up the f a c t that the choice of a method for generating a sequence must be made with the requirements and c h a r a c t e r i s t i c s of the p a r t i c u l a r a p p l i c a t i o n i n mind. These then are the main themes of t h i s t h e s i s : that there i s i n p r a c t i c e no d e f i n i t i o n of randomness f o r f i n i t e ' sets apart from a class of s p e c i f i c t e s t s ; that t h i s class of tests 8 must be selected with reference to the intended application of the random sequence; and that for some applications the choice of a suitable method of generation w i l l lead to sequences with less than maximal period. In the next chapter we begin our discussion by determining an adequate procedure for t e s t i n g sequences. Chapter 2. Tests of Randomness Tests of randomness are unlimited i n number. Many are extensively studied i n the s t a t i s t i c a l l i t e r a t u r e , and references to much of t h i s work are included i n our bibliography. The general problem i s described by Levene £63! as follows: "Let the vector random variable X<n> = (xlf x2, xn) have the j o i n t cumulative d i s t r i b u t i o n function F N = F n ( x ^ , x 2, .»», x n ) • 0 .... Let -A n be the class of a l l continuous F N , and l e t Csn be the class of a l l F N of the form F N = TT n _ 1 P 1 ^ ) where F^ - i s some continuous univariate d i s t r i b u t i o n function. By the hypothesis of randomness, HQ, we mean the hypothesis that F N , known to belong to -A^ actually belongs to CJn. The s t a t i s t i c a l problem i s to test H Q on the basis of one observation x n on Xn The most usual procedure has been f o r the s t a t i s t i c i a n to devise some s t a t i s t i c whose d i s -t r i b u t i o n could be obtained without too much trouble. Then i f extreme values of t h i s s t a t i s t i c were observed, the hypothesis of randomness was rejected." Likewise, to test the hypothesis that an observed set x(n) i s p e r f e c t l y random i s to test the hypothesis that F N £ CJn, where &>n Is the class of a l l F N of the form F N = T T F 1 ( X 1 ) .. CO x £ 0 and F-Mx) = f x 0 x ^ 1 ( l x > 1 i . e . where F 1 i s the uniform d i s t r i b u t i o n on f o , l ] , A d i f f i c u l t y mentioned i n the introduction arises at t h i s point. As Levene £63J points out: 10 "... I f we look long enough we w i l l f i n d something very peculiar and non-random about any given sequence and can prove that the p r o b a b i l i t y of t h i s p e c u l i a r i t y a r i s i n g by chance i s very small. The d i f f i c u l t y i s that randomness Is not a property of a sequence of numbers, but of the process that produced them, that i s , of pn ." Consequently there i s no test with a high p r o b a b i l i t y of r e j e c t i n g H Q whenever P n ^ w n . "In f a c t , given any c r i t i c a l region of size o(t there exists F n £ cJ f o r which the p r o b a b i l i t y of the c r i t i c a l set i s zero." The t h e o r e t i c a l alternative proposed i s to r e s t r i c t the class F n to a class of alternatives e s p e c i a l l y feared, and to ohoose s t a t i s t i c s with good power against these. In prac-t i c e t h i s means we must stipulate ahead of time s p e c i f i c proper-t i e s e s s e n t i a l i n a given application, and test the sequence for these properties. Because we wish only to determine whether an observed sequence i s pseudo-random, we test the hypothesis of randomness only against alternatives which would represent sets unsuitable In a p p l i c a t i o n . That i s , the class of alternatives w i l l contain only d i s t r i b u t i o n s which are non-random i n such a way as to render samples drawn from them unsatisfactory f o r the intended a p p l i c a t i o n . In p a r t i c u l a r , the alternatives w i l l not include d i s t r i b u t i o n s associated with the type of arithmetic dependence we describe l a t e r . These considerations then suggest the correct interpreta-t i o n to be assigned to c e r t a i n extreme values of the t e s t s t a t i s -t i c s . In general an observed value Z Q f o r a test s t a t i s t i c Z i s considered to give cause f o r the r e j e c t i o n of the n u l l hypothesis at a l e v e l of significance 0( i f z f a l l s i n a o 11 c r i t i c a l region w where Prob a b i l i t y [w] ~ <X when the n u l l hypothesis i s true. In the tests which we w i l l describe and use i n the following, Z i s a "distance" s t a t i s t i c and the c r i t i c a l region w i s of the form w = | " z o : Z Q > c^ where c i s a constant and Prob £ z > c] ~ o< when the n u l l hypothesis i s true. Following the lead of Kendall and Babington-Smith £2 l i ] several writers have used a "two-tailed" test f o r tes t i n g pseudo-random sequences. (See, e.g. £ ° 7 j on the tests of the d i g i t s of If and e_, and the comment of Taussky and Todd £ l l 7 l that the d i g i t s of e are "apparently bad" (P. 2 6 ) . ) But, by vi r t u e of the way i n which we r e s t r i c t the class of alternatives to be considered, there i s no alternative which j u s t i f i e s the r e j e c t i o n of the n u l l hypothesis on the basis of extremely low values of the test s t a t i s t i c s we s h a l l use. For the purpose of assessing a pseudo-random sequence, a "two-t a i l e d " test procedure i s not appropriate. (On t h i s question, see also JTlio] [f?l].) The tests outlined below f a l l In the class known as non-parametric or d i s t r i b u t i on free t e s t s . We s h a l l describe some which have been extensively used, and o f f e r references to several others. Uniformity — The Chi-square Test The basic requirement on pseudo-random sequences i s that they be "approximately" uniformly distributed — i . e . — that Pr o b a b i l i t y [x ^ a] = a 0 £ a 1 . 12 The standard way to test whether th i s property can be said to hold f o r an observed sequence Is to subdivide the u n i t i n t e r -v a l i n t o k d i s j o i n t i n t e r v a l s I. of length L. and calculate the value of t h e ^ s t a t i s t i c i f the i n t e r v a l s are of equal length k X? = 21 (f,-n/k) 2/(n/k) , j=l 3 where f . denotes the number of elements of the sequence f a l l i n g J i n Interval j , and n denotes the t o t a l number of elements. The l i m i t i n g d i s t r i b u t i o n of t h i s well-known s t a t i s t i c was devel-oped by Pearson f7 23* i s tabulated as the ")(_2 s t a t i s t i c with k-1 degrees of freedom. We.reject the hypothesis of randomness at the l e v e l of significance o( i f X 2 C where Prob. [X^-iJ - °} ~ when the n u l l hypothesis i s true, and X 2^. -Q i s the tabulated X 2 d i s t r i b u t i o n with k-1 degrees of freedom. Such X 2 goodness of f i t tests are used In many d i f f e r e n t tests of randomness. The question of optimal choice of k has been studied £83}, but i s frequently s e t t l e d by considerations of programming convenience. Further studies discuss the a p p l i -c a b i l i t y of " X 2 tests i n general, and discuss possible modifi-cations. [i|3] [si] Other possible "distance" measures used as tests of goodness of f i t involve the evaluation of expressions l i k e sup | F n ( x ) - G(x) | 13 where P n i s t h e e m p i r i c a l , d i s t r i b u t i o n t o be t e s t e d , G t h e h y p o t h e s i z e d " t r u e " d i s t r i b u t i o n . S u c h e x p r e s s i o n s a r e e x t r e m e l y awkward t o p r o g r a m , and we t h e r e f o r e have made no u s e o f t h e m i n t h i s i n i t i a l e x p e r i m e n t a l w o r k . No a p p l i c a t i o n seems t o have b e e n made o f s u c h s t a t i s t i c s i n any p u b l i s h e d t e s t s o f p s e u d o - r a n d o m s e q u e n c e s . I n d e p e n d e n c e — The S e r i a l M a t r i x T e s t C o n s i d e r a s e t o f n d i g i t s a^, a 2 , a R where e a c h a^ i s drawn f r o m a s e t o f t d i g i t s i n a random m a n n e r . Then i t i s t o be e x p e c t e d t h a t no d i g i t w o u l d t e n d t o be f o l l o w e d more o f t e n b y any one d i g i t t h a n b y any o t h e r — i . e . — t h e o f r e q u e n c y o f o c c u r r e n c e o f e a o h o f t h e t p o s s i b l e 2 - d i g i t c o n f i g u r a t i o n s w o u l d be t h e same, and t h e r e f o r e e q u a l t o n / t 2 . K e n d a l l and B a b i n g t o n - S m i t h \2\\\ p r o p o s e d a t e s t f o r t h i s p r o p e r t y w h i c h t h e y c a l l e d t h e S e r i a l t e s t . I f t h e f r e q u e n c y o f o c c u r r e n c e o f t h e 2 - d i g i t c o n f i g u r a t i o n a j _ a j i s d e n o t e d g ^ j , t h e n t h e e x t e n t t o w h i e h t h e o b s e r v e d s e t d e v i a t e s f r o m e x p e c t a t i o n may be measured b y t X 2 = 21 ( g . - n / t 2 ) 2 / ( n / t 2 ) . i , j = l i j K e n d a l l and B a b l n g t o n - S m i t h a s s e r t e d t h a t t h i s s t a t i s t i c had a s y m p t o t i c a l l y t h e ~)( 2 d i s t r i b u t i o n w i t h ( t 2 - t ) d e g r e e s o f f r e e d o m , and t h i s f a c t was much u s e d (by t h e RAND C o r p o r a t i o n among o t h e r s ) i n t e s t s o f random d i g i t s . A m o d i f i c a t i o n o f t h i s t e s t f o r a p p l i c a t i o n t o r a n d o m num-b e r s was u s e d b y J u n c o s a £107]}. I n t h i s m o d i f i c a t i o n , t h e u n i t i n t e r v a l was p a r t i t i o n e d i n t o k (= 10) s u b i n t e r v a l s . F o r a Ik set of numbers, the number of occasions on which a number x r f a l l i n g i n i n t e r v a l i was followed by a number x r + l * n * n t e r v a i J w a s t a l l i e d as the frequency Again the extent of deviation from expected behaviour may be measured by a }£ 2 s t a t i s t i c , X 2 =21 ( f ± i - n/k2)2/(n/k 2) . 2 i , j = l 1 J Juncosa then tested the significance of t h i s measured value by comparison with the values of % f o r k - 1 ( 9 9 ) degrees of freedom. Subsequently I. J . Good p>2J J53J demonstrated that x| did not have asymptotically a X 2 d i s t r i b u t i o n , but that, i f we set f ± = JT j = 1 . f i j = j = i f " t h e n k k H ( f , . - n/k 2)2/(n/k 2) - 51 (f, - n/k) 2/(n/k) = ^ -i , j = l 1 J i = l 1 . . * has asymptotically a d i s t r i b u t i o n with k - k degrees of freedom, and also k k 22 ' n/k2)2/( n/ k2) . 2 ^ ( f ± , - n / k ) 2 / ( n / k ) i , j = l 1 = 1 = X 2 - 2X 2 P P has asymptotically a % d i s t r i b u t i o n with (k - 1) degrees of freedom. The l a t t e r of these measures was used by the authors of the RAND table i n a correction [ l 2 l | to t h e i r e a r l i e r t e s t ; we s h a l l use the former i n the experimental work of Chapter lx, i n t e s t i n g f o r the independence of successive d i g i t s . Independence — S e r i a l Conrelation Test Tests which are p a r t i c u l a r l y useful i n t e s t i n g f o r random-ness against the alternative of a trend or a c y c l i c f l u c t u a t i o n 15 are the s e r i a l c o r r e l a t i o n measures. The values of such expressions as N R h = 1 / N <Ti X i X i + ^ ' N C = 1 - j2 / 2 t f2 N where tf2 = l A Z ^ X * - X ) 2 i = l are used as measures of the s e r i a l c o rrelation, and have been widely studied. For the case i n which h = 1, i t i s found £37] [&h\ that for large N, R i s approximately normal with expectation E(R) = (S 2 - S 2)/(N-1) and variance 6 2(R) = ( S 2 - S ^ ) / ( r + (s4-l4,S2S2+iiS.|S^+S2-2S, ) -E2(R)/{;,,-;,; ; n- 1 (n-l)(n - 2 ) where Sr,-= x^ + x + ... x^ K 1 2 N and the sequence i s assumed to be randomly drawn from a d i s t r i -bution with low order moments. Williams £87] and von Neumarn[8o] [8l] have studied the moments of ^ 2 , and Young f92^ has proposed the related s t a t i s -t i c C, which behaves l i k e a conventional c o r r e l a t i o n c o e f f i -cient. The analogous expressions f o r h 1 measure a s e r i a l c o r r e l a t i o n with lag h, and provide a basis f o r further tests of randomness against the alternative of trend or regular f l u c -tuation. 16 Independence -- Runs Teats . A study of the distribution of runs yields several measures which are used as tests of randomness. The article by A. M. Mood £681 has an extensive bibliography of the i n i t i a l papers on the topic and of related studies to 191+0. The analysis of runs up and down has perhaps been the most thoroughly studied; the work of Wolfowitz and Levene £ 6 3 j [61±1 has given the expected values and the covariance matrix for statistics based on runs up and down, and they have studied the properties of tests based on these s t a t i s t i c s . Their article f61il incidentally shows that the test procedure used by Kermack and McKendrick £ 6 l ] is not correct. Letting r^ denote the number of runs up or down of length p and r 1 denote the number of runs up or down of length P greater than or equal to p , then the expected values of r P and r' in a set of size n, are given by P E(r ) = 2n(p 2 +3p+D/(p+3): - 2(p3+3 P 2- p-L|.)/(p+3) t E(r ) = 2n(p+l)/(p+2)l - 2(p 2 - H p-l)/(p+ 2 ) I 3? These results ma<ke possible a test of goodness of f i t of the observed to the expected number of runs up and down i n the observed sequence. Similar statistics may be based on the observed number of runs above and below the median C68j, runs above and below the mean T68J, and the total number of runs £ 7 5 ] ] . The sign tests [66] [69] and the U-test [ l 3 0 J are of a similar nature. A com-prehensive discussion of this class of order statistics is found i n Wilks [ 8 8 ] . 17 The Co s t a t i s t i c of Kendall and Sherman [38] [23] and n the low order moments of the sample (empirical) d i s t r i b u t i o n provide further tests of randomness. The d 2 test of Gruenberger and Mark [$6] i s designed f o r the case i n which the sequence tested i s to be used i n Monte Carlo calculations — i n p a r t i c u l a r those i n which two succes-sive numbers are used as the coordinates of a "random" point i n the unit square. The test i s based on the p r o b a b i l i t y that the squared distance between two successive points w i l l exceed a value i X 2 . The t h e o r e t i c a l p r o b a b i l i t i e s are tabulated In T^b], on the basis of the r e l a t i o n P r o b [ d 2 ^ c< 2] = c<2 - 8*3/3 +o<^/2 for 0 d<X* c \ Prob[d 2£o<*] = 1/3 + (7f-2)iX2 + M * 2 - ! ) 1 ^ + 8/3(<X 2-l) 3 / / 2 - 0^/2 - lio*2 Sec" 1* f o r 1 £ (X2 2 . F i n a l l y we note that a t e s t of the randomness of a set of numbers may be based on the empirical d i s t r i b u t i o n of some test s t a t i s t i c computed f o r each of several subsets of the set. The observed d i s t r i b u t i o n of these computed values may then be tested f o r goodness of f i t to the t h e o r e t i c a l d i s t r i b u t i o n of the test s t a t i s t i c . Such a procedure i s used by Taussky and Todd f l l 7 j and by Dodd ["1+8] ; i t w i l l be used also i n our analysis of experi-mental r e s u l t s i n Chapter Ij.. On the basis of discussions i n {"183 C23l D>°3 further tests of randomness could be constructed. The foregoing, however, describes or gives reference to a l l of the standard tests used i n evaluating the arithmetic methods described i n the following chapter. 18 Chapter Three. The Generation of P3eudo-Random Sequences Having an elaborate background f o r the t e s t i n g of pseudo-random sequences, we need now to obtain some numbers. As i n d i -cated i n the introduction, we s h a l l confine the discussion to the generation of sequences by an arithmetic r e l a t i o n . For the sake of c l a r i t y we s h a l l r e f e r to a s p e o i f i c recursion r e l a t i o n Xj+1 = ^ X j * X j - i * xj_-fc» &2* &3* with given parameter values as a pseudo-random generator, while a class of such r e l a t i o n s , of similar form but with unspecified values f o r the parameters, w i l l be ca l l e d a method. There are three basic methods — the mid-square method, the m u l t i p l i c a t i v e congruential method, and the additive congruential method. These each o f f e r s p e c i a l cases which w i l l be mentioned separately. The discussion w i l l generally be phrased i n terms of r e l a t i o n s suitable f o r a binary computer. The Mid-Square Method (i) Generation. The mid-square method suggested by von Neumann squares a 2r b i t number u., 0 4. u. L 1, extracts the middle 2r b i t s from the r e s u l t , and uses t h i s number as uj+l« There i s no convenient analytic expression to describe t h i s method, but f o r a 2r d i g i t number u^, one can write the r e l a t i o n u j + 1 = 2 r £ u 2 mod 2" rJ (1) with the understanding that only the 2r most s i g n i f i c a n t ( i . e . leftmost) b i t s of the r e s u l t are used. The c a l c u l a t i o n of each succeeding random number can there-19 fore be carried out i n a binary computer with, one m u l t i p l i c a t i o n and one s h i f t operation. The subsequent m u l t i p l i c a t i o n by 2 r i s achieved by the scaling convention i n which the binary point i s understood to be at the left-hand end of a r e g i s t e r — i . e . before the most s i g n i f i c a n t binary d i g i t . ( i i ) Period. It i s apparently not possible to determine a n a l y t i c a l l y the number of numbers which may be obtained by any p a r t i c u l a r choice of vLQ. But, the number of d i s t i n c t i t e r a t e s i s c l e a r l y f i n i t e ( i n fact ^ 2 ? r ) and the method w i l l cycle i f at any time a value i s repeated. Of further concern i s the p o s s i b i l i t y that the process w i l l degenerate by accumulating zeros at either end, and thus terminate i n zero. Metropolis f i l l } and Forsythe T99l have studied the lengths and types of cycles produced by the method, and have concluded that i n many cases t h i s 'zero mechanism' i s dominant i n the t e r -mination of the sequence through c y c l i n g . If t h i s i s assumed, then a rough estimate of the length of the sequence may be made by observing that the p r o b a b i l i t y of r zeros accumulating i n either the leading or the t r a i l i n g d i g i t s i s 2" r, and hence, assuming the sequence to be random, the p r o b a b i l i t y of the sequence degenerating at any stage i s 2 x 2~r. Thus the expected length of sequence i s approximately 2 r " l , or the square root of the larges t value represented by 2r b i t s . ( i i i ) Properties. The method has been motivated on the basis of two observations. a) I f the variable x i s uniformly distributed on (0, 1) then the variable y = x 2 20 has the density function p(y) = 1/2 y " 1 / 2 f o r y > 0. h) I f a random variable y^. i s formed from a random variable y by the rul e y k = 2 n [y mod 2" nJ (2) then the l i m i t i n g d i s t r i b u t i o n as n ->oo i s uniform on (0, 1). A proof of (b) i s found i n Tocher [ l 2 0 ] , who also derives an estimate of the bias implied by (a). From th i s estimate or d i r e c t l y from consideration of (a) i t i s expected that the method w i l l y i e l d too many small numbers, and t h i s expectation i s confirmed In p r a c t i c e . (iv) Tests. Tests on sequences produced by the mid-square method have been performed by Forsythe C983, w-bx> tested ]+-digit mid-square sequences, by Votaw and Rafferty ^1253 and by Hammer £l02], but i n each case these tests were performed on the i n d i -vidual d i g i t s rather than the numbers. The r e s u l t s reported by Forsythe were negative, but Hammer, using 1 0-digit decimal num-bers, and Votaw and Rafferty, using 8-digit decimal numbers, reported s a t i s f a c t o r y r e s u l t s . Cashwell and Everett f 7J report that t h e i r mid-square method using a 38 b i t number y i e l d s a sat i s f a c t o r y sequence of length about 750,000. (v) Assessment. The major disadvantage of the mid-square method i s the danger of undetected short cycles i n the sequence. Coupled with the f a c t s that the method i s not f a s t , and possesses a bias toward small values, t h i s has generally led to the aban-donment of the mid-square procedure i n favour of other methods. ,'\ 2 1 A mid-product method, and an off-center mid-product method (which Tocher shows to be less biased than a mid-square method) are tested also by Porsythe Z99}» but apparently are not s a t i s -factory, and are not recommended. The M u l t i p l i c a t i v e Gongruential Methods The l e a s t p o s i t i v e residues of the r e l a t i o n X j + 1 E (kXj + c) mod M (3) form a peri o d i c sequence, and Lehmer [lOoQ suggested that the r e l a t i o n u. = " ( L / M ) X j may be suitable f o r generating a sequence of Rseudo-random numbers. Several s p e c i a l cases of (3) are used s u f f i c i e n t l y often to j u s t i f y distinguishing them by name. The r e l a t i o n (for binary computers) X^ + 1 = k'X^ mod 2 P + 1 U j - 2 " P X J i s known as Lehmer*s method, and i s a s p e c i a l case of the r e l a t i o n X j + 1 - k X j m o d M = k J ' + l x 0 m o d M u j = X j / M -which i s called the power residue method. We s h a l l use t h i s term only i n case M = 2 ? . To avoid the m u l t i p l i c a t i o n required by (ij.) or ( 5 ) , :Greenberger t l O l J proposed that the r e l a t i o n ( 5 ) employ 22 k = 2 a + 3 so that x j + l = ( 2 * + 3)Xj mod 2 P (6) and the m u l t i p l i c a t i o n could be accomplished by a s h i f t and add procedure. For the same reason Rotenberg [ l l 6 j proposed to use a method which i s a spe c i a l case of ( 3 ) , setting X.^ = ( 2 a+l)X, + G mod 2 P J 1 p J (7) u i = 2~ PX. J J We s h a l l describe the properties of the power residue method i n some d e t a i l , and quote corresponding r e s u l t s for the other cases. (i) Generation. The sequence [n^j of numbers i s defined by the r e l a t i o n s Xj+ 1 - - k X j m o d 2 ? - k ^ + l x 0 m o d 2 ? (8) uj = 2 - p X j (9) and can therefore be generated by a single m u l t i p l i c a t i o n , pro-vided P Is less than or equal to the word length of the com-puter used. ( i i ) Period. Clearly the sequence (8) i s simply periodic, with the period given by the least solution of the congruence r e l a t i o n i X j + n = Xj mod 2 P But X 4 = k% Dmod 2 P >P\ = Xj+n = k J + n x o m o d 2 ? ' If k odd, (k, 2*) = 1 , and we have X . + n = k 3 ^ = X^ mod 2 P ( k n - l ) X j = o mod 2 P 23 If also X odd, then X. i s odd f o r a l l j , and (X., 2 P) = 1 Hence k - 1 = 0 mod 2 P (10) k n 5 1 mod 2 P Number theory methods show that the least s o l u t i o n of t h i s r e l a -t i o n i s greatest when k i s of the form k = + 3 mod 8 (11) That i s , a necessary and s u f f i c i e n t condition that the r e l a t i o n (8) generate a sequence with maximal period i s that k = + 3 mod 8 (12) X c : 1 mod 2 — i . e . — X D odd. For such choice of k and X Q, the period of the sequence i s 2^-2 numbers. M u l t i p l i e r s of the form 5 2 n ~ l s a t i s f y (11), f o r 5 2 n - l = ( i + l ^ n - l = X2n - 1 + (2n-l)f n - 2 i i ^ + k2 0 = l 2 1 1 " 1 + 8n-I|. + k2 [• • • = -3 mod 8 Consequently the r e l a t i o n s studied by Juneosa £l07j, by Davis and Rabinowitz , and at the National Bureau of Stan-dards £ 1 1 7 ] , have maximal period. ( i l l ) Properties. I f k and X Q are both odd, then Xj i s odd f o r a l l j . Since there are 2 P" 2 numbers i n the maxi-mal sequence (8), and 2 P _ ± d i s t i n c t odd numbers i n S, exactly one h a l f of a l l the odd numbers occur In the sequence. Further, i t may be shown, as Juneosa does f o r the case k = 5 x 3, that i f k ; -3 mod 8 and the number r occurs i n the sequence (8), then r + 2 cannot. 21+ I f k = + 3 mod 8, and r occurs i n (8), then r + 1+ cannot. Consequently the numbers of the sequence { u ^ are uniformly distributed over the set S of distinguishable numbers. If Xj i s to be always odd, i t s least s i g n i f i c a n t d i g i t must be constantly 1. The next d i g i t p o s i t i o n , by the above argument, i s likewise constant, but may be either zero or one. The p e r i o d i c i t y of d i g i t positions then increases to the l e f t , only the most s i g n i f i c a n t d i g i t f i n a l l y having period 2 P ~ 2 . For t h i s reason, i n any application requiring random d i g i t s , only the most s i g n i f i c a n t d i g i t s of each number generated may be used. (iv) Tests. Taussky and Todd, £ll7], Juncosa C107!I» Moshman [ l l 3 ] , and others have tested extensively power residue methods using k = 5 1 7 where r i s odd. Satisfactory r e s u l t s were obtained i n tests of uniformity, low order moments, runs, and c o r r e l a t i o n as determined by the s e r i a l matrix t e s t . (v) Assessment. The m u l t i p l i c a t i v e congruential method i s f a s t e r than the mid-square method, and produces a sequence which over a f u l l period i s uniformly d i s t r i b u t e d . A p r e d i c t -able and maximal length of period may be obtained by suitable choice of m u l t i p l i e r s , and such generators y i e l d sequences which pass a l l the customary tests of randomness. Lehmer' s method uses the r e l a t i o n (1+); the generator o r i g i n a l l y proposed by Lehmer (the f i r s t published account of a m u l t i p l i c a t i v e congruential method) used k = 23 M = 10 8 + 1 . 25 For binary schemes, the reduction modulo 2? + 1 may be accomplished by observing that i f we represent a number greater than 2 P In a computer with P-bit word length as a + b ' 2 p = a - b + b (2 p + 1) then (a - b) i s the le a s t residue mod 2 P + 1. Hence we may generate the sequence (li) simply by forming ^1+1 l?= k X i * 0 " 1 , 0 ? ? 1 1 ^ * n e m 0 3 t s i g n i f i c a n t b i t s of y^ beyond the pth, and subtracting the dropped portion from the remainder. The scheme has the advantage that i t removes the p e r i o d i c i t y of the low order d i g i t s i n (8), but t h i s of course i s at the cost of extra Instructions, and consequently extra time. A special case of (5) which has the advantage of greater speed i s the method proposed as Greenberger Ts Method. In th i s case the m u l t i p l i c a t i o n operation i s replaced by the f a s t e r s h i f t and add procedure; the generator used took the values k = 2 1 8 + 3, P = 35. Alternative procedures u t i l i z e (3) rather than (5), and thereby need not be r e s t r i c t e d to the odd numbers of the set S. Thus they may possess a period greater than the sequences produced by the power residue method. A modification proposed by Thompson fH93 uses the r e l a t i o n X. + 1 = (lpk+l)X + k mod 2 P or X. + 1 = (Iik+l)X. + 3k mod 2 P where k must be odd, and f o r t h i s r e l a t i o n the sequence has the f u l l period 2 P . The same i s true f o r the method proposed by Rotenberg L l l 6 ] using a s h i f t and add technique as indicated by r e l a t i o n (7), provided c i s odd. 26 With these r e s u l t s , there i s l i t t l e to be gained i n f u r -ther development of m u l t i p l i c a t i v e congruential methods. The alternative i s to attempt further improvements i n the speed of generation, by studying methods using addition rather than multi-p l i c a t i o n . Additive Congruential Methods. ( i ) Generation. The general r e l a t i o n , given t i n i t i a l values { x ^ 0 ± i £ t tJ X. A = 2 l a.X. mod 2 P ( i i i ) J + 1 " j-t+1 1 1 has been suggested as a pseudo-random generator, apparently by van Wijngaarden [ l 2 2 l , and mentioned b r i e f l y by Tocher [ l 2 C ] . The special case X ^ + i = X + X. n mod 2 P X = 0 U j = 2 - p X j X X = 1 i s known as the reduced Fibonacci series, and has been studied In some d e t a i l . I t may be generated with a single addition i n s t r u c t i o n , and i s therefore the fastest r e l a t i o n yet studied. The s l i g h t l y more general case XJ+1 1 X j + X j - n m o d 2 ? also requires a single addition with, however, the necessity of some indexing technique. Period. A f u l l analysis of the period of the reduced Fibonacci series (15) and of the series with a r b i t r a r y i n i t i a l values has been given by Wall [ l 2 0 ] . In his paper i t i s shown 27 that the period of the sequence i s 3 * 2 P ~ X and Independent of the s t a r t i n g values. ( i l l ) Properties. The reduced Pibonacci series method i s closely related to the power residue method. I f the r e l a t i o n (15) i s solved as a difference equation, then, approximately, x = [i(S$+i)V - [U/S-DV M O D 2 P . Since J ^ " 1 4. 1, as j —• oo 2 + D] J" - ^ i that i s , ^ n j ^ appptocimatesY a power residue method with k = + 1) x D = i//3f. (iv) Tests. The sequences generated by (15) have been tested by Taussky and Todd.jHll73 It i s found that, though the tests of uniformity and of moments y i e l d s a t i s f a c t o r y r e s u l t s , the runs tests Indicate serious deviations from the expected behaviour of a random sequence. Satisfactory r e s u l t s may ap-parently be obtained by discarding alternate numbers, but with t h i s modification the method offers no advantage over the power residue methods. The more general r e l a t i o n (16) has been studied by Green et a l flOOj and found to be s a t i s f a c t o r y only i f either a l t e r -nate numbers were discarded, or n was taken greater than or equal to l 6 . Either measure involves some programming Incon-venience, and detracts s i g n i f i c a n t l y from the advantages of an additive method. 28 Multiple Method A proposal of t h e o r e t i c a l i n t e r e s t i s based on a theorem of Kroneoker and S i e r p i n s k i [ l 7 ; page 3831 which states Theorem: I f t Is i r r a t i o n a l , then the points {ntj mod 1 are uniformly distributed on £0, l l . The generating process thus consists of forming the sequence u n ^ n * m 0 <3 "1 or, i n terms of the f i n i t e computer representa-t i o n of t, may be formulated i n terms of integers, so that u n = na mod 2 p where a i s odd, and less than 2 P. The method i s not, however, used i n practice, and we s h a l l not go into a discussion of i t . 29 Chapter i i . Empirical Results Pseudo-random sequences f i n d application i n problems with widely varying c h a r a c t e r i s t i c s . Two applications programmed fo r the ALWAC III-E computer at the University of B r i t i s h Colum-b i a display i n common a feature which i s c h a r a c t e r i s t i c of a wide class of problems, and i s of importance i n determining a pseudo-random generator suitable f o r them: they both employ f a i r l y large quantities of random numbers i n small, e s s e n t i a l l y i s o l a t e d blocks. Between blocks, parameters of the c a l c u l a t i o n are varied over a preassigned range and the output of the pro-gram i s studied as a function of these parameter values. Clearly i t i s e s s e n t i a l that the i n d i v i d u a l blocks of numbers display no s i g n i f i c a n t deviation from the mean behaviour expected of a random sequence — otherwise the observed results may be found to depend In a s i g n i f i c a n t way on the blocks of pseudo-random numbers, and w i l l not accurately r e f l e c t the influence of varying parameter values. It seemed not at a l l obvious that the standard methods d i s -cussed i n Chapter 3 would be suitable under these circumstances. The problem of determining a generating procedure suited to such applications was therefore undertaken as an empirical study using the f a c i l i t i e s of the University of B r i t i s h Columbia Com-puting Centre, with the hope that empirical r e s u l t s might also suggest an analytic solution to the problem. The question of l o c a l randomness has been mentioned b r i e f l y i n several papers, [2ii] [112] [119] but has apparently not been studied to any extent. In t h i s l i t e r a t u r e , a sequence has 30 conventionally been calle d ' l o c a l l y random' to emphasize the fact that not only were the conditions of randomness s a t i s f i e d , but that also the block of numbers i n i t s e l f passed a class of tests of randomness. We have used the term pseudo-random to denote the fact that a bloek of numbers under consideration passes a class of tests of randomnessj f o r the problem at hand we wished to study the p o s s i b i l i t y that successive sub-blocks of length N i n a pseudo-random sequence be found to y i e l d i n themselves s t a t i s t i c a l l y non-significant results on a l l of a class C of tests of randomness. If t h i s condition holds, we say the sequence i s l o c a l l y pseudo-random f o r domains of order N. Then the conjecture studied f i r s t was that the standard generators described i n Chapter 3 need not be l o c a l l y pseudo-random f o r domains of order N, where N i s small r e l a t i v e to the modulus of the generator. In p a r t i c u l a r , we studied the case where N was 2 8 , 2^, or 2 1 0 — i . e . 2£6, 512, or 1021+, and the modulus was 232. On s t a t i s t i c a l grounds such a conjecture i s p l a u s i b l e . The f a c t that the standard genera-tors have been found to s a t i s f y tests of randomness over long cycles suggests immediately that i f the sequence were partitioned into blocks, about 5$ of these blocks would show deviations from expected behaviour s i g n i f i c a n t at the $% l e v e l . I f not, doubt would be cast on the randomness of the sequence. Since t h i s would be true of each of several independent tests, i t seemed possible that a substantial number of sub-blocks generated by standard methods would i n themselves be unsatisfactory f o r use as random numbers. 31 In in v e s t i g a t i n g t h i s question empirically, two tests were considered to constitute a minimal class C of tests of ran-domness: a chi-square test f o r the uniformity of d i s t r i b u t i o n i n 8 equal i n t e r v a l s of fo, l ] , and a chi-square test on the uniformity of an 8 x 8 s e r i a l matrix f ^ described i n Chapter 2. A 5$ l e v e l of significance was employed. Thus, i n testing blocks of 256 numbers, the d i s t r i b u t i o n over the unit Interval was considered to deviate s i g n i f i c a n t l y from uniformity i f , employing the notation of Chapter 2, 8 X 2 = 21 (f,-32) 2/32 > l l i . l 1 i = l 1 where l l i . l i s the 95$ value of the tabulated chi-square d i s t r i -butions f o r 7 degrees of freedom. Using the f a c t that 8 8 X§ - X 2 = H ( f i r i + ) 2 A - 21 (f -32)2/2 d 1 i ) j = 1 l j i = l 1 i s asymptotically distributed as chi-square with 56 degrees of freedom, and using the normal approximation to the chi-square d i s t r i b u t i o n , the d i s t r i b u t i o n of p a i r s within the matrix was said to deviate s i g n i f i c a n t l y from uniformity i f /2X2 - ffi > 1.61+ i . e . If 74.18 where 1.6i|. Is the 95$ value of the unit normal curve, and r, the number of degrees of freedom, i s 56. For blocks of 512 and 102i|. numbers, a sim i l a r measure was used, with the same c r i t i c a l values. Table I summarizes the r e s u l t s of tests f o r 100 blocks of 256 numbers each, generated by each of the standard methods discussed i n Chapter 3, and for blocks of 512 and 1021+. numbers To F o l l o w Page 31 TABLE I Number o f " U n s a t i s f a c t o r y " B l o c k s I n 100 B l o c k s G e n e r a t e d b y S t a n d a r d Methods w i t h Maximum Modulus „ „ ^ Number o f B l o c k s R e j e c t e d m A , Method G e n e r a t o r B l o c k :—•= - T o t a l S i z e x£ T e s t X 2 T e s t B o t h R e j e c t e d R o t e n b e r g ' s X i + 1 = ( 2 7 + l ) X i + l 256 6 5 3 Method mod 232 8 Lehmer's X..-I=23X, 256 3 15 3 15 Method 1 x m G d |32+i F i b o n a c c i x i + i = x i + X i _ i 256 i i 11 1+ 11 S e r i e a mod 232 Power x i +i=62 ,973X i 256 5 7 0 12 R e s i d u e m £ ) d 232 Power X 1 + 1=62,973X 1 512 6 i+ 1 9 R e s i d u e m o d 232 x Power X±+1=62,973X3^ 102)+ 1+ 6 0 10 R e s i d u e m o d 232 32 each generated by the power residue method. Prom the Table i t w i l l be seen that approximately the expeoted number (10$) of sub-blocks In each case f a i l s to pass these rather weak requirements, and that serious disadvantages therefore acoom-pany the use of these generators In applications of the type we describe. It was therefore proposed to test the conjecture that a power residue method with modulus and m u l t i p l i e r chosen to y i e l d a maximal period equal to the number of elements neces-sary i n the sub-blocks would be suitable. Again there Is a t h e o r e t i c a l consideration to support the proposal, namely the theorem quoted i n Chapter 3 which states that over a f u l l period the elements of a sequence generated by a power residue method are uniformly distributed on the set S of distinguishable numbers. This conjecture also was tested and v e r i f i e d empirically. In Chapter 3 i t was mentioned that the maximal period f o r the power residue method with modulus 2 P i s 2 P ~ ^ a n a- -that this period i s attained i f the m u l t i p l i e r k s a t i s f i e s k = t 3 mod 8 . Therefore moduli 2 X 0 , 2 X X , 2 x 2 were employed to generate sequences with maximal period 2^, 2^, 2 x 0 , r e s p e c t i v e l y . A l l possible m u l t i p l i e r s of the form k = + 3 mod 8 were employed to generate blocks of length equal, i n each case, to the period of the generator. These blocks were a p r i o r i uniformly d i s t r i b u t e d ; It was found, with certain s t r i k i n g exceptions, that they also displayed 33 no s i g n i f i c a n t deviation from uniformity over the s e r i a l matrix. The blocks displaying s i g n i f i c a n t deviations from expected be-haviour were consistently associated with m u l t i p l i e r s f a l l i n g i n p a r t i c u l a r residue classes. This information i s summarized i n Table I I , and on the basis of t h i s Table i t i s possible to specify a method which guarantees that each sub-block of length N (N = 2 5 6 , 5 1 2 , 102ii.) i n a long sequence of pseudo-random numbers w i l l pass the two specified t e s t s . That i s , by r e s t r i c t -ing the m u l t i p l i e r s k so as not to l i e i n any one of the residue classes 3 , 5 , 1*3, 5 1 , 8 5 , 1 2 5 , 1 3 1 , 1 7 1 , 2 0 5 , 2 1 3 , 2 5 1 , or 253 (mod 2 5 6 ) , one can construct a sequence of length greater than 5 0 , 0 0 0 numbers which i s l o c a l l y pseudo-random f o r domains of order 2 5 6 ; by r e s t r i c t i n g k so as not to l i e i n any one of the residue classes 3 , 5 , 5 1 , 8 5 , 1 7 1 , 2 0 5 , 2 5 1 , or 253 (mod 2 5 6 ) one can construct a sequence of length greater than 2 2 0 , 0 0 0 numbers which i s l o c a l l y pseudo-random for domains of order 5 1 2 ; by r e s t r i c t i n g k so as not to l i e i n any one of the residue classes 3 , 5 , 1 1 , 1 3 , 5 1 , 5 9 , 8 5 , 9 3 , 1 6 3 , 1 7 1 , 1 9 7 , 2 0 5 , 2J+5, 2 5 1 , or 253 (mod 2 5 6 ) , one can construct a se-quence of length greater than 800 ,000 which i s l o c a l l y pseudo-random f o r domains of order IO2I4.. For our problem this i s an important r e s u l t . It gives empirical corroboration of the idea that arithmetic methods may be selected which eliminate s i g n i f i c a n t l o c a l fluctuations i n a pseudo-random sequence designed f o r uses i n which l o e a l randomness i s e s s e n t i a l . Further, the fact that m u l t i p l i e r s associated with "unsatisfactory" blocks were observed to be confined to a r e l a t i v e l y small number of residue classes To Follow Page 33 TABLE II Residue Glasses Containing M u l t i p l i e r s Y i e l d i n g "Unsatisfactory" Blocks i n Power Residue Generators With Maximal Period Equal to the Length of Block Block Size (28)256 (29)512 (2l°)102li Modulus 2 i o 211 212 Number of M u l t i p l i e r s k, k=+3mod8 256 512 102li k< modulus Residue 3 3 3 Classes 5 5 5 (mod 256) 51 51 51 ^Containing 85 85 85 "Unsatisfactory" 171 171 171 M u l t i p l i e r s 205 205 205 251 251 251 253 253 253 • • • i+3 • • • • • • n 125 13 131 59 213 93 163 197 2ii5 3ii supports the b e l i e f that a number theoretic c r i t e r i o n may be found to characterize those m u l t i p l i e r s whioh y i e l d blocks with s t a t i s t i c a l l y non-significant properties under a given class of t e s t s . Since the empirical c l a s s i f i c a t i o n of multi-p l i e r s was to some extent consistent through the three moduli tested, i t seems l i k e l y that any number-theoretic c r i t e r i o n would hold f o r a wider range of block s i z e s . These r e s u l t s demonstrate that a generator of the power residue type with small modulus can be suitable when small blocks of random numbers are required and the two tests we des-cribed are s u f f i c i e n t l y searching. Such a generator may be programmed p r e c i s e l y as i n the case with maximum modulus; the only change i s a r e s c a l i n g of the i n i t i a l value X Q as stored i n the computer. Gn the other hand, the selection of suitable m u l t i p l i e r s w i l l not always be a convenient procedure to pro-gram i n computer applications. I t would be desirable to have a generating scheme which required only a single m u l t i p l i e r . For this reason an extension of the study was attempted, and achieved li m i t e d success. In Tables III and IV are displayed the r e s u l t s of tests performed on sets of 100 blocks of 2£6 numbers each, obtained using generators with modulus ranging from to 2 1 7 . Since the length of the component blocks no longer Is equal to the f u l l period of the generators, i t i s no longer to be expected that the numbers i n a block w i l l be uniformly d i s t r i b u t e d . Table III sets out for each modulus the empirical d i s t r i b u t i o n of the observed deviations from u n i -formity measured by the X£ s t a t i s t i c . Since t h i s s t a t i s t i c Is expected under the n u l l hypothesis to have asymptotically To Follow Page 34 TABLE III Dis t r i b u t i o n for Selected Moduli of 100 Values of xi Tabulated . X 2 5$ 10% 20$ 30% $0% 70% Qo% 90$ 95$ Percentile . . - . . Tabulated X 2 2.17 2.83 3.82 1+.67 6.35 8.38 9.80 12.0 l l i . l Larger Value Period of Generator No. of Observed 1 Values F a l l i n g i n Indicated Intervals 1x256 100 . . . 2x256 22 14 20 12 16 8 4 4 4x256 16 16 20 20 12 12 4 8x256 11 1 11 4 21 24 14 0 I k 16x256 4 4 4 14 36 14 10 8 6 32x256 7 4 10 13 23 12 12 12 7 61ix256 3 7 12 7 30 22 9 7 2 1 128x256 4 3 10 8 27 22 7 9 9 1 222X256 8 3 6 13 20 26 5 '13 l 5 To Follow Page 3l+ TABLE IV D i s t r i b u t i o n f o r Selected Moduli of 100 Values of X| Tabulated Percentile 10$ 20% 30% $0% 6o% 70% 80% 90% 9$% Tabulated •-X 2 3 9 . 6 1+2.9 1+7.1 5 0 . 2 5 3 . 0 5 5 . 5 5 8 . 2 6 l . 2 61+. 8 6 9 - 9 71+.2 Larger Value Period of Generator Number of Observed Values F a l l i n g i n Indicated Intervals 1 x 2 5 6 ^ ioo 2x256 76 k 12 2 k 2 I+x256 76 k 16 k 8 x 2 5 6 11 11 18 lk 2k 5 6 3 1 5 2 1 6 x 2 5 6 k 10 20 5 11 7 9 8 8 lk It 3 2 x 2 5 6 7 16 10 12 22 12 1+ 6 5 k 2 6i+x256 6 6 5 12 ik 8 12 13 8 10 6 1 1 2 8 x 2 5 6 8 5 7 13 20 11 5 8 7 8 6 2 2 2 2 X 2 5 6 3 3 10 11 9 10 1 6 8 13 7 3 7 35 the chi-square d i s t r i b u t i o n with 7 degrees of freedom, the Table shows the number from 100 observed values of X£ which f a l l i n each i n t e r v a l defined by tabulated *)£_2 values. Like-wise, Table IV shows fo r each modulus the empirical d i s t r i b u t i o n of observed values of the s t a t i s t i c x| ; the i n t e r v a l s are defined by the normal approximation to the chi-square d i s t r i b u -t i o n with 5© degrees of freedom. These Tables show that i t Is possible, using a single mul-t i p l i e r , to generate sequences of length up to 8192 numbers which are l o c a l l y pseudo-random f o r domains of order 256. A method has so f a r not been found which w i l l y i e l d longer l o c a l l y pseudo-random sequences using only a single m u l t i p l i e r . Never-theless, as Table IV shows, a substantial improvement over the standard generators has been achieved. Using moduli of 2^, 2X7, sequences of length up to 32,000 have been generated i n which no more than 2% i n t o t a l of the sub-blocks of length 256 display s i g n i f i c a n t r e s u l t s on either of the two tests used. We note also that as i n a l l the above cases, the length of the sequence may be doubled merely by taking a new i n i t i a l value X Q from among the «• 2^-2 odd P-bit numbers not i n the sequence of 2P~2 numbers produced by the generator with modulus 2 P and a given i n i t i a l value. By the theorem quoted i n Chapter 3, i f r i s i n the set of numbers produced by a generator with i n i t i a l value X Q, and k 5 -3 mod 8, then r + 2 w i l l not be i n the set, and hence X Q + 2 i s a suitable i n i t i a l value to generate another sequence ( d i s j o i n t from the f i r s t ) of 2P**2 numbers; i f k = 3 mod 8, then r. + k i s a suitable new i n i t i a l " value. 36 These r e s u l t s indicate that even f o r applications requiring f a i r l y large numbers of pseudo-random numbers, It i s possible to f i n d a generating scheme involving only one m u l t i p l i e r which y i e l d s a sequence i n which fewer than 2 or 3% of the sub-blocks of length 2^6 need to be rejected as deviating s i g n i f i c a n t l y from properties of randomness. For many applications, therefore, our procedure i s a decided improvement over standard generators. When i t i s convenient to allow, i n a computer program, f o r the selection of m u l t i p l i e r s , the use of generators with small modulus can guarantee l o c a l pseudo-randomness. Even when the use of more than one m u l t i p l i e r i s not convenient, a substantial improvement i n the "qua l i t y " of small blocks of numbers may s t i l l be effected by the use of a generator with the least modulus i n excess of the t o t a l number of numbers required. 37 C h a p t e r Summary and C o n c l u s i o n s Randomness i s an e l u s i v e c o n c e p t , and t h e p u r s u i t o f r a n -domness a r a t h e r u n c e r t a i n t a s k . B e h i n d t h i s t h e s i s i s the i d e a t h a t a major advantage i n t h e g e n e r a t i o n of p s e u d o - r a n d o m numbers by a r i t h m e t i c methods Is t h e e x p e r i m e n t e r ' s c o n t r o l o v e r h i s medium. As J u n e o s a p o i n t s out Zl07~}i i t i s p o s s i b l e , w i t h d e t e r m i n i s t i c methods, t o a v o i d t h e s i g n i f i c a n t d e v i a t i o n s f r o m mean b e h a v i o u r w h i c h i n e v i t a b l y o c c u r i n a t r u l y random s i t u a t i o n , and w h i c h may g i v e r i s e t o s e r i o u s l y m i s l e a d i n g r e -s u l t s . We may a v o i d , i f we w i s h , t h e p o s s i b i l i t y o f s i g n i f i c a n t l o c a l n o n - r a n d o m n e s s . I f , t h e r e f o r e , we t a k e our chances on a p s e u d o - r a n d o m sequence of l o n g p e r i o d i n s i t u a t i o n s i n w h i c h l o c a l randomness i s c r i t i c a l , we are n o t u t i l i z i n g t h e a d v a n -t a g e s o f f e r e d by a r i t h m e t i c p r o c e d u r e s . T h i s t h e s i s has n o t s e t t l e d t h e q u e s t i o n o f g e n e r a t i n g l o c a l l y p s e u d o - r a n d o m s e q u e n c e s . Nor Is i t p r e t e n d e d t h a t t h e e m p i r i o a l s t u d y h e r e d e s c r i b e d i s any s u b s t i t u t e f o r an a n a l y t i c a l t r e a t -ment o f t h e p r o b l e m i f s u c h be p o s s i b l e . On the o t h e r h a n d , t h e s t u d y has d e m o n s t r a t e d t h a t i n some q u i t e p l a u s i b l e c i r c u m -s t a n c e s the s t a n d a r d g e n e r a t o r s a r e n o t s u i t a b l e . I t has demon-s t r a t e d t h a t b e t t e r p r o c e d u r e s can be d e v i s e d , and t h a t t h e r e i s hope a t l e a s t f o r an e m p i r i c a l c l a s s i f i c a t i o n o f l o c a l l y p s e u d o -random g e n e r a t o r s . T h e r e Is no doubt i n the w r i t e r ' s mind t h a t t h e p r o b l e m i s w o r t h f u r t h e r s t u d y . 38 B I B L I O G R A P H Y Chapter 1. Applications and Preliminary Theory 1. Bauer, W. P. The Monte Carlo Method. J. Soc. Indust. Appl. Math.,vol. 6, Dec. 1958. 2. Box, G. E. P. and Muller, M. E. A Note on the Generation of Normal Deviates. Ann. Math. S t a t i s t . , v o l . 28, 1958, p. 610. 3. Brown, G. W. History of RAND's Random D i g i t s — A Summary. Monte Carlo Method, National Bureau of Standards, Applied Mathematics Series #12, p. 31. if. Butcher, J. C. Random Sampling from the Normal D i s t r i b u -t i o n . The Computer Journal, v o l . 3, #!+> 19&1, P» 251. 5 . Butler, J. W. Machine Sampling from Given P r o b a b i l i t y D i s t r i b u t i o n s . Symposium on Monte Carlo Method, H. A. Meyer (Ed.), New York, Wiley, 195b. 6. Cameron, J . M. Monte Carlo Experiments on SEAC. National Bureau of Standards SEL Working Paper SEL-52-5. 7. Cashwell, E. D. and Everett, G . J . A P r a c t i c a l Manual on the Monte Carlo Method f o r Random Walk Problems. Interna-t i o n a l Tracts In Computer Science and Technology. Permagon Press, Los Angeles, l959« 8. Church, A. E. R. On the Means and Squared Standard Devia-tions of Small Samples from Any Population. Biometrika, 18, 1926, p. 321. 9. Clark, C. E. The U t i l i t y of S t a t i s t i c s of Random Numbers. Operations Research, v o l . 8, i960. 10. Curtiss, J . H. Sampling Methods Applied to D i f f e r e n t i a l and Difference Equations. Seminar on S c i e n t i f i c Computation, I.B.M., New York, 19J+9. 11. Davis, P. and Rabinowitz, P. Some Monte Carlo Experiments i n Computing Multiple Integrals. Math. Tables Aids Comput., v o l . 10, p. 1. 12. Fisher, R. A. and Yates, F. S t a t i s t i c a l Tables f o r B i o l o g i -c a l , A g r i c u l t u r a l , and Medical Research. Edinburgh, Oliver and Boyd, 1953. 13. Fisher, R. A. S t a t i s t i c a l Methods for Research Workers. Edinburgh, 1+th e3~. Oliver & Boyd, 1932, p. 03. i l l . Franklin, J . N. On the E q u i d i s t r i b u t i o n of Pseudo-Random Numbers. Quart. Appl. Math., v o l . xvi, #2, July, 1958. 39 15. Gage, R. Contents of TIppett's Random Sampling Numbers. J. Amer. S t a t i s t . Assoc., v o l . 38, 1943. 16. Haramersley, J . M. Note on El e c t r o n i c Computers and the Analysis of Stochastic Processes. Math. Tables Aids Comput., v o l . 1+, 1950» P. 56. 17. Hardy, G. H. and Wright, E. M. An Introduction to the Theory of Numbers. Oxford, Clarendon Press, 195b. 18. Hoel, P. G. Introduction to Mathematical S t a t i s t i c s . New York, J . Wiley & Sons, Inc., 19L/7. 19. Holz, B. W. and Clark, C. E. A Table of Exponentially Distributed Pseudo-Random Numbers. Operations Research Office - SP - 10b1, Washington, D. C., 1958. 20. Jones, H. L. How Many of a Group of Random Numbers w i l l be Usable i n Selecting a P a r t i c u l a r Sample. J . Amer. S t a t i s t . Assoc., v o l . 54, #285, 1959, PP. 102-122. 21. Kahn, H. Applications of Monte Carlo. Rand Report, #RM 1237, AEC, 1954. 22. Kendall, M. G. A Theory of Randomness. Biometrika 32, 1941. 23. Kendall, M. G. The Advanced Theory of S t a t i s t i c s . V o l . I I . London, G r i f f i n & Co., 191itJ. 2i+. Kendall, M. G. and Babington-Smith, B. Randomness and Random Sampling Numbers. J . Roy. S t a t i s t . Soc., v o l . Cl> 1938. 25. Kendall, M. G. and Babington-Smith, B. Second Paper on Random Sampling Numbers. J . Roy. S t a t i s t . Soc., v o l . VI, (Supp.), 1939, P. 51. 26. Kendall, M. G. and Babington-Smith, B. Random Sampling Numbers. Department of S t a t i s t i c s , University College, London. 27. Metropolis and Ulam. The Monte Carlo Method. J. Amer. S t a t i s t . Assoc., v o l . l+li, 1949, p. 335. 28. Meyer, H. A. (Ed.) Symposium on Monte Carlo Methods. New York, Wiley, 195bv~ 29. Muller, M. E. Inverse Method f o r Generation of Random Normal Deviates on Large-Scale Computers. Math. Tables Aids Comput., 12, 1958. 30. Musk, P. I. A Monte Carlo Simulation of a Production Planning Problem. Computer J., v o l . 2, #2, p. 90. ko 31. Monte Carlo Method. Applied Mathematics Series #12. Washington, D. C , U. S. Dept. of Commerce, National Bureau of Standards, 1951« 32. Neate, R. and Dacey, W, J. A Simulation of Melting Shop Operations. Computer J., v o l . 2, #2, p. 59. 33. RAND Corportion, The. A M i l l i o n Random Digits with 100,000 Normal Deviates. Free Press Publishers, Glencoe, I l l i n o i s , 1 ^ 3i+. Strachey, C. Two Contributions to the Techniques of Queuing Problems. Computer J.,, v o l . 3, #2, p. lll+. 35. "Student". The Probable Error of the Mean. Biometrika 6, 1908, p. 1-25. 36. Tippett, L. H. C. Random Sampling Numbers. Tracts f o r Computers #XV, Cambridge University Press, 1950. Chapter 2. S t a t i s t i c a l Tests f o r Randomness 37. Anderson, R. L. The D i s t r i b u t i o n of the S e r i a l Correlation C o e f f i c i e n t . Ann. Math. S t a t i s t . , v o l . 13, 191+2, p. 1. 38. Bartholomew, D. J. Note on Sherman's S t a t i s t i c as a Test of Randomness. Biometrika 1+1, p. 556. 39. B a r t l e t t , M. S. The Frequency Goodness of F i t Test f o r Probability Chains. Proc. Cambridge Philos. S o c , v o l . I4.7, 1951, PP. 86-95. 1+0. Berkson, J. Some D i f f i c u l t i e s of Interpretation Encountered i n the Application of the Chi-Square Test. Amer. S t a t i s t . Assoc., v o l . 33, 1938. 111. Cameron, J. M. Results of Some Tests of Randomness on Pseudo-Random Numbers. Ann. Math. S t a t i s t . , v o l . 23, 1952, (Abstract). i+2. Camp, B. H. Multinomial Solid and the Chi Test. Trans. Amer. Math. S o c , v o l . 31, 1929, p. 133. 1+3. Goohran, W. G. T h e O C 2 Test of Goodness of F i t . Ann. Math. S t a t i s t . , v o l . 23, 1952, p. 315. kk* Coveyou, R. R. S e r i a l Correlation i n the Generation of Pseudo-Random Numbers. J. A s s o c Comput. Mach., v o l . 7, I960, pp. 72. 1+5. David, F. N. Combinatorial Tests of Whether a Sample Has Come from a Given Population, Biometrika 37, 1956. 1+1 D i x o n , W. J . F u r t h e r C o n t r i b u t i o n s t o P r o b l e m s o f S e r i a l C o r r e l a t i o n . Ann. Math. S t a t i s t . , v o l . 15, 1944, p . 119. Dodd, E . L . The L e n g t h o f C y c l e s W h i c h R e s u l t f r o m t h e G r a d u a t i o n o f Chance E l e m e n t s . Ann. Math. S t a t i s t . , v o l . 10, 1939, P . 254. Dodd, E . L . C e r t a i n T e s t s f o r Randomness A p p l i e d t o D a t a Grouped i n t o S m a l l S e t s . E c o n o m e t r i c a 10, 191+2, p . 2ii9. F i s h e r , R. A. On t h e Random S e q u e n c e . Q u a r t e r l y J o u r n a l o f t h e R o y a l M e t e o r o l o g i c a l S o c i e t y , v o l . $2, 1926, p . F r a s e r , D. A. S. N o n p a r a m e t r i e Methods i n S t a t i s t i c s . New Y o r k , J o h n W i l e y & Sons, I n c . , 1957. F r y , T. C. The ")C2 T e s t o f S i g n i f i c a n c e . J . Amer. S t a t i s t . A s s o c . , v o l . 33, 1938, p . 513. Good, I . J . The S e r i a l T e s t f o r S a m p l i n g Numbers and O t h e r T e s t s f o r Randomness. Cambridge P h i l o s . S o c . P r o c , v o l . 49, 1953, P P . 276-284. Good, I . J . On t h e S e r i a l T e s t f o r Random S e q u e n c e s . Ann. Math. S t a t i s t . , v o l . 28, 1957, p . 262. Greenwood, R. E . Coupon C o l l e c t o r s T e s t f o r Random D i g i t s . Math. T a b l e s A i d s Comput., v o l s . 9-10, 1955-1956, p . 1. G r u e n b e r g e r , F . T e s t s o f Random D i g i t s . Math. T a b l e s A i d s Comput., v o l . 4, P« 2i|i|.. G r u e n b e r g e r , F . and Mark, A. M. The d 2 T e s t o f Random D i g i t s . Math. T a b l e s A i d s Comput., v o l . 5, 1951, p . 109. Gumbel, E . J . On t h e R e l i a b i l i t y o f t h e C l a s s i c a l X 2 T e s t . Ann. M a t h . S t a t i s t . , v o l . i l l , 1943. H o e l , P. G. On t h e C h i - S q u a r e D i s t r i b u t i o n f o r S m a l l S a m p l e s . Ann. Math. S t a t i s t . , v o l . 9, 1938, p . 158. H u n t e r , D. G. N. Note on a T e s t f o r R e p e a t i n g C y c l e s i n a Pseudo-Random Number G e n e r a t o r . Computer J . , v o l . 3, I960. I y e r , P. V. K., & Rao, A. S. P. T h e o r y o f P r o b a b i l i t y D i s t r i b u t i o n o f Runs i n Sequences o f O b s e r v a t i o n s . I n d i a n S o c . A g r i c u l t u r a l S t a t i s t . J . , v o l . 5, 1953, p p . 2 9 - 7 7 t Kermack, W. 0 . and M c K e n d r i c k , A. G. T e s t s f o r Randomness i n a S e r i e s o f N u m e r i c a l O b s e r v a t i o n s . P r o c . Roy. S o c , 1936-1937, v o l . 57, P a r t 3, p p . 228-24o. 1+2 62. Koopmans, T . C . S e r i a l C o r r e l a t i o n and Q u a d r a t i c Forms i n Normal V a r i a b l e s . A n n . M a t h . S t a t i s t . , v o l . 13, 194 2 , p . l)+. 63. L e v e n e , H . & W o l f o w i t z , J . On the Power F u n c t i o n o f T e s t s o f Randomness Based on Runs Up and Down. A n n . M a t h . S t a t i s t . , v o l . 23, 1952, p . 34. 64. L e v e n e , H . & W o l f o w i t z , J . The C o v a r i a n c e M a t r i x o f Runs Up and Down. A n n . M a t h . S t a t i s t . , v o l . 15, 1944, p . 58. 65. McEwen, G . F . The R e a l i t y o f R e g u l a r i t i e s I n d i c a t e d i n a Sequence o f O b s e r v a t i o n s . P r o c . B e r k e l e y Symposium on M a t h . S t a t i s t , and P r o b a b i l i t y , E d . J . Neyman, U n i v e r s i t y o f C a l i f o r n i a P r e s s , 1949. 66. Mann, H . B . On a T e s t f o r Randomness Based on S i g n s o f D i f f e r e n c e s . A n n . M a t h . S t a t i s t . , v o l . l 6 , 1945, p . 193. 67. M e t r o p o l i s , N . G . , R e i t w i e s n e r , G . , and von Neumann, J . S t a t i s t i c a l T r e a t m e n t o f F i r s t 2,000 D e c i m a l D i g i t s o f e and o f 7T C a l c u l a t e d on the E n i a c . M a t h . T a b l e s A i d s C o m p u t . , v o l . i i , p . 109. 68. Mood, A . M. The D i s t r i b u t i o n Theory o f R u n s . A n n . M a t h . S t a t i s t . , v o l . 11, I9I4.O, p . 367. 69. Moore, G . H . and W a l l i s , W. A . Time S e r i e s S i g n i f i c a n c e T e s t s Based on S i g n s o f D i f f e r e n c e s . J . Amer. S t a t i s t . A s s o c , v o l . 38, 1943, P P . 153-165. 70. Moore, P . G . A S e q u e n t i a l T e s t f o r Randomness. B i o m e t r i k a 40, 1953. 71. N a i r , K . N . On T i p p e t t ' s Random S a m p l i n g Numbers. , Sankhya i+, 1938, p . 65.-72. P e a r s o n , K . On the C r i t e r i o n t h a t a G i v e n System o f D e v i a t i o n s f r o m the P r o b a b l e i n the Case o f a C o r r e l a t e d System o f V a r i a b l e s I s Such t h a t i t can R e a s o n a b l y be S u p p o s e d ' t o Have A r i s e n f r o m Random S a m p l i n g . P h i l o s o p h i -c a l M a g a z i n e , v o l . 50, 1900, p . 157. 73. R u b i n , H . On the D i s t r i b u t i o n o f the S e r i a l C o r r e l a t i o n C o e f f i c i e n t . A n n . M a t h . S t a t i s t . , v o l . 16, 1945, P . 211. 74. S l u t z k y , E . The Summation o f Random Causes as a Source o f C y c l i c P r o c e s s e s . E o o n o m e t r i o a , v o l . 5, 1937, p p . 105-146. : 75. Swed, F . S . and E i s e n h a r t , C . T a b l e s f o r T e s t i n g Random-n e s s o f G r o u p i n g i n a Sequenoe o f A l t e r n a t i v e s . A n n . M a t h . S t a t i s t . , v o l . 14, 1943, p . 66. 76. T a t e , M. W. and C l e l l a n d , R . C . N o n - P a r a m e t r i c and S h o r t -c u t S t a t i s t i c s . I n t e r s t a t e P r i n t e r s and P u b l i s h e r s , 1957. 1*3 77. Thompson, W. E. Ernie — A Mathematical and S t a t i s t i c a l Analysis. J . Roy. S t a t i s t . Soc. Ser. A, v o l . 122, p. 301. 78. Tintner, 0 . Tests of Significance i n Time Series. Ann. Math. S t a t i s t . , v o l . 10, 1939, p. ll+l. 79. Tompkins, G. B. The RAND Corporation's One M i l l i o n D i g i t s . Math. Tables Aids Comput., v o l . 10, 1956, p. 39. 80 . von Neumann, J . The D i s t r i b u t i o n of the Ratio of Mean Square Successive Difference to the Variance. Ann. Math. S t a t i s t . , v o l . 12, 191+1, p. 367. 8 1 . von Neumann, J . , Kent, R. H., Bellman, H. R., and Hart, B.I. The Mean Square Successive Difference. Ann. Math. S t a t i s t . , v o l . 12, 1951, p. 153. 8 2 . von Schelling, H. Coupon Co l l e c t i n g f o r Unequal Probabi-l i t i e s . Amer. Math. Monthly, v o l . 6 1 , 195k, pp. 306-311. 83. Wald, A. A. and Mann, H. B. On the Choice of the Number of Intervals i n the Application of the Chi-Square Test. Ann. Math. S t a t i s t . , v o l . 13, 191+2, p. 306. 8l+. Wald, A. and Wolfowitz, J . An Exact Test f o r Randomness i n the Non-Parametric Case Based on S e r i a l Correlation. Ann. Math. S t a t i s t . , v o l . ll+, 191+3, p. 378. -85. Wald, A. and Wolfowitz, J . On a Test Whether Two Samples are Prom the Same Population. Ann. Math. S t a t i s t . , v o l . 1 1 , 191+0, p. 11+7. 86. Williams, C. A. On the Choice of the Number and Width of Glasses for Chi-Square Test of Goodness of P i t . J . Amer. S t a t i s t . Assoc., v o l . 1+5, 1950. 87. Williams, J . D. Moments of the Ratio of the Mean Square Successive Difference to the Mean Square Difference i n Samples from a Normal Universe. Ann. Math. S t a t i s t . , v o l . 12, 191+1, p. 239. 88. Wilks, S. S. Order S t a t i s t i c s . Amer. Math. Soc. B u l l . , v o l . 51+, 191+8, p. 6 . 8 9 . Wolfowitz, J . Note on Runs of Consecutive Elements. Ann. Math. S t a t i s t . , v o l . 15, 191*4, P» 97. 90. Wolfowitz, J . Asymptotic D i s t r i b u t i o n of Runs Up and Down. Ann. Math. S t a t i s t . , v o l . 15, 19kk, p. 163. 91. Wolfowitz, J . On the Theory of Runs with Some Applications to Quality Control. Ann. Math. S t a t i s t . , v o l . li+, 191+3, p. 280. : 54 92. Young, L. C. On Randomness i n Ordered Sequences. Ann, Math. S t a t i s t . , v o l . 12, 1951, p. 293. 93. Yule, G. U. A Test of Tippett's Random Sampling Numbers, J. Roy. S t a t i s t . S o c , v o l . 101, 1938, p. 167. Chapter 3» Arithmetic Methods f o r the Generation of Pseudo-Random Sequences" 91+.. Arthur, A. 0 . Random D i g i t Generation. Computing News, Sept., 1956. 95. Bofinger, E. and Bofinger, V. I, A Periodic Property of Pseudo-Random Sequences. J. Assoc Comput. Maoh., 1958, pp. 261-265. 95. b.Certaine, J . E. On Sequences of Pseudo-Random Numbers of Maximal Length. J . Assoc. Comput. Mach., v o l . 5, 1958. 96. Duparc, H. J. A., Lekkerkerker, G. G., and Peremans, W. Reduced Sequences of Integers and Pseudo-Random Numbers. Mathematics Centrum, Amsterdam, Report ZW 1953 - 002, z w 1952 -013. 97* Edmonds, A. R. The Generation of Pseudo-Random Numbers on El e c t r o n i c D i g i t a l Computers. Computer J., v o l . 2, I960, p. 181. 9 8 , Porsythe, G. E. Generation and Testing of Random Dig i t s at the National Bureau of Standards. Monte Carlo Method, National Bureau of Standards. A.M.S. 12, 1951, p. 3k. 99. Porsythe, G. E. Generation and Testing of 1,217,370 Random Binary D i g i t s on SWAC. B u l l . Amer. Math. S o c , v o l . 57, 1951, P. 305. 100. Green, B. P. J r . , Smith, J. E. K . , and Klem, L. Empirical Tests of an Additive Random Number Generator. J. Assoc Comput. Mach., v o l . 6, #5, 1959, p. 527. 101. Greenberger, M., Orrcut, G. H., and R i v l i n , A. M. Decision Unit Models and Simulation of the United States Economy. 102. Hammer, P. C. The Mid-Square Method of Generating D i g i t s , Monte Carlo Method. National Bureau of Standards. A.M.S. WIT. 103. Horton, H. B. A Method f o r Obtaining Random Numbers. Ann. Math. S t a t i s t . , v o l . 19, 19)48, pp. 81-85. 105. Horton, H. B. and Smith, R. T. A Direct Method f o r Pro-ducing Random D i g i t s i n Any Number System. Ann. Math. S t a t i s t . , v o l . 20, 1959, pp. 82-90. 45 105. International Business Machines. Reference Manual — Random Number Generating and Testing, IBM 1958. 106. Johnson, D. L. Generating and Testing Pseudo-Random Numbers on IBM 701. Math. Tables Aids Comput., v o l . 10, 1956, pp. 8-13. 107. Juneosa, M. L. Random Number Generation of the BRL High Speed Computing Machines. B a l l i s t i c s Research Laboratories. Report #855. Aberdeen Proving Ground, Maryland, 1953. 108. Lehmer, D. H. Mathematical Methods i n Large-Scale Computing Units. Annals of the Computation Laboratory of Harvard University. Proceedings of a Second Symposium on Large- Scale D l g i t a l ~ C a l c u l a t i n g Machinery, 1951. 109. Lehmer, D. H. Review of Juneosa 'Random Number Generation of the BRL High Speed Computing Machines'. Math. Reviews, v o l . 15, 1954, P. 559. 110. Mauchly, J . W. Pseudo-Random Numbers. (Address) American S t a t i s t i c s Association, 1949. 111. Metropolis, C. Phase Shifts — Middle Squares — Wave Equation. Symposium on Monte Carlo Methods. H. A. Meyer (Ed.) New York, Wiley, 195b. " 112. Meyer, H. A., Gephart, L. S. and Rasmussen, N. L. On the Generation and Testing of Random D i g i t s . WADC Tech. Rep. 54-55, Wright Patterson A i r Force Base - Ohio - 1954. 113. Moshman, J. The Generation of Pseudo-Random Numbers on a Decimal Calculator. J . Assoc. Comput. Mach., v o l . 1, 1954* 114. Page, E. S. Pseudo-Random Elements f o r Computers. Appl. S t a t i s t . , v o l . 8, 1959. 115. Reitwiesner, G. An Eniac Determination of [\ and e to More Than 2,000 Decimal Places. Math. Tables Aids Comput., v o l . 4, P.. 11. 116. Rotenberg, A. A New Pseudo-Random Number Generator. J . Assoc. Comput. Mach., v o l . 7, i960. 117. Taussky, 0. and Todd, J . Generation and Testing of Pseudo-Random Numbers. Symposium on Monte Carlo Methods. Ed. H. A. Meyer. New York, Wiley, 195b, pp. 15-25. 118. Teichroew, D. D i s t r i b u t i o n Sampling with High Speed Com-puters. Di s s e r t a t i o n . University of North Carolina, 1953. 119. Thompson, W. E. A Modified Congruence Method of Generating Pseudo-Random Numbers. Computer J., v o l . 1, p. 83. 46 120. Tocher, K. D. The Application of Automatic Computers to Sampling Experiments. J. Roy. S t a t i s t . Soc., v o l . l6, 1954, P. 39. 121. Tompkins, C. B. (reviewer) Random Number Generating -Icosa-hedral Dice, Mathematics of Computation, v o l . 15, #73, January, 19ol, p. 9k. 122. van Wijngaarden, J . Report on Proceedings of Symposium on D i g i t a l Computation. N. P. L., Spring, 1953. 123. Vickery, C. W. On Drawing a Random Sample Prom a Set of Punched Cards. J. Roy. S t a t i s t . S o c , v o l . 6, 1939, p. 63. 12lx. von Neumann, J . Various Techniques Used i n Connection with Random D i g i t s . Monte Carlo Method. National Bureau of Standards, AMS #12, p. 30. 125. Votaw, D. P. Jr . , and Rofferty, J . A. High Speed Sampling. Math. Tables Aids Comput., v o l . 5, 1951, pp. 1-8. 126. Wall, D. D. Fibonacci Series Modulo m. Amer. Math. Monthly, v o l . 67, #6, p. 525. 127. Walsh, J. E. Concerning Compound Randomization i n the Binary System. Ann. Math. S t a t i s t . , v o l . 20, 1949, pp. 580-589. 128. Walsh, J. E. On a Method of Obtaining Random Binary D i g i t s by F l i p p i n g Coins. Project RAND. Santa Monica, C a l i f o r n i a , July 1948. 129. Walsh, J. E. An Experimental Method f o r Obtaining Random Digits and Permutations. Sankhya 17, 1956/57. Addendum: 130. Mann, H. B. and Whitney, D. R. On a Test of Whether One of Two Random Variables i s Stoc h a s t i c a l l y Larger Than the Other. Ann. Math. S t a t i s t . , v o l . 18, 1947, p. 50.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- An empirical study of locally pseudo-random sequences
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
An empirical study of locally pseudo-random sequences Dobell, Alan Rodney 1961
pdf
Page Metadata
Item Metadata
Title | An empirical study of locally pseudo-random sequences |
Creator |
Dobell, Alan Rodney |
Publisher | University of British Columbia |
Date Issued | 1961 |
Description | In Monte Carlo calculations performed on electronic computers it is advantageous to use an arithmetic scheme to generate sets of numbers with "approximately" the properties of a random sequence. For many applications the local characteristics of the resulting sequence are of interest. In this thesis the concept of a pseudo-random sequence is set out, and arithmetic methods for their generation are discussed. A brief survey of some standard statistical tests of randomness is offered, and the results of empirical tests for local randomness performed on the ALWAC III-E computer at the University of British Columbia are recorded. It is demonstrated that many of the standard generating schemes do not yield sequences with suitable local properties, and could therefore be responsible for misleading results in some applications. A method appropriate for the generation of short blocks of numbers with approximately the properties of a randomly selected set is proposed and tested, with satisfactory results. |
Subject |
Algebra, Abstract Number theory |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2012-01-26 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0080635 |
URI | http://hdl.handle.net/2429/40322 |
Degree |
Master of Arts - MA |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-UBC_1961_A8 D6 E6.pdf [ 3.42MB ]
- Metadata
- JSON: 831-1.0080635.json
- JSON-LD: 831-1.0080635-ld.json
- RDF/XML (Pretty): 831-1.0080635-rdf.xml
- RDF/JSON: 831-1.0080635-rdf.json
- Turtle: 831-1.0080635-turtle.txt
- N-Triples: 831-1.0080635-rdf-ntriples.txt
- Original Record: 831-1.0080635-source.json
- Full Text
- 831-1.0080635-fulltext.txt
- Citation
- 831-1.0080635.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080635/manifest