UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

S-R-T division algorithms as dynamical systems McCann, Mark A. 2002

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
[if-you-see-this-DO-NOT-CLICK]
ubc_2002-0175.pdf [ 1.74MB ]
Metadata
JSON: 1.0051671.json
JSON-LD: 1.0051671+ld.json
RDF/XML (Pretty): 1.0051671.xml
RDF/JSON: 1.0051671+rdf.json
Turtle: 1.0051671+rdf-turtle.txt
N-Triples: 1.0051671+rdf-ntriples.txt
Original Record: 1.0051671 +original-record.json
Full Text
1.0051671.txt
Citation
1.0051671.ris

Full Text

S-R-T Division Algorithms As Dynamical Systems by Mark A . McCann B . S c , University of B r i t i s h Columbia, 1999  A THESIS S U B M I T T E D IN P A R T I A L F U L F I L L M E N T O F THE REQUIREMENTS  FOR THE DEGREE  OF  Master of Science in T H E F A C U L T Y OF G R A D U A T E STUDIES (Department of Computer Science)  We accept this thesis as conforming to the required standard  The University of British Columbia A p r i l 2002 © M a r k A . M c C a n n , 2002  In  presenting  degree freely  at  this  the  thesis  in  partial  fulfilment  University  of  British  Columbia, I agree that the  available for  copying  of  department publication  this or of  reference  thesis by  this  his  for  and study. scholarly  or  thesis for  her  purposes  of  Compter  Date  DE-6  (2/88)  It  gain shall not  Sciintt  The University of British Columbia Vancouver, Canada  may  representatives.  financial  the  requirements  I further agree that  permission.  Department  of  be is  Library  permission  granted  by  understood be  for  an  advanced  shall make for  the that  allowed without  it  extensive  head  of  my  copying  or  my  written  Abstract S - R - T division, as it was discovered in the late 1950s [4, 19, 23], represented an important improvement i n the speed of division algorithms for computers at the time. A variant of S - R - T division is still commonly implemented i n computers today. A l t h o u g h some bounds on the performance of the original S - R - T division method were obtained, a great many questions remained unanswered.  In this thesis, S - R - T division is described as a d y n a m i c a l system.  T h i s enables us to bring modern dynamical systems theory, a relatively new development i n mathematics, to bear on an older problem. In doing so, we are able to show that S - R - T division is ergodic, and is even Bernoulli, for a l l real divisors and dividends.  ii  Contents  Abstract  ii  Contents  iii  List of Figures  v  Acknowledgements  vi  1 Introduction and Background  1  1.1  Introduction to S - R - T Division  1  1.2  S - R - T D i v i s i o n as a D y n a m i c a l System  5  1.3  Shift Average for D € [|, 1)  12  2 Bernoulli Property 2.1  P r o o f of Bernoulliness  2.2  Entropy oi T  15 16 •  D  3 Extensions to Multi-Divisor S-R-T Division  19  23  3.1  M u l t i - D i v i s o r S - R - T Division  23  3.2  P r o o f of Bernoulliness  27  3.3  Some Restrictions on a  32  3.4  Entropy of M u l t i - D i v i s o r S - R - T D i v i s i o n  37  iii  4  Future Work  Bibliography  L i s t of Figures  1.1  A pseudo state-machine for converting to binary  4  1.2  S - R - T division where p = 0 . 6 7 , and D = 0 . 7 5  5  1.3  Following partial remainder magnitudes graphically for D  0  0.75  =  and p = 0 . 6 7  7  0  1.4  A p p l y i n g T / to x = j one hundred times  1.5  Applying T  1.6  A p p l y i n g P associated w i t h T / to f{x)  = 1 six times  1.7  A p p l y i n g P associated w i t h T / to f(x)  — ^~ f^  3.1  S - R - T division where p = 0 . 6 7 , D = 0 . 7 5 , and a = ( 0 . 7 5 , 1 , 1 . 2 5 ) 2 7  3.2  C o m b i n e d plot of the regions where oti(e, D) < | and a (e, D) >  4  8  5  4 / 5  to x = f + 0 . 0 0 0 0 1 one hundred times 3  3  5  5  2  '  ^  8  11 six times  12  0  2  1 for proof of Theorem 1 5  34  3.3  A n example of a non-ergodic system for T D  3.4  A n example of a non-ergodic system for To,  3.5  A n example of a non-ergodic system for T o  ) Q  a  v  ; a  G 9Tt„>4  34  £ 9^3  35  G WI2  37  Acknowledgements I would like to thank my thesis supervisor, Nick Pippenger, for his patient guidance and for the inspiration for this thesis. I would like to thank Mark Greenstreet for the many excellent comments and suggestions that he provided as my second reader. I would like to thank my good friends Ellen Gethner, Wesley Wong, and Michael Forbes for many good conversations relating to my thesis, and for their numerous helpful suggestions. I would especially like to thank my wife Karen for being a constant support and encouragement to me throughout my Master's degree. Finally, I would like to thank my parents for all the support and encouragement they have given me through numerous years of schooling.  M A R K  The University April  of British  Columbia  2002  vi  A .  M C C A N N  Chapter 1 Introduction and Background 1.1  Introduction to S-R-T Division  " S - R - T division" roughly refers to a class of non-restoring, binary division algorithms that have been designed for floating-point computers [3, 5, 6, 7, 14, 22]. T h e term "non-restoring" refers to the fact that partial remainders are allowed to range freely through the interval (—1,1), rather than being restored to the positive realm before proceeding to the next step. T h i s feature reduces uses of the adder by about fifty percent.  A n equally important feature of  this algorithm is the " S - R - T " optimization from whence the algorithm gets its name.  In the late 1950's, Sweeney [4], Robertson [19], and Tocher [23]  independently made the observation that whenever a p a r t i a l remainder is i n the range (—|, | ) , there w i l l be one or more leading zeros that can be shifted through i n a very short amount of time (usually one cycle). T h e more leading zeros i n a given step, the more the algorithm can avoid costly uses of the adder. A further development of this original algorithm, which is still called S - R - T division, is the algorithm most often implemented in modern point units.  floating-  In modern S - R - T division, a fixed number of quotient digits  are produced every cycle as opposed to a variable number [5, pp. 1  37-62].  A n example of modern S - R - T division i n use is Intel's first release of the P e n t i u m ™ C P U with its infamous "Pentium Bug," which was really just a small error in its S-R-T division implementation. This thesis will restrict its attention to the original version of S-R-T division. To present the simplest type of S-R-T division, we begin with a few definitions for an algorithm similar to that presented by Shively [22, pp. 3-4]: (a) n represents the number of iterations performed in the algorithm. (b) po is the dividend (or initial partial remainder) normalized so that po G [\, !)• (c) Pi G (—1,1), i G N, is the partial remainder after the ith step. (d) D is the divisor normalized to  1).  (e) qi G {—1,0,1} (i G { 0 , . . . , n — 1}) is the quotient digit generated by the i t h step. 71-1  (f) Q  = ^2 % is the "rounded off" quotient generated after n steps of the i=0 algorithm. n  Given the above definitions, after n steps of the division algorithm, we would like it to be true that po = DQ  n  + s{n)  where e(n) is a term that goes to zero as n goes to infinity. A recurrence relation for the S-R-T division algorithm can be stated as  Pi+i  =  <  2pi  \Pi\ < \  2(pi - D)  \pi\ > \ and pi > 0  2(pi + D)  \pi\ > \ and pi < 0,  2  and :  0 Qi = <  1  P i  :  i  \ <  \Pi I  :  -1  \  > 5  a  > 5  n  d Pi > 0  and pi < 0 .  B y observing that 2{ -{Q)D)  \Pi\ < 5  \ 2 ( P i - (1)D)  \Pi\ > \  (  Pi+i =  Pi  2{ -{-l)D)  and pi > 0  |pi| > 5 and pi < 0 ,  Pi  we can rewrite the definition of pi+i as  After n steps have been completed, we have = 2  p  n  n Po  - 2qD n  0  -  2- D  2 q -lD..  n l  l  qi  n  and then after dividing by 2 and solving for Q we find that n  P  Pn , PO = 7T +  qoD q\D 2° +  +... +  71-1  Qn-lD Jn—1  Pn  i=0 Now let e(n) = p /2 n  n  and let Q* = l i m ^ o o Q - Since | p | < 1, i n the limit as n n  n  goes to infinity Po =  DQ*.  The quotient bits being generated are not i n a standard binary representation, but it is a simple matter to convert the answer back to standard binary without using any expensive operations. Figure 1.1 shows a simple pseudo state-machine (really a push-down automaton) that converts positive floating-point numbers in the {—1,0,1} representation into binary.  Figure 1.1: A pseudo state-machine for converting sequences of { — 1,0,1} into sequences of {0,1} (binary). We assume that the input sequence corresponds to a positive number. The letter ' Z ' is used to indicate that the end of the sequence has been reached, and the symbol e represents the null string. We represent a run of m zeros as 0 • • • 0 and a run of m ones as 1 • • • 1. Sequences of symbols should be read m  from left to right. For example, the expression 1/10 ••• 0 means: if a 1 is encountered in the input sequence, write a 1 followed by m zeros. The above conversion automaton implies that conversion happens after the calculation is completed. In reality, the conversion from the generated quotient bits to standard binary is done in hardware on-the-fiy, using registers to convert runs of zeros into runs of zeros or ones in parallel, or by performing a single subtraction. Figure 1.2 shows an example of using the S-R-T division algorithm to divide 0.67 by 0.75. The steps that produce non-zero quotient bits have been shown. In this example, after six uses of the adder, the quotient (0.893) has been determined to four digits of precision.  4  0.67  Po = 0.67  -0.16  Pi = 2(0.67 - D) Pi = 2 ( 2 ( - 0 . 1 6 ) + £>) =  0.22  Pi = 2(2 (0.22) — D)  =  0.26  P9 = 2(2 (0.26) - D)  = -0.46  2  2  1  1  go =  Qo = 1  Qz = - 1  qs =  Qz  = 0.875  1  Qe = 0.890625  1  Qs  = 0.89453125  Pn = 2 ( 2 ( - 0 . 4 6 ) + D) = - 0 . 3 4  Qw = - 1  Qw = 0.8935546875  Pl3 = 2 ( 2 ( - 0 . 3 4 ) + D) =  912  -1  Ql2 = 0.8933105469  1  0.14  1  =  Figure 1.2: A n example of S-R-T division when the dividend po = 0.67, and the divisor D = 0.75. The quotient Q* is 0.893. Now, with this simple system of division in hand, we might want to ask certain questions about its performance. For example, we could ask "How many bits of precision are generated per iteration of the algorithm on average?" To answer this question, we must look at the magnitude of \Q* — Q \ = \ /2 \. n  n  of bits of precision on the nth step is then n — log  2P n  Pn  The number  . In the worst case, p is close n  to 1, and therefore we get at least one bit of precision per iteration of the algorithm, regardless of the values of D or Q. Of course, a designer of actual floating-point P  hardware probably wants to know the expected performance based on the expected values of p . To answer the many variants of this type of question, it is clear that we n  must know something about the distribution of partial remainders over time. The remainder of this thesis is devoted to extending what is known about the answer to this type of question as it relates to S-R-T division and its variants.  1.2  S-R-T Division as a Dynamical System  The example in figure 1.2 makes it clear that keeping track of the signs of successive partial remainders is irrelevant in determining how many times the adder will be 5  used for a p a r t i c u l a r calculation.  F o r this reason, we only need to consider the  magnitudes of successive p a r t i a l remainders. W e now give a reformulation of S - R - T d i v i s i o n that w i l l allow us to look at d i v i s i o n as a d y n a m i c a l system.  Definition 1 (S-R-T Division Transformation). F o r D G [5,1), we define the function T  D  : [0,1) ->• [0,1) as  TD(X)  = {  2x  0 < x < \  2(D - x)  \ < x < D  2(x - D)  D < x < 1  T h i s t r a n s f o r m a t i o n of the u n i t interval represents the successive p a r t i a l remainders that arise as S - R - T d i v i s i o n is carried out b y a divisor D o n a d i v i d e n d x. D is n o r m a l i z e d to  1). T h e d i v i d e n d x is normalized to  the successive p a r t i a l remainders Tjj(x)  1) initially, while each of  (n G N) subsequently ranges t h r o u g h [0,1).  B y using the characteristic function for a set A defined as  1A(*) =  we c a n rewrite T (x) D  J 1  :  i e A  0  :  x g" A ,  as  = 2x • l ^ i j ( 1 ) + 2(D - x) • l r i , 0 r ) + 2(x - D) • l o )  m )  (x)  .  (1.1)  If we p l o t equation 1.1 o n t h e u n i t interval, we o b t a i n a very useful v i s u a l i z a t i o n of our transformation. F i g u r e 1.3 shows the plot of To.75(0;) c o m b i n e d w i t h a plot of the successive p a r t i a l remainders that arise while d i v i d i n g 0.67 b y 0.75. T h i s is the same system that was presented earlier i n figure 1.2. N o t i c e that a vertical line i n the interval [\,D)  corresponds to a subsequent flip i n the sign of the next  p a r t i a l remainder.  6  0  0.25  0.5  0.75  1  X  F i g u r e 1.3: A n example of following p a r t i a l remainder magnitudes g r a p h i c a l l y for D — 0.75 a n d po = 0.67. T h e heavy solid lines represent the transf o r m a t i o n To.75, while the abscissa of the t h i n vertical lines represent successive p a r t i a l remainder magnitudes.  F i g u r e 1.3 shows a n example of following the t r a j e c t o r y of a single p a r t i a l remainder for a p a r t i c u l a r divisor. A f t e r ten applications of the To.75, there is not any obvious regular pattern, although we expect to see one eventually since the quotient is r a t i o n a l i n this case. O f course, most numbers are not r a t i o n a l and we can deduce that for most numbers, the transformation w i l l never exhibit a repeating pattern. In figures 1.4 a n d 1.5, we see that a very s m a l l change i n the value of the i n i t i a l p a r t i a l remainder quickly produces large differences i n the observed behaviour of the subsequent p a r t i a l remainders. O u r system appears to be chaotic (it certainly has sensitive dependence o n i n i t i a l conditions a n d is topologically transitive), a n d , if this the case, we w i l l gain little understanding by s t u d y i n g the trajectories of  7  i n d i v i d u a l p a r t i a l remainders. T h e logical next step is to study the behaviour of d i s t r i b u t i o n s of points over the whole interval. l r  0.75  0.5  0.25  100 F i g u r e 1.4: T h e result of a p p l y i n g T / 4  5  to x = j one h u n d r e d times.  0.75  ST  0.5  0.25  0  J  20  40  F i g u r e 1.5: T h e result of a p p l y i n g T / 4  5  n  60  80  100  to x = j + 0.00001 one h u n d r e d times.  T h e area of understanding the behaviour of ensembles of points under repeated transformation is the r e a l m of d y n a m i c a l systems theory. F o r the remainder of this thesis, we assume a certain amount of f a m i l i a r i t y w i t h the fundamentals of 8  d y n a m i c a l systems theory (or ergodic theory), w h i c h requires some basic unders t a n d i n g of measure theory.  W e w i l l include a few helpful b a c k g r o u n d m a t e r i a l  definitions as they are needed, b u t mostly we w i l l provide references. A very good i n t r o d u c t i o n to the study of chaotic systems is L a s o t a a n d M a c k e y ' s book Chaos, Fractals, and Noise [11]. For a more detailed i n t r o d u c t i o n to ergodic theory (along w i t h the necessary measure theory needed to understand this thesis), Peter W a l ters's book An Introduction to Ergodic Theory [24] a n d K a r l Petersen's book Ergodic Theory [18] are h i g h l y recommended.  Definition 2 (Probability Space). If B is a cr-algebra on subsets of a set X a n d if m is a measure o n B where m(X) = 1, then the triple (X, B, m) is called a probability space. (See [24, p p . 3-9] a n d [11, pp. 19-31] for a good overview of basic measure theory a n d Lebesgue integration.)  Definition 3 (Stationary Distribution). Let (X, B, m) be a p r o b a b i l i t y space, let P be the P e r r o n - F r o b e n i u s operator associated w i t h a non-singular t r a n s f o r m a t i o n T : X —>• X, a n d let L Pf  = ft  1  denote the L  space of (X,B,m)^.  l  If / € L  1  is such that  then / is called a stationary distribution of T.  Definition 4 (Perron-Frobenius operator). F o r a probability space the Perron-Frobenius  (X,B,m),  operator associated w i t h a non-singular t r a n s f o r m a t i o n T :  X —)• X is defined b y  f Pf(x) d mJT-^B) = [ f{x) d m ,  JB  for B  eB .  For a piecewise C § transformation T w i t h n pieces, we c a n give a n explicit 2  f o r m u l a for the P e r r o n - F r o b e n i u s operator. Let A = {Ai,A ,... 2  ,A } n  be the p a r t i -  t i o n of X w h i c h separates T into n pieces. For i € { 1 , . . . , n } , let ti(x) represent the *For a probability space (X,B,m), the L space of (X,B,m) is the set of / : X -> E satisfying J \f(x) \ dm < co. *The o symbol will be used to indicate that a given relation holds except possibly on a set of measure zero. §C denotes the set of all functions with two continuous derivatives. 1  x  2  9  n a t u r a l extension of the ith. C  2  function T ( x ) | ^ . i  T h e P e r r o n - P r o b e n i u s operator  for T is t h e n  tl {x) f(t-\x))-l {x) l  dx  u[Ai)  I n p a r t i c u l a r , for T o (as i n equation 1.1),  Pf{x)  =  \f{\x)-l {x) m  +  \f{D  - \x)  • l ,2D-i](aj) (0  + \f{D  + \x)  •l  [ 0 l 2  -2D)(^ •  (1-2)  W i t h equation 1.2 we c a n show precisely what happens to a n i n i t i a l dist r i b u t i o n of points (described b y a n integrable function) after they are repeatedly t r a n s f o r m e d under Try. Figures 1.6 a n d 1.7 show w h a t happens to two different i n i t i a l d i s t r i b u t i o n of points after five applications of the P e r r o n - F r o b e n i u s operator associated w i t h T / ( z ) . B y the fifth a p p l i c a t i o n , the d i s t r i b u t i o n s look r e m a r k a b l y 3  5  similar. O n e might guess that they are b o t h approaching t h e same final d i s t r i b u t i o n . T h i s s i t u a t i o n is i n m a r k e d contrast to chaotic behaviour observed i n figures 1.4 a n d 1.5.  10  2  2  1.5  1.5  1  1  0.5  0. 5  0  0  0.25  0.5  0.75  0  1  0  0.25  2  2  1.5  1. 5  1  CO  1  0.75  1  0.75  1  1  0  0.25  0.5  0.75  0  1  0  0.25  2  2  1.5  1.5  1  lO  0.  0.5 0  0.25  0.5 a:  0.5 X  X  0  0.75  0.5  0.5  0  0.5 X  X  0.75  1 0.5  0  1  0  0.25  0.5 X  u r e 1.6: T h e result of a p p l y i n g the P e r r o n - F r o b e n i u s operator P associated with T / 3  5  to f(x)  = 1 six times.  11  2.5  2.5 2  ^  ^  2  ^  1.5  a,  1  i  0.5  0.5 0  0  0.25  0.5 X  0.75  0  1  0  0.25  0.5 X  0.75  1  0  0.25  0.5  0.75  1  0  0.25  0.5  0.75  1  2  2 ^ -  1. 5  1.5  1  CN  0.5  0.5 0  0  0.25  0.5  0.75  0  1  x 2  2 1.5  1.5 1  a,  0.5 0  1  lO  0.5 0  0.25  0.5 X  0.75  1  F i g u r e 1.7: T h e result of a p p l y i n g the P e r r o n - F r o b e n i u s operator P associated 1 f = -—/ !og2 J  • •  1  w i t h T 3 / 5 to fix)  1 / 2  1.3  — six times. x  Shift Average for D G [§, 1)  A n exact equation for the stationary d i s t r i b u t i o n when D € [|, 1) was first given by F r e i m a n [6] a n d is restated by Shively [22] as 1 /(z)  1  = -Q\O,2D-\){?)  + 2^1[2£>-i,i)(») •  (1-3)  T o verify that this is a stationary d i s t r i b u t i o n f u n c t i o n , we b e g i n by a p p l y i n g the P e r r o n - F r o b e n i u s operator as given i n equation 1.2 to equation 1.3 a n d verifying  12  that Pf(x) = f(x). So then, a p p l y i n g P to / we get  Pf(x)  = ^ ^ l [ o , 2 D - i ) ( H + 2^ [2i>-i,i)(5 )) [o,i)( ) 1  a;  1  + \ ^ l [ o , 2 i > - i ) ( - D - h ) + -fij [2D-\,\){D x  L  + \ ^l[0,2X?-l)(-D + \ ) x  a;  ~ k ) ) l(0,2£»-i](a;)  + 2 ^ 1 [ 2 D - l , l ) P + \x)^j l[Q, -2D){x) 2  •  A s s u m i n g that D € [5,1), a n d observing that a; € [0,1),  Pf(x)  =  \ ^ l [ o , 4 D - 2 ) ( a j ) + ^ [ 4 D - 2 , i ) ( ) ) l[o,i)(a:) 1  a ;  + £ ^ ( 2 - 2 D , l ) ( ) + ^ ( 0 , 2 - 2 £ > ] ( 2 ; ) ) l(0,2D-l](a;) 1  a ;  1  + £ ( ^ 1 [ 0 , 2 D - 1 ) ( ^ l[0,2-2O)(a;) • F i n a l l y , assuming that D € [|, 1), we have  P  f( ) x  =  2^1[0,l)(a;) + 2 ^ ( 2 - 2 D , 2 i 3 - l ] ( ^ ) + ^ l ( o , 2 - 2 D ] ( a ; ) + 42jl[o,2-2D)(a:)  =  3 1 ^ 1 [ 0 , 2 - 2 D ) ( ^ ) + ^ 1(0,2-20] (*)  1  +  -^ [2-2D,2D-l){ ) l  x  +  ^T5 (2-2D,2£)-l](a;) 1  + 2^1[22>-i,i)(a0 =  ^l[o,2D-i)(a;) + 2^1[2D-i,i)(a:) = / ( » ) •  O n e of the p r i m a r y uses of having a formula for t h e d i s t r i b u t i o n of p a r t i a l remainders is for calculating the shift average for a given divisor. T h e shift average is the average number uses of the shift register (single shift or m u l t i p l i c a t i o n by two) between uses of the adder. U n d e r the assumption that a register shift is a m u c h faster operation t h a n using the adder, the shift average gives a useful characterization of the expected performance of our a l g o r i t h m for a given divisor. W i t h equation 1.3, we know the fraction of bits that require the use of the adder. T o calculate the average number of zero bits generated between non-zero bits (bits requiring use of  13  the adder), we take the reciprocal of the fraction of bits that require the adder. W e calculate the shift average for a divisor D e [f, 1) to b e  •<">-I4 * £ T =  (L4)  Since have not proven that the s t a t i o n a r y d i s t r i b u t i o n s f r o m S - R - T d i v i s i o n are unique, we have no way of k n o w i n g whether or not a shift average c a l c u l a t i o n i n equation 1.4 is correct. T o prove that a l l stationary d i s t r i b u t i o n s are unique, we need to show that TD is ergodic for a l l D G [^,1)- F r e i m a n [6] shows that T o is ergodic for r a t i o n a l D, b u t we extend this result for real D. I n the next section we show that a l l  are B e r n o u l l i a n d it is k n o w n that h a v i n g the B e r n o u l l i property  implies ergodicity. Before c o n c l u d i n g this chapter w i t h a definition for ergodicity, we w i l l briefly comment o n the derivation of stationary distributions for D G [|, f ) . F o r D G [|, | ) , the stationary d i s t r i b u t i o n functions have been derived, a n d their associated shift average functions have been shown to be constantly three [6, 22]. T h e layout of stationary d i s t r i b u t i o n functions i n the region D G [ j , § ) has several s u r p r i s i n g properties a n d is far f r o m being fully understood. W e discuss the c a l c u l a t i o n of shift averages as a n interesting area for future investigation i n C h a p t e r 4.  Definition 5 (Ergodic [11]). L e t (X, B,m) be a p r o b a b i l i t y space a n d let a nonsingular t r a n s f o r m a t i o n T : X —> X b e given. T h e n T is ergodic i f for every set B G B such that T~ {B) L  = B, either m(B) = 0 or m(X \B)  14  = 0.  Chapter 2 Bernoulli Property In this chapter, we w i l l prove that the class of transformations of the interval that characterizes the S - R - T d i v i s i o n for a l l real divisors D has the property that each t r a n s f o r m a t i o n TD is B e r n o u l l i . A l t h o u g h the basic concept of a B e r n o u l l i shift (the things to w h i c h transformations having a B e r n o u l l i property are isomorphic to) is not difficult, a complete definition requires enough a u x i l i a r y concepts from measure theory (concepts not used anywhere else i n this thesis) that we chose to refer the interested reader to [17, 18, 21, 24] a n d other selections listed i n the B i b l i o g r a p h y . N e i t h e r a n understanding of B e r n o u l l i shifts, nor a f o r m a l definition of what  it  means to be B e r n o u l l i is required to follow the proofs i n this chapter. H a v i n g said this, we should mention informally the connection between B e r n o u l l i shifts and transformations h a v i n g the B e r n o u l l i property. T h e transformation TD is an non-invertible e n d o m o r p h i s m of the unit interval. T h i s means that f r o m a given p a r t i a l remainder we can predict a l l future p a r t i a l remainders, but we cannot uniquely predict past p a r t i a l remainders. T h e r e is a natu r a l way (called the n a t u r a l extension) to make our t r a n s f o r m a t i o n invertible (an a u t o m o r p h i s m ) on a larger space. Specifically, each non-invertible  transformation  TD h a v i n g the B e r n o u l l i property has a n extension to a n a u t o m o r p h i c transformat i o n , isomorphic to a two-sided B e r n o u l l i shift [18, p p . 13,276]. F r o m the way that  15  entropy for a transformation is denned, the entropy for a n a u t o m o r p h i c B e r n o u l l i t r a n s f o r m a t i o n associated w i t h a non-invertible B e r n o u l l i t r a n s f o r m a t i o n is the same as the entropy for the non-invertible B e r n o u l l i transformation. B y p r o v i n g that a l l transformations T p are B e r n o u l l i , a n d b y p r o v i n g that entropy of each To is the same, we w i l l be able to conclude that the n a t u r a l extensions of S - R - T d i v i s i o n algorithms are isomorphic to each other for a l l divisors.  2.1  Proof of Bernoulliness  Definition 6 (of Bowen [1], Expanding). W e w i l l say that a t r a n s f o r m a t i o n T on a n interval is expanding i f it has the property that s u p  u-(T U) = 1 for a l l open n  n > 0  intervals U w i t h p{U) > 0, where / i is any n o r m a l i z e d measure that is absolutely continuous w i t h respect to Lebesgue measure.  Definition 7 (Straddle). L e t U be a n interval of reals (either o p e n , closed, or half open) a n d let p G K . If p G U°J then we say that U straddles p. +  Theorem 1. The S-R-T division transformation is expanding for all real divisors. Proof. L e t (X,B,m)  be a p r o b a b i l i t y space where X = [0,1), B is the B o r e l o-  algebra o n X a n d m is the Lebesgue measure o n H*. L e t T  D  : X ->• X be the S - R - T  d i v i s i o n t r a n s f o r m a t i o n for a given normalized divisor D as defined i n equation 1.1. *The symbol o as the exponent of an interval denotes an open version of the interval. *For an interval [a, b], the Lebesgue measure is defined as m([a, b]) = b — a.  16  Let us define an infinite sequence of intervals U = {C/jjieN as and  Ui = U  C/°C[0,i)  g [0, i ) and (7° g  U°  T (C7in[0,i)) D  ort/?C[±,l) [i,  1) and  m(C7i D [0, i ) ) > m ( < 7 [ i , l ) ) i n  tf?g[0,±)andtf?g[±,l)  n [ ± , i ) )  and n [ o , i ) )  Property 1. For all Proof.  Ui such  that  \  U° and D g U°, m{U i) i+  TD  i  =  2m{Ui).  [5,-D), or [D, 1), then we are in the first  If a U° is a subset of either [0,  case of the U definition and we apply  <m((7 n[i,l)).  directly. Since each of the three cases of the  expand an interval by a factor of two, it is clear that m(T£>(Ui)) = m(Ui \) +  =  2m(Ui).  Property 2. Proof.  For all Ui where  D  0  Ui, m(Ui i) +  >  m(Ui).  Assume that D $ Ui. If j £ Ui, then according to Property 1, U{+\ doubles.  Otherwise, \ 6 U{ and therefore, to find E/i+i, we must consider the second and third cases of the U sequence. In the worst case, m(Ui D [0, ^)) = m(Ui n [^,D)),  and  regardless of which half we choose, m(Ui  By  fl  [0, ^)) = m(Ui  fl  [5, D)) = ^m(Ui).  applying Trj to this truncated interval, we double what we halved so that m(Ui) = m{U i). i+  By way of contradiction, let us assume that there exists a sequence of U that never expands to fill X. Such a sequence can never include the point D and the following Property will hold:  Property 3. (a)  There  exists N such  m ( [ / ; n [ 0 , ^)),m(Uif][^,  straddle  that for all i > N  1)) > 0 (in other  \), and  17  words,  all subsequent  intervals  must  (b) m{Ui PI [0, \)) < m(Ui n that the right Proof  of Property  half  1)) (in other  words,  of Ui is not discarded  all subsequent  by the definition  Ui must  be such  ofU).  3(a) Property 1 says that the only way not to double is to straddle  ^. Therefore, at a minimum, it must be the case that ^ is eventually included every time or else the interval will double a sufficient number of times to include D which would be a contradiction. Proof e,\  of Property  3(b) If m{Ui  n [0, | ) )  + e') where e > e'. Now U  i+1  > m(Ui  = T {Ui)  n [£, 1)),  = T (\  D  then we have Ui = {\ -  - e, \) = (1 - 2e, 1). B u t ,  D  since D is not i n C/j+i, \ cannot be in t/i+i and Property 3(a) fails, resulting in a contradiction. B y Property 3, we will eventually be in a situation where U{ = ( \ — e', \ +e), e' < e, and Property 3 will hold for every subsequent interval. So then Ui i+  =T {\-e',\+e)  =T [\,\  D  + e) = (2Z> - 1 - 2e, 2 D - 1]  D  by Property 3(b). But again by Property 3, U  l+2  = T {2D D  - 1 - 2e,2D  - 1] = T [\,2D  - 1] = [2 - 2D,2D  D  - 1].  It is now clear that ^ is at the midpoint of Ui+2 and that we must now pick the left half of the interval which contradicts Property 3(b). Therefore, D will eventually be included in an interval and the sequence will expand to fill all of X.  •  We can now prove that the S-R-T division process is weak-mixing, and therefore Bernoulli, by two theorems of Bowen [1].  Theorem 2 (of Bowen [1]). T-invariant system  probability  Let T be a piece-wise  measure,  (T, /j,) is weak-mixing,  then  C  2  map of [0,1], [i be a  and A = infn< <i |/'0c)| > 1x  the natural  18  extension  U  of (T, p) is  smooth  dynamical Bernoulli.  W e mention here that the natural  of (T, n) is the associated a u -  extensions  t o m o r p h i c transformation that we alluded to at the b e g i n n i n g of this chapter. See Petersen [18, p. 13] for a n exact definition.  Theorem 3 (of Bowen [1]). With T and mixing ifT  is  / i as in Theorem  2,  C  Perron-Frobenius  with T, then for any f  operator  (n  Sfc=o  f*  has the property  -f /)n^=i  under  ?  s  associated  convergent that Pf*  2  function  be a probability  and let T : X —>• X be a piecewise  invariant  will be weak-  expanding.  Theorem 4 (of Lasota and Yorke [10]). Let (X,B,m)  fc  (T,n)  such that i n f \T'\ > 1. If P is the  in norm to a function = f*  space  f* € L\.  and consequently,  G L,  the sequence  1  The limit  function  the measure d/Lt* = / * d m is  T.  H a v i n g established that TD is e x p a n d i n g , we now use the above three theorems to prove the central result of this thesis.  Theorem 5. TD is  Bernoulli.  Proof. F r o m the definition of T o , we see that TD is C a n d that i n f o < < i |TD'(:E)| = 2  x  2 >  1 since \TD'(X)\  i n f n < < i \TD'(X)\ x  = 2 for a l l x for w h i c h the derivative is defined.  Since  > 1, by T h e o r e m 4 there exists at least one \x such that fi is a  s m o o t h T o - i n v a r i a n t p r o b a b i l i t y measure. B y T h e o r e m 1, we see that T h e o r e m 3 holds. Hence, (TD,U-)  2.2  is w e a k - m i x i n g a n d , by T h e o r e m 2 ( T o , ^ ) is B e r n o u l l i .  •  Entropy of T  D  K n o w i n g that a l l T o are B e r n o u l l i is a very useful property because we c a n use entropy as a complete invariant to show i s o m o r p h i s m amongst the two-sided B e r n o u l l i shifts associated w i t h T o that have the same entropy. T h i s comes f r o m the contrib u t i o n of O r n s t e i n to the K o l m o g o r o v - O r n s t e i n T h e o r e m .  19  Theorem 6 (of Kolmogorov [8, 9] and Ornstein [16]). Two Bernoulli shifts are isomorphic if and only if they have the same entropy. T h e purpose of this section is t o calculate the entropy o f T p . W e begin w i t h a m u l t i - p a r t definition of entropy along w i t h some s u p p o r t i n g definitions that follow the development presented by Walters [24, pp. 75-87]. Definition 8 (Partition). A partition of (X,B,m)  is a disjoint collection of ele-  ments of B whose u n i o n is X. Definition 9 (Join). Let V and Q be finite partitions of (X, B, m). T h e n ? V Q = {PnQ  : P G V, a n d Q 6 Q } is called the join of V a n d Q. N o t e that V V Q is also  a finite p a r t i t i o n of (X, B, m). Definition 10 (Entropy of a partition). L e t (X,B,m) a n d let V = {Pi,...,Pk}  be a finite p a r t i t i o n of (X,B,m).  be a p r o b a b i l i t y space T h e entropy of the  partition is defined as  H(V)  k = -^miPi)  log  m(Pi).  Definition 11 (Entropy of a transformation with respect to a partition). Suppose T : X —> X is a measure-preserving transformation of the p r o b a b i l i t y space (X, B, m). If V is a finite p a r t i t i o n of (X, B, m), then  is called the entropy ofT with respect to partition V. Definition 12 (Entropy of a transformation). L e t T : X —> X be a measurepreserving t r a n s f o r m a t i o n of the p r o b a b i l i t y space (X, B,m) a n d suppose h(T) = s u p / i ( T , V), where the s u p r e m u m is taken over a l l finite partitions V of T h e n h(T) is called the entropy ofT.  20  (X,B,m).  T h e following definitions a n d theorems involving C - m a p s a n d P C - m a p s are taken f r o m a paper o f L e d r a p p i e r [12] a n d have been streamlined for o u r argument.  Definition 13 (of Ledrappier [12], C-map). A real f u n c t i o n / defined o n a n interval [a, b] is said to be a C-map if / is continuously differentiable a n d its derivative / ' has the following properties: (a) / ' satisfies a H o l d e r condition^ of order e > 0. (b) T h e r e are only a finite number of points x G [a, 6] where f'(x) t h e m by a < a\ < a •.. < a 2  n  = 0. W e denote  < b w i t h / ' ( a ; ) = 0 for 0 < i < n.  (c) T h e r e exist positive numbers k~ (kf) such that log  ,k |x—a| 1  -(+)  is b o u n d e d i n  a left (right) neighborhood of a ; .  Definition 14 (of Ledrappier [12], PC-map). A m a p / : [0,1) —> [0,1) is called a PC-map  if there exists a finite p a r t i t i o n 0 < b\ < b .. • < b 2  C - m a p f r o m [bj,bj i] +  into [0,1), for any j.  Theorem 7 (of Ledrappier [12]). Let f be a PC-map. lutely continuous  < 1 such that / is a  m  invariant  measure),  If  then Rohlin's formula  is an a.c.i.m.  \i  (abso-  [20] is true:  Hf) = j log|/'|d/i. Theorem 8.  The entropy h{T ), D  ofT  D  for D e [^,1)  is equal to / l o g | T o ' | d/z =  log 2. Proof. W e begin by showing that T o is a P C - m a p . B y the definition of a P C - m a p , T p is a P C - m a p if each of the three functions Tu\^ ij ,T!D|[1 y Q  D  a n d To\y ^ is a D  C-map. T r i v i a l l y , each Try restricted to any of the three domains [0, | ) , [D,l)  or  satisfies a H o l d e r c o n d i t i o n of order e = 1 because each piece of To is just a  §A function f(x) defined on an interval [a, b] satisfies a Holder condition of order e E K if there exists c 6 R such that for any two points pi,P2 G [a,b], \f{pi) — f(p2)\ < c\pi - P2\ • +  +  e  21  line of slope two. T h u s c o n d i t i o n (a) of D e f i n i t i o n 13 is satisfied. C o n d i t i o n (b) is satisfied because there are no points for w h i c h the derivative is equal to zero w i t h i n a given line segment. Therefore, condition (c) is t r i v i a l l y satisfied. T h u s each of the three segments of TD are C - m a p s a n d by D e f i n i t i o n 14, T o is a P C - m a p . N o w , since each TD is B e r n o u l l i , there exists a unique a.c.i.m., call it /z, for each To-  B y T h e o r e m 7, we can use R o h l i n ' s formula to calculate the entropy:  • W i t h the proof of T h e o r e m 8 we have established i s o m o r p h i s m amongst the a u t o m o r p h i c transformations (or n a t u r a l extension) associated w i t h simple S - R - T d i v i s i o n transformations by a n a p p l i c a t i o n of the K o l m o g o r o v - O r n s t e i n T h e o r e m . T h e key to o b t a i n i n g this result was being able to show that TD has Bowen's exp a n d i n g property.  In C h a p t e r 3, we extend the results of this chapter to a more  general type of S - R - T d i v i s i o n .  22  Chapter 3 Extensions t o M u l t i - D i v i s o r S-R-T Division 3.1  Multi-Divisor S-R-T Division  A c o m m o n o p t i m i z a t i o n to the S - R - T d i v i s i o n a l g o r i t h m is the i n c l u s i o n of a d d i t i o n a l divisors to increase the shift average. I n this section, we prove that a l l such d i v i s i o n algorithms w i t h reasonable assumptions o n the separation of the divisor multiples have the e x p a n d i n g property. It w i l l be useful to define precisely a class of m u l t i divisor S - R - T d i v i s i o n transformations.  Definition 15. L e t a  G W be such that 1  (a) 0 < ai < a  < • • • < a , and  (b) F o r a l l x,D  e [|, 1), there exists i G { 1 , . . . , n) such that \c<iD —  2  n  x\<\.  W e define 2 l „ to be the set of a l l a G R , satisfying conditions (a) a n d (b). A l s o , n  Definition 16 (Peaks and Valleys). G i v e n a n a between two lines f(x)  G 2 l > 2 , the point of intersection N  = 2(x - ctiD) a n d g(x) = 2(a iD i+  23  - x) w i l l be called a peak  a n d is denoted by ipi = (^D(cti i  + ati), D(a{ i  to the abscissa as ipf = ^D(oti i  + a j ) , a n d to the ordinate as ipf = D{a.i \  +  +  +  — a;)). For convenience, we w i l l refer — aj).  +  T h e point of intersection of the two lines f(x) = 2{aiD — x) a n d g(x) = 2(x — otiD) is (cxiD,0)  a n d w i l l be called a valley.  Definition  For a D e  17.  a n  d  Q 6  21, define the t r a n s f o r m a t i o n Tr), (x) a  :  [0,1) —* [0,1). F o r a S 2 t i , we get the familiar transformation 2x TDA )  0 <  \2(D-x)\ For  X  < i  X  <  =  X  <  1.  a € 2l , 2  Q<x<\  2x \2{ D-x)\  i < x < ipf  \2{a D-x)\  \ < x a n d ipf < x < 1 .  ai  2  For a € 2 l  n > 3  , 2x  0 <x < \  \2( D-x)\  \ < x < ipf  \2( D-x)\  \<x  \2{a D-x)\  \ < x a n d ipn-i < x < l .  ai  ai  n  Definition 1 8 . Define Wl = {T n  UneN^n- We  c  a  n  *  n  es  e  Di(X  t  °^  we c a l l iVd the set of multi-divisor  a  n  a n d ipf < x < tpf  +1  : D <E ( ± , 1 ] , a € 2 t } a n d define iW = n  -divisor  n  S-R-T  S-R-T  division  division  transformations  and  transformations.  C o n d i t i o n (b) i n D e f i n i t i o n 15 guarantees that the d i v i s i o n a l g o r i t h m generates a new quotient bit every step. A l t h o u g h the c o n d i t i o n makes intuitive sense, it is not i m m e d i a t e l y obvious i f a n a satisfies the c o n d i t i o n just by inspection. L e m m a 10 below provides a n easier way t o check. 24  Lemma  9. If a = ( a i ) , then  condition  (b) of Definition  15 is satisfied  if and  only  if ai = 1.  Proof.  If a\ = 1, then maxjr> e[i/2,i) \ \D  « i ^ 1 and e G M  +  — x\ < | . N o w consider the cases when  a  ]X  . If ct\ = 1 + e, then when D =  a n d x = \ , \OL\D — x\ =  1 — i = i ^ i . O n the other h a n d , if ct\ = 1 — e , then when D = \ a n d a; = 1 — | , |ai£> - x | = 1 - | - (1 - e ) ± = i ^ \ .  Lemma 10. A n condition (i)  G  2l  a{  €  ( 0 , 7 ? ] and  (Sketch).  if for some  otj €  a^c?  satisfies  n  (b) if and only  ( M J a» € [^,1]  Proof  a  •  [1,1  +  a*],  condition  i,j  (a) of Definition  G { 1 , . . . ,n}  (possibly  15 also  i = j),  satisfies  either  or  € [l,3«i].  L e m m a 9 has shown that a single component a of a w i t h a = 1 is  sufficient to ensure that the range of f(x) = 2 \aD — x\ is equal to [0,1) as x a n d D range over [5,1). It is easy, to see based on the proof of L e m m a 9 that if there does not exist i 6 { 1 , . . . ,n} such that ai = 1, then there must exist i,j  6 { 1 , . . . ,n}  (i < j) where OJJ < 1 a n d a > 1. 3  Let us assume that i is the largest value where CKJ < 1, a n d let us assume that  j is the smallest value where ctj > 1 (therefore j = i + 1). W e make this a s s u m p t i o n because no other scalars of D w i l l have an influence on whether or not c o n d i t i o n (b) is satisfied. C o n s i d e r the case where c*j G (0, ^]. In this case where ai 6 (0, | ] , when  D is close enough to 1, some of the line f(x) = 2(x — aiD) appears i n the region (denoted R) where \ < x < 1, 0 < T (x) a  < 1. W h e n a p o r t i o n of the line f(x)  appears i n region R, we must put restrictions on aj i n terms of ai so that the peak ?/>! is always i n R. ipf is greatest when D = 1. W e find the m a x i m u m allowable value of aj by setting D = 1 a n d solving ipf = 1 for aj: ipf = I  Therefore, if a ; G (0,  => D(aj then  - ai) = 1  G [1,1 + aj]. 25  aj = a ; + 1.  In the case where a j € [^, 1], for large enough values of D, the line f(x) 2(x — Dai)  =  crosses the line x = 1 i n the range [0,1). Because of this, we must loosen  the restriction that ctj G [1,1 + ctj].  It is straightforward to calculate that  begins to cross the line x = 1 i n the range [0,1) w h e n D =  f(x)  W e c a n ensure that  as D becomes smaller, the peak ipi w i l l always be i n region R by solving tpf = 1 for ctj when D = tp\  = 1  D(aj  -  on)  = 1  =>  7f—((Xj  -  = 1  on)  =4>  ctj  =  3a . t  Therefore, if ai € [^, 1], then ctj £ [1, 3aj].  •  Definition 19 (Separation). F o r a € 2 l , we define the separation i n a as n  separation(a)  =  max i€{l,...,n-l}  1 + 1  .  ttj  L i m i t i n g the separation is a convenient way to restrict the subset of 21 being considered. If separation(a)  = r, we say that "the divisor multiples i n a are separated  by at most a factor of r." F i g u r e 3.1 shows a n example of m u l t i - d i v i s o r S - R - T d i v i s i o n . T h i s example is performing the same calculation as i n figure 1.2, but it has c o m p u t e d the d i v i d e n d w i t h twice as m a n y digits of precision w i t h the same effective n u m b e r of uses of the adders. W e say "effective" because i n m u l t i - d i v i s o r S - R - T d i v i s i o n , there are several adders w o r k i n g i n parallel. In a real i m p l e m e n t a t i o n of m u l t i - d i v i s o r S - R - T d i v i s i o n , the values for a must be carefully chosen so that not too m u c h overhead is required to select a good p a r t i a l remainder. T h e r e is also a tradeoff between the amount of overhead i n choosing a good p a r t i a l remainder a n d the precision to w i t h w h i c h a g o o d p a r t i a l remainder is selected.  26  0.67  Po = 0.67 Pi = 2(0.67 -  -0.16  a D) 2  P4 = 2(2 (-0.16) + anD) 2  P7  = -0.155  = 2(2 (-0.155) + a D) = -0.115 2  x  Pll = 2(2 (-0.115) + a D) 3  3  =  Qo = 1  Qo =  0.035  Q3 = -a  Qz  = 0.90625  <76 =  Q  = 0.89453125  9io = - « 3  Qw = 0.8933105469  x  6  = 2(2 (0.035) - a D)  = -0.005  Qis =  Ql5 = 0.8933334351  P24= 2(2 (0.005) + a D)  = -0.155  923 =  Q2S = 0.8933333456  Pl6  4  x  7  x  Figure 3.1: An example of S-R-T division where three multiples of the divisor are used. In this example the dividend po — 0.67, and the divisor D = 0.75 with divisor multiples a = (0.75,1,1.25). The quotient Q* is 0.893.  3.2  Proof of Bernoulliness  In this section, we will show that all multi-divisor S-R-T division transformations are Bernoulli, given a necessary restriction on the multiples of the divisor. As in the case for a single divisor, it will be useful to define a sequence of intervals that are subsets of the sequence of sets that would arise from repeatedly applying TD,Q. to an initial open interval. Unless otherwise noted, assume that the function m denotes the Lebesgue measure. Definition 20. Given an initial open interval U C [0,1) and To^a G 9DT, we define  27  the infinite sequence of intervals U = {Ui}^  as  and  Ui = U  C/°C[0,i)  T , (Ui) D a  T , (Ui D a  t / ° £ [0,  orC7?C[i,l)  n [0, i ) )  a n d t/? g [ i , 1) a n d m(f/<h[0,i))  T ^ a ^ i  a n d C7? g [ ± , l )  U°%%\)  n [ i , i ) )  >m{Uif\{\,\)) and  m(f7 n[0,i)) i  Definition 21 (Critical Points). For a given Tu,  a  C = {  Ci  n  : i 6 {1, • • •, m } , a E B U {0, ±, 1}}  n  C2 < . . . < 1. C is called the set of critical  Lemma 11 (Doubling). GivenTr),a  points for T o  £  m  be the set of critical  j G {1,... , r a - 1}, then m(Ui+i) Proof. Since Ui C  [CJ,CJ \] +  =  ) a  • • •, V>n-i}}  a  n  d  c  i  <  .  ^ ^ sequence of intervals U be defined e  20 and let Ui be some interval  C = {ci,... ,c }  )  i  where a € 2 l , define the set  where 5 = {& : i < & < 1 and b G { a i - D , • • •, a - D } U O f ,  as in Definition  <m(f/ n[i l)).  in the sequence.  points for  If Ui C  T D , « .  Furthermore, [ C J , C J + I ]  for  let some  2m(Ui).  for some j £ { l , . . . , m - l } , because we are i n the first  case of the definition of U, either U? C [0, \) or U° C [ i , 1). B y simple inspection of the i n d i v i d u a l cases that define Trj , i(X  the points  Cj  and  Cj+i,  it is apparent that a l l of Ui, except possibly  fall w i t h i n the same case of Tr), a  Therefore, the resulting  interval Ui+i w i l l be double the length of Ui.  Definition 22 (Active Valleys). G i v e n T  •  Dt(X  € 93T , define n  V = {ctiD : i 6 { 1 , . . . , n) and \ < a D t  V is called the set of active valleys for  To, a  28  < 1} .  Definition 23 (Active Peaks). Given Tn, G Tl , define a  n  P = {if>f : t e { l , . . . , n - l }  andi<^f<l}.  P is called the set of active peaks for To, a  Lemma 12 (Non-shrinking). the sequence of intervals  {Ui}i^  active valleys for T r j , .  For any interval  a  m(U i) i+  > m(Ui) or m(U )  >  i+2  Proof, separation(a)  with separation(a)  Given TD,OC G  < | , let  be defined as above and let V denote the set of Ui G U such that V f l Ui = 0,  either  m(Ui).  < | implies that cti \  <  +  For a given separation, the  value of ip\ is maximized when ipf = 1. This implies that aj =  We calculate  the value of ipf with the assumption that ipf = 1 to get a bound on ipf for D < 1:  iPf  <  D(l  ai - ai) =  D(l ) ai  = £>(§&) =  5 •  Case 1: Consider when Ui C [0, \ ] . In this case, m(Ui \)  — 2m(Ui).  +  Case 2: Consider when Ui C [^,1). The interval Ui can span at most one peak. Therefore, m(£/j+i) > m(Ui). m{U ) i+2  =  A further observation is that since Ui+i C [0,  2m{Ui).  Case 3: Consider when Ui £ [0, \] and Ui %  1). In this case, Ui straddles  \ . From the definition of U, we see that in the worst case we might throw away up to half of Ui. Call the part not thrown away Ui and observe that m(Ui) >  \m(Ui).  Now, either Ui' C [0, \) or Ui' C [i, 1). If [// C [0, ±], then m(C/i+i) = 2m(<7;') > m{Ui).  If t/i' C [ i , 1), then m ( C / i ) = 2m(£//) > m(c/;).  Lemma 13. A multi-divisor when separation(a)  •  +2  S-R-T division  transformation  Tr>, G 9JI is expanding a  < |.  Proof. Let V be the set of active valleys (as defined in Definition 22) for a Trj,a- Let P be the set of active peaks (as defined in Definition 23) for a To,a- Let U = be the sequence of intervals associated with a To, and an initial interval U. a  29  {Ui\i^n  B y way of c o n t r a d i c t i o n , assume that a T£>  >a  is not e x p a n d i n g . T h i s means  that for some T r j , , there does not exist a n interval Ui where any of the points i n V a  are contained i n Ui. T h i s is true because i f any of the valley points are i n Ui, then Ui i +  = [0, e) or C / j  + 1  = [0,e], a n d after a finite number of steps, Ui w i l l have grown  to include a l l of [0,1). If there is a sequence U that avoids a l l points i n V, t h e n b y L e m m a 12 it must be true that the intervals i n the sequence can only double a finite number of times. L e t i e N b e the first index for w h i c h there is no j > i such that m(Uj) >  2m(Ui).  It now follows that Ui straddles ^ . T h e proof for L e m m a 12 reveals that this is the only s i t u a t i o n where it is not necessarily the case that either m{Ui \) +  or rn(Ui+2)  = 2m(Ui).  I n fact, Ui must straddle b o t h \ a n d m i n P .  not s t r a d d l e d a n d m(Ui f l or m(Ui z) +  m(Ui n  [0,  > 2m(Ui).  [0,  ^)) < m(Ui f l  [|,1)),  then either m(Ui+2)  =  2m{Ui)  If m i n P is  >  2m([/j)  I n the other possibility where m i n P is not straddled a n d  ^)) > m{Ui n  [1,1)),  we find that m(U ) i+2  >  2m{Ui).  A s s u m i n g that Ui straddles b o t h ^ a n d m i n P , we also observe that there can be no j > i such that m(Uj D [0, 5)) > m(Uj f l [5,1)) because this q u i c k l y leads to d o u b l i n g . I n other words, the right side must be larger t h a n the left side whenever we straddle | . Therefore, we must be i n the s i t u a t i o n where  t/i = ( i - e ' , i + e), e'<£ Ui+i  = (min{2(i - aiD),2(a D-  U  l + 2  = (2min{2(i  u  i+3  = (min{2(i - aiD),2{a  Ui+4  = (2min{2(i  => U  i + 5  i+l  -  aiD),2(a D i+1  D-  i+1  -aiD),2{a D i+1  = (min{2(i - aiD),2{a  D-  i+1  _(i+ ))}, 2^?) e  2vf)},^r) -2ltf)},2#) 2 ^ ) } , ^ ) = l7 3 i+  It is apparent that the interval represented by [/j+4 w i l l re-occur every other interval ad infinitum.  W e now use this interval to show that i n fact such a sequence  of n o n - e x p a n d i n g intervals is not possible.  30  Since  straddles \ , we can compare the length of the left a n d right sides  of J7j+4. L e t R = [^,2i/>f) denote the right side a n d let L = (4(^ — o^-D), \ ) a n d V  = (4(cti+iD  — 2ipy), ^) denote the two possibilities for the left side. T h e length  of the right side is = 2Vf -  m(R)  i,  while the length of the left side is the larger of two possible lengths m(L)  = \ - 4(i -  D)  ai  and m(L')  = i-4(a  i +  i£>-2^)  .  W e then compare the differences between the right side a n d each of the two possible left sides. T h e first possibility is m(R)  - m(L)  = 2$  a D))  - I - (I - 4(i -  = 2D(a  -  i+l  = 2a D  a  i  t  ) - 1 + 2-  4a D {  - GctiD + 1 ,  i+1  while the second possibility is m(R)  - m(L')  = 2^f - ± - (± - 4 ( a = 2D{a  l+x  = -2a D i+i  l + 1  2$))  -  - ai) - 1 + 4{a D l+1  + 6aiD -  - 2D{a  l+1  -  a )) t  1.  It is now clear that m(R)  - m(L)  = - (m{R)  - m(L'))  .  B u t this means that the length of the left side is always greater t h a n or equal to the length of the right side, w h i c h contradicts our a s s u m p t i o n t h a t the right side must be bigger t h a n the left side whenever the interval straddles ^.  • 31  Theorem 14. To,a G %Jl is Bernoulli Proo/. Let T = T o  i a  .  when separation(cx)  < |.  F r o m the definition of T , we see that T o  is C a n d that 2  > a  i n f o < < i |T'(a;)| = 2 > 1 since | T ' ( x ) | = 2 for a l l x for w h i c h the derivative is defined. x  Since i n f o < < i |T"(a;)| > 1, by T h e o r e m 4, there exists at least one /z such that zz is x  a s m o o t h T - i n v a r i a n t p r o b a b i l i t y measure. B y L e m m a 13 we see that T h e o r e m 3 holds w h e n separation(a)  < §. Hence, (T, /z) is w e a k - m i x i n g a n d b y T h e o r e m 2,  (T,/z) is B e r n o u l l i w h e n separation(a)  3.3  < |.  •  Some Restrictions on ex.  In section 3.2, we showed that i f a l l T D  ) Q  G 9Jt, i f separation(a)  < §, t h e n TD  >OC  is  B e r n o u l l i . I n this section, we construct examples of T G %Jl , for every n, that fail n  to be B e r n o u l l i w h e n the restriction that separation(a)  Theorem 15. For  TD,O. G  i/iere ea;zsr; uncountably  if separation(a)  ffin>4,  many a. for which Tu  < § is relaxed.  [|,1),  > §, then for each D G  is not ergodic.  t0l  Proof. W e b e g i n this proof by considering T G 9 J l = 4 n  A s s u m e that we relax the restrictions o n a b y e > 0. T h i s means that separation(a)  < | + e a n d that no peak can be above the line f(x) =  this r e l a x a t i o n , we can define a = (c*i, a ,0:3,0:4) 2  a subset of [0,1) is n o n - e x p a n d i n g . 40C+2 4 Pe 9  £  1  a n (  ^  a  4  =  40P+24Pe • ^  o r  o  u  W e let ai = r  With  w i t h respect t o a given D so that g+48D > 2 = a  8 0  £  8  or>+48£>e'  0  :  3  =  constructed a to be v a l i d , we must be careful  that conditions (a) a n d (b) of D e f i n i t i o n 15 h o l d . C o n d i t i o n (a) requires that the components of a r e m a i n i n ascending order. T h i s is satisfied w h e n e G (0, ^ ] . ordering is m a i n t a i n e d , separation(a)  Since  < 3, a n d minrj [i/2,i),ee(o,2/i5] 4 = 1-2 > 1, a  g  to verify that c o n d i t i o n (b) of D e f i n i t i o n 15 holds, it is sufficient to show (by L e m m a 10) that for a l l values of D a n d e, either ot\, a , or a G [5,1]- B y m a x i m i z i n g a n d 2  3  m i n i m i z i n g over e a n d D, we find that a\ G [0.375,0.7] a n d a G [0.625,1.3]. F i g u r e 2  32  3.2 provides a v i s u a l proof that as e is varied over [0, ^ ] a n d D is varied over [5,1], it is never the case that b o t h a\ < \ a n d a  2  that either a\ or a  2  > 1. Therefore, it is always the case  G [5, !]•  H a v i n g verified that our defined a satisfies D e f i n i t i o n 15, we calculate that peak ^  =  ( i ± i f ,  l{j±if§)  a n d peak </>  3  while r e m a i n i n g above the line g(x) below f(x)  = (f^,  ftl||).  — ^, a n d the point ip  W i t h this definition for  1  w i l l always be slightly  while r e m a i n i n g above the line g(x) = \ . A l l of the definitions have been  chosen so that we are i n a s i t u a t i o n where 1 — ip% = ^ 3 ~~ \ — 2(V'i ~~ \) = 2{ip\ A n o t h e r i m p o r t a n t feature i n this construction is the interval between a D 2  Since ip  2  and  |)-  —  a^D.  is not used i n our construction, it is possible to insert a n a r b i t r a r y number  of divisor multiples between a D 2  a n d a D . T h u s , the results i n this proof a p p l y 3  to T G 9Jt„ for a r b i t r a r i l y large n. F i g u r e 3.3 illustrates the type of t r a n s f o r m a t i o n that we have constructed. We are now i n a p o s i t i o n to show that there exists a set of points A w i t h 0 < m(A)  < 1, for w h i c h TD,OL{A)  = A.  T h i s is the definition of a t r a n s f o r m a t i o n  5). i + w - A* = [h- m -M+m - h)i 4. = [i - 2(1 - 1].  being non-ergodic [11, p. 59].  Define A = A\ U A  It can be shown by calculation that T o ( A i ) ] a  TD,OL{M)  = A. 2  Therefore, TD,CI(A)  U A  2  = A, 2  3  where A\  and  = [\ - (ipf  -  = A\ U A 3 , and  T^A^)  = A, a n d by definition, To,  a  is non-ergodic or  •  non-expanding.  33  Figure 3.2: Combined plot of the regions where a\(e,D)  < \ and ct2{e,D)  > 1.  Over the domain e 6 [0, ^ ] and D € [\, 1], it is never true that both  ct\ < \  and «2 > 1-  X  Figure 3.3: A n example of a non-ergodic system for Trj  a  ample n = 4, D =  a  =  (||,  1, | | ) ,  | + j^-. The thick lines represent T r j . ]a  € 9Jt„>4- In this exand  separation(a)  =  The coarse dashed line  represents the necessary separation restriction on a to guarantee that T D A  =  ) Q  §]  is ergodic. u  tl> I]  u  In this case, partial remainders in the set ( i t . !)  a  r  e  mapped back to A by T , . This D  a  means that T D . is not ergodic, and therefore not Bernoulli. ) Q  34  Theorem 16.  For To,  G 9Jt  a  3)  if separation(a)  > | , then for each D £ [5,1),  there exists an a for which T o i a is not ergodic. Proof. T h e proof for this theorem comes as a special case from the proof for T h e orem 15. C o n s i d e r a separation{a)  = (oti, 0:2, 0^3,0^4) as defined i n the proof for 15.  = | = | +  we are i n the special situation where a  2  When  = 0:3. Since  a l l of the results for the proof of T h e o r e m 15 s t i l l h o l d , we now have a n example transformation T w i t h only three unique multiples of D a n d this T has been proven to be non-ergodic. F i g u r e 3.4 gives a n example of a non-ergodic transformation for D = l.  •  2  0.25  0  0.5  0.75  1  X  F i g u r e 3.4: A n example of a non-ergodic system for T o , D =  a  n  d  « = ( § > f> i f ) -  T  n  e  t  m  c  ^  u  n  e  s  a  £ 9^3- In this example, r e  P  r e s e n t  I'D,a-  The  coarse dashed line represents the necessary separation restriction o n a to guarantee that To,a is ergodic. In this case, p a r t i a l remainders i n the set A = [±, ^ ] U [ ^ , i § M y § , 1) are m a p p e d back to A by T h i s means that Trj,  a  T,.  is not ergodic, a n d therefore not B e r n o u l l i .  35  D a  Theorem 17. For Trj,  a  there exist uncountably  € 9#2, if separation(a)  many a for which TD  is not ergodic.  >(X  Proof. A s s u m e that separation(a) ai = jjj  > 3, then for some D G (^,1),  < 3 + e a n d D G {\, ^ p ) .  F i r s t , we choose  so that a i D = | a n d a = 1 + OL\. O u r restriction o n D i n terms of e has 2  been chosen so that a /a\ 2  < 3 + e w h e n 0:2 = 1 + « i - Since 0 2 >  (a) of D e f i n i t i o n 15 is satisfied. Since a\ G (\, 5), a n d a  condition  G (1,1 + a\], b y L e m m a  2  10, c o n d i t i o n (b) of D e f i n i t i o n 15 is satisfied. T h u s , o u r defined a is always v a l i d . Define A = [±, D]. W e now a p p l y T = T  D>a  to A:  T [ i , D] = [ m i n { 2 ( i - D),2(a D ai  -  2  £>)},^]  = [ m i n { 2 ( ± - 4%), 2(D + \ - D)},D(a  2  =[min{I,i},D(l +  4  )]  ai  ^- ^)] 4  N o w , since \ < D < 1, 0 < m ( A ) < 1 a n d T A Dt(x  ergodic.  -  = A, b y D e f i n i t i o n T  D  ]  a  is not •  36  X  F i g u r e 3.5:  A n example of a non-ergodic system for Trj,  a  D = | and  S 93?2. In this example,  a = ( ^ , ^ ) . T h e thick lines represent  Tr>,a-  T h e coarse  dashed line represents the necessary separation restriction o n a guarantee that T  a  to  is ergodic. In this case, p a r t i a l remainders w i t h i n  the interval [|, |] m a p back to  |] a n d w h i c h means T o ,  a  is not  ergodic, a n d therefore not B e r n o u l l i .  3.4  Entropy of Multi-Divisor S-R-T Division  T h e c a l c u l a t i o n for entropy i n m u l t i - d i v i s o r S - R - T d i v i s i o n follows the same m e t h o d used for single divisor S - R - T d i v i s i o n . W e begin by showing that To,a is a P C - m a p .  Lemma 18. Proof.  Trj  tOC  eiVl  is a PC-map  B y i n s p e c t i o n , each T o ,  a  (as defined  in Definition  14)-  is a finite collection of line segments each w i t h slope  2. E a c h of these line segments is a C - m a p by the same argument used i n the proof for T h e o r e m 8. Therefore, by definition, each To,a is a P C - m a p .  Theorem 19.  The entropy  of any TD  >(X  S 9 ^ with separation(a) 37  •  < |  is log 2.  Proof. B y L e m m a 18, a l l To,  a  when separation(ot)  G 9Jt are P C - m a p s . B y T h e o r e m 14, TD,<X is B e r n o u l l i  < | and hence there exists a unique a.c.i.m. /J. T h e o r e m 7  says that R o h l i n ' s f o r m u l a for the entropy is true and therefore: h{T , ) D a  = J log | T b , ' | d i i = log 2 ^ dfi = log 2. a  •  38  Chapter 4 Future Work T h e o r i g i n a l question that inspired this thesis was "Is simple S - R - T d i v i s i o n ergodic for a l l real d i v i s o r s ? " In p u r s u i n g the answer to this p r o b l e m , we discovered that not o n l y is simple S - R - T d i v i s i o n ergodic for a l l divisors, but it is also B e r n o u l l i . H a v i n g established a B e r n o u l l i property, a n d having calculated the entropy for our transformations, we were able to use the K o l m o g o r o v - O r n s t e i n theorem to conclude that our transformations are isomorphic to each other. In p r o v i n g these i m p o r t a n t properties for simple S - R - T d i v i s i o n , we made extensive use of more general results f r o m d y n a m i c a l systems theory. Consequently, our results were shown to be easily extensible to more general d i v i s i o n systems. In general, it is difficult to prove that a p a r t i c u l a r class of transformations are ergodic or B e r n o u l l i .  O u r results have  p r o v i d e d a n effective means of p r o v i n g b o t h of these properties for a large class of S - R - T - l i k e d i v i s i o n algorithms. P r o m the standpoint of understanding a n algorithm's expected performance, it is necessary to know that when a stationary d i s t r i b u t i o n is f o u n d , it is unique. H a v i n g established the uniqueness of stationary distributions, the next step is to find the a c t u a l stationary d i s t r i b u t i o n for as wide a class of transformations as possible. In section 1.3, we verified a k n o w n expression for the stationary d i s t r i b u t i o n f u n c t i o n for TD where D e [|, 1). In a d d i t i o n , m a n y of the stationary d i s t r i b u t i o n functions  39  have been classified by Shively and F r e i m a n for D G [§, §], a l t h o u g h the derivations are not nearly as simple as for D G [§, 1). It turns out that things become very c o m p l i c a t e d w h e n D G [^, §]. In his thesis [22], Shively shows m a n y interesting properties for the stationary d i s t r i b u t i o n functions i n this region. F o r example, he shows that there are many different intervals of D where there are a n infinite number of different stationary d i s t r i b u t i o n equations. A s such, the g r a p h of the shift average for D G [5, |] is far f r o m complete a n d appears to have a°complex p a t t e r n (from the few points that have been plotted i n this region). T h i s is s u r p r i s i n g considering the s i m p l i c i t y of the u n d e r l y i n g transformation.  A better u n d e r s t a n d i n g of this  final  region of simple S - R - T d i v i s i o n would be an interesting goal to pursue. In the work of F r e i m a n [6], it was first shown that the shift average for D G [|, |] is constantly 3, w h i c h can be easily shown to be the m a x i m u m possible shift average. T h i s property was then used by M e t z e [15] to o b t a i n a version of S - R - T d i v i s i o n that has a n expected shift average of 3 for a l l divisors. A n o t h e r area to pursue w o u l d be to explore shift averages for m u l t i - d i v i s o r S - R - T d i v i s i o n a n d , if other plateaus are f o u n d , they could possibly be used to o b t a i n higher expected shift averages for a l l possible divisors. U n d o u b t e d l y , o b t a i n i n g a complete u n d e r s t a n d i n g of the stationary d i s t r i b u t i o n functions for m u l t i - d i v i s o r d i v i s i o n w o u l d be even more difficult t h a n it is for simple S - R - T d i v i s i o n . It is possible that such results i n this area could lead to improvements i n m o d e r n S - R - T d i v i s i o n .  R e l a t e d to this, it  w o u l d be interesting to attempt to extend the results of this thesis to m o d e r n S - R - T division.  40  Bibliography [1] R u f u s B o w e n . B e r n o u l l i maps of the interval. Israel  J. Math.,  28(1-2):161-168,  1977. [2] A b r a h a m B o y a r s k y and Pawel G o r a . Laws  of chaos.  B i r k h a u s e r B o s t o n Inc.,  B o s t o n , M A , 1997. Invariant measures a n d d y n a m i c a l systems i n one dimension. [3]  Digital  Joseph F. Cavanagh.  tion.  Computer  Arithmetic:  Design  and  Implementa-  M c G r a w - H i l l , N e w Y o r k , N Y , 1984.  [4] J . Cocke a n d D. W . Sweeney.  H i g h speed arithmetic i n a parallel device.  T e c h n i c a l report, I B M , F e b r u a r y 1957. [5] M i l o s D. Ercegovac a n d Tomas L a n g . Division  algorithms  and implementations.  and square  root:  digit-recurrence  Kluwer Academic Publishers Group, Norwell,  M A , U S A , a n d Dordrecht, T h e Netherlands, 1994. [6] C . V . F r e i m a n . S t a t i s t i c a l analysis of certain b i n a r y d i v i s i o n algorithms.  IRE,  49:91-103, 1961.  [7] D. H a r r i s , S. O b e r m a n , and M . H o r o w i t z .  S R T d i v i s i o n architectures  i m p l e m e n t a t i o n s . In 13th IEEE  Symposium  on Computer  ings,  California,  USA, v o l u m e 13 of Symposium  July  Computer  6-9,  1997, Asilomar,  Arithmetic,  [8] A . N . K o l m o g o r o v . Akad.  Proc.  Nauk  SSSR,  Arithmetic:  and  proceedon  pages 18-25. I E E E C o m p u t e r Society Press, 1997. A new invariant for transitive d y n a m i c a l systems. 119:861-864,  1958.  41  Dokl.  [9] A . N . K o l m o g o r o v . p h i s m s . Dokl.  E n t r o p y per unit time as a metric invariant of automor-  Akad.  Nauk  SSSR,  124:754-755, 1959.  [10] A . L a s o t a a n d James A . Yorke.  O n the existence of invariant measures for  Trans.  piecewise m o n o t o n i c transformations.  Amer.  Math.  Soc, 186:481-488  (1974), 1973. [11] A n d r z e j L a s o t a a n d M i c h a e l C . Mackey. Chaos, fractals,  and noise.  Springer-  V e r l a g , N e w Y o r k , second edition, 1994. Stochastic aspects of d y n a m i c s . [12] Frangois L e d r a p p i e r . Some properties of absolutely continuous invariant measures on a n i n t e r v a l . Ergodic  Theory  Dynamical  Systems,  l ( l ) : 7 7 - 9 3 , 1981.  [13] T i e n Y i e n L i a n d James A . Yorke. E r g o d i c transformations f r o m a n interval into itself. Trans.  Amer.  Math.  Soc., 235:183-192, 1978.  [14] O . L. M a c S o r l e y . High-speed arithmetic i n b i n a r y computers. Proc. IRE, 4 9 : 6 7 91, 1961. [15] G e m o t M e t z e . A class of b i n a r y divisions y i e l d i n g m i n i m a l l y represented quotients. IRE  Trans,  on Electronic  [16] D o n a l d S. O r n s t e i n . Advances [17]  Computers,  EC-ll:761-764,  D e c . 1961.  B e r n o u l l i shifts w i t h the same entropy are isomorphic.  in Math.,.4:337-352,  D o n a l d S. O r n s t e i n . Ergodic  1970.  theory,  randomness,  and dynamical  systems.  Yale  U n i v e r s i t y Press, N e w H a v e n , C o n n . , 1974. James K . W h i t t e m o r e Lectures i n M a t h e m a t i c s given at Y a l e University, Y a l e M a t h e m a t i c a l M o n o g r a p h s , N o . 5. [18] K a r l Petersen. Ergodic  theory.  C a m b r i d g e U n i v e r s i t y Press, C a m b r i d g e , 1983.  [19] J . E . R o b e r t s o n . A new class of d i g i t a l d i v i s i o n methods. IRE Trans.  Computers,  E C - 7 : 2 1 8 - 2 2 2 , Sept. 1958.  42  Electronic  [20] V . A . R o h l i n . E x a c t endomorphisms of a Lesbesgue space. Amer. Transl.,  Math. Soc.  39, 1964.  [21] P a u l Shields. The theory  of Bernoulli  shifts.  T h e U n i v e r s i t y of C h i c a g o Press,  C h i c a g o , I l l . - L o n d o n , 1973. C h i c a g o Lectures i n M a t h e m a t i c s . [22]  R o b e r t Shively. Stationary  division.  distribution  of partial  remainders  in S-R- T  digital  P h D thesis, U n i v e r s i t y of Illinois, 1963.  [23] K . D . Tocher. Techniques of m u l t i p l i c a t i o n a n d d i v i s i o n for a u t o m a t i c computers.  Quart.  J. Mech.  [24] Peter Walters. An introduction  Appl.  Math.,  to ergodic  1982.  43  binary  11:364-384, J u l y - S e p t e m b e r 1958.  theory.  Springer-Verlag, N e w Y o r k ,  

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
China 21 22
United States 7 4
Japan 2 0
Poland 1 0
City Views Downloads
Beijing 19 0
Ashburn 4 0
Unknown 4 4
Shenzhen 2 22
Tokyo 2 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}
Download Stats

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051671/manifest

Comment

Related Items