UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

S-R-T division algorithms as dynamical systems 2002

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
ubc_2002-0175.pdf [ 1.74MB ]
Metadata
JSON: 1.0051671.json
JSON-LD: 1.0051671+ld.json
RDF/XML (Pretty): 1.0051671.xml
RDF/JSON: 1.0051671+rdf.json
Turtle: 1.0051671+rdf-turtle.txt
N-Triples: 1.0051671+rdf-ntriples.txt
Citation
1.0051671.ris

Full Text

S-R-T Division Algorithms As Dynamical Systems by Mark A . M c C a n n B . S c , University of Br i t i sh Columbia, 1999 A T H E S I S S U B M I T T E D I N P A R T I A L F U L F I L L M E N T O F T H E R E Q U I R E M E N T S F O R T H E D E G R E E O F Master of Science in T H E F A C U L T Y O F G R A D U A T E S T U D I E S (Department of Computer Science) We accept this thesis as conforming to the required standard The University of British Columbia A p r i l 2002 © Mark A . M c C a n n , 2002 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of C o m p t e r Sciintt The University of British Columbia Vancouver, Canada Date DE-6 (2/88) Abstract S-R-T division, as it was discovered in the late 1950s [4, 19, 23], represented an important improvement in the speed of division algorithms for computers at the time. A variant of S -R-T division is st i l l commonly implemented in computers today. Al though some bounds on the performance of the original S-R-T division method were obtained, a great many questions remained unan- swered. In this thesis, S-R-T division is described as a dynamical system. This enables us to bring modern dynamical systems theory, a relatively new development in mathematics, to bear on an older problem. In doing so, we are able to show that S-R-T division is ergodic, and is even Bernoull i , for al l real divisors and dividends. i i Contents Abstract ii Contents iii List of Figures v Acknowledgements vi 1 Introduction and Background 1 1.1 Introduction to S-R-T Divis ion 1 1.2 S-R-T Divis ion as a Dynamical System 5 1.3 Shift Average for D € [|, 1) 12 2 Bernoulli Property 15 2.1 Proof of Bernoulliness 16 2.2 Entropy oi TD • 19 3 Extensions to Multi-Divisor S-R-T Division 23 3.1 Mul t i -Div i sor S-R-T Divis ion 23 3.2 Proof of Bernoulliness 27 3.3 Some Restrictions on a 32 3.4 Entropy of Mul t i -Div isor S-R-T Divis ion 37 i i i 4 F u t u r e W o r k B i b l i o g r a p h y List of Figures 1.1 A pseudo state-machine for converting to binary 4 1.2 S-R-T division where p0 = 0 . 6 7 , and D = 0 . 7 5 5 1.3 Following partial remainder magnitudes graphically for D = 0 . 7 5 and p0 = 0 . 6 7 7 1.4 App ly ing T 4 / 5 to x = j one hundred times 8 1.5 A p p l y i n g T 4 / 5 to x = f + 0 . 0 0 0 0 1 one hundred times ' 8 1.6 App ly ing P associated with T 3 / 5 to f{x) = 1 six times 1 1 1.7 App ly ing P associated wi th T 3 / 5 to f(x) — ^~ f^2 ^ six times 1 2 3.1 S -R-T division where p0 = 0 . 6 7 , D = 0 . 7 5 , and a = ( 0 . 7 5 , 1 , 1 . 2 5 ) 2 7 3 .2 Combined plot of the regions where oti(e, D) < | and a2(e, D) > 1 for proof of Theorem 1 5 3 4 3 .3 A n example of a non-ergodic system for T D ) Q G 9Tt„>4 3 4 3 .4 A n example of a non-ergodic system for To,a £ 9^3 3 5 3 .5 A n example of a non-ergodic system for T o ; a G WI2 3 7 v Acknowledgements I would like to thank my thesis supervisor, Nick Pippenger, for his patient guidance and for the inspiration for this thesis. I would like to thank Mark Greenstreet for the many excellent comments and suggestions that he provided as my second reader. I would like to thank my good friends Ellen Gethner, Wesley Wong, and Michael Forbes for many good conversations relating to my thesis, and for their numerous helpful suggestions. I would especially like to thank my wife Karen for being a constant support and encouragement to me throughout my Master's degree. Finally, I would like to thank my parents for all the support and en- couragement they have given me through numerous years of schooling. M A R K A . M C C A N N The University of British Columbia April 2002 vi Chapter 1 Introduct ion and B a c k g r o u n d 1.1 Introduction to S-R-T Division "S -R-T division" roughly refers to a class of non-restoring, binary division algorithms that have been designed for floating-point computers [3, 5, 6, 7, 14, 22]. The term "non-restoring" refers to the fact that partial remainders are allowed to range freely through the interval (—1,1), rather than being restored to the positive realm before proceeding to the next step. This feature reduces uses of the adder by about fifty percent. A n equally important feature of this algorithm is the "S -R-T" optimization from whence the algorithm gets its name. In the late 1950's, Sweeney [4], Robertson [19], and Tocher [23] independently made the observation that whenever a part ial remainder is in the range (—|, | ) , there wi l l be one or more leading zeros that can be shifted through in a very short amount of time (usually one cycle). The more leading zeros in a given step, the more the algorithm can avoid costly uses of the adder. A further development of this original algorithm, which is s t i l l called S-R-T division, is the algorithm most often implemented in modern floating- point units. In modern S-R-T division, a fixed number of quotient digits are produced every cycle as opposed to a variable number [5, pp. 37-62]. 1 A n example of modern S-R-T division in use is Intel's first release of the P e n t i u m ™ C P U with its infamous "Pentium Bug," which was really just a small error in its S-R-T division implementation. This thesis wil l restrict its attention to the original version of S-R-T division. To present the simplest type of S-R-T division, we begin with a few definitions for an algorithm similar to that presented by Shively [22, pp. 3-4]: (a) n represents the number of iterations performed in the algorithm. (b) po is the dividend (or initial partial remainder) normalized so that po G [\, !)• (c) Pi G (—1,1), i G N, is the partial remainder after the ith step. (d) D is the divisor normalized to 1). (e) qi G {—1,0,1} (i G {0 , . . . , n — 1}) is the quotient digit generated by the i th step. 71-1 (f) Qn = ^2 % is the "rounded off" quotient generated after n steps of the i=0 algorithm. Given the above definitions, after n steps of the division algorithm, we would like it to be true that po = DQn + s{n) where e(n) is a term that goes to zero as n goes to infinity. A recurrence relation for the S-R-T division algorithm can be stated as Pi+i = < 2pi 2(pi - D) 2(pi + D) \Pi\ < \ \pi\ > \ and pi > 0 \pi\ > \ and pi < 0, 2 and Qi = < 0 : \ P i \ < i 1 : \Pi I > 5 a n d Pi > 0 - 1 : > 5 and pi < 0 . By observing that (2{Pi-{Q)D) 2 (P i - (1)D) 2{Pi-{-l)D) we can rewrite the definition of pi+i as Pi+i = \ \Pi\ < 5 \Pi\ > \ and pi > 0 |pi| > 5 and pi < 0 , After n steps have been completed, we have pn = 2nPo - 2nq0D - 2n-lqiD 2lqn-lD.. and then after dividing by 2 n and solving for PQ we find that PO = 7T + Pn , qoD q\D 2° 71-1 + +... + Qn-lD Jn—1 P n i=0 Now let e(n) = pn/2n and let Q* = l i m ^ o o Q n - Since |p n | < 1, in the limit as n goes to infinity Po = DQ*. The quotient bits being generated are not in a standard binary representa- tion, but it is a simple matter to convert the answer back to standard binary without using any expensive operations. Figure 1.1 shows a simple pseudo state-machine (re- ally a push-down automaton) that converts positive floating-point numbers in the {—1,0,1} representation into binary. Figure 1.1: A pseudo state-machine for converting sequences of { — 1,0,1} into sequences of {0,1} (binary). We assume that the input sequence corresponds to a positive number. The letter 'Z ' is used to indicate that the end of the sequence has been reached, and the symbol e represents the null string. We represent a run of m zeros as 0 • • • 0 and a run of m ones as 1 • • • 1. Sequences of symbols should be read m from left to right. For example, the expression 1/10 ••• 0 means: if a 1 is encountered in the input sequence, write a 1 followed by m zeros. The above conversion automaton implies that conversion happens after the calculation is completed. In reality, the conversion from the generated quotient bits to standard binary is done in hardware on-the-fiy, using registers to convert runs of zeros into runs of zeros or ones in parallel, or by performing a single subtraction. Figure 1.2 shows an example of using the S-R-T division algorithm to divide 0.67 by 0.75. The steps that produce non-zero quotient bits have been shown. In this example, after six uses of the adder, the quotient (0.893) has been determined to four digits of precision. 4 Po = 0.67 0.67 Pi = 2(0.67 - D) -0 .16 go = 1 Qo = 1 Pi = 2(2 2 ( -0.16) + £>) = 0.22 Qz = - 1 Qz = 0.875 Pi = 2(2 2(0.22) — D) = 0.26 1 Qe = 0.890625 P9 = 2(2 1(0.26) - D) = -0 .46 qs = 1 Qs = 0.89453125 Pn = 2(2 1 ( -0.46) + D) = -0 .34 Qw = - 1 Qw = 0.8935546875 Pl3 = 2(2 1 (-0.34) + D) = 0.14 912 = - 1 Ql2 = 0.8933105469 Figure 1.2: A n example of S-R-T division when the dividend po = 0.67, and the divisor D = 0.75. The quotient Q* is 0.893. Now, with this simple system of division in hand, we might want to ask certain questions about its performance. For example, we could ask "How many bits of precision are generated per iteration of the algorithm on average?" To answer this question, we must look at the magnitude of \Q* — Qn\ = \Pn/2n\. The number of bits of precision on the nth step is then n — log 2 P n . In the worst case, pn is close to 1, and therefore we get at least one bit of precision per iteration of the algorithm, regardless of the values of D or PQ. Of course, a designer of actual floating-point hardware probably wants to know the expected performance based on the expected values of pn. To answer the many variants of this type of question, it is clear that we must know something about the distribution of partial remainders over time. The remainder of this thesis is devoted to extending what is known about the answer to this type of question as it relates to S-R-T division and its variants. 1.2 S-R-T Division as a Dynamical System The example in figure 1.2 makes it clear that keeping track of the signs of successive partial remainders is irrelevant in determining how many times the adder will be 5 used for a part icular calculat ion. For this reason, we only need to consider the magnitudes of successive par t ia l remainders. We now give a reformulat ion of S - R - T d iv is ion that w i l l al low us to look at d iv is ion as a dynamica l system. Definition 1 (S-R-T Division Transformation). For D G [5,1), we define the funct ion TD : [0,1) ->• [0,1) as TD(X) = { 2x 2(D - x) 2(x - D) 0 < x < \ \ < x < D D < x < 1 Th i s t ransformat ion of the unit interval represents the successive par t ia l remainders that arise as S - R - T div is ion is carried out by a divisor D o n a d iv idend x. D is normal ized to 1). T h e d iv idend x is normal ized to 1) ini t ia l ly, whi le each of the successive par t ia l remainders Tjj(x) (n G N) subsequently ranges through [0,1). B y using the characterist ic funct ion for a set A defined as 1 A ( * ) = J 1 : i e A 0 : x g" A , we can rewrite as TD(x) = 2x • l ^ i j ( 1 ) + 2(D - x) • l r i , o ) 0 r ) + 2(x - D) • l m ) ( x ) . (1.1) If we plot equation 1.1 on the uni t interval , we obta in a very useful v isual - izat ion of our t ransformat ion. F igure 1.3 shows the plot of To.75(0;) combined w i th a plot of the successive par t ia l remainders that arise whi le d iv id ing 0.67 by 0.75. Th i s is the same system that was presented earlier i n figure 1.2. Not ice that a vert ical l ine in the interval [\,D) corresponds to a subsequent f l ip in the sign of the next par t ia l remainder. 6 0 0.25 0.5 0.75 1 X F i g u r e 1.3: A n example of fol lowing par t ia l remainder magnitudes graphical ly for D — 0.75 and po = 0.67. The heavy sol id lines represent the trans- format ion To.75, whi le the abscissa of the th in vert ical lines represent successive par t ia l remainder magnitudes. F igure 1.3 shows an example of fol lowing the trajectory of a single par t ia l remainder for a part icular divisor. Af ter ten appl icat ions of the To.75, there is not any obvious regular pat tern, al though we expect to see one eventual ly since the quotient is rat ional in this case. O f course, most numbers are not ra t ional and we can deduce that for most numbers, the transformat ion w i l l never exhibi t a repeating pat tern. In figures 1.4 and 1.5, we see that a very smal l change in the value of the in i t ia l par t ia l remainder quickly produces large differences in the observed behaviour of the subsequent par t ia l remainders. Our system appears to be chaotic (it certainly has sensitive dependence on in i t ia l condit ions and is topological ly t ransi t ive), and, i f this the case, we w i l l gain l i t t le understanding by s tudy ing the trajectories of 7 i nd iv idua l par t ia l remainders. The logical next step is to study the behaviour of d ist r ibut ions of points over the whole interval. l r 0 . 7 5 0 . 5 0 . 2 5 100 F i g u r e 1.4: The result of app ly ing T 4 / 5 to x = j one hundred t imes. ST 0 . 7 5 0 . 5 0 . 2 5 J0 20 40 60 80 100 n F i g u r e 1.5: The result of app ly ing T 4 / 5 to x = j + 0.00001 one hundred times. T h e area of understanding the behaviour of ensembles of points under re- peated transformat ion is the realm of dynamica l systems theory. For the remainder of this thesis, we assume a certain amount of fami l iar i ty w i th the fundamentals of 8 dynamica l systems theory (or ergodic theory), which requires some basic under- standing of measure theory. We wi l l include a few helpful background mater ia l definit ions as they are needed, but mostly we wi l l provide references. A very good int roduct ion to the study of chaotic systems is Lasota and Mackey 's book Chaos, Fractals, and Noise [11]. For a more detailed in t roduct ion to ergodic theory (along w i th the necessary measure theory needed to understand this thesis), Peter W a l - ters's book An Introduction to Ergodic Theory [24] and K a r l Petersen's book Ergodic Theory [18] are h ighly recommended. Definition 2 (Probability Space). If B is a cr-algebra on subsets of a set X and if m is a measure on B where m(X) = 1, then the tr iple (X, B, m) is cal led a probability space. (See [24, pp. 3-9] and [11, pp. 19-31] for a good overview of basic measure theory and Lebesgue integration.) Definition 3 (Stationary Distribution). Let (X, B, m) be a probabi l i ty space, let P be the Perron-Frobenius operator associated w i th a non-singular t ransformat ion T : X —>• X, and let L 1 denote the Ll space of (X,B,m)^. If / € L1 is such that Pf = ft then / is cal led a stationary distribution of T. Definition 4 (Perron-Frobenius operator). For a probabi l i ty space (X,B,m), the Perron-Frobenius operator associated w i th a non-singular t ransformat ion T : X —)• X is defined by f Pf(x) d m = [ f{x) d m , for B e B . JB JT-^B) For a piecewise C 2 § transformat ion T w i th n pieces, we can give an expl ic i t formula for the Perron-Frobenius operator. Let A = {Ai,A2,... ,An} be the par t i - t ion of X which separates T into n pieces. For i € { 1 , . . . , n} , let ti(x) represent the *For a probability space (X,B,m), the L1 space of (X,B,m) is the set of / : X -> E satisfying Jx \f(x) \ dm < co. *The o symbol will be used to indicate that a given relation holds except possibly on a set of measure zero. §C 2 denotes the set of all functions with two continuous derivatives. 9 natura l extension of the ith. C2 funct ion T ( x ) | ^ i . The Perron-Probenius operator for T is then In part icular , for T o (as in equation 1.1), Pf{x) = \ f { \ x ) - l m { x ) + \f{D - \x) • l ( 0 ,2D - i ] (a j ) + \f{D + \x) • l [ 0 l 2 - 2 D ) ( ^ • (1-2) W i t h equat ion 1.2 we can show precisely what happens to an in i t ia l dis- t r ibu t ion of points (described by an integrable function) after they are repeatedly transformed under Try. Figures 1.6 and 1.7 show what happens to two different in i t ia l d is t r ibut ion of points after five appl icat ions of the Perron-Frobenius operator associated w i th T 3 / 5 ( z ) . B y the fifth appl icat ion, the distr ibut ions look remarkably simi lar. One might guess that they are both approaching the same final d is t r ibut ion. Th i s s i tuat ion is i n marked contrast to chaotic behaviour observed i n figures 1.4 and 1.5. tll{x) dx f(t-\x))-lu[Ai){x) 10 2 1 . 5 1 0 . 5 0 2 1 . 5 1 0 . 5 0 2 1 .5 1 0 . 5 0 0 0 . 2 5 0 . 5 0 . 7 5 1 X CO 0 0 . 2 5 0 . 5 0 . 7 5 1 X l O 0. 0 0 . 2 5 0 . 5 0 . 7 5 1 a: 2 1 .5 1 0 . 5 0 2 1. 5 1 0 . 5 0 2 1 .5 1 0 . 5 0 0 0 . 2 5 0 . 5 0 . 7 5 1 X 0 0 . 2 5 0 . 5 0 . 7 5 1 X 0 0 . 2 5 0 . 5 0 . 7 5 1 X ure 1.6: The result of apply ing the Perron-Frobenius operator P associated w i th T 3 / 5 to f(x) = 1 six t imes. 11 2 . 5 2 ^ 1 0 . 5 0 2 1 . 5 1 0 . 5 0 0 0 . 2 5 0 . 5 0 . 7 5 1 X CN 0 0 . 2 5 0 . 5 0 . 7 5 1 x 2 1 . 5 1 0 . 5 0 2 . 5 ^ 2 ^ 1 . 5 a, i 0 . 5 0 2 ^ - 1 . 5 0 . 5 0 2 1 . 5 0 0 . 2 5 0 . 5 0 . 7 5 1 X 0 0 . 2 5 0 . 5 0 . 7 5 1 1 l O a, 0 . 5 0 0 . 2 5 0 . 5 0 . 7 5 1 X 0 0 . 2 5 0 . 5 0 . 7 5 1 F i g u r e 1.7: T h e result of app ly ing the Perron-Frobenius operator P associated 1 f1 • • wi th T3 /5 to fix) = -—- / — six t imes. !og2 J 1 / 2 x 1.3 Shift Average for D G [§ , 1) A n exact equat ion for the stat ionary d is t r ibut ion when D € [|, 1) was first given by Fre iman [6] and is restated by Shively [22] as 1 1 / ( z ) = -Q\O,2D-\){?) + 2^1 [2£>- i , i ) (» ) • (1-3) To verify that this is a stat ionary d is t r ibut ion funct ion, we begin by app ly ing the Perron-Frobenius operator as given in equation 1.2 to equat ion 1.3 and veri fy ing 12 that Pf(x) = f(x). So then, apply ing P to / we get Pf(x) = ^ ^ l [ o , 2 D - i ) ( H + 2 ^ 1 [ 2 i > - i , i ) ( 5 a ; ) ) 1 [ o , i ) ( a ; ) + \ ^ l [o ,2 i>- i ) ( -D - hx) + -fijL[2D-\,\){D ~ k ) ) l(0,2£»-i](a;) + \ ^ l [ 0 , 2 X ? - l ) ( - D + \x) + 2 ^ 1 [ 2 D - l , l ) P + \x)^j l[Q,2-2D){x) • Assuming that D € [5,1), and observing that a; € [0,1), Pf(x) = \ ^ l [ o , 4 D - 2 ) ( a j ) + ^ 1 [ 4 D - 2 , i ) ( a ; ) ) l[o,i)(a:) + £ ^ 1 ( 2 - 2 D , l ) ( a ; ) + ^ 1 ( 0 , 2 - 2 £ > ] ( 2 ; ) ) l(0,2D-l](a;) + £ ( ^ 1 [ 0 , 2 D - 1 ) ( ^ l[0,2-2O)(a;) • Fina l l y , assuming that D € [|, 1), we have P f ( x ) = 2^1[0,l)(a;) + 2^ 1 (2-2D,2 i3- l ] (^ ) + ^ l (o ,2 -2D] (a ; ) + 42jl[o,2-2D)(a:) 3 1 = ^1 [ 0 , 2 - 2 D ) ( ^ ) + ^ 1(0,2-20] (*) + - ^ l [ 2 - 2 D , 2 D - l ) { x ) + ^T5 1(2-2D,2£)-l](a;) + 2^1[22>-i,i)(a0 = ^l[o,2D-i)(a;) + 2^1[2D-i , i ) (a:) = / ( » ) • One of the pr imary uses of having a formula for the d is t r ibut ion of par t ia l remainders is for calculat ing the shift average for a given divisor. The shift average is the average number uses of the shift register (single shift or mul t ip l i ca t ion by two) between uses of the adder. Under the assumption that a register shift is a much faster operat ion than using the adder, the shift average gives a useful character izat ion of the expected performance of our a lgor i thm for a given divisor. W i t h equat ion 1.3, we know the fract ion of bits that require the use of the adder. To calculate the average number of zero bits generated between non-zero bits (bits requir ing use of 13 the adder), we take the reciprocal of the fract ion of bi ts that require the adder. We calculate the shift average for a div isor D e [f, 1) to be •<">-I4 = * £ T - (L4) Since have not proven that the stat ionary d ist r ibut ions f rom S - R - T d iv is ion are unique, we have no way of knowing whether or not a shift average calculat ion in equat ion 1.4 is correct. To prove that a l l stat ionary d ist r ibut ions are unique, we need to show that TD is ergodic for a l l D G [^,1)- Fre iman [6] shows that T o is ergodic for rat ional D, but we extend this result for real D. In the next section we show that a l l are Bernou l l i and it is known that having the Bernou l l i property impl ies ergodicity. Before concluding this chapter w i th a defini t ion for ergodicity, we w i l l briefly comment on the der ivat ion of stat ionary distr ibut ions for D G [|, f ) . For D G [|, | ) , the stat ionary d is t r ibut ion functions have been derived, and their associated shift average functions have been shown to be constantly three [6, 22]. T h e layout of stat ionary d is t r ibut ion functions in the region D G [ j , § ) has several surpr is ing propert ies and is far f rom being ful ly understood. We discuss the calculat ion of shift averages as an interesting area for future investigation i n Chapter 4. Definition 5 (Ergodic [11]). Let (X, B,m) be a probabi l i ty space and let a non- singular t ransformat ion T : X —> X be given. T h e n T is ergodic i f for every set B G B such that T~L{B) = B, either m(B) = 0 or m(X \B) = 0. 14 Chapter 2 B e r n o u l l i P r o p e r t y In this chapter, we w i l l prove that the class of transformations of the interval that characterizes the S - R - T d iv is ion for a l l real divisors D has the property that each t ransformat ion TD is Bernou l l i . A l though the basic concept of a Bernou l l i shift (the things to which transformations having a Bernou l l i property are isomorphic to) is not diff icult, a complete def ini t ion requires enough auxi l iary concepts from measure theory (concepts not used anywhere else in this thesis) that we chose to refer the interested reader to [17, 18, 21, 24] and other selections l isted in the Bib l iography. Nei ther an understanding of Bernou l l i shifts, nor a formal def ini t ion of what it means to be Bernou l l i is required to follow the proofs i n this chapter. Hav ing said this, we should ment ion informal ly the connection between Bernou l l i shifts and transformations having the Bernoul l i property. The transformat ion TD is an non-invert ible endomorphism of the unit inter- val. Th i s means that f rom a given par t ia l remainder we can predict a l l future par t ia l remainders, but we cannot uniquely predict past par t ia l remainders. There is a nat- ura l way (called the natura l extension) to make our t ransformat ion invert ible (an automorphism) on a larger space. Specif ical ly, each non-invert ible t ransformat ion TD having the Bernou l l i property has an extension to an automorphic transforma- t ion, isomorphic to a two-sided Bernou l l i shift [18, pp. 13,276]. F rom the way that 15 entropy for a transformat ion is denned, the entropy for an automorphic Bernou l l i t ransformat ion associated w i th a non-invert ible Bernou l l i t ransformat ion is the same as the entropy for the non-invert ible Bernou l l i t ransformat ion. B y proving that a l l transformations T p are Bernou l l i , and by proving that entropy of each To is the same, we wi l l be able to conclude that the natura l extensions of S - R - T d iv is ion algor i thms are isomorphic to each other for a l l divisors. 2.1 Proof of Bernoulliness Definition 6 (of Bowen [1], Expanding). We w i l l say that a t ransformat ion T on an interval is expanding i f it has the property that s u p n > 0 u-(TnU) = 1 for a l l open intervals U w i th p{U) > 0, where / i is any normal ized measure that is absolutely continuous w i th respect to Lebesgue measure. Definition 7 (Straddle). Let U be an interval of reals (either open, closed, or half open) and let p G K + . If p G U°J then we say that U straddles p. Theorem 1. The S-R-T division transformation is expanding for all real divisors. Proof. Let (X,B,m) be a probabi l i ty space where X = [0,1), B is the Bore l o- algebra on X and m is the Lebesgue measure on H*. Let TD : X ->• X be the S - R - T d iv is ion t ransformat ion for a given normal ized divisor D as defined in equation 1.1. *The symbol o as the exponent of an interval denotes an open version of the interval. *For an interval [a, b], the Lebesgue measure is defined as m([a, b]) = b — a. 16 Let us define an infinite sequence of intervals U = {C/jjieN as Ui = U and C / ° C [ 0 , i ) o r t / ? C [ ± , l ) U° g [0, i ) and (7° g [ i , 1) and TD(C7in[0,i)) n [ ± , i ) ) m(C7i D [0, i ) ) > m ( < 7 i n [ i , l ) ) tf?g[0,±)andtf?g[±,l) and n [ o , i ) ) < m ( ( 7 i n [ i , l ) ) . Property 1. For all Ui such that \ U° and D g U°, m{Ui+i) = 2m{Ui). Proof. If a U° is a subset of either [0, [5,-D), or [D, 1), then we are in the first case of the U definition and we apply directly. Since each of the three cases of the TD expand an interval by a factor of two, it is clear that m(T£>(Ui)) = m(Ui+\) = 2m(Ui). Property 2. For all Ui where D 0 Ui, m(Ui+i) > m(Ui). Proof. Assume that D $ Ui. If j £ Ui, then according to Property 1, U{+\ doubles. Otherwise, \ 6 U{ and therefore, to find E/i+i, we must consider the second and third cases of the U sequence. In the worst case, m(Ui D [0, ^)) = m(Ui n [^,D)), and regardless of which half we choose, m(Ui f l [0, ^)) = m(Ui f l [5, D)) = ^m(Ui). By applying Trj to this truncated interval, we double what we halved so that m(Ui) = m{Ui+i). By way of contradiction, let us assume that there exists a sequence of U that never expands to fill X. Such a sequence can never include the point D and the following Property wil l hold: Property 3. There exists N such that for all i > N (a) m ( [ / ; n [ 0 , ^)),m(Uif][^, 1)) > 0 (in other words, all subsequent intervals must straddle \), and 17 (b) m{Ui PI [0, \)) < m(Ui n 1)) (in other words, all subsequent Ui must be such that the right half of Ui is not discarded by the definition ofU). Proof of Property 3(a) Property 1 says that the only way not to double is to straddle ^. Therefore, at a minimum, it must be the case that ^ is eventually included every time or else the interval will double a sufficient number of times to include D which would be a contradiction. Proof of Property 3(b) If m{Ui n [0, | ) ) > m(Ui n [£, 1)), then we have Ui = {\ - e,\ + e') where e > e'. Now Ui+1 = TD{Ui) = TD(\ - e, \) = (1 - 2e, 1). But, since D is not in C/j+i, \ cannot be in t/i+i and Property 3(a) fails, resulting in a contradiction. By Property 3, we will eventually be in a situation where U{ = ( \ — e', \ +e), e' < e, and Property 3 wil l hold for every subsequent interval. So then Ui+i =TD{\-e',\+e) =TD[\,\ + e) = (2Z> - 1 - 2e, 2D - 1] by Property 3(b). But again by Property 3, Ul+2 = TD{2D - 1 - 2e,2D - 1] = TD[\,2D - 1] = [2 - 2D,2D - 1]. It is now clear that ^ is at the midpoint of Ui+2 and that we must now pick the left half of the interval which contradicts Property 3(b). Therefore, D wi l l eventually be included in an interval and the sequence will expand to fill all of X. • We can now prove that the S-R-T division process is weak-mixing, and there- fore Bernoulli, by two theorems of Bowen [1]. Theorem 2 (of Bowen [1]). Let T be a piece-wise C2 map of [0,1], [i be a smooth T-invariant probability measure, and A = infn<x<i |/'0c)| > 1- U dynamical system (T, /j,) is weak-mixing, then the natural extension of (T, p) is Bernoulli. 18 We ment ion here that the natural extensions of (T, n) is the associated au- tomorphic t ransformat ion that we al luded to at the beginning of this chapter. See Petersen [18, p. 13] for an exact definit ion. Theorem 3 (of Bowen [1]). With T and / i as in Theorem 2, (T,n) will be weak- mixing ifT is expanding. Theorem 4 (of Lasota and Yorke [10]). Let (X,B,m) be a probability space and let T : X —>• X be a piecewise C2 function such that inf \T'\ > 1. If P is the Perron-Frobenius operator associated with T, then for any f G L1, the sequence (n Sfc=o - f f c / )n^=i ? s convergent in norm to a function f* € L\. The limit function f* has the property that Pf* = f* and consequently, the measure d/Lt* = / * d m is invariant under T. Hav ing established that TD is expanding, we now use the above three theo- rems to prove the central result of this thesis. Theorem 5. TD is Bernoulli. Proof. F r o m the defini t ion of T o , we see that TD is C2 and that i n fo< x < i |TD'(:E)| = 2 > 1 since \TD'(X)\ = 2 for a l l x for which the derivative is defined. Since in fn< x <i \TD'(X)\ > 1, by Theorem 4 there exists at least one \x such that fi is a smooth To- invar iant probabi l i ty measure. B y Theorem 1, we see that Theorem 3 holds. Hence, (TD,U-) is weak-mix ing and, by Theorem 2 ( T o , ^ ) is Bernou l l i . • 2.2 Entropy of TD K n o w i n g that a l l T o are Bernoul l i is a very useful property because we can use en- tropy as a complete invariant to show isomorphism amongst the two-sided Bernou l l i shifts associated w i th T o that have the same entropy. Th i s comes f rom the contr i - bu t ion of Ornste in to the Kolmogorov-Ornste in Theorem. 19 Theorem 6 (of Kolmogorov [8, 9] and Ornstein [16]). Two Bernoulli shifts are isomorphic if and only if they have the same entropy. The purpose of this section is to calculate the entropy of T p . We begin w i th a mul t i -part def ini t ion of entropy along w i th some suppor t ing definit ions that follow the development presented by Walters [24, pp. 75-87]. Definition 8 (Partition). A partition of (X,B,m) is a disjoint col lect ion of ele- ments of B whose union is X. Definition 9 (Join). Let V and Q be finite part i t ions of (X, B, m). T h e n ? V Q = {PnQ : P G V, and Q 6 Q} is called the join of V and Q. Note that V V Q is also a finite par t i t ion of (X, B, m). Definition 10 (Entropy of a partition). Let (X,B,m) be a probabi l i ty space and let V = {Pi,...,Pk} be a finite par t i t ion of (X,B,m). T h e entropy of the partition is defined as Definition 11 (Entropy of a transformation with respect to a partition). Suppose T : X —> X is a measure-preserving transformat ion of the probabi l i ty space (X, B, m). If V is a finite par t i t ion of (X, B, m), then is cal led the entropy ofT with respect to partition V. Definition 12 (Entropy of a transformation). Let T : X —> X be a measure- preserving t ransformat ion of the probabi l i ty space (X, B,m) and suppose h(T) = sup / i (T , V), where the supremum is taken over a l l finite part i t ions V of (X,B,m). T h e n h(T) is cal led the entropy ofT. k H(V) = -^miPi) log m(Pi). 20 T h e fol lowing definit ions and theorems involv ing C-maps and P C - m a p s are taken f rom a paper of Ledrappier [12] and have been streaml ined for our argument. Definition 13 (of Ledrappier [12], C-map). A real funct ion / defined on an interval [a, b] is said to be a C-map i f / is continuously differentiable and its derivative / ' has the fol lowing properties: (a) / ' satisfies a Holder condit ion^ of order e > 0. (b) There are only a finite number of points x G [a, 6] where f'(x) = 0. We denote them by a < a\ < a2 •.. < an < b w i th / ' ( a ; ) = 0 for 0 < i < n. (c) There exist posit ive numbers k~ (kf) such that a left (right) neighborhood of a; . Definition 14 (of Ledrappier [12], PC-map). A map / : [0,1) —> [0,1) is called a PC-map if there exists a finite par t i t ion 0 < b\ < b2 .. • < bm < 1 such that / is a C-map f rom [bj,bj+i] into [0,1), for any j. Theorem 7 (of Ledrappier [12]). Let f be a PC-map. If \i is an a.c.i.m. (abso- lutely continuous invariant measure), then Rohlin's formula [20] is true: Hf) = j log|/'|d/i. Theorem 8. The entropy h{TD), ofTD for D e [̂ ,1) is equal to / l o g | T o ' | d/z = log 2. Proof. We begin by showing that T o is a P C - m a p . B y the def ini t ion of a P C - m a p , T p is a P C - m a p i f each of the three functions Tu\^Q ij ,T!D|[1 Dy and To\yD ^ is a C-map. Tr iv ia l ly , each Try restr icted to any of the three domains [0, | ) , or [D,l) satisfies a Holder condi t ion of order e = 1 because each piece of To is just a §A function f(x) defined on an interval [a, b] satisfies a Holder condition of order e E K + if there exists c 6 R+ such that for any two points pi,P2 G [a,b], \f{pi) — f(p2)\ < c\pi - P2\e• 21 log 1 ,k |x—a| - (+) is bounded in l ine of slope two. Thus condi t ion (a) of Def in i t ion 13 is satisfied. Cond i t i on (b) is satisfied because there are no points for which the derivative is equal to zero w i th in a given l ine segment. Therefore, condi t ion (c) is t r iv ia l ly satisfied. Thus each of the three segments of TD are C-maps and by Def in i t ion 14, T o is a P C - m a p . Now, since each TD is Bernou l l i , there exists a unique a.c. i .m., cal l it /z, for each To- B y Theorem 7, we can use Roh l in 's formula to calculate the entropy: W i t h the proof of Theorem 8 we have established isomorphism amongst the automorphic transformations (or natural extension) associated w i th simple S - R - T d iv is ion transformations by an appl icat ion of the Ko lmogorov-Ornste in Theorem. T h e key to obta in ing this result was being able to show that TD has Bowen's ex- panding property. In Chapter 3, we extend the results of this chapter to a more general type of S - R - T div is ion. • 22 Chapter 3 Extensions to M u l t i - D i v i s o r S - R - T D i v i s i o n 3.1 Multi-Divisor S-R-T Division A common opt imizat ion to the S - R - T div is ion a lgor i thm is the inc lus ion of addi t ional divisors to increase the shift average. In this section, we prove that a l l such d iv is ion algor i thms w i th reasonable assumptions on the separation of the divisor mult iples have the expanding property. It w i l l be useful to define precisely a class of mul t i - div isor S - R - T d iv is ion transformations. Definition 15. Let a G W1 be such that (a) 0 < ai < a2 < • • • < an, and (b) For a l l x,D e [|, 1), there exists i G { 1 , . . . , n) such that \c<iD — x\<\. We define 2 l „ to be the set of a l l a G Rn, satisfying condit ions (a) and (b). A lso , Definition 16 (Peaks and Valleys). G iven an a G 2 l N > 2 , the point of intersection between two lines f(x) = 2(x - ctiD) and g(x) = 2(ai+iD - x) w i l l be cal led a peak 23 and is denoted by ipi = (^D(cti+i + ati), D(a{+i — a;)) . For convenience, we w i l l refer to the abscissa as ipf = ^D(oti+i + a j ) , and to the ordinate as ipf = D{a.i+\ — a j ) . T h e point of intersection of the two lines f(x) = 2{aiD — x) and g(x) = 2(x — otiD) is (cxiD,0) and w i l l be cal led a valley. Definition 1 7 . For a D e a n d Q 6 21, define the t ransformat ion Tr),a(x) : [0,1) —* [0,1). For a S 2 t i , we get the fami l iar t ransformat ion TDAX) = 2x \2(D-x)\ 0 < X < i < X < 1 . For a € 2 l 2 , 2x \2{aiD-x)\ \2{a2D-x)\ For a € 2 l n > 3 , 2x \2(aiD-x)\ \2(aiD-x)\ \2{anD-x)\ Q<x<\ i < x < ipf \ < x and ipf < x < 1. 0 < x < \ \ < x < ipf \<x and ipf < x < tpf+1 \ < x and ipn-i < x < l . Definition 1 8 . Define Wln = {TDi(X : D <E (± ,1 ] , a € 2t n } and define iW = UneN^n- We c a n * n e s e t °^ a n n-divisor S-R-T division transformations and we cal l iVd the set of multi-divisor S-R-T division transformations. Cond i t i on (b) i n Def in i t ion 15 guarantees that the d iv is ion a lgor i thm gener- ates a new quotient bi t every step. A l though the condi t ion makes intui t ive sense, it is not immediate ly obvious i f an a satisfies the condi t ion just by inspect ion. L e m m a 10 below provides an easier way to check. 24 Lemma 9. If a = ( a i ) , then condition (b) of Definition 15 is satisfied if and only if ai = 1. Proof. If a\ = 1, then maxjr>]Xe[i/2,i) \a\D — x\ < | . Now consider the cases when « i ^ 1 and e G M + . If ct\ = 1 + e, then when D = and x = \ , \OL\D — x\ = 1 — i = i ^ i . O n the other hand, i f ct\ = 1 — e , then when D = \ and a; = 1 — | , |ai£> - x | = 1 - | - (1 - e ) ± = i ^ \ . • Lemma 10. A n a G 2 l n satisfies condition (a) of Definition 15 also satisfies condition (b) if and only if for some i,j G { 1 , . . . ,n} (possibly i = j), either (i) a{ € ( 0 ,7?] and otj € [1,1 + a*], or ( M J a» € [^,1] a^c? € [ l , 3 « i ] . Proof (Sketch). L e m m a 9 has shown that a single component a of a w i th a = 1 is sufficient to ensure that the range of f(x) = 2 \aD — x\ is equal to [0,1) as x and D range over [5,1). It is easy, to see based on the proof of L e m m a 9 that if there does not exist i 6 { 1 , . . . ,n} such that ai = 1, then there must exist i,j 6 { 1 , . . . ,n} (i < j) where OJJ < 1 and a3 > 1. Let us assume that i is the largest value where CKJ < 1, and let us assume that j is the smallest value where ctj > 1 (therefore j = i + 1). We make this assumpt ion because no other scalars of D w i l l have an influence on whether or not condi t ion (b) is satisfied. Consider the case where c*j G (0, ^]. In this case where ai 6 (0, | ] , when D is close enough to 1, some of the l ine f(x) = 2(x — aiD) appears in the region (denoted R) where \ < x < 1, 0 < Ta(x) < 1. W h e n a por t ion of the l ine f(x) appears i n region R, we must put restrictions on aj in terms of ai so that the peak ?/>! is always in R. ipf is greatest when D = 1. We f ind the m a x i m u m allowable value of aj by sett ing D = 1 and solving ipf = 1 for aj: ipf = I => D(aj - ai) = 1 aj = a ; + 1. Therefore, i f a ; G (0, then G [1,1 + aj]. 25 In the case where a j € [̂ , 1], for large enough values of D, the l ine f(x) = 2(x — Dai) crosses the l ine x = 1 in the range [0,1). Because of this, we must loosen the restr ict ion that ctj G [1,1 + ctj]. It is straightforward to calculate that f(x) begins to cross the l ine x = 1 in the range [0,1) when D = We can ensure that as D becomes smaller, the peak ipi w i l l always be in region R by solv ing tpf = 1 for ctj when D = tp\ = 1 D(aj - on) = 1 => 7f—((Xj - on) = 1 =4> ctj = 3at. Therefore, i f ai € [̂ , 1], then ctj £ [1, 3aj]. • Definition 19 (Separation). For a € 2 l n , we define the separation i n a as separation(a) = max 1 + 1 . i€{l , . . . ,n-l} ttj L im i t i ng the separat ion is a convenient way to restrict the subset of 21 being consid- ered. If separation(a) = r, we say that "the divisor mult ip les i n a are separated by at most a factor of r." F igure 3.1 shows an example of mul t i -d iv isor S - R - T d iv is ion. Th i s example is performing the same calculat ion as in figure 1.2, but it has computed the d iv idend w i th twice as many digits of precision w i th the same effective number of uses of the adders. We say "effective" because in mul t i -d iv isor S - R - T d iv is ion, there are several adders work ing in paral le l . In a real implementat ion of mul t i -d iv isor S - R - T d iv is ion, the values for a must be careful ly chosen so that not too much overhead is required to select a good par t ia l remainder. There is also a tradeoff between the amount of overhead in choosing a good par t ia l remainder and the precision to w i th which a good par t ia l remainder is selected. 26 Po = 0.67 0.67 Pi = 2(0.67 - a2D) -0.16 Qo = Qo = 1 P4 = 2(2 2(-0.16) + anD) = -0.155 Q3 = -ax Qz = 0.90625 P7 = 2(2 2(-0.155) + axD) = -0.115 <76 = Q6 = 0.89453125 Pll = 2(23(-0.115) + a3D) = 0.035 9io = - « 3 Qw = 0.8933105469 Pl6 = 2(2 4(0.035) - axD) = -0.005 Qis = Ql5 = 0.8933334351 P24 = 2(27 (0.005) + axD) = -0.155 923 = Q2S = 0.8933333456 Figure 3.1: An example of S-R-T division where three multiples of the divisor are used. In this example the dividend po — 0.67, and the divisor D = 0.75 with divisor multiples a = (0.75,1,1.25). The quotient Q* is 0.893. 3.2 Proof of Bernoulliness In this section, we will show that all multi-divisor S-R-T division transformations are Bernoulli, given a necessary restriction on the multiples of the divisor. As in the case for a single divisor, it will be useful to define a sequence of intervals that are subsets of the sequence of sets that would arise from repeatedly applying TD,Q. to an initial open interval. Unless otherwise noted, assume that the function m denotes the Lebesgue measure. Definition 20. Given an initial open interval U C [0,1) and To^a G 9DT, we define 27 the inf inite sequence of intervals U = {Ui}^ as Ui = U and TD,a(Ui) TD,a(Ui n [0, i ) ) C / ° C [ 0 , i ) o r C 7 ? C [ i , l ) t / ° £ [0, and t/? g [ i , 1) and T ^ a ^ i n [ i , i ) ) m ( f / < h [ 0 , i ) ) >m{Uif\{\,\)) U°%%\) and C7? g [ ± , l ) and m ( f 7 i n [ 0 , i ) ) < m ( f / i n [ i ) l ) ) . Definition 21 (Critical Points). For a given Tu,a where a € 2 l n , define the set where 5 = {& : i < & < 1 and b G {a i -D, • • •, a n -D} U Of, • • •, V>n-i}} a n d c i < C2 < . . . < 1. C is cal led the set of critical points for T o ) a . Lemma 11 (Doubling). GivenTr),a £ ^ ^ e sequence of intervals U be defined as in Definition 20 and let Ui be some interval in the sequence. Furthermore, let C = { c i , . . . , c m } be the set of critical points for T D , « . If Ui C [ C J , C J + I ] for some j G {1,... , r a - 1}, then m(Ui+i) = 2m(Ui). Proof. Since Ui C [CJ,CJ+\] for some j £ { l , . . . , m - l } , because we are in the first case of the def ini t ion of U, either U? C [0, \) or U° C [ i , 1). B y simple inspect ion of the ind iv idua l cases that define Trji(X, it is apparent that a l l of Ui, except possibly the points Cj and Cj+i, fa l l w i th in the same case of Tr),a- Therefore, the result ing interval Ui+i w i l l be double the length of Ui. • Definition 22 (Active Valleys). G i ven TDt(X € 93Tn, define V = {ctiD : i 6 { 1 , . . . , n) and \ < atD < 1} . V is cal led the set of active valleys for To,a- C = {Ci : i 6 {1, • • •, m} , a E B U {0, ±, 1}} 28 Definition 23 (Active Peaks). Given Tn,a G Tln, define P = {if>f : t e { l , . . . , n - l } a n d i < ^ f < l } . P is called the set of active peaks for To,a- Lemma 12 (Non-shrinking). Given TD,OC G with separation(a) < | , let the sequence of intervals {Ui}i^ be defined as above and let V denote the set of active valleys for T r j , a . For any interval Ui G U such that V f l Ui = 0, either m(Ui+i) > m(Ui) or m(Ui+2) > m(Ui). Proof, separation(a) < | implies that cti+\ < For a given separation, the value of ip\ is maximized when ipf = 1. This implies that aj = We calculate the value of ipf with the assumption that ipf = 1 to get a bound on ipf for D < 1: iPf < D(lai - ai) = D(lai) = £ > ( § & ) = 5 • Case 1: Consider when Ui C [0, \ ] . In this case, m(Ui+\) — 2m(Ui). Case 2: Consider when Ui C [^,1). The interval Ui can span at most one peak. Therefore, m(£/j+i) > m(Ui). A further observation is that since Ui+i C [0, m{Ui+2) = 2m{Ui). Case 3: Consider when Ui £ [0, \] and Ui % 1). In this case, Ui straddles \ . From the definition of U, we see that in the worst case we might throw away up to half of Ui. Cal l the part not thrown away Ui and observe that m(Ui) > \m(Ui). Now, either Ui' C [0, \) or Ui' C [i, 1). If [// C [0, ±], then m(C/i+i) = 2m(<7;') > m{Ui). If t / i ' C [ i , 1), then m(C/i + 2 ) = 2m(£//) > m(c/;). • Lemma 13. A multi-divisor S-R-T division transformation Tr>,a G 9JI is expanding when separation(a) < | . Proof. Let V be the set of active valleys (as defined in Definition 22) for a Trj,a- Let P be the set of active peaks (as defined in Definition 23) for a To,a- Let U = {Ui\i^n be the sequence of intervals associated with a To,a and an initial interval U. 29 B y way of contradict ion, assume that a T£> > a is not expanding. Th i s means that for some T r j , a , there does not exist an interval Ui where any of the points in V are contained in Ui. Th i s is true because i f any of the valley points are in Ui, then Ui+i = [0, e) or C / j + 1 = [0,e], and after a finite number of steps, Ui w i l l have grown to include a l l of [0,1). If there is a sequence U that avoids a l l points i n V, then by L e m m a 12 it must be true that the intervals in the sequence can only double a finite number of t imes. Let i e N b e the first index for which there is no j > i such that m(Uj) > 2m(Ui). It now follows that Ui straddles ^. The proof for L e m m a 12 reveals that this is the only s i tuat ion where it is not necessarily the case that either m{Ui+\) = 2m{Ui) or rn(Ui+2) = 2m(Ui). In fact, Ui must straddle bo th \ and m i n P . If m i n P is not straddled and m(Ui f l [0, ^)) < m(Ui f l [|,1)), then either m(Ui+2) > 2m([/j) or m(Ui+z) > 2m(Ui). In the other possibi l i ty where m i n P is not straddled and m(Ui n [0, ^)) > m{Ui n [1,1)), we find that m(Ui+2) > 2m{Ui). Assuming that Ui straddles both ^ and m i n P , we also observe that there can be no j > i such that m(Uj D [0, 5)) > m(Uj f l [5,1)) because this quickly leads to doubl ing. In other words, the right side must be larger than the left side whenever we straddle | . Therefore, we must be i n the s i tuat ion where t/i = ( i - e ' , i + e), e'<£ Ui+i = (min{2(i - aiD),2(ai+lD- U l + 2 = (2min{2(i - aiD),2(ai+1D _(i+ e))}, 2^?) ui+3 = (min{2(i - aiD),2{ai+1D- 2vf)},^r) Ui+4 = (2min{2(i -aiD),2{ai+1D -2ltf)},2#) => U i + 5 = (min{2(i - aiD),2{ai+1D- 2^)},^) = l7 i+3 It is apparent that the interval represented by [/j+4 wi l l re-occur every other interval ad infinitum. We now use this interval to show that in fact such a sequence of non-expanding intervals is not possible. 30 Since straddles \ , we can compare the length of the left and right sides of J7j+4. Let R = [^,2i/>f) denote the right side and let L = (4(^ — o^-D), \ ) and V = (4(cti+iD — 2ipy), ^) denote the two possibi l i t ies for the left side. T h e length of the right side is m(R) = 2Vf - i , while the length of the left side is the larger of two possible lengths m(L) = \ - 4( i - aiD) and m(L') = i - 4 ( a i + i £ > - 2 ^ ) . We then compare the differences between the right side and each of the two possible left sides. The first possibi l i ty is m(R) - m(L) = 2$ - I - (I - 4(i - atD)) = 2D(ai+l - a i ) - 1 + 2- 4a{D = 2ai+1D - GctiD + 1 , while the second possibi l i ty is m(R) - m(L') = 2^f - ± - (± - 4 ( a l + 1 - 2$)) = 2D{al+x - ai) - 1 + 4{al+1D - 2D{al+1 - at)) = -2ai+iD + 6aiD - 1. It is now clear that m(R) - m(L) = - (m{R) - m(L')) . B u t this means that the length of the left side is always greater than or equal to the length of the right side, which contradicts our assumpt ion that the r ight side must be bigger than the left side whenever the interval straddles ^. • 31 Theorem 14. To,a G %Jl is Bernoulli when separation(cx) < | . Proo/ . Let T = T o i a . F rom the definit ion of T , we see that T o > a is C 2 and that info< x <i |T'(a;)| = 2 > 1 since |T ' (x ) | = 2 for a l l x for which the derivative is defined. Since info< x <i |T"(a;)| > 1, by Theorem 4, there exists at least one /z such that zz is a smooth T- invar iant probabi l i ty measure. B y L e m m a 13 we see that Theorem 3 holds when separation(a) < §. Hence, (T, /z) is weak-mix ing and by Theorem 2, (T,/z) is Bernou l l i when separation(a) < |. • 3.3 Some Restrictions on ex. In section 3.2, we showed that if a l l T D ) Q G 9Jt, if separation(a) < §, then TD>OC is Bernou l l i . In this section, we construct examples of T G %Jln, for every n, that fai l to be Bernou l l i when the restr ict ion that separation(a) < § is relaxed. Theorem 15. For TD,O. G ffin>4, if separation(a) > §, then for each D G [|,1), i/iere ea;zsr; uncountably many a. for which Tut0l is not ergodic. Proof. We begin this proof by considering T G 9J l n =4- Assume that we relax the restrictions on a by e > 0. T h i s means that separation(a) < | + e and that no peak can be above the l ine f(x) = W i t h this re laxat ion, we can define a = (c*i, a2,0:3,0:4) w i th respect to a given D so that a subset of [0,1) is non-expanding. We let ai = 8 0 g + 4 8 D £ > a 2 = 8or>+48£>e' 0 : 3 = 40C+2 9 4 £ Pe 1 a n ( ^ a 4 = 4 0 P + 2 4 P e • ^ o r o u r constructed a to be va l id , we must be careful that condit ions (a) and (b) of Def in i t ion 15 hold. Cond i t i on (a) requires that the components of a remain in ascending order. Th i s is satisfied when e G (0, ^ ] . Since ordering is mainta ined, separation(a) < 3, and minrj g[ i/2,i),ee(o,2/i5] a 4 = 1-2 > 1, to verify that condi t ion (b) of Def in i t ion 15 holds, it is sufficient to show (by L e m m a 10) that for a l l values of D and e, either ot\, a2, or a 3 G [5,1]- B y max im iz ing and min im iz ing over e and D, we find that a\ G [0.375,0.7] and a2 G [0.625,1.3]. F igure 32 3.2 provides a v isual proof that as e is varied over [0, ^ ] and D is varied over [5,1], Hav ing verif ied that our defined a satisfies Def in i t ion 15, we calculate that peak ^ = ( i ± i f , l{j±if§) and peak </>3 = ( f ^ , ftl||). W i t h this def ini t ion for whi le remain ing above the line g(x) — ^, and the point ip1 w i l l always be sl ightly chosen so that we are i n a s i tuat ion where 1 — ip% = ^ 3 ~~ \ — 2(V'i ~~ \) = 2{ip\ — |)- Another impor tant feature in this construct ion is the interval between a2D and a^D. Since ip2 is not used in our construct ion, it is possible to insert an arb i t rary number of d iv isor mult ip les between a2D and a 3 D . Thus , the results i n this proof apply to T G 9Jt„ for arb i t rar i ly large n. F igure 3.3 i l lustrates the type of t ransformat ion that we have constructed. We are now in a posi t ion to show that there exists a set of points A w i th 0 < m(A) < 1, for which TD,OL{A) = A. Th i s is the def ini t ion of a t ransformat ion being non-ergodic [11, p. 59]. Define A = A\ U A2 U A 3 where A\ = [\ - (ipf -5). i + w - A* = [h- m -M+m - h)iand 4. = [i - 2(1 - 1]. It can be shown by calculat ion that T o ] a ( A i ) = A2, T^A^) = A\ U A 3 , and TD,OL{M) = A2. Therefore, TD,CI(A) = A, and by def ini t ion, To,a is non-ergodic or it is never the case that bo th a\ < \ and a2 > 1. Therefore, it is always the case that either a\ or a2 G [5, !]• below f(x) whi le remaining above the l ine g(x) = \ . A l l of the definit ions have been non-expanding. • 33 Figure 3.2: Combined plot of the regions where a\(e,D) < \ and ct2{e,D) > 1. Over the domain e 6 [0, ^ ] and D € [\, 1], it is never true that both ct\ < \ and «2 > 1- X Figure 3.3: A n example of a non-ergodic system for Trja € 9Jt„>4- In this ex- ample n = 4, D = a = ( | | , 1, | | ) , and separation(a) = | + j^-. The thick lines represent T r j ] a . The coarse dashed line represents the necessary separation restriction on a to guarantee that T D ) Q is ergodic. In this case, partial remainders in the set A = §] u tl> I] u ( i t . !) a r e mapped back to A by T D , a . This means that T D ) Q . is not ergodic, and therefore not Bernoulli. 34 Theorem 16. For To, a G 9Jt 3 ) if separation(a) > | , then for each D £ [5,1), there exists an a for which To i a is not ergodic. Proof. T h e proof for this theorem comes as a special case f rom the proof for The- orem 15. Consider a = (oti, 0:2, 0^3,0^4) as defined in the proof for 15. W h e n separation{a) = | = | + we are i n the special s i tuat ion where a 2 = 0:3. Since a l l of the results for the proof of Theorem 15 st i l l ho ld , we now have an example t ransformat ion T w i th only three unique mult ip les of D and this T has been proven to be non-ergodic. F igure 3.4 gives an example of a non-ergodic t ransformat ion for D = l2. • 0 0.25 0.5 0.75 1 X F i g u r e 3.4: A n example of a non-ergodic system for T o , a £ 9^3- In this example, D = a n d « = ( § > f> i f ) - T n e t m c ^ u n e s r e P r e s e n t I'D,a- The coarse dashed line represents the necessary separat ion restr ict ion on a to guarantee that To,a is ergodic. In this case, par t ia l remainders i n the set A = [±, ^ ] U [ ^ , i § M y § , 1) are mapped back to A by TD,a. Th is means that Trj,a is not ergodic, and therefore not Bernou l l i . 35 Theorem 17. For Trj,a € 9#2, if separation(a) > 3, then for some D G (^,1), there exist uncountably many a for which TD>(X is not ergodic. Proof. Assume that separation(a) < 3 + e and D G {\, ^p ) . F i rs t , we choose ai = jjj so that a i D = | and a2 = 1 + OL\. O u r restr ict ion on D in terms of e has been chosen so that a2/a\ < 3 + e when 0:2 = 1 + « i - Since 0 2 > condi t ion (a) of Def in i t ion 15 is satisfied. Since a\ G (\, 5), and a2 G (1,1 + a\], by L e m m a 10, condi t ion (b) of Def in i t ion 15 is satisfied. Thus , our defined a is always val id. Define A = [±, D]. We now apply T = TD>a to A: T [ i , D] = [min{2 ( i - aiD),2(a2D - £>)},^] =[m in {2 (± - 4%), 2(D + \ - D)},D(a2 - ai)] = [ m i n { I , i } , D ( l + 4 ^ - 4 ^ ) ] Now, since \ < D < 1, 0 < m ( A ) < 1 and TDt(xA = A, by Def in i t ion T D ] a is not ergodic. • 36 X F i g u r e 3.5: A n example of a non-ergodic system for Trj,a S 93?2. In this example, D = | and a = ( ^ , ^ ) . The thick lines represent Tr>,a- T h e coarse dashed l ine represents the necessary separat ion restr ict ion on a to guarantee that T a is ergodic. In this case, par t ia l remainders w i th in the interval [|, |] map back to |] and which means T o , a is not ergodic, and therefore not Bernou l l i . 3.4 Entropy of Multi-Divisor S-R-T Division The calculat ion for entropy in mul t i -d iv isor S - R - T d iv is ion follows the same method used for single div isor S - R - T div is ion. We begin by showing that To,a is a P C - m a p . Lemma 18. TrjtOC eiVl is a PC-map (as defined in Definition 14)- Proof. B y inspect ion, each T o , a is a finite col lect ion of l ine segments each w i th slope 2. E a c h of these l ine segments is a C-map by the same argument used in the proof for Theorem 8. Therefore, by def ini t ion, each To,a is a P C - m a p . • Theorem 19. The entropy of any TD>(X S 9^ with separation(a) < | is log 2. 37 Proof. B y L e m m a 18, a l l To,a G 9Jt are P C - m a p s . B y Theorem 14, TD,<X is Bernou l l i when separation(ot) < | and hence there exists a unique a.c. i .m. /J. Theorem 7 says that Roh l in ' s formula for the entropy is true and therefore: h{TD,a) = J log | T b , a ' | d i i = log 2 ^ dfi = log 2. • 38 Chapter 4 Future W o r k The or ig inal question that inspired this thesis was "Is simple S - R - T d iv is ion ergodic for a l l real d iv isors?" In pursuing the answer to this problem, we discovered that not only is simple S - R - T div is ion ergodic for a l l divisors, but it is also Bernou l l i . Hav ing establ ished a Bernou l l i property, and having calculated the entropy for our transformations, we were able to use the Kolmogorov-Ornste in theorem to conclude that our transformations are isomorphic to each other. In proving these important propert ies for simple S - R - T div is ion, we made extensive use of more general results f rom dynamica l systems theory. Consequently, our results were shown to be easily extensible to more general d iv is ion systems. In general, it is diff icult to prove that a part icular class of transformations are ergodic or Bernou l l i . O u r results have provided an effective means of proving bo th of these propert ies for a large class of S -R-T- l i ke d iv is ion algori thms. Prom the standpoint of understanding an algor i thm's expected performance, it is necessary to know that when a stat ionary d is t r ibut ion is found, it is unique. Hav ing established the uniqueness of stat ionary distr ibut ions, the next step is to f ind the actual stat ionary d is t r ibut ion for as wide a class of transformations as possible. In section 1.3, we verif ied a known expression for the stat ionary d is t r ibut ion funct ion for TD where D e [|, 1). In addi t ion, many of the stat ionary d is t r ibut ion functions 39 have been classified by Shively and Fre iman for D G [§, §], a l though the derivations are not nearly as simple as for D G [§, 1). It turns out that things become very compl icated when D G [̂ , §]. In his thesis [22], Shively shows many interesting properties for the stat ionary d is t r ibut ion functions in this region. For example, he shows that there are many different intervals of D where there are an inf inite number of different stat ionary d is t r ibut ion equations. A s such, the graph of the shift average for D G [5, |] is far f rom complete and appears to have a°complex pat tern (from the few points that have been plot ted in this region). Th i s is surpr is ing considering the s impl ic i ty of the under ly ing transformation. A better understanding of this final region of s imple S - R - T d iv is ion would be an interesting goal to pursue. In the work of Fre iman [6], it was first shown that the shift average for D G [|, |] is constant ly 3, which can be easily shown to be the m a x i m u m possible shift average. Th i s property was then used by Metze [15] to obta in a version of S - R - T d iv is ion that has an expected shift average of 3 for a l l divisors. Another area to pursue would be to explore shift averages for mul t i -d iv isor S - R - T d iv is ion and, i f other plateaus are found, they could possibly be used to obta in higher expected shift averages for a l l possible divisors. Undoubtedly, obta in ing a complete understanding of the stat ionary d is t r ibut ion functions for mul t i -d iv isor d iv is ion would be even more diff icult than it is for simple S - R - T div is ion. It is possible that such results in this area could lead to improvements in modern S - R - T d iv is ion. Related to this, it would be interesting to attempt to extend the results of this thesis to modern S - R - T d iv is ion. 40 Bibl iography [1] Rufus Bowen. Bernou l l i maps of the interval. Israel J. Math., 28(1-2):161-168, 1977. [2] A b r a h a m Boyarsky and Pawel Gora . Laws of chaos. B i rkhauser Boston Inc., Bos ton , M A , 1997. Invariant measures and dynamica l systems in one dimension. [3] Joseph F. Cavanagh. Digital Computer Arithmetic: Design and Implementa- tion. M c G r a w - H i l l , New York , N Y , 1984. [4] J . Cocke and D. W . Sweeney. H igh speed ar i thmet ic i n a paral le l device. Technical report, I B M , February 1957. [5] M i los D. Ercegovac and Tomas Lang . Division and square root: digit-recurrence algorithms and implementations. K luwer Academic Publ ishers Group , Norwel l , M A , U S A , and Dordrecht, The Nether lands, 1994. [6] C . V . Fre iman. Stat is t ical analysis of certain b inary d iv is ion algori thms. Proc. IRE, 49:91-103, 1961. [7] D. Harr is , S. Oberman, and M . Horowitz. S R T d iv is ion architectures and implementat ions. In 13th IEEE Symposium on Computer Arithmetic: proceed- ings, July 6-9, 1997, Asilomar, California, USA, volume 13 of Symposium on Computer Arithmetic, pages 18-25. I E E E Computer Society Press, 1997. [8] A . N . Kolmogorov. A new invariant for transit ive dynamica l systems. Dokl. Akad. Nauk SSSR, 119:861-864, 1958. 41 [9] A . N . Kolmogorov. Ent ropy per unit t ime as a metr ic invariant of automor- phisms. Dokl. Akad. Nauk SSSR, 124:754-755, 1959. [10] A . Lasota and James A . Yorke. O n the existence of invariant measures for piecewise monotonic transformations. Trans. Amer. Math. Soc, 186:481-488 (1974), 1973. [11] Andrze j Lasota and Michae l C . Mackey. Chaos, fractals, and noise. Springer- Ver lag, New York , second edi t ion, 1994. Stochastic aspects of dynamics. [12] Frangois Ledrappier . Some properties of absolutely continuous invariant mea- sures on an interval. Ergodic Theory Dynamical Systems, l ( l ) : 77 -93 , 1981. [13] T i e n Y i e n L i and James A . Yorke. Ergod ic transformations f rom an interval into itself. Trans. Amer. Math. Soc., 235:183-192, 1978. [14] O. L. MacSor ley. High-speed ar i thmet ic in b inary computers. Proc. IRE, 49:67- 91, 1961. [15] G e m o t Metze. A class of b inary divisions y ie ld ing m in ima l l y represented quo- t ients. IRE Trans, on Electronic Computers, E C - l l : 7 6 1 - 7 6 4 , Dec. 1961. [16] Dona ld S. Ornste in . Bernou l l i shifts w i th the same entropy are isomorphic. Advances in Math.,.4:337-352, 1970. [17] Dona ld S. Ornste in . Ergodic theory, randomness, and dynamical systems. Yale Univers i ty Press, New Haven, Conn . , 1974. James K . Wh i t temore Lectures in Mathemat ics given at Yale Universi ty, Yale Mathemat ica l Monographs, No . 5. [18] K a r l Petersen. Ergodic theory. Cambr idge Univers i ty Press, Cambr idge, 1983. [19] J . E . Rober tson. A new class of d ig i ta l d iv is ion methods. IRE Trans. Electronic Computers, EC-7 :218-222 , Sept. 1958. 42 [20] V . A . Roh l i n . Exac t endomorphisms of a Lesbesgue space. Amer. Math. Soc. Transl., 39, 1964. [21] P a u l Shields. The theory of Bernoulli shifts. The Univers i ty of Chicago Press, Chicago, I l l . -London, 1973. Chicago Lectures in Mathemat ics . [22] Rober t Shively. Stationary distribution of partial remainders in S-R- T digital division. P h D thesis, Univers i ty of I l l inois, 1963. [23] K . D. Tocher. Techniques of mul t ip l icat ion and d iv is ion for automat ic b inary computers. Quart. J. Mech. Appl. Math., 11:364-384, Ju ly -Sep tember 1958. [24] Peter Walters. An introduction to ergodic theory. Spr inger-Ver lag, New York , 1982. 4 3

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
China 21 22
United States 4 4
Japan 2 0
City Views Downloads
Beijing 19 0
Unknown 3 2
Shenzhen 2 22
Tokyo 2 0
Ashburn 1 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}

Share

Share to:

Comment

Related Items