UBC Theses and Dissertations

Exponentially fast convergence to flat triangles in the iterated barycentric subdivision Klöckner, Matthias 2018

Full Text

Exponentially fast convergence to flat triangles in the iterated barycentric subdivision

by

MATTHIAS KLÖCKNER

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

August 2018

© Matthias Klöckner, 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, a thesis entitled:

Exponentially fast convergence to flat triangles in the iterated barycentric subdivision

submitted by Matthias Klöckner in partial fulfillment of the requirements for the degree of Master of Science in Mathematics.

Examining Committee:

Gordon Slade, Professor of Mathematics (Supervisor)
William Casselman, Professor of Mathematics (Supervisory Committee)

Abstract

The barycentric subdivision dissects a triangle along its three medians into six child triangles. The children of a flat triangle (i.e. one whose vertices are collinear) are flat. For any triangle ∆ let the shape S(∆) be the unique complex point z in the first quadrant such that ∆ is similar to the triangle −1, 1, z in which the edge between −1 and 1 has maximal length. Only the shapes of flat triangles lie on the real line. If ∆(n) is a Markov chain of triangles with ∆(n) chosen uniformly amongst the children of ∆(n−1), then we call the Markov chain S(∆(n)) a shape chain, and we call it (non-)flat if ∆(0), and therefore each ∆(n), is (non-)flat. Let Zn be a non-flat and Xn a flat shape chain. We say that a sequence of random variables Wn taking values in C \ {0} decays exactly or at least with rate χ if χ > 0 and almost surely lim_n (1/n) ln|Wn| = −χ or lim sup_n (1/n) ln|Wn| ≤ −χ, respectively. In a paper from 2011, P. Diaconis and L. Miclo show that ℑZn decays at least with rate χ′ for some universal constant χ′ and that Xn has an invariant measure µ.
We prove that ℑZn decays exactly with rate χ for a universal constant χ which we express as an integral w.r.t. µ. The above paper also shows the convergence of Zn − Xn to 0 in probability for a specific coupling (Xn, Zn). For this coupling we prove that Zn − Xn decays exactly with rate χ.

Lay Summary

In any triangle ∆ the three lines between each vertex and the midpoint of the opposite edge intersect in one point, the barycentre of ∆, and cut ∆ into six smaller triangles which we call the children of ∆. We randomly choose one of the six children of ∆(1) = ∆ by throwing a die and call the new triangle ∆(2). Then we choose ∆(3) amongst the children of ∆(2) by throwing another die, and so on. To any triangle ∆ we assign a certain number flat(∆) which is 0 if ∆ is flat (i.e. the three vertices lie on the same line), and positive otherwise. The smaller flat(∆) is, the "flatter" ∆ looks. It has been known that the sequence flat(∆(1)), flat(∆(2)), ... converges to 0 exponentially fast. We prove the existence of, and a formula for, the exact exponential rate.

Preface

The dissertation is original, unpublished, independent work by the author, M. Klöckner.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Figures
List of symbols and abbreviations
Acknowledgements
Dedication
1 Introduction
1.1 Model
1.2 Main result
1.3 Relation to earlier work
1.4 New approach
2 Geometric preliminaries
2.1 New geometric setting
2.2 Möbius transformations
2.3 Explicit formula for the barycentric subdivision
2.4 Shape sets
2.5 Upcircle coordinates
2.6 Möbius transformations in upcircle coordinates
3 Other preliminaries
3.1 Up-to-sign derivatives of the child functions
3.2 Generalized shape functions
3.3 Definition of the generalized child functions
3.4 Generalized child functions in upcircle coordinates
4 Ergodicity of the flat shape chain
4.1 Contractivity on average
4.2 Law of large numbers
4.3 Shape integrand
4.4 Proof of the first theorem
4.5 Modified law of large numbers
5 Asymptotic exponential decay
5.1 Asymptotic exponential rates
5.2 Series of complex numbers
5.3 Sequences of upcircle coordinates
5.4 Comparison of both generalized shape functions
5.5 Equality of distributions
5.6 Proof of the second theorem
6 Triangles
6.1 Geometric setting
6.2 Explicit formula for the barycentric subdivision
6.3 Shape sets
7 Up-to-sign derivatives
7.1 Basic properties
7.2 Up-to-sign derivative of the shape function
7.3 Up-to-sign derivative of the child functions
8 Generalized child functions
8.1 Exceptional set
8.2 Construction of the generalized shape functions
8.3 Definition of the generalized child functions
8.4 Equality of distributions
9 Upcircles
9.1 Upcircle coordinates
9.2 Möbius transformations in upcircle coordinates
9.3 Generalized child functions in upcircle coordinates
9.4 Upcircles intersecting two shape sets
10 Ergodicity and the law of large numbers
10.1 First setting
10.2 Extended setting
11 Shape integrand
11.1 Definition and connection to contractivity on average
11.2 Extrema of the shape integrand
12 Conclusion and conjecture
12.1 Conclusion
12.2 Conjecture
References
A Asymptotic exponential rates

List of Figures

1 A triangle ∆z, its shape S(z) ∈ Σ and its children ∆(l,k)z. Their second index k is depicted in the same colour as the corresponding k-edge of z.
2 Comparison of A(1,1)(z) and a(1,1)(z). Corresponding edges in similar triangles have the same colour.
3 The shape sets Σj and elements zj ∈ Σj with z(0,0) = S(zj) = Sj(zj).
4 Upcircle U(x, r).
5 Upcircles U(x, r) and U(x′, r′) = S(1,1)(U(x, r)) with elements z and z′ = S(1,1)(z).
6 (Copy of Figure 1.) A triangle ∆z, its shape S(z) ∈ Σ and its children ∆(l,k)z. Their second index k is depicted in the same colour as the corresponding k-edge of z.
7 (Copy of Figure 2.) Comparison of A(1,1)(z) and a(1,1)(z). Corresponding edges in similar triangles have the same colour.
8 Triangles ∆z and ∆h2(z) with their similar children ∆(1,2)z and ∆(1,0)h2(z). Corresponding edges have the same colour. The outer edges of both children are blue and the outer vertices are magenta.
9 (Copy of Figure 3.) The shape sets Σj and elements zj ∈ Σj with z(0,0) = S(zj) = Sj(zj).
10 (Copy of Figure 4.) Upcircle U(x, r).
11 (Copy of Figure 5.) Upcircles U(x, r) and U(x′, r′) = S(1,1)(U(x, r)) with elements z and z′ = S(1,1)(z).
12 Upcircle U(x, r) intersecting Σ(1,1).

List of symbols and abbreviations

The abbreviation "a.s." stands for "almost surely". We use the following notations:

• ℜz and ℑz denote the real and imaginary part of a complex number z.
• H = {z ∈ C : ℑz > 0} and H0 = H ∪ R.
• H̄0 = H0 ∪ {∞} and R̄ = R ∪ {∞} are subsets of the extended complex plane (Riemann sphere).
• R ∪ {−∞, ∞} = [−∞, ∞] denotes the extended real number line, where we distinguish between −∞ and ∞. We define intervals in the usual way, e.g. (0, ∞] := (0, ∞) ∪ {∞}.
• Sequences are indexed by n ∈ N0 or n ∈ N and we simply write sn for a sequence (sn)n. Moreover, any limit (inferior, superior) is implicitly understood w.r.t. n → ∞.
• We omit the ◦ symbol for compositions of functions. For a function f : C → C we write fn for the n-fold composition f · · · f : C → C.
• δx denotes the Dirac measure at x.

Acknowledgements

I owe thanks to my supervisor Prof. Slade, who has been an excellent and very supportive mentor during my Master's program. His supervision fostered my passion for research by providing the perfect balance between freedom and guidance. My research and presentation skills benefited from the very helpful feedback during our meetings, particularly those in the company of Prof. Casselman, who kindly joined the supervisory committee. I am deeply grateful that UBC was a place where two great mathematicians with decades of experience repeatedly met with me and gave me invaluable advice. I also want to thank my mentor Prof. Ritter from my Bachelor's program, my friends in Kaiserslautern and Vancouver, as well as my family, who have always been a great support for me.

Dedication

For Grandma Aloisia.

1 Introduction

1.1 Model

Consider a triangle ∆ in the complex plane. If ∆ is degenerate (i.e. its three vertices are collinear), then following the terminology of [1] and [6] we call ∆ flat. We allow the special case of two vertices being identical, but we exclude the case of all three vertices being identical. The three medians of any triangle ∆ (which connect each vertex to the midpoint of the opposite edge) intersect in the barycentre and dissect ∆ into six smaller triangles which we call the children of ∆. This construction is called the barycentric subdivision. We can even make sense of it for flat triangles, whose children are also flat. Now we repeat this construction by applying the barycentric subdivision to each child of ∆, creating a total of 6² grandchildren of ∆. Repeating this procedure iteratively results in 6^n descendants of ∆ in the n-th generation for n ∈ N0.

We introduce a Markov chain ∆(n). Starting with ∆(0) we randomly choose one of its six children by throwing a die and we call the new triangle ∆(1). Now we randomly choose ∆(2) amongst the children of ∆(1) by throwing another die, and so on. In each step the die is thrown independently of all the previous dice.
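This random descent is straightforward to simulate. The sketch below is a minimal illustration of our own (it is not code from the thesis; the function names `children` and `area` and the starting triangle are arbitrary choices): it subdivides a triangle, given by three complex vertices, along its medians and runs the die-throwing chain. Each of the six children has exactly one sixth of the parent's area, so the area after n steps is deterministic; the randomness lies entirely in the shapes of the descendants.

```python
import random

def children(tri):
    """The three medians cut tri into six child triangles of equal area."""
    a, b, c = tri
    g = (a + b + c) / 3                                    # barycentre
    mab, mbc, mca = (a + b) / 2, (b + c) / 2, (c + a) / 2  # edge midpoints
    return [(a, mab, g), (mab, b, g), (b, mbc, g),
            (mbc, c, g), (c, mca, g), (mca, a, g)]

def area(tri):
    a, b, c = tri
    return abs(((b - a).conjugate() * (c - a)).imag) / 2

random.seed(0)
tri = (-1 + 0j, 1 + 0j, 0.3 + 1.1j)     # an arbitrary non-flat starting triangle
for n in range(10):
    tri = random.choice(children(tri))  # "throw a die"
print(area(tri) * 6**10)                # recovers the starting area 1.1 (up to rounding)
```

Whatever the dice show, the area shrinks by the factor 6 per generation; the interesting quantity is the flattening of the shape, which the next paragraphs quantify.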
In case of a flat starting triangle ∆(0) we obtain a Markov chain of flat triangles ∆(n).

For a flat triangle ∆ we set flat(∆) := 0. For any other triangle ∆ we define

flat(∆) := (shortest height in ∆)² / (area of ∆) = 2 · (shortest height in ∆) / (longest edge in ∆).   (1.1)

According to the formula for the area of a triangle, a shortest height is orthogonal to a longest edge. Here we use the indefinite article "a" because ∆ might have several shortest heights or several longest edges. If ∆ and ∆′ are similar, then flat(∆) = flat(∆′). The smaller flat(∆) is, the "flatter" ∆ looks. The value flat(∆) is maximal, namely √3, if and only if ∆ is equilateral.

For two similar triangles ∆, ∆′ there is a (not necessarily unique) one-to-one correspondence between the children of ∆ and the children of ∆′ such that corresponding children are similar. Hence, if we can characterize the "shape" of ∆ by some complex number S(∆) (i.e. two triangles ∆ and ∆′ are similar if and only if their S-values S(∆), S(∆′) are equal), then S(∆) = S(∆′) implies that the multiset of S-values of the children of ∆ equals the multiset of S-values of the children of ∆′. Hence S(∆) uniquely determines the multiset of S-values of the children of ∆. We obtain the following Markov chain, which we call a shape chain:

Zn := S(∆(n)), n ∈ N0.   (1.2)

Unless otherwise stated, we always suppose that z and z′ lie in the set H0 of complex numbers with non-negative imaginary part, so H0 is the union of the upper half plane H with the real line. We denote by ∆z the triangle with vertices −1, 1 and z and we say that ∆z is the associated triangle of z. We call the edge between −1 and 1 the 0-edge, the edge between −1 and z the 1-edge and the edge between 1 and z the 2-edge of ∆z, see Figure 1.

Figure 1: A triangle ∆z, its shape S(z) ∈ Σ and its children ∆(l,k)z.
Their second index k is depicted in the same colour as the corresponding k-edge of z.

From now on we usually think of a point z in terms of its associated triangle, so we speak for example of the 0-edge of z, and we call z, z′ similar if ∆z and ∆z′ are similar. Moreover, we set flat(z) := flat(∆z).

Let Σ be the intersection of the first quadrant and the closed disk around −1 of radius 2, see Figure 1. Equivalently, Σ is the set of all points z such that the 0-edge of z is a longest edge and the 2-edge of z is a shortest edge. For z ∈ Σ the length of a longest edge of z is 2 and the length of a shortest height is ℑz, hence

flat(z) = ℑz.   (1.3)

For any triangle ∆ there is a unique point z′ ∈ Σ such that ∆ and ∆z′ are similar. We call ∆z′ the standardized version of ∆ and z′ the shape of ∆. Both ∆ and ∆z′ have the flat-value ℑz′. Let us denote z′ by S(∆). So flat(∆) = ℑS(∆). We see that S(z) := S(∆z) defines a projection from H0 into Σ, where S(z) is the unique point in Σ that is similar to z, see Figure 1. We call this projection the shape function S. The restriction of S to R is a projection from R to Σ ∩ R = [0, 1]. Using (1.2) as a definition for Zn, we see that

ℑZn = flat(∆(n)).   (1.4)

In Section 1.2 we formulate our result that a.s. ℑZn = flat(∆(n)) converges to 0 "exponentially fast" and we give a formula for this exponential rate.

The barycentre of z is the arithmetic mean z/3 of its vertices −1, 1, z. We denote the six children of z by ∆(0,0)z, ∆(1,0)z, ∆(0,1)z, ∆(1,1)z, ∆(0,2)z, ∆(1,2)z in clockwise order, where ∆(0,0)z is the triangle with vertices 0, 1, z/3, see Figure 1. We index the six children ∆jz by

j = (l, k) ∈ J := {0, 1} × {0, 1, 2}.   (1.5)

Unless otherwise stated, we always view the indices l, l′ as elements of the cyclic group Z2 and the indices k, k′ as elements of the cyclic group Z3.
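The standardization that defines the shape function S above can be made concrete. The sketch below is our own illustration (the names `shape` and `flat` are not from the thesis): it sends a longest edge of an arbitrary triangle p, q, r onto the segment between −1 and 1, reflects across the imaginary axis if necessary so that the third vertex lands in the first quadrant, and checks the identity flat(∆) = ℑS(∆) from (1.3) and (1.4) numerically.

```python
import random

def shape(p, q, r):
    """Shape S of the triangle p, q, r: the unique point of Sigma similar to it."""
    # Send a longest edge onto the segment between -1 and 1 ...
    a, b, c = max([(p, q, r), (q, r, p), (r, p, q)], key=lambda t: abs(t[1] - t[0]))
    w = (2 * c - (a + b)) / (b - a)
    if w.imag < 0:
        w = -w               # ... orienting the edge so the third vertex lies above R ...
    if w.real < 0:
        w = -w.conjugate()   # ... and reflect across the imaginary axis if needed.
    return w

def flat(p, q, r):
    """flat = 2 * (shortest height) / (longest edge) = 4 * area / (longest edge)^2."""
    area = abs(((q - p).conjugate() * (r - p)).imag) / 2
    longest = max(abs(q - p), abs(r - q), abs(p - r))
    return 4 * area / longest ** 2

random.seed(0)
p, q, r = (complex(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(3))
w = shape(p, q, r)
print(abs(w + 1) <= 2 and w.real >= 0, abs(flat(p, q, r) - w.imag) < 1e-9)
```

Because flat is a similarity invariant and the standardized triangle has longest edge 2 and area ℑw, the second check succeeds for any choice of vertices.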
We always assume that j, j′ ∈ J. For j = (l, k) and the child ∆ := ∆jz we define the outer vertex of ∆ as the vertex of ∆ that is also a vertex of ∆z, and we define the outer edge of ∆ as the edge that is one half of an edge (namely the k-edge) of ∆z.

Let us define the six child functions aj : H0 → Σ that describe the shapes of the children of ∆z:

aj(z) := S(∆jz).   (1.6)

The triangle ∆(n) is chosen uniformly amongst the children of ∆(n−1) and independently of ∆(0), ..., ∆(n−2). Since ∆(n−1) is similar to its standardized version ∆Zn−1, the multiset of the S-values of the children of ∆(n−1) equals the multiset of the S-values of ∆jZn−1 with j ∈ J. Thus Zn = S(∆(n)) is chosen uniformly amongst S(∆jZn−1) with j ∈ J and independently of Z0, ..., Zn−2. We find a sequence j1, j2, ... of independent random indices such that each jn is uniformly distributed on J and Zn = S(∆jnZn−1). Using the child functions this can be written as

Zn = ajn(Zn−1).   (1.7)

In (1.2) we defined the Markov chain Zn as the sequence of S-values of the Markov chain ∆(n). From now on we assume Z0 ∈ H, i.e. ∆(0) is not flat. Hence Zn ∈ H for any n. Analogously to the non-flat setting we define Xn as the sequence of S-values of a corresponding Markov chain of flat triangles: we fix X0 ∈ Σ ∩ R = [0, 1] and set

Xn := ajn(Xn−1).   (1.8)

Notice that we use the same random indices j1, ..., jn for Xn and Zn. In Section 1.2 we present our main result, which describes the a.s. behaviour of this specific coupling (Xn, Zn). We see that Xn is a Markov chain with state space [0, 1] and that Zn is a Markov chain with state space Σ \ R. The Markov kernel Mflat of Xn is given by

Mflat(x, ·) = (1/6) ∑_{j∈J} δaj(x), x ∈ [0, 1],   (1.9)

and the kernel of Zn has the same formula with x replaced by z ∈ Σ \ R.

1.2 Main result

We deal with the following two notions of asymptotic exponential decay of a sequence of complex random variables:

Definition 1.1 Let χ > 0.
We formally write ln(0) = −∞ and say that a sequence Wn of complex random variables decays exactly with rate χ or decays at least with rate χ if a.s.

lim_n (1/n) ln|Wn| = −χ or lim sup_n (1/n) ln|Wn| ≤ −χ, respectively.

An equivalent formulation of this definition is the following: a.s. for any ε > 0 there is some n0 ∈ N such that any n ≥ n0 satisfies

e^{(−χ−ε)n} ≤ |Wn| ≤ e^{(−χ+ε)n} or |Wn| ≤ e^{(−χ+ε)n}, respectively.   (1.10)

In both cases we note that, in particular, Wn converges to 0 a.s.

Definition 1.2 An upcircle is a circle in H0 with tangent line R. For any x ∈ R and z ∈ H we define the radius R(x, z) := |z − x|² / (2ℑz).

We show in Section 2.5 that R(x, z) is the radius of the unique upcircle containing x and z.

Definition 1.3 Let b : Db → R be a function with Db ⊆ R. We say that b is up-to-sign differentiable if there is a continuous function β : Db → [0, ∞) such that for all but finitely many x ∈ Db the function b is differentiable in x with |b′(x)| = β(x). Here β is unique; we call β the up-to-sign derivative of b and formally write |b′| := β.

Before we state our main results, let us summarize Section 1.1: We always assume that the index j lies in J = {0, 1} × {0, 1, 2} = Z2 × Z3. The points aj(z) ∈ Σ are the shapes of the six children of ∆z. We fix arbitrary points Z0 ∈ Σ \ R and X0 ∈ [0, 1]. The Markov chains Zn and Xn with state spaces Σ \ R and [0, 1] have the recursive formulas Zn = ajn(Zn−1) and Xn = ajn(Xn−1), where j1, j2, ... are independent indices which are uniformly distributed on J.

Theorem 1.4 The kernel Mflat of Xn has an invariant measure µ, i.e. µMflat = µ, and µ is unique. Each child function aj viewed as a function [0, 1] → [0, 1] is Lipschitz continuous and has an up-to-sign derivative |a′j| : [0, 1] → (0, ∞). The constant

χ := −(1/6) ∑_{j} ∫ ln|a′j| dµ

is bounded from below by χmin := (1/3) ln(3/2) ≈ 0.1352 and from above by χmax := (1/3) ln(91/2) − ln 3 ≈ 0.1740.

Proof. Postponed to Theorem 4.18 (on page 34). □

Theorem 1.5 Let Rn be the radius R(Xn, Zn). The following three sequences decay exactly with rate χ:

Rn, Zn − Xn, ℑZn = flat(∆(n)).

The sequence ℜZn − Xn decays at least with rate χ.

Proof. Postponed to Theorem 5.17 (on page 46). □

1.3 Relation to earlier work

Let us briefly discuss how our work relates to previous papers that use dynamical systems arguments:

• The first proof of the a.s. convergence to flat triangles in the iterated barycentric subdivision was given by Bárány, Beardon and Carne [1] using Furstenberg's theorem on random products of matrices in SL2(R), see [8].

• Wilkinson [10] explains in Section 1.7 the measurably hyperbolic cocycle (over a measure-preserving dynamical system) which "lurks behind random barycentric subdivision". With the same approach as in [1], Furstenberg's theorem leads to the observation that the aspect ratio defined by

αn := (area of ∆(n)) / (longest edge in ∆(n))² = (1/2) · (shortest height of ∆(n)) / (longest edge in ∆(n))   (1.11)

decays exactly with some rate 2γ, where γ is the (upper) Lyapunov constant. A comparison to our definition (1.1) shows that 4αn = flat(∆(n)). Thus Theorem 1.5 implies χ = 2γ.

• Hough [9] presents an application of Azuma's inequality resulting in lower and upper bounds δ− ≈ 0.05857 and δ+ ≈ 0.09461 for the (upper) Lyapunov constant γ. These bounds show the inaccuracy of the numerical estimate γ ≈ 0.0446945 in [10, Theorem 0.1]. The "bridge" χ = 2γ to the dynamical systems setting gives us the bounds
We translate some oftheir results into our notation and compare them to our work:• Their Theorem 1.1 states that =Zn decays at least with rate χ′ for some universalconstant χ′. Our Theorem 1.5 states that =Zn decays exactly with rate χ for someuniversal constant χ.• Their Lemma 6.1 states that <Zn − Xn converges to 0 in probability. Together withthe a.s. convergence of =Zn to 0 it follows that Zn −Xn converges to 0 in probability.Our Theorem 1.5 shows that Zn −Xn converges a.s. to 0.• Their Theorem 1.3 states that a.s. the limit set of the sequence <Zn is [0, 1]. In Remark6.5, they show under the assumption of an unproven estimate (6.5) that <Zn − Xnconverges a.s. to 0 and they point out how this result (which follows from our Theorem1.5) allows for a much easier proof of their Theorem 1.3.• Their Proposition 5.3 states that the kernel M := Mflat of Xn is ergodic and thatthe Markov chain Xn satisfies the law of large numbers: There is a unique invariantprobability measure µ on [0, 1], i.e. µM = µ, and µ is attracting, i.e. for any probabilitymeasure ν on [0, 1] the sequence νMn converges weakly to µ. For any continuousfunction φ : [0, 1]→ R we have the a.s. convergencelimn1nn∑t=1φ(Xt) =∫φ dµ. (1.13)Little is known about the invariant measure µ. They show that µ is continuous andhas support [0, 1], but neither they nor we have been able to prove that µ is absolutelycontinuous.7• Their proof of Proposition 5.3 relies on a result of Barnsley and Elton [2] which requiresthe technical assumption that the restricted child functions aj : [0, 1] → (0,∞) are“contractions on average”. 
We formulate and verify this assumption in Sections 4 and 11 with a new method that makes the calculations much easier and also yields the bounds χmin ≤ χ ≤ χmax as a byproduct.

1.4 New approach

Remember that we call points z, z′ similar if their associated triangles ∆z, ∆z′ are similar.

Definition 1.6 We call a stochastic process of points Wn ∈ H0 a barycentric process if Wn is similar to ain(Wn−1) for some sequence i1, i2, ... of independent random indices, each of which is uniformly distributed on J. We call a barycentric process Wn flat (or non-flat) if W0, and therefore each Wn, is real (not real).

Shape chains are special barycentric processes for which "similar" in the above definition can be replaced by "equal". In particular, shape chains have state space Σ. We easily verify that Wn is a barycentric process if and only if S(Wn) is a shape chain. All we need is that there is a one-to-one correspondence between the children of Wn−1 and the children of S(Wn−1) such that corresponding children are similar.

Diaconis and Miclo [6] only consider the coupling

(Xn, Zn) = (ajn(Xn−1), ajn(Zn−1))   (1.14)

of a flat and a non-flat shape chain, whereas our approach consists of the analysis and comparison of different couplings of flat and non-flat barycentric processes. The rest of this subsection gives a glimpse into the main ideas behind this new approach, but may be skipped at first reading as the proofs in the following sections do not formally require this discussion.

In Definition 2.10 we introduce the set F containing Möbius transformations f with real coefficients and the set G := F ∪ {mf : f ∈ F}, where m(z) := −z̄ is the reflection across the imaginary axis. We show in Remark 3.12 for each j that there is a set Gj of six functions in G such that for any z the six (not necessarily pairwise different) evaluations g(z) with g ∈ Gj are exactly those points that are similar to aj(z). Hence for fixed z there is a (not necessarily unique) function a^z_j ∈ Gj such that

aj(z) = a^z_j(z).   (1.15)

Notice that a^z_j maps z into Σ. For any other point z′ the image a^z_j(z′) is similar to aj(z′), but not necessarily equal. In particular, a^z_j(z′) does not necessarily lie in Σ.

For g ∈ G let us write component-wise g(x, z) := (g(x), g(z)) for (x, z) ∈ R × H. In Sections 3.2 and 3.3 we quote parts of Definitions 8.1 and 8.5, where we introduce a set D ⊂ R × H and generalized child functions a1,j, a2,j : D → D in such a way that "most" pairs (x, z) ∈ D satisfy

a1,j(x, z) = a^x_j(x, z), a2,j(x, z) = a^z_j(x, z).   (1.16)

This implies that the first component of a1,j(x, z) = (a^x_j(x), a^x_j(z)) is equal to aj(x) ∈ Σ, whereas the second component a^x_j(z) is just similar to aj(z). Analogously, (1.16) implies that the second component of a2,j(x, z) is equal to aj(z) ∈ Σ, whereas the first component is just similar to aj(x).

For each component index c = 1, 2 we define a Markov chain Vc,n = (Xc,n, Zc,n) by Vc,0 := (X0, Z0) and

Vc,n := ac,jn(Vc,n−1).   (1.17)

We see that X1,n is exactly our flat shape chain Xn, whereas Z1,n is just similar to ajn(Z1,n−1), which means that Z1,n is a non-flat barycentric process. Similarly, X2,n is a flat barycentric process and Z2,n is our non-flat shape chain Zn. So each Markov chain Vc,n is a coupling of a flat and a non-flat barycentric process.

The definition of Vc,n may at first seem a bit obscure, so let us reformulate it in the hope of making it clearer. For any n ∈ N we define the random functions

g1,n := a^{Xn−1}_{jn}, g2,n := a^{Zn−1}_{jn}.   (1.18)

By design, the functions g1,n and g2,n only "make sure" that Xn and Zn are mapped into Σ. A comparison to the definitions above shows that (1.17) is equivalent to Vc,n = gc,n(Vc,n−1).
By the component-wise definition g(x, z) := (g(x), g(z)) this means that in any step we obtain (Xc,n, Zc,n) by applying the same function gc,n ∈ G to both components of (Xc,n−1, Zc,n−1). This is the main difference to the coupling (Xn, Zn), where in any step we obtain Xn = g1,n(Xn−1) and Zn = g2,n(Zn−1) by applying potentially different functions g1,n, g2,n ∈ G to the components of (Xn−1, Zn−1).

Applying in every step the same function g ∈ G to both components of Vc,n−1 makes it easier to describe the upcircle containing Xc,n, Zc,n than to describe the upcircle containing Xn, Zn. The essential reason, as discussed in Section 2.5, is that any function g ∈ G preserves upcircles, and in a rather nice way: given the radius of a pair (x, z), see Definition 1.2, we can compute the radius of the pair g(x, z) with the following easy formula, where g′ denotes the derivative of g restricted to the real line:

R(g(x, z)) = |g′(x)| · R(x, z).   (1.19)

For each component index c = 1, 2 we define the radius

Rc,n := R(Vc,n).   (1.20)

The first milestone in the proof of Theorem 5.17 (which restates Theorem 1.5) is that R1,n decays exactly with the rate χ from Theorem 1.4. Showing this is most of the work in this thesis. Afterwards, we conclude that R2,n decays exactly with rate χ, and a comparison between V1,n and V2,n finally leads to the heart of Theorem 1.5, namely that the radius Rn = R(Xn, Zn) decays exactly with rate χ.

How do we prove that R1,n decays exactly with rate χ? As a consequence of formula (1.19) we show that R(a1,j(x, z)) = |a′j(x)| · R(x, z), with |a′j| denoting the up-to-sign derivative of aj. Hence

R1,n = |a′jn(Xn−1)| · R1,n−1.   (1.21)

Our main technical tool is the law of large numbers (1.13), which we modify in Section 4.5 for a two-dimensional Markov chain, resulting in the a.s. convergence

lim_n (1/n) ∑_{t=1}^{n} ln|a′jt(Xt−1)| = −χ.   (1.22)

With (1.21) it follows that R1,n decays exactly with rate χ.

Our method to prove the existence of and a formula for χ is new, even though the law of large numbers (1.13) has been shown in [6]. However, they only use it for the proofs of their Theorems 1.2 and 1.3 and not for their Theorem 1.1, which addresses asymptotic exponential decay.

2 Geometric preliminaries

This section presents auxiliary geometric results required for the proofs of our main results in Sections 4 and 5. Almost all proofs are postponed to Sections 6 and 9. The objects in this section are non-random and "static" (we do not consider any sequences).

In Sections 2.1 and 2.2 we describe the barycentric subdivision in a new geometric setting with explicit formulas that involve Möbius transformations. In Section 2.5 we first introduce a new coordinate system that uniquely describes pairs (x, z) ∈ R × H by their so-called upcircle coordinates (x, r, d). In Section 2.6 we describe Möbius transformations in upcircle coordinates.

2.1 New geometric setting

Unless stated otherwise, we assume throughout this subsection that z is a point in H0. So far, given a triangle ∆z and one of its six children ∆ := ∆jz, we consider the standardized version ∆z′ of ∆, i.e. z′ is the shape S(∆) = aj(z), see definition (1.6) of aj. In order to construct aj(z), we first rotate, rescale and translate ∆ such that a longest edge of ∆ is mapped to the line segment between −1 and 1. This (orientation-preserving) similarity transformation maps ∆ to some new triangle ∆w. Hence the 0-edge of ∆w is a longest edge of ∆w. If w lies in the first quadrant, then z′ = w. Otherwise we obtain ∆z′ by reflecting ∆w across the imaginary axis. In both cases the point z′ lies in Σ and ∆z′ is similar to ∆. Thus z′ is the shape aj(z) of ∆.

Let us introduce a new setting: instead of a similarity transformation that maps the child ∆ to some new triangle ∆z′ and a longest edge of ∆ to the 0-edge of ∆z′, we now consider a similarity transformation that maps ∆ to some new triangle ∆z′ and the outer edge of ∆ to the 0-edge of ∆z′.

Definition 2.1 Let ∆ := ∆jz be some child of ∆z. If the outer edge of ∆ has length 0, then we set Aj(z) := ∞. Otherwise, there is a unique similarity transformation that maps the outer edge of ∆ to the line segment between −1 and 1, the outer vertex of ∆ to 1 and
Thus z′ is the shape aj(z) of ∆.

Let us introduce a new setting: Instead of a similarity transformation that maps the child ∆ to some new triangle ∆z′ and a longest edge of ∆ to the 0-edge of ∆z′, we now consider a similarity transformation that maps ∆ to some new triangle ∆z′ and the outer edge of ∆ to the 0-edge of ∆z′.

Definition 2.1 Let ∆ := ∆jz be some child of ∆z. If the outer edge of ∆ has length 0, then we set Aj(z) := ∞. Otherwise, there is a unique similarity transformation that maps the outer edge of ∆ to the line segment between −1 and 1, the outer vertex of ∆ to 1 and the inner vertex of ∆, namely the barycentre (1/3)z of ∆z, to some new point z′ ∈ H0. We set Aj(z) := z′. This defines a function Aj : H0 → H0.

Figure 2 compares the above definition to the child functions.

Figure 2: Comparison of A(1,1)(z) and a(1,1)(z). Corresponding edges in similar triangles have the same colour.

The following definition extends the shape function S : H0 → Σ to a function defined on all of H0. We use the same notation for this extended function.

Definition 2.2 We write S(∞) := 1 ∈ Σ.

Lemma 2.3 Any z ∈ H0 satisfies

aj(z) = SAj(z). (2.1)

Proof. We set z′ := aj(z) and z′′ := Aj(z). If z′′ = ∞ then two vertices in the child ∆jz are identical, hence z′ = S(∆jz) = 1 = S(z′′). Now suppose z′′ ≠ ∞. Then the triangles ∆z′ and ∆jz are similar. By Definition 2.1 the triangles ∆z′′ and ∆jz are similar, too. Hence z′ and z′′ are similar, i.e. z′ = S(z′) = S(z′′). □

Remark 2.4 Given Aj(z) we can compute aj(z) ∈ Σ, but not the other way around, i.e. Aj(z) contains more information. The reason is that Aj is bijective, whereas aj is not. Furthermore, the explicit formula for Aj is quite easy (as it turns out in Section 2.3), whereas aj involves comparisons of lengths and therefore inconvenient square roots and squares.

2.2 Möbius transformations

Definition 2.5 Let F be the set of all Möbius transformations f : H0 → H0 with coefficients c1, c2, c3, c4 ∈ R, i.e.

f(z) = (c1z + c2)/(c3z + c4), c1c4 − c2c3 > 0.
(2.2)

For any such f and x, y ∈ R we set

f′(x, y) := c/((c3x + c4)(c3y + c4)) ∈ [−∞,∞], c := c1c4 − c2c3 > 0, (2.3)

and we formally write f′(x) := f′(x, x) ∈ (0,∞].

Remark 2.6 Let us present some (well-known or easily verifiable) facts about the above definition: (F, ◦) is a group and any f ∈ F has the following properties:

(i) f is well-defined. With coefficients as in (2.2) we have f(∞) = c1/c3 ∈ R and f−1(∞) = −c4/c3 ∈ R.

(ii) f is bijective and maps R to R and H to H.

(iii) Let x ∈ R. If f(x) = ∞ then f′(x) = ∞. If f(x) ≠ ∞ then f is differentiable in x and the derivative is indeed f′(x) = f′(x, x) ∈ (0,∞).

(iv) Let x, y ∈ R with x ≠ y. Then

f′(x, y) = (f(x) − f(y))/(x − y).

Definition 2.7 We define:

(i) the mirror function m : H0 → H0 by m(z) := −Re z + i Im z = −z̄ for z ∈ H0 and m(∞) := ∞,

(ii) G := {mlf : l ∈ {0, 1}, f ∈ F} = F ∪ {mf : f ∈ F},

(iii) (mf)′(x) := −f′(x) ∈ [−∞, 0) for f ∈ F and any x ∈ R,

(iv) g(v) := (g(v1), g(v2)) component-wise for g ∈ G and any pair v = (v1, v2) ∈ H0 × H0.

Remark 2.8 Let us comment on parts of the above definition:

(i) The mirror function reflects any point z, as well as its associated triangle ∆z, across the imaginary axis.

(iii) Definitions 2.5 and 2.7 define for any g ∈ G the (formal) derivative g′ : R → [−∞,∞] \ {0}.

(iv) It is always clear from the context whether g denotes a function from H0 to itself or a function from H0 × H0 to itself.

Lemma 2.9 (G, ◦) is a group with subgroup (F, ◦). Any g ∈ G has the following properties:

(i) g is bijective and maps R to R and H to H.

(ii) For x, y ∈ R with x ≠ y the square of the difference quotient can be expressed in terms of two derivatives via

(g(x) − g(y))²/(x − y)² = |g′(x)| · |g′(y)|.

Proof. In order to see that G is a group we first notice that f˜ := mfm ∈ F because f˜(z) = (c1z − c2)/(−c3z + c4) and c1c4 − (−c2)(−c3) > 0. Since m² = id it follows that f˜m = mf and mf˜ = fm. Now the verification of the group properties of G is straightforward.

(i) follows from Remark 2.6.(ii).

(ii): Let g = mlf with f ∈ F.
Then the left-hand side of the formula in (ii) is f ′(x, y)2 =f ′(x)2f ′(y)2, see Definition 2.5. This equals the right-hand side because |g′(x)| = |f ′(x)| and|g′(y)| = |f ′(y)|. 152.3 Explicit formula for the barycentric subdivisionDefinition 2.10 We define:(i) the rotation function h ∈ F byh(z) := z − 3z + 1 = 1−4z + 1 ,(ii) the six functions Sj ∈ G byS(l,k) := mlhk, (l, k) ∈ J ,(iii) the rescaling function α ∈ F byα(z) := 23z − 1.Remark 2.11 Let us explain the reason for the name “rotation function”: Lemma 6.7.(i)shows that the (associated triangles of) z and h(z) are similar with the same orientation andthat the k-edge of z corresponds with the (k − 1)-edge of h(z).Proposition 2.12 For any z ∈ H0 the explicit formula for the barycentric subdivision isAj(z) = αSj(z).Proof. Postponed to Proposition 6.9 (on page 52). Let us use the formulas in the above proposition and in Lemma 2.3 for the followingextensions of the functions Aj : H0 → H0 and aj : H0 → Σ to functions defined on all of H0.We use the same notation Aj and aj for these extended functions.Definition 2.13 We define:(i) Aj := αSj ∈ G,(ii) the child function aj := SAj : H0 → Σ.162.4 Shape setsDefinition 2.14 We define the shape setsΣ0,k := {z : the k-edge is a longest edge and the (k − 1)-edge is a shortest edge of z},Σ1,k := {z : the k-edge is a longest edge and the (k + 1)-edge is a shortest edge of z}.We include ∞ in Σ(0,1) and Σ(1,2), but not in the other shape sets.Remark 2.15 Let us comment on the above definition:(i) By definition, the set Σ(0,0) is just Σ.(ii) Figure 3 shows how the imaginary axis and the two circles of radius 2 around −1 and1 cut H0 into the six (closed) shape sets. Notice that the only unbounded shape setsΣ(0,1) and Σ(1,2) are also the only shape sets that include the element ∞.(iii) We show in Lemma 6.11.(i) that the six (not necessarily pairwise different) evaluationsSj(z) with j ∈ J are exactly those points that are similar to z. 
In part (iii) of thesame lemma we prove the characterizationΣj = {z ∈ H0 : S(z) = Sj(z)}. (2.4)Figure 3: The shape sets Σj and elements zj ∈ Σj with z(0,0) = S(zj) = Sj(zj).172.5 Upcircle coordinatesThroughout this subsection we assume (x, r, d), (x′, r′, d′) ∈ R × (0,∞) × R and z ∈ H. InDefinition 1.2 we defined an upcircle as a circle in H0 with tangent line R, see Figure 4, andwe defined the radius of (x, z). Now we extend this definition.Figure 4: Upcircle U(x, r).Definition 2.16 We define the(i) upcircle U(x, r) := {z : |z − (x+ ir)| = r} ⊆ H0,(ii) radius function R(x, z) := |z − x|2/(2=z) ∈ (0,∞),(iii) angle function D(x, z) := (<z − x)/=z ∈ R,(iv) unit upcircle function γ(d) := 2/(d− i) ∈ H,(v) upcircle coordinates function u(x, z) := (x,R(x, z), D(x, z)).Remark 2.17 Let us comment on two parts of the above definition:(i) The upcircle U(x, r) is the circle of radius r that intersects the real line exactly in x.We call U(0, 1) the unit upcircle and notice that U(x, r) = x+ r · U(0, 1).18(iii) The angle D(x, z) is the cotangent of the actual angle θ = ∠(x+ 1, x, z), see Figure 4.Proposition 2.18 The objects from Definition 2.16 have the following properties:(i) γ : R→ U(0, 1) \ {0} is bijective.(ii) u is bijective with inverse u−1(x, r, d) = (x, x+ r · γ(d)) ∈ {x} × U(x, r).Proof. Postponed to Proposition 9.3 (on page 70). Remark 2.19 Let us comment on the above proposition:(i) Since u−1u(x, z) = (x, z) we see that z = x + R(x, z) · γ(D(x, z)) ∈ U(x,R(x, z)). SoR(x, z) is the (unique) radius r such that z lies on the upcircle U(x, r) and the positionof z on that upcircle is uniquely determined by D(x, z).(ii) Since uu−1(x, r, d) = (x, r, d) we see that x + r · γ(d) is the (unique) point z ∈ H suchthat R(x, z) = r and D(x, z) = d. Our proof in Section 9.1 uses the following lemmain the special case x′ = x, whereas the general case with arbitrary x′ ∈ R is needed forthe proof of Lemma 5.9 in Section 5.3.Lemma 2.20 Let z := x + r · γ(d) and x′ ∈ R. 
For c := d² + 1 and y := (x − x′)/(2r) we have the formulas

R(x′, z)/r = cy² + 2dy + 1, D(x′, z) − d = cy.

2.6 Möbius transformations in upcircle coordinates

Definition 2.21 For g ∈ G we define g] : R → [−∞,∞] \ {0} by

g](x) := 2/(x − g−1(∞)).

Remark 2.22 Let us show the connection between g] and the (formal) derivative g′ : R → [−∞,∞] \ {0}, see Remark 2.8.(iii) (even though we do not need this in the following): Suppose that f ∈ F is not affine linear. With the notation from Definition 2.5 we have

f′(x) = (c/c3²) · 1/(x − f−1(∞))². (2.5)

Hence f′ and (f])² are identical up to a constant (positive) factor. Due to Definition 2.7.(iii) of the formal derivative (mf)′ it follows for any g ∈ G that g′ and (g])² are identical up to a constant (real) factor.

Proposition 2.23 Let g ∈ G and (x, z) ∈ R × H such that g(x) ≠ ∞ (which is equivalent to |g′(x)| ≠ ∞). Suppose we know the upcircle coordinates (x, r, d) := u(x, z).

(i) The upcircle coordinates (x′, r′, d′) := ug(x, z) have the formulas

x′ = g(x), r′ = |g′(x)| · r, |d′| = |d + g](x) · r|.

(ii) g preserves upcircles in the following way:

g(U(x, r)) = U(x′, r′).

Proof. Postponed to Proposition 9.6 (on page 71). The essential reason for (ii) is that in (i) the formulas for x′ and r′ do not depend on d. □

Remark 2.24 The formula in the above proposition expresses the absolute value of the new angle d′ in terms of g and the previous upcircle coordinates (x, r, d). Let us explain the geometric meaning of the absolute value of an angle d (even though we do not need this in the following): We easily verify that γ(|d|) = Qγ(d) for the projection Q from H0 into the first quadrant, i.e. Q(z) = |Re z| + i Im z. So if a pair (x, z) has upcircle coordinates (x, r, d), or equivalently z − x has upcircle coordinates (0, r, d), then Q(z − x) has upcircle coordinates (0, r, |d|).

3 Other preliminaries

This section presents auxiliary results about the (generalized) child functions required for the proofs of our main results in Sections 4 and 5.
All proofs are postponed to Sections 7, 8and 9.3.1 Up-to-sign derivatives of the child functionsLet us introduce the following convenient notation for iterated compositions.Definition 3.1 Let C be a set and (fi)i∈I be a family of functions fi : C → C. For n ∈ N0and a multiindex I = (i1, . . . , in) ∈ In we set fI := id, if n = 0, and otherwisefI := fin . . . fi1 .For the rest of this subsection we assume that x, y are real variables with x 6= y. Let usrestate Definition 1.3 (on page 5) in part (i) of the following definition.Definition 3.2 Let b : Db → R be a function with Db ⊆ R.(i) We say that b is up-to-sign differentiable if there is a continuous function β : Db →[0,∞) such that for all but finitely many x ∈ Db the function b is differentiable in xwith |b′(x)| = β(x). In that case, β is unique. We call β the up-to-sign derivative of band we formally write |b′| := β.(ii) We formally write |b′(x, y)| := |b(x)− b(y)|/|x− y| for x, y ∈ Db.Proposition 3.3 We restrict the child functions to functions aj : [0, 1]→ [0, 1]. Let I ∈ J nbe a multiindex with n ∈ N0. Then the composition aI is Lipschitz-continuous and has anup-to-sign derivative |a′I | : [0, 1]→ (0,∞). Moreover, this up-to-sign derivative satisfies:(i) the chain rule |a′I(x)| = |a′i(aJ(x))| · |a′J(x)|, if we write I = (J, i) with J ∈ In−1,21(ii) the “Cauchy-Schwarz inequality” |a′I(x, y)| ≤ |a′I(x)|1/2|a′I(y)|1/2.In the “n = 1” case I = j ∈ J we have:(iii) |a′j(x)| = |Sj′Aj(x)|, if aj(x) = Sj′Aj(x),(iv) |a′j(x)| = minj′ |(Sj′Aj)′(x)|.Proof. Postponed to Proposition 7.10 (on page 61). Remark 3.4 If we interpret |a′I(x)| as |a′I(x, x)|, then inequality (ii) is reminiscent of theactual Cauchy-Schwarz inequality. 
Inequality (ii) is crucial for Section 4, where we prove theergodicity of the flat shape chain.3.2 Generalized shape functionsRemark 3.5 According to Remark 2.15.(iv), on each shape set Σj the shape function Sequals the upcircle preserving function Sj ∈ G, but unfortunately, S itself does not preserveupcircles. Why? Imagine we fix x ∈ R∩Σj and let r grow. Then C := U(x, r) grows and willeventually contain an element z ∈ Σj′ \Σj. But then applying S pointwise to C means thatwe apply the different upcircle preserving functions Sj and Sj′ to different parts of C, thusS(C) is not necessarily an upcircle anymore (S(C) is definitely not an upcircle, if x /∈ Σj′).From now on we write v = (v1, v2) for (x, z) and consider a fixed component indexc ∈ {1, 2}.Remark 3.6 If we defined S(v) component-wise, then we would have S(v) = (Sj(v1), Sj′(v2))for potentially different upcircle preserving functions Sj and Sj′ . Instead, we construct inthe following a generalized shape function Sc : D → D for some set D ⊆ R × H whichfor any input v applies the same upcircle preserving function Sj to both components ofv, where j is chosen such that vc ∈ Σj. This choice ensures that the cth component22of Sc(v) = (Sj(v1), Sj(v2)), namely Sj(vc), equals S(vc) and therefore lies in Σ (unfortu-nately, the other component of Sc(v) does not necessarily lie in Σ). With the projectionpic(v1, v2) := vc onto the cth component we can express this “design feature” of Sc aspicSc = Spic : D → Σ. (3.1)In Definition 8.1 we define the exceptional set E ⊆ R and the above mentioned set D asD := (R \ E)×H. (3.2)The details do not matter here, all we need to know for now is that the following lemma andproposition hold.Lemma 3.7 The sets E and D have the following properties:(i) E is countable.(ii) If (x, z) ∈ D, then Aj(x, z) ∈ D.Proof. Postponed to Lemma 8.2 (on page 62). 
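All constructions in this subsection revolve around upcircle-preserving functions, so it may help to see the preservation property (Proposition 2.23.(ii)) and the radius formula (1.19) at work on a concrete example. The following sketch is not part of the thesis: it picks an arbitrary f ∈ F (the coefficients are an invented example) and checks numerically that f maps the unit upcircle U(0, 1) onto U(f(0), |f′(0)|) and scales radii by |f′(0)|:

```python
# Numerical check of upcircle preservation (Proposition 2.23.(ii)) and the
# radius formula (1.19) for one real-coefficient Moebius transformation f in F.
import cmath
import math

c1, c2, c3, c4 = 2.0, 1.0, 1.0, 3.0      # example coefficients, c1*c4 - c2*c3 = 5 > 0
c = c1 * c4 - c2 * c3

def f(z):
    return (c1 * z + c2) / (c3 * z + c4)

def f_prime(x):
    return c / (c3 * x + c4) ** 2        # derivative of f restricted to the real line

def radius(x, z):
    return abs(z - x) ** 2 / (2 * z.imag)   # R(x, z), Definition 2.16.(ii)

x, r = 0.0, 1.0                           # the unit upcircle U(0, 1)
x_new, r_new = f(x), abs(f_prime(x)) * r  # predicted image upcircle U(x', r')

for k in range(12):
    z = complex(0, r) + r * cmath.exp(2j * math.pi * k / 12)  # point on U(0, 1)
    if abs(z) < 1e-12:
        continue                          # skip the tangent point z = 0
    w = f(z)
    # w lies on the predicted upcircle: |w - (x' + i r')| = r'
    assert abs(abs(w - complex(x_new, r_new)) - r_new) < 1e-9
    # radius formula (1.19): R(f(x), f(z)) = |f'(x)| * R(x, z)
    assert abs(radius(x_new, w) - abs(f_prime(x)) * radius(x, z)) < 1e-9

print("upcircle preservation and the radius formula check out")
```

The same check works for any coefficients with positive determinant and any base point x with c3x + c4 ≠ 0.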
Proposition 3.8 There is a function Sc : D → D such that any v ∈ D satisfies the followingtwo conditions:(i) Sc(v) = Sj(v) for some j such that vc ∈ Σj,(ii) ScSj(v) = Sc(v) for any j.Proof. Postponed to Proposition 8.3 (on page 63). Corollary 3.9 Any v ∈ D satisfies the following two conditions:(i) The multiset of the six evaluations SjSc(v) equals the multiset of the six evaluationsSj(v).(ii) Sc′Sc(v) = Sc′(v) for any c, c′ ∈ {1, 2}.23Proof. Postponed to Corollary 8.4 (on page 63). The functions S1, S2 satisfying the conditions in the above proposition are not unique. Inthe following, we fix a specific choice of such functions S1, S2 and call them the generalizedshape functions.3.3 Definition of the generalized child functionsDefinition 3.10 For c ∈ {1, 2} we define the generalized child functionac,j := ScAj : D → D.Indeed, ac,j maps D to D because (the two-dimensional version of) Aj maps D to D, seeLemma 3.7.(ii). Only for the following two remarks we setgj′,j := Sj′Aj ∈ G, j′, j ∈ J , (3.3)and notice for any j and z that there exists an index j′ such thataj(z) = gj′,j(z). (3.4)The following remark is an almost literal copy of Remark 3.6 with the (generalized) shapefunction replaced by a (generalized) child function.Remark 3.11 If we defined aj(v) component-wise, then we would have aj(v) = (gj′,j(v1), gj′′,j(v2))for potentially different upcircle preserving functions gj′,j and gj′′,j. Instead, each general-ized child function ac,j applies the same upcircle preserving function gj′,j to both componentsof v, where j′ is chosen such that Aj(vc) ∈ Σj′ . This choice ensures that the cth component ofac,j(v), namely gj′,j(vc) = Sj′Aj(vc) = SAj(vc) = aj(vc), lies in Σ (unfortunately, the othercomponent of ac,j(v) does not necessarily lie in Σ). With the projection pic(v1, v2) := vc ontothe cth component we can express this “design feature” of ac,j aspicac,j = ajpic : D → Σ. 
(3.5)24We also could have shown the last line with the “design feature” (3.1) of Sc, that is picSc =Spic : D → Σ. We compose from the right side with the function Aj : D → D. Together withpicAj = Ajpic, which is a consequence of the component-wise definition of Aj(v), it follows(3.5).Remark 3.12 Let us show consistency with Section 1.4, where we alluded to the generalizedchild functions. This remark may be skipped at first reading as the following proofs do notformally require this discussion. In (1.16) we presented the “characteristic feature” of ac,j:For “most” v = (x, z) ∈ D we obtain ac,j(v) by applying the same function avcj ∈ Gj ⊆ G toboth components of v, that isac,j(v) = avcj (v). (3.6)Let us now clarify the meaning of “most” and reveal the definitions of those objects thatwere just informally introduced in Section 1.4. Consider the subset Gj := {gj′,j : j′ ∈ J } ofG. The six (not necessarily pairwise different) evaluations g(z) with g ∈ Gj are exactly thosepoints that are similar to Aj(z), or equivalently, that are similar to aj(z). For any fixed zthere is a (not necessarily unique) index j′ such that aj(z) = gj′,j(z) and for one such indexj′ we setazj := gj′,j ∈ Gj. (3.7)Hence (1.15) is satisfied, that is aj(z) = azj(z). We claimed “most” elements (x, z) of Dsatisfy (1.16), that isa1,j(x, z) = axj (x, z), a2,j(x, z) = azj(x, z). (3.8)Let us discuss what “most” means. Fix j and a component index c ∈ {1, 2}. Let us defineDcj as the set of all pairs v ∈ D such that the pair v′ := Aj(v) has the following property:Its cth component v′c = Aj(vc) lies in exactly one shape set, which means that there is justone possible choice for the index j′ in the above definition of avcj . But by Proposition 3.8this also means that there is just one possible choice j′ such that Sc(v′) = Sj′(v′). Thus anyv ∈ Dcj satisfiesac,j(v) = ScAj(v) = Sj′Aj(v) = gj′,j(v) = avcj (v). (3.9)25This is the cth equation in (3.8). 
In Definition 8.1 we formally introduce D and it is thenclear that D1j is actually the same as D and that D \D2j has Lebesgue measure 0. So it isfair to say that “most” elements of D lie in Dcj .3.4 Generalized child functions in upcircle coordinatesIn this subsection we assume that x, y are real variables with x 6= y. We consider thegeneralized child functions ac,j : D → D in the case c = 1. In Definition 9.7 we define afunction a]j : R \ E → R such that the following lemma holds which is a consequence ofProposition 2.23.(i).Lemma 3.13 The function a]j has the following properties:(i) a]j is well-defined and bounded.(ii) Given the upcircle coordinates (x, r, d) := u(x, z), we have the following formula forthe upcircle coordinates (x′, r′, d′) := ua1,j(x, z):x′ = aj(x), r′ = |a′j(x)| · r, |d′| = |d+ a]j(x) · r|.Proof. Postponed to Lemma 9.9 (on page 73). Let us introduce the following functions that are central for Sections 4 and 5.Definition 3.14 Let n ∈ N0 and Jn := (j1, . . . , jn) be a random multiindex. We define(random) functions xn, rn, dn and a (non-random) function Φn:(i) xn : [0, 1]→ [0, 1] : x 7→ aJn(x),(ii) rn : [0, 1]→ (0,∞) : x 7→ |x′n(x)|,(iii) Φn : [0, 1]→ R : x 7→ E ln rn(x),(iv) dn : D → R : v 7→ D(a1,Jn(v)).26Note that x0(x) = x and r0(x) = 1, as well as Φ0(x) = 0 and d0(v) = D(v). We use the samenotation rn as for the up-to-sign derivative and define for x, y ∈ [0, 1] the difference quotientrn(x, y) :=|xn(x)− xn(y)||x− y| . (3.10)Remark 3.15 Let us translate parts (i) and (ii) of the (non-random) Proposition 3.3 intothis new (random) framework: The function xn is Lipschitz-continuous and has the up-to-sign derivative rn : [0, 1]→ (0,∞). 
Moreover, rn satisfies

(i) the chain rule rn(x) = |a′jnxn−1(x)| · rn−1(x),

(ii) the “Cauchy-Schwarz inequality” rn(x, y) ≤ rn(x)1/2 rn(y)1/2.

Lemma 3.16 Let v ∈ D and (x, r, d) := u(v).

(i) For n ∈ N0 the pair a1,Jn(v) has the upcircle coordinates

ua1,Jn(v) = (xn(x), rn(x) · r, dn(v)).

(ii) For n ∈ N we have the recursive formulas

rn(x) = |a′jnxn−1(x)| · rn−1(x), (3.11)

|dn(v)| = |dn−1(v) + a]jnxn−1(x) · rn−1(x) · r|. (3.12)

Proof. (i): A simple induction using Remark 3.15.(i) and Lemma 3.13.(ii) shows that the first two components of the upcircle coordinates ua1,Jn(v) are indeed xn(x) and rn(x) · r.

(ii) follows from (i) by using Lemma 3.13.(ii) a second time. □

4 Ergodicity of the flat shape chain

Theorem 1 in [2] shows the ergodicity of the flat shape chain Xn under the assumption of the “average contractivity between points” condition. This technical assumption requires the geometric mean of the difference quotients |a′j(x, y)| to be uniformly bounded in x and y by a number strictly less than 1. In Section 4.1 we prove the equivalence of average contractivity to the boundedness of a one-dimensional function, which we later verify in Section 4.3. In Section 4.2 we discuss the ergodicity and the law of large numbers for Xn. The existence of an invariant measure µ leads to the proof of the first theorem (Theorem 1.4) in Section 4.4. The law of large numbers for Xn is modified in Section 4.5 for a two-dimensional Markov chain whose first component is Xn. This modified law of large numbers provides the main technical tool for the proof of the second theorem (Theorem 1.5) in Section 5.

The ergodic results of Sections 4.2 and 4.5 are not specific to the iterated barycentric subdivision. We take account of this by formulating them in a more general setting in order to emphasize the relevant underlying structure.
We “weave in” explanations (called applica-tions) how the general framework applies to our specific setting of the iterated barycentricsubdivision.4.1 Contractivity on averageWe remind ourselves of the notation from Definition 3.1 for iterated compositions fI =fin . . . fi1 with I = (i1, . . . , in).Setting 4.1 We impose the following general setting:(i) Let d ∈ N and C be a compact subset of (Rd, | · |), where | · | is some norm. We fix afinite index set I and always assume i ∈ I. Let (fi)i be a family of Lipschitz-continuousfunctions fi : C → C.28(ii) Consider the Markov kernel M from C to C given byM(x, ·) := 1|I|∑iδfi(x), x ∈ C. (4.1)(iii) Let i1, i2, . . . be a sequence of independent random indices such that each in is uniformlydistributed on I. For n ∈ N0 we define the random multiindex In := (i1, . . . , in) whichis uniformly distributed on In.Remark 4.2 If each function fi is not only Lipschitz-continuous, but even a contraction,then (fi)i contracts on average. However, there are families (fi)i which contract on averagewithout fi being (global) contractions.Application 4.3 The setting of the flat barycentric subdivision is a special case of Setting4.1 with d = 1, C = [0, 1], I = J and fi : C → C as the restricted child function ai. HenceM = Mflat. We also identify in = jn and In = Jn.For the rest of this subsection we always assume that x 6= y. We remind ourselves of thenotation |b′(x, y)| = |b(x)− b(y)|/|x− y|, see Definition 3.2.Definition 4.4 Let us define the following notions of contractivity on average:(i) (fi)i contracts on average in step N ∈ N, ifsupx,y∈CE ln |f ′IN (x, y)| < 0,(ii) (fi)i strongly contracts on average in step N ∈ N, if there exists  > 0 such thatsupx,y∈CE ln(max{|f ′IN (x, y)|, })< 0.(iii) (fi)i (strongly) contracts on average, if it (strongly) contracts on average in step 1.Application 4.5 The family (aj)j contracts on average in step N , if and only ifsupx,y∈[0,1]E ln rN(x, y) < 0. 
(4.2)29Remark 4.6 The very technical assumption (ii) in the above definition is tailor-made forthe modified law of large numbers stated in Proposition 4.25. It also obviously impliesassumption (i) on which we comment now:(i) The family (fi)i∈I contracts on average in step N if and only if the family (fI)I∈INcontracts on average. The reason is that IN is uniformly distributed on IN .(ii) If (fi)i contracts on average, then it also contracts on average in any step N ∈ N. Eventhough we do not need this observation in the following, let us briefly justify it: Weimmediately verify that|f ′IN (x, y)| = |f ′in(fIn−1(x), fIN−1(y))| · |f ′In−1(x, y)|. (4.3)Applying supx,y E ln(·) to both sides and using the independence of in and In−1 showsthat (fi)i contracts on average in step N , if (fi)i contracts on average in step 1 andN − 1. Now we use induction over N ∈ N.(iii) The family (fi)i contracts on average in step N if and only if there exists κ < 0 suchthat any x, y ∈ C satisfy1|I|∑iln |f ′i(x, y)| ≤ κ.Application 4.7 Let N ∈ N. The following conditions are equivalent:(i) (aj)j strongly contracts on average in step N .(ii) (aj)j contracts on average in step N .(iii) ΦN is bounded from above by a negative number.Proof. Remember that ΦN = E ln rN : [0, 1]→ R, where rN denotes the up-to-sign derivativeof xN = aJN .“(i)⇒(ii)” is trivial.“(ii)⇒(iii)”: Since rN(x) = limy→x rN(x, y) we have Φ(x) = limy→x E ln rN(x, y). Thussupx∈[0,1]Φ(x) ≤ supx,y∈[0,1]E ln rN(x, y). (4.4)30“(iii)⇒(i)”: Any x, y ∈ [0, 1] satisfy the “Cauchy-Schwarz inequality” in Remark 3.15,namelyrN(x, y) ≤ rN(x)1/2rN(y)1/2. (4.5)The image set C ′ of rN is a subset of (0,∞) and compact in R. Let  := minC ′ > 0. Thusthe right-hand side of inequality (4.5) is at least . Applying the non-decreasing functionln(max{·, }) : R→ R to the inequality shows thatln (max {rN(x, y), }) ≤ 12 ln rN(x) + 12 ln rN(y). (4.6)Taking first the expectation and then the supremum shows that (iii) implies (i). 
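Definition 4.4 can be illustrated by a toy family that has nothing to do with the child functions (the maps below are an invented example): on C = [0, 1] take f1(x) = x² and f2(x) = x/4. Then f1 is not a contraction, since its difference quotient |f′1(x, y)| = x + y comes arbitrarily close to 2, but the family still contracts on average, because (1/2)(ln |f′1(x, y)| + ln |f′2(x, y)|) ≤ (1/2) ln(2 · 1/4) < 0 — exactly the situation announced in Remark 4.2. A short numerical check:

```python
# Toy example for Definition 4.4: f1(x) = x^2 is not a contraction on [0, 1],
# yet the family {f1, f2} with f2(x) = x / 4 contracts on average.
import math

def q1(x, y):
    return x + y          # |f1'(x, y)| = |x^2 - y^2| / |x - y| = x + y

def q2(x, y):
    return 0.25           # |f2'(x, y)| = 1/4 for all x, y

# approximate sup_{x,y} E ln |f'_{I_1}(x, y)| over a grid, I_1 uniform on {1, 2}
grid = [k / 100 for k in range(101)]
sup_avg = max(
    0.5 * (math.log(q1(x, y)) + math.log(q2(x, y)))
    for x in grid for y in grid if x != y
)

assert sup_avg < 0        # contracts on average, Definition 4.4.(i) with N = 1
print("sup of averaged log difference quotients:", sup_avg)
```

The grid supremum approaches the exact value (1/2) ln(1/2) ≈ −0.347 as the grid is refined.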
Remark 4.8 Let us comment on the equivalence in Application 4.7:(i) Condition (iii) is the easiest to verify because ΦN is one-dimensional, whereas eachdifference quotient rN(·, ·) is a two-dimensional function.(ii) Using this equivalence, we show in Remark 4.17 that (aj)j (strongly) contracts onaverage in some step N ∈ N. For our purpose the existence of such N is all we need.(iii) Without using this equivalence, [6] show via numerical computations1 that (aj)j con-tracts on average in step 2, but not in step 1.4.2 Law of large numbersLet us impose Setting 4.1 in this subsection.Definition 4.9 We define:(i) M is ergodic, if it has a unique invariant probability measure µ which is also attracting,i.e. µM = µ and νMn converges weakly to µ for any probability measure ν on C,1included in the appendix of [7], but not included in the published version [6].31(ii) For any x ∈ C we define the Markov chainUxn := fIn(x), n ∈ N0,which starts in Ux0 = x and has kernel M .(iii) M satisfies the law of large numbers, if M is ergodic with invariant measure µ and forany continuous function φ : C → R and x ∈ C we have a.s.limn1n+ 1n∑t=0φ(Uxt ) =∫φ dµ.Application 4.10 As a consequence of fi = ai and In = Jn we have Uxn = aJn(x) = xn(x),see Definition 3.14.(i).Proposition 4.11 If (fi)i contracts on average, then M satisfies the law of large numbers.Proof. We apply Theorem 1 from [2] in the following special case: The complete metric space(X, d) is (C, | · |), the Lipschitz-continuous function wi is fi for an index i and the probabilitypi to choose wi in any given step is (uniformly) 1/|I|. The assumption of this theorem isexactly the equivalent reformulation of (fi)i contracting on average in Remark 4.6.(iii). Thefirst part of the theorem states that M is ergodic.Using the ergodicity of M , Lemma 5.2 from [6] shows the law of large numbers for M .Again, we work with the uniform distribution pi = 1/|I|. 
Their lemma is presented in thespecial case d = 1, but they emphasize that the compact set C (or S in their notation) canbe any complete, separable metric space. Lemma 4.12 Let N ∈ N. The kernel MN = M . . .M has the formulaMN(x, ·) = 1|IN |∑I∈INδfI(x), x ∈ C, (4.7)and satisfies the following properties:(i) If MN is ergodic, then M is ergodic.(ii) If MN satisfies the law of large numbers, then M satisfies the law of large numbers.32Proof. Postponed to Lemma 10.4 (on page 79) where we present the arguments in the proofof Proposition 5.3 from [6] in greater detail. Proposition 4.13 If (fi)i contracts on average in some step N , then M satisfies the law oflarge numbersProof. As mentioned in Remark 4.6.(i) the family (fI)I∈IN contracts on average, so by Propo-sition 4.11 the kernel MN satisfies the law of large numbers. Applying Lemma 4.12.(ii)finishes the proof. The following lemma presents other formulas for the right-hand side in Definition 4.9.(iii).Lemma 4.14 Suppose M has an invariant measure µ. Let φ : C → R be continuous andN ∈ N0. Then ∫φ dµ = E∫φfIN dµ.Proof. Since µ = µMN we know∫φ dµ =∫ ∫φ dMN(x, ·) µ(dx). (4.8)Due to formula (4.7) for MN , the inner integral in the above double integral equals1|IN |∑I∈IN∑IφfI(x) = Eφ(UxN). (4.9)It only remains to interchange the outer integral with the expectation E which can be writtenas a finite sum. 4.3 Shape integrandWhere do the bounds χmin = 13 ln32 and χmax =13 ln912 − ln 3 in Theorem 1.4 come from? InDefinition 11.1 we introduce (non-random) continuous functions Ψ,Ψn : R→ R withΨn(x) := EΨxn(x), n ∈ N0. (4.10)We call Ψ the shape integrand and Ψn the shape integrand of order n. The details of Ψare not relevant for this section; all we need to know for now is that the following twopropositions hold.33Proposition 4.15 For any n ∈ N0 and x ∈ [0, 1] we have the following approximation ofΦn(x) by the sum of the shape integrands of orders up to n− 1:∣∣∣∣∣Φn(x)−n−1∑t=0Ψt(x)∣∣∣∣∣ ≤ ln 43 .Proof. 
Postponed to Proposition 11.3 (on page 86). □

Proposition 4.16 The shape integrand has the following extrema:

max_{x∈[0,1]} Ψ(x) = Ψ(0) = −χmin, min_{x∈[0,1]} Ψ(x) = Ψ(1) = −χmax.

Proof. Postponed to Proposition 11.5 (on page 87). □

Remark 4.17 By Propositions 4.15 and 4.16 we have that

sup_{x∈[0,1]} Φn(x) ≤ ∑_{t=0}^{n−1} sup_{x∈[0,1]} Ψt(x) + ln(4/3) ≤ −χmin · n + ln(4/3), (4.11)

inf_{x∈[0,1]} Φn(x) ≥ ∑_{t=0}^{n−1} inf_{x∈[0,1]} Ψt(x) − ln(4/3) ≥ −χmax · n − ln(4/3). (4.12)

4.4 Proof of the first theorem

Let us restate and prove Theorem 1.4 (on page 5).

Theorem 4.18 The kernel Mflat of Xn has an invariant measure µ, i.e. µMflat = µ, and µ is unique. Each child function aj, viewed as a function [0, 1] → [0, 1], is Lipschitz-continuous and has an up-to-sign derivative |a′j| : [0, 1] → (0,∞). The constant

χ := −(1/6) ∑_j ∫ ln |a′j| dµ

is bounded from below by χmin := (1/3) ln(3/2) ≈ 0.1352 and bounded from above by χmax := (1/3) ln(91/2) − ln 3 ≈ 0.1740.

Proof. Since χmin > 0, the upper bound (4.11) implies for some N ∈ N that the function ΦN is bounded from above by a negative number. According to Application 4.7 this is equivalent to (aj)j contracting on average in step N. It follows with Proposition 4.13 that Mflat satisfies the law of large numbers. In particular, Mflat has an invariant measure µ. Choosing φ = Ψ : [0, 1] → R shows that a.s.

lim_n (1/n) ∑_{t=0}^{n−1} Ψxt(x) = ∫ Ψ dµ =: κ. (4.13)

By Proposition 4.16 we see that κ and each term (1/n) ∑_{t=0}^{n−1} Ψxt(x) with n ∈ N are bounded from below by −χmax and from above by −χmin. Taking the expectation and using the definition Ψt(x) = EΨxt(x), as well as the dominated convergence theorem, leads to the a.s. convergence

lim_n (1/n) ∑_{t=0}^{n−1} Ψt(x) = κ. (4.14)

It follows with Proposition 4.15 that a.s. lim_n (1/n)Φn(x) = κ. According to Remark 4.17 the random variables (1/n)Φn(x) are uniformly bounded in n ∈ N. Taking the integral w.r.t. the probability measure µ and applying again the dominated convergence theorem leads to the a.s. convergence

lim_n (1/n) ∫ Φn dµ = κ.
(4.15)

It remains to show that ∫ Φn dµ = −χ · n for any n ∈ N0 because this implies that χ = −κ ∈ [χmin, χmax]. Lemma 4.14 in the case φ = ln |a′j| yields ∫ ln |a′j| dµ = E ∫ ln |a′jxn−1(x)| µ(dx). Together with the definition of χ and the independence of jn and xn−1(x) it follows that

−χ = (1/6) ∑_j ∫ ln |a′j| dµ = E ∫ ln |a′jn(x)| µ(dx) = E ∫ ln |a′jnxn−1(x)| µ(dx). (4.16)

In the last term we can interchange the expectation E and the integral w.r.t. µ. The implication Φn(x) = E ln |a′jnxn−1(x)| + Φn−1(x) of the chain rule rn(x) = |a′jnxn−1(x)| · rn−1(x), see Remark 3.15.(i), results in

−χ = ∫ Φn(x) dµ − ∫ Φn−1(x) dµ. (4.17)

Together with Φ0(x) = 0 it follows that indeed ∫ Φn dµ = −χ · n. □

Remark 4.19 The above proof shows that −χ = ∫ Ψ dµ. It follows with Lemma 4.14 that

−χ = ∫ Ψn dµ, n ∈ N0. (4.18)

This explains the origin of the name “shape integrand of order n”. Numerical simulations suggest that the (non-random) functions Ψn converge uniformly to a constant, which must be −χ. In principle, this provides an opportunity to approximate χ arbitrarily closely.

Definition 4.20 Let C′flat := ⋃_j {|a′j(x)| : x ∈ [0, 1]}.

Remark 4.21 The set C′flat is a subset of (0,∞) and compact in R. Why? In the above Theorem 4.18 we state that |a′j| only takes values in (0,∞). By definition, each up-to-sign derivative |a′j| is continuous and defined on a compact set. Hence C′flat is compact.

4.5 Modified law of large numbers

Setting 4.22 We extend Setting 4.1 in the following way:

(i) Let (f′i)i be a family of Lipschitz-continuous functions f′i : C → C′ where C′ ⊆ R is compact. Thus C˜ := C × C′ ⊆ Rd+1 is compact, too.

(ii) For any n ∈ N and multiindices I = (J, i) ∈ In with J ∈ In−1 let us define FI : C → C˜ by

FI := (fI, f′ifJ). (4.19)

In particular, Fi = (fi, f′i).

(iii) In Definition 4.9.(ii) we defined for any x ∈ C the Markov chain Uxn = fIn(x) with state space C. Now we define for any x ∈ C a stochastic process U˜xn with state space C˜ by

U˜xn := FIn(x), n ∈ N0.
(4.20)Remark 4.23 A priori, there is no connection between f ′i and fi in the above generalsetting. The prime symbol is merely a notation and does not necessarily stand for anykind of derivative. However, in the following application we choose f ′i to be the up-to-signderivative of fi.Application 4.24 We remind ourselves of Application 4.3 where we chose fi to be therestricted child function ai : [0, 1] → [0, 1]. In particular, fIn = aJn = xn. Now we choosef ′i to be the up-to-sign derivative |a′i| : [0, 1] → C ′ with the compact set C ′ := C ′flat, seeDefinition 4.20. In the above setting we have C˜ = [0, 1]×C ′. By definition and by the chainrule (3.11) any n ∈ N fulfillsFIn : [0, 1]→ C˜ : x 7→(xn(x), |a′jnxn−1(x)|)=(xn(x),rn(x)rn−1(x)). (4.21)36Let us present the modified law of large numbers.Proposition 4.25 Suppose that (fi)i strongly contracts on average in some step N . Let µdenote the invariant measure of M , see . For any continuous function φ : C˜ → R and x ∈ Cwe have a.s.limn1nn∑t=1φ(U˜xt ) =1|I|∑i∫φFi dµ. (4.22)Proof. Postponed to Proposition 10.8 (on page 81). Remark 4.26 In the situation of the above proposition, the existence of the invariant mea-sure µ of M is ensured by Proposition 4.13 and the fact that in particular, (fi)i contracts instep N .We remind ourselves of Definition 1.1 which defines for any sequence of complex randomvariables Wn the meaning of Wn decaying exactly (or at least) with some (positive) rate.Application 4.27 For any x ∈ [0, 1] the sequence rn(x) decays exactly with rate χ.Proof. We remind ourselves of the previous Application 4.24. Let µ denote the invariantmeasure of Mflat, see Theorem 4.18. We apply Proposition 4.25 for the continuous functionφ : C˜ → R : (x, x′) 7→ ln(x′). Then the right-hand side of (4.22) is just −χ by definition.According to (4.21) the left-hand side of (4.22) islimn1nn∑t=1ln(rt(x)rt−1(x))= limn1nln rn(x). (4.23)This means that rn(x) decays exactly with rate χ. 
5 Asymptotic exponential decay

The auxiliary results in Sections 5.1 – 5.5 pave the way to the proof of the second theorem in Section 5.6. Some of these results are presented in a slightly more general framework. We continue to "weave in" explanations (called applications) of how the general framework applies to our specific setting of the iterated barycentric subdivision.

5.1 Asymptotic exponential rates

Remark 5.1 In the following definition we distinguish between the subset R̄ = {z ∈ C̄ : ℑ(z) = 0} ∪ {∞} of C̄ = C ∪ {∞} and the extended real number line [−∞,∞] = R ∪ {−∞,∞}. We only view [−∞,∞] as a partially (and totally) ordered set. So in particular, we distinguish between the elements −∞ and ∞ of [−∞,∞], whereas we regard the elements −∞ and ∞ of R̄ as identical. We always interpret absolute values as elements of [0,∞] ⊆ [−∞,∞].

Definition 5.2 Let κ, κ′ ∈ R and w_n be a sequence in C and v_n be a sequence in R × H. We formally write ln(0) = −∞ and say that a sequence

(i) w_n has at most rate κ if lim sup_n (1/n) ln |w_n| ≤ κ,
(ii) w_n has at least rate κ if lim inf_n (1/n) ln |w_n| ≥ κ,
(iii) w_n has exactly rate κ if lim_n (1/n) ln |w_n| = κ,
(iv) v_n has upcircle rates κ, κ′ if R(v_n) has exactly rate κ and |D(v_n)| − d_∞ has at most rate κ′ for some number d_∞ ∈ [0,∞).

Lemma A.1 in the appendix collects several elementary facts about asymptotic exponential rates which we use in the following without explicitly referring to them.

Remark 5.3 Let us reformulate Definition 1.1 with the new terminology from the above definition: A sequence of complex random variables W_n decays exactly (at least) with rate κ, if and only if κ > 0 and W_n has a.s. exactly (at most) rate −κ.

Application 5.4 Application 4.27 states for any x ∈ [0,1] that r_n(x) has exactly rate −χ < 0.

5.2 Series of complex numbers

Proposition 5.5 Let w_n, w′_n be sequences in C.
Suppose that w_n has at most some rate κ ∈ R.

(i) If κ ≥ 0 and w′_n = ∑_{t=1}^n w_t then w′_n has at most rate κ.

(ii) If κ < 0 and w′_n satisfies the recursion |w′_n| = |w′_{n−1} + w_n| then |w′_n| − w_∞ has at most rate κ for some number w_∞ ∈ [0,∞).

Proof. For both parts it suffices to show that w′_n resp. |w′_n| − w_∞ has at most rate κ′ for a fixed but arbitrary κ′ > κ. There is some n_0 ∈ N such that any n ≥ n_0 satisfies |w_n| ≤ e^{κ′n}.

(i): Since κ′ > 0 we have

∑_{t=n_0}^n |w_t| ≤ ∑_{t=0}^n e^{κ′t} = (e^{κ′(n+1)} − 1)/(e^{κ′} − 1) ≤ e^{κ′}/(e^{κ′} − 1) · e^{κ′n}.   (5.1)

Together with |w′_n| ≤ ∑_{t=1}^{n_0−1} |w_t| + ∑_{t=n_0}^n |w_t| it follows that w′_n has at most rate κ.

(ii): We may assume that κ′ < 0. We can write the recursion as s_n w′_n = w′_{n−1} + w_n for certain numbers s_n ∈ C such that |s_n| = 1. Let σ_n := ∏_{t=1}^n s_t for n ∈ N_0 (in particular σ_0 = 1) and notice that |σ_n| = 1. We have σ_n w′_n = σ_{n−1} w′_{n−1} + σ_{n−1} w_n. Hence

σ_n w′_n = w′_0 + ∑_{t=1}^n σ_{t−1} w_t.   (5.2)

The limit w := w′_0 + ∑_{t=1}^∞ σ_{t−1} w_t ∈ C exists because ∑_{t=n_0}^∞ |σ_{t−1} w_t| ≤ ∑_{t=n_0}^∞ e^{κ′t} < ∞. Let w_∞ := |w|. We conclude for n ≥ n_0 that

| |w′_n| − w_∞ | = | |σ_n w′_n| − |w| | ≤ |σ_n w′_n − w| ≤ | ∑_{t=n+1}^∞ σ_{t−1} w_t |   (5.3)

≤ ∑_{t=n+1}^∞ |w_t| ≤ e^{κ′(n+1)} ∑_{t=0}^∞ e^{κ′t} = e^{κ′}/(1 − e^{κ′}) · e^{κ′n}.   (5.4)

Thus |w′_n| − w_∞ has at most rate κ′.

Application 5.6 Let v ∈ D and x, y ∈ [0,1].

(i) a_{1,J_n}(v) has a.s. upcircle rates −χ, −χ.

(ii) x_n(x) − x_n(y) has a.s. at most rate −χ.

Proof. (i): Let (x, r, d) := u(v). Lemma 3.16.(i) states that

u a_{1,J_n}(v) = (x_n(x), r_n(x) · r, d_n(v)).   (5.5)

Application 4.27 states that r_n(x) and therefore also r_n(x) · r have a.s. exactly rate −χ. Since the function a]_j is bounded, see Lemma 3.13.(i), it follows that w_n := a]_{j_n}(x) · r_{n−1}(x) · r has a.s. at most rate −χ. Equation (3.12) states that |d_n(v)| = |d_{n−1}(v) + w_n|. Proposition 5.5.(ii) ensures that |d_n(v)| − d_∞(v) has a.s. at most rate −χ.

(ii): The case x = y is trivial. Let x ≠ y. The "Cauchy–Schwarz inequality" in Remark 3.15.(ii) states that

|x_n(x) − x_n(y)| / |x − y| = r_n(x, y) ≤ r_n(x)^{1/2} r_n(y)^{1/2}.   (5.6)

Hence x_n(x) − x_n(y) has at most rate (1/2)(−χ) + (1/2)(−χ) = −χ.
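Proposition 5.5.(ii) is easy to probe numerically. The sketch below (Python; not part of the thesis — the increments w_n and the unimodular factors s_n are invented for illustration) realizes the recursion |w′_n| = |w′_{n−1} + w_n| exactly as in the proof, via w′_n = (w′_{n−1} + w_n)/s_n with |s_n| = 1, and checks the deterministic tail bound e^{κ′(n+1)}/(1 − e^{κ′}) from (5.3)–(5.4):

```python
import cmath
import math
import random

random.seed(0)
kappa = -0.3  # rate of the increments: |w_n| = e^{kappa * n}, kappa < 0

# illustrative increments w_1, ..., w_159 with |w_n| = e^{kappa*n} and arbitrary phases
w = [cmath.rect(math.exp(kappa * n), 2.0 * n) for n in range(1, 160)]

# realize |w'_n| = |w'_{n-1} + w_n| as w'_n = (w'_{n-1} + w_n)/s_n with |s_n| = 1
wp = complex(1.0, 0.0)  # w'_0
mods = []               # mods[k] = |w'_{k+1}|
for w_n in w:
    s_n = cmath.rect(1.0, random.uniform(0.0, 2.0 * math.pi))
    wp = (wp + w_n) / s_n
    mods.append(abs(wp))

w_inf = mods[-1]  # proxy for the limit w_infinity = |w|
for k, m in enumerate(mods):
    n = k + 1
    # (5.3)-(5.4): | |w'_n| - w_infinity | <= sum_{t>n} |w_t| <= e^{kappa(n+1)}/(1 - e^{kappa})
    bound = math.exp(kappa * (n + 1)) / (1.0 - math.exp(kappa))
    assert abs(m - w_inf) <= bound + 1e-12  # small slack for rounding
```

Changing the phases of the factors s_n does not affect the bound; only the decay rate of |w_n| matters, which is exactly the content of the proposition.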
5.3 Sequences of upcircle coordinatesOnly in this subsection, we ignore Definition 3.14 and instead denote (non-random) upcirclecoordinates by (xn, rn, dn).Remark 5.7 If a sequence vn in R × H has upcircle rates κ, 0, then in particular, D(vn)has at most rate 0.Lemma 5.8 Let κ ∈ R and vn = (xn, zn) be a sequence in R×H with upcircle rates κ, 0.(i) The sequences zn − xn and =zn have exactly rate κ.(ii) The sequence <zn − xn has at most rate κ.40Proof. We know by Proposition 2.18 that zn = xn + rn · γ(dn) and that the unit upcirclefunction γ is bounded. Thus the sequences γ(dn), =γ(dn), <γ(dn) are bounded and thereforehave at most rate 0. Hence the following sequences have at most rate κ:zn − xn = rn · γ(dn), =zn = rn · =γ(dn), <zn − xn = rn · <γ(dn) (5.7)It remains to show that γ(dn) and =γ(dn) have at least rate 0 because this implies that zn−xnand =zn have at least rate κ. Since dn has at most rate 0 it follows that dn − i = 2/γ(dn)and d2n + 1 = 2/=γ(dn) have at most rate 0, too. Hence γ(dn) and =γ(dn) have at least rate0. Lemma 5.9 Let vn = (xn, zn) and v′n = (x′n, zn) be sequences in R×H with the same secondcomponents. Suppose that xn − x′n has at most rate κ and that vn has upcircle rates κ, 0.Then v′n has upcircle rates κ, 0, too.Proof. Let cn := d2n + 1 and yn := (xn − x′n)/(2rn). Lemma 2.20 states thatqn :=r′nrn= cn · y2n + 2dn · yn + 1, d′n − dn = cn · yn. (5.8)Since dn has at most rate 0 and xn − x′n has at most rate κ and rn has exactly rate κ, itfollows that cn and yn have at most rate 0. This results in the following two observations:• d′n − dn has at most rate 0. Since dn has at most rate 0, it follows that d′n has at mostrate 0.• qn has at most rate 0 because all three summands have at most rate 0.We also haveqn = cn ·(yn +dncn)2+ 1cn≥ 1cn. (5.9)Since cn has at most rate 0, its reciprocal and therefore also qn have at least rate 0. So qnhas exactly rate 0. Thus r′n = rn · qn has exactly rate κ. 
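The step from (5.8) to (5.9) is just a completed square: because c_n = d_n² + 1, the leftover constant splits as d_n²/c_n + 1/c_n = 1. A quick numerical confirmation of the identity and of the lower bound q_n ≥ 1/c_n (Python; not part of the thesis):

```python
import random

random.seed(1)
for _ in range(1000):
    d = random.uniform(-5.0, 5.0)   # plays the role of d_n
    y = random.uniform(-5.0, 5.0)   # plays the role of y_n = (x_n - x'_n)/(2 r_n)
    c = d * d + 1.0                 # c_n = d_n^2 + 1

    q = c * y * y + 2.0 * d * y + 1.0        # q_n as in (5.8)
    square = c * (y + d / c) ** 2 + 1.0 / c  # completed square as in (5.9)

    assert abs(q - square) <= 1e-9 * max(1.0, abs(q))
    assert q >= 1.0 / c - 1e-12              # the bound q_n >= 1/c_n
```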
5.4 Comparison of both generalized shape functions

We remind ourselves of Corollary 3.9.(ii) which states for any component indices c, c′ ∈ {1, 2} that

S_{c′} S_c = S_{c′}.   (5.10)

The case c = 2 of the following proposition is illustrated in Figure 5. We obtain an illustration of the case c = 1 by interchanging the labels x, r, z with x′, r′, z′.

Figure 5: Upcircles U(x, r) and U(x′, r′) = S_{(1,1)}(U(x, r)) with elements z and z′ = S_{(1,1)}(z).

Proposition 5.10 Let v = (x, z) ∈ D and v′ := S_c(v). If c = 2 we assume x ∈ Σ, and if c = 1 we assume z ∈ Σ. We define the upcircle coordinates (x, r, d) := u(v) and (x′, r′, d′) := u(v′). Suppose r is small enough such that U(x, r) intersects at most two shape sets. Then the following bounds hold:

|x − x′| ≤ 4r,   (1/4) r ≤ r′ ≤ 4r,   | |d| − |d′| | ≤ 2r.   (5.11)

Proof. Postponed to Proposition 9.10 (on page 74).

Corollary 5.11 Let ṽ ∈ D and {c, c′} = {1, 2}. We set v := S_c(ṽ) and v′ := S_{c′}(ṽ), as well as (x, r, d) := u(v) and (x′, r′, d′) := u(v′). Suppose that U(x, r) intersects at most two shape sets. Then the bounds in (5.11) hold.

Proof. By (5.10) we have v′ = S_{c′}(v). The "design feature" (3.1) of S_c states that the c-th component of v lies in Σ. If c′ = 2 then c = 1 and x ∈ Σ. If c′ = 1 then c = 2 and z ∈ Σ. So we can apply the above proposition with c replaced by c′.

Remark 5.12 The above corollary means that the (potentially different) upcircle coordinates of S_1(ṽ) and S_2(ṽ) become more and more similar, if r goes to 0. This is the key for Proposition 5.13. Its two parts (i) and (ii) seem rather unrelated at first glance, but we present them together as one proposition in order to avoid redundancy in the proofs. Application 5.14 below is crucial for the proof of the second theorem in Section 5.6.

Proposition 5.13 For both component indices c = 1, 2 let ṽ_{c,n} be a sequence in D and (x_{c,n}, r_{c,n}, d_{c,n}) be the upcircle coordinates of v_{c,n} := S_c(ṽ_{c,n}). Suppose r_{1,n} has exactly some rate κ < 0.

(i) Suppose ṽ_{1,n} = ṽ_{2,n}. Then r_{2,n} has exactly rate κ.
Moreover, if |d_{1,n}| − d_∞ has at most rate κ for some number d_∞ ∈ [0,∞), then |d_{2,n}| − d_∞ has at most rate κ, too.

(ii) Suppose there are indices i_1, i_2, ... in J such that ṽ_{c,n} = A_{i_n}(v_{c,n−1}) for both c = 1, 2. If r_{2,n} has at most rate κ, then x_{1,n} − x_{2,n} has at most rate κ.

Application 5.14 We fix κ < 0 and v ∈ D. Let us frame parts (i) and (ii) from the above (non-random) proposition in the following (random) context:

(i) If S_1 A_{J_n}(v) has a.s. upcircle rates κ, κ, then S_2 A_{J_n}(v) has a.s. upcircle rates κ, κ, too.

(ii) Let V_{c,n} := a_{c,J_n}(v) for c = 1, 2. If both radii R(V_{1,n}) and R(V_{2,n}) have a.s. exactly rate κ, then the difference of the first components of V_{1,n} and V_{2,n} has a.s. at most rate κ.

Proof (of the proposition). We first present parts of the proofs of (i) and (ii) simultaneously. For that purpose we set c = 1, c′ = 2 for part (i) and c = 2, c′ = 1 for part (ii). The sequence r_{c,n} converges to 0 because it has at most rate κ < 0. So there is some n_0 ∈ N such that for any n ≥ n_0 the upcircle U(x_{c,n}, r_{c,n}) intersects at most two shape sets, see the illustration of
Lemma 3.13.(ii) describes a1,in in upcircle coordinatesand Proposition 2.23.(i) describes SjAin in upcircle coordinates. In particular, we know theformulas(x1,n, r1,n) =(ain(x1,n−1), |(ain)′(x1,n−1)| · r1,n−1), (5.16)(x′n, r′n) =(ain(x2,n−1), |(ain)′(x2,n−1)| · r2,n−1), (5.17)r2,n = |(SjAin)′(x2,n−1)| · r2,n−1. (5.18)The “Cauchy-Schwarz inequality” in Proposition 3.3.(ii) yields|x1,n − x′n|2 = |ain(x1,n−1)− ain(x2,n−1)|2 ≤r1,nr1,n−1· r′nr2,n−1· |x1,n−1 − x2,n−1|2. (5.19)Part (iv) from the same Proposition 3.3 states that |(ain)′| ≤ |(SjAin)′|, thus r′n ≤ r2,n.Together with inequality (5.19) it follows that|x1,n − x′n|Rn≤ |x1,n−1 − x2,n−1|Rn−1= yn−1. (5.20)44By (5.13) we have |x′n − x2,n| ≤ 3r2,n and therefore|x′n − x2,n|Rn≤ 3(r2,nr1,n)1/2=: qn. (5.21)Combining both inequalities with the triangle inequality yields yn ≤ yn−1 + qn and thereforeyn ≤ y0 +n∑t=1qt. (5.22)Since r2,n has at most rate κ and r−11,n has at most rate −κ we see that qn has at most rate 0.By Proposition 5.5.(i) it follows that ∑nt=1 qt has at most rate 0. Therefore yn has at mostrate 0. 5.5 Equality of distributionsThe following proposition shows for fixed v ∈ D that the Markov chain ac,Jn(v) = ScAjn . . . ScAj1(v),which applies Sc in every step, is essentially the same as the Markov chain ScAJn(v) =ScAjn . . . Aj1(v), which applies Sc only at the very end.Proposition 5.15 Let v ∈ D. The two stochastic processes ac,Jn(v) and ScAJn(v) have thesame distribution in the Borel space of sequences in D.Proof. Postponed to Proposition 8.7 (on page 67). Let us only give a heuristic argumenthere: The key is Corollary 8.4.(i) which states that the multiset of the six evaluations Sj(v)remains the same if we replace v by w := Sc(v). With the formula Aj = αSj we concludethat{Aj(v) : j ∈ J } = {Aj(w) : j ∈ J }. (5.23)So we can replace v by w = Sc(v) without changing the next “generation” {Aj(v) : j ∈J }. 
This makes it plausible that "smuggling in" S_c between any two functions in the random composition A_{J_n}(v) = A_{j_n} … A_{j_1}(v) results in a random composition A_{j_n} a_{c,J_{n−1}}(v) = A_{j_n} S_c A_{j_{n−1}} … A_{j_2} S_c A_{j_1}(v) with the same distribution as A_{J_n}(v). Thus S_c A_{J_n}(v) has the same distribution as a_{c,J_n}(v).

Corollary 5.16 Let v ∈ D and κ, κ′ ∈ R. Then a_{c,J_n}(v) has a.s. upcircle rates κ, κ′, if and only if S_c A_{J_n}(v) has a.s. upcircle rates κ, κ′.

5.6 Proof of the second theorem

Let us restate and prove Theorem 1.5 (on page 6).

Theorem 5.17 Let R_n be the radius R(X_n, Z_n). The following three sequences decay exactly with rate χ:

R_n,   Z_n − X_n,   ℑZ_n = flat(∆^{(n)}).

The sequence ℜZ_n − X_n decays at least with rate χ.

Proof. Let κ := −χ < 0. We prove an equivalent, but more convenient reformulation of this theorem, where the formulations "decay exactly with rate χ" and "decay at least with rate χ" are replaced by "have exactly rate κ" and "have at most rate κ".

In Section 1 we fixed points X_0 ∈ [0,1], Z_0 ∈ Σ \ [0,1] and introduced the Markov chains X_n = a_{j_n}(X_{n−1}) ∈ [0,1] and Z_n = a_{j_n}(Z_{n−1}) ∈ Σ \ [0,1]. Hence X_n = a_{J_n}(X_0) and Z_n = a_{J_n}(Z_0). Since E is countable, see Lemma 3.7.(i), there exists X′_0 ∈ [0,1] \ E. Let v := (X′_0, Z_0) ∈ (R \ E) × H = D.

We study the following five two-dimensional stochastic processes, where the arrows indicate the chronological order of our investigation:

V_{1,n} := a_{1,J_n}(v)  →  V_{−1,n} := S_1 A_{J_n}(v)   (5.24)
                                               ↓
V_{0,n} := (X_n, Z_n)  ←  V_{2,n} := a_{2,J_n}(v)  ←  V_{−2,n} := S_2 A_{J_n}(v)   (5.25)

We first show that V_{t,n} has a.s. upcircle rates κ, κ for t = 1, −1, −2, 2 and afterwards we show that V_{0,n} has a.s. upcircle rates κ, 0. Then Lemma 5.8 finishes the proof. The numbers in the brackets of the following list indicate the current value of t.

(1) Application 5.6.(i) states that V_{1,n} has a.s. upcircle rates κ, κ.

(−1) Corollary 5.16 states that V_{−1,n} has a.s. upcircle rates κ, κ.

(−2) Application 5.14.(i) states that V_{−2,n} has a.s. upcircle rates κ, κ.

(2) Corollary 5.16 states that V_{2,n} has a.s.
upcircle rates κ, κ.(0) Application 5.14.(ii) states thatX2,n−X1,n has at most rate κ. According to the “designfeature” (3.5) of a1,Jn the first component X1,n of V1,n equals aJn(X ′0) = xn(X ′0). ByApplication 5.6.(ii) the sequence X1,n−Xn = xn(X ′0)−xn(X0) has at most rate κ. Bythe “design feature” (3.5) of a2,Jn the second component of V2,n equals aJn(Z0) = Znwhich is also the second component of V0,n. Moreover, V2,n has a.s. upcircle rates κ,κ and therefore in particular a.s. upcircle rates κ, 0. Thus Lemma 5.9 in the casevn = V2,n and v′n = V0,n states that V0,n has a.s. upcircle rates κ, 0. Remark 5.18 Let us comment on the above proof:(i) A comparison with Section 1.4 shows that all five stochastic processes Vt,n are barycen-tric processes. For t = 1, 2 the definitions of Vt,n and Rt,n are identical with Definitions(1.17) and (1.20).(ii) It would have sufficed to show that Vt,n has a.s. upcircle rates κ, 0 for t = 1,−1,−2, 2.However, we find the following byproduct of our proof interesting for its own sake:There is a barycentric process V2,n with second component Zn such that V2,n has a.s.upcircle rates κ, κ. In particular, the angles D(V2,n) converge a.s. (and asymptoticallyexponentially fast) to some random variable D∞ taking values in [0,∞).476 TrianglesThis section discusses the objects from Sections 2.1, 2.3, 2.4. For the reader’s convenience,we depict again the figures from Sections 1.1, 2.1, 2.4. We always assume that z ∈ H0.6.1 Geometric settingIn Section 1.1 we defined for any child ∆ := ∆(l,k)z of z the outer vertex of ∆ as the vertexof ∆ that is also a vertex of ∆z, and we defined the outer edge of ∆ as the edge that is onehalf of an edge (namely the k-edge) of ∆z, see Figure 6.Figure 6: (Copy of Figure 1). A triangle ∆z, its shape S(z) ∈ Σ and its children ∆(l,k)z .Their second index k is depicted in the same colour as the corresponding k-edge of z.Let us now restate Definition 2.1 (on page 12).Definition 6.1 Let ∆ := ∆jz be some child of ∆z. 
If the outer edge of ∆ has length 0, then we set A_j(z) := ∞. Otherwise, there is a unique similarity transformation that maps the outer edge of ∆ to the line segment between −1 and 1, the outer vertex of ∆ to 1 and the inner vertex of ∆, namely the barycentre (1/3)z of ∆_z, to some new point z′ ∈ H_0. We set A_j(z) := z′. This defines a function A_j : H_0 → H_0.

Figure 7 compares the above definition to the child functions.

Figure 7: (Copy of Figure 2). Comparison of A_{(1,1)}(z) and a_{(1,1)}(z). Corresponding edges in similar triangles have the same colour.

In Section 1.1 we defined points z, z′ ∈ H_0 to be similar, if their associated triangles ∆_z, ∆_{z′} are similar, and we introduced the shape function S : H_0 → Σ that maps any point z to the unique point in Σ which is similar to z. Now we extend the notions of similarity and associated triangles to all points in H_0. Moreover, using the same symbol, we extend the shape function to a function S : H_0 → Σ by restating Definition 2.2 in part (iii) of the following definition.

Definition 6.2 We define:

(i) ∆_∞ := ∆_1,
(ii) ∞ and z ∈ H_0 are similar, if z ∈ {−1, 1, ∞},
(iii) S(∞) := 1 ∈ Σ.

Remark 6.3 Let us comment on the above extended definition:

(i) For any z, z′ ∈ H_0 it is still true that z and z′ are similar, if and only if ∆_z and ∆_{z′} are similar.
(ii) For any z ∈ H_0 it is still true that S(z) is the unique point in Σ which is similar to z.

Let us restate Lemma 2.3 (on page 13) which we have proven already.

Lemma 6.4 Any z ∈ H_0 satisfies

a_j(z) = S A_j(z).   (6.1)

6.2 Explicit formula for the barycentric subdivision

Let us restate Definition 2.10 (on page 16).

Definition 6.5 We define:

(i) the rotation function h ∈ F by

h(z) := (z − 3)/(z + 1) = 1 − 4/(z + 1),

(ii) the six functions S_j ∈ G by

S_{(l,k)} := m^l h^k,   (l, k) ∈ J,

(iii) the rescaling function α ∈ F by

α(z) := (2/3) z − 1.

Remark 6.6 Let z ∈ H_0. The first and the second representation of h(z) in Definition 6.5.(i) show the first and the last equation in the following line:

h²(z) = −(z + 3)/(z − 1) = −1 + 4/(1 − z) = h^{−1}(z).
(6.2)Hence h3 = id.50Lemma 6.7 The rotation function h has the following geometric meaning:(i) Let z ∈ H0 \ {−1}. Then h(z) 6= ∞ and there is an orientation preserving similaritytransformation that maps the k-edge of z to the (k − 1)-edge of h(z).(ii) Let z ∈ H0 \ {1}. Then h2(z) 6= ∞ and there is an orientation preserving similaritytransformation that maps the k-edge of z to the (k + 1)-edge of h2(z).Proof. Let us define the complex affine linear transformationsϕ1,z : w 7→ −2w + z − 1z + 1 , ϕ2,z : w 7→2w − z − 1z − 1 .We see that the orientation preserving similarity transformation ϕ1,z maps the vertices of(the associated triangle of) z to the vertices of h(z) in the following way:z 7→ −1, − 1 7→ 1, 1 7→ z − 3z + 1 = h(z). (6.3)Hence ϕ1,z maps the k-edge of z to the (k− 1)-edge of h(z). The analogous statement holdsfor ϕ2,z. Lemma 6.8 The rotation function h and the mirror function m have the following proper-ties:h3 = id, m2 = id, hkm = mh−k. (6.4)This means that ({Sj : j ∈ J }, ◦) is a Dihedral group which we denote by S. The mapj 7→ Sj is a group isomorphism between (J , ·) and (S, ◦), where the operation · is definedby(l′, k′) · (l, k) := (l′ + l, (−1)lk′ + k). (6.5)The identity element of J is (0, 0) and the formula for the inverse is (l, k)−1 = (l, (−1)l+1k).Proof. By Remark 6.6 it is clear that m,h, h2 6= id and that (6.4) holds. Hence the six(pairwise different) functions Sj form the Dihedral group S. Since φ : J → D6 : j 7→ Sj isbijective, it suffices to show that φ(j)φ(j′) = φ(j · j′) in order to conclude that J is a groupand φ a group isomorphism. Let j = (l, k) and j′ = (l′, k′). By (iii) we have hk′ml = mlhk′′for k′′ := (−1)lk′. Hence φ(j′)φ(j) = φ(l′ + l, k′′ + k) = φ(j · j′). The statements about theidentity element and the inverse are clear. 51Proposition 6.9 For any z ∈ H0 the explicit formula for the barycentric subdivision isAj(z) = αSj(z).Proof. For z ∈ {−1, 1} we verify the above formula by hand. 
Now we assume z /∈ {−1, 1,∞}.Due to Lemma 6.7 we know the following: For k′ ∈ {0, 1, 2} there is a similarity transforma-tion ϕk′,z (in the case k′ = 0 we set ϕ0,z := id) which maps the k-edge of z to the (k−k′)-edgeof hk′(z). It follows with elementary geometry thatϕk,z maps ∆ := ∆(l,k)z to ∆′ := ∆(l,0)hk(z)and the outer edge of ∆ to the outer edge of ∆′ and the outer vertex of ∆ to the outer vertexof ∆′. The example case (l, k) = (1, 2) is illustrated in Figure 8.Figure 8: Triangles ∆z and ∆h2(z) with their similar children ∆(1,2)z and ∆(1,0)h2(z). Correspondingedges have the same colour. The outer edges of both children are blue and the outer verticesare magenta.For z′ := S(l,k)(z) we notice thatml maps ∆′ to ∆′′ := ∆(0,0)z′and the outer edge to the outer edge and the outer vertex to the outer vertex. The similaritytransformation ϕα : w 7→ 2w − 1 maps 0, 1, 13z′ to −1, 1, α(z′) (in this order). Thusϕα maps ∆′′ to ∆α(z′)52and the outer edge to the 0-edge and the outer vertex to 1. Combining the three steps showsthat there is a similarity transformation that maps ∆ to ∆α(z′) and the outer edge to the0-edge and the outer vertex to 1. It follows by definition of Aj that A(l,k)(z) = α(z′) =αS(l,k)(z). As in Definition 2.13, we can extend Aj and aj to the following functions:Aj = αSj ∈ G, aj = SAj : H0 → Σ. (6.6)6.3 Shape setsLet us restate Definition 2.14 (on page 17).Definition 6.10 We define the shape setsΣ0,k := {z : the k-edge is a longest edge and the (k − 1)-edge is a shortest edge of z},Σ1,k := {z : the k-edge is a longest edge and the (k + 1)-edge is a shortest edge of z}.We include ∞ in Σ(0,1) and Σ(1,2), but not in the other shape sets.Figure 9 illustrates the above definition.Figure 9: (Copy of Figure 3). 
The shape sets Σj and elements zj ∈ Σj with z(0,0) = S(zj) =Sj(zj).53In the following lemma we assume that z ∈ H0.Lemma 6.11 The functions Sj and the shape sets Σj have the following properties:(i) The set {Sj(z) : j ∈ J } consists of those points that are similar to z.(ii) The invariance SSj = S holds.(iii) Σj = {z : Sj(z) ∈ Σ} = {z : S(z) = Sj(z)}.(iv) Sj(Σj′) = Σj′′ with j′′ := j′ · j−1.(v) S(x) = minj |Sj(x)| for any x ∈ R.Proof. (i): Let T denote the above set {Sj(z) : j ∈ J }. If z ∈ {1,−1,∞}, then T ={−1, 1,∞} consists of those points that are similar to z, see Definition 6.2. Suppose z /∈{−1, 1,∞}. We see by Lemma 6.7 that H := {z, h(z), h2(z)} consists of those points thatare similar to z with the same orientation as z. Since T = H ∪m(H) we can conclude (i).(ii): For any z ∈ H0 we know by Remark 6.3 that S(z) is the unique point in Σ that issimilar to z. Together with (i) it follows (ii).(iii): In the second equation the inclusion “⊇” is trivial and “⊆” is a consequence of (ii)and the fact that S : H0 → Σ is constant on Σ. For the first equation we show thatz ∈ Σ(l,k) ⇔ hk′(z) ∈ Σ(l,k−k′) (6.7)because then, the special case k′ = k and the fact ml(Σ(l,0)) = Σ(0,0) show thatz ∈ Σ(l,k) ⇔ S(l,k)(z) = mlhk(z) ∈ Σ(0,0) = Σ. (6.8)For the proof of (6.7) we may assume w.l.o.g. that k′ = 1 (k′ = 0 is trivial and for k′ = 2we apply the result for k′ = 1 twice). We also assume l = 0 (the proof for l = 1 is similar).So we need to show thatz ∈ Σ(0,k) ⇔ h(z) ∈ Σ(0,k−1). (6.9)54The shape set Σ(0,k) contains z = 1,∞,−1 if and only if k = 0, 1, 2 (in this order), seeFigure 9. For z = 1,∞,−1 we have h(z) = −1, 1,∞ and the shape set Σ(0,k−1) containsh(z) = −1, 1,∞ if and only if k − 1 = 2, 0, 1. This shows the equivalence (6.9) in the casez ∈ {−1, 1,∞}. Now suppose that z /∈ {−1, 1,∞}. 
By definition, Σ_{(0,k)} contains z if and only if the k-edge is a longest and the (k−1)-edge is a shortest edge of z, and Σ_{(0,k−1)} contains h(z) if and only if the (k−1)-edge is a longest and the (k−2)-edge is a shortest edge of h(z). According to Lemma 6.7 there exists a similarity transformation that maps the k-edge of z to the (k−1)-edge of h(z). This shows the equivalence (6.9).

(iv): By Lemma 6.8 we know that S_{j′} = S_{j′′·j} = S_{j′′} S_j. Applying (iii), (ii) and again (iii) yields

z ∈ Σ_{j′} ⇔ S S_j(z) = S_{j′}(z) = S_{j′′} S_j(z) ⇔ S_j(z) ∈ Σ_{j′′}.

Since S_j is bijective, it follows that S_j(Σ_{j′}) = Σ_{j′′}.

(v): As a consequence of Lemma 6.8 and the bijectivity of the function J → J : j ↦ j · j′ we see that the set of the six compositions S_j S_{j′} equals the set of the six functions S_{j′}. So we may assume w.l.o.g. that x ∈ Σ (or equivalently S(x) = x), because S(x) = S_{j′}(x) for some index j′ ∈ J.

For j ∈ {(0,0), (1,0)} =: J* we have x = |S_j(x)|. Suppose j ∈ J \ J*. It follows that j^{−1} ∈ J \ J*. Thus Σ_{j^{−1}} ∩ R is one of the intervals [−∞,−3], [−3,−1], [1,3], [3,∞], see Figure 9. Since x ∈ Σ = Σ_{(0,0)} and due to part (iv) we have S_j(x) ∈ Σ_{j^{−1}} and therefore x ≤ 1 ≤ |S_j(x)|.

7 Up-to-sign derivatives

7.1 Basic properties

In this subsection we always assume that x, y are real variables with x ≠ y. We remind ourselves of Definition 3.2 which defines up-to-sign derivatives and the difference quotient |b′(x, y)| = |b(x) − b(y)|/|x − y|.

Definition 7.1 Let D_b ⊆ R and b : D_b → R be an up-to-sign differentiable function. We set Q_b(x, x) := 0 and

Q_b(x, y) := |b′(x, y)|² / (|b′(x)| · |b′(y)|).

Remark 7.2 Let g ∈ G and b : D_b → R be the restriction of g to D_b := R \ {g^{−1}(∞)}. Lemma 2.9.(ii) states that the square of the difference quotient (g(x) − g(y))/(x − y) equals |g′(x)| · |g′(y)|. This means that Q_b(x, y) is constantly 1.

Lemma 7.3 Let b : D_b → D_a and a : D_a → R be functions with D_a, D_b ⊆ R. Suppose that a and b are up-to-sign differentiable.
Their composition ab satisfies the following “chainrules” for any x, y ∈ Db:(i) ab is up-to-sign differentiable and |(ab)′(x)| = |a′b(x)| · |b′(x)|.(ii) |(ab)′(x, y)| = |a′(b(x), b(y))| · |b′(x, y)|.(iii) Qab(x, y) = Qa(b(x), b(y)) ·Qb(x, y).Proof. (i): There are only finitely many x ∈ Db for which b is not differentiable in x or a notdifferentiable in b(x). For any other x ∈ Db we have |(ab)′(x)| = |a′b(x)| · |b′(x)| according tothe ordinary chain rule. The function |a′b| · |b′| is continuous and is therefore the up-to-signderivative of ab.(ii) is clear for b(x) 6= b(y), as well as for b(x) = b(y) (in which case both sides of (ii)vanish).(iii) follows from (i) and (ii). 567.2 Up-to-sign derivative of the shape functionIn this subsection we always assume that x, y are real variables with x 6= y.Definition 7.4 We define the fixpoint polynomial ĥ : R→ [3,∞] byĥ(x) := x2 + 3.Remark 7.5 Let us explain the reason for the name “fixpoint polynomial” (even thoughwe do not need this in the following): The formula h(z) = (z − 3)/(z + 1) shows that thefixpoints of h are exactly the two roots of the complex polynomial z2 + 3. These rootsare ±i√3. Notice that the associated triangle of the point i√3 is equilateral, so it is alsogeometrically clear that this point is a fixpoint of h.Lemma 7.6 For any j and x ∈ R we can express |S ′j(x)| in terms of ĥ via|S ′j(x)| · ĥ(x) = ĥSj(x).Proof. Let (l, k) := j. We have that Sj = mlhk and |S ′j| = (hk)′. Since ĥ is even, theright-hand side in the above formula is ĥhk(x). So we need to show that(hk)′(x) · ĥ(x) = ĥhk(x). (7.1)The case k = 0 is trivial. For k ∈ {1, 2} we easily verify the formula by plugging in theformulas h(x) = (x − 3)/(x + 1) and h′(x) = 4/(x + 1)2, or h2(x) = −(x + 3)/(x − 1) and(h2)′(x) = 4/(x− 1)2, see Remark 6.6. 
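The identity (7.1) can also be spot-checked numerically with the explicit formulas from this proof. The snippet below (Python; only an illustration, not part of the thesis) evaluates both sides of (h^k)′(x) · ĥ(x) = ĥ(h^k(x)) for k = 1, 2 at random real points away from the poles ∓1:

```python
import random

random.seed(2)

def hhat(x):   # fixpoint polynomial  ĥ(x) = x^2 + 3
    return x * x + 3.0

def h(x):      # rotation function  h(x) = (x - 3)/(x + 1)
    return (x - 3.0) / (x + 1.0)

def dh(x):     # |h'(x)| = 4/(x + 1)^2
    return 4.0 / (x + 1.0) ** 2

def h2(x):     # h^2(x) = -(x + 3)/(x - 1)
    return -(x + 3.0) / (x - 1.0)

def dh2(x):    # |(h^2)'(x)| = 4/(x - 1)^2
    return 4.0 / (x - 1.0) ** 2

for _ in range(1000):
    x = random.uniform(-10.0, 10.0)
    if abs(x - 1.0) < 1e-3 or abs(x + 1.0) < 1e-3:
        continue  # stay away from the poles of h and h^2
    # identity (7.1): (h^k)'(x) * ĥ(x) = ĥ(h^k(x)) for k = 1, 2
    assert abs(dh(x) * hhat(x) - hhat(h(x))) <= 1e-9 * hhat(h(x))
    assert abs(dh2(x) * hhat(x) - hhat(h2(x))) <= 1e-9 * hhat(h2(x))
```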
Definition 7.7 We define the non-isosceles set Σ∗ as the set of all points x ∈ R that lie inexactly one of the shape sets Σj.Notice that Σ∗ = R \ {−3,−1, 0, 1, 3} contains exactly those points x ∈ R that are notisosceles.Proposition 7.8 Let us view the shape function as a function S : R → [0, 1]. Then S isLipschitz-continuous and up-to-sign differentiable. Any x, y ∈ R satisfy:57(i) |S ′(x)| = ĥS(x)/ĥ(x) = |S ′j(x)|, if x ∈ Σj,(ii) |S ′(x)| = minj |S ′j(x)| ∈ (0, 1],(iii) QS(x, y) ≤ 1.Proof. We setβ(x) := ĥS(x)ĥ(x)∈ (0,∞), Q(x, y) := |S′(x, y)|2β(x)β(y) . (7.2)The proof is structured in the following steps:• Step 1 explains why it suffices to show that Q(x, y) ≤ 1 for any x, y.• Step 2 shows the invariance Q(x, y) = Q(Sj(x), Sj(y)).• Step 3 justifies why we may assume w.l.o.g. that y ∈ {0, 1} and x /∈ Σ ∩ R = [0, 1].• Step 4 considers the cases y = 0 and y = 1.Step 1: Due to Lemma 7.6 the function β equals |S ′j| on Σj ∩ R. Any x ∈ Σ∗ =R \ {−3,−1, 0, 1, 3} lies in the interior of one shape set Σj, so β is differentiable in each suchx withβ(x) = |S ′j(x)|. (7.3)Due to Lemma 7.6 and S(x) = minj |Sj(x)|, see Lemma 6.11.(v), we know that β(x) =minj |S ′j(x)| is at most 1. So |S ′(x, y)|2 ≤ Q(x, y). Suppose we can show thatQ(x, y) ≤ 1. (7.4)Then S is Lipschitz-continuous. Thus β is continuous and is therefore the up-to-sign deriva-tive of S. Hence QS = Q.Step 2: First, we show the following analogue of the chain rule from Lemma 7.3.(i):β(x) = βSj(x) · |S ′j(x)|. (7.5)This follows from a combination of Lemma 7.6, and the invariance SSj = S, see Lemma6.11.(ii).58Using again the invariance SSj = S we conclude thatQ(x, y) = Q(Sj(x), Sj(y))·QSj(x, y).Due to Remark 7.2 we obtain the invarianceQ(x, y) = Q(Sj(x), Sj(y)). (7.6)In the following two steps we prove (7.4) by a series of “w.l.o.g.”-assumptions.Step 3: Because of the invariance (7.6) we may assume that y ∈ Σ, so S(y) = y andβ(y) = 1. 
By definition of Q, β and by ĥ(x) = x² + 3 we obtain the formula

Q(x, y) = |S′(x, y)|² · (x² + 3)/(S(x)² + 3).   (7.7)

If x also lies in Σ, then Q(x, y) = 1. We assume x ∉ Σ. This implies that the function

ϕ : [0, 1] → R : t ↦ (S(x) − t)/(x − t) = (S(x) − x)/(x − t) + 1   (7.8)

is monotone. Thus |ϕ(y)| ≤ max{|ϕ(0)|, |ϕ(1)|}. Since Q(x, t) = ϕ(t)² β(x)^{−1} for any t ∈ [0, 1] it follows that Q(x, y) ≤ max{Q(x, 0), Q(x, 1)}. So we may assume that y ∈ {0, 1}.

Step 4: Case y = 0: The invariance Q(m(x), 0) = Q(x, 0), see (7.6), justifies the assumption x ≥ 0. Together with x ∉ Σ it follows that x > 1 ≥ S(x) =: x̃ ≥ 0. Formula (7.7) yields

Q(x, 0) = (x̃²/x²) · (x² + 3)/(x̃² + 3) = (1 + 3x⁻²)/(1 + 3x̃⁻²) ≤ 1.   (7.9)

Case y = 1: The representation mh(z) + 1 = 4/(z + 1) shows that the function mh maps R \ [−3, 1] = {t ∈ R : |t + 1| > 2} to (−3, 1) = {t ∈ R : |t + 1| < 2}. Since mh has the fixpoint 1, we have the invariance Q(mh(x), 1) = Q(x, 1), see (7.6). So we may assume x ∈ [−3, 1]. Together with x ∉ Σ it follows that −3 ≤ x < 0 ≤ S(x) =: x̃ ≤ 1. Formula (7.7) yields

Q(x, 1) = ((1 − x̃)²/(1 − x)²) · (x² + 3)/(x̃² + 3) = ϕ(x)/ϕ(x̃),   ϕ(t) := (t² + 3)/(1 − t)².
For any x ∈ D_g we have |g′(x)| = c/(c₃x + c₄)² and therefore α(x) = |g′(x)|/ĥg(x). Together with the formula |S′g(x)| = ĥSg(x)/ĥg(x), see Proposition 7.8.(i), it follows that

β(x) = |S′g(x)| · |g′(x)|,   x ∈ D_g. (7.12)

According to the chain rule from Lemma 7.3.(i), the right-hand side of the above formula is the up-to-sign derivative of the restriction b : D_g → [0, 1]. Hence β is the up-to-sign derivative |b′| of b : R → [0, 1].

By using formula (7.12) and the same chain rule from Lemma 7.3.(i) again, we see that parts (i) and (ii) from Proposition 7.8 imply the respective parts of this lemma.

For (iii) let g̃ : D_g → R denote the restriction of g. The chain rule from Lemma 7.3.(iii) states that

Q_b(x, y) = Q_{Sg̃}(x, y) = Q_S(g(x), g(y)) · Q_{g̃}(x, y),   x, y ∈ D_g. (7.13)

According to Proposition 7.8.(iii) and Remark 7.2 we have Q_S ≤ 1 and Q_{g̃} = 1. Since Q_b(x, y) is continuous in x and y it follows that Q_b(x, y) ≤ 1 for any x, y ∈ R. □

Let us restate and prove Proposition 3.3 (on page 21).

Proposition 7.10 We restrict the child functions to functions a_j : [0, 1] → [0, 1]. Let I ∈ J^n be a multiindex with n ∈ N₀. Then the composition a_I is Lipschitz-continuous and has an up-to-sign derivative |a′_I| : [0, 1] → (0, ∞). Moreover, this up-to-sign derivative satisfies:

(i) the chain rule |a′_I(x)| = |a′_i(a_J(x))| · |a′_J(x)|, if we write I = (J, i) with J ∈ J^{n−1},

(ii) the “Cauchy–Schwarz inequality” |a′_I(x, y)| ≤ |a′_I(x)|^{1/2} |a′_I(y)|^{1/2}.

In the “n = 1” case I = j ∈ J we have:

(iii) |a′_j(x)| = |(S_{j′}A_j)′(x)|, if a_j(x) = S_{j′}A_j(x),

(iv) |a′_j(x)| = min_{j′} |(S_{j′}A_j)′(x)|.

Proof. (i) holds by the chain rule Lemma 7.3.(i).

(ii): Lemma 7.9, in the special case g = A_j ∈ G, states that a_j has an up-to-sign derivative |a′_j| : [0, 1] → (0, ∞) and we have the bound Q_{a_j} ≤ 1.
It follows with the chain rule from Lemma 7.3.(i) that the composition a_I has an up-to-sign derivative |a′_I| : [0, 1] → (0, ∞), and it follows with the chain rule in part (iii) of the same lemma that Q_{a_I} ≤ 1, which is equivalent to (ii).

(iii) follows from Lemma 7.9.(i) and the equivalence of the statements a_j(x) = S_{j′}A_j(x) and A_j(x) ∈ Σ_{j′}.

(iv) follows from Lemma 7.9.(ii). □

Remark 7.11 In the above proposition we could as well view the child functions as functions a_j : R → R and the same results hold.

8 Generalized child functions

8.1 Exceptional set

In Definition 7.7 we introduced the non-isosceles set Σ∗ consisting of all points x ∈ R that lie in exactly one of the shape sets Σ_j and noticed that Σ∗ = R \ {−3, −1, 0, 1, 3} contains exactly those points x ∈ R that are not isosceles. We say that z′ is a descendant of z if ∆_{z′} is similar to one of the descendants of ∆_z in the iterated barycentric subdivision. In particular, z is a descendant of itself.

Definition 8.1 We define

(i) the exceptional set E as the set of all points x ∈ R that have an isosceles descendant,

(ii) D := (R \ E) × H.

Let us restate Lemma 3.7 (on page 23) and augment it with the following part (ii).

Lemma 8.2 The sets E and D have the following properties:

(i) E is countable.

(ii) If v ∈ D, then A_j(v) ∈ D.

(iii) If A_j(x) ∈ E, then x ∈ E.

Proof. (i): It is geometrically clear that x′ is a descendant of x ∈ R in generation n ∈ N₀ if and only if x′ is similar to x′′ := A_I(x) for a certain multiindex I ∈ J^n. Note that x′ is isosceles if and only if x′′ is isosceles, i.e. x′′ ∈ R \ Σ∗ = {−3, −1, 0, 1, 3}. The bijectivity of A_I and the following representation of E imply that E is countable:

E = ⋃_{n∈N₀} ⋃_{I∈J^n} (A_I)⁻¹(R \ Σ∗). (8.1)

For (iii) we only need to notice that if the descendant A_j(x) of x has an isosceles descendant, then x has an isosceles descendant.

(ii) follows from (iii). □

8.2 Construction of the generalized shape functions

For the rest of this section we consider a fixed component index c ∈ {1, 2}.
In Lemma 6.8 (on page 51) we saw that the map j ↦ S_j is a group isomorphism between the dihedral groups (J, ·) and ({S_j : j ∈ J}, ◦), where the operation · is defined by

(l′, k′) · (l, k) := (l′ + l, (−1)^l k′ + k). (8.2)

The identity element of J is (0, 0) and the formula for the inverse is (l, k)⁻¹ = (l, (−1)^{l+1} k).

Proposition 8.3 There is a function S_c : D → D such that any v ∈ D satisfies the following two conditions:

(i) S_c(v) = S_j(v) for some j such that v_c ∈ Σ_j.

(ii) S_c S_j(v) = S_c(v) for any j.

Corollary 8.4 Any v ∈ D satisfies the following two conditions:

(i) The multiset of the six evaluations S_j S_c(v) equals the multiset of the six evaluations S_j(v).

(ii) S_{c′} S_c(v) = S_{c′}(v) for any c, c′ ∈ {1, 2}.

Proof (of the corollary). We have w := S_c(v) = S_{j′}(v) for some index j′.

(i): The map S_j ↦ S_j S_{j′} is a permutation on the group {S_j : j ∈ J}.

(ii) follows from S_{c′} S_{j′} = S_{c′}. □

Proof (of the proposition). This very technical proof may be skipped at first reading. The set R \ E of all points x ∈ R that have only non-isosceles descendants is a subset of the non-isosceles set Σ∗, which is a subset of R \ {−1, 1}. The last set contains all points x ∈ R such that all six evaluations A_j(x) are finite.

We set D∗ := Σ∗ × H ⊇ D. Due to the similarity of x and S_j(x) it is geometrically clear that x lies in R \ E if and only if S_j(x) does. Hence v lies in D if and only if S_j(v) does. As a consequence, it suffices to show the following in order to prove the proposition: There is a function S_c : D∗ → D∗ such that any v ∈ D∗ satisfies (i) and (ii).

In the following we always assume v ∈ D∗. We show that there is a function J^c : D∗ → J with the following two properties:

(i)_J J^c(v) = j for some j such that v_c ∈ Σ_j,

(ii)_J J^c S_j(v) · j = J^c(v) for any j.

We then define

S_c(v) := S_{J^c(v)}(v). (8.3)

We see that (i)_J trivially implies (i), and (i) implies that S_c actually maps D∗ to D∗ because of the obvious invariance S_j(Σ∗) = Σ∗. We quickly verify that (ii)_J implies (ii):

S_c S_j(v) = S_{J^c S_j(v)} S_j(v) = S_{J^c(v)}(v) = S_c(v). (8.4)
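The group law (8.2) can be checked mechanically. The following sketch verifies on Z₂ × Z₃ (the index set J has l ∈ {0, 1} and k ∈ {0, 1, 2}) that the operation is associative, that (0, 0) is the identity, that (l, k)⁻¹ = (l, (−1)^{l+1}k) really inverts, and the product-with-inverse formula stated below in (8.5):

```python
from itertools import product

# Elements (l, k) with l in Z2, k in Z3; operation (8.2).
J = list(product(range(2), range(3)))

def op(a, b):
    (lp, kp), (l, k) = a, b
    return ((lp + l) % 2, ((-1)**l * kp + k) % 3)

def inv(a):
    l, k = a
    return (l, ((-1)**(l + 1) * k) % 3)

# (0, 0) is the identity.
assert all(op(a, (0, 0)) == op((0, 0), a) == a for a in J)
# Associativity.
assert all(op(op(a, b), c) == op(a, op(b, c)) for a in J for b in J for c in J)
# The inverse formula (l, k)^{-1} = (l, (-1)^{l+1} k).
assert all(op(a, inv(a)) == (0, 0) == op(inv(a), a) for a in J)
# Formula (8.5): (l', k') · (l, k)^{-1} = (l' + l, (-1)^l (k' - k)).
assert all(op(a, inv(b)) == ((a[0] + b[0]) % 2, ((-1)**b[0] * (a[1] - b[1])) % 3)
           for a in J for b in J)
```

In particular, op((1, 0), (1, 0)) == op((1, 1), (1, 1)) == (0, 0), i.e. the elements (1, 0) and (1, 1) are involutions, a fact used again in Section 9.4.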
The case c = 1 is easy: By definition of D∗ the first component v₁ lies in exactly one shape set Σ_{j₁} and we set J¹(v) := j₁. In that case the first component S_j(v₁) of S_j(v) lies in Σ_{j′′} with j′′ := j₁ · j⁻¹. Then J^c S_j(v) = j′′, thus J^c S_j(v) · j = j₁ = J^c(v).

From now on let c = 2. We compute that

(l′, k′) · (l, k)⁻¹ = (l′ + l, (−1)^l (k′ − k)). (8.5)

In the following, we will use this formula twice.

Let us write M_l := ⋃_k Σ_{(l,k)} and H_k := ⋃_l Σ_{(l,k)}. We always assume that j = (l, k). We show that there are functions L², K² : D∗ → J with the following properties:

(i)_L L²(v) = l for some l such that v₂ ∈ M_l,

(i)_K K²(v) = k for some k such that v₂ ∈ H_k,

(ii)_L L² S_j(v) = L²(v) + l,

(ii)_K K² S_j(v) = (−1)^l (K²(v) − k).

We then define

J²(v) := (L²(v), K²(v)). (8.6)

Since Σ_{(l,k)} = M_l ∩ H_k we see that (i)_L, (i)_K imply (i)_J. Because of (8.5) we also see that (ii)_L, (ii)_K imply J² S_j(v) = J²(v) · j⁻¹, which is equivalent to (ii)_J.

We set M¹_l := M_l \ M_{l+1} and M² := M₀ ∩ M₁, as well as H¹_k := H_k \ (H_{k+1} ∪ H_{k+2}) and H²_k := H_k ∩ H_{k+1} \ H_{k+2} and H³ := H₀ ∩ H₁ ∩ H₂. The superscripts of these sets indicate for each element in how many of the sets M₀, M₁ or H₀, H₁, H₂ this element lies. By definition of D∗ the first component v₁ lies in exactly one of the sets M_l. We define

L²(v) := l₂, if v₂ ∈ M¹_{l₂};
L²(v) := l₁, if v₂ ∈ M², v₁ ∈ M_{l₁}. (8.7)

K²(v) := k₂, if v₂ ∈ H¹_{k₂};
K²(v) := k₂ + l₁, if v₂ ∈ H²_{k₂}, v₁ ∈ M_{l₁};
K²(v) := k₁, if v₂ ∈ H³, v₁ ∈ H_{k₁}. (8.8)

In the definition of K²(v) we treat l₁ ∈ {0, 1} as an element of Z₃ and not, as usually, as an element of Z₂.

The properties (i)_L and (i)_K are obviously fulfilled. Before we prove (ii)_L and (ii)_K we investigate the image sets of M_l and H_k under S_j. Proposition 6.11.(iv) states that S_j maps Σ_{(l′,k′)} to Σ_{j′′} with j′′ := (l′, k′) · j⁻¹. By (8.5) we have j′′ = (l′ + l, (−1)^l (k′ − k)). It follows for M_{l′} = ⋃_{k′} Σ_{(l′,k′)} and H_{k′} = ⋃_{l′} Σ_{(l′,k′)} that

S_j(M_{l′}) = M_{l′+l},   S_j(H_{k′}) = H_{(−1)^l (k′−k)}.
(8.9)

Moreover, since S_j is bijective, we also know that S_j maps the complement of M_{l′} or H_{k′} to the complement of M_{l′+l} or H_{(−1)^l(k′−k)}.

(ii)_L: If v₂ ∈ M¹_{l₂}, then S_j(v₂) ∈ M¹_{l₂+l} and therefore L² S_j(v) = l₂ + l = L²(v) + l. If v₂ ∈ M² and v₁ ∈ M_{l₁}, then S_j(v₂) ∈ M² and S_j(v₁) ∈ M_{l₁+l}, hence L² S_j(v) = l₁ + l = L²(v) + l.

(ii)_K: If v₂ ∈ H¹_{k₂}, then S_j(v₂) ∈ H¹_{(−1)^l(k₂−k)} and therefore K² S_j(v) = (−1)^l (K²(v) − k). If v₂ ∈ H³, then S_j(v₂) ∈ H³ and S_j(v₁) ∈ H_{(−1)^l(k₁−k)}, hence K² S_j(v) = (−1)^l (K²(v) − k). The remaining case is v₂ ∈ H²_{k₂} and v₁ ∈ M_{l₁}. For l = 0, 1 the point S_j(v₂) lies in H²_{k₂−k} or H²_{−(k₂+1−k)} (in this order). Both cases are captured by the formula

S_j(v₂) ∈ H²_{(−1)^l(k₂−k)−l}. (8.10)

We also know that S_j(v₁) ∈ M_{l′}, where the number l′ ∈ {0, 1} is equal to l₁ + l modulo 2. So we can write l′ = (−1)^l l₁ + l. Combining both formulas shows

K² S_j(v) = (−1)^l (k₂ − k) − l + l′ = (−1)^l (k₂ + l₁ − k) = (−1)^l (K²(v) − k). (8.11)

This finishes the proof. □

8.3 Definition of the generalized child functions

Let us restate Definition 3.10 (on page 24).

Definition 8.5 For c ∈ {1, 2} we define the generalized child function

a_{c,j} := S_c A_j : D → D.

Remark 8.6 Let π_c(v₁, v₂) := v_c denote the projection onto the c-th component. As already seen in (3.1) and (3.5), Proposition 8.3.(i) and the above definition ensure the “design features”

π_c S_c = S π_c : D → Σ, (8.12)
π_c a_{c,j} = a_j π_c : D → Σ. (8.13)

8.4 Equality of distributions

In this subsection we always assume that n ∈ N. Let us restate Proposition 5.15 (on page 45).

Proposition 8.7 Let v ∈ D. The two stochastic processes a_{c,J_n}(v) and S_c A_{J_n}(v) have the same distribution in the Borel space of sequences in D.

Proof. Let w := S_c(v). As already mentioned before in the heuristic argument of Proposition 5.15, the key is the following consequence of Corollary 8.4.(i):

{A_j(v) : j ∈ J} = {A_j(w) : j ∈ J}. (8.14)

We set

B_n(v) := (A_{J_1}(v), . . . , A_{J_n}(v)),   b_{c,n}(v) := (a_{c,J_1}(v), . . . , a_{c,J_n}(v)). (8.15)

For any tuple ṽ = (v_1, . . .
, v_n) ∈ D^n of n pairs v_t we define S_c(ṽ) := (S_c(v_1), . . . , S_c(v_n)) component-wise. For any n we need to show that S_c(B_n(v)) and b_{c,n}(v) have the same distribution.

We study all possible compositions of the six functions A_j and of the six functions a_{c,j}. Let I = (i₁, i₂, . . . ) be a sequence of indices i_n ∈ J and let J^N denote the set of all such sequences I. For any I ∈ J^N we define the multiindex I_n := (i₁, . . . , i_n). We also define the “family trees”

T_n(v) := {(A_{I_1}(v), . . . , A_{I_n}(v)) : I ∈ J^N}, (8.16)
τ_{c,n}(v) := {(a_{c,I_1}(v), . . . , a_{c,I_n}(v)) : I ∈ J^N}. (8.17)

So T_n(v) or τ_{c,n}(v) is the set of all tuples (v_1, . . . , v_n) ∈ D^n such that for each t = 2, . . . , n there exists j ∈ J such that v_t is A_j(v_{t−1}) or a_{c,j}(v_{t−1}), resp. Since our random indices j₁, j₂, . . . are independent and uniform on J, the random vector J_n = (j₁, . . . , j_n) is uniform on J^n. Hence S_c(B_n(v)) is uniform on S_c(T_n(v)) and b_{c,n}(v) is uniform on τ_{c,n}(v). So we need to show that

S_c(T_n(v)) = τ_{c,n}(v). (8.18)

Let w := S_c(v). By (8.14) the set of the evaluations A_j(w) equals the set of the evaluations A_j(v). We obtain for fixed I ∈ J^N that

{(v′, A_{I_1}(v′), . . . , A_{I_{n−1}}(v′)) : v′ = A_j(w), j ∈ J} (8.19)
= {(v′, A_{I_1}(v′), . . . , A_{I_{n−1}}(v′)) : v′ = A_j(v), j ∈ J}. (8.20)

Taking the union over all I ∈ J^N shows

T_n(w) = T_n(v). (8.21)

Now we prove (8.18) by induction over n. The case n = 1 is trivial. The induction step “n − 1 ↦ n” relies on the definition a_{c,j} = S_c A_j:

τ_{c,n}(v) = ⋃_{j∈J} {a_{c,j}(v)} × τ_{c,n−1}(a_{c,j}(v)),
S_c(T_n(v)) = S_c(⋃_{j∈J} {A_j(v)} × T_{n−1}(A_j(v))) = ⋃_{j∈J} {a_{c,j}(v)} × S_c(T_{n−1}(A_j(v))).

We only need to show for v′ := A_j(v) and w := a_{c,j}(v) = S_c(v′) that τ_{c,n−1}(w) = S_c(T_{n−1}(v′)). This is clear by the induction hypothesis τ_{c,n−1}(w) = S_c(T_{n−1}(w)) and by T_{n−1}(w) = T_{n−1}(v′), see (8.21). □

9 Upcircles

In this section we always assume that (x, r, d), (x′, r′, d′) ∈ R × (0, ∞) × R and z ∈ H.
For the reader’s convenience, we depict again the figures from Sections 2.5 and 5.4.

9.1 Upcircle coordinates

Let us restate Definition 2.16 (on page 18).

Definition 9.1 We define the

(i) upcircle U(x, r) := {z : |z − (x + ir)| = r} ⊆ H₀,

(ii) radius function R(x, z) := |z − x|²/(2 Im z) ∈ (0, ∞),

(iii) angle function D(x, z) := (Re z − x)/Im z ∈ R,

(iv) unit upcircle function γ(d) := 2/(d − i) ∈ H,

(v) upcircle coordinates function u(x, z) := (x, R(x, z), D(x, z)).

Figure 10 illustrates the above definition.

Figure 10: (Copy of Figure 4). Upcircle U(x, r).

Lemma 9.2 Let z := x + r · γ(d) and x′ ∈ R. For c := d² + 1 and y := (x − x′)/(2r) we have the formulas

R(x′, z)/r = cy² + 2dy + 1,   D(x′, z) − d = cy.

Proof. Since R(x′, z) = R(0, z − x′) and D(x′, z) = D(0, z − x′) we may assume w.l.o.g. that x′ = 0, i.e. 2ry = x. Since z = 2r(y + ½γ(d)) and ½γ(d) = 1/(d − i) = (d + i)/c we have

Re z = 2r(y + d/c),   Im z = 2r/c. (9.1)

This implies D(0, z) = Re z/Im z = cy + d. Together with R(0, z) = |z|²/(2 Im z) it follows that

(R(0, z)/r) · c = (|z|²/Im z) · (c/(2r)) = |z|²/(Im z)² = (Re z/Im z)² + 1 (9.2)
= c²y² + 2cdy + d² + 1 = c · (cy² + 2dy + 1). (9.3)

Dividing by c > 0 yields the formula for R(0, z)/r. □

Let us restate and prove Proposition 2.18 (on page 19).

Proposition 9.3 The objects from Definition 9.1 have the following properties:

(i) γ : R → U(0, 1) \ {0} is bijective.

(ii) u is bijective with inverse u⁻¹(x, r, d) = (x, x + r · γ(d)) ∈ {x} × U(x, r).

Proof. We compute

R(0, z) · γ(D(0, z)) = (|z|²/(2 Im z)) · 2/(Re z/Im z − i) = |z|²/z̄ = z. (9.4)

(i): The image γ(d) obviously lies in H and therefore cannot be 0. The following computation shows that γ(d) ∈ U(0, 1):

|γ(d) − i| = |(2 − i(d − i))/(d − i)| = |1 − id|/|d − i| = 1. (9.5)

It is clear that γ is injective. For the surjectivity choose z ∈ U(0, 1) \ {0}. Due to this section’s underlying assumption z ∈ H this is equivalent to |z − i| = 1. We can rewrite this as

0 = (z − i)(z̄ + i) − 1 = |z|² + i(z − z̄) = |z|² − 2 Im z. (9.6)

Hence R(0, z) = 1, so by (9.4) we obtain γ(D(0, z)) = z.

(ii): Let ũ(x, r, d) := (x, x + r · γ(d)).
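The upcircle coordinates of Definition 9.1, the inverse relationship between u and ũ, and the formulas of Lemma 9.2 are easy to sanity-check numerically. The following sketch (helper names are mine) does so on random sample values:

```python
import math, random

def gamma(d):          # unit upcircle function, Definition 9.1.(iv)
    return 2 / (d - 1j)

def R(x, z):           # radius function, Definition 9.1.(ii)
    return abs(z - x)**2 / (2 * z.imag)

def D(x, z):           # angle function, Definition 9.1.(iii)
    return (z.real - x) / z.imag

random.seed(0)
for _ in range(100):
    x = random.uniform(-5, 5)
    r = random.uniform(0.1, 3)
    d = random.uniform(-5, 5)
    z = x + r * gamma(d)           # second component of u^{-1}(x, r, d)
    # gamma(d) lies on the unit upcircle U(0, 1): |gamma(d) - i| = 1.
    assert math.isclose(abs(gamma(d) - 1j), 1)
    # u(u^{-1}(x, r, d)) = (x, r, d).
    assert math.isclose(R(x, z), r)
    assert math.isclose(D(x, z), d, abs_tol=1e-9)
    # Lemma 9.2 with a second base point x':
    xp = random.uniform(-5, 5)
    c, y = d**2 + 1, (x - xp) / (2 * r)
    assert math.isclose(R(xp, z) / r, c * y**2 + 2 * d * y + 1)
    assert math.isclose(D(xp, z) - d, c * y, abs_tol=1e-9)
```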
We verify both identities ũu = id and uũ = id:

• We have R(x, z) · γ(D(x, z)) = R(0, z − x) · γ(D(0, z − x)) = z − x, see (9.4). Hence ũu(x, z) = (x, z).

• Lemma 9.2 in the special case x = x′ states R(x, z) = r and D(x, z) = d for z = x + r · γ(d). Thus uũ(x, r, d) = (x, r, d).

This shows that u⁻¹ exists and that u⁻¹ = ũ. The second component lies indeed on the upcircle U(x, r) because γ(d) ∈ U(0, 1) according to (i). □

9.2 Möbius transformations in upcircle coordinates

Let us restate Definition 2.21.

Definition 9.4 For g ∈ G we define g♯ : R → [−∞, ∞] \ {0} by

g♯(x) := 2/(x − g⁻¹(∞)).

Remark 9.5 If g ∈ G is bounded on some set B ⊆ R, then g♯ is also bounded on B.

Let us restate and prove Proposition 2.23 (on page 20).

Proposition 9.6 Let g ∈ G and (x, z) ∈ R × H such that g(x) ≠ ∞ (which is equivalent to |g′(x)| ≠ ∞). Suppose we know the upcircle coordinates (x, r, d) := u(x, z).

(i) The upcircle coordinates (x′, r′, d′) := ug(x, z) have the formulas

x′ = g(x),   r′ = |g′(x)| · r,   |d′| = |d + g♯(x) · r|.

(ii) g preserves upcircles in the following way:

g(U(x, r)) = U(x′, r′).

Proof. (i): We first show the following two special cases:

• Case 1: If g = m then (x′, r′, d′) = (−x, r, −d).

• Case 2: If g = f ∈ F then

x′ = f(x),   r′ = f′(x) · r,   d′ = d + f♯(x) · r. (9.7)

Then the general case g = m^l f follows by combining both special cases and using that |g′(x)| = f′(x) and g⁻¹(∞) = f⁻¹(∞).

Case 1 only consists of the verification

r′ = R(m(x), m(z)) = |m(z − x)|²/(2 Im z) = R(x, z) = r, (9.8)
d′ = D(m(x), m(z)) = (−Re z − (−x))/Im z = −D(x, z) = −d. (9.9)

Case 2: We set

r̃ := f′(x) · r,   d̃ := d + f♯(x) · r. (9.10)

We show that f(z) = f(x) + r̃ · γ(d̃). The formula for u⁻¹, see Proposition 9.3.(ii), then implies that r′ = R(f(x), f(z)) = r̃ and d′ = D(f(x), f(z)) = d̃. Similarly, we also have z = x + r · γ(d).

Let us use our standard notation for functions f ∈ F, namely f(z) = (c₁z + c₂)/(c₃z + c₄) with coefficients c_i ∈ R such that c₁c₄ − c₂c₃ > 0.
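With this notation fixed, the Case 2 formulas (9.7) can also be checked numerically. The sketch below uses one hypothetical f ∈ F (coefficients of my choosing with positive determinant) together with f′(x) = c/(c₃x + c₄)², f⁻¹(∞) = −c₄/c₃ and the definition of f♯ above:

```python
import math

# A hypothetical f in F: f(z) = (2z + 1)/(z + 3), determinant c = 2*3 - 1*1 = 5 > 0.
c1, c2, c3, c4 = 2.0, 1.0, 1.0, 3.0
c = c1 * c4 - c2 * c3

f       = lambda z: (c1 * z + c2) / (c3 * z + c4)
f_prime = lambda x: c / (c3 * x + c4)**2
f_sharp = lambda x: 2 / (x - (-c4 / c3))      # f#(x) = 2/(x - f^{-1}(inf))
gamma   = lambda d: 2 / (d - 1j)
R       = lambda x, z: abs(z - x)**2 / (2 * z.imag)
D       = lambda x, z: (z.real - x) / z.imag

x, r, d = 0.5, 0.25, 1.0
z = x + r * gamma(d)                          # point with upcircle coordinates (x, r, d)

# Formulas (9.7): upcircle coordinates of (f(x), f(z)).
assert math.isclose(R(f(x), f(z)), f_prime(x) * r)
assert math.isclose(D(f(x), f(z)), d + f_sharp(x) * r)
```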
Notice that the following computations are even true in the affine linear case c₃ = 0, which is equivalent to f⁻¹(∞) = ∞. We have by Definition 2.5 (on page 14) and by the subsequent Remark 2.6.(iv) that

(f(z) − f(x))/(z − x) = f′(x, z) = f′(x) · (c₃x + c₄)/(c₃z + c₄). (9.11)

Together with z = x + r · γ(d) it follows that

(f(z) − f(x))/r̃ = ((z − x)/r) · (c₃x + c₄)/(c₃z + c₄) = γ(d) · (c₃x + c₄)/(c₃x + c₃rγ(d) + c₄) = (1/γ(d) + c₃r/(c₃x + c₄))⁻¹ (9.12)
= 2 · (2/γ(d) + 2r/(x − f⁻¹(∞)))⁻¹ = 2 · ((d − i) + (d̃ − d))⁻¹ = γ(d̃). (9.13)

Hence f(z) = f(x) + r̃ · γ(d̃).

(ii): Notice that (x′, r′, d′) = ugu⁻¹(x, r, d). In the following, we fix x and r and consider d a free variable. Let us define x′ and r′ by the first two formulas from part (i), which do not depend on d. The proof of (i) shows that there is a sign σ ∈ {−1, 1} such that any d satisfies

ugu⁻¹(x, r, d) = (x′, r′, σ · (d + g♯(x) · r)). (9.14)

Thus

{gu⁻¹(x, r, d) : d ∈ R} = {u⁻¹(x′, r′, d′) : d′ ∈ R}. (9.15)

As a consequence of both parts of Proposition 9.3 we see that the right-hand side is {x′} × (U(x′, r′) \ {x′}) and the left-hand side is the image under g of the set {x} × (U(x, r) \ {x}). This implies that U(x′, r′) = g(U(x, r)). □

9.3 Generalized child functions in upcircle coordinates

Let us consider the generalized child function a_{c,j} : D → D in the case c = 1.

Definition 9.7 We define a♯_j : R \ E → R by

a♯_j(x) := (S_{j′}A_j)♯(x), if a_j(x) = S_{j′}A_j(x).

Remark 9.8 We notice the similarity to Proposition 7.10.(iii) which states for x ∈ R (see Remark 7.11) that

|a′_j(x)| = |(S_{j′}A_j)′(x)|, if a_j(x) = S_{j′}A_j(x). (9.16)

Lemma 9.9 The function a♯_j has the following properties:

(i) a♯_j is well-defined and bounded.

(ii) Given the upcircle coordinates (x, r, d) := u(x, z), we have the following formula for the upcircle coordinates (x′, r′, d′) := ua_{1,j}(x, z):

x′ = a_j(x),   r′ = |a′_j(x)| · r,   |d′| = |d + a♯_j(x) · r|.

Proof. (i): For x ∈ R \ E we have A_j(x) ∈ R \ E ⊆ Σ∗ according to Lemma 8.2.(ii).
This means that there is a unique index j′ such that A_j(x) ∈ Σ_{j′}, or equivalently a_j(x) = S_{j′}A_j(x). For the boundedness, we fix j′ and show that a♯_j is bounded on B := {x ∈ R \ E : A_j(x) ∈ Σ_{j′}}. Let g := S_{j′}A_j ∈ G. For any x ∈ B we have a♯_j(x) = g♯(x) and g(x) ∈ Σ. Hence g is bounded on B. It follows with Remark 9.5 that a♯_j is bounded on B.

(ii): By definition of S_1 we have a_{1,j}(x, z) = S_{j′}A_j(x, z), where j′ is chosen such that A_j(x) ∈ Σ_{j′}, or equivalently a_j(x) = S_{j′}A_j(x). Thus a♯_j(x) = (S_{j′}A_j)♯(x) and |a′_j(x)| = |(S_{j′}A_j)′(x)|. Now, Proposition 9.6 shows the three formulas in (ii). □

9.4 Upcircles intersecting two shape sets

Let us restate and prove Proposition 5.10 (on page 42). The case c = 2 is illustrated in Figure 11. We obtain an illustration of the case c = 1 by interchanging the labels x, r, z with x′, r′, z′.

Figure 11: (Copy of Figure 5). Upcircles U(x, r) and U(x′, r′) = S_{(1,1)}(U(x, r)) with elements z and z′ = S_{(1,1)}(z).

Proposition 9.10 Let (x, z) ∈ D and (x′, z′) := S_c(x, z). If c = 2 we assume x ∈ Σ, and if c = 1 we assume z ∈ Σ. We define the upcircle coordinates (x, r, d) := u(x, z) and (x′, r′, d′) := u(x′, z′). Suppose r is small enough such that U(x, r) intersects at most two shape sets. Then the following bounds hold:

|x − x′| ≤ 4r,   (1/4)r ≤ r′ ≤ 4r,   ||d| − |d′|| ≤ 2r. (9.17)

Proof. The points x and z lie on the upcircle U(x, r) which intersects Σ and at most one other shape set Σ_j. If such j exists, then j is either (1, 0) or (1, 1). Figure 11 illustrates the case j = (1, 1).

c = 2: The case z ∈ Σ is trivial because then (x′, z′) = (x, z). Assume z ∉ Σ, thus Σ_j is the only shape set that contains z. It follows by construction of S_2 that (x′, z′) = S_2(x, z) = S_j(x, z). Together with Lemma 9.11 (which is stated right after this proof) it follows that (9.17) holds.

c = 1: The case x ∈ Σ is trivial because then (x′, z′) = (x, z). Assume x ∉ Σ, thus Σ_j is the only shape set that contains x. It follows by construction of S_1 that (x′, z′) = S_1(x, z) = S_j(x, z).
Lemma 6.11.(iv) states that S_j(Σ_{j′}) = Σ_{j′′} with j′′ = j′ · j⁻¹. Hence x′ ∈ Σ_{j·j⁻¹} = Σ and z′ ∈ Σ_{j⁻¹}. In both cases j = (1, 0) and j = (1, 1) we easily check j · j = (0, 0). Thus j⁻¹ = j and S_j⁻¹ = S_j. We conclude that z′ ∈ Σ_j and (x, z) = S_j(x′, z′).

The points x′ and z′ lie on the upcircle U(x′, r′) which therefore intersects Σ and Σ_j. Now, we show that U(x′, r′) does not intersect any other shape set. Assume that U(x′, r′) contains an element w ∈ Σ_{j′} with j′ ∈ J \ {(0, 0), j}. Since U(x′, r′) = S_j(U(x, r)), see Proposition 9.6.(ii), it follows that U(x, r) contains S_j⁻¹(w) = S_{j⁻¹}(w) ∈ Σ_{j′′} with j′′ = j′ · j ∈ J \ {(0, 0), j}, in contradiction to our assumption that U(x, r) intersects at most two shape sets.

Together with Lemma 9.11 (where we interchange x, r, d, z with x′, r′, d′, z′) we obtain the following bounds:

|x − x′| ≤ 4r′,   r′ ≤ r ≤ 4r′,   ||d| − |d′|| ≤ 2r′. (9.18)

It follows that (9.17) holds. □

Figure 11 illustrates the case j = (1, 1) of the following lemma.

Lemma 9.11 Let (x, z) ∈ D and (x′, z′) := S_j(x, z), where j is either (1, 0) or (1, 1). We define the upcircle coordinates (x, r, d) := u(x, z) and (x′, r′, d′) := u(x′, z′). Suppose that x ∈ Σ and that U(x, r) intersects Σ_j (not necessarily in z). Then the following bounds hold:

|x − x′| ≤ 4r,   r ≤ r′ ≤ 4r,   ||d| − |d′|| ≤ 2r.

Proof. Σ is the intersection of the first quadrant Q and the disk D around −1 of radius 2. The center of the upcircle U(x, r) is x + ir.

Case 1: j = (1, 0). The parallel to the real line passing through x + ir intersects U(x, r) in some point w outside (or on the boundary) of Q, i.e. Re w ≤ 0. Since w = x + ir − r it follows that x ≤ r. Proposition 9.6 states that x′ = −x and r′ = r and |d′| = |d|. Hence |x − x′| = 2x ≤ 2r.

Case 2: j = (1, 1). The line passing through the centers −1 and x + ir intersects U(x, r) in some point w outside (or on the boundary) of D, i.e. |w − (−1)| ≥ 2.
See Figure 12.

Figure 12: Upcircle U(x, r) intersecting Σ_{(1,1)}.

Since

|w − (−1)| = |w − (x + ir)| + |(x + ir) − (−1)| = r + |x + 1 + ir| (9.19)

it follows that 2 − r ≤ |x + 1 + ir| and therefore 4 − 4r ≤ (x + 1)². Proposition 9.6 states that

x′ = −1 + 4/(x + 1),   r′ = 4/(x + 1)² · r,   |d′| = |d + 2r/(x + 1)|. (9.20)

In the following, we use x ∈ [0, 1] several times:

|x′ − x| = x′ − x = (4 − (x + 1)²)/(x + 1) ≤ 4r/(x + 1) ≤ 4r, (9.21)
r′/r = 4/(x + 1)² ∈ [1, 4], (9.22)
||d| − |d′|| ≤ 2r/(x + 1) ≤ 2r. (9.23)

□

10 Ergodicity and the law of large numbers

10.1 First setting

In this subsection we assume that x, y are real variables with x ≠ y and we impose Setting 4.1 (on page 28), which we first restate.

Setting 10.1 Let d ∈ N and C be a compact subset of (R^d, | · |), where | · | is some norm. We fix a finite index set I and always assume i ∈ I. Let (f_i)_i be a family of Lipschitz-continuous functions f_i : C → C. Consider the Markov kernel M from C to C given by

M(x, ·) := (1/|I|) Σ_i δ_{f_i(x)},   x ∈ C. (10.1)

Let i₁, i₂, . . . be a sequence of independent random indices such that each i_n is uniformly distributed on I. For n ∈ N₀ we define the random multiindex I_n := (i₁, . . . , i_n) which is uniformly distributed on I^n. Notice for any x ∈ C that

U^x_n := f_{I_n}(x),   n ∈ N₀

is a Markov chain with U^x_0 = x and kernel M because U^x_n = f_{i_n}(U^x_{n−1}).

Let us restate Definitions 4.4 and 4.9 (on pages 29 and 31) which use the notation |b′(x, y)| = |b(x) − b(y)|/|x − y|, see Definition 3.2.

Definition 10.2 Let us define the following notions of contractivity on average:

(i) (f_i)_i contracts on average in step N ∈ N, if

sup_{x,y∈C} E ln |f′_{I_N}(x, y)| < 0,

(ii) (f_i)_i strongly contracts on average in step N ∈ N, if there exists ε > 0 such that

sup_{x,y∈C} E ln(max{|f′_{I_N}(x, y)|, ε}) < 0,

(iii) (f_i)_i (strongly) contracts on average, if it (strongly) contracts on average in step 1.

Definition 10.3 We define:

(i) M is ergodic, if it has a unique invariant probability measure µ which is also attracting, i.e.
µM = µ and νM^n converges weakly to µ for any probability measure ν on C.

(ii) For any x ∈ C we define the Markov chain

U^x_n := f_{I_n}(x),   n ∈ N₀,

which starts in U^x_0 = x and has kernel M.

(iii) M satisfies the law of large numbers, if M is ergodic with invariant measure µ and for any continuous function φ : C → R and x ∈ C we have a.s.

lim_n (1/(n + 1)) Σ_{t=0}^n φ(U^x_t) = ∫ φ dµ.

Let us restate and prove Lemma 4.12 (on page 32).

Lemma 10.4 Let N ∈ N. The kernel M^N = M · · · M has the formula

M^N(x, ·) = (1/|I^N|) Σ_{I∈I^N} δ_{f_I(x)},   x ∈ C, (10.2)

and satisfies the following properties:

(i) If M^N is ergodic, then M is ergodic.

(ii) If M^N satisfies the law of large numbers, then M satisfies the law of large numbers.

Proof. (i): The kernel M^N has a unique invariant probability measure µ and µ is also attracting. We have (µM)M^N = (µM^N)M = µM and by the uniqueness of µ it follows that µM = µ, so µ is also invariant for M. The uniqueness of the invariant measure for M is clear because any invariant measure for M is also invariant for M^N. To see that µ is attracting for M let ν be a probability measure on C. We decompose the sequence ν_n := νM^n into N subsequences ν_{n,r} := ν_{Nn+r} with r ∈ {0, . . . , N − 1}. Each subsequence ν_{n,r} = (νM^r)(M^N)^n converges weakly to µ, thus ν_n converges weakly to µ.

(ii): For fixed x ∈ C we decompose U^x_n into N subsequences U^x_{n,r} := U^x_{Nn+r} with r ∈ {0, . . . , N − 1} and show the law of large numbers for each of them. For each r and n ∈ N the random multiindex I_{n,r} := (i_{N(n−1)+r+1}, . . . , i_{Nn+r}) is uniformly distributed on I^N. Moreover, the random multiindex I_{Nn+r} is the concatenation of I_{n,r} and I_{N(n−1)+r}. This shows the independence of I_r, I_{1,r}, I_{2,r}, . . . and it also shows that

U^x_{0,r} = f_{I_r}(x),   U^x_{n,r} = f_{I_{n,r}}(U^x_{n−1,r}). (10.3)

Hence U^x_{n,r} is a Markov chain with kernel M^N. The law of large numbers for M^N shows for any (non-random) multiindex I∗ ∈ I^r that

P( lim_n (1/(n + 1)) Σ_{t=0}^n φ(U^x_{t,r}) = ∫ φ dµ | I_r = I∗ ) = 1.
(10.4)

It follows for any (non-random) sequence t_n in N₀ with lim_n t_n = ∞ that a.s.

lim_n (1/(t_n + 1)) Σ_{t=0}^{t_n} φ(U^x_{t,r}) = ∫ φ dµ. (10.5)

Let t_{n,r} be the largest t ∈ N₀ such that Nt + r ≤ n. Then

Σ_{t=0}^n φ(U^x_t) = Σ_{r=0}^{N−1} V^x_{n,r},   V^x_{n,r} := Σ_{t=0}^{t_{n,r}} φ(U^x_{t,r}). (10.6)

Since lim_n (t_{n,r} + 1)/(n + 1) = 1/N and in particular lim_n t_{n,r} = ∞ we know that a.s.

lim_n (1/(n + 1)) V^x_{n,r} = (1/N) ∫ φ dµ. (10.7)

This immediately implies that a.s.

lim_n (1/(n + 1)) Σ_{t=0}^n φ(U^x_t) = ∫ φ dµ. (10.8)

□

Let us restate Proposition 4.13 (on page 33) which we have proven already to be a consequence of Lemma 4.12 (or its restated version Lemma 10.4).

Proposition 10.5 If (f_i)_i contracts on average in some step N, then M satisfies the law of large numbers.

10.2 Extended setting

In this subsection we impose Setting 4.22 (on page 36) which we first restate.

Setting 10.6 We extend Setting 4.1 (restated in Setting 10.1) in the following way:

(i) Let (f′_i)_i be a family of Lipschitz-continuous functions f′_i : C → C′, where C′ ⊆ R is compact. Thus C̃ := C × C′ ⊆ R^{d+1} is compact, too.

(ii) For any n ∈ N and multiindices I = (i, J) ∈ I^n with J ∈ I^{n−1} let us define F_I : C → C̃ by

F_I := (f_I, f′_i f_J). (10.9)

(iii) In Definition 10.3.(ii) we defined for any x ∈ C the Markov chain U^x_n = f_{I_n}(x) with state space C. Now we define for any x ∈ C a stochastic process Ũ^x_n with state space C̃ by

Ũ^x_n := F_{I_n}(x),   n ∈ N₀. (10.10)

Remark 10.7 The proof of the following proposition shows that Ũ^x_n is also a Markov chain.

Let us restate and prove Proposition 4.25 (on page 37), the modified law of large numbers.

Proposition 10.8 Suppose that (f_i)_i strongly contracts on average in some step N. Let µ denote the invariant measure of M, see Proposition 4.13. For any continuous function φ : C̃ → R and x ∈ C we have a.s.

lim_n (1/n) Σ_{t=1}^n φ(Ũ^x_t) = (1/|I|) Σ_i ∫ φF_i dµ. (10.11)

Proof. By Definition 10.2.(ii), there exists ε > 0 such that

sup_{x,y∈C} E ln(max{|f′_{I_N}(x, y)|, ε}) < 0. (10.12)

In particular, (f_i)_i contracts on average in step N. Proposition 10.5 ensures that M satisfies the law of large numbers.
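Proposition 10.5 can be illustrated with a toy iterated function system (my example, not from the thesis): the two maps f₀(x) = x/2 and f₁(x) = x/2 + 1/2 on C = [0, 1] contract on average, the invariant measure of the resulting kernel M is the uniform distribution on [0, 1], and the empirical averages of φ(x) = x along the chain converge to ∫ x dµ = 1/2:

```python
import random

random.seed(1)
f = [lambda x: x / 2, lambda x: x / 2 + 0.5]   # two contractions on C = [0, 1]

n = 200_000
x, total = 0.3, 0.0
for _ in range(n):
    x = f[random.randrange(2)](x)              # U_n = f_{i_n}(U_{n-1})
    total += x                                 # running sum of phi(U_t), phi(x) = x

avg = total / n
# Law of large numbers: avg should be close to 1/2 (mean of Uniform[0, 1]).
assert abs(avg - 0.5) < 0.01
```

The stationary chain here is x = Σ_k B_k 2^{-k} with independent fair bits B_k, which is exactly a uniform random number in [0, 1], so the limit 1/2 is known in closed form.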
Let µ denote the invariant measure of M.

Let us define f̃_i : C̃ → C̃ and a Markov kernel M̃ from C̃ to C̃ by

f̃_i(x, x′) := (f_i(x), f′_i(x)), (10.13)
M̃(v, ·) := (1/|I|) Σ_i δ_{f̃_i(v)}. (10.14)

Notice that f̃_i(x, x′) does not depend on x′.

The proof is structured in the following way:

• Step 1 shows f̃_I(x, x′) = F_I(x) for (x, x′) ∈ C̃ and I ∈ I^n with n ∈ N. This implies the analogue of U^x_n = f_{I_n}(x), namely Ũ^x_n = f̃_{I_n}(x, x′). Hence Ũ^x_n = f̃_{i_n}(Ũ^x_{n−1}) is a Markov chain with kernel M̃.

• Step 2 justifies why we may assume w.l.o.g. that ε is a Lipschitz-constant of g := f′_{i_N} f_{I_{N−1}}, which is the second component of F_{I_N} = (f_{I_N}, g).

• Step 3 verifies that M̃ satisfies the law of large numbers.

• Step 4 shows that the invariant measure µ̃ of M̃ is the arithmetic mean of the distributions of F_i : C → C̃ w.r.t. µ. Thus any integral ∫ φ dµ̃ can be written as |I|⁻¹ Σ_i ∫ φF_i dµ.

For any continuous function φ : C̃ → R and x ∈ C the combination of Steps 1, 3, 4 leads to the a.s. convergence

lim_n (1/n) Σ_{t=1}^n φ(Ũ^x_t) = ∫ φ dµ̃ = (1/|I|) Σ_i ∫ φF_i dµ. (10.15)

The very technical proofs of Steps 1 and 2 may be skipped at first reading.

Step 1: Let I_n = (i₁, . . . , i_n) ∈ I^n. We verify

f̃_{I_n}(x, x′) = F_{I_n}(x) (10.16)

with the following illustration of the composition f̃_{I_n}(x, x′) = f̃_{i_n} . . . f̃_{i_1}(x, x′), where the number t = 1, . . . , n over an arrow signifies the application of the function f̃_{i_t} : (x, x′) ↦ (f_{i_t}(x), f′_{i_t}(x)):

(x, x′) ↦ (f_{i_1}(x), f′_{i_1}(x)) ↦ (f_{I_2}(x), f′_{i_2} f_{I_1}(x)) ↦ . . . ↦ (f_{I_n}(x), f′_{i_n} f_{I_{n−1}}(x)). (10.17)

This informal explanation is more intuitive than the formal induction proof: The case n = 1 holds by definition and for “n − 1 ↦ n” we apply the induction hypothesis to (i₁, . . . , i_{n−1}).

Step 2: Let us assume that the proposition holds under the additional assumption that ε is a Lipschitz-constant of g.
Now we show the general case.

Since all functions f_i and f′_i are Lipschitz-continuous and I is finite, there are constants L, L′ ∈ (0, ∞) such that L is a Lipschitz-constant for each function f_i and L′ is a Lipschitz-constant for each function f′_i. Hence L′L^{N−1} is a Lipschitz-constant of g. Using the rescaling factor λ := ε/(L′L^{N−1}) > 0 we set

f′_{i,⋆} := λf′_i : C → λC′ =: C′_⋆ (10.18)

and C̃_⋆ := C × C′_⋆. We define F_{I,⋆}, Ũ^x_{n,⋆}, g_⋆ analogously to F_I, Ũ^x_n, g, where the family (f′_{i,⋆})_i replaces (f′_i)_i. Then ε is a Lipschitz-constant of g_⋆ = λg and the continuous function φ_⋆ : C̃_⋆ → R : (x, x′) ↦ φ(x, (1/λ) · x′) fulfills φ_⋆(Ũ^x_{n,⋆}) = φ(Ũ^x_n) for any x ∈ C.

Step 3: We show that M̃ satisfies the law of large numbers. According to Proposition 10.5 we only need to show that (f̃_i)_i contracts on average in step N. Let (x, x′), (y, y′) be different pairs in C̃. Step 1 and the function g (with Lipschitz-constant ε) from Step 2 allow us to write

f̃_{I_N}(x, x′) − f̃_{I_N}(y, y′) = F_{I_N}(x) − F_{I_N}(y) = (f_{I_N}(x) − f_{I_N}(y), g(x) − g(y)). (10.19)

We equip R^{d+1} with the norm |(x, x′)|_∞ := max{|x|, |x′|}, where x ∈ R^d, x′ ∈ R. It follows that

|f̃_{I_N}(x, x′) − f̃_{I_N}(y, y′)|_∞ ≤ max{|f_{I_N}(x) − f_{I_N}(y)|, ε · |x − y|}. (10.20)

In this inequality both sides vanish for x = y. Suppose x ≠ y. Using |(x, x′) − (y, y′)|_∞ ≥ |x − y| leads to

|f̃′_{I_N}((x, x′), (y, y′))| ≤ max{|f′_{I_N}(x, y)|, ε}. (10.21)

Together with our assumption (10.12) it follows that (f̃_i)_i contracts on average in step N.

Step 4: Let µ̃ denote the invariant measure of M̃ and µ_i denote the distribution of F_i : C → C̃ w.r.t. µ. We show that

µ̃ = (1/|I|) Σ_i µ_i.

Since µ̃ = µ̃M̃, any Borel set B̃ ⊆ C̃ satisfies

µ̃(B̃) = ∫ 1_{B̃} dµ̃ = ∫∫ 1_{B̃} dM̃(v, ·) µ̃(dv) = (1/|I|) Σ_i ∫ 1_{B̃}(f̃_i(v)) µ̃(dv). (10.22)

Let µ∗ be the probability measure on C defined by µ∗(B) := µ̃(B × C′) and µ∗_i be the distribution of F_i : C → C̃ w.r.t. µ∗. For v = (x, x′) we have f̃_i(v) = F_i(x) and therefore

∫ 1_{B̃}(f̃_i(v)) µ̃(dv) = ∫ 1_{B̃}(F_i(x)) µ̃(d(x, x′)) = µ̃({F_i ∈ B̃} × C′) (10.23)
= µ∗({F_i ∈ B̃}) = µ∗_i(B̃).
(10.24)

Combining both results yields

µ̃(B̃) = (1/|I|) Σ_i µ∗_i(B̃). (10.25)

It remains to show that µ∗ = µ because this implies that µ∗_i = µ_i. Let B ⊆ C be a Borel set. If we apply formula (10.25) in the special case B̃ := B × C′, then the left-hand side is µ∗(B) and for the right-hand side we notice that

µ∗_i(B̃) = µ∗({x ∈ C : F_i(x) ∈ B × C′}) = µ∗({f_i ∈ B}). (10.26)

Thus

µ∗(B) = (1/|I|) Σ_i µ∗({f_i ∈ B}) = ∫∫ 1_B dM(x, ·) µ∗(dx) = (µ∗M)(B). (10.27)

So µ∗ is invariant for M and must therefore be equal to µ. □

11 Shape integrand

11.1 Definition and connection to contractivity on average

We remind ourselves of Definition 7.4 (on page 57) of the fixpoint polynomial ĥ : R → [3, ∞] and the subsequent Lemma 7.6 which states for any j and x ∈ R that

|S′_j(x)| · ĥ(x) = ĥS_j(x). (11.1)

Definition 11.1 We define functions ψ, Ψ, Ψ_n : R → R with n ∈ N₀ by

ψ(x) := ln((2/3) · ĥ(x)/ĥα(x)),   Ψ(x) := (1/6) Σ_j ψS_j(x),   Ψ_n(x) := E Ψx_n(x),

where ψ(∞) is defined as lim_{|x|→∞} ψ(x) = ln(3/2). We call Ψ the shape integrand and Ψ_n the shape integrand of order n.

Let us restate parts of Definition 3.14 (on page 26).

Definition 11.2 Let n ∈ N₀ and J_n := (j₁, . . . , j_n) be a random multiindex. We define (random) functions x_n, r_n and a (non-random) function Φ_n:

(i) x_n : [0, 1] → [0, 1] : x ↦ a_{J_n}(x),

(ii) r_n : [0, 1] → (0, ∞) : x ↦ |x′_n(x)|,

(iii) Φ_n : [0, 1] → R : x ↦ E ln r_n(x).

Note that x₀(x) = x and r₀(x) = 1 and Φ₀(x) = 0.

We remind ourselves of the chain rule |(ab)′(x)| = |a′b(x)| · |b′(x)| in Lemma 7.3.(i) (on page 56) for up-to-sign differentiable functions a, b. Let us restate and prove Proposition 4.15 (on page 34).

Proposition 11.3 For any n ∈ N₀ and x ∈ [0, 1] we have the following approximation of Φ_n(x) by the shape integrands of orders up to n − 1:

|Φ_n(x) − Σ_{t=0}^{n−1} Ψ_t(x)| ≤ ln(4/3).

Proof. For n ∈ N₀ we set H_n(x) := E ln ĥx_n(x). For x ∈ [0, 1] we notice that ĥ(x) = x² + 3 ∈ [3, 4] and therefore H_n(x) ∈ [ln 3, ln 4]. We are done if we can show that

Φ_n(x) − Σ_{t=0}^{n−1} Ψ_t(x) = H_n(x) − H₀(x). (11.2)

Let n ∈ N.
Conditioning on the last index j_n and applying the chain rule yield

    Φ_n(x) = (1/6) ∑_j E ln |(a_j x_{n−1})'(x)| = E (1/6) ∑_j ln |a'_j x_{n−1}(x)| + E ln r_{n−1}(x)    (11.3)
           = E Φ_1 x_{n−1}(x) + Φ_{n−1}(x).    (11.4)

Similarly, we see that H_n(x) = E (1/6) ∑_j ln ĥa_j x_{n−1}(x) = E H_1(x_{n−1}(x)). We remind ourselves of A_j = αS_j and a_j = SA_j. By (11.1) we have

    |S'A_j(x)| = ĥa_j(x)/ĥA_j(x),    |S'_j(x)| = ĥS_j(x)/ĥ(x).    (11.5)

Together with the chain rule

    |a'_j(x)| = |S'A_j(x)| · (2/3) · |S'_j(x)| = (2/3) · ĥS_j(x)/ĥαS_j(x) · ĥa_j(x)/ĥ(x)

it follows that

    ln |a'_j(x)| = ψS_j(x) + ln ĥa_j(x) − ln ĥ(x).    (11.6)

Interpreting j as a random index uniformly distributed on J and taking the expectation leads to Φ_1(x) = Ψ(x) + H_1(x) − ln ĥ(x). Plugging in x_{n−1}(x) for x, taking the expectation and applying (11.3), (11.4) leads to

    Φ_n(x) − Φ_{n−1}(x) = E Φ_1 x_{n−1}(x) = Ψ_{n−1}(x) + H_n(x) − H_{n−1}(x).    (11.7)

Writing Φ_n(x) = Φ_n(x) − Φ_0(x) and H_n(x) − H_0(x) as a telescopic sum yields (11.2). □

11.2 Extrema of the shape integrand

Lemma 11.4 The shape integrand is invariant under the application of any shape function, that is

    ΨS_j(x) = Ψ(x),    x ∈ R.

Proof. The lemma is an immediate consequence of {S_{j'}S_j : j' ∈ J} = {S_{j'} : j' ∈ J}, see Lemma 6.8. □

Proposition 11.5 The shape integrand has the following extrema:

    max_{x∈[0,1]} Ψ(x) = Ψ(0) = −χ_min,    min_{x∈[0,1]} Ψ(x) = Ψ(1) = −χ_max.

Proof. A simple computation verifies

    −Ψ(0) = (1/3) ln(3/2) = χ_min < χ_max = (1/3) ln(91/2) − ln 3 = −Ψ(1).    (11.8)

We show that Ψ is decreasing on [0,1]. Since Ψ is differentiable on [0,1], it suffices to show that Ψ has no critical points in (0,1).

In the following we view the six functions S_j as functions R → R. Lemma 11.4 states the invariance ΨS_j = Ψ and implies Ψ'S_j(x) · S'_j(x) = Ψ'(x) for x ∈ R. Notice that S'_j(x) ≠ 0. So if x is a critical point of Ψ, then S_j(x) is critical, too.

Step 1: We show that −3, −1, 0, 1, 3 are critical points of Ψ. Since −3 = S_(0,1)(0) and 3 = S_(0,2)(0) and −1 = S_(0,1)(1), it suffices to show that 0 and 1 are critical points of Ψ. The invariance Ψm = Ψ means that Ψ is even and therefore Ψ' is odd.
Notice that h(0) = −h²(0) and h(1) = −1. Thus

    Ψ'(h(0)) + Ψ'(h²(0)) = 0,    Ψ'(h(1)) + Ψ'(1) = 0.    (11.9)

We multiply the first equation by h'(0) = (h²)'(0) and the second one by h'(1) = 1. Then the chain rule applied to Ψhᵏ and the invariance Ψhᵏ = Ψ yield

    2Ψ'(0) = 0,    2Ψ'(1) = 0.    (11.10)

Step 2: We show that Ψ has at most 11 critical points. Let Q(x) := (9/4)ĥα(x) = x² − 3x + 9 and P(x) := x² − 4x − 3. We compute

    ψ'(x) = 2x/ĥ(x) − 3α(x)/Q(x) = −3P(x)/(ĥ(x)Q(x)).    (11.11)

Let us define the even function ψ_±(x) := ψ(x) + ψ(−x), the even polynomial Q_0(x) := Q(x)·Q(−x) of degree 4 and the odd polynomial P_0(x) := P(x)Q(−x) − P(−x)Q(x). Since ĥ is even we have

    ψ'_±(x) · ĥ(x) = ψ'(x) · ĥ(x) − ψ'(−x) · ĥ(−x) = P(x)/Q(x) − P(−x)/Q(−x) = P_0(x)/Q_0(x).    (11.12)

We have Q_0(x) = (x² + 9)² − (3x)² and P(x)Q(−x) = c₄x⁴ − x³ + c₂x² − 45x + c₀ for certain (irrelevant) coefficients c_i. Hence

    Q_0(x) = x⁴ + 3²x² + 3⁴,    P_0(x) = −3x³ − 5·3³x.    (11.13)

Since 6Ψ(x) = ∑_k ψ_±hᵏ(x) we can write

    6Ψ'(x) · ĥ(x) = ∑_k φ_k(x),    φ_k(x) := (ψ_±hᵏ)'(x) · ĥ(x).    (11.14)

Lemma 7.6 (more exactly equation (7.1) in its proof) states that (hᵏ)'(x) · ĥ(x) = ĥhᵏ(x). Thus

    φ_k(x) = ψ'_±hᵏ(x) · ĥhᵏ(x) = P_0hᵏ(x)/Q_0hᵏ(x).    (11.15)

For k ∈ {−1, 1} the formulas h(x) = (x − 3)/(x + 1) and h⁻¹(x) = −(x + 3)/(x − 1) show that

    Q_k(x) := Q_0hᵏ(x) · (x + k)⁴,    P_k(x) := P_0hᵏ(x) · (x + k)⁴    (11.16)

are polynomials of degree 4. We compute

    Q_1(x) = (x − 3)⁴ + 3²(x − 3)²(x + 1)² + 3⁴(x + 1)⁴    (11.17)
           = 91x⁴ + 276x³ + 58·3²x² + 4·3⁴x + 3⁵,    (11.18)
    P_1(x) = −3(x − 3)³(x + 1) − 5·3³(x − 3)(x + 1)³    (11.19)
           = −138x⁴ + 24x³ + 28·3²x² + 40·3²x − 2·3⁵.    (11.20)

Since h⁻¹ = mhm and Q_0 is even and P_0 is odd, we see that

    Q_{−1}(x) = Q_1(−x),    P_{−1}(x) = −P_1(−x).    (11.21)

By (11.14) we can write

    6Ψ'(x) · ĥ(x) = ∑_k P_k(x)/Q_k(x) = P*(x)/Q*(x)    (11.22)

with the polynomials

    Q*(x) := ∏_k Q_k(x),    Q*_k(x) := Q*(x)/Q_k(x),    P*(x) := ∑_k P_k(x)Q*_k(x).    (11.23)

Notice that Q* has degree 12 and Q*_k has degree 8 and P* has at most degree 12. We see by (11.21) that Q*_0 is even, so Q* = Q*_0 · Q_0 is also even.
It follows that P* = 6Ψ' · ĥ · Q* is odd because Ψ' is odd and ĥ is even. So the degree of P* is at most 11. Notice that any critical point of Ψ is a root of P*.

Step 3: We show that −3, −1, 0, 1, 3 are the only critical points of Ψ. Assume that Ψ has a critical point y ∈ R \ {−3, −1, 0, 1, 3} = Σ*. As mentioned before Step 1, this implies that each point S_j(y) is also critical. These six critical points S_j(y) are pairwise different and each of them lies in Σ*. So there exist at least 11 critical points. We saw in Step 2 that P* has at most degree 11. Hence P* has degree 11 and 11 simple roots. Exactly 5 of the roots are positive, because 0 is a root and P* is odd. So P* changes the sign exactly 5 times on (0,∞). Now we show that

1. P*(x) converges to +∞ for x → +∞,
2. (P*)'(0) > 0.

Since P* changes the sign an odd number of times on (0,∞), part 1 implies that P* is negative on some interval (0, ε), in contradiction to part 2 and P*(0) = 0.

It suffices to show that both the x¹¹ and the x¹ coefficient of P* are positive. We compute

    Q*_0(x) = 91²x⁸ + c₆x⁶ + c₄x⁴ + c₂x² + 3¹⁰,    (11.24)
    Q*_{−1}(x) = 91x⁸ + 276x⁷ + c'₆x⁶ + c'₅x⁵ + c'₄x⁴ + c'₃x³ + c'₂x² + 4·3⁸x + 3⁹    (11.25)

for certain (irrelevant) coefficients c_i, c'_i. Notice that Q*_1(x) = Q*_{−1}(−x) due to (11.21). Simple computations show that the x¹¹ coefficient of P_0·Q*_0 is −3·91² and the x¹ coefficient is −5·3¹³. The x¹¹ coefficient of P_1·Q*_1 is 40 272 and the x¹ coefficient is 112·3¹¹. As a consequence of P_{−1}(x)Q*_{−1}(x) = −P_1(−x)Q*_1(−x), see (11.21), the x¹¹ and x¹ coefficients of P_{−1}Q*_{−1} and P_1Q*_1 coincide. In combination, we see that the x¹¹ coefficient of P* is −3·91² + 2·40 272 > 0 and the x¹ coefficient is −5·3¹³ + 2·112·3¹¹ > 0. □

12 Conclusion and conjecture

12.1 Conclusion

Consider a Markov chain of triangles each of which is uniformly chosen amongst the children of the previous triangle.
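The endpoint values of Proposition 11.5, the constants χ_min and χ_max that reappear in the conclusions below, can be sanity-checked numerically. The following sketch is an illustration under stated assumptions, not part of any proof: it assumes the explicit formulas reconstructed from Section 11.2, namely ψ(x) = ln((3/2)(x² + 3)/(x² − 3x + 9)) (which is (2/3)·ĥ(x)/ĥα(x) with Q(x) = (9/4)ĥα(x) = x² − 3x + 9), and that the six flat shape maps are {±hᵏ : k = −1, 0, 1} with h(x) = (x − 3)/(x + 1), matching 6Ψ = ∑_k ψ_±hᵏ.

```python
import math

# Assumed explicit formulas (reconstructed from Section 11.2, an assumption
# of this sketch, not a quotation of the thesis's definitions):
#   psi(x)   = ln((3/2) * (x^2 + 3) / (x^2 - 3x + 9))
#   h(x)     = (x - 3) / (x + 1),   h_inv(x) = -(x + 3) / (x - 1)
# The six flat shape maps are taken to be x, -x, h, -h, h_inv, -h_inv.

def psi(x):
    return math.log(1.5 * (x * x + 3) / (x * x - 3 * x + 9))

def h(x):
    return (x - 3) / (x + 1)

def h_inv(x):
    return -(x + 3) / (x - 1)

def Psi(x):
    # Psi(x) = (1/6) * sum of psi over the six shape maps; grouping the
    # maps into +/- pairs gives psi(y) + psi(-y) for y in {x, h(x), h_inv(x)}.
    return sum(psi(y) + psi(-y) for y in (x, h(x), h_inv(x))) / 6

chi_min = math.log(3 / 2) / 3
chi_max = math.log(91 / 2) / 3 - math.log(3)

print(Psi(0.0), -chi_min)       # both approx. -0.13516
print(Psi(1 - 1e-9), -chi_max)  # both approx. -0.17396 (x = 1 is a pole of h_inv,
                                # so we evaluate just below 1)
```

Under these assumptions the computed values agree with (11.8): Ψ(0) = −(1/3)ln(3/2) ≈ −0.1352 and Ψ(1) = −((1/3)ln(91/2) − ln 3) ≈ −0.1740, and evaluating Ψ on a grid in between is consistent with Ψ decreasing on [0,1].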
The sequence of the shapes of these triangles is also a Markov chain and we call it a shape chain.

This thesis provides a new description of shape chains by expressing them in upcircle coordinates. Simply speaking, instead of starting with a single point (which represents its associated triangle), we can start with a whole upcircle. Applying the same (random) Möbius transformations as for a certain shape chain results in a Markov chain of upcircles.

The new description of shape chains in upcircle coordinates allows for the application of our modified version of the law of large numbers. The original version was proved in [6], where it was not used for the proof of the exponentially fast "flattening" of the non-flat shape chain, and where a technical condition, namely contractivity on average in step 2, was verified via numerical computations. We show the contractivity on average in some step N without numerical computations by reducing the problem to the analysis of a one-dimensional function, the shape integrand.

A byproduct of our analysis of the shape integrand is the pair of bounds χ_min and χ_max for the universal constant χ := −(1/6) ∑_j ∫ ln |a'_j| dμ > 0, where |a'_j| is the up-to-sign derivative of the child function a_j and μ is the invariant measure of the flat shape chain.

Our second theorem considers the same coupling (X_n, Z_n) of a flat and a non-flat shape chain as [6] and presents the new result that the radius R_n of the unique upcircle containing X_n and Z_n decays (asymptotically) with the exact exponential rate χ from above. From here, we concluded that Z_n − X_n and ℑZ_n = flat(∆^(n)) decay exactly with rate χ, too.

The latter result about ℑZ_n shows that the (upper) Lyapunov constant of the associated dynamical system (investigated in [1], [9], [10]) is (1/2)χ and therefore has the bounds (1/2)χ_min and (1/2)χ_max, which are sharper than the previously known bounds.

12.2 Conjecture

Theorem 1.2 in [6] states that the largest angle of ∆^(n) converges to π in probability. One can show that the a.s.
convergence follows from our conjecture:

    lim_n (1/n) ln(1 − X_n) = 0 a.s.    (12.1)

It suffices to prove that the limit lim_n (1/n) ln(1 − X_n) exists a.s., because this limit can neither be positive (that would mean that X_n is unbounded), nor negative (because then the limit set of X_n would a.s. be {1} instead of [0,1] as stated in Theorem 1.3 in [6]).

Let λ ∈ [0,2]. For a flat triangle ∆ we set flat_λ(∆) := 0. For any other triangle ∆ we define

    flat_λ(∆) := (shortest height in ∆)^(2−λ) · (longest height in ∆)^λ / (area of ∆).    (12.2)

In particular flat_0 = flat. We notice that flat_λ(∆) is non-decreasing in λ. According to the formula for the area of a triangle, a shortest height is orthogonal to a longest edge and a longest height is orthogonal to a shortest edge. Hence

    flat_1(∆) = 2 · (shortest height in ∆)/(shortest edge in ∆),
    flat_2(∆) = 2 · (longest height in ∆)/(shortest edge in ∆).

With this representation of flat_1 we can easily verify that (1/2)flat_1(∆) is the sine of the second-largest angle of ∆ (or largest angle in case of several largest angles). We set flat_λ(z) := flat_λ(∆_z) and say that a sequence of triangles or points λ-flattens if their flat_λ-values converge to 0. So a sequence of triangles 1-flattens if and only if the largest angle converges to π. One can show that the conjecture (12.1) does not only imply that Z_n 1-flattens, but even that flat_λ(Z_n) decays exactly with the same rate χ for each λ ∈ [0,2].

The following two examples show that 2-flattening is strictly stronger than 1-flattening, and 1-flattening is strictly stronger than 0-flattening. The sequence n + i does not 2-flatten, but it 1-flattens. The sequence of isosceles points −1 + 2exp(i/n) does not 1-flatten, but it 0-flattens.

If we could show that φ : [0,1] → R : x ↦ ln(1 − x) is integrable w.r.t. μ and the law of large numbers (1.13) still holds, then our conjecture (12.1) would follow.

References

[1] I. Bárány, A.F. Beardon, and T.K. Carne. Barycentric subdivision of triangles and semigroups of Möbius maps.
Mathematika, 43:165-171, (1996).

[2] M.F. Barnsley, and J.H. Elton. A new class of Markov processes for image encoding. Adv. in Appl. Probab., 20(1):14-32, (1988).

[3] M. Begue, D.J. Kelleher, A. Nelson, H. Panzo, R. Pellico, and A. Teplyaev. Random walks on barycentric subdivisions and the Strichartz hexacarpet. Exp. Math., 21(4):402-417, (2012).

[4] S. Butler, and R. Graham. Iterated Triangle Partitions. In: G.O.H. Katona, A. Schrijver, T. Szőnyi, and G. Sági (eds). Fete of Combinatorics and Computer Science. Bolyai Soc. Math. Stud., 20:23-42, (2010).

[5] B. Casselman. The joy of barycentric subdivision. Amer. Math. Soc. Feature column, (electronic), (June, 2017).

[6] P. Diaconis, and L. Miclo. On Barycentric Subdivision. Combin. Probab. Comput., 20(2):213-237, (2011).

[7] P. Diaconis, and L. Miclo. On barycentric subdivision, with simulations, (2010).

[8] H. Furstenberg. Noncommuting random products. Trans. Amer. Math. Soc., 108:377-428, (1963).

[9] B. Hough. Tessellation of a triangle by repeated barycentric subdivision. Elect. Commun. in Probab., 14:270-277, (2009).

[10] A. Wilkinson. What are Lyapunov exponents, and why are they interesting? Amer. Math. Soc. Bulletin, 54:79-105 (electronic), (2016).

Appendix

A Asymptotic exponential rates

Lemma A.1 Let w_n, w'_n be sequences in C and κ, κ' be real constants.

(i) w_n has exactly rate κ if and only if w_n has at most and at least rate κ.

(ii) w_n has at most (or at least) rate κ, if for any κ' > κ (or κ' < κ) there is some n₀ ∈ N such that any n ≥ n₀ satisfies

    |w_n| ≤ e^(κ'n) or |w_n| ≥ e^(κ'n), resp.

(iii) w_n has at most rate κ if and only if w_n⁻¹ has at least rate −κ.

(iv) If w_n has at most rate κ and p ∈ (0,∞), then w_nᵖ has at most rate p·κ. The same is true with "at most" replaced by "at least" or "exactly".

(v) If w_n has at most rate κ < 0, then w_n converges to 0. If w_n has at least some rate κ > 0, then |w_n| converges to ∞.

(vi) If w_n is bounded, then w_n has at most rate 0. If w_n is bounded away from 0, then w_n has at least rate 0.
In particular, if w_n converges to a non-zero complex number, then w_n has exactly rate 0.

(vii) If w_n has at most rate κ and w'_n has at most rate κ', then w_n · w'_n has at most rate κ + κ'. The same is true with "at most" replaced by "at least" or "exactly".

(viii) If w_n has at most rate κ and w'_n has at most rate κ', then w_n + w'_n has at most rate max{κ, κ'}.

Proof. The parts (i), (ii), (iv), (vi), (vii) are clear, and the parts (iii), (v) follow from (ii).

(viii): For any a, b ∈ C we know |a + b| ≤ 2 max{|a|, |b|} and therefore

    ln |a + b| ≤ ln 2 + max{ln |a|, ln |b|}.    (A.1)

We plug in a = w_n and b = w'_n, multiply by 1/n, take lim sup_n, and use that the limit superior of the sum or the maximum of two real sequences is at most the sum or the maximum of the limits superior of those two sequences. This shows (viii). □
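As a concrete illustration of this rate calculus, here is a small numerical sketch. The sequences w_n = 2⁻ⁿ(2 + sin n) and v_n = 3⁻ⁿ are made-up examples chosen for this sketch, not objects from the thesis:

```python
import math

# Illustrative sequences (assumptions of this sketch, not from the thesis):
#   w_n = 2^-n * (2 + sin n): the bounded oscillating factor has rate 0
#   by (vi), so w_n has exactly rate -ln 2 by the product rule (vii).
#   v_n = 3^-n has exactly rate -ln 3.

def empirical_rate(w, n):
    # (1/n) * ln|w_n|, the quantity whose limit defines the exact rate
    return math.log(abs(w)) / n

n = 300  # large enough to see the limits, small enough to avoid underflow
w = 2.0 ** -n * (2 + math.sin(n))
v = 3.0 ** -n

print(empirical_rate(w, n))      # close to -ln 2 ~ -0.693
print(empirical_rate(w * v, n))  # product rule (vii): close to -(ln 2 + ln 3) = -ln 6
print(empirical_rate(w + v, n))  # sum rule (viii): close to max{-ln 2, -ln 3} = -ln 2
```

For the sum, (viii) gives only an upper bound in general, but here w_n dominates v_n, so the rate is exactly −ln 2.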
