SOME NEW RESULTS ON CONSTRUCTINGOPTIMAL ALPHABETIC BINARY TREESByBrendan M. MumeyB. Sc. Honors (Mathematics) University of AlbertaA THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinTHE FACULTY OF GRADUATE STUDIESCOMPUTER SCIENCEWe accept this thesis as conformingto the required standardTHE UNIVERSITY OF BRITISH COLUMBIAAugust 1992© Brendan M. Murney, 1992In presenting this thesis in partial fulfilment of the requirements for an advanced degree atthe University of British Columbia, I agree that the Library shall make it freely availablefor reference and study. I further agree that permission for extensive copying of thisthesis for scholarly purposes may be granted by the head of my department or by hisor her representatives. It is understood that copying or publication of this thesis forfinancial gain shall not be allowed without my written permission.Computer ScienceThe University of British Columbia2075 Wesbrook PlaceVancouver, CanadaV6T 1Z1Date:2,AbstractThis thesis studies the question of whether optimal alphabetic binary trees can be constructed in o(n lg n) time. Hu and Tucker [1] gave the first subquadratic algorithm in1971, namely an O(n lg n) time algorithm. Optimal alphabetic binary trees have manyapplications and the question of whether an o(n ig n) time algorithm exists has remainedan open problem in data structures for the last 20 years. We show that a class of techniques for finding optimal alphabetic trees which includes all current methods yieldingO(n lg ii) time algorithms are at least as hard as sorting in whatever model of computation used. We introduce a new idea for finding optimal alphabetic binary trees which werefer to as region-processing. Region-processing methods have been successfully used togive 0(n) time algorithms for the case where all the input weights are within a constantfactor of one another and when they are exponentially separated. Although the regionbased method we give exhibits o(n lg n) time performance for many cases, we also showa reduction from sorting to it as well. It is possible that the ideas used in this reductioncan be extended to yield a general w(n) lower bound.11Table of ContentsAbstractList of Figures vAcknowledgement vi1 Introduction 12 Review of Current Algorithms 42.1 The Garcia-Wadis Algorithm 62.2 Correctness of the Hu-Tucker Algorithm 82.2.1 Feasibility 92.2.2 Optimality 123 Faster Methods? 173.1 Some Simple Cases Solvable in 0(n) Time 173.2 Region-based Methods 243.2.1 Detailed description of the algorithm 263.2.2 The region-processing step 273.2.3 Time complexity of the basic region-based method 294 Speeding up Region-based Methods 304.1 Speeding up General Region-based Algorithms 304.2 The Constant Factor Case 38lii4.3 The Exponentially Separated Case 415 Hardness Results 435.1 Finding the lmcp tree is as Hard as Sorting 445.2 Hardness Result for a Class of Region-based Methods 456 Conclusions 52Bibliography 53ivList of Figures2.1 Sorted list of priority queues 62.2 Compatibility blocking 153.3 Odd number of nodes present 193.4 Region processing 253.5 Typical region distribution 284.6 Can process R quickly 314.7 Gap between regions 324.8 One r-step 344.9 Forest formed after three r-steps 354.10 Partition according to inductive hypothesis 375.11 Simple case 465.12 Tree generated from {x} 485.13 The ordering tree 49vAcknowledgementI would like to thank my supervisor Maria Kiawe for suggesting this interesting problem,and for her constant encouragement and help. In spite of running our department andrunning marathons, she always found time to meet with me. Thanks to David Kirkpatrickfor suggesting improvements to the last chapter and for being my second-reader. I wouldlike to acknowledge my fellow students and the pleasant research atmosphere at U.B.C.It’s been a fun couple of years!Brendan MumeyAugust 1992.viChapter 1IntroductionThe problem of finding optimal alphabetic binary trees can be stated as follows: Givena sequence of n weights w1, . . . , w,, we wish to construct a tree whose leaves have theseweights, such that the tree is optimal with respect to some cost function and also has theproperty that the weights on the leaves occur in order as the tree is traversed from leftto right. A tree which satisfies this last requirement is said to be alphabetic. Althoughmore general cost functions can be considered along the same lines as in the discussionto follow, the original cost function studied, and the one we will concentrate on is wl1where l is the level of the i-th leaf node from the left in the tree. E. N. Gilbert and E.F. Moore [4] gave the first efficient (0(n3) time) algorithm for finding optimal alphabeticbinary trees, using dynamic programming. Their method was improved by Knuth [8],in 1971, to 0(n2) time for a more general problem where internal nodes are weighted aswell. F. Yao also solves this problem in 0(n2) time with her more general method forspeeding up dynamic programming [5].The first o(n2) time solution came in 1971, when T. C. Hu and A. C. Tucker [1]gave an 0(n lg n) time algorithm. Their original implementation was 0(n2), but this wasimproved to 0(n lg n) by Knuth, using a better data-structure, and is mentioned in theoriginal Hu-Tucker paper. If we remove the restriction that the tree must be alphabetic,then the problem becomes the well-known problem of building Huffman trees, which isknown to have e(n lg n) time complexity in the comparison model. (The lower boundmodel will be discussed in Chapter 5.) Modified 0(n lg n) algorithms for finding optimal1Chapter 1. Introduction 2alphabetic binary trees, based on the ideas of Hu and Tucker, but with simpler proofs,have been given by A. M. Garcia and M. L. Wachs in 1977 [3], and Hu, D. J. Kleitmanand J. K. Tanaka in 1979 [2]. The only recent progress on this problem has been madeby P. Ramanan [6] who showed that it is possible to verify that a given alphabetic treeon a sequence of weights is optimal in 0(n) time when the weights in the sequence areeither within a constant factor, or exponentially separated (notions we define preciselylater). However, it seems substantially more difficult to actually construct the optimaltrees in these cases.As with Huffman trees, one application of optimal alphabetic binary trees is to buildefficient codes. Given an optimal tree, one can build codewords for the words at terminalnodes according to how they are reached from the root node of the tree. This is doneby writing a “0” for each left branch taken, and a “1” for each right. Although not asefficient as Huffman codes, alphabetic codes have the possibly desirable property thatthat they preserve order. If one word comes before another in the original order then thecodeword for the first will be smaller than the codeword for the second.Another interesting application is that of finding optimal binary search trees. Thisis the problem of devising efficient tests to differentiate a number of objects which mayoccur at different frequencies. An example mentioned in [1] is the problem of determiningwhich of set of ii lengths a particular object is assuming that you can only decide whethera particular object is longer or shorter than a given length and that the frequencies aregiven. (Imagine logs of different lengths to be stacked in different piles according tolength.) To solve this problem, write down the lengths present in sorted order andcreate an external node for each one. Assign each node a weight corresponding to thefrequency with which that length appears. An optimal alphabetic binary tree providesa classification strategy which minimizes the expected number of length comparisonsnecessary to identify the length of an object.Chapter 1. Introduction 3We begin by reviewing the current Hu-Tucker based algorithms, which are unifiedaround the idea of building an optimal intermediate tree on the weights, from which thefinal optimal alphabetic tree is subsequently constructed. A simple observation will showthat the construction of the intermediate tree requires at least as much time as sortingin the given model of computation. An alternative approach is considered in whichinput weights {w} are first classified according to their order of magnitude. Define thecategory of a node of weight to to be [lg to]. Maximal length regions of consecutiveweights in the same category are simultaneously considered. Using this idea, we give0(n) algorithms for the case when all the input weights are within a constant factor ofone another, and the case where they are exponentially separated, in the sense that thereare only a constant number of weights between 2 and for any k. The 0(n) timealgorithm uses a new result of Maria Klawe for performing generalized selection in lineartime. Unfortunately, we are not able to give a general 0(n) algorithm based on regionprocessing. In fact, we show a reduction from sorting to a large class of region-basedalgorithms. We conclude with some remarks about the ideas we have introduced.Chapter 2Review of Current AlgorithmsIn this chapter we describe the basic Hu-Tucker algorithm and its variants. All Hu-Tuckerbased methods begin by building an intermediate tree on the input weight sequence. Thelevels of the weight-carrying leaves in this intermediate tree are recorded and then usedto build an alphabetic tree whose weights are at the same level. Thus the cost of thealphabetic tree is the same as the cost of the intermediate non-alphabetic tree. Theintermediate tree is proved to have optimal cost in a class of trees which contains allalphabetic trees, so it follows that the alphabetic tree constructed is optimal. Some of theproofs given are nearly identical to those given in [2], while others have been simplified.We feel that the proofs given are simplest available for what is quite a complicatedargument.The Hu-Tucker algorithm begins with a list of leaf nodes containing the weightsin1,. . . , w, in order. This list is called the work list and is used to determine how nodescombine to form the intermediate tree. Nodes in the work list are designated eithercrossable or noncrossable. This affects which nodes may pair off together. Initiallyall nodes are noncrossable. When two nodes are paired off, the resulting internal parentnode is designated crossable. The weight of the parent node is assigned the sum of theweights of its children. The nodes that paired off are removed from the work list, and thenew parent node is put where the left child was. We say that two nodes in the list arecompatible if they are adjacent in the work list, or if all the nodes which separate themare crossable. The symbol v will refer to a node in the work list and w will refer to its4Chapter 2. Review of Current Algorithms 5weight. The level of a node v in the tree, denoted by l, is length of the shortest pathfrom v to the root of the tree. Define an order of the nodes in the work list by v <if zv. < w, or if w = w, and v is to the left of v, in the list. Note that this ordering istransitive. A pair of nodes (Va, v,,) are said to be a local minimum compatible pair(lmcp) if and only if the following two conditions hold:1. vb <v for all nodes v compatible with node Va.2. Va vi,, for all nodes VY compatible with node vb.The first condition says that v, is the best node for Va to pair with, and the secondsays that Va is the best for Vb to pair with. We refer to the new node formed after theypair as Vab.Only lmcps are combined. When only one node remains it will be the root node inthe intermediate tree mentioned above. From now on we will refer to this intermediatetree as the lmcp tree. In general, the construction of the lmcp tree is not unique; thelmcps can combine in different orders. We now sketch a typical implementation of theHu-Tucker algorithm.1. Given the initial nodelist,(Vi) (V2) (V3) . . . (v,)form sorted priority queues of nodes which are compatible with each other. (SeeFigure 2.1.)2. Since all nodes are compatible within a given priority queue above, the two smallestnodes at the top of the queue will be the only candidates for being an lmcp. Tofind an lmcp we can use a stack-based method. Beginning with the leftmost queue,maintain a pointer to the current queue being considered. By checking neighboringChapter 2. Review of Current Algorithms 6I I I I I I II I I I I I I(v1) (v3) (v3) (v,) i (va_i)1 (v) (v2) (v4) 1 ‘ (vfl) (va)Figure 2.1: Sorted list of priority queuesregions, we can determine in 0(1) time whether or not the current pair is an lmcp. Ifit is, combine the two nodes and place the resulting node in the position previouslyoccupied by the leftmost node. Move the pointer back to the previous queue, as itmay now contain an lmcp. Otherwise move the pointer forward.It will often happen that combining two nodes will allow several queues to merge. Ifwe choose the appropriate data-structure for representing the queues, then mergingthe queues can be done in 0(lg n) time. Leftist trees provide such a representation[10]. If we amortize the cost of finding lmcps over the entire course of the algorithm,they take only 0(1) time per lmcp to find. After each lmcp is combined we mayrequire 0(lg n) time to update the queue structures. Hence forming the lmcp treein this way takes 0(n lg ii) time.3. From the lmcp tree, it is easy to construct the optimal alphabetic tree in 0(n) timeusing 0(n) space. This is a simple process and is well described in the originalHu-Tucker and subsequent papers, so I will not describe it here.2.1 The Garcia-Wachs AlgorithmAn interesting modification to the basic Hu-Tucker algorithm presented above was madeby A.M. Garcia and M.L. Wachs in 1977 [3]. Their modification rests on the observationChapter 2. Review of Current Algorithms 7that crossable nodes may be moved past smaller nodes, regardless of whether the latterare crossable or not. This follows from the fact that the smaller nodes will naturallycombine and become crossable before the moved crossable node will be involved in anlmcp. By moving newly formed nodes judiciously, they ensure that all lmcp pairings arebetween adjacent nodes, and hence no information about crossability of nodes need bestored. Their algorithm begins with a list of the original input nodes:(Vi) (v2) (v3) ... (vs)Define a consecutive pair (v1_,v1) to be right minimal if it satisfies:1. w_2 + w_1 w_1 + w (when i > 2),2. w_1 + w <wj_1 + w for all j > i.First the right-most right minimal pair is located. Say this pair is (v1_i,v1). This willbe an lmcp. Next the first (if any) entry to the right of v whose weight is greater thanw_1 + w is found. Let this be Vi+k+1. Delete nodes v_1 and v from the list and insertthe new node v_1, into the list before Vj+k+1. The new list is then(vi) (v2) . . . (v_2) (v1+) . . . (v_i,) (vI+k÷1) . . . (v,j.If no such entry exists, then the new list is(vi) (v2) . . . (v_2) (v+i) . . . (v,2) (v_1,).An O(n lg ii) implementation of this idea is given at the end of their paper. It is dueto Tarjan and makes use of some properties of balanced binary trees; namely that theycan be used to represent lists such that most basic list operations on lists of length krequire O(lg k) time.Chapter 2. Review of Current Algorithms 82.2 Correctness of the Hu-Tucker AlgorithmWe prove the correctness of the basic Hu-Tucker algorithm following the approach in[2j. Their proof is somewhat simpler than the original one given the original Hu-Tuckerpaper [1] and is slightly more general in that it proves the algorithm works for a classof cost functions. We have further simplified many of their proofs and present thesesimplifications in full.There are two key issues addressed in the proof of the Hu-Tucker method. The firstissue is feasibility. Are the leaf node levels in the lmcp tree realizable in an alphabetictree? The second issue is optimality. Does this alphabetic tree have minimum cost?Following the notation in [2], denote the cost of a tree T by ITI. Instead of simplyadding the weights of the children to get the weight of the parent, they consider a moregeneral combine function f, where the parent of nodes Va and vb is assigned the weightf(wa, Wi,). The tree cost and combine functions must satisfy the following properties:P1 TI cannot increase if any of its weights is replaced with a smaller weight.P2 For all Va, Vi,, and v,f(wa,u)b) = f(wi,,wa),Wa f(wa,ws)andWa Wi, 4 f(wa,wz) f(wi,,w).P3 If Vi, < v and ii, l, then interchanging vi, and v0 cannot decrease the cost of thetree.P4 Let T be a tree with it nodes in which Va and vi, are combined. Let T* be thetree of it — 1 nodes resulting from the combination, with the father of Va and Vi,Chapter 2. Review of Current Algorithms 9having weight f(Wa, wb). Then there exists a function F(Wa, wb) such that ITI =F(Wa,Wb) + IT*I.P5 The functions f and F must satisfy the following conditions if wb w:f(f(Wa,Wb)),Wc] f(f(wa,wc),w&)andF(a, wb) + F(f(Wa, wb), w) F(a, w) + F(f(Wa, we), Wb).If the tree cost and combine functions satisfy these properties they are said to beregular. From [2] some examples of regular cost functions are:1. F(w,w,) = w,, + w, = f(w,w)= (Hu-Tucker)2. F(w, w) = 0, f(w, w) = t(w + wy), t> 1I TI = w(root) = > wt’ (power summation)3. F(w,, WY) = 0, f(w, WY) = t x max(w, WY)I TI = w(root) = max Wth1 (mm-max)4. F(w, WY) = 0 and let g be any function for which g(x) > x, and is increasing forall relevant x in the range of the problem.f(w,w) = max(g(wx),g(wy))JTJ = maxg(w)l (mm-max)2.2.1 FeasibilityTo prove feasibility we consider a class of trees whose external node levels can be realizedin an alphabetic tree and show that all lmcp trees belong to this class. We begin withChapter 2. Review of Current Algorithms 10a definition of a class of trees which includes alphabetic trees, and as we will see, alsolmcp trees. Recall that two nodes are compatible if there are no noncrossable nodesbetween them in the work list. A construction for a binary tree is just a list of node-pairings, which if performed in order, will construct the tree. A construction is said to bea monotone construction if the final levels of the nodes listed in the construction is adecreasing sequence. That is, if the tree has depth 5, then all nodes at level 5 are pairedfirst, then all nodes at level 4, then all at level 3, and so on. Finally, a construction issaid to be compatible, if all node-pairings occur between compatible nodes.Definition 2.1 Define C to be the class of all binary trees which have a monotone compatible construction. For a binary tree T in C, let CT denote such a construction.We now give two complementary theorems about the class of alphabetic binary treesand the class C.Theorem 2.1 C includes all alphabetic binary trees.Proof: Let T be an alphabetic binary tree. The result is easily proved by inductionon the height of T, with the base case of one node being trivial. Since T is binary,there must be an even number of leaves on the bottom (largest) level and they mustcome in pairs in the original list of weights. We can build a monotone construction for Trecursively by appending a list of these pairs to the beginning of a monotone constructionfor the subtree of T with the bottom level nodes pruned. The monotone construction forthe subtree exists by induction. LiTheorem 2.2 The level sequence of the leaves of any tree T in C can be realized by analphabetic tree.Chapter 2. Review of Current Algorithms 11Proof: Let T be a tree in C. Assume that it is constructed via some CT. To showthat a tree can be made alphabetic, it suffices to show that the nodes at any given levelin the construction occur in consecutive pairs in the work list. If this is true, one can pairoff the nodes at the largest (bottom) level, then pair off the nodes at the next largestlevel and so on, as in the proof of the previous theorem. Consider the bottom level ofT. If we can establish that nodes on this level come in consecutive pairs then we will bedone, and the general result will follow by induction. If the nodes at the bottom level arenot in consecutive pairs, then there must be two nodes which pair up and are separatedby a node at a smaller (higher) level in the tree. But this node cannot be crossable, as weare following a monotone compatible construction. This is a contradiction as the nodespairing would hence not be compatible.The next theorem shows that lmcp trees are in C, so from the last result we knowthat they can be made alphabetic. The proof of the next theorem relies on the followinglemma which we give without proof. The proof can be found in [2].Lemma 2.3 Let Vbc, Vad be nodes, where vb0 is crossed over in the formation of Vad.Then 1bc 1ad•Proof: Omitted. LiTheorem 2.4 Any lmcp tree T’ is in C.Proof: We can provide a monotone compatible construction for T’ by listing alllowest level pairs, followed by all second lowest level pairs, etc. The only thing to show isthat if these node pairs are combined in order, then all combinations are made betweenChapter 2. Review of Current Algorithms 12compatible nodes. It is easy to see that we can order the lowest level pairs so that nolowest level node prevents a pair from being compatible. The only issue is when a pairof nodes (Va, Vd) are initially incompatible and only become compatible after a node vat a higher (smaller) level has formed. Hence lb < La. But by the last lemma, tbc lad,which implies that 1,, la. This contradiction proves that we can find an order to makeall the lowest level node combinations compatible, and by repeating the argument, finda monotone compatible construction for T’. D2.2.2 OptimalityThis completes the proof of feasibility, so we now consider optimality. The crux of theproof that the lmcp tree T’ is optimal in C is the next lemma.Lemma 2.5 Given an initial node sequence v1, . . . , v,, where some nodes may be cross-able, there is an optimal tree T* in C such that all node combinations are between lmcps.Proof: The proof is by induction on the number of nodes present. The base caseof two nodes is trivial. Assume the statement is true for any sequence of at most n— 1nodes; we need to prove it for sequences of n nodes. By the inductive hypothesis itsuffices to show that there is is a construction of the tree in in which the first node pairis an lmcp.If there is no optimal tree T* in C, whose first combination is an lmcp then pick anoptimal T* and a construction CT. such that the first pair of CT. is (Va, vb) with f(Wa, Wb)as small as possible among all constructions of optimal trees in C. After Va and Vbcombine, there will be n — 1 nodes in the sequence. By the inductive hypothesis we mayassume that the rest of the combinations in the construction are lmcps.Chapter 2. Review of Current Algorithms 13Since we are assuming that (Va, vb) is not an lmcp, there must be a node v compatibleto one of these nodes (say vb) with Va > v. Further assume that v is minimal amongall such nodes. Consider the second pair formed in CT., which by assumption is an lmcp.There are four possible cases:(1) The lmcp is (vr, v8) (two nodes other than Vab and va).(ii) It is (v,vd).(iii) It is (Vab,Vc).(iv) It is (Vab,Vd).We can immediately rule out case (iv); since v is compatible with Ub, it will becompatible with vd, when Vab is formed. From P2 and the fact that va > v, it followsthat Vab > V > v. Hence Vd would prefer to combine with v, and so (Vab, vd) could nothave been an lmcp.In cases (ii) and (iii), we can change the first two combinations in CT. to be(ii) (vb,v) followed by (Va,Vd).(iii) (vb,v) followed by (Vbc,Va).In both cases, this would combine a smaller initial pair first, contradicting the thechoice of CT..If case (i) happens, consider the sequence of combinations until v is combined. Thiswill be either(Va,Vb),(Vxi,Vyi),.or(Va, vb), (v1 vu,), . . . , (va, Vab)Chapter 2. Review of Current Algorithms 14where their intermediate combinations (v1,u,,1) do not involve v (or Vj, by the argumentused to handle case (iv) above). We assume that v1 is to the left of v711 in the work list.We now change these sequences to(ub, vs), (v1u,,1),. . . , (v,,,,, v,,), (Va, vd)and(vb,v0),(1, ,,.As in cases (ii) and (iii), these new sequences give a construction of an optimal treewith a smaller initial pair. The only problem is that one of the intermediate pairs (v1,u,,1)might need to cross Va, which may be noncrossable. We show this never happens.Suppose that (v1,u,,1) is the first pair to cross Va, and that Va is noncrossable. Assumew.l.o.g. that v is to the left of Va, and and v& are to the right of Va in the initialsequence. We know that when the pair (v1,v,,) combines, the node v11 is compatiblewith the node Vab and thus at this point both v1 and v are compatible with v0. Thisimplies that Vm cannot be compatible with Vb initially since this would imply that Vc < Vmand hence v would prefer to combine with v0. Hence v,, becomes compatible with V,sometime during the node combinations (v1,u,,1),.. . , (v,1_v,,1_,). We are specificallyinterested in which of these combinations are between nodes which block compatibilityof and Vb. Figure 2.2 shows a fairly general case of how compatibility of v1 and Vj, canbe blocked by noncrossable nodes. Crossable nodes are circles and noncrossable nodesare squares.We note that all the nodes in the initial sequence between Va and t’, are crossable, andthat no combination is made that crosses Vb. This is because one of the nodes would thenbe compatible to v, and thus also to v. Thus this node would prefer to combine withv. We now show how to define a sequence of node pairs { (v, Vyik ) }, k = 1 . . . t, withsuch that 1 3k <j and at least one of u1 and V1 is noncrossable. Furthermore thisChapter 2. Review of Current Algorithms 150.•y.. J2o ci o d C C 0 [JOVa Vb V Vy V Vy1 ii ii i2 1Figure 2.2: Compatibility blockingsequence will have the property that Vyik and Vxik+1 be compatible for k = 1 .. . t — 1 andthat v1 is compatible to vb and v is compatible to v. Let (v31,v1) be the closestpair to the right of Vb with at least one noncrossable node. Let (v3,2v2) be the closestpair to the right of v31 that combines after (v,1,v1) with at least one noncrossablenode, and so on. Eventually we will define a pair (v,1,v) so that v is compatiblewith v and we are done. Notice that this last pair may cross over the location of v,as is shown in the diagram above. Local minimality of combining pairs together withthe compatibility requirements imply that v < < < ... < v < v. Sincev,, is compatible to v3 just before the pair v1) forms, this pair cannot be an lmcp.This contradiction ensures that no lmcp combination ever crosses Va, and thus finishesthe proof. 0We require one more lemma which we state without proof. The proof can be foundin[2].Lemma 2.6 Given a node sequence consisting of crossable and noncrossable nodes, thencombining any lmcp in the sequence will not decrease the value of the smallest nodecompatible to any other node in the sequence. In particular all other lmcps not combinedChapter 2. Review of Current Algorithms 16will remain lmcps.Proof: Omitted. DWe can now state and prove the final theorem.Theorem 2.7 Given an initial node sequence v1,.. . , v, of noncrossable nodes, an optimal tree in C for any regular cost function can be built by successively combining lmcpsin any order.Proof: We know from Lemma 2.5 that there exists a sequence of lmcp combinationsCp* which constructs an optimal tree in C on these nodes. Assume that sequence is(v,v1),. . . , By Lemma 2.6, no lmcp can prevent another from combining,hence all will eventually combine and an optimal tree will be built. CThis last theorem proves the correctness of all optimal alphabetic tree-finding algorithms which are based on combining lmcps to find a level assignment for the externalnodes, and then building the alphabetic tree. The algorithms we present in the subsequent chapters are all based on this method.Chapter 3Faster Methods?For certain classes of inputs, linear time algorithms for finding optimal alphabetic binarytrees exist. In this section we will present simple methods for some of these cases. Wewill also introduce a new idea for finding optimal alphabetic binary trees in 0(n Ig n)time, which to some extent is a hybrid of the two simple methods presented. We callmethods which use this idea region-based. Although all region-based methods will laterbe shown to have at least the time complexity of sorting in the model of computationbeing used, there are some interesting speed-ups which can improve their performancesubstantially in a number of cases.3.1 Some Simple Cases Solvable in 0(n) TimeThe first special case considered is that where the values w1,.. . , w, in the initial weightsequence are all within a factor of two. That is, there is a real number c such thatc w1,.. . , w < 2c. The next theorem provides the basis for a straightforward linear-time algorithm given as a corollary afterwards.Theorem 3.1 Given a sequence of n crossable nodes which are within a factor of two,after the first [(n + 1)/2] imeps have been found and combined, the new sequence willconsist of [n/2] nodes whose weights are again within a factor of two. Furthermore, if wekeep combining lmcps, the resulting lmcp tree will be balanced, with the leaves differingin level by at most one. Specifically, the 2(n — 2[1f1) smallest weights will be at level17Chapter 3. Faster Methods? 18[ig n] + 1 and the others will be at level [ig n].Proof: We note that that since all the nodes are crossable, this reduces the problemto building a Huffman tree, where the result is known. We present a new proof, whichprovides insight to the actual behavior of the algorithm, and motivates our results tofollow.Let the initial sequence of nodes be v1, . . . , v and let c be a real number such thatc w <2c for i = 1 to n. Whenever two nodes form an lmcp and combine, the weightof the new node is greater than 2c, so it will not be involved in another lmcp until thereare less than two nodes smaller than 2c. When n is odd, a solitary original node lessthan 2c will remain. It will be the largest node present in the original sequence and willform an lmcp with the smallest available newly formed node. When n is even the largestnode present in the original sequence will merge with another original node. Regardlessof whether ii is odd or even, the last (there may be more than one) largest node willmerge during the [(n + 1)/2]-th lmcp pairing. At this stage there will be exactly [n/2Jnodes present and they will be within a factor of two, as is easily observed below:If n is even then it is clear that the [n/2} nodes present will be within a factor of twosince the weight of each new node is the sum of the weights of exactly two original nodes.If n is odd, take c’ = w + w, the smallest value among the first (n — 1)/2 newly formednodes. Clearly the rest of the first (ii—1)/2 newly formed nodes have weights less than2c’. Let 13k be the single remaining node. It will form an lmcp with the node v. Sincethe original weight sequence was within a factor of two, wk < w + w = c’. So the weightof the next node formed, Vj,k, will equal w + w + Wk and hence be less than 2c’. Thusall [n/2] = (n — 1)/2 nodes present fall strictly between c’ and 2c’.We note that when n is even all leaves have had their level increase by exactly one.When n is odd, the leaves associated with the node which paired with the largest nodeChapter 3. Faster Methods? 19have had their level increase by exactly two; all others (including those of the largestnode) will have increased by exactly one. If the nodes were presented in ascending orderwe would have the case illustrated in Figure 3.3.Figure 3.3: Odd number of nodes presentThe crucial observation for this proof is that if n is odd, then the largest node presentin the sequence of [n/2] newly formed nodes will be unique. If the largest node is vb withweight wb, then w 1Db for nodes v in the original sequence. Suppose that v,, mergeswith vkz the smallest of the newly formed nodes. Since the weight of vkj is 1Dk + w( > 2c, itfollows that wj < 1Dkl for nodes vj in the original sequence. So w + w <Wk( + 1Db =for any two nodes v and v in the original sequence. Hence the largest node is unique.Since it is unique, its parent will also be the largest node in next sequence of newly paired-off nodes. This means that although the leaves associated with the node that paired withthe largest node had their levels increase by two after one stage of the algorithm, theywill always be rooted by a largest node and so will only have their levels increase byone during all remaining stages. This argument also applies to nodes which have theirlevels increased by two at later stages of the algorithm. Thus when the tree is finished,the level of any two leaves can differ by at most one. As any lmcp-combining strategy isguaranteed to produce a tree of optimal weight, we can be sure that the smallest valuesare at the bottom (largest) level. A certain number, 2; of the smaller nodes will pair offand occupy the bottom level [lg ii] + 1, and the remaining n — 2x larger nodes will gettheir own leaves. We require x + n — 2x = or x = n — 2[1gn] ciChapter 3. Faster Methods? 20Corollary 3.2 There exists a simple linear time algorithm for finding an optimal alphabetic binary tree on a sequence of input weights which differ at most by a factor oftwo.From chapter one, we know that it suffices to find the levels of the leaves in the lmcptree. We know that this level sequence can be realized by an alphabetic tree which canbe constructed in 0(n) time. In point form, the algorithm for finding the levels of theleaves in the alphabetic tree is as follows:1. Begin with the work list containing the original input sequence. Note that all nodesare noncrossable.2. Use a stack-based method to find lmcps and pair them off, removing them temporarily from the work list. This process will continue until there are zero or onenodes left in the work list. If a single node remains (n is odd), scan through thelist of newly formed crossable nodes to find the smallest pair. Pair this node withthe single noncrossable node.3. At this stage we have m = [n/2j crossable nodes left. Moreover the new nodes arestill within a factor of two, by the same argument as in the proof of the precedingtheorem.4. We can now directly find the levels of every leaf in the lmcp tree for the m crossablenodes present by the preceding theorem. We know that if we continue to build upthe lmcp tree that we will get a balanced binary of tree of depth [lg m]. To find the2(m — 2[lm])th weight in the list we can use the well-known linear time selectionalgorithm based on median-finding [7]. We then scan through the list and findChapter 3. Faster Methods? 212(m — 2gm]) nodes whose weights are less than or equal to the 2(m — 2Llm])thweight, and assign them the level [lg in] + 1. The remaining nodes are assigned level[lg in]. In an additional 0(n) time, we can compute the levels of the nodes in theoriginal input sequence.5. With knowledge of the leaf levels we can construct the optimal alphabetic tree forthe input sequence in 0(n) time.The next special case that is easy to solve in linear time happens when the weightsw, in the input sequence are exponentially spread out; i.e. none of the nodes arewithin a factor of two. Formally, for all weights w and wj, if w w then Rv/wI 2.A simple bottom-up approach to building the optimal alphabetic tree on an exponentially spread out sequence is just to use the basic stack-based method, but ignorecrossability of nodes (new nodes are still noncrossable). This reduces the running timeto 0(n) time. The following lemma justifies this.Lemma 3.3 If for all input weights w,w3, w to3 =* w3/wj 2, no crossable nodewill ever be crossed over.Proof: In view of obtaining a contradiction, assume that the first lmcp combinationthat crosses over a section of crossable nodes occurs when 0a and v combine and crossover the nodes. ,in the work list, and that of these in-between nodes,was the last to form. Then va was compatible to v1 and v was compatible towhen the node 0Xjj was formed. This means that Wa > w and zv > w. Somax(wa, wb) > max(w1,w1). Let the weight of the largest original node present in theleaves rooted by either or be u. Since the leaves of and v1 are disjoint andleaves decrease in weight geometrically by a ratio of a least two, it follows that w11,which equals the sum of the weights of the original nodes rooted by 0Zjj is less than 2u.Chapter 3. Faster Methods? 22On the other hand, max(wa, wb) > max(w1,w1) > u. Since the weight u is not amongthe weights of the original nodes of Va or vb, there must be a larger original node presentin order for the sum of the weights to exceed u. Such a node must have weight v > 2uby assumption. Thus max(wa, wb) > 2u> w11. This contradicts the pair (Va, vb) beingan lmcp since min(va, v,,) would prefer to merge with UHence it is not necessary to maintain information about crossability as it will neverbe used. It is relatively straightforward to see that ignoring crossability speeds up thestandard stack-based method to linear running time. The speed-up comes from the factthat there are no longer crossable nodes lists to maintain and merge. We now give aformal description of the algorithm.Let the input sequence beV1 V2 V3 V4...V,Fwith the stack pointer F initially pointing to the first node in the work list. The basicstep consists of checking if the node above P and its neighbor to the right forms an lmcp;this just involves checking the next two nodes to the right as we will see shortly. If thepair is an lmcp it is combined, the new node inserted at F, and F reset to the previousnode. If not, P is advanced one node to the right. Thus in each step we either combinetwo nodes or advance P. Assume the current work list is always v1, . . . , v and thatwe are examining the node (vi, v+1). This means that the nodes (vi,v2),. . ., (v_1v)were not lmcps. Hence v1 > v3, v2 > v4, v3 > v5,. .. , v_1 > v÷i. Recombining theseinequalities shows that v1 > v3 > v5 > ... > V2(1/]4and v2 > v4 > v6 > ... >In light of the fact that nodes are assumed noncrossable, we only need to compare v andChapter 3. Faster Methods? 23v2 to see if (vi, v+i) is an lmcp. If v vj2, it is. If we reach the end of the list and arechecking (v_1,v) we see that automatically this pair will be an lmcp. After combiningit, removing v,,— 1 and v from the list, and placing the new node in the last position,n— 1, P will retreat to check (vn_, vfl_2) for local minimality. Either it is an lmcp or weneed to move forward once to find one.We now calculate the running time of this method, by counting the maximum numberof comparisons necessary to build the tree. Let T(n, p) be the maximum number ofcomparisons needed when there are n nodes and P is at position p in the list. We makethe following observations on T:1. T(2, 1) = 0 (automatic pairing)I T(n,p+1)+1 (nolmcpaboveP)2. T(n,p)=max forp<3.( T(n—1,1)+1 (lmcpaboveP)I T(n, p + 1) + 1 (no lmcp above P)3.T(n,p)=max. for2<p<n—1.( T(n — 1, p — 2) + 1 (lmcp above P)4. T(n, n — 1) = T(n — 1, p — 2) (automatic pairing)It is easy to verify by induction that T(n, p) 3n—p. Hence the total time cost ofthe algorithm will be 3n for the cost of comparisons plus an additional cost of n — 1 forforming the n— 1 new nodes between lmcps.An alternative approach to finding an optimal alphabetic tree on n weights exponentially spread out is to build the tree top-down. This is based on the observation thatplacing the largest node as high as possible in the tree dominates over the considerationthat other nodes will potentially be at larger levels. We leave the details to be workedout by the reader.Chapter 3. Faster Methods? 243.2 Region-based MethodsIn this section we present a new method for finding optimal alphabetic binary trees.After the basic version has been explained, we will make observations on how to speedup its performance in special cases. The method will be referred to as region-based.Although it will later be shown that in the worst case it requires as much time as sorting,it exhibits o(n ig n) time performance for a significant variety of inputs. In particular,we are able to give 0(n) algorithms for the case when all the input weights are within aconstant factor of one another, and the case where there are only a constant number ofweights between and 21 for any k. In all cases, its performance is at least as goodas the Hu-Tucker algorithm. The method is based on the observation that nodes thatare within a factor of 2 have approximately the same level. We begin with a completedescription of the basic algorithm.Input: a list of weights: w1,w2,.. . ,Output: a representation of an optimal binary tree built on these weightsAssume the input has been placed in a doubly-linked list called the work list. Wedefine the category of a node v in the work list to be [lg wJ. We then re-scan the worklist from left to right to build a new list (doubly-linked) of consecutive regions. A region isdefined to be the maximal length list of consecutive nodes from the work list of the samecategory. It is represented as a 5-tuple (c, Pie, Pre, Pim, prm) where c is the category of theregion, Pie and Pre are pointers to the left and right endpoints of the actual region in thework list, and Pim and Prm are pointers to the minimum nodes in the region compatible toimaginary nodes to the immediate left and right of the nodes at Pim and Prm respectively.If, for example, the node at Pie is noncrossable, then Pim = Pie. We have now partitionedChapter 3. Faster Methods? 25the work list into regions:{ viv2} {v3} {v4} {v5 v6 V7} ... { ... v}The algorithm proceeds by determining a certain region is suitable to process, andthen combining nodes in that region. In some cases it will process several regions simultaneously. It then combines pairs of nodes in it (them) appropriately and in so doing,eliminates the old region(s). Newly formed nodes may then join adjacent regions or create new ones. If r is the number of nodes in the region processed, then we only requireO(r ig r) time to do the entire operation (including the time spent to update the worklist and region list). It is also the case that a region is examined in the region list atmost twice until it is processed and eliminated. So the total cost of eliminating the rnodes and replacing them with [r/2 + 1] pairs is still O(r lg r) time, if we amortize overthe course of the algorithm. Hence we spend only O(r lg r) time to reduce the problemto size [n— r/2]; this yields an O(nlg n) time algorithm.To illustrate this, after a few steps of the algorithm, it may be that we have thesituation pictured in Figure 3.4.I Iv+v+v,) I••J _IJI I1--K \\\ :[ v4 J v5 v6 v7 , ...Dashed lines indicate crossabiityFigure 3.4: Region processingChapter 3. Faster Methods? 263.2.1 Detailed description of the algorithmWe use a stack-based method to determine which regions to process. We keep track ofnode crossability and only combine lmcps. This ensures optimality by the results of theprevious chapter.Assume that at the start the region list is:R1 R2 R3 R4 ... R1 R‘ftPThe pointer P points to the region we are currently considering for processing. Todetermine if we can process the region above P we need to check adjacent regions inthe region list. To describe this process assume we are in the general situation picturedbelow, with P below a region Rk, where 1 k in:R1 R2... Rk Rk+1 ... Rm‘ftPLet Rj < R3 if the category of R is less than the category of 1?. The followinginvariant is maintained by the algorithm:There exists a possibly empty set J = {ji < .12 < ... < j8} of nonconsecutive indicesfrom the set {1,. .. , k— 1} such that:1. J?3 consists of a single non-crossable node for all i = 1 . . . s2. R1> ... > R31 <R31 > R21÷i >...> R32_1 <R32 > ...> R,_1 <1?3>3. R,,_2j foralli=1...swherej—2>O.Chapter 3. Faster Methods? 274. 1R31_iI is oddDepending on the adjacent regions to Rk, we either process Rk and possibly otherregions or we advance P. In either case the invariant above is maintained. The cases aregiven below:1. Rk > Rk+1: Move P under Rk+1.2. Rk < Rk+1, Rk+1 consists of a single non-crossable node, and Rk Rk+2: Move Punder Rk÷2.3. Otherwise: In this case, we process Rk, and possibly other regions to the left. Webegin by finding the maximum even value j such that for all i = 0,2,4,. . .,j wehave k — 1 — i in J and Rk_2_ = Rk. In the case that no such j exists we setj = —2. Let Rie refer to the node in the work list pointed to by the Pie pointer forregion R. Define Rre,, and Rrm likewise. Now use a stack-based subroutine toprocess the following list of nodes:Rk_j_3rm Rk_j_2 . . . , Rk_3_l1, . . • . . • , Rk . . . , Rk+limFor example, when j = —2, we just have the list:Rk_lrm Rk1, . . . , Rk,. Rk+limThe reason for keeping track of singleton regions, is that nodes in regions on eitherside of a singleton region may compete to combine with the singleton node separatingthem. This happens when j = 0 as is shown in Figure 3.5.3.2.2 The region-processing stepThe subroutine begins by placing pointers in the work list to delimit the region we areworking on. This list will be a scratch list, and as we work on it we will remove nodesChapter 3. Faster Methods? 28Rk+ 1k-3• • • • • • •Figure 3.5: Typical region distributionwhich are paired. We maintain a pointer Q in the scratch list that points to the currentnode being considered for pairing with its right neighbor. To illustrate this, let us assumethe essentially general case when j = 2. For further convenience we will label the nodesin region Rk_4 by a1, . . . , a,, those in Rk_2 by b1,.. . , bq and those Rk by c1, ... , c,.. Theinitial set-up of the work list is:Rk_5rm a1 ... a Rk_31 b1 ... bq Rk_11 Cl ... C,. Rk+limiTQSome of the nodes in a, b and C may be crossable. The singleton regions f%..3 and Rk_1are not crossable and neither are the endpoint nodes Rk...5rm and Rk+ljmWe now use the fairly simple stack-based implementation of the Hu-Tucker algorithmto determine all the lmcps in the regions a, b and c, and pair them up. Since we knowthat all of the nodes in a given region, say a will pair up and form a consecutive block ofcrossable nodes, we can keep all newly formed nodes in a separate list, which we insertinto the work list when all the original nodes have been combined. After this is done wecan find the categories of all the new nodes and the new regions in this part of the worklist with one scan over the newly created nodes.If there were an odd number of nodes in one of the regions, say a, to begin with theChapter 3. Faster Methods? 29last node will either merge with one of the newly formed nodes or with an endpoint orone of the singleton region nodes. After all the initial nodes have been merged, the listof newly formed nodes is inserted into the work list. The region list is also updated. Thetime complexity of the entire region-processing step is 0(r lg r).3.2.3 Time complexity of the basic region-based methodThe proof that this method runs in 0(n lg n) time is very similar to the proof that thestack based method in the beginning of the chapter ran in 0(n) time. The differencebeing that the stack maintained now contains regions, and the basic cost of processinga region of size rn is 0(m lg m). Let S(n, p) be the maximum number of steps requiredby our method to build the lmcp tree on a list of n nodes, divided into some numbernumber of regions, with the region pointer P at position p in the region list. We makethe following observations on1. S(2, 1) = 0 (can automatically process region above P)I S(n,p+ 1) + 1 (cannot process region above P)2. S(n,p)—max’1 S(n — [s/2],1) + O(slgs) (can process region above P)for p < 3.I S(n,p + 1) + 1 (cannot process region above P)3. S(n,p)=maxçS(n— [.s/2],p —2) + O(slgs) (can process region above P)for 2 <p < [last region].4. S(n, [last region]) = S(n— [s/2J,p— 2) + 0(slgs)(can automatically process region above P)It is easy to verify by induction that S(n, p) = 0(n lg n)—0(p), which implies thatthe running time of the method is 0(n lg n).Chapter 4Speeding up Region-based MethodsThere are many classes of inputs for which we can speed up the region-based methodpresented in the previous chapter. This is done by not combining every lmcp, whichas we will see in the next chapter can be as time consuming as sorting. We begin bydiscussing a way of speeding up the region-based method for general inputs, and thenconsider the case where the inputs weights are within a constant factor of one anotherand the case where there is a bounded number of weights between 2’ and 211 for everylv. We refer to the input weights in these cases as being bounded by a constantfactor and being exponentially separated respectively. The linear time algorithmfor exponentially separated sequences is a simple observation on the behavior of thebasic region-processing method, whereas the corresponding linear time algorithm for theconstant factor case is very complicated. We only sketch the basic details of the methoddiscovered by the author and his supervisor Maria Klawe.4.1 Speeding up General Region-based AlgorithmsThe basic idea used is that instead of explicitly combining lmcps we only find the oneswhich will affect the levels of the leaves in the tree. For example, if the category of agiven region is much less than that of its neighboring regions, we know that the nodesin the region will pair up and form a balanced subtree before pairing with any nodesin neighboring regions. Hence we can process the region using a linear time method bytreating it as if it were the entire input sequence, as is shown in Figure 4.6.30Chapter 4. Speeding up Region-based Methods 31S • S S S SFigure 4.6: Can process R quicklyUnfortunately there are cases where our strategy still requires accurate knowledge ofmany of the lmcps which form in a given region. This happens when the categories ofthe adjacent regions are not too much greater than the region currently being processed.Newly formed nodes from the processed region recombine until they enter one of theadjacent regions. If the category of the adjacent regions was not much larger than thatof the processed region many new nodes will enter it. For example, if the categoriesdiffer by two, and there were r nodes in the lower (larger level) region then [r/4j nodeswill enter the higher region. Although immediate knowledge of the exact weights of theentering nodes may not be necessary initially, it can happen that such information willbe required later in order to properly determine future lmcps. This would be the caseif, upon entering the new region, the input sequence is now within a factor of two. Weknow that at this stage a balanced tree will be built with some of the nodes occupyinglargest (lower) levels. It would then be necessary to have all the weight informationbecause we have to assign a set of the smallest weights to these larger levels and some ofthe new nodes may well be among them. The region-based algorithm we give accuratelyRChapter 4. Speeding up Region-based Methods 32determines the weights of all nodes entering a higher region; this strategy can save timeif the gaps between the initial regions are large enough.We follow the overall region-combining approach that we described previously. Thatis, we use a stack-based method to find regions to combine and then enter a subroutineto process individual regions. The difference will be in the region processing step. Bychecking the category of adjacent regions we can determine the height of the gap. If itis cost-effective to do so, we will use a different method than just successively combininglmcps until all the nodes enter one of the adjacent regions. We also sort the list of newlycreated crossable nodes into adjacent regions before merging them into adjacent regions.As we will see, this does not increase the time complexity of the method, as we spend atleast this much time processing the region itself. (This is clearly the case if the regionprocessing step given in chapter two is used; processing a region of size r cost O(r Ig r)time.)Region Sh{ AAAA}h-1Region RFigure 4.7: Gap between regionsAssume that there are r nodes in the middle region, R, which is the one currentlybeing processed. If we were to follow the basic region-processing procedure, we wouldChapter 4. Speeding up Region-based Methods 33spend O(r ig r) time combining the nodes present to form a new region whose categorywould be one greater than the first, but would have half the nodes. This would continueuntil eventually the newly formed region would merge with one of the adjacent regions.The alternative is to use the idea of the linear time approach for weights within a factorof two to predict the values of the nodes in the last newly formed region S just before itmerges with one of the neighboring regions. In this way, we will be able to construct Swithout computing any of the intermediate regions in between. Provided the gap heighth is large enough, the time spent will be less than O(r lg r). Indeed if the gap is largerthan [lg r] then we know that we will form a complete balanced binary on the regionbeing processed, and that the root of this tree will later combine with a node in one ofthe adjacent regions. As was mentioned before we could thus use the algorithm given inthe previous section to build this tree in linear time. We will begin with describing themethod for predicting the nodes in the last newly formed region that would be formedby any lmcp-combining strategy and then we will provide a criterion on the gap heightfor its cost effectiveness.Let the gap height between R and its lowest adjacent neighbor be Ii, as picturedin the previous diagram. We wish to find the nodes in S immediately below the lowestadjacent neighboring region without going through the expense of computing every regionin between. Expand r asr = 2a1(2a2(. . . rk_2(2ak_1(rk + 1) + 1)...) + 1).This takes O(lg r) time. Note that E = [lg r]. This expansion will let us determinethe partition of the original input sequence according to which nodes they descend fromin S. By summing the weights of the nodes in a given partition, we compute the weightof their common ancestor in S. The next theorem shows how to find the partition of Rbased on the expansion of r. It assumes that all the nodes in B are crossable. If they areChapter 4. Speeding up Region-based Methods 34not all crossable then additional time must be spent in order to pair off all noncrossablenodes first. This will only require 0(r) time, since any sections of crossable nodes are insorted order. It is not necessary to re-sort the crossable nodes afterwards.Before we formally state a theorem about how R is partitioned, we will provide somenomenclature for describing the intermediate combination steps in the transitory regionsformed between R and S. Assume that all nodes in R are crossable and that they arelabelled v1 to V7. If r is even, then the lmcps we find and combine will be in order(v1,v2), (v3,v4),. . . , (Vr_i, V7). At this point we will have formed a new region. Call thisone r-step. If r is odd then the lmcps formed will beAlso call this one r-step, even though two nodes have paired twice. This is pictured inFigure 4.8. In either case, after one r- step a new region is formed.V1 V2 V3 V4 V5 V6 V7 V8 V9 V,0 V11 V12 V13 V14 V15 V16 V17 V18 V19Figure 4.8: One r-stepAs was seen in the proof of (the first theorem of this section), the largest node V19 isa child of the largest node in the next region, and this node is unique.After a number of r-steps the new region created will either have its smallest nodeat category c — 1, where c is the category of the nearest adjacent region or consist ofa singleton. In both cases we must stop. In the former case we can no longer assumethat all nodes present will combine with one another. This is because the largest nodecan now be at category c, and thus be at the same level as one of the adjacent regions.Chapter 4. Speeding up Region-based Methods 35For example this would happen if in the last diagram w1 = 2, w2 = 2, w19 = 3.9 andthe category of one of the adjacent regions was 2. Assuming this is not the case, aftera couple more r-steps we will have formed the forest shown in Figure 4.9. Notice thatFigure 4.9: Forest formed after three r-stepsthat the smallest six nodes have increased their levels by 4, whereas all the others haveincreased their levels by 3. The next theorem shows it easy to predict these numbers.Theorem 4.1 Let r = 2a1(2a2(.. . 2ak_2(2ak_1(2ak + 1) + 1)...) + 1) and assume that jr-steps occur (j < [ig r]). Let .s be maximal such that a < j. Then there will beexactlyf 2(2 + 2aa + ... + 2a1++as) if k> 1 and s existscr,j = max1. 0 otherwisenodes in R which have increased their level by j + 1 and the remaining r — Cr,j will haveincreased their level by j. Furthermore the nodes in the region formed after the j-th rstep will have weights in order, (Wcr,,+i +.. + W,.+21), (Wcrj+2j+i + + Wcrj+2j++i)• •. (Wcr,j+(m_i)2i+1 + •.. + W j+m2i), (w1 + • •• + w + WCrJ+m2j+1 + ... + Wr) wherem=[r/2j]—1.Proof: The proof of this theorem mirrors the one given for Theorem 3.1. We knowthat in any r-step where the number of nodes present is odd, the smallest two nodes willv1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19Chapter 4. Speeding up Region-based Methods 36combine to form a new node, and this new node will then combine with the largest nodepresent, as was pictured in the last figure. We also know that this new node will be thesole largest node present in the next region. It was this observation which showed thatany node would have its level increased by 2 during at most one r- step; after this, itwould be a descendant of a largest node, whose levels only go up by one every r-step.The formal proof is by induction on lv. The base case of lv = 1 is simple; r is a power oftwo, and so there are never any leftover nodes to combine after any r-step. Assume thatthe theorem is true for all lv up to in— 1. We need to show it works for lv = in. Considera1.If a1 = 0, we know that there are an odd number of nodes present. Hence the twosmallest nodes present will have their levels increased by 2 in the first r-step. The numberof nodes which have their levels increased by two will thus be 2 plus twice the number ofnodes in the next level which have their levels increased by two. This is because everynode at the next level represents two in the original level. In the case in> 2, by inductionthis number is2 + 2(2(2a2_1 +2a—1-J-a + r2_l+aa+a4 + . . . +2a21.ds))= 2(20 + 2a2 +2a-J-a3 + a2÷a3÷a4÷+2a1++as)= Cr,j.If in = 2, note that s = 1 (otherwise there are not enough nodes to combine) and so thenumber of nodes which increase their level by 2 is 2 + 0 = 2(a1) = Cr,j.If a1 > 0 then there are an even number of nodes present and so in the next r-step,every node will have its level increased by exactly one. Thus the number of nodes whichhave their levels increased by two will be twice the number of nodes at the next levelthat do. Thus, by induction,2(2(21 + r1’2+2a,1+a3 + . . +2a1—1+...-f-aa)) =Chapter 4. Speeding up Region-based Methods 37nodes have their levels increased by two.It remains to show how the nodes are partitioned. We will use induction on thenumber r-steps to do this. The statement about partitioning is obvious for one r-step,regardless of whether r is even or odd. Assuming it is true for j — 1 r-steps, we shall proveit for j. Let the nodes in order be v1,. . . , v7. If a = 0, then there are an odd number ofnodes present. After one r-step the nodes in order are v3,4v5,6... , Vr_2,r_1, V(i,2),7. Byindnction these nodes get partitioned as pictured in Figure 4.10.C[r/21,fi 2,1-1 2JCr,j 2.1 2Figure 4.10: Partition according to inductive hypothesisObserve that Cr,j = 2 + 2C[r/],j_l. From the diagram and the inductive hypothesis,we see that the first Cr,j nodes and the r— (Cr,j+m21) largest nodes in 11 share a commonancestor in the region formed after j r-steps, . It is also clear that the rn2’ nodes inbetween are partitioned in sections of size 2. If a1 > 0, then r is even and it is easy tosee that Cr,j =2C[r/2],j_l and the partition of the nodes in R is as stated. UHaving shown how to predict the nodes formed after j r- steps, we now need to findout when it is economical to do so.Theorem 4.2 Let the region currently being processed be R, and assume it contains rChapter 4. Speeding up Region-based Methods 38nodes. Let the gap height between R and its lowest adjacent neighbor be h. We candetermine the region S formed after h— 1 r-steps in O(rlg(fg)) time.Proof: There are approximately (fr) nearly evenly spaced values to find in R. Usingthe linear time selection algorithm we can find the middle one in 0(r) time. Split thelist on this value into two approximately equal halves and again find the middle value ineach; this will take 0(i) + 0(i) = 0(r) time. After O(lg(ç)) steps, we will know thepartitioning of R according ancestors in the new region 5, at a cost of 0(r lg(*)) time,and so we can accurately form the nodes in S by just summing the weights in each nodespartition of R. UNow use the conventional means of combining lmcps in S until all nodes have entered the nearest adjacent region and we are finished. This only costs an additionalO(()lg()) steps, including the time necessary to sort the list of entering nodes, sothe entire running time is still 0(r lg()).4.2 The Constant Factor CaseWe now outline the linear time algorithm for weights within a constant factor. That is,for the case where there is a constant C such that max{w1}/min } < C. We make useof new linear time generalized selection algorithm discovered by Maria Klawe. Becauseof the complexity of the result and its recent discovery we only provide an outline of themethod and refer the reader to a forthcoming (as of August, 1992) paper of the authorand Maria Klawe.As before, our objective is to determine the levels of the leaf nodes in the lmcp tree.The underlying idea is an extension of the techniques in the case where the input weightsare within a factor of two. Specifically, we use a region-based method to process theChapter 4. Speeding up Region-based Methods 39weights region by region in increasing order by category number until we are left with asingle region of crossable nodes. At this point we apply Theorem 3.1 to determine thelmcp tree levels of the nodes in this final region, and then work backwards from thisinformation to find the lmcp tree levels of the original weights.There are a number of fundamental problems that arise in implementing this strategy.The first is that given a region of r nodes that may be either crossable or noncrossable,finding the lmcps that will be formed out of the region can take IZ(r lg r) time since wemay be forced to sort the crossable nodes. Thus we cannot afford to actually determinewhich nodes pair together in lmcps, nor the weights of the lmcps formed. Instead we onlyobtain coarser information about the structure of the lmcp tree. In particular, we willdetermine groups of nodes that pair together in the sense that the lmcp partner of eachnode in the group is also in the group. As this procedure is iterated, we must be able to“process” regions where we only have a coarse description of the region. Specifically thedescription of a region is an ordered list of node-groups, where each noncrossable node isa singleton node-group, and each crossable node-group, G, is described by an node-grouplist representing the children of the nodes in G. For each node-group we will know thenumber of nodes in the group, and the largest and two smallest weights of the nodes inthe node-group.Working with node-group lists introduces a number of technical challenges. For example, given a node-group list for a region, 1?, we must construct a node-group list forL, the set of lmcps containing nodes in R. If the number r of nodes in R is even, becausethe regions adjacent to R have higher category numbers, it is easy to show that the lmcppartner of each node of R is in R. This makes constructing the node-group list for L fairlytrivial (i.e. it consists of a single node-group), except for the problem of determining thelargest and smallest weights of the nodes in L. The reason we need to know the largestand smallest weights of the nodes in each node-group arises from handling the case whenChapter 4. Speeding up Region-based Methods 40r is odd. In this case, R has exactly one node, a wallflower, whose lmcp partner is not anode in R. We need to know the largest weight in each node group in order to find thewallflower, and we will need to know the two smallest weights in order to know, at thenext stage, which node will be the lmcp partner of the wallflower. Our general strategyfor efficiently computing the largest weight among the lmcps that will be produced froma region, is to construct (in linear time) a list of weighted nodes for which we can obtainall the lmcps in linear time, and to prove that the largest weight lmcp from this list isthe same as that from the region. We use a similar method to find the two smallestweights. The choice of these approximate lists is fairly natural, but proving that theypossess the appropriate extremal lmcps requires establishing a number of tools to dealwith how modifying work lists affects the extremal lmcps.Another problem that arises is in applying Theorem 3.1 to the final region of crossablenodes. In the case where the original weights differ by at most a factor of two, we knowthe actual weight of each crossable node in the region and hence can apply the usuallinear-time selection algorithm to find the k smallest weight nodes for the appropriatevalue of k. In the constant factor case, however, we only know the largest and twosmallest weights in each node-group, which makes selecting the k smallest weight nodesmuch more difficult. What we need is a linear-time technique for producing a node-grouplist that represents the k smallest weight nodes, and a node-group list representing theremaining nodes in the region. From these we can work backwards to deduce the lmcptree levels of each of the original weights.The linear-time algorithm for selection in node-groups consists of two pieces. Thefirst piece is a generalized selection algorithm, which given a collection of sets and a“fast” selection algorithm for each set, does “fast” selection on the union of the sets.Here the term “fast” selection algorithm for a set A means that for any a between 0 and1, in time linear in the number of nodes in A we can partition A into A1 U A2 accordingChapter 4. Speeding up Region-based Methods 41to the following conditions. First, we have “fast” selection algorithms for both A1 andA2; second, every element in A1 is less than or equal to every element in A2; and third,A1 = [cyAj. The other piece of the node-group selection algorithm is a technique thatdoes the following task: Given “fast” selection algorithms for each node-group in a node-group list for a region R and an integer k, produce a node-group list that represents the ksmallest weight nodes in L, the set of lmcps constructed from 1?, as well as a node-grouplist representing the remaining nodes in L. By iteratively interleaving these two pieceswe eventually obtain an algorithm which performs the desired selection in the final regionin time linear in the number of weights in the original list (though as might be expected,the constant depends on C, the bound on the ratio between the original weights).4.3 The Exponentially Separated CaseIn this section we show that the basic region processing method yields an 0(n) timealgorithm for the case where the input weights are exponentially spread-out. Formallywe say the the input weight sequence w1,.. . , w, is exponentially separated if thereexists a constant C such that for all n,I{i: [lgw]= k} 1< CforallkE Z.Theorem 4.3 If the input weights in1, . . . w are exponentially separated then we canbuild an optimal alphabetic binary tree on them in 0(n) time.Proof: It is straightforward to observe that there are at most 2C nodes in anyintermediate region we process. Since this is the case, we can easily process every regionof size r in 0(r) time. Since this is the case, we can afford to use the basic regionprocessing method to build the entire lmcp tree in 0(n) time. In an additional 0(n)time we can build the optimal alphabetic binary tree from the leaf-levels in the lmcpChapter 4. Speeding up Region-based Methods 42tree. 0Chapter 5Hardness ResultsWe begin with a simple hardness result that shows constructing the intermediate lmcptree produced by Hu-Tucker based algorithms in any model of computation is at leastas difficult as sorting in that model. We also give a more complicated reduction fromsorting to constructing the optimal alphabetic tree by means of a region-based method.At this point we are uncertain what model of computation is realistic to use for thisproblem, so we do not state explicit lower bounds for a given model, but rather showthat the problems stated are at least as hard as sorting in whatever model used. Givenan unsorted list of n numbers, in 0(n) time we show we can transform the problemof sorting the list into a problem about constructing a tree, and from the informationrecorded in the structure of tree produced, compute the sorted order of the numbers in0(n) time. Hence, if a w(n) time lower bound exists for sorting in the model, then wehave non-trivial lower bound on constructing the tree.One easy way of getting a trivial lower bound is counting the number of possibleoutputs the algorithm must generate. In any binary decision tree model in which outputsappear at distinct leaves, the algorithm must be able to differentiate each of the differentoutput cases, and so any decision tree must have maximum depth at least lg s, where.s is the number of different solutions to the problem. A trivial 12(n) time lower boundfollows easily by showing that the number of possible alphabetic trees built on a sequenceof n weights is To see this, let c,,, be the number of alphabetic binary trees with nleaves. As the root of the tree can split the input list in any of n — 1 places it follows that43Chapter 5. Hardness Results 44C = C1_ +c2,_ + .. . c_1 for n > 1, with c1 = 1. The solution of this recurrenceis given by the Catalan sequence, {C}, [9], with c = C_1==5.1 Finding the lmcp tree is as Hard as SortingLemma 5.1 Let x1,x2,. . . ,x, be distinct real numbers drawn from [2,4). Let yt =X[/2]i, for i = 1 . . . 2n. If (yi,.. . , Y2fl) is given as input to any lmcp finding algorithm, the set of the first n lmcps found, disregarding order, will be{(yi, y2), (y, y4),... , (yfl—i, y2fl)}.Proof: First note that 1 <j <2 for i = 1,.. . , 2n. Thus each new node formed froma pair will have weight at least 2. This means that any noncrossable node would preferto combine with another noncrossable node, and thus the first n lmcp combinations arebetween noncrossable nodes. Since we are only concerned with the first n lmcps, we mayassume that we just delete any two nodes involved in an lmcp. This ensures that alllmcps found are between consecutive nodes in the list. Moreover, it suffices to show thatif (y, yi) is among the first n lmcps formed, then i is odd. Since (yi, is an lmcpwe have Yi—i > Yi--i and y yi+2. If i were even it would follow that Yi—1 Y and= Yi+i which is a contradiction. Hence i is odd and the result follows. LiTheorem 5.2 We can reduce sorting to finding the lmcp tree in 0(n) time.Proof: Assume n is even. Let x1,x2, . . . , x be drawn from [2,4). Define the y asabove and consider the behavior of some lmcp-combining algorithm on the input sequenceYi, .. , Y2n According to Lemma 5.1, after n lmcps have been combined there will be nnodes present in the node list. They will have weights x1,. . . , x,. Since these nodes are allChapter 5. Hardness Results 45crossable there will be only one lmcp present; the smallest pair of nodes in {x1,. . . ,This pair will combine to form a new node having weight at least 4. The next lmcp willbe the second smallest pair of nodes from {x1,... , x,} and so on. Hence the next n/2lmcps found after the first n lmcp combinations have occurred sort {x1,. . . , x} by pairs(only consecutive pairs may need to switched in order for the list to be totally sorted).This information can easily be recovered from an lmcp tree produced by any method bysearching it depth-first, always searching the least weight subtree first. We will encounterthe nodes corresponding to {x1,. . . , x,} in fully sorted order (a node with weight x willbe the parent of leaf nodes with weights Y2—1 and y2i). Hence we can reduce sorting tofinding the lmcp tree in 0(n) time. D5.2 Hardness Result for a Class of Region-based MethodsIf it were possible to process any region of size r in 0(r) time, it is easy to show thatthe region-based method of chapter two would then yield a linear time algorithm forfinding optimal alphabetic binary trees. Since we need only find the level of the nodesin the lmcp tree, and not its actual structure, it is natural to see if we might be able toavoid finding all the lmcps in every region we process. For example, we know that everynode will pair up with another in the region we are working on, except possibly the lastnode when the number of nodes in the region is odd. This last node may form an lmcpwith one of the newly created nodes, or with an accessible node in one of the adjacentregions. This suggests that we just need to know the smallest two nodes (they pair upto form the smallest newly created node) in each new region when we start processing,as we can then resolve in constant time which of the possible three alternative nodesthe leftover node combines with. In light of this, perhaps there is an efficient way ofChapter 5. Hardness Results 46maintaining this information for every region we create? Unfortunately, the answer isno. We show that any algorithm which follow this strategy will essentially sort the theinput weights. The algorithm need not find all the lmcps, it need only find enough ofthem to compute the two smallest nodes in each region. On the surface this might bepotentially easier to do. For example, in the case where all the weights are within afactor of two and n is a power of 2, it can be done in 0(n) time by combining the firstn/2 lmcps (to ensure that every node is crossable). Let them be, in increasing order,x,q2. We can find x1,x2,x4,.. . , X,/4 (the median) in 0(n) time by using thelinear time median-finding algorithm [7] to find x,q4 in 0(n) time, then using it again onthe list of weights less than or equal to to find x,q8 in 0(n/2) time and so on. In anadditional 0(n) time we can scan through the list of nodes and determine whether theyare in [xi, x2], [x3,x4], [x5, xs],...,[x1q4i, Xn/2]. It is clear from Figure 5.11 that we cannow easily predict the two smallest nodes in the k-th region formed after the first n/2lmcps have combined. The smallest node will be the root of a balanced tree built on theweights x1, .. . x2 and the second smallest node will be the root of a balanced tree builton the weightsx2k+i,.. .,X2k-I-1.x1 x2 x3 x4 x5 x16x8 x9Figure 5.11: Simple caseChapter 5. Hardness Results 47Theorem 5.3 We can reduce sorting to constructing a tree with the same leaf-levels asthe lmcp tree by processing regions in such a way that the two smallest nodes in everyregion root the same leaves (and thus have the same weights) as they do in the lmcp treein 0(n) time.Proof: We consider the behavior of any region-processing algorithm which accuratelyfinds the smallest two nodes in each region on a class of inputs which have the propertythat with this information alone we know the structure of the lmcp tree. The inputsequences we consider consist of approximately /i regions, each containing about ./inodes, and such that the category of a given region is one more than the region on itsleft. The nodes in a region are distributed in such a way that in the lmcp tree theyinterleave with the newly formed nodes created from the region to the left. It turns outthat insisting that the two smallest nodes in each region be accurate in the sense thatthey root exactly those leaves that the corresponding node in the lmcp tree does, forcesenough structure on the tree being built, that it is essentially the lmcp tree up to somelinear time reordering of nodes. From this tree it is possible in linear time to sort theentire input sequence. Finally, there are enough different input sequences in this classthat it is as hard as sorting in the general case.Assume n = k2 + 3k + 4, where k is a positive integer. Begin with 21 real variablesx1 < x2 < ... < Z2k--i drawn from [2,4). (We assume that they have been sorted inascending order.) The input list will consist of weights which form successively increasingregions. The first region will contain weights values in [1,2), the next [2,4), then [4, 8),etc. Denote the j-th value in the i-th region by yjj. The first region will have 4k + 4weights, the remaining have 2(k— 1), 2(k — 2), 2(k— 3), . . . , 2 weights respectively. Notethat 4k+4+2(k—1)+2(k—2)+...+2=k2+3k+4. Thevaluesforthe{y1}willbe determined from the {xj}. As the proof depends on the crossability of nodes, theChapter 5. Hardness Results 48values come in pairs so that the leaf nodes initially combine in pairs (this will be provedin Lemma 5.4).Consider the following recursively generated binary tree, built from the {x}. Ifinternal nodes are assigned the sum of the weights of their children, then it has theproperty that the left child of any node is always less than the right.Figure 5.12: Tree generated from {x}Figure 5.12 the tree built for k = 3. The tree built for k = 2 is the subtree rootedat the left child of the root. The tree for k = 4 has this tree as the left child of its root,with the right child of the root consisting of arm with leaf weights x17 + .. . + x24,x25 ++ x28,x29 + x30,x31,x32 from left to right.The purpose of this tree is to assign values to the {yjj}. Randomly distribute consecutive pairs (y1,j, yijjj, j = 1,3, . . . ,4k + 3, among the 2k + 2 lowest terminal leaves inthis tree. For j = 1,.. . ,4k + 4, let yj be half the weight of the leaf that it is associatedwith. Then assign values to consecutive pairs of the 2(k— 1){y2J} by distributing themamong the next lowest terminal leaves and so on. This new tree is called the orderingtree, and is shown in Figure 5.13. It records how the weights were assigned, and alsotheir sorted order.x1 x2 x3 x4 x13+x4 x15 x115Chapter 5. Hardness Results 49Yl,3 >‘l,4 ‘1,9 >‘lIOFigure 5.13: The ordering treeThe input weight list is as follows, with regions distinguished by height.Y3,1, Y3,2112,1, 112,2, 112,3, 112,4Yi,i, 111,2, . . . 111,15, 111,16Now consider the behavior of any region-based algorithm which finds the smallest twolmcps in every region. The region chosen to process will always be the first one on theleft, as the regions present are always sorted in increasing order. We also note that thereare always be an even number of nodes in every intermediate region. To finish the proofof the main result, we first need a lemma.Lemma 5.4 If the children in the lmcp tree are ordered according to weight, then thelmcp tree is isomorphic to the ordering tree.Proof: We may assume that we begin by combining all the lmcps in the lowest (largestlevel) region. From Lemma 5.1 we know that since the weights come in consecutive pairsof the same weight, these pairs will eventually form lmcps and combine, in agreementwith the ordering tree. At this stage the lowest region in the work list consists of crossabley1,7 I.S ‘l,l3>’l,14”1,1I Y)12, Y,,2 Y115,65‘),6y2jChapter 5. Hardness Results 50nodes interspersed with some noncrossable ones, which again come in pairs. It is easilyseen from the ordering tree that there is always an even number of crossable nodes smallerthan the consecutive pairs of noncrossable nodes in the lowest region. Thus we know thatthese crossable nodes will pair off first, and then the consecutive pairs of noncrossableleaf nodes will pair off as is shown in the ordering tree. It is clear from the ordering tree,that this process continues and the lmcp-tree, with every internal node’s children orderedby increasing weight is isomorphic to the ordering tree. CWe can now easily predict how the weights will be distributed on the tree T that ouralgorithm produces. We know that the last region processed will contain just two nodes.Since these nodes will be the two smallest nodes in that region, their weights must matchthe weights of the same nodes in the lmcp tree. The smallest node will thus root all theleaves on the left branch of root the ordering tree, while the second smallest (the largest,in this case) node will root all the leaves on the right branch. In time proportional to thenumber of leaves we find, we can traverse the right branch of our tree and find all theleaves and hence weights {y} that are on the right branch of the ordering tree. Sincethere are only a constant number of leaves per level, we can afford to sort each level, andhence begin sorting each of the regions in the initial input list. We now use this idearecursively on the subtree rooted at the smallest node of the last region. This lets usfind all the leaves in the right branch of left branch from the root in the ordering tree.Again, we may sort the weights present by region, and append them to the beginning ofthe sorted region lists created previously. This will take time proportional to the numberof nodes in this branch. By repeating this process, we will completely determine everyinput weight’s location in the ordering tree, and from this information produce sortedlists of the weights in each region in the input. All this takes only 0(n) time to do, onceChapter 5. Hardness Results 51the tree T is known.We have not reduced the problem of sorting arbitrary sequences to using region-basedmethods to construct the optimal alphabetic binary tree because the sequences that wesort are subject to the restriction that the first 4k + 4 weights come before the next2(k— 1) which come before the next 2(k —2) and so on. However, this problem seems tobe, asymptotically speaking, just as hard as sorting. An informal information-theoreticargument for this is that the same order of bits must be given to specify the sorted order.The total number of different orderings of the input sequences subject to the conditionthat the first 4k + 4 weights come before the next 2(k — 1) which come before the next2(k— 2), etc. is(2k + 2)!(k — 1)!(k — 2)! . . . (2)! > ([k/2j!)’’1 1 > = Q(ke(k2))Since k = e(4), this number is Q(°01), which the information necessary to specifythe sorted order for arbitrary sequences. UChapter 6ConclusionsIn this thesis we have extended the ideas of Hu and Tucker for constructing optimal alphabetic binary trees. In particular, we have used their basic idea of tmcp tree constructiontogether the new idea of region-processing to give 0(n) time algorithms to solve the caseswhere the input weights are within a constant factor, or exponentially separated. Theconstant factor case makes use of a new technique for doing generalized selection in 0(n)time discovered by Maria Klawe. We show that any natural method employing eitherthe idea of lmcp tree construction or the idea of region-processing may force us to sorta substantial amount of the input. We also make some observations about speeding upregion-based methods in the general case.The basic question of whether there is a general o(n lg n) time algorithm for findingoptimal alphabetic binary trees for this problem remains open, though it is conceivablethat the ideas in the proof of Theorem 5.2 could be extended to obtain a superlinearlower bound for the general problem in a weaker decision tree model.52Bibliography[1] T. C. Hu and A. C. Tucker. Optimal Computer Search Trees and Variable-LengthAlphabetical Codes. SIAM Journal of Applied Mathematics, Vol. 21, No. 4, pp. 514-532, 1971.[2] T. C. Hu, D. J. Kleitman and J. K. Tamaki. Binary Trees Optimum Under VariousCriteria. SIAM Journal of Applied Mathematics, Vol. 37, No. 2, pp. 246-256, 1979.[3] A. M. Garsia and M. L. Wachs. A New Algorithm for Minimum Cost Binary Trees.SIAM Journal of Computing, Vol. 6, No. 4, pp. 622-642, 1977.[4] E. N. Gilbert and E. F. Moore. Variable length encodings. Bell System TechnicalJournal, Vol. 38, pp. 933-968, 1959.[5] F. F. Yao. Speed-up in dynamic programming. SIAM Journal on Algebraic and Discrete Methods, Vol. 3, No. 4, pp. 532-540, 1982.[6] P. Ramanan. Testing the optimality of alphabetic trees. Theoretical Computer Science. to appear.[7] M. Blum, R. W. Floyd, V. R. Pratt, R. L. Rivest and R. E. Tarjan. Time boundsfor selection. Journal of Computer and System Sciences, Vol. 7, No. 4, pp. 448-461,1972.[8] D. E. Knuth. Optimum binary search trees. Acta Informatica, Vol. 1, pp. 14-25,1971.[9] D. E. Knuth. The Art of Computer Programming, Vol. 1: Fundamental Algorithms.Addison-Wesley, Reading, MA, 1968.[10] D. E. Knuth. The Art of Computer Programming, Vol. 3: Sorting and Searching.Addison-Wesley, Reading, MA, 1973.53
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Some new results on constructing optimal alphabetic...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Some new results on constructing optimal alphabetic binary trees Mumey, Brendan M. 1992
pdf
Page Metadata
Item Metadata
Title | Some new results on constructing optimal alphabetic binary trees |
Creator |
Mumey, Brendan M. |
Date Issued | 1992 |
Description | This thesis studies the question of whether optimal alphabetic binary trees can be constructed in o(n lg n) time. Hu and Tucker [1] gave the first subquadratic algorithm in 1971, namely an O(n lg n) time algorithm. Optimal alphabetic binary trees have many applications and the question of whether an o(n lg n) time algorithm exists has remained an open problem in data structures for the last 20 years. We show that a class of techniques for finding optimal alphabetic trees which includes all current methods yielding O(n lg n) time algorithms are at least as hard as sorting in whatever model of computation used. We introduce a new idea for finding optimal alphabetic binary trees which we refer to as region-processing. Region-processing methods have been successfully used to give O(n) time algorithms for the case where all the input weights are within a constant factor of one another and when they are exponentially separated. Although the region based method we give exhibits o(n lg n) time performance for many cases, we also show a reduction from sorting to it as well. It is possible that the ideas used in this reduction can be extended to yield a general ω(n) lower bound. |
Extent | 811368 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-01-06 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051343 |
URI | http://hdl.handle.net/2429/3384 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1992-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1992_spring_mumey_brendan_m.pdf [ 792.35kB ]
- Metadata
- JSON: 831-1.0051343.json
- JSON-LD: 831-1.0051343-ld.json
- RDF/XML (Pretty): 831-1.0051343-rdf.xml
- RDF/JSON: 831-1.0051343-rdf.json
- Turtle: 831-1.0051343-turtle.txt
- N-Triples: 831-1.0051343-rdf-ntriples.txt
- Original Record: 831-1.0051343-source.json
- Full Text
- 831-1.0051343-fulltext.txt
- Citation
- 831-1.0051343.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0051343/manifest