C - l MODEL AND SOLUTION STRATEGY FOR PLACEMENT OF RECTANGULAR BLOCKS IN THE EUCLIDIAN PLANE By AMIR ALON B.Sc, University of British Columbia, 1985 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (DEPARTMENT OF COMPUTER SCIENCE) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA May 3, 1986 "© Amir Alon, 1986 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f the requirements f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h Columbia, I agree t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and study. I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e copying of t h i s t h e s i s f o r s c h o l a r l y purposes may be granted by the head o f my department or by h i s o r her r e p r e s e n t a t i v e s . I t i s understood t h a t c o p y i n g or p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l gain s h a l l not be allowed without my w r i t t e n p e r m i s s i o n . Department of The U n i v e r s i t y o f B r i t i s h Columbia 2075 Wesbrook P l a c e Vancouver, Canada V6T 1W5 Date 7)W C. Abstract This thesis describes a nonlinear optimization model for the placement of rectangular blocks with some wire connections among them in the Euclidian plane, such that the total wire length is minimized. Such a placement algorithm is useful as a CAD tool for VLSI and PCB layout designs. In contrast to some previous placement techniques, the mathematical model pre-sented here ensures that the blocks will not overlap, and minimizes the sum of the distances of the interconnections of the blocks with respect to their orientation as well as their position. We also present mechanisms for solving more restrictive placement problems: one in which there is a set of equally spaced, discrete angles to be used in the placement, and one in which the blocks have to be assigned into predefined slots. The mathematical model is based on the Lennard-Jones 6-12 potential equation, on a sine wave shaped penalty function, and on minimizing the sum of the squares of the Euclid-ian distances of the block interconnections. We implement and embed our optimization routines in an interactive graphic block editor. We also present some experimental results which show that near optimal placements are achieved with our techniques. ' ii Contents Abstract ii List of Tables v List of Figures v i Acknowledgement vii 1 Introduction. 1 1.1 Introduction to placement problems 1 1.2 Overview 6 2 Review of placement techniques. 8 2.1 Introduction 8 2.2 Discrete techniques 10 2.2.1 Quadratic Assignment Problem 11 2.2.2 Initial placement 15 2.2.3 Placement improvement 24 2.3 Continuous techniques 30 2.3.1 Continuous placements minimizing lp norms 31 2.3.2 Continuous placements of point components 32 2.3.3 Continuous placements of rectangular blocks 36 2.4 Hybrid techniques 40 2.5 Top down design (Floor Planning) 43 2.6 Simulated Annealing 45 2.7 Hardware implementations 47 3 The mathematical model. 4 9 3.1 Introduction 49 3.2 Preliminary Definitions 50 iii 3.3 The constrained minimization problem 51 3.4 Standard penalty functions 55 3.5 Our penalty function approach 57 3.6 The fixed slots problem 62 4 The solution strategy. 65 4.1 Introduction • 65 4.2 Convexity analysis 66 4.3 Experimental evidence 68 5 Experimental results. 75 5.1 Introduction 75 5.2 The block edit program 76 5.3 Placement examples 77 5.4 Steinberg example 83 6 Conclusions and future research. 88 6.1 Conclusions 88 6.2 Future study 89 Bibliography 90 A Quasi-Newton Methods . 97 iv List of Tables 4.1 Continuation sequence for figure 4.3 73 5.1 Continuation sequence for figure 5.1 77 5.2 Continuation sequence for figure 5.3 79 5.3 Continuation sequence for figure 5.5 82 5.4 Steinberg example: previous results. . : 85 5.5 Steinberg example: experimental results 86 v List of Figures 2.1 Unconnected sets 12 2.2 Linear to two dimensional placement 17 2.3 Cluster tree 19 2.4 Cluster placement 20 2.5 Generalized Force Directed Relaxation 26 2.6 Cut lines 27 2.7 Pins interchangeability. 28 2.8 Rotating scan line 29 2.9 Scan lines obstructions 30 2.10 Block modeling . 38 2.11 Force directed placement of two blocks 42 2.12 Chip area calculation 44 2.13 Neighborhood exchanges. 47 3.1 Terminal coordinates 52 3.2 Calculating blocks separation 53 3.3 Graph of V{j 60 3.4 The function Ri 61 4.1 3 block placement using A f = A f = 1 70 4.2 3 block placement using improved strategy 72 4.3 Choosing values for A 2 73 5.1 6 block placement 78 5.2 6 block placement with 45° angles 79 5.3 10 block placement 80 5.4 Placement problem where initial placement is important . 81 5.5 10 Block placement into fixed slots 82 5.6 Steinberg example 84 5.7 Lower Bounds 87 vi Acknowledgements First and foremost many thanks to my supervisor Dr Uri Ascher for his careful and helpful suggestions while working on this thesis. Thanks are also due to Dr Bob Wood-ham, the second reader, as well as to Keith, Mark, Tian, Luping and Ron for many helpful discussions and for proof reading this thesis. Many thanks to NSERC for fi-nancially supporting me during my graduate work. Last but not least I thank my wife Yael for bearing with a single minded husband who spends his romantic nights" in front of a computer terminal instead of spending them in the company of his beautiful wife. vii Chapter 1 Introduction. 1.1 Introduction to placement problems Layout problems are optimization problems in which a layout designer is given a set of objects which he needs to arrange on the plane so as to optimize some topological properties resulting from the arrangement. Layout problems can usually be decomposed into two related subproblems: placement and routing. The placement problem consists of finding the location of each of the objects while the routing problem consists of connecting the objects together according to a given specification. In this thesis we consider the placement problem for rectangular blocks of various dimensions and orientations in the plane. Each block has a set of terminals (or ports) which are fixed points on the boundary of the block. Also as part of the input we get a list of pairs of terminals that are to be connected to each other. At times, additional restrictions are placed on the choice of orientation or position of the blocks. The objective of our optimization is to place the blocks such that the total sum of the 1 CHAPTER 1. INTRODUCTION. 2 lengths of the interconnections is minimized without having any two blocks overlap, and satisfying possible additional orientation and location constraints. Instances which are some variations of this placement problem arise in operations research location/allocation problems [GILM 62,HALL 70], as well as in office lay-out problems [DREZ 80]. The two most common applications which motivate us are Printed Circuit Boards(PCB) [STEI 61,GILM 62,CART 79] and Very Large Scale In-tegration(VLSI) layouts [CHEN 84,CHYA 83,CROW 83,FUKU 83,GOTO 79]. In the layout design for PCB, the blocks are electrical components which must be connected to other components via pins (terminals) on their boundaries. The design of PCBs dictates that each of the components must be placed in one of the predefined grid loca-tions. Also, depending on the specific PCB, components may be allowed to be placed in only two or four orientations. Due to the increased complexity of VLSI designs, the use of a hierarchical design methodology is warranted. In hierarchical VLSI design the chip is partitioned into func-tional components, each of which is recursively subdivided again. A CPU, for example, may be subdivided into an ALU, ROM, Register file and more. The register file may be subdivided into, say, sixteen registers each of which may again be subdivided into sixteen bits and so on'until all components have been partitioned into the elementary electrical devices. At each level of the hierarchy the components need to be placed and routed with the other components on the same level of the hierarchy. Depending on CHAPTER 1. INTRODUCTION. 3 the particular design process, various constraints may apply here. Components may have to be placed in two, four, or even eight orientations if the design process allows for 45° geometries. We must bear in mind that many factors are involved in producing a good placement. These include the total wire length, wire crossing, heat dissipation and total circuit area [HOFF 79,ODAW 85,QUIN 79]. Note that while total wire length does not represent all of these design goals exactly it is reasonable to assume that overall shorter wires lead to less circuit area, less resistance, and less wire crossing. We have adopted the total wire length measure, as most placement algorithms do [GOTO 79,FUKU 83,KOZA 83,HOFF 79], since it also gives a reasonable indication as to how expensive it is to construct the detailed routing of the wires after the blocks have been placed. In fact, since the wires are measured as straight lines, the obtained total wire length gives a lower bound on the total length of the detailed routing. Even after choosing the objective function to be the total wire length, placement problems are still computationally hard [BREU 76b,HOFF 79,GILM 62]. Mathemati-cally, we obtain a nonlinear optimization problem, with some of the variables possibly restricted to have integer values only. As a result of the difficulty of the placement problem and the heuristic meaning of "best" placement, most of the work on the sub-ject is concentrated on getting a "good" placement only. By a "good" placement we mean that the blocks are placed in good topological relation to each other but are not CHAPTER 1. INTRODUCTION. 4 necessarily separated by the exactly required distances between them. Thus, we accept certain suboptimal solutions, and even allow the constraint of separation of the blocks to be slightly violated. This is an approximate solution of the optimization problem, solving a reasonable relaxation of it since the exact distance separating the blocks is not known anyway until the routing phase which follows the placement. Most of the placement techniques reported in the literature have concentrated on placing a large (say > 50) number of blocks, ignoring their size, orientation and posi-tion of terminals [QUIN 79,RICH 84,ROTH 83]. Placement techniques can be roughly classified as discrete or continuous. A discrete placement technique assumes a set of discrete locations (usually on grid lattice) on which the blocks are to be located. Con-tinuous placement techniques assume placement in the Euclidian plane (or a bounded part of it). The choice between continuous and discrete techniques varies with the spe-cific placement model at hand. P C B placements, for example, are naturally modeled as discrete, while VLSI placements are naturally continuous or mixed. The computational difficulties associated with each technique are also an important factor. Discrete tech-niques offer more exact representations but are usually less tractable than continuous ones. Discrete placement is usually performed by a two phase algorithm. An initial con-structive placement phase is followed by an iterative improvement or refinement phase. The iterative improvement phase is a crucial phase, and usually involves choosing two CHAPTER 1. INTRODUCTION. 5 blocks and interchanging their position if this reduces the total wire length [HOFF 79,IOSU 83,KERN 69]. Alternatively a discrete model can be used to formulate an in-teger programming problem [LOVE 76] or to be solved via branch and bound methods [PIER 71,GILM 62,BAZA 83]. Other placement techniques use a continuous model, usually drawing on physical phenomena as a model for placement. These include Hooks law force directed models [QUIN 79] and Resistive network optimization [CHEN 84]. The solution techniques here range from nonlinear mixed integer programming [MARK 84] to statistical an-nealing methods [VECC 83]. Also available are various hybrid models, usually using a continuous algorithm for getting an approximate placement and a discrete algorithm for mapping the result onto grid locations. We have chosen a different continuous model for placement: that of the potential energy between particles [ALON 85]. We use twice differentiate nonlinear penalty functions to represent the constraints of the model and avoid the ill-conditioning of some previous formulations [SHA 85]. We show how to effectively solve the nonlinear unconstrained optimization problem which results from our formulation. We model the problem in more detail than has been previously done by taking into account the size and orientation of the blocks, as well as the positions of the terminals on the blocks edges. As a result our mathematical problem is computationally more involved. This is not necessarily a severe limitation since our procedure can be used interactively to CHAPTER 1. INTRODUCTION. 6 aid a human designer, or it can be used in an hierarchical design where the number of blocks in each level of the hierarchy is limited. 1.2 Overview. In chapter 2 we provide the reader with a survey and discussion of existing placement algorithms. In particular we discuss related operations research location/allocation literature, initial placement and placement improvement algorithms, floor planning algorithms, simulated annealing type methods, and some possible hardware implemen-tations. In chapter 3 we introduce our nonlinear model for the placement problem. We describe specialization of the model which includes restricting the allowable orientation of the blocks as well as restricting the placement to fixed grid points. We describe how to restate the model as an unconstrained nonlinear optimization problem by replacing the constraints with penalty functions. In chapter 4 we present our solution strategy. We analyze the convexity qualities of our model and see how they suggest a solution strategy. We explore various tuning strategies for parameters in our objective function and discuss the tradeoffs affecting these parameters. In chapter 5 we present an experimental interactive placement program. We describe an interactive block editor used to input and edit placements and run the optimization CHAPTER 1. INTRODUCTION. 7 routines. Also in chapter 5 we present and discuss some tested placement examples. In chapter 6 we present our conclusions as well as some of the future work which may complement this research. In Appendix A we briefly review nonlinear unconstrained optimization by quasi-Newton methods, which is a mathematical technique necessary for the understanding of our approach. Chapter 2 Review of placement techniques. 2.1 Introduction. In this chapter we review previous placement algorithms. We classify previous work as based on the discrete or continuous model used for the placement problem. A discrete algorithm assumes a discrete set of locations into which the blocks are to be placed. A continuous algorithm assumes that the placement is to be done in the Euclidian plane. We also present in this chapter a class of hybrid placement algorithms which compute placements in two phases. First a continuous phase approximating a discrete model, followed by a discrete phase which attempts to change the placement from the first phase as little as possible. Another type of algorithm uses statistical distributions for placement. A statistical typed algorithm does not assume restrictions on the placement of the blocks, but the existence of some mechanism which successively produces different placement configu-rations and therefore can apply, as a solution technique, to both discrete and continuous 8 CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 9 models. Alternatively we can classify placement algorithms as bottom up or top down. Bot-tom up algorithms assume that all the blocks to be placed have been already specified, i.e. their sizes, and location of terminals are known. Top down algorithms, commonly referred to as floor planing assume that the block parameters are not yet known and can only be estimated. In floor planning the block positions and locations are first es-timated. This step is followed up by recursively constructing each of the blocks. Most of the algorithms reviewed in this chapter are of the bottom up type, Section 2.5 is dedicated to the review of top down techniques. We note that here, as in many design problems, the choice of bottom up or top down methodology is up to the personal pref-erence of the circuit designer. Successful design methodologies will probably include a mixture of both. Judging the quality of the various placement algorithms in comparison to each other is an uneasy task. Since most of the algorithms are heuristic, experimental comparison seems to be the right tool. Most authors, however, present their own test cases making it very hard to compare. An exception to this is the Steinberg example which was tested on many of the better placement algorithms [HALL 70,GRAV 70,HILL 66,CART 79,CHEN 84,GILM 62,STEI 61,BAZA 83]. We summarize these experimental results including our own in table 5.4. We conclude this chapter by discussing some suggested hardware implementations for some of the reviewed algorithms. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 10 2.2 Discrete techniques. Discrete placement problems arise often in placement of components onto a PCB [STEI 61,GILM 62,PIER 71,PALC 84,HOFF 79], as well as in the design of master slice LSI and gate arrays [HOFF 79,KOZA 83,GOTO 79,KIRK 83] and in the location of interacting facilities in operations research literature [PIER 71,LOVE 76,BAZA 83]. Certain discrete placement problems can be formulated as the well known, nonlinear, Quadratic Assignment Problem (QAP) [STEI 61,GILM 62,PIER 71.LOVE 76,BAZA 83]. The QAP can be solved optimally by very expensive branch and bound techniques [GILM 62], or suboptimally by branch and bound heuristics [GILM 62,BAZA 83,PIER 71]. Other heuristics for QAP involve solving a series of linear assignment problems [STEI 61,CART 79]. In §2.2.1 we present such algorithms. More common with the electrical engineering community are initial placement and placement improvement heuristics. The initial or constructive placement heuristics attempt to find, in an efficient manner, good starting placements [HOFF 79,PALC 84,RICH 84]. The placement improvement algorithms attempt to improve a given place-ment by successively interchanging the placement of a few of the components [HOFF 79,KOZA 83,GOTO 79,KIRK 83,ULRI 79,BREU 76b]. Many placement strategies em-ployed an initial placement phase followed by a placement improvement phase [HOFF 79,KIRK 83,GOTO 79,SUPO 83]. Sections §2.2.2 and §2.2.3 describe some common CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 11 initial placement and placement improvement procedures. 2.2 .1 Quadratic Assignment Problem. Given a set of n components (or facilities or blocks) to be located each in one of n locations, the placement (facility location) problem can be formulated as a 0 — 1 nonlinear Quadratic Assignment Problem (QAP) MIN £ J2 £ CWXVXH (2.1) , = 1 j=i+l k=j+l l=k+l subject to £ , n = 1 X t i = l f o r / = l , . . . , n Ey=i Xn = 1 for * = 1,..., n (2.2) Xn e {0,1} where X,y = 1 says that component i is assigned to location j and Cijkt is the sum of the interconnection lengths (or facility interaction weight) between component i and component A; when component i is located in location j and component k is located in location /. The constraints ensure that every component is located in exactly one location, and that every location is assigned exactly one component to it. The QAP, as many researchers have found, is a hard problem indeed, and there is little hope for an efficient solution algorithm with current knowledge [GILM 62,GARE 79]. Since seeking the optimal solution to Q A P is not realistic in many applications, most of the work on Q A P is concentrated on finding a heuristic for suboptimal but good placements. Such heuristics can be classified into two classes. The first class involves reducing the Q A P into a series of linear assignment problems [STEI 61,CART CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 12 i connection ^ Components 1 2 3 L o c a t i o n s ^ Langth=2 • Figure 2.1: Unconnected sets. 79]. The second class involves using branch and bound based heuristics [GILM 62,BAZA 83,PIER 71]. Reducing Q A P to linear assignment. The key idea in reducing Q A P into a series of linear assignment problems was introduced by Steinberg [STEI 61]. This heuristic is based on the following observation. Given a subset of the components in which no two components are connected, and fixing all other components, the optimal placement of such a subset is. a linear problem, and can be solved via linear assignment algorithms. In figure 2.1 for example, the sets {1,2} and {2, 3} form two such unconnected sets. Given an unconnected set S, and a set L of locations currently used by the elements of S, the cost csi of assigning a component 5 6 5 to a location / G L becomes the sum of the interconnection lengths from all other components, which are not in 5, to the location I, giving rise to the linear assignment problem MIN EECA (2-3) CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 13 subject to Zaesx3i = i tevieL ZZI€LX1S = 1 foxseS (2.4) Xiy€{0,l}, which is solvable in 0(n 3) time complexity by the Hungarian algorithm of [KUHN 55]. For the unconnected subset {1,2} of figure 2.1 we get Cn = 2, Cu = l,C2i = C 2 2 = 0 which by linear assignment results with assigning component 1 to location 2 and component 2 to location 1 therefore reducing the total wire length from 2 to 1. Of course the Steinberg heuristic depends on the initial placement and on the suc-cessive selection of unconnected sets. This heuristic seems to be particularly suited for placement problems which give rise to sparse connection matrices. Steinberg suggests finding a collection of unconnected sets such that their union covers all the components and the cardinality of each such set is as large as possible. He does not, however, show how such a collection may be found efficiently. Carter et al. [CART 79] have suggested that to improve the running time for Stein-berg's algorithm one uses as much information from previous linear assignments as possible. Their incremental algorithm does not achieve better asymptotic time com-plexity but seems to improve the practical running time slightly. In another paper, Love and Wong [LOVE 76] suggest using standard integer programming techniques to solve a specialized QAP for the location/allocation problem. Because of the computational difficulties of such techniques they suggest fixing some of the components and resolving the same problem for a few times. Even so, this method is not applicable to more than CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 14 very few (8-9) components. Branch and Bound heuristics for the Q A P . Branch and Bound methods for the QAP are concisely summarized by Pierce and Growston in [PIER 71]. Such methods involve the systematic enumeration of possible placements while cutting off branches of the search tree which are known to lead to infeasible or more expensive placement than are already known to exist. Branch and Bound algorithms differ in their enumeration methods and in their bounding schema. Since Branch and Bound algorithms are exponential in their worst case behavior we concentrate our review on Branch and Bound hueristics. In their paper [BAZA 83] Bazaraa and Kirca present such a Branch and Bound based heuristic. This algorithm is a relaxation of a straight Branch and Bound procedure since the algorithm proceeds to eliminate not only candidate branches which are sure not to lead to an optimal solution but also such branches that are unlikely to lead to one. One of the methods applied in [BAZA 83] is the elimination of Branches on the search tree which are symmetri-cal and therefore induce the same assignment cost. The following two arrangements, for example, yield the same distances between each pair of blocks and are therefore symmetrical. 1 2 2 1 3 4 4 3 Another heuristic employed in [BAZA 83] is the application of local improvement tech-CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 15 niques while calculating the bound; specifically, interchanging pairs of components while calculating lower bound at lower levels of the search tree. Furthermore, the initial as-signment and fixing of heavily connected components to central locations was employed to further cut the size of the search tree down. Another heuristics [GILM 62] involve successively assigning one component at a time. This is equivalent to a depth first scan of the search tree with a heuristic decision procedure applied at each branching node in an attempt to choose the next assignment. Using his lower bound technique and having assigned k components Gilmore's 0(n5) heuristics involve taking the cheapest k + 1 component assignment as predicted by the lower bound. In chapter 6 we illustrate this lower bound technique as a measure of the quality of the placement. 2.2.2 Initial placement. Initial (or constructive) placement starts with a blank board and proceeds by assigning small groups of components to groups of locations [HOFF 79,PALC 84]. In its simplest form initial placement is a repetition of the following three steps until all components have been assigned: 1) Select an unassigned component. 2) Select a free location. 3) Fix placement. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 16 Palezewski [PALC 84] analyzes a class of algorithms of this type with different selection rules and puts them in a unifying tree search model. He then shows that such trees have exponential number of nodes and that the time complexity for all such initial placement algorithms is 0(n log(n)). Examples of such algorithms include linear ordering [SUNG 83] and SORG [GOTO 79] and are discussed below. In its more elaborated form initial placement involves the partitioning of the com-ponents into clusters [RICH 84,KIRK 83,TOKI 84,KOZA 83,DUNL 85,MAY 83] where the aim of the partition is to group together strongly connected components. The clusters are placed together in a group of locations and further processing is done to assign each component in a cluster to its final location. Such clustering and partitioning algorithms are discussed later in this section. Linear ordering. In [SUNG 83] the following strategy is suggested for initial placement. First assume that the placement has to be done on a line (hence the "linear" in linear ordering). Then, starting with the most lightly connected component assign the components one at a time to the next free location on the line. Kang introduces three sets IN = the components already placed. ACTIVE = the components candidate for placement in the next step. OUT = all other components. The algorithm starts with IN = 0 and OUT = all other components. At each iteration one component, C, is selected from the ACTIVE set (or from OUT if ACTIVE = CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 17 L i n e Simple M u l t i p l e Placement Z l g Z a g ZlgZag Figure 2.2: Linear to two dimensional placement. 0) and placed in IN. All the components in OUT which are connected to C are moved to ACTIVE. The iterations continue until all components are in IN. The following criterion was used to select the next component to be moved to IN: Choose the component with the minimum number of (connections to the components in OUT minus connections to the components in 77V). Goto [GOTO 79] suggests a similar strategy which involves placing the heavier components first. Other similar algorithms are summarized in [BREU 76a]. The transformation from a linear placement to a two dimensional structure is il-lustrated in figure 2.2 and is achieved by folding the line placement into a "zigzag". The amount of folding depends on the length-to-width ratio of the placement medium [SUNG 83,KOZA 83]. Clustering. In cluster type placement the components are first grouped into clusters according to the degree of connections between the components [KOZA 83]. In Stabler and Eurichik [STAB 79], for example, each cluster is assigned to a row on the grid mesh. The clusters CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 18 are chosen to minimize p *=|£X>0- (2-5) where Kij is the number (or weight) of connections running between cluster % and cluster j. We require / clusters, one for every grid row. Starting from a given partition a pairwise exchange heuristic is used to improve the clustering. A similar clustering technique is also applied in [KIRK 83]. When the clustering is complete each cluster component is assigned via branch and bound algorithm to a position within a generic row. We note that for a square grid the branch and bound algorithm runs only on y/n components. Finally considering each cluster row as a supercomponent the allocation of the specific rows can be carried in the same way as above. Richard, in his work [RICH 84], suggests a similar three stage, graph based, approach to placement. The first stage is the construction of a graph G — (E, V), where the nodes V represent the components and the edges E represent the connections between the components. The second stage includes the generation of a cluster tree from the graph G. In the final stage the components are assigned to their locations according to their position in the cluster tree. The graph G is transformed into a binary cluster tree by successively choosing two nodes with the maximum connection weight between them and merging the two nodes together. For any such merge the total connection weight is recorded in the new node. Figure 2.3 presents a placement example with its cluster tree. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 19 7 =Total Connection Weight 4 Placement C l u s t e r Tree Figure 2.3: Cluster tree. The actual placement in [RICH 84] is achieved by repeatedly selecting an unplaced component from the leaves of the binary cluster tree. For each component, as it is selected for placement, the center of gravity of all its connected and already placed components is calculated and the component is placed in its nearest grid point. The center of gravity, (x, y) of a set of n components with coordinates (a;,-, y,) for i'• = 1,..., n is calculated as (2.6) y = £ yi-Kozawa [TOKI 84] devise a more sophisticated way of generating the cluster tree. They introduce a measure, P(7© J), which includes terms measuring the effect of combining the components I and J into the hypercomponent /.© J, where © denotes the combination (or clustering) operator. This measure includes terms sensitive to wire length, placement area, and the dead space resulting from the combination of two blocks of different shape. The area ratio, for example, is defined as: Area ra.io o f / © . = (2.7) where A(X) denotes the area of X. The other terms are constructed similarly. The CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 20 A C l u s t e r Tree P o s s i b l e placements • Figure 2.4: Cluster placement, construction of the binary cluster tree in [TOKI 84] is now achieved by taking the pair ( J © J) which maximize the clustering value P ( 7 © J) and merge repeatedly. A similar approach is taken in [MURA 79]. Here the clustering value, C(i,j), is measured as r... n.. (2.8) C(iJ) = A i w % r + Aj- Ci> Ci — Cij Cj — Cij where A,- denotes the area of block t, C,- denotes the total connection weight of compo-nent i and C^ denotes the total connections weight for the connections running between i and j. The placement of the components is done by top down placement from the cluster tree. For the single cluster tree of figure 2.4, for example, there are four possible top down placements as illustrated in the figure. Each of the configurations is evaluated for its total wire length and the cheaper one is selected. The cluster placement may then be improved by local improvement methods of section 2.2.3 [MURA 79]. Graph partitioning. Clustering can be viewed as a graph theoretic problem [KERN 69]. As such Kernigham and Lin show that an exact solution is very likely exponential (the partition problem CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 21 was later proven to be NP — complete [GARE 79]). Kernigham and Lin define the two way partition of a set 5 with a cost, c7tJ- defined for each pair < i,j >G S, as finding two subsets of S, A*,B*, \A*\ « « ^ such that the total partition cost E E Cti (2.9) i&A' j€B' is minimized. The external cost, Ea, for each a G A, with respect to a two way partition S = A\JB, is defined as Ea = Y,C*i (2-10) i€B and similarly the internal cost, Ia, is defined as /« = £ < ? « • . (2.11) i€A Finally Kernigham and Lin show that for a two way interchange, a G A =>• B and b G B =>• A, the total gain, ffaj, in cost becomes gab = Ea- Ia + Eb- Ib- 2Cab (2.12) Kernigham and Lin then proceed by successively choosing a pair a*, b* which maxi-mizes the gain, ga'b', and interchange a*,b*. This process continues until no more gains can be achieved. This algorithm runs in 0(n2logn) and is further generalized in [KERN 69] to multiway and unequal size partitions. Note also that many of the placement im-provement techniques of section 2.2.3 share a very similar character to Kernigham and Lin's procedure. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 22 Knowledge-based partitioning. The basic idea behind knowledge-based partitions is to use extra knowledge available to the designer of electrical circuits in the form of the functionality of the components clusters is based on the type of the components as well as their physical parameters (size and connectivity). We review two such knowledge-based systems. In his work McFarland [MCFA 83] develops a measure of similarity between com-ponents. This measure is composed of weighting both connectivity of the components and similarity of the functions which they designate. Similarity of components is de-cided by the amount of common resources that can be shared by the functions. More specifically, the overlap measure of two components C,, is denned as the amount of hardware which can be shared between the two components i and j. For example, if an adder needs 6 particular gates and a subtractor need 8 of the same typed gates then the overlap benefit C,;- may be as high as 6. If we let C,- designate the number of hardware components needed to realize component i we get the combined (cluster) cost of merging the adder and subtractor above as to be placed [MCFA 83,ODAW 85]. Hence the partition of the electrical network into Cadd + Csub — Cadd,sub — 8. (2.13) Generally the measure of similarity is normalized as Ct + Ci Cij (2.14) CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 23 and a term measuring the connectivity between i and j is added. Note that the costs C,-and Cij need to be defined for all i, j as part of the knowledge supplied. The algorithm in [MCFA 83] then proceeds to partition the components according to their similarity measure. Alternative measure of clustering value is suggested in [ODAW 85]. For two com-ponents i, j the clustering value, CLij, is defined as CLij = CEij * CTij * CRi * CRj (2.15) where • CE^ = a factor depending on the type of connection between i and j. For example, an address bus connecting i, j will have much higher value than a simple control line. • CT^ — a factor designating the category or type of For example, for i = CPU and j = BUFFER this value will be much higher than if both i and j are i/o pads. • CRi^j) — the ratio of connections weight The central problem attacked in [ODAW 85] is the construction of a knowledge base containing the numerical values for CEij, CTij and CRi. To this end Odawra developed an analyzer program which extracts typical values from manually placed circuits. For a given set of components to be placed these values together with the type of the components and interconnections are used to generate a cluster tree. The actual placement is done by techniques similar to the ones already discussed. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 24 2.2.3 Placement improvement. Iterative improvement algorithms usually start where initial placement has left of with the components already initially placed. In iterative improvement a small set of com-ponents is chosen and their placement is permuted [HOFF 79]. In its simplest and commonly used form iterative improvement is carried out as a sequence of pairwise exchanges. With pairwise exchange a pair of components is chosen and its locations are interchanged if this interchange results with lowering the total connection length. Usually an improvement pass may include all "l"" 1) possible pairs [KIRK 83]. Alter-natively one can calculate a force vector from each component representing the sum of the connections when they are regarded as force vectors and to exchange a component with another component at the direction given by the force vector [HANA 76]. Such pairwise exchanges are common in placement techniques [KOZA 83,HOFF 79] as a tool for improving both initial placement and cluster assignment [KIRK 83]. The next subsection presents a generalization of pairwise force directed exchange. Also in this section we review Breuers [BREU 76b] min-cut placement technique. In min-cut procedures vertical and horizontal lines are employed through the placement and components are interchanged so as to minimize the number of connections which cross the current cut line. Another algorithm which falls into this category of placement improvement is the heuristics of May et al. [MAY 83] which minimize the line intersec-tions in logic diagrams by assuming independence of the rows and applying dynamic CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 25 programming techniques. Finally in this section we discuss the pin assignment problem. Most placement algorithms ignore the fact that components should be rotated because of the torque applied by the fixed terminals on their boundaries to which the connection wires are connected. Put differently, this is the problem of locating the terminals on the components boundaries so as to minimize the total wire length [BRAD 84,MORY 78,KORE 72]. Generalized force directed r e l a x a t i o n ( G F D R ) . The basic idea behind GFDR is the combination and generalization of Lin and Kernigham local optimum techniques [LIN 73] and Hanan [HANA 76] force directed approach. GFDR techniques has been used in different versions in some of the previous algorithms reviewed usually in order to improve clustering [KOZA 83,KIRK 83,RICH 84,TOKI 84]. Here we examine GFDR as a technique by itself. We start with some definitions from [GOTO 79]. The median of a component / is defined as the index which minimizes n f[xi, Vi) = £ Ci,i(\xi - Xi\ + \yi - yi\). (2.16) i=i Here x,-, y,- are the current position of component i and Cj)t- is the number, or total weight of connections running between components / and i. We continue by defining the e — neighborhood of a block median as the nearest e components to the median. Finally we define A — optimum as the best placement achievable by trying all A-way interchanges [LIN 73]. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 26 a b ^ I n t a r c l Median(a) £—neighborhood ( £ = 2 ) Figure 2.5: Generalized Force Directed Relaxation. The G F D R algorithm proceeds by selecting a primary block at a time, calculating its median and interchanging it in turn with each block in the e — neighborhood of the median. An interchange is accepted if it reduces the total wire length. The interchanged block now becomes the primary block, its median is calculated and it is interchanged again. This interchange cycle continues until either A blocks have been interchanged, or no more gains can be realized. Note that this schema does not guarantee A — optimum but is much cheaper then Lin's approach [LIN 73] since not all A-way interchanges have to be considered. Figure 2.5 illustrates one cycle of G F D R interchanges starting with block A as the primary block, interchanging it with B which in turn is interchanged with E. Goto [GOTO 79] found, experimentally, that e = A = 4 is the best value for trading off placement improvement and run time. M i n - c u t placement. In min-cut placement procedures the object of the minimization is the number of wire connections crossing a family of horizontal and vertical cut lines [BREU 76b]. Breuer defines a cut line as a straight horizontal or vertical line through the current placement. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 27 • • • • • • • • • • • • • • • G r i d Point C4 C2 C6 Figure 2.6: Cut lines. For a given cut line, C , its value, V(C) , is the number of wires connecting components on both sides of the cut line. A min-cut placement procedure involves finding a family F of cut lines, employing the cut lines in a particular order and minimizing V(C) for every cut line being employed. We briefly review these requirements as presented in [BREU 76b]. For a placement on a grid lattice a natural set of cut lines is the collection of all vertical lines separating the grid columns and the collection of all horizontal lines separating the grid rows. For the selection of cut line ordering Breuer suggests two heuristics: slice cut and bisection cut. In a slice cut a fixed number of blocks is cut by the cut line. In a bisection cut, the next cut line bisects the blocks remaining from previous cuts. The cut lines alternate between horizontal and vertical lines. Figure 2.6 presents the first 6 cut lines as applied in the bisection heuristic. The actual minimization of the value of the cuts is done by a modified Kernigham-Lin [KERN 69] procedure which was discussed in section 2.2.2. Min cut techniques have been integrated in some other placement procedures in order to improve initial placement [ULRI 79], in combination with force directed techniques [WIPF 82] and as part of a hierarchical CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 28 O u t p u t I n p u t 1 and I n p u t 2 a r a I n t e r c h a n g e a b l e Figure 2.7: Pins interchangeability. placement procedure [BURS 83]. Pin assignment. Most of the placement procedures appearing in the literature assume that the com-ponents are reduced to points and all wires are considered to connect to this point. An improved placement, however, will result from rotating the components into an optimal orientation with respect to the connection length being measured from pins (or terminals) on the boundary of the component. Calculating the optimal rotation of all components can be done via branch and bound techniques in exponential worst time [ULRI 79]. Actually this problem is conjectured to be NP — complete in both [DONA 80] and [SAH 80]. Equivalent to the rotation problem is the pin assignment problem, in which, the component is considered fixed and the pins to be rotated around it. Note that some of the pins may be interchangeable as in figure 2.7. The literature contains a few pin assignment algorithms [MORY 7&,KORE 72] which usually proceed by one-by-one assignment, each time considering the next pin. Here we review the work of Brady [BRAD 84] on topological pin assignment. The CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 29 C o n n e c t i n g ^ P i a a X m • Pin Location Scan Ray Figure 2.8: Rotating scan line. topological pin assignment proceeds in two stages: initial pin assignment and assign-ment improvement. The improvement stage is a simple rotation of the components one by one while minimizing the total wire length and need not be discussed here. The initial pin assignment is carried out as follows: choose a component, assume a ray is originated at the center of the chosen component and is rotated in a clockwise direction as illustrated in figure 2.8. With the rotating ray we match the blocks pins with the X pins to which they need to be connected. Pin 1 is matched with the first X pin touched by the ray, pin 2 is matched by the second and so on. Note that a crucial assumption here is that all the blocks pins can be connected to every X pin, i.e. all the pins are interchangeable. Figure 2.8 is a simplified view of the pin allocation. Actually Brady attempts to incorporate routing considerations as well by considering obstructions to the scan lines as in figure 2.9. Observing figure 2.9 we note that not all the rays between A and E can be considered as a realistic measure of how close the pins are. The solution here is to associate all rays between C and A with A and all the rays between C and E with E and change the pin allocation accordingly. Of course the algorithm generalizes to more than one obstructing block. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 30 A Figure 2.9: Scan lines obstructions. 2.3 Continuous techniques. Given a set of components to be placed in the Euclidian plane (or a part of it), contin-uous placement techniques attempt to minimize n n ~£E(|s.-*;l + ly.--y,-l) (2.17) or ^ E E V ^ - ^ ' + Cy.-yi)2 (2.1s) ' .= 1;=1 or l i b t a * - *i)a + (v* - y*) a) • (2.19) z t= l J = l where (x,-,y,) are the coordinates of the ith component. Of the three measures (2.19) is mathematically well behaved as it is continuously differentiate and leads to least square solutions and is most often used [BLAN 85,VERG 67,SHA 85]. Depending on the nature of the specific placement problem at hand it is sometime advised to use (2.17) as the distance measure, for example, if the routing has to be done only along vertical and horizontal lines. We review some work [MORR 79,CALA 85] aimed at CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 31 introducing generalized algorithm for every norm /„(*,-,y,-) = [\x3- - x*\p + |yy - y*H? 1 < p < oo. A major advantage in all of these measures is that they are everywhere convex (in section 4.2 we show that (2.19) is convex). Minimizing each of the above as it is stated, however, results with the trivial solution = x2 =,...,= xn and yx = y 2 =,...,= yn. In section 2.3.2 we review various continuous placement algorithms which impose an additional constraint in order to avoid the trivial solution [HALL 70,BLAN 85,DREZ 80,VERG 67]. Finally in this section we describe in details two recent nonlinear for-mulations of the placement problem which account for blocks dimensions and for some limited rotations [SHA 85,MARK 84]. 2.3.1 Continuous placements minimizing lp norms. The work reviewed in this section intends to overcome the mathematical difficulty involved with using some of the distance norms. More specifically, following the notation of [MORR 79] consider the following min-sum single location problem • (xj,yj) = the coordinates of already existing components. • (x*,y*) = the coordinates of the new component. • Wj = cost weight representing the interaction between the jth existing component and the new component. MIN n (2.20) where CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 32 For simplicity we are considering single component in two dimensions although this can be generalized to multiple components in n dimensions. Now consider the partial derivatives dCl^ry^ a n d dClv(*vvi) a n o; n o t e the discontinuities at Xj = x* and y;- = y*. The basic approach for circumventing this difficulty is to slightly perturb the distance norm [LOVE 72,EYST 73,MORR 79]. Following the perturbation schema of Morris and Verdini [MORR 79], we let U*f>yi) = [\*i-z'\p + \yj-v'\ p + e]' e > ° - (2-21) Note that e is a smoothing parameter. Morris and Verdini continue by proving that the perturbed problem is both strictly convex and differentiate. Moreover they prove that \U*i,yi)-U*i,Vi)\<2>ek (2-22) thereby providing a bound on the error resulting from minimizing the perturbed prob-lem. Other authors have provided a Projected Newton Method for solving the nonlinear convex optimization problem [CALA 85]. 2.3.2 Continuous placements of point components. We now consider a few algorithms which minimize the objective functions of (2.17,2.18) or (2.19). As noted before the problem should be restricted some more to avoid the trivial solution. Such a restriction which is employed by Vergin and Rogers [VERG 67] is simply to start with a few fixed locations. The basic idea in [VERG 67] is to CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 33 iteratively add the new components one at a time, each time minimizing for the new location by setting the partial derivatives of the sum of the length equation w.r.t. x and y to 0 and solving. When all the new components have been placed they again improve the positions by repeating the same optimizations on each component separately. More successful algorithms [HALL 70,DREZ 80,BLAN 85] impose mathematical constraints to avoid the trivial solution. These are summarized in the following sections. Eigenvalue approach. Hall [HALL 70] presents a solution for the placement problem of minimizing equation (2.19). Following Hall's notation we illustrate his algorithm for placement in one di-mension (on a line). Let X = (xi, x%,..., xn) be the unknown coordinates of the n components minimizing * = - * y ) 2 - (2.23) To reduce the degrees of freedom in equation (2.23) Hall imposes the quadratic con-straint X T X = 1. (2.24) CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 34 Note that this still leaves xy = ^ for all j as a feasible solution. Hall continues by showing that equation (2.23) can be rewritten as X T 5 X as follows: z =\ X)"=i(£t — Xj)2dj = | E ^ E | = 1 ( ^ - 2 x t x J + ^ ) C , y = IZvLi %i I3y=i Cij — Z)"=i 52?jij xixjCij = X T £ X where 0 . . = \ £ * = i - ^ 1 = , J | —C^ otherwise. Note that this will work only for symmetric cost matrix, C,y = Cy,-. Furthermore, consider minimizing the Lagrangian L = X T 5 X - A ( X T X - 1 ) (2.25) which, by setting the partial derivatives of L to 0, gives us 2SX - 2AX = 0 (2.26) or (B - A/)X = 0 (2.27) or A = X T 5 X (=2) . (2.28) Thus to minimize (2.23) subject to constraint (2.24) we need to look for the eigenvec-tors of B corresponding to the second smallest eigenvalue A. Note that the smallest CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 35 eigenvalue yield the uninteresting solution X T = ^=(1,1,...,1). (2.29) Generalization of (2.25) to two or more dimensions is easily done by considering the Lagrangian L = X T 5 X + YTBY - a ( X T X - 1) - j3(YTY - 1) (2.30) which yields minimum value z = a + /3 (2.31) for a, (3 the first and second eigenvalues of B and their associated eigenvectors. Hall also shows that if the components give rise to a connected graph (i.e. no group of components is totally independent from all other components) then we are guaranteed to have at least d — 1 real eigenvalues, where d is the dimension of the placement plane (usually 2). Note that while the eigenvalue solutions are exact for equation (2.30), the constraint (2.24) is not really enough to guarantee feasible placement. For example consider 1 = X T X = x\ + x\-\-,..., -\-x\ = 0 + 0+,..., +0 + 1 + 0+,..., +0. So, for a component i with very small cost coefficients c7tJ- an optimal solution may still include most of the components placed on top of each other. Blank [BLAN 85] uses the eigenvalue solution as a lower bound to confirm the asymptotic convergence of local improvement methods. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 36 Circle placement. In circle placement the components are considered to be circles. Drezner in his work [DREZ 80] uses the distance measure of (2.17) and the addition constraints \{Ri + Rj - dij) < 0 \>0,i^j (2.32) where i2i(j) = the radii of a circle centered around component dij = the current distance between i and j. A = a positive parameter. Drezner uses a two phase procedure. First he locates all of the components at one point (two of which are slightly perturbed) and applies the Lagrangian Differential Gradient method [HADL 64]. This is called the dispersion phase. Secondly, he applies a concentration step which pulls back the components together. The two stages differ by the strength of applying the penalty parameter A. Drezner applies his algorithm to office layout problems and claims, but cannot explain why, that the dispersion phase already yields good results. 2.3.3 Continuous placements of rectangular blocks. In this section we discuss continuous models in which the components are no longer reduced to points but are treated, more realistically, as rectangular blocks. The first work we consider is by Markov [MARK 84]. This paper describes a nonlinear mixed in-teger/real programming model for the placement problem, (the authors also claim that they have found an iterative algorithm which produces the optimal solution, however CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 37 they do not provide any details). Blocks in this model are rectangular with fixed length and width. Blocks with aspect ratio, , bigger than a user specified parameter, k, are broken down into a collection of smaller blocks. All pins are considered to be located at the center of the blocks. The nonlinear mixed integer programming problem of [MARK 84] is formulated as MIN £ / , • (2-33) i=l where /,• = £ - Xj\ + \Ui -Vj\) = average wire length or /,• = E j ( | z i - Xj\ + \yi - yj\) = total wire length (2.34) or /,• = MAX^2j(\xi — Xj\ + — y,-|) = longest wire. The choice between the different /,• is user denned. Also appearing in this model are three sets of constraints. First nonlinear constraints specifying the non-overlap of the blocks are required: ( * « - « i ) a > [ £ ? i + a*y]a or (2.35) { y i - y J ? > [ ^ + ^\2-Here (x,-, y.) are the center coordinates of block i, c,(j) is the width/length of block and ctij,fiij are the user specified separation requirements. The second set of constraints, similarly denned, enforces the blocks to stay within a predefined chip area. The last group of constraints enforces the subdivided blocks to stay together and give rise to a set of integer constraints. A more elaborate model of block placement is presented in the work of Sha and Dutton [SHA 85]. They also suggest a solution technique. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 38 t H i I I s • M o d e l e d X i , X i As Wi • Figure 2.10: Block modeling. The nonlinear model in [SHA 85] is developed around an elaborate formulation of a block. Figure 2.10 illustrate the modeling of a block. Here the objective function W(X) is again minimizing the total square wire length where the wires are considered as connected to the block centers. For a set of m connected components the measure of the total wire length is calculated as the sum of the lengths from the center of gravity as calculated in equation (2.6). Sha and Dutton [SHA 85] write the non-overlapping requirement of blocks i and j as four constraints over the points Pa,P , -2,-P>i, P/2 9i{i.j) =U + ry - cf(P,i, PyiP/2) < 0 92{i-j) = r,- + ry - d{Pi2,PnPj2) < 0 0s(*'.i) = r,- + ry - d{PjUP*Pi2) < 0 94*-J) = r< + ri ~ d.(Pj2,PiiPi2) < 0 (2.36) where d(Pik, PnPn) specifies the distance between the point P,fc and the line segment PiiPn- Note from figure 2.10 that overlap can still occur either at the corners or at the center of a block if the line segment P.iP.j is longer than 2r,-. Since each block, P,-, is represented by two different points they add the constraint 9s{i) = ( w i - hi) - di = 0 i=l,...,n (2.37) CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 39 which keeps the axis of the block P,iP,2 fixed in its length since di is defined as follows di = y/{xa - xi2y + (y a - y,-2)2. (2.38) Rotations are incorporated as further constraints g6[i) = j = 0 z = l , . . . , n (2.39) Note that the constraints g6 allow only two orientations of the blocks. Further con-straints ensure that the placement is done in a given bounded area if needed. The method suggested in [SHA 85] for solving this nonlinear model is a standard penalty function approach. First they construct a new unconstrained objective function ^constraints P(X) = W(X) + c* £ [MAX(0, ? t - (X)j 2 (2.40) t=i which is solved by a modified Newton method for fixed values of the parameter c*. The solution strategy is to start with c0 = 1 and solve (2.40) repeatedly for increasing values of C j t + 1 = lOcfc. The method was tested on small placement problems and was found successful by the authors. To conclude our discussion of this work we note the following points. First this model allows only two orientations for the blocks. Secondly, the connections are measured from the center of the blocks and not from the boundaries making even this small freedom of rotation useless since it is the torque applied by the connections which makes the block rotate. Thirdly, note that the standard penalty method introduces some ill-conditioning. Fourthly, the lumping together of all the different constraints may not be CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 40 warranted as the block integrity constraints may clash with other constraints and they clearly should have priority. In this section we review a small class of placement techniques which combine both continuous snd discrete algorithms. The hybrid techniques usually consist of nonlinear continuous optimization followed by a discrete algorithm which maps the continuos placement into a set of discrete locations [FUKU 83,QUIN 79,BLAN 84,WIPF 82,CHEN 84,ROTH 83]. In their work Fukunaga [FUKU 83] use the eigenvector solution as rediscovered in the context of graph representation. After achieving a good continuous placement the mapping to grid points is done via a linear assignment by minimizing the total change resulting from moving the components into slots. This discrete stage is the same as ours and is explained in details in section 3.6. In a second hybrid algorithm a continuous physical model of a system of springs is suggested [QUIN 79] followed up by discrete stage in which the blocks are assigned to locations. Here each block is reduced to a point and the forces acting between each two points are calculated by Hook's law: Where K\j is a constant proportional to the number of interconnections between block 2.4 Hybrid techniques. (2.41) CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 41 i and block j and AX,j, AYy are the distances in X, Y coordinates between the blocks centers respectively. The force directed placement approach suffers from two main drawbacks. First the system of springs exhibits only attractive forces between the blocks, so as to avoid the trivial solution of collapsing all blocks into a single point some repulsive forces must be introduced. In [QUIN 79] such a repulsion factor is added resulting with a new objective function where f 0 if Kij # 0 Rij = 0 if i = j [ constant if K^ = 0 and ASij = \AXij\ + \AYij\ but since the repulsion must be 0 between connected blocks the resulting placement may, and usually does, contain overlapping blocks. Secondly, the reduction of blocks into points results in non-optimal block orientations. Figure 2.11 presents a simple example which demonstrates that the resulting force directed placement is not optimal (assuming that blocks overlaps are resolved) with respect to both location and orientation. The equation of (2.42) is solved by a modified Newton Raphson method. Again here the linear assignment formulation is used to map the blocks into their discrete locations. Alternatively, if the placement is to be carried on a continuous medium an CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 42 t I n i t i a l Force Dlrcted T r u e Optimized Optimum Figure 2.11: Force directed placement of two blocks. algorithm is carried out to resolve the components overlap. Other algorithms which combine force directed for continuous placement and cut lines techniques for mapping the continuous placement to grid points are given in [WIPF 82,ROTH 83]. A similar hybrid technique is developed by Blanks in [BLAN 84]. Here the objective function is the sum of the squared wire length. To avoid the trivial solution, to center the blocks on the grid, and to avoid highly correlated x and y placement (i.e. placement on a diagonal line) the following constraints are added E f e i xi = E,n=i Vi = o E . ' L i *? = No* (2.43) £ ? = i XiVi = 0 where ox1oy are the variance of the expected x and y placement assuming homoge-neous distribution of the components. The solution technique in [BLAN 84] involves repeatedly calculating the optimal solution for each component individually minimizing only the objective first and adjusting the placement, in a separate step, to obey the constraints (2.43). We note that the observation motivating this algorithm is that the total squared wire length objective can be minimize exactly when the constraints are ignored. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 43 The discrete phase in this algorithm involves the following two step heuristics. 1. Sort the components by their distance from the center. 2. Place the components one at a time according to the sorted order, starting with the ones further away. Each new component is located as near as possible to the center of gravity of its already placed and connected components. Cheng and Kuh [CHEN 84] present an hybrid placement algorithm based on the analogy between resistive linear network optimization and placement. The objective function is again the total squared wire length, but the constraint set is new: (2.44) Here Pt- is the coordinate (assuming one dimensional placement) of the ith slot. B y considering only the first (linear) constraint, linear resistance optimization techniques can be used to find the optimal solution which is then used as a starting point for a scaling phase involving the incorporation of the second (quadratic) constraint as well. The placement is further improved by fixing parts of the components and repeating the scaling sequence again. Finally a cut line approach is used to map the components into legal grid points. 2.5 Top down design (Floor Planning). Floor planing or top down design differ from previous placement models since the blocks have unknown shapes. The area of the block is assumed to be estimated but their shape may be varied [ U E D A 8 3 , O T T E 8 2 , H E L L 82]. " CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 44 p D e a d A r e a Bl R o u t i n g A r e a B2 \ X B l o c k s A r e a Figure 2.12: Chip area calculation. We can define the objective of a floor planing algorithm as follows [UEDA 83]. Determine the length/width ratio and location of each block such that their boundary rectangular (chip boundary) is approximately square and its area is minimized. The chip area CA is calculated as CA = BA + I B A + DA. (2.45) Here B A denotes the total blocks area and is assumed to be constant. I B A is the area occupied by routing channels between the blocks and DA is the dead space on the chip resulting mainly from mismatching in shape between neighboring blocks. Figure 2.12 illustrates the different areas for a two block placement. The details of the heuristic to calculate CA can be found in [UEDA 83]. The block initial placement is done, after reducing blocks to points, by a force directed method followed by iterative improvement techniques as discussed in previous sections. The key heuristic in [UEDA 83] for deciding the block shapes is an iterative improve-ment schema which alternate between location improvement and shape improvement as follows: 1. Reduce the chip area by moving the boundaries inside by a small amount. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 45 2. Select one block and move it slightly in the direction which minimizes its overall overlap with its neighbors. 3. Select one block and reshape it slightly, again in order to reduce its overlaps. 4. Iterate 1-3 until the chip area matches the estimate and there are no block over-laps. This floor plan schema can be recursively applied to each of the blocks so determined. Note that at some stage in this hierarchical processing the top down approximations must be replaced by an exact bottom up construction. This may then "back fire" on the previous placement and routing estimates, so an adjustment mechanism is necessary in order to apply this approach recursively. 2.6 Simulated Annealing. Kirkpatrick [KIRK 83,PHYS 82] introduce the technique of simulated annealing into combinatorial optimization in general. In particular they have applied it to placement and routing problems [KIRK 83,VECC 83]. Simulated annealing, in general, requires four ingredients: A representation of the system to be optimized (e.g. the location of possible grid points for placement); a generator of legal system configuration (e.g. pairwise exchange routine); a cost function measuring the energy of a given system configuration (e.g. total wire length) and an annealing schedule of temperatures. Optimization by simulated annealing is carried out by successively generating new system configurations. Each new configuration is automatically accepted if it reduces CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 46 the total system energy; it may also be conditionally accepted with the Boltzmann probability -&/(*) P{x) = e~rpr~ (2.46) if the system energy has increased (l.e.Af(x) > 0) from previous configurations. Here Ti is the current system temperature and k in the Boltzmann constant. The temperature schedule T0,Ti,...,Tm is a decreasing sequence with Tm —> 0. Intuitively, when the temperature is high we accept a more expensive system configu-ration much more readily than when the temperature is low. This allows us to climb in an up hill direction and, therefore, possibly escape local minima in favor of global one. We may switch from Tt to Ti+i when for some time, say n iterations, there was no improvement in the system energy. Kirkpatrick et al. have applied simulated annealing to placement [KIRK 83] and routing [VECC 83] problems. Simulated annealing optimization techniques leave a few important questions unan-swered. In particular it is not at all clear to us that the probability equation (2.46) really reflects the appropriate probability model for placement problems. Further more, the temperature schedule seems to be dependent on the particular system configuration at hand [NAHA 85]. In fact, some experiments with simulated annealing seem to suggest that there is no particular benefit in using the Boltzmann probability and annealing temperature schedule when compared with using the same amount of computer time to successively try as many system configurations as possible [NAHA 85]. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 47 Figure 2.13: Neighborhood exchanges. 2.7 Hardware implementations. Placement improvement techniques and specifically pairwise relaxation lend themselves to hardware implementations [UEDA 83,CHYA 83,IOSU 83]. Iosupovici et al. [IOSU 83] designed hardware implementation for calculating the total change in wire length resulting from a pairwise exchange. The speedup in the system becomes roughly 1:10 in comparison with software implementations. Two separate papers [CHYA 83,UEDA 83] suggest similar hardware approach to pairwise neighborhood exchange. In neighborhood exchanges only interchanges between neighboring components are considered, giving rise to four possible exchange patterns as illustrated in figure 2.13. The basic idea behind a parallel implementation of a pairwise neighborhood inter-change algorithm is to associate each component with one processor in an array of processors. Each improvement iteration then proceeds as follows. For each of the four possible neighborhood interchanges patterns do: 1. Form a pair candidate for exchange: each processor locates its possible target for exchange. CHAPTER 2. REVIEW OF PLACEMENT TECHNIQUES. 48 2. Calculate gain: each processor calculates the change in its total wire length re-sulting with its interchange with the target. 3. Interchange pairs: if the gain in (2) is positive the interchange takes place and the two processors exchange the components data. 4. Broadcast new locations: each processor sends its current component identifier to all other components so that they could update their local copy of the placement. For practical reasons only steps 1-3 should be implemented as concurrent operations. The expected improvement in speed when adding 0(n) processors is proportional to the number of processors used [CHYA 83,UEDA 83]. Chapter 3 The mathematical model. 3.1 Introduction. In this chapter we discuss a mathematical model for our placement problem. We first formulate the placement problem as a continuous, nonlinear constrained minimization problem; we discuss the construction of the terms in the objective function and their interrelations, and then we formulate the constraints and discuss their role in the model. We also show how to transform the constrained formulation to an unconstrained one by. replacing the constraints with penalty terms. Finally we show how to specialize the placement for a more restrictive problem requiring that the block positions be drawn out of a discrete set. 49 CHAPTER 3. THE MATHEMATICAL MODEL. 50 3.2 Preliminary Definitions. We start by formulating the mathematical entities involved. Let B = {1,2,n} be an index set of the blocks to be placed. For each i €. B we have the following variables: Xi = the X coordinate of the center of block i yi = the Y coordinate of the center of block i &i = the rotation angle of the center of block i and the following constants: uii = half of the width of block i /,• = half of the length of block i T* = {Ti,Tfm} = the set of terminals for block i For each Tj 6 T we have Tj = (Ax,,, Ay,-y), where Axij = the X displacement of terminal j from the center of block i Aytj = the Y displacement of terminal j from the center of block i. Also we are given a set of pairs i = {< r ; , r * > , < 17, r ; >}, where each pair < Ty,T* > £ I designates two interconnected terminals. For each such pair we are also given a constant Cijki representing the weight or relative importance of the connection. For example, a sixteen bit bus connection between two components may have sixteen times the weight of a single control line. Note that connections between more than two terminals, say m, can be represented by m ^ ~ ^ pairs of connections. Finally let distance(TJ,Tlk) be the distance between terminal T;- and terminal T*, and distance(Blocki, Block j) be the distance between the centers of blocks i and j, CHAPTER 3. THE MATHEMATICAL MODEL. 51 where we define the function distance(< xx,yx >,< x2,y2 >) = \J{x\ - x2)2 + {yx - y2)2 i.e. it is the usual Euclidian distance. In the next section we formulate and describe the optimization problem. 3 . 3 The constrained minimization problem. We consider three variations of the placement problem, denoted P l , P 2 , and P3. The model of P I assumes no restriction on the orientation or location of the blocks but in-cludes the restriction that blocks cannot overlap. The models P2 and P3 are formed by-adding orientation and location constraints to P i . In all three problems the optimiza-tion aim is to find the values of the variables x,-,yt-,0,- for i = 1, . . . , n which minimize the total wire length. We write the placement problem P i as follows P i : MINIMIZE D = E " i | i M = 1 Cijkl distance(T), T*) 2 such that where distance(Blocki, Block j)2 > afj % / j (3.1) f given weight if < 7*,r,* >e I ijkl [0 if < TJ,Ttk >&I = minimum separation between blocks i arfd j. We calculate the objective function D as follows. Let = U=t\dli^ Cijkl{distance{TlTt)Y) (3.2) = E , ^ 1 E L , + i ( E I l = x Cwi&i - xkiy + (y,,- - yklY)) CHAPTER 3. THE MATHEMATICAL MODEL. 52 where and {xki,yki) a r e defined similarly with respect to the variables of block k. Note that {xij,y~ij) a r e coordinates the terminal Ty calculated by rotating the point (Ax,y, Ay.y) (the displacement of Tf) around the origin (0,0) by an angle and then translating the resulting rotated point by (x,-,y,) (the center coordinates of block i). The result of this formulation is holding the terminal in fixed distance from the center of the block. Figure 3.1 illustrates this formulation graphically. The minimum distance, «7,y, separating blocks i,j is determined from the radii of the circles containing the blocks plus a minimum required separation, a, as follows. on = vV? + % + + l) + a (3-4) Figure 3.2 illustrates the calculation of <7,y. However this a,y may be an overestimation of the required separation and may not represent the placement correctly if for some blocks Wi << /,• or /,• >> u>,-, since such narrow and long blocks would repel other blocks from their narrow side as the containing circle will be far from the block edge. CHAPTER 3. THE MATHEMATICAL MODEL. 53 Block i Block j Figure 3.2: Calculating blocks separation A simple remedy here is to subdivide such blocks into a number of blocks with aspect ratio approaching 1, and link these together via very strong connections. To avoid having these strong connections dominate the optimization we start with an initial educated placement which ensures that the subblocks are close together and in the proper topology. Although no statistics are available we assume that long and narrow blocks do not often arise in PCB or VLSI layouts, so not too many new blocks will have to be generated. We note that both in the objective function of P i and in the constraint (3.1) we use the square of the distance function. We choose distance(Blocki,Blockj)'2 and distance(Tj, T*) 2 since this distance function is continuously differentiable with respect to the unknowns {£,•, y,-, 0,-, 1 < i < n}, and still minimizing it gives the same result as minimizing the true length. The Euclidian distance squared, as a measure of wire length, is very popular in placement algorithms due to its nice mathematical properties [CHEN 84,BLAN 84,HALL 70,BLAN 85,MORR 79] (See also §2.3). In many applications the blocks are not free to assume any orientation. For these CHAPTER 3. THE MATHEMATICAL MODEL. 54 applications we assume that the allowed angles are equally spaced. More formally we allow placed blocks to be oriented in angles ^ for k = 0, . . . , m — 1. If m = 4, for example, only horizontally or vertically placed blocks are allowed, i.e. 0, |, IT, ^ are the only allowed angles. For these cases we add the following n constraints to P i , obtaining a more restricted optimization problem P2 Oi e {cpo,...,<pm-i} l<i<n (3.5) for a given set of angles (e.g. cpk = ^ f ) . Note that our formulation here really gives rise to a set of models allowing representation of placements in which no restrictions are placed on the orientations of the blocks P i and representations in which only m evenly spaced angles are allowed. In some applications such as Printed Circuit Boards Placement (PCB's) one may want to consider a set of discrete slots into which the blocks are to be placed. Here we are given a set S = {< x[,y{ >,-••,< x3m,y3m >} of locations such that m > n and < x\,y\ > are the coordinates of the ith slot. Such a restriction gives rise to the following constraints, which we add to P2 to get our third model P3: (x»yi) E {< x{,y{ >,•••,< x3m,y3m >} 1 < i < n, (3.6) for a given set of m grid locations (m > n). The constraint in (3.1) ensures that two blocks will not be assigned to the same grid location. Occasionally it is useful to fix some of the block positions and/or orientations before CHAPTER 3. THE MATHEMATICAL MODEL. 55 applying the placement algorithms. This may be done both in order to speed up the convergence of the placement algorithm and in order to incorporate additional constraints. An example of such application is a case where the placement has to fit into a given bounded area.' By fixing small blocks around the perimeter of the designated area we can ensure that other blocks will not "jump outside" the boundary. This is easily handled in this model by changing the indexing of the variables. In the next section we show how to replace constraints by penalty functions. 3 . 4 Standard penalty functions. Consider, in general, a nonlinear constrained problem MIN /(x) (3.7) subject to ft(x) > 0 for j = 1, . . . , m. (3.8) A common approach for solving such a problem is to transform it to an unconstrained minimization of m /(x) = /(x) + A £ « % , ( x ) ] , (3.9) t=i where A£)£Li <r>[s,«'(x)] 1S a penalty term penalizing a violation of the constraints. There are two common penalty function approaches. If <^ [<7,-(x)] —• oo as <7,(x) 0 + then (j> is called a barrier penalty function and the method is classified as interior since CHAPTER 3. THE MATHEMATICAL MODEL. 56 the large penalty guarantees that after starting an iterative process from a feasible point, x will approach the minimum from the interior of the feasible region. Usually the function <f>(x) = —log(x) is used as a barrier function. Since often the interesting minimum point coincides with the boundary of the feasible region, we need some way of approaching this point where the penalty function value approaches oo, and reducing o the effect of the other, inactive constraints. This is achieved by a continuation sequence Ai, A2,..., An,... such that lim,-,,*, A< = 0. We utilize the continuation sequence as follows: for each Xj we solve the unconstrained minimization problem /y(x) = /(x) + Aj Z)i=i <r>[0t(x)] while using the final values for x which resulted from the minimization of 7J_1(x) = /(x) + Xj_i J2iLi <rHj7."(x)] a s the initial values for this optimization problem. If <#[<7t-(x)] gives a (finite) positive contribution while x is infeasible then <j> is called a loss function and the method is classified as exterior since a large penalty will "push" x toward the feasible region from the outside. Usually the function <f>(x) = [max(x, 0)]2 is used for a loss function. For loss functions we need a continuation sequence such that lim,_ 0 0 A,- = + 0 0 . Theoretically the good news is that under fairly general assumptions it can be shown that for the appropriate continuation sequences for both loss and barrier penalties, if we minimize successively the resulting unconstrained functions we are guaranteed to find a minimum point. Furthermore, if our objective function and constraint set are convex we are guaranteed a global minimum [GOTT 73]. CHAPTER 3. THE MATHEMATICAL MODEL. 57 The bad news is that as A approches its limits the Hessian of (3.9) becomes i l l -conditioned. This seems to suggest applying a continued sequence of subproblems when using the penalty function approach [GILL 81]. The Hessian ill-conditioning also suggests using the continuation sequence partially, i.e. not carrying A to +oo (or 0) if "good" solutions are formed earlier. Of course the constraint set can be split over barrier and loss functions; in which case we get a mixed penalty function [LOOT 71] k m /m(x) = /(x) - A, £<?M<7.(x)] - A6 E &[*(x)]. (3.10) t=l i=k+l 3.5 Our penalty function approach. To handle the constraints (3.1) we use a penalty function approach and reformulate the placement problem as follows: find x\, yt*, 0t* for i = l,...,n such that Z = D + X1V (3.11) is minimized. Here D is the sum of the squares of the interconnection lengths as before; V is a penalty function term designed to ensure that blocks would not overlap each other; and Ai is a parameter, or weight used to define the relative importance of the penalty term at different stages in the solution process. This is discussed in detail in the next chapter. CHAPTER 3. THE MATHEMATICAL MODEL. 58 Sometimes the following loss function is used in the literature [SHA 85] V = X^lVij • (3.12) >,j€B where V; ; = ( m a x ( (4-4 ) , 0 ) ] 2 (3.13) and dij = distance(Blocki, Block j){= \J (x,- — x ; ) 2 + (y< — y ;) 2) (3-14) Thus Vij > 0 and Vj;- is sensitive to the amount of violation of the tj — th constraint, as long as that constraint is violated. Moreover, V,y = 0 as soon as the constraint is satisfied, no matter how much larger is from crtJ-. An advantage of this is that when the constraint (3.1) is satisfied, D = D + \!V (3.15) which means that the original problem P i is indeed solved. A disadvantage is that a certain ill-conditioning is introduced as it is necessary to successive increase A such that lim.-_.oo A,- = +oo in order to insure feasiblity. Instead we use a variation of the barrier potential function (3.16) Observing figure 3.3 we note that equation (3.16) has the properties of both smooth-ness and rapid increase of the penalty when dij < o~ij. Also note that this penalty CHAPTER 3. THE MATHEMATICAL MODEL. 59 function automatically eliminates inactive constraints, that it is bounded from below, and that the positive singularity is less likely to coincide exactly with the minimum point. Furthermore, as we will see in the next chapter, it usually suffices to use only few continuation steps with no need to take A to its limit. The price paid for using the smooth equation of (3.16) instead of equation (3.13) is that the objective function is nonconvex and relaxed, i.e. at optimum equation (3.15) does not hold, and we are not guaranteed that an optimum of P i is achieved, unless Ai = 0. Still, the solution is good, because the minimum of (3.16) is at d,y = \j2~Oij which is not far from the feasibility edge d,-y = Cjy. Note that since our block separation a is defined heuristically (since we do not know how much space is needed for routing), there is no need to insist on the exact solution to P I . Note also that if one does not carry the continuation sequence to the limit also then objective function of equation (3.13) also is relaxed (and the ill-conditioning may be avoided). The physical analogy to potential energy is made when we note that equation (3.16) is a variation on the Lennard-Jones 6-12 potential equation [REIC 80]. If we think of the blocks i and j as "charged particles" then V,y is a measure of the potential energy between the particles, which is very high when the particles are very close together and almost 0 when they are far apart. Observing the graph of VJy in figure 3.3, we note that the (negative) minimum is achieved at rf,-y = V^cr.y and that Vj-y grows rapidly when d,y is below <r,y and V,y —*• +oo as d,-y approaches 0. If we now take <7,y to be as in equation CHAPTER 3. THE MATHEMATICAL MODEL. 60 13 Figure 3.3: Graph of V{j (3.4) we practically ensure that those blocks would not overlap. This type of penalty function is sometime called barrier function [LOOT 71]. Of course, a very large number of interconnections may result with overlaps but this is easily taken care of by changing equation (3.4) to read an = Cy(y/wf+l* + yjw) + I) + a) (3.17) for some Ct-3- > 1, say C t J- « i = 1 C^ proportional to the total weight of interconnec-tions between block i and block j. Note that the 4e in equation (3.16) can be grouped with A x . Alternatively we can also make e = e,-,- a parameter such that at its minimum point Vj;- = e,y. We can, therefore, adjust e,3- to reflect the importance of having any two blocks being near each other. To formulate an equality constraint like (3.5) there are two approaches, discrete and continuous. The discrete approach would yield a mixed (nonlinear) integer pro-gramming problem, and possibly a branch and bound algorithm would be used for its solution, where on each leaf of the search tree a problem like the original PI (but CHAPTER 3. THE MATHEMATICAL MODEL. 61 -cos(m 9 ) i 2 ( m - l ) i t 2% Figure 3.4: The function Ri smaller) is solved. This is not very efficient, and we again resort to a continuous for-mulation. We write (3.5) as nr=i(0.-<^) = o i < » < n (3.18) To obtain an unconstrained problem we again use a penalty function, R, and minimize Figure 3.4 depicts the function Ri. Again note that the same tradeoff between exactness of solution and smoothness of the penalty term affects the choice of Ri here. We note that the function —COS(m6i) achieves minimum value at each angle = ^ for k = 0,. . . , m — 1. With a large enoug-h weight A2, we can make 0,- for % — 1 , . . . , n arbitrarily close to the allowed angles. For m —»• 00 the problem becomes "closer" to continuous which implies applying the penalty less stingily for large values of m. Note that in contrast to the constraints (3.1), the constraints (3.5) are not in direct competition Z = D + XiV + \2R (3.19) where R CHAPTER 3. THE MATHEMATICAL MODEL. 62 with the objective of minimizing the wire length. This suggests that different penalty-functions (loss functions) should be used for the different constraints. Of course we may have to fix the angles 0,- so obtained exactly afterward, but this should not offer any practical problem since the human designer cannot distinguish between very small angle changes. Unfortunately we can not use the same methodology of continuous formulation and penalty functions for the constraints (3.6). While it is clear that in equation (3.19) we could enforce feasibility by making \ \ and A 2 large enough, this is not the case for a penalty function representing (3.6). The problem here is that the constraint sets (3.1) and (3.6) may be in direct competition. If, for example, a grid point falls near the center of gravity of a subset of the blocks, all these blocks will be "attracted" to this grid point while they still repel one another. We have chosen to use a discrete approximation which will map the placement resulting from minimizing equation (3.19) onto grid locations. The next section presents this approximation. 3 . 6 The fixed slots problem. Reformulating the problem P i plus the constraint (3.6) (this formulation does not take into account the allowable orientations) we get a variant of the Q A P of section 2.2.1. In our model we attempt to use the solution found for the non-restricted continuous placement problem and approximate the best discrete placement from this. To this end CHAPTER 3. THE MATHEMATICAL MODEL. 63 we define: C^ = distance square from the current position of block * to slot j = (x* - x>Y + (y* - y ;')2 where £,*,y,*, i = 1, . . . ,n are the coordinates of the best continuous placement found by minimizing Z. Also, if m > n, ie. there are more slots than blocks, we add dummy blocks j = n + 1 , . . . , m such that C^ = 0 for all slots k. We can now reformulate an approximate problem MIN 22 CijXij (3.20) i=l j=i+l Subject to: E?=iXiy = l forj = l , . . . , n Zj=iXij = l fon' = l , . . . , n (3.21) Xti €{0,1} Informally we are minimizing the changes resulting from moving the blocks from their current optimal position into the fixed slots. Note that the above is a linear assignment problem and can be solved using the Hungarian method in 0(n 3) time [KUHN 55]. Of course the placement here may be sub-optimal to a placement done via a branch and bound algorithm or implicit enumeration on the original QAP, but it is much faster, and the obtained solution may not differ much from the optimal one as we are using a good placement to start with. This is also the method employed by [FUKU 83,QUIN 79] to map continuous placements into discrete ones. If we use the model given by problem P2 we may lose the good orientations which /were achieved in the continuous optimization. We can now fix all the {x,-,y,-, 1 < CHAPTER 3. THE MATHEMATICAL MODEL. 64 i < n} to their assigned grid points and optimize P2 again only over the variables {6i, 1 < i < n}. To get a rough measure of the computational complexity, we may consider the cost of function evaluations. Note that for n blocks with a total of m interconnections the complexity of evaluating the objective function is 0(n2 + m). Computing the first partial derivatives in each iteration can be seen to be also of complexity 0(n2 + m) since each variable only occurs in 0(n) terms. Since the number of iterations is independent of n (the quasi-Newton method converges sufficiently fast) and m = 0(n2), we get an overall complexity of 0(n2) for our placement algorithm. Also note, from the construction of D, V, R, the possibility of applying ra^n~1^ parallel processors for calculating the and Vfj terms in constant time and applying n parallel processors for calculating the Ri terms. Chapter 4 The solution strategy. 4.1 Introduction. The unconstrained minimization problem with which we have ended up in chapter 3 is still nontrivial. Particularly difficult is finding a global minimum for P l , P 2 or P3. For the solution process we utilize continuation. The question we are faced with now is: how to choose two sequences of values X\, X\,A™ and X\, Xl,X™ such that minimizing the sequence of problems Zx = D - f X\V + X]R Z2= D + X\V + X\R Zm= D + XfV + XfR where the placement resulting from minimizing Zi is the starting point for minimizing Zi+i, will result in identifying a local minimum which is as close as possible to the global minimum. The results here are based both on the convexity analysis in the following section and on experimental evidence using the quasi-Newton approach for unconstrained nonlinear minimization. 65 CHAPTER 4. THE SOLUTION STRATEGY. 66 4.2 Convexity analysis. For Pi and P2 to be convex, both the objective function and the feasible set generated by the constraints must be convex. We first investigate the objective function Z = D. Again we specify D as D = T , t ( E m-^y+im-m)2))- (4.2) <=i k=i+i <T;,rl,!>e/ We note that D is a sum of functions of the form (xi — x 2) 2 . In Lemma 1 we show that the function f(xx,x2) = (xi — x 2) 2 1 S convex, which in turn implies that D is convex. L e m m a 4.1 The function /(x) = (xi — x 2) 2 is convex. Proof: Let x ,x E R 2 and 0 < o < 1. We need to show that /(ax + (1 - cr)x) < <r/(x) + (1 - <r)/(x) (4.3) or /*(x,x) := a/(x) + (1 - o)f(x) - / (ax + (l - a)x) > 0 (4.4) but by direct computation we get: /*(x,x) = a ( x i - x 2 ) 2 + ( l - a ) ( x T - X 2 - ) 2 - (axi + (1 - o-)xT- (crx2 + (1 - a)xi"))2 = (o-- a2) ((xi - x2) - (xi - x?))2 > 0 since 0 < o < 1 (a - a2) > 0 • Since £> is the sum of convex functions, D is itself convex. Since D is convex every local minimum is also a global minimum. We therefore conclude that at least a minimum found for Z = D can be guaranteed to be the global minimum. CHAPTER 4. THE SOLUTION STRATEGY. 67 We now analyse the convexity of the constraint set. Let A = {(x1,y1,x2,y2)\C(x1,y1,x2,y2) > d} ( 4 . 5 ) be the feasible set where C(xi,yu £2,2/2) = (xi - x 2 ) 2 + (j/i - y 2 ) 2 . We need to check if A above is a convex set. Unfortunately we can easily find an example where A is not convex. Let zx = (0 ,0,1 , (1 -^3) ) z2= ( i , ( i - v / 3 ) , o , o ) then Zi,Z2 £ A since C{Z1) = 0 + (1 - (1 - Vd))2 = d>d C(Z2) = (1 - (1 - Vd))2 + 0 = d>d. We know that for a convex set, A, Z\,Z2 € A => ZXZ2 C A where ~Zx~Z~2 = {(1 - e)Zi + eZ2\0 < e < 1} but for e = i we let P = (1 - e)Zx + eZ2 = | (o , 0,1, (1 - Vd)) + i ( i , (1 - VI), 0,0) = 1(1,(1-Vd),1,(1-Vd)) now P € ZXZ2 by construction but c(p) = (\Vd)2 + [\Vd)2 < d which implies that P £ A and therefore A is not convex. CHAPTER 4. THE SOLUTION STRATEGY. 68 4 . 3 Experimental evidence. We are looking for effective values for A x and A 2 . We note that any Ai > 0 only-affects the magnitude of the minimum value (= e,-;-) of V,-;- but not the maximum since Vij —> oo as dij —> 0. We therefore conclude that for any A x > 0 the term \\V{j will act as a penalty function allowing the blocks i and j to come arbitrarily close to each other as At —> 0 but not to slide over each other. Setting \i > 0, therefore, will not result in a significant change in the topology of the placement. We control the scaling of the paramters by insisting that the placement is to be done in a unit square. This means that wire length, for example, cannot exceed y/2 and D < y/2m were m is the total number of interconnecting nets. We will try A': G {0, i} only. One approach for finding the values of A2 is to consider A2j- as a continuous multiplier for the n constraints of (3.18). This is theoretically feasible since the constraints (3.18) are equality constraints, and lead to the Lagrangian n m L = D + X[V + 22 A2y II (*y - <t>k) • (4-6) j=l k=l We then add the partial derivatives for % = 1, . . . , n to the gradient vector. To be able to minimize (4.6) analytically it suffices to ensure that the Jacobian of the constraints (3.18) has full rank [GOTT 73,GILL 81], as is the case here. To be able to minimize (4.6) numerically, however, we have to look for a saddle point of L since minimizing P2 subject to the constraints (3.18) is equivalent to finding a saddle point of CHAPTER 4. THE SOLUTION STRATEGY. 69 (4.6) [GOTT 73]. This is hard to do since there are no "nice" conditions on the Hessian which ensure saddle point (in contrast to the definiteness condition for a minimum or a maximum). We look at possible discrete values for the A2 parameters. We first note that the R functions are bounded namely, — n < R < n, which means that A2 scales the terms values — X\n < X'2R < A2n. If A2 is large enough such that X2R » D + X[V (4.7) holds, then the blocks would tend to get fixed in the nearest allowed angle, since the A2i? term assumes more importance in the minimization (which does not necessarily coincide with the best allowed orientation). The value for which this happens depends on the length of the interconnections, and was, for the problems we have tried, roughly A2 « 1.0. If we let, on the other hand, A2 —»• 0, this has the effect of reducing the penalty generated by the term X2R, thereby, allowing the blocks to assume any angle. We therefore used A2 values in the interval [0, l]. One simple strategy in minimizing Z is to let m = 1 and A™ = X™ = 1. Then we minimize ' Zm = Z = D + V + R. (4.8) As the penalty function V is turned on, blocks are not allowed to slide over one another. As the penalty function R is also turned on, blocks tend to get fixed in the nearest allowed orientation. The improvement over the initial placement, therefore, may not CHAPTER 4. THE SOLUTION STRATEGY. 70 Figure 4.1a: Initial Figure 4.1b: Final Figure 4.1: 3 block placement using A™ = A™ = 1 be significant and depends strongly on the initial placement of the blocks. Figure 4.1 presents a simple 3 block placement problem and the result of optimizing with A™ = A™ = 1 and y- allowed angles. This is clearly far from the global minimum. A far better strategy is to use the following continuation sequences: X\ = o A* = 0 \\ = l X\ = ei Af = l A| = e2 (4.9) A f = l A 2 m = l with 0 < ei < €2 < ... < em_2 < 1. This strategy results with the following minimization sequence: Zx = D Z2= D + V + exR Z3= D + V + e2R (4.10) Z m = D + V + R Note that the elimination of the penalty terms in Z\ of (4.10) allows the blocks to slide over each other, therefore resulting in overlaps among the blocks, but also in a much better topological placement. This is really solving the "relaxed" problem which occurs CHAPTER 4. THE SOLUTION STRATEGY. 71 when we solve for the original objective of P I ignoring all the constraints. Note also that in contrast to other placement methods [IOSU 83,QUIN 79], since blocks occupy area and are not reduced to points (D is measured from the terminals and not from the centers) the event of collapsing all blocks into a single point is very unlikely. Finally note that by the convexity analysis of the previous section, Zx of (4.10) is convex and can be solved for the global minimum. In minimizing Zi,...,Zn in sequence we increment A2 from 0 to 1, first giving the blocks full freedom of rotation and then restricting them into the allowed angles only. Figure 4.2 presents the sequence of solutions for the simple 3 block placement problem of Figure 4.1 starting from the initial placement of Figure 4.1a, and the result of optimizing using this strategey with allowed angles. This is clearly a global minimum for this problem. Note that block 3 has rotated into its optimal orientation, and that the two blocks 2 and 3 have slid over each other into their optimal positions. A natural question arising here is, why not take A 2 € {0,1}, i.e. is there any benefit to using the sequence A 2 = 0, ei, e 2 , 1 . 0 instead of just using A 2 = 0,1.0, as was the case with A^? Figure 4.3 presents a test case which illustrates the benefits of using the above sequence. In figure 4.3a the initial placement of a two blocks problem is presented. In this example the blocks are fixed in their location, and we use m = 4, namely only horizontal and vertical placement is allowed. In Figure 4.3b we see the result of optimizing with Z\ — D. Note that block 1 is slightly biased toward the CHAPTER 4. THE SOLUTION STRATEGY. 72 Figure 4.2a: Ai = A 2 = 0.0 - o . Figure 4.2b: A t = 1,A2 = 0.001 Figure 4.2c: A x = 1,A 2 = 1 Figure 4.2: 3 block-placement using improved strategy 90° angle in its orientation, and block 2 is slightly biased toward the 0° angle in its orientation. In figure 4.3c we observe the result of optimizing Z2 = D + 1.0R, which is a suboptimal solution since there exists an allowed orientation, as in figure 4.3a, in which the line connecting the two terminals is shorter. This suboptimal placement is the result of applying the "full strength" of the penalty term R at once, which leads to fixing the "nearest allowed angle" to each of the blocks. Figures 4.3b, and 4.3d-4.3f, on the other hand, present the continuation sequence of optimizations of table 4.1 which results with the optimum. What figure 4.3 illustrates is that there are cases (probably many) in which the continued application of the penalty term, R, is necessary to achieve good placement. Unfortunately, we do not know of any automated way to produce the optimal sequence of values for A 2 . Although various heuristics can be employed here, CHAPTER 4. THE SOLUTION STRATEGY. 73 Figure 4.3a: Initial Figure 4.3d: A*2 = 0.001 1 2 Figure 4.3e: A» = 0.01 Figure 4.3f: A*2 = 1.0 Figure 4.3: Choosing values for A 2 PROBLEM FIGURE ZX = D 4.36 Z2 = D + O.OOli? 4.3d Z3 = D + 0.01R 4.3e Z5 = D + 1.0R 4.3/ Table 4.1: Continuation sequence for figure 4.3 CHAPTER 4. THE SOLUTION STRATEGY. 74 such as applying binary search to A2, or incrementing it with a fixed small value, we have found, experimentally, that only few iterations with A2 = 10A2_1 were enough for good placements. As the experimental results of the next section indicate, our strategy prbves useful, and in particular minimizing Z = D gives excellent results. This is reasonable since we have shown Z = D to be a convex problem. Finally we discuss the "fixed slots problem". Here we apply the Hungarian method to the placement resulting from minimizing Zm as above, which results with placement at the predefined slots. We then fix x\, y,* for % = 1,..., n leaving 0*{ as the only variables and we solve the following sequence of problems: Zx = D Z2= D + erf Z3= D + erf (4.11) Zm = D + R. The sequence in 4.11 results with, again, fixing the allowed angles. The additional sequence is necessary since moving the blocks into their slots may result with some op-timal orientations becoming suboptimal. The next section presents some experimental results which support our choice of strategy and present experimental choices of the A sequences. Chapter 5 Experimental results. 5 . 1 I n t r o d u c t i o n . We have implemented an experimental program on a SUN workstation running UNIX. In our program we calculate D, V and R and the gradient vector for the variables. We use a canned routine, FNMIN [VASE 84], which implements the Variable Metric Method based on the quasi-Newton approach for the unconstrained minimization of nonlinear functions [FLET 70]. See appendix A for a short discussion of quasi-Newton methods. In this chapter we discuss our experience with the optimization routines and the interactive block editor designed to support interactive input and editing of block place-ments. We present some simple 6 and 10 block placement problems. Finally we test our placement procedure on the more involved Steinberg [STEI 61] example; we com-pare our results with previous ones for this problem and briefly investigate lower bound techniques. 75 CHAPTER 5. EXPERIMENTAL RESULTS. 76 5.2 The block edit program. An interactive block editor was developed, utilizing the graphics package CORE [SUN 85] on a SUN/50 workstation. The block editor allows the creation, deletion and mod-ification of three types of objects: blocks, terminals, and interconnecting nets. In the graphical representation blocks appear as rectangles, terminals as stars on the edge of the blocks, and nets as dotted straight lines between the terminals. The editor maintains a database and allows saving and restoring of a placement to and from a disk file. The editor i/o commands include viewing the current placement, windowing on part of it, or generating a scaled output plot. The editing menu includes commands for creating, deleting, moving, copying, and rotating blocks. A block can be picked as the current block which implies that all the succeeding commands will apply to this block. The current block is displayed with a pattern different from the other blocks. There is also an information command which displays all the parameters associated with the current block. These parameters include the block's x, y position, its rotation angle, its width and length, and finally its fixed status. The fixed status is used by the optimization routines to determine if the position and orientation of a block can be changed during the optimization, or if they are of fixed value. A command is available to change the fixed status of the current block or of all blocks. Deletion and addition commands exist also for terminals and nets. Of course, deleting a block deletes all its terminals and nets, and deleting a terminal CHAPTER 5. EXPERIMENTAL RESULTS. 77 PROBLEM FIGURE ^FUNCTIONS Zt = D 5.16 114 Z2 = D + V + 0.00172 5.1c 79 Z3 = D + V + OMR 5.1d 33 Z4 = D + V + 1.0R 5.1e 37 Table 5.1: Continuation sequence for figure 5.1 deletes all its nets. The editor gives a full interface to the optimization routines. One can use the current graphical representation as a placement input for the optimization functions, and can set the various switches and parameters for the optimization. 5 . 3 Placement examples. Various placement problems were tested. Figure 5.1 presents a 6 block placement problem (allowed angles which are multiples of |) solved in stages with the continuation sequence of table 5.1. A measure of computational complexity is ^FUNCTIONS which is the number of the Z{ function evaluations before convergence is reached. We found that for all the block configurations tested the continuation sequence A2 = 0.0,0.001,0.01,0.1,1.0 was sufficient to achieve good orientation in a placement problem with a fixed set of allowed angles. Note that for Jhis example already A 2 = 0.01 was large enough to fix the blocks, so there is no noticeable difference between figure 5.Id and figure 5.1e. Figure 5.2 presents the final result when angles which are multiples of CHAPTER 5. EXPERIMENTAL RESULTS. 78 Figure 5.1c:5.1c: Figure 5.Id: Figure 5.1e: Ai = 1.0, A 2 = 0.001 Ai = 1.0, A 2 = 0.01 A x = 1.0, A 2 = 1.0 Figure 5.1: 6 block placement | are allowed. The running times and function evaluations were roughly the same as in table 5.1. Figure 5.3 presents a 10 block placement problem (allowed angles which are multiples of j) solved in stages with the same strategy as above, resulting with the optimization sequence of table 5.2. Note that although figure 5.3d and 5.3e are almost identical there was a computational effort involved in solving Z4. This is the result of additional degrees of freedom which caused the final convergence to be at a slower rate. One way to overcome this problem is to fix the placement of some blocks before the optimization is done. Of course such an arbitrary fixing of some blocks may result in a less favourable placement than is otherwise possible. CHAPTER 5. EXPERIMENTAL RESULTS. 79 3 Figure 5.2: 6 block placement with 45° angles. PROBLEM FIGURE ^FUNCTIONS ZX = D Z2 = D + V + O.OOliE Z3 = D + V + 0.01J2 Z4 = D + V + 1.0R 5.36 5.3c 5.3d 5.3e 73 129 53 61 Table 5.2: Continuation sequence for figure 5.3 To further support the claim for good placement achievable by this strategy, we have applied it to the above 6 and 10 blocks placement problems with 10 random starts generated for each problem. The random starts were generated by randomly setting Xi, yi, di for i = 1 , . . . , n at the initial placement before optimizing Z\. In both problems we have found that all 10 trial starts have converged to exactly the same minimum. We have also noticed that the convergence to the same placement occurred very early in the sequence, after minimizing Z\ — D. Unfortunately, figure 5.4 presents a simple test case which shows that this "globally convergent" quality of the algorithm does not always work. The difficulty illustrated in the optimization sequence of figures 5.4a-5.4c is that the blocks may not be able to slide over each other at the first stage when optimizing CHAPTER 5. EXPERIMENTAL RESULTS. Figure 5.3: 10 block placement. CHAPTER 5. EXPERIMENTAL RESULTS. 81 Figure 5.'4a: Initial Figure 5.4b: Z\ — D Figure 5.4c: Final Figure 5.4d: Initial Figure 5.4e: Zi = D Figure 5.4f: Final Figure 5.4: Placement problem where initial placement is important. Z\ = D, if they have reached the minima in such an orientation that their centers are still placed on the "wrong" sides. Figures 5.4d-5.4f show how different starting position leads to an optimal placement. We have also experimented with the fixed slots problem. The Hungarian method was implemented. Figure 5.5 presents the result of fixing the final placement of figure 5.3 into the given slots, using the continuation optimization sequence of table 5.3 Figure 5.5 shows the result of letting the 10 blocks rotate freely again and then fixing their angles using the above sequence. Note that block 5 was rotated from its initial position after fixing the blocks into a better fixed orientation. We have also found that although fixing some of the blocks variables speeds up the convergence of the optimization routines, it may result in suboptimal placement. As far CHAPTER 5. EXPERIMENTAL RESULTS. PROBLEM FIGURE ^FUNCTIONS ZX = D 5.56 25 Z2 = D + O.OOli? 5.5c 22 Z3 = D + OMR 5.5<f 29 Z4 = D + 1.0R 5.5e 23 Table 5.3: Continuation sequence for figure 5.5 8 ... 10 - 9 Figure 5.5a: Initial Figure 5.5b: In Slots gag.-• Figure 5.5e: A 2 = 0.01 [#0-0 • Figure 5.5f: Final Figure 5.5: 10 Block placement into fixed slots. CHAPTER 5. EXPERIMENTAL RESULTS. 83 as run time is concerned we have found out that small placement problems (n < 15) can be solved interactively on a SUN workstation (without a floating point hardware). By interactively we mean a wait of no more than a few minutes for each of the optimization sequences. We expect much larger problems to be solvable efficiently on a mainframe computer. 5.4 Steinberg example. Figure 5.6 and tables 5.3 and 5.4 describe our experiments with the classical example of Steinberg [STEI 61]. Figure 5.6 presents our best result on the Steinberg problem. This problem consists of placing 34 blocks with a total number of connections 2,620 on a 4 by 9 grid. Distances are measured as one unit between neighboring grid points. Note that since we restrict the placement to grid points the separation between the blocks is determined by the grid spacing and we can calculate an objective function exactly. Previous researchers [BAZA 83,STEI 61,GILM 62,HALL 70] have used three distance norms to measure the wire length: lx the Manhattan distance, / 2 the Euclidian dis-tance and l\ the Euclidian distance squared. The original specification of the problem, including the connectivity matrix, can be found in [STEI 61]. Since we represent the placement problem more precisely, we had to change the problem by adding terminals to the blocks and routing the connections through the terminals. We have arbitrarily added four terminals to each block and connected the wires arbitrarily to one of the CHAPTER 5. EXPERIMENTAL RESULTS. 84 Figure 5.6: Steinberg example. terminals. Steinberg's best placement results are summarized in table 5.4 under the row labeled "Steinberg". Over the last few years other researchers [HALL 70,GILM 62,GRAV 70,BAZA 83,HILL 66,CART 79.CHEN 84] have improved significantly on Steinberg's results. The best known wire lengths for the li norm were achieved by Hall [HALL 70] and the best known wire lengths for the l2 and l\ where achieved by Cheng and Kuh [CHEN 84]. Our best results are better than any other previously known in all norms. Table 5.4 presents two of our best placements and figure 5.6 gives the resulting placement for the first set of numbers. We note that experimenting with various initial configura-tions, various 4 by 9 grid spacing and various fixed blocks, we were able to find many placements configurations with values near our minimum. Table 5.5 presents some further statistics regarding the Steinberg problem. Un-der the "Random" column are the average wire lengths for a few random placements. CHAPTER 5. EXPERIMENTAL RESULTS. 85 Author li k Carter & Breuer N.A. 5575 N.A. Hillier & Connors N.A. 4821.7 10929.9 Grave & Whinston N.A. 4490 11909 Gilmore(n* algorithm) N.A. 4547.5 10656 Gilmore(n5 algorithm) N.A. 4680.3 11929 Steinberg N.A. 4894.5 11875 Bazar aa N.A. 4800 N.A. Hall 5139 4419.1 9699 Cheng — Kuh 5316 4358.4 8596 Ours(l) 5016 4271.8 8552 Ours (2) 5225 4296.2 8475 Table 5.4: Steinberg example: previous results. These are roughly twice the wire length of the best known placements. Since placement problems, on a fixed grid, are computationally hard [GILM 62,BREU 76b] the verifica-tion of the minimum wire length for a placement requires exponential time via branch and bound or implicit enumeration techniques [GILM 62,BAZA 83]. An alternative approach is to try to compute a lower bound on the total wire length. Since there are a total of 2,620 connections and blocks are not allowed to share the same grid point, this number becomes a trivial lower bound. We could, however, exhibit slightly better bounds in the following way. Assume we "charge" the cost of all the connections coming out of block X to block X. Locating block X at the center of the 4 by 9 grid as in figure 5.7 we note that there are exactly four slots from which the distance, in the li norm, is one unit. These slots are marked '1'. Similarly there exist exactly seven slots CHAPTER 5. EXPERIMENTAL RESULTS. 86 Norm Random Lower Bound Best Real h 11709 3464 5016 3992.3 h 9530 3196.9 4271.8 3251.9 45032 4326.5 8475 5733.5 Table 5.5: Steinberg example: experimental results, from which the distance to X is two units and these are marked '2'. We can subdivide the grid locations into equivalence classes with regard to the slots distances from X. These equivalent distance classes are marked in figure 5.7a for the l\ norm and in figure 5.7b for the l2 and l\ norms. We can now use these classes to compute the best possible configuration for any one block by putting heavily connected blocks in the nearest slots, and lightly connected blocks in the further away slots. Repeating this for all blocks while making sure to charge any connection only to one block we get a lower bound on the best achievable wire length. Of course what we are really doing is solving a much relaxed problem; one in which blocks can both overlap and "split". Note also that the trivial lower bound can be counted in this model by using only the first equivalence class to accommodate for all the blocks. This lower bound schema can be improved by shifting the X position around the grid, calculating the minimum value for each of the 34 positions. We then repeat this for every block and store the results in a 34 by 34 matrix. A linear assignment on this matrix now gives a lower bound [GILM 62]. We have calculated lower bounds for the CHAPTER 5. EXPERIMENTAL RESULTS. 87 6 5 4 5 5 4 3 4 4 3 2 3 3 2 1 2 2 1 X 1 3 2 1 2 4 3 2 3 5 4 3 4 6 5 4 5 11 10 9 10 8 7 6 7 5 4 3 4 4 2 1 2 3 1 X 1 4 2 1 2 5 4 3 4 8 7 6 7 11 10 9 10 Figure 5.7a:/i Norm Figure 5.7b:/2,/| Norms Figure 5.7: Lower Bounds. Steinberg example and the numbers appear under the column "Lower Bound" in table 5.5. Note that there is still a ratio of about | between our lower bounds and the best results. We believe that the differences reflect mainly on optimistic lower bounds rather than on bad placements. Finally the column marked "Real" in table 5.5 contains the total length of the connections as measured from the terminals. The separation between the blocks was normalized to 1.0, and the blocks took approximately 85% of the area of the placements. These numbers are smaller than all others even though we have allowed only four block orientations. Of course, these numbers cannot be compared directly with the other measures and are presented here only for completeness. Chapter 6 Conclusions and future research 6.1 Conclusions Our nonlinear model for the placement problem has proven successful in overcoming two of the limitations of previous force directed models. The potential function ensures that blocks do not overlap, and the calculation of distances from the terminals instead of from the centers, forces the blocks to rotate, resulting in a better placement. Our model allows for the handling of fixed and movable blocks, and for convenient control over the separation between blocks via the parameter <7fJ- in the potential function. The quasi-Newton method used in the penalty approach, being superlinearly convergent, ensures reasonable running time. We have also implemented the penalty functions and transformations allowing re-strictive forms of the placement problem to be solved in our system. A graphics block editor was designed and implemented to complement the optimization routines with an interactive user interface. 88 CHAPTER 6. CONCLUSIONS AND FUTURE RESEARCH. 89 Our experience with the minimization sequence has resulted in good placements. We further conclude that our optimization procedure is robust, being fairly independent of the initial placement. We have conducted some simple experiments and a larger one, and our results compare favourably with previous results. Better placements for the Steinberg problem than the currently best known have been found. 6.2 Future study Among the issues raised by our work, the following points require further research: • Finding a systematic way to generate the values of the continuation parameters. • Using advanced mathematical techniques such as regularization in order to decide how to ensure that our minimization process is not ill-conditioned. • Finding a better algorithm for mapping the best continuous placement to the discrete grid placement. • Exploring further lower bound techniques in order to be able to judge the quality of the obtained placements. • Getting a better understanding of the potential function as a general penalty approach. Bibliography [ALON 85] A. Alon, U . Ascher, "Minimization of Placement Using Potential Directed Forces," ISTAD Con. Proc. July 1985. [BAZA 83] M.S. Bazaraa O. Kirca, " A Branch-and-Bound-Based Heuristic for Solv-ing the Quadratic Assignment Problem," Naval Res.. Log. Quarterly Vol. 30, pp. 287-304, 1983. [BLAN 84] J.P. Blanks "Initial Placement of Gate Arrays Using Least-Squares Meth-ods," Proc. 2l"* Design Automation Conference, pp. 670-671, June 1984. [BLAN 85] J.P. Blanks "Near-Optimal Placement Using A Quadratic Objective Func-tion," Proc. 22th Design Automation Conference, pp. 609-615, June 1985. [BRAD 84] H.N. Brady " A n Approach to Topological Pin Assignment," IEEE Trans-action on Computer-Aided Design, Vol.3, #3, pp. 250-255, July 1984. [BREU 76a] M . A . Breuer, Design Automation of Digital Systems, Theory and Tech-niques, Vol. 1, Prentice-Hall. New York, U.S.A., 1976. [BREU 76b] M . A . Breuer " A Class of Min-Cut Placement Algorithms," Proc. 13th Design Automation Conference, pp. 284-289, June 1976. [BURS 83] M . Burstein, S.J. Hong, R. Pelavin "Hierarchical VLSI Layout: Simulta-neous Placement and Wiring of Gate Arrays," VLSI 83, pp.45-60, 1983. [CALA 85] P. H . Calamai, A. R. Conn " A Projected Newton Method for lp Norm Location Problems," Technical Report CS-85-07 University of Waterloo, August 1985. [CART 79] H.W. Carter, M . A . Breuer, Z.A. Zyed, "Incremental Processing Applied to Steinberg's Placement Procedure," Proc. 16th Design Automation Conference, pp. 26-31, June 1979. 90 BIBLIOGRAPHY 91 [CHEN 84] C. Cheng, K.S. Ernest, "Module Placement Based on Resistive Network Optimization,'' IEEE Transaction on Computer-Aided Design, VoL3 #3 pp. 218-225, July 1984. [CHYA 83] D. Chyan, M . A . Breuer, " A Placement Algorithm for Array Processors," Proc. 20th Design Automation Conference, pp. 182-188, June 1983. [CROW 83] I.H. Kirk, P.D. Crowhurst, J .A. Skingley, J .D. Bowman, G.L. Taylor, "Placement of Irregular Circuit Elements on Non-Uniform Gate Arrays," Proc. 20"* Design Automation Conference, pp. 637-643, June 1983. [DONA 80] W.E. Donath, "Theory and Design Automation," Proc. 17th Design Au-tomation Conference, pp. 412-419, June 1980. [DREZ 80] Z. Drezner, "DISCON: A New Method for the Layout Problem," Opera-tion Research, Vol. 28, #6, pp. 1375-1384, 1980. [DUNL 85] A . E . Dunlop, B.W. Kernigham, " A Procedure for Placement of Standard-Cell VLSI Circuits," IEEE Transaction on Computer-Aided Design, Vol.4, #1, pp. 92-98, January 1985. [EYST 73] J.W. Eyster, J .A. White, W.W. Wierwille, "On solving Multifacility Lo-cation Problems Using a Hyperboloid Approximation Procedure," AIIE Trans., Vol. 5, pp. 1-6, 1973. [FLET 70] R. Fletcher, " A New Approach to Variable Metric Algorithms," Com-puter Journal, Vol 13, pp. 317-322, 1970. [FUKU 83] K. Fukunaga, S. Yamada, H.S. Stone, T . Kasai, "Placement of Circuit Modules Using A Graph Space Approach," Proc. 20th Design Automa-tion Conference, pp. 465-471, June 1983. [GARE 79] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, W.H. Freeman and Co., San Francisco, U.S.A., 1979. [GILL 81] P.E. Gill, W. Murray, M . H . Wright, "Practical Optimization," Academic Press, San Francisco, pp. 78-81, 1981. [GILM 62] P.C. Gilmore, "Optimal and Suboptimal Algorithms for the Quadratic Assignment Problem," J. Sos. Indust. Appl. Math. Vol. 10, #2, pp. 305-313, 1962. BIBLIOGRAPHY 92 [GOTO 79] S. Goto, " A Two-Dimensional Placement Algorithm for the Master Slice LSI Layout Problem," Proc. 16th Design Automation Conference, pp. 11-17, June 1979. [GOTT 73] B.S. Gottfried, J. Weisman, "Introduction to Optimization Theory," Prentice-Hall Inc, Englewood Cliffs, N.J., pp. 44-49, 1973. [GRAV 70] G.W. Graves, A . B . Whinston, " A n Algorithm for the Quadratic Assign-ment Problem," Management Science, Vol. 17, #2, pp. 453-471, 1970. [HALL 70] K . M . Hall, " A n r-Dimensional Quadratic Placement Algorithm," Man-agement Science, Vol. 17, #3, pp. 219-229, 1970. [HADL 64] G. Hadley, Non-Linear and Dynamic Program, Addison-Wesley, Reading, Mass., U.S.A., pp. 343-344, 1964. [HANA 76] M . Hanan, P.K. Wolff, B.J. Anguli, "Some Experimental Results on Placement Techniques," Proc. 13"* Design Automation Conference, pp. 214-224, June 1976. [HELL 82] W.R. Heller, G. Sorkin, K. Mailing, "The Planar Package Planer for System Designer," Proc. 19th Design Automation Conference, pp. 253-260, June 1982. [HILL 66] F.S. Hillier, M . M . Conners, "Quadratic Assignment Algorithms and the Locations of Indivisible Facilities," Management Science, Vol. 13, #1, pp. 42-57, 1966. [HOFF 79] S.C. Hoffman, COMPUTER-AIDED DESIGN of digital electronic cir-cuits snd systems, North-Holland, Brussels' pp. 229-236, 1979. [HUDS 84] J .A. Hudson, R .C. Peters, "Module Positioning Algorithms for Rectiliear Macrocell Assemblies," Proc. 21th Design Automation Conference, pp. 672-675, June 1984. [IOSU 83] A. Iosupovici, C. King, M . A . Breuer, " A Module Interchange Placement Machine," Proc. 20"* Design Automation Conference, pp. 171-174, June 1983. [KERN 69] B.W. Kernigham, S. Lin, " A n Efficient Heuristic Procedure for Partition-ing Graphs," The Bell System Technical Journal, pp. 291-307, February 1970. BIBLIOGRAPHY 93 [KIRK 83] S. Kirkpatrick, C D . Gelatt, M.P. Vecchi, "Optimization by Simulated Annealing," Science, Vol. 220, #4598, pp. 671-680, May 1983. [KORE 72] N.L. Koren, "Pin Assignment in Automated Printed Circuit Board Des-gin," Proc. 9th Design Automation Workshop, pp. 72-79, June 1972. [KOZA 83] T . Kozawa, H . Terai, T . Ishii, M . Hayase, C. Miura, Y. Ogawa, K. Kisida, N. Yamada, Y. Ohno, "Automatic Placement Algorithms for High Pack-ing Density VLSI," Proc. 20th Design Automation Workshop, pp. 175-181, June 1983. [KUHN 55] H.W. Kuhn, "The Hungarain method for the Assignment Problem," Naval Res. Log. Quarterly, Vol. 2, pp. 83-97, 1955. [LAUG 70] D.J. Laughhunn, "Quadratic Binary Programing with Application to Capital-Budgeting Problems", Operation Research, Vol. 18, pp. 454-461, 1970. [LIN 73] S. Lin, B. Kernighan, " A n effective Algorithm for the Travelling-Salesman Problem," Operation Research Vol. 11, pp. 498-516, 1973. [LOOT 71] F . A . Lootsma, " A Survey of Methods for Solving Constrained-Minimization Problems via Unconstrain Minimization," in Optimization and Design, Premtice-Hall Inc, Englewood Cliffs, N.J., pp. 88-113, 1971. [LOVE 72] R.F. Love, J .G. Morris, "Modelling Inter-City Road Distances by Math-ematical Functions," Oper. Res. Quart., Vol. 23, pp. 61-71, 1972. [LOVE 76] R.F. Love, J.Y. Wong, "Solving Quadratic Assignment Problems with Rectangular Distances and Integer Programing," Naval Res. Log. Quar-terly, Vol. 24, pp. 623-627, 1976. [MARK 84] L . A . Markov, J.R. Fox, J .H. Blank, "Optimization Techniques for Two Dimensional Placement," Proc. 21th Design Automation Conference, pp. 652-654, June 1984. [MAY 83] M . May, A. Iwainsky, P. Mennecke, "Placement and Routing of Logic Schematics," IEEE Transaction on Computer-Aided Design, Vol.15, #3, pp. 115-122, May 1983. [MCFA 83] M . C . McFarland, "Computer-Aided Partitioning of Behavioral Hardware . Descriptions," Proc. 2Qth Design Automation Conference, pp. 472-478, June 1983. BIBLIOGRAPHY 94 [MORR 79] [MORY 78] [MURA 79] [NAHA 85] [ODAW 85] [ O T T E 82] [PALC 84] [PREA 79] [PHYS 82] [PIER 71] [QUIN 79] [REIC 80] J . G . Morris, W.A. Verdini, "Minisum lp Distance Location problems Solved via s Perturbed Problem and Weiszfeld's Algorithm," Operation Research, Vol. 27, #6, pp. 1180-1187, 1979. L. Mory-Raveh, "Pin Assignment on Printed Circuit Board," Proc. 15th Design Automation Conference, pp. 70-73, June 1978. S. Murai, H . Tsuji, M . Kakinuma, K. Sakaguchi, C. Tanaka, " A Hierar-chical Placement Proceduer with a Simple Blockin Scheme," Proc. 16th Design Automation Conference, pp. 18-23, June 1979. S. Nahar, S. Sahni, E . Shragowitz, "Experiments with Simulated An-nealing," Proc. 22th Design Automation Conference, pp. 748-752, June 1985. G. Odawara, K. Iijima, K. Wakabayashi, "Knowledeg-Based Placement Technique for Printed Circuit Boards," Proc. 22th Design Automation Conference, pp. 616-622, June 1985. R .H. Otten, "Automatic Floorplan Design," Proc. 19"* Design Automa-tion Conference, pp. 261-267, June 1982. M . Palczewski, "Performance of Algorithms for Initial Placement," Proc. 21th Design Automation Conference, pp. 399-404, June 1984. B.T. Preas, W . M . VanCleemput, "Placement Algorithms for Arbitrarily Shaped Blocks," Proc. 16th Design Automation Conference, pp. 474-480, June 1979. BMS, "Statistical Mechanics Algorithm for Monte Carlo Optimization," Physics Today, May 1982. J.F. Pierce, W.B. Crowston, "Tree Search Algorithms for Quadratic As-signment Problems," Naval Res. Log. Quarterly, Vol. 18, pp. 1-36, 1971. N.R. Quinn, M . A . Breuer, " A Force Directed Component Placement Pro-cedure for Printed Circuit Boards," IEEE Transaction on Circuits and Systems, Vol.26, #6, pp. 377-388, June 1979. L. E . Reichl. A Modern Course in Statistical Physics, p. 363, University of Texas Press, 1980. BIBLIOGRAPHY 95 [RICH 84] B .D. Richard, " A Standard Cell Initial Placement Strategy," Proc. 21 th Design Automation Conference, pp. 392-398, June 1984. [ROTH 83] H.J. Rothermel, D.A. Mlynski. "Routing Method for VLSI Design Using Irregular Cells," Proc. 20*'1 Design Automation Conference, pp. 257-262, June 1983. [SAST 84] S. Sastry, A. Parker, "On the Relation between Wire Length Distribu-tions and Placement of Logic on Master Slice ICs," Proc. 21th Design Automation Conference, pp. 710-711, June 1984. [SAH 80] S. Sahni, A. Bhatt, "The Complexity of Design Automation Problems," Proc. 17th Design Automation Conference, pp. 402-411, June 1980. [SHA 85] L. Sha, R.W. Dutton, " A n Analytical Algorithm for Placement of Arbi-trarily Sized Rectangular Blocks," Proc. 22th Design Automation Con-ference, pp. 602-607, June 1985. [STAB 79] E.P. Stabler, V . M . Eureichik, "Placement Algorithm by Partitioning for Optimum Rectangular Placement," Proc. 16th Design Automation Con-ference, pp. 24-25, June 1979. [STEI 61] L. Steinberg, "The Backboard Wiring Problem: A Placement Algorithm," SIAM Review, Vol. 3, #1, pp. 37-50, January 1961. [SUNG 83] S. Sungho, "Linear Ordering and Application to Placement," Proc. 20th Design Automation Conference, pp. 457-464, June 1983. [SUPO 83] K. Sopowit, "Placement Algorithm for Custom VLSI," Proc. 20th Design Automation Conference, pp. 164-170, June 1983. [SUN 85] "Sun Core Manual," Sun Microsystems Inc, Mountain View, C A , May 1985. [TOKI 84] T . Kozawa, C. Miura, H . Terai, "Combine and Top Down Block Place-ment Algorithm for Hierarchical Logic VLSI Layout," Proc. 21th Design Automation Conference, pp. 667-669, June 1984. [ULRI 79] L. Ulrich, " A Min-Cut Placement Algorithm for General Cell Assem-blies Based on a Graph Representation," Proc. 16"* Design Automation Conference, pp. 1-9, June 1979. BIBLIOGRAPHY 96 [UEDA 83] K. Ueda, T . Komatsubara, T . Hosaka, " A Parallel Processing Approach for Logic Module Placement,'' IEEE Transaction on Computer-Aided De-sign, Vol.2, #1, pp. 39-47, January 1983. [UEDA 85] K. Ueda, H. Kitazawa, I. Harada, " C H A M P : Chip Floor Plan for Hi-erarchical VLSI Layout Design," IEEE Transaction on Computer-Aided Design, Vol.4, #1, pp. 12-20, January 1985. [VASE 84] W. Vaessen, UBS NLP-Nonlinear Function Optimization. Computer cen-ter manual, University of British Columbia. August 1984. [VECC 83] M.P. Vecchi, S. Kirkpatrick, "Global Wiring by Simulated Annealing," IEEE Transaction on Computer-Aided Design, Vol.2, #4, pp. 215-222, October 1983. [VERG 67] R . G . Vergin, J .D. Rogers, " A n Algorithm and Computational Procedure for Locating Economic Facilities," Management Science, Vol. 13, #6, pp. 240-254, 1967. [WIPF 82] G.J. Wipfler, M . Wiesel, D.A. Mlynski, " A Combined Force and Cut Algorithm for Hierarchical VLSI Layout," Proc. 19"* Design Automation Conference, pp. 671-676, June 1982. Appendix A Quasi-Newton Methods. In this appendix we sketch a short introduction to quasi-Newton methods for uncon-strained minimization of nonlinear functions. More elaborate expositions and references can be found in [GOTT 73] and [GILL 81] for quasi-Newton methods. Newtons method. Let /(x) be the objective nonlinear unconstrained function for which we wish to find a critical point (local minimum, local maximum, or saddle point) with respect to the variables x = (xi,x2,..., xn). Consider a truncated Taylor series approximation to /(x) near a point Xjt f{x) « f{xk) + V/(x t)(x - xfc) + i(x - xk)TF{xk){x - xfc). ( A l ) Here V / ( X f c ) is the gradient vector of f[xk) and F(xk) is the Hessian matrix. Ignoring the higher order terms, the right-hand side of equation (A.l) has a critical point at x f c + 1 = xk - [Fix,)}-1 V /(x*) T (A.2) 97 APPENDIX A. Q UASI-NE WTON METHODS. 98 which gives us the pure form of Newton's method [GOTT 73]. One can terminate the iterations when 11 Xj. — x.k+111 < e for a given termination tolerance e. Verification of local minima is done by verifying that the Hessian is positive definite which is the condition for local a minimum. In practice one usually adds a multiplier ak to the second term of equation (A.2) which is chosen to approximately minimize f{xk) (This is called weak line search). Evaluating [//(x^)]-1 at every iteration may be fairly expensive. Quasi Newton methods attempt to approximate the inverted Hessian instead of directly recomputing it at every iteration. They also attempt to guarantee minimum upon convergence, by keeping the approximated inverted Hessian positive definite. Quasi-Newton methods. A simple quasi-Newton approach is to calculate the Hessian inverse once and not to update it between iterations. This is valid if indeed [//(x^)]-1 « [//(x^+i)] - 1 for all k. A more sophisticated method, the Variable Metric Method by Fletcher et al. [FLET 70] updates an approximation of the inverted Hessian at each iteration. Here is an outline of the variable matrix method. Start with any initial point x 0 and any symmetric positive definite matrix So and set k to 0. At each iteration: 1. Minimize f(xk — akSk V /(xfc)) w i t h respect to ak > 0 to obtain Xt+i, v / ( x j t+i) and pit = -ajtSjfe V /(xt). APPENDIX A. QUASI-NEWTON METHODS. 99 2. Let qk = V/(x*) - V / f ab+ i ) an<* let ok+i -ok + - j r - r 3. Let A; = A; + 1 and return to (1). Fletcher et al. show that if S* is positive definite then so is Sfc+i and that the rate of convergence is superlinear for xk close enough to a minimum.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Model and solution strategy for placement of rectangular...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Model and solution strategy for placement of rectangular blocks in the Euclidian plane Alon, Ami 1986
pdf
Page Metadata
Item Metadata
Title | Model and solution strategy for placement of rectangular blocks in the Euclidian plane |
Creator |
Alon, Ami |
Publisher | University of British Columbia |
Date Issued | 1986 |
Description | This thesis describes a nonlinear optimization model for the placement of rectangular blocks with some wire connections among them in the Euclidian plane, such that the total wire length is minimized. Such a placement algorithm is useful as a CAD tool for VLSI and PCB layout designs. In contrast to some previous placement techniques, the mathematical model presented here ensures that the blocks will not overlap, and minimizes the sum of the distances of the interconnections of the blocks with respect to their orientation as well as their position. We also present mechanisms for solving more restrictive placement problems: one in which there is a set of equally spaced, discrete angles to be used in the placement, and one in which the blocks have to be assigned into predefined slots. The mathematical model is based on the Lennard-Jones 6-12 potential equation, on a sine wave shaped penalty function, and on minimizing the sum of the squares of the Euclidian distances of the block interconnections. We implement and embed our optimization routines in an interactive graphic block editor. We also present some experimental results which show that near optimal placements are achieved with our techniques. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-06-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051907 |
URI | http://hdl.handle.net/2429/25740 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-UBC_1986_A6_7 A48.pdf [ 4.65MB ]
- Metadata
- JSON: 831-1.0051907.json
- JSON-LD: 831-1.0051907-ld.json
- RDF/XML (Pretty): 831-1.0051907-rdf.xml
- RDF/JSON: 831-1.0051907-rdf.json
- Turtle: 831-1.0051907-turtle.txt
- N-Triples: 831-1.0051907-rdf-ntriples.txt
- Original Record: 831-1.0051907-source.json
- Full Text
- 831-1.0051907-fulltext.txt
- Citation
- 831-1.0051907.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0051907/manifest