MINIMAL SPANNING TREES WITH DEGREE RESTRAINTS by ARCHIBALD McFARIANE B. S c , U n i v e r s i t y of B r i t i s h Columbia, 1963 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE i n the Department of COMPUTER SCIENCE We accept t h i s t hesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA October, 1968 C In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f the r e q u i r e m e n t s f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h Columbia, I agr e e t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and Study. I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by the Head of my Department or by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n of t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . . Department o f Computer Science The U n i v e r s i t y o f B r i t i s h Columbia Vancouver 8, Canada Date November 20, 1968 ABSTRACT i The purpose of t h i s t h e s i s i s t o develop a s o l u t i o n t o the problem of determining the minimal spanning tree with degree r e s t r a i n t s f o r a given non-directional graph. Section 1 gives an introduction t o the problem. A set of d e f i n i t i o n s describing the graphical terminology used i n the body of the t h e s i s , i s presented along with a d e s c r i p t i o n of the problem. At the end of t h i s section a few applications of the problem are given. Section 2 ou t l i n e s the method of so l u t i o n used. The algorithm incorporates a branch and bound technique and t h i s problem solving method i s discussed i n general i n the f i r s t part of the section. Some other app l i c a t i o n s of branching and bounding are also discussed. Next, the complete algorithm i s described along with a proof of optimality. A sample problem i s worked through t o i l l u s t r a t e the method of s o l u t i o n . Two d i f f e r e n t minimal spanning tree algorithms, one by R.C. Prim, the other by J.B. Kruskal, are used i n the main core of the s o l u t i o n algorithm. These two approaches are discussed with the a i d of a sample problem, at the end of Section 2. Computer programs were written t o t e s t the algorithms. Several sets of data were compiled f o r various s i z e s of graphs and values of degree r e s t r i c t i o n s . The r e s u l t s of these runs were tabulated and are discussed i n Section 3 . Next, a comparison i s made of the method discussed here and a so l u t i o n involving l i n e a r programming. Section 3 also presents some u s e f u l h e u r i s t i c approaches at sub-optimization which e f f e c t i v e l y reduce the amount of computation. Section 4 summarizes the r e s u l t s of Section 3 and in d i c a t e s the best approach t o use f o r a s p e c i f i c problem. TABLE OF CONTENTS i i Section 1 Introduction Page 1.1 D e f i n i t i o n s 1 1.2 Problem Description 2 Section 2 Method of So l u t i o n 2.1 Branch and Bound Methods 3 2;2 D e s c r i p t i o n of the Algorithm 5 2.3 V e r i f i c a t i o n of Optimality 9 2.4- Sample Problem 11 2.5 The Minimal Spanning Tree Algorithm 15 Section 3 Discussion of Results 3.1 Results f o r Several (n, r ) Combinations 23 3.2 Comparison with Linear Programming Solution 3V 3.3 H e u r i s t i c Suboptimization Schemes 39 Section 4 Conclusions 48 Bibliography 50 Appendix Desc r i p t i o n of the Computer Program 51 L I S T OF T A B L E S i i i Page 1. Comparison of Prim and Kruskal Methods 19 2. Comparison of I t e r a t i o n s Necessary t o Reach Optimal S o l u t i o n 20 3. Results f o r n — 10, r = 2 27 4. TI it n = 15, r _ 2 28 5. II it n = 20, r = 2 29 6. ii ii n = 20, r = 3 30, 31 7. II ii n = 30, r = 3 32, 33 8.. ii II N - 40, r • = 3 34 9. II II N = 50, r = 3 35 10. II II N y 50, r = 3 36 11. Suboptimization Comparisons f o r n = 15, r = 2 43 12. II ii " n = 20, r = 2 44 13. II it " n = 40, r = 3 45 14. Kruskal Search Limits 46 15. Number of Prohibited Links f o r n, r Combinations 47 LIST OF FIGURES i v Page 1. 8 x 8 Sample Problem 13 2. Distance Matrix f o r 1st Subproblem 13 3 . Complete Tree Diagram 14 4 . 8 x 8 Distance Matrix t o I l l u s t r a t e the Difference Between Kruskal and Prim 21 5. Kruskal Tree Diagram 22 6. Prim Tree Diagram 22 7. Program Block Diagram 52 ACKNOWLEDGEMENTS I wish t o thank Mr. H. Pachon of the Automatic E l e c t r i c Company for his suggestion of the t o p i c . I would also l i k e t o thank Dr. J.M. Kennedy f o r h i s assistance and encouragement during the writing of t h i s t h e s i s . F i n a l l y I wish t o thank the B r i t i s h Columbia Telephone Company for t h e i r f i n a n c i a l a i d and also the Department of Computer Science f o r making funds a v a i l a b l e during the summer of 1968 . INTRODUCTION 1 1.1 D e f i n i t i o n s A g r a p h G c o n s i s t s o f a f i n i t e s e t P o f nodes Pi>?2> " '>^n and a s e t R o r o r d e r e d p a i r s o f d i s t i n c t nodes o f P. E a c h p a i r o f nodes ( x , y ) € R i s c a l l e d l i n k o f t h e g r a p h G. A s s o c i a t e d w i t h each g r a p h G, we have a d i s t a n c e m a t r i x L w i t h e l e m e n t s 1- ( i = l , 2, ... ,n; ' j = l , 2. .. ,.n; i ^ j ) o f t h e g i v e n g r a p h . Node i i s c a l l e d t h e i n i t i a l node o f l i n k 1^ • and node j i s c a l l e d t h e t e r m i n a l node. L y denotes t h e l e n g t h o f t h e l i n k f r o m i t o j . A p a t h i n a g r a p h i s a s e t o f l i n k s l - j _ , ±2> • • • > s u c h t h a t t h e t e r m i n a l node o f t h e l - j _ l i n k c o r r e s p o n d s t o t h e i n i t i a l node o f t h e 1-^ +1 l i n k f o r i = l , 2 , .. . s-1. A g r a p h i s c o n n e c t e d i f t h e r e i s a p a t h j o i n i n g e v e r y p a i r o f nodes i n t h e g r a p h . I f l j _ j = l j . j _ f o r a l l i , j e P we s a y t h e g r a p h i s s y m m e t r i c . A l i n k i s s a i d t o be i n c i d e n t w i t h t h e nodes i t j o i n s . I f x i s an i s o l a t e d node o f g r a p h G t h e n ^ c , ^ ' . ^ R and £ , ^ ^ R f o r a l l i e P. A c o n n e c t e d g r a p h c o n t a i n i n g n d i s t i n c t nodes and n - l d i s t i n c t l i n k s w i t h no i s o l a t e d nodes i s c a l l e d a s p a n n i n g t r e e . The degree of any node o f a g r a p h i s e q u a l t o t h e number o f l i n k s i n c i d e n t w i t h i t . When t h e degree o f e v e r y node o f a t r e e i s l e s s t h a n o r e q u a l t o two, t h e t r e e d e f i n e s a H a m i l t o n i a n p a t h . A f e a s i b l e s o l u t i o n t o a c o n s t r a i n e d o p t i m i z a t i o n p r o b l e m i s a s o l u t i o n w h i c h s a t i s f i e s t h e c o n s t r a i n t s o f t h e p r o b l e m . A minimum s p a n n i n g t r e e has a t o t a l l i n k l e n g t h no g r e a t e r t h a n t h a t o f any o t h e r s p a n n i n g t r e e . 1.2 Problem Description 2 The minimum n-node spanning tree problem with degree r e s t r a i n t s r, may be defined as follows: Given n d i s t i n c t nodes and an associated symmetric distance matrix, f i n d the minimum spanning tree such that the degree of each node i n the tree i s less than or equal to r. This type of problem occurs i n the backplane wiring of computers. The connector pins are the nodes of the graph and the lengths of wire required t o jo i n any two pins are the l i n k s . The degree r e s t r a i n t s arise from the physical l i m i t a t i o n s on the number of connections which can be made at any pin. The problem also occurs i n other e l e c t r i c a l design applications, for example, i n the layout of integrated c i r c u i t r y and card modules. When the degree r e s t r i c t i o n i s equal to two, the problem becomes the determination of the shortest Hamiltonian path i n the graph. This i s closely related to the Travelling Salesman problem which i s one of the 'unsolved' problems of combinatorial mathematics. The number of d i s t i n c t spanning trees i n a complete symmetric graph with n nodes i s n 1 1 -^. The number of d i s t i n c t Hamiltonian paths i n t h i s graph i s ^ '. The pro b a b i l i t y of a minimum spanning tree being a Hamiltonian path i s therefore given by the r a t i o of ^ ' to n 1 1 -^. This r a t i o goes very quietly to i t s l i m i t of zero. For example, when n=10, the pro b a b i l i t y i s less than 0 .04 and when n=15 i t becomes less than 0.007. I t i s for t h i s reason that the solution of the problem becomes very time consuming when the degree r e s t r i c t i o n s equal two and n>15. 2.1 B r a n c h a n d B o u n d M e t h o d s 3 T h e t e c h n i q u e u s e d t o s o l v e t h i s r e s t r a i n e d o p t i m i z a t i o n p r o b l e m i s k n o w n a s b r a n c h a n d b o u n d . I t i s a m e a n s o f p r o g r e s s i n g t o w a r d s a n o p t i m a l s o l u t i o n , w i t h a s i g n i f i c a n t r e d u c t i o n i n t h e s i z e o f t h e s e t o f f e a s i b l e s o l u t i o n s r e q u i r i n g i n v e s t i g a t i o n . A l a r g e s e t o f f e a s i b l e s o l u t i o n s n o r m a l l y e x i s t s w h e n a n o b j e c t i v e f u n c t i o n z i s t o b e m i n i m i z e d , s u b j e c t t o a s e t o f r e s t r a i n t s . E a c h o f t h e s e f e a s i b l e s o l u t i o n s h a s a d i s t i n c t z v a l u e . T h e b r a n c h a n d b o u n d p r o c e s s b r e a k s u p t h i s s e t o f f e a s i b l e s o l u t i o n s i n t o s m a l l e r s u b s e t s a n d c a l c u l a t e s a l o w e r b o u n d f o r t h e z v a l u e s w i t h i n e a c h . s u b s e t . T h e s e s u b s e t b o u n d s a r e o b t a i n e d b y s o l v i n g a s i m p l e r p r o b l e m t h a n t h e g i v e n r e s t r a i n e d o n e . O n e o f t h e s e s u b s e t s i s s e l e c t e d a n d a g a i n p a r t i t i o n e d • i n t o b o u n d e d s u b s e t s a s a b o v e . T h e p a r t i t i o n i n g c o n t i n u e s u n t i l a f e a s i b l e s o l u t i o n , z ' t o t h e o r i g i n a l p r o b l e m i s i s o l a t e d . S u b s e t s w i t h a b o u n d g r e a t e r t h a n o r e q u a l t o z ' a r e n o t i n v e s t i g a t e d f u r t h e r . S u b s e t s w i t h a l o w e r b o u n d t h a n z ' a r e p r o c e s s e d i n t h e h o p e o f d i s c o v e r i n g a s m a l l e r f e a s i b l e , s o l u t i o n . T h e o p t i m a l s o l u t i o n z * i s r e a c h e d w h e n t h e b o u n d o n a l l s u b s e t s i s g r e a t e r t h a n o r e q u a l t o z * . A n i m p o r t a n t p o i n t i n b r a n c h i n g a n d b o u n d i n g i s t h e d e t e r -m i n a t i o n o f t h e o r d e r i n w h i c h t h e p a r t i t i o n e d s u b s e t s a r e t o b e p r o c e s s e d . T h i s g i v e s r i s e t o t w o b a s i c m e t h o d s o f s u b s e t s e l e c t i o n . T h e f i r s t o f t h e s e i s t h e r a t h e r o b v i o u s o n e o f c h o o s i n g t h e s u b s e t s i n t h e i n c r e a s i n g o r d e r o f t h e i r r e s p e c t i v e b o u n d v a l u e s . T h i s w o u l d e n s u r e a m i n i m u m a m o u n t o f c o m p u t a t i o n . T h e o n e d i s a d v a n t a g e w i t h t h i s m e t h o d i s t h a t a l a r g e a m o u n t o f i n f o r m a t i o n m u s t b e r e t a i n e d a t a l l t i m e s i n o r d e r t o d e t e r m i n e a n d t o d e f i n e t h e s u b s e t w h i c h s h o u l d b e s t u d i e d n e x t . 4 The second method i s that of investigating the subsets in some prearranged order u n t i l either a feasible solution i s obtained, or the bound on a subset exceeds the length of a known feasible solution. In other words, the f i r s t subset of a partition i s i t s e l f partitioned and so on u n t i l at, say, the kth level of partitioning a conclusion i s reached as above. Now the second subset of this kth partition i s investigated. This method is useful i f the ordering of the subsets i n each partition i s such that there i s a greater probability of finding the optimal solution in the i n i t i a l subsets. This latter method i s the one used here and more w i l l be said on the ordering of the partitions. Branch and bound methods have been used to handle a variety of problems. E.L. Lawler and D.E. Wood (4) give an excellent account of several applications. They include integer linear programming, nonlinear programming, the quadratic assignment problem and the travelling salesman problem. There i s also a note on applications outside the realms of mathematical programming; e.g. pure combina-torics. The branch and bound technique used on the travelling salesman problem has been compared with other popular 'solutions' in an article by Bellmore and Newhauser (8). Here the branch and bound methods of Eastman (10), L i t t l e , et a l (3), and Shapiro ( l l ) are compared with the dynamic programming of Held and Karp (12) and the '(^.- optimization' of Lin (13). Shapiro's work appears to be the superior branch and bound approach and i s rated very highly for the solution of symmetric problems of up to 40 nodes. 2.2 Description of the Algorithm 5 The algorithm i s a search technique i n which one p a r t i t i o n s the set of spanning trees into subsets and calculates lower bounds on the length of a l l trees i n a subset. In t h i s way, spanning trees which meet the degree requirements are discovered, and the smallest one w i l l be the optimal solution to the restrained problem. The i n i t i a l bound i s found by solving the unrestricted minimum tree problem defined by the given distance matrix L with elements 1.. (i=l,2,...,n; j=l,2,...,n; i ^ j ) . I f the solution complies with the degree r e s t r i c t i o n s , the restrained problem i s solved t r i v i a l l y . I f i t does not s a t i s f y the r e s t r a i n t s , i . e . i f there exists at least one node i i n the tree of degree x with x > r , one branches into p subproblems where p i s the number of ways of selecting x-r d i s t i n c t l i n k s from x l i n k s . The above node i of degree x w i l l have x l i n k s incident with i t . Let these l i n k s be 1]_,12, .. ,,l x, arranged i n the reverse order they are chosen i n the minimum tree algorithm. The p subproblems are created by prohibiting p d i s t i n c t sets of x-r l i n k s from being included i n minimum tree solutions. For subproblem 1, l e t l1=oe»,l2 =oo,..., l x_ r=oo; for subproblem 2, l e t 1]_ =oo,l 2 =oo J f..,l x_ r_]_ =oo, l x - r + l =O0;...; for subproblem p l e t l r + 1 = o o , l r + 2 =CO, . . . , l x =0<9. These sets of prohibited l i n k s are the lexicographical orderings of the p selections of x-r l i n k s from l i n k s 1-^ , 1 2, . .., l x . Subset number 1 then consists of the set of a l l trees for which l i n k s 1-|_,12, .. .,l x_ r are prohibited, subset number 2 prohibits l i n k s 1 1 ? .. . , l x _ r _ i , ^x-r+l' e~ t c- I n s h o r t> w e a r e investigating all the possible ways of forcing the degree of node i to s a t i s f y the degree restraints r. It is convenient to refer to a node whose degree exceeds the restrictions as a 'trouble' node, and the as-sociated links as 'trouble' links. The minimum tree solutions to these p subproblems become the bounds for their respective subsets. If the solution to subproblem number 1 is not a feasible solution to the restrained minimum tree problem, we branch again. This con-tinues until a feasible solution is reached in a subset k. Now the algorithm examines the next subproblem in subset k as above. The subproblems are examined until either an improved feasible solution is found or the bound on a subset is' larger than the length of the current best feasible solution, in which case the subset is rejected. When a l l possible subsets have been examined, the best feasible solution is the optimal solution to the minimum tree problem with degree restraints. In the next section an example is worked through. It is convenient to use a pushdown stack to define the cur-rent subproblem. As soon as a branch occurs in the algorithm a new entry is placed on the top of the stack. This entry gives the ordered l i s t of links connected to the trouble node which caused the branch. It also indicates which of these links are prohibited and which are allowed for the particular subproblem. This is accomplished by means of an x-r digit number, (x = degree of the trouble node) which indicates the current combination of prohibited links. This is best illustrated by an example. With x = 6 and r = 2 we have C(6,4) = 15 subproblems to investigate. Initially we have the combination pointer set at 1234 which indicates that the first, second, third and fourth links of the ordered l i s t in the stack entry are prohibited in the first subproblem. 7 The fi f t h and sixth links are allowed to enter the solution. If this first subproblem yields an improved feasible solu-tion, or exceeds the current bound, the combination pointer is changed from 1234- "to 1235 giving a new set of allowed links and prohibited links for the second subproblem. If not, a new entry is added to the stack, depicting the next branching subset. Eventually, an entry on the top of the stack will exhaust a l l possible combinations. When this is the case, this entry is deleted and the next lower level becomes the new top of the stack. As before the next combination is generated, etc. In this manner the other thirteen subproblems for the above example will be duly investiaged; i.e. the pointer will take on the values 1236, 1245, 1246, 2346, 2356, 2456, 3456. The algorithm terminates when the stack is empty. The algorithm completes an iteration when either an improved feasible solution is found or a solution exceeds the current bound. If the same trouble node occurs more than once in the branching processes during an iteration, a l l the trouble links for that node are 'bunched' together at one branch. This cuts down unnecessary duplication of subsets. If more than one node has a degree greater than r, the first one encountered by the unrestrained minimum tree algorithm is the one which determines the next set of subproblems. The order in which the links are chosen in the minimum tree algorithm is extremely important. It affects not only the determination of the first trouble node but also the ordering of the trouble links for the combination generator. This ordering of the links is dependent on the minimum spanning tree algorithm used and for this reason, two different methods are compared. These two methods, one by R.C. Prim (7), the other by J.B. Kruskal (6), will be discussed in section 2.5. V e r i f i c a t i o n o f O p t i m a l i t y 9 T h e . p a r t i t i o n i n g u s e d e x p l o r e s a l l t h e n e c e s s a r y s u b s e t s , a n d e n s u r e s u s o f c o v e r i n g t h e e n t i r e s e t o f f e a s i b l e s o l u t i o n s . C l e a r l y , t h i s i s d u e t o t h e i n v e s t i g a t i o n o f t h e c o m p l e t e s e t o f p o s s i b l e c o m b i n a t i o n s a t e a c h b r a n c h i n g s t a g e . T h e a l g o r i t h m i s t e r m i n a t e d w h e n t h e b o u n d s o n a l l t h e s u b s e t p r o b l e m s a r e g r e a t e r t h a n o r e q u a l t o t h e l e n g t h o f t h e b e s t r e s t r a i n e d m i n i m a l t r e e . T o p r o v e t h a t t h i s i s t h e o p t i m a l r e s t r a i n e d s o l u t i o n , i t i s o n l y n e c e s s a r y t o s h o w t h a t e a c h t i m e a b r a n c h i s m a d e , t h e s u b p r o b l e m s d e f i n e d w i l l h a v e o p t i m a l s o l u t i o n s g r e a t e r t h a n o r e q u a l t o t h e l o w e r b o u n d a t t h e b r a n c h . L e t u s a s s u m e t h a t a p a r t i c u l a r s u b p r o b l e m k , p r o h i b i t i n g x l i n k s y i e l d s a m i n i m u m s p a n n i n g t r e e o f l e n g t h t . T h e d i s t a n c e m a t r i x f o r t h i s s u b p r o b l e m m a y b e d e s i g n a t e d D . k L e t t h i s t r e e o f l e n g t h t b e c o m p r i s e d o f t h e l i n k s 1-^, l g , . . . , 1 ^ p l p l j ^ , . . . , 1 , a r r a n g e d i n t h e o r d e r t h e y a r e c h o s e n b y t h e m i n i m u m s p a n n i n g t r e e a l g o r i t h m . ( A s s u m e t h a t t h e a l g o r i t h m e m p l o y s t h e r u l e o f c h o o s i n g t h e n e x t l a r g e s t l i n k w h i c h d o e s n o t f o r m a c l o s e d l o o p w i t h p r e v i o u s l y s e l e c t e d l i n k s . ) I f we e x c l u d e a n y o n e o f t h e l i n k s o f t h i s t r e e f r o m o u r m a t r i x D ^ , g i v i n g a m a t r i x D ^ ' , a n d s o l v e t h e new m i n i m u m t r e e p r o b l e m , we w i l l o b t a i n a s o l u t i o n o f l e n g t h t * w i t h f i t , • T o i l l u s t r a t e t h i s , , l e t u s e x c l u d e t h e l i n k 1^ f r o m D^., a n d s o l v e t h e new m i n i m u m t r e e p r o b l e m d e f i n e d o n D^.' . L i n k s ±]_,±2> • • • A i - 1 w i l l b e c h o s e n a s b e f o r e , b u t b e c a u s e l i i s p r o h i b i t e d , we m u s t i n v e s t i g a t e t h e n e x t l a r g e s t l i n k f r o m o u r m a t r i x D ^ ' t o c o n t i n u e t h e a l g o r i t h m . ( T h e n e x t ' l a r g e s t ' l i n k 1 ^ w i l l h a v e t h e r e l a t i o n l ^ S l - j ^ . T h e r e m a i n i n g s e q u e n c e o f m i n i m u m spanning tree links will be the links ..., 1 . + ^ , . . 1 with l j _ + 1 , 1^,1 . . ., l s as above in the solution using the matrix and 1^' a new link with L^^l-^. Thus the total length t of this solution will be such that t'^>t. Clearly this will also be the case i f D, 1 has more than one link of the D, minimum tree k k prohibited. Since every possible subset has a bound which is greater than or equal to the length z of the best restrained solution, no other restrained solution can exist with a length smaller than Sample problem 11 Given the following 8x8 distance matrix (Fig. l) and the degree restrictions r=2 we solve the unrestricted minimum tree problem. The minimum tree is given by the links 2-7, 4-8, 1-8, 5-8, 5-6, 3-8, 2-6 with total length 603. Clearly, node 8 has degree 4 and our solution is not a feasible one for the restricted problem. Now the 4 links connected to node 8 are arranged in the order 8-3, 8-5, 8-1, 8-4 and the first of the C(4,2) = 6 subproblems is defined by setting link 8-3 = 0 0 and link 8-5 =CxO. The distance matrix for this subproblem'is shown in Fig. 2. The solution to this particular subproblem is defined by the links 2-7, 4-8, 1-8, 1-5, 5-6, 2-6, 3-4 with total length 781. This turns out to be our first feasible solution since the degree of each node is less than or equal to two. The current bound for our optimal solution is therefore 781. The second subproblem is defined by setting 8-3 = 0 0 , and 8-1 = 0 0 . This search technique can best be illustrated by a tree diagram (see Fig. 3 ) . The nodes of the tree diagram represent unrestrained minimum tree subproblems and the branches t e l l which links have been set to infinity in the distance matrix; e.g. If we want to exclude the link x-y from a particular subproblem then xy would appear on a branch leading to i t . The number at each node denotes the length of the particular solution. A square node indicates a feasible solution to the restrained problem. 12 Subproblems with minimum tree lengths greater than or equal to the current best f e a s i b l e s o l u t i o n are not investigated f u r t h e r . Thus, the 2nd and 3rd subproblems with lengths greater than 781 are abandoned. The 4th subproblem which has l i n k s 5-8 and 1-8 set t o i n f i n i t y i n i t s distance matrix has a length of 735. The minimum tree f o r t h i s subproblem i s comprised of the l i n k s 2-7, 4-8, 1-5, 5-6, 3-8, 4-5 and 2-6. This so l u t i o n i s not a f e a s i b l e one since node 5 has degree 3. Therefore t h i s 4th subset i s furth e r p a r t i t i o n e d i n t o 3 smaller subsets; the 1st one excludes the l i n k 4-5, the second excludes the l i n k 6-5 and the l a s t excludes the l i n k 1-5. Now the 1st of these smaller subsets i s examined. We continue i n t h i s fashion u n t i l a l l subsets (nodes of the tree diagram) have been investigated. The optimum so l u t i o n i s comprised of the l i n k s 2-7, 4-8, 1-5, 3-8, 4-5, 1-6, 2-6 of length 767 and occurs when l i n k s 5-8, 1-8 and 6-5 are pr o h i b i t e d . 13 1 2 3 4 5 6 7 8 1 - 576 374 357 63 174 511 37 2 576 • - 854 205 474 1 86 5 739 3 3 74 854 - 332 872 587 846 156 4 357 205 332 — 167 671 597 16 5 63 474 872 167 - 142 326 61 6 1 74 186 587 671 142 — 983 865 7 511 5 846 597 326 983 - 622 8 37 739 156 16 61 865 622 -FIGURE 1 8 x 8 SAMPLE PROBLEM 1 2 3 4 5 6 7 8 1 - 576 374 357 63 174 511 37 2 576 - 854 205 474 186 5 73 9 3 374 854 — 332 872 587 846 GO 4 357 20 5 332 - 167 671 597 16 5 63 474 872 167 - 1 42 326 Co 6 1 74 1 86 587 671 1 42 - 983 865 7 51 1 5 846 597 326 983 - 622 8 3"T 7 39 16 865 622 -FIGURE 2 DISTANCE MATRIX FOR I S T SUB PROBLEM LINKS (3,8) AND (5,8) PROHIBITED 781 FIGURE 3 COMPLETE TREE DIAGRAM 2.5 The Minimal Spanning Tree Algorithm 15 The Branch and Bound method described solves a set of minimal spanning t r e e problems. Two d i f f e r e n t methods for generating t h i s set are compared. The f i r s t of these methods was developed by R.C. Prim (7). B a s i c a l l y , t h i s algorithm s t a r t s with any given node and finds i t s nearest neighbour, thus forming the f i r s t l i n k of the minimal t r e e . Then i t f i n d s the nearest node t o t h i s subtree by searching through the set of nodes not yet included i n the subtree. This process continues u n t i l the f u l l spanning t r e e i s formed. At a l l stages we are dealing with a subtree, thus there i s no need t o check for closed loops or connectedness. The algorithm i s very f a s t and uses a minimal amount of core storage. A t e s t program written i n Fortran on an IBM 7044 found the minimum tre e f or an 80x80 complete distance matrix i n l e s s than one second. The big disadvantage with t h i s algorithm i s the fact that the minimal tree l i n k s are not chosen i n increasing order of magnitude. Since t h i s Branch and Bound method always examines the subset problems i n order,, there i s a greater p r o b a b i l i t y of f i n d i n g the optimal s o l u t i o n e a r l i e r i f the largest l i n k s connected t o a trouble node are prohib i t e d i n the f i r s t problem of each subset. This i s best accomplished i f these l i n k s are given t o the combination generator i n order of increasing magnitude. Due t o the nature of the combinations which are generated i n l e x i c o g r a p h i c a l order, the f i r s t problem of a given subset w i l l then exclude the x-r largest l i n k s (x being the degree of the trouble node f o r t h i s subset with x > r ) , the second problem w i l l exclude the x-r-1 largest l i n k s and the (x-r+l)th largest l i n k and so on for the remaining problems. 16 The l a s t problem of each subset w i l l of course exclude the x-r smallest l i n k s . The second algorithm developed by J..B. Kruskal (6) achieves t h i s ordering. I n i t i a l l y a l l the l i n k s of the distance matrix are sorted i n increasing order of magnitude. The f i r s t l i n k i n the sorted l i s t becomes the i n i t i a l l i n k of the minimal spanning t r e e . Links which do not form closed loops with e x i s t i n g l i n k s are then chosen i n order from the l i s t u n t i l the complete t r e e i s constructed. The disadvantage here i s that the sort time grows as the dimension n of the distance matrix increases. However, once the sort has been completed, the determination of the minimal t r e e i s much f a s t e r than the Prim algorithm. The sort only needs t o be done once i n order t o solve the many minimal t r e e subproblems which a r i s e i n the branching process. For a reasonable number of i t e r a t i o n s the sort time i s outweighed by the f a s t l i n k s e l e c t i o n and the o v e r a l l Kruskal time becomes f a s t e r than the Prim time. The number of i t e r a t i o n s necessary t o complete the algorithm depends on both the dimension n of the distance matrix and the degree r e s t r i c t i o n r for the problem. Complete symmetric distance matrices of various dimensions with elements obtained from a random number generator were used t o t e s t both the Prim and the Kruskal methods of s o l u t i o n on an IBM 704-4- • The r e s u l t s of these t e s t s are given i n Table 1. A L i b r a r y subroutine (14) i s used t o order the l i n k s i n the Kruskal method. The s o r t i n g i s accomplished by a merge-exchange technique and i t i s very f a s t . The number of i t e r a t i o n s necessary f o r the Kruskal method t o overcome i t s sort time and become f a s t e r than the Prim method, for the same number of i t e r a t i o n s , was tabulated f o r each value of n and r . For example, with n = 40 and 17 r = 3, the Kruskal method would be f a s t e r than the Prim method i f both took more than 25 i t e r a t i o n s t o terminate. An average value, over various sample s i z e s , was calculated f o r each n, r combination. In a l l cases fewer i t e r a t i o n s were required on the average by the Kruskal method. -Figures f o r three random matrices with n>50 were shown f o r comparison. The Prim times to complete the algorithm f o r n = 20 and r = 2 were greater than 10 minutes on the average; therefore only 3 sets were given. The Table indicates that the Kruskal method should be used f o r graphs with f i f t y or less nodes. L i t t l e can be said of problems with n>50, since only a few examples were tested. In a l l cases, the Kruskal method terminated i n a smaller number of i t e r a t i o n s and each time, the number was w e l l above the c r i t i c a l value f o r that c l a s s . In t h i s range, the Kruskal sort times are becoming s i g n i f i c a n t ; f o r n = 80, the sort time was 11 seconds. • The following example with n = 8 and degree r e s t r i c t i o n s r = 2 given i n F i g . 4 w i l l i l l u s t r a t e the difference between the Kruskal and the Prim approach. The s o l u t i o n tree diagrams f or both methods are given i n F i g s . 5 and 6 . The numbers within the nodes represent the s p e c i f i c i t e r a t i o n . The Kruskal approach f i n d s the optimum so l u t i o n i n the f i r s t i t e r a t i o n and completes the algorithm i n 8 i t e r a t i o n s . The Prim method finds the optimum i n the 10th i t e r a t i o n and terminates i n 14 i t e r a t i o n s . This i s t y p i c a l behaviour f o r Prim versus Kruskal as Tables 1 and 2 c l e a r l y i n d i c a t e . Table 2 gives the average number of i t e r a t i o n s necessary to reach an optimal s o l u t i o n f o r both the Prim method and the Kruskal method. The sample s i z e 18 fo r each n, r set i s included beneath these average f i g u r e s . The r a t i o of the number of Kruskal i t e r a t i o n s t o the number of Prim i t e r a t i o n s i s also c a l c u l a t e d . In t h i s case, the reason the Prim approach was slower t o f i n d the optimal s o l u t i o n was that the Prim algorithm f a i l e d t o select the minimum t r e e l i n k s i n increasing order of magnitude. For example, l i n k 6-1 of length 172 u n i t s was selected before l i n k s 6-7 and 6-3 equal t o 48 u n i t s and 83 u n i t s r e s p e c t i v e l y . Thus the subproblems p r o h i b i t i n g 6-3 and 6-7, both smaller i n length than l i n k 6-1, were investigated f i r s t . This gave a bound of 970 u n i t s on the optimal s o l u t i o n (See F i g . 6). I f the subproblem p r o h i b i t i n g l i n k 6-1 had been investigated f i r s t , as i n the Kruskal approach, the bound would have been 894 units and the 63 and 67 subsets would have been rejected with t h e i r bounds of 967 units and 944 u n i t s r e s p e c t i v e l y . The same lack of ordering occurs at the node 7, (Prim ordering; 7-1, 7-6, 7-8, 7 -4 ) . Fortunately t h i s does not create any extra i t e r a t i o n s because the 970 bound i s s u f f i c i e n t to r e j e c t the 9th and 10th subsets with bounds of 1,008 and 1,039 r e s p e c t i v e l y . 19 ITERATION TIME (SECS.) AVERAGE SORT TIME(SECS.) KRUSKAL CRITICAL NUMBER OF ITERATIONS TO OVERCOME K. SORT TIME AVG. # ITERATIONS SAMPLE SIZE N R PRIM KRUSKAL PRIM KRUSKAL 10 2 .014 .010 .07 18 173 2 9 147 2 9 15 2 .027 .0 18 .2 23 2 4 1 2 19 1 123 19 2 0 2 .046 .026 .4 2 0 12252 3 7 6 7 2 3 2 0 3 .046 .026 .4 2 0 3 0 4 0 2 9 4 0 3 0 3 .09 4 .049 1. 1 2 5 127 5 0 126 5 0 4 0 3. .167 .078 2.1 2 5 6 9 0 2 3 561 2 3 5 0 3 .260 .125 3.5 2 6 5 5 8 14 510 14 6 0 3 .360 .165 5.6 2 9 719 1 581 1 7 0 3 .500 .200 8.0 27 9 8 5 1 7 3 9 1 8 0 3 .620 . 2 5 0 1 1 .0 3 0 4 7 2 1 3 2 8 1 TABLE I COMPARISON OF PRIM AND KRUSKAL METHODS NUMBER OF ITERATIONS AND ITERATION SPEED N R AVERAGE NUMBER OF ITERATIONS FOR OPTIMUM SOLUTION RATIO K/P KRUSKAL PRIM 1 0 2 82 29 1 1 4 29 0.745 1 5 2 769 1 9 15 16 19 0.508 20 2 5422 3 6073 3 0.890 20 3 12 40 16 40 0.7 50 30 3 64 50 8 7 50 0.735 40 3 280 23 306 23 0.9 1 5 50 3 336 1 4 402 1 4 0.835 TABLE 2 COMPARISON OF ITERATIONS NECESSARY TO REACH OPTIMAL SOLUTION 21 1 2 3 4 5 6 7 8 1 - 88 3 6 0 5 4 0 7 3 0 1 72 251 4 9 2 2 88 — 5 6 0 2 5 9 5 3 8 4 8 3 321 4 2 0 3 3 6 0 5 6 0 - 291 5 8 2 8 3 7 2 5 5 8 0 4 5 4 0 2 5 9 291 - 8 5 9 9 1 6 2 4 3 4 7 5 5 7 3 0 5 3 8 5 8 2 8 5 9 - 3 8 6 891 4 6 6 172 4 8 3 8 3 916 3 8 6 - 4 8 2 3 3 7 251 3 2 1 7 2 5 2 4 3 891 4 8 - 7 9 8 4 9 2 4 2 0 5 8 0 4 7 5 4 6 233 7 9 FIGURE 4 8 x 8 DISTANCE MATRIX TO ILLUSTRATE DIFFERENCE BETWEEN KRUSKAL AND PRIM F I G U R E 6 PRIM T R E E D I A G R A M 3.1 Results f o r several (n, r ) combinations. 23 Programs were written i n Fortran f o r both the Prim approach and the Kruskal approach. The programs were tested on an IBM 7044 using complete symmetric distance matrices whose elements were random generated numbers. A d e s c r i p t i o n of the basic program i s given i n the appendix. Tables 3 through 10 give the r e s u l t s of several, runs on various combinations of distance matrix s i z e s , n and degree r e s t r a i n t s , .v.. The tables give the number of i t e r a t i o n s and the t o t a l times ( i n seconds) required both t o terminate the algorithm and t o f i n d the optimum s o l u t i o n . These fi g u r e s are given f or both 'Kruskal' and 'Prim' as i n d i c a t e d . In almost a l l cases.the r e s u l t s f o r each sample set give a wide range of values. However, f o r the larger samples, averages are ca l c u l a t e d f o r both methods and included i n the tables for comparison. Table 3 with n = 10 and r = 2 i l l u s t r a t e s the a b i l i t y of the Kruskal method t o f i n d the optimal s o l u t i o n early. In 12 of the 29 cases, the optimal s o l u t i o n i s found i n 10 or fewer i t e r a t i o n s and 5 of these 12 solutions are found i n the f i r s t i t e r a t i o n . The average fi g u r e s show the Kruskal method t o be the superior one f o r t h i s combination of n and r . The Prim method terminated f a s t e r than the. Kruskal method on only one occasion and t h i s occurred when the number of i t e r a t i o n s f o r both methods was smaller than the c r i t i c a l value of 18 (See Table l ) . For the same reason, the Prim method found the optimum s o l u t i o n f a s t e r i n 5 cases. The average number of i t e r a t i o n s necessary t o terminate the Kruskal method was 147. This was f e l t to be a high f i g u r e since 22 of the 29 values were smaller than i t . The same can be sa i d of the Prim average of 173. Table 4, c l e a r l y shows the s u p e r i o r i t y of the Kruskal method 24 for t h i s p a r t i c u l a r n, r combination. The gap between the completion times of both approaches widens as the problems become more complex ( i . e . when more branching steps are required t o i s o l a t e a f e a s i b l e s o l u t i o n ) . For example, the problem which required 4048 Kruskal i t e r a t i o n s t o terminate, required 7719 Prim i t e r a t i o n s . Here- the Kruskal method f i n i s h e d 137.1 seconds f a s t e r than the Prim method. Occasionally, the Prim method w i l l take fewer i t e r a t i o n s than the Kruskal method ('Kruskal' took 1452 i t e r a t i o n s t o f i n d an optimal s o l u t i o n which 'Prim' found i n the f i r s t i t e r a t i o n ) . This i s due t o the fact that, i n a few cases, the best r e s t r a i n e d t r e e i s not the one which deletes i t s largest superfluous l i n k s f i r s t at some branching stage. However, the average number of i t e r a t i o n s over the sample set bears out i n favour of the Kruskal choice of l i n k s . Furthermore, i n 17 of the 19 instances, the Kruskal method terminates i n fewer i t e r a t i o n s . The average completion time for the Kruskal method was more than three times f a s t e r than that of the Prim method. Table 5 gives a few r e s u l t s f or n = 20 and r = 2. More success was again experienced with the Kruskal approach. Unfortunately, the large running times involved f o r both methods r e s t r i c t e d the si z e of t h i s data set. The four extra Kruskal solutions i n d i c a t e problems of such a degree of complexity that the Prim algorithm f a i l e d t o y i e l d an optimal s o l u t i o n i n a set time of 15 minutes. Once again we have a case where the Prim algorithm r e s u l t s i n fewer i t e r a t i o n s ( f i r s t problem). The fa c t that the completion times are only 2 seconds apart, when the Kruskal method performs approximately twice as many i t e r a t i o n s , i l l u s t r a t e s the f a s t e r i t e r a t i o n time of the Kruskal algorithm (See Table l ) . Table 6 i l l u s t r a t e s an 'easier' set of problems with n = 20 25 and r.= 3. As a matter of f a c t , 6 of these 40 problems have a t r i v i a l s o l u t i o n ; i . e . the s o l u t i o n t o the unrestrained minimal spanning t r e e problem meets the degree r e s t r a i n t s of two. For t h i s class of problem, the e f f e c t of the l i n k ordering i n the Kruskal algorithm i s not very s i g n i f i c a n t . There are r e l a t i v e l y few nodes of degree greater than three, and therefore most subsets w i l l have three subproblems. The ordering of the l i n k s i s more important when the subsets have more members, since i t w i l l take more i t e r a t i o n s t o reach a lower bound which might have been discovered i n one of the f i r s t few subproblems, i f the l i n k s had been ordered. The r e s u l t s are more c l o s e l y grouped and the averages give a better i n d i c a t i o n of the group. The average figures f o r the i t e r a t i o n s are quite s i m i l a r . In both cases, the average number of i t e r a t i o n s necessary t o reach the optimal value f a l l s below the c r i t i c a l value of 20 for t h i s group (See Table l ) . For t h i s reason, the optimum times f o r the Prim method are often the f a s t e r ones. (In 20 of the 40 cases the Prim method fi n d s the optimal s o l u t i o n f a s t e r . ) In the set of Table 7, the Kruskal method has better average times than the Prim method. This i s l a r g e l y due t o the f a s t e r i t e r a t i o n times of the Kruskal method, since the average number of i t e r a t i o n s f o r both methods are quite s i m i l a r . The two approaches have good success at f i n d i n g the optimal s o l u t i o n i n the f i r s t i t e r a t i o n . The Kruskal method accomplishes t h i s 14 times and the Prim method, 10 times. There are four or f i v e 'harder 1 problems i n t h i s set and a comparison of t h e i r solutions by both methods emphasizes the s u p e r i o r i t y of the Kruskal approach. A l l of the averages appear t o be too high. For'example, 43 of the 50 values are smaller than the average values f o r the number of 26 i t e r a t i o n s necessary t o terminate both the Prim and the Kruskal methods. In Tables 8 and 9, the cla s s of problems becomes more d i f f i c u l t . A comparison of the average completion times and the average optimum times w i l l i n d i c a t e the large savings i n time, r e a l i z e d by using the Kruskal method over the Prim method. In every case i n Table 8, the Kruskal method terminates f a s t e r than the Prim method. In Table 9, only one problem i s solved f a s t e r by the Prim method and t h i s i s due t o the fact that the t o t a l number of i t e r a t i o n s necessary t o terminate the algorithm (16 i t e r a t i o n s ) i s l e s s than the c r i t i c a l value (30 i t e r a t i o n s ) . Table 10 gives some r e s u l t s f o r a few matrices l a r g e r than 50 x 50. C l e a r l y the Kruskal method should be used i n t h i s region at a l l times. The i t e r a t i o n times i n Table 1 indi c a t e that for t h i s class of problems, the Kruskal approach can take more than twice as many i t e r a t i o n s and s t i l l f i n i s h f a s t e r than the Prim approach. The algorithms were not tested f o r problems with n>~80 and r = 3. In view of the previous r e s u l t s , i t i s f e l t that t h i s region would best be dealt with using the Kruskal method. I f the run-times.J become too large for t h i s c l a s s of problems, one of the H e u r i s t i c methods described i n section 3.3>may be used. The random numbers which formed the distance matrices f o r the preceeding examples were d i s t r i b u t e d i n the range 0 to 1000. 27 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PR 1M KRUSKAL PRIM END • OPTIMUM END OPT 1 MUM END OPT 1 MUM END OPTIMUM 33 6 35 6 0.42 0.15 0.52 0.13 12 2 12 2 0.22 0.10 0.20 0.05 50 2 57 15 0.57 0.10 0.85 0.25 88 33 114 59 0^92 0.42 1.62 0.90 37 24 55 47 0.42 0.30 0.82 0.70 124 52 180 106 1.24 0.59 2.68 1.65 34 1 36 9 0.37 0.07 0.53 0.17 . .5A2-. 566 480 6.05 . S..29 7.75 6.60 24 12 55 46 0.30 0.20 0.78 0.67 77 28 80 36 0.84 0.35 1.15 0.53 25 20 23 18 0.30 0.25 0.35 0.28 22 17 20 7 0.29 0.22 0.30 0.10 21 1 147 127 0.25 0.09 2.00 1.75 933 166 913 237 9.42 1.82 12.40 3.28 ' 22 1 26 1 0.27 0.07 0.37 0.02 84 29 79 10 0.87 0.37 1.07 0.15 68 61 73 63 0.74 0.67 1.05 0.92 28 6 31 15 0.37 0.17 0.48 Q.2Z. 20 1 32 15 0.25 0.07 0.47 0.25 14 1 '• 99 93 0.20 0.08 1.42 1.33 185 65 211 118 1.90 0.77 3.08 1.75 160 139 338 316 1.65 1.45 4.70 4.40 270 8 302 87 2.69 0.17 4.12 1.25 951 928 956 934 L0.05 9.82 13.27 12.97 8 1 35 30 0 .14 0.07 0.53 0.47 77 35 135 71 0.82 0.44 1.92 1.03 207 128 227 204 2.22 1.49 3-25 2.93 146 138 146 138 1.65 1.57 2.04 1.93 14 10 33 26 0.22 0.17 0.50 0.40 A V E R IGES 147 . 82 173 114 1 .57 0 .94 2.42, 1 .63 , TABLE 3 RESULTS FOR N = 10 R - 2 SAMPLE SIZE = 29 28 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PR 1 M KRUSKAL PRIM END • 'OPT 1 MUM END OPTIMUM END OPTIMUM END OPTIMUM 2503 2083 3292 1370 41.7 35.0 88.9 37.3 880 716 1062 498 15.8 13.1 29.2 13.8 84 1 546 482 1.5 0.2 15.3 13.6 476 414 404 323 8.0 7.0 11.7 . 9.3 60 19 83 19 1,1 0.5 2.5 0.6 574 437 1379 1276 9.6 7.4 39.7 36.7 95 1 109 9 1.6 0.2 3.2 0.4 1942 4758- . 4588 31.4 25.9 138.5 133.5 , 925 394 985 192 15.2 6.8 28.2 5.6 172 1 415 334 2.9 0.2 11.5 9.3 1344 1162 1832 428 23.3 18.8 52.8 12.4 4048 3606 7719 3821 79.8 70.8 216.9 109.2 114 1 115 1 2.0 0.2 3.4 0.1 37 28 40 31 :0.7 0.6 1.2 • 0.9 623 370 1011 234 10.5 6.4 28.6 6.8 2871 1879 10155 8944 52.9 36.5 298.3 262.6 —491 194 6366 ' 5842 8.3 3.3 184.1 169.1 2091 1452 645 1 36.9 26.1 19.0 0.1 2004 256 4905 418 34.9 4.8 134.4 11.7 AVEI AGES 1123 769 2412 1516 19.9 13.9 68.8 43.8 TABLE 4 RESULTS FOR N = 15 R - 2 SAMPLE SIZE = 1 9 2 9 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PRIM KRUSKAL PRIM END ' OPT I .MUM END OPT 1 MUM END OPT 1 MUM END - OPT 1 MUM 2 - 4 4 3 1 7 6 2 1 2 2 9 3 3 9 6 1 . 2 4 4 . 3 5 9 . 4 1 6 . 6 1 4 5 6 5 9 5 2 0 1 7 5 7 2 1 4 5 5 9 3 8 1 . 5 2 5 1 . 3 8 1 1 . 3 6 7 0 . 9 6 0 0 7 4 9 8 6 1 7 9 5 6 3 3 2 3 1 5 0 . 8 1 2 6 . 8 8 3 9 . 3 1 6 7 . 4 5 9 8 4 3 8 8 2 1 5 8 . 5 1 0 2 . 8 6 9 5 6 , 5 7 3 7 . 1 8 7 . 3 1 5 6 . 2 1 3 6 9 8 . 1 1 0 2 8 3 8 1 . 4 3 0 7 . 1 2 - 9 8 1 3 2 2 6 9 7 7 8 1 . 4 5 9 4 . 7 A V E R - L G E S 1 3 2 4 4 9 9 3 5 3 0 0 . 3 2 2 6 . 2 TABLE 5 RESULTS FOR N - = 2 0 R = 2 SAMPLE SIZE = 7 30 I T E R A T I O N S R U N N I N G T I M E S ( S E C S ) K R U S K A L P R I M K R U S K A L P R I M E N D - O P T I M U M E N D O P T 1 MUM E N D O P T 1 MUM E N D O P T 1 MUM 65 15 65 23 1.9 0.7 2.8 1.1 7 2 7 2 0.5 0.4 0 .4 0 .1 10 1 10 1 0.6 0.3 0.5 0 .1 . 46 3 46 3 1.5 0.4 2 .0 0 . 2 44 26 47 30 1.3 0 . 8 2.2 1.4 16 14 22 20 0.5 0.5 1.1 1.0 31 20 34 26 1.0 .8 1.7 1.3 16 2 16 2 0 . 8 0.4 0 . 8 0 . 2 ' 7 5 4 1 0.4 0.4 0 . 2 0 . 1 13 6 13 13 0.5 0 .4 0 . 6 0 .6 40 35 37 34 1.1 1.2 1.7 1.6 1 1 1 1 0.3 0 . 3 0 . 1 0 .1 22 . l£....... 22 14 0.7 0.5 1.0 0.6 1 1 1 1 . 0.4 0.4 0 . 1 0 . 1 7 1 7 4 0.5 0.4 0.4 0 . 2 221 69 200 51 5.8 2.1 8 . 8 2.4 1 1 1 1 0.4 0 .4 0 . 1 0 . 1 1 1 1 1 0.4 0 .4 0 .1 0 .1 1 1 1 1 0.4 0.4 0 . 1 0 . 1 7 2 7 4 0.5 0 .4 0.4 0 . 2 61 27 55 21 1.8 1.0 2.6 1.1 4 1 1 0.4 0.4 0 . 2 0 . 1 19 11 19 14 0.7 0.5 1.0 0.7 1 1 1 1 0.3 0 .3 0 . 1 0 . 1 56 35 56 35 1.4 1.0 2,6 1.7 4 2 4 2 0.4 0 .4 0 . 2 0 .1 16 14 16 16 0.7 0 . 6 0 . 8 0 .8 4 1 4 1 0.4 0.4 0 . 2 0 . 1 3 7 7 0.4 0.4 0.4 0.4 38 11 32 10 1.1 0.5 1.5 0 .6 31 20 34 21 1.0 0.7 1.6' 1.0 TABLE 6 RESULTS FOR N - 20 R = 3 SAMPLE SIZE = 40 31 ITERATIONS . RUNNING TIMES (SECS) KRUSKAL PR 1 M KRUSKAL PRIM END • 'OPT 1 MUM END OPTIMUM END OPT 1 MUM END • OPTIMUM 19 14 46 44 0.7 0.6 2.2 2.1 10 5 10 7 0.5 0 .5 0.6 0.4 137 38 122 73 3 . 5 1.3 5.6 3.4 22 2 22 2 0 . 9 0.4 1.1 0 .1 79 71 79 73 2.3 2.1 3.7 3-4 77 29 98 56 2 . 0 1.0 L. 7 2.8 7 3 16 16 0.5 0 .4 0 . 8 0 . 8 4 1 4 1 0.4 0.4 0 . 2 0 .1 10 2 10 4 0.5 0.4 0.6 0.3 AVER iGES 29 12 30 16 1.0 0.6 1.4 0 . 8 TABLE 6 RESULTS FOR N. = 20 R = 3 (CONT'D) SAMPLE SIZE = 40 . 32 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PRIM KRUSKAL PRIM END : OPT 1 MUM END OPT 1 MUM • END OPTIMUM END OPT 1 MUM 16 6 16 6 1.6 1.2 1.6 .7 690 476 842 764 35.2 24.8 82.5 74.6 100 33 100 200 6.3 3.1 10.0 2.2 65 1 65 10 3.9 1.1 6.1 1.0 49 30 49 30 3.2 2.4 4.8 3.0 112 51 100 47 6.0 3.3 9.4 4.4 52 1 109 7 3.4 1.1 11.4 .8 1 1 1 1 1.1 1.1 .1 .1 37 11 37 11 3.0 1.8 3.4 1.1 264 185 367 325 13.9 10.0 35.0 31.0 71 49 86 30 4.0 3.1 8.5 3.1 16 8 25 19 2.1 1.8 2.5 1.9 ' . 1 1 1 1 1.0 1.0 .1 .1 47 10 25 8 3.7 1.9 2.5 . .9 650 513 620 527 30.2 24.0 60.6 51.6 , 28 17 121 • 119 2.5 2.0 12.3 12.1 31 11 28 16 2.8 1.9 .'•2'.'9 1.7 13 2 13 2 1.5 1.0 1.2 .2 73 21 121 84 4.6 2.1 11.9 8.3 19 1 - 19 7 2.3 1.4 2.0 .8 13 1 13 1 1.6 1.1 1.2" .1 442 246 386 213 21.8 12.7 36.6 20.1 55 15 61 50 3.7 1.7 5.9 4.8 37 22 37 22 2.9 2.2 3.4 211 4 1 4 1 1.3 1.1 .4 .1 7 5 7 5 1.3 1.1 .7 .5. .1 1 1 1 1.0 1.0 .1 .1 79 38 85 45 5.2 3.1 8.6 4.5 7 6 7 6 1.2 1.2 .7 .6 95 41 95 56 5.9 3.4 9.3 5.6 13' 3 13 3 1.8 I 1.3 •4 , TABLE 7 RESULTS FOR N = 30 R = 3 SAMPLE SIZE = 50 33 ITERATIONS ' RUNNING TIMES (SECS) KRUSKAL PR 1M KRUSKAL PRIM END •' 'OPTIMUM END OPTIMUM END OPT 1 MUM END OPTIMUM 10 1 10 1 1.8 1.3 1.0 .1 1 4 1 1.4 1.3 .4 .1 ,., 41. 6 41 6 2.6 1.1 3.9 .7 53 38 38 20 3.7 2.0 3 7 2,1. 2070 949 1880 1080 95.4 44.5 177.6 10? 3 274 254 304 304 15.1 14.1 30.3 30 0 10 1 ' 10 4 1.6 1.2 1:1 .5 13 9 13 9 1.8 1.6 1.3 .9 65 59 28 20 4.2 3.9 2.9 2.1 7 1 7 1 1.5 1 . 3 7 l 430 27 303 237 22.5 3.1 • 29.1 22.7 4 2, 4. 2 1.3 1.2 .4 .2 104 8 113 10 6.2 1.8 10.9- 1.1 16 3 16 3 1.6 0.9 1.7 .3 13 5 13 1 1.9 1 .5 1 .5 .3 7 7 1 1 8 1 7 ,7 , ? 7 1 7 4 1.6 1.3 .8 .5 34 21 37 17 3.0 2.4 . 3.8 .' 1.8 31 5 40 16 2.8 1.6 4.0 1.7 AVER 126 64 127 87 7.0 4.2 12.2 8.1 . -TABLE 7 RESULTS FOR N = 30 R = 3 SAMPLE SIZE = 50 34 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PR 1M KRUSKAL PRIM END • ' OPTIMUM END OPTIMUM END OPT 1 MUM END OPTIMUM 127 98 199 169 11.7 9.6 33.8 28.8 1418 1299 1192 724 113.1 104.3 200.0 121.9 331 131 319 116 29.9 14.5 55.6 20.1 5529 2171 .6398 1196 405.9 165.5 1086.4 201.8 34 14 38 1 3.8 2.4 6.8 .6 25 1 64 1 3.6 2 .1 10.9 • 3 441 317 418 294 37.4 27.7 70.2 49.5 603 : 396 1186 1051 46.4 31.5 192.6 170.3 85 76 85 77 8.1 7.5 14.6 13.3 316 56 352 148 24.5 6.0 58.4 24.5 61 46 73 56 6.5 5.4 12.9 9.9 214 17 238 20 17.9 2.9 38.3 3.4 982 5 1048 92 72.9 2.4 176.0 15.9 201 69 297 139 16.2 6.8 48.4- 22.8 399 200 522 272 3o;6 16.1 86.6 45.6 ' 25 1 25 10 4 .0 2.3 4.1 1.7 49 25 58 . 38 5.7 4 .0 10.1 6.7 1237 1082 2604 2257 101.0 88.8 428.9 372.2 34 11 58 37 4.2 2.6 10.3 '. 7.0 . 130 125 67 49 12.6 12.2 10.9 8 .0 106 11 112 11 10.7 3.2 19.9 2 .1 550 284 493 • 269 44.3 23.5 82.3 45.5 16 5 22 13 3.3 2.5 3.7 2.2 AVERj iGES 561 280 690 306 44.1 23.6 115.7 51.0 TABLE 8 RESULTS FOR N = 40 R = 3 SAMPLE SIZE = 23 35 ITERATIONS RUNNING TIMES CSECS) KRUSKAL PRIM KRUSKAL PRIM END •' 'OPT 1 MUM ' END OPT 1 MUM END OPT 1 MUM END ' • OPT 1 MUM 94 34 100 57 14.1 7.0 26.4 15.2 830 333 926 643 95.2 40.6 238.6 166.5 40 2 52 2 8.7 4.5 13.3 .8 163 116 88 59 22.9 17.6 22.9 15.5 28 9 64 57 6.4 4.4 17.2 ' 15.5 145 79 382 345 21.9 14.5 98.5 89.1 22 1 37 34 5.0 2.8 9.7 8.9 3221 2284 ' 3437 2629 365.5 261.4 ' 834.1 685.0 63 18 63 18 11.4 6.5 15.9 4.7 826 536 86i_. 535 97.2 64.2 219.7 135.7 241 224 . 283 266 31.2 29.3 70.6 66.4 . 16 1 16 4 4.8 3.2 4.5 1.4 427 380 490 467 51.8 46.5 130.3 124.4 1033 679 1009 509 122.0 81.2 282.5.., 1 4 3 . 0 AVEPJ GES 510 336 558 402 61.3 4 1 . 7 146.0 105.0 TABLE 9 RESULTS FOR N = 5 0 R = 3 ' SAMPLE SIZE = 1 4 36 ITERATIONS RUNNING TIMES (SECS) KRUSKAL PR 1M KRUSKAL PRIM END 'OPT 1 MUM END OPT 1 MUM END OPTIMUM END OPT 1 MUM N=60 R=3 581 2 719 46 93.9 6.4 260.5 17.1 298 259 1597 1578 51.4 45.6 584.4 577.6 N=70 R=3 739 611 985 918 171.9 144.0 494.5 460.7 N=80 R=3 328 140 472 274 90.0 52.5 295.6 • 171.8 1009 873 487 224 269.2' 234.4 302.0 138.7 TABLE 10 RESULTS FOR N > 50 R = 3 Comparison With Linear Programming Solution 37 A different approach at solving the problem has been proposed by N. Deo and S.L. Hakimi (5). They set up the problem as a.linear program in the following manner. The total length of n-l links chosen from the distance matrix must be minimized subject to the following conditions. (1) Every vertex must have degree less than an equal to the problem restraint, r. (2) There is no isolated vertex in the graph. In order to define the linear program, i t is necessary to introduce many additional variables. These variables are bounded and can only assume the values 0 or 1. They end up with a simplex tableau involving n(n+l) variables in "^"^) + 1 equations, which for n = 15 becomes 240 variables in 136 equations. The above restraints do not guarantee that the solution to the linear program will yield a tree. There is a possibility of closed loops existing in the solution. • Deo and Hakimi incorporate a tree test algorithm to determine the feasibility of the solution. If the solution fails this tree test, the next smallest vector is forced into the solution and the linear program is iterated again. An example is given of a 14 x 14--incomplete - symmetric graph with a degree restraint of two. It is solved on an IBM 709 in a matter of 249 seconds. The same problem was solved by the Prim method in 47 seconds and by the Kruskal method in 18 seconds on an IBM 7044- The internal logic and arithmetic organization of the two machines is quite similar but the cycle time of the 7044 is six times faster than that of the 709. Therefore a rough estimate of the running time of the Deo and Hakimi solution on the 7044 would be about 40 seconds. One advantage of the branch and bound approach is that we will always obtain a reasonably good upper bound on our optimal solution i f we run out of time. With this linear programming approach, we have no such estimate i f the problem takes longer than a set time. H e u r i s t i c Suboptimization Schemes 39 The following i s an account of a few schemes designed to-decrease s i g n i f i c a n t l y the amount of computation involved i n a problem at the r i s k of f a i l i n g t o f i n d the optimal solution, and accepting a near optimal s o l u t i o n i n a few cases. The f i r s t of these guarantees that our answer w i l l be within a p r e s p e c i f i e d amount of the optimal s o l u t i o n . Suppose we are w i l l i n g t o accept a f e a s i b l e s o l u t i o n which d i f f e r s from the optimal s o l u t i o n by no more than 10%. I f a f e a s i b l e s o l u t i o n i s discovered with a t o t a l length of 2000 un i t s , then we can r e j e c t a l l subsets with bounds of 1819 or more (1.10 x 1819 = 2000.972000). A few of these suboptimization runs were made, both with the Kruskal method and the Prim method. The r e s u l t s are given i n Tables 11, 12 and 13- In each of these runs a 10$. suboptimization l i m i t i s allowed. The actual deviation from the known optimal s o l u t i o n i s tabulated. At the bottom of each t a b l e , the average amount of work and time saved i s given f o r the data set. Table 11 shows an average deviation of 2% from the optimums and i n 11 of the 20 cases the optimal s o l u t i o n was found. This set of data contains more d i f f i c u l t problems than the set of Table 4, as indic a t e d by the higher average f i g u r e s . The suboptimization e f f e c t i v e l y reduces these averages by more than 60% i n a l l cases. Table 12 gives a few r e s u l t s f o r n = 20 and r = 2. The savings there are a l l i n excess of 75%. The 10%> suboptimization run using the Prim method i s presented i n Table 13. In 16 of the 20 cases, the f i r s t f e a s i b l e s o l u t i o n reached i s within the 10$ l i m i t . This indicates that the 4-0 method gives a r e l a t i v e l y good i n i t i a l estimate of the optimal s o l u t i o n . The Kruskal method also gave t h i s good f i r s t guess, but because of the 2.1 second sort time, the average completion time over the same set was greater than the Prim method. The average amounts saved i n the Prim case were a l l greater than 85% with the average deviation from the optimal less than A % . The sort algorithm used f o r the Kruskal algorithm has a speed which i s pro p o r t i o n a l to N l n N, where N i s the s i z e of the unsorted l i s t . Since we are dealing with n x n complete symmetric matrices, these l i s t s are of s i z e " ( " " l ) . Therefore the net e f f e c t has the 2 sort times increasing by the order of n^ l n n (see Table l ) . The Kruskal algorithm i s f a s t e r f o r sparse matrices than f o r complete matrices since the sort time i s reduced due to the smaller l i s t of l i n k s r e q u i r i n g s o r t i n g . . At the moment, the Prim method i s not set up to deal with sparse matrices. The algorithm i s matrix oriented, with the p o s i t i o n of a l i n k i n the distance matrix representing the nodes which the l i n k j o i n s . Sparse matrices could be handled more e f f i c i e n t l y i f a l i s t of l i n k s was formed as i n the Kruskal algorithm. The d i f f e r e n c e between the completion time and the time when the optimal s o l u t i o n i s found i s often quite large (see Tables 3 to 10). ) This suggests concluding the algorithm a f t e r some set.time and taking our current best f e a s i b l e s o l u t i o n as an estimate of the optimal s o l u t i o n . The cases f o r n>50 and r = 3 appear to back up t h i s approach, i n that many of the i n i t i a l f e a s i b l e solutions are quite close to the optimal.• In the f i v e problems with n >50, four produced f i r s t estimates within 10% of t h e i r optimum solutions and one gave a f i r s t estimate within 15% of i t s optimum s o l u t i o n . 41 A time l i m i t of 30 seconds was imposed on the Kruskal method f o r the 20 problems of Table 11. The optimal s o l u t i o n was found i n 12 cases. The remaining problems averaged solutions within 13$ of the optimum. The largest deviation was 35$. The t o t a l time f o r the en t i r e set was 433 seconds. The time f o r the 10$ suboptimization run was 506 seconds and the complete algorithm took 1392 seconds. The same time l i m i t was used on the 'easier' set of Table 4» The optimal s o l u t i o n was found i n 16 of the 19 cases. The remaining three answers were within 0.5$, 13$ and 11$ r e s p e c t i v e l y . The t o t a l time saved over the basic algorithm was 111 seconds. A survey of the a v a i l a b l e data was made to see.what per cent of the t o t a l number of l i n k s was a c t u a l l y processed i n a r r i v i n g at an optimal s o l u t i o n . This survey was taken on the Kruskal r e s u l t s and i s summarized i n Table 14. The average p o s i t i o n i n the sorted l i s t of l i n k s , where the f i n a l l i n k of an optimal s o l u t i o n occurred, was obtained f o r each set of data. These f i g u r e s were expressed as a percentage of the t o t a l number of l i n k s a v a i l a b l e ( n ( n ~ l ) ) - Percentage f i g u r e s were also 2 obtained f o r the maximum l i s t p o s i t i o n s of the optimum solutions i n each, set. From these r e s u l t s , a proposed l i m i t , on the percentage of l i n k s r e q u i r i n g i n v e s t i g a t i o n , was estimated f o r each n, r combination. For example, from Table 14, f o r n = 50, r = 3, i t i s estimated that we only need to examine the smallest 245 (0.20 x 1225 = 245) l i n k s of the distance matrix. The average number of l i n k s prohibited i n an optimal so l u t i o n also appeared to be an i n t e r e s t i n g s t a t i s t i c to inv e s t i g a t e . Table 15 gives the average value and the maximum value of the number of l i n k s set t o OO at the optimal s o l u t i o n f o r each n and r studied. It was f e l t that a saving i n time could be r e a l i z e d by imposing these two l i m i t i n g conditions on the algorithm. V i z . , l i m i t the set of l i n k s studied to the smallest x% and at the same time l i m i t the number of l i n k s , y, set to i n f i n i t y (x and y are taken from Tables 14 and 15 r e s p e c t i v e l y ) . A few runs were t r i e d using both the maximum and the average values given i n Table 15 f o r the y l i m i t . There was no e f f e c t i v e reduction i n the number of i t e r a t i o n s and i n some cases, extra i t e r a t i o n s were required. • It was f e l t that the l i m i t s imposed on the algorithm, excluded some subsets which would have given us a better estimate of the optimal so l u t i o n at an e a r l i e r stage. Thus our bound value f o r r e j e c t i n g subsets was higher and consequently some a d d i t i o n a l subsets were examined. I t appeared that the number of subsets r e j e c t e d by t h i s l i m i t i n g strategy was approximately equal to the number that would have been rejected i f these lower bounds had been established e a r l i e r . A combination of some of these ideas could be used where furthe r reduction i n time i s desired. For example, the percentage suboptimization could also be used with a lower time l i m i t since the complete s o l u t i o n t r e e i s much smaller than i t i s i n the main algorithm. 43 ITERATIONS RUNIUNG E04ES (S3CS.) • - cfl Create than Opti-mum Kruskal Kruskal Suboptimur; Kruskal (ruskal Suboptimum • End Dptirrium End Outiiuum End Opt imam End Optimum 2091 1452 340 226 - 36.9 26.1 6.1 4.2 0. 5_„ ' 6 2004 256 910 1 35.0 _A-8 16.0 152.6 .2... 42.1 .20596 . 3276 14951 8586 2291 371.1 _ 2 7 2 ^ L _ 50.8 2989 1103 509 55.4 18^5„_.. .14.1 ..._..l_-.2„... 24.8 ._9JP__.__ ...3.3. . _ ? _ _ . „ _ - 2.8 .... .5. . . _ . . 5 _ _ . ....._0 ., ..: 0.. 0... 2 . .2 . 4 1914 1578 705 443 _38._2_ 5.3 '' __3l_.:4__ .2 313 1 66 1 • 8171 1100 1538 449 134.1 19.1 1 9 7 2 0 7274 322 223 -3QJL 20.1 -AJL . 4 » 0 „ .. 9349 4166 J3HJL • 72.4 ' 84.3 128.5 ...15.8J,._._ __24.JL_ 34_,8 ...7.3.^0 .17., 2 24.1 1992 2588 1370 908 48.6 4866 3380 2034 1376 '59.7 554. 160 222 36 9.2 2.8 ..__..3..i__ .4 „_Q.B_.. ___^ 2 .2 ___4_.8 .....6.8 6.0 .10.^9-.2 .... 3. 0 0 62 1 15 1 1.2 .2 218 1 • 2 0 1 3.6 .2 .4 ' 588 555 275 • 266 10.6 10.1 17 . . 5 ^ 4.9 0 .0. . 0 ... 0. 0 753 . 636 526 _ J S 1 _ „ 329 28 a 9,0 . >: 367 13.8 11.9 6.5 3 6 , U 2766 720 589 63.3 49.3 12.9 192 1 - 16 1 3.5 .2 _._ -4__ 33*58 2978 641 1 58 .3 52.0 I I J J L - _ . J , 2 _ _ 7 AVER \GES 3988 2241 1456 610 69.6 40.2 25.3 11.2 2.0 AMOUNT SAVED 63$ 73$ 62$ 72$ — TABLE 11 RESULTS FOR IT SAMPLE SIZE = 1 5 R = 2 = 20 y 44 ITERATIONS RUTIXTIHG TIKES (SECS.) Jreate than Opti-mum Kruskal Kruskal Suboptimum Kruskal {ruskal Suboptimum End Optimum End Optimum End Optimum End Optimum 2443 1762 613 324 ~ 61.2 . 4 4 ^ . 15-A_ _ 8,3 ...2.4... 19-9 14,3. 5 6 14565 9520 3504 64 381.5 251.3 89.3 .5984 6007 J8H2__ 4986 1Q31_ 1553 543 150.9 ?8.1 .......... 3. 1 126.8 .37.7 .. LGES AVERi 7249 5037 1675 412 188.0 42.6 11.2 2.7 a i l . . . . . . . AM3UNT SAVED 11% 92% ———•— . . . . . . > -— -TABLE 12 RESULTS FOR K = 20 R = 2 SAMPLE SIZE = 4 45 ITERATIONS KUH-UNG TIKES (SECS.) < Creater than Opti-mum Prim Prim Suboptimun Prim Prim Suboptimum End Dptimum End Optimum End Optimum End Out Irauiti 199 169 22 1 - 3 3 . 8 2 8 . 8 3.7 . 3 ... .1.9 ,5. .7 _ 4 . 1 . 0 .. 4 0 1192 724 179 158 200.0 121.9 .30.1. 1*J2._.:_ . 7 9.0 5.2 319 _ 6398 116 1196 19 11 55.6 20.1 467 1 1086 ,4 .. __6.8 1 0 . 9 . _ 2 Q l i 8 _ _.6 .3 38 1 29 1 64 1 13 1 2.2 . .... ,3 0 418 294 25 1 70.2 49.5 4.3 .4... .4 _3.0 ... ,3 3 3 .e_ .JL .......2... .. 6 4 8 . . . . ?... 3 8 2 . 7 1 2 9 7 0 , 3„.9 1186 105H 79 1 192 .6 170.3 13-3 85 77 31 17 14.6 __13 ;3_. 24.^ 5.3 352 148 112 1 58 .4 . .18,7 4.6_ 4.1~_ 66^4, 2.7 73 56 25 1 12 .9 9.9" 238 20 _21 1 38.3 3 .4 1048 92 397 1 176.0 15.9 297 139 16 .1 48^4 22.8 522 272 78 1 8 6 . 6 45.6 13.1 .5 25. 10 1 .4.1 l.J__ 6.7 2.1 2.3 , 2 ^3 58 . 38 13 1 10.1 2604 2257 436 1 428.9 372.2 72.0 3 58 .37 19 1 10.3 _ _ Z - 0 _ 8.0 3 ,3 . 6.5 .5 4 , 6 67 49 40 28 1 0 . 9 AVEF AGES 762 337 102 12 127 .8 56.2 17'. 1 2.1___ AMOUNT SAVED 86% 96% 86% 9 6 i ' •- -- • TABLE 13 RESULTS FOR U = 40 R = 3 SAMPLE SIZE = 20 N R % LINKS SEARCHED FOR OPTIMUM SOLUTION PROPOSED LIMIT AVERAGE MAXIMUM 10 2 3 8 % 5 8 % 6 0 % 1 5 2 32% 6 0% 60% 20 2 2 0 % 23% 5 0 % 20 3 17% 36% 4 0 % 30 3 1 4 % 21 % 3 0 % 40 3 1 0 % 13% 20% 50 3 9 % 1 5 % 2 0 % 60 3 - 1 2 % 1 5 % 70 3 — 10% 1 5 % 80 3 - 6% 10% TABLE 14 KRUSKAL SEARCH LIMITS N R NUMBER OF LINKS SET TO CO AT OPTIMUM AVERAGE MAXIMUM 10 2 6 1 1 15 2 1 2 23 20 2 1 8 26 20 3 2 8 30 3 3 9 40 3 6 1 1 50 3 6 1 0 60 3 12 1 2 70 3 6 1 0 80 3 1 0 1 0 TABLE 15 NUMBER OF PROHIBITED LINKS FOR N,R. Conclusions A branch and bound algorithm has been developed to solve the problem of f i n d i n g the restrained minimal spanning tree f o r a symmetric graph. Two options i n the basic algorithm are presented and both of these methods are used to solve a wide v a r i e t y of problems. The r e s u l t s i n d i c a t e that the Kruskal method i s the more e f f i c i e n t one of the two. The r e l a t i v e e f f i c i e n c y of the Kruskal method over the Prim method i s d i r e c t l y p r o p o r t i o n a l to the degree of complexity of the problem. In many of the ' d i f f i c u l t ' problems with degree r e s t r i c t i o n s equal to two and distance matrix dimensions of twenty, the Prim method f a i l e d to terminate within a f i x e d time of twenty minutes. • The Kruskal method solved t h i s same set i n an average time of f i v e minutes a problem. • On the other hand, the two methods are quite comparable f o r the 'easier' c l a s s of problems with degree r e s t r a i n t s of three and matrix dimensions no greater t h a n . t h i r t y . In most cases, the Kruskal method gives a better i n i t i a l estimate of the optimal s o l u t i o n than the Prim method. When the degree of d i f f i c u l t y of a problem i s such that even the Kruskal approach f a i l s to y i e l d an optimal s o l u t i o n i n a fi x e d length of time, then one or more of the h e u r i s t i c methods of section 3-3 may be used. Of these methods, the one which appears to give the largest time savings combined with smallest average deviation from the optimal s o l u t i o n , i s the percentage suboptimization technique. 49 The Kruskal method i s the superior one f o r dealing with sparse matrices, since the sort time i s les s than f o r complete matrices due to the reduction i n the number of l i n k s . The algorithms presented here were only developed f o r symmetric matrices, but the branch and bound portion would be exactly the same f o r the nonsymmetric case. Only the minimal spanning tree algorithms would have to be a l t e r e d to deal with nonsymmetric matrices. BIBLIOGRAPHY 50 1. Edwin F. Bechenbach, "Applied Combinatorial Mathematics" (Wiley). 2. Claude Berge, "The Theory of Graphs and I t s Applications", A. Doig (trans.), Methnen, London, 1962. 3. J.D.C.. L i t t l e , K.G. Murtz, D.W. Sweeney, and C. Karel, "An Algorithm for the T r a v e l l i n g Salesman Problem", Operations Research 11, 972-989 (1963). 4. E.L. Lawlerand D.E. Wood, "Branch-and-Bound Methods: A Survey", Operations Research 14, 699-719 (1966). 5. Narsingh Deo and S.L. Hakimi, "The Shortest Generalized Hamiltonian Tree", 3rd A l l e r t o n Conference on C i r c u i t and System Theory (1965). 6. J.B.. Kruskal, "On the Shortest Spanning Subtree of a Graph and the T r a v e l l i n g Salesman Problem", Proceedings of American Mathematical Society, V o l . 7, pp 48-50, 1956. , 7. R.C. Prim, "Shortest Connection Networks and Some Generalization, B e l l System Technical Journal, V o l . 36, pp 1389-1401. 8. M. Bellmore and G.L. Newhauser, "The T r a v e l l i n g Salesman Problem: • A Survey", Operations Research 16, 538-558, (1968). 9. C.A.C.M. Algorithm No. 154. 10. W.L.. Eastman, "Linear Programming with Pattern Constraints", Ph.D. Di s s e r t a t i o n , Harvard, 1958. 11. Shapiro, D., "Algorithms f o r the Sol u t i o n of the Optimal Cost T r a v e l l i n g Salesman Problem", Sc.D. Thesis, Washington University, St. Louis, 1966. 12. M. Held and. R.M. Karp, "A Dynamic Programming Approach t o Sequencing Problems", Journal S.I.A.M. 10, 196-210 (1962). 13. L i n , S., "Computer Solution of the T r a v e l l i n g Salesman Problem", B e l l System Techn. G 44, 699-719 (1966). 14. U n i v e r s i t y of B r i t i s h Columbia Computing Centre, Ml UBC SORT. APPENDIX Description of the Computer Program 51 The algorithm was programmed i n Fortran IV f o r an IBM 70AA computer. F i g . 7 shows a general block diagram of the program. Box 1 of the f i g u r e denotes the minimal spanning t r e e algorithm where the tree l i n k s are selected from the given distance matrix. E i t h e r the Kruskal or the Prim algorithm i s used here. Box 6 indicates the routine f o r 'bunching' the l i n k s from l i k e trouble nodes. These sets of l i n k s are stored temporarily i n a matrix u n t i l a. f e a s i b l e s o l u t i o n i s reached, or the current bound i s exceeded (box 2). Then they are added t o the pushdown stack i n consecutive levelkj each l e v e l representing a d i f f e r e n t trouble node, (box A). The following routine i s incorporated i n t o the algorithm t o save unnecessary redetermination of the minimum t r e e l i n k s which were chosen before the degree r e s t r a i n t s were exceeded. As soon as a node exceeds the degree r e s t r i c t i o n s , the status of the current subproblem i s recorded; e.g. p a r t i a l t r e e length, selected l i n k s , trouble node, etc. as shown i n box 3- Then, using switch 1, the minimal tr e e i s completed without further checking of the r e s t r a i n t s . I f the length of t h i s t r e e i s smaller than the current f e a s i b l e bound, the superfluous l i n k s at the trouble node are excluded from the distance matrix, (box 5). The minimum spanning t r e e algorithm i s then r e s t a r t e d using the stored p a r t i a l t r e e information, (box 7). The time savings become very s i g n i f i c a n t when a trouble node occurs more than halfway through the minimum t r e e algorithm. As indicated i n box 8, the algorithm terminates when the pushdown stack i s empty. 52 START READ N,R AND DlSTANCE MATRIX ^ T U R N SWI O F t ^ ADJUST DISTANCE MATRIX DT GENERATE NEXT COMBINATION CHOOSE NEXT LINK OF MIN. TREE YES 3 A D O B U N C H M A T R I X T O P D S A N D S E T U P T H E C O M B I N A T I O N S A T E A C H L E V E L '• YES D E G R E E "V YES R E S T R I C T I O N S O.K. 2l R E C O R D C U R R E N T S T A T U S OF P A R T I A L T R E E A N D ' T R O U B L E ' NODE I ^ T U R N SWI O N ^ ^ SET S U P E R F L U O U S ' T R O U B L E ' L I N K S TO OO IN D I S T A N C E M A T R I X L U GROUP T R O U B L E L I N K S IN B U N C H M A T R I X IT I N I T I A T E R E S T A R T P R O C E D U R E NO YES P D S E M P T Y YES FIGURE 7 PROGRAM BLOCK DIAGRAM