Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Algorithms and hardware for the solution of set partitioning problems Saxton, Timothy Howard 1981

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1982_A7 S39.pdf [ 4.6MB ]
Metadata
JSON: 831-1.0065574.json
JSON-LD: 831-1.0065574-ld.json
RDF/XML (Pretty): 831-1.0065574-rdf.xml
RDF/JSON: 831-1.0065574-rdf.json
Turtle: 831-1.0065574-turtle.txt
N-Triples: 831-1.0065574-rdf-ntriples.txt
Original Record: 831-1.0065574-source.json
Full Text
831-1.0065574-fulltext.txt
Citation
831-1.0065574.ris

Full Text

ALGORITHMS AND HARDWARE FOR THE SOLUTION OF SET PARTITIONING PROBLEMS by TIMOTHY HOWARD SAXTON B.E., U n i v e r s i t y of Saskatchewan, 1979 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE i n THE FACULTY OF GRADUATE STUDIES ( Department of E l e c t r i c a l E n g i n e e r i n g } We accept t h i s t h e s i s as conforming to the r e q u i r e d standard. THE UNIVERSITY OF BRITISH COLUMBIA December, 1981 © Timothy Howard Saxton, 198 1 In presenting t h i s thesis i n p a r t i a l f u l f i l m e n t of the requirements for an advanced degree at the University of B r i t i s h Columbia, I agree that the Library s h a l l make i t f r e e l y a v a i l a b l e for reference and study. I further agree that permission for extensive copying of t h i s thesis for s c h o l a r l y purposes may be granted by the head of my department or by his or her representatives. It i s understood that copying or p u b l i c a t i o n of t h i s thesis for f i n a n c i a l gain s h a l l not be allowed without my written permission. Department of The University of B r i t i s h Columbia 2075 Wesbrook Place Vancouver, Canada V6T 1W5 Date fly. IS / tfftf  n F - f i ( 2 / 7 9 ) i i A b s t r a c t T h i s t h e s i s i s concerned with an i n t e g e r l i n e a r programming problem, c a l l e d the set p a r t i t i o n i n g problem. The problem i s : minimize cx, s u b j e c t to Ax=e, where A i s a b i n a r y m a t rix, e i s a column of ones, and c i s a c o s t v e c t o r . D e s p i t e i t s p a r t i c u l a r l y simple s t r u c t u r e , the set p a r t i t i o n i n g problem has p r a c t i c a l a p p l i c a t i o n s i n many d i f f e r e n t a r e a s . There are many a p p l i c a t i o n s , however, f o r which the set p a r t i t i o n i n g problem i s so l a r g e and d i f f i c u l t t h a t i t cannot be s o l v e d i n a reasonable amount of time. The b i n a r y s t r u c t u r e of the p a r t i t i o n i n g problem suggests t h a t s p e c i a l d i g i t a l hardware c o u l d be developed to s o l v e these problems more e f f i c i e n t l y . Two new a l g o r i t h m s , one that c o u l d e x p l o i t such hardware, are presented and e v a l u a t e d i n t h i s t h e s i s . They are s i m i l a r i n that they use the same bounding procedure. The f i r s t i s a branch-and-bound a l g o r i t h m , and the second uses the more s o p h i s t i c a t e d A* [16] branching s t r a t e g y . Comparisons with one of the b e t t e r set p a r t i t i o n i n g a l g o r i t h m s i n the l i t e r a t u r e i n d i c a t e s that the new a l g o r i t h m s perform s i g n i f i c a n t l y b e t t e r . H e u r i s t i c v e r s i o n s of the A* a l g o r i t h m are a l s o i n v e s t i g a t e d . The h e u r i s t i c a l g o r i t h m s promise good sub-optimal s o l u t i o n s to very l a r g e p a r t i t i o n i n g problems. A h y b r i d a l g o r i t h m formed by combining the A* a l g o r i t h m with the branch-and-bound a l g o r i t h m i s shown to y i e l d o p t i m a l s o l u t i o n s to l a r g e problems i n minimal time. To s o l v e p a r t i t i o n i n g problems the g e n e r a l i z e d design of a m i c r o p r o c e s s o r based system, supplemented with s p e c i a l hardware, i s i n v e s t i g a t e d . Such a system would be o r g a n i z e d as a p e r i p h e r a l d e v i c e a t t a c h e d t o the host computer. I t would be c o s t e f f e c t i v e because i t would allow the s o l u t i o n of very l a r g e p a r t i t i o n i n g problems without u s i n g the expensive r e s o u r c e s of a mainframe computer. Such a system should be of i n t e r e s t t o those i n a p p l i c a t i o n areas t h a t r e q u i r e the s o l u t i o n of many l a r g e p a r t i t i o n i n g problems. i v Table of Contents A b s t r a c t i i Table of Contents .. i v L i s t of F i g u r e s v i L i s t of T a b l e s v i i Acknowledgements v i i i 1. I n t r o d u c t i o n and A p p l i c a t i o n s 1 1.1 I n t r o d u c t i o n 1 1.2 A p p l i c a t i o n s 4 1.3 A Look Ahead 8 2. Review and Survey 10 2.1 NP-Complete Problems 11 2.2 The Search Tree 12 2.3 Branch-and-Bound and Enumerative Algorithms ...... 14 2.4 L i n e a r Programming 16 2.4.1 B a s i c S o l u t i o n s 16 2.4.2 I n v o l u t o r y Bases 18 2.4.3 Extreme Po i n t P r o p e r t i e s of the SPP 19 2.5 Reductions 20 2.6 The Set Covering Greedy H e u r i s t i c 23 2.7 A Survey of E x i s t i n g Algorithms 24 2.8 The G a r f i n k l e and Nemhauser A l g o r i t h m 27 2.9 The P i e r c e and Lasky A l g o r i t h m 30 V 3. New Al g o r i t h m s f o r the Set P a r t i t i o n i n g Problem ...... 38 3.1 Bounding Algorithms 38 3.1.1 Bounding A l g o r i t h m I 39 3.1.2 Bounding A l g o r i t h m II 43 3.2 Two New Algorithms f o r the SPP 46 3.2.1 A Branch-and-Bound A l g o r i t h m 46 3.2.2 An I m p l i c i t Enumeration A l g o r i t h m 48 3.3 Some Implementation D e t a i l s 54 3.4 A New A l g o r i t h m f o r the Set Covering Problem ..... 56 3.5 Test Problems and Comparing R e s u l t s 58 3.6 T e s t R e s u l t s 61 3.6.1 Set P a r t i t i o n i n g Test R e s u l t s 63 3.6.2 Set Covering Test R e s u l t s 70 3.7 Summary 74 4. S p e c i a l Hardware f o r the S o l u t i o n of SPPs 77 4.1 Some C o n s i d e r a t i o n s 78 4.2 S p e c i a l Hardware 80 4.3 A Few D e t a i l s . 85 5. Summary and C o n c l u s i o n s 88 5. 1 Summary 88 5.2 C o n c l u s i o n s 92 B i b l i o g r a p h y 94 Appendix A: A Program f o r Gene r a t i n g Test Problems 97 Appendix B: Input Data f o r the Test Problems 100 v i L i s t of F i g u r e s 2.1 A simple t r e e 13 2.2 C o n s t r a i n t s i n block format 28 2.3 Convex f u n c t i o n (3(u) 33 2.4 (3 0(u) f o r problem i n Table 2.2 33 2.5 Search t r e e f o r problem i n Tab l e 2.2 36 3.1 SPPBB s o l u t i o n t r e e 47 3.2 SPPA* s o l u t i o n t r e e 50 4.1 A system f o r s o l v i n g p a r t i t i o n i n g problems 81 4.2 Flowchart of Bounding A l g o r i t h m I 82 4.3 S p e c i a l hardware 83 4.4 A row of the s p e c i a l hardware 83 v i i L i s t of Tables 2.1 An example SPP 31 2.2 Example SPP formatted i n t o b l o c k s 31 3.1 Example problem formatted i n t o ascending CPRs 42 3.2 P a r t i t i o n i n g t e s t r e s u l t s 64 3.3 SPPHA* performance r e l a t i v e to SPPA* 68 3.4 SCPA* computational r e s u l t s 72 3.5 Computational r e s u l t s u s i n g SCPA* and E t c h e b e r r y * s a l g o r i t h m ......... 73 3.6 SCPHA* performance r e l a t i v e to SCPA* 73 VI 1 1 Acknowledgements I would l i k e to thank my s u p e r v i s o r , Dr. Mabo R. I t o , f o r h i s a s s i s t a n c e and support. I would a l s o l i k e to thank Robert Cameron f o r the many ideas he o f f e r e d d u r i n g our frequent c o n v e r s a t i o n s . F i n a n c i a l support f o r t h i s work was p r o v i d e d by a N a t u r a l S c i e n c e s and E n g i n e e r i n g Research C o u n c i l Postgraduate S c h o l a r s h i p , and by the U n i v e r s i t y of B r i t i s h Columbia. 1 CHAPTER 1 I n t r o d u c t i o n and A p p l i c a t i o n s 1 .1 I n t r o d u c t i o n T h i s t h e s i s i s concerned with a c l a s s of i n t e g e r o p t i m i z a t i o n problems known as set p a r t i t i o n i n g problems (SPPs). The mathematical s t r u c t u r e of these problems i s simple, as i s the mathematical n o t a t i o n that d e f i n e s them. The d i f f i c u l t y f a c e d by those who attempt to s o l v e SPPs i s g e n e r a l l y r e l a t e d to the massive s i z e of . problems generated by p r a c t i c a l a p p l i c a t i o n s . Many important o p t i m i z a t i o n problems cannot be adequately handled because of the e x c e s s i v e time and resources r e q u i r e d to s o l v e these l a r g e problems. The o b j e c t i v e of t h i s work i s to i n t r o d u c e new and e f f i c i e n t a l g o r i t h m s f o r s o l v i n g p a r t i t i o n i n g problems. Furthermore, u s i n g the a l g o r i t h m s i n t r o d u c e d i t w i l l be shown th a t i t i s p o s s i b l e to d e f i n e a system u t i l i z i n g s p e c i a l hardware to s o l v e the SPP. The purpose of such a system i s not n e c e s s a r i l y to outperform a mainframe computer on these types of problems. Rather, the hardware makes p o s s i b l e the design of a simple, economical system which c o u l d s o l v e l a r g e p a r t i t i o n i n g problems while l e a v i n g the host computer f r e e to perform i t s normal t a s k s . Such a l g o r i t h m s and hardware should be of 2 i n t e r e s t t o those who have to s o l v e many l a r g e p a r t i t i o n i n g problems. The set p a r t i t i o n i n g problem can be expressed m a t h e m a t i c a l l y as: min z 0= Ip t Cj Xj ij i , ajj Xj = 1 , i= 1 , — ,m Xj = 0,1 , j= 1 , — ,n where a,j = 0,1. In t e r s e matrix n o t a t i o n the problem becomes: min z 0= cx SPP Ax = e Xj = 0,1 where c i s a 1 x n c o s t v e c t o r d e f i n i n g the o b j e c t i v e f u n c t i o n and e i s the m v e c t o r (1,1,...,1). C l o s e l y r e l a t e d to the SPP i s the s e t c o v e r i n g problem (SCP), expressed as: m i n z 0 = c x SCP Ax > e = 0 ,1 and the set packing problem 3 max z 0= cx Ax £ e xj = 0 , 1 . An a l t e r n a t e d e f i n i t i o n , from which the problems r e c e i v e t h e i r ' s e t ' p r e f i x , comes from the i n t e r p r e t a t i o n of the problem as a c o l l e c t i o n of s e t s . Let 1= {l,...,m}, and A be the s e t {A 1 fA 2,...,A }, such that A j S l f o r jeJ= {l , 2,...,n}. A subset of J , say J ' , such t h a t U ^ A = I and AjflA k= 0 f o r j^k and k t J ' , d e f i n e s a p a r t i t i o n of I. I f the l a s t r e s t r i c t i o n , Aj(lA k= 0, i s r e l a x e d , a c o v e r i n g of I i s d e f i n e d . Set p a r t i t i o n i n g and set c o v e r i n g problems, along with the t r a v e l i n g salesman problem, form the three most important problems i n pure i n t e g e r programming. They have been s t u d i e d , mainly i n the past f i f t e e n y e a r s , because many important o p t i m i z a t i o n problems can be formulated as one of these t h r e e problems. The t o o l s to i n v e s t i g a t e these problems became a v a i l a b l e as new techniques i n l i n e a r programming were advanced. The success l i n e a r programming had i n the f i e l d s of management s c i e n c e and o p e r a t i o n s r e s e a r c h l e d n a t u r a l l y to the a p p l i c a t i o n of l i n e a r programming as a t o o l f o r s o l v i n g i n t e g e r l i n e a r problems. The set p a r t i t i o n i n g problem i s a more c o n s t r a i n e d problem than the set c o v e r i n g problem. In f a c t the SCP i s a more gen e r a l problem i n that any SPP can be expressed as a SCP by a s u i t a b l e t r a n s f o r m a t i o n of the c o s t v e c t o r . For i n s t a n c e , l e t t = I^j a,| and choose any L such that L t l ] ^ Cj . Then: 4 Theorem 1 [14] I f a SPP d e f i n e d by A and c has a p a r t i t i o n s o l u t i o n , the SCP d e f i n e d by A and C j = C j + L t j , j= 1,...,n f has the same set of o p t i m a l s o l u t i o n s . S i n c e any SPP can be s o l v e d as a SCP, one might c o n s i d e r i t n e cessary to only develop e f f i c i e n t a l g o r i t h m s f o r the s o l u t i o n of c o v e r i n g problems. However, i t i s known that the e x c l u s i v e nature of p a r t i t i o n i n g c o n s t r a i n t s (each row must be covered e x a c t l y once) make p o s s i b l e a l g o r i t h m s which e x p l o i t the s p e c i a l s t r u c t u r e more e f f i c i e n t l y than SCP a l g o r i t h m s . Furthermore, the p a r t i t i o n i n g problem i s of enough importance i n i t s own r i g h t to j u s t i f y s p e c i a l a t t e n t i o n . 1.2 A p p l i c a t i o n s The f o l l o w i n g i s a b r i e f d e s c r i p t i o n of some of the a p p l i c a t i o n s of SPPs and SCPs. Although t h i s t h e s i s i s p r i m a r i l y concerned with the p a r t i t i o n i n g problem, the c o v e r i n g problem i s a l s o i n c l u d e d i n the a p p l i c a t i o n s . The two problems are so c l o s e l y r e l a t e d t h a t , depending on the a p p l i c t i o n , i t may be p o s s i b l e to r e p l a c e a c o v e r i n g problem w i t h an e q u i v a l e n t p a r t i t i o n i n g problem. A l s o , some of the r e s u l t s to be presented l a t e r have a p p l i c a t i o n s to SCPs. 5 P o l i t i c a l D i s t r i c t i n g [13] To e s t a b l i s h e l e c t o r a l boundaries i t i s necessary to e s t a b l i s h e l e c t o r a l d i s t r i c t s composed of p o p u l a t i o n u n i t s (eg. c o u n t i e s , towns, c i t i e s , e t c . ) . I f I i s the set of p o p u l a t i o n u n i t s , then Aj (Aj£l) d e f i n e s a d i s t r i c t i n A p r o v i d i n g i t s a t i s f i e s requirements of p o p u l a t i o n , c o n t i g u i t y , and compactness. I f there are K d i s t r i c t s r e q u i r e d and Cj i s the u n a c c e p t a b i l i t y of d i s t r i c t j , then the p a r t i t i o n s o l u t i o n along with the requirement Xj = K g i v e s the optimal d i s t r i c t i n g p l a n . A i r l i n e Crew Sch e d u l i n g [1] • The o p t i m a l s c h e d u l i n g of f l i g h t crews to f l i g h t segments or ' l e g s ' i s a set p a r t i t i o n i n g problem important to the economic o p t i m i z a t i o n of an a i r l i n e ' s o p e r a t i o n . To o b t a i n a f e a s i b l e f l i g h t schedule an a i r l i n e must a s s i g n c a b i n and f l i g h t p e r s o n n e l to every f l i g h t i n t h e i r f l i g h t t i m e t a b l e over a c e r t a i n p l a n n i n g p e r i o d . The schedule must s a t i s f y c e r t a i n r e g u l a t i o n s and r e s t r i c t i o n s . These r e g u l a t i o n s i n c l u d e s a f e t y r e g u l a t i o n s , union reguirements, company p o l i c y and economic c o n s i d e r a t i o n s . The l a s t requirement i s the o b j e c t i v e f u n c t i o n , while the p r e v i o u s p o i n t s serve as the c o n s t r a i n t s . A r o t a t i o n i s a round t r i p flown by a crew from, and r e t u r n i n g t o , t h e i r home base. Problem g e n e r a t i o n begins with a computer g e n e r a t i n g f e a s i b l e r o t a t i o n s that s a t i s f y the p r e v i o u s l y mentioned requirements. A s o l u t i o n i s d e f i n e d by the 6 subset of r o t a t i o n s which cover each f l i g h t l e g once. The case i n which a f l i g h t l e g i s covered more than once i s known as 'deadheading'. T h i s means a f l i g h t crew i s allowed to occupy passenger s e a t s i n order to p o s i t i o n themselves f o r t h e i r next f l i g h t l e g , or to r e t u r n them to t h e i r home base. I f deadheading i s p e r m i t t e d i n the model a set c o v e r i n g problem i s d e f i n e d . Since many thousands of l e g a l r o t a t i o n s can be generated and there can be s e v e r a l hundred f l i g h t l e g s , i t i s u s u a l l y necessary to reduce the problem to a manageable s i z e with h e u r i s t i c t e c h n i q u e s . Thus, a i r l i n e s are o f t e n r e s t r i c t e d t o sub-optimal s o l u t i o n s . Truck D e l i v e r i e s [ 5 ] The problem of making d e l i v e r i e s to m l o c a t i o n s along n f e a s i b l e r o u t e s i s a set p a r t i t i o n i n g problem t y p i c a l of many d e l i v e r y problems. Cost C j i s the c o s t of route j , and ay = 1 i f l o c a t i o n i i s on route j . Problem s t r u c t u r e w i l l be s i m i l a r t o the a i r l i n e crew s c h e d u l i n g problem although d i f f e r e n t c r i t e r i a would be used f o r the f o r m u l i z a t i o n , g e n e r a t i o n , and r e d u c t i o n of the problem. Information R e t r i e v a l [ 9 ] Suppose that an i n f o r m a t i o n system c o n s i s t s of n f i l e s , each f i l e j having l e n g t h Cj . A u n i t of i n f o r m a t i o n , i , i s i n c l u d e d i n f i l e j i f ay= 1. The optimal cover i s the minimum 7 t o t a l s i z e of the subset of f i l e s which i n c l u d e s a l l m u n i t s of i n f o r m a t i o n c o n t a i n e d i n the i n f o r m a t i o n system. C o l o r i n g Problem The c o l o r i n g problem i s a w e l l known mathematical problem. The s o l u t i o n r e q u i r e s the c o l o r i n g of a map such t h a t no two a d j a c e n t areas have the same c o l o r . Let I be a set of areas and Aj be a subset of I; then Aj i s i n A i f no two elements of Aj c o r r e s p o n d to adjacent a r e a s . I f a l l c o s t s a r e of u n i t v a l u e the o p t i m a l p a r t i t i o n r e p r e s e n t s the minimum number of c o l o r s r e q u i r e d . Optimal S w i t c h i n g Networks, O p t i m i z a t i o n of Boolean Formulas [8,25,26] Cons i d e r the problem of d e s i g n i n g a s w i t c h i n g network f o r a d i g i t a l computer, a telephone c e n t r a l or a c o n t r o l system. The problem i s to design a c i r c u i t which a c c e p t s k b i n a r y i n p u t s and has a s i n g l e b i n a r y output which i s a f u n c t i o n of the 2 p o s s i b l e i n p u t s . Often these c i r c u i t s are d e s i g n e d u s i n g only AND-gate components or OR-gate components. With t h i s q u a l i f i c a t i o n , f i n d i n g the cheapest c i r c u i t which w i l l implement the f u n c t i o n r e q u i r e s the s o l u t i o n of a c o v e r i n g problem. • C l o s e l y r e l a t e d to d e s i g n i n g optimal s w i t c h i n g networks i s a l o g i c i a n ' s problem of m i n i m i z i n g the r e p r e s e n t a t i o n of l o g i c a l f u n c t i o n s . The l o g i c i a n d e s i r e s the Boolean formula which maps Boolean v a r i a b l e s i n t o a given t r u t h t a b l e . G e n e r a l l y there are 8 many f u n c t i o n s which have the same t r u t h t a b l e , but the s i m p l e s t one i s d e s i r e d . T h i s i s the same problem as wit h o p t i m a l s w i t c h i n g networks s i n c e a 't r u e ' corresponds t o a 'hig h ' , a ' f a l s e ' to 'low', c o n j u n c t i o n (pAq) to an and-gate, and d i s j u n c t i o n (pVq) to an o r - g a t e . 1.3 A Look Ahead I t i s apparent from the p r e v i o u s s e c t i o n t h a t SCPs and SPPs have many p r a c t i c a l a p p l i c a t i o n s . However, d e s p i t e b e i n g among the s i m p l e s t problems i n i n t e g e r l i n e a r programming, the l a r g e s i z e of p r a c t i c a l problems make them n o t o r i o u s l y d i f f i c u l t problems to s o l v e . A c c o r d i n g to Balas and Padberg [ 4 ] , a i r l i n e crew s c h e d u l i n g problems with 300-500 c o n s t r a i n t s and 2500-4000 v a r i a b l e s are sometimes s o l v e d , but o f t e n much s m a l l e r problems are u n s o l v a b l e i n reasonable time l i m i t s . These time l i m i t s a re o f t e n measured i n tens of CPU minutes on mainframe computers. T y p i c a l e x e c u t i o n times f o r t e s t problems as s m a l l as 50x100 can take s e v e r a l seconds on mainframe computers. T r a d i t i o n a l methods f o r s o l v i n g i n t e g e r problems, based on l i n e a r programming, have not proved e n t i r e l y s u c c e s s f u l when a p p l i e d to SCPs and SPPs. When l o o k i n g through the l i t e r a t u r e one c o n s t a n t l y reads that l i n e a r programming i s a b o t t l e n e c k i n any a l g o r i t h m that uses i t . Consequently, Chapter 2 w i l l i n t r o d u c e a group of r e l a t e d branch-and-bound a l g o r i t h m s t h a t 9 have a minimal dependence on l i n e a r programming. Computational r e s u l t s w i t h these a l g o r i t h m s have compared f a v o r a b l y w i t h o t h e r methods. Chapter 2 w i l l a l s o d i s c u s s a d d i t i o n a l background i n f o r m a t i o n r e l a t e d t o set p a r t i t i o n i n g problems. Chapter 3 w i l l move from the i n t r o d u c t o r y m a t e r i a l t o i n t r o d u c e two new a l g o r i t h m s f o r the set p a r t i t i o n i n g problem. A l s o , a new a l g o r i t h m f o r the set c o v e r i n g problem i s p r e s e n t e d . The p a r t i t i o n i n g a l g o r i t h m s are compared w i t h one of the a l g o r i t h m s from Chapter 2 by comparing t e s t r e s u l t s o b t a i n e d u s i n g the same set of t e s t problems. The simple b i n a r y s t r u c t u r e of SPPs, c o u p l e d with the newly d e f i n e d a l g o r i t h m s of Chapter 3, makes p o s s i b l e the d e f i n i t i o n of simple hardware f o r the purpose of s o l v i n g p a r t i t i o n i n g problems. Chapter 4 w i l l i n t r o d u c e and e x p l a i n the f u n c t i o n of such hardware. I t w i l l i n v e s t i g a t e the f e a s i b i l i t y of i n c o r p o r a t i n g the hardware i n an economical m i c r o p r o c e s s o r c o n t r o l l e d system. Chapter 5 w i l l summarize the r e s u l t s . C o n c l u s i o n s and su g g e s t i o n s f o r f u r t h e r r e s e a r c h w i l l be made. 1 0 CHAPTER 2  Review and Survey T h i s chapter w i l l review some of the terminology and important background m a t e r i a l r e l a t e d t o p a r t i t i o n i n g and c o v e r i n g problems. I t i s intended to be a b r i e f .survey of some of the important d e t a i l s i n t h i s f i e l d , and as a r e s u l t not a l l of the m a t e r i a l i s d i r e c t l y r e l e v a n t to the new r e s u l t s p r e s e n t e d i n t h i s t h e s i s . In p a r t i c u l a r , s e c t i o n s 2.4.2 through 2.5 can be omitted without h i n d e r i n g the re a d e r ' s understanding of the remainder of the t h e s i s . G e n e r a l i z e d branch-and-bound and enumerative a l g o r i t h m s a r e d i s c u s s e d i n S e c t i o n 2.3 because a d i s t i n c t i o n between the two i s important i n the next c h a p t e r . L i n e a r programming i s d i s c u s s e d b r i e f l y because of i t s t r a d i t i o n a l r o l e i n s o l v i n g i n t e g e r programming problems. A b r i e f survey of SPP and SCP a l g o r i t h m s i s given i n S e c t i o n 2.7. Two of these a l g o r i t h m s a re examined f u r t h e r because of t h e i r s i m i l a r i t y t o the new a l g o r i t h m s i n t r o d u c e d i n Chapter 3. Sin c e the columns of the c o n s t r a i n t m a t r i x a re b i n a r y v e c t o r s , o p e r a t i o n s on the columns can be expressed i n e i t h e r set or l o g i c a l n o t a t i o n . For i n s t a n c e , the union of two columns AjU A k i s e q u i v a l e n t to the l o g i c a l 'or' of two b i n a r y v e c t o r s , 11 AjV A k. To a v o i d c o n f u s i o n t h i s n o t a t i o n a l e q u i v a l e n c e should be kept i n mind. 2.1 NP-Complete Problems Set p a r t i t i o n i n g and set c o v e r i n g problems, l i k e almost a l l i n t e g e r programming problems, are NP-complete problems. NP-complete means that the problem complexity i s not p o l y n o m i a l bound wi t h r e s p e c t to the s i z e of the problem. T h i s i s i n t u i t i v e l y obvious s i n c e a SPP with n v a r i a b l e s has 2 n p o s s i b l e p o i n t s , or s o l u t i o n s , i n i t s s o l u t i o n space. In the term i n o l o g y of l i n e a r programming a f e a s i b l e s o l u t i o n i s a s o l u t i o n f o r which a l l the c o n s t r a i n t s are s a t i s f i e d . An o p t i m a l s o l u t i o n i s the f e a s i b l e s o l u t i o n with the best v a l u e of the o b j e c t i v e f u n c t i o n . Being NP-complete i m p l i e s t h a t these types of problems have no simple a n a l y t i c a l s o l u t i o n . Rather, the s o l u t i o n must be based on some type of search that i m p l i c i t l y examines a l l 2n p o i n t s i n the space. An e f f i c i e n t a l g o r i t h m w i l l , i n g e n e r a l , e x p l i c i t l y examine the fewest p o i n t s . The a l g o r i t h m s which w i l l be of i n t e r e s t t o t h i s d i s c u s s i o n are branch-and-bound, and i m p l i c i t enumeration a l g o r i t h m s . S p i e l b e r g ' [31] p o i n t s out that most authors make no d i s t i n c t i o n between branch-and-bound and enumerative a l g o r i t h m s . In the l i t e r a t u r e most a l g o r i t h m s c a l l e d i m p l i c i t enumeration a l g o r i t h m s are i n f a c t branch-and-bound a l g o r i t h m s . S p i e l b e r g draws a d i s t i n c t i o n , and t h i s work w i l l do l i k e w i s e . 12 2 • 2 The Search Tree A convenient way to represent a search a l g o r i t h m i s with a. search t r e e . A search t r e e i s a map showing the s e q u e n t i a l path of the search through the problem. The t r e e i s composed of nodes, each node r e p r e s e n t i n g a t r i a l v e c t o r i n the s o l u t i o n space. Nodes can a l t e r n a t e l y be c a l l e d p a r t i a l s o l u t i o n s , or c a n d i d a t e problems. The root node d e f i n e s the i n i t i a l s t a t e of the search (eg. x={0,...,0}). Except f o r the root node, every node i s the descendant of a s i n g l e node c a l l e d i t s parent. In g e n e r a l , a node can be the the parent of many nodes but f o r b i n a r y problems l i k e SPPs, a node can have at most two descendants. F i g u r e 2.1 i s an example of a small t r e e . The numbers ad j a c e n t to the nodes i n d i c a t e the order i n which the nodes were generated. A p o s i t i v e numbers adj a c e n t t o an edge i s the v a r i a b l e index of a v a r i a b l e f i x e d t o 1 i n the p a r t i a l s o l u t i o n . A n e g a t i v e number i n d i c a t e s a v a r i a b l e f i x e d a t 0. Thus at node 4 the p a r t i a l s o l u t i o n v e c t o r x={x,,x 2,...x„} has x 2=1, x 5=0 and a l l other v a r i a b l e s f r e e . A fathomed node i s a node which has no descendants. T h i s may be because the node d e f i n e s a f e a s i b l e s o l u t i o n , or there are no f e a s i b l e s o l u t i o n s from the node, or i t i s known that t h e r e i s no s o l u t i o n from the node which has a b e t t e r cost than the c u r r e n t o p t i m a l s o l u t i o n . A l i v e node then, d e f i n e s an unfathomed node. F i g u r e 2.1 c o u l d r e p r e s e n t a p a r t i a l t r e e ; nodes 0 and 1 have been e x p l o r e d and nodes 2, 3 and 4 have yet 13 F i g u r e 2.1: A Simple Tree 14 to be fathomed. The t r e e i s complete and the s o l u t i o n found when there are no l i v e nodes (eg. the e n t i r e t r e e has been fathomed). 2.3 Branch-and-Bound and Enumerative A l g o r i t h m s Branch-and-bound (BB) a l g o r i t h m s are a l s o c a l l e d LIFO ( l a s t i n , f i r s t out) or ' s i n g l e branch' methods. LIFO i n d i c a t e s the search s t r a t e g y e x p l o r e s the l a t e s t l i v e node generated. A l l branch-and-bound a l g o r i t h m s are s i m i l a r t o the f o l l o w i n g g e n e r a l i s e d procedure. G e n e r a l i z e d BB A l g o r i t h m Let S denote the p a r t i a l s o l u t i o n v e c t o r , S +={j|xj = 1,jeS}, S" = { j |xj =0, jeS}, and z(S) = Ij S S * C j . x 0 , z 0 are the c u r r e n t best f e a s i b l e s o l u t i o n and a s s o c i a t e d c o s t , r e s p e c t i v e l y . 1. ( I n i t i a l i z e ) z= 0 , S + = {0}, S* = {0}, z 0 = c*>. 2. (Forward step) From the c u r r e n t node choose a v a r i a b l e , k, to add to the p a r t i a l s o l u t i o n . S 4 = S ++{k}, z(S)=z+c k. 3. (Bounding) Obtain a lower bound f o r the c o s t of completion, LBC(S), of the new node. 4 . (Check f o r f e a s i b i l i t y and. a s o l u t i o n ) I f the lower bound c o s t , z(S)+LBC ( S ) , i s not l e s s than z 0 , goto 5. Otherwise, i f S i s a s o l u t i o n goto 6. Otherwise, goto 2. 5. (Backstep) I f S*={0] goto 7. Otherwise, l e t k be the index of the l a s t v a r i a b l e added to S +. S * = S*-{k), S"=S"+{k}, z ( S ) = z - c k , goto 2. , 15 6. (Save s o l u t i o n ) x 0=x(S), z 0 = z ( S ) , goto 5. 7. (Termination) I f z 0 = ex> there i s no s o l u t i o n . Otherwise, x 0 i s the s o l u t i o n v e c t o r and z 0 i s the o p t i m a l c o s t . Stop. N o t i c e that steps 2 and 3 are vague. The forward branch s t r a t e g y and the method f o r c a l c u l a t i n g the lower bound are the c h a r a c t e r i s t i c s which d i s t i n g u i s h one branch-and-bound a l g o r i t h m from another. The enumerative a l g o r i t h m to be i n t r o d u c e d l a t e r , d i f f e r s from a branch-and-bound a l g o r i t h m i n that i t generates both s u c c e s s o r s to the c u r r e n t node, then p l a c e s both on the c a n d i d a t e l i s t . The new nodes are d e f i n e d by S,=So +{k}, and s 2 =SS+{k}, where the 0 s u b s c r i p t r e f e r s to the parent node, and 1 and 2 r e f e r to the descendants. On a forward step the c u r r e n t node i s taken to be the node on the c a n d i d a t e l i s t w ith the best lower bound e s t i m a t e . P r o v i d i n g the bounding a l g o r i t h m s a t i s f i e s a few c o n d i t i o n s , t h i s b r a n c h i n g s t r a t e g y i s guaranteed to examine fewer nodes than any other branching s t r a t e g y . The a l g o r i t h m i s p e n a l i z e d by a more c o m p l i c a t e d bookkeeping procedure that i s r e g u i r e d to i n s u r e t h a t the best node i s always a v a i l a b l e from the c a n d i d a t e l i s t . In c o n t r a s t , the l a s t node p l a c e d on the c a n d i d a t e l i s t by a BB a l g o r i t h m i s always the next node to be e x p l o r e d . In g e n e r a l , branch-and-bound a l g o r i t h m s are e a s i e r to program and debug. However, as problem c o m p l e x i t y grows BB a l g o r i t h m s are more s u s c e p t i b l e to the e x p o n e n t i a l nature of the problem. Enumerative a l g o r i t h m s tend to be 'smarter' but as 16 S p i e l b e r g [31] s t a t e s : "The t r e n d has always been to s t a y w i t h s i m p l e r schemes (simple i n terms of s t r u c t u r e ) and the burden of j u s t i f i c a t i o n f o r more complex s t r u c t u r e s has t o be borne by the o r i g i n a t o r of such a s t r u c t u r e . " With t h i s i n mind a new i m p l i c i t enumeration a l g o r i t h m and a new BB a l g o r i t h m w i l l be p r e s e n t e d i n Chapter 3. 2.4 L i n e a r Programming I t i s u s e f u l at t h i s p o i n t to d i s c u s s l i n e a r programming and i t s r e l a t i o n s h i p to SPPs. T h i s w i l l h e l p c l a r i f y some p o i n t s made i n the survey of S e c t i o n 2.7. A l s o , r e f e r e n c e s t o l i n e a r programming w i l l occur f r e q u e n t l y throughout the r e s t of t h i s work. But f i r s t a few more d e f i n i t i o n s w i l l be i n t r o d u c e d . A cover J ' i s s a i d to be redundant i f t h e r e i s a column j ' such t h a t J'-{j'} i s a l s o a c o v e r . A prime cover has no redundant columns. The standard form of a c o n s t r a i n t e q u a t i o n has a l l the c o n s t r a i n t s expressed as e q u a t i o n s , except Xj > 0. The c a n o n i c a l form has a l l i n e q u a l i t y c o n s t r a i n t s . A problem i n c a n o n i c a l form can always be expressed i n s t a n d a r d form by the a d d i t i o n of a r t i f i c i a l v a r i a b l e s , c a l l e d s l a c k or s u r p l u s v a r i a b l e s . 2.4.1 B a s i c S o l u t i o n s C o n s i d e r a l i n e a r problem i n standard form and suppose the columns of A are p a r t i t i o n e d i n t o two such that A=(B,N). Then Ax=b can be w r i t t e n as: 17 Bx B+Nx N=b (1) The m*m B matrix i s c a l l e d the b a s i s m a t r i x and x B i s the v e c t o r of b a s i c v a r i a b l e s . N i s m*(n-m) mat r i x , and x H i s the v e c t o r of a s s o c i a t e d non-basic v a r i a b l e s * B i s a n o n s i n g u l a r (det^O) m a t r i x . T h e r e f o r e , we can s o l v e f o r x B i n (1) by m u l t i p l y i n g by B" 1. x B=B"'b-B" 1Nx N (2) The b a s i c s o l u t i o n t o (2) i s found by s e t t i n g x =0. x B=B~ 1b, x„ =0 (3) If x B > 0, (3) d e f i n e s a b a s i c f e a s i b l e s o l u t i o n of ( 1 ) . I f any of the b a s i c v a r i a b l e s are equal t o z e r o the s o l u t i o n i s s a i d t o be degenerate. There are a maximum of (£,) bases, one of which d e f i n e s the o p t i m a l s o l u t i o n . T h i s i s one of the fundamental theorems i n l i n e a r programming; the optimal s o l u t i o n to a l i n e a r problem i s a b a s i c o p t i m a l s o l u t i o n . Thus, the s o l u t i o n of a l i n e a r problem r e q u i r e s f i n d i n g the one b a s i s , of many, which o p t i m i z e s the o b j e c t i v e f u n c t i o n . Standard methods f o r s o l v i n g the l i n e a r problem are based on the simplex a l g o r i t h m . The simplex a l g o r i t h m i s a procedure f o r f i n d i n g the op t i m a l s o l u t i o n s t a r t i n g from an i n i t i a l b a s i c f e a s i b l e s o l u t i o n . During each i t e r a t i o n of the a l g o r i t h m a v a r i a b l e e n t e r s and a v a r i a b l e l e a v e s the b a s i s , such that the o b j e c t i v e f u n c t i o n i s improved. The simplex a l g o r i t h m , however, 18 has d i f f i c u l t i e s with degenerate s o l u t i o n s . In the presence of degenerate s o l u t i o n s i t i s p o s s i b l e that there w i l l be a sequence of degenerate simplex i t e r a t i o n s t h a t d e f i n e a c y c l e . That i s , the simplex a l g o r i t h m w i l l c y c l e i n d e f i n i t e l y through the same set of degenerate s o l u t i o n s . T h i s i s known as c y c l i n g . I t i s known from a wealth of e m p i r i c a l data t h a t c y c l i n g does not occur i n the s o l u t i o n of a p p l i e d l i n e a r problems. However, t h i s i s not the case f o r the r e l a x e d i n t e g e r problem, so p r e c a u t i o n s must be taken to prevent c y c l i n g i n these c a s e s . For h i g h l y degenerate problems such as SCPs and SPPs, o b t a i n i n g a l i n e a r s o l u t i o n i s that much more d i f f i c u l t due to the extreme degeneracy and the necessary p r e c a u t i o n s r e q u i r e d to guard a g a i n s t c y c l i n g . 2.4.2 I n v o l u t o r y Bases Consider a SCP. Then: Theorem 2 I f J'={j|xj=l} i s a prime cover, x' i s an extreme p o i n t i n the space of f e a s i b l e s o l u t i o n s of the SCP. The proof i s simple. Each column s p e c i f i e d by J ' must s a t i s f y at l e a s t one c o n s t r a i n t as an e q u a l i t y . T h e r e f o r e the columns of J ' are l i n e a r l y independent and x' i s an extreme po i n t . Bellmore and R a t l i f f [6] prove t h a t the s e t c o v e r i n g problem has b a s i c optimal s o l u t i o n s with i n v o l u t o r y b a s i s . An i n v o l u t o r y b a s i s i s one f o r which B~ 1=B. Thus i n (3) i t i s 19 unnecessary to i n v e r t the b a s i s m a t r i x , normally a c o m p u t a t i o n a l l y expensive e x e r c i s e . Given a f e a s i b l e s o l u t i o n , J ' , the b a s i s matrix obtained by a s u i t a b l e interchange of the rows and columns i s of the form: 1 , 0 P "12 where I 2 i s the i d e n t i t y matrix a s s o c i a t e d with the s u r p l u s v a r i a b l e s i n the s o l u t i o n . Note t h a t B-B = 1 . 2.4.3 Extreme P o i n t P r o p e r t i e s of the SPP As with the SCP, any s o l u t i o n t o the p a r t i t i o n i n g problem i s an extreme p o i n t i n the space of f e a s i b l e s o l u t i o n s to the SPP. The next theorem o f f e r s some i n s i g h t i n t o the r e l a t i o n s h i p of a d j a c e n t s o l u t i o n s i n the SPP. Adjacent s o l u t i o n s d i f f e r by e x c a c t l y one column. Theorem 3 [4] Let x, be an i n t e g e r non-optimal s o l u t i o n t o an SPP and l e t B, be the a s s o c i a t e d b a s i s . I f x 2 i s the optimal s o l u t i o n to the SPP with the b a s i s B 2 there e x i s t s a sequence of ad j a c e n t bases B 1 0,B n,. ..,B 1 p such t h a t B ^ B , and B 1 p=B 2. Furthermore: ( i ) the i n t e r m e d i a t e s o l u t i o n s d e f i n e d by x u=B^-e, i=0,1,...,p are a l l f e a s i b l e and i n t e g e r . ( i i ) cx 1 0 > cx „ > . . . > cx 1 p . B = 20 ( i i i ) p=|JiflQ 2|, where J variables associated An algorithm outlined in solve the SPP. ! i s the index set of non-basic with B,, and -Q2 = {J€N|xj-j = 1}. section 2.7 uses th i s property to 2.5 Reductions It i s often possible to eliminate columns and rows from the constraint matrix before problem solution begins. In some cases t h i s can s i g n i f i c a n t l y reduce the time required to solve the problem (especially with problems containing s p e c i a l s t r u c t u r e ) . The following l i s t describes the possible reductions and to which problems they apply. Reduction 1 (SCP,SPP) If there e x i s t s a row, rj , which i s a n u l l vector, there i s no feasible s o l u t i o n . This reduction i s obvious since row i cannot be covered by any s o l u t i o n . Reduction 2 (SCP,SPP) If for some i and k, r^=e K, where e K i s the kth unit vector, then xk=1 must be included in any fea s i b l e solution, and AK may be deleted. Also any row t, with teA k, must also be deleted. A^ i s the only column which can cover r . The inclusion of 21 x„ i n t he s o l u t i o n c o v e r s a l l rows i n A K , so t h e s e rows must be removed f rom the s o l u t i o n . R e d u c t i o n 2a (SPP) F o l l o w i n g r e d u c t i o n 2 a l l co lumns w i t h a i f = a i < ^ = 1 r o r some t must be d e l e t e d . T h i s f o l l o w s s i n c e i f co lumn k has been i n c l u d e d i n the s o l u t i o n , t hen any co lumn s a t i s f y i n g 2a wou ld ' o v e r c o v e r ' a row ( e g . ljn ajj Xj > 1) i f i n c l u d e d i n t he s o l u t i o n . R e d u c t i o n 3 (SCP) (SPP ) I f t h e r e e x i s t s a row r t such t h a t r t £ r p f o r some p, then r t may be d e l e t e d . R e d u c t i o n 3 f o l l o w s f rom the f a c t t h a t a c o v e r of row p w i l l c o v e r row t . R e d u c t i o n 3a (SPP) F o l l o w i n g r e d u c t i o n 3, a l l co lumns k such t h a t a t k=1 and ap<=0 must be d e l e t e d . T h i s f o l l o w s because some co lumn q w i t h a, =a =1 must be s e l e c t e d t o c o v e r row p.. Thus the s e l e c t i o n o f co lumn k would o v e r c o v e r row t . R e d u c t i o n 4 ( SCP) (SPP ) I f f o r a s u b s e t of co lumns S, and a co lumn k, Uj f e S A,- =Aj. and I|£S Cj ^ c k , co lumn k may be d e l e t e d . I f co lumn k i s i n c l u d e d i n any s o l u t i o n then s i n c e the co lumns i n S c o v e r the same rows w i t h l e s s c o s t , a b e t t e r s o l u t i o n can be o b t a i n e d by r e p l a c i n g x K =1 w i t h X j = 1 , j e S . 22 Redution 4a (SCP) I f f o r a subset of columns S and a column k, U[ 6 5Aj 2 kK and IjesCj ^ cK, column k may be d e l e t e d . The argument f o r 4a i s s i m i l a r t o that of 4, except that we r e l a x the requirement Uj 6sAj =Ak to permit o v e r c o v e r i n g . To i l l u s t r a t e the above r e d u c t i o n s c o n s i d e r the f o l l o w i n g problem. c o s t = ( 3 4 8 7 9 9 8 7 ) row 0 0 1 1 0 0 0 0 1 0 0 1 0 1 0 0 1 2 A= 0 1 0 0 0 1 1 0 3 1 0 0 0 0 1 0 0 4 0 0 0 1 0 0 0 0 5 0 0 0 0 1 0 0 1 6 0 0 0 0 1 0 1 0 7 column- 1 2 3 4 5 6 7 8 The SPP i s reduced as f o l l o w s . (2) r 5 = e „ , d e l e t e r , , r 5 ; x«='1 . (2a) D e l e t e A 3, x 3 = 0. (3) r 2 > r 6 , d e l e t e r 2 . , (4) A 1UA 2= A 6, c 1+c 2< c 6 , d e l e t e A 6; x 6=0. (2) r,=e,, d e l e t e r„, x , = 1 . Which le a v e s the reduced problem: 23 c o s t = ( 4 9 8 7 ) row 3 6 7 column- 2 5 7 8 C l e a r l y the s o l u t i o n i s : x 1=x 2=x 4=x 5=1, z 0=23. An i n t e r e s t i n g way of l o o k i n g at the p a r t i t i o n i n g problem i s made p o s s i b l e by extending the idea i n r e d u c t i o n 4. Reduction 4b (SCP)(SPP) I f f o r two, not n e c e s s a r i l y e x c l u s i v e subsets of columns, S 1 and S 2, UjSS'A =UjSS*A and Ejes'Cj ^ £ j « l C j > then the columns of S 2 not i n S 1 may be d e l e t e d . Now suppose we l e t S 1 be the index set of the op t i m a l s o l u t i o n . Then r e d u c t i o n 4b would remove a l l columns i n sub-optimal f e a s i b l e s o l u t i o n s . A r e d u c t i o n of t h i s type would p r a c t i c a l l y s o l v e the problem. But of course, to o b t a i n the index set S 1 one must s o l v e the problem. 2.6 The Set Covering Greedy H e u r i s t i c The s et c o v e r i n g problem i s a NP-complete problem so h e u r i s t i c techniques are important f o r the sub-optimal s o l u t i o n of very l a r g e problems. The Greedy H e u r i s t i c i s a simple h e u r i s t i c d e s c r i b e d by C h v a t a l [ 7 ] , I t i s presented as p a r t of the review, and because i s i s r e l a t e d to a SCP a l g o r i t h m t o be i n t r o d u c e d i n Chapter 3. A = 1 0 1 0 0 1 0 1 0 1 1 0 24 Let Pj be the number of rows covered _ by Aj , Pj = |AjAT|, where T i s the v e c t o r of rows remaining to be covered i n the p a r t i a l s o l u t i o n . Let CPRj =Cj/Pj . SCP Greedy H e u r i s t i c 1. ( I n i t i a l i z a t i o n ) Set S+={0}, T=e, z=0 and goto 2. 2. (Choose next v a r i a b l e ) C a l c u l a t e CPRj, j=1,...,n. F i n d the column, k, with the minimum CPRj. Goto 3. 3. (Forward step) S +=S ++{k}, T=T-(A KAT) and z=z+c k. If T={0} goto 4. Otherwise, goto 2. 4. (Termination) S + i s a cover and z i s the c o s t . Stop. 2.7 A Survey of E x i s t i n g A l g o r i t h m s C u t t i n g Planes S a l k i n and Koncal d e s c r i b e an a l l i n t e g e r a l g o r i t h m [27,28,29] that generates s t r o n g c u t t i n g planes f o r the s o l u t i o n of the c o v e r i n g problem. When a p i v o t on an. element other than one i s encountered d u r i n g a dual simplex i t e r a t i o n , a Gomory i n t e g e r c ut [15] i s added and the a l g o r i t h m p i v o t s on a u n i t c o e f f i c i e n t i n that row. T h i s i n s u r e s an i n t e g e r t a b l e a u and an a l l i n t e g e r s o l u t i o n . Computational r e s u l t s a re given i n [29] fo r both c o v e r i n g and p a r t i t i o n i n g problems. 25 Branch and L i n e a r Programming Bound Lemke, S a l k i n and S p i e l b e r g [19] d e s c r i b e a branch-and-bound a l g o r i t h m which uses l i n e a r programming to compute a lower bound, and the SCP Greedy H e u r i s t i c to obtain an i n t e g e r s o l u t i o n from the l i n e a r s o l u t i o n . Computational r e s u l t s f o r c o v e r i n g problems are g i v e n . Subqradient O p t i m i z a t i o n Rather than s o l v e the l i n e a r problem to o b t a i n a lower bound f o r each p a r t i a l s o l u t i o n , E t c h e b e r r y [10,11] used an a l t e r n a t e method. In a branch-and-bound a l g o r i t h m designed to s o l v e the set c o v e r i n g problem, an i t e r a t i v e approach, c a l l e d s u b g r a d i e n t o p t i m i z a t i o n [17,18], was used to c a l c u l a t e an approximate s o l u t i o n t o the l i n e a r problem. Subgradient o p t i m i z a t i o n a s s o c i a t e s a non-negative Lagrangian m u l t i p l i e r , u , w i t h every c o n s t r a i n t i n the c a n d i d a t e problem. The bounding problem i s : L(u) = m i n X j { l j f i T c J - X j + I U R U f ( l - I j ^ a j j - X j ) } (4) where u i s the v e c t o r of Lagrangian m u l t i p l i e r s , R i s the set of rows remaining to be covered by the candidate problem, and J i s the s et of f r e e v a r i a b l e s at the p a r t i a l s o l u t i o n . For a l l non-negative v e c t o r s u, the s o l u t i o n to (4) s a t i s f i e s L(u) < z*, where z* i s the c o s t of the o p timal l i n e a r s o l u t i o n . The i t e r a t i v e procedure begins with a i n i t i a l v e c t o r u 1 . A f t e r s o l v i n g (4), u i s updated by u B * 1 = u'+ t-v', where 26 = I i e R uj,-(1-EjeR ajj-Xj ), 1 i s the i t e r a t i o n counter, and t i s a constant p r o p o r t i o n a l to L(u)-w. w i s an estimated value of z*. L(u) i s guaranteed to converge to z* a s y m p t o t i c a l l y and i n p r a c t i c e good est i m a t e s are achieved i n about 30 i t e r a t i o n s . E t c h e b e r r y c l a i m s that t h i s approach i s c o m p u t a t i o n a l l y more e f f i c i e n t than s i m i l a r l i n e a r programming based methods. However, he does not e x p l i c i t l y compare computational r e s u l t s with other approaches, so i t i s d i f f i c u l t to make an ac c u r a t e comparison. In S e c t i o n 3.6.2 a l i m i t e d comparison i s made between E t c h e b e r r y ' s r e s u l t s and some of the r e s u l t s stemming from t h i s t h e s i s . Adjacent Vertex Path A l g o r i t h m f o r the SPP Balas and Padberg [3] d e s c r i b e how the adjacency p r o p e r t i e s of the SPP (Theorem 3) can be used to c o n t r o l p i v o t i n g of the a s s o c i a t e d l i n e a r problem. However, there has been no computational measure of the e f f e c t i v e n e s s of t h i s approach. Due to massive degeneracy i n the SPP, and c y c l i n g problems, t h i s approach would seem to be c o m p u t a t i o n a l l y expensive. BB Based on the C o m b i n a t o r i a l P r o p e r t i e s of the SPP There are three a l g o r i t h m s which w i l l be i n c l u d e d i n t h i s c l a s s i f i c a t i o n : (1) G a r f i n k e l and Nemhauser [12], (2) P i e r c e [23], P i e r c e and Lasky [24], and (3) Marsten [20]. These three a l g o r i t h m s take advantage of the e x c l u s i v e nature of the problem c o n s t r a i n t s to d e f i n e b l o c k s of columns a c c o r d i n g to the rows 27 they have i n common. So that no row w i l l be overcovered, a t most one column from each block can be i n c l u d e d i n any s o l u t i o n . T h i s r e s u l t s i n a simple but powerful procedure f o r e l i m i n a t i n g columns from the p a r t i a l s o l u t i o n s . In g e n e r a l , the computational r e s u l t s with these a l g o r i t h m s have been s u p e r i o r to other SPP a l g o r i t h m s . The f i r s t two a l g o r i t h m s a r e very s i m i l a r w h ile Marsten's a l g o r i t h m d i f f e r s s i g n i f i c a n t l y i n i t s branching and bounding s t r a t e g y . Only l i m i t e d comparisons have been made on the three a l g o r i t h m s . Marsten's a l g o r i t h m seems to perform b e t t e r on low d e n s i t y problems, while the other two perform b e t t e r on h i g h e r d e n s i t y problems ( d e n s i t y r e f e r s to the r a t i o of the number of M's i n the c o n s t r a i n t matrix t o the t o t a l s i z e of the m a t r i x ) . Because the a l g o r i t h m s of G a r f i n k l e and Nemhauser, and P i e r c e and Lasky w i l l be compared to the a l g o r i t h m s to be presented i n Chapter 3, they a r e d i s c u s s e d i n some d e t a i l i n the next two s e c t i o n s . 2.8 The G a r f i n k l e and Nemhauser A l g o r i t h m [12] Begin by r e o r d e r i n g the columns i n t o ' b l o c k s ' as shown i n F i g u r e 2.2. In other words, the columns of A are p a r t i t i o n e d i n t o b l o c k s B r, r=1,...,m such that f o r a l l kc B r, ark= 1, and arll=0 f o r keUi > rBi . N o t i c e that some b l o c k s may be empty. Wi t h i n a column the b l o c k s are ordered a c c o r d i n g to i n c r e a s i n g c o s t . Let S be the index set of the p a r t i a l s o l u t i o n and S + the subset of v a r i a b l e s f i x e d at 1. A l s o , l e t T be the v e c t o r of B B- B, B row 1 2 3 m 111... 1 0 111 ... 1 1 11 ... 1 0 or 1 .... F i g u r e 2.2: C o n s t r a i n t s i n block format 29 rows remaining t o be covered, l e t x 0 be the c u r r e n t best , f e a s i b l e s o l u t i o n and z 0 , the a s s o c i a t e d c o s t . A l g o r i thm 1. ( I n i t i a l i z e ) E s t a b l i s h the b l o c k s , Bj , 1=1,...,m. S+={0}, T=e, and z 0 = c o . Goto 2. 2. ( S e l e c t next b l o c k ) Let r=min{i|ieT}. Set a p o i n t e r t o the f i r s t column i n B r. Goto 3. 3. ( F i n d a f e a s i b l e forward step) Examine, beginning at the p o i n t e r , the columns i n block B r . I f a column k i s found such t h a t ai K = 0 f o r a l l i£T, and z+c K<z 0, then update the p o i n t e r t o k+1 and goto 4. I f there i s no such column goto 5. 4. (Forward Step) Let S +=S ++{k}, z=z+c k, T=T-A K. I f T={0} (eg. a p a r t i t i o n s o l u t i o n ) , z 0=z, x 0 ( j ) = {1 | jeS + } , goto 5. Otherwise, goto 2. 5. (Backstep) I f S +={0] (eg. a l l columns of block 1 have been examined) goto 6. Otherwise remove the l a s t index, say k, from S +; S +=S +-{k}. Set the p o i n t e r to k+1 and goto 3. 6. (Termination) I f z 0 = ° o there i s no f e a s i b l e s o l u t i o n . Otherwise, x 0 , z 0 d e f i n e the op t i m a l s o l u t i o n . Stop. 30 2.9 The P i e r c e and Lasky A l g o r i t h m G a r f i n k l e and Nemhauser a r b i t r a r i l y chose the f i r s t row to d e f i n e the f i r s t b l o ck ( F i g u r e 2.2), row 2 to d e f i n e the second b l o c k , and so on. A more e f f i c i e n t o r d e r i n g i s p o s s i b l e . Begin by o r d e r i n g the columns i n ascending order of the c o s t to performance r a t i o (CPR), where the performance i s d e f i n e d as the number of rows a column c o v e r s . That i s , r e o r d e r the columns such t h a t CPR(j) < CPR(J + 1), j=1,...,n-1; where CPR(j)= C j / P ( j ) , P( j)=|Aj 1=1-?! a.j. A s s i g n A, to block 1. I f A,AAa#0 add A 2 to b l o c k 1. Otherwise, a s s i g n A 2 to block 2. Assuming that A, i s i n B, and A 2 i s i n B 2, A 3 i s p o s i t i o n e d by f i r s t checking A , A A 3 ^ 0 . I f t h i s i s t r u e add A 3 to B,. Otherwise, i f A 2 A A 3 T F O p l a c e A 3 i n B 2, e l s e p l a c e A 3 i n B 3. Now i f A, and A 2 were both i n B,, then i f A,AA 2AA 3#0 add A 3 to B,, e l s e put A 3 i n B 2. T h i s procedure c o n t i n u e s u n t i l a l l columns have been a s s i g n e d to b l o c k s . Table 2.2 shows the problem i n T a b l e 2.1 ordered by the p r e v i o u s r u l e s . T h i s o r d e r i n g i s more e f f e c t i v e than an a r b i t r a r y o r d e r i n g because i t p l a c e s the ' b e t t e r ' columns c l o s e r t o the beginning of the s e a r c h . Here b e t t e r r e f e r s to the f a c t t h a t a column with a lower CPR has a g r e a t e r p r o b a b i l i t y of being i n the o p t i m a l s o l u t i o n . The improved o r d e r i n g a l s o makes a bounding t e s t p o s s i b l e . N o t i c e t h a t the G a r f i n k l e and Nemhauser a l g o r i t h m had no Cost = ( 3 4 5 6 7 9 1 1 1 3 1 5 1 8 ) 0 0 0 0 0 1 0 1 1 1 0 1 0 0 0 0 1 1 1 1 1 0 0 0 1 0 0 1 0 1 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 Table 2.1: An example SPP A = column - 1 3 7 13 18 4 15 1 1 9 5 6 0 0 1 1 0 1 0 1 0 0 0 0 1 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 . 1 0 0 1 0 0 0 1 0 0 0 1 1 0 1 2 3 4 5 6 7 8 9 10 Table 2.2: Example SPP formatted i n t o b l o c k s 32 bounding t e s t except f o r z + c k ^ z 0 . P i e r c e [22] added the simple t e s t z+ms.v ^ z 0 , where ms i s the number of rows remaining to be covered at stage S, and u =minj,<sCPR( j ) . P i e r c e and La sky-went f u r t h e r , i n t r o d u c i n g a dominance t e s t based on s o l v i n g the r e l a x e d knapsack problem d e f i n e d by adding up a l l the problem c o n s t r a i n t s . Suppose that at stage S the v a l u e s x, ,x 2, . .. ,xs_, have been a s s i g n e d . We wish to c a l c u l a t e LBC(S), a lower bound on the c o s t of completing the problem. The knapsack problem at stage S i s : min F.jl5 q-yj = LBC(S) E£sPj-yj = ms (5) Yj = 0,1, j =S,S+1,...,n. De f i n e h k= max{Pj | j e B K}, C k = min {CPR( j ) | j eBK }, and 6 K=h k-C K. Then a convex f u n c t i o n j3(u) can be d e f i n e d as shown in F i g u r e 2.3, where n-t = Ejj. sh R and Vi=E| c. s6 l ( >; i ^ q , where q i s the number of rows to be covered by the p a r t i a l s o l u t i o n . To compute a lower bound on the r e l a x e d v e r s i o n of ( 5 ) (eg. 0 ^ yj ^ 1), one simply c a l c u l a t e s (3(ms ) . That {3(ms) i s a v a l i d lower bound to ( 5 ) f o l l o w s from 6 , ^ 6 2 ^ . • . ^ 6cj, and f o r a f e a s i b l e s o l u t i o n to ( 5 ) , I j € B | p SPj-yj ^ n k , f o r a l l k<q. F i g u r e 2.4 g i v e s (3(u) f o r the i n i t i a l problem i n Table 2.2. 1 1 1 n 1 n 2 n 3 . . . F i g u r e 2.3: Convex F u n c t i o n (3(u) u F i g u r e 2.4: |3o(u) f ° r the problem in T a b l e 2.2 34 A l g o r i t h m 1. ( I n i t i a l i z e ) Group the elements i n t o b l o c k s as d e s c r i b e d p r e v i o u s l y . Determine the v a l u e s of h k and 6K, k=1,...,m. Let b{ , 1=1.,...,m, be the row(s) common to Bj . S*={0}, T=e, and z 0=oo. 2. ( S e l e c t next block) Let r=min{11b„A Tj = 0}. I f there i s no such block, goto 5. Otherwise, p o i n t t o the f i r s t column i n B r and goto 3. 3. ( F i n d a f e a s i b l e forward step) Examine, beginning at the p o i n t e r , the columns remaining i n B r . I f no column, k, can be found such t h a t a ^ =0 f o r a l l i ^ T , goto 5. Otherwise, update the p o i n t e r t o k+1 and c a l c u l a t e (3(m s), ms=|T-AK|. If z+c k+ (3(ms )<z 0, goto 4. Otherwise, goto 3. 4. (Forward step) L e t S + =S + + {k}, z=z+c k , T=T-A K. I f T={0} (eg. S i s a p a r t i t i o n s o l u t i o n ) , z 0 = z, x 0 ( j ) = {1|jeS*}, S +=S +-{k}, goto 5. Otherwise, goto 2. 5. (Backstep) I f S+={0} goto 6. Otherwise, remove the l a s t index, say k, from S +; S +=S +-{k}. Set the p o i n t e r to k+1 and goto 3. 6. (Termination) I f z 0=cothere i s no f e a s i b l e s o l u t i o n . Otherwise, x 0 , z 0 d e f i n e the op t i m a l s o l u t i o n . Stop. F i g u r e 2.5 i l l u s t r a t e s the s o l u t i o n t r e e f o r the problem i n Table 2.2. Each numbered node r e p r e s e n t s a forward s t e p ( s t e p 4) i n the a l g o r i t h m . The value of the lower bound estimate of Step 3, z+c k + (3(m5), i s shown to the l e f t of the node. T h i s value has been rounded up to the nearest i n t e g e r v a l u e . The 35 number to the r i g h t of the node i n d i c a t e s the order i n which the nodes were generated. The numbers adjacent to the edges denote the v a r i a b l e s f i x e d at 0 or 1 by a forward or backward s t e p . The o p t i m a l s o l u t i o n i s x 0=(2,5,8), z o=20. P i e r c e and Lasky were the f i r s t t o make use of another important improvement (see [4] f o r o t h e r s ) based on reduced c o s t s . I f the rows of a SPP are m u l t i p l i e d by a r b i t r a r y c o n s t a n t s , b , summed, and the r e s u l t s s u b t r a c t e d from the c o s t v e c t o r , the r e s u l t a n t problem u s i n g the 'reduced' c o s t s has the same o p t i m a l s o l u t i o n as the o r i g i n a l problem. That i s the problem: min z 0 = E]l1 Cj x Ax=e (5) x =0,1; j=1,...,n where c' =Cj-1^ bj-aij , has the same s o l u t i o n as the o r i g i n a l problem, and z 0= z0-EjI'i b[ . Although the c o n s t a n t s can be s e l e c t e d a r b i t r a r i l y , they they should be chosen so that (5) i s the e a s i e s t problem to s o l v e . To t h i s end we r e s t r i c t our a t t e n t i o n to those problems with c'^ 0. Thus, a lower bound on z 0 i s 0, and on z 0 , E ^ b^ . T h i s c h o i c e tends to e l i m i n a t e the f i x e d c o s t i n the c o s t c o e f f i c i e n t s and enhances the r e l a t i v e c o s t d i f f e r e n c e between c o e f f i c i e n t s . One can now argue that the best s e l e c t i o n of b i s the c h o i c e t h a t maximizes the amount of r e d u c t i o n , E£t b^ . Or i n 36 F i g u r e 2.5: Search t r e e f o r problem i n Table 2.2 37 other words, the lower bound on z 0 , E(?t b^ , i s maximized. Thus a good s e l e c t i o n of the c o n s t a n t s i s given by the s o l u t i o n of the l i n e a r problem: max ll?! b t bA < c b-v £ 0; i=1,...,m. T h i s i s the dual l i n e a r problem of the o r i g i n a l set p a r t i t i o n i n g problem. I n t u i t i v e l y one can see why the above c o s t r e d u c t i o n s would be e f f e c t i v e . The columns i n c l u d e d i n the b a s i s of the optimal s o l u t i o n w i l l e xperience the g r e a t e s t r e l a t i v e c o s t r e d u c t i o n . The new o r d e r i n g of columns w i l l i n s u r e that the BB search w i l l f i r s t examine those columns i n the l i n e a r s o l u t i o n . T h i s i s c e r t a i n l y d e s i r a b l e s i n c e a column i n the l i n e a r s o l u t i o n has a good p r o b a b i l i t y of being i n the SPP s o l u t i o n . Another advantage of s o l v i n g the reduced problem i s t h a t i f the l i n e a r s o l u t i o n to the problem i s an i n t e g e r s o l u t i o n , then the SPP has been s o l v e d . Since the l i n e a r SPP o f t e n has n a t u r a l i n t e g e r s o l u t i o n s one o f t e n does not have to proceed beyond s o l v i n g the l i n e a r problem. 38 CHAPTER 3 New A l g o r i t h m s f o r the Set P a r t i t i o n i n g Problem T h i s chapter w i l l i n t r o d u c e the new a l g o r i t h m s c o n t a i n e d i n t h i s t h e s i s . A l l m a t e r i a l i n t h i s chapter r e p r e s e n t s o r i g i n a l r e s u l t s u n l e s s otherwise r e f e r e n c e d . Two new a l g o r i t h m s f o r the s o l u t i o n of set p a r t i t i o n i n g problems are p r e s e n t e d . One i s a branch-and-bound type a l g o r i t h m and the other i s an enumerative type a l g o r i t h m . An enumerative a l g o r i t h m f o r the set c o v e r i n g problem i s a l s o p r e s e n t e d . S e c t i o n 3.1 w i l l d e s c r i b e the bounding procedure used i n these a l g o r i t h m s . .Test r e s u l t s are given towards the end of the cha p t e r , f o l l o w e d by a b r i e f summary of the contents of t h i s c h a p t e r . 3.1 Bounding Alg o r i t h m s An e s s e n t i a l i n g r e d i e n t i n any search a l g o r i t h m i s a strong bounding procedure. I t i s o f t e n the case, as observed by P i e r c e and Lasky, t h a t any time spent c a l c u l a t i n g a stronger lower bound i s time w e l l spent. A good lower bound reduces the search t r e e s u f f i c i e n t l y to j u s t i f y the a d d i t i o n a l time r e q u i r e d to c a l c u l a t e the bound. T r a d i t i o n a l approaches to i n t e g e r 39 programming problems, such as Lemke, et a l . [19], sol v e the l i n e a r problem d e f i n e d by r e l a x i n g the i n t e g e r c o n s t r a i n t s . T h i s method y i e l d s a good lower bound but s o l v i n g the l i n e a r SPP problem i s d i f f i c u l t due to degeneracy and c y c l i n g . Etcheberry [11] c l a i m s to have obtained good r e s u l t s u s i n g subgradient o p t i m i z a t i o n techniques to o b t a i n approximate s o l u t i o n s to the l i n e a r problem. G a r f i n k l e and Nemhauser a v o i d the l i n e a r problem a l t o g e t h e r by only c o n s i d e r i n g the c o m b i n a t o r i a l p r o p e r t i e s of the SPP. The P i e r c e and Lasky a l g o r i t h m i s s i m i l a r , but s u p e r i o r , because i t adds a simple bounding t e s t based on s o l v i n g the l i n e a r knapsack problem d e r i v e d from the r e l a x e d SPP. 3.1.1 Bounding A l g o r i t h m I A new bounding a l g o r i t h m w i l l now be i n t r o d u c e d . Bounding A l g o r i t h m I i s the bounding procedure used by the a l g o r i t h m s i n t r o d u c e d l a t e r i n t h i s c h a p t e r . Thus i t r e p r e s e n t s a major p a r t of t h i s t h e s i s . The e f f i c i e n c y and r e l a t i v e s t r e n g t h of t h i s simple bounding procedure i s l a r g e l y r e s p o n s i b l e f o r the good performance of the new a l g o r i t h m s . The bounding problem i s to determine a lower bound on the c o s t of o b t a i n i n g the best f e a s i b l e s o l u t i o n from the c u r r e n t p a r t i a l s o l u t i o n S. The a l g o r i t h m can be d e s c r i b e d i n words as f o l l o w s . For every row that remains to be covered i n the subproblem, a s s o c i a t e the value of the minimum co s t to 40 performance r a t i o (CPRj =Cj/E- l W aij ) of a column which s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s . I f u' i s the value a s s o c i a t e d with row i , the lower bound c o s t (LBC) i s simply o b t a i n e d by summing the u's. That i s : LBC = Ep, u[ . The formal d e f i n i t i o n f o r the bounding procedure i s given below and the proof that LBC i s a lower bound f o l l o w s . Bounding A l g o r i t h m I 1. ( I n i t i a l i z a t i o n ) R=VjevAj , T=e-R, LBC=0, and l e t k be a p o i n t e r passed by the c a l l i n g r o u t i n e . Order the columns so t h a t CPR( j) < CPR(J+1), j=1,...,n-1. Goto 2. 2. ( F i n d and i n c l u d e the best f e a s i b l e column) F i n d the f i r s t column j , j>k, t h a t s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s , AjAR=0, and covers a row i n T, AjAT/0. I f there i s no such column goto 4. Otherwise, goto 3. 3. (Update and check f o r completion) LBC=LBC+|Aj A T|•CPRj and T =T-(AjAT). I f T={0 ] , goto 4. Otherwise, k=j, goto 2. 4. (Termination) I f T£{0] the subproblem S + i s u n f e a s i b l e , LBC=oo. Otherwise, LBC i s a lower bound f o r the c o s t of completing the subproblem S +. Stop. The p o i n t e r k i n step 1 i s used to exclude from c o n s i d e r a t i o n v a r i a b l e s f i x e d at zero i n the p a r t i a l s o l u t i o n . I t s use w i l l be made c l e a r l a t e r . To prove LBC i s a lower bound i t i s s u f f i c i e n t t o show that 41 LBC i s l e s s than the c o s t of the l i n e a r s o l u t i o n to the same subproblem. I f the l i n e a r subproblem i s : min z 0 = cx Ax > e (1) x > 0 then the d u a l problem i s : max v 0 = | u | = Ei ut uA < c (2) u > 0 where u i s the dual s o l u t i o n v e c t o r . A fundamental theorem i n l i n e a r programming s t a t e s that the p r i m a l problem (1), and the du a l problem (2) have the same optimal c o s t ; z 0=v 0. Thus, i f u' from Bounding A l g o r i t h m I s a t i s f i e s the c o n s t r a i n t s of ( 2 ) , then LBC ^ z 0 and LBC i s a lower bound. S u b s t i t u t i n g i n u' f o r u i n the c o n s t r a i t s of (2) g i v e s : E^ tu{aij = Ei min{CPR 8 |ai5 = 1 , 1=1 , ... ,n}- a;j < Ei CPRj • asj = E L (Cj / 1 Aj ] ).a,j = C j thus E^jU'i -aq ^ C j and the proof i s complete. As an example, c o n s i d e r the problem i n Table 3.1. The l i s t below shows the s e l e c t e d columns and the r e s u l t a n t assignment of v a l u e s . (0) R={0], T=e Cost = ( 3 7 4 1 3 9 1 8 5 1 5 1 1 6 ) CPR = ( 3 3.5 4 4 . 3 3 4.54.5 5 5 5.5 6 ) 0 0 1 0 0 0 0 1 1 0 0 1 1 1 0 1 0 0 0 0 1 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 1 0 1 0 0 1 0 1 0 0 0 0 1 0 column 10 Tabl e 3.1: Example problem formatted i n t o ascending CPRs 43 (1) j=1, u 3 = 3 (2) j=2, uj=3.5 (3) j = 3, u 2=4 (4) j = 4, u',=4.33 (5) j=5, u|=4.5 and LBC = Iiut=19.33. Comparing LBC to the op t i m a l c o s t z o=20, one sees t h a t a good estimate has been o b t a i n e d . The bound o b t a i n e d by the P i e r c e and Lasky a l g o r i t h m f o r the same i n i t i a l problem was 16 ( F i g u r e 2.5). A l g o r i t h m I w i l l always y i e l d a bound at l e a s t as good as p(m s), but u s u a l l y a much b e t t e r bound i s o b t a i n e d . In f a c t , f o r low d e n s i t y problems having many f e a s i b l e s o l u t i o n s ( 3 ( m s ) i s not a s u f f i c i e n t l y s t r o n g bound. The P i e r c e and Lasky a l g o r i t h m has b e t t e r success with h i g h e r d e n s i t y problems where the s p e c i a l c o n s t r a i n t s on the problem reduce the s i z e of the search t r e e t o the p o i n t where any f u r t h e r r e d u c t i o n s due t o the simple bounding f u n c t i o n w i l l be e f f e c t i v e . I t w i l l be shown l a t e r t h a t Bounding A l g o r i t m I i s s t r o n g enough to be e f f e c t i v e a t lower d e n s i t i e s . 3.1.2 Bounding A l g o r i t h m II At t h i s p o i n t an even s t r o n g e r bounding a l g o r i t h m can be i n t r o d u c e d . I t i s s i m i l a r to the p r e v i o u s procedure with the p r i n c i p l e d i f f e r e n c e being t h a t once u\ has been a s s i g n e d t o row i , the CPRs are updated by: 44 CPR ( j ) = (c ( j ) - ajj-u [ ) / | Aj AT | (3) where, as b e f o r e , T i s the set of rows remaining to be covered by the bounding a l g o r i t h m . The term i n the denominator of (3) i s j u s t the number of rows i n Aj f o r which v a l u e s of u- have not been a s s i g n e d . Bounding Algorithm, II 1. ( I n i t i a l i z a t i o n ) R=Vjes-Aj , T=e-R, LBC=0. Goto 2. 2. ( F i n d the best f e a s i b l e column) Using (3), c a l c u l a t e C P R ( j ) , j=1,...,n. F i n d the column j with the minimum val u e of CPR ( j ) and which s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s , AjAR=0. If there i s no such column, goto 4. Otherwise, goto 3. 3. (Update and check f o r a s o l u t i o n ) LBC=LBC+| Aj AT j • CPRj and T=T-(Aj A T ) . If T={0}, goto 4. Otherwise, goto 2. 4. (Termination) I f T={0] the subproblem S + i s u n f e a s i b l e , LBC=t>o. Otherwise, LBC i s the lower bound c o s t estimate f o r the completion of the subproblem. Stop. That t h i s procedure y i e l d s a lower bound can be proven with an argument s i m i l a r to that f o r A l g o r i t h m I. The value of Ziu[aij i s never allowed to exceed Cj , thus the dual c o n s t r a i n t s of (2) are always s a t i s f i e d . A p p l y i n g A l g o r i t h m II to the i n i t i a l problem i n Table 3.1 g i v e s : 45 (0) R = { 0 } , T=e. (1 ) j = 1, u 3 = 3 (2) j=2, uj=4 (3) j=3, u 2=4 (4) j = 5, u , 1=uj = 4.5 and LBC= E£Ui=20. For t h i s small problem A l g o r i t h m II has found the o p t i m a l c o s t . Although the second a l g o r i t h m w i l l c a l c u l a t e a b e t t e r bound i t r e q u i r e s the r e c a l c u l a t i o n of the CPRs on each i t e r a t i o n . A l s o , f i n d i n g the best f e a s i b l e column i s more d i f f i c u l t because the i n i t i a l o r d e r i n g of the columns change with the r e c a l c u l a t i o n of the CPRs. For these reasons, Bounding A l g o r i t h m II has yet to be t e s t e d . A l g o r i t h m II might be u s e f u l i n a SCP program. The SCP i s a l o o s e l y c o n s t r a i n e d problem, so s t r o n g e r bounding procedures are important. At t h i s time though, i t i s not c l e a r that the a d d i t i o n a l time r e q u i r e d to compute the bounds with A l g o r i t h m II w i l l reduce the search t r e e s u f f i c i e n t l y to improve performance over the e q u i v a l e n t program u s i n g A l g o r i t h m I. Test r e s u l t s f o r a SCP program u s i n g the l a t t e r bounding procedure are given . i n S e c t i o n 3.6.2. 46 3.2 Two New Alg o r i t h m s f o r the SPP 3.2.1 A Branch-and-Bound A l g o r i t h m U s i n g Bounding A l g o r i t h m I one needs on l y d e f i n e a branching s t r a t e g y to d e f i n e a BB a l g o r i t h m t o s o l v e the SPP. The o r d e r i n g of columns d e f i n e d by the bounding a l g o r i t h m makes the d e f i n i t i o n simple and co n v e n i e n t . SPPBB A l g o r i t h m 1. ( I n i t i a l i z a t i o n ) S + = { 0 } , z 0=~, R = { 0 } , k=0. Order the columns so that CPR(j) < CPR(J+1), j=1,...,n-1. Goto 3. 2. (Forward step) F i n d the f i r s t column, j , j>k, t h a t s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s , AjAR = 0. I f there i s no such column, goto 4. Otherwise, k=j, R=RVAK, S +=S*+{k}, z=z+c k, goto 3. 3. (Bounding) Determine LBC(S) as d e f i n e d by Bounding A l g o r i t h m I.. I f z+LBC(S)^z 0, goto 4. Otherwise, i f R=e (eg. a p a r t i t i o n s o l u t i o n ) , then z 0=z, x 0 ( j ) = { l | j e S +}, S +=S +-{k}, goto 4. Otherwise, goto 2. 4. (Backstep) Let k be the l a s t index value i n S +, S +=S +-{k} and R=R-Ak. If S +={0], goto 5. Otherwise, goto 3. 5. (Termination) If z 0 = ° o there i s no p a r t i t i o n s o l u t i o n . Otherwise, x 0 , z 0 d e f i n e the optimal s o l u t i o n . Stop. F i g u r e 3.1 i l l u s t r a t e s the SPPBB a l g o r i t h m by showing the s o l u t i o n t r e e f o r the problem i n Table 3.1. The same l a b e l i n g format i s used as i n F i g u r e 2.5. 2 0 n F i g u r e 3.1: SPPBB s o l u t i o n t r e e 48 N o t i c e t h a t the valu e s of S~, the v a r i a b l e s f i x e d a t 0, a r e i m p l i c i t l y handled by the p o i n t e r k. That i s , S"={j|j<k, jeS*} where k i s the l a r g e s t index value i n S* or S " . For example, node 2 i n F i g u r e 3.1 has S*={1,3}, S"={2} and a l l other v a r i a b l e s f r e e . Node 6 has S + = { 1 ] , S~={2,3}, and a l l o t h e r a r e v a r i a b l e s are f r e e . 3.2.2 An I m p l i c i t Enumeration A l g o r i t h m The next a l g o r i t h m to be i n t r o d u c e d belongs t o a c l a s s of a l g o r i t h m s c a l l e d H e u r i s t i c Path A l g o r i t h m s (HPA) [24], I t a l s o belongs to a more s p e c i f i c group c a l l e d A* a l g o r i t h m s [ 1 6 ] . A b r i e f i n t r o d u c t i o n to these g e n e r a l i z e d a l g o r i t h m s can best be made by f i r s t i n t r o d u c i n g the p a r t i c u l a r a l g o r i t h m of i n t e r e s t . SPPA* A l g o r i t h m 1. ( I n i t i a l i z a t i o n ) S Q = [ 0 ] , S 5 = { 0 } . Order the columns such t h a t CPR(j) < CPR(J+1), j=1,...,n-1. Goto 3. 2. ( S e l e c t a subproblem from the c a n d i d a t e l i s t ) I f there are no subproblems on the c a n d i d a t e l i s t goto 5. Otherwise, f i n d the subproblem from the c a n d i d a t e l i s t which has the minimum lower bound c o s t , ZLBC(S). C a l l t h i s subproblem, S 0. I f there are no subproblems on the c a n d i d a t e l i s t , goto 5. Remove S 0 from the c a n d i d a t e l i s t and i f V j £ c ^ A j = e , goto 6. Otherwise, goto 3. 3. (Generate the descendants to the c u r r e n t node) Let 1 be the 49 l a r g e s t index value i n So or SQ. F i n d the column k, k>l, t h a t s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s , A kA(Vj 6s*Aj ) = 0. I f t h e r e i s no such column goto 2. Otherwise, generate the two descendants S r and S 2 d e f i n e d by S}=S 0 + {k}, ST =S5 and S 2=S 0, Si=Sa+{k}. 4. (Bounding) C a l c u l a t e LBC(S) f o r both S, and S 2. Let ZLBC(S)= Ej^Cj +LBC(S) f o r both nodes, then s t o r e them on the c a n d i d a t e i i s t . Goto 2. 5. ( U n f e a s i b l e t e r m i n a t i o n ) There i s no p a r t i t i o n s o l u t i o n . Stop. 6. (Termination) The c u r r e n t subproblem S i s the o p t i m a l s o l u t i o n and z 0 = ZLBC(S 0)= EjesjCj i s the optimal c o s t . Stop. F i g u r e 3.2 i l l u s t r a t e s the s o l u t i o n t r e e f o r the problem i n T a b l e 3.1. The SPPA* a l g o r i t h m i s an example of a HPA a l g o r i t h m . That i s , given a l i s t of c a n d i d a t e problems the HPA a l g o r i t h m w i l l f i n d the one w i t h the minimum lower bound estimate, then generate and bound the successor nodes and p l a c e them on the c a n d i d a t e l i s t . The g e n e r a l i z e d HPA a l g o r i t h m , however, d i f f e r s from SPPA* i n that a weighting f u n c t i o n , w(x), i s i n t r o d u c e d . If g(x) is.-the c o s t of the p a r t i a l s o l u t i o n at x, and h(x) i s an e s t i m a t e f o r the c o s t of completing the problem, then the e v a l u a t i o n f u n c t i o n f ( x ) i s given by: f ( x ) = (1-w(x))•g(x) + w ( x ) - h ( x ) , 0< w(x) <1. (4) F i g u r e 3.2: SPPA* s o l u t i o n t r e e 51 N o t i c e t h a t h(x) i s an estimate and not n e c e s s a r i l y a lower bound. Consequently, the g e n e r a l i z e d HPA a l g o r i t h m does not guarantee an o p t i m a l s o l u t i o n , hence the name ' h e u r i s t i c ' . The case when w(x)=0.5 and h(x) i s r e q u i r e d to be a lower bound i s known as an A* a l g o r i t h m . C l e a r l y SPPA* i s an A* a l g o r i t h m s i n c e the bounding f u n c t i o n ZLBC(S)= ZJ6S.Cj +LBC(S) (5) i s e q u i v a l e n t to (4) when w(x)=0.5. The g e n e r a l i z e d A* a l g o r i t h m was f i r s t i n t r o d u c e d and s t u d i e d by Hart, N i l s o n and Raphael [16]. T h i s work showed that the A* a l g o r i t h m w i l l f i n d the o p t i m a l s o l u t i o n by e x p l o r i n g fewer nodes than any other s e a r c h procedure using the same bounding procedure. To guarantee t h i s p r o p e r t y , the bounding procedure must s a t i s f y one more t e s t . For the SPPA* a l g o r i t h m t h i s t e s t can be expressd as ZLBC(So) < ZLBC(S,),ZLBC(S 2). SPPA* does not s t r i c t l y s a t i s f y t h i s requirement s i n c e the estimated c o s t f o r a successor node can equal t h a t of the parent node. T h i s i m p l i e s that i f there are t nodes with the same value of ZLBC(S) as the optimal s o l u t i o n , then another a l g o r i t h m , say SPPBB, might f i n d the o p t i m a l s o l u t o n i n at most t fewer nodes. However, to do so the SPPB a l g o r i t h m would have to f i n d the optimal s o l u t i o n before ever examining a node with ZLBC(S) g r e a t e r than the optimal v a l u e . T h i s would i n s u r e that the SPPBB a l g o r i t h m would examine onl y those nodes examined by the A* a l g o r i t h m (e.g. both a l g o r i t h m s would f i n d one f e a s i b l e s o l u t i o n , the optimal s o l u t i o n ) . T h i s s i t u a t i o n seldom o c c u r s . The SPPA* a l g o r i t h m 52 w i l l t y p i c a l l y examine s i g n i f i c a n t l y fewer nodes than the SPPBB a l g o r i t h m . As one would expect, to examine the fewest number of nodes the A* a l g o r i t h m r e q u i r e s more overhead than the simpler branch-and-bound s t r a t e g y . A l a r g e candidate l i s t must be maintained and some k i n d of bookkeeping procedure must keep t r a c k of the c u r r e n t best node on the stack. Computational r e s u l t s show that the e x t r a overhead i s w e l l worth the p r i c e . However, d i f f i c u l t i e s a r i s e when the candidate l i s t grows too l a r g e , as w i l l be seen l a t e r . S i n c e the set p a r t i t i o n i n g problem i s an NP-complete problem t h e r e w i l l always be l a r g e problems f o r which one cannot reasonably expect to f i n d , and prove, that one has found the o p t i m a l s o l u t i o n . For these problems h e u r i s t i c methods must be employed to f i n d good sub-optimal s o l u t i o n s . Since SPPA* i s a HPA a l g o r i t h m i t i s e a s i l y m o d i f i e d to o b t a i n h e u r i s t i c s o l u t i o n s . T h i s i s done by adding a weighting f a c t o r to ( 5 ) . ZLBC(S)= Ej £ S*Cj + w- LB.C(S) ,. w>1 (6) where w i s the weighting f a c t o r . ( 6 ) i s the same as ( 4 ) a l t h o u g h the weighting f a c t o r has been expressed s l i g h t l y d i f f e r e n t l y . The weighting f a c t o r w i l l i n g e n e r a l reduce the search by compensating f o r the underestimated value of LBC(S). For example, i n the t e s t problems examined l a t e r , LBC(S) was 53 t y p i c a l l y about 90% of the l i n e a r s o l u t i o n , which was a l s o an underestimate. Thus, a weighting f a c t o r of w=1.1 i n (6) w i l l g i v e a c l o s e r e s t i m a t e of ZLBC(S). Furthermore, a weighting f a c t o r w i l l tend t o p e n a l i z e those nodes which d e f i n e subproblems ' f a r ' from a f e a s i b l e s o l u t i o n . T h i s can be i l l u s t r a t e d with an example. Suppose at some p o i n t i n the SPPA* search there are two 'best' nodes on the c a n d i d a t e l i s t , both with ZLBCs equal to 400. Let S 1 be the f i r s t and S 2 be the second. I f z 1 = r j S S.Cj =300 and Z 2=100 then LBC(S 1)=100 and LBC(S 2)=300. The SPPA* a l g o r i t h m i s f r e e t o choose e i t h e r of the nodes f i r s t . The h e u r i s t i c v e r s i o n , SPPHA*, c a l c u l a t e s ZLBC(SM = 300+1.1•(100) = 410, and LBC(S 2) = 100+300 •(1. 1 ) = 430, so the h e u r i s t i c w i l l choose S 1. I n t u i t i v e l y , S 1 i s the b e t t e r c h o i c e s i n c e i t i s c l o s e r to a f e a s i b l e s o l u t i o n i n the sense that fewer v a r i a b l e s have to be added to S' to fathom the node. S 2 i s p e n a l i z e d because (6) e s t i m a t e s that by the time S 2 i s as c l o s e to the f e a s i b l e s o l u t i o n as S 1, i t w i l l c o s t more than S 1. One would expect that the c h o i c e of w i s very important. A l a r g e r v a l u e c o u l d be expected to reduce the s e a r c h but at the same time i n c r e a s e s the p r o b a b i l i t y that a sub-optimal s o l u t i o n w i l l be found. One can put an upper bound on the c o s t of a sub-optimal s o l u t i o n . I f z 0 i s the h e u r i s t i c c o s t and z 0 the o p t i m a l c o s t , then Z o / z 0 < w. 54 S e c t i o n 3.6.1 w i l l i n v e s t i g a t e the SPPHA* a l g o r i t h m by g i v i n g t e s t r e s u l t s f o r s e v e r a l v a l u e s of the weighting f u n c t i o n . 3.3 Some Implementation D e t a i l s T h i s s e c t i o n i n c l u d e s a c o l l e c t i o n of d e t a i l s t h a t have not been e x p l i c i t l y s t a t e d i n the a l g o r i t h m d e f i n i t i o n s but which are s i g n i f i c a n t to the implementation and performance of the algorithms." Step 3 of the SPPA* a l g o r i t h m searches f o r the next f e a s i b l e column A k. The i n c l u s i o n and e x c l u s i o n of x k d e f i n e s the successor node, S, and S 2, to the parent node S 0. However, k was p r e v i o u s l y determined by the bounding a l g o r i t h m . That i s A i s the f i r s t column Bounding A l g o r i t h m I uses to compute the bound. So r a t h e r than c a l c u l a t i n g k twice, i t should be saved on the c a n d i d a t e l i s t along with the o r i g i n a l i n f o r m a t i o n d e f i n i n g the node. Saving the index value k makes the d e f i n i t i o n of S" p a r t i c u l a r i l y easy. Consider the subproblem at node S. I f k 0 i s the value of k f o r the parent node S 0, then S~={j|j<k 0, j £ S + } . Thus, a l l the i n f o r m a t i o n r e q u i r e d to d e f i n e the subproblem S on the c a n d i d a t e l i s t i s k, S* and ZLBC(S). In s t e p 4 of the SPPA* a l g o r i t h m , the lower bounds f o r the two descendant nodes are c a l c u l a t e d . LBC(S,) i s c a l c u l a t e d as 55 d e s c r i b e d but there i s a s h o r t c u t when c a l c u l a t i n g L B C ( S 2 ) . Remember that S 2 d i f f e r s from S 0 only i n that a f r e e v a r i a b l e , x K , has been f i x e d to 0 i n S 2. Since S 0 and S 2 have the same set of f e a s i b l e columns except f o r A k, the bounding procedure w i l l be the same except f o r those rows i n A k. So i f Bounding A l g o r i t h m I i s mod i f i e d by T=AK i n ste p 1, one can c a l c u l a t e ZLBC(S 2) by ZLBC(S 2)=ZLBC(S 0)-c K+LBC(S 2). N o t i c e t h a t t h e r e are g e n e r a l l y only a few rows i n A so the bounding can be s i g n i f i c a n t l y speeded up f o r h a l f the nodes i n the search t r e e . The same idea a p p l i e s to SPPBB. In the branch and bound a l g o r i t h m a backstep i s the same as c a l c u l a t i n g the S 2.descendant to the parent node. Large SPPs can generate many candidate problems. P h y s i c a l l y , the number of words r e q u i r e d t o d e f i n e the s t a t e of the c a n d i d a t e problem depends on the s i z e of . the problem. T r e a t i n g S* as a b i n a r y v e c t o r s t o r e d i n machine words (e.g. b i t j =1 i f j e S + ) compresses the data s i g n i f i c a n t l y . In the i n t e r e s t of computational e f f i c i e n c y i t i s d e s i r a b l e to add e x t r a i n f o r m a t i o n to the c a n d i d a t e l i s t i n order t o a v o i d r e c a l c u l a t i n g i n f o r m a t i o n . T h i s i n f o r m a t i o n i s the f i x e d c o s t , z=Ej £ SiCj , and the rows a l r e a d y covered by a subproblem (VjSS*Aj ). The l a t t e r can be s t o r e d as a b i n a r y v e c t o r . The e a s i e s t way to implement the SPPA* cand i d a t e l i s t i s as a s t a c k , i n memory. As a new node i s generated, i t i s added t o 56 the top of the s t a c k . SPPA* expands a parent node i n t o two s u c c e s s o r s . Thus a f t e r t i t e r a t i o n s there are 2t+1 nodes i n the s e a r c h t r e e . Of these, t have been e x p l o r e d , l e a v i n g t+1 l i v e nodes on the c a n d i d a t e l i s t . T h e r e f o r e , i t i s only necessary to save the t+1 l i v e nodes on the s t a c k . A convenient way to do t h i s i s t o w r i t e over the dead S 0 parent node on the c a n d i d a t e l i s t with the i n f o r m a t i o n d e f i n i n g the l i v e S 2 descendant. The v a l u e of z, S + and Vj«*Aj remain unchanged. Only the v a l u e s of k and LBC(S 2) must be updated. T h i s minimizes the memory r e q u i r e d to save the c a n d i d a t e l i s t s i n c e only i n f o r m a t i o n p e r t a i n i n g to l i v e nodes i s r e t a i n e d . 3 . 4 " A New A l g o r i t h m f o r the Set Covering Problem (SCP) Although t h i s t h e s i s i s p r i m a r i l y concerned with the SPP, the s i m p l i c i t y of d e f i n i n g a SCP a l g o r i t h m by modifying SPPA* suggests that a SCP a l g o r i t h m , SCPA*, should be i n t r o d u c e d . SCPA* d i f f e r s from the p a r t i t i o n v e r s i o n mainly i n that the f i x e d o r d e r i n g of columns in SPPA* i s not a v a i l a b l e to SCPA*. T h i s makes the bounding procedure more d i f f i c u l t because the CPRs must be r e c a l c u l a t e d and s o r t e d at every node. Furthermore, every node on the c a n d i d a t e l i s t must s t o r e S~, i n a d d i t i o n to S +. SCPA* A l g o r i t h m 1. ( I n i t i a l i z a t i o n ) S o ={0 } , S5={0} . Goto 3. 2 . ( S e l e c t a subp rob l em f rom the c a n d i d a t e l i s t ) I f t h e r e a r e no subp rob l ems on the c a n d i d a t e l i s t , g o t o 5. O t h e r w i s e , f i n d the subp rob l em on the c a n d i d a t e l i s t w h i c h has t h e minimum l o w e r bound c o s t , Z LBC ( S ) . C a l l t h i s s u b p r o b l e m node S 0 , and remove i t f rom the c a n d i d a t e l i s t . I f Vjes»Aj=e, g o t o 6. O t h e r w i s e , g o t o 3. 3 . (Gene ra t e the d e s c e n d a n t s t o the c u r r e n t node) C a l c u l a t e CPR ( j ) = c ( j ) / P ( j ) , j = 1 , . . . , n ; where P ( j ) i s t he number o f rows i n Aj t h a t rema in t o be c o v e r e d . F i n d the co lumn k t h a t has the minimum CPR and s a t i s f i e s the p a r t i t i o n c o n s t r a i n t s , A KA(Vj Ss*Aj ) = 0. I f t h e r e i s no such c o l u m n , g o t o 2. O t h e r w i s e , g e n e r a t e the two d e s c e n d a n t s S, and S 2 d e f i n e d b y : S }=So+(k ] , Sl=S5 and S 2 =S5 , S i=Ss+{k}. Go to 4 . 4 . ( Bound ing ) C a l c u l a t e LBC(S) f o r b o t h S, and S 2 . L e t ZLBC(S)= F-jesrCj +LBC(S) f o r b o t h n o d e s , t hen s t o r e the new nodes on t he c a n d i d a t e l i s t . Go to 2. 5. ( U n f e a s i b l e t e r m i n a t i o n ) The re i s no c o v e r s o l u t i o n . S t o p . 6 . ( T e r m i n a t i o n ) The c u r r e n t subp rob l em S 0 i s t he o p t i m a l s o l u t i o n and z 0 = ZLBC (S 0 )= Ijss'Cj i s t he o p t i m a l c o s t . S t o p . A l t h o u g h the SCPA* a l g o r i t h m canno t use many o f t he f e a t u r e s t h a t make SPPA* an e f f i c i e n t a l g o r i t h m i t s p e r f o r m a n c e s t i l l a p p e a r s t o be comparab l e t o o t h e r a l g o r i t h m s f o r t he 58 l i m i t e d t e s t r e s u l t s examined. As with the SPPA* a l g o r i t h m , the c o v e r i n g v e r s i o n e a s i l y c o n v e r t s to a h e u r i s t i c a l g o r i t h m . The performance of both SCPA* and the h e u r i s t i c v e r s i o n , SCPHA*, are examined i n S e c t i o n 3.6.2. 3.5 T e s t Problems and Comparing R e s u l t s Comparing the performance of two a l g o r i t h m s i s o f t e n a d i f f i c u l t u n d e r t a k i n g . A good computational comparison i s p o s s i b l e i f the programs r e p r e s e n t a t i v e of the a l g o r i t h m s are executed on the same machine, with the same input data. F a c t o r s such as the o p e r a t i n g system should not be p e r m i t t e d to i n f l u e n c e the comparison. Nor should the language of implementation give one a l g o r i t h m an u n f a i r advantage over another. Often i t i s not p o s s i b l e or p r a c t i c a l to s a t i s f y the above c o n d i t i o n s . For i n s t a n c e , i f one wishes to compare new r e s u l t s with r e s u l t s p u b l i s h e d i n the l i t e r a t u r e , as w i l l be done l a t e r , the program from which the p u b l i s h e d r e s u l t s were o b t a i n e d may not be a v a i l a b l e . However, a comparison i s p o s s i b l e i f t e s t r e s u l t s from the same set of t e s t problems are known. I f the r e s u l t s were obtained on d i f f e r e n t computers then the comparison must take i n t o account the r e l a t i v e performance of the computers. But an a c c u r a t e f i g u r e f o r r e l a t i v e performance can be d i f f i c u l t to determine s i n c e i t depends on many f a c t o r s (e.g. 59 o p e r a t i n g systems, benchmark programs used, e t c . ) . There are a d d i t i o n a l problems which can make a comparison d i f f i c u l t . An annoying problem that shows up throughout o p e r a t i o n s r e s e a r c h i s the p u b l i c a t i o n of t e s t r e s u l t s without a complete d e f i n i t i o n of the t e s t problems. Due to space l i m i t a t i o n s i t i s o f t e n not f e a s i b l e t o e x p l i c i t l y l i s t t e s t problems, e s p e c i a l l y l a r g e ones, i n a paper. But a c c u r a t e comparisons are not p o s s i b l e i f the exact t e s t problems are not known. One p o s s i b l e s o l u t i o n i s the d e f i n i t i o n of s e t s of s t a n d a r d i z e d problems which are r e p r e s e n t a t i v e of the problems found i n each a p p l i c a t i o n a r e a . Such a set of problems would e l i m i n a t e much of the c o n f u s i o n borne from a p r o l i f e r a t i o n of u n r e p r o d u c i b l e r e s u l t s i n the l i t e r a t u r e . S i n c e a set of s t a n d a r d i z e d t e s t problems i s not a v a i l a b l e , and f o r the reasons mentioned, the P i e r c e and Lasky a l g o r i t h m [23] has been programmed and implemented as a benchmark program. T h i s permits a " s i d e - b y - s i d e " comparison of the a l g o r i t h m s i n t r o d u c e d i n t h i s t h e s i s with one which has been i n the l i t e r a t u r e f o r some time. I t a l s o permits a q u a l i t a t i v e comparison with the work which r e f e r s to [23]. P i e r c e and Lasky's a l g o r i t h m was chosen because i t r e p r e s e n t s one of the best p a r t i t i o n i n g a l g o r i t h m s i n the l i t e r a t u r e . As proof c o n s i d e r the a l g o r i t h m s i n the survey of S e c t i o n 2.7. The bounding t e s t i n P i e r c e and Lasky's a l g o r i t h m makes i t s u p e r i o r to G a r f i n k l e and Nemhauser's a l g o r i t h m . The l a t t e r a l g o r i t h m i s shown i n [29] to be s u p e r i o r to the Lemke et 60 a l . [19] and Koncal et a l . [27,28,29] a l g o r i t h m s . The l a s t two a l g o r i t h m s are set c o v e r i n g a l g o r i t h m s t h a t have been a p p l i e d t o the s e t p a r t i t i o n i n g problem. Both have a dependence on l i n e a r programming. The problems used to o b t a i n the t e s t r e s u l t s were randomly generated u s i n g a program l i s t e d i n Appendix A, and using the input data l i s t e d i n Appendix B. Randomness i s obtained by u s i n g a random number generator to randomly p l a c e I's i n the c o n s t r a i n t m a t r i x . Column c o s t s are c a l c u l a t e d by l e t t i n g t h e i r v a l u e s be p r o p o r t i o n a l to the number of 1's i n the column. The random number generator i s then used to add a small a d d i t i o n a l c o s t . T h i s r e s u l t s i n a l l the CPRs being approximately equal, thus the r e s u l t a n t problems are more d i f f i c u l t to s o l v e than the e q u i v a l e n t minimuim c a r d i n a l i t y problem. Minimum c a r d i n a l i t y problems are c o v e r i n g problems with cj=1, f o r j=1,...,n. The t e s t problems, randomly generated and t h e r e f o r e l a c k i n g s p e c i a l s t r u c t u r e , are s i m i l a r t o many p r a c t i c a l s c h e d u l i n g problems which u s u a l l y c o n t a i n some s t r u c t u r e . Consider a truck s c h e d u l i n g problem. The rows i n a column represent the l o c a t i o n s t o which a p a r t i c u l a r truck would make d e l i v e r i e s i f that route were i n c l u d e d i n the schedule. A l a r g e r truck would make more d e l i v e r i e s to more l o c a t i o n s but at a roughly p r o p o r t i o n a l i n c r e a s e i n c o s t . As a r e s u l t one would expect the CPRs to be approximately equal f o r a l l the r o u t e s . Since the problem g e n e r a t i n g program i s l i s t e d i n Appendix A, the problems are r e p r o d u c i b l e to anyone having access to a 61 computer. T h i s i s a much more convenient s t r a t e g y compared to e x p l i c i t l y l i s t i n g the t e s t problems. Furthermore, the need f o r a set of standard problems as d i s c u s s e d p r e v i o u s l y c o u l d be met by the adoption of a program l i k e the one l i s t e d i n Appendix A. Future SCP and SPP papers would then only have to l i s t the input parameters to the program to f u l l y d e f i n e the t e s t problems used. T h i s idea a l s o a p p l i e s to other o p t i m i z a t i o n problems i n o p e r a t i o n s r e s e a r c h . Test problems may a l s o have s p e c i a l s t r u c t u r e j u s t so long as the g e n e r a t i n g program can be an agreed upon standard with r e p r o d u c i b l e r e s u l t s . To i n s u r e that the problems are r e p r o d u c i b l e , the F o r t r a n program of Appendix A uses a p o r t a b l e random number generator [ 3 0 ] . The random number generator i s p o r t a b l e i n the sense that a l l machines that can represent 32 b i t signed i n t e g e r s w i l l generate the same sequence of random numbers. 3.6 Test R e s u l t s Programs were w r i t t e n to implement the SPPBB, SPPA*, and the SCPA* a l g o r i t h m s . Since most of the problem data i s b i n a r y and s i n c e t h e r e are many l o g i c a l o p e r a t i o n s i n the a l g o r i t h m s , the most e f f i c i e n t way to s t o r e the problem data i s i n machine words. That i s , b i t j of word i i s set to 1 i f ajj = 1. T h i s compresses the data and permits the use of l o g i c a l machine i n s t r u c t i o n s to implement the l o g i c a l o p e r a t i o n s i n p a r a l l e l . 62 The degree of p a r a l l e l i s m depends on the word s i z e of the host machine. F o r t r a n programs, with s p e c i a l r o u t i n e s to handle b i n a r y data and l o g i c a l o p e r a t i o n s , c o u l d be used to implement the a l g o r i t h m s . Other high l e v e l languages permit these o p e r a t i o n s to be performed d i r e c t l y . However the programs have been implemented i n assembler language f o r s e v e r a l reasons. An obvious reason i s the ease with which b i n a r y data can be a c c e s s e d and manipulated. Another i s that assembler program development i s w e l l supported and there e x i s t s a l a r g e l i b r a r y of support r o u t i n e s that can be e a s i l y accessed from assembler. An assembler program w i l l be more e f f i c i e n t i n terms of s i z e and e x e c u t i o n time. Furthermore, Chapter 4 w i l l d i s c u s s the p o s s i b l e design of s p e c i a l hardware to improve the performance of some of the a l g o r i t h m s . An assembler program g i v e s the programmer some i n s i g h t as to where s p e c i a l hardware w i l l be most b e n e f i c i a l i n speeding up the a l g o r i t h m . A l s o , an assembler program serves to s i m p l i f y f u t u r e program development f o r a m i c r o p r o c e s s o r which would c o n t r o l the s p e c i a l hardware. L a s t l y , the new a l g o r i t h m s are r e l a t i v e l y simple, at l e a s t simple r e l a t i v e to the more s o p h i s t i c a t e d approaches based on l i n e a r programming. Thus, an assembler implementation of the a l g o r i t h m i s not an unmanageable task f o r the programmer. Program development, t e s t i n g , and e v a l u a t i o n was performed u s i n g the f a c i l i t i e s o f f e r e d by the Computing Centre at the U n i v e r s i t y of B r i t i s h Columbia. The Centre operates an ' Amdahl 63 J 470 V/8 computer o p e r a t i n g under the Michigan Terminal System (MTS), which p r o v i d e s an o n - l i n e , time-shared system. 3.6.1 Set P a r t i t i o n i n g Test R e s u l t s As mentioned b e f o r e , the P i e r c e and Lasky a l g o r i t h m was implemented as a benchmark a l g o r i t h m . To make comparisons a c c u r a t e , the program was implemented i n assembler language. Table 3.2 shows the computational r e s u l t s f o r a range of t e s t problems. For each of the three programs the e x e c u t i o n time r e q u i r e d to so l v e the problem and the number of nodes examined i s l i s t e d . CPU time excludes I/O time and a l l time a s s o c i a t e d with the time-shared o p e r a t i n g system. Problems PR1 through PR7 are the same problems as P1 to P7, except that the o r i g i n a l c o s t s are r e p l a c e d with the reduced c o s t s . The SPPA* program s t o r e s a l l the l i v e nodes on a candidate l i s t implemented as a stack i n memory. For d i f f i c u l t problems the s i z e of the ca n d i d a t e l i s t can become too l a r g e to save. For the problems in Table 3.2 the stack has been r e s t r i c t e d to 80,000 words. The space to d e f i n e a node.on the l i s t depends, n a t u r a l l y , on the s i z e of problem. For problem P5, i t i s 14 words. T h e r e f o r e , a maximum of 5,713 l i v e nodes can be s t o r e d on the ca n d i d a t e l i s t . But there were 13152/2=6576 l i v e nodes upon completion of the problem. O b v i o u s l y the stack i s not l a r g e enough to s t o r e a l l the l i v e nodes. To handle t h i s Optimal Linear Time to Pierce and Lasky SPPBB SPPA* solution solution solve l i n e a r Problem Size Density cost cost problem (sec) nodes CPU(sec) nodes CPU(sec) nodes CPU(sec) P1 15, 120 .205 287 PR 1 P2 15.90 .205 389 PR2 P3 30,120 .071 1039 PR3 P4 30,120 .073 880 PR4 P5 30,200 .072 1025 PR5 PG 30,400 .07 1 704 PRS P7 40,400 .070 731 PR7 284 . 18 376.78 1038.7 878.S 1018.6 703.33 730.15 . 155 1 19 .207 . 146 .272 . 292 2.468 3695 488 1469 21 1 . 281 .045 . 113 .013 >60 23172 1.644 >70 23791 1.132 > 1 10 24777 6.732 1223 523 279 322 3800 1093 1 2006 2408 . 137 .070 .034 .040 .694 . 189 1.918 . 382 46807 7.265 13925 2.155 17975 3.604 3749 2.687 61753 15.357 13601 2.992 436 312 68 98 1328 84 5563 282 .062 .044 .010 .015 .242 .018 1 . 156 .051 13152* 2.762 3170 .546 8324* 2.130 1222 .222 17590** 5.517 5744* 1.294 Table 3.2: P a r t i t i o n i n g test r e s u l t s 65 s i t u a t i o n , a f u l l c andidate l i s t was pruned by removing 20% of the nodes from the l i s t . The pruned nodes where taken from the beginning of the l i s t because these nodes have the fewest number of v a r i a b l e s f i x e d i n t o the s o l u t i o n , and t h e r e f o r e are s t a t i s t i c a l l y l e s s promising p a r t i a l s o l u t i o n s . There are c e r t a i n l y more s o p h i s t i c a t e d pruning methods that are worth f u r t h e r r e s e a r c h , but t h i s scheme i s f a s t and e a s i l y implemented. N a t u r a l l y , once the c a n d i d a t e l i s t has been pruned an o p t i m a l s o l u t i o n cannot be guaranteed. A "*" i n the SPPA nodes column of Table 3.2 i n d i c a t e s that the candidate l i s t was pruned but that the optimal s o l u t i o n was found. "**" means tha t a sub-optimal s o l u t i o n was found. C l e a r l y the reduced c o s t problems i n Table 3.2 are the e a s i e r problems to s o l v e . One w i l l remember from S e c t i o n 2.9 t h a t the reduced c o s t are c a l c u l a t e d by s o l v i n g the a s s o c i a t e d l i n e a r problem. In Table 3.2, the l i n e a r problem was s o l v e d with a p r i m a l - d u a l a l g o r i t h m based on the MINIT a l g o r i t h m ( A l g o r i t h m 333 from the CACM). The r e l a x e d c o v e r i n g problem, Ax>e, was s o l v e d s i n c e i t was c o m p u t a t i o n a l l y e a s i e r than the r e l a x e d p a r t i t i o n problem, Ax=e. The time to s o l v e the l i n e a r problem i n c l u d e s I/O time and the time to c a l u l a t e the reduced c o s t s from the l i n e a r s o l u t i o n . Using reduced c o s t s i s most p r o f i t a b l e f o r l a r g e r problems. For the r e l a t i v e l y easy problems P1 and P2 the gains are modest. When the time to s o l v e the l i n e a r problem i s added to the time to s o l v e the reduced problem i t i s apparent that one i s 66 sometimes b e t t e r o f f s o l v i n g the o r i g i n a l problem. On l a r g e r problems however, reduced c o s t s s i g n i f i c a n t l y reduce s o l u t i o n time. For i n s t a n c e , P i e r c e and Lasky c o u l d not sol v e P4 i n 70 seconds but PR4 was s o l v e d i n 1.13 seconds, more than 60 times f a s t e r . The reason f o r t h i s improvement i s that the in t e g e r s o l u t i o n and the l i n e a r s o l u t i o n u s u a l l y d i f f e r i n cost by l e s s than .1%. In other words, the l i n e a r and i n t e g e r s o l u t i o n s are o f t e n very c l o s e i n the s o l u t i o n space. The a l g o r i t h m s i n t r o d u c e d i n t h i s t h e s i s , SPPBB and SPPA*, performed f a v o r a b l y i n comparison to the P i e r c e and Lasky a l g o r i t h m . On the more d i f f i c u l t problems, P3 through P7, the performance r a t i o f a v o r s the new a l g o r i t h m s , sometimes by a f a c t o r of more than 50. The gains were l e s s dramatic f o r the l e s s d i f f i c u l t c o s t reduced problems, although PR5 was s t i l l a very d i f f i c u l t problem f o r the P i e r c e and Lasky a l g o r i t h m . A p p a r e n t l y , as problems become l a r g e r and more d i f f i c u l t , the performance r a t i o w i l l i n c r e a s i n g l y favor the a l g o r i t h m s i n t r o d u c e d i n t h i s t h e s i s . As expected, the SPPA* enumerative a l g o r i t h m outperforms the branch-and-bound SPPBB a l g o r i t h m , although i t co u l d not always guarantee the optimal s o l u t i o n as SPPBB c o u l d . SPPA* examined l e s s than h a l f as many nodes on average, and r e q u i r e d a s l i g h t l y g r e a t e r f r a c t i o n of time than t h a t to sol v e the problem. The d i f f e r e n c e i n the two r a t i o s i n due to the time spent by SPPA* m a i n t a i n i n g the c a n d i d a t e l i s t . N o t i c e that f o r P5, P6, and P7 the c a n d i d a t e l i s t became too l a r g e to save. For 67 these cases the h e u r i s t i c technique d e s c r i b e d e a r l i e r found the opt i m a l s o l u t i o n f o r three of four problems. Table 3.3 i l l u s t r a t e s the performance of SPPHA*, the h e u r i s t i c v e r s i o n s of SPPA*, f o r v a r i o u s v a l u e s of the weighting f a c t o r w. z'/z i s the r a t i o of the s o l u t i o n c o s t found by SPPHA* to the optimal c o s t found by SPPA* i n Table 3.2. S i m i l a r i l y , N'/N i s the r a t i o of the number of nodes (candidate problems) examined. The value of the r a t i o N'/N can a l s o be taken to be the r a t i o of the CPU times. A value of z'/z=1 i m p l i e s that the h e u r i s t i c found the optimal s o l u t i o n . In a l l cases the r a t i o z'/z i s w e l l l e s s than the upper bound value of w. For the t e s t problems used i n t h i s i n v e s t i g a t i o n a good c h o i c e of the weighting f a c t o r appears to be about w=1.063. T h i s g i v e s , on average, a s o l u t i o n which i s w i t h i n 1% of the optimal s o l u t i o n while examining l e s s that 3% of the nodes examined by the op t i m a l a l g o r i t h m . Thus, f o r l a r g e and d i f f i c u l t problems t h i s approach would guarantee good h e u r i s t i c s o l u t i o n s i n minimal time. Because the p a r t i t i o n i n g problem i s a NP-complete problem, the performance of search type a l g o r i t h m s can vary s i g n i f i c a n t l y from problem to problem. In f a c t , problems with the same s i z e and d e n s i t y may have s i g n i f i c a n t l y d i f f e r e n t e x e c u t i o n times (e.g. P3 and P4). SPPA* w i l l behave more p r e d i c t a b l y s i n c e i t examines the minimuim number of nodes. The behavior of the SPPBB a l g o r i t h m depends to a great extent on how long i t takes the a l g o r i t h m to f i n d a good f e a s i b l e s o l u t i o n . The e a r l i e r a 68 w=1.016 w=1.031 w=1.063 w-1.125 Problem z'/z N'/N z'/z N'/N z*/z N'/N z*/z N'/N P4 1 .241 1 .153 1.006 .050 1.005 .027 P5 1 .335 1.001 .101 1.010 .016 1.038 .016 P6 1.001 .068 1.007 .019 1.013 .007 1.016 .007 P7 1.001 .067 1.001 .027 1.001 .026 1.026 .005 Table 3.3: SPPHA* performance r e l a t i v e to SPPA* 69 good s o l u t i o n i s found the e a r l i e r the a l g o r i t h m w i l l be ab l e to prune branches i n the s o l u t i o n t r e e . I f the o p t i m a l c o s t i s known to the SPPBB a l g o r i t h m , then i t would only examine the s o l u t i o n space examined by SPPA*. That i s , the SPPBB a l g o r i t h m would examine a minimum number of nodes. T h i s suggests a two stage s o l u t i o n p rocess f o r s o l v i n g l a r g e problems which would take advantage of the best f e a t u r e s of both SPPBB and SPPA*. F i r s t , the h e u r i s t i c v e r s i o n of SPPA* i s used to o b t a i n a good s o l u t i o n to the problem i n minimal time. T h i s upper bound c o s t i s passed to SPPBB, which s o l v e s the problem. N o t i c e that SPPBB w i l l now examine c l o s e to the minimum number of nodes guaranteed to f i n d the optimal s o l u t i o n , but i t does not maintain the l a r g e c a n d i d a t e l i s t which makes the SPPA a l g o r i t h m so unwieldy on l a r g e problems. For i n s t a n c e , the problem denoted by P8 i n Appendix B was s o l v e d u s i n g the unreduced c o s t s . SPPA* s o l v e d the problem i n 5.6 seconds but o p t i m a l i t y c o u l d not be assured because the can d i d a t e l i s t was pruned 11 times. Using SPPBB and the upper bound c o s t from SPPA*, the optimal s o l u t i o n was found in 5 seconds. Without an upper bound c o s t , SPPBB took 127 seconds to s o l v e the same problem. The SPPA* a l g o r i t h m can be o p t i m i z e d f o r t h i s two stage p r o c e s s . Since only a lower bound c o s t e s t imate i s d e s i r e d from SPPA*, the program l o g i c and stack space used t o save the p a r t i a l s o l u t i o n v e c t o r s i s not r e q u i r e d . T h i s would make a m o d i f i e d SPPA* a l g o r i t h m more e f f i c i e n t . 70 3.6.2 Set Covering Test R e s u l t s Table 3.5 compares the performance of SCPA* with the r e s u l t s o b t a i n e d by E t c h e b e r r y i n [10,11], Problems E1 through E5 are l i s t e d i n [10], These are a l l randomly generated minimum c a r d i n a l i t y problems (e.g. u n i t c o s t s ) which have been reduced a c c o r d i n g to the r u l e s o u t l i n e d i n S e c t i o n 2.5. CPU time f o r Et c h e b e r r y i s the t o t a l time to s o l v e the problem. I t i n c l u d e s some input time but excludes the time spent reducing the problems. SCPA* CPU time excludes a l l I/O time. The Amdahl 470 V/8 i s approximately 40 times f a s t e r than the IBM 360/67. The r a t i o of CPU times i n Table 3.5 i n d i c a t e s t h a t the performance of SCPA i s comparable to t h a t of E t c h e b e r r y ' s a l g o r i t h m . For high e r d e n s i t y problems, E1 and E4, SCPA* i s s u p e r i o r . E t c h e b e r r y ' s a l g o r i t h m can be expected t o perform r e l a t i v e l y poor on higher d e n s i t y problems because the subgradient o p t i m i z a t i o n method ( S e c t i o n 2.7) would r e q u i r e more i t e r a t i o n s of the bounding procedure to o b t a i n a gi v e n a c c u r a c y . SCPA*, l i k e most c o v e r i n g a l g o r i t h m s , performs b e t t e r on higher d e n s i t y problems. The optimal c o v e r i n g s o l u t i o n t o C1 t u r n s out to be a p a r t i t i o n . Thus SCPA* and. SPPA* have the same s o l u t i o n to C l . SPPA* o b t a i n s the s o l u t i o n to C1 i n 0.215 seconds by examining 1138 nodes. T h i s i l l u s t r a t e s the r e l a t i v e e f f i c i e n c y of the p a r t i t i o n i n g v e r s i o n of A*. The p a r t i t i o n v e r s i o n found the s o l u t i o n 13 times f a s t e r than the c o v e r i n g v e r s i o n . 71 Table 3.4 g i v e s the computational r e s u l t s f o r a number of t e s t problems generated by the program of Appendix A. MC1 through MC7 are the minimum c a r d i n a l i t y v e r s i o n s of the problems C.1 to C7. C l e a r l y the minimum c a r d i n a l i t y problems are much e a s i e r to s o l v e . The more c o n v e n t i o n a l l i n e a r programming based approaches to s o l v i n g c o v e r i n g problems (e.g. Lemke, et a l . [ 1 8 ] , S a l k i n , et a l . [29]) have comparable computational r e s u l t s , to those i l l u s t r a t e d i n Tables 3.4 and 3.5. However one would expect that SCPA* would be more s u s c e p t i b l e to the NP-complete nature of the problem. Consequently, l i n e a r programming based programs would probably be more e f f e c t i v e on l a r g e r problems. Table 3.6 i l l u s t r a t e s the performance of SCPHA*, the h e u r i s t i c v e r s i o n s of SCPA*, f o r v a r i o u s v a l u e s of the weighting f a c t o r . The r a t i o s z'/z and N'/N have the same meaning as i n the p a r t i t i o n case of Table 3.3. The l a s t value of w i n Table 3.6, w>>1, i s e q u i v a l e n t to the Greedy H e u r i s t i c s o l u t i o n ( S e c t i o n 2.6) to the SCP. That the two h e u r i s t i c s w i l l o b t a i n the same s o l u t i o n f o l l o w s from the f a c t that with w>>1, SPPHA* i s f o r c e d to f o l l o w the S, descendant node down the search t r e e , which r e s u l t s i n the same branching s t r a t e g y as with the Greedy H e u r i s t i c . The problems generated by the program i n Appendix A are such t h a t an a b s o l u t e upper bound on the value of z"/z i s 1.50, where z" i s the co s t of the worst f e a s i b l e s o l u t i o n , and z, the t o t a l no. of S i z e c a n d i d a t e Cost CPU time Problem m,n d e n s i t y problems (N) (z) (msec) Cl 30,90 . 1 1 2169 558 2620 MC 1 n ti 121 8 145 C2 30,90 .21 913 457 1016 MC2 II n 415 5 440 C3 30,90 .31 855 398 851 MC3 ft 331 4 298 C4 50,90 .21 81 17 643 9954 MC4 n n 329 6 3790 C5 40,90 .21 1675 525 1873 MC5 « ti 435 5 480 C6 30,50 .21 1797 624 1 149 MC6 IF n 49 5 34 C7 30,70 .21 1307 500 1 1 30 MC7 ti ti 237 5 203 Table 3.4: SCPA* computational r e s u l t s CPU time (sec.) S i z e E t c h e b e r r y SCPA* r a t i o of Problem m,n D e n s i t y IBM 360/67 Amdahl 470 CPU times E l 10,13 .30 0.72 .004 180 E2 30,43 .10 2.55 .084 30 E3 30,59 .10 4.80 .100 48 E4 50,100 .21 352.01 0.534 659 E5 50,75 .07 46.05 1.711 27 Tabl e 3.5: Computational r e s u l t s u s ing SCPA* and E t c h e b e r r y ' s a l g o r i t h m w=1.031 w=1.063 w=1.125 w » 1 Problem z'/z N'/N z'/z N'/N z'/z N'/N z'/z C1 1.013 .194 1.036 .102 1.063 .041 1.11 C2 1 .461 1.059 .181 1.059 .071 1.14 C3 1 .392 1 .219 1 .095 1.06 C4 1 .216 1 .067 1 .016 1.03 C5 1 .251 1 .099 1 .026 1.38 C6 1 .289 1 .132 1 .042 1.12 C7 1 .279 1 .094 1 .034 1.14 Table 3.6: SCPKA* performance r e l a t i v e to SCPA* 74 b e s t . With t h i s f i g u r e i n mind, the r e s u l t s o b t a i n e d by the Greedy H e u r i s t i c are not im p r e s s i v e . In f a c t f o r problem C5, the h e u r i s t i c must have found one of the worst f e a s i b l e s o l u t i o n s . However, SCPHA i n most cases found the optimal s o l u t i o n by examining a much sm a l l e r search t r e e than the op t i m a l v e r s i o n . For example, with a weighting f a c t o r of w=1.l25, a search t r e e about 5% of the s i z e of the optimal a l g o r i t h m was examined and yet the optimal s o u t i o n was found i n 5 of 7 c a s e s . In a l l cases, z'/z was l e s s than the p r e d i c t e d upper bound of w. A s i g n i f i c a n t performance i n c r e a s e i s a l s o recorded f o r s m a l l e r v a l u e s of the weighting f a c t o r . With w=1.031 the search t r e e was reduced by about 70%, and 6 of 7 cases obtained the o p t i m a l s o l u t i o n . 3.7 Summary T h i s chapter has i n t r o d u c e d two new a l g o r i t h m s f o r the set p a r t i t i o n i n g problem and one f o r the set c o v e r i n g problem. E v a l u a t i o n of the p a r t i t i o n i n g a l g o r i t h m s was made by comparing t e s t r e s u l t s with those o b t a i n e d using the P i e r c e and Lasky a l g o r i t h m . P i e r c e and Lasky's a l g o r i t h m was chosen because, as shown i n S e c t i o n 3.5, i t r e p r e s e n t s one of the best algorithms i n the l i t e r a t u r e . A l s o i n t r o d u c e d i n t h i s chapter was the idea of u s i n g a smal l program to generate random t e s t problems. T h i s 7 5 makes the test problems more portable because only a small program and a l i s t of input parameters has to be s p e c i f i e d . The unattractive a l t e r n a t i v e would be to e x p l i c i t l y l i s t many large problems. The new algorithms have one common feature, Bounding Algorithm I . This new bounding procedure calculates the lower bound cost of the lin e a r subproblem, but unlike the linear solution i t i s simple and e f f i c i e n t to implement. The bound i s much stronger than that obtained from the relaxed knapsack problem used in the Pierce and Lasky algorithm. This fact i s i l l u s t r a t e d by the test r e s u l t s . The f i r s t p a r t i t i o n i n g algorithm described (SPPBB) i s a branch-and-bound type algorithm. Most search algorithms in the l i t e r a t u r e tend to be branch-and-bound algorithms because they are well understood and easy to implement. However, the second algorithm (SPPA*) i s smarter than the more conventional approach. It uses a more complicated branching strategy based on the A* algorithm. The A* algorithm conducts an optimal search because, for a given branching strategy, i t examines the minimum number of candidate problems. Test r e s u l t s show that the new p a r t i t i o n i n g algorithms perform exceptionally well compared to the Pierce and Lasky algorithm. The perfomance r a t i o increasingly favored the new algorithms as the problems became more d i f f i c u l t . . In some cases the r a t i o exceeded 50. In comparing the new algorithms, SPPA* generally performed much better than SPPBB. However, optimal 76 r e s u l t s from SPPA* c o u l d only be guaranteed f o r the smal l and medium s i z e problems. For the more d i f f i c u l t problems the e n t i r e c a n d i d a t e l i s t was to l a r g e to save. Thus sub-optimal r e s u l t s were sometimes o b t a i n e d . H e u r i s t i c procedures f o r s o l v i n g p a r t i t i o n i n g problems are very important because of the extreme d i f f i c u l t y of s o l v i n g very l a r g e problems. E x i s t i n g a l g o r i t h m s do not lend themselves towards t h i s a p p l i c a t i o n . However, SPPA* i s a H e u r i s t i c Path A l g o r i t h m (HPA) and i s n a t u r a l l y s u i t e d t o t h i s a p p l i c a t i o n . T e s t r e s u l t s showed that good sub-optimal r e s u l t s c o u l d be o b t a i n e d by examining only a small f r a c t i o n of the op t i m a l s e a r c h t r e e . Furthermore, the h e u r i s t i c v e r s i o n of SPPA* i s p a r t of a unique two stage a l g o r i t h m proposed i n t h i s c h a p t e r . SPPA* i s used to q u i c k l y f i n d an upper bound on the c o s t of the op t i m a l s o l u t i o n . T h i s c o s t i s passed t o the SPPBB a l g o r i t h m which s o l v e s the problem by examining a near minimal search t r e e . T h i s procedure i s an e x c e l l e n t way of combining the s t r e n g t h s of both the A* and branch-and-bound a l g o r i t h m s . A c o v e r i n g a l g o r i t h m , SCPA*, was a l s o i n t r o d u c e d i n t h i s c h a p t e r . I t i s an A* a l g o r i t h m and i t s performance on small t o medium s i z e problems appears to be comparable to other a l g o r i t h m s . The main advantage of SCPA* i s i t s HPA c h a r a c t e r . L i k e the p a r t i t i o n i n g case, e x i s t i n g c o v e r i n g a l g o r i t h m s do not le n d themselves to h e u r i s t i c searches. Test r e s u l t s i n d i c a t e t h a t good sub-optimal s o l u t i o n s c o u l d be -found using SCPA*. 77 CHAPTER 4 S p e c i a l Hardware f o r the S o l u t i o n of P a r t i t i o n Problems T h i s chapter w i l l i n v e s t i g a t e the design and f e a s i b l i l i t y of a hardware system f o r s o l v i n g the set p a r t i t i o n i n g problem. The m o t i v a t i o n f o r t h i s i n v e s t i g a t i o n a r i s e s from the time and re s o u r c e s r e q u i r e d to s o l v e l a r g e SPPs. For most a p p l i c a t i o n s i t i s i m p r a c t i c a l to s o l v e l a r g e SPPs on a time-shared mainframe computer because CPU time i s a v a l u a b l e r e s o u r c e . Furthermore, the e l a p s e s o l u t i o n time on a time-shared machine i s u s u a l l y much longer than the t o t a l CPU time, so turn around time can be p r o h i b i t i v e l y l o n g . The a l t e r n a t i v e to be examined here i s a mi c r o p r o c e s s o r based system, supplemented with s p e c i a l hardware, to s o l v e these problems. To be p r a c t i c a l the system should be r e l a t i v e l y economical, implying t h a t the s p e c i a l hardware should be r e l a t i v e l y simple. To be e f f i c i e n t , the system should s o l v e p a r t i t i o n i n g problems i n about the same time as a mainframe computer. Such a system would be a co s t e f f e c t i v e method of h a n d l i n g many l a r g e problems. 78 4.1 Some C o n s i d e r a t i o n s The a l g o r i t h m s of Chapter 3 make p o s s i b l e the d e f i n i t i o n of a system as p r e v i o u s l y d e s c r i b e d . In p a r t i c u l a r , the SPPBB a l g o r i t h m w i l l be c o n s i d e r e d s i n c e i t i s more e a s i l y implemented. SPPBB i s a r e l a t i v e l y simple a l g o r i t h m , being an " a d d i t i v e " type a l g o r i t h m [ 2 ] , Integer a r i t h m e t i c i s used, a v o i d i n g the numerical i n s t a b i l i t i e s that can a r i s e with f l o a t i n g p o i n t c a l c u l a t i o n s . Thus i t would be p o s s i b l e to d i r e c t l y t r a n s l a t e the programs w r i t t e n f o r the Amdahl 470 V/8 ( i n IBM 360/370 assembler language) i n t o programs aimed at a c o n v e n t i o n a l m i c r o p r o c e s s o r system. T h i s i s . c e r t a i n l y an economical approach, but i t i s i n e f f i c i e n t because of the slower c y c l e time and the s m a l l e r i n s t r u c t i o n set of the microprocessor system. Furthermore, the s m a l l e r word s i z e would decrease the p a r a l l e l i s m with which the b i n a r y data c o u l d be handled. Consequently, a c o n v e n t i o n a l microprocessor system c o u l d be expected to perform more than an order of magnitude slower than a mainframe system, too slow f o r the computational bound job c o n s i d e r e d here. A b i t - s l i c e m i c roprocessor design would overcome some of the p r e v i o u s problems. I t would be f a s t e r due to the s h o r t e r c y c l e time, a l a r g e r word s i z e c o u l d be c o n s t r u c t e d , and s p e c i a l microcoded i n s t r u c t i o n s would f u r t h e r improve e f f i c i e n c y . In f a c t , f o r a d e d i c a t e d system such as the one c o n s i d e r e d here, time i n t e n s i v e r o u t i n e s c o u l d be implemented e n t i r e l y i n microcode. 79 A more powerful a r c h i t e c t u r e i s p o s s i b l e i f simple hardware i s added to the b i t - s l i c e d e s i g n . T h i s design s t r a t e g y i s suggested by s e v e r a l observations•. F i r s t , most problem data i s manipulated i n u n i t s c o r r e s p o n d i n g to the columns in the p a r t i t i o n problem. A l a r g e data word would s t o r e the column i n as few words as p o s s i b l e . Remember that l a r g e p a r t i t i o n i n g problems can have s e v e r a l hundred rows. Another o b s e r v a t i o n i s t h a t about 75% of the e x e c u t i o n time f o r the t e s t problems i n T a b l e 3.2 was spent computing a lower bound. Furthermore, most problems of Table 3.2 are s m a l l e r than the word s i z e of the host machine, the Amdahl 470 V/8. That i s , most problems have fewer than 32 rows, thus problem data i s manipulated e f f i c i e n t l y i n s i n g l e i n s t r u c t i o n o p e r a t i o n s . For l a r g e r problems (m > 32), the bounding procedure w i l l consume a g r e a t e r precentage of the time due to the bookkeeping time r e q u i r e d to manipulate columns s t o r e d i n s e v e r a l data words. I t i s i n the bounding procedure where the vast m a j o r i t y of the problem data i s manipulated, so c l e a r l y performance enhancements there w i l l have the g r e a t e s t o v e r a l l e f f e c t . Another o b s e r v a t i o n concerns a small but time consuming e x e r c i s e i n the SPPBB a l g o r i t h m . T h i s i n v o l v e s counting the number of rows in a column, or at the machine l e v e l the number of '1' b i t s i n a word of d a t a . On a g e n e r a l purpose machine t h i s i s an awkward computation, but i t i s a computation that can be e a s i l y handled with s p e c i a l hardware. 80 4.2 S p e c i a l Hardware F i q u r e 4.1 i s a block diagram f o r a b i t - s l i c e m i c r o p r o c e s s o r d e s i g n , supplemented with s p e c i a l hardware, that c o u l d be used t o s o l v e the set p a r t i t i o n i n g problem u s i n g the SPPBB a l g o r i t h m . Program memory and data memory has been p h y s i c a l l y separated. Program memory i s organized c o n v e n t i o n a l l y and c o n t a i n s a l l the code d e f i n i n g the a l g o r i t h m p l u s numeric data such as the value of the c o s t and cost-performance r a t i o f o r every v a r i a b l e . B i n a r y v e c t o r data, such as the columns of the c o n s t r a i n t matrix and the value of -VjeS*Aj f o r every l e v e l of the branch-and-bound search, i s s t o r e d i n the data memory. The bus between data memory and the s p e c i a l hardware i s very wide, as wide as the data word width. N o t i c e t h a t the data memory/special hardware bus i s not shared with any other d e v i c e s . T h i s makes the i n t e r f a c e between the two u n i t s r e l a t i v e l y simple to implement. The f u n c t i o n of the s p e c i a l hardware i s t w o - f o l d . F i r s t , i t f a c i l i t a t e s the e f f i c i e n t m a n i p u l a t i o n of l a r g e u n i t s of data with s i n g l e i n s t r u c t i o n s . Secondly, i t permits the e f f i c i e n t e x e c u t i o n of the bounding procedure. F i g u r e 4.2 i s a flowchart of Bounding A l g o r i t h m I (page 40). S i s the c u r r e n t p a r t i a l s o l u t i o n , R i s a b i n a r y v e c t o r i n d i c a t i n g the rows covered by S, and T i s the rows to be covered by the bounding procedure. The s p e c i a l hardware i s i l l u s t r a t e d i n F i g u r e 4.3. I t has been d i v i d e d i n t o two p a r t s , one c o n t a i n i n g s p e c i a l l o g i c and the other performing the summation, E^a^-T; ) , i n d i c a t e d by 81 Program Memory Data Memory BUS ^ S p e c i a l Hardware - c o n t r o l - c o n t r o l status-<C BUS ~p pP C o n t r o l l e r IT e x t e r n a l i n t e r f a c e s i g n a l s JL host computer i n t e r f a c e F i g u r e 4.1: A system f o r s o l v i n g p a r t i t i o n i n g problems Begin G i v e n : S * = { j j Xj = 1 } R=Vj6S* Aj , k I n i t i a l i z e : LBC(S)=0 T= e-R yes yes i , LBC(S)=oo R e t u r n F i q u r e 4 . 2 : F l o w c h a r t o f Bound ing A l g o r i t h m I 83 R T r e g i s t e r s L o g i c ([ BUS y | A k A T l o a d A w R update T R Co - encoded sum JJP i n t e r f a c e s i g n a l s F i g u r e 4.3: S p e c i a l Hardware row i ^ ) - f ( A f c l A T t ) F i g u r e 4.4: A row of t h e s p e c i a l h a r d w a r e 84 |A KAT| i n F i g u r e 4.2. The l o g i c has access to three r e g i s t e r s c o r r e s p o n d i n g to the v a l u e s of the b i n a r y v e c t o r s A k, R, and T i n the bounding procedure. These r e g i s t e r s are c o n t r o l l e d through s i g n a l s generated by the m i c r o p r o c e s s o r . The l o g i c i l l u s t r a t e d i n F i g u r e 4.3 i s a b i t - s l i c e component. That i s , a s i n g l e u n i t c o u l d be designed t o handle say, 16 rows of d a t a , and i n d i v i d u a l u n i t s c o u l d be cascaded t o b u i l d a f u n c t i o n a l l y i d e n t i c a l u n i t with an i n p u t width the s i z e of the data bus. Consider now the f u n c t i o n of the s p e c i a l hardware i n computing the lower bound. F i r s t , the R r e g i s t e r i s loaded by a d d r e s s i n g the a p p r o p r i a t e l o c a t i o n i n data memory, and l a t c h e d w i t h the 'R l o a d ' s i g n a l from the c o n t r o l l e r . A k i s loaded i n a s i m i l a r manner. The i n i t i a l v a lue of the T r e g i s t e r was loaded at the same time as the R r e g i s t e r . I f s i g n a l C„ i s t r u e (=1) the value of |A kAT| i s f e t c h e d from the counter l o g i c , LBC(S) i s updated, and the T r e g i s t e r i s updated u s i n g the 'T l o a d ' s i g n a l . Program flow then c o n t i n u e s as shown i n F i g u r e 4.2. The s p e c i a l l o g i c must perform four o p e r a t i o n s . I t must compute | Ak A T | , i n d i c a t e i f AlcAR=0 i s t r u e , update T by T=T-(A KAT), and update R by R=(AKV R). Consider the l o g i c f o r row i i n F i g u r e 4.3. T h i s i s i l l u s t r a t e d i n F i g u r e 4.4. Each of the r e c t a n g u l a r boxes r e p r e s e n t s c e l l i of the l a b e l e d r e g i s t e r . The l a s t three o p e r a t i o n s are performed -by the given l o g i c . The remainder of the l o g i c , t h a t c o u n t i n g E ^ l . ( a , - K - ) , c o u l d be implemented as p a r t of the p r e v i o u s l y d e s c r i b e d l o g i c , 85 but i s more e a s i l y implemented using a v a i l a b l e l o g i c . That i s , the hardware c o u l d be b u i l t using encoders and simple ROM lookup t a b l e s . 4.3 A Few D e t a i l s A few d e t a i l w i l l now be d i s c u s s e d . Because i t i s not the i n t e n t i o n of t h i s t h e s i s to b u i l d any hardware, d e t a i l s w i l l be g e n e r a l i z e d to account f o r the v a r i o u s performance t r a d e - o f f s t hat are inherent i n any d e s i g n . The CPU has been d e s c r i b e d as a b i t - s l i c e m i c r o p r o c e s s o r . T h i s c h o i c e improves throughput by the use of s p e c i a l microcoded m a c r o i n s t r u c t i o n s to implement the a l g o r i t h m . B i t - s l i c e CPUs t y p i c a l l y operate with c y c l e times around 250 nsec. That i s , four microcoded i n s t r u c t i o n s are executed every microsecond, so simple microcoded r o u t i n e s can execute f a s t e r than u s i n g the hardwired i n s t r u c t i o n s of a standard m i c r o c p r o c e s s o r . A l s o , the proposed hardware can be e f f i c i e n t l y c o n t r o l l e d by implementing the c o n t r o l s i g n a l s i n microcode. System performance i s improved by the c o n s t r u c t i o n of a wide data bus to e x p l o i t the inherent p a r a l l e l i s m i n the problem. Bus width, i m p l y i n g data word width, c o u l d be s e v e r a l times t h a t of the 32 b i t word used by most l a r g e general purpose computers. For l a r g e problems, those having more rows than the width of the data bus, data would have to be m u l t i p l e x e d i n t o the s p e c i a l hardware. For t h i s case, the s p e c i a l l o g i c would have to be designed to handle the m u l t i p l e x e d data word. 86 The b i t - s l i c e p r o c e s s o r would be a c o n v e n t i o n a l design except f o r the extended microcode to implement the c o n t r o l s i g n a l s to the s p e c i a l hardware. For the numeric data a 24 b i t data word i s needed to ins u r e s u f f i c i e n t accuracy. Thus a 24 b i t ALU i s a component i n the b i t - s l i c e d e s i g n . A r i t h m e t i c can be handled adequately i n microcode because simple unsigned i n t e g e r a d d i t i o n , s u b t r a c t i o n , and m u l t i p l i c a t i o n i s used i n the SPPBB a l g o r i t h m . M u l t i p l i c a t i o n can be handled e f f i c i e n t l y because one of the operands i s always s m a l l , say an 8 b i t number. For i n s t a n c e , |Ak/vT|- CPR(k) can be t r e a t e d as an 8x24 b i t m u l t i p l i c a t i o n , with a 24 b i t r e s u l t , because |A k| i s a r e l a t i v e l y small number, even f o r l a r g e problems. Custom i n t e g r a t e d c i r c u i t s are g e n e r a l l y expensive to b u i l d . However the hardware d e s c r i b e d i n t h i s chapter i s a simple a r r a y of l o g i c gates coupled with three data r e g i s t e r s . The s i m p l i c i t y i m p l i e s low development c o s t s . Implementation c o s t s c o u l d be kept low using semi-custom, perhaps gate a r r a y , technology. The proposed system must handle l a r g e problems. Suppose that the maximum problem s i z e w i l l be rn=500, n=lO,000. Then, 650 Kbytes of data memory i s r e q u i r e d . Program memory w i l l be dominated by the space r e q u i r e d to c o n t a i n the numeric data f o r the l a r g e s t problem. For the above case 100 Kbytes should be s u f f i c i e n t . Semiconductor memory i s r e l a t i v e l y inexpensive and p r i c e s c o n t i n u e to f a l l . Thus the p r i c e of memory should not s e r i o u s l y a f f e c t the economy of the system. 87 The hardware c o s t s f o r a system s i m i l a r to the one d e s c r i b e d should not exceed $15,000. The time r e q u i r e d t o develop a p r o t o t y p e system would be about 4 to 6 man-months f o r an engineer e x p e r i e n c e d i n the area of d i g i t a l systems d e s i g n . A n . a d d i t i o n a l two months would be r e q u i r e d to debug and e v a l u a t e the system. The important f a c t to be i n f e r r e d from the p r e v i o u s f i q u r e s i s that the system would be r e l a t i v e l y inexpensive t o b u i l d . With the throughput of the system comparable to that of a l a r g e mainframe, the proposed system should enjoy a pric e - p e r f o r m a n c e advantage c l o s e to two ord e r s of magnitude b e t t e r than the l a r g e r system. C l e a r l y then, the proposed system would be a c o s t e f f e c t i v e way to s o l v e l a r g e and d i f f i c u l t p a r t i t i o n i n g problems and should be of i n t e r e s t t o anyone s o l v i n g many of these types of problems. 88 CHAPTER 5 Summary and Co n c l u s i o n s T h i s chapter w i l l summarize the r e s u l t s obtained i n Chapters 3 and 4. Suggestions f o r f u r t h e r r e s e a r c h w i l l be made. 5.1 Summary Chapter 3 i n t r o d u c e d two new a l g o r i t h m s f o r the s o l u t i o n of the set p a r t i t i o n i n g problem. The f i r s t , SPPBB, was a branch-and-bound a l g o r i t h m . The second, SPPA*, was an enumerative, h e u r i s t i c path a l g o r i t h m (HPA). Both are r e l a t i v e l y simple, compact a l g o r i t h m s compared to l i n e a r programming based a l g o r i t h m s . SPPA*, being a HPA a l g o r i t h m , i s guaranteed to f i n d the o p t i m a l s o l u t i o n by examining the minimum number of nodes f o r the given bounding s t r a t e g y (Bounding A l g o r i t h m I ) . I t i s more s o p h i s t i c a t e d than the branch-and-bound a l g o r i t h m because of the a d d i t i o n a l l o g i c r e q u i r e d to handle the l a r g e candidate l i s t g e n erated. For l a r g e problems the candidate l i s t can become too l a r g e to s t o r e , at which p o i n t h e u r i s t i c s o l u t i o n s were obtained by pruning the ca n d i d a t e l i s t . Experience shows that such 89 h e u r i s t i c techniques o f t e n y i e l d optimal s o l u t i o n s . More s o p h i s t i c a t e d pruning h e u r i s t i c s would i n c r e a s e the p r o b a b i l i t y of f i n d i n g an o p t i m a l s o l u t i o n . T h i s remains an area f o r f u r t h e r i n v e s t i g a t i o n . SPPA* r e p r e s e n t s a departure from p r e v i o u s a l g o r i t h m s seen i n i n t e g e r l i n e a r programming. I t s success i n t h i s p a r t i c u l a r a p p l i c a t i o n suggests that other i n t e g e r programming a l g o r i t h m s may b e n e f i t by adopting an A* branching s t r a t e g y . Performance e v a l u a t i o n of the new a l g o r i t h m s was made by comparing t e s t r e s u l t s with those o b t a i n e d u s i n g the P i e r c e and Lasky's branch-and-bound a l g o r i t h m . P i e r c e and Lasky's a l g o r i t h m was chosen because of i t s s i m i l a r i t y t o the new a l g o r i t h m s , and because i t was one of the b e t t e r p a r t i t i o n i n g a l g o r i t h m s a v a i l a b l e i n the l i t e r a t u r e . Test problems were generated using the program l i s t e d i n Appendix A, and the input data l i s t e d i n Appendix B. T h i s approach was used so that the numerous l a r g e t e s t problems would not have to be e x p l i c i t l y l i s t e d , which a i d s i n the p o r t a b i l i t y of the t e s t problems f o r those wishing to compare r e s u l t s . Test r e s u l t s u sing SPPBB, SPPA*, and the P i e r c e and Lasky a l g o r i t h m are l i s t e d i n S e c t i o n 3.6.1. R e s u l t s show that the new a l g o r i t h m s perform s i g n i f i c a n t l y b e t t e r than the P i e r c e and Lasky a l g o r i t h m , p a r t i c u l a r i l y f o r the more d i f f i c u l t problems. In some cases the performance r a t i o exceeded 50. Comparing the new a l g o r i t h m s , SPPA* found the optimal s o l u t i o n , on average, i n l e s s than h a l f the time of the SPPBB a l g o r i t h m . However, f o r 90 the l a r g e r problems only SPPBB c o u l d guarantee an optimal s o l u t i o n , although SPPA* d i d f i n d the optimal s o l u t i o n i n most ca s e s . The h e u r i s t i c behavior of the SPPA* a l g o r i t h m as a HPA a l g o r i t h m was i n v e s t i g a t e d by adding a weighting f a c t o r to the bounding f u n c t i o n . R e s u l t s were very good, suggesting a two stage approach f o r f i n d i n g the optimal s o l u t i o n to very l a r g e p a r t i t i o n i n g problems. During the f i r s t stage SPPA* i s used to o b t a i n a good sub-optimal s o l u t i o n i n minimal time. T h i s s o l u t i o n i s an upper bound on the optimal c o s t . I t i s passed to the SPP a l g o r i t h m , which then f i n d s the optimal s o l u t i o n by i n v e s t i g a t i n g a search t r e e much s m a l l e r than the search t r e e generated without an upper bound. Thus, the s t r e n g t h s of the A* and branch-and-bound a l g o r i t h m s have been combined i n t o a unique h y b r i d a l g o r i t h m . P i e r c e and Lasky d e s c r i b e how t h e i r a l g o r i t h m c o u l d be m o d i f i e d to o b t a i n the k-best s o l u t i o n s , r a t h e r than the j u s t the o p t i m a l s o l u t i o n . T h i s approach i s u s e f u l when s i d e c o n s t r a i n t s are not e x p l i c i t l y b u i l t i n t o the SPP. The optimal s o l u t i o n i s the best s o l u t i o n from the k-best which s a t i s f i e s the s i d e c o n s t r a i n t s . SPPA* can be e a s i l y m o d i f i e d to f i n d the k-best s o l u t i o n s . That i s , the f i r s t k s o l u t i o n s d i s c o v e r e d by the a l g o r i t h m are the k-best s o l u t i o n s . Furthermore, performance of the SPPA* a l g o r i t h m compared to branch-and-bound a l g o r i t h m s would i n c r e a s i n g l y favor SPPA* as k i n c r e a s e d . The 'k-best' idea a l s o 91 a p p l i e s to the h e u r i s t i c v e r s i o n s of SPPA*. Of the k-best s o l u t i o n s , the best s o l u t i o n may be the o p t i m a l s o l u t i o n . I n c r e a s i n g k would i n c r e a s e the p r o b a b i l i t y t h a t the o p t i m a l s o l u t i o n would be found, even though o n l y a f r a c t i o n of the o p t i m a l search t r e e may have been examined. These ideas have yet to be s t u d i e d i n d e t a i l . T est r e s u l t s confirmed the use of reduced c o s t s f o r s o l v i n g the SPP. N e v e r t h e l e s s , s o l v i n g the l i n e a r problem t o o b t a i n the reduced c o s t s , a step shared by most SPP a l g o r i t h m s , i s o f t e n a major computational b o t t l e n e c k . From S e c t i o n 2.9 i t i s c l e a r t h a t a good sub-optimal l i n e a r s o l u t i o n to the SPP would be s u f f i c i e n t to c a l c u l a t e the reduced c o s t s . T h i s suggests t h a t the b o t t l e n e c k c o u l d be reduced u s i n g one of the methods d i s c u s s e d i n the survey of S e c t i o n 2.7. For i n s t a n c e , estimated v a l u e s of the l i n e a r dual v a r i a b l e s c o u l d be o b t a i n e d u s i n g Bounding A l g o r i t h m I or I I , then improved u s i n g the i t e r a t i v e s ubgradient o p t i m i z a t i o n methods i n v e s t i g a t e d by E t c h e b e r r y [10,11], Thus, good reduced c o s t s c o u l d be o b t a i n e d i n l e s s time than i t takes to s o l v e the l i n e a r problem. T h i s remains an area f o r f u r t h e r r e s e a r c h . To s o l v e set c o v e r i n g problems, a m o d i f i e d v e r s i o n of the SPPA* a l g o r i t h m , SCPA*, was i n t r o d u c e d . A l i m i t e d e v a l u a t i o n of the a l g o r i t h m suggests that f o r s m a l l and i n t e r m e d i a t e s i z e problems the performance of the a l g o r i t h m i s comparable to other SCP a l g o r i t h m s . H e u r i s t i c SCPA* s o l u t i o n s were a l s o examined and r e s u l t s were good. 92 The advantage of the new set c o v e r i n g a l g o r i t h m over the t r a d i t i o n a l l i n e a r programming based a l g o r i t h m s i s i t s r e l a t i v e s i m p l i c i t y and e f f i c i e n t usage of r e s o u r c e s . For example, l i n e a r programming manipulates a l l problem data as f l o a t i n g p o i n t numbers. However, SCPA* s t o r e s and manipulates c o n s t r a i n t data as b i n a r y v a l u e s packed i n data word, and i n t e g e r a r i t h m e t i c i s used f o r c o s t c a l c u l a t i o n s . Chapter 4 i n t r o d u c e d an inexpensive microprocessor based system t h a t c o u l d be used to s o l v e the p a r t i t i o n i n g problem. The design i n c l u d e s a wide data bus, and s p e c i a l hardware, to e x p l o i t the p a r a l l e l i s m i n the SPPBB a l g o r i t h m . The system would be o r g a n i z e d as a p e r i p h e r a l d e v i c e a t t a c h e d to the host computer. The system would be c o s t e f f e c t i v e because i t c o u l d s o l v e l a r g e p a r t i t i o n i n g problems i n about the same time as mainframe computer, but without u s i n g the expensive r e s o u r c e s of a mainframe computer. 5.2 C o n c l u s i o n s In c o n c l u s i o n , t h i s t h e s i s has i n t r o d u c e d two e f f i c i e n t set p a r t i t i o n i n g a l g o r i t h m s that represent an improvement upon a l g o r i t h m s c u r r e n t l y i n the l i t e r a t u r e . Such improvements are important because l a r g e p a r t i t i o n i n g problems have important a p p l i c a t i o n s and yet can take hours to s o l v e on mainframe computers. The two a l g o r i t h m s i n t r o d u c e d each have t h e i r own p a r t i c u l a r advantages. A h y b r i d a l g o r i t h m that combines the two promises to be a s t r o n g technique f o r h a n d l i n g very l a r g e 93 problems. The s t r o n g h e u r i s t i c techniques that have been developed are important due to the NP-complete nature of the problem. The microprocessor based system i n t r o d u c e d showed that these problems can be s o l v e d e f f i c i e n t l y without u s i n g the expensive r e s o u r c e s of a l a r g e computer. T h i s chapter has made s e v e r a l suggestions f o r f u r t h e r r e s e a r c h stemming from t h i s work. Using the r e s u l t s presented i n t h i s t h e s i s i t i s p o s s i b l e that even stronger techniques can be developed to s o l v e p a r t i t i o n i n g problems. 94 B i b l i o q r a p h y 1. Arabeyre, J . P., F e a r n l e y , J . , S t e i g e r , F. C , Teather, W., "The A i r l i n e Crew Scheduling Problem: A Survey," T r a n s p o r t a t i o n S c i e n c e , V o l . 3, No. 2, 1969, pp. 140-163. 2. B a l a s , E., "An A d d i t i v e A l g o r i t h m f o r S o l v i n g L i n e a r Programs with Zero-One V a r i a b l e s " , Operations Research, V o l . 13, 1965, pp. 517-546. 3. B a l a s , E., Padberg, M. W., "On the Set Covering Problem, I I . An A l g o r i t h m f o r Set P a r t i t i o n i n g , " O perations Research, V o l . 23, 1975, pp. 74-90. 4. B a l a s , E., Padberg, M. W., "Set P a r t i t i o n i n g : A Survey," SIAM Review, Vol.18, No. 4, October 1976, pp. 710-761. 5. B a l i n s k i , M. L., Quandt, R. E., "On an Integer Program f o r a D e l i v e r y Problem," Ope r a t i o n s Research, V o l . 12, 1964, pp.300-304. 6. Bellmore, M., R a t l i f f , K. D., "Set Covering and I n v o l u t o r y Bases", Management S c i e n c e s , V o l . 18, No. 3, November 1971, pp. 194-206. 7. C h v a t a l , V., "A Greedy H e u r i s t i c f o r the S e t - C o v e r i n g Problem," Mathematics of O p e r a t i o n s Research, V o l . 14, No.3, August 1979, pp. 233-235. 8. Cobham, A., North, J . H., "Extensions of the Integer L i n e a r Programming Approach to the M i n i m i z a t i o n of Boolean F u n c t i o n s , " Res. Rep. RC-915, IBM, 1963. 9. Day, R. H., "On Optimal E x t r a c t i n g from a M u l t i p l e F i l e Data Storage System: An A p p l i c a t i o n of Integer Programming," Operations Research, V o l . 13, 1965, pp. 482-494. 10. E t c h e b e r r y , J . , "The Set R e p r e s e n t a t i o n Problem," Ph.D. t h e s i s , The U n i v e r s i t y of Michigan, August 1974. 11. E t c h e b e r r y , J . , "The S e t - C o v e r i n g Problem: A New I m p l i c i t Enumeration A l g o r i t h m , " O p e r a t i o n s Research, V o l . 25, 1977, pp. 760-772. 12. G a r f i n k l e , R. S., Nemhauser, G. L., "The S e t - P a r t i t i o n i n g Problem: Set Covering with E q u a l i t y C o n s t r a i n t s , " O p e r a t i o n s Research, V o l . 17, 1969, pp. 848-856. 95 13. G a r f i n k l e , R. S., Nemhauser, G. L., "Optimal P o l i t i c a l D i s t r i c t i n g by I m p l i c i t Enumeration Techniques," Management Sci e n c e , V o l . 16, 1970, pp. B495-B508. 14. G a r f i n k l e , R. S., Nemhauser, G. L., Integer Programming, John Wiley, New York, 1972. 15. Gomory, R. E., "An A l l - I n t e g e r Programming A l g o r i t h m , " IBM Res. Rep. RC-189, January 1960; a l s o I n d u s t r i a l S c h e d u l i n g , P r e n t i c e - H a l l , 1963, pp. 193-206. 16. Hart, P., N i l s o n , P. N., Raphael, B., "A Formal B a s i s f o r the H e u r i s t i c Determination of Minimum Cost Paths," IEEE Trans. Sys. S c i . and Cyber., V o l . SSC-4, No. 2, 1968, pp.100-107. 17. Held, M., Karp, R. M., "The Traveling-Salesman Problem and Minimum Spanning T r e e s : Part I I , " Mathematical Programming 1, 1971, pp. 6-25. 18. Held, M., Wolfe, P., Crowder, H., " V a l i d a t i o n of Subgradient O p t i m i z a t i o n , " Mathematical Programming 6, 1974, pp. 62-68. 19. Lemke, C. E., S a l k i n , H. M., S p i e l b e r g K., "Set Covering by S i n g l e Branch Enumeration with L i n e a r Programming Subproblems," Operations Research, V o l . 19, 1971, pp. 998-1022. 20. Marsten, R. E., "An A l g o r i t h m f o r Large Set P a r t i t i o n i n g Problems," Management Sc i e n c e , V o l . 20, No. 5, 1974. pp. 774-787. 21. Padberg, M. W., "Covering, Packing, and Knapsack Problems," Annals of D i s c r e t e Mathematics 4, 1979, pp. 265-287. 22. P i e r c e , J . F., " A p p l i c a t i o n of C o m b i n i t o r i a l Programming to a C l a s s of All-Zero-One Integer Programming Problems," Management S c i e n c e , V o l . 15, No. 3, 1968, pp. 191-209. 23. P i e r c e , J . F., Lasky, J . S., "Improved C o m b i n a t o r i a l Programming A l g o r i t h m s f o r a C l a s s of All-Zero-One Integer Programming Problems," Management Science, V o l . 19, No. 3, 1973, pp. 528-543. 24. Pohl, I . , " P r a c t i c a l and T h e o r e t i c a l C o n s i d e r a t i o n s i n H e u r i s t i c Search A l g o r i t h m s , " Machine I n t e l l i g e n c e 8, 1977, pp. 55-72. 25. Pyne, I. B., McClusikey, E. J . , "An Essay on Prime Implicant T a b l e s , " SIAM J . , V o l . 9, 1961, pp. 604-631. 96 26. Quine, W. V., "A Way of S i m p l i f y i n g T r u t h F u n c t i o n s , " Am. Math. Mon., V o l . 62, 1955, pp. 627-631. 27. S a l k i n , H. M., Koncal, R. D., "A Pseudo-Dual A l l - I n t e g e r A l g o r i t h m f o r the Set Covering Problem," Dept. of Oper. Res. Tech. Memo. 204, Case Western U., C l e v e l a n d , Ohio, Nov. 1970. 28. S a l k i n , H. M., Koncal, R. D., "A Dual A l l - I n t e g e r A l g o r i t h m ( i n r e v i s e d simplex form) f o r the Set Covering Problem," Dept. of Oper. Res. Tech. Memo. 250, Case Western Reserve U., C l e v e l a n d , Ohio, Aug. 1971. 29. S a l k i n , H. M., Koncal, R. D., "Set Covering by an A l l Integer A l g o r i t h m : Computational E x p e r i e n c e , " J . ACM, V o l . 20(2), A p r i l 1973, pp.189-193. 30. Schrange, L., "A More P o r t a b l e F o r t r a n Random Number Generator," ACM Trans. Math. Software, V o l . 5(2), June 1979, pp. 132-138. 31. S p i e l b e r g , K., "Enumerative Methods i n Integer Programming," Annals of D i s c r e t e Mathematics 5, 1979, pp. 139-183. 97 APPENDIX A  A Program f o r Generating Test Problems The program GENTEST generates the p a r t i t i o n i n g and c o v e r i n g t e s t problems used i n t h i s t h e s i s . The problems are r e p r o d u c i b l e u s i n g the input data given i n Appendix B. L i n e s 15 through 27 and 65 through 106 can be changed to s u i t the input/output format d e s i r e d by the user. 1 2 C C Program GENTEST 3 C - T h i s program generates random but r e p r o d u c i b l e 4. C p a r t i t i o n i n g problems a c c o r d i n g to the input 5 c C r data s u p p l i e d by the user. 7 \— INTEGER P(768)/768*1/,C0ST(768)/768*1/ 8 INTEGER LI ST(1)/'*'/,ELINE(768),PMAX,FX 9 INTEGER EINDEX(128)/128*0/,MAXIX/2147483647/ 10 1 1 12 c INTEGER*2 E(128,150) \~ C Input s i z e and d e n s i t y of c o n s t r a i n t m a t r i x , and 13 1 4 15 C c seed value f o r random number generator. PRINT10 16 READ(5,LI ST) N 17 PRINT20 18 READ(5,LI ST) M 19 PRINT30 20 READ(5,LI ST) D 21 PRINT4 0 22 READ(5,LI ST) IX 23 c 24 10 FORMAT('-','INPUT N: 1) 25 20 FORMAT(' ','INPUT M: ') 26 30 FORMAT(' ','INPUT D e n s i t y : ' ) 27 40 FORMAT(' ','INPUT Seed value:') 28 c 29 c P l a c e a '1' i n every column to ensure there are no 30 31 32 c (-t r i v i a l zero columns. DO 50 J=1,N 33 CALL RAND(IX) 34 I=(IX/(MAXIX/M))+1 35 EINDEX(I)=EINDEX(I)+1 9 8 36 J1=EINDEX(l) 37 50 E ( I , J 1 ) = J 38 C 39 C P l a c e '1's at random i n mat r i x . 40 C EINDEX(I) r e c o r d s no. of '1's i n row. 41 C P(J) r e c o r d s no. of '1's i n column. 42 C 43 NONES=D*N*M-N 44 DO 60 K=1,NONES 45 CALL RAND(IX) 46 I=(IX/(MAXIX/M))+1 47 CALL RAND(IX) 48 J=(IX/(MAXIX/N))+1 49 EINDEX(I)=EINDEX(I)+1 50 J1=EINDEX(I) 51 E ( I , J 1 ) = J 52 60 P(J)=P(J)+1 53 C PMAX=max{P(J)} 54 PMAX=1 55 DO 70 J=1,N 56 70 IF(P(J).GT.PMAX) PMAX=P(J) 57 C 58 C C a l c u l a t e c o s t v e c t o r . COST(J) i s roughly 59 C p r o p o r t i o n a l to P ( J ) . 60 C 61 DO 80 J=1,N 62 CALL RAND(IX) 63 K=(IX/(MAXIX/41)) 64 80 C0ST(J)=(P(J)*2*(100+K-20))/PMAX 65 C 66 C P r i n t c o s t s , 25 v a l u e s per l i n e . 67 C 68 KL=1 69 KH=25 70 NT=(N/25) 71 IF(NT.EQ.O) GOTO 100 72 DO 90 L=1,NT 7 3 WRITE(7,110) (COST(J1),J1=KL,KH) 74 KL=KL+25 75 90 KH=KH+25 76 100 IF(KL.GT.N) GOTO 120 7 7 WRITE(7,110) (COST(J1),J1=KL,N) 78 110 FORMAT(' ',2514) 79 C ';' d e l i m i t s the c o s t s from the c o n s t r a i n t s 80 120 WRITE(7,130) 81 130 FORMATC;') 82 C 83 C PRINT maxtrix E ( i , j ) i n e x p l i c i t 0,1 b i n a r y format. 84 C I f N>128, p r i n t each row i n c o n s e c u t i v e l i n e s 85 C of 128 c h a r a c t e r s maximum l e n g t h . 86 C 87 NT=NT/128 88 DO 190 1=1,M 89 DO 150 J=1,N 99 90 150 ELINE(J)=0 91 NR0W=EINDEX(I) 92 DO 160 J1=1,NROW 93 J=E(I,J1) 94 160 ELINE(J)=1 95 C 96 KL=1 97 KH=128 98 IF(NT.EQ.O) GOTO 180 99 DO 170 L=1,NT 100 WRITE(7,200) (ELINE(J1 ) , J1=KL,KH) 101 KL=KL+128 102 170 KH=KH+128 103 180 IF(KL.GT.N) GOTO 190 104 WRITE(7,200) (ELINE(J1),J1=KL,N) 105 190 CONTINUE 106 200 FORMAT(' ',12811) 107 STOP 108 END 109 C 1 1 0 C 111 C RAND i s a p o r t a b l e no. generator d e s c r i b e d by 112 C Schrange i n "ACM T r a n s a c t i o n s on Mathematical 113 C Software," V o l . 5, No. 2, June 1979. 114 C 115 SUBROUTINE RAND(IX) 116 INTEGER A,P,IX,B15,B16,XHI,XALO,LEFTLO,FHI,K 117 DATA A/16807/,B15/32768/,B16/65536/,P/2147483647/ 1 18 C 119 C Get 15 high order b i t s of IX 120 XHI=IX/B16 121 C Get 16 low order b i t s and form low product 122 XAL0=(IX-XHI*B16)*A 123 C Get 15 high order b i t s of low product 124 LEFTLO=XALO/B16 125 C Form the 31 h i g h e s t b i t s of f u l l product 126 FHI=XHI*A+LEFTLO 127 C Get o v e r f l o past 31st b i t of f u l l product 128 K=FHI/B15 129 C Assemble a l l the p a r t s and p r e s u b t r a c t P 130 IX=(((XAL0-LEFTL0*B16)-P)+(FHI-K*B15)*B16)+K 131 C Add P back i n i f necessary 132 IF(IX.LT.O) IX=IX+P 133 RETURN 134 END 100 APPENDIX B  Input Data f o r the Test Problems The f o l l o w i n g i s a l i s t of the input data to GENTEST which generates the problems used i n t h i s t h e s i s . Problem N M D e n s i t y Seed Value C1 90 30 .110 15 C2 90 30 .235 2 C3 90 30 .37 17 C4 90 50 .235 7 C5 90 40 .235 12 CG 50 30 .235 1 1 C7 70 30 .235 14 P1 1 20 15 .23 8 P2 90 15 .23 22 P3 1 20 30 .075 1 P4 1 20 30 .074 29 P5 200 30 .074 10 P6 400 30 .074 1 P7 400 40 .073 15 P8 400 50 .073 3 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0065574/manifest

Comment

Related Items