TERMINATION AND STORAGE REQUIREMENTS IN A MODEL FOR PARALLEL COMPUTATIONS by IVAN 3CHNAPP Dipl.Ing., Czech Technical University, Prague, 1966 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF . MASTER OF APPLIED SCIENCE in the Department of E l e c t r i c a l Engineering We accept this thesis as conforming to the required standard Research Supervisor Members of the Committee Acting Head of the Department Members of the Department of E l e c t r i c a l Engineering THE UNIVERSITY OF BRITISH COLUMBIA June, 1970 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an advanced d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t h e Head o f my Depar tment o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Depar tment The U n i v e r s i t y o f B r i t i s h C o l u m b i a V a n c o u v e r 8, Canada Date ABSTRACT This paper deals with a grajjh-theoret i c model for p a r a l l e l computations as formulated-by Karp and M i l l e r . A necessary condition and a s u f f i c i e n t condition for self-termination of loops with unit loop gain are presented. For the special case that W=U the necessary and s u f f i c i e n t condition i s derived. A direct procedure for testing termination properties of strongly connected graphs i s presented. A method due to Reiter, for determining the rua ximura storage required for a computation graph, i s extended to cover the general case. TABLE OF CONTENTS ABSTRACT i TABLE OP CONTENTS i i LIST OF ILLUSTRATIONS i i i ACKNOWLEDGMENT i v 1 . INTRODUCTION. 1 1 . 1 The K a r p - M i l l e r model 1 1.2 Research connected with p a r a l l e l computation models 6 2. TERMINATION PROPERTIES OF COMPUTATION GRAPHS 9 2 . 1 The K a r p - M i l l e r algorithm 9 2.2 Some necessary and s u f f i c i e n t c o nditions f o r s e l f - t e r m i n a t i o n of loops 1 7 2.3 A d i r e c t method of t e s t i n g termination p r o p e r t i e s of strong components 25 3 . STORAGE REQUIREMENTS 28 3 . 1 Maximum storage requirement - s p e c i a l case 28 3.2 Maximum storage requirement - general case 33 4 . CONCLUSIONS .40 REFERENCES 42 i i LIST OP ILLUSTRATIONS F i g u r e Page 1 Computation graph f o r Laguerre p o l y n o m i a l s ...5 2 Dependence of the amount of d a t a on the l o o p g a i n 18-19 3 T e r m i n a t i o n of loo p s 24 i i i ACKNOWLEDGMENT I wish to express my gratitude to Dr. E . L. Sigurdson for his guidance and encouragement in carrying out this research, and to Dr. G. F. Schrack for reading the manuscript. I also wish to acknowledge the financial support of the National Research Council. iv CHAPTER 1 INTRODUCTION As s i g n a l propagation speeds represent a serious b a r r i e r to i n c r e a s i n g the speed of s t r i c t l y s equential computers more a t t e n t i o n has been paid i n recent years to'the use of the p a r a l l e l -ism i n t r i n s i c to most computational algorithms. A number of de-signs have appeared which u t i l i z e a number of processors which may simultaneously execute several steps of the computation ( / l / - / 5 / ) , rather than overlapping of subfunctions i n sequential processing. In general, values to be used.in a computation step are the r e s u l t s of previous computation steps. This e s t a b l i s h e s c e r t a i n precedence c o n s t r a i n t s upon the computation steps. A model of such a system, s a t i s f y i n g a p a r t i c u l a r c l a s s of precedence c o n s t r a i n t s , lias been formulated by Karp and M i l l e r /6/. This t h e s i s studies some problems a r i s i n g i n connection with t h i s model, i n p a r t i c u l a r t ermination p r o p e r t i e s (Chapter 2) and storage requirements (Chapter 3). In t h i s chapter we present the model and the r e s u l t s of pre-vious research, and compare i t with other approaches to the p a r a l l e l processing. 1.1 The K a r p - M i l l e r Model The model represents the sequencing of a p a r a l l e l computation by a f i n i t e d i r e c t e d graph. Each node of the graph corresponds to an operation i n the computation (or to a processor assigned to per-form that o p e r a t i o n ) . Each branch represents a f i r s t - i n f i r s t - o u t 1 2 q u e u e o f d a t a d i r e c t e d f r o m o n e p r o c e s s o r t o a n o t h e r . To d e s c r i b e d a t a t r a n s f o r m a t i o n b y p r o c e s s o r s , w i t h e a c h n o d e i s a s s o c i a t e d a s i n g l e - v a l u e d f u n c t i o n d e t e r m i n i n g t h e d e p e n d e n c e o f o u t p u t s o n i n p u t s . T h e e l i g i b i l i t y f o r i n i t i a t i o n o f a n o p e r a t i o n i s d e t e r m i n -e d b y t h e l e n g t h s o f t h e q u e u e s o n b r a n c h e s d i r e c t e d i n t o i t s a s s o c i a t e d n o d e . T h u s a c o m p u t a t i o n i s r e p r e s e n t e d b y a d i r e c t e d g r a p h G c a l l e d a c o m p u t a t i o n g r a p h w h i c h i s g i v e n b y : ( i ) a s e t o f n o d e s n ^ n ^ , . . ( i i ) a s e t o f b r a n c h e s d ^ , . . d ^ , w h e r e a n y g i v e n b r a n c h d ^ i s d i r e c t e d f r o m a s p e c i f i e d n o d e n ^ t o a s p e c i f i e d n o d e n • , ( i i i ) f o u r n o n n e g a t i v e i n t e g e r s A ,U ,¥ a n d T , w h e r e • ° ° P P P P T > ¥ , a s s o c i a t e d w i t h e a c h b r a n c h d . P P P H e r e , A^ g i v e s t h e i n i t i a l n u m b e r o f d a t a w o r d s i n t h e f i r s t - i n f i r s t - o u t q u e u e a s s o c i a t e d w i t h d ; U g i v e s t h e n u m b e r o f 1 P P w o r d s a d d e d t o t h e q u e u e w h e n e v e r t h e o p e r a t i o n Ch a s s o c i a t e d w i t h n ^ t e r m i n a t e s ; V<T g i v e s t h e n u m b e r o f w o r d s remove;: f r o m t h e q u e u e w h e n e v e r t h e o p e r a t i o n 0- i s i n i t i a t e d ; a n d T i s a t h r e s h o l d g i v -. J P i n g t h e m i n i m u m q u e u e l e n g t l i o f d ^ w h i c h p e r m i t s t h e i n i t i a t i o n o f 0-. U p o n i n i t i a t i o n o f 0. o n l y t h e f i r s t W o f t h e T o p e r a n d s f o r J J P P 1 C h a r e r e m o v e d f r o m t h e q u e u e . T h e o p e r a t i o n 0. a s s o c i a t e d w i t h a g i v e n n o d e n . i s e l i g i b l e J J f o r i n i t i a t i o n i f a n d o n l y i f , f o r e a c h b r a n c h d d i r e c t e d i n t o n . , P J t h e n u m b e r o f w o r d s i n t h e q u e u e a s s o c i a t e d w i t h d i s g r e a t e r t h a n o r e q u a l t o T^. A f t e r C h b e c o m e s e l i g i b l e f o r i n i t i a t i o n , ¥ w o r d s a r e r e m o v e d f r o m e a c h b r a n c h d d i r e c t e d i n t o n . . T h e o p e r a t i o n 0. P J J 3 i s then performed. Vhen Ch terminates, words are placed on each branch d directed out from n.. The times required to perform the steps mentioned above are l e f t unspecified by the o r i g i n a l model as presented in /6/. These constraints on i n i t i a t i o n lead to the following defin-i t i o n s of the possible sequences of i n i t i a t i o n s associated with a given computation graph G. Let E be a sequence of nonempty sets ,S,j > • . >S^ r, . . , such that each set S^ T is a subset o f £ l , 2 , . . , whore X is the number of nodes in G. Let x(j,0) - 0, and, for N>0, l e t x(j,N) denote the number of sets S , l<m<N, of which i i s an element, i n ' ' 0 The sequence E is an execution of G i f and only i f for a l l N, the following conditions hold: (i ) i f j C S v,. and G has a branch d directed from n. to J N+l p l n., then A + U x(i,N) - W x(j,N)> T : ( i i ) i f E i s f i n i t e and of length R, then for each j there exists a node n. and a branch d directed from n. to l p l n. such that A + U x(i,R) - W x(j,R)<T . j p p v ' ' p x J ' ' p An execution E of G is calle d a proper execution i f the follow-ing implication holds: ( i i i ) i f , for a l l n. and for every branch d directed from l J p n. to n., A + U x(i,N) - W x(j,N)>T , then j € S„ l j p p ' p J ' p' J li for some R > N. The sequence E may be interpreted as giving a possible temporal sequence of i n i t i a t i o n s of operations throughout the per-formance of the p a r a l l e l computation specified by G; the occurence 4 of S.. denotes the simultaneous i n i t i a t i o n of 0. for a l l }€. S,-. Condition ( i ) states that in order for node n_. to i n i t i a t e J for the x(j,N + l ) - t h time, the queue lengths on i t s input branches must be greater than, or equal, to the respective branch thresholds. Condition ( i i ) defines the circumstance under which an execution terminates, i . e . under which the computation defined by G halts. This computation terminates when every node of G is unable to i n i t i a t e . Condition ( i i i ) requires that a node, i f able to i n i t i a t e , a c t u a lly w i l l do so after some f i n i t e number of i n i t i a t i o n s of other nodes. The following example i l l u s t r a t e s these ideas: Example 1.1 Consider the Laguerre polynomials defined by the recurrence r e l a t i o n L ^-.(x) = (2n + 1 - x)L (x) - n 2L ,(x) n+l ' n v ' n-1 v ' with i n i t i a l conditions L Q(x) - 1 L x(x) = 1 - x We want to compute the values of L n(x) for n = 2,3,..,N and for a given x. A computation graph for this calculation is in F i g . l . For each branch the intermediate result is shown. Branch c o e f f i c i e n t s are assumed to be A=0, U=Vi:=T=l unless otherwise shown. Brandies (n^,n^)' ( i . e . the branch directed from n^ to n^ ) and (n2>n 9) serve as counters; the computation is terminated by the depletion of queues associated with these branches. Node n^ produces 2n and n", and places them on (n,,n~) and (n, ,n-j), respectively. Node n„ takes 5 A=N-1 U=0 ( 2 n+l-x) 2 n n n. A=N-U=0 L n - l ( x > Computation graph for Laguerre polynomials F i g . 1 6 ( l - x ) from the branch ^ 2 ^ 2 ) and adds i t to the other input 2n; the r e s u l t (2n+l-x) i s placed on ( i ^ j n ^ ) . Node n^ forms the product of (2n+l-x) and L n ( x ) and places the r e s u l t on (n^,n^). Node n^ m u l t i -2 p l i e s L n_-^(x) by n and places the r e s u l t on (n^,n_). F i n a l l y , n^ produces the d e s i r e d polynomials L ^ x ) , L ^ ( x ) , . . , L^,(x) and places them on (n^,n^) and (n^,n^). The i n i t i a l data queue on (n^,n^) i s L Q ( X ) = 1 , L ^ ( X ) = 1 - X and on (n^,n 4) i t i s L^(x)=l-x. 1.2 Research connected with p a r a l l e l computation models Karp and M i l l e r fbf show that f o r every proper execution the sequence of data words occuring on any branch of G i s always the same thus ensuring the same computational r e s u l t . This property i s r e f e r r e d to as the determinaoy of a computation graph. A l s o , they give an algorithm to determine whether a computation terminates, and a procedure f o r f i n d i n g the number of performances of each operation i n G. F i n a l l y , they give necessary and s u f f i c i e n t con-d i t i o n s f o r the lengths of data queues to remain bounded. R e i t e r i n h i s Ph.D. t h e s i s fjf addresses himself to the problems of storage, scheduling, and optimum assignment of operat-ions to processing u n i t s . He gives an integer l i n e a r program f o r the determination of the maximum storage required by a computation graph G. He introduces a concept of an admissible schedule d e f i n i n g v a l i d node i n i t i a t i o n times and c h a r a c t e r i z e s the c l a s s of a l l admissible schedules i n the case W =T =1, U -0 or 1. He f u r t h e r shows that i n P P P t h i s case i t i s p o s s i b l e to f i n d a p e r i o d i c admissible schedule which achieves the maximum computation rate (also see / 8 / ) . He also defines the cost of an assignment of node functions to processors 7 and gives a method f o r determining f e a s i b l e s o l u t i o n s when the maxi-mum comjmtation rate has a lower bound. F i n a l l y , he extends the model to incorporate a r e s t r i c t e d form of data dependency without l o s i n g i t s detcrminacy. A d i f f e r e n t approach to the g r a p h i c a l r e p r e s e n t a t i o n of a computation i s taken by Martin / 9 / . He allows two types of node i n -put c o n t r o l . In the case of conjunctive input c o n t r o l a node can i n i t i a t e only i f each branch d i r e c t e d i n t o the node contains at l e a s t one data word. In the case of d i s j u n c t i v e input c o n t r o l a node can i n i t i a t e only i f at least one branch directed into the node contains at l e a s t one data word. S i m i l a r l y , conjunctive output c o n t r o l places one word on each branch d i r e c t e d out from the node and d i s j u n c t i v e output c o n t r o l places one word on a branch d i r e c t e d out from the node according to an a p r i o r i p r o b a b i l i t y . The l a t t e r represents a c o n d i t i o n a l t r a n s f e r which i s d e t e r m i n i s t i c every time i t occurs but over many p o s s i b l e data sets may be modelled p r o b a b i l -i s t i c a l l y . Note that the conjunctive input c o n t r o l and conjunctive out-put c o n t r o l correspond to the T=vr=l case and U=l case i n the Karp - M i l l e r model, r e s p e c t i v e l y . Also note, that because of c o n d i t i o n a l t r a n s f e r s , t h i s model i s not determinate. In h i s work Martin studies the assignment of node comput-ations to processors and t r i e s to minimize the average computation time. Further research of t h i s model can be found i n /lO/, where an approximative method f o r c a l c u l a t i n g the average computation time ± s given, and /ll/, where procedures.are Riven.to determine a lower and an upper bound on the number of processors required f o r maximum 8 p a r a l l e l i s m . An a p p l i c a t i o n o f r e s u l t s o f t h e s e s t u d i e s t o a s s e m b l y - l i n e b a l a n c i n g p r o b l e m s i s g i v e n i n /12/. CHAPTER 2 TERMINATION PROPERTIES OF COMPUTATION GRAPHS Pa r t 2.1 of t h i s chapter i s devoted to a p r e s e n t a t i o n of the K a r p - M i l l e r a l g o r i t h m . The a l g o r i t h m i s used to determine whether the computation s p e c i f i e d by a given computation graph terminates, and to f i n d the number of performances of each o p e r a t i o n i n case the computation terminates. Sev e r a l theorems are given i n p a r t s 2.2 and 2.3 to improve the e f f i c i e n c y of the al g o r i t h m . 2.1 The K a r p - M i l l e r A l g o r i t h m Theorems and Lemmas of s e c t i o n 2.1 are proved i n /6/. Since the number of performances of an ope r a t i o n 0 i s independent of the execu-t i o n considered only i f t h i s e x c u t i o n i s proper, Karp and M i l l e r r e s t r i c t t h e i r a t t e n t i o n to proper executions. A node n_. of a computation graph i s s a i d to terminate i f and only i f ( i n the f o l l o w i n g i f f ) j occurs i n only a f i n i t e number of the se t s of a proper execution of G. N a t u r a l l y , t h i s number i s the same f o r a l l proper executions of G. To f u r t h e r study the te r m i n a t i o n of p r o p e r t i e s of computation graphs we need to introduce a few concepts from graph theory. A d i r e c t e d graph i s c a l l e d s t r o n g l y connected i f f given any p a i r of nodes n. and n. there e x i s t s a d i r e c t e d path from n. to n.. A s p e c i a l case of a s t r o n g l y connected graph i s the t r i v i a l graph which has only one node and no branches. For any d i r e c t e d graph there e x i s t s a unique p a r t i t i o n of 9 10 i t s nodes i n t o equivalence classes as f o l l o w s : Two nodes n. and n. are i n the same clas s i f there e x i s t s a d i r e c t e d path from n^ to n^ , and a d i r e c t e d path from n^ to n^. A subgraph c o n s i s t i n g of the nodes of an equivalence c l a s s and the branches of the o r i g i n a l graph connecting these nodes i s then a s t r o n g l y connected graph. The subgraphs corresponding to the node equivalence classes are maximal s t r o n g l y connected subgraphs of the o r i g i n a l graph and are c a l l e d the strong components of the graph. These strong components play a c r u c i a l r o l e i n the Karp-M i l l e r algorithm. Lemma 2.I Let G' be a s t r o n g l y connected subgraph of a computation graph G.'Then e i t h e r every node of G' terminates or nonedoes. We say that 0' terminates i f every node of G' terminates. Divide a l l the nodes of a computation graph G in t o two classes according to whether they terminate or not. Then by Lemma 2.1 the set S of terminating nodes i s a union of sets of nodes of some strong components of G. The s t r o n g l y connected subgraph G' which would terminate i f i t were i s o l a t e d from the r e s t of G i s c a l l e d s e l f - t e r m i n a t i n g . Let us examine when a strong component i s s e l f - t e r m i n a t i n g . Lemma 2.2 Let d^=(n^,iij) be a branch of a computation graph. Then n^ terminates i f n. terminates. l 11 r s This loads to the following concepts: Let G and G' -be strong components of G. Then define Gr >_ GS i f either Gr. = GS or there exist r s such nodes n - 6 G and n.€ G that there is a directed path from n. 1 j 1 to n . i n G. J Theorem 2.3 s A strong component G of a computation graph G terminates r r i f f there exists G such that G i s self-terminating and G r> G s. s r Thus G terminates i f f i t is self-terminating or some G is s e l f -terminating and there is a directed path from some n . 6 G to some n - £ G b. Therefore to determine the set S we examine strong com-r s r ponents for self-termination in such order that i f G >. G , then G is examined f i r s t . Now the problem is how to determine 'whether a strongly con-nected subgraph G' is self-terminating. For this purpose Karp and M i l l e r use properties of the 1OO|JS contained in G'. A computation graph L is called a loop i f i t consists of d i s t i n c t nodes , n 0 , . , " a n d branches d^,d2,..,d£ such that d^ is directed from to n^_ + 1 > k=l, 2, . . , / , - l , and d^ is directed from n^ to n^ . Here we note that any strongly connected subgraph G' except the t r i v i a l graph contains at least one loop. Theorem 2.4 A strongly connected subgraph G1 is self-terminating i f f G' 12 contains a s e l f - t e r m i n a t i n g loop. Given t h i s theorem the only problem i s how to e s t a b l i s h that a p a r t i c u l a r loop i s s e l f - t e r m i n a t i n g . Let L be a loop with branches d, ,d,,, . . ,d«. The product it U . g= TT rr^ i s c a l l e d the gain of the loop. There are 3 cases: g ^ l , i = l i g=l, and g >1. Loops with g < 1 Theorem 2.5 Any loop L f o r which g < l i s s e l f - t e r m i n a t i n g . Loops w i t l i g=I Theorem 2.6. Let r~ P = 1 1 U ' " l v i u l h V l U l h u i V i h W l h w i , V l h h U2 u i Y ' l ''2 i h-l h 1 ' T.I "1 W2 h j I V , ' l V h-l h-l I ' 1 V i i . I 13 *1 /33 ft A k - i " T k - i + 1 where fo l{ = y— k=2 , 3 ,. . and k-1 / 3 i = A i - T x + 1 Then the necessary condition for self-termination of a loop with g=l is P /3 < 0 Karp and M i l l e r give a necessary and s u f f i c i e n t condition for s e l f -termination in the following special case: Theorem 2.7 If , for l£k^X,, Vi'. = U , =1, then the loop L i s self-terminating i f f ( i . e . Z A < Z (T,-l) ). k=l " k=l k k=l k In case that Theorems 2.6 -and 2.7 cannot be applied Karp and M i l l e r derive an upper bound on the numbers of x^erformances of nodes of a self-terminating loop. Theorem 2.8 Let L be a self-terminating loop with g=l. Let X* be a positive integer solution of the system OC^ = a 14 U 1 OC = - i a 2 M1 a U l U 2 u u ry, - _ —£ U a whore a i s an a r b i t r a r y parameter. Let X = x n X, where X^ i s the number of performances of node n^, k=l,2,.., Then at l e a s t one component of X i s l e s s than the correspond ing component of X*. Loops with g >1 Theorem 2.6 i s v a l i d also f o r t h i s case. I f L i s s e l f - t e r m i n a t i n g then we get the f o l l o w i n g upper bound: X < y i - P fi 1-g '-Having obtained an upper bound f o r X we can t e s t s e l f - t e r m i n a t i o n of a loop by applying a procedure given i n the f o l l o w i n g theorem. 15 Theorem 2.9 # For a l l nodes n.€S the f o l l o w i n g i t e r a t i o n scheme converges J i n a f i n i t e number of steps to X; *(0)<;>=° r _ („) , Ik -T +1+U xKa,{iT x ^ n + 1 ^ ( j ) = max J x ^ ( j ) , (i,p)6Z.j, i € S \ % H e r e ^ j i s the set of ordered p a i r s ( i , p ) such that d^ i s a brancli from node n. to node n.. The r e s u l t s given so f a r may be organized i n t o an algorithm f o r determining which nodes of a computation graph G terminate and, f o r the terminating nodes n.. , computing the number of performances x ( j ) . This algorithm may be o u t l i n e d /6/ as f o l l o w s : Step 1. From among the strong components of the computation graph being considered ( i n i t i a l l y t h i s graph i s G), s e l e c t one which i s not covered by any other subgraph. C a l l i t G'. Step 2. By applying Steps 2A,...,2D given below, t e s t whether G' i s s e l f - t e r m i n a t i n g and, when i t i s , determine x ( j ) f o r each n.£G'. Step 3. Form a new computation graph as f o l l o w s : I f G' i s not s e l f -t e r m i n a t i n g , remove G' and a l l branches i n c i d e n t with nodes of G'. I f G' i s s e l f - t e r m i n a t i n g , replace each branch d from n^G' to n^G' by an "equivalent" branch d^, from n^ to n^ , having U^,=0, A ,=A +U x ( i ) , T ,=T , and W , =V,T . Then remove G' . P P p p p p p Step 4. I f the new computation graph i s nonempty, return to Step 1. Otherwise the a n a l y s i s of termination i s complete. 4 The symbol fx! denotes " l e a s t integer greater than or equal to x." 16 The d e t a i l s of Step 2 are now described. Step 2k. I f G' contains a branch d^ with U =0, go to Step 2D. I f not, determine whether G' contains a loop with g < l . This i s equi-valent to determining whether there i s a loop L sucli that Z log (U / V ) < 0 . d € L P P P This determination can be c a r r i e d out by a shortest-route algorithm given i n /13/. Enumeration of loops i s not required i n t h i s pro-cedure. I f a loop with g < l e x i s t s , go to Step 2D; otherwise, go to Step 2E. Step 2B. Every loop of G' has .g > 1. Determine whether there i s a loop not p r e v i o u s l y considered such that 0> Pyg. I f no such loop e x i s t s , G' i s not s e l f - t e r m i n a t i n g ; r e t u r n to Step 3. I f such a loop L i s found, determine upper bounds on the q u a n t i t i e s x^(k) by the methods given above. These bounds hold, of course, only i f L i s s e l f - t e r m i n a t i n g . Step 2C. Continue applying the i t e r a t i o n scheme of Theorem 2.9, tak-ing S to be set of nodes of G', u n t i l e i t h e r (a) i t terminates, e s t a b l i s h i n g that G' i s s e l f - t e r m i n a t i n g , and g i v i n g x ( j ) f o r each n.£G', or J (b) f o r some n and some k, x^ n^(k) exceeds the upper bound on x ^ ( k ) , e s t a b l i s h i n g that L i s not s e l f - t e r m i n a t i n g . Return to Step 2B. Step 2D. G' i s s e l f - t e r m i n a t i n g . Apply the i t e r a t i o n scheme of Theorem 2.9, tak i n g S to be the set of nodes of G', to obtain x ( j ) f o r each n.£G'.. Return to Step 3. 17 2.2 Some necessary and s u f f i c i e n t conditions for .self-termination of 1o o ps As shown above, the K a r p - M i l l e r algorithm is.based on termin-ation properties of loops. Consequently i t s e f f i c i e n c y depends main-l y on the means available for testing self-termination of loops. In the following we s h a l l derive some theorems testing these proper-t i e s . Theorem 2.10 ' If, for l^k<L,£, p— = 1, then the loop L i s self-terminating b k I Z 1X1 £ o k=l PROOF; By Theorem 6 of /6/ the necessary and s u f f i c i e n t condition for self-termination of a loop is the existence of a nonnegative integer solution of the following system of i n e q u a l i t i e s : , , v A^-Tjt+l+lixU) x( l ) > g A1r-T^+1+U,,x(k) x ( k + l ) > — — \ - . 'for k=l,2 , . . j e-l k This system reduces to x(l)> /3£+x(i) x(k+l)>/3 k+x(k) for k-1, 2 , . . , i - l Since a l l x's are integers, we have 1 8 x(k+l) > f/3j+x(k) for k=l,2 f.. ,1-1 By summing l e f t and right-hand sides of the above i n e q u a l i t i e s , respectively, we get the necessary condition To prove that i t i s also a s u f f i c i e n t condition we -ban.-, show that i f i t i s s a t i s f i e d the system x(l)=C x(k+l)=x(k)+ f / 3 k ] k=l , 2 , . . , 1 - 1 where C i s a s u f f i c i e n t l y large integer i s a nonnegative integer solution of the above i n e q u a l i t i e s . Consider the following examples Example 2 . 1 A - 0 U=l \V=T=2 A=2 Here g=l / 2 . The data d i s t r i b u t i o n after each node performes once i s A ' = 0 U=l W=T=2 T=Vf=l U=l A'=l F i g . 2 1 9 Example 2.2 A = 0 A=l Here g=2. The data d i s t r i b u t i o n a f t e r each node performs once i s A ' = 0 T=W=1 A ' = 2 Example 2.3 A=l. Here g-l. The data d i s t r i b u t i o n a f t e r each node performs once i s A ' = 0 T=Vf=l A'=l Dependence of tlie amount of data on the loop gain Fig.2 20 These examples indicate that (roughly speaking) for g > l the "amount of data" increases, for g<1 i t decreases, and for g=l i t remains constant. The following theorem gives this fact a precise form. Theorem 2.11 „ I u Let L be a loop with gain g - II ^— . Let A. and A^ be the i = l i i n i t i a l and current number of words on the i - t h branch, respectively, for i = l,2,..,j£. Then JL I i f g < l Z c i A i < 21 c.A^ i=l i=l L 5L i f g=l ' Z c.A. = Z c.A 0 . , l i • , I l i=l i=l t L i f g> 1 Z c. A. > Z. c.A FA . , I I . , I I 1=1 i=l where c^ - 1 and c. 4 TT T f -* 1 — for i=2,3,..,i. PROOF; Suppose that W . words are taken from the j-th branch (j'^;C), J and L Tj_^ words are placed on the (j+l)-th branch. Taking into account that c . , = c . d+i - j u. + 1 the change i n the sum F c.A. is ^ . , I l i = l 21 W . c. ,U. , - c.V. = c. rr-L- U. , - C.W. = 0 0+1 J+l J J J J+l J J Now suppose that W^ words are taken from the j2-th branch and U words are placed on the 1-st branch. We have C l ! i ! 2 V i h. cz = u 2 u 3 u z = e w^ and the change in the sum U c i u i - C A = u i - i*x W = V 1 " i> i s positive for g > l , negative for g < 1, and zero for g=l. As a coro l l a r y we get a necessary condition for self-term-ination of a loop, which i s ess e n t i a l l y equivalent to Theorem 2.6. Corollary 2.12 A necessary condition for self-termination of a loop with g > l and i n i t i a l number of words on the i - t h branch A?, i=l,2,.., L i s that T c.A°< J" c. (T.-l) 1=1 1=1 PROOF: If the loop is self-terminating, then after a f i n i t e number of performances we must have 22 A. < T . - l f o r i=l,2,..,1 i - i ' ' ' and T c.A.< > c.( T . - l ) i i . , I X I i = l x x i = l By Theorem 2.11 I I I I c.A°< f c.A. < f c. ( T . - l ) . , I l — I l — I l ' i = l A A i = l x x i = l Now we s h a l l derive a s u f f i c i e n t c o n d i t i o n f o r loops with g=l Lemma 2.13 Let L be a loop with g=l, and W\=T^ f o r i = l , 2 , . . , L e t A? be the i n i t i a l number of data words on the i - t h branch f o r i=l,2,..,£. I f Z c.A. < max c .V . i = l 1 1 I<j<l J J then the loop i s s e l f - t e r m i n a t i n g , PROOF: Let k be such number that Tli en JL max c -V . = c, Vf, c .*—, c, 1 i = l k t— n i ^ k Since 23 I c.A° = I c.A. where A. i s the current number of words on the i - t h branch we have l X* c X c A, < 5" — A. = 7 — A°<W, = T, k c, l c, l k k i=l k i=l k Thus the number of words on the k-th branch w i l l never reach the threshold and the number of performances of node iij. i s zero. By Lemma 2.2 every node of the loop terminates. Hence the theorem. Theorem 2.14 Let L be a loop with g=l. Let A? be the i n i t i a l number of data words on the i - t h branch, i = l,2,..,X. A s u f f i c i e n t condition for self-termination of L is that 0 1 y c.A < y c. (T.-W.) + max c.V . i = i 1 1 i i i 1 1 1 i * g j t J J PROOF; Suppose that L does not terminate. Then after each node performed at least once, A i ^ W i , e - A i = ( V V + A i where A[> 0 for i=l,2,.. If we replace A^ and T^ by A| and T^ =Vr_^ , respectively, the resulting loop L' w i l l have the same termination properties as L, i . e . w i l l not terminate. . Then by Lemma 2.13 24 2_ c'A! > max c'.W. i=l 1 1 l£j*Jt J J Since c!=c for i=1.2,*.,£ f c.A 0 = f c.A. = ? c. (T.-YT. ) + [ c.A! = £ c. (T.-V. ) + J_ c!A!> i T i 1 1 i t i 1 1 1 1 1 i = i 1 1 i = i 1 1 1 i = l 1 x ~ T c(T.-W.) + max c'.W. = T c.(T.-W.) + max c .W. i t l 1 1 1 l<j<Z J J i = l 1 1 1 l*j£ J J which i s a contradiction. To i l l u s t r a t e how strong the conditions of Corollary 2.12 and Theorem 2.14 are consider the following two simple examples; Example 2.4 n4 W=T=2 * U=2 Example 2.5 W=T=1 U=l Here c^=c-j=l c 2=c 4=2 Ic.A 0 = 2 =Ic.(T.-l) i i *~ l x l and the loop terminates A=0 Here c^=c 2=l and max c.W.=1 l<j£2 J J 7c.A?=l= Zc•(T.-W.)+raax c.W. 1 1 1 1 1 l<g£2 J J and the loop does not terminate Termination of loops Fig.3 25 2•3 A dire c t method of testing termination properties of strong components• As noted in /6/, the Karp-Miller algorithm requires the inspec-t i o n of each loop of a strong component G' when G' is not self-term-inating. If G' contains many loops a more direc t way i s desirable. Consider the shortest-route algorithm given in / l 3 / . It is used i n Step 2A of the Karp-Miller algorithm for testing termiation properties (see part 2.1). If we use mu l t i p l i c a t i o n and associate U A with branch d ,'rather than addition and log(U Ar ), then, in p/ p p ° p' p' ' ' • the absence of loops with g < l , the algorithm results in assigning a rat i o n a l number to each node of G'. On multiplying these numbers by the least common product of thei r denominators each node n. w i l l have an integer OC . assigned. l ° l ° If d =(n.,n.) i s a branch of a loop L with g=l, then 06. /oi. =\j A . p i ' j 1 o > j ' i p ' p If L has g > 1, then for one and only one branch do we haveOc^/<X< <U A ; for other branches of L OC./CX.^ U A • P P' J/ i P P Consider now an arbitrary loop L of G' with g=l. If L is s e l f -terminating then by Theorem 2.8 the number of performances i s for at least one node n.€ L less thanOC. . J J We shall show that the same i s v a l i d for loops with g>l. Theorem 2.15 If L i s a self-terminating loop with g> 1, then at least one component of X is less than the corresponding component of X? here X = X, X, X , x*= a l oc 0 OC,, a a ( U 1 A 1 ) a(u 1A 1)(u 2A 2)-(fiV ;i-i ) 26 where a i s an ar b i t r a r y parameter and X^ is the number of jierf ormances of node n^, i=l,2,..,X. PROOF; By Theorem 4 of /6/, X i s the minimum nonnegative solution of (E-A)X>/3 , i . e . 0 1 0 0 u. 0 0 , . 0 0 0 • • • u Jt-1 V l 0 x l V x 2 ^ 2 • > • • • where /3-, = W 1 and fc. A i - 1 - T i - 1 + 1 V l i i — 2 , 3 , . . ,/£ Then (E-A)X* = U. 0 0 0 0 0 0 0 0 u 'l-l l - l 0 a tfl W2 ^1^2 V l W l W l 27 (E-A)X* = a Now consider the vector X-X*. (E-A)(X-X*) = (E-A)X - (E-A)X* > /3 If no component of X is less than the corresponding component of X*, then (X-X*) is a smaller nonnegative integer solution of (E-A)X>/| than X, which is a contradiction. Theorems 2.15, 2.8, and 2.4 serve as a basis for a procedure for testing termination properties of strong components, which may be outlined as follows: Step 1. Apply the shortest-route algorithm modified as shown above. If a loop with g<1 i s found go to Step 2; otherwise continue u n t i l each node n. of the strong component is assigned a constant^. . l ° l Step 2. Apply the i t e r a t i o n scheme of Theorem 2.9 to the nodes of the strong component G1 u n t i l either a) the scheme terminates, establishing that G' is s e l f -terminating and giving the number of performances x(j) for each n . € G 1 , o r 3 b) for some n and some k, x (k) exceeds the upper bound C*k on x(k), establishing that G' i s not self-terminating. 1-g 0 0 0 ^ 0 -; CHAPTER 3 STORAGE REQUIREMENTS In the Karp-Miller model each branch d^=(n^,nj) represents a queue of data words which may be an output of operation 0^ assoc-iated with node n^ and which may serve as an input for operation 0j associated with node n^. Each data word has to be stored in a memory location of a computer performing the computation, and the maximum number of memory locations required becomes of int e r e s t . Chapter 3 i s devoted to th i s problem. 3•1 Maximum storage requirement - special case In this part we present the results of / i f . Let us introduce a branch parameter tr > 0: If d =(n..n.),X r p p i J p i s the fixed time required by node n^ to fetch i t s input data from storage, process these data, and place outputs into memory locations associated with the queue on branch d . Thus i f n. i n i t i a t e s at time p l t, i t places U data words upon branch d at time t+tT . P P P um Another parameter we introduce is T^=max^TJ^| where the maxim is taken over a l l branches d directed out from node n.. P i A schedule is a set 4-{^ -^ ' * * ,(->t} w n e r e each is a funct-ion 6. : {l,2,..,X.} -» R such that for l < k < r £ X . l ^ ( k K d . U ) Here R i s the set of real numbers and X^ i s the number of i n i t i a t i o n s 28 29 of node n^ for any proper execution of G. If X^=0, i s undefined. With each we associate a function xi : R-{o,l,2,..,X.} x.(t)=0 i f f either X.=0 or X . i l and t < 4 . ( l ) . 1 x 1 l l For l < k < X i , x t(t)=k i f f 6 (k) ^ t< 6 (k+l) For X. >1, x.(t)=X. i f f <S-(X.)<t. I ' i i i v i For every branch d =(n.,n.) define P i j b d(t)=A +U x.(t-T )-W (x.(t)-E.(t)) p P p i p p J J where £.(t)=l i f there exists k, l ^ k ^ X . such that 6.(k)=t £..(t)=0 otherwise. A schedule is c a l l e d an admissible schedule i f , for j=l,2,..,X <S.(k)=t b 6 ( t ) > T j P P for a l l branches d into n.. and for a l l k, 1 £ k £ X . . P J J A schedule <o is sequential i f for no nodes n., n., with 1 i ' j n . . do we have i J 6.(k)< d.(r)< 6 . (k)+T. I J I I , f o r l £ k £ X . , l £ r < X . . i J These d e f i n i t i o n s are to be interpreted as follows: <^(k)=t means that node n^ begins i t s k-th i n i t i a t i o n at time t under the schedule (J. x^(t) is the number of i n i t i a t i o n s of node I K , up to and including time t, under the schedule <i. b^(t) is the number of data words on branch d^ at time t. The quant-i t y £.(t) is introduced for the following reason: A l l data transmit-J ted to node n^ by node n^ v i a the branch (n^,n^) must f i r s t pass 30 through storage. Then i f t i s a time at which n. does not i n i t i a t e , the storage requirement at time t i s b 6 ( t ) = A +U x . ( t - f )-W x . ( t ) . P P P i • P P J If n^ i n i t i a t e s at time t, the number of data words in storage at time t i s b 6 ( t ) = A +U x . ( t - r )-V ( x . ( t ) - l ) P P P i P P j An admissible schedule specifies those node i n i t i a t i o n times cor-responding to a proper execution. Thus a node n^ i n i t i a t e s at time t (d>.(k)=t for some l<k<X.) only i f each branch d directed into v l v I ' J p n. contains at least T data words at time t. (b^(t)>T ). F i n a l l y , l p ' p p a sequential schedule is a schedule under which no node i n i t i a t e s at the same time that some other node is executing. For any admissible scheduled, define A> = max T b 6 (t) C 6 t p P jkJ^ thus defines the maximum amount of storage required by the admissible schedule 6. Lemma 3.1 Let 6 be an admissible schedule for a computation graph G. Then there exists an admissible sequential schedule (a.s.s.) 6 such that ^ 6 ' * <^6 This Lemma shows that in general a p a r a l l e l system i s more econom-i c a l than a sequential system in the sense that i t needs fewer storage locations. 31 Let ^U/ = max 6 i s &n admissible schedule^ . ^ i s the maximum number of memory locations that a computation graph G can require. Corollary 3.2 i_, £M = oax( ^/<? is an a.s.s.) For any admissible schedule define T<* = {t/t£ (6.(10,4. (kJ+'E) for 0 < k £ X . , i = l,2,..,/}. are those times during which no node of G is executing under the schedule 6 ; however, nodes of G may just be i n i t i a t i n g or terminat-ing under 6 at some of the times t€ T^. We then have the following r e s u l t : Corollary 3.3 (Ms- wax max { ^ ( " t ) / C J is an a.s.s. and t 6 T^} . 6 t Write b. f ° r ^ n e column vector with p-th component A , W for the column vector with p-th component W , T for the column vector with p-th component T , and, for any admissible schedule <$> define _b^(t) to be the column vector with p-th component b ^ ( t ) , p=l,2,..,t. Define a tXj£, matrix A with elements a .=\V i f branch d i s directed into n. but not also out from n.. PJ P P J J a . =-U i f brancli d is directed out from n. but not also into n.. P J P P J J a . =Vr - U i f branch d i s directed out from n. into n.. P J P P P J J a .=0 otherwise. P J 32 F i n a l l y , l e t X be a column vector with i - t h component X^, i=l,2,.., Theorem 3.4 # Let G s a t i s f y b.>T-W. Then is determined by the following integer linear program: |b| - minU ^ I subject to 0 <y_< X where each y^, i=l,2,..£, i s an integer. Theorem 3.4 enables us to determine the maximum amount of storage required for a given computation graph, provided b>T_-W. What i f thi s is not sa t i s f i e d ? We quote from /l/: "In those cases where this inequality is violated, the program /of Theorem 3.4/ i s inapplicable. Under such contingency one pos-s i b l e course of action i s to simulate a l l possible admissible schedules for G u n t i l a d i s t r i b u t i o n of data i s obtained which s a t i s f i e s the above i n e q a l i t y , and then apply the program /of Theorem 3.4/ to this data d i s t r i b u t i o n . Then the maximum storage requirement is either that obtained through the simulation phase, or that obtained by the program, whichever i s the greater. In general, however, such a scheme would be impractical due to the pot e n t i a l l y large number of possible d i f f e r e n t simulations i n -volved. " n # If ii i s a n-dimensional vector, then define |x[ = T, x. . ~~ i = l 1 33 3.2 Maximum storage requirement - general case A node n^ of a computation graph can i n i t i a t e only i f for every branch d^ directed into n^ the number of data words assoc-iated with this branch i s not less than the corresponding threshold T . Consequently, the number of data words on this branch after the i n i t i a t i o n i s not less than T -V . P P Consider a computation graph Cr which does not s a t i s f y b> T-W, i . e . there exists at least one branch d^=(n^,n..), d^ £ G such that A < T -Vir . Let us assume that G contains only one node terminal to p p p such a branch. Later we s h a l l show how to extend the results to the case of more nodes terminal to such branches. Clearly, the data d i s t r i b u t i o n w i l l s a t i s f y the requirement b >T-W after the f i r s t i n i t i a t i o n of n.. In the following we derive J a method for finding the maximum storage required prior to the f i r s t i n i t i a t i o n of n.. Then we apply methods of part 3.1 to determine the maximum storage required after the f i r s t i n i t i a t i o n of n.. J Given a computation graph G modify i t as follows: For the node n^ jmt a l l V .=T -=0 a l l U • .=0 J s Note that the data d i s t r i b u t i o n of the modified graph G' i s not affected by i n i t i a t i o n s of n.. In the following parameters of G' w i l l be primed. Lemma 3.5 Let <o = (6- (k. ), d>. (k. ), . . ,<J. (k. )} be a schedule, L x l x l r2 12 xm xm where ^ i ^ , i 2 , . . , i m } - I C { l , 2 , . . , j - 1 , j + 1,., l) and 34 x l x l x i x2 x2 x2 x2 x2 x3 x3 d i (k £ < <£. (k. ) . m-1 m-1 m-1 m ra Then 6 i s admissible for G' i f f i t i s admissible for G; moreover b*.(t)=b*(t) for 0 < t £ c ? . (k. ). m m PROOF: By d e f i n i t i o n x.(t)=o i f f t ^ 6 . ( i ) x i(t)=k i f f t >6\(k) and t£<S\ (k+l) Therefore x.(t)=x!(t) for i € l and 0 < t < d . (k. ) l l x ~- I l in m and x. (t)=x! (t)=0 for i / l and 0 < t < 6 . (k. ). i i i i m m Also £ i(t) = £j(t) for i € I and 0 < t < d>i (1^ ), m rn and £. (t) = 6! (t)=0 for i / l and 0<t<6- (k. ). i i / i i m m Let us examine b d(t)=A +U x (t-V )-V (x (t)-£ (t)) for d =(n p p p r p p s s p r There are four d i f f e r e n t cases depending on whether r and s are elements of I or not. ( i ) r € I, s € I Here U =U1 , W =V»M P P P P and b 6(t)=A +U x ( t - t )-\f (x (t)-£ (t))=lj 6(t) p v p p r v p p s x s p x ( i i ) r e l , s ^ I Here U p = U p » x g(t)=x^(t)=0, £ g(t)=£^(t)=0 and b*(t)=A p+U ix r(t-r p)=b'J(t) ( i i i ) r £ I, s 6 I H ere Vp = Vp» x r(t)=x^.(t)=0 and b p (t )=Ap-Wp (x s ( t ) - £ s (t) )=b'J( t) (iv) r £ l , s £ l Here x f (t )=x^ .( t )=xs (t )=x^ (t ) = £ g (t ) = £, (t )=0 and b 6(t)=A =b6{t) P P P 35 We have thus established that b£(t) = b£,(t) for 0*t<6. (k. ). ni m Moreover, for a l l d =(n ,n ), s€ I 'we have T'=T and ' p v r' s ' p p b 6 ( t ) > T P P i f f . b'<Ht)>T' for 0 < t £ 6 . '(k. ). p p 1 1 A r ram This proves that 6 i s admissible for G i f f i t i s admissible for G'. Lemma 3.6 The maximum storage S' required for the modified graph G' max is the same as the maximum storage S required for G prior max to the f i r s t i n i t i a t i o n 6 • (1) of n.. J J PROOF: By Corollary 3.2 we need only consider sequential schedules. The proof w i l l consists of 3 parts. ( i ) Every sequential s chedul e d> = i 6>. (k. ), <•> . (k. ) , . . , ( i . (k. ) , . .> 1 1 2 2 m m can be divided into time intervals <o,6. (k ) ) , <<3. (k ),6 (k )),..,<d (k ) , L )),.. X l X l X l 1 l X2 12 1m 1m Sa+l ^m+l From the d e f i n i t i o n of b^(t) and £^(t) i t follows that S(t)< S(0) i f t<6- (k. ) 1 1 and S(t)<S(<S. (k. )) i f (k. ) < t < d i (k. ) in ra m m m+1 m+1 where 3(t)=|b 6(t)| Thus to get S i t i s s u f f i c i e n t to examine S(t) at t=0, <£. (k. ), max l ^ "^ 1' (>• (k. ), d>. (k. ),..; i n other words i t i s s u f f i c i e n t to examine the 12 x2 x3 x3 integer sequence S ( where S]ii=S(<S. (k. ). m m . ( i i ) Here we s h a l l show that'S' i s i n f i n i t e i f f S i s i n f i n i t e . v max max Let N be an a r b i t r a r y integer. If S' i s i n f i n i t e , then there exists 36 a schedule («> and integer M' such that SM • > N Take the i n i t i a l part of d> up to d> . (k. ) and omit a l l i n i t i a t i o n s Hi' XM' of n.. Then by Lemma 3.5 we get an admissible schedule 6 for G which J requires the same storage as C>'. Thus for some M<M' we have S M = S M « > N " In the same fashion we can show that S' i s i n f i n i t e i f S i s max max i n f i n i t e . ( i i i ) Now suppose that S' i s f i n i t e . Then sequences S' are bound-v 1 1 max x m ed by S 1 and there exists such a schedule d' and integer M' that J max S^t=S' . Take the i n i t i a l part of 6 up to 6. (k. ) and omit a l l M max l^p l ^ j , the i n i t i a t i o n s of n.. This by Lemma 3.5 w i l l give us an a.s.s. for G with the same storage requirements as 6 . Thus S ' =SV'( i —S., £v£> max iM' M max On the other hand, since S i s f i n i t e , there exists a schedule Q ' max ' > and integer N such that N max The i n i t i a l part of this schedule up to O. (k. ) i s an a.s.s. o' for XN • \\T G' with the same storage requirements as o* This gives s =s5=s'£^ S' max N N max From this and S' < S we obtain max max S =S1 max" max Corollary 3.7 Let a l l branches of G s a t i s f y A >T -Yi with a possible ex-p - p p 1 ception of branch (n.,n.). Modify G as follows: Put a l l Vrj.=Tr;.=0, a l l U j g=0 37 Then the maximum storage S ^ required f o r G p r i o r to the f i r s t i n i t i a t i o n of n. i s determined by the f o l l o w i n g integer l i n e a r program: max 1 —' 1 d-' subject to o < y k < x ^ . k/j 0 < y • < oo • A'vib-T'+W' where A', T', W', X' are parameters of the modified graph G'. PROOF: Proof f o l l o w s from Lemma 3.6 and Theorem 3.4. Lemma 3.8 Let G be a computation graph whose a l l branches s a t i s f y A >T - V i ' with a po s s i b l e exception of a branch (n.,n.,). p p p 1 i ' j " Let X.> 0. Then £=b^(t), where _c i s defined belov/, for some a.s.s. <S and some t € T^, t > <S. (1) i f f there e x i s t such integers y^ , i = l , 2 , . . ,J& which s a t i s f y ( i ) °-yk- Xk k ^ 1 < y • < X . J J ( i i ) £=P_-AX^1-1L PROOF: Analogous to that of Theorem 2.4 of /l/. C o r o l l a r y 3.9 Let G be a computation graph whose a l l branches s a t i s f y A >T -V with a p o s s i b l e exception of a branch (n.,n.). p p p 1 * i J 38 Let X.>0. Then the maximum storage S^ required after the j ° max 1 f i r s t i n i t i a t i o n of n. i s determined by the following integer J l i n e a r program: S"*" =1 b I -inin j Ay I max ' — 1 J-1 subject to o < y k < \ k^j Ay_<b - T + ¥ where ea.ch y k i s integer. Theorem 3.10 Let G be a computation graph whose a l l brandies s a t i s f y A >T -YY with a possible exception of a branch (n.,n.). P P P 1 J Let X^ > 0. Then the maximum storage required is M/ = max(3 , ) < max' max PROOF: I. a) Suppose S ^ S 1 1 1 max max There exists an a.s.s. for the modified graph G' which requires S m^ x of storage. By Lemma 3.5 this schedule i s admissible also for G and therefore i s an i n i t i a l part of some a.s.s. for G. Thus A/> S > 3 1 C max max and XC>max(S ,3"^ ) C — max' max b) Now suppose S >S 1 1 max— max By Corollary 3.9 S^ a x= I b | -1 Ay_°l whor e | Ay_°| = m i n | Ay_ | subject to 0 ^ y k < X k k^j 39 Then by Lemma 3.8 S 1 =|b 6(t)|- for some a.s»s.» and n i a x /W/^ S1 >s C max— max i . e . Xt/>raax(S ,3^ " ) <• max' max II . Let (•> be an arbitr a r y schedule for G. Then |b 6 ( t ) | < S for t < d . ( l ) 1 — x " max j and I b d (t )| ^ 3 1 for t > 6 . ( l ) 1 — 1 max — j Thus ii/<iraax(S , ) Cr — max' max The results of I. and I I . give M/= raax(S , S"^ ), q.e.d. b < max' max ' ^ Theorem 3.10 and Corollaries 3.7 and 3.9 thus provide means for finding the maximum storage for graphs where one and only one node i s terminal to a branch which does not s a t i s f v A > T -W . - p- p p In case there are' r such nodes, we take a subset of these nodes { n. ,n. ,..,n. , where 0 < s r. For a l l branches directed ^ Xl X2 s into the nodes of the subset we put Yir=T=0, and for a l l branches directed out from the nodes of the subset we put U=0. Then to this modified graph we apply the following integer linear program: S =jbl-min|Ay I max 1 — 1 I J. i subject to 0 < y k < X k M n ,n ,..,n. 1 2 s l £ y • £ x . xi x l 1 < y • < X . 16 y • £ X . J i i s s r r Since there are 2 sucli subsets we have 2 lin e a r programs. The maximum required storage i s the maximum of the 2 p a r t i a l maximum storage requirements. CHAPTER 4 CONCLUSIONS The Karp-Miller algorithm for testing termination properties of computation graphs i s based on the termination properties of loops. It i s , therefore, desirable to have means for testing loops. A quantity related to the number of data words in a loop i s introduced, which decreases for g < l , increases for g > l , and remains constant for g=l, in the course of computation. This concept makes i t possible to derive a simple s u f f i c i e n t condition (Theorem 2.14) for self-termination of loops with g=l, and to give a shorter and i n t u i t i v e l y more s a t i s f y i n g proof (Corollary 2.12) of necessary condition of Theorem 2.6 due to Karp and M i l l e r . In the special case that \V=U, the necessary and s u f f i c i e n t condition i s given (Theorem 2.10). This condition has a simple form due to the fact that data propagation has l o c a l character in this case. Since this i s i n v a l i d i n the general case, one probably cannot hoj)e for a simple form of the general necessary and s u f f i c i e n t condition. The Karp-Miller algorithm i s not well suited for computation graphs with many loops. Therefore, in part 2.3 of Chapter 2 a dir e c t procedure for testing termination properties of strongly connected graphs is derived. The procedure does not require inspection of every loop as in the Karp-Miller algorithm. However, i t also uses the i t e r a t i o n scheme of Theorem 2.9, which for large graphs may be too lengthy. Reiter in /l/ gives a linear integer program for determining the maximum amount of storage required i n the special case that 40 41 b>T-W. Part 3.2 of chapter 3 extends h i s method to cover the general case. The number of l i n e a r programs required i n our method increases as 2 1 where i i s the number of nodes term i n a l to brandies which do not s a t i s f y A^ > ^ p""^ *p» o u ^ -"-^ appears to be more e f f i c i e n t than s i m u l a t i o n of a l l p o s s i b l e schedules /7/, e s p e c i a l l y f o r h i g h l y p a r a l l e l computations. REFERENCES flf Gregory J . and McReynolds R., "The SOLOMON computer", IEEE Trans, on E l e c t r o n i c Computers, v o l . EC-12, pp.774-781, Dec. 1963. /2/ Schwartz J . , "Large p a r a l l e l computers", J.ACM, vol.13, No. 1, pp.25-32, Jan.1966. /3/ S l o t n i c k D.L. et al.,"The ILLIAC IV computer", IEEE Trans, on Computers, v o l . C-17, No.8, pp.746-757, Aug.1968. /A/ Thurber K.J. and Myrna J.W., "System design of a c e l l u l a r APL computer", IEEE Trans, on Computers, v o l . C-19, No.4, pp.291--303, A p r i l 1970. /5/ Koczela L.J. and Yvang G.Y., "The design of a h i g h l y p a r a l l e l computer o r g a n i s a t i o n " , IEEE Trans, on Computers, vol.C-18, No.6, pp.520-529, June 1969. /6/ Karp R.M. and M i l l e r R.E., " P r o p e r t i e s of a model f o r p a r a l l e l computations: deterrninacy, t e r m i n a t i o n , queuing", SIAM J . Appl. Math., vol'.14, No.6, pp. 1390-1411, Nov.1966. flf R e i t e r R., "A study of a model f o r p a r a l l e l computations", Techn. Report No. RADC-TR-6S-350, Rome A i r Development Center, Nov.1968. /8/ R e i t e r R., "Scheduling p a r a l l e l computations", J.ACM, vol.15, No.4, pp.590-599, Oct.1968. /9/ Mart i n D.F., "The automatic assignment and sequencing of comput-ations on p a r a l l e l processor systems", -Ph.D. Thesis, Dept. of Eng., UCLA, Jan.1966. /l O / Martin D.F. and E s t r i n G., "Path length on graph models of computations", IEEE Trans, on Computers, vol.C-18, No.6, pp.530-536, June 1969. / l l / Baer J.L.E. and E s t r i n G., "Bounds f o r maximum p a r a l l e l i s m i n a b i l o g i c graph model of computations", IEEE Trans, on Comp., vol.C-18, No.11,.pp.1012-1014, Nov.1969. /12/ R e i t e r R., "On assembly-line balancing problems", Operations Research, vol.17, No.4, pp.685-700, July-Aug.1969. /13/ Ford L.R. and Fulkerson D.P., Flows i n networks, Princeton U n i v e r s i t y Press, P r i n c e t o n , 1962, pp.130-134. 42
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Termination and storage requirements in a model for...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Termination and storage requirements in a model for parallel computations Schnapp, Ivan 1970
pdf
Page Metadata
Item Metadata
Title | Termination and storage requirements in a model for parallel computations |
Creator |
Schnapp, Ivan |
Publisher | University of British Columbia |
Date Issued | 1970 |
Description | This paper deals with a graph-theoretic model for parallel computations as formulated by Karp and Miller. A necessary condition and a sufficient condition for self-termination of loops with unit loop gain are presented. For the special case that W=U the necessary and sufficient condition is derived. A direct procedure for testing termination properties of strongly connected graphs is presented. A method due to Reiter, for determining the maximum storage required for a computation graph, is extended to cover the general case. |
Subject |
Computers. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-05-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0102044 |
URI | http://hdl.handle.net/2429/34610 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-UBC_1970_A7 S39_4.pdf [ 2.5MB ]
- Metadata
- JSON: 831-1.0102044.json
- JSON-LD: 831-1.0102044-ld.json
- RDF/XML (Pretty): 831-1.0102044-rdf.xml
- RDF/JSON: 831-1.0102044-rdf.json
- Turtle: 831-1.0102044-turtle.txt
- N-Triples: 831-1.0102044-rdf-ntriples.txt
- Original Record: 831-1.0102044-source.json
- Full Text
- 831-1.0102044-fulltext.txt
- Citation
- 831-1.0102044.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0102044/manifest