Termination and storage requirements in a model for parallel computations 1970

TERMINATION AND STORAGE REQUIREMENTS IN A MODEL FOR PARALLEL COMPUTATIONS

by

IVAN SCHNAPP
Dipl.Ing., Czech Technical University, Prague, 1966

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in the Department of Electrical Engineering

We accept this thesis as conforming to the required standard.

Research Supervisor
Members of the Committee
Acting Head of the Department
Members of the Department of Electrical Engineering

THE UNIVERSITY OF BRITISH COLUMBIA
June, 1970

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department
The University of British Columbia
Vancouver 8, Canada
Date

ABSTRACT

This paper deals with a graph-theoretic model for parallel computations as formulated by Karp and Miller. A necessary condition and a sufficient condition for self-termination of loops with unit loop gain are presented. For the special case that W=U the necessary and sufficient condition is derived. A direct procedure for testing termination properties of strongly connected graphs is presented. A method due to Reiter, for determining the maximum storage required for a computation graph, is extended to cover the general case.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF ILLUSTRATIONS
ACKNOWLEDGMENT
1. INTRODUCTION
1.1 The Karp-Miller model
1.2 Research connected with parallel computation models
2. TERMINATION PROPERTIES OF COMPUTATION GRAPHS
2.1 The Karp-Miller algorithm
2.2 Some necessary and sufficient conditions for self-termination of loops
2.3 A direct method of testing termination properties of strong components
3. STORAGE REQUIREMENTS
3.1 Maximum storage requirement - special case
3.2 Maximum storage requirement - general case
4. CONCLUSIONS
REFERENCES

LIST OF ILLUSTRATIONS

Figure 1  Computation graph for Laguerre polynomials
Figure 2  Dependence of the amount of data on the loop gain
Figure 3  Termination of loops

ACKNOWLEDGMENT

I wish to express my gratitude to Dr. E. L. Sigurdson for his guidance and encouragement in carrying out this research, and to Dr. G. F. Schrack for reading the manuscript. I also wish to acknowledge the financial support of the National Research Council.

CHAPTER 1  INTRODUCTION

As signal propagation speeds represent a serious barrier to increasing the speed of strictly sequential computers, more attention has been paid in recent years to the use of the parallelism intrinsic to most computational algorithms.
A number of designs have appeared which utilize a number of processors that may simultaneously execute several steps of the computation (/1/-/5/), rather than overlapping subfunctions in sequential processing. In general, values to be used in a computation step are the results of previous computation steps. This establishes certain precedence constraints upon the computation steps. A model of such a system, satisfying a particular class of precedence constraints, has been formulated by Karp and Miller /6/. This thesis studies some problems arising in connection with this model, in particular termination properties (Chapter 2) and storage requirements (Chapter 3). In this chapter we present the model and the results of previous research, and compare it with other approaches to parallel processing.

1.1 The Karp-Miller Model

The model represents the sequencing of a parallel computation by a finite directed graph. Each node of the graph corresponds to an operation in the computation (or to a processor assigned to perform that operation). Each branch represents a first-in first-out queue of data directed from one processor to another. To describe the data transformation performed by the processors, with each node is associated a single-valued function determining the dependence of outputs on inputs. The eligibility for initiation of an operation is determined by the lengths of the queues on the branches directed into its associated node.

Thus a computation is represented by a directed graph G, called a computation graph, which is given by:

(i) a set of nodes n_1, n_2, .., n_l;
(ii) a set of branches d_1, d_2, .., d_t, where any given branch d_p is directed from a specified node n_i to a specified node n_j;
(iii) four nonnegative integers A_p, U_p, W_p and T_p, where T_p >= W_p, associated with each branch d_p.

Here, A_p gives the initial number of data words in the first-in first-out queue associated with d_p; U_p gives the number of words added to the queue whenever the operation O_i associated with n_i terminates; W_p gives the number of words removed from the queue whenever the operation O_j is initiated; and T_p is a threshold giving the minimum queue length of d_p which permits the initiation of O_j. Upon initiation of O_j only the first W_p of the T_p operands for O_j are removed from the queue.
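As a reading aid (not part of the original thesis), the branch quadruple (A_p, U_p, W_p, T_p) and the graph itself can be captured in a small data structure. The following is a minimal illustrative sketch in Python; the class and function names (Branch, ComputationGraph, add_branch) are invented here and are reused in the later sketches.

    from dataclasses import dataclass, field

    @dataclass
    class Branch:
        """A branch d_p = (src, dst): a FIFO queue of data words and its four parameters."""
        src: str    # node n_i that places words on the queue when its operation terminates
        dst: str    # node n_j whose initiation removes words from the queue
        A: int = 0  # initial number of data words in the queue
        U: int = 1  # words added when O_src terminates
        W: int = 1  # words removed when O_dst initiates
        T: int = 1  # threshold: minimum queue length permitting O_dst to initiate

    @dataclass
    class ComputationGraph:
        nodes: set = field(default_factory=set)
        branches: list = field(default_factory=list)

        def add_branch(self, src, dst, A=0, U=1, W=1, T=1):
            # The model requires nonnegative integers with T >= W.
            if not (0 <= W <= T and U >= 0 and A >= 0):
                raise ValueError("branch parameters must be nonnegative with T >= W")
            self.nodes |= {src, dst}
            self.branches.append(Branch(src, dst, A, U, W, T))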
The operation O_j associated with a given node n_j is eligible for initiation if and only if, for each branch d_p directed into n_j, the number of words in the queue associated with d_p is greater than or equal to T_p. After O_j becomes eligible for initiation, W_p words are removed from each branch d_p directed into n_j. The operation O_j is then performed. When O_j terminates, U_q words are placed on each branch d_q directed out from n_j. The times required to perform the steps mentioned above are left unspecified by the original model as presented in /6/.

These constraints on initiation lead to the following definitions of the possible sequences of initiations associated with a given computation graph G. Let E be a sequence of nonempty sets S_1, S_2, .., S_N, .., such that each set S_N is a subset of {1, 2, .., l}, where l is the number of nodes in G. Let x(j,0) = 0 and, for N > 0, let x(j,N) denote the number of sets S_m, 1 <= m <= N, of which j is an element. The sequence E is an execution of G if and only if, for all N, the following conditions hold:

(i) if j is in S_{N+1} and G has a branch d_p directed from n_i to n_j, then A_p + U_p x(i,N) - W_p x(j,N) >= T_p;
(ii) if E is finite and of length R, then for each j there exists a node n_i and a branch d_p directed from n_i to n_j such that A_p + U_p x(i,R) - W_p x(j,R) < T_p.

An execution E of G is called a proper execution if the following implication holds:

(iii) if, for all n_i and for every branch d_p directed from n_i to n_j, A_p + U_p x(i,N) - W_p x(j,N) >= T_p, then j is in S_R for some R > N.

The sequence E may be interpreted as giving a possible temporal sequence of initiations of operations throughout the performance of the parallel computation specified by G; the occurrence of S_N denotes the simultaneous initiation of O_j for all j in S_N. Condition (i) states that in order for node n_j to initiate for the x(j,N+1)-th time, the queue lengths on its input branches must be greater than or equal to the respective branch thresholds. Condition (ii) defines the circumstance under which an execution terminates, i.e. under which the computation defined by G halts. The computation terminates when every node of G is unable to initiate. Condition (iii) requires that a node, if able to initiate, actually will do so after some finite number of initiations of other nodes.

The following example illustrates these ideas.

Example 1.1. Consider the Laguerre polynomials defined by the recurrence relation

L_{n+1}(x) = (2n + 1 - x) L_n(x) - n^2 L_{n-1}(x)

with initial conditions

L_0(x) = 1
L_1(x) = 1 - x

We want to compute the values of L_n(x) for n = 2, 3, .., N and for a given x. A computation graph for this calculation is shown in Fig. 1; for each branch the intermediate result it carries is shown. Branch coefficients are assumed to be A=0, U=W=T=1 unless otherwise shown. Two branches, assigned A=N-1 and U=0 in the figure, serve as counters; the computation is terminated by the depletion of the queues associated with these branches. Node n_1 produces 2n and n^2 and places them on its branches to n_2 and n_4, respectively. Node n_2 takes (1-x) from the branch (n_2,n_2) and adds it to the other input 2n; the result (2n+1-x) is placed on the branch to n_3. Node n_3 forms the product of (2n+1-x) and L_n(x) and places the result on its branch to n_5. Node n_4 multiplies L_{n-1}(x) by n^2 and places the result on its branch to n_5. Finally, n_5 produces the desired polynomials L_2(x), L_3(x), .., L_N(x) and places them on its feedback branches to n_3 and n_4. The initial data queues supply the starting values: L_0(x)=1 and L_1(x)=1-x on one of these feedback branches, and L_1(x)=1-x on the branch into n_4.

Fig. 1  Computation graph for Laguerre polynomials
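The firing rules above are easy to animate mechanically. The following sketch (not from the thesis) reuses the hypothetical ComputationGraph class introduced earlier and a deliberately tiny two-node graph rather than the Laguerre graph of Fig. 1. Since the model leaves durations unspecified, each step here treats initiation and termination as instantaneous and fires every eligible node; the run stops exactly in the halting situation of condition (ii), and the counts it returns play the role of the x(j) above.

    def eligible(graph, queues, node):
        """Condition (i): every branch directed into `node` holds at least T words.
        (A node with no input branches is vacuously eligible under this rule.)"""
        return all(q >= b.T for q, b in zip(queues, graph.branches) if b.dst == node)

    def run(graph, max_steps=1000):
        """Fire all eligible nodes simultaneously until no node can initiate
        (condition (ii)); return the initiation counts x(j) for every node."""
        queues = [b.A for b in graph.branches]      # current queue length of each branch
        x = {n: 0 for n in graph.nodes}             # x(j): number of initiations so far
        for _ in range(max_steps):
            firing = [n for n in graph.nodes if eligible(graph, queues, n)]
            if not firing:                          # every node is blocked: the computation halts
                return x
            for i, b in enumerate(graph.branches):  # initiation removes W words from input queues
                if b.dst in firing:
                    queues[i] -= b.W
            for i, b in enumerate(graph.branches):  # termination places U words on output queues
                if b.src in firing:
                    queues[i] += b.U
            for n in firing:
                x[n] += 1
        raise RuntimeError("no termination within max_steps; the graph may not self-terminate")

    # A tiny illustration: a counter branch n1 -> n2 holding A=3 words with U=0, and a
    # feedback branch n2 -> n1.  The counter is never refilled, so the computation halts.
    g = ComputationGraph()
    g.add_branch("n1", "n2", A=3, U=0, W=1, T=1)
    g.add_branch("n2", "n1", A=1, U=1, W=1, T=1)
    print(run(g))   # a possible result: {'n1': 4, 'n2': 3}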
1.2 Research connected with parallel computation models

Karp and Miller /6/ show that for every proper execution the sequence of data words occurring on any branch of G is always the same, thus ensuring the same computational result. This property is referred to as the determinacy of a computation graph. Also, they give an algorithm to determine whether a computation terminates, and a procedure for finding the number of performances of each operation in G. Finally, they give necessary and sufficient conditions for the lengths of data queues to remain bounded.

Reiter in his Ph.D. thesis /7/ addresses himself to the problems of storage, scheduling, and optimum assignment of operations to processing units. He gives an integer linear program for the determination of the maximum storage required by a computation graph G. He introduces a concept of an admissible schedule defining valid node initiation times and characterizes the class of all admissible schedules in the case W_p=T_p=1, U_p=0 or 1. He further shows that in this case it is possible to find a periodic admissible schedule which achieves the maximum computation rate (also see /8/). He also defines the cost of an assignment of node functions to processors and gives a method for determining feasible solutions when the maximum computation rate has a lower bound. Finally, he extends the model to incorporate a restricted form of data dependency without losing its determinacy.

A different approach to the graphical representation of a computation is taken by Martin /9/. He allows two types of node input control. In the case of conjunctive input control a node can initiate only if each branch directed into the node contains at least one data word. In the case of disjunctive input control a node can initiate only if at least one branch directed into the node contains at least one data word. Similarly, conjunctive output control places one word on each branch directed out from the node, and disjunctive output control places one word on a branch directed out from the node according to an a priori probability. The latter represents a conditional transfer which is deterministic every time it occurs but over many possible data sets may be modelled probabilistically.

Note that conjunctive input control and conjunctive output control correspond to the T=W=1 case and the U=1 case in the Karp-Miller model, respectively. Also note that, because of conditional transfers, this model is not determinate.
In h i s work Martin studies the assignment of node comput- ations to processors and t r i e s to minimize the average computation time. Further research of t h i s model can be found i n /lO/, where an approximative method f o r c a l c u l a t i n g the average computation time ± s given, and /ll/, where procedures.are Riven.to determine a lower and an upper bound on the number of processors required f o r maximum 8 p a r a l l e l i s m . An a p p l i c a t i o n o f r e s u l t s o f t h e s e s t u d i e s t o a s s e m b l y - l i n e b a l a n c i n g p r o b l e m s i s g i v e n i n /12/. CHAPTER 2 TERMINATION PROPERTIES OF COMPUTATION GRAPHS Pa r t 2.1 of t h i s chapter i s devoted to a p r e s e n t a t i o n of the K a r p - M i l l e r a l g o r i t h m . The a l g o r i t h m i s used to determine whether the computation s p e c i f i e d by a given computation graph terminates, and to f i n d the number of performances of each o p e r a t i o n i n case the computation terminates. Sev e r a l theorems are given i n p a r t s 2.2 and 2.3 to improve the e f f i c i e n c y of the al g o r i t h m . 2.1 The K a r p - M i l l e r A l g o r i t h m Theorems and Lemmas of s e c t i o n 2.1 are proved i n /6/. Since the number of performances of an ope r a t i o n 0 i s independent of the execu- t i o n considered only i f t h i s e x c u t i o n i s proper, Karp and M i l l e r r e s t r i c t t h e i r a t t e n t i o n to proper executions. A node n_. of a computation graph i s s a i d to terminate i f and only i f ( i n the f o l l o w i n g i f f ) j occurs i n only a f i n i t e number of the se t s of a proper execution of G. N a t u r a l l y , t h i s number i s the same f o r a l l proper executions of G. To f u r t h e r study the te r m i n a t i o n of p r o p e r t i e s of computation graphs we need to introduce a few concepts from graph theory. A d i r e c t e d graph i s c a l l e d s t r o n g l y connected i f f given any p a i r of nodes n. and n. there e x i s t s a d i r e c t e d path from n. to n.. A s p e c i a l case of a s t r o n g l y connected graph i s the t r i v i a l graph which has only one node and no branches. For any d i r e c t e d graph there e x i s t s a unique p a r t i t i o n of 9 10 i t s nodes i n t o equivalence classes as f o l l o w s : Two nodes n. and n. are i n the same clas s i f there e x i s t s a d i r e c t e d path from n^ to n^ , and a d i r e c t e d path from n^ to n^. A subgraph c o n s i s t i n g of the nodes of an equivalence c l a s s and the branches of the o r i g i n a l graph connecting these nodes i s then a s t r o n g l y connected graph. The subgraphs corresponding to the node equivalence classes are maximal s t r o n g l y connected subgraphs of the o r i g i n a l graph and are c a l l e d the strong components of the graph. These strong components play a c r u c i a l r o l e i n the Karp- M i l l e r algorithm. Lemma 2.I Let G' be a s t r o n g l y connected subgraph of a computation graph G.'Then e i t h e r every node of G' terminates or nonedoes. We say that 0' terminates i f every node of G' terminates. Divide a l l the nodes of a computation graph G in t o two classes according to whether they terminate or not. Then by Lemma 2.1 the set S of terminating nodes i s a union of sets of nodes of some strong components of G. The s t r o n g l y connected subgraph G' which would terminate i f i t were i s o l a t e d from the r e s t of G i s c a l l e d s e l f - t e r m i n a t i n g . 
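The partition into strong components just described is a purely graph-theoretic step, and the thesis does not prescribe a particular algorithm for it; any standard strongly-connected-components method will do. The sketch below (Kosaraju's two-pass depth-first search, applied to the hypothetical ComputationGraph structure used in the earlier sketches) is only illustrative.

    def strong_components(graph):
        """Partition the nodes of a computation graph into its strong components
        using Kosaraju's two-pass depth-first search (illustrative only)."""
        succ = {n: [] for n in graph.nodes}
        pred = {n: [] for n in graph.nodes}
        for b in graph.branches:
            succ[b.src].append(b.dst)
            pred[b.dst].append(b.src)

        # First pass: record the nodes in order of completion of a DFS on G.
        finish, seen = [], set()
        def dfs_forward(u):
            seen.add(u)
            for v in succ[u]:
                if v not in seen:
                    dfs_forward(v)
            finish.append(u)
        for n in graph.nodes:
            if n not in seen:
                dfs_forward(n)

        # Second pass: DFS on the reversed graph in decreasing finishing order;
        # every tree of this pass is exactly one strong component.
        components, assigned = [], set()
        def dfs_backward(u, comp):
            assigned.add(u)
            comp.add(u)
            for v in pred[u]:
                if v not in assigned:
                    dfs_backward(v, comp)
        for n in reversed(finish):
            if n not in assigned:
                comp = set()
                dfs_backward(n, comp)
                components.append(comp)
        return components

Each returned node set, together with the branches of G joining its members, is one of the strong components on which the algorithm of Section 2.1 operates.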
Let us examine when a strong component i s s e l f - t e r m i n a t i n g . Lemma 2.2 Let d^=(n^,iij) be a branch of a computation graph. Then n^ terminates i f n. terminates. l 11 r s This loads to the following concepts: Let G and G' -be strong components of G. Then define Gr >_ GS i f either Gr. = GS or there exist r s such nodes n - 6 G and n.€ G that there is a directed path from n. 1 j 1 to n . i n G. J Theorem 2.3 s A strong component G of a computation graph G terminates r r i f f there exists G such that G i s self-terminating and G r> G s. s r Thus G terminates i f f i t is self-terminating or some G is s e l f - terminating and there is a directed path from some n . 6 G to some n - £ G b. Therefore to determine the set S we examine strong com- r s r ponents for self-termination in such order that i f G >. G , then G is examined f i r s t . Now the problem is how to determine 'whether a strongly con- nected subgraph G' is self-terminating. For this purpose Karp and M i l l e r use properties of the 1OO|JS contained in G'. A computation graph L is called a loop i f i t consists of d i s t i n c t nodes , n 0 , . , " a n d branches d^,d2,..,d£ such that d^ is directed from to n^_ + 1 > k=l, 2, . . , / , - l , and d^ is directed from n^ to n^ . Here we note that any strongly connected subgraph G' except the t r i v i a l graph contains at least one loop. Theorem 2.4 A strongly connected subgraph G1 is self-terminating i f f G' 12 contains a s e l f - t e r m i n a t i n g loop. Given t h i s theorem the only problem i s how to e s t a b l i s h that a p a r t i c u l a r loop i s s e l f - t e r m i n a t i n g . Let L be a loop with branches d, ,d,,, . . ,d«. The product it U . g= TT rr^ i s c a l l e d the gain of the loop. There are 3 cases: g ^ l , i = l i g=l, and g >1. Loops with g < 1 Theorem 2.5 Any loop L f o r which g < l i s s e l f - t e r m i n a t i n g . Loops w i t l i g=I Theorem 2.6. Let r~ P = 1 1 U ' " l v i u l h V l U l h u i V i h W l h w i , V l h h U2 u i Y ' l ''2 i h-l h 1 ' T.I "1 W2 h j I V , ' l V h-l h-l I ' 1 V i i . I 13 *1 /33 ft A k - i " T k - i + 1 where fo l{ = y— k=2 , 3 ,. . and k-1 / 3 i = A i - T x + 1 Then the necessary condition for self-termination of a loop with g=l is P /3 < 0 Karp and M i l l e r give a necessary and s u f f i c i e n t condition for s e l f - termination in the following special case: Theorem 2.7 If , for l£k^X,, Vi'. = U , =1, then the loop L i s self-terminating i f f ( i . e . Z A < Z (T,-l) ). k=l " k=l k k=l k In case that Theorems 2.6 -and 2.7 cannot be applied Karp and M i l l e r derive an upper bound on the numbers of x^erformances of nodes of a self-terminating loop. Theorem 2.8 Let L be a self-terminating loop with g=l. Let X* be a positive integer solution of the system OC^ = a 14 U 1 OC = - i a 2 M1 a U l U 2 u u ry, - _ —£ U a whore a i s an a r b i t r a r y parameter. Let X = x n X, where X^ i s the number of performances of node n^, k=l,2,.., Then at l e a s t one component of X i s l e s s than the correspond ing component of X*. Loops with g >1 Theorem 2.6 i s v a l i d also f o r t h i s case. I f L i s s e l f - t e r m i n a t i n g then we get the f o l l o w i n g upper bound: X < y i - P fi 1-g '- Having obtained an upper bound f o r X we can t e s t s e l f - t e r m i n a t i o n of a loop by applying a procedure given i n the f o l l o w i n g theorem. 
15 Theorem 2.9 # For a l l nodes n.€S the f o l l o w i n g i t e r a t i o n scheme converges J i n a f i n i t e number of steps to X; *(0)<;>=° r _ („) , Ik -T +1+U xKa,{iT x ^ n + 1 ^ ( j ) = max J x ^ ( j ) , (i,p)6Z.j, i € S \ % H e r e ^ j i s the set of ordered p a i r s ( i , p ) such that d^ i s a brancli from node n. to node n.. The r e s u l t s given so f a r may be organized i n t o an algorithm f o r determining which nodes of a computation graph G terminate and, f o r the terminating nodes n.. , computing the number of performances x ( j ) . This algorithm may be o u t l i n e d /6/ as f o l l o w s : Step 1. From among the strong components of the computation graph being considered ( i n i t i a l l y t h i s graph i s G), s e l e c t one which i s not covered by any other subgraph. C a l l i t G'. Step 2. By applying Steps 2A,...,2D given below, t e s t whether G' i s s e l f - t e r m i n a t i n g and, when i t i s , determine x ( j ) f o r each n.£G'. Step 3. Form a new computation graph as f o l l o w s : I f G' i s not s e l f - t e r m i n a t i n g , remove G' and a l l branches i n c i d e n t with nodes of G'. I f G' i s s e l f - t e r m i n a t i n g , replace each branch d from n^G' to n^G' by an "equivalent" branch d^, from n^ to n^ , having U^,=0, A ,=A +U x ( i ) , T ,=T , and W , =V,T . Then remove G' . P P p p p p p Step 4. I f the new computation graph i s nonempty, return to Step 1. Otherwise the a n a l y s i s of termination i s complete. 4 The symbol fx! denotes " l e a s t integer greater than or equal to x." 16 The d e t a i l s of Step 2 are now described. Step 2k. I f G' contains a branch d^ with U =0, go to Step 2D. I f not, determine whether G' contains a loop with g < l . This i s equi- valent to determining whether there i s a loop L sucli that Z log (U / V ) < 0 . d € L P P P This determination can be c a r r i e d out by a shortest-route algorithm given i n /13/. Enumeration of loops i s not required i n t h i s pro- cedure. I f a loop with g < l e x i s t s , go to Step 2D; otherwise, go to Step 2E. Step 2B. Every loop of G' has .g > 1. Determine whether there i s a loop not p r e v i o u s l y considered such that 0> Pyg. I f no such loop e x i s t s , G' i s not s e l f - t e r m i n a t i n g ; r e t u r n to Step 3. I f such a loop L i s found, determine upper bounds on the q u a n t i t i e s x^(k) by the methods given above. These bounds hold, of course, only i f L i s s e l f - t e r m i n a t i n g . Step 2C. Continue applying the i t e r a t i o n scheme of Theorem 2.9, tak- ing S to be set of nodes of G', u n t i l e i t h e r (a) i t terminates, e s t a b l i s h i n g that G' i s s e l f - t e r m i n a t i n g , and g i v i n g x ( j ) f o r each n.£G', or J (b) f o r some n and some k, x^ n^(k) exceeds the upper bound on x ^ ( k ) , e s t a b l i s h i n g that L i s not s e l f - t e r m i n a t i n g . Return to Step 2B. Step 2D. G' i s s e l f - t e r m i n a t i n g . Apply the i t e r a t i o n scheme of Theorem 2.9, tak i n g S to be the set of nodes of G', to obtain x ( j ) f o r each n.£G'.. Return to Step 3. 17 2.2 Some necessary and s u f f i c i e n t conditions for .self-termination of 1o o ps As shown above, the K a r p - M i l l e r algorithm is.based on termin- ation properties of loops. Consequently i t s e f f i c i e n c y depends main- l y on the means available for testing self-termination of loops. In the following we s h a l l derive some theorems testing these proper- t i e s . 
Theorem 2.10 ' If, for l^k<L,£, p— = 1, then the loop L i s self-terminating b k I Z 1X1 £ o k=l PROOF; By Theorem 6 of /6/ the necessary and s u f f i c i e n t condition for self-termination of a loop is the existence of a nonnegative integer solution of the following system of i n e q u a l i t i e s : , , v A^-Tjt+l+lixU) x( l ) > g A1r-T^+1+U,,x(k) x ( k + l ) > — — \ - . 'for k=l,2 , . . j e-l k This system reduces to x(l)> /3£+x(i) x(k+l)>/3 k+x(k) for k-1, 2 , . . , i - l Since a l l x's are integers, we have 1 8 x(k+l) > f/3j+x(k) for k=l,2 f.. ,1-1 By summing l e f t and right-hand sides of the above i n e q u a l i t i e s , respectively, we get the necessary condition To prove that i t i s also a s u f f i c i e n t condition we -ban.-, show that i f i t i s s a t i s f i e d the system x(l)=C x(k+l)=x(k)+ f / 3 k ] k=l , 2 , . . , 1 - 1 where C i s a s u f f i c i e n t l y large integer i s a nonnegative integer solution of the above i n e q u a l i t i e s . Consider the following examples Example 2 . 1 A - 0 U=l \V=T=2 A=2 Here g=l / 2 . The data d i s t r i b u t i o n after each node performes once i s A ' = 0 U=l W=T=2 T=Vf=l U=l A'=l F i g . 2 1 9 Example 2.2 A = 0 A=l Here g=2. The data d i s t r i b u t i o n a f t e r each node performs once i s A ' = 0 T=W=1 A ' = 2 Example 2.3 A=l. Here g-l. The data d i s t r i b u t i o n a f t e r each node performs once i s A ' = 0 T=Vf=l A'=l Dependence of tlie amount of data on the loop gain Fig.2 20 These examples indicate that (roughly speaking) for g > l the "amount of data" increases, for g<1 i t decreases, and for g=l i t remains constant. The following theorem gives this fact a precise form. Theorem 2.11 „ I u Let L be a loop with gain g - II ^— . Let A. and A^ be the i = l i i n i t i a l and current number of words on the i - t h branch, respectively, for i = l,2,..,j£. Then JL I i f g < l Z c i A i < 21 c.A^ i=l i=l L 5L i f g=l ' Z c.A. = Z c.A 0 . , l i • , I l i=l i=l t L i f g> 1 Z c. A. > Z. c.A FA . , I I . , I I 1=1 i=l where c^ - 1 and c. 4 TT T f -* 1 — for i=2,3,..,i. PROOF; Suppose that W . words are taken from the j-th branch (j'^;C), J and L Tj_^ words are placed on the (j+l)-th branch. Taking into account that c . , = c . d+i - j u. + 1 the change i n the sum F c.A. is ^ . , I l i = l 21 W . c. ,U. , - c.V. = c. rr-L- U. , - C.W. = 0 0+1 J+l J J J J+l J J Now suppose that Ŵ  words are taken from the j2-th branch and U words are placed on the 1-st branch. We have C l ! i ! 2 V i h. cz = u 2 u 3 u z = e ŵ and the change in the sum U c i u i - C A = u i - i*x W = V 1 " i> i s positive for g > l , negative for g < 1, and zero for g=l. As a coro l l a r y we get a necessary condition for self-term- ination of a loop, which i s ess e n t i a l l y equivalent to Theorem 2.6. Corollary 2.12 A necessary condition for self-termination of a loop with g > l and i n i t i a l number of words on the i - t h branch A?, i=l,2,.., L i s that T c.A°< J" c. (T.-l) 1=1 1=1 PROOF: If the loop is self-terminating, then after a f i n i t e number of performances we must have 22 A. < T . - l f o r i=l,2,..,1 i - i ' ' ' and T c.A.< > c.( T . - l ) i i . , I X I i = l x x i = l By Theorem 2.11 I I I I c.A°< f c.A. < f c. ( T . - l ) . , I l — I l — I l ' i = l A A i = l x x i = l Now we s h a l l derive a s u f f i c i e n t c o n d i t i o n f o r loops with g=l Lemma 2.13 Let L be a loop with g=l, and W\=T̂  f o r i = l , 2 , . . , L e t A? be the i n i t i a l number of data words on the i - t h branch f o r i=l,2,..,£. 
I f Z c.A. < max c .V . i = l 1 1 I<j<l J J then the loop i s s e l f - t e r m i n a t i n g , PROOF: Let k be such number that Tli en JL max c -V . = c, Vf, c .*—, c, 1 i = l k t— n i ^ k Since 23 I c.A° = I c.A. where A. i s the current number of words on the i - t h branch we have l X* c X c A, < 5" — A. = 7 — A°<W, = T, k c, l c, l k k i=l k i=l k Thus the number of words on the k-th branch w i l l never reach the threshold and the number of performances of node iij. i s zero. By Lemma 2.2 every node of the loop terminates. Hence the theorem. Theorem 2.14 Let L be a loop with g=l. Let A? be the i n i t i a l number of data words on the i - t h branch, i = l,2,..,X. A s u f f i c i e n t condition for self-termination of L is that 0 1 y c.A < y c. (T.-W.) + max c.V . i = i 1 1 i i i 1 1 1 i * g j t J J PROOF; Suppose that L does not terminate. Then after each node performed at least once, A i ^ W i , e - A i = ( V V + A i where A[> 0 for i=l,2,.. If we replace A^ and T^ by A| and T̂ =Vr_̂ , respectively, the resulting loop L' w i l l have the same termination properties as L, i . e . w i l l not terminate. . Then by Lemma 2.13 24 2_ c'A! > max c'.W. i=l 1 1 l£j*Jt J J Since c!=c for i=1.2,*.,£ f c.A 0 = f c.A. = ? c. (T.-YT. ) + [ c.A! = £ c. (T.-V. ) + J_ c!A!> i T i 1 1 i t i 1 1 1 1 1 i = i 1 1 i = i 1 1 1 i = l 1 x ~ T c(T.-W.) + max c'.W. = T c.(T.-W.) + max c .W. i t l 1 1 1 l<j<Z J J i = l 1 1 1 l*j£ J J which i s a contradiction. To i l l u s t r a t e how strong the conditions of Corollary 2.12 and Theorem 2.14 are consider the following two simple examples; Example 2.4 n4 W=T=2 * U=2 Example 2.5 W=T=1 U=l Here c^=c-j=l c 2=c 4=2 Ic.A 0 = 2 =Ic.(T.-l) i i *~ l x l and the loop terminates A=0 Here c^=c 2=l and max c.W.=1 l<j£2 J J 7c.A?=l= Zc•(T.-W.)+raax c.W. 1 1 1 1 1 l<g£2 J J and the loop does not terminate Termination of loops Fig.3 25 2•3 A dire c t method of testing termination properties of strong components• As noted in /6/, the Karp-Miller algorithm requires the inspec- t i o n of each loop of a strong component G' when G' is not self-term- inating. If G' contains many loops a more direc t way i s desirable. Consider the shortest-route algorithm given in / l 3 / . It is used i n Step 2A of the Karp-Miller algorithm for testing termiation properties (see part 2.1). If we use mu l t i p l i c a t i o n and associate U A with branch d ,'rather than addition and log(U Ar ), then, in p/ p p ° p' p' ' ' • the absence of loops with g < l , the algorithm results in assigning a rat i o n a l number to each node of G'. On multiplying these numbers by the least common product of thei r denominators each node n. w i l l have an integer OC . assigned. l ° l ° If d =(n.,n.) i s a branch of a loop L with g=l, then 06. /oi. =\j A . p i ' j 1 o > j ' i p ' p If L has g > 1, then for one and only one branch do we haveOc^/<X< <U A ; for other branches of L OC./CX.̂ U A • P P' J/ i P P Consider now an arbitrary loop L of G' with g=l. If L is s e l f - terminating then by Theorem 2.8 the number of performances i s for at least one node n.€ L less thanOC. . J J We shall show that the same i s v a l i d for loops with g>l. Theorem 2.15 If L i s a self-terminating loop with g> 1, then at least one component of X is less than the corresponding component of X? here X = X, X, X , x*= a l oc 0 OC,, a a ( U 1 A 1 ) a(u 1A 1)(u 2A 2)-(fiV ;i-i ) 26 where a i s an ar b i t r a r y parameter and X^ is the number of jierf ormances of node n^, i=l,2,..,X. 
PROOF; By Theorem 4 of /6/, X i s the minimum nonnegative solution of (E-A)X>/3 , i . e . 0 1 0 0 u. 0 0 , . 0 0 0 • • • u Jt-1 V l 0 x l V x 2 ^ 2 • > • • • where /3-, = W 1 and fc. A i - 1 - T i - 1 + 1 V l i i — 2 , 3 , . . ,/£ Then (E-A)X* = U. 0 0 0 0 0 0 0 0 u 'l-l l - l 0 a tfl W2 ^1^2 V l W l W l 27 (E-A)X* = a Now consider the vector X-X*. (E-A)(X-X*) = (E-A)X - (E-A)X* > /3 If no component of X is less than the corresponding component of X*, then (X-X*) is a smaller nonnegative integer solution of (E-A)X>/| than X, which is a contradiction. Theorems 2.15, 2.8, and 2.4 serve as a basis for a procedure for testing termination properties of strong components, which may be outlined as follows: Step 1. Apply the shortest-route algorithm modified as shown above. If a loop with g<1 i s found go to Step 2; otherwise continue u n t i l each node n. of the strong component is assigned a constant^. . l ° l Step 2. Apply the i t e r a t i o n scheme of Theorem 2.9 to the nodes of the strong component G1 u n t i l either a) the scheme terminates, establishing that G' is s e l f - terminating and giving the number of performances x(j) for each n . € G 1 , o r 3 b) for some n and some k, x (k) exceeds the upper bound C*k on x(k), establishing that G' i s not self-terminating. 1-g 0 0 0 ^ 0 -; CHAPTER 3 STORAGE REQUIREMENTS In the Karp-Miller model each branch d^=(n^,nj) represents a queue of data words which may be an output of operation 0^ assoc- iated with node n^ and which may serve as an input for operation 0j associated with node n^. Each data word has to be stored in a memory location of a computer performing the computation, and the maximum number of memory locations required becomes of int e r e s t . Chapter 3 i s devoted to th i s problem. 3•1 Maximum storage requirement - special case In this part we present the results of / i f . Let us introduce a branch parameter tr > 0: If d =(n..n.),X r p p i J p i s the fixed time required by node n^ to fetch i t s input data from storage, process these data, and place outputs into memory locations associated with the queue on branch d . Thus i f n. i n i t i a t e s at time p l t, i t places U data words upon branch d at time t+tT . P P P um Another parameter we introduce is T^=max^TJ^| where the maxim is taken over a l l branches d directed out from node n.. P i A schedule is a set 4-{̂ -̂  ' * * ,(->t} w n e r e each is a funct- ion 6. : {l,2,..,X.} -» R such that for l < k < r £ X . l ^ ( k K d . U ) Here R i s the set of real numbers and X^ i s the number of i n i t i a t i o n s 28 29 of node n^ for any proper execution of G. If X^=0, i s undefined. With each we associate a function xi : R-{o,l,2,..,X.} x.(t)=0 i f f either X.=0 or X . i l and t < 4 . ( l ) . 1 x 1 l l For l < k < X i , x t(t)=k i f f 6 (k) ̂  t< 6 (k+l) For X. >1, x.(t)=X. i f f <S-(X.)<t. I ' i i i v i For every branch d =(n.,n.) define P i j b d(t)=A +U x.(t-T )-W (x.(t)-E.(t)) p P p i p p J J where £.(t)=l i f there exists k, l ^ k ^ X . such that 6.(k)=t £..(t)=0 otherwise. A schedule is c a l l e d an admissible schedule i f , for j=l,2,..,X <S.(k)=t b 6 ( t ) > T j P P for a l l branches d into n.. and for a l l k, 1 £ k £ X . . P J J A schedule <o is sequential i f for no nodes n., n., with 1 i ' j n . . do we have i J 6.(k)< d.(r)< 6 . (k)+T. I J I I , f o r l £ k £ X . , l £ r < X . . i J These d e f i n i t i o n s are to be interpreted as follows: <^(k)=t means that node n^ begins i t s k-th i n i t i a t i o n at time t under the schedule (J. 
x^(t) is the number of i n i t i a t i o n s of node I K , up to and including time t, under the schedule <i. b^(t) is the number of data words on branch d^ at time t. The quant- i t y £.(t) is introduced for the following reason: A l l data transmit- J ted to node n^ by node n^ v i a the branch (n^,n^) must f i r s t pass 30 through storage. Then i f t i s a time at which n. does not i n i t i a t e , the storage requirement at time t i s b 6 ( t ) = A +U x . ( t - f )-W x . ( t ) . P P P i • P P J If n^ i n i t i a t e s at time t, the number of data words in storage at time t i s b 6 ( t ) = A +U x . ( t - r )-V ( x . ( t ) - l ) P P P i P P j An admissible schedule specifies those node i n i t i a t i o n times cor- responding to a proper execution. Thus a node n^ i n i t i a t e s at time t (d>.(k)=t for some l<k<X.) only i f each branch d directed into v l v I '  J p n. contains at least T data words at time t. (b^(t)>T ). F i n a l l y , l p ' p p a sequential schedule is a schedule under which no node i n i t i a t e s at the same time that some other node is executing. For any admissible scheduled, define A> = max T b 6 (t) C 6 t p P jkJ^ thus defines the maximum amount of storage required by the admissible schedule 6. Lemma 3.1 Let 6 be an admissible schedule for a computation graph G. Then there exists an admissible sequential schedule (a.s.s.) 6 such that ^ 6 ' * <^6 This Lemma shows that in general a p a r a l l e l system i s more econom- i c a l than a sequential system in the sense that i t needs fewer storage locations. 31 Let ^U/ = max 6 i s &n admissible schedule^ . ^ i s the maximum number of memory locations that a computation graph G can require. Corollary 3.2 i_, £M = oax( ^/<? is an a.s.s.) For any admissible schedule define T<* = {t/t£ (6.(10,4. (kJ+'E) for 0 < k £ X . , i = l,2,..,/}. are those times during which no node of G is executing under the schedule 6 ; however, nodes of G may just be i n i t i a t i n g or terminat- ing under 6 at some of the times t€ T^. We then have the following r e s u l t : Corollary 3.3 (Ms- wax max { ^ ( " t ) / C J is an a.s.s. and t 6 T^} . 6 t Write b. f ° r ^ n e column vector with p-th component A , W for the column vector with p-th component W , T for the column vector with p-th component T , and, for any admissible schedule <$> define _b^(t) to be the column vector with p-th component b ^ ( t ) , p=l,2,..,t. Define a tXj£, matrix A with elements a .=\V i f branch d i s directed into n. but not also out from n.. PJ P P J J a . =-U i f brancli d is directed out from n. but not also into n.. P J P P J J a . =Vr - U i f branch d i s directed out from n. into n.. P J P P P J J a .=0 otherwise. P J 32 F i n a l l y , l e t X be a column vector with i - t h component X^, i=l,2,.., Theorem 3.4 # Let G s a t i s f y b.>T-W. Then is determined by the following integer linear program: |b| - minU ^ I subject to 0 <y_< X where each y^, i=l,2,..£, i s an integer. Theorem 3.4 enables us to determine the maximum amount of storage required for a given computation graph, provided b>T_-W. What i f thi s is not sa t i s f i e d ? We quote from /l/: "In those cases where this inequality is violated, the program /of Theorem 3.4/ i s inapplicable. 
Under such contingency one pos- s i b l e course of action i s to simulate a l l possible admissible schedules for G u n t i l a d i s t r i b u t i o n of data i s obtained which s a t i s f i e s the above i n e q a l i t y , and then apply the program /of Theorem 3.4/ to this data d i s t r i b u t i o n . Then the maximum storage requirement is either that obtained through the simulation phase, or that obtained by the program, whichever i s the greater. In general, however, such a scheme would be impractical due to the pot e n t i a l l y large number of possible d i f f e r e n t simulations i n - volved. " n # If ii i s a n-dimensional vector, then define |x[ = T, x. . ~~ i = l 1 33 3.2 Maximum storage requirement - general case A node n^ of a computation graph can i n i t i a t e only i f for every branch d^ directed into n^ the number of data words assoc- iated with this branch i s not less than the corresponding threshold T . Consequently, the number of data words on this branch after the i n i t i a t i o n i s not less than T -V . P P Consider a computation graph Cr which does not s a t i s f y b> T-W, i . e . there exists at least one branch d^=(n^,n..), d^ £ G such that A < T -Vir . Let us assume that G contains only one node terminal to p p p such a branch. Later we s h a l l show how to extend the results to the case of more nodes terminal to such branches. Clearly, the data d i s t r i b u t i o n w i l l s a t i s f y the requirement b >T-W after the f i r s t i n i t i a t i o n of n.. In the following we derive J a method for finding the maximum storage required prior to the f i r s t i n i t i a t i o n of n.. Then we apply methods of part 3.1 to determine the maximum storage required after the f i r s t i n i t i a t i o n of n.. J Given a computation graph G modify i t as follows: For the node n^ jmt a l l V .=T -=0 a l l U • .=0 J s Note that the data d i s t r i b u t i o n of the modified graph G' i s not affected by i n i t i a t i o n s of n.. In the following parameters of G' w i l l be primed. Lemma 3.5 Let <o = (6- (k. ), d>. (k. ), . . ,<J. (k. )} be a schedule, L x l x l r2 12 xm xm where ^ i ^ , i 2 , . . , i m } - I C { l , 2 , . . , j - 1 , j + 1,., l) and 34 x l x l x i x2 x2 x2 x2 x2 x3 x3 d i (k £ < <£. (k. ) . m-1 m-1 m-1 m ra Then 6 i s admissible for G' i f f i t i s admissible for G; moreover b*.(t)=b*(t) for 0 < t £ c ? . (k. ). m m PROOF: By d e f i n i t i o n x.(t)=o i f f t ^ 6 . ( i ) x i(t)=k i f f t >6\(k) and t£<S\ (k+l) Therefore x.(t)=x!(t) for i € l and 0 < t < d . (k. ) l l x ~- I l in m and x. (t)=x! (t)=0 for i / l and 0 < t < 6 . (k. ). i i i i m m Also £ i(t) = £j(t) for i € I and 0 < t < d>i (1^ ), m rn and £. (t) = 6! (t)=0 for i / l and 0<t<6- (k. ). i i / i i m m Let us examine b d(t)=A +U x (t-V )-V (x (t)-£ (t)) for d =(n p p p r p p s s p r There are four d i f f e r e n t cases depending on whether r and s are elements of I or not. ( i ) r € I, s € I Here U =U1 , W =V»M P P P P and b 6(t)=A +U x ( t - t )-\f (x (t)-£ (t))=lj 6(t) p v p p r v p p s x s p x ( i i ) r e l , s ^ I Here U p = U p » x g(t)=x^(t)=0, £ g(t)=£^(t)=0 and b*(t)=A p+U ix r(t-r p)=b'J(t) ( i i i ) r £ I, s 6 I H ere Vp = Vp» x r(t)=x^.(t)=0 and b p (t )=Ap-Wp (x s ( t ) - £ s (t) )=b'J( t) (iv) r £ l , s £ l Here x f (t )=x̂ .( t )=xs (t )=x^ (t ) = £ g (t ) = £, (t )=0 and b 6(t)=A =b6{t) P P P 35 We have thus established that b£(t) = b£,(t) for 0*t<6. (k. ). 
ni m Moreover, for a l l d =(n ,n ), s€ I 'we have T'=T and ' p v r' s ' p p b 6 ( t ) > T P P i f f . b'<Ht)>T' for 0 < t £ 6 . '(k. ). p p 1 1 A r ram This proves that 6 i s admissible for G i f f i t i s admissible for G'. Lemma 3.6 The maximum storage S' required for the modified graph G' max is the same as the maximum storage S required for G prior max to the f i r s t i n i t i a t i o n 6 • (1) of n.. J J PROOF: By Corollary 3.2 we need only consider sequential schedules. The proof w i l l consists of 3 parts. ( i ) Every sequential s chedul e d> = i 6>. (k. ), <•> . (k. ) , . . , ( i . (k. ) , . .> 1 1 2 2 m m can be divided into time intervals <o,6. (k ) ) , <<3. (k ),6 (k )),..,<d (k ) , L )),.. X l X l X l 1 l X2 12 1m 1m Sa+l ^m+l From the d e f i n i t i o n of b^(t) and £^(t) i t follows that S(t)< S(0) i f t<6- (k. ) 1 1 and S(t)<S(<S. (k. )) i f (k. ) < t < d i (k. ) in ra m m m+1 m+1 where 3(t)=|b 6(t)| Thus to get S i t i s s u f f i c i e n t to examine S(t) at t=0, <£. (k. ), max l ̂  "̂ 1' (>• (k. ), d>. (k. ),..; i n other words i t i s s u f f i c i e n t to examine the 12 x2 x3 x3 integer sequence S ( where S]ii=S(<S. (k. ). m m . ( i i ) Here we s h a l l show that'S' i s i n f i n i t e i f f S i s i n f i n i t e . v max max Let N be an a r b i t r a r y integer. If S' i s i n f i n i t e , then there exists 36 a schedule («> and integer M' such that SM • > N Take the i n i t i a l part of d> up to d> . (k. ) and omit a l l i n i t i a t i o n s Hi' XM' of n.. Then by Lemma 3.5 we get an admissible schedule 6 for G which J requires the same storage as C>'. Thus for some M<M' we have S M = S M « > N " In the same fashion we can show that S' i s i n f i n i t e i f S i s max max i n f i n i t e . ( i i i ) Now suppose that S' i s f i n i t e . Then sequences S' are bound-v 1 1 max x m ed by S 1 and there exists such a schedule d' and integer M' that J max S^t=S' . Take the i n i t i a l part of 6 up to 6. (k. ) and omit a l l M max l^p l ^ j , the i n i t i a t i o n s of n.. This by Lemma 3.5 w i l l give us an a.s.s. for G with the same storage requirements as 6 . Thus S ' =SV'( i —S., £v£> max iM' M max On the other hand, since S i s f i n i t e , there exists a schedule Q ' max ' > and integer N such that N max The i n i t i a l part of this schedule up to O. (k. ) i s an a.s.s. o' for XN • \\T G' with the same storage requirements as o* This gives s =s5=s'£̂  S' max N N max From this and S' < S we obtain max max S =S1 max" max Corollary 3.7 Let a l l branches of G s a t i s f y A >T -Yi with a possible ex- p - p p 1 ception of branch (n.,n.). Modify G as follows: Put a l l Vrj.=Tr;.=0, a l l U j g=0 37 Then the maximum storage S ^ required f o r G p r i o r to the f i r s t i n i t i a t i o n of n. i s determined by the f o l l o w i n g integer l i n e a r program: max 1 —' 1 d-' subject to o < y k < x ^ . k/j 0 < y • < oo • A'vib-T'+W' where A', T', W', X' are parameters of the modified graph G'. PROOF: Proof f o l l o w s from Lemma 3.6 and Theorem 3.4. Lemma 3.8 Let G be a computation graph whose a l l branches s a t i s f y A >T - V i ' with a po s s i b l e exception of a branch (n.,n.,). p p p 1 i ' j " Let X.> 0. Then £=b^(t), where _c i s defined belov/, for some a.s.s. <S and some t € T^, t > <S. (1) i f f there e x i s t such integers y^ , i = l , 2 , . . ,J& which s a t i s f y ( i ) °-yk- Xk k ^ 1 < y • < X . J J ( i i ) £=P_-AX^1-1L PROOF: Analogous to that of Theorem 2.4 of /l/. 
C o r o l l a r y 3.9 Let G be a computation graph whose a l l branches s a t i s f y A >T -V with a p o s s i b l e exception of a branch (n.,n.). p p p 1 * i J 38 Let X.>0. Then the maximum storage S^ required after the j ° max 1 f i r s t i n i t i a t i o n of n. i s determined by the following integer J l i n e a r program: S"*" =1 b I -inin j Ay I max ' — 1 J-1 subject to o < y k < \ k^j Ay_<b - T + ¥ where ea.ch y k i s integer. Theorem 3.10 Let G be a computation graph whose a l l brandies s a t i s f y A >T -YY with a possible exception of a branch (n.,n.). P P P 1 J Let X^ > 0. Then the maximum storage required is M/ = max(3 , ) < max' max PROOF: I. a) Suppose S ^ S 1  1 1 max max There exists an a.s.s. for the modified graph G' which requires S m^ x of storage. By Lemma 3.5 this schedule i s admissible also for G and therefore i s an i n i t i a l part of some a.s.s. for G. Thus A/> S > 3 1 C max max and XC>max(S ,3"̂  ) C — max' max b) Now suppose S >S 1 1 max— max By Corollary 3.9 S^ a x= I b | -1 Ay_°l whor e | Ay_°| = m i n | Ay_ | subject to 0 ^ y k < X k k^j 39 Then by Lemma 3.8 S 1 =|b 6(t)|- for some a.s»s.» and n i a x /W/̂ S1 >s C max— max i . e . Xt/>raax(S ,3̂ " ) <• max' max II . Let (•> be an arbitr a r y schedule for G. Then |b 6 ( t ) | < S for t < d . ( l ) 1 — x " max j and I b d (t )| ̂  3 1 for t > 6 . ( l ) 1 — 1 max — j Thus ii/<iraax(S , ) Cr — max' max The results of I. and I I . give M/= raax(S , S"̂  ), q.e.d. b < max' max ' ̂ Theorem 3.10 and Corollaries 3.7 and 3.9 thus provide means for finding the maximum storage for graphs where one and only one node i s terminal to a branch which does not s a t i s f v A > T -W . - p- p p In case there are' r such nodes, we take a subset of these nodes { n. ,n. ,..,n. , where 0 < s r. For a l l branches directed ^ Xl X2 s into the nodes of the subset we put Yir=T=0, and for a l l branches directed out from the nodes of the subset we put U=0. Then to this modified graph we apply the following integer linear program: S =jbl-min|Ay I max 1 — 1 I J. i subject to 0 < y k < X k M n ,n ,..,n. 1 2 s l £ y • £ x . xi x l 1 < y • < X . 16 y • £ X . J i i s s r r Since there are 2 sucli subsets we have 2 lin e a r programs. The maximum required storage i s the maximum of the 2 p a r t i a l maximum storage requirements. CHAPTER 4 CONCLUSIONS The Karp-Miller algorithm for testing termination properties of computation graphs i s based on the termination properties of loops. It i s , therefore, desirable to have means for testing loops. A quantity related to the number of data words in a loop i s introduced, which decreases for g < l , increases for g > l , and remains constant for g=l, in the course of computation. This concept makes i t possible to derive a simple s u f f i c i e n t condition (Theorem 2.14) for self-termination of loops with g=l, and to give a shorter and i n t u i t i v e l y more s a t i s f y i n g proof (Corollary 2.12) of necessary condition of Theorem 2.6 due to Karp and M i l l e r . In the special case that \V=U, the necessary and s u f f i c i e n t condition i s given (Theorem 2.10). This condition has a simple form due to the fact that data propagation has l o c a l character in this case. Since this i s i n v a l i d i n the general case, one probably cannot hoj)e for a simple form of the general necessary and s u f f i c i e n t condition. The Karp-Miller algorithm i s not well suited for computation graphs with many loops. 
Therefore, in part 2.3 of Chapter 2 a direct procedure for testing termination properties of strongly connected graphs is derived. The procedure does not require inspection of every loop as in the Karp-Miller algorithm. However, it also uses the iteration scheme of Theorem 2.9, which for large graphs may be too lengthy.

Reiter in /7/ gives a linear integer program for determining the maximum amount of storage required in the special case that b >= T - W. Part 3.2 of Chapter 3 extends his method to cover the general case. The number of linear programs required in our method increases as 2^r, where r is the number of nodes terminal to branches which do not satisfy A_p >= T_p - W_p, but it appears to be more efficient than simulation of all possible schedules /7/, especially for highly parallel computations.

REFERENCES

/1/ Gregory J. and McReynolds R., "The SOLOMON computer", IEEE Trans. on Electronic Computers, vol. EC-12, pp. 774-781, Dec. 1963.
/2/ Schwartz J., "Large parallel computers", J. ACM, vol. 13, No. 1, pp. 25-32, Jan. 1966.
/3/ Slotnick D.L. et al., "The ILLIAC IV computer", IEEE Trans. on Computers, vol. C-17, No. 8, pp. 746-757, Aug. 1968.
/4/ Thurber K.J. and Myrna J.W., "System design of a cellular APL computer", IEEE Trans. on Computers, vol. C-19, No. 4, pp. 291-303, April 1970.
/5/ Koczela L.J. and Wang G.Y., "The design of a highly parallel computer organization", IEEE Trans. on Computers, vol. C-18, No. 6, pp. 520-529, June 1969.
/6/ Karp R.M. and Miller R.E., "Properties of a model for parallel computations: determinacy, termination, queueing", SIAM J. Appl. Math., vol. 14, No. 6, pp. 1390-1411, Nov. 1966.
/7/ Reiter R., "A study of a model for parallel computations", Techn. Report No. RADC-TR-68-350, Rome Air Development Center, Nov. 1968.
/8/ Reiter R., "Scheduling parallel computations", J. ACM, vol. 15, No. 4, pp. 590-599, Oct. 1968.
/9/ Martin D.F., "The automatic assignment and sequencing of computations on parallel processor systems", Ph.D. Thesis, Dept. of Eng., UCLA, Jan. 1966.
/10/ Martin D.F. and Estrin G., "Path length on graph models of computations", IEEE Trans. on Computers, vol. C-18, No. 6, pp. 530-536, June 1969.
/11/ Baer J.L.E. and Estrin G., "Bounds for maximum parallelism in a bilogic graph model of computations", IEEE Trans. on Computers, vol. C-18, No. 11, pp. 1012-1014, Nov. 1969.
/12/ Reiter R., "On assembly-line balancing problems", Operations Research, vol. 17, No. 4, pp. 685-700, July-Aug. 1969.
/13/ Ford L.R. and Fulkerson D.R., Flows in Networks, Princeton University Press, Princeton, 1962, pp. 130-134.
