"Science, Faculty of"@en . "Computer Science, Department of"@en . "DSpace"@en . "UBCV"@en . "Sinha, Prem Swarup"@en . "2010-03-29T20:54:20Z"@en . "1981"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "Analytic modelling has proven to be cost-effective in the performance evaluation of computer systems. So far, queueing theory has been employed as the main tool. This thesis extends the scope of analytic modelling by using optimization techniques along with queuing theory in solving the decision-making problems of performance evaluation. Two different problems have been attempted in this thesis. First, a queueing network model is developed to find the optimal capacities and speeds of the memory levels in a memory hierarchy system operating in a multiprogrammed environment. Optimality is defined with respect to mean system response time under a fixed cost constraint. It is assumed that the number of levels in the hierarchy as well as the capacity of the lowest level are known. The effect of storage management strategy and program behaviour are characterised by the miss ratio function which, together with the device technology cost function, is assumed to be represented by power functions. It is shown that the solution obtained is globally optimal. Next, two adaptive schemes, SELF and MULTI-SELF, are developed to control the flow of jobs in a multiprogrammed computer system. They periodically determine the number of jobs from each class that should be activated to minimize the mean system residence time without saturating the system. The computation is based on the estimated system workload in the next interval. An exponential smoothing technique is used to reduce the error in estimating the values of the model parameters. The service provided to each class (specifically, the mean response time) may be adjusted by changing the weight associated with the job class. From our simulation results, the schemes appear to be both stable and robust. Performance improvement over some of the existing schemes (50%, L=S and the Knee criteria) is significant under some workloads. The overhead involved in its implementation is acceptable and the error due to some of the assumptions in the formulation and solution of the model are discussed."@en . "https://circle.library.ubc.ca/rest/handle/2429/22966?expand=metadata"@en . "OPTIMIZATION TECHNIQUES IN COMPUTER SYSTEM DESIGN AND LOAD CONTROL by PREM SWARUP SINHA B.Sc. , D e l h i U n i v e r s i t y , I n d i a , 1971 M.Sc, Indian I n s t i t u t e of Technology, New D e l h i , I n d i a , 1973 DIIT, Indian I n s t i t u t e of Technology, New D e l h i , I n d i a , 1974 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF. THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science) We accept t h i s t h e s i s as conforming to the r e q u i r e d standard THE UNIVERSITY OF BRITISH COLUMBIA . SEPTEMBER 1981 \u00C2\u00A9 Prem Swarup Sinha, 1981 In present ing t h i s t h e s i s in p a r t i a l f u l f i l m e n t of the requirements fo r an advanced degree at the Un ivers i ty of B r i t i s h Columbia, I agree that the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e fo r reference and study. I fur ther agree that permission for extensive copying of t h i s t h e s i s for s c h o l a r l y purposes may be granted by the head of my department or by h i s or her representa t ives . It i s understood that copying or p u b l i c a t i o n of t h i s thes is for f i n a n c i a l gain s h a l l not be allowed without my wr i t ten permiss ion . Department of Oovw^iAr/v ^ . e ^ c x The Un ivers i ty of B r i t i s h Columbia 2075 Wesbrook Place Vancouver, Canada V6T 1W5 Date I li . 10. 81 i i S u p e r v i s o r : Samuel T. Chanson ABSTRACT A n a l y t i c m o d e l l i n g has proven to be c o s t - e f f e c t i v e i n the performance e v a l u a t i o n of computer systems. So f a r , queueing theory has been employed as the main t o o l . T h i s t h e s i s extends the scope of a n a l y t i c m odelling by using o p t i m i z a t i o n techniques along with queuing theory i n s o l v i n g the decision-making problems of performance e v a l u a t i o n . Two d i f f e r e n t problems have been attempted i n t h i s t h e s i s . f . . . F i r s t , a queueing network model i s developed to f i n d the optimal c a p a c i t i e s and speeds of the memory l e v e l s i n a memory h i e r a r c h y system o p e r a t i n g i n a multiprogrammed environment. O p t i m a l i t y i s d e f i n e d with respect to mean system response time under a f i x e d c o s t c o n s t r a i n t . I t i s assumed that the number of l e v e l s i n the h i e r a r c h y as w e l l as the c a p a c i t y of the lowest l e v e l are known. The e f f e c t of storage management s t r a t e g y and program behaviour are c h a r a c t e r i s e d by the miss r a t i o f u n c t i o n which, together with the de v i c e technology c o s t f u n c t i o n , i s assumed to be represented by power f u n c t i o n s . I t i s shown that the s o l u t i o n o b t a i n e d i s g l o b a l l y o p t i m a l . i i i Next, two adaptive schemes, SELF and MULTI-SELF, are developed to c o n t r o l the flow of jobs i n a multiprogrammed computer system. They p e r i o d i c a l l y determine the number of jobs from each c l a s s t hat should be a c t i v a t e d to minimize the mean system r e s i d e n c e time without s a t u r a t i n g the system. The computation i s based on the estimated system workload i n the next i n t e r v a l . An e x p o n e n t i a l smoothing technique i s used to reduce the e r r o r i n e s t i m a t i n g the values of the model parameters. The s e r v i c e p rovided to each c l a s s ( s p e c i f i c a l l y , the mean response time) may be adjust e d by changing the weight a s s o c i a t e d with the job c l a s s . From our s i m u l a t i o n r e s u l t s , the schemes appear to be both s t a b l e and robust. Performance improvement over some of the e x i s t i n g schemes (50%, L=S and the Knee c r i t e r i a ) i s s i g n i f i c a n t under some workloads. The overhead i n v o l v e d i n i t s implementation i s acceptable and the e r r o r due to some of the assumptions i n the for m u l a t i o n and s o l u t i o n of the model are d i s c u s s e d . i v TABLE OF CONTENTS 1. . INTRODUCTION AND PROBLEM DEFINITION 1 1.1 The.Importance of Performance E v a l u a t i o n ........... . 1 1.2 Approaches to Computer M o d e l l i n g 2 1.3 A n a l y t i c M o d e l l i n g 3 1.4 Th e s i s Overview 7 2. MEMORY HIERARCHY DESIGN 9 2.1 I n t r o d u c t i o n and Review 9 2.2 System D e s c r i p t i o n , Assumptions and Notation ....... 14 2.3 Queueing Model 19 2.4 Formulation of the O p t i m i z a t i o n Problem . 27 2.5 S o l u t i o n of the O p t i m i z a t i o n Problem 33 3. ADAPTIVE LOAD CONTROL ', . 43 3.1 I n t r o d u c t i o n and Review 43 3.2 System Model 55 3.3 S a t u r a t i o n E s t i m a t i o n 57 3.4 Optimal S e l e c t i o n 67 4. MULTICLASS CONTROL 78 4.1 I n t r o d u c t i o n 78 4.2 M u l t i c l a s s S a t u r a t i o n E s t i m a t i o n ... 79 4.3 M u l t i c l a s s Optimal S e l e c t i o n 85 5. SIMULATION RESULTS AND ERROR ANALYSIS 89 5.1 I n t r o d u c t i o n 89 5.2 D e s c r i p t i o n of the Simulator 90 V 5.3 Si m u l a t i o n R e s u l t s 96 5.4 E r r o r A n a l y s i s 111 6. CONCLUSIONS AND EXTENSIONS 116 REFERENCES 121 APPENDIX A 128 v i LIST OF TABLES 2.1 Optimal Storage S i z e s 40 5.1 Mean Response Time 97 5.2 Resp. Time f o r V a r i o u s Values of/xin L=S C r i t e r i o n ...... 98 5.3 Resp. Time and %Imp. f o r Workload i n Table A.3 ........ 99 5.4 Resp. Time and %Imp. f o r Workload i n Table A.4 ........ 99 5.5 Resp. Time and %Imp. f o r Workload i n Table A.5 100 5.6 Resp. Time and %Imp. of SELF, 50%, L=S and Knee ...... 104 5.7 SELF Resp. Time with D i f f . Weights Under Heavy Load .. 105 5.8 SELF Resp. Time with D i f f . Weights Under L i g h t Load .. 105 5.9 SELF and MULTI-SELF Resp. Time with S t a t i c Beta ..: 107 . 5.10 SELF and MULTI-SELF Resp. Time With Dynamic Beta 109 5.11 SELF Resp. Time With S t a t i c and Dynamic Beta 110 5.12 MULTI-SELF With S t a t i c and Dynamic Beta 110 5.13 % E r r o r and %Imp. Under S t a t i c , Dynamic, No Smoothing . 113 A.1 Workload f o r Table 5.1 129 A.2 Workload f o r Table 5.2 129 A. 3 Workload f o r Table 5.3 129 A.4 Workload f o r Table 5.4 129 A.5 Workload f o r Table 5.5 131 A.6 Workload f o r Table 5.6 131 A.7 Workload f o r Table 5.7 132 A.8 Workload f o r Table 5.8 132 A.9 Workload f o r Tables 5.9 through 5.12 132 v i i LIST OF FIGURES 2.1 Storage H i e r a r c h y i n a Multiprogrammed System 16 2.2 Queueing Network Model \"A\" of System .............19 2.3 Queueing Network Model \"B\" of System 21 3.1 Simple C e n t r a l Server Model .. .... \u00E2\u0080\u0094 47 3.2 S a t u r a t i o n p o i n t 49 3.3 Under-saturated and Ov e r - s a t u r a t e d Regions . 5 0 3.4 General C e n t r a l Server Model 55 3.5 S a t u r a t i o n Point ..' 57 3.6 System Bottleneck 62 3.7 Load C o n t r o l Model 67 3.8 A c t i v e C o n s t r a i n t s 72 3^9 I n a c t i v e C o n s t r a i n t .. 73 5.1 Mean System Jobs Under L i g h t Load 101 5.2 Mean System Jobs Under Heavy Load 102 v i i i ACKNOWLEDGEMENT I would l i k e to thank my a d v i s o r , Dr. Samuel T.. Chanson, f o r h i s guidance and constant encouragement duri n g the past three years of my graduate s t u d i e s . I am a l s o t h a n k f u l to the members of my committee, Dr. David R. C h e r i t o n , Dr. David G. K i r k p a t r i c k , Dr. U r i Ascher and Dr. Mabo R. I t o f o r t h e i r i n t e r e s t and c o n s t r u c t i v e comments. I would a l s o l i k e to thank my c o l l e a g u e s E z i o C a t a n s a r i t i and Mark C. Fox for c a r e f u l l y reading an e a r l y v e r s i o n of the t h e s i s . A very p e r s o n a l note of g r a t i t u d e to my parents, b r o t h e r s and s i s t e r s . I t was almost impossible f o r me to accomplish t h i s achievement without t h e i r constant support and encouragement from h a l f way round the globe. F i n a l l y I am g r a t e f u l to the Department of Computer Scien c e , U n i v e r s i t y of B r i t i s h Columbia f o r the f i n a n c i a l h elp i n the form of Teaching A s s i s t a n t s h i p and U n i v e r s i t y Summer Research f e l l o w s h i p . $8.12, S8.31T $SIG 1 CHAPTER 1 INTRODUCTION AND PROBLEM DEFINITION J_.J_ The Importance of Performance E v a l u a t i o n Performance c r i t e r i a are major c o n s i d e r a t i o n i n the design of computer systems. T h e r e f o r e , knowledge about program c h a r a c t e r i s t i c s , system load and optimal design theory are important a i d s to the p r e v a l e n t i n t u i t i v e and ad hoc methods. The i n c r e a s i n g demand f o r computer resources has made computer system d e s i g n e r s , data p r o c e s s i n g and co r p o r a t e managers, system a n a l y s t s , programmers as w e l l as the user community at l a r g e more concerned about system e f f i c i e n c y and u t i l i z a t i o n . D espite c o n t i n u a l r e d u c t i o n i n the cost of hardware, i n e f f i c i e n t resource u t i l i z a t i o n r e p r e s e n t s unnecessary wastage. Moreover, there i s an i n c r e a s i n g interdependency among us e r s . Whether i n a l a r g e c e n t r a l i z e d system or i n a d i s t r i b u t e d computing environment where resources are shared, poor performance of 2 one component a f f e c t s many users. There are d i f f e r e n t measures of performance such as response time, throughput rate and v a r i o u s resource u t i l i z a t i o n s . O p t i m i z a t i o n of one measure does not n e c e s s a r i l y optimize other measures. Therefore i t i s important to s e l e c t a proper measure or a combination of measures to be c o n s i s t e n t with the o b j e c t i v e s of a p a r t i c u l a r study. Once the system i s i n p r o d u c t i o n , s t r u c t u r a l problems, p a r t i c u l a r l y those r e l a t e d to the computer a r c h i t e c t u r e are o f t e n d i f f i c u l t to remove. A s i g n i f i c a n t change in the system s t r u c t u r e may be r e q u i r e d to remove b o t t l e n e c k s i n the system. Therefore i t i s d e s i r a b l e to study a model of the system before i t i s c o n f i g u r e d . T h i s a l s o a l l o w s the d e s i g n e r s to study d i f f e r e n t models of the system and s e l e c t the best s u i t e d f o r the s p e c i f i c a p p l i c a t i o n . J_.2 Approaches to Computer System M o d e l l i n g The study of computer system m o d e l l i n g and performance a n a l y s i s employs three d i f f e r e n t methods v i z . , measurement, s i m u l a t i o n and a n a l y t i c techniques. D i r e c t measurement techniques are not always a p p l i c a b l e , p a r t i c u l a r l y to design problems because they r e q u i r e the system to be i n e x i s t e n c e . A l s o p r o d u c t i o n systems may not be a v a i l a b l e 3 f o r e xperimentation. Moreover, i t may not be p r a c t i c a l to implement c e r t a i n c o n t r o l mechanisms i f the system was not i n i t i a l l y designed to f a c i l i t a t e e xtensions or m o d i f i c a t i o n s . S i m u l a t i o n techniques are the most popular, since they . are able to represent the c h a r a c t e r i s t i c s of the modelled system more c l o s e l y than i s p o s s i b l e with c u r r e n t l y a v a i l a b l e a n a l y t i c techniques. However s i m u l a t i o n models are expensive to b u i l d , v a l i d a t e and uses Although a n a l y t i c models are more c o s t - e f f e c t i v e f o r e v a l u a t i n g and p r e d i c t i n g computer system performance, they are harder to formulate and to s o l v e i n g e n e r a l . Recent advances have made a n a l y t i c models i n c r e a s i n g l y capable of r e p r e s e n t i n g more c l o s e l y the c h a r a c t e r i s t i c s of the modelled system. j_. 3 A n a l y t i c Model l i n g Queueing theory has been the major t o o l used i n a n a l y t i c m o d e l l i n g of computer systems. A queueing network i s a network of s e r v i c e c e n t r e s . Each c e n t r e has one or more queues a s s o c i a t e d with i t . Customers wait i n queues for t h e i r turn to be served by the s e r v e r . At each s e r v i c e c e n t r e the d i s t r i b u t i o n of the s e r v i c e times f o r each c l a s s of customers and the s c h e d u l i n g a l g o r i t h m are known and 4 f i x e d . A f t e r being served at a cent r e the customer moves to other s e r v i c e c e n t r e s or e x i t s from the system a c c o r d i n g to a f i x e d set of t r a n s i t i o n p r o b a b i l i t i e s which may or may not be c l a s s and/or workload dependent. A queueing network i s s a i d to be open i f job a r r i v a l s are independent of job departures and.the number of jobs present i n the system. In a c l o s e d queueing network, the number of jobs remain c o n s t a n t . Whenever a job leaves the system, a s t a t i s t i c a l l y i d e n t i c a l job i s i n j e c t e d i n t o the system to r e p l a c e i t . A computer system can be c o n s i d e r e d as a network of queues with v a r i o u s p r o c e s s i n g d e v i c e s as the s e r v i c e c e n t r e s . The a n a l y s i s of a queueing network can be c l a s s i f e d as: ( i ) e x a c t ' a n a l y s i s using c l a s s i c a l queueing theory, ( i i ) approximate techniques, ( i i i ) o p e r a t i o n a l a n a l y s i s . As d i s c u s s e d below, none of the three approaches i s s u p e r i o r to the others under a l l circumstances. Exact s o l u t i o n s were f i r s t given by Jackson [Jack63]. 5 He showed that i n a queueing network composed of e x p o n e n t i a l s e r v i c e c e n t r e s , the equations governing the e q u i l i b r i u m d i s t r i b u t i o n of the system s t a t e s e x h i b i t a product form. This means the p r o b a b i l i t y that the queueing network i s i n a p a r t i c u l a r s t a t e i s equal to the product of the p r o b a b i l i t i e s that each i n d i v i d u a l s e r v i c e c e n t r e i s i n i t s c o rresponding s t a t e d i v i d e d by a n o r m a l i z i n g c onstant. Gorden and Newell [GoNe67 ] s i m p l i f i e d the product form s o l u t i o n f o r c l o s e d queueing networks by d e v e l o p i n g a s e p a r a t i o n of v a r i a b l e s s o l u t i o n technique. Buzen [Buze73] in t r o d u c e d the widely used c e n t r a l server model and provided e f f i c i e n t a l g o r i t h m s f o r computing the n o r m a l i z i n g c o n s t a n t s , the e q u i l i b r i u m d i s t r i b u t i o n of system s t a t e s , throughput r a t e s and queue l e n g t h d i s t r i b u t i o n s . Most of the work i n queueing theory assumed i d e n t i c a l jobs u n t i l Basket et a l . [BCMP75] extended these r e s u l t s to cover m u l t i p l e c l a s s e s of jobs, d i f f e r e n t queueing d i s c i p l i n e s and non-exponential s e r v i c e d i s t r i b u t i o n s . Approximate techniques give e i t h e r an approximate s o l u t i o n to the o r i g i n a l network or an exact s o l u t i o n of an approximate model. Norton's theorem as d e f i n e d by Chandy et a l . [ChHW75b] (analogous to Norton's theorem i n e l e c t r i c a l engineering) p l a y s a major r o l e i n the development of approximate techniques. The model i n v o l v e s r e p l a c i n g a subnetwork that has a s i n g l e input stream and a s i n g l e output stream with a s i n g l e composite s e r v i c e 6 c e n t r e . The r e s u l t i n g queueing network i s rep e a t e d l y s i m p l i f i e d u n t i l i t permits a n a l y s i s by g l o b a l balance techniques. I t i s shown th a t , i f the o r i g i n a l system has the l o c a l balance pro p e r t y , then the reduced model i s exact in the sense that the queue l e n g t h d i s t r i b u t i o n at the s e r v i c e c e n t r e s (not the. system) are i d e n t i c a l i n the o r i g i n a l and the reduced network. Good r e f e r e n c e s f o r approximate techniques can be found i n Chandy et a l . [ChHW75b] and C o u r t o i s [Cour77]. O p e r a t i o n a l a n a l y s i s was f i r s t i ntroduced by Buzen [Buze76] and a good survey can be found in [DeBu78]. With t h i s approach, one can ob t a i n most of the r e s u l t s of s t o c h a s t i c a n a l y s i s without making the assumptions ( o f t e n u n r e a l i s t i c ) r e q u i r e d by queueing theory. The main drawback of t h i s technique i s that i t i s not s u i t a b l e f o r performance p r e d i c t i o n and does not lend i t s e l f to the study of t r a n s i e n t system behaviour. Although queueing theory i s an e s s e n t i a l t o o l f o r computer system performance m o d e l l i n g , i t does not s o l v e the problems of decision-making. For example, i t can provide the values of v a r i o u s system performance measures for a combination of system parameters but i t does not provide a best combination of system parameters that w i l l o ptimize c e r t a i n performance measures. Recently, some work has been done to apply o p t i m i z a t i o n techniques along with 7 queueing theory to s o l v e such problems ([ChSi80a], [ChSi80b], [HiMT79], [TrWa80], [TrWS80]). In t h i s t h e s i s , o p t i m i z a t i o n techniques are used along with time s e r i e s a n a l y s i s and queueing theory i n s o l v i n g the decision-making problems of l a r g e computer systems. The techniques developed have been a p p l i e d to the two problems of (a) the design of a memory h i e r a r c h y , and (b) the job flow c o n t r o l i n a multiprogrammed system. J_._4 T h e s i s Overview The t h e s i s c o n s i s t s of f i v e more ch a p t e r s . In chapter 2, the problem of memory h i e r a r c h y design i n a multiprogrammed system i s s t u d i e d . F i r s t a queueing network model of a multiprogrammed memory h i e r a r c h y system i s developed. An ex p r e s s i o n f o r i t s response time i s then computed i n terms of the c a p a c i t i e s and speeds of v a r i o u s l e v e l s of memory. The optimal c a p a c i t i e s and speeds are then obtained by minimizing the response time s u b j e c t to a f i x e d c ost c o n s t r a i n t . A m o d i f i e d v e r s i o n of geometric programming i s used to s o l v e the min i m i z a t i o n problem. A c l o s e d form s o l u t i o n i s d e r i v e d and shown to be g l o b a l l y o p t i m a l . Chapter 3 d e s c r i b e s and compares the r e l a t i v e m e r i t s of v a r i o u s e x i s t i n g load c o n t r o l mechanisms. An e x p r e s s i o n f o r the -estimate of the s a t u r a t i o n p o i n t i s then d e r i v e d 8 using o p e r a t i o n a l a n a l y s i s . An optimal s e l e c t i o n of batch and t e r m i n a l jobs that should be maintained in the system i s a l s o computed. This chapter a l s o d e s c r i b e s how time s e r i e s a n a l y s i s can be used in the e s t i m a t i o n of the system parameters r e q u i r e d f o r computing the s a t u r a t i o n p o i n t estimate. One of the main assumptions of the c o n t r o l scheme developed in t h i s chapter i s that a l l the batch and t e r m i n a l jobs are s t a t i s t i c a l l y i d e n t i c a l i n t h e i r resource demands. Chapter 4 r e l a x e s the i d e n t i c a l job assumption of the / model developed in chapter 3 . The scheme developed here can handle any p r a c t i c a l number of m u l t i p l e c l a s s e s of jobs. Jobs w i t h i n a c l a s s are s t a t i s t i c a l l y i d e n t i c a l but jobs i n d i f f e r e n t c l a s s e s can be d i f f e r e n t . In the extreme case, each job can belong to a d i f f e r e n t c l a s s . Chapter 5 d e s c r i b e s a s i m u l a t o r f o r a c e n t r a l server model. The two schemes d e s c r i b e d i n chapters 3 and 4 were simulated and t h e i r performance compared to some of the e x i s t i n g schemes. The assumptions made throughout the development of the two c o n t r o l schemes and the e r r o r s i n t r o d u c e d by them are d i s c u s s e d . Chapter 6 summarizes the t h e s i s and suggests f u r t h e r r e s e a r c h d i r e c t i o n s . 9 CHAPTER 2 MEMORY HIERARCHY DESIGN 2 . 1 I n t r o d u c t i o n and Review A major c r i t e r i o n of computer system design i s the e f f i c i e n c y of i t s memory resource u t i l i z a t i o n . . Although the cost . of memory i s de c r e a s i n g , the demand f o r computer resources i s i n c r e a s i n g even more r a p i d l y , and i n some a p p l i c a t i o n s the problems are g e t t i n g l a r g e r . T h e r e f o r e , i t i s important to have an optimal design f o r the memory system. The memory system i s g e n e r a l l y a combination of a v a r i e t y of t e c h n o l o g i e s with d i f f e r e n t cost-performance c h a r a c t e r i s t i c s . Such an assembly of i n t e r c o n n e c t e d d e v i c e s i s g e n e r a l l y c a l l e d a memory h i e r a r c h y . Management of the memory h i e r a r c h y r e q u i r e s determining where to s t o r e s p e c i f i c i n f o r m a t i o n , how to r e t r i e v e i t and f i n a l l y when to move i t from one l e v e l to another. The o b j e c t i v e i s to maintain the more f r e q u e n t l y used data i n the f a s t e s t (and t h e r e f o r e most expensive) device i n order to minimize the 10 r e t r i e v a l time. I t i s a l s o necessary to reco g n i z e when the data are no longer needed so that they can be moved to a slower (and cheaper) d e v i c e . The design problem considered in t h i s chapter i s to f i n d the optimal s i z e s and speeds of d i f f e r e n t memory devices given a f i x e d c o s t c o n s t r a i n t . In the past, t h i s problem was t r e a t e d using h e u r i s t i c approaches r a t h e r than q u a n t i t a t i v e . methods. Recently, q u a n t i t a t i v e methods have been developed which y i e l d a b e t t e r understanding of memory h i e r a r c h y c o n f i g u r a t i o n e v a l u a t i o n . The o p t i m i z a t i o n of a memory h i e r a r c h y i s recognized as an important r e s e a r c h area [ T r i v 7 8 ] and has been approached from s e v e r a l d i r e c t i o n s . V a r i o u s s o l u t i o n s , optimal under c e r t a i n c o n s t r a i n t s , have been obtained. Ramamoorthy and Chandy [RaCh70] have c o n s i d e r e d the problem of f i n d i n g the type and s i z e of each l e v e l of memory under c e r t a i n assumptions. T h e i r method i n v o l v e s s o l v i n g a l i n e a r programming problem with the average access time of an in f o r m a t i o n block in a program as the o b j e c t i v e f u n c t i o n and a given h i e r a r c h y cost as the c o n s t r a i n t . The r e s u l t s are then extended to a general case of multiprogramming. The approach presupposes the knowledge, of frequency of access f o r each information b l o c k . A drawback of the model i s t h a t i t uses average access time as the o b j e c t i v e f u n c t i o n while i g n o r i n g the 11 delays due to queues formed i n f r o n t of the lower l e v e l s of memory. MacDonald and Sigworth [MaSi75] have d e a l t with v a r i o u s combinations of o p t i m i z a t i o n c r i t e r i a such as f i x e d c o s t c o n s t r a i n t , f i x e d and v a r i a b l e page s i z e e t c . They too assume knowledge of the storage address sequence and have used i t s s t a t i s t i c a l p r o p e r t i e s e x t e n s i v e l y i n t h e i r work. The o b j e c t i v e f u n c t i o n to be minimized i s again the average access time, or a f u n c t i o n of i t ( i g n o r i a g the queueing d e l a y s ) , which i m p l i e s that even with a v a i l a b l e \ program r e f e r e n c e s t r i n g s the scheme cannot be a p p l i e d to multiprogramming. Chow [Chow74] has very n i c e l y a p p l i e d geometric programming to obtain not only the optimal s i z e and speed of each memory l e v e l , but a l s o the optimal number of l e v e l s of h i e r a r c h y f o r a given c o s t c o n s t r a i n t . The u n i t of inf o r m a t i o n t r a n s f e r i s a f i x e d - s i z e page and the page s i z e i s the same f o r a l l l e v e l s . The e f f e c t of the page replacement a l g o r i t h m s , the program behaviour (and hence the workload) and the page s i z e are captured by a h i t - r a t i o f u n c t i o n . A h i t - r a t i o f u n c t i o n i s d e f i n e d as the p r o b a b i l i t y of s u c c e s s f u l l y r e t r i e v i n g the needed in f o r m a t i o n from a p a r t i c u l a r l e v e l of memory. Furthermore, the h i t - r a t i o f u n c t i o n and device technology cost f u n c t i o n are taken as a power f u n c t i o n of the c a p a c i t y 1 2 and access time r e s p e c t i v e l y of each l e v e l of memory h i e r a r c h y . Chow's a n a l y s i s ignores queueing delays and hence i s a l s o r e s t r i c t e d to a uniprogramming system. Welch [Welc78] g i v e s a very simple and s t r a i g h t f o r w a r d a n a l y s i s of a. memory h i e r a r c h y f o r speed-cost t r a d e - o f f with the assumption that the s i z e and access p r o b a b i l i t y of each l e v e l of memory are known and f i x e d . Rege [Rege76] uses a very simple two-server queueing network model to analyze the cost-performance t r a d e - o f f by using d i f f e r e n t s i z e s and speeds at d i f f e r e n t memory l e v e l s . There i s no \ o p t i m i z a t i o n study i n e i t h e r case. F o s t e r and Browne [FoBr76] are among the f i r s t to e x p l i c i t l y account f o r queueing at dev i c e s i n the memory h i e r a r c h y to study a r e l a t e d but d i f f e r e n t problem - that of f i l e assignment i n the h i e r a r c h y . T r a i g e r and . Mattson [TrMa7l] consider the design problem of a storage h i e r a r c h y using a c e n t r a l s e r v e r model. But t h e i r model i s r e s t r i c t e d to three l e v e l s . In a l a t e r study, L i n and Mattson [LiMa72] extend the technique to four l e v e l s . Because of the exhaustive nature of the search f o r the optimal s o l u t i o n , the technique seems to be i m p r a c t i c a l when the number of l e v e l s i n the h i e r a r c h y i n c r e a s e s beyond f o u r . Gecsei and Lukes [GeLu74] reduced the complexity of the model to some extent but i n doing so they had to approximate each stage of the network as an open-loop queue 13 with random a r r i v a l . T r i v e d i , Wagner and Sigmon [TrWS80] have used a c l o s e d queueing network model to f i n d optimal CPU speed, device c a p a c i t i e s and a l l o c a t i o n of a set of f i l e s a cross the secondary storage d e v i c e s by maximizing the throughput r a t e . In another work, T r i v e d i and Wagner [TrWa79] have obtained the speeds of v a r i o u s secondary d e v i c e s f o r a given set of t r a n s i t i o n p r o b a b i l i t i e s i n a c l o s e d queueing network. Most previous work i n - o p t i m i z a t i o n of memory h i e r a r c h i e s used the mean memory access time as the o b j e c t i v e f u n c t i o n f o r m i n i m i z a t i o n . With a few exceptions (e.g.,[RaCh70], [TrWa79] and [TrWS80]\u00E2\u0080\u0094 the l a t t e r two were done i n p a r a l l e l with t h i s work [ChTr79]), they d e a l t only with the uniprogrammed environment where only one process i s a c t i v e at any time and the processor i s i d l e when a request i s made to any memory l e v e l . I t i s not c l e a r that i n a multiprogrammed environment, minimizing the average memory access time i s meaningful s i n c e a process may be blocked while i t i s r e f e r e n c i n g i n f o r m a t i o n i n a c e r t a i n l e v e l of memory. The f o l l o w i n g a n a l y s i s combines performance e v a l u a t i o n techniques and o p t i m i z a t i o n methods to extend the a n a l y s i s of Chow [Chow74] to cover multiprogrammed systems. Mean response time i s chosen as the o b j e c t i v e f u n c t i o n . With the number of memory l e v e l s 14 f i x e d and the c a p a c i t y of the lowest l e v e l known, an expres s i o n f o r response time i s obtained in terms of the c a p a c i t y and speed of each l e v e l of memory. The optimal expression of memory s i z e s and speeds are then obtained using the Lagrangian f u n c t i o n under a f i x e d cost c o n s t r a i n t . N o t i c e that in the uniprogramming environment, . Average response time = c, * Average access time of the memory h i e r a r c h y + c 2 average number of accesses to the memory h i e r a r c h y per i n t e r a c t i o n , , mean CPU time demand of the process per i n t e r a c t i o n . The parameters c, and c 2 are constants f o r a given process. Hence the average response time and the average memory h i e r a r c h y access time are e q u i v a l e n t o b j e c t i v e f u n c t i o n s i n the uniprogrammed environment. 2.2 System D e s c r i p t i o n , Assumptions and Notation The memory h i e r a r c h y c o n s i s t s of N l e v e l s , M,, M2, MN, where N i s known and f i x e d . G e n e r a l l y , the higher the l e v e l ( i . e . , the smal l e r the index) the smaller i s the c a p a c i t y , the f a s t e r i t s speed and the more expensive i s i t s u n i t c o s t . I t i s assumed that i n f o r m a t i o n where c, = and c 2 = 15 present i n any l e v e l i s a l s o present i n a l l subsequent lower l e v e l s . T h i s assumption may not be tr u e f o r a l l l e v e l s ( p a r t i c u l a r l y f o r lower l e v e l s of memory). In the case of a uniprogramming system, whenever the needed in f o r m a t i o n i s not found i n the highest l e v e l M, , a request i s made to each of the lower l e v e l s s u c c e s s i v e l y u n t i l i t i s found i n a l e v e l M;; i = 2, ..., N. The processor i s h e l d w a i t i n g a l l the time u n t i l the i n f o r m a t i o n i s r e t r i e v e d from M;. As i i n c r e a s e s , the time r e q u i r e d to f e t c h the i n f o r m a t i o n goes up. When i exceeds a c e r t a i n v a l u e , i t becomes uneconomical to keep the processor i d l e while the i n f o r m a t i o n i s being * r e t r i e v e d from M;, p a r t i c u l a r l y when there are other processes w a i t i n g for the pr o c e s s o r . Thus, i n the case of multiprogramming, we have two types of memory - A and B.. While the processor waits f o r access to type A memory, i t does not do so f o r access to type B memory, but r e l e a s e s the c u r r e n t process and takes the next process ready to run, i f one e x i s t s . I t i s t h e r e f o r e p o s s i b l e f o r s e v e r a l requests to queue up at a type B memory l e v e l but there i s at most one request at any one time f o r a type A memory l e v e l . The model of such a system i s shown i n F i g u r e 2.1 where n, and n 2 are the number of type A and type B memory l e v e l s r e s p e c t i v e l y , x,, i = 1, 2, N (N = n, + n 2) i s the c a p a c i t y of memory l e v e l Mj i n the h i e r a r c h y . n 1 f n 2 and x N are assumed to be g i v e n . 17 F o l l o w i n g Chow's [Chow74] terminology, we d e f i n e the f o l l o w i n g : y; , i = 1, 2 N i s the mean t r a n s f e r time of a un i t of in f o r m a t i o n from l e v e l Mj to Mj., ( t h i s does not i n c l u d e the queue wait time f o r type B memory l e v e l s ) and y-, i s simply the mean access time of the f a s t e s t memory. K(x) i s the p r o b a b i l i t y of f i n d i n g the r e q u i r e d i n f o r m a t i o n i n a memory l e v e l with c a p a c i t y x. The h i t - r a t i o f u n c t i o n p; i s t h e r e f o r e given by the d i f f e r e n c e i n p r o b a b i l i t y of f i n d i n g the inf o r m a t i o n i n M;. but not f i n d i n g i t in M;.,. i . e . , P ; = H ( x j ) - H ( x ; . 1 ) , i = 1, 2, N; H(x 0) = 0. (2.1) The miss . r a t i o F(x) i s simply 1 - H(x) and i s assumed to be a power f u n c t i o n of c a p a c i t y x, a p o s i t i v e constant R, and o, d e f i n e d as: - o F(x) = K, x (2.2) The technology cost f u n c t i o n ( i . e . , u n i t cost of a storage l e v e l with t r a n s f e r time y to the next higher l e v e l ) i s assumed to take the form: 18 b(y) = K 2 y (2.3) where p and K 2 are p o s i t i v e c o n s t a n t s . Without l o s s of g e n e r a l i t y , we take K, = K 2 = 1, i . e . , equations (2.2) and (2.3) become -a F(x) = x (2.2') -p and b(y) = y. (2.3') -1/c T h i s means K,-. i s the u n i t . f o r storage c a p a c i t y and K 2 i s the u n i t f o r c o s t . E m p i r i c a l data have shown that equations (2.2) and (2.3) are good approximations f o r the h i t - r a t i o f u n c t i o n and technology c o s t f u n c t i o n r e s p e c t i v e l y (see [Chow74], [Mats71],[Rege76], [Welc78]). Mattson's [Mats7l] e m p i r i c a l h i t - r a t i o data , f o r example, can be approximated by a power f u n c t i o n , Rege's data [Rege76] support a e = 0.5 i n (2.3) i f system c o s t s rather than component c o s t s are used. However, as s t a t e d i n [Welc78], because of a m b i g u i t i e s due to 1) system c o s t s versus component c o s t s , 2) r a p i d change in technology, and 3) approximations needed to estimate the average access times on non-random access d e v i c e s such as d i s c s , tapes e t c . , fi ( i n (2.3')) may take on widely d i f f e r e n t values (the range of 0.2 to 0.6 has been used i n [LiMa72])~ depending on the p a r t i c u l a r a p p l i c a t i o n being analyzed. S i m i l a r l y , due to d i f f e r e n t program behaviour and storage management s t r a t e g i e s , the v a l u e s of o i n (2.2') w i l l vary from a p p l i c a t i o n to a p p l i c a t i o n . 2.3_ Queueing Model The queueing model of the system d e s c r i b e d i n the p r e v i o u s s e c t i o n i s shown i n F i g u r e 2.2. FIGURE 2.2: QUEUEING NETWORK MODEL \"A\" OF SYSTEM 20 The a r r i v a l p a t t e r n of requests i s assumed to f o l l o w a Poisson d i s t r i b u t i o n with mean a r r i v a l r a t e u. Q i , Q 2 / \u00E2\u0080\u00A2\u00E2\u0080\u00A2\u00E2\u0080\u00A2> <3r\, a r e t n e p r o b a b i l i t i e s of r e f e r e n c i n g memory l e v e l s M, , M 2, M M | r e s p e c t i v e l y and p l r p 2 , pr^ a r e the p r o b a b i l i t i e s of r e f e r e n c i n g memory l e v e l s H n i t , , M H - + 1 v r e s p e c t i v e l y . The p r o b a b i l i t y of e x i t ( i . e . , t e r m i n a t i o n of task or completion of a request) i s Po \u00E2\u0080\u00A2 D e f i n e Q = I q i , i = 1 n 2 P = I pi , i = 0 then c l e a r l y , P + Q = 1. For type A memory, the mean e f f e c t i v e h i e r a r c h y access time T; to l e v e l M;(i < n,) i s the sum of the mean i n d i v i d u a l t r a n s f e r time between two co n s e c u t i v e l e v e l s from Mi up to M, 21 i i . e . , T; = I y;. j = 1 i . In the case of type B memory the mean t r a n s f e r times y i ' s are taken as the i n v e r s e of the s e r v i c e r a t e s of the memory l e v e l s . Furthermore, i t i s assumed that the s e r v i c e r a t e s are e x p o n e n t i a l l y d i s t r i b u t e d with mean 1/yi, i = n, +\u00E2\u0080\u00A2 1, n, + n 2 . The mean e f f e c t i v e h i e r a r c h y access time of type B memory l e v e l s are not so e a s i l y obtained because of p o s s i b l e queueing of requests at these l e v e l s . Furthermore, s i n c e a process may be blocked while a c c e s s i n g these l e v e l s , i t i s more reasonable to use mean response time (or mean request completion time) as the c r i t e r i o n of o p t i m i z a t i o n . To do so, we f i r s t transform the model i n F i g u r e 2.2 to the model i n F i g u r e 2.3. 22 FIGURE 2.3: QUEUEING NETWORK MODEL \"B\" OF SYSTEM 23 Here p;' = p;/P and the mean s e r v i c e time of ce n t r e C 0 is n ' = P.K n 1 with 1/K = E q i /\"it i i = l and *; = 1/T; i = 1, 2, . .., n, (2.4) Taking u + u 1 f - u 1 f u 2 , \"n^to be the a r r i v a l r a t e s of c e n t r e s C 0 ( . C 1 ( C ^ , r e s p e c t i v e l y , and assuming the s e r v i c e d i s c i p l i n e s of the c e n t r e s do not depend on the fut u r e s e r v i c e time requirements or on the fut u r e path of a job through the network, we now compute the response time of the network i n model B. Assuming that the i n t e r a r r i v a l - t i m e d i s t r i b u t i o n and a l l s e r v i c e time d i s t r i b u t i o n s are e x p o n e n t i a l , the model can be analyzed f o l l o w i n g Jackson's [Jack63] approach of c o n s i d e r i n g each s e r v i c e c e n t r e as an M/M/1 queueing model. Using L i t t l e ' s Law, the mean response time of the network i s given by r n, R = 1 I mean number of jobs i n centre i u L l = 1 n 2 _ i _ * P i u i = 1 1 - p; 24 where p\ \u00E2\u0080\u00A2= (mean job a r r i v a l r a t e of s e r v i c e c e n t r e Cj / mean s e r v i c e r a t e of c e n t r e C\) Now the mean job a r r i v a l r a t e f o r s e r v i c e c e n t r e Cj / y + u, i = 0' u ; i = 1, 2, ..., n 2 and the mean s e r v i c e r a t e of s e r v i c e c e n t r e y 1 \u00E2\u0080\u00A2 i i = 0 i = 1, 2, , n 2 Hence R = -1 n,+n2 (a + a, ) v\u00E2\u0080\u00A2'\u00E2\u0080\u00A2 \u00E2\u0080\u00A2'-+\u00E2\u0080\u00A2 I u j . 1 y j 1-(u+u, ),/f ' i=n, + 1 1 - u i . i y i (2.5) From F i g u r e 2.3, job a r r i v a l r a t e f o r s e r v i c e c e n t r e Cj i s given by \u00C2\u00AB n- i - 1 = ( \u00C2\u00AB + \u00C2\u00AB i ) P ' n r i + 1 + u n - 1 + 2 i = 1 2 , . . . , n 2 which, a f t e r some a l g e b r a i c m a n i p u l a t i o n , can be shown to be e q u i v a l e n t to i = (u + o!) I p|' 25 From equation ( 2 . 1 ) , f o r type B memory, we have C J n 2 N 2 I p; = I [ H(x, + n ) - H ( x j + T V 1 ) ] j=i . j = i (t a k i n g H(x ^) = 1 ) = 1 - H ( x ; + t t r i ) -a = X ! + n - i S u b s t i t u t i n g i n (2.6), we get \u00C2\u00AB i' = M x i + n r 1 where = u _ Po = > A l S O tt 1 - 1 = V X , . , . n, ( o + u , ) / > \u00C2\u00BB ' = o _ E Po i=1 j = i J S u b s t i t u t i n g q; = H(x;) - H(x;.,) -a and F ( x ; ) = 1 - H ( x ; ) = x ; r we have, n, - o ( o + u 1)/^'=>/ -E \u00E2\u0080\u00A2 x ; . , y ; . i = 1 S u b s t i t u t i n g (2.7) and (2.8) i n t o (2.5) we o b t a i n R = J Po n, -a i x;. 1 y i i = 1 ri| - o 1 - 1 K x;., y; i=-1 n,+n 2 + E i=n,+1 x i - 1 y i 1-\" x;., y; (2.7) (2.8) (2.9) where x 0 = 1 . N o t i c e that the degree of e x p l i c i t l y represented in the multiprogramming i s not model. I t i s g e n e r a l l y 26 recognized t h a t , f o r a given s i z e of the main memory, the degree of multiprogramming a f f e c t s the performance of the system. Some systems t r y to maintain the degree of multiprogramming at a f i x e d l e v e l . However, except for some simple systems where f i x e d p a r t i t i o n memory management i s used, the r e s i d e n t sets of a c t i v e processes change d r a s t i c a l l y with time and even the average s i z e d i f f e r s from a p p l i c a t i o n to a p p l i c a t i o n . The i n f l u e n c e of the degree of multiprogramming on performance i s l a r g e l y due to i t s e f f e c t on the miss r a t i o f u n c t i o n . In f a c t , i n a paging system using a f i x e d a l l o c a t i o n s t r a t e g y , the mean CPU time i n t e r v a l between c o n s e c u t i v e page f a u l t s e was found to be p r o p o r t i o n a l to a power f u n c t i o n of the s i z e of the r e s i d e n t set m [BeKu69] e oc m K I f main memory i s shared e q u a l l y by the a c t i v e processes then there i s a d i r e c t r e l a t i o n s h i p between the values of parameter a of the m i s s - r a t i o f u n c t i o n and K. The system designer i s not as much i n t e r e s t e d i n the absolute v a l u e s of the degree of multiprogramming as i n i t s e f f e c t on system performance (which i n t h i s model i s represented by the value of o i n ( 2 . 2 ) ) . 27 2.4 F o r m u l a t i o n of the O p t i m i z a t i o n Problem S i n c e the t e c h n o l o g y c o s t per u n i t of i n f o r m a t i o n i s g i v e n by M y ) = y the system c o s t w i t h s t o r a g e s i z e s x , , x 2 , . . . , x N f or l e v e l s M 1 f M 2, M N w i t h average t r a n s f e r time y 1 r \u00C2\u00A521 \u00E2\u0080\u00A2 Y N r e s p e c t i v e l y , i s g i v e n by N -p S = I x; y ; ' (2.10) i = l G i ven N, x N, and t h a t the memory system c o s t i s not t o exceed S 0 , the o p t i m i z a t i o n problem becomes: n, - a I x i . i y i n ,+n2 -a Min i = 1 + Z x; . i y ; n, - a i =n, - a 1 - 1 I- x j . , y j 1-\u00C2\u00BB x i . i y i i=1 N -p s . t . I x i y i < S 0, i = 1 x 0 = 1; x i > 0; y i > 0 i = 1, 2, ...,N (2.11) The problem (2.11) w i l l have a s o l u t i o n o n l y i n the r e g i o n where n, -o \u00C2\u00A3 f x i . , y i < 1 i = l - c and n x i . , y i < 1 i = n, + 1, n, + n 2 ' T h i s r e s t r i c t i o n meets one of the assumptions made w h i l e c a l c u l a t i n g the e q u i l i b r i u m s t a t e p r o b a b i l i t y , i . e . , the 28 t r a f f i c i n t e n s i t y 1 has to be s t r i c t l y l e s s than one [ F e r r 7 8 ] . Now by m u l t i p l y i n g the o b j e c t i v e f u n c t i o n by np 0 (a constant) and adding 1 + n 2 to i t , the problem (2.11) reduces to n,+n 2 Min 1 + 1 1 n i - a i=n| -o 1-1 \u00E2\u0080\u009E x i - . ' n y; 1-c n i . . r y i -i = 1 N -(> s . t . j _ l x ; y ; < 1 (2.12) \u00E2\u0080\u00A2 So i=l The n a t u r a l c o n s t r a i n t s x\ > 0 : and y; > 0 can be ignored i n the c a l c u l a t i o n by l o o k i n g at a s o l u t i o n only i n the p o s i t i v e region of x and y. Int r o d u c i n g new v a r i a b l e s r 0 and r ; ' s such that n, -o r 0 ^ 1 - I n x;., y; i = 1 -a and r j . , < 1 - K x; . \u00E2\u0080\u00A2, y i ; i = n, + 1, n,+n 2 The problem (2.12) i s e q u i v a l e n t to n 2 Min r 0 1 + E r T 1 i = 1 1 T r a f f i c i n t e n s i t y i s d e f i n e d as the r a t i o of the a r r i v a l r a t e to the s e r v i c e r a t e at a s e r v i c e c e n t r e . 29 n i - o r o < 1 - E M x; . , y; i=1 - o r ; . , < 1 - f x ; . , y \ ; i = n, +1 , . . . , n , +n 2 N - p J _ I x i yj < 1. (2.13) S 0 i=1 Seemingly the above problem (2.9) f a l l s under the framework of the standard Geometric Programming. However, i t does not s a t i s f y one of the c o n d i t i o n s r e q u i r e d , i . e . , the number of v a r i a b l e s minus the number of c o n s t r a i n t s equals one. The d i f f e r e n c e here i s n, + 2. Therefore we expect an n, + 1 parametric s o l u t i o n . Now when we w r i t e the Lagrangian equations f o r t h i s problem, we f i n d that e x a c t l y n, + 1 equations can be d e r i v e d from the r e s t of the equations. Therefore i n theory we should be able to e l i m i n a t e the above n, + 1 parameters. However, i n doing so, the o r t h o g o n a l i t y equations thus obtained become no n - l i n e a r and we are f o r c e d to s o l v e a system of n o n - l i n e a r equations. Thus the very advantage of using Geometric Programming i s l o s t . We s h a l l t h e r e f o r e turn to the Lagrangian m u l t i p l i e r method supplemented with a technique s i m i l a r (but not e x a c t l y equal) to Geometric Programming. T h i s i s why d e t a i l e d d e r i v a t i o n of the r e s u l t s are given r a t h e r than simply s t a t i n g the r e s u l t s . Before s o l v i n g the problem (2.13) we s h a l l f i r s t show that any s o l u t i o n to t h i s problem i s g l o b a l l y o p t i m a l . s. t . and 30 Let (X\u00C2\u00B0,Y\u00C2\u00B0) be any s t a t i o n a r y p o i n t of the problem and R(X,Y) = \u00E2\u0080\u009E p 0 R(X,Y) + n 2 + 1 1. e 3R(X,Y) ?>x i = 0 and R(X, Y) d y i (X\u00C2\u00B0 fY\u00C2\u00B0) (X\u00C2\u00B0,Y\u00C2\u00B0) i = 1, 2, t N where R i s as d e f i n e d i n (2.5). We s h a l l show that (X\u00C2\u00B0,Y\u00C2\u00B0) i s a g l o b a l minimum of R(X,Y). The f o l l o w i n g two theorems [Hadl72] are used i n the pro o f : Theorem 1. The sum of convex f u n c t i o n s i s convex. Theorem 2. I f g i s a monotonically nondecreasing convex f u n c t i o n d e f i n e d on the convex subset S, c. R 1 and i f f i s a convex f u n c t i o n d e f i n e d on the convex subset S n c. R) 1 then the composite f u n c t i o n g ( f ) i s a convex f u n c t i o n on S We now transform the o r i g i n a l f u n c t i o n R(X,Y) to R'(U,V) using the t r a n s f o r m a t i o n x; = e\" U i a n d y i = e\" v\u00C2\u00BB and o b t a i n 31 N R' (U , V) = I -1 i=n 1 +1 -ou \ . \u00E2\u0080\u00A2, +v \ 1 -. M e 1 n, -ttui . -i +v; 1 - I n e i = 1 Since each of the terms e~ v>and e\" v\u00C2\u00BB are convex in t h e i r r e s p e c t i v e v a r i a b l e s , i t i s easy to show, by repeated a p p l i c a t i o n of Theorems 1 and 2 that the f u n c t i o n R'(U,V) i s convex. I t now remains to be shown that the p o i n t (U\u00C2\u00B0,V\u00C2\u00B0) i n R'(U,V), which corresponds to the s t a t i o n a r y p o i n t (X\u00C2\u00B0,Y\u00C2\u00B0) in R(X,Y), i s a l s o a s t a t i o n a r y p o i n t of R'(U,V) and hence a g l o b a l optimal p o i n t s i n c e R'(U,V) i s convex. Case 1 i = n, + 1, n ,+2, ..., N R' (U,V) = A. bu \ -1 1 - i> e (U\u00C2\u00B0,V\u00C2\u00B0) (U\u00C2\u00B0,V\u00C2\u00B0) = (-\u00E2\u0080\u009E)(-o) e -oui+v; + -ou;+vi + , ( 1 - * e )2 (-\u00E2\u0080\u009E)(-o)x; y;., -o (1 \" f x; y \ + , ) 2 (u\u00C2\u00B0,v\u00C2\u00B0) (X\u00C2\u00B0,Y\u00C2\u00B0) * R(X,Y) * x j (X\u00C2\u00B0,Y\u00C2\u00B0) = 0 32 Case 2. \u00E2\u0080\u00A2i = 1 , 2, - o u ; + v ; + r S R 1 ( U , V ) = o u j b (e - o u i + v i + , d u I 1 \u00C2\u00BB< e = \"d R(X,Y) b x \ ( X \u00C2\u00B0 , Y \u00C2\u00B0 ) = 0 ( fol lowing same arguments as in Case 1) S imi la r l y b R ' ( U , V ) = 0 ( U \u00C2\u00B0 , V \u00C2\u00B0 ) However, ( U \u00C2\u00B0 , V \u00C2\u00B0 ) is the global minimum of R ' (U,V) , and the transformation between (X,Y) and (U,V) is unique. Therefore ( X \u00C2\u00B0 , Y \u00C2\u00B0 ) is the global minimum of R(X,Y) . Using s imi lar arguments i t can be shown that the const ra in t problem (2.9) has a g lobal minimum. Tr ived i and Sigmon [TrSi80] have independently shown [ChTr79] that th is design problem in general has a global optimal so lut ion although the i r object ive function was d i f f e ren t and they used a c losed queueing network model. 33 2.5 S o l u t i o n of the O p t i m i z a t i o n Problem n 2 n, -o F(R,X,Y,X) = r 5 1 + I r f 1 + X 0 ( r 0 + I x i . , y i - 1) i=1 . . i=1 n,+n 2 + I X i . , ( r ; i=n, +.1 x i . i y i - l ) N - fi + X ' ( _ 1 _ I x i y i - 1) S 0 i=1 (2.14) D i f f e r e n t i a t i n g w i t h r e s p e c t t o R, X, Y and X r e s p e c t i v e l y and e q u a t i n g t o z e r o , we o b t a i n , b r 0 a F = b r i bF bx i (-1 ) r-0 1 + X 0 r 0 = 0 (2.15) (-1) r , 1 + X i r i = 0; i = 1, 2, ...... n 2.' (2.16) -a ~$ (\u00E2\u0080\u00A2-o)nX.0xi y i + XJ_ x i . , y i . , = 0; So i = 1 , 2, ..., n,+n 2 (2.17) -a ~fi (-oOxXi., x i y i + XJ_ x i . , y i . i = 0; S 0 i = n,+1, n,+n 2 (2.18) bF = by i -a -p \" X o x i . , y i + (-*)XJ_ x i y i = 0; So i = 1, 2, ..., n, (2.19) - c - * \" X i . , x i . , y i + ( - * )x /_ x i y i = 0; So i = n,+1, ..., n,+n 2 (2.20) n, - o b F = r 0 + I if x i . , y i - i = 0 b x 0 i = i (2.21 ) 34 SF.' = r ; . , \".+ \u00C2\u00BB x ; . , y; - 1 = 0 N & F = J _ i x ; y ; - i = o n ' s 0 . i = i i = n,+1,...., n,+n 2 . (2.22) (2.23) n, - c n,+n 2 -c Define f\u00C2\u00B0 = X 0 I' n x i . , y + I X; . , \u00E2\u0080\u009E x; . , y i (2.24) i= 1 i = n, + 1 6 0 - ro 1 61 = r ; 1 i . = 1 , . .. , n : Now c l e a r l y , n, +n 2 E 6 ,\ = 1 i = 1 (normality) ( 2 . 1 5 ) => ( -1 ) 6 0 + 620 = 0 ( 2 . 1 6 ) => ( -1 ) 6 i + 6 2 i = 0 i = 1 , 2 , n 2 ( 2 . 1 7 ) and ( 2 . 1 8 ) => (-o) 6 , i + 6 3 ! . , = 0 i = 2 , n,+n 2 ( 2 . 2 5 ) ( 2 . 2 6 ) ^0 K x j . , y i / f 0 i = 1, 2, ..., n, (2.27) - a M - i x x i ^ y i / f 0 i = n', + 1 , . . . , n,+n2 (2.28) 620= X.'o r 0 (2.29) 62 i= M r i i = 1, 2, ..., n 2 (2.30) * 3 ^ = X' x i y i i = 1, 2, n,+n2. (2.31) S 0 f 0 (2.32) (2.33) (2.34) (2.35) 35 (2.19) and (2.20) = > 6,; + (-*). 6 3 i = 0 i=1, 2, n,+n 2 (2.36) S o l v i n g ( 2 . 3 2 ) , ( 2 . 3 5 ) , and (2.36) s i m u l t a n e o u s l y , we o b t a i n N - i 6 , i = (afi) ap - 1 N (op) - 1 6 3 i = j . 6 1 i N = n,+n 2; i = 1 , 2 , . . . , n 1 + n 2 (2.37) Now i n or d e r t o o b t a i n o p t i m a l v a l u e s f o r x j ' s and y i ' s we * f i r s t o b t a i n v a l u e s f o r f\u00C2\u00B0 and x ; ' s . R a i s i n g (2.27) and (2.28) t o the power 6,, and (2.31) t o the power 6 3 1 f o r i = 1 , 2 , n, + n 2 r e s p e c t i v e l y , and then m u l t i p l y i n g we o b t a i n : n,+n 2 I 6, i a n n 2 b \ (f\u00C2\u00B0) 1 = X 0 n X ; . c (2.38) i=1 n, where a n s E 6|j ' i=1 b\ = 61 i + i , 1 = 1 , 2 , . . . , n 2 ]/\u00C2\u00BB n,+n 2 6,i 6 3; c = \u00E2\u0080\u009Ed / *>s 0 ) . x N . n d/6,;) d/6 3;) i = 1 X ' = 1 T \u00C2\u00B0 1 36 From ( 2 . 2 1 ) , (2.25), ( 2 , 2 7 ) , ( 2 . 2 9 ) , and (2.33) X 0 = (2 f\u00C2\u00B0 a M j + V 4 f 0 a n i + 1 ) / 2 (2.39) and from ( 2 . 2 2 ) , ( 2 . 2 6 ) , ( 2 , 2 8 ) , ( 2 . 3 0 ) , and (2.34) . x; = (2 f\u00C2\u00B0 b; + 1 + V 4 f.\u00C2\u00B0 b; + 1 ) / 2 (2.40) S u b s t i t u t i n g (2.39) and (2 . 40 ) i n (2 . 38) * a f 0 = c . ( 2 f 0 a, + 1 + A/ 4 f 0 a \u00E2\u0080\u009E, + 1 ) n 2 b; . n (2 f\u00C2\u00B0 b; + 1 + V 4 f 0 b; + 1) / 2 (2.41) i = l S u b s t i t u t i n g the v a l u e s of f\u00C2\u00B0 i n (2.39) and (2.40) we o b t a i n v a l u e s f o r the X i ' s . From ( 2 . 2 8 ) , (2.31), (2.25) and (2.36) -e -\u00C2\u00BB x ; y ; = (ae) x ; + 1 y ; + 1 i = 1, 2, ..., n,+n2-1 (2.42) ~a ~a x i . i y i = U p ) x i y i + , i = 1 , 2 , . . . , n 1 - 1 _ a -a Xo x n-1 y n, = ( c0 ) X , x n , Y t i * 1 -a ~ a X i . 1 x i . , y i = (af\u00C2\u00BB) X i . , + , x i y i + , i = n,+1, ..., n,+n2-1 (2.43) From (2.42) and (2.43) 00 O0 - (js+1 ) ( x i . , ) = ( x j ) iap) i = 1, 2, n,-1 X 1 x i + i 37 op ~ (p +1 ) op x i - i ( x i \u00E2\u0080\u00A2. 1') = (o-e) x i . 1 + 1 ( x; ) X 1 x U t i = n,, \u00E2\u0080\u00A2n1'+n2-1 (2.44) Taking l og a z;.1 = K + (T+a) z i - zi\u00E2\u0080\u009E, i = 1 , 2, . . . , n , - l a z i . - , = K + (1+a) z\ - z i + 1 + h j , , i = n 1 r n!+n 2-1 (2.45) where zj = l o g x i ; a = op; K = -(<3+l) l o g a; hi = p ( l o g X. i - l o g Xi + 1) Adding (2.45) f o r i = 1, 2, n,+n 2-1 N-1 N-1 N-1 N-* I a z i . , = K(N - 1 ) + (1+a) E z\ - E z i + 1 + E h;., i=1 . i=1 i=1 i=n, (2.46) S i m p l i f y i n g (2.46) a z 0 = k(N-l) + z, + a z N . , - z N + L (2.47) N-1 n 2 - 1 where L = E h i _ M ( = E hi = p ( l o g X.0 - l o g kn) i=n, i = 0 *\u00E2\u0080\u00A2 A l s o m u l t i p l y i n g (2.45) by (1/a 1) and then adding f o r i = 1, 2, ..., N-1 N N-1 i N => a z 0 = K E a + a z, + a z N _ , - a z H + M (2.48) N n 2 - 1 n,+i n 2 - 1 n 2 - i where M = a E h i / a = E h i a i=0 i=0 38 From (2.47) and (2.48) ( t a k i n g z 0 > 0) N-1 i N 0 = K[(N-1) - I a ] + (1-a )z, - ( l - a ) z N . + (L-M) i = 1 => z, = r 1 \" N i K + 1-a z, + L-M (2.49) L T ^ a Ta^-I T T \u00E2\u0084\u00A2 Again from (2.44) i t can be shown that z \ = r i - (1-a r)N -i K + 1-a' z H L T ^ i \" (1-a) (1-as)-< T T \u00E2\u0084\u00A2 + 1 -a 1 rL-M -i i . = 1,2, . .. , n,-1 (2 . 50a) S i m i l a r l y i t can be shown that z; = r i - ( l - a 1 ) N -| K + 1-a 1 z N L T ^ a ( 1-a) (-l-a^J TT* i - n i . . + 1-a 1 rL-M -, + I -h; a 1 - J . r b M -I T L T T ~ ^-TT^ j = 0 i = n,, ..., N-1; (2.50b) Hence a l l z i ' s can be computed using (2.50a) and (2.50b). From (2.37) 6 3 \ = (T/*)6, j => yj = (x;/S 06,;) i = 1, 2, n; (2.51) Hence knowing a l l x;'s a l l y i ' s can be computed from (2.51). I f we s u b s t i t u t e u = 0 i n the ex p r e s s i o n f o r the steady s t a t e r e s i d e n c e time (2.9) of a multiprogrammed system, we o b t a i n the ex p r e s s i o n f o r the uniprogrammed system. I n t u i t i v e l y a l s o , the number of jobs i n the system 39 i s reduced as the a r r i v a l r a t e i n an open network d e c r e a s e s . As a r e s u l t , the i n t e r a c t i o n among the job s a l s o goes down. In the l i m i t i n g c a s e , as the a r r i v a l r a t e tends t o z e r o , the mean response time c o i n c i d e s w i t h t h a t of the uniprogrammed c a s e . Now l e t t i n g *\u00E2\u0080\u00A2 = 0 i n (2.50a) and (2.50b) we o b t a i n , l o g x \ = r i - ( 1 - a 1 )N -, . K ( l - a ) ( l - a N ) J * + 1-a 1 l o g x N ; i = 1, 2, N. T T \u00E2\u0084\u00A2 T h i s i s the same r e s u l t o b t a i n e d by Chow [Chow74]. The f o l l o w i n g example i l l u s t r a t e s how the o p t i m a l s o l u t i o n f o r a multiprogrammed environment approaches the ones f o r a uniprogrammed environment as v tends t o z e r o . Example For the case N = 4 , n, = n 2 = 2 , and a = p = 1 , the e x p r e s s i o n f o r o p t i m a l x^'s are T _ v O \u00E2\u0080\u00A2 2 5 \ 0 \u00E2\u0080\u00A2 5 X 1 \" XFT E-XT] X 9 = x 9 ^ r X . [ \" i t ] 0 - 7 5 x 2 - 5 X 3 - X , ^ X-p-j T a k i n g x f l = 10 8 and S 0 = 4 * 10 1\u00C2\u00B0, (the a c t u a l u n i t s depend upon the n o r m a l i z a t i o n f a c t o r s used i n e q u a t i o n s (2.2) and (2.3)) the v a l u e s of x^'s are g i v e n i n T a b l e 1 f o r d i f f e r e n t v a l u e s of n. The v a l u e s of the speeds of the 40 memory h i e r a r c h y are obtained by d i v i d i n g the values of x i n Table 1 by 1 0 1 \u00C2\u00B0 . Thus f o r \u00C2\u00BB = 0, the speeds f o r M, , M 2, M 3, and M, are 10ns, 1\u00C2\u00BB.s, 100\u00C2\u00BB.S and 10ms, r e s p e c t i v e l y . x , ,x2 x 3 1 00 1000 1 0000 100.08 100.77 \u00E2\u0080\u00A2 1 07.24 1 0031 10313 13224 1003905 1039305 1418069 0 100.00 10000 1000000 Table 2.j_ Optimal Storage S i z e s for N=4, x t t=l0 B, S 0=4*1Q 1 0, o=g=1 The values of x's f o r v = 0 are the ones obtained by Chow f o r a uniprogrammed system. From the data gathered from an Amdhal 470 v/6 model II running the Michigan Terminal System, we estimate that u - 5. r e q u e s t / s e c . i n a moderately heavy l o a d and p 0 , which d i f f e r s widely from process to process, i s i n the range of 1/1000 to 1/10,000. Hence ji(=up 0 ) = 1 0, 000 i s not unreasonable. From Rege's e m p i r i c a l data [Rege76], o i s roughly equal to 1 and the storage u n i t K, i s about 200 bytes. Thus the optimal s o l u t i o n f o r \u00C2\u00BB = 10,000 (Table 1) i s x, - 21K bytes., x 2 - 2.6M byt e s , x 3 - 280. M bytes and x 4 - 20 B bytes. I f the u n i t of c o s t i s 0.00055C 2, S 0 . w i l l be approximately $200,000. 41 Because of the c l o s e d form r e s u l t s we are. able to, make the f o l l o w i n g important o b s e r v a t i o n s . From equations (2.50a) and (2.50b) we f i n d that the optimal s i z e s of the v a r i o u s l e v e l s of memory are independent of the given system c o s t c o n s t r a i n t S 0. An i n t u i t i v e e x p l a n a t i o n l i e s i n the o f t e n c i t e d o b s e r v a t i o n that about 10% of the memory i s accessed 90% of the time [Welc78]. Thus, a f t e r a c e r t a i n p o i n t , the m i s s - r a t i o w i l l decrease only m a r g i n a l l y with an i n c r e a s e i n memory s i z e . Thus one can conclude that once each l e v e l i n the h i e r a r c h y has the optimal s i z e given by (2.50a) and (2.50b), any a d d i t i o n a l money should be i n v e s t e d i n a c q u i r i n g f a s t e r memory r a t h e r than * more memory. On the other hand, the optimal s i z e s of memory h i e r a r c h y i s ra t h e r s e n s i t i v e to the m i s s - r a t i o parameter o. The smaller the value of o i n (2.2) (and hence the smaller the values of a, K, and M i n (2.50a) and (2.50b)) the l a r g e r the values of the x^'s. Furthermore, we observe that there i s a c o n s i d e r a b l e d i f f e r e n c e between uniprogrammed and multiprogrammed r e s u l t s when \u00C2\u00BB i s l a r g e ( i . e . , when a r r i v a l r a t e of requests i s l a r g e and/or the e x i t p r o b a b i l i t y of a process i s small which i m p l i e s a l a r g e number of a c t i v e process competing f o r l i m i t e d system resources s i m u l t a n e o u s l y ) . 2 The co s t u n i t i s obtained assuming p = 1. Although a v a i l a b l e data, i t i s used so to Chow's [Chow74]. by using 1979 DEC p r i c e s and p = 1 gives a poor f i t of the that the r e s u l t s can be compared 42 F i n a l l y , Chow [Chow74] reported that i n a uniprogrammed system, the r a t i o s of the c a p a c i t i e s as w e l l as the speeds of adjacent l e v e l s of memory f o r the optimal c o n f i g u r a t i o n are constant i f ap = 1. T h i s constancy i n the . r a t i o of c a p a c i t i e s was a l s o e m p e r i c a l l y observed (see ref e r e n c e 11 i n [Chow74]). Equations (2.50a) and (2.50b) show that i n a multiprogrammed environment t h i s i s tr u e f o r the f i r s t n, l e v e l s and f o r the next n 2 l e v e l s s e p a r a t e l y . The r e s u l t s presented i n t h i s chapter are not intended to be used d i r e c t l y by the de s i g n e r s and e v a l u a t o r s i n the f i n a l c o n f i g u r a t i o n . T h i s i s because memory may not be a v a i l a b l e with the c a p a c i t y / s p e e d c h a r a c t e r i s t i c s computed. A l s o some of the assumptions used i n the f o r m u l a t i o n and the s o l u t i o n of the problem may not be s a t i s f i e d i n p r a c t i c e . However, they are u s e f u l as g u i d e l i n e s and as an c o n f i g u r a t i o n i n an i t e r a t i v e design technique (see f o r example, chapter 8 of [ F e r r 7 8 ] ) . 43 CHAPTER 3 ADAPTIVE LOAD CONTROL 3 . V I n t r o d u c t i o n and Review One of the p r i n c i p l e ideas behind multiprogramming i s to make more e f f e c t i v e use of the system resources, many of which can be simultaneously u t i l i z e d . However, in order to a v o i d e x c e s s i v e i n t e r a c t i o n s among the competing jobs, which w i l l r e s u l t i n general degradation of system performance, the number and composition of jobs i n a multiprogramming set should be c a r e f u l l y c o n t r o l l e d . The p o l i c y that c o n t r o l s the flow of jobs i n a multiprogramming system i s c a l l e d l o a d c o n t r o l p o l i c y . T h e r e f o r e , i n order to provide an acce p t a b l e l e v e l of s e r v i c e , most multiprogramming computer systems employ some form of load c o n t r o l p o l i c y . Load c o n t r o l p o l i c i e s are t y p i c a l l y b u i l t around m a i n t a i n i n g two s e t s of queues, o f t e n c a l l e d the e l i g i b l e queues and the multiprogramming queues. Jobs i n the 44 e l i g i b l e queues must Wait u n t i l the c o n t r o l p o l i c y d e cides (depending upon the system s t a t e or some other c r i t e r i a ) to move them to the multiprogramming queues. Only jobs i n the multiprogramming queues are allowed to a c t i v e l y share the system's r e s o u r c e s . In a l a r g e computer i n s t a l l a t i o n or a d i s t r i b u t e d computing environment, g e n e r a l l y a l a r g e number of jobs with d i f f e r e n t c h a r a c t e r i s t i c s and resource demands are executing at the same time. The number of jobs competing f o r l i m i t e d system resources changes from time to time. T h e i r resource demands a l s o change with time. For such a system , a s t a t i c c o n t r o l p o l i c y (one which i s i n s e n s i t i v e to job c h a r a c t e r i s t i c s and job-mix) cannot be expected, to give good performance at a l l times. There are v a r i o u s l o a d c o n t r o l mechanisms a v a i l a b l e which are optimal under v a r i o u s c o n d i t i o n s . The o p t i m a l i t y of a c o n t r o l mechanism i s always r e l a t i v e to the o b j e c t i v e f u n c t i o n s e l e c t e d ( i . e . , the performance measure) and a set of c o n s t r a i n t s . A c o n t r o l mechanism that i s optimal with r e s p e c t to one performance measure may not be optimal with r e s p e c t to o t h e r s . Furthermore, a c o n t r o l mechanism may or may not be r e a l i z a b l e because of the u n d e r l y i n g assumptions. The one t h a t i s u n r e a l i z a b l e cannot be implemented i n p r a c t i c e but can be used as a standard f o r comparison. 45 A n a t u r a l load c o n t r o l mechanism, p a r t i c u l a r l y f o r , paging systems, i s accomplished by c o n t r o l l i n g the degree of multiprogramming. Previous work d i r e c t e d towards t h i s end i s p r i m a r i l y represented by the development of the working set p o l i c y 1([Denn68b],[Denn70],[Denn71],[RoDu73]). More r e c e n t l y , e f f o r t s to optimize the system work c a p a c i t y l i e mainly i n keeping some measure r e l a t e d to program behaviour ( u s u a l l y paging behaviour) w i t h i n some predetermined bounds ([DKLP76],[DeKh76],[LePo76],[GrDe77]). The 50% c r i t e r i o n [LePo76] f o r example, aims at m a i n t a i n i n g the u t i l i z a t i o n of the paging device to around 0.5. The L=S c r i t e r i o n [DeKh76] proposes to keep the system l i f e t i m e ' approximately equal to that of the page swap time. The Knee c r i t e r i o n ([DKLP76],[GrDe77]) suggests t h a t the mean r e s i d e n t set of each process should be maintained at the value a s s o c i a t e d with the primary knee 2 of i t s l i f e - t i m e f u n c t i o n , where l i f e time i s d e f i n e d to be the mean v i r t u a l time between two s u c c e s s i v e page f a u l t s [DKLP76]. Though the most robust of the th r e e , the Knee c r i t e r i o n a l s o i n v o l v e s c o n s i d e r a b l e amount of overhead. The above mentioned c r i t e r i a are not based on 1 According to t h i s p o l i c y , the number of jobs allowed i n the multiprogramming queue i s l i m i t e d such t h a t the sum of t h e i r working s e t s can be accomodated i n main memory. The working set W(t,r) of a process at any time t , with window s i z e T, i s the set of d i s t i n c t pages r e f e r e n c e d by the process d u r i n g the most recent time i n t e r v a l of l e n g t h T. 2 The p o i n t where the curve of the system l i f e time vs number of a c t i v e jobs has maximum s l o p e . 46 mathematical models and are not proven o p t i m a l . There are c o n d i t i o n s under which they may perform p o o r l y . If the system i s execution bound,.the 50% c r i t e r i o n does not work w e l l . I f the system i s both execution and I/O bound, the L=S c r i t e r i o n does not give good r e s u l t s . These c r i t e r i a mainly aim at i n c r e a s i n g the throughput rate by l o a d i n g the system up to the p o i n t when the measured i n d i c a t o r suggests that f u r t h e r i n c r e a s e i n system load may cause \" t h r a s h i n g \" [Denn68a]. The methods are a p p l i c a b l e only to paging systems. Furthermore, f o r i n t e r a c t i v e systems and combined b a t c h - i n t e r a c t i v e systems,, one i s i n t e r e s t e d not only i n maximizing the system; throughput rate but a l s o in guaranteeing good response times to the i n t e r a c t i v e jobs ( p o s s i b l y at the expense of the batch j o b s ) . Landwehr [Land76] s t u d i e d a combined b a t c h - i n t e r a c t i v e system and proposed a scheme to a c t i v a t e batch jobs depending on the t e r m i n a l l o a d . The emphasis of the study, however, was on model f o r m u l a t i o n and i t s v a l i d a t i o n . There was no attempt to prevent the system from s a t u r a t i o n or to optimize performance. Hine et a l . [HiMT79] s t u d i e d the problem from a s l i g h t l y d i f f e r e n t viewpoint. T h e i r goal was to c o n t r o l the main memory a l l o c a t i o n f o r each c l a s s of job i n order to p r o v i d e d i f f e r e n t response time to each c l a s s while maximizing the CPU u t i l i z a t i o n . - They employed a 47 mathematical model but o p t i m i z a t i o n was achieved by an exhaustive search technique. A h e u r i s t i c e xtension of the method was a l s o given which pr o v i d e d good but not optimal r e s u l t s . Moreover, t h e i r model assumes that the d i s t r i b u t i o n of s e r v i c e time f o r v a r i o u s s e r v i c e c e n t r e s and the valu e s of v a r i o u s system parameters are known. In the f o l l o w i n g example we e x p l a i n , using a simple queueing network model, how some of the schemes d i s c u s s e d above, which employ s t a t i c c r i t e r i a f o r s a t u r a t i o n e s t i m a t i o n , may not produce s a t i s f a c t o r y r e s u l t s under v a r i o u s load c o n d i t i o n s . Example 3_. J_ Consider a simple c l o s e d queueing network model. In order to reduce the complexity of the model so that a g r a p h i c a l e x p l a n a t i o n can be given we c o n s i d e r a system having only one I/O u n i t and a CPU. 48 F i g u r e 3.j_ S i m p l e C e n t r a l S e r v e r Model The r e s p o n s e t i m e f o r such a system w i t h N j o b s i s g i v e n by [CoDe73] N+1 R ( o , m , j . 2 f N ) = 1 . l - U a / g f 1 ) \u00E2\u0080\u00A2 N an-, N+1 f2/e\u00C2\u00BB\-(\u00C2\u00BB2/ev1) (3.1) where, \u00C2\u00BB t and n 2 a r e t h e mean s e r v i c e r a t e s of the CPU and t h e I/O d e v i c e r e s p e c t i v e l y and c(=1-p) i s t h e p r o b a b i l i t y t h a t a j o b l e a v e s t h e system a f t e r b e i n g s e r v e d by the CPU. The e q u a t i o n of t h e asymptote of R ( i . e . , as N\u00E2\u0080\u0094> \u00C2\u00B0o) can be shown t o be R = ( 1 / a ^ , ) * N. (3.2) As the system l o a d i n c r e a s e s t h e s l o p e of t h e response t i m e vs system l o a d c u r v e a l s o i n c r e a s e s . T h i s i n c r e a s e i s i n i t i a l l y s m a l l . A system i s c o n s i d e r e d t o be s a t u r a t e d when the s l o p e becomes s i g n i f i c a n t . The s a t u r a t i o n p o i n t 49 N* of the system can be approximated by the p o i n t of i n t e r s e c t i o n of the asymptote with the h o r i z o n t a l l i n e R(1) (see F i g u r e 3.2) and i s given by N* = 1 + fi(pi/r2). (3.3) R(1) N* \u00E2\u0080\u0094>N F i g u r e 3.2 S a t u r a t i o n p o i n t Thus the system i s s a t u r a t e d when the c o n d i t i o n N :> 1 + * . U , / \" 2 ) ( 3 . 4 ) h o l d s . 50 In o r d e r t o show p i c t o r i a l l y t h a t s t a t i c schemes a t t i m e s l e a v e the system e i t h e r u n d e r u t i l i z e d o r s u p e r s a t u r a t e d , we s h a l l assume t h a t the system parameters P , and i> 2 a r e c o n s t a n t . F i g u r e 3.3 i s a graph of e q u a t i o n ( 3 . 3 ) , r e l a t i n g the r e m a i n i n g system p a r a m e t e r s N* and o(=1-p). Thus i f the o p e r a t i n g p o i n t of the system l i e s w i t h i n the a r e a e n c l o s e d between t h e p o i n t AOB, t h e system i s not s a t u r a t e d . The v a l u e s of parameters l i k e N* or o a r e f i x e d i n most c o n t r o l schemes. Whenever t h e v a l u e s of the observed p a r a m e t e r s exceeds some p r e d e t e r m i n e d f i x e d v a l u e , n o r m a l l y o b t a i n e d e m p i r i c a l l y , the system i s assumed t o be s a t u r a t e d . I f the c o n t r o l scheme uses (N,,c,) as the s a t u r a t i o n p o i n t then t h i s scheme w i l l l e a v e the system 51 u n d e r - u t i l i z e d because i t cannot o p e r a t e i n the u n s a t u r a t e d r e g i o n S, U S 2. I f the c o n t r o l parameters are ( N 2 , c 2 ) then the system i s a l l o w e d t o o p e r a t e i n the s a t u r a t e d r e g i o n S 3. MTS has f i v e parameters which c o n t r o l the system l o a d . I f any one of the observed v a l u e s exceeds some p r e c a l c u l a t e d f i x e d v a l u e the system i s c o n s i d e r e d t o be s a t u r a t e d ( f o r more d e t a i l see [Chan79]). I f we c o n s i d e r t h i s i n two d i m e n s i o n s o n l y ( i . e . , o n l y two l o a d p a r a m e t e r s ) , then the a n a l o g y of F i g u r e 3.3 d i s c u s s e d e a r l i e r can be a p p l i e d d i r e c t l y . The c o n t r o l a l g o r i t h m d e s c r i b e d by Landwehr [Land76] i s a l s o i n s e n s i t i v e t o v a r i a t i o n s i n j o b c h a r a c t e r i s t i c s because the v a l u e s of the break p o i n t s b j ' s ( d e f i n e d i n the scheme [Land76]) are f i x e d and do not depend on the i n s t a n t a n e o u s w o rkload c h a r a c t e r i s t i c s . The c o n t r o l scheme d e s c r i b e d by Hine et a l . [HiMT79] c h a r a c t e r i z e s j o b s o n l y by t h e i r page f a u l t r a t e . The scheme does not take i n t o c o n s i d e r a t i o n the v a r i a t i o n i n the I/O r e q u i r e m e n t s of the j o b s . Moreover, the use of l i f e - t i m e f u n c t i o n g i v e s o n l y a g l o b a l p i c t u r e of the system p a g i n g c h a r a c t e r i s t i c s and not the l o c a l b e h a v i o u r . I t s h o u l d now be c l e a r t h a t a c o n t r o l p o l i c y which 52 does not base i t s d e c i s i o n on job c h a r a c t e r i s t i c s and instantataneous system c o n d i t i o n s w i l l not perform o p t i m a l l y under a l l circumstances. We need an adaptive c o n t r o l p o l i c y which i s able to dynamically recompute the system s a t u r a t i o n p o i n t as the c h a r a c t e r i s t i c s of the workload change. From among the w a i t i n g jobs, s e v e r a l sets of job-mix can be- s e l e c t e d to a t t a i n the c a l c u l a t e d s a t u r a t i o n p o i n t . Therefore the scheme should a l s o be able to s e l e c t the one that o p t i m i z e s some system performance measure(s). L i s t e d below are some recent work i n adaptive l o a d c o n t r o l p o l i c y . . Badel and L e r o u d i e r [BaLe78] have proposed an adaptive l o a d c o n t r o l p o l i c y by i n t r o d u c i n g a system \" d i l a t a t i o n \" f u n c t i o n . The system \" d i l a t a t i o n \" f u n c t i o n i s d e f i n e d as the r a t i o of the r e a l execution time of N programs running one at a time to the r e a l execution time of the N programs running simultaneously ( i . e . , degree of multiprogramming equal to N). They observe that the \" d i l a t a t i o n \" f u n c t i o n a t t a i n s i t s maximum value when c e r t a i n p r i n c i p l e s are s a t i s f i e d . The p r i n c i p l e s are e q u i v a l e n t t o L=S and 50% c r i t e r i a . The implementation of t h e i r scheme i s , i n f a c t , implementation of these c r i t e r i a . We s h a l l see i n chapter 5 that the schemes developed i n t h i s t h e s i s are b e t t e r than these c r i t e r i a . Schonbach [Scho80] d e s c r i b e s a macro scheduler f o r 53 high p r o d u c t i v i t y . I t i s assumed that the \"system balance\" p o i n t i s a l r e a d y s p e c i f i e d . . . Here, - system balance i s a s t a t e i n which v a r i o u s processor u t i l i z a t i o n s are at some p r e s p e c i f i e d l e v e l s . The macro-scheduler then chooses, from among the w a i t i n g jobs, a job-mix which maintains system balance. The scheme does not i n c l u d e e x t e r n a l p r i o r i t i e s and i s a p p l i c a b l e only to non-paged systems. Lo [ L 0 8 O ] d e s c r i b e s a load c o n t r o l p o l i c y u sing s t o c h a s t i c c o n t r o l theory. The p o l i c y i s shown to give optimal r e s u l t s . The main weakness of the scheme i s that i t s implementation r e q u i r e s the job parameter values to be known. I t a l s o makes the usual queueing theory assumptions, such as e x p o n e n t i a l i n t e r a r r i v a l and s e r v i c e time d i s t r i b u t i o n s which may not be s a t i s f i e d i n p r a c t i c e . With the exception of Lo' s and Schonbach's schemes, most of the c o n t r o l p o l i c i e s mentioned above are not based on mathematical models and are not expected to give optimal r e s u l t s i n a l l s i t u a t i o n s . On the other hand, Lo and Schonbach make such strong assumptions in t h e i r f o r m u l a t i o n that i t i s i m p r a c t i c a b l e and sometimes impossible to meet those assumptions in p r a c t i c e . In the next three s e c t i o n s we develop a dynamic lo a d c o n t r o l scheme. F i r s t , we d e r i v e an e x p r e s s i o n f o r the system s a t u r a t i o n p o i n t . The e x p r e s s i o n depends upon 54 parameters t h a t a re d i r e c t l y measurable from the system, t h e r e b y r e f l e c t i n g i n s t a n t a n e o u s change i n the s t a t e of the system. The c o n t r o l p o l i c y computes the o p t i m a l number of j o b s t h a t s h o u l d be a c t i v a t e d from each c l a s s so t h a t the system i s n e i t h e r u n d e r - u t i l i z e d nor s a t u r a t e d . 55 3.2 System Model We here d e s c r i b e a queueing network model of a computer system. F i g u r e 3.4 i s a diagram of such a network which i s widely c i t e d i n the l i t e r a t u r e . F i g u r e 3_.4 General C e n t r a l Server Model The network c o n s i s t s of M s e r v i c e c e n t r e s \u00E2\u0080\u0094 a CPU, a paging d e v i c e , and (M-2) I/O u n i t s . Each s e r v i c e c e n t r e c o n s i s t s of a s e r v e r and an a s s o c i a t e d queue. Upon a r r i v a l , a job waits i n the queue i f the s e r v e r i s busy. 56 When a j o b completes i t s s e r v i c e a t the paging d e v i c e or any of the I/O u n i t s , i t always r e j o i n s the CPU queue. When a j o b f i n i s h e s i t s s e r v i c e a t the CPU, i t e i t h e r l e a v e s the system or v i s i t s one of the o t h e r s e r v i c e c e n t r e s . A new j o b always j o i n s the CPU queue. T h i s model assumes t h a t no j o b o v e r l a p s i t s requirement of d i f f e r e n t d e v i c e s i . e . , no j o b i s b e i n g s e r v e d by more than one s e r v i c e c e n t r e a t the same t i m e . T h i s assumption may not h o l d i n p r a c t i c e . However the e r r o r i n t r o d u c e d i s g e n e r a l l y not s i g n i f i c a n t [Buze78b]. We s h a l l assume, f o r the time b e i n g , t h a t a l l j o b s a r e i d e n t i c a l i n t h e i r r e s o u r c e demand. S p e c i f i c a l l y , the b r a n c h i n g , f r e q u e n c i e s from the CPU t o the p a g i n g d e v i c e or one of the I/O d e v i c e s or out of the system i s same f o r a l l j o b s . A l s o the mean s e r v i c e r a t e s a t v a r i o u s c e n t r e s a re same f o r a l l j o b s . We s h a l l r e l a x t h e s e assumptions i n Chapter 4 where we a n a l y z e a m u l t i - c l a s s model. We s h a l l use o p e r a t i o n a l a n a l y s i s as i n t r o d u c e d by Denning and Buzen [DeBu78] t o study such a system. The main reason f o r u s i n g o p e r a t i o n a l a n a l y s i s r a t h e r than c l a s s i c a l queueing t h e o r y i s t h a t we do not need t o determine the d i s t r i b u t i o n of the system parameters r e q u i r e d by c l a s s i c a l queueing t h e o r y . The v a l u e s of the system parameters can be d i r e c t l y measured from the system or d e r i v e d from t h e measured q u a n t i t e s . Moreover, the a s s u m p t i o n s made t o make t h e mathematics t r a c t a b l e can be v e r i f i e d by the a n a l y s t . 3L.3 S a t u r a t i o n E s t i m a t i o n Most l o a d c o n t r o l mechanisms a r e based on some system s a t u r a t i o n d e f i n i t i o n . V a r i o u s system s a t u r a t i o n d e f i n i t i o n s have been p r o p o s e d ( [ K l e i 6 8 ], [ F e r r 7 8 ] , [DeBu78 ]). I n v a r i a b l y t h e system i s c o n s i d e r e d t o be s a t u r a t e d at the p o i n t t h e response t i m e s t a r t s t o r i s e r a p i d l y w i t h an i n c r e a s e i n some system l o a d i n d e x . Under t h e assumption t h a t a l l j o b s a r e i d e n t i c a l i n t h e i r r e s o u r c e demands, the number o f a c t i v e j o b s o r t h e degree of multiprogramming can be c o n s i d e r e d t o be a measure of system l o a d . A t y p i c a l r e s p o n s e t i m e c u r v e a g a i n s t degree of multiprogramming i s shown i n F i g u r e 3.3 w h i c h i s shown a g a i n i n F i g u r e 3.5 f o r c o n v e n i e n c e . R(1) --> N F i g u r e 3.5 S a t u r a t i o n P o i n t 58 Badel et a l . [BaLP75] propose to use the \" d i l a t a t i o n \" f u n c t i o n as a measure of system l o a d . Badel and Le r o u d i e r [BaLe78] l a t e r used t h i s f u n c t i o n i n v a r i o u s d e f i n i t i o n s of s a t u r a t i o n c r i t e r a . For K l e i n r o c k [ K l e i 6 8 ] , a system i s c o n s i d e r e d to be s a t u r a t e d at the i n t e r s e c t i o n of the asymptote of the normalized mean response time curve vs the number of a c t i v e t e r m i n a l s and the h o r i z o n t a l l i n e c orresponding to when there i s only one job i n the system (see F i g u r e 3.5). If the system i s not.allowed to get s a t u r a t e d according to t h i s d e f i n i t i o n , the mean response time of the a c t i v e jobs does not exceed an acceptable l e v e l . However, i t i s i m p l i c i t l y assumed that the program p o p u l a t i o n i s homogeneous and the system i s i n a s t a t i o n a r y s t a t e . In the f o l l o w i n g a n a l y s i s , the system s a t u r a t i o n i s computed at the end of each small i n t e r v a l ( u s u a l l y a few seconds long) duri n g which the s t a t i o n a r y assumption i s more reasonable. One may argue that t h i s may l e a d to end e f f e c t s ( i . e . , d i s c o n t i n u i t y at the i n i t i a l and t e r m i n a l p o i n t of the o b s e r v a t i o n i n t e r v a l ) . The e n d - e f f e c t s , however, do not a f f e c t the v a l i d i t y of o p e r a t i o n a l laws s i n c e these laws can be shown to be i n t e r n a l l y c o n s i s t e n t and v a l i d f o r a l l i n i t i a l and t e r m i n a l s t a t e s so long as the o p e r a t i o n a l assumptions are s a t i s f i e d [Buze78b]. As w e l l , s i n c e we assume a l l jobs to have i d e n t i c a l resource demands, the homogeneity c o n d i t i o n i s s a t i s f i e d . We s h a l l now f i n d an expression f o r the system s a t u r a t i o n p o i n t . 59 Let S , , S 2, \u00E2\u0080\u00A2\u00E2\u0080\u00A2\u00E2\u0080\u00A2, S ^ be the M s e r v i c e c e n t r e s as shown in F i g u r e 3 . 4 . S , i s the CPU, S 2 i s the paging device and S 3, ... , S TV, are the v a r i o u s I/O u n i t s . The f o l l o w i n g are observed q u a n t i t i e s from the system. They are mean values w i t h i n an o b s e r v a t i o n p e r i o d and, as such, are f u n c t i o n s of time which i s omitted f o r c l a r i t y . T : o b s e r v a t i o n p e r i o d X; : observed number of completions at centre S ; d u r i n g T B; : the t o t a l amount of time dur i n g which the s e r v i c e centre S j i s busy duri n g T C; : observed number of requests f o r centre S; duri n g T q; : request frequency, the f r a c t i o n of jobs proceeding to s e r v i c e c e n t r e S; a f t e r completing a s e r v i c e request at the CPU (= c j/X,), i * 1 . We now compute the f o l l o w i n g o p e r a t i o n a l q u a n t i t i e s . Mean s e r v i c e r a t e of s e r v e r S; = \u00C2\u00BB\ = x;/B; 60 System throughput r a t e T = (X, . q,) / T = (XT/B,)(B,/T) . q, = v i P i q1 (3.5) U t i l i z a t i o n of s e r v e r S j = p\ = B\ / T = (Bi/Xj ) (Xj/X, ) (X,/B, ) (B-,'/T)-Using the job flow balance assumption X; = C\ i*1. ( i . e . , the number of requests f o r s e r v i c e at c e n t r e Si during an i n t e r v a l i s equal to the number of departures from the centre) we o b t a i n : P\ = />, . U - i / > i) q i ; i*1 (3.6) M M = > I />; = />, + ! p, (n y/u j ) q i i=1 i=2 If there i s one and only one job i n the system i t can be present at only one s e r v i c e c e n t r e . Therefore M I P\ = 1 i = 1 Which i m p l i e s that the CPU u t i l i z a t i o n with e x a c t l y one job in the system i s given by: ,(1) = M I U i ) q i + 1 i = 1 1 (3.7) Using L i t t l e ' s Law, the mean response time of the system with N jobs i s given by: 61 R(N) = N / T Using (3.5) K(N) = N / U ^ , / ) , ) = N q i / U ,/Mqj ) . (3.8) From (3.7) M R ( 1 ) = ( 1/d , q , ) E ( u q ; + 1 L i=i (3.9) The equation of the asymptote ( i . e . , as N approaches i n f i n i t y ) i s more d i f f i c u l t to d e r i v e . Let us f i r s t c o n s i d e r the simple case of a non-paging system. The asymptote occurs at the p o i n t when the u t i l i z a t i o n of a s e r v i c e c e n t r e ( i * say) reaches u n i t y ( i . e . , i t becomes the f i r s t b o t t l e n e c k of the system). From Buzen's a n a l y s i s [ B u z e 7 l ] , i * i s that s e r v i c e c e n t r e which has the highest u t i l i z a t i o n i n the i n t e r v a l (note that i * may vary from i n t e r v a l to i n t e r v a l as the workload c h a r a c t e r i s t i c s change). If the CPU i s the b o t t l e n e c k , the equation of the asymptote i s simply: R(N) = N / U,qi ) (3.10) Otherwise, using equation (3.8) and noting t h a t n \ as w e l l as the r a t i o (qi/qw remains unchanged as N i n c r e a s e s , the equation of the asymptote i s given by: R(N) = N q;* / U i*q,) i * * 1 (3.11) F o r a p a g i n g system t h e e v e n t u a l b o t t l e n e c k as N approachs i n f i n i t y must be t h e p a g i n g d e v i c e . I t need not however, be t h e f i r s t d e v i c e t o be s a t u r a t e d . C a s e ( i ) The p a g i n g d e v i c e i s not t h e f i r s t t o s a t u r a t e . In t h i s c a s e , as t h e system i s s a t u r a t e d b e f o r e the p a g i n g d e v i c e i s f u l l y u t i l i z e d , the asymptote s h o u l d be computed based on t h e f i r s t d e v i c e t o r e a c h s a t u r a t i o n . T h e r e f o r e e q u a t i o n (3.11) i s s t i l l v a l i d (see F i g u r e 3.6) R(1) R N* \u00E2\u0080\u0094>N F i g u r e 3_.6 System B o t t l e n e c k C a s e ( i i ) The p a g i n g d e v i c e i s f i r s t t o s a t u r a t e The r a t i o q j * / q i w i l l c o n t i n u e t o i n c r e a s e as N i n c r e a s e s and w i l l a p p r o a c h i n f i n i t y . A r e a l i s t i c approach 63 c o n s i s t e n t with the one used i n C a s e ( i ) i s to use the values of q;*/q, corresponding to the p o i n t the paging device f i r s t gets f u l l y u t i l i z e d . However, t h i s r a t i o i s not easy to estimate. The observed values of q i * / q , can be used i f the system i s c l o s e to s a t u r a t i o n ( i . e . , N* - N, see below). Otherwise the s a t u r a t i o n load w i l l be under-estimated. T h i s i s not a problem when the system i s l i g h t l y loaded. As can be seen subsequently, when the system workload becomes . heavy, the c o n t r o l p o l i c y w i l l a d j u s t i t s e l f and the observed r a t i o w i l l e v e n t u a l l y reach the d e s i r e d v a l u e . Thus the s a t u r a t i o n load N* i . e . , . the poi n t of i n t e r s e c t i o n of equations (3.9) and (3.10) i s given by : M N* = 1 + I U,/\u00E2\u0080\u009E i) q i (3.12) i = 2 i f the CPU i s the b o t t l e n e c k . Otherwise i t i s given by equation (3.13) as the p o i n t of i n t e r s e c t i o n of (3.9) and (3.11): N* = U i * / x i q i * ) M I (*/,/>/ i )qi + 1 i = 2 (3.13) Thus now we have equations (3.12) and (3.13) as estimates of system s a t u r a t i o n p o i n t under the assumptions made e a r l i e r . Both equations (3.12) and (3.13) can a l s o be d e r i v e d u s i n g c l a s s i c a l queueing theory with e x p o n e n t i a l d i s t r i b u t i o n s of s e r v i c e times. 6 4 Most proposed schemes assume a f i x e d s a t u r a t i o n l o a d . The Michigan Terminal System f o r example computes the va l u e s of f i v e f a c t o r s at f i x e d i n t e r v a l s . If one or more exceeds the corresponding predetermined s t a t i c s a t u r a t i o n v a l u e , the system i s assumed to be s a t u r a t e d . For the 50% c r i t e r i o n , the s a t u r a t i o n p o i n t corresponds to the load beyond which the u t i l i z a t i o n of the paging d e v i c e w i l l exceed 0.5+c, where c i s some p o s i t i v e c onstant. The L=S c r i t e r i o n to a c e r t a i n extent assumes the system to be s a t u r a t e d when the mean system l i f e time i s below the page swap time, which i s f i x e d f o r a given paging d e v i c e . Chanson [Chan79] has shown that the s a t u r a t i o n load i s r e a l l y a f u n c t i o n of the c h a r a c t e r i s t i c s of the c u r r e n t workload and cannot be very w e l l represented by some constant measure. For the present model, these work load c h a r a c t e r i s t i c s are q; and \u00C2\u00BB\, i= 1, 2, M. Any model that does not take these i n t o c o n s i d e r a t i o n w i l l sometimes over-estimate and sometimes under-estimate the system s a t u r a t i o n l o a d . The f a c t t h at on the average the o v e r - e s t i m a t i o n i s equal to the under-estimation p r o v i d e s no comfort when the goal i s to optimize performance at a l l times. T h e r e f o r e the f i r s t c r i t e r i o n f o r load c o n t r o l i s to keep the system from s a t u r a t i o n . For a multiprogrammed paging computer system, the s i m p l e s t way to accomplish t h i s 65 i s to keep the number of a c t i v e jobs below N*, where N* i s given by equations (3.12) and (3.13). Since the system throughput rate i s a non-decreasing f u n c t i o n of N before the system s a t u r a t e s [Buze7l], a c t i v a t i n g N* jobs whenever p o s s i b l e w i l l a l s o maximize the system throughput. Three cases have to be c o n s i d e r e d : ( i ) the system i s s a t u r a t e d ( i . e . , N > N*) ( i i ) the system i s u n d e r - u t i l i z e d ( i . e . , N << N*) ( i i i ) the system i s c l o s e to s a t u r a t i o n but not yet s a t u r a t e d . Only c a s e ( i i i ) i s of i n t e r e s t , because i f the system i s u n d e r - u t i l i z e d , i t i s unnecessary to apply any c o n t r o l measure. Each job w i l l be a c t i v a t e d as soon as i t a r r i v e s , u n t i l the c o n d i t i o n f o r c a s e ( i i i ) i s reached. If the system load i s p r o p e r l y c o n t r o l l e d , the system w i l l a t t a i n s a t u r a t i o n i n f r e q u e n t l y and only f o r b r i e f p e r i o d s of time. When the system i s s a t u r a t e d , one simple c o n t r o l measure i s not to a c t i v a t e any more batch jobs u n t i l the system comes out of s a t u r a t i o n . If the system reaches s a t u r a t i o n s t a t e f r e q u e n t l y and remains in t h i s s t a t e f o r extended p e r i o d s of time, then i t i s h i g h l y probable that the hardware i s inadequate to handle the normal work load and the system 66 s h o u l d be upgraded. In the next s e c t i o n we c o n s i d e r a combined b a t c h - i n t e r a c t i v e system w i t h the assumption t h a t the b a t c h and t e r m i n a l jobs are i d e n t i c a l i n t h e i r r e s o u r c e demands. In a t y p i c a l environment the t e r m i n a l j o b s have r e l a t i v e l y h i g h e r p r i o r i t y than the b a t c h j o b s . T h e r e f o r e , t h e i r w a i t i n g t i m e s s h o u l d be a f u n c t i o n of t h e i r r e l a t i v e p r i o r i t y . Our o b j e c t i v e i s t h e r e f o r e , t o maximize a w eighted sum of the w a i t i n g times , w i t h o u t s a t u r a t i n g the system. 67 3_\u00E2\u0080\u009E4^ O p t i m a l S e l e c t i o n The combined b a t c h - i n t e r a c t i v e system under c o n s i d e r a t i o n i s shown i n F i g u r e 3.7. System Subsystem T e r m i n a l * f B a t c h 'Paging P 2 IT/6 P 3 ^ C o n t r o l CPU M \"fxiT F i g u r e 3.7 Load C o n t r o l Model 68 Upon a r r i v a l i n t o the system, batch and i n t e r a c t i v e jobs enter the c o n t r o l u n i t A and B, r e s p e c t i v e l y . A c o n t r o l p o l i c y then decides whether to admit any job from the c o n t r o l u n i t s i n t o the subsystem. Only those jobs which are, present i n the subsystem are allowed to compete fo r system r e s o u r c e s . In the previous s e c t i o n we have d e r i v e d the maximum number of jobs N* that should be allowed i n the subsystem. L e t : N* : be the maximum number of jobs that should be allowed in the subsystem (obtained i n the previous s e c t i o n ) , S, : be the mean t o t a l i n t e r a c t i v e jobs i n the system (observed d u r i n g the previous i n t e r v a l ) , S 2 : be the mean t o t a l batch jobs i n the system (observed d u r i n g the previous i n t e r v a l ) , N, : be the number of i n t e r a c t i v e jobs to be maintained i n the subsystem dur i n g the next i n t e r v a l (to be c a l c u l a t e d ) , 69 N 2 : be the number of batch jobs to be maintained i n the subsystem during the next i n t e r v a l (to be c a l c u l a t e d ) , R : be the mean response time of the subsystem (not r e q u i r e d i n the f i n a l a l g o r i t h m ) , R, : be the mean response time of i n t e r a c t i v e jobs i n the subsystem, R 2 : be the mean response time of batch jobs in the subsystem. In order to keep the subsystem from s a t u r a t i o n i t i s r e q u i r e d t h at N, + N 2 < N*. (3.14) We want to compute the optimal values f o r N, and N 2. Using L i t t l e ' s Law, the throughput r a t e f o r i n t e r a c t i v e jobs i s : T, = (N, / R,). (3.15) S i m i l a r l y the throughput r a t e f o r batch jobs i s : 70 T 2 = (N 2 / R 2 ) . (3.16) Under the assumption that the t e r m i n a l and batch jobs are i d e n t i c a l i n t h e i r resource demands, the values R, and R 2 w i l l be the same a n d , w i l l not change as long as the e q u a l i t y i s s a t i s f i e d i n c o n d i t i o n (3.14), i r r e s p e c t i v e of the v a l u e s of N, and N 2. Under the job flow balance assumption [DeBu78] (which should be v e r i f i e d ) i n t e r a c t i v e and batch, jobs w i l l be e n t e r i n g the subsystem at the r a t e s T, and T 2 r e s p e c t i v e l y . T h e r e f o r e , on the average, an i n t e r a c t i v e job enters the system every 1/T, sec. I t f o l l o w s that on the average, the w a i t i n g time i n the c o n t r o l u n i t f o r i n t e r a c t i v e jobs (using L i t t l e ' s Law) i s given by: S i m i l a r l y the w a i t i n g time i n the batch c o n t r o l u n i t i s given by: In a r e a l system i n t e r a c t i v e jobs are normally given higher p r i o r i t y compared to batch j o b s . Let C, and C 2 be the weights a s s o c i a t e d with the i n t e r a c t i v e and the batch jobs r e s p e c t i v e l y . Our o b j e c t i v e i s to minimize a weighted W, = (S, - N,) / T (3.17) W2 = ( S 2 - N 2) / T 2 (3.18) 71 sum of the w a i t i n g times i n the c o n t r o l u n i t . Therefore the o p t i m i z a t i o n problem becomes: Min z 1 Min (C^W, + C 2W 2) subject to N, + N 2 < N* 0 < K, < S, 0 < N 2 < S 2 (3.19) Using (3.15), (3.16), (3.17) and (3.18), the above problem ( 3 . 1 9 ) can be shown to be e q u i v a l e n t to Min ( C S i / N , + C 2 S 2 / N 2 ) subject to N, + N 2 < N* 0 < N,. < S, 0 < N 2 < S 2 (3.20). For the time being we s h a l l not c o n s i d e r the c o n s t r a i n t s 0 ^ N, < S, and 0 ^ N 2 < S 2 ( i . e . , we assume that there are enough jobs w a i t i n g i n the i n t e r a c t i v e and batch c o n t r o l queues). We s h a l l use the Lagrange m u l t i p l i e r method to s o l v e the problem (3.20) (without e x t r a c o n s t r a i n t s ) . The Lagrangian equation of the problem i s : 72 L ( N , X . ) = C t S t / N , + C 2 S 2 / N 2 + X . ( N , / N * + N 2 / N * - 1) ( 3 . 2 1 ) The o p t i m a l v a l u e s of N , and N 2 from ( 3 . 2 1 ) a r e : N , * = N V C T S T / (\/cTsT + v/c7s7) ( 3 . 3 2 ) and N 2 * = NVC7S\"7 / (v/c7sT + v'cIsT) ( 3 . 2 3 ) Let us look at the problem ( 3 . 2 0 ) g e o m e t r i c a l l y with the c o n s t r a i n t s 0 ^ N , < S , and 0 < N2 <, S 2 a c t i v e . F i g u r e 3 . 9 i s a diagram of the problem ( 3 . 2 0 ) showing the f e a s i b l e r e g i o n and the o b j e c t i v e f u n c t i o n f o r i t s v a r i o u s v a l u e s . 1 , N , F i g u r e 3 . 8 A c t i v e C o n s t r a i n t s 73 From the above graph, i n the absence of the c o n s t r a i n t s 0 S, and N* 2 < S 2. F i g u r e 3_.9 I n a c t i v e C o n s t r a i n t C l e a r l y the o p t i m a l v a l u e s f o r N, i s N1 ' = S, and N 2' = N* - N, ' which g i v e s maximum value of the o b j e c t i v e f u n c t i o n s a t i s f y i n g the c o n s t r a i n t s . 74 Case I i i ) ..', N 2* > S 2 and N,* < S, . As i n C a s e ( i ) i t can be seen that an optimum value f o r N 2 i s \"... N 2' = S 2 and N,' = N* - N 2', which gives maximum value of the o b j e c t i v e f u n c t i o n s a t i s f y i n g the c o n s t r a i n t s . Case ( i i i ) N,* > S v and N 2* > S 2 I t can be e a s i l y seen that N 2' = S 2 and N,' = S, w i l l give the optimal s o l u t i o n i n t n i s case. The f i n a l c o n t r o l p o l i c y , which we s h a l l c a l l SELF (standing f o r S a t u r a t i o n E s t i m a t i o n and L o a d - c o n t r o l with Feed-back), works as f o l l o w s . The values N,* and N 2* are computed from the equations (3.22) and (3.23) or from one of the three cases as d e s c r i b e d above. During the ob s e r v a t i o n p e r i o d , whenever an i n t e r a c t i v e job leaves the subsystem, an i n t e r a c t i v e job w i l l be i n j e c t e d i n t o the subsystem p r o v i d e d the i n t e r a c t i v e c o n t r o l queue i s non-empty. When a new i n t e r a c t i v e job a r r i v e s , i t enters the subsystem immediately p r o v i d e d ( i ) there i s no w a i t i n g i n t e r a c t i v e job, and ( i i ) the number of i n t e r a c t i v e jobs i n the subsystem i s l e s s than N,*. Otherwise, i t waits i n the i n t e r a c t i v e c o n t r o l queue. 75 The batch c o n t r o l works in a s i m i l a r f a s h i o n . In the implementation of SELF the values of the parameters are estimated on the b a s i s of t h e i r past v a l u e s . In order to reduce the e r r o r i n the e s t i m a t i o n , we use a s p e c i a l technique from t i m e - s e r i e s a n a l y s i s [Kend76], c a l l e d e x p o n e n t i a l smoothing. T h i s technique can be d e s c r i b e d as f o l l o w s . Let Pi be the expected value of ; the parameter f o r the time i n t e r v a l [ i , i + 1 . ] . Let Xi be the v observed value of the parameter at the time t . Pi can be expressed as Pi = ( 1 - * ) (Xi+*Xi . !+*2Xi.. 2 + . . . ) = ( 1 - - * ) T p i Xi . ; (3.25) j=o J where the e x p o n e n t i a l weight f a c t o r $ i s a constant between zero and one. S i m i l a r l y P i . 1 = ( 1 - * ) ? *'* X i . ; . , j - o J => Pi = ( 1-p)Xi + p P i . , (3.26) Now, l e t the e r r o r made at time ( i - 1 ) i n p r e d i c t i n g X,' be \u00E2\u0082\u00AC i , then, ei = Xi - P i . , . (3.27) 76 S u b s t i t u t i n g i n equation (3.26) Pi = Xj - * ei = P|., + ( 1 - 0 ) \u00E2\u0082\u00AC ; (3.28) The remaining problem i s to f i n d a proper value of 0 . If the e r r o r i\ i s s u f f i c i e n t l y s m a l l , equation (3.28) w i l l be s a t i s f i e d f o r almost any value of 0 , so that we can use the value of 0 from the p r e v i o u s i n t e r v a l . I f ej i s l a r g e , we recompute a value f o r 0 by minimizing the sum of squares of the e r r o r s given by: ~E {Xj - ( 1 - 0 ) 7 0 * X j _ x _ 1 } 2 . (3.29) j=i J k=0 J In p r a c t i c e , the summation i n equation(3.29) does not have to i n v o l v e many terms (say K) before 0 H ( O < 0 ^ 1 ) approaches zero. Moreover, 0 need not be very a c c u r a t e , and standard techniques e x i s t f o r i t s e f f i c i e n t computation. From (3.28), i t can be seen that not a l l previous values of X are needed to compute the expected value of X f o r the next i n t e r v a l . The f u t u r e value can be computed from the p r e v i o u s p r e d i c t e d values and the present e r r o r . However, s e v e r a l immediate past v a l u e s of X may be needed i n the computation of 0 . The e f f e c t of 0 on the expected 77 e r r o r i s d i s c u s s e d i n Chapter 5. So f a r we have assumed t h a t a l l the jo b s a re s t a t i s t i c a l l y i d e n t i c a l i n t h e i r c h a r a c t e r i s t i c s . In the next c h a p t e r we s h a l l r e l a x t h i s c o n d i t i o n and d e v e l o p a c o n t r o l scheme t h a t makes an o p t i m a l s e l e c t i o n by t a k i n g i n t o c o n s i d e r a t i o n the. job c h a r a c t e r i s t i c s a l o n g w i t h t h e i r e x t e r n a l p r i o r i t i e s . 78 CHAPTER 4 MULTICLASS CONTROL 4.J_ I n t r o d u c t i o n In a t y p i c a l general purpose computer system there e x i s t v a r i o u s c l a s s e s of jobs as f a r as t h e i r resource demand c h a r a c t e r i s t i c s are concerned. In such an environment the n a t u r a l approach f o r an a n a l y s t i s to use a m u l t i c l a s s model of the system that would t r e a t the d i f f e r e n t c l a s s e s of jobs d i f f e r e n t l y ([Rode79], [Bard79], [ReLa80]). The jobs w i t h i n a c l a s s , however, are supposed to have i d e n t i c a l resource demands. In t h i s a n a l y s i s , i t i s assumed that the jobs do not change c l a s s d u r i n g t h e i r stay i n the system. The number of job c l a s s e s c o n s i d e r e d i s f i n i t e . Moreover, i t i s assumed that when a job a r r i v e s at the system i t i s p o s s i b l e to c l a s s i f y i t i n t o one of the job c l a s s e s . An example of a p r i m i t i v e method of job c l a s s i f i c a t i o n i s to compute the job c l a s s based on car d parameters f o r batch jobs (e.g., CPU time l i m i t , user-given 79 job p r i o r i t y , user IDs etc.) and on the command type f o r t e r m i n a l jobs (e.g., e d i t , compile, copy, e t c . , ) . However, i t i s not assumed that the resource demands of a job in a p a r t i c u l a r c l a s s are known. The resource demands of v a r i o u s c l a s s e s of jobs are c o n t i n u o u s l y being measured, thereby p r e s e r v i n g the dynamic nature of the c o n t r o l . 4.2 M u l t i c l a s s S a t u r a t i o n E s t i m a t i o n \ In a non-paging system with jobs that are i d e n t i c a l i n t h e i r resource demands, the branching f r e q u e n c i e s , from one s e r v i c e c e n t r e to another, are independent of the number of jobs present in the system. They depend mainly on the job c h a r a c t e r i s t i c s . On the other hand, i n a m u l t i c l a s s environment, the branching f r e q u e n c i e s are not only dependent upon the t o t a l number of jobs but a l s o on the number of -jobs present i n each c l a s s ( s p e c i f i c a l l y , the r a t i o of the number of jobs i n each c l a s s ) . Thus l o a d c o n t r o l i n such a system c o n s i s t s of two mutually dependent problems as f o l l o w s : (a) compute an optimal r a t i o corresponding to the number of jobs that should be s e l e c t e d from each c l a s s , (b) compute the s a t u r a t i o n p o i n t of the system f o r 80 the r a t i o obtained i n ( a ) . The o p t i m a l i t y i n (a) i s subject to a c o n s t r a i n t d e f i n e d by (b) and the s a t u r a t i o n p o i n t obtained i n (b) must s a t i s f y the r a t i o o b tained i n ( a ) . Thus the two problems (a) and (b) are dependent upon each other and we must seek a simultaneous s o l u t i o n . In a number of s t u d i e s , a s o l u t i o n of one of the problems has been obtained f o r a f i x e d s o l u t i o n of the other problem. For example,-Schonbach [Scho80], s o l v e s the problem (a) f o r the case in which system s a t u r a t i o n i s a l r e a d y d e f i n e d and f i x e d . In the r e s t of t h i s s e c t i o n we f i r s t d e r i v e the s a t u r a t i o n l o a d f o r a m u l t i c l a s s system and then d e s c r i b e the m u l t i c l a s s c o n t r o l procedure. The f o l l o w i n g n o t a t i o n s are used: K M s( j ) N n ( i , j , N ) n; (N) t o t a l number of job c l a s s e s , t o t a l number of s e r v i c e c e n t r e s , number of c l a s s j jobs i n the subsystem the system s t a t e v e c t o r ( s , , s 2 , s K ) , number of c l a s s j jobs at centre i f o r a given system s t a t e N, t o t a l number of jobs at ce n t r e i fo r a given system s t a t e N, X ( j ) throughput r a t e of c l a s s j jobs 81 w ( i , j ) : mean time a c l a s s j job spends at s e r v i c e c e n t r e i duri n g i t s stay i n the system ( i n c l u d i n g queue wait and s e r v i c e t i m e s ) , R(j) : mean time spent i n the multiprogramming subsystem by c l a s s j jobs, d ( i , j ) : mean t o t a l s e r v i c e demand of c l a s s j ; jobs at s e r v i c e c e n t r e i , S ( j ) : mean t o t a l number of c l a s s j jobs i n the system, vi : mean s e r v i c e r a t e of s e r v i c e c e n t r e i , q ( i , j ) : normalized frequency of requests f o r centre i by c l a s s j jobs ( i . e . , the r a t i o of c l a s s j jobs j o i n i n g c e n t r e i a f t e r being s e r v i c e d by the CPU to the t o t a l number of c l a s s j job completions at the CPU), q;(N) : normalized frequency of requests f o r centre i ( i . e . , the r a t i o of jobs j o i n i n g c e n t r e i a f t e r being s e r v i c e d by the CPU to the t o t a l number of completions at the CPU), C ( j ) : weight f o r c l a s s j jobs. I t can be e a s i l y shown that qi(N) i s given by 82 K qi(N) = I n O , j , N ) q ( i , j , N ) / n,(N). j = 1 (4.1 ) C l e a r l y M I i = 1 qi'(N) = 1 M K I q ( i , j , N ) = 1 and I n( 1,j,N) = n,(N) i = 1 . . . j= 1 \u00E2\u0080\u00A2 There are s e v e r a l methods to compute n ( i , j , N ) for a given system s t a t e vector, N ([Buze73],[ReKo76],[ReLa80]). The drawback i n most of these methods i s that the overhead i n v o l v e d i s q u i t e high. Since the aim here i s to develop an adaptive scheme, i t i s d e s i r a b l e to minimize the amount of computation i n v o l v e d . Reiser and Lavenberg's [ReLa80] mean value a n a l y s i s f o r example, can be used to compute n ( i , j , N ) . The method assumes that the s e r v i c e time of each s e r v i c e c e n t r e i s load-independent and has only one s e r v e r . According to t h e i r study the mean w a i t i n g time, the mean throughput rate and the mean queue s i z e s f o r each c l a s s j at s e r v i c e c e n t r e i are r e c u r s i v e l y given by: w ( i , j ) = d ( i , j ) (1 + n; ( N - e ( j ) ) ) (4.2) X ( j ) M = s ( j ) / E w ( i , j ) i = 1 (4.3) n ( i , j ) = X ( j ) w ( i , j ) (4.4) 83 where e ( j ) i s the j t h - u n i t v e c t o r . S t a r t i n g with the i n i t i a l c o n d i t i o n n ( i , j , N ) = 0 and using equations (4 .2) , (4.3) and (4.4) one can then o b t a i n n ( i , j , N ) . T h e i r arguments are based on the i n t u i t i o n that upon a r r i v a l at a s e r v i c e c e n t r e , a job \"sees\" the system with i t s e l f removed ( i . e . , one job l e s s i n the system). L i t t l e ' s Law i s then a p p l i e d to the e n t i r e system and to each i n d i v i d u a l s e r v i c e c e n t r e . T h i s a l g o r i t h m r e q u i r e s 2KM - K a d d i t i o n s and 2KM + K m u l t i p l i c a t i o n s / d i v i s i o n s per r e c u r s i v e step. There are a t o t a l of s , . s 2 . . . s K such s t e p s . In a. system with a moderately l a r g e number of c l a s s e s and a l a r g e number of jobs such a scheme might not be economically f e a s i b l e . Reiser and Lavenberg have extended t h e i r scheme and have reduced i t to the problem of s o l v i n g a set of n o n - l i n e a r equations. For each centre r they i n t r o d u c e d a c o r r e c t i o n term c ( i , j , r , N ) such that n ( i , j , N - e ( r ) ) = n ( i , j , N ) - e ( i , j , r , N ) (4.5) with the assumption that e ( i , j , r , N ) = 0 f o r i t r i . e . , only the c l a s s with the customer removed i s a f f e c t e d . 84 They a l s o assumed that \u00E2\u0082\u00AC ( i , j , i , N ) i s given by e ( i , j , i , N ) = n j ' ( s j ) - n i ' ( s j - 1 ) (4.6) where n ; ' ( s j ) i s the mean number of c l a s s j jobs at the s e r v i c e c e n t r e i with s j c l a s s j customers. The mo d i f i e d parameters of the system are d ; ' ( i , j ) = d ( i , j ) ni(N) / n ( i , j , N ) (4.7) Using (4.5), the r e c u r s i v e equations (4.2), (4.3) and (4.5) can be reduced to the set of n o n - l i n e a r equations K w ( i , j ) = d ( i , j ) (1 + ni - I \u00C2\u00A3 ( r , j , i ) ) : (4.8) r=1 M U j ) = s ( j ) / I w ( i , j ) (4.9) i = 1 n ( i , j) = X(j) w ( i , j ) (4.10) The equations (4.8), (4.9) and (4.10) along with (4.6) can be s o l v e d simultaneously to o b t a i n n ( i , j , N ) . The number of o p e r a t i o n s r e q u i r e d by t h i s procedure i s p r o p o r t i o n a l to M.|N|, which i s c o n s i d e r a b l y l e s s than the o r i g i n a l a l g o r i t h m . In the implementation of the m u l t i c l a s s c o n t r o l we use t h i s extended v e r s i o n of t h e i r scheme. 85 Once the n ( i , j , N ) ) ' s are known for. a given N one can compute q;(N) from equation (4.1). F o l l o w i n g the same arguments used i n the case of a s i n g l e c l a s s system, i t can be shown that the s a t u r a t i o n v e c t o r N* of the system i s given by the r e l a t i o n |N*j = U \ * / \u00C2\u00BB ,qi*(N)) M \u00E2\u0080\u00A2 \" 1 + 1 \u00C2\u00A3_L q|(N) i = 2 f \ (4.11 ) K where |N.*| = 1 s ( j ) * , q!*(N) = 1 and i * i s the device with the h i g h e s t u t i l i z a t i o n i n the o b s e r v a t i o n p e r i o d . N o t i c e that N* i s no more a s i n g l e number as i n the case of a s i n g l e c l a s s . Rather, i t i s a vecto r and equation (4.11) alone i s not s u f f i c i e n t to uniquely determine N*. T h i s i s why we need a second c r i t e r i o n f o r lo a d c o n t r o l . . 4.3 M u l t i c l a s s Optimal S e l e c t i o n The f o l l o w i n g d i f f e r e n t o b j e c t i v e s are c o n s i d e r e d for optimal s e l e c t i o n . (a) Compute the r a t i o that maximizes the system throughput r a t e . (b) Compute the r a t i o that takes i n t o c o n s i d e r a t i o n job p r i o r i t i e s ( i . e . , some weights are a s s o c i a t e d 86 . with each c l a s s of jobs) and minimizes a weighted f u n c t i o n of t h e i r w a i t i n g time. Schonbach [Scho80] solved the problem using ( a ) . One obvious problem with t h i s approach ( i . e . , . without c o n s i d e r i n g the job p r i o r i t i e s ) i s that i t may give r e l a t i v e l y higher p r i o r i t y to small jobs. In some cases l a r g e jobs may have to wait i n d e f i n i t e l y . Furthermore, i t u s u a l l y leads to a dynamic programming problem which i n t u r n r e q u i r e s high overhead. T h e r e f o r e , we s h a l l use c r i t e r i o n (b). For c r i t e r i o n (b) the o p t i m i z a t i o n problem becomes K Min z Z Min { I Z\ W j . } i = 1 K subject to I s ( i ) < | N*| i=1 where Wj i s the w a i t i n g time f o r c l a s s j jobs i n the system. Wj = R(j) + ( S ( j ) - s ( j ) ) / ( s ( j ) / R ( j ) ) , j=1,2,...,K. F o l l o w i n g arguments s i m i l a r to the one i n Chapter 3 i t can be shown that the s o l u t i o n of the problem i s given by: s( i )* = | N* | VC( i ) S ( i )R( i )\" , i = 1 , 2, . . . , K. (4.12) K _ T_^_ T__ T_^_^ I V C ( j ) S ( j ) R ( j ) j = 1 87 Furthermore, i t can be shown that only (K-1) out of the K r e l a t i o n s h i p s i n (4.12) are l i n e a r l y independent. In order f o r the system not to be s a t u r a t e d , the s ( i ) ' s must s a t i s f y (4.11). Therefore, there are K unknowns (the s ( i ) ' s ) i n K n o n - l i n e a r independent equations. A unique s o l u t i o n t h e r e f o r e e x i s t s . Note that R ( j ) = I w ( i , j ) whose value i s obtained i n i n the computation of qi(N) in equation (4.1). The m u l t i c l a s s c o n t r o l procedure, which we s h a l l c a l l MULTI-SELF, can be summarized as f o l l o w s : During the o b s e r v a t i o n p e r i o d T, c o l l e c t the v a l u e s of the parameters r e q u i r e d f o r the computations ( i . e . , . the branching f r e q u e n c i e s to d i f f e r e n t s e r v i c e c e n t r e s f o r d i f f e r e n t c l a s s e s of jobs; the average s e r v i c e r a t e s of d i f f e r e n t c e n t r e s and the mean number of jobs i n the system f o r each c l a s s ) . From .the measured parameter val u e s compute t h e i r expected values f o r the next i n t e r v a l using e x p o n e n t i a l smoothing. Step 1 . Step 2. Step 3. Solve the system of n o n - l i n e a r equations (4.8) through (4.12) si m u l t a n e o u s l y . 88 Step 4 . For each c l a s s i , maintain the number of jobs i n the subsystem to be s ( i ) * ( i f p o s s i b l e ) during the next o b s e r v a t i o n p e r i o d . The next chapter d e s c r i b e s a s i m u l a t i o n study of the performance of the two c o n t r o l schemes SELF and MULTI-SELF. T h e i r performances are compared to that of other e x i s t i n g schemes.. 89 CHAPTER 5 SIMULATION RESULTS AND ERROR ANALYSIS 5_.j_ I n t r o d u c t i o n Throughout the development of the c o n t r o l schemes SELF and MULTI-SELF presented i n the l a s t two chapters, a number of assumptions have been made. These assumptions may not be s a t i s f i e d i n p r a c t i c e . I t i s t h e r e f o r e d e s i r a b l e to determine the accuracy of these models and to compare t h e i r performance to those of some of the pr e v i o u s models. Because of the l a c k of resources i t was not p o s s i b l e to implement the proposed schemes on a r e a l system. Instead, a general purpose e v e n t - d r i v e n s i m u l a t o r f o r a c e n t r a l server model was developed. The schemes were then implemented on t h i s s i m u l a t o r to c o n t r o l the flow of jobs through the simulated system. The workload that d r i v e s the si m u l a t o r does not make the same assumptions that were made in the development of SELF and MULTI-SELF. For example, 90 the jobs are not i d e n t i c a l , and t h e i r c h a r a c t e r i s t i c s may change from time to time. A l s o i t may not be p o s s i b l e to maintain the computed degree of multiprogramming d u r i n g every o b s e r v a t i o n i n t e r v a l and the job flow balance might not be s a t i s f i e d during every o b s e r v a t i o n i n t e r v a l , e t c . The s i m u l a t o r i s q u i t e f l e x i b l e . I t can be run with any p r a c t i c a l number of d i f f e r e n t c l a s s e s of jobs and can handle any p r a c t i c a l number of I/O d e v i c e s . A l s o , i t can use v a r i o u s d i s t r i b u t i o n s to generate the s e r v i c e times of d i f f e r e n t s e r v i c e c e n t r e s . . The major components of the simulator are d e s c r i b e d i n the next s e c t i o n . R e s u l t s comparing the performance of the two schemes with some of the e x i s t i n g ones are a l s o presented. A summary of the v a r i o u s assumptions made, and t h e i r e f f e c t s on the performance of the two schemes i s given i n the l a s t s e c t i o n . 5.2 D e s c r i p t i o n of the Simulator The s i m u l a t o r i s w r i t t e n i n PL/1 and i s implemented on an Amdhal 470 V/8 running under MTS. There are s e v e r a l components of the s i m u l a t o r . They w i l l be examined a f t e r a d i s c u s s i o n of the flow of jobs through the simulated system. A sequence of j o b - p r o c e s s i n g step i s d e s c r i b e d below. 91 Each of them corresponds to an event w i t h i n the model. The. simulated model i s the same as that d e p i c t e d e a r l i e r i n Fi g u r e 3.7. (1) Upon a r r i v a l , a job e n t e r s the c o n t r o l u n i t and j o i n s the queue corresponding to i t s c l a s s . I f the c o n t r o l queue corresponding to i t s c l a s s i s empty and the c u r r e n t c o n t r o l p o l i c y allows more jobs of t h i s c l a s s i n the subsystem, the job moves to the CPU queue immediately. Otherwise, i t waits f o r i t s turn i n the c o n t r o l queue. (2 ) A job may j o i n the CPU queue from one of the s e v e r a l s e r v i c e c e n t r e s or from the c o n t r o l queue. Depending on where i t came from and the c o n d i t i o n s under which a job enters the CPU queue the f o l l o w i n g a c t i v i t i e s w i l l happen. (a) If i t ente r s from the c o n t r o l u n i t , i t i s assig n e d a f u l l CPU quantum. I t s next I/O time and type are generated. The time i n d i c a t e s the amount of CPU s e r v i c e r e q u i r e d between two s u c c e s s i v e I/O o p e r a t i o n s . The type i n d i c a t e s whether the job w i l l leave the system or j o i n one of the I/O u n i t s . In the l a t t e r case i t a l s o . i n d i c a t e s which of the I/O u n i t s i t w i l l j o i n . (b) I f i t enters a f t e r completion of i t s time quantum, i t 92 i s a ssigned a new f u l l time quantum. (c) I f i t enters a f t e r completion of an I/O request, i t s next I/O time and type are generated as i n ( a ) . (d) If i t enters a f t e r completion of a page f a u l t , i t s remaining time quantum, the next I/O time and type w i l l be used. Depending upon the system to be simulated, jobs may enter the head or the t a i l of the CPU queue. In MTS, f o r example, a job j o i n s the head of the CPU queue a f t e r a page f a u l t or an I/O competion. I t j o i n s the t a i l of the queue on the f i r s t a r r i v a l or i f i t s time quantum has e x p i r e d . (3) Departure from the CPU occurs on completion of one of the f o l l o w i n g events: (a) the CPU time quantum e x p i r e s ; i n t h i s case, the remaining time f o r the next I/O event i s computed, and the job i s moved back to the CPU queue, (b) an I/O request i s made; i n t h i s case, the remaining CPU quantum time i s computed and the job i s moved to the requested I/O u n i t , (c) a page f a u l t - o c c u r s ; in t h i s case, the remaining CPU 93 quantum and I/O time f o r the c u r r e n t l y running job are computed, the job i s moved to the paging device and. the next page f a u l t time i s generated. (4 ) Upon a r r i v a l to the paging d e v i c e , the job waits in the paging queue i f the device i s busy. Before the paging device s t a r t s p r o c e s s i n g . t h e job, i t s s e r v i c e time i s generated. Upon completion of the page s e r v i c e the job r e j o i n s the CPU queue. (5) Upon a r r i v a l at an I/O u n i t , the job waits i n the I/O queue i f the d e v i c e , i s busy. Before the I/O de v i c e s t a r t s p r o c e s s i n g the job, i t s s e r v i c e time i s generated. A f t e r completion of the I/O s e r v i c e the job r e j o i n s the CPU queue. (6) Upon completion of a l l the CPU and I/O demands, the job's s t a t i s t i c a l data are c o l l e c t e d . The c o n t r o l procedure i s then a c t i v a t e d to check i f another job can be i n j e c t e d i n t o the subsystem. The simulated system i s i n i t i a l i z e d with the system and job parameters. The system parameters then generate a b a s i c data s t u c t u r e f o r the modelled system. Some of the system parameters are: the number of job c l a s s e s , the number of I/O u n i t s , the c o n t r o l c r i t e r i o n , the memory 94 s i z e , the l i f e - t i m e f u n c t i o n parameters, e t c . The job parameters f o r each job c l a s s are the e x i t p r o b a b i l i t y , the I/O r a t e s e t c . when a job i s admitted i n t o the subsystem, a job d e s c r i p t o r i s c r e a t e d f o r i t . The job d e s c r i p t o r i d e n t i f i e s the job and i t s c l a s s . Data a s s o c i a t e d with the job's a c t i v i t y throughout i t s l i f e span in the subsystem are c o l l e c t e d and s t o r e d i n the job d e s c r i p t o r . The queues at the v a r i o u s s e r v i c e c e n t r e s are maintained as a l i n k e d l i s t with head and t a i l p o i n t e r s . Each node of the l i s t i s a p o i n t e r to a job d e s c r i p t o r i n d i c a t i n g the presence of the job at that p a r t i c u l a r p o s i t i o n i n the queue. A separate procedure dynamically c r e a t e s the I/O u n i t s at the beginning of a s e s s i o n . Upon each I/O request, another procedure l i n k s t h i s request to the proper I/O u n i t . The request i s then s e r v i c e d . The s i m u l a t o r can be run under d i f f e r e n t c o n t r o l schemes v i z . , 50%, L=S, Knee, SELF, MULTI-SELF and NO CONTROL. An event i s scheduled every T u n i t s of time to c o l l e c t data and a c t i v a t e the r e q u i r e d c o n t r o l procedure. The c o n t r o l procedure then computes the number of jobs i n each c l a s s that should be maintained i n the next T u n i t s of 95 time. During the o b s e r v a t i o n p e r i o d the mean number of jobs i n the subsystem i s not allowed to exceed the computed c o n t r o l number. Page f a u l t s : a r e generated using the system l i f e - t i m e f u n c t i o n as d e f i n e d i n [BeKu69] i . e . , L = 2b ( 5 . 1 ) 1 + ( c / p ) 2 where b and c are constants and p i s the average number of pages a l l o c a t e d to each job. Each time a page f a u l t occurs, the next page f a u l t time i s computed based on equation ( 5 . 1 ) and the number of jobs c u r r e n t l y i n the subsystem. A page f a u l t i s implemented as a system a c t i v i t y r a t h e r than a job a c t i v i t y . T h e r e f o r e whenever the page f a u l t time i s reached, whichever job i s present at the GPU at that time w i l l experience the page f a u l t . In order to c r e a t e e x a c t l y the same workload f o r v a r i o u s experiments with d i f f e r e n t c o n t r o l schemes, a pseudo-random number generator i s used. Given the same i n i t i a l seed, the same sequence of random numbers w i l l be generated each time. To compare v a r i o u s c o n t r o l schemes, the s i m u l a t o r i s run u s i n g the same random-number stream. Under v a r i o u s c o n t r o l schemes d i f f e r e n t job a c t i v i t i e s can take p l a c e i n d i f f e r e n t order. T h e r e f o r e , a common random-number stream can not be used f o r d i f f e r e n t jobs. 96 In order to r e s o l v e t h i s problem, we use the f a c t that the order of job a r r i v a l s i s independent of the c o n t r o l scheme. Thus each job i s a s s i g n e d a seed as soon as i t a r r i v e s at the system. T h i s seed i s used to generate a pseudo random-number sequence f o r v a r i o u s j o b - r e l a t e d a c t i v i t i e s (such as, next I/O time and type, I/O s e r v i c e time e t c . ) . Furthermore, because paging i s a system a c t i v i t y and depends on the s i z e of main memory, the number of jobs i n the system e t c . , the p a g e - f a u l t time and page s e r v i c e time use independent random-number streams. The s i m u l a t o r was v a l i d a t e d in two d i f f e r e n t ways, (a) S e l e c t i v e dumps of. a l l a c t i v i t i e s over s e v e r a l p r e s p e c i f i e d p e r i o d s of time were taken and hand t r a c e d . Because there i s a f a i r l y l a r g e number of a c t i v i t i e s even i n a small i n t e r v a l of time i t was not p o s s i b l e to go over a l l the a c t i v i t i e s of one s e s s i o n . T h e r e f o r e , (b) L i t t l e ' s Law was used to compute the g l o b a l mean behaviour of the system as w e l l as of the i n d i v i d u a l s e r v i c e c e n t r e s ; these were then v e r i f i e d a g a i n s t the observed ones. The observed v a l u e s were w i t h i n 0.05% of the computed v a l u e s . 5.3_ S i m u l a t i o n R e s u l t s In order to study the f e a s i b i l i t y and behaviour of SELF and MULTI-SELF we f i r s t compare the performance of SELF with some other schemes, s p e c i f i c a l l y the 50%, the 97 L=S, and the Knee c r i t e r i a . Then SELF i s compared with MULTI-SELF. The workload corresponding to the r e s u l t s i n Table 5.1 through Table 5.13 are given i n Table A.1-through Table A.9 i n Appendix A. The system parameter values were s e l e c t e d to be s i m i l a r to those of the IBM 370/168 system used by the UBC computing c e n t r e a few years ago.. The workload parameter v a l u e s were based on measured workloads on the 370/168 with p e r t u r b a t i o n s introduced to t e s t the s t a b i l i t y of SELF and MULTI-SELF. Because s i m u l a t i o n runs are expensive, the runs were made as short as p o s s i b l e . The runs of 120 simulated seconds were made. During t h i s p e r i o d , approximately 200 jobs were processed. I t i s found that the mean response time s t a b i l i z e s around 120 seconds. Table 5.1 shows a t y p i c a l set of mean response times observed between 75 and 180 seconds at an i n t e r v a l l ength of 3 seconds. SIMULATED TIME (SEC.) TERMINAL BATCH 75 0.3112 0.3744 90 0.3484 0.4420 1 05 0.3482 0.4632 120 0.3489 0.4627 1 35 0.3380 0.4630 1 50 0.3389 0.4630 165 0.3337 0.4396 180 0.3403 0.4448 Table 5.J_ Mean Response Time 98 The performance of the 50% c r i t e r i o n and the L=S c r i t e r i o n depends upon c e r t a i n parameter val u e s which are f u n c t i o n s of the system l o a d . For example, i n the L=S c r i t e r i o n , we must f i n d a constant \u00C2\u00BB and use L=MS. The value of ii depends upon the job c h a r a c t e r i s t i c s . From Table 5.2 we ob t a i n 0.6 as the best value of n for the workload under c o n s i d e r a t i o n . 0.2 0.50 0.60 0.75 TERMINAL 9.2093 1.4124 . 1 .7085 1.7247 BATCH 13.7833 4.1951 3.4939 4.4230 SYSTEM 10.3977 2.3303 2.3109 2.6465 Table 5.2 Mean Response Time ( i n sec.) For v a r i o u s values of ^ i_n L=S c r i t e r i o n Although the job c h a r a c t e r i s t i c s change dynamically i n the workloads that we s h a l l be a n a l y s i n g , the value of u i s a constant i n the L=S c r i t e r i o n . Therefore f o r every workload we f i r s t o b t a i n a best value of \u00C2\u00BB and then use i t . S e v e r a l s i m u l a t i o n runs were made which show the s u p e r i o r i t y of SELF over 50% and L=S. Some of the r e s u l t s obtained are presented i n Tables 5.3 through 5.5. 99 SELF . RESP. TIME (SEC) 50% RESP. TIME (SEC) %IMP* OVER 50% L=S RESP. TIME (SEC) %IMP OVER L=S TERMINAL 0.5409 0.5531 2.20 0.5444 0.64 BATCH 0.8901 0.9600 7.28- 0.9574 . 7.08 SYSTEM 0.6552 0.6785 ; 3.43 . 0.6724 2.55 Table 5.3 Resp. Time and %Impr. for Workload in Table A.3_ SELF RESP. TIME (SEC) 50% RESP. TIME (SEC) %IMP* OVER 50% L=S RESP. TIME (SEC) %IMP OVER L=S TERMINAL 0.6111 0.8262 26.03 0.8404 27.28 BATCH 1.5418 2.2729 32. 16 2.0050 23.15 SYSTEM 0.9531 1 .3586 29.84 1 .2684 24.86 Table 5.j4 Resp. Time and %Impr. for Workload in Table A.4 1 % impr. = . (50%Resp. time - SELF Resp. time) * 100 50% Resp. Time 100 SELF RESP. TIME (SEC) 50% RESP. TIME (SEC) %IMP. OVER 50% L=S RESP. TIME (SEC) %IMP. OVER . L=S TERMINAL 1.9096 2.7638 30.90 2.6061 26.73 BATCH 3.8750 4.9179 21.21 4.4218 12.37 SYSTEM 2.5218 3.4243 26.36 3.1707 20.47 Ta b l e 5.5 Resp. Time and %Impr. f o r Workload i n Ta b l e A.5 ' 1 0 3 The parameters f o r the workload corresponding to these r e s u l t s are given i n Appendix A. In order to demonstrate the dynamic a d a p t a b i l i t y of SELF over the other schemes, an a r t i f i c i a l v a r i a t i o n i n the workload i s i n t r o d u c e d . F i g u r e s .5.1 and 5.2 represent graphs of the mean number of jobs vs s i m u l a t i o n time and correspond to the workload of Table 5.3 and 5.4 r e s p e c t i v e l y . The workload corresponding to F i g u r e 5.1 has a smaller v a r i a t i o n compared to the workload of F i g u r e 5.2. The r e l a t i v e improvement of SELF over 5.0% and L=S in Table 5.4 i s a l s o greater as compared to the one i n Table 5.3. Thus i t i s apparent that the l a r g e r the workload v a r i a t i o n , the more s u p e r i o r i s the performance of SELF r e l a t i v e to the other two schemes. T h i s demonstrates the robustness and adaptive nature of SELF under v a r y i n g workload. Under l i g h t workload a l l the schemes g i v e approximately the same r e s u l t s as no c o n t r o l i s r e q u i r e d (see f o r example Table 5.3). Although the Knee c r i t e r i o n i s b e t t e r than the 50% and the L=S c r i t e r i a , i t i s expensive to implement i n p r a c t i c e . However, s i n c e the workload of the s i m u l a t o r i s d i s t r i b u t i o n - d r i v e n and the l i f e - t i m e f u n c t i o n approximated by equation (5.1), i t i s p o s s i b l e to simulate the Knee c r i t e r i o n without e x c e s s i v e overhead. The Knee c r i t e r i o n r e q u i r e s each job to run at the knee of i t s l i f e - t i m e f u n c t i o n , i . e . , at the p o i n t where the curve of the l i f e - t i m e of a process vs i t s memory a l l o c a t e d has maximum 104 slope. I t can be shown that, i f the l i f e - t i m e f u n c t i o n i s simulated using equation (5.1), then t h i s maximum slope i s a t t a i n e d when p = 2c, which i s independent of the parameter b. T h e r e f o r e , i f equation (5.1) i s used to simulate the l i f e - t i m e , by s u i t a b l y choosing the values of b and c one can c r e a t e a worst case workload f o r the Knee c r i t e r i o n without s i g n i f i c a n t l y a f f e c t i n g the performance of the other c r i t e r i a . A f t e r s e l e c t i n g a combination of parameters to favour the Knee c r i t e r i o n , the r e s u l t s shown in Table 5.6 were obtained. SELF RESP. TIME (SEC) 50% RESP. TIME (SEC) %IMP. OVER 50% L=S RESP. TIME (SEC) %IMP. OVER L=S KNEE RESP. TIME (SEC) %IMP. OVER KNEE TERMINAL 2.2405 3.2066 43. 1 1 3.1424 40.25 2.8461 27.02 BATCH 4.2405 5.0555 18.29 4.6618 9.94 4.3399 2.34 SYSTEM 2.8448 3.7727 32.62 3.6112 26.94 3.3070 1 6.24 Table 5.6 Resp. Time and %Impr. of SELF, 50%, L=S and Knee We observe that the knee c r i t e r i o n i s b e t t e r than the 50% and L=S c r i t e r i a but not as good as SELF under the workload c o n s i d e r e d . SELF allows one to adjust the q u a l i t y of s e r v i c e given to the t e r m i n a l and batch jobs. By choosing proper weights 105 the a n a l y s t can reduce the mean response time of the t e r m i n a l jobs to almost the lower l i m i t ( i , e . , when only t e r m i n a l jobs are present in the subsystem), at the expense of the mean batch response time. Tables 5.7 and 5.8 show the mean response times of t e r m i n a l and batch jobs f o r d i f f e r e n t values of the weight f a c t o r s f o r two d i f f e r e n t workloads corresponding to those in Tables A.7 and A.8 r e s p e c t i v e l y . C1/C2=1 2 3 4 5 20 TERMINAL BATCH SYSTEM 10.6451 20.1333 14.281 6 7.4705 22.7354 13.0492 5.4945 22.6035 11.6829 4.8306 23.4670 11.4475 4.5491 24.8099 11.5916 2.9849 27.6247 11.4095 Table 5.7 Mean Resp. Time ( i n sec) Using SELF With D i f f . Weight R a t i o s Under Heavy Workload C1/C2=1 2 3 4 5 20 TERMINAL BATCH SYSTEM 2.3774 3.8004 2.8128 2.2405 4.2405 2.8448 2.0437 4.1070 2.6620 1.9000 4.3152 2.6286 1.7773 4.4305 2.5777 1.7770 5.1705 2.7853 ; Table 5.8 Mean Resp. Time ( i n sec) Using SELF With D i f f . Weight R a t i o s Under L i g h t Workload 1 06 The r e s u l t s i n Tables 5.7 and 5.8 show the e f f e c t , of changing the weights under heavy and l i g h t loads r e s p e c t i v e l y . , The overhead i n v o l v e d i n the implementation of SELF c o n s i s t of two d i f f e r e n t components. (a) The overhead due to c o l l e c t i n g the data during the o b s e r v a t i o n i n t e r v a l s . (b) The overhead due to the computation of the c o n t r o l number.. Overhead (a) depends upon the system c o n f i g u r a t i o n -(e.g., the number of I/O u n i t s . etc.,) and job c h a r a c t e r i s t i c s (e.g., t o t a l CPU demand, number of I/O requests e t c . ) . The overhead i n (b) only depends upon the system c o n f i g u r a t i o n . The overhead (a) f o r the system and the workload c o n s i d e r e d i n the above examples i s estimated to be approximately 0.125% of CPU time on an Amdhal 470 V/8 system. The percentage i s computed as f o l l o w s : % CPU Time = Computation Time * 100 I n t e r v a l Length The overhead i n (b) i s estimated to be approximately 0.04% of CPU time. T h e r e f o r e , the t o t a l overhead f o r SELF i s approximately 0.165%. T h i s l e v e l of overhead i s c e r t a i n l y a c c e p t a b l e . 1 07 We now compare SELF with MULTI-SELF. In the pre v i o u s examples only two c l a s s e s of jobs were c o n s i d e r e d . We use m u l t i - c l a s s c o n t r o l to handle four d i f f e r e n t c l a s s e s of jobs i n our next examples. T h i s small number i s chosen i n order to keep the s i m u l a t i o n c o s t s reasonable. MULTI-SELF can t h e o r e t i c a l l y , handle any number of c l a s s e s . The jobs in the f i r s t two c l a s s e s are short jobs with high p r i o r i t i e s and can be c o n s i d e r e d as t e r m i n a l jobs. The jobs i n the other two c l a s s e s are longer jobs with low p r i o r i t i e s and can be c o n s i d e r e d as batch jobs. . The mean response times of the four d i f f e r e n t c l a s s e s of jobs under SELF and MULTI-SELF are shown i n Table 5.9. CLASS WEIGHT SELF RESP. TIME (SEC) MULTI SELF RESP. TIME (SEC) %IMP OVER SELF 1 2.5 0.4329 0.3000 30.70 2 2.0 0.4483 0.3191 28.82 3 1 .5 2.01.55 1.9418 3.66 4 1 .0 4.4868 4.2737 4.75 Table 5.9 Mean Response Times of Jobs Under SELF and MULTI-SELF With S t a t i c Beta 108 It may be observed that there i s a c o n s i d e r a b l e improvement in the response times of short jobs with high p r i o r i t i e s , whereas only marginal improvement i s observed f o r longer jobs with low p r i o r i t i e s . T h i s improvement i s achieved at the expense of a d d i t i o n a l overhead. The t o t a l overhead of MULTI-SELF for t h i s c o n f i g u r a t i o n of the system and f o r the s e l e c t e d workload i s approximately 4.32% of the t o t a l CPU time. The c o r r e s p o n d i n g overhead f o r SELF i s approximately 0.165%. In the implementation of SELF and MULTI-SELF in the above example, the values of p ( i n equation (3.25)) are computed only once for each parameter and then these constant values are used throughout the experiment. One can improve the performance of these schemes by dynamically computing the values of p at each i n t e r v a l using equation (3.29), thus reducing the e r r o r i n the e s t i m a t i o n of the values of the workload parameters. 109 SELF MULTI %IMP RESP. SELF OVER CLASS WEIGHT TIME RESP. SELF (SEC) TIME (SEC) 1 2.5 0.3564 0.2875 23.96 2 2.0 0.3057 0.2998 . 1 .92 3 1.5 1.8350 1.8029 ,1.74 4 1.0 4.1917 4.1322 1.41 Table 5 .J_0 Mean Response Times of Jobs Under SELF and MULTI-SELF With Dynamic Beta\u00E2\u0080\u00A2. Table 5.10 shows the. mean response times of the four c l a s s e s of jobs with dynamic computation of p. I t i s observed that MULTI-SELF e x h i b i t s an improvement i n the response time over SELF. I t i s not as much as i n the case of Table 5.9. Furthermore, by dynamically recomputing the values of P i n SELF an improvement i s observed over SELF with s t a t i c 0 (see Table 5.11). In the case of MULTI-SELF only marginal improvement i s achieved when 0 i s dynamically recomputed (see Table 5.12). n o CLASS WEIGHT STATIC BETA RESP. TIME (SEC) DYM. BETA RESP. TIME (SEC) %IMP. OVER STATIC BETA 1 2 . 5 0 . 4 3 2 9 0 . 3 5 6 4 21 . 4 4 2 2 . 0 0 . 4 4 8 3 0 . 3 0 5 7 4 6 . 6 4 3 1 . 5 - 2 . 0 1 5 5 1 . 8 3 5 0 0 9 . 8 3 4 1.0 4 . 4 8 6 8 4 . 1 9 1 7 0 7 . 0 4 V Table 5_. n_ Mean Response Times of Jobs Under SELF with S t a t i c and Dynamic Beta CLASS WEIGHT STATIC BETA RESP. TIME (SEC) DYM. BETA RESP. TIME (SEC) %IMP. OVER STATIC BETA 1 2 . 5 0 . 3 0 0 0 0 . 2 8 7 5 4 . 1 7 2 2 . 0 0 . 3 1 9 1 0 . 2 9 9 8 . 6 . 0 5 3 1 .5 1 . 9 4 1 8 1 . 8 0 2 9 7 . 1 5 4 1.0 4 . 2 7 3 7 4 . 1 3 2 2 3 .31 Table 5_.J_2 Mean Response Times of Jobs Under MULTI-SELF Resp. With S t a t i c and Dynamic Beta The overhead i n v o l v e d i n the case of SELF with dynamic computation of the value s of p i s approximately 12 .40% of , 1 1 1 CPU time, where as i n the case of MULTI-SELF i t i s approximately 45.69%. Ther e f o r e we can conclude that i t i s not worthwhile to dynamically compute the value s of p , at l e a s t i n the case of MULTI-SELF. It seems b e t t e r to implement MULTI-SELF with a constant value.of > rather than SELF with dynamic values of p . The above o b s e r v a t i o n s were made on the b a s i s of a few examples. T h i s i s due to the high cost of s i m u l a t i o n . However, the workload was c a r e f u l l y s e l e c t e d to r e f l e c t the worst case f o r these schemes. I t i s expected that the performance of these schemes w i l l vary with d i f f e r e n t workloads, but t h e i r order of magnitude w i l l not d i f f e r s i g n i f i c a n t l y . 5.4 E r r o r a n a l y s i s In t h i s s e c t i o n we o u t l i n e some of the most important assumptions made i n order to make the models mathematically t r a c t a b l e and the c o n t r o l schemes p r a c t i c a l l y f e a s i b l e . We a l s o analyse the e r r o r i n t r o d u c e d because of these assumptions. Assumption j_. I dent i c a l jobs SELF assumes that a l l the jobs are i d e n t i c a l i n t h e i r resource demands, whereas MULTI-SELF assumes t h a t the jobs 1 12 w i t h i n each c l a s s are i d e n t i c a l . In an a c t u a l system, job c h a r a c t e r i s t i c s may vary widely. The s y n t h e t i c workload s e l e c t e d to d r i v e the simulator does not make t h i s assumption. Not only d i f f e r e n t jobs have d i f f e r e n t c h a r a c t e r i s t i c s , but a l s o t h e i r c h a r a c t e r i s t i c s change from time to time. The extent of the improvement obtained by c l a s s i f y i n g jobs i n t o four c l a s s e s can be seen i n Table 5.9. Schonbach [Scho80] suggested a way of reducing t h i s \ e r r o r by c r e a t i n g an independent c l a s s f o r each job. T h i s may solve the problem to a c e r t a i n extent but the overhead i n v o l v e d w i l l a l s o i n c r e a s e c o n s i d e r a b l y . Assumption 2. E s t i m a t i o n of Parameters Both the schemes estimate the value s of parameters on the b a s i s of t h e i r past v a l u e s . In order to reduce the e r r o r i n the e s t i m a t i o n we have used the simplest method of ex p o n e n t i a l smoothing. T h i s e r r o r can be f u r t h e r reduced by dynamically computing the values of p i n the e x p o n e n t i a l smoothing. Although dynamic computation of p does not r e q u i r e more than a maximum of 10 pre v i o u s values ( i . e . , an i n s i g n i f i c a n t storage requirement) the computation overhead i n q u i t e h i g h . Moreover, i t i s observed that the improvement achieved by using dynamic p i s marginal i n the case of MULTI-SELF. A compromise i s to recompute the v a l u e of p a f t e r l a r g e i n t e r v a l s of time. Table 5.13 shows the percentage e r r o r i n v o l v e d i n the p r e d i c t i o n of one of the 113 system parameter without smoothing, with s t a t i c smoothing and with dynamic smoothing. % ERROR RESP. TIME (SEC) %IMP. NO SMOOTHING 50.5 , 6.2271 STATIC SMOOTHING 12.6 4.4868 38.78 DYNAMIC SMOOTHING 04.4 .4.1917 48.55 Table 5.J_3 % E r r o r and %Imp. i n Resp. Time under S t a t i c , Dynamic and No Smoothing The improvements i n both cases are s i g n i f i c a n t over no smoothing. But there i s not much improvement of dynamic smoothing with respect to s t a t i c smoothing. Assumption 3. Constant Degree of Multiprogramming The two schemes r e q u i r e the degree of multiprogramming to be maintained at the computed l e v e l . T h i s c o n d i t i o n may not be s a t i s f i e d during every o b s e r v a t i o n i n t e r v a l . For example, d u r i n g c e r t a i n i n t e r v a l s t here c o u l d be very l i g h t l o a d ( i . e . , very few jobs) at the beginning f o l l o w e d by a sudden bu r s t of j o b s . Under such circumstances, the number of jobs i n the system w i l l be i n i t i a l l y below the computed number and then, because of the c o n t r o l , i t . w i l l never exceed the c o n t r o l number. As a r e s u l t , the mean number of jobs i n the subsystem a f t e r the time i n t e r v a l w i l l be l e s s than the c o n t r o l number. T h i s problem can be so l v e d to a c e r t a i n extent by comparing the c o n t r o l number with the mean number of c u r r e n t jobs i n the system r a t h e r than with the a c t u a l number of c u r r e n t jobs in the system. An improvement of approximately 12% in response time was observed when t h i s m o d i f i c a t i o n was made. Assumption _4. Job Flow Balance O p e r a t i o n a l a n a l y s i s r e q u i r e s the job flow to be balanced at every s e r v i c e c e n t r e in the system. I t i s found that i n the simulated system t h i s i s s a t i s f i e d almost 95% of the time. Whenever i t i s not s a t i s f i e d ( i . e . , the number of a r r i v a l s at a c e n t r e d u r i n g an i n t e r v a l i s not equal to the number of departures) the e r r o r 2 i s never more than 2%. There are a few other f a c t o r s that e f f e c t the v a l i d i t y of the r e s u l t s . The o b s e r v a t i o n i n t e r v a l and the CPU time quantum le n g t h are set to 3.0 and 0.010 simulated seconds r e s p e c t i v e l y . These are the values used by the Michigan Terminal System at UBC. However, there are standard 2 % E r r o r = [no. a r r i v a l - no. departure 1 * 100 no. Departure 1 1 5 t e c h n i q u e s t o compute the. v a l u e of the o b s e r v a t i o n i n t e r v a l (see [ B o J e 7 6 ] ) . 1 16 CHAPTER 6 CONCLUSIONS AND EXTENSIONS T h i s t h e s i s has demonstrated the v e r s a t i l i t y of using o p t i m i z a t i o n techniques along with queueing theory to solve some of the decision-making problems i n computer system performance e v a l u a t i o n . The emphasis has been t o obt a i n s o l u t i o n s on the b a s i s of mathematical m o d e l l i n g . Closed form s o l u t i o n s were ob t a i n e d wherever p o s s i b l e to reduce the computational overhead and to attempt a t h e o r e t i c a l e x p l a n a t i o n of e m p i r i c a l f i n d i n g s . The f i r s t problem co n s i d e r e d was to obt a i n an optimal design f o r the memory h i e r a r c h y of a multiprogramming system. To our knowledge, t h i s i s the f i r s t work that c o n s i d e r s e x p l i c i t queueing at some of the memory l e v e l s . A model f o r e s t i m a t i n g the optimal c a p a c i t i e s and speeds of the memory h i e r a r c h y has been developed. I t was assumed that the technology c o s t f u n c t i o n and the h i t - r a t i o 117 f u n c t i o n can be represented by power f u n c t i o n s . T h i s appears to be a rough but reasonable approximation. The q u a n t i t i e s to be optimized (mean response time i n t h i s case) are expressed as a f u n c t i o n of the d e s i r e d parameters through the use of . queueing models. O p t i m i z a t i o n techniques are then a p p l i e d to d e r i v e the optimal values f o r the design parameters. Using the assumptions s t a t e d i n Chapter 1, the design problem i s shown to have a : g l o b a l optimal s o l u t i o n . I t was observed that there i s a c o n s i d e r a b l e d i f f e r e n c e between the optimal design of uniprogrammed and multiprogrammed systems. Th e r e f o r e , the r e s u l t s obtained f o r a uniprogrammed system cannot be adequately used i n the multiprogrammed case. The e m p i r i c a l o b s e r v a t i o n that there i s a constant r a t i o between the optimal s i z e s of d i f f e r e n t l e v e l s of memory was mathematically v e r i f i e d . I t was a l s o i n f e r r e d that once a system has achieved the optimal memory s i z e , any e x t r a budget should be used i n the a c q u i s i t i o n of f a s t e r r a t h e r than more memory. The technique presented here can be extended and a p p l i e d to s e v e r a l other r e l a t e d problems. A n a t u r a l e x t e n s i o n i s to i n c l u d e the CPU cost and speed in the design problem. One can a l s o attempt to f i n d the optimal amount of i n f o r m a t i o n (e.g., page s i z e ) that should be moved from one l e v e l to another, upon each request. A s i m i l a r problem \u00E2\u0080\u00A2 has been attempted by T r i v e d i 118 et a l . [TrWS80]. By c o n s i d e r i n g a c l o s e d queueing network model and by maximizing the throughput r a t e they f i n d the optimal CPU speed, device c a p a c i t i e s and a l l o c a t i o n of f i l e s among v a r i o u s secondary d e v i c e s . Most of the previous work in optimal system design assumes i d e n t i c a l j o b s . T h i s strong assumption needs to be r e l a x e d . The second problem c o n s i d e r e d was to c o n t r o l the flow of jobs through a system. Two c o n t r o l schemes, SELF and MULTI-SELF, were developed. U n l i k e most other work, the schemes are based on mathematical modelling and thus t h e i r o p t i m a l i t y can be proven. Furthermore, by the use of o p e r a t i o n a l a n a l y s i s , the assumptions used i n the model fo r m u l a t i o n are minimized and a l l the r e q u i r e d parameters are observable. B a s i c a l l y , these schemes c o n s i s t of two s t e p s . In the f i r s t step, the s a t u r a t i o n p o i n t of the multiprogramming subsystem i s estimated. Next, the optimal r a t i o of the number of jobs from each c l a s s t hat should be maintained i n the multiprogramming subsystem i s computed. The o b j e c t i v e i n t h i s o p t i m i z a t i o n problem i s to minimize a weighted sum of the w a i t i n g times of jobs i n the system without s a t u r a t i n g the system. In the development of SELF i t was assumed that a l l the jobs have i d e n t i c a l resource demands. However, the scheme 119 i s adaptive and i s capable of modifying i t s e l f when the workload v a r i e s . If there i s a sudden burst of jobs during an o b s e r v a t i o n i n t e r v a l and the subsystem gets s a t u r a t e d , then d u r i n g the subsequent o b s e r v a t i o n i n t e r v a l s , more jobs are prevented from e n t e r i n g the subsystem u n t i l the system comes out of s a t u r a t i o n . On the other hand, i f the c a l c u l a t e d s a t u r a t i o n p o i n t i s smaller than the a c t u a l s a t u r a t i o n p o i n t , more jobs are allowed to begin execution in the subsequent i n t e r v a l . The i d e n t i c a l job assumption of SELF i s r e l a x e d by using mean-value a n a l y s i s of m u l t i - c l a s s systems. The m u l t i - c l a s s scheme MULTI-SELF can handle any p r a c t i c a l number of d i f f e r e n t c l a s s e s of jobs. Although an improvement was observed over SELF when using MULTI-SELF, the overhead i n v o l v e d i n the implementation of MULTI-SELF are a l s o higher than that of SELF. In the development of MULTI-SELF i t was assumed that once a job a r r i v e s at the system i t i s p o s s i b l e to determine the c l a s s to which i t belongs. I t i s a l s o assumed that the jobs do not change t h e i r c l a s s d u r i n g t h e i r stay i n the system. An extension to t h i s work would be to r e l a x these assumptions. In order to compare the performance of SELF and MULTI-SELF with some of the major c o n t r o l schemes a general purpose sim u l a t o r of a c e n t r a l s e r v e r model was developed and d i f f e r e n t c o n t r o l schemes were implemented on i t . The 1 20 workload that d r i v e s the s i m u l a t o r d i d not make most of the assumptions r e q u i r e d f o r the development of SELF and MULTI-SELF. I t was found that the two schemes perform b e t t e r than the e x i s t i n g ones over a v a r i e t y of workloads. The s u p e r i o r i t y of the two schemes over the other schemes i n c r e a s e s as t h e . v a r i a t i o n i n the workload i n c r e a s e s . T h i s shows that the two schemes are more adaptable to changes i n the workload. Furthermore, . to improve the accuracy, an e x p o n e n t i a l smoothing technique i s used i n the e s t i m a t i o n of system and job parameters r e q u i r e d by the two schemes. F i n a l l y , we would l i k e to implement these schemes on an a c t u a l *system and v e r i f y t h e i r performance using r e a l workloads. 121 REFERENCES [ArGa73] Arora , S. R. and A. G a l l o , \"Optimization of S ta t ic Loading and S iz ing of M u l t i l e v e l Memory System\". JACM, V o l . 20, No. 2, A p r i l 1973, (307-319). [BaLe78] Bedel , M., and J . Leroudier , \"Adaptive Multiprogramming Systems Can E x i s t \" . in Performance of Computer I n s t a l l a t i o n s , D. F e r r a r i (ed.)., North-Hol land Publ . C o . , 1978(115-135). [BGLP75] Badel , M. , E. Gelenebe, J . Leroudier , and D. P o t i e r , \"Adaptive Optimization of a Time-Sharing System's Performance\". Proc. IEEE, V o l . 63, No. 6, June 1975(958-965). [Baer74] Baer, J . L . , \"On Program Placement in a D i r e c t l y Executable Hierarchy of Memories\". IEEE Trans. Computers, V o l . C-23, No. 8, Aug. 1974(838-849). [Bard79] Bard, Y . , \"Some Extensions to M u l t i c l a s s Queuing a n a l y s i s \" . in Performance of Computer Systems, M. Arato , A. Butrimenko, and E. Gelenbe T e d . ) , North-Holland Publ . C o . , 1979(51-62). [BCMP75] Baskett , F . , K. M. Chandy, R. R. Muntz, and F. G. P a l a c o i s , \"Open Closed and Mixed Networks of Queues with Di f ferent Classes of Customers\". JACM, V o l . 22, No. 2, A p r i l 1975(248-260). [BeKu69] Belady, L. A . , and C. J . Kuehner, \"Dynamic Space Sharing in Computer System\". Comm. ACM, V o l . 12, No. 5, May 1969(282-288). [BeBW72] B e l l , T. E . , B. W. Boehm, and R. A. Watson, \"Framework and I n i t i a l Phase for Computer Performance Improvement\". FJCC, 1972(1141-1154). [BoJe76] Box, G. E . P . , and G. M. Jenkins , Time-Series A n a l y s i s : Forcast inq and Cont ro l . Holden-Day, San F r a n c i s c o , 1976. [BrBC77] Brown, R. M., J . C. Browne, and K. M. Chandy, \"Memory Management and Response Time\". Comm. ACM, V o l . 20, No. 3, March 1977(154-165). 122 [BCBK75] Browne, J . C , K. M. Chandy, R. M. Brown, T. N. K e l l e r , D. F. Twosley, and C. W. D i s s l y , \" H i e r a r c h i c a l Techniques f o r the Development of R e a l i s t i c Models of Complex Computer Systems\". Proc. IEEE, V o l . 63, No. 6,' June 1975(966-975). [Buze7l] Buzen, J . P., \" A n a l y s i s of System B o t t l e n e c k s Using a Queuing Network Model\". Proc. ACM-SIGOPS Workshop on System Performance E v a l u a t i o n , A p r i l 1971(82-103). [Buze73] Buzen, J . P., \"Computational Algorithms f o r Closed Queuing Netwirks with E x p o n e n t i a l S e r v e r s \" . Comm. ACM, V o l . 16, No. 9, Sep. 1973(527-531). [BuSh74] Buzen, J . P. and A. W. C. Shum, \"Optimal Load Balancing i n Memory H i e r a r c h i e s \" . Proc. E i g h t Ann. P r i n c e t o n Conf. on Information S c i . March 1974(335-339). [Buze76] Buzen, J . P., \" O p e r a t i o n a l A n a l y s i s : The Key to the New Generation of Performance P r e d i c t i o n T o o l \" . Proc. IEEE COMPCON, N.Y., 1976(166-171). [BuPo77] Buzen, J . P., and D. P o t i e r , \"Accuracy of E x p o n e n t i a l Assumption i n Closed Queuing Models\". Proc. S i g m e t r i c s Conf. On Computer Per f . , 1977(53-64). [Buze78a] Buzen, J . P., \"A Queuing Network Model of MVS\". Computing Surveys, V o l . 10, No. 3, Sep. 1978(319-331). [Buze78b] Buzen, J . P., \" O p e r a t i o n a l A n a l y s i s : An A l t e r n a t e to S t o c h a s t i c Modeling and P r e d i c t i o n \" . Proc. I n t . Conf. P e r f . Comp. I n s t a l l a t i o n , North-Holland Publ. Co., 1978(175-194). [BuDe80] Buzen, J . P., and P. J . Denning, \" O p e r a t i o n a l Treatment of Queue D i s t r i b u t i o n s and Mean-value A n a l y s i s \" . Computer Performance, Vol.1, no.1, June 198.0(6-15). [ChHW75a] Chandy. K. M., U. Herzog, and L. Woo, \"Approximate A n a l y s i s of General Queuing Networks\". IBM J . Res. And Develop. Jan. 1975(43-49). [ChHW75b] Chandy. K. M., U. Herzog, and L. Woo, \"Parametric A n a l y s i s of Queuing Networks\". IBM J . Res. And Develop. Jan. 1975(36-42). 123 [ChHT77] Chandy, K. M. , J . H. Howard J r . , and D. F. Towsley, \"Product Form and L o c a l Balance i n Queuing Networks\". JACM, V o l . 24, No. 2, A p r i l 1977(250-263). [ChHS77] Chandy, K. M., J . Hogarth, and C. H. Sauer, \" S e l e c t i n g C a p a c i t i e s in Queuing Network Models of Computer/Communication Systems\". IEEE Trans. Software Engg., V o l . SE-3, No. 4, J u l y 1977(290-295). [ChSa78] Chandy, K. M., and C. H. Sauer, \"Approximate Method f o r A n a l y s i n g Queuing Network Models of Computing Systems\". Computing Surveys, V o l . 10, No. 3, Sep. 1978(281-317). [Chan79] Chanson, S. T., \" S a t u r a t i o n E s t i m a t i o n i n I n t e r a c t i v e Computer Systems\". Dept. of Comp. S c i . , Univ. of B r i t i s h Columbia, TR 79-7, 1979. [ChTr79] P r i v a t e Communication with S. T. Chanson and K. S. T r i v e d i , Jan 1979. [ChSi80a] Chanson, S. T. and P. S. Sinha, \" O p t i m i z a t i o n of Memory H i e r a r c h i e s i n Multiprogrammed Computer Systems with F i x e d Cost C o n s t r a i n t \" . IEEE Trans. Computers J u l y 1980(611-618). [ChSi80b] Chanson, S. T., and P. S. Sinha, \"Optimal Load C o n t r o l i n Combined B a t c h - I n t e r a c t i v e Computer System\". Proc. Of 16th Conf. Of CPEUG, F l o r i d a , October 1980(207-213). [Chow74] Chow, C.K., \"On O p t i m i z a t i o n of Storage H i e r a r c h i e s \" . IBM J . Res. Devolop., V o l . 18, May 1974(194-203). [Chow76] Chow, C. K., \"Determination of Cache's C a p a c i t i e s and I t s Matching Storage H i e r a r c h y \" . IEEE Trans. Computers, V o l . C-25, No. 2, Feb. 1976(157-164). [CoDe73] Coffman, E. G., J r . , and P. J . Denning, Operating System Theory. P r e n t i c e - H a l l , Englewood C l i f f s , N.J.,1973. [Cour77] C o u r t o i s , P. J . , D e c o m p o s i b i l i t y - Queuing and Computer System A p p l i c a t i o n s . Academic Press, 1977. [Denn68a] Denning, P. J . , \"Thrashing: I t s Causes and Pr e v e n t i o n \" . Proc. AFIPS, 33, (FJCC) 1968(915-922). [Denn68b] Denning, P. J . , \"The Working Set Model f o r Program Behaviour\". Comm. ACM, V o l . 15, No. 5, May 1968(323-333). 124 [Denn70] Denning, P. j . , \" V i r t u a l . Memory\". Computing Surveys, V o l . 2, No. 3, Sept. 1970(153-189). [Denn7l] Denning, p. J . , \" T h i r d Generation Computer System\". Computing Surveys, V o l . 3 No. 4, Dec. 1971(175-216). [DeGr75] Denning, P. J . , and G. S. Graham, \"Multiprogramming Memory Management\", Proc. IEEE, V o l . 63, No. 6, June 1975(924-939). [DKLP76] Denning, P. J.,\"K. C. Khan, J . L e r o u d i e r , D. P o t i e r , and R. S u r i , \"Optimal Multiprogramming\". Acta I n f o r m a t i c a , 7,1976(197-216). [DeKh76] Denning, P. J . , and.K. C. Khan, \"An L=S C r i t e r i o n for Optimal Multiprogramming\". Proc. I n t ' l . Symp. on Computer Performance Modeling, Measurement and E v a l u a t i o n , March 1976(219-229). [DeBu78] Denning, P. J . , and J . P. Buzen, \"The O p e r a t i o n a l A n a l y s i s of Queuing Network Models\". Computing Surveys, V o l . 10, No. 3, Sep. 1978(225-261). [DuPZ67] Duff i n , R. J . , E. L. Peterson and C. Zener, Geometric Programming. John Weley and Sons Inc., N.Y., 1967. [Ferr78] F e r r a r i , D. , Computer Systems Performance E v a l u a t i o n . P r e n t i c e - H a l l , 1978. [FoBr76] F o s t e r , D., and J . C. Browne, \" F i l e Assignment i n Memory H i e r a r c h i e s \" . Proc. M o d e l l i n g and P e r f . E v a l . Of Computer Systems, North-Holland, Oct. 1976(119-127). [Grav67] Gaver, D. P., \" P r o b a b i l i t y Models f o r Multiprogramming Computer System\". JACM, V o l . 14,No. 3, J u l y 1967(423-438). [Gece74] G e c s e i , J . , \"Determining H i t R a t i o f o r M u l t i l e v e l H i e r a r c h i e s \" . IBM J . Res. and Dev., V o l . 1 8 , No. 4, J u l y 1974(316-327). [GeLu74] G e c s e i , J . , and T. A. Lukes, \"A Model f o r the E v a l u a t i o n of Storage H i e r a r c h i e s \" . IBM Sys. J . , V o l . 13, No. 2, 1974(163-178). [GeMu76] Gelenbe, E., and R. R. Muntz, \" P r o b a b i l i s t i c Models of Computer Systems \u00E2\u0080\u0094 P a r t - I (Exact R e s u l t s ) \" . Acta I n f o r m a t i c a 7, 1976(35-60). 125 [GoNe67] [GrDe77] [Hadl72] [HiMT79] [Jack63] [Kend76] [ K l e i 6 8 ] [Land76] [Lazo77] [LePo76] [LeSc71] [LiMa72] Gorden, W. J . , and G. F. Newell, \"Closed Queuing Systems With E x p o n e n t i a l S e r v e r s \" . Oper. Res., 15, 1967(254-265). Graham, G. S., and P. J . Denning, \"On the R e l a t i v e C o n t r o l l a b i l i t y of Memory P o l i c i e s \" . Proc. I n t ' l . Symp. on Computer Performance Modeling, Measurement and E v a l u a t i o n , Aug. 1977(411-428). Hadley, G., Nonlinear and Dynamic Programming. Addison-Wesley Pub. Company, 1972. Hine, J . H., I. M i t r a n i , and S. Tsur, \"The C o n t r o l of Response Time i n M u l t i - c l a s s Systems by Memory A l l o c a t i o n \" . Comm. J u l y 1979(415-424). ACM, V o l . 22, Jackson, J . R., \"Job Shop-like Queuing Management Science, V o l . 10, No, Oct. 1963(131-142). No. 7, System\" 1 , K e n d a l l , M., Time-Series. G r i f f i n , London, 1976. K l e i n r o c k , L., \" C e r t a i n A n a l y t i c R e s u l t s for Time-shared P r o c e s s o r s \" . I n f o . P r o c e s s i n g , Proc. IFIP Congress 1968(838-845). Landwehr, C. E., \"An Endogenous P r i o r i t y Model f o r Load C o n t r o l i n a Combined B a t c h - I n t e r a c t i v e Computer System\". Proc. I n t ' l . Symp. on Computer Performance Modeling, Measurement and E v a l u a t i o n , March. 1976(282-287). Lazowska, E. W., \"The Use of P e r c e n t i l e s i n Modeling CPU S e r v i c e Time D i s t r i b u t i o n \" . i n Computer Performance, K. M. Chandy and M. R e i s e r - (ed.), E l s e v i e r North-Holland Inc., new York, 1977(53-66). L e r o u d i e r , J . , and D. P o t i e r , \" P r i n c i p l e s of O p t i m a l i t y f o r Multiprogramming\". Proc. I n t ' l . Symp. on Computer Performance Modeling, Measurement and E v a l u a t i o n , March 1976(211-218). Lewis, P. A. W., and G. S. Schedler, \"A Cyclic-Queue Model of System Overhead i n Multiprogrammed Computer systems\". JACM, V o l . 18, 1971(199-220). L i n , Y. S., and R. L. Mattson, \"Cost-Performance E v a l u a t i o n of Memory H i e r a r c h i e s \" . IEEE Trans. Magnetics, V o l . MAG-8, No. 3, Sep 1972(390-392). 126 [LiCh77] L i p s k y , L., and J . D. Church, \" A p p l i c a t i o n of Queuing Network Model f o r a Computer System\". ACM Computing Surveys, V o l . 9 No. 3, Sep. 1977(205-222). [ L i t t 6 l ] L i t t l e , J . D. C , \"Proof f o r the Queuing Formula L=X.W\". Operations Research, 9, 3, May 1961(383-387). [Lo80] Lo, R., \"The A p p l i c a t i o n of Optimal S t o c h a s t i c C o n t r o l Theory to Adaptive Performance C o n t r o l of Computer Systems\". M.Sc. T h e s i s , Dept. of Comp. S c i . , Univ. of B r i t i s h Columbia, 1980. [MaSi75] MacDonald, J.E., and K.L. Sigworth, \"Storage H i e r a r c y O p t i m i z a t i o n Procedure\". IBM J . Res. Develop., V o l 19, March 1975(133-140).. [Matt7l] Mattson, R.L.',; \" E v a l u a t i o n of M u l t i l e v e l Memories\". IEEE Trans. Magnetics, V o l . MAG-7, December 1971(814-819). [Pugh7l] Pugh, E. W., \"Storage H i e r a r c h i e s : Gaps, C l i f f s , and Trends\". IEEE Trans. Magnetics, V o l . MAG-7, No. 4, Dec. 1971(810-814). [RaCh70] Ramamoorthy,C.V., and K.M. Chandy, \" O p t i m i z a t i o n of Memory Hi e r a r c h y in Multiprogramming System\". JACM, Vol.17, J u l y 1970(426-445). [Rege76] Rege, S. L., \"Cost, Performance and S i z e Trade-Off f o r D i f f e r e n t L e v e l s i n Memory H i e r a r c h y \" . Computer, V o l . 9, No. 4, A p r i l 1976(43-51)-. [Reis76] R e i s e r , M., \" I n t e r a c t i v e Modeling of Computer System\". IBM Sys. J . , V o l . 15, No. 4, 1976(309-327). [ReKo76] R e i s e r , M., and H. . Kobayashi, \"On the Convolu t i o n A l g o r i t h m f o r Separable Queuing Networks\". Proc. I n t ' l . Symp. Comp. Per. M o d e l l i n g Measurement and E v a l u a t i o n , 1976(109-117). [Reis79] R e i s e r , M., \"Mean Value A n a l y s i s of Queuing Network, A New Look at an Old Problem\". i n Performance of Computer System, M. Arato, A. Butrimenko, and E. Gelenbe (ed.), North-Holland Publ. Co., 1979(63-77). [ReLa80] R e i s e r , M., and S. S. Lavenberg, \"Mean-Value A n a l y s i s of Closed M u l t i c l a s s Queuing Network\". JACM, V o l . 27, No2, A p r i l 1980(313-322). [Rode79] Rode, J . D., \" M u l t i c l a s s O p e r a t i o n a l A n a l y s i s of Queuing Network\", i n Performance of Computer Systems, M. Arato, A. Butrimenko, and E. Gelenbe (ed.), North-Holland Publ. Co., 1979(339-352). 1 27 [RoDu73] Rodriguez-Rosel, J . , and J . P. Dupuy, \"The Design, Implementation, and E v a l u a t i o n of Working Set Di s p a t c h e r \" . Comm. ACM, V o l . 16, No. 4, A p r i l 1973(247-253). [Scho80] Schonbach, A., \"Macro-Scheduler f o r High P r o d u c t i v i t y \" . T h e s i s , Dept. of Comp. S c i . , Univ. of Toronto, 1980. [Spos75] S p o s i t o , V. A., L i n e a r and Nonlinear Programming. Iowa: Iowa State Univ. Press/ames, 1975. [TrMa?1] T r a i g e r , I. L., and R. L. Mattson, \"The E v a l u a t i o n and S e l e c t i o n of Technologies f o r Computer Storage System\". . AIP Conference Proceedings, No. 5, Part I, Magnetism and Magnetic M a t e r i a l s , 1971. [TrSe77] T r i p a t h i , S. K., and K. C. Sevick, \"The I n f l u e n c e of Multiprogramming L i m i t on I n t e r a c t i v e Response Time i n a V i r t u a l Memory System\". Proc. S i g m e t r i c s Conf. On Computer Performance 1977(121-130). [ T r i v 7 8 ] T r i v e d i , K. S., \" A n a l y t i c Modeling of Computer Systems\". Computer, Oct. 1978(38-56). [TrWa80] T r i v e d i , K. S., and R. A. Wagner, \"A D e c i s i o n Model f o r Closed Queuing Networks\". IEEE Trans. Software Eng. SE-5, 4, J u l y 1979(328-332). [TrWSSO] T r i v e d i , K. S., R. A. Wagner, and T. M. Sigmon, \"Optimal S e l e c t i o n of CPU Speed, Device C a p a c i t i e s , and F i l e Assignments\". JACM Vol.27, No.3, J u l y 1980(457-473). [Welc78] Welch, T.A., \"Memory Hi e r a r c h y C o n f i g u r a t i o n A n a l y s i s \" . IEEE Trans. on Computers, V o l . C-27, No. 3, May 1978(408-413). [ W i l l 7 3 ] W i l l i a m s , J . G., \"Asymetric Memory H i e r a r c h i e s \" . CACM, V o l . 16, No. 4, A p r i l 1973(213-222). 128 Appendix A SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 3.0 1 .25 DELTA ARR. RATE*(/SEC) 1 .50 1 .00 I/O REQU. RATE(/SEC.) 250 99 DELTA I/O REQU. RATE** 1 00 20 EXIT PROBABILITY 0.90 0.90 DELTA EXIT PROB.*** 0.10 0.10 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 100 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 120 1 20 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL \u00E2\u0080\u00A23 . 3 LENGTH (SEC) DELTA ARR. PERIOD\"*( SEC ) 12 12 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.1 WORKLOAD FOR TABLE 5.1 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 5.0 3.00 DELTA ARR. RATE*(/SEC) 2.50 1 .50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 1 00 20 EXIT PROBABILITY 0.90 0.90 DELTA EXIT PROB.*** 0.10 0.10 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 100 1 00 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 80 80 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD^ SEC) 1 2 1 2 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.2 WORKLOAD FOR TABLE 5.2 * V a r i a t i o n i n a r r i v a l r a t e . ** V a r i a t i o n i n I/O request r a t e . *** V a r i a t i o n i n e x i t p r o b a b i l i t y . + P e r i o d of v a r i a t i o n i n e x i t p r o b a b i l i t y . ++ P e r i o d of v a r i a t i o n i n job c h a r a t e r i s t i c s . SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 3.5 1 .50 DELTA ARR. RATE*(/SEC) 0.50 0.00 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 50 10 EXIT PROBABILITY 0.90 0.90 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 100 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 1 20 120 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD*( SEC ) 12 . 12 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.3 WORKLOAD FOR TABLE 5.3 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 4.5 2.50 DELTA ARR. RATE*(/SEC) 1 .50 0.50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 50 10 EXIT PROBABILITY 0.87 0.87 DELTA EXIT PROB.*** 0.05 0.05 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 1 00 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 120 1 20 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD** SEC) 12 12 DELTA CHARARATERISTIC 3 3 PERIOD++ (SEC) TABLE A.4 WORKLOAD FOR TABLE 5.4 130 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 4.00 2.00 DELTA ARR. RATE*(/SEC) 1 .50 0.50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 50 10 EXIT PROBABILITY 0. .87 0.87 DELTA EXIT PROB.*** 0.05 0.05 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 100 1 00 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 120 120 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 . LENGTH (SEC) DELTA ARR. PERIOD^ SEC) 12 . 1 2 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.5 WORKLOAD FOR TABLE 5.5 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 5.0 2.5 DELTA ARR. RATE*(/SEC) 2.50 1 .50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 1 00 20 EXIT PROBABILITY 0.90 0.90 DELTA EXIT PROB.*** 0.10 0.10 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 1 00 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 1 20 1 20 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD^SEC) 1 2 1 2 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.6 WORKLOAD FOR TABLE 5.6 131 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 5.0 3.50 DELTA ARR. RATE*(/SEC) 2.50 1 .50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 1 00 20 EXIT PROBABILITY 0.90 0.90 DELTA EXIT PROB.*** 0.10 0.10 . I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE (/SEC) 1 00 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 120 120. MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD^ SEC) 1 2 12 DELTA CHARACTERISTIC 3 3 PERIOD++ (SEC) TABLE A.7 WORKLOAD FOR TABLE 5.7 SYSTEM / JOB CHARACTERISTICS TERMINAL BATCH ARR. RATE(/SEC.) 5.0 2.50 DELTA ARR. RATE*(/SEC) 2.50 1.50 I/O REQU. RATE(/SEC.) 250 50 DELTA I/O REQU. RATE** 1 00 20 EXIT PROBABILITY 0.90 0.90 DELTA EXIT PROB.*** 0.10 0.10 I/O SERVICE RATE(/SEC) 30 30 PAGE SERV. RATE(/SEC) 1 00 100 QUANTUM LENGTH(SEC) 0.01 0.01 B (EQU 5.1) 0.01 0.01 C (EQU 5.1) 1 20 1 20 MAIN MEMORY (PAGES) 500 500 OBSERVATION INTERVAL 3 3 LENGTH (SEC) DELTA ARR. PERIOD\"**SEC) 1 2 1 2 DELTA CHARACTERISTIC 3 3 TABLE A.8 WORKLOAD FOR TABLE 5.8 SYSTEM / JOB CHARACTERISTICS CLASS 1 CLASS 2 CLASS 3. CLASS 4 ARR. RATE(/SEC.) 2.75 2.25 1 .75 1 .25 DELTA ARR. RATE*(/SEC) 0.50 0.50 .0.50 0.50 I/O REQU. RATE(/SEC.) 250 200 75 50 DELTA I/O REQU. RATE** \u00E2\u0080\u00A2'\u00E2\u0080\u00A2 .50 50 10 10 EXIT PROBABILITY 0.78 0.80 0.90 0.92 DELTA EXIT PROB.** * , 0.05 . 0.05 0.05 0.05 I/O SERVICE RATE.(/SEC) 30 30 30 30 PAGE SERV. RATE(/SEC) 100 100 100 100 QUANTUM LENGTH(SEC) 0.01 0.01 0.01 0.01 B (EQU 5.1) 0.01 0.0 1 0.01 0.01 C (EQU 5.1) 120 120 120 1 20 MAIN MEMORY (PAGES) 500 500 500 500 OBSERVATION INTERVAL 3 . 3. / .3 3 .... LENGTH (SEC) DELTA ARR. PERIOD+(SEC) 12 1 2 12 12 DELTA CHARACTERISTIC \u00E2\u0080\u00A2 3 . 3 3 \u00E2\u0080\u00A2 PERIOD++ (SEC) TABLE A.9 WORKLOAD FOR TABLE 5.9 AND TABLE 5.12 "@en . "Thesis/Dissertation"@en . "10.14288/1.0051833"@en . "eng"@en . "Computer Science"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en . "Graduate"@en . "Optimization techniques in computer system design and load control"@en . "Text"@en . "http://hdl.handle.net/2429/22966"@en .