UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A new approach to program restructuring and clustering Yap, Tuan-Bin 1983

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1983_A6_7 Y36.pdf [ 2.95MB ]
Metadata
JSON: 831-1.0051842.json
JSON-LD: 831-1.0051842-ld.json
RDF/XML (Pretty): 831-1.0051842-rdf.xml
RDF/JSON: 831-1.0051842-rdf.json
Turtle: 831-1.0051842-turtle.txt
N-Triples: 831-1.0051842-rdf-ntriples.txt
Original Record: 831-1.0051842-source.json
Full Text
831-1.0051842-fulltext.txt
Citation
831-1.0051842.ris

Full Text

A NEW APPROACH TO PROGRAM RESTRUCTURING AND CLUSTERING by TUAN-BIN £YAP B . S c , U n i v e r s i t y Of Sydney, 1979 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE i n THE FACULTY OF GRADUATE STUDIES Department Of Computer S c i e n c e We a c c e p t t h i s t h e s i s as c o n f o r m i n g t o the r e q u i r e d s t a n d a r d THE UNIVERSITY OF BRITISH COLUMBIA October 1983 © Tuan-Bin Yap, 1983 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make i t freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It i s understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of Comy^d-ex ^CftZnC-C The University of British Columbia 2075 Wesbrook Place Vancouver, Canada V6T 1W5 Date (Jobber \ f c? 3 DE-6 (2/79) I I ABSTRACT A new program restructuring algorithm aimed at reducing the working set size of a program executing in a working set environment i s developed. The algorithm makes use of the concept of l o c a l i t y as defined in the Bounded Loc a l i t y Interval (BLI) program behaviour model to discern program referencing patterns. The basic p r i n c i p l e of t h i s as well as a l l other restructuring algorithms is to put relocatable blocks having a prominent referencing pattern in the same v i r t u a l pages. Simulation experiments were conducted to evaluate the performance of the new scheme r e l a t i v e to the other existing algorithms. The algorithm was also evaluated on a real system which uses a clock page replacement strategy. A new c l u s t e r i n g scheme used in the restructuring procedure is also developed. The new technique decomposes the c l u s t e r i n g problem into subproblems which can then be solved i n d i v i d u a l l y . The c l a s s i c a l h i e r a r c h i c a l c l u s t e r i n g algorithm y i e l d s a time complexity of n 3 where n is the number of d i s t i n c t blocks in the program to be restructured. The decomposition approach reduces the time-complexity of the algorithm to n 2. Several other p r a c t i c a l aspects of program restructuring are also discussed in the thesis. I l l TABLE OF CONTENTS 1. I n t r o d u c t i o n 1 1.1 R e s t r u c t u r i n g procedure 1 1.2 C l u s t e r i n g phase 3 1.3 P r e v i o u s e x p e r i m e n t a l r e s u l t s 4 1.4 O b j e c t i v e s of the t h e s i s 5 2. A s u r v e y of r e s t r u c t u r i n g t e c h n i q u e s 6 2.1 Nearness method 10 2.2 C r i t i c a l - s e t a l g o r i t h m s 11 2.3 M i n i m a l - s e t a l g o r i t h m s 11 3. B L I - b a s e d r e s t r u c t u r i n g a l g o r i t h m s 13 3.1 ASI a l g o r i t h m 14 3.2 BLI a l g o r i t h m 15 4. C l u s t e r i n g a l g o r i t h m s 17 4.1 H i e r a r c h i c a l c l a s s i f i c a t i o n method ....18 4.2 A d e c o m p o s i t i o n approach t o c l u s t e r i n g problem 21 4.2.1 Method A 21 4.2.2 G r a v i t a t i o n a l c l u s t e r i n g scheme 23 4.2.3 G r a v i t a t i o n a l - h i e r a r c h i c a l c l a s s i f i c a t i o n method 24 4.2.4 A n a l y s i s of time c o m p l e x i t y 26 5. D e s c r i p t i o n of experiment 28 5.1 Parameters 28 .5.1.1 Performance i n d i c e s 28 5.1.2 F a c t o r s 29 5.2 P r o c e d u r e of experiment 30 5.3 R e s t r u c t u r i n g and measurement t o o l s 32 5.3.1 Data c o l l e c t i o n .32 5.3.2 R e s t r u c t u r i n g a l g o r i t h m s 33 5.3.3 C l u s t e r i n g a l g o r i t h m s 33 5.3.4 Memory management p o l i c y s i m u l a t o r s 34 5.3.5 L i n k a g e e d i t o r 34 5.4 D e s i g n and i m p l e m e n t a t i o n 35 5.4.1 S e l e c t i o n of l e v e l s f o r experiment 35 5.4.2 Implementation d e t a i l s 37 6. D i s c u s s i o n of e x p e r i m e n t a l r e s u l t s 43 6.1 E v a l u a t i o n of BLI a l g o r i t h m 43 6.1.1 D i s c u s s i o n of s i m u l a t i o n r e s u l t s 43 6.1.2 D i s c u s s i o n of e m p i r i c a l r e s u l t s 45 6.1.3 Cost 47 6.2 E v a l u a t i o n of GHC c l u s t e r i n g scheme 47 6.3 C o n c l u s i o n s 48 REFERENCES K C IV LIST OF TABLES AND FIGURES Table 5.1 S e l e c t i o n of l e v e l s f o r the r e s t r u c t u r i n g experiment 42 Table 5.2 S e l e c t i o n of l e v e l s f o r the c l u s t e r i n g experiment 42 Tab l e 6.1a Comparison of the r e s t r u c t u r i n g a l g o r i t h m s on the number of page f a u l t s g e n e r a t e d i n a working s e t environment ...52 Tab l e 6.1b Comparison of the r e s t r u c t u r i n g a l g o r i t h m s on average working s e t s i z e i n a w o r k i n g s e t environment 52 Table 6.2a Comparison of the c l u s t e r i n g a l g o r i t h m s on the CPU p r o c e s s i n g time ( i n msec.) d u r i n g the c l u s t e r i n g phase 53 Table 6.2b Comparison of the c l u s t e r i n g a l g o r i t h m s on the CPU p r o c e s s i n g time ( i n msec.) d u r i n g the r e s t r u c t u r i n g phase 53 T a b l e 6.2c Comparison of the c l u s t e r i n g a l g o r i t h m s on the CPU p r o c e s s i n g time ( i n msec.) d u r i n g the r e s t r u c t u r i n g and c l u s t e r i n g phases 53 Table 6.3 Comparison of the c l u s t e r i n g a l g o r i t h m s on the number of page f a u l t s g e n e r a t e d i n a working set environment 54 Tab l e 6.4 Performance of the r e s t r u c t u r e d P a s c a l c o m p i l e r on MTS 54 F i g u r e 3.1 A h i e r a r c h y of bounded l o c a l i t y i n t e r v a l 14 F i g u r e 5.1 Layout of a P a s c a l n e s t e d r o u t i n e 41 F i g u r e 6.1 Pa g i n g c h a r a c t e r i s t i c s of r e f e r e n c e s t r i n g P3 ...55 V ACKNOWLEDGEMENTS I would like to thank Dr. S. Chanson for his valuable assistance in the research and in the writing of this thesis, and for spending generous portions of his time in discussing its progress. I am also grateful to John Stevens for many helpful discussions and for his time in reading the thesis. 1 Chapter J _ I n t r o d u c t i o n Reducing the page f a u l t r a t e t o improve the performance of paged v i r t u a l memory systems can be a c h i e v e d i n s e v e r a l ways. The s i m p l e s t way i s t o i n c r e a s e the s i z e of main memory. However, computer a r c h i t e c t u r e l i m i t s the maximum memory s i z e a system may assume. Fur t h e r m o r e , the ever growing demand of main memory has c o n s t a n t l y outpaced memory i n c r e a s e s i n new machines. Thus r e s e a r c h e r s have worked on o t h e r methods as w e l l . Research i n page replacement a l g o r i t h m s t o m i n i m i z e p a g i n g a c t i v i t y r e c e i v e d much a t t e n t i o n i n the l a t e s i x t i e s and e a r l y s e v e n t i e s and the c u r r e n t s t r a t e g i e s have been shown t o be n e a r - o p t i m a l . System performance can a l s o be improved by i m p r o v i n g the l o c a l i t y of the programs r u n n i n g on the system. T h i s t e c h n i q u e , known as program r e s t r u c t u r i n g , i s v i a b l e because of the p r o p e r t y of ' l o c a l i t y ' e x h i b i t e d i n the b e h a v i o u r of most programs. The b a s i c p r i n c i p l e of program r e s t r u c t u r i n g i s t o arra n g e program b l o c k s i n such a way t h a t b l o c k s r e f e r e n c i n g one another over extended p e r i o d s of time a re put i n the same v i r t u a l pages. V a r i o u s r e s t r u c t u r i n g a l g o r i t h m s have been d e v i s e d and have been shown t o reduce page f a u l t r a t e by 20-40%. In t h i s t h e s i s , a new s t r a t e g y - i n d e p e n d e n t r e s t r u c t u r i n g a l g o r i t h m i s d e s c r i b e d . As w e l l , s e v e r a l p r a c t i c a l a s p e c t s of program r e s t r u c t u r i n g i n c l u d i n g a new approach t o c l u s t e r b l o c k s t o g e t h e r a r e a l s o d i s c u s s e d . 1.1 R e s t r u c t u r i n g procedure R e s t r u c t u r i n g p r o c e d u r e u s u a l l y c o n s i s t s of the f o l l o w i n g s t e p s : 1) The program t o be r e s t r u c t u r e d i s d i v i d e d i n t o r e l o c a t a b l e b l o c k s which a r e e i t h e r c o n t i g u o u s b l o c k s of d a t a or i n s t r u c t i o n s . Address t r a c e c o l l e c t e d from e x e c u t i o n of the program i s t r a n s f o r m e d i n t o i t s c o r r e s p o n d i n g b l o c k r e f e r e n c e s t r i n g . 2) The s t r i n g i s p r o c e s s e d and the s t r e n g t h of c o n n e c t i v i t y between b l o c k s i s computed and s t o r e d i n a m a t r i x . The m a t r i x can be thought of as a graph w i t h nodes r e p r e s e n t i n g the b l o c k s and l a b e l s of edges r e p r e s e n t i n g the c o n n e c t i v i t y between b l o c k s . 3) A c l u s t e r i n g a l g o r i t h m i s a p p l i e d t o the b l o c k s based on the • . . c o n n e c t i v i t y i n f o r m a t i o n s t o r e d i n ..the. m a t r i x .with the. o b j e c t i v e of m i n i m i z i n g i n t e r - p a g e r e f e r e n c e s s u b j e c t t o page s i z e c o n s t r a i n t . The r e s u l t i s a b l o c k - t o - p a g e mapping s p e c i f y i n g w hich b l o c k s a r e t o be p l a c e d i n the same page. 4) The program i s r e s t r u c t u r e d a c c o r d i n g t o the r e s u l t i n g l a y o u t . The v a r i o u s r e s t r u c t u r i n g a l g o r i t h m s ( s t e p 2 of the p r o c e d u r e ) d i f f e r i n the way they d e f i n e and compute c o n n e c t i v i t y between b l o c k s . An example of a r e s t r u c t u r i n g a l g o r i t h m i s the nearness method [10] which i n c r e m e n t s by 1 the ( i , j ) t h e n t r y of the m a t r i x whenever a r e f e r e n c e made t o b l o c k j f o l l o w s a r e f e r e n c e t o b l o c k i . Thus the v a l u e of t h e ( i , j ) t h 3 element e x p r e s s e s the d e s i r a b i l i t y of p u t t i n g b l o c k s i and j i n t h e same page. A d e s c r i p t i o n of some major r e s t r u c t u r i n g a l g o r i t h m s i s p r e s e n t e d i n Chapter 2. 1.2 C l u s t e r i n g phase In terms of time c o m p l e x i t y , t h i s i s p r o b a b l y the most e x p e n s i v e s t e p i n the whole p r o c e d u r e of program r e s t r u c t u r i n g . There a r e s e v e r a l c l u s t e r i n g a l g o r i t h m s , none of which i s o p t i m a l s i n c e c l u s t e r i n g i s b a s i c a l l y a c o m b i n a t o r i a l problem and has been shown t o be NP hard [5,11,13]. The v a r i o u s methods d i f f e r i n t h e i r e f f i c i e n c y and time c o m p l e x i t y . Those which are s i m p l e ( e . g . , n u c l e u s c o n s t r u c t i n g [ 1 ] ) g e n e r a l l y g i v e p o o r e r r e s u l t s . The one t h a t we a r e concerned w i t h i s based on h i e r a r c h i c a l c l a s s i f i c a t i o n . The h i e r a r c h i c a l c l a s s i f i c a t i o n method i t e r a t e s the f o l l o w i n g two s t e p s u n t i l i n t e r - c l u s t e r c o n n e c t i v i t y between any two b l o c k s i s n i l . 1) F i n d two c l u s t e r s h a v i n g the s t r o n g e s t c o n n e c t i v i t y w i t h the c o n s t r a i n t t h a t they f i t i n t o one page. 2) Merge the two c l u s t e r s . Step 1 r e q u i r e s s c a n n i n g the nxn m a t r i x f o r the l a r g e s t e n t r y and t a k e s n 2 o p e r a t i o n s . Step 2 r e q u i r e s 4n o p e r a t i o n s . Thus the time c o m p l e x i t y of the c l u s t e r i n g p r o c e d u r e i s of the o r d e r n 3 (See s e c t i o n 4.1). A more e f f i c i e n t c l u s t e r i n g method has been de v e l o p e d . The 4 performance of the new method i s comparable t o t h a t of h i e r a r c h i c a l c l a s s i f i c a t i o n but i s o n l y q u a d r a t i c i n time c o m p l e x i t y . 1•3 P r e v i o u s e x p e r i m e n t a l r e s u l t s Numerous s t u d i e s have shown t h a t r e s t r u c t u r e d programs p e r f o r m b e t t e r than n o n - r e s t r u c t u r e d ones by anywhere between 20 to 40% i n terms of page f a u l t r a t e . However, almost a l l of the s e r e s u l t s were based on s i m u l a t e d r e s t r u c t u r i n g . T h i s i s because many v i r t u a l memory systems use g l o b a l L R U - l i k e page replacement a l g o r i t h m s whose replacement d e c i s i o n depends on the dynamic b e h a v i o u r of a l l a c t i v e p r o c e s s e s , not j u s t a s i n g l e p r o c e s s . Thus i t i s d i f f i c u l t t o c o n t r o l the w o r k l o a d t o o b t a i n m e a n i n g f u l comparisons i n an a c t u a l system. A s i m u l a t e d model of g l o b a l L R U - l i k e s t r a t e g i e s i n t i m e - s h a r i n g environments has been proposed [ 2 ] . In t h i s model , a parameter c a l l e d Page S u r v i v a l Index (PSI) i s used t o c h a r a c t e r i z e the i n f l u e n c e of system w o r k l o a d on a s i n g l e p r o c e s s . PSI i s d e f i n e d as the number of i n t e r r u p t i o n s an u n r e f e r e n c e d page r e s i d e n t i n the memory can s u r v i v e . The i n t e r r u p t i o n s n o r m a l l y i n c l u d e 1) g e n e r a t i o n of a page f a u l t 2) c o m p l e t i o n of an I/O o p e r a t i o n 3) t e r m i n a t i o n of a time quantum. T h i s model has been s u c c e s s f u l l y v e r i f i e d f o r the CP-67 system [ 2 ] . However, n o t h i n g c o n c l u s i v e c o u l d be s a i d about e x p e r i m e n t s performed i n a s i m u l a t e d g l o b a l LRU environment w i t h a f i x e d PSI s i n c e t h e number of i n t e r r u p t i o n s i n a r e a l system f l u c t u a t e s w i t h the system l o a d . U n l e s s one has c o n t r o l of the e n t i r e system (and hence i t s 5 w o r k l o a d ) , the e f f e c t i v e n e s s of r e s t r u c t u r i n g can o n l y be e s t a b l i s h e d s t a t i s t i c a l l y by r e p e a t i n g the exper iments a number of t imes on a r e a l sys tem. The performance of the new r e s t r u c t u r i n g a l g o r i t h m was t e s t e d on a r e a l system and the r e s u l t s are r e p o r t e d in Chapter 6. 1 .4 O b j e c t i v e s of the t h e s i s 1) To d e v e l o p a more e f f i c i e n t s t r a t e g y - i n d e p e n d e n t r e s t r u c t u r i n g a l g o r i t h m and e v a l u a t e i t s per formance . 2) To d e v e l o p a more e f f i c i e n t c l u s t e r i n g a l g o r i t h m . 3) To d e v e l o p a package which automates the four s teps of the r e s t r u c t u r i n g procedure and to t e s t the performance of the r e s t r u c t u r e d program on a r e a l system which uses a g l o b a l memory management p o l i c y . 6 Chapter 2 A survey of some r e s t r u c t u r i n q techniques There are two broad c a t e g o r i e s of r e s t r u c t u r i n g a l g o r i t h m s : 1) Strategy-independent r e s t r u c t u r i n g a l g o r i t h m s and 2 ) Strategy-dependent r e s t r u c t u r i n g a l g o r i t h m s . The nearness method of H a t f i e l d and G e r a l d [ 1 0 ] i s a c l a s s i c example of the f i r s t c a tegory. The second category of a l g o r i t h m s , as the name suggests, takes i n t o account the memory management s t r a t e g y of the system on which the program w i l l be run. T h i s c l a s s of a l g o r i t h m s , f i r s t i n t roduced by F e r r a r i , are s u p e r i o r to those of the f i r s t category because of the use of a d d i t i o n a l informat i o n . In v i r t u a l memory systems which use working s e t - l i k e s t r a t e g i e s to manage main memory, another u s e f u l performance index i s the average working set s i z e of the program. Often there i s a t r a d e - o f f between time and space but a good r e s t r u c t u r i n g s t r a t e g y w i l l reduce both the page f a u l t r a t e and the working set s i z e . A l l r e s t r u c t u r i n g a l g o r i t h m s aim at p u t t i n g blocks which are l i k e l y t o be r e f e r e n c e d together i n the same p h y s i c a l pages. We s h a l l use the notion of 'nearness' to r e f e r to such r e l a t i o n s h i p among b l o c k s . The e f f e c t i v e n e s s of an a l g o r i t h m depends on how c l o s e i t i s ab l e to capture b l o c k s which are 'near' one another. The notion of nearness i s meaningless unless i t i s understood i n the context of the memory management p o l i c y under which the r e s t r u c t u r e d program i s to be executed. For example, in the course of a program excuting i n a working set environment with window s i z e T, two b l o c k s i , j i n the same 7 window a r e s a i d t o be 'near' each o t h e r s i n c e p u t t i n g them i n the same page may a v o i d a page f a u l t . However, the same b l o c k s i and j may not be 'near' each o t h e r under a d i f f e r e n t memory management p o l i c y or even i n the same environment w i t h a d i f f e r e n t window s i z e . S i n c e the page s i z e i s f i x e d , p u t t i n g i and j i n the same page may p r e v e n t a more d e s i r a b l e c l u s t e r i n g of i w i t h a n o t h e r b l o c k k. I t s h o u l d now be c l e a r t h a t S t r a t e g y -dependent a l g o r i t h m s p e r f o r m b e t t e r than t h e o t h e r c a t e g o r y of a l g o r i t h m s s i n c e the n o t i o n of nearness p e r t a i n s t o a p a r t i c u l a r main memory management p o l i c y . I t s h o u l d a l s o be o b v i o u s t h a t no s i n g l e a l g o r i t h m i s o p t i m a l i n r e d u c i n g the page f a u l t r a t e of the program e x c u t i n g i n d i f f e r e n t e n v i r o n m e n t s . An a l g o r i t h m can o n l y be o p t i m a l w i t h r e s p e c t t o a p a r t i c u l a r program e x e c u t i n g i n a p a r t i c u l a r v i r t u a l memory system. In the absence of memory management i n f o r m a t i o n or when l i t t l e i s known about the p a g i n g b e h a v i o u r of the program, c h o o s i n g an a p p r o p r i a t e r e s t r u c t u r i n g a l g o r i t h m may p r e s e n t a problem. In such s i t u a t i o n s , s t r a t e g y - i n d e p e n d e n t a l g o r i t h m s a re more d e s i r a b l e p r o v i d e d t h e i r performance i s comparable t o those of the s t r a t e g y - d e p e n d e n t ones. Le t us now examine some c r i t e r i a t h a t have been used t o d e c i d e 'nearness'. The nearness method of H a t f i e l d and G e r a l d c o n s i d e r s two b l o c k s i and j 'near' when b l o c k i i s r e f e r e n c e d a f t e r b l o c k j or b l o c k j i m m e d i a t e l y a f t e r i , d u r i n g program e x e c u t i o n . An a l g o r i t h m based on a broader d e f i n i t i o n of 'nearness' i s the one proposed by Masuda [ 1 6 ] . The r a t i o n a l e 8 b e h i n d the broader view of d e f i n i t i o n a r e : 1) A page o f t e n c o n t a i n s more than two b l o c k s and t h a t a p r o c e s s i s u s u a l l y g i v e n more than 1 page memory. T h e r e f o r e more than two b l o c k s of the same p r o c e s s can be r e s i d e n t i n main memory a t the same t i m e . 2) The o b s e r v e d p a t t e r n of r e f e r e n c i n g i s not l i m i t e d t o two b l o c k s . The s o - c a l l e d l o c a l i t y s e t s o f t e n c o n t a i n more than two b l o c k s . The extended view of 'nearness' r e f e r s t o b l o c k s which a r e a s h o r t d i s t a n c e a p a r t . The d i s t a n c e i s d e f i n e d by a window T which c o n s i s t s of time i n t e r v a l s or number of r e f e r e n c e s . T h i s a l g o r i t h m g e n e r a l l y performs b e t t e r than the o r i g i n a l n earness method but the problem l i e s i n the c h o i c e of T. As a program may have v a r y i n g l o c a l i t y s i z e s , not a l l the l o c a l i t i e s may be c a p t u r e d i f the f i x e d window s i z e T i s too s m a l l . The e x p o n e n t i a l t i m e - c o m p l e x i t y of the a l g o r i t h m i s a n o t h e r drawback of the a l g o r i t h m . The concept of Bounded L o c a l i t y I n t e r v a l ( BLI) proposed by Madison and Batson [15] p r o v i d e s a mechanism t o determine l o c a l i t i e s i n a n a t u r a l way. The d e t a i l s of t h i s concept a r e found i n Chapter 3. B e s i d e s not needing any p a r a m e t e r s , t h i s model d e f i n e s a h i e r a r c h y of l o c a l i t i e s . The h i g h e r t h e l e v e l of l o c a l i t y , the s m a l l e r the s e t i s . At the top are l e v e l - o n e BLI's which c o r r e s p o n d t o the a c t i v i t y s e t s h a v i n g the l o n g e s t l i f e t i m e s . The l e v e l s of o t h e r B L I ' s a r e numbered a c c o r d i n g t o 9 t h e i r d i s t a n c e from the t o p B L I ' s . U s i n g l e v e l - o n e B L I ' s , a b l o c k r e f e r e n c e s t r i n g can be p a r t i t i o n e d i n t o the l a r g e s t p o s s i b l e s u b i n t e r v a l s of d i s t i n c t r e f e r e n c i n g b e h a v i o u r . An i n t e r v a l can e i t h e r be a l o c a l i t y phase or a t r a n s i t i o n phase. The t r a n s i t i o n phase r e p r e s e n t s the d i s r u p t i v e b e h a v i o u r of a program m i g r a t i n g from one l o c a l i t y t o a n o t h e r . The s e t of s t r a t e g y - i n d e p e n d e n t r e s t r u c t u r i n g a l g o r i t h m s proposed by Kob a y a s h i [14] i s based on the concept of B L I . The be s t of these a l g o r i t h m s i s the A c t i v i t y Set a l g o r i t h m (ASI) which increments the l a b e l of the edge between b l o c k s i and j by an amount e q u a l t o the l e v e l of the s m a l l e s t a c t i v i t y set t o which they b e l o n g . One drawback of t h i s a l g o r i t h m i s t h a t i t g i v e s l i t t l e weight t o b l o c k s i n a d j a c e n t i n t e r v a l s , t h a t i s , b l o c k s i n the same l o c a l i t y a r e c o n s i d e r e d 'near' but not a c r o s s the i n t e r v a l boundary. A f t e r d i s c u s s i n g the s t r e n g t h s and weaknesses of some of the r e s t r u c t u r i n g a l g o r i t h m s , we s h a l l d e f i n e some terms used i n t h i s t h e s i s . 1) The b l o c k r e f e r e n c e s t r i n g { b ,b ,..b } denotes a 1 2 r c h r o n o l o g i c a l sequence of r e f e r e n c e s g e n e r a t e d by the program where r = t o t a l number of r e f e r e n c e s i n the e x e c u t i o n of the program. Each b r e p r e s e n t s a r e l o c a t a b l e i n s t r u c t i o n or data i s e c t o r . Another c h a r a c t e r i s a t i o n of the r e f e r e n c e s t r i n g b ( t l ) . . . . b ( t i ) . . r e c o r d s e x e c u t i o n time of each b l o c k 1 i r e f e r e n c e as w e l l . C o l l e c t i n g r e f e r e n c e t i m e s i s an e x p e n s i v e o p e r a t i o n and i t i s o f t e n not done. B l o c k r e f e r e n c e s t r i n g i n 1 0 this thesis will refer to the f i r s t type. 2) R(t) denotes the set of blocks resident in the memory at time t. Thus in a working set environment,'R(t) would contain a l l block references during the time interval(t-T,t) where T is the window size. 3) A block referenced at time t is said to be c r i t i c a l if i t is not in R(t). 4) R(t) is said to be c r i t i c a l i f a block reference b at time t i e R(t). 5) C denotes the (i, j ) t h entry of the connectivity matrix. A l l entries are i n i t i a l l y set to 0 . We shall now describe some of the existing restructuring algorithms. 2 .1 Nearness method C is incremented by 1 whenever a reference to block j is made following reference to block i . Thus C records the number i j of times block i immediately follows block j in the reference string. 11 2.2 C r i t i c a l - s e t a l g o r i t h m s Examples of t h i s c l a s s of a l g o r i t h m s are C r i t i c a l - W o r k i n g -Set (CWS), C r i t i c a l - l e a s t - r e c e n t l y - u s e d (CLRU), C r i t i c a l - F i r s t -I n - F i r s t - O U T (CFIFO), C r i t i c a l - S a m p l i n g - W o r k i n g - S e t (CSWS) and C r i t i c a l - P a g e - S u r v i v a l - I n d e x ( C P S I ) . Each i s t a i l o r e d t o a s p e c i f i c memory management p o l i c y , and hence the a l g o r i t h m s d i f f e r from one another i n the d e f i n i t i o n of the r e s i d e n t s e t R ( t ) . The b a s i c p h i l o s o p h y u n d e r l y i n g t h e s e a l g o r i t h m s i s t h a t a c r i t i c a l r e f e r e n c e may be made n o n c r i t i c a l (and thus a v o i d i n g a page f a u l t ) by p u t t i n g the c r i t i c a l b l o c k i n the same page as the b l o c k s i n the c r i t i c a l s e t . In a w o r k i n g - s e t environment, where the b l o c k - t o - p a g e mapping does not a f f e c t the number of r e f e r e n c e s i n a w o r k i n g - s e t window, a n o n c r i t i c a l b l o c k r e f e r e n c e w i l l not become c r i t i c a l i n the page r e f e r e n c e s t r i n g a f t e r the mapping. In f a c t , CWS has been shown t o be o p t i m a l when a page c o n t a i n s e x a c t l y 2 b l o c k s [ 1 7 ] . The c r i t i c a l - s e t a l g o r i t h m can be b r i e f l y d e s c r i b e d as f o l l o w s . Whenever a c r i t i c a l r e f e r e n c e t o b l o c k j i s i s s u e d at time t , then a l l C i j such t h a t b l o c k i e R ( t ) are incremented by 1. 2.3 M i n i m a l - s e t a l g o r i t h m s M i n i m a l - W o r k i n g - S e t (MWS), Minimal-Sampling-Working-Set (MSWS) and M i n i m a l - p a g e - S u r v i v a l - I n d e x (MPSI) a r e examples of t h i s c l a s s of a l g o r i t h m s . T h e i r o b j e c t i v e i s t o m i n i m i z e memory occupancy of r e s t r u c t u r e d programs. MWS i s a c t u a l l y an a p p l i c a t i o n of the a l g o r i t h m proposed by Masuda. Masuda's a l g o r i t h m b e l o n g s t o the f i r s t c a t e g o r y of a l g o r i t h m s s i n c e the 1 2 window size T is chosen according to convenience and reasonableness and therefore the program is not tailored to any policy. On the other hand, the window size T in MWS coincides with that used by the working set policy. The algorithm increments C by 1 a l l pairs i and j such that blocks i and j belong to the current resident set of blocks R(t). 13 Chapter 3_ BLI-based r e s t r u c t u r i n g a l g o r i t h m s ' L o c a l i t y of r e f e r e n c e ' i s a n a t u r a l phenomenon e x h i b i t e d by most programs. I t r e f e r s t o the e x p e r i m e n t a l l y o b s e r v e d phenomenon t h a t , a s m a l l subset of a program's segments i s r e f e r e n c e d over a r e l a t i v e l y l o n g time i n t e r v a l . The c o u r s e of program e x e c u t i o n can be viewed, as a sequence of i n t e r v a l s , each of which i s e i t h e r a l o c a l i t y phase or a t r a n s i t i o n phase. The concept of BLI proposed by Madison and Batson [15] p r o v i d e s a mechanism t o p a r t i t i o n a b l o c k r e f e r e n c e s t r i n g i n t o l o c a l i t i e s and t r a n s i t i o n s . The BLI concept i s based on the extended L e a s t - R e c e n t l y - U s e d (LRU) s t a c k . The LRU s t a c k i s an o r d e r e d v e c t o r L ( t ) = ( L ( t ) , . . , L ( t ) , . . L ( t ) ) where n i s the 1 i n number of b l o c k s i n the program and L ( t ) i s the b l o c k I i d e n t i f i e r f o r the i t h m o s t - r e c e n t l y - r e f e r e n c e d b l o c k a t time t . The LRU s t a c k t h u s c o n t a i n s i n f o r m a t i o n about the o r d e r the b l o c k s were l a s t r e f e r e n c e d a t time t . In o r d e r t o d e f i n e l o c a l i t y s e t s from the LRU s t a c k , two new o r d e r e d v e c t o r s F ( t ) and B ( t ) were i n t r o d u c e d . L e t S ( t ) denote the s e t c o n t a i n i n g the t o p i elements of i the LRU s t a c k , t h a t i s , S (t)={ L ( t ) , ,L ( t ) }. Then F ( t ) i 1 i i c o r r e s p o n d s to the f o r m a t i o n time of the s e t S ( t ) . B ( t ) i s the i i time of the most r e c e n t r e f e r e n c e t o the i t h member of S ( t ) . An i a c t i v i t y s e t i s then d e f i n e d as any S ( t ) h a v i n g i t s B ( t ) > i i 1 4 F ( t ) , t h a t i s , a l l i t s members have been r e f e r e n c e d a t l e a s t i once s i n c e the time of f o r m a t i o n . The l i f e t i m e , 1 ( t ) of the i a c t i v i t y s e t i s d e f i n e d t o be the d i f f e r e n c e between B ( t ) and i F ( t ) where B ( t ) > F ( t ) . A BLI i s the p a i r (A ( t ) , l ( t ) ) such i i i i i t h a t A i s the a c t i v i t y s e t and l i i s i t s l i f e t i m e . The BLI i c o ncept a l l o w s a n a t u r a l way of d e f i n i n g l o c a l i t i e s . The h i e r a r c h i c a l n a t u r e of l o c a l i t i e s i m p l i c i t i n the d e f i n i t i o n i s shown i n F i g u r e 3 . 1 . J A B C D C D A B J E G H G H A B C D C D A B E G H G H I E G H -I 1 I 1' I 1 I 1 l e v e l 1 I 1 I 1 l e v e l 2 1 1 l e v e l 3 F i g u r e 3.1 A h i e r a r c h y of bounded l o c a l i t y i n t e r v a l s 3 . 1 ASI r e s t r c ' t u r i n g a l g o r i t h m T h i s a l g o r i t h m i n c r e m e n t s by one a l l t h e l a b e l s of the edges between the b l o c k s whenever the a c t i v i t y s e t t o which they b e l o n g i s e s t a b l i s h e d . Because of the h i e r a r c h i c a l n a t u r e of B L I ' s , b l o c k s are incremented by an amount e q u a l t o the l e v e l of the s m a l l e s t a c t i v i t y s e t t o which they b e l o n g . For example, b l o c k s C and D w i l l be incremented by 5 a f t e r the segment of 1 5 r e f e r e n c e s t r i n g i n F i g u r e 3.1 has been p r o c e s s e d . The h i e r a r c h i c a l p r o p e r t y of the BLI's a l l o w s a v e r y broad d e f i n i t i o n of 'nearness'. In F i g u r e 3.1, b l o c k s C and D a r e c o n s i d e r e d 'near' i n s e v e r a l l o c a l i t y phases, i n a broader c o n t e x t , the l e s s prominent r e f e r e n c i n g p a t t e r n of b l o c k s J,A,B and D i s o b s e r v e d . As mentioned i n the l a s t c h a p t e r , t h i s a l g o r i t h m may f a i l t o r e c o g n i z e some n e i g h b o u r i n g phases as r e l a t e d a l t h o u g h the p a r t i c u l a r o b s e r v e d r e f e r e n c i n g p a t t e r n i s p r e v a l e n t i n the co u r s e of e x e c u t i o n . For example, the c l u s t e r of E,G,H w i l l not be suggested by the a l g o r i t h m f o r the r e f e r e n c e s t r i n g d e p i c t e d i n F i g u r e 3.1. The next s e c t i o n d i s c u s s e s a new a l g o r i t h m which t r i e s t o remedy the problem d e s c r i b e d , the a l g o r i t h m s h a l l be r e f e r r e d t o s i m p l y as BLI a l g o r i t h m . 3.2 BLI a l g o r i t h m T h i s a l g o r i t h m i s based on the o b s e r v a t i o n t h a t t r a n s i t i o n phases have f a r g r e a t e r impact on pa g i n g a c t i v i t y than l o c a l i t y phases. T h i s i s because any r e f e r e n c e t o b l o c k s i n a l o c a l i t y phase w i l l not i n c u r more page f a u l t s as they a re a l r e a d y brought i n t o the memory. Kahn [12] r e p o r t e d t h a t w h i l e l o c a l i t y phases dominated the v i r t u a l t i m e , the r a t e of page f a u l t s d u r i n g t r a n s i t i o n s was 100 t o 1000 times h i g h e r than t h a t d u r i n g l o c a l i t y phases. U s i n g the l e v e l - o n e B L I , the program e x e c u t i o n time can be 1 6 d i v i d e d i n t o t r a n s i t i o n and l o c a l i t y i n t e r v a l s which a r e denoted by T and L r e s p e c t i v e l y . An a p p r o x i m a t i o n method t o e s t i m a t e i i T and L g i v e n by Chanson and Law [4 ] i s as f o l l o w s : i i 1) I f the c u r r e n t i n t e r v a l i s a t r a n s i t i o n phase, then the f i r s t r e p e t i t i o n of a b l o c k r e f e r e n c e s i g n a l s the b e g i n n i n g of the l o c a l i t y phase L (and thus the end of the t r a n s i t i o n phase i T ). 2) O t h e r w i s e , any r e f e r e n c e t o a b l o c k not i n the c u r r e n t l o c a l i t y phases s i g n a l s the end of the l o c a l i t y phase and the b e g i n n i n g of the next t r a n s i t i o n phase T 1 + 1 I t i s c l e a r from the above proc e d u r e t h a t L i s a s u b s e t of i T , t h e r e f o r e i t i s not n e c e s s a r y t o c o n s i d e r the l o c a l i t y i p hases. The new a l g o r i t h m i n c r e m e n t s by one a l l the l a b e l s of the edges between the b l o c k s which a r e i n the u n i o n s e t of T i - 1 and T . T h i s h o p e f u l l y c a p t u r e s b l o c k s which e x h i b i t prominent i r e f e r e n c i n g p a t t e r n . The nearness n o t i o n between b l o c k s i n a d j a c e n t phases can be extended t o phases which a r e T d i s t a n c e a p a r t , where T i s the number of t r a n s i t i o n phases. The window s i z e used i n t h i s t h e s i s 1 7 Chapter 4 C l u s t e r i n g a l g o r i t h m s Phase 2 of the r e s t r u c t u r i n g p r o c e d u r e s t o r e s the c o n n e c t i v i t y i n f o r m a t i o n i n a m a t r i x C. Each e n t r y C of the i j nXn c o n n e c t i v i t y m a t r i x r e p r e s e n t s the d e s i r a b i l i t y of p l a c i n g two b l o c k s i and j i n the same page. In g r a p h i c a l sense, C i j r e p r e s e n t s the weight of the edge c o n n e c t i n g node i and node j . The t a s k of t h e c l u s t e r i n g a l g o r i t h m i s thus t o group the b l o c k s i n t o c l u s t e r s w i t h the o b j e c t i v e of m i n i m i z i n g i n t e r - c l u s t e r r e f e r e n c e s . In a s s i g n i n g the b l o c k s t o pages, we may impose the c o n s t r a i n t t h a t the b l o c k s must f i t i n t o a page, or we may a l l o w some b l o c k s t o c r o s s page b o u n d a r i e s . W h i l e the o r d e r i n g of the b l o c k s i n a page i s i m m a t e r i a l i n the f i r s t a l t e r n a t i v e , a d d i t i o n a l a l g o r i t h m s a r e r e q u i r e d t o det e r m i n e the be s t o v e r l a p p i n g b l o c k s i n the second a l t e r n a t i v e . Because of the f r a g m e n t a t i o n caused by c l u s t e r s not f i l l i n g the pages c o m p l e t e l y , the second a l t e r n a t i v e was r e p o r t e d [10] t o p e r f o r m b e t t e r when a prop e r a l g o r i t h m was used t o det e r m i n e the o v e r l a p p i n g b l o c k s . In t h i s t h e s i s , we a r e not co n c e r n e d w i t h the 'gaps' i n the pages. F i n d i n g an o p t i m a l s o l u t i o n t o the c l u s t e r i n g problem i n the g e n e r a l case r e q u i r e s enumerating e x h a u s t i v e l y a l l p o s s i b l e l a y o u t s and the c o s t of e x e c u t i n g such an a l g o r i t h m i s v e r y h i g h . In most p r a c t i c a l s i t u a t i o n s , a n e a r - o p t i m a l and l e s s e x p e n s i v e a l g o r i t h m i s sought. Many a l g o r i t h m s w i t h d i f f e r e n t e f f i c i e n c y and time c o m p l e x i t y e x i s t . We s h a l l d e s c r i b e a f a i r l y 18 e f f i c i e n t a l g o r i t h m which i n most cases g i v e s n e a r - o p t i m a l s o l u t i o n . 4.1 H i e r a r c h i c a l c l a s s i f i c a t i o n (HC) method The c o n n e c t i v i t y m a t r i x may be viewed as a space c o n s i s t i n g of c l u s t e r s of p o i n t s w i t h t h e s i z e of each p o i n t r e f l e c t i n g the weight of the e n t r y C . An o p t i m a l a l g o r i t h m i s thus one t h a t i j f i n d s a l a y o u t i n which the t o t a l weight of the f i x e d - s i z e c l u s t e r s i s maximized. The HC method presumes t h a t the h e a v i l y w e i g h t e d p o i n t s a r e l o c a t e d i n the v i c i n i t y of dense a r e a s and hence they may be the p o s s i b l e n u c l e i of the p o t e n t i a l c l u s t e r s . At each i t e r a t i o n of the a l g o r i t h m , the two c l u s t e r s which a r e most s t r o n g l y c onnected a r e merged i n t o one and the m a t r i x i s reduced a c c o r d i n g l y . The p r o c e d u r e i s d e s c r i b e d f o r m a l l y below. Le t C be the c o n n e c t i v i t y m a t r i x , and J be the s e t of m m c l u s t e r s of b l o c k ( s ) , w i t h the s u b s c r i p t m d e n o t i n g the mth i t e r a t i o n . Each c l u s t e r i n i t i a l l y c o n t a i n s a s i n g l e b l o c k , t h a t i s , J = { {1},{2},..,{n} }. The mth i t e r a t i o n of the a l g o r i t h m 0 then e n t a i l s : 1) F i n d i n g the l a r g e s t element C i n the m a t r i x C , where A AB m-1 and B a r e two elements of J m-1 19 2) Reducing the m a t r i x C t o C by combining the rows and m-1 m columns a s s o c i a t e d w i t h A and B t o a new row and column l a b e l l e d by D,such t h a t : C = C + C DX AX BX C = C + C XD XA XB f o r a l l X i n J m- 1 where J = J - {A} - {B} + {D} m m-1 In the i n i t i a l m a t r i x C and the subsequent reduced ones, 0 each e n t r y C i s a s s i g n e d z e r o i f the combined c l u s t e r - s i z e s of i j i and j exceed a page. In a more g e n e r a l case where some b l o c k s i z e s may exceed a page, C i s a s s i g n e d z e r o i f the s m a l l e r i j c l u s t e r ( j say) c o u l d not f i t i n t o the r e m a i n i n g space of the c l u s t e r i . In t he a c t u a l i m p l e m e n t a t i o n , each e n t r y C of the i n i t i a l i j m a t r i x i s recomputed t o take i n t o account of the b l o c k s i z e s . The new v a l u e C ' i s d e f i n e d t o be i j C = C /(W +W ) i j i j i j where W and W are r e s p e c t i v e l y the b l o c k s i z e s of i and j . i j 20 T h i s d e f i n i t i o n was i n t r o d u c e d by M a s u d a [ l 6 ] and the performance of the m o d i f i e d method has been shown to be s u b s t a n t i a l l y b e t t e r than the o r i g i n a l method which does not t a k e the b l o c k s i z e s i n t o c o n s i d e r a t i o n . The proce d u r e t e r m i n a t e s when the c o n n e c t i v i t y between any two c l u s t e r s i s n i l , t h a t i s , a l l e n t r i e s of the reduced m a t r i x are z e r o s . For each i t e r a t i o n , s t e p 1 r e q u i r e s j 2 o p e r a t i o n and s t e p 2 r e q u i r e s 4 j o p e r a t i o n s , where j i s the s i z e of the reduced m a t r i x . Thus the number of o p e r a t i o n s T i s g i v e n by : T = { n 2 + ( n - l ) 2 + . . + 1 2 } + 4 { n + n - 1 + . . + l } = n(n+1)(2n+1)/6 + 2n(n+1) = n(n+1)(2n+13)/6 A common approach t o r e d u c i n g time c o m p l e x i t y of an a l g o r i t h m . i s t o decompose the o r i g i n a l problem i n t o subproblems on which some a l g o r i t h m can be a p p l i e d . However, the c o n n e c t i v i t y m a t r i x does not c o n t a i n i n f o r m a t i o n as t o how i t can be p a r t i t i o n e d i n t o s u b m a t r i c e s ; any a r b i t r a r y s u b m a t r i x may have s t r o n g c o n n e c t i v i t y w i t h the o t h e r s u b m a t r i c e s . The next s e c t i o n d e s c r i b e s a s i m p l e and f a s t method t o p a r t i t i o n the c o n n e c t i v i t y m a t r i x i n t o s u b m a t r i c e s such t h a t the i n t r a -s u b m a t r i x c o n n e c t i v i t y i s s m a l l . H i e r a r c h i c a l c l a s s i f i c a t i o n method i s then a p p l i e d t o each s u b m a t r i x . 21 4.2 A d e c o m p o s i t i o n approach t o c l u s t e r i n g problem The i d e a of r e p r e s e n t i n g the c o n n e c t i v i t y between b l o c k s i n a v e c t o r i s taken from the c l u s t e r i n g scheme of Yu et a l . [ l 9 ] . T h i s a l g o r i t h m f i n d s i t s a p p l i c a t i o n i n a l a r g e c l a s s of problems such as r e c o r d c l u s t e r i n g , a t t r i b u t e p a r t i t i o n i n g and f i l e a l l o c a t i o n i n d i s t r i b u t e d d a t a b a s e s . We s h a l l d e s c r i b e the a l g o r i t h m (which i s denoted as method A) i n the c o n t e x t of program r e s t r u c t u r i n g problem. 4.2.1 Method A The c o n n e c t i v i t y between b l o c k s i s r e p r e s e n t e d by a v e c t o r V where the i t h element of the v e c t o r (V ) r e c o r d s the p o s i t i o n i of the i t h b l o c k (b ) on a l i n e . The d i s t a n c e between any two i b l o c k s g i v e s an approximate i n d i c a t i o n of the c o n n e c t i v i t y between them. The d i s t a n c e between two b l o c k s i and j i s d e f i n e d t o be the a b s o l u t e d i f f e r e n c e between V and V . i j I n i t i a l l y the b l o c k s a r e a s s i g n e d a r b i t r a r y p o s i t i o n s on the l i n e . D u r i n g phase two of the r e s t r u c t u r i n g p r o c e d u r e , the c o n n e c t i v i t y v e c t o r i s updated as f o l l o w s : 1) A f t e r d e t e r m i n i n g the c o n n e c t i v i t y between two b l o c k s i and j by some a p p r o p r i a t e r e s t r u c t u r i n g a l g o r i t h m , b l o c k s i and j a r e moved towards t h e i r c e n t r o i d by a s m a l l amount A'. The c e n t r o i d of the two b l o c k s i s d e f i n e d t o be (V +V )/2. i j 2) Two randomly s e l e c t e d b l o c k s a r e moved away from t h e i r 22 c e n t r o i d by another s m a l l amount A " . Step 2 i s a p p l i e d t o a v o i d b l o c k s not c o n n e c t e d from bunching up t o g e t h e r . C o n s i d e r a s i t u a t i o n where the c e n t r o i d s of b l o c k s i and j c o i n c i d e w i t h t h a t of two o t h e r b l o c k s i ' and j ' . I f t h e r e i s a s t r o n g c o n n e c t i v i t y w i t h i n each of t h e p a i r s i and j , i ' and j ' , then e v e n t u a l l y the f o u r b l o c k s w i l l converge even i f t h e r e i s no c o n n e c t i v i t y between the p a i r s . I d e n t i f y i n g the c l u s t e r s i s a s t r a i g h t - f o r w a r d p r o c e d u r e . Any two b l o c k s w i t h i n a d i s t a n c e 6 a r e c l u s t e r e d t o g e t h e r . The s i m p l i c i t y of the a l g o r i t h m i s i n t u i t i v e l y a p p e a l i n g . However, i t has s e v e r a l drawbacks when a p p l i e d t o the r e s t r u c t u r i n g problem. I t i s c l e a r t h a t the c h o i c e of the p arameters A ' , A " and 5 i s c r i t i c a l i n i s o l a t i n g the c l u s t e r s . For example, a l a r g e A' v a l u e would cause the b l o c k s t o move c l o s e r t o one a n o t h e r too q u i c k l y , r e s u l t i n g i n many l a r g e c l u s t e r s . A s m a l l A' v a l u e , o n the o t h e r hand causes slow convergence and hence produces no v a l i d c l u s t e r s . Because the f r e q u e n c y of r e f e r e n c i n g v a r i e s from c l u s t e r t o c l u s t e r , not a l l the c l u s t e r s w i l l be i d e n t i f i e d c o r r e c t l y i f a s i n g l e A v a l u e i s used. Whi l e the p o i n t s on the l i n e p r e s e n t an approximate view of where the c l u s t e r s a r e , i n f o r m a t i o n about the c o n n e c t i v i t y between any two b l o c k s i s l o s t . T h i s i s a problem i n t r i n s i c i n the o n e - d i m e n s i o n a l r e p r e s e n t a t i o n because moving a b l o c k i towards b l o c k j i n e v i t a b l y r e s u l t s i n b l o c k i b e i n g c l o s e r t o o t h e r b l o c k s which may not be r e l a t e d t o b l o c k i . S i n c e not 23 e v e r y c l u s t e r w i l l f i t i n t o a page, i t i s n e c e s s a r y to- f i l t e r out t h e a p p r o p r i a t e b l o c k s from the c l u s t e r . However, a f i n e r r e c o n f i g u r a t i o n of the p o i n t s i s not p o s s i b l e due t o the i n s u f f i c i e n t i n f o r m a t i o n c o n t a i n e d i n the v e c t o r . ' We s h a l l now d e s c r i b e our new a l g o r i t h m , w hich, f o r l a c k of a b e t t e r name, w i l l s i m p l y be c a l l e d the g r a v i t a t i o n a l c l u s t e r i n g scheme i n t h i s t h e s i s . 4.2.2 G r a v i t a t i o n a l c l u s t e r i n g scheme S i n c e i t i s d i f f i c u l t t o s e l e c t a p p r o p r i a t e v a l u e s f o r the parameters A ' , A" and 6, the a l g o r i t h m d e s c r i b e d i n the l a s t s e c t i o n i s not s u i t a b l e f o r our a p p l i c a t i o n . E x p e r i m e n t s performed u s i n g a range of parameter v a l u e s d i d not show n o t i c e a b l e r e d u c t i o n i n page f a u l t r a t e . The a l g o r i t h m we have d e v e l o p e d uses the same p r i n c i p l e of moving b l o c k s which a r e s t r o n g l y connected c l o s e r to one a n o t h e r . Phase 2 of the r e s t r u c t u r i n g p r o c e d u r e i s r e v i s e d as f o l l o w s . When a b l o c k r e f e r e n c e s another b l o c k , the b l o c k on the l e f t i s moved t o the r i g h t by an amount d, w i t h the p o s s i b i l i t y of moving p a s t the o t h e r b l o c k . The c h o i c e of d i s not c r i t i c a l e x c e pt t h a t i t has t o be r e a s o n a b l y l a r g e so t h a t t h e i n i t i a l s e p a r a t i o n between the b l o c k s w i l l have m i n i m a l e f f e c t on the r a t e of convergence. I t s h o u l d be c l e a r t h a t i f a set of b l o c k s a r e o f t e n j o i n t l y r e f e r e n c e d t o g e t h e r , t h e i r c o r r e s p o n d i n g p o i n t s w i l l be l o c a t e d i n the p r o x i m i t y of one a n o t h e r . No p u l l i n g away of two randomly s e l e c t e d b l o c k s i s n e c e s s a r y 24 because c l u s t e r s of b l o c k s o f t e n move away from one a n o t h e r . However, due t o the l i m i t a t i o n of a o n e - d i m e n s i o n a l r e p r e s e n t a t i o n , some c l u s t e r s may s t i l l have t h e i r members i n t e r m i x e d . 4.2.3 G r a v i t a t i o n a l - h i e r a r c h i c a l c l a s s i f i c a t i o n (GHC) method A l t h o u g h the g r a v i t a t i o n a l c l u s t e r i n g scheme may f a i l t o f i l t e r out the most co n n e c t e d b l o c k s from each c l u s t e r , i t p r o v i d e s a b a s i s f o r p a r t i t i o n i n g a problem i n t o subproblems. The c o n n e c t i v i t y v e c t o r p r o v i d e s a means t o r e o r d e r the rows and columns of the c o n n e c t i v i t y m a t r i x i n such a way t h a t the c o n n e c t i v i t y between any s u b m a t r i c e s of r e s o n a b l e s i z e k i s s m a l l . The c o n n e c t i v i t y v e c t o r i s f i r s t s o r t e d i n d e c r e a s i n g o r d e r , and the m a t r i x i s r e a r r a n g e d t o have the b l o c k s a l o n g both the rows and columns i n the same o r d e r as t h a t i n the s o r t e d v e c t o r . Thus b l o c k s h a v i n g row numbers c l o s e t o one ano t h e r a r e l i k e l y t o be r e l a t e d . I n i t i a l l y , the s i z e of the m a t r i x i s nxn. A f t e r a p p l y i n g the HC p r o c e s s on the kxk s u b m a t r i x , t h e i t h and j t h rows/columns (1<= i , j <=k) of the s u b m a t r i x are merged t o form a new row/column. I f the row/column t o be d e l e t e d ( j say) i s not the f i r s t row/column of the updated m a t r i x , then a l l the e n t r i e s of the f i r s t row/column are moved t o the j t h row/column. In e i t h e r c a s e , the f i r s t row/column of the m a t r i x i s d e l e t e d . T h e r e f o r e the reduced m a t r i x C a f t e r the mth i t e r a t i o n i s of m s i z e (n'-m-1 )x(n-m-1 ). At each i t e r a t i o n , the HC p r o c e s s i s a p p l i e d on the f i r s t k rows and columns of the reduced m a t r i x . 2 5 The choice of k clearly depends on the distribution of the positions of the blocks on the line. A good approximation for k is the average number of 'blocks in a cluster. This value can be obtained by scanning the connectivity vector and keeping count of the number of blocks in each cluster. We shall describe a procedure to identify the clusters. Let d ...d be the distance between the adjacent pairs of points 1 n-1 (V ,V ),(V ,V ) , .. , (V ,V ). Assuming b belongs to some 1 2 2 3 n-1 n i-1 cluster, and d is the maximum distance between any two points in that cluster. Then b belongs to another cluster i f the i following conditions are satisfied. 1) d >. d .and . . . . . . . i-1 2) d > d for a l l 1<=j<=q where q >=1 i — 1 i-1+j The second condition is to ensure that the clusters are not terminated prematurely. In other words, one 'looks ahead' at the next q separations to ensure that d is a resonable separation. i Looking ahead at the next separation (q=1) is sufficient to isolate the major clusters in most cases. However, a conservative approach would choose a large q value resulting in large cluster sizes. In summary, the revised restructuring procedure consists of the following steps : 26 1) P a r t i t i o n the program t o be r e s t r u c t u r e d i n t o r e l o c a t a b l e b l o c k s . Transform the t r a c e c o l l e c t e d from e x e c u t i o n of the program i n t o i t s c o r r e s p o n d i n g b l o c k r e f e r e n c e s t r i n g . 2) P r o c e s s the b l o c k r e f e r e n c e s t r i n g and determine the c o n n e c t i v i t y between the b l o c k s . The c o n n e c t i v i t y m a t r i x and the c o n n e c t i v i t y v e c t o r a r e updated a c c o r d i n g l y . 3) S o r t the c o n n e c t i v i t y v e c t o r i n d e c r e a s i n g o r d e r and r e a r r a n g e the c o n n e c t i v i t y m a t r i x i n the same o r d e r . Apply the h i e r a r c h i c a l c l a s s i f i c a t i o n method s u c c e s s i v e l y on' the kxk s u b m a t r i x , s t a r t i n g from the t o p l e f t c o r n e r of the m a t r i x . 4) R e s t r u c t u r e the program a c c o r d i n g t o the l a y o u t produced i n the l a s t s t e p . 4.2.4 A n a l y s i s of time c o m p l e x i t y The GHC p r o c e d u r e c o n s i s t s of the f o l l o w i n g s t e p s . 1) S o r t i n g the c o n n e c t i v i t y v e c t o r and computing a v a l u e f o r k. 2) R e o r d e r i n g the rows and columns of the m a t r i x . 3) Recomputing each element of the m a t r i x t o t a k e i n t o c o n s i d e r a t i o n the b l o c k s i z e s . 4) A p p l y i n g the HC method s u c c e s s i v e l y on the kxk s u b m a t r i x . 27 The number of o p e r a t i o n s r e q u i r e d t o exe c u t e s t e p 1 i s bounded by n l o g n . Step 3 i s common i n the GHC method and the HC method. A l t h o u g h no rearrangement of the m a t r i x i s n e c e s s a r y i n the HC method, the m a t r i x i s made more compact by e l i m i n a t i n g the z e r o rows and columns. Thus the time c o m p l e x i t y of the p r e -h i e r a r c h i c a l p r o c e s s i n both methods i s of the o r d e r n 2 . We w i l l a n a l y z e the time c o m p l e x i t y of the GHC method based on s t e p 4. Assume the sub m a t r i x S has c o n s t a n t s i z e kxk. The p r o c e s s i n g time of s t e p 4 can be d i v i d e d i n t o two components. 1) F i n d i n g the maximum e n t r y S and merging the c l u s t e r s i and i j j . The number of o p e r a t i o n s r e q u i r e d = k 2+4k. S i n c e each i t e r a t i o n r e q u i r e s a c o n s t a n t p r o c e s s i n g t i m e , the t o t a l p r o c e s s i n g time i s thus l i n e a r l y p r o p o r t i o n a l t o the s i z e of the i n i t i a l m a t r i x . 2) R e p l a c i n g the e n t r i e s of the row/column t o be d e l e t e d w i t h t h a t of the f i r s t row/column of the r e d u c e d . m a t r i x . T h i s t a k e s m o p e r a t i o n s where m i s the s i z e of the reduced m a t r i x . Note t h a t t h i s s t e p i s not e x e c u t e d a t eve r y i t e r a t i o n . The number of o p e r a t i o n s r e q u i r e d f o r n i t e r a t i o n s i s of the or d e r n 2 . T h e r e f o r e the ' t o t a l p r o c e s s i n g time of the e n t i r e c l u s t e r i n g p r o c e d u r e ( s t e p 1 t o s t e p 4) i s of the o r d e r n 2 . 28 Chapter 5 D e s c r i p t i o n of experiment S e v e r a l s e r i e s of e x p e r i m e n t s were c a r r i e d out t o t e s t the performance of the BLI r e s t r u c t u r i n g a l g o r i t h m and the GHC c l u s t e r i n g a l g o r i t h m . There were two p a r t s t o the e x p e r i m e n t s . The f i r s t p a r t aimed a t 1) comparing the performance of the BLI r e s t r u c t u r i n g a l g o r i t h m w i t h the major e x i s t i n g r e s t r u c t u r i n g a l g o r i t h m s v i a s i m u l a t i o n 2) v e r i f y i n g the time c o m p l e x i t y a n a l y s i s of the GHC a l g o r i t h m . F o l l o w i n g the s i m u l a t i o n r u n s , the r e s t r u c t u r e d program was t e s t e d on a r e a l system and i t s performance was compared w i t h the n o n - r e s t r u c t u r e d program. The experiment procedure c o n s i s t s of the f o l l o w i n g s t e p s . 1) To d e t e r m i n e what performance i n d i c e s t o use and then t o i d e n t i f y the major parameters a f f e c t i n g the chosen performance i n d i c e s . 2) To d e v e l o p the a p p r o p i a t e r e s t r u c t u r i n g and measurement t o o l s . 3) To p e r f o r m s i m u l a t i o n r u n s . 4) To e v a l u a t e the r e s t r u c t u r e d program on a r e a l system. 5.1 Parameters 5.1.1 Performance i n d i c e s For the purpose of e v a l u a t i n g the performance of the BLI a l g o r i t h m , the page f a u l t r a t e and the mean working s e t s i z e were chosen t o be the performance i n d i c e s f o r t h e i r r e s p e c t i v e 29 i n f l u e n c e on the t u r n a r o u n d time and the space requirement of the v i r t u a l memory system. The mean wo r k i n g s e t s i z e i s o n l y a p p l i c a b l e i n the working s e t environment. To e v a l u a t e t h e e f f i c i e n c y of the GHC c l u s t e r i n g a l g o r i t h m , the performance index used was the p r o c e s s i n g t i m e . The page f a u l t r a t e was a l s o used t o measure the performance of the a l g o r i t h m . T h i s i s t o ensure t h a t the e f f i c i e n c y and the e f f e c t i v e n e s s of the a l g o r i t h m a r e m a i n t a i n e d a t t h e same t i m e . 5.1.2 F a c t o r s The performance of a r e s t r u c t u r e d program i s a f f e c t e d by the f o l l o w i n g f a c t o r s . 1) Input d a t a . T h i s a f f e c t s the b e h a v i o u r of the program. T h e r e f o r e a r e s t r u c t u r i n g a l g o r i t h m s h o u l d be t e s t e d on s e v e r a l r e f e r e n c e s t r i n g s which r e p r e s e n t v a r i o u s program b e h a v i o u r . D i f f e r e n t t y p e s of i n p u t d a t a were chosen t o co v e r a range of v a r i a b i l i t y . 2) R e s t r u c t u r i n g a l g o r i t h m s . The page f a u l t r a t e i s d i r e c t l y i n f l u e n c e d by the placement o r d e r of the b l o c k s i n the v i r t u a l pages. 3) Memory management p o l i c y . As mentioned i n Chapter 1, the r e s t r u c t u r i n g a l g o r i t h m s a r e s e n s i t i v e t o the memory management p o l i c y of the computing system and window s i z e ( i n case of the working s e t p o l i c y ) . Each r e s t r u c t u r i n g a l g o r i t h m was t e s t e d on d i f f e r e n t memory management environment i n 30 order to establish i t s robustness. For the purpose of analyzing the processing time of the GHC method, the following factors are considered. 1) Input data. Since the processing time i s a function of the number of d i s t i n c t referenced blocks, block reference strings consisting of d i f f e r e n t number of d i s t i n c t referenced blocks were chosen. 2) Size of the submatrix k. This a f f e c t s the processing time as well as the page fault rate. The GHC method gives poorer results (in terms of page fa u l t rate) than the HC method i f k is too small, since blocks belonging to the same cluster may not be recognized as such. An appropriate value for k can be computed using the procedure described in section 4.2.3. 5.2 Procedure of experiment After choosing the program to be restructured, the execution trace of the program was c o l l e c t e d by using a debugging package. The resulting output was then reduced to a block reference s t r i n g consisting of a sequence of numbers representing the order in which the relocatable blocks of the program were referenced. The size of each relocatable block was also recorded for use in the c l u s t e r i n g phase. The block reference s t r i n g was then input to a program in 31 which s e v e r a l r e s t r u c t u r i n g and c l u s t e r i n g a l g o r i t h m s were i n c o r p o r a t e d . Based on a s p e c i f i e d r e s t r u c t u r i n g a l g o r i t h m , the c o n n e c t i v i t y m a t r i x (and p o s s i b l y the c o n n e c t i v i t y v e c t o r ) was produced. The c o n n e c t i v i t y i n f o r m a t i o n was t h e n passed t o the c l u s t e r i n g phase which output a l a y o u t s p e c i f y i n g the o r d e r i n g of the b l o c k s i n the v i r t u a l pages which would m i n i m i z e i n t e r -page r e f e r e n c e s . The performance of t h e r e s t r u c t u r e d program was e v a l u a t e d by s i m u l a t i o n . The s i m u l a t o r t r a n s f o r m e d the b l o c k r e f e r e n c e s t r i n g i n t o the c o r r e s p o n d i n g page r e f e r e n c e s t r i n g a c c o r d i n g t o the r e s t r u c t u r i n g l a y o u t . At the same t i m e , i t s i m u l a t e d the pa g i n g a c t i v i t y of the program under a chosen memory management p o l i c y . The a p p r o p r i a t e performance s t a t i s t i c s were ou t p u t a t the end of t h e s i m u l a t i o n . The experiment p r o c e d u r e d e s c r i b e d above was r e p e a t e d a number of t i m e s , each time w i t h a d i f f e r e n t f a c t o r such as r e s t r u c t u r i n g a l g o r i t h m , memory management p o l i c y and i n p u t d a t a . F i n a l l y , the a c t u a l b l o c k s of the program were r e o r d e r e d and r e l i n k e d by a l i n k e r . The performance of the r e s t r u c t u r e d program was then t e s t e d on a r e a l system. The s o f t w a r e t o o l s were a l l coded i n P a s c a l . A d e s c r i p t i o n of the t o o l s i s p r e s e n t e d i n the next s e c t i o n . 32 5.3 R e s t r u c t u r i n g a n d measurement t o o l s  5.3.1 D a t a c o l l e c t i o n The S y m b o l i c D e b u g g i n g S y s t e m (SDS) was u s e d t o c o l l e c t t h e e x e c u t i o n t r a c e o f t h e p r o g r a m t o be r e s t r u c t u r e d . The SDS p r o v i d e s a TRACE CALL command w h i c h r e c o r d s a t r a c e o f s u b r o u t i n e c a l l s and r e t u r n s d u r i n g t h e p r o g r a m ' s e x e c u t i o n . I t a l s o p r o v i d e s a MAP command w h i c h p r o d u c e s a l i s t i n g o f a l l b l o c k ( i n s t r u c t i o n o r d a t a ) names and t h e i r c o r r e s p o n d i n g l e n g t h s . T h i s i n f o r m a t i o n was u s e d i n t h e c l u s t e r i n g p h a s e . The e x e c u t i o n t r a c e c o l l e c t e d was i n r e a d a b l e ( a n d h e n c e v e r b o s e ) f o r m f o r d e b u g g i n g p u r p o s e s . A d a t a r e d u c t i o n p r o g r a m was w r i t t e n t o t r a n s f o r m t h e d a t a i n t o a more c o n c i s e a n d u s e f u l f o r m . U s i n g t h e l i s t p r o d u c e d f r o m t h e MAP command, t h e p r o g r a m maps e a c h b l o c k name i n t h e t r a c e t o a u n i q u e p o s i t i v e i n t e g e r . The r e d u c e d f o r m o f t h e c o l l e c t e d t r a c e i s known a s t h e b l o c k r e f e r e n c e s t r i n g . C o l l e c t i n g d a t a i s t h e most e x p e n s i v e s t e p o f t h e r e s t r u c t u r i n g p r o c e d u r e b e c a u s e SDS i s n o t c u s t o m i z e d f o r t h e p u r p o s e o f c o l l e c t i n g p r o g r a m t r a c e s . The e x t e n s i v e f a c i l i t i e s SDS p r o v i d e s makes i t a v e r s a t i l e b u t e x p e n s i v e t o o l t o u s e . 5.3.2 R e s t r u c t u r i n g a l g o r i t h m s The r e s t r u c t u r i n g a l g o r i t h m s i m p l e m e n t e d were F e r r a r i ' s C r i t i c a l W o r k i n g s e t a n d M i n i m a l W o r k i n g s e t , K o b a y a s h i ' s A c t i v i t y s e t a n d t h e B L I . E a c h a c c e p t s a s i n p u t t h e b l o c k r e f e r e n c e s t r i n g a n d t h e r e s t r u c t u r i n g p a r a m e t e r ( s u c h a s window 33 s i z e ) i f a n y . The b l o c k r e f e r e n c e s t r i n g i s t h e n p r o c e s s e d a n d t h e c o n n e c t i v i t y m a t r i x i s u p d a t e d a c c o r d i n g l y . A t t h e end o f p r o c e s s i n g t h e s t r i n g , t h e m a t r i x i s p a s s e d t o a c l u s t e r i n g p r o c e d u r e w h i c h u s e s e i t h e r t h e HC method o r t h e GHC method t o c l u s t e r t h e b l o c k s i n t o p a g e s . 5.3.3 C l u s t e r i n g a l g o r i t h m s The c l u s t e r i n g a l g o r i t h m s i m p l e m e n t e d were t h e HC a n d t h e GHC m e t h o d s . I n p u t t o t h e c l u s t e r i n g p r o c e d u r e a r e t h e c o n n e c t i v i t y m a t r i x , t h e b l o c k s i z e a n d t h e s i z e o f a p a g e . The GHC method r e q u i r e s two a d d i t i o n a l i n p u t : t h e c o n n e c t i v i t y v e c t o r a n d t h e p a r a m e t e r k ( s i z e o f t h e s u b m a t r i x ) a s d e s c r i b e d i n s e c t i o n 4.2.3. I n o r d e r t o e l i m i n a t e t h e I/O o p e r a t i o n r e q u i r e d t o p a s s t h e c o n n e c t i v i t y m a t r i x f r o m t h e r e s t r u c t u r i n g p h a s e t o t h e c l u s t e r i n g p h a s e , t h e r e s t r u c t u r i n g a l g o r i t h m s a n d t h e c l u s t e r i n g a l g o r i t h m s were i m p l e m e n t e d i n t h e same p r o g r a m . I n t h e GHC m e t h o d , t h e c o n n e c t i v i t y v e c t o r i s f i r s t s o r t e d i n d e s c e n d i n g o r d e r . The c o n n e c t i v i t y m a t r i x i s t h e n c o p i e d t o a n o t h e r m a t r i x w i t h i t s rows a n d c o l u m n s a r r a n g e d i n t h e same o r d e r a s t h a t i n t h e s o r t e d v e c t o r . A v a l u e f o r t h e s u b m a t r i x s i z e ( k ) i s a l s o c o m p u t e d . I n t h e HC m e t h o d , t h e c o n n e c t i v i t y m a t r i x i s a l s o c o p i e d t o a more c o m p a c t m a t r i x e l i m i n a t i n g t h e z e r o rows and c o l u m n s . B e f o r e a p p l y i n g t h e HC m e thod, e a c h e l e m e n t C o f t h e i j m a t r i x i s d i v i d e d by an amount e q u a l t o t h e sum o f t h e b l o c k s i z e s i a n d j . 34 5.3.4 Memory management p o l i c y simulators Trace-dr iven memory management simulators were developed for the Working Set Po l icy and the Least -recent ly-used replacement strategy. Input to the simulators 'are the block reference s t r i n g , the layout produced from the res truc tur ing procedure, the block sizes and the s ize of a page. The las t two pieces of information are needed to determine the overlapped blocks because some c lus ters may contain blocks whose s izes exceed a page. The blocks are mapped into pages as spec i f i ed in the c lus ter layout . Based on the page reference s t r i n g , the paging a c t i v i t y of the restructured program is simulated under a spec i f i ed memory management po l i cy and the appropriate performance indices associated with the restructured program are pr inted out. 5.3.5 Linkage editor The l inkage editor provides f a c i l i t i e s to place the object modules into pages according to a spec i f i ed order ing . A program was written to produce appropriate l i n k e d i t commands when given the fol lowing input 1) the layout produced from the res tructur ing procedure and 2) the map which i d e n t i f i e s the actual object module names with the block numbers used in the layout . 5.4 Design and implementation The experiment was performed on an AMDAHL 470 V/8 computer which operates under the Michigan Terminal System (MTS) [3] . The 35 program e x p e r i m e n t e d on was a P a s c a l c o m p i l e r . I t was w r i t t e n i n P a s c a l and has about 13000 l i n e s of code. The c h a r a c t e r i s t i c s of the P a s c a l c o m p i l e r a re g i v e n below. no. of r e l o c a t a b l e b l o c k s = 461 no. of b l o c k s > 1 page = 4 av. s i z e of b l o c k s > 1 page = 5282 b y t e s av. s i z e of b l o c k s <= 1 page = 617 b y t e s s i z e of a page = 4K b y t e s 5.4.1 S e l e c t i o n of l e v e l s f o r the experiment In s e c t i o n 5.1.1 the major f a c t o r s f e l t t o a f f e c t the performance of an a l g o r i t h m were i d e n t i f i e d . The d i f f e r e n t v a l u e s of a f a c t o r (both q u a n t i t a t i v e and n o n - q u a n t i t a t i v e ) are c a l l e d i t s l e v e l s . The f a c t o r s s e l e c t e d and t h e i r c o r r e s p o n d i n g l e v e l s f o r the experiment a r e l i s t e d i n T a b l e s 5.1 and 5.2. The l e v e l s f o r the i n p u t data were s e l e c t e d t o co v e r a v a r i e t y of P a s c a l programs t o be c o m p i l e d w i t h r e s p e c t t o the number of s y n t a c t i c e r r o r s , the type of i n s t r u c t i o n s and data s t r u c t u r e s of the programs. The f i v e b l o c k r e f e r e n c e s t r i n g s c o l l e c t e d c o r r e s p o n d t o the f i v e d i f f e r e n t i n p u t programs t o the P a s c a l c o m p i l e r . F i v e r e s t r u c t u r i n g h e u r i s t i c s were s e l e c t e d f o r comparison. The o r i g i n a l o r d e r i n g of the b l o c k s by the a u t h o r went through no r e s t r u c t u r i n g and thus s e r v e d as the c o n t r o l of the exper i m e n t . In the o r i g i n a l o r d e r i n g , the o b j e c t modules are p l a c e d i n one c o n t i g u o u s b l o c k spanning s e v e r a l pages. 36 K o b a y a s h i ' s ASI a l g o r i t h m [ 1 4 ] was i n c l u d e d b e c a u s e i t i s one o f t h e b e s t e x i s t i n g s t r a t e g y - i n d e p e n d e n t r e s t r u c t u r i n g a l g o r i t h m s f o r r e d u c i n g p age f a u l t r a t e . The CWS m e thod [ 8 ] was c h o s e n b e c a u s e i t i s s t i l l t h e c u r r e n t b e s t s t r a t e g y - d e p e n d e n t a l g o r i t h m . The MWS m e thod [ 8 ] was c h o s e n f o r two r e a s o n s : 1) i t s c o m b i n a t i o n method o f c o m p u t i n g c o n n e c t i v i t y b e t w e e n p a i r s o f b l o c k s i n t h e window i s v e r y s i m i l a r t o t h a t o f t h e B L I , a n d 2) i t i s c o n s i d e r e d t o be t h e b e s t s t r a t e g y - d e p e n d e n t a l g o r i t h m f o r r e d u c i n g t h e mean w o r k i n g s e t s i z e o f t h e r e s t r u c t u r e d p r o g r a m . Two memory management p o l i c i e s were u s e d . The w o r k i n g s e t p o l i c y w i t h a window s i z e o f 50 r e f e r e n c e s was c h o s e n b e c a u s e o f t h e p o p u l a r i t y o f i t s u s a g e r e p o r t e d i n t h e l i t e r a t u r e . The L e a s t - R e c e n t l y - U s e d r e p l a c e m e n t s t r a t e g y was s e l e c t e d b e c a u s e i t r e p r e s e n t s a f i x e d - p a r t i t i o n memory management p o l i c y . A p a r t i t i o n s i z e r a n g i n g f r o m 4 t o 24 p a g e s was u s e d t o t e s t t h e r o b u s t n e s s o f t h e a l g o r i t h m . To o b t a i n a r e l a t i o n s h i p b e t w e e n t h e p r o c e s s i n g t i m e o f t h e GHC m e thod a n d t h e s i z e o f t h e m a t r i x n, b l o c k r e f e r e n c e s t r i n g s c o n s i s t i n g o f d i f f e r e n t number o f d i s t i n c t b l o c k s were u s e d . T h r e e o f t h e r e f e r e n c e s t r i n g s had t h e i r l e n g t h e x t e n d e d by a p p e n d i n g o t h e r r e f e r e n c e s t r i n g s t o them i n o r d e r t o g e t a r a n g e o f v a l u e s f o r n. The s u b m a t r i x s i z e ( k ) c o m p u t e d f o r t h e r e f e r e n c e s t r i n g s v a r i e d f r o m 30 t o 50, a f i x e d v a l u e ( 3 0 ) was u s e d . 5.4.2 I m p l e m e n t a t i o n d e t a i I s I n t h e 5 b l o c k r e f e r e n c e s t r i n g s we c o l l e c t e d , we f o u n d 37 t h a t s l i g h t l y over 200 d i s t i n c t b l o c k s were r e c o r d e d . For the u n r e f e r e n c e d b l o c k s , the r e s t r u c t u r i n g p r o c e d u r e s i m p l y put them i n the same o r d e r as the o r i g i n a l o r d e r i n g i n s e v e r a l c o n t i g u o u s pages. I n i t i a l l y we found t h a t the r e s t r u c t u r e d p a s c a l c o m p i l e r g e n e r a t e d more page f a u l t s than the n o n - r e s t r u c t u r e d one when t e s t e d on a r e a l system. C a r e f u l i n v e s t i g a t i o n r e v e a l e d the cause of the poor performance. We s h a l l d i g r e s s t o d e s c r i b e the s t r u c t u r e of a P a s c a l n e s t e d r o u t i n e i n o r d e r t o e x p l a i n the cause. The P a s c a l language a l l o w s d e c l a r a t i o n of a n e s t e d r o u t i n e which t a k e s the form as shown i n F i g u r e 5.1. The d e c l a r a t i o n of the r o u t i n e s P1,..,Pn w i t h i n the r o u t i n e P i n a w e l l - s t r u c t u r e d program i s not a r b i t r a r y . I n s t e a d , such c o n s t r u c t i s chosen t o a l l o w many of t h e r o u t i n e s P1..Pn t o share a common da t a b l o c k (B i n our example). C o n s e q u e n t l y , a h i g h i n c i d e n c e of the 1 b l o c k s B ..B r e f e r e n c i n g one a n o t h e r i s e x p e c t e d . 2 n+2 Fu r t h e r m o r e , the data b l o c k B1 w i l l be h e a v i l y r e f e r e n c e d by the i n s t r u c t i o n b l o c k s B2..B(n+2). In a n o n - r e s t r u c t u r e d program, b l o c k s a r e l o a d e d i n t o v i r t u a l memory i n the same o r d e r as they a r e coded i n the program. T h e r e f o r e the b l o c k s B ..B have a 1 n + 2 h i g h p r o b a b i l i t y of b e i n g i n main memory a t the same t i m e . I n . a r e s t r u c t u r e d program, the o r i g i n a l o r d e r i n g of the b l o c k s B ..B w i l l be p r e s e r v e d o n l y i f t h e r e i s no r e f e r e n c e t o such 1 n + 2 b l o c k s i n the r e f e r e n c e s t r i n g . Any r e f e r e n c e t o the b l o c k 38 B(n+2) w i l l i n e v i t a b l y r e s u l t i n a l a y o u t i n which the b l o c k B(n+2) and i t s d a t a b l o c k w i l l not be i n the same page. T h i s i s because d a t a b l o c k s were not r e c o r d e d i n the t r a c e and hence they would be put i n the same page as some o t h e r u n r e f e r e n c e d b l o c k s . The s o l u t i o n t o the above problem would be t o c o l l e c t d a t a r e f e r e n c e s as w e l l as i n s t r u c t i o n r e f e r e n c e s i n the d a t a c o l l e c t i o n s t a g e . S i n c e the SDS d i d not p r o v i d e such f a c i l i t y , an ad-hoc method was d e v i s e d t o . g r o u p the n e s t e d r o u t i n e and i t s e n c l o s i n g d a t a and i n s t r u c t i o n b l o c k s i n t o one b l o c k . Note t h a t t h i s method o n l y works f o r the P a s c a l c o m p i l e r used i n the exp e r i m e n t . The proposed method made use of the c o n v e n t i o n the P a s c a l c o m p i l e r employed t o g e n e r a t e s y m b o l i c names. In t h i s c o n v e n t i o n , a l l symbols h a v i n g the same p r e f i x names were names of e i t h e r i n s t r u c t i o n or d a t a b l o c k b e l o n g i n g t o a n e s t e d r o u t i n e . U s i n g t h i s r u l e , t h e s e were a l l put i n t o the same b l o c k t o ensure t h a t they w i l l be c l o s e t o one another i n the new l a y o u t . The c o l l e c t e d b l o c k r e f e r e n c e s t r i n g were scanned and the b l o c k numbers were mapped t o some o t h e r unique numbers. As a r e s u l t of the many-to-one mapping, the number of d i s t i n c t b l o c k s i n the new r e f e r e n c e s t r i n g was reduced s u b s t a n t i a l l y . The d e r i v e d s t r i n g has the f o l l o w i n g c h a r a c t e r i s t i c s . no. of d i s t i n c t r e l o c a t a b l e b l o c k s = 278 no. of b l o c k s > 1 page = 13 av. s i z e of b l o c k s > 1 page = 8116 b y t e s av. s i z e of b l o c k s <= 1 page = 681 b y t e s 39 In the s i m u l a t i o n e x p e r i m e n t , the o r i g i n a l b l o c k r e f e r e n c e s t r i n g s were used throughout because the s m a l l e r b l o c k s i z e would g i v e r i s e t o more c o n t r a s t i n g performance r e s u l t s among the r e s t r u c t u r i n g a l g o r i t h m s so t h a t more d e f i n i t e c o n c l u s i o n s c o u l d be made. In t e s t i n g the r e s t r u c t u r e d program on a r e a l system, we were f a c e d w i t h the problem of t r y i n g t o e v a l u a t e the non-r e s t r u c t u r e d and the r e s t r u c t u r e d programs i n a dynamic environment from which no f a i r assessment c o u l d be made. S i n c e we have no c o n t r o l over the t o t a l system w o r k l o a d , t h e number of page f a u l t s produced w h i l e r u n n i n g the program i n d i f f e r e n t w o r k l o a d c o n d i t i o n s cannot be m e a n i n g f u l l y compared. The i d e a l s o l u t i o n i s a c o n t r o l l e d environment i n which the system i s d r i v e n by some r e c o r d e d w o r k l o a d s . Such workloads a r e e x p e n s i v e t o ' c o n s t r u c t , and they r e q u i r e t h e system t o be d e d i c a t e d t o the exp e r i m e n t . F a i l i n g t o meet the r e q u i r e m e n t s , a crude means of t e s t i n g was employed. In o r d e r t o t e s t the programs under s i m i l a r w o r k l o a d s , the j o b s c o m p r i s i n g the t e s t i n g programs were s u b m i t t e d s i m u t a n e o u s l y as i n t e r a c t i v e j o b s from two t e r m i n a l s . S i n c e the CPU can o n l y s e r v i c e one j o b a t a t i m e , the workloads when the programs are r u n n i n g would s t i l l not be i d e n t i c a l . H o p e f u l l y , the f l u c t u a t i o n of wor k l o a d s d u r i n g t h e s h o r t time i n t e r v a l i s s m a l l . The r e s t r u c t u r i n g of the P a s c a l c o m p i l e r was based on i t s e x e c u t i o n t r a c e h a v i n g as i t s i n p u t a program which c o n t a i n e d d i f f e r e n t t y p e s of syntax e r r o r s . B e i n g a s t u d e n t c o m p i l e r , the r a t e of e r r o r i s e x p e c t e d t o be h i g h e r than normal and t h u s t h e 40 chosen t r a c e would r e p r e s e n t the g e n e r a l behav iour of the c o m p i l e r . \ PROCEDURE P; I—Var . . . P r o c e d u r e P1; —. V a r . . . Begin End; I P r o c e d u r e Pn ; —> Va r . . . Begin End; — B e g i n 1 . Codes f o r . Procedure P — E n d ; B l o c k B 1 B l o c k B 2 B l o c k B n+1 B l o c k B n + 2 F i g u r e 5.1 Layout of a P a s c a l n e s t e d r o u t i n e Table 5 . 1 . S e l e c t i o n of l e v e l s f o r the R e s t r u c t u r i n g Experiment Program : P a s c a l C o m p i l e r C l u s t e r i n g A l g o r i t h m : H i e r a r c h i c a l C l a s s i f i c a t i o n FACTORS LI NAME 2VELS DESCRIPTION Input d a t a (program t o be c o m p i l e d ) PI P2 P3 P4 P5 s t u d e n t program p o i n t e r m a n i p u l a t i n g program p o r t i o n of the r e s t r u c t u r i n g program w i t h 21 syn t a x e r r o r s d a t a p r o c e s s i n g program n u m e r i c a l program R e s t r u c t u r i n g A l g o r ithms ORIG ASI CWS MWS BLI o r i g i n a l o r d e r i n g by au t h o r A c t i v i t y - S e t C r i t i c a l - W o r k i n g - S e t M i n i m a l - W o r k i n g - S e t B o u n d e d - L o c a l i t y - I n t e r v a l Memory Management P o l i c y WS LRU Working-Set L e a s t - R e c e n t l y Used Ta b l e 5.2 S e l e c t i o n of l e v e l s f o r the C l u s t e r i n g Experiment Program : P a s c a l C o m p i l e r R e s t r u c t u r i n g A l g o r i t h m : CWS S i z e of the Submatrix k : 30 Memory Management P o l i c y : WS (window s i z e 50 r e f e r e n c e s ) FACTORS LEVELS NAME DESCRIPTION B l o c k r e f e r e n c e SI 1 42 d i s t i n c t b l o c k s S t r i n g s S2 171 d i s t i n c t b l o c k s S3 195 d i s t i n c t b l o c k s S4 208 d i s t i n c t b l o c k s S5 252 d i s t i n c t b l o c k s 43 Chapter 6 D i s c u s s i o n of e x p e r i m e n t a l r e s u l t s In t h i s c h a p t e r , r e s u l t s o b t a i n e d from the s i m u l a t i o n and the a c t u a l runs are p r e s e n t e d . We w i l l f i r s t d i s c u s s the performance of the BLI r e s t r u c t u r i n g a l g o r i t h m i n comparison w i t h the o t h e r r e s t r u c t u r i n g a l g o r i t h m s . Then we w i l l a n a l y z e e x p e r i m e n t a l l y t h e time c o m p l e x i t y of the GHC c l u s t e r i n g method. The BLI a l g o r i t h m i s a s s e s s e d i n terms of page f a u l t r a t e r e d u c t i o n , average working s e t s i z e r e d u c t i o n and c o s t . The s e n s i t i v i t y of a program's b e h a v i o u r to d i f f e r e n t i n p u t d a t a i s l a r g e l y an a t t r i b u t e of the program. The performance of a r e s t r u c t u r e d program w i t h r e s p e c t t o d i f f e r e n t i n p u t d a t a w i l l not be i n v e s t i g a t e d i n t h i s t h e s i s . 6 . 1 E v a l u a t i o n of the BLI a l g o r i t h m  6 . 1 . 1 D i s c u s s i o n of s i m u l a t i o n r e s u l t s . The number of page f a u l t s g e n e r a t e d and t h e average w o r k i n g s e t s i z e of the r e s t r u c t u r e d programs under a w o r k ing s e t environment are p r e s e n t e d i n T a b l e s 6.1a and 6.1b. As e x p e c t e d , the CWS a l g o r i t h m o u t p e r f o r m s the o t h e r s i n page f a u l t r a t e r e d u c t i o n when the window s i z e used i n r e s t r u c t u r i n g matches w i t h the window s i z e of the memory management p o l i c y . However, the BLI a l g o r i t h m works be s t when the f i g u r e of m e r i t i s the average working s e t s i z e . The CWS a l g o r i t h m performs b e t t e r than the BLI a l g o r i t h m i n terms of page f a u l t r a t e r e d u c t i o n by a p p r o x i m a t e l y 3%. The BLI a l g o r i t h m on the o t h e r hand reduces the average working s e t s i z e 44 13% more than the CWS a l g o r i t h m . A l t h o u g h t h e r e i s no s i g n i f i c a n t d i f f e r e n c e between the performance of the MWS and the BLI a l g o r i t h m s , the h i g h c o s t of e x e c u t i n g the MWS a l g o r i t h m (at l e a s t 10 times more than the o t h e r s i n our s t u d y ) makes i t an u n a t t r a c t i v e a l g o r i t h m t o use. In a g l o b a l - L R U environment, the memory space a l l o c a t e d t o a s i n g l e p r o c e s s i s t i m e - v a r i a n t . To get an i d e a how r e s t r u c t u r i n g would improve the performance of a program i n such an environment, the r e s t r u c t u r e d program i s e x e c u t e d i n a f i x e d -p a r t i t i o n LRU environment over a wide range of page frames. The c u r v e s i n F i g u r e 6.1 show the number of page f a u l t s g e n e r a t e d f o r d i f f e r e n t number of page frames i n a f i x e d - p a r t i t i o n LRU e n v i ronment. When the number of page frames a l l o c a t e d i s s m a l l , the BLI a l g o r i t h m o u t p e r f o r m s the o t h e r by a s u b s t a n t i a l amount. As more page frames a r e a l l o c a t e d , the amount of improvement produced by BLI r e l a t i v e t o the o t h e r , a l g o r i t h m s d i m i n i s h e s . The poor performance of the BLI a l g o r i t h m a t the h i g h e r range of page frames c o u l d be b e s t e x p l a i n e d by the f o l l o w i n g example. C o n s i d e r a b l o c k r e f e r e n c e s t r i n g h a v i n g the r e f e r e n c e p a t t e r n {1,2,..,1,2,4,3,..,4,3,1,3,5,6,..,1,3,5,6}, where each of t h e p a t t e r n s 1,2 and 4,3 i s r e p e a t e d many more times c o n s e c u t i v e l y than the p a t t e r n 1,3,5,6. Assuming i n a s i m p l e case t h a t a page c o n t a i n s two b l o c k s . The l a y o u t {(1 , 2 ) , (3 , 4) , ( 5 , 6)} would be o p t i m a l i n a f i x e d - p a r t i t i o n LRU environment w i t h the number of page frames = 1. However, i f the number of page frames = 2, then the l a y o u t { ( 1 , 3 ) , ( 5 , 6 ) , ( 2 , 4 ) } would produce b e t t e r r e s u l t s . Thus 45 i t i s c l e a r t h a t no r e s t r u c t u r i n g a l g o r i t h m c o u l d produce a l a y o u t which i s o p t i m a l over the whole range of" memory space a v a i l a b l e t o the r e s t r u c t u r e d program. In g e n e r a l , i f the a v a i l a b l e memory i s l a r g e , a wider o b s e r v a t i o n of the b l o c k r e f e r e n c e s i s r e q u i r e d . In our r e f e r e n c e s t r i n g s , the average t r a n s i t i o n s e t s i z e d e t e c t e d by the BLI method i s s m a l l e r than the c o r r e s p o n d i n g average w o r k i n g s e t s i z e of the CWS method. As a r e s u l t , the BLI a l g o r i t h m performs b e t t e r a t the lower range of page frames a l l o c a t i o n . 6 . 1 . 2 . D i s c u s s i o n of e m p i r i c a l r e s u l t s Due t o the h i g h f l u c t u a t i o n of wo r k l o a d i n the system under which the experiment was c a r r i e d o u t , the e x p e r i m e n t a l e r r o r i s exp e c t e d t o be v e r y h i g h . I t i s hoped t h a t a f t e r r e p e a t i n g the experiment a l a r g e number of t i m e s , the s u p e r i o r i t y of one program over another can be e s t a b l i s e d s t a t i s t i c a l l y . The two v e r s i o n s of the P a s c a l c o m p i l e r were t e s t e d on MTS [3] which uses a g l o b a l memory management p o l i c y . The r e f e r e n c e s t r i n g used i n the r e s t r u c t u r i n g p r o c e s s had as i t s i n p u t a p a s c a l program c o n t a i n i n g s e v e r a l syntax e r r o r s . The d i f f e r e n t i n p u t used t o t e s t the c o m p i l e r s i n c l u d e d programs of d i f f e r e n t l e n g t h s and programs w i t h and w i t h o u t s y n t a x e r r o r s . For each type of i n p u t , the experiment was r e p e a t e d a t l e a s t 30 t i m e s . Due t o the h i g h u n c e r t a i n t y i n the e x p e r i m e n t , i t i s d i f f i c u l t t o draw c o n c l u s i o n s from the e x p e r i m e n t a l r e s u l t s w i t h a h i g h degree of c o n f i d e n c e . N e v e r t h e l e s s , the cummulative page f a u l t s show t h a t t h e r e i s a d e f i n i t e improvement of performance 46 i n the r e s t r u c t u r e d program. The r e d u c t i o n of page f a u l t r a t e p r e s e n t e d i n T a b l e 6.4 i s based on the cummulative page f a u l t s of over 30 r u n s . Because of the random n a t u r e of the e x p e r i m e n t , the f i g u r e s do not i n any way g uarantee r e p r o d u c t i o n of the same r e s u l t s upon r e p e t i t i o n of the e x p e r i m e n t . The d i s c r e p a n c y i n the r e d u c t i o n of page f a u l t between the a c t u a l and the s i m u l a t i o n runs r e q u i r e s some e x p l a n a t i o n . The r e s t r u c t u r i n g i s based on the r e f e r e n c e s t r i n g which c o n t a i n s no d a t a r e f e r e n c e s and hence i t does not account f o r the d a t a b l o c k s which may be h e a v i l y r e f e r e n c e d . The s i m u l a t i o n r e s u l t s show a s u b s t a n t i a l improvement because the p a g i n g a c t i v i t y s i m u l a t e d i s based on the r e f e r e n c e s t r i n g . A l t h o u g h the method of g r o u p i n g the n e s t e d p r o c e d u r e b l o c k and a l l i t s e n c l o s i n g b l o c k s i n t o a b l o c k (see s e c t i o n 5.4.2) has reduced page f a u l t r a t e somewhat, i t i s not the i d e a l s o l u t i o n . The r e g r o u p i n g r e s u l t s i n l a r g e r b l o c k s i z e s and thus r e d u c e s . t h e b e n e f i t s of r e s t r u c t u r i n g . Moreover, many of the combined b l o c k s a f t e r r e g r o u p i n g span over a page. Some of the pages of the o v e r s i z e d b l o c k may c o n t a i n s u b b l o c k s which a r e not r e l a t e d . A l s o , the o v e r s i z e d b l o c k s i n our r e f e r e n c e s t r i n g s happened t o be the more h e a v i l y r e f e r e n c e d ones. A s t u d y of the c o m p i l e r ' s s o u r c e code shows t h a t t h e r e i s a h i g h f r e q u e n c y of p r o c e d u r e s r e f e r e n c i n g g l o b a l d a t a . S i n c e the g l o b a l d a t a b l o c k cannot be p l a c e d i n the same page as every b l o c k t h a t r e f e r e n c e s i t , i t i s c l e a r t h a t the improvement of r e s t r u c t u r i n g w i l l not be s u b s t a n t i a l i f the number of page f a u l t s saved t h r o u g h r e s t r u c t u r i n g i s s m a l l compared t o the h i g h 47 page f a u l t r a t e due t o t h e g l o b a l d a t a r e f e r e n c e s . 6.1.3 C o s t The c o s t o f c o n s t r u c t i n g t h e c o n n e c t i v i t y m a t r i x u s i n g t h e BLI method d e p e n d s on t h e c h a r a c t e r i s t i c s o f t h e l o c a l i t y o f t h e p r o g r a m . I t i s p r o p o r t i o n a l t o t h e number o f t r a n s i t i o n p h a s e s and t h e a v e r a g e t r a n s i t i o n s e t s i z e . Our measurement shows t h a t on t h e a v e r a g e t h e BLI method i s c h e a p e r t o e x e c u t e t h a n t h e CWS method by a p p r o x i m a t e l y 18%. 6.2 E v a l u a t i o n o f t h e GHC c l u s t e r i n g scheme T a b l e s 6.2 and 6.3 show t h e p r o c e s s i n g t i m e a n d t h e p e r f o r m a n c e o f t h e c l u s t e r i n g a l g o r i t h m s u s i n g r e f e r e n c e s t r i n g s w i t h d i f f e r e n t number o f d i s t i n c t b l o c k s . The p r o c e s s i n g t i m e m e a s u r e d i s t h e t i m e t a k e n t o e x e c u t e t h e e n t i r e c l u s t e r i n g p r o c e d u r e . The a v e r a g e number o f b l o c k s i n a c l u s t e r ( k ) v a r i e s f r o m 30 t o 50. A f i x e d k v a l u e was u s e d i n o r d e r t o e s t a b l i s h t h e r e l a t i o n b etween t h e i m p r o v e m e n t r a t i o a n d t h e number o f b l o c k s . The i m p r o v e m e n t r a t i o i s d e f i n e d t o be t h e p r o c e s s i n g t i m e o f t h e HC method d i v i d e d by t h a t o f t h e GHC method. Our a n a l y s i s shows t h a t t h e t i m e c o m p l e x i t y o f t h e HC a n d t h e GHC m e t h o d s a r e r e s p e c t i v e l y n 3 and n 2 . The i n c r e a s i n g i m p r o v e m e n t r a t i o a s a f u n c t i o n o f n i n d i c a t e s t h a t t h e p r o c e s s i n g t i m e o f t h e GHC method i s i n d e e d l o w e r i n c o m p l e x i t y t h a n t h e HC method. I n f a c t , t h e i m p r o v e m e n t r a t i o i s = c n where c i s c o m p u t e d t o be ^ 0.0235. Thus i f n i s = 1000, t h e e x p e c t e d i m p r o v e m e n t i s a p p r o x i m a t e l y 23 t i m e s . 48 H o w e v e r , t h e a b o v e a n a l y s i s d o e s n o t t a k e i n t o a c c o u n t t h e c o n s t r u c t i o n o f t h e c o n n e c t i v i t y v e c t o r d u r i n g t h e r e s t r u c t u r i n g p h a s e . S i n c e t h e a d d i t i o n a l p r o c e s s i n g t i m e i s l i n e a r l y p r o p o r t i o n a l t o t h e l e n g t h o f t h e r e f e r e n c e s t r i n g , t h e new scheme w i l l n o t be s u i t a b l e i f t h e m a j o r o v e r h e a d i s t h e c o n s t r u c t i o n o f t h e m a t r i x and v e c t o r . Thus f o r a m a t r i x o f c o n s t a n t s i z e , t h e e f f i c i e n c y o f t h e new m e t h o d d e c r e a s e s a s t h e l e n g t h o f t h e r e f e r e n c e s t r i n g i n c r e a s e s . By e x t r a p o l a t i o n , f o r a m a t r i x o f s i z e 208 x 208, t h e GHC m e t h o d w i l l have no a d v a n t a g e o v e r t h e HC method when t h e number o f b l o c k r e f e r e n c e s e x c e e d 150000. T h i s c o m p u t a t i o n o f c o u r s e p e r t a i n s t o a p a r t i c u l a r r e s t r u c t u r i n g a l g o r i t h m and a p a r t i c u l a r m a c h i n e on w h i c h t h e a l g o r i t h m i s e x e c u t e d . I n o u r e x p e r i m e n t , t h e r e i s s t i l l a n o n - n e g l i g i b l e s a v i n g i n p r o c e s s i n g t i m e f o r a r e f e r e n c e s t r i n g (S4) w h i c h i s c o n s i d e r e d l o n g . The p e r f o r m a n c e of t h e GHC method a s shown i n T a b l e 6.3 i s c o m p a r a b l e t o t h a t o f t h e HC a l g o r i t h m . Thus t h e GHC o f f e r s a much c h e a p e r s o l u t i o n t o t h e c l u s t e r i n g p r o b l e m w i t h no d e g r a d a t i o n i n p e r f o r m a n c e . 6.3 C o n c l u s i o n T h e r e i s no d o u b t t h a t r e s t r u c t u r i n g h a s t h e p o t e n t i a l o f i m p r o v i n g t h e p e r f o r m a n c e o f a p r o g r a m r u n n i n g i n a v i r t u a l memory s y s t e m . However, t h e c o s t of r e s t r u c t u r i n g must be w e i g h e d a g a i n s t t h e b e n e f i t s g a i n e d . The amount o f i m p r o v e m e n t o b t a i n a b l e d e p e n d s on s e v e r a l f a c t o r s , among w h i c h t h e 49 f o l l o w i n g : c h o i c e of the r e s t r u c t u r i n g a l g o r i t h m s , the q u a l i t y of the n o n - r e s t r u c t u r e d program, i n p u t t o the r e s t r u c t u r e d program and the f r e q u e n c y of use. The c h o i c e of the r e s t r u c t u r i n g a l g o r i t h m s depends on the improvement index sought and on the memory management p o l i c y of the system i n which the r e s t r u c t u r e d program w i l l be e x e c u t e d . S t r a t e g y - d e p e n d e n t a l g o r i t h m s a r e best i n r e d u c i n g the page f a u l t r a t e i f the knowledge of the memory management s t r a t e g y can be e x p l o i t e d . When the improvement sought i s the r e d u c t i o n of average w o r k i n g s e t s i z e i n a w o r k i ng s e t environment, the BLI a l g o r i t h m seems t o be a b e t t e r c h o i c e . A l t h o u g h each a l g o r i t h m i s o p t i m a l over a c e r t a i n range of page frames i n a f i x e d LRU environment (see F i g u r e 6 . 1 ) , the a b s o l u t e improvement produced from the BLI a l g o r i t h m i s the h i g h e s t . T h e r e f o r e we might c o n j e c t u r e t h a t the BLI a l g o r i t h m i s l i k e l y t o p e r f o r m w e l l i n a g l o b a l - L R U environment. Due t o the l a c k of c o n t r o l l a b i l i t y of our e x p e r i m e n t , our c l a i m cannot be s u b s t a n t i a t e d by c o n c r e t e r e s u l t s from a r e a l system. I t has been s a i d t h a t r e s t r u c t u r i n g i s most e f f e c t i v e on b a d l y - w r i t t e n programs. T h i s i s t r u e t o a c e r t a i n e x t e n t though some c l a r i f i c a t i o n i s needed. A b a d l y - w r i t t e n program may t a k e many forms. On one extreme i t may c o n s i s t of j u s t one or a few l a r g e program segments and r e s t r u c t u r i n g would c e r t a i n l y not be e f f e c t i v e . An i l l - s t r u c t u r e d program c o u l d a l s o be one whose e x e c u t i o n b e h a v i o u r does not e x h i b i t d i s t i n c t r e f e r e n c i n g p a t t e r n and r e s t r u c t u r i n g i n such a case would not y i e l d much improvement s i n c e no p a r t i c u l a r s e t of b l o c k s a r e f a v o u r e d . Yet 50 a n o t h e r form of i l l - s t r u c t u r e d program i s one c o n t a i n i n g a h i g h f r e q u e n c y of g l o b a l d a t a r e f e r e n c e s . In the worst case where ever y p r o c e d u r e makes r e f e r e n c e t o the g l o b a l data b l o c k , the s i z e of the g l o b a l data b l o c k c o u l d be v e r y l a r g e . I t i s c o n c e i v a b l e t h a t the improvement r e s u l t e d from r e s t r u c t u r i n g w i l l be m i n i m a l i n t h i s c a s e . A l t h o u g h a good memory management p o l i c y would m a i n t a i n the g l o b a l d a t a b l o c k i n main memory most of the t i m e , the p o r t i o n of the d a t a b l o c k b e i n g f r e q u e n t l y r e f e r e n c e d d u r i n g a s h o r t p e r i o d of time c o u l d be s m a l l , r e s u l t i n g i n low u t i l i z a t i o n of main memory. Perhaps i t i s more c o r r e c t t o say t h a t r e s t r u c t u r i n g i s most e f f e c t i v e on a w e l l - s t r u c t u r e d program. A good m o d u l a r i z a t i o n i n a w e l l - s t r u c t u r e d program would g i v e r i s e t o s m a l l b l o c k s i z e s and good l o c a l i t y , two e s s e n t i a l i n g r e d i e n t s f o r e f f e c t i v e r e s t r u c t u r i n g . Our e x p e r i m e n t s show t h a t d a t a r e f e r e n c e s c o n s t i t u t e a c o n s i d e r a b l e amount of page f a u l t s . Memory a l l o c a t i o n t o d a t a i s e i t h e r s t a t i c or dynamic. The i n f l u e n c e of d a t a r e f e r e n c e s of the f i r s t type on page f a u l t r a t e has been d i s c u s s e d . R e f e r e n c e t o dynamic d a t a s t r u c t u r e s i s not r e c o r d e d i n the t r a c e of a program and hence c o u l d not be a c c o u n t e d f o r i n r e s t r u c t u r i n g . R e s t r u c t u r i n g i s based on one e x e c u t i o n t r a c e which i s supposed t o r e p r e s e n t the g e n e r a l b e h a v i o u r of the program. T h e r e f o r e i t w i l l o n l y be b e n e f i c i a l i f the r e s t r u c t u r e d program i s i n s e n s i t i v e t o d i f f e r e n t i n p u t d a t a . S t u d i e s have shown t h a t the b e h a v i o u r of c o m p i l e r s , a s s e m b l e r s and o t h e r l a r g e system programs i s not v e r y s e n s i t i v e t o d i f f e r e n t i n p u t d a t a . Such 51 programs are a l s o good candidates f o r r e s t r u c t u r i n g s i n c e the cost of r e s t r u c t u r i n g w i l l be p a i d o f f through frequent usage. The improvement a l s o may depend on the l e n g t h of the input data. When the input data i s a program to be compiled, the amount of i n f o r m a t i o n , such as symbol t a b l e s , r e q u i r e d to be r e s i d e n t i n main memory may i n c r e a s e p r o p o r t i o n a l l y with the s i z e of the program. The e f f e c t of t h i s f a c t o r i s perhaps minor; Table 6 . 4 shows that the input program would have to be very l a r g e ( s e v e r a l hundred Kbytes) t o reduce the percentage improvement to below 6 % . L a s t l y , the new c l u s t e r i n g scheme p r o v i d e s a much cheaper s o l u t i o n to the c l u s t e r i n g problem with l i t t l e d egradation in performance. The re d u c t i o n i n p r o c e s s i n g time may seem i n s i g n i f i c a n t compared t o the o v e r a l l c o s t of r e s t r u c t u r i n g . However, t h i s scheme f i n d s i t s a p p l i c a t i o n i n many areas where the s i z e of the matrix i s the dominating f a c t o r i n the execution time of the a l g o r i t h m . When the s i z e of the matrix i s huge, the re d u c t i o n i n execution time c o u l d be s u b s t a n t i a l . Table 6.1a Comparison of the r e s t r u c t u r i n g a l g o r i t h m s on the number of page f a u l t s i n the working s e t environment S t r i n g O r i g i n a l CWS MWS ASI BLI PI 369 1 70 1 58 202 159 P2 8032 3565 3599 3565 3607 P3 5698 2769 2765 3465 2706 P4 4946 231 5 2254 2306 2465 P5 1690 738 826 921 898 mean 4147 1911 1 920 2092 1 967 Tab l e 6.1b Comparison of the r e s t r u c t u r i n g a l g o r i t h m s the average working s e t s i z e ( i n terms of 4K byte pages) i n the working s e t environment S t r i n g Or i g i n a l CWS MWS ASI BLI PI 1 0.40 7.16 5.34 6.42 4.87 P2 1 2.68 7.66 7.25 7.66 6.86 P3- 1 3.42 8.21 7.68 8.91 7.11 P4 10.89 6.52 6.21 6.50 6.26 P5 1 1 .65 6.72 6.93 6.92 6.50 mean 11.65 7.25 6.68 7.28 6.32 Table 6.2a Comparison of the c l u s t e r i n g a l g o r i t h m s on the CPU p r o c e s s i n g time ( i n msec*) d u r i n g the c l u s t e r i n g phase S t r i n g n HC GHC(k=30) Improvement SI 1 42 1823 554 3.29 S2 171 3026 739 4.09 S3 195 4299 938 4.58 S4 208 5100 1057 4.82 S5 252 8704 1 448 6.01 Ta b l e 6.2b Comparison of the c l u s t e r i n g a l g o r i t h m s on the CPU p r o c e s s i n g time ( i n msec*) d u r i n g the r e s t r u c t u r i n g phase S t r i n g no.of r e f e r e n c e s HC GHC(k=30) 21 24 931 5662 7621 3581 Table 6.2c Comparison of the c l u s t e r i n g a l g o r i t h m s on the C P U ' p r o c e s s i n g time ( i n msec*) d u r i n g the r e s t r u c t u r i n g and c l u s t e r i n g phases S t r i n g HC GHC(k=30) SI 371 5 2678 S2 3878 1 670 S3 8768 6600 S4 1 1269 8678 S5 11719 5029 51 21304 1892 52 9950 852 53 40832 4469 54 59396 6169 55 15375 3015 A l l t i m e s quoted i n T a b l e s 6.21 t o 6.2c were measured from an Amdahl 470 V/8 system. 54 Table 6.3 Comparison of the c l u s t e r i n g a l g o r i t h m s on the number of page f a u l t s g e n e r a t e d i n the w o r k i n g s e t environment S t r i n g n HC GHC(k=30) s i • 142 882 928 S2 171 358 353 S3 195 2593 2697 S4 208 3565 3629 S5 252 1510 1 603 Table 6.4 E v a l u a t i o n of the r e s t r u c t u r e d P a s c a l c o m p i l e r on MTS r e s t r u c t u r i n g a l g o r i t h m : BLI I n P u t programs % r e d u c t i o n i n page f a u l t r a t e p r o g , w i t h 7 e r r o r s , size=13 pages 14 p r o g , w i t h no e r r o r s ,size=13 pages 12 p r o g , w i t h 209 e r r o r s , s i z e = 8 0 pages 6 p r o g , w i t h no e r r o r s , size=80 pages 3 F i g u r e 6.1 P a g i n g C h a r a c t e r i s t i c s o f r e f e r e n c e s t r i n g P3 56 REFERENCES [1] M.S. Achard, J.Y.Babonneau, M . C a r p e n t i e r , G. M o r i s s e t , M.B. Mounajjed, "The c l u s t e r i n g a l g o r i t h m s i n the Opale r e s t r u c t u r i n g system," Performance of Computer I n s t a l l a t i o n , D . F e r r a r i ( e d . ) CILEA, N o r t h - H o l l a n d P u b l i s h i n g Co.,1978. [2] Y, Ba r d , " C h a r a c t e r i z a t i o n of program p a g i n g i n a t i m e -s h a r i n g environment," IBM J o u r n a l of Research and Development, 1973, pp.387-393. [3] D. B o e t t n e r and M. A l e x a n d e r , "The M i c h i g a n T e r m i n a l System," In IEEE P r o c e e d i n g s , S p e c i a l i s s u e on i n t e r a c t i v e computer systems, pp. 912-918, June 1975. [4] S. Chanson and B. Law, "Improving performance by s t r a t e g y -independent program r e s t r u c t u r i n g u s i n g Bounded L o c a l i t y I n t e r v a l s , " , T e c h n i c a l Report 81-9, Dept. of Computer Sc., U n i v e r s i t y of B r i t i s h C olumbia, 1981. [5] S. Cook, The c o m p l e x i t y of theorem p r o v i n g p r o c e d u r e s , " ACM SIGACT, 1970, pp.151-158. [6] D. F e r r a r i , "Improving program l o c a l i t y by s t r a t e g y -o r i e n t e d r e s t r u c t u r i n g , " I n f o r m a t i o n P r o c e s s i n g 74, P r o c . I F I P Congress 74, N o r t h - H o l l a n d , Amsterdam, 1974, pp.266-270. [7] D. F e r r a r i , " T a i l o r i n g programs t o models of program b e h a v i o r , " IBM J o u r n a l of Research and Development, v o l . 1 9 , 3, May 1975, pp.244-251. [8] D. F e r r a r i , "The improvement of program b e h a v i o r , " IEEE Computer 9 11, Nov. 1976, pp.39-47. [9] D. F e r r a r i and M. K o b a y a s h i , "Program r e s t r u c t u r i n g a l g o r i t h m s f o r g l o b a l - L R U e n v i r o n m e n t s , " Conf. P r o c . of I n t . Computing Symp. 1977, L i e g e , B e l g i u m , pp.277-283, A p r i l , 1977. [10] D.J. H a t f i e l d and J . G e r a l d , "Program r e s t r u c t u r i n g f o r v i r t u a l memory," IBM System J o u r n a l v o l . 1 0 , pp.168-192, 1 971 . [11] E. H o r o w i t z and S. S a h n i , "Fundamental of computer a l g o r i t h m s " , Computer S c i e n c e P r e s s , I n d . , 1978. [12] K.C. Kahn, "Program b e h a v i o r and l o a d dependent system performance," Ph.D. D i s s e r t a t i o n , Dept. Computer Sc., Purdue U n i v . , W. L a f a y e t t e , IN, Aug. 1976. [13] R. K a r p , " R e d u c i b i l i t y among c o m b i n a t o r i a l problems," In C o m p l e x i t y of Computer Computations, R. M i l l e r and J . Thather Ed., Plenum P r e s s , N.Y. 1972, pp. 85-104. 57 [14] M. K o b a y a s h i , "A s e t of s t r a t e g y - i n d e p e n d e n t r e s t r u c t u r i n g a l g o r i t h m s " S o f t w a r e - P r a c t i c e and E x p e r i e n c e , V o l . 7, 585-594 (1977). [15] A.W. Madison and A.P. Bat s o n , " C h a r a c t e r i s t i c s of program l o c a l i t i e s , " CACM, v o l . 19, pp.285-294, May 1976. [16] T. Masuda, H. S h i o t a , K.Noguchi, and T. O h k i , " O p t i m a z a t i o n of program l o c a l i t y by c l u s t e r a n a l y s i s , " P r o c . I F I P C o n g r e s s , 1974, pp.261-265. [17] J.-F. P a r i s and D. F e r r a r i , "An a n a l y t i c a l study of s t r a t e g y - o r i e n t e d r e s t r u c t u r i n g a l g o r i t h m s , " P r o g r e s Report 82.1 ,Dept. of EECS, U n i v e r s i t y of C a l i f o r n i a , B e r k e l e y , 1982. [18] J.R. S p i r n and P . J . Denning, "Experiments w i t h program l o c a l i t y , " AFIPS Conf. P r o c , FJCC, v o l . 4 1 , 1972, pp.61 1-621 . [19] C.T. Yu, M.R. S i u , K. Lam, F . T a i , " A d a p t i v e c l u s t e r i n g schemes : g e n e r a l framework," P r o c e e d i n g of COMPSAC c o n f e r e n c e , C h i c a g o , 1981, pp.81-89. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051842/manifest

Comment

Related Items