UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Machine architecture and the programming language BCPL 1978

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
UBC_1978_A6_7 F69.pdf [ 3.51MB ]
UBC_1978_A6_7 F69.pdf
Metadata
JSON: 1.0051782.json
JSON-LD: 1.0051782+ld.json
RDF/XML (Pretty): 1.0051782.xml
RDF/JSON: 1.0051782+rdf.json
Turtle: 1.0051782+rdf-turtle.txt
N-Triples: 1.0051782+rdf-ntriples.txt
Citation
1.0051782.ris

Full Text

MACHINE ARCHITECTURE AND THE PROGRAMMING LANGUAGE BCPL by MASK C. FOX B.Sc, The University of B r i t i s h Columbia, 1975 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE i n THE FACULTY OF GRADUATE STUDIES (Department of Computer Science) We accept t h i s t h e s i s as conforming to the reguired standard. THE UNIVERSITY OF BRITISH COLUMBIA September, 1978 (c) Mark C. , Fox, 1978 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of Brit ish Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of Computer Science The University of Brit ish Columbia 2075 Wesbrook Place Vancouver, Canada V6T 1W5 Date Sept, 12. 1978. i i 0. Abstract This thesis describes the design of a well mapped machine 1 for the language BCPL. Based on a generalized notion of stack machines the SLIM (Stack language f o r Intermediate Machines) machine i s described. As the acronym suggests, representation of BCPL programs i n SLIM i s i n fac t slim compared with other architectures. The u t i l i t y of th i s measure f o r comparison with other architectures i s discussed and some encouraging r e s u l t s presented. Apart from t h i s r e s u l t , some advance i s made i n the c l a s s i c a l mode of porting BCPL programs. Normally the compiler produces OCODE from which INTCODE i s generated. The BCPL SLIM compiler shortcuts t h i s process by generating SLIM d i r e c t l y from the program tree thus dispensing with software corresponding to the OCODE to INTCODE translator. Translation of BCPL programs i s thus s i m p l i f i e d and speeded up. »by well mapped we mean that transformations in the high l e v e l language correspond c l o s e l y to those i n the low l e v e l machine representation. Table of contents i i i 0, ABSTRACT ...... ... 1. INTRODUCTION ... 1.3 Objectives .................................. - - 2. THE SLIM MACHINE DESCRIPTION ....,10 2 .1 Preliminaries ................................ 2.2 Variables 2.3 Operands ................ .. .......... ......... 2.4 Operations ................................... 2.5 An example 3. MACHINE JUSTIFICATION . . . . 3.1 Why choose a stack machine archtectur€? ...... 3.2 Why choose a single accumulator? ............. 3.3 8hy have an ;S register? ...................... 3.4 Why single address? .......................... 4., MEASURES AND RESULTS . ..................... .-.vyy.: 4.1 Ideal Program representation ................. 4*2 M 63.SUZT6S • • • * • • * • • • * « • • • • • • • • • • • • • •- • • • 1̂ • 3 R©suX fcS • * « • • • * * • • • • • • • » • • * * ' * • • • * • • * • • * ' 5. CONCLUSIONS AND DIRECTIONS FOB FURTHER RESEARCH . BIBLIOGRAPHY ..... .............................. APPENDXX X •«••»•••*••••••*§•**••••••*•••••***«•«•**••••*«•*• , 5ty APPENDIX II . ...... .... ..... ...... .. . . .. .. ..... ...... APPENDIX I I I - SLIM system software .................. •* • ' • * * • * . 11 * * * *' • • • . 1 • • • * * • • . 1 • • • • • * • . 3 • » • • • 8 * * f '•• • *' • 10 11 * • • *•' ,12 • • • • • • * . 13 • • • * 19 • * • * • • • 23 • * * * • * • 23 • • # * * * • 26 * ••'*'•' .28 • •»••*• * 31 '* '•• *' *•" 34 • • * * • • • 34 • • • • 35 41 • * • • * • • ,47 52 • • • « • • • 54 • * • 55 • • • •' • • • ,57 L i s t of tables iv I. OCODE and SLIM comparison 43 II. , Optimized EM-1 and SLIM comparison ..................... J»i| I I I . Modified EM-1 and SLIM comparison ............v........,46 L i s t of figures v 2.1 The runtime stack 11 2.2 Stack prior to the c a l l ................................. 20 2.3 Stack after the c a l l .................................... 20 2.4 The routine's Stack Environment 21 2.5 The stack i n the midst of expression evaluation ......... 22 vi Acknowledgements I would l i k e thank the Department for the provision of various Teaching assistantships. My thanks also go to my supervisor Sam Chanson, for his provision of a research assistantship that allowed me to devote more time to t h i s work, and also f o r h i s constructive c r i t i c i s m of my thesis. Harvey Abramson i s also to be thanked for reading t h i s t h e s i s . Outside the department I am grateful for other friends and i n t e r e s t s at UBC p a r t i c u l a r l y the community of Fairmont House that were there during the academic f r u s t r a t i o n s of t h i s past year. For the broader perspective provided by these people I am very thankful. Chapter 1 1• Introduction 1 As with most work t h i s thesis needs to be set i n i t s proper context. The h i s t o r i c a l perspective i s one aspect of t h i s but more importantly there are a number of current issues that give t h i s thesis relevance. We w i l l f i r s t examine these two components of t h i s thesis*s context and then proceed to outline the objectives that governed the r e a l i z a t i o n of the SLIM machine. 1.1 Setting the context The way programming languages are used i s of interest to various people. These people often include systems architects, language designers and compiler writers. In the design of a programming language i t i s useful to know the kind of constructs that are most freguently used. Compiler writers can use t h i s knowledge e f f e c t i v e l y as they decide how much energy to devote to compiling good code for the more common constructs. By good code we mean code that compactly and e f f i c i e n t l y represents the intentions of the language constructs used. For example, the dominance of the assignment statement i n programs i s one candidate for which good code should be compiled. System architects are more interested i n how e f f e c t i v e l y the language maps to the machine and empirical evaluation can lead to some fi n e tuned application oriented architectures.(see f 1 1) Chapter 1 2 A number of people have studied the ways i n which programming languages are used: Tannenbaum [2] has studied a BCPL variant c a l l e d SAL and proposed a simple machine architecture ; Knuth [3.1 has analyzed Fortran programs; Alexander [4] studied XPL as implemented on the IBM and unearthed some inconsistency i n the mapping from XPL to IBM assembler. Wortman [5] studied student PL on an interpretative system and used his analysis to incorporate changes into h i s PL machine. He states:"fle took as our design goal the development of new design too l s to aid the designer i n building computers that s a t i s f i e d the actual rather than the imagined needs of the programmer. *' Gordon and Capstick [6] have examined COBOL. Of course we would be amiss i f we f a i l e d to mention the design of the Burrough's machines which were high l e v e l language machines i n the f i r s t place. An analysis of over 60 ALGOL 60 compilers by Sichmann [7] showed code produced by the Burrough's compiler occupied half the space of code produced by the-IBM comjiler. This l e v e l of approaching the problem from the language point of view has gone hand in hand with the development of microprogramming. Once the manufacturers offered user microprogramming there was a flood of a c t i v i t y i n t h i s f i e l d . , Many machines were designed which were t r u l y meant to be very general purpose (see [81 and [91 for examples) . The Manodata QM-1 was perhaps too f l e x i b l e but the emulation of the PDP-11 on i t [10 1 proved i t to be useful. As usual, Burroughs i n the design of the B1700 series (see [111) seriously and e f f e c t i v e l y attempted to free us Chapter 1 3 from von Neumann s t y l e machines. Hilner states that "Von Neumann derived machines are automatous malefactors who force programmers to l i e on many procrustean beds". Each particular language would have i t s own S (S standing for secondary) machine optimized for i t s own par t i c u l a r application area that would be emulated by B1700 hardware. No machine language was b u i l t into the hardware and therefore each language to be executed had to f i r s t reconfigure the B1700 processor. Concurrent execution of S machines was very feasible with the fast switching time (14 53 microsec) . Apart from t h i s a r c h i t e c t u r a l concern on the l e v e l of machine design, microprogramming i t s e l f was used to measure computer systems., Saal [121 used t h i s very transparent technigue to obtain system design guidelines. Despite a l l these a c t i v i t i e s that brought architecture to the fore as a research area, Rosin who was involved for a long time with microprogramming, was forced to define microprogramming as "the implementation of hopefully reasonable systems through interpretation on unreasonable machines" [13 1 . Even t h i s pessimistic comment should not detract from the overriding concern with machine architecture not just i n i t s e l f as an end but as a means to f a c i l i t a t i n g what people want to do. It i s i n t h i s l i g h t that we should see the development of SLIM. 1•? Current issues Computer science with i t s concern for easy and e f f e c t i v e expression has long been i n the business of generating new Chapter 1 4 languages. Translators conseguently tend to be one of the most freguently used pieces of software and w i l l probably continue to stay that way. Translator writing systems are a manifestation of t h i s f a c t . However as Appelbe f14^ states: "The major complexities encountered i n the design and implementation are usually i n the 'semantics phase* of the translator i n generating the object program from an i n t e r n a l source program representation". This complexity no doubt a r i s e s from a number of sources: Languages that are exceedingly complex and hence inevitably reguire complex code generation. Another more important source of complexity i s that inhospitable host architectures demand contortions on the part of the code generator and hence the implementor. This i s the complete reversal of the s i t u a t i o n i n the parsing-syntax analysis phase where the methods are very well understood. Despite t h i s acknowledgement, computer science has tended to minimize the importance of machine architecture and has often comforted i t s e l f with the fact that the language was implemented and l e f t i t at that. The implementation was the overriding concern and a f t e r a l l , with cheaper memory prices we are net r e a l l y concerned about how e f f i c i e n t l y our programs are represented, are we? This hides the main point. It*s high time that computer science not relegate machine design only to the m i l i t a r y and the a r t i f i c i a l i n t e l l i g e n c e communities where an overwhelming need demands better architecture. He need to refute the notion that the description of a machine's assembly language constitutes i t s complete d e f i n i t i o n . Chapter 1 5 At present, even to the most casual observer, there i s an explosion in the area of microprocessor technology. The market here i s i n a continual state of flux as more and more products are announced. The a v a i l a b i l i t y of b i t s l i c e components makes possible the construction of new machines at reasonable costs. T.M. HcWilliams et a l . [15] describe the construction of a PDP/11 using b i t s l i c e s for a t o t a l cost of $1076., This machine even outperforms the LSI-11. With t h i s continual state of flux industry has adopted a microprocessor chip standard - the I n t e l 8080. In l i g h t of the above fa c t this i s not the most desirable chip from an a r c h i t e c t u r a l point of view. Perhaps t h i s standardization was unavoidable. Another more important standardization of t h i s technology i s the language and i t s implementation. BASIC has become the standard language with a variety of implementations. Its implementation however i s more s i g n i f i c a n t as far as SLIM i s concerned. Interpretation i s the accepted way of implementing Basic. From a very rudimentary analysis of the BASIC source these interpreters interpret BASIC programs at a f a i r l y high l e v e l . Before we draw out the s i g n i f i c a n c e of these facts we s h a l l quote from the Nov 15, 1977 draft of the objectives for computer science i n the department of Computer Science at UBC [16] . "Broadly speaking computer science i s concerned with the design of algorithms and with e f f i c i e n t implementations of algorithms on computing systems. , The computing systems may vary i n size from the hand held programmable calculator to a complex c o l l e c t i o n of devices interconnected by s a t e l l i t e and cable." Chapter 1 6 Let as examine t h i s statement i n the context of high school computing f a c i l i t i e s . The systems that high school students are dealing with are Basic ones i n more than one sense! As a department there are educational issues at stake. What are potential university students i n computer science being exposed to? Is the form of expression r e a l l y suited to developing structured programming? H i l l university programs at the f i r s t year l e v e l involve a certain amount of deprogramming? One attempt to address t h i s issue of teaching computer science as a unified d i s c i p l i n e was Peck's 'Essence of computer science' f 1 7 ]. In t h i s report the language BCPL and i t s abstract machine INTCODE served as a reference point for teaching computer science coherently. In order to execute BCPL programs one interpreted t h e i r INTCODE representation. The major l i m i t a t i o n of the INTCODE system was the s i z e of the INTCODE version of the compiler since i f the compiler cannot f i t on the host system, i t becomes very cumbersome to compile programs on one machine and execute them (via interpretation) on another. SLIM serves exactly the same function as INTCODE except that i t i s much more compact (as we show in Chapter 4) and hence the BCPL SLIM compiler i s more compact, than the INTCODE version . However both forms of r e a l i z i n g BCPL are exactly s i m i l a r to the current form of r e a l i z i n g BASIC - interpretation. Therefore i f computer science i s concerned with the representation of algorithms and BCPL i s accepted as a v a l i d vehicle for t h i s , then i f SLIM f a c i l i t a t e s t h i s process on a wide variety of machines i t should be of concern and use to the department i n r e a l i z i n g one of i t s Chapter 1 7 stated objectives. There seems to be the very r e a l p o s s i b i l i t y that computer science could be making inroads here not only i n the r e a l i z a t i o n of more ef f e c t i v e languaqes but also i n the r e a l i z a t i o n of more e f f e c t i v e hardware as i n the b i t s l i c e ver s i on of the PDP/11. One f i n a l issue involves the language BCPL (see [20], [21] and [22]) .We w i l l just present the case for t h i s class of language. By class we mean systems programming languages cf the BCPL type. OS-6 [19], a single user operating system was written almost completely i n BCPL. C, a BCPL offshoot with PDP/11 constructions that have f i l t e r e d up into the language, i s the source langauge for a very e f f e c t i v e operatinq system UNIX, [23] written f o r the PDP/11. This demonstrates the u t i l i t y and effectiveness of t h i s class of language. Therefore our concern with i t i s not misplaced. A more l o c a l BCPL issue concerns the c l a s s i c a l form of translation to INTCODE., BCPL source i s f i r s t translated to OCODE and from this one translates to INTCODE. SLIM i s generated d i r e c t l y from the tree representation of the BCPL program. A number of advantages accrue: i . One complex piece of software (OCODE to INTCODE translator) i s dispensed with. i i . The obscure OCODE machine no longer confronts us. i i i . The r e a l i z a t i o n that tr a n s l a t i o n from the tree to SLIM i s straightforward and that simple optimizations are e a s i l y handled. For too long we have been stuck with the BCPL-> OCODE -> INTCODE Chapter 1 process. This thesis shows necessary, and more importantly machines are not i d e a l l y suited 8 that t h i s procedure i s no longer shows that these two v i r t u a l to BCPL. 1. 3 Objectives The prime objective was to achieve compact program representation. By compact program representation we mean that the s i z e of the translated program i s small. This i s always a convenient res u l t but more importantly i t r e f l e c t s how e f f e c t i v e the mapping i s from language to machine., The choice of t h i s measure rather than time for example, w i l l be dealt with l a t e r when we address the issue of how one actually evaluates an architecture. Presently there i s no standard evaluation technigue that allows for program independent evaluation of an architecture. Simplicity was the second c r i t e r i a . Partly t h i s i s a reaction against the trend that dictates complexity to be the norm, but a simple architecture has a number of advantages. Simple architectures are more easily understood and can exemplify a r c h i t e c t u r a l p r i n c i p l e s . If interpretation i s going to be the sole method of executing BCPL programs then the simpler the machine to be emulated the simpler the emulator. As t h i s most probably w i l l be the form of execution i n the microprocessor f i e l d and since the emulator w i l l most probably be written i n assembler, the ease with which the interpreter i s implemented i s very important. I f we extend the notion of Chapter 1 9 interpretation and temporarily equate i t with microprogramming, then the same advantages apply except that in t h i s case the corresponding assembly language i s more primitive. I f one actually intends to glue together the SLIM machine using b i t s l i c e technology then the simpler the machine the better. A t h i r d guideline was that the machine acknowledge the existence of BCPL r i g h t at the machine l e v e l . Thus i n e f f e c t the SLIH machine i s as e a s i l y described by BCPL as i t i s by the SLIH assembly language. For example the CALL i n s t r u c t i o n should actually do more than just change the program counter and remember i t s previous value. The IBM 370 BALB inst r u c t i o n i s a c l a s s i c example of refusing to acknowledge that a programmers common forms of expression are at a much higher l e v e l and that changing control i s much more than a change i n location. This guideline hopes to demonstrate that the tr a n s l a t i o n process can be straightforward and not always involve numerous contortions in the code generation section of the translator. Sith these objectives in mind, and the context i n which t h i s thesis i s set outlined, we now proceed to describe the SLIM machine in an informal manner. Chapter 2 10 2 . The SLIM machine description This machine i s offered as an alternative to the current way of porting BCPL programs. This section contains a s p e c i f i c a t i o n of the machine but o f f e r s no j u s t i f i c a t i c n or motivation for the choice of SLIM. However as the acronym suggests (a Stack Language for Intermediate Machines) , one advantage should be that of compact program representation or SLIM code. 2*1 P r e l i f f l i l i a r i g s The memory of the SLIM machine l i k e INTCODE [ 2 2 1 and PICA-B [ 2 6 ] , consists of an array of c e l l s numbered from zero. The consecutive numbers assigned to c e l l s are known as th e i r addresses.. The number of b i t s per c e l l i s unspecified. The SLIM machine has f i v e r e g i s t e r s . One accumulator (ACC) i s used for a l l arithmetic, l o g i c a l and various other operations. P i s used as an index register and points to the base of the current stack frame and S points to the top of the current stack frame. {Figure 2.1) The C r e g i s t e r i s used as a program counter. G i s used to access the base of the location of the global variables. We w i l l j u s t i f y t h i s choice l a t e r , but for the present we w i l l give some examples of eguivalent constructions in present day architectures.. The HP - 2 l M X*s base page addressing functions i n very much the same way. Even the PDP-8*s zeroth page addressing, in which every page can access the zeroth page, i s a form of global variable access. Two more respected architectures- the HP 3000 and the B5500 (see [ 2 4 1 ) Chapter 2 11 have si m i l a r features. The IB r e g i s t e r i n the HP3000 points to the case of the global area that i s in fact kept at the bottom of the stack. The 5500*s R re g i s t e r points to the separate ( i . e . , not i n the stack) program reference table that contains global variables and global procedure names. To use the current terminology the SLIM machine i s a single accumulator, one address machine. Previous stack frames Current stack frame r~ r 1 1 1 1 | ]Linkage JParametersJ Local | .... | I I Info | |variables J j • J 1 a x i I I i I I I P S Figure 2.1 The runtime stack J2..2 Variables Before describing the operations provided by the SLIM machine we w i l l look at the four sets of variables in BCPL and describe how they may be accessed. BCPL (a modified version) has four sets of variables - l o c a l , s t a t i c , global and external although i n s t r u c t i o n s need only be provided to access the f i r s t three. Local variables (see figure 2.1) are allocated space above the parameters on the runtime stack. They are accessed r e l a t i v e Chapter 2 12 to P and hence we see how P serves as an index register. Notice that once the current stack frame i s released, space for a l l l o c a l variables vanishes as well. S t a t i c variables are accessed by reference to some label and unlike l o c a l variables remain accessible throughout program execution. The syntax of a l a b e l i s simply a number followed by a colon. External variables are accessed as i f they were s t a t i c variables except that information i s provided for the loader so that i t can resolve these references. Externals can be thought of as being another program's s t a t i c variables. Global variables are accessed r e l a t i v e to the G register. This space i s reserved by the runtime support for the pa r t i c u l a r BCPL program and t h i s support also i n i t i a l i z e s the G register. 2.3 Operands The syntax of a l l operands i s as follows. CI 3 [P | G | L] < integer > | * P refers to the stack pointer. The contents of P i s added to <integer> to obtain the address of a particular variable (parameter or local) on the current stack frame. G i s interpreted s i m i l a r l y except that i t refers to the global pointer. L denotes a p a r t i c u l a r l a b e l (e.g., Ln) . The operand Chapter 2 13 i n t h i s case i s treated as the address of the p a r t i c u l a r label. I means i n d i r e c t i o n . The operand determined thus far i s dereferenced once. The * refers to the top of the stack ( i . e . , the location pointed at by register S ) . In t h i s case the contents of S-1 becomes the operand and S i s decremented by 1. Some examples follow: P2 address of c e l l at o f f s e t 2 from P IP2 value of c e l l at offset 2 from P 13 address of c e l l denoted by lab e l 3 IL3 value of c e l l pointed to by label 3 2.4 Operations a. Variable access operators four operations are used ; Load (LD) , store (STORE), stack and load (STKLD), and select f i e l d and store (SLCTST). (e.g. value of a s t a t i c variable) * value of a temporary variable LB operand ACC := operand STORE operand location(operand) := ACC STKLD operand !S := ACC S : = S + 1 ACC := operand Chapter 2 14 SLCTST f i e l d s e l e c t o r s e l e c t appropriate f i e l d of the value i n the accumulator and store them i n the correct f i e l d of the c e l l specified by i t s address at the top of the stack, (i.e !{S-1)) Then decrement S by 1. Some sample variable loads and stores are i l l u s t r a t e d , temporary: LD * l o c a l : LD IPn STORE Pn s t a t i c : LD ILn STORE Ln Notice that LD Pn loads the address of the l o c a l variable not the value. b. Diadic expression operators A l l operators can be defined as follows. op operand ACC : = ACC op operand Integer operators - MULT,DIV,REM,PLUS,MINUS MDIT, DIV, PLUS and MINUS are as expected. REM i s the integer remainder on the d i v i s i o n of the ACC by the operand. Chapter 2 15 Relational operators - EQ, N E,L S,GR,LE,G E EQ and NE are equal and not equal. LS and GR are less than and greater than, LE and GE are less than or egual to and greater than or egual to. Logical operators - LSHIFT,RSHIFT,LOGAND,LOGOfi,EQV,EXOR L and RSHIFT are the l e f t and right s h i f t operators. LOGAND and LOGOR are the l o g i c a l AND and OR. EXOR i s the exclusive OR or bitwise non-equivalence. . EQV i s bitwise equivalence., Bit operators - SLCTAP This applies to f i e l d selectors i n the BCPL sense. The appropriate f i e l d i s selected. c., S regis t e r manipulation To allow for f l e x i b l e manipulation of the S register a combination of monadic and diadic operators are defined. This allows one to set the S reg i s t e r in a r e l a t i v e ( i . e . SREL, SR.ELI) or absolute sense. (SSET, SSETI) . I f one allows an extended BCPL i n which dynamic storage a l l o c a t i o n i s implemented then i t i s mandatory that the S register be manipulated in a Chapter 2 16 r e l a t i v e sense. I t i s true that a l l l o c a l variables are i n a sense dynamically allocated but at present the sizes of vectors must be fixed at compile time. This a b i l i t y to determine run time sizes of vectors i s what we mean by dynamic a l l o c a t i o n . SGET ACC := S SSET S := ACC SHEL S := S + ACC SSETI S := operand SBEII S := S • operand d. Monadic operators NEG ACC := - ACC NOT ACC := not ACC DEBEF ACC := !ACC POSH IS := ACC ; S := S + 1 POP ACC := !(S - 1) ; S := S - 1 TRUE ACC:= TBDE FALSE ACC := FALSE FINISH FINISH e. . Transfer GOTO C := ACC JUMP operand C := operand Chapter 2 17 JT operand C := (ACC = TRUE) -> operand, C JF operand C := (ACC = FALSE) .•-> operand,C The switchon statement i s implemented by the SWITCHON command; SWITCHON k Ld k1 11 k2 12 ...... kk l k The accumulator controls the switch; i t examines the k case constants i n l e f t to right order and when a match occurs then a jump i s made to the corresponding case l a b e l , otherwise a lump i s made to the default l a b e l Ld. f. Function and routine c a l l i n g No difference i s made between the c a l l and return instructions for a function or a routine. When a function i s ca l l e d i t returns i t s r e s u l t i n the ACC., Prior to a c a l l space should be saved for the l i n k s , (savespacesize denotes t h i s number) the i parameters pushed on the stack and the address of the routine loaded into the ACC. The CALL i instruction has the following effect temp := S - ( i • savespacesize) temp!0 := C temp!1 := P P := temp C := ACC Chapter 2 18 The 8T8N in s t r u c t i o n i s as follows C := P!0 S := P P := P! 1 The routine c a l l e d , on entry i s responsible to set the S regi s t e r so that i t points as i n Figure 1 (above the para meters and preliminary l o c a l variables). g. Pseudo instructions <number>: denotes la b e l <number>. Sname denotes the s t a r t of the code for the routine c a l l e d name. SECTION "section name" indicates the start of the code for the section "section name". END indicates the end of a section*s code h.. Data reservation instructions There i s one general purpose d i r e c t i v e used to reserve space. This i s the DATA operator.„ I t s operands can be numbers, characters (enclosed i n single quotes), strings {enclosed in double quotes) or labels.. I t reserves space i n the subsequent c e l l s f o r i t s operands. Chapter 2 1 9 2.5 An Example The following program and i t s SLIM code w i l l serve as an example. I I I I j | t h i s i s the sample BCPL program I I l e t f ( a , b # c) be f l e t v = vec 4 and w = 0 v!1 : = (a • b) *(b + c) } and s t a r t (} be f ( 1 , 2, 3) I I || t h i s i s the translated version of the above program into SLIM: || I t i s an exact reproduction of the slim compiler output. I I SSETI P2 JUMP L5 Sf 1: SSETI P5 LD P7 STKLD 0 PUSH SSETI P12 LD IP5 PLUS 1 STKLD IP3 PLUS IP4 STKLD IP2 PLUS IP3 MULT * STORE * RTRN Sstart 3: SSETI P2 SRELI 2 LD 1 STKLD 2 STKLD 3 STKLD 112 CALL 3 RTRN 5: FINISH 2: DATA L1 4: DATA 13 END This example serves to i l l u s t r a t e c a l l mechanism; vector a l l o c a t i o n i . , The c a l l mechanism The prelude before the actual for the l i n k s and evaluating the saved by the instruction SRELI 2. three features i n SLIM: the and expression evaluation. c a l l involves saving space parameters. Linkage space i s Here two c e l l s are l e f t to Chapter 2 20 contain the previous stack frame pointer and program counter. The three parameters are evaluated and the address of " f " loaded into the ACC. At t h i s point the stack i s as in Figure 2.2. , T T r J 1 1 1 I j old | nev |1 |2 |3 | | 1 ] l i n k s | l i n k s ! | | | j t—— i 1 1 1 j * r t I 1 i P S Figure 2.2 Stack prior to c a l l The effect of the CALL 3 i n s t r u c t i o n can be pictured as in Figure 2.3. t r 1 , — 1 1 , , 1 I old J new |1 |2 |3 | | I J l i n k s | l i n k s | t i l l 1 1 . 1 i JL J 1 1 A A i I 1 I P S Figure 2.3 Stack after c a l l Notice how the P pointer has changed and we are now executing with a new stack frame. As mentioned previously, the routine c a l l e d i s responsible to ensure that S actually points where i t should, hence the SSETI P5 i n s t r u c t i o n . This i s necessary since one can c a l l routines with fewer parameters than they expect and i f the S r e g i s t e r i s not corrected, not only w i l l further l o c a l variable a l l o c a t i o n be completely incorrect {within the current Chapter 2 21 procedure) , but any temporaries used w i l l map onto existing l o c a l variables and cause havoc. i i , l o c a l variable a l l o c a t i o n This involves setting variables to their i n i t i a l values and also a l l o c a t i n g space. The routine expects i t s environment to be as in Figure 2.4, (the numbers on the top denote stack frame offsets) 2 3 4 5 6 , , 1 j—T—T~1 1 I |links|a|b|c|v|w| | t i I I I 1 I I I L J J I I I I | 1 * I 1 I 1 1 P s Figure 2.4 The routine*s stack environment The SSETI instruction adjusts the S r e g i s t e r appropriately. At t h i s point o f f s e t 5 and 6 from the current stack frame pointer reserve space for the variables v and w except that they have not been i n i t i a l i z e d . Since v by d e f i n i t i o n w i l l contain the address of a vector of size 5 the LD P7 STKLD 0 sequence accomplishes t h i s . w i s i n i t i a l i z e d via the POSH in s t r u c t i o n since the previous i n s t r u c t i o n has already loaded zero into the accumulator. A l l i s well except that we must indicate somehow that we have used 5 more c e l l s for the vector v. SSETI P12 adjusts S to r e f l e c t t h i s f a c t . Chapter 2 22 i i i . Expression evaluation At t h i s time we w i l l concentrate on the sequence of slim code that evaluates {a+b) *(b*c) LD IP3 PLOS IP4 evaluates b+c. However at th i s stage we need to save t h i s r e s u l t in some temporary location. The STKLD IP2 PLUS IP3 accomplishes t h i s (via STKLD) while at the same time evaluating a+b. At t h i s point the accumulator contains a*b and the stack contains the following: r 1—I 1 | . . . . . . . . JH|b+c| » I i S Figure 2.5 The stack in midst of expression evaluation Now a l l that remains to be done i s r e t r i e v e the temporary result and multiply i t by the accumulator. MULT * accomplishes th i s and leaves the stack how i t was. Although t h i s i s not p a r t i c u l a r l y convincing one must admit to the r e l a t i v e ease with which temporaries are handled. Chapter 3 compares the amount of code generated f o r the above expression using a pure stack machine with that generated by the SLIM compiler. Chapter 3 23 J * Machine J u s t i f i c a t i o n In t h i s chapter some j u s t i f i c a t i o n for the choice of the SLIM machine i s outlined. Since a normal defense would consist of responding to several guestions regarding the choice of particular features, t h i s i s the form t h i s chapter w i l l take! 3.1 why choose a stack machine architecture? We w i l l f i r s t outline a more generalized notion of what we mean by a stack machine. By a stack machine we w i l l mean a machine i n which a hardware stack plays a central r o l e in expression evaluation, storage a l l o c a t i o n and subroutine control and linkage. We w i l l not reguire a machine to be a stack machine i f and only i f most instructions operate on operands held at the top of the stack. Software has made use of stacks for a long time but most computers lack hardware stacks. As the trend to develop software i n higher l e v e l languages develops we are now witnessing hardware acknowledgement of t h i s f a c t with the advent of hardware stacks. The HP 3000, the Burrroughs machines (B1700, B5500, B6700 and 7700) , the Data General Eclipse and the PDP-11 to a more limited extent are just some of the machines with some form of hardware stack. I t i s i n t h i s context of higher l e v e l language use that we w i l l outline some of the advantages of stack machines. A key concept i n software i s the subroutine. Some people s t i l l argue that e f f e c t i v e use of subroutines {i.e., good Chapter 3 24 structure) i s wasteful of time and space. This i s natural since as Bulman £24] says, "the subroutine c a l l and return mechanism seems to be almost an afterthought i n the architecture of many computers." The stack machine nips t h i s argument i n the bud. The best mechanism for the subroutine c a l l and return mechanism i s to involve the stack to store the return address. The stack then contains the record of the nesting of procedure c a l l s and one no longer has to worry about saving space for the return address. This l a s t issue has been treated in many ways and points up another advantage of stack architectures. Often t h i s return address has been saved i n a regi s t e r or worse s t i l l a l o c a l memory location. Both these methods however reguire extra software i f one allows recursion or reentrant routines. The programmer becomes responsible f o r stashing this return address somewhere before the next routine (and in recursion i t i s the same one) i s invoked,, Stack architectures remove t h i s concern from the programmer and in f a c t i t i s hard not to write reentrant programs when using a stack. Parameters are treated e f f i c i e n t l y in a stack architecture. What better place for them than on the stack? Many other methods that specify that space be permanently allocated to each subroutine for i t s parameters or that space be shared, again s h i f t the burden for the management of t h i s space onto the programmer. Stacking the parameters at once removes th i s concern from the programmer and also uses the space only when i t i s reguired. Chapter 3 25 another key advantage of stack architectures i s that they automatically provide l o c a l environments. T y p i c a l l y a subprogram refers to only a small subset of a l l i d e n t i f i e r s declared i n the whole program. In the BCPL case one only refers to l o c a l or global variables {these include s t a t i c s and externals) . S i n c e these l o c a l variables are only referenced in the procedure in which they are defined i t seems wasteful to have space permanently allocated for these variables when the procedure i s not active. Allocating t h i s space on the stack i n the l o c a l environment also accomplishes something else. Since the l o c a l environment i s accessed r e l a t i v e to some environment pointer (P i n SLIM's case) addresses for these variables need only specify o f f s e t s from t h i s environment pointer. Since these offs e t s are t y p i c a l l y small (95% < 10) , in s t r u c t i o n s reguire fewer b i t s . Hence program space i s saved. Program space saving i s also accomplished by reguiring no i m p l i c i t addresses for those variables that are i m p l i c i t . Addresses are of two kinds i n a machine: e x p l i c i t - those variables e x p l i c i t l y mentioned by the program; and i m p l i c i t - those that arise out of the need for some temporary storage location. These are automatically provided by stack architectures and their reference just involves referencing the top of the stack which requires no i m p l i c i t address b i t s . Once aqain code i s compacted. Global variable access also requires fewer b i t s since they are accessed r e l a t i v e to some global environment pointer. Another advantage of stack architectures i s that they exhibit the difference between program and task. Using Chapter 3 26 Organick*s [26] terminology, an incarnation of a task i s a combination of a time invariant algorithm(the code) and the time varying record of execution. The stack embodies t h i s record and hence a task can be seen as code plus stack, Processes then can be e a s i l y conceived of as some code plus some stack area for the par t i c u l a r process. Process switching then only involves transfer of control and the provision of some space to contain the time varying record of execution. Interrupt handling can also be treated e f f e c t i v e l y as unexpected procedure c a l l s . Since we know the l i m i t of the stack the interrupt can be serviced transparent to whatever was executing at the time. 3.2 Why choose a single accumulator? Simplicity i s the main reason, A single accumulator i s a l l one r e a l l y needs. I n b u i l t registers l i k e the P, S and G reg i s t e r s provide the index functions that one normally i s provided with except that the P and G are automatically maintained. In an environment of short procedures Tanenbaum [2] concludes "the register sets provided by a t h i r d generation machine are of l i t t l e value". They can be used for intermediate r e s u l t s but with the stack mechanism (see chapter 2) one reg i s t e r i s s u f f i c i e n t . In Tanenbaum's environment where one out of every four statements i s a procedure c a l l the save-restore overhead makes i t i n e f f i c i e n t to use reg i s t e r s to hold l o c a l variables. When one considers what i s involved i n Chapter 3 27 process switching the smaller the number of states associated with a process, the quicker and easier i t becomes to implement process switching. Having considered why we choose to minimize the number of reg i s t e r s one perhaps wonders why we did not go the stack machine route completely and eliminate the ACC altogether. This w i l l be addressed in section 3.4., but perhaps we can outline an equivalence of SLIM and an addressable top of stack location on a pure stack machine. Consider the following expression and i t s equivalent evaluation by three machines: SLIM, a pure stack machine and a modified stack machine as above. (A + B) * (C •* D) Pure Stack SLIM Modified Stack LOAD A LD A LOAD A LOAD B PLUS B PLUS B PLUS STKLD C LOAD C LOAD C PLUS D PLUS D LOAD D MULT * MULT PLUS TIMES The modified stack machine can be thought of as a machine with a f l o a t i n g ACC i n SLIM's sense. This ACC i s actually the current top of stack. Comparing th i s and SLIM code one notices Chapter 3 28 the s i m i l a r i t y except that there are two versions of each diadic operator i n the case of the modified stack machine: one that requires an operand (e.g., PLUS B S PLUS D) and one that takes both i t s operands from the stack (e.g., MULT) . This introduces a further complexity i n t o the machine when one has 2 versions of each diadic operator. With 17 diadic operators t h i s i s quite s i g n i f i c a n t since these extra operators have to be encoded. This might require extra b i t s i n the opcode f i e l d for the in s t r u c t i o n leaving l e s s space to encode the operands. SLIM however only has one extra operator (from a stack machine's viewpoint) - STKLD. Another factor that favours the single ACC machine i s the necessity of handling environments that return values. The two par t i c u l a r instances of t h i s i n BCPL are the function and the valof block. In both cases some resu l t computed at the top of the current stack frame(in a pure stack or modified stack machine) must be passed to the preceeding environment while at the same time collapsing the present environment. In the function case t h i s reguires an extra operator FNHN to do precisely t h i s . ; The valof block uses the RSTACK operator. This unecessarily adds to the complexity of the machine.SLIM only needs to return any value i n the accumulator and hence requires no extra operators. 3.3 Why have an S regist e r ? This register always points to the top of the stack and hence indicates the next possible unused stack location. There Chapter 3 29 are three main reasons why t h i s register i s made e x p l i c i t . Interrupts can be cleanly handled since the S register always indicates where a new stack frame could begin. Hence the interrupt hardware need only f i l l i n the l i n k s as i n a normal c a l l s t a r t i n g at where the S register points. The K register for the PICA-B machine [25] functions in much the same way. Dynamic storage a l l o c a t i o n i s another major reason why one needs the s register. I t i s when one does not know the size of the current stack frame (e.g., with dynamic vectors) that one needs to be able to manipulate the S register i n a r e l a t i v e manner. One cannot just use offs e t s from P since these off s e t s are only known at run time. The CALL instruction i s a r e l a t i v e type of in s t r u c t i o n i n the sense that the "n" s p e c i f i e s the number of parameters passed as opposed to the corresponding INTCODE ins t r u c t i o n K d where the d s p e c i f i e s the siz e of the c a l l e r s stack frame. Standard BCPL does not allow f o r dynamic storage a l l o c a t i o n (neither does the SLIM version of the compiler) but f o r the ease with which this could be achieved we present a BCPL fragment and the corresponding SLIM code. From t h i s , one w i l l hopefully appreciate the usefulness of the S re g i s t e r . Chapter 3 30 Extended BCPL., l e t v1 = vec <expr1> and v2 = vec <expr2> and c = 0 SLIH code. SSETI Pn SGET STORE Pv1 code to evaluate <expr1> • SREL SGET STORE Pv2 code to evaluate <expr2> SREL LD 0 STORE PC set S to point above space f o r vars get t h i s value i n the accumulator make v1 point to i t s space adjust S by the value of <expr1> make v2 point to i t s space make S point to free space i n i t i a l i z e c Notice that t h i s code sequence d i f f e r s from the example in chapter 2 since there we knew sizes e x p l i c i t l y at compile time and hence could compile more e f f i c i e n t code. A t h i r d reason i s that the S reqi s t e r i s the means of generating and r e t r i e v i n g i m p l i c i t variables that are reguired. The SLIM operators POSH & STKLD and the operand * are the means of r e a l i z i n g t h i s very valuable feature. Chapter 3 31 3.4 Why single address? When one's prime objective i n the design of a simple machine i s that programs be represented e f f i c i e n t l y , considerable thought must be given to the number of addresses an ins t r u c t i o n should contain. The greater the number of addresses the larger the in s t r u c t i o n size and hence a larger t o t a l program size i s more l i k e l y . The number of addresses per inst r u c t i o n can vary from three to none i n a pure stack machine, As Ibbett et e l . f27] state there are a number of c o n f l i c t i n g virtues related to the various p o s s i b i l i t i e s . "Simple operations such as the sett i n g and incrementing of variables are more concisely described by two and three address schemes. Evaluations of longer expressions are more concisely defined by zero address and one address systems, however, because the address i n which the res u l t i s accumulating i s implied." By considering some sample expressions a choice was made to u t i l i z e the one address scheme. This i s made possible by the provision of the STKLD in s t r u c t i o n which f i r s t stacks the accumulator contents, together with the * operand which provides a way to access stacked p a r t i a l results.- This maintains the valuable features of a stack machine while providing more compact code. In the following two examples two measures are used: the number of words in the machine independent sense where there i s one ins t r u c t i o n per word; the number of bytes in the more applied sense. A comparison of the t o t a l sizes demonstrates the sup e r i o r i t y of the one address scheme. Chapter 3 EXAMPLE 1. Comparison of SLIM and a stack machine <A + B) *(C + D) Stack machine words bytes SLIM words bytes LOAD A LOAD B ELDS LD C LOAD D PLUS MOLT Total: 2 2 1 2 2 1 1 11 LD A PLUS B STKLD C PLUS D MULT * Tot a l : 2 2 2 2 1 9 EXAMPLE 2. Comparison of SLIM and a stack machine A + B Stack machine words bytes SLIM words bytes LOAD A 1 2 LD A 1 2 LOAD B 1 2 PLUS B 1 2 PLUS 1 1 TOT&L: 2 H TOTAL: 3 5 Chapter 3 33 Haying i l l u s t r a t e d a simple comparison above the rest of t h i s thesis w i l l attempt to compare the SLIM machine with other exis t i n g architectures. We w i l l f i r s t discuss the issue of what measure to use and then demonstrate that the f i r s t objective in the design of SLIM has been achieved. Chapter Four 34 2» Measures and r e s u l t s 4.1 Ideal Program representation Be are now at the stage where you may be asking - so what? The . context of the design has been sketched and the machine described and verbally j u s t i f i e d . But how can we evaluate t h i s architecture? This i s what concerns us in t h i s chapter. He w i l l examine some general aspects of measures, describe three s p e c i f i c a l l y and then proceed to use the chosen measure to compare our architecture with two other architectures. What should be included i n a measure? One obvious component i s that the measure be objective; something that can be precisely quantified. Unfortunately non-quantitative measures generally tend to receive l i t t l e merit. Somehow one feels that the measure should also incorporate the space time product. Space qenerally meaning proqram size, and time beinq some measure of hardware e f f i c i e n c y . However t h i s space component could j u s t i f i a b l y include items such as compiler size, s i z e of the runtime support etc. Somewhere one has to draw the l i m i t . A more imporatnt issue i s concerned with whether one can evaluate architectures just on the basis of t h e i r desiqn without any reqard for what use w i l l be made of them. Or more prec i s e l y : Can architectures be evaluated i n a proqram independent fashion? In the next section we w i l l consider two not s t r i c t l y proqram independent measures and one s t r i c t l y proqram dependent measure, . Chapter Four 4.2 Measures 35 i . Flynn's measures Flynn [ 2 8 3 compares an architecture against what he takes to be an ultimately simple, f u l l y e x p l i c i t architecture. As he states: "In a simple architecture nothing i s implied - no registers or counters are i n v i s i b l e to the problem state programmer., Each i n s t r u c t i o n contains an operation, the f u l l generalized address s p e c i f i c a t i o n (allowing i f necessary multiple l e v e l s of i n d i r e c t i o n through tables etc.) f o r both source operands, a re s u l t operand, and a test of the r e s u l t which selects an address for the next in s t r u c t i o n " . He then c l a s s i f i e s instructions into three broad categories. i3 instructions are memory p a r t i t i o n movement instructions such as the LOAD and STORE inst r u c t i o n s which move data items within a storage hierarchy. 2 instructions are procedural in s t r u c t i o n s which perform functions associated with i n s t r u c t i o n seguencing, i . e . , TEST, BRANCH, COMPARE etc., but perform no transformation on data. F instructions perform computational functions i n that they operate on data. They include arithmetic operations of a l l types, as well as l o g i c a l and s h i f t i n g operations Tc Flynn M and P instructions are overhead i n s t r u c t i o n s whereas F type instructions are the only ones that do any work. Therefore the three r a t i o s to measure th i s overhead are: Chapter Four 36 1. M r a t i o : ration of M to F instructions 2. P r a t i o : r a t i o of P to F instructions 3. NF r a t i o : r a t i o of the sum of M and P instructions to F instructions An i d e a l machine would have M = P = NF = 0. Flynn uses these r a t i o s to evaluate the IBM 7090, system 360 and the PDP 10. i i . Instruction mixes This i s the frequency d i s t r i b u t i o n of the types of instructions executed durinq the processinq of a workload. The best known published example i s the Gibson mix., Gibson obtained frequencies i n t h i s mix from an analysis of the use of instructions i n technical and s c i e n t i f i c applications i n IBM 7090 i n s t a l l a t i o n s . Flynn has obtained a mix appropriate to system 360 i n s t a l l a t i o n s . These mixes are used to evaluate architectures primarily by providinq time measures. The frequency of inst r u c t i o n use i n the pa r t i c u l a r class i s multiplied by the average i n s t r u c t i o n execution time i n t h i s c l a s s and these summed for a l l classes i n the mix. The result of average instruction execution time i s taken to be a measure of the architecture and used for comparison purposes. i i i . Program representation s i z e Given a program or a representative set of proqrams i n some hiqh l e v e l lanquaqe, one translates these proqrams to machine lanquaqe proqrams f o r various machines., The space required by Chapter Four 37 the object programs i s used for comparison among the various machines. The smaller the space reguired f o r the code the better the machine architecture according to t h i s measure. This measure i s used by Tanenbaum and i s the one we w i l l use and j u s t i f y shortly. At t h i s point one needs to recognize that in some hiqh l e v e l language translations the machine code contains a large number of i m p l i c i t as opposed to e x p l i c i t subroutine c a l l s to b u i l t i n l i b r a r y functions. E x p l i c i t c a l l s to l i b r a r y functions are those which the program d i r e c t l y s p e c i f i e s . As a result the machine code may be small but the percentage of i m p l i c i t subroutine c a l l s may be high. What t h i s a c t ually points out i s that the code for the b u i l t in l i b r a r y functions i s the microcode for the instructions reguired by the higher l e v e l language. This r e f l e c t s the fact that the machine at the current l e v e l i s not suited to the particular language. For t h i s same measure to be used in cases l i k e t h i s , each i m p l i c i t l i b r a r y c a l l should count for the number of words i n the code of that l i b r a r y c a l l , not just as one subroutine c a l l . We w i l l now b r i e f l y comment on these measures i n l i g h t of the guestion raised previously: Can architectures be evaluated in a program independent fashion? The underlying issue here i s to guage how e f f e c t i v e l y the machine accomplishes i t s purpose. By machine we mean a configuration of the micro architecture that r e a l i z e s an inst r u c t i o n set. In many cases t h i s configuration i s hardwired but in others (e.g., the B1700) one can dynamically Chapter Four 38 reconfigure the micro architecture. By purpose we mean how the machine f a c i l i t a t e s what people want to do. This of course i s accomplished at a number of l e v e l s : modes of expression available ( i . e . , programming languages); software packages; operating systems; programming environments (batch or timeshared) etc. What we are more concerned with here, i s the primary l e v e l concerning modes of expression. What people want to do i s most often expressed algorithmically i n some high l e v e l language. Thus programs written in a high l e v e l language are the primary vehicle of conveyinq people's intent to the machine. Therefore we w i l l assume that programs i n some programming language or languages are a good indication of the use of a machine. Bote that we are not tying the expression of algorithms to one p a r t i c u l a r language. Rather we are suggesting that much has been learned about algorithms and ways to represent them i n programming languages. This makes programs i n a given class of languages representative of what people want to do, and hence machines should be evaluated with respect to a given class of languages. From th i s perspective the use of the machine i s the common denominator i n an evaluation not some general notions of machine design. We are now l e f t with the guestion of how to e f f e c t i v e l y and precisely measure how well mapped the machine i s . By well mapped match we mean how concisely transformations (or state transitions) in the high l e v e l language are represented i n the lower l e v e l machine. The more concise t h i s representation the better mapped the machine. This i s the bias we have i n choosing our measure of program Chapter Four 39 representation s i z e ( i . e , code space). Flynn's measures c l e a r l y emphasize the functional c h a r a c t e r i s t i c s of an architecture. He attempts to compare archtectures s o l e l y from the functional a r c h i t e c t u r a l viewpoint. He i s not s t r i c t y comparing architectures in a program dependent manner, one however could possibly argue that his simple machine i s actually the most optimal representation f o r programs provided one accepts his d e f i n i t i o n of optimality. ( i . e . no overhead) This measure however i s more concerned with validating or invalidating the following thesis: Machine design has strived towards decreasing memory references, (e.g., of instructions and t h e i r operands) but t h i s has introduced considerable overhead. This overhead i s a result of making several e x p l i c i t functions (in Flynn's simple machine case) i m p l i c i t . Two cases of t h i s overhead are: i . The treatment of programs as l i n e a r strings and consequently maintaining the proqram counter i m p l i c i t l y . i i . The introduction of r e g i s t e r s to hold operands in l o c a l store and not i n main memory. The former case has introduced the whole ranqe of branch instructions whereas the l a t t e r has introduced the Store and Load variations. After makinq some measurements of various computer architectures Flynn concludes that in fact the overhead i s considerable. As we can see, the emphasis i n Flynn's measures of measuring th i s overhead i s not d i r e c t l y concerned with how well mapped the machine i s . Therefore we w i l l not use i t . Chapter Four 40 Instruction mixes b a s i c a l l y are a test of hardware. After one has derived a suitable mix one i s generally interested in average i n s t r u c t i o n execution time or some such time oriented measure. Although t h i s i s useful i t subtly incorporates the variables of the technology used and the encoding of i n s t r u c t i o n s . This l a t t e r variable greatly a f f e c t s the complexity of the microcode and hence the speed. (We assume a microcoded implementation of an instruction set.) These variables do not r e a l l y give an indication of how well mapped the machine i s . , Even a f a s t average i n s t r u c t i o n execution time does not necessarily guarantee anything. I f a l l one has i s f a s t i n s t r u c t i o n s that do nothing, the increase in instructions needed to dc something useful w i l l d e f i n i t e l y detract from any advantage speed might have i n i t i a l l y provided. In other words the power of an i n s t r u c t i o n i s not necessarily taken into account. This power i s representative i n some sense of what you would l i k e to do and since i n s t r u c t i o n mixes measure t h i s poorly we w i l l not use t h i s measure either. Also because d i f f e r e n t machines (and hence instruction sets) produce d i f f e r e n t user c h a r a c t e r i s t i c s , i t i s not c l e a r that the same in s t r u c t i o n mix i s applicable to a l l machines under consideration. We w i l l now outline the reasons for our choice of the size of programs as our measure. Small representation of programs ( i . e . , code space) c l e a r l y r e f l e c t s a well mapped machine., If some other machine reguires more code space for the same program Chapter Four 41 then t h i s i s i n d i c a t i v e of the need for more state t r a n s i t i o n s i n the machine than actually reguired by the program. In other words the machine i s p a r t i a l l y mapped. In our s p e c i f i c case we wish to show that by t h i s measure for the language BCPL, the SLIM machine i s a well mapped machine, well suited to BCPL. Secondly si z e and space are intertwined. The smaller the program the faster the interpretation i s l i k e l y to be. Small representation of programs also has a third more p r a c t i c a l advantage. In t h i s age of mini and micro computers the a b i l i t y to run large programs i s a great advantage with the limi t e d memory these systems usually provide. Fourthly a decrease i n program size can lead to an increase in the degree of multiprogramming and po t e n t i a l l y decrease the page f a u l t rate. I t i s for this combination of a r c h i t e c t u r a l and p r a c t i c a l considerations that we use space as the measure for comparison in the following sections. 4.3 Results Now having established our measure for the purpose of comparison we w i l l proceed to compare the SLIM machine against two architectures. These are OCODE (see f 211) and EM-1 (see [2]) . These machines w i l l not be described but one i s referred to t h e i r adeguate description elsewhere. i . OCODE versus SLIM - round 1 OCODE i s the c l a s s i c a l f i r s t step i n the tran s l a t i o n of BCPL programs, From the Applicative Expression Tree (AE tree) Chapter Four 42 representation of the BCPL program OCODE i s generated. OCODE i s a stack machine and t h i s i s one of the reasons why we have chosen i t as one of the machines for comparison. In some sense i t i s representative of stack machines for which there i s wide respect. The second reason for choosing OCODE i s that i t was especially designed for the tra n s l a t i o n of BCPL., The procedure for comparison has involved translating approximately 8500 l i n e s of BCPL source into OCODE and SLIM code. In f a c t BCPL i s not translated into OCODE but into BCODE. The only difference between the two i s that BCODE i s intended to be used as a re a l machine and so OCODE instructions are encoded and object modules generated. BCODE i s the work of a l o c a l , unpublished project at the University of B r i t i s h Columbia. One might object here that encoding has not been mentioned. At t h i s l e v e l of comparison, instructions that take one word in BCODE occupy the same in SLIM. Double word instructions w i l l occur more frequently f o r SLIM since there i s less space for encoding operands. Therefore for the measure of code sizes encoding can be treated as a constant in t h i s case and not enter into the comparisons. Two measures are used: number of instructions and code s i z e s . The following table describes the programs used and gives the r a t i o s of BCODE to SLIM for both measures. Chapter Four 43 PROGRAM INSTRUCTIONS CODE SIZE Intcode interpreter 1.18 1.20 (2 sections) 1. 22 1. 17 Intcode assembler 1.09 1. 13 (3 sections) 1.18 1. 15 1. 12 1.10 BCPL compiler 1.14 1. 10 (6 sections) 1.11 1. 12 1. 12 1. 10 1. 10 1. 10 1.05 1. 15 1. 14 1.20 OCODE to 370 assembler 1.15 1. 15 (5 sections) 1. 15 1.11 1. 14 1.11 1. 18 1. 12 1. 13 1.11 Text editor 1.03 1.06 (4 sections) 1. 06 1. 06 1.05 1.05 1.06 1.04 Average: 1. 12 1. 12 TABLE I. OCODE and SLIM comparison As can be seen there i s a twelve percent gain on the average for the SLIM machine using t h i s measure. i i . EM-1 versus SLIM - round 2 This machine i s a recent attempt to provide a machine that w i l l provide very compact representation for a large c l a s s of languages. For example ALGOL 60, ALGOL - 68 , Pascal, XPL, BCPL, SAL etc. In Tanenbaum»s paper [2] he compares four programs and the i r code sizes on the EM-1, PDP-11 and Cyber. He gets r a t i o s as low as 1.5 with the PDP-11 and as high as 6.3 on the Cyber. Chapter Four 44 Thus t h i s machine i s well suited as a comparison with SLIM. In order to perform the comparison we must f i r s t provide a more compact encoding for SLIM to match EM-1*s encoding. The encoding i s presented i n Appendix I. Appendix II contains the BCPL source for three programs used for the comparison.. Since there i s no BCPL to EM-1 compiler, equivalent C programs were provided and these too are included. The following table presents the r e s u l t s . OPTIMIZED EM-1 (bytes) SLIM (bytes) Hanoi 46 41 Bubblesort 80 81 Expression 20 2 7 TABLE I I . Optimized EM-1 and SLIM comparison The three programs were chosen to represent 3 classes of program : procedure c a l l i n g (towers of hanoi) ; general loop mechanisms (bubblesort) and expression evaluation .. Before one concludes too much here, where SLIM does not outperform EM-1 dramatically we should be aware of a number of c h a r a c t e r i s t i c s of "optimized" EM-1 code. It i s very closely t i e d to language Chapter Four 45 directed machine design, Two examples of t h i s optimization fellow: i ) Due to a f a i r amount of incrementing by 1 i n higher l e v e l languages EM-1 provides an increment operator. Since SLIM does not, the eguivalent SLIM code (LD IPn PLUS 1 STORE Pn) occupies 4 bytes as opposed to 1. I f t h i s operator were provided then the 3 bytes we would save i n SLIM code f o r the Bubblesort routine would make SLIM code more compact than EH-1 code i n t h i s case. i i ) Optimized EM-1 code recognizes consecutive leads and replaces them by a single LOAD DODBLE in s t r u c t i o n . For example, instead of generating LOAD A LOAD B i t generates a LOAD DOUBLE A. In our expression program there are 5 cases where th i s occurs: (a+b) , (c+d) , c+d, (a+b) and a*b.r If t h i s procedure had been written with the order i n these expressions reversed then the resu l t s would have been s i g n i f i c a n t l y d i f f e r e n t . One need only note that 24 bytes are reguired for a pure stack machine for the expression evaluation alone and t h i s does not take into account the procedure entry and exit. The following table presents the resu l t s assuming the lack of the above two optimizations for EM-1 and also that procedure entry and exit occupy three bytes as i n SLIM. Chapter Four 46 EM-1 (bytes) SLIM (bytes) Hanoi 46 41 Bubblesort 83 81 Expression 27 27 TABLE II I . EM-1 and SLIM comparison Despite the lack of any EM-1 type optimization i n SLIM the machines compare very favourably. , We now present our observations and directions for further research. Chapter 5 47 5. Conclusions and directions for further research although t h i s thesis has presented the design of an intermediate machine suited to a particular high l e v e l language, not much has been said about the various approaches to ins t r u c t i o n set design. We w i l l now outline some approaches to instr u c t i o n set design and then make some comments on the part i c u l a r approach used i n t h i s thesis. Lipovski and Doty f 30 ] describe three schools of thought on instr u c t i o n set design. The oldest approach i s to use s t a t i s t i c s based on coding experience with an older architecture, to a s s i s t i n constructing a more refined machine. Instructions that are frequently used are made fast e r and perhaps more f l e x i b l e . S t a t i s t i c a l l y s i g n i f i c a n t i n s t r u c t i o n sequences are made into primitive operations. The second approach i s to choose a widely used high l e v e l lanquaqe. The primitive operations necessary to execute t h i s hiqh l e v e l languaqe are i d e n t i f i e d , and then r e a l i z e d i n the inst r u c t i o n set. The t h i r d approach i d e n t i f i e s a ranqe of problems to be solved using the computer and a set of c h a r a c t e r i s t i c s of the technology to be used to r e a l i z e the machine. The problems to be solved are treated as 'axioms 1, (premises) and the decisions leading up to the design of the architecture are treated as 'theorems* (implications). The 'proof* gives a l l the reasons for the s p e c i f i c design decision (implication) in terms of the problems to be solved (premises) and e a r l i e r implications. Clearly the approach used i n the design of SLIM i s the hiqh l e v e l language approach. These approaches a l l have the i r pros Chapter 5 48 and cons. The s t a t i s t i c a l approach generally assures some form of compatibility between between the old and the more refined machine. This i s convenient corporate policy but can tend to entrench e x i s t i n g patterns of operation and insight and not allow for new innovations. The high l e v e l language approach i s more suited to the more common forms of expression but i s generally applied to one s p e c i f i c high l e v e l language. Since most computers run more than one language, what i s optimal for one language may not be optimal for another.. There are two ways to overcome t h i s problem. One i s to allow f o r various microcoded intermediate languages as i n the Burroughs B1700, The other i s to design i n s t r u c t i o n sets that are well suited to a number of languages. The EM-1 machine i s one signpost i n t h i s d i r e c t i o n . The premise-implication approach requires c a r e f u l thought for a l l design decisions and hence makes i t d i f f i c u l t to write the description. However this approach perhaps shows more cl e a r l y what the system i s intended f o r and what i t s l i m i t a t i o n s are. 8e w i l l now make some conclusions regarding the methodology used in the design of SLIM and the results obtained. The results c l e a r l y show that the objectives governing the design of SLIM have been achieved. Using the measure of program representation s i z e SLIM compares very favourably with a number of architectures. SLIM i s a d e f i n i t e improvement over OCODE and i s approximately equivalent to the EM-1 machine. although no mention has been made of INTCODE, one automatically can i n f e r Chapter 5 49 from the SLIM versus OCODE r e s u l t s t h a t the SLIM r e p r e s e n t a t i o n c f programs i s much s m a l l e r than t h e i r INTCODE c o u n t e r p a r t s . The o b j e c t i v e of s i m p l i c i t y i n machine a r c h i t e c t u r e a l s o has been r e a l i z e d . The achievement of t h e s e o b j e c t i v e s show t h a t u s e f u l work can be done w i t h i n t h i s p a r t i c u l a r approach to i n s t r u c t i o n set design., Regarding the approach i t s e l f i t i s d i f f i c u l t t o be s p e c i f i c . Although we have argued elsewhere f o r the importance of t h i s approach i t i s d i f f i c u l t t c provide handles t o a s s i s t i n s y n t h e s i z i n g the o p e r a t i o n s necessary to execute high l e v e l languages. One not only has to determine o p e r a t i o n s but one must f i r s t of a l l determine the a r c h i t e c t u r a l b u i l d i n g b l o c k s on which these o p e r a t i o n s w i l l operate. There are a number of accepted b u i l d i n g b l o c k s i n e x i s t e n c e , f o r example the importance of s t a c k s i n environment a l l o c a t i o n , procedure c a l l i n g and e x p r e s s i o n e v a l u a t i o n . T h i s i s one area of f u r t h e r r e s e a r c h where s i m i l a r work with other languages might d i s t i l l other a r c h i t e c t u r a l b u i l d i n g blocks. T h i s i n t u r n w i l l help to i d e n t i f y the p r i m i t i v e o p e r a t i o n s necessary to execute high l e v e l languages. Another approach we have not mentioned t h a t d i f f e r s from t h a t of i n s t r u c t i o n set design i s d i r e c t e x e cution of high l e v e l languages, In t h i s approach the machine i n s t r u c t i o n s e t becomes the o p e r a t i o n s of the high l e v e l language. T h i s approach a l s o has a number of pros and cons. I t e l i m i n a t e s the c o m p i l a t i o n process, speeds up execution of programs and g e n e r a l l y provides g r e a t e r program d e n s i t y . , On the other hand the s i z e of the microprogram t o i n t e r p r e t the high l e v e l language i n s t r u c t i o n s Chapter 5 50 w i l l be large and very complex . With current technology and costs the construction of such a machine would be p r o h i b i t i v e l y expensive. The machine also by d e f i n i t i o n w i l l be very s p e c i a l purpose. Since users may want to use other languages he may f i n d i t awkward to compile them into the base high l e v e l language. The representation of these other high l e v e l language programs i n the base language may also be large and their execution slow. More importantly, t h i s approach depends on how suited the language i s to in t e r p r e t i v e execution. In t h i s mode of execution each statement i s decoded just before i t i s used. BCPL i n i t s pure source form i s d e f i n i t e l y not suited to th i s approach. For example a procedure c a l l that involves a procedure that i s defined 3000 l i n e s further on in the source, cannot be immediately executed. For BCPL to be executed i n th i s manner some form of intermediate program representation would be necessary. This borders c l o s e l y on the approach we have used. Two areas of research ari s e out of considering t h i s approach as i t applies to languages l i k e BCPL, One i s to develop suitable high l e v e l intermediate representations that can be d i r e c t l y executed. The other i s to develop language design p r i n c i p l e s that w i l l provide languages that can be d i r e c t l y executed, The f i n a l issue that concerns us i s the development of suitable measures for architecture comparisons. The choice of methodology i n in s t r u c t i o n set design c l e a r l y biases the choice of measure. For example, those adopting the s t a t i s t i c a l approach might be more interested in time oriented measures. However we have argued e a r l i e r for the importance of the high Chapter 5 51 l e v e l language approach to i n s t r u c t i o n set design and therefore conclude that our measure of program representation size i s an important component of any measure that i s devised. Of course our measure has a number of def i c i e n c i e s . I t i s dependent on the e f f i c i e n c y of the t r a n s l a t i o n section of the compiler used. Comparisons are meaningful only i f the translation sections of the various compilers use the same optimizations. This i s sometimes d i f f i c u l t to achieve. Program representation size i s also just one component of a measure. Though th i s measure has been useful f o r our comparison purposes, t h i s subject of measures for evaluation purposes reguires further work and study to produce a more comprehensive measure. B i b l i o q r a p h y 52 1. A b d - a l l a , A.M S K a r l g a a r d , D.C. H e u r i s t i c s y n t h e s i s o f microjjrocfrafflaed c o m p u t e r a r c h i t e c t u r e s . I E E E t r a n s a c t i o n s on C o m p u t e r s , V o l C-29, No. 8, Aug 1974., 2. Tanenbaum. , A.S. I m p l i c a t i o n § o f s t r u c t u r e d - programming f o r machine a r c h i t e c t u r e . CACM, V o l 21, number 3, March 1978 pp237-246 3. K n u t h , D. E. An e m p i r i c a l s t u d y o f FORT'S All p r o g r a m s . S o f t w a r e p r a c t i c e a n d e x p e r i e n c e 1 ( 1 9 7 1 ) , pp261-301 4. A l e x a n d e r , I.G. How a programming l a n g u a g e i s u s e d . CSRG-10 , U. o f T o r o n t o , O n t a r i o , Canada 5., Hortman, D.B, A s t u d y o f l a n g u a g e d i r e c t e d computer d e s i g n . CSRG-20, U. o f T o r o n t o , T o r o n t o , O n t a r i o l l 9 7 2 ) , CanadaT 6., S a l v a d o r i , A., G o r d o n , J.S C a p s t i c k , C. S t a t i c p r o f i l e o f COBOL p r o g r a m s . S i g p l a n n o t i c e s (ACM) 1 0 ( 1 9 7 5 ) , pp20 - 33 7. Sichmann, B.A. ALGOL 60 C o m p i l a t i o n and a s s e s s m e n t . Academic P r e s s , London and New Y o r k , 1975 8, A g r a w a l a , A.K. & R a u s c h e r , T.G. F o u n d a t i o n s o f m i c r o p r o g r a m m i n g : A r c h t e c t u r e . S g f t w a r e ajid A p p l i c a t i o n s . A cademic P r e s s . 1976 9., S a l i s b u r y , A.B. Microprogrammed Computer • A r c h : ^ c t u r e s . - New York : E l s e v i e r . . 1976 10. M a r s h l a n d , T.A. & Demco, J.C. A c o n t e m p p r a r j ; computer emulation.* T e c h n i c a l r e p o r t TR76-1, F e b . 1976, D e p t . o f C p s c , U n i v e r s i t y o f A l b e r t a , Canada. 11. H i l n e r , W.T. The d e s i g n o f t h e B u r r o u g h s B17Q0. P r o c . o f t h e AFIPS F J C C , V o l . 41, AFIPS p r e s s , M o n t v a l e N7J., 1972, pp 489 - 497 12. S a a l , H.J. On m e a s u r i n g c o m p u t e r s y s t e m s by m i c r o p r o g r a m m i n g i n [ 2 9 ] 13. R o s i n , R.F. Systems A r c h i t e c t u r e and M i c r o p r o g r a m m i n g : some comments i n f 29 ] ~ 14. A p p e l b e , W.F. A s e m a n t i c r e p r e s e n t a t i o n f o r t r a n s l a t i o a o f - h i g h - l e v e l A l g o r i t h m i c L a n g u a g e s . PhD T h e s i s , 0 . o f B r i t i s h C o l u m b i a , V a n c o u v e r , Canada, 1978 15. M c B i l l i a m s , T.M. & F u l l e r S.H. S Sherwood, W.H, Using L S I p r o c e s s o r b i t s l i c e s t o b u i l d a PDP-11 z 1 £ a s e -study i n lig£QPoroputer Resign . P r o c , AFIPS NCC 1977. pp 243-253. 16. D r a f t as o f Nov. 15, 1977 o f t h e O b j e c t i v e s f o r computer Bibliography 53 science. Local memo from the Dept. head D.C. Gilmore to faculty and graduate students 17. Peck, J.E.L. The essence of computer science, DBC, Vancouver, 1975 18. Peck, J.E.L., Manis, V.S. 6 Webb, W.E. Code compaction for minicomputers with INTCODE and MINICODE. Technical Report 75-02, Dept. of Computer Science, U. of B r i t i s h Columbia, Vancouver, Canada 19. Stoy, J.E. 6 Strachey, C. OS-6 - an experimental operating system f o r a small computer. The computer Journal, 15, Nos 2~& 3, 1972. 20. Richards, M. BCPL: a t o o l for Compile writing and Cistern JSoaramming. Proc. of the AFIPS 1969 SJCC Vol 34, AFIPS press, Montvale, New Jersey (1969), pp 557-566 21. Richards, M. The p o r t a b i l i t y of the BCPL compiler. Software Practice and Experience, 1:1(197?), pp135 - 146 22. Richards, M. Bootstragging the BCPL comp.iler. i n Van der Poel, W.I, S Maarsen, I. ( eds.) Machine orien.teJi higher l e v e l iaS3Jjases. , North Holland and American Elsevier,?974 23. Ritchie, D.M. & Thompson, K, The ONIX timesharing System. CACM , Vol.. 17 No. 7, (July 1974), pp365-375 24. Bulman, D.M. Stack Computers:An introduction. Computer, May 1977. pp 18-28. 25. Abramson, H., Fox,M., Gorlick,M., Manis,V. & Peck, J. The PICA-B computer. An abstract target machine f o r a transportable Single-Oser- Operating Environment, submitted f o r publication. 26.Organick, E.I. Computer system organization« The B5700/B6700 Series. ACM Monograph Series, Academic Press (1973) 27. Ibett, R.N. & Capon, P.C. The development of the Ml)5 Computer system CACM vol 21 no 1 Jan~1978. pp 13-247" 28. Flynn, M.J. Computer organization and architecture.. Lecture notes for the Advanced course on operating systems Munich, Germany. July 28 to August 5, 1977. 29. Boon, C. (ed.) Microprogramming and System architecture^ INFOTECH state of the art report 23, Maidenhead™ Berkshire, O.K., 1975. 30. Lipovski, G.J. & Doty, K.L. BeyjeloEments and Directions in Computer Architecture. Computer, Aug. 1978. pp 54-67. , Appendix I 54 Towards a single byte encoding This appendix contains the encoding breakdown for SLIM which f i t s the opcode i n cne byte and the operand ( i f any) i n the following byte. Double word instructions would have 255 i n the f i r s t byte which would singnify that the following three bytes contain the instruction - 1 for the opcode and 2 f o r the operands since t h i s i s a double word i n s t r u c t i o n . He w i l l f i r s t examine the number of operands reguired for the various operators and outline the d i s t r i b u t i o n of opcodes. Since we only have a one byte opcode many operators may have nine encodings to account f o r the nine possible operands. OPERAND TYPE NUMBER OF VARIANTS (SYMBOLIC FORM) global 2 IG, G l o c a l 2 IP, P s t a t i c 2 IL, L top of stack 1 * r e l a t i v e address 2 IR, R TOTAL: 9 OPERATORS THAT COULD TAKE ALL NINE VARIATIONS mult, div, plus, minus, eg, ne. Is, gr, l e , ge, l s h i f t , r s h i f t , logand, logor, exor, Id, st k i d , store, rem, egv SUB TOTAL: 20x9 = 180 OPERATORS THAT DO NOT TAKE ALL NINE VARIATIONS s s e t i - absolute and stack r e l a t i v e (2) - 3 s x e l i - absolute and stack r e l a t i v e (2) - 3 c a l l - absolute - 1 jump - r e l a t i v e (2) , s t a t i c (2) - 4 j t - " - 4 j f - - 4 switchon - absolute - 1 slctap, s l c t s t - 2 SUB-TOTAL: 22 OPERATORS THAT ONLY TAKE ONE VARIATION goto, neg, not, deref, push, pop, sset, sget, s r e l , f i n i s h , r t r n . true, f a l s e SUB-TOTAL: 13 SPECIAL ENCODING LD IPn 1<= n «= 10 10 STKLD IPn " 10 STORE Pn " 10 CALL n 0<= n <= 5 6 TOTAL: 180+22+13+36 = 251 Appendix II 55 Three equivalent BCPL and C programs || Check procedure c a l l i n g mechanism. The c l a s s i c towers of hanoi. global { Sritef:50 } l e t Hanoi{ n, s, i , d) be { i f n = 0 then return Hanoi{ n-1, s, d, i) I r i t e f <"Move %N from SIC to %C*N«, n, s, d) Hanoi{ n-1, i , s, d) } || Bubblesort. General test of loop mechanisms manifest { falsevalue = 0 ; truevalue = 1 } l e t Bubblesort(a, n) be { l e t sorted = falsevalue and LastValue = n and temp = 0 r LastValue := LastValue - 1 sorted := truevalue f o r j = 0 to LastValue do i f a!j < a! (j + 1) then ( temp := a!j a!j := a! (j+1) a! (j + 1) := temp sorted := falsevalue } } repeatwhile ( sorted = falsevalue) | { LastValue -«= 1 ) } || Expression evaluation. l e t StupidProgram ( a, b, c, d) be i a := (a+b)*(c+d) b := c+d c := (a+b)/d d := a+b+c } Appendix I I 56 | | and now for the C version of each of these three programs /* towers of hanoi * / hanoitjn, s , i , d) char s, i , d ; { i f ( n == 0 ) return ; hanoi( n-1, s , d, i) ; printf("move %d from %c to %c n", n, s, d) ; hanoi (n-1, i , s, d) ; } # #define fa l se 0 •define true 1 /* simple bubblesort routine * / bubblesort{ a, n) i n t a[ ] ; { i n t sor ted, l a s t va lue , temp, j ; sorted = fa l se ; las tva lue = n ; do [ las tva lue = las tva lue -1 ; sorted = true ; for { j = 0 ; j <= las tva lue ; j = i + 1 ) i f ( a[ j ] < a[ j*1] ) { temp - a[ j ] ; a[ J3 = a [ j*1] ; aCj+1] = temp ; sorted = fa l se ; } } while ( ( sorted =• = fa l se ) | | { las tva lue -»•= 1 } ) ; } /* a stupid program that evaluates expressions * / stupidprogram( a, b, c , d) I a = (a+b)*{c+d) ; b = c + d ; c = (a+b)/d ; d = a+b+c ; Appendix I I I 57 SLIH system software T h i s appendix c o n t a i n s a b r i e f d e s c r i p t i o n of the SLIM system software. T h i s i n c l u d e s : i . a BCPL t o SLIM compiler i i . a SLIM assembler i i i . a SLIM loader and i n t e r p r e t e r T h i s a l l o w s one t o compile and run BCPL proqrams. We w i l l d e s c r i b e t h i s software b r i e f l y and then i l l u s t r a t e the whole system on the e t e r n a l towers of hanoi! The c o m p i l e r i s as expected. I t allows some Vancouver ext e n s i o n s (e.g. f o r op e r a t o r s l i k e *%*, •+:=' etc.) , The assembler generates l o a d modules and a l s o performs compaction making jumps and r e f e r e n c e s r e l a t i v e i f p o s s i b l e . T h i s u s u a l l y saves from 5 t o 10 percent of the proqram s i z e . The technique i s the same as t h a t d e s c r i b e d by Peck et a l . [ 1 8 ] . , A l l the above software i s w r i t t e n i n BCPL so that p r o t a b i l i t y i s enhanced. We now present the e t e r n a l TOWERS OF HANOI r i g h t from the BCPL source t o SLIM i n t e r p r e t a t i o n . T h i s i s an e d i t e d v e r s i o n of a l i v e MTS s e s s i o n at DBC. # COMMENT LIST OF THE SOURCE # LIST -HANOI > 1 SECTION. "HANOI" > 4 GET. "FOX:BCPLHDR" > 4.5 ENTRY ${ START:"START" $) > 5 LET HANOI (N, S, I , D) BE Appendix III 58 > 6 $( IF N <= 0 THEN RETURN > 7 HANOI (N-1, S, B, I) > 8 WRITEF ("MOVE %U FROM %C > 9 HANOI(N-1, I, S, D) $) > 10 > 11 AND START {) BE > 12 ${ LET N = 0 > 13 WRITES("ENTER NUMBER*N") > 14 N := READNQ > 15 WRITEF("NUMBER INPUT WAS > 16 IF N <= 0 THEN FINISH > 17 HANOI(N, *S», 'I», »D») > 18 $) REPEAT # END OF FILE # COMMENT COMPILE IT TO %C*N", N, S, D) %N*N", N) BUN BCPL.COMPILER T=1S SCARDS=-HANOI PAR=I EXECUTION BEGINS BCPL/SLIM (1978 MAY) PARAMETER = * I' LOGICAL UNIT '0* WAS NOT SPECIFIED; -OC# ASSUMED. LOGICAL UNIT *10* WAS NOT SPECIFIED; -STATS ASSUMED, SECTION HANOI COMPILATION COMPLETE; 0 ERRORS DETECTED S EXECUTION TERMINATED # COMMENT LIST THE SLIM CODE # LIST -GC# SECTION HANOI EXTERNAL 11 "WRCH" EXTERNAL L2 "RDCH" EXTERNAL L3 "WRITEO" EXTERNAL L4 "WRITED" EXTERNAL L5 "WRITEHEX" EXTERNAL L6 "WRITEOCT" EXTERNAL L7 "WRITES" EXTERNAL L8 "WRITEF" EXTERNAL L9 "REACN" EXTERNAL L10 "WRITEX" EXTERNAL L11 "NEWPAGE" EXTERNAL L12 "NEWLINE" EXTEBNAL 113 "WRITEN" EXTERNAL L14 "PACKSTRING" EXTERNAL L15 "UNPACKSTRING" SSETI P2 JUMP L20 3BAN0I 17: SSETI P6 LD IP2 LE 0 JF 121 RTRN 21: SRELI 2 LD IP2 MINUS 1 STKLD IP3 STKLD IP5 STKLD IP4 STKLD IL18 CALL 4 SRELI 2 LD L22 STKLD IP2 STKLD IP3 STKLD IP5 STKLD IL8 CALL 4 SRELI 2 LD IP2 MINUS 1 appendix I I I 59 STKLD IP4 STKLD IP3 STKLD IP5 STKLD IL18 CALL 4 RTRN 22: DATa "MOVE SN FROM ^C TO %C*N" 3START 19: SSETI P2 23: LD 0 POSH SRELI 2 LD L24 STKLD IL7 CALL 1 SRELI 2 LD 119 CALL 0 STORE P2 SRELI 2 LD L25 STKLD IP2 STKLD IL8 CALL 2 LD IP2 LE 0 JF L26 FINISH 26: SRELI 2 LD IP2 STKLD 'S' STKLD «I» STKLD *D* STKLD IL18 CALL 4 SSETI P2 JUMP L23 RTRN 25: DATA "NUMBER INPUT WAS 5JN*N" 24: DATA "ENTEB NUMBER*N" 20: FINISH 16: DATA L19 18: DATA L17 ENTRY L16 "START" END # END OF FILE # COMMENT ASSEMBLE IT # RUN ASM T=1S SCARDS=-OC# # EXECUTION BEGINS PABAMETER SPUNCH DEFAULTS TO *-CODE#J S L I M ASSEMBLER ( VERSION 3. JULY 1978 ) # EXECUTION TERMINATED # COMMENT LIST THE LOAD MODULE # LIST -CODE# ENTRY "START" 000146 111002 126400 +000145 111006 077002 040000 135 402 174017 114002 077002 014001 103003 103005 10300 4 102410 +000147 120004 114002 075421 103002 103003 103005 102410 +000000 120004 114002 077002 014001 103004 103003 103005 102410 +000147 120004 174017 013324 1533 45 142500 066325 040306 154726 152100 066303 040343 153100 066303 012400 111002 074000 174005 11400 2 075453 102410 +000000 120001 114002 076410 +000000 120000 10 5002 114002 075426 103002 102410 •000027 120002 077002 040000 135402 174016 114002 077002 102400 000342 102400 000311 102400 000304 103431 120004 1 11002 12553 7 174017 012325 162324 141305 154500 144725 153744 161500 16 3301 161100 066325 012400 006705 152743 142731 040325 162324 141305 154425 174016 +000057 +000003 EXTERNAL "WRITES" 000065 EXTERNAL "WRITEF" 000100 EXTERNAL "READN" 000071 END # END OF FILE # COMMENT NOW RUN THE LOADER/INTERPRETER WITH THE LIBRARY Appendix III 60 # RON INT T=1S SCARDS=-CODE#+BCPLLIB # EXECOTION BEGINS - S L I M - INTERPRETER/LOADER. VERSION 3 ( JOLY 1978 ) 650 WORDS LOADED LOAD MAP 000146 : "START" 001172 : "WRITES" 001175 : "WRITEF" 001202 : "READN" 001173 : "ONPACKSTRING" 001174 : "PACKSTRING" 001176 : "WRITED" 001177 : "WRITEN" 001200 : "NEWLINE" 001201 : "NEWPAGE" 001203 : "WRITEOCT" 001204 : "WRITEHEX" 001205 : "WRITEO" 001206 : "WRITEX" 001207 : "RDCH" 001210 : "SRCH" 001211 : "TERMINATOR" EXECUTION BEGINS ENTER NUMBER 3 NUMBER INPUT WAS 3 MOVE 1 FROM S TO D MOVE 2 FROM S TO I MOVE 1 FROM D TO I MOVE 3 FROM S TO D MOVE 1 FROM I TO S MOVE 2 FROM I TO D MOVE 1 FROM S TO D 1 , ENTER NUMBER '" 1 2 NUMBER INPUT WAS 2 MOVE 1 FROM S TO I MOVE 2 FROM S TO D MOVE 1 FROM I TO D INTER NUMBER -1 NUMBER INPUT WAS -1 EXECUTION TERMINATED.,( 12892 INSTRUCTIONS ) # EXECUTION TERMINATED

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
Germany 23 0
United States 13 10
Canada 8 0
China 5 56
United Kingdom 4 0
Switzerland 3 1
France 3 0
Australia 2 0
Morocco 1 0
City Views Downloads
Unknown 35 3
Ottawa 7 0
Dallas 4 0
Beijing 4 8
Mountain View 2 0
Milwaukee 2 0
Purley 2 0
Sydney 1 0
Annandale 1 0
Fuzhou 1 0
Latham 1 0
Redmond 1 0
Ashburn 1 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}

Share

Share to:

Comment

Related Items