UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A parsing language Wilbur, Gregory Allen 1975

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1976_A6_7 W54.pdf [ 5.68MB ]
Metadata
JSON: 831-1.0051774.json
JSON-LD: 831-1.0051774-ld.json
RDF/XML (Pretty): 831-1.0051774-rdf.xml
RDF/JSON: 831-1.0051774-rdf.json
Turtle: 831-1.0051774-turtle.txt
N-Triples: 831-1.0051774-rdf-ntriples.txt
Original Record: 831-1.0051774-source.json
Full Text
831-1.0051774-fulltext.txt
Citation
831-1.0051774.ris

Full Text

A PASSING LANGUAGE by GREGORY ALLEN WILBUR B.Sc, University of B r i t i s h Columbia, 1974. A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOB THE DEGREE OF MASTER OF SCIENCE in the Department of Computer Science We accept this thesis as conforming to the required standard The University of B r i t i s h Columbia October, 1975 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t . o f t h e r e q u i r e m e n t s f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h C olumbia, I agree t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p urposes may be g r a n t e d by t h e Head o f my Department o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department The U n i v e r s i t y o f B r i t i s h Columbia Vancouver 8, Canada i ABSTRACT Con s i d e r a b l e work has been r e c e n t l y devoted to the automatic g e n e r a t i o n of syntax a n a l y z e r s . T h i s work has been g e n e r a l l y concerned with extending the power of the p a r s e r generator technigues, r a t h e r than improving the s y n t a c t i c s p e c i f i c a t i o n mechanism. Se present a new p a r s i n g language which attempts t o u n i f y syntax and semantics. In a d d i t i o n , the language p r o v i d e s a mechanism by which reasonable e r r o r recovery can be n a t u r a l l y i n c l u d e d i n the syntax s p e c i f i c a t i o n . i i CHAPTER 1: INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v * 1 1.1 . P r o p e r t i e s of p a r s i n g languages .................... 1 1.2 . Syntax languages as p a r s i n g languages .............. 4 1.3 . overview of pre v i o u s p a r s i n g languages ............. 6 CHAPTER 2: PARSLEY ........................................... 8 2.1 . C r i t e r i a f o r design ................................ 8 2.2 . P a r s l e y program .................................... 9 2.3 . Basic symbols ..................................... 10 2.3.1 . Reserved words ............................... 10 2.3.2 . S p e c i a l symbols ..............................10 2.3.4 . I d e n t i f i e r s 10 2.4 . Procedure d e f i n i t i o n .............................. 11 2.5 . Statements ........................................ 12 2.5.1 ., Procedure c a l l statement ..................... 12 2.5.2 . Read statement ............................... 12 2.5.3 . Apply statement .............................. 13 2.5.4 . Scope statements • ••• 13 2.5.5 . Procedure escape statement ................... 15 2.5.6 . Scope escape statements ...................... 15 2.5.7 . E i t h e r statement ............................. 16 2.5.7.1 . E r r o r statement ......................... 17 2.5.7.2 . Skip statement .......................... 17 2.5.7.3 . Look statements ......................... 18 2.5.7.4 . E i t h e r a l t e r n a t i v e s ..................... 18 2.6 . T h e o r e t i c a l execution of P a r s l e y .................. 20 2.6.1 . The P a r s l e y i n t e r p r e t e r ...................... 20 2.6.2 . Some i n t e r p r e t e r r o u t i n e s .................... 26 i i i 2.6.3 . Interpreter i n i t i a l conditions ............... 27 2.6.4 . interpreter execution ................... .... . 28 2.7 . Descriptive a b i l i t y of parsley .................... 34 2.8 . Semantic action positioning .......................,36 2.9 . Error traps ....................................... .36 CHAPTER 3: COMPILING PARSLEY ................................ 38 3.1 . General compilation process ....................... ,41 3.2 . Non-deterministic intermediate machine ............ 41 3.3 . Pi n a l deterministic machine ....................... 45 CHAPTES 4: USER'S GUIDE 48 4.1 . Current implementation ............................ 48 4.1.1 . User parameters .............................. 48 4.1.2 .. Input 49 4.1.3 . Output 50 4.1.3*1 • Paragraphed l i s t i n g ..................... „50 4.1.3.2 . Error messages .......................... 51 4.1.3.3 • Table dumps .................. 54 4.1.3.4 . Parser tables ........................... 55 4.2 . Programming pointers .............................. 55 4.2.1 . , Mental approach .............................. 55 4.2.2 . Some good programming practices .............. 56 4.2.3 . Some p r a c t i c a l considerations 58 4.2.4 ., Lookahead inadequacies ....................... ,59 4.2.5 . The parse stack .............................. 64 CHAPTES 5: IMPLEMENTATION NOTES ............................. 66 5.1 . Reading in the source program 67 5.1.1 . The parsing machine d e f i n i t i o n ............... 67 i v 5.1.2 Basic parsing machine construction ........... 68 5.1.3 . Parsing machine optimization ................. 69 5.2 . Parser generation ................................. 70 5.2.1 . LB (0) machine construction 70 5.2.2 . SLR {K) machine construction .................. 73 5.2.3 . Parser optimizations 74 CHAPTER 6: THE END .......................................... 75 BIBLIOGRAPHY .....................................,-,,•••••*• 76 APPENDIX I: PARSLEY IN PARSLEY .............................. 78 APPENDIX I I : PARSLEY IN BNF ................................. 88 APPENDIX I I I : SP(K) IN PARSLEY 93 APPENDIX IV: A SAMPLE PASSER .................. , , . , 111 IV.1 Table Structure 111 IV.2 . Parser Environment ...............................112 IV.3 . Parser I n i t i a l Conditions .......................114 IV.4 . The Parser ......................................114 V ACKNOWLEDGEMENTS Several people have made contributions towards both the design and preparation of t h i s thesis. I would l i k e to thank my supervisors Harvey Abramson and Alan Ballard who both put a great deal of time into the improvement of my thesis. I would also l i k e to thank Alan Ballard and Paul Whaley with whom I had many i n t e r e s t i n g discussions, some of which led to the work presented here. F i n a l l y , I would l i k e to thank the National Research Council of Canada for their f i n a n c i a l support. 1 CHAPTER INTRODUCTION Co n s i d e r a b l e work has r e c e n t l y been devoted t o the automatic g e n e r a t i o n of syntax a n a l y z e r s . T h i s work has been g e n e r a l l y concerned with extending the power of p a r s e r generator technigues, r a t h e r than improving the s y n t a c t i c s p e c i f i c a t i o n languages. T h i s chapter w i l l i n t r o d u c e syntax languages and p a r s i n g languages. Syntax languages are co n s i d e r e d to be languages which s p e c i f y the s e t of sentences which form a language. P a r s i n g languages are considered t o be languages which not o n l y s p e c i f y the s e t of sentences, but a l s o i n d i c a t e where semantic a c t i o n s should take p l a c e to e f f e c t a u s e f u l t r a n s l a t i o n . 1.1 PROPERTIES OF PARSING LANGUAGES T h i s s e c t i o n presents s e v e r a l fundamental p r o p e r t i e s which a good p a r s i n g language should have. 2 The b a s i c requirement f o r a p a r s i n g language i s the a b i l i t y to s p e c i f y programming language syntax. I t i s u s u a l l y accepted that c o n t e x t f r e e d e s c r i p t i v e power i s s u f f i c i e n t to meet t h i s c r i t e r i o n f o r most programming languages. However, a p a r s i n g language must meet t h i s c r i t e r i o n i n a n a t u r a l way. That i s , merely to be able t o d e s c r i b e languages t h e o r e t i c a l l y i s not s u f f i c i e n t , the d e s c r i p t i o n must be easy and n a t u r a l to w r i t e . In our o p i n i o n , a p a r s i n g language needs th r e e b a s i c c o n t r o l f a c i l i t i e s . I t needs a r e c u r s i v e procedure f a c i l i t y , an a l t e r n a t i v e s t r u c t u r e , and an i t e r a t i v e s t r u c t u r e . A c t u a l l y , the f i r s t two are made necessary by the c o n t e x t f r e e reguirement. A r e c u r s i v e procedure f a c i l i t y i s r e g u i r e d to d e s c r i b e c o n s t r u c t s such as p a r e n t h e t i c a l e x p r e s s i o n s and nested b l o c k s . The a l t e r n a t i v e f a c i l i t y i s needed t o d e s c r i b e c o n s t r u c t s with o p t i o n a l p a r t s and i n f i n i t e languages. F i n a l l y , the i t e r a t i v e f a c i l i t y , which can be simulated by r e c u r s i o n , i s u s e f u l i n the d e s c r i p t i o n of r e p e t i t i v e s t r u c t u r e s l i k e parameter l i s t s and statement l i s t s . A p a r s i n g language must a l s o provide a mechanism by which the user can a s s o c i a t e semantic b r e a k p o i n t s with the parse. These breakpoints are r e g u i r e d f o r the g a t h e r i n g of i n f o r m a t i o n necessary f o r t r a n s l a t i o n . They can be used f o r f u n c t i o n s l i k e symbol t a b l e o p e r a t i o n s , parse t r e e c o n s t r u c t i o n , or code ge n e r a t i o n i n one pass c o m p i l e r s . The s u p p l i e d mechanism should c l e a r l y demonstrate the r e l a t i o n between parse and semantics. For t h i s reason, i t i s f e l t t h a t semantics should be e x p l i c i t l y 3 w r i t t e n . (An example o f i m p l i c i t semantics i s the a s s o c i a t i o n of semantics and productions i n context f r e e grammars.) The mechanism should a l s o allow the a d d i t i o n of more semantic a c t i o n s without l a r g e m o d i f i c a t i o n s to the o l d s p e c i f i c a t i o n . F i n a l l y , many a c t i o n s should be allowed to be executed s e q u e n t i a l l y at any p o i n t i n the parse. T h i s w i l l promote the development of a c l e a n , mutually e x c l u s i v e set of p r i m i t i v e a c t i o n s , i n s t e a d of l a r g e r i n t e r t a n g l e d semantics given a s i n g l e e n t r y p o i n t because of a common i n v o c a t i o n parse p o s i t i o n . A p a r s i n g language should a l s o allow the user t o s e t t r a p s to c a t c h syntax e r r o r s . These t r a p s w i l l allow the t r a n s l a t o r to o v e r r i d e g e n e r a l d i a g n o s t i c s and/or recovery when the g e n e r a l e r r o r handling f a l l s s h o r t of e x p e c t a t i o n s . General e r r o r h a n d l i n g can f a l l s h o r t i n three b a s i c ways. The t r a n s l a t o r e r r o r message may not be s u f f i c i e n t l y p r e c i s e to r e a l l y i d e n t i f y the problem to the programmer. In such cases a s p e c i a l r o u t i n e c o u l d be invoked to improve e r r o r d i a g n o s t i c s . . General e r r o r r e c o v e r y can a l s o f a i l i n i t s a b i l i t y t o C o r r e c t * the i n p u t stream so t h a t t r a n s l a t i o n can proceed. I t may t o t a l l y f a i l i n some cases, or may succeed i n l o c a l c o r r e c t i o n which proves to be d i s a s t r o u s a t a l a t e r time. S p e c i a l r o u t i n e s can use e x t r a i n f o r m a t i o n the implementer has t o e f f e c t a non - r d i s a s t r o us c o r r e c t i o n . The f i n a l way i n which g e n e r a l e r r o r r o u t i n e s may f a i l i s i n l e a v i n g i n c o n s i s t e n c i e s w i t h i n the t r a n s l a t o r due to c o r r e c t i o n . U n f i n i s h e d c h a i n s of semantic a c t i o n s such as scope e n t r y but no scope e x i t i s an example of t h i s problem. Such 4 problems may throw the compiler out of phase. Special routines could attempt to avert such d i f f i c u l t i e s . The error trap f a c i l i t y should allow the addition of new error traps without af f e c t i n g the rest of the s p e c i f i c a t i o n . This i s important because the addition of error traps w i l l be an ongoing process as users discover weaknesses i n the general error handling. 1.2 SYNTAX LANGUAGES AS PARSING LANGUAGES Syntax languages are languages whose only purpose i s to specify a set of sentences which comprise a language. , Syntax languages are often forced to serve as parsing languages i n today's parser generators. The rest of t h i s section w i l l attempt to convince the reader of the inadeguacy of syntax languages for t h i s purpose, and hence the need for parsing languages as d i s t i n c t from syntax languages for parser generation systems. The most obvious d i f f i c u l t y with a syntax language doubling as a parsing language i s the lack of a positioning mechanism for semantic actions. The only f e a s i b l e way to position semantic actions, without augmenting the syntax language into a parsing language, i s to associate semantic actions with syntactic groupings. As not a l l semantic actions can be conveniently associated with the i n t u i t i v e syntactic groupings, new syntactic groupings must be introduced. This breakdown of the i n t u i t i v e syntactic s p e c i f i c a t i o n w i l l tend to unduly obscure the 5 d e s c r i b e d language's syntax. Another obvious d i f f i c u l t y with t h i s use of a syntax language i s the l a c k of any syntax e r r o r t r a p p i n g f a c i l i t y . T h i s problem i s i g n o r e d by most p a r s e r generators based on syntax languages, and would be d i f f i c u l t to handle without augmenting the syntax language i n t o a p a r s i n g language. One l a s t o b j e c t i o n t o the use of a syntax language, as a p a r s i n g language, i s the usual l a c k of d e f i n i t i o n of what to do when s e v e r a l p o s s i b l e parse paths present themselves. The syntax language does not s p e c i f y whether t o t r y each a l t e r n a t i v e and back up on f a i l u r e , or whether to generate a new parser f o r each a l t e r n a t i v e and continue i n p a r a l l e l . The problem i s not important i n a syntax language, but with p o s s i b l y i r r e v e r s i b l e semantic a c t i o n s being executed on a parse path t h i s problem i s important to a p a r s i n g language. In g e n e r a l t h i s d i f f i c u l t y with syntax language d e f i n i t i o n i s l e f t to the i n d i v i d u a l parser generator t o decide on. T h i s i s not a p a r t i c u l a r l y appealing s o l u t i o n . The context f r e e grammar (eg. BNF) i s widely used as a p a r s i n g language i n parser g e n e r a t i o n systems. T h i s wide spread use i s p a r t i a l l y due to i t s e a r l y development 1 as a syntax language. The other reason f o r i t s p o p u l a r i t y i s t h a t the language was easy to handle i n a formal way which l e d to the development of many p a r s i n g technigues s p e c i f i c a l l y designed f o r 6 the context f r e e grammars 2 3 4 . The context f r e e grammar demonstrates a l l three of the preceding o b j e c t i o n s t o a syntax language being used as a p a r s i n g language. In a d d i t i o n , the standard context f r e e grammar i s not even a p a r t i c u l a r l y good syntax language a c c o r d i n g t o our c r i t e r i a . , I t confuses the procedure (nonterminal) and the a l t e r n a t i v e s t r u c t u r e ( a l t e r n a t i v e p r o d u c t i o n s ) . Also t h e r e i s no i t e r a t i o n f a c i l i t y a v a i l a b l e . T h i s f a i l u r e as a syntax language i s demonstrated by the development of var i o u s meta-notations t o augment the standard context f r e e grammar 5 6 . However, very few parser generation systems use t h i s more powerful form of the context f r e e grammar. 1.3 OVERVIEW OF PREVIOUS PARSING LANGUAGES Th i s s e c t i o n presents some of the more widely known p a r s i n g languages. The survey i s g u i t e s h o r t as t h e r e are few parsing languages i n e x i s t e n c e today. A very e a r l y p a r s i n g language was the Floy d P r o d u c t i o n Language 7 8 . I t was developed p r i o r t o the advent of most of our p a r s i n g techniques. T h i s i s why the language combined a s h i f t - r e d u c e parser s p e c i f i c a t i o n and a syntax s p e c i f i c a t i o n . T h i s combination enabled the implementer to understand the parser b e t t e r , however i t a l s o made s p e c i f i c a t i o n s more t e d i o u s and d i f f i c u l t to w r i t e . In f a c t , t here was some work which attempted to bypass t h i s o b j e c t i o n by t r a n s l a t i n g context f r e e 7 grammars i n t o FPL 9 . FPL has a l s o allowed the a s s o c i a t i o n of semantics and e r r o r t r a p s with the parse. One l a s t drawback of t h i s language was i t s t o t a l l a c k o f s t r u c t u r e . P a r s i n g was presented as s u c c e s s i v e stack p a t t e r n matches and u n r e s t r a i n e d branching. T h i s made FPL s p e c i f i c a t i o n s d i f f i c u l t to read, w r i t e , and debug. Syntax Charts *o are now r e c e i v i n g renewed i n t e r e s t 1 1 . The c h a r t s s p e c i f y syntax by means of a two dimensional diagram. At t h i s p o i n t , i n s u f f i c i e n t development i n the a s s o c i a t i o n of semantics and e r r o r t r a p s has been done. A major d i f f i c u l t y with syntax c h a r t s i s t h a t they are not d i r e c t l y machine read a b l e u n l e s s t h e r e i s an a v a i l a b l e g r a p h i c s f a c i l i t y . T h i s means the c h a r t s must be hand t r a n s l a t e d i n t o a one dimensional machine-readable language. I t i s suspected t h a t t h i s t r a n s l a t i o n would be te d i o u s and e r r o r prone. However, i t would seem, t h a t with a g r a p h i c s f a c i l i t y and some e x t e n s i o n s t o handle semantics and e r r o r t r a p s , t h i s c ould be a rewarding approach t o p a r s i n g languages. 8 CHIPTES 2. PAESLEY This chapter presents a new parsing language. The parsing language i s c a l l e d Parsley. Parsley's design c r i t e r i a are presented and are followed by a d e f i n i t i o n of the language., A d e f i n i t i o n of Parsley written i n Parsley appears in Appendix 1. 2 . 1 CRITERIA FOR DESIGN This section presents the c r i t e r i a which governed the design of Parsley, the parsing language presented and discussed in the rest of th i s chapter. Some personal prejudices, important to the design of Parsley, are also confessed t o . Obviously, our basic c r i t e r i a f o r the design of t h i s language are those presented i n section 1.1 of the preceding chapter. Th language must provide a recursive procedure f a c i l i t y , an alternative structure, and an i t e r a t i v e f a c i l i t y . The language must also allow the simple positioning of semantic actions and syntax error traps. 9 The most important p e r s o n a l p r e j u d i c e , which a f f e c t e d the design of P a r s l e y , i s our b e l i e f t h a t p a r s i n g languages are j u s t a s p e c i a l type of programming language and should have s i m i l a r c o n s t r u c t s . Furthermore, I have a b a s i c p r e d i s p o s i t i o n toward the block s t r u c t u r i n g of the SOE system language 1 2 . T h i s language does not have an u n r e s t r i c t e d goto and i s very readable. The l a s t c r i t e r i o n f o r the design was to d e f i n e a parse s t a c k to hold token and semantic i n f o r m a t i o n necessary f o r t r a n s l a t i o n purposes. The c o n t r o l of t h i s s t a c k would be i m p l i c i t , r a t h e r than e x p l i c i t as i n FPL 7 a . T h i s makes s p e c i f i c a t i o n s e a s i e r and l e s s e r r o r prone to w r i t e . F i n a l l y , the o p e r a t i o n of the parse stack should be simple to understand. 2.2 PABSLEY PBOGBAM A P a r s l e y program c o n s i s t s of one or more procedures. The f i r s t procedure i s the goal procedure. The r e s t of the procedures may appear i n any order.„ T h i s a l l o w s programs to be w r i t t e n i n a top-down manner. Programs are stream o r i e n t e d , not l i n e o r i e n t e d . Blanks can be f r e e l y used between symbols to improve r e a d a b i l i t y . Comments can appear anywhere a blank may. Comments are any seguence o f c h a r a c t e r s preceded by */*» u n t i l the f i r s t **/*. They have no i n f l u e n c e on program meaning. , 10 2.3 BASIC SYMBOLS Parsley has a set of reserved words, some special characters, and a class of objects known as i d e n t i f i e r s . Special symbols, l i k e blanks, can be used to separate other symbols. 2.3.1 RESERVED WORDS The reserved words are: AND APPLY BE BEGIN CYCLE EITHER END ERROR EXIT LET NOTSEE READ REPEAT RETURN SEE SKIP THEN _!_ 2.3.2 SPECIAL SYMBOLS The s p e c i a l symbols are: : ; , < > ( ) 2.3.4 IDENTIFIERS I d e n t i f i e r s are composed of a sequence of l e t t e r s , d i g i t s , minus characters, and underscore characters. They must begin 11 with a l e t t e r . I d e n t i f i e r s are d i v i d e d i n t o f i v e groups., These groups, whose purpose w i l l be e x p l a i n e d l a t e r , are procedure names, t e r m i n a l names, semantic names, e r r o r names, and scope names. Procedure names cannot appear l e x i c a l l y i d e n t i c a l t o any reserved word. The other f o u r groups can i n c l u d e i d e n t i f i e r s l e x i c a l l y i d e n t i c a l t o a re s e r v e d word. L e x i c a l l y i d e n t i c a l i d e n t i f i e r s can appear i n more than one group i n the same program. The c o n t e x t of the i d e n t i f i e r s i s s u f f i c i e n t t o d i s t i n g u i s h between them. F i n a l l y , i d e n t i f i e r s are not d e c l a r e d , t h e i r f i r s t usage i n each group c o n s t i t u t e s t h e i r d e f i n i t i o n . 2.4 PROCEDURE DEFINITION The major c o n t r o l s t r u c t u r e of P a r s l e y i s the procedure., A procedure has a s i n g l e cohesive body w i t h i n which other procedures cannot be nested. a l l procedures are g l o b a l l y a c c e s s i b l e . Every procedure has an unigue name and a body c o n s i s t i n g of a l i s t of statements, separated by semicolons f o l l o w e d by a *_|_* ( g o a l p o s t ) . EXAMPLE LET UNIQUE-PROCEDURE-NAME BE: /* LIST OF STATEMENTS SEPARATED BY SEMICOLONS */ _ l _ /* THIS GOAL POST MARKS THE END OF A PROCEDURE */ 12 2.5 STATEMENTS 2.5.1 PHOCEDOBE CALL STATEMENT T h i s statement invokes the s p e c i f i e d procedure. The s p e c i f i e d procedure cannot be the g o a l procedure. Procedure c a l l s may be d i r e c t l y or i n d i r e c t l y r e c u r s i v e . The e f f e c t of t h i s statement i s s i m i l a r to t h a t of a nonterminal appearance i n the r i g h t s i d e of a context f r e e grammar p r o d u c t i o n . The statement c o n s i s t s of j u s t the name of the procedure. EXAMPLE , UNIQ0E-PBOCED0RE-NAME 2.5.2 BEAD STATEMENT T h i s statement checks the token at the head of the i n p u t stream. I f t h i s l e a d token matches the read's argument, (a member of the t e r m i n a l name group) then the read statement succeeds. A s u c c e s s f u l read statement d e l e t e s the matched token from the i n p u t stream and conti n u e s the parse. An u n s u c c e s s f u l read statement s i g n a l s f a i l u r e f o r the parse. 13 EXAMPLE EEAD (TOKEN_A) READ(X,Y,;Z) /* A SHORTHAND FOR 3 READS */ 2.5.3 APPLY STATEMENT The e x e c u t i o n of an apply statement causes a semantic break p o i n t . A semantic breakpoint t e m p o r a r i l y stops the parse and sends a code i d e n t i f y i n g the semantic name argument t o the t r a n s l a t o r . The parse continues when the t r a n s l a t o r s i g n a l s the a c t i o n has been handled. EXAMPLE APPLY (OUTPUT_FALSE_BRANCH) APPLY(S1,S2) /* A SHORTHAND FOR 2 APPLIES */ 2.5.4 SCOPE STATEMENTS I t i s o f t e n n a t u r a l to group s e v e r a l statements together without wanting a new procedure. In P a r s l e y , t h i s c o n s t r u c t i s c a l l e d a scope and f a c i l i t a t e s i t e r a t i o n . Scopes can be nested and can be given names. Scope names are only a c c e s s i b l e w i t h i n the scope they i d e n t i f y . (These names are used i n the scope 14 escape statements e x p l a i n e d l a t e r . ) The body of a scope i s a l i s t of statements, separated by semicolons followed by the r e s e r v e d word 1 END*. There are two kinds o f scopes i n P a r s l e y , the block and the c y c l e . The block i s used when the major purpose of a scope i s not i t e r a t i o n . When a l l statements of a b l o c k have been executed, c o n t r o l proceeds to the statement t e x t u a l l y f o l l o w i n g the b l o c k . C y c l e s are used when i t e r a t i o n i s the major purpose f o r the scope. Rhen a l l the statements i n the c y c l e have been completed, e x e c u t i o n resumes with the f i r s t statement of the c y c l e again. EXAMPLE BEGIN /* ' BEGIN* STARTS A BLOCK; THIS BLOCK IS NOT NAMED */ /* LIST OF STATEMENTS SEPARATED BY SEMICOLONS */ * END /* * END' CLOSES THE SCOPE */ <ITERATE> CYCLE /* LIST OF STATEMENTS SEPARATED BY SEMICOLONS */ END <ITERATE>/* NAME MUST MATCH TO CLOSE NAMED SCOPES */ 15 2.5-5 PROCEDURE ESCAPE STATEMENT The execution of t h i s statement causes an immediate return from the containing procedure. There can be any number of procedure escapes i n a procedure. (The goalpost at the end of a procedure i s r e a l l y an implied return.) Procedure escapes may have semantics associated just before and just after escape. The difference between an apply before and an apply after the return phrase i s explained in sections 2.6.4 and 4.2.5 of t h i s t h e s i s . I t should be noted that multi-level procedure escapes are not possible. EXAMPLE RETURN /* NO SEMANTIC ACTIONS */ APPLY(XX) THEN RETURN /* ACTION JUST BEFORE ESCAPE */ RETURN AND APPLY (YY) /* ACTION JUST AFTER ESCAPE .*/ APPLY(XX) THEN RETURN AND APPLY (YY) /* ACTIGN BOTH BEFORE AND AFTER ESCAPE */ 2.5.6 SCOPE ESCAPE STATEMENTS These statements are used to a f f e c t flow of control within scopes. They can cause execution to leave or resume any accessible scope. (Scope escapes can only access the immediately enclosing scope and other named enclosing scopes.) Scope escapes, l i k e the procedure escape, can have associated 16 semantic actions. The e x i t statement and the repeat statement are the scope escape statements. The exit statement passes control to the statement textually following the i d e n t i f i e d scope. The repeat statement causes control to resume the st a r t of the i d e n t i f i e d scope. EXAMPLE APPLY(XX) THEN REPEAT /* HILL REPEAT IMMEDIATELY ENCLOSING SCOPE */ EXIT <BIG-LOOP> /* WILL EXIT A SCOPE CALLED •BIG-LOOP* */ EXIT <BLOCK> AND APPLY (YY) 2.5.7 EITHER STATEMENT This statement presents a l i s t of parse alternatives., A l l of the alt e r n a t i v e s , unless they contain an escape statement or an error statement, pass execution to the statement t e x t u a l l y following the either statement. Each separate alternative i s nested within a pair of parentheses and at least one alternative must not be an error a l t e r n a t i v e . The l a s t alternative i s followed by the reserved word * END'. There are actually several s p e c i a l statements which can only appear within an either a l t e r n a t i v e . The next few sections introduce them. Following that i s . a d e f i n i t i o n of what a l e g a l either alternative i s . 17 EXAMPLE EITHER /* LIST OF ALTERNATIVES */ m m END 2.5.7.1 ERROR STATEMENT T h i s statement can only appear i n an e i t h e r a l t e r n a t i v e . I t s purpose i s to a l l o w the t r a p p i n g of syntax e r r o r s . There can only be one e r r o r a l t e r n a t i v e i n an e i t h e r statement. E r r o r a l t e r n a t i v e s are only executed i f a l l other a l t e r n a t i v e s have f a i l e d . The e x e c u t i o n of an e r r o r statement generates a syntax e r r o r and a code i d e n t i f y i n g the e r r o r name argument. EXAMPLE ERROR(SYNTAX-PROBLEM-X) 2.5.7-2 SKIP STATEMENT T h i s statement can only appear i n an e i t h e r a l t e r n a t i v e . The statement i n d i c a t e s an o p t i o n a l element of the syntax. S e m a n t i c a l l y t h i s i s very s i m i l a r to an empty pr o d u c t i o n i n a c o n t e x t f r e e grammar. 18 EXAMPLE SKIP 2.5.7.3 LOOK STATEMENTS These statements can only appear i n an either a l t e r n a t i v e . They are used to interrogate the next token of the input stream without deleting i t as the read statement does. These statements are used to separate viable a l t e r n a t i v e s . There are two look statements, the see and the notsee. The see statement indicates to follow an alt e r n a t i v e only i f the next token of the input stream matches one of i t s arguments. The notsee has the obvious reverse e f f e c t . I t i s important to note that a l l of the look arguments must be tokens which could actually occur i n some l e g a l parse at t h i s point. EXAMPLE SEE (X,Y,Z) NOTSEE (P) 2.5.7.U EITHER ALTERNATIVES Either alternatives appear within the either statement. 19 Each alt e r n a t i v e , nested within a pair of parentheses, represents a p o s s i b i l i t y for the parse. What constitutes a l e g a l either alternative i s incredibly convoluted i n English. Therefore, please read the next example. EXAMPLE LET EITHER-ALTERNATIVE BE: EITHER (ERBOR—STATEMENT; RETURN) (BEGIN EITHER (SKIP) (BEGIN SEE-STATEMENT; READ (SEMI) END) END; END) END; EITHER (BEGIN READ (READ) ; READ(OPEN,TERMINAL); EITHER (READ (CLOSE)) (ERROR (MULTIPLE-RE AD-IN-.EITHER) ) END; END) (SCOPE-STATEMENT) (EITHER-STATEMENT) (READ (NT) ) (READ (SKIP) ) (ESCAPE-STATEMENT; BETUBN) END ; EITHER (READ (SEMI) ) (RETURN) END; EITHER (ESCAPE-STATEMENT) (APPLY-STATEMENT) END; _ l _ 20 Although the d e f i n i t i o n o f v a l i d a l t e r n a t i v e s may seem r a t h e r a r b i t r a r y , i t a c t u a l l y i s n * t . The d e f i n i t i o n i s l a r g e l y i n f l u e n c e d by the d e s i r e to e n f o r c e a simple parse stack d i s c i p l i n e . The parse stack i s e x p l a i n e d i n s e c t i o n s 2.6 and 4.2.5. These s e c t i o n s should help the reader understand why a l t e r n a t i v e s were d e f i n e d the way they were. The reader should a l s o note that s i n c e an a l t e r n a t i v e , can begin with a scope, almost anything can e f f e c t i v e l y s t a r t an e i t h e r a l t e r n a t i v e . 2.6 THEORETICAL EXECUTION OF PARSLEY The f o l l o w i n g s e c t i o n s e x p l a i n how a P a r s l e y program should t h e o r e t i c a l l y execute. The e x e c u t i o n presented here i s not p r a c t i c a l i n r e a l p arsing s i t u a t i o n s as w i l l become g u i c k l y obvious. However, the understanding of t h i s t h e o r e t i c a l e x e c u t i o n i s u s e f u l i n the understanding of what a P a r s l e y program means. 2.6 . 1 THE PARSLEY INTERPRETER T h i s s e c t i o n presents the b a s i c p a r t s o f the i n t e r p r e t e r which executes P a r s l e y programs. The i n t e r p r e t e r has a code area which c o n t a i n s the P a r s l e y program to be executed. T h i s area cannot be modified by i n t e r p r e t e r e x e c u t i o n . P o i n t s w i t h i n the code area can be i d e n t i f i e d by p o i n t e r s ( a d d r e s s e s ) i n t o i t . The i n t e r p r e t e r has 21 a p o i n t e r i n t o the code area c a l l e d CAP. T h i s p o i n t e r i d e n t i f i e s what to execute next i n the program. The p o i n t e r i s a u t o m a t i c a l l y moved from one P a r s l e y statement to the next i n the code area unless e x p l i c i t l y a l t e r e d . The i n t e r p r e t e r has an i n p u t stream of a f i n i t e l i s t of t e r m i n a l tokens. The l a s t token of t h i s stream i s the unigue end_of_input_stream token(EIST). T h i s s p e c i a l token appears nowhere e l s e i n the i n p u t stream. The tokens i n the stream are made up by the set of t e r m i n a l s i n the program to be executed and EIST. The stream i t s e l f i s not m o d i f i a b l e . Tokens i n the stream can be d i s t i n g u i s h e d by p o i n t e r s . The i n t e r p r e t e r has a p o i n t e r i n t o the token stream c a l l e d the c u r r e n t token p o i n t e r (CTP) which d i s t i n g u i s h e s the next token i n the i n p u t stream f o r the parse to c o n s i d e r . The i n t e r p r e t e r has two modes of e x e c u t i o n , parse mode (PM) and d e c i s i o n mode(DM). The i n t e r p r e t e r i s i n parse mode when i t i s a c t i v e l y p a r s i n g the i n p u t stream and executing apply statements. In d e c i s i o n mode apply statements are not executed. D e c i s i o n mode i s entered when the i n t e r p r e t e r must decide between s e v e r a l p o s s i b l e paths. The d e c i s i o n i s made by s i m u l a t i n g the paths to f i n d which path to a c t u a l l y f o l l o w . T h i s s i m u l a t i o n cannot execute apply statements because the e x t e r n a l e f f e c t may not be r e v e r s i b l e . Whether or not t o execute apply statements depends on the c u r r e n t mode of the i n t e r p r e t e r which can be found i n MODE. 22 The i n t e r p r e t e r has a mechanism which i s . used t o c o n t r o l the e v a l u a t i o n of the P a r s l e y e i t h e r statement. The ex e c u t i o n of an e i t h e r statement f o r c e s the i n t e r p r e t e r t o make a c h o i c e between s e v e r a l p o s s i b l e paths. The path t o take i s not always immediately apparent t o the i n t e r p r e t e r and t h e r e f o r e some mechanism t o choose between the a l t e r n a t i v e s must be a v a i l a b l e . The b a s i s of t h i s mechanism i s to s i m u l a t e parses of the a l t e r n a t i v e s i n d e c i s i o n mode and to s e l e c t the path which all o w s the parse to proceed the f u r t h e s t number of tokens down the i n p u t stream without syntax e r r o r . T h i s s i m u l a t i o n i s f a c i l a t a t e d by the two v a r i a b l e s search depth (SD) and search depth l e f t ( S D L ) , as w e l l as the two l i s t s THY and TESTED. SD i s the number of tokens which a path must s u c c e s s f u l l y parse t o remain a c c e p t a b l e . SDL i s the number o f tokens which a p a r t i c u l a r path must s t i l l parse to remain a c c e p t a b l e . TRY i s a l i s t of a l l the paths which must be t r i e d with the c u r r e n t SD. I n i t i a l l y , TRY i s the l i s t of the e i t h e r a l t e r n a t i v e s . L a t e r , TRY i s the l i s t of a l l a l t e r n a t i v e s a c c e p t a b l e up to a search depth one l e s s than the c u r r e n t search depth. TESTED i s a l i s t of a l l the a l t e r n a t i v e s from TRY which were a c c e p t a b l e f o r a search depth of SD. I f a f t e r a l l elements o f the l i s t TRY have been processed f o r a given SD and TESTED i s l o n g e r than one element then the mechanism increments SD, s e t s TRY to TESTED, empties TESTED, and goes through the process a g a i n . The s e l e c t i o n mechanism o u t l i n e d above can h a l t i n three d i f f e r e n t ways. The mechanism can succeed by f i n d i n g a s i n g l e path f o r some SD. The mechanism can f a i l by f i n d i n g no a c c e p t a b l e path 23 for some SD, and more than one acceptable path for SD-1. This f a i l u r e indicates a syntax error.. F i n a l l y , the mechanism can f a i l because two or more paths can successfully parse the e n t i r e input stream. This means that there e x i s t s an ambiguity and i s grounds to declare the parse unsuccessful. The evaluation of an error alternative i n an either statement causes the variable AET (active error technique) to be set to the argument of the error statement. The value of AET w i l l be returned to i d e n t i f y the syntax error i f the either statement execution f a i l s because there i s no acceptable path for the parse to follow. In cases where there i s no error alternative i n a either statement which f a i l s , the s p e c i a l error i d e n t i f i c a t i o n NONE i s returned. The interpreter has three stacks. There i s a parse stack, an execution stack, and an environment stack. , The parse stack i s used to hold tokens from the input stream. I t can also hold the s p e c i a l empty token ET. This stack, and i t s height, i s available to the user semantic routines. The height of the parse stack cannot be affected by the semantic routines and i s controlled by interpreter execution. The execution stack i s used to control procedure and scope execution. The stack i s pushed on procedure invocations and on 24 scope entries. It i s popped on procedure returns and on scope completions. The following i s a d e f i n i t i o n f o r an element of th i s stack. TYPE EXECUTION_STACK_£LEMENT IS: (PROCEDURE,SCOPE) 7E1AG) , {PROCEDURE_NAME,SCOPE_NAME}(NAME), {CODE_AREA_POINTER}(RESTART_POINTER, COMPI>ETIGN_POI NT ER), {INPUT_STREAM POINTER}(SAVE~CTP), {INTEGER} (PARSE_STACK_ITEMS_PUSHED) END; The FLAG f i e l d of the execution stack i s used on the execution of a procedure completion. A l l scopes entered since the procedure was f i r s t invoked must be completed before the procedure can be completed. A procedure completion causes the execution stack to be popped u n t i l a FLAG i d e n t i f i n g a procedure i s located. Since Parsley only allows single l e v e l procedure escapes t h i s procedure stack element must belong to the procedure being completed. The NAME f i e l d of the execution stack i s used i n multi-level scope escapes. A scope escape with no scope name always refers to the fop execution stack element, otherwise stack elements are popped off u n t i l an element has a NAME matching the scope name of the escape. The two code area pointers are used to resume execution after an escape. Procedure and scope completions cause execution to resume at COMPIETION_POINTER. A scope r e p e t i t i o n causes execution to resume at RESTART_POINTER. The token stream pointer, SAVE_CTP, i s used i n the detection of i n f i n i t e loops. Hhen a scope i s entered the current token pointer (CTP) i s saved i n SAVE_CTP. I f when a scope i s repeated and CTP eguals SAVE_CTP then the scope 25 being repeated went through a loop without passing over any tokens i n the input stream. Since Parsley has no hidden conditions a l l future passages through the loop w i l l follow exactly the same path and the interpreter i s caught i n an i n f i n i t e loop. The interpreter recognizes t h i s problem and stops i t s e l f . The l a s t f i e l d , PARSE_STACK_ITEMS_PUSHED, i s used to coordinate the parse stack height to procedure and scope execution. Each procedure and scope upon completion must leave the parse stack exactly one element higher than the parse stack was when the procedure or scope was begun. Therefore, procedure and scope completions pop the parse stack by one l e s s than the associated PARSE_STACK_ITEJ3S_PUSHED f i e l d . (A pop of minus one i s assumed to mean a POSH_EHPTY.), and a scope r e p e t i t i o n must pop the parse stack by the number in the associated PARSE_STACK_ITEflS_PUSHED f i e l d . The environment stack i s used to save the entire state of the interpreter except for CAP. This stack i s used during the evaluation of an either statement to save the state of the interpreter so that when a path has been selected the interpreter can be restored to i t s previous state except that the code area pointer can be modified to point at the selected path to follow. A stack i s necessary to handle the p o s s i b i l i t y of nested either statements. 26 2.6.2 SOME INTERPRETER ROUTINES T h i s s e c t i o n presents some r o u t i n e s which w i l l be used by the i n t e r p r e t e r . The SYNTAX_ERROB(E) r o u t i n e i s executed when the i n p u t stream i s not a c c e p t a b l e t o the P a r s l e y s p e c i f i c a t i o n . . I t w i l l h a l t the parse and r e t u r n i t s argument as the i d e n t i f i c a t i o n of the problem. T h e o r e t i c a l l y , no c o n t i n u a t i o n i s p o s s i b l e at t h i s p o i n t , but of course p r a c t i c a l implementations would allow the t r a n s l a t o r t o i n t e r a c t to c o r r e c t the problem. The AMBIGUITY^ERROR r o u t i n e i s executed when two or more complete parses are p o s s i b l e . The a c t i o n of t h i s r o u t i n e i s to d e c l a r e t h i s as a f a t a l e r r o r which h a l t s the machine. The APPLY (S) r o u t i n e i s executed whenever a P a r s l e y statement i s executed i n parse mode. I t causes the semantic r o u t i n e S to be executed. The VALID_PATH r o u t i n e i s only executed i n d e c i s i o n mode. I t causes e x e c u t i o n to resume i n the e i t h e r statement s c r i p t which caused i n t e r p r e t a t i o n to e v e n t u a l l y execute the VALID_PATH. I t y i e l d s a r e t u r n code of 'VALID'. The INVALID_PATH r o u t i n e i s a l s o only executed i n d e c i s i o n mode. I t causes execution t o resume i n the e i t h e r statement 27 s c r i p t which caused interpretation to eventually execute the INVALID_PATH. I t yiel d s a return code of 'INVALID*. The POP and PUSH routines have the expected meaning. The RESTORE_ENVIRONMENT routine resets the interpreter's state to that of the interpreter when the corresponding PUSH was executed on the environment stack. I t of course does not e f f e c t the value of CAP. • The GO routine causes execution to proceed with the Parsley statement pointed at by CAP. 2.6.3 INTERPRETER INITIAL CONDITIONS CAP i s pointing at the f i r s t statement of the GOAL PROCEDURE. CTP i s pointing at the f i r s t token of the input stream. MODE i s PM. AET i s NONE. SD, SDL are 0. TRY, TESTED are empty. PARSE STACK i s empty. EXECUTION STACK has one element. FLAG=PROCEDURE NAME=G0AL PROCEDURE NAME RESTART_POINTER=COMPIETION_POINTER=null 28 SAVE_CTP=CTP PARSE_STACK_ITEMS_PUSHED=0 ENVIRONMENT STACK i s empty. 2.6.4 INTERPRETER EXECUTION T h i s s e c t i o n e x p l a i n s what the i n t e r p r e t e r does on each P a r s l e y c o n s t r u c t . The s c r i p t f o r each c o n s t r u c t i s w r i t t e n i n a simple n o n - e x i s t e n t programming language. Comments are pl a c e d i n s i d e •/** and **/'. Phrases nested i n s i d e »*(* and * ) ' are e n g l i s h phrases d e s c r i b i n g simple concepts which are d i f f i c u l t t o express i n the o r d i n a r y s c r i p t . Hhen p o i n t e r s i n t o the token stream, are f o l l o w e d by 'S 1 t h i s means to get the token they p o i n t a t . PROCEDURE CALL EXECUTION Executed when a P a r s l e y procedure c a l l statement i s t o be i n t e r p r e t e d . (1) PUSH (EXEC0TION_STACK) <= FLAG=PROCEDURE NAME=PROCEDU RE NAME R ESTA RT_POINTER=NULL COMPLETION_POINTER=*(JUST AFTER PROCEDURE CALL STATEMENT) SAVE_CTP=CTP PAESE_STACK_ITEMS_PUSHED=0 (2) CAP:=*(FIRST STATEMENT OF PROCEDURE CALLED) GO 29 READ STATEMENT EXECUTION Executed when a P a r s l e y read statement i s t o be i n t e r p r e t e d . , (1) IF CTPS=*(READ ARGUMENT) IF MODE=DM SDL:=SDL-1 IF SDL=0 EXECUTE VALID_PATH ELSE IF MODE=DM EXECUTE INVALID_PATH ELSE SYNTAX_ERROR(NONE) (2) PUSH (PARSE_STACK) <= CTP3 EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED+:=1 CTP:=NEXT_TOKEN(CTP) GO APPLY STATEMENT EXECUTION Executed when a P a r s l e y apply statement i s t o be i n t e r p r e t e d . (1 ) IF MODE=PM EXECUTE APPLY(APPLY ARGUMENT) GO ENTERING SCOPEiBLOCK OR CYCLE! Executed when a BEGIN or CYCLE i s t o be i n t e r p r e t e d . (1) PUSH(EXECUTION_STACK) <= FLAG=SCOPE NAME=*(SCOPE NAME OR NO_NAME IF SCOPE IS NOT NAMED) RESTART_POINTER=*(JUST AFTER BEGIN OR CYCLE) C0MP1ETI0N_P0INTER=*(JUST AFTER END OF SCOPE) SAVE_CTP=CTP PARSE_STACK_ITEMS_PUSHED=0 (2) CAP:=EXECUTION_STACK.RESTART_P0INTER GO 30 END OF BLOCK EXECUTION Executed shea the end of a BLOCK i s to be interpreted, (1) IF EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED=0 PUSH(PARSE STACK) <= ET ELSE IF EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED > 1 POP(PARSE_STACK, EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED-1) (2) *(SAVE EXECUTION_STACK.COMPLETION_POINTER) POP(EXECUTION_STACK,1) EXECUTION_STACK.PARSE_STACK_iITEMS_PUSHED+: = 1 (3) CAP;=*(SAVED C0MP1ETI0N_P0INTER) GO 1 1 2 Of CYCLE EXECUTION Executed when the end of a CYCLE i s to be interpreted. (1) IF EXECUTION_STACK.SAVED_CTP=CTP EXECUTE SYNTAX_ERROR (NONE) /* THIS CONDITION MEANS THAT.THE SCOPE IS BEING REPEATED 9ITH NO CHANGE IN THE INPUT STREAM. THIS IS AN INFINITE LOOP */ (2) POP (PARSE_STACK, EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED) EXECUTION~STACK.PARSE_STACK_ITEMS_PUSHED:=0 EXECUTION~STACK.SAVE_CTP:=CTP (3) CAP:=EXECUTION_STACK.RESTART_POINTER GO END OF PROCEDURE EXECUTION Executed when the end of a PROCEDURE (GOALPOST) i s to be interpreted. (1) IF EXECUTION_STACK.NAME = GOAL PROCEDUBE MA ME IF CTPa = EIST IF MODE = DM IF SDL = 1 EXECUTE VALID_PATH ELSE EXECUTE AMBIGUITY^ERBOB ELSE EXECUTE HALT /* PABSE SUCCESSFUL */ ELSE IF MODE = DM EXECUTE INVALID_PATH ELSE SYNTAX_ERBQR (NONE) DO END OF BLOCK EXECUTION(1,2) (2) CAP:=*(SAVED COMPLETION POINTER) GO SCOPE EXIT EXECUTION Executed when the a scope e x i t i s to be interpreted. (1) IF * (EXIT STARTS WITH AN APPLY) IF MODE = PM EXECUTE APPLY(APPLY ARGUMENT) (2) IF * (SCOPE EXIT NAME) -.= NO_NAME WHILE EXECUTE_STACK. NAME -.= SCOPE EXIT NAME DO END OF BLOCK EXECUTION (1,2) (3) DO END OF BLOCK EXECUTION(1,2) (4) IF *(EXIT CLOSES WITH AN APPLY) IF MODE ~ PM EXECUTE APPLY(APPLY ARGUMENT) CAP:=* (SAVED COMPLETION POINTER) GO SCOPE REPEAT EXECUTION Executed when the a scope repeat i s to be interpreted. (1) DO SCOPE EXIT EXECUTION (1,2) (2) DO END OF CYCLE EXECUTION(1,2) 32 (3) IF * (REPEAT CLOSES WITH AN APPLY) IF MODE = PM EXECUTE APPLY(APPLY ARGUMENT) CAP:=EXECUTIQN_STACK.RESTART,POINTER GO PROCEDURE RETURN EXECUTION Executed when a procedure return statement i s to be interpreted. (1) IF * (RETURN STARTS WITH AN APPLY) IF MODE = PM EXECUTE APPLY (APPLY ARGUMENT) (2) WHILE EXECUTION STACK. FLAG -.= PROCEDURE DO END OF BLOCK EXECUTION(1,2) (3) DO END OF PROCEDURE EXECUTION(1) (4) IF * (RETURN CLOSES WITH AN APPLY) IF MODE = PM EXECUTE APPLY (APPLY ARGUMENT) CAP:=*(SAVED COMPLETION POINTEH) GO SKIP STATEMENT EXECUTION Executed when a skip statement i s to be interpreted., (1) PUSH (PARSE_STACK) <= ET EXECUTION_STACK.PARSE_STACK_ITEMS_PUSHED+:=1 GO SEE STATEMENT EXECUTION Executed when a see statement i s to be interpreted. (1) IF *(CTPS DOES NOT HATCH ANY OF THE SEE ARGUMENTS) EXECUTE INVALID PATH GO NOTSEE STATEMENT EXECUTION Executed when a notsee statement i s t o be i n t e r p r e t e d . (1) IF * (CTPS MATCHES ANY OF THE NOTSEE ARGUMENTS) EXECUTE INVALID_PATH GO ERROR STATEMENT EXECUTION Executed when a e r r o r statement i s to be i n t e r p r e t e d . (1) AET:=*(ERROR ARGUMENT) EXECUTE INVALID PATH EITHER STATEMENT EXECUTION Executed when a e i t h e r statement i s t o be i n t e r p r e t e d . (1) PUSH (ENVIRONMENT_STACK) <= * (EVERYTHING BUT CAP) (2) MODE;=DM IF ENVIRONHENT_STACK.MODE = PM SD: = 1 ELSE SD:=ENVIRONMENT_STACK.SDL (3) TRY;=*(ALL EITHER ALTERNATIVES) TESIED:=EMPTY AET:=NONE (4) WHILE TRY NOT EMPTY CAP:=TRY.ALTERNATIVE SDL:=SD 34 GO / * CONTROL HILL RETURN HERE ON VALID_PATH AND INVALID_PATH EXECUTION * / IF RETURN CODE = VALID IF ENVIRONMENT_STACK.MODE = DM RESTORE ENVIRONMENT EXECDTE~VALID_PATH ELSE ADD TRY.ALTERNATIVE TO TESTED TRY:=TRY.NEXT CTP:=ENVIRONMENT_STACK.CTP (5) IF TESTED EMPTY IF ENVIRONMENT_STACK.MODE = PM EXECUTE SYNTAX_ERROR(AET) ELSE RESTORE ENVIRONMENT EXECUTE INVALID_PATH ELSE IF * (TESTED ONE LONG) CAP:=TRY.ALTERNATIVE RESTORE_ENVIRONMENT GO / * EXECUTION IN PM OF SINGLE VALID PATH STARTS * / ELSE SD:=SD+1 / * PATHS ARE NOT SEPARABLE AT PRESENT DEPTH V TRY;=TESTED TESTED:=EMPTY GOTO STEP 4 2.7 DESCRIPTIVE ABILITY OF PARSLEY I t was p r e v i o u s l y s t a t e d t h a t p a r s i n g languages f o r programming language i m p l e m e n t a t i o n s h o u l d be c a p a b l e of d e s c r i b i n g the c o n t e x t f r e e l a n g u a g e s . P a r s l e y i s c a p a b l e o f d e s c r i b i n g e x a c t l y t h i s s e t of l a n g u a g e s . t I f s h o u l d be o b v i o u s to the r e a d e r t h a t any c o n t e x t f r e e grammar can be t r i v i a l l y t r a n s l a t e d i n t o a P a r s l e y program. Each grammar n o n t e r m i n a l w i l l be a s e p a r a t e p r o c e d u r e . Each procedure w i l l be an e i t h e r s tatement w i t h an a l t e r n a t i v e f o r each p r o d u c t i o n o f the a s s o c i a t e d n o n t e r m i n a l . T h e r e f o r e , 3 5 P a r s l e y can d e s c r i b e a l l languages r e p r e s e n t a b l e by the context f r e e grammars. T h i s means P a r s l e y i s at l e a s t context f r e e i n power. , He must now show that Parsley, cannot d e s c r i b e any language which i s not c o n t e x t f r e e . ; F i r s t , we w i l l throw out a few p a r t s of P a r s l e y which do not a f f e c t the languages P a r s l e y can d e s c r i b e . O b v i o u s l y , apply statements and e r r o r statements are i n t h i s c l a s s . S i m i l a r l y , the look statements are not necessary. T h i s i s c l e a r when one c o n s i d e r s t h a t they are u n i f o r m l y a p p l i e d without r e f e r e n c e to c o n t e x t . The removal of the look statement can be e a s i l y done by copying the look r e s t r i c t e d a l t e r n a t i v e and modifying i t to e x p l i c i t l y r e f l e c t the r e s t r i c t i o n . T h i s m o d i f i c a t i o n may e n t a i l the a d d i t i o n of new procedures when the a l t e r n a t i v e c o n t a i n s a procedure c a l l . I t i s now simple to r e p r e s e n t P a r s l e y programs as e i t h e r syntax c h a r t s or r e c u r s i v e t r a n s i t i o n networks. Both of these have been shown to be context f r e e by Lomet 1 3 and Hoods l * r e s p e c t i v e l y . T h e r e f o r e , P a r s l e y r e p r e s e n t s e x a c t l y the context f r e e languages. As an example c o n s i d e r Appendix 1 which i s a d e f i n i t i o n of P a r s l e y i n P a r s l e y , and Appendix 2 which i s a d e f i n i t i o n of P a r s l e y i n a context f r e e grammar. T h i s example demonstrates t h a t P a r s l e y and the context f r e e grammars can d e p i c t the same languages. 36 F i n a l l y , Parsley supplies a l l of the three basic structures claimed to be necessary. A recursive procedure, an alternative structure, and an i t e r a t i v e structure are a l l available. 2.8 SEMANTIC ACTION POSITIONING Semantic breakpoints can be e a s i l y introduced into a s p e c i f i c a t i o n with the Parsley apply statement. This statement, being placed right i n the s p e c i f i c a t i o n , c l e a r l y shows the r e l a t i o n of parse and semantics. The a b i l i t y to construct large semantic actions from primitive ones i s also available. Several apply statements can appear i n order. In f a c t , a shorthand has been supplied f o r t h i s event. , 2.9 ERROR TRAPS Parsley allows the positioning of error traps where ever the user wants spe c i a l error handling. However, there i s a tr i c k y problem with the positioning of error statements. I t i s sometimes d i f f i c u l t to know where to position the traps so that they w i l l be sprung. The following example i l l u s t r a t e s t h i s problem. EXAMPLE LET ARGUMENT-LIST BE: READ (OPEN) ; ARGUMENT-ELEMENTS; EITHER (READ (CLOSE) ) (ERROR(MISSING-CLOSE)) END _ l _ 37 LET ARGUMENT-ELEMENTS BE: CYCLE ARGUMENT; EITHER (EXIT) (READ (COMMA) ) END END _ l _ The problem l i e s i n that the cycle w i l l be exited when a 'CLOSE* i s the next input token, and w i l l be repeated when a * COMMA' i s the next input token. I f a 'CLOSE' was actually missing the error would be encountered i n 'ARGUMENT-ELEMENTS' and the trap i n 'ARGUMENT-LIST' would not be sprung. In general, the user has to be ca r e f u l i n positioning error traps to insure t h e i r correct operation., The above example can be e a s i l y corrected by moving the error alternative to the either statement i n ARGUMENT-ELEMENTS. This a l t e r a t i o n would cause 'MISSING-CLOSE* to be returned when an argument l i s t was missing a terminating 'CLOSE'. 38 CHAPTER 3. COMPILING PARSLEY T h i s chapter presents a P a r s l e y c o m p i l e r . T h i s compiler, which i s a c t u a l l y used i n the implementation, i s necessary f o r p r a c t i c a l p a r s i n g . The P a r s l e y i n t e r p r e t e r , presented i n the l a s t chapter, i s too i n e f f i c i e n t . The method of s e p a r a t i n g p o s s i b l e parse paths i s not p r a c t i c a l . In a d d i t i o n , the parse stack c o n t r o l of the i n t e r p r e t e r can be p r e - c a l c u l a t e d . The compiler w i l l take a P a r s l e y program and attempt to produce "an e q u i v a l e n t d e t e r m i n i s t i c p a r s e r . During the c o m p i l a t i o n process a l l a m b i g u i t i e s w i l l be found. The one major drawback to t h i s , proposed c o m p i l a t i o n i s t h a t not a l l l e g a l unambiguous P a r s l e y programs can be s u c c e s s f u l l y compiled using standard p a r s i n g techniques. . As an example of t h i s c o n s i d e r the LL (K) , SLR (K) , LALfi (K) , and LR (K) p a r s i n g techniques as u s u a l l y a p p l i e d t o context f r e e grammmars. A l l f o u r of these techniques make r e s t r i c t i o n s on the context f r e e grammars f o r which p a r s e r s can be produced. Programs which cannot be compiled are c a l l e d inadequate. 39 There are a c t u a l l y two b a s i c approaches p o s s i b l e f o r the c o m p i l a t i o n , or parse r g e n e r a t i o n , f o r P a r s l e y programs. The c o m p i l a t i o n approach can be e i t h e r • p r e d i c t i v e * or * n o n - p r e d i c t i v e • . These two approaches d i f f e r on t h e i r h a n d l i n g of the P a r s l e y procedure c a l l . The p r e d i c t i v e approach r e g u i r e s t h a t a procedure e n t r y be immediately i d e n t i f i e d , while the n o n - p r e d i c t i v e approach does not make t h i s reguirement. To demonstrate the d i f f e r e n c e c o n s i d e r the f o l l o w i n g example. Suppose a parse encounters a c h o i c e .between c a l l i n g a procedure P or re a d i n g a token X. Furthermore, suppose t h a t the f i r s t a c t i o n of the procedure P i s t o read the token X. In the p r e d i c t i v e approach the d e c i s i o n of which path to take c o u l d not be made without u t i l i z i n g some p a r t of the in p u t stream a f t e r the token X. However, i n the n o n - p r e d i c t i v e approach, the ch o i c e c o u l d be put o f f , the token X read, and the parses would sepa r a t e a t some l a t e r p o i n t . The preceding example demonstrates t h a t the n o n - p r e d i c t i v e approach can make parse d e c i s i o n s with l e s s work than the comparable p r e d i c t i v e one. In a d d i t i o n , i f the d e c i s i o n mechanism i s r e s t r i c t e d to a f i n i t e number of tokens ahead of the c u r r e n t parse p o i n t (as i s the case f o r the LL (K) and LE (K) technigues) then the p r e d i c t i v e approach cannot even handle some P a r s l e y programs which the n o n - p r e d i c t i v e approach can. Since the common p a r s i n g technigues enforce t h i s r e s t r i c t i o n i t must be c o n s i d e r e d i n choosing between the p r e d i c t i v e and n o n - p r e d i c t i v e approaches., 40 Another c o n s i d e r a t i o n i n the c h o i c e between u t i l i z i n g the p r e d i c t i v e or n o n - p r e d i c t i v e approach i s the e a s i e r syntax e r r o r r e c o v e r y i n the p r e d i c t i v e paradigm. The b a s i c reason f o r t h i s enhanced e r r o r recovery i s that a p r e d i c t i v e l y generated parser w i l l be able t o know what procedure i t i s i n , and h o p e f u l l y have a b e t t e r i d e a of how to c o r r e c t the erroneous i n p u t stream t o allo w t r a n s l a t i o n to proceed. The c u r r e n t implementation took the n o n - p r e d i c t i v e approach. T h i s approach was taken because of the g r e a t e r number of compilable P a r s l e y programs. I t was a l s o taken because a p r e d i c t i v e parser can be generated from the n o n - p r e d i c t i v e approach. A p r e d i c t i v e parser can be c r e a t e d by s t a r t i n g each procedure with an apply statement which uniquely i d e n t i f i e s the procedure. T h i s w i l l f o r c e the compiler to d i f f e r e n t i a t e between paths at the e n t r y of each procedure. T h i s f e a t u r e i s p a r t i c u l a r l y u s e f u l when a s p e c i f i c a t i o n i s q e n e r a l l y w i t h i n the ranqe of the p r e d i c t i v e approach but has a few odd t w i s t s so tha t p a r t s of the s p e c i f i c a t i o n need the non-rpredictive power. T h i s a l l o w s the s p e c i f i c a t i o n t o be p r e d i c t i v e (and hence b e t t e r e r r o r recovery) as much as p o s s i b l e , and n o n - p r e d i c t i v e only where necessary. 41 3.1 GENERAL COMPILATION PROCESS The c o m p i l a t i o n process i s done by a s e r i e s of t r a n s l a t i o n s from one machine t c another. The process s t a r t s with a P a r s l e y program (considered a machine) whose syntax and semantics were d e s c r i b e d i n the l a s t chapter. The P a r s l e y program i s then t r a n s l a t e d i n t o an i n t e r m e d i a t e n o n - d e t e r m i n i s t i c machine. During t h i s t r a n s l a t i o n the v a r i o u s parse s t a c k o p e r a t i o n s are ' b u i l t ' i n t o the new machine. The n o n - d e t e r m i n i s t i c machine i s then t r a n s l a t e d i n t o the f i n a l d e t e r m i n i s t i c machine. During t h i s t r a n s l a t i o n d e t e r m i n i s t i c d e c i s i o n mechanisms are i n t r o d u c e d . The i n t r o d u c t i o n of the new d e c i s i o n mechanism, to r e p l a c e the pr e v i o u s b a c k t r a c k i n g d e c i s i o n mechanism, i s not always p o s s i b l e . When i t i s not p o s s i b l e the c o m p i l a t i o n process i s d e c l a r e d inadeguate and f a i l s . In these cases the P a r s l e y program must be a l t e r e d t o c o r r e c t the d e f e c t . What causes the d e f e c t and how to a l t e r a program t o c o r r e c t i t i s d i s c u s s e d i n the next chapter. The f i n a l d e t e r m i n i s t i c machine, i f s u c c e s s f u l l y produced, i s outputted as parser t a b l e s . I t i s these parser t a b l e s which the t r a n s l a t o r i n t e r p r e t s t o e f f e c t the parse. 3.2 NON-DETERMINISTIC INTERMEDIATE MACHINE T h i s machine i s composed of s t a t e s and t r a n s i t i o n s . b e t w e e n s t a t e s . Each s t a t e has a l i s t of t r a n s i t i o n s a s s o c i a t e d with i t . Each s t a t e can a l s o have an e r r o r r e c o v e r y technigue 42 a s s o c i a t e d with i t . Each t r a n s i t i o n can have a l i s t c a l l e d a • l o o k l i s t ' . These l o o k l i s t s correspond to the see and notsee statements of P a r s l e y . Unless t h e c u r r e n t i n p u t token i s a c c e p t a b l e to the l o o k l i s t , the t r a n s i t i o n w i l l f a i l . How a s t a t e decides between s e v e r a l p o s s i b l e t r a n s i t i o n s w i l l not be i l l u s t r a t e d , but i t can be assumed t h a t the d e c i s i o n w i l l m i r r o r t h a t of the p r e v i o u s l y d e f i n e d P a r s l e y i n t e r p r e t e r . F i n a l l y , i t i s i n t e r e s t i n g to note t h a t the same i n t e r m e d i a t e n o n - d e t e r m i n i s t i c machine can be used f o r both p r e d i c t i v e and n o n - p r e d i c t i v e c o m p i l a t i o n approaches. The f o l l o w i n g d e f i n e s the s i x t r a n s i t i o n s of t h i s new machine. READ (X,S) - T h i s t r a n s i t i o n i s what a P a r s l e y read statement i s mapped i n t o . I f X i s the c u r r e n t token then X i s pushed onto the parse s t a c k , the c u r r e n t token p o i n t e r i s moved along, and the machine e n t e r s s t a t e S. APPLY (A,S) - T h i s t r a n s i t i o n i s what a P a r s l e y apply statement i s mapped i n t o . The semantic a c t i o n A i s executed and the machine e n t e r s s t a t e S. CALL(P,S) - T h i s t r a n s i t i o n i s what a P a r s l e y procedure statement i s mapped i n t o . The procedure P i s c a l l e d ( i e . e nter the s t a t e which s t a r t s procedure P), on r e t u r n the machine e n t e r s s t a t e S. RETURN (P) - T h i s t r a n s i t i o n marks the r e t u r n p o i n t of a procedure. I t c o n s t i t u t e s p a r t of the t r a n s l a t i o n from procedure r e t u r n statements and end of procedure 43 e x e c u t i o n ( g o a l p o s t ) . The a s s o c i a t e d parse stack o p e r a t i o n s are handled by the next two t r a n s i t i o n s . The RETURN t r a n s i t i o n causes c o n t r o l to resume with a p r e v i o u s CALL t r a n s i t i o n . POP(#,S) - T h i s t r a n s i t i o n i s used f o r parse s t a c k alignment during scope and procedure e x e c u t i o n . I t pops # elements o f f the parse s t a c k and c o n t i n u e s i n s t a t e S. PUSHEMPTY(S) - T h i s t r a n s i t i o n i s a l s o used f o r parse stack alignment. I t pushes the s p e c i a l empty token (ET) onto the parse stack and the machine e n t e r s s t a t e S. The f o l l o w i n g i s a s h o r t example of how a simple P a r s l e y program c o u l d appear a f t e r t r a n s l a t i o n i n t o the new machine. 44 EXAMPLE LET P1 BE: BEAD (X1) ; APPLY(A1); CYCLE BEAD (X2) ; EITHER {EBBOB (E1) ) (P2) (READ (X3) ; EXIT) (SKIP) END; END; READ (X4) ; _ l _ LET P2 BE: BEAD (X3) ; APPLY (A2) ; BEAD (X3) ; _ l _ C O R R E S P O N D I N G N O N - D E T E R M I N I S T I C M A C H I N E STATE 1 : REAB(X1,2) STATE 2 : APPLY (A1#3) STATE 3 : READ(X2,4) STATE 4 : CALL (P2,5) . (El) READ(X3,6) PUSHEMPTY(5) STATE 5 : POP (2,3) /* POPS THE TWO TO REPEAT CYCLE */ STATE 6 : POP (1,7) /* POPS ONE TO EXIT CYCLE */ STATE 7 : READ(X4,8) STATE 8 : BETUBN(P1) STATE 9 : READ(X3,10) /* STARTS P2 */ STATE 10: APPLY (A2, 11) STATE 11: READ(X3,12) STATE 12: RETURN(P2) 45 3.3 FINAL DETERMINISTIC MACHINE The f i n a l deterministic machine i s very close to the previous machine. It has a l l of the tr a n s i t i o n s of the preceding machine plus t h i s new one. LOOKAHEAD(X,D,S) - This t r a n s i t i o n i s used to introduce the deterministic parse decisions into the machine. I f the token (D-1) afte r the current token matches X then parsing continues i n state S. The decision making process i s now guite simple. States w i l l now have only one t r a n s i t i o n unless the multiple t r a n s i t i o n s are a l l reads or a l l lookaheads of the same depth (D). A l l READ and LOOKAHEAD t r a n s i t i o n s i n a state must be on di f f e r e n t tokens. States with more than one t r a n s i t i o n w i l l try i t s t r a n s i t i o n s one by one. As soon as one succeeds the parse can continue and no backup i s necessary as i t i s the only t r a n s i t i o n using that token. I f none of the tra n s i t i o n s succeed then a syntax error has been discovered. The state where the error occurred w i l l return i t s associated error recovery technique i f i t has one, otherwise i t w i l l signal general recovery. The generation of the new decision mechanism u t i l i z e s the previous machine 1s l o o k l i s t s . Therefore, the l o o k l i s t s are i m p l i c i t l y b u i l t i n t o the deterministic machine and no longer ex i s t as modifiers to t r a n s i t i o n s . 46 The following examples show what a non-predictive t r a n s l a t i o n could appear l i k e . EXAMPLE PREDICTIVE DETERMINISTIC MACHINE predictive and STATE 1 : READ (X 1,2) STATE 2 : APPLY (A 1,3) STATE 3 : READ(X2,4) STATE 4 : LOOKAHEAD(X3,1,4A) (E1) LOOKAHEAD(X2,1,5A) STATE 4A: LOOKAHEAD(X3,2,4B) (E1) LOOKAHEAD(X4,2,6A) STATE 4B: CALL(P2,5) STATE 5 : POP (2,3) /* POPS THE T»0 TO REPEAT CYCLE */ STATE 5A: PDSHEMPTY(5) STATE 6 : POP (1,7) •/* POPS ONE TO EXIT CYCLE */ STATE 6A: READ(X3,6) STATE 7 : READ(X4,8) STATE 8 : RETURN (PI) STATE 9 : READ(X3,10) /* STARTS P2 */ STATE 10: APPLY (A2, 11) STATE 11: READ(X3,12) STATE 12: RETURN(P2) 47 N0J-PSJ3DICTIVE DETEfiMINISTIC MACHINE STATE 1 : . READ (X1,2) STATE 2 : APPLY (A 1,3) STATE 3 : READ(X2,4) STATE 4 : LOOKAHEAD (X3, 1,4A) (E1) LOOKAHEAD(X2,1,5A) STATE 4A: READ(X3,4B) /* THIS READ IS IN BOTH P1 AND P2 .*/ /* CALL (£2, 5) IS NOW IMPLIED */ STATE 4B: LOOKAHEAD(X3,1,10) LOOKAHEAD(X4,1,7) STATE 5 : POP (2,3) /* POPS THE TWO TO REPEAT CYCLE */ STATE 5A: PUSHEMPTY (5) STATE 6 : POP (1,7) /* POPS ONE TO EXIT CYCLE */ STATE 7 : READ(X4,8) STATE 8 : RETURN(P1) STATE 9 : READ(X3,10) /* IS NO LONGER USED */ STATE 10: APPLY (A2,11) STATE 11: READ(X3,12) STATE 12: RETURN (P2) 48 CHAPTER 4. USER'S GUIDE This section attempts to f a m i l i a r i z e the reader with the current implementation. I t also t r i e s to give some d i r e c t i o n to the prospective Parsley user. 4. 1 CURRENT IMPLEMENTATION The current implementation i s written in SUE 1 2 . I t can be run with the following MIS command. $RUN GREG:PARSLEY SCARDS=? SPRINT=? SPUNCH=? T=? PAR=?;? 4.1.1 USER PARAMETERS The system has f i v e parameters which can be set by the user. These parameters are placed i n the MTS p a r f i e l d following a semicolon. (The parameters before the semicolon are parameters for the SUE runpack.) The parameters are separated by commas, have default values, and may appear i n any order. 49 The parameter name and the parameter value must be separated by an •=». USER PARAMETERS NAME VALUE RANGE DEFAULT CONTROLS K £0,1. . . 15.) LIST {ON,OFF,ONLY) PUNCH {ON,OFF) STATES {ON,OFF,INADEQUATE) SYMBOLS {ON,OFF,INADEQUATE) 1 ON ON INADEQUATE ON LOOKAHEAD LISTING PARSE TABLES STATE DUMP SYMBOL TABLES EXAMPLE PAR=;K=3,STATES=OFF,LIST=OFF The effect of several of the above parameter values i s not immediately obvious. The LIST ONLY option stops any attempt at parser generation. Under t h i s option the system just produces a l i s t i n g . The INADEQUATE value for the STATES and SYMBOLS parameters causes a state dump or symbol table dump only when the parser generation i s inadequate ( f a i l s ) . 4.1.2 INPUT The input to the system i s a Parsley program followed by an en d - o f - f i l e . Any symbol must appear on one MTS l i n e . (Therefore, i d e n t i f i e r s have an implementation r e s t r i c t i o n of 256 characters.) Comments may run over an ar b i t r a r y number of 50 l i n e s , 4.1.3 OUTPUT The system produces output on both SPRINT and SPUNCH. A paragraphed l i s t i n g , sometimes followed by error messages and fable dumps, appears on SPRINT. The parser tables, i f produced, appear on SPUNCH. 4.1.3.1 PARAGRAPHED LISTING The paragraphed l i s t i n g i s designed to improve the rea d a b i l i t y of programs by enforcing a consistent indentation scheme. Procedure bodies and scope bodies are indented. In addition, procedure escapes and scope escapes are marked to make th e i r effect on program execution c l e a r e r . The user should note that i t i s not possible to get a l i s t i n g that i s not paragraphed from the system. The paragraphed l i s t i n g has two columns to the l e f t of the program l i s t i n g . The f i r s t column contains the current symbol count at the time the paragraphed l i n e i s begun. This count i s used throughout the system to i d e n t i f y locations i n the user's program. The user should be aware that the symbol count i n system communications i s only approximate., The second column contains the number of elements pushed onto the parse stack since the s t a r t of the procedure. A blank i n the second column 51 indicates the stack depth i s the same as the previous non-blank stack depth i n the column. The stack depth can be used to access symbols on the stack during translator execution. How t h i s stack depth i s calculated w i l l be explained l a t e r . 4.1.3.2 ERROR MESSAGES There are b a s i c a l l y two kinds of error messages generated by the system. There are syntax error messages, and there are parser generator error messages. Syntax error messages are generated during the f i r s t scan of. the source program. The occurrence of any syntax errors w i l l not allow parser generation to proceed. Parser generator error messages are generated during subseguent processing to create a deterministic parser. Syntax error messages always appear d i r e c t l y preceding the l i n e i n error on the paragraphed l i s t i n g . There are some syntax error messages which concern errors which are not s t r i c t l y speaking syntax errors. These messages concern non-syntactic errors i n program formation. The following l i s t i d e n t i f i e s these pa r t i c u l a r * syntax errors'. (1) unreachable statement - t h i s occurs when a statement cannot be reached. This usually happens d i r e c t l y after escapes, d i r e c t l y after scopes which are not exitted, and d i r e c t l y after either statements whose alternatives a l l pass control via escapes or an error a l t e r n a t i v e . procedure r e d e f i n e d - t h i s occurs when two or more procedures have the same name, go a l procedure c a l l e d - the goal procedure cannot be c a l l e d . , T h i s requirement i s necessary f o r p a r s e r g e n e r a t i o n . scope name problem - scopes must have matching names a t both t h e i r s t a r t and end. scopes cannot be given the name of a c u r r e n t l y open scope. scope i n f i n i t e loop - a scope must always have an escape r o u t e . B locks, of c o u r s e , have an i m p l i e d escape r o u t e at t h e i r end. - scopes must not be repeated p r i o r to i n p u t stream m o d i f i c a t i o n . I f they do, then i f the loop i s ever executed, they w i l l never escape. U n f o r t u n a t e l y , t h i s p a r t i c u l a r c o n d i t i o n cannot always be found d u r i n g source program i n p u t . I t w i l l be d i s c o v e r e d during p a r s e r g e n e r a t i o n i f i t was missed. procedure not e x i t t e d - i f a procedure does not c o n t a i n a r e t u r n , and the procedure end i s not a b l e t o be reached, then the procedure w i l l never r e t u r n . misused e r r o r statement - an e i t h e r statement cannot have more than one e r r o r a l t e r n a t i v e . an e i t h e r statement must have at l e a s t one non-error a l t e r n a t i v e . bad escape l a b e l - scope escapes must r e f e r to a 53 c u r r e n t l y open scope. (9) e r r o r i n look statement - a l l t e r m i n a l names i n the argument l i s t f o r a l o o k statement must be d i s t i n c t . Parser generator e r r o r messages appear a f t e r the paragraphed l i s t i n g . They inform the user about d i f f i c u l t i e s encountered i n p a r s e r g e n e r a t i o n . They u s u a l l y r e f e r t o a ' s t a t e * where the t r o u b l e manifested i t s e l f . There are e s s e n t i a l l y t h r e e types of parser generator e r r o r messages. These messages are concerned with misuse of the e r r o r statement, misuse of the look statement, or lookahead inadequacies. E r r o r messages concerning a misuse of an e r r o r statement are e s s e n t i a l l y j u s t warnings. They , are caused by a s i n g l e s t a t e being a s s o c i a t e d with s e v e r a l e r r o r names. A message can a l s o be generated when a s t a t e has •only one a s s o c i a t e d e r r o r name, but a path through t h a t s t a t e has no e r r o r name a s s o c i a t e d . These messages should be c a r e f u l l y s c r u t i n i z e d t o ensure understanding of which e r r o r technigue i s going to be c a l l e d i n what c o n t e x t . E r r o r messages concerning misuses of e r r o r statements do not stop parser g e n e r a t i o n . E r r o r messages concerning look statements occur when one of the arguments of a look statement c o u l d not l e g a l l y occur a t t h a t p o i n t . They c l e a r l y r e f l e c t a misunderstanding and stop p a r s e r g e n e r a t i o n . Messages about lookahead inadequacies happen when the system cannot produce a d e t e r m i n i s t i c parser u t i l i z i n g only K symbol lookahead. T h i s problem may be caused by an ambiguity i n the s p e c i f i c a t i o n , i n s u f f i c i e n t lookahead depth, or a r e s t r i c t i o n i n t r o d u c e d by the parse r g e n e r a t i o n t e c h n i q u e . Some h i n t s on how t o handle lookahead problems occur i n a l a t e r s e c t i o n . Lookahead ina d e g u a c i e s s t o p p a r s e r g e n e r a t i o n . 4.1.3.3 TABLE DUMPS There are some t a b l e s which may be produced depending on the s e t t i n g s of the user parameters. They are the s t a t e t a b l e and the symbol t a b l e s . The s t a t e t a b l e a l l o w s the user t o see i n s i d e the parser generator. T h i s i s sometimes necessary to understand why parser generator e r r o r messages o c c u r r e d . The s t a t e t a b l e can be viewed as a d e s c r i p t i o n o f a stack o r i e n t e d f i n i t e s t a t e machine. Each s t a t e i s a node, and each t r a n s i t i o n an a r c to another s t a t e . The c o n f i g u r a t i o n s are the v a r i o u s p o i n t s w i t h i n the user s p e c i f i c a t i o n t h at the s t a t e i s concerned with. I f a c o n f i g u r a t i o n d i s t i n g u i s h e s an e i t h e r statement, then a l l of the e i t h e r a l t e r n a t i v e s are of concern. S i m i l a r l y , procedure c a l l s cause the a c t i v a t i o n of the procedure's f i r s t statement. The symbol t a b l e s names, e r r o r names, give and a l i s t of procedure the t e r m i n a l names, apply names. These names are 55 a s s o c i a t e d with the number r e p r e s e n t i n g them i n the p a r s e r t a b l e s . (The numbers are assigned s e g u e n t i a l l y s t a r t i n g a t one.) The procedure names a l s o have a l i s t of source and d e s t i n a t i o n s t a t e s f o r procedure c a l l a r c s i n the generated machine. These s t a t e p a i r s can be of use f o r g e t t i n g around the s t a t e t a b l e . In g e n e r a l , the symbol t a b l e s are v a l u a b l e as an easy p l a c e to check f o r t y p i n g e r r o r s i n i d e n t i f i e r names. 4.1.3.4 PARSER TABLES At the c u r r e n t time the program produces PL360 source programs as an e x t e r n a l r e p r e s e n t a t i o n f o r the t a b l e s . These programs, a f t e r c o m p i l a t i o n , can be loaded with the r e s t of your t r a n s l a t o r . When c a l l e d , the programs w i l l r e t u r n the address of the i n i t i a l i z e d parse t a b l e s i n r e g i s t e r z e r o . The t a b l e format i s e x p l a i n e d i n an appendix. 4.2 PROGRAMMING POINTERS T h i s s e c t i o n i s designed to i n t r o d u c e the user to some concepts concerning P a r s l e y programming. I t w i l l a l s o t r y to g i v e some i n s i g h t i n t o the parser g e n e r a t i o n . 4.2.1 MENTAL APPROACH The proper a t t i t u d e i s extremely important to s u c c e s s f u l P a r s l e y programming. \ 56 You s h o u l d f i r s t f o r g e t the e x i s t e n c e of c o n t e x t f r e e grammars. , T h i s i s because c o n t e x t f r e e grammars have a more g e n e r a t i v e paradigm than P a r s l e y . Another problem with c o n t e x t f r e e grammars i s t h e i r l a c k o f an i t e r a t i v e s t r u c t u r e . I f you t h i n k i n a r e c u r s i v e manner f o r r e p e t i t i v e c o n s t r u c t s the f i n a l program w i l l not be as r e a d a b l e . T h e r e f o r e , d e s p i t e t h e t e m p t a t i o n to use past e x p e r i e n c e i n c o n t e x t f r e e grammars, DON* T ! One s h o u l d a l s o c o n s t a n t l y remind o n e s e l f t h a t P a r s l e y does not a c t l i k e most programming l a n g u a g e s . A P a r s l e y s p e c i f i c a t i o n p r e s e n t s a s e t of p o s s i b l e paths a t any p o i n t . C o m p a t i b l e paths can be o v e r l a y e d and s i m u l t a n e o u s l y f o l l o w e d as l o n g as p o s s i b l e , but i n c o m p a t i b l e p a t h s must be s e p a r a b l e w i t h i n a f i n i t e l o o k a h e a d . F i n a l l y , remember what you w r i t e w i l l have to be r e a d . G i v e the r e a d e r a b r e a k , i t might be YOU. 4 . 2 . 2 SOME GOOD PROGRAMMING PRACTICES Most s t r u c t u r e d programming t e c h n i q u e s are o f value i n P a r s l e y programming. One s h o u l d attempt t o break s p e c i f i c a t i o n s down i n t o r e a s o n a b l e s i z e d l o g i c a l u n i t s . . S m a l l , l o c a l p a r t s can be e a s i l y e x p r e s s e d as s c o p e s . Comments, due t o the p a r a l l e l nature of P a r s l e y , are e s p e c i a l l y v a l u a b l e . S i n g l e l e v e l scope escapes s h o u l d be used i n s t e a d o f m u l t i - l e v e l 57 escapes whenever p o s s i b l e . The m u l t i - l e v e l scope escape i s a u s e f u l t o o l , but can impair r e a d a b i l i t y i f used too much. There are some other p r a c t i c e s which are important t o P a r s l e y programming. When a c o n s t r u c t has an o p t i o n a l p a r t , the o p t i o n a l i t y should be expressed a t the c o n s t r u c t i n v o c a t i o n p o i n t r a t h e r than i n the c o n s t r u c t . T h i s makes the s p e c i f i c a t i o n c l e a r e r to the reader. Another good p r a c t i c e i s t o break down semantic a c t i o n s i n t o s m a l l p r i m i t i v e a c t i o n s . T h i s a l l o w s a reader a b e t t e r chance t o understand the flow of syntax and semantics. I t a l s o promotes the development of a c l e a n s e t of p r i m i t i v e a c t i o n s . EXAMPLE A BAD PROGRAMMING EXAMPLE LET BLOCK BE: BEAD (BEGIN) ; APPLY (STABT_A NEW_BLOCK) ; DECLARATIONS;" STATEMENT-LIST; APPLY(CLOSE_THE BLOCK); READ(ENS) _ ) _ LET DECLARATIONS BE: EITHER (RETURN AND APPLY(NO-DECLARATIONS) ) END; 58 A BETTEE SPECIFICATION LET BLOCK BE: BEAD(BEGIN); APPLY(PABAGRAPH_NEH_LINE,PARAGRAPH_INDENT, P0SH_NEH_SCOPE) ; EITHER (DECLARATIONS) (SKIP; APPLY(NO_DECLARATIONS) ) END; STATEMENT_LIST; APPLY(PARAGRAPH_NEW_LINE,PARAGRAPH_EXDENT, CHECK LABELS_DEFINED,POP_SCOPE); BEAD (END) ; _ l_ LET DECLARATIONS BE: 4.2.3 SOME PRACTICAL CONSIDERATIONS Despite the system's a b i l i t y to give you f i f t e e n symbol lookahead, don't use i t . Long lookaheads are i n e f f i c i e n t as symbols are handled more than once. Long lookaheads also usually forewarn of obscurity.. Generally, one symbol lookahead i s s u f f i c i e n t f o r most implementations. (The UBC implementation of Algol-6 8 has only four states with a three symbol lookahead 1 S.) The second p r a c t i c a l consideration i s when to use the Parsley error statement. I t must be obvious to the reader that not a l l syntax errors can be reasonably caught. The error statement should be i n i t i a l l y used f o r unusual, or obscure 59 constructs of your language. Later, a f t e r the tran s l a t o r i s running, additional error traps can be set to handle unexpected common errors and also errors which are poorly handled by the general recovery mechanism. 4.2.4 LOOKAHEAD INADEQUACIES Lookahead inadeguacies occur when the parser generator cannot decide between possible paths with a f i n i t e number of input symbol lookahead. The modification of the program to correct t h i s deficiency i s c a l l e d debugging. The rest of t h i s section attempts to outline some causes for inadequacies and possible solutions. There are four major ways that a lookahead inadequacy i s discovered. They are maximum lookahead depth reached, merged lookahead c a l c u l a t i o n s , previous inadeguacy encountered, and lack of goal procedure augmentation. A maximum lookahead depth inadeguacy can be caused by either an ambiguity, or an attempt to force a semantic action or escape too soon. The correction for t h i s problem l i e s i n the eradication of the ambiguity or i n moving the action, or escape further along i n the parse. Another possible solution i s the use of the Parsley look statement. 60 EXAMPLE LET DO_STATEMENT_HEAD BE:. READ (DG, VARIABLE, ASSIGN_.OP) ; APPLY (SET_LTP_DO,GET_D0 VARIABLE) ; EITHER (LIST_OF_D0_VALUES) /*D0 I:=1,3,10; .*/ (RANGE_OF_DO_VALUES) /* DO I: = 1 TO 5; */ END; APPLY (CLOSE_DO_HEAD) ; READ (SEMI) ; LET LIST_OF_DO_VALUES BE: APPLY(SET_UP_DO_LIST) ; CYCLE READ (INTEGER) ; APPLY (S AVE_DO_VALUE) ; EITHER (READ (COMMA) ) (EXIT) END; END; . _ i _ LET BANGE_OF_DO_VALUES BE: APPLY (SET~UP~DO_BANGE) ; READ(INTEGER,T07lNTEGER); APPLY(SAVE_INITIAL_DO_VALUE,SAVE_FINAL_DO_VALUE); In the preceding example the program attempted to di f f e r e n t i a t e between the two DO_STATEMENT_HEADs too early f o r one symbol lookahead. One symbol lookahead i s possible i f the apply SET_UP_DO_LIST and SET_UP_DO_RANGE are placed a f t e r the f i r s t INTEGER was read.^ At th i s point, the occurrence of a COMMA or a 'TO* would be s u f f i c i e n t to separate the actions. In f a c t , i f these applies were moved after the COMMA and the 'TO' then no lookahead would be necessary at a l l . 61 A merged lookahead c a l c u l a t i o n i s caused when a user attempts to s p l i t a parse path which can l a t e r reconnect without encountering a di f f e r e n t input symbol. This merging i s actually just the re s u l t of an ambiguity. One can often either respecify or use a look statement to solve t h i s problem. EXAMPLE LET STATEMENT-LIST BE: CYCLE EITHER (STATEMENT) (SKIP) END; EITHER (READ (SEMI) ) (EXIT) END; END; _ l _ LET STATEMENT BE: EITHER (SKIP) (IF-STATEMENT) END; _ l _ In the preceding example the occurrence of two semicolons without a separating non-empty statement could be parsed two ways. The empty statement could be reduced or the SKIP i n the f i r s t procedure could be used. The program w i l l have to be modified to either have the empty statement or to use that SKIP. 62 An inadequacy found when a p r e v i o u s l y inadequate s t a t e i s encountered can sometimes be s a f e l y i g n o r e d . The removal of the prev i o u s inadeguacy w i l l o f t e n c o r r e c t the c u r r e n t inadeguacy. However, sometimes you w i l l f i n d the previous inadeguate s t a t e was not a c t u a l l y p r e v i o u s l y mentioned as inadeguate. T h i s occurs when the lookahead c a l c u l a t i o n s wrap around i n t o s t a t e s used i n the c u r r e n t c a l c u l a t i o n s . T h i s problem i s caused by an ambiguity i n the s p e c i f i c a t i o n . An example of where t h i s occurs i s the Algol-60 'dangling e l s e * . The appendices i n c l u d e a s p e c i f i c a t i o n f o r the SP(K) 1 1 language which i n c l u d e s a s o l u t i o n to t h i s problem. An inadequacy caused by 'lack o f g o a l procedure augmentation' i s t r i v i a l to f i x . A l l one has to do i s to •augment' the goal procedure, t h a t i s , t o have a s p e c i a l t e r m i n a l read only when the goal procedure i s t o be f i n i s h e d . T h i s t e r m i n a l should not appear anywhere e l s e i n the program, and i s u s u a l l y a s s o c i a t e d with the e n d - o f - f i l e c o n d i t i o n d i s c o v e r e d by the scanner. EXAMPLE LET GOAL BE: CYCLE EITHER (READ (X) ) (EXIT) END; END; _ i _ 6 3 T h i s example i s a simple demonstration of a program l a c k i n g s u f f i c i e n t g o a l procedure augmentation. The program w i l l not have any symbol on which to i n i t i a t e the c y c l e escape. The s o l u t i o n i s t o i n t r o d u c e a read EOF a f t e r the c y c l e . The language d e s c r i b e d i s s l i g h t l y modified by t h i s change but t h i s i s unavoidable. An inadequacy can a l s o be caused by the parser g e n e r a t i o n technique. The system uses a lookahead c a l c u l a t i o n technique known as SLR (K) 3 . I f lookahead c a l c u l a t i o n encounters the end of a procedure, then a l l p o i n t s which c a l l the procedure are c o n s i d e r e d as l o c a t i o n s to continue lookahead c a l c u l a t i o n s , not j u s t the p o i n t s which cause lookahead to proceed i n t o the inadequate s t a t e . The P a r s l e y look statement can sometimes cure t h i s problem. Making a copy of the procedure and u s i n g i t i n the inadeguate i n v o c a t i o n p o i n t w i l l a l s o sometimes work. (This technigue doesn*t work w e l l i n r e c u r s i v e s p e c i f i c a t i o n s . ) EXAMPLE LET GOAL BE: a; EITHER (A) (BEGIN READ (Y) ; READ (Y) ; END) END; READ (EOF) ; _!_ LET A BE: READ (Y) ; 64 Clearly the preceding example can be parsed with one symbol lookahead, but unfortunately i t can not be handled by the SLR (K) technigue. The d i f f i c u l t y l i e s i n the second c a l l to A. After the Y has been read for both A and the block, control w i l l pass to a state which must decide between a t r a n s i t i o n to return A or to read Y. These two actions are incompatible and lookahead must attempt to separate them. The lookahead set f o r the return w i l l produce Y and EOF. Y was found because i t followed the f i r s t c a l l of A. EOF was found because i t followed the second c a l l of A. The lookahead set f o r the read i n the block i s t r i v i a l l y Y. The two sets c o n f l i c t on the terminal Y and the lookahead separation i s inadeguate. 4.2.5 THE PARSE STACK' The parse stack i s used to temporarily hold tokens from the input stream for use during semantic actions. I t . i s here that the actual value of a number or i d e n t i t y of a variable can be found and used i n t r a n s l a t i o n . The user should use the second column of the paragraphed l i s t i n g to calculate where tokens w i l l appear on the parse stack. The actual height of the parse stack i s controlled by the syntax analyzer. I t i s important to understand how the parser generator calculates stack depths. The stack i s l e f t one element higher a f t e r a procedure return, scope completion, either statement completion, and terminal read. The stack i s 65 popped so t h a t a l l p o s s i b l e routes to any point i n a s p e c i f i c a t i o n w i l l have i d e n t i c a l stack depths. T h i s means t h a t when a scope i s repeated a l l stack elements pushed d u r i n g the c u r r e n t scope e x e c u t i o n are popped., I t a l s o means t h a t a l l but one stack element pushed during a procedure or a scope, are popped o f f when the c o n s t r u c t i s f i n i s h e d . In a d d i t i o n , sometimes the parse stack must be incremented when there i s no token to put on the s t a c k . (This occurs i n s k i p a l t e r n a t i v e s f o r e i t h e r statements as w e l l as whenever a P a r s l e y c o n s t r u c t i s l e f t before any st a c k elements are pushed i n the c o n s t r u c t . ) In these s i t u a t i o n s a 'push empty' i s executed. A push empty pushes a s p e c i a l symbol, the 'empty token*, onto the parse s t a c k . The d i f f e r e n c e between the semantic a c t i o n s being p o s i t i o n e d before an escape or a f t e r an escape can now be e x p l a i n e d . A c t i o n s p o s i t i o n e d before an escape are executed before the o p e r a t i o n s to get st a c k alignment are executed. A c t i o n s p o s i t i o n e d a f t e r escapes are executed a f t e r s t a c k alignment. T h i s v a r i a t i o n i s u s e f u l i n s i t u a t i o n s where there e x i s t s i d e n t i c a l semantics except f o r st a c k c o n f i g u r a t i o n . A l l o w i n g stack alignment to be executed p r i o r t o the semantics w i l l sometimes allow the melding of these a c t i o n s . 66 CHAPTER 5. IMPLEMENTATION NOTES The P a r s l e y system was implemented i n the SUE system language l z . T h i s was an e x c e l l e n t c h o i c e s i n c e SUE s u p p l i e s good data s t r u c t u r i n g f a c i l i t i e s , and produces e f f i c i e n t code. I t took approximately two man months to complete. I n i t i a l l y the system ran on t a b l e s produced by a l o c a l LALE (1) p a r s e r generator, l a t e r the system was bootstrapped onto t a b l e s produced by the P a r s l e y system. I t i s i n t e r e s t i n g t h a t only very minor a l t e r a t i o n s i n the semantic r o u t i n e s were necessary to complete the t r a n s f e r . The i n i t i a l c o ntext f r e e grammar and the P a r s l e y s p e c i f i c a t i o n are both i n the appendices. The implementation was c o n v e n i e n t l y s p l i t i n t o two processes. The f i r s t process reads,.in the source program and c r e a t e s an i n t e r n a l s t r u c t u r e . , The second process uses t h i s i n t e r n a l s t r u c t u r e to generate SLR (K) 3 p a r s e r t a b l e s . The r e s t of t h i s chapter assumes a knowledge of the theory and implementation of SLR (K) parser g e n e r a t o r s . 67 5.1 READING IN THE SOURCE PROGRAM Reading i n the source program was done i n a s y n t a x ^ d i r e c t e d , one-pass t r a n s l a t i o n . The i n t e r n a l s t r u c t u r e , produced from the preceding t r a n s l a t i o n , i s a n o n - d e t e r m i n i s t i c s t a c k - o r i e n t e d p a r s i n g machine. Each procedure has a d i s t i n c t s t r u c t u r e a s s o c i a t e d with i t . 5.1.1 THE PARSING MACHINE DEFINITION The p a r s i n g machine i s represented by a l i n k e d s t r u c t u r e . Nodes or s t a t e s are l i n k e d together by a r c s or t r a n s i t i o n s . There are e i g h t d i f f e r e n t types of a r c s or t r a n s i t i o n s . ,, (1) NO_OP(S) - do nothing; j u s t f o l l o w the a r c . T h i s i s u s e f u l f o r connections between c o n s t r u c t s . (2) READ(X,S) - i f the next symbol matches the read argument then push the symbol on the s t a c k , d e l e t e the symbol from the i n p u t stream, and f o l l o w the a r c ; otherwise f a i l . T h i s of course m i r r o r s the P a r s l e y read. (3) PROCEDURE CALL (P,S) - pushes c o n t r o l to the procedure argument. Upon completion of the procedure the a r c i s f o l l o w e d . Used f o r the P a r s l e y procedure c a l l (4) APPIY(A,S) - a p p l i e s the apply name argument. Upon completion f o l l o w s the a r c . Used f o r the P a r s l e y apply statement (5) RETURN (P) - marks the end of a procedure, causes the 68 completion of the procedure c a l l arc (6) PUSH STACK (S) - pushes the empty symbol on the stack and follows the arc. It i s used for stack alignment purposes. (7) POP STACK (•#,S) - pops a number of elements off the stack according to i t s argument. I t i s usually used in scope escapes and just before procedure returns (8) EITHER(A 1 t A 2 , ..,) - presents a l i s t of possible, parse a l t e r n a t i v e s . The main EITHEE arc can have an error name associated with i t . Each alternative may have a look l i s t associated with i t , 5.1.2 BASIC PARSING MACHINE CONSTRUCTION The actual construction of the parsing machine i s very simple. The current implementation found i t useful to have a stack to control scopes and another stack to control either statements. These stacks kept pointers to entry and e x i t points f o r the constructs. The scope stack also found i t necessary to keep track of the number of stack elements pushed since the s t a r t of the scope. This information was used i n scope escapes to calculate the necessary parse stack alignment. The actual procedure body was treated exactly l i k e a scope except of course i t could not be repeated. There i s a complication in constructing the basic parsing machine. Sometimes the f i r s t executable element of a procedure w i l l be an apply. Due to the implementation parsing technique t h i s i s i n v a l i d as apply actions w i l l not save t h e i r source state. The loss of the source state means that the parse w i l l not be restartable after the procedure returns. The solution to t h i s problem i s to place a push empty arc at the s t a r t of the whole procedure. A similar problem occurs with repeafable scopes. Repeating a scope causes a l l states pushed during the scope to be popped. If a repeatable scope s t a r t s a procedure then the state to re s t a r t a f t e r the procedure returns w i l l be popped off of the state stack. The solution i s the same as before. Introduce a push empty arc at the front of the procedure before the scope. 5 .1.3 PARSING MACHINE OPTIMIZATION After each procedure had been translated an optimization was performed on i t s structure. A l l no_op arcs were deleted. A l l connected pop arcs were melded. A l l push empty arcs which were d i r e c t l y followed by a pop arc were deleted and the number of pops reduced by one. These safe optimizations were found to reduce storage reguirements by around f o r t y percent. They also improved f i n a l parsing speed by reducing the number of pop and push arcs. 70 5.2 PARSER GENERATION The parser g e n e r a t i o n technigue used was SLR (K) 3 . T h i s s e c t i o n w i l l not a c t u a l l y e x p l a i n the whole tec h n i g u e , j u s t m o d i f i c a t i o n s necessary t o handle P a r s l e y . 5.2.1 LR(0) MACHINE CONSTRUCTION The LR{0) machine c o n s t r u c t i o n i s the f i r s t s tep i n parse r g e n e r a t i o n . T h i s s t e p t r a n s l a t e s the p r e v i o u s machine i n t o a new machine more s u i t a b l e f o r pa r s e r g e n e r a t i o n . The r e s t of t h i s s e c t i o n w i l l present our LR(0) - machine and some of the bas i c SLR (K) terminology. Host of the LR (0) machine c o n s t r u c t i o n i s i d e n t i c a l t o t h a t presented by Lalonde 1 6 . , Only a l t e r a t i o n s i n t h i s approach w i l l be mentioned. I n f o r m a l l y a f i n i t e s t a t e machine i s composed of s t a t e s and t r a n s i t i o n s between s t a t e s . A t r a n s i t i o n i s a s s o c i a t e d with a s t a t e and i s composed of a t r a n s i t i o n a c t i o n and a d e s t i n a t i o n s t a t e . A t r a n s i t i o n i s executed by completing the t r a n s i t i o n a c t i o n and e n t e r i n g the d e s t i n a t i o n s t a t e . There are s i x d i f f e r e n t t r a n s i t i o n s i n our LR (0) machine. (1) READT(X,S) - i f the t e r m i n a l X i s the next i n p u t token then t h i s t r a n s i t i o n can be completed. Completion w i l l push the c u r r e n t s t a t e onto the s t a t e s t a c k . , Of course the t e r m i n a l X w i l l a l s o be pushed onto the a u x i l i a r y parse s t a c k . a 71 (2) READP(P,S) - i f the procedure name P i s the next token i n the i n p u t stream then t h i s t r a n s i t i o n can be completed. (3) RETURN(P) - t h i s t r a n s i t i o n i s completed by pushing the procedure name P i n t o the i n p u t stream. I t then passes c o n t r o l to the s t a t e on the top of the s t a t e s t a c k . (4) POPSTACK(K,S) - t h i s t r a n s i t i o n pops K elements o f f of the s t a t e s t a c k . I t a l s o reduces the height of the parse s t a c k . (5) PUSH STACK(S) - t h i s t r a n s i t i o n pushes the cu r r e n t s t a t e on the s t a t e s t a c k . I t a l s o pushes the empty symbol onto the parse stack.. (6) APPLY(A,S) - t h i s t r a n s i t i o n generates the code f o r the semantic A f o r the t r a n s l a t o r . , Any s t a t e having a READT t r a n s i t i o n i s c a l l e d a read s t a t e . Any s t a t e having a READP t r a n s i t i o n i s c a l l e d a push s t a t e . Any s t a t e having one or more of the other t r a n s i t i o n s i s c a l l e d an a c t i o n s t a t e . A s t a t e i s c a l l e d inadeguate when t h e r e i s more than one RETURN, POPSTACK, PUSH STACK, or APPLY t r a n s i t i o n . A s t a t e i s a l s o inadeguate i f i t i s both a read s t a t e and an a c t i o n s t a t e . S t a t e g e n e r a t i o n c o n s i s t s of s p e c i f y i n g , f o r each t r a n s i t i o n i n a s t a t e , the t r a n s i t i o n a c t i o n and d e s t i n a t i o n 72 s t a t e . S p e c i f y i n g a l l t r a n s i t i o n s f o r a l l s t a t e s i s c a l l e d machine gene r a t i o n . A machine without any inadeguate s t a t e s i s an LR{0) machine. Every s t a t e i n the machine i s uniquely i d e n t i f i e d by a s e t of c o n f i g u r a t i o n s . A c o n f i g u r a t i o n i s a marker which d i s t i n g u i s h e s a node i n a procedure's i n t e r n a l s t r u c t u r e . I f a c o n f i g u r a t i o n d i s t i n g u i s h e s an EITHER a r c then add a new c o n f i g u r a t i o n f o r each of the a l t e r n a t i v e s . I f a c o n f i g u r a t i o n d i s t i n g u i s h e s a PROCEDURE CALL a r c then c r e a t e a c o n f i g u r a t i o n which d i s t i n g u i s h e s the f i r s t node of t h a t procedure's i n t e r n a l s t r u c t u r e u n l e s s t h i s has been p r e v i o u s l y done f o r t h i s s t a t e . Whenever an EITHER with an e r r o r name i s expanded the s t a t e i s given t h a t e r r o r name. At t h i s time a check should be made to ensure that the s t a t e d i d not already have an e r r o r name a s s o c i a t e d . I f i t does, an e r r o r message should be generated. I t was a l s o c o n s i d e r e d erroneous i f an EITHER with no e r r o r name was expanded i n the same s t a t e as an EITHER with an e r r o r name. C o n f i g u r a t i o n s and t r a n s i t i o n s can both have look l i s t s a s s o c i a t e d . During expansion, i f a PROCEDURE CALL has a look l i s t then t h i s look l i s t i s passed on to the c r e a t e d c o n f i g u r a t i o n . S i m i l a r l y , EITHER c o n f i g u r a t i o n s pass t h e i r look l i s t s to t h e i r a l t e r n a t i v e s . I f an EITHER a l t e r n a t i v e has a look l i s t then i t i s a s s o c i a t e d with the c o n f i g u r a t i o n c r e a t e d f o r i t . The EITHER a l t e r n a t i v e look l i s t has precedence over an 73 EITHER look l i s t . Transitions always take the look l i s t from th e i r creating configurations. A check should be made to ensure that a l l creating configurations f o r a t r a n s i t i o n have i d e n t i c a l look l i s t s . I f they don't, an error should be generated. 5.2.2 SLR(K) MACHINE CONSTRUCTION If the LR (0) machine has no inadeguate states then parser generation does not need to do lookahead. I f there are any inadeguate states then lookahead must be attempted to resolve the inadeguacies. Lookahead resolves inadequacies by allowing the choice between incompatible actions to be based on a number of terminals i n the input stream ahead of the current parse point. This step uses a machine i d e n t i c a l to the LR (0) machine except the addition of one new tr a n s i t i o n action. This new action i s LOOKAHEAD (T,D,S). T i s the terminal i t w i l l attempt to match from the input stream. D i s the number of symbols down the input stream being matched. When D eguals 1 the match i s on the current input symbol. The action does not push a state on the state stack. The SLR (K) algorithm i s i d e n t i c a l to that put forward by Earner 1 5 except f o r the handling of look l i s t s and error names which were not relevant i n that algorithm. , A l l lookahead states take on the error name of the inadeguate state which caused 74 t h e i r creation. The look l i s t s are used to r e s t r i c t the lookahead sets used i n the calculations concerned with a par t i c u l a r t r a n s i t i o n . I t i s at t h i s time that look l i s t s are checked for v a l i d i t y . A l l look l i s t elements must be i n the lookahead set. I f they are not then an error message i s generated and the current calculations declared inadeguate. The decision not to employ the more powerful LALR(K) and LB(K) techniques was made on the grounds of reasonableness. , The LR{0) machine that the system uses was f e l t to be excessively complex to attempt these technigues. Loops in the int e r n a l structure, and multiple returns make these other techniques extremely d i f f i c u l t . 5.2.3 PABSER OPTIMIZATIONS A l l the usual LB(K) optimizations are applicable. State merging, removal of BEADP t r a n s i t i o n s , default branches for lookahead states and ethers are possible. There are some optimizations which are only f e a s i b l e on our machine. Since only read and lookahead states can discover syntax errors, only these kind of states need have error names appended. In addition, i t was found useful to separate these states into those with error names and those without as a further space optimization. 75 CHAPTER 6. THE END T h i s t h e s i s has p r e s e n t e d a new p a r s i n g language and i t s p r a c t i c a l i m p l e m e n t a t i o n . The new language demonstra tes a new approach to s y n t a x s p e c i f i c a t i o n f o r programming language i m p l e m e n t a t i o n . The language a l s o a l l o w s t h e programmer t o get • i n s i d e * the f i n a l generated p a r s e r and s e t t r a p s f o r s y n t a x e r r o r s , a major advance o v e r most c u r r e n t p a r s e r g e n e r a t o r s y s t e m s . The language i m p l e m e n t a t i o n not o n l y shows the p r a c t i c a l v i a b i l i t y o f P a r s l e y , but a l s o the a v a i l a b i l i t y of c u r r e n t p a r s i n g t e c h n i g u e s t o new p a r s i n g l a n g u a g e s . (The SLR(K) t e c h n i g u e was o r i g i n a l l y d e v e l o p e d f o r the c o n t e x t f r e e grammars.) T h e r e f o r e , we would l i k e to c o n j e c t u r e t h a t P a r s l e y i s a f o r e r u n n e r of a group of b e t t e r p a r s i n g languages t o r e p l a c e the c u r r e n t o n e s . , 76 BIBLIOGRAPHY [ I ] Naur, P. (ed) Revised Report on the Algorithmic Language Algol 60 Communications of the ACM, Volume 6, Number 1,1963 [2] Wirth, N. and Weber, H. EULER - A Generalization of A l ^ o l and i t s Defin i t i o n Communications of the ACM, Volume 9, Number 1,27l966 [3] DeRemer, F. P r a c t i c a l Translators for LR_(Ki_ Languages. Ph.D. Thesis, Massachusetts I n s t i t u t e of Technology, Cambridge, Massachusetts,1969 [4] McKeeman, W.M., Horning, J . J . , and lortman, D.B. A Compiler Generator Prentice Hall,1970 [ 5 ] Rosen, S. A Compiler Building System Developed by Brooker and Morris Communications of the ACM, Volume 7, Number 7,1964 [6] Gries, D. Compiler Construction for D i g i t a l Computers John Wiley and Sons, Inc.,1971 [7] Floyd, R.W. A Descriptive Language for Symbol Manipulation Journal of the ACM, Volume 8, Number 10,1961 [ 8 ] Evans, A. An Algol 60 Compiler Annual Review of Automatic Programming, Pergammon Press,1964 [9] Ichbiah, J.D. and Morse, S..P. A Technigue for Generating Almost Optimal. Floyd-Evans Productions for Precedence Grammars Communications of the ACM, Volume 7, Number~7,1964 [10] Conway, M.E. Design of a Separable Transition-Diagram Compiler Communications of the ACM, Volume 6, Number 7,1963 [ I I ] Barnard, D.T. Automatic Generation of Syntax-Repairing and Paragraphing Parsers Technical Report CSRG-52, Computer Systems Research Group, University of Toronto,1975 [12] Clark, B.L. The SUE System Language User's Guide J R E V I S E D J L Department of Computer Science, University of B r i t i s h Columbia,1974 77 [13] Lomet, D.B. A Formalization of Transition Diagram Systems Journal of the ACM, Volume 20, Number 2,1973 [14] Woods, W.A. Transition Network Grammars for Natural i a a a M a a S Analysis Communications of the ACM, Volume 13, Number *107l970 [15] Earner, D.B. Construction of LBJK1. Parsers with Application to Algol 68 M.Sc. Thesis, University of B r i t i s h Columbia,T97T [16] Lalonde, W.R. An E f f i c i e n t LALB Parser Generator Technical Report CSRG-2, Computer Systems Research Group, University of Toronto,1971 78 APPENDIX I. PARSLEY IN PARSLEY 0 | 0 | LET PARSLEY-PROGRAM BE: ***** WARNING: A PUSH EMPTY INSERTED BEFORE THIS PROCEDURE : ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE 5 | 1 | APPLY(ZERO-INDENT, 9 j | SAVE-PROCEDURE-START-STATE-FOR-PROCEDURE); 1 2 | 1 CYCLE/* PICK UP, PROCEDURES */ 13 | J EITHER 14 | 2 I (READ (EOF); APPLY(END-OF-PBOGBAM)THEN 25 \ 1 | ... RETURN ) 27 | 2 | (NT-DESCRIPTION; APPLY ( 32 | | SYNTAX-EBROR-SUMMARY,SPACE-LINE, 36 | 1 | ZERO-INDENT)) 39 | 2 | END 40 | | END 411 I J **** K 0 SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 42 | 0 I LET NT-DESCRIPTION BE: 46 4 I BEAD(LET,NT,BE,COLON); 57 | APPLY(SET-UP-ACTIVE-NONTERMINAL,INDENT); 64 | CYCLE/* STATEMENT LIST, ONE PER LINE */ 65 j APPLY (LINE) ; 70 5 ! STATEMENT; 72 j EITHER 73 | (READ (SEMI) ) 79 | ... (SEE (GOAL-POST) ; EXIT ) 87 6 | END; 89 END; 91 | APPLY(EXDENT-LINE,END-OF-NT-BODY); 98 6 i READ(GOAL-POST); 103 I APPLY (LINE); 108 I _ l _ * NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 109 0 I LET STATEMENT BE: 113 EITHER 114 | (SKIP/* A STATEMENT CAN BE EMPTY */ ) 117 | (SCOPE-STATEMENT) 120 i | (EITHEB-STATEMENT) 123 i | (BEAD (NT); APPLY (CEEATE-NT-CALL)) 134 I j (BEAD-STATEMENT) 79 137 | | (APPLY-STATEMENT) 140 | | (ESCAPE-STATEMENT) 143 | 1 | END 144 \ | _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 145 | 0 | LET SCOPE-STATEMENT BE: 149 I ] EITHER/* OPTIONAL SCOPE NAME */ 150 I (SKIP; APPLY(NULL-SCOPE)) 158 1 J (SCOPE-LABEL; 165 0 | APPL Y(C HECK-SCO PE-IS NT-OPE N,LINE)) 168 1 | END; 170 j EITHER 171 | (READ(BEGIN)) 177 | (BEAD (CYCLE) ) 183 2 I END; 185 | APPLY(SCOPE-PUSH #INDENT); 192 | CYCLE 193 | APPLY (LINE) ; 198 3 J STATEMENT; 200 | EITHER 201 | (READ (SEMI) ) 207 | ... (SEE (END) ; EXIT AND APPLY (EXDENT-LINE) ) 220 4 I END 221 3 I END; 223 4 I READ (END) ; 228 | EITHER 229 | (SKIP; APPLY(NULL-SCOPE) ) 237 | (SCOPE-LABEL) 240 5 I END; 242 | APPLY(SCOPE-POP); /* INCLUDES A CHECK FOR 246 | COBBECT SCOPE LABELLING */ 247 | _ i _ * NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 24 8 | 0 j LET EITHER-SIATEMENT BE: 252 | 1 BEAD (EITHER) ; . 257 | APPLY(SIART-AN-EITHER,INDENT-LINE); 264 | CYCLE/* PICK UP EITHER ALTERNATIVES */ 265 | EITHER 266 | (READ (OPEN) ) 272 | (ERROR(NO-ALTERNATIVE-FOR-EITHER)) 278 | 2 i END; 280 | 3 EITHER-ALTERNATIVE; 282 | 4 READ (CLOSE) ; 287 | APPLY(FINISH-EITHER-ALTEBNATIVE,LINE); 294 | EITHER 295 | ... (SEE (OPEN); REPEAT ) 303 | ... (EXIT AND APPLY(EXDENT-LINE)) 311 I (ERROR(ALTERNATIVE-SEPARATION-PROBLEM) 317 | 5 END 318 | 2 EN D; 80 320 | 3 | BEAD(END) ; 325 | I APPLY (CLOSE-AN-EITHER) ; 330 | I _ l _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 331 I o I LET EITHER-ALTERNATIVE BE: 335 I I EITHER 336 | J m • (ERROR-STATEMENT; RETURN ) 341 | | (BEGIN ** WARNING: A PUSH EMPTY INSERTED BEFORE THIS PROCEDURE ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE 343 APPLY(SET-UP-EITHER-ALTERNATIVE); 348 | | EITHER 349 | | (SKIP) 352 ] | (BEGIN /* GET OPTIONAL SEE 353 j j STATEMENT */ 354 I 2 | SEE-STATEMENT; 356 I 3 | BEAD (SEMI) ; 361 I 1 I END) 363 I 2 | END; 365 I 1 | END) 367 I 2 | END; 369 | | EITHER 370 | | (BEGIN 372 I 3 | READ (BEAD) ; 377 I I APPLY(TERMINAL-PHASE); 382 I 5 | READ(OPEN,TERMINAL); 389 | j APPLY(CREATE-READ, 394 | | RSWD-NONTESMINAL—PHASE); 396 | | EITHER 397 I I (READ (CLOSE) ) 403 I I (ERROR (MULTIPLE-READ-IN-EITHER) 407 | I /* DELETES EXIBA TERMINALS */) 409 I 6 | END; 411 I 2 | END) 413 | j (SCOPE-STATEMENT) 416 | | (EITHER-STATEMENT) 419 | I (READ (NT) ; APPLY (CREATE-NT-CALL) ) 430 ] | (READ (SKIP) ; APPLY (CREATE-SKIP-CONSTRUCT) ) 441 I I • • (ESCAPE-STATEMENT; RETUBN ) 446 | | (ERROR(BAD-FIRST-IN-EITHER)) 452 | 3 | END; 454 | | EITHER 455 | j (READ (SEMI) ) 461 | | • * (RETURN ) 464 I 4 | END; . 466 | | EITHEB 467 (ESCAPE-STATEMENT) 470 | | (APPLY-STATEMENT) 473 t | * * (SEE (CLOSE) ; RETURN ) 481 ] | (ERROR(BAD-SECOUD-IN-EITHER)) 487 I 5 | END; 489 ] t _l 81 * * * * NO S Y N T A X E R R O B S D E T E C T E D NO S Y N T A X E B B O R S I N C O M P I L A T I O N SO FAR 4 9 0 4 9 4 4 9 5 5 0 6 5 1 7 5 1 9 5 2 4 5 2 9 5 3 0 5 3 5 5 4 0 . 5 4 1 5 4 7 5 5 4 5 5 5 5 5 7 5 5 9 5 6 4 * * * * NO NO 1 2 3 4 3 4 L E T S E E - S T A T E M E N T B E : E I T H E R ( R E A D ( S E E ) ; A P P L Y ( S E E - L I S T ) ) ( R E A D ( N O T S E E ) ; A P P L Y ( N O T S E E - L I S T ) ) END; A P P L Y ( T E R M I N A L - P H A S E ) ; R E A D ( O P E N ) ; C Y C L E R E A D ( T E R M I N A L ) ; A P P L Y ( B O I L D - S E E - L I S T - E L E M E N T ) ; E I T H E R (READ (COMMA)) . . . ( E X I T AND A P P L Y ( R S S D - N C N T E R M I N A L - P H A S E ) ) END; END; READ ( C L O S E ) ; S Y N T A X ERRORS D E T E C T E D S Y N T A X ERRORS I N C O M P I L A T I O N SO FAR 565 | 0 I L E T A P P L Y - S T A T E M E N T B E : 569 J 1 READ ( A P P L Y ) ; 574 | A P P L Y ( A P P L Y - N A M E - P H A S E ) ; 579 | 2 B E A D (OPEN) ; 584 | C Y C L E 585 | 3 R E A D ( A P P L Y - N A M E ) ; 590 | A P P L Y ( C B E A T E - A P P L Y ) ; 595 | E I T H E R 596 J ( R E A D (COMMA) ) 602 | . . . ( E X I T AND A P P L Y ( R S H D 609 | ) 610 | 4 END 611 J 3 END; 613 | 4 READ ( C L O S E ) 617 | _ l _ * * * * NO S Y N T A X ERRORS D E T E C T E D NO S Y N T A X ERRORS I N C O M P I L A T I O N SO FAR 6 1 8 6 2 2 6 2 3 6 2 6 6 2 7 6 2 8 6 3 0 6 3 0 6 3 0 6 3 1 6 3 7 6 4 3 L E T E S C A P E - S T A T E M E N T B E ; E I T H E R ( S K I P ) ( B E G I N / * O P T I O N A L S T A R T I N G A P P L Y S T A T E M E N T * / A P P L Y - S T A T E M E N T ; E I T H E R / * I N AN E I T H E R A L T E R N A T I V E , AN A P P L Y CANNOT S T A R T BUT I T I S L E G A L I N AN E S C A P E . T H I S C A T C H E S T H E P R O B L E M * / ( R E A D (THEN) ) (ERROR ( I N V A L I D _ A P P L Y _ I N _ E I T H E R ) ) END; 645 0 I END) 647 1 | END; 649 | j EITHER 650 { j (BEGIN /* FOR SCOPE ESCAPES */ 652 | EITHER 653 | i (BEGIN /* FOR EXITS */ 655 2 I BEAD (EXIT); 660 j j EITHER 661 | j (SCOPE-LABEL) 664 I (SKIP; APPLY(NOLL-SCOPE) ) 672 3 I END; 674 | APPLY(SETUP-EXIT); 679 1 | END) 681 | (BEGIN /* FOR REPEATS */ 683 | 2 1 READ (REPEAT) ; 688 | EITHER 689 | j (SCOPE-LABEL) 692 | | (SKIP; APPLY(NULL-SCOPE)) 700 3 1 END; . 702 1 APPLY (SETUP-REPEAT) ; 707 1 | END) 709 | 2 I END; 711 j | APPLY(SCOPE-ESCAPE); 716 1 I END) 718 j (READ(RETURN); APPLY(NONTEBMINAL-ESCAPE) 729 \ (ERROR (M.ISSING_ESCAPE_TYPE) ) 735 | 2 I END; 737 J APPLY(ESCAPE—MABKEB); 741 | /* PUT MARKER ON PARAGRAPHER LISTING */ 742 | EITHER 743 J (SKIP) 746 | (BEGIN 748 I 3 | READ (AND) ; 753 4 I APPLY-STATEMENT 754 2 | END) 756 3 I END; 758 | APPLY(CLOSE-ESCAPE-STATEMENT); 763 | _ l _ *** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 764 I o I LET ERROR-STATEMENT BE: 768 1 I BEAD (ERROR) ; 773 | APPLY (ERROR-NAME-PHASE) ; 778 3 I READ(OPEN,ERROR-NAME); 785 | APPLY(ERROR-RECOVERY,RSWD 792 4 l READ (CLOSE) ; 797 | _ l _ *** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 798 | 0 | LET READ-STATEMENT BE: 802 | 1 | READ (READ) ; 807 ( | APPLY(TERMINAL-PHASE) ; 83 812 817 818 823 828 829 835 842 843 844 846 850 **** NO NO 4 3 4 SYNTAX SYNTAX BEAD (OPEN) ; CYCLE BEAD (TERMINAL); APPLY(CREATE-READ); EITHER (READ (COMMA) ) ... (EXIT AND APPLY(RSWD-NONTERMINAL-PHASE) ) END END; READ (CLOSE) ERRORS ERRORS DETECTED IN COMPILATION SO FAR 851 | 0 | ***** WARNING: 855 860 865 870 876 877 882 **** NO NO 1 2 3 LET SCOPE-LABEL BE: A POSH EMPTY INSERTED BEFORE THIS PROCEDURE ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE APPLY(SCOPE-NAME-PHASE); READ(SCOPE-OPEN); READ(SCOPE-NAME); APPLY(GET-SCOPE-LABEL, RSWD-NONTERMINAL-PHASE); READ(SCOPE-CLOSE); SYNTAX SYNTAX ERRORS DETECTED ERRORS IN COMPILATION SO FAR NO SYNTAX ERRORS 84 S Y M B O L I N F O R M A T I O N ****** TERMINAL SYMBOLS ****** NUMBER SYMBOL 1 EOF 2 LET 3 NT 4 BE 5 COLON 6 SEMI 7 GOAL-POST 8 BEGIN 9 CYCLE 10 END 11 EITHER 12 OPEN 13 CLOSE 14 READ 15 TERMINAL 16 SKIP 17 SEE 18 NOTSEE 19 COMMA 20 APPLY 21 APPLY-NAME 22 THEN 23 EXIT 24 REPEAT 25 RETURN 26 AND 27 ERROR 28 ERROR-NAME 29 SCOPE-OPEN 30 SCOPE-NAME 31 SCOPE-CLOSE 85 ****** APPLY SYMBOLS ****** NUMBER SYMBOL 1 ZERO-INDENT 2 SAVE-PROCEDURE-START-STATE-FOR-ERROR-RECOVERY 3 END-GF-PROGRAM SYNTAX-ERROR-SUMMARY 5 SPACE-LINE 6 SET-UP-ACTIVE-NONTERMINAL 7 INDENT 8 LINE 9 EXDENT-LINE 10 END-OF-NT-BODY 11 CREATE-NT-CALL 12 NULL-SCOPE 13 CHECK-SCOPE-ISNT-OPEN 14 SCOPE-PUSH 15 SCOPE-POP • 16 START-AN-EITHER 17 INDENT-LINE 18 FINISH-EITHER-ALTERNATIVE 19 CLOSE-AN-EITHER 20 SET-UP-EITHER-ALTERNATIVE 21 TERMINAL-PHASE 22 CREATE-READ 23 RSHD-NONTERMINAL—PHASE 24 CREATE-SKIP-CONSTRUCT 25 SEE-LIST 26 NOTSEE-LIST 27 BUILD-SEE-LIST-ELEMENT 28 APPLY-NAME-PHASE 29 CREATE-APPLY 30 SETUP-EXIT 31 SETUP-REPEAT 32 SCOPE-ESCAPE 33 NONTERMINAL-ESCAPE 34 ESCAPE-MARKER 35 CLOSE-ESCAPE-STATEMENT 36 ERROR-NAME-PHASE 37 ERROR-RECOVERY 38 SCOPE-NAME-PHASE 39 GET-SCOPE-LABEL 86 ****** ERROR SYMBOLS ****** NUMBEE SYMBOL 1 NO-ALTERNATIVE-FOR-EITHEB 2 ALIERNATIVE-SEPARATION-PROBLEM 3 MULTIPLE-BEAD-IN-EITHEB 4 BAD-FIRST-IN-EITHER 5 BAD-SECOND-IN-EITHEB 6 INVALID_APPLY_IN_EITHER 7 MISSING_ESCAPE_,TYPE 87 S T A T E S U M M A R Y TOTAL NUMBER OF STATES ........... : 187 TOTAL NUMBER OF LR(0) STATES ... : 166 NUMBER OF INADEQUATE LR (0) STATES : 21 NUMBEB OF INADEQUATE SLR STATES . ; 0 NUMBER OF SLR ( 1) STATES ....... : 21 APPENDIX I I . PARSLEY IN BNF GRAMMAR LISTING { 1) "PROGRAM" : "NT-DESCRIPTION-LIST",EOF-SYMBOL . { 2) "NT-DESCRIPTION-LIST" : "NT-DESCRIPTION-LIST", ( 4) "NT-DESCRIPTION" : "RSHD-NONTERMINAL-PHASE", { 5) "NT-DESCRIPTION-HEAD" : LET,NT,BE,COLON . ( 6) "NT-DESCRIPTION-BODY" : "STATEMENT-LIST", "LINE-EXDENT", END-NT-DESCRIPTION . ( 7) "STATEMENT-LIST" ; "STATEMENT-LIST**, SEMI, "LINE","STATEMENT" ; ( 8) "STATEMENT" . ( 9) "STATEMENT" : "SIMPLE-STATEMENT" ; (10) BEAD,"TERMINAL-PHASE",OPEN, (14) "ONE-STACK-STATEMENT" : READ,"TERMINAL-PHASE", OPEN,TERMINAL, "RSHD-NONTERMINAL-PHAS E" CLOSE ; < 3) "ZERO-INDENT", "NT-DESCRIPTION", "SPACE-LINE" ; (EMPTY) . "NT-DESCRIPTION-HEAD", "LINE-1NDE NT", "NT-DESCRIPTION-BODY", "LINE". (11) (12) (13) "READ-TERMINAL-LIST", "RSWD-NONTERMINAL-PHASE", CLOSE ; "SEMANTIC-STATEMENT" ; "ESCAPE-STATEMENT" ; (EMPTY) . (15) "SIMPLE-STATEMENT" (16) "SIMPLE-STATEMENT" : "SCOPE-STATEMENT" ; (17) "EITHER-STATEMENT" ; (18) NT . (19) "REAB-TERMINAL-LIST" : "READ-TERMINAL-LIST", COMBA,TERMINAL ; (20) TERMINAL . (21) "SCOPE-STATEMENT" : "OPEN-SCOPE-LABEL-TY", "SCOPE-HEAD","LINE-INDENT", "STATEMENT-LIST", "LINE-EXDENT",END, "CLOSE-SCOPE-LABEL-TY" . (22) "SCOPE-HEAD" : CYCLE ; (23) BEGIN . (24) "OPEN-SCOPE-LABEL-TY" : "SCOPE-LABEL","LINE" ; (25) (EMPTY) . (26) "CLOSE-SCOPE-LAEEL-TY" : "SCOPE-LABEL" ; (27) (EMPTY) . (28) "SCOPE-LABEL" : "SCOPE-NAME-PHASE", OPEN-SCOPE-LABEL,SCOPE-NAME, "RSWD-NONTERMINAL-PHASE", CLOSE-SCOPE-LABEL . (29) "EITHER-STATEMENT" : "EITHER","LINE-INDENT", "EITHER-ALTERNATIVE-LIST", "LINE-EXDENT",END . (30) "EITHER" : EITHER . (31) "EITHER-ALTERNATIVE-LIST" : "EITHER-ALTERNATIVE-LIST" "EITHER-ALTERNATIVE" ; (32) "EITHER-ALTERNATIVE" . (33) "EITHER-ALTERNATIVE" : OPEN, "EITHER-STATEMENT-LIST", CLOSE,"LINE" . (34) "EITHER-STATEMENT-LIST" : "ERROR-STATEMENT" ; (35) "SEE-STATEMENT-TY", "EITHER-FIRST-STATEMENT" ; (36) "SEE-STATEMENT-TY", "EITHER-FIRST-STATEMENT", SEMI,"ESCAPE-STATEMENT" ; (37) "SEE-STATEMENT-TY", "EITHER-FIRST-STATEMENT", SEMI,"SEMANTIC-STATEMENT". (38) "SEE-STATEMENT-TY" : "SEE-NOTSEE",OPEN, "SEE-TERMINAL-LIST", "RSWD-NONTERMINAL-PHASE", CLOSE,SEMI ; (39) (EMPTY) . (40) "SEE-NOTSEE" : SEE,"TERMINAL-PHASE" ; (41) NOTSEE,"TERMINAL-PHASE" . (42) "SEE-TERMINAL-LIST" : "SEE-TERMINAL-LIST", COMMA,TERMINAL ; (43) TERMINAL . (44) "EITHER-FIRST-STATEMENT" : "ONE-STACK-STATEMENT" (45) SKIP ; (46) "ESCAPE-STATEMENT" . (47) "SEMANTIC-STATEMENT" : APPLY, "ACTION-NAME-PHASE", OPEN,"SEMANTIC-LIST", "RSWD-NONTERMINAL-PHASE" CLOSE . (48) "SEMANTIC-LIST" : "S EMANTIC-LIST",COMMA, APPLY-NAME ; (49) APPLY-NAME . (50) "ESCAPE-STATEMENT" : "APPLY-BEFOBE-ESCAPE-TY", "WHAT-TO-ESCAPE", "ESCAPE-MARKER", "APPLY-AFTER-ESCAPE-TY" -(51) "APPLY-BEFOBE-ESCAPE-TY" : "SEMANTIC—STATEMENT" THEN ; (52) (EMPTY) ••• (53) "APPLY-AFTER-ESCAPE-TY" : AND, "SEMANTIC-STATEMENT" (54) (EMPTY) . (55) "WHAT-TO-ESCAPE" : "SCOPE-ESCAPE" ; (56) RETURN . (57) "SCOPE-ESCAPE" : EXIT,"SCOPE-LABEL-TY" ; (58) REPEAT,"SCOPE-LABEL-TY" . (59) "SCOPE-LABEL-TY" : "SCOPE-LABEL" % (60) (EMPTY) . (61) "EBROR-STATEMENT" : ERROR,"ERROR-NAME-PHASE", OPEN,ERROR-NAME, "RSWD-NONTERMINAL-PHASE", CLOSE . (62) "ZERO-INDENT" : (EMPTY) . (63) " SPACE-LINE" : (EMPTY) . (64) "LINE" : (EMPTY) . (65) "LINE-INDENT" : (EMPTY) . (66) "LINE-EXDENT" : (EMPTY) . (67) "ESCAPE-MARKER" : (EMPTY) . (68) "RSHD-NONTERMIN AL-PHASE" : (EMPTY) (69) "TERMINAL-PHASE" : (EMPTY) . (70) "SCOPE-NAME-PHASE" : (EMPTY) . (71) "ACTION-NAME-PHASE" : (EMPTY) . (72) "ERROR—NAME-PHASE" : (EMPTY) . TERMINAL SYMBOLS NUMBER SYMBOL 1 EOF-SYMBOL 2 LET 3 NT 4 BE 5 COLON 6 END-NT-DESCRIPTION 7 SEMI 8 READ 9 OPEN 10 CLOSE 11 TERMINAL 12 COMMA 13 END 14 CYCLE 15 BEGIN 16 OPEN-SCOPE-IABEL 17 SCOPE-NAME 18 CLOSE-SCOPE-LABEL 19 EITHEB 20 SEE 21 NOTSEE 22 SKIP 23 APPLY 24 APPLY-NAME 25 THEN 26 AND 27 RETURN 28 EXIT 29 REPEAT 30 ERROR 31 ERROR-NAME APPENDIX I I I . SPJKl IN PAHS,LEI 0 j 0 | LET PROGRAM BE: 5 1 5 I BEAD(IDENTIFIER,COLON,PROCEDURE,OPTIONS, 17 8 I OPEN,MAIN,CLOSE,SEJ3I) ; 24 | APPLY (MAIN,INDENT,LINE); 33 9 | PROCEDDRE_BODY; 35 | APPLY(EXDENT,LINE); 42 11 | READ (END,SEMI) ; 49 ] APPLY (LINE) ; 54 | _ l _ * NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO EAR 55 0 I LET PROCEDURE BODY BE: 59 EITHER 60 | (DECLARE STATEMENT LIST) 63 | (SKIP) 66 1 I END; 68 ) APPLY (PROCEDURE_LIST); 73 i <PICK_UP_PROCEDURES> 76 | CYCLE" 77 J EITHER 78 2 I (READ(IDENTIFIER)/* ALTERNATIVE CAN BE 82 ] A PROCEDURE NAME OR THE START OF 82 1 I AN ASSIGNMENT STATEMENT */ ) 84 j (NOTSEE (IDENTIFIER) ; 92 | APPLY (END_PROCEDURE_LIST) THEN 97 ... EXIT <PICK^UP_PROCEDURES>) 100 I 2 ] END; /* THE NOTSEE IS NECESSARY TO FORCE 101 \ j ASSIGNMENT STATEMENTS THRU THE FIRST 101 i i ALTERNATIVE RATHER THAN AS PART OF THE 101 i STATEMENT LIST */ 102 i - i EITHER 103 • ] (BEGIN 105 1 3 | READ (COLON) ; 110 4 ! PROCEDURE; 112 1 2 ] END) 114 1 - 1 (BEGIN 116 | APPLY (END_PBOCEDURE_LIST) ; 121 1 3 J ASSIGNMENT_STATEMENT2; 123 | | EXIT <PICK_UP_PROCEDURES> 126 | | /* MULTILEVEL EXIT */ 127 1 2 | END) 129 1 3 | END; 94 131 | 2 | END<PICK_OP_PROCEDURES>; /* AT THIS POINT, 135 | \ THE PROCEDURES PLUS MAYBE ONE 135 j | ASSIGNMENT STATEMENT HAS BEEN GATHERED */ 136 | i EITHER 137 | I (STATEMENT_LIST) 140 I ) (SKIP) 143 | 3 | END; 145 1 I I **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 146 | 0 | LET DECLARE_STATEMENT_LIST BE: 150 I j CYCLE 151 | 1 I READ (DECLARE) ; 156 | 2 I DECLARATION_UNIT; 158 J | EITHER 159 \ j (READ (SEMI); APPLY (LINE) ) 170 | | (BEGIN /* BLOCK ALLOWS FOR INDENTATION 171 | | OF SUBSEQUENT DECLARATIONS */ 172 | 3 I READ (COMMA) ; 177 | | APPLY (INDENT,LINE) ; 184 j j CYCLE 185 | 4 ] DECLARATION UNIT; 187 i j EITHER 188 | 5 I (READ(SEMI) ;APPLY (EXDENT, LINE) 200 4 I ... THEN EXIT ) 203 | | (READ (COMMA) ; APPLY (LINE) ) 214 | 5 I END; 216 | 4 I END; 218 2 I END) 220 | 3 i END; 222 | | EITHER ** WARNING: A PUSH EMPTY INSERTED BEFORE THIS PROCEDURE ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE 223 i 4 ] ... (REPEAT ) 226 J ... (RETURN ) 229 5 I END 230 2 | END 231 | j _ J _ * NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 232 | 0 | LET DECLARATION_UNIT BE: ***** WARNING: A PUSH EMPTY INSERTED BEFORE THIS PROCEDURE : ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE 236 ! 1 I APPLY(START_DECLARATION) 241 | 2 I READ(OPEN); 246 | j CYCLE 247 i 3 I VARIABLE TO DECLARE; 249 | j EITHEB 250 | | (READ (COMMA) ) 256 | | ... (EXIT ) 259 | 4 I END 260 | 3 1 END; 95 262 1 4 | BEAD (CLOSE); 267 J 5 | TYPE 268 J J _{_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAB 269 I 0 | LET VARIABLE^TO^DECLARE BE: 273 | 1 | BEAD {IDENTIFIES) ; 278 | ] EITHER 279 | | ... (BOUNDS; RETURN ) 284 | | ... (APPLY.(NEW SCALAR) THEN RETURN ) 292 1 2 | END; 294 | j _!_ * NO SYNTAX ERRORS DETECTED ;NO SYNTAX ERRORS IN COMPILATION SO FAR 295 ! 0 | LET BOUNDS BE: 299 t 1 t READ (OPEN) ; 304 | | APPLY(NE«_ARRAY) ; 309 ] | EITHER /* AN ARRAY CAN HAVE UNDETERMINED 309 j | BOUNDS OR DETERMINED ONES BUT NOT BOTH */ 310 I j (CYCLE /* UNDETERMINED BOUNDS */ 312 ] 2 | READ (STAB) ; 317 { | APPLY (STAR__BOUND) ; 322 | | EITHER 323 i i (READ(COMMA)) 329 j i ... (EXIT ) 332 | 3 | END; 334 t 1 i END) 336 { | (CYCLE/* DETERMINED BOUNDS */ 338 1 2 1 BOUND; 340 \ ] EITHER 341 | | (SKIP; APPLY (ONE_B0UND) ) 349 j | (BEGIN . 351 j | APPLY(LOW BOUND); 356 1 3 | READ (COLON) ; 361 1 4 i BOUND; 363 j j APPLY(HIGH_BODND); 368 1 2 | END) 370 ! 3 | END; 372 1 1 EITHER 373 | | (READ (COMMA) ) 379 1 1 ... (EXIT ) 382 i 4 | END; 384 I 1 i END) 386 I 2 | END; 388 I 3 | READ (CLOSE) ; 393 | ] APPLY(END BOUND LIST) ; 398 t | _ i _ * NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 399 I 0 | LET BOUND BE: 403 I I EITHER 96 404 | | ( B E A D ( H I N D S ) ; A P P L Y ( N E G A T I V E _ . C O N S T A N T ) ) 415 | | ( S K I P ) 418 J 1 | E N D ; 420 | 2 | R E A D ( I N T E G E R ) ; 425 J \ A P P L Y ( G E T _ B G U N D ) ; 430 | I _|_ * * * * NO S Y N T A X ERRORS D E T E C T E D NO S Y N T A X E R R O R S I N C O M P I L A T I O N SO FAR 431 | 0 | L E T T Y P E B E ; 435 | \ E I T H E R 436 | 1 | . . . ( R E A D ( F I X E D ) ; A P P L Y ( F I X E D _ T Y P E ) T H E N RETURN) 449 \ 1 \ . . . ( READ ( F L O A T ) ; A P P L Y ( F L G A T _ T Y P E ) T H E N RETURN) 462 | | . . . ( B I T ; A P P L Y ( B I T _ T Y P E ) T H E N R E T U R N ) 472 | | . . . ( C H A R A C T E R _ T Y P E ; R E T U R N ) 477 | 1 | END 478 | j _|_ * * * * NO S Y N T A X ERRORS D E T E C T E D NO S Y N T A X E R R O R S I N C O M P I L A T I O N SO FAR 479 ] | L E T B I T B E : / * T H I S I S C A L L E D FROM BOTH 482 | 0 | AND ' , R E T U R N S _ C L A U S E " * / 483 | 1 | R E A D ( B I T ) ; 488 | | E I T H E R 489 j J . . . ( A P P L Y ( D E F A U L T _ B I T _ S I Z E ) T H E N R E T U R N ) 497 J i ( B E G I N 499 | 4 I B E A D ( O P E N , I N T E G E R , C L O S E ) ; 508 J | . . . A P P L Y ( B I T ^ L E N G T H ) T H E N RETURN ; 515 | 1 j END) 517 J 2 1 END 518 I J _|_ * * * * NO S Y N T A X E R R O R S D E T E C T E D NO S Y N T A X E R R O R S I N C O M P I L A T I O N SO FAR 519 | 0 | L E T C H A R A C T E B _ T Y P E B E : 523 J 2 | R E A D ( C H A R A C T E R , O P E N ) ; 530 | | E I T H E R 531 | 3 | ( READ ( STAR ) ; A P P L Y ( 539 j 2 | C H E C K _ C H A R A C T E R _ P A R A H E T E R ) ) 542 | | ( R E A D ( I N T E G E R ) ; A P P L Y ( C H A R A C T E R _ L E N G T H ) ) 553 | 3 J E N D ; 555 | 5 | R E A D ( C L O S E , V A R Y I N G ) ; 562 | | A P P L Y ( C H A R A C T E R T Y P E ) ; 567 | \ _|_ * * * * NO S Y N T A X ERRORS D E T E C T E D NO S Y N T A X E R R O R S I N C O M P I L A T I O N SO F A B 568 | 0 | L E T P R O C E D U R E B E : 572 | 1 | READ ( PROCEDURE ) ; 577 | ( A P P L Y ( P U S H _ S C O P E ) ; 582 | | E I T H E R 583 | | ( S K I P ; A P P L Y ( N O _ F O R M A L _ P A R A M E T E R S ) ) 591 | | (FOR MAL_. P A R A M E T E R S ) 594 | 2 | E N D ; 97 596 | J EITHER 597 | | (SKIP; APPLY(NO_RETURNS_CLAUSE) ) 605 | I (RETURNS_CLAUSE) 608 J 3 | END; 610 | | APPLY (PROCEDURE_HEAD,INDENT,LINE) ; 619 1 4 I PROCEDURE_BODY; 621 | | APPLY(EXDENT,LINE); 628 I 6 | READ (END,SEMI); 635 l | APPLY (LINE) 639 | | _ ] _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 640 | 0 | LET FORMAL_PARAMETERS BE: 644 | 1 | READ (OPEN) ; 649 | ] CYCLE 650 | 2 | READ (IDENTIFIES) ; 655 | J APPLY (NE8_.PARAMETER) ; 660 | | EITHER 661 \ | (READ (COMMA) ) 667 | | ... (EXIT ) 670 | 3 | END 671 | 2 | END; 673 | 3 | READ (CLOSE); 678 | | APPLY(END_FORMAL_PARAMETER_LIST) 682 | | _ l _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 683 J 0 | LET RETBRNS_CLAUSE BE: 687 | 2 | READ(RETURNS,OPEN); 694 | j EITHER 695 } I (READ(FIXED); APPLY(FIXED_PROC)) 706 I | (READ(FLOAT) ; APPLY (FLOAT_PROC)) 717-1 I (BIT; APPLY(BIT_PROC)) 725 J j (BEGIN 727 | 6 J READ(CHARACTER,OPEN,INTEGER,CLOSE, 737 | 7 | VARYING) ; 740 | | APPLY(CHARACTER_PROC); 745 | 2 | END) 747 | 3 | END; 749 i 4 | READ (CLOSE) 753 J j _ j **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 754 | O f LET STATEMENT_LIST BE: 758 | | CYCLE 759 | | EITHER 760 | | (SI MPLE_STATEMENT) 763 | | (IF_STATEMENT) 766 1 1 | END; 768 | | EITHER 769 | | ... (RETURN ) 98 ***** WARNING; A PUSH EHPTY INSERTED BEFORE THIS PROCEDURE : ALL PREVIOUS STACK DEPTHS INCREMENTED BY ONE 772 | 2 | (REPEAT ) 775 | 3 | END 776 | 2 | END 777 | J _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 778 | 0 | LET SIMPLE_STATEMENT BE: 782 | | EITHER 783 | | (READ(SEMI)) 789 | | (ASSIGNMENT STATEMENT) 792 | | (GET_STATEMENT) 795 | | (PUT_STATEMENT) 798 J | (CALL..STATEMENT) 801 | | (RETURN_STATEMENT) 804 | | (D0_STATEMENT) 807 | 1 | END 808 | | _1_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 809 j 0 -| LET ASSIGNMENT_STATEMENT BE: 813 I 1 | READ(IDENTIFIER); 818 | 2 | ASSIGNMENT_STATEMENT2; 820 J | **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 821 | | LET ASSIGNMENT_STATEMENT2 BE: /* THIS BREAKDOWN 824 | | IS NECESSARY TO FINISH OFF ASSIGNMENT 824 | j STATEMENTS WHICH ARE THE FIRST STATEMENT OF 824 | 0 | THE LIST OF STATEMENTS IN ,'PROCEDURE_BODY»*/ 825 | | EITHEB 826 | | (SKIP) 829 | | (SUBSCRIPT_LIST) 832 | 1 J END; 834 | 2 | READ(EQUAL); 839 | 3 | EXPRESSION; 841 | j APPLY(ASSIGNMENT_STATEMENT) ; 846 | 4 | READ (SEMI) ; ~ 851 I | _ !_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 852 | 0 | LET SUBSCRIPTJLIST BE: 856 \ 1 | RE AD (OPEN); 861 | I APPLY (START_ARRAY_.SUBSCBIPTING) ; 866 | | CYCLE 867) 2 f EXPRESSION; 869 J | APPLY(SUBSCRIPT_EXPRESSION); 874 | | EITHER 875 | | (READ (COMMA) ) 881 | | ... (EXIT ) 884 | 3 | END; 886 | 2 J END; 888 | 3 | READ (CLOSE); 893 } | APPLY(END_ARRAY SUBSCRIPTING).;. 898 | | _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO EAR 899 | 0 | LET PUT_STATEMENT BE: 903 J 1 | READ (PUT) ; 908 | | EITHER 909 | | (READ (SKIP) ; APPLY (PUT_SKIP) ) 920 J | (READ (PAGE) ; APPLY (PUT_PAGE) ) 931 ] | (SKIP) 934 \ 2 j END; 936 | 4 ( READ(LIST,OPEN); 943 | ) CYCLE 944 | 5 | EXPRESSION; 946 | I APPLY(PUT_EXPRESSION) ; 951 | | EITHER 952 J | (READ (COMMA)) 958 | | (EXIT ) 961 J 6 | END; 963 j 5 | END; 965 | 7 J READ (CLOSE,SEMI) 971 | | _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 972 | 0 | LET GET_STATEMENT BE: 976 } 1 | READ(GET) ; 981 | | EITHER 982 | | (READ(SKIP); APPLY(GET_SKIP)) 993 ] | (SKIP) 996 J 2 | END; 998 | 4 | RE AD (LIST, OPEN) ;. 1005 \ | CYCLE 1006 | 5 !• READ (IDENTIFIER) ; 1011 | | EITHER 1012 | | (SUBSCRIPT_LIST) 1015 | | (SKIP; APPLY(SIMPLE_VARIABLE)) 1023 | 6 | END; 1025 | J APPLY(GET_ITEM); 1030 J | EITHER 1031| | (READ (COMMA) ) 1037 I I ... (EXIT ) 1040 | 7 | END; 1042 \ 5 | END; 1044 J 7 | READ (CLOSE,SEMI) 1050 1 | _}_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1051 | 0 | LET CALL_STATEMENT BE: . 1055 | 2 | BEAD(CALL,IDENTIFIES); 1062 | | APPLY (SET_UP_FOR_PROCEDURE_CALL); 1067 J | EITHEB 1068 | | (SKIP; APPLY(NULL PARAHETER_LIST)) 1076 | | (BEGIN 1078 | 3 | BEAD(OPEN) ; 1083 | | CYCLE 1084 i 4 j EXPRESSION; 1086 | I APPLY (GET_PARAMETER); 1091 | | EITHER 1092 | | (READ (COMMA) ) 1098 | | ... (EXIT ) 1101 | 5 | END; 1103 J 4 j END; 1105 | 5 } READ (CLOSE) ; 1110 | | APPLY (END_OF_PARAMETEBS); 1115 | 2 | END) 1117 | 3 | END; 1119 | 4 | READ(SEMI) ; 1124 | | APPLY(PROCEDURE_CALL) ; 1129 | J _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAB 1130 | 0 | LET BETORN_STATEMENT BE: 1134 | 1 | READ (RETURN) ; 1139 | | EITHER 1140 \ | (SKIP; APPLY(SIHPLE_RETURN)) 1148 | | (BEGIN 1150 | 2 | READ(OPEN) ; 1155 | | APPLY(BEFOSE_RETDRN_EXPRESSION) 1160 | 3 | EXPRESSION; 1162 \ | APPLY(RETURN_EXPRESSION); 1 167 | 4 | READ (CLOSE) ; 1172 | 1 | END) 1174 j 2 J END; 1176 | 3 | READ (SEMI) ; 1181 | I _ | _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1182 | O J LET IF_STATEMENT BE: 1186 1 | EITHER 1187 J I (IF_THEN_STATEMENT) 1190 | | (IF_THEN_ELSE_STATEMENT) 1193 J 1 | END 1194 ] I _ i _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ESROBS IN COMPILATION SO FAR 1195 | 0 | LET IF_HEAD BE: 1199 J 1 J BEAD ( I F ) ; 1204 t 2 J EXPRESSION; 101 1206 | | 1211 | 3 | 1216 J | 1223 | | _|_ **** N 0 SYNTAX EfiROBS NO SYNTAX ERRORS APPLY (IF_HEAD) ; READ (THEN) ; APPLY(INDENT,LINE) DETECTED IN COMPILATION SO FAR 1224 1228 1230 1231 1234 1237 1239 1246 1246 1247 1260 1262 * * * * NO NO 1263 1267 1269 1270 1273 1276 1278 1285 1290 1299 1300 1303 1306 1308 1316 * * * * NO NO 0 ] LET IF_T.HEN_STATESENT BE: 1 1 IF_HEAD; EITHER (SIMPLE_STATEMENT) (IF_THEN_STATEMENT) END; APPLY(EXDENT,LINE); EITHER/* NOTSEE IS NECESSARY TO GET "ELSE" OUT OF SLR LOOKAHEAD SET */ ... (NOTSEE (ELSE) ; APPLY (IF_THEN) THEN RETURN) END; _ 1 _ SYNTAX ERRORS DETECTED SYNTAX ERRORS IN COMPILATION SO FAR 0 | LET IF_THEN_ELSE_STATEMENT BE: 1 I IF_HEAD; EITHER (SIMPLE_STATEMENT) (IFJTHEN ELSE STATEMENT) 2 | END; APPLY(EXDENT,LINE); 3 I READ (ELSE) ; APPLY{ STARTING_ELSE_GLAUSE,INDENT,LINE); EITHER (SIMPLE_STATEMENT) (IF_STATEMENT) END; APPLY(EXDENT,LINE,IF_THEN_ELSE) _!_ SYNTAX ERRORS DETECTED SYNTAX ERRORS IN COMPILATION SO FAR 1317 | 0 | LET DO_STATEMENT BE: 1321 | 1 | READ (DO) ; 1326 } | EITHER 1327 | j (WHILE CLAUSE) 1330 | i (ITERATION_CLAUSE) 1333 | i (SKIP; APPLY (SIMPLE .DO)) 1341 | 2 l END; 1343 | 3 I READ(SEMI); 1348 | \ APPLY (INDENT,LINE) ; 1355 | I STATEMENT_LIST; 1357 | j APPLY(END_DO_STATEMENT ,EXDENT,LINE) 1366 j 6 i READ (END,SEMI) ; 1373 j j _ i _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX EREOBS IN COMPILATION SO FAR 1374 | 0 | LET WBILE_CLAUSE BE: 1378 | 2 | BEAD(WHILE,OPEN) ; 1385 i I APPLY(BEFORE_WHILE_EXPRESSION) ; 1390 | 3 | EXPRESSION; 1392 ] | APPLY(WHILE_EXPRESSION); 1397 | 4 | READ(CLOSE); 1402 | | _|_ *** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1403 | 0 | LET ITERATION_CLAUSE BE: 1407 | 2 | BEAD(IDENTIFIER,EQUAL) ; 1414 J ] APPLY(DO_VARIABLE) ; 1419 | 3 | EXPEESSION; 1421 I | APPLY(GET_INITIAL_DO VALUE); 1426 | 4 | BEAD (TO) ; 1431 | 5 | EXPEESSION; 1433 | | APPLY(GET_FINAL_DO_VALUE); 1438 | J EITHEE 1439 | ] ... (APPLY(DEFAULT_BY_VALUE)THEN RETURN ) 1447 | | (BEGIN 1449 | 6 | BEAD(BY) ; 1454 | 7 | EXPEESSION; 1456 | J ... APPLY(GET DO_BY_VALUE) THEN BETUEN 1463 | 5 | END) 1465 \ 6 | END; 1467 | | _|_ *** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1468 | 0 I LET EXPRESSION BE: 1472 | 1 | QUINTARY; 1474 J | CYCLE 1475 | | EITHER 1476 | | ... (RETURN ) 1479 | j (BEGIN 1481 | 2 | READ(OR_BAR); 1486 1 3 J QUINTARY; 1488 | | APPLY (OR) ; 1493 | 1 | END) 1495 | 2 | END 1496 | | END 1497 ] | _|_ *** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 149 8 | 0 | LET QUINTARY BE: 1502 | 1 | QUADBARY; 1504 | | CYCLE 1505 | | EITHER 1506 | | ... (RETURN ) 1509 | | (BEGIN 103 1511 j 2 | READ (AMPERSAND) ; 1516 | 3 | QUADRARY; 1 5 1 8 J | A P P L Y (AND) 1522 J 1 | END) 1524 | 2 | END 1525 I | END 1526 | | _|_ * * * * NO S Y N T A X ERRORS D E T E C T E D NO S Y N T A X ERRORS I N C O M P I L A T I O N SO FAR 1527 | 0 | L E T QUADRARY B E : 1531 | 1 | T E R T I A R Y ; 1533 | | C Y C L E 1 5 3 4 | | E I T H E R 1535 | | ... (RETURN ) 1538 J | ( R E A D ( E Q U A L ) ; A P P L Y ( C O N D I T I O N _ E Q ) ) 1549 | | ( B E G I N 1551 I 2 1 R E A D ( L E S S _ T H A N ) ; 1556 | | E I T H E B 1557 J | (READ (EQUAL) ; A P P L Y ( C O N D I T I O N _ L E ) ) 1568 | | ( S K I P ; A P P L Y ( C O N D I T I O N _ L T ) ) 1576 | 3 I END 1577 | 1 j END) 1579 J | ( B E G I N 1581 | 2 | R E A D ( G R E A T E R _ T H A N ) ; 1586 | | E I T H E R 1587 I j (READ (EQUAL) ; A P P L Y ( C O N D I T I O N _ G E ) ) 1598 J J ( S K I P ; A P P L Y ( C O N D I T I O N _ G T ) ) 1606 | 3 | END 1607 i 1 | END) 1609 } | ( B E G I N 1611 | 3 | B E A D ( N O T , E Q U A L ) ; 1618 | | A P P L Y ( C O N D I T I O N _ N E ) ; 1623 J 1 | END) 1625 | 2 | END; 1627 j 3 | T E R T I A R Y ; 1629 | | A P P L Y ( F I N I S H _ C O N D I T I O N ) ; 1634 | 2 | END 1635 | | |_ * * * * NO S Y N T A X I R R O R S D E T E C T E D NO S Y N T A X ERRORS I N C O M P I L A T I O N SO F A R 1636 | 0 | L E T T E R T I A R Y B E : 1640 | 1 J S E C O N D A R Y ; 1642 | | C Y C L E 1643 » | E I T H E R 1644 | I . . . ( R E T U R N ) 1647 | | ( R E A D ( P L U S ) ) 1653 | | ( R E A D ( M I N U S ) ) 1659 | | ( B E G I N 1661 | 3 | R E A D ( C A T E N A T E _ B A B , O R _ B A B ) ; 1668 | 1 | END) 1670 | 2 | END; 1672 | 3 | S E C O N D A R Y ; 104 1674 | | APPLY (DO_TERTIARY_OPERATION) ; 1679 i 2 | END 1680 | | _|_ **** N 0 SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1681 | 0 J LET SECONDARY BE: 1685 | 1 j PRIMARY; 1687 | | CYCLE 1688 | | EITHER 1689 | | ... (RETURN ) 1692 | | (READ(STAR)) 1698 | | (READ(SLASH)) 1704 | 2 | END; 1706 | 3 | PRIMARY; 1708 | | APPLY(DO_SECONDARY_OPERATION); 1713 | 2 | END 1714 J | _ | _ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1715 | 0 | LET PRIMARY BE: 1719 | | EITHER 1720 | | (LITERAL_CONSTANT) 1723 | | (UNARIED PRIMARY) 1726 | | (BEGIN 1728 | 1 | READ (OPEN) ; 1733 | I APPLY (STABTING_A_BBACKETTED_,EXPRESSION 1737 | | ) ; 1738 J 2 | EXPRESSION; 1740 | 3 | BEAD (CLOSE) ; 1745 | J APPLY (CLOSING_A_BRACKETTED_EXPRESSION) ; 1750 | 0 | END) 1752 | | (STOREAGE_REFERENCE) 1755 | 1 | END 1756 | | _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1757 | 0 | LET LITERAL_CONSTANT BE: 1761 | | EITHER 1762 | | (READ(INTEGER); APPLY(FIXED_LITEBAL)) 1773 | | (READ(FLOAT_LITERAL);APPLY(FLOAT_LITERAL)) 1784 J | (READ(STRING); APPLY(CHABACTER_LITERAL)) 1795 | | (READ(BITSTRING); APPLY(BIT_LITERAL)) 1806 | 1 | END 1807 | | _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAB 1808 | 0 | LET UNARIED_PRIMARY BE: 1812 | | EITHER 1813 | | (READ (MINUS)) 1819 | | (READ(PLUS)) 105 1825 | | (HEAD (NOT)) 1831 | 1 | END; 1833 | 2 | PRIMARY; 1835 J | APPLY(UNARY OPERATOR) 1839 | I _|_ **** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR 1840 | 0 | LET STOREAGE REFERENCE BE: 1844 1 I READ(IDENTIFIER); 1849 | | APPLY (VARIABLE) ; 1854 j EITHER 1855 | | (SKIP) 1858 j (BEGIN 1860 | 2 I READ (OPEN) ; 1865 | | CYCLE 1866 3 1 EXPRESSION; 1868 | APPLY (EXPRESS10N_LIST_ELEMENT) 1873 j j EITHER 1874 | (READ (COMMA) ) 1880 ) ... (EXIT ) 1883 4 I END; 1885 3 1 END; 1887 | 4 t READ (CLOSE) ; 1892 i APPLY(END_EXPRESSION_LIST); 1897 1 I END) 1899 2 1 END; 1901 ] APPLY(END VARIABLE); 1906 j _ l _ ** NO SYNTAX ERRORS DETECTED NO SYNTAX ERRORS IN COMPILATION SO FAR NO SYNTAX ERRORS 106 S Y M B O L I N F O R M A T I O N ****** TERMINAL SYMBOLS ****** NO MBER SYMBOL 1 IDENTIFIES 2 COLON 3 PROCEDURE 4 OPTIONS 5 OPEN 6 MAIN 7 CLOSE 8 SEMI 9 END 10 DECLARE 11 COMMA 12 STAR 13 MINUS 14 INTEGER 15 FIXED 16 FLOAT 17 BIT CO CHARACTER 19 VARYING 20 RETURNS 21 EQUAL 22 PUT 23 SKIP 24 PAGE 25 LIST 26 GET 27 CALL 28 RETURN 29 IF 30 THEN 31 ELSE 32 DO 33 WHILE 34 TO 35 BY 36 OB BAB 37 AMPERSAND 38 LESS_THAN 39 GREATER THAN 40 NOT : 41 PLUS 42 CATENATE_BAR 43 SLASH 44 FLOAT LITERAL 45 STBING 107 46 BITSTRING A P P L Y S Y M B O L S ****** NUMBER SYMBOL 1 M A I N 2 I N D E N T 3 L I N E 4 E X D E N T 5 PROCEDURE L I S T 6 E N D _ P R O C E D U R E A L I S T 7 S T A R T _ D E C L A R A T I O N 8 NEW_SCALAR 9 NEJTARRAY 10 S TAR_BOUND 11 ONE^BODND 12 LOW_BOUND 13 HIGH_BOUND 14 END BOUND L I S T 1 5 N E G A T I V E C O N S T A N T 1 6 GET_BOUND 17 F I X E D T Y P E 1 8 F L O A T _ T Y P E 1 9 B I T T Y P E 2 0 D E F A U L T _ B I T _ S I Z E 2 1 B I T _ L E N G T H 2 2 C H E C K _ C H A R A C T E R _ P A R A K IETER 2 3 C H A R A C T E R _ L E N G T H 2 4 C H A R A C T E R _ T Y P E 2 5 P U S H S C O P E 2 6 N O _ F O R M A L _ P A R A M E T E R S 2 7 N O _ R E T U R N S _ C L A U S E 2 8 P ROCEDURE HEAD 2 9 NEW_PARAMETER 3 0 END~FORMAL P A R A M E T E R L I S T 3 1 F I X E D _ P R O C 3 2 F L O A T _ P R O C 3 3 B I T _ P R O C 3 4 C H A R A C T E R _ P R O C 3 5 A S S I G N M E N T . S T A T E M E N T 3 6 S T A R T ARRAY S U B S C R I P T I N G 3 7 S U B S C R I P T E X P R E S S I O N 3 8 END ARRAY S U B S C R I P T I N G 3 9 P U T S K I P 4 0 P U T _ P A G £ 41 P U T ~ E X P R E S S I O N 4 2 GET S K I P 4 3 S I M P L E_. V A R I A B L E 4 4 G E T _ I T E M 4 5 S E T _ U P _ F O R _ P R O C E D U R E _ C A L L 4 6 N U L L P A R A M E T E R L I S T 4 7 G E T _ P A R A M E T E R 4 8 E N D _ O F _ P A R A M E T E R S 4 9 P ROCEDURE C A L L 109 50 S I H P L E _ B E T U B N 51 BEF0BE_BETURN_, EXPRESSION 52 RETURN_EXPRESSION 53 IF_HEAD 54 IF_THEN 55 SIABTING_ELSE_CLAUSE 56 IF_THEN_ELSE 57 SIHPLE_DO 58 END_DO_STATEMENT 59 BEFORE_WHILE_EXPRESSION 60 WHILE_EXPRESSION 61 DO_VARIABLE 62 GET_INITIAL_DO_VALUE 63 GET_FINAL_DO_VALUE 64 DEFAULT_BY_VALUE 65 GET_DO_BY_VALUE 66 OR 67 AND 68 CONDITION_EQ 69 CCNDITION_LE 70 CONDITION_LT 71 CONDITION_GE 72 CONDITION~GT 73 CONDITION~NE 74 FINISH CONDITION 75 DG_TERTIARY_OPERATION 76 DO_SECONDARY_OPERATION 77 STARTING_A_,BRACKETTED_EXPRESSION 78 CLOSING_A_BRACKETTED_EXPRESSION 79 FIXE D_LITER AL 80 FLOAT_LITERAL 81 CHARACTER_LITERAL 82 BIT_LITEfiAL 83 0NARY_OPERATOfl 84 VARIABLE 85 EXPRESSION_LIST_ELEHENT 86 END_EXPRESSION_LIST 87 E N D ' V A R I A B L E 110 S T A T E S U M A R Y TOTAL NUMBER OF STATES : 397 TOTAL NUMBER OF LR (0) STATES . : 363 NUMBER OF INADEQUATE LR {0) STATES : 34 NUMBEB OF INADEQUATE SIR STATES . : 0 NUMBER OF SLR ( 1) STATES .„: 34 1 1 1 A S A M l i l PARSER T h i s appendix presents what a parser running on t a b l e s produced by the c u r r e n t implementation of P a r s l e y c o u l d look l i k e . The pa r s e r presented here i s c a l l e d by the t r a n s l a t o r and r e t u r n s the next semantic a c t i o n f o r the t r a n s l a t o r t o execute. The parser r e t u r n s the semantic a c t i o n HALT_ACTION when the parse i s s u c c e s s f u l l y concluded. IV.1 TABLE STRUCTURE The c u r r e n t implementation produces t h r e e separate t a b l e s . The a c t u a l format of these t a b l e s can be found i n T e c h n i c a l Note 75-04 of the Department of Computer Science a t the O n i v e r s i t y of B r i t i s h Columbia. The t a b l e s are the PARSE_TABLE, the TRANSITIONJTABIE, and the ACTION_RETURN_TABLE. The s t r u c t u r e of these t a b l e s i s as f o l l o w s : PARSE_TABIE IS ARRAY (0: : *) OF STRUCTURE PABSE_OPERATION_TYPE(PARSE OPERATION), INTEGER (LOOKAHEAD_DEPTH_FIELD) , INTEGER (NUMBER_FIEID) , INTEGER (INDEX_FIE.LD), END; 112 TRANSITION_TABIE IS ARRAY (0: :*) OF • STRUCTURE TERMINAL_SYMBOL_TYPE (TEBMINAL_SYMBOL) ,. INTEGER (STATE_TO_ENTEB_ON_J3ATCH) OB ERRGR_RECOVERY_TYPE(ERROR_RECOVERY), INTEGER (FILLER) END; ACTION_RETURN_TABLE IS ARRAY (0::*) OF STRUCTURE ACTION_TYPE(ACTION), INTEGER(STATE_TO_ENTER_AFTER_ACTION) OR INTEGER(STATE_TO_MATCH) , INTEGER(STATE_TO_ENTER_ON_MATCH) END; PARSE_OPERATION_TYPE i s one of the following nine possible operations; HALT, READ, READ_KITH_ERROB_RECOVERY, LOOKAHEAD, LOOKAHEAD_HITH_ERROR_RECOVERY, APPLY, POP, PUSH_EMPTY, or RETURN_FROM_NONTERMINAL. TERMINAL_SYflBOL_TYPE, EBROR_RECOVEBY_TYPE, and ACTION_TYPE contain the terminals, error recovery techniques, and semantic actions of the Parsley proqram which caused the tables to be produced. Each of the elements i n these types has an unique numeric value associated with i t by the Parsley implementation. I t i s t h i s numeric value which i s found i n the tables. IV.2 PARSER ENVIRONMENT This parser requires several routines and data structures to be supplied. The data structures must be retained i n t a c t between invocations of the parser. 1 1 3 A r o u t i n e , c a l l e d STREAM, which r e t u r n s the Nth token from the i n p u t stream i s r e q u i r e d . T h i s r o u t i n e w i l l set the v a r i a b l e CTJRRENT_S YMBOL to the s p e c i f i e d token's TERMINAL_SYMBOL_TYPE. I t w i l l a l s o s e t the v a r i a b l e CUBBENT_SYMBOL^INFORMATION t o a s t r u c t u r e which c o n t a i n s a l l the scanner i n f o r m a t i o n about the s p e c i f i e d token. A syntax e r r o r r o u t i n e , c a l l e d SYNTAX_.ERROR, i s a l s o r e q u i r e d . T h i s r o u t i n e w i l l handle a l l syntax e r r o r message gen e r a t i o n as w e l l as a l l syntax e r r o r r e c o v e r y . The r o u t i n e i s s u p p l i e d with an ERROR_RECOVERY_TYPE to help i n t h i s p r o c e s s . Syntax e r r o r s d i s c o v e r e d with no a s s o c i a t e d recovery w i l l supply the DEFA0LT_RECOVERY value to t h i s r o u t i n e . The parser r e q u i r e s two s t a c k s . The STATE^STACK holds i n t e g e r s and the TCKEN_STACK holds s t r u c t u r e s with scanner i n f o r m a t i o n as found i n the v a r i a b l e CURRENT_SYMBGL^INFORMATION. , The height o f the two s t a c k s can be found i n the v a r i a b l e STACK_HEIGHT. F i n a l l y , there i s an i n t e g e r v a r i a b l e c a l l e d PARSER_STATE which i s the c u r r e n t s t a t e of the p a r s e r . . There i s another i n t e g e r v a r i a b l e STREAM_INDEX which i n d i c a t e s the next token t o be parsed i n the i n p u t stream. IV. 3 PAHSE.fi INITIAL CONDITIONS P ABS EB_ STA T E : = 1 STACK_HEIGHT:=0 STBEAM INDEX:=1 IV.4 THE PA1SEB PHOCEDOBE PARSER IN TRANSLATOB RETURNS ACTIONJTYPE; DECLARE INTEGER (NUMBEB,INDEX,LOOKAHEAD_DEPTH), PARSE_OPERATION_TYPE (STATE_TYPE) ; CODE CYCLE BEGIN OPEN PARSE_TABLE(PABSEB_STATE); STATE_TYPET=PARSE_OPEBATION; LOOKAHEAD_DEPTH:=LOOKAHEAD_DEPTH_FIELD; NUMBEB:=NUMBEB_FIELD; INDEX:=INDEX_FIELD; END; SELECT FROM PARSE_OPERATION_TYPE ON STATE_TYPE; HALT: RETURN FBOH PARSER WITH. HALT_ACTION; READ: READ_WITH_ERROR_RECOVERY: CALL READ STATE; LOOKAHEAD: LOOKAHEAD WITH_ERROR_RECOV£RY: CALL LOOKAHEAD_STATE; APPLY: CALL APPLY STATE; POP: CALL POP_STATE; PUSH_EMPTY: CALL PUSH_EMPTY_STATE; RETURN_FROM_NONTERMINAL: CALL RETURN_STATE; END END; PEOCEDUBE EEAD_STATE IN PARSER; DECLARE INTEGER (I) ; CODE STREAM (STREAM_INDEX) ; DO I:=INDEX TO INDEX+NUMBEB; I F TRANSITION_TABL£(I).TEBMINAL_SYMBOL= CURRENT_SYMBOL; THEN: /* POSH ONTO THE STACKS .*/ STACK_HEIGHT:=STACK_HEIGHT+1; STATE_STACK(STACK_HEIGHT):=PARSER_STATE; TOKEN_STACK(STACK~HEIGHT) : = C UR R E N T_ S Y M B 0 L^ IN FO B M AT 10 N ; PARSER_STATE:=TRANSITION_TABLE(I). STATE_TO_ENTER_ON_MATCH; STREAM_INDEX:=SIEAM_INDEX+1 ; RETURN FROM READ_STATE; END END; /* TO GET HERE flEANS A SYNTAX ERROR */ IF STATE_TYPE=READ; THEN: SYNTAX_ERROR (DEFAULT_RECOVERY) ; ELSE; SYNTAX_ERROR(TRANSITION_TABLE(INDEX+ NUMBER+1) ,ERROR_RECOVERY); END ; _ l _ PROCEDORE LOOKAHEAD_STATE IN PARSER; DECLARE INTEGER (I) ;~ CODE STREAM (STREAM_INDEX+LOOKAHEAD_DEPTH) ; DO I:=INDEX TO INDEX+NUMBER; IF TRANSITION_TABLE(I). TERMINAL^ YMBOL= CURRENT_SYMBOL; THEN: PABSEB_STATE:=TBANSITION_TABLE(I). STATE_TO_ENTEB_ON_JIATCH; RETURN FBOM LOOKAHEAD_STATE; END END; /* TO GET HEBE MEANS A SYNTAX ERROR */ IF STATE_TYPE=LOOKAHEAD; THEN: SYNTAX_ERROR (DEF AULT_RECOVERY) ; ELSE: SYNTAX_ERROR (TRANSITION_TABLE(INDEX+NUMBER+1) .ERROR_RECOVERY); END; _ l _ PROCEDURE APPLY_STATE IN PARSER; CODE PARSER_STATE:=ACTION RETURN TABLE (INDEX). , STATE_TO_ENTER_AFTER_ACTION; RETURN FROM PARSER WITH ACTION_RETURN_TABLE(INDEX). ACTION; _ l _ PROCEDURE POP_STATE IN PARSER; CODE ST ACK_HEIGHT:=STACK_. HEIGHT-NUMBER; PARSER_STATE:=INDEX; _ l _ PBOCEDURE PUSH_EHPTY_STATE IN PARSER; CODE STACK_HEIGHT:=STACK HEIGHT*1; STATE_STACK(STACK_HEIGHT):=PARSER_STATE; TOKEN_STACK(ST ACK_H EIGHT):=EMPTY_TOKEN; PARSER_STATE:=INDEX; _ l _ PROCEDURE RETURN_STATE IN PARSER; DECLARE INTEGER (I); CODE I:=INDEX; CYCLE /* A ZERO IN THE STATE TO MATCH FIELD INDICATES THE DEFAULT MATCH */ EXIT WHEN ACTION_RETURN_TABLE(I).STATE TO_MATCH=0; EXIT WHEN ACTION_fiETURN_TABLE(I).STATE_TO_MATCH= STATE_STACK(STACK_HEIGHT) ; I:=1+1; END; PARSER_STATE:=ACTION_RETURN_TABLE(I).. STATE TO_ENTER_ON_MATCH; _ l _ 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051774/manifest

Comment

Related Items