"Science, Faculty of"@en . "Computer Science, Department of"@en . "DSpace"@en . "UBCV"@en . "Jervis, Brian"@en . "2010-01-21T00:33:27Z"@en . "1974"@en . "Master of Science - MSc"@en . "University of British Columbia"@en . "A new data base independent query language for relational systems is presented. Queries in this language specify only properties of the data which is to be retrieved. An algorithm for reducing queries to a response relation is described. This reduction algorithm makes use of Micro-Planner to decide which relations in the data base are applicable to the query, and how these relations should be manipulated. A semantic model is used as the basis for this work. This query language is also compared with existing languages."@en . "https://circle.library.ubc.ca/rest/handle/2429/18798?expand=metadata"@en . "QUERY LANGUAGES FOR RELATIONAL DATA BASE MANAGEMENT SYSTEMS by Brian M. J e r v i s B . S c , U n i v e r s i t y of B r i t i s h Columbia, 1972 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE i n the Department of COMPUTER SCIENCE We accept t h i s t h e s i s as conforming t o the r e g u i r e d standard. The U n i v e r s i t y of B r i t i s h Columbia May 1971* In p r e s e n t i n g t h i s t h e s i s in p a r t i a l f u l f i l m e n t o f the requ i rement s f o r an advanced degree at the U n i v e r s i t y o f B r i t i s h Co lumb ia , I ag ree t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by the Head o f my Department or by h i s r e p r e s e n t a t i v e s . It i s u n d e r s t o o d that c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i thout my w r i t t e n p e r m i s s i o n . Department o f Computer Science The U n i v e r s i t y o f B r i t i s h Co lumbia Vancouver 8, Canada Date May 23, 1974. i i ABSTRACT A new data base independent query language f o r r e l a t i o n a l systems i s presented. Queries i n t h i s language s p e c i f y o n l y p r o p e r t i e s of the data which i s to be r e t r i e v e d . An a l g o r i t h m f o r r e d u c i n g q u e r i e s t o a response r e l a t i o n i s d e s c r i b e d . T h i s r e d u c t i o n a l g o r i t h m makes use of Micro-Planner to decide which r e l a t i o n s i n the data base are a p p l i c a b l e to the query, and how these r e l a t i o n s should be manipulated. A semantic model i s used as the b a s i s f o r t h i s work. T h i s query language i s a l s o compared with e x i s t i n g languages. i i i T able of Contents INTRODUCTION 1 THE RELATIONAL APPROACH TO DATA BASE MANAGEMENT ...........3 The R e l a t i o n a l Model Of Data 3 C l a s s e s Of R e l a t i o n s 5 Advantages Of R e l a t i o n a l Systems .7 Pre v i o u s Research 8 RELATIONAL ALGEBRA .................11 Sample R e l a t i o n s 12 Operations On R e l a t i o n s ........12 Union .......................13 I n t e r s e c t i o n 13 D i f f e r e n c e .......14 Cross Product 14 P r o j e c t i o n 15 J o i n ..15 R e s t r i c t i o n 16 D i v i s i o n 17 Choice Of Operations In The R e l a t i o n a l Algebra ..18 Implementation Of The R e l a t i o n a l Algebra ..............19 The R e l a t i o n a l Algebra As A Query Language 21 RELATIONAL CALCULUS 22 I n t r o d u c t i o n .22 The R e l a t i o n a l C a l c u l u s .23 The Alphabet .......23 Terms 23 i v WFFs ..24 Range Formulae ...................................25 Pure Range Formulae ..............................26 Range Coupled Q u a n t i f i e r s ........................26 Q-formulae ...............27 Q-expressions ....................................28 A Reduction Algorithm .................................28 G l o b a l And L o c a l Ranges 29 The J o i n A l g orithm ...............................30 The Reduction Algorithm ..........................32 Implementation Of The Reduction Algorithm .............33 R e s t r i c t i o n s on The R e l a t i o n a l C a l c u l u s ..........34 Represe n t a t i o n Of The Queries ....................35 E v a l u a t i o n Of The R e l a t i o n a l C a l c u l u s .................36 A QUERY LANGUAGE FOR RELATIONAL SYSTEMS ................... 40 The R e l a t i o n a l C a l c u l u s Re-Defined ....................40 The Alphabet ..41 Terms ............................................41 WFFs 42 Q-Expressions ....................................43 Range Formulae ...................................44 Range Coupled Q u a n t i f i e r s 44 Target L i s t . . . . . . . . . . . . . . . . .45 Queries ..........................................45 E x p l a n a t i o n Of The Queries ............................47 Comments On The R e l a t i o n a l C a l c u l u s ..............48 Use With N a t u r a l Language ...........49 V A NEW FRAMEWORK FOR RELATIONAL SYSTEMS ..........52 Reduction Of Queries - An Overview .................... 53 The Use Of Micro-Planner 54 The Semantic Model 56 The Meaning Of R e l a t i o n s 57 P r o p e r t i e s Of R e l a t i o n s 59 Optimizing Information 59 A c t i v e And I n a c t i v e Data ..........60 G e n e r a l i t y Of The Semantic Model 63 Prov i n g R e l a t i o n a l Terms .......64 E x p l a n a t i o n Of The Problem ..66 P r o p e r t i e s Of R e l a t i o n s ....................67 Proving XRy 69 Proving XRyR'Z ....72 Usefulness Of The V a l i d Path 74 Summary ..........74 The P r o c e s s i n g Of R e l a t i o n a l Terms 75 The L i s t Returned By Micro-Planner .....76 Proving X 78 Proving XRy 79 Repr e s e n t a t i o n Of The Goals .79 B a s i c Proof Strategy 80 L i m i t i n g Unnecessary Work .82 Remembering The Proof .......83 The E f f e c t Of Real World Knowledge ..........85 Proving XRyR'Z 87 The Reduction Algorithm 88 v i The Reduction Algorithm - An Overview ............88 J u s t i f i c a t i o n Of The Reduction Algorithm .........91 Treatment Of Conj u n c t i o n s And D i s j u n c t i o n s ..91 Treatment Of Negated Terms 92 Treatment Of Q u a n t i f i e r s ...................,93 The Allowable V a r i a b l e s In Target L i s t s ..........95 Why The R e l a t i o n a l C a l c u l u s Of Codd I s Bypassed ..95 The Treatment Of Real World Knowledge .........97 Repr e s e n t a t i o n Of Real World Knowledge ..,...,.,..97 M u l t i p l e Path Problems ...100 SUMMARY ,104 BIBLIOGRAPHY ..106 APPENDIX 1 .....110 APPENDIX 2 ....113 APPENDIX 3 ................116 APPENDIX 4 .119 APPENDIX 5 ...........................124 APPENDIX 6 ......129 APPENDIX 7 133 APPENDIX 8 .................137 APPENDIX 9 ...............141 APPENDIX 10....................... ..............146 APPENDIX 11 ..............150 v i i ACKNOWLEDGEMENT I would l i k e to express my a p p r e c i a t i o n to Dr. Bay R e i t e r f o r the v a l u a b l e guidance, d i r e c t i o n , and i n s p i r a t i o n he pro v i d e d throughout t h i s work. I would a l s o l i k e to thank Dr. Doug Seeley f o r h i s h e l p f u l s u g g e s t i o n s . The f i n a n c i a l support of the N a t i o n a l Research C o u n c i l i s g r a t e f u l l y acknowledged. 1 INTRODUCTION H i s t o r i c a l l y , data base management was performed by i n d i v i d u a l programmers who attempted to design t h e i r f i l e s i n such a way as to op t i m i z e the e x e c u t i o n of t h e i r programs. I f a new a p p l i c a t i o n was found, the data would be re-arranged, and a new f i l e c o n t a i n i n g redundant i n f o r m a t i o n would be c r e a t e d . Since t h i s approach i s both c o s t l y and i m p r a c t i c a l , g e n e r a l purpose data base management systems are now i n widespread use. These systems permit many users, each with a d i f f e r e n t a p p l i c a t i o n , to c o n c u r r e n t l y share a dynamic data base. There are now a l a r g e number of such systems i n e x i s t e n c e [ 6 ]. From the development of these systems emerged a number of g e n e r a l p r i n c i p l e s f o r d e s i g n i n g data base management systems[5]. As a r e s u l t , a number of new pr o p o s a l s have been put forward. One such proposal i s the r e l a t i o n a l a b r o a c h to data base management. In t h i s approach, data bases are viewed as a c o l l e c t i o n of time v a r y i n g r e l a t i o n s which are operated upon by a number of set t h e o r e t i c o p e r a t i o n s . T h i s t h e s i s i s p r i m a r i l y concerned with user languages f o r r e l a t i o n a l systems. The r e l a t i o n a l c a l c u l u s , which has been proposed by Codd[11] as the b a s i s f o r a l l guery languages f o r r e l a t i o n a l systems, i s c r i t i c a l l y examined. S e v e r a l i n h e r e n t d i f f i c u l t i e s are observed, which l e a d to the proposal of a new framework. T h i s framework i n c l u d e s a new guery language which r e g u i r e s 2 users to s p e c i f y only the p r o p e r t i e s of the data they want r e t r i e v e d . T h i s language i s not data base dependent, and thus q u e r i e s are expressed i n the same f a s h i o n r e g a r d l e s s of the c u r r e n t o r g a n i z a t i o n of the r e l a t i o n s . T h i s r e q u i r e s the system to be capable of d e c i d i n g which elements of the data base are i r e l e v a n t to the user's request, and d e c i d i n g how these elements should be manipulated i n order to produce the c o r r e c t response. T h i s i s a n o n - t r i v i a l problem which adds a completely new dimension to the systems proposed by Codd. The c u r r e n t system i s not only capable of answering g u e r i e s but w i l l accept r e a l world knowledge which a f f e c t s the response to the q u e r i e s , i n t r o d u c e new r e l a t i o n s , and accept new i n f o r m a t i o n about the c u r r e n t data base which i s a u t o m a t i c a l l y used to optimize the r e t r i e v a l process. 3 Is. THE RELATIONAL APPROACH TO DATA BASE MANAGEMENT a j j . 1\u00C2\u00AB The R e l a t i o n a l Model of Data \u00E2\u0080\u00A2 \"Since s e t theory provides a wealth of o p e r a t i o n s f o r d e a l i n g with r e l a t i o n s , a s e t - t h e o r e t i c data s t r u c t u r e appears worth i n v e s t i g a t i o n . \" 1 In t h i s t h e s i s , we are concerned with the r e l a t i o n a l model of data presented by Codd [ 8 ] , The term r e l a t i o n i s used here i n i t s accepted mathematical sense. Given s e t s S1,S2,.,.,Sn not n e c e s s a r i l y d i s t i n c t , R i s a r e l a t i o n on these n s e t s i f i t i s a s e t of n - t u p l e s , each of which has i t s f i r s t element from S1, i t s second element from S2, e t c . More c o n c i s e l y : R( S1 S2 ... Sn ) <=- S1 x S2 x ... x Sn. We r e f e r to S j as the j t h domain of R. As d e f i n e d above, R i s s a i d to have degree n. \u00E2\u0080\u00A2 Example 1.1 \u00E2\u0080\u00A2 I f S1 i s P r o j e c t * , S2 i s P r o j e c t name, and S3 i s P r o j e c t l o c a t i o n , then: R ( P r o j e c t , P r o j e c t name,Project l o c a t i o n ) c: P r o j e c t * x P r o j e c t name x P r o j e c t l o c a t i o n R e l a t i o n s are r e p r e s e n t e d i n a t a b l e - l i k e format, with the domain names appearing a t the top of the t a b l e , and the t u p l e s i n the r e l a t i o n appearing beneath. * C h i l d s , D.L. \" D e s c r i p t i o n of a S e t - T h e o r e t i c Data S t r u c t u r e . \" Proceedings of the 1968 FJCC^ pp.557-564. 4 \u00E2\u0080\u00A2 Example 1.2 \u00E2\u0080\u00A2 B (PROJECT* PROJECT-NAME PBOJECT-LOCATION) 1 ROYAL TOWERS VANCODVER 2 BURRARD SHIPYARDS VANCOUVER 3 P.C.C. MERRITT 4 RAPID TRANSIT VICTORIA 5 GBANVILLE MALL VANCOUVER 6 OLYMPIC SEAWAY VICTORIA In t h i s framework, the data base c o n s i s t s of a c o l l e c t i o n of time v a r y i n g r e l a t i o n s of a s s o r t e d degrees. The r e l a t i o n s are not s t a t i c , but c o n s t a n t l y changing. They are s u b j e c t to i n s e r t i o n , d e l e t i o n , and m o d i f i c a t i o n . Since r e l a t i o n s are only s p e c i a l s e t s , the r e l a t i o n a l model makes use of s e t - t h e o r e t i c o p e r a t i o n s i n order to perform the data management f u n c t i o n s . T h i s s e t of o p e r a t i o n s , c a l l e d a r e l a t i o n a l a l g e b r a , forms the only s e t of o p e r a t i o n s which may s e l e c t data from the data base. * Example 1.3 \u00E2\u0080\u00A2 In order to form the r e l a t i o n which l i s t s the p r o j e c t s who are l o c a t e d i n Vancouver, one r e s t r i c t s R t o those t u p l e s whose value of PROJECT-LOCATION i s Vancouver. T h i s produces a new r e l a t i o n R\u00C2\u00BB whose t u p l e s a r e : R\u00C2\u00AB (PROJECT* PROJECT-NAME PROJECT-LOCATION) 1 ROYAL TOWERS VANCOUVER 2 BURRARD SHIPYARDS VANCOUVER 5 GRANVILLE MALL VANCOUVER 5 n i t 2 ^ C l a s s e s of R e l a t i o n s n R e l a t i o n s f a l l i n t o one of two c l a s s e s - simple, or compound. Simple r e l a t i o n s have the property that no domain i n the r e l a t i o n i s i t s e l f another r e l a t i o n . Compound r e l a t i o n s , on the o t h e r hand, have the property that at l e a s t one domain i n the r e l a t i o n i s i t s e l f a r e l a t i o n . Compound r e l a t i o n s d e f i n e h i e r a r c h i e s . \u00E2\u0080\u00A2 Example 2.1 \u00E2\u0080\u00A2 In t h i s example, R i s a compound r e l a t i o n . R (PARTS PART-DESC Q-O-H (PROJECT* PROJECT-DESC NUMBER-ORDERED)) ( 1 WIDGETS 47 1 HOUSING 40 2 APARTMENTS 3 ) ( 2 FIDGETS 4 1 HOUSING 2 4 TOWN-HOUSES 1 ) Codd[9,10] p o i n t s out t h a t compound r e l a t i o n s can be reduced to a number of corresponding simple r e l a t i o n s with no i n f o r m a t i o n l o s s . F u r t h e r , he shows t h a t t h i s i s not only p o s s i b l e , but d e s i r a b l e . T h i s process i s c a l l e d n o r m a l i z a t i o n , and the new r e l a t i o n s which are produced are s a i d to be 12\u00C2\u00A3ialized\u00C2\u00AB There are two advantages to using data bases which c o n s i s t of normalized r e l a t i o n s . F i r s t l y , i t s i m p l i f i e s the r e l a t i o n a l a l g e b r a , s i n c e the op e r a t i o n s on r e l a t i o n s need only deal with simple r e l a t i o n s . Secondly, normalized data bases are c o n s i s t e n t , non-redundant, and f r e e of u n d e s i r a b l e update dependencies. 6 \u00E2\u0080\u00A2 Example 2.2 \u00E2\u0080\u00A2 In example 2.1, R can be represented by: P (PART* PART-DESC Q-O-H) Q (PARTS PROJECT* PROJECT-DESC NUMBER-ORDERED) T h i s r e p r e s e n t a t i o n i s p r e f e r a b l e to t h a t of example 2.1. No t i c e t h a t with r e l a t i o n R, i f a new p a r t i s added to the system, i t cannot be recorded u n t i l at l e a s t one p r o j e c t which o r d e r s t h a t p a r t i s a l s o recorded. With the r e p r e s e n t a t i o n of example 2.2 however, t h i s i s no long e r true s i n c e the r e l a t i o n P c o n t a i n s no i n f o r m a t i o n about the p r o j e c t s which use the p a r t . I t d e a l s with pjart i n f o r m a t i o n o n l y . T h i s r e p r e s e n t a t i o n i s s t i l l inadeguate. I f a new p r o j e c t i s r e c o r d e d , i t cannot be put i n the data base u n t i l the p a r t s which i t w i l l be using are known. Therefore r e l a t i o n Q i s s p l i t f u r t h e r . \u00E2\u0080\u00A2 Example 2.3 \u00E2\u0080\u00A2 PART (PART* PART-DESC Q-O-H) PROJECT (PROJECT* PROJECT-DESCRIPTION) SUPPLY (PART* PROJECT* NUMBER-ORDERED) The i n f o r m a t i o n d e a l i n g with p a r t s , p r o j e c t s , and the supply of p a r t s to p r o j e c t s i s now represented by separate r e l a t i o n s . The data base c o n t a i n s e x a c t l y the same i n f o r m a t i o n as before but i s now f r e e of the previous update dependencies. For the purposes of t h i s t h e s i s , we assume t h a t a l l r e l a t i o n s are normali2ed. 7 n i^ v a n t a g e s of R e l a t i o n a l Systems a There are many p r o p e r t i e s of r e l a t i o n a l systems which make them more d e s i r a b l e than t r a d i t i o n a l f a c t r e t r i e v a l systems. These can be summarized as f o l l o w s : 1. They provide a high degree of independence between the user and the data base. 2. The data bases are c o n s i s t e n t , and non-redundant. 3. The data s t r u c t u r e i s simple, yet extremely powerful. 4. They have a s u p e r i o r query c a p a b i l i t y . 5. They p r o v i d e good i n t e r a c t i v e support f o r the c a s u a l user. The c h i e f v i r t u e of r e l a t i o n a l systems i s the high degree of user - data independence they provide. T h i s independence oc c u r s a t two l e v e l s . F i r s t l y , the user i s presented with a l o g i c a l r e p r e s e n t a t i o n of h i s data which may we l l d i f f e r from i t s p h y s i c a l r e p r e s e n t a t i o n . T h i s i s p o s s i b l e s i n c e r e l a t i o n s can only be accessed through the o p e r a t i o n s of the r e l a t i o n a l a l g e b r a . T h i s i s a v a l u a b l e p r o p e r t y , s i n c e f a c t o r s such as e f f i c i e n c y and machine c o n f i g u r a t i o n can now be taken i n t o account without i n f l u e n c i n g the users view of h i s data. While t h i s i s a fundamental concept i n i n f o r m a t i o n r e t r i e v a l , most c u r r e n t systems v i o l a t e i t [ 5 ] . Secondly, the user need not be aware of how h i s data i s arranged i n t o r e l a t i o n s . The i n f o r m a t i o n which must be s t o r e d can be broken i n t o r e l a t i o n s i n any convenient manner, and i n f o r m a t i o n i n the data base, even though i t may be i m p l i c i t 8 ( i . e . i n separate r e l a t i o n s ) , w i l l s t i l l be r e t r i e v a b l e . No c u r r e n t f a c t r e t r i e v a l systems are capable of drawing c o n c l u s i o n s from i n f o r m a t i o n which i s s t o r e d i n separate f i l e s . Another advantage i s that through the process of n o r m a l i z a t i o n , redundant and i n c o n s i s t e n t i n f o r m a t i o n can be removed from the data base. Thus, these systems tend to be f r e e of u n d e s i r a b l e update and d e l e t i o n dependencies. T h i s process a l s o tends to arrange data bases i n c o n c e p t u a l l y c l e a r and c o n c i s e u n i t s . T h i r d l y , the data s t r u c t u r e i s both simple and v e r s a t i l e . Other data s t r u c t u r e s , such as r i n g s , t r e e s , networks, and graphs, can be represented using r e l a t i o n s . F i n a l l y , r e l a t i o n a l systems lend themselves to a v a r i e t y of d i f f e r e n t guery languages. T h i s p o i n t w i l l be amply demonstrated i n the f o l l o w i n g c h a p t e r s . n Jk \u00C2\u00BBK P r e v i o u s Research n Over the past f i v e years, there has been c o n s i d e r a b l e i n t e r e s t i n the r e l a t i o n a l approach to data base management. The f i r s t c o n t r i b u t i o n s were by Feldman and Rovner[13]. They i n t r o d u c e d an approach whereby data i s stor e d i n the form of b i n a r y r e l a t i o n s . Users can access these r e l a t i o n s from A l g o l programs by means of a t t r i b u t e - o b j e c t - v a l u e t r i p l e s . A s i m i l a r approach was taken by L e v i n and Maron[21]. S e v e r a l 9 implementations have been based on t h i s scheme, i n c l u d i n g one by Gammill[ 14]. The i n i t i a l c o n t r i b u t i o n towards the goal of a system based on a r b i t r a r y r e l a t i o n s came from C h i l d s [ 3 , 4 ]. He presents a s e t t h e o r e t i c data s t r u c t u r e comprised of s e t o p e r a t i o n s , datum names, data, and s e t names. In t h i s system, the set o p e r a t i o n s a r e implemented as s u b r o u t i n e s which operate on s e t s i n the data base. T h i s approach i s s t i l l being f o l l o w e d , although the o p e r a t i o n s i n c u r r e n t r e l a t i o n a l a l g e b r a s d i f f e r from those presented by C h i l d s . The emphasis of t h i s system, however, was s t i l l on binary r e l a t i o n s , with a c c e s s paths being p r e - d e f i n e d . Since C h i l d ' s paper, most new work has been based upon t h a t of Codd[8], In t h i s a r t i c l e , he argues i n f a v o r of r e l a t i o n a l data bases, and o u t l i n e s a s e t of o p e r a t i o n s on r e l a t i o n s which are a p p l i c a b l e to r e l a t i o n s of a r b i t r a r y degrees. With t h i s framework access paths between r e l a t i o n s need not be pre-d e f i n e d . Supplementing t h i s a r t i c l e , Codd[9,10] deals with n o r m a l i z a t i o n of data bases i n order to e l i m i n a t e redundant i n f o r m a t i o n and unwanted domain dependencies. In the area of user languages, the main advances have again come from Codd[7,11,12]. In [ 1 1 ] he d e f i n e s a r e l a t i o n a l a l g e b r a and proposes the r e l a t i o n a l c a l c u l u s , an a p p l i e d p r e d i c a t e c a l c u l u s with n-tuple v a r i a b l e s . Based upon the r e l a t i o n a l c a l c u l u s i s DSL-ALPHA[7 ], an A s a p - l i k e query language. 10 The above papers i n s p i r e d many implementations. Among these are the systems of St r n a d [ 2 7 ] , Notley[24 ], G o l d s t e i n and St r n a d [ 1 5 ] , and B r a c c h i [ 2 ] . U n f o r t u n a t e l y , these works show l i t t l e o r i g i n a l i t y . They are a l l b a s i c a l l y implementations of the r e l a t i o n a l a l g e b r a . One very n o t i c a b l e p o i n t about r e s e a r c h i n t h i s area i s t h a t although t h e r e are many implementations based on Codd\u00C2\u00BBs work, very l i t t l e has been done to extend i t . N o t i c a b l e e x c e p t i o n s are Palermo[26], who i n t r o d u c e s a new r e l a t i o n a l c a l c u l u s with an improved r e d u c t i o n a l g o r i t h m , and Heath[17], who o u t l i n e s some unacceptable f i l e o p e r a t i o n s i n a r e l a t i o n a l data base. As of yet, there has been no comprehensive study of how r e l a t i o n s can best be s t o r e d i n order to optimize system performance. Since the o p e r a t i o n s of the r e l a t i o n a l a l gebra are w e l l d e f i n e d , and s i n c e the nature of these o p e r a t i o n s i s easy to observe, one would expect some concrete r e s u l t s c o u l d be obta i n e d . I t i s s u r p r i s i n g t h a t such l i t t l e e f f o r t has been put i n t o t h i s area, s i n c e e f f i c i e n c y i s one of the l a r g e s t problems i n h i b i t i n g the commercial use of r e l a t i o n a l systems. Palermo's concept of s e m i - j o i n i s the only advance that has been made. 11 lis. RELATIONAL ALGEBRA There are c e r t a i n p r i m i t i v e o p e r a t i o n s on r e l a t i o n s which form the b a s i s of any r e l a t i o n a l system. T h i s s e t of o p e r a t i o n s i s r e f e r r e d to as a r e l a t i o n a l a l g e b r a . These o p e r a t i o n s are the only ones which can manipulate the r e l a t i o n s - a l l other r o u t i n e s i n the system must access the r e l a t i o n s through them. T h i s i m p l i e s t h a t the r e l a t i o n a l a l g e b r a should be chosen so t h a t the s e l e c t i v e power i t possesses i s complete, i n the sense t h a t any reguest which c o u l d p o s s i b l y be formulated i n the system can a l s o be formulated as a query i n the r e l a t i o n a l a l g e b r a [ 4 ] . The o p e r a t i o n s of the r e l a t i o n a l a l g e b r a take r e l a t i o n s as t h e i r arguments, and produce a response which i s always a new r e l a t i o n . A c c o r d i n g l y , the r e s u l t i s c a l l e d the response r e l a t i o n . T h i s s e c t i o n d e f i n e s the r e l a t i o n a l a l g e b r a which t h i s t h e s i s adopts. I t i s based upon the a l g e b r a proposed by Codd[ 11 ]. 12 n 2,1... Sample R e l a t i o n s D The f o l l o w i n g r e l a t i o n s w i l l be used as examples throughout t h i s chapter. ( A B ) R1 R2 R3 A 1 H SUPPLIER* 1 1 2 2 2 3 PART* 1 2 3 2 7 PART* 3 2 1 2 3 1 PART-NAME ) A B C n 2i2_. Operations on R e l a t i o n s n a l g e b r a i n c l u d e s e i g h t o p e r a t i o n s on The r e l a t i o n a l a l g e b r a r e l a t i o n s . They a r e : A. Union B. I n t e r s e c t i o n C. D i f f e r e n c e D. Cross product E. P r o j e c t i o n F. J o i n G. D i v i s i o n H. R e s t r i c t i o n Of these o p e r a t i o n s , d i v i s i o n and terms of the f i r s t s i x . 13 2.Z.2.S& 2siP.fi The union of two r e l a t i o n s R and S i s d e f i n e d to be: R U S = { (r) : r\u00E2\u0082\u00ACR v r6S } , where B and S are each of degree n. \u00E2\u0080\u00A2 Example 2.1 \u00E2\u0080\u00A2 The union of Hi and R2 i s : RP ( AP BP ) 1 2 1 3 2 1 2 2 2 3 3 1 4 7 Onion i s t y p i c a l l y used to c r e a t e a r e l a t i o n which enumerates the values of some domain using a set of other r e l a t i o n s , each of which c o n t a i n s t h i s domain. 2.S.2..B I n t e r s e c t i o n The i n t e r s e c t i o n of two r e l a t i o n s R and S i s defined to be: R INT S = { (r) : r\u00E2\u0082\u00ACR 6 r\u00E2\u0082\u00ACS } , where R and S are each of degree n. \u00E2\u0080\u00A2 Example 2.2 \u00E2\u0080\u00A2 The i n t e r s e c t i o n of R1 and R2 i s : RP ( AP BP ) 1 2 I n t e r s e c t i o n c r e a t e s a r e l a t i o n t h a t c o n t a i n s the t u p l e s which are common to a l l the r e l a t i o n s being i n t e r s e c t e d . 14 2.. 2..C D i f f e r e n c e The d i f f e r e n c e of two r e l a t i o n s B and S i s d e f i n e d as: H - S = { (r) :r\u00E2\u0082\u00ACB e r-es } \u00E2\u0080\u00A2 Example 2.3 \u00E2\u0080\u00A2 The d i f f e r e n c e of B1 and R2 i s : BP ( AP BP ) 1 3 2 1 2 2 2 3 3 1 2.2.D Cross Product The c r o s s product of two r e l a t i o n s i s d e f i n e d to be: B X S = { (r,s) : r\u00E2\u0082\u00ACR 6 S6S } \u00E2\u0080\u00A2 Example 2.4 \u00E2\u0080\u00A2 RP product of R1 and B2 i s : SUPPLIER* PABT* A B 1 3 1 2 1 3 4 7 2 1 1 2 2 1 4 7 2 2 1 2 2 2 4 7 2 3 1 2 3 1 4 7 1 2 1 2 1 2 4 7 2 3 4 7 3 1 1 2 15 2 i 2 i E P r o j e c t i o n Suppose r i s a t u p l e of an n-ary r e l a t i o n R. Then f o r j=1,2, ... n, r [ j ] denotes the j t h component of t u p l e r , or the E f o j e c t i o n of r on domain number j . T h i s n o t a t i o n i s extended to a l i s t A = (J1\u00C2\u00BBj2, ... jk) , where j i \u00E2\u0082\u00AC ( 1 , 2 , ... n). Now, r[A]={r[ j1 ],r[ j 2 ] , . . . ,r[ jk]} . m D e f i n i t i o n \u00E2\u0080\u00A2 The p r o j e c t i o n R[A] of R on A i s d e f i n e d by: R[ A ] = { r [ A ] : r\u00E2\u0082\u00ACR} . Thus, i f the r e l a t i o n R (PART* PART-NAME) were p r o j e c t e d on i t s f i r s t domain, a r e l a t i o n which c o n t a i n s o n l y p a r t numbers would be formed. \u00E2\u0080\u00A2 Example 2.5 \u00E2\u0080\u00A2 The p r o j e c t i o n of R2 on i t s second domain, R2[2], i s : SP ( PART* ) 1 2 3 2 i 2 i F J o i n J o i n i s perhaps the most powerful o p e r a t i o n of the r e l a t i o n a l a l g e b r a . I t i s d e f i n e d as f o l l o w s . l e t \u00E2\u0082\u00AC ( r , s ) denote an a r b i t r a r y p r e d i c a t e whose only v a r i a b l e s are of the form r [ i ] , s [ j ] . Then the e j o i n of R with S i s d e f i n e d by: R[6]S = { (r,s) : r\u00E2\u0082\u00ACR e s\u00E2\u0082\u00ACS S e(r,s) } 16 * Example 2.6 \u00E2\u0080\u00A2 Suppose s r e s p e c t i v e l y . RP ( and p are t u p l e s of r e l a t i o n s R2 and R3 Then the j o i n R2 (s[ 2 ]=p[ 1 ]) R3 i s : SUPPLIER* 1 1 2 2 2 3 PART* 3 2 1 2 3 1 The j o i n R2 (s[ 2 ]>p[ 1 ]) R3 i s : RP ( SUPPLIER* 1 2 2 2 3 3 PART* 2 1 1 2 1 1 The j o i n R2 (s[2 ]*5=r[ 1 ]) R1 i s : RP ( SUPPLIER* 2 3 PART* 1 1 PART* 3 2 1 2 3 1 PART* 3 3 2 3 3 2 A 4 4 PART-NAME ) C B A B C A PART-NAME ) C C B C C B B 7 7 2.2.G R e s t r i c t i o n Let 6 be an a r b i t r a r y p r e d i c a t e whose only v a r i a b l e s are of the form r [ j ] . Then d e f i n e the r e s t r i c t i o n r (\u00E2\u0082\u00AC) of R by \u00E2\u0082\u00AC to be: R (6) = { (r) : r\u00E2\u0082\u00ACR S e (r) } . \u00E2\u0080\u00A2 Example 2.7 \u00E2\u0080\u00A2 The r e s t r i c t i o n of R2 , R2(r[2]=2) i s : RP ( SUPPLIER* PART* ) 1 2 2 2 The above i s a t y p i c a l use of r e s t r i c t i o n . I f there i s a 17 r e l a t i o n t h a t i n d i c a t e s the s u p p l i e r s who supply p a r t s , then r e s t r i c t i n g t h a t r e l a t i o n to the case where the p a r t number i s two w i l l r e s u l t i n a r e l a t i o n showing which s u p p l i e r s supply p a r t number two. 2,. 2.H D i v i s i o n D i v i s i o n i s the most c o u n t e r - i n t u i t i v e o p e r a t i o n i n the r e l a t i o n a l a l g e b r a . I t i s i n c l u d e d s i n c e i t i s the a l g e b r a i c c o u n t e r p a r t of the u n i v e r s a l q u a n t i f i e r . Assume R i s a b i n a r y r e l a t i o n . Then the image s e t gR (x) of x under R i s d e f i n e d by: gR(x)=[y : (x,y)\u00E2\u0082\u00ACR) \u00E2\u0080\u00A2 Example 2.8 \u00E2\u0080\u00A2 when r= (1 3), the image s e t gR(2)=(1,2) s i n c e r[2]=3, and the t u p l e s (1 3) and (2 3) are both i n R. The d i v i s i o n of R on A by S on B i s d e f i n e d by: R[A/B]S={r[ ABAR] : r\u00E2\u0082\u00ACR 8 S[ B ]\u00C2\u00A3gR (r[ ABAR ]) } , where ABAR i s the domain l i s t which i s the complement of A, and gR (x) i s the image s e t of x under the r e l a t i o n R. In t h i s d e f i n i t i o n , we c o n s i d e r t h a t R i s a b i n a r y r e l a t i o n composed of the two compound domains A and ABAR. The process of d i v i s i o n operates as f o l l o w s . Consider each t u p l e i n R to c o n s i s t of two elements, r1 and r 2 . Then r1 i s a t u p l e i n the q u o t i e n t of R[A/B]S i f f o r each t u p l e r3 i n S[B], 18 there e x i s t s a t u p l e i n R with r3=r2 and r1 i s always the other \" h a l f \" of the t u p l e i n which r2 i s contained. \u00E2\u0080\u00A2 Example 2.9 \u00E2\u0080\u00A2 The g u o t i e n t o f R2[ 2/1 ]R3[ 1 ] i s : RP ( S# ) 2 s i n c e s u p p l i e r 2 i s the only s u p p l i e r who s u p p l i e s a l l p a r t s . Notice t h a t r3[1 ] produces a r e l a t i o n which enumerates the part numbers. \u00E2\u0080\u00A2 2 . . 3 C h o i c e o f Operations i n the R e l a t i o n a l Algebra n Other authors[3,4,8,17,24] have proposed r e l a t i o n a l a l g e b r a s which appear to have the same s e l e c t i v e power[4], as the one d e f i n e d above, yet they c o n t a i n c o n s i d e r a b l y d i f f e r e n t o p e r a t i o n s . T h i s t h e s i s deals with q u e r i e s which are expressed i n a p r e d i c a t e c a l c u l u s n o t a t i o n , and must e v e n t u a l l y be reduced to a sequence of o p e r a t i o n s i n the r e l a t i o n a l a l g e b r a . Since p r o j e c t i o n and d i v i s i o n form the a l g e b r a i c c o u n t e r p a r t s of the e x i s t e n t i a l and u n i v e r s a l q u a n t i f i e r s , and s i n c e r e s t r i c t i o n can be used to process r e s t r i c t i o n s i n the query, the c h o i c e of t h i s a l g e b r a i s f i t t i n g . 19 \u00E2\u0080\u00A2 2^4^ Implementation of the R e l a t i o n a l Algebra c The o p e r a t i o n s of the r e l a t i o n a l a l g e b r a have been implemented using the programming language LISP[31], T h i s s e c t i o n e x p l a i n s how the r e l a t i o n s are s t o r e d , and shows the syntax of the r e l a t i o n a l a l g e b r a g u e r i e s . Each r e l a t i o n has two p r o p e r t i e s on i t s \u00C2\u00A3ropert_y l i s t . The f i r s t property i s DOMAINS, whose value i s a l i s t of a l l the domain names f o r the r e l a t i o n . The second i s TUPLES, whose value i s a l i s t of a l l the t u p l e names t h i s r e l a t i o n c o n t a i n s . Then, under the f l a g DATA on each t u p l e name i s a l i s t which i s the a c t u a l t u p l e i n the r e l a t i o n . \u00E2\u0080\u00A2 Example 4.1 \u00E2\u0080\u00A2 (GET \u00C2\u00ABR 'DOMAINS) = (PART# PART-NAME) (GET 'R 'TUPLES) = (T1 T2 T3) (GET 'T1 'DATA) = (1 A) N o t i c e t h a t i n LISP, f u n c t i o n c a l l s are w r i t t e n i n p r e f i x normal form. The name of the f u n c t i o n and i t s arguments are always enclosed i n b r a c k e t s , and f u n c t i o n c a l l s can be nested. The n o t a t i o n 'R i s e q u i v a l e n t to (QUOTE R). (The f u n c t i o n QUOTE r e t u r n s as i t s value the argument which i t was passed without e v a l u a t i n g i t . ) thus, i f one set N egual to 5, the value of N would be 5, whereas the value of (QUOTE N) would be N. T h i s data s t r u c t u r e a l l o w s both quick r e t r i e v a l of the t u p l e s , and ease i n p r o c e s s i n g the r e l a t i o n s t u p l e - w i s e . T h i s i s c r u c i a l , s i n c e a l l o p e r a t i o n s i n the r e l a t i o n a l a l g e b r a access the r e l a t i o n by t u p l e , r a t h e r than by domain. 20 \u00E2\u0080\u00A2 Example 4.2 \u00E2\u0080\u00A2 The f o l l o w i n g commands i l l u s t r a t e the syntax of each o p e r a t i o n i n the r e l a t i o n a l a l g e b r a : (RINTERSECT RL1ST) (RUNION RLIST) (RCROSS RLIST) (RDIFF 'REL1 \u00E2\u0080\u00A2REL2) (PROJECT 'REL1 'LIST) (JOIN \u00E2\u0080\u00A2 REL1 \u00C2\u00BBREL2 THETA) (DIVIDE 'REL1 'LIST * REL2 \u00C2\u00BBLIST) (RESTRICT 'REL1 THETA) , THETA i s an a r b i t r a r y LISP p r e d i c a t e , and DLIST i s a l i s t of domain numbers of the preceeding r e l a t i o n , and RLIST i s a l i s t of r e l a t i o n names. In order t h a t s p e c i f i c elements of t u p l e s can be r e f e r e n c e d w i t h i n the p r e d i c a t e s , the user can always assume t h a t the v a r i a b l e T1 p o i n t s to the c u r r e n t t u p l e i n REL 1, and T2 p o i n t s to the c u r r e n t t u p l e i n REL2. In order to express a p r e d i c a t e which says \"the second domain of REL1 must be egual to ten times the t h i r d domain of REL2\", one would w r i t e : (EQ (ELEM T1 2) (TIMES (ELEM T2 3) 10)) . Appendix 3 c o n t a i n s a number of sample g u e r i e s i n the r e l a t i o n a l a l g e b r a , and shows the output they produce. For a complete l i s t i n g of the r o u t i n e s which d e f i n e these o p e r a t o r s , p l e a s e see Appendix 7. 21 D 2^5^ The R e l a t i o n a l Algebra as a fiuerjj Language n Despite the f a c t t h a t most c u r r e n t r e l a t i o n a l systems use the r e l a t i o n a l a l g e b r a as t h e i r top l e v e l query language, i t i s c l e a r l y u n s u i t a b l e f o r g e n e r a l use. The user i s r e q u i r e d to generate the c o r r e c t sequence of o p e r a t i o n s which w i l l r e t r i e v e the d e s i r e d data. Queries are expressed i n terms of \"how to r e t r i e v e the data\", r a t h e r than i n terms of what i s wanted. Operations such as d i v i s i o n are a l s o c o u n t e r - i n t u i t i v e , and the average user would f i n d i t d i f f i c u l t , i f not i m p o s s i b l e , to master t h e i r use. 22 I l l s . RELATIONAL CALCULUS D 3 _ . I n t r o d u c t i o n n In an attempt to provide a more reasonable query language f o r r e l a t i o n a l systems, Codd[11 ] i n t r o d u c e d a r e l a t i o n a l c a l c u l u s . T h i s query language i s not intended to be used d i r e c t l y by us e r s , but to be used as the b a s i s f o r higher l e v e l guery languages. Queries i n the r e l a t i o n a l c a l c u l u s are expressed using a p r e d i c a t e c a l c u l u s n o t a t i o n . They tend t o be more \"prope r t y d e f i n i n g \" than the g u e r i e s i n the r e l a t i o n a l a l g e b r a . T h i s chapter p r e s e n t s a r e l a t i o n a l c a l c u l u s based on t h a t of Codd[11] and Palermo[ 26 ]. I t a l s o d e s c r i b e s a r e d u c t i o n a l g o r i t h m which takes a guery i n the r e l a t i o n a l c a l c u l u s and reduces i t to a s e m a n t i c a l l y e q u i v a l e n t sequence of o p e r a t i o n s i n the r e l a t i o n a l a l q e b r a . I t concludes with a c r i t i c a l e v a l u a t i o n of the u s e f u l n e s s of the r e l a t i o n a l c a l c u l u s as a framework f o r r e l a t i o n a l systems. 23 n 3_. 2^ The R e l a t i o n a l C a l c u l u s n 3ils.A The Alphabet The f o l l o w i n g n o t a t i o n i s adopted: Tuple v a r i a b l e s r1,r2, ... Range P r e d i c a t e s P1,P2, ... I n d i v i d u a l Constants a,b, ... Index c o n s t a n t s 1,2, ... 3.2.B Terms There are two types of terms i n the r e l a t i o n a l c a l c u l u s \u00C2\u00A3512\u00C2\u00AE terms, and j o i n terms. Range terms are used to i d e n t i f y the range of each t u p l e v a r i a b l e i n the query. For each r e l a t i o n R i , there e x i s t s a corre s p o n d i n g monadic p r e d i c a t e P i which determines whether or not any t u p l e r i n the data base i s an element of R i . [Thus, P can t e l l i f a t u g l e i s i n a r e l a t i o n R, whereas R can t e l l i f n elements (where R i s of degree n) are i n R.] \u00E2\u0080\u00A2 D e f i n i t i o n \u00C2\u00AB A range term i s a monadic p r e d i c a t e f o l l o w e d by a t u p l e v a r i a b l e . \u00E2\u0080\u00A2 Example 2.1 \u00E2\u0080\u00A2 P3r1 i s a range term. J o i n terms i n the r e l a t i o n a l c a l c u l u s are used to determine how r e l a t i o n s i n the data base are to be j o i n e d . They are 24 a r b i t r a r y f u n c t i o n s , whose purpose i s to show how the t u p l e s which are t h e i r arguments are to be r e l a t e d . \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 An indexed t u p l e i s an e x p r e s s i o n of the form r [ N ] , where r i s a t u p l e v a r i a b l e , and N i s an index constant. I t s purpose i s to i d e n t i f y the Nth element of r . \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 L e t a,b be indexed t u p l e s , and c be a con s t a n t . Then i f e i s a p r e d i c a t e whose only elements are e i t h e r a and c, or a and b, then 6 i s a j o i n term. \u00E2\u0080\u00A2 Example 2.2 \u00E2\u0080\u00A2 Rl[1]=r3\u00C2\u00A32] and (r1[1]*7) = 26 are both j o i n terms, whereas r1[ 1 ]=r3[2]=r5[7] and P1r1 are not. 3.2.C WFFs The w e l l formed formulae (WFFs) of the r e l a t i o n a l c a l c u l u s are d e f i n e d as f o l l o w s . 1. Any term i s a WFF, 2. I f f i s a WFF, then so i s -\u00C2\u00ABf. 3. I f |1 and ?2 are WFFs, so are (ipi & $2) and (y\ v f2) . 4. I f f i s a WFF i n which r occurs as a f r e e v a r i a b l e , then i r (?) and \u00C2\u00A5r(ip) are WFFs. 5. No other formulae are WFFs. The WFFs of the r e l a t i o n a l c a l c u l u s are not s u i t a b l e as 25 q u e r i e s i n the r e l a t i o n a l c a l c u l u s s i n c e they allow the f o r m a t i o n of meaningless q u e r i e s . Furthermore, q u a n t i f i e d e x p r e s s i o n s can be w r i t t e n i n a more meaningful way than i s allowed by the HFFs. 3.2.D Range Formulae Range formulae attempt to l i m i t the ranges of t u p l e v a r i a b l e s to w e l l d e f i n e d r e l a t i o n s . H h i l e doing t h i s , they must s t i l l allow the ranges t o be s p e c i f i e d i n some n a t u r a l manner. T h i s n o t i o n was i n t r o d u c e d by Palermo[26], and i s an improvement over the o r i g i n a l f o r m u l a t i o n by Codd who d i d not allow j o i n terms i n the formulae. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 9 i s a range formula over r i f : 1. f i s a g u a n t i f i e r f r e e WFF. 2. r i s the o n l y t u p l e v a r i a b l e i n f l . 3. i s i n d i s j u n c t i v e normal form ( d n f ) , and each conjunct c o n t a i n s a t l e a s t one non-negated range term. 4. The r e l a t i o n s d e f i n e d by each range term i n If have the same number of domains. \u00E2\u0080\u00A2 Example 2.3 \u00E2\u0080\u00A2 P1r1 r1 comes from R1 Pl r 1 I P2r2 r1 i s i n R1 and i s a l s o i n R2 Pl r 1 S r1[2]=5 r1 comes from R1, and the value of i t s second domain i s 5. In the d e f i n i t i o n of range formulae as presented by 26 Palermo, r e s t r i c t i o n t h r e e above i s not present. However, without i t , formulae such as: (P3r3 v r3[ 2]=1) , which c l e a r l y do not s p e c i f y a v a l i d range f o r r 3 , are a c c e p t a b l e . R e s t r i c t i o n three d i s a l l o w s formulae of t h i s type by s p e c i f y i n g t h a t when i n dnf, each conjunct must have a t l e a s t one non-negated range term. Formulae such as: P3r3 v (P4r3 8 r3[2]=1) , and P3r3 6 (P4r3 v r3[2]=1) which are both v a l i d range formulae are s t i l l a c c e p t a b l e using t h i s new d e f i n i t i o n . 2.S.2..E Pure Range Formulae A range formula which c o n s i s t s only of range terms i s known as a pure range formula. 3._2AF Range Coupled Q u a n t i f i e r s a D e f i n i t i o n a -jip and ?ro are c a l l e d range coupled q u a n t i f i e r s , and are d e f i n e d by the e q u a t i o n s : i\u00C2\u00A5(4>) = i r ( U 8 4>) \u00C2\u00A5ip {$) = \u00C2\u00A5r (-iip v ) Assume $ i s a WFF having r as a f r e e v a r i a b l e , and f i s a range formula over r . Then (<}>) and ?f (4>) are a l s o WFFs. 27 3.2.G g-formulae We now d e f i n e the formulae which can be used i n q u e r i e s i n the r e l a t i o n a l c a l c u l u s , a D e f i n i t i o n \u00E2\u0080\u00A2 A WFF $> i n the r e l a t i o n a l c a l c u l u s i s a Q-formula i f i t i s a c o n j u n c t i o n of the form: <|> = 01 6 02 6 ... S Op S W , where 1. Each 01 i s a range formula over r i , i=1,2, ... p. 2. W i s e i t h e r n u l l , or i s a WFF i n prenex normal form, with f r e e v a r i a b l e s v1,v2, ... ,vp, and bound v a r i a b l e s Vp+1,Vp+2, ... ,Vp+q. 3. The matrix of W i s i n d i s j u n c t i v e normal form (dnf). 4. There are no -\u00C2\u00BB symbols immediately preceding a j o i n term. 5. Every v a r i a b l e i s coupled to a range: i . I f a v a r i a b l e i s f r e e , i t belongs to the set of v a r i a b l e s whose ranges are s p e c i f i e d by u1,u2, ... ,up. i i . Every g u a n t i f i e r i n W i s range coupled. T h i s i m p l i e s that any bound v a r i a b l e a l s o has i t s range s p e c i f i e d . 6. The matrix of W i s devoid of range terms. 7. p>1. These Q-formulae correspond somewhat to the range separable WFFs of Codd[11] and to the C-formulae of Palermo[26], Examples of Q-formulae can be found i n Appendix 4. 28 We are now i n a p o s i t i o n to d e f i n e the q u e r i e s of the r e l a t i o n a l c a l c u l u s . These q u e r i e s , which w i l l be r e f e r r e d to as 2~ exp revisions, are s i m i l a r to the Simple Alpha-expressions of Codd[11], and the Gamma-expressions of Palermo[26]. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 A Q-expression has the form: ( t l t2 ... tn) : Q, where: 1. Q i s a Q-formula of the r e l a t i o n a l c a l c u l u s . 2. The set of t u p l e v a r i a b l e s o c c u r r i n g i n t 1 , t 2 , ... ,tn i s p r e c i s e l y the s e t of f r e e v a r i a b l e s i n Q. \u00E2\u0080\u00A2 3^3^ A s e d u c t i o n Algorithm o T h i s s e c t i o n shows how a guery i n the r e l a t i o n a l c a l c u l u s can be reduced to a s e m a n t i c a l l y e g u i v a l e n t seguence of o p e r a t i o n s i n the r e l a t i o n a l a l g e b r a . The method used i s based upon the r e d u c t i o n algorithm of Palermo[26], The r e d u c t i o n a l g o r i t h m does not a c t u a l l y generate a guery i n the r e l a t i o n a l a l g e b r a . Instead, i t works i t s way through the query, c a l l i n g upon the o p e r a t i o n s of the r e l a t i o n a l a l g e b r a to produce new r e l a t i o n s which are necessary f o r the c o n s t r u c t i o n of the guery's response r e l a t i o n . The r e d u c t i o n a l g o r i t h m begins by c r e a t i n g the r e l a t i o n s 29 which form the range of each t u p l e v a r i a b l e . V a r i a t i o n s of these r e l a t i o n s are then j o i n e d , a c c o r d i n g to the j o i n terms i n W. A f t e r the a p p r o p r i a t e unions and i n t e r s e c t i o n s of the remaining r e l a t i o n s have been made, the new r e l a t i o n i s r e p e a t e d l y d i v i d e d or p r o j e c t e d i n order to take the q u a n t i f i e r s i n t o account. T h i s r e l a t i o n i s then p r o j e c t e d on the domains of the t a r g e t l i s t , r e s u l t i n g i n the response r e l a t i o n . A l l o p e r a t i o n s of the r e l a t i o n a l a l g e b r a are used i n t h i s process. The r e d u c t i o n a l g o r i t h m operates i n such a way as to minimize the amount of necessary c o r e . 3.3.A G l o b a l and L o c a l Ranges In a Q-formula, W has the form: Q(P +1) Q(P+2) ... Q(P+q) [61 v 62 ... v 6k] , where Q(i) i s a range coupled q u a n t i f i e r , and 0 i i s a c o n j u n c t i o n of j o i n terms. Let (ik) be the subformula of 6 i c o n s i s t i n g of terms whose only v a r i a b l e i s r ( k ) , and l e t : f(k) = The range formula Uk, i f r (k) i s a f r e e v a r i a b l e . = The range formula f o r the q u a n t i f i e r which binds r (k) otherwise. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 The l o c a l range L ( k i ) f o r r (k) i n (pi i s d e f i n e d by the formula: L ( k i ) = { (r) : 9 (k) S $ (ik) } . The g l o b a l range G (k) f o r r(k) i s de f i n e d by: 30 G(k) = { (r) : ip(k) } . The l o c a l ranges of a v a r i a b l e are simply r e s t r i c t i o n s of i t s g l o b a l range. When reducing a query, no r e l a t i o n need ever c o n t a i n a domain which i s not e x p l i c i t l y r e f e r e n c e d i n the query[26]. Thus, we d e f i n e the reduced l o c a l J g l o b a l j ^ range f o r a v a r i a b l e to be the p r o j e c t i o n of i t s l o c a l (global) range on a l l i t s r e f e r e n c e d domains. I t i s with these r e l a t i o n s t h a t the r e d u c t i o n a l g o r i t h m d e a l s . liJxS The J o i n Algorithm In h i s r e d u c t i o n a l g o r i t h m , Codd[11] begins by t a k i n g the c r o s s product of the g l o b a l ranges of a l l the v a r i a b l e s used i n the query. T h i s c r o s s product i s then r e s t r i c t e d to the cases d e f i n e d by the j o i n terms of e, and the r e s u l t processed a c c o r d i n g to the q u a n t i f i e r s . Needless to say, the s i z e of t h i s r e l a t i o n can become unbearably l a r g e . As Palermo[26] observed, the forming of the c r o s s product i s unnecessary. The i n d i v i d u a l l o c a l ranges can i n s t e a d be j o i n e d a c c o r d i n g to the terms of e i , producing the r e l a t i o n s C i . The subset of S d e f i n e d by 9 can now be produced by t a k i n g the union of each C i which was produced. T h i s method r e s u l t s i n c o n s i d e r a b l e savings of time and space. I t i s the f u n c t i o n of the j o i n a l g o r i t h m to produce the r e l a t i o n s C i d e f i n e d by t h e i r c o r r e s p o n d i n g e i . The a l g o r i t h m 31 assumes t h a t the l o c a l range f o r each v a r i a b l e i n \u00E2\u0082\u00AC i has been c r e a t e d . The j o i n a l g o r i t h m proceeds as f o l l o w s . STEP I... a l i s t of the reduced l o c a l ranges used i n e i i s c r e a t e d . T h i s l i s t i s ordered, with the s m a l l e s t r e l a t i o n coming f i r s t . STEP 2.. The f i r s t reduced range i s placed i n a workspace, c a l l e d the c o r e , and removed from the l i s t . STEP 2\u00C2\u00B1 A l i s t of a l l terms i n e i which r e f e r e n c e a domain i n the core i s c r e a t e d , s i n c e these terms can be used to j o i n the core with some new range from the l i s t . STEP j*^ The range which i n v o l v e s the s m a l l e s t r e l a t i o n i s chosen. STEP 5.. The core i s then j o i n e d to t h i s range, using the term from e i as the j o i n p r e d i c a t e . STEP 6\u00C2\u00B1 T h i s term i s removed from the l i s t , and p r o c e s s i n g c o n t i n u e s with step 2. T h i s process i s repeated u n t i l e i t h e r : i . The range l i s t i s not yet empty, i n d i c a t i n g more j o i n i n g i s t o be done, yet there are no more j o i n terms connected to the c o r e . In t h i s case, save the c u r r e n t core, and go t o step 1. i i . The range l i s t i s empty, i n which case one can r e t u r n , p r o v i d i n g no cores have been saved. I f cores have been saved, then form the c r o s s product of the c o r e s and r e t u r n . 32 3_._3._C The Reduction Algorithm The f i r s t step i n the r e d u c t i o n a l g o r i t h m i s to c r e a t e the r e l a t i o n d e f i n e d by 6. Since 9 i s i n dnf, t h i s can be done by t a k i n g the union of the r e l a t i o n s C i d e f i n e d by each e i i n e. Each r e l a t i o n C i i s c r e a t e d using the j o i n a l g o r i t h m . STEP X\u00C2\u00B1 Form the reduced g l o b a l range f o r each v a r i a b l e i n the query. T h i s i s done by examining the range formula of the v a r i a b l e . STEP 2j. Form the r e l a t i o n s C i which are d e f i n e d by e i . In order to do t h i s , f i r s t form the reduced l o c a l range f o r each v a r i a b l e used i n e i , then u t i l i z e the j o i n a l g o r i t h m with the terms of e i to produce C i . STEP 3., Form the union of a l l C i , producing the r e l a t i o n Tp+q. Once the r e l a t i o n Tp+q has been d e r i v e d , the next step i s to take the e f f e c t of the q u a n t i f i e r s i n t o account. Q u a n t i f i e r s are processed from r i g h t to l e f t - i . e . , from Q (p+q) to Q(p). T h e i r e f f e c t i s : 1. I f Q(j) i s an e x i s t e n t i a l q u a n t i f i e r , p r o j e c t the r e l a t i o n Tp+j on a l l domains except those processed by the r e l a t i o n which d e f i n e s the range of r ( j ) . 2. I f Q(j) i s a u n i v e r s a l q u a n t i f i e r , d i v i d e the r e l a t i o n Tp+j by the r e l a t i o n which d e f i n e s the range of r ( j ) . T h i s r e s u l t s i n a r e l a t i o n whose t u p l e s are i n some sense \" t r u e \" f o r a l l r (j) . The r e s u l t of each of the above o p e r a t i o n s i s the r e l a t i o n 33 Tp+j- 1. STEP 4.. The q u a n t i f i e r o p e r a t i o n s of p r o j e c t i o n ( e x i s t e n t i a l ) and d i v i s i o n ( u n i v e r s a l ) are a p p l i e d f o r each q u a n t i f i e r Q(i) i n the p r e f i x of W, with the q u a n t i f i e r s being processed from r i g h t to l e f t . T h i s produces the r e l a t i o n Tp. STEP 5,j_ P r o j e c t the r e l a t i o n Tp on each of the domains s p e c i f i e d i n the t a r g e t l i s t . The r e s u l t i s the response r e l a t i o n f o r the guery. Please r e f e r t o Appendix 4 f o r sample q u e r i e s i n the r e l a t i o n a l c a l c u l u s . n 3_. 4_. Implementation of the Reduction Algorithm n The r e d u c t i o n a l g o r i t h m as d e s c r i b e d above has been implemented i n LISP. Appendix 4, which shows sample r e d u c t i o n s , was c r e a t e d using these r o u t i n e s . The purpose of t h i s s e c t i o n i s to show why some r e s t r i c t i o n s have been placed on the r e l a t i o n a l c a l c u l u s f o r the b e n e f i t of the implementation and to show how the gu e r i e s are repr e s e n t e d i n LISP. 34 l i l i i A S\u00C2\u00A3\u00C2\u00A7i\u00C2\u00A3i\u00C2\u00A3ii2\u00C2\u00A3\u00C2\u00A7 22 ill\u00C2\u00AE R e l a t i o n a l C a l c u l u s There are two r e s t r i c t i o n s which have been made i n the r e l a t i o n a l c a l c u l u s f o r the b e n e f i t of the implementation. The f i r s t of these i s t h a t i n a Q-formula, the matrix of W must be devoid of range terms. T h i s means t h a t when c r e a t i n g the g l o b a l range f o r a v a r i a b l e , the matrix of W need not be examined. The r e s u l t of t h i s r e s t r i c t i o n i s t h a t the range of each f r e e v a r i a b l e must be d e c l a r e d i n some Oi r a t h e r than i n W. Secondly, the range formulae must be i n dnf. Without t h i s r e s t r i c t i o n , the formulae can become extremely complex, and i n d i v i d u a l elements of the formula cannot be processed independently o f the other elements i n the formula. When i n dnf, each c o n j u n c t i o n w i l l d e f i n e a v a l i d r e l a t i o n , and thus ranges can be determined by t a k i n g the union of the r e l a t i o n s d e f i n e d by each c o n j u n c t i o n . \u00E2\u0080\u00A2 Example 4.1 \u00E2\u0080\u00A2 P7r1 8 [P4r1 v (EQ (R1 1) A) v P5r1 v (EQ (R1 2) B ) ] . Notice that i n t h i s example, the second element of the c o n j u n c t i o n does not i n i t s e l f d e f i n e a v a l i d r e l a t i o n , and thus cannot be processed independently of the f i r s t element of the c o n j u n c t i o n . When t h i s e x p r e s s i o n i s i n dnf however, each conjunct d e f i n e s a v a l i d r e l a t i o n . 35 1-s.iJiS R epresentation of the Queries Each guery i s assigned a unigue name, and has the f o l l o w i n g i n f o r m a t i o n on i t s property l i s t : 1. VARS - a l i s t of a l l the v a r i a b l e s i n the guery, i n the re v e r s e order of t h e i r appearance. 2. THETA - a l i s t of a l l e i o c c u r r i n g i n the guery. For example, (THETA1 THETA2) . 3. TARGET - the t a r g e t l i s t f o r the guery. For example, ((R1 4) (R2 3) ) Each element i n the l i s t THETA has the f l a g TERMS on i t s p r o p e r t y l i s t . The value of t h i s f l a g i s a l i s t of the names of the terms o c c u r r i n g i n the p a r t i c u l a r e i . Each term has i t s d e f i n i t i o n as i t s value. For example, the value of TERM 1 might be (EQ (R1 2) 970). On the property l i s t o f the term, under the f l a g VARS, i s a l i s t of a l l the v a r i a b l e s i n the term. In the above case, t h i s l i s t would be (B1). Each v a r i a b l e has on i t s property l i s t the f l a g s : 1. QUANTIFIER - e i t h e r ALL, i f the v a r i a b l e i s u n i v e r s a l l y g u a n t i f i e d , EXISTS, i f the v a r i a b l e i s e x i s t e n t i a l l y q u a n t i f i e d , or NONE. 2. USED-IN-TERMS - a l i s t of a l l terms i n which the v a r i a b l e i s used. 3. REFDOMAINS - a l i s t of a l l domains of the g l o b a l range of the v a r i a b l e which are r e f e r e n c e d anywhere i n the guery. 4. RANGE - e i t h e r the name of the reduced g l o b a l range f o r the v a r i a b l e , or an e x p r e s s i o n which d e f i n e s the g l o b a l range. 36 The value of the v a r i a b l e i s i t s c u r r e n t reduced l o c a l range. In w r i t i n g q u e r i e s , j o i n terms are expressed as a r b i t r a r y LISP p r e d i c a t e s , with the l i s t (R1 1) being used to r e p r e s e n t the f i r s t domain of t u p l e R1. The above r e p r e s e n t a t i o n completely c h a r a c t e r i z e s the query. The r e d u c t i o n a l g o r i t h m needs no a d d i t i o n a l i n f o r m a t i o n i n order to respond t o a query i n the r e l a t i o n a l c a l c u l u s . \u00E2\u0080\u00A2 2 L 5 E v a l u a t i o n of the R e l a t i o n a l C a l c u l u s n I t i s the p o s i t i o n of t h i s t h e s i s t h a t as a top l e v e l query language, the r e l a t i o n a l c a l c u l u s i s inadequate. F u r t h e r , i f the r e l a t i o n a l c a l c u l u s i s to be used as the t a r g e t language f o r a higher l e v e l query language, t h e r e are a number of problems which must be overcome. The purpose of t h i s s e c t i o n i s to e v a l u a t e the f e a s i b i l i t y of the r e l a t i o n a l c a l c u l u s as a framework f o r r e l a t i o n a l systems. Consider the r e l a t i o n a l c a l c u l u s as a t o p - l e v e l query language. Since the q u e r i e s are very much t u p l e o r i e n t e d , i t i s o f t e n d i f f i c u l t t o express i n f o r m a t i o n about domains. In the r e l a t i o n a l c a l c u l u s i t i s p o s s i b l e to say \" f o r a l l t u p l e s i n r e l a t i o n X\", something i s t r u e , but d i f f i c u l t t o say \" f o r a l l p a r t s \" something i s t r u e . T h i s can only be done with a s i n g l e q u a n t i f i e r i f the r e l a t i o n s which enumerate the p a r t s have 37 e x a c t l y the same number of domains. Even then i t i s p o s s i b l e o n l y i f the p a r t occurs i n the same p o s i t i o n i n each r e l a t i o n . I f t h i s i s not t r u e , s e v e r a l q u a n t i f i e r s w i l l have to be used to q u a n t i f y one e n t i t y . T h i s i s indeed u n d e s i r a b l e . Forming q u e r i e s i n the r e l a t i o n a l c a l c u l u s tends to be a d i f f i c u l t process. \u00E2\u0080\u00A2 Example 5.1 \u00E2\u0080\u00A2 assume the e x i s t e n c e of the f o l l o w i n g three r e l a t i o n s : R1 (SUPPLIER* SNAME SLOCATION) R2 (PROJECT* PART-NAME) R3 (SUPPLIER* PART* PROJECT*) Then the guery \" F i n d the numbers of the s u p p l i e r s , each of whom s u p p l i e s a l l p a r t s \" i s rep r e s e n t e d as: R1[1] : P3r1 S VP3r2 -}P3r3 (r1[ 2 ]=r3[ 1 ] S r2[ 2 ]=r3[ 2 ]) Despite the f a c t t h a t t h i s i s an a p p l i e d p r e d i c a t e c a l c u l u s , i t i s s t i l l not p o s s i b l e f o r a user to s t a t e the p r o p e r t i e s of the data he wants r e t r i e v e d . There are s e v e r a l reasons why the r e l a t i o n a l c a l c u l u s i s so d i f f i c u l t to use. F i r s t l y , i t i s very data base o r i e n t e d . A user must have a thorough knowledge of the c u r r e n t o r g a n i z a t i o n of h i s data base, s i n c e i n s t a t i n g the query, the r e l a t i o n s which are to be used must be e x p l i c i t l y i d e n t i f i e d . T h i s i s a very s e r i o u s flaw, s i n c e a prime advantage of r e l a t i o n a l systems i s the independence they present between the system and the c u r r e n t o r g a n i z a t i o n of the data. In h i s o r i g i n a l paper, Codd[8] s t a t e s \"users at t e r m i n a l s ... should remain u n a f f e c t e d when the i n t e r n a l r e p r e s e n t a t i o n of the data 38 i s changed, and even when some aspects of the e x t e r n a l r e p r e s e n t a t i o n are changed\". When the r e l a t i o n a l c a l c u l u s i s used as a top l e v e l query language, t h i s b a s i c p r i n c i p l e i s v i o l a t e d . Not only does the f a c t t h a t the qu e r i e s d i r e c t l y r e f e r e n c e the data base make them hard to w r i t e , but i t means t h a t the same query w i l l have to be expressed d i f f e r e n t l y f o r a d i f f e r e n t o r g a n i z a t i o n of the data base. Secondly, not only do the q u e r i e s t e l l the system what i n f o r m a t i o n t o use, they a l s o t e l l the system what to do with i t i n order to produce the response r e l a t i o n . Example: Assume the e x i s t a n c e of the t h r e e r e l a t i o n s i n example 5.1. Then the query \" F i n d the names of the s u p p l i e r s , each of whom s u p p l i e s a l l p r o j e c t s \" i s represented as: R1[2] : P1r1 S \u00C2\u00A5P2r2iP3r3 ((r1[ 1 ]=r3[ 1 ] S (r2[ 1 ]=r3[3 ]) ) Not only does t h i s query t e l l the system which r e l a t i o n s to use, but i t a l s o says what should be done with them. Namely, take a t u p l e r1 i n R1. Then f o r each t u p l e r2 i n R2, there must be some t u p l e r3 i n R3 such t h a t the f i r s t domain of r1 i s equal to the f i r s t domain of r 3 , and the f i r s t domain of r2 equals the t h i r d domain of r3. I f t h i s i s t r u e , save the second domain of r 1 . Now take the next t u p l e i n R1, and t r y a g a i n . In order to formulate the above query, the user must know how the data base i s arranged, which pieces of i t he i s 39 i n t e r e s t e d i n , e x a c t l y how t h i s i n f o r m a t i o n should be r e l a t e d , and how the system can recover h i s data. C l e a r l y , any system which f o r c e s the user to decide how h i s data should be s e l e c t e d before he can even formulate h i s guery i s u n s a t i s f a c t o r y . Such a system i s performing only the mechanical p a r t of the r e t r i e v a l process, while f o r c i n g the user to do the r e a l work. Now c o n s i d e r the r e l a t i o n a l c a l c u l u s as the t a r g e t language f o r a higher l e v e l guery language Q. Then Q should possess the f o l l o w i n g p r o p e r t i e s : 1. Queries i n Q should be e x p r e s s i o n s of p r o p e r t i e s of what i s to be r e t r i e v e d . 2. Queries should never r e f e r e n c e r e l a t i o n s , but i n s t e a d , r e f e r e n c e domains, thus, Q w i l l be domain o r i e n t e d , whereas the r e l a t i o n a l c a l c u l u s i s t u p l e o r i e n t e d . 3. Q u a n t i f i e r s i n Q should not be dependent upon the data base, as the r e l a t i o n a l c a l c u l u s has been e x h i b i t e d to be. 4. Queries should not have to c o n t a i n any i n f o r m a t i o n about how the guery i s to be answered. In order to overcome the r e s t r i c t i o n s the r e l a t i o n a l c a l c u l u s imposes, the i n t e r f a c e between Q and the r e l a t i o n a l c a l c u l u s must be capable of t a k i n g an a r b i t r a r y guery expressed i n some property d e f i n i n g form, i d e n t i f y i n g which r e l a t i o n s are a p p l i c a b l e to the reguest, determining how they should be j o i n e d , and how the r e t r i e v a l i s to be done. No previous system i s capable of doing these. no IV\u00C2\u00B1 A QUERY LANGUAGE FOR RELATIONAL SYSTEMS T h i s s e c t i o n p r e s e n t s a new query language f o r r e l a t i o n a l systems. The language i s an a p p l i e d p r e d i c a t e c a l c u l u s which r e g u i r e s users to s p e c i f y only the p r o p e r t i e s of the data they want r e t r i e v e d . S p e c i f i c r e l a t i o n s are never d i r e c t l y r e f e r e n c e d i n the guery. Thus, the language i s not data base dependent, and g u e r i e s are expressed i n the same f a s h i o n r e g a r d l e s s of the c u r r e n t o r g a n i z a t i o n of the r e l a t i o n s . The language has been s p e c i f i c a l l y designed as a t a r g e t language f o r a n a t u r a l language system. In f a c t , a system[16] which compiles g u e r i e s i n t o a s i m i l a r r e p r e s e n t a t i o n , but which uses a d i f f e r e n t problem domain, has alre a d y been implemented. T h i s chapter begins with a formal d e f i n i t i o n of the query language. T h i s i s f o l l o w e d by a somewhat more i n t u i t i v e e x p l a n a t i o n of the language, and concludes with a d i s c u s s i o n of i t s use as the t a r g e t language f o r a n a t u r a l language system. D 4^1.^ The R e l a t i o n a l C a l c u l u s Re-Defined n U n l i k e the r e l a t i o n a l c a l c u l u s of Codd, q u e r i e s i n the new r e l a t i o n a l c a l c u l u s r e f e r e n c e domains r a t h e r than t u p l e s . T h i s makes the qu e r i e s data base independent, s i n c e s p e c i f i c r e l a t i o n s need never be r e f e r e n c e d . 41 4. _UA The Alphabet The f o l l o w i n g n o t a t i o n i s adopted. Domain v a r i a b l e s d1,d2, ... D i a d i c P r e d i c a t e s r 1 , r 2 , ... A r b i t r a r y P r e d i c a t e F \u00E2\u0080\u00A2 Example 1.1 \u00E2\u0080\u00A2 PAST# and SUPPLIER are both domain v a r i a b l e s . I f SUPPLIES i s a p r e d i c a t e which says yes or no to \"SUPPLIER SUPPLIES PART*\" f o r s p e c i f i c v alues of SUPPLIER and PART*, then SUPPLIES i s a d i a d i c p r e d i c a t e . 4.1.B Terms There are four types of terms i n the r e l a t i o n a l c a l c u l u s -simple terms, r e l a t i o n a l terms, r e s t r i c t i o n terms, and j o i n terms. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 & simple term i s a domain v a r i a b l e . a r e l a t i o n a l term i s a l i s t of the form: (d1 r1 d2 denotes an o p t i o n a l occurrence of X. \u00E2\u0080\u00A2 Example 1.2 \u00E2\u0080\u00A2 PART* and SUPPLIER are simple terms, whereas: (SUPPLIER SUPPLIES PART*) i s a r e l a t i o n a l term. 42 \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 A Instruction term i s a term of the form F ( d ) , where F r e p r e s e n t s an a r b i t r a r y monadic p r e d i c a t e . A j\u00C2\u00B0i\u00C2\u00A3 1 S a term of the form F(d1,d2), where F r e p r e s e n t s an a r b i t r a r y d i a d i c p r e d i c a t e . \u00E2\u0080\u00A2 Example 1.3 \u00E2\u0080\u00A2 (EQ PART* 10) i s a r e s t r i c t i o n term, whereas: (EQ (TIMES PRICE 10) (PLUS PART-PRICE 3)) i s a j o i n term. \u00E2\u0080\u00A2 D e f i n i t i o n a Two terms are s a i d to be compatible i f they c o n t a i n a common domain v a r i a b l e . \u00E2\u0080\u00A2 Example 1.4 \u00E2\u0080\u00A2 (S# SUPPLIES PART*) and (PROJECT IS-IN PLOC) are not compatible, whereas (S* SUPPLIES PART*) and (PART* IS-USED-IN PROJECT*) ar e . i U l i C WFFs The w e l l formed formulae (WFFs) of the r e l a t i o n a l c a l c u l u s are d e f i n e d as f o l l o w s : 1. Any term i s a WFF. 2. I f V i s a WFF, then so i s 3. I f JJ1 and |J2 a r e WFFs, so are (ipi S I|f2) and (If 1 v ip2). 4. I f f i s a WFF i n which r occurs as a f r e e v a r i a b l e , then -gr (V) and \u00C2\u00A5r (JJ) are WFFs. 43 5. No other formulae are WFFs. _U D 2zExpressions D e f i n i t i o n \u00E2\u0080\u00A2 A WFF Q i n the r e l a t i o n a l c a l c u l u s i s a Q-expressicn i f : 1. Q c o n t a i n s no q u a n t i f i e r s , 2. Q c o n t a i n s no simple terms which are negated. 3. Q i s i n dnf, with each c o n j u n c t i o n being of the form ($ & 4>) i where: a. % c o n t a i n s only simple and r e l a t i o n a l terms, and (SNAME IS-IN SLOC) 6 (EQ SLOC 'VANCOUVER) ) (SS SUPPLIED PARTS) v (PARTS IS-USED-IN PROJECTS) 44 \u00E2\u0080\u00A2 Example 1.6 \u00E2\u0080\u00A2 None of the f o l l o w i n g are Q-expressions: -\u00E2\u0080\u00A2SNAME (EQ PART* 10) SNAME S (PART* IS-USED-IN PROJECT*) --SNAME S -i (SNAME SUPPLIES PART*) (SNAME SUPPLIES PART*) V (EQ PART* 10) 4_,_1__E Range Formulae \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 f i s a range formula over domain d i f : 1. f i s a q u a n t i f i e r f r e e Q-expression. 2. ip c o n t a i n s at l e a s t one r e l a t i o n a l term which has d as a f r e e v a r i a b l e . 3. Each conjunct i n ip has d as a simple term. \u00E2\u0080\u00A2 Example 1.7 \u00E2\u0080\u00A2 (S# SUPPLIES PART*) S (EQ PART* 5) i s a range formula over S*. 4.1.F Range Coupled Q u a n t i f i e r s \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 Le t (i lie a WFF having d as a f r e e v a r i a b l e , and tf be a range formula over d. Then ^ and -vip are c a l l e d range coupled q u a n t i f i e r s over d, and are d e f i n e d by the equa t i o n s : i f (4>) = i d (| & ) 45 \u00C2\u00A5ip ($) = \u00C2\u00A5d v ) . i ? (<|>) and \u00C2\u00A5f() are a l s o WFF. \u00E2\u0080\u00A2 Example 1.8 \u00E2\u0080\u00A2 \u00C2\u00A5 (PARTS), and \u00C2\u00A5 (PARTS 6 (PARTS IS-SUPPLIED-BY SS) & (GREATERP SS 10)) are both range coupled q u a n t i f i e r s . 4.1.G Target L i s t \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 A t a r g e t l i s t T i s a seguence T=t1,t2, ... ,tk of domain v a r i a b l e s . 4_.JkH Queries We are now i n a p o s i t i o n to d e f i n e the g u e r i e s of the new r e l a t i o n a l c a l c u l u s . \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 A WFF i n the r e l a t i o n a l c a l c u l u s i s a query i f i t i s a WFF of the form: T : W , where 1. T i s a t a r g e t l i s t . 2. W i s a WFF i n prenex normal form. 3. A l l g u a n t i f i e r s i n W are range coupled. 4. The matrix of W i s a Q-expression. 5. There are no range coupled g u a n t i f i e r s over any element t i of T. 6. Each domain v a r i a b l e i n T i s a l s o i n each d i s j u n c t of W. 46 \u00E2\u0080\u00A2 Example 1.9 \u00E2\u0080\u00A2 The f o l l o w i n g are sample q u e r i e s i n the r e l a t i o n a l c a l c u l u s . 1. L i s t the names of the p a r t s t h a t s u p p l i e r number 1 s u p p l i e s . PART-NAME : (S# SUPPLIES PART-NAME) 8 (EQ S# 1) 2. L i s t the p r o j e c t s t h a t s u p p l i e r number 1 s u p p l i e s . PROJECT-NAME : (S# SUPPLIES-TO PROJECT-NAME) 8 (EQ S# 1) 3. Which p r o j e c t s use p a r t 5? PROJECT-NAME : (PROJECT-NAME USES PART*) 8 (EQ PART# 5) 4. Which s u p p l i e r s supply a l l s u p p l i e r s ? SNAME : (\u00C2\u00A5 PROJECT-NAME) (SNAME SUPPLIES PROJECT-NAME) 5. Which s u p p l i e r s have more than 10 u n i t s of p a r t 12? SNAME : (SNAME HAS QOH OF-TYPE PART*) 8 (GREATERP QOH 9) 8 (EQ PART* 12) 6. Which s u p p l i e r s supply a l l p a r t s that c o s t more than 5 d o l l a r s ? SNAME : (\u00C2\u00A5 PART# 8 (PART* COSTS PRICE) 8 (GREATERP PRICE 5)) 8 (SNAME SUPPLIES PART*) A more complete s e t of sample q u e r i e s and t h e i r responses can be found i n Appendix 4. 47 o 4^2^ E x p l a n a t i o n of the Queries \u00E2\u0080\u00A2 One tends to t h i n k of the data base i n terms of the domains i n v o l v e d (eg. SUPPLIERS and PARTS). Queries i n the new r e l a t i o n a l c a l c u l u s allow q u e r i e s t o be formulated i n terms of these domains, r a t h e r than i n terms of the r e l a t i o n s . The terms of the language r e f e r e n c e domains.. Simple terms such as PART* are domain names. R e l a t i o n a l terms such as (S# SUPPLIES PART*) e x h i b i t \u00C2\u00A3\u00C2\u00A7lationships among domains. R e s t r i c t i o n terms, such as (EQ PART* 10) r e s t r i c t the values of domains, and j o i n terms are used to i n d i c a t e when terms with d i f f e r e n t names are r e a l l y the same. For example, (EQ PARTA PARTB) i s a j o i n term. The terms i n the r e l a t i o n a l c a l c u l u s can be combined to produce meaningful g u e r i e s . Using these terms, one c o n s t r u c t s the WFFs of the r e l a t i o n a l c a l c u l u s . The WFFs, however, are f a r too powerful to be of use s i n c e they can be used to d e f i n e meaningless g u e r i e s . For example, the formulae of example 1.6 are WFFs, yet they do not d e f i n e meaningful g u e r i e s . The guery language must t h e r e f o r e be a r e s t r i c t e d subset of the WFFs. The r e s t r i c t i o n process i n v o l v e s three s t a g e s . F i r s t l y , the Q-expressions are d e f i n e d . These e x p r e s s i o n s w i l l e v e n t u a l l y form the body of the guery. They d e f i n e a c l a s s of WFFs which express a v a l i d guery (and thus d e f i n e a v a l i d r e l a t i o n ) and are i n a form which the new r e d u c t i o n a l g o r i t h m can p r o c e s s . Q-expressions may not c o n t a i n q u a n t i f i e r s . 48 Secondly, the range formulae are d e f i n e d . T h i s i s a s e t of WFFs which are only capable of d e f i n i n g a subset of the v a l u e s of a p a r t i c u l a r domain. Since the query language d e a l s with q u a n t i f i e r s which express domains or s p e c i f i c subsets t h e r e o f , t h i s s e t of WFFs had to be i s o l a t e d . A range coupled q u a n t i f i e r i s now d e f i n e d to be a WFF whose range i s a range formula. F i n a l l y , users must be able t o s p e c i f y the domains which they want r e t r i e v e d . T h i s they do through the t a r g e t l i s t . Having d e f i n e d the v a l i d q u a n t i f i e r s Q, the v a l i d query bodies B and t a r g e t l i s t s T, the q u e r i e s can be d e f i n e d to be a l i s t of the form: T : W where a l l the q u a n t i f i e r s i n W are of the form Q, the matrix of W i s a v a l i d body B, and each element of T i s i n W. Hs.2\u00C2\u00B1.h Comments on the R e l a t i o n a l C a l c u l u s I f the r e l a t i o n a l c a l c u l u s were to be used as a top l e v e l query language, then two changes would be d e s i r a b l e . F i r s t l y , i t would be n i c e to say: (SNAME SUPPLIES 10) , r a t h e r than (SNAME SUPPLIES PARTS) & (EQ PART# 10) T h i s would r e q u i r e the system to determine t h a t 10 was a PART* and not, say, a PROJECT*. In cases such as: 49 (WIDGETS ARE-USED-IN VANCOUVER) , the system must know th a t WIDGETS are pa r t names, and th a t VANCOUVER i s being used as the l o c a t i o n of a p r o j e c t and not as the l o c a t i o n of a s u p p l i e r . The semantic model to be proposed i n Chapter v would be of use f o r disambiguating i n f o r m a t i o n of t h i s t y pe. Secondly, one would l i k e the r e s t r i c t i o n t h a t each v a r i a b l e i n T must a l s o be i n W t o be removed. Thus, q u e r i e s such as: (SNAME SLOC : (SNAME SUPPLIES PART*)) would be v a l i d . C u r r e n t l y , t h i s query i s expressed as: (SNAME SLOC : (SNAME SUPPLIES PART*) S (SNAME IS-IN SLOC)) The problem with han d l i n g q u e r i e s of t h i s type can not be d i s c u s s e d u n t i l the mechanism by which q u e r i e s are handled i s understood. A s o l u t i o n t o t h i s problem i s presented i n Chapter V. n 4_.3Use with N a t u r a l Language n The o v e r r i d i n g g o a l i n the design of t h i s system i s to produce a framework which i s s u i t a b l e f o r d i r e c t use i n a n a t u r a l language environment. In p a r t i c u l a r , the query language must be s t r u c t u r e d so t h a t the semantics of the n a t u r a l language system can produce q u e r i e s i n t h i s language. Since the r e l a t i o n a l c a l c u l u s of Codd c o n t a i n s d i r e c t r e f e r e n c e to the data base, and s i n c e the q u e r i e s c o n t a i n so much i n f o r m a t i o n about how the r e t r i e v a l i s to be done, the r e l a t i o n a l c a l c u l u s 50 i s not s u i t a b l e f o r use i n t h i s context. ft data base independent language and a system which can decide how the r e t r i e v a l precess i s t o be done are necessary. WALT[16], a system f o r handl i n g n a t u r a l language i n t e r r o g a t i o n s , has been implemented a t the U n i v e r s i t y of B r i t i s h Columbia. T h i s system i s modelled a f t e r the LSNLIS system by Woods[34], The program attempts to r e t r i e v e i n f o r m a t i o n about LISP programs as a r e s u l t of n a t u r a l language q u e r i e s . WALT makes use of an Augmented T r a n s i t i o n Network grammar[35] i n order t o parse the sentence and produce a l i n g u i s t i c deep s t r u c t u r e . The semantic component then uses the parse t r e e to b u i l d an i n t e r p r e t a t i o n of the sentence. T h i s type o f semantics i s based on the pr o c e d u r a l semantics of Woods[33]. The semantic c o n s t r u c t produced i s a FOB statement whose form i s : (FOB QUANT X CLASS R (X) P(X)) where QUANT i s a q u a n t i f i e r , X i s the v a r i a b l e being g u a n t i f i e d , CLASS i s the name of a s e t over which the g u a n t i f i c a t i c n i s to range, B (X) i s a r e s t r i c t i o n on the range of q u a n t i f i c a t i o n , and P (X) i s the p r o p o s i t i o n which r e p r e s e n t s an a c t i o n to be taken. \u00E2\u0080\u00A2 Example 3.1 \u00E2\u0080\u00A2 In WALT, the query \" L i s t the f u n c t i o n s which the r o u t i n e MATCH c a l l s . \" would have the i n t e r p r e t a t i o n : (FOB EACH X (FUNCTION) (CALLS 'MATCH X) (PRINT X) ) . The query language proposed i n t h i s chapter c o n t a i n s the same i n f o r m a t i o n as does the FOR statement of WALT. Only the syntax used t o express the i n f o r m a t i o n i s d i f f e r e n t . \u00E2\u0080\u00A2 Example 3.2 \u00E2\u0080\u00A2 The r e l a t i o n a l c a l c u l u s guery: SNAME : (SNAME SUPPLIES PART#) 8 (EQ PART# 10) i s e g u i v a l e n t t o : (FOR EACH X (SNAME) TRUE (FOB THE Y (PART#) (EQ PART* 10) (AND (SUPPLIES X Y) (PRINT X)) )) I t i s hoped that the approach presented i n t h i s t h e s i s w i l l soon be extended to encompass n a t u r a l language g u e r i e s . 52 li. h NEW FRAMEWORK FOR RELATIONAL SYSTEMS T h i s chapter presents a new framework f o r i n t e r a c t i n g with r e l a t i o n a l systems. Using t h i s framework, i t i s p o s s i b l e to reduce q u e r i e s i n a new r e l a t i o n a l c a l c u l u s to a sequence of s e m a n t i c a l l y e q u i v a l e n t o p e r a t i o n s i n the r e l a t i o n a l a l g e b r a . T h i s i n v o l v e s determining which r e l a t i o n s are r e l e v a n t to the query, and how the r e t r i e v a l should be done. P r e v i o u s l y , these tasks were assigned to the user. The r e l a t i o n a l c a l c u l u s of Codd i s not used as an i n t e r m e d i a t e language. The chapter begins by d e s c r i b i n g the b a s i c methodology which i s employed to respond to a query. F o l l o w i n g t h i s i s a d e s c r i p t i o n of the main components of the new framework - the semantic model, and the theorem prover. These are d i s c u s s e d i n d e t a i l , and a g e n e r a l r e s u l t which shows when i t i s p o s s i b l e to prove t h a t two or t h r e e domains are r e l a t e d by some a r b i t r a r y r e l a t i o n i s presented. The chapter concludes with a d e s c r i p t i o n of a r e d u c t i o n a l g o r i t h m which makes use of the semantic model and the theorem prover to respond t o q u e r i e s i n the r e l a t i o n a l c a l c u l u s . The approach which has been taken i s somewhat i n t e r e s t i n g i n i t s e l f . I t makes use of standard A r t i f i c i a l I n t e l l i g e n c e ( A I ) techniques i n order to s o l v e a problem i n f a c t r e t r i e v a l . Instead of r e p r e s e n t i n g i n f o r m a t i o n using t a b l e s or graphs, a g e n e r a l semantic model i s used. T h i s concept, which i s b a s i c to most work i n AI, a l l o w s the most p e r t i n e n t i n f o r m a t i o n about the 53 system to be e x p l i c i t l y r e p r e s e n t e d . The meaning of the r e l a t i o n s are represented i n t h i s way. From t h i s b a s i c i n f o r m a t i o n , other i n f o r m a t i o n i s deduced by a theorem prover. In f a c t , q u e r i e s i n the system are formulated as a s e r i e s of theorems t o be proven t r u e . The ste p s of these p r o o f s show how the r e t r i e v a l can be done. T h i s framework i s both i n c r e m e n t a l and f l e x i b l e . New i n f o r m a t i o n can be c o n s t a n t l y added to the semantic model, and i t s e f f e c t s w i l l be a u t o m a t i c a l l y taken i n t o account. The a d d i t i o n of new r e l a t i o n s i s a l s o a t r i v i a l t ask. a 5 . . J . . Reduction of Queries - An Overview D T h i s s e c t i o n o u t l i n e s the b a s i c approach to ha n d l i n g g u e r i e s , and g i v e s a b r i e f d e s c r i p t i o n of Micro-Planner, a language which i s used i n the r e d u c t i o n process. The b a s i c approach t o answering q u e r i e s i s as f o l l o w s . F i r s t l y , the user e n t e r s a query i n the language d e s c r i b e d i n Chapter IV. \u00E2\u0080\u00A2 Example 1.1 \u00E2\u0080\u00A2 I f one wanted a l i s t of a l l s u p p l i e r s who supply a l l p a r t s , one would say: SNAME : (\u00C2\u00A5 PART#) (SNAME SUPPLIES PART#) . T h i s guery i s then formulated as a s e r i e s of theorems to be proved. In Example 1.1, an attempt would be made to prove: 54 (SNAME SUPPLIES PART*) . I f t h i s i s p o s s i b l e , then the system knows that there i s indeed i n f o r m a t i o n i n the data base which a l l o w s us to conclude t h a t some s u p p l i e r s supply p a r t numbers. As a s i d e e f f e c t , the programs which perform t h i s proof c r e a t e i n s t r u c t i o n s which show how the r e l a t i o n which c o n t a i n s t h i s i n f o r m a t i o n can be c r e a t e d . In the above example, the r e l a t i o n : SUPPLIES (SNAME PARTS) would be c r e a t e d . Q u a n t i f i e r s and simple terms i n the query are t r e a t e d i n a s i m i l a r manner. These, r e l a t i o n s are then r e s t r i c t e d and j o i n e d a c c o r d i n g to the r e s t r i c t i o n and j o i n terms i n the query, and a response r e l a t i o n i s d e r i v e d . 5\u00C2\u00AB.liA The Use of Mi erg-Planner The r e d u c t i o n of a query i s based upon the a b i l i t y to prove t h a t the simple and r e l a t i o n a l terms of the query are t r u e , using the elements of the semantic model as the axioms of the system. S e c t i o n 5 o u t l i n e s t h i s procedure i n d e t a i l . In order to accomplish t h i s t a s k , the language Micro-EliLDJlillC 1 8 ] 1 S used. Micro-Planner i s a language which i s o r i e n t e d towards the accomplishment of g o a l s , which i n tu r n are broken i n t o a s e r i e s of sub-goals. I t p r o v i d e s a back-up mechanism, so t h a t i f one p o s s i b l e way of accomplishing the g o a l i s t r i e d and f a i l s , then another p o s s i b i l i t y w i l l be t r i e d e t c . , . The f o l l o w i n g t r a d i t i o n a l example of deduction w i l l 55 i l l u s t r a t e some elementary f e a t u r e s of Micro-Planner. \u00E2\u0080\u00A2 Example 1.2 \u00E2\u0080\u00A2 I f we know t h a t T u r i n g i s a human, and a l l humans are f a l l i b l e , then T u r i n g i s f a l l i b l e . In Micro-Planner, t h i s i s expressed by s a y i n g : (THASSEET (HUMAN TUBING) ) (THCONSE (X) (FALLIBLE $?X) (THGOAL (HUMAN $?X)) ) THASSEBT and THGOAL can be a b b r e v i a t e d to $A and $G r e s p e c t i v e l y . The proof would be generated by e v a l u a t i n g the g o a l : (THGOAL (FALLIBLE TUBING) $T) From t h i s example, s e v e r a l p o i n t s should be observed. F i r s t , i n f o r m a t i o n i s s t o r e d i n Micro-Planner i n one of two ways as ASSERTIONS, or as THEOREMS. Simple f a c t s , such as Turing i s a human, are represented by a s s e r t i o n s , whereas more complicated f a c t s which may i n v o l v e q u a n t i f i c a t i o n and l o g i c a l c o n n e c t i v e s are expressed as theorems. In the above example, a THCONSE (conseguence) theorem i s shown. T h i s theorem s t a t e s t h a t a conseguence of X being a human i s that X i s f a l l i b l e . Micro-Planner, being a programming language, p r o v i d e s a s e t of f u n c t i o n s which can be used to d e f i n e theorems and goa l s . For example: (THFIND ALL $?X (X) ($G (FALLIBLE $?X) $T)) ) would r e t u r n a l i s t of a l l items which are f a l l i b l e . These items need not e x p l i c i t l y be s t a t e d as being f a l l i b l e , but can be items which are provably f a l l i b l e using the c u r r e n t set of 56 theorems and a s s e r t i o n s . As w e l l as THCONSE theorems, Micro-Planner p r o v i d e s THANTE (antecedent) theorems. \u00E2\u0080\u00A2 Example 1.3 \u00E2\u0080\u00A2 (THANTE T (X Y) (LIKES $?X $?Y) (THASSEET (HUMAN $?X)) ) In t h i s example, the theorem T says t h a t i f an a s s e r t i o n i s made about X l i k i n g Y, then we should immediately a s s e r t t h a t X i s human. These theorems decrease the amount of e x p l i c i t i n f o r m a t i o n which i s necessary at any one time. The f o l l o w i n g chapters w i l l r e v e a l how the f a c i l i t i e s of Micro-Planner are used to form a framework f o r r e l a t i o n a l systems. D 5.2. The Semantic Model o T r a d i t i o n a l r e l a t i o n a l systems have claim e d a high degree of user-data independence. T h i s i s t r u e only i n that the users need not be aware of how t h e i r r e l a t i o n s are p h y s i c a l l y , s t o r e d on a storage d e v i c e . U n f o r u n a t e l y , they must s t i l l be aware of how the data i s organized. In p a r t i c u l a r , they must know which r e l a t i o n s e x i s t , and what i n f o r m a t i o n each c o n t a i n s . In order t o overcome t h i s , the concept of a semantic model i s i n t r o d u c e d . The semantic model i s the most v i t a l component of the system. The semantic model serves as an i n t e r f a c e between the system and the data base, and i s used to determine the r e l a t i o n s 57 which are r e l e v a n t t o a request. The process which reduces q u e r i e s r e f e r s o n l y t o t h i s model, and never to the data base i t s e l f . J u s t as the use of the r e l a t i o n a l a l g e b r a allows the user to be independent of the p h y s i c a l r e p r e s e n t a t i o n of h i s r e l a t i o n s , the semantic model all o w s him to be independent of the o r g a n i z a t i o n of the r e l a t i o n s . The b a s i c f u n c t i o n of the semantic model i s to d e s c r i b e the i n f o r m a t i o n conveyed by each r e l a t i o n i n the data base. That i s , i t d e s c r i b e s the meanings of the r e l a t i o n s . Without t h i s i n f o r m a t i o n , i t would be i m p o s s i b l e to t e l l which r e l a t i o n s c o n t a i n i n f o r m a t i o n about an event X, and which do not. The model a l s o c o n t a i n s r e a l world knowledge which d e s c r i b e s the c u r r e n t s t a t e o f the environment which the data base r e p r e s e n t s . Information d e s c r i b i n g v a r i o u s p r o p e r t i e s which the r e l a t i o n s may or may not possess and i n f o r m a t i o n which i s u s e f u l f o r o p t i m i z i n g the r e t r i e v a l i s a l s o contained i n t h i s model. 5.2.A The Meaning o f g e l a t i o n s Consider the r e l a t i o n : R1 (S# PABT# PROJECT*) where a t u p l e (X Y Z) i s i n R i f s u p p l i e r X s u p p l i e s part Y to p r o j e c t Z. E x a c t l y what i n f o r m a t i o n does t h i s r e l a t i o n convey? I t t e l l s us t h a t : 1. S# s u p p l i e s PART* 2. PART* i s used i n PROJECT* 3. PROJECT* uses PART* 4. S* s u p p l i e s PROJECT* 5. PART* i s s u p p l i e d by S# 6. PROJECT* i s s u p p l i e d by s u p p l i e r SS, and 58 7. S# s u p p l i e s PART* to PROJECT* I f q u e r i e s are to be expressed i n a language which does not d i r e c t l y r e f e r e n c e the r e l a t i o n s , then t h i s i n f o r m a t i o n w i l l be necessary to i d e n t i f y the r e l a t i o n s which are a p p l i c a b l e to a r e g u e s t . T h e r e f o r e , i n f o r m a t i o n of t h i s type must be i n c l u d e d i n the semantic model. The e n t r i e s i n the semantic model f o r the above r e l a t i o n would be: ($A (S# SUPPLIES PART* R1)) ($A (PART* IS-USED-IN PROJECT* R1)) <$A (PROJECT* USES PART* R1)) ($A (S# SUPPLIES-TO PROJECT* R1)) ($A (PART* IS-SUPPLIED-BY S# R1)) ($A (PROJECT* IS-SUPPLIED-BY-S S# R1)) ($A (S* SUPPLIES PART* SUPPLIES-TO PROJECT* R1)) I t should be noted a t t h i s p o i n t t h a t no s p e c i a l i n f o r m a t i o n about the semantic model need be expressed. For example, Micro-Planner does not need to know what (S* SUPPLIES PART*) means, but w i l l accept i t as a p r i m i t i v e f a c t . Thus, i t i s as easy to represent i n f o r m a t i o n about employees and wages as about s u p p l i e r s and p a r t s . Information d e s c r i b i n g the o v e r a l l t o p i c with which the r e l a t i o n d e a l s can a l s o be expressed. For example, i f B2 (PABT* PRICE) were present, we might: ($A (R2 CONCERNS CURRENT PARTS)) T h i s i s e s p e c i a l l y u s e f u l i f s e v e r a l r e l a t i o n s i n the data base have i d e n t i c a l domains, but d i f f e r e n t meaning. For example, i f R3 (PART* PRICE) were a l s o i n c l u d e d , then we would: ($A (R3 CONCERNS OBSOLETE PARTS)) 59 5*.2_.B P r o p e r t i e s of g e l a t i o n s In the process of determining which r e l a t i o n s i n the data base are r e l e v a n t t o a given request, i t i s o f t e n necessary to check f o r s p e c i f i c p r o p e r t i e s which a r e l a t i o n may or may not possess. T h e r e f o r e , the semantic model a l s o c o n t a i n s i n f o r m a t i o n showing the p r o p e r t i e s each r e l a t i o n possesses. In t h i s framework, the p r o p e r t i e s of importance are T-TRANS and DETERMINES (see s e c t i o n 3.B). \u00E2\u0080\u00A2 Example 2.1 \u00E2\u0080\u00A2 In the r e l a t i o n R4 (S# SNAME), i t i s l i k e l y t h a t the s u p p l i e r number uniquely determines the s u p p l i e r name and v i c e versa. T h i s would be rep r e s e n t e d by: ($A (S# DETERMINES SNAME BU)) ($A (SNAME DETERMINES S# RU)) I t a l s o happens t h a t S# and SNAME are T-TRANS i n R4. T h i s would be represented by e i t h e r : ($A (S# AND SNAME ARE T-TRANS IN R4)) or ($A (R4 IS T-TRANS)) 5..2..C O p t i m i z i n g Information The semantic model may a l s o c o n t a i n i n f o r m a t i o n which can be used by the system i n order to l e s s e n the time r e q u i r e d to r e t r i e v e the data f o r a request. The system which i s implemented shows one p o s s i b l e use of t h i s f a c i l i t y . C onsider a query which r e q u i r e s the system to c r e a t e a r e l a t i o n which enumerates the elements of a domain d, To do t h i s , one would p r o j e c t each r e l a t i o n i n the data base on 60 d, and then take the union of these r e s u l t s . T h i s produces a r e l a t i o n which i s guaranteed t o c o n t a i n a l l the values of a domain which are present i n the data base. I f , however, a s i n g l e r e l a t i o n enumerates the values of d, then t h i s process i s unnecessary. A l l t h a t i s needed i s a p r o j e c t i o n of t h i s one r e l a t i o n . Even i f two r e l a t i o n s t o g e t h e r enumerate d, then time can be saved i f the system i s made aware of t h i s f a c t . T h e r e f o r e , one can add i n f o r m a t i o n of the form: ($A (RI ENUMERATES S#)) ($A (R1 PARTIALLY-ENUMERATES S#)) T h i s i s e s p e c i a l l y u s e f u l when q u a n t i f i e r s are being processed, s i n c e q u a n t i f i c a t i o n i s always over a domain or a r e s t r i c t e d subset of t h a t domain. 5_.2. D A c t i v e and I n a c t i v e Data The semantic model a l s o c o n t a i n s i n f o r m a t i o n t h a t r e f l e c t s the c u r r e n t s t a t e of the environment which the data base d e s c r i b e s . T h i s we r e f e r to as \" r e a l world\" knowledge. \u00E2\u0080\u00A2 Example 2.2 \u00E2\u0080\u00A2 (VANCOUVER SUPPLIERS ARE ON STRIKE) i s a t y p i c a l example. In order to handle r e a l world i n f o r m a t i o n , the concepts of the c u r r e n t l y a c t i v e and i n a c t i v e p o r t i o n s of the data base are i n t r o d u c e d . The a c t i v e i n f o r m a t i o n i s that which can be used i n responding to a query. The i n a c t i v e data i s data which i s present i n the data base, but should t e m p o r a r i l y be i g n o r e d . 61 The system w i l l never make use of i n a c t i v e data when responding t o a guery. The e f f e c t of r e a l world knowledge i s to t e m p o r a r i l y a l t e r the part of the data base which i s c u r r e n t l y c o n s i d e r e d a c t i v e . There are two cases which must be d e a l t with. F i r s t l y , whole r e l a t i o n s can be i n a c t i v e , a l t e r n a t i v e l y , the t u p l e s i n c e r t a i n r e l a t i o n s which s a t i s f y some c r i t e r i o n can be c o n s i d e r e d i n a c t i v e . In order to i n d i c a t e t h a t a r e l a t i o n i s c u r r e n t l y i n a c t i v e , one simply adds an a s s e r t i o n of the form: ($A ($?REL IS INACTIVE)) to the semantic model, where $?REL i s a v a r i a b l e whose value i s the name of the r e l a t i o n which i s i n a c t i v e . The i n a c t i v a t i o n of t u p l e s w i t h i n a r e l a t i o n i s accomplished i n a somewhat more complicated manner. I t i s most s u i t a b l e to r e s t r i c t t u p l e s by showing the p r o p e r t i e s t h a t each domain i n the t u p l e must s a t i s f y . T h e r e f o r e , one can add i n f o r m a t i o n of the type: ($A (REST RESTRICTS DOMAIN TO LPRED)) where REST i s the name of the r e s t r i c t i o n , DOMAIN i s the domain to be r e s t r i c t e d , and LPRED i s an a r b i t r a r y LISP p r e d i c a t e whose only unbound v a r i a b l e i s DOMAIN. 62 \u00E2\u0080\u00A2 Example 2.3 \u00E2\u0080\u00A2 The semantic model would c o n t a i n the a s s e r t i o n : ($A (REST*1 RESTRICTS SLOC TO (NOT (EQ SLOC * VANCOUVER)))) i f the t u p l e s i n some r e l a t i o n are to be r e s t r i c t e d to the case where the l o c a t i o n of the s u p p l i e r i s not VANCOUVER. T h i s a l l o w s r e s t r i c t i o n s to be s p e c i f i e d independently of the r e l a t i o n s which are r e s t r i c t e d . In order t o make use of t h i s i n f o r m a t i o n , the semantic model can a l s o c o n t a i n a s s e r t i o n s of the form: ($A (PRED IN RELATION IS-RESTRICTED-TO REST*)) where PRED i s the name of a d i a d i c p r e d i c a t e i n the guery language, RELATION i s the name of a r e l a t i o n , and REST* i s the name of the r e s t r i c t i o n . For example, we might have: (SUPPLIES IN R3 IS-RESTRICTED-TO BEST*1). T h i s says t h a t i f the p r e d i c a t e SUPPLIES i s seen i n a r e l a t i o n a l term of a guery and i f R3 i s being used i n the r e d u c t i o n of the guery, then only those t u p l e s of R3 which s a t i s f y REST*1 should be used. Notice that t h i s i s more gen e r a l than r e s t r i c t i n g whole r e l a t i o n s to s p e c i f i c cases. The way that these a s s e r t i o n s are used to re p r e s e n t r e a l world knowledge i s di s c u s s e d i n s e c t i o n 5.6. 63 5,_2._E G e n e r a l i t y of the Semantic Model I t i s f e l t t h a t a semantic model i s necessary to process g u e r i e s i n any language which does not make d i r e c t r e f e r e n c e to the data base. T h i s work shows the type of i n f o r m a t i o n which a t y p i c a l semantic model might c o n t a i n . The bulk of t h i s i n f o r m a t i o n d e s c r i b e s the l e a n i n g of the r e l a t i o n s . For example, ($A (PART# IS-USED-IN PROJECTS R5) ) . The semantic model i s represented by a s e t of Micro-Planner a s s e r t i o n s . However, the r e p r e s e n t a t i o n i s not the important f e a t u r e ; the model i s proposed as a g e n e r a l concept, and the manner i n which i t i s represented can be determined by the a p p l i c a t i o n i n which i t i s being used. In t h i s t h e s i s , the framework i s formulated i n such a way t h a t i t i s u s e f u l i n the context of a n a t u r a l language system. For t h i s a p p l i c a t i o n , using Micro-Planner to d e s c r i b e the semantic model i s i d e a l . I f one were attempting to design a commercially usable r e t r i e v a l system, however, i t i s u n l i k e l y t h a t LISP and Micro-Planner would be used. They make implementing the system easy, but are not e x c e p t i o n a l l y e f f i c i e n t . In t h i s a p p l i c a t i o n , the semantic model could be rep r e s e n t e d u s i n g a more \" c o n v e n t i o n a l \" data s t r u c t u r e such as a network. The nodes of the network could be the domains i n the data base, with the ar c s being l a b e l l e d with the p r e d i c a t e t h a t r e l a t e s the nodes. For example, the a s s e r t i o n : (PART* IS-USED-IN PROJECT * R5) 64 would be represented by: B4 PART* PROJECTS The r e s u l t s which are presented i n the f o l l o w i n g c h a p t e r s are a p p l i c a b l e r e g a r d l e s s of how the semantic model i s r e p r e s e n t e d , and are not dependent upon the use of Micro-Planner. n 5 . _ 3 P r o v i n g R e l a t i o n a l Terms o Queries i n the new r e l a t i o n a l c a l c u l u s c o n t a i n no d i r e c t r e f e r e n c e to the data base. T h e r e f o r e , the system which handles them must be capable of d e c i d i n g which r e l a t i o n s are r e l e v a n t to the r e q u e s t , and how the r e l a t i o n s should be manipulated i n order to produce the c o r r e c t response. T h i s s e c t i o n d i s c u s s e s how t h i s i s accomplished. In any query, the r e l a t i o n a l terms are of primary importance. The users view these as the mechanism through which they can express the p r o p e r t i e s of the data they d e s i r e . To the system, however, they r e p r e s e n t new r e l a t i o n s which must be c r e a t e d . 65 \u00E2\u0080\u00A2 Example 3.1 \u00E2\u0080\u00A2 Consider the query: (SNAME : (SNAME SUPPLIES PROJECT*) S (EQ PROJECT* 17)) which asks f o r a l l s u p p l i e r s who supply p r o j e c t 17. In order to answer t h i s , the r e l a t i o n : SUPPLIES (SNAME PROJECT*) must f i r s t be c r e a t e d . When p r o c e s s i n g a query, the r e l a t i o n which corresponds to each r e l a t i o n a l term and each simple term must be c r e a t e d . ( R e s t r i c t i o n and j o i n terms do not r e q u i r e new r e l a t i o n s to be created.) The q u e r i e s , however, c o n t a i n no i n f o r m a t i o n as to how t h i s should be done. I t i s up to the system to determine how the i n f o r m a t i o n can be e x t r a c t e d from the data base. In order to c r e a t e the r e l a t i o n c orresponding t o some r e l a t i o n a l term T, the f o l l o w i n g approach i s taken. F i r s t , T i s formulated as a theorem, and an attempt i s made t o prove i t using the semantics of the data base as the axioms. I f the theorem can be proven t r u e , then the proof w i l l show how the r e l a t i o n can be c r e a t e d . I f i t i s f a l s e , then we can conclude t h a t the data base does not c o n t a i n s u f f i c i e n t i n f o r m a t i o n to answer the query. The remainder of t h i s s e c t i o n i s devoted to d i s c u s s i n g when, i n g e n e r a l , a term can be proven t r u e . The next s e c t i o n d i s c u s s e s the implementation. 66 5i.3i_A E x p l a n a t i o n o f the Problem In the f o l l o w i n g s e c t i o n s , we r e p r e s e n t r e l a t i o n a l terms of the form (X R Y) by the standard s e t - t h e o r e t i c n o t a t i o n xRy. Consider r e l a t i o n a l terms of the form xRy. I f a s i n g l e r e l a t i o n i n the data base c o n t a i n s the i n f o r m a t i o n requested i n the r e l a t i o n a l term, then the axiom xRy w i l l appear i n the semantic model, and the proof w i l l be t r i v i a l . I t w i l l not always be the case that a s i n g l e r e l a t i o n c o n t a i n s a l l the d e s i r e d i n f o r m a t i o n . S e v e r a l r e l a t i o n s w i l l o f t e n be needed. \u00E2\u0080\u00A2 Example 3.2 \u00E2\u0080\u00A2 To form the r e l a t i o n d e f i n e d by the term i n example 3.1 above, r e l a t i o n s R3 and R5 must be used. Consider the problem of proving xRy, when two r e l a t i o n s are necessary. I f one r e l a t i o n says xRz, and another says that zR'y, then can we conclude t h a t xRy, or xR'y, or even xR\"y? Consider the f o l l o w i n g examples which a l l attempt to prove a term of the form xRz by showing xRy and yR'z. 3.2.1. I f we know (SHAME IS-NUMBERID S#) and (S# SUPPLIES PROJECT), then we can conclude t h a t (SHAME SUPPLIES PROJECT). 3.2.2. I f we know (S# SUPPLIES PART*), and (PART* IS-USED-IN PROJECT*), then we can not conclude t h a t (S# SUPPLIES PROJECT*). He can only conclude t h a t (S* R PROJECT*), where R i s some r e l a t i o n whose name happens to be \"maybe s u p p l i e s \" . I f , however, we know th a t PART* uniquely determines PROJECT*, then we can conclude t h a t 67 (S# SUPPLIES PROJECT*) . 3.2.3. I f we know that (A IS-THE-SQUARE-OF B), and (B DOUBLED-IS C ) , then we cannot conclude t h a t (A IS-THE-SQUABE-OF C) i n any case, r e g a r d l e s s of the f a c t t h a t B uniquely determines A. 3.2.4. I f we know th a t (SUB-PART# IS-PABT-OF PART*), and (PART* HAS-NAME PART-NAME) , then we can always conclude t h a t (SUB-PABT* IS-PART-OF PART-NAME). These examples i l l u s t r a t e t h a t the proof of xRy i s going to depend upon the p r o p e r t i e s of the r e l a t i o n s i n v o l v e d , not j u s t upon t h e i r content. 5_.3_.B P r o p e r t i e s of R e l a t i o n s There are two p r o p e r t i e s which r e l a t i o n s may possess t h a t i n f l u e n c e the way they can be used i n c r e a t i n g new r e l a t i o n s . These p r o p e r t i e s are c a l l e d T-TRANS and UNIQUELY-DETERMINES. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 Two domains a and y i n r e l a t i o n R* are s a i d to be T-TRANS (Through T r a n s i t i v e ) i f , f o r each xRa, xRa 6 aR'y => xRy. A r e l a t i o n R\u00C2\u00AB i s s a i d to be T-TBANS i f a l l domains i n R* are p a i r w i s e T-TRANS. 68 \u00E2\u0080\u00A2 Example 3.3 \u00E2\u0080\u00A2 I f (X R S#) and (S# R\u00C2\u00AB SNAME) => (X R SHAME) , then SNAME and S# are T-TRANS i n R\u00C2\u00BB. \u00E2\u0080\u00A2 Example 3.4 \u00E2\u0080\u00A2 The r e l a t i o n R (S# SNAME PROJECT* PROJECT-NAME) i s not T-TRANS, whereas R\u00C2\u00BB (S# SNAME) i s . T h i s property w i l l be used to show t h a t the c o n c l u s i o n s i n examples 3.2.1 and 3.2.4 are v a l i d , whereas no c o n c l u s i o n can be drawn from 3.2.3, which i s s y n t a c t i c a l l y s i m i l a r to both 3.2.1 and 3.2.4. \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 A domain a i n R uniquely determines domain b i n R i f , f o r any v a l u e of a i n R, there e x i s t s only one value of b. \u00E2\u0080\u00A2 Example 3.5 \u00E2\u0080\u00A2 R ( S# PART* ) 1 4 1 7 2 8 3 9 Here, PART* uniquely determines S#. N o t i c e t h a t S* does not uniq u e l y determine PART*. T h i s property w i l l be necessary i n the proof of 3.2 above. 69 5\u00E2\u0080\u00A2.3-.C Proving xR y_ I f the axiom xRy i s not present i n the semantic model, then more than one r e l a t i o n w i l l be i n v o l v e d i n the proof. Thus, the proof i s broken up i n t o two stages. F i r s t l y , an attempt i s made to f i n d a domain a such that xR'a. I f t h i s succeeds then the proof of aR\"y i s attempted. I f t h i s a l s o succeeds, then we can conclude t h a t a r e l a t i o n R e x i s t s between x and y. The v a l i d i t y of t h i s c o n c l u s i o n , however, depends upon c e r t a i n r e l a t i o n s h i p s which may or may not e x i s t between a, x, and y. Example 3.2 i l l u s t r a t e d t h i s p o i n t . F urther, i t w i l l most o f t e n be the case t h a t the r e l a t i o n R which i s d e s i r e d w i l l be s p e c i f i e d , r a t h e r than a r b i t r a r y . Thus, the property T-TRANS w i l l a l s o need to be taken i n t o account. Consider f i r s t the problem of t r y i n g to prove xRy, where R i s an a r b i t r a r y r e l a t i o n . Heath[17], i n attempting to show t h a t any r e l a t i o n can be reduced t o a n a t u r a l j o i n of r e l a t i o n s i n t h i r d normal form[10], proves the f o l l o w i n g two r e s u l t s : 1. The r e l a t i o n R(A,B,C), where A determines B, i s the j o i n of R* (A,B) and R n(A,C). 2. The r e l a t i o n R(A,B,C), where A determines E and B determines C, i s the j o i n of R\u00C2\u00BB (A,B) and R\"{B,C). Using these r e s u l t s , we d e f i n e the concept of a v a l i d r e s u l t . 70 \u00E2\u0080\u00A2 D e f i n i t i o n \u00E2\u0080\u00A2 I f A determines B i n R* (A,B), and n e i t h e r A nor C determine each other i n R\" (A,C) , then the p r o j e c t i o n R(E,C) of the r e l a t i o n R (A,B,C), formed by j o i n i n g R \u00E2\u0080\u00A2 (A, B) and R\"(A,C) i s a v a l i d r e l a t i o n . S i m i l a r l y , i f A determines B i n R* (A,B) and B determines C i n R\"(B,C), then the p r o j e c t i o n R(A,C) of the r e l a t i o n R (A,B,C) formed by j o i n i n g R\u00C2\u00AB(A\u00C2\u00BBB) and R\"(B,C) i s a l s o a v a l i d r e l a t i o n . We speak of the steps of the proof of a r e l a t i o n a l term as the path of the proof. Any path which r e q u i r e s the formation of v a l i d r e l a t i o n s and v a l i d r e l a t i o n s only i s s a i d to be a v a l i d path. \u00E2\u0080\u00A2 Example 3.6 \u00E2\u0080\u00A2 I f we know that A: (PARTS IS-SUPPLIED-BY SS), and t h a t B: (PARTS IS-OSED-IN PROJECTS), then i f PARTS uniquely determines SS i n the r e l a t i o n d e f i n e d by the r e l a t i o n a l term A, we can form the v a l i d r e l a t i o n R (SS,PR0JECTS), where the name of R i s a c t u a l l y \" s u p p l i e s - t o \" . In any proof, an attempt i s made to f i n d a v a l i d path b e f o r e a n o n - v a l i d path. Note that a n o n - v a l i d path i s not always u n d e s i r a b l e , s i n c e even i f PARTS does not determine SS i n example 3.6, the j o i n of the r e l a t i o n s d e f i n e d by the terms A and B produces a new r e l a t i o n whose name i s \"maybe s u p p l i e s \" , r a t h e r than \" s u p p l i e s \" . T h i s concept i s extremely v a l u a b l e , as i t guides the proof i n a reasonable d i r e c t i o n . T h i s p o i n t w i l l be i l l u s t r a t e d l a t e r . 71 How c o n s i d e r the problem of t r y i n g to prove xRy where R i s s p e c i f i e d . Given r e l a t i o n s R\u00C2\u00BB(X,A) or R'(A yX), and R\"(A,Y) or R\u00C2\u00BB\u00C2\u00AB(Y,A), then the f o l l o w i n g t a b l e shows the c o n c l u s i o n s which can be drawn assuming the new r e l a t i o n s are v a l i d . The c o n c l u s i o n s marked with a s t e r i s k s are the ones of i n t e r e s t , s i n c e they show when one can prove xRy. T T i | R f T-TRANS | R\" T-TRANS 1 i R\u00C2\u00AB (A,X) R\" (A,Y) | R\"(X,Y)* _ .|. _______._\u00E2\u0080\u009E ..r,., I R'(I.X) i R\u00C2\u00BB (X,A) R\" (A, Y) | R\" (X , Y) * | R\u00C2\u00AB (X,Y)* | R\u00C2\u00BB (A,X) R\" (Y,A) | R\"(Y,X) I H'(Y,X) I R' (X,A) R M (Y,A) I R'\u00C2\u00BB(Y,X) | R\u00C2\u00BB (X,Y) * j ...x .. .\u00E2\u0080\u009Ei,.,... _ _ \u00E2\u0080\u0094 We are now i n a p o s i t i o n to d e f i n e the c o n d i t i o n s under which xRy can be proven t r u e . xRy i s t r u e i f e i t h e r : 1. xRy i s an axiom of the system. 2. i . xRa and e i t h e r aR\u00C2\u00BBy or yR'a. i i . y and a are T-TRANS i n R\u00C2\u00BB. i i i . E i t h e r a determines x i n R or a determines y i n R\u00C2\u00BB. 3. i . aRy and e i t h e r aR'x or xR\u00C2\u00BBa. i i . x and a are T-TRANS i n R\u00C2\u00AB. i i i . E i t h e r a determines y i n R or a determines x i n R\u00C2\u00BB. These are the only c o n d i t i o n s which allow us to show that xRy. 72 \u00E2\u0080\u00A2 Example 3.7 \u00E2\u0080\u00A2 Say one i s attempting to prove xBy. I f i t i s known that xRa, then one must simply show t h a t aR'y where y and a are T-TRANS i n R\u00E2\u0080\u00A2, and e i t h e r a determines x i n R or a determines y i n R*. T h i s i n turn could be done by showing that aR\"b and bR\"*y, where b and y are T-TRANS i n R\"', and where b determines a i n R\" or b determines y i n R\"\u00C2\u00BB. Thus, t h i s method a l l o w s f o r proofs of a r b i t r a r y l e n g t h , not j u s t proofs with two s t e p s . \u00E2\u0080\u00A2 Example 3.8 \u00E2\u0080\u00A2 The term (S# SUPPLIES-TO PROJECT-NAME) can be proven t r u e , s i n c e : 1. (S* SUPPLIES-TO PROJECT*) 2. (PROJECT* IS-NAMED PROJECT-NAME) 3. The r e l a t i o n d e f i n e d by 2 i s T-TRANS, and 4. In the r e l a t i o n d e f i n e d by 2, PROJECT* determines PROJECT-NAME. 5.J^D Proving xR_lEl! There are s i x c o n d i t i o n s under which xRyR'z can be proven t r u e . xRyR'z i s true i f e i t h e r : 1. xRyR'z i s an axiom of the system. 2. i . aRyR'z and e i t h e r aR\"x or xR\"a. i i . x and a are T-TRANS i n R\". i i i . E i t h e r a determines x i n R\" or a determines (y z) i n the r e l a t i o n d e f i n e d by the term. 73 3. i . xRaR'z and e i t h e r aR\"y or yR na. i i . y and a are T-TRANS i n R\". i i i . E i t h e r a determines y i n R n or a determines (x z) i n the r e l a t i o n d e f i n e d by the term. 4. i . xRyR'a and e i t h e r aR\"z or zR\"a. i i . z and a are T-TRANS i n R\". i i i . E i t h e r a determines z i n R\" or a determines (x y) i n the r e l a t i o n d e f i n e d by the term. 5. i . xRy and xR'z i i . x determines y i n the r e l a t i o n d e f i n e d by (x R y) 6. i . xRy and yR'z i i . x determines y i n the r e l a t i o n d e f i n e d by (x R y) i i i . y determines z i n the r e l a t i o n d e f i n e d by (x R' z) Not i c e that i n case 5, x i s r e l a t e d to y and z, while i n case 6 x i s r e l a t e d to y and y i s r e l a t e d to z. T h i s r e p r e s e n t s the two i n t e r p r e t a t i o n s of t e r n a r y semantic i n f o r m a t i o n . In t h i s framework, i t i s p o s s i b l e to prove a term of the form xRyR'z, even i f the semantics c o n t a i n only b i n a r y i n f o r m a t i o n . \u00E2\u0080\u00A2 Example 3.9 \u00E2\u0080\u00A2 We can prove (SNAME SUPPLIED PART* SUPPLIED-TO PROJECT*), s i n c e : 1. (S* SUPPLIED PART* SUPPLIED-TO PROJECT*) 2. (S# IS-CALLED SNAME) 3. S# and SNAME are T-TRANS i n the r e l a t i o n d e f i n e d by 2 4. S* determines SNAME i n the r e l a t i o n d e f i n e d by 2. 74 5^3_j_E Usefulness of the V a l i d Path The concept of a v a l i d path i s c r u c i a l to the proof mechanism s i n c e i t tends to e l i m i n a t e \"garbage\" paths. Con s i d e r , f o r example, an attempt to prove t h a t (S# SUPPLIES-TO PROJECT-NAME). The d e s i r e d proof i s : 1. (S# SUPPLIES-TO PROJECTS), and 2. (PROJECTS IS-NAMED PROJECT-NAME). At f i r s t glance there seem to be s e v e r a l other c o r r e c t but u n d e s i r a b l e proofs. For example: A. (SS SUPPLIES PARTS), and B. (PARTS IS-USED-IN PROJECTS), and C. (PROJECTS IS-NAMED PROJECT-NAME) would a l s o appear to be c o r r e c t . He would hope t h a t i f the system found t h i s v e r s i o n of the proof before the f i r s t v e r s i o n , i t would be r e j e c t e d . T h i s i s , i n f a c t , the case. Unless PARTS uniq u e l y determines SS i n A, steps A and B do not d e f i n e a v a l i d r e l a t i o n , and t h e r e f o r e , the system w i l l abandon t h i s path and the c o r r e c t proof w i l l e v e n t u a l l y be found. I f PARTS does determine SS i n A, then t h i s proof produces the same response r e l a t i o n as does the f i r s t , and i s t h e r e f o r e a c c e p t a b l e . 5..3.F Summary, T h i s s e c t i o n has d e f i n e d the c o n d i t i o n s under which the r e l a t i o n a l terms xfly and xRyR'z can be proven. Above a l l , i t has demonstrated that i n order to prove a term such as xRy, i t i s not s u f f i c i e n t t o show t h a t : 75 xRa & aR'b & bR\"c & cR\u00C2\u00BB\"y. The p r o p e r t i e s of the r e l a t i o n s must be taken i n t o account. n 5 ..4 The P r o c e s s i n g of R e l a t i o n a l Terms o T h i s s e c t i o n d i s c u s s e s the implementation of the procedures which prove the v a l i d i t y of the r e l a t i o n a l terms. In the p r e v i o u s s e c t i o n , the c o n d i t i o n s under which t h i s could be done were o u t l i n e d . R e l a t i o n a l terms can be proven true only i f the data base c o n t a i n s enough i n f o r m a t i o n to c r e a t e the r e l a t i o n which the term d e f i n e s . T h e r e f o r e , the proof i s c o n s t r u c t e d by showing t h a t i t i s , i n f a c t , p o s s i b l e to c r e a t e such a r e l a t i o n . The s t e p s of the proof are \"remembered\", and from these, we can determine how the r e l a t i o n can be c o n s t r u c t e d . T h i s s e c t i o n begins by d e s c r i b i n g the format i n which Micro-Planner saves i n t e r m e d i a t e s t e p s . I t a l s o d e s c r i b e s the r o u t i n e s which prove that x, xRy, or xRyR'z are t r u e . Included are d e s c r i p t i o n s of how r e a l world knowledge i s handled, and how Micro-Planner saves the r e l e v a n t p a r t s of the proof. 76 5. ft. ft The L i s t Returned by Micro-Planner In proving a r e l a t i o n a l term, i t i s important to know not on l y that the corresponding r e l a t i o n can be c r e a t e d , but how i t can be c r e a t e d . A r o u t i n e which j u s t says \"yes, you can c r e a t e the r e l a t i o n \" i s of l i t t l e use i n i t s e l f . T h e r e f o r e , the Micro-Planner procedures keep t r a c k of the i n f o r m a t i o n they use i n the proof, and o r g a n i z e i t i n a f a s h i o n which makes i t easy to see how the r e l a t i o n can be c r e a t e d . T h i s i n f o r m a t i o n i s s t o r e d i n a l i s t , and given the name SRESULTS s i n c e i t i s the r e s u l t of a s u c c e s s f u l proof. The l i s t shows which r e l a t i o n s are needed i n order to c r e a t e the r e l a t i o n d e f i n e d by the r e l a t i o n a l term, and shows how they should be used to form i t . I f the r e l a t i o n a l term T has n domain v a r i a b l e s , then the f i r s t n elements of SRESULTS are l i s t s of the form: (DOMAIN RELS), where DOMftIN i s the name of a domain i n T, and RELS i s the l i s t of r e l a t i o n s which together enumerate the elements of DOMAIN. For example, we might have: (PARTS (R7 R5)) i f PARTS was to come from r e l a t i o n s R5 and R7. The l i s t RELS i s not a c t u a l l y used i n the c u r r e n t scheme. The l a s t element of SRESULTS i s a l i s t which shows how to c r e a t e the r e l a t i o n . I t has the form: (REL1 D1 REL2 2* 25 26 27 ; JOIN 28 ; RETURNS THE THETA-JOIN OF RI WITH R2. THETA IS EXPRESSED 29 ; BY PREDICATE, AN \u00C2\u00AB\u00E2\u0080\u00A2ARB ITRARY\u00C2\u00BB\u00C2\u00BB LISP EXPRESSION WHICH HUST 30 ; BE SATISFIED BEFORE TKD TUPLES CAN BECOME PART OF THE JOIN. 31 32 33 (DEFUN JOIN (RI R2 PREDICATE) 34 . I PROG INEWREL TNAMEI ' 35 (SETC NEWREL (GENSYHI 'RED) 36 I PUT NEWREL 'KTUPLES 0) 37 (PUT NEWREL -OOMAINS (APPEND (GET RI -DOMAINS) 38 (GET RZ -DOMAINS))) \u00E2\u0080\u00A2 39 IMAPC \u00E2\u0080\u00A2(LAMBDA fTI) 40 (SETO T l (GET T l -DATAH . 41 (KAPC '(LAMBDA (T2) 42 (SETO T2 (GET T2 'DATA)I 43 (CONO ( (EVAL PREDICATE) 44 (ADDTUPLE (APPEND T l T2) NEWREL) )) 45 ) (GET R2 -TUPLES)) 46 ) (GET RI 'TUPLES)) 47 (RETURN NEWREL) 48 I 49 ) 50 .51 52 S 3 ' ; RESTRICT 54 ; RESTRICT RETURNS A NEW RELATION WHICH CONSISTS OF THOSE 55 ! TUPLES IN \"RELATION\" WHICH SATISFY THE PREDICATE. \u00E2\u0080\u00A2 '< 56 57 58 (OEFUN RESTRICT (RECAT/ON PREOICATEI 59 (PROG INEWREL T l ) 60 (SETO NEWREL (GENSYMl -RELII 61 I PUT NEWREL 'HTUPLES 0) \u00E2\u0080\u00A2' 62 . (PUT NEWREL 'OOMAINS (GET RELATI ON 'DOMAINS)) ' . . ' 63 (KAPC 1(LAHOOA (THAMEl 64 t SETO. T l (GET TNAME 'DATA)) .65 (CONO I IEVAL PREDICATE I 66 (PUT NEWREL 'TUPLES (CONS TNAME 67 (GET NEWREL 'TUPLES))I 68 I I 69 I 70 (GET RELATION 'TUPLES!) 71 (PUT NEWREL 'HTUPLES (LENGTH (GET NEWREL 'TUPLES)!I 72 ' (RETURN NEWREL) 73 ) 74 ) . 75 76 77 78 ; 0(VICE 79 ; A IS A LIST OF THE DOMAIN NUMBERS OF RELATION RI TO BE BO ; DIVICED BY DOMAIN NUMBERS B OF R2. 81 5 \u00E2\u0080\u00A2 1 82 J OIVICE WORKS AS FOLLOWS. FOR EACH ROW R IN RI, IF EACH 83 TUPLE IN THE PROJECTION OF R2 ON B IS A P.EHBER OF THE-.84 ; IMAGE SEI OF THE PROJECTION OF ABAR ON R, UNDER RI. THEN -85 ; KEEP THE PROJECTION 0\u00C2\u00A3 ABAR ON R. 86 87 -eC (DEFUN RDIVIOE (RI K? A B) 89 l?\u00C2\u00ABCG IREL SONS ISET ABAR ALL USED ABAROATA) 90 (SEIO S0N8 (PROJECT \u00C2\u00AB2 6)1 91 (SEIO ABAR ISETOIF (CENLIST (GET RI -DOMAINS)) AM 92 (SETQ REL (GENSYHl -RELIt 93 I PUT REL \u00E2\u0080\u00A2\u00C2\u00BB TUPLES 01 94 (PUI >>EL -DOMAINS (CHOOSE (GET RI -DOMAINS) ABAR)) 95 (MAPC -(LAHBDA ( I I ) 96 (SETO II (GET T l -DATA)) 97 (SETO ABAROATA (CHOOSE Tl ABAR)) 98 (CCNO ( (NO! (MEMBER A3 ARO AT A USF.DII 99 (SITO ISET (IMAGE5ET I CHOOSE Tl ABAR) 100 RI 101 A8AR IC2 A l l IO J (SETS ALL ri 104' ISETO USED (CONS A3AHDATA USED)! 1C5 (MAPC MLAKBDA 1121 1C6 (SEIO 72 (GET T2 'DATA!) 1C7 (COND ( (NOT [MEMBER 12 (SET)) ICS ISEIO AIL NIL) 1C9 t UNEVAL 'MAPC NIL)> I 1 10 I 111 IGET SCNE \u00E2\u0080\u00A2TUPLES 1 I 112 (AW ALL (ADDTUPLE ABAROATA RED) 113 I ) 11* I 115 (CET R l \u00E2\u0080\u00A2TUPLES I 1 116 (RETURN RED 117 I l i e I 119 120 121 122 (DEFUN RUNION (L) 123 ; THIS PROCEDURE WILL RETURN A RELATION WHICH IS THE UNION 124 ; OF ALL THE RELATIONS IN L. 125 [PRCG (RELI 126 ISETO REL (CAR 111 127 [MAPC \u00E2\u0080\u00A2(LAMBDA (Rl 123 ICONO I (NULL R l (RETURN RELI) 129 ( T (SETO REL (RUNICN\u00C2\u00AB REL Rl>\u00C2\u00BB I 130 I 131 (CDR LI I 132 (RETURN RELI) 133 I 13* 135 l i b 137 IDEFUN RUN I ON* ( R l R2) 138 \" t 138 ; THIS PROCEDURE WILL RETURN A RELATION WHICH IS THE UNION 139 ( OF RI AND R2. THE NEW DOMAIN NAMES ARE THOSE OF R l . 140 (PROG (REL UNION TUPLE) 141 (CCND ( (NOT (EO (LENGTH (GET RI \u00E2\u0080\u00A2DDHAINS1I 142 (LENGTH (GET R2 -DOMAINS!) I I 1*3 (RETURN NIL)I I 14* (SETQ REL (GENSVH1 165 (COND I (NULL Rl (RETURN RED) 166 I T (SETO REL IRCROSS\" REL Rl)> ) 167 ) 168 ICDR D l 169 (RETURN RED 170 I 171 I 172 173 17* 175 (DEFUN RCRQSS\u00C2\u00BB IR1 R2I * 176 ; THIS PROCEDURE WILL RETURN A NEW RELATION WHICH IS THE 177 ; CRDSS PRODUCT OF R l AND R2. 178 (PROG (RELI 179 ISETO REL (GENSYM1 'RELII 1R0 (PUT REL 'DOMAINS IAPPEN0 (GET R l 'DOMAINS) 181 (GET R2 'DOMAINS)I) U 2 (PUT REL '\u00C2\u00ABTUPLES 01 183 (MAPC ' (LAMBDA ( T H -IS* 1MAPC \u00E2\u0080\u00A2ILAMtfOA [T2I 185 (AOUTUPLE (APPEND (GET T l '0ATA1 186 IGET T2 >OArA)l RED 187 ) 183 (GET R2 'TUPLES!! 189 ) 190 (GET R l 'TUPLES)) 191 (RETURN RELI I 192 ) 193 19* 195 IDEFUN RINIERSECT (LI 196 ; IHIS RfiuTlNE WILL RETURN THE INTERSECTION OF ALL RELATIONS 1 197 (PRCG (RED 193 (SETQ REL (CAR D l 199 (MAPC \u00E2\u0080\u00A2(LAMBDA 1RI 200 ISETO MEL (PROJECT (JOIN REL R '(ECUAL Tl T2>) 201 202 203 20* 205 206 207 208 209 210 211 212 213 21* 215 216 217 218 219 2 20 221 222 223 22* 225 226 227 228 229 230 231 232 233 2 3* 235 236 237 238 239 2*0 2*1 2*2 2*3 24* 2*5 2*6 247 2*8 249 250 251 252 253 25* 255 256 257 258 259 260 261 26? 263 26* 265 266 267 268 269 2 70 271 2 72 273 27* 275 . 2 76 277 2 78 279 280 281 282 283 284 285 2fl6 287 268 289 290 291 (GENLIST IGET REL 'DC.MA INS) 111 (COR L I ) (RETURN RELI) ) (DEFUN RDIFF (Rl R2) ; THIS PROCEDURE RETURNS R l - R2. ISETO REL (GENSYHl 'REL)I (PUT. REL 'DOMAINS (GET R l 'DOMAINS)) (PUT REL ' ) 1 SETDIF ; RETURNS SETI-SET2. (DEFUN SETDIF ISET1 SET2) ; (REVERSE (EXCLUOE SET2 SETUI .(PROG (DIFF) \u00E2\u0080\u00A2 (MAPC \u00E2\u0080\u00A2 (LAMBDA (E) (OR (MEMBER E SET2I (SETO DIFF (CONS E DIFF))) SET1I (RETURN (REVERSE OIFFII ) ; ADCTUPLE -; ADOS THE TUPLE TO THE RELATION, AND RETURNS THE NAME OF THE TU>L\u00C2\u00AB (DEFUN ADDTUPLE (TUPLE RELATION) (PROG (TNAME) IPUT RELATION 'TUPLES (CONS (SETQ TNAME (GENSYMl 'T)J (GET RELATION 'TUPLES))) (PUT REL A r[ON 'HTUPLES (AODl (GET RELATION \"ITUPLESIll IPUT (NAME 'DATA.TUPLE) (RETURN INANE) ) ) 302 303 304 ; CHOOSE 305 ; RETURNS THE PROJECTION OF THE TUPLE ON THE 00\"AINS WHOSE 306 ; POSITIONS ARE GIVEN 6Y THE LIST A. 307 308 309 IOEFUN CHOOSE (TUPLE Al 310 (HAPCAR \u00E2\u0080\u00A2 (i.AM80k (DOMAIN) (ELEM TUPLE DOMAIN!) 311 A) 312 1 313 31* 315 316 I ELEM 317 : RETURNS THE NTH ELEMENT OF A LIST. 318 ; IF POSITION IS A LIST OF DOMAIN NUMBERS, ALL CORRESPONDING ELEMENTS 319 ; WILL BE RETURNED. 320 321 322 (DEFUN ELEM (LIST POSITION) 323 (CONO ( INUMSEHP POSITION) 324 (CAR (NTH LIST POSITION)) ) 325 ( T (MAPCAR ' (LAMBDA (N) 326 (CAR (NTH LIST N)I 327 I . 32B POSITION! I ) 329 I 330 331 ' 332 333 ; GENL1ST 334 ; GENERATES A LIST OF NUMBERS FROM 1 TO THE NUMBER OF OOMAINS 335 336 337 (DEFUN GENLIST IDOMAINS) -338 (PROG I INC) ' 339 (SETQ INC 0) 340 (MAPCAR \u00E2\u0080\u00A2 (LAMBDA IX) (SETQ INC (A001 INCH) DOMAINS) 341 I 342 I 343 i44 345 (OPEN I IMPLODEBUFFER 100)) END OF FILE APPEND I X 8 - RE I-AT I ON A t CALCULUS 137 J . 1DEFUN REDUCE (QUERY) Z i I M S ROUTINE W i l l TAKE A OUERY IN THE RELATIONAL C i U t t . C S * 3 ! AND REDUCE I I TO. A SEOUENCE OF OPERATIONS IN THE R E l i T l ^ i i i \u00E2\u0080\u00A2 * ; ALGEBRA. IT IS HOOELLED ON THE IMPROVED REDUCTION ALiOM\"-!M 5 S OF PALERMO. 6 IPROG (RESULT GLOBAL.RANGEI 7 i FORM THE GLOBAL RANGE FOR EACH VARIABLE IN THE CUc'RY. B (MAPC *(LAMBDA (VARI 9 (PUT VAR *RANGE (SETQ GLOBAL.RANGE (RANGE VAR)1) 10 (CONO ( (NULL IGET GL03AI RANGE TUPLES!) 11 .. (PR1N1 \"'THE GLOBAL RANGE FOR\"! (PRIN1 VARI . . . 12 (PRIN1 '\"IS EMPTY\"! (TERPRII (RETURN NIL)) I 13 I AND TRACE IPRINI '\"THE REDUCED GLOBAL RANGE FOR**) 1*.. (PRINl VARI 1PR1N1 MS:! 15 (PR1 NTREL GLOBAL.RANGEI ) 16 ) IT (GET QUERY \u00E2\u0080\u00A2VARS)) IB ; CONSTRUCT THE COMPONENTS C I I l , EACH DEFINED BY THETA(J) 19 (MAPC '(LAMBDA (THETAII ' 20 (SETQ RESULT (RUNION (LIST (FORH_CI THETAII RESULT.) I) 21 ) 22 (GET QUERY 'THETAII 23 ; APPLY THE OPERATIONS OF DIVISION AND PROJECTION TO THE RESULT 2* ; TO OBTAIN THE RELATION TP. 25 (MAPC \u00E2\u0080\u00A21 LAMBDA (VAR! ) \u00E2\u0080\u00A2 2* (SET VAR (GET VAR 'RANGED 27 (CONO ( (GET VAR -QUANTIFIER) 28 (SETO RESULT (PORD RESULT VAR)) . . . 29 (AND TRACE (PRINTREL RESULT)) 30 (NAPC \u00E2\u0080\u00A2ILAHBOA (VI . ' / ' \u00E2\u0080\u00A2 \u00E2\u0080\u00A2 . 31 (CONO ( (GREATERP (GET V \u00E2\u0080\u00A2STARTS) 32 (GET VAR 'STARTS!) 33 (PUT V 'STARTS (SUB (GET V \u00E2\u0080\u00A2 STARTS) 3* (LENGTH (GET (EVAL VAR) 'DOMAINS)) 1) I I 35 1 36 (GET QUERY 'VARSII > 37 (T (UNEVAL 'MAPC NIL)) I 38 ) \u00E2\u0080\u00A2 : 39 (GET QUERY 'VAR5II \u00C2\u00AB0 8 PROJECT TP ONTO THE DESIRED OUTPUT OOMAINS Al (PRIM \u00E2\u0080\u00A2 \"THE RE5POHSE RELATION IS:-J 42 (RETURN (PRINTREL (PROJECT RESULT (GENTARGET (GET QUERY 'TARGET! I D ) 43 ) I 44 45 46 47 (DEFUN FORM CI (THETAII \u00E2\u0080\u00A2 \" 48 ; THIS ROUTINE HILL FORM THE SUBSET CI DEFINED BY THETAI 49 I IT USES IHE GOLBAL VARIABLE RANGE_L1ST 50 (PRCG (L.OF_CDRES POSS_IERHS USEDVARS CORE J0IN_TERM 51 POSITION CORE.RANGE STARTS RANGE LIST V INO) 52 (SETQ STARTS 1) 53 ) FORM THE REDUCED LOCAL RANGE FROM THE REDUCED GLOBAL RANGE. 54 IKAPC '(LAMBDA (VARI 55 (SET VAR (GET VAR 'RANGE 11 56 (FDRM_RR VAR THETAII 57 (AND'TRACE ' 58 (PRINl \u00E2\u0080\u00A2\"THE REDUCED LOCAL RANGE FOR\"l IPRINI VARI 59 IPRINI \u00E2\u0080\u00A2IN! [PRINl THETAI! (PRINl MS) (PRINTREL IEVAL VARI) ) 60 1 . 61 IGET QUERY \u00E2\u0080\u00A2VARSI I ,62 ; CREATE A LIST OF THE RANGES USED IN THETAI, KITH THOSE HAVING. 63 I THE SMALLEST NUMBER OF TUPLES COMING FIRST. 64 (SETO RANGE.LIST (RANGES THETAII! 65 LOCP (SETQ POSS_TERMS NIL! 66 ISEIQ CORE (EVAL (CAR RANGE_LIST)11 67 (AND TRACE 68 (PRINl \u00E2\u0080\u00A2\"THE FIRST ELEMENT IN THE CORE OF\u00C2\u00BBl 69 IPRINI THETAII (PRINl \"IS:I (PRINTREL CORE) ) . 70 (PUT (CAR RANGE.LIST! 'STARTS STARTS) 71 (SETO STARTS (ADO STARTS 72 (LENGTH (GET (EVAL (CAR RANGE_LIST)I 'DOMAINS)))) 73 (SETO USEDVARS (CONS IUNCENS RANGE.LIST RAMGE.LIST1 M L l ) 7* I CONSTRUCT A LIST OF TERMS WHICH CAN BE USED TO JOIN THE 75 : CORE HUH A NEW RANGE. 76 L0CP2 ISEIQ POSS.TERHS (UNION 11NIERSECT 77 (GET (CAR USEDVARS) \u00E2\u0080\u00A2USEO.IN.TERNSI 78 (GET THETAI \u00E2\u0080\u00A2TERMS 1) 79 POSS.TERHS!! 80 ; IF THERE ARE NO MORE JOIN TERMS CONNECTED TO THE CORE, 81 S START AGAIN. \" 82 (CONO ( (NULL POSS.TERHS! S3 ISEIQ L_OF_CORES (CONS CORE L.CF.CORESII 84 (CO LOOP I! I 85 I NOW THAT WE KNOW WHICH JOIN TERMS CAN BE USED, CHOOSE THE ONE WHICH 86 . : INVOLVES THE SMALLEST RELATION 87 (MAPC \u00E2\u0080\u00A2(LAMBDA (VARI SB (CONO ( (SETQ JOIN TERM (INTERSECT POSS.TERHS \u00E2\u0080\u00A289 (GET VAR 'USEO.IN.TERMSII! 90 (SETO JOIN_TERM (CAR JOIN.TER\u00C2\u00BBll 91 (SETQ RANGE.LIST (DELETE VAR RAKGE_LISTM 92 ISETO POSS.URMS (DELETE JOIN.TERH POSS.TERMS11 93 . \u00E2\u0080\u00A2 \u00E2\u0080\u00A2 (SETQ HEW.RA.'JGE VAR I 94 I FIND ALL OTHER IEHMS WHICH INVOLVE ONLY THESE TWO RELATIONS. 95 (MAPC ' I LAME10A (TERM! 96 ICONO ( (EQUAL IGET TERM 'VARS! (GET JOIN.TERM 'VARS!I 97 ISEIQ POSS_T\u00C2\u00A3KMS (0CLET5 TERM pnSS.TERNSII 98 ISETtJ JDIN.IERH (LIST ' ANO JO I N.TERM TERM) ) )) 99 I I CO POSS TERMS I 101 (COND ( IEQ (CAR ICET JCIN.TERH 'VARS!) NEW.RANGE) 102 (SETO CORE.RANGE ICACR (GET JOIN.TERM 'VMS))) I 103 ( T (SETO CORE.RANGE (CAR (GET JOIN.TERM 'VARSIIM \u00C2\u00BB 10* IPUT VAR 'STARTS STARTS) 105 (SETQ. STARTS (ADD STARTS (LENGTH IGET (EVA'. VAR) 'DOMAINS)))) 10b \u00E2\u0080\u00A2 (SETQ USEDVARS ICONS VAR USEDVARSII 107 (UNEVAL \"MAPC N I D I ) 108 I 1G9 RANGE.LISTI 110 I JOIN THE CCRE TO THE NEW RANGE 111 (SETO CORE (JOIN CORE (EVAL KEW.RANGEI 112 (FIX* (EVAL JOIN.TERM! CORE.RANGE ' T l NEW_RANGE 'T2\u00C2\u00BB)l 113 (AND TRACE IPR1N1 '\"JOINING CCRE WITH\") 11* (PRIN1 NEW.RANGEl (PRIN1 'YIELDS:! 115 (PRINTREL COREI I 116 ; SEE IF WE ARE DONE 11? (CCNO I RANGE_L1ST (GO L0OP2II 118 ( L.OF.CORES (RETURN (RCROSS (REVERSE L_OF_CORES)))) 119 ( T (RETURN CCREI)I 120 )) 121 122 123 12* (DEFUN FIX\u00C2\u00BB IJOIN_TERN CORE.RANGE T l OTHER.RANCE T2I 125 ; THIS ROUTINE WILL CHANGE ALL OCCURRENCES OF (OTHER.RANGE Nl 126 ; IN JOIN.TERM (0 (ELEM T2 N), AND ALL OCCURRENCES OF 127 : (CORE_RANCE Nl TO (ELEM T l CORE-NI. 128 (COND I (NULL JOIN_TERHI NIL! 129 ( (ATOM (CAR JOIN.TERM)) 130 (FIX* (CDR JOIN.TERM) CORE_RANGE T l OTHER.RANGE T2)) 131 I (EO (CAAR JOIN.TERM) OTHER.RANGE! 132 1RPLACA JOIN.TERM (LIST \u00E2\u0080\u00A2ELEM 133 T2 13* (NEWDCMAIN OTHER.RANGE (CADAR JOIN.TERM))I) 135 (FIX* (CDR JOIN.TERM) CCRE.RANGE T l OTHER.RANGE 12)1 136 .1 (EO (CAAR JOIN.TERM! CORE.RANGEl 137 (RPLACA JOIN.TERM (LIST 'ELEM T l 138 ISUB1 (ADD (GET CCRE.RANGE \u00E2\u0080\u00A2STARTS) 139 (NEWDOMAIN CORE.RANGE (CADAR JOIN.TERMII ) ) ) ) 1*0 (FIX* (CDR JOIN.TERMI CORE.RANGE Tl OTHER.RANGE T2)I 1*1 ( T IFIX* ICAR JOIN.TERM) CORE.RANGE T l OTHER.RANGE T2) 1*2 (FIX* (CDR JOIN.TERM) CORE.RANGE T l CTHER.RANGE T2) 1*3 )) 1** JOIN_TE\u00C2\u00BBM _ 1*5 ) 1*6 1*7 1*8 1*9 IDEFUN PORD (REL VAR) 150 ; THIS ROUTINE WILL EITHER PROJECT REL ON ALL ITS DOMAINS. 151 1 EXCEPT THE ONES CONTAINED IN THE RELATION SPECIFIED BY VAR, 152 ; OR IT WILL DIVIDE REL BY THE RELATION SPECIFIED BY VAR. 153 IPRGG (DO\u00C2\u00AB\u00C2\u00BBS POOMKS OOMAIN*S) 15* (SETQ DOHA I MAS (GENLIST IGET REL 'DOHAINSlll 155 ISETO COM\u00C2\u00BBS (MAPCAR \u00E2\u0080\u00A2ILAMBDA (D\u00C2\u00BB) 156 (ADO ISUB1 Dl) (GET VAR \u00E2\u0080\u00A2STARTS)) 157 ) 158 (GENLIST I GET (EVAL VAR) 'OOHAINSI! ) ) . 159 (COND ( (EC (GET VAR \u00E2\u0080\u00A2QUANTIFIER) 'EXISTS) 160 (MAPC \u00E2\u0080\u00A2ILAMBDA (X) 161 (CR (MEMO X DONAS) 162 (SETO POOHJS (CONS X PDOMISIII 163 ) 16* OOMAIN\u00C2\u00ABS) 165 (AND TRACE (PRINI '\"PROJECTING OFF\"! (PRINl VAR) 166 (PRINl 'YIELDS:!) 167 (RETURN (PROJECT REL (REVERSE POOM\u00C2\u00BBS))l ) 168 . I T (AND TRACE (PRINl \"DIVISION BY\"! (PRINl VAR) 169 (PRINl 'YIELDS!)) . 170 (RETURN IRDIV1DE REL 171 \u00E2\u0080\u00A2 (EVAL VARI 172 DONAS 173 (GENLIST IGET (EVAL VAR * -DOMAINS!) I I ) ) 17* ) 17S ) 176 177 178 179 IDEFUN FORM.RR (VAR THETA I) 180 ; THIS ROUTINE FORMS THE REDUCED LOCAL RANGE FOR VAR IN 181 ! THETA!. THIS AMOUNTS (0 RESTRICTING VAR TO THOSE CASES 182 I OEFINEO BY TERMS IN THEUI WHICH CONTAIN ONLY 1 TUPLE 183 : VAR1ABLE. 18* IPRCG (V) 185 IMAPC \u00E2\u0080\u00A2ILAMBDA (TERMI 186 (CC.ND ( IANO (EO (LENGTH (GET TERM 'VARS) I I) 187 (EQ (CAR (GET TERM 'VARSII VAR)) 168 (SETQ V (CAR (GET TERM 'VARSDI 1B9 (SET V (RESTRICT (EVAL V) 190 (FIX* (EVAL TERM) NIL NIL V ' T i l l ) 191 (PUT THETAI 'TERMS (DELETE TERM (GET THETAl 'TERMS))! 192 )) 193 ) 19* (GET THETAI \u00E2\u0080\u00A2TERMS)) 195 ) 196 I 197 139 198 199 2C0 201 (DEFUN GENTARCET ITARGET.L1ST) \"2 ! UPON W\u00C2\u00BB?CH\u00C2\u00B0T\u00C2\u00BB- \u00C2\u00BB!IoRN\"S * L U T \u00C2\u00B0f T H E DOMAIN NUMBERS 20, JNA^ VM'^ MB\u00E2\u0084\u00A2: nt^,*11\"1\u00E2\u0084\u00A2 shdulu be -roje\"\u00C2\u00AB-20* (LIST (AOO (NEHDOMAIN (CAR ELEMENT > (CADR ELEMENT)) 205 (SUBl (CET ICAR ELEMENT) \u00E2\u0080\u00A2STARTSI11 I 206 ) 207 TAROET_LISTI 208 ) 209 210 211 . . 212 (DEFUN CRDER (VARS) 213 ! THIS ROUTINE TAKES A LIST OF VARIABLES. AND ORDERS THEM 214 ; WITH THE ONES WHICH HAVE THE SMALLEST RANGES COMING FIRST. 215 IPROC (ANS LEN L) , 216 [SETO ANS (LIST (CAR VARS))) ' 217 (MAPC '(LAMBDA (V) 218 (SETO LEN (GET (EVAL V) \u00E2\u0080\u00A2\u00E2\u0080\u00A2TUPLESI) 219 (COND ( (NOI (GREATERP LEN (CET (EVAL (CAR ANS)) 'STUPLES) ) l 220 (SETQ ANS (CONS V ANSI) ) 221 f (NOT (LESS? LEN IGET (EVAL (CAR (LAST ANS))) '(TUPLES) )) 222 (SETQ ANS IAPPENO ANS (LIST VI)) 1 ' 223 . I T (SETO L ICONS NIL ANSI) 22* IMAPC \u00E2\u0080\u00A2I LAMBDA IELT) 225 (C.ONO ( (LESSP LEN (GET (EVAl.ELT) 'ITUPLES)) 226 IRPLACD L ICONS V (CDR D l ) 227 IUNEVAL 'MAPC NIL)) 228 ( T (SETQ L (CDR D l ) ) 229 I 2 30 ANSI I I 231 I 232 (COR VARS)) 233 (RETURN ANS)) 23* J 235 * . -236 237 238 . (DEFUN NEWDOMAIN (VAR OLDf) 239 ; GIVEN A VARIABLE. AND A DOMAIN NUMBER IN THE GLOBAL RANGE 2*0 : FCR THAT VARIABLE, THIS ROUTINE WILL RETURN THE DOMAIN NUMBER 2*1 I OF THE SAME DOMAIN IN THE GLOBAL OR LOCAL RECUCEO RANGE. 2*2 (COND I (NUHBERP OLD\u00C2\u00ABl 2*3 (SUB (LENGTH I GET VAR 'REFOOHAINSI I 2** (LENGTH ICDR (MEMO CLC\u00C2\u00BB (GET VAR 'REFOOHAINS)11 111 2*5 ( T (MAPCAR 'ILAMBDA (N) 2*6 (SUB (LENGTH (GET VAR 'REFOOHAINS)) 247 (LENGTH (COR (MEMO N (GET VAR 'REFD0MAINS11 111 2*8 . I 249 OLDAll I 250 ) . ' \u00E2\u0080\u00A2 251 1 252 253 25* (DEFUN RANGES (THETAII 255 -, THIS ROUTINE WILL RETURN A LIST OF THOSE RANGES USED IN \u00E2\u0080\u00A2 256 ; THETAI. WITH THOSE HAVING THE SMALLEST NUMBER OF TUPLES 257 ; APPEARING FIRST. 258 (PRCG (RANGE.L1ST) 259 (MAPC '(LAMBDA IT) 260 (SETQ RANGE.LI ST (UNION IGE7 T \u00E2\u0080\u00A2VARS) RANGE.LISTI) 261 > 262 [CET THETAI 'TERMS)) 263 (RETURN (ORDER RANGE.LISTI) 264 I 265 ) 266 267 268 269 (OEFUN RANGE (VAR) 270 ; THIS ROUTINE WILL CREATE THE REDUCED GLOBAL RANGE FOR VAR. 271 (PRCG (FORM UN) 272 (SETQ FORM (GET VAR 'RANGE 11 273 (COND ( (ATOM FORM.) (SETQ UN FORM)) 27* ( (EQ (CAR FORM) -AND) 275 (SETO UN (EVAL (PRCCESS.ANO (CDR FORM) VAR))) ) 276 ( IEO (CAR FORM! 'OR) 277 (MAPC '(LAMBDA ICONJI 278 (COND JIATCM COHJI (SETO CONJ ILISI CONJI)) 279 I T ISETO CONJ ICDR CONJI11 ) 280 - ISETO UN (RUNIGN IL[ST (EVAL (PROCESS.AND CG.U VAR)) 281 UNDI 282 .' ) 283 (COR FORM)) ) ,\u00E2\u0080\u009E ,\u00E2\u0080\u009E\u00E2\u0080\u009E 1 1 ) ) 3C8 IMAPC '(LAMBDA (T) 309 (SETQ FORM (LIST 'RDIFF FORM (LIST 'UUOTE TMI 310 I 311 DIFF.TERMS) 312 (COND ( (GREATERP (LENGTH REST TERMSI 0) 313 (SETO FORM ( L I S r 'RESTRICT FORM (LIST \u00E2\u0080\u00A2QUOTE (FIX* (CONS 'AND REST_TERMSI 314 NIL NIL 315 VAR ' T l l ) l ) 316 IPUT VAR 'REFDOMAINS SAVE) )) 317 IRETURM FORM)) 318 ) END OE FILE A P P E N D I X 9 - M I C R O - P L A N N E D ROUTINES . \u00E2\u0080\u00A2 1 4 1 1 (IHCON5E L I S P ! IX R Y l (PROVE* J7X J7R J7Y) 2 . (THOO (THANU TRACE 3 IPRINI* '\"ATTEMPT 10' PROVE\"! * (THCOND ( ITHASVAL J7X) (PRINl* J7XI) (T (PRINl* ' - ^ X l l I 5 (THCONO ( (THASVAL J?RI (PRINl* J7R1I (T IPRINI* \";7R)> I 6 (THCOND I IThASVAL IVY! (PR(N1* J7Y)) IT (PRINl* \"$JYI\u00C2\u00BB J 7 (TERPRIII I (THOO ITHFINO 1 J7REL (RED (JO (PROVE J7X J7R J7Y J7REL1 ITHUSE PRCVE-X-R-VJ) I 6 9 10 11 12 (THCONSE PROVE-X-R-Y ( X R Y REL VALIO RESULT USED) t PROVE S?X J7R 17Y J7REL) 13 (TMSETQ J7USED (LIST NILII I* (THCO (SETO (RESTRICT* NIL HEXTRA! NIL)) 15 (THCONO ( (THANO (JG (J7X J7R J7Y J7RED I I \u00C2\u00AB (ACTIVE \u00C2\u00BB7REL) IT)) 16 [JG (UPOATE-REST 5?X J7R t7Y J7RELI JT) IT (THSETO J7RESULT IJG IF-ALL J7X *7R \u00C2\u00BB7Y> (THUSE FINO-RELS))) 18 (SETQ IRESULTI J7RESULT) ) 19 I ISO (J7X \u00C2\u00BB7R J7Y J7REL) IT) 20 (SETQ \u00C2\u00BBRESULT\u00C2\u00BB THVALUE) ) 21 I (THANO (THSETQ J7VALID T l 22 IPRINT* '\"ABOUT TO TRY FOR A NON-VALID PATH\") 23 ITHSETO 57USEO ILIST NIL)I 2* (THOO (SETQ ((RESTRICT* N I D I 25 (JG (t?X J7R $7Y J7RELI *T) 26 (SETQ IRESULT* THVALUE) ) >) 27 (UPDATE-RESULT) 28 (SETO f RESULT A (LIST (LIST \u00C2\u00BB7X (CAR IRE5ULTO) 29 (LIST J7Y (CAR (LAST ARE5ULTJ)I I 30 \u00C2\u00BBRESULT\u00C2\u00AB!> \u00C2\u00BB\u00E2\u0080\u00A2 (THCOND ( fEXTRA! (SETQ KRESULTI (APPEND 4EXTRA8 \u00C2\u00BBRESULT\u00C2\u00ABUI ( (THSUCCEED)) ( 3 32 33 ITHFAIL ThioREM) 37 (THCONSE RELATE-X-Y (X Y R R2 RP A REL REL2 REL3 VAR A2 RELATION RES) 38 (J7X J7R $7Y J7RELATI0NI 39 (THPROG (LHS RHSI AO (THCONO ( (THAND (THASVAL J7XI ITHASVAL J7V) > A l (THCONO ( (MEMBER (LIST J7X J7Y) J7USE0) ITHFAIL THEOREM)) 42 I ITHVSETO J7USEP ICONS ILIST J7X J7YI S7USE0I) I I I 43 t (THSUCCEED)I ) 4* (THCOND ( (THNOT (THASVAL \u00C2\u00AB?R)I 45 ITHSETO J7VAR T>) 46 ( (THSETQ J7R2 J7R>) ) 47 (THCR (THAND (THASVAL *7YI 48 (THOR (THAND IJG (S7X J7R J7A J7REL)) IJG (OX J7R) JTI 49 (JG (ACTIVE J7REL) \u00C2\u00BBT) . ' 50 (JG (UPDATE-REST S7X J7R J7A J7RELI IT) ' 51 (JG (F-ALL J7X J7R J?A> ITHUSE FINO-RELS)) 52 (SETO \u00C2\u00ABRESULI\u00C2\u00BB THVALUE) (THSETQ J7LHS JRESULTt)) 53 (THANO ITHNOT ITHAND t JG [J7A2 J7R2 J7Y J7REL3I) ( J 6 (OX J7R2I ATI I) 54 (JG (J7X J7R J7A J7REL) *T) 55 (SETO HRESULT* THVALUE) 56 'ITHSETO J7LHS IRESULTO 57 (IG IACTIVE i7R6L) JT) 5S (SG (OX J7R) IT) ) ) \u00E2\u0080\u00A2' 59 (THSETQ J.7LHS (APPEND t?LHS (LIST t ? A ) ) ) 60 (PRINT* (LIST J7REL 'SAYS J7X J7R J7A>> \u00E2\u0080\u00A26t ITHCCNO ( (THASVAL !?YI (NOT (EO J7Y J7A1I 62 (PRINT* (LIST '\"TRY TO RELATE\" J7A 'AND J7Y)> I) 63 (THCOND I (THAND (SG (J7A J?\u00C2\u00ABP J7Y J7REL2)) ItG (0KJ7RP) JT) (JG (ACTIVE S7REL2) t i l l 64 (tG (UPPATE-REST J7A \u00C2\u00BB?RP J7Y J7REL2) ST) 65 (JG (F-ALL J7A J7KP J7YI (THUSE FIND-RELS>1 66 \u00E2\u0080\u00A2 (SETQ \"RESULT* THVALUE) (THSETQ *7RH5 (RESULTH 6T (PRINT* ILIST J7REL2 'SACS 17A J7RP t7YII I 68 I (THANO IJG (J7Y J7P.P J7A J7REL2II l!C (OK S7RP) JT) (IG (ACTIVE J7REL2) JTI) 69 IJG (UPOATE-REST J7Y J7RP J7A A7REL2) I t ) 70 (*G (F-ALL J7Y I2RP J7A) (THUSE FINO-RELS)) ? l (SETQ 4RESULTJ THVALUE) (THSETC J7RHS IRESULTA) T2 (PRINT* (LIST J7REL2 'SAYS J7Y J7RP J?A)I 1 73 ( (IH4N0 (JG (J7A J7KP J7Y J7REL2) JT) (JG (OK S7RP) JT) (JG (ACTIVE J7REL2) t i l l 74 (SETQ IRESULTI THVALUE) 75. . ( ThSETO J7RHS \u00C2\u00ABRESULUI 76 (PRINT* (LIST J7RFL2 'SAYS \u00C2\u00BB?A J7RP J7YI) ) 77 ( (THA-JO IIC (\u00C2\u00AB?Y J7RPS7A J7REL2) JT) I JG (OK J7RPI JT) (JG (ACTtVE J7REL2I JT)I T8 (SETO *RESULTI IHVALUE) 79 (IHSEIQ JTRHS ARESULTJI 60 (PRINT* ILIST J7REL2 'SAYS J?Y S7RP J7AI) )) 81 (THOR (\u00C2\u00BBG (\u00C2\u00BB7A AND S7Y ARE T-TRANS IN J7REL2I) 81 IJG IJ2REL2 IS T-TRANSI) 83 (THANO (TITASVAL J7VARI (THSETQ J7R (GENSYM MSRI)) \u00C2\u00BB . \u00E2\u0080\u00A24 (THOR (JG (J7A DETERMINES S7X J7REL) ) '\u00E2\u0080\u00A2 85 IJG (J7A DETERMINES J7Y J7REL2)) 86 (THASVAL J7VALI0) ) 87 ITHSETO J7VAR (GENSYM)| 88 (JA [J?X J?R I7Y S7VAPI) 89 (THCOND ( (THAND IJG IJ7A OETERMIKES J7X J7R.ELII 90 IJG IJ7A OETERMINES \u00C2\u00BB7Y J7REL2II ) 91 (JA (J7X DETERMINES J7Y J7VARI) 92 (JA ( J 7Y OE fERMINES J7X J7V^RIl ) ^ 93 ( (THSUCCEEOl) I 94 (THSETQ J7RELATI0N J7VAR) 95 ) 96 ITHANO (IHASVAL S7XI (THCONO ( ( THAND ( JG (J7A J7R J7Y J7REDI ( IC (ACTIVE J7REL) JT)) (JG (UPOATE-REST J7A J7R J7Y J7RELI J11 (JG (F-ALL J7A \u00C2\u00BB7S J7YI (THUSE FINO-RELSII I SETQ \u00C2\u00BBRESULT\u00C2\u00BB THVALUE I (THSETQ J7RHS (itESULTII) 97 98 99 ICO 1U2 101 I (THANO (SG (S7A 17R 17V 17RELI IT) ( i t (ACTIVE S7REL) IT11 102 (SETO 'RESULT! THVALUEI 103 (THSETO S7RHS IRESULTf) )) 104 (THSETQ 17RHS ICONS \u00C2\u00BB7A \u00C2\u00BB7RHSII ICS (tG (OK S7RI tT) IC6 (PRINT* 1L(ST 17REL 'SAYS VTA 17R 17Y)J 107 (THCOND ( (THASVAL 17X1 (NOT (EO S7X \u00C2\u00BB?AI) IOB (PRINT* HIST '\"TRY TO RELATE* S7X \u00E2\u0080\u00A2 AND S?A>) 1) 109 ITHCONO ( (THANO (tG [ 17A l?RP t7X J7REL2II (\u00C2\u00ABC (OK S7RP1 IT 1 UG (ACTIVE 17REL21 JTI1 \" 110 [IG l UPDATE-REST S7A 17RP i?X S7REL2) ATI 111 (*C (F-ALL 17A 17RP 17X) (1HUSE FIND-RELS>) 112 (SETO 'RESULT! THVALUE I (THSETO \u00C2\u00BB?LHS 'RESULT*) 113 (PRINT* (LIST J7REL2 'SAYS t7A S7RP J7XII I 114 ( IThAHD ItG 1111 17RP \u00C2\u00BB?A 17KEL2I1 (tG (OX 17RP) IT) (*G (ACTIVE J7RE12) ATI) 115 (JG I UPDATE-REST 17X *7RP 17A 17REL2) IT) 116 (>G (F-ALL S 7X t?RP S?A) ITHUSE FINO-RELS)) 117 (SETQ \u00C2\u00BBRESULT\u00C2\u00BB THVALUE) (TWSeTQ S7LHS SRESULT*) II S . (PRINT* (LI ST t?REL2 'SAYS t?X 17RP I7A1I ) 119 ( (THAND UG (S1A 17RP 17X J7REL2) *T> ItG (OK *7RP> \u00C2\u00BBT> UG (ACTIVE 17REL2) \u00C2\u00BBT)> 120 (SETO 'RESULT* THVALUEI 121 (THSETO 17LHS IRESULT*I 122 (PRINT* (LIST t?REL2 'SAYS 17A S7RP \u00C2\u00BB7XII. I 123 I (THAND UG U7X S7RP *7A S7REL2I JT) (IG (OX 17RP) ATI IIG I ACTIVE 17REL2) $ T I I 124 ISETO 'RESULT*.THVALUE) 125 (THSETO t?LHS 'RESULT*) 126 (PRINT* (LIST S7REL2 \u00E2\u0080\u00A2SAYS 17X *7RP $7A))) I 127 ITHOR ISG (S7A AND t7X ARE T-TRANS IN S7REL2M 128 (SG (S7REL2 IS T-TRANS)I 129 (THAND (IHASVAL J 7 VAR) I THSETO *7\u00C2\u00AB (CENSYN 'ISR>)> I 130 (THOR ISG (S7A DETERMINES S7Y S7RELI) 131 (SG (S?A DETERMINES >?X S7REL2I) 132 (THASVAL S7VAL1D) ) 133 (THSETQ S7VAR ICENSYMI) 134 l*A (S7X t?R 17Y S7VARI) 135 ITHCONO I (THAND I IG U7A DETERMINES 17Y \u00C2\u00BB7RELII 136 (tG (S7A DETERMINES t?X S7REL2I) ) 137 (*A (t7X DETERMINES t7Y S7VAR1) 138 (SA (t7Y DETERMINES l ? X t7VARll I 139 I (THSUCCEEO)) I 1*0 (THSETO \u00C2\u00AB7R A7R2I 1*1 (THSE70 17RELATI ON t7VAR) ) ) 1*2 (THSETO 17RES (APPEND 17LHS t7RHS>) ) 1*3 (THRETURN A7RES) 1** > 1*5 1*6 1*7 1*8 (THCONSE LISP3 [X RI Y R2 2) IPROVE* *7X \u00C2\u00BB7Ri t?Y t?R2 1721 1*9 (THCO ITHAND TRACE 150 (PRINl* '-ATTEMPT TO PROVE\") 151 (THCONO ( (THASVAL J7XI (PRINl* 57X11 I T (PRINl* '*7X)I ) 152 ITHCCND I (THASVAL t ? R l ) (PRINl* t 7 R l l l ( T (PRINl* \u00C2\u00AB17R1)) f 153 (THCONO I (THASVAL i?Y) (PRINl* \u00C2\u00BB7Y)I [ T (PRINl* '17YH ) 15* (THCOND I (THASVAL 17R2I (PRINl* t?R2>> ( T (PRINl* '\u00C2\u00BB7K2>) ) 155 (THCCNO ( (THASVAL 172) (PRINl* \u00C2\u00BB?Z)> ( T (P R I N l * \u00E2\u0080\u00A2*7ZI) ) \u00E2\u0080\u00A2 156 (TERPRI) II 157 ITHCO ITHF1ND 1 \u00C2\u00BB?R IR1! (SO IPROVE \u00C2\u00BB7X t ? R l $7Y 17R2 17Z \u00C2\u00BB7R) (THUSE PROVE-X-R-Y-158 I) 159 160 161 182 183 18* R2-2>)>. 162 (THCONSE PR0VE-X-R-Y-R2-Z (X R l Y R2 2 REL XVAR WAR ZVAR VALID RESULT USED) 163 (PROVE S7x A7R1 \u00C2\u00BB7Y 17R2 t7Z S7RELI 16* (THOO (SETC (SESTR1CI* NIL SEXTRAA NIL)) 165 ITHCK (THNOT (THASVAL 17X1) (THSETO 17XVAR 17X1) 166 (THCR (THNOT ITHASVAL 17YI I (THSETO 17YVAR t7Y)) 167 (THOR (THNOT (THASVAL 17211 [THSETO 17ZVAR S7Z)> 168 (THCOND I (THAND (SG l t ? X 17R1 t7Y t?R2 172 17RELII I tG (ACTIVE 17REL1 1TI) 169 ITHSETO 17RESULT IJG IF-ALL 3 t7X 17RI \u00C2\u00BB?Y J7R2 17ZI (THUSE F3RELS1II 170 I THSETO S7YV.AR (CAR S7RESULT)) IT' (SETO fRESULTi 17RESULT) ) IT2 ( (IS (17X S7R1 S?Y 17R2 172 S7REL) (THUSE KELATE-X-Y-2)) 173 (PRINT* (SETO JRESULT* THVALUE)I) I IT* (SETO \u00C2\u00BBRE5ULT* (REVERSE *RESULT*>I 175 (UPDATE\u00E2\u0080\u0094RESULT) 176 (SETQ iRESULTf (LIST (LIST 17X S7XVAR) 177 (LIST \u00C2\u00AB7Y 17YVAR) 178 (tIST 172 S7ZYAR) 179 \u00C2\u00BBRESULT\u00C2\u00BBI> 180 (THFAIL THEOREM) 181 ) III , ~ x R l 7 R A r ; 7 X Y Y , 7 R 2 X J 2 i , 7 R E L R A Z r , ^ , \" E L 2 ' \" \u00C2\u00AB \" \" \" \u00C2\u00B0 \u00C2\u00BB \u00C2\u00BB \u00C2\u00BB . \u00C2\u00BB \u00C2\u00BB \u00C2\u00AB RP, 187 ( THCR [THNOT ITHASVAL 17R1II (THSETO t7VARl T i l III I T H 4 S V 1 1 - ' TH SE TO ,7v\"2 H 189 (THOR (THANO l\u00C2\u00BBC U7A 17R1 17Y t7R2 t;Z 17REL *!? 1\u00C2\u00BBC (ACTIVE 17RELI t i l 19R I 0 I T M ' 0 T '50 172VAR. t ? Z ) l (THSETO 172VAR (CAR H a m (THAND ! ^ . r a l ' t ; . % ' ; : r . , T i ^ \" \u00C2\u00BB 2 0 0 (tG [ACTIVE 17RELI ST) 201 , 0 , ' S E T 0 \" R E S U L T * T H V A L U E I I T H S E T O t ? L H S fRESULT*I 202. ! \u00C2\u00ABG , I \" L , . T E : ! : 1 . ' ' r i ! M I T \u00E2\u0084\u00A2\u00C2\u00AB \u00C2\u00BB-\u00C2\u00AB-\u00C2\u00BB\u00C2\u00BB 201 206 207 208 205 \" ITHSETQ t7IIEM 21 206 ITHSETO 1 7 R H 5 (JG IF-ALL3 J7>r \u00C2\u00BB7R1 J7A J7R2 J72I ITHUSE F3SELSDI 207 I THOR ITHfiOT IfO S7XVAR 47X11 ITHSETO 17XVAR (CAR 17RKSI1I 206 ITHCR HHrtDT tEC J77VAR 172)1 tThSETC J77.VAR (CAR JJRHS1I) I 209 (THAND (SO (5?X 17R1 J7Y J7R2 J7A J7KELI) 210 ItG IIC11VE J7REL) JT) 211 IJG (RP.! AiE-X-A t7Z 17AI ITHUSE R-K-All 212 (SETO ' ITHUSE R-X-A)) 226 (SETO \"RESULT! THVALUE) (THSETQ A7LHS 6RESULT0I 227 . . (THOR I THNOT (EQ I7YVAR J7YII (THSETO tTYVAA (CAR J 7 L H S H I 228 (tG 1 RELATE\u00E2\u0080\u0094X-Y-A J7X t7Rl. J7A i7R2 t?Z) ITHUSE R-X-V-AII 229 (SETQ \"RESULT* THVALUE) (THSETQ t7RHS \"RESULT*I Z3Q .(THSETQ J7ITEM 21 > 231 \u00E2\u0080\u00A2 ITHANO ItG IRELATE-X-A J72 J7AI (THUSE R-X-All Z32 - (SETQ \"RESULT* THVALUE) (THSETQ A7LHS fRESULT*) 233 (THOR [THNOT (EQ J7ZVAR S?7.11 [THSETC J7Z-VAR (CAR J7LMSIII 23* (tG IRELATE-X-Y-A t7X J7R1 t7Y t7R? J7A) (THUSE R-X-r-AII 235 . (SETC \"RESULT* THVALUE) (THSETQ S7RH&' (RESULT*,) 236 ITHSETO J7!tEM 3) ) 23T (THAND (tG \"(t?X I7RI t7Y A7REL) *I\u00C2\u00BB . 238 :(SETO \"RESULT* THVALUE) 239 (THSETQ tJLHS *RESULT\u00C2\u00BB) 2*0 ITHSETO S70SFQ (LIST WILII 2*1 . I\u00C2\u00BBG ['UPDATE-REST J7X S7RI S7V \u00C2\u00BB7*ELI ATI 2*2 . (AG [ACTIVE 17RELI \u00C2\u00ABTI . 2*J (tG 1J7X S7R2 J7Z J7REL2) ATI , 24* (THSETO t7USE0. ILIST NILII . 2*5 (SETQ \"RESULT* THVALUE) 2*6 (tG (UPCAIE-REST I7X S7R2 J7Z J7REL21 t f l 2*7' [IG IACTIVE 47HEL2) t i l >\u00C2\u00AB8 ITHSETQ J7RHS (LIST J7X *RESUIT\u00C2\u00AB)I 2*9 (THOR ItG IJ7X DETERMINES t7T S7REL)I 250 . (tG (t7X DETERMINES J7Z S7RELZII 251 (THASVAL. t7V*L10l II 252 (THANO (tG IJ7X J7RI S?Y I7RELI *T< 253 (THSETO J7USE0 ILIST MIDI 75* (SETQ \"RESULT* THVALUE) 255 (IG (UPDATE\u00E2\u0080\u0094REST f7X t 7 R l STY S7REL) *t\u00C2\u00BB 256 ItG lACTtvf J7REL) AT) 257 (THSETC I71.HS \".RESULT*! 258 (THOR (SG (S?X DETERMINES t7Y t?REL)) (THASVAL tlVALIBM 259 ItG l t ? Y t?R2 172 I7RLL2I \u00C2\u00BB U 760 ITHSETQ 17USED (LIST NIL)) 261 ISETO \"RESULT* THVALUE) J6Z ItG (UPDATE-REST t7Y t7R2 J72 S7REL2) ATI 263 \u00E2\u0080\u00A2 (tG (ACTIVE J7REL2) t f l 26* (THSETO 17RH5 ILIST t7Y \u00C2\u00ABR\u00C2\u00A3SULT*)I 265 (THOR ItG II7Y DETERMINES J7Z J?XEL2(1 (THASVAL \u00C2\u00BB?VAL10I\u00C2\u00BB 266 ITHSETC 17YVAR IAPPEND (CAR S7LNSI (COR (LAST J7RHSI111 I 267 I 268 (THCOND ( (THNOT (THASVAL S7ITEM)) ITHSuCCEEDII 269 [ (EQ 171 TEH 1) 270 (THOR (tG (S7A DETERMINES V7X 17REL2)) 271 [THANO ItG (i?A DETERMINES t7Y 17RELII 272 (IG (J7A DETERMINES 172 17REL)1 I I 273 (THOR ItG U7REL2 IS T-TRANS)I 27* (THANO (THASVAL I7VAR1I (THSETQ S?Rl (GENSYM \"ISRIl 275 IIC II7A DETERMINES J7Y J7P.F.LII 276 [THASVAL t7VAR2l 277 (THSETQ 17R2 (GENSYM \"VSR 11 I I ) 278 ( (EO t7IIEM 2) 279 (THOR (IG I57A DETERMINES t7Y J7RFL2II 2P,0 (THANO (IG II7A CETERHINES 17). S7RELI1 281 IJG lt7A DETERMINES 172 S7RELII I I 282 (THOR (tG IS7REL2 IS T-IRANSII 2E3 [THAND [ THASVAL >7Y.\R1> ITHSETO S?Sl (GENSYM MSRII 28* ItG (17A 0ET6RMINES l?l t T R E l l l 285 (THASVAL A7VAR2I 236 ITHSETO t?R2 (GENSYM MSRII I) I 237 ( (EQ 17IIEM 31 . ' . 268 (THOR IIG U7A DETERMINES t?l S7P.H2II 2S9 (IHANO (IG (t7A CClERMlNfS t?X i??.\u00C2\u00A3LU 290 l\u00C2\u00BBG 117 4 DETERMINES t7Y 57RCLH ) I 291 (THOR ItG I17REL2 IS T-TRAWSII \u00E2\u0080\u00A2292 ' ITHAND (THASVAL I7VAR2I 293 ITHSETO t?REL2 IGENSYK 11 SRI) I I I 2V* I 2'!5 (THSETO t7A (GENSYHl) 7/96 (SA (S7X 17R1 t7Y 47R2 S7Z t7AI) 297 (THSETQ ITRELAIION t7A\u00C2\u00BB 298 \u00E2\u0080\u00A2 (IHSUCCEED THEOREM (APPCNO A7LHS t?RHS) I 299 I 300 301 302 303 304 jos 306 307 308 309 310 311 312 313 31* 315 316 317 318 319 320 321 322 323 32* 325 326 327 32S 329 . 330 331 332 333 33* 335 336 . 3 37 338 339 3*0 3*1 3*2 3*3 3\u00C2\u00AB* 3*5 3*6 : 3*7 3*8 3*9 350 351 352 353 35* 355 356 357 358 359 360 361 362 363 36* 365 366 367 368 369 370 371 ' 372 3 73 3 7* 375 376 377 378 379 380 381 382 383 36* 385 3t6 387 3\u00C2\u00AB8 389 370 391 ITHCONSE ENUMERATE-DOMAIN (X RESULT RI 1 ENUMERATE J7XI (THCONO 1 ItG l t ? R ENUMERATES t7XI) ItG IACHVE t?RI \u00C2\u00BBTI ITHSETO 17RESULT (LIST 17RU) I ($G (t?R P-ENUHERATES J7X)> (THSETQ t 7RESULT (THE I NO ALL J7RELS (RELSI ITHANO ItC (\u00C2\u00BB7RELS P-ENUMERATES \u00C2\u00BB7X)I (\u00C2\u00BBG (ACTIVE J7RELS) *T11) I) ( ITHSETO J7RESULT (THF[ND ALL S7RELS (RELS A) ITHAND (THOR (tG (\u00C2\u00BB7X t7R J7A S7RELSI) (tG IJ7A t?R S7X S?REL51>> (tG (ACTIVE J7RSLS) tT>>) )) (SETQ \"RESULT! (LIST (LIST t?X J7RESULT) (LIST S7RESULT))) ITHCONSE VAL10-PRE0ICATE IR) ICX J7R) (ThNOT (tG (EXTRA t7R>)) I (THCONSE ACTIVE-RELATION (RI (ACTIVE *?RI (THNOT (tG (J7R IS INACTIVE))) ) ITHCONSE CHECK-RESTRICTIONS IX Y R REL NOS\u00C2\u00BB \u00E2\u0080\u00A2 (UPOATE-KEST t7X t7R t7Y t7REL) (THCOND ( (IHSETQ t7N0S [THF1 NO ALL (t7X STY J7N0I (NO) (SG (J7K IN &7REL IS RESTRICTED TO t 7 N 0 ) ) l I) I (THSUCCEED THEOREM)) 1 ~ \u00E2\u0080\u00A2 (THCO (THANO TRACE (PRIN1\u00C2\u00BB '\"MAKING NOTE OF RESTRICTION ON\") (PRIN1\u00C2\u00AB !?\u00C2\u00AB) (PRIN1\u00C2\u00BB 'IKI (PRINl* \u00C2\u00BB7REL) (TERPRIII I (THMAPC \"UPDATE 17M0S) ) (THCONSE UPDATE (X Y NO R2 REL2 DOMAIN A REST) !*?X J7Y *7N0) (IG (J7.NC RESTRICTS 1700MAIN TO tTRESTI) (THCONO I (tG l t 7 X t7R2 S7D0MAIN t?REL2)> (tG IOK t7R2) ST) I THSETQ t7A t? X ! '.THSETQ J7REL2 ( L I S \" i7RSL2)) ) ( (tG (t7Y t7R2 t?OOHAIN t?REL2)) (tG (OK 17R2) tT| (THSETQ t7A A7Y) (THSETQ J7REL2 (LIST J7REL2)) ) (tG (t7X S7R2 t700MA!H J7REL2) JT) (SETQ -.RESULTS THVALUE) (THSETQ J7REL2 \"RESULTS) (THSETQ J7A J7X!) ( (JG (J7Y J7R2 J700MAIN J7REL2) IT) (SETQ HRESULT* THVALUE] (THSETQ J7REL2 IRESULTO ITHSETQ J7A J7YI) ) (SETQ ARESTRICTI (APPEND (LIST J7REST \u00C2\u00BB7A J7REL2 $700HA1N) \u00E2\u0080\u00A2RESTRICT!)) I I (THCONSE CHECK-3RESTRICTI0NS (X RI Y R2 Z R E l l (UP0ATE-REST3 J7X t ? R l J7Y J7R2 177. J7REL) (JG (UPDATE-REST J7X J7RI J7Y J7REL) JT) (THOR (THAND (tG (J7X J7R2 J72 J 7REL)) (tG (UPOATE-REST t7X J7R2 J77. J7REL) \u00C2\u00BBTI) j (JG (UPOATE-REST 17Y J7R2 J7I J7RLDJT I | \"UH'SITYJT'R'ET\" * Y \"=-*U *7* *\u00C2\u00BB\" ITHFIND ALL J7RELS IRELS) (THAND (JG (J7X J7R J7Y 17 RE I SI I (tG (ACTIVE J7RELSI JT) I \" (THERASE (J7X f?R t7Y J7RELS)! ) l ^ (THSUCCEED THEOREM ILIST J7RESI) ( THFINO ALL I7RELS (RELS) I THANO K G I t7X t7Rl J7Y 17R2 J72 J7RELSU , I \u00C2\u00BBC IACHVE 17RELS) IT)) | ^ (THSUCCEED THEOREM (LIST t?R\u00C2\u00A3S)| Ih5 392 393 394 395 306 I THCONSE R-x-A |x A RP RE12 LHS! (RELATE-X-A S7X 17A1 , l I H S E T Q t 7 L H S <* = IF-AU S?X MRP S7A) (THUSE F I S M P , , , , , , *0L ( T H S U C C E E o ' r ^ O R V ^ E ^ . r H S ^ t l s 7 ? I ^ . ' J \" ' \u00E2\u0084\u00A2 \u00C2\u00BB \u00C2\u00BB - \u00C2\u00AB \" -403 404 *1* ISG (ACTIVE S7RELI ST) 4 15 (THSETO S7RHS ARESULT*!! ) 416 (THSUCCEED THEOREM A7RHS) 417 I 418 419 420 421 ; START OF ANTECEDENT THEOREMS 422 423 \" \"\u00E2\u0080\u00A2' \" 424 (THANTE NOT-AVAILABLE IX RELS) (S7X ARE NOT AVAILABLE! 425 (THSETO S7RELS ITHFIND ALL IS7REL) (REL) (SG (S7REL CONCERNS 57X1) )) 426 ITHHAPC 'ASSERTFUN S7RELS) *2T ) 428 429 430 431 (THCONSE ASSERTFUN (XI (S?X) (SA IS7X IS INACTIVE))) 4.32 \u00E2\u0080\u00A233 434 435 ITHANTE ON-STRIKE IX REST* RELS LIST) (S7X SUPPLIERS ARE ON STRIKE) 436 (THSETO S7REST* (GEN5YH \u00E2\u0080\u00A2REST*)) 437 (THSETQ S7LIST (LIST 'NOT (LIST \u00E2\u0080\u00A2EO \u00E2\u0080\u00A2SLOC ILIST \"OUOTE 57X)))) 438 ISA (S7REST* RESTRICTS SLOC TO S7L1ST)I 439 (THSETO S7RELS (ThFINC ALL (SUPPLIES S7R S7AESTI) (R A 8) 440 (SG IS7A SUPPLIES S7B S7RI) )) 441 .(THMAPC 'ASSERT-RESTRICTION S7RELS) 442 (THSETO S7RELS (TnFINO ALL (SUPPLIES-* \u00C2\u00BB7R 57MEST*) (R A B) 443 ISC (S 7 A SUPPLIES-.\" S7B S7RII II 444 (THMAPC 'ASSERT-RESTRICTION S7RELS) 445 (THSETO S7RELS (THFIND ALL [IS-SUPPLIEO-BV 57R 57REST*) (R A B> 446 ISG (S7A IS-SUPPLIED-3Y S7B S7R1I )) 447 ITHMAPC 'ASSERT-RESTRICTION S7RELS) 4*8 I 449 450 451 452 (THCONSE ASSERT-RESTRICTION (RELSHIP REL* REST*) IS7RELSHIP 57REL* S7RESTO 453 (SA (S7RELSH1P IN S7REL* IS_RESIRICTED.TO J7REST*)) 454 ) 455 456 457 .458 (DEFUN UPDATE-RESULT ( 1 459 (PRCG (REST TEMP) 460 (CCND ( (NULL (SETQ REST \u00C2\u00ABRESTRICTJ)I (RETURN T)) ) 461 LOOP (SETO TEMP (CCNS (CAR REST) IEMP)) 462 (OR (ASSO (CACDDR REST) \"RESULT*) 463 ISETO *EXTRAi (APPEND (LIST ILIST ICAOOOR REST))I 46* \u00C2\u00ABEXTRA\u00C2\u00AB)I ) *65 (COND I (NOT I MEMBER ICAOOR RESTI \u00C2\u00ABRESULT*I1 466 (SETQ 'RESULT* I APPEND iRESULT* *6T ILIST ICACR REST) (CAODR REST))) ) )) 468 (COND ( I NULL (SETQ*REST (CCOOGR REST))) 469 ISETO ARESTRICT* TEMP) 470 (RETURN T)) 471 ( T (GO LOOP 1) I 472 ) 473 ) A P P E N D I X 10. - NEW REDUCTION ALGORITHM 1 (DEFUN REDUCE (QUERY! 2 ; THIS ROUTINE HILL REDUCE A CUERY tN THE RELATIONAL CALCULUS TO 3 S A RESPONSE RELATION. IT DOES SO DY USING THE RELATIONAL ALCEBft*. 4 (PROG (TARG\u00C2\u00A3T_L1ST RESPONSE OUANTS \"RESULT* (EXTRA! JR\u00C2\u00A3S1HICTOI 5 (AND TRACE (PRINT '\"ABOUT TO REDUCE THE CU\u00C2\u00A3RV\")I 6 (SETO TAHGET.LIST ICREATE.IARGEt OUERY1) 7 (SETO OUfcRY (COR (MEMO \u00E2\u0080\u00A2: CUERY111 8 (MAPC * 1 LAMBDA (OUANT) 9 (COND ( (MEMO (CAR CUANTI 'IE V I I 10 (SETQ OUERY ICOR QUERY!) 11 ISETO OUANTS ICONS ICENSYN1 'QUAND OUANTSII 12 (SET (CAR GUANTSI ICADR OUANT!I 13 IPUT (CAR OUANTSl 'REL (CREATE.OUANT (COR OUAWIM t 1* ( (UNEVAL \u00E2\u0080\u00A2MAPC NIL!, I 15 I ' \u00E2\u0080\u00A2, 16 OUERYJ \" '\u00E2\u0080\u00A2 . 17 r~- (SETO RESPONSE IDIVIOE_OR_PROJECT ID[5JUNCT CUERYI OUANTSII 18 (RETURN (PROJECT RESPONSE (DOHA!HfS RESPONSE TAR&ET_L1ST)II -19 I ) * 20 . . \u00E2\u0080\u009E . - \u00E2\u0080\u00A2 21 2? 21 (OEFUN CREATE.TARGET (OUERY) 24 ; THIS ROUTINE HILL CREATE A LIST OF THOSE OOMAINS WHICH ARE TO 25 ; 6E PRESENT IN THE RESPONSE RELATION. 26 (PRCG (RESULT) 27 (MAPC 'ILAMBDA IELT) 28 (COND ( (EO ELT *I) (UNEVAL \"MAPC f U l l ! 29 I T (SETQ RESULT (CONS ELT RESULT!)) > JO 1 31 QUERY) 32 (RETURN (REVERSE RESULT!! 33 I I 3* ..35 36 37 (DEFUN CREATE_QUANT (QUANTI 38 ; THIS ROUTINE TAKES A QUANTIFIER FROM THE QUERY, AND CREATES THE 39 ; CCRRESPONDIUG RELATION. 40 (PRCG (RESULT) 41 (AND TRACE (PRIM '\"DETERMINE THE RELATION WHICH QUANTIFIES\"! 42 (PRINl (CAR OUANT1) (IERPAI1I \u00E2\u0080\u00A243 . ! (SETO RESULT (DISJUNCT QUANT)) 44 (PROJECT RESULT (DOMAIN!S RESULT (CAR CUANTI)) 45 )) ' \u00E2\u0080\u00A2 46 47 48 49 (OEFUN CREATE.RELATITN IPLIST) 50 ; THIS ROUTINE TAKES THE LIST THAT PLANNER RETURNS. AND CREATES 51 ; THE RELATION WHICH IT DEFINES. 52 (PRCG (METHOD RESULT J1ERH \u00C2\u00ABL1I IL2A1 53 ISETO METHOD (CAM (LAST PLISTIII 54 TSETC PLIST (DELETE HETHOD PLISTI) 55 ISETC RESULT IREI 0EF_8Y (UNCONS METHOD KETHCODl 56 (MAPC '(LAMBDA (MED 57 \u00E2\u0080\u00A2\u00E2\u0080\u00A2 (COND ( (ATOM REL) (SETO JTERM REL)! 58 ( T (SETQ REL (REL_DEF_BY RED) 59 (SETQ #L1* (OOMAINIS RESULT JTERM) 60 \u00C2\u00ABL2\u00C2\u00AB (00\u00C2\u00ABAIN\u00C2\u00AB5 REL JTERKI1 61 (SETQ RESULT (JOIN RESULT 62 REL 6 } '(EQUAL IELEM T l l l l f l 64 (ELEM T2 IL2I1I I I I I 65 ) 66 METHOD) 67 ISETO DOMAINS IHAPCAR 'CAR PLlSTIJ 68 (RETURN (PROJECT RESULT (DOMAIN\u00C2\u00BBS RESULT DOMAINS))1 69 )) 70 71 72 73 (OEFUN REL_OEF_BY (RELS) 74 ; THIS ROUTINE TAKES A LIST OF\" RELATION NAMES, AND RETURNS THE RELAtlC*) 75 ! WHOSE DOMAINS ARE THOSE COMMON 10 ALL RELATICNS IN RELS. 76 (PRCG (DOMAINS RESULT I i 17 ICCND ( IEQ (LENGTH RELS! 11 (RETURN (CAR RELS!I II 78 ; CREATE A LIST OF THE DOMAINS WHICH ALL THE RELATIONS HAVE IN COMMOM. 79 (SETO DOMAINS (GET I CAR RELS) 'DOMAINS)); 80 (MAPC \u00E2\u0080\u00A2(LAMBDA IREL) 81 (SETO DOMAINS (INTERSECT OOMAINS (GET REL 'DOMAINS)11 82 ) 83 (CCR RELS)) S4 . . ISETO RESULT (PROJECT (CAR RELS) (OOMAINIS (CAR RELS1 OOMAINS))) ES (MAPC '(LAMBDA U S D 86 87 ( (SETO SULT (RUNION. RESULT (PROJECT REL (OOMAINIS REL DOMAINS)!) ) 8 8 (CDR RELS!I (RETURN RESULT 190 )) 91 92 VJ 94 95\u00E2\u0080\u00A2 - . 9697 (DEFUN DISJUNCT (QUERY! 98 ; This R 0 U 1 I N E WILL PROCESS *-=<.\u00E2\u0080\u00A2?.*\u00E2\u0080\u00A2' \u00E2\u0080\u00A2.\u00E2\u0080\u00A2K;CH I S IN O.N F IT rflFt t u i . iS . I TBHE=T?HESI \u00E2\u0084\u00A2 -101 (PROG (RELS) 102 (CO.ND ( (OR (NULL (COR QUERY)) (EQ (C\u00C2\u00BB0\u00C2\u00BB QUERY) ' t l ) IC3 (RETURN (CONJUNCT QUERY)J II 104 (MAPC '(LAMBDA (C) . . . 105 (OR (EQ C 'VI 106 (SETO RELS (CONS (CONJUNCT C). RELS)) ) 107 I . 108 QUERY) 109 (R\u00C2\u00A3L_D\u00C2\u00A3F_BY RELS) 110 )) 111 112 113 .' 114 (DEFUN CONJUNCT (QUERY) 115 ; THIS ROUTINE WILL RETURN THE RELATION DEFINED BY A CONJUNCTION .116 I IN THE CUERY. ALL RELATIONAL TERMS ARE PASSED TO PLANNER AS 117 1 THG01LS, AND THE CORRESPONDING RELATIONS ARE CREUEO. 118 ; THESE ARE THEN SUBJECTED TO THE RESTRICTIONS IN THE CONJUNCTION, 119 ; WITH THE FINAL RESULTS BEING JOINED TOGETHER ON EITHER COMMON 120 ; OOMAINS, OR AS SPECIFIED BY THE JOIN TERMS. ' 121 IPRCG (FNR KELS JTERMS'R FLAG RELSP RELSN OOMAINS) 122 (MAPC ' I LAMBDA IKELTERNI . 123 \u00E2\u0080\u00A2 ICOND I IEO RELTERM \u00E2\u0080\u00A2\u00C2\u00A3) NIL) j 124 I (EQ RELTERH \u00E2\u0080\u00A2-.) (SETO FLAG T)> 125 ( (ATOM RELTERM) (THVAL ILIST \u00E2\u0080\u00A2$G ILIST 'ENUMERATE RELTERM) \u00E2\u0080\u00A2\u00C2\u00BB!) 126 [LIST NIL NIL))), 127 ( (FUNCTIONP ICAR RELTERM!) 12B (UNEVAL 'MAPC N I D I 129 I T IIHVAL (LIST \u00E2\u0080\u00A2IG ICONS 'PROVE* RELTERM) 'AT) 130 ILIST NIL NIL)) II 111 (OR (NULL (RESTRICT*) [SETQ QUERY (APPEND CUERY (RESTRICT*)I) 132 (SETQ QUERY (CDR QUERY)!. 133 (OR (CO RELTERM \u00E2\u0080\u00A2 C! (EC RELTERM \u00E2\u0080\u00A2-.) 134 (SETQ RELS (CONS ICRcATE.REL AT I ON (RESULT*) RELS)) I 135 (CONO ( (AND FLAG (NOT (EQ RELTERM '-111 (PUT (CAR RELS) 'NEGATED Tl 136 (SETQ FLAG NIL)) ) . 137 (SETQ (RESTRICT* NIL (EXTRA* NIL) 138 ) 139 OUERY) 140 I GO THROUGH ALL THE RESTRICTION AND JOIN TERMS, APPLYING THE RESTRICTIONS 141 I TO THE APPROPRIATE RELATIONS, AND SAVING THE JCIN TERMS, ALONG WITH 142 ; THE NAMES OF THE DOMAINS THEY CONTAIN. 143 IMAPC 'I LAMBDA I TERM I 144 ICCNO ( (EO TERM 'CI NIL) 145 ( (ATOM (SETQ DOHAINS (DCMAINS.IN TERM))) 146. (MAPC \u00E2\u0080\u00A2 (LAMBDA [RED 147 I AND IMEHQ DOMAINS IGET REL 'DOMAINS!I 148 (SETO R (RESTRICT REL (FIXPRED DOMAINS TERM REL ' T l l l l 149 IPUT REL 'TUPLES (GET R \u00E2\u0080\u00A2TUPLES I 1 150 (PUT REL '(TUPLES (GET R '(TUPLES11 151 (AND TRACE (PRINl 'MESTRICT! (PRINl REL) (PRINl \"TO) 152 IPRINI TERN) IIERPRI)) ) 153 ) 154 RELS) ) 155 ( T (SETO JTERMS (CONS (LIST TERM DOMAINS) JTERMS)) II 156 ) 157 QUERY) ' 158 I GLUE ALL THE RESULTS TOGETHER - JOIN ALL RELATIONS WHICH MUST BE 159 I JOINEO USING JOIN TERMS. 160 . ISETC RELS (JOIN.WJT RELS JTERMS)) 161 ; TAKE THE UNION OF ALL NON-NEGATEO RELATIONS V62 IMAPC '(LAHfaOA (R! 163' (CONO ( IGET R 'NEGATED! (SETQ RELSN ICONS R RELSN)) ) 164 I T (SETQ RELSP (CONS R RELSP)) )) 165 ) 166 RELSI 167 (AND TRACE (PRINl \"TAKE THE UNION OF RELATIONS\"! (PRINl REL5PI (TERPR11> 168 (SETQ RELSP (JOIN.RELS RELSP!I 169 I TAKE THE DIFFERENCE BETWEEN THIS RELATION AND EACH OF THE NEGATED 170 ; RELATIONS. 171 IMAPC '(LAH8DA 1RNI 172 (AND TRACE (PRINl '\"THE DIFFERENCE BETWEEN\"! (PRINl RELSP) 173 (PRINl 'ANDI (PRINl RN! (TERPRl)l 174 (SETQ DOMAINS (INTERSECT (GET RELSP 'DOMAINS) (GET RN 'DOMAINS))) 175 (SETQ RELS (ROIfF (PROJECT RELSP (DOMAIN\u00C2\u00BBS RELSP DOMAINS)) 176 (PROJECT RN (DOMAIN'S RN DOMAINS)) )) 177 (SETQ RELSP (JOIN RELS (LIST RELS? RELS)!) 178 ) 179 RELSN) lfiO (RETURN RELSP) 181 )) 182 '\" 183 . 184 -185 [DEFUN DIVICE.OR.PROJECT (REL CUANTS) 1B6 I THIS ROUTINE WILL PROCESS THE QUANTIFIERS FRCH RIGHT TO LEFT, 187 ; DIVIDING OR PROJECTING THE CURRENT RELATION 8Y THE QUANIIFIEO 188 I VARIABLE. \u00E2\u0080\u00A2 J 189 (PRCG IVAR OOMAINS) 190 IHAPC \u00E2\u0080\u00A2I LAMBDA (Ql 191 ISETQ VAR I GET 0 'REL1 I 192 ISETQ DOMAINS (GET VAR 'DOMAINS!) 193 '[CONO ( (EQ 0 *E) 194 (CONO ( TRACE (PRINl \u00E2\u0080\u00A2\"PROJECT THE RESPONSE RELATION ON ALL FIELDS EXCEPf 195 IPRINI (EVAL O i l (TERPRlll 196 I T T l ! 197 ISETO REL (PROJECT REL (DOMAINIS REL (SET01F (CET REL 'OOMAINSI 198 (CEI VAR 'COMAINS))) ) l I 199 ( T ISETQ REL (ROIVIOE REL VAR (DOMAINtS REL OOMAINS] 200 [GENLIST DOMAINS!I I 201 (AND TRACE IPRIN1 '-DIVIDE THE RESPONSE RELATION BY\") 2C2 (PRINl (EVAL CI) (IERPRII) )) 20J | 20* OUANTS) 205 IRETURN RELI 20!> II 207 208 209 210 (DEFUN JOIN_WJT (RELS JTERMS) 211 : THIS ROUTINE WILL JOIN ALL RELATICNS IN RELS WHICH CAN BE JOINED 212 ; UNDER CRITERION SPECIFIED BY JTERMS. 213 ; IT RETURNS A NEW LIST OF THE CURRENT RELATICNS 21* IPRCG (RI R2 R) 215 IMAPC '(LAMBDA (JTERMI 216 (MAPC \u00E2\u0080\u00A2I LAMBDA (Rl 217 ICOND I (GET R 'NEGATED) (UNEVAL 'MAPC NILII 21B ( (MEMO (CAADR JTERMI (GET R 'OUMAINS)) 219 (SETO R l Rl) 220 1 (MEMO (CADADR J TERM) (GET It 'DOMAINS)) 221 (SETO R2 \u00C2\u00AB|| ) 222 ) 223 RELS) 22* (SETQ JTERH (FIXPREO ICAR DOMAINS) IF1XPRED ICADR DOMAINS) JTERM R2 -T2) R l \u00C2\u00ABTl)) 225 ISETO R (JOIN Rl R2 JTERM)) 226 (SETQ RELS (DELETE Rl (DELETE R2 RELS)) 22? RELS (CONS R RELS)) 228 ) 229 JTERMS) 230 (RETURN RELS) 231 I) 232 233 23* 235 (DEFUN JOIN.RELS (RELS) 236 ! THIS ROUTINE WILL TAKE A LIST OF RELATICNS AND JOIN THEN 237 ; ON COMMON DDMAINS. 238 IPRCG (REL DOMAINS A L U \u00C2\u00BBL2\u00C2\u00BB1 239 (SETQ REL (UNCONS RELS RELS)) 2*0 (AND I NULL RELS) (RETURN RED) 2*1 (MAPC \u00E2\u0080\u00A2(LAMBDA IR) 2*2 (AND (SETQ DOMAINS (INTERSECT IGET R 'DOMAINS) (GET REL 'OOMAINS!)) 2*3 (SETO A L U IDOMAIN\u00C2\u00BBS REL DOMAINS)! 2** (SETQ AL21 (OOMAINIS R DOMAINS!) 2*5 (SETQ REL (JOIN REL R \u00E2\u0080\u00A2 1ECUAL (ELEM T l IUII I ELEM 12 A L 2 0 I I I 2*6 I NULL ISETO RELS (DELETE R RELS)I I (RETURN REL) 2*7 I) 2*8 RELS! 2*9 IJOIN_RELS (CONS REL RELS)) 250 )) 251 252 253 25* IDEFUN OOMAINIS (RELATION L_OF_00NAINSI 255 : THIS ROUTINE WILL RETURN/THE POSITIONS EACH OF THE DOMAINS IN L_OF_C0MAIN5 256 ; OCCUPIES IN RELATION. 25? IPRCG (DOMAINS! 258 (AND (ATCM L OF DOMAINS! (SETO L OF.DOMAINS (LIST l_OF.DOMAINS))) 259 ISETO DOMAINS (GET RELATION 'DOMAINS!) 260 (MAPCAR MLAHBOA I DOMAIN! 261 . IAD01 (SUB (LENGTH OOMAINS) 262 (LENGTH (MEMO DOMAIN DOMAINS)) II 263 > 26* l_OF_D0MAINSI 265 > I 266 267 268 269 (OEFUN INTERSECT 1L1 L2) 270 J THIS PROCEDURE RETURNS THE INTERSECTION OF LISTS 11 ANO L2. 271 IPRCG (RESULT) 272 (MAPC '(LAMBOA ( E l ) 273 (AND I MEMBER E l L2) INOT I MEMBER E l RESULT)) 27* ISETO RESULT ICONS E l RESULT!I ) 275 ) 276 111 277 (RETURN RESULT! 278 I ! 279 280 281 282 (DEFUN FIXPREO (DOMAIN JTERM REU TX) 263 1 THIS ROUTINE WILL CHANGE ALL OCCURRENCES OF DOMAIN IN JTERM ZE* ; 10 (ELEM TX N) , WHERE OOMAIN IS IN THE NTH POSITION OF REL. 283 (COPY JTERM OCMAIN (LIST 'ELEM TX (CAR (DOMAINAS REL ODMAINllll 256 I 2E7 288 289 .290 (DEFUN 0OMAINS.1N IFUN) 291 I THIS ROUTINE WILL RETURN EITHER A LIST OF ALL COMA IN VARIABLES 292 -, IN THE FUNCIION. CR THE NAME OF THE VARIABLE IN THE FUNCTION 293 I IF THERE IS ONLY ONE. 29* (PRCG (ANSI \u00E2\u0080\u00A2 295 (OCMAINS_IN\u00C2\u00BB FUN) 296 (CCND I (EQ (LENGTH ANSI II (RETURN ICAR AN5)I) 297 ( T (RETURN A N S ! I I 298 II 299 14 9 300 301 302 (CEFUN C0HAINS_1N\u00C2\u00BB (FUN) 303 (PRCG I ) 304 (COND ( (OR I NULL FUN) (EC (CAR FUN) \"QUOTE)) 305 (RETURN NIL)) 306 ( IATOM (CAR FUN)) 307 (OR I NUMB ERR (CAR FUN)) IFUNCTICNP (CAR FUN)) JOB (SETO ANS (CONS (CAR FUN) ANSI) )) 30\") ( T I DOHA INS.IN\u00C2\u00BB (CAR FUN))) ) \ 310 (DDHAINS_IN\u00C2\u00BB (COR FUND 311 )) 312 ! 313 314 315 (DEFUN FUNCTIONP (FUN) 316 ; THIS ROUTINE RETURNS T IF FUN IS A LISP FUNCTION. 317 (OR (GET FUN \"EXPRI 318 (GET FUN 'SUBRI 319 (GET FUN \"FSUBR)) 320 ) END OF FILE APPEND I X 1 1 - A U X I L I A R Y ROUTINES ; CUERY ; THIS ROUTINE WILL ACCEPT INPUT FROM THE USER. THE INPUT CAN EITHER ; BE A CUERY, OR A PIECE OF INFORMATION TO BE ACOEO TO THE SEMANTIC MODEL. (OEFUN CUERY I ) ; READ - REPLY LOOP (PRCG IRR TRACE I (PRINT '-TRACE ON, OR OFF7\") (OR 1EOUAL (READ) 'OFF) ISETO TRACE T)) RO (SETO RR (READ)) ICONO ( (EO RR \u00E2\u0080\u00A2 -END) (RETURN 'THANKS)) ( (NOT (MEMO RR|| (THVAL (LIST '\u00C2\u00BBA RR 'AT) (LIST M i l NIL)) (PRINT '\"O.K.-) I GO ROD ( T ISETO RR (REDUCE RR))) I (COND ( [2EROP (GET RR \u00C2\u00BB9 TUPLES)) (PRINT \u00E2\u0080\u00A2\"NONE.\")) ( T (PRINTREL RR)> ) (CO RO) I) OEFRELS (DEFINE RELATIONS) DEFRELS READS IN RELATIONS FROM THE FILE \"FILE\", ANO STORES THEM IN INTERNAL FORMAT: THE RELATION NAME HAS 3 PROPERTIES ON ITS P-LIST: DOMAINS- A LIST OF THE DOMAIN NAMES UPON ViHICH THE RELATION IS DEFINED. TUPLES - A LIST OF THE TUPLE NAMES WHICH OCCUR IN THE RELATION. \u00E2\u0080\u00A2TUPLES - THE NUMBER OF TUPLES CURRENTLY IN THE RELATION. ON THE P-LIST OF A TUPLE NAME, UNDER THE FLAG \"OATA\", IS THE . ACTUAL TUPLE. (OEFUN CEFRELS FEXPR (FILE) (APPLYl 'OPEN (LIST 'RELINPUT 255 (CAR F I L E D ) (OPEN (BUFFER 255)) (PRCG (RELATION RNAME TNAME) RD ICONO ( IEO (SETO RELATION IREAD RELINPUT]) \u00E2\u0080\u00A2\u00C2\u00BBENO) (RETURN NIL)I ) ISETO RNAME (UNCONS RELATION RELATION)) I PUT RNAME \u00E2\u0080\u00A2'TUPLES 0) [PUT RNAME 'DOMAINS (CAR RELATION]I [MAPC \u00E2\u0080\u00A2 i LAMBDA (TUPLE) IACOTUPLE TUPLE RNAME) ) (COR RELATION)) (GO RO) ) ; PRINTREL I PRINTS A GIVEN RELATION (OEFUN PRINTREL (RELATION). (PROG (SPACES CENTRES) (TERPRl) (CONO I [NULL (SETO CENTRES (GET RELATION 'CENTRES))) (SETQ CENTRES IFIND.CEKIRES RELATION)) )) (PRINl RELATION) (PRINl ' \" [ \" I (PRINT_TUPLE (GET RELATION 'DOMAINS) CENTRES) IPRINI \u00E2\u0080\u00A2\u00C2\u00AB l \" l (TERPRl) (MAPC '(LAMBOA ITUPLE) IPR]NT_TUPL\u00C2\u00A3 (GET TUPLE 'DATA) CENTRES) (TERPR1I! (REVERSE (GET RELATION 'TUPLES!') (RETURN RELATION)) (OEFUN PRINT_TUPLE (TUPLE CENTRES) (MAPC \u00E2\u0080\u00A2(LAMBDA (NAME POS) \" (COND I (NUMBERP NAME) ISETO LEN (NLEN NAMED! ( ISETQ LEN IPLEN NAME!!) ) .'PTR?N!SNA3MET L M X ' D M C E L E N *\u00C2\u00BB'\u00C2\u00BB ) TUPLE CENTRES) ) ) 99 ICO 87 IDEFUN FIND.CENTHES (RELATION) 88 I THIS ROUTINE WILL CREATE A LIST WHICH CONTAINS THE PRINT POSITION 69 i WHERE EACH CO\"AIN IN THE RELATION SHOULD EE CENTRED. 90 (PRCC (TEMP LEN CENTRE) 91 . (COND ( I NULL IGET RELATION 'CENTRES)) 92 (PUT KELATICN 'CENTRES C.APCAR '(LAMBDA (XI 01 (CET RELATION * DOMAINS)!) )> 93 (SETQ CENTRE (CET RELATION 'CENTRES!) 94 (MAPC MLAMSOA (TUPLE) 95 (OR (LISTP TUPLE) ISETO TUPLE (CET TUPLE 'DATA))) 96 ISETO TEMP CENTRE) 97 (MAPC \u00E2\u0080\u00A2(LAMBDA (ELT) 98 (CONO ( (NUM3ERP ELT I (SETQ LEN (NLEN ELT))) r ( ISETO LEN (PLEN ELT))1 I \u00E2\u0080\u00A2 ' (AND (GREATERP LEN I CAR TEMPI) (RPLACA TEMP LEH)) 101 (SETQ TEMP (CDR TEMPI 1 102 I ; 103 TUPLE) 104 I 105 (APPEND (LIST IGET RELATION -DOMAINS)> IGET RELATION 'TUPLES!) ) ICS (PUT RELATION 'CENTRES (UPDATE.CENTRES RELATION CENTRE)I ic? i! ; 108 IC9 110 111 IDEFUN UPDATE.CENTRES (RELATICN CENTRES) 112 ; THIS ROUTINE WILL CHANGE THE LIST OF MAXIMUM DOMAIN SIZES 113 1 TO A LIST OF CENTRES FOR EACH DOMAIN 114 IPRCG I SUM LIST) ' 115 (SETO SUM (ADO 4 (PLEN RELATION)!) 116 (MAPC \u00E2\u0080\u00A2(LAMBDA ICI 117 (SETQ LIST ICONS (ADD (FIX (DIVIDE C 2!) 1 SUM) LIST)) 118 (SETC SUM (ADD SUM C 2)1 119 ) 120 CENTRES) 121 (RETURN (REVERSE LIST)) 122 )) 123 124 12S 126 IDEFUN NLEN (NO) 127 I THIS ROUTINE RETURNS THE PRINT LENGTH OF A NUMBER 128 (TAB 1 BUFFER I 129 IPRINI NO BUFFER 21 130 (PLEN BUFFER) ~ 131 ) 132 133 134 135 ; GENSYH1 136 ; LIKE GENSYM, EXCEPT THA'T NUMBERING STARTS AT 1 FOR EACH OIFFERENT 137 ; CHARACTER STRING C. 138 139 140 IDEFUN CENSYMl (CI 141 (CONC ( (NULL (GET C 'GSM! (PUT C \u00C2\u00ABCS\u00C2\u00BB 01 ) > . 142 (IMPL00E1 C (PUT C 'GSA (ADD1 (GET C \"GSAII) I 143 ) 144 \u00E2\u0080\u00A2 145 \u00E2\u0080\u00A2 146 147 I 1MPLCDE1 148 ; GIVEN TWO ATOMS A AND B, RETURNS THE ATOM AS. 149 150 151 IDEFUN IMPLCDE1 (A 81 152 (TAB 1 IMPL00E3UFFERI 153 IPRINI A IMPL0UE3UFFER 2) 154 (PRIM 8 IMPLOOEBUFFER 2) 155 (READ IMPLGDEBUFFER) 156 I 157 158 159 \" 160 161 162 S PRINT* 163 I THIS ROUTINE WILL PRINT THE^ ARGUMENT *ONLY* IF TRACE IS ON. -164 165 (CEFUh PRINT* (EXPI 166 (CR (NOT IRACEI [PR[NT EXPI1 167 ) 168 169 170 171 ! P R I M * 172 ; THIS ROUTINE WILL CALL PRINl ONLY IF TRACE IS NIL. 173 174 IDEFUN PRINl* IEXP) 175 ICR. INOT TRACE! (PRINl EXP)) 176 ) END OF FILE "@en . "Thesis/Dissertation"@en . "10.14288/1.0051770"@en . "eng"@en . "Computer Science"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en . "Graduate"@en . "Query languages for relational data base management systems"@en . "Text"@en . "http://hdl.handle.net/2429/18798"@en .