AN APPROACH TO THE ORGANIZATION OF TAXONOMIES by CRAIG DAVID BISHOP B.Sc. University of B r i t i s h Columbia A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES Department of Computer Science We accept t h i s t h e s i s as conforming to the r e q u i r e d THE standard UNIVERSITY OF BRITISH COLUMBIA March, 1981 © C r a i g David Bishop, 1981 In presenting advanced Library agree degree that at the permission be in partial University s h a l l not for granted by i s understood gain thesis s h a l l make i t f r e e l y purposes may It this that fulfillment of British available extensive or copying requirements f o r Columbia, of this and I agree study. thesis department or h i s publication be allowed without my the f o r reference the Head of my copying of written of this thesis permission. Craig Bishop for that I an the further scholarly representative. for financial PAGE i i Abstract The ISA h i e r a r c h y used by present cient i n that i t does not represent relationships). leaves child day i n f e r e n t i a l f o r having of domain r e l a t i o n s h i p s (type Such h i e r a r c h i e s a s s o c i a t e e x p l i c i t l y d e f i n e d sets with the of a t r e e . nodes. a variety database systems i s d e f i - Non-leaf In keeping with nodes i n the tree inherit the use of s e t s , t h i s type r e l a t i o n s h i p s other than subset. members paper gives Included from their motivation are union, inter- s e c t i o n and a d i s j o i n t n e s s c o n d i t i o n . A formalism f o r the typed database (TDB) i s given, order p r e d i c a t e c a l c u l u s as i t s t h e o r e t i c a l b a s i s . using the monadic first Non-unit formulae r e p r e - sent i n t e n s i o n a l (general) i n f o r m a t i o n about the world being represented and unit ground information. clauses cates represent connected that represent extensional types, and constant to the set concept set of constant (specific) symbols represent v i a predicate set members. extensions. symbols which are provable PrediThis i s The extension i s as arguments to the given predicate. Given the s o - c a l l e d concreteness desired c o n d i t i o n and c o n s i s t e n c y set t h e o r e t i c r e l a t i o n s h i p s (union, of p r e d i c a t e extensions follow. This intersection strengthens the l i n k of the TDB, the and d i s j o i n t n e s s ) between the f o r - malism of the p r e d i c a t e c a l c u l u s and the more n a t u r a l set r e p r e s e n t a t i o n . A c a n o n i c a l form of a TDB i s shown that admits an a p p r o p r i a t e machine representation. Using this i s can be determined i f a constant symbol as an PAGE i i i argument time. it to a given predicate An update a l g o r i t h m maintains practical expressive. concreteness generalization i s provable (domain membership) i n constant i s developed and i s shown to be c o r r e c t and c o n s i s t e n c y . of the ISA Thus hierarchy the TDB but i n that i s shown to be a i s considerably more PAGE i v . T a b l e o f Contents Page # Section A Introduction 1 Section B Motivation 4 Section C Formal P r e l i m i n a r i e s 7 The TDB 7 Extension 9 Acyclicity of a TDB 9 ALPHA, BETA and TAU 9 Diagram C . l - Sample TDB I n t e n s i o n Lemma C . l - Generalization Corollary C.l - Containment 13 Lemma C.2 - Intersection 13 Section D The D e f i n i t e Component 11 . . . . . . . . . . . . . 1 2 of a TDB .15 The D e f i n i t e Component /\(T) 15 Concreteness 17 The P r o p o s i t i o n a l Component Lemma D . l Dead C l a u s e s DEAD - PROP 17 T P r o p o s i t i o n a l Equivalence 18 19 T Lemma D.2 - Dropping o f Dead c l a u s e s 19 Theorem D . l - T -/\(T) 21 Lemma D.3 - Union 24 Lemma D.4 - Disjointness 25 C o r o l l a r y D.l - D i s j o i n t Union 25 Equivalence PAGE v Table o f Contents Page # Section E T r a n s f o r m a t i o n of a TDB t o i t s Reduced Form The Horn Component H(T) Algorithm E . l - 26 26 DOMAIN^ 27 Sample Run of A l g o r i t h m E . l 28 Theorem E . l - 29 Corollary E . l - A(T) - DOMAIN^ Equivalence Algorithm E . l Correctness 32 The Reduced or C a n o n i c a l TDB, R(T) 32 Lemma E . l - Spanning 33 Theorem E.2 - T - R(T) Equivalence 33 C o r o l l a r y E.2 . . 34 C o r o l l a r y E.3 - Concreteness of R(T) 34 C o r o l l a r y E.4 - C o n s i s t e n c y of R(T) 34 Section F Updating a TDB 36 Algorithm F . l - Update 36 Theorem F . l - A l g o r i t h m F . l Correctness 37 Sample Run of A l g o r i t h m F . l 38 Summary and C o n c l u s i o n s 41 Bibliography 42 Appendix A D i c t i o n a r y of Symbols 44 PAGE v i Acknowledgements I wish to thank Dr. Raymond R e i t e r of the U n i v e r s i t y of B r i t i s h Columbia f o r his His p e r c e p t i v e and c r i t i c a l dedication and insight a n a l y s i s of the ideas that went i n t o t h i s into inferential database theory has paper. been a c o n t i n u i n g source of enlightenment. I would also like to thank B e l l Northern Research of Ottawa t h e i r word p r o c e s s i n g f a c i l i t i e s f o r the use of i n the p r e p a r a t i o n of t h i s manuscript. PAGE Section A In recent Introduction works on r e l a t i o n a l database systems, e s p e c i a l l y as they a r i s e i n artificial intelligence, definition of D o m a i n ^ ) has ing on applications, the need f o r a database become apparent. of domains (using In simple data the acceptance or r e j e c t i o n of data a d e c i s i o n procedure f o r that or membership i n a predefined s e t . This ential ones). database form makes the primary assumption that T h i s may be r e a l i s t i c However, work i n a r t i f i c i a l systems acknowledges Winograd l e d the way by using and their properties, process- alpha/numeric i n which each entry of one domain i s more or l e s s u n r e l a t e d numeric Codd's i n a domain i s based domain such as c o r r e c t each domain i s independant of a l l o t h e r s . (like 1 the to other domains i n t e l l i g e n c e and dependancies an ISA h i e r a r c h y f o r domains amoung infer- domains. to r e l a t e domains of objects eg. a MOVEABLE-OBJECT ISA PHYSICAL-OBJECTt ^. 15 McSkimin and Minker f o r m a l i z e the ISA h i e r a r c h y by d e f i n i n g a semantic graph as a l a b e l l e d tree i n which the root node i s the u n i v e r s a l category and the leaves are p r i m i t i v e categories[7,8,9]^ elements, which are a s s o c i a t e d with by the parent to be categories. " l e a f y " with many non/human/male/mam). mechanism i s given elements. j n t h i e (domain) formalism r the p r i m i t i v e c a t e g o r i e s , are i n h e r i t e d As can be seen i n t h e i r examples these trees categories, There i s much f o r i n s u r i n g that the some with unnatural names d u p l i c a t i o n of category duplicate categories (such names contain tend as but no the same PAGE Also, McSkimin and Minker between c a t e g o r i e s ) without the p a r t i t i o n . define a defining partitioning (disjointness an update a l g o r i t h m that w i l l 2 condition maintain In s h o r t , though they d e f i n e the ISA h e i r a r c h y , they do not i n c l u d e a mechanism f o r e n f o r c i n g data c o n s i s t e n c y . Reiter has adopted predicate the idea of a typed calculus ^ ' 1 1 ^. database He r e p l a c e s the idea member of a domain with the proof t h e o r e t i c domain i s a monadic p r e d i c a t e i n which It of t h i s and develop paper the element hence Here the i s i t s argument. typed database a The are beyond the be r e f e r r e d database (TDB) INSTANCES) typed the necessary mathematics and a l g o r i t h m s . h e r e a f t e r be r e f e r r e d to as an INSTANCE. f o r which JOHN i s an INSTANCE. type being to propose a s t r u c t u r e f o r R e i t e r ' s A domain i n Codd's terminology w i l l element w i l l of an element paper. i s the purpose database i n the monadic n o t i o n of p r o v a b i l i t y . s t r u c t u r e , mathematics and algorithms f o r t h i s scope of h i s formalized JOHN i s s a i d i s then a l o g i c a l satisfy certain theory to as a TYPE. An F o r example, HUMAN i s a TYPE to have the property HUMAN. A i n which groups of TYPEs (and mathematical relationships called type relationships. In order f o r a TDB to be p r a c t i c a l i t must s a t i s f y 1) I t must be d e s c r i p t i v e l y than simple set certain conditions: r i c h , c a p t u r i n g type r e l a t i o n s h i p s containment. other PAGE 2) I t must be s t r u c t u r e d information 3) I t must so as to encourage only 3 the most natural content. admit algorithms that are computationally efficient in both time and space. 4) I t must admit violating a updates type i n such a way relationship. as to prevent ( i e . enforce a user integrity from con- straints) S e c t i o n B motivates the i n c l u s i o n of type tainment. C section Section C which d e f i n e s Structuring properties final and defines an a c y c l i c a TDB. TDB Condition and introduces 2 than set con- i s addressed a graphic i s completed i n s e c t i o n D by imposing the s o - c a l l e d c o n d i t i o n on TDBs. reducing then r e l a t i o n s h i p s other With t h i s of the TDB and c o n s i s t e n c y follow. Section E section details consistency an update algorithm of the TDB are maintained notation. concreteness a l l the d e s i r e d set t h e o r e t i c i s devoted a given TDB to one which i s e f f i c i e n t in to an a l g o r i t h m f o r i n both time and space. which during ensures updates. that The concreteness PAGE Section B Motivation To see why a more robust set of type r e l a t i o n s h i p s ment i s important c o n s i d e r the f o l l o w i n g statements: JANE i s a WOMAN; hence JANE i s a FEMALE (ISA) 2) JANE i s a FEMALE; hence JANE i s a WOMAN or 4) Statements JANE i s a GIRL (but not both) a) JOHN i s a BACHELOR; hence JOHN i s a MAN (ISA) b) hence JOHN i s SINGLE (ISA) JOHN i s a BACHELOR; JOHN i s a MAN and JOHN i s SINGLE; 3a and 3b bear a resemblance set containment. how FEMALE TYPEs. than simple set c o n t a i n - 1) 3) 4 However, i n view i s composed of hence JOHN i s a BACHELOR to statement statements 1 i n that both 2 and 4 we see that of WOMAN and GIRL - WOMAN and GIRL being I n c o n t r a s t MAN and SINGLE i n t e r s e c t display to give BACHELOR. some- disjoint The standard ISA h i e r a r c h y does not provide the formal e q u i v a l e n t of statements 2 and 4. An i n f e r e n c e of type 2 i s important i n that we would not want JANE to have the p r o p e r t y FEMALE without her having one of the p r o p e r t i e s WOMAN o r GIRL. S i m i l a r l y we would not want JANE to have both the p r o p e r t i e s WOMAN and GIRL. As seen i n the f o l l o w i n g dition, The as sections, "concreteness" s a t i s f i e s the former con- c o n s i s t e n c y the l a t t e r . type r e l a t i o n s h i p having reservation INSTANCES system of type 4 i s important when we want which JOHN are common might have to a set of TYPEs. the p r o p e r t i e s to d e f i n e a TYPE I n an a i r l i n e NON-SMOKING, COACH, PAGE MOVIE-WATCHING and 747 from which the system could infer that 5 JOHN i sa member of CABIN-B. The way the above of type statements pair suggests the need f o r only two c l a s s e s relationships: A) The TYPE FEMALE i s e q u i v a l e n t to the d i s j o i n t u n i o n of the TYPEs WOMAN and GIRL B) The TYPE BACHELOR i s e q u i v a l e n t to the i n t e r s e c t i o n of the TYPEs MAN The and SINGLE union i n A need not be d i s j o i n t , C) f o r example: The TYPE PEOPLE i s e q u i v a l e n t to the union o f the TYPEs L E F T — HANDED and RIGHT-HANDED ( a l l o w i n g f o r the Graphically ambidextrous) these are r e p r e s e n t e d as: A) • B) FEMALE WOMAN MAN GIRL Set t h e o r e t i c a l l y SINGLE BACHELOR these can be r e p r e s e n t e d a s : A) FEMALE = WOMAN # GIRL B) BACHELOR = MAN n SINGLE C) PEOPLE = RIGHT-HANDED U LEFT-HANDED C) PEOPLE RIGHT- LEFT- HANDED HANDED PAGE In the g r a p h i c a l representation, the TYPE o f which i t i s a subset. written the more The four restricted original TYPE 6 i s drawn below statements can now be i n terms of s e t s : 1) JANE £ WOMAN implies JANE £ FEMALE 2) JANE £ FEMALE implies JANE £ WOMAN Q GIRL 3) JOHN £ BACHELOR implies JOHN £ MAN 0 SINGLE 4) JOHN £ MAN fl SINGLE implies JOHN £. BACHELOR PAGE Section C The Formal Preliminaries set n o t a t i o n of the previous s e c t i o n TYPE r e l a t i o n s h i p s . 7 The f i r s t i s convenient for visualizing order p r e d i c a t e c a l c u l u s , the however, o f f e r s a r o 131 well our developed results proof within theory a first » 1 . J order For this framework. reason Refer i n t e r p r e t a t i o n given to s p e c i a l symbols used i n what Definition we shall to appendix A develop f o r the follows. TYPE Database (TDB) A TDB i s a 4-tuple ( C , P, I , E ) s a t i s f y i n g : 1) C is a finite set of c o n s t a n t s . 2) P is a finite set of unary p r e d i c a t e s i g n s , c a l l e d TYPES. 3) E i s a finite set of atomic P £ P E 4) i s called I is a finite formulae of the form the e x t e n s i o n of the TDB. set of w e l l formed formulae of the form: (x) Px = P j x A. P x A ... P x ii) (x) Px = Pj^x v P x v ... P x 2 n 2 where [P, P ^ the event where c £ C. and i) In Pc or_ n P , 2 ... P ] C P. n that I c o n t a i n s a formula of the form ( i i ) , i t may a l s o c o n t a i n a l l of the formulae: iii) (x) ~ (P.jX A P J X ) f o r each 1 <= i < j <= n. In which case the formula ( i i ) i s s a i d I i s called the i n t e n s i o n of the TDB. to be e x c l u s i v e . PAGE 8 For Example: (C, P, I , E ) i s a TDB where: C = [John, mary] (INSTANCES) P = [FEMALE, WOMAN, GIRL, MAN, SINGLE, BACHELOR] (TYPES) I = [(x) BACHELOR x = MAN x A SINGLE x, (x) FEMALE x = WOMAN x v GIRL x, (x)~ (WOMAN x A GIRL x) ] (intension) E = [WOMAN mary, MAN John, SINGLE John] (extension) Notation: 1) F o r b r e v i t y , the v a r i a b l e 'x' w i l l 2) (C, P, I , E ) w i l l be r e f e r r e d to as T's constants, i n t e n s i o n and e x t e n s i o n 3) Members of C w i l l be dropped where understood. where c l a r i f i c a t i o n predicates, i s required. be w r i t t e n i n lower case, members of P i n upper case. 4) F o r reasons of n o t a t i o n a l convenience we sometimes write W £ I T131 where W i s a c t u a l l y a c l a u s e 1 J « 5) A TDB w i l l be s a i d to be c o n s i s t e n t whenever I U E i s . 6) When we w r i t e T \— W we w i l l mean I U E \~ W f o r any formula W. 7) R (U, V) w i l l r e f e r to the resolvents'- "^ 1 With t h i s d e f i n i t i o n of a TDB we can r e p l a c e of the formulae U and V. the n o t i o n a member of a TYPE (John £ BACHELOR) with the p r e d i c a t e provability ( I U E \~ BACHELOR John). of an INSTANCE being c a l c u l u s n o t i o n of However, we would s t i l l a type as a set with c e r t a i n set t h e o r e t i c r e l a t i o n s h i p s . like to view PAGE Definition The Extension extension ||P|| The of P £ P w r i t t e n = t [c | c £ C, s u b s c r i p t on ||P||-J. w i l l c £ ||p|| f o r T f~ Pc. the l a s t 9 ||P||^ i s d e f i n e d : I U E f- Pc] be dropped where i m p l i c i t . Hence we can w r i t e The set t h e o r e t i c r e l a t i o n s h i p s between extensions s e c t i o n do not a u t o m a t i c a l l y follow. of We w i l l give c o n d i t i o n s under which each of the r e l a t i o n s h i p s given do f o l l o w . A c y c l i c i t y of a TDB The idea of a TYPE viewed as an ISA h i e r a r c h y of g e n e r a l i z a t i o n and r e s t r i c t i o n . type c o n t a i n i n g hierarchy another ( i e . Definitions The L o g i c a l l y we would l i k e ||P|| C ||Q ||). of TYPES, m o t i v a t i n g i s f i r m l y r e l a t e d to the ideas This leads the f o l l o w i n g a c y c l i c i t y to think of one to the n o t i o n definition^. ALPHA, BETA and TAU r e l a t i o n s ALPHA , T BETA T and TAU T are d e f i n e d on a TDB, T, over P x P: 1) P ALPHA Q T i f f Q E Pj v P 2 v ... P n £ I such that n £ I such that P = P^ f o r some i , 1 <= i <= n 2) P BETA T Q Q = Q 3) The cit. TAU T = i f f P EE Q i x A Q 2 A ... Q f o r some i , 1 <= i <= n ALPHA T U BETA T s u b s c r i p t s on ALPHA^, BETA^, and TAU^, w i l l A TDB i s s a i d to be a c y c l i c be dropped where i m p l i - i f f TAU"" i s i r r e f l e x i v e . 1 ie. of a PAGE P Q T A U P and l Pj T A U implies The graphical representation P representing l P 2 P and 2 not P n TAU ... P P Q P. T A U n - 1 10 (n > 0) of S e c t i o n B can be adopted with * ' * n P (x) Px — PjX A P x A ... P x ( 4 i ) 2 n and with P r representing l r 2 * * * n r (x) Px = PjX v P x v ... P x ( 4 i i ) 2 R and finally P P representing 2 . . . P n (x) Px = P j x v P x v ... P x , ( x ) ~ ( P x 2 n i A PjX) Diagram C . l g i v e s an example of t h i s g r a p h i c a l r e p r e s e n t a t i o n By convention, ( 4 i i and 4 i i i ) , of I . i f P TAU Q then P i s d i r e c t l y below Q i n the graph. TDB i s a c y c l i c when and only when i t s graph i s a c y c l i c . Throughout Thus a this paper the a c y c l i c i t y of a l l TDBs i s assumed. The f o l l o w i n g lemma gives us the property of set containment desired: PERSON PAGE Lemma C . l Generalization I f T i s a TDB, [P, Q] C P and P TAU+ Q then T (- (x) Px 3 Proof: Qx By i n d u c t i o n on n i t i s s u f f i c i e n t P TAU Q Case n = 1 implies to prove T |~ (x) Px =3 Qx P TAU Q Subcase a P ALPHA Q then Q = P 1 v P 2 v ... P such that P = P , 1 <= i <= n i In c l a u s a l Subcase b form P. v Q £ I P BETA Q then P = Q (x) Px D Qx £ I or 1 A Q 2 such that Q = Q , ± In c l a u s a l form P v Q Case n > 1 Induction £ I or A ... Q £ I n 1 <= i <= n (x) Px 3 Qx £ I P TAU Q n assumption: P TAU n _ 1 Since P TAU P TAU n n _ 1 Q l Q then Q : T |~ (x) Px 3 implies f o r some and Q Then T \~ (x) Px 3 Q^x T (- (x) Qjx 3 Qx Therefore i £ I n x Q-^ £ Qx { f o r any P, TAU Q by i n d u c t i o n assumption by case n = 1 T (- (x) Px Z? Qx QED and Q : £ P 12 PAGE Corollary C.l Containment I f T i s a TDB, [P, Q] C then In terms 13 P and P TAU+ Q HQII ||P||C of the graph of I , we are assured that i f a path e x i s t s between P and Q i n a ' v e r t i c a l ' d i r e c t i o n then every INSTANCE of the TYPE P i s an INSTANCE of the TYPE Q. c h o i c e f o r i f P TAU+ We can see now and Q TAU Q following and part lemma e s t a b l i s h e s For ||Q|| C in a practical the r e l a t i o n s h i p 4 i i n the d e f i n i t i o n of a Lemma C.2 acyclicity ||p|| C P then This would c e r t a i n l y be of no use The why T, if P = ? l P A 2 ||P|| = H p j n then n £ I ||p|| n ... ||P|| 2 n Proof: Case 1 c £ ||P|| (or T f - P c ) In c l a u s a l form Px v P^x f o r each i , 1 <= By r e s o l u t i o n T \~ P^c c £ £ I i <= n implying c £ ||p J| or IIPJI n ||P || n . . . ||p|| 2 n ||p|| = ||Q||. between set i n t e r s e c t i o n TDB. •.. P A ||P|| or TDB. Intersection a TDB, i s the only p r a c t i c a l PAGE ^se 2 c e Hi?!II n l|p ll 2 (or T f- P c i In clausal form By r e s o l u t i o n n ... ||P || n f o r each i , Px v P^x v P x i 2 v 1 <= i <= ... P x n n) £. I T |— Pc implying c £. ||p|| S e c t i o n D develops a p a r a l l e l r e s u l t formula P E P , 14 v P~ v ... P . z n relating the union of extensions to the PAGE Section D 15 The D e f i n i t e Component of a TDB I t has been recognized by R e i t e r ' - ' and M i n k e r ' ^ 1 1 that r e s t r i c t i n g inten- s i o n a l i n f o r m a t i o n to Horn c l a u s e s ( c l a u s e s c o n t a i n i n g at most one p o s i t i v e l i t e r a l ) e f f e c t i v e l y c o n t r o l s the s i z e of the proof search space o v e r l y c o n s t r a i n i n g the expressiveness of the formalism. by viewing Horn clauses as p r o c e d u r e s ' ^ without T h i s can be seen i n a n o n - d e t e r m i n i s t i c programming language s i m i l a r to MICRO-PLANNER^ ^. I e . , the formula, 14 P 1 A P 2 A ... P n D P, could be programmed, THE_CONSEQUENT_OF ( P ^ P , ... P ) 2 and IS P; the c l a u s e , P [ v P^ v ... P^, could be programmed, THE_CONSEQUENT_OF ( P p ? , ... P ) 2 Proofs done i n t h i s way are reasonably are ever generated. IS INCONSISTENCY. e f f i c i e n t as only u n i t consequents Thus the maximum number of procedure f i r i n g s i s bounded by the number of elements i n P. In t h i s s e c t i o n we w i l l c l a u s e s ) to develop Definition Given a TDB, c o n s i d e r clauses of the former form the d e s i r e d r e s u l t on union (definite of e x t e n s i o n s . D e f i n i t e Component T = (C, P, I , E), the d e f i n i t e component of T, A ( T ) , i s d e f i n e d as the TDB, (C, P, [W £. I | W i s a d e f i n i t e c l a u s e ] , E) PAGE Note that E c o n t a i n s only d e f i n i t e c l a u s e s . 16 We can now see e x a c t l y what parts of a TDB are d e f i n i t e : W £I C l a u s a l Form P v P P EE P 1 A P2 ... P A n P ? P EE P^ v P 2 v ... P n P v P , l s v P x 2 Definite? 2 v ... P v P, n ... P v P v P P 2 v P, P v Pj^ v P 2 v ... P l n ... P yes v P n A yes no n P v Q ~- (P yes no Q) We would l i k e a r e s u l t p a r a l l e l s i o n s to the formula P EE Pj v P to lemma C.2 but r e l a t i n g unions of exten2 v ... P . However, as the f o l l o w i n g example shows, i t would not n e c e s s a r i l y h o l d : FEMALE C = [jane] P = [FEMALE, GIRL, WOMAN] I = [FEMALE EE WOMAN v GIRL, \^ WOMAN GIRL ||FEMALE|| since = [jane] ~ (WOMAN A GIRL) ] E = [FEMALE and T \h WOMAN jane jane] || WOMAN || = and ||GIRL|| = [ ] T \h GIRL jane. So not only do we not have ||FEMALE || = 11WOMAN|| U ||GIRL|| but we have a TDB i n which u n n a t u r a l i n f o r m a t i o n i s contained. as a "WOMAN o r GIRL" without I t i s hard to c o n c e p t u a l i z e jane her being e i t h e r a WOMAN o r a GIRL. Actually, PAGE jane i s one of the two, we j u s t can't determine which. insist Concreteness The formula P EE P^ v P 2 v ••• c £ C , A ( T ) \~ Pc i m p l i e s 2) For these reasons we on a TDB being " c o n c r e t e " . Definition 1) p n ^ * i s c o n c r e t e i f f f o r every there i s an i such that A ( T ) (— P j C . A TDB i s c o n c r e t e i f f every formula of the above form i s c o n c r e t e . We could have d e f i n e d concreteness i n terms of p r o v a b i l i t y than A ( T ) . it The disadvantage i n the approach taken i s that equivalence must be' e s t a b - Theorem D.l accomplished on union of extensions can be this. The Davis and Putnam r e s u l t u s e d i n the proof of t h i s theorem a p p l i e s only f o r the p r o p o s i t i o n a l c a l c u l u s . rectify to maintain c o n c r e t e n e s s . between T and A ( T ) before the r e s u l t attained. from T r a t h e r U n f o r t u n a t e l y as we s h a l l see, i t turns out that at update time would then be e x c e s s i v e l y d i f f i c u l t lished 17 The f o l l o w i n g d e f i n i t i o n and lemma this. Definition P r o p o s i t i o n a l Component Given a TDB, T, and c £ C we d e f i n e the p r o p o s i t i o n a l component of T with respect to c, PROP^,c as: PROP c T = [P | Pc L E] U [W | (x)Wx t I] ((x)Wx r e p r e s e n t s any formula i n I with free v a r i a b l e x) PAGE Using the l a s t example: PROP Jane = [FEMALE, T Lemma D . l FEMALE = WOMAN v GIRL, ~ Propositional Given a c o n s i s t e n t then 18 (WOMAN A GIRL) ] Equivalence TDB, T, c £ C and P £ P T f~ Pc i f f PROP c (- P T Proof: Part 1) Assume E U I U [Pc] i s u n s a t i s f i a b l e . theorem, i f we r e p l a c e each clause i n I with the s e t of clauses obtained by r e p l a c i n g the f r e e v a r i a b l e 'x' with each member of C then the remaining s e t i s u n s a t i s f i a b l e . of I . I [Pc] cannot r e s o l v e C except c are i r r e l e v a n t the ground form of a g a i n s t any clause which does not have c's as i t s ground c o n s t a n t . the C a l l t h i s the ground form Since I U E was o r i g i n a l l y c o n s i s t e n t , U E is. By Herbrand's T h e r e f o r e a l l the constants i n to the i n c o n s i s t e n c y . I f we c o n s t r u c t c l a u s e set [Q | Qc £ E ] U [W | Wc £ ground form of I ] then we have a p r o p o s i t i o n a l l y unsatisfiable set. U [7] Therefore, PROP c \~ P. T P a r t 2) [Wc Assume PROPTc U [P] i s u n s a t i s f i a b l e . | W £ PROP c U [P]] i s u n s a t i s f i a b l e . substitution T of E U I U [ P c ] . Clearly T h i s i s a ground Therefore T (- Pc. PAGE 19 Theorem D . l a l s o r e q u i r e s that one f a c t o r out c e r t a i n n o n - d e f i n i t e c l a u s e s . These are the ones that a l l o w jane to be a "WOMAN o r GIRL" without the ne- c e s s i t y o f her being one of WOMAN o r GIRL. Definition Dead Clauses Given a TDB, T and c £ C then the dead c l a u s e s of T with respect to c, DEAD^c i s d e f i n e d : DEAD c = T [P 3 ? Lemma D.2 1 v P 2 v . .. P n £ PROP c T | P R O P ^ ) C \~ P, Dropping of Dead Clauses Given a concrete TDB, T, c £ C and W £ DEADTc then PROP c ~ DEADTc (~ W. T Proof: W = P O Pj v P and 2 P i n £ DEAD c T P R O P ^ ^ c \~ P. By concreteness PROP v ... P A ( T ) f o r some i , 1 <= i <= n, c t - P r subsumes W so P R O P ^ ^ c f~ W. T Now DEADTc contains no d e f i n i t e clauses so P R 0 P ^ ) C C PR0P c ~ D E A D c T T PROP c ~ DEAD c T T T (- W. QED giving T n > 1] PAGE We w i l l gorize now prove equivalence the set of clauses of T and A(T). The s t r a t e g y w i l l 20 be to cate- produced by r e s o l u t i o n w i t h i n PR0P c ~ DEAD c and show that i f a u n i t Qc i s provable from T then Q w i l l T be i n that s e t . T PAGE Theorem D . l T ~A(T) Given a c o n s i s t e n t concrete TDB, T, c £ C and Q £ P then T |- Qc 21 Equivalence i f f A ( T ) \~ Qc. Proof: Assume PROP c \~ Q. T |~ Q Then PROPTc ~ DEADTc So by Lemma D.2. PROP c ~ D E A D c U [Q] i s u n s a t i s f i a b l e . T T Define the f o l l o w i n g T T = Q i+1 = sets iteratively: PROP c ~ DEAD c U [Q] T T i w h e n T Q £• i £ £ T when T^ c o n t a i n s no p o s i t i v e u n i t s , = T ± ~ [X v P v Y £ T ] - ~ i [X v Y | X v 7v Y [X v 7v Y otherwise £ T ] ± U £ T ] i where ? i s the l e x i c o g r a p h i c a l l y f i r s t positive i n T^ and X and Y a r e negative p r e d i c a t e predicate The following If i s an i n v a r i a n t disjunctions, (possibly null) statement on T^: W £ T. then A) W = P such that P R O P ^ ^ c (- P f o r some P £ P T B) W = P j such that P = ? l v P 2 v ... P which i s e x c l u s i v e and PROP^ (T) c f~ P ± ( i 4 j ) or n £ I or unit and p o s i t i v e PAGE C) W = Qj v Q v ... Q 2 P = \° A P l [Qp Q 2 , A 2 ... P ... Q ] C T The (ra >= 1) such £ n that I and [ p , P , ... P ] and m P R O P ^ ) C r~ W D) W £ T v P 22 x 2 n or Q i n v a r i a n t statement i s proved by the f o l l o w i n g subproof: Subproof: By i n d u c t i o n on i Case i = 0 C l e a r by c o n d i t i o n D Case i > 0 If W £ T i + 1 then W £ T W = R (U, V) f o r [U, V] C T unit or ± with U as the l e x i c o g r a p h i c a l l y f i r s t ± i n T^. P R O P ^ ^ (— U by A and D above. T V i s of the form B, C or D above: B) V = P j such that which PROP P = ? i s exclusive A ( T ) c |- P i U = P j and P R O P ^ l v P v ... P 2 and (~ P j . (T) exclusiveness. T h i s v i o l a t e s the c o n s i s t e n c y v Q l P [ Q =E P, L F v ... Q 2 1 P~ A Z Q, ... 2 A P R O P ^ ) C (- V. T m v P of T, so V ^ P j . (m >= 1) ... P £ 1 n Q £ I ( i ¥ j) But P. v P j £. I by C) V = Q n F F L ] C [P l f and P , 2 ... P ] n and PAGE U = 23 f o r some i , 1 <= i <= m. C l e a r l y W has the same form as C with PROP^.-^ (— W u n l e s s m = 1 i n which case W = P w i t h P R O P ^ ^ \— P which i s i n the form A. [Q] D) V £ PROP c ~ D E A D c U T Case 1 T V = P U = ? l x v P 2 so P R O P ^ ) f - P T V = pT v PT v ... i z See C above. Case 2 Case 3 V=pTviT i J PROP^ )C f-V. £ I, t r i v i a l l y T which i s of form A. 2 F v P £ I n ( i < j) where P EE P,1 v P~Z v ... P„ n £ Assume without l o s s of g e n e r a l i t y I which i s e x c l u s i v e . that U = P^. Then W = P.. which i s i n the form B. Case 4 V = P v Pj v P 2 v ... P U = P. S i n c e P R O P ^ ^ c {- P then V £ DEAD^c. T But Case 5 But then V t T Q which i s a V = "Q then U = Q was the l e x i c o g r a p h i c a l l y T^ S O by c o n s t r u c t i o n T h i s completes contradiction. first unit i n T^ ^ = T^. + the subproof of the i n v a r i a n t statement on T^. As a c o r o l l a r y , T^ c o n t a i n s no non-unit p o s i t i v e c l a u s e s . By the r e s u l t s Since T Q of Davis and Putnam'^, T i s u n s a t i s f i a b l e , by i n d u c t i o n i + 1 i s u n s a t i s f i a b l e whenever T for a l l i , is unsatisfiable. i is. PAGE Also, T\ c o n t a i n s a p o s i t i v e u n i t pretation f o r every i , f o r i f not then the i n t e r - i n which every p r e d i c a t e i s taken as f a l s e i s a model f o r T^. Note that T^ ^ has at l e a s t one l e s s p r e d i c a t e symbol than + graphically first unit i n T^) u n l e s s T p o s i t i v e u n i t and since P i s f i n i t e , point Q £ T^ s i n c e 24 j i + = T^. Since T then T ^ ^ = always contains a f o r some i . + contains a p o s i t i v e u n i t . i (the l e x i c o - By the i n v a r i a n t At t h i s statement on T., P R O P ^ . - Q C (~ Q We can now r e l a t e part QED. 4 i i i n the d e f i n i t i o n of a TDB to the union of exten- sions. Lemma D.3 Union Given a c o n s i s t e n t concrete TDB, T and a formula P E P, v P v ... P £ I l z n 0 then ||P|| = || || U ||P|| Pl 2 U ... | | P J . Proof: Part 1 c £ ||p|| T (~ Pc. A ( T ) I Pc. By theorem D . l , By concreteness, f o r some i , giving Part or 2) T f P^c - c £ ||p |j ± In c l a u s a l form By - c £ ||P ||. or or \~ P^c i T f- P c f o r some i . i P^ v P £ I . r e s o l u t i o n T |— Pc or c £ ||p||. PAGE U n t i l now we have not of extensions of the d i f f e r e n t i a t e d between d i s j o i n t form P EE P^ not a l l ) a p p l i c a t i o n s we Lemma D.4 v P would l i k e 2 v ... P n £ I. and non-disjoint However, i n most disjointness s t r i c t l y union (but enforced. Disjointness Given a c o n s i s t e n t TDB, T, and then f o r each i , j , Proof: Let Then an e x c l u s i v e 1 <= i < j <= T (~ P ^ and This v i o l a t e s consistency, v Pj £ D i s j o i n t Union Given a c o n s i s t e n t concrete 2 v P 2 | | p j f) ||Pj|| = I. C o r o l l a r y D.l P EE Pj v P n, P EE P^ T |- P^c. P i formula c £ ||p.|| f | ||p..||. But v ... P then ||P|| = n HPJI TDB, so no T, and such c e x i s t s . an e x c l u s i v e formula £. I • ||P|| • 2 ... ||PJ. In summary: 1) For 25 conjunctions ||p|| = H P J Furthermore i f T i s c o n s i s t e n t f l ||P II H 2 and concrete: HPJI U 2) For d i s j u n c t i o n s ||p|| = 3) For exclusive disjunctions ||P || U 2 ... ||PJ ... ||P || ||P|| = ||p j || « N ||P|| 9 2 ... ||P|| n v ... P [ ]. n £ I PAGE Section E Transformation of a TDB to i t s Reduced Form One use of a TDB i s to answer the question quires t h i s f o r h i s "typed Algorithm and detects "Is T I Pc". [P | T f - as d e s c r i b e d Pc] g i v e n c £ C , i n [10,11]^ f o r a concrete TDB, any i n c o n s i s t e n c y i n PROPTc. As [P | T (~ Pc] may c o n t a i n a l a r g e number of P f o r any given c, we T h i s corresponds to a s s o c i a t i n g INSTANCES with [7,8,9]^ define only the Pc's necessary to span E . a reduced form of T, R(T), which contains in R e i t e r f o r one r e - - u n i f i c a t i o n algorithm" E . l below computes 26 p r i m i t i v e semantic categories -j^us, we can determine i f c £ ||p|| by l o o k i n g at the p r i m i t i v e c a t e g o r i e s below P. We have shown that formulae of the form P v P , v P ~ v . . . P £1 1 z n moved with no e f f e c t when T i s both concrete computing on the Horn component Definition of concrete H(T) = ( C , P, Note that E c o n t a i n s TDBs. [W £ I the Horn component Looking at t h i s case by C l a u s a l Form A Po A z ... P „ n of T, H ( T ) , as | W i s a Horn c l a u s e ] , E ) only Horn c l a u s e s . W £ I 1 P v P, , I PvP , z' 9 Horn? yes yes yes yes yes yes no no no yes ... P v P„ n 0 ~ 1 v P 2 v ... P n (P A Q) The f o l l o w i n g a l g o r i t h m ? P 2 v P, P v Pj v P 2 v ... P 1 v P, ... P R v P n P v Q computes case: Definite? P, v P v ... P„ v P 1 z n P IE P This motivates Horn Component Given a TDB, T = ( C , P, I , E ) , d e f i n e P E P , and c o n s i s t e n t . can be r e - [P | T (— Pc] given c £ C PAGE Algorithm E . l T Q <- s 0 <- [ ] 27 For computing DOMAIN^c g i v e n c £ C PROP H ( T ) c FOR i = 0 to i n f i n i t y DO BEGIN IF c o n t a i n s no p o s i t i v e unit THEN DOMAIN^,c E E S , ± k <— P <— IF i , EXIT the l e x i c o g r a p h i c a l l y first p o s i t i v e unit i n T^ 7 £. T. THEN PR0P c i s i n c o n s i s t e n t , T D0MAIN c T s 1 + 1 <- T 1 + 1 <— S U [P] ± T i s undefined, EXIT i ~ [X v P v Y £ T ] ~ ± [X v P v Y £ T ] ± U [X v Y | X v P v Y £ T ] i where X and Y are d i s j u n c t i o n s t i v e predicates of negative and p o s i - respectively (possibly null). END Example: Using the I o f diagram C . l and letting: E = [SINGLE John, MALE John] and c = John, the a l g o r i t h m w i l l produce the f o l l o w i n g values: PAGE I 0 T.^T MALE 1 + 1 (dropped) T 1 + 1 — T. (added) MALE MAN 3 MALE BOY 3 MALE MALE 3 PERSON PERSON — (MALE A FEMALE) FEMALE PERSON MALE A ADULT 3 MAN ADULT 3 MAN MALE A CHILD 3 BOY CHILD 3 BOY PERSON FEMALE 3 PERSON ADULT 3 PERSON CHILD 3 PERSON SINGLE SINGLE BACHELOR 3 SINGLE MAIDEN 3 SINGLE SINGLE 3 ADULT ADULT ^ ( S I N G L E A MARRIED) MARRIED MAN A SINGLE 3 BACHELOR MAN 3 BACHELOR SINGLE A WOMAN 3 MAIDEN WOMAN 3 MAIDEN ADULT ADULT MAN 3 ADULT WOMAN 3 ADULT MARRIED 3 ADULT — (ADULT A CHILD) ADULT 3 MAN FEMALE A ADULT 3 WOMAN MAN CHILD MAN FEMALE 3 WOMAN MAN BACHELOR 3 MAN MAN 3 BACHELOR BACHELOR BACHELOR BACHELOR 28 PAGE c o n t a i n s no p o s i t i v e u n i t . 29 Thus: DOMAIN^John = [MALE, PERSON, SINGLE, ADULT, MAN, BACHELOR] The a l g o r i t h m can be implemented by hash i n d e x i n g p r e d i c a t e symbols copy of the formulae i n I i n which each f o r m u l a ) . they appear ( o n l y one copy i s made f o r For e f f i c i e n c y , p o s i t i v e u n i t s are kept i n a separate l i s t . When a formula of the form X v P v Y £ I i s to be dropped deleted. the into a i t i s tagged as When a formula of the form X v P v Y i s to be r e p l a c e d by X v Y p r e d i c a t e symbol P i s d e l e t e d from the copy. list. I f a p o s i t i v e unit results it i s added to the p o s i t i v e u n i t I f the n u l l c l a u s e r e s u l t s , DOMAIN c is declared to be undefined and the o r i g i n a l clause P v Q that caused the T c o n t r a d i c t i o n can be p r i n t e d f o r i n s p e c t i o n . If we assume that the time taken to d e l e t e the symbol P from X v P v Y i s constant, the time taken to add a symbol to the p o s i t i v e u n i t l i s t stant and the hash indexing i s performed i n constant time (by design) then the i s con- f o l l o w i n g i s a bound on the time c o m p l e x i t y ^ of the a l g o r i t h m : 0 # p r e d i c a t e symbols J Theorem E.1 H(T) in W r Algorithm E . l Correctness Given a concrete TDB, T, and c £• C If PROPTc i s c o n s i s t e n t then DOMAIN c = [P | T (~ Pc] T otherwise DOMAINTc i s undefined Proof: F i r s t , note that the c l a s s of Horn clauses i s c l o s e d under resolution PAGE ( i e . R(Wp the W) 2 i s Horn whenever only p o s i t i v e and W clauses i n It follows f o r a l l i are u n i t s . to a p p l y the D a v i s and P u t n a m t h e o r e m Case A are). 2 This 30 that i s needed below. PROP^c i s c o n s i s t e n t Subcase 1 For P £ DOMAIN^c. some i , P £ and hence Subcase 2 giving T ± PR0P ( )C H T \~ P PR0P c (~ P T T (- Pc Assume P $ 7 for a l l i ± D e f i n e H*(T) = H(T) U [7] Starting the a l g o r i t h m w i t h H'(T) r a t h e r rather Note that than H(T) we produce than when d e f i n e d , T^ = T. U [7] The a l g o r i t h m w i l l 1) [P, 7] C T^ terminate i n one of two ways: f o r some i Q] <£ T[ ([Q, when Q £ P and Q ^ P s i n c e Tg i s c o n s i s t e n t ) But then P £ T^ contradiction 2) T^ c o n t a i n s no p o s i t i v e u n i t ; but then the i n t e r p r e t a t i o n i n which every p r e d i c a t e symbol i s f a l s e i s a model f o r T^. Since T (— Pc by theorem D . l , A(T) (— Pc H(T) (~ Pc or Tg U [P] = Tg i s u n s a t i s f i a b l e . and P u t n a m ^ , T^ i s u n s a t i s f i a b l e which i s a tion. giving By Davis contradic- T h e r e f o r e P £ T^ f o r some i . The a l g o r i t h m must terminate s i n c e T ^ ^ c o n t a i n s a t l e a s t + one fewer p r e d i c a t e symbol then T . i Case B PROP^c i s u n s a t i s f i a b l e Statement: PR0P ^ ^c i s u n s a t i s f i a b l e H T Subproof: By lemma D.2 PR0P c ~ T DEAD c i s u n s a t i s f i a b l e . T If PAGE 31 we d e f i n e T Q as PROPTc ~ DEADTc and l e t the a l g o r i t h m d e f i n e for each i , then the i n v a r i a n t statements of theorem D . l hold with one a d d i t i o n : W £. The may be the n u l l c l a u s e . a l g o r i t h m w i l l terminate 1) [P, P] C first f o r some i where P i s the l e x i c o g r a p h i c a l l y u n i t i n T^. By statements A and B o f theorem D . l , A(j.)C \~ P PROP for i n one of two ways: some Q £. P, Since P v Q P vQ and £1 and PROP^^^c |- Q. £ H(T) and A(T) C H(T), PROP - c i s i n c o n s i s R( T) tent. 2) By the r e s u l t s of Davis unsatisfiable. and P u t n a m ^ , f o r each i , T^ i s Since T^ ^ contains at l e a s t one l e s s l i + teral than T^, f o r some k, T^ contains no p o s i t i v e als. The i n t e r p r e t a t i o n i n which every p r e d i c a t e symbol is false i s a model f o r T^ unless T^ c o n t a i n s c l a u s e . Since T^ i s u n s a t i s f i a b l e clause. the n u l l i t must c o n t a i n the n u l l T Q doesn't c o n t a i n the n u l l clause so f o r some i , [P, P] C T where P i s the l e x i c o g r a p h i c a l l y f i r s t i In t h i s case in liter- u n i t T^. though, the a l g o r i t h m w i l l have terminated as case A. T h i s ends the subproof. As i n the subproof, the a l g o r i t h m w i l l not terminate t a i n i n g no u n i t s ( S t a r t i n g with T Q = PROP^^^c Therefore, [P, P] C T DOMAINTc undefined. i with T^ con- unsatisf iable). f o r some i . The a l g o r i t h m w i l l stop QED T h i s ends the c o r r e c t n e s s proof of a l g o r i t h m E . l with PAGE The f o l l o w i n g c o r o l l a r y i s used 32 i n the proof of the update a l g o r i t h m . A(T) - DOMAIN^ E q u i v a l e n c e Corollary E . l Given a TDB, T (not n e c e s s a r i l y concrete or c o n s i s t e n t ) and c £. C 1) I f P £ DOMAIN c then T 2) I f A(T) h P c t h e n p A(T) H c p & DOMAIN c T or DOMAINTc i s undefined Proof: Part 1 The P £ D0MAIN c T invariant statements of theorem D.l h o l d with the a d d i t i o n that W may be the n u l l c l a u s e . Therefore Part 2 A ( T ) \~ Pc A ( T ) h Pc Hence H(T) (- Pc. I f T i s c o n s i s t e n t then by theorem E . l , P £• T^ Case A, subcase 2, f o r some i (we do not need theorem D.l here) or P £ DOMAIN c. T I f T i s u n s a t i s f i a b l e then by theorem E . l , Case B, H(T) i s u n s a t i s f i a b l e and DOMAINTc i s undefined. QED As a TDB, T, becomes l a r g e , [DOMAINTc | c £ C] may become very l a r g e . The f o l l o w i n g allows us to save only the Pc that span E. I e . , s t o r e only the 'lowest' extensions i n the graph. Definition The Reduced or C a n o n i c a l TDB Given a c o n s i s t e n t concrete TDB, T R(T) = (C, P, I , E), the reduced TDB, R(T), i s d e f i n e d : = (C, P, I , [Pc | P £ DOMAIN c and T i f Q £ DOMAIN c then T not Q TAU+ P ] ) PAGE 33 Using the example of diagram C . l : [P | Pc £ E] DOMAINTc = [SINGLE, MALE] = [MALE, PERSON, SINGLE, ADULT, MAN, but [P | Pc £ R(T)*s extension] T h i s i s a good space r e d u c t i o n . l a r g e r than T ' s . = BACHELOR] [BACHELOR] In f a c t , i n no case i s R(T)'s e x t e n s i o n Note that we cannot assume that the concreteness or c o n s i s t e n c y o f R(T) f o l l o w s from that of T. Lemma E . l Spanning Given a c o n s i s t e n t c o n c r e t e TDB, T, i f P £ DOMAINTc then f o r some Q £ P with Qc £ R(T)'s e x t e n s i o n , Q TAU* P Proof: Assume f o r each n >= 0 w i t h Qc £ R(T)'s e x t e n s i o n , that not Q TAU n P By d e f i n i t i o n of R(T)'s e x t e n s i o n , Choosing Pc £ R(T)'s e x t e n s i o n n = 0, Q = P we have a c o n t r a d i c t i o n f o r Q TAU^ P. R(T)'s e x t e n s i o n i s s a i d to span [Pc | T |— P c ] . G e t t i n g back to the o r i g i n a l problem, " I s T |— Pc?" we see that we need only scan R(T)'s e x t e n s i o n f o r Qc such that Q TAU* P. Theorem E.2 T - R(T) E q u i v a l e n c e Given a c o n s i s t e n t concrete TDB, T, then and i f Pc £ R(T)'s e x t e n s i o n then i f Pc £ E then T (- Pc R(T) (- Pc Proof: P a r t 1) Assume Pc £ R(T)'s e x t e n s i o n . P £ D0MAIN c T so by theorem E . l , T (~ Pc PAGE P a r t 2) or 34 Assume Pc £ E P £ DOMAIN c. T By the spanning such By lemma ( E . l ) f o r some Q £ P, that Qc £ R(T)'s extension, the g e n e r a l i z a t i o n lemma ( C . l ) , Q TAU* P. R(T) J— Pc QED C o r o l l a r y E.2 Given a c o n s i s t e n t concrete TDB, T, c £ C, P £ P, T f- Pc i f f R(T) f~ Pc C o r o l l a r y E.3 Given Concreteness of R(T) a c o n s i s t e n t concrete TDB, T, R(T) i s concrete Proof: Let £1 P E P , v P, v ... P l l n Assume f o r some c £ C , A ( R ( T ) ) \~ Pc Then T f— Pc a n d A X ) \~ T P by theorem E.2 by theorem D . l c a n d A X T ) (— P^c by T's concreteness ( f o r some i , 1 <= i <= n) T h i s gives P ^ £ DOMAINTc by theorem D.l By the spanning Q TAU* P By f o r some Qc £ R(T)'s A(R(T))(- i C o r o l l a r y E.4 p Q c r e s o l v e s only E D C o n s i s t e n c y of R(T) a c o n s i s t e n t concrete TDB, T, R(T) i s c o n s i s t e n t Proof: Let S = [Pc | P £ DOMAIN c] U T [Pc extension, i the g e n e r a l i z a t i o n lemma ( C . l ) (which clauses), Given lemma ( E . l ) , | P £ P and S i s a model f o r H ( T ) [ 1 2 ^. P I DOMAINTc] definite PAGE I f i t i s not a model f o r T then f o r some P = Pj_ v P P £ DOMAINTc 2 v ... P and each £ I, fl P ± $ DOMAINTc, 1 <= i <= n T h i s c o n t r a d i c t s concreteness, so S i s a model f o r T. R(T) i s concrete so s i m i l a r l y S i s a model f o r R ( T ) . Thus R(T) i s a s u i t a b l e replacement f o r a c o n s i s t e n t TDB, T. QED 35 PAGE Section F Updating a TDB Thus f a r we have shown that a TDB w i l l have the d e s i r e d p e r t i e s i f concreteness and c o n s i s t e n c y canonical are maintained. We have g i v e n a both space and time i n answering "Is T (— Pc?". an a l g o r i t h m creteness and additions We must now spe- and c o n s i s t e n t TDB, T, and c £. C the a l g o r i t h m so as to leave T and c o n s i s t e n t . Algorithm while that w i l l update the c a n o n i c a l TDB w h i l e guaranteeing to and d e l e t i o n s from DOMAIN c algorithm w i l l computed con- consistency. G i v e n a concrete concrete set t h e o r e t i c pro- (reduced) form of the TDB that can be e f f i c i e n t l y conserving cify 36 I t i s assumed produce l ' = F.l that o r i g i n a l l y accepts the r e s u l t i n g TDB T = R(T). The R ( T ' ) . F o r updating a c a n o n i c a l TDB T, g i v e n c £ C UPDATE c: T BEGIN WRITE ( c , "has the root p r o p e r t i e s : " , S: INPUT (A C P IF (additions) [Pc | P £ D] <£ E WRITE (D ~ and DCP [P | Pc £ E ] ) (deletions) ) THEN [P | Pc £ E ] , "cannot be d e l e t e d , t r y again") GOTO S L: T* < — (C, P, I , E U [Pc | P £ A] — Calculate [Pc | P £ D]) DOMAINTc CONCRETE <-- TRUE E: IF DOMAIN^ic i s undefined (because P v Q £ I ) THEN WRITE ( c , "cannot be both a", P, "and a", Q, " t r y a g a i n " ) GOTO S PAGE C: FOR EACH P EE P, v P 9 £ I such that v ... P P £ DOMAIN^,tc but no DOMAINTc WRITE ("A", P, "must a l s o be one of", P j , P , 2 "add IF 37 ... P , one of them") CONCRETE < — FALSE NOT CONCRETE THEN INPUT (A* C p ( a d d i t i o n s ) ) A < — A U A* GOTO L C a l c u l a t e R ( T ' ) given DOMAINTc T * < — and T ' * R ( T ' ) . WRITE ("update accepted") END f o r c-^ 4 c need not Note that i n the step marked *, DOMAIN tc^ T be r e c a l c u l a t e d as DOMAIN^tc^ = DOMAIN c^. The f o l l o w i n g theorem T g i v e s the update p r o p e r t i e s d e s i r e d . Theorem F . l Algorithm F . l Correctness Given a c o n s i s t e n t concrete resultant and c a n o n i c a l TDB, T and c £ C then the c a n o n i c a l TDB, T ' , i s c o n s i s t e n t and c o n c r e t e . Proof: Part 1 Concreteness Let P E P j By v P 2 v ... P n £ I and c o r o l l a r y E . l , s i n c e DOMAINTc algorithm terminated termination, /\(T*) (~ Pc i s defined i n order f o r P £ DOMAIN^tc. Since (CONCRETE i s TRUE), f o r some i , ? £ ± C o r o l l a r y E . l gives A( ') T H p i c ' the a l g o r i t h m DOMAINTc. PAGE Part 2 Consistency If T 1 i s not consistent then s i n c e T i s c o n s i s t e n t , PROP^tc must be u n s a t i s f i a b l e . T* 38 i s concrete tradicts Theorem E . l a p p l i e s saying DOMAIN^ic i s undefined. termination of the a l g o r i t h m . since This Therefore T coni s con- 1 sistent. By The c o r o l l a r i e s E.3 and R ( T ' ) i s concrete E.4, f o l l o w i n g i s a sample update using Input Response - Update John - John has the consistent. the I of diagram C . l and or E properties - A=[ADULT] and root QED an empty E . E * E = [ ] E' = [ADULTJohn] E ' = [ADULTJohn, [ ] - An ADULT must be one of MARRIED or SINGLE, add one. - A=[MALE,SINGLE] - Update accepted MALEJohn, SINGLEJohn] - Update John - John has the properties - A=[MARRIED] and D=[SINGLE] - A=[MARRIED] - root try [BACHELORJohn] E = [BACHELORJohn] E' = [MARRIEDJohn, be again - John cannot be MARRIED and = [BACHELOR] [SINGLE] cannot deleted, E ' both SINGLE, t r y BACHELORJohn] again A=[MARRIED] and D=[BACHELOR] A=[MALE] - An ADULT must be one MALE or FEMALE, add - Update accepted of E ' = [MARRIEDJohn] E* = [MARRIEDJohn, one MALEJohn] PAGE 39 I t may seem redundant that we must "Add MALE" when we d i d not e x p l i c i t l y del e t e MALE. general However, when BACHELOR i s d e l e t e d , has no way i n o f determining i f MAN and/or SINGLE a r e to remain i n the TDB. ther than t r y i n g to second guess the user, any the a l g o r i t h m dropped the a l g o r i t h m Ra- asks him to c o r r e c t information. Before the f i r s t update, when E = [ ], f o r the update theorem to apply, must be concrete and c o n s i s t e n t . I t i s c e r t a i n l y concrete. to be c o n s i s t e n t , I must be c o n s i s t e n t . In order T for i t The i n t e r p r e t a t i o n i n which a l l pre- d i c a t e symbols are f a l s e under the i n t e r p r e t a t i o n i s a model f o r I , so i t i s consistent. Thus by i n d u c t i o n on the update theorem we are assured that i f we s t a r t with E = [ ] then T w i l l always be concrete and c o n s i s t e n t a f t e r the update. Although there are no i n c o n s i s t e n c i e s i n I , there may be c o n s t r a i n t s prevent c e r t a i n kinds given Ie. of updates from t a k i n g p l a c e . some P £ P, f o r no c £ C i s Pc £ E while T i s that By t h i s we mean that consistent. ||p|| = [ ] always. Therefore i f the TDB i s to be kept c o n s i s t e n t , c l e a r l y a case of bad s t r u c t u r i n g of I . ||P f| = [ ] always. 1 This i s I t can be avoided by ensuring that PAGE f o r a l l P £ P, I U [Px] i s c o n s i s t e n t . E = [Pc] f o r any c £ C and c a l c u l a t i n g badly s t r u c t u r e d above P. This can be guaranteed DOMAINTc. 40 by l e t t i n g I f undefined, then I i s PAGE Summary and Conclusions A generalized to r e p l a c e rich structure has facility so-called concreteness and that database algorithm has along with given domain. was not been proposed the more r e s t r i c t e d structuring the 41 a has been and of imposes the desired specified for that set ISA only consistency. the technique notion for Reiter's hierarchy. the Results natural It requirements concreteness testing i f a specific and to properties. element An and a of ensure update consistency i s a member of A means f o r t e s t i n g a non-atomic formula against included provides have been given theoretic guarantees typed database the a database i s an open t o p i c . While t r a d i t i o n a l r e l a t i o n a l databases suggest no need f o r a typed database, work i n a r t i f i c i a l often deficient because Since the TDB well structured power of provides and mechanisms b u i l t tive i n t e l l i g e n c e and of applicability. the need We host f o r a more database interdependant the most rudimentary l e v e l mathematically on i t stand a inferential on f i r m ground. inferential suggest complete that the database proposed in of systems domain inferencing order that I t must not limit system, thereby s t r u c t u r e and structure. i t must be inferencing the descrip- reducing algorithms typed database are adequate f o r managing the domain s t r u c t u r e of database systems. are its for a inferential PAGE 42 Bibliography 1. Aho, A.V., H o p c r o f t , J.E., Ullman, J.D. (1974). of Computer A l g o r i t h m s . 2. 3. Addison-Wesley, Reading, Mass., 1974. Chang, C.L., and Lee, R.C.T. (1973). Theorem P r o v i n g . Date, C.J. (1975). The Design and A n a l y s i s Symbolic L o g i c and M e c h a n i c a l Academic P r e s s , New York, 1973. An I n t r o d u c t i o n to Data Base Systems. Addison- Wesley, Reading, Mass., 1975. 4. D a v i s , M., and Putnam, H. (1959). A Computational Proof Procedure. Rensselaer P o l y t e c h n i c a l I n s t i t u t i o n , 5. G a l l a i r e , H., Minker, J . , and N i c o l a s , J.M. (1978). I n t r o d u c t i o n t o L o g i c and Data Bases. Gallaire 6. Troy, New York, AFOSR TR 59-124. An Overview and L o g i c and Data Bases, Eds. H. and J . Minker, Plenum P r e s s , New York, 1978, P.12. G r i e s , D. (1971). Compiler C o n s t r u c t i o n f o r D i g i t a l Computers. John Wiley and Sons, I n c . , 1971. 7. McSkimin, J.R. (1976). The Use of Semantic I n f o r m a t i o n i n Deductive Question-Answering Systems. Ph.D. T h e s i s , Dept. of Computer S c i e n c e , Univ. of Maryland, C o l l e g e Park, Md., 1976. 8. McSkimin, J.R., and Minker, J . (1977). The Use of a Semantic Network i n a Deductive Question-Answering System. Dept. of Computer S c i e n c e , Univ. of Maryland Tech. Report TR-506, 1977. PAGE "9. Minker, J . (1978). Logic. 43 An E x p e r i m e n t a l R e l a t i o n a l Data Base System Based on L o g i c and Data Bases, Eds. H. G a l l a i r e and J . Minker, Plenum P r e s s , New York, 1978. 10. R e i t e r , R. (1977). An Approach t o Deductive Question-Answering. n i c a l r e p o r t 3649, B o l t Beranek and Newman Inc., Cambridge, Tech- Mass., Sept. 1977. 11. R e i t e r , R. (1978a). Bases. Deductive Question-Answering on R e l a t i o n a l Data L o g i c and Data Bases, Eds. H. G a l l a i r e and J . Minker, Plenum P r e s s , New York, 1978. 12. R e i t e r , R. (1978b). On c l o s e d world data bases. L o g i c and data bases, Eds. H. G a l l a i r e and J . Minker, Plenum P r e s s , New York, 1978. 13. Robinson, J.A. (1965). Principle. J . ACM 12 (Jan. 1965), 25-41. 14. Sussman, G., Winograd, Reference Manual. 15. Winograd, 1972. A Machine O r i e n t e d L o g i c Based on the R e s o l u t i o n T., and Charniak, E. (1970). MICRO-PLANNER A . I . MEMO no. 203, M.I.T., Cambridge, T. (1972). Understanding N a t u r a l Language. Mass., 1970. Academic Press, PAGE Appendix A D i c t i o n a r y of Symbols Sets: [ ] - Used i n set d e f i n i t i o n s as a replacement f o r £ , € ~ Set membership and C , <£. - Set n, - Set i n t e r s e c t i o n and - D i s j o i n t union - Set d i f f e r e n c e U $ i n c l u s i o n and | - Set q u a l i f i c a t i o n ||p|| - Extension non-membership non-inclusion union (read "such that") of a p r e d i c a t e , (see section C) P r e d i c a t e C a l c u l u s Formulae: A - Conjunction v - D i s j u n c t i o n symbol ^> - Implication EE - Equivalence -"-^ - Negation P - Also (x) - Universal quantificaion |—, |/- - Symbols f o r p r o v a b i t y - The R(Wj, W) 2 symbol negation and non-provability r e s o l v e n t s of formulae W 1 and W l 2 J 44 PAGE Mathematical 45 Relations:^ ALPHA, BETA and TAU - Relations defined i n section C R n - Composition of r e l a t i o n R, n times R + - T r a n s i t i v e c l o s u r e of r e l a t i o n R - R e f l e x i v e t r a n s i t i v e c l o s u r e of r e l a t i o n R R General Usage: EE 0 - read " i s defined as", i n definitions - computational complexity and of an a l g o r i t h m algorithms (order complexity)^^
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- An approach to the organization of taxonomies
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
An approach to the organization of taxonomies Bishop, Graig David 1981
pdf
Page Metadata
Item Metadata
Title | An approach to the organization of taxonomies |
Creator |
Bishop, Graig David |
Date Issued | 1981 |
Description | The ISA hierarchy used by present day inferential database systems is deficient in that it does not represent a variety of domain relationships (type relationships). Such hierarchies associate explicitly defined sets with the leaves of a tree. Non-leaf nodes in the tree inherit members from their child nodes. In keeping with the use of sets, this paper gives motivation for having type relationships other than subset. Included are union, intersection and a disjointness condition. A formalism for the typed database (TDB) is given, using the monadic first order predicate calculus as its theoretical basis. Non-unit formulae represent intensional (general) information about the world being represented and unit ground clauses represent extensional (specific) information. Predicates represent types, and constant symbols represent set members. This is connected to the set concept via predicate extensions. The extension is that set of constant symbols which are provable as arguments to the given predicate. Given the so-called concreteness condition and consistency of the TDB, the desired set theoretic relationships (union, intersection and disjointness) of predicate extensions follow. This strengthens the link between the formalism of the predicate calculus and the more natural set representation. A canonical form of a TDB is shown that admits an appropriate machine representation. Using this is can be determined if a constant symbol as an argument to a given predicate is provable (domain membership) in constant time. An update algorithm is developed and is shown to be correct in that it maintains concreteness and consistency. Thus the TDB is shown to be a practical generalization of the ISA hierarchy but is considerably more expressive. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-03-29 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051816 |
URI | http://hdl.handle.net/2429/22854 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-UBC_1981_A6_7 B58.pdf [ 1.96MB ]
- Metadata
- JSON: 831-1.0051816.json
- JSON-LD: 831-1.0051816-ld.json
- RDF/XML (Pretty): 831-1.0051816-rdf.xml
- RDF/JSON: 831-1.0051816-rdf.json
- Turtle: 831-1.0051816-turtle.txt
- N-Triples: 831-1.0051816-rdf-ntriples.txt
- Original Record: 831-1.0051816-source.json
- Full Text
- 831-1.0051816-fulltext.txt
- Citation
- 831-1.0051816.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051816/manifest