(a) is a lambda term called a lambda constant. Lambda constants were introduced to capture a kind of individual occurring naturally in the DDB domain: regulations. For example, to describe a typical degree program we must classify requirements of that program, e.g., \"nobody can register if they're under 16 years old\" refers to a regulation that 42 uses the lambda constant \\x l.\\3x2.age — of (x x,x2)A ^ 2—16] In this way regulations can be placed in relation to other individuals and sets, e.g., program — prerequisite(BScMajor3, first ,\\x1.[3x2.age —of [xx,x2)*x2^ 16]) says \"one of the regulations for the first year of a BScMajors program is that an individual be at least 16 years old.\" 3.1.2. Formula syntax The formulas of DLOG are defined inductively as follows: (Fl) atomic formulas: if tx,t2, •••,(„ are terms of sort TX,T2, • • • ,r„ respectively, and is an n-ary predicate constant, then (tltt2, •••,<„) is an atomic formula; A DLOG atomic formula is an n-ary predicate constant followed by a parenthesized list of DLOG terms of the appropriate sort. As explained above, the predicate constants in a multi-sorted language are slightly more complex, as they must encode their arity and the sort of each argument. In most of our examples, we write only the string of the predicate constant instead of the complete tuple; the sort of each argument should be obvious from the context. For example, the atomic formula program — prerequisite(BScMajors, {third * fourth}, \\x1.completed(x1,CS315)) would be more formally written as 7J(a;1)AT2(x2)A • • • Ar\"(j;n) where each r' is a boolean combination of monadic predicates. This restriction to monadic predicates is significant, as derivability of formulas of the monadic predicate calculus is decidable. In [Reiter83] this class is generalized to arbitrary formulas, but he acknowledges that their efficient application requires further research. Two further observations on the use of integrity constraints are worth noting. First, constraints may sometimes be used to derive new information as well as to constrain the consistency of updates [Nicolas78b, Minker83]. For example, consider the formula Vx Vy.supplies (x ,y)Dsupplier(x)t>part(y) (3.27) which asserts that if supplies(x,y) is true, then x must be a supplier, and y a part. As an integrity constraint, formula (3.27) would cause the rejection of an assertion like \"supplies (Acme,Widgets)\" unless both \"supplier (Acme)\" and \"part (Widgets)\" were already derivable. However, as a ordinary rule we can use formula (3.27) to deduce that Acme and Widgets have the required properties. Simi-larly, in the terminal session in figure 2.5, the assertion head(PCGilmore ,CS) was initially rejected because of the DDB constraint 62 Vz Vy.head(x ,y )D faculty — member (x) A department (y) There seems to be no concensus about how to decide in which way to use formulas such as (3.26). Second, because constraints are intended to assert invariant properties of databases, potentially violated constraints must be identified at update time. That is, upon each update, potentially violated constraints must be identified and re-derived to establish the consistency of the augmented database. Logic database researchers have usually distinguished integrity constraints from the rest of the data-base, but have not indicated how such a subset should be identified. This is presumably an issue of \"database design.\" In DLOG, constraints are intended to be used to specify invariant properties of those relations that a user may update. DLOG constraints are used only to verify database consistency, and cannot be used to answer queries. Furthermore, their syntax permits the definition of a meta predicate relevant that identifies a subset R of integrity constraints, i.e., RQICQDB, that must be re-derived whenever a new assertion is made. This subset is determined by the form of the assertion (see §5.2.4.2). The application of DLOG integrity constraints is specified in a modified version of Reiter's notion of constraint satisfaction [Reiter83]: if DB I- ic for all icUC, then A is assertable in DB only if DB U {A} I- ic for all icUC If we assume that the current set of constraints IC is derivable, the consistent update of DB with new assertion A is possible only if DB U {A} r~ ic, for all icZIC; the DLOG specification uses the meta predicate assertable(A) as the specification of the DLOG mechanism for applying integrity con-straints. Recall that the DLOG syntax for constraints has the form 'A D B'; the DLOG database maintenance specification reveals it to be an abbreviation for 'as3ertable(A)Cderivable(By. The class of constraints treated in [Reiter81] are equivalent to relational schemas, so DLOG's constraint facility also captures the database notion of relation schemas [Stonebraker75]. Further-more, it is part of the data model — a constraint is simply another part of a DLOG application data-63 base. Chapter 4 Descriptive terms in DLOG Chapter 3 provided a rather detailed description of the DLOG data model, with little emphasis on the motivation for and intuition behind DLOG's complex terms. This chapter presumes to supply the motivation, with further examples of their intended use. The motivation for including descriptive terms in DLOG comes from work in Artificial Intelli-gence, where the ability to refer to objects with descriptions is of considerable conceptual advantage in specifying symbolic representations of domains (e.g., [Moore76, Schubert76, Norman79, Bobrow77b, Bobrow77a, Hewitt80, Steels80, Attardi8l]). Descriptions have also been used within the logic database literature (e.g., [Dilger78, DahI80, Dahl82]). The descriptive terms of DLOG are believed to be the most extensive currently employed in any logic-based representation language. In AI, part of the reluctance to adopt a logical interpretation of descriptions must be attributed to the long-standing confusion and controversy about their meaning — logicians do not generally agree on the semantics of descriptive terms.17 This reluctance to analyze descriptions in a logical framework probably results from a traditional misunderstanding of the role of logic (cf. [Hayes77]). DLOG descriptions were originally developed to help describe concepts in the Department Database (DDB) domain; they have been somewhat elaborated to help explain the role of logic in analyzing descriptions and to investigate the computational problems of including them in a representation language. The next section reviews all DLOG complex terms, and discusses their use and intended meaning in an informal way. Two subsequent sections, §§4.2, 4.3, discuss the traditional use and meaning of descriptions by focusing on issues that impact their interpretation in a finite symbolic database. The 1 7 See [Carroll78] for a good survey of the logical, philosophical and psychological interpretation of descriptions. 64 65 chapter's last section, §4.4, gives more examples of how DLOG descriptive terms are used in the DDB domain. 4.1. DLOG terms: motivation and intended use When designing a representation language, it is easy to become detached from the worlds to be represented; elaborate language features are often presented with little indication of how they are best used. For example, the KRL language [Bobrow77b, Bobrow77a] has been criticised in this way (e.g., see [McDermott78b, Lehnert78]). The need for embedded descriptions does arise in the DDB domain (e.g., see fig. 4.1 below). In DLOG's case, indefinite descriptions, lambda terms, and indefinite sets were motivated by domain considerations. Definite descriptions of both kinds were used rarely in the implemented DDB, but were included in order to investigate strategies for their implementation. 4.1.1. Individual constants DLOG individual constants are, by Tarskian semantics, intended to denote individuals in an interpretation. A constant identifier is usually selected as a name of an individual, but this need not be the case. The correspondence between constants and real objects is a semantic issue, so a constant like PERSON00169 might be used to denote the person named John Q. Smith. Conceptually, it is generally less confusing to choose identifiers mnemonically. An alternative is to specify a name rela-tion as a predicate of the theory, e.g., establish a correspondence like name(P000l69,\"John Q. Smith') name(P000072,\"Dweisel Zappa') The non-mnemonic constants might then be interpreted as denoting individuals in the domain, with the predicate name interpreted as a relation on individuals and strings. In this way, a user can build a database that includes a theory of aliases, e.g., 66 name(P000185,\"C7arfc Kent') name(P000185, \"Superman \") 4.1.2. Set constants DLOG set constants are formed from individual constants, so each member of a set is inter-preted as specified above for individual constants. A set constant is written as a collection of indivi-dual constants, e.g., {P000119AP10112}, and denotes a collection of individuals. For example, the axiom #({P000119AP10112}) is true in an interpretation where the property named by $ is true of the class object that has the denotations of P000119 and P10112 as members. Note that a DLOG theory is not a complete axiomatization of classical set theory (cf. [Brown78]). As Quine says: ...set theory can in part be simulated by purely notational convention, so that the appear-ance of talking of sets (or classes), and the utility of talking of them, are to some degree enjoyed without really talking of anything of the kind.18 Similar observations have been made by Bledsoe [Bledsoe77a], who replaces set variables with their first order translations, and Kowalski [Kowalski81], who uses a similar idea to provide an axiomatiza-tion of sets. DLOG's set theory is based on a finite domain of individuals, and thus avoids the major issue of classical set theory: what expressions determine sets, and vice versa. This is done by restricting the interpretation of sets to range over the power set of a finite number of individuals. A logician would no doubt claim that, indeed, DLOG has sets \"without really talking of anything of the kind.\" How-ever, the intended use of DLOG \"sets\" is as a conceptual aid — they behave as if they were sets. Recall that chapter 3 presented the DLOG language as a 3-sorted logic. DLOG's interpretation of set constants could be transformed to a single-sorted language. This translation to a single-sorted logic requires that each application database include axioms that define the unary predicates individual, set and X. In fact, because the implementation of DLOG is in the single-sorted language See [Quine69, p. 4, pp. 15-21]. 67 Prolog, the unary predicates are required for the implementation, and are provided for the user. DLOG users can assume that the necessary axioms are \"built-in,\" however an assertion of the form #({P000119AP101112}) is manipulated in a single-sorted language internally. By incorporating these predicates within each DLOG database, the user is provided with the illusion of sets. 4.1.3. Lambda constants The experimental DDB domain requires the description of degree requirements. Often the requirements can be formed as lambda constants, e.g., the assertion enrolment — requirement(BScMajorsCS,\\x. [age —of (x,2/)Aj/&16]) (4.3) states that \"an enrolment requirement of the BScMajorsCS degree is that the candidate's age is greater than or equal to sixteen.\" (Recall that all unbound variables in the body of a lambda term are assumed to be existentially quantified.) The lambda constant format allows the requirement to be asserted and queried, and the satisfies predicate provides the mechanism to pose a query like ^enrolment — requirement (BScMajorsCS, I ^ A satisfies(John ,lx) (4-4) that can be read as \"Has John satisfied an enrolment requirement for the BScMajorsCS program?\" Notice that, in the DDB domain, degree requirements are most naturally conceived as conditions which must be satisfied. Since degree programs are distinguished by their various requirements, it is most straightforward to describe degree program requirements as relations on degree names and con-ditions to be satisfied — in DLOG, as lambda constants. Of course there are alternatives to the use of this special constant. For example, the meaning of sentence (4.3) might be rephrased in terms of a standard first order language as VE. [satisfied — requirements (x ,BSc Ma jorsCS)D3y.age — of (x ,y) A J/& 16] (4.5) where we would use BScMajorsCS as the name of a degree program and modify the predicate satisfies to correspond more closely to our intuition regarding what one must do with degree require-ments. This alternative has a more straightforward meaning (since there are no \"special\" forms) and the requirement for a higher order semantics is gone. But now there is no way of asking what the requirements of the BScMajorsCS program are, short of providing another meta level order primitive 68 for manipulating sentences. For example, to answer the query equivalent to the query (4.4) in the alternative notation, we require an operation that retrieves a sentence of the form (4.5) from the current database, and then returns the literal consequent of that sentence as an answer. Similarly, if there were n lambda constants, each denoting a degree requirement regulation, .i.e., requirement (BScMajorsCS ,Xx. requirement (BScMajorsCS,\\x. $2) requirement (BScMajorsCS ,\\x. {x1)jr)C der i vable (DB, Vx 2x 3. *P(x 2)) A #(x 3 ) ) ^ A x 2= x 3. must be considered. As queries are viewed as alleged theorems, a definite individual that fails to refer results in query failure. This interpretation of definite individuals is essentially the test of an 73 assertion's presuppositions regarding the individual being described (cf. [Mercer84, Hirschberg84]). Here the presupposition is existence and uniqueness. 4.3.2. Indefinite individuals Indefinite descriptions of individuals are formed using Hilbert's symbol €, intuitively interpreted as the English indefinite article \"an.\" It should here be acknowledged that, whereas philosophical treatments of reference have been most interested in the meaning of definite descriptions, theories of indefinite descriptions have received relatively less attention. For example, Hilbert's operator can be used to dispense with quantifiers but it does so only for syntactic expedience, to ease the mechanical function of the proof theory [Leisenring69]. It is apparently for this reason that Kaplan [Kaplan75, p. 213] denies Hilbert's term the status of an indefinite description operator even though there is some basis to believe that Hilbert intended its interpretation as a designator could be based on a semantic choice function (e.g., see [Leisenring69, Alps81]). In DLOG, the term £x.student (x) A age — of (x ,16) can be read as \"a student whose age is 16.\" Indefinite descriptions are treated, in a sense, attributively in assertions, and referentially in queries. In assertions they are interpreted attributively in the sense that that failure of their preconditions (i.e., existence) cannot cause an assertion to be revoked (cf. definite descriptions). An assertion of the form 4>((.x.1\\x)) will be admitted as an axiom regardless of whether the clause &{x) is derivable (i.e., there is a referent for x). This means that an indefinite assertion of the form #(a)v#(&) can be made by embedding the disjunction within a description, viz. 12)}) In the intended interpretation some ambiguity remains, as the original assertion (4.12) is a relation on individuals (e.g., degree names X courses) but the final version (4.17) involves sets (e.g., degree names X sets of courses). Intuitively, we expect that if a set of courses is a course requirement, then so is each member of that set. The DLOG implementation described in chapter 5 uses a weak typing mechanism that permits the user to declare how the set arguments of each predicate are to be inter-preted. 4.4. Mapping a domain into D L O G sentences The specification of the DLOG data model provides a set of guidelines for implementing a DLOG database management system, as well as a description of how to capture and interpret a domain in terms of DLOG sentences. The claimed advantage of including the numerous kinds of spe-cial terms is conceptual ease in mapping a domain to DLOG sentences. Much of the difficulty in using a representation language or data model is deciding how to capture relevant domain facts in a data-base. This section provides some examples of the intuition behind one particular rendering of the example DDB domain. Consider the information in fig. 4.1, which is a reproduction of a portion of The University of British Columbia undergraduate calendar for the academic year 1981-1982. COMPUTER SCIENCE The Department offers opportunites for study leading to bachelor's, master's and doctor's degrees. For information on the M.Sc. and Ph.D. degree courses, see Grad-uate Studies. All students who intend to take Honours in Computer Science must consult the Head of the Department. Requirements for the B.Sc. degree: \" M a j o r and Honours First Year . Computer Science 115' - (3) Mathematics 100 and 101 (120 and 121) (3) Physics 110 or 115 or 120 (3) Chemistry 110 or 120 (3) English 100 (3) ' (15) 'Computer Science 118 (114) and a VA unit elective can be substituted by those eligible for Computer Science 118. Special arrangements may be made for a student who did not take Computer Science 115 or 118 in First Year. Such arrangements may limit choice of400-level courses. Major Second Year . Third and Fourth Years Computer Science 215 . (3) . Computer Science 315 (3) Computer Science 220 (1V&) Other Computer Science Mathematics 205 and 221 (3) courses numbered 300 Other Mathematics ..' (114) or higher2 (6) Electives (6) Further Computer Science courses (15) . . numbered 400 or higher2 (6) Mathematics courses numbered 300 or higher3 (6) Electives* (9) (30) JFor Major students, it is recommended that at least two of the optional Computer Science Courses be chosen from application areas (e.g., Computer Science 302, 402, 403,404,405,406). 3Mathematics courses in analysis, applied mathematics, linear algebra, probability, differential equations, and statistics are recommended. Such courses include Math-ematics 300, 305, 306, 307, 315, 316,318,340, 344, 345,400, 405, 407, 426,480. 4 Appropriate courses from other fields of possible computer applications are sug-gested. In particular, attention is called to the following courses outside the Facul-ties of Arts and Science, for which credit will be granted: Commerce 356, 410, 411, 450,459; Electrical Engineering 256,358, 364. Figure 4.1 A portion of the DDB domain As described in §2.3, the experimental DDB is intended for use in building and maintaining student transcripts. The question answering knowledge is informally classified as knowledge of program 79 requirements and knowledge of program prerequisites. Program requirements include faculty program requirements, department program requirements, and course requirements, e.g., faculty—program—req(Science, (4-18) Bachelor, second, \\x2.completed(x2, {xs,Xi.( course(xs) A units (xs,x4) ) A( total — units (Xltx5) A*fi=24) )}) In general, faculty— program—req(a,P,^,5,t) is interpreted as \"in faculty a , at level |3, in stream \"y and year 8, requirement e must be satisfied.\" Assertion (4.18) specifies a second year requirement for Bachelor of Science programs, for all streams xx (e.g., Majors and Honours). The requirement is that a student x2 have completed a set of courses, specified as an indefinite set description; the indefinite set is read as \"a set of courses whose total number of course units is greater than or equal to twenty-four.\" Notice that the total — units predicate is applied to set terms; it can be defined in terms of the simpler units predicate. We expect to be able to distinguish how a predicate is being used by deter-mining the sorts of its arguments. One implementation technique for doing so is described in chapter 5. Requirements are represented by lambda terms because, in the domain, they are perceived as regulations, or truth conditions to be satisfied. As explained above, in §4.1.2, queries can retrieve requirements as terms, and use the definition of satisfies to determine if some particular individual satisfies that requirement. Faculty program requirements are related to program requirements by the implication program-req{x6,x4,x6)C facuity —program —req(x !,x 2,x s,x 4,x 5) 80 A faculty — of(x6,Xi) * level —of (x6,x2) A stream— of (x6,xs) In a similar way, department and course requirements are included in the class of program require-ments. Most of the partial domain given in fig. 4.1 specifies course requirements. Consider the first course requirement for first year B.Sc. Majors: course —req(BScMajorsCS, first , \\x 1. complet ed (xx,CS 115) v (complet ed (x 1:CS 118) A completed(xv€x2.elective(BScMajorsCS,x2) *units(x2,l.b))) Notice that this requirement makes use of an indefinite description of an individual in rendering the domain requirement \"a 1% unit elective.\" An alternative representation might have been course — req(BScMajorsCS, first ,CS 115) A completed(x1,CS 115) v (course - req(BScMajorsCS,first ,CS 118) A completed(x1,CSllcl) *elective(BScMajorCS ,x2) *units(x2,\\.b) A cours e — req (BScMajorsCS ,x2) A completed(x l tx 2)) The descriptive version shortens the representation by keeping the alternative course within the con-text of course —req. Furthermore, the lambda constant allows the requirement to be tested for satis-faction. As a final example, notice that the third and fourth year requirements for the Majors program are specified together in fig. 4.1. Consider the following as a representation of \"6 units of other Com-puter Science courses numbered 300 or higher:\" 81 course —req(BScMajorsCS, {third A fourth}, \\x 1.completed(x lf {x2,Xi.( course(x2)*dept—of(x2,CS) A course — no(x2,x4) Az4&300) A units (Xlt6)})) Since this requirement is specified for third and fourth year, the assertion is made for the set term that contains third and fourth as individuals. Notice that the set term {third*fourth} means that the assertion is not decomposable into two assertions of the form course — req(BScMajorsCS,third, — ) and course — req(BScMajorsCS,fourth, — ) because the lambda constant is a requirement that must be satisfied in the third and fourth year period. The intended interpretation of the predicate course — req is such that both individual and set arguments in the second position make sense. In the implementation of chapter 5, a user is given some control over the specification of predicate constants, to achieve this flexibility in using sorts. The intuitive meaning of the above combination of predicate constants implicitly asserts something like \\/x1x2x3X1.course — req(x1,X1,x2)*xs(:X1'Dcourse — req(x1,x3,x2) This kind of flexibility is provided by the implementation. Footnote \"2\" in this specification (see fig. 4.1) describes recommendations; in general recommen-dations must be related to indefinitely specified requirements by some user decision, but the recom-mendations themselves can be captured in the sentence recommended(BScMajorsCS, {third A fourth}, {z1,X1:( course(x1) A (x , €{CS302 A CS402 A C5403 A C5404 A CS40S A OS406} ) ^{topic — of (x implications)) A( cardinality(Xx,x2)AZ2^2 )}) that recommends \"at least two\" of those suggested courses. Chapter 5 An implementation of the DLOG data model An experimental database management system based on the DLOG data model has been imple-mented in the programming language Prolog. Two different implementations have been developed, one in Waterloo IBM Prolog under the CMS operating system [Roberts77], and another in Prolog/MTS [Goebel80], a derivative of Waterloo IBM Prolog. Another version is currently under construction in Waterloo Unix Prolog [Cheng84]. A Prolog interpreter is a flexible implementation tool because it provides a theorem-prover, a database management system, and a programming language all in one package. This accounts for its growing popularity in many areas of computing (e.g., see [Clark82]). 5.1. Prolog overview This overview is necessarily brief, but should provide the concepts needed to interpret the specifi-cation of DLOG in Prolog. For a programming introduction, see Clark and McCabe [Clark84]. The syntax we will use is that of Prolog/MTS. The language is single sorted. Variables are writ-ten as arbitrary alphanumeric strings preceded by asterisks, e.g., human(*x) is Prolog's representation for Vx.human(x) Implications are written in the form consequent <- antecedent and are interpreted by Prolog to mean \"to establish consequent try deriving antecedent.\" For exam-ple, the axiom 82 83 mortal(*x) <- human(*x) is Prolog's representation for Vx.human(x) 3 mortal(x) To prove that some a is mortal, one can show that a is human. Prolog queries or goal statements are conventionally written as <- goali & goal2 & . • • • & . goaln. If the free variables are xxx2 • • • xm, this goal is considered as a request to establish zix\\X2 • • • xm.goali^goal2^ • • • *goaln as a theorem; all free variables are interpreted as existential. The back-chaining (i.e., goal-driven) proof procedure is a depth-first implementation of the SLD proof procedure [Lloyd82]. Finite failure on one search path will force backtracking to another path; terms are substituted for existential vari-ables where appropriate. The meta variable facility of Prolog is an implementation of the reflection rule used by Weyrauch [Weyrauch80] and Bowen and Kowalski [Bowen82]. It is simply a way of treating the value of a bound variable as a goal, e.g., derivable(*x) <- *x is a Prolog statement representing the meta language statement if DB y-x then DB r-derivable(x) Intuitively, this axiom states that an attempt to establish x as a theorem using derivable, is equivalent to deriving x directly in the logic's proof procedure. The axiom's validity is established when the proof procedure axiomatized by derivable corresponds to the logic's proof theory (i.e., the definition of 'I-'). This rule permits the logic programmer to either simulate a proof procedure using derivable, or to relinquish control to the Prolog proof algorithm and pursue a proof directly. By simu-lating a derivation, a programmer can control each derivation step and substitute alternative proof strategies or apply heuristics: this is how the DLOG implementation modifies Prolog's standard proof theory to deal with descriptions. Inefficiency is the penalty for simulating derivations: a derivation of derivable(x) is simply a direct proof of x. 84 By using the meta variable mechanism, the Prolog implementation of DLOG can manipulate DLOG sentences as terms, and can use Prolog's proof theory to derive sentences that fall within the standard Prolog interpretation. This is the mechanism required to manipulate DLOG higher-order objects. 5.1.1. Prolog as a foundation for DLOG The DLOG data model is an extension of the data model implicit in Prolog [Kowalski8l]. Prolog's data model uses the Horn clause subset of a first order language as its representation language, Tarskian semantics, and a depth-first proof procedure based on SLD resolution. Prolog's treatment of negation is defined by Clark's results [Clark78]. From this viewpoint, DLOG extensions include complex terms and constraints. These extensions require features unavailable with the implicit data model of Prolog, but they are provided by treating the Prolog language as its own meta language in which the appropriate extensions can be axiomatized [Bowen82]. The DLOG extensions are defined, indeed implemented, by using Prolog to describe their rela-tionship to Prolog's implicit data model. The handling of complex terms is provided by extending Prolog's theory of equality, as implemented in unification; the interpretation of lambda constants and constraints is specified by a meta language axiomatization. For example, the specification of DLOG database maintenance requires that we implement axiom (3.26) of §3.6 Vx \\fX.assertion(x) AVy\\y €X * constraint (y)* relevant (x ,y) O derivable(XU{x},y)\\ O assertable(x,X,XU{x})) We can massage formula (5.1) into a Prolog-like syntax by first rewriting it as: (5.1) assertable(x,X,X\\J{x}) C assertion(x) (5.2) A constraint (y) 85 * relevant (x ,y) D derivable(X\\J{x},y) Here we have reversed the implication sign, then dropped the universal quantifiers on x and X because free variables in the consequent of an implication are interpreted by Prolog as universal throughout the formula. However, we have explicitly retained the universal quantifier within the antecedent of formula (5.2), as Prolog will interpret variables appearing only in the antecedent of an implication as existential. So far, formula (5.2) is similar to the database maintenance axioms proposed by Bowen and Kowalski [Kowalski79, Bowen82], but notice that it is not a Horn clause (an explicit universal quantif-ier appears in the antecedent). Kowalski's more recent formulation does not have that problem:22 Assimilate(currdb,constraints,input ,bound ,newdb) (5.3) if Demo(currdb\\J constraints,bound,not(input),yes(proof)) a n d AnalyseFailureRestoreConsistency(currdb,constraints,proof ,newdb) This version might be read as \"to establish that a new input is consistent with a current set of con-straints, show that the input is derivable from the current database and the relevant constraints.\" Of course this reading is a simplification, since the axiom says that this should be done by attempting to find an inconsistency, and then restore consistency by a procedure AnalyseFailureRestoreConsistency. One difference between formulas (5.1) and (5.3) is that the former shows that a new assertion is con-sistent by re-deriving only the relevant constraints. In Kowalski's version, isolation of constraints is done implicitly by assuming that variable names are interpreted as types, and that Demo and AnalyseFailRestoreConsistency achieve their intended effect by using auxiliary procedures to collect constraints and make database changes. The DLOG specification provides a way of determining which constraints are relevant. Another advantage of (5.1) over (5.3) is implementation clarity: (5.3) is closer to implementation using a procedural interpretation of Horn clauses, but (5.1) more clearly shows the relationships between assertions, constraints, and databases. Much of the effort in implementing DLOG is in specifying Prolog meta predicates for achieving [Kowalski81, p. 20]. 86 the effect intended in (5.1). For example, if we assume an appropriate axiomatization of two new predicates all(*,*,*) and update(x), then (5.2) might be rendered as: assertable(x) C assertion(x) (5.4) A all (y ,constraint (y) A relevant (x ,y ),X) *derivable(X) *update(x) First, we assume that there is always one current theory that is implicitly identified, so all references to it (i.e., in assertable, and derivable) are dropped. Next, we use a meta predicate all (cf. [Clocksin81, Pereira83]) to isolate the relevant constraints and bind them to the set variable X. We expect that all will use its second argument as a generator, to construct the set of all relevant constraints X. The final assumption is that derivable will be axiomatized so that when given a set X of formulas, it will attempt to derive each formula of that set. The final modification is the addition of a database predicate update that will add the \"assertable\" formula x to the implicitly specified database. Under these assumptions, (5.4) closely mimics Prolog syntax, except for the set variable X. 5.1.2. Standard extensions to Prolog The Prolog implementation of DLOG uses some meta predicates that have, by now, become established as \"standard\" extensions (e.g., see [Clocksin81]). These include a specification of the meta predicate all used to mimic universal quantification in goal statements, and special axioms to mimic finite disjunction. DLOG uses the following: all(lambda(*x, *expression), *extension) The all meta predicate represents a relation on lambda terms and sets of individuals. It is similar to CProlog's setof (x,P,S) meta predicate that is used to create a set S of all individuals x that satisfy the Cprolog goal P [Pereira83]. This meta predicate is derivable when the extension of the complex predicate described by the lambda expression is bound to the variable *extension. In the implementa-tion, suitable modifications enforce the sort constraint that treats the extension as a DLOG set term. The DLOG disjunctive connectives are described in Prolog as follows: 87 *x | *y <- *x *x | *y <- *y and *x ! *y <- *x & *x ! *y <- -<*x & *y The implementation substitutes '|' and '!' for 'V and respectively. The Prolog implementation treats the operators as meta predicates: inclusive alternation requires the derivation of at least one alternative, while the exclusive case requires proof of one disjunct and the negation of the other. As discussed below (§§5.2, 5.2.3), this implementation of disjunctive operators does not provide a general facility for interpretating disjunctions embedded within descriptions. Note also that the definition of the alternatives for the disjunctions rely on finite failure of both subgoals of the first alternative. 5.2. A DL O G implementation The terminal session of fig. 2.5 was produced with the most recent DLOG implementation writ-ten in Waterloo IBM Prolog [Roberts77]. The source code for this implementation is included as Appendix 1. The current prototype has only weak versions of negation and disjunction, but it is capable of interpreting each form of DLOG descriptive term. At least one instance of each term occurs in the DDB, as the motivation for their construction arose in that domain. A complete listing of the DDB is give in Appendix 4. Several features have been provided in response to traditional DBM system facilities, but others are unique to DLOG. Examples of the former include a data dictionary facility, a facility for inter-preting application commands (i.e., \"data manipulation language\" interface), and a transaction pro-cessor. In addition, the system augments the traditional data dictionary with a topic structure [Goe-bel77, Goebel78, Schubert79]. Topic predicates are included to organize predicates of the data dic-tionary into categories. A user can use the topic structure to browse the database contents by topic, to get a \"feeling\" for the kinds of information stored. 88 A feature unique to this implementation is a facility for heuristic interpretation of various kinds of queries. The system is capable of invoking heuristics when the logical interpretation of a DLOG query fails. This is DLOG's concession to \"procedural attachment\" (cf. FRL, KRL), but the DLOG framework is such that heuristic processing is initiated only after deductive processing fails. Further details of this proof heuristic based on partial matching are given in §5.4. 5.2.1. Implementation syntax The restricted syntax of complex terms given in §3.3.1 are only part of the way to our desired specification in a standard single-sorted first order logic. The final step to an implementation is the translation of the 3-sorted syntax to the single-sorted syntax of Prolog. In general, the translation of an re-sorted language to a single-sorted language is done by rewriting sentences of the form Vx7.4>(zT) 3x,.4>(x7) as VX.T(X)D \\n(z1)p (x)*I7(X)} {aR0R...R8Y set(*xl;set(*x2):q>(*xl)&^(set(*x2))) set(a R 3 R...R 8)* lambda expressions \\x. \\Cn A -> C 1 & C 2 & • • • & C n queries ?C 1AC 2A...AC B < - C x & C 2 & • • • & C„ t R Is any of A ( V, or *, and one cf V or ' must appear. t R is any of &, |, or!, and one of | or! must appear. Table 5.1 Specification and Implementation Syntax 90 5.2.2. System structure From a system implementor's viewpoint, the DLOG system is a collection of independent pro-grams, collected within a framework that routes information among these programs and controls their operation. We will refer to these programs as DLOG processors, and describe the system in terms of their relationships. The major processors and their basic relationships are given in fig. 5.1. BROWSE PROCESSOR DERIVATION PROCESSOR COMMAND PROCESSOR QUERY PROCESSOR CONSTRAINT PROCESSOR ASSERTION PROCESSOR TRANSACTION PROCESSOR Figure 5.1 DLOG processor architecture Directed edges show possible flow of control, e.g., the transaction processor may require the services of both the assertion processor and constraint processor. Each processor can be viewed as a Prolog axiomatization of the appropriate relation. User inputs are filtered by the command processor and, depending on their content, are for-warded to the appropriate sub-processors. Each processor may set various global flags to select optional processing in dependent processors. This is most useful when combined with Prolog's failure-driven backtracking: each \"call\" is a goal request, and failing goals return control to their initiating processor to pursue an appropriate alternative. However, it is not exemplary logic programming style — it is done here for efficiency and expedience. The DLOG implementation could be viewed as a single complex Prolog theory, however a loaded application database is distinguished from implementation axioms by using the DLOG data dictionary. Furthermore, since Kowalski requires that meta and object databases be clearly distinguished,23 2 3 [Kowalski81, p. 21]. 91 application predicates are enumerated within the application's data dictionary. The implementation can use this dictionary to distinguish object theory axioms from meta theory axioms. Further data dictionary details are provided in §5.5.4. 5.2.3. DLOG derivability The foundation of the DLOG implementation is the Prolog axiomatization of the derivable rela-tion. This Prolog implementation of derivable performs DLOG query evaluation, and uses Prolog's meta variable facility (§5.1) only for more efficient derivation of system implementation predicates. As discussed in §3.4, the Prolog implementation of DLOG's derivable predicate represents the relation r- on DLOG databases and queries. The implementation of derivable(DB,Q) given here is slightly more elaborate in that it uses extra information about the form of a query to determine if the general specification of derivable (§3.4) is required. In fact, the implementation actually includes three distinct proof strategies, the selection of which is controlled by data dictionary information. The meta predi-cate i —derivable corresponds most closely to the specification of derivable given in §3.4, except that the connectives '&', '!', and '|' are interpreted elsewhere. Note also, that since there is at most only one application database at any one time, the DB argument in derivable(DB,Q) is left out. The three strategies are: 1) derivation of literals without set predicates (s-derivable). As mentioned §4.3.4, there can be some ambiguity in the interpretation of assertions involving sets as terms. For example, if 'set(a&b)' is properly treated as a set constant, the formula 'P(a)' does not follow from the for-mula P(set(a&b)). This implementation of DLOG will use s-derivable to attempt such a deriva-tion. The ambiguity about whether the assertion P(set(a&b)) actually means P(a)&P(b) is removed by a set —predicate assertion in the DLOG data dictionary (§5.3.4). This is the way in which a user can specify a \"weak typing\" of predicates, as suggested in §4.3.4. The s-derivable strategy is selected as follows: 92 derivable(*atomic — goal) <-get—predicate(*pname,*atomic —goal) & ^ set—predicate(*pname) & s — derivable(*atomic — goal). A good example of this strategy's use is the frequent derivation of the satisfies predicate. Recall (§4.1.2) that satisfies(a,P) can be derived when a is a DLOG constant and is a DLOG lambda expression applicable to a and derivable from the current database. The satisfies predicate is axiomatized as satisfies(*x,lambda(*x,*goal)) <-derivable(*goal) but is frequently used in queries with a set of lambda expressions. For example, satisfies(John ,set (*x -.course — req(BScMajorsCS, first, *x))) asks if \"John has satisfied the set of all first year course requirements for BScMajorsCS.\" When a goal involves a non-set —predicate but includes a set term, the s-derivable derivation strategy expands the set description and applies the predicate to each element of the set's extension. In other words, the s-derivable strategy uses data dictionary information to decide how to interpret the manipulation of predicates that have set arguments. Notice that this strategy doesn't make sense for goals with multiple set terms, since the application of a predicate to a Cartesian pro-duct of multiple set extensions is nonsense. Notice also, that this strategy is not part of the DLOG data model specification — it is an implementation strategy that helps streamline the user interface. derivation of correctly typed arguments (i-derivable). The i-derivable proof strategy is used in most cases, where either the goal includes a set term (and has been typed as such), or doesn't contain a set term but cannot be derived directly. In other words, when predicates are used as specified by the typing of the data dictionary, this proof strategy is selected. This strategy is the most general, and is the one from which extended unification is used to match DLOG complex terms. It is selected as follows: 93 derivable(*atomic — goal) <-get —predicate(*pname,*atomic—goal) & set —predicate(*pname)\\ --contains — set —term(*atomic — goal) & t — derivable(*atomic — goal). 3) direct derivation (d-derivable). Any literal of a DLOG query that uses a DLOG system predicate (e.g., satisfies, extension) or a Prolog system predicate (e.g., SUM, AX) can be derived directly by Prolog rather than be simulated with the more general strategies 1) and 2) above. This stra-tegy is selected when the predicate of the current goal has the system—predicate or prolog—predicate properties: derivable(*atomic —goal) <\"-get — predicate(*pname ,*atomic — goal) & system—predicate(*pname) \\ prolog—predicate(*pname) & *atomic — goal. 5.2.3.1. D L O G unification Though the referential interpretation of descriptions (§4.2) requires manipulation of DLOG's descriptive terms at assertion time, much of the required special treatment can be done at query evaluation time at the term matching level. Instead of extending the derivable axiomatization to han-dle the standard rewritings of descriptions (§3.2.1), the unification algorithm can be augmented to pro-vide the correct matching of descriptive terms. In a sense, some of the complexity of derivation is off-loaded to the pattern-matcher (cf. [Reiter75]). The idea of extending a resolution proof procedure's power by augmenting unification was first suggested by Morris [Morris69], who proposes that equality be dealt with by so-called \"E-unification.\" Other proposals include Stickel [Stickel75], Morgan [Morgan75], Kahn [Kahn81] and Kornfeld [Korn-feld83]. Kahn proposes unification as the only computation mechanism in a complete programming language. Kornfeld's proposal is more coherent; he intercepts normal unification to apply a simple theory of equality that permits proofs like 94 6 = succ(b), P(6)r-P(succ(5)) More interesting (but less coherent) is the representation language KRL [Bobrow77b, Bobrow77a, Bobrow79], which relies on a comprehensive and complex \"mapping\" process on several different kinds of object descriptions called \"descriptors.\" To illustrate how DLOG might provide insight into the mapping operation of the KRL language, a version of Hayes' [Hayes80] logical rewriting of KRL exam-ples appears in Appendix 3. Returning to the handling of descriptive terms by augmenting unification, we cite Rosenschein on the advantage of embedded terms: ...the data object is kept small and \"hierarchical\" so that where an exhaustive match must be performed, failure can occur quickly. That is, deep, heterogeneous structures are pre-ferred to broad, homogeneous structures. For example, {(){()()}} is better than {{}{}{}{}}.24 We view Rosenschein's claim as support for the interpretation of descriptions as embedded terms, rather than as their contextual definition by rewriting. The DLOG unification algorithm is invoked by the DLOG derivable predicate, similar to the way Prolog's derivation procedure uses a built-in unification algorithm. Intuitively, whenever a unifi-cation must be performed and there are special DLOG terms to be matched, standard unification is intercepted, and DLOG unification is used. For example, suppose that the two terms an(*x, (*x)) lambda (*x , (*x) This is a case where disjunctive terms would require a more general proof mechanism, since a proof of *X ) V \\p{ *y )s= (*x)),an(*x,V(*x))) <-eqlambda (lambda(*x , (*x))&Vi3et*y))) <-map(lambda(*x)),se<(c,&C2& • • • &c„)) & apply(lambda(set(*y),tP(set(*y))),set(c1&c2& • • • &cn)). 14. unify (set (Ci&czto • • • &c„),sei(cn + 1&cn+2& • • • &cn + m)) Two set constants are equivalent if they are extensionally equivalent. In this case, the eqext predicate will sort the sets into a normal form (ASCII collating sequence) to determine their equivalence: unify(set(c1&c2& • • • &cn),set(cn + l&cn + 2& • • • &cn + m)) <-eqext (set (c,&c2& • • • &cn),se<(cn + 1&cn + 2& • • • &cn + m ) ) . 15. unify(set(*x:^(*x)),set(*y:^*y))) The equivalence of descriptions is tested by using eqlambda: 100 unify (set (*x: JB where A is an atomic assertion and B is a body. The actual implementation of integrity maintenance is just as §3.6 suggests: each constraint of the form above is treated as an assertion about assertability, viz. assertable(A) that indicates the kind of object that can appear in each term position: = — 1 specifies a standard1 individual; a, =0 specifies a pro-position; and a, & 1 specifies a s,-place predicate.2 1 Here \"standard individual\" means the usual notion of an individual in a first order model. This is in contrast to higher-order models, where there are different kinds of individuals. 2 See [Montague74, p. 150]. The notion of predicate used in this context is sometimes called a \"relation in intension.\" 121 122 For example, a predicate constant P of type <-l,l> takes individual constants in its first posi-tion and unary predicates in its second. In the Department Data Base domain, the satisfies predicate constant has type <-l,l>, e.g., the assertion satis f ies(fred, \\x.completed(x ,cs 115)) has an individual constant 'fred' in the first argument position, and a lambda constant in the second argument position. The first denotes an individual object (the person with name 'fred'), and the second denotes a predicate specifying the property of \"x completing the course CS115.\" The meaning of the above assertion is assigned in a way that introduces the second and most important difference of Montague's system. The assignment of truth values to sentences is an inherently two phase process. As Montague explains [Montague74a, p. 157], an interpretation assigns intensions to symbols, and a model assigns extensions. Extensions include the standard objects well-known from traditional Tarksian first-order semantics, as well as sets of sequences of individuals. Intensions are functions from possible worlds to the universe of individuals. They are introduced in order to distinguish the sense or abstract meaning of a predicate from its denotation in a particular possible world. Though the complexity of Montague's complete system can be perplexing, but of the complexity dissolves because of the simplicity of DLOG theories: they are finite, and the intended interpretation is over a highly restricted domain. This simplicity constrains the number of possible worlds that can serve as interpretations for DLOG theories (for example, this provides a restricted interpretation of \"•\"). In effect, the world that a particular DLOG database is intended to describe is the intended possible world for semantic interpretation. Therefore the first phase of interpreting a sentence, the selection of a possible world, is trivial. 6.2.1. DLOG's semantics in Montague's system A possible interpretation for the DLOG language is a triple , cardinality <2,-l>, etc. Since the DLOG domain is finite, this poses no theoretical problem. However, we can avoid this inelegance by interpreting a DLOG set object as an individual of Montague's logic and then rendering DLOG set descriptions as predicates that relate that \"individual\" to its members. In effect, we are avoiding the inelegance of the first encoding by defining a kind of local sorting within the Montague encoding of set descriptions. For example, with this strategy the set constant {a} is now written as TQAxa(Q[x\\-(Q'[x,a\\*Ayy±aD^Q'[z,y\\)) or, in the abbreviated form as xQ'[x,a]>\\Ayy±aD^Q'\\x,y\\ The new predicate constant Q1 is introduced to name the relation of \"being a member of set x,\" i.e., Vx V yQ\\x ,y)=set (x) A y €x Its introduction avoids the necessity of providing a different type for each predicate constant — for example, cardinality can have type <1,-1> regardless of the cardinality of the set on which it is asserted. The newly introduced predicate constant Q' avoids the need for explicitly sorting Montague's language, and permits the meaning of DLOG sets to be expressed without explicitly increasing the number of predicate constants in any particular DLOG theory. This introduction of a new \"element-of\" relation for each set is again theoretically feasible, and can be hidden in an imple-mentation. Definite sets can be similarly expressed. A definite set {*!*(*)} is expressed as 125 TQAxO(Q[x}^4>(x)) or, in abbreviated form x and so denotes a relation *, where / is a set of possi-ble worlds, U is the set of possible individuals32 and F is a function from individual and predicate 123 constants to intensions. That is, for any individual or predicate constant c, Fc, is a function FC:I-U. This is the formal notion of intension—Fc is a function which, computed in a possible world i, pro-duces the extensional denotation of that constant. In other words, Fc(i) is the individual in possible world « denoted by the constant c. Individual constants of DLOG's syntax correspond to possible individuals in the set of possible worlds. In any actual DLOG database, the intended possible world is fixed by that database's intended interpretation. The set objects in the DLOG language can be expressed in several ways in Montague's system. In one method a DLOG set object is expressed as a predicate with the property of being the set in question. That is, the set constant '{a}' is semantically associated with the predicate that, in a possi-ble world i, describes the set whose only member is the possible individual assigned to a in i. In Montague's formal language the set constant '{a}' can be expressed as 3 3 TQAuD(Q[u]^u=a) For example, consider the DLOG formula cardinality ({a),\\) Here cardinality is a predicate constant with type <1,-1>; its first argument is a unary predicate constant and its second an individual constant. We can express the DLOG assertion as cardinality {TQ /\\uO(Q\\u\\^u = a),l) (6.1) Here Montague's T ' symbol has a similar role to Russel's symbol ' I ' in the DLOG syntax. It is used to name a predicate Q, i.e., we can read TQ' as \"the predicate Q such that...\" The symbol '•' is needed to specify that the predicate definition is unique across all possible worlds. It asserts that the predicate Q has the same definition, regardless of in which possible world an interpretation is made. 3 2 In Montague's \"Pragmat ics\" [Montague74b] U = U A | , while in \"Ent i t ies\" [Montague74a] L O U A j , where A s is the set of i£I i € possible individuals existing in the possible world i. The interested reader is referred to those papers for a deeper under-standing of the philosophical issues arising from the consideration of possible objects not included in one's selection of I, the set of possible worlds. 3 3 Montague [Montague74a] uses the symbols 'A' and 'V' for 'V' and '3', respectively. He also uses brackets where parentheses are typical, e.g., P[x] for P(x). In addition Montague employs the symbols 'T ' and 'CP, read as \"the\" and \"necessarily,\" respectively. These latter symbols are used to form names of predicates. 124 Montague offers the following abbreviation for formula (6.1): cardinality (uu = a,l) In general, the 'u**, where J is the set of possible worlds, U the set of individuals, and Ux is the set of all predicates . Now the formula (6.2) is true in a structure just in case there is a denotation x£U for BScMajorsCS and A£ for xcompleted(x ,CS115) such that F r t ? B i r e m e n ((t) holds on *

); similarly for other sentential connectives. (6) If u is an individual variable and

) if and only if there is a predicate of type which satisfies $ (in i, with respect to ), where each Uk (for k

), and A is that predicate, or (ii) it is not the case that there is exactly one such predicate and A is the empty predicate (i.e., IX{A}). Finally, we require an addition that considers the meaning of DLOG's definite individuals. While Montague's language uses the symbol T ' to form new predicate constants, the language has no equivalent symbol for forming individual names. Therefore we add (10) P(Lx$(x)) is true (in i with respect to ), if and only if either (i) there is exactly one u in U such that u satisfies $(x) (in » with respect to ) and 128 u €F/>(j), or (ii) there is not exactly one u such that u satisfies $(x) (in i with respect to ) and there is a distinguished individual A(FP(i). The special individual A is the individual, different from any other individual, that is denoted by those definite individuals that are ambiguous. In the DLOG implementation each definite individual is sub-ject to reference determination. Definite individuals that fail to refer cause the rejection of the asser-tion in which they appear. In theory, the selected individual is that special individual that achieves parsimony in the specification of the first order semantics of definite descriptions (cf. [Rosser68, Kaplan75, p. 215]). In application, A represents the computer program that rejects the offending assertion (for further details see §4.3.1). To conclude this section, we note that Hilbert's symbol '€' has not yet been given a meaning within Montague's system. This is because we desire an interpretation where '€' is treated as a selec-tion operator(cf. [Leisenring69, p. 4-5]). The particular individual denoted by a term of the form l€i.