COL(ID:3539/col001)Complex Object Language recursive OO data query languager by Serge Abiteboul and Stephane Grumbach Structures: Related languages
References: in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details Introduction To overcome limitations of the relational model, two main directions have been followed. ?Deductive databases? integrate knowledge in the database using the logic-programming paradigm. ?Object-oriented databases immerse the database in an object-oriented environment. Both approaches have advantages. The first one keeps with the tradition of relational databases by having a sound mathematical foundation and by focusing (to a large extent) on declarative access languages. The second one has so far been mainly implementation driven and has already delivered products that facilitate the development of database applications. The challenge that is now facing databases is to integrate these two promising approaches into a unified framework. There are several aspects to this integration. One technical issue is that terms in logic programming are usually based on lists, whereas object-based database models tend to be based on sets. Another issue is that traditionally the logic-programming approach focuses primarily on predicates, whereas the object-oriented one favors methods, that is, functions. In the present paper, we introduce the complex object language (COL), a rule-based language based on particular functions, called data functions, and structured values constructed with the tuple and set constructors. COL is an extension of Datalog (Horn clauses without function symbols) which allows the manipulation of structured values obtained using tuple and (heterogeneous) set constructors. The originality of the approach is that besides the base and derived relations, base and derived data functions are considered. Data functions present several advantages: (1) They facilitate the manipulation of sets. Data functions are used to ?name? sets of objects. (2) Functions are often more natural to use than relations (for example, joins are replaced by function composition). (3) Data functions allow the integration of features from the relational and functional models. Thus COL can be seen as the first step toward a logic-based language for semantic database models [5, 19, 201. (4) The functional flavor introduced in the language by data functions facilitates the integration of COL with a functional host language. The language COL is based on a clausal logic. The database consists of facts concerning both predicates and data functions. New facts can be derived using rules. The semantics is given in terms of minimal models. Unfortunately, because of the presence of sets, even positive programs may have more than one minimal model. A stratification in the spirit of a number of other researchers is used [9, 17, 31, 361. Extending the techniques of Apt et al. [91, we show that for stratified programs (in the absence of monovalued data functions), a justified and minimal model can be computed using a sequence of fixpoints. Extract: Introduction (ii) It is shown in Abiteboul and Beeri [2] that COL is equivalent to the classical algebras and calculi for complex objects. Hull and Su [21] have demonstrated that the set of relational queries expressible in COL is exactly the computable queries of Chandra and Harel [16] that have hyperexponential time complexity. We introduce COLrr, a range-restricted version of COL, and show that queries expressed in COLrr can be answered in polynomial time with respect to the size of the database. The use of monovalued data functions yields consistency problems which are studied in Abiteboul and Hull [61. We recall here their results. We show how to extend the language using ?externals?, that is, functions and relations that are externally evaluated. Finally, we briefly discuss an implementation of COLrr in CAML [181 with monovalued and external functions. Query evaluation is done by forward chaining using a compilation of rules into an extended algebra for structured values. We focus on two fundamental issues: (1) type inference, and (2) the integration with a functional host language. Two main approaches have been previously followed for defining query languages for structured values based on set and tuple constructors: (1) an algebraic approach (that is, [2, 3, 29, 32, 35]) and (2) a calculus approach [2, 23, 26, 29]. Recently, a logic-programming approach has been considered [4, 12, 27, 28]. One should also mention proposals such as those of Bancilhon and Khoshafian [11] to enrich logic-programming with a set construct. The originality of the approach introduced in [4] and developed here is in the use of data functions with the aforementioned advantages that this approach provides. The paper is organized as follows. In Section 2 the language is presented. First, a number of motivating examples are given. Then the syntax and semantics of programs are described. Section 3 introduces stratification restrictions on programs. It is shown that under these restrictions, for each program, a minimal, justified model can be computed using a sequence of fixpoints. Expressive power and complexity issues are considered. In Section 4 two new kinds of functions are introduced: monovalued data functions (only set-valued ones are considered in Sections 2 and 3), and externals (that is, functions that are evaluated ?outside? COL). An implementation of COL is discussed in Section 5. The proofs of the main results are given in an appendix. Extract: COL the language COL the language We introduce the language COL. First, we provide an informal presentation based mainly on examples. The language is then formally defined. A fixpoint semantics generalizing the semantics of Datalog with negation and stratification is presented. The two main approaches that have been followed for developing languages for complex objects are an algebraic approach and a calculus approach. The languages thereby obtained are hard to use. It is therefore interesting to investigate alternatives. The logic-programming approach provides a sound mathematical framework. Also, the logic-programming style naturally fits the context of structured values. Three independent proposals for using a logic-programming approach to obtain languages for structured values have recently been made [4, 12, 27, 28]. We elaborate here on the proposal developed by the present authors [4]. In addition to the logic-programming approach that is shared with Beeri et al. [12] and Kuper [27], we emphasize the use of data functions. Functions play a crucial role in functional data models like [14, 34] and in many semantic database models (see [20]). Their absence from the relational model together with their natural importance lead to the introduction of functional dependencies. We believe that functions are essential in a structured values context. Indeed, set grouping in [12] introduces functions in a hidden way. We argue that, as in COL, functions (based or derived) must be first-class citizens. Extract: Conclusion Conclusion Object-oriented database systems will, most probably, be the next generation of commercial database systems (see Bancilhon [10]). A major challenge is to provide high-level, ?declarative? languages as interfaces to these systems. The COL language is a first step in that direction. However, COL is purely ?valued-based.? One may view IQL [7] and Logres [151 as extensions of COL to incorporate object-oriented features. In particular, IQL introduces object identifiers (oid?s) and classes. Major problems studied in IQL are the control of the creation/deletion of oid?s and the handling of inheritance. An open question discussed in Abiteboul [1] is the introduction of overloading in this context. in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details COL COL (Complex Object Language) is a typed extension of Datalog that supports complex values directly. Unlike LDL which uses functor objects for tuples indirectly, COL directly supports tuples, tuple constructors and sets. A novel feature of COL is the use of interpreted functors called data functions in a deductive framework. Data functions play a crucial role in functional data models and in many semantic data models. Their use in COL provides natural support for functional dependencies in deductive databases. In LDL, functional dependencies are not supported. In COL, data functions and member predicate (3) can be used to access sets and to construct sets through grouping or enumeration. Unlike LDL where sets are unnamed, data functions in COL are used to name sets. in Proceedings of the International Conference on Extending Data Base Technology, March 1988 view details |