DML(ID:8098/)


for Data Manipulation language

DML for numerical scientific databases, with GSTGS making up the NDBMS

Declarative language to manipulate the wide array of numerical scientific databases, automatically producing HPF code.

Ola-Olu A. Daini   University of Ife, Ile-Ife, Nigeria  



References:
  • Daini, Ola-Olu A. and Scheuermann, Peter "A data definition and mapping language for numerical data bases" Proceedings of the ACM 1980 annual conference pp418-432
    Extract: Data Language Facilities
    Data Language Facilities
    The data language facilities provide a generalized approach for describing any numerical database and its mapping to storage. They consist of a stored-data description language (SDDL) and a stored-data mapping language (SDML). The two languages are similar to other data definition and mapping languages [7,17,18]. We have attempted as much as possible to make them user friendly, by including simple, self-explanatory language constructs. The choice of only one of the alternatives is represented by {} (braces) and an optional phrase by [] (square brackets). Language keywords appear in capital letters and user-defined words in lower case. Sample SDDL and SDML statements of both source and target numerical databases are shown in Figures 4 and 4.1 respectively. Other features of the two languages will be revealed as they are described below.
    Extract: Stored-Data Description Language (SDDL)
    Stored-Data Description Language (SDDL)
    The SDDL is intended mainly for the user to describe the logical characteristics of his numerical database and the associated type of file organization on secondary storage devices, or alternatively the card input-format. Therefore, the language is divided into three parts which are (1) matrix structure, (2) file control, and (3) input format.
    The matrix structure describes the logical characteristics of the data and it also indicates if dynamic storage management is required. The basic matrix format is specified using the self-explanatory keywords: {DENSE, SPARSE}, {SYMMETRIC, NONSYMMETRIC}, and {BANDED, NONBANDED}. If the matrix is symmetric, the statement will include {UPPER-DIAGONAL, LOWER-DIAGONAL} in order to specify the partition of the dataset to be processed. Similarly, a bandwidth statement which specifies the size of the band is required for a band matrix and a density statement giving an estimated density of a sparse matrix is necessary for creating a database with random file organization. Some statements in the matrix structure section are shown in the example below.
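    The rules in this paragraph can be restated as a small consistency check. The following Python sketch is an illustration only: it is not SDDL, and it is not the example from the paper, which this extract does not reproduce. The class name MatrixStructure, its field names and the problems() method are all hypothetical. The sketch also assumes, more strictly than the paper, that every sparse matrix needs a density estimate, whereas the paper ties that requirement to random file organization.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MatrixStructure:
    """Hypothetical in-memory form of an SDDL matrix-structure section."""
    sparse: bool = False               # False = DENSE, True = SPARSE
    symmetric: bool = False            # False = NONSYMMETRIC, True = SYMMETRIC
    banded: bool = False               # False = NONBANDED, True = BANDED
    partition: Optional[str] = None    # UPPER-DIAGONAL or LOWER-DIAGONAL
    bandwidth: Optional[int] = None    # size of the band
    density: Optional[float] = None    # estimated fraction of nonzero elements

    def problems(self) -> List[str]:
        """Return the rule violations found; an empty list means consistent."""
        found = []
        if self.symmetric and self.partition not in ("UPPER-DIAGONAL", "LOWER-DIAGONAL"):
            found.append("symmetric matrix needs UPPER-DIAGONAL or LOWER-DIAGONAL")
        if self.banded and self.bandwidth is None:
            found.append("band matrix needs a bandwidth statement")
        # Assumption: required for all sparse matrices, not only random files.
        if self.sparse and self.density is None:
            found.append("sparse matrix needs a density statement")
        return found
```

    For instance, MatrixStructure(banded=True).problems() reports the missing bandwidth statement, while a symmetric sparse description that supplies both a partition and a density returns an empty list.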

    The file control specifies the file organization of a numerical database already residing on a secondary device or to be created, by listing the type of file, device medium, file unit etc. The file control statements depend on the device medium selected for processing as specified by the device medium keyword, CARD, TAPE, or DISK. If data is to be processed from card input stream, only the file-type, file-unit and device-medium statements are required, but in addition to these three statements, both disk and tape files require record statements.
    The file-type statement identifies the source/target file and the file-unit statement gives a set of FORTRAN READ/WRITE unit numbers for processing the files in the database. The record statement lists the record properties like record-size.

    In addition, the file control section may include any of the following optional statements: (1) a record-key statement to specify either integer or alphanumeric key for random file organization; (2) a block-size statement required for blocked records; and (3) a format statement (similar to FORTRAN) for formatted records.
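    The dependency between the device medium and the required statements can be sketched as a lookup. This is a minimal Python illustration, not part of SDDL: the function name missing_file_control, the dictionary layout and the sample values are assumptions made for the example. Only the statement names and the CARD, TAPE and DISK keywords come from the text.

```python
def missing_file_control(statements):
    """List the required file-control statements absent from `statements`.

    `statements` is a hypothetical dict keyed by statement name, e.g.
    {"file-type": "SOURCE", "file-unit": [5], "device-medium": "CARD"}.
    """
    # These three statements are required for every device medium.
    required = ["file-type", "file-unit", "device-medium"]
    # Disk and tape files additionally require record statements.
    if statements.get("device-medium") in ("DISK", "TAPE"):
        required.append("record")
    return [name for name in required if name not in statements]
```

    A card description with the three basic statements yields an empty list, whereas the same description with "device-medium" set to "DISK" yields ["record"] until a record statement is supplied.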

    The input-format section provides facilities for processing unstructured database from cards. The section is comprised of the dimension, the data ordering and format statements respectively. The dimension statement, shown below, specifies the numbers of rows and columns in the matrix.
    DIMENSION = {ROW, COLUMN}, integer, {COLUMN, ROW}, integer;
    The data ordering statement specifies a rowwise/columnwise/none ordering. The data-format statement:
    DATA-FORMAT = {SPARSE-TYPE-1, SPARSE-TYPE-2, DENSE};
    gives users three choices of format specifications. Both SPARSE-TYPE-1 and SPARSE-TYPE-2 are for sparse matrix input format specifications of only nonzero elements and the DENSE is for all the matrix elements.
    SPARSE-TYPE-1 is for an ordered input data so that a row or column input data stream is processed at a time. As shown below, it requires a control data to specify the row or column to be processed so that the format becomes a set of pairs of column/row and data item datatypes.
    A data-type is any valid FORTRAN format specification for spacing, alphanumeric, integer or real variable e.g. 5X, I6, F10.4 and E20.12. SPARSE-TYPE-2 is for an unordered input data so that the format is a set of row, column, and data item data-types as follows: SPARSE-TYPE-2 = SET([ROW], data-type, [COLUMN], data-type, data-type);
    Finally, DENSE = SET(data-type); provides for a set of regular FORTRAN-type format specifications. An example of a SPARSE-TYPE-1 input format is shown below.
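    The paper's own input-format example is not reproduced in this extract. As a separate illustration, the Python sketch below reads unordered records of the SPARSE-TYPE-2 kind, each holding a row, a column and a data item in fixed-width fields. It is a sketch under stated assumptions: the function name read_sparse_type_2 is hypothetical, and the default widths stand in for a FORTRAN format such as (I6, I6, F10.4) chosen for the example rather than taken from the paper.

```python
def read_sparse_type_2(lines, widths=(6, 6, 10)):
    """Read unordered (row, column, value) records into a dict of nonzeros.

    `widths` gives the assumed fixed field widths of the row, column and
    data item, mimicking a FORTRAN format such as (I6, I6, F10.4).
    """
    row_w, col_w, val_w = widths
    entries = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank records
        row = int(line[:row_w])
        col = int(line[row_w:row_w + col_w])
        value = float(line[row_w + col_w:row_w + col_w + val_w])
        entries[(row, col)] = value
    return entries


# Usage: two records given in no particular order.
records = [f"{3:6d}{1:6d}{2.5:10.4f}", f"{1:6d}{2:6d}{-4.0:10.4f}"]
nonzeros = read_sparse_type_2(records)
```

    Keying the result by (row, column) reflects that only nonzero elements are supplied and that their order carries no meaning for this input type.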
    Extract: Stored-Data Mapping Language (SDML)
    Stored-Data Mapping Language (SDML)
    The SDML has two functions: (1) to describe the different types of mapping which the system can make between a logical schema and a target storage space, and (2) to describe the encoding to storage structures. The major structure of the language is comprised of the access path encoding and the encoded file. The major emphasis of the language is on the access path encoding, which represents the most difficult part of the mapping description. The encoded file section enables the assignment of encoded data (data items and pseudo data) to the files in the database according to the corresponding definitions of filenames and file accessing methods.
    The access path encoding section enables the selection of an appropriate mapping subsection and relates its subsections to the mapping descriptions of the direct, indirect and linked schema encoding groups. Reference to mapping descriptions defined in one encoding group by another is a common feature of the language, e.g. REF-ITEM definition of pseudo data in the indirect encoding subsection is referenced by the linked encoding subsection.
    The direct encoding, implied by the DATA-ORG: subsection, describes the data item with its properties like data ordering and type. It also provides for an optional definition of dimension and bandwidth for a source database description. The indirect encoding provides a choice of mapping alternatives for encoding pseudo data and data item to separate encoded files by the mapping descriptions identified by MAP-ORG: and REF-ORG: (see Figure 4). In addition, an ordered combination of pseudo data and data items may be mapped to an encoded file by MIXED-ORG: mapping description as follows:
    MIXED-ORG: SET(ORDERED {(REF-ITEM, DATA-ORG), (REF-ITEM, REF-ITEM, DATA-ORG)});
    The linked encoding enables the mapping of any set of nodes to an encoded file. Each node is identified by a user defined node-name and consists of a set of fields. Each field is described by an optional field-name and a field identifier which may be a node key, pseudo data, or data item.
    The mapping description consists of definitions of both primitive and nonprimitive data structures. The representation of structures of primitive type is usually by an assignment statement, while that of nonprimitive is by a descriptive statement consisting of a set or group name, and a set or group definition [16]. We provide the following constructs in the language to specify data, ordering and linkage definitions:
    1. ordering definition types--rowwise, columnwise and none;
    2. basic data types--integer, real, and alphanumeric;
    3. linkage definition types--header, first, next, prior, last, row, column, node, field, and null.
    A valid and meaningful linkage definition, except the NULL keyword, requires an ordered combination of the following: (1) a pointer linkage keyword, (2) row or column, and (3) node or field.
    The pointer linkage keywords are header, first, next, prior, and last. An example of a valid definition is FIRST ROW NODE.
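    The three-part rule for linkage definitions can be sketched as a small validator. This Python fragment is an illustration only and is not part of SDML: the function name is_valid_linkage and the set names are hypothetical, and the sketch assumes that keywords are separated by whitespace and may be given in either case.

```python
POINTER_KEYWORDS = {"HEADER", "FIRST", "NEXT", "PRIOR", "LAST"}
AXIS_KEYWORDS = {"ROW", "COLUMN"}
TARGET_KEYWORDS = {"NODE", "FIELD"}


def is_valid_linkage(definition):
    """Check a linkage definition against the ordered three-part rule."""
    words = definition.upper().split()
    if words == ["NULL"]:
        return True  # NULL is the one exception to the three-part form
    if len(words) != 3:
        return False
    pointer, axis, target = words
    # Order matters: pointer keyword, then row/column, then node/field.
    return (pointer in POINTER_KEYWORDS
            and axis in AXIS_KEYWORDS
            and target in TARGET_KEYWORDS)
```

    Under these assumptions, "FIRST ROW NODE" and "NEXT COLUMN FIELD" are accepted, while "ROW FIRST NODE" is rejected because its keywords are out of order.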
    A primitive type data structure which is semantically ambiguous, e.g. index and pointer, becomes a nonprimitive structure by qualifying the basic data definition with a semantic phrase definition as follows:
    INDEX: {integer, alpha}, TYPE = {ROW INDEX, COLUMN INDEX, CONCAT(ROW INDEX, COLUMN INDEX)};
    An access path is described by ORDERING and LINKAGE phrases. ORDERING describes the matrix data access path by row, column or none. It is assumed that the ORDERING of reference items, i.e., indices and locations (within the matrix or from diagonal elements) corresponds to that of matrix data items. LINKAGE describes linked list structure connectivity by a combination of linkage keywords as in the following example:
    PTR-ORG: SET(PTR-ITEM), LINKAGE=NEXT COLUMN FIELD;

          in Proceedings of the ACM 1980 annual conference January
  • Daini, Ola-Olu Adeniyi "An approach for numerical database management" PhD 1981 Northwestern University
  • Daini, Ola-Olu A. "Numerical database management system: a model" Proceedings of the 1982 ACM SIGMOD international conference on Management of data pp192-199
  • Ola-Olu A. Daini: A Language-Driven Generalized Numerical Database Translator. BIT 25(1): 91-105 (1985)