ASF(ID:1455/asf001)

Algebraic Specification Formalism 


for Algebraic Specification Formalism

CWI Holland.

Language for equational specification of abstract data types.

Places
Related languages
ASF => ASF+SDF   Subsumed
ASF => COLD   Influence

References:
  • J.A. Bergstra et al eds "Algebraic Specification", A-W 1989. view details
  • Cerioli, Maura; Gogolla, Martin; Kirchner, Helene; Bruckner, Bernd Krieg; Qian, Zhenyu; Wolf, Markus "Algebraic System Specification and Development - Survey and Annotated Bibliography" Second Edition Compass Group Bremen 1997 view details Abstract: Methods for the algebraic specification of abstract data types were proposed in the early seventies in the USA and Canada and became a major research issue in Europe shortly afterwards. Since then the algebraic approach has come to play a central role in research on formal program specification and development, as its range of applications was extended to the specification of complete software systems, to the formal description of the program development process, and to the uniform definition of syntax and semantics of programming languages. Today this approach extends beyond just software to the development of integrated hardware and software systems. These flourishing activities in the area of algebraic specifications have led to an abundance of approaches, theories and concepts, which have universal algebra, category theory and logic as a common mathematical basis.
    The present volume is an annotated bibliography which attempts to provide an up-to-date overview of past and present work on algebraic specification. No attempt is made to provide a coherent introduction to the topic for beginners the intention is rather to provide a guide to the current literature for researchers in algebraic specification and neighbouring fields. Some indications of how the different approaches are related are included. ps Extract: Extended ML
    Extended ML
    The long-term goal of work on Extended ML is to provide a practical framework for formal development of Standard ML programs together with an integrated suite of computer-based specification and development support tools and complete mathematical foundations to substantiate claims of correctness. The complete formal definition of the Extended ML language [533] constitutes an important milestone in this programme, necessary to provide a basis for further research on foundations and tools. One of the key issues that had to be resolved was the design of a logic suitable for describing properties of Standard ML programs, taking into account features like recursive type and function definitions, exceptions, polymorphism, and non-termination. A number of errors in the Standard ML semantics [679] have been discovered in the course of this work [526]. Work has begun on rules for proving theorems about EML specifications, to be used in verifying correctness conditions that arise in the course of formal development. Research has so far concentrated on equality reasoning [530] and quantifier rules.
    The length and requisite formality of the definition of Extended ML renders it rather difficult to penetrate. Accordingly, [532, 534] provides an informal overview of the definition, explaining most of the main issues involved and justifying some of the choices taken.
    Extract: LOTOS and PSF
    LOTOS and PSF are two formalisms following the approach of separate specification of data by choosing a process calculus as the formalism for concurrency. The differences between them are in the formalism for the algebraic specification part (ACT ONE [311] for LOTOS and ASF [64] for PSF) and in the combinators of the chosen process calculus (inspired by those of CCS for LOTOS and by ACP for PSF).
    PSF (see [657]) is the process specification formalism developed by Mauw and Veltink as a basis for a set of tools to support the process algebra of Bergstra, Klop et al' (see e.g. [65, 48]). The basic specification formalism is equational logic with total algebras. The theory and language of ASF is adopted for handling modular and parameterized specifications.
    LOTOS has probably been the first internationally known algebraic specification formalism for concurrency (see [187, 143]); most importantly, it is an official ISO language specification for open distributed systems, a qualification that alone would rank it high in an ideal value scale of possible important applications.
    LOTOS uses ACT ONE ( [311]) for the ADT specifications and a process description formalism based on an extension of CCS with several derived combinators (e.g. input/output of structured values, sequential composition with possible value passing, enabling/disabling operators). The basic specification formalism (equational logic with total algebras) is the same as in PSF and also the bisimulation semantics for processes.
    All these years, LOTOS has been used in several practical applications. Moreover, a toolset for helping to write correct LOTOS specifications has been developed (see e.g. the ESPRIT project LOTOSPHERE [911]). Recently a revised version of LOTOS (E-LOTOS, for Enhanced LOTOS) is under way, see [620], taking into account the needs raised during its application in industry.
    Extract: ASF
    ASF is based on a pure initial algebra approach. It provides export, import and renaming as specification-building operations. The semantics of these operations is defined axiomatically using module algebra. The specification part of the wide-spectrum language COLD developed within the ESPRIT project METEOR is strongly influenced by ASF Extract:
    Reusability
    The demand for reusability originates from economic considerations and attempts towards standardization. Rather than always starting from scratch, the use of existing components is cheaper and also less error-prone. A central problem for the identification and the correct use of reusable components is the abstract description of such components. A formal specification is the only form of description that can serve as a basis for a correctness proof; it can be processed automatically and it establishes a degree of confidence in the functionality of the component that is particularly important if a component has to be modified before being reused.
    Goguen [404] proposes the algebraic specification language OBJ as well-suited for the design of reusable software systems. A component's interface is specified as an abstract data type and may be parameterized by other components. Combination of components is possible by instantiation using appropriate fitting morphisms. A similar approach is used in Extended ML [836], [848].
    In ACT TWO [315] components are modules, which consist of two interface specifications, i.e. an export specification and an import specification, and a body specification which implements the export specification in terms of the import specification. Similarly, in LARCH [445], a component consists of a pair, an "abstract" interface specification and an implementation. Here the implementation is written in a conventional programming language. A similar distinction between a requirement and a design specification is made in PAnndA-S, the language of the PROSPECTRA project [138], based on the notion of visible and private interface in Ada. Ada bodies are generated semi-automatically by transformation.
    In the approach of [655] a component consists of four parts at four different levels of abstraction: a requirement specification, a design specification, a source program and object code. Components are written in a so-called wide spectrum language which provides facilities for system design and software development at many levels of abstraction. Matsumoto uses Ada, which has as a consequence that the requirement specification is merely an informal description. CIP-L [54] and COLD [341] are two languages which include algebraic specifications as well as predicative and imperative language styles in one coherent framework and therefore are better suited for such an approach.
    In the object-oriented approach of [675] a reusable component is represented by a graph of object-oriented programs, each node of which stands for one level of abstraction in the description of the software. In the approach of the ESPRIT-project DRAGON [945], a reusable component consists of a tree of algebraic specifications, where similarly to [675], each node of the tree is an instance of a certain level of an abstraction of the program with the root being the most "abstract" formal specification and the leaves representing the "concrete" implementations. A methodology for reusing components is described in [947].
  • M. G. J. Van Den Brand , J. Heering , P. Klint , P. A. Olivier, "Compiling language definitions: the ASF+SDF compiler" view details Abstract: The ASF+SDF Meta-Environment is an interactive language development environment whose main application areas are definition and implementation of domain-specific languages, generation of program analysis and transformation tools, and production of software renovation tools. It uses conditional rewrite rules to define the dynamic semantics and other tool-oriented aspects of languages, so the effectiveness of the generated tools is critically dependent on the quality of the rewrite rule implementation. The ASF+SDF rewrite rule compiler generates C code, thus taking advantage of C's portability and the sophisticated optimization capabilities of current C compilers as well as avoiding potential abstract machine interface bottlenecks. It can handle large (10,000+ rule) language definitions and uses an efficient run-time storage scheme capable of handling large (1,000,000+ node) terms. Term storage uses maximal subterm sharing (hash-consing), which turns out to be more effective in the case of ASF+SDF than in Lisp or SML. Extensive benchmarking has shown the time and space performance of the generated code to be as good as or better than that of the best current rewrite rule and functional language compilers. Extract: Brief survey
    2. BRIEF SURVEY OF THE ASF+SDF LANGUAGE
    In addition to regular rewrite rules, ASF+SDF features conditional rewrite rules, associative (flat) lists, default rules, and simple modularization. In our discussion of these features we will emphasize issues affecting their compilation rather than issues of language design. For the use of ASF+SDF see van Deursen et al..

    2.1 Syntax Definitions
    An ASF+SDF module can define arbitrary lexical and context-free syntax. An example of the former is shown in Figure 1.1 It defines sort ID for identifiers using a regular expression involving character classes. It imports module Layout, which defines lexical syntax for white space, comments, etc. (not shown). Furthermore, again using a regular expression, it defines a set of variables of sort ID. All definitions (including the variable declarations) are exported from the module, meaning they are available in modules importing it.

    A simple context-free syntax for statements is shown in Figure 2. It imports modules Identifiers and Expressions (not shown). In this way, it obtains definitions for sorts ID and EXP. It then defines sort STAT for single statements and STATS for lists of statements. List constructs like fSTAT ";"g+ denote separator lists. In this case, a list of statements consists of one or more statements separated by semicolons (but no semicolon at the end).

    2.2 Conditional Rewrite Rules
    We assume throughout that the terms being rewritten are ground terms, that is, terms without variables. A rule is applicable to a subterm if its left-hand side matches the subterm, and its conditions (if any) succeed after substitution of the values found during matching. Such a subterm is called a redex for that particular rule. The process of exhaustively rewriting a term is called normalization. The resulting term is called a normal form (if normalization terminates). Conditions may be positive (equalities) or negative (inequalities).

    Negative conditions succeed if both sides are syntactically different after normalization. Otherwise they fail. They are not allowed to contain variables not already occurring in the left-hand side of the rule or in a preceding positive condition. This means both sides of a negative condition are ground terms at the time the condition is evaluated.

    Positive conditions succeed if both sides are syntactically equal after normalization. Otherwise they fail. One side of a positive condition may contain one or more new variables not already occurring in the left-hand side of the rule or in a preceding positive condition. This means one side of a positive condition need not be a ground term at the time it is evaluated, but may contain existentially quantified variables. Their value is obtained by matching the side they occur in with the other side after the latter has been normalized. The side containing the variables is not normalized before matching.

    Variables occurring in the right-hand side of the rule must occur in the lefthand side or in a positive condition, so the right-hand side is a ground term at the time it is substituted for the redex.

    As a running example we will use a definition of the “language” of type environments (Figure 4). From the viewpoint of ASF+SDF, this is just a (small) language definition. As explained in Section 1, ASF+SDF does not distinguish data types with user-defined notation from language definitions with operations like typechecking, execution, and transformation.

    Module Type-environments defines a type environment (sort TENV) as a list of pairs, where each pair (sort PAIR) consists of an identifier (sort ID) and a type (sort TYPE). The latter is defined in module Types (Figure 3), which is imported by Type-environments. It defines bracket notations for pairs and lists of pairs as well as appropriate distfix notation for the operations lookup and add. A sample sentence of the type environment language would be add d with real to [(a:int),(b:real),(c:string)].

    The semantics of the language is defined by rewrite rules for the operations lookup and add. Consider rule [at-2] in Figure 4 keeping the above in mind. Its application proceeds as follows:

    (1) Find a redex matching the left-hand side of the rule (if any). This yields values for the variables Id1, Type1, Id2, Type2, and Pairs1.

    (2) Evaluate the first condition. This amounts to a simple syntactic inequality check of the two identifiers picked up in step 1. If the condition succeeds, evaluate the second one. Otherwise, the rule does not apply.

    (3) Evaluate the second condition. This is a positive condition containing the new list variable Pairs2 in its right-hand side. The value of Pairs2 is obtained by matching the right-hand side with the normalized lefthand side. Since Pairs2 is a list variable, this involves list matching, which is explained below. In this particular case, the match always succeeds.

    (4) Finally, replace the redex with the right-hand side of the rule after substituting the values of Id2 and Type2 found in step 1 and the value of Pairs2 found in Step 3.

    2.3 Lists
    ASF+SDF lists are associative (flat) and list matching is the same as string matching. Unlike a term pattern, a list pattern may match a redex in more than one way. This may lead to backtracking within the scope of the rule containing the list pattern in the following two closely related cases:
    —A rewrite rule containing a list pattern in its left-hand side might use conditions to select an appropriate match from the various possibilities.
    —A rewrite rule containing a list pattern with new variables in a positive condition (Section 2.2) might use additional conditions to select an appropriate match from the various possibilities.
    List matching may be used to avoid the explicit traversal of structures. Rule [l-1] in Figure 4 illustrates this. It does not traverse the type environment explicitly, but picks an occurrence (if any) of the identifier it is looking for using two list variables Pairs1 and Pairs2 to match its context. The actual traversal code is generated by the compiler. In general, however, there is a price to be paid. While term matching is linear, string matching is NP-complete [Benanav et al. 1985]. Hence, list matching is NP-complete as well. It remains an important source of inefficiency in the execution of ASF+SDF definitions [Vinju 1999].

    2.4 Default Rules
    A default rule has lower priority than ordinary rules in the sense that it can be applied to a redex only if all ordinary rules are exhausted. In Figure 4, lookup uses default rule [default-l-2] to return nil-type if rule [l-1] fails to find the identifier it is looking for.

    2.5 Modules
    ASF+SDF supports import, renaming, and parameterization. Renaming corresponds to replacing a syntax rule with another one and replacing the corresponding textual instances. Modularization is eliminated at the preprocessing level. An ASF+SDF function definition may be distributed over several modules. Since the compiler maps ASF+SDF functions to C functions, this hampers separate compilation. The full specification has to be scanned for each function.

    2.6 Rewriting Strategies
    ASF+SDF is a strict language based on innermost rewriting (call-by-value). This facilitates compilation to and interfacing with C and other imperative languages. In particular, it allows ASF+SDF functions to be mapped directly to C functions and intermediate results produced during term rewriting to be stored in an efficient way (Section 7.1).We also encountered cases (conditionals, for instance) where innermost rewriting proved unsatisfactory. In such cases, rewriting of specific function arguments can be delayed by annotating them with the delay attribute. See Bergstra and van den Brand [2000] for details.