SDF(ID:1522/sdf001)

Syntax Definition Formalism 


Syntax Definition Formalism.

CWI.

Language for lexical and syntactic specification.

Places
Related languages
SDF => ASF+SDF   Subsumed
SDF => fSDL   Extension of

References:
  • Heering, J. et al, "The Syntax Definition Formalism SDF - Reference Manual" view details
          in Bergstra, J.A. et al eds Algebraic Specification, , ACM Press 1989 view details
  • J. Heering , P. R. H. Hendriks , P. Klint , J. Rekers, "The syntax definition formalism SDF — reference manual", view details
          in SIGPLAN Notices 24(11) November 1989 view details
  • M. G. J. Van Den Brand , J. Heering , P. Klint , P. A. Olivier, "Compiling language definitions: the ASF+SDF compiler" view details Abstract: The ASF+SDF Meta-Environment is an interactive language development environment whose main application areas are definition and implementation of domain-specific languages, generation of program analysis and transformation tools, and production of software renovation tools. It uses conditional rewrite rules to define the dynamic semantics and other tool-oriented aspects of languages, so the effectiveness of the generated tools is critically dependent on the quality of the rewrite rule implementation. The ASF+SDF rewrite rule compiler generates C code, thus taking advantage of C's portability and the sophisticated optimization capabilities of current C compilers as well as avoiding potential abstract machine interface bottlenecks. It can handle large (10,000+ rule) language definitions and uses an efficient run-time storage scheme capable of handling large (1,000,000+ node) terms. Term storage uses maximal subterm sharing (hash-consing), which turns out to be more effective in the case of ASF+SDF than in Lisp or SML. Extensive benchmarking has shown the time and space performance of the generated code to be as good as or better than that of the best current rewrite rule and functional language compilers. Extract: Brief survey
    2. BRIEF SURVEY OF THE ASF+SDF LANGUAGE
    In addition to regular rewrite rules, ASF+SDF features conditional rewrite rules, associative (flat) lists, default rules, and simple modularization. In our discussion of these features we will emphasize issues affecting their compilation rather than issues of language design. For the use of ASF+SDF see van Deursen et al..

    2.1 Syntax Definitions
    An ASF+SDF module can define arbitrary lexical and context-free syntax. An example of the former is shown in Figure 1.1 It defines sort ID for identifiers using a regular expression involving character classes. It imports module Layout, which defines lexical syntax for white space, comments, etc. (not shown). Furthermore, again using a regular expression, it defines a set of variables of sort ID. All definitions (including the variable declarations) are exported from the module, meaning they are available in modules importing it.

    A simple context-free syntax for statements is shown in Figure 2. It imports modules Identifiers and Expressions (not shown). In this way, it obtains definitions for sorts ID and EXP. It then defines sort STAT for single statements and STATS for lists of statements. List constructs like fSTAT ";"g+ denote separator lists. In this case, a list of statements consists of one or more statements separated by semicolons (but no semicolon at the end).

    2.2 Conditional Rewrite Rules
    We assume throughout that the terms being rewritten are ground terms, that is, terms without variables. A rule is applicable to a subterm if its left-hand side matches the subterm, and its conditions (if any) succeed after substitution of the values found during matching. Such a subterm is called a redex for that particular rule. The process of exhaustively rewriting a term is called normalization. The resulting term is called a normal form (if normalization terminates). Conditions may be positive (equalities) or negative (inequalities).

    Negative conditions succeed if both sides are syntactically different after normalization. Otherwise they fail. They are not allowed to contain variables not already occurring in the left-hand side of the rule or in a preceding positive condition. This means both sides of a negative condition are ground terms at the time the condition is evaluated.

    Positive conditions succeed if both sides are syntactically equal after normalization. Otherwise they fail. One side of a positive condition may contain one or more new variables not already occurring in the left-hand side of the rule or in a preceding positive condition. This means one side of a positive condition need not be a ground term at the time it is evaluated, but may contain existentially quantified variables. Their value is obtained by matching the side they occur in with the other side after the latter has been normalized. The side containing the variables is not normalized before matching.

    Variables occurring in the right-hand side of the rule must occur in the lefthand side or in a positive condition, so the right-hand side is a ground term at the time it is substituted for the redex.

    As a running example we will use a definition of the “language” of type environments (Figure 4). From the viewpoint of ASF+SDF, this is just a (small) language definition. As explained in Section 1, ASF+SDF does not distinguish data types with user-defined notation from language definitions with operations like typechecking, execution, and transformation.

    Module Type-environments defines a type environment (sort TENV) as a list of pairs, where each pair (sort PAIR) consists of an identifier (sort ID) and a type (sort TYPE). The latter is defined in module Types (Figure 3), which is imported by Type-environments. It defines bracket notations for pairs and lists of pairs as well as appropriate distfix notation for the operations lookup and add. A sample sentence of the type environment language would be add d with real to [(a:int),(b:real),(c:string)].

    The semantics of the language is defined by rewrite rules for the operations lookup and add. Consider rule [at-2] in Figure 4 keeping the above in mind. Its application proceeds as follows:

    (1) Find a redex matching the left-hand side of the rule (if any). This yields values for the variables Id1, Type1, Id2, Type2, and Pairs1.

    (2) Evaluate the first condition. This amounts to a simple syntactic inequality check of the two identifiers picked up in step 1. If the condition succeeds, evaluate the second one. Otherwise, the rule does not apply.

    (3) Evaluate the second condition. This is a positive condition containing the new list variable Pairs2 in its right-hand side. The value of Pairs2 is obtained by matching the right-hand side with the normalized lefthand side. Since Pairs2 is a list variable, this involves list matching, which is explained below. In this particular case, the match always succeeds.

    (4) Finally, replace the redex with the right-hand side of the rule after substituting the values of Id2 and Type2 found in step 1 and the value of Pairs2 found in Step 3.

    2.3 Lists
    ASF+SDF lists are associative (flat) and list matching is the same as string matching. Unlike a term pattern, a list pattern may match a redex in more than one way. This may lead to backtracking within the scope of the rule containing the list pattern in the following two closely related cases:
    —A rewrite rule containing a list pattern in its left-hand side might use conditions to select an appropriate match from the various possibilities.
    —A rewrite rule containing a list pattern with new variables in a positive condition (Section 2.2) might use additional conditions to select an appropriate match from the various possibilities.
    List matching may be used to avoid the explicit traversal of structures. Rule [l-1] in Figure 4 illustrates this. It does not traverse the type environment explicitly, but picks an occurrence (if any) of the identifier it is looking for using two list variables Pairs1 and Pairs2 to match its context. The actual traversal code is generated by the compiler. In general, however, there is a price to be paid. While term matching is linear, string matching is NP-complete [Benanav et al. 1985]. Hence, list matching is NP-complete as well. It remains an important source of inefficiency in the execution of ASF+SDF definitions [Vinju 1999].

    2.4 Default Rules
    A default rule has lower priority than ordinary rules in the sense that it can be applied to a redex only if all ordinary rules are exhausted. In Figure 4, lookup uses default rule [default-l-2] to return nil-type if rule [l-1] fails to find the identifier it is looking for.

    2.5 Modules
    ASF+SDF supports import, renaming, and parameterization. Renaming corresponds to replacing a syntax rule with another one and replacing the corresponding textual instances. Modularization is eliminated at the preprocessing level. An ASF+SDF function definition may be distributed over several modules. Since the compiler maps ASF+SDF functions to C functions, this hampers separate compilation. The full specification has to be scanned for each function.

    2.6 Rewriting Strategies
    ASF+SDF is a strict language based on innermost rewriting (call-by-value). This facilitates compilation to and interfacing with C and other imperative languages. In particular, it allows ASF+SDF functions to be mapped directly to C functions and intermediate results produced during term rewriting to be stored in an efficient way (Section 7.1).We also encountered cases (conditionals, for instance) where innermost rewriting proved unsatisfactory. In such cases, rewriting of specific function arguments can be delayed by annotating them with the delay attribute. See Bergstra and van den Brand [2000] for details.
          in SIGPLAN Notices 24(11) November 1989 view details