Post-X(ID:5734/pos005)


Experimental applicative programming language for linguistics and string processing


References:
  • Bailes, P.A.C. and Reeker, L.H. "An experimental applicative programming language for linguistics and string processing" Abstract: The Post-X language is designed to provide facilities for pattern-directed processing of strings, sequences and trees in an integrated applicative format.

    Post-X is an experimental language designed for string processing, and for the other types of operations that one often undertakes in computational linguistics and language data processing.

    In the design of Post-X, the following four goals have been foremost:

    (1) To modernize the Markov algorithm based pattern matching paradigm, as embodied in such languages as COMIT and SNOBOL;
    (2) To provide a language useful in computational linguistics and language data processing, in particular, but hopefully with wider applicability;
    (3) To provide a vehicle for the study of applicative programming, as advocated by Backus, among others;
    (4) To provide a vehicle to study the application of natural language devices in programming languages, as advocated by Hsu and Reeker.

    The "X" in "Post-X" stands for "experimental", and is a warning that features of the language and its implementation may change from one day to the next. The eventual goal is to produce a language designed for wide use, to be called "Post" (after the logician Emil Post). In this paper, we shall present some of the language's facilities for string and tree processing. A more detailed statement of the rationale behind the language can be found in (2), and more details of the language are to be found in (2).

    Extract: Pattern Matching

    The basic idea of using pattern matching to direct a computation is found in the normal algorithms of Markov, and was embodied in the early string processing language COMIT. The series of SNOBOL languages developed at Bell Laboratories, culminating in SNOBOL4, improved a number of awkward features of COMIT and added some features of their own. Among these latter was the idea of patterns as data objects.
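
As a rough illustration of the Markov-style paradigm mentioned above (it is not taken from the paper and is not Post-X, COMIT or SNOBOL code), the following Python sketch applies an ordered list of rewrite rules, always rewriting the leftmost occurrence of the first applicable rule. The function name `run_markov`, the rule format and the sample rule sets are assumptions made for this example only.

```python
def run_markov(rules, text, max_steps=1000):
    """Apply ordered rewrite rules in the style of a Markov normal algorithm.

    rules: list of (pattern, replacement, is_terminal) triples (hypothetical format).
    At each step the first rule whose pattern occurs in the text is applied to
    the leftmost occurrence. The process stops when no rule applies or right
    after a terminal rule has been applied.
    """
    for _ in range(max_steps):
        for pattern, replacement, is_terminal in rules:
            index = text.find(pattern)
            if index != -1:
                text = text[:index] + replacement + text[index + len(pattern):]
                if is_terminal:
                    return text
                break  # a rule fired: start again from the first rule
        else:
            return text  # no rule matched, so the algorithm halts
    raise RuntimeError("step limit exceeded")


# Hypothetical rule sets used only for illustration.
UNARY_ADD = [("+", "", False)]       # erase the plus sign between two unary numbers
MOVE_B_LEFT = [("ab", "ba", False)]  # repeatedly swap "ab" into "ba"
```

Here the pattern match alone decides which step is taken next, which is the sense in which pattern matching "directs" the computation.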

    Post-X incorporates patterns into an applicative framework, which will be illustrated below. In doing so, the powerful pattern matching features of SNOBOL4 have been retained and, in fact, improved. In an applicative framework, the pattern match must return a value that can be acted upon by other functions. The pattern itself has been generalized to a much more powerful data object, called the FORM.

    A FORM consists of a series of alternative PATTERNS and related ACTIONS. Each pattern is very much like a pattern in SNOBOL4 (with some slight variations). FORMS may be passed parameters (by value), which are then used in the pattern or action portion.
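
The idea of a FORM as a series of alternative patterns with related actions can be approximated in Python. This is only an analogy under stated assumptions: regular expressions from the standard `re` module stand in for Post-X patterns, and the names `make_form` and `strip_suffix` are hypothetical. Actual Post-X syntax is not shown in this extract.

```python
import re


def make_form(alternatives):
    """Build a FORM-like function from (pattern_builder, action) pairs.

    Each pattern_builder receives the form's parameters and returns a regular
    expression; each action receives the match object and the same parameters.
    The alternatives are tried in order, and False is returned if none matches.
    """
    def form(subject, *params):
        for build_pattern, action in alternatives:
            match = re.search(build_pattern(*params), subject)
            if match is not None:
                return action(match, *params)
        return False
    return form


# Hypothetical FORM with one parameter: the suffix to strip.
strip_suffix = make_form([
    # Alternative 1: the subject ends with the suffix, so return the stem.
    (lambda suffix: r"^(.*)" + re.escape(suffix) + r"$",
     lambda m, suffix: m.group(1)),
    # Alternative 2: anything else, so return the subject unchanged.
    (lambda suffix: r"^(.*)$",
     lambda m, suffix: m.group(1)),
])
```

Because the form returns a value (or False) rather than assigning to variables, its result can be passed directly to another function, which is the applicative style the text describes.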

    A PATTERN determines the structure of the string to which it is matched. The pattern contains a sequence of concatenated elements, which are themselves PATTERNS, PRIMITIVE PATTERNS (utilizing most of the SNOBOL4 primitives) or STRINGS. The value returned by the pattern is either FALSE (if it fails to match) or a "parse tree" designating the structure of the string that corresponds to portions of the pattern. As an example, suppose that a pattern is P := p1^p2^...^pn. It may be matched to a string S = s0 s1 ... sn by the use of the operator "in", and if each of the pi matches a successive letter sj, one can conceptualize the "tree" returned as
    [fig]
    where s0 represents the unmatched portion to the left of the matched portion and sn+1 the portion to the right of the matched portion.

    The numbers 1, ..., n and the characters < and > in the example are SELECTORS, used in the ACTION portion to refer to the appropriate substring. The tree returned is denoted by $$, and ! is used for selection, but $$! can be condensed to $ in this context, so the expression $ < returns s0 in the example above, while $ 2 returns s2. The selectors give the effect of the short persistence variables that were found in COMIT, where they were denoted by numerals. These variables had the advantage of having their scope limited to a single line of the program, thus minimizing the number of variables defined at any one time. In Post-X, the selectors are local to a particular FORM.
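
The selector scheme described above can be sketched in Python as follows. This is a simplified model and not Post-X itself: the pattern elements are restricted to literal strings, the returned "tree" is a flat dictionary, and the function name `match_in` is hypothetical. The keys "<", 1, ..., n and ">" play the role of the selectors.

```python
def match_in(elements, subject):
    """Match a concatenation of literal strings against a subject string.

    Returns False if the concatenation does not occur in the subject.
    Otherwise returns a dictionary standing in for the parse tree:
      "<"      the unmatched text to the left of the match
      1 .. n   the substring matched by each pattern element, in order
      ">"      the unmatched text to the right of the match
    """
    start = subject.find("".join(elements))
    if start == -1:
        return False
    tree = {"<": subject[:start]}
    position = start
    for number, element in enumerate(elements, start=1):
        tree[number] = subject[position:position + len(element)]
        position += len(element)
    tree[">"] = subject[position:]
    return tree


# Usage: select parts of the result, roughly as `$ <` and `$ 2` do in the text.
tree = match_in(["b", "c", "d"], "abcde")
left_part = tree["<"]
second_part = tree[2]
```

Since the selectors exist only as keys of the returned value, they do not outlive the match result, which loosely mirrors the limited scope the text attributes to selectors within a FORM.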
          in 1980 International Conference on Computational Linguistics
  • Bailes, P.A.C., and Reeker, L.H. "Post-X: An Experiment in Language Design for String Processing", pp. 252-267 Abstract: Post-X is an experimental language, and it will evolve considerably. Portions will be excised and other features will be added. We are finding, however, that it allows the expression of many string processing problems in an elegant manner, avoiding the more objectionable problems posed by the "von Neumann bottleneck" introduced into the Markov algorithm based languages, while retaining the fundamental pattern-matching paradigm.

          in Australian Computer Science Communications, Vol. 2, no. 2, March 1980
  • Bailes, P.A.C., and Reeker, L.H., "The Revised Post-X programming language", Technical Report no. 17, Computer Science Department, University of Queensland, 1980
          in Australian Computer Science Communications, Vol. 2, no. 2, March 1980