Fornax(ID:1760/for037)

APL with patterns 


Fornax (after the constellation) Hall Rutgers U 1994

Interesting hybrid of Snobol (for pattern matching) and APL (for array control) used in general education and compiler design


Structures:
Related languages
APL => Fornax   Extension of
SNOBOL4 => Fornax   Influence

References:
  • Storrs Hall, J. "Fornax: A General Purpose Programming Language" view details External link: Page at USENIX Extract:
    Introduction

    During the 1980's it was the practice of the Student Chapter of the ACM at Rutgers to hold annual programming contests.  This author had the honor of being one of the judges at such a contest.  It worked as follows:  The entrants were teams, of from 3 to 5 students.  Each team was given an assignment of four programs to write.  The team which got all four programs running correctly the earliest, won.

    Normally most of the students used Pascal, which was the language taught in computer science courses, or Basic, which many had started in.  A few engineering students did their programming in Fortran.  This year there was entered a motley team of mavericks.  Instead of working together, each member would do his work alone, and in a different language.  One, who had interned at Bell Labs, would work in C.  Another, God help him, was going to use assembly language.  A third would try Lisp.  And the other member of the "screwball'' team, whose name was Damian Osisek, worked in SNOBOL.

    The contest was unexciting that night, because Mr. Osisek, working alone, completed all four programs before anyone else, individual or team, completed even one.  (Note: The following year, the use of SNOBOL (and APL) was banned.)

    Philosophy

    The general reaction to this phenomenon, and similar feats occasionally seen in APL, is to dismiss them as not applicable to larger, multi- person projects.  We believe the truth is not so simplistic.  If there is a tool, such as SNOBOL, which can rapidly solve problems of size x, it is the job of the software project manager to reduce the project to problems of size x and combine the results.  In fact, the manager must do this anyway, and tends naturally to prefer software systems that alleviate the integration task; he is less concerned about the actual programming of the subparts because other people do that.

    The other major objection to SNOBOL (and APL) is that they are interpreted, and thus unsuited for large systems that are heavily used or where performance is critical.  This objection carries forward to the more powerful of the modern languages such as Prolog and the functional languages.  It is perhaps even more cogent in the latter case, since Prolog is notoriously difficult to control; runtimes are not only slow, but unpredictable.

    The design Fornax is based on the notion that the state of the art in compilers has advanced sufficiently since the mid-1960's when SNOBOL and APL were invented, that reasonably efficient implementations of their primitives can now be compiled. The design goals of Fornax are, in order of importance:

    * A set of datatypes and data-manipulation primitives that are powerful, well-matched to each other, but which can be implemented efficiently.

    * A concise, uncluttered syntax.  The austere beauty of Miranda is not to be achieved, but we can avoid the garbage-dump appearance of Common Lisp.

    The primitive operations, of course, are part of an abstract world that includes the ontology of data-structures and the framework in which the operations are done.  Logic and functional languages attempt to abstract away from the notion of a small sequence of changes to an overall state; lower-level conventional languages overindulge in it as being an efficiently implementable reflection of the actual process in hardware. Fornax includes state changes to be used where necessary, and indeed encourages this as quite a powerful primitive concept, but also provides primitives for operations that are conceptually parallel or unitary to avoid gratuitous decomposition. Extract: Elements
    Elements

    The overall structure of a Fornax program is a set of function definitions, as in Lisp, APL, or the various logic and functional languages.  Like Prolog, there may be several entries with the same function name, which are selected between on the basis of pattern matching with the arguments.  Unlike Prolog, there is no backtracking. There are vector operators like APL and string pattern matching like SNOBOL.

    Within a given definition the syntax is a simple operator expression. Undeclared variables are lexically scoped and local to the function definition.  Side-effects are allowed but restricted to the action of explicit assignment operators; parameter passing is by value.

    There are two kinds of expressions, pattern and value.  A pattern can be thought of as a program that takes a string as input and either accepts (matches) it or not.  Pattern matching has many of the  features of programs, e.g. variable assignment can be done as part of a match.  The left-hand sides of assignment statements, and the formal parameters of functions, can be patterns, with interesting and sometimes useful results.

    Extract: Arrays
    Arrays

    The datastructure in Fornax is a recursive, multidimensional array. It is similar in semantics to the "array theory'' array of Trenchard More.  Arrays may be "closed'' to create synthetic scalar objects; objects of arbitrary complexity may be created in this fashion and treated as atomic by other code.  This capability is intended for for such "object oriented'' uses; multidimensionality is sufficient for more pedestrian aggregations.

    While no concrete datastructure implements Fornax arrays efficiently for all operations, linked lists, indexed memory arrays, trees, hash tables, and so forth are appropriate for various subsets. It is intended that the programmer ignore the details and use the abstraction as needed, and the compiler will determine how best to implement the resulting operations (even changing data representations if necessary).

    An array in Fornax code is written as a sequence of values, e.g. 6 7 8. A "sequence" of one value, is an array of one value, is a scalar, e.g. 4.   There is no requirement that each subarray of an array have the same number of elements, or indeed have the same dimensionality.  The "nested" and "multidimensional" interpretations of array structures in Fornax are unified; any array can be treated either way at any time. Extract: Scalar Extension and Parallel Application
    Scalar Extension and Parallel Application

    In APL, 2 + 3 4 automatically extends the 2 to the length of the vector 3 4 and automatically does + in parallel between the resulting vectors, producing 5 6.  Fornax does not do this; most of the functions operate on more complex structures than numbers, and it would be quite ad hoc to "destructure" them to some arbitrary level to do the parallelization automatically.

    Instead parallelization is specified explicitly, using the ".'' operator.  2 +. 3 4 produces the vectorization we want.  In general, a f. b produces a result that is the same length as b, of which each element is a f [the corresponding element of b]. Likewise a .f b produces a result corresponding to a.

    Extension is often used in conjunction with literal functions; any expression in {} is taken as a function; & and && represent its right, and if supplied, left arguments.  To take the absolute value of each number in list (without using abs), do {&>0|-&}. list. Extract: Success and Failure
    Success and Failure

    As in SNOBOL and Prolog, expressions in Fornax can succeed, returning a value, or fail.  Normally any expression having a subexpression which fails, will itself fail; some constructs, used for control flow, "catch'' failure.  Unlike Prolog, backtracking is not done.

    Arrays can contain failures as elements.  Any "dotted'' function application catches single-element failures. Extract: L-values and R-values
    L-values and R-values

    In conventional programming languages, only variables, subscript expressions, and record field expressions have L-values, that is, only those expressions that refer to an explicit storage cell.  SNOBOL allows the result of a pattern match to be an L-value, doing replacement of the substring that matched.  As special cases, insertion and deletion could be done, replacing a null string with a non-null one and vice versa. Fornax adopts this formulation of L-values, allowing a fairly general range of expressions.

    Patterns can be used as L-values, generally as the formal parameters to functions.  Variable names in this context are interpreted as equivalent to __, the pattern that matches any object.  Thus we can define list reverse as:

       rev (a,,b) := (,b), rev a
       rev () := ()

    (b is rotated so it must match a single element; a can match any length of list, including 0, preceding the last element.)
    Extract: Patterns
    Patterns

    Pattern matching happens in two modes.  The first is exact matching, denoted by ^, which is essentially an equality predicate.  The other mode is a substring match, denoted by ?, where the pattern must match some substring of the subject array.  That is to say, s ? p if there are arrays a, b, and c such that s = a,b,c and b^p.

    The simplest patterns are values.  A value, as a pattern, matches itself and nothing else: 5 6 ^ 5 6 and 4 5 6 7 ? 5 6 succeed, but  6 5^ 5 6 and 3 4 5 ? 5 6 fail.  The L-value, as well as the R-value, of a successful ^ is its left argument.  A ? returns  an array of the substrings matched; its R-value is the sequence of  places in the string where the matches occurred.  If

       a = "peter piper picked a peck of pickled peppers"

    then a ? "pe" returns "pe" "pe" "pe" "pe" "pe", and a ? "pe" .= "" leaves a equal to "ter pir picked a ck of pickled prs".

    Fornax has a set of pattern primitives allowing the construction of patterns roughly equivalent to the ones in SNOBOL.  Pattern  functions and value functions are overloaded onto the same symbol  set, sometimes subtly: v1, v2 is the concatenation of v1 and v2, whereas p1, p2 is a pattern that matches any concatenation of  something matching p1 and something matching p2.  On the other hand, p1 | p2 is the pattern alternation of p1 and p2 (just as in SNOBOL) but v1 | v2 is a flow-of-control expression like a Lisp OR.

    All value functions and their pattern cognates have the same precedence; an expression parses the same whether it is in value or pattern context.

          in USENIX Symp on Very High Level Langauges, Oct 1994 view details