SPLINTER(ID:474/spl017)

Scientific PL/I Interpreter 


for Scientific PL/I INTERpreter

Robert L. Glass, Boeing Corporation, 1966

PL/I interpreter with debugging features and scientific mathematical capabilities


Related languages
PL/I => SPLINTER   Subset

References:
  • Glass, R. L., "SPLINTER, A PL/I Interpreter Emphasizing Debugging Capability"
          in ACM-sponsored PL/I Forum, Aug. 1967, Washington, D.C.
  • Glass, R. L. "SPLINTER--a PL/I interpreter emphasizing debugging capability" PL/I Bull. (Mar. 1968). view details Extract: Introduction
    Introduction
    There is an increasing interest in the computing community in language processors which stress
    (1)     Quick compilation
    (2)     Maximum diagnostic capability.
    A covey of such processors has been described in recent computing literature - DITRAN (Moulton and Muller, 1967), WATFOR (Shantz, German, Mitchell, Shirley and Zarnke, 1967), PUFFT (Rosen, Spurgeon and Donnelly, 1965). What distinguishes the processor described herein from those mentioned above is a third item which can and should be added to the two above,
    (3)     Maximum debugging capability;
    as well as these other characteristics - the source language processed (PL/I), the mode of execution (interpretation), and the implementation technique (Fortran-coded, expandable).
    SPLINTER - Scientific PL/I INTERpreter - is an interpretive implementation of a subset of the PL/I language. Development of the processor has taken place incrementally - that is, a basic language was implemented and made usable, then additional features were added on in "upwards compatibility" fashion. By adhering to a modular processor design, the system has avoided the internal system clutter which might be expected of this sort of implementation, and has given the advantage of providing a system which can be expanded to include as much of the (seemingly infinitely large) PL/I language as core space, development time, and implementer patience permit.
    Implementation has been performed on a UNIVAC 1108 (in the Boeing Company Aerospace Division, Computing & Analysis organization) under the EXEC II (1107-compatible) operating system, and the processor is presently operational. The language in which the processor is coded is Fortran, which is primarily responsible for the fact that implementation has been done by one person over a span of eight months. How the obvious disadvantages of using Fortran in a language processor were overcome is discussed in detail later.
    Implementation Goals
    SPLINTER was implemented in order to meet the following design goals:
    (1)     Research
    (i)     Investigate the advantages of an interpretive language implementation. (For the purpose of this paper, an interpreter is defined as a language processor which maintains control during the execution as well as the compilation of the source program).
    (ii)     Produce a fast-compilation implementation of a powerful source language.
    (iii)     Produce a system which can be run in conversational mode on present generation hardware, and relatively easily moved to new generation hardware later.
    (iv)     Design the system in such a way that the interpretive section is modular, so that it could be removed and replaced by a compilation section if desired.
    (v)     Use a commonly-used "problem-oriented" language (Fortran) for language processor implementation.
    (2)     Development
    (i)     Ease the transition of application programmers to PL/I by making a PL/I system available now.
    (ii)     Give the user debugging tools (via interpretive mode) which most language implementations do not provide.
    (3)     Economics
    (i)     Reduce 1108 machine time by minimizing compilation time for checkout jobs, where compilation takes up the majority of the running time.
    (ii)     Possibly return the investment in the system by releasing it for a consideration.
    Most of the goals have been successfully achieved. Notable exceptions have arisen in the areas of (1) (iii), where development of a conversational version of SPLINTER is delayed pending development of a conversational operating system; (2) (i), where the demand for PL/I has been considerably less than anticipated (the author still feels, however, that PL/I is inevitable!); and (3) (i), where except for student-size jobs the economic cost of interpreting outweighs the economic advantage of fast compilation. On the positive side, the debugging/diagnostic advantages of interpretation have been considerable, the use of Fortran for processor development has been a significant success, and system modularity allows some interesting future development plans.
    Extract: Debugging/Diagnostic Facilities
    Debugging/Diagnostic Facilities
    Probably the most overlooked area of programming, from the point of view of development and system effort spent versus computer and programmer time involved, is debugging. Investigations of elapsed time from problem receipt to computer-produced answers show that the checkout phase of program development is a significant, and sometimes the significant factor. Yet because checkout techniques have developed as an individual art and because other problems were more interesting to solve, little effort has been made toward significant debugging capability. Language specifications traditionally overlook the matter entirely, making it an implementation feature if it is indeed present at all.
    The features which follow are not a part of the PL/I language and were added to SPLINTER through the facilities of interpretation. With one exception, the following statements are executable and thus take effect during execution rather than compilation of a program. Although they by no means exhaust the possibilities of user debugging aids, they constitute at least a minimal subset of such aids.
    Note that *LABEL-TRACE is a subset of *TRACE, and *STORE-TRACE namei is a subset of *TRACE namei.
    Statement     Meaning
    *TRACE;     Print statement numbers of all subsequent statements as they are executed.
    *LABEL-TRACE;     Print statement number and label of all subsequent labeled statements as they are executed.
    *TRACE name1, name2, . . . ;     At each subsequent execute-time reference to any of the named variables, print the name and value of the variable, and the number of the statement in which the reference occurs.
    *STORE-TRACE name1, name2, . . . ;     At each subsequent execute-time reference to any of the named variables which changes the value of the variable, print the name and value of the variable, and the number of the statement in which the reference occurs.
    *DETRACE;     Suspend *TRACE and/or *LABEL-TRACE printouts.
    *DETRACE name1, name2, . . . ;     Suspend *TRACE namei and/or *STORE-TRACE namei printouts for the named variables.
    *DEDEBUG;     Ignore all subsequent (compile-time) debugging statements.
    Since all the above statements except *DEDEBUG are executable, debugging facilities may be conditionally invoked; that is, the debug statement may be embedded in an IF statement.
    IF A < B THEN *TRACE A;
    Thus debugging may be pre-planned and debug printouts can be taken only under exceptional conditions, if desired. One disadvantage of pre-planned debugging is that many more debug statements are included in a program than will actually be needed. It is the purpose of the *DEDEBUG statement to automatically purge a program of those debug statements which a programmer failed, intentionally or inadvertently, to delete.
    In addition to the above dynamic debugging facilities, SPLINTER produces an optional formatted post-mortem dump. This dump is actually the processor name list, analogous to that produced by most compilers, except that it is taken at the conclusion of execution rather than compilation and, for all program variables (including those in arrays), contains the final value and the number of times the variable was referenced during execution. The name list is alphabetized, and fixed, floating, character/bit string and label items are printed in an appropriate format. The print of the number of references is useful for analyzing program performance, such as the number of calls on a particular subroutine or the number of times an iteration parameter was accessed.
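    The paper does not show the trace machinery itself. As a purely illustrative sketch (in Python; SPLINTER itself was coded in Fortran, and all names below are hypothetical), an interpreter can implement *TRACE name and *STORE-TRACE name by routing every variable fetch and store through the name list, which also yields the per-variable reference counts used in the post-mortem dump:

        # Illustrative sketch only: route every variable reference through one
        # place so that *TRACE / *STORE-TRACE printouts and the post-mortem
        # reference counts fall out naturally.  Hypothetical names throughout.

        class NameList:
            def __init__(self):
                self.values = {}           # current value of each variable
                self.refs = {}             # execute-time reference counts
                self.traced = set()        # names given in *TRACE name1, name2, ...;
                self.store_traced = set()  # names given in *STORE-TRACE name1, ...;

            def fetch(self, name, stmt_no):
                self.refs[name] = self.refs.get(name, 0) + 1
                if name in self.traced:
                    print(f"TRACE  {name} = {self.values.get(name)}  (statement {stmt_no})")
                return self.values.get(name)

            def store(self, name, value, stmt_no):
                self.refs[name] = self.refs.get(name, 0) + 1
                changed = self.values.get(name) != value
                self.values[name] = value
                if name in self.traced or (changed and name in self.store_traced):
                    print(f"TRACE  {name} = {value}  (statement {stmt_no})")

            def post_mortem(self):
                # Alphabetized dump of final values and reference counts.
                for name in sorted(self.values):
                    print(f"{name:<10} {self.values[name]!r:<12} refs={self.refs.get(name, 0)}")

        # Example: the effect of "*STORE-TRACE X;" followed by two stores into X.
        nl = NameList()
        nl.store_traced.add("X")
        nl.store("X", 1, stmt_no=5)      # value changes: printout
        nl.store("X", 1, stmt_no=6)      # value unchanged: no printout
        nl.post_mortem()
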
    Diagnostic facilities include the traditional syntax-and-sporadic-semantic checking during compilation, and illegal operation detection during execution. In addition, diagnostic detection is made on such errors as
    (1)     Variables referenced before they have been initialized.
    (2)     Subscripts out of range or omitted.
    (3)     Call sequences containing the wrong number of arguments, or arguments of the wrong mode (fixed vs floating, etc).
    The philosophy adopted throughout SPLINTER is that of the "forgiving" processor -- that is, diagnostics are given for all detectable errors, but every attempt is made to make a "best guess" and continue. This is based on the debatable theory that the probably trivial amount of machine time involved in continuing will be less than that required to rerun the job to the point of the error, that program errors are often as detectable using bad data as good, and that each computer run should produce a maximum of information in order to reduce both the total number of computer runs and the total programmer wait-time for results. As an example of this philosophy, SPLINTER will treat A = B (C + D); as implied multiplication if B is neither an array nor a function (after printing an appropriate diagnostic). However, to keep this philosophy from getting out of hand the processor controls all diagnostics through a central subroutine which aborts a program after 15 diagnostics of any type. For those who cannot accept this philosophy, it would be easily possible to introduce the concept of user-defined abort level.
    The centralized diagnostic processor also allows such features as standardized diagnostic formats: compile-time diagnostics print an underscore under errors in addition to the diagnostic, and execute-time diagnostics print the number of the statement containing the error as well as the diagnostic (the processor source listing contains statement and line numbers as well as the source text).
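    As a small illustration of this centralized approach (the 15-diagnostic abort limit is taken from the text; the routine name, message formats and error positions are assumptions), a Python sketch:

        # Sketch of a central diagnostic routine in the "forgiving processor"
        # spirit: every diagnostic passes through one function, which prints a
        # standardized message and aborts the run after a fixed count.

        ABORT_LIMIT = 15
        _diagnostic_count = 0

        class TooManyDiagnostics(Exception):
            pass

        def diagnostic(message, stmt_no=None, source_line=None, column=None):
            """Report an error, then keep going unless the per-run limit is hit."""
            global _diagnostic_count
            _diagnostic_count += 1
            if stmt_no is not None:                      # execute-time diagnostic
                print(f"*** STATEMENT {stmt_no}: {message}")
            else:                                        # compile-time diagnostic
                print(f"*** {message}")
                if source_line is not None and column is not None:
                    print(source_line)
                    print(" " * column + "_")            # underscore under the error
            if _diagnostic_count >= ABORT_LIMIT:
                raise TooManyDiagnostics("program aborted after 15 diagnostics")

        # A hypothetical execute-time use, after guessing implied multiplication:
        diagnostic("B IS NOT AN ARRAY OR FUNCTION, MULTIPLICATION ASSUMED", stmt_no=12)
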
    Extract: Language Features
    Language Features
    SPLINTER makes no attempt to implement the full PL/I language. However, it does implement a scientific subset which is, for the most part, at least as capable as Fortran IV.
    Standard Fortran-like features in SPLINTER are:
    1)     Arithmetic statements allowing expressions in function arguments and subscripts, and mixed-mode arithmetic.
    2)     IF statements allowing expressions, both THEN and ELSE clauses (IF A < B THEN A = B; ELSE A = C;), with full nesting (IF A < B THEN IF C < D THEN C = A; ELSE C = B; ELSE A = B;).
    3)     DO statements allowing expressions and the WHILE clause (DO A = B ** 2 TO B ** 2 + 5 BY C/2 WHILE (I > 10); . . . END;), with full nesting.
    4)     Traditional control statements, such as GO TO, CALL, END, STOP.
    5)     Format-directed output.
    Non-Fortran-like features are:
    (1)     Data-directed input/output. The statement GET DATA; allows the reading of a card-format-independent input record which consists of a string of variable=value statements separated by commas or blanks and terminated by a semicolon. The statement PUT DATA name1, name2, ...; outputs the specified variables in the same format as data-directed input.
    (2)     Internal subroutines via the statements PROCEDURE, RETURN (logical end of subroutine) and END (physical end of subroutine). Internal subroutines may have dummy call sequence variables and reference either local or global variables (local/global variable implementation is discussed later). The ENTRY statement allows multiple subroutine entry points. External subroutines are planned but not presently implemented.
    (3)     Block structure via the statements BEGIN and END, allowing local variables to be referenced in in-line as well as subroutine code.
    (4)     Dynamic array allocation with some user control. All arrays are automatically allocated dynamically when they either are given initial values or are initially referenced during execution. The user may retrieve array storage from the memory pool by using the statement FREE namel, name2, . . . ; if he is through with the named arrays.
    (5)     Names may be up to 31 characters in length, and statements are labeled rather than numbered.
    Interpretive Processing
    The processor consists of two major subprograms, the compiler and the interpreter. The interpreter is called by the compiler, with the eventual expectation that it can be called after each statement for incremental execution on a time-sharing system. However, the processor runs only in a batch environment at the present time and calls the interpreter only after detecting the END statement denoting physical end of program.
    The compiler operates in one pass and converts source statements into name list entries and internal language statements. The name list is accessed randomly via hashing techniques and will be discussed later. The internal language is a reverse Polish for arithmetic statements, augmented to include such additional operators as GO TO, IF, IF-THEN, PROCEDURE, etc., to allow the other statement types to be handled in a very straightforward manner. Operators are represented by encoded numeric quantities; operands are represented by name list subscripts pointing to the relative location in the name list where the operand attributes and value are maintained.
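    The paper does not list the internal language itself; the following Python sketch (hypothetical names, reduced to four operators and no parentheses) conveys the general idea of turning an expression into a reverse Polish string in which operands are name-list subscripts and operators are small numeric codes:

        # Illustrative sketch, not SPLINTER's Fortran: compile an expression to
        # reverse Polish with operands represented as name-list subscripts.

        OPS  = {'+': 1, '-': 2, '*': 3, '/': 4}     # encoded operator numbers
        PREC = {'+': 1, '-': 1, '*': 2, '/': 2}     # operator precedence

        def to_internal(tokens, name_list):
            """tokens: e.g. ['A', '+', 'B', '*', 'C'].  Returns (kind, value)
            pairs, where kind is 'opnd' (a name-list subscript) or 'op'."""
            out, stack = [], []
            for tok in tokens:
                if tok in OPS:
                    while stack and PREC[stack[-1]] >= PREC[tok]:
                        out.append(('op', OPS[stack.pop()]))
                    stack.append(tok)
                else:
                    if tok not in name_list:         # first occurrence: new entry
                        name_list[tok] = len(name_list)
                    out.append(('opnd', name_list[tok]))
            while stack:
                out.append(('op', OPS[stack.pop()]))
            return out

        name_list = {}
        print(to_internal(['A', '+', 'B', '*', 'C'], name_list))
        # [('opnd', 0), ('opnd', 1), ('opnd', 2), ('op', 3), ('op', 1)]
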
    The interpreter reads this internal language and executes its intent. Some of the work of processing is deferred until execution, partly because the PL/I language allows data declarations to follow references to the data (and thus the compiler cannot in one pass distinguish between fixed point addition and floating point addition, for example).
    It is interesting to note that no machine code is ever generated. Thus the processor is to an unusual degree machine independent.
    A great deal of the work of the interpreter is spent in retrieving attributes pertaining to operands in order to make decisions regarding arithmetic mode, function vs array vs scalar references, array dimensions, etc. The amount of time spent actually executing the intent of the source statement is relatively small; this overhead is, of course, the cost of interpreting as opposed to executing a program, and its magnitude is discussed below.
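    Purely as an illustration of this attribute-driven execution (not SPLINTER's actual code), a Python sketch of the interpretive loop over such a reverse Polish internal language, in which the attributes of both operands must be examined before every operation just to choose the arithmetic mode:

        # Sketch of the interpreter: operands carry attributes (here only FIXED
        # vs FLOAT), which are retrieved on every operation to pick the mode.

        def interpret(internal, attrs, values):
            """internal: ('opnd', subscript) / ('op', code) pairs in reverse
            Polish.  attrs[i] / values[i] hold name-list entry i."""
            stack = []
            for kind, item in internal:
                if kind == 'opnd':
                    stack.append(item)                   # push the subscript
                else:                                    # an operator code
                    j = stack.pop()
                    i = stack.pop()
                    a, b = values[i], values[j]
                    # attribute retrieval: fixed vs floating point arithmetic
                    mode = 'FLOAT' if 'FLOAT' in (attrs[i], attrs[j]) else 'FIXED'
                    if item == 1:   result = a + b
                    elif item == 2: result = a - b
                    elif item == 3: result = a * b
                    else:           result = a / b
                    if mode == 'FIXED':
                        result = int(result)
                    k = len(values)                      # scratch name-list slot
                    values.append(result)
                    attrs.append(mode)
                    stack.append(k)
            return values[stack.pop()]

        # A + B * C with A FIXED, B FLOAT, C FIXED (internal form shown above):
        internal = [('opnd', 0), ('opnd', 1), ('opnd', 2), ('op', 3), ('op', 1)]
        print(interpret(internal, ['FIXED', 'FLOAT', 'FIXED'], [2, 1.5, 4]))   # 8.0
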
    Extract: Interpreting - The Cost
    Interpreting - The Cost
    Some studies of the relative cost of interpreting vs compiling-executing have been made using SPLINTER, not enough to be definitive but enough to draw some tentative conclusions which match what language implementers intuitively feel about interpreting.
    Two test programs were run on the 1108 Fortran compiler (a processor which performs considerable optimization and is a good example of a production-job oriented language processor) and on SPLINTER. The first, which should show interpreting at its worst, involved the execution of a DO-loop 10,000 times. The second, a small program with a better balance between compilation and execution, involved the simulation of a baseball game.
    Worst Case (DO loop 10,000 times)
                            1108 Fortran      PL/I Interpreter
    Compile & Load          2 secs.           < 1 sec.
    Execute                 < 1 sec.          33 secs.

    Average Case (Baseball game simulation)
                            1108 Fortran      PL/I Interpreter
    Compile & Load          2 secs.           < 1 sec.
    Execute                 2 secs.           5 secs.
    Obviously, interpretation should never be used for production-type work. Even for average small jobs, it is economically dubious; however, the capability advantages of better diagnostics and debugging may outweigh the disadvantage of execution cost.
    Only in programs which involve trivial execution time, such as student or training problems, will interpretation actually be justifiable on solely economic grounds; as shown here, compilation is roughly twice as fast as an efficient Fortran processor.
    Extract: Name List Implementation Technique
    Name List Implementation Technique
    The name list presents some interesting problems in that it must allow for names of up to 31 characters in length, many attributes for each name, and local/global name conventions. The table consists of three Fortran arrays: NMRAND, which contains all the attributes pertaining to the names in packed bit form; NAMES, which contains the Hollerith names themselves; and NMVAL, which contains either the execute-time value of the scalar variable, the internal language text pointer of a label variable, or the array information for an array variable.
    For each variable in the source program there is one word assigned in the NMRAND and NMVAL tables. These words are accessed by hashing the name of the variable to obtain a table subscript. Since the name in its Hollerith representation may be longer than one 1108 word, NMRAND contains a pointer to a word in the NAMES table which is the first of those required to hold the name. Entries in NAMES are made sequentially as the names are encountered; access to NAMES is always via the pointer in NMRAND, so that all name table lookups during compilation are random, not sequential.
    Since operand references during interpretation are made via NMRAND subscripts, there is never any need during interpretation for any form of name list lookup except direct access.
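    A minimal Python sketch of the three-array scheme (the table size, hash function and open-addressing probe are assumptions made for illustration; SPLINTER's actual tables are Fortran arrays with packed attribute bits and Hollerith name words):

        # Sketch of the name list: NMRAND holds attributes plus a pointer into
        # NAMES, NAMES holds name text entered sequentially, NMVAL holds the
        # value / label pointer / array-information word.

        TABLE_SIZE = 211                        # hypothetical table size

        NMRAND = [None] * TABLE_SIZE            # per entry: {'name_ptr', 'attrs'}
        NMVAL  = [None] * TABLE_SIZE            # scalar value, label, or array info
        NAMES  = []                             # name text, entered sequentially

        def lookup(name):
            """Hash the name to a subscript, probing past collisions; create
            the entry if the name is new.  Returns the name-list subscript."""
            h = sum(ord(c) for c in name) % TABLE_SIZE     # toy hash function
            while True:
                entry = NMRAND[h]
                if entry is None:                          # free slot: new variable
                    NMRAND[h] = {'name_ptr': len(NAMES), 'attrs': {}}
                    NAMES.append(name)
                    return h
                if NAMES[entry['name_ptr']] == name:       # found existing entry
                    return h
                h = (h + 1) % TABLE_SIZE                   # collision: next slot

        # Compilation hashes names to subscripts; interpretation then uses the
        # subscripts directly, with no further searching.
        i = lookup('DELTA')
        NMVAL[i] = 3.14
        print(i, NAMES[NMRAND[i]['name_ptr']], NMVAL[i])
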
    To implement the full PL/I name manipulation capability, analogous to that in COBOL, such as BY NAME (CORRESPONDING) and name qualification referencing, would involve a more sophisticated approach to name list handling, possibly involving bi-directional threaded lists. This area deserves further study and, perhaps, publication on the part of computing sophisticates. It was not attempted here because it was not within the needs of a scientific subset of PL/I.
    It was stated early in this paper that the processor was built incrementally but that this had not led to system clutter because of a modular design. It did, however, lead to one restriction on the use of local/global names which will be intolerable to Algol addicts but acceptable to Fortran fanatics: PROCEDURE-END and BEGIN-END blocks may not be nested (defined within one another). This is because the name list was constructed without consideration of local/global naming, and so much of the processor is tied to the name list that it became economically undesirable to introduce any significant change in name list treatment in order to implement local/global names. This constraint led to the nesting restriction.
    It is encouraging, however, that with this one restriction the implementation of local/global names in an existing name list was not only feasible but accomplished in under one week of implementer time. Since dummy variables for subprograms were added at the same time, the conversion time was quite trivial, a fact which may be encouraging to other implementers who may be considering adding local/global capability to an existing language processor, such as a Fortran compiler.
    Basically, the technique required establishing three more attributes (at a cost of 3 bits in each NMRAND word) in the name list for each variable: a flag saying whether the variable is local or global, another saying whether the variable, if local, is active (the compiler is operating in the domain in which it is meaningful) or inactive, and a third saying whether an active local variable exists or not. Each distinct definitive occurrence of the same name, whether local or global, results in one location in the name list being assigned to the variable. The three attributes serve to distinguish between these otherwise identical variables, identify the one local variable which is active (if any), and delimit the length of the search for that active local variable. The "active" flags must be purged from the name list each time an END statement ending a PROCEDURE or BEGIN block is encountered, a process which takes a significant amount of time but which occurs so infrequently as to have no detectable effect on compilation time.
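    The following Python sketch illustrates the flag scheme under those assumptions (it keeps only the local/global and active flags; the third flag, which merely delimits the search for the active local, is omitted, and all names are hypothetical):

        # Sketch of local/global name handling: each distinct defining
        # occurrence of a name gets its own name-list entry, and two flags
        # decide which entry a reference resolves to.

        class Entry:
            def __init__(self, name, is_local):
                self.name = name
                self.is_local = is_local    # flag 1: local or global
                self.active = is_local      # flag 2: local entry whose block is
                                            # currently being compiled
                self.value = None

        name_list = []

        def declare(name, is_local):
            e = Entry(name, is_local)
            name_list.append(e)
            return e

        def resolve(name):
            """Prefer the active local entry, if any; otherwise the global one."""
            for e in name_list:
                if e.name == name and e.is_local and e.active:
                    return e
            for e in name_list:
                if e.name == name and not e.is_local:
                    return e
            return declare(name, is_local=False)      # implicit global

        def end_block():
            """On the END of a PROCEDURE or BEGIN block, purge the active flags."""
            for e in name_list:
                e.active = False

        # Example: a global X, then a local X inside a PROCEDURE block.
        declare('X', is_local=False).value = 1.0
        declare('X', is_local=True).value = 2.0
        print(resolve('X').value)      # 2.0: the active local wins
        end_block()
        print(resolve('X').value)      # 1.0: back to the global
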
    Extract: Using Fortran to Write a Language Processor
    Using Fortran to Write a Language Processor
    As stated previously, SPLINTER was coded in Fortran. This was not done to facilitate machine independence, although machine independence is always worthwhile at least as a byproduct, but to facilitate the coding process itself.
    There have been two traditional reasons why Fortran, or other so-called problem-oriented languages, have not often been used for software implementation:
    1)     The language is inadequate.
    2)     The generated code is inefficient.
    Both deficiencies are overcome by the same device: unashamedly dropping into machine language at any point where language deficiencies or inefficiencies make it look desirable. In particular, machine language subprograms are used for character fetching, bit string manipulation, internal language storing and retrieving, binary table lookup on keywords, accessing operating system functions not accessible through Fortran, fixed point double precision evaluation of floating point constants in the source program, dynamic storage allocation, dynamically formatted printouts, and name list sorting. In spite of the number of such routines listed, the relative amount of machine language coding is quite small.
    To identify those areas where Fortran-code produced intolerable inefficiencies, a simple timing analysis scheme is used. Two trivial subroutines were coded, one of which interrogates the real time clock and stores its value, the other of which calculates the increment of time since some previous entry and adds it to a total time cell. At various critical modules in the processor, a linkage is made to the first subroutine on entry and to the second on exit. The total time in that module is thus obtained and can be printed out at the end of execution of the processor.
    By visually examining these times for various processor modules, it is possible to spot those areas where changes for the sake of efficiency are desirable and those in which changes would be a waste of the implementer's time.
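    A Python sketch of the probe pair (time.perf_counter() stands in for the 1108 real-time clock, and the routine names are invented):

        # Sketch of the timing scheme: one routine records the clock on entry
        # to a module, the other adds the increment to that module's total.

        import time

        _entry_time = {}
        _total_time = {}

        def probe_in(module):
            """Call on entry to a critical module: remember the clock reading."""
            _entry_time[module] = time.perf_counter()

        def probe_out(module):
            """Call on exit: add the increment since probe_in to the total."""
            _total_time[module] = (_total_time.get(module, 0.0)
                                   + time.perf_counter() - _entry_time[module])

        def report():
            """Print per-module totals at the end of the run for inspection."""
            for module, total in sorted(_total_time.items()):
                print(f"{module:<24} {total:.6f} sec")

        # Usage: bracket each module of interest, then report at shutdown.
        probe_in("name_list_lookup")
        sum(range(100_000))              # stand-in for the module's real work
        probe_out("name_list_lookup")
        report()
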
    Extract: Documentation Idiosyncrasies
    Documentation Idiosyncrasies
    Incremental system development leads to a difficulty in system documentation. Either the documentation is nonexistent or badly obsolete at any given point in time, or it, too, must be developed incrementally. The latter course has been chosen for SPLINTER.
    To avoid the screams of secretarial anguish which would accompany frequent demands for retyping such a document, documentation has been constructed on punched cards. Changes may be easily made, either to the punched card images on tape via a text editor or simply by manipulating the punched cards themselves, and a new listing of the document produced. As a byproduct, any convenient text production or manipulation programs may be used on the document. In the case of the 1108, only a syntax-directed cross reference lister is known to be available in support of this need; it is used to produce an automatic index for the document.
    In addition to the user document discussed above, an implementation document was prepared in the form of an unpublished elementary-level text in compiler writing.
    Extract: Future Plans
    Future Plans
    One of the design goals mentioned earlier was that of modularity, so that the processor could be easily expanded but also easily converted into other types of processors. It is not the intention of the implementer to build a static system, but rather one with which continuous development and research can be performed.
    Some work has been done in these areas using SPLINTER as either a nucleus or as a reservoir of useful subroutines:
    1)     Significance analysis of computed data by letting the interpretive processor keep track of significance as an attribute of each variable.
    2)     Experiments in implementing a Fortran to PL/I translator.

    Directions for future development will include at least one of the following:
    1)     Building interpreters for other source languages by changing only portions of the compiler subroutines, none of the interpreter.
    2)     Building a conversational interpreter when a time-sharing operating system becomes available.
    3)     Building a PL/I compiler by substituting a code generator for the interpreter subroutine.

          in PL/I Bulletin, Issue 6, March 1968
  • R. L. Glass, "An Elementary Discussion of Compiler/Interpreter Writing" pp. 55-77, March 1969.
    Abstract: Elementary techniques are described for the implementation of compilers, interpreters, and translators of programming languages. An overview of such a language processor is presented, followed by a detailed discussion of certain building blocks useful in its construction. Frequent reference is made to a specific implementation, that of a PL/I interpreter called SPLINTER (Scientific PL/I INTERpreter) on the UNIVAC 1108 at Boeing.
    Extract: SPLINTER design
    INTRODUCTION
    It is the purpose of this paper to describe the techniques used to implement a PL/I interpretive system, written in the FORTRAN language for the UNIVAC 1108 at Boeing, covering in breadth the general approach used, and in depth some of the more interesting techniques.
    [...]
    The project upon which this discussion is based is a PL/I subset interpreter called SPLINTER (Scientific PL/I Interpreter) developed by one person at Boeing in approximately six man-months in late 1966. Many of the techniques used are source language independent, and emphasis will be placed not on the PL/I language or interpreters, but on the solutions to those problems which are common to compilers, interpreters, or translators of any source language. It is worth mentioning in passing that the SPLINTER processor was written to service several design goals (Appendix A), among them being to produce a system which could be easily modified to process other source languages, to be a conversational processor, or to be a compiler.
    For the purposes of this paper, the following definitions are made:
    Processor--a computer program which processes other computer programs.
    Compiler--a processor which converts a program written in a programming language (called the "source code") into code which a computer can execute (called "object code").
    Interpreter--a processor which accepts a program written in a source code, converts it into some readily executable form, and performs a controlled execution. The readily executable form may or may not be object code.
    Translator--a processor which converts one source code into another.
    In what follows, the word "compiler" is occasionally used to imply "compiler, interpreter, or translator." The context in which these references occur should leave no doubt as to the intent.

          in [ACM] ACM Computing Surveys (CSUR) 1(1) January 1969
  • Rosin, R. F. "PL/I Implementation Survey" PL/I Bulletin 7 (in ACM SIGPLAN Notices, Feb. 1969)
    Extract: Introduction
    This report summarizes all data collected in a survey of implementations of the PL/I language. An attempt was made to contact all people or groups known or rumored to be undertaking such a project. A fairly lengthy questionnaire was mailed out when a prospect was identified, beginning in August, 1967. The latest response was received in August, 1968.
    As is the case in most such efforts, it is likely that some projects were overlooked. In other instances, our attempts to contact people were met with no response at all, neither confirming nor denying the existence of the implementation involved. There were also a few cases in which one or more implementations did indeed exist; but their nature was proprietary and, therefore, the questionnaire was not completed. And in two cases, the individual contacted replied by stating that the supposed implementation had never existed. Only in the latter cases is the attempt at contact not included in the overall summary table.
    The data are summarized in four tables. The first contains the basic identification of all attempted contacts and the resulting response. The second table summarizes all questionnaires returned with respect to implementation data. The third and fourth tables relate responses to questions about the PL/I language and the dialect implemented. Responses from three questionnaires are excluded from these tables since, in the author's opinion, the dialects are more closely related to languages other than PL/I, itself. The third table summarizes restrictions found in the various dialects, while the fourth lists the extensions found in one or more dialects.
    This report suffers from the fact that the questions were somewhat general (e.g., "Which features have you implemented in a manner at variance with C28-6571-4?"), and the responses, therefore, reflect the interpretation imposed by the responder. An attempt has been made to normalize these aspects of the data, but it is clear that the result is far from ideal.
  • Sammet, Jean E. "Computer Languages - Principles and History" Englewood Cliffs, N.J.: Prentice-Hall, 1969. p.600.