LUCIFER(ID:7964/)

Text manipulation system for experimental results 


for LINC Unrelenting Console Interception and File Editing Routines

Text manipulation system for online experimentation

Mike Wilber

Stanford Research Institute
Menlo Park, California  1966


References:
  • Wilber, B. Michael "LINC-8 text-handling software for on-line psychophysical experiments" DECUS Spring 1967
    Abstract: A complete text-handling system (LUCIFER) has been developed for the LINC-8.  All communication between LUCIFER and mortal man is carried on through a Teletype medium, so that hard copy is always produced, and one need never invoke scope, switches, and lights.  Along with LUCIFER have appeared subroutines by which experiment-running programs can do input and output of data with text files or the Teletype.  This paper discusses the philosophy of LUCIFER and includes examples of the use of LUCIFER and the running of a typical experiment.
    Extract: Introduction
    We are using a LINC-8 computer for presenting stimuli and recording responses in psychophysical experiments.  This use is characterized by extremely low data rates over long sessions.  For example, experimental sessions typically take twenty minutes to an hour, with data rates of 30-180 bits per minute in each direction.  Since much, if not all, of the computer's time is taken with running experiments, and because of the availability of commercial remote-access time-sharing computer facilities using ASR-33 terminals, we have decided not to do processing on the LINC-8 that can be done remotely.  Communication between computers is via punched paper tape, and since this is our only use of that medium,  it is not viewed as onerous.  (Eventually we may be able to eliminate this use with a connection of the LINC-8 to the telephone lines.) Since the output of almost all the  experiment-running programs is to other  programs, output formatting is considerably simplified.

    Because of the low data rates involved in our experiments, it is practical to input and output data in the form of text files called manuscripts.  This greatly simplifies the problems of preparing input (stimulus) files and making sense of output (response) files because our regular text-handling programs can be brought to bear on these files.  These programs, part of the LUCIFER system, are needed for preparing the programs to run the experiments, so a great saving is realized by using the same programs to handle the data as much as it is handled on the LINC-8.  A family of text-handling subroutines has grown out of the family of programs and has been made a part of LUCIFER, giving it the structure shown in Fig. 1.  Experiment-running programs can handle text by merely incorporating the subroutines and using simple calling sequences.  An example of this is shown in Figs. 2 and 3.  We should note in passing that LUCIFER includes a program to convert almost any tape from LAP4 format to LUCIFER format, but there is no such program for LAP6.  It might also be mentioned that the LUCIFER programs use the PROGOFOP typeout instruction and assume the keyboard and typing mechanism are connected, so they probably could not be modified to run on the so-called classic LINC.
    Extract: The LUCIFER Philosophy
    The LUCIFER Philosophy

    Part of the philosophy of LUCIFER is that a typewriter-like device is a good medium for interacting with computers.  With such a device, one has hard copy of what one did just before somebody walked in with an interesting question.  Since one is often referring to typed and written material, a typed page in a well-lighted room seems to cause a good deal less eyestrain than a flickering scope in a dark room--and well-lighted rooms are easier to come by than dark ones.  When one is interacting with an active special-purpose program, one is less tempted to push the wrong button than with a passive, general purpose bank of switches and lights, because only logically "correct" commands need be accepted by the program; and one can easily arrange that the consequences of a mistaken command are easier to recover from.  See Fig. 4.  Again, hard copy is produced automatically as a by-product of initializing a program instead of (hopefully) by somebody noting down on paper what is put into certain magic locations, hopefully without making mistakes.  There also seems to be some advantage to leaving the file name permanently and automatically recorded on paper instead of hoping the right paper or magnetic tape or card deck was put in the right place.  Operating instructions can almost completely be dispensed with if the program gives an idea of what parameters are needed, and in this mode, one can easily arrange that no steps can possibly be forgotten. Our experiment-running programs run through a set dialog and do not start the experiment until the end of the dialog.

    It is felt that interactive programs can have more natural and easy-to-use command structures than currently appearing in many distributed programs.  The LUCIFER programs are few in number, interactive and frequently used, so ease of memorization is a minor consideration.  This is also helped by the fact that they have as many commands in common as apply.

    In most situations not every command makes sense at all times.  For instance, a line of the current file cannot be examined if there is no current file.  It would help if one could obtain a priori knowledge of what sort of commands are acceptable in a given situation.  In the LUCIFER and experiment-running programs, this consideration has given rise to the concept of prompting, i.e., all type-in is in response to some type-out from the program, and this type-out is somehow indicative of the nature of response expected.  For example, as illustrated in Fig. 5, if the last thing typed was an asterisk, then the program expects a file name, consisting of up to six characters terminating on a carriage return.  Furthermore, rubout functions as a backspace and types as a backslash, and the special name blank Q (" Q") causes the program to return to GUIDE.
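
    The file-name prompt described above can be pictured with a minimal sketch.  It is not taken from the paper: the function names, the rubout character code, and the handling of a seventh character or of a rubout on an empty name are all assumptions made for illustration.

```python
RUBOUT = "\x7f"  # assumed code for the rubout key
CR = "\r"        # carriage return terminates the name
MAX_NAME = 6     # "up to six characters", per the text


def read_file_name(keys):
    """Process one prompted file-name entry.

    `keys` is an iterable of typed characters.  Returns (name, echo), where
    `echo` is what would appear on the hard copy, starting with the asterisk.
    """
    echo = ["*"]  # the asterisk prompt announces that a file name is expected
    name = []
    for ch in keys:
        if ch == CR:
            break
        if ch == RUBOUT:
            if name:
                name.pop()     # rubout functions as a backspace...
            echo.append("\\")  # ...and types as a backslash
        elif len(name) < MAX_NAME:
            name.append(ch)
            echo.append(ch)
        # Assumption: characters beyond the sixth are silently ignored.
    return "".join(name), "".join(echo)


def is_exit_command(name):
    """The special name blank-Q asks the program to return to GUIDE."""
    return name == " Q"
```

    Here, typing DATB, rubout, A1 and a carriage return yields the name DATA1, while the paper record shows the correction as a backslash.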

    In this connection, a little can be said about the LUCIFER programs' command structures and methods of operation.  First, one cannot modify text, directories, or core locations without first seeing what one is modifying.  This is in contrast to some systems in which one cannot see one's text and change it at the same time.  For example, to kill or replace a line of text, one directs the editor's attention to that line, which causes its contents to be typed out, and only then will the editor accept a command to replace or kill it.  There is a certain amount of protection from careless errors afforded by the fact that the positions of the keys on the keyboard were considered when choosing the commands to be associated with certain actions.  For instance, the editor kills and replaces lines and inserts lines before others (the commands are K, R and B) instead of changing, deleting and inserting forward of a given line (the commands might be C, D and F) or killing, inserting and overlaying, with commands K, I and O.
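
    The rule that a line must be seen before it can be changed can be sketched as follows.  This is an illustration only: the class and method names are hypothetical, and the choice to close the line again after each command is an assumption, not something the paper states.

```python
class LineEditor:
    """Toy editor that refuses K, R and B until a line has been opened."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.current = None  # index of the opened line, if any

    def open(self, number):
        """Direct attention to a line (1-based) and return the text typed out."""
        if not 1 <= number <= len(self.lines):
            raise IndexError("no such line")
        self.current = number - 1
        return self.lines[self.current]

    def command(self, letter, text=None):
        """Apply K (kill), R (replace) or B (insert before) to the open line."""
        if self.current is None:
            raise RuntimeError("no line is open")
        if letter == "K":
            del self.lines[self.current]
        elif letter == "R":
            self.lines[self.current] = text
        elif letter == "B":
            self.lines.insert(self.current, text)
        else:
            raise ValueError("unknown command")
        # Assumption: a line must be opened afresh before the next change.
        self.current = None
```

    Because `command` checks for an open line first, a kill or replace issued blind is rejected rather than applied to whatever line happens to be current.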

    All the LUCIFER programs and all the experiment-running programs always refer to files by name, and so the actual location of a file is not relevant unless one wishes to add new files to the directory or expand ones already there.  On the other hand, the format and location of the directory are well noted in the documentation, and the only program which lists the names in the directory also lists sufficient additional information to reconstruct parts of the directory or even the entire directory.

    The editor edits a file in place and is a random access editor.  Thus opening a file involves ascertaining its location, validity and length, requiring the inspection of exactly two tape blocks, instead of copying the file into some "working area" and/or inspecting it to build a directory.  Also, one has no need to edit a file serially and/or be continually copying it between two temporary files and finally rename one with the original's name and kill the other one and the original source.
    Extract: The LUCIFER Programs
    The LUCIFER Programs

    The purpose in writing the LUCIFER programs was primarily to facilitate the process of forming programs on the LINC-8.  This imposed two requirements on LUCIFER.  First, it had to have a much more tractable overall organization and command structure than other available systems.  Also, it was felt mandatory that there should be no overlaying. This not only greatly enhances response time of the programs and greatly facilitates the process of debugging them, but it makes programs more readable, permits them to be ordinary GUIDE programs, facilitates their assembly, and otherwise facilitates local and remote program updating.  A secondary purpose was to consolidate action from the switches, lights, scope and Teletype to just the Teletype.  The reason this was felt desirable is a foundation of the philosophy of LUCIFER. LUCIFER includes DDT, a debugging program, and four kinds of text-handling programs: the programs of primary interest, some book-keeping programs, two programs whose sole purpose is to allow our experiments to communicate experimental data with other computers, and one program which does not exist.

    DDT is a simple program to examine and change storage locations in octal and exert some control over the execution of a program.  It always works on the most recently assembled program, and there is no provision for saving (with GUIDE's FILEBI) this program in any but its pristine state - it is possible to override this, but it is often easier to bring the manuscript up to date, thus facilitating later changes.

    The programs of primary interest are the editor, the assembler, and the lister.  The salient features of the editor have already been discussed.  The assembler inputs the names of the manuscripts  composing a program, types out all symbol (tag) definitions and assembles the binary for the program.  The language it processes is an extension of a restriction of LAP4.  Details may be gotten from its documentation, but some of the salient features are that lines may be as long as the editor will handle (something like sixty characters), symbols (tags) are up to four letters and/or digits with at least one letter, comments can be on the same line as anything else, and arbitrarily complex, logically meaningful expressions can be  used for equalities and origins. Checking is much tighter, with almost any situation the assembler cannot correctly handle  giving rise to a message containing the name of the situation, the current core location in the object program and the current line number.  The lister simply produces a listing of a manuscript, with options to select only parts of the text and to form 8" by 11" pages.

    There are three bookkeeping programs included in LUCIFER: MUNG [Manuscript ultra-normalization and generation], the iceberg and DIRGEN [The directory generator].  The latter is used to convert a LAP4 tape to LUCIFER format if that is possible.   All programs and subroutines concerned with the text in a manuscript use information stored in the manuscript's directory, which tells how many blocks are actually occupied by the text and the highest line number for each block.  A manuscript in this form is called normal, but it is much more convenient for experiment-running programs to write text in another form, called abnormal. Also, pursuant to the second law of thermodynamics, the manipulations of the editor  are quite likely to lower a manuscript's packing density, but they are somewhat less likely  to raise its density.  For these two problems, we have a program MUNG, which reads a normal or abnormal text file and writes a normal  text file with the highest possible density.  Finally, all responsibility for the tape's directory of files is vested in the iceberg whose operation is shown in Fig. 6. This program accepts explicit commands to change the directory but of course rejects any commands which would result in the directory becoming potentially invalid.
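
    The idea of a normal manuscript, with a directory giving the highest line number in each block, can be illustrated by a minimal sketch.  Everything concrete here is an assumption: the block capacity, the cost of one unit per line terminator, the rule that a line never spans two blocks, and the function names.  It is not MUNG's actual algorithm.

```python
from bisect import bisect_left


def normalize(blocks, capacity):
    """Repack lines, in order, into as few blocks as possible.

    `blocks` is a list of blocks, each a list of text lines.  Returns
    (packed_blocks, directory), where directory[i] is the highest line
    number (1-based) stored in block i.
    """
    packed, current, used = [], [], 0
    for line in (line for block in blocks for line in block):
        cost = len(line) + 1  # assumed: one extra unit for the line terminator
        if cost > capacity:
            raise ValueError("line does not fit in a block")
        if current and used + cost > capacity:
            packed.append(current)  # block is full; start the next one
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        packed.append(current)

    directory, total = [], 0
    for block in packed:
        total += len(block)
        directory.append(total)
    return packed, directory


def block_of(directory, line_number):
    """Find the block holding a line by searching the directory alone."""
    index = bisect_left(directory, line_number)
    if line_number < 1 or index == len(directory):
        raise IndexError("no such line")
    return index
```

    With such a directory, a line can be located without reading any text blocks, which is consistent with the random-access editing described earlier.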

    In running our experiments, we have found that additional processing should be done on other computers.  With our present hardware and with the particular other computers used (dial-in with a Teletype), the only means of communication is ASCII-coded punched paper tape.  This particular medium is less onerous, however, when it is viewed as backup to the storage of data in the more accessible forms and when one realizes that a very small absolute amount of data is handled this way and quite seldom at that.  These programs are TTYOUT, which punches the contents of a file onto a paper tape with blank leader and trailer, and TTYIN, which inputs a paper tape into a file.  For timing reasons, the latter will not run with PROGOFOP, but requires our own corruption of that program, which is named PROTOCROCK [PDP-8 routine to oversee tape operations and cooperative routines which obtain console-type knowledge (the program of total crockery).].  Since it is only intended for data, it does not handle the full character set, and for timing reasons, it is extremely limited in the length of a tape it will handle.

    The last part of LUCIFER consists of a nonexistent program - a mythical beast.  It may not be a program at all - it may simply be a nonexistent feature of MUNG.  Due to its lack of existence, one cannot say very much about it, and whatever one does say about it may be of indeterminate validity and concreteness.  One can, however, say that this program, which has no name, has the property of merging several files, or possibly arbitrarily or otherwise selected portions thereof, into one file.  Despite the obvious usefulness of this program, it has only been used once, so it has never been written.  Probably the principal reasons for this state of affairs are that the assembler is nearly indifferent to the number of manuscripts composing its object program, and that the lack of real pages makes short manuscripts desirable.
    Extract: The LUCIFER Subroutines
    The LUCIFER Subroutines

    While the LUCIFER programs were being written, it was realized that their id and parts of their preconscious could be unified, generalized and quickened in manuscripts, so the experiment-running programs could easily communicate with LUCIFER and each other.  The logical outcome of this idea is the LUCIFER subroutines.  For the first few months of their existence, they were highly evolutionary, but as experiment-running programs were written around them, they gradually coalesced into a unified, modularly useful whole.

    The structure of the subroutine package is that the subroutines are distributed across five interrelated manuscripts in such a way that at least six subsets of the manuscripts are conceivably useful.  The selected manuscripts are assembled along with one or several other manuscripts in which reside the conscious part of the program and its own special-purpose preconscious and id. These other manuscripts have to set aside certain locations and areas with specified names, and some small amount of initialization is required, but otherwise one need not consider the internal mechanizations of the id of LUCIFER.

    The specific functions represented on the manuscripts are the following.  The first contains basic pushdown list manipulative functions.  Although this makes recursive functions possible, only one program actually does have a recursive section, and that program is not often used.  On the other hand, the pushdown list is seen as a good discipline for the allocation of temporary storage, and the handling of subroutine returns (always a problem on the LINC) is uniformized and considerably simplified.

    The next manuscript contains subroutines to type text displayed in the calling sequence and to input and output numbers in octal.  In our use, it would seem that octal numbers are not objectionable even to people having their first contact with computers, because all numbers handled by any part of LUCIFER are octal, so it is seldom necessary to convert between octal and decimal.

    On the next manuscript there is a subroutine to buffer a line of input, up to a preset maximum length, and process backspaces.  Since almost all input is performed by this subroutine and the number input subroutine on the previous manuscript, it does not take new people very long to learn the interactive characteristics of our programs.  The other subroutine on this manuscript is a subroutine to find a file in the manuscript directory.  This subroutine uses the line input subroutine and does its own prompting, so there is some uniformity gained by this device.  In addition, this subroutine detects names with lead blanks (these have significance as commands instead of names) and also obeys the command of this form which means that the current program should be terminated and GUIDE should be restored.  Thus it is no accident that all our experiment-running programs at least exit in the same fashion.

    A tape file can be treated as a character-oriented serial access input or output medium by use of the subroutines on the next manuscript.  There are two completely independent subroutines here, one for input and one for output.  The input file may be normal or abnormal, but the output file will be abnormal because it is very hard to write a normal file. In fact, only the editor and MUNG write normal text.

    The upper levels of the id and the lower levels of the preconscious of LUCIFER are contained on the previous manuscripts; the last one contains subroutines from the upper levels of the preconscious and the lower levels of the id.  There are subroutines to open and close a file to the output routine and to open a normal file to the input routine of the previous manuscript.  In practice these special-purpose file opening subroutines are used instead of the general purpose one, which they call, on the third manuscript.  All the previous subroutines do character input and output with instructions not intrinsically defined in the assembler.  These instructions could be defined as operate-class instructions, if the Teletype is to be the medium for all character input and output.  However, this final manuscript defines these instructions as funny subroutine calls.  The subroutines, depending on the state of a flag in memory, cause either the Teletype or the current open input and output files to be used.  Thus we have a degree of device independence.  Device independence in itself is often used as a selling point for computers or software, but here, it is precisely what makes the subroutine package useful in the experiment-running programs.  It is what enables the same set of subroutines to be used for communicating with the two media, and this makes the calling sequences tractable as well as cutting core requirements considerably.
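
    The flag-controlled switching between the Teletype and the open files can be sketched as below.  This is only an illustration of the idea: the class, its attribute names, and the use of Python sequences to stand in for the Teletype and the tape files are all assumptions.

```python
class CharIO:
    """Character input and output routed by a single flag."""

    def __init__(self, teletype_in, teletype_out):
        self.use_files = False  # the flag: False = Teletype, True = open files
        self.teletype_in = iter(teletype_in)
        self.teletype_out = teletype_out  # a list standing in for the printer
        self.file_in = iter(())
        self.file_out = []

    def open_files(self, input_text, output):
        """Attach the current input and output files (hypothetical helper)."""
        self.file_in = iter(input_text)
        self.file_out = output

    def get_char(self):
        source = self.file_in if self.use_files else self.teletype_in
        return next(source)

    def put_char(self, ch):
        sink = self.file_out if self.use_files else self.teletype_out
        sink.append(ch)


def type_text(io, text):
    """A caller that is unaware of which medium is in use."""
    for ch in text:
        io.put_char(ch)
```

    The same `type_text` routine writes to either medium, which is the sense in which one set of subroutines serves both.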
    Extract: Critique
    Critique

    The LUCIFER system, being a real system in constant use for four to ten months and having an evolutionary background, is possessed of some shortcomings, and it is felt that some space could be devoted to exposing them.  The most fundamental fault is that, for historical reasons, two different character sets are used: the LINC character set and DEC's ASCII character set.  Fortunately, one of these is only used in outputting to the Teletype.  The full set of subroutines, though composing a comprehensive, easy to use package, requires almost 380 locations in LINC lower memory and two LINC memory quarters for buffers.  Of course, the buffers need only be respected while they are in use, a fact which is exploited in one of our experiment-running programs.

    There is some conspicuous room for improvement in the manipulations one can perform with the LUCIFER programs.  The editor will not append to a manuscript but will only insert before a line.  In practice, this means that all manuscripts have an empty line after the last meaningful one.  Also, the editor will not make a new manuscript, but only edit an old one, so one usually has a manuscript, usually named SEED1, whose contents are exactly one empty line, which one MUNGs into a file before editing new text into it.  The editor has two further properties which were included to increase its safety but which make its interactions take longer than would otherwise be necessary.  First, it will not accept any commands pertaining to a line of text without first opening the line and typing out its entire contents.  Also, whenever a change is made to a line, both the block containing that line and the manuscript's directory are written out onto the tape. Finally, editing would be greatly facilitated if it were possible to alter a line without retyping it in its entirety.

    Other criticisms of the LUCIFER programs are more general.  The editor, the iceberg, and the assembler are all incapable of handling the null case (i.e., empty manuscript, empty file directory and empty program).  The editor and the iceberg both have safeguards built in so that they will not make an empty manuscript or directory from a nonempty one, and the action of the assembler on an empty program is harmless.  There is no facility in LUCIFER for merging or dividing manuscripts.  So far, LUCIFER is used mainly for preparing programs, and the assembler accepts a program spread out over many manuscripts, so the lack has not been objectionable enough to be cured.  In the lifetimes of most data files, they are usually not changed enough that such a facility would be a great convenience.  The iceberg is very crude - it does little more than accept a human-oriented command language for a minimal set of atomic operations on the file directory and perform consistency checking between those commands and the directory.  The intent was (and still is) to have it automatically invoked to create and extend output files as necessary, and to have it automatically handle the case where a file cannot be expanded without running into another file, but where there are free blocks on the tape.  One final criticism which applies to all the LUCIFER programs is that their command languages are very tight, in that any would-be command which is not of exactly the correct format is rejected.  For instance, in contrast to assembly language, blanks are forbidden wherever they are not mandatory.
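
    The kind of consistency checking attributed to the iceberg can be illustrated with a minimal sketch.  The representation of a file as a first block and a length, the class and method names, and the error behaviour are all assumptions; the paper does not give the directory format here.

```python
class TapeDirectory:
    """Toy file directory that rejects commands leaving it inconsistent."""

    def __init__(self, tape_blocks):
        self.tape_blocks = tape_blocks
        self.files = {}  # name -> (first_block, length_in_blocks)

    def _fits(self, name, first, length):
        """True if the extent lies on the tape and overlaps no other file."""
        if length < 1 or first < 0 or first + length > self.tape_blocks:
            return False
        for other, (start, size) in self.files.items():
            if other != name and first < start + size and start < first + length:
                return False
        return True

    def add(self, name, first, length):
        if name in self.files or not self._fits(name, first, length):
            raise ValueError("command rejected")
        self.files[name] = (first, length)

    def expand(self, name, extra):
        """Grow a file in place; rejected if it would run into another file."""
        first, length = self.files[name]
        if extra < 1 or not self._fits(name, first, length + extra):
            raise ValueError("command rejected")
        self.files[name] = (first, length + extra)
```

    A rejected `expand` corresponds to the case the text describes, where a file cannot grow without running into its neighbour even though free blocks exist elsewhere on the tape.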

    These criticisms are presented to show the other side of LUCIFER.  Without this section, we would have just been extolling the favorable aspects of LUCIFER and ignoring the basic fact that ideal systems exist only before their logical consequences are attained and that as soon as a system is realized (in the form of running programs in production use, in this case), the logical consequences are hard to ignore.  In considering the critique, one should bear in mind not only that it is offered by the author of LUCIFER, but that any sufficiently severe faults would be (and have been) corrected in the evolution of the system.