ACT(ID:5368/act028)

Autocode Coding system 


Continuation of the FLEXIMATIC research by Holt and Turanski at the University of Pennsylvania's Moore School. Designed to be a portable solution for the FIELDATA computers used by the Army. In the 1961 article, Holt is scathing about UNICODE and UNCOL and their approach; the ACT project was a reaction against both.


People:
Related languages:
FLEXIMATIC => ACT   Evolution of
GPX => ACT   Evolution of
UNCOL => ACT   Reaction to
ACT => ACT II   Evolution of
ACT => APL   Co-development
ACT => Mem-theory   Evolution of

References:
  • Holt, Anatol W. "General purpose programming systems" Extract: Generic programming systems
    In these remarks, I do not propose to describe an automatic coding system. My aim is, rather, to present an approach to automatic coding or, if you prefer, to present a motivation which has guided us in the development of the three actual systems for Remington Rand computers. The earliest of these three systems, developed for the UNIVAC I, has been known by the name Generalized Programming. (This system was recently renamed FLEXIMATIC by the Remington Rand Sales Office.) In operation for approximately two years, GP is currently employed at a considerable number of commercial installations. The second system, Generalized Programming Extended (GPX), is a new version of FLEXIMATIC for the UNIVAC II; it has considerably increased powers in comparison with its ancestor.

    Currently, we are building the third system for the LARC computer. During years of development, the aims we wish to achieve have only gradually become clarified. Therefore, the system now in preparation for the LARC represents the fullest realization so far of these aims. The ensuing discussion will use the term "the system" to refer to an embodiment of the aims to be described.

    In point of broad classification, the automatic programming system developed by the UNIVAC Applications Research Center is a compiler. This means that it shares two important characteristics with other systems similarly classified. First, complete solution of a problem with the aid of the system involves two computer steps: (1) translation of an encoding of the problem (in a special form associated with the system) into a computer-coded routine, and (2) actual problem computation. The second characteristic common to all compilers is that during the translating process, reference is made to a collection of packages of information, usually called library subroutines, which contribute in some way to the final result.

    The system differs from other compilers, however, in that it aims to provide the programmer with a set of services equally applicable to the solution of any problem--services, in other words, which are claimed to be altogether "general purpose." Therefore, the system contains no special design features which make it better adapted to one area of computation than to another.

    This aim, as stated, raises a peculiar question: A general purpose computer is, itself, supposed to provide a code which is equally well adapted to all areas of computation. To claim that it is profitable to build a programming system which contributes equally to all problems is, of course, to claim that some facilities which have "general purpose" standing are not realized in the hardware of a general purpose computer. We believe that such unrealized facilities exist and that they lay the basis for a powerful programming system. The facilities in question are an addition to the computer and, in no sense, a subtraction. Hence, a program in the system code may possibly be a one-to-one image of a computer-coded program.

    The general purpose facilities I have in mind fall into two broad categories, the first of which--naming aids used in writing computer instructions--has already been widely recognized in the computer field. Naming aids include symbolic addressing, convenient abbreviations to represent computer instructions, special conveniences in the naming of constants, etc. Such naming aids, while interesting in their own right, have already been exploited to a greater or lesser extent in many existing assembly routines. It is apparent that these conveniences are equally applicable to all programs in which computer instructions must be used.

    The second facility for which I claim general purpose standing is the ability to use mnemonic and abbreviated notations to represent computations of some given type. In part, this means the ability to use, in addition to the elementary computer instructions, special purpose instructions which are helpful in a given type of computation. By a few examples, I hope to show you the range of special notations I have in mind.

    Suppose, for instance, that I have frequent occasion for evaluating sines of angles and that the computer does not contain a special instruction for sine evaluation. In addition to my usual command vocabulary, which the computer interprets directly, it would be a considerable convenience to write a special command which represents sine computation. Perhaps in different problems I require sines with different degrees of accuracy. In that case, my notation for sine computation may also mention a desired accuracy.

    In any event, the special notations which represent the sine computation may be viewed as a special encoding of the sine function which must be translated into computer code before the program may function. Such translation is normally accomplished by reference to a so-called library subroutine, introduced by a compiler at the point at which the special notations were originally written. Sine computation, with or without parameters, exemplifies a large family of cases in which it seems convenient to introduce special notations to represent functions which the computer cannot execute directly.
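    As a modern illustration of the sine example above (not from the original paper), the following Python sketch treats the desired accuracy as a parameter supplied with the special notation, which the "library subroutine" uses to decide how far to carry the computation. All names are hypothetical.

```python
import math

# Illustrative stand-in for a parameterized sine library subroutine:
# the requested accuracy controls where the Taylor series is truncated.
def sine(x: float, accuracy: float = 1e-6) -> float:
    """Taylor series for sin(x), truncated once a term drops below `accuracy`."""
    term, total, n = x, 0.0, 0
    while abs(term) > accuracy:
        total += term
        n += 1
        term *= -x * x / ((2 * n) * (2 * n + 1))
    return total

# A looser accuracy does less work; a tighter one carries more terms.
print(abs(sine(1.0, 1e-8) - math.sin(1.0)) < 1e-7)
```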

    Another example of special notations not usually considered to have much in common with a sine routine call is the use of algebraic notation to represent the evaluation of algebraic expressions. A little reflection shows, however, that both of these cases have important properties in common. In fact, they are both special notations used as part of coding to represent certain types of computations. In the case of sine with specifiable accuracy, we "code" a sine routine by mentioning the word sine and a chosen accuracy number. In the case of algebra, we "code" an algebraic evaluation by writing the appropriate expression with algebraic notation. In both cases, these special notations must combine with information stored elsewhere before they become coding which is intelligible to the computer.

    In the case of special notations for sine, the "other information" is usually thought of as a library subroutine. A compiler then serves to interpret the special notations by finding the library subroutine and, perhaps, by inserting a specified accuracy value into the library coding. In the case of algebra, it is common to think of the "other information" as an algebraic translator which operates as an entity by itself. It is my contention that either special notations for algebra or special notations for sine may both be handled by a general purpose compiler which has broad ability to combine library information with special notations employed in the original coding.

    Still other types of special notations which may offer great conveniences in the writing of code for special problems may also be handled by a general purpose compiler. For example, suppose that I have a file of information on which several different types of processings are to be accomplished. This file contains complex items, each of which is subdivided into many fields containing quantities that may be mnemonically named. It would now be a great convenience if, in my coding, I could refer to the quantities contained in this item by mnemonic names which have been agreed upon as appropriate to that file of information.

    A general purpose compiler can assist the translation of such special mnemonics with appropriate packages of library information. A package of library information is combined with special notations employed in the original coding to produce coding in another form. The combining agent is a general purpose compiler. The library packages discussed here have wide use and diversity of structure; so-called library subroutines, on the other hand, are only one example of the library packages which a general purpose compiler can use.

    In the development of most compiling systems, the designers begin with a special range of problems for which they intend their system to be useful. They invent special notations which they consider naturally adapted to the area of problems in question, and then they construct a translating mechanism with just enough generality to cope with these notations. While a given set of special notations for a given area of problems may, indeed, be very useful and important, the ability to add arbitrary, new notations is even more important than any given set of special notations.

    The truth of this may be readily appreciated if we realize that no one set of special notations can be optimally adapted to all ranges of problems. No single computer would be optimally adapted to all types of computation. Furthermore, we cannot even hope, once and for all, to subdivide the entire universe of problems into fixed areas, each with a well-adapted notation. For example, if I have devised one notation satisfactory for business problems and another notation satisfactory for algebra, I am still unable to write convenient coding for problems which lie squarely between these two areas -- namely, those problems which involve a lot of data handling and algebraic computation. Even if problems which cross the boundary lines never arose, the inability to add new notations would, nevertheless, be hampering. Suppose, for example, that a notation has been developed to handle all algebraic problems with equal facility.

    If, now, in a certain set of problems, I use frequently either a particular algebraic function or a particular succession of steps, I will find that the general algebraic language at my command causes me to write much more than the particular set of problems requires. In short, as soon as the range narrows from all of algebra to a special sub-domain of algebra, the general algebraic notation ceases to be well-adapted. In summary, it is possible to construct a general purpose compiler which (a) provides special conveniences in the writing of computer code (naming aids) and (b) provides the ability to add new notations to my repertoire for instruction writing whenever such notations seem profitable for particular ranges of problems. Of course, profit is a relative thing. In exchange for the advantages bought by having certain new notations available, I must labor to add to the library packages which make translation of these notations possible. Consequently, this central feature of the general purpose compiler is truly meaningful only if the labor required to obtain the use of new notations is comparatively small.

    The theoretical foundations for such compiling systems are now being developed at Remington Rand. Early examples, such as GP for UNIVAC I, demonstrate the practical utility of a system designed from this point of view.

          in [ACM] CACM 1(05) (May 1958)
  • Holt, A. W. and Turanski, W. J. Final Report of Common Programming Language Task, Part II, "Automatic Code Translation System", USASRDL, Fort Monmouth, N.J., Report no. AD59UR1, July 31, 1959
  • Holt, A. W.; Turanski, W. J.; and McGinnity, D. L. "Common programming language task, II. Automatic code translation system (ACT)." Univ. of Pennsylvania, Report No. AD60UR1, 1959
  • Gorn, S., and E. Parker. "Common Programming Language Task, Final Report No. AD60UR1, Part I." U.S. Army Signal Research and Development Laboratories and University of Pennsylvania, the Moore School of Electrical Engineering, the Institute for Cooperative Research. June 30, 1960
  • Holt, A. W. and Turanski, W. J. "Man-to-Machine Communication and Automatic Code Translation" Abstract:

    The ACT system is a programmed adjunct to a general purpose computing machine
    (currently considered for the MOBIDIC) whose purpose is to facilitate the initial
    encoding and subsequent application of specific "code-to-code translation"
    procedures. The phrase "code-to-code translation" is primarily meant to cover
    the conversion of problem oriented pseudo-codes into machine code equivalents.
    The ACT system provides for the "housing" of many distinct translation
    procedures -- carrying algebraic, data-processing, simulation languages, etc.,
    into any of a variety of computer codes -- within the bounds of a single
    controlling system.

    The programmed components of the ACT system include:


    1. An allocation interpreter -- AI
    2. A "core library" of basic translation functions
    3. A general translation library whose content varies with use of the system.

    The allocation interpreter permits the formation of large translation
    programs out of large (and small) pre-stored sub-programs. Without greatly
    impeding computation speed, AI determines internal and external storage
    allocations for all space-taking portions of information as part of problem
    computation. This facility is shown to be essential to the construction of ACT,
    and perhaps widely useful in other types of programs.


    The core library contains translation functions which play a fundamental role
    in a wide range of code-to-code translations. Included are many functions which
    usually play an important part in machine assembly programs. Other functions
    provide for general library handling, code reordering, etc.

    Also associated with ACT is a writing convention called "canonical form"
    which governs the presentation of data (i.e., programs in source code submitted
    for translation). Canonical form, while flexible enough to permit highly diverse
    problem codes, still standardizes all those signals which the basic ACT
    components decode.

    The ideas presented herein were developed at the University of Pennsylvania
    under contract to the U.S. Army Signal Corps in connection with the FIELDATA
    family of data-processing equipment. Although reference is made to the FIELDATA
    family in the presentation below, the ideas should be applicable to other
    families of equipment. Because of the prevailing tendency towards modular design
    in computing systems, such families are rapidly becoming more numerous.


    Extract: General Considerations
    General Considerations
    The Problem
    The ACT system is designed to reduce the man-to-machine communication problem in operating the Army FIELDATA system. The fundamental assumptions with regard to this communication problem are:

    that the FIELDATA system employs a variety of different digital computers;
    that the computational problems to be solved with these machines belong to many different "problem areas" -- e.g., trajectory computation, estimation of statistical parameters, information retrieval, language translation, etc.;
    that the exigencies of time and circumstance may make it necessary to run any of the intended range of problems on almost any of the intended range of machines;
    that the overwhelming majority of Army personnel concerned with these problems will not be computer programming experts for any (let alone all) of the intended computers; and
    that the efficiency of the entire FIELDATA system hinges primarily on the speed of problem solution -- i.e., the total "lapsed time" from problem statement till solution delivery. Therefore, the speed with which a problem can be correctly encoded for computer solution may matter a good deal more than the efficiency of the machine program with which the computation is finally performed.
    Existing Techniques
    The man-to-machine communication problem has been widely recognized and seriously attacked in many of its special instances. Most of the present day "automatic programming systems" are the result of manufacturers' attempts to narrow the gap between "the problem", which exists in the minds of technical specialists, and "the program" which "causes" the computer to compute a solution. The invention and construction of an automatic programming system normally involves two major steps:

    The invention of a special problem code (often referred to as a "pseudo-code") which:
    is compatible with already established habits of expression within some technical specialty, and
    is sufficiently formal and complete to permit automatic conversion -- or "translation" -- into a computer-coded program.
    The construction of a special translating routine which accepts as input data problems represented in problem code and produces as output corresponding machine programs.
    Most commonly, the translating routine is designed to accomplish its entire task before computer solution of the stated problem begins. Thus, bringing a problem to solution with the help of an automatic programming system is typically a two-stage process -- translation followed by computation. (See Figure 1.)

    Automatic programming systems which function as in Figure 1 depend upon three codes: S, V, T. They are usually produced by manufacturers as adjuncts to particular computers.

    Thus, "Computer V" and "Computer T" of Figure 1 are almost always one and the same. If, for example, the manufacturer is putting a "scientific machine" on the market, he will usually design an algebraic-mathematical pseudo- code to go with it, and he will prepare the translating routine to produce programs for itself on that computer.

    Past experience with the building of automatic programming systems has brought to light several facts which have a strong bearing on the FIELDATA man-to-machine communication problem:


    To produce an automatic programming system (S, V, T) normally requires major programming effort -- measured in tens of man-years.
    Such systems are usually very difficult to modify in response to changes in the specifications for S, V or T. This difficulty arises because the translating routine, within which the specifications for S, V, and T are "frozen", is usually a large computer program from which it is impossible to 'factor' the components, S, V, T.
    As a consequence of the immediately preceding point, it is further true that a translating routine which has been programmed for the triple (S1, V1, T1), does not give a substantial head start for producing the routine for (S1, V1, T2) or for (S2, V1, T1).
    Extract: FIELDATA Requirements
    FIELDATA Requirements
    The requirements for FIELDATA computing, as set forth at the beginning of this introduction, make it clear that:
    1. The operation of the FIELDATA computing facility will require a large variety of source codes available for easy use in a rich variety of technical specialties.
    2. A given source code will have to be translatable into any one of several target codes.
    Some more detailed discussion of both points follows.

    Problem Code Flexibility
    It is a well known fact that each technical field has associated technical "jargon" -- special vocabulary as well as special forms of expression -- which belongs to it and it alone. This specialization of language resembles the specialization of computers for particular purposes. In general, special computer problem codes for statisticians, physicists, psychologists, business administrators, operations researchers, etc., have not been developed because the art of automatic programming is young and the building of new systems very expensive. If the cost of translating routines were not an issue, it would be desirable to provide each technical field with coding conventions which most nearly suit its established habits of expression.

    Quite apart from making provisions for presently fixed habits of expression, the creator of a translating routine must also note that habits of expression change with time and circumstance. As the constellations of problems to be solved undergo change, so do the related language habits. Old words are put to new uses; new words are invented; abbreviations are created to refer to oft-repeated combinations; previously disjoint sets of notations must be combined, etc. Thus, expression forms change in response to changing needs. In many important cases the expressions lead to computable problems. The speed with which answers appear depends heavily on the amount and kind of hand "recoding" (or "encoding") of the original problem statement before the computer can "take over" and ultimately execute the proper sequence of computer steps to achieve a problem solution.

    Computers -- Variety and Compatibility

    No matter how similar two computers may seem, they always turn out to be distressingly different in relation to some problems. Consider, for example, two computers, C and 2C, which differ from each other in no respects other than that 2C has twice the internal storage capacity of C. For all those problems easily accomplished by a single run on C, Computers C and 2C are fully equivalent. For those problems which involve more than one run on C -- or perhaps the use of overlays -- Computer 2C is equivalent to Computer C only in the sense that those originally programmed for C will also run on 2C. Thus, for such problems, a major task of conversion from one computer to another may be encountered if

    the problem was originally programmed for 2C, or
    the problem was originally programmed for C but must be made to run efficiently on 2C.
    So-called "compatibility" between machines (in the sense of very similar or identical instruction codes) does not guarantee ease of program conversion from one computer to another. Only small problems will convert readily, while big problems will offer difficulties not much less formidable than when "incompatible" machines are considered. Although the FIELDATA computers are being designed with mutual compatibility in mind, significant differences between one translating routine (S, V, T1) and another (S, V, T2) must be expected.

    The need for many source codes and the inclusion of many machines (and many configurations for each machine) in the FIELDATA system make the normal automatic programming approach economically unfeasible. To go from s source codes to m machines patently requires s·m translation routines -- a prohibitively large number when the effort for each such translating routine is considered. Even worse, if one had gone to the labor of constructing these s·m translating routines, one would still be faced with an unmanageable chore in keeping them up-to-date in response to changing requirements. (This is almost exactly the problem to which a proposal named UNCOL (sponsored by SHARE and described in ACM Communications, Vol. I, Nos. 8, 9) is addressed. In our opinion the UNCOL proposal, while aimed in a valuable direction, is practically unworkable. For more details, see the section on General Translation Library below, also a letter from George H. Mealy, Switching Systems Dept., Bell Tel. Lab., Inc., to the members of the UNCOL committee, dated Feb. 3, 1959.)
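    The arithmetic behind this objection is easy to check. The sketch below (modern Python, purely editorial) contrasts the s·m count of direct translators with the s+m count obtained by routing every translation through one shared intermediate code, as the UNCOL proposal envisaged; the function names are hypothetical.

```python
# Count of translating routines needed under two strategies.
def direct_translators(s: int, m: int) -> int:
    """One hand-built translator per (source code, machine) pair: s·m."""
    return s * m

def via_intermediate(s: int, m: int) -> int:
    """s front ends into a shared intermediate code, plus m back ends: s+m."""
    return s + m

# The gap widens quickly as codes and machines multiply.
for s, m in [(5, 4), (20, 10)]:
    print(s, m, direct_translators(s, m), via_intermediate(s, m))
```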
    Extract: Code-to-Code Translation
    Code-to-Code Translation
    The study of many problem oriented source codes and their translation into computer code reveals that various translation processes have many important common elements. Often commonness between translation processes can be found by examining their respective constituent parts. For example, the translation of S1 to T1 -- briefly (S1, T1) -- and the translation (S2, T2) may both employ the same assembly system for the terminal portion of the translation. Sorting is a frequent common sub-process; at a finer level of analysis, table look-up, list forming, subdividing a "sentence" into "words", economizing storage locations, etc., have frequent occurrence.

    Another kind of commonness frequently found between two translation processes is the employment of a common control sequence with differences in regard to what is controlled. This is illustrated by generator systems (such as FLOWMATIC) in which the individual generators are properly regarded as sub- translators subject to an over-all controlling scheme. Two quite different pseudo-codes might be handled by the same controlling scheme with differences only in the library of generators being controlled.

    These examples indicate that code-to-code translation is a coherent area of computation for which programming aids may be prepared. These programming aids (whose nature and form will be discussed below) will:

    Reduce by large factors the effort required to produce an operating program for the translation (S, V, T).
    Similarly reduce the effort required to modify existing translation programs in response to changed specification for S, V, or T.
    The ACT system is designed to be a programming aid such as discussed above. Once constructed, it should have a far-reaching effect on the manner and effectiveness of FIELDATA utilization. In working on some particular class of problems -- or even just one large problem -- programmers will find it convenient to refine their tools of program expression as they go -- by modifying or adding to some previously available set of conventions. Thus code or "language" development will go hand in hand with problem solution, just as is normally the case in technical or scientific work without computers. Furthermore, the ACT system should greatly facilitate the re-translation of a particular source code, originally translated to one machine but now required for another.

    Extract: The ACT System
    The ACT System
    The ACT System is a collection of programmed devices which loosely fit together to help in the implementation of specific translations. (Some of these components have importance and usefulness above and beyond their function within ACT. This is exactly comparable to parts of a computer (such as memory units or tape transports) which can be integrated into many different systems.) Also associated with the ACT System are two sets of writing conventions; one set governing the presentation of translation procedures -- to be called "ACT translation procedures" -- and the other to govern the presentation of data (i.e., source-coded programs submitted for translation.) This is analogous to computer systems, for which data and programming conventions must always be specified. A complete description of ACT will therefore include:

    A description of major components.
    Writing Conventions governing ACT translation procedures.
    Writing Conventions governing programs submitted for translation.
    The programmed components of the ACT system include:

    An allocation interpreter -- AI
    A "core library" of basic translation functions
    A general translation library whose content varies with use of the system.
    Extract: Allocation Interpreter
    Allocation Interpreter
    The allocation interpreter permits the formation of large translation programs out of large (and small) pre-stored sub-programs. Unlike assembly programs, it accomplishes its function dynamically -- i.e., during the actual running of translation programs. This mode of operation is dictated by two important characteristics of translation programs:


    frequent use of data-dependent storage
    the possibility of analyzing such programs into parts, some of which are large sub-programs (unlike mathematical subroutines).
    Although AI participates at run-time, it does not cause any substantial reduction in computation speed. Almost all program instructions are performed by the actual computer without interpretive interference. Only at infrequent intervals in program running time, when there is a change in requirements for internal or external storage space, is there a jump out to AI for bookkeeping and allocation activities, followed by a return of control to the running program. In this respect AI behaves in a manner analogous to that of contingency routines.

    The special jump-return instructions which interrupt the main computation for AI intervention might be conveniently viewed as new macro-instructions, added to the previously available stock of computer commands. These macro-instructions are entirely concerned with space allocation, and programs written with these instructions in mind have new formal properties. Correspondingly, a computer-plus-AI on which such programs are intended to run may be conveniently regarded as an "Extended Machine" -- EM. Figure 2 is based on this conception. The central cross represents the extension.

    Associated with EM is EM-code in which programs for EM must be stored. This code differs from ordinary machine code in several respects.

    The program is formally subdivided into program "phases", each phase headed by a phase description. The phase description lists "sequences" or "segments" of information belonging to the phase and requiring space (internal or external) when the phase operates.
    The phases of a program relate to each other through special calls (interpreted by AI). When a call for a new phase is encountered, the calling phase is suspended and the called phase initiated. When the called phase terminates it may either be held for re-use or dropped, depending on the manner of call.
    Aside from special AI macro instructions having to do with space allocation (including new phase call) EM code contains ordinary computer instructions with sequence-and-segment-relative addresses. These addresses are converted by a program loader into fixed form when the phase is first encountered during program operation. If a phase is held for reuse and then re-activated, it is returned to memory to the same set of locations to which it was originally assigned. Between uses of the same phase AI will have taken care not to assign any newly developed information which this phase requires to locations causing conflict with old information already belonging to the phase.
    Although phases of EM-coded programs will normally be functional units (corresponding to a box on some flow-chart description), the converse need not be expected. The only use which is made by AI of the knowledge that a program is subdivided into phases with particular interconnections is to temporarily dump some information in favor of other information which is now required and which could not otherwise be accommodated. Therefore functional units which, in their coded form, occupy very little space in the various computer media are not worth treating as separate phases. The bookkeeping involved in their handling is not worth the space return which they can bring.

    The power of EM may be said to derive from the following fact. In the case of EM programs the total space requirements of the program (for external and internal space) have been broken down into space requirements for individual program phases (where different phases may overlap by requiring the same information). At any one moment in computation time only one phase is operating and hence only its space requirements must be satisfied. Any parts of other phases which do not overlap with the active phase can be dumped and later restored if necessary. But the greater the amount of dumping and later restoring the slower the program operates. If, therefore, there is enough memory space to permit the assignment of phases side-by-side the program will run correspondingly faster. In this way EM is able automatically to trade time for space depending on the total available storage of the computer and the sizes of various data lists which may have grown in course of computation.
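    As a rough modern analogy (not the original EM code), the toy model below shows the trade of time for space: phases are loaded on call, and when memory cannot hold them side by side, inactive phases are dumped and must later be reloaded, which costs time. Everything here -- phase names, sizes, the eviction rule -- is hypothetical.

```python
# Toy model of EM phase allocation. Assumes memory >= the size of any
# single phase. The number of dumps stands in for the time cost paid
# when phases cannot be kept resident side by side.
def run_phases(calls, sizes, memory):
    resident = {}   # phase name -> space it occupies while resident
    dumps = 0
    for phase in calls:
        if phase not in resident:
            need = sizes[phase]
            # Evict inactive phases (oldest first) until the call fits.
            while sum(resident.values()) + need > memory:
                victim = next(iter(resident))
                del resident[victim]
                dumps += 1
            resident[phase] = need
        # (the phase "runs" here)
    return dumps

calls = ["A", "B", "A", "C", "A"]
sizes = {"A": 60, "B": 50, "C": 30}
print(run_phases(calls, sizes, memory=100))  # tight memory forces dumping
print(run_phases(calls, sizes, memory=200))  # ample memory: phases stay resident
```

With more memory, the same call sequence completes with no dumping at all, which is exactly the automatic time-for-space trade the extract describes.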
    Extract: The Core Library
    The Core Library
    The core library consists of a collection of sub-programs -- called translation procedures -- which have near-universal importance in translating pseudocodes into machine codes. (Many of these functions derive from a single fact about the relation of source to target: namely that the information in the source code is packaged by time of generation, while in the target code it must be packaged by time of use).

    The writing conventions -- called "Canonical Form", abbreviated "CF" -- governing source-coded programs submitted for translation are entirely related to the core library. All of the signals for which CF provides are decoded by one or several procedures in the core library. Conversely, all those translation procedures which only deal with signals specified by CF belong in the core library. The following is a partial list of functions fulfilled by the core library procedures.

    Coding Item Handling --
    Each CF item of pseudo-code (such as a FORTRAN formula, an assembly instruction, a FLOWMATIC sentence, etc.) has a formal structure which provides for standard item features such as:

    an item identification number
    a symbolic item name
    item type indications.
    In addition, after some preliminary editing, the body of the item may be recognized to consist of a sequence of symbols. The core library provides CF item handling functions which allow specific translation processes to gain access to constituent portions of items by means of "item dissecting" commands. Similarly, if a translation procedure produces CF coding items as output, they may be assembled from their constituents by use of "item assembling" commands.
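    Assuming a CF item header carries the three standard features listed above, the "item dissecting" and "item assembling" commands might behave as in this sketch. The field names, the dictionary representation, and the rule that a body dissects at whitespace are all invented for illustration; the paper does not specify them.

```python
# Hypothetical sketch of CF item handling: dissect an item into its standard
# features plus its constituent symbols, and assemble one back up.

def dissect(item):
    """Split a CF item into identification number, name, type, and symbols."""
    return item["id"], item["name"], item["type"], item["body"].split()

def assemble(ident, name, kind, symbols):
    """Build a CF coding item from its constituents."""
    return {"id": ident, "name": name, "type": kind, "body": " ".join(symbols)}

item = {"id": 542, "name": "LOOP", "type": "instruction", "body": "CLA B 542"}
ident, name, kind, symbols = dissect(item)
assert symbols == ["CLA", "B", "542"]
assert assemble(ident, name, kind, symbols) == item   # round trip
```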

    Patch Interpretation --
    CF coding items may be generated (by programmer or generating routine) in some order wholly different from the order in which they will be next required. Item rearrangement may be coded into the original item sequence by use of the item identification numbers as well as by use of special "patch items". Core library functions will cause item re-ordering, thus interpreting the "patch" intentions expressed in the original item sequence.
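    The re-ordering this implies can be sketched minimally, assuming each item carries a numeric identification number that fixes its intended order; the real patch mechanism, with explicit "patch items", is richer than this.

```python
# Minimal sketch of patch interpretation: items generated out of order carry
# identification numbers, and the core library re-orders them by number so
# the delivered sequence matches the intended one.

def interpret_patches(items):
    """items: (identification number, text) pairs in generation order."""
    return [text for _, text in sorted(items)]

# Items 10 and 30 were written first; item 20 was generated afterwards
# as a patch to be spliced between them.
stream = [(10, "read card"), (30, "print line"), (20, "convert field")]
assert interpret_patches(stream) == ["read card", "convert field", "print line"]
```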
    Symbol Interpretation --
    By "symbol" la meant the smallest meaningful constituent of a coding item. Thus, the computer instruction 'CLA B 542' is composed of three symbols. Symbols may be of various types from the point of view of what interpretation they require during translation. Some symbols, for example, are absolute and require no interpretation; other symbols are 'dummy' and require replacement by specified expressions yet other symbols refer to named coding items -- i.e., are "addresses" -- and will require address interpretation. The core library contains a variety of procedures to effect symbol interpretation.
    Type Interpretation --
    CF coding items may contain marks to indicate their specific type. Thus, for example, a FORTRAN DO-statement and a dimension statement might be distinguished as to type; or a statement to be interpreted by FORTRAN as opposed to a statement to be interpreted by SAP may be distinguished by type. In general, type indications always mean that the items thus categorized are to be respectively delivered to different translation procedures for treatment. Core procedures are used to deliver CF coded items to appropriate translating routines in proper priority sequence, according to type indication.
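    Such type-directed delivery might look like this sketch, with invented type names and a priority order chosen purely for the example.

```python
# Sketch of type interpretation: each item's type mark selects the
# translation procedure that receives it, and items are delivered in a
# fixed priority sequence of types.

def deliver(items, procedures, priority):
    """Route each item to its procedure, higher-priority types first."""
    out = []
    for kind in priority:
        for item in items:
            if item["type"] == kind:
                out.append(procedures[kind](item))
    return out

procedures = {
    "dimension": lambda it: ("reserve", it["body"]),
    "do":        lambda it: ("loop", it["body"]),
}
items = [{"type": "do", "body": "I=1,10"}, {"type": "dimension", "body": "A(10)"}]
# Say dimension statements must be treated before DO-statements.
result = deliver(items, procedures, priority=["dimension", "do"])
assert result == [("reserve", "A(10)"), ("loop", "I=1,10")]
```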
    Call Interpretation --
    One of the frequent mechanisms employed in code-to-code translation is the adjoining of one or more batches of coding items, taken from a library of such "batches", to the coding items which already belong to the source coded routine being interpreted. A good example of this is the selection of subroutines from a library. After the subroutines have been picked up and (with some adjustments) adjoined to the routine undergoing translation, this latter routine has been 'expanded' by the inclusion of new coding items. Special items in the source coded routine, called "call items", tell what particular batches are to be selected from the library for inclusion.
    In the case of subroutine selection, call interpretation may be carried out through a hierarchy of levels. The subroutines selected for adjoining to the calling routine may themselves contain further call items requiring (for their interpretation) the selection of yet other library routines for inclusion. Thus the original source coded routine contains "first level" calls; the subroutines thus selected may contain "second level" calls, and so forth, for nth level calls. ("Univac Generalized Programming", Remington Rand Univac, RRU 21, 1957, New York, N.Y.)

    Call interpretation may be usefully applied to a variety of cases which have nothing to do with subroutine selection. For example, a routine coded in FLOWMATIC might be said to "call" on file descriptions from a library. The interpretation of such calls has much in common with sub-routine selection. To go one step further: if there were a number of different files which all involved the same item description, then there might even be "second level" calls -- the routines calling on file descriptions, and the file descriptions calling on item descriptions. Core procedures are used to carry out hierarchic fetching and adjusting of library information.
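    Hierarchic call interpretation amounts to a recursive expansion over the library. This sketch uses an invented "CALL" item form and toy batch contents; the point is only that second-level calls fall out of applying the same interpretation to whatever the library delivers.

```python
# Sketch of hierarchic call interpretation: "call items" name batches in a
# library; selected batches may themselves contain further call items, which
# are expanded in turn (first level, second level, ... nth level).

def expand(items, library):
    out = []
    for item in items:
        if item.startswith("CALL "):
            name = item[len("CALL "):]
            out.extend(expand(library[name], library))  # next-level calls
        else:
            out.append(item)
    return out

library = {
    "SQRT": ["load x", "CALL ABS", "iterate"],  # contains a second-level call
    "ABS":  ["strip sign"],
}
routine = ["read x", "CALL SQRT", "print"]      # first-level call
assert expand(routine, library) == ["read x", "load x", "strip sign",
                                    "iterate", "print"]
```

    The same recursion covers the FLOWMATIC case above: routines calling on file descriptions, and file descriptions calling on item descriptions.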
    Extract: The General Translation Library
    The General Translation Library
    The contents of the general translation library are governed by the particular source codes and target codes for which translation procedures are desired. When stored in the translation library, these procedures are represented in EM code -- hence contain special macroinstructions having to do with allocation. They also contain macro-instructions referring to translation procedures stored in the core library. In general the time at which various core functions are brought into play in course of a translation is dependent upon features of source and target codes beyond those which are governed by canonical form. Hence the highest level control of a translation procedure always resides in a program belonging to the general translation library. With ACT, as presently conceived, there must be a distinct high-level controlling program for each source-to-target pair (S, T) desired. Of course these controlling programs will, in general, be very short, the bulk of the work being encoded in sub-programs belonging to general or core libraries. (It is projected that a special component of ACT can be prepared, to be called the ACT "translation control" program. Highest level control of translation would then always be vested in this program, and the total number of individual translation procedures required to handle a large class of (S, T) would be greatly reduced. Further study is required to test the feasibility of this.)

    One class of translation procedures expected in the library deserves special comment. In course of many code-to-code translations one reaches a penultimate code form which may be roughly described as "assembly code" and a final step in translation called "assembly". There is considerable variation between present-day translation systems (such as FORTRAN, FLOWMATIC, IT, etc.) in regard to the complexity of this final assembly phase. In the FLOWMATIC compiler for example, only the most rudimentary assembly phase, consisting of nothing more than relative address interpretation, is employed, while FORTRAN translation produces SAP assembly code with symbolic addressing, library reference ability, and many other features. ("X-1 Assembly System", ADP, Remington Rand Univac, Philadelphia, Pa.)

    In the development of ACT a special analysis was undertaken of the terminal translation steps to determine whether a widely applicable 'end game' could be prepared for inclusion in many different translation procedures. This analysis resulted in a set of specifications for assembly -- system and code -- which includes many features normally included in such systems but also some noteworthy additions. (Description of these lies beyond the scope of this paper.) Assembly procedures conforming to these specifications make extensive use of the core functions previously described as well as some additional ones. Since, however, such procedures always depend, in part, on the target computer, they are themselves represented in the general translation library.

    The success of the ACT system conception hinges critically on the feasibility of general translation library building, and we will now try to show that past experience with code-to-code translation justifies the expectations on which the over-all design of ACT is based.

    It is well known that, in normal code translation procedures, the source code is made to pass through some number of intermediate forms until, through final processing, it becomes target code.


    (1) CS → CI1 → CI2 → CI3 ... CA1 → CT
    In (1) the subscripts, I1, I2, etc., are meant to suggest successive Intermediate forms, while A1 suggests Assembly form (A2, A3, etc., being possible intermediate forms through which the code passed during assembly). In FORTRAN translation, for example, the initial formula statements pass through several intermediate forms until they reach "triple" form; they are then translated further into SAP assembly code, etc. Generally speaking, as the code progresses from form to form, it becomes, step-wise, less and less conditioned by the specifications of the source code. Thus, for example, in FLOWMATIC, by the time the code has reached Op. file (1) form, it is largely independent of the detailed rules for FLOWMATIC statement writing. One could substitute many new conventions for FLOWMATIC sentence presentation, and correspondingly only alter the action of the so-called "glossaries", which (together with some general controlling code) determine the first intermediate code form. To put this another way: many details of FLOWMATIC code form are interpreted and done with in the first stages CS → CI1. All such details could therefore be altered without requiring any revision in the compiler, save a change in the early phases. (Peter Sheridan, "The FORTRAN Symbolic Translator", ACM Communications, Vol. II, No. 2, p. 9.)

    Exactly parallel observations apply to the terminal stages of code translation with regard to the details of the target code. Again choosing FLOWMATIC as an example: one can alter the target computer (hence the target code) with little or no effect on the early stages of translation which are source-code determined. (This observation is the basis for the so-called AIMACO system which translates FLOWMATIC code into 1103 target code.) Figure 3 shows a rough break-down of code-translation into stages which are successively more target-code determined and less source-code determined. (The fact that some computer effects enter the translation procedure almost from the beginning is what makes the UNCOL proposal -- namely to accomplish all translations in two stages, the first source determined and the second target determined -- unfeasible.)

    The division into gross stages as in Figure 3 is neither unique nor optimal. Its sole purpose is to show how the ACT translation library may be used to good advantage. Suppose for example that a certain translation (Si, Ti) is coded with exactly six intermediate stages, as shown in Figure 3. The over-all ACT program would read:

    (2) Do stage 1; Do stage 2; ... Do stage 6

    This program would be stored in the ACT library together with all six sub-programs which are called upon. Now suppose that one wishes to produce a translation for (Sj, Ti) where Sj differs from Si only in its rules for operation and operand presentation (such as parenthesis notation for algebra replaced by Polish notation; or the rules for file and item descriptions changed from "explicit" to "pictorial" form, etc.). The new translation may be obtained by writing a new controlling program

    (3) Do stage 1'; Do stage 2; ... Do stage 6

    and also coding a new sub-program corresponding to 1'. If, on the other hand, one wishes to produce a translation for (Si, Tj), one will have to consider how radically Tj differs from Ti. The more radical the difference, the greater the number of intermediate stages which must be replaced.

    Consider next, the problem of producing a translation for (Sk, Ti) where Sk is a source code which allows parts to be written in Si and other parts in Sj with "type" labels to show which sections are written in accordance with which conventions. Suppose further that both translations (Si, Ti) as well as (Sj, Ti) were already available, and both coded in six stages as in Figure 3. The new control program would cause some number of initial stages for Si to be performed, and also some number of initial stages for Sj. As soon as both streams of coding have achieved a common form, they are, from there on in, submitted to the same terminal stages. How many stages would be involved before the two streams converge is a function of how profoundly the two codes, Si and Sj, differ in their features as analyzed in Figure 3.
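    The stage re-use argued in these examples can be sketched as control programs that are mere lists of stage sub-programs drawn from a shared library. The stages below are toy stand-ins, not real translation steps: replacing only stage 1 with a stage 1' accommodates a new source code, while stages 2 through 6 are re-used unchanged.

```python
# Sketch of translation-library re-use: a controlling program for a pair
# (S, T) is just a sequence of stage sub-programs; a source code differing
# only in early, source-determined features replaces only the early stage.

def run(stages, code):
    for stage in stages:
        code = stage(code)
    return code

# Toy stages: stage 1 is source-determined, the rest target-determined.
stage1_Si = lambda c: c.replace("(", " ").replace(")", " ").split()
stage1_Sj = lambda c: c.split()[::-1]     # say, Polish-notation input
stages2to6 = [lambda toks: " ".join(toks), lambda text: "T1:" + text]

pipeline_Si_Ti = [stage1_Si] + stages2to6   # program (2): Do stage 1 ... 6
pipeline_Sj_Ti = [stage1_Sj] + stages2to6   # program (3): Do stage 1' ... 6

assert run(pipeline_Si_Ti, "(a b)") == "T1:a b"
assert run(pipeline_Sj_Ti, "b a") == "T1:a b"   # converges to the same form
```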

    These greatly over-simplified examples are meant to show that, because of the partial "factoring" of source from target effects which one may realize in code-to-code translations, one will have ample opportunity for the re-use of major translation sub-programs to be stored in the ACT translation library. As discussed previously, the extended machine base seems indispensable for the combining of such major sub-portions into entire operative programs.

    Extract: Final Summary
    Final Summary
    The primary aim of the ACT system is to reduce to a minimum the labor required to implement a given translation (Si, Vi, Ti). The reduction of labor is achieved through:


    1. Storing translational functions of almost universal applicability in the core library and providing for their incorporation in new translation programs.
    2. Making it possible for the translation library to grow in such a way that new programs (Sj, Tj) may use, as sub-programs, previously coded procedures already stored in the translation library.

    Regarding item (2) above, its realization is entirely dependent on EM as the fundamental base on which ACT is built. The effectiveness of item (1), on the other hand, is dependent on the analysis of coding functions which leads to the definition of canonical form.

    (Many of the representational facilities which CF provides will not be directly employed in programmer-written source codes, but will appear in intermediate code-forms which are the result of early editing and other translations.)

    Figure II is a schematic summary of the three main portions of ACT. Each box in the figure is horizontally divided into three sections. The middle section tells by what functional requirements the object is determined; the lower section tells by what coding forms the object is determined. ('EM-code' refers to the extended machine. In regard to the triplet (S, V, T) the machine used -- extended or unextended -- is V. Because of the nature of EM, the ACT system can be transferred from one V to another without recoding if the two machines differ only in configuration but not in order structure.)

          in [JCC 17] Proceedings of the 1960 Western Joint Computer Conference, May 1960 view details
  • Holt, A. W. and Turanski, W. J. "Man-to-Machine Communication and Automatic Code Translation" view details Abstract:

    The ACT system is a programmed adjunct to a general purpose computing machine
    (currently considered for the MOBIDIC) whose purpose is to facilitate the initial
    encoding and subsequent application of specific "code-to-code translation"
    procedures. The phrase "code-to-code translation" is primarily meant to cover
    the conversion of problem oriented pseudo-codes into machine code equivalents.
    The ACT system provides for the "housing" of many distinct translation
    procedures -- carrying algebraic, data-processing, simulation languages, etc.,
    into any of a variety of computer codes -- within the bounds of a single
    controlling system.

    The programmed components of the ACT system include:


    1. An allocation interpreter -- AI
    2. A "core library" of basic translation functions
    3. A general translation library whose content varies with use of the system.

    The allocation interpreter permits the formation of large translation
    programs out of large (and small) pre-stored sub-programs. Without greatly
    impeding computation speed, AI determines internal and external storage
    allocations for all space-taking portions of information as part of problem
    computation. This facility is shown to be essential to the construction of ACT,
    and perhaps widely useful in other types of programs.


    The core library contains translation functions which play a fundamental role
    in a wide range of code-to-code translations. Included are many functions which
    usually play an important part in machine assembly programs. Other functions
    provide for general library handling, code reordering, etc.

    Also associated with ACT is a writing convention called "canonical form"
    which governs the presentation of data (i.e., programs in source code submitted
    for translation). Canonical form, while flexible enough to permit highly diverse
    problem codes, still standardizes all those signals which the basic ACT
    components decode.

    The ideas presented herein were developed at the University of Pennsylvania
    under contract to the U.S. Army Signal Corps in connection with the FIELDATA
    family of data-processing equipment. Although reference is made to the FIELDATA
    family in the presentation below, the ideas should be applicable to other
    families of equipment. Because of the prevailing tendency towards modular design
    in computing systems, such families are rapidly becoming more numerous.


    Extract: General Considerations
    General Considerations
    The Problem
    The ACT system is designed to reduce the man-to-machine communication problem in operating the Army FIELDATA system. The fundamental assumptions with regard to this communication problem are:

    that the FIELDATA system employs a variety of different digital computers;
    that the computational problems to be solved with these machines belong to many different "problem areas" -- e.g., trajectory computation, estimation of statistical parameters, information retrieval, language translation, etc.;
    that the exigencies of time and circumstance may make it necessary to run any of the intended range of problems on almost any of the intended range of machines;
    that the overwhelming majority of Army personnel concerned with these problems will not be computer programming experts for any (let alone all) of the intended computers; and
    that the efficiency of the entire FIELDATA system hinges primarily on the speed of problem solution -- i.e., the total "lapsed time" from problem statement till solution delivery. Therefore, the speed with which a problem can be correctly encoded for computer solution may matter a good deal more than the efficiency of the machine program with which the computation is finally performed.
    Existing Techniques
    The man-to-machine communication problem has been widely recognized and seriously attacked in many of its special instances. Most of the present day "automatic programming systems" are the result of manufacturers' attempts to narrow the gap between "the problem", which exists in the minds of technical specialists, and "the program" which "causes" the computer to compute a solution. The invention and construction of an automatic programming system normally involves two major steps:

    The invention of a special problem code (often referred to as a "pseudo-code") which:
    is compatible with already established habits of expression within some technical specialty, and
    is sufficiently formal and complete to permit automatic conversion -- or "translation" -- into a computer-coded program.
    The construction of a special translating routine which accepts as input data problems represented in problem code and produces as output corresponding machine programs.
    Most commonly, the translating routine is designed to accomplish its entire task before computer solution of the stated problem begins. Thus, bringing a problem to solution with the help of an automatic programming system is typically a two-stage process -- translation followed by computation. (See Figure 1.)

    Automatic programming systems which function as in Figure 1 depend upon three codes: S, V, T. They are usually produced by manufacturers as adjuncts to particular computers.

    Thus, "Computer V" and "Computer T" of Figure 1 are almost always one and the same. If, for example, the manufacturer is putting a "scientific machine" on the market, he will usually design an algebraic-mathematical pseudo- code to go with it, and he will prepare the translating routine to produce programs for itself on that computer.

    Past experience with the building of automatic programming systems has brought to light several facts which have a strong bearing on the FIELDATA man-to-machine communication problem:


    To produce an automatic programming system (S, V, T) normally requires major programming effort -- measured in tens of man-years.
    Such systems are usually very difficult to modify in response to changes in the specifications for S, V or T. This difficulty arises because the translating routine, within which the specifications for S, V, and T are "frozen", is usually a large computer program from which it is impossible to 'factor' the components, S, V, T.
    As a consequence of the immediately preceding point, it is further true that a translating routine which has been programmed for the triple (S1, V1, T1), does not give a substantial head start for producing the routine for (S1, V1, T2) or for (S2, V1, T1).
    Extract: FIELDATA Requirements
    FIELDATA Requirements
    The requirements for FIELDATA computing, as set forth at the beginning of this introduction, make it clear that:
    1. The operation of the FIELDATA computing facility will require a large variety of source codes available for easy use in a rich variety of technical specialties.
    2. A given source code will have to be translatable into any one of several target codes.
    Some more detailed discussion of both points follows.

    Problem Code Flexibility
    It is a well known fact that each technical field has associated technical "jargon" -- special vocabulary as well as special forms of expression -- which belongs to it and it alone. This specialization of language resembles the specialization of computers for particular purposes. In general, special computer problem codes for statisticians, physicists, psychologists, business administrators, operations researchers, etc., have not been developed because the art of automatic programming is young and the building of new systems very expensive. If the cost of translating routines were not an issue, it would be desirable to provide each technical field with coding conventions which most nearly suit its established habits of expression.

    Quite apart from making provisions for presently fixed habits of expression, the creator of a translating routine must also note that habits of expression change with time and circumstance. As the constellations of problems to be solved undergo change, so do the related language habits. Old words are put to new uses; new words are invented; abbreviations are created to refer to oft-repeated combinations; previously disjoint sets of notations must be combined, etc. Thus, expression forms change in response to changing needs. In many important cases the expressions lead to computable problems. The speed with which answers appear depends heavily on the amount and kind of hand "recoding" (or "encoding") of the original problem statement before the computer can "take over" and ultimately execute the proper sequence of computer steps to achieve a problem solution.

    Computers -- Variety and Compatibility

    No matter how similar two computers may seem, they always turn out to be distressingly different in relation to some problems. Consider, for example, two computers, C and 2C, which differ from each other in no respects other than that 2C has twice the internal storage capacity of C. For all those problems easily accomplished by a single run on C, Computers C and 2C are fully equivalent. For those problems which involve more than one run on C -- or perhaps the use of overlays -- Computer 2C is equivalent to Computer C only in the sense that those originally programmed for C will also run on 2C. Thus, for such problems, a major task of conversion from one computer to another may be encountered if

    the problem was originally programmed for 2C, or
    the problem was originally programmed for C but must be made to run efficiently on 2C.
    So-called "compatibility" between machines (in the sense of very similar or identical instruction codes) does not guarantee ease of program conversion from one computer to another. Only small problems will convert readily, while big problems will offer difficulties not much less formidable than when "incompatible" machines are considered. Although the FIELDATA computers are being designed with mutual compatibility in mind, significant differences between one translating routine (S, V, T1) and another (S, V, T2) must be expected.

    The need for many source codes and the inclusion of many machines (and many configurations for each machine) in the FIELDATA system makes the normal automatic programming approach economically unfeasible. To go from s source codes to m machines patently requires s·m translation routines -- a prohibitively large number when the effort for each such translating routine is considered. Even worse, if one had gone to the labor of constructing these s·m translating routines, one would still be faced with an unmanageable chore in keeping them up-to-date in response to changing requirements. (This is almost exactly the problem to which a proposal named UNCOL (sponsored by SHARE and described in ACM Communications, Vol. I Nos. 8, 9) is addressed. In our opinion the UNCOL proposal, while aimed in a valuable direction, is practically unworkable. For more details, see section on General Translation Library below, also a letter from George H. Mealy, Switching Systems Dept., Bell Tel. Lab., Inc., to the members of the UNCOL committee, dated Feb. 3, 1959.)
    Extract: Code-to-Code Translation
    Code-to-Code Translation
    The study of many problem oriented source codes and their translation into computer code reveals that various translation processes have many important common elements. Often commonness between translation processes can be found by examining their respective constituent parts. For example the translation of S1 to T1 -- briefly (S1, T1) -- and the translation (S2, T2) may both employ the same assembly system for the terminal portion of the translation. Sorting is a frequent common sub-process; at a finer level of analysis, table look-up, list forming, subdividing a "sentence" into "words", economizing storage locations, etc., have frequent occurrence.

    Another kind of commonness frequently found between two translation processes is the employment of a common control sequence with differences in regard to what is controlled. This is illustrated by generator systems (such as FLOWMATIC) in which the individual generators are properly regarded as sub-translators subject to an over-all controlling scheme. Two quite different pseudo-codes might be handled by the same controlling scheme with differences only in the library of generators being controlled.

    These examples indicate that code-to-code translation is a coherent area of computation for which programming aids may be prepared. These programming aids (whose nature and form will be discussed below) will:

    Reduce by large factors the effort required to produce an operating program for the translation (S, V, T).
    Similarly reduce the effort required to modify existing translation programs in response to changed specification for S, V, or T.
    The ACT system is designed to be a programming aid such as discussed above. Once constructed, it should have a far reaching effect on the manner and effectiveness of FIELDATA utilization. In working on some particular class of problems -- or even just one large problem -- programmers will find it convenient to refine their tools of program expression as they go -- by modifying or adding to some previously available set of conventions. Thus code or "language" development will go hand in hand with problem solution, just as is normally the case in technical or scientific work without computers. Furthermore the ACT system should greatly facilitate the re-translation of a particular source code, originally translated to one machine but now required for another.

    Extract: The ACT System
    The ACT System
    The ACT System is a collection of programmed devices which loosely fit together to help in the implementation of specific translations. (Some of these components have importance and usefulness above and beyond their function within ACT. This is exactly comparable to parts of a computer (such as memory units or tape transports) which can be integrated into many different systems.) Also associated with the ACT System are two sets of writing conventions: one set governing the presentation of translation procedures -- to be called "ACT translation procedures" -- and the other to govern the presentation of data (i.e., source-coded programs submitted for translation). This is analogous to computer systems, for which data and programming conventions must always be specified. A complete description of ACT will therefore include:

    A description of major components.
    Writing Conventions governing ACT translation procedures.
    Writing Conventions governing programs submitted for translation.
    The programmed components of the ACT system include:

    An allocation interpreter -- AI
    A "core library" of basic translation functions
    A general translation library whose content varies with use of the system.
    Extract: Allocation Interpreter
    Allocation Interpreter
    The allocation interpreter permits the formation of large translation programs out of large (and small) pre-stored sub-programs. Unlike assembly programs, it accomplishes its function dynamically -- i.e., during the actual running of translation programs. This mode of operation is dictated by two important characteristics of translation programs:


    frequent use of data dependent storages
    the possibility of analyzing such programs into parts some of which are large sub-programs (unlike mathematical subroutines).
    Although AI participates at run-time, it does not cause any substantial reduction in computation speed. Almost all program instructions are performed by the actual computer without interpretive interference. Only at infrequent intervals in program running time, when there is a change in requirements for internal or external storage space, is there a jump out to AI for bookkeeping and allocation activities, followed by a return of control to the running program. In this respect AI behaves in a manner analogous to that of contingency routines.

    The special jump-return instructions which interrupt the main computation for AI intervention might be conveniently viewed as new macro-instructions, added to the previously available stock of computer commands. These macro-instructions are entirely concerned with space allocation, and programs written with these instructions in mind have new formal properties. Correspondingly, a computer-plus-AI on which such programs are intended to run may be conveniently regarded as an "Extended Machine" -- EM. Figure 2 is based on this conception. The central cross represents the extension.

    Associated with EM is EM-code in which programs for EM must be stored. This code differs from ordinary machine code in several respects.

    The program is formally subdivided into program "phases", each phase headed by a phase description. The phase description lists "sequences" or "segments" of information belonging to the phase and requiring space (internal or external) when the phase operates.
    The phases of a program relate to each other through special calls (interpreted by AI). When a call for a new phase is encountered, the calling phase is suspended and the called phase initiated. When the called phase terminates it may either be held for re-use or dropped, depending on the manner of call.
    Aside from special AI macro-instructions having to do with space allocation (including new phase calls) EM code contains ordinary computer instructions with sequence-and-segment-relative addresses. These addresses are converted by a program loader into fixed form when the phase is first encountered during program operation. If a phase is held for reuse and then re-activated, it is returned to memory to the same set of locations to which it was originally assigned. Between uses of the same phase AI will have taken care not to assign any newly developed information which this phase requires to locations causing conflict with old information already belonging to the phase.
    Although phases of EM-coded programs will normally be functional units (corresponding to a box on some flow-chart description), the converse need not be expected. The only use which is made by AI of the knowledge that a program is subdivided into phases with particular interconnections is to temporarily dump some information in favor of other information which is now required and which could not otherwise be accommodated. Therefore functional units which, in their coded form, occupy very little space in the various computer media are not worth treating as separate phases. The bookkeeping involved in their handling is not worth the space return which they can bring.
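    The loader step described above can be sketched in modern terms. This is a hypothetical reconstruction, not the ACT encoding: instructions carry segment-relative addresses, and when AI first loads the phase, the loader fixes them against the base locations it has assigned.

    ```python
    # Sketch of the EM loader step: sequence-and-segment-relative addresses
    # are converted to fixed form when the phase is first encountered.
    # The (opcode, segment, offset) layout is a modern invention.

    def fix_addresses(instructions, segment_bases):
        """Convert (opcode, segment, offset) into (opcode, absolute address),
        given the base location AI assigned to each segment of the phase."""
        return [(op, segment_bases[seg] + off) for op, seg, off in instructions]

    phase_code = [("CLA", "data", 2), ("STO", "work", 0)]
    fixed = fix_addresses(phase_code, {"data": 500, "work": 700})
    # → [("CLA", 502), ("STO", 700)]
    ```

    Because a held phase is always returned to the same locations, this conversion need only happen once per phase, not on every re-activation.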

    The power of EM may be said to derive from the following fact. In the case of EM programs the total space requirements of the program (for external and internal space) have been broken down into space requirements for individual program phases (where different phases may overlap by requiring the same information). At any one moment in computation time only one phase is operating and hence only its space requirements must be satisfied. Any parts of other phases which do not overlap with the active phase can be dumped and later restored if necessary. But the greater the amount of dumping and later restoring the slower the program operates. If, therefore, there is enough memory space to permit the assignment of phases side-by-side the program will run correspondingly faster. In this way EM is able automatically to trade time for space depending on the total available storage of the computer and the sizes of various data lists which may have grown in course of computation.
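    The time-for-space trade can be made concrete with a minimal sketch, under assumed names and a simplified dumping rule (dump resident phases in load order until the new phase fits); the real AI's bookkeeping was certainly richer.

    ```python
    # Minimal sketch of the EM idea: only the active phase's space needs must
    # be satisfied; other phases may be dumped and later restored, trading
    # time (dump/restore work) for space. All names here are hypothetical.

    class ExtendedMachine:
        def __init__(self, capacity):
            self.capacity = capacity      # total internal storage available
            self.resident = {}            # phase name -> space it occupies
            self.dump_count = 0           # how often time was traded for space

        def activate(self, phase, need):
            """Make room for `phase`, dumping other phases only if required."""
            if phase in self.resident:
                return                    # held for re-use: nothing to load
            free = self.capacity - sum(self.resident.values())
            for other in list(self.resident):   # dump in load order
                if free >= need:
                    break
                free += self.resident.pop(other)
                self.dump_count += 1      # each dump costs restore time later
            if free < need:
                raise MemoryError("phase exceeds total storage")
            self.resident[phase] = need

    em = ExtendedMachine(capacity=100)
    em.activate("translate", 60)
    em.activate("assemble", 30)   # fits side-by-side: no dumping, runs faster
    em.activate("edit", 50)       # forces a dump: time traded for space
    ```

    With ample memory `dump_count` stays at zero and phases sit side-by-side; as memory shrinks the same program still runs, only slower, which is exactly the automatic trade the paper claims for EM.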
    Extract: The Core Library
    The Core Library
    The core library consists of a collection of sub-programs -- called translation procedures -- which have near-universal importance in translating pseudocodes into machine codes. (Many of these functions derive from a single fact about the relation of source to target: namely that the information in the source code is packaged by time of generation, while in the target code it must be packaged by time of use).

    The writing conventions -- called "Canonical Form", abbreviated "CF" -- governing source-coded programs submitted for translation, are entirely related to the core library. All of the signals for which CF provides are decoded by one or several procedures in the core library. Conversely, all those translation procedures which only deal with signals specified by CF belong in the core library. The following is a partial list of functions fulfilled by the core library procedures.

    Coding Item Handling --
    Each CF item of pseudo-code (such as a FORTRAN formula, an assembly instruction, a FLOWMATIC sentence, etc.) has a formal structure which provides for standard item features such as:

    an item identification number
    a symbolic item name
    item type indications.
    In addition, after some preliminary editing, the body of the item may be recognized to consist of a sequence of symbols. The core library provides CF item handling functions which allow specific translation processes to gain access to constituent portions of items by means of "item dissecting" commands. Similarly, if a translation procedure produces CF coding items as output, they may be assembled from their constituents by use of "item assembling" commands.
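    A minimal sketch of the dissecting/assembling pair, using assumed field names (the paper specifies the standard item features but not their concrete layout):

    ```python
    # Sketch: a CF coding item carries an identification number, a symbolic
    # name, type indications, and a body of symbols. "Item assembling" builds
    # an item from constituents; "item dissecting" exposes them again.

    def assemble_item(ident, name, itype, symbols):
        """Item-assembling command: build a CF item from its constituents."""
        return {"id": ident, "name": name, "type": itype, "body": list(symbols)}

    def dissect_item(item):
        """Item-dissecting command: give a translation process access to
        the constituent portions of an item."""
        return item["id"], item["name"], item["type"], item["body"]

    item = assemble_item(17, "LOOP", "instruction", ["CLA", "B", "542"])
    ident, name, itype, body = dissect_item(item)
    ```

    Translation procedures that consume CF items need only these two operations, which is what lets the core library serve many different pseudo-codes.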

    Patch Interpretation --
    CF coding items may be generated (by programmer or generating routine) in some order wholly different from the order in which they will be next required. Item rearrangement may be coded into the original item sequence by use of the item identification numbers as well as by use of special "patch items". Core library functions will cause item re-ordering, thus interpreting the "patch" intentions expressed in the original item sequence.
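    The re-ordering can be sketched as follows; the splice-after-item-N rule is a guess at the patch mechanism, chosen only to illustrate how identification numbers let generation order differ from use order.

    ```python
    # Sketch of patch interpretation: items generated out of order carry
    # identification numbers, and a patch directs later-generated items to
    # be spliced in after an earlier item. The encoding is hypothetical.

    def interpret_patches(items, patches):
        """items: list of (ident, text) in generation order.
        patches: dict mapping ident -> items to splice in after it."""
        result = []
        for ident, text in items:
            result.append((ident, text))
            result.extend(patches.get(ident, []))   # honor patch intentions
        return result

    stream = [(1, "read"), (2, "compute"), (4, "print")]
    patched = interpret_patches(stream, {2: [(3, "round")]})
    # patched restores the intended order 1, 2, 3, 4
    ```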
    Symbol Interpretation --
    By "symbol" is meant the smallest meaningful constituent of a coding item. Thus, the computer instruction 'CLA B 542' is composed of three symbols. Symbols may be of various types from the point of view of what interpretation they require during translation. Some symbols, for example, are absolute and require no interpretation; other symbols are 'dummy' and require replacement by specified expressions; yet other symbols refer to named coding items -- i.e., are "addresses" -- and will require address interpretation. The core library contains a variety of procedures to effect symbol interpretation.
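    The three symbol classes named above can be sketched with one illustrative interpreter; the lookup tables and replacement rules are assumptions, not ACT's actual procedures.

    ```python
    # Sketch of symbol interpretation. The classification (absolute, dummy,
    # address) follows the paper; the concrete decoding rules are invented.

    def interpret_symbol(sym, dummies, item_locations):
        """Return the interpreted form of one symbol of a coding item."""
        if sym in dummies:                 # dummy: replace by an expression
            return dummies[sym]
        if sym in item_locations:          # address: refers to a named item
            return str(item_locations[sym])
        return sym                         # absolute: needs no interpretation

    # 'CLA B 542' is composed of three symbols; here B is a dummy.
    symbols = ["CLA", "B", "542"]
    out = [interpret_symbol(s, {"B": "TEMP+1"}, {}) for s in symbols]
    ```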
    Type Interpretation --
    CF coding items may contain marks to indicate their specific type. Thus, for example, a FORTRAN DO-statement and a dimension statement might be distinguished as to type; or a statement to be interpreted by FORTRAN as opposed to a statement to be interpreted by SAP may be distinguished by type. In general, type indications always mean that the items thus categorized are to be respectively delivered to different translation procedures for treatment. Core procedures are used to deliver CF coded items to appropriate translating routines in proper priority sequence, according to type indication.
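    A sketch of that delivery mechanism, with made-up procedure names and a made-up priority rule (declarations before executable statements):

    ```python
    # Sketch of type interpretation: typed items are delivered to different
    # translation procedures, in priority order. Names are illustrative only.

    def deliver_by_type(items, procedures, priority):
        """Route each typed item to its translation procedure, treating
        higher-priority types first, and collect the outputs."""
        out = []
        for itype in priority:
            proc = procedures[itype]
            for item in items:
                if item["type"] == itype:
                    out.append(proc(item))
        return out

    items = [{"type": "DO", "body": "DO 10 I=1,N"},
             {"type": "DIMENSION", "body": "DIMENSION A(10)"}]
    procs = {"DIMENSION": lambda it: ("alloc", it["body"]),
             "DO": lambda it: ("loop", it["body"])}
    # Dimension statements treated before executable statements:
    result = deliver_by_type(items, procs, priority=["DIMENSION", "DO"])
    ```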
    Call Interpretation --
    One of the frequent mechanisms employed in code-to-code translation is the adjoining of one or more batches of coding items, taken from a library of such "batches", to the coding items which already belong to the source coded routine being interpreted. A good example of this is the selection of subroutines from a library. After the subroutines have been picked up and (with some adjustments) adjoined to the routine undergoing translation, this latter routine has been 'expanded' by the inclusion of new coding items. Special items in the source coded routine, called "call items", tell what particular batches are to be selected from the library for inclusion.
    In the case of subroutine selection, call interpretation may be carried out through a hierarchy of levels. The subroutines selected for adjoining to the calling routine may themselves contain further call items requiring (for their interpretation) the selection of yet other library routines for inclusion. Thus the original source coded routine contains "first level" calls; the subroutines thus selected may contain "second level" calls, and so forth, for nth level calls. ("Univac Generalized Programming", Remington Rand Univac, RRU 21, 1957, New York, N.Y.)

    Call interpretation may be usefully applied to a variety of cases which have nothing to do with subroutine selection. For example, a routine coded in FLOWMATIC might be said to "call" on file descriptions from a library. The interpretation of such calls has much in common with sub-routine selection. To go one step further: if there were a number of different files which all involved the same item description, then there might even be "second level" calls -- the routines calling on file descriptions, and the file descriptions calling on item descriptions. Core procedures are used to carry out hierarchic fetching and adjusting of library information.
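    The hierarchic fetching described above amounts to a recursive expansion, which can be sketched as follows (the data layout and the adjoin-each-batch-once rule are assumptions for illustration):

    ```python
    # Sketch of call interpretation through a hierarchy of levels: call items
    # select batches from a library, and the selected batches may themselves
    # contain further call items (second level, ... nth level).

    def expand_calls(routine, library, seen=None):
        """Adjoin library batches named by call items, recursively."""
        seen = set() if seen is None else seen
        expanded = []
        for item in routine:
            if isinstance(item, tuple) and item[0] == "CALL":
                name = item[1]
                if name not in seen:          # adjoin each batch only once
                    seen.add(name)
                    expanded.extend(expand_calls(library[name], library, seen))
            else:
                expanded.append(item)
        return expanded

    library = {"SQRT": ["sqrt-code", ("CALL", "ABS")],   # second-level call
               "ABS": ["abs-code"]}
    routine = ["main-code", ("CALL", "SQRT")]            # first-level call
    flat = expand_calls(routine, library)
    # → ["main-code", "sqrt-code", "abs-code"]
    ```

    The same expansion serves the FLOWMATIC example: routines calling on file descriptions, and file descriptions in turn calling on item descriptions.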
    Extract: The General Translation Library
    The General Translation Library
    The contents of the general translation library are governed by the particular source codes and target codes for which translation procedures are desired. When stored in the translation library, these procedures are represented in EM code -- hence contain special macroinstructions having to do with allocation. They also contain macro-instructions referring to translation procedures stored in the core library. In general the time at which various core functions are brought into play in course of a translation is dependent upon features of source and target codes beyond those which are governed by canonical form. Hence the highest level control of a translation procedure always resides in a program belonging to the general translation library. With ACT, as presently conceived, there must be a distinct high-level controlling program for each source-to-target pair (S, T) desired. Of course these controlling programs will, in general, be very short, the bulk of the work being encoded in sub-programs belonging to general or core libraries. (It is projected that a special component of ACT can be prepared, to be called the ACT "translation control" program. Highest level control of translation would then always be vested in this program, and the total number of individual translation procedures required to handle a large class of (S, T) would be greatly reduced. Further study is required to test the feasibility of this.)

    One class of translation procedures expected in the library deserves special comment. In course of many code-to-code translations one reaches a penultimate code form which may be roughly described as "assembly code" and a final step in translation called "assembly". There is considerable variation between present-day translation systems (such as FORTRAN, FLOWMATIC, IT, etc.) in regard to the complexity of this final assembly phase. In the FLOWMATIC compiler for example, only the most rudimentary assembly phase, consisting of nothing more than relative address interpretation, is employed, while FORTRAN translation produces SAP assembly code with symbolic addressing, library reference ability, and many other features. ("X-1 Assembly System" ADP, Remington Rand Univac, Philadelphia, Pa.)

    In the development of ACT a special analysis was undertaken of the terminal translation steps to determine whether a widely applicable 'end game' could be prepared for inclusion in many different translation procedures. This analysis resulted in a set of specifications for assembly -- system and code -- which includes many features normally included in such systems but also some noteworthy additions. (Description of these lies beyond the scope of this paper.) Assembly procedures conforming to these specifications make extensive use of the core functions previously described as well as some additional ones. Since, however, such procedures always depend, in part, on the target computer, they are themselves represented in the general translation library.

    The success of the ACT system conception hinges critically on the feasibility of general translation library building, and we will now try to show that past experience with code-to-code translation justifies the expectations on which the over-all design of ACT is based.

    It is well known that, in normal code translation procedures, the source code is made to pass through some number of intermediate forms until, through final processing it becomes target code.


    (1) CS → CI1 → CI2 → CI3 → ... → CA1 → CT
    In (1) the subscripts, I1, I2, etc., are meant to suggest successive Intermediate forms, while A1 suggests Assembly form (A2, A3, etc., being possible intermediate forms through which the code passed during assembly). In FORTRAN translation, for example, the initial formula statements pass through several intermediate forms until they reach "triple" form; they are then translated further into SAP assembly code, etc. Generally speaking, as the code progresses from form to form, it becomes, step-wise, less and less conditioned by the specifications of the source code. Thus, for example, in FLOWMATIC, by the time the code has reached Op. file (1) form, it is largely independent of the detailed rules for FLOWMATIC statement writing. One could substitute many new conventions for FLOWMATIC sentence presentation, and correspondingly only alter the action of the so-called "glossaries", which (together with some general controlling code) determine the first intermediate code form. To put this another way: many details of FLOWMATIC code form are interpreted and done with in the first stages CS → CI1. All such details could therefore be altered without requiring any revision in the compiler, save a change in the early phases. (Peter Sheridan, "The FORTRAN Symbolic Translator", ACM Communications, Vol. II, No. 2, p. 9.)

    Exactly parallel observations apply to the terminal stages of code translation with regard to the details of the target code. Again choosing FLOWMATIC as an example: one can alter the target computer (hence the target code) with little or no effect on the early stages of translation which are source-code determined. (This observation is the basis for the so-called AIMACO system which translates FLOWMATIC code into 1103 target code.) Figure 3 shows a rough break-down of code-translation into stages which are successively more target-code determined and less source-code determined. (The fact that some computer effects enter the translation procedure almost from the beginning is what makes the UNCOL proposal -- namely to accomplish all translations in two stages, the first source determined and the second target determined -- unfeasible.)

    The division into gross stages as in Figure 3 is neither unique nor optimal. Its sole purpose is to show how the ACT translation library may be used to good advantage. Suppose for example that a certain translation (Si, Ti) is coded with exactly six intermediate stages, as shown in Figure 3. The over-all ACT program would read:

    (2) Do stage 1; Do stage 2; ... Do stage 6

    This program would be stored in the ACT library together with all six sub-programs which are called upon. Now suppose that one wishes to produce a translation for (Sj, Ti) where Sj differs from Si only in its rules for operation and operand presentation (such as parenthesis notation for algebra replaced by Polish notation; or the rules for file and item descriptions changed from "explicit" to "pictorial" form, etc.). The new translation may be obtained by writing a new controlling program

    (3) Do stage 1'; Do stage 2; ... Do stage 6

    and also coding a new sub-program corresponding to 1'. If, on the other hand, one wishes to produce a translation for (Si, Tj), one will have to consider how radically Tj differs from Ti. The more radical the difference, the greater the number of intermediate stages which must be replaced.
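    The stage-wise factoring of source from target effects can be sketched as function composition; the stages below are trivial stand-ins, chosen only to show stage 1 being swapped for stage 1' (parenthesis notation replaced by Polish notation) while stages 2..6 are reused unchanged.

    ```python
    # Sketch of a translation as a chain of library stages, as in (2) and (3).
    # A new source code Sj that differs from Si only in surface rules reuses
    # the later stages and replaces only the first.

    def run_translation(stages, source_code):
        code = source_code
        for stage in stages:          # "Do stage 1; Do stage 2; ..."
            code = stage(code)
        return code

    stage1  = lambda c: c.replace("(a+b)", "ADD a b")   # parenthesis notation
    stage1p = lambda c: c.replace("+ a b", "ADD a b")   # Polish notation
    later_stages = [lambda c: c.upper()]                # shared stages 2..6

    si_translation = [stage1] + later_stages
    sj_translation = [stage1p] + later_stages           # only stage 1' is new
    ```

    Both chains produce the same target code from their respective source notations, which is the re-use the ACT translation library is designed to exploit.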

    Consider next the problem of producing a translation for (Sk, Ti) where Sk is a source code which allows parts to be written in Si and other parts in Sj, with "type" labels to show which sections are written in accordance with which conventions. Suppose further that both translations (Si, Ti) as well as (Sj, Ti) were already available, and both coded in six stages as in Figure 3. The new control program would cause some number of initial stages for Si to be performed, and also some number of initial stages for Sj. As soon as both streams of coding have achieved a common form, they are, from there on in, submitted to the same terminal stages. How many stages would be involved before the two streams converge is a function of how profoundly the two codes, Si and Sj, differ in their features as analyzed in Figure 3.

    These greatly over-simplified examples are meant to show that, because of the partial "factoring" of source from target effects which one may realize in code-to-code translations, one will have ample opportunity for the re-use of major translation sub-programs to be stored in the ACT translation library. As already discussed previously, the extended machine base seems indispensable for the combining of such major sub-portions into entire operative programs.

    Extract: Final Summary
    Final Summary
    The primary aim of the ACT system is to reduce to a minimum the labor required to implement a given translation (Si, Vi, Ti). The reduction of labor is achieved through:


    1. Storing translational functions of almost universal applicability in the core library and providing for their incorporation in new translation programs.
    2. Making it possible for the translation library to grow in such a way that new programs (Sj, Tj) may use, as sub-programs, previously coded procedures already stored in the translation library.

    Regarding item (2) above, its realization is entirely dependent on EM as the fundamental base on which ACT is built. The effectiveness of item (1), on the other hand, is dependent on the analysis of coding functions which leads to the definition of canonical form.

    (Many of the representational facilities which CF provides will not be directly employed in programmer-written source codes, but will appear in intermediate code-forms which are the result of early editing and other translations.)

    Figure II is a schematic summary of the three main portions of ACT. Each box in the figure is horizontally divided into three sections. The middle section tells by what functional requirements the object is determined; the lower section tells by what coding forms the object is determined. ('EM-code' refers to the extended machine. In regard to the triplet (S, V, T) the machine used -- extended or unextended -- is V. Because of the nature of EM, the ACT system can be transferred from one V to another without recoding if the two machines differ only in configuration but not in order structure.)

          in [JCC 17] Proceedings of the 1960 Western Joint Computer Conference, May 1960 view details
  • Holt, A. W. and Turanski, W. J. First Quarterly Report of Common Programming Language Task, Part II, January 15, 1960; view details
          in [JCC 17] Proceedings of the 1960 Western Joint Computer Conference, May 1960 view details
  • Holt, A. W. and Turanski, W. J. Second Quarterly Report of Common Programming Language Task, Part II, March 15, 1960 view details
          in [JCC 17] Proceedings of the 1960 Western Joint Computer Conference, May 1960 view details
  • Holt, Anatol "Over all computation control and labelling" view details
          in [ACM] CACM 3(11) November 1960 view details
  • Sammet, Jean E "1960 Tower of Babel" diagram on the front of CACM January 1961 view details

          in [ACM] CACM 4(01) (Jan 1961) view details
  • Sammet, Jean E., "Programming languages: history and future" view details
          in [ACM] CACM 15(06) (June 1972) view details
  • Watts S. Humphrey "MOBIDIC and Fieldata" pp146-147 view details Extract: Fieldata Programming
    Fieldata Programming

    During this entire period, internal rivalry was growing between the communications and computer people at Fort Monmouth. The communications faction was wedded to teletype and felt that computers were a temporary fad. They thus attempted to redirect any computer I/O funding for use on teletype equipment. While Luebbert managed both computer and communications R&D, he was able to control this to some degree. He did not have an entirely free hand on how the funds should be spent, however, or what programs would receive emphasis.

    Though he felt that more should be done both with I/O equipment and programming, he was unable to get enough funds for either more I/O development work or for programming. Bill chose to emphasize programming with the limited additional funds he was allowed (Luebbert 1985). A number of programming development contracts were let to Sylvania for basic software, which included an assembler, some mathematical routines, and a set of utility programs. Development of a COBOL compiler was started in 1959 (Sammet 1985; USASRDL undated), but Sylvania did this work on its own without army funding.

    The Signal Corps also supported work on the systems and programming concepts for the Fieldata family of computers at the Moore School of the University of Pennsylvania. The six task areas involved systems, human factors, data transmission, common languages, new devices, and codes. The common language effort, led by Saul Gorn, focused on the feasibility of standardized coding methods for families of computers (Sammet 1985; Gorn and Parker 1960). As part of this effort, the Automatic Code Translation System (ACT) addressed the problems of writing programs that could be compiled and executed on families of different computers (Holt and Turanski 1960a).

    A basic tenet of Fieldata was the need for any type of problem to be run on almost any one of the machines. The farsighted nature of the ACT work is best indicated by the statement in their final report (Holt et al. 1960b): "that the overwhelming majority of Army personnel concerned with these problems will not be computer programming experts . . . and that the efficiency of the entire Fieldata system hinges primarily on the speed of problem solution -- i.e., the total "lapsed time" from problem statement till solution delivery."

    The people who worked primarily on ACT were Anatol W. Holt, William J. Turanski, and E. J. Parker, and they envisioned a system of three parts. The first was an allocation interpreter that coordinated a family of small stored subprograms. The second was a library of basic translation functions, and the third was a general translation library with content that varied depending on the application.

    Various other results of this University of Pennsylvania work were the development of techniques for detailed two-dimensional microflowcharting of machine instructions, significant theoretical work on formal languages by Gorn, exploration of optimizing techniques for information retrieval using list processing, and human factors studies of computer console design.

    The Signal Corps contract at the Carnegie Institute of Technology was under the leadership of Alan Perlis (Perlis 1961). It concentrated on the general area of programming languages and translators and produced, for example, the concept of the threaded list.

          in Annals of the History of Computing, Spring 1987 view details