ACSI-Matic(ID:2799/acs001)

Associative query language  


for Assistant Chief of Staff for Intelligence + MATIC

Hugely influential query language with early associative capabilities; it ran on the Sylvania 9400 and Telex Mass Memory. It had special features integrating hardware and programming structures to facilitate prediction of information location, and advanced sorting primitives (which influenced the COBOL sort verb).

Jack Minker Astro-Electronics Division, RCA "An Intelligence Processing System for the Office of the Assistant Chief of Staff for Intelligence"

Information system (esp. storage and sort) developed by Dr. Burnett H. Sams of RCA Data Systems Center, Bethesda, Maryland.

Part of the work was carried out by the Common Programming Language Task of Project ADAR at the Moore School, University of Pennsylvania, by Holt and Turanski.




People:
Related languages
ACSI-Matic => COBOL-61   Influence
ACSI-Matic => DM-1   Influence

Samples:
References:
  • Holt, A. W.; Turanski, W. J. "Automatic Code Translation System," final report, Document No. AD6OURI, Contract No. DA-36-039-sc-75047.
  • Miller, L., Minker, J., Reed, W., and Shindle, W. "A multi-level file structure for information processing"
          in [JCC 18] Proceedings of the 1960 Eastern Joint Computer Conference, New York, December 1960
  • "Design specifications for ACSI-MATIC sort and merge system" Report No. SR-61-3, prepared for Office, Assistant Chief of Staff for Intelligence, Department of the Army, Washington 25, D.C. by Radio Corporation of America, Astro-Electronics Division, Princeton, N.J. and Applied Data Research, N.J., June 1961.
  • Climenson, W.D., Hardwick, N.H., and Jacobson, S. N. "Automatic syntax analysis in machine indexing and abstracting" Amer. Doc. (Aug. 1961).
  • Herbert M. Gurk, Jack Minker "The Design and Simulation of an Information Processing System" Extract: Introduction
    Introduction
    The application of digital computers to the processing of natural language (i.e., non-numerical) data is being examined and attempted along many different lines. The most highly publicized practical applications are translations of text from one language to another, indexing and retrieval schemes for collections of documents, and automatic means of developing indexes or abstracts for individual documents. Several more basic investigations--general problem solvers, information processing languages, and list type memory structures--are also being conducted at our universities and elsewhere and being reported on regularly with much optimism for the future. In this paper we wish to discuss a further application--a natural language information processing system with high input rate, a great variety of input, and requirements not only for interpretation, storage, and retrieval of the data, but also for the logical processing, correlation, and combination of the data to develop a different body of information for retrieval and analysis. Moreover, this system, though designed with one particular practical operation in mind, has characteristics that would make it easily adaptable to many other activities. These basic concepts have been tested in a major operational simulation of a model of the system, which is described also in this paper.
    The system has been designed as part of a study nicknamed Project ACSI-MATIC. This work is being conducted by RCA to determine the potential uses of modern data-processing equipment and procedures in the activities of certain headquarters military intelligence operations of the Department of the Army.
          in [ACM] JACM 8(2) April 1961
  • Holt, Anatol W. "Program organization and record keeping for dynamic storage allocation" Abstract: The material presented in this paper is part of the design plan of the core allocation portion of the ACSI-MATIC Programming System. Project ACSI-MATIC is concerned with the application of computer techniques to the activities of certain headquarters military intelligence operations of the U.S. Army. In describing features of organization and record keeping there has been no attempt at completeness, but rather an exploration of the salient aspects of the system to some reasonable level of technical detail.
          in [ACM] CACM 4(10) (October 1961) "Proceedings of a Symposium on Storage Allocation"
  • Minker, J.M. "Implementation of large information retrieval problems"
          in [ACM] CACM 4(10) (October 1961) "Proceedings of a Symposium on Storage Allocation"
  • Sams, Burnett H. "Dynamic storage allocation for an information retrieval system" Extract: Introduction
    This paper presents an information retrieval problem whose programming solution included dynamic storage allocation. Allocatable machine code is defined, and an assembly program to produce allocatable machine code is described. The work reported on was done as part of Project ACSI-MATIC, which is concerned with the application of computer techniques to the activities of certain headquarters military intelligence operations of the U.S. Army.
    Simply stated, the problem was to design a system capable of digesting a large volume of daily input in such a manner as to be able to regurgitate selected portions of information in response to interrogations on a real-time basis. There were, of course, hosts of detailed questions which had to be met before such a system could function as a useful tool for research analysts. It is perhaps characteristic of information retrieval problems that the output selectors are often not the same as the input descriptors. In fact, the output is usually a distillation of pieces of information gleaned from many distinct inputs to the system. The implied processing raises many problems which cannot be answered to our complete satisfaction until experience has been obtained with such operational systems.
    A basic pattern of processing is to prepare an input-data structure which represents the terms and syntax of an input in a form convenient for machine manipulation. The variety of input structures renders impractical a fixed data structure to accommodate any allowed input. Different classes of terms require different sets of programs to incorporate a term into the data structure, and individual terms may introduce processing variations.
    A different process is the extraction from mass storage records or items satisfying selection criteria. The criteria and the type of records or items will determine the programs to be used for selection and the amount of working storage required.
    Another process is record collation. Records with contents relating to a given record are selected as candidates for merging into one or more records. Compatible records are merged, and records containing conflicting information are sent to programs which resolve the conflict. Many programs must be on-line to handle the different situations that may arise. These programs are large and have dissimilar storage requirements.
    These three examples of processing activity have in common an unpredictable flow of control through a set of programs having variable working storage requirements. The storage requirements for the individual programs are such that they could not be simultaneously satisfied for the set of programs participating in a given process. However, on a given computation only a few programs will use large amounts of storage so that the storage requirements can be satisfied on a shared basis.
    These characteristics led us to the conclusion that we could not assign programs to fixed memory areas and develop an overlay plan. We required the capability to make memory assignments at decision points throughout the processing. The making of memory assignments during processing requires access to a library file of programs whose translation from pseudo code into machine code is not complete. In general, a program will be segmented to ease the problem of finding a block of contiguous memory large enough to hold the program.
    Before a program may be used, its translation into machine code must be completed. This requires allocation activity to determine the origins where the various segments of the program will be placed in memory. The origins and other parameter values are supplied to a program loader which completes the translation into machine code. Such code will be termed allocatable code.
    In systems where no parameters are allowed except a program origin, the program is translated into relocatable code. A relocatable loader converts relocatable code into machine code by adding the value of the single origin onto selected fields containing addresses. Relocatable code does not allow instructions in one segment to directly reference data in another segment unless that segment always occupies the same memory locations or the origins of the two segments are required to differ by a constant; this latter is nearly equivalent to saying that the two segments are two intervals of a larger segment. This restriction on code segmentation was too severe and led to the rejection of relocatable code following a consideration of inter-program communication. The problem of inter-program communication is always important when many programmers are working on the components of a large system. Two common ways of solving this communication problem are (a) to agree upon symbolic names for the necessary information and merge programs, or (b) to place the necessary information in a fixed part of memory. The first approach is inefficient for large problems and assumes precise knowledge of the programs to be placed in memory or overlayed at a given time. The second approach was not possible for our problem because of the large number of programs involved and the variety of communication patterns. Our problem does not consist of a large set of plug-in programs operating from a common set of parameters, but instead, a large set of programs which have dissimilar parameterizations and which enter into many combinations.
    The above remarks are intended as background to the programming system development done for Project ACSI-MATIC. The information processing problems which have been briefly discussed and other problems arising from the daily and long-term operations of a military system led to the development of the ACSI-MATIC Programming System (APS). Broadly characterized, APS consists of three major components:
    (1) An instruction processor which receives instructions from both the inside and outside worlds.
    (2) A program sequencer which interprets control transitions, allocates space, and handles the multiprogramming aspects of a simultaneous input-output capability.
    (3) Translation from source languages into allocatable code.
    The remainder of this paper is concerned with allocatable code and the design of a program which assembles allocatable code. The discussion will move out from allocatable code towards the input programming language.
          in [ACM] CACM 4(10) (October 1961) "Proceedings of a Symposium on Storage Allocation"
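The distinction Sams draws between relocatable and allocatable code can be illustrated with a minimal sketch. This is not the APS loader: the `Word` record, the segment names, and the two loader functions are hypothetical, chosen only to show a single origin being added to every address field versus each address field being resolved against the origin of the segment it refers to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    """One word of not-yet-final code (hypothetical layout)."""
    op: str
    offset: int             # address relative to the start of a segment
    segment: Optional[str]  # segment the address refers to; None = absolute


def load_relocatable(words, origin):
    """Relocatable loading: the single origin is added to every relative address."""
    return [(w.op, w.offset + origin if w.segment is not None else w.offset)
            for w in words]


def load_allocatable(words, origins):
    """Allocatable loading: each address is resolved against the origin chosen,
    at allocation time, for the segment that address refers to."""
    return [(w.op, w.offset + origins[w.segment] if w.segment is not None else w.offset)
            for w in words]


program = [
    Word("LOAD", 3, "data"),  # refers to word 3 of a separate data segment
    Word("JUMP", 0, "code"),  # refers to word 0 of the code segment itself
    Word("HALT", 0, None),    # absolute field, never adjusted
]

# Segments placed independently, wherever free blocks happened to exist.
placed = load_allocatable(program, {"code": 500, "data": 9000})

# With a single origin, the data reference is forced to sit next to the code.
moved = load_relocatable(program, 500)
```

In the allocatable case the cross-segment `LOAD` still resolves correctly even though the two segments are far apart, which is the capability the extract says relocatable code lacks.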
  • Colilla, R. A., and Sams, B. H., "Information Structures for Processing and Retrieving," pp11-16 Extract: Introduction
    This paper discusses the problem of conceptualizing an information retrieval system. Broadly characterized, an information retrieval system consists of a file structure to index and hold information, an input language for entering new information into the system or changing the information currently in the system, an interrogating language for couching retrieval requests, a body of programs for performing the various processing tasks, a programming language for specifying new information processing algorithms, and finally a language to control the operations of the system. There is a tremendous variety of possible information retrieval systems. The input and interrogating languages may separately range from binary codes through formal languages to English. The corresponding processing may depend upon specified index terms or may utilize automatic abstracting and indexing techniques. The programming language may range from machine code to an arbitrarily sophisticated compiler code. The control language may be nonexistent, requiring manual loading of programs and data, or may enable self-scheduling of fully automatic parallel processing with dynamic allocation of storage to program and data [1, 2].
    To the extent that the input and interrogating language requirements overlap, the input and interrogating languages are able to share terms and syntax. The desirability of incorporating these informational terms and syntax within the programming language to ease the problem of specifying new information processing algorithms is also discussed. Therefore, in the more sophisticated systems, the input, interrogating and programming languages will be incorporated with the control language into a single information processing and retrieval language; in fact, there will be a tendency to incorporate a variety of these languages. (Consider the use of machine code within ALGOL procedures.)
    An information retrieval language introduces restraints on the design of data structures and processing programs. We take the point of view that the primary problem is not designing an information retrieval language but rather designing a file structure adequate for the anticipated processing and retrieval on a given class of machines and then designing notations (information processing and retrieval languages) for presenting inputs, requesting outputs, defining algorithms, and controlling operations.
    These notations may then be made increasingly sophisticated by adding analysis and translation programs to the system. Both approaches require a definition of the information retrieval problem, be it a single problem, a class of problems, or the "generic" processing and retrieval problem. Once the problem has been defined, one may either design a language and trust that a satisfactory implementation can be developed or one may start with the hardware and devise file structures and programming techniques. The latter approach yields a family of systems with increasing capabilities and gives greater assurance that an efficient system can be built. The development of a user's notation is now influenced by the particular file structure and processing patterns chosen, and without detriment to the user a specialized notation may be adopted which will simplify the analysis and translation of the information processing and retrieval language.
    We might say, generally, that an information retrieval language for a particular information retrieval system should provide statements for performing all the operations desired to be performed by a user of the system on data described in any way that may be necessary. A general information retrieval language, then, should provide this capability for a large number of different kinds of information retrieval systems. Just how different these systems can be is a measure of the generality of the language. One might reasonably expect it to be easier to design a language for systems with similar subject matter than with different subject matter since terms and syntax in the first case are likely to be similar. It seems easier, for example, to incorporate into one language the statements: "What is the population of New York?" and "What is the temperature of New York at 3 PM on July 10th?" than it is to incorporate "What is the population of New York?" and "What is the best procedure for an American to follow after having an automobile accident in Copenhagen?"
    Even for closely related subject matter, however, two systems might be different enough to make different demands on a language that is to be suitable for both systems. Suppose for some system one may ask the question, "How far is it from New York to Philadelphia?" where the answer is to be 90 miles. Suppose another system discriminates among the various routes of travel between New York and Philadelphia and will respond only if a specific route is stated. For example, the system may require a statement of the kind, "How far is it from New York to Philadelphia starting from Times Square, traveling through the Lincoln Tunnel, along the New Jersey turnpike and across the Benjamin Franklin Bridge to Penn Square?" Obviously, if the parameters are changed--e.g., substituting the George Washington Bridge for the Lincoln Tunnel--the answer will be different. The language, therefore, must recognize that route specification is necessary in system two, but should not be present in system one. Its presence in system one gives faulty information since the answer "90 miles" is given regardless of the route expressed. A means for handling inappropriate data, therefore, must likewise be incorporated.
    Suppose now that the users of system one decide that it is inadequate for their more detailed investigations and want to incorporate system two while maintaining the old capability, namely, that of recognizing a statement with route unspecified. Suppose, further, they are in agreement that for such a statement--that is, with route unspecified--a tabulation of the ten "most popular" routes is given where "most popular" is statistically determined by frequency of interrogations with route specified. The result is essentially a third system that performs the functions of system two with additional statistical and summarizing operations. The statements acceptable for systems one and two, however, are still acceptable to this third system, but the corresponding functions are now different; hence the interpretation of the same statements for the different systems must be different.
    One may conclude from the above examples that if one can design a general information retrieval language for a number of information retrieval systems, these systems are going to be similar in at least some way other than subject matter. Suppose that instead of trying to design a language for many information retrieval systems, we try to design a general information retrieval system that will be adequate for the solution of a large number of different information retrieval problems. One could then try to design a language for this general system. Each particular system could use as much or as little of the system capability as it chooses. The system should be reasonably efficient regardless of the extent of this use. We do not claim to have such a system developed but we proceed to describe at least what such a system should be capable of doing to be adequate for a particular class of information retrieval problems that require moderately sophisticated solutions.
    First we distinguish between direct and collated document retrieval systems. The object of the former is to get source documents all of which have something in common. Generally, each document is indexed by a set of key terms. The object of the latter is to produce a composite of data extracted and collated from many source documents.
    Each composite, called a record, is likewise indexed by a set of key terms. For example, if one wanted to know the current status of a particular hurricane, the direct document retrieval system would produce a few dozen weather reports each having something to say about the hurricane. The collated document retrieval system would organize, or would have previously organized, the data in the weather reports and would produce only the current status of the hurricane. If one asked about a tornado that was also discussed on those same weather reports, the direct system would simply present the reports a second time. The collated system would limit itself to the status of the tornado.
          in [ACM] CACM 5(01) January 1962 "Design, Implementation and Application of IR-Oriented Languages," ACM Computer Language Committee on Information Retrieval on 20-21 October 1961 in Princeton, N. J.
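The contrast between direct and collated retrieval in the hurricane example can be sketched as follows. The report layout, the function names, and the "latest report wins" collation rule are assumptions made for illustration; the source says only that data is extracted and collated into records, and elsewhere that conflicting records are passed to programs that resolve the conflict.

```python
# Hypothetical weather reports, each indexed by the key terms it mentions.
reports = [
    {"id": 1, "time": 6,
     "facts": {"hurricane": "category 2, offshore", "tornado": "watch issued"}},
    {"id": 2, "time": 12,
     "facts": {"hurricane": "category 3, nearing coast"}},
    {"id": 3, "time": 18,
     "facts": {"hurricane": "category 3, landfall", "tornado": "touched down"}},
]


def direct_retrieve(reports, term):
    """Direct retrieval: return every source document indexed under the term."""
    return [r["id"] for r in reports if term in r["facts"]]


def collate(reports):
    """Collated retrieval: build one record per key term from all the reports.
    Assumed rule: a later report simply supersedes an earlier one."""
    records = {}
    for r in sorted(reports, key=lambda r: r["time"]):
        for term, status in r["facts"].items():
            records[term] = status
    return records


records = collate(reports)
hurricane_docs = direct_retrieve(reports, "hurricane")  # whole reports
tornado_docs = direct_retrieve(reports, "tornado")      # overlapping reports again
tornado_status = records["tornado"]                     # only the current status
```

The direct system hands back overlapping sets of whole reports for the two questions, while the collated system answers each with a single record.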
  • Grems, Mandalay "A survey of languages and systems for information retrieval" pp43-46
          in [ACM] CACM 5(01) January 1962 "Design, Implementation and Application of IR-Oriented Languages," ACM Computer Language Committee on Information Retrieval on 20-21 October 1961 in Princeton, N. J.
  • Goetz, Martin A. "Design and characteristics of a variable-length record sort using new fixed-length record sorting techniques" pp264-267 Abstract: This paper describes the application of several new techniques for sorting fixed-length records to the problem of variable-length record sorting. The techniques have been implemented on a Sylvania 9400 computer system with 32,000 fixed-length words of memory. Specifically, the techniques sequence variable-length records of unrestricted size, produce long initial strings of data, merge strings of data at the power of T - 1, where T is the number of work tapes in a system, and do not restrict the volume of input data.
          in [ACM] CACM 6(05) May 1963 "Proceedings of ACM Sort Symposium, November 29, 30, 1962"
  • Sams, Burnett H. "On the solution of an information retrieval problem" pp289-297
          in [AFIPS JCC 23] Proceedings of the 1963 Spring Joint Computer Conference in Detroit SJCC 1963
  • Waks, David J. "Conversion, reconversion and comparison techniques in variable-length sorting" pp267-272 Abstract: The logic is described for converting highly variable input records into a format that can be easily and efficiently processed by a sorting program. The internal record formats are discussed in relation to (1) their conversion from input formats, (2) their reconversion to output formats, and (3) comparison techniques between internal formats. Extract: ACSI-MATIC sort system
    The ACSI-matic sort/merge offers the user three features not usually found in sort/merges. First, the input records may be of variable length. Second, only the fields the user requires in the output records appear there; he specifies which fields are to appear. Finally, more than one type of output may be produced in one run of the sort; different keys may be used to sequence each output type, and different selected fields may appear in each.
    For example, suppose each record of an input tape contains up to 50 fields (numbered 1 to 50), some of which may be missing in any given input record. The user may request two types of output, two sorts, from the input. One could be sequenced on fields 10, 2 and 20 in that order, with fields 1-15 and 30 appearing in the output records. The other may be sequenced on fields 10, 2 and 15, with all the input fields appearing as output.
    The sort/merge is based on the conversion of the variable-length input records to fixed-length records, the use of these fixed-length records for sorting and, finally, the reconversion of the fixed-length records to the original variable-length format for output. Since the variable-length records are used only initially for input to the sort and finally for output, they are termed external records. The fixed-length records actually used for sorting are termed internal records.

          in [ACM] CACM 6(05) May 1963 "Proceedings of ACM Sort Symposium, November 29, 30, 1962"
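The two-output example in the Waks extract can be sketched in a few lines. This is an illustration under stated assumptions, not the ACSI-MATIC implementation: records are modelled as dictionaries from field number to string, a missing key field is padded with an empty string, and the fixed-length internal record carries a back-pointer to its external record instead of the data itself. The names `to_internal` and `sort_output` are hypothetical.

```python
def to_internal(index, record, keys, missing=""):
    """Convert an external record to a fixed-length internal one:
    one slot per key field (padded when absent) plus a back-pointer."""
    return tuple(record.get(k, missing) for k in keys) + (index,)


def sort_output(records, keys, fields):
    """Produce one output type: sequence on `keys`, keep only `fields`."""
    internal = sorted(to_internal(i, r, keys) for i, r in enumerate(records))
    # Reconversion: back to variable-length form, with only the requested
    # fields that are actually present in each record.
    return [{f: records[rec[-1]][f] for f in fields if f in records[rec[-1]]}
            for rec in internal]


# Variable-length external records; fields 2 and 20 are missing from the third.
records = [
    {1: "c", 2: "b", 10: "x", 15: "1", 20: "2", 30: "p"},
    {1: "a", 2: "b", 10: "x", 15: "9", 20: "1"},
    {1: "b", 10: "w", 15: "5", 40: "z"},
]

# Output type A: sequenced on fields 10, 2, 20; fields 1-15 and 30 appear.
out_a = sort_output(records, keys=(10, 2, 20), fields=list(range(1, 16)) + [30])

# Output type B: sequenced on fields 10, 2, 15; all input fields appear.
out_b = sort_output(records, keys=(10, 2, 15), fields=list(range(1, 51)))
```

Both output types are derived from the same input, but the third key differs, so the two records that tie on fields 10 and 2 come out in opposite orders.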
  • Duggan, M. A. review of Sams 1963 (ACSI-MATIC) Abstract: This paper states that "[the] problem was to formulate a system which would be capable of digesting an input stream of documents in such a manner as to be able to regurgitate selected information in response to interrogations by a number of research analysts." In 1961, the same author stated, in an almost identical paper, that "the problem was to design a system capable of digesting a large volume of daily input in such a manner as to be able to regurgitate selected portions of information in response to interrogations on a real-time basis." (Emphasis supplied).

    The author concludes that he "is indebted to many persons ... [who] have left their imprint upon [this] paper." After a close examination, this reviewer suggests that he is also indebted to his own previous writings, [2] in particular.

    However, treating this paper as terra nova, the following points were noted:

    1) The typical computer was described in explicit hardware terms - serial (sic) magnetic tape, etc. - however the description served no purpose in the body of the paper. In view of its rarity, omission of the name of the specific computer used in the ACSI project (the Sylvania S-9400) is understandable.

    2) The author has met his quota for non-standard terms by utilizing "restrictors" and "extractors" with gusto. The afterthought that "one can permit Boolean functions of extractors" speaks for the depth of this paper.

    3) The explanations of executive system, programming system, program execution time, process time, computer time, parallel processing, or multi-programming have a virgin naiveté about them.

    REFERENCES
    [1] Sams, B. H., "Dynamic Storage Allocation for an Information Retrieval System," presented at an ACM Storage Allocation Symposium, Princeton, N.J., June 23-24, 1961, Comm. ACM 4, 10 (October 1961), 431-435.

    [2] Colilla, R. A., and Sams, B. H., "Information Structures for Processing and Retrieving," presented at an Open Technical Meeting on "Design, Implementation and Application of IR-Oriented Languages," held by the ACM Computer Language Committee on Information Retrieval, Princeton, N.J., October 20-21, 1961, Comm. ACM 5, 1 (January 1962), 11-16.
          in ACM Computing Reviews 5(03) May-June 1964
  • Glaser, E. L., Couleur, J. F., and Oliver, G. A. "System design of a computer for time-sharing applications" pp197-202
          in [AFIPS JCC 28] Proceedings of the 1965 Fall Joint Computer Conference FJCC 1965
  • Randell, B.; Kuehner, C. J. "Dynamic storage allocation systems" pp297-306 Abstract: In many recent computer system designs, hardware facilities have been provided for easing the problems of storage allocation. A method of characterizing dynamic storage allocation systems--according to the functional capabilities provided and the underlying techniques used--is presented. The basic purpose of the paper is to provide a useful perspective from which the utility of various hardware facilities may be assessed.

    A brief survey of storage allocation facilities in several representative computer systems is included as an appendix. Extract: Storage in ACSI-MATIC
    Pioneering work on the concepts of segmentation and the use of predictive information to control storage allocation was done in connection with Project ACSI-MATIC [10]. In this system programs were accompanied by "program descriptions," which could be varied dynamically, and which specified, for example, (i) which storage medium a particular segment was to be in when it was used, and (ii) permissions and restrictions on the overlaying of groups of segments. Storage allocation strategies were then based on the analysis of these descriptions.
          in [ACM] CACM 11(05) (May 1968)
  • Fry, James P.; Sibley, Edgar H. "Evolution of Data-Base Management Systems" Extract: History
    Another early and ambitious development was ACSI-MATIC, sponsored by the US Army in the late fifties. This system was designed by Minker to emphasize effective memory utilization and inferential processing. It could make inferences such as: if John is the son of Adam, and Mary is the sister of John, then Mary is the daughter of Adam. It contributed the first generalized data-retrieval accessing package for a disk-oriented system with batched requests, a dynamic storage algorithm for managing core storage, and the first assembler to use a dynamic storage allocation routine. Because disks were not reliable at that time, the ACSI-MATIC system was never fully implemented. A prototype version was implemented later at RCA (1964).
          in [ACM] ACM Computing Surveys (CSUR) 8(1) March 1976
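The kind of inference Fry and Sibley describe can be shown with a minimal sketch. The triple representation and the single hard-coded rule below are assumptions for illustration only; nothing here reflects how ACSI-MATIC actually stored facts or expressed rules.

```python
def infer_daughters(facts):
    """Assumed rule: if X is the sister of Y and Y is a child (son or daughter)
    of Z, then X is the daughter of Z. Returns only newly derived facts."""
    facts = set(facts)
    derived = set()
    for rel1, x, y in facts:
        if rel1 != "sister":
            continue
        for rel2, child, parent in facts:
            if rel2 in ("son", "daughter") and child == y:
                derived.add(("daughter", x, parent))
    return derived - facts


# Facts as (relation, subject, object) triples, read "subject is the relation of object".
facts = {
    ("son", "John", "Adam"),
    ("sister", "Mary", "John"),
}

new_facts = infer_daughters(facts)
```

Applied to the two stored facts, the rule yields the single derived fact that Mary is the daughter of Adam.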