DEACON (ID:395/dea001)

Direct English Access and CONtrol.

An English-like query system, part of the TEMPO (Technical Military Planning Operation) project at General Electric.

Used ring structures, present in its parent language, KLS.

From Feldman and Gries: "In applied semantics the DEACON project, whose approach was quite novel to linguists, can be looked upon as a straightforward application of TWS techniques (cf. [Nap 67, Col 67]). One can expect to see more interaction between these research areas as linguists attempt to test semantic theories and TWS workers attempt to cope with nonprocedural languages."

TWS = translator writing systems.


Related languages:
  KLS II => DEACON     (written using)
  DEACON => REL English     (influence)

References:
  • Craig, J. A. "Grammatical aspects of a system for natural man-machine communication" RM63TMP-31, TEMPO, General Electric Co., Santa Barbara, Calif., July 1963.
  • Simmons, R. F. "Answering English Questions by Computer - A Survey" SDC Report SP-1536, Santa Monica, Calif., April 1964.
  • Thompson, F. B. "Semantic counterpart of formal grammars" TEMPO, General Electric Co., Santa Barbara, Calif.
  • Thompson, F. B. et al "DEACON breadboard summary" RM64TMP-9, TEMPO, General Electric Co., Santa Barbara, Calif., March 1964.
  • Thompson, Frederick B. "DEACON-Type Query Systems"
          in Orr, William (ed) "Conversational Computing", 1968
  • Thompson, Frederick B. "The Application and Implementation of DEACON-Type Query Systems" Research Report RM 64TMP-11, October 1964
          in Orr, William (ed) "Conversational Computing", 1968
  • Simmons, R. F. "Answering English questions by computer: a survey" p53-70 view details Abstract: Fifteen experimental English language question-answering systems which are programmed and operating are described and reviewed. The systems range from a conversation machines to programs which make sentences about pictures and systems which translate from English into logical calculi. Systems are classified as list-structured data-based, graphic data-based, text-based and inferential. Principles and methods of operations are detailed and discussed.

    It is concluded that the data-base question-answerer has passed from initial research into the early developmental phase. The most difficult and important research questions for the advancement of general-purpose language processors are seen to be concerned with measuring meaning, dealing with ambiguities, translating into formal languages and searching large tree structures.
    Extract: DEACON and KLS II
    The DEACON Breadboard.
    At General Electric's TEMPO, F. Thompson and J. Craig have reported on DEACON -- a data-based question answerer which is part of a man-machine communication system that may eventually allow operators to communicate with each other and with the computer in a subset of natural English. The question answerer is programmed in a special list-processing language, KLS-II, developed at TEMPO for this system. At this writing, many aspects of the natural-language programs have been checked out and the system offers great potential for making inferences as well as for providing explicitly stored data.
    In general, DEACON depends on a list-structured data base. Thompson makes explicit the importance of a well-understood data structure and introduces a principle of equivalence between the word classes of syntactic analysis and the semantic categories of the data base. As a result his programs do not break neatly into a parsing system, a semantic analyzer, and a data processor, although these phases are still distinguishable. His language analysis parses a sentence into the names of lists and the calls to operations to be performed on the lists. These operations are performed immediately and the resulting sublists are tested for their truth value in the last phase of data processing.
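    To make that concrete, here is a minimal sketch in Python of the pipeline just described; the data, names, and helper functions are invented for illustration and are not DEACON's own.

      # Toy stand-in for the list-structured data base.
      DATA = {
          "battalion": [
              {"name": "25TH", "location": "TEXAS"},
              {"name": "100TH", "location": "FT. LEWIS"},
          ],
      }

      def list_named_by(word):
          """Word classes double as semantic categories: a noun names a list."""
          return DATA[word]

      def restrict(items, field, value):
          """An operation invoked during parsing; it is performed immediately."""
          return [x for x in items if x[field] == value]

      # "HAS THE 25TH BATTALION ARRIVED IN TEXAS?" parses, roughly, into list
      # names and operation calls, executed as the parse proceeds:
      sublist = restrict(restrict(list_named_by("battalion"), "name", "25TH"),
                         "location", "TEXAS")
      # Final phase of data processing: test the resulting sublist's truth value.
      print(bool(sublist))  # True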
    The TEMPO system accepts the occurrence of ambiguous analyses but usually these are resolved in terms of the data context of the sentence. Each remaining analysis is dealt with as a separate statement or question. It generalizes to a broad range of data and to a reasonably complex subset of English. The system is self-contained in that it both reads its own data and answers questions. It makes explicit the principles of structure and question analysis which although previously implicit in such systems as Baseball, SAD SAM and the PLM were not then fully conceptualized. It is theoretically important in showing the continuity between syntactic, semantic and symbolic-logical analyses of English in a data base system.
          in [ACM] CACM 8(01) Jan 1965
  • Thompson, Frederick B., et al. "Information systems research: the Deacon project" 65TMP-69, General Electric Co., Oct. 1965
  • Thompson, Frederick et al "Phrase-structure oriented targeting query language" The Deacon Project Vol. 1, RM 65TMP-64, General Electric Co., Santa Barbara, Calif., 1965.
  • Craig, J. A., Berezner, S. C., Carney, H. C., Longyear, C. R., "DEACON: Direct English Access and Control"
          in [AFIPS] Proceedings of the 1966 Fall Joint Computer Conference FJCC 29
  • Simmons, R. F. "Storage and retrieval of aspects of meaning in directed graph structures" view details Extract: Introduction
    Introduction
    Behind the development of every new computer language there lies a set of problems and a set of programming structures with whose aid the problems can be managed. With Fortran the problem was to solve algebraic equations without the need for a great deal of I/O bookkeeping and without concern for detailed computer-word-packing statements. Behind Jovial lay the command-control problem, which customarily dealt with complex data structures and the need to use every bit of computer memory efficiently. IPL grew in response to a need to use associative list structures of nonnumeric symbolic data in a computer. Lisp answered the need for a high-level functional language to handle recursive algebraic and symbolic structures. Comit was the machine translator's approach to handling natural language strings.
    In developing a special concept dictionary for storing and retrieving the meanings of words and phrases of English, the authors have found it desirable to use a complex network whose nodes are interrelated in several ways. Basically, the idea is that of a dictionary of English words in which each word has associated with it word-class information and lists of attribute-value pairs that define various aspects of its meaning and usage in terms of pointers to other words in the dictionary. In a data structure language such as Jovial, in addition to ordinary table structures several additional levels of associative coding are required to give easy access to the data without excessive costs in either space or processing time.
    Because of the many levels of associative linking required the authors decided to use Lisp, at least for early experimental work with the system. Advantages of Lisp extended beyond the ease of producing complex data structures; they also included the simplicity of producing complex, often recursive functions for maintaining and querying the dictionary. An additional advantage is gained in the fact that although Lisp is primarily an interpretive system it does allow for the compiling of a fast-running completed program. The most serious disadvantage of Lisp for our system is that in present versions it is limited to core memory for most uses. This limitation means that a dictionary of the type we are studying could not exceed two or three hundred words.
    Since we are aiming for an eventual vocabulary of from five to fifty thousand words, the limitation to core memory is intolerable. Either an expansion of Lisp will be required or the writing of a special language using auxiliary memory for handling cyclical structures in large complex networks will grow out of our experiments with the conceptual dictionary.
    Extract: The Problem
    The Problem
    The major shortcoming of all existing retrieval systems is their inability to handle anything vaguely resembling the meaning of words taken singly, let alone the meaning of the language strings that they comprise. Synonym dictionaries and thesauri have often been added but have proved but feeble makeshifts offering little improvement over the use of root forms of words alone. To the extent that automatic syntactic analysis has been available it has only emphasized the need for word and phrase meanings.
    Five years of Synthex research toward the development of question-answering systems based on natural language text have confirmed this inability to deal with meanings. In these five years many approaches have been attempted toward representing some aspects related to the meaning of words. Most have been unsuccessful. It was learned early that the use of a synonym dictionary did not greatly improve our understanding of text in response to questions. More recently it was realized that even a well-coded thesaurus was not an answer. At various times attempts were made to save syntactic contexts associated with words as possible representations of meanings; these too, although promising, did not appear to be a reasonable answer. With more recent research, particularly that of Bobrow [1964], Raphael [1964], Quillian [1965], and Thompson [1964], it has become apparent that, in addition to dictionary-type meanings, there is a need for something that can best be characterized as a knowledge of the world (e.g., Cows eat grass. Walls are vertical. Grass doesn't eat., etc.). Without something representing knowledge of the world it can hardly be hoped that a word or sentence can be understood.
    The consequence of this line of thought is the realization that the problem requires the development of a conceptual dictionary that would contain definitional material, associative material, and some representation of knowledge of the world. These three aspects of a word's meaning seem to be the minimum that will allow for enough understanding of English to make question answering even a reasonable probability.
    Extract: The Conceptual Dictionary
    The Conceptual Dictionary
    In a conceptual dictionary each word would be characterized by (a) a set of class memberships, (b) a set of attributes, (c) a set of associations, and (d) a set of active and passive actions.
    The set of class memberships includes statements of the form the/an X(noun) is a/an Y(noun). Thus, "an aardvark is an animal" or "an aardvark is a mammal" are both examples of statements giving rise to class membership characteristics. For many nouns the class membership set is one of the basic definitional aspects of the word.
    Attributes characterizing a word are such that if y is an attribute of x, then "x is y" is a true statement and the string "the y x" is grammatical. Thus if "scaly" is an attribute of "aardvark," then "an aardvark is scaly" is true and "the scaly aardvark" is grammatical. Associates of a word are in a loose part-whole relationship. If x has a y, then y is an associate of x; thus "John has a wallet" and "John has a nose" provide the two associates, "nose" and "wallet," for John.
    The set of actions characterizing a word are derived from the verbs and their complements or objects that appear in context with the word. The two sentences "Natives eat aardvarks" and "Aardvarks eat ants" provide "eaten by natives" and "eat ants" as passive and active actions related to aardvarks.
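    Concretely, one entry of such a dictionary might be sketched as follows; this is a Python illustration with invented field names, not the representation used in the paper.

      dictionary = {
          "aardvark": {
              "classes":    ["animal", "mammal"],      # "an aardvark is an animal"
              "attributes": ["scaly"],                 # "an aardvark is scaly"
              "associates": [],                        # part-whole: "x has a y"
              "actions": {
                  "active":  [("eat", "ant")],         # "Aardvarks eat ants"
                  "passive": [("eat", "native")],      # "Natives eat aardvarks"
              },
          },
      }

      # "Is an aardvark scaly?" then reduces to a membership test, and
      # "the scaly aardvark" is grammatical exactly when the test succeeds:
      print("scaly" in dictionary["aardvark"]["attributes"])  # True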
    The idea underlying this schema is that the meaning of a word can be conceptualized in bands of closeness of relation. The class membership of an object is so closely related as to be an integral part of it or its perception. Attributes and things closely associated with a word are seen as important but less essential, while the actions associating one word and another may range from important to irrelevant.
    Extract: Toward an Operational System
    Toward an Operational System
    The conceptual dictionary briefly described above exists now as an experimental Lisp program in 47K of core memory in the ARPA-SDC Q-32 Time-Sharing system. It is currently limited to a dictionary of 200-300 words and a relatively small set of program functions. What is required of even an early operational system is the ability to handle from 20 to 50 thousand words and a rather large set of functions for dealing with questions of increasing difficulty. Our expectations are that Lisp will be expanded into a system that uses up to four million words of disk to augment core and that we will be able to pay an increased cost of response time in favor of continuing to use Lisp for a large operational system. If that cost is prohibitive it will be necessary to produce a system tailored to the needs of the conceptual dictionary, and that will be able to use auxiliary memory efficiently to deal with a very large network of complexly linked words.
    Although it is our belief that new problems create the need for new languages, it is apparent that existing languages are largely sufficient for our language processing problems, but in many cases, especially among the list-oriented languages, they simply have not geared themselves to the large amounts of data and data processing required in this special field.
    Extract: Acknowledgments
    Acknowledgments. I wish to acknowledge my debt to the twenty or so people who have studied question-answering systems over the past decade. Their work is reviewed elsewhere [Simmons 1965]. Readers knowing the Quillian system will recognize that the result of the author's three years of acquaintance with Quillian was the appropriation wholeheartedly of his ideas insofar as the author was able to understand them. A special debt is also expressed to Fred Thompson for leading the author to an understanding of parsing directly into a data structure and for acquainting him with TEMPO's forthcoming language system of associative cycles. Programming and detail design of the conceptual dictionary described in this paper were accomplished by John Burger.
    Extract: Discussion
    Discussion
    Salton opened the discussion with the comment that a system such as this is inherently not extendable. The system will operate nicely with various kinds of "fish," but will run into trouble when "whales" appear, since the system will not know how to deal with aquatic mammals. He compared the system to the "Baseball" system, in which one cannot go beyond a limited range of questions, e.g., one has trouble if a new team appears. Young objected to this view, saying he believed the two systems were quite different, and that the present system was in principle infinitely extendable and very general. He said he could conceive of a system one level above this one which could deal with generalizations and which would make handling propositions easier than with the present special programming.
    Burger said that work with higher level relationships was planned, but that the immediate extensions would be more trivial, in the way of inserting a great deal of knowledge about the world, of the kind any child has (e.g., "Walls are vertical.").
    Gorn observed that the present system is based upon two forms of the verb "is" and the verb "has," and that more relationships were needed. Burger said the system was not in principle so limited. Gorn then asked how they would deal with the statement "'Word' is a word." Burger had no immediate answer, though he thought they would eventually be able to deal with such cases.
    Responding to a question from Cheydleur, Burger said they had about 50 different functions.
    Mooers asked if there were any inherent features of Lisp which limited their work. Burger said that a fundamental limitation was the limitation to use of core storage alone in the SDC version of Lisp. They could not use disks. The second limitation was the inability to break out individual characters from the "atoms" of Lisp. This prevents an easy and direct way of treating the similarity between "dog" and "dogs." At present this relationship has to be put in as a separate piece of information.
    Mitchell then mentioned that for a very much larger data base a limitation will be the time required to pass the dictionaries through the machine. He said one of the big improvements in speed due to syntax-directed compilers was that they had less need to refer to a very large dictionary. Burger admitted that this was a very important problem, and one which is being considered. What can be done outside of Lisp is being studied, since a problem in using an auxiliary memory with a list-structure system like Lisp is that interrelationships between lists are broken up if just one section is brought from the auxiliary memory. So far, a good solution to the problem has not been found.
    Abstract: An experimental system that uses LISP to make a conceptual dictionary is described. The dictionary associates with each English word the syntactic information, definitional material, and references to the contexts in which it has been used to define other words. Such relations as class inclusion, possession, and active or passive actions are used as definitional material. The resulting structure serves as a powerful vehicle for research on the logic of question answering. Examples of methods of inputting information and answering simple English questions are given. An important conclusion is that, although LISP and other list processing languages are ideally suited for producing complex associative structures, they are inadequate vehicles for language processing on any large scale—at least until they can use auxiliary memory as a continuous extension of core memory.

          in [ACM] CACM 9(03) March 1966 includes proceedings of the ACM Programming Languages and Pragmatics Conference, San Dimas, California, August 1965
  • Feldman, Jerome and Gries, David "Translator writing systems" p77-113 Abstract: A critical review of recent efforts to automate the writing of translators of programming languages is presented. The formal study of syntax and its application to translator writing are discussed in Section II. Various approaches to automating the postsyntactic (semantic) aspects of translator writing are discussed in Section III, and several related topics in Section IV. Extract: Gargoyle, TWS, AMOS
    Other TWSs
    ETC (Garwick [Gar 64], Gilbert [Gil 66] and Pratt [Pra 65])
    In this section we describe three other efforts that are best described as syntax-directed symbol processors. For various reasons, these systems have not had the impact of those discussed above and are presented in less detail.
    The GARGOYLE system [Gar 64] was developed for the Control Data 3600 by Jan Garwick at the Norwegian Defense Establishment. There are reasonably complete descriptions in internal reports, but the published paper [Gar 64], which was written first, gives only a vague picture of GARGOYLE.
    GARGOYLE is like TMG (Section III.A.1) in many ways; the recognizer is top-down with a facility for directing the search. All syntactic and semantic statements are written in a five-entry tabular form: (label) (else) (next) (link) (action). The sequencing rules are quite complex (partly because backup is handled implicitly) and are normally done by a "steering routine." The backup mechanism also requires complicated data handling involving stacking of some variables and copying of others. All five entries are used in multiple ways; e.g. the label may instead be a code controlling how the line is to be interpreted.
    The following example would be used to process an (assignment statement).
    (label)  (else)  (next)  (link)  (action)
    R ASSIGN
    L VAR
    0    VARIABLE    1
    1    NEXT    2    if SYMB = COLEQ then VAR ← WORD
    2    3    ASSIGN    4
    3    EXPR    4
    4    RETURN    OUTTEXT ("STO", 10); OUTFIELD (VAR, 20);
    The first line is the header for routine ASSIGN, whose local variable, VAR, is declared in the second line. If VARIABLE finds a (variable) then the next symbol must be COLEQ (":=") or the routine fails. Line 2 is a recursive call of ASSIGN for treating multiple left sides; the first backup will lead to EXPR (which will compile the right side), and subsequent returns will execute statement 4 once for each variable in the multiple assignment. The (action) in statement 4 produces text for storing into each successive value of VAR.
    The language of the (action) column includes partial word operators and a few fixed routines for table searching, input-output, etc. The GARGOYLE user is expected to embed assembly statements frequently, and the entire (action) language appears similar to the high level machine language of [Wir 66a]. GARGOYLE has been used by its author to help implement a complex language [Gar 66], but its wider use will require a somewhat cleaner design and considerably better publication.
    The TWS of Gilbert and McLellan [Gil 67] is based on some syntactic ideas of Gilbert (Section II.C, [Gil 66]) and an attempt to revive the UNCOL [Ste 61] concept. Source programs are to be translated into a machine-independent language, BASE, which in turn can be translated into many machine languages. The first translation is described in an intermediate language which is a macro assembly language having a few fixed data structure conventions; the second translation is described in a string form like that of TMG. Although there are a number of good ideas in the paper, they are not significantly different from others in this section. Some of the bad ideas, however, are uniquely illustrative.
    The UNCOL notion of a machine-oriented but machine-independent language has always foundered on the diversity of languages and computers. Gilbert and McLellan attempt to avoid this by allowing new operators to be defined in BASE and passed blindly to each BASE-to-machine-language translator. This gives the appearance of machine-independence but leaves untouched the basic problem of which macros to choose. The authors also make a point of the fact that their system is "rigorously based." This presumably encouraged them to use the set of strings "AmB-AmB-CCC" as the programming language example illustrating all aspects of their system. Finally, in a classic example of misplaced rigor, they exclude infinite loops from their system by not permitting loops and go to statements.
    The only reference in this paper [Gil 67] to other TWS literature is [Ir 61].
    The AMOS system developed by Pratt and Lindsay [Pra 65, Pra 66] is a direct application of list-processing ideas to TWSs. The source language is translated into an intermediate language (I-language) which is interpreted by AMOS. The I-language is a simple list processing language with some string processing operations and crude arithmetic; e.g. "*ADD* P1, P2, P3" means "add P1 to P2 and put the result in P3." The I-language was designed to be minimal and to be expanded in macro fashion. The syntax is treated by a variant of production language (Section II.B.5) relying heavily on recursive calls; the semantics is written in I-language. The most interesting feature of AMOS is the attempt to provide for the translation of data structures as well as programs. AMOS has had some minor successes in handling list structures, but the problem deserves much more attention (cf. Section IV.C).
    Extract: Deacon
    The field of computational linguistics in both its theoretical and practical aspects is closely related to TWS studies. The applications here, though fewer than one would expect, have been significant. The syntactic theories of computational linguistics and TWSs both are based on the early work of Chomsky [Chom 63] and share many ideas. The implementations of English syntax (especially [Kun 62]) developed concurrently with top-down TWSs, but the natural language efforts have been slow to incorporate the efficiency improvements developed in TWS work. In applied semantics the DEACON project [Th 66], whose approach was quite novel to linguists, can be looked upon as a straightforward application of TWS techniques (cf. [Nap 67, Col 67]). One can expect to see more interaction between these research areas as linguists attempt to test semantic theories and TWS workers attempt to cope with nonprocedural languages.
          in [ACM] CACM 11(02) (February 1968)
  • Barter, C.J. "Data Structure and Question Answering"
          in Kaneff, S. (ed) Picture Language Machines: Proceedings of a Conference held at the Australian National University, Canberra on 24-28 February, 1969
  • Sammet, Jean E. "Computer Languages - Principles and History" Englewood Cliffs, N.J. Prentice-Hall 1969. p.668. Extract: DEACON
    DEACON (Direct English Access and CONtrol) has been under development at GE TEMPO since at least 1963. Some of the allowable statements in DEACON strongly resemble those in the systems discussed earlier, but that is because they are related to military subjects rather than because of their structure, which is far more general. In systems such as COLINGO and the 473L Query Language, the languages are definitely artificial with limited or no syntactic flexibility, whereas DEACON actually uses linguistic techniques to analyze sentences which are far more complex, e.g.,

    HAS THE 25TH BATTALION ARRIVED IN TEXAS SINCE 3 P.M.? IS THE 100TH SCHEDULED TO ARRIVE AT FT. LEWIS BEFORE THE 200TH LEAVES FT. LEWIS?

    The data is stored in ring-type list structures. It includes time-dependent data. "Thompson hypothesizes that English essentially becomes a formal language as defined if its subject matter is limited to 'material whose interrelationships are specifiable in a limited number of precisely structured categories [memory structures].'... Because these programs are written in terms of structural categories (independent of content), the interpretation rules apply to any subject matter that is stored in these categories." {13} The authors state in a footnote that "This is the major advance of DEACON over Green's BASEBALL... and Lindsay's SAD SAM." {14} Work on this project seems either dormant or terminated.


  • Woods, William A. "Context-sensitive parsing" pp437-445 Extract: Comparison with the DEACON Algorithm
    Comparison with the DEACON Parsing Algorithm
    At this point, the similarity between this algorithm and that of the DEACON system [2, pp. A7-A15] should be pointed out. The DEACON algorithm, like ours, uses notions of parser level and the level of a reduction, and performs a reduction only if these two levels are the same. In DEACON terminology, an intermediate string is a subgoal and each of the symbols in the string is a phrase. Associated with each phrase of a subgoal is a number called the "level of the phrase." The input string is called the "initial subgoal," and each of its phrases has level 0. The "level of a subgoal" is defined to be the maximum of the levels of its phrases, and the parser also keeps a level at which it operates. For a given parser level l, a reduction is performed only if the maximum of the levels of all phrases involved is equal to l, and if it is performed, then the resulting constitute is given level l + 1. Thus DEACON's notions of parser level and the level of a reduction are very similar to ours, except that only one level is kept for each phrase, and no distinction is made between phrases used for context and phrases used as constituents when calculating the level of a reduction. The DEACON algorithm, however, uses the following method of enumerating subgoals, which they describe as "working down a column." Given a table, the rows of which comprise all the subgoals with a given level l (initially just the initial subgoal with level 0), the algorithm compares rules against subgoals beginning with the first position in the first row. Any successful reductions result in subgoals having level l + 1 which are added as new rows at the bottom of the table. The algorithm then checks rules at the first position of the second row and continues checking rules against the first position of each row (i.e. working down the first column) until it reaches the bottom of the table. It then begins with the second position of the first row and works down the second column and so on, until it has completely processed the last column. At this point, all of the subgoals having level l + 1 have been enumerated. The level l subgoals are then removed from the table, the parser level is raised to l + 1, and the process is repeated until at some level no new subgoals are formed. This enumeration procedure corresponds to the enumeration of choices for our nondeterministic parser in the following way.

    When a reduction is performed in some column k on a subgoal at parser level l, the subsequent processing of this subgoal at parser level l corresponds to the nondeterministic parser's choice of not performing the reduction. The processing of the resulting level l + 1 subgoal which is done at parser level l beginning at column k corresponds to the choice of performing the reduction and continuing to scan at the same parser level. Finally, the processing which this resulting level l + 1 subgoal will receive at parser level l + 1 beginning in column 1 corresponds to the choice of performing the reduction and raising the parser level. Thus DEACON's method of enumeration by "working down a column" performs the same enumeration as our cycling mechanism and requires virtually the same amount of computation. Our cycling mechanism, however, has a tremendous space-saving advantage over DEACON's enumeration procedure. The DEACON procedure requires in its table enough space to remember all of the subgoals for two successive levels. Furthermore, in order to reconstruct the structure of successful analyses, it needs to retain all of the previous subgoals on an output tape or some such medium, and it would require an output editor to scan this tape and pick out the subgoals which make up a given analysis. Our algorithm, on the other hand, stores in its pushdown stack nothing more than the derivational history of the current working string, and whenever a successful analysis is found, the pushdown stack contains exactly the information which is desired as output, in a form ready to print out immediately with no need of an elaborate output editor. Thus, all that need be remembered by our algorithm at any given moment is the derivational history of a single analysis.
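    In code, the "working down a column" enumeration might be sketched as follows; this is a Python simplification under my own data structures (a phrase as a (symbol, level) pair, a rule as (lhs, rhs)), not the DEACON or Woods implementation.

      def reductions(row, col, rules, level):
          """Reductions at this position whose phrases' maximum level equals
          the parser level; the resulting constitute is given level + 1."""
          out = []
          for lhs, rhs in rules:                     # a rule rewrites rhs -> lhs
              window = row[col:col + len(rhs)]
              if (len(window) == len(rhs)
                      and [s for s, _ in window] == list(rhs)
                      and max(lv for _, lv in window) == level):
                  out.append(row[:col] + [(lhs, level + 1)] + row[col + len(rhs):])
          return out

      def enumerate_subgoals(initial, rules):
          level = 0
          rows = [[(sym, 0) for sym in initial]]     # the initial subgoal, level 0
          while True:
              first_new = len(rows)
              col = 0
              while col < max(len(r) for r in rows):
                  i = 0
                  while i < len(rows):               # work down the column; new
                      for r in reductions(rows[i], col, rules, level):
                          if r not in rows:
                              rows.append(r)         # rows join at the bottom
                      i += 1
                  col += 1
              if first_new == len(rows):             # no level l+1 subgoals formed
                  return rows
              rows = rows[first_new:]                # drop the old level's subgoals
              level += 1

      # e.g. enumerate_subgoals("ab", [("A", ("a",)), ("B", ("b",)), ("S", ("A", "B"))])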

    There is a more serious inadequacy in the DEACON algorithm, however, which arises from carrying only one level for each phrase instead of two. Recall that in our algorithm, when a symbol is used as context for a reduction at level l, its maturation level is raised to l + 1 while its birthday is unchanged. Reference [2] fails to mention what the DEACON algorithm does to the level of a phrase used as context. In the example given in [2, p. A-12] the levels of the phrases used as context are left unchanged, but I have been told that their algorithm actually raises these levels to l + 1 when a reduction is performed at level l. In the former case, the levels of the phrases would be the same as the birthdays of the symbols as defined for our algorithm, while in the latter case, they would be the same as our maturation levels. Both alternatives are inadequate, since in the first case the algorithm may miss some analyses, as in Figure 4, while in the second case it may find some analyses more than once, as in Figure 5.

    In the situation represented in Figure 4, the structure diagram (b) is the only legitimate analysis for the grammar (a). The subgoals shown in (c) are generated as follows. At parser level 0, subgoal 2 is formed from subgoal 1 by rule 4 at position 1, and subgoals 3 and 4 are formed from 1 and 2 by rule 2 at position 2. Then at level 1, rule 3 applies to subgoal 2 at position 1 yielding subgoal 5. The reductions of a0 to A in subgoal 3 and c0 to C in subgoal 2 are forbidden by the level restriction. (They were already done at level 0.) Thus there are no more reductions at level 1 and the parser level is raised to 2. Now, however, the reduction of c0 in subgoal 5 to C, which is needed to continue the analysis, is blocked by the level restriction. No more subgoals are generated and the analysis of Figure 4(b) is not discovered. Raising the level of the symbol c when it was used as context by rule 3 at parser level 1 would avoid this problem, but results in the possibility of finding the same analysis several ways (and in the worst cases, in many ways). [...] It appears, then, that the use of a single level for each symbol is not sufficient, and it is necessary to carry both a level to govern the use of the symbol as context and a level to govern its use as a constituent.
          in CACM 13(7) July 1970 view details
  • Stock, Marylene and Stock, Karl F. "Bibliography of Programming Languages: Books, User Manuals and Articles from PLANKALKUL to PL/I" Verlag Dokumentation, Pullach/Munchen 1973. p174 Abstract: PREFACE AND INTRODUCTION
    The exact number of all the programming languages still in use, and those which are no longer used, is unknown. Zemanek calls the abundance of programming languages and their many dialects a "language Babel". When a new programming language is developed, only its name is known at first and it takes a while before publications about it appear. For some languages, the only relevant literature stays inside the individual companies; some are reported on in papers and magazines; and only a few, such as ALGOL, BASIC, COBOL, FORTRAN, and PL/1, become known to a wider public through various text- and handbooks. The situation surrounding the application of these languages in many computer centers is a similar one.

    There are differing opinions on the concept "programming languages". What is called a programming language by some may be termed a program, a processor, or a generator by others. Since there are no sharp borderlines in the field of programming languages, works were considered here which deal with machine languages, assemblers, autocoders, syntax and compilers, processors and generators, as well as with general higher programming languages.

    The bibliography contains some 2,700 titles of books, magazines and essays for around 300 programming languages. However, as shown by the "Overview of Existing Programming Languages", there are more than 300 such languages. The "Overview" lists a total of 676 programming languages, but this is certainly incomplete. One author has already announced the "next 700 programming languages"; it is to be hoped the many users may be spared such a great variety for reasons of compatibility. The graphic representations (illustrations 1 & 2) show the development and proportion of the most widely-used programming languages, as measured by the number of publications listed here and by the number of computer manufacturers and software firms who have implemented the language in question. The illustrations show FORTRAN to be in the lead at the present time. PL/1 is advancing rapidly, although PL/1 compilers are not yet seen very often outside of IBM.

    Some experts believe PL/1 will replace even the widely-used languages such as FORTRAN, COBOL, and ALGOL. If this does occur, it will surely take some time - as shown by the chronological diagram (illustration 2).

    It would be desirable from the user's point of view to reduce this language confusion down to the most advantageous languages. Those languages still maintained should incorporate the special facets and advantages of the otherwise superfluous languages. Obviously such demands are not in the interests of computer production firms, especially when one considers that a FORTRAN program can be executed on nearly all third-generation computers.

    The titles in this bibliography are organized alphabetically according to programming language, and within a language chronologically and again alphabetically within a given year. Preceding the first programming language in the alphabet, literature is listed on several languages, as are general papers on programming languages and on the theory of formal languages (AAA).
    As far as possible, most of the titles are based on autopsy. However, the bibliographical description of some titles will not satisfy bibliography-documentation demands, since they are based on inaccurate information in various sources. Translation titles whose original titles could not be found through bibliographical research were not included. In view of the fact that many libraries do not have the quoted papers, all magazine essays should have been listed with the volume, the year, issue number and the complete number of pages (e.g. pp. 721-783), so that interlibrary loans could take place with fast reader service. Unfortunately, these data were not always found.

    It is hoped that this bibliography will help the electronic data processing expert, and those who wish to select the appropriate programming language from the many available, to find a way through the language Babel.

    We wish to offer special thanks to Mr. Klaus G. Saur and the staff of Verlag Dokumentation for their publishing work.

    Graz / Austria, May, 1973
  • Grosch, Herbert R.J. "Computer: Bit Slices from a Life" 1991 Extract: from Chapter 54 - An Earthly Paradise
    he introduced me to the Thompson DEACON team and we got down to business.

    In Chapter 22 I described how I had brought Fred Thompson and his crew into GE, supported their model-building and other TEMPO-like work for Fort Huachuca, and left them to Oldfield's tender mercies when I went back to IBM. Fred had changed - well, expanded - his interests, found his way out of the civilian and into the military part of the GE organization, and during that transition had become an important figure in computational linguistics.

    He had a new bunch of disciples, considerable financial support from the Rome Air Development Center [that's Rome in upstate New York], and was operating his DEACON Project in TEMPO. He found Paine and the TEMPO tempo undistinguished, had clashed with the marketing people and John Fisher about how to handle the RADC support, and ended up by resigning contemptuously.

    This sat poorly with the Air Force, which regarded Fred highly and could not conceive of supporting the work without him. The whole project was in his bulging cranium. The computing and linguistic people on his crew thought the world of him. For a while things staggered on; Paine was conciliatory. But Fred had put out feelers during the interregnum, and in the late winter was offered a tenured professorship at Cal Tech.

    There was nothing Fisher or Paine could do to keep him. And when they suggested replacements, they found most of the possibilities in a then rather narrow discipline were either non-citizens, unwilling to come to industry (even in lovely Santa Barbara), or unacceptable to Fred and his gang. Time was running out, and RADC was unhappy.

    Fisher, who was a rather nice if somewhat wimpish guy and sympathetic with the academic atmosphere, asked Fred to suggest someone who would be acceptable to RADC, not make the higher brows on the small DEACON team twitch, and could work in General Electric. Fred said he knew just the man!!

    They ran me down via the ACM staff. Neither Fred nor John were members, but Fred had stayed in touch with Mike Seven, who had landed a nice academic job in Pomona, and Mike was wired in via a student chapter; it was easy. TEMPO still had a switchboard, and lots of staff with languages. Presto!! Villa Loup.

    DEACON stood for Direct English Access and CONtrol, Fisher had explained, and was a mixture of natural-language research and an implementation of Fred's Piaget-influenced ideas about learning and about semantics. There was a small series of internal reports, which I found very hard to understand, since they were written in a private jargon, leavened with paragraphs about linguistics and systems programming. A year later the team put out a major publication in an AFIPS Proceedings, and this has survived in the citations world of Academe. "Nothing beside remained ...", as Shelley said.
    Extract: more from Chapter 54 - An Earthly Paradise
    I was introduced to the small group, who had been warned by Fred of my astronomical origin and my engineering career. The computing side of the house, which ran a dedicated GE625 with a non-GE disk stack cloned from the IBM RAMAC, and used fairly primitive Teletype gear for most I/O, found it acceptable; the language experts were startled. System loads were card decks, but the linguistic entries were hunt-and-peck TTY.

    This meant that later on, when I was demonstrating our capabilities at RADC and in the Pentagon, I could sashay up to any standard teletype and sign in in the Balboa Building on the main drag of Santa Barbara. Crude by 1998 networking standards, but sophisticated in 1965.

    I'll come back to the people later. I have to explain a little first about the "Two vast and trunkless legs of stone ...": what remains of the DEACON effort. The intention to use natural language - fairly unrestricted English - was foredoomed, and I was dubious from the beginning. Hundreds of very clever and mostly well-intentioned researchers have struggled on four continents, from the first years of the Sixties to today, and are only slightly closer to the goal. I have written many times that yes, we will some day talk to our computer as Arthur Clarke's heroes talked to HAL. But it may take centuries.

    Understand that this is not voice input; DEACON was content with keystrokes. I have had a dictation program for six months in 1997/98, and am startled and fascinated by its rather considerable capabilities and its obvious promise. The Santa Barbara attempt was to parse simple English sentences input by teletype and understand the structure, permitting hundreds of simple variations in order and tense and gender and on and on. The group had done well with tense and number but had not yet tackled negation!

    The second leg was the semantic one: what is the meaning of the phrase/sentence/paragraph after you have its structure deciphered? And here I think Fred was on solid ground, and as we look at the PC operating systems and the database constructs of today we see the embodiment of his idea: that each person or team using an inquiry system (or whatever) has his or their private set of meanings for words or strings of words, and by extension even of numerical data.

    DEACON, I came to tell audiences, was to be an electronic backpack storing the world and the world-view of the person owning it. Today we have laptops that offer that sort of personalization for two or three thousand dollars; in the mid-Sixties a team of CIA analysts would need a major chunk of a million- dollar mainframe.

    The dictionaries and linkages were stored in ring structures much like the lists in LISP, which was a slow tool and required a lot of disk space. My demo sentence at RADC was, "How many red ships are in Boston harbor?" The structure analysis handled "how many" and which harbor, the semantic lists chose between red equals hull color and red equals communist flag.

    Incidentally, when I did everything right and the TTY and its line and the 625 back in TEMPO were all happy, DEACON would reply, "if red equals hull color 4/ if red equals communist 2". What Paine and Fisher wanted me to disguise was how much special parsing and how many linguistic "rules" had been tinkered up to generate that rather glum reply.
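    A toy reconstruction of that behaviour, with entirely invented data and names (this is not DEACON code), might look like this in Python:

      SHIPS = [
          {"name": "K-12",    "harbor": "BOSTON", "hull": "red", "flag": "communist"},
          {"name": "LIBERTY", "harbor": "BOSTON", "hull": "red", "flag": "us"},
      ]

      # The semantic lists offer two readings of the ambiguous word "red".
      READINGS = {"red": [("hull", "red"), ("flag", "communist")]}

      def how_many(word, harbor):
          for field, value in READINGS[word]:        # one answer per reading
              n = sum(1 for s in SHIPS
                      if s["harbor"] == harbor and s[field] == value)
              print(f"if {word} equals {field} {n}")

      how_many("red", "BOSTON")
      # -> if red equals hull 2
      # -> if red equals flag 1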

    It wasn't crooked: the team, and Fred while he was in charge, would never never have put in a prescribed answer or otherwise faked a response. If we changed the ship-movement database the answers changed from 4 and 2 to new numbers, quite honestly. If we did "airport" or "Chicago" or "planes" (and put those words in the rather sparse vocabulary), the parsing still worked. If we did "Chicago harbor" but left the rest of the demo query unchanged, DEACON would coolly inform us Chicago was not a harbor (of course it really is, for Great Lakes shipping).

    What Fred and his team, and their intellectual counterparts in Academe, had glossed over - although you can be sure Paine saw it clearly - was how specialized the market would be even for a highly perfected DEACON. In the Sixties or the Seventies it could only be applied to terrifically valuable inquiries: those of a POTUS, or a Tom Watson (both of whom would be too impatient to use it). A team of intelligence analysts, say, at NSA or deep in the bowels of the Pentagon, might be doing work so important they could afford it; hence, RADC. But you had to strain a little!

    DEACON would never forget a fact, no matter how old or obscure. It would never overlook a connection it was capable of making. It was available every minute of the year, even during Bowl games. Speeds a thousand times greater, data memory a thousand times larger, were possible, with no limit in sight. But I had guessed, even during Thompson's hypnotic presentations, that even quite modest success would be superhumanly difficult.

    Parenthetically, there are projects today that hope to input enough of, um, the universal context to mimic common sense. They have enormously increased speed, incredible storage capacity - and hundreds of times as many workers as DEACON had. I wish them well. Perhaps when the next millennium approaches ...

    Extract: more from Chapter 55 DEACON LEADS TO D.C.
    While the painful personal struggles kept me on tenterhooks, DEACON was evolving. Gwynn made considerable progress not only in massaging the ring processor, but in describing it in TEMPO reports for our sponsors back East. Longyear was beginning to wire me in to the computational linguistics community, whose special skills were still beyond me but whose jargon I had begun to acquire. Craig and the youngsters were adding grammar rules at a fair rate. And on the managerial front, I was beginning to settle in.

    This, in two senses. Out in the world, I was learning to work with grant money, an unfamiliar venue, and with the key people who supervised its expenditure, At RADC this was a Col. Florence Casey (leaf, as I remember; not eagle); she came out to see us and mutter at our slow progress.

    On occasion I would go back east, sometimes to visit RADC but also to racket around the DDRE precinct of the Pentagon, which was a couple of levels up in the food chain. There I dealt mostly with a vigorous and less pleasant type named Dr. Ruth Davis, who claimed unlike Casey to understand what Thompson had promised to do. I resisted gently, since her view - in my view - would take decades to accomplish, and most of her rather large budget.

    One of my genuine skills, honed in the jet engine years, was walk-through. I could pretty well sense what was going well and what was foundering, as I talked to my youngsters and their clients. Out in the shop, not in my office - and I couldn't do it very well with strangers.

    As I got to know and respect the DEACON bunch I began to regain this ability. It told me that we were only at the beginning of the beginning of the task: notably, it had me note that whenever a new "rule" was implemented (several a day, but not dozens or hundreds), many and sometimes very many of the previously-working rules had to be, ah, upgraded. "That's the way language works", said Chris, and I agreed. But as Casey pointed out, it looked like many thousands of rules would be needed to parse even a small subset of English.

    I shared my misgivings with the crew, gently, and with Fisher.