LINGOL(ID:617/lin010)


for LINguistics Oriented Language

Vaughan Pratt MIT 1973

Natural language processing language


People:
Related languages
LINGOL => NLINGOL   Evolution of

References:
  • Pratt, V.R. "A Linguistics Oriented Programming Language" view details
          in Proceedings of the Third International Joint Conference on Artificial Intelligence IJCAI-73, Stanford, CA: Stanford University 1973 view details
  • Pratt, Vaughn "Top downoperator precedence" pp41-51 view details DOI Extract: Survey of the Problem Domain
    Survey of the Problem Domain.
    There is little agreement on the extent to which syntax should be a consideration in the design and implementation of programming languages. At one extreme, it is considered vital, and one may go to any lengths [Van Wijngaarden 1969, McKeeman 1970] to provide adequate syntactic capabilities. The other extreme is the spartan denial of a need for a rich syntax [Minsky 1970]. In between, we find some language implementers willing to incorporate as much syntax as possible provided they do not have to work hard at it [Wirth 1971].
    In this paper we present what should be a satisfactory compromise for a respectably large proportion of language designers and implementers. We have in mind particularly
    (i) those who want to write translators and interpreters (soft, firm or hardwired) for new or extant languages without having to acquire a large system to reduce the labor, and
    (ii) those who need a convenient yet efficient language extension mechanism accessible to the language user.
    The approach described below is very simple to understand, trivial to implement, easy to use, extremely efficient in practice if not in theory, yet flexible enough to meet most reasonable syntactic needs of users in both categories (i) and (ii) above. (What is ?reasonable? is addressed in more detail below) , Moreover, it deals nicely with error detection. One may wonder why such an ?obviously? utopian approach has not been generally adopted already. I suspect the root cause of this kind of oversight is our universal preoccupation with BNF grammars and their various offspring: type 1 [Chomsky 1959], indexed [Aho 1968], macro [Fischer 1968], LR(k) [Knuth 1965], and LL(k) [Lewis 1968] grammars, to name a few of the more prominent ones , together with their related automata and a large body of theorems. I am personally enamored of automata theory per se, but I am not impressed with the extent to which it has so far been successfully applied to the writing of compilers or interpreters. Nor do I see a particularly promising future in this direction. Rather, I see automata theory as holding back the development of ideas valuable to language design that are not visibly in the domain of automata theory.
    Users of BNF grammars encounter difficulties when trying to reconcile the conflicting goals of practical generality (coping simultaneously with symbol tables,  data types and their inter-relations, resolution of ambiguity, unpredictable demands  by the BNF user, top-down semantics, etc.)  and theoretical efficiency (the guarantee  that any translator using a given technique  will run in linear time and reasonable space,  regardless of the particular grammar used).  BNF grammars alone do not deal adequately  with either of these issues, and so they  are stretched in some directions to increase  generality and shrunk in others to improve  efficiency. Both of these operations tend  to increase the size of the implementation  "life-support" system, that is, the software needed to pre-process grammars and to  supervise the execution of the resulting  translator. This makes these methods  correspondingly less accessible and less  pleasant to use. Also , the stretching  operation is invariably done gingerly,  dealing only with those issues that have  been anticipated, leaving no room for unexpected needs.  I am thinking here particularly of the  work of Lewis and Stearns and their colleames  on LL(k) grammars, table grammars, and attributed translations. Their approach, while  retaining the precision characteristic of  the mathematical sciences (which is unusual  in what is really a computer-engineering  and human-engineering problem) , is tempered  with a sensitivity to the needs of translator writers that makes it perhaps the most  promising of the automata-theoretic  approaches . To demonstrate its practicality,  they have embodied their theory in an  efficient Algol compiler.
    A number of down-to-earth issues are not satisfactorily addressed by their system  deficiencies which we propose to make up  in the approach below; they are as follows.
    (i) From the point of view of the language designer, implementer or extender, writing an LL(k) grammar, and keeping it  LL(k) after extending it, seems to be a  black art, whose main redeeming feature is  that the life-support system can at least  localize the problems with a given grammar.  It would seem preferable, where possible,  to make it easier for the user to write  acceptable grammars on the first try, a  property of the approach to be presented  here.
    (ii) There is no "escape clause" for dealing with non-standard syntactic problems (e.g. Fortran format statements),  The procedural approach of this paper makes  it possible for the user to deal with  difficult problems in the same language  he uses for routine tasks.
    (iii) The life-support system must be up, running and debugged on the user's computer before he can start to take advantage  of the technique. This may take more effort  than is justifiable for one-shot applications.  lie suggest an approach that requires only  a few lines of code for supporting software.
    (it ) Lewis and Stearns consider only translators, in the context of their LL(k)  system; it remains to be determined how  effectively they can deal with interpreters.  The approach below is ideally suited for  interpreters, whether written in software,  firmware or hardware.
          in [ACM SIGACT-SIGPLAN] Proceedings of the ACM Symposium on Principles of Programming Languages, Boston, October 1973. Association for Computing Machinery. view details
  • Vaughan Pratt "A Linguistics Oriented Programming Language", MIT AI Memo 277 February 1973 view details External link: Online copy at MIT Abstract: A programming language for natural language processing programs is described. Examples of the output of programs written using it are given. The reasons for various design decisions are discussed. An actual session with the system is presented, in which a small fragment of an English-to-French translator is developed. Some of the limitations of the system are discussed, along with plans for further development.
          in [ACM SIGACT-SIGPLAN] Proceedings of the ACM Symposium on Principles of Programming Languages, Boston, October 1973. Association for Computing Machinery. view details
  • Vaughan Pratt "LINGOL - A Progress Report" pp422-428 view details External link: Online copy at MIT Abstract: LINGOL is a linguistics oriented programming language. We describe briefly work in progress on a large-scale LINGOL program designed to demonstrate the value of LINGOL as a research tool in the writing of natural language programs. Attention is then focused on the parsing algorithm used in LINGOL. It is shown that for the class of interpretive context-free parsers that do no backing up or lookahead, the algorithm is optimal with respect to discovery of phrases in that no phrase is discovered (or "state" built) that cannot be used in some continuation of the input seen so far. This constitutes an improvement in space and time of up to a factor proportional to the size of the grammar over previously known general context-free parsing algorithms. For small grammars of English, a factor of five has been measured. It is shown that the parsing algorithm can accept context-dependent "advice" in such a way as to facilitate the writing of "intelligent" grammars. Finally, the role of syntax in contributing to efficient parsing is discussed.
          in IJCAI-75 Proceedings of the 4th International Joint Conference on Artificial Intelligence, 1975 Tbilisi, USSR view details
  • Baker, Henry G. "Parsing with infinite lookahead" Usenet posting From comp.compilers Mon, 28 Feb 1994 view details External link: online copy Extract: Background
    Vaughan Pratt did another Lisp-based CF parser called Lingol while at MIT
    in the 1970's. Lingol is a full CF parser with additional hints and
    extensions on the grammar rules. Unlike Woods's parser, Lingol is
    bottom-up, and generates _all_ parses in O(n^3) time and space. For
    ambiguous sentences, you get a fully factored parse tree in which the
    potentially infinite number of different parses are stored in cubic space.
    Lingol uses the Lisp 'hints' to decide which of the ambiguous parses are
    to be selected. Lingol uses no backup. I extended Lingol to handle
    ambiguity in the word morphology level, which is required if there is
    noise (e.g., misspellings) in the input. (I still have a version of
    Lingol for Common Lisp if anyone is interested. I'll have to check with
    Vaughan re copyright, however.)

          in IJCAI-75 Proceedings of the 4th International Joint Conference on Artificial Intelligence, 1975 Tbilisi, USSR view details