REL English (ID:693/rel005)


Rapidly Extensible Language, English. A formal language of the REL system, based on natural English structure and represented by ring ADTs.


Structures: ring
Related languages
DEACON => REL English   Influence
REL => REL English   Written using

References:
  • Thompson, F. B., "English for the Computer" pp349-356.
          in [AFIPS] Proceedings of the 1966 Fall Joint Computer Conference FJCC 29
  • Dostert, Bozena; Thompson, Frederick B. "A Rapidly Extensible Language System (REL English)" Abstract: 1. REL English in Terms of Modern Linguistics
    REL, a Rapidly Extensible Language System, is an integrated information system operating in conversational interaction with the computer. It is intended for work with large or small data bases by means of highly individualized languages. The architecture of REL is based on theoretical assumptions about human information dynamics [1], among them the expanding process of conceptualization in working with data, and the idiosyncratic language use of the individual workers. The result of these assumptions is a system which allows the construction of highly individualized languages which are closely knit with the structure of the data and which can be rapidly extended and augmented with new concepts and structures through a facile definitional capability. The REL language processor is designed to accommodate a variety of languages whose structural characteristics may be considerably divergent. The REL English is one of the languages within the REL system. It is intended to facilitate sophisticated work with computers without the need for mastering programming languages.

    The structural power of REL English matches the extremely flexible organization of data in ring forms. Extensions of the basic REL English language can be achieved either through defining new concepts and structures in terms of the existing ones or through the addition of new rules.
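    To make the definitional route concrete, the following is a deliberately toy sketch in Python (the define/expand interface and the example definition are invented for illustration, not REL's actual facility) of what "defining new concepts in terms of the existing ones" amounts to: a user-introduced term is rewritten into base-language vocabulary before the query is interpreted.

      # Toy sketch only: a user-introduced term is rewritten into existing
      # base-language vocabulary before the query is interpreted.  The
      # define()/expand() interface is hypothetical, not REL's.
      class ToyLanguage:
          def __init__(self):
              # definitions map a new term to a phrase built from existing terms
              self.definitions = {}

          def define(self, term, expansion):
              """Extend the language: 'term' now abbreviates 'expansion'."""
              self.definitions[term] = expansion

          def expand(self, query):
              """Rewrite a query until no defined term remains (naive; no cycle check)."""
              changed = True
              while changed:
                  changed = False
                  for term, expansion in self.definitions.items():
                      if term in query:
                          query = query.replace(term, expansion)
                          changed = True
              return query

      lang = ToyLanguage()
      # a new concept expressed with vocabulary the base language already has
      lang.define("teenager", "person whose age is greater than 12 and less than 20")
      print(lang.expand("list every teenager"))
      # -> list every person whose age is greater than 12 and less than 20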

    The REL dialect and idiolects
    English is our primary mode of verbal communication, and therefore everyone has the right to know what someone else means by it. We use the term "English" in its most ordinary sense, i.e. we bear in mind the fact that there really is no one English language. Rather, the term English refers to as many idiolects as there are speakers, these idiolects being grouped into dialects. The REL English is one such dialect. It shares with natural language also the characteristic of being, in its design and functioning, a conglomerate of idiolects, which we call versions.

    Thompson's design philosophy of REL [2] defines the theoretical basis for the assumption of an individual, idiolectal approach to the use of information.

    REL English as a formal language
    The second basic characteristic is that REL English is a formal language. The characteristics of English as a formal language are discussed in an earlier paper [3]. The central thesis of that paper is that English becomes a formal language when the subject matter which it talks about is limited to material whose interrelationships are specifiable in a limited number of precisely structured categories. It is the type of structuration of the subject matter and not the nature of the subject matter itself that produces the necessary limitations. Natural language encompasses a multitude of formal languages, and it is the complexities of the memory structures on which natural language can and does operate that account for the complexities, flexibility and richness of natural language. These latter give rise to the notorious problem of ambiguities in natural language analysis.

    Ambiguities
    What about ambiguities in REL English? The purpose of REL English grammar is to provide a language facilitating work with computers. It is thus assumed that the language is used for a specific purpose in a specific context. Allowance for ambiguities at the phrase level, with subsequent disambiguation through context, is a powerful mechanism in a language. It is this aspect of ambiguity we wish to include.

    Ambiguities, in the general case and in our case, are due to different semantic interpretations (data structuration) arising from different deep structures. Ambiguous constructions are of two main types: (1) those which are structurally ambiguous, e.g., "Boston ships" is ambiguous over all relations existing between "Boston" and "ships" (built in Boston, with home port in Boston, etc.); and (2) those which are semantically ambiguous, e.g., "location of King" if "King" can refer both to Captain King and the destroyer King in the data elements. Ambiguities of the first type can be resolved by the specification of the relation, those of the second type by inclusion of larger context. Chomsky's well-known example of an ambiguous sentence "Flying planes can be dangerous" is of the first type; Katz and Fodor's "bachelor" is of the second type. The purpose of REL sentence analysis is not to find all possible interpretations of ambiguous sentences irrespective of context. Rather, the purpose is maximal disambiguation where such disambiguation is possible in terms of semantic interpretation, providing for the preservation of ambiguities present in memory structures if the syntactic form of the query is ambiguous.
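    As a rough illustration of this disambiguation policy (the data, the relation name, and the helper functions below are hypothetical, invented only for the sketch), a query interpreter can generate every reading of an ambiguous referent, discard the readings the data base cannot support, and preserve the ambiguity when more than one reading survives:

      # Sketch of the disambiguation policy: enumerate the readings of an
      # ambiguous referent, keep only readings the data base supports, and
      # preserve the ambiguity when several survive.  Data are hypothetical.
      data = {
          ("Captain King", "location"): "Norfolk",
          ("USS King", "location"): "San Diego",
      }

      def readings_of(phrase):
          """Candidate interpretations of an ambiguous referent (type-2 ambiguity)."""
          if phrase == "King":
              return ["Captain King", "USS King"]
          return [phrase]

      def interpret(referent, relation):
          surviving = []
          for reading in readings_of(referent):
              value = data.get((reading, relation))
              if value is not None:          # reading is supported by the data
                  surviving.append((reading, value))
          return surviving

      # Both readings of "location of King" are supported, so both are kept.
      print(interpret("King", "location"))
      # -> [('Captain King', 'Norfolk'), ('USS King', 'San Diego')]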

    Nature of restrictions
    How does our English compare with English as discussed by modern linguists? On the level of surface structure, they are essentially the same. Some more complex transformationally derived strings, such as certain forms of ellipsis, are not handled as yet. However, most of the common forms are treated in a straightforward manner. Although some constructions which can be formed in natural conversational English are not provided in the basic English package, such deficiencies can to a large extent be overcome by the capability for definitional extension provided by the system.

    The level of deep structure presents more problems. As distinct from surface structure, deep structure is that level of syntactic analysis which constitutes the input to semantic analysis, both in Chomsky's [4] terms and ours. What is the nature of this semantic interpretation?

    In the general case, little is known. In our case, as in most types of computer analysis, interpretation is in terms of the internal forms of organization of the data in memory. To the extent that the constituents of deep structure can be directly correlated with corresponding structures in the data, semantic analysis, and therefore sentence analysis, can be carried to completion. It is important to distinguish, in this regard, between two quite distinct though related ways in which language use can be restricted. The first is by the ways in which the data is organized, that is, the structural forms used and the interlinkages which are formed for the manipulation of these structures. This type we will call "structural" restrictions. The second is by restrictions of the subject matter, or the universe of discourse; this we will call "discourse" restrictions. When one restricts the universe of discourse to a body of material which is naturally formal or has been formalized, one often tacitly accepts the structural restrictions thus imposed. To the uninitiated, it may appear that it is the discourse limitations and not the implied structural limitations that make the material amenable to machine analysis. However, it is the establishment of relatability between deep structural constituents and data structural forms, rather than discourse restrictions, that makes computer processing of the semantic component possible. Any content area whose data is organized into these given structural forms can be equally efficiently processed by a system establishing such interrelationships.

    The restrictions on REL English are a function of the first type of restrictions, i.e., structural restrictions. Not all deep structures found in natural English are brought out by our analysis, because constituents of these deep structures do not correspond to structural relations in the organization of our data. For instance, "collections of boys" and "boys' collections" are considered synonymous, although they are not in English. Consider: "At the fair, I saw collections of boys." and "At the fair, I saw boys' collections."

    Finally, a limitation in reverse, as it were, is the fact that we have emphasized the inclusion of grammatical strings rather than attending to strict insistence on grammaticality.
          in International Conference on Computational Linguistics COLING 1969
  • Thompson, F. B.; Lockemann, P. C.; Dostert, B.; Deverill, R. S. "REL: A Rapidly Extensible Language System" Abstract: In the first two sections of this paper we review the design philosophy which gives rise to these features, and sketch the system architecture which reflects them. Within this framework, we have sought to provide languages which are natural for typical users. The third section of this paper outlines one such application language, REL English. The REL system has been implemented at the California Institute of Technology, and will be the conversational system for the Caltech campus this fall. The system hardware consists of an IBM 360/50 computer with 256K bytes of core, a drum, IBM 2314 disks, an IBM 2250 display, and 62 IBM 2741 typewriter consoles distributed around the campus and neighboring colleges. Base languages provided are CITRAN (similar to RAND's JOSS) and REL English. A basic statistical package and a graphics package are also available for building special purpose languages around specific courses and user requirements. Extract: What is REL?
    What is REL?
    REL is an integrated information system designed to facilitate conversational access to a computer. As such it is similar in gross respects to many current developments in time-shared, conversational computer systems. However, in its system architecture it differs from other systems in:
    a) a single language processor which accommodates a variety of user languages;
    b) integrity of user data structures, user language and user database;
    c) tight coupling between the multiprogramming operating system and the single language processor;
    d) rapid, conversational extensibility of user languages at the language processor/operating system level.

    In the first two sections of this paper we review the design philosophy which gives rise to these features, and sketch the system architecture which reflects them. Within this framework, we have sought to provide languages which are natural for typical users. The third section of this paper outlines one such application language, REL English.

    The REL system has been implemented at the California Institute of Technology, and will be the conversational system for the Caltech campus this fall. The system hardware consists of an IBM 360/50 computer with 256K bytes of core, a drum, IBM 2314 disks, an IBM 2250 display, and 62 IBM 2741 typewriter consoles distributed around the campus and neighboring colleges. Base languages provided are CITRAN (similar to RAND's JOSS) and REL English. A basic statistical package and a graphics package are also available for building special purpose languages around specific courses and user requirements.

    Several such special applications are already being prepared: for example, a "theorem prover" based upon the lower predicate calculus and the resolution principle, an animated motion picture capability, and a syntax analyzer which allows grammars to be built from the typewriter and immediately applied to sentence analysis. The most extensive application so far is to a data base gathered by Caltech anthropologist Dr. Thayer Scudder. The data consists of over 100,000 items concerning the Tonga, a people living in Zambia who have made very dramatic cultural advance during the last decade. Dr. Scudder is already beginning the analysis of his data using his extension of REL English.
    Extract: Underlying Design Considerations

    Underlying Design Considerations
    The notions of a user-oriented language and of a user's data base are current notions in computer science. We should like to call attention to a particular relationship between them. Data, by its very nature, is usually at a rather low level of aggregation.
    [...]
    However, the user, in investigating a large body of such basic data, soon is concerned with notions that require a variety of aggregations of this data.
    [...]
    Some of these aggregations may arise as natural outgrowths of his analysis, and recur in his ongoing considerations.
    [...]
    In order to maintain a reasonable efficiency of expression, he would like to define them once and for all.
    [...]
    This ability to extend his language naturally and easily is an essential system requirement. These definitions are not merely more complex reformulations of the original language, independent of the database. Nor are they simple reclassifications of the data. They can serve both to extend the syntactic aspects of the language and to form new relations and categories in the data. The excessive ambiguity found in the purely syntactic analysis of natural language sentences is reduced by contextual restriction of notions defined at higher levels of abstraction. Of course, if a query is ambiguous in terms of the given data base, the analysis should preserve it.
    [...]
    The data and the language must be a highly integrated, user-oriented and extensible package. It is this inextricable nature of data and language that we want to stress here. In the early days of computers, this was one of the great ideas, credit for which was given to von Neumann as the inventor of the "stored program computer." Today, with large operating and language systems that are independent of the user, we may have lost sight of this integral relationship. We may feel that at least the base-level language can be user independent. Notice, however, that in the examples above, time is interpreted in units of one month. In a language/data package for a travel agent, time would be in minutes and ignore any reference to year; a neural biologist would measure events in microseconds, a geologist in millennia. Consider as another example the meanings of "where" in an anthropological investigation, in a picture processing language, and in a police data file. Thus primary goals of our REL project are to facilitate the implementation and subsequent user extension and modification of highly idiosyncratic language/data base packages. In such a package, the semantics of the language can be specifically oriented in the context of the associated data.

    Probably the most idiosyncratic aspect of any language/data package, other than vocabulary and definitions, is its data structures. If the package is to function efficiently in terms of response times, the structures into which the data is organized must reflect the conceptual organization in terms of which the user visualizes his data. It is this reciprocal relationship between language structure and data structure that underlies the effective use of a language as a tool for understanding. From this reciprocal relationship we can derive two key implications for the architecture of the REL system. First, the system cannot preclude some forms of data structures that may occur in language/data packages. Thus one user may work with his data as contiguously stored arrays, another as n-plexes in a ring structure, a third may wish to manipulate strings of symbols. Second, the system can exploit the tight coupling between the syntactic processing of the sentence and the semantic processing of the underlying data structures. The more both the language and the internal data structures reflect the user's conceptual structures, the tighter this coupling will be. Although the forms for data organization are unrestricted, it can be assumed that syntactically distinguishable entities and data base entities will be reciprocally related.
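    The ring organization mentioned here can be pictured with a small sketch (the node fields and the Ring class are hypothetical, not the REL implementation): members of a category are spliced into a circular list anchored at a head node, so a traversal of the category starts from the head and stops when the ring closes on it again.

      # Sketch of a ring organization: members of a category are linked into a
      # circular list anchored at a head node.  Field names are hypothetical.
      class RingNode:
          def __init__(self, value):
              self.value = value
              self.next = self              # a lone node points to itself

      class Ring:
          def __init__(self, name):
              self.head = RingNode(name)    # head node identifies the category

          def insert(self, value):
              node = RingNode(value)
              node.next = self.head.next    # splice the new node in after the head
              self.head.next = node
              return node

          def members(self):
              node = self.head.next
              while node is not self.head:  # stop when the ring closes on the head
                  yield node.value
                  node = node.next

      ships = Ring("ship")
      for name in ("King", "Enterprise", "Hornet"):
          ships.insert(name)
      print(list(ships.members()))          # -> ['Hornet', 'Enterprise', 'King']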

    One of the major problems in designing an operating system is peripheral storage management. If the language/data packages to be implemented involve large bodies of data or extensive calculation and language extension, peripheral storage management becomes a central concern. The clue to its solution is found in the interrelation between syntactic structure and data structure emphasized above. Consider the two general schemas of system architecture in Figure I. In configuration B, user languages are defined in a manner independent of the language processor, in terms of their own grammar and associated semantic routines that handle their idiosyncratic data structures. The operating system, with its peripheral memory handling, is closely coordinated with the language processor. Thereby the dynamic allocation, and subsequent input/output, of peripheral storage is coordinated with syntactic analysis. In turn, the syntactically meaningful elements of sentences are recognized and processed by the language processor, thus carrying through the coupling between the language level and the storage management level. What the language's semantic routines do with the elements of memory made available to them is left up to the language designer, and not precluded by either the language processor or the operating system. If on the other hand, as in system configuration A, the operating system and language processor are independent, the operating system's management of the peripheral storage will not be sensitive to the interrelationship between language and data structures, and the data will soon be spewed haphazardly over peripheral memory, with degradation of system performance as a result. One can prevent this by dictating structural aspects of the data, by establishing formatting conventions; but this solution, as we have seen, will often prohibit an adequate representation of the aforementioned interrelationship. For these reasons, a central feature of the REL system is that it employs a single language processor integrated with the operating system and showing special concern for management of peripheral memory.

    One of the principal interests of the REL project has been language/database packages where the language has extensive structural power of expression, in particular those found in natural languages. If varieties of such languages are to be accommodated, the single language processor must be a powerful one. In particular, it must handle most rewrite rule grammars and simple transformational grammars, and must seek all analyses of an input sentence. We are particularly pleased with the language processor we have developed, which satisfies these criteria, and yet remains fast and tight. Central to it is the parser designed by Martin Kay of the RAND Corporation. Besides general rewrite and transformational rules, the language processor must handle ambiguity, language extension, and a variety of generator functions. Extract: REL English
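    As an illustration of "seeking all analyses of an input sentence" under a small set of rewrite rules, here is a generic CKY-style enumeration in Python; the toy lexicon and grammar are invented, and this is only a sketch of the keep-every-parse idea, not Martin Kay's parser or the REL language processor.

      # Generic CKY-style enumeration of *all* analyses under binary rewrite
      # rules.  The grammar and lexicon are invented; this is not Martin Kay's
      # algorithm or the REL processor, only the keep-every-parse idea.
      from itertools import product

      lexicon = {"Boston": {"Name"}, "ships": {"Noun", "Verb"}, "sail": {"Verb", "Noun"}}
      rules = [("NP", ("Name", "Noun")),   # "Boston ships" = the ships of Boston
               ("VP", ("Verb", "Noun")),   # "ships sail"   = transports sail(s)
               ("S",  ("NP", "Verb")),
               ("S",  ("Name", "VP"))]

      def all_parses(words):
          n = len(words)
          # chart[i][j] holds every (category, tree) analysis of words[i:j]
          chart = [[[] for _ in range(n + 1)] for _ in range(n + 1)]
          for i, w in enumerate(words):
              for cat in lexicon.get(w, ()):
                  chart[i][i + 1].append((cat, w))
          for span in range(2, n + 1):
              for i in range(n - span + 1):
                  j = i + span
                  for k in range(i + 1, j):
                      for (lc, lt), (rc, rt) in product(chart[i][k], chart[k][j]):
                          for parent, rhs in rules:
                              if rhs == (lc, rc):
                                  chart[i][j].append((parent, (parent, lt, rt)))
          return [tree for cat, tree in chart[0][n] if cat == "S"]

      # "Boston ships sail" keeps both of its analyses rather than just one:
      for tree in all_parses(["Boston", "ships", "sail"]):
          print(tree)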

    Nature of REL English
    In line with the design philosophy of REL, we have sought to provide as one of the base languages a language which usefully approximates natural English. In using the term "English" we bear in mind the fact that there really is no one English language. Rather, the term English refers to as many idiolects as there are speakers; these idiolects are grouped into dialects. The REL English is one such dialect. It shares with natural language also the characteristic of being, in its design and functioning, a conglomerate of idiolects, which we call versions. Such idiolectic versions arise when a user takes REL English as his base language, adds the vocabulary and relationships of his data, and proceeds to extend his language to fit his growing conceptual understanding.

    The second basic characteristic is that REL English is a formal language. The characteristics of English as a formal language are discussed in an earlier paper. The central thesis of that paper is that English becomes a formal language when the subject matter which it talks about is limited to material whose interrelationships are specifiable in a limited number of precisely structured categories. It is the type of structuration of the subject matter and not the nature of the subject matter itself that produces the necessary limitations.

    Natural language encompasses a multitude of formal languages, and it is the complexities of the memory structures on which natural language operates that account for the complexities, flexibility and richness of natural language. How does our English compare with English as discussed by modern linguists? On the level of surface structure, they are essentially the same. Some more complex transformationally derived strings, such as certain forms of ellipsis, are not handled as yet. However, most of the common forms are treated in a straightforward manner. Although some constructions which can be formed in natural conversational English are not provided in the basic English package, such deficiencies can to a large extent be overcome by the capability for extension provided by the system.

    The level of deep structure presents more problems. As distinct from surface structure, deep structure is that level of syntactic analysis which constitutes the input to semantic analysis.

    What is the nature of this semantic interpretation? In the general case, little is known. In our case, as in most types of computer analysis, interpretation is in terms of the internal forms of organization of the data in memory. To the extent that the constituents of deep structure can be directly correlated with corresponding structures in the data, semantic analysis, and therefore sentence analysis, can be carried to completion. It is important to distinguish, in this regard, between two quite distinct though related ways in which language use can be restricted. The first is by the ways in which the data is organized, that is, the structural forms used and the interlinkages which are formed for the manipulation of these structures. This type we will call "structural" restrictions. The second is by restrictions of the subject matter, the universe of discourse; this we will call "discourse" restrictions. When one restricts the universe of discourse to a body of material which occurs or is already derived in a formal way, one often tacitly accepts the structural restrictions thus imposed. To the uninitiated, it may appear that it is the discourse limitations and not the implied structural limitations that make the material amenable to machine analysis. However, it is the establishment of relatability between deep structural constituents and data structural forms, rather than discourse restrictions, that makes computer processing of the semantic component possible. Any content area whose data is organized into these given structural forms can be equally efficiently processed by a system establishing such interrelationships. The restrictions on REL English are a function of structural restrictions. Not all deep structures found in natural English are brought out by our analysis, because constituents of these deep structures do not correspond to structural relations in the organization of our data. For instance, "collections of boys" and "boys' collections" are considered synonymous. And yet, consider "At the fair, I saw collections of boys." and "At the fair, I saw boys' collections."

    The REL English rules consist of a syntactic component, namely the right-hand sides and the syntax completion routines, and the semantic component consisting of the semantic routines. The syntactic component constitutes a set of rewrite rules, context-free and general, which build the deep structure Phrase-markers in the form of kernel sentences, and a number of transformational rules. The semantic routines act on the memory structures of the data bases.
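    A minimal sketch of this pairing of a syntactic rewrite with a semantic routine (the rule format, the toy extensions, and the "how many" grammar are invented for illustration and are not REL's rule tables):

      # Each toy rule pairs a rewrite (lhs <- rhs categories) with a semantic
      # routine applied to the constituents' values.  Invented example, not REL.
      from dataclasses import dataclass
      from typing import Callable, List

      @dataclass
      class Rule:
          lhs: str             # resulting syntactic category
          rhs: List[str]       # categories on the right-hand side of the rewrite
          routine: Callable    # semantic routine acting on the memory structures

      # toy memory structures: the extension of a noun and of an adjective
      extensions = {"ship": {"King", "Hornet", "Enterprise"}}
      attributes = {"fast": {"King", "Enterprise"}}

      rules = [
          Rule("N", ["Adj", "N"], lambda adj, n: attributes[adj] & n),  # restrict
          Rule("S", ["how", "many", "N"], lambda n: len(n)),            # count
      ]

      # Applying the two rules by hand to "how many fast ships":
      n_value = rules[0].routine("fast", extensions["ship"])   # Adj N      -> N
      print(rules[1].routine(n_value))                         # how many N -> S, prints 2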

    The parts of speech of REL English are given in Figure 4 together with examples. These parts of speech are inclusive terms for syntactic classes (labels on the parsing tree) and semantic categories (memory structures).

    Function words, e.g. all, of, what, are distinct from referent words; the former are empty in the sense of not being associated with memory structures. Other aspects of the grammar, namely features, name and relation modification, verbs, clauses, quantifiers and conjunctions, are discussed in the remainder of the paper.
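    The distinction can be pictured with a small hypothetical lexicon (the words, categories, and memory structures below are illustrative only, not REL's actual representation): function words carry just a syntactic category, while referent words also point at a memory structure.

      # Hypothetical lexicon sketch: function words carry no memory structure,
      # referent words point at one.  Not REL's actual tables.
      memory = {
          "ship":   {"King", "Hornet"},    # a class of individuals in the data
          "Boston": {"kind": "port"},      # a named individual
      }

      lexicon = {
          # function words: syntactic category only
          "all":    {"category": "Quantifier",    "referent": None},
          "of":     {"category": "Preposition",   "referent": None},
          "what":   {"category": "Interrogative", "referent": None},
          # referent words: category plus the memory structure they denote
          "ship":   {"category": "Noun", "referent": memory["ship"]},
          "Boston": {"category": "Name", "referent": memory["Boston"]},
      }

      for word, entry in lexicon.items():
          kind = "function" if entry["referent"] is None else "referent"
          print(f"{word:7s} {entry['category']:14s} {kind} word")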

          in Proceedings of the twenty-fourth ACM national conference August 1969
  • Sammet, Jean E., "Roster of Programming Languages 1972" 240
          in Computers & Automation 21(6B), 30 Aug 1972
  • Sammet, Jean E. "Roster of Programming Languages for 1973" p147
          in ACM Computing Reviews 15(04) April 1974
  • Stock, Marylene and Stock, Karl F. "Bibliography of Programming Languages: Books, User Manuals and Articles from PLANKALKUL to PL/I" Verlag Dokumentation, Pullach/Munchen 1973 504 Abstract: PREFACE AND INTRODUCTION
    The exact number of all the programming languages still in use, and those which are no longer used, is unknown. Zemanek calls the abundance of programming languages and their many dialects a "language Babel". When a new programming language is developed, only its name is known at first and it takes a while before publications about it appear. For some languages, the only relevant literature stays inside the individual companies; some are reported on in papers and magazines; and only a few, such as ALGOL, BASIC, COBOL, FORTRAN, and PL/1, become known to a wider public through various text- and handbooks. The situation surrounding the application of these languages in many computer centers is a similar one.

    There are differing opinions on the concept "programming languages". What is called a programming language by some may be termed a program, a processor, or a generator by others. Since there are no sharp borderlines in the field of programming languages, works were considered here which deal with machine languages, assemblers, autocoders, syntax and compilers, processors and generators, as well as with general higher programming languages.

    The bibliography contains some 2,700 titles of books, magazines and essays for around 300 programming languages. However, as shown by the "Overview of Existing Programming Languages", there are more than 300 such languages. The "Overview" lists a total of 676 programming languages, but this is certainly incomplete. One author has already announced the "next 700 programming languages"; it is to be hoped the many users may be spared such a great variety for reasons of compatibility. The graphic representations (illustrations 1 & 2) show the development and proportion of the most widely-used programming languages, as measured by the number of publications listed here and by the number of computer manufacturers and software firms who have implemented the language in question. The illustrations show FORTRAN to be in the lead at the present time. PL/1 is advancing rapidly, although PL/1 compilers are not yet seen very often outside of IBM.

    Some experts believe PL/1 will replace even the widely-used languages such as FORTRAN, COBOL, and ALGOL. If this does occur, it will surely take some time - as shown by the chronological diagram (illustration 2).

    It would be desirable from the user's point of view to reduce this language confusion down to the most advantageous languages. Those languages still maintained should incorporate the special facets and advantages of the otherwise superfluous languages. Obviously such demands are not in the interests of computer production firms, especially when one considers that a FORTRAN program can be executed on nearly all third-generation computers.

    The titles in this bibliography are organized alphabetically according to programming language, and within a language chronologically and again alphabetically within a given year. Preceding the first programming language in the alphabet, literature is listed on several languages, as are general papers on programming languages and on the theory of formal languages (AAA).
    As far as possible, most of the titles are based on autopsy. However, the bibliographical description of some titles will not satisfy bibliography-documentation demands, since they are based on inaccurate information in various sources. Translation titles whose original titles could not be found through bibliographical research were not included. In view of the fact that many libraries do not have the quoted papers, all magazine essays should have been listed with the volume, the year, issue number and the complete number of pages (e.g. pp. 721-783), so that interlibrary loans could take place with fast reader service. Unfortunately, these data were not always found.

    It is hoped that this bibliography will help the electronic data processing expert, and those who wish to select the appropriate programming language from the many available, to find a way through the language Babel.

    We wish to offer special thanks to Mr. Klaus G. Saur and the staff of Verlag Dokumentation for their publishing work.

    Graz / Austria, May, 1973
          in ACM Computing Reviews 15(04) April 1974
  • "Practical Natural Language Processing: The REL System as Prototype"
          in Adv in Computers 13, Academic Press 1975
  • Bozena Henisz Thompson and Frederick B. Thompson "Rapidly Extendable Natural Language" ACM78, Proc. 1978 Annual Conf., Dec. 1978, 173-182. Abstract: A major thrust of artificial intelligence research is how to build knowledge of the application domain into computer systems. We investigate how the user himself can introduce his own expert knowledge into his data base system through rapid language extension so that it may then respond intelligently to his curt queries and commands. Illustrations of rapid language extension using the REL System are presented and discussed.
          in Adv in Computers 13, Academic Press 1975
  • Sammet, Jean E "Roster of programming languages for 1976-77" pp56-85
          in SIGPLAN Notices 13(11) Nov 1978