PLM(ID:6197/plm002)

Graphically oriented data retrieval language 


for Picture Language Machine

Linguistics-based scene description system for a picture- and sequence- driven data-retrieval language deveoped at NBS in Washington


The systems were developed in tandem with special purpose hardware - computers and scanners, and the earliest reports date from 1957. The first published accounts of the successful system are from 1963, but there are interim reports from 61, 62.


Places
References:
  • Kirsch, R. A., Ray, L. C, Cahn, L., and Urban, G. H. Experiments in processing pictorial information with a digital computer. Rep. 5713, Nat. Bur. Stand., Washington, D. C, Dec. 1957. view details
  • Rankin, B. K., III. A programmable grammar for a fragment of English for use in an information retrieval system. Hop. 7352, Nat. Bur. Stand., Washington, D. C, June 1961 view details
  • Cohen, D., "Picture Processing in a Picture Language Machine," NBS Report 7885. Dept. of Commerce, Wash,, D.C.; April, 1963 view details
  • Kirsch, R. A. The application of automata theory to problems in information retrieval (with selected bibliography). Rep. 7882, Nat. Bur. Stand., Washington, D. C, Mar. 1963 view details
  • Kirsch, R.A. and B.K. Rankin III, "Modified Simple Phrase Structure Grammars for Grammatical Induction" NBS Report 7890, Dept. of Comm, Wash., DC 1963 view details
  • Sillars, W., "An Algorithm for Representing English Sentences in a Formal Language,"' NBS Report 7384, Department of Commerce, Wash., D.C.; April, 1963 view details
  • Bobrow, D.G. "Natural Language Input for a Computer Problem Solving System", Report MAC-TR-1, Project MAC, M.I.T., Cambridge, Mass., June 1964 view details External link: Online copy pdf ps Abstract: The STUDENT problem solving system, programmed in LISP, accepts as input a comfortable but restricted subset of English which can express a wide variety of algebra story problems. STUDENT finds the solution to a large class of these problems. STUDENT can utilize a store of global information not specific to any one problem, and may make assumptions about the interpretation of ambiguities in the wording of the problem being solved. If it uses such information, or makes any assumptions, STUDENT communicates this fact to the user. The thesis includes a summary of other English language question-answering systems. All these systems, and STUDENT are evaluated according to four standard criteria. The linguistic analysis in STUDENT is a first approximation to the analytic portion of a semantic theory of discourse outlined in the thesis. STUDENT finds the set of kernel sentences which are the base of the input discourse, and transforms this sequence of kernel sentences into a set of simultaneous equations which form the semantic base of the Student system. STUDENT then tries to solve this set of equations for the values of requested unknowns. If it is successful it gives the answers in English. If not, STUDENT asks the user for more information, and indicates the nature of the desired information. The STUDENT system is a first step toward natural language communication with computers. Further work on the semantic theory proposed should result in much more sophisticated systems. Extract: Introduction
    Introduction
    The aim of the research reported here was to discover how one could build a computer program which could communicate with people in a natural language within some restricted problem domain. In the coarse of this investigation, I wrote a set of computer pro-grams, the STUDENT system, which accepts as input a comfortable but restricted subset of English which can be used to express, a wide variety of algebra story problems. The problems shown in Figure 1 illustrate some of the communication and problem solving capabilities of this system.
    In the following discussion, I shall use phrases such as "the computer understands English".  In all such cases, the "English" is just the restricted subset of  English which is allowable as input for the computer program under discussion. In addition, for purposes of this report I have adopted the following operational definition of understanding.  A computer "understands" a subset of English if it accepts input sentences which are members of this subset, and answers questions based on information centered in the input. The STUDENT system understands English in this sense. Extract: BASEBALL
    Baseball is a question-answering  system designed and programmed at Lincoln Laboratories by  Green, Wolf, Chomsky and Laughery (19).    It  is a data base system  in which the data  is placed in memory in a prestructered tree format.    The data consists of the dates,   location, opposing teams and scores of some American League baseball games.    Only questions to the system can be given in English, not the data.
    Questions mast be simple sentences, with no relative clauses,  logical or coordinate connectives. With these restrictions, the program will accept any question couched in words contained in a vocabulary list quite adequate for asking questions about baseball statistics.    In addition,  the parsing routine,  based on techniques developed by Harris (21) , must find a parsing for the question.
    The questions must pertain to statistics about baseball games  found in the information store.    One cannot    ask questions about extrema,  such as "Highest" score or "fewest" number of games won.    The parsed question is transformed into a standard specification (or spec)  list and the question-answering routine utilizes this canonical form for the meaning of the question. For example, the question  "Who beat the Yankees on July 4th?" would be transformed into the "spec list":
    Team (Losing)= New York
    Team (winning) = ?
    Date      = July
    Because Baseball does not utilize English for data input, we cannot talk about deductions made from information implicit in several sentences.    However, Baseball can perform operations such as counting (the number of games played by Boston, for example)  and thus  in the sense that it is utilizing  several separate data units in its store,   it is performing deductions.
    Baseball's abilities can only be extended by extensive re-programming,  though the techniques utilized have some general applicability.    Because the parsing program has a very complete grammar, and the vocabulary list  is quite comprehensive for the problem domain, the user needs no knowledge of the internal structure of the Baseball program.    No provision for interaction with the user was made.
    Extract: Kirsch, Cohen and Sillars
    At the National Bureau of Standards, Kirsch, Cohen and Sillars have designed a system in which pictures and English language statements are converted to expressions  in the first order predicate calculus. One can then check to see if an English language statement is consistent with a given picture. Extract: Advice-taker
    McCarthy's Advice-taker, though not designed to accept English input, would make an excellent base for a question-answering system. Fischer Black has programmed a system which can do all of McCarthy's Advice-Taker problems, and can be adapted to accept a very limited subset of English. The deductive system in Black's program is equtvalent to the propositional calculus.
  • Kirsch, R. "Computer Interpretation of English Text and Picture Patterns" IEEE Transactions on Electronic Computers, EC-l3, 4, pp. 363-367. August 1964 view details
  • Simmons, R.F., "Answering English Questions by Computer - A Survey" SDC Report SP-1536 Santa Monica, Calif.; April, 1964 view details
  • Londe, D L., and Simmons, R. F. "NAMER, a pattern-recognition system for generating sentences about relations between line drawings" view details Abstract: This paper reports on a series of experimental programs which are used to recognize line drawings and the spatial relationships between them. At the pattern-recognition level the programs learn to associate a name with a generalized bit pattern representing a line drawing. At the relation-learning level, the programs abstract characteristics which relate to such spatial concepts as "above," "left," "thicker than," etc. The names that have been learned in association with a drawing, and relation names which the program selects as true for the spatial relations between two drawings, are substituted into a simple generation grammar; variations of sentences that are true for the picture are then generated. External link: Online copy Extract: Namer and PLM
    A machine somewhat similar to Namer was recently completed by Kirsch and others, at the National Bureau of Standards. This system is called the Picture Language Machine (PLM). Given statements of the form "The black circle is left of the white triangle," the PLM tests their truth value against the drawings that have been read into the computer. It does this by parsing the sentence, translating the parsed sentence into a specialized predicate calculus, and testing the resulting formalization against the relationships in the picture. The formal language h,as such predicates as "left of," "greater than," "square," "circle," "triangle," etc. Although the PLM approaches the same problem of dealing with language in a figural-spatial context as does Namer, the PLM is not concerned with learning names of patterns or of relations. Its primary emphasis is on experimentation with means of translating from language into logical formalisms whose validity can be tested.
          in [ACM] Proceedings of the 1965 20th National Conference 1965 , Cleveland, Ohio, United States view details
  • Simmons, R. F. "Answering English questions by computer: a survey" p53-70 view details Abstract: Fifteen experimental English language question-answering systems which are programmed and operating are described and reviewed. The systems range from a conversation machines to programs which make sentences about pictures and systems which translate from English into logical calculi. Systems are classified as list-structured data-based, graphic data-based, text-based and inferential. Principles and methods of operations are detailed and discussed.

    It is concluded that the data-base question-answerer has passed from initial research into the early developmental phase. The most difficult and important research questions for the advancement of general-purpose language processors are seen to be concerned with measuring meaning, dealing with ambiguities, translating into formal languages and searching large tree structures. DOI Extract: PLM
    The Picture Language Machine (PLM)

    At the National Bureau of Standards Kirsch (1964), D. Cohen (1962), B. Rankin (1961) and W. Sillars (1963) have devised a program system which accepts pictures and sentences as input. It translates both the pictures and the English statement into a common intermediate logical language and determines whether the statement about the picture is true. This set of programs is of particular interest in this review since it. Seems to be one of the first to explore the principle of translating from a subset of English into a formal language and to point out a reasonable method for doing so.

    The PLM is composed of three subsystems---a parser, a formalizer, and a predicate evaluator. Its language is limited to a small subset of English suitable for making statements and asking questions about three geometric figures. The parsing system is based on an immediate. Constituent grammar that includes the discontinuous constituent operator. Parsing is accomplished by recognition routine which successively substitutes symbols for the dictionary for words in the sentence or for an inter. Mediate symbol string until the top of the parsing tree is reached.

    After the sentence has been parsed to produce one or more tree structures representing it, the formalizer translates the parsed sentence into the formal language. The formal language is a first-order functional calculus with a small number of constant predicates. The primitives of the language include brackets, parentheses, the terms "and," "if... then," "for all", "There exists," "not," "identity," certain other quantifiers, variables and finally three types of predicates. The singular predicates are typified by the following examples:
                 Cir(a) a is a circle
                 Bot(a) a is at the bottom
                 Bk(a) a is black
    Some typical binary and ternary predicates follow:
                 Bgr(a,b) a is bigger than b
                 Lf(a,b) a is to the left of b
                 Sme(a, b) a is the same color'as=b
                 Bet(a,b,c) a is between b and c
                 Mort(a,b,c) a is more to the right of b than c
                 Mmid(a,b,c) a is more in the middle of b than c

    The formalizer is designed to work with each parsing that the grammar produces. For each rule in the grammar there is a corresponding rule of formalization. The translation process is primarily one of substituting formalization symbols for grammar symbols beginning at the top of the parsed tree and working down. More than simple substitution is required to insert quantifiers and implication symbols, but essentially the process of translating to the formal language bears a great similarity to that of generating a language string from a phrase structure grammar. For each parsing of a sentence the translation into the formal language results in a unique, unambiguous, well-formed formula. An example sentence "All circles are black circles" has only one parsing which finally translates into the following formal statement:

    The structure of this formula is explicit and unambiguous; the relationships between the geometric variables are clearly specified and the truth value may be tested by the predicate evaluator. If true, the answer to the implied question "Are all circles black?" is yes. The predicate evaluator translates from pictures to the formal language. It is designed to accept inputs that have been processed by SADIE, a scanning device which is used as an input to a computer. The inputs are limited to three sizes each of triangle, square or circle, each of which may be in outline or filled in. A technique called blobbing is used to distinguish objects resulting from the scan and each such object is then ciremnscribed with a rectangle. Maximum and minimum, x-y coordinates are computed and the ratios of these serve to distinguish triangles from circles or squares. Circles are distinguished from squares on the basis of covering less area. A blaclc figure is one whose area is filled in while a white figure is an outline. These relatively e simple computations suffice to generate the valid predicates from the picture matrix. Some of the interesting features of the PLM can be appreciated only by close study of its documentation. For example, the gramntar used is a modified phrase strlleture system with a remarkably compacted notation (Kirsch, 1963). The problem of ambiguity of syntactic analysis is accepted and each possible interpretation of the sentence is tested fox' validity as a well-formed formalization. There is a practical scheme for translating at least a small subset of English into the predicate calculus and an equally feasible system for testing the formalization against that resulting from the picture matrix. Extract: Namer and PLM
    Both the PLM and Namer show the capability of making geometric inferences based on a set of computational operations on line drawings. Unique sets of characteristics resulting from the computations can be mapped onto English names and words expressing spatial relationships.
    In this respect these systems anticipate some recently developed inference systems (see Section VII on SQA). The PLM has the additional feature of being able to answer questions about those English language statements which are permissible in its grammar. Since it can translate from English into a formal statement, the formalization for the question can be compared to that for the proposed answer. By the addition of predicates which go beyond simple spatial relations, the PLM may generalize into a much broader inference system than any yet available.
    Namer on the other hand is not strictly a question answerer. In order to answer English questions selectively it would be necessary to match the valid statements that can be generated against an analysis of the specific question. However, Namer offers a probabilistic learning approach for learning names and relationships in a data base. This approach may generalize far beyond spatial relationships and their expression in language and may suggest a method for dealing with inference and non-graphic data bases as well.
          in [ACM] CACM 8(01) Jan 1965 view details
  • Clowes, Max Review of "Recognition: A Study in the Philosophy of Artificial Intelligence" by Kenneth M. Sayre in The British Journal for the Philosophy of Science 17(2) August 1966 pp165-167 view details Extract: The PIcture Language Machine
    Over the past decade a great deal of money and time, and a not inconsiderable amount of thought have been devoted to the problem of mechanically emulating one of the simpler skills of which we are capable. This is the ability to observe and describe the significant structure of pictures. For example, an electrical engineer might remark of a circuit diagram, ' the line connecting the grids of these two valves should not be present'. A physicist commenting upon an etched crystal surface viewed under a microscope might begin, ' there are lots of rectangles lying in pairs end to end . . .'. By contrast a child might say of an illustration in a reading book 'A'. The first two examples are concerned with the relationship between parts of the picture and are essentially descriptive. The verbal behaviour consists in a string of words, it is the grammatical and semantic relations between these wTords which is the description of the picture. The third example applies to the picture as a whole and is essentially classificatory: the verbal string exhibits no significant internal structure. It is largely with the problem of classification that Artificial Intelligence has been concerned: it is also the main burden of Professor Sayre's book. It will be clear from the first two examples, however, that letter recognition is not a representative sample of man's visual skills: a book which undertakes to examine the philosophy and practice of Artificial Intelligence and fails to bring out this fact is therefore disappointing.
    Attempts to redress the balance 1 are rapidly leading to formal systems for the description of pictures which are closely akin to the ' generative grammars ' formulated by the algebraic linguist as descriptive models of natural language. The outcome of this (if successful) will be computer programmes capable not only of classifying simple patterns but also of making descriptive remarks about pictures (cf. the electrical engineer) or of verifying that remarks about pictorial structure are true of some subset of a collection of pictures presented to the machine. The formulation of computer programmes of this sort (which Kirsch dubs ' Picture Language Machines') presupposes the existence of (i) an adequate method of picture analysis, such as will recover from a picture its content and structural organisation, (2) a corresponding language mechanism which would permit the generation and analysis of sentences having an appropriate content and form, and (3) a mechanism or mechanisms for mapping from one to the other. This tripartite division is typical of the way in which mechanisms of this complexity have to be organised for logical, heuristic or economic reasons. The simulation of behaviour is in part concerned to discover wrhether, for example, such divisions have counterparts in the organisation of human skills. In general this will involve a study of the performance both of the machine and of human subjects when carrying out similar tasks.

          in [ACM] CACM 8(01) Jan 1965 view details
  • Rosenfeld, Azriel "Picture Processing by Computer" ACM CSUR 1(3) September 1969 view details Abstract: Techniques for processing pictorial information by computer are surveyed. The topics covered include efficient encoding and approximation; position-invariant operations and applications; picture properties useful for pattern recognition; picture segmentation and geometrical properties of picture subsets; picture description and "picture languages." External link: Online copy Extract: Picture Description and "Picture Languages"
    Picture Description and "Picture Languages"
    Given a picture and a collection of its subsets ("objects"), one can formulate a description of the picture in terms of properties of the subsets and relationships among them ("above," "to the left of," "near," "between," "inside of," "part of," "larger than," "darker than," etc.). If one wants to be able to make effective use of such a description (e.g. for answering questions about the picture), the descriptive information should be stored in an appropriate data structure. Such structures have been implemented primarily in the area of computer graphics; most of them are relatively specialized, since most computer graphics systems permit only a limited number of types of subsets, properties, or relations [383-385].

    In many cases, adequate descriptions of a picture can be formulated using a special formal "language" (a better term might be "notation") [386]. For example, a line drawing can be described as consisting of "vertices" (isolated points, arc ends, angles, points where two or more arcs meet), pairs of which may be joined by arcs, where the arcs can be described in terms of mean slope, straightness, length, etc. [387, 388]. If a picture contains simple geometric figures, or other easily "name-able" subsets, it can be concisely described in terms of the names of the subsets and the relationships among them [389-392]. Efforts are under way to develop "habitable" description languages for classes of complex pictures, in which any reasonable statement, question, or command concerning the pictures can be formulated [393].
    If a picture description language is capable of expressing complete descriptions, from which pictures can be exactly reconstructed, it can be regarded as defining a "picture language," in which pictures are built up by combining a set of pieces (presumably corresponding to parts of the description) in various ways, just as phrases and sentences are built up by concatenating words. Here the pieces can range from single picture elements and unit line segments to complex geometrical figures which can be "attached" to one another in various ways [394-400].
    Given a language for describing pictures, whether partially or completely, one can use it to formulate definitions for specific classes of pictures. (On definitions for alphabetic characters in terms of "strokes," see [43, 401-406]; for other examples see [316, 407, 408].) Even partial definitions can be very useful for pattern recognition purposes, since if a picture is known to be made up of pieces of certain types which are combined in certain ways, one can identify the picture by identifying the pieces and the way in which they are combined. This "syntax-directed" approach to pictorial pattern recognition is expected to become a major theme in picture processing research during the coming years.

          in [ACM] CACM 8(01) Jan 1965 view details
  • Stanton, R.B. "The interpretation of graphics and graphic languages" pp144-160 view details Abstract: Computer processing of graphical data can be divided into two broud categories, by drawing a distinction between those systems which attempt to interpret the data as a 'picture' and those which do not. We are here concerned with the former whilst acknowledging that the majority of computer graphics systems belong to the latter.

    Research into the interpretation of graphics has been motivated primarily from two sources. The first is simply the desire to realise in man-machine communication the kind of descriptive power supported by the use of graphics in man-man communication. The second springs from working with a data base of information which is most conveniently recorded in graphical form (e.g. maps, engineering drawings, etc.). The quality associated with this kind of interpretation is captured by the idea of the 'machine perception of graphics'. The acceptance of this idea places computer graphics and with it, graphic languages, in a cognate position with respect to picture interpretation and scene analysis, although there are, of course, important differences. The body of the paper is concerned with reviewing the status of graphical languages given that the task  for which they have to be suited is one of interpretation.
          in Nake, F. and Rosenfeld, A. "Graphic Languages" Amsterdam: North-Holland Publishing Company 1972. view details