1.pak(ID:4365/pak002)Goal-directed graph-pattern-matching languageBadler, U of Toronto 1973 AI dialect of SNOBOL (derived via SPITBOL) with graph-pattern-matching capabilities. Draws on PLANNER, CONNIVER and QA4 for goal direction etc. Related languages
References: (a) a data base, in the form, of a collection of labeled directed graphs where knowledge can be stored (b) pattern directed information retrieval and pattern invoked function calls (c) primitive statements which enable the user to construct flexible searching algorithms. The language is an extension of SNOBOL in its design and implementation and uses SNOBOL'S string pattern matching facilities for its own (graph) pattern matching. Extract: Introduction Introduction 1.pak was designed and implemented in order to facilitate AI research at the University of Toronto. Its design was influenced by other languages designed for similar reasons such as CONNIVER, PLANNER, QA4, SAIL, and our decision to implement it as an extension of SNOBOL and keep it relatively inexpensive (in terms of the time and space required for the execution of programs). This paper only discusses the main features of 1.pak, how they can be used, their relation to features offered by other languages for AI, and the success of the current implementation. More details on the language are available elsewhere, It is assumed that the reader is familiar with the basic features of SNOBOL. Extract: The Data Base The Data Base The data base for each l.pak program consists of a collection of directed, labeled graphs (hereafter graphs) such as that shown in FIG. 1. A list of transitive and/or intransitive edges is associated with each node, which define either properties of (the object represented by) the node or relations which hold between the node and other elements of the same graph. A linear order exists, for the nodes of each graph defined by special edges labeled NEXT. Edge labels (properties) consist of one or more atoms separated by underscores. The first atom is the attribute of the property, while subsequent atoms are its modifiers. Information can be retrieved from the data base by matching (graph) patterns against it. Patterns are specified in terms of sequences of path descriptions. For example, the pattern <$X, (LEFT_AARB?Y), (FAR), $W> contains one path description which will match paths that (a) Begin at the node which is the value of atom X , (b) Move along an edge whose label has LEFT as attribute and is followed by at least one modifier, (c) Move along an edge whose labei has FAR as attribute, (d) End at node $W. If such a path connects nodes $X and $W for a particular data base, the pattern match succeeds and the modifier of LEFT is assigned to atom Y , while the graph pattern match returns $W, the last node visited during the match. Patterns are evaluated by the function SEARCH. AARB is a special property pattern which will match any atom. l.pak offers several other built-in property patterns and operators similar to those offered ;by SNOBOL to help the user'specify classes of edge labels in his graph patterns. Thus (graph) pattern matching is essentially driven by property pattern matching and can be easily implemented using the string pattern matching facilities in SNOBOL. . Note that there may be several paths which will match the same pattern. For example, the pattern match SEARCH (<$X, (LEFT) , (ABOVE_FAR) > ) could return node nl or n2 in FIG, 2. The pattern match however will return just one of the two nodes. We will show later how to generate all the matching, alternatives. Information can be added to or deleted from the data base through the special functions ADD and DEL. ADD's main function is to create new information with a graph as context. For example, the function call ADD(<$X,(LEFT),(FAR),?W>) will create enough new edges and nodes to make the pattern match of its argument against the data base successful. In the process, a (node) value will be assigned to W . ADD will also perform a simple consistency-redundancy test for each new edge attached to a node, with respect to the edge list of that node. Thus an intransitive edge labeled TYPE_HOUSE will not be added to the edge list of a node which already contains an edge labeled TYPE_HOUSE_TALL because it is redundant. An example of a DEL function call is DEL() where the special operator '-' specifies that the node or edge it operates on must be deleted. For this function call, a pattern match takes place first, and if unsuccessful the call fails; otherwise the node $Y is deleted along with all the edges that point to or away from it. l.pak also offers SNOBOL tables and arrays as means for representing information. Moreover, as in SNOBOL, the user can define his own data types. Explicit list facilities using CONS, CAR and CDR and list notation are provided. Expressions such as $L = ($A, $B, $C, $D) may be used to construct lists; here the value of L is a four element list. Extract: Discussion Discussion 1.pak has been implemented in SPITBOL[9], an efficient version of SNOBOL. SPITBOL uses both a compiler and an interpreter, offers most SNOBOL features (in particular the function EVAL) , plus a few more, has a very fast , garbage collector and handles user-defined data types very efficiently. It requires approximately ~50K bytes and runs several times faster than the BTL implementation of SNOBOL. Some of the reasons that led us to choose SNOBOL over other candidate languages are listed below: (a) It offers pattern matching facilities. This has helped the design and implementation of graph pattern matching; moreover, SNOBOL users will have no difficulty adapting to the graph pattern matching formalism since it is an extension of string pattern matching. (b) It offers tables and user-defined data types. These features were used extensively during the implementation of the 1.pak system, and are offered by 1.pak at very little extra cost. (c) SNOBOL'S control structure is unusual but flexible and well-suited to AI applications programming. 1.pak offers most of that control structure, in addition to the various shades of the LOCAL feature, function requests, generator calls etc. Labeled graphs have already been used for the representation of knowledge (e.g., Palme [10], Rumelhart et al [11]). 1.pak graphs have the additional feature however that the user has a choice of representing a piece of information structurally or as a property. Which form he chooses should depend on how often the parts of this piece of information will be retrieved and manipulated independently of each other. For example, the statement "John gives a gift to Mary" could be represented (rather crudely and with various subtleties of the sentence's meaning ignored) by the graph shown in FIG. 4(a), 4(b) or 4(c), depending on whether we will be referring explicitly to the gift or Mary and their properties, or will simply refer to them as parts of a property John has. Unlike PLANNER etc., 1.pak's data base offers only partial associativity. This may be inconvenient for the user in certain cases, but it offers him more control over his data base's thirst for memory space. More conventional data structures (arrays, tables, user-defined data types) are also available in 1.pak at very little expense for the 1.pak system since they are mostly handled by SNOBOL. Graph pattern matching offers many features found in PLANNER in that a similar backtracking mechanism is used and implicit function requests can be considered as consequent theorem calls. If the user agrees with Sussman and McDermott's criticism of PLANNER'S backtracking [2], he can switch to a programming style favoring generators where he has more control over the backtracking mechanism he uses. We feel that both features will be found useful. The 1.pak implementation uses both a compiler and an interpreter and requires a minimum of ~140K (this includes ~50K for the SPITBOL system) . There are plans to use 1.pak for question-answering, scene analysis and natural language understanding to test it and find which features are useful and should be made more prominent and which ones should be modified. in Proceedings of the Third International Joint Conference on Artificial Intelligence IJCAI-73, Stanford, CA: Stanford University 1973 view details 1.PAK is a graph manipulation language built upon SPITBOL. 1.PAK graphs are labelled directed graphs with transitive and intransitive edges. Most information in a graph is carried by edge labels since nodes are treated as locations to attach information. Information can be obtained from graphs through graph patterns, which employ the string pattern matching facilities of SPITBOL to retrieve information from edge labels. Graph patterns may also include branching points, conditionals, function calls and unevaluated expressions all within a backtracking control structure. Many other facilities of 1.PAK, such as generators and pattern directed procedure invocation, have their roots in such languages as PLANNER, CONNIVER and QA4. Extract: Workings As previously mentioned, we maintain 1.PAK functions for almost all "content" words in our restricted vocabulary, and refer to the totality of these functions as the procedure dictionary. These functions, known as "base routines" are called in order to check semantic well-formedness, disambiguate, and make necessary deductions. These deductions are often tentative, though motivated by strong evidence, and therefore are termed assumptions. The procedures that are invoked are those represented by the sentence just read. I.PAK allows the procedures to have the same or similar names in order that they may be invoked via patterns. Thus there may be two procedures for the word TEAR. As each function determines if the meaning it represents is applicable, the "right" function is found, provided the world-modeller has determined adequate disambiguating heuristics. Extract: Description QUESTION-ANSWERING We represent questions in exactly the same format as sentences, i.e., as graphs. Fact-retrieval questions such as "who", "which", and "where" often are simply requesting a noun phrase filling a particular case of a verb. This type of processing is performed by converting the question graph into a graph pattern and matching the pattern against likely propositions stored in the data base. This graph pattern match also performs simple deductions by employing the subset-superset hierarchy. An important aspect of the I.PAK graph pattern matching facilities is the capability of returning control to a user'specified function when an edge match fails. We utilize this feature in matching questions against facts by having the user'specified function perform a depth-first search of the subset-superset hierarchy to see if the node in question is an instance of the one requested in the pattern. Extract: Distinctions 1.PAK: Our comments regarding the 1.PAK language can be directed at its data base and associated retrieval facilities. The data base of labelled graphs was well-suited to this application program. The graph pattern facility allowed for a flexible retrieval process from these graphs and future versions of 1.PAK should provide the user with even more flexible pattern matching facilities. The distinction between graphs and graph patterns will be removed as the conversion from one to the other is a widely used technique. Future versions of the language may consider programs as graphs and vice-versa in order to be able to create procedures and execute graphs at run-time, as is available with lists in LISP and its dialects. Another feature needed by 1PAK is a more flexible data base organization to permit context manipulation, a simplified form of which is already present. in Proceedings of the 1974 ACM Annual Conference San Diego, November, 1974 view details |