KLS(ID:165/kls001)

Knotted List Structures

Country: us
- languages for us
- us/1962
Began: 1962
Published: 1962
- 1962
Type:List
- List
Sammet:SLP
- SLP
- SLP/1962
- SLP/us

Knotted List Structures. List-processing language, a predecessor of SLIP.

People:

Joseph Weizenbaum

Structures:

List

KLS	=>	KLS II		Evolution of
KLS	=>	SLIP		Evolution of

References:

Weizenbaum, J. "Knotted List Structures" view details Abstract: In (3) Perlis and Thornton describe a list structure language the central innovation of which is that the last word of each list specifies the location of the head of the list of which it is the terminal word. They call such list structures threaded lists. Beyond this, however, they hypothesize Sequencing Tables and associated sequencing modes (e, l, and w). The sequencing modes make possible forms of traversing list structures which leave the lists themselves undisturbed in memory. Since the Sequence Table can contain true names as well as aliases for particular list structures and since lists are undisturbed by sequencing, many programs may sequence through lists in different manners, at various rates, but still simultaneously. This partially mitigates against the need to store many copies of a list and becomes critical in multiple computer systems in which it may be expected that a number of sub-programs operate on a single list structure (under various aliases) independently. This note describes a generalization of the threaded list. The name given to such generalized threaded list structures is knotted list structures (KLS). The generalization consists essentially of replacing the thread link which Perlis and Thornton place in the R or link position of the last list cell of a list (or sub-list with a Control List (CL) which is attached to the list structure and which contains, among other information, a push down list capable of storing many thread links. The chief advantage of this scheme is that it permits a sub-list to be a sub-list of many lists, something which ordinary threaded lists do not permit. All other advantages of threaded lists are kept.

in Proceedings of the 16th ACM National Conference, January 1961 view details

Floyd, R. W. review of Weizenbaum 1961 view details Abstract: A knotted list is a proposed machine representation for bodies of information which grow and shrink unpredictably, which are used in a foreseeable sequence and which have a non-trivial associated structure. The paper is concerned exclusively with the techniques for implementation of knotted lists; a reader unfamiliar with the motivation for usage of list structures would be well advised to turn first to Newell and Tonge's "An Introduction to Information Processing Language V" [ Comm. ACM 4 (Apr. 1960), 205 211 ] . The present paper describes a representation for lists in machine memory, and several modes of sequencing through such lists. The representation is convincingly proposed as suitable for a machine organization capable of performing the basic list operations without the wastage of time characteristic of interpretive systems (abstractions). A detailed machine organization is promised as the subject of a subsequent paper.

in ACM Computing Reviews 3(04) July-August 1962 view details

R. M. Shapiro "Computers, connector systems, and data descriptions" pp72-73 view details
in Artificial languages view details

Weizenbaum, Joseph "An introduction to the KLS Processing System" view details
in Symbolic Languages in Data Processing, in the Proceedings of the Symposium organized and edited by the International Computation Centre, Rome, Italy, March 2631, 1962, Gordon and Beech Science Publishers, 1962. view details

Knuth, Donald E. The Art of computer programming, Addison-Wesley Publishing Company Reading, MA 1968 view details Extract: History And Bibliography
History And Bibliography
Linear lists and rectangular arrays of information kept in consecutive memory locations were widely used from the earliest days of stored-program computers, and the earliest treatises on programming gave the basic algorithms for traversing these structures. [ For example, see J. von Neumann, Collected Works, vol. V, 113-116 (written 1947); M. V. Wilkes, D. J. Wheeler, S. Gill, The Preparation of Programs for an Electronic Digital Computer (Reading, Mass.: Addison-Wesley, 1951), subroutine V-1. ] Before the days of index registers, operations on sequential linear lists were done by performing arithmetic on the machine language instructions themselves, and this type of operation was one of the early motivations for having a computer whose programs share memory space with the data they manipulate.

Techniques which permit variable-length linear lists to share sequential locations, in such a way that they shift back and forth when necessary, as described in Section 2.2.2, were apparently a much later invention. J. Dunlap of Digitek Corporation developed these techniques in 1963 in connection with the design of a series of compiler programs; about the same time the idea independently appeared in the design of a COBOL compiler at IBM Corporation, and a collection of related subroutines called CITRUS was subsequently used at, various installations. The techniques remained unpublished until after they had been independently developed by Jan Garwick of Norway; see BIT 4 (1964), 137-140.

The idea of having linear lists in non-sequential locations seems to have originated in connection with the design of computers with drum memories; the first such computer was developed by General Electric Corporation during 1946-1948, and the IBM 650 was the most notable later machine of this type. After executing the instruction in location n, such a computer is usually not ready to get its next instruction from location n + 1 because the drum has already rotated past this point. Depending on the instruction being performed, the most favorable position for the next instruction might be n + 7 or n + 18, etc., and the machine can operate up to six or seven times faster if its instructions are optimally located rather than consecutive. [ For a discussion of the interesting problems concerning best placement of these instructions, see the author's article in JACM 8 (1961), 119-150. ] Therefore the machine design provides an extra address field in each machine language instruction, to serve as a link to the next instruction. Such a machine is called a "one-plus-one-address computer," as distinguished from MIX which is a one-address computer." The design of one-plus-one-address computers is apparently the first appearance of the linked-list idea within computer programs, although the dynamic insertion and deletion operations which we have used so frequently in this chapter were still unknown.

Linked memory techniques were really born when A. Newell, C. Shaw, and H. Simon began their investigations of heuristic problem-solving by machine. As an aid to writing programs that searched for proofs in mathematical logic, they designed the first " listprocessing" language IPL-II in the spring of 1956. (IPL stands for Information Processing Language.) This was a system which made use of links and included important concepts like the list of available space, but the concept of stacks was not yet well developed; IPL-III was designed a year later, and it included "push down" and "pop up" for stacks as important basic operations. (For references to IPL-II see "The Logic Theory Machine," IRE Transactions on Information Theory IT-2 (Sept. 1956), 61-70; see also Proc. Western Joint Comp. Conf. (Feb. 1957), 218-240. Material on IPL- III first appeared in course notes given at the University of Michigan in the summer of 1957.)

The work of Newell, Shaw, and Simon inspired many other people to use linked memory (which was often at the time referred to as NSS memory), mostly for problems dealing with simulation of human thought processes. Gradually, the techniques became recognized as basic computer programming tools; the first article describing the usefulness of linked memory for "down-to-earth" problems was published by J. W. Carr, III, in CACM 2 (Feb. 1959), 4-6. Carr pointed out in this article that linked lists can readily be manipulated in ordinary programming languages, without requiring sophisticated sub-routines or interpretive systems. See also G. A. Blaauw, IBM J. Res. and Dev. 3 (1959), 288-301.

At first, one-word nodes were used for linked tables, but about 1959 the usefulness of several consecutive words per node and "multilinked" lists was gradually being discovered by several different groups of people. The first article dealing specifically with this idea was published by D. T. Ross, CACM 4 (1961), 147-150; at that time he used the term "plex" for what has been called a "node" in this chapter, but he subsequently has used the word "plex" in a different sense to denote a class of nodes combined with associated algorithms for their traversal.

Notations for referring to fields within nodes are generally of two kinds: the name of the field either precedes or follows the pointer designation. Thus, while we have written "INFO(P)" in this chapter, some other authors write, for example, "P.INFO ". At the time this chapter was prepared, the two notations seemed to be equally prominent. The notation adopted here has been used in the author's lectures since 1962 and it has also been independently devised by numerous other people, primarily because it is natural to use mathematical functional notation to describe attributes of a node. (For example, see the article by N. Wirth and C. A. R. Hoare, CACM 9 (1966), 413-432.) Note that "INFO(P)" is pronounced "info of P" in conventional mathematical verbalization, just as f(x) is rendered "f of x." The alternative notation P.INFO has less of a natural flavor, since it tends to put the emphasis on P, although it can be read "P's info" the reason INFO(P) seems to be more natural is apparently the fact that P is variable, but INFO has a fixed significance when the notation is employed. By analogy, we could consider a vector A = (A [ l ] , A [ 2 ] , ..., A [ l00 ] ) to be a node having 100 fields named 1, 2,..., 100. Now the second field would be referred to as "2(P)" in our notation, where P points to the vector A; but if we are referring to the jth element of the vector, we find it more natural to write A [ j ] , putting the variable quantity "j" second. Similarly it seems most appropriate to put the variable quantity "P' second in the notation INFO(P).

Perhaps the first people to recognize that the concepts "stack" (last-in-first-out) and "queue" ( first-in-first-out) are important objects of study were cost accountants interested in reducing income tax assessments; for a discussion of the "LIFO" and "FIFO" methods of pricing inventories, see any intermediate accounting textbook, e.g., C. F. and W. J. Schlatter, Cost Accounting (New York: Wiley, 1957), Chapter 7. In 1947 A. M. Turing developed a stack, called Reversion Storage, for use in subroutine linkage (see Section 1.4.5). No doubt simple uses of stacks kept in sequential memory locations were common in computer programming from the earliest days, since a stack is such a simple and natural concept. The programming of stacks in linked form appeared first in IPL, as stated above; the name "stack" stems from IPL terminology (although "pushdown list" was the more official IPL wording), and it was also independently introduced in Europe by E. W. Dijkstra. "Deque" is a term introduced by E. J. Schweppe.

The origin of circular and doubly linked lists is obscure; presumably these ideas occurred naturally to many people. A strong factor in the popularization of these techniques was the existence of general List-processing systems based on them [principally the Knotted List Structures, CACM 5 (1962), 161-165, and Symmetric List Processor, CACM 6 (1963), 524-544, of J. Weizenbaum ] .

Various methods for addressing and traversing multidimensional arrays of information were developed independently by clever programmers since the earliest days of computers, and thus another part of the unpublished computer folklore came into existence. This subject was first surveyed in print by H. Hellerman, CACM 5 (1962), 205-207. See also J. C. Gower, Comp. J. 4 (1962), 280-286.

Tree structures represented explicitly in computer memory were described for the first time in applications to algebraic formula manipulation. The A-1 compiler language, developed by G. M. Hopper in 1951, used arithmetic expressions written in a three-address code; the latter is equivalent to the INFO, LLINK, and RLINK of a binary tree representation. In 1952, H. G. Kahrimanian developed algorithms for differentiating algebraic formulas represented in the A-1 compiler language; see Symposium on automatic Programming (Washington, D.C.: Office of Naval Research, May 1954), 6-14.

Since then, tree structures in various guises have been studied independently by many people in connection with numerous computer applications, but the basic techniques for tree manipulation (not general List manipulation) have seldom appeared in print except in detailed description of particular algorithms. The first general survey was made in connection with a more general study of all data structures by K. E. Iverson and L. R. Johnson [ IBM Corp. research reports RC-390, RC-603, 1961; see Iverson, A Programming Language (New York: Wiley, 1962), Chapter 3 ] . See also G. Salton, CACM 5 (1962), 103-114.

The concept of threaded trees is due to A. J. Perlis and C. Thornton, CACM 3 (1960), 195-204. Their paper also introduced the important idea of traversing trees in various orders, and gave numerous examples of algebraic manipulation algorithms. Unfortunately, this important paper was hastily prepared and it contains many misprints. The threaded lists of Perlis and Thornton actually were only "right-threaded trees" in our terminology; binary trees which are threaded in both directions were independently discovered by A. W. Holt, A Mathematical and Applied Investigation of Tree Structures (Thesis, U. of Pennsylvania, 1963). Postorder and preorder for the nodes of trees were called "normal along order" and "dual along order" by Z. Pawlak, Colloquium on the Foundation of Mathematics, etc. (Tihany, 1962, published by Akademiai Kiado, Budapest, 1965), 227-238. Preorder was called "subtree order" by Iverson and Johnson in the references cited above. Graphical ways to represent the connection between tree structures and corresponding linear notations were described by A. G. Oettinger, Proc. Harvard Symp. on Digital Computers and their Applications (April, 1961), 203-224.

The history of tree structures as mathematical entities, together with a bibliography of the subject, is reviewed in Section 2.3.4.6.

At the time this section was written, the most widespread knowledge about information structures was due to programmers' exposure to List processing systems, which have a very important part in this history. The first widely used system was IPL-V (a descendant of IPL-III, developed late in 1959); IPL-V is an interpretive system in which a programmer learns a machine-like language for List operations. At about the same time, FLPL (a set of FORTRAN sub-routines for List manipulation, also inspired by IPL but using subroutine calls instead of interpretive language) was developed by H. Gelernter and others. A third system, LISP, was designed by J. McCarthy, also in 1959. LISP is quite different from its predecessors: programs for it are expressed in mathematical functional notation combined with "conditional expressions" (see Chapter 8), then converted into a List representation. Many List processing systems have come into existence since then, of which the most prominent historically is J. Weizenbaum's SLIP; this is a set of subroutines for use in FORTRAN programs, operating on doubly linked Lists.

An article by Bobrow and Raphael, CACM 7 (1964), 231-240, may be read as a brief introduction to IPL-V, LISP, and SLIP, and it gives a comparison of these systems. An excellent introduction to LISP has been given by P. M. Woodward and D. P. Jenkins, Comp. J. 4 (1961), 47-53. See also the authors' discussions of their own systems, which are each articles of considerable historical importance: "An introduction to IPL-V" by A. Newell and F. M. Tonge, CACM 3 (1960), 205-211; "A FORTRAN-compiled List Processing Language" by H. Gelernter, J. R. Hansen, and C. L. Gerberich, JACM 7 (1960), 87-101; "Recursive functions of symbolic expressions and their computation by machine, I" by John McCarthy, CACM 3 (1960), 184-195; "Symmetric List Processor" by J. Weizenbaum, CACM 6 (1963), 524-544. The latter article includes a complete description of all of the algorithms used in SLIP. In recent years a number of books about these systems have also been written.

Several string manipulation systems have also appeared; these are primarily concerned with operations on variable-length strings of alphabetic information (looking for occurrences of certain substrings, etc.). Historically, the most important of these have been COMIT (V. H. Yngve, CACM 6 (1963), 83-84) and SNOBOL (D. J. Farber, R. E. Griswold, and I. P. Polonsky, JACM 11 (1964), 21-30). Although string manipulation systems have seen wide use and although they are primarily composed of algorithms such as we have seen in this chapter, they play a comparatively small role in the history of the techniques of information structure representation; users of these systems have largely been unconcerned about the details of the actual internal processes carried on by the computer. For a survey of string manipulation techniques, see S. E. Madnick, CACM 10 (1967), 420-424.

The IPL-V and FLPL systems for List-processing did not use either a garbage collection or a reference count technique for the problem of shared Lists; instead, each List was "owned" by one List and "borrowed" by all other Lists which referred to it, and a List was erased when its "owner" allowed it to be. Hence, the programmer was enjoined to make sure no List was still borrowing any Lists that were being erased. The reference counter technique for Lists was introduced by G. E. Collins, CACM 3 (1960), 655-657; see also the important sequel to this paper, CACM 9 (1966), 578-588. Garbage collection was first described in McCarthy's article cited above; see also CACM 7 (1964), 38, and an article by Cohen and Trilling, BIT 7 (1967), 22-30.

Unfortunately, too few people have realized that manipulation of links is not a magic, complex process that must be done by fancy List manipulation routines. There is still a widespread tendency to feel that the advantages of link manipulation can be obtained only by using a large List-processing system; actually, as we have seen in many examples throughout this chapter, it is quite easy to design linking algorithms which operate more efficiently than a general List-processing subroutine could possibly do. The author hopes that these detailed examples will help to convey a better-balanced attitude towards the use of structural links in data.

Dynamic storage allocation algorithms were in use several years before published information about them appeared. A very readable discussion is given by W. T. Comfort, CACM 7 (1964), 357-362 (an article written in 1961). The "boundary-tag' method, introduced in Section 2.5, was designed by the author in 1962 for use in a control program for the B5000 computer. The "buddy system" was first used by H. Markowitz in connection with the SIMSCRIPT programming system in 1963, and it was independently discovered and published by K. Knowlton, CACM 8 (1965), 623-625; see also CACM 9 (1966), 616-625. For further discussion of dynamic storage allocation, see the articles by Iliffe and Jodeit, Comp. J. 5 (1962), 200-209; Bailey, Barnett, and Burleson, CACM 7 (1964), 339-346; Berztiss, CACM 8 (1965), 512-513; and D. T. Ross, CACM 10 (1967), 481-492.

A general discussion of information structures and their relation to programming has been given by Mary d'Imperio, "Data Structures and their Representation in Storage," Annual Review of Automatic Programming 5 (Oxford: Pergamon Press), in press. This paper is also a valuable guide to the history of the topic, since it includes a detailed analysis of the structures used in connection with twelve List processing and string manipulation systems. See also the proceedings of two symposia, CACM 3 (1960), 183-234 and CACM 9 (1966), 567-643, for further historical details. (Several of the individual papers from these proceedings have already been cited above.)

An excellent annotated bibliography, which is primarily oriented towards applications to symbol manipulation and algebraic formula manipulation but which has numerous connections with the material of this chapter, has been compiled by Jean E. Sammet, Comput. Rev. 7 (July-August 1966), B1-B31.

in Symbolic Languages in Data Processing, in the Proceedings of the Symposium organized and edited by the International Computation Centre, Rome, Italy, March 2631, 1962, Gordon and Beech Science Publishers, 1962. view details