ABSET(ID:356/abs003)Set-based languageU Aberdeen. Early declarative language. Places Related languages
References: This paper deals mainly with the treatment of declarative material in ABSET. The paper is in two parts. Part 1 presents the motivation for the fundamental ideas and their integration into the total design. Part 2 presents examples of the interactive use of ABSET, some simply to give an idea of the complete range of the declarative part of ABSET, others explicitly chosen to indicate a particular contribution of each of the fundamental ideas to the achievement of the design aims. A large part of ABSET has been implemented and all the major examples run, though with some minor syntactic differences from the versions given here. Abstract: Introduction In designing ABSET we have tried to solve what seemed to us to be the fundamental problems of programming. We felt that it was the lack of a clear formal organization of the very basis of programming and the bad style resulting from this that makes the management of large projects so difficult. ABSET is therefore not specially adapted to list processing, or to matrix multiplication or text processing, and so on, though such features could be introduced in it. We have concentrated on more primitive ideas: such notions as repetition, functions, the different kinds of equality, representations, the distinction between an ordering of decisions and an order of evaluation, and the manipulation of partly-evaluated program. The treatment of the first four of these notions is founded on sets and it is from this that AB SET takes its name. Our ideas here have grown out of earlier work reported in this Workshop series, and it is appropriate, as part of the motivation of ABSET, to sketch this earlier work. Murray and Elcock (1968) describe an investigation of automatic improvement of a board-game (Go-Moku) playing program. The improvement was by program generation of generalized descriptions of winning board situations from program analysis of particular instances encountered in play against a human opponent. This work emphasized the need for more general solutions to problems of description and recognition of complex but well-structured data. At this time collaboration with J.M.Foster led to a considerable widening of the original interest in program description of data. It was clear that in many problems, instead of being constrained to writing sequences of imperative statements to construct particular data, it would be advantageous and certainly nearer mathematical practice to be able simply to assert things about the structure of data. In particular, if data exists which satisfies all the assertions made about them, then these data are independent of the ordering of any operations which may be performed to construct them. We felt that it should not be necessary, at least at certain levels of program, to be concerned with such orderings. These ideas were first reported in Elcock (1968) and Foster (1968): they represent a first approach to the careful distinction that is made in ABSET between an ordering of decisions and an order of evaluation. Further developments of these ideas led to the design and implementation of an on-line incremental compiler for assertions called ABSYS (standing for ABerdeen SYStem). Part of this work was reported in Foster and Elcock (1969). ABSYS uses a completely declarative language with a primitive equality relation and the ability to compound assertions using and and or connectives. A written assertional program in ABSYS places no explicit constraints on the order in which particular evaluations might be performed. The on-line system is incremental in that whatever processing can be done on the basis of data already present is done. As far as we are aware, Foster and Elcock (1969) was the first published account of an implemented general-purpose interactive language to explore this lack of concern with control. Later papers by Nevins (1970) and Fikes (1970) make contributions to this area. It is particularly interesting that Nevins, with different motivation from our own, has developed a similar treatment of the elementary equality relation, particularly in its pattern-matching aspects, and of the primitive connectives and and or. Although an interesting language and one which gave us some moments of great pleasure, ABSYS, even accepting the experimental limitation of being totally declarative, has a number of deficiencies. Not least is an awkward treatment of the and and or connectives which inhibited, among other things, good design of set operations in the language. At the same Workshop, Machine Intelligence 4, at which the paper on ABSYS was presented, Robinson (1969) gave his paper on the mechanization of higher-order logic. We must acknowledge the influence of this paper on our subsequent work. It confirmed our dissatisfaction with our treatment of propositional connectives in ABSYS and was a contributory stimulus to a re-examination of the primitive notions of ABSYS. As already mentioned, this re-examination led to a further broadening of our language objectives and to the focusing of our attention on the elementary notions of value, repetition, equality, representation, the distinction between ordering of decisions and an order of evaluation, and, not least, the manipulation of partly-evaluated program. Extract: ABSET - Intro The Philosophy Behind ABSET We describe below some of the things we have borne in mind in producing ABSET. TWO themes run through them: that it should be clear what a program says, and that the language should not force the .programmer to commit himself to decisions he would prefer to postpone. We do not suggest that ABSET is perfect in these respects. A simple structure for the language It is important to be able to describe the language in as simple a way as possible, not only for the purpose of teaching it, but also because this tends to simplify the implementation. We have tried to make it possible to explain the language as follows. First, we introduce a few fundamental ideas, such as the purpose of a compiler, the nature of a function and the idea of evaluation. Second, we say that syntax is of secondary importance. There is, of course, a primitive syntax but, beyond this, every syntactic construction for an expression can be described in terms of the application of named available functions to parameters. Third, we describe the significance of these functions. The grouping of decisions in a program We would like to be able to choose any structure for a program: that is, we would like to be able to group the microscopic decisions of which a program consists in any way we please. We include here decisions about representations of data and functions. We particularly want to distinguish between the order of evaluation and the ordering of statements in program text. We do not want to be forced by the language to take compound decisions, though of course we want to be able to make such constructions when convenient. We think these features are important for several reasons. First, good program organization demands them: it should be possible to write program at different levels of abstraction. For example, it should be possible to indicate that an object is a function from integers to reals without deciding whether it is to be represented by a procedure or an array. It should be possible to indicate that an object is an array without deciding how big it is, or to indicate that it is a procedure without specifying the details of the corresponding algorithm. It should be possible to say those things before or after using the object. As a good illustration of the difficulties created by ignoring them, consider the writing of library procedures in ALGOL 68, for which see section 0.1.44 of Wijngaarden et al. (1969). This can be done only if the decision about all the modes of the parameters is made at the time of definition of the library procedure, so that, for example, a general procedure to sort pairs of integers and references cannot be written because we must decide, too early, exactly what kind of objects the references are to. Second, we need them for descriptive programming. In many cases we would be content to write merely a specification of the required constraints on our data, and leave the problem of satisfying these constraints to the system. That is, we would like to leave out the information about the order of evaluation (an extreme version of this would be to write our descriptions in the predicate calculus, but we do not feel that the strategies of current theorem provers are good enough to make this a practical approach at present). Third, they help in making programs easier to manipulate, in compilation, in proofs of their properties, and in turning programs, or partially evaluated programs, back into text. Fourth, they are most important for interactive use of the language, both because the order of taking decisions cannot be thought out in advance and also because program can be partly evaluated as the programmer proceeds, without waiting for him to complete his definitions. The separation of the statement of the problem from the problem solver All interesting programs are finite presentations of a potentially-infinite set of evaluations. The two notions of the finite program, and the possible evaluation of it by an interpreter or problem solver, are qixite distinct. We wish to keep this distinction clear, both for the sake of the programmer and because we may wish to have different interpreters (we call them sequencers to emphasize a different aspect of them) in different circumstances. To obtain this feature in full we require that the representation of a partly-evaluated program produced by a particular sequencer shall be acceptable data to other sequencers (see the section below on availability of the data structures used for evaluation). The availability of the data structures used for evaluation It is very convenient to be able to manipulate partly-evaluated program. So far the only significant use that we have made of this feature has been in the printing of the constraints on objects in a partly-evaluated state (see the section on generating text in ABSET and the associated examples). However, we intend to base much of our future work on this feature. To take an example, consider a differentiation program in a conventional language. This will usually be written to take in a formula in text form,' translate it to some internal form suitable for the program, carry out the differentiation, translate back into text, and print out the result. In ASSET the formula could be the result of partial evaluation and the result or intermediate results could be further evaluated at any time. The powerful features of the evaluation mechanism are always available for working 'on the side'. Repetition and sets We decided that the elementary ideas of sets would form the best basis for a programming language. Many things we want to do in programs fall naturally into this terminology, and we regard it as a good foundation for ways of achieving repetition. Rather than use go to statements or recursion, we prefer to use repetition formulae, such as 'this is true for all members of this set' and the notion of image sets. Quite apart from their clarity, such constructions have the advantage of giving rise to structures more amenable to partial evaluation and further processing. Representations, equality, and sets It is an important aid to clarity in a program to know what is representing what. The process of programming consists of representing certain sets of objects and the functions between them by means of 'lower level' sets and functions. Much confusion in programming languages comes from not explicitly recognizing this fact. Languages (like ALGOL 68) provide basic sets and functions which the programmer may use to represent his problem, but no declarative way of indicating the representations at any level higher than this. We have provided three features intended to be helpful in these respects. First, we can define sets in a general way, for example, as those items satisfying a predicate (see the section on sets below). Second, we insist that the domain and range of every function is stated in terms of such sets and that functions are total, that is, defined everywhere in the domain (of course we do not insist that this information is given at any particular time in relation to the definition of the function). Third, we attempt to treat the notion of equality carefully. For example, sometimes we might use a list to represent a set. When we are considering the functions on sets we have one notion of equality but when we are thinking of the object as a list, in order to represent the set operations, we need a different equality. We have therefore chosen a notation which brings out the fact that equality is an equivalence relation, and which allows us to have any number of such equivalences. Furthermore, we have taken the same care over the equality implied by the parameter passing mechanism (see the section on function application below). Constructors and projectors We prefer to use constructors in programming, rather than projectors, wherever possible and appropriate. Thus A = [B, C, D];E = [D, C, B]; is clearer than HD(E) = HD(TL(TL(A))); HD(TL(E)) = HD(TL(A)); HD(TL(TL(E)))=HD(A); Extract: Informal Description Of ABSET Informal Description Of ABSET In this section we follow the method of language description referred to above, that is, we explain the purpose of the compiler, the nature of functions, and the idea of evaluation, and we briefly describe part of the syntax. In this paper we do not explain in detail the primitive functions of the language, but in the section of examples some of the functions are illustrated. Program text, the compiler, and program evaluation A piece of program text consists of expressions and statements. Expressions are constants or objects. Objects are identifiers or functions applied to expres sions (see below). So 1 A F(A) A + B A = B are expressions. Note that infixed operators, .and other syntactic constructions not illustrated, stand for functions applied to expressions. Since the text of a program is much easier to manipulate on paper than any other representation of the data in the computer, we shall pretend that there is a notional interpreter of text. Statements are certain expressions of the program which we insist must denote the constant TRUE. This is done by placing a semi-colon after them which gives this information to the interpreter. It is the job of this interpreter to try to find constants to associate with the objects in such a way as to associate the statements with TRUE and to be consistent with the definition of the primitive functions. For example, A+B=3ANDA=1 is an expression and the interpreter is not applicable to it, but A + B = 3 AND A=1; is a statement and is interpreted as meaning that the whole expression is associated with TRUE, therefore A + B = 3 is associated with TRUE and so is A=l by definition of AND, therefore A + B is associated with 3 and A with 1 by definition of =, and therefore B is associated with 2 by definition of +. The classification of constants in ABSET includes primitive type sets such., as integers, reals, booleans, strings, quoted identifiers, code entries, and references. Other types can be constructed as described in the section on sets. The compiler translates text of statements into a representation in the computer and applies a representation of the interpreter (a sequencer plus the function definitions) to this to construct the values of the expressions (that is, the representations of the constants associated with them). The action of the interpreter in the current implementation of ABSET is illustrated in the section on annotated examples of ABSET programs. Functions, their application to expressions and the idea of equality A function is a rule for associating one value with another. A function applied to a value is an expression standing for the associated value. Thus F(A) is an expression which must be associated by the interpretation with the value belonging to A in F. The argument and result of a function are restricted to lie in certain sets, the domain and range of the function. These must be specified as part of the definition of every function. In ABSET we insist that all functions are total, that is, they are defined everywhere in their domain. The application of a function to an argument not in its domain is an error, and when ABSET detects this it prints an error message. In ABSET all functions are single valued, that is, all evaluations of F(X) yield equal results. This raises the interesting question of what interpretation of equality is used in this notion of single-valuedness and indeed the general question of what we mean by equality. Equality is a concept which has caused difficulty in various programming languages (including ABSYS) and we consider that a careful treatment of it is very important. Much of the difficulty arises from having just one kind of equality: in ABSET we can have any number. By an equality we mean an equivalence relation on values which has the substitution property in some class of functions (that is, the equality of x and Y implies the equality of F(X) and F.(Y) if F is one of these functions). The point of difficulty is the choice of the class of functions. If we allow ourselves only one equality then either we have substitution over the whole class of ABSET functions, or over some fixed arbitrary subset of it, neither of which is satisfactory. Let us consider this point in more detail. The process of programming consists of representing the values and functions of our problem in terms of the constructs available in the programming language. For example, if we are concerned with a problem in syntax analysis we first represent it in terms of concepts such as syntactic production and substitution. Then we represent these in terms of list-processing concepts such as CONS and NIL and finally these are represented by machine operations on addresses and numbers. If equality is to have the substitution property in all these functions we are in trouble. For two lists can be equal only if they are indistinguishable to all functions, including the machine ones. Likewise syntax productions, represented by lists, can be equal only if the lists are equal. This intrusion of low-level concepts (for example, of machine address) into material really concerned with lists (that is, ordered sets) leads to unnatural requirements on the programmer. In ABSET we allow many equalities by associating them with sets. When a new set is introduced an equivalence relation is defined for it. When we write A=B AS FRACTIONS; we mean that A and B are equivalent according to the equivalence relation belonging to the set FRACTIONS. A default equivalence relation is used if we write A=B; Now whenever we define a function over a set we imply that the equivalence relation of that set has the substitution property in the function. Let us return now to the single-valuedness of functions. Clearly this means that the evaluation of F(X) and F(Y) where x and Y are equivalent in the domain set of F yields results which are equivalent in the range set of F. Some further comments on the equality of formal and actual parameters of compound functions is made in the section on the syntax of ABSET. Breadth-first evaluation Because a program is a presentation of an infinite set of possible operations of evaluation, there is a danger that if a bad order of evaluation is chosen it may be impossible to find a solution which could be found by some other order. We are not considering here the difficulties of the restricted amount of time and storage, but the possibility of the sequencer involving itself in an infinite set of operations before reaching a crucial one. That is, unless we are careful the sequencer may not be complete. To avoid this we guarantee that every operation of evaluation implied by the program will be treated within a finite number of steps. As an example of this we can consider the enumeration of the elements of the Cartesian product of two infinite sets. We must use a Cantorean breadth-first enumeration, for example, (11, 21, 12, 31, 2 2, 13...) and not the easier square enumeration (1 1, 1 2, 1 3, 1 4, 1 5 . . .). The ABSET primitive sequencer We call the unit of processing handled by the ABSET primitive sequencer a job. A job consists of a reference (of unique type) to a data cell for the job. The data cell is a list cell of references to the objects to be processed by the machine code associated with the job. The data cell also contains a reference to a unique 'job-descriptor-cell' which contains information common to all activations of the job including the address of the entry to the machine code associated with the job. An object referenced from the data cell of a job may be of type 'novalue' with the significance that the ABSET interpreter has not yet inferred a value for that object. An object of type 'novalue' itself contains a reference to a list of jobs (possibly the nil list). This list is extendable: we call the process of adding a job to it 'attachment' of the job to the object. A function in ABSET is represented by a job-descriptor-cell. As well as the code entry this descriptor cell contains information such as the number of parameters of the function and the size of the data cell of the described job (the code for the job may require own workspace). This information is used by the compiler to implement function application by the creation of a job whose data cell refers to the parameters of the function and the job-descriptor-cell of the function. In the section 'a simple structure for the language' we stressed that every syntactic construction for an expression can be described in terms of the application of named available functions to parameters. We can now complete the description of both the compiler and of the process of evaluation. When the compiler processes text, for each expression processed the compiler applies the relevant functions to their parameters thereby creating jobs, and it also attaches each job to those of its parameters which at that moment are of type 'novalue'. Each job is added to the primitive sequencer's list of waiting jobs. Essentially then, an object of type 'novalue' is a reference to a list of jobs each of which references that object (among others). The syntactic constructions NEW and ' are such that application of their associated functions creates new 'novalue' objects. When the sequencer is called it removes the first job from its list of waiting jobs, places the job reference in a globally-named location, and jumps on the job entry. The code for the job is therefore written on the basis that, when the job is active, the data cell of the job is referenced from this global location. Typically the code for a job is written so that, when entered, the set of parameter references in the data cell of the job is examined. There are three possibilities: (1) the job is unconstrained: that is, there are not enough parameters with values to allow any inferences to be made: the job terminates by effectively returning control to the primitive ABSET sequencer; (2) the job is partly constrained: that is, there are sufficient parameters with values for the value of the remaining parameters of type 'novalue' to be inferred. The giving of these inferred values to the remaining parameters uses a mechanism which, at the same time as giving the parameter object the inferred value, adds the list of jobs associated with the object to the sequencer's list of waiting jobs. Control is again finally returned to the sequencer; (3) the job is totally constrained: that is, the set of parameters with values is such that it is simply a question of determining whether the constraints over them are satisfied or not. If the constraints are satisfied, control is simply returned to the primitive sequencer. If they are not satisfied, termination is by an error exit which arranges for appropriate action to be taken (for example, the setting up of jobs to take diagnostic action) before again returning control to the primitive sequencer. A final remark: the compiler is itself a job. When active it has the good manners to 'give way' at appropriate moments: that is, it unilaterally adds itself to the (end of) the sequencer's list of waiting jobs and then returns control to the sequencer. Some notes on the syntax of A B S ET In the section on 'a simple structure for the language' we said that syntax was of secondary importance: it should be stressed that this was in the context of explaining the language. Good syntactic constructions can however contribute a great deal to the use of the language, to clarity and good style. We have tried to provide good constructions in ABSET. The syntax of ABSET is mostly conventional. The constants in the various basic sets are written in the usual way (for example, 1 or TRUE or 'JOHN')- ABSET identifiers are (1) a sequence of letters, digits and hyphens beginning with a letter; (2) a sequence of digits; (3) any single character from w , ;[]£"(). (4) any sequence of characters chosen from the remaining teletype symbols. An identifier is terminated by space or newline or by any character from another group. Group (1) is used in ABSET for names of user objects. Group (2) identifiers are interpreted as integers. Group (3) are reserved identifiers in ABSET. Group (4) is used in ABSET for user objects: typically infix operators. Certain combinations from groups (1) and (4) have particular meanings in ABSET. New identifiers are introduced by priming them or by a NEW statement {see section on 'a general demonstration of ABSET'). The textual scope of identifiers is controlled by BEGIN and END in a conventional way except that the END construction allows promotion of identifiers out to the next enclosing scope level {see, for example, section on 'text in ABSET: illustrated examples'). Compound functions are introduced in a way similar to the lambda-calculus. For example, INC' ISIN INTEGERS^-INTEGERS; INC = MAP A' TO A + l; and INC' = MAP A': INTEGERS TO A+l: INTEGERS; and INC'=MAP A' ?1 TO A; INC ISIN INTEGERS->INTEGERS; and INC'=MAP A': INTEGERS TO B': INTEGERS WHERE B = A+1; all introduce equivalent functions. The primes on A and B indicate that the scope is the function. The infix WHERE insists to the interpreter that the following expression must denote TRUE. Note that the interpretation of INC when applied is that the local objects and constraints are copied and the actual parameters asserted to be equal to the formal parameters according to the equivalence relations of the appropriate sets. Thus INC' = MAP A'-1: INTEGERS TO A: INTEGERS; X' = INC(Y'); is equivalent to NEW x', Y'; BEGIN Y=A'-1 AS INTEGERS; X = A AS INTEGERS; END; TEXT IN ABSET Two main ideas underlie the conversion of partly-evaluated program into text. First, it seemed attractive to have a single language for communication between programmer and machine and between machine and programmer: that is, it should be possible to take partly-evaluated program and re-express it as valid ABSET text, which, given a suitable state of the system, could be read as a program by this system. Second, in an interactive programming system the user does not want to have to sit through lengthy printouts of the text representing a large structure in complete detail when he is likely to be interested only in a small part of the total structure. On the other hand, there will be times when he does want all the available information about the structure. What is required is the ability to construct text representation of structures at controllable levels of detail. The simplest way in which the text representing a structure can be presented at controllable levels of detail is to represent substructures by identifiers which are paired with these substructures in some way. An ABSET dictionary is a set of pairs where each pair consists of an identifier and a value. There can be only one occurrence of an identifier in a dictionary, so a dictionary can be thought of as a mapping from identifiers to values. (Not a one-one mapping, as two different identifiers can be paired with the same value.) An environment consists of a list of dictionaries and so also represents such a mapping if the convention is adopted that the first occurrence of an identifier in a dictionary of an environment is the one that indicates which value is to be paired with that identifier in the environment. An 'inverse' mapping from values to identifiers can be obtained from an environment if, given a value paired with at least one identifier in the environment, an arbitrary choice is made from the identifiers paired with that value in the environment. The primitive text mechanisms provided in ABSET require as input not only the object to be texted, but also an environment with respect to which the structure is to be texted. Any substructures which are in the domain of the 'inverse' mapping of the environment will be represented by their associated identifiers in the text produced. The level of detail in the text is controlled by the extent to which the substructures of the object are named in the environment supplied to the primitive text mechanisms. Of course, in the case of simple objects, such as integers, the environment will not be used in the texting process. Further facilities are provided whereby a structure can be scanned, prior to creating its text representation, and names invented for parts of the structure which are not named in the input environment, these names then also being used in' texting the structure. This creates a text representation of the structure containing less detailed information about the structure than would otherwise have been the case. The names invented have to be paired with their new values in a fresh dictionary, and that dictionary added to the environment input to the text mechanism, so that if further detailed texting of part of the structure is required, the values paired with the names used in the text representation can be obtained from this new environment and the appropriate ones texted to a further level of detail. A number of examples illustrating the ABSET text facility are included in the set of annotated examples. Sets in ABSET In this section we describe the defining properties of sets, some primitive sets, and some of the ways in which new sets can be formed. Other illustrations can be found in the examples. We refer also to the question of the definition of the domains of functions of more than one argument. Sets are defined as the domain of the following functions. (a) IN, which gives a function in UNIVERSE->-BOOLBANS. This is the usual set-membership predicate for the set. (b) EQINSET, which gives the equivalence relation over the set referred to previously. (c) SITEM, which gives an item of the set unless the set is empty, in which case it gives an arbitrary item (cf. Hilbert's epsilon operator). (d) SREM, which gives a set which is the same as the original except that it does not contain the item selected by SITEM. Thus, for example, IN(SET1, ITEM) is a truth value. We can also write this with an infixed operator asiTEMiSiNSETl. There are a number of (mostly infinite) primitive sets, such as INTEGERS, SETS, UNIVERSE, and BOOLEANS. There are many functions which yield sets as results. They all work by specifying the results of the four projectors listed above and so they are not primitive, but can be defined using MAP. TO assist in this we mention the function MAKESET which yields a set. Its four arguments are the results of the four projectors, except that instead of giving the result of SREM we give a function which, when applied to the value of SITEM, gives SREM (this is to avoid constructing the set explicitly). We give below the definition of one of these functions and describe some others formally. (a) The definition ofUPTOF (which has the infixed operator UPTO). UPTOF=MAP LOW': INTEGERS, HIGH': INTEGERS TO MAKESET(BETWEEN(LOW, HIGH), EQF, LOW, MAP LOW'TO LOW+1 UPTO HIGH); (b) The function CUPF (infixed operator CUP) gives the union of two sets. (c) The function POWF (infixed operator-*) gives the set of functions from one set to another. (d) The function THOSEF gives the set of all items of a given set which satisfy a given predicate. There is a special syntactic construction for this related to MAP, for example, THOSE X': 1 UPTO 1000 SATISFYING PRIME(X) . It frequently happens that the domain of a function of more than one variable is not a simple Cartesian product. Since we insist that functions should be total we may have to use slightly more complex constructions than we have illustrated before. For example, consider the function APPLY. APPLY = MAP F', X' TO F(X); The domain of this function is not simply the product of FUNCTIONS and UNIVERSE, since when we have chosen a function it is only applicable to items in its domain. We can solve this in two ways. The first is to write, for example, APPLY' = MAP F', X' TO F(X) APPLY ISIN (THOSE P' & Q': FUNCTIONS X UNIVERSE SATISFYING Q ISIN DOMAIN(P))->UNIVERSE; The second is to write APPLY = MAP F': FUNCTIONS TO (MAP X': DOMAIN(F) TO F(X): RANGE(F)): FUNCTIONS; Since a function of more than one variable is exactly equivalent to a function of one variable (the first) of which the result is a function from the remaining variable to the result, we allow the following with exactly equivalent effect. APPLY = MAP F': FUNCTIONS, x': DOMAIN(F) TO F(X): RANGE(F); ILLUSTRATIVE EXAMPLES OF THE INTERACTIVE USE OF ABSET The first of the examples below, 'a general demonstration of ABSET', is presented to give an idea of the range of the declarative part of ABSET. Although the example begins with the comment that not all the facilities demonstrated are available, it should be emphasized that the only major feature missing in the current implementation is the 'AS' construction discussed in the section on 'functions and their application to expressions'. The remainder of the examples have been chosen to illustrate particular points made in the earlier part of the paper. In all the examples we have indicated material typed by the programmer in upright type and material typed by the machine in italic type. The 'end of message' symbol used by the programmer in interactive mode is a non-printing character. It can be used by the programmer anywhere in the text he is typing. The examples can therefore be read as if the material being typed by the programmer was being interpreted at word level. Extract: Conclusions Conclusions The work described here concerns the declarative part of ABSET. We consider that this is deficient in two main respects and it is in these areas that we are attempting improvements. First, although we can write new sequencers we shall need further new primitives and new syntactic constructions to make this convenient. In particular we need to treat assignment, reference, and storage allocation. Secondly, we need to be able to recognize when more sophisticated sequencers are needed. Using a complex .sequencer (for example, keeping records for back-tracking) where a simple one would suffice can be very inefficient. However, we need to be able to call an appropriate complex sequencer into play. This appears a difficult recognition process at present. in "Machine Intelligence 6", Meltzer, Bernard and Michie, Donald (eds) Edinburgh University Press, 1971 view details Introduction This paper gives a brief description of the ABSET (Elcock, Foster, Gray, McGregor, and Murray, 1971) sequencer and operating system. The motivation is not to give a complete description but rather to expose certain general principles which could have application in the design of other operating systems and, for the particular language ABSET and its embedding operating system, to expose the uniformity of concept and implementation of the basic unit of processing at all levels of operation. Earlier papers (Foster and Elcock, 1969 ; Elcock et al., 1971 and Elcock, 1971) have described programming languages ABSYS and ABSET, both the work of the Computer Research Group in Aberdeen, which are mainly declarative in content and in which much less information is given than is usual about the order in which individual operations are to be carried out. The basic statements of these languages are not instructions to do something as in ALGOL, but assertions about the data. A program statement asserts a relation about data objects. Compound assertions can be constructed by the use of and and or connectives with properties similar to their logical counterparts. A written assertional program places no explicit constraints on the order in which particular operations are performed. Compilers for such languages must try to arrange that the assertions about data lead to the construction of data satisfying all the assertions made about them. A feature of the implementation is what was referred to in Foster et al. (1969) as 'data directed control' of jobs arising from the application of functions to parameters. This control structure was conceived with the design objectives of the language in mind. It turned out, however, that these same general principles were very relevant to the design of the total system in which the language was embedded. As indicated in the first paragraph, this paper is written in the expectation that the principles of this kind of control structure could be applicable to the design of operating systems in other environments. In some form or other the notion of data directed control is present in all operating systems. Consider a system with a multiprogramming capability. The activation of a particular program may have to be delayed because an input file is not available. It might be desirable to suspend a currently active program awaiting a peripheral transfer. Activation of a set of programs might have to be ordered because the control descriptions of certain of them specify as input files the output files of other programs of the set, etc. More complex examples of cooperating processes together with system designs for handling them are to be found in Dijkstra (1968), Feldman and Sproull (1971) and the PL/1 Reference Manual. In such environments there is much to be gained from a simple uniform mechanism which provides for dynamic data directed control of the activation and suspension of processes. The rest of the paper presents the ABSET mechanism. It depends on: 1. A uniform particular representation of a process. 2. A primitive sequencer for activating the next process of a list of waiting processes and to which each activated process eventually returns control. 3. A uniform particular representation of an as yet unevaluated datum which allows processes to be associated with (referenced from) the unevaluated datum. 4. A uniform mechanism for giving a value to a datum which has the additional effect of returning all processes associated with the datum to the sequencer's list of waiting processes. 5. An appropriate dynamic storage allocation scheme and garbage collector. The above makes possible a very simple regime for writing processes in which an active process can test whether or not a required datum has been evaluated and, if not, unilaterally associate itself with the datum and suspend its activity, by a return of control to the primitive sequencer, possibly but not necessarily after itself initiating a separate process to obtain a value for the required datum. The system design ensures that this process will be reactivated if the datum is subsequently given a value by some other process. All these points are elaborated in Section 2 below. Section 3 shows how the mechanism is used to perform certain conventional tasks of an operating system including the handling of interrupts and peripheral transfers. Extract: Jobs and the ABSET sequencer Jobs and the ABSET sequencer This section should be regarded as a contextual unit. It is suggested that the reader skim through it to obtain an overall view, re-reading to relate specific details to the whole. We call the unit of processing handled by the ABSET sequencer a job. A job consists of a reference (of unique type) to a data cell for the job. The data cell is a list cell of references to objects to be processed by the machine code associated with the job. The data cell also contains a reference to a unique 'job-descriptor-cell' which contains information common to all activations of the job including the address of the entry to the machine code associated with the job. An object referenced from the data cell of a job may be of unique type 'no-value'. An object of type 'no-value' itself contains a reference to a list of jobs (possibly the nil list). This list is extendable: we call the process of adding a job to it 'attachment' of the job to the object. The motivation for this structure for jobs and the action of the ABSET sequencer is perhaps best introduced by a digression to describe the ABSET compiler and the process of program evaluation. A function in ABSET is represented by a job-descriptor-cell. As well as the code entry this descriptor cell contains information such as the number of parameters of the function and the size of the data cell of the described job (the code for the job may require own workspace). This information is used by the compiler to implement function application by the creation of a job whose data cell refers to the parameters of the job and the job-descriptor-cell for the function. [...] Every syntactic construction in ABSET can be described in terms of application of named functions to parameters. When the compiler processes text, for each expression processed the compiler applies the relevant functions to their parameters by creating jobs with appropriate data cells. It also attaches each job so created to those of its parameters which are currently of type 'no-value'. Each job is added to the primitive sequencer's list of waiting jobs. Essentially then, an object of type 'no-value' is a reference to a list of jobs each of which reference that object (among others). The ABSET syntactic constructions NEW and ' are such that the application of their associated functions introduces new 'no-value' objects. [...] When the sequencer is called it removes the first job from its list of waiting jobs, places the job reference in a globally named location and jumps on the job entry. Code for the job is therefore written on the basis that, when the job is active, the data cell of the job is referenced from this global location. Typically the code for the job is written so that, when entered, the set of parameter references in the data cell of the job are examined. If there are not enough parameters with values to allow further inference then the job simply terminates by returning control to the sequencer. If there are sufficient parameters with values for the remaining parameters to be inferred then these are inferred: the giving of these inferred values to the relevant parameters referenced from the data cell of the jobs uses a mechanism which, descriptor cell 1. If the referenced parameter is of type 'no-value' then as well as assigning the parameter object the inferred value, adds the list of jobs currently associated with the object to the sequencel's list of waiting jobs. 2. If the referenced parameter already has a value different from the current inferred value then the job is terminated by an error exit which arranges for appropriate action to be taken (e.g. the setting up of a job to take diagnostic action). In either case control is eventually returned to the primitive sequencer. It is also arranged that termination of a completed job inhibits an) other activations of the job. As an illustrative example a flowchart for the code of the ABSET function ANDF is given in Fig. 3. It should be noted: 1. The instruction 'inhibit the current job' is implemented by clearing the one-cell containing the reference (of type function) to the current job. Since in activating a new job the sequencer looks for and ignores such empty one-cells, and since the one-cell was common to all references to the particular job, this simple device has the desired effect. 2. Instructions such as 'infer parameter 1 true' and 'infer result = parameter 2' are points at which the job might terminate by the error exit since values might already be known for the relevant parameters or result which make the relevant inference false. Alternatively they might lead to the release of further jobs to the sequencer as indicated in the general account above. [...] Finally it should be mentioned that in the interests of exposition the above description has been over-simplified. In particular the sequencer operates, not on a simple list, but on a list of lists of jobs. Jobs are released to the appropriate list as designated by the badness of the job in the job descriptor cell. The compiler itself is simply a job of badness worse than those considered above. The compiler typically adds itself to the sequencer's list of jobs after processing each statement and returns control to the sequencer. It is automatically reactivated when all jobs of lower badness have been processed. in The Computer Journal 15(2) 1972 view details Extract: ABSET One of the most significant of the more general languages with this approach is the interactive language ABSET being developed at the Computer Research Group at the University of Aberdeen, Scotland. (See Elcock, 1971.) A key element in their language is the use of sets in which the user is allowed to say "this is true for all members of this set" and the notions of TRUE and FALSE are appropriately applied. A simple illustration of the application of the latter together with a deduction from the stated assertions is the following: A + B = 3 AND A = l; From this the interpreter deduces that B = 2 since the meaning of AND requires that the two halves of the first statement are both true. Results 1 - 7 of 7 Relevance scale in Proceedings of the ACM SIGPLAN symposium on Very high level languages, March 28-29, 1974, Santa Monica, California, United States view details The Kernel ThingLab System The kernel ThingLab system consists of a Smalltalk extension, written by the present author, that is used in all ThingLab simulations. Embedded in this program is knowledge about such things as inheritance hierarchies, part-whole relations, and constraint satisfaction techniques. The kernel system doesn?t have any knowledge about specific domains in which ThingLab can be used, such as geometry or electrical circuits. Rather, it provides tools that make it easy to construct objects that contain such knowledge. Another goal in constructing the system, besides the exploration of language design as described above, was to investigate computer-based tools for use in education. For example, a ThingLabstyle system might prove valuable as part of a geometry curriculum, or as an adjunct to a physics laboratory. With this in mind, it is anticipated that there would be two sorts of users of the system. The first sort of user would employ ThingLab to construct a set of building blocks for a given domain. For example, for user in simulating electrical circuits, such a user would construct definitions of basic parts such as resistors, batteries, wires and meters. The second sort of user could then employ these building blocks to construct and explore particular simulations. The knowledge and skills required by these two kinds of users would be different. The first kind of user would need to now about message passing the constraint specification domain (e.g.Ohm?s Law). The second kind of user, on the other hand, could deal with the system using only simple interactive graphics techniques, such as selecting items in a menu or moving pictures around on the screen. Thus, this sort of user wouldn?t need to be familiar with either the details of ThingLab, or with the domain-specific theory behind the simulation (although one of the objectives of a curriculum might be for such a user to acquire this domain-specific knowledge). Extract: Constraint Representation and Satisfaction Constraint Representation and Satisfaction Representation of Constraints The representation of constraints reflects their dual nature as both descriptions and commands. Constraints in ThingLab are represented as a rule and a set of methods that can be invoked to satisfy the constraint. The rule is used by the system to construct a procedural test for checking whether or not the constraint is satisfied, and to construct an error expression that indicates how well the constraint is satisfied. The methods describe alternate ways of satisfying the constraint; if any one of the methods is invoked, the constraint will be satisfied. Merges An important special case of a constraint is a merge. When several parts are merged, they are constrained to be identical. For efficiency, they are usually replaced by a single part rather than being kept as several separate objects. The owner of the parts keeps a symbolic representation of the merge for use in constraint satisfaction, as well as for reconstruction of the original parts if the merge is deleted. One use of merging is to represent connectivity. For example, to connect two sides of the triangle, an endpoint from one side is merged with an endpoint of the other. Another use of merging is to apply pre-defined constraints. Thus, to constrain the base of the triangle to be horizontal, one can simply merge an instance of HorizontalLine with the side to be constrained. Constraint Satisfaction It is up to the user to specify the constraints on an object; but it is up to the system to satisfy them. Satisfying constraints is not always trivial. A basic problem is that constraints are typically multi-directional. For example, the horizontal line constraint is allowed to change either endpoint of the line. Thus, one of the tasks of the system is to choose among several possible ways of locally satisfying each constraint. One constraint may interfere with another; in general the collection of all the constraints on an object may be incomplete, circular, or contradictory. Again it is up to the system to sort this out. The approach taken in ThingLab is first to analyze the constraints on an object and plan how to satisfy them, and then to make the actual changes to satisfy the constraints. In ThingLab, the particular values that an object holds usually change much more rapidly than its structure. For example, if on the display the user moves some part of a constrained geometric object with the cursor, the values held by this object will change every time its picture is refreshed. Each time some value is changed, other values may need to be changed as well to keep the constraints satisfied. However, the object?s structure will change only when the user adds or deletes a part or constraint. The design of the ThingLab constraint satisfaction mechanism is optimized for this environment. A constraint satisfaction plan may depend on the particular structure of an object, but should work for any values that the object might hold. (If not, appropriate tests must be included as part of the plan.) Once a plan for satisfying some constraints has been constructed, Smalltalk code is compiled to carry out this plan. Thus each time the part of the constrained geometric object is moved , it is this pre-compiled method that is invoked, rather than a general interpretive mechanism. Also, the plan is remembered in case it is needed again. Planning is done using symbolic references to the constrained parts, so that the same plan may be used by all instances of a class. If the class structure is changed so that the plan becomes obsolete, it will be automatically forgotten. When an object is asked to make a change to one of its parts or subparts, it gather up all the constraints that might be affected by the change, and plans a method for satisfying them. In planning a constraint satisfaction method, the object will first attempt to find a one-pass ordering for satisfying the constraints. There are two techniques available in ThingLab for doing this: propagation of degrees of freedom, and propagation of known states. In propagating degrees of freedom, the constraint satisfier looks for an object with enough degrees of freedom so that it can be altered to satisfy all its constraints. If such an object is found, that object and all the constraints that apply to it can be removed from further consideration. Once this is done, another object may acquire enough degrees of freedom to satisfy all its constraints. The process continues in this manner until either all constraints have been taken care of, or until no more degrees of freedom can be propagated. In the second technique propagating known states, the constraint satisfier looks for objects whose states are completely known. If such an object is found, the constraint satisfier will look for one-step deductions that allow the states of other objects to be know, and so on recusively. If there are constraints that cannot be handled by either of these techniques the object will invoke a method for dealing with circularity. Currently, the classical relaxation method is the only such method available. As will be described in Chapter 5, relaxation can be used only with certain numerical constraints, and is also slow. In this method, the object changes each of its numerical values in turn so as to minimize the error expressions of its numerical values in turn so as to minimize the error expressions of its constrains. These changes are determined by approximating the constraints on a given value as a set of linear equations by finding the derivative of the error expressions with respect to the value, and solving this set of equations. Relaxation continues until all the constraints are satisfied (all the errors are less than some cutoff), or until the system decides that it cannot satisfy the constraints (the errors fail to decrease after an iteration). If the relaxation method is used, the system issues a warning message to the user. The user can either let things stand, or else supply additional information in the form of redundant constraints that eliminate the need for relaxation. Where are Constraints Useful? Where are constraints useful? In discussing this question, it is important to differentiate what can be expressed using constraints from what sets of constraints can be satisfied. Many more things can be expressed than can be satisfied, For example, it is easy to state the following constraints: xn + yn = zn x, y, z, n integers x, y, z >0 n > 2. However, finding values that satisfy these constraints, or proving that no such values exist, requires either a counterexample or a proof of Fermat?s Last Theorem. What can be expressed using constraints? To express a relation as a constraint, the following information is needed: a rule (from which the system will derive a satisfaction test and an error expression); and one or more methods for satisfying the constraint. For numerical constraints, the methods may be omitted if the user is willing to live with the relaxation method. Any relation that must always hold, and for which this information can be supplied, may be expressed as a constraint. Some relations that cannot be expressed as constraints in a general way using current ThingLab techniques include: any relation involving ordering or time; relations that need hold only under certain conditions; and meta-constraints (constraints on other constraints or on constraint satisfaction strategies). What sets of constraints can be satisfied? If the constraint dependency graph has no circularities, or if the circularities can all be broken using one-step deductions, then the one-pass constraint satisfaction techniques will always succeed, and will provide correct results. Further, the constraints can be satisfied, or determined to be unsatisfiable, in time proportional to that required to execute the local methods provided by the constraints. If the dependency graph does have circularities that cannot be broken by one-step deductions, the constraints for which relaxation can be used. These constraints must either be linear, or else constraints for which linearization is an adequate approximation. An example of a set of circular constraints for which the relaxation method does not work are those that describe a cryptarithmetic problem, e.g. DONALD + GERALD = ROBERT with D=5. [See Newell & Simon 1972 for a description of this domain.] Relaxation is useless here, since the constraints cannot be approximated by linear equations. To solve such kinds of problems, other constraints satisfaction techniques would be needed, such as heuristic search. Relation to Other Work As mentioned previously, the two principal ancestors of ThingLab are Sketchpad an Smalltalk. It is also closely related to work on constraints by Gerald Sussman and his students; other related work includes Simula the Actor languages, KRL, and a number of problem solving systems. Following a discussion of these and other systems, a summary of the novel features of ThingLab is presented. Extract: ABSET ABSET was a set-oriented language developed at the University of Aberdeen. It allowed statements of the from A+B=3 AND A=1; given this statement, it could deduce B?s value. ABSET had a number of interesting features: it emphasized the avoidance of unnecessary ordering restrictions in the statement of a program; and it allowed assertions (or constraints) to apply to non-numeric objects such as sets or text in Proceedings of the ACM SIGPLAN symposium on Very high level languages, March 28-29, 1974, Santa Monica, California, United States view details |