Newsqueak(ID:1503/new006)Windowing/mousing languageConcurrent applicative language with synchronous channels. People: Related languages
References: This is an informal reference manual for the concurrent language Newsqueak. Newsqueak?s roots are in Squeak, a language designed a few years ago by Luca Cardelli and Rob Pike to illustrate concurrent solutions to problems in user interface design. Newsqueak addresses the same problems but in a broader context: Squeak was for designing devices such as menus and scroll bars; Newsqueak is for writing entire applications, and in particular a window system. Besides a wholesale redesign of Squeak?s syntax, Newsqueak therefore has several major components absent in Squeak: a type system, dynamic process creation, and dynamic channel creation. Also, Squeak deferred most mundane programming details to the language it compiled into, C. Newsqueak is instead a selfcontained language. An interpreter for it, called squint, has been implemented. Newsqueak draws heavily from CSP and C, and the discussion that follows assumes modest familiarity with both these languages. Roughly, the syntax and basic semantics come from C, while the messagepassing primitives come from CSP. The way these are put together is, however, unique to Newsqueak. Storage Since the management of storage is central to the implementation of any language, it is a good starting point for the description of the Newsqueak interpreter. But it is especially pertinent for Newsqueak because the design of the language hinges on its strictly byvalue (applicative) management of data. Concurrent and applicative programming complement each other. The ability to send messages on channels provides I/O without side effects, while the avoidance of shared data helps keep concurrent processes from colliding. Newsqueak is an applicative concurrent language based on the concurrent composition of processes communicating on synchronous channels [Pike 89, McIl 89]. Two computations share a value only if they share a variable. All assignment of values to objects ? including function return, binding of parameters to functions and processes, and passing values on channels ? is done by making a copy of the assigned value. Although Newsqueak permits global variables, programming without them is convenient and implicitly encouraged. The language promotes independent functions and processes operating in environments defined exclusively by their parameters. Since those parameters may include communication channels, the language feels considerably different from traditional applicative languages. Newsqueak does not require that its implementation copy values in an assignment, but the behavior must be as though the value were copied. The Newsqueak interpreter, squint, combines referencecount garbage collection with a lazy copying scheme to defer copying as long as possible. The result is similar to copyonwrite page management schemes in operating systems. Reference counting was chosen because it is easy to implement [Knuth 73]. It was (correctly) expected that storage management would not dominate performance. Moreover, when the language was being designed there were several candidates for the CPU it would eventually be run on, so it seemed prudent to use simple, inherently portable techniques. Each object in Newsqueak is represented using a single word containing the address of a data structure describing the object, except for integers, which are just stored in the word. (The language itself has no pointer types, but the implementation uses pointers extensively.) The data structure begins with a header that includes a reference count and a description of the size and layout of the object, which simplifies copying the structures. (The static type system requires no runtime checking of the type of objects, but the interpreter uses a single routine to copy all objects.) The data contained in arrays of characters and integers are stored immediately after the header. Arrays of nonintegral objects are represented recursively by storing an array of pointers to the component objects after the initial header. Records, defined using the struct keyword, are stored using a hybrid scheme; a bit array at the beginning of the data part of the object indicates which elements of the structure are represented by pointers. An object is freed (collected) when the number of references to it goes to zero, which can occur when a link to the object is broken by assignment or when a variable pointing to the object is released at the end of an executing block. When an object is freed, pointers contained within it are followed and their reference counts are decremented. When these counts go to zero, the algorithm continues recursively. When an object is assigned to a variable, its reference count is increased. However, when a component of an object is updated, the resulting assignment may be done in place only if the containing object has a reference count of one. If not, the object must first be copied. Consider the isolated assignments A = somearray B = A A[i] = p. After the second assignment, the array pointed to by A and B has a reference count of at least two. To assign p to A[i], A is copied, the old array?s reference count is decremented (A no longer points to it, but B still does), all its component objects? reference counts are incremented (the new array points to the same components), and A is made to point to the new object. This new object has a reference count of one. The original object pointed to by A[i] and B[i] has its reference count decremented (after assignment A[i] will not point to it but B[i] will), and A[i] may finally be pointed at p, whose reference count increments. All these operations occur as a single atomic action. Care must be taken in the implementation to guarantee that assignments such as A[i] = A[i] work properly, even when disguised, for example by communication on a channel. But the method is not hard to implement, and is beneficial for many common cases such as passing an array to a function that examines it but does not change it. Newsqueak Hoare's language was novel and influential, but lacking in a few key aspects. The main defect is that the unbuffered channels used for communication are not first-class objects: they cannot be stored in variables, passed as arguments to functions, or sent across channels. As a result of this, the communication structure must be fixed while writing the program. Hence we must write a program to print the first 1000 primes rather than the first n primes, and to multiply a vector by a 3x3 matrix rather than an nxn matrix. Luca Cardelli and Rob Pike developed Hoare's ideas into the Squeak mini-language [4] for generating user interface code. (This Squeak is distinct from the Squeak Smalltalk implementation mentioned in earlier chapters.) Pike later expanded Squeak into the fully-fledged programming language Newsqueak [5][6] which begat Plan 9's Alef [7] [8] and Inferno's Limbo [9]. The main semantic advantage of Newsqueak over Squeak is that Newsqueak treats communications channels as first-class objects: unlike in CSP and Squeak, channels can be stored in variables, passed as arguments to functions, and sent across channels. This in turn enables the programmatic construction of the communication structure, thus allowing the creation of more complex structures than would be reasonable to design by hand. In particular, Doug McIlroy demonstrated how the communication facilities of Newsqueak can be employed to write elegant programs for manipulating symbolic power series [10]. Similar attempts in traditional languages tend to mire in bookkeeping. In a similar vein, Rob Pike demonstrated how the communication facilities can be employed to break out of the common event-based programming model, writing a concurrent window system [11]. |