Actus (ID: 839/act024). Perrot, Dhillon et al., The Queen's University, Belfast, N. Ireland, 1979-1989. Pascal with parallel extensions, similar to the earlier Glypnir. Parallel constants, index sets. Descendants include Parallel Pascal, Vector C, and CMU's recent language PIE.
1. INTRODUCTION
During the last two decades, the design and development of several generations of computers have given rise to increased processing speeds; the more recent advances in the number of operations performed per second have been obtained by a revolution in computer architecture rather than by component technology. Examples of this revolution are the duplication of arithmetic units for array processors such as the Illiac IV [1] and the Phoenix [2], and the pipelining of functional units for vector processors such as the Star-100 [3] and the Cray-1 [10]. These types of computer are based on a form of parallel or lockstep processing which does not have the synchronization problems of a conventional multiprocessor system. These machines are widely used in large scale scientific computations, particularly for grid or mesh type problems where regularity of processing the data is the dominant problem characteristic; they form the baseline architecture for this paper. Hence asynchronous parallel configurations such as C.mmp [15] are excluded from consideration. Unfortunately, there has not been a comparable investment of either research funds or effort into the development of programming languages or software production tools to utilize these technological and architectural advances. Most of the high level languages currently used to program these parallel computers are extensions of languages which were specifically designed, many years ago, for sequential machine architectures, e.g., extended Fortran for the Star-100 [11], CFT for the Cray-1 [10], and IVTRAN [8], CFD [12] (Fortran-like languages), and Glypnir [7] (an Algol-like language) for the Illiac IV. SL/1 [6] is one of the few languages that has tried to bring some of the benefits of structured programming to one of these machines, namely, the Star-100.
The gap between the hardware and software development for these machines has been apparent in many of the projects attempted, e.g., some conversions of important production codes to the Illiac IV were terminated due to software problems. Also the size and the complexity of the projects that programmers are being asked to implement have increased with the available processing power and are now almost beyond the features and capabilities of the programming languages being used to tackle them. More specifically, to construct and to increase the efficiency of a program, the user either has to be aware of the machine instruction set or of the method of detection of parallelism used by the compiler. In addition, the organization of the transfer of data to and from the backing store can require the use of low level primitives; the transfer can critically affect the performance of a program. Hence the challenge to the language designer is to devise a language which provides the programmer with sufficient tools to enable the construction of efficient algorithms and at the same time effectively utilize the hardware. The language described in this paper is an attempt to redress the technology imbalance: to develop a high level language whose features exploit the advanced architecture of these parallel machines and incorporate the new software engineering approaches that are necessary in writing algorithms in this parallel environment. The language is called Actus. The new language enables the specification of parallelism directly. The features are appropriate for both array and vector processors and they are defined without reference to the hardware of either type of machine; the algorithmic and data constructs are of sufficient generality and structure to make efficient use of the parallel computational resources. This could facilitate codes developed on one parallel architecture being moved to another parallel architecture without undue loss of efficiency.
The language features of Actus are described using a notation similar to that of the language Pascal [14]. (A possible implementation is also suggested by using a Pascal P-compiler [9]; such an approach is currently being pursued at the Institute for Advanced Computation for one of the parallel computers, namely, the Illiac IV.) It is therefore hoped that this type of parallel computer can benefit from some of the advances of structured programming and software engineering which are, as yet, not widely disseminated throughout the scientific community and that this, in turn, will mean reduced software production costs, with the improved quality, reliability, and adaptability that have been realized for sequential machines.
Extract: Design Approach
2. DESIGN APPROACH
There have been two main approaches to the design of a high level language for a vector or an array processor which are reflected in the existing languages for these machines, namely, either (i) the user writes a program in a conventional sequential programming language and the compiler tries to detect the inherent parallelism of the program, or (ii) the parallel nature of the computer is readily apparent in the syntax of the language. Examples of the first philosophy are IVTRAN for the Illiac IV and CFT for the Cray-1. Such languages try to extract the parallelism within sequential Fortran DO loops. The disadvantages of this approach are that the extraction of the parallelism is somewhat limited and inefficient in the code generated; and often the user has to restructure the program to benefit from the parallelism of the machine. Examples of the second approach are CFD and Glypnir for the Illiac IV. CFD is an extension of Fortran which reflects in the syntax that the Illiac IV has 64 processors. The user then has to size the data structures of the problem to this natural length of the machine. This can add significantly to the complexity of a program, again causing the user to restructure the program.
Another example is Star Fortran. To access the parallel capability of the Star, the programmer must explicitly encode hardware instructions in separate subroutine calls (one call for each instruction). This effectively turns the extended Fortran into a higher level assembler language. Another disadvantage in using these existing languages is that normally the two-level memory hierarchy is not a natural part of the language abstraction. For example, in the situation where the database under consideration is so large that it will not fit into the available main store, the user must, employing very basic facilities, organize the transfer of data between the extended memory and the fast processing store. Since the management of the memory hierarchy is often the critical factor in the overall performance of a program, the user must therefore code the most crucial aspect of the program in the most primitive syntax. The approach adopted in the language reported here is that the language should enable the expression of parallelism in a manner which is suitable for the problem and can be easily exploited by these parallel machines. This will enable the compiler to generate efficient object code in a rapid and straightforward manner without resorting to a complex detection mechanism for the inherent parallelism. Another advantage is that programming with a parallel syntax produces algorithms of greater efficiency, i.e., a user does not have to size the data structures of the problem but can adopt a straightforward and natural notation to suit the problem. Thus the current programming situation for these parallel computers is such that the user is forced both to construct a solution to a problem using a language which does not provide the most appropriate abstraction mechanisms and to take account of the particular hardware characteristics of the machine.
Extract: Language Criteria
3. LANGUAGE CRITERIA
The design of the language involved an extensive study of the problem areas in which such computers are used, consultations with programmers who had used such machines for many years, and a survey of the users of the Illiac IV over the years that it had been operational. The survey respondents included users of both types of language as represented by the pair CFD/Glypnir and IVTRAN. The main points were:
(i) parallel computations were difficult to express in or adapt to the syntax of the language if they were not based on a factor of 64, i.e., the number of processing elements available;
(ii) manipulation of the status of processing elements was cumbersome; frequently data structures were increased in size to avoid status manipulation;
(iii) the lack of suitable control structures caused complexity of programming and at times required the introduction of machine code; machine code was also introduced to increase efficiency;
(iv) selecting an arbitrary group of array elements was difficult;
(v) manipulation of part words required machine code;
(vi) detailed knowledge of the layout of the data on the backing store and in the main store was required; also the facilities for specifying the transfer of data were primitive and required careful programming;
(vii) debugging and tracing facilities were lacking.
The purpose of the study, the consultations, and the survey was to determine the frequency of use of certain features and to identify what new features were required. In this way the relevant abstractions for large scale scientific problems could be formed along with their representation and rules for manipulation. This, in turn, led to the introduction of new data structures, language constructs, and operators. In effect, the new features were concerned with manipulation of the extent of parallel processing, while the normal structured programming concepts were modified to enable representation and manipulation of the data in parallel.
The following criteria were adopted for the design of a language for the current parallel computers:
(i) the idiosyncrasies of the hardware should be hidden from the user as much as possible;
(ii) the user should be able to express the parallelism of the problem directly;
(iii) the user should be able to think in terms of a varying rather than a fixed extent of parallel processing;
(iv) control of the parallel processing should be possible both explicitly and through the data, as applicable;
(v) the user should be able to indicate the minimum working set size of the database (in those cases where the database is larger than the size of the fast memory).
The implementation of the language should provide compile and run time checks plus tracing facilities to assist with the debugging and testing of programs. Since lockstep parallel computers were developed as a means of performing the same operation on independent data sets in parallel, in Actus the data declarations indicate the maximum extent of parallelism. The extent of parallelism is defined for an array processor as the number of processors that can logically compute upon particular data at the same time (this can be less than, equal to, or greater than the actual number of processors); for a vector processor it is the length of the data structure presented to the processor for computation. Hence each data declaration has associated with it a maximum extent of parallelism; the language statements or constructs can then employ this maximum (or a lesser) extent of parallelism in the course of program execution. In this way it is possible to express directly in the syntax the parallel nature of an algorithm in a manner which is appropriate for both vector and array processors.
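The idea of a declared maximum extent of parallelism that statements may later reduce can be illustrated with a minimal Python sketch. This is not Actus syntax; the `ParallelArray` class and its `assign` method are hypothetical names invented here to model the concept of a lockstep operation applied over an adjustable index range.

```python
# Hypothetical illustration (not Actus syntax): a parallel value carries a
# declared maximum extent of parallelism; a statement may operate on the
# full extent or on a reduced extent in a single conceptual lockstep step.

class ParallelArray:
    def __init__(self, values):
        self.values = list(values)
        self.max_extent = len(self.values)  # maximum extent of parallelism

    def assign(self, f, lo=0, hi=None):
        """Apply f element-wise over indices [lo, hi) as one lockstep
        operation; hi defaults to the declared maximum extent."""
        hi = self.max_extent if hi is None else hi
        self.values[lo:hi] = [f(v) for v in self.values[lo:hi]]

a = ParallelArray(range(8))        # declared maximum extent: 8
a.assign(lambda v: v * 2)          # full extent: all 8 elements at once
a.assign(lambda v: v + 1, 0, 4)    # reduced extent: first 4 elements only
print(a.values)                    # [1, 3, 5, 7, 8, 10, 12, 14]
```

The point of the sketch is only that the extent is a property the program varies at run time, rather than a fixed machine constant such as the Illiac IV's 64 processing elements.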
in TOPLAS 1(2) Oct 1979
in Software - Practice and Experience 11(05) May 1981
This conflict may be resolved by using a multi-pass scheme for the implementation process, thereby allowing machine dependent and machine independent sections of the implementation to be dealt with in distinct operations. This paper describes an implementation of ACTUS intended for use on a vector processor, the CRAY-1, outlining the typical operations that are carried out during the various passes of the compilation process.
in Journal of Parallel and Distributed Computing, 1987
The Booster Language
2.1. Basic concepts
In a conventional programming language (such as Fortran) the array is used as the basic data structure for storing large amounts of related data elements. These elements can be accessed by the use of indices into the array. An index is a rigid pointer to a memory location. It is not possible in these languages to reason about or manipulate indices themselves. Only the data can be moved around or changed, and it is precisely this which makes arrays so awkward whenever sorting or inserting (for example) needs to take place. The use of indirect addressing (e.g., index files) to keep a large database sorted on different keywords is an example of how useful it can be to regard the indices to an array as a separate, manoeuvrable collection of entities. This is particularly true for parallel programming, where it is often important to identify sets of index values that refer to data upon which computations may be executed in parallel.
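The motivation above, treating indices as a separate, manoeuvrable collection, can be sketched in a few lines of Python. The example data is invented for illustration; the technique (keeping permutations of indices rather than moving the records) is the standard indirect-addressing idea the text describes.

```python
# Sketch: keep the same unmoved data "sorted" on different keys by
# maintaining separate index sets (permutations of the indices), rather
# than by rearranging the records themselves.

records = [("carol", 31), ("alice", 25), ("bob", 28)]  # illustrative data

# Two independent index sets over the same storage:
by_name = sorted(range(len(records)), key=lambda i: records[i][0])
by_age  = sorted(range(len(records)), key=lambda i: records[i][1])

print([records[i][0] for i in by_name])  # ['alice', 'bob', 'carol']
print([records[i][1] for i in by_age])   # [25, 28, 31]
```

No record is copied or moved; only the index sets differ, which is precisely what makes indices worth manipulating as first-class entities.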
A comparable approach is followed in a language like ACTUS [Perrot87].
Data- and index domain
In Booster, these observations have resulted in a strict distinction between data- and index domain. The data domain consists of several possible data types, just as in conventional languages. Supported in Booster are integers, reals, booleans, and records. The index domain consists of nonnegative integer values. On the index domain ordered index sets can be defined, and operations can be performed on these sets independent of the data elements that the index values in question refer to.
Shapes and views
There are two concepts in Booster to reflect the two domains. The first is the shape, Booster's equivalent of a traditional array: a finite set of elements of a certain data type, accessed through indices. Unlike arrays, shapes need not necessarily be rectangular (for convenience we will, for the moment, assume that they are). The ordered set of all indices in the shape is called an index set. The second concept is that of the view. A view is a function that takes the index set of a certain shape as input, and returns a different index set. Through the indices in this new index set one can still access the elements of the original shape, but it is as though we now view the shape's data elements through a different perspective, hence the name. Shapes are defined in Booster as follows:
SHAPE A (20) OF REAL;
SHAPE B (3#10) OF REAL;
In the first statement, A is declared to be a vector of 20 numbers of the type real. The basic index set for this shape is the ordered set (0, 1, ..., 19). Next, B is declared to be a matrix of 3 by 10 elements. The index set for this shape is the ordered set ((0,0), (0,1), (0,2), ..., (2,8), (2,9)).
Content statements
In so-called content statements we can move and modify the data stored in shapes:
A := 2.5;
A[10] := 5;
B[1,8] := 3.1416;
In the first content statement, all elements of A are initialized to 2.5.
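The shape/view distinction can be modelled with a short Python sketch. The `Shape` and `View` classes are hypothetical stand-ins, not Booster's implementation: the essential property shown is that a view is only a remapping of an index set over the shape's storage, so no data is copied and writes through a view reach the original shape.

```python
# Sketch (not Booster): a shape is storage plus an index set; a view is a
# new index set over the SAME storage -- no additional data structure.

class Shape:
    def __init__(self, values):
        self.data = list(values)
        self.index_set = list(range(len(self.data)))  # basic index set

class View:
    def __init__(self, shape, mapping):
        self.shape = shape
        # the view maps its indices onto the shape's original index set
        self.index_set = [mapping(i) for i in shape.index_set]

    def __getitem__(self, i):
        return self.shape.data[self.index_set[i]]

    def __setitem__(self, i, v):
        self.shape.data[self.index_set[i]] = v  # writes reach the shape

a = Shape([10, 20, 30, 40])
rev = View(a, lambda i: len(a.data) - 1 - i)  # a reversal view of a
print(rev[0])    # 40: element 0 of the view is the last element of a
rev[0] = 99
print(a.data)    # [10, 20, 30, 99]: the shape itself changed
```

Reading or writing `rev[i]` touches the storage of `a` directly, mirroring the paper's remark that a view merely offers a different perspective on the shape's elements.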
In the second statement, the value 5 is stored in the 10th element of A, and so on.
Arithmetic Operators
Apart from standard scalar operators Booster also supports their multi-dimensional equivalents. For example, a vector A times a scalar 2 is written as:
A := A * 2;
Application of these multi-dimensional operators is restricted to pairs of scalars and higher dimensional structures. Other operators can be specified with help of the function construct, which is discussed shortly.
View Statements
We manipulate index sets in so-called view statements. The easiest view to define is the identity view:
V <- A;
V is called a view identifier and does not need to be declared. After this view statement the following three content statements will have exactly the same effect:
A[0] := A[10];
A[0] := V[10];
V[0] := V[10];
Note that no additional data structure is created. This is typical of all views. Also note the different assignment symbols for view- and content statements: '<-' for view statements and ':=' for content statements.
Modules and Functions
A Booster program consists of a set of modules, where each module has a number of input- and output-arguments. Within a module it is possible to encapsulate a number of view- and content statements in functions. Booster functions, like modules, do not have side effects, and when a function has only one output argument and at most two input arguments it may be used as an infix operator in content statements. An example of a Booster function is the following:
FUNCTION Vector_Mult PRIORITY 7 (V, W) -> (U):
V, W (n) OF REAL;
U (n # n) OF REAL;
BEGIN
U[i,j] := V[i] * W[j];
END;
Here a new language construct is introduced, the free variables i and j. A treatment of the exact syntax and semantics of this construct is deferred to the next section. This function assigns the vector-vector product V[i]*W[j] to U[i,j].
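The effect of the free variables i and j in Vector_Mult can be made concrete with a small Python sketch: because i and j implicitly range over all index pairs, the single statement U[i,j] := V[i] * W[j] denotes a complete outer product. The function name below is invented for illustration.

```python
# Sketch of what Vector_Mult computes: the free variables i and j range
# over all index pairs, so one statement yields the full outer product.

def vector_mult(v, w):
    n = len(v)
    return [[v[i] * w[j] for j in range(n)] for i in range(n)]

u = vector_mult([1, 2, 3], [4, 5, 6])
print(u)  # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
```

In Python the iteration over i and j must be written out; in Booster it is implied by the free variables, which is what makes the one-line function body sufficient.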
The keyword PRIORITY assigns a precedence to the function Vector_Mult which is used to decide evaluation order when the function is itself used as a binary operator. PRIORITY 7 corresponds with the priority of the '*' operator. E.g.,
2 + X Vector_Mult Y = 2 + (X Vector_Mult Y)
Control Flow
In addition to the content- and view-statements, Booster also offers several control-flow constructs similar to those found in conventional languages. Available are the IF statement for conditional execution, and WHILE and ITER statements for iteration purposes (ITER-loops execute a fixed number of times, WHILE-loops execute as long as a boolean condition evaluates true).
in ACM SIGARCH Computer Architecture News 18(03) June 1990 (Proceedings of the 4th international conference on Supercomputing)