Booster(ID:1458/boo001)


Data parallel language


References:
  • Paalvast, E. "The Booster Language", TR PL 89-ITI-B-18, Inst voor Toegepaste Informatica TNO, Delft, 1989 view details
  • Paalvast, Edwin M.; van Gemund, Arjan J. ; Sips, Henk J. "A method for parallel program generation with an application to the Booster language" view details Abstract: This paper describes a translation method for the automatic parallelization of programs based on a separately specified representation of the data. The method unifies the concept of data-representation on the algorithm-level as well as machine-level, based on the so-called view concept. It is shown that given a decomposition of the data, application of the translation method to the view-based Booster programming language results in efficient SPMD-code for distributed- as well as shared-memory architectures. It will be argued that the method is not restricted to Booster, but can also be applied to other languages. DOI Extract: Conclusion
    Conclusion
    We have presented a general method for automatically deriving efficient SPMD-code for distributed- as well as shared-memory processors, given an algorithm and data decomposition description. The translation method as well as the algorithm and decomposition description formalisms are based on the application of the so-called view concept with its associated calculus. To this purpose, the highlevel view programming language Booster is introduced, illustrating its merits by an example of its translation to distributed- and shared-memory architectures.
    Our method can also be applied to other languages as well. As can be seen from the method, a number of criteria can be formulated with respect to maximum achievable speed-up when translating programming languages, based on data partitioning. If the functionsfcq) and g(q) and their properties, such as linearity and monotonicity, are known, efficient translations can be generated for block- and scatter-decompositions.
    Traditional languages such as Fortran introduce serious problems regarding general decompositions. From a given Fortran program the functions f(q), g(q), etc., including the appropriate ranges for q, have to be extracted from the source code, which can be quite complex, if not impossible. Dynamic decompositions appear even more cumbersome, since all releveant information on the dynamic decomposition is scattered throughout the program. In this respect, functional languages are more promising, provided that a strict separation between index- and data manipulations can be made (possibly through annotations). The Booster language, as described in this paper, meets both approaches half in between.
    Further research will be directed to extending the calculus to a true intermediate formal framework for the description of optimizations like vectorization and parallelization, based on machine models described within the same framework. Although in the current approach, data decomposition is supplied by the user, future research will also focus on incorporating decompositions as integral part of the architecture driven translation process which is described in Section 5. Extract: iNTRODUCTION
    Introduction
    In programming either shared- or distributed-memory parallel computers, programmers would like to consider them as being uni-processors and supply as little extra information as possible on code parallelization and data partitioning. On the other hand, maximum speed-up is desired, without loss of portability. This trade-off is reflected in the existence of a variety of parallel language paradigms, which, regarding to the decomposition method, can be divided into two categories: implicit and explicit. Languages based on implicit descriptions, like functional and dataflow languages, leave the detection of parallelism and mapping onto a parallel machine to the compiler. Unfortunately, contemporary compilers do not produce efficient translations for arbitrary algorithmmachine combinations. In turn, if a programmer would know the optimal mapping of an algorithm onto a certain architecture most implicit description languages do not provide facilities to express this mapping explicitly. An exception to this is described in [Hudak88].
    Most languages based on explicit descriptions specify parallelism c.q. communication and synchronization as integral part of the algorithm. This has the disadvantage that one has to program multiple threads of control, which are generally very hard to debug. Hence, experimentation with different versions of the same parallel algorithm, for example different decompositions, is in general rather cumbersome. Comparably small changes may require major program restructuring.
    In this paper, we describe a different explicit approach, that pairs the flexibility of the implicit with the expressiveness of the explicit. In the approach algorithm description and algorithm decomposition are described separately. Efficient SPMD (Single Program Multiple Data) code, in particular communication and synchronization, is generated automatically by the compiler. Furthermore, the compiler uses a model base of target architectures in order to optimize computation and communication efficiency.
    The approach of inducing parallelism by explicitly decomposing the data is not new. In [Callahan88, Gemdt89, Kennedy891 applications to Fortran are described, in [Rogers89] to Id Nouveau, in [Koelbel87] to BLAZE, and in [Quinn891 to C*. In particular application to Fortran is limited, because of equivalencing, passing of array-subsections to subroutine calls, and any form of indirect addressing cannot be translated efficiently. A second limitation is that the description of complex decompositions and especially dynamic decompositions, i.e. a redistribution of the data at run-time, is not feasible either. An exception is [Kennedy891 where a method is presented to describe redistribution. However, this method still has the drawback that redistribution statements are intermingled with the program code, which limits portability.
    A more fundamental problem to these approaches is that distinctive formalisms are used for the description of algorithm and decomposition. Hence, a unified formal system to reason about optimizing transformations at compile- time is not possible. Furthermore, the approaches do not address the issue in the general context of data representation. In our approach the way in which we "view" data on the algorithm level is essential. The approach is illustrated in Fig. 1. We specify the program P and the involved set of data structures D. By adding the data representation D + D?, the compiler automatically generates an equivalent program P? with the set of data structures D?. Especially for sparse data structures the possibility to "view" them as a compact structure is very convenient from the programmers point of view.
    To illustrate our method, a high-level parallel programming language called Booster is introduced in Section 2. Programs written in Booster are translated to imperative languages, like (parallel) Fortran, C, ADA, or OCCAM. For Fortran and C, the code-generation is not restricted to distributed-memory computers, but code can also be generated for shared-memory- and vector computers. Section 3 describes the translation of Booster to SPMD-programs in terms of a so-called view calculus. In addition, a number of optimizations with respect to the translation are discussed. In Section 4, the generation of a number of alternative send/receive schemes are elaborated on and an example translation of a Booster program to both a distributed- and shared memory architecture is given. Finally, in Section 5 a brief description is given of the architecture of the Booster translator. Extract: The booster language 1:
    The Booster Language
    2.1. Basic concepts
    In a conventional programming language (such as Fortran) the array is used as the basic data structure for storing large amount of related data elements. These elements can be accessed by the use of indices into the array. An index is a rigid pointer to a memory location. It is not possible in these languages to reason about or manipulate indices themselves. Only the data can be moved around or changed, and it is precisely this which makes arrays so awkward whenever sorting or inserting (for example) needs to take place. The use of indirect addressing (e.g., index files) to keep a large database sorted on different keywords is an example of how useful it can be to regard the indices to an array as a separate, manoeuvrable collection of entities. This is particularly true for parallel programming, where it is often important to identify sets of index values that refer to data upon which computations may be executed in parallel. A comparable approach is followed in a language like ACTUS [Perrot87].

    Data- and index domain
    In Booster, these observations have resulted in a strict distinction between data- and index domain. The data domain consists of several possible data types, just as in conventional languages. Supported in Booster are integers, reals, booleans, and records. The index d,omain consists of nonnegative integer values. On the index domain ordered index sets can be defined, and operations can be performed on these sets independent of the data-elements that the index values in question refer to.

    Shapes and views
    There are two concepts in Booster to reflect the two domains. The first is the shape, Booster?s equivalent of a traditional array: a finite set of elements of a certain data-type, accessed through indices. Unlike arrays, shapes need not necessarily be rectangular (for convenience we will, for the moment, assume that they are). The ordered set of all indices in the shape is called an index set. The second concept is that of the view. A view is a function that takes the index set of a certain shape as input, and returns a different index set. Through the indices in this new index set one can still access the elements of the original shape, but it is as though we now view the shape?s data-elements through a different perspective, hence the name. Shapes are defined in Booster as follows:
    SHAPE A (20) OF REAL;
    SHAPE B (3tlO) OF REAL;
    In the first statement, A is declared to be a vector of 20 numbers of the type real. The basic index set for this shape is the ordered set (0,1. . . . . 19). Next, B is declared to be a matrix of 3 by 10 elements. The index set for this shape is the ordered set ((0,O.L (OJ), (0,2), . . . . (2,8), (2.9)).

    Content statements
    In so-called content statements we can move and modify the data stored in shapes:
    A := 2.5;
    A[101 := 5;
    B[1,81 := 3.1416;
    In the first content statement, all elements of A are initialized to 2.5. In the second statement, the value 5 is stored in the lO* element of A, and so on.

    Arithmetic Operators
    Apart from standard scalar operators Booster also supports their multi-dimensional equivalents. For example, a vector A t.ittIeS a SCahU 2 is written as:
    A := A'2;
    Application of these multi-dimensional operators is restricted to pairs of scalars and higher dimensional structures. Other operators can be specified with help of the function construct, which is discussed shortly.

    View Statements
    We manipulate index sets in so-called view statements. The easiest view to define is the identity view:
    V <- A;
    v is called a view identifier and does not need to be declared. After this view statement the three content statements will have exactly the same effect.
    At01 := ALlO];
    A[01 := V[lO];
    VI01 := V[lOl;
    Note that no additional data structure is created. This is typical of all views. Also note the different assignment symbols for view- and content-statements; ?<-? for view statements and ? : =? for content statements.

    Modules and Functions
    A Booster program consists of a set of modules, where each module has a number of input- and output-arguments. Within a module it is possible to encapsulate a number of view- and content statements in functions. Booster functions, like modules, do not have side effects and when a function has only one output argument and at most two input arguments it may be used as an infix operator in content statements. An example of a Booster function is the following:
    FUNCTION
    Vector-Mult PRIORITY 7 (V, W) -> (U):
    V,W (I-I) OF REAL;
    U (n # n) OF REAL;
    BEGIN
    U[i, jl := V[il * W[jl;
    END ;
    Here a new language construct is introduced, thefree variable i and j. A treatment of the exact syntax and semantics of this construct is deferred to the next section. This function assigns the vector-vector product Vi*W j to Ui, j. The keyword priority assigns a precedence to the function Vector cult which is used to decide evaluation order when the function is itself used as binary operator. priority 7 corresponds with the priority of the ?*? operator. E.g.,
    2 + X Vector-Mult Y = 2 + (X Vector-Mult Y)

    Control Flow
    In addition to the content- and view-statements, Booster also offers several control-flow constructs similar to those found in conventional languages. Available are the IF statement for conditional execution, and WHILE and ITER statements for iteration purposes (XTER-IOO~S execute a fixed number of times, WHILE-loops execute as long as a boolean condition evaluates true). Extract: The booster language - View classes

    2.2. View classes
    Views basically come in three flavours, each of which will be illustrated with a simple example.

    Selection Views
    The first non-trivial type of view we illustrate is the selection view:
    V[Ol := V[5];
    A[51 := A[lO];
    Again, the two content statements are equivalent, given the view statement that precedes them. The index expression s : 15 selects the subset or range of indices 5 through 15 of A. After this statement, the identifier v can be used to access A through the index set (0, I, . . . 14). Note that the element V [ 0 I actually refers to A [ 51 , etc: renumbering of the index sets after a view statement causes all index sets to start from zero, like the original index set does. A itself is never affected by any view statement.

    Permutation Views
    The second type of view is the permutation view:
    V[i] <- A[19-i];
    The following content statements are equivalent:
    V[Ol := V[5];
    AI191 := V[5];
    A[191 := A[14];
    The free variable i is used to access the values of A through v in reverse order. In fact, permutation views are an efficient alternative to create high level indirect addressing.

    Dimension Changing Views
    Free variables can be used for even more powerful purposes, as is illustrated by the third type of view: the dimension changing view:
    V(4#5) [i,j] <- A[(4*i)+j];
    The following content statements are equivalent:
    vto,o1 := V[2,3];
    ALO1 := A[ll];
    The construct ( 4 #S ] explicitly specifies the index set that should result from the view, or, put in another way, the domains for the free variables i and j. In the permutation view statement given in the previous sections, the domain of i and hence the resulting index set could be deduced by the compiler from the declaration of A. Here, the compiler needs to be told how to partition the 20 elements of A over the two dimensions of V. The identifier v now becomes for all application purposes identical to a matrix shape. The fact that the index set of this matrix refers, via a view function, to a one-dimensional vector is completely hidden from the ?user of v.

    View Functions
    Another example illustrates that selection views need not always select consecutive ranges. At this time we introduce the concept of a viewfunction:
    VIEW FUNCTION Even (V) -> (E) ;
    v (n);
    E (n div 2 t n mod 2);
    BEGIN V <- A[5:151;
    459
    E[i] C- V[2*il;
    END:
    The view function is a way of encapsulating related view statements. Input and output arguments are specified and their index sets declared. The use of the implicit index parameter n allows the view function to be applied to vectors of arbitrary length. No content statements may be used in the body of a view function. Note that renumbering compacts the selected collections of non-consecutive indices into rectangular index sets that start from zero. Below a more complex view function is given, which uses the previously defined function Even and returns the even and odd elements of A.
    VIEW FUNCTION Unzip (V] -> (E, 0);
    V (n);
    E (n div 2 t n mod 2);
    0 (n div 2);
    BEGIN
    E <- VIEven];
    O[il <- V[2*i+l];
    END;
    (E, 0) <- Unzip(A); // Call in main program
    Consequently, the three following content statements are equivalent:
    EL51 := 0[5];
    AL101 := A[ll];
    A[Even][5] := A[ll];



          in ACM SIGARCH Computer Architecture News 18(03) June 1990 (Proceedings of the 4th international conference on Supercomputing) view details
  • Paalvast, E.M.; Sips, H.J. and L.C. Breebaart. Booster: a high-level language for portable parallel algorithms" pp177-192 view details Abstract: Booster is a high-level, fourth generation, parallel programming language. The language has been designed to program parallel algorithms for a wide variety of target parallel architectures. Booster has a strong separation of concerns, featuring amongst others a clear separation of algorithm description and algorithm decomposition and-representation. Programs written in Booster are translated to imperative languages, such as FORTRAN or C, and can be easily integrated in large applications. Parallelism can be obtained by applying data and/or code decomposition. Once algorithm and decomposition are described the transformation is done automatically
          in Applied Numerical Mathematics, (8), 1991 view details