Delirium (ID:1618/del005)


An embedding coordination language for parallel programming, implemented on the Sequent Symmetry, Cray, and BBN Butterfly.


Related languages
Sloop => Delirium   Influence

References:
  • Lucco, Steven; Sharp, Oliver "Delirium: an embedding coordination language"
    Abstract: Parallel programs consist of a group of sequentially executing sub-computations which cooperate to solve a problem. To exploit existing sequential code and available optimization tools, programmers usually choose to write these sub-computations in traditional imperative languages such as C and Fortran. A coordination language expresses data exchange and synchronization among such sub-computations. Current coordination languages support a variety of interaction models. Delirium introduces a new, more restrictive coordination model that provides the benefit of deterministic execution without requiring programmers to re-write large amounts of code. Current coordination languages are embedded; they work through the insertion of coordination primitives within a host language. Delirium is the first example of an embedding coordination language. A Delirium program is a compact representation of a framework for accomplishing a task in parallel.
    Extract: The Language
    Delirium is based on restricted access to shared memory blocks. No memory block may be destructively modified by an operator unless it possesses the only reference to that block. This restriction corresponds to the idea of single assignment in functional languages, so we chose to express coordination with a functional notation. The language contains only six constructs; it is described in detail elsewhere [25], so we will only summarize its features here. A Delirium program consists of a set of functions, one of which is called main. These functions are first-class, meaning that they can be treated like any other value. They may be passed as arguments, bound to variables, or returned as values. The compiler converts each function into a graph, where the edges represent data paths and the nodes represent sequential operators. Where a function is passed as an argument, the run-time system actually passes the corresponding graph. A special call-closure operator takes a graph and expands it dynamically at run time.

    The language constructs are:
    1. atomic values - integers, strings, floats
    2. multiple values - packages that can be put together, decomposed, and used as return values
    3. let bindings - a binding can be a single value, a decomposition of a multiple value package, or a function definition
    4. conditionals
    5. iteration - this is compiled into tail-recursive functions which are handled efficiently in the run-time system
    6. function or operator application
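
    The extract does not reproduce Delirium's concrete syntax, so the fragment below is only a loose Python transliteration of the six constructs; the program itself is invented for illustration, and Python, unlike Delirium's run-time system, does not execute tail-recursive iteration efficiently.

        # Hypothetical transliteration of the six constructs; the
        # program and all names in it are invented for this example.

        def main():
            x = 3.14                       # 1. atomic value (a float)
            package = (x, "label")         # 2. multiple value: a package
            value, tag = package           # 3. let binding decomposing a package
            def step(i, acc):              # 5. iteration, written tail-recursively
                if i == 0:                 # 4. conditional
                    return acc
                return step(i - 1, acc + i)
            total = step(10, 0)            # 6. function application
            return value, tag, total       # multiple value as a return value

        print(main())                      # (3.14, 'label', 55)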

    Because of the flexibility of a functional notation, Delirium can express a broad variety of parallel constructs without requiring special syntax for each one. The parbegin/parend construct [3], for example, in which the programmer lists computations guaranteed to be independent, is completely subsumed: in Delirium, any two operations that do not have a lexically expressed dependency can always be executed in parallel, and the run-time system detects this automatically.
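
    As a rough illustration of this evaluation model, the Python sketch below represents a program as a graph of sequential operators whose independent inputs may be evaluated concurrently. Node, evaluate, and the operator bodies are all invented stand-ins for what would be compiled C or Fortran operators in a real Delirium program.

        import concurrent.futures

        class Node:
            """One sequential operator in a coordination graph. Data
            paths (edges) are expressed as the list of input nodes."""
            def __init__(self, fn, *inputs):
                self.fn = fn
                self.inputs = inputs

        def evaluate(node, pool):
            # Inputs share no lexically expressed dependency, so the
            # run-time may evaluate them concurrently before applying
            # the operator itself.
            futures = [pool.submit(evaluate, child, pool) for child in node.inputs]
            return node.fn(*(fut.result() for fut in futures))

        # f and g have no data path between them: no parbegin/parend
        # syntax is needed for them to run in parallel.
        f = Node(lambda: sum(range(1_000_000)))
        g = Node(lambda: max(range(1_000_000)))
        combine = Node(lambda a, b: (a, b), f, g)

        with concurrent.futures.ThreadPoolExecutor() as pool:
            print(evaluate(combine, pool))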
    Extract: A Comparison of Coordination Models
    Any model for coordination of sub-computations must incorporate a straightforward notion of how sub-computation scheduling interacts with memory access. A coordination model must specify a data contention protocol by which several sub-computations can agree on the modification of shared data.
    The simplest of all coordination models is that of uniform, distributed shared memory. In this model, any sub-computation can access any part of shared memory, and sub-computations communicate through this memory. Higher-level coordination is done with locking (mutual exclusion) primitives embedded in a host language. On shared-memory multiprocessors such as the Sequent Symmetry, this model directly reflects the underlying architecture and is a good low-level environment for programming.
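
    A minimal sketch of this model, with Python threads standing in for processors and a single lock guarding the shared memory; the counter and worker are invented for the example.

        import threading

        counter = 0                    # shared memory, visible to every thread
        lock = threading.Lock()        # mutual-exclusion primitive embedded
                                       # in the host language

        def worker(iterations):
            global counter
            for _ in range(iterations):
                with lock:             # higher-level coordination via locking
                    counter += 1

        threads = [threading.Thread(target=worker, args=(10_000,))
                   for _ in range(4)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print(counter)                 # 40000; without the lock, updates race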
    The Ivy system attempted to implement this coordination model on a message-passing multiprocessor; it had several drawbacks which led to poor performance. The refinements in the Tarmac and Shared Data Object systems partially address these drawbacks.
    Linda coordinates sub-computations through Tuple Space, an abstraction which emulates shared memory. All data shared among sub-computations passes through Tuple Space via read, insert, and remove operations. The Tuple Space mechanism is novel in that it provides only associative primitives for finding tuples. A sub-computation requests a particular kind of tuple, and the system responds with a random selection from the set of tuples which match the request.
    Like all the examples cited above, Linda is an embedded coordination language. Programmers embed uses of Linda primitives within a host language, usually C or FORTRAN. The notation for these primitives resembles a procedure call in the host language.
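
    The following toy Tuple Space, written in Python, is an assumption-laden sketch rather than real Linda: templates use None as a wildcard, and matching returns the first matching tuple where Linda would choose randomly among the matches. The method names mirror the operations described above rather than Linda's actual embedded syntax.

        import threading

        class TupleSpace:
            """Toy Tuple Space: matching is associative, by content
            rather than by address."""
            def __init__(self):
                self._tuples = []
                self._cond = threading.Condition()

            def insert(self, tup):
                with self._cond:
                    self._tuples.append(tup)
                    self._cond.notify_all()

            def _find(self, template):
                for t in self._tuples:
                    if len(t) == len(template) and all(
                            p is None or p == field
                            for p, field in zip(template, t)):
                        return t
                return None

            def remove(self, template):          # blocking, destructive
                with self._cond:
                    while (t := self._find(template)) is None:
                        self._cond.wait()
                    self._tuples.remove(t)
                    return t

            def read(self, template):            # blocking, non-destructive
                with self._cond:
                    while (t := self._find(template)) is None:
                        self._cond.wait()
                    return t

        space = TupleSpace()
        space.insert(("task", 7, "pending"))
        print(space.remove(("task", None, None)))   # ('task', 7, 'pending')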
    RPC systems embed special procedure calls in a host language. Generally, uses of RPC are indistinguishable from regular procedure calls, but they have special semantics (they may cross address-space boundaries). Similarly, message-passing systems such as VORX, V, and Butterfly embed calls to library functions with special data-exchange semantics in a host language. These calls are expressed in the standard notation of the host language.
    RPC and message-passing systems have a simple coordination model: all shared data is passed in messages (or procedure arguments). This model supports a variety of higher-level coordination models, among them replicated-worker execution. To realize a higher-level model using message passing, one must devise a complex communication protocol. This is a difficult and error-prone task that most application programmers will not have time to undertake.
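
    As a rough Python sketch of what such hand-building involves, the replicated-worker model can be realized on a message-passing substrate as follows; the queues stand in for message channels, and the worker body is invented.

        import queue
        import threading

        # Replicated-worker execution: every worker runs the same loop,
        # and all shared data travels in messages (here, queue items).
        tasks = queue.Queue()
        results = queue.Queue()

        def worker():
            while True:
                item = tasks.get()
                if item is None:           # sentinel message: shut down
                    return
                results.put(item * item)   # stand-in sub-computation

        workers = [threading.Thread(target=worker) for _ in range(4)]
        for w in workers:
            w.start()
        for n in range(8):
            tasks.put(n)
        for _ in workers:
            tasks.put(None)
        for w in workers:
            w.join()
        print(sorted(results.get() for _ in range(8)))   # [0, 1, 4, 9, ...]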
    Object-oriented systems such as Emerald, Amber, and Sloop provide an improvement over message-passing systems. They still use implicit RPC to implement complex higher-level protocols. However, these protocols are easier to understand because the abstract data types defined in such systems encapsulate shared data. That is, a particular RPC call on an abstract data type can only directly modify local data for the called instance of that type. This encapsulation has the additional benefit that one can improve performance by explicitly moving object instances about the network of processors. Sloop, Emerald and Amber all provide an embedded primitive for specifying object locality.
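
    A toy Python rendering of such an encapsulated object; SharedCounter and move_to are invented stand-ins for these systems' actual invocation and locality primitives.

        class SharedCounter:
            """Encapsulated object in the style of Emerald, Amber, or
            Sloop: only its own methods touch its data, and an explicit
            locality primitive relocates the instance."""
            def __init__(self, home_node):
                self._count = 0          # encapsulated shared data
                self._node = home_node   # processor hosting the instance

            def increment(self):         # conceptually an RPC on this
                self._count += 1         # instance; modifies only local data

            def move_to(self, node):     # explicit object-locality primitive
                self._node = node

        counter = SharedCounter(home_node="p0")
        counter.increment()
        counter.move_to("p3")            # co-locate with its busiest caller
        print(counter._count, counter._node)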
    The Ada coordination model is based on task rendezvous. The relative merits of this model are discussed in detail elsewhere. Like the languages discussed above, one can think of Ada as containing an embedded notation for expressing operations in its coordination model.
    Our coordination model is as follows:
    1. All shared memory is explicitly passed between sub-computations.
    2. A sub-computation may destructively modify a block of data only if it owns the sole reference to that data.
    3. All sub-computations are encapsulated. That is, they have unique, well-defined entry and exit points. We call such encapsulated sub-computations operators.
    4. The communication topology connecting individual operators does not change during execution. The topology itself supports conditional expression evaluation.
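
    The following Python sketch is a toy rendering of rule 2, not the Delirium run-time system: Block, share, and modify are invented names, and a real implementation would track references automatically.

        class Block:
            """Shared memory block with run-time reference bookkeeping."""
            def __init__(self, data):
                self.data = data
                self.refs = 1                    # references held by operators

            def share(self):                     # pass to another operator
                self.refs += 1
                return self

            def modify(self, fn):
                if self.refs == 1:               # sole reference: destructive
                    fn(self.data)                # update in place is safe
                    return self
                fresh = Block(list(self.data))   # otherwise copy, preserving
                fn(fresh.data)                   # single-assignment semantics
                return fresh

        b = Block([1, 2, 3])
        b = b.modify(lambda d: d.append(4))      # in place: one reference
        alias = b.share()
        c = b.modify(lambda d: d.append(5))      # copies: b is now shared
        print(b.data, c.data)                    # [1, 2, 3, 4] [1, 2, 3, 4, 5]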
    This model is somewhat restrictive (see table 2 for a comparison between coordination models). In the next section, we will explore some of the disadvantages of requiring a programmer to follow these constraints. However, the model provides several important advantages. As with other coordination languages, programmers can use existing code and program development tools to create sub-computations.
    Most important, execution within the model is deterministic. If there is a bug in the program, it will recur in exactly the same way on every execution, greatly simplifying debugging. Development is simpler because a program that runs correctly on a uniprocessor will run correctly on a multiprocessor. Profiling is also straightforward, as was demonstrated in the case studies.
          in Proceedings of the 1990 ACM/IEEE Conference on Supercomputing, New York
  • Lucco, Steven; Sharp, Oliver "Parallel programming with coordination structures" pp. 197-208
          in Proceedings of the 18th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 1991
  • Lucco, Steven "A dynamic scheduling method for irregular parallel programs" view details
          in SIGPLAN Notices 27(7), July 1992 (Proceedings of the 5th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 22-31, San Francisco, June 1992)