IVTRAN(ID:262/ivt001)

Parallel FORTRAN 


The IV (four) in the name comes from the ILLIAC IV

Parallel FORTRAN for the ILLIAC IV, featuring a parallel loop construct similar to FORALL (the DO FOR ALL statement) and a way of mapping arrays onto the array of processing elements (PEs)

Hardware:
  • ILLIAC IV University of Illinois at Urbana-Champaign/Burroughs

Related languages
FORTRAN IV => IVTRAN   Extension of

References:
  • Barnes, G.H., Brown, R.M., Kato, M., Kuck, D.J., Slotnick, D.L., and Stokes, R.Q. "The ILLIAC IV computer" IEEE Trans. Comput. C-17 (Aug. 1968), pp746-757
  • Kuck, D.J. "ILLIAC IV software and application programming" IEEE Trans. Comput. C-17 (Aug. 1968), pp758-770
  • Burroughs Corporation. "ILLIAC IV system characteristics and programming manual" Defense, Space and Special Systems Group, Paoli, Pennsylvania, June 30, 1970.
  • Millstein, R.M., Krugman, E., and Goldberg, D. "Optimization for an array computer"
          in SIGPLAN Notices 5(09) September 1970
  • Fifth Semi-Annual Technical Report for the Project - Compiler Design for the ILLIAC IV, (14 January 1972 - 13 July 1972), Vol. 1. Massachusetts Computer Associates, Inc., August, 1972 (CADD-7208-1411)
  • Fifth Semi-Annual Technical Report for the Project - Compiler Design for the ILLIAC IV, (14 January 1972 - 13 July 1972), Vol. II. Massachusetts Computer Associates, Inc., August, 1972 (CADD-7208-1411)
  • "The IVTRAN Manual", Massachusetts Computer Associates, Wakefield, Massachusetts, November 1973 (CADD-7311-0111)
  • Millstein, R. E. "Control structures in ILLIAC IV Fortran" pp621-627 Abstract: As part of an effort to design and implement a Fortran compiler on the ILLIAC IV, an extended Fortran, called IVTRAN, has been developed. This language provides a means of expressing data and control structures suitable for exploiting ILLIAC IV parallelism. This paper reviews the hardware characteristics of the ILLIAC and singles out unconventional features which could be expected to influence language (and compiler) design. The implications of these features for data layout and algorithm structure are discussed, and the conclusion is drawn that data allocation rather than code structuring is the crucial ILLIAC optimization problem. A satisfactory method of data allocation is then presented. Language structures to utilize this storage method and express parallel algorithms are described.
          in [ACM] CACM, 16(10) October 1973
  • Sixth Semi-Annual Technical Report for the Project - Compiler Design for the ILLIAC IV, (14 July 1972 - 13 February 1973). Massachusetts Computer Associates, Inc., February, 1973 (CADD-7302-2011)
  • "The IVTRAN Manual, Revised Edition", Massachusetts Computer Associates, Inc., January 1975
  • Erickson, David B. "Array processing on an array processor" pp17-24 Abstract: Central memory is distributed across several processing elements on the ILLIAC-IV and similar array processors. This causes memory to appear two dimensional and raises special problems in the handling of arrays. Assignment of arrays to storage, and development of efficient array mapping functions and accessing techniques are all much more difficult than on conventional machines with "linear" memories. This paper discusses these problems as they relate to IVTRAN, a Fortran-like compiler for the ILLIAC-IV. Alternate solutions, useful in a different environment, are also explored. We shall start by giving a brief overview of the pertinent features of the ILLIAC-IV. The paper then describes IVTRAN constructs which may be used in expressing parallelism and the implications that these constructs have for array storage. Next, array mapping formulas are developed and the array packing problem is treated. Finally, argument passage and Fortran COMMON and EQUIVALENCE statements are discussed.
          in SIGPLAN Notices 10(03) March 1975 Proceedings of the conference on Programming languages and compilers for parallel and vector machines, January 1975
  • Loveman, David B. and Faneuf, Ross A. "Program optimization - theory and practice" pp97-102 Abstract: The conventional program optimization techniques employed by the ILLIAC FORTRAN compiler are general purpose, effective, and efficient. The underlying theory is applicable to FORTRAN and to other high level languages. A unique approach to the gathering of global set and use information about variables as well as careful software engineering of the algorithms has led to the construction of an effective source-to-source optimizer which performs constant propagation, constant computation, common subexpression elimination, reduction in strength, and invariant code motion. We will first consider the type of information which must be gathered about a program, and how this information is used to perform optimization. Then we shall state the algorithm for globally computing the set and use information for program variables. Having discussed the science of optimization we shall turn to the engineering aspects and consider such topics as representation of programs, order of optimization transformations, and efficient computation of global use and set information.
          in SIGPLAN Notices 10(03) March 1975 Proceedings of the conference on Programming languages and compilers for parallel and vector machines, January 1975
  • Millstein, Robert E. and Muntz, Charles A. "The ILLIAC IV FORTRAN compiler" pp1-8 Abstract: This paper provides a basic description of a FORTRAN system for the ILLIAC IV. In this context "FORTRAN system" means exactly what one would expect -- a user familiar with a different system will find no major surprises when he uses ILLIAC FORTRAN. The language is the same -- a dialect of ANSI standard FORTRAN. The processors are the same -- a compiler which generates relocatable binary files from FORTRAN source text, a link editor which collects and joins separately compiled program pieces into a single module, a loader which loads and relocates a single module into ILLIAC memory, a library of functions, and an I/O subsystem which supports formatted and unformatted FORTRAN I/O.

    Extract: Introduction
    Introduction
    This paper provides a basic description of a FORTRAN system for the ILLIAC IV. In this context "FORTRAN system" means exactly what one would expect -- a user familiar with a different system will find no major surprises when he uses ILLIAC FORTRAN. The language is the same - a dialect of ANSI standard FORTRAN. The processors are the same -- a compiler which generates relocatable binary files from FORTRAN source text, a link editor which collects and joins separately compiled program pieces into a single module, a loader which loads and relocates a single module into ILLIAC memory, a library of functions, and an I/O subsystem which supports formatted and unformatted FORTRAN I/O.
    The ILLIAC IV hardware has been described many times, so we will just briefly review its features. An ILLIAC IV quadrant consists of a control unit (CU) and 64 processing units (PUs). Each processing unit consists of a processing element (PE) and a processing element memory (PEM) of 2K 64-bit words. All instructions are interpreted by the CU, which decodes each instruction and broadcasts, synchronously, sequences of microinstructions to each PE. That is, the CU interprets an instruction and then each PE, simultaneously, executes that instruction. One operand may be broadcast from the CU. Other operands are available to the PE from its own PEM or operating registers. In addition, PEs may be disabled for the execution of any given (sequence of) instructions.
    That is, any set of PEs can be (temporarily) turned off during the course of an instruction stream. Thus, if an add instruction is broadcast by the CU, a given PE may execute it (on local data) or ignore it. It is not, however, possible to execute a different instruction.
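    As an illustration (not part of the quoted paper), the enable/disable behaviour described above can be modelled in a few lines of Python. This is only a sketch: the function name broadcast_add, the list-based model of the PEs, and the choice of an even/odd mask are assumptions made for the example, not ILLIAC IV or IVTRAN features.

```python
# Illustrative model only: 64 PEs, each holding one local accumulator value.
NUM_PES = 64


def broadcast_add(accumulators, local_operands, enabled):
    """Model one broadcast ADD.

    Every enabled PE adds its own local operand to its accumulator.
    A disabled PE ignores the instruction; it cannot run a different one.
    """
    return [
        acc + op if on else acc
        for acc, op, on in zip(accumulators, local_operands, enabled)
    ]


acc = [0] * NUM_PES
ops = list(range(NUM_PES))                    # PE i holds the value i locally
mask = [i % 2 == 0 for i in range(NUM_PES)]   # hypothetical mask: even PEs only

result = broadcast_add(acc, ops, mask)
print(result[:6])
```

    In this model the odd-numbered PEs keep their old accumulator value, which is the only alternative to executing the broadcast instruction.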
    The CU is able to perform some integer arithmetic, primarily for loop control and address calculation, but the major computing power resides in the PE. The PEs can perform a standard repertoire of fixed point, floating point, and logical computations.
    An ILLIAC quadrant has 128K of memory, all of which is accessible, in conventional fashion, to the CU. Each PE, however, sees only 2K of this memory. Instructions which reference memory generate an effective address between 0 and 2047 (decimal). This address is used as a displacement in each PEM. Each PE contains a local index register which can be used to modify the virtual address field of an instruction. A routing instruction is provided to allow data transfers between PEs. The PEs are, in effect, connected in a closed circular fashion. The routing instruction transmits a word from each PE to the PE located n positions distant around the ring. A total of 64 words are transmitted: PE0 sends a word to PEn, PE1 sends to PE((n+1) mod 64), ..., and PE63 sends to PE((n+63) mod 64), 0 ≤ n ≤ 63.
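    The ring routing just described can be sketched as follows. This is an illustrative model rather than anything taken from the paper: the function name route and the string labels for the words are assumptions, and only the rule "PE i sends to PE (i + n) mod 64" comes from the text.

```python
NUM_PES = 64


def route(words, n):
    """Return the word each PE holds after a route of distance n.

    Every PE i sends its word to PE (i + n) mod 64, so all 64 words
    move the same distance around the ring in one step.
    """
    if not 0 <= n < NUM_PES:
        raise ValueError("routing distance must satisfy 0 <= n <= 63")
    received = [None] * NUM_PES
    for src, word in enumerate(words):
        received[(src + n) % NUM_PES] = word
    return received


words = [f"w{i}" for i in range(NUM_PES)]   # hypothetical payload: PE i holds "w<i>"
after = route(words, 3)
print(after[:4])
```

    With n = 3, PE3 receives the word from PE0, while PE0 receives the word from PE61 because the ring wraps around.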
    Although the user will find no major surprises in the ILLIAC FORTRAN system, he will notice some new features since the ILLIAC is hardly a "FORTRAN machine". In particular, he will receive, as an optional part of his compilation print-out, a rewritten source level version of his program in IVTRAN, an extended FORTRAN with syntactic structures for expressing parallelism and describing array storage maps. A major part of the translation process is the detection of parallelism in the original program and the rewriting of that program in IVTRAN with the parallelism explicitly represented. The processor which does this part of the translation is called the Paralyzer (parallelism analyzer and synthesizer) as is its most significant subpart. It can be used independently of the remainder of the compilation process as the subprocessors labelled PARSE, PARALYZER and TRANSCRIBER in Figure 1.
    Once parallelism has been explicitly represented and arrays have been mapped into storage, the remainder of the translation process is relatively straightforward. The optimizer and code select portions of the compiler differ only in degree from similar portions of more conventional compilers. For example, the scope of the optimizer is broader than that of any existing compiler - it is truly global and not restricted to any subset of flow blocks - but the optimizations performed - common subexpression elimination, constant propagation, etc. - are conventional.
    Finally, once relocatable binary files have been generated, there is the usual collection of processors to link separate compilations into load modules, retrieve intrinsic functions from libraries, load and relocate load modules into memory, and support run-time I/O. The process is entirely straightforward; it is complicated only by the complex ILLIAC IV system environment.
    These three sections - Paralyzer, compiler, and support package - comprise the FORTRAN system. The user with an existing code will use the entire system and will note little difference from other systems (other than the addition of the rewritten IVTRAN source listing). It is, however, possible to use the system in a different way, bypassing the Paralyzer and submitting a program written in IVTRAN.
    Extract: IVTRAN
    IVTRAN
    IVTRAN serves a double role in the compiler. It is a "standard" FORTRAN, an amalgam of IBM and CDC FORTRANs (with conflicts between these languages resolved by reference to the ANSI standard), and it is also an extended FORTRAN with explicit syntactic constructs for expressing parallelism and describing storage mappings.
    • As a standard FORTRAN, IVTRAN is designed so that many existing FORTRAN programs are syntactically legal IVTRAN programs. Thus, a prospective user of the ILLIAC FORTRAN system does not have to perform a technically trivial hand translation to rewrite an existing program in IVTRAN. For example, two-way logical IFs (CDC) are allowable; .N. as an abbreviation for .NOT. (CDC) is allowable; DEBUG statements (IBM) are allowable; REAL * 8 (IBM) declares a double precision variable (as does DP); etc.
    • As an extended FORTRAN, IVTRAN incorporates new syntactic structures - the DO FOR ALL statement, allocation declaration, etc. - for expressing parallelism and describing storage maps. These extensions were made so that the code deformation produced by the Paralyzer could be examined at the source language level. Hence, they are designed to reflect the capabilities of the Paralyzer more than to provide a human-engineered ILLIAC language. Nevertheless, these structures are easy to use and do provide a usable high level ILLIAC programming language.
    We will now discuss the principal IVTRAN extensions. Further details may be found in [1]. We will treat only the three most significant extensions - the DO FOR ALL statement, the allocation declaration, and the OVERLAP and DEFINE statements.

          in SIGPLAN Notices 10(03) March 1975 Proceedings of the conference on Programming languages and compilers for parallel and vector machines, January 1975
  • Presberg, David L. "The Paralyzer: Ivtran's Parallelism Analyzer and Synthesizer" pp9-16 Abstract: The ILLIAC IV Fortran compiler's Parallelism Analyzer and Synthesizer (mnemonicized as the Paralyzer) detects computations in Fortran DO loops which can be performed in parallel. It is a step of the compiling process which lies between source language parsing and target code generation, and as such can be considered as a high-level optimization step specific to the ILLIAC architecture. The Paralyzer performs its transformations within the Intermediate Language tables of the compiler. The parallel execution constructs introduced into the user's program are those which can be expressed in the extended Fortran language, IVTRAN, the source language of the compiler [1]. With a decompiler from the Intermediate Language to IVTRAN source, the Paralyzer can act as a source-to-source translator. Some pertinent characteristics of the ILLIAC IV motivate the parallelism detection methods employed by the Paralyzer. ILLIAC is in the general class of parallel processors known as array processors. That is to say, it performs identical computations in a lock-step, synchronous fashion over separate data streams. Its computational access to main memory is highly constrained: each of the 64 Processing Units can access directly only a private section of the whole memory. Data can be passed from one Processing Unit to another by a relatively expensive routing instruction. This is executed identically by all Processing Units and passes data a uniform end-around distance in the fixed ordering of the Processing Units. The machine executes most efficiently those computations which are element-by-element operations on vectors or arrays. Thus, the most fruitful sources of parallelism in Fortran programs intended for ILLIAC IV execution are DO loops containing array references with subscripts depending on the DO index variables.
          in SIGPLAN Notices 10(03) March 1975 Proceedings of the conference on Programming languages and compilers for parallel and vector machines, January 1975
  • Perrott, R. H. and Stevenson, D. K. "Users' experience with the ILLIAC IV system and its programming languages" Abstract: The ILLIAC IV is a unique machine which has led the research and development of lockstep parallel processing. The machine has been operational since 1973, in experimental mode, and since 1975 in full production mode. There have been on the order of a hundred users of the machine and these users and their codes have been well documented. Four languages are available on the machine ranging from high level to machine code. A survey has been conducted of the users in order to determine how the ILLIAC IV has been employed and how the high level programming languages have facilitated the use of this machine. This paper presents the results of that survey. The survey attempts to confirm or eliminate some of the folklore that has grown up around the ILLIAC IV facility. It can be helpful in the design of the next generation of supercomputers and their languages and in the improvement of the present generation of languages. The responses to the survey indicate: 1) that the ILLIAC IV has been accepted by the scientific community; 2) that a wide range of different application areas have used the machine; 3) that users have had to construct their programs so as to minimise the effects of the serious bottleneck created by the movement of data between the main and backing stores; and 4) that the high level programming languages available have insufficient or inefficient structures which at times require the use of machine code.
          in SIGPLAN Notices 16(07) July 1981
  • Davis, E. W. (Raleigh, NC) Review of Hord
    Extract: Content
    It is perhaps ironic, perhaps appropriate, that this book should be published in the year that ILLIAC IV was disassembled for scrap [1]. The author, who was a manager for the Institute for Advanced Computation (IAC) where ILLIAC IV was operated, reports on more than 15 years of the computer's history. A brief discussion of political and technical implementation difficulties during the late 1960s and early 1970s development period at the University of Illinois provides an interesting perspective on the project. Not many computer histories can include sentences like "It was during the firebombing and rioting that shook the University of Illinois campus in the spring of 1970 that the ILLIAC IV computer project reached its climax." It was an exciting time. Any of us who were there could add items to the discussion of difficulties and successes.
    The book is organized into chapters on Background, The Computer, Programming, Applications, and Commentary. Much of the material is based on IAC newsletters and reports and published papers. The book gathers material together in an organized fashion and supplements it with supporting narrative and figures. Unfortunately, there is no index to aid the reader in locating items of interest. The book as a whole does not have a list of references, only "sources," most of which are IAC documents. Individual sections based on previously published work usually do include references.
    More than half of the book is devoted to applications. It provides a thorough view of formulating algorithms for a parallel machine and demonstrates the usefulness of ILLIAC IV over a range of applications. By contrast, coverage of ILLIAC IV hardware is quite slim. In a 64-page chapter called The Computer, only 11 pages are concerned with ILLIAC IV, and less than one page with the actual processing unit. The rest of the material is a documentary description of the IAC facility, what IAC did to implement overlap between control and processing units, and some performance figures. As just one example of the lack of appropriate balance of description, more space is devoted to the IAC facility air conditioning system than to the ILLIAC IV processing unit.
    There are many inconsistencies in the text that could have been eliminated by careful editing. For example, the overview of the IAC computational facility has the sentences "Five physically distinct memory subsystems comprise the three levels of the hierarchy. The third or primary level separates into the PEM (Processor Element Memory), the I4DM, and the Central Memory." Why is the third level the primary level? Why does this single level include both solid state random access memory (the PEM) and disk memory (the I4DM)? The book also states "This hierarchy is illustrated in Figure 3.2." Figure 3.2 then shows four levels of memory.
    Basing sections of the book on reports prepared by others introduces redundancy. Several figures are duplicated, along with their textual discussion. Figure 3.3 is captioned "Simplified Diagram of the ILLIAC IV." The same diagram is used as Figure 4.1 but is labeled "Example #2 Overlap Code." The obvious mislabeling should have been corrected. However, simply referring to the earlier figure would have avoided the problem altogether. The author's intent might have been to make each section self-contained; if so, direct reprints of papers might have been better than just basing sections on previous work. Actually, it's not clear what the author wrote and what is taken from others.
    The programming of ILLIAC IV is discussed both in the chapter on applications and in a separate chapter titled Programming. The programming chapter emphasizes the CFD and GLYPNIR languages. It briefly examines two other languages: IVTRAN and APPLE. Early in the chapter it says "The ILLIAC is difficult to program; it is even harder to program well." This must be meant as a comparison with more conventional machines. Yet later in the same chapter we see "Software development for the ILLIAC is not much different from software development for other machines." Either it is more difficult or about the same; the reader should be given a clear view of the programming complexity. In the first few paragraphs of the chapter the author reveals a management problem at the IAC. He tells us "There is no central repository for application programs. Even the programs developed at IAC get lost in time or become useless through incomplete documentation."
    According to the author, the book was written primarily for computer professionals. It does include much material of interest to them and to a broader group of technical people. Certainly not all parts are of equal interest to all readers, but the author has performed a service in gathering diverse material on this influential machine into a single volume.

  • Hord, R. Michael "The ILLIAC IV: the first supercomputer" Computer Science Press, Inc., Rockville, MD, 1982
  • Mehrotra, Piyush; Van Rosendale, John; Zima, Hans "High Performance Fortran: History, Status and Future" Technical Report TR 97-8, Institute for Software Technology and Parallel Systems, University of Vienna, September 1997. Abstract: High Performance Fortran (HPF) is a data-parallel language that was designed to provide the user with a high-level interface for programming scientific applications, while delegating to the compiler the task of generating an explicitly parallel message-passing program. The main objective of this paper is to study the expressivity of the language and related performance issues. After giving an outline of developments that led to HPF and shortly explaining its major features, we discuss in detail a variety of approaches for solving multiblock problems and applications dealing with unstructured meshes. We argue that the efficient solution of these problems does not only need the full range of the HPF Approved Extensions, but also requires additional features such as the explicit control of communication schedules and support for value-based alignment. The final part of the paper points out some classes of problems that are difficult to deal with efficiently within the HPF paradigm.
    Extract: IVTRAN
    One of the first languages to allow users to control the layout of data was IVTRAN, a language developed for the SIMD machine ILLIAC IV. Users could indicate the array dimensions to be spread across the processors and those which were to be local in a processor. Combinations resulting in physically skewed data were also allowed.
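    To make the idea of spread, local and skewed dimensions concrete, here is a small sketch (not taken from the report). The two mapping functions are the textbook "straight" and "skewed" storage schemes for a 64-PE machine; their names, the 0-based subscripts and the 64 x 64 array size are assumptions for illustration, and IVTRAN's actual mapping formulas may differ.

```python
NUM_PES = 64


def straight_map(i, j):
    """Spread the second subscript across PEs; keep the first local.

    Returns (pe_number, local_offset) for element (i, j).
    """
    return j % NUM_PES, i


def skewed_map(i, j):
    """Rotate row i by i positions so rows and columns both span all PEs."""
    return (i + j) % NUM_PES, i


# Which PEs hold the 64 elements of column 5 and of row 5 of a 64 x 64 array?
col_straight = {straight_map(i, 5)[0] for i in range(NUM_PES)}
col_skewed = {skewed_map(i, 5)[0] for i in range(NUM_PES)}
row_straight = {straight_map(5, j)[0] for j in range(NUM_PES)}
row_skewed = {skewed_map(5, j)[0] for j in range(NUM_PES)}

print(len(col_straight), len(col_skewed), len(row_straight), len(row_skewed))
```

    Under the straight mapping a whole column sits in a single PE, so it cannot be processed in parallel, whereas the skewed mapping places both a row and a column in 64 distinct PEs.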
    Extract: Conclusion
    Conclusion
    HPF is a well-designed language which can handle most data parallel scientific applications with reasonable facility. However, as architectures evolve and scientific programming becomes more sophisticated, the limitations of the language are becoming increasingly apparent. There are at least three points of view one could take:
    1. HPF is too high-level a language --- MPI-style languages are more appropriate.
    2. HPF is too low-level a language --- aggressive compiler technologies and improving architectures obviate the need for HPF-style compiler directives.
    3. The level of HPF is about right, but extensions are required to handle some applications for some upcoming architectures.

    All three of these alternatives are being actively pursued by language researchers. For example, HPC++ [?] is an effort to design an HPF-style language using C++ as a base. On the other hand, F-- [?] is an attempt to provide a lower-level data-parallel language than HPF. Like HPF, F-- provides a single thread of flow control. But unlike HPF, F-- requires all communication to be explicit using "get" and "put" primitives.

    While it is difficult to predict where languages will head, the coming generation of SMP-cluster architectures may induce new families of languages which will take advantage of the hardware support for shared-memory semantics within an SMP, while covering the limited global communication capability of the architectures. In this effort the experience gained in the development and implementation of HPF will surely serve us well.