INC(ID:5390/inc001)


language for incremental computations


References:
  • Yellin, D. M. and Strom, R. "INC: A Language for Incremental Computations". Technical Report RC 13327, IBM, April 1988.
  • Yellin, D. and Strom, R. "INC: A Language for Incremental Computations". Extract: Introduction
    An INC program consists of a number of components, with the outputs of some components serving as the inputs to other components. It is convenient to view INC programs in the same way that one views electronic circuits. Because of this analogy, an INC program is called a circuit. The components of an INC circuit operate on data elements, and connectors pass the outputs of some components to the inputs of other components.
    Every data element in an INC circuit has a type. Types are either primitive (integers, reals, booleans, or chars) or constructed. There are two constructed types, tuples and bags (multisets). A tuple is of type t = <t1, ..., tk>, where each ti is any INC type, and a bag is of type t = bg(t'), where t' is any INC type. For instance, one possible type in INC is bg(). Recursive types are not supported in INC.
    Every component in an INC circuit consumes some data (scalars, tuples, and bags) and produces some data (scalars, tuples, and bags). In writing an INC circuit, one specifies the components and the connections between them. Those inputs to components that are not produced by any other components are the circuit’s inputs, and those outputs that are not consumed by any other component are the circuit’s outputs. The goal of an INC circuit is to produce outputs from its inputs.
    An INC component can be classified as either a basic or a hierarchical component. The basic components perform primitive operations on data elements. For example, the + component consumes two data elements of type integer and produces an integer (their sum). The tuple projection consumes a tuple <e1, ..., en> and produces the kth field ek. The basic components that operate on bags include: a duplicator, a merger, a remover, a compactor (which removes duplicate occurrences of values from a bag), a bag former, and a bag stripper.
    One very useful and powerful INC component is the transitive closure. This component consumes a set B representing a binary relation over the domain of a type t, and produces the set B’ representing the transitive closure of B. Both B and B’ are of type < t, t >.
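    The semantics of this component can be sketched in plain Python (not INC syntax; the function name is my own), modeling a binary relation as a set of pairs and computing the closure as a fixpoint:

```python
def transitive_closure(b):
    """Return the transitive closure of the binary relation `b`,
    given as a set of (x, y) pairs over some domain."""
    closure = set(b)
    while True:
        # Join the relation with itself: (x, y) and (y, w) yield (x, w).
        new_pairs = {(x, w) for (x, y) in closure
                            for (z, w) in closure if y == z}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

edges = {(1, 2), (2, 3)}
# transitive_closure(edges) == {(1, 2), (2, 3), (1, 3)}
```

    The fixpoint loop adds joined pairs until no new pair appears, which is the standard set-theoretic reading of transitive closure over <t, t> pairs.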
    The hierarchical components in INC consume a bag (or bags) as input, and operate on each element of the bag. They are called hierarchical components because they make use of subcircuits (one can think of these as circuit parameters). One example of a hierarchical component is the transformer. A transformer is composed of a circuit parameter called Trans, which consumes an element of type t and produces an element of type t'. The transformer T consumes a bag B of type bg(t) and produces a bag B' of type bg(t'). B' is the bag [Trans(b) | b in B]. For instance, say that the subcircuit addOne is supplied as the Trans parameter of a transformer, where addOne takes an integer and adds 1 to it.
    Then the transformer would consume a bag of integers B, and produce the bag of integers B', where
    B' = [b + 1 | b in B].
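    The transformer's bag-comprehension semantics can be sketched in Python (an illustrative model, not INC syntax), with a bag represented as a multiset via collections.Counter:

```python
from collections import Counter

def transformer(trans, bag):
    """Produce the bag [trans(b) | b in bag], applying the Trans
    subcircuit to every element while preserving multiplicities."""
    out = Counter()
    for element, count in bag.items():
        out[trans(element)] += count
    return out

def add_one(n):
    """Model of the addOne subcircuit from the text."""
    return n + 1

b = Counter([3, 3, 7])
# transformer(add_one, b) == Counter({4: 2, 8: 1})
```

    Using Counter keeps duplicate elements distinct, matching the bag (rather than set) semantics the extract describes.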
    Extract: Introduction
    1 Introduction
    There are many problem domains which require repeating a computation many times, each time on slightly different data. We list a few of these areas below:
    - Advanced editing environments: In these environments, as a programmer modifies his/her program, an analyzer is invoked to check the syntactic and semantic validity of the program (e.g., the Cornell Synthesizer [11,12]). Each time the analyzer is invoked, it performs the same computation, albeit on slightly different data. A similar situation exists for incremental compilers.
    - Spreadsheet programs: In a spreadsheet program, the user wants to compute some formula, and he wants to know how the results differ subject to different constraints (e.g., market conditions, interest rates, etc.). Typically, the user enters some initial data, computes an initial result, and then repeatedly changes a few of the parameters and recomputes the results. Each time the program is invoked to compute the formula, it performs the same calculation on slightly different data.
    Researchers have addressed the problem of how to optimize computations to take advantage of the fact that similar computations have already been performed. In each different problem domain, specialized techniques have been introduced. We believe that these diverse problems, and many more, are really instances of a single problem, and would benefit from a unified language-based solution. To see how these problems are related, consider a function f and I1, ..., In, successive inputs to f. Let Ij+1 be "closely related" to input Ij. The problem of incremental computation is to devise a scheme for computing f(I1), ..., f(In) which exploits the relationship between Ij and Ij+1 so that the cost of computing f(I1), ..., f(In) is less than the sum of the costs of computing each f(Ij) individually. Each of the problems discussed above is a specific case of an incremental computation.
    One set of approaches [11,12,2] to incremental computation involves building a dependency graph. Each node in the dependency graph represents an intermediate step in the computation. Directed edges between nodes indicate dependencies between intermediate computations. Incrementality is achieved by re-evaluating as few nodes as necessary. A node is re-evaluated iff its inputs have changed. Unfortunately, once a node needs to be re-evaluated, it is re-evaluated completely. Because previous computations within a node are not reused, the implementation does not fully exploit incrementality. Consider a node that sums up a set of numbers. A single insertion/deletion to the set causes the node to re-execute from scratch, even though there is an obvious way to incrementally update the sum. These sorts of computations occur frequently in compiler front-ends and spreadsheet programs. We refer to the dependency graph approach to incrementality as coarse-grain incrementality.
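    The "obvious" incremental update for such a summation node can be sketched as follows (a hypothetical Python illustration, not code from the paper): each insertion or deletion adjusts a saved total in constant time, instead of re-summing the whole set from scratch.

```python
class IncrementalSum:
    """A summation node that maintains its output across changes."""

    def __init__(self, numbers):
        # Initial (from-scratch) computation: O(n).
        self.total = sum(numbers)

    def insert(self, x):
        # Incremental update: O(1) per inserted element.
        self.total += x

    def delete(self, x):
        # Incremental update: O(1) per deleted element.
        self.total -= x

node = IncrementalSum([1, 2, 3])
node.insert(10)   # total is now 16
node.delete(2)    # total is now 14
```

    This is exactly the fine-grain reuse that a coarse-grain dependency graph forgoes when it re-executes the whole node on any input change.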
    A different approach to incremental computation is found in the works of Paige[9,8]. This approach, called finite differencing, has its origins in the technique of strength reduction. Finite differencing addresses the problem of incrementality in terms of a single monolithic function. Unfortunately, it requires that the user provide the transformational rules needed to convert the function into an incremental one. Furthermore, there is no guarantee that the transformed function will actually perform better, as it is only as good as the transformational rules that have been provided. A more recent work by Cai and Paige[4] presents a language Lr. Any algorithm expressible in Lr can be transformed by the Lr compiler into a linear time algorithm. Lr does not, however, support incremental programs and it cannot express non-linear algorithms. We refer to the finite differencing approach to incrementality as fine-grain incrementality.
    This paper introduces a new approach to incremental computation that extends prior work by providing a single application-independent language and by supporting both coarse- and fine-grained incrementality.
    We provide a language, INC, for the specification of programs, and a set of techniques for optimizing INC programs for efficient incremental execution. A programmer writes a program in the INC language giving no special consideration to incrementality. The INC compiler generates an efficient incremental program automatically.
    An INC program is represented as a dataflow graph in which inputs are fed through a network of functions to produce outputs. The transformed INC program is a network of processes which save data between executions and exchange messages describing how the inputs and outputs have changed. As in techniques based on attributed graphs, outputs are not recomputed if the inputs have not changed. However, in the case when the inputs are complex aggregates and the functions complex, only “short” messages reflecting the changes to the input are exchanged, and only the incremental change to the computation is performed. The incremental computation and the saved data structure needed to support it are generated automatically by the compiler using finite differencing techniques.
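    A minimal sketch of such a process network (hypothetical Python; the class and method names are mine, not the compiler's actual output): each process saves its state between executions, consumes a short message describing changes to its input bag, and forwards the corresponding change downstream, so the work done is proportional to the change rather than the input size.

```python
from collections import Counter

class TransformerProcess:
    """Applies a function elementwise to a bag, incrementally."""

    def __init__(self, trans):
        self.trans = trans
        self.state = Counter()          # saved output bag

    def on_delta(self, inserts, deletes):
        # Translate the input delta into an output delta.
        out_ins = [self.trans(x) for x in inserts]
        out_del = [self.trans(x) for x in deletes]
        self.state.update(out_ins)
        self.state.subtract(out_del)
        return out_ins, out_del         # "short" message forwarded downstream

class SumProcess:
    """Maintains the sum of its input bag across executions."""

    def __init__(self):
        self.total = 0

    def on_delta(self, inserts, deletes):
        self.total += sum(inserts) - sum(deletes)

double = TransformerProcess(lambda n: 2 * n)
total = SumProcess()
total.on_delta(*double.on_delta([1, 2, 3], []))   # total.total == 12
total.on_delta(*double.on_delta([5], [1]))        # total.total == 20
```

    Note that after the initial execution, each subsequent run touches only the inserted and deleted elements, mirroring the "short messages" described above.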
    These transformations guarantee that the cost of executing the resultant incremental program will never exceed the cost of executing the original program (at least in terms of expected asymptotic complexity). It becomes possible to analyze a program not only in terms of its static complexity (how long it takes to execute the program from scratch on input of size n), but also in terms of its incremental complexity (how long it takes to execute when there are m changes to the input).
    The research described in this paper most closely resembles the extensions to the Cornell Synthesizer Generator described in [6] and [7]. In [6], several inefficiencies of incremental attribute updating are addressed. One problem discussed is the aggregate update problem, where a small change to an aggregate can require a substantial amount of unnecessary recomputation. To overcome this difficulty, a finite differencing technique called differential propagation is introduced. Although this technique does introduce fine-grain incrementality into the dependency graph approach, its application is restricted to finite map data structures. The operations allowed on finite maps are also severely restricted.
    The work in [7] describes a method for incrementally updating relational database (RDB) views. As INC has many RDB operators, there is a similarity between some of the incremental algorithms employed by INC and those employed by the incremental view update algorithm. Both methodologies can better exploit certain sorts of incrementality appearing in language-based editing environments than traditional techniques (see section 4). INC, however, is a richer language than a RDB query language, and can express computations over a wider range of applications. Among the computations expressible in INC but not in a query language are certain arithmetic computations, transitive closure[1], and computations over complex relations (those not in first normal form).
    In this paper, we first review the INC language. We then outline how the compiler transforms an INC program into an incremental one. Finally, we give two small but powerful examples of INC programs, drawn from different application domains, and describe the transformed incremental programs generated by INC.
      in SIGPLAN Notices 23(07) July 1988 (Proceedings of the SIGPLAN '88 Conference on Programming Language Design and Implementation, June 20-24, 1988, Atlanta, Georgia, United States)