ABL(ID:3967/alg009)

Language manipulation of files 


for Algebraic Business Language

Lionello Lombardi's functional language, based on his "theory of files"

Possibly the first to describe itself as Non-von-Neumann, and to justify itself with reference to Church, therefore likely to be the first attempt at a functional on non-von-Neumann language

Also likely to be the first comprehensive attempt at defining data retrieval formally

Renamed ADSL later


Related languages
Church typed-lambda calculus => ABL   Influence
N1 => ABL   Evolution of
ABL => ADSL   Renaming

References:
  • Lombardi, Lionello. "Theory of files" p137 view details Abstract: The theory of files is a tool for the logico-mathematical treatment of automatic non-numerical data processing problems, such as machine accounting, information retrieval and mechanical translation of languages. The main result which has been obtained so far from the application of the theory of files is the formulation of a simple pattern to which the data flow of any information processing procedure conforms, regardless of how many files are involved. The flow of each file can be controlled and coordinated with the flow of the other files by means of five Boolean parameters, called "indicators".

    A specially designed Algebraic Business Language exploits this result for the purpose of programming digital data processing systems. This paper also probes into the impact of the theory of files upon the logical design of digital information processing systems.

          in [JCC 18] Proceedings of the 1960 Eastern Joint Computer Conference, New York, December 1960 view details
  • Lombardi, Lionello "Inexpensive punched card equipment" J. Mach. Accounting 12, 8 (1961) 11-18. view details
          in [JCC 18] Proceedings of the 1960 Eastern Joint Computer Conference, New York, December 1960 view details
  • Lombardi, Lionello "System Handling of Functional Operators" view details
          in [ACM] JACM 8(2) April 1961 view details
  • Lombardi, Lionello "Mathematical aspects of non-arithmetic data processing procedures" view details Extract: Introduction
    In this paper are defined the basic elements of' the theory of files, which was presented in Lombardi (1962). In particular, a system language is considered, which allows the use of mathematical methods for the description of procedures in which non-numerical information processing is of primary importance. This language is called the Algebraic Data System Language. The main features of the language are the following:
    - The Algebraic Data System Language uses logico-mathematieal techniques to control the data flow.
    - A procedure description consists of a sequence of statements, which are not in a biunique correspondence with the boxes of any flow diagram of the machine control, in other words, the description of a procedure is asyncronous with respect to the manner in which the procedure  is carried out by the machine.
    - The schemes by which any problem is solved, and in particular the scheme of the data flow, are determined by the computer, in accordance with its internal structure and input- output equipment. Procedure descriptions are consequently independent of the hardware with which they are carried out.
    - A problem description consists of the layout of the input data and of the output results (date description) and of a set of equations (procedure description) which relate these results to the data. The human element is eliminated, not only in coding, but also in the design of flow diagrams.
    - The Algebraic Data System Language provides for a shorthand reference to functions which are discretely defined by tables, and for the stating of reeursive operations concerned with information organized in tables, which avoids any explieit setting of loops or tallies.
    The features which are original with the Algebraic Data System Language and not shared by other system languages are discussed thoroughly in the paper; when no need of special treatment arises, only references are made. In Section 1 the basic definitions are given, the boolean and ordinary expressions and the conditions are discussed, and the first examples of statements of the Algebraic Data System Language are exhibited. Section 2 is concerned with the table- functions, the handling of tables, and the recursive operations. Section 3 deals with the information to be processed. Files are defined, the "derivatives of a file" are introduced, and the scheme of the functional relationship between files is investigated. In addition, the use of keys in order to coordinate the flow of data is discussed. A special kind of expression, called "flow control expression," is introduced. The concept of "pulse" is discussed, and the laws by which the flow of data is controlled are defined. As an example, the data flow of a billing procedure is described in terms of Algebraic Data System Language statements.
    Further mathematical developments are studied in Section 4. Some of the basic elements for the construction of a "boolean algebra of files" are brought out, and it is shown how a complete sorting procedure may be defined by a simple statement which to a certain extent is independent of the sorting method adopted.
    Although this paper deals primarily with an advanced form of the Algebraic Data System Language, intermediate and simpler forms of the language should also be taken into consideration.
    Extract: Background
    Background
    The effort which led to the conception of the principle of the stored program computer was primarily directed towards the construction of a device able to handle problems of a mathematical nature. One of the difficulties which delayed this conception was the fact that one was not accustomed to think of a mathematical problem in terms of the steps by which it is solved, i.e. in terms of a flow diagram. Even for computational purposes a mathematical problem was stated by describing its issues with sets of equations. A breakthrough was made when it was first realized that one had to consider machinery able to handle mathematical problems which are described by means of procedures, namely of flow charts, rather than by the definition of their issues. This idea was primarily responsible for a major effort in the field of digital data processing devoted to the development of stored program computers.
    A few years later, when the possibility of using already existent computers for accounting purposes was first brought up, there arose a new, antithetic problem: an accounting problem is usually known only in terms of the steps of the procedure involved, rather than by a precise definition of its issues. Consequently it was easy to design a flow chart of any accounting problem, since a flow chart consists essentially of these steps. On the contrary, no way of defining such a problem in a compact way, similar to the way in which a mathematical problem is described by equations, was available, nor is available today. It appears that the description of non-numerical problems in terms of their issues and of the functional relations between these issues and the data is the first of the steps to be performed in order to have a deeper insight into these problems, to handle them mathematically, and to automatize their programming thoroughly. The first need, in our opinion, is to develop a language for the purpose of describing them in this way.
    The theory of files is an effort in this direction. Files and sets of files are defined and discussed mathematically. The mathematical nature of the laws which coordinate the data flow of nomnumerical systems is investigated. These laws, which appear to be common to all non-numerical processes, can be stated easily by means of certain boolean variables, called indicators, which are defined in Section 3 of this paper. The Algebraic Data System Language is primarily designed for the description of non-numerical problems, independently of the procedures they involve. The "flow control expression" feature (see Section 3) is an attempt to provide for a language tool by means of which the possibility of efficiently stating the issues of any system of this nature is made available. Extract: Definitions
    1. Definitions
    1.1 Mode: Throughout this paper we shall consider variables and constants in two modes -- ordinary and boolean. Ordinary entities are either alphameric or numerical. Arithmetic operations and comparisons are performed with ordinary entities (constants and variables). Boolean operations and comparisons are performed with boolean entities.
    1.2 CONSTANTS. There are two boolean constants, namely 1 and 0; we shall also refer to them with the words YES and NO, or TRUE and FALSE, respectively. Ordinary constants are either integers (signed or unsigned), rational-decimal numbers (signed or unsigned), or alphameric words. Constants are introduced into a problem description as literals, or by means of special entries in the data description; constants introduced in the latter way are called defined constants.
    1.3 VARIABLES. A variable may be in either mode, and is referred to by an alphamerie name, containing at least one alphabetic character. Only the names of variables of a special kind, indicators, contain a special character. If a variable is a field of the data, then its mode has to be specified in the data description. The mode of any ogher variable is derived h'om the description of the variables in terms of which it is defined.
    Variables may be introduced into the description of a problem in any of the following ways:
    (a) as names of fields, or groups of fields, in the data description;
    (b) by special statements in the data description, where the initial value is assigned (notice that variables and defined constants are essentially two different aspects of the same concept);
    (c) by writing their name as the left-hand member of a formula (see Section 1.8)
    (d) as indicators or types (see Section 3) -- both indicators and types are boolean variables; the names of variables other than indicators may not contain any special characters;
    (c) as indexes of tables (see Section 3.2).
    Conditional variables are in the ordinary mode; the names of the values which conditional variables may assume are called determinations. The determinations are variables at the boolean mode.
    1.4 FUNCTIONS. A function is referred to by its name, followed by a list of arguments separated by commas, enclosed in parentheses. The function library is open; functions which determine maxima, minima, absolute values, positive parts, (i.e. max (0, x)), or perform roundings, date arithmetic, British currency conversion, congruent arithmetic, basis-12 arithmetic, etc. are of primary importance.
    1.5 MARKS. In building expressions one may use the following marks:
    + , - , * , / , =, ~ , > , >_, <, _<, -=-, ~--;
    parentheses are used in the usual manner.
    1.6 ELEMENTARY EXPRESSIONS. We define the concept of expression implicitly; we begin by defining a particular kind of expression, which will eventually be referred to as elementary expression.
    A sequence of constants, variables, functions, separated by marius which are neither ~- nor +--, and by parentheses, representing a single-valued funcgon of the variables involved, is an "expression" (elementary expression) provided that the standard parentheses rule is respected.
    The concept of expression will be extended in Section 1.12. When not otherwise specified, the word expression will refer to the most general type of expression of the Algebraic Data System Language.
    The hierarchy of operations is the standard one completed by the convention that, comparisons are performed prior to arithmetie operations and calculations of functions.
    The usage of marks is related to the mode of the variables and constants by the following rules:
    I. >, <, ~, >_, =, ~ always separate variables or constants in the same mode.
    II. The -t- and * must separate entities in the same mode. The operation is performed accordingly (either arithmetically or logically), and the mode of the result is in the mode of the operands.
    III. The minus sign may only separate ordinary entities, and denotes subtraction.
    IV. The solidus "/" when separating ordinary entities denotes division; the same sign when followed by a boolean entity denotes inversion (complementation). In either ease, if no expression precedes the solidus, a "1", either boolean or ordinary, depending upon the mode of the entity on the right hand, is assumed to precede it.
    1.7 STRUCTURE. The structure of an expression in the ordinary mode is composed of elements taken from the properties of the allowable entries. More precisely, it includes:
    - the property of being either alphameric or numerical;
    - if it is numerical, the property of being either signed or unsigned and the property of being either integral or rational;
    - if it is alphameric, its length;
    - the editing format.
    As an optional feature, a wider selection of possible structures may be provided, and the meaning of the marks + , - , *, / may be extended accordingly. For example, entities measured either by the dozen or according to the British currency system may be considered.
    The structure of a field of the data has to be stated in the data description; the structure of any other expression, whatever it may be, is derived from the structure of the entries.
    When a new function is included in the library, a chart relating the mode and structure of the arguments with those of the function should be provided by the human librarian.
    1.8 FORMULAS. A formula consists of two expressions separated by the mark "*--". The entry immediately on the left of "_L" may be the name of a variable already defined, or a word which may become the name of a variable, or a table +_'unction (see Section 2). The entry on the right will be either an expression or the word ERASE. This latter feature is connected with the use of table functions and is discussed in Section 2.
    The mark 'q--" is interpreted imperatively. The meaning of a formula is the following: Set the variable on the left side of "+-" equal to the current value of the express.ion on the right side of "+--". If a variable having the name of the variable on the left side of "+--" was not defined previously, a new variable having this name is introduced, eorresponding in mode and structure with the expression on the right side.
    1.9 SYMBOLS. A symbol consists of the sign "==-" preceded by a non-numerical word W and followed by an expression E. A symbol never has any operational meaning. It is just used in order to provide a shorthand reference to the expression E, by calling it "W". Any symbol may be placed anywhere in the procedure or in the data description.
    1.10 CONDITIONS. A boolean expression E enclosed in square brackets is called a condition. A condition is immediately preceded by an expression, which: is to be considered or ignored, depending on whether E is currently 1 or 0. A couple consisting of an expression and of the condition on which it depends is called a conditional expression. The two items which make up a conditional expression are not separated by a comma, but may be separated by the sign.
    1.11 LOOPS. A loop consists of braces enclosing the following two items:
    (1) a sequence of ordinary unsigned integral variables, called indexes, separated by commas and enclosed in parentheses; and (2) an expression. The use of loops is connected with the handling of tables and is fully discussed in Section 2.
    1.12 EXPRESSIONS. The most general expression is defined inductively by the following three statements:
    (1) Elementary expressions, formulae, conditions, conditional expressions,names of files and loops are expressions.
    (2) A sequence of expressions, separated by commas, is an expression.
    (3) An expression enclosed in parentheses is an expression.
    The hierarchy of the operation is determined by the following rules:
    (a) A condition is checked immediately before the expression to which it refers is considered.
    (b) The hierarchies of parentheses, brackets and braces are observed.
    (c) Compatibly with (a) and (b), multiplications (both arithmetic and logical) are considered before sums and subtrac¢ions, divisions precede multiplications, and comparisons precede multiplications, divisions, sums and subtractions.
    (d) Commas precede both comparisons and operations.
    (e) Compatibly with (a), (b), (c) and (d), the expressions are considered from left to right. The expression to the left of a condition in a conditional expression, or to the right of the mark " - " or "+--", respectively, in a symbol or in a formula, is delimited according to the following:
    Expression Delimitation Rule. The expression to which a condition applies is such that its rightmost character, say B, is the first character which precedes the left bracket of the condition. Its leftmost character, say A, is the leftmost of the characters of the description of the problem which satisfies the following three requirements:
    (I) A does not follow B.
    (II) The information contained between 3, and B, inclusive, is an expression.
    (III) The character immediately preceding A is either a comma, right parenthesis, bracket or brace, or any of the marks +, - , , , / .
    An analogous rule applies to symbols.
    If a conditional expression, a symbol or a formula is enclosed by parentheses or by a set of nested parentheses, these are not considered when requirement (III) is checked.
    1.13 SINGLE-VALUED EXPRESSIONS. Some expressions are designed in such a way that they associate a single-valued function with the variables they contain,  which may be in either mode and in any structure. These expressions are called single-valued. A single-valued expression does not have to be elementary.
    For instance, the non-elementary expression (A[B], C[/B]), whose value is either A or C, depending upon the value of the boolean variable B, is single-valued.
    The property of being single-valued does not even have to be a consequence of the manner in which the expression is written but may simply be determined by the analyst's knowledge of the data. The expression (A[B],C[D]), for instance,  is not single-valued in an absolute sense, but is single-valued with respect to any procedure in which it is known in advance that this expression will never be considered when both the boolean variables B and D have the same value.
    When, for some reason, a single-valued expression cannot be calculated, its value is taken to be blank if it is ordinary and alphameric; otherwise, its value is taken to be 0.
    An expression which is used as a condition must be single-valued and boolean
    1.14 PROCEDURE DESCRIPTIONS. The "procedure description" consists of a sequence of expressions, called statements. The statements are considered sequentially.  A simple procedure description consists of a sequence of statements, each consisting of a formula followed by an elementary expression in the boolean mode, enclosed in brackets. The current value of this expression determines whether the formula is to be executed or skipped.
    Many procedures may be described in this way, which is advisable for beginners in systems analysis. Some difficulties might arise when table handling opera- tions are described, but these operations can be stated by using simple loops. However, this kind of description might be cumbersome, and in order to design complex systems efficiently it appears advisable to take full advantage of the flexible features of the Algebraic Data System Language.

          in [ACM] JACM 9(1) January 1962 view details
  • Lombardi, Lionello "Mathematical models of file processes". Atti. Sem. Mat. Fis. Univ. Modena 1962 view details
          in [ACM] JACM 9(1) January 1962 view details
  • Lombardi, Lionello "On table operating algorithms" view details
          in Popplewell, Cicely M. (Ed.) Information Processing 62, Proceedings of the 2nd IFIP Congress, Munich, Aug. 1962. North Holland Publ. Co., 1963. view details
  • Lombardi, Lionello "On the control of the data flow by means of recursive functions" pp173-186 view details
          in Symbolic Languages in Data Processing, in the Proceedings of the Symposium organized and edited by the International Computation Centre, Rome, Italy, March 26­31, 1962, Gordon and Beech Science Publishers, 1962. view details
  • Skoffnor, Ralph M. review of Lombardi 1962 (JACM) view details Abstract: In contrast to a mathematical problem, there is no means for stating the issue of a non-numerical problem as a set of equations. Rather, such a problem is stated as a particular set of steps which leads to solution. The author feels that ". . . description of non-numerical problems in terms of their issues and of the functional relations between these issues and the data is the first of the steps to be performed in order to have a deeper insight into these problems, to handle them mathematically, and to automatize their programming thoroughly."

    This paper is an effort in that direction. It contains four sections. The first section is comprised of definitions. In the second section, use is made of these definitions to provide a compact notation for specifying the results desired from table operations. The title of the third section is "Control of the Data Flow." Definitions of records and files are given which correspond to the usual concepts. Two higher order sets, bundles (composed of files) and streams (composed of bundles), are then defined and used in specifying a model of data flow coordination. In the final section, some potential extensions of work are considered; particularly, an initial demonstration of files as elements of a Boolean algebra is provided.

    Few readers will follow the author's many lines of development. It becomes clear only after the reader is well along that the last three sections represent three virtually independent lines of development. Symbol development is extensive and continues through all the sections. The language of the paper further contributes to this difficulty (example: a sentence in which the preposition "of" occurs 9 times).

    However, it should also be noted that the author is dealing with some highly complex problems, particularly in attempting to provide a generalized model of file processing in an integrated system. Other people involved with this problem will be interested in this paper.

          in ACM Computing Reviews 3(05) September-October 1962 view details
  • Keese, Will, Jr. Review of Lombardi (IFIP 62) view details Abstract:
    This paper demonstrates the use of a very terse language form intended for usage as per title. Unfortunately, the extreme shortness required of IFIP papers resulted in this terseness being carried over to the general writing of the paper. A summary perusal of the "introductory definitions" reveals half a dozen instances (one of which is multiple) where meanings depend on other words or phrases which are neither defined in the paper nor in wide standard usage. Any reading of the paper must be accompanied by use of the extensive reference list.

    The notation is an advanced logico-mathematical one. While there doubtlessly exist many problems toward which it may be well-oriented, one is left to wonder about the extent to which it fulfills that more important aspect of normal POLs: that of being well-oriented towards the people who may have the problem. However, since the total group of people who are working with textual data can hardly be thought of as enjoying any common lingual orientation, there may be much more hope for a pure type language in this field than there was in the mathematical one.



          in ACM Computing Reviews 4(03) May-June 1963 view details
  • Lombardi, Lionello "Logic of automation of system communications" J. Mach. Accounting 1963 view details
          in ACM Computing Reviews 4(03) May-June 1963 view details
  • Lombardi, Lionello "A general business-oriented language based on decision expressions" view details Abstract: The structure of a digital computer programming language which covers a wide class of business and file processing applications is presented. Such a structure, based on identifying and incorporating into a compiler the aspects common to all processes of such class, permits writing extremely compact programs, even for comparatively complex applications, in terms of tables of control expressions which express only information characteristic of the particular application. Furthermore, local changes of a process (e.g. changes affecting only one of the output files involved) can be effected by local modifications in the program (e.g. modification of only one entry of the tables). This structure also allows for inexpensive preparation of loading-speed compilers which translate the source programs into efficient machine codes. The approach adopted here departs from conventional mechanical language design philosophies. It stresses the structural analysis of the class of processes to be represented in the languages, as opposed to emphasizing formal (i.e., contents-independent) syntactical definitions. It relies exclusively on nonprocedural representation of processes as sets (tables) of relations between data and results (there are no control statements such as GO TO, etc.), instead of using procedure descriptions (which are one-to-one translations of flowcharts). Here an invariant pattern of procedure is identified as characteristic of the class of all batch file processes.
          in [ACM] CACM 7(02) (Feb 1964) includes Papers presented at a Working Conference on Mechanical Language Structures, Princeton, N. J., August 1963 view details
  • Lombardi, Lionello "Coding in storage and searching systems" In P. Garvin (Editor), Natural Language and the Computer, ch. 14; , McGraw-Hill Book Co. view details
          in [ACM] CACM 7(02) (Feb 1964) includes Papers presented at a Working Conference on Mechanical Language Structures, Princeton, N. J., August 1963 view details
  • Moore, D. P. review of Lombardi view details Abstract: This article describes a non-procedural language for de scribing a certain class of file-processing applications. This class embraces the processing of a number of sequential input files, all arranged according to the same key, to yield a number of similarly sequenced output files. The essential structure of the file-processing procedure is built into the language: records are accessed from input files according to their keys, values are assigned to output fields as functions of input variables, and records are output, or not output, depending on the values of control expressions associated with the output files. The programmer supplies data format descriptions, field-value expressions, and control expressions. These expressions may involve input field values as well as certain control variables set by the process to indicate when a new record has been read from a particular file, etc.

    This language is non-procedural in that the order of expression evaluation is not specified by the programmer. However, if the programmer is allowed to define function involving conditional expressions, then the language be comes procedural except for input/output.

    In its simplest form, the language is clear and easy to comprehend, but limited in application.

    As the language is developed in this article, additional features are added to increase the scope of applicability of the language. These features include intermediate variables, additional control parameters, and additional step in the built-in file-processing procedure. It is clear that further additions will be required; in fact, some were hinted at in the article. It will be interesting to see whether the language can be enriched sufficiently to make it a generally useful file processing tool, without allowing procedural aspects to creep in by the back door.
          in ACM Computing Reviews 7(01) January-February 1966 view details