POGOL(ID:6653/pog002)

Advanced file manipulation system 




References:
  • Lambert, Gloria J. "Large scale file processing: POGOL" pp226-234 Abstract: INTRODUCTION
    POGOL is a data processing language which is designed to facilitate problem solutions which involve very large files. Since a programmer can easily comprehend a finite though large file, the first mental solution to a problem usually permits skipping back and forth in the file searching for pertinent information and manipulating it. Unless the file is small enough to fit into core, this approach to problem solving becomes much too cumbersome and expensive. An entire new concept of problem solution is needed to deal with extremely large files.
    An approach to this problem is to produce a linear solution. That is, since the files are too large to be stored in core, the file is read once, a transformation is performed and an output file (or files) is produced. In order to arrive at the final desired solution, a series of these transformations may be necessary. Using POGOL one may define a total problem solution using this linear approach without having to consider machine resources.
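    The linear approach described above can be illustrated in a modern language. The following is a minimal Python sketch, not POGOL: the record layout, the function names (read_records, keep_large, add_fee, run_pipeline) and the threshold are all assumptions made for illustration. Each stage reads its input once, front to back, and hands records to the next stage without building an intermediate file.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]  # hypothetical record type: one dict per file record


def read_records(lines: Iterable[str]) -> Iterator[Record]:
    """Read the input once, in order, yielding one record at a time."""
    for line in lines:
        name, amount = line.rstrip("\n").split(",")
        yield {"name": name, "amount": int(amount)}


def keep_large(records: Iterable[Record], threshold: int) -> Iterator[Record]:
    """First transformation: pass through only records above a threshold."""
    for rec in records:
        if rec["amount"] > threshold:
            yield rec


def add_fee(records: Iterable[Record]) -> Iterator[Record]:
    """Second transformation: derive a new field from each record."""
    for rec in records:
        yield {**rec, "fee": rec["amount"] // 10}


def run_pipeline(lines: Iterable[str]) -> List[Record]:
    # The stages are chained lazily, so nothing is stored between them;
    # only the final output is materialised here.
    return list(add_fee(keep_large(read_records(lines), 100)))
```

    Because the stages are generators, the same chain would work on a file object of any size; only the final list() call holds data in memory.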
    The POGOL system will put together a configuration which makes good use of the resources available, producing actual files only when necessary. Extract: PROBLEM SOLUTION INVOLVING LARGE FILES
    PROBLEM SOLUTION INVOLVING LARGE FILES
    A. HISTORICAL APPROACH
    This type of linear problem solution was implemented in the past by producing a set of generalized programs which perform certain file transformations. Then a problem solution was formulated by piecing together a series of these generalized programs and specifying the options desired for this solution and this run. Obviously, this produces solutions which can be put together by the programmer fairly quickly but which are wasteful of machine resources. In addition, if the generalized program does not produce the exact desired result, it could be awkward to make a specialized change.

    B. POGOL APPROACH
    POGOL provides a linguistic capability to produce this type of problem solution in a natural way. The basic structure of a POGOL program is a series of operations, each of which consists of one or more input files, a verb (some major data processing function), and one or more output files. These operations are connected via data paths defined by the files involved. That is, if file A is output from operation 1 and input to operation 3, a data path has been defined which connects operation 1 to operation 3.
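    The way data paths follow from file names can be sketched in Python. This is an illustration only: the Operation class, the data_paths function and the file and operation names are hypothetical and do not reflect actual POGOL syntax or compiler internals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Operation:
    """Hypothetical model of one operation: input files, a verb, output files."""
    name: str
    inputs: List[str]
    verb: str
    outputs: List[str]


def data_paths(ops: List[Operation]) -> List[Tuple[str, str, str]]:
    """Return (producer, file, consumer) for each file written by one
    operation and read by another."""
    paths = []
    for producer in ops:
        for file_name in producer.outputs:
            for consumer in ops:
                if consumer is not producer and file_name in consumer.inputs:
                    paths.append((producer.name, file_name, consumer.name))
    return paths


# File A is output from op1 and input to op3, so a path op1 -> op3 exists.
program = [
    Operation("op1", ["MASTER"], "SORT", ["A"]),
    Operation("op2", ["TRANS"], "SORT", ["B"]),
    Operation("op3", ["A", "B"], "MATCH", ["REPORT"]),
]
```

    Calling data_paths(program) yields the two paths op1 to op3 (via A) and op2 to op3 (via B); MASTER, TRANS and REPORT are external files and define no path.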
    The verb in an operation directs the basic function of the operation including its input strategy. In addition, the verb may be augmented by algorithmic coding (section III. B). Each record in the input file(s) can be filtered through some algorithmic coding before the verb gets it. Certain conditions encountered by the verb can cause exits to specified routines. When the verb has produced a record or a data object (depending on the verb's function), control is passed to the output section of the operation where additional algorithmic functions can be performed. Usually new records for the output file(s) are built in the output section, but this may actually occur anywhere in the operation. Using these components, the programmer can specify a tailored solution to his problem while letting the compiler direct the data paths necessary to handle the large files involved.
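    The three parts of an operation (input filtering, the verb, and the output section) can be pictured with a small Python sketch. Everything here is assumed for illustration: run_operation, strip_blank, sort_verb and label are hypothetical names, and the in-memory sort merely stands in for a verb.

```python
from typing import Callable, Iterable, List, Optional


def run_operation(
    records: Iterable[str],
    input_filter: Callable[[str], Optional[str]],
    verb: Callable[[Iterable[str]], Iterable[str]],
    output_section: Callable[[str], List[str]],
) -> List[str]:
    """Pass records through an input filter, then the verb, then the output section."""
    # Input coding sees each record before the verb; None means "drop it".
    filtered = (r for r in map(input_filter, records) if r is not None)
    out: List[str] = []
    # Each object the verb produces is handed to the output section,
    # which builds zero or more output records.
    for obj in verb(filtered):
        out.extend(output_section(obj))
    return out


def strip_blank(rec: str) -> Optional[str]:
    """Hypothetical input coding: trim the record and drop it if empty."""
    rec = rec.strip()
    return rec or None


def sort_verb(recs: Iterable[str]) -> Iterable[str]:
    """Stand-in for a verb; a real sort of a large file would not be in memory."""
    return sorted(recs)


def label(rec: str) -> List[str]:
    """Hypothetical output coding: build one output record per verb result."""
    return [f"{rec}:{len(rec)}"]
```

    The sketch omits the conditional exits from the verb that the text mentions; those would correspond to callbacks invoked by the verb itself.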
    When writing a POGOL program, one need not concern himself with machine resources or constraints. The programmer states his total problem solution in POGOL source. The compiler carries a parameter file which defines the machine resource limitations to be imposed on a job step. The compiler then segments the job accordingly, producing all the command language necessary to the resulting job. It also handles the intermediate files which must be produced due to segmentation. The user needs to provide only his POGOL program and the command language to describe external input and output files. If the user has existing non-POGOL programs or subroutines he wants executed, there is a facility for specifying at what point in the procedure the program should be inserted or the subroutine called. Extract: POGOL, LANGUAGE - A. VERBS
    POGOL, LANGUAGE
    A. VERBS
    The verbs are the major components of the POGOL language. The verb is the controlling factor in an operation. It dictates how the records from the input file or files will move through the operation. The algorithmic coding in an operation serves to augment the verb, allowing the programmer to tailor the verb's action to his particular problem. Each verb issues requests for data based on its function. It may need one record from one file, several records from one file, or one record from each of several files. The programmer may specify that as the verb proceeds to work on the file(s), if certain conditions arise, an exit is to be taken from the verb to perform specialized algorithmic functions.
    In most of the verbs it is possible (in fact, sometimes necessary) to specify a sequencing or control group identifier in a file. A control group is a group of records which have a logical association. There are two basic ways of indicating a control group. Each record may contain a key field which indicates where this record belongs in the file. Or the first record of a group may contain a special value or condition which indicates that this is the start of a new control group. The specification of the second condition is called a BREAK clause. This language feature is actually independent of the verb. It is stated in the verb statement only because of the timing factor involved. That is, the KEY or BREAK clause is analyzed when the record becomes available to the verb and not when it becomes available to the operation.
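    The two ways of delimiting a control group can be sketched in Python. This is an analogy, not POGOL syntax: groups_by_key and groups_by_break are hypothetical helpers, and the input is assumed to be already sequenced so that records of one group are adjacent.

```python
from itertools import groupby
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def groups_by_key(records: Iterable[T], key: Callable[[T], object]) -> Iterator[List[T]]:
    """KEY-style grouping: adjacent records with equal key form one control group."""
    for _, grp in groupby(records, key=key):
        yield list(grp)


def groups_by_break(
    records: Iterable[T], starts_group: Callable[[T], bool]
) -> Iterator[List[T]]:
    """BREAK-style grouping: a record satisfying the condition opens a new group."""
    current: List[T] = []
    for rec in records:
        if starts_group(rec) and current:
            yield current
            current = []
        current.append(rec)
    if current:
        yield current
```

    For example, keyed records ("A", 1), ("A", 2), ("B", 3) split into two groups on the first field, while the stream "H1", "d", "d", "H2", "d" splits into two groups when the break condition is "starts with H".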
    There are fifteen verbs defined in the POGOL language. These are the major data processing functions needed for the class of problem POGOL was designed to address. Each verb has been defined to be fairly flexible in itself as well as to allow interaction with algorithmic coding. The verbs work with all data types except binary.
    Following is a brief description of each verb and its basic function.
      Combine
    takes a single input file and forms all combinations of records within a control group.
      Couple
    picks up the specified data from each record in a control group, and produces the resulting concatenated string.
      Identify
    selects those records from the input file that belong to groups described in the verb statement.
      Index
    is used to identify "words" in the input record then show the word (or the reverse of the word) in context.
      Match
    processes two files which have been sequenced in control groups. One may then select matched or unmatched conditions on each of the two files.
      Merge
    processes two or more sequenced files producing a single sequenced set of records.
      Offset
    produces successive segments or groups of the origin string in either left to right or cyclic fashion stepping in equal increments.
      Pack
    reorganizes the specified data within a control group in the input file by packing the data from the individual records in the control group.
      Print
    provides a report generation facility which allows the programmer to describe the page, spacing, top and bottom margins, routines to handle page overflow, and columnar printing. Special algorithmic statements are used to build and print lines and perform the various printing functions.
      Process
    provides a verb environment whereby the programmer can algorithmically specify the desired action including the input strategy.
      Scan
    inspects the input data for occurrences of specified strings and returns hits, their position, and preceding or following data.
      Search
    essentially provides a dictionary capability. One file contains the "words" and "meanings". The second file contains values which are to be "looked-up" in the first file.
      Select
    processes the file in control groups producing records which meet the selection criteria (e.g., F, - first of multiples, LM10 - last record in a control group of at least 10 records).
      Sort
    sequences records from one or more files. The individual files may have the same or different keys and may retain their individual identity or be merged into one file on output.
      Update
    provides a file maintenance facility. The verb will delete records or change the value of fields within the specified record(s).
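    Three of the verbs above have close analogues in Python's standard library, which may help in picturing their behavior. The sketch below is an approximation under stated assumptions, not a description of the POGOL implementation: combine is assumed to form pairs, match is assumed to see at most one record per key in each file, and all function names are hypothetical.

```python
import heapq
from itertools import combinations, groupby
from typing import Callable, Iterable, Iterator, List, Tuple


def combine(records: Iterable, key: Callable) -> Iterator[Tuple]:
    """Combine-like: every pair of records within each control group."""
    for _, grp in groupby(records, key=key):
        yield from combinations(list(grp), 2)


def merge(*files: Iterable, key: Callable) -> Iterator:
    """Merge-like: several sequenced inputs become one sequenced output."""
    return heapq.merge(*files, key=key)


def match(left: Iterable, right: Iterable, key: Callable) -> Tuple[List, List, List]:
    """Match-like: walk two sequenced files in step and report matched
    pairs, records only in the left file, and records only in the right."""
    matched, left_only, right_only = [], [], []
    li, ri = iter(left), iter(right)
    l, r = next(li, None), next(ri, None)
    while l is not None and r is not None:
        if key(l) == key(r):
            matched.append((l, r))
            l, r = next(li, None), next(ri, None)
        elif key(l) < key(r):
            left_only.append(l)
            l = next(li, None)
        else:
            right_only.append(r)
            r = next(ri, None)
    # One file is exhausted; whatever remains in the other is unmatched.
    while l is not None:
        left_only.append(l)
        l = next(li, None)
    while r is not None:
        right_only.append(r)
        r = next(ri, None)
    return matched, left_only, right_only
```

    Each function makes a single forward pass over its inputs, which is the property that lets such verbs work on files too large to hold in memory.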

    Extract: POGOL, LANGUAGE - B. ALGORITHMIC STATEMENTS
    B. ALGORITHMIC STATEMENTS
    Adjust shifts the significant data in a field right or left either within itself or into another field.
    Binary performs binary operations on two fields.
    Call links to a subroutine (POGOL or non-POGOL).
    Clear resets a field(s) to the appropriate nulls.
    Close causes an input file to be closed and an end-of-file signal to be passed to the verb.
    Convert converts data from one data type to another.
    Display gives debug output on the line printer.
    Do specifies a loop which can have both parallel and rectangular indexing.
    Edit returns a count of the significant data in the data field.
    Goto branches with the option to either return here or to another label.
    If is the normal boolean expression with true and false paragraphs.
    Move transfers data from one field to another without conversion.
    On & Off control the setting of switches.
    Order sorts the data within a field.
    Output constructs a record and puts it on an output file.
    Read causes a record to be input from a parameter file.
    Reference dynamically assigns a data reference or address to a pseudonym. A pseudonym is a shorthand notation which refers to the address of some data string. It is used for either flexibility or efficiency.
    Set assigns the value of an expression to a destination.
    Drop, Erase and Return are statements the programmer can use to affect the sequence of control, especially as it concerns the flow of data through an operation.
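    The source does not define "parallel" and "rectangular" indexing for the Do statement. One plausible reading, offered here only as an assumption, is that parallel indices advance together while rectangular indices are nested. In Python the two patterns would look as follows; do_parallel and do_rectangular are hypothetical names.

```python
from itertools import product
from typing import Iterable, List, Tuple


def do_parallel(*ranges: Iterable[int]) -> List[Tuple[int, ...]]:
    """Assumed 'parallel' indexing: all indices step together."""
    return list(zip(*ranges))


def do_rectangular(*ranges: Iterable[int]) -> List[Tuple[int, ...]]:
    """Assumed 'rectangular' indexing: every combination, as in nested loops."""
    return list(product(*ranges))
```

    Under this reading, two indices over three values each give three iterations in parallel form and nine in rectangular form.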

          in [ACM SIGACT-SIGPLAN] Proceedings of the ACM Symposium on Principles of Programming Languages, Boston, October 1973. Association for Computing Machinery.