DATAN(ID:7662/)


for DATa ANalysis

Interactive curve-fitting system

Roger Simonsen and Louise Anketell, The Boeing Company, Seattle, 1965



Related languages
DATAN => PEG   Influence

References:
  • PRICE, J. F., AND SIMONSEN, R. H. Various methods and computer routines for approximation, curve fitting and interpolation. Doc. D1-82-0151 Rev., Boeing Sci. Res. Lab., Seattle, Wash., July 1963.
  • Simonsen, R. H., and Anketell, D. L. "Mechanization of the curve fitting process. DATAN" pp. 299-304
    Abstract: A process for fitting a curve to approximate data and the problem it creates for the engineer-programmer is defined. An approach has also been defined and a system has been written for the SRU 1107 to mechanize a major portion of this process. The techniques developed to accomplish the mechanization are largely empirical, and are dependent for their information only on the actual data points.
    Extract: Introduction
    Introduction
    The engineer programmer often faces the difficult problem of extracting information from a table of approximate data. Interpolation between data points is appropriate only if the data contains no scatter. (Scatter is defined here as a lack of significance.) Curve fitting is a more consistent and useful method which compensates for scatter.
    Use of curve fitting, however, is complicated by the myriad of specialized mathematical methods available; the chosen method almost always depends on the particular data being processed. Careful data analysis must be performed; it is not at all unusual to plot the data before deciding on a curve-fitting method.
    Other elaborate steps are also necessary. Initial estimates of the function type, the necessary parameters, the amount of scatter in the data, etc., must be made, and these estimates rely heavily on intuition and judgment.
    Even after this work, one or more parameter estimates may have to be adjusted. It may turn out that the function type is wrong, or that the curve-fitting method must be modified. In short, the entire process, even for an experienced curve-fitting analyst, evolves rather heuristically for each new curve-fitting problem.
    The authors have developed a computer-oriented technique that eliminates much of the guesswork in curve fitting. Called DATAN (DATa ANalysis), it shortens the time spent in obtaining a fit. DATAN can simulate the curve-fitting process by isolating and analyzing the prominent geometric characteristics of the data and prescribing the appropriate fitting function. The system evaluates the resultant fit and analyzes the pattern of residuals, or error function, as a new set of data.
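The fit, evaluate, and re-examine-the-residuals cycle described above can be illustrated with a minimal sketch. It is not DATAN's actual procedure, which the extract does not spell out: the polynomial-only search, the RMS acceptance test against a user-supplied `scatter` estimate, and all function names are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first (normal equations)."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)


def residuals(xs, ys, coeffs):
    """Data minus fitted polynomial: the 'error function' treated as new data."""
    return [y - sum(c * x ** i for i, c in enumerate(coeffs)) for x, y in zip(xs, ys)]


def sign_changes(values):
    """Few sign changes suggest leftover structure; many suggest plain scatter."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for p, q in zip(signs, signs[1:]) if p != q)


def fit_until_adequate(xs, ys, scatter, max_degree=6):
    """Raise the degree until the RMS residual drops to the estimated scatter."""
    for degree in range(max_degree + 1):
        coeffs = polyfit(xs, ys, degree)
        res = residuals(xs, ys, coeffs)
        rms = (sum(r * r for r in res) / len(res)) ** 0.5
        if rms <= scatter:
            return degree, coeffs, res
    return None  # no polynomial up to max_degree was adequate


# Hypothetical data: a quadratic with small alternating "scatter" of +/-0.01.
xs = [0.1 * i for i in range(21)]
ys = [1 + 2 * x - 3 * x * x + (0.01 if i % 2 == 0 else -0.01)
      for i, x in enumerate(xs)]
degree, coeffs, res = fit_until_adequate(xs, ys, scatter=0.02)
line_res = residuals(xs, ys, polyfit(xs, ys, 1))
print(degree, sign_changes(line_res), sign_changes(res))
```

In this sketch the residuals of the too-simple straight-line fit change sign only a few times, which hints at remaining curvature, while the residuals of the accepted fit alternate like scatter.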
    Extract: The DATAN Approach
    The DATAN Approach
    A. Objectives. The primary objective of DATAN is function recognition, that is, to learn from a set of data its geometric properties, and to use this information to prescribe a fitting function which will have the same geometric properties.
    A secondary objective is to obtain a mathematically simple curve-fitting function which is both useful and adequate.
    B. Assumptions. It is assumed that the data is approximate, that it is a function of one variable and that the amount of scatter in the data is estimable and relatively uniform throughout the domain. DATAN also assumes that there are no discontinuities within the data range. Perhaps the most important and most often violated assumption is that the data is sufficient in quality and quantity to define its identifying geometric characteristics.
    C. Presumptions. Experience has indicated that the largest percentage of data cases can be adequately represented by polynomials. However, a polynomial provides a very poor representation if the data exhibits asymptotic behavior. Trigonometric (tangent, cotangent, secant and cosecant), exponential, hyperbolic and logarithmic data and other data with asymptotic tendencies are usually well represented by rational functions. Most sinusoidal data cases are best fit by the actual periodic relationships which they represent. In view of these observations and the objectives, three fitting function categories are established: polynomial functions, rational functions, and sinusoidal functions. These are investigated in that order.
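The ordering of the three categories can be sketched as a simple cascade that accepts the first category whose fit is within the estimated scatter. This is an illustrative assumption, not the published DATAN logic: the cubic polynomial, the first-order rational form fitted through its linearised equation, the grid of trial frequencies, and every function name are hypothetical choices.

```python
import math


def solve(a, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lstsq(columns, ys):
    """Linear least squares over the given basis columns (normal equations)."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         for i in range(k)]
    b = [sum(p * y for p, y in zip(columns[i], ys)) for i in range(k)]
    return solve(a, b)


def rms_error(f, xs, ys):
    try:
        return math.sqrt(sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs))
    except ZeroDivisionError:  # a fitted pole landed exactly on a data abscissa
        return math.inf


def fit_polynomial(xs, ys, degree=3):
    c = lstsq([[x ** i for x in xs] for i in range(degree + 1)], ys)
    return lambda x: sum(ci * x ** i for i, ci in enumerate(c))


def fit_rational(xs, ys):
    # y = (a0 + a1 x) / (1 + b1 x), linearised as y = a0 + a1 x - b1 (x y)
    cols = [[1.0] * len(xs), list(xs), [-x * y for x, y in zip(xs, ys)]]
    a0, a1, b1 = lstsq(cols, ys)
    return lambda x: (a0 + a1 * x) / (1.0 + b1 * x)


def fit_sinusoid(xs, ys, omegas):
    # For each trial frequency w, a + b sin(w x) + c cos(w x) is linear in a, b, c.
    def make(a, b, c, w):
        return lambda x: a + b * math.sin(w * x) + c * math.cos(w * x)
    fits = []
    for w in omegas:
        cols = [[1.0] * len(xs), [math.sin(w * x) for x in xs],
                [math.cos(w * x) for x in xs]]
        fits.append(make(*lstsq(cols, ys), w))
    return min(fits, key=lambda f: rms_error(f, xs, ys))


def prescribe(xs, ys, scatter):
    """Try polynomial, then rational, then sinusoidal; accept the first adequate fit."""
    candidates = [
        ("polynomial", lambda: fit_polynomial(xs, ys)),
        ("rational", lambda: fit_rational(xs, ys)),
        ("sinusoidal", lambda: fit_sinusoid(xs, ys, [0.5 * k for k in range(1, 21)])),
    ]
    for name, build in candidates:
        f = build()
        if rms_error(f, xs, ys) <= scatter:
            return name, f
    return None, None


# Hypothetical asymptotic data that a cubic represents poorly.
xs = [0.2 * i for i in range(21)]
name, f = prescribe(xs, [1.0 / (1.0 + 5.0 * x) for x in xs], scatter=0.01)
print(name)
```

Trying the simplest category first reflects the stated secondary objective of a mathematically simple fitting function; the later categories are only reached when the earlier ones fail the adequacy test.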
    Extract: Conclusions
    Conclusions
    The use of the DATAN approach by the inexperienced analyst will certainly eliminate a considerable amount of redundant and unnecessary learning effort in the field of curve fitting. The engineer-programmer usually obtains much more information about his data fitting problem; very often he has a satisfactory fitting function in a much shorter amount of time (elapsed time, computer time and man hours) than is normally required.
    The background and development work necessary for DATAN has also resulted in an increased awareness of the present scope of curve fitting, and has indicated areas where extension is both desirable and feasible.
          in [ACM] CACM 9(09) September 1966
  • Smith, Lyle B. "A Survey of Interactive Graphical Systems for Mathematics"
    Extract: PEG system
    PEG system
    At the Stanford Linear Accelerator Center, Stanford University, Stanford, California, the author has developed the PEG (On-line Data-Fitting) system. The work was begun in the fall of 1967 on an IBM System 360/75 computer using an IBM 2250 II display unit with lightpen as the interactive console, Figure 11. The interactive program runs in a separate partition of memory with high priority. By the fall of 1968 a working system was available and was used by physicists for actual data-fitting problems. By October 1968 the IBM 360/75 had been replaced by an IBM 360/91, and the PEG system was operational on that computer.
    As described in Smith (1969), the PEG system allows user selection of:
    a) fitting function: user defined function, orthogonal polynomials, spline functions, ...;
    b) data mode: data from cards, data of previous fit, residuals of previous fit, and keyboard entry; and
    c) display mode: after a fit has been computed, there are seven different display modes.

    In addition to the above, PEG allows specification of degree, initial guesses for nonlinear problems, choice of minimization method (in some cases), and correction, subset selection, selective deletion, or transformation of data values.

    All user actions are either lightpen selections or numerical entries from the keyboard. This has been accomplished by anticipating in advance all possible (at least nearly all, hopefully) desires of a user and providing for on-line selection from the list of available options.

    The PEG System was partially inspired by the DATAN System, see Simonsen and Anketell (1966). Some other references of interest in the approximation and curve-fitting areas are Conn and von Holdt (1965) and de Maine (1965). Pyle (1965) describes a system for on-line data input by question and answer which is related to the method employed by PEG to obtain input from the user. PEG in many cases asks multiple choice questions which can be answered with the lightpen.
          in [ACM] ACM Computing Surveys 2(4) Dec 1970