Analytical and statistical autocode
Autocode for the Elliott 401, adapted for analytical and statistical work. It featured a special SCAN operator for manipulating all elements of a table at the same time.
Dr. Yates reviewed work on the analysis of surveys carried out on an ELLIOTT 401 at Rothamsted. The first large-scale survey for which the computer was used covered fertilisation practice in 25,000 fields. The analysis was tried on conventional punched card equipment, but without success because of the complicated weighting required: it was easier by hand. Computer analysis was completely successful.
At first, special programs were written for particular surveys, but programs are now being generalised. These have been developed from a basic random-sampling program, assuming a large machine with no limitations. In machines with a small drum store, the limitations of storage prevent the use of a completely comprehensive program: sub-routines may be more profitable. Development along these lines showed that one could do more on a small machine than had been expected.
The data, on cards, is a series of answers to questions. Each card is processed on input, saving storage and enabling the number of units in a survey to be unlimited. Variates for each unit are stored temporarily, and derived variates constructed if necessary (similar to regrouping in AUTOSTAT).
Mathematical functions are then constructed; tabulations may be quantitative or qualitative. There are special minor routines, for example, to obtain two-phase or group information from related cards.
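The card-at-a-time processing described above can be sketched in a modern language. The following is a minimal illustration in Python, not the 401 program itself: the field names (`acres`, `fertiliser_cwt`, `weight`), the regrouping rule, and the function names are all hypothetical. It shows why storage does not grow with the number of units: only the running totals are kept, and each card is discarded once it has been added to the table.

```python
from collections import defaultdict


def derive(card):
    """Construct a derived variate by regrouping a raw answer (hypothetical rule)."""
    size_group = "small" if card["acres"] < 50 else "large"
    return {**card, "size_group": size_group}


def tabulate(cards, class_key, value_key, weight_key):
    """Accumulate weighted totals and unit counts per class, one card at a time."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for card in cards:  # each card is processed as it is read, then dropped
        unit = derive(card)
        totals[unit[class_key]] += unit[weight_key] * unit[value_key]
        counts[unit[class_key]] += 1
    return dict(totals), dict(counts)


# An iterator stands in for the card reader: the cards are never all held at once.
cards = iter([
    {"acres": 20, "fertiliser_cwt": 3.0, "weight": 2.0},
    {"acres": 80, "fertiliser_cwt": 5.0, "weight": 1.0},
    {"acres": 30, "fertiliser_cwt": 1.0, "weight": 2.0},
])
totals, counts = tabulate(cards, "size_group", "fertiliser_cwt", "weight")
print(totals, counts)
```

A quantitative tabulation corresponds to `totals` here and a qualitative one to `counts`; both are built in the same pass over the cards.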
The program covers the basic analysis of surveys. For critical analyses of tabulations the overall generalisations may be affected by differing weightings. A further program has been developed to apply successive approximation routines for comparison.
Dr. Yates' full paper is published in The Computer Journal (Vol. 3, No. 3, October 1960).
Surveys are characterized by the collection of information on a number, usually large, of "units." The information on each unit may be of any degree of complexity. In certain opinion and market research surveys, for example, only "yes" or "no" answers to specific questions may be collected, together with simple characteristics, such as age and sex, of the person interviewed: in a farm economic survey full information on the working and economics of the farm may be sought. Certain items of information may be known beforehand and form the basis of selection of the units.
In many surveys there is a hierarchy of units. Thus, in a household survey, the household constitutes one type of unit, and each individual in the household a second type of unit. In such a survey, moreover, we may select a sample of households, and from each selected household a sample of individuals; this is known as two-stage sampling. Furthermore, additional items of information may be collected from a sub-sample only of all the units in a survey; this is known as two-phase sampling. Multi-stage and multi-phase sampling may be combined in the same survey.
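Two-stage sampling as described above can be illustrated with a short Python sketch. It assumes simple random sampling without replacement at both stages, which the text does not specify, and the household data and function name are hypothetical. Under that assumption each selected individual carries a raising weight equal to the inverse of the selection probability, (N / n) × (M / m), where N and n are the total and sampled numbers of households and M and m the total and sampled numbers of individuals in that household.

```python
import random


def two_stage_sample(households, n_first, n_second, seed=0):
    """Sample households, then individuals within each selected household.

    Returns a list of (household, individual, raising_weight) tuples.
    Assumes simple random sampling without replacement at both stages.
    """
    rng = random.Random(seed)
    first_stage = rng.sample(sorted(households), n_first)
    selected = []
    for hh in first_stage:
        members = households[hh]
        m = min(n_second, len(members))  # cannot sample more people than exist
        for person in rng.sample(members, m):
            weight = (len(households) / n_first) * (len(members) / m)
            selected.append((hh, person, weight))
    return selected


# Hypothetical population: four households of differing size.
households = {
    "H1": ["a", "b"],
    "H2": ["c"],
    "H3": ["d", "e", "f"],
    "H4": ["g", "h"],
}
sample = two_stage_sample(households, n_first=2, n_second=1)
for hh, person, weight in sample:
    print(hh, person, weight)
```

Two-phase sampling would differ in that the second selection is made from the units already sampled, in order to collect additional items from them, rather than from a lower level of the hierarchy.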
A general program for the analysis of surveys must make provision for the very varied topics of analysis that are required in different types of survey. A scheme for such a program on a large computer has been outlined in the third edition of Sampling Methods for Censuses and Surveys (Yates, 1960). At first sight it appeared that a program of this generality would not be practicable on a small computer such as the Elliott 401 (a drum machine with a store of 2,944 words) at present available to us at Rothamsted. Further investigation showed, however, that a useful general program could be written. The present paper contains a brief account of this program.
An Autocode Form of the Instructions
Douglas and Mitchell (1960) have recently described a program for the analysis of surveys of the market research type, which was written for Pegasus. In it they have adopted an autocode form of language. A somewhat similar language has been adopted in the scheme outlined in Sampling Methods for Censuses and Surveys, and could be used in the 401 program: all that would be required would be a preliminary interpretive routine which would translate the autocode instructions into the coded instructions of the 401 program. Storage space is not a major problem, since the translation has to be performed before the actual analysis is begun. The writing of such interpretive routines is, however, a lengthy and involved task, and it was decided that the more direct instruction code adopted in our program would be adequate for our needs.
in The Computer Journal 3(3) October 1960
Earlier editions of the book contained sections on the use of Hollerith punched-card equipment for census and survey work, and the author has now added a chapter on the use of electronic computers. This will be useful in drawing the attention of practical statisticians to the advantages that can be obtained from the use of computers, both in the reduction and in the critical evaluation of data. A section on the editing of data with the aid of a digital computer is of special interest. Those who have not been concerned with data reduction, whether the data arise from statistical surveys or from scientific experiments, may not realize how important this is. Any large body of data is bound to contain errors and inconsistencies and these must be taken care of by the machine, at machine speed, if a bottle-neck is not to arise. The machine must be programmed so as to subject the data to a close scrutiny, and to reject, correct, or refer for future examination, any which fail to pass the tests. In cases where some rejection of data may be necessary there is a distinct advantage in having it done by a program rather than by a human being, since the machine can be trusted to apply the rules impartially, and, if it is later suspected that bias has been introduced, an examination of the program will reveal the exact nature of the criteria that were used.
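The three outcomes named above (reject, correct, or refer for future examination) can be made concrete with a small Python sketch. The rules below are invented for illustration and are not taken from the book under review; the field names `age` and `sex` and the function `edit_record` are hypothetical. The point being illustrated is that the criteria are written down explicitly and applied identically to every record, so that they can be inspected later if bias is suspected.

```python
def edit_record(record):
    """Apply fixed editing rules; return (action, record).

    action is one of 'accept', 'correct', 'reject', 'refer'.
    All rules here are hypothetical examples.
    """
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        return "reject", record  # impossible or missing value
    sex = record.get("sex")
    if not isinstance(sex, str):
        return "refer", record  # cannot be judged automatically
    cleaned = sex.strip().upper()
    if cleaned not in {"M", "F"}:
        return "refer", record  # unrecognised code, needs a human decision
    if cleaned != sex:
        return "correct", {**record, "sex": cleaned}  # safe mechanical fix
    return "accept", record


records = [
    {"age": 34, "sex": "F"},
    {"age": 34, "sex": " f"},
    {"age": 150, "sex": "M"},
    {"age": 20, "sex": "X"},
]
piles = {"accept": [], "correct": [], "reject": [], "refer": []}
for rec in records:
    action, out = edit_record(rec)
    piles[action].append(out)
print({action: len(recs) for action, recs in piles.items()})
```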
The critical analysis of results calls for much computation. Most of the procedures used are straightforward in themselves, but the necessity of developing approximate methods which could be applied without excessive labour has in the past tended to confuse and complicate them. The coming into wide-spread use of digital computers should enable many sources of mystery to the uninitiated to be removed.
in The Computer Journal 3(4) January 1961
in The Computer Journal 4(1) April 1961
in The Computer Journal 4(4) January 1962
in The Computer Journal 5(4) January 1963
problems arising in the statistical analysis of data. It incorporates many of the features found in other algebraic autocodes, but these are generalized to permit operations on multiway tables or parts of tables as well as single variables. There are comprehensive input and output facilities and special automatic provision for the input of survey data. Data and previously compiled programs may be stored on magnetic tape and readily incorporated into further programs.
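The idea of applying an operation to a whole multiway table, or to part of one, rather than to a single variable can be sketched as follows. This is an illustration in Python only and does not reproduce the autocode's own notation, which is not given here; the representation (a dictionary keyed by tuples of class levels), the helper names `apply_all`, `margin` and `part`, and the figures are all assumptions.

```python
from collections import defaultdict


def apply_all(table, fn):
    """Apply fn to every cell of the table in one operation."""
    return {cell: fn(value) for cell, value in table.items()}


def margin(table, keep):
    """Sum the table over all classifications except those indexed in keep."""
    out = defaultdict(float)
    for cell, value in table.items():
        out[tuple(cell[i] for i in keep)] += value
    return dict(out)


def part(table, axis, level):
    """Select the part of the table where classification `axis` equals `level`."""
    return {cell: value for cell, value in table.items() if cell[axis] == level}


# Hypothetical two-way table: land use by region.
table = {
    ("arable", "north"): 10.0,
    ("arable", "south"): 30.0,
    ("grass", "north"): 20.0,
    ("grass", "south"): 40.0,
}
grand_total = sum(table.values())
percent = apply_all(table, lambda v: 100 * v / grand_total)
by_use = margin(table, keep=[0])
north = part(table, axis=1, level="north")
print(percent, by_use, north)
```

The same three helpers extend unchanged to tables with more than two classifications, since a cell key may be a tuple of any length.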
in The Computer Journal 5(4) January 1963
in R. C. Milton and J. A. Nelder (Eds.) "Statistical Computation" Academic, New York, 1969