DOLPHIN (ID:8489/)

Context-based text filing and editing system developed at Lancaster in 1970. Effectively an advanced free-form text storage and retrieval system.

References:

The filing system

The data base which is maintained and manipulated by the DOLPHIN system is constructed from a number of 'files'. Each file is identified by a 12-character name and belongs to an individual user. Within itself, each file contains several 'subfiles', each of which is a document of some sort. Subfiles are also identified by 12-character names which are local to each file. The data base is stored on a disk, the total capacity of the present implementation being about 24 million characters.

The commands which operate on complete subfiles are called 'macro-instructions'. They provide conventional filing facilities, such as reading subfiles from cards or paper tape, listing or punching them, copying, concatenating or deleting them. The character set used for storing subfiles includes provision for upper and lower case letters, and for arbitrary compound characters produced on a Flexowriter by backspacing and overprinting.

A more unusual facility in the system is one for 'justifying' a subfile so that all of its lines are of a constant specified width. Justification is useful in a system where, as we shall explain, editing can be done on a character basis, instead of line by line.

In the filing system, all space allocation is dynamic, so that no individual user can run out of space unless the whole of the disk is occupied.

The two macro commands which make available the editing facility are EDIT and LINEDIT. Each of these instructions causes a working copy of the named subfile to be created. EDIT allows the copy to be altered at character level, and LINEDIT at the line level.

Extract: The editing commands

The 'editing commands' or 'micro-instructions' all operate on the copy of the subfile selected by the most recent EDIT or LINEDIT command.
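The relationship between files, subfiles and the working copy made by EDIT can be pictured with a small model. This is only a sketch in Python, not a description of the Lancaster implementation: the class `Store`, its methods `put` and `edit`, the name "SMITH FILE" and the rule that names may be shorter than 12 characters are all assumptions made for illustration.

```python
MAX_NAME = 12  # assumption: names may be shorter than the 12 characters quoted


class Store:
    """Toy in-memory stand-in for the DOLPHIN data base (hypothetical API)."""

    def __init__(self):
        self.files = {}  # file name -> {subfile name -> text}

    def put(self, file_name, subfile_name, text):
        """Store a subfile, creating the file on first use."""
        for name in (file_name, subfile_name):
            if not 0 < len(name) <= MAX_NAME:
                raise ValueError(f"name must be 1 to {MAX_NAME} characters: {name!r}")
        self.files.setdefault(file_name, {})[subfile_name] = text

    def edit(self, file_name, subfile_name):
        """Return a working copy as a list of characters; the stored text is untouched."""
        return list(self.files[file_name][subfile_name])


store = Store()
store.put("BLOGGS FILE", "PROGRAM ONE", "x = sine(y)")
store.put("SMITH FILE", "PROGRAM ONE", "unrelated text")  # same subfile name, different file

work = store.edit("BLOGGS FILE", "PROGRAM ONE")
work[4:8] = "sin"  # character-level change, made to the working copy only

print("".join(work))                              # the edited copy
print(store.files["BLOGGS FILE"]["PROGRAM ONE"])  # the stored original
```

Because subfile names are local to a file, the two subfiles called "PROGRAM ONE" do not clash, and the stored text changes only if the working copy is later written back.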
All editing is essentially carried out with the aid of 'pointers'. A pointer is a special variable associated with the text being edited. It acts as a kind of 'bookmark', except that it can be attached to a particular character, instead of a line or page. (In the LINEDIT mode, a pointer can only be attached to a 'newline' character.) There are 26 general pointers in the system, called 'A' to 'Z', and a special pointer called '*'. When a text is first selected for editing, * is attached to the first character, and all the other pointers are given the special value 'undefined'.

With certain exceptions, the micro-instructions fall into two groups: those which manipulate the pointers, and those which actually alter the text being edited. We shall discuss the pointer-setting instructions first.

In the following description, the symbol 'P' stands for any pointer (that is, * or one of the letters A to Z). Whenever any instruction includes two pointers which are necessarily different, we shall use the symbols 'P1' and 'P2'. The following sets of instructions are available:

3.1. This attaches the specified pointer to the beginning of the text.

3.2. This sets the pointer P1 to the same position as pointer P2.

3.3. P = P + n (where n is an integer)
(e.g. X = Y + 500 or * = * + 8)
This sets the pointer on the left n characters (or lines if the system is currently in the LINEDIT mode) past the position of the pointer on the right. The same pointer may be mentioned on both sides; in this case, the effect is to advance that pointer by the specified number of characters or lines.

3.4. P1 = (quotation) = P2
(e.g. A = "The sun set." = B)
In obeying this command, the system starts by scanning the text to locate the quotation. The scan starts at the current value of *, and terminates either at the first occurrence of the quotation, or at the end of the text if the quotation cannot be found.
If the quotation is found, then the pointers P1 and P2 are set to its beginning and end, and * is set to one character past the beginning. This command forms one of the main methods of marking part of a text according to context. The convention which relates to * ensures that a series of instructions with identical quotations will detect consecutive occurrences of the string in question, instead of finding the same one repeatedly. The instruction has two truncated forms in which P1 or P2 is omitted. These forms are commonly used in practice. If the system is in the LINEDIT mode, then the pointers P1 and P2 are set to the beginning and end of the line (or lines) which contain the quotation.

All the instructions which actually change the file being edited are variants of one basic form:

3.5. (segment) = (segment string)
Here, the (segment) on the left serves to identify a portion of the text which is to be replaced. It can take two forms:
(a) It can be an ordered pointer pair. In this case the segment specified is that between the pointers.
(b) It can be a quotation. In this case, the system calls its search procedure and selects the first string which completely matches the quotation.

The (segment string) on the right can be empty, or it can be an ordered pointer pair, or a quotation, or a sequence of these items separated by the concatenation operator '+'. The segment string specifies the text which is to replace the portion specified by the left hand segment. In all cases, the replacement text is a copy of the segments or quotations mentioned, since the actual segments or quotations could not be inserted without removing them from elsewhere.

To illustrate this mechanism, we shall give some common forms of the instruction:

(1) Replacement:
    "program" = "programme"
the next occurrence of the word 'program' is replaced by 'programme'.

(2) Deletion:
    "of Great Britain" =
the next occurrence of the sequence 'of Great Britain' is deleted.
(3) Insertion after a selected point:
    "the President" = A
    AA = "of the United States of America"
in this sequence of two instructions the pointer A is first set to the end of the next occurrence of the string 'the President'. The null segment AA (which has position but no extent) is then replaced by the string 'of the United States of America'. Insertion before a selected point can be dealt with in a similar way.

(4) Interchange of two adjacent segments (we assume that AB and BC are both defined segments):
    AC = BC + AB

(5) Interchange of two non-adjacent segments (we assume that PQ and RS are both defined segments, not adjacent, with PQ before RS):
    PS = RS + QR + PQ

The remaining micro-instructions have no common factor, and are best described individually.

3.6. REPEAT (quotation) = (quotation)
(e.g. REPEAT "dogfish" = "rock salmon")
Every occurrence of the first quotation in the text is replaced by the second.

3.7. TYPE P n (where n is an integer) or TYPE P1 P2
(e.g. TYPE * 30 or TYPE PQ)
In the first variant, it types out n characters or lines starting from the given pointer. In the second, it types out the entire segment specified by the ordered pointer pair.

3.8. SAVE
(e.g. SAVE)
This command is given at the end of editing. The original subfile is deleted from the DOLPHIN store and replaced by the edited copy. The system is switched back into the state where it accepts macro-instructions again.

3.9. LOSE
(e.g. LOSE)
This instruction, like SAVE, is given at the end of editing. The difference is that the original subfile is kept and the edited copy is destroyed. The command is useful if a serious mistake has been made, and it is required to start editing anew.

Extract: Use of the DOLPHIN system

The use of the system

The DOLPHIN system has been in general public use at Lancaster University for about one year. The main mode of use is off-line, through a locally built operating system. There are two ways in which access can be gained to the files:

1.
The DOLPHIN editor can be called in the same way as any other compiler or standard program. A typical off-line editing run might have the following appearance:

    TASK Y/PQ37 BLOGGS CORRECTION/
    COMPILER DOLPHIN
    USE BLOGGS FILE
    EDIT PROGRAM ONE
    "fa + fb" = "fa - 2*fb"
    "'procedure' G(x)" = " 'procedure' G(x); 'value' x"
    REPEAT "sine" = "sin"
    SAVE
    END

2. Any document in the DOLPHIN filing system may be inserted at any point in any input document by including a line of the form:

    *DOLPHIN/FILE NAME. SUBFILE NAME/

For example, the program edited by the run above may be used by the following job:

    TASK Y/PQ37 BLOGGS FIRST PROGRAM/
    COMPILER ALGOL
    *DOLPHIN/BLOGGS FILE. PROGRAM ONE/
    3 5 7 1 13    (Data for the program)

Until recently, DOLPHIN was used mainly in the 'batch' mode through the local operating system. This is no longer necessary, because we have made use of the operating system and DOLPHIN as components of a multi-access scheme which permits the editor to be used simultaneously from several consoles without disturbing the batch processing function.

The filing system now contains about 300 subfiles, including research programs, texts of various kinds, students' programs, standard data for student problems, and library programs.

The general opinion among the users of DOLPHIN is that it has greatly improved the facilities of the computing service. (But presumably the same would have been true of any reasonable filing and editing system.) Those members of staff at Lancaster who have used other editing systems say that, on the whole, DOLPHIN compares favourably with them. We consider that DOLPHIN is sufficiently effective in the on-line mode to merit implementation on a much larger scale when suitable advanced hardware becomes available to us.

Abstract: This paper describes a filing and editing system which was constructed at the University of Lancaster as part of a project supported by the Science Research Council.
The editor works by context rather than by line number, and the whole system is closely integrated with the local operating system, although the filing system can also be used on-line in an independent manner.

Extract: Introduction

The DOLPHIN system was devised as part of an S.R.C.-supported project concerned with the simplification of program development and editing. Originally, DOLPHIN was conceived as part of a software 'package' designed to provide a complete environment for writing, editing, testing and documenting complex programs like compilers and operating systems. It was one of the first parts to be completed, and it has now acquired a usefulness far outside its original scope.

We shall start by stating the original design requirements of the editor, since they help to understand the strengths and weak points in its present application. The requirements were:

1.1. The system had to be able to store documents of any type, including source programs, object programs, and texts which were not programs at all (such as, for example, descriptive documentation). The system had to allow for faithful storage and reproduction of any document produced on a Flexowriter, even if it included an arbitrary number of 'compound' characters made by backspacing and overprinting.

1.2. The system had to allow for easy editing of texts of all types, including the replacement of complete lines or parts of lines. It was also necessary to be able to insert, delete or interchange large segments of text.

1.3. The system had to be usable in both on-line and off-line modes.

1.4. The system had to provide for several distinct users, each one possessing a number of documents.
The continued integrity of the documents over machine failure was considered important, but since the users were expected to be the members of a close-knit group of computer scientists, no particular precautions were to be taken against security violations (such as the unauthorised reading, amendment or destruction of other people's documents).

in The Computer Journal 13(2) 1970
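As an illustration of the pointer and segment instructions in the editing commands extract (3.4 and 3.5), the following Python sketch models the character mode only. It is not the DOLPHIN implementation. The class `Editor`, the methods `find` and `replace`, the use of tuples for pointer pairs, and the rule for moving pointers after a replacement are all assumptions; the extract does not say what happens to existing pointers when the text changes. A leading space has been added to the inserted quotation so that the result reads naturally.

```python
class Editor:
    """Toy character-mode editor; pointer names map to character offsets."""

    def __init__(self, text):
        self.text = text
        self.ptr = {"*": 0}  # pointers A to Z are 'undefined' until set, so absent here

    def find(self, quotation, p1=None, p2=None):
        """P1 = "quotation" = P2: scan forward from * for the first occurrence."""
        start = self.text.find(quotation, self.ptr["*"])
        if start < 0:
            return False  # not found: pointers are left as they were (assumption)
        if p1:
            self.ptr[p1] = start
        if p2:
            self.ptr[p2] = start + len(quotation)
        self.ptr["*"] = start + 1  # so a repeated search finds the next occurrence
        return True

    def _span(self, p1, p2):
        a, b = self.ptr[p1], self.ptr[p2]
        if a > b:
            raise ValueError("pointer pair is not ordered")
        return a, b

    def replace(self, p1, p2, *parts):
        """Segment P1 P2 = part + part + ...; a part is a quotation or a pointer pair."""
        a, b = self._span(p1, p2)
        pieces = []
        for part in parts:
            if isinstance(part, str):
                pieces.append(part)
            else:
                lo, hi = self._span(*part)
                pieces.append(self.text[lo:hi])  # copied before the text changes
        new = "".join(pieces)
        self.text = self.text[:a] + new + self.text[b:]
        shift = len(new) - (b - a)
        # Assumed rule: pointers at or beyond the old segment end move with the text,
        # and pointers strictly inside the replaced segment collapse to its start.
        self.ptr = {n: (pos + shift if pos >= b else min(pos, a))
                    for n, pos in self.ptr.items()}


ed = Editor("We met the President today.")
ed.find("the President", p2="A")                          # "the President" = A
ed.replace("A", "A", " of the United States of America")  # AA = "..."
print(ed.text)

swap = Editor("one two three")
swap.ptr.update(A=0, B=4, C=8)                  # segments AB and BC, set by hand
swap.replace("A", "C", ("B", "C"), ("A", "B"))  # AC = BC + AB
print(swap.text)
```

Advancing * to one character past the start of each match is what lets a second `find` with the same quotation locate the next occurrence rather than the same one.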