SP/1 (ID:5964/sp:002)
String processor for Fortran

List-based preprocessor for FORTRAN that enables string macros after the fashion of LIMP, but specific to FORTRAN. Because of inefficiencies in compilation time, the list-based system was replaced by a stack-based system, MP/1. The string manipulation feature set was designed to be the same as that of SNOBOL, but lacked the indirect referencing system.

References:

Extract: Introduction

In general, string processing systems deal with data which is in the form of unstructured strings of characters. COMIT (Yngve, 1962), SNOBOL-3 (Farber, Griswold and Polansky, 1966) and SNOBOL4 (Griswold, Poage and Polansky, 1968) are three well-known string processing languages. Typical of the types of operation possible in these languages are matching, insertion, replacement and concatenation of strings and substrings.

With the increasing usage of computers in many different fields, the distinction between numeric and non-numeric applications is becoming less apparent, as for example in information retrieval problems. Consequently it seems desirable that a single programming system should incorporate efficient numeric and non-numeric capabilities.

The SP/1 system described here has been designed and implemented as a string processing system embedded in FORTRAN-IV. To avoid adding to the diversity of programming systems already in existence, and since SNOBOL is a well-known language whose syntax is readily adaptable to a FORTRAN environment, the operations provided in SP/1 are similar to those available in SNOBOL-3. Unlike, for example, DASH (Milner, 1967), which is a string processing extension embedded in ALGOL, SP/1 is both a syntactic and semantic extension to FORTRAN.

The string processing statements can be represented by a set of macros which are expanded into FORTRAN statements by a macro generator (Macleod and Pengelly, 1969) prior to compilation. The macros have been designed so that there is a close similarity between the syntax of the corresponding SP/1 and SNOBOL-3 statements. For example, the SNOBOL-3 statement

    REPEAT E "(" *V* ")" = V /S (REPEAT)

deletes all the pairs of left and right parentheses from a string E. The corresponding SP/1 statement is [statement omitted in this extract].

In addition, SP/1 provides a data type known as an association, which may have a range of alternative values associated with it. This data type is in some ways similar to the pattern type in SNOBOL4 and the assertion type in AXLE (Cohen and Wegstein, 1965).

A further distinctive feature of SP/1 is that strings are stored as sequences of atoms, where an atom is the smallest meaningful unit of the string. The size of an atom is determined on input as shown below, but normally an atom may be regarded as a single character symbol or as a group of consecutive alphameric characters. The latter could be the case, for example, in text processing, where one atom would be equivalent to a word of text. This approach allows the processor to operate on strings composed of text words while retaining the capability to manipulate strings of individual symbols where required. This provides faster operation with a considerable saving in storage requirements in the text processing types of applications where the smallest logical unit of information is a word of text.

Thus, essentially, there are two modes of operation, character and text, corresponding to the two types of storage. In the current version of SP/1 mixed mode operations are not allowable. The method of string storage, which involves a hash table, is described elsewhere (Macleod, 1969a).
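
As an illustration of the two storage modes described in this extract, the following Python sketch splits a string into atoms in character mode and in text mode, and stores each distinct atom only once in a table. The names `to_atoms` and `AtomTable`, the rule that blanks are dropped in text mode, and the use of a Python dict in place of the hash table are all assumptions made for illustration; the actual SP/1 storage scheme is the one described in Macleod (1969a) and may differ.

```python
import re


def to_atoms(text, mode):
    """Split text into atoms (hypothetical helper, not part of SP/1).

    "character" mode: every character is one atom.
    "text" mode: each run of alphameric characters is one atom, any other
    non-blank symbol is an atom on its own, and blanks are dropped
    (the treatment of blanks is an assumption).
    """
    if mode == "character":
        return list(text)
    if mode == "text":
        return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)
    raise ValueError(f"unknown mode: {mode!r}")


class AtomTable:
    """Stores each distinct atom once; a string becomes a list of indices."""

    def __init__(self):
        self._ids = {}     # atom -> index (a dict stands in for the hash table)
        self._atoms = []   # index -> atom

    def intern(self, atom):
        # Add the atom on first sight, then return its index.
        if atom not in self._ids:
            self._ids[atom] = len(self._atoms)
            self._atoms.append(atom)
        return self._ids[atom]

    def encode(self, text, mode):
        return [self.intern(atom) for atom in to_atoms(text, mode)]

    def decode(self, ids):
        return [self._atoms[i] for i in ids]


if __name__ == "__main__":
    table = AtomTable()
    ids = table.encode("THE CAT SAW THE DOG", "text")
    print(ids)                # [0, 1, 2, 0, 3]
    print(table.decode(ids))  # the five words, in order
```

In text mode the repeated word THE is stored once and referenced twice, which is the kind of storage saving the extract attributes to word-sized atoms; in character mode the same input yields one atom per character.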

Extract: Statements of the language

The statements described below illustrate how the formats of the macros representing the operations have been designed. The macros are processed into routine calls, as is described subsequently. The system may of course be applied directly by calls to these routines without having recourse to the corresponding macros.

String and association names are implicitly declared when they are assigned values. There is no explicit name declaration statement.

Extract: Summary

The system is in principle similar to SNOBOL-3. Two major additions are the provision of the association data type and the option of treating data as strings of either characters or text words. It is hoped that these additions will noticeably enhance the system, although their true worth can only be gauged after considerable experience.

An important SNOBOL feature which has been omitted is the indirect referencing facility. In SNOBOL, if S1 is the string "ABC" then $S1 refers to the string whose name is ABC and $($S1) refers to the string whose name is the contents of the string ABC. This allows strings to be referenced indirectly during the execution of a program.
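
The indirect referencing described above can be modelled as repeated lookup in a table that maps names to string values. The following is a minimal Python sketch; the `store` dictionary, the `indirect` helper and the contents given to ABC and XYZ are hypothetical and stand in for SNOBOL's own name table.

```python
def indirect(store, name, depth=1):
    """Apply `depth` levels of $-style indirection, starting from `name`.

    depth=1 models $S1; depth=2 models $($S1).
    """
    for _ in range(depth):
        # The contents of the current string become the next name to look up.
        name = store[name]
    return store[name]


if __name__ == "__main__":
    # Hypothetical run-time name table: string name -> string contents.
    store = {"S1": "ABC", "ABC": "XYZ", "XYZ": "HELLO"}
    print(indirect(store, "S1"))     # XYZ   (the string whose name is ABC)
    print(indirect(store, "S1", 2))  # HELLO (the string named by the contents of ABC)
```

The lookup only works because the names survive as data at run time, which is the property at stake in the omission of this facility from SP/1.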

Because SP/1 uses FORTRAN names, the symbolic content of string names is lost after compilation, and thus there is no correspondence between a string name and a string with the same symbolic content as the name. It would be possible to give strings literal names, but this would increase considerably the time required to locate a particular string; and since string names can be represented by FORTRAN array elements, a degree of indirect referencing can still be retained.

As well as providing a useful extension to FORTRAN, this system also illustrates how a suitable macroprocessor can be applied to give a considerable syntactic extension to an existing high level language.

in The Computer Journal 13(3)