DASH(ID:7210/das005)

String patterns in ALGOL 60


String manipulation extensions to ALGOL 60

Designed to incorporate both SNOBOL string pattern mechanisms and the Wirth/Hoare record extensions


People:
Related languages
ALGOL 60 Revised => DASH   Extension of
ALGOL W => DASH   Incorporated some features of
SNOBOL4 => DASH   Incorporated some features of
DASH => SP/1   Citation

References:
  • Milner, R. "String Handling in ALGOL", pp. 321-324
    Abstract: DASH (Dynamic ALGOL String Handling) is a set of procedures designed to extend ALGOL to the expression of non-numerical or partly non-numerical algorithms for which it is normally unsuited.
    Extract: Introduction
    DASH is designed to allow a user to handle strings and
    perform efficient arithmetic in the same language. There
    are a number of languages designed to handle strings
    (Farber, Griswold and Polonsky, 1964 and 1966; Guzman
    and McIntyre 1966), but their arithmetic facilities are
    slight. On the other hand, on many computers the only
    "general purpose" languages at present implemented
    (e.g. ALGOL 60, FORTRAN) are almost purely
    arithmetic. This hampers both users and those whose
    job is computer science education.
    The procedures to be described aim to provide string
    processing of the SNOBOL type (Farber et al., 1966)
    to the extent that is reasonably possible within ALGOL.
    The description is informal, and some details are omitted
    for brevity. The reader will benefit from a knowledge
    of SNOBOL, though it is hoped an understanding can
    be gained without this knowledge.
    The superimposition of procedures of this type on
    ALGOL will be rendered easier and more natural with a
    facility such as the proposed record handling of Wirth
    and Hoare (1966).
          in The Computer Journal 10(3) 1967
  • Macleod, I. A. "SP/1 - a FORTRAN integrated string processor", pp. 255-260
    Extract: Introduction
    In general, string processing systems deal with data which
    is in the form of unstructured strings of characters.
    COMIT (Yngve, 1962), SNOBOL-3 (Farber, Griswold
    and Polonsky, 1966) and SNOBOL4 (Griswold, Poage
    and Polonsky, 1968) are three well-known string processing
    languages. Typical of the types of operation
    possible in these languages are matching, insertion,
    replacement and concatenation of strings and substrings.
    With the increasing usage of computers in many different
    fields, the distinction between numeric and non-numeric
    applications is becoming less apparent, as for example
    in information retrieval problems. Consequently it
    seems desirable that a single programming system should
    incorporate efficient numeric and non-numeric capabilities.
    The SP/1 system described here has been designed
    and implemented as a string processing system embedded
    in FORTRAN-IV.
    To avoid adding to the diversity of programming
    systems already in existence and since SNOBOL is a well-known
    language whose syntax is readily adaptable to a
    FORTRAN environment, the operations provided in
    SP/1 are similar to those available in SNOBOL-3.
    Unlike, for example DASH (Milner, 1967), which is a
    string processing extension embedded in ALGOL, SP/1
    is both a syntactic and semantic extension to FORTRAN.
    The string processing statements can be represented by
    a set of macros which are expanded into FORTRAN
    statements by a macro generator (Macleod and Pengelly,
    1969) prior to compilation. The macros have been
    designed so that there is a close similarity between the
    syntax of the corresponding SP/1 and SNOBOL-3
    statements.
    For example, the SNOBOL-3 statement
    REPEAT E "(" *V* ")" = V /S (REPEAT)
    deletes all the pairs of left and right parentheses from a
    string E. The corresponding SP/1 statement is
    In addition, SP/1 provides a data type known as an
    association which may have a range of alternative values
    associated with it. This data type is in some ways
    similar to the pattern type in SNOBOL-IV and the
    assertion type in AXLE (Cohen and Wegstein, 1965).
    A further distinctive feature of SP/1 is that strings are
    stored as sequences of atoms where an atom is the
    smallest meaningful unit of the string. The size of an
    atom is determined on input as shown below, but
    normally an atom may be regarded as a single character
    symbol or as a group of consecutive alphameric characters.
    The latter could be the case for example in text
    processing where one atom would be equivalent to a
    word of text. This approach allows the processor to
    operate on strings composed of text words while retaining
    the capability to manipulate strings of individual symbols
    where required. This provides faster operation with a
    considerable saving in storage requirements in the text
    processing types of applications where the smallest
    logical unit of information is a word of text. Thus,
    essentially, there are two modes of operation, character
    and text, corresponding to the two types of storage. In
    the current version of SP/1 mixed mode operations are
    not allowable. The method of string storage, which
    involves a hash table, is described elsewhere (Macleod,
    1969a).
          in The Computer Journal 13(3)
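
The SNOBOL-3 statement quoted in the Macleod extract, REPEAT E "(" *V* ")" = V /S (REPEAT), can be paraphrased in a modern language. The following is a minimal Python sketch of its effect, not of DASH or SP/1 themselves. It assumes that *V* matches the shortest arbitrary string and that the scan proceeds from the left, so each pass pairs the leftmost "(" with the first ")" that follows it. The names delete_paren_pairs and _PAIR are hypothetical and chosen only for illustration.

```python
import re

# Leftmost "(" followed by the shortest run of characters up to the next ")".
# This stands in for the pattern  "(" *V* ")"  under the assumption that
# *V* matches the shortest arbitrary string.
_PAIR = re.compile(r"\((.*?)\)", re.DOTALL)


def delete_paren_pairs(e: str) -> str:
    """Repeatedly replace one "(" V ")" match by V until the match fails."""
    while True:
        # One substitution per pass, like one execution of the statement.
        e, replaced = _PAIR.subn(lambda m: m.group(1), e, count=1)
        if replaced == 0:
            # Pattern failed: corresponds to not taking the /S (REPEAT) branch.
            return e


if __name__ == "__main__":
    print(delete_paren_pairs("A(B(C)D)E"))
```

Each pass removes exactly two characters, so the loop always terminates. An unmatched parenthesis, as in "A(B", never satisfies the pattern and is left in place.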