The General Inquirer(ID:5997/the006)

Special input language for Harvard information retrieval system 


Stone et al Harvard 1962

Special input language for Harvard information retrieval system, requiring the delineation of parts of speech and phrase groups.


References:
  • Stone, P. J., Bales, R. F., Namenwirth, J. Z., and Ogilvie, D. M. "The general inquirer: a computer system for content analysis and retrieval based on the sentence as a unit of information". Behav. Sci., 7, 4 (1962), 1-15.
  • Simmons, R.F., "Answering English Questions by Computer - A Survey" SDC Report SP-1536 Santa Monica, Calif.; April, 1964
  • Simmons, R. F. "Answering English questions by computer: a survey" p53-70 Abstract: Fifteen experimental English language question-answering systems which are programmed and operating are described and reviewed. The systems range from conversation machines to programs which make sentences about pictures and systems which translate from English into logical calculi. Systems are classified as list-structured data-based, graphic data-based, text-based and inferential. Principles and methods of operations are detailed and discussed.

    It is concluded that the data-base question-answerer has passed from initial research into the early developmental phase. The most difficult and important research questions for the advancement of general-purpose language processors are seen to be concerned with measuring meaning, dealing with ambiguities, translating into formal languages and searching large tree structures.
    Extract: General Inquirer
    The General Inquirer.
    A paper by P. Stone et al. (1962) at Harvard University describes a COMIT program system useful for analyzing the content of text. As a question-answerer, the General Inquirer recovers all sentences containing a given set of concepts. As in the Householder-Thorne ALA, a thesaurus is used for coding words as to concept membership and, if desired, an intermediate language may be used which makes explicit the syntax of the text and question. However, the General Inquirer differs from most of the systems so far described in that any syntactic manipulations are done as a manual pre-editing phase for the text and the questions.
    Probably the most interesting feature about these programs is the dictionary and thesaurus operation. The thesaurus is built especially for the content to be studied. For example, a thesaurus for psychological studies includes headings such as Person, Behavioral Process, Qualities, etc. As subheadings under Behavioral Process, such cluster tags as the following are found: react, see, hear, smell, defend, dream, escape, etc. For an anthropological study the thesaurus would contain many different headings.
    The dictionary includes about 3000 common English words and words of special interest to any particular investigation. The dictionary lookup is accomplished by first filtering out function words such as and, or, to, of, etc., then looking up the remaining portion of text (about 50 percent) in the complete dictionary. Each of the words in the main dictionary is defined by the thesaurus tags or clusters to which it belongs. Thus the dictionary entry for "abandon" has the following format:
    ABANDON = GO+REJECT+END+DANGER+ALONE
    In the processing phase each word in the text to be examined may be tagged by its cluster memberships and matched against the terms of the request.
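    The lookup and tagging steps described above can be sketched in modern terms. The following is a minimal Python illustration, not the original COMIT program: only the ABANDON entry is taken from the text, while the function-word list, the FATHER entry, and the names parse_entry and tag_text are assumptions made for the example.

```python
# Illustrative sketch only; the original system was a COMIT program.
# Hypothetical function-word list (the text names only and, or, to, of).
FUNCTION_WORDS = {"and", "or", "to", "of", "the", "a"}


def parse_entry(line):
    """Parse a dictionary line of the form 'WORD = TAG+TAG+...'."""
    word, tags = line.split("=")
    return word.strip().lower(), [t.strip() for t in tags.split("+")]


DICTIONARY = dict([
    parse_entry("ABANDON = GO+REJECT+END+DANGER+ALONE"),  # entry from the text
    parse_entry("FATHER = PERSON"),                        # hypothetical entry
])


def tag_text(text, dictionary=DICTIONARY, function_words=FUNCTION_WORDS):
    """Drop function words, then pair each remaining word with its tags."""
    tagged = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'")
        if not word or word in function_words:
            continue  # function words are filtered out before lookup
        tagged.append((word, dictionary.get(word, [])))
    return tagged
```

    Words absent from the dictionary receive an empty tag list in this sketch; how the original system treated such words is not stated in the extract.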
    Content analyses may be as simple as frequency counts of tag-concepts in a discourse or they may be requests for all sentences in the discourse with the tags "reject" and "person." A great deal of useful analysis can be accomplished using just the semantic portion of the General Inquirer. However, to avoid apparent matches which are structurally dissimilar (as in the overworked example, man bites dog vs. dog bites man) a syntactic analysis is often desirable.
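    The two purely semantic requests mentioned above, tag frequency counts and retrieval of sentences carrying a given set of tags, can be sketched as follows. This is a hedged Python illustration under assumed data structures (a sentence as a list of word and tag-set pairs); the sample sentences and the function names are hypothetical and are not taken from the original system.

```python
from collections import Counter

# Hypothetical pre-tagged discourse: one list of (word, tags) pairs per sentence.
SENTENCES = [
    [("father", {"PERSON"}),
     ("abandon", {"GO", "REJECT", "END", "DANGER", "ALONE"})],
    [("dream", {"DREAM"})],
]


def tag_frequencies(sentences):
    """Count how often each tag occurs across the whole discourse."""
    counts = Counter()
    for sentence in sentences:
        for _word, tags in sentence:
            counts.update(tags)
    return counts


def retrieve(sentences, required_tags):
    """Return the sentences whose words together carry every required tag."""
    required = set(required_tags)
    hits = []
    for sentence in sentences:
        present = set()
        for _word, tags in sentence:
            present |= tags
        if required <= present:
            hits.append(sentence)
    return hits
```

    Because a sentence matches whenever the union of its tags covers the request, word order plays no part in this sketch, so "man bites dog" and "dog bites man" would be retrieved by the same request.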
    The following interesting semantic-syntactic categories are used in the manual pre-editing phase.
    1. Subject and incorporated modifiers
    2. Nonincorporated subject modifiers
    3. Predicate verbs
    4. Verb modifiers including time referents
    5. Object and incorporated modifiers
    6. Nonincorporated object modifiers
    7. Indirect object and modifiers
    8. Attributive nouns
    9. Attributive verbs
    The following example (from a suicide note) will illustrate how these codes are actually used.
    IN THE LAST/4 WEEK/4 A NUMBER/1
    OF OCCURRENCES/1 HAVE FORCED/3 ME/5
    INTO A POSITION/4 + WHERE I/5 FEEL/9
    MY LIFE/1 IS NOT WORTH/3 CONTINUING/5.
    The "+" separates two clauses which were coded as separate "thought sequences." For the case of pronouns or ellipses the referent words are added in parentheses. It is to be noticed that the grammar distinguishes between metaphrases such as "I feel that" and declarative statements about apparent facts.
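    The coded notation can be read mechanically. The following minimal Python sketch, which is an illustration and not part of the original system, splits a pre-edited text into thought sequences at "+" and attaches the numeric category code to each marked word. The names CODES, parse_coded and words_with_code are assumptions, the sample is the first three lines of the example above, and treating unmarked words as carrying no code is likewise an assumption.

```python
# The nine pre-editing categories listed in the text.
CODES = {
    1: "subject and incorporated modifiers",
    2: "nonincorporated subject modifiers",
    3: "predicate verbs",
    4: "verb modifiers including time referents",
    5: "object and incorporated modifiers",
    6: "nonincorporated object modifiers",
    7: "indirect object and modifiers",
    8: "attributive nouns",
    9: "attributive verbs",
}

EXAMPLE = (
    "IN THE LAST/4 WEEK/4 A NUMBER/1 "
    "OF OCCURRENCES/1 HAVE FORCED/3 ME/5 "
    "INTO A POSITION/4 + WHERE I/5 FEEL/9"
)


def parse_coded(text):
    """Split on '+' into thought sequences of (word, code-or-None) pairs."""
    sequences = []
    for clause in text.split("+"):
        tokens = []
        for token in clause.split():
            word, _, code = token.partition("/")
            # Words without a '/n' suffix are kept with code None (assumption).
            tokens.append((word, int(code) if code.isdigit() else None))
        sequences.append(tokens)
    return sequences


def words_with_code(sequence, code):
    """Return the words in one thought sequence marked with the given code."""
    return [word for word, c in sequence if c == code]
```

    For instance, words_with_code(parse_coded(EXAMPLE)[0], 1) collects the words coded as subject in the first thought sequence, which is the kind of structural information that lets a request tell a subject from an object.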
    Like the Householder-Thorne ALA, the General Inquirer finds that only a limited syntactic analysis is sufficient for its purpose.

          in [ACM] CACM 8(01) Jan 1965
  • Stone, Philip J., Dunphy, Dexter, Smith, Marshall, Ogilvie, Daniel, "The General Inquirer: A Computer Approach to Content Analysis" MIT 1966
  • Stone, Philip J.; Kirsch, John; "User's Manual for the General Inquirer" The M.I.T. Press, Cambridge Mass. USA 1968