for Simulated Linguistic Computer

General-purpose programming language that came out of the Georgetown MT project.

  • Hutchins, John "Machine translation: past, present, future" Chichester, Ellis Horwood, 1986. Extract:
    There remained the ‘sentence-by-sentence’ method of A.F.R.Brown. This was designed for French-English translation. In a lecture given to the Association for Computing Machinery in June 1957, Brown (1958) reported that by January of that year he had devised rules for dealing with 220 sentences in chemistry. He described his method thus: "I opened a recent French chemical journal at random, went to the beginning of the article, and set out to formulate verbal rules that would translate the first sentence. It had about forty words, and it took ten hours to work out the rules. Turning to the second sentence, I added new items to the dictionary, invented new rules, and modified existing rules until the system would handle both sentences. The third sentence was attacked in the same way, and so on up to 220." (There could be no better description of the ‘pure’ cyclic approach; cf. 4.4 and 8.2) Brown was confident that in this way "most of the major difficulties have been met and solved" for French, and that "further progress... should be very rapid." By June 1957 the program had been coded and tested on an ILLIAC computer. (However, dictionary lookup had not yet at this stage been mechanized.) In the programming for moving, substituting and rearranging elements much use was made of sub-routines which in Brown’s view were "so general as to be almost independent of what languages are concerned", a feature which he emphasised in later developments.

    Two years later, in June 1959, the system was ready for testing (at the same time as GAT). On a prepared French text of 200,000 words and a random text of 10,000 words the results were considered to be nearly as acceptable as those for GAT. Later the same month, at the Paris Unesco conference, Brown gave a demonstration of his French system; this was the first public demonstration of a MT system with an unprepared text. By this time, the method was developing definitely into a general programming system designed to provide facilities for various linguistic and MT operations under full control of the linguist, who was able to alter and expand data and rules whenever desirable. In recognition of this development, Brown’s system was renamed the Simulated Linguistic Computer (SLC).

    The computer implementation of the GAT method, the SERNA system, was largely the work of Peter Toma (1959), initially alone. Toma had joined the Georgetown project in June 1958 to work on dictionary searching and syntactic analysis in Zarechnak’s group. (Toma had worked previously at the California Institute of Technology and for the International Telemeter Corporation on the Mark I system under Gilbert King.) Toma and his colleagues obtained access to the Pentagon’s IBM 705 computer during its ‘servicing time’, and between November 1958 and June 1959 worked continuously throughout every weekend (Toma 1984). According to Toma, the test of GAT in June 1959 was run on the Pentagon computer.

    There is some controversy over the significance of Toma’s contribution to the Georgetown system. Toma claims that SERNA, acronym of the Russian ‘S Russkogo Na Angliskij’ (from Russian to English), was entirely his own creation, but Zarechnak (1979: 31-32) contends that Toma’s responsibility was limited to coordination of the programming efforts while Zarechnak had overall responsibility for the linguistic formulations. While this may be true, there is no denying that Toma’s programming skills made possible the "first significant continuous outputs for Russian to English", as Dostert readily acknowledged (in the preface to Macdonald 1963).

    On 25th January 1960 a demonstration of GAT (SERNA) was staged at the Pentagon before representatives of government agencies, rerunning some of the earlier tests of the Russian-English translations of organic chemistry. Early in 1961 the programming system for GAT was converted for use on the IBM 709. The opportunity was taken to introduce certain improvements in the efficiency and accuracy of the operations. As a result, so many alterations of the SERNA programs were necessary that in effect there was a new system; it was now called the Direct Conversion programming system, and placed under the direction of John Moyne (1962).

    Apart from Russian and French, research teams at Georgetown also examined other languages. Chinese was investigated by a team advised by John de Francis, producing in 1962 a Chinese-English MT dictionary using telegraphic code for Chinese characters, and starting work on a MT system for mathematics texts. There was some work on the comparative syntax of English and Turkish, and during 1961 some discussion about setting up a pilot project for English-Turkish translation (Macdonald 1963). Brown did a tentative study of Arabic-English MT on the SLC basis (Brown 1966). Much more substantial was the work of the Comparative Slavic Research Group set up in October 1961 under Milos Pacak. This group investigated Czech, Polish, Russian and Serbo-Croatian with the objective of establishing a common intermediary language, for use in MT systems for these languages into and from English.

    By late 1961 the SLC French-English system had been adapted for Russian-English, and it could also now be run on the IBM 709. SLC was now no longer restricted to one specific language pair but it had become a generalized programming system (Brown 1966). As a MT system for French-English translation, the SLC method remained largely the special and sole concern of Dr. Brown (Zarechnak & Brown 1961); but as a programming system it was often used to support the GAT Russian-English system. At the Teddington conference in September 1961, the demonstration of GAT was run on SLC only, since conversion of the SERNA programs to the IBM 709 was not yet complete. As a result of this demonstration, EURATOM (at Ispra, Italy) decided to install the Georgetown system using SLC programming, both for producing translations for their personnel and as a basis for further research (ch.11.1 below).

    Another demonstration of GAT was conducted in October 1962 at the Oak Ridge National Laboratory, under the auspices of the U.S. Atomic Energy Commission. This time the texts were in the field of cybernetics, using both prepared and unprepared texts. [...]

    By by one from the first practical applications of logical capabilities of machines was their utilization for the translation of texts from an one tongue on other. Linguistic differences represent the serious hindrance on a way for the development of cultural, social, political and scientific connections between nations. Automation of the process of a translation, the application of machines, with a help which possible to effect a translation without a knowledge of the corresponding foreign tongue, would be by an important step forward in the decision of this problem.

    It was admitted that the system, developed primarily for the field of organic chemistry, had problems with the new vocabulary and style of cybernetics literature, but clearly there was confidence in the Georgetown team’s ability to improve the programs and dictionaries, and the Oak Ridge authorities decided to install GAT for producing internal translations.

    In the event, the GAT Russian-English systems were installed at Ispra in 1963 and at Oak Ridge in 1964 at or just after the termination of the Georgetown project in March 1963. So came to an end the largest MT project in the United States. Some MT research on a Russian-English system continued at Georgetown under the direction of R.Ross Macdonald after 1965 (Josselson 1971), but it was to be on a much smaller scale and without CIA sponsorship. The reasons for the unexpected withdrawal of support in 1963 are unclear. Zarechnak (1979) believes the official version citing unsatisfactory quality was not completely honest, while Toma (1984) alludes to internal conflicts between linguists and programmers leading to wholesale resignations. Whatever the cause, there could be very little further experimental development of the Georgetown systems after their installation at Ispra and Oak Ridge. Indeed, they remained virtually unchanged until their replacements by Systran (ch.12.1) at Ispra in 1970 and at Oak Ridge in 1980.

  • Brown, A.F.R. "Machine translation: just a question of finding the right programming language?"
          in Hutchins, W.J. (ed.) "Early years in machine translation: memoirs and biographies of pioneers", Amsterdam: John Benjamins, 2000
  • Toma, P. "From SERNA to Systran"
          in Hutchins, W.J. (ed.) "Early years in machine translation: memoirs and biographies of pioneers", Amsterdam: John Benjamins, 2000
  • Zarechnak, M. "The early days of GAT-SLC"
          in Hutchins, W.J. (ed.) "Early years in machine translation: memoirs and biographies of pioneers", Amsterdam: John Benjamins, 2000