SIN(ID:2502/sin001)Symbolic maths packagefor Symbolic INtegration Joel Moses 1967 Project MAC, MIT Symbolic maths package Places Structures: Related languages
References: Introduction The research described in this paper had as its goal the creation of a program which solves freshman calculus word problems. The program, CARPS (CAlculus Rate Problem Solver), is restricted to rate problems. It is described in greater detail in MACTR51 (Thesis). CARPS was primarily motivated by Bobrow's work on STUDENT? a program which solves high school algebra word problems. An understanding of STUDENT is sufficiently important to our work that we shall analyze Bobrow's program in the second section of the paper. CARPS is written in two languages. The bulk of the coding is in LISP. There are, however, large sections which require a great deal of pattern matching, something in which LISP is not particularly powerful. These sections were written in CONVERT, a language especially designed for pattern matching. Since CONVERT is embedded in LISP it was an especially convenient choice because we could easily switch back and forth between the two languages. Both of these languages were available on the Project MAC PDP6 time sharing system which was used in this research. Also available were J. Moses' algebraic simplification and differentiation routines which are described in [Moses]. The PDP6 system which has a quarter million words of core storage, gave us a decided advantage over Bobrow, whose program had to fit into a 32K 7094 LISP system, whereas ours wallows in the comparative luxury of 45K of memory. Extract: STUDENT STUDENT As we mentioned earlier Bobrow's program, STUDENT, solves algebra word problems. To really understand how STUDENT works we should go through a problem and see how the program solves it. Extract: Conclusions Conclusions In section 2 we noted several aspects of STUDENT which we felt needed further work. Let us compare CARPS and STUDENT in these areas. 1) By storing information in terms of structures, CARPS is better able to recognize that two phrases describe the same object. 2) Again because of the use of structures CARPS can gather information about an object in piecemeal fashion. STUDENT was essentially required to generate one equation for each sentence in the problem description. In calculus word problems it is not uncommon to have two or three sentences providing information for one equation. 3) CARPS to a limited degree is able to use its knowledge to parse its input sentences. For example we saw how ALTITUDE was interpreted as ALTITUDE OF THE FILTER because CARPS knew that since the filter was a cone and cones have altitudes, the filter had an altitude. 4) Whereas STUDENT has only one soiutlon method (i.e., solution of linear equations), CARPS has several and can decide which is appropriate for a given problem. CARPS machinery for solving its equations (e.g., differentiation, simplification) is also more complex than STUDENT'S. 5) CARPS utilizes a more sophisticated gramtical analysis of the sentences than STUDENT. This is used both in breaking up sentences and in generating the internal structures. Up to now 14 problems have been solved by CARPS. Most were taken from standard texts, sometimes with slight modifications. (Variations in the statements of problems would, of course, increase the number of problems solvable by CARPS.) The other problems solved by CARPS are in Appendix A. Many of the improvements that we claim for CARPS were necessitated by the increased complexity of the problems that we expected it to solve. However many weaknesses of STUDENT are still present in some form in our design. 1) An important weakness in the program is due to its dependence on key words to signify the type of problem (i.e., distance or volume) and the method of solution to be used. What one would like to have in a calculus problem solver is a program which would use the information presented in the problem to figure out relationships among the elements (e.g., similar triangles) and actually propose the method of solution. The answers which could be provided by such a program are currently built into CARPS' data base for several cases, but this scheme severely limits the power of the program. 2) Another weakness of CARPS is its limited knowledge of English syntax. It would not be too difficult for CARPS to learn new syntactic rules by adding these rules to its CONVERT subroutines. Actually what would be more satisfying would be a different method of parsing the sentences into components of structures. Currently the CONVERT rules are attempted one at a time until one matches the sentence. A better approach would be an incremental left to right parse which, when finished with the sentence, would have.translated it into the internal model. 3) A very powerful calculus word problem solver will require a good deal of "cormnon sense'' knowledge. Consider this problem which we gave to CARPS (A LADDER 20.0 FT LONG LEANS AGAINST A HOUSE /. FIND THE RATE AT WHICH THE TOP OF THE LADDER IS MOVING DOWNWARD IF ITS FOOT IS 12.0 FT FROM THE HOUSE AND MOVING AT THE RATE 2.0 ET PER SEC /.) Much to our surprise CARPS was not able to solve it. A closer look at the problem shows why. The last phrase mentions that the ladder is moving at the rate 2 ft per second. CARPS has, as an internal check, the requirement that associated with each velocity must be the direction of the velocity. The point is that the problem never gave this direction. Most people, however, would assume that it was moving horizontally away from the nouse. T'ne reason of course, is that a familiarity with ladders or the law of gravity tells us that this is the most likely way for it to be moving. This is not an isolated incident. Consider the problem (A BARGE WHOSE DECK IS 10 FT BELOW THE LEVEL OF A DOCK IS BEING DRAWN IN BY MEANS OF A CABLE ATTACHED TO THE DECK AND PASSING THROUGH A RING ON THE DOCK. WHEN THE BARGE IS 24 FT FROM AND APPROACHING THE DOCK AT 3/4 FT / SEC, HOW FAST IS THE CABLE BEING PULLED IN ?) The ship is moving horizontally towards the dock. The problem does not mention this. Nor is the need for a body of "real world" information unique to calculus problems. Consider an example taken from a children's story. "Mike's costume had big ruffles. He refused to wear it." The question is "Why?". A program which will answer correctly must know that boys usually will not wear girl's clothes, and anything with ruffles on it must be for girls. It is our belief that a program which is to understand second grade stories must have at its disposal a body of facts comparable to that of an average seven year old. More generally, computer understanding of any body of literature will require a data base similar to that of a human reader. The major problems facing computer understanding of natural language are, in our opinion, collecting, storing, and utilizing such large bodies of information. in Donald E. Walker, Lewis M. Norton (Eds.): Proceedings of the 1st International Joint Conference on Artificial Intelligence IJCAI69, Washington, DC, May 1969. William Kaufmann, 1969 view details in Donald E. Walker, Lewis M. Norton (Eds.): Proceedings of the 1st International Joint Conference on Artificial Intelligence IJCAI69, Washington, DC, May 1969. William Kaufmann, 1969 view details in [ACM] CACM 14(08) August 1971 view details in [ACM] CACM 14(08) August 1971 view details The best known, purely symbolic systems are, of course, Formac and its current version PL/IFORMAC (Petrick, 1971; pp. 105114). Formac was the first widely available generalpurpose algebraic manipulation system and served for a period to define the field. Certainly, there was a time when one could have safely made the statement that the majority of all mechanical symbolic mathematical computations had been done within Formac. The practical success of these systems, in spite of their rigidity with respect to user modifications and their lack of any seminumerical facilities for rational function computations, is probably due to the overall intelligence of the facilities that were provided. Above all, they were certainly sufficient to support the dominant application area of truncated power series expansion. Current support is minimal. Extract: Symbolic systems SYMBOLIC SYSTEMS. We should mention first a sequence of three early programs for the simplification of general symbolic mathematical expressions represented as prefixnotation tree structures. The first, at M.I.T., was due to Hart, and the other two were due to Wooldridge and Korsvold at Stanford. The latter has survived in current usage as a result of its incorporation, subject to modification, into the MATHLAB, MACSYMA, and SCRATCHPAD systems. In the mid1960s there appeared two systems, Formula Algol and FAMOUS, which, while dedicated to the symbolic manipulation of mathematical expressions, presented the user with almost no builtin automatic simplification facilities. This was due, at least in the case of FAMOUS, to a conscious decision that, since the "simplicity" of an expression is surely context dependent, it should be reasonable to present the user with complete control over the simplification process. That is, the user'should be compelled to define all transformations, rather than, as with most systems, be permitted simply to switch on and off the transformations supplied by the system architects. No system of this species has ever solved the inherent efficiency problems to the extent that it could serve more than didactic purposes. Probably neither Formula Algol nor FAMOUS could be revived today. Another lost symbolic system of importance is the Symbolic Mathematical Laboratory of W. A. Martin. This system provided highquality 2D graphics on a DEC340 display and was also the first to employ a light pen for subexpression selection. In some ways, it represented a degree of interaction that has not been duplicated by any subsequent system. Nor were its innovative internal programming techniques restricted to its graphics facilities. Of particular interest is the use of hash coding for subexpression matching (Petrick, 1971; pp. 305310). in Encyclopedia of Computer Science, Ralston, Anthony, and Meek, Chester L. (eds) New York, NY Petrocelli/Charter 1976 view details A BRIEF HISTORICAL SKETCH  The development of systems for symbolic mathematical computation first became an active area of research and implementation during the decade 19611971. . . . . . . To put the decade 19611971 into perspective, let us recall that FORTRAN appeared about 1958 and ALGOL in 1960. These two languages were designed primarily for numerical mathematical computation. Then in 1960/1961 came the development of LISP, a language for list processing. LISP was a major advancement on the road to languages for symbolic computation. An operation such as symbolic differentiation which is foreign to FORTRAN and ALGOL is relatively easy in LISP. (Indeed this is one of the standard programming assignments for students first learning LISP.) As will be noted later, several computer algebra systems were written in LISP. 19611966  In 1961, James Slagle at M.I.T. wrote a LISP program called SAINT for Symbolic Automatic INTegration. This was one of the earliest applications of LISP to symbolic computation and it was the first comprehensive attempt to program a computer to behave like a freshman calculus student. The program was based on a number of heuristics for indefinite integration and it performed about as well as a good calculus student. One of the first systems for symbolic computation was FORMAC, developed by Jean Sammet, Robert Tobey, and others at IBM during the period 19621964. It was a FORTRAN preprocessor (a PL/I version appeared later) and it was designed for the manipulation of elementary functions including, of course, polynomials and rational functions. Another early system was ALPAK, a collection of FORTRANcallable subroutines written in assembly language for the manipulation of polynomials and rational functions. It was designed by William S. Brown and others at Bell Laboratories and was generally available about 1964. A language now referred to as Early ALTRAN was designed at Bell Laboratories during the period 19641966. It used ALPAK as its package of computational procedures. There were two other significant systems for symbolic computation developed during this period. George Collins at IBM and the University of Wisconsin (Madison) developed PM, a system for polynomial manipulation, an early version of which was operational in 1961 with improvements added to the system through 1966. The year 1965 marked the first appearance of MATHLAB, a LISPbased system for the manipulation of polynomials and rational functions, developed by Carl Engelman at M.I.T. It was the first interactive system designed to be used as a symbolic calculator. Included among its many firsts was the use of twodimensional output to represent its mathematical output. The work of this period culminated in the first ACM Symposium on Symbolic and Algebraic Manipulation held in March 1966 in Washington, D.C. That conference was summarized in the August 1966 issue of the Communications of the ACM. 19661971  In 1966/1967, Joel Moses at M.I.T. wrote a LISP program called SIN (for Symbolic Integrator). Unlike the earlier SAINT program, SIN was algorithmic in approach and it was also much more efficient. In 1968, Tony Hearn at Stanford University developed REDUCE, an interactive LISPbased system for physics calculations. One of its principal design goals was portability over a wide range of platforms, and as such only a limited subset of LISP was actually used. The year 1968 also marked the appearance of Engelman's MATHLAB68, an improved version of the earlier MATHLAB interactive system, and of the system known as Symbolic Mathematical Laboratory developed by William Martin at M.I.T. in 1967. The latter was a linking of several computers to do symbolic manipulation and to give good graphically formatted output on a CRT terminal. The latter part of the decade saw the development of several important general purpose systems for symbolic computation. ALTRAN evolved from the earlier ALPAK and Early ALTRAN as a language and system for the efficient manipulation of polynomials and rational functions. George Collins developed SAC1 (for Symbolic and Algebraic Calculations) as the successor of PM for the manipulation of polynomials and rational functions. CAMAL (CAMbridge Algebra system) was developed by David Barton, Steve Bourne, and John Fitch at the University of Cambridge. It was implemented in the BCPL language, and was particularly geared to computations in celestial mechanics and general relativity. REDUCE was redesigned by 1970 into REDUCE 2, a general purpose system with special facilities for use in highenergy physics calculations. It was written in an ALGOLlike dialect called RLISP, avoiding the cumbersome parenthesized notation of LISP, while at the same time retaining its original design goal of being easily portable. SCRATCHPAD was developed by J. Griesmer and Richard Jenks at IBM Research as an interactive LISPbased system which incorporated significant portions of a number of previous systems and programs into its library, such as MATHLAB68, REDUCE 2, Symbolic Mathematical Library, and SIN. Finally, the MACSYMA system first appeared about 1971. Designed by Joel Moses, William Martin, and others at M.I.T., MACSYMA was the most ambitious system of the decade. Besides the standard capabilities for algebraic manipulation, it included facilities to aid in such computations as limit calculations, symbolic integration, and the solution of equations. The decade from 1961 to 1971 concluded with the Second Symposium on Symbolic and Algebraic Manipulation held in March 1971 in Los Angeles. The proceedings of that conference constitute a remarkably comprehensive account of the state of the art of symbolic mathematical computation in 1971. 19711981  While all of the languages and systems of the sixties and seventies began as experiments, some of them were eventually put into "production use'' by scientists, engineers, and applied mathematicians outside of the original group of developers. REDUCE, because of its early emphasis on portability, became one of the most widely available systems of this decade. As a result it was instrumental in bringing computer algebra to the attention of many new users. MACSYMA continued its strong development, especially with regard to algorithm development. Indeed, many of the standard techniques (e.g. integration of elementary functions, Hensel lifting, sparse modular algorithms) in use today either came from, or were strongly influenced by, the research group at M.I.T. It was by far the most powerful of the existing computer algebra systems. SAC/ALDES by G. Collins and R. Loos was the followup to Collins' SAC1. It was a noninteractive system consisting of modules written in the ALDES (Algebraic DEScription) language, with a translator converting the results to ANSI FORTRAN. One of its most notable distinctions was in being the only major system to completely and carefully document its algorithms. A fourth general purpose system which made a significant mark in the late 1970's was muMATH. Developed by David Stoutemyer and Albert Rich at the University of Hawaii, it was written in a small subset of LISP and came with its own programming language, muSIMP. It was the first comprehensive computer algebra system which could actually run on the IBM family of PC computers. By being available on such small and widely accessible personal computers, muMATH opened up the possibility of widespread use of computer algebra systems for both research and teaching. In addition to the systems mentioned above, a number of special purpose systems also generated some interest during the 1970's. Examples of these include: SHEEP, a system for tensor component manipulation designed by Inge Frick and others at the University of Stockholm; TRIGMAN, specially designed for computation of Poisson series and written in FORTRAN by W. H. Jeffreys at University of Texas (Austin); and SCHOONSCHIP by M. Veltman of the Netherlands for computations in highenergy physics. Although the systems already mentioned have all been developed in North America and Europe, there were also a number of symbolic manipulation programs written in the U.S.S.R. One of these is ANALITIK, a system implemented in hardware by V. M. Glushkov and others at the Institute of Cybernetics, Kiev. 19811991  Due to the significant computer resource requirements of the major computer algebra systems, their widespread use remained (with the exception of muMATH) limited to researchers having access to considerable computing resources. With the introduction of microprocessorbased workstations, the possibility of relatively powerful desktop computers became a reality. The introduction of a large number of different computing environments, coupled with the often nomadic life of researchers (at least in terms of workplace locations) caused a renewed emphasis on portability for the computer algebra systems of the 1980's. More efficiency (particularly memory space efficiency) was needed in order to run on the workstations that were becoming available at this time, or equivalently, to service significant numbers of users on the timesharing environments of the day. This resulted in a movement towards the development of computer algebra systems based on newer "systems implementation'' languages such as C, which allowed developers more flexibility to control the use of computer resources. The decade also marked a growth in the commercialization of computer algebra systems. This had both positive and negative effects on the field in general. On the negative side, users not only had to pay for these systems but also they were subjected to unrealistic claims as to what constituted the state of the art of these systems. However, on the positive side, commercialization brought about a marked increase in the usability of computer algebra systems, from major advances in user interfaces to improvements to their range of functionality in such areas as graphics and document preparation. The beginning of the decade marked the origin of MAPLE. Initiated by Gaston Gonnet and Keith Geddes at the University of Waterloo, its primary motivation was to provide user accessibility to computer algebra. MAPLE was designed with a modular structure: a small compiled kernel of modest power, implemented completely in the systems implementation language C (originally B, another language in the "BCPL family'') and a large mathematical library of routines written in the userlevel MAPLE language to be interpreted by the kernel. Besides the command interpreter, the kernel also contained facilities such as integer and rational arithmetic, simple polynomial manipulation, and an efficient memory management system. The small size of the kernel allowed it to be implemented on a number of smaller platforms and allowed multiple users to access it on timesharing systems. Its large mathematical library, on the other hand, allowed it to be powerful enough to meet the mathematical requirements of researchers. Another system written in C was SMP (Symbolic Manipulation Program) by Stephen Wolfram at Caltech. It was portable over a wide range of machines and differed from existing systems by using a language interface that was rulebased. It took the point of view that the rulebased approach was the most natural language for humans to interface with a computer algebra program. This allowed it to present the user with a consistent, patterndirected language for program development. The newest of the computer algebra systems during this decade were MATHEMATICA and DERIVE. MATHEMATICA is a second system written by Stephen Wolfram (and others). It is best known as the first system to popularize an integrated environment supporting symbolics, numerics, and graphics. Indeed when MATHEMATICA first appeared in 1988, its graphical capabilities (2D and 3D plotting, including animation) far surpassed any of the graphics available on existing systems. MATHEMATICA was also one of the first systems to successfully illustrate the advantages of combining a computer algebra system with the easytouse editing features on machines designed to use graphical userinterfaces (i.e. window environments). Based on C, MATHEMATICA also comes with its own programming language which closely follows the rulebased approach of its predecessor, SMP. DERIVE, written by David Stoutemyer and Albert Rich, is the followup to the successful muMATH system for personal computers. While lacking the wide range of symbolic capabilities of some other systems, DERIVE has an impressive range of applications considering the limitations of the 16bit PC machines for which it was designed. It has a friendly user interface, with such added features as twodimensional input editing of mathematical expressions and 3D plotting facilities. It was designed to be used as an interactive system and not as a programming environment. Along with the development of newer systems, there were also a number of changes to existing computer algebra systems. REDUCE 3 appeared in 1983, this time with a number of new packages added by outside developers. MACSYMA bifurcated into two versions, DOEMACSYMA and one distributed by SYMBOLICS, a private company best known for its LISP machines. Both versions continued to develop, albeit in different directions, during this decade. AXIOM, (known originally as SCRATCHPAD II) was developed during this decade by Richard Jenks, Barry Trager, Stephen Watt and others at the IBM Thomas J. Watson Research Center. A successor to the first SCRATCHPAD language, it is the only "strongly typed'' computer algebra system. Whereas other computer algebra systems develop algorithms for a specific collection of algebraic domains (such as, say, the field of rational numbers or the domain of polynomials over the integers), AXIOM allows users to write algorithms over general fields or domains. As was the case in the previous decade, the eighties also found a number of specialized systems becoming available for general use. Probably the largest and most notable of these is the system CAYLEY, developed by John Cannon and others at the University of Sydney, Australia. CAYLEY can be thought of as a "MACSYMA for group theorists.'' It runs in large computing environments and provides a wide range of powerful commands for problems in computational group theory. An important feature of CAYLEY is a design geared to answering questions not only about individual elements of an algebraic structure, but more importantly, questions about the structure as a whole. Thus, while one could use a system such as MACSYMA or MAPLE to decide if an element in a given domain (such as a polynomial domain) has a given property (such as irreducibility), CAYLEY can be used to determine if a group structure is finite or infinite, or to list all the elements in the center of the structure (i.e. all elements which commute with all the elements of the structure). Another system developed in this decade and designed to solve problems in computational group theory is GAP (Group Algorithms and Programming) developed by J. Neubueser and others at the University of Aachen, Germany. If CAYLEY can be considered to be the "MACSYMA of group theory,'' then GAP can be viewed as the "MAPLE of group theory.'' GAP follows the general design of MAPLE in implementing a small compiled kernel (in C) and a large group theory mathematical library written in its own programming language. Examples of some other special purpose systems which appeared during this decade include FORM by J. Vermaseren, for high energy physics calculations, LiE, by A.M. Cohen for Lie Algebra calculations, MACAULAY, by Michael Stillman, a system specially built for computations in Algebraic Geometry and Commutative Algebra, and PARI by H. Cohen in France, a system oriented mainly for number theory calculations. As with most of the new systems of the eighties, these last two are also written in C for portability and efficiency. Research Information about Computer Algebra  Research in computer algebra is a relatively young discipline, and the research literature is scattered throughout various journals devoted to mathematical computation. However, its state has advanced to the point where there are two research journals primarily devoted to this subject area: the "Journal of Symbolic Computation" published by Academic Press and "Applicable Algebra in Engineering, Communication and Computing" published by SpringerVerlag. Other than these two journals, the primary source of recent research advances and trends is a number of conference proceedings. Until recently, there was a sequence of North American conferences and a sequence of European conferences. The North American conferences, primarily organized by ACM SIGSAM (the ACM Special Interest Group on Symbolic and Algebraic Manipulation), include SYMSAM '66 (Washington, D.C.), SYMSAM '71 (Los Angeles), SYMSAC '76 (Yorktown Heights), SYMSAC '81 (Snowbird), and SYMSAC '86 (Waterloo). The European conferences, organized by SAME (Symbolic and Algebraic Manipulation in Europe) and ACM SIGSAM, include the following whose proceedings have appeared in the SpringerVerlag series "Lecture Notes in Computer Science": EUROSAM '79 (Marseilles), EUROCAM '82 (Marseilles), EUROCAL '83 (London), EUROSAM '84 (Cambridge), EUROCAL '85 (Linz), and EUROCAL '87 (Leipzig). Starting in 1988, the two streams of conferences have been merged and they are now organized under the name ISSAC (International Symposium on Symbolic and Algebraic Computation), including ISSAC '88 (Rome), ISSAC '89 (Portland, Oregon), ISSAC '90 (Tokyo), ISSAC '91 (Bonn) and ISSAC '92 (Berkeley).  Professor Keith Geddes Symbolic Computation Group Department of Computer Science University of Waterloo Waterloo ON N2L 3G1 CANADA in Encyclopedia of Computer Science, Ralston, Anthony, and Meek, Chester L. (eds) New York, NY Petrocelli/Charter 1976 view details Resources
