JARGOL (ID:7202)

Proposed algorithmic language by Rees 


Experimental programming language by Jonathan Rees


Resources
  • Manifesto on JAR's Next Language
    Reasons to design a programming language:

    All current languages are terrible.
    It's fun, easy, and satisfying.
    You get to choose a silly name for it.
    Languages are games, and people like games.
    Everyone else is doing it.
    There's a chance it might be useful for something.
    By doing it once again, perhaps some people will learn something new.

    Languages are for:

    ease of writing
    ease of reading: how does a reader come to believe that the program has any particular beneficial property (useful, harmless, correct, fast, etc.)?
    ease of manipulation (of things written in the language)

    No matter what your goals are initially, demands will always expand. Always best to start by figuring out how to do abstraction.

    One of my goals is to try to reduce the sophistry, insularity, and inbreeding of the programming activity. I want the culture and terminology of programming to be closer to that of mathematics, engineering, science, and even the humanities. This means abstracting away from von Neumann computers and working toward a formal language for expressing human, logical, and scientific problems and their solutions.

    Meaning and Information
    The central ideas that I'd like to capture are meaning and information.

    Things written in the language should have meaning. Names should have meaning. A meaning may be assigned a priori (i.e. dictated by me) or may be derived by combining things that already have meaning.

    Conversely, as many useful meanings as possible should have expression in the language. One should be able to express anything that any plausible meta-program might care about. If a particular meta-program doesn't understand something, it can just ignore it, and chances are some other meta-program will be able to use it.

    I'm not sure I know what meaning is, but I'm striving for a common sense definition that blends into mathematical meaning. I am not referring to the so-called "meanings" of denotational semantics.

    Example (not a good one): What does a+b mean? A mathematician might say it means that there is a binary ACI operator (ACI = associative, commutative, and identity-possessing) and it is being applied to two things, a and b, whose meaning derives from the context of discussion - that is, a, +, and b are all pronouns, but we know that a+b = b+a (where = is itself defined).

    By information I mean Platonic bags of bits -- immutable and identity-free. Bignums and symbols in Lisp are like this, but information may also be structured (S-expression-like). Some memory might hold different information at different times, but memory has identity and is therefore not information.
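
    As a rough illustration of the distinction drawn above, the following Python sketch treats nested tuples of integers and strings as information and a small mutable class as memory. The names `digest` and `Cell`, and the use of `repr` plus SHA-256 as a content hash, are assumptions made for this example; the manifesto does not prescribe any of them.

```python
import hashlib


def digest(info):
    """Content hash of a piece of information (nested tuples of int/str).

    Assumption: repr() serves as a canonical serialization, which is
    adequate for these simple types but is not a general-purpose encoding.
    """
    return hashlib.sha256(repr(info).encode("utf-8")).hexdigest()


class Cell:
    """A memory location: it has identity and may hold different
    information at different times, so it is not itself information."""

    def __init__(self, contents):
        self.contents = contents


# Two structurally identical pieces of information, built separately.
first = (1, ("a", 2))
second = tuple([1, tuple(["a", 2])])

# Information is compared by content alone.
same_information = first == second and digest(first) == digest(second)

# Two cells holding the same information remain distinguishable,
# because default equality on objects compares identity.
cell_a, cell_b = Cell(first), Cell(second)
same_memory = cell_a == cell_b

# The cell changes over time; the tuple it used to hold does not.
cell_a.contents = (3,)
```

    Here `same_information` is true and `same_memory` is false, which is one way to read "memory has identity and is therefore not information".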

    I'm not totally happy with the term "information," since it implies that someone is being informed of something (knows something that they didn't know before), and I want a term that means uninterpreted bits. However, I haven't found anything better. "Data" and "datum" are close, but in English they mean "given" and have the semantics of an independent variable, whereas I'm looking for a term that would also include dependent variables (outputs). "Utterance" is clumsy. "Message" is almost right. "Artifact" has been proposed. "String," "text," "content," "expression," "resource," and "version" all have some good properties but none works well.

    [23 June 2002] Today I like the term "number". There is good mathematical precedent for using this word to mean an object with rich internal structure: the game/numbers described by John Horton Conway and presented in Knuth's book Surreal Numbers.

    Design concepts
    Essential features for any language I'd care to design right now:

    The surface syntax must be deeply editable (auto-indent, control-meta-T, etc. all work nicely). Cambridge Polish has this property; I don't rule out the possibility that other syntaxes might also be editable, but have never seen anything that works nearly as well.
    Information-oriented: The central data type is information, and operations such as transmitting, sharing, interning, hashing, encrypting, and interpreting are basic.
    Interpretation is normalization- or deduction-based, not evaluation-based. Specialization of an abstraction to an argument is semantically substitute + simplify. (Lambda calculus is not just some arbitrary widget; it really is about abstraction and specialization.) As in 3-Lisp, '2 is more like "2" than it is like 2.
    Must be easy to write meta-programs in the language (interpreters in particular); full virtualizability. Paul Graham points out that meta-circularity practically defines Lisp, and although I resisted this idea for a long time, I think I now agree.
    Highly web integrated, of course.
    Must avoid multithreaded hell. Locks are evil. (Look at E.)
    Should it ever become necessary to do a native implementation, as opposed to piggybacking off of Lisp/Scheme/Java/etc., write memory management first.
    etags or something like it has to work.
    Good prettyprinter.
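
    The "substitute + simplify" reading of specialization in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: expressions are nested tuples such as `("*", "x", "x")`, an abstraction is written `("lambda", name, body)`, bodies contain only binary `+` and `*` (no nested abstractions), and the function names `substitute`, `simplify`, and `specialize` are invented here.

```python
def substitute(expr, name, value):
    """Replace every occurrence of the variable `name` in `expr` by `value`."""
    if expr == name:
        return value
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *(substitute(arg, name, value) for arg in args))
    return expr  # a number or some other variable


def simplify(expr):
    """Normalize an expression: fold constants and drop identity elements."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    both_numbers = isinstance(left, int) and isinstance(right, int)
    if op == "+":
        if both_numbers:
            return left + right
        if left == 0:
            return right
        if right == 0:
            return left
    elif op == "*":
        if both_numbers:
            return left * right
        if left == 1:
            return right
        if right == 1:
            return left
    return (op, left, right)  # nothing more to simplify


def specialize(abstraction, argument):
    """Specialize an abstraction to an argument: substitute, then simplify."""
    _, name, body = abstraction
    return simplify(substitute(body, name, argument))


square = ("lambda", "x", ("*", "x", "x"))
on_number = specialize(square, 3)        # fully reduces to a number
on_symbol = specialize(square, "y")      # stays symbolic
mixed = specialize(("lambda", "x", ("+", "x", ("*", "y", 1))), 0)
```

    Note that `specialize(square, "y")` does not fail for lack of a value for `y`; it returns the simplified expression, which is the difference between normalization and evaluation that the list item points at.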

    The following are ideas that I'd like to try out.

    There is a community glossary.
    A program's operational interpretation is driven by algorithm off of declarative assertions. E.g. a program may have multiple assertions of the form "procedure x implements concept y", and they may be all true (according to local and community glossaries), but the programmer supplies the algorithm to decide which particular procedure to use.
    "Structured strings" -- an idea I started thinking about way back in the 6.821 days, and continued with a 1994 markup language. The idea is to have something that's algebraically the same as the string datatype (i.e. finite sequences of characters drawn from a finite alphabet), but which in its representation maintains a parse tree so that structural operations, such as going forward over phrases (paren-balanced substrings), are constant time. The effect is to unify string quoting and s-expression quoting, which should help make the notion of quotation more palatable both to newcomers and to mathematicians.
    Quotation can be a binary operator. Not only should the program have meaning, but so should the data. Instead of the string "the dog ate the cake", there should be the assertion 'Fred said "the dog ate the cake"' or 'A speaker of Spanish said "the dog ate the cake"'. [KMP: Should be a nonprimitive notion.]
    Monad-like theory of side effects. Think hard about total mutable memory, its contents as an object, and persistence. Think hard about the sequence of measurements taken from a sensor (input device, e.g. keyboard, clock, network interface).
    Must solve the multiple-value-return problem somehow, perhaps using "lightweight lists" (identity-free, immutable lists for you Lisp programmers out there).
    Single-argument functions a la ML, not automatic argument sequences a la Lisp/Java/etc. (Note syntax quagmire.)
    Category-theory feel, e.g. sum/product symmetry, homomorphisms on programs, etc. Surjective pairing??
    Transparent function option - examine code, user-mode compilation, algebra, differentiation...
    User-mode disciplines such as static typing. (This should fall out as a consequence of the ease-of-metaprogramming goal.)
    Translations to and from as many other languages as possible. Adapters for as many network protocols as possible.
    Incorporate a decent, simple theory of knowledge, evidence, deduction, and belief.
    Theories of transactions, energy barriers, and reversibility.
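
    To make the "structured strings" item above more concrete, here is a minimal Python sketch. It is not the design described in the manifesto: instead of a parse tree it keeps a precomputed table of matching parentheses, which is enough to show a constant-time "forward over phrase" operation. The class name `StructuredString` and the method `forward_phrase` are assumptions for this example.

```python
class StructuredString:
    """A string that also remembers where its balanced phrases end.

    Any string is accepted; parentheses without a partner are treated
    as ordinary characters, so the type still covers all strings.
    """

    def __init__(self, text):
        self.text = text
        self._close = {}  # index of "(" -> index of its matching ")"
        stack = []
        for index, char in enumerate(text):
            if char == "(":
                stack.append(index)
            elif char == ")" and stack:
                self._close[stack.pop()] = index

    def __eq__(self, other):
        # Equality depends only on the characters, as for plain strings.
        return isinstance(other, StructuredString) and self.text == other.text

    def __hash__(self):
        return hash(self.text)

    def forward_phrase(self, pos):
        """Return the index just past the phrase that starts at `pos`.

        A phrase is a paren-balanced substring if `pos` is at a matched
        "(", and a single character otherwise. This is one dictionary
        lookup, independent of the length of the phrase.
        """
        close = self._close.get(pos)
        return pos + 1 if close is None else close + 1


s = StructuredString("(a (b c)) d")
end_outer = s.forward_phrase(0)
end_inner = s.forward_phrase(3)
outer_phrase = s.text[0:end_outer]
inner_phrase = s.text[3:end_inner]
```

    The table is built once in linear time when the string is created; a real implementation would also need to keep the structure up to date under concatenation and slicing, which this sketch does not attempt.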

    The following are flakey ideas that I want to try but would probably give up on pretty quickly if they didn't work.

    The language is a text markup language.
    A version of read that is character-for-character invertible.
    Modest amount of syntactic sugar, e.g. [ ] for lists & list patterns. (I don't know about this one. Shades of MDL.)
    Incorporation of images, etc. into programs ?? Or, a way to turn an ASCII program into a whizzy conference-talk-like presentation of the program.
    Parenthesis omission according to parts of speech. A mode whereby one can write (big brown dog eats liver) instead of ((big (brown dog)) eats liver). (Rebol tried to do this but screwed up.) Don't get all bent out of shape -- I intend something very simple here, like distinguishing unary, nilary, and n-ary operators, not full NLP.
    Use of "a" and "the" as an alternative to variables. Something like this:
        (lambda ((a list)) ... (the list) ...)
        (lambda ((a list) (a second list))
          ... (the second list) ... (the list) ...)
    Lisp backquote and comma become "...[...]..." by analogy with conventions used in ordinary books and articles. This hack is supposed to support the markup language, natural language, and no-sophistry themes, but I also perhaps mean it to help seduce those with anti-Lisp prejudice. You have to admit that prefix operators for quotation and unquotation are pretty odd.
    Silly idea: "given" instead of "lambda". (given (x) (* x x)) reads as "Given x, x times x."
    The language definition should be a substandard. ("Substandard" is a trademark of Kent M. Pitman. Ask KMP what it means.)
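
    The parenthesis-omission item above can be sketched as a tiny parser driven by a table of operator arities. Everything here is an assumption made for illustration: the lexicon `ARITY`, the rule that unary operators are prefix and binary operators are infix, and the function names. Only the example phrase and its fully parenthesized form come from the text.

```python
# Hypothetical lexicon: each word is tagged with the number of operands it takes.
ARITY = {
    "big": 1, "brown": 1,   # unary prefix operators (adjective-like)
    "dog": 0, "liver": 0,   # nilary operators (noun-like)
    "eats": 2,              # binary infix operators (verb-like)
}


def parse_operand(words, pos):
    """Parse one operand starting at words[pos]; return (tree, next_pos)."""
    if pos >= len(words):
        raise ValueError("phrase ended where an operand was expected")
    word = words[pos]
    arity = ARITY[word]
    if arity == 1:
        # A unary operator applies to the whole operand that follows it.
        operand, pos = parse_operand(words, pos + 1)
        return [word, operand], pos
    if arity == 0:
        return word, pos + 1
    raise ValueError(f"expected an operand, found {word!r}")


def restore_parens(text):
    """Rebuild the nesting of a flat phrase such as "(big brown dog eats liver)"."""
    words = text.strip("()").split()
    left, pos = parse_operand(words, 0)
    if pos == len(words):
        return left
    operator = words[pos]
    if ARITY[operator] != 2:
        raise ValueError(f"expected a binary operator, found {operator!r}")
    right, pos = parse_operand(words, pos + 1)
    if pos != len(words):
        raise ValueError("unexpected words after the second operand")
    return [left, operator, right]


def show(tree):
    """Print a tree in fully parenthesized form."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(show(part) for part in tree) + ")"


restored = show(restore_parens("(big brown dog eats liver)"))
```

    With this lexicon, `restored` is `((big (brown dog)) eats liver)`, matching the example in the list; no natural-language processing is involved beyond looking up each word's arity.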

    Community Glossary
    The purpose of the language's community glossary is to encourage and abet, but not to require, the language's users to use consistent terminology across all programs.

    The community glossary will just be a set of term descriptions submitted to a central repository by people who are using the language. Submission will be easy -- perhaps an automatic part of a program build process. E.g. if a program includes a definition of a term and the definition is declared "community" then the definition will be submitted to the repository. Discussions of the merits and details of definitions will also be encouraged and archived.

    New entries should have some support -- some evidence that this definition of the term is consistent with established usage in natural language.

    [Maybe I can get Greenspun or KMP interested in this?]

    Before using a term in a program, programmers will be encouraged to consult the community glossary and link to a particular definition found there, and it will be easy for them to do so.

    Example: There might be community agreement that "sorted" is an adjective that means that the elements of a list occur in order. "Element," "list," and "order" would also have definitions in the glossary.

    Another example: One might go to the community glossary and look up "sort". There might be three or four definitions of this term, written in natural language prose, each with a date and author. Some authors might say "sort" is something you do to a list, while for others it has the sense of "kind", as in universal algebra. Each author will argue that his or her definition is good, and others will argue against them, in an archived discussion.

    Definitions may or may not be operational. For example, "halt" could be given a definition in terms of halting of Turing machines. Even terms such as "sorted list" which can be made operational would have an implementing program only as backup to a declarative definition.

    In the extreme, a programmer writes programs that simply implement meanings defined in terms that are all defined in the community glossary. The program then has a readily understood informal or formal specification.

    As in a dictionary of English, multiple definitions are acceptable, although discouraged. A use of a term should be associated with the particular definition desired. Somehow there has to be a mechanism to allow good definitions to become ascendant and others to be phased out. I don't know how to do that in a way that scales.
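
    One possible shape for such a glossary, with several definitions per term and each use linked to a particular one, is sketched below in Python. The class and method names are invented for this example, the author and date strings are placeholders rather than real entries, and the question of how good definitions become ascendant is left open, as it is in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """One natural-language definition of a term, with its provenance."""
    term: str
    author: str
    date: str
    prose: str


class Glossary:
    """A repository that keeps every submitted definition of every term."""

    def __init__(self):
        self._entries = {}  # term -> list of Definition, in submission order

    def submit(self, definition):
        """Store a definition and return a link that names it exactly."""
        senses = self._entries.setdefault(definition.term, [])
        senses.append(definition)
        return (definition.term, len(senses) - 1)

    def lookup(self, term):
        """Return all definitions on file for a term (possibly none)."""
        return list(self._entries.get(term, []))

    def resolve(self, link):
        """Follow a link back to the one definition it refers to."""
        term, index = link
        return self._entries[term][index]


glossary = Glossary()
as_verb = glossary.submit(Definition(
    "sort", "placeholder author A", "placeholder date",
    "something you do to a list"))
as_noun = glossary.submit(Definition(
    "sort", "placeholder author B", "placeholder date",
    "a kind, as in universal algebra"))

# A program that uses "sort" records which definition it means.
meaning_in_my_program = glossary.resolve(as_verb).prose
senses_on_file = len(glossary.lookup("sort"))
```

    Because a link names one definition rather than a bare term, two programs can both say "sort" and still be unambiguous about which sense they intend.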

    As suggested above, the glossary may contain programs (preferably public domain) that illustrate, exemplify, or even help to define terms (e.g. tests).

    The glossary will also propose grammatical aspects of terms, such as parts of speech.
