Computing Epochs, Part II

Jun 21, 2010 11:31

To review, I identified four epochs that computer programming has gone through since its inception: the Imperative, the Structured, the Object Oriented, and the Dynamic. The Imperative epoch was characterized by being very close to the machine code itself, with very little abstraction or imposed structure. The Structured epoch added some basic abstraction in the form of loops (for, while, do), subroutines, and basic data structures. The Object Oriented epoch combined data structures with the methods that act upon them (the eponymous "objects"), raising the level of abstraction to the point where programmers were attempting to model "real world" objects and concepts instead of dealing closely with the actual circuitry. Finally, the Dynamic epoch coincided with the rise of the World Wide Web and added incredibly rapid development and deployment, plus interactivity, to largely text-based web pages delivered over a bandwidth-constrained network. I argued that the OO epoch and the Dynamic epoch are currently coexisting, more or less, in two different domains. So the question is, what comes next?

An easy answer would be to try to combine the two paradigms. Believe me, this has been tried, but the success has been mixed at best. Java was an early attempt to bridge the gap with its applets embedded into web pages, but this failed spectacularly and the fledgling language nearly collapsed until it was resurrected as a largely server-side successor to the OO epoch. But there are two things that Java brought to mainstream popularity that are important: a virtual machine infrastructure, which is beyond the scope of this blog post, and Garbage Collection. No, Java won't take your trash out to the curb. Instead, Garbage Collection (GC) was a gigantic step in programmer productivity that was, surprisingly in hindsight, very controversial in its time. In the OO paradigm, anywhere from a handful to hundreds or thousands (or more) of objects are created in each program. Before GC, when the program was done with an object, the programmer had to explicitly "destroy" it and free up the computer memory it was occupying. That meant keeping track of every single object, through every convoluted program flow, and knowing exactly when it was no longer needed. GC removes the need to do any of this. Now the computer itself keeps track of all the objects, knows when each one is finished, and destroys it for you. No muss, no fuss! The programmer is free to create objects willy-nilly and trust that the computer will manage the memory in a reasonable way. As an aside, ALL Dynamic epoch languages take Garbage Collection as a given, so the "modern" epoch could almost be called the Automatic Garbage Collection epoch, but Dynamic epoch just has a better ring to it, you know?
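
To make that contrast concrete, here's a minimal sketch in Java (the language that brought GC to the mainstream). Nothing in it is ever explicitly destroyed; every object simply becomes unreachable and the collector reclaims it whenever it sees fit.

```java
import java.util.ArrayList;
import java.util.List;

public class GcDemo {
    public static void main(String[] args) {
        // Allocate millions of short-lived objects without ever "destroying" them.
        // In a pre-GC language, each one would need an explicit free/delete call,
        // and missing even one of them would leak memory.
        for (int i = 0; i < 10_000_000; i++) {
            List<String> scratch = new ArrayList<>();
            scratch.add("object #" + i);
            // 'scratch' becomes unreachable at the end of each iteration;
            // the garbage collector reclaims its memory on its own schedule.
        }
        System.out.println("Done; no object was ever manually destroyed.");
    }
}
```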

Why do we need a new epoch? In 2001, IBM introduced the first dual-core computer processor, and within a few years dual-core chips reached consumer desktops. For the first time, an individual's desktop computer could execute two machine instructions at exactly the same time. Before that, computers could only "fake" multitasking by slicing up separate tasks and executing part of one task, then part of another, then back to the first, and so forth, really really fast. Dual core essentially doubled the theoretical throughput, but you didn't notice your new computer double in speed, did you? There's a good reason for that. For decades it was a safe assumption that each computer had only a single processing "pipeline", so programmers wrote their code with that assumption implicit in their program logic. So-called "parallel" or "concurrent" programming was incredibly hard, both because it's genuinely difficult to do correctly and because there wasn't much incentive to develop it. Now consumer-grade computers are shipping with 4 or even 8 cores, and the number will only increase, but programmers have been slow to take advantage of them.
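
Here's a tiny sketch, again in Java, of why "hard to do correctly" is not an exaggeration. The code below looks like perfectly reasonable sequential logic, but once two cores really do run it at the same time, the increments silently trample each other.

```java
public class RaceDemo {
    static int counter = 0;  // shared, unsynchronized state

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1_000_000; i++) {
                counter++;  // a read-modify-write sequence; NOT atomic
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // We "expect" 2,000,000, but on a multi-core machine the two threads'
        // increments interleave and overwrite each other, so the printed total
        // is usually less, and different on every run.
        System.out.println("counter = " + counter);
    }
}
```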

This problem is akin to the one facing programmers before Dijkstra's "Go To Statement Considered Harmful" essay showed how Structured Programming could take advantage of the increased speed of processors of that time without exponentially increasing the complexity of the programs. There are potential solutions brewing in academia, akin to the research OO languages Simula and Smalltalk before the OO epoch. And there are proposed technical solutions, akin to the Garbage Collection that helped make the Dynamic epoch possible. So a combination of human, academic, and technical solutions needs to come together. I don't know exactly who or what the winning combination will be, but I'm pretty sure it will deserve the name Declarative. Hence my tentative name for the Next Big Thing: the Declarative epoch.

Like Imperative or Dynamic, Declarative is a term already loaded with meaning in computer programming. To me, Declarative means the highest level of abstraction yet: you declare WHAT you want the computer program to do, but not exactly HOW you want it to do it. This might scare many programmers out there, but it is exactly what we've been doing through each of the previous epochs. for/while/do loops tell the computer to figure out how to jump through the machine instructions. OO classes tell the computer to figure out how to store the data and locate the methods that act on that data, without the programmer having to explicitly specify any of it. Garbage Collection is the most obvious application of this, telling the computer to figure out how to manage its own memory without getting in the programmer's way. Another Declarative example is the ubiquitous database query language SQL, where the programmer declares WHAT they want out of the database and leaves it to the database engine to figure out how to get it. In a truly Declarative language, the programmer lets the computer figure out how to parallelize its computations, perhaps with hints, much as for/while/do loops hint at how the computer should jump around its machine instructions. Such a language would also be easily deployed to a network environment, which is naturally parallel and asynchronous. There are some early forays into this: OCaml, Haskell, and Scala all show promise, but I'm not willing to declare any of them the winner yet ;-). Erlang is another, especially mature, example; its actor-based concurrency is, I believe, the most scalable and most intuitive concurrent programming paradigm. Interestingly, Microsoft is possibly the biggest driver in this direction, funding much of the research and development behind Haskell, introducing LINQ to its .NET development stack (an extremely SQL-flavored framework that I'm very intrigued by), and shipping Visual Studio 2010 with first-class support for its formerly experimental language F# (which is itself largely based on the aforementioned OCaml). Will the next computing epoch be led by Microsoft, of all organizations?
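
As a rough sketch of the idea, here's what that looks like using Java's stream API, a LINQ-flavored feature that arrived in Java well after the epochs above; take it purely as an illustrative stand-in for a "truly Declarative" language. You declare WHAT you want (the even numbers, squared, summed), and a one-word hint lets the runtime decide HOW to split the work across however many cores happen to be present.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class DeclarativeDemo {
    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1_000_000)
                .boxed()
                .collect(Collectors.toList());

        // Declarative: state WHAT you want (sum of squares of the even numbers),
        // not HOW to loop over the data.
        long sequential = numbers.stream()
                .filter(n -> n % 2 == 0)
                .mapToLong(n -> (long) n * n)
                .sum();

        // The only "hint" needed to parallelize: parallelStream() instead of stream().
        // The runtime decides how to divide the work among the available cores.
        long parallel = numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .mapToLong(n -> (long) n * n)
                .sum();

        System.out.println(sequential + " == " + parallel);  // same answer either way
    }
}
```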

Maybe not. There's one little language I've ignored that seems to both predict and transcend all of the epochs I've identified. In 1958, in between the releases of the first popular compiled languages, FORTRAN and COBOL, John McCarthy released a computer language he called LISP. When other languages were slaves to the von Neumann architecture, LISP completely eschewed any acknowledgment that it was running on a mere machine. It came out of the gate with structured programming concepts AND Garbage Collection (many programmers seem to think that Java invented garbage collection and are subsequently surprised to find that it existed in the second-oldest programming language still in popular use). LISP both heavily influenced the thinking that led to Object Oriented programming and was one of the first languages to incorporate OO concepts once they became formally defined. It has even had some success on the WWW, being used in the initial Yahoo! Store web system and the initial implementation of the popular news sharing site reddit. More recently, an up-and-coming programming language named Clojure has been built on top of the Java Virtual Machine ecosystem; it is essentially a JVM implementation of the Platonic ideal of LISP. Rich Hickey, the creator of Clojure, definitely seems to grok the current state of software development in a highly distributed, concurrent, asynchronous, parallel environment, and has added explicit support for these concepts in his implementation. I wish there were a text transcript of the presentation I linked to above, "Are We There Yet?". It's the clearest explanation of the current transition period we're in that I've ever seen, but it's a really long video to watch. Now, LISP has a reputation to live up to: always beneath the mainstream, but leading the way. Programmers keep trying to attack new problems by developing novel solutions, only to later find out that LISP could already do that. Admittedly, LISP has a highly esoteric syntax and its semantics are non-intuitive to most people at first. But it is amazingly effective at adapting to the new complexities required of future software before other, mainstream programming languages are able to absorb that capability into their more intuitive, acceptable paradigms. So again, I won't declare Clojure the winner of the next epoch, but I would definitely look to Clojure to light the way.

