Comments | chard: Fix

chard

Fix

Jan 20, 2008 17:24

I recently had occasion to write a Forth interpreter in Linden Scripting Language (in Second Life). I've done it before (more than twice) and it only took me an hour or two. I've written the odd Scheme interpreter (and compiler) too. Forth and Scheme are both elegant little things, and I've always wanted to think of a way of combining them, but ( Read more... )

theory, programming, programming languages, lambda calculus, logic

Comments 10

nickbarnes January 21 2008, 09:21:34 UTC

"Every token is a function from stack to stack" reminds me vaguely of Emacs, in which every active key is bound to a function, but in text mode most of these functions are (self-insert-command N).
The Church stuff is cute but useless. Until you got onto that, I was wondering whether there's something useful in here.
There's more than one "place": operand stack/return stack/dictionary. Sometimes (in "immediate mode"), when I read a token I apply it to the operand stack (which might have the effect of pushing the value onto the stack). Sometimes (in "compilation mode") I push it somewhere else. Is it interesting to abstract this idea?

chard January 21 2008, 09:30:11 UTC

"Cute but useless" probably sums this whole thing up. The point of investigating the Church encodings is to get an understanding of what the language is at a low level. It's my way of investigating purity, and shaking out assumptions, and understanding the equivalence with the lambda calculus. I don't see why it stops you thinking there might be something useful.

The language XY has something like the three places you mention: operand stack, queue of things to be done (which incorporates the PC and return stack), and dictionary. However, I didn't like the notion of "valence", which seemed to make it more complicated than necessary.

chard January 21 2008, 11:24:23 UTC

I think the main thing that's going to make this less useful than Forth in tiny environments is that anonymous definitions and the "cat" operator will require a garbage collector. On the other hand, in Forth it's hard to manipulate structured data. Since this language has no update, all references will be to older objects, so it should be possible to implement a one-pass GC similar to that described to me by Joe Armstrong for Erlang.

The current Forth interpreter is just 180 lines of Linden Scripting Language (excluding comments, blank lines, and debugging) and occupies well under 8K of run-time memory. About 20 lines of that are inter-process communication for a registration mechanism for external libraries. Even the simplest GC would be a significant overhead, but the increase in programming power would be enormous.

chard January 21 2008, 12:31:23 UTC

The closes language to Fix is False, with a 1024 byte compiler in 68000 code, apparently.

chard January 21 2008, 12:59:40 UTC

I'd like to say a bit about in-memory representation. Again, there are two ways here: Forth-like and Scheme-like. In a Scheme-like system we have a heap containing (mostly) pairs, with a tagging scheme that lets us store other stuff. In Forth we have a dictionary which we mostly append threaded-executable code to. Each "object" in the dictionary is a list of other objects which is traversed depth-first in order to execute the code, except for some special primitives. There's a striking similarity between these two things, which is why I am particularly interested in things like Church encodings, which are precisely about making data structures out of executable raw materials.

Another way of looking at it is that each Forth word is a like a CDR coded list of other words, terminated by a return instruction. (In fact, "return" can just be a word like "R> drop R> jump".) The Fix "cat" operator builds a three-word object with the top two stack items and a "return". In this sense "cat" is almost like Scheme "cons". I think this ( ... )

gareth_rees January 21 2008, 13:32:18 UTC

I recently had occasion to write a Forth interpreter in Linden Scripting Language (in Second Life)

Can you say something about the reason you did this?

chard January 21 2008, 14:27:11 UTC

Linden Scripting Language (LSL) is the language in which you write scripts in Second Life. When a script is placed in a primitive (such as a cube) that is rezzed (exists) in the virtual world, then it is run and becomes a process. Each simulator in Second Life schedules and runs all the processes in all the primitives that are rezzed. These processes provide behaviour for Second Life objects. (However, basic physics, drawing, particle systems, hovertext, purchasing, permissions, and a bunch of other stuff are stored properties of a primitive or object, and don't need scripting.)

However, LSL is an inherently event-driven language. You write a set of "states" and each state has a set of "event handlers". In order to, say, rez an object, you have to send a request and return from your event handler, then wait for your "object_rez" event handler to be poked. This means you can't write a subroutine like "make_a_cube_containing([list of objects])". Instead, you end up programming everything as a state machine with a load of ( ... )