So
Prabhakar estimates that a reasonable compiler written in Scheme will have about 15000 lines of code.
pcolijn_feed's and my current count: slightly under 7000. Perhaps that's why we're not generating very much code yet.
Actually it makes me feel a little relieved because I was wondering if our code count was high given the expressiveness of Scheme.
But you are right about the 80 columns. If you have to break up argument lists so that there's one argument per line, that's the right thing to do, so maybe LOC isn't the right measure; line breaks have nothing to do with expressiveness. What makes Scheme code inexpressive is not coding idiomatically: not taking advantage of higher-order functions (built-in and ones you write, e.g. tree walkers), and doing things several times in slightly different ways instead of abstracting out the commonality.
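To make the tree-walker point concrete, here is a minimal sketch. It assumes a hypothetical AST made of nested lists whose leaves are symbols or numbers (probably not the representation your compiler uses), and `tree-fold`, `count-leaves` and `collect-symbols` are names invented for the illustration:

```scheme
;; Hypothetical AST: nested lists; a leaf is anything that is not a pair.
;; tree-fold visits every leaf left to right, threading an accumulator.
(define (tree-fold leaf-fn acc tree)
  (cond ((null? tree) acc)
        ((pair? tree)
         (tree-fold leaf-fn
                    (tree-fold leaf-fn acc (car tree))
                    (cdr tree)))
        (else (leaf-fn tree acc))))

;; Two different traversals, each a few lines on top of the one walker.
(define (count-leaves tree)
  (tree-fold (lambda (leaf n) (+ n 1)) 0 tree))

(define (collect-symbols tree)
  (reverse
   (tree-fold (lambda (leaf acc)
                (if (symbol? leaf) (cons leaf acc) acc))
              '()
              tree)))
```

With this, `(count-leaves '(+ (* a 2) (f b 3)))` gives 7 and `(collect-symbols '(+ (* a 2) (f b 3)))` gives `(+ * a f b)`; the recursion is written once and each pass only supplies what differs.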
I don't entirely understand the rules for 444 -- you have to write your own tools, but how far does that extend? Can you use the Scheme SRFIs? PLT Scheme's "parser-tools" collection is clearly out; that's a port of lex/bison. But what about their extensions -- structures, objects, regexps, ML-style pattern matching? (That last one alone seems to be able to cut code size in half at least, by not requiring laborious constructing/deconstructing of data structures; using it, one can do insertion into a red-black (balanced) BST in about twenty lines of code.)
Reply
We can use the SRFIs and all the extensions. I'm not sure if we're coding idiomatically since I'm not sure what the Scheme idioms are, and whether we have good abstractions depends on whether the code is written at 2pm or 4am :)
Reply
I'm not a Scheme expert, though I've probably done more coding in it than you have (that is, more diverse coding -- all of what I've written combined may not total 7000 lines). Your code isn't as idiomatic as mine would be (and mine is probably not as idiomatic as a Scheme expert's would be). You don't seem to be using higher-order functions (map, for-each, filter, etc.) as much as I would expect, though it's variable; they show up in some places but not others (is one of you more comfortable with them than the other?). Leaning on SRFI 1 more would probably have helped. There are definitely places where more abstraction would help, for instance in the "dependencies" in parser.scm or in the large number of table-set!'s in analyser.scm (in fact there are more set!'s than I would like overall, though admittedly it is hard when you are being taught the algorithms in imperative pseudocode). You're not using symbols very much -- why give tokens numerical values? And there's no internal documentation at all, except apparently in useful.scm. You're leaning on string-append a lot; SRFI 28 (format) or fprintf to string ports (neither of which you have in Scheme 48, apparently) would have helped (this is going to bite you hard in the code generation you haven't done yet). Some of your functions go on for several screenfuls, which is definitely unidiomatic.
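As a rough illustration of the symbols and higher-order-function points, here is a sketch with made-up names (`make-token`, `classify`, `token-kinds`, and the keyword list are all hypothetical, not taken from your code). Tokens carry a symbol for their kind instead of a number, and the keyword list is data that gets walked rather than a series of individual table updates:

```scheme
;; A token is a (kind . text) pair; kind is a symbol, so it is
;; readable when printed and needs no number-to-name lookup.
(define (make-token kind text) (cons kind text))
(define (token-kind tok) (car tok))
(define (token-text tok) (cdr tok))

;; Hypothetical keyword set, kept as plain data.
(define keywords '("if" "else" "while" "return"))

;; Keywords become a token whose kind is the keyword itself;
;; everything else is classified as an identifier.
(define (classify text)
  (if (member text keywords)
      (make-token (string->symbol text) text)
      (make-token 'identifier text)))

;; map does the iteration; no index variable or set! needed.
(define (token-kinds texts)
  (map (lambda (t) (token-kind (classify t))) texts))
```

Here `(token-kinds '("if" "x" "return"))` gives `(if identifier return)`.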
Okay, that's me being critical, which is my default mode. On the positive side, there doesn't appear to be a lot of code bloat here; you're not writing Java or C++ in Scheme or anything like that; you're using records and hash tables intelligently; and you have a sensible organization. It looks pretty good.
Reply
I don't think either of us really knew a lot about the advantages/disadvantages of the various Scheme variants when we started, and scsh worked so we stuck with it.
You mention that string-appends will be problematic; are they O(n^2) in Scheme 48 or something? Would it therefore be better to join a list of strings in lieu of a big string-append?
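For reference, joining a list of strings would look something like the sketch below. `join-strings` is a hypothetical helper name, and nothing here says whether this is actually faster than repeated string-append in Scheme 48; it only shows the shape of the alternative, which is to collect the pieces in a list and concatenate once at the end:

```scheme
;; Hypothetical helper: concatenate strs, placing sep between elements.
;; The pieces are gathered in a list and string-append is applied once.
(define (join-strings strs sep)
  (if (null? strs)
      ""
      (apply string-append
             (car strs)
             (let loop ((rest (cdr strs)) (acc '()))
               (if (null? rest)
                   (reverse acc)
                   ;; acc is built backwards, so push the element
                   ;; before its separator; reverse fixes the order.
                   (loop (cdr rest)
                         (cons (car rest) (cons sep acc))))))))
```

For example, `(join-strings '("a" "b" "c") ",")` gives `"a,b,c"`.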
Reply
I don't understand why printing values into a format string makes code more legible than using string-append to glue a bunch of strings together. For example, if we do what we're trying to do in generator/bindings.scm with a format string, we'll have a huge string that's difficult to linebreak.
Reply
I was referring to the first use of string-append in generator/bindings.scm:
(string-append tab "ld" tab "[%sp+" (number->string offset) "],"
"%o" (number->string reg-num) "\n")
which I think is a bit more readable as
(format "\tld\t[%sp+~a],%o~a\n" offset reg-num)
though admittedly those tab characters make it ugly. Anyway, the point is that printing into a string is often cleaner; it doesn't break up the fixed portion as much.
Your big use of string-append in bind-c-func keeps repeating tab and newline chars. I'd write either a function or macro to produce a formatted line or sequence of lines of assembler, and just supply the parts that differ. Something like (asm-str "mov" "%fp, %sp").
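A minimal sketch of what I mean, as a plain function rather than a macro. The name `asm-lines`, the `nl` variable, and the exact layout (tab, opcode, tab, operands, newline) are my assumptions, so adjust them to whatever your assembler output needs:

```scheme
;; Assumed layout of one assembler line: TAB opcode TAB operands NEWLINE.
(define tab (string #\tab))
(define nl (string #\newline))

;; One formatted line; the caller supplies only the parts that differ.
(define (asm-str opcode operands)
  (string-append tab opcode tab operands nl))

;; Hypothetical companion: several lines at once, each instruction
;; given as an (opcode operands) list.
(define (asm-lines . instrs)
  (apply string-append
         (map (lambda (instr) (asm-str (car instr) (cadr instr)))
              instrs)))
```

Then a block of output becomes `(asm-lines '("mov" "%fp, %sp") '("ld" "[%sp+4], %o0"))`, with no tab or newline characters repeated at the call site.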
Reply
vim is the only IDE you need. An IDE is slow, doesn't work well over remote X connections, and has you depending on the IDE creators to add extensions for things like Subversion; if they don't, you constantly flip back and forth between the console and the IDE to get anything done, and end up unproductive.
Reply
My main problem with standard IDEs (Visual Studio, Eclipse) is their emphasis on mouse usage and dialogs. But that doesn't necessarily need to be the case:
I rarely, if ever, touch the mouse (or switch windows) when I'm using XEmacs, which, incidentally, has the ultimate in trivial extensibility if you don't feel like writing elisp: M-x shell. :)
Reply
My main issue with vim is that the syntax highlighting is braindead. It's always confused about matching parens in Scheme, which is really annoying, and there's apparently no easy way to fix it: the syntax highlighting is really just a hacky set of regexes, and they don't get run on off-screen areas of the buffer, so if a matching paren goes off-screen, vim doesn't know about it anymore. Grr! Though I suppose that encourages not writing functions > 1 screen in length...
Reply