Advent of Raku (and a little Go)

Jan 08, 2022 21:54

Advent of Code is an annual programming challenge. Every night at midnight EST from December 1st through 25th, a new problem is posted to the website, along with a participant-specific input file. Participants can use any programming language they want to solve the problem, using their input, and see if the output of their program matches the expected value. Once that's done, a second part of the problem becomes available. Part 2 uses the same input file and is generally a variation on the first part, such as changing the goal from "find the maximum" to "find the sum of all" or increasing the number of times an operation needs to be performed so that an inefficient algorithm will be far too slow. Problems are usually straightforward at the beginning of the month and get more challenging as the month progresses. There's a stats page measuring time-to-completion and a top-100 leaderboard, but no prizes; many folks pursue their own goals like learning a new language or minimizing program runtime which are somewhat at odds with quick finishes.

After having lots of fun and learning Kotlin with Advent of Code in December 2020, I decided to use Raku for this past year's edition (with code shared on GitHub). Raku is the new name of Perl 6, a "member of the Perl family of languages" which famously took a decade and a half of community experimentation before a finalized version was released. Perl, in turn, is a language originally focused on working with collections of text files which is famously easy to write or hard to read, depending on who you ask. Raku keeps many of Perl's basic syntactic elements-like scalar, array, and hash variables identified by $, @, and % sigils-but also brings to bear a lot of modern programming language developments like class-based object orientation, concurrency, type constraints, multi-dispatch, and just-in-time compilation. Raku is also what one might call Unicode-forward. Most languages these days allow Unicode literals in strings, and most languages made since the '90s allow Unicode letters in program identifiers. Raku takes this significantly further. First, Raku strings are sequences of graphemes rather than just bytes or Unicode code points, so both the single-code-point and the "combining diacritics" variants of "é" are identical in Raku. Second, Unicode characters aren't just limited to letters: the language and standard library provide $thing ∈ ($set1 ∪ $set2) for set operations, @words».uc to upper-case each element of an array, ⚛++ for atomic-increment, 25 ** ½ as another way to perform a square root, and quoting strings with arbitrary characters like say q༺hello world༻.uc. Additionally, Raku took one of Perl's big selling points in the '80s and '90s (terse and powerful regular expressions) and evolved them into grammars which are easier to read, modify, and compose, and likely also faster. This grammar support is what drew my interest to Raku, since I started a hobby project that involves parsing a small amount of information from source code written in a large number of languages, and the ability to quickly write but still maintain textual matchers would make that project more pleasant.

One of the driving principles in the design of Raku (and Perl before it) is There's more than one way to do it (TMTOWTDI). Another principle (repeated frequently by people who post in help forums, it seems) is that programming in Raku should be fun. Several times while working on an Advent of Code solution I tried to do something in a way that looked both elegant and reasonable, only to find out that There's More Than One Way To Do It, But Some Of Those Ways Are Wrong. (Furthermore, since TMTOWTDI, the documentation usually doesn't say "This isn't a good tool for X, use Y instead.") For example, I spent at least half an hour trying to understand why my memoization code didn't seem to be caching my Pair objects, despite all my experiments with the interactive interpreter looking like things should work just fine. It turns out that foo => 42 creates a Pair which uses value equality for both key and value, but my $key = 'foo'; my $val = 42; $key => $val creates a Pair that holds on to the $val scalar (but not the key scalar) and thus uses (half) reference equality. The documentation explains this behavior in an aside and explains that .freeze will result in an immutable Pair, but it's easy to encounter Pairs in documentation that doesn't mention that, and the half hour of WTF was not at all fun. (Another Way To Do It would be implementing a 2-value tuple class myself, which wouldn't have added fancypants scalar handling.) This discovery also reduced my confidence in the language a bit: when I look at a block of code, am I sure that my variables work as values, or might something else hold on to this funny scalar reference?

Another un-fun discovery was that for @first ∩ @second { … } doesn't iterate through the shared elements of two arrays, but instead iterates through Pairs, where the value is always True. I was aware that many set implementations are implemented as a hash table where the keys are set elements and the values are a placeholder value like a boolean. But most languages hide this implementation detail and present a set as akin to an unordered list which doesn't allow duplicate values. The workaround is easy (call .keys on the set when iterating over it), and it provides a nice symmetry with Bags (multisets) and Mixes (weighted sets), but it was still a big surprise. This was made worse by Raku's gradual typing discipline and implicit conversions; I think I was putting the set elements into a Hash, which converts keys to strings by default, so rather than a compile or runtime error complaining that I was using a Pair where a Str was expected I got a Hash with keys like fooTrue rather than just foo. Iteration and automatic conversion also combine in un-fun ways because the base class for most objects implements all the List methods by converting non-iterable objects into single-element lists. I would find it a lot more fun if the Raku compiler would tell me that sub answer(--> Int) { 42 } ; for answer() { … } was attempting to iterate over a non-iterable (maybe I changed answer from a List to an Int and forgot to update all the callers) rather than silently iterating over a single element. This annoyance is compounded by the fact that scalar references to iterable types (like a sequence, list, or array) are turned into single-element lists in this context, so changing my $x = nums => (1, 2, 3, 4); .say for $x.value (which prints four lines with one number each) to my $x = (1, 2, 3, 4); .say for $x changes the output to print a single line with four numbers and a pair of parentheses. This makes changing the shape of data structures while developing a program (like adding or removing a wrapper class) create surprising effects that aren't caught by the compiler. And maybe it's just me, but I think programming is more fun when the computer quickly tells you when you make a mistake, rather than losing an hour of sleep because you were debugging code that looked reasonable.

Working in Raku did have some fun elements. Contrary to my complaints about automatic conversions for container types, the ability to seamlessly work with numbers parsed from text was pretty nice for Advent of Code problems. Having a Complex number type that works as a hash key made several 2D grid traversal problems fairly convenient. And I even got to leverage Raku's nice Unicode property support on a problem about open/close punctuation balancing. I opted to use Raku this year as a way to try out grammars, and they were generally nice. They're way more readable than Perl regular expressions and the "Perl-compatible" regex implementations that have come along in the past three decades, and the ability to supply an Actions class to parse the relevant data from a match and bubble it up makes working with the result of a complex parse much nicer. Grammars are generally overkill for AoC inputs, but I found the structure pretty nice. The one downside to Raku grammars was the lack of helpful information when a parse failed. Unlike a typical programming language compiler that outputs the line and column where input didn't match expectations, a failed grammar parse just returns null, even though Raku has a Failure type that allows conveying additional information. So when my parsing rules were wrong I generally had to put output statements in the Action class and manually inspect what the next token should've been.

Contrary to last year, when I mostly focused on implementing the code and learning the language, I spent a lot of time on the social side of AoC this year. Last year I participated a bit in the Reddit community to provide hints to folks who were stuck. This year, after solving the problem and then participating in Google's group chat about the day's challenge, I frequently spent a couple hours on Reddit, reading through the "solutions megathread" and checking out people's visualizations. This meant a lot of 3am bedtimes this December, followed by another late night. Coupled with trying to actually get some work done, I spent far too much time staring at a screen this December. Also, unlike 2020, there were social reasons to leave the house this year-friends were amused that I was programming in Vim over SSH from a smartphone while semi-drunk at a holiday party while also chatting with folks. (There were just a couple small bugs by the time I got home, and Vim's terse modal editing proved to be a nicer phone-based development environment than I'd expected.)

Three weeks of late nights-including an all-nigher for day 19 because I'd incorrectly assumed that rotations in 3D were commutative-definitely caught up with me. I was pretty burned out by the time I got stuck and went to bed on day 23, which is kind of an interesting problems with a lot of little fussy ways to introduce bugs. I was extra toasty on day 24 (the night leading into Christmas Eve) when I discovered that what I thought would be a reasonable solution-a modified binary search-didn't work at all because in the possible solution space of about 20 trillion numbers, fewer than ten are even potentially the right answer. This fact about the input file wasn't at all clear from the problem statement, and the frustration was intensified by the fact that (contrary to every other AoC problem I've seen) there was no realistic example and expected output to experiment with. The fact that a seemingly reasonable solution can run for hours without providing any insight about the problem (other than "valid values are sparse") and that (as far as I can tell) an efficient solution to this otherwise NP-complete problem requires making assumptions based on a single input file, pretty much soured me on what had otherwise been an enjoyable month. That one day's problem (following a couple late nights) made me strongly question ever participating in "live" (i.e. during December) AoC ever again, which isn't a good feeling to have on Christmas Eve.

Raku's final un-fun factor played a role in this burnout too: slow execution speed. I'd seen folks who hang out on Raku help communities warn folks that Raku performance isn't great, but I figured it would be fine for Advent of Code, which has lots of folks working in languages like Python and more tortured environments like Bash or Google Sheets. But on days 19, 23, and 24 I discovered that my Raku code would spend tens of minutes running on the example input before producing a wrong answer, which is not a good situation in a "implement a program before bedtime" challenge. To more quickly test wrong hypotheses and spot bugs, I reimplemented those days in Go. The Go language is far more verbose and has many fewer features than Raku, but I could implement a Go solution and try it five times in the time it would take to run my Raku code twice. My day 19 solution in Go-using the same algorithm and only slightly optimized code paths-was about a hundred times faster than the Raku implementation. I recall noticing that one Go run took 45 seconds while Raku took 45 minutes. I spent more time optimizing the runtime of the day 23 solution (due to some discussion in the group chat at work) and ended up with a 2.5 second solution in Go and a 68 minute solution in Raku. I even spent some time with the Raku profiler (which amusingly produced about a gigabyte of output for 45 seconds of runtime and had to be analyzed with sqlite because the HTML profiler output crashes any browser tab) and was only able to get a maximum 10% speedup after playing with all of the hot code paths under my control. Two orders of magnitude in runtime is difficult to make up with even the most amazing language expressiveness.
This entry was originally posted at https://flwyd.dreamwidth.org/400979.html - comment over there.

raku, program, go, advent of code

Previous post
Up