Project Euler in Factor, part 1: pozorvlak

pozorvlak

Project Euler in Factor, part 1

Nov 17, 2009 23:48

In a (probably futile) attempt to acquire Real Ultimate Power, I have been trying some Project Euler problems in Factor. Here are my first five solutions, along with my comments: I was going to post my first ten solutions, but this was getting too long.

Spoilers herein, obviously.

Problem 1: Add all the natural numbers below one thousand that are multiples of 3 or 5.

Solution:
USING: sets math kernel sequences ;
IN: project-euler.problem1

: mults ( seq n -- seq2 ) [ mod 0 = ] curry filter ;

: sum35 ( seq -- n )
[ 3 mults ] [ 5 mults ] bi union sum ;

: solution1 ( -- n ) 1000 sum35 ; Tests:
USING: tools.test project-euler.problem1 ;
IN: project-euler.problem1.tests

[ V{ 0 3 6 9 } ] [ 10 3 mults ] unit-test
[ 23 ] [ 10 sum35 ] unit-testComments:

Pretty straightforward. The mults word takes a sequence seq and a number n and returns those elements of seq which are multiples of n. Square brackets introduce a quotation, or anonymous word/function/thing, and curry does what you'd expect, so [ mod 0 = ] curry takes a number n off the stack and pushes back a quotation which determines whether its argument is exactly divisible by n. [I'm finding it very hard to avoid function/argument/return language, even though I know it doesn't work like that: please bear with me!]. filter, as you'd expect, takes a sequence and a quotation and returns all the elements of the sequence for which the quotation returns true; note that in Factor, the integer n can also serve as the sequence 0, 1 .. n-1. The bi combinator takes two quotations and applies them to the top element of the stack, pushing both results; here, we use it to get all the multiples of 3 and all the multiples of 5 in the desired range, then union them together and take the sum of the resulting sequence.

Problem 2: Find the sum of all the even-valued terms in the Fibonacci sequence which do not exceed four million.

Attempt 1:

"Oh goody", I thought, "a chance to find out how Factor handles lazy infinite lists. I know, I'll translate the one-liner fibs = 1:1:(zipWith (+) fibs (tail fibs)) from the Gentle Introduction to Haskell into Factor". Unfortunately, I couldn't find a lazy equivalent of Haskell's zipWith (which takes a function f and two lists x and y, and returns the list [f x0 y0, f x1 y1, ...]), so I had to write my own, which works by zipping the two lists (creating a list of pairs) and transforming the quotation so that it acts on pairs rather than on the first two elements on the data stack. I was keen to re-use an existing lazy-list making word, because list in Factor is actually an abstract class, implemented by concrete classes called lazy-cons, lazy-map, lazy-while and so on. I called my word l2map, because the equivalent word for sequences (which are not the same as lists) is called 2map and the l- prefix was generally used to indicate "Applies to lists". BTW, Factor also has a word map*, which works on arbitrarily many sequences.
USING: lists lists.lazy kernel sequences math ;
IN: project-euler.problem2-1

: make-2map-quot ( quot -- quot )
[ swap [ first ] keep second rot call ] curry ;

: l2map ( seq seq2 quot -- seq3 )
-rot lzip swap make-2map-quot lazy-map ;

: fibs ( -- fs )
1 1 fibs fibs cdr [ + ] l2map lazy-cons lazy-cons ;

: solution2-1 ( -- n ) fibs [ 2 mod 0 = ] lfilter
[ 4000000 <= ] lwhile 0 [ + ] foldr ; But typing solution2-1 into the listener gives a "Data stack overflow" message. A quick [ solution2-1 ] walk reveals the problem: though the list itself is lazy, the recursive call to fibs is evaluated strictly, and this means that it will inevitably recurse until I run out of stack space. Stupid boy. So, I tried a different approach.

Attempt 2:

The Fibonacci sequence is an example of a Lucas sequence; that is to say, one for which every element after the first two is found by adding the two previous elements. There's an obvious recursive definition of Lucas sequences: lucas(x,y) = x, y, lucas(y, x+y). In Factor, this becomes
USING: lists lists.lazy kernel math ;
IN: project-euler.problem2-2

: lazy-lucas ( x y -- ls )
2dup + [ lazy-lucas ] 2curry swap [ ] curry swap lazy-cons ;

: lazy-fibs ( -- ls ) 1 1 lazy-lucas ;

: solution2-2 ( -- n )
lazy-fibs [ 2 mod 0 = ] lfilter
[ 4000000 <= ] lwhile 0 [ + ] foldl ; This executes in a few milliseconds, and is pleasingly shorter, though there's still a lot of explicit stack-juggling.

I could, of course, have written an alternative solution in which I iterated over the Fibonacci sequence keeping at most two elements and a running total in memory, but having got the right answer I was keen to move on to

Problem 3: Find the largest prime factor of the number 600851475143.
USING: math.primes.factors sequences kernel ;
IN: project-euler.problem3

: biggest-factor ( n -- f ) factors last ;

: solution3 ( -- f ) 600851475143 biggest-factor ; When doing Project Euler problems in a high-level language, one has two choices; either one can implement everything from scratch (thus learning about the necessary mathematics and techniques for coding efficiently), or one can use as much library code as possible (thus learning about the language's libraries and preferred style). Ideally, one should do both, but since I'm trying to learn Factor rather than number theory, I went for the second option. I think the resulting code requires no further explanation :-)

Problem 4: Find the largest palindrome made from the product of two 3-digit numbers.

This one was lots of fun. And by "lots of fun", I mean "took me bloody ages".

Attempt 1:
USING: math math.parser math.functions math.ranges kernel sequences
lists.lazy lists ;
IN: project-euler.problem4

: palindrome? ( str -- ? ) dup reverse = ;

: numeric-palindrome? ( n -- ? )
number>string palindrome? ;

: digit-nums ( n -- seq )
[ 1 - ] keep [ 10^ ] bi@ 1 - [a,b] reverse ;

: a*b ( a,b -- ab ) [ first ] keep second * ;

: digit-nums-list ( n -- list )
digit-nums sequence>list ;

: digit-num-pairs ( n -- list )
digit-nums-list dup lcartesian-product ;

: products ( list -- list ) [ a*b ] lazy-map ;

: first-palindrome ( list -- n )
[ palindrome? ] lfilter car ;

: length>palindrome ( n -- n )
digit-num-pairs products first-palindrome ;

: solution4 ( -- n ) 3 length>palindrome ; Another use of lazy lists. When the only tool you know is a hammer... digit-nums makes a descending sequence of all n-digit natural numbers by applying the 10^ word to both n and n-1 (using the apply combinator bi@, which is dual to bi), then using the wonderfully-named word [a,b] to get the range in between them. Since Factor has almost no syntax, it's legitimate - and encouraged - to use punctuation in word names where appropriate. Some more examples in this code are the use of ? to mark predicates and the use of > to mark conversion functions. After the code's constructed the sequence of three-digit numbers, it copies it (using dup), then constructs a lazy list of all pairs of three-digit numbers (using lcartesian-product), then goes through the list looking for the first palindrome.

There are a couple of earlier versions of this code here, from when I gave up and asked for help on the #concatenative IRC channel (on which more later). Note how half of the first program is commented out? The compilation error I was getting was inside the commented-out section. Fortunately, erg pointed out my mistake instantly: in the stack effect declaration for a*b, I'd written ab) rather than ab ). Stack effect declarations, like most things in Factor that look like syntax, aren't implemented using special cases hard-coded into the lexer or parser: instead, they're implemented using parsing words (the equivalent of Lisp's reader macros). So ( is actually a parsing word that says "read characters until you hit a ), and parse everything you find as a stack effect declaration". In particular, this turns off the normal lexer's treatment of ! as a comment-character (or, more precisely, it stops the ! parsing word from being invoked...). Unfortunately, when I mistyped ab) for ab ) it meant that the stack-effect parser interpreted it as just another stack object, and carried on reading until the next ) on its own, which was on the next line. I suggested various ways in which this kind of thing could generate a warning instead, but was told that they'd all rule out some genuinely useful cases and anyway it was too rare a problem to be worth bothering with - clearly I'd made a very beginnerish mistake indeed :-(

Even after cleanup, unfortunately, this code suffers from three serious problems:

When run with n=3, it gobbles resources and hangs the listener.
When run with n=2, it doesn't hang the listener, but it does die with an error ("Generic word car does not define a method for the word class", since you ask).
Even if it ran without crashing, it probably wouldn't produce the correct answer: the path it takes through the space of pairs of three-digit numbers doesn't guarantee that the products will decrease, so the first palindrome it comes to need not be the largest.

Clearly, an entirely different approach was called for.

Attempt 2:
USING: math math.primes.factors math.parser math.ranges kernel sequences ;
IN: project-euler.problem4-2

: >palindrome ( d -- db )
number>string dup reverse append string>number ;

: has-digits? ( n m -- ? )
swap number>string length = ;

: two-3-digit-divisors? ( n -- ? )
[ divisors [ 3 has-digits? ] filter ]
[ [ swap / 3 has-digits? ] curry ] bi any? ;

: solution4-2 ( -- n )
999 100 [a,b] [ >palindrome two-3-digit-divisors? ]
map-find nip >palindrome ; Instead of iterating over pairs of three-digit numbers and looking for palindromes, we iterate over palindromes and look for products of three-digit numbers. Since the desired answer will almost certainly have six digits, we can enumerate candidates by taking three-digit numbers and gluing them back-to-back. This is done using the >palindrome word, which uses the >thing convention for "word that makes things" (object constructors in the Java/C++ sense are generally called ). This code produces the correct answer in an eyeblink; a back-of-an-envelope calculation suggests that it shouldn't actually be doing much less calculation than the first approach (if I'd managed to solve the enumerate-in-the-right-order problem), so I guess I must have been hitting a lazy evaluation/space leak bug. D'oh.

Edit: the reference solution, in extras/project-euler/004.factor, is on the lines of my first attempt, but does everything strictly. In one of those "I don't think we're on the BBC Micro anymore, Toto" moments, I'd totally failed to realise that you could just load the whole million-element array into memory and sort it. On the other hand, my final solution runs 100 times faster than the reference solution on my machine, and could be easily made to scale to longer palindromes. Anyway, on to

Problem 5: What is the smallest number divisible by each of the numbers 1 to 20?

Well, it's the product of each pn < 20, where p is a prime and n is maximal. Obviously.
USING: kernel math math.ranges math.primes math.functions sequences ;
IN: project-euler.problem5

: logn ( x y -- xlogy ) [ log ] bi@ / ;

: highest-power-lt ( p n -- n^m )
swap dup -rot logn floor ^ >integer ;

: lcm-all ( n -- N )
[ primes-upto ] [ [ highest-power-lt ] curry ] bi map product ;

: solution5 ( -- N )
20 lcm-all ; The naive solution 20 [1,b] 1 [ lcm ] reduce ; works too, but doesn't scale to large n nearly so well.

Edit: As Slava pointed out to me, dup -rot is equivalent to tuck, and swap tuck is equivalent to over, so I could have replaced the mess of shuffle words in highest-power-lt with a simple over.

Overall comments

The dev tools are nice, once you work out what everything's called - the debugger's called the "walker", for instance. I particularly like the way that the listener pops up a stack effect declaration in a minibuffer when you type in a word's name - since the conventions are strictly followed, this is often all the documentation you need. When you need more, it's easy to get help and see source code using the help and see words, or using the browser (Alt+B). I haven't yet worked out how to take best advantage of the editor integration, but I'm sure that will come.
For getting started, I found the Factor cookbook invaluable.
#concatenative is one of the friendliest tech forums I've encountered, and hands down the most helpful. In every case my (admittedly n00bish) questions were answered before I'd finished posing them.
I'm finding programming without variables to be really hard. Factor does have facilities for both lexically- and dynamically-scoped variables, but I've been mostly avoiding their use, on the grounds that I'll have to learn how to use the stack some time and I might as well do so now.

Any comments or questions, on the code or the presentation, are very welcome!

Edit: thanks once again to the members of #concatenative, particularly slava and erg, for comments on this post.

computers, programming, maths, factor