kolmogorov complexity of integers: buhrger

buhrger

kolmogorov complexity of integers

Aug 24, 2010 08:51

inspired by a tangent on a recent post on less worgn, yet another way of thinking about and quantifying the roundness of a natural number. if we think of round numbers as being in some sense inherently simple, then it's natural to turn to kolmogorov complexity. as per the less worng post, if we take as primitive concepts the number 1 and the binary operations of addition and subtraction, we can then count the minimum number of these primitives it takes to make any given natural number. so, for example, five is 1+(1+(1+(1+1))), requiring five 1s and four +s. but we can also represent six using only five ones: (1+1)×(1+(1+1)), and twelve only takes seven ones: (1+(1+1))×((1+1)+(1+1)).

writing representations in this form quickly gets unwieldy, due to the proliferation of parentheses. one can go to polish notation, in which case twelve ends up looking like ×+1+11++11+11 - shorter to write, not necessarily clearer to read. of course, as we're dealing with binary operations, the next place to go is binary trees, but those are even tougher to actually type up.

but again what we're interested in here is ultimately the kolmogorov complexity of these things, in this case the size of these expressions. one could go with the length of the polish notation, but as it's equivalent to the binary tree, one always ends up with an odd number: k nodes and k+1 leaves. so instead, i've been just counting the number of nodes, equivalently the number of addition and multiplication operations necessary to construct the integer. i've been denoting this by K(n).

so we have two possible ways of representing a new integer:

add one to the previous integer, in which case K(n) = K(n - 1) + 1
multiply two previous integers, in which case K(n) = K(d) + K(n/d) + 1, where d|n is a nontrivial proper divisor

by taking the minimum over all such expressions, we can find the actual value of K(n). if you happen to have previous values saved, the calculation isn't that onerous. (yes, it requires a factorization of n, but so does every roundness measure you're ever gonna find.)

in this context, the round numbers are the ones with relatively low values of K(n). two different approaches to "relatively":

as K(n) exhibits roughly logarithmic growth (unsurprising, given the combinatorics of binary trees), we can normalize K(n)/ln(n) and then pick comparatively low values. (doing numerical asymptotics on this is tough, but the value tend to be between 2 and 4 for large 3-digit values of n.)
given the first representation method, we're always guaranteed that K(n) ≤ K(n - 1) + 1. often we have K(n) = K(n - 1). a round number is one where K(n) < K(n - 1), especially if K(n) < K(n - 1) - 1. by that definition, the first few round numbers are 24, 48, 54, 60, 72, 90, all of which have K(n) = K(n - 1) - 2. the first case of K(n) = K(n - 1) - 3 is 108, and the first case of K(n) = K(n - 1) - 4 is, unsurprisingly, 720.

the non-round numbers (by either criterion), are the ones where we have to use K(n) = K(n - 1) + 1. these are mostly primes, but there are a few composites as well, the first few cases being 46, 55, and 82.

math