For a little while I've been poking around in some basic number theory using the Sage computer mathematics system (and a tiny bit of PARI/GP, which is another package that comes bundled inside of Sage).
I was initially inspired by a
blog post of John Cook's about the Perrin numbers, a sequence sort of like the Fibonacci numbers that can be used via a simple further operation to generate what seems to be a list of prime numbers (and it in fact contains all the prime numbers, but eventually starts including some composite ones as well... starting with 271,441.)
This seems pretty remarkable, but actually it's just one of a whole family of such sequences. Every recurrence relation that generates the next number in a sequence as a linear combination of the previous k numbers can be used to produce a primality test like this. Some of them aren't very good, though! They all pass all the primes, but they have different frequencies of false positives, or "pseudoprimes". The Perrin sequence is one of the best ones at order 3, meaning the previous 3 values are involved.
Anyway, I wrote a lot about this in a
series of posts to Google+ using the hashtag #pseudoprimes, in which I gradually became more and more deconfused.
Cook wrote about a few of my early explorations here.
I went on to do some automated searches to find the recurrence relations that generated pseudoprimes with the most impressive infrequency, or at least had the biggest first ones.
I found some whoppers, especially at order 4 where there was one whose first pseudoprime is 8,016,241; but I also discovered, to my mild surprise, that the most impressive one at order 2 seems to be a case that is not very well-known (though I doubt I am its true discoverer; several closely related sequences are in OEIS, and I'm just noodling around here).
The main purpose of this post is to provide all of my Sage code.
The following appears to be written in Python, but you will need
Sage (or an account on a Sage cloud server, like the spiffy new thing at
https://cloud.sagemath.com/ ) to actually run it.
The pseudoprime searches are kind of computation-intensive, so I'd be careful about running them on somebody else's machine. Note, here I don't actually use the methods I was talking about in the post John Cook referenced, because they don't seem to work well in Sage (they work better in PARI/GP, which is what
Joerg Arndt used in the algorithm on the OEIS page for Perrin pseudoprimes). For some of the toughest order-4 cases, I did my automated searches in Sage with a feasibly low search limit (around 10^6) on each sequence, then switched to PARI/GP to actually find pseudoprimes for the few that exhausted that limit.
Many of these functions take a matrix M that defines a recurrence relation. You can make such a matrix with the recur function, which takes a list of the coefficients that the most recent values in the sequence will be multiplied by, and added to get the next one (for the Fibonacci recurrence it would be [1, 1]; for the Perrin numbers, [1, 1, 0]).
For every such recurrence, there's a special set of seed values where the nth sequence value is the trace of Mn, and that was the case I was mostly interested in because those are the ones that most easily generate primes and pseudoprimes. (For the Fibonacci case, that sequence is not the Fibonacci numbers, but the closely related Lucas numbers, which start with [2, 1] as the seed values.) sequence_for_seed lets you specify your own seed value if you want to explore seeds other than that one.
# Sage Python functions for messing around with linear recurrence
# sequences and pseudoprimes, by Matt McIrvin, May 2013.
# Generate the recurrence matrix for a given list of coefficients:
# [c0 ... c(k-1)] where p(n) = c0 * p(n-k) + ... + c(k-1) * p(n-1).
# This just has the coeffs in the bottom row and a shifted identity
# that moves the other sequence values elsewhere.
def recur(coeff):
dim = len(coeff)
M = zero_matrix(ZZ, dim, dim)
M.set_block(0, 1, identity_matrix(dim-1))
M.set_block(dim - 1, 0, matrix(ZZ, 1, dim, coeff))
return M
# Find the initial seed values for a given recurrence matrix
# for the sequence corresponding to tr(M^n).
def seed_values(M):
p = []
for n in xrange(0, M.ncols()):
p.append((M^n).trace())
return p
# Iteration step.
def find_next(p, coeff, r):
pnext = 0
for k in xrange(0, r):
pnext = pnext + p[k] * coeff[k]
# advance the list of values
del p[0]
p.append(pnext)
return pnext
# Pseudoprimality test.
# pnext = sequence value, tr = matrix trace, n = index being tested
def is_pseudoprime(pnext, tr, n):
return (pnext % n == tr % n) and not(is_prime(n))
# Recurrence sequence for k*k matrix M and initial list of values [p0 ... p(k-1)].
def sequence_for_seed(M, p, limit):
seq = p[:]
r = M.ncols()
coeff = M.row(r - 1)
for n in xrange(r, limit):
seq.append(find_next(p, coeff, r))
return seq
# Return the recurrence sequence for matrix M, up to position limit.
def sequence(M, limit):
return sequence_for_seed(M, seed_values(M), limit)
# Ratios of successive values.
def ratios_lag(L, lag):
out = []
for k in xrange(1, len(L)):
if k - lag >= 0 and L[k - lag]!= 0 :
out.append(N(L[k]/L[k - lag], digits=15))
return out
def ratios(L): return ratios_lag(L, 1)
# Print all pseudoprimes and their factorizations for M up to limit.
def pseudoprimes(M, limit):
# The naive method is actually faster on Sage, and feasible for smaller numbers
# First fill out the vector
p = seed_values(M)
r = M.ncols()
coeff = M.row(r - 1)
tr = p[1]
for n in xrange(r, limit):
pnext = find_next(p, coeff, r)
if is_pseudoprime(pnext, tr, n):
print n, factor(n)
# Just return the first pseudoprime for M less than limit, or None.
# Used in the coefficient searches.
def first_pseudoprime(M, limit):
p = seed_values(M)
r = M.ncols()
coeff = M.row(r - 1)
tr = p[1]
for n in xrange(r, limit):
pnext = find_next(p, coeff, r)
if is_pseudoprime(pnext, tr, n):
return n
return None
# Print some interesting information about M and its characteristic polynomial.
def characterize(M):
char = charpoly(M); print char
solns = solve(char(x), x); show(solns)
print M.eigenvalues()
print M.eigenvectors_right()
# Searches of coefficient space for recurrence relations with big first pseudoprimes.
# Will print the bottom row of the matrix for all cases with integer coefficients
# between -coeff_limit and coeff_limit inclusive, with a first pseudoprime at least
# as large as print_min. If the first pseudoprime is >= max, will print OUT OF RANGE.
def rank_2_search(coeff_limit, print_min, max):
for c0 in xrange(-coeff_limit, coeff_limit+1):
for c1 in xrange(-coeff_limit, coeff_limit+1):
M = Matrix([[0, 1], [c0, c1]])
p = first_pseudoprime(M, max)
if p == None :
print "[ ", c0, c1, " ]: OUT OF RANGE"
elif p >= print_min :
print "[ ", c0, c1, " ]: ",p, factor(p)
def rank_3_search(coeff_limit, print_min, max):
for c0 in xrange(-coeff_limit, coeff_limit+1):
for c1 in xrange(-coeff_limit, coeff_limit+1):
for c2 in xrange (-coeff_limit, coeff_limit+1):
M = Matrix([[0, 1, 0], [0, 0, 1], [c0, c1, c2]])
p = first_pseudoprime(M, max)
if p == None :
print "[ ", c0, c1, c2, " ]: OUT OF RANGE"
elif p >= print_min :
print "[ ", c0, c1, c2, " ]: ",p, factor(p)
def rank_4_search(coeff_limit, print_min, max):
for c0 in xrange(-coeff_limit, coeff_limit+1):
for c1 in xrange(-coeff_limit, coeff_limit+1):
for c2 in xrange (-coeff_limit, coeff_limit+1):
for c3 in xrange (-coeff_limit, coeff_limit+1):
M = Matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [c0, c1, c2, c3]])
p = first_pseudoprime(M, max)
if p == None :
print "[ ", c0, c1, c2, c3, " ]: OUT OF RANGE"
elif p >= print_min :
print "[ ", c0, c1, c2, c3, " ]: ",p, factor(p)