I joined a Meetup group who play different kinds of board games every Tuesday evening. Originally I joined because the group specialized in Scrabble; and for several months that’s just about all we played. But as the group slowly grew, we began to diversify. These days we usually get enough people that we have three or four games going simultaneously (much to the dismay of anyone trying to concentrate while sipping their coffee at the Borders coffee shop where we play).
Two weeks ago several of us played a new game (to me, at least) called
Set. It’s a very simple pattern-matching game, played with a deck of 81 cards bearing abstract colored shapes. The goal is to identify sets of three cards, officially called SETs (in all caps, to distinguish them from ordinary mathematical sets), that either all match or are all different in each of four attributes: color, shape, number and shading. Plate 1 shows three SETs. The first is matched in color (green), shape (diamonds), and shading (filled), but differs in number; the second differs in all four attributes, and the third is matched in number but differs in the other three attributes.
To start the game, someone deals a 3×4 array of cards from the (randomized) deck and everybody stares at it until someone finds a SET, at which point s/he yells, “Set!” and takes the three cards making up the SET. The cards are replaced by three new ones from the deck, and the cycle repeats until the deck has been played entirely, and nobody can identify any more SETs.
We had one experienced player at the table, in addition to four of us rank novices. I asked her whether it was possible to deal 12 cards without a SET. She didn’t know, but pointed out that the rules allowed for the possibility: if nobody can find a SET, then three cards are added to the array without removing any cards. Presumably, if nobody still can find a SET, yet another three more cards are dealt, but it seemed pretty farfetched to me that 15 cards wouldn’t contain at least one SET. Indeed, we never failed to find a SET within a few seconds in our groups of 12 cards.
On my way to work, a few days later, I started thinking about two related questions: (1) what was the probability of drawing 12 cards without a SET? (2) What was the maximum number of cards possible without a SET?
To make things easier, I coded the 81 cards in the Set deck as four-digit numbers, one digit for each attribute, using the following convention:
Color: red (1), green (2), purple (3)
Shape: diamond (1), pill (2), tilde (3)
Number: one (1), two (2) or three (3)
Shading: open (1), hatched (2), filled (3)
Thus, the entire deck is represented as 81 numbers from 1111 to 3333. The deck includes all possible combinations of the attributes, and all cards are unique. The first two SETs in Plate 1 would, then, be encoded as {2113, 2123, 2133} and {3111, 2322, 1233}. (The horizontal stripes in the hatch-filled shapes aren’t very clear in the photo, sorry.) By stacking the numbers,
2113 3111
2123 2322
2133 1233it is easy to see that these really are SETs, because the numerals in each column are either all the same or all different.
Because all the cards are unique, at least one attribute in a SET must be different. Thus, if we make sure that only two of the three values are present for each attribute, we shouldn’t have any sets. So, the collection of all cards with no 3s in their numerical representations should not contain any SETs. There are sixteen such cards (see below and Plate 2):
1111 1112 1121 1122
1211 1212 1221 1222
2111 2112 2121 2122
2211 2212 2221 2222
That answers the question of whether we can draw 12 cards without a SET. Does it also identify the maximum number of cards possible without a SET?
It is not difficult to see that if we add any other card from the deck to our group of 16, we’ll have at least one SET. Take, for example, the card 3132. It forms a SET with 1112 and 2122. Just vary the attributes that have a value of 3 in the 17th card (here, the first and third), and keep the others equal to the values in the 17th card, and there’s your SET. What’s more, 3132 can also be grouped with 1122 and
2112 to make a SET. No fewer than eight pairs of cards complete a set with 3333: take any card and its reflection (direct opposite) about a point at the center of the 4×4 array.
Can we find a group of more than 16 cards containing no SETs? I realized it was possible to substitute certain cards for cards in the no-SET group of 16 above, as in Plate 3, where I replaced 1111 with 1113:
As it happens, there is not other card in the deck that does not form a SET with at least one pair in this group of 16. I suspected that 16 was the largest number of cards possible without a SET, but it became clear that the number of no-SET groups with 16 cards was very large, and just enumerating them was going to be far more work than I cared to invest, let alone testing each with all the remaining cards in the deck to see whether it could be augmented into a no-set 17-card group.
At this point, having officially thrown in the towel on my second question, I decided to cheat. It shouldn’t come as a surprise that other people have already thought hard about this problem. In fact, it is possible to assemble as many as
20 cards without a SET. The official Set Web site shows a
proof that this is the largest number (note that the example picture is arranged in the same way as the last figure in the proof). The sad part is that I’d thought of mapping the entire Set deck onto a 4-dimensional tic-tac-toe grid, but immediately dismissed it as unhelpful.
That still leaves my first question. I now knew that the probability of drawing 12 cards without any SET was greater than zero; but what was it? An analytical derivation is likely to be considerably more involved than finding the largest possible collection of cards without a SET; I didn’t even begin to attempt it. However, I’ve yet to find an answer on the Web, either, so maybe it isn’t all that straightforward even for a real mathematician. Though I’m not enough of a mathematician to work out the probability analytically, I am enough of a statistician to write a program to simulate thousands and thousands of independent draws of 12 cards, count the number of SETs in each, and thereby to estimate the probability. Applied statisticians use simulation all the time to estimate probabilities for probability distributions that are either too complex to work out mathematically or impossible to define even in theory.
I wrote a script to play Set in the statistical programming language
R. R, an interpreted language, isn’t anywhere near as fast at crunching numbers as a compiled language like C++, but it’s wonderfully suited for this sort of task: it took only 18 lines of code to simulate drawing a group of 12 cards at random (which itself took exactly one line of code) and counting all the SETs. Plate 4 shows the distribution of the number of SETs in 100,000 draws of 12 cards:
You can see that there are lots of SETs out there: more than half the samples contained three or more SETs, and six of them had 10 SETs each. Exactly 3,166 of 100,000 draws had no SETs. My estimate of the probability of not finding a SET in 12 cards, then, is 3166/100000 = 0.0317, or about 1 in 31.6.*
In summary, it’s not all that uncommon to have to draw additional cards to a given collection of 12 during a game of Set. There is about a 50/50 chance of finding a no-set group of 12 cards in 22 independent rounds. (Because the rounds in a real game of Set are not independent-two successive rounds share all but three of their cards-the chance of being stuck with a SETless group of cards is dependent on the number of SETs in the previous round.) On the other hand, the abundance of SETs (median = 3) in 12 cards suggests that when someone finds a SET and collects her cards, it is advantageous to keep looking for SETs in the remaining nine cards before the next three are even dealt out. (A quick simulation of 1000 groups of 9 random cards shows that such a group is about 70% likely to contain at least one SET.)
As an aside, I was delighted to find out that Set was developed by a
genetic epidemiologist. I should write the designer and ask how the categories in Set correspond to genetic elements of her project on heritability of canine epilepsy. Perhaps I should be teaching the students in my statistical genetics class to play Set!
Part II of my explorations of Set may be found
here.
_____________________
*Warning: If statistics makes your eyes glaze over, best to skip this paragraph, unless you wish to use it as a cure for insomnia.) Of course, this estimate has some uncertainty attached to it. (That’s the disadvantage of only having a finite amount of time to run simulations.) It so happens that the standard error (SE) of an estimate of probability measured as a number of counts, if the probability of success (here, finding a group of cards without a SET) is small, is approximately equal to the square root of the total counts. Moreover, if the number of counts is reasonably large (where “reasonably large” is usually taken as more than 30), then the estimate is approximately normally distributed (that is, as the familiar bell-shaped curve) with mean as the counts divided by the total number of trials, and SE the square root of the counts divided by total trials. A 95% “confidence interval” (CI; here, the range of probability values 95% likely to contain the true value) for a normally distributed estimate is the mean ± 1.96 SE. The square root of 3166 is about 56.3, and so 1.96 SEs is 110.3; hence the 95% CI for the probability is (0.0301, 0.0328). Naturally, if I ran more trials, I could estimate the likelihood with greater precision, but I already have it nailed to better than ± 5%-close enough to satisfy my curiosity.