How to: Accurately simulate the roll of dice with any number of sides: greyooze

greyooze

How to: Accurately simulate the roll of dice with any number of sides

Jun 17, 2016 11:42

Variations on this question comes up often, so I'm making a post about the maths behind it so that I have something to link to.

It's a fun thing to reason about, so put aside any crazy notions you may have that "statistics is boring" and let's dive in!

First, an example

Let's take a specific example first: I have 6-sided dice and want to simulate the roll of a 12-sided die. How can I do this?

You might correctly guess that the answer has something to do with the fact that 2×6 = 12. Figuring out exactly what to roll to take advantage of that is trickier than you might think, though. Here are some common wrong guesses!

"roll the d6 and double it" - Whoops, no, that will only give us the even numbers {2, 4, 6, 8, 10, 12}! That's not the same thing at all.
"roll the die twice and add: d6 + d6" - This fails in two ways. First of all, it only gives numbers in the range [2, 12]: there's no way to roll a 1 like you could on a normal d12! Secondly, this doesn't give a flat probability distribution. There's only a 1/36 chance of rolling a 2, and similarly for rolling a 12. But you have a 1/6 chance of rolling a 7! A normal d12 would give the same odds for every result.
"roll the d6 and multiply by a d2 (i.e. coin flip): d6 × d2" - This again misses some results: there's no way to roll a 7, 9, or 11. Also, there are two ways to roll any of {2, 4, 6} which means their probabilities will be off. At least the other results correctly have probability 1/12, I guess!

Okay. So those didn't work, I guess there's only one thing for it. We're going to have to actually care about this problem. (Audience: nooooo!)
So, what properties do we want?

Must give the full range of results in the range [1, 12] without missing any
Each result must have equal probability 1/12

(Note that all of our above attempts actually managed to fail on both criteria, which is kind of impressive I guess?)

The solution

But there is a way to do it, and today we're gonna learn how! We're going to need a d6 and a d2. We can use a coin flip for the d2. Or for bonus fun we can actually use the d6 to do it! Just roll the d6 and only check if the result is an odd or even number: boom, you have a flat two-point probability distribution.

But wait, we will need to modify the d2 to roll results in [0, 1] instead of the usual [1, 2]. You can think of this in a couple of different ways, pick the one you find easiest to visualise:

"relabel both faces"
"relabel the '2' face as a '0' face"
"flip the d2 as usual, then subtract 1"

Anyway now we have something that rolls flat results in [0, 1]. We also still have our d6 that rolls flat results in [1, 6]. Now we just do the following:

Flip our modified d2! This gives a result of 0 or 1.
Multiply by 6. This gives a result of either 0 or 6.
Roll the d6 and add it to our working result. The results of this addition for all possible inputs are shown in the table below:

+123456
0123456
6789101112

So, does this meet our criteria for a good simulated d12? Yes!

It gives the full range of results from [1, 12]
Every result is equally likely, with probability 1/12

The dual solution

The sharp-eyed among you might have spotted that there is actually a second way to do it. Suppose we keep our d2 a regular d2 that gives results in [1, 2]. Instead we modify our d6 to give results from {0, 1, 2, 3, 4, 5, 6}! Like before, you can think of this modification in a few ways, so pick the one you like best:

"just relabel all the faces"
"relabel the '6' face as a '0' face"
"roll the d6 as usual, then subtract 1"

So now we have something that rolls flat results in [0, 5]. Perhaps we should give this idea a name, "z6"? The "z" stands for "zero".
We also have our unmodified d2 that rolls flat results in [1, 2]. Now we just do the following:

Roll our modified d6 (or "z6" if you like). This gives a result from the set {0, 1, 2, 3, 4, 5}.
Multiply by 2. This gives a result from the set {0, 2, 4, 6, 8, 10}.
Flip the d2 and add it to our working result. The results of this addition for all possible inputs are shown in the table below:

+12
012
234
456
678
8910
101112

As you can see, this is pretty much the same as before, just with a different order. And it still meets our criteria!

It gives the full range of results from [1, 12]
Every result is equally likely, with probability 1/12

Pretty exciting so far. Let's generalise it!

Let's make it work for other die sizes

Okay, we want to be able to use this technique for simulating more die sizes than just the d12. Let's identify the pattern in our two solutions above and see what we can do with it.

Let's assume we've got two differently-sized dice at our disposal. Since we want to be general, we'll assume they have A sides and B sides respectively. We know the following:

The A-sided dice (which we'll refer to as a "dA") has sides labelled {1, 2, 3, ..., A}, and each side has equal probability 1/A of occurring.
The B-sided dice (which we'll refer to as a "dB") has sides labelled {1, 2, 3, ..., B}, and each side has equal probability 1/B of occurring.

Now, we know from earlier that we're going to want to modify one of our dice to have a zero side. As before we can think of this as "relabelling all faces", "relabel the 'A' face as a '0' face" or "roll the dA as usual, then subtract 1". Whichever interpretation we pick, we'll end up with the same result:

We'll say "zA" for the modified dA-with-zero. It has sides labelled {0, 1, 2, 3, ..., (A-1)}.
We'll say "zB" for the modified dB-with-zero. It has sides labelled {0, 1, 2, 3, ..., (B-1)}.

As we saw earlier, we can go about our procedure in one of two ways. The first approach is:

Roll the zA. This gives a result from the set {0, 1, 2, 3, ..., (A-1)}.
Multiply by B. This gives a result from the set {0, B, 2B, 3B, ..., (A-1)×B}.
Roll the dB and add it to our working result. The results of this addition for all possible inputs are shown in the table below:

+123…B
0123…B
BB+1B+2B+3…2B
2B2B+12B+22B+3…3B
3B3B+13B+23B+3…4B
⋮………⋱⋮
(A-1)×B(A-1)×B + 1(A-1)×B + 2(A-1)×B + 3…A×B

Now if you read the results of this table row-by-row you can see that:

It gives the full range of results from [1, A×B]
Every result is equally likely, with probability 1/(A×B)

But that's just the criteria for a simulated die with AB sides!

Likewise, we can use the dual approach:

Roll the zB. This gives a result from the set {0, 1, 2, 3, ..., (B-1)}.
Multiply by A. This gives a result from the set {0, A, 2A, 3A, ..., (B-1)×A}.
Roll the dA and add it to our working result. The results of this addition for all possible inputs are shown in the table below:

And you can write out the table yourself this time, I'm doing a lot of typing here! ;-)

So we've achieved something great. If we want to simulate a larger die roll dN, we just need to find two numbers A and B such that N=A×B. Then we can simulate the dN using the smaller dice dA and dB. Awesome!

Now that doesn't quite get us to simulating any die size: it only lets us create larger dice than our existing ones, and only certain sizes. There's no useful way to simulate a d11 with this approach, for example.

Simulating smaller-sided dice is super simple

Suppose I have a d6 and want to simulate the roll of a d5.

The d6 has possible results {1, 2, 3, 4, 5, 6}.
The d5 has possible results {1, 2, 3, 4, 5}.

I guess what we want to do is somehow "ignore" the 6 face. How do we do that? Easy!

Roll the d6.
If you get a result of "6", re-roll.
Continue until you get a result in the valid range [1, 5].

Does this meet our two criteria for a simulated die? Well, it certainly gives the full range of results from [1, 5], because that last line in our procedure requires that before you stop re-rolling. What about the probability distribution?

It turns out that, yes, they results are equally distributed; the reason might not be obvious. Essentially we are taking all the probability mass that would normally be assigned to the "6" face, and splitting it up evenly over the 5 valid faces.

On the first roll, we have a 5/6 chance of getting a valid result in [1, 5]. Each valid result is equally likely.
Otherwise (1/6 chance), we got a 6 and need to re-roll.
On the second roll, we have a 5/6 chance of getting a valid result in [1, 5]. Each valid result is equally likely.
Otherwise (1/6 chance), we got a 6 and need to re-roll.
On the third roll, we have a 5/6 chance of getting a valid result in [1, 5]. Each valid result is equally likely.
Otherwise (1/6 chance), we got a 6 and need to re-roll.
And so on.

You may have spotted that there's no upper limit on the number of rolls needed. In theory, you could roll thirty-five "6" values in a row and still not have gotten a valid result! Fortunately, this isn't very likely to happen.

Even smaller dice

Suppose we want to simulate an even smaller die, like the d4. This is no problem. Again, we simply re-roll the dice if we get an invalid result (the "5" or "6" face, in this instance). Keep going until you get a valid result.

Fully armed and operational

Okay, great. Now we know two techniques:

If we have dice of sizes A and B, we can simulate a larger die of size (A×B).
If we have a die of size N, we can simulate any smaller size (N-1), (N-2), ….

Which is totally enough to simulate a die of any size whatsoever! Here are the final step-by-step instructions.
I've split it into three parts for ease of reading, but you can think of it all as one process if you find that clearer.

Procedure A

In this procedure, N is the number of sides for the die we want to simulate.

If N is the size of a physical die you have (e.g. N=6) then just use that physical die. Stop.
If N is a prime number, use Procedure B instead. Stop.
Otherwise, find two numbers P and Q such that N=P×Q. If there are multiple different ways to factorise (e.g. 24 = 6×4 = 2×12 = 3×8), then pick one that makes steps A4 & A5 easiest.
Apply Procedure A on the smaller number P to find a way to simulate a die of this size.
Apply Procedure A on the smaller number Q to find a way to simulate a die of this size.
Use Procedure C with the numbers P and Q as input (either way around). Stop.

Procedure B

In this procedure, N is the number of sides for the die we want to simulate. N is a prime.

Find a number X which is greater than N, where X is not a prime. There are an infinite number of possible choices, but values which are close to N and provide "easy" input at step B2 below are best.
Apply Procedure A on the number X to find a way to simulate a die of this size.
Roll the dX.
If you get a result outside the valid range [1, N], re-roll (go back to step B3 again).
Otherwise, you now have a result for the simulated dN. Stop.

Procedure C

In this procedure, P and Q are the numbers of sides of two dice that we already know how to simulate.
We want to simulate a die with P×Q sides.

Define a zQ to be a die with Q sides, labelled {0, 1, 2, ..., (Q-1)}. This is just a relabelling of a dQ.
Roll the zQ.
Multiply by P.
Roll the dP and add to the working result.
Now you have a result for the simulated die with P×Q sides. Stop.

algorithms, mathematics, games design