In recent years, a body of research has indicated that most people handle natural frequencies better than probabilities when it comes to statistical reasoning. A probability is a number from zero to one, such as 0.6, and the same information can be encoded using natural frequencies as "6 out of 10". Prior to its recent exposure in the media, I had not heard of the term "natural frequencies", and while I am not surprised by the conclusion, I find it interesting that the reported differences in correct statistical reasoning using natural frequencies versus probabilities are so stark.
It seems that the most frequently cited example is the application of Bayes' rule. Consider the following problem in the medical domain, with all the numbers contrived:
A medical test is developed to detect lung cancer. In a clinical trial of 10000 people, 1% of the people have lung cancer. The probability that the test detects a person as positive, given that the person has lung cancer, is 90%. The probability that the test detects a person as positive, given that the person does not have lung cancer, is 10%. What is the probability that a person in the clinical trial has lung cancer, given that the person is detected as positive by the test?
The correct answer is about 8.3%. The correct computation involves Bayes' rule as follows:
P(cancer | positive)
= [P(positive | cancer) * P(cancer)] / [P(positive | cancer) * P(cancer) + P(positive | no cancer) * P(no cancer)]
= [0.9 * 0.01] / [0.9 * 0.01 + 0.1 * 0.99]
= 0.083 (approx)
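The same arithmetic can be checked mechanically. The following is a minimal Python sketch of the computation above; the function name `posterior` and its parameter names are my own labels for illustration, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction


def posterior(prior, p_pos_given_cancer, p_pos_given_no_cancer):
    """Bayes' rule for a binary condition and a binary test result.

    Returns P(cancer | positive). All names here are illustrative.
    """
    true_positive = p_pos_given_cancer * prior            # P(positive and cancer)
    false_positive = p_pos_given_no_cancer * (1 - prior)  # P(positive and no cancer)
    return true_positive / (true_positive + false_positive)


# Numbers from the contrived lung cancer problem.
answer = posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 10))
print(answer, float(answer))
```

The exact result is 1/12, which is about 0.083.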
However, the formula is complicated and difficult to apply correctly. Furthermore, the answer is counter-intuitive to many people, who think that it should be around 90%. One common flawed line of reasoning starts from the fact that the correct detection rate is 0.9 both for people with lung cancer and for people without it, so people simply average the two rates. Another common error is to mistake the conditional probabilities for joint probabilities and compute 0.9 / [0.9 + 0.1] = 0.9. To these people, given the probabilities in the problem, it is not intuitive that a small proportion of false positives drawn from a very large pool of people with no lung cancer results in a very large number of false positives. Yet such low answers are to be expected for diagnostic tests of rare events, even when the false positive rate and the false negative rate are both low. The medical domain sees many rare events in the form of diseases that need to be detected. Another domain where rare events are often found is physical security, such as intrusion detection and terrorist detection.
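To see how strongly the answer depends on rarity, the sketch below holds the two test rates from the problem fixed and varies the prevalence. Only the prevalence of 1/100 comes from the problem; the other prevalences and the name `posterior_for_prevalence` are hypothetical, chosen for illustration.

```python
from fractions import Fraction

SENSITIVITY = Fraction(9, 10)          # P(positive | cancer), from the problem
FALSE_POSITIVE_RATE = Fraction(1, 10)  # P(positive | no cancer), from the problem


def posterior_for_prevalence(prevalence):
    """P(cancer | positive) for a given prevalence, with the test rates held fixed."""
    true_positive = SENSITIVITY * prevalence
    false_positive = FALSE_POSITIVE_RATE * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical prevalences; only 1/100 appears in the original problem.
prevalences = [Fraction(1, 1000), Fraction(1, 100), Fraction(1, 10), Fraction(1, 2)]
results = {p: posterior_for_prevalence(p) for p in prevalences}

for p, post in results.items():
    print(f"prevalence {float(p):.3f} -> P(cancer | positive) = {float(post):.3f}")
```

With these rates, the intuitive answer of 90% is reached only when half of the population has the disease; at a prevalence of 1 in 1000 the answer drops below 1%.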
The same problem presented in natural frequencies is given below:
A medical test is developed to detect lung cancer. In a clinical trial of 10000 people, 100 of the people have lung cancer. Out of the 100 people who have lung cancer, the test detects 90 of them as positive. Out of the 9900 people who do not have lung cancer, the test wrongly detects 990 of them as positive. Out of the people that are detected as positive by the test, what is the proportion of the people having lung cancer?
Application of Bayes' rule is now much simpler:
P(cancer | positive)
= #(positive and cancer) / [#(positive and cancer) + #(positive and no cancer)]
= 90 / [90 + 990]
= 0.083 (approx)
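As a rough illustration, the conversion from the percentages of the first version to the head counts of the second can be written out in a few lines of Python. The variable names are my own, and integer arithmetic is used so that the counts stay whole numbers.

```python
from fractions import Fraction

population = 10000

# Convert the percentages in the probability version into head counts.
with_cancer = population * 1 // 100           # 1% prevalence -> 100
without_cancer = population - with_cancer     # 9900
true_positives = with_cancer * 90 // 100      # 90% of 100 -> 90
false_positives = without_cancer * 10 // 100  # 10% of 9900 -> 990

all_positives = true_positives + false_positives
proportion = Fraction(true_positives, all_positives)
print(f"{true_positives} out of {all_positives} positives have cancer: {float(proportion):.3f}")
```

Once the counts are in hand, the only remaining step is a single division, 90 / 1080.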
Research indicates that many more people derive the correct answer for the natural frequencies version of the same problem. In the natural frequencies version, one thing that stands out is the number of people who are correctly detected as positive, which is 90, as opposed to the number of people who are wrongly detected as positive, which is a much larger 990. This insight is lost or obscured when the problem is presented using conditional probabilities, which are 0.9 and 0.1 respectively. In fact, the probabilities version of Bayes' rule is complicated by its many multiplications, whose purpose is to recover the joint probabilities P(positive and cancer) and P(positive and no cancer). Perhaps it is easier for people to understand if they are given joint probabilities rather than conditional probabilities.
Tree diagrams are often used to visualize such problems. Based on research guidance, it is recommended to use natural frequencies rather than conditional probabilities in tree diagrams. Personally, for problems involving only two variables, I think that it might be intuitive to present the information using a confusion matrix, where the numbers are natural frequencies:
             Positive   Negative
Cancer             90         10
No cancer         990       8910
Then the solution to this problem can be determined from the numbers in the first column alone, ignoring the second column altogether: 90 / (90 + 990). A variant of the confusion matrix uses joint probabilities rather than natural frequencies, and it might be just as effective.
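The confusion matrix translates directly into a small data structure. The sketch below is only one possible encoding (a nested dictionary with labels of my own choosing); it reads the answer off the "positive" column and then repeats the calculation with the joint-probability variant.

```python
# Confusion matrix of natural frequencies: rows are true status, columns are test result.
confusion = {
    "cancer": {"positive": 90, "negative": 10},
    "no cancer": {"positive": 990, "negative": 8910},
}

# Only the "positive" column is needed to answer the question.
positive_column = {status: row["positive"] for status, row in confusion.items()}
p_cancer_given_positive = positive_column["cancer"] / sum(positive_column.values())

# Joint-probability variant: divide every cell by the grand total.
total = sum(sum(row.values()) for row in confusion.values())
joint = {
    status: {result: count / total for result, count in row.items()}
    for status, row in confusion.items()
}
from_joint = joint["cancer"]["positive"] / (
    joint["cancer"]["positive"] + joint["no cancer"]["positive"]
)

print(round(p_cancer_given_positive, 3), round(from_joint, 3))
```

Both routes give about 0.083, since dividing every cell by the same total does not change the ratio within a column.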
Some of the research also notes that Bayes' rule has traditionally been taught using probabilities. Experiments have shown that the majority of people perform such computations using probabilities, going as far as converting a problem presented in natural frequencies into probability format instead of using the frequency counts as-is. This greatly increases the chance of making errors. Owing to this research, a small number of schools have started teaching Bayes' rule using probabilities as well as natural frequencies.
Finally, I make a comment on conditional probabilities, which are often a great source of confusion. I would like to assert that all probabilities are conditional probabilities. Probabilities that are not explicitly stated as conditional are really conditional on some universe which is derived from the context. For example, in the lung cancer problem, P(cancer) actually means P(cancer | universe), where the universe refers to the 10000 people who participated in the clinical trial. However, the notation omits this fact, which in my opinion confuses people who are learning probability theory. Given the above, conditional probabilities can be seen as just ordinary probabilities with the universe set to the condition. That is, P(cancer | positive) is the probability that a person has lung cancer, where the universe is now the set of people who are detected as positive by the medical test.
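This view of conditioning as shrinking the universe can be made concrete with a short Python sketch. It builds one record per trial participant from the counts in the natural frequencies version, and uses a single function (the hypothetical `p_cancer`) for both the unconditional and the conditional probability; only the set of people passed in changes.

```python
from fractions import Fraction

# One (has_cancer, tests_positive) record per participant, using the counts
# from the natural frequencies version of the problem.
universe = (
    [(True, True)] * 90        # cancer, positive
    + [(True, False)] * 10     # cancer, negative
    + [(False, True)] * 990    # no cancer, positive
    + [(False, False)] * 8910  # no cancer, negative
)


def p_cancer(people):
    """Proportion of `people` who have cancer, i.e. P(cancer | people)."""
    return Fraction(sum(1 for has_cancer, _ in people if has_cancer), len(people))


# P(cancer) is really P(cancer | universe).
p_cancer_in_trial = p_cancer(universe)

# Conditioning on a positive test just shrinks the universe.
positives = [person for person in universe if person[1]]
p_cancer_given_positive = p_cancer(positives)

print(p_cancer_in_trial, p_cancer_given_positive)
```

The first value is 1/100 and the second is 1/12, computed by the same function over two different universes.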