Probability Confuses

July 19, 2011

It took a long time before humanity got a grip on modeling probabilities. The first formal treatment of the subject I’m aware of is the correspondence between Blaise Pascal and Pierre de Fermat. In these letters, which changed the way people see the world, the two intellectual giants were trying to figure out the correct way to divide the pot of an unfinished game, and they did not always agree about how to do it. Even from the start, probability problems were confusing the brightest minds, despite thousands of years of preparation for contending with them. Now, mere hundreds of years later, probability still confuses.

In 1990 Marilyn vos Savant, who was in the record books for having the world’s highest IQ, published in her Parade magazine column a version of what is now known as the Monty Hall problem, a puzzle first posed in 1975:

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

Vos Savant claimed that the best thing to do in this situation is to switch doors. She correctly reasoned that switching gives you a 2/3 chance of ending up with the car, while staying with the door you first chose leaves you with only a 1/3 chance. Many people disagreed with her, and not just people who knew nothing about probability: even PhDs were telling her she had gotten the wrong answer. The Monty Hall problem is famous for having sparked so much disagreement. It is taught in statistics courses across the country, and it has even been covered in the movie 21.
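If the 2/3 figure still feels wrong, it can be checked by simulation. Here is a minimal sketch, assuming the usual reading of the puzzle: the car is placed uniformly at random, and the host always opens a goat door other than the one you picked. The names play and win-rate are my own, chosen for illustration.

```clojure
(def doors [1 2 3])

(defn play
  "Plays one round and returns true if the player ends up with the car."
  [switch?]
  (let [car    (rand-nth doors)
        pick   (rand-nth doors)
        ;; assumption: the host opens a door that is neither the pick nor the car
        opened (rand-nth (remove #{pick car} doors))
        ;; switching means taking the one door that is still closed
        final  (if switch?
                 (first (remove #{pick opened} doors))
                 pick)]
    (= final car)))

(defn win-rate
  "Fraction of n simulated rounds won with the given strategy."
  [switch? n]
  (/ (count (filter true? (repeatedly n #(play switch?))))
     (double n)))

;; user> (win-rate true 10000)   ; should land near 2/3
;; user> (win-rate false 10000)  ; should land near 1/3
```

The results vary from run to run, but with ten thousand rounds the switching strategy should win roughly twice as often as staying.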

People get probability wrong all the time; this is just a famous example.

Sports fans believe that hot streaks are a real thing. They might be angry with a coach for not putting the game in the hands of a hot player. The studies don’t agree with them. Researchers who looked at so-called streaks in sports found no evidence that they exist: what we see as a streak is consistent with plain chance. We get confused about streaks outside of sports too. Peter Norvig wrote about our inability to spot true randomness in an article about proper experimental design.

There is also a study in which doctors were asked to determine how likely a patient was to have cancer. They were given all the numbers they needed to reach the right answer, but almost none of them did. Most of the doctors claimed that the patient was likely to have cancer. In truth, the chance was not high. Despite their education, the doctors were off by an order of magnitude.
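To see how an error that large can happen, here is a small sketch of Bayes' rule. The numbers are made up for illustration and are not the figures from the study: I assume that 1% of patients have cancer, that a test catches 90% of real cases, and that it gives a false positive for 9% of healthy patients. The function name posterior is my own.

```clojure
(defn posterior
  "P(cancer | positive test) by Bayes' rule."
  [prior sensitivity false-positive-rate]
  (let [true-pos  (* prior sensitivity)                  ; sick and flagged
        false-pos (* (- 1 prior) false-positive-rate)]   ; healthy but flagged
    (/ true-pos (+ true-pos false-pos))))

;; Hypothetical inputs: 1% prior, 90% sensitivity, 9% false positives
;; user> (posterior 1/100 9/10 9/100)
;; 10/109
```

Under these assumed numbers a positive result means roughly a 9% chance of cancer, not 90%. The intuitive answer ignores how many healthy patients also test positive.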

The list of ways humans make mistakes when thinking about and working with probability is long. We are able to calculate probabilities, but we don’t find them intuitive.

I worked through a probability problem that I found in the back of Introduction to Algorithms. Here is the problem:

A deck of 10 cards, each bearing a distinct number from 1 to 10, is shuffled to mix the cards thoroughly. Three cards are removed one at a time from the deck. What is the probability that the three cards are selected in sorted (increasing) order?

I decided that the answer was 1/6: the cards would be drawn in ascending order one time in six. After all, for each set of three cards there are six different orders in which they can be drawn, and only one of these orderings is sorted. When I shared this answer with my dad, he rejected it immediately. Minutes later I found myself with ten cards in my hand, running through things manually and trying to prove to him that my answer was the right one. We went through twenty deals without ever seeing the cards dealt in ascending order, and suddenly my father was sure: I was wrong, and he was right.

A few thousand draws tell a different story:

;; A deck of ten cards, each bearing a distinct number
;; between one and ten, is shuffled in order to mix the
;; cards thoroughly. Three cards are removed from the
;; deck one at a time. What is the probability that the
;; three cards are selected in sorted order?

;; First let's create the deck…
(def cards (range 1 11)) ;; => (1 2 3 … 9 10)

;; Next we draw cards from a shuffled deck by
;; using shuffle and take
(defn draw-cards [] (take 3 (shuffle cards)))

;; Testing this function in the REPL we get…
;; user> (draw-cards)
;; (2 7 8)
;; user> (draw-cards)
;; (9 8 4)

;; Now we need a way to tell whether or not the cards
;; are in sorted order. We can use a comparison
;; operator to do that pretty simply
(defn ascending? [draws] (apply < draws))

;; Now let's say we try drawing a few thousand times…
;; How frequently will the cards be in ascending order?

(count (filter ascending? (repeatedly 6000 draw-cards)))
;; 1027
;; That is about 0.171 of the 6000 draws, close to 1/6 (about 0.167)
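The simulation can also be checked exactly. The sketch below lists every ordered three-card draw and counts the ascending ones, then asks how likely twenty deals in a row without an ascending draw would be if the answer really is 1/6. It repeats cards and ascending? so that it runs on its own. The names ordered-draws, exact-probability, and miss-streak are my own, and miss-streak assumes each deal is independent of the others.

```clojure
(def cards (range 1 11))

(defn ascending? [draws] (apply < draws))

;; Every ordered way to draw three distinct cards: 10 * 9 * 8 = 720
(def ordered-draws
  (for [a cards
        b cards
        c cards
        :when (distinct? a b c)]
    [a b c]))

;; Exact probability as a ratio: 120 ascending draws out of 720
(def exact-probability
  (/ (count (filter ascending? ordered-draws))
     (count ordered-draws)))
;; user> exact-probability
;; 1/6

(defn miss-streak
  "Chance of n independent deals in a row with no ascending draw."
  [n]
  (Math/pow (- 1.0 exact-probability) n))
;; user> (miss-streak 20)   ; roughly 0.026
```

So twenty misses in a row should happen only about 2.6% of the time. The run of deals with my dad was unlucky, and twenty deals was too small a sample to settle the argument either way.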