Essays/Generalized Monty Hall

From J Wiki
Jump to: navigation, search

The Problem

There are 5 doors, behind each door there is either a car or one of 4 goats. Player picks a door (Choice 1), game master reveals another door with a goat. Player can either stay with Choice 1 or continue to play. In which case he chooses one of the three remaining doors (Choice 2). Game master then reveals another door with a goat and the player can either stay with Choice 2 of switch to the last door (Choice 3).

What are the chances of getting the car with Choice 1, Choice 2 and Choice 3? Is there a general formula for n choices and 2n-1 doors, with positive integer n?

Sample Space

As in Binary Probability we will seek a sample space with elements of equal probability. The space has the following structure, by the presence of the car in the Choices.

We start with 5 items: one car and four goats, and make 5 withdrawals from that pool: 3 Choices by player and two reveals of goats by game master in between. Let's identify what options each of the Choices have.

  • If car in Choice 1, then for Choice 1 that's the only option you will get; for Choice 2, it's 3 goats after the car was taken by Choice 1 and a goat by the game master; for Choice 3 it's the remaining one goat after one was taken in Choice 2 and another by game master between Choices 2 and 3.
  • If car in Choice 2, for Choice 1 you have the options of all 4 goats, but not the car since it's in Choice 2; for Choice 2 has only the car; for Choice 3, it's one goat remaining after one was taken in Choice 1 and two by the game master.
  • Similarly, if car in Choice 3.

Let's put this in a table:

Choice 1 Choice 2 Choice 3
Car in Choice 1 the car 3 goats 1 goat
Car in Choice 2 4 goats the car 1 goat
Car in Choice 3 4 goats 2 goats the car

Note, because the single car is represented in each possible position, and for each position of the car, the remaining positions contain the maximum possible number of goats, the given structure covers all possibilities in this game.

Extending this thought to higher number of Choices, we can see that the table describes a general sample space, when extended upwards and to the left by one item for each new choice. And it can be constructed procedurally in two ways. First by cartesian product of choices:

   struct1=: [: (~:*]-<)"0/~ 1+2*i.@-
   struct1 3
0 3 1
4 0 1
4 2 0

The struct1 verb generates the structure of options for a given number of choices, where 0 is the car and positive number is the goats.

   'C G'=: 'cg'
   char=: <@((C"_)`(#&G))@.*"0   NB. character view

   char struct1 3
+----+---+-+
|c   |ggg|g|
+----+---+-+
|gggg|c  |g|
+----+---+-+
|gggg|gg |c|
+----+---+-+

Second recursively:

   struct2=: 3 : '((0,1+2*i.@-@#) , (,.~ +:@#))^:(y-1),.0'
   ]S=: char struct2 3
+----+---+-+
|c   |ggg|g|
+----+---+-+
|gggg|c  |g|
+----+---+-+
|gggg|gg |c|
+----+---+-+

Interestingly, the two-fold increasing sequences of number of goats in the rows naturally reflect the fact of alternating goats revealed by the game master.

Groups of Elements

The elements of the sample space are obtained as catalogue by rows

   ,@{"1 S
+---+---+---+---+---+---+---+---+
|cgg|cgg|cgg|   |   |   |   |   |
+---+---+---+---+---+---+---+---+
|gcg|gcg|gcg|gcg|   |   |   |   |
+---+---+---+---+---+---+---+---+
|ggc|ggc|ggc|ggc|ggc|ggc|ggc|ggc|
+---+---+---+---+---+---+---+---+

   <@:>@,@{"1 S    NB. or more compactly
+---+---+---+
|cgg|gcg|ggc|
|cgg|gcg|ggc|
|cgg|gcg|ggc|
|   |gcg|ggc|
|   |   |ggc|
|   |   |ggc|
|   |   |ggc|
|   |   |ggc|
+---+---+---+

Note the elements are grouped by car in Choice. This allows to directly obtain the histogram of the distribution of car in choices:

   '* '{~ a:= ,@{"1 S
***
****
********

   '* '{~ a:= ,@{"1 char struct1 2     NB. Monty Hall classic
*
**
   '* '{~ a:= ,@{"1 char struct1 4     NB. case of 4 choices
***************
******************
************************
************************************************

Probability

The density of probability of drawing a car in a Choice, i.e. events car in Choice, is obtained using the multiplication rule from options in each row (which correspond to these events), from the structure replacing the 0s for car with 1s.

   car1=: 1:^:(0&=)"0  NB. car=0 to 1
   dens=: % +/         NB. density from frequencies

   x: dens */"1 car1 struct1 3
1r5 4r15 8r15

   x: dens */"1 car1 struct1 2  NB. classic
1r3 2r3

Or more directly, for n choices and the frequency of choice m, let us look closer at its sctructure. For n=4

   car struct1 4
1 5 3 1   NB.  m: 1 --> n
6 1 3 1   NB.           ^
6 4 1 1   NB.           |
6 4 2 1   NB.         n:|

In general, the expression for frequency of choice m out of n total choices is


F_n(m) = \prod_{i=1}^n
    \left\{
\begin{array}{rl}
  2(n-i), & i<m \\
       1, & i=m \\
2(n-i)+1, & i>m \\
\end{array}
    \right. ,
    \quad 1 \le m \le n


   freq=: [: ([: */ = 1&["0 ]-<)"0 1/~ 1+2*i.@-
   x: dens freq 5
1r9 8r63 16r105 64r315 128r315

Since the events car in Choice, c \in C, are disjoint, there is only one car, and by construction their elements together exhaust all possibilities, \sum P(c) = 1 for c \in C.

   +/ dens */"1 car1 struct1 5    NB. for 5 choices
1

Frequencies for the first n=1..8 are

   freq"0] 1+i.8
     1      0      0      0      0      0      0      0
     1      2      0      0      0      0      0      0
     3      4      8      0      0      0      0      0
    15     18     24     48      0      0      0      0
   105    120    144    192    384      0      0      0
   945   1050   1200   1440   1920   3840      0      0
 10395  11340  12600  14400  17280  23040  46080      0
135135 145530 158760 176400 201600 241920 322560 645120

Compare with the values of bifactorial for 1 \le m \le n.

The frequencies could also be built recusively

   freqnext=: ({. * <:@+:@#) , (* +:@#)
   freqtab=: 3 : 'freqnext&.>^:(<y) <1'

   freqtab 5
+-+---+-----+-----------+-------------------+
|1|1 2|3 4 8|15 18 24 48|105 120 144 192 384|
+-+---+-----+-----------+-------------------+

Razed Densities for the first n=1..20

   load 'plot'
   plot ;dens&.>freqtab 20

attachment:freqtab1.png

For further discussion of bifactorial distribution, see bifactorial.

Simulation

The following verb will return the number of marked items drawn for each of y choices, in x trials.

doors =: ] ? #
pick  =: ?@#/@$
remove=: (] -. {)"0 1
goat  =: [: i.&1"1 ~:&0

monty=: 4 : 0
  D=. x doors _1+2*y
  C=. 0$~0,x
  while. 1 do.
    p=. pick D
    C=. C , p}|:D
    D=. p remove D
    if. 0={:$D do. break. end.
    D=. (goat remove ]) D
  end.
  +/0=|:C
)

Here is the result for y=1..20 choices in 1000 trials each. Compare with the densities of the theoretical distribution above.

   plot ; 1000 <@(monty % [)"0 >:i. 20

attachment:freqtab1a.png

See Also


Contributed by Oleg Kobchenko