Introduction
This post shows how a common problem in Computer Science– Trapper Miner– can be implemented in OCaml as a recursive probabilistic program.
We consider a miner who is trapped in a mine. The miner is sent to the mine for n
times independently, for each time, with probability 1/2
he is trapped and with the same probability he is safe. When he is trapped, there are 3
doors in the mine for using. The first door leads to a tunnel that will take him to safety after 3
hours (representing by the variable counter
). The second door leads to a tunnel that returns him to the mine after 5
hours. And the third door leads to a tunnel that returns him to the mine after 7
hours. At all times, the miner is equally likely to choose any one of the doors, meaning that he chooses any door with equivalent probability 1/3
.
This is a common problem in a course of foundation of Computer Science. What we are interested in here is what is the number of hours that the miner needs to take to reach safety in average for \(n\) independent times.
Manual Method
Let’s consider the manual way to compute expected time the miner reaches safety for n
independent times. Let \(X_i\) be the random variable representing the miner escape time in the \(i^{th}\) time. And let \(X = X_1 + \cdots + X_n\), we are asking for \(\mathbb{E}[X]\).
We first need to use the law of total expectation to calculate \(\mathbb{E}[X_i]\). Let \(A\) be the event that the miner is trapped, then we have \(\mathbb{E}[X_i] = \mathbb{E}[X_i \mid A] {\cdot} \mathbb{P}(A) + 0 {\cdot} \mathbb{P}(\neg A)\), which is equal to \(\frac{1}{2} {\cdot} \mathbb{E}[X_i \mid A]\).
Now use again the law of total expectation to calculate \(\mathbb{E}[X_i \mid A]\). Let \(B\) be the event that the miner reaches safety in the first try. Then we have \(\mathbb{E}[X_i \mid A] = \mathbb{E}[(X_i \mid A) \mid B] {\cdot} \mathbb{P}(B) + \mathbb{E}[(X_i \mid A) \mid \neg B] {\cdot} \mathbb{P}(\neg B)\), which is \(3{\cdot}\frac{1}{3} + (5{\cdot}\frac{1}{2} + 7{\cdot}\frac{1}{2} + \mathbb{E}[X_i \mid A]) {\cdot} \frac{2}{3} = 5 + \frac{2}{3}{\cdot}\mathbb{E}[X_i \mid A]\).
Thus, \(\mathbb{E}[X_i \mid A] = 15\). Finally, by linearity of expectation, we get \(\mathbb{E}[X] = \sum^{n}_{i=1}\mathbb{E}[X_i] = \frac{15}{2} {\cdot} n\) if \(n > 0\), otherwise it is 0.
As we have shown in the above reasoning process, the manual method involves many heuristic theorems and techniques in probability theory and algebra. As a result, it is very hard or impossible to automate. If we use our automatic analyzer, Absynth, that can infer the tightest upper bound on this expected value, the we get the same tightest bound 7.5max(0,n)
.
Implementation
let rec reach_safe flag counter =
if (flag > 0) then
begin
let r = Random.int 100 in
(* encode the probability 1/3 *)
if (r < 34) then
reach_safe 0 (counter + 3)
else
begin
let r = Random.int 100 in
(* encode the probability 1/2 *)
if (r <= 50) then
reach_safe 1 (counter + 5)
else
reach_safe 1 (counter + 7)
end
end
else counter
let rec trapped_miner n counter =
let () = assert (n >= 0) in match n with
| 0 -> counter
| _ ->
let r = Random.int 100 in
if (r <= 50) then
trapped_miner (n - 1) (reach_safe 1 counter)
else
trapped_miner (n - 1) counter
(* testing function *)
let test nr f n =
let rt = ref 0 in
for i = 1 to nr do
let cost = f n 0 in
rt := (!rt) + cost
done;
(float_of_int (!rt)) /. (float_of_int nr)
The following are the average values of counter
when we run trapped_miner
for 10000
times.
# test 10000 trapped_miner 10;;
- : float = 74.7524