An Example of Probabilistic Programming

Van C. Ngo · November 19, 2017

Introduction

This post shows how a common problem in Computer Science– Trapper Miner– can be implemented in OCaml as a recursive probabilistic program.

We consider a miner who is trapped in a mine. The miner is sent to the mine for n times independently, for each time, with probability 1/2 he is trapped and with the same probability he is safe. When he is trapped, there are 3 doors in the mine for using. The first door leads to a tunnel that will take him to safety after 3 hours (representing by the variable counter). The second door leads to a tunnel that returns him to the mine after 5 hours. And the third door leads to a tunnel that returns him to the mine after 7 hours. At all times, the miner is equally likely to choose any one of the doors, meaning that he chooses any door with equivalent probability 1/3.

This is a common problem in a course of foundation of Computer Science. What we are interested in here is what is the number of hours that the miner needs to take to reach safety in average for \(n\) independent times.

Manual Method

Let’s consider the manual way to compute expected time the miner reaches safety for n independent times. Let \(X_i\) be the random variable representing the miner escape time in the \(i^{th}\) time. And let \(X = X_1 + \cdots + X_n\), we are asking for \(\mathbb{E}[X]\).

We first need to use the law of total expectation to calculate \(\mathbb{E}[X_i]\). Let \(A\) be the event that the miner is trapped, then we have \(\mathbb{E}[X_i] = \mathbb{E}[X_i \mid A] {\cdot} \mathbb{P}(A) + 0 {\cdot} \mathbb{P}(\neg A)\), which is equal to \(\frac{1}{2} {\cdot} \mathbb{E}[X_i \mid A]\).

Now use again the law of total expectation to calculate \(\mathbb{E}[X_i \mid A]\). Let \(B\) be the event that the miner reaches safety in the first try. Then we have \(\mathbb{E}[X_i \mid A] = \mathbb{E}[(X_i \mid A) \mid B] {\cdot} \mathbb{P}(B) + \mathbb{E}[(X_i \mid A) \mid \neg B] {\cdot} \mathbb{P}(\neg B)\), which is \(3{\cdot}\frac{1}{3} + (5{\cdot}\frac{1}{2} + 7{\cdot}\frac{1}{2} + \mathbb{E}[X_i \mid A]) {\cdot} \frac{2}{3} = 5 + \frac{2}{3}{\cdot}\mathbb{E}[X_i \mid A]\).

Thus, \(\mathbb{E}[X_i \mid A] = 15\). Finally, by linearity of expectation, we get \(\mathbb{E}[X] = \sum^{n}_{i=1}\mathbb{E}[X_i] = \frac{15}{2} {\cdot} n\) if \(n > 0\), otherwise it is 0. As we have shown in the above reasoning process, the manual method involves many heuristic theorems and techniques in probability theory and algebra. As a result, it is very hard or impossible to automate. If we use our automatic analyzer, Absynth, that can infer the tightest upper bound on this expected value, the we get the same tightest bound 7.5max(0,n).

Implementation

let rec reach_safe flag counter = 
   if (flag > 0) then
   begin
      let r = Random.int 100 in 
      (* encode the probability 1/3 *)
      if (r < 34) then
        reach_safe 0 (counter + 3)
      else 
      begin
         let r = Random.int 100 in
         (* encode the probability 1/2 *)
         if (r <= 50) then
            reach_safe 1 (counter + 5)
         else
            reach_safe 1 (counter + 7)
      end
   end
   else counter

let rec trapped_miner n counter = 
   let () = assert (n >= 0) in match n with
   | 0 -> counter
   | _ -> 
     let r = Random.int 100 in
     if (r <= 50) then 
        trapped_miner (n - 1) (reach_safe 1 counter)
     else 
        trapped_miner (n - 1) counter

(* testing function *)
let test nr f n =
   let rt = ref 0 in
   for i = 1 to nr do
      let cost = f n 0 in
      rt := (!rt) + cost
   done; 
   (float_of_int (!rt)) /. (float_of_int nr)

The following are the average values of counter when we run trapped_miner for 10000 times.

# test 10000 trapped_miner 10;;
- : float = 74.7524