In this post, I would like to start a topic about probabilistic programming in OCaml.
Probabilistic programing languages are standard programming languages like C, Java, or ML, with two additional constructs:
The sampling construct for drawing a value at random from probability distributions
The probabilistic branching construct for controlling the flow in a program through observations.
The applications of probabilistic programming is wide, for instance, probabilistic programs are used in machine learning and robotics to describe distribution functions that are analyzed using Bayesian inference. In security, probabilistic programming has played a central role in cryptography, i.e., probabilistic security guarantees and probabilistic encryption. Probabilistic programs are also used to model the performance and reliability properties of a variety of systems, in which the uncertainties can come from the reliability of components (hardwares or softwares), the data from sensors, the reliability of the communication channels in the systems (e.g., the aircraft’s control systems).
This raises the question of how to analyze and verify that a system exhibiting probabilistic behavior satisfies a certain property. There is a rapidly growing trend in research on probabilistic programs which focuses on many aspects such as static analysis, program compilation, and program verification. To mention a few, the extension the framework of abstract interpretation to probabilistic programs, the development of a calculus for obtaining upper bounds on the expected value of run-time for probabilistic programs, and the developments of probabilistic and statistic model checking techniques for verifying probabilistic temporal properties of probabilistic and timed systems.
The following implements a simple example of probabilistic program for finding a given value in a list of integer values. The fist function
find represents the standard program, e.g., without probability. It is easy to see that in the worst-case, the variable
acc (initialized to 0) will have the value as the length of the input list
let rec find a l acc = match l with |  -> (false, acc) | x::xs -> if a = x then (true,(acc + 1)) else find a xs (acc + 1);;
let rec pfind a l acc = match l with |  -> (false, acc) | x::xs as l -> let c = Random.int 2 in if c = 0 then pfind a l (acc + 1) else (if a = x then (true,(acc + 1)) else pfind a xs (acc + 1));;
let test n f a l = let res = ref  in for i = 1 to n do let (found,cost) = f a l 0 in res := cost::(!res) done; let sum = List.fold_left (fun s x -> s + x) 0 !res in (float_of_int sum) /. (float_of_int n);;
In the second function
pfind, the integer value
c is sampled from a uniform distribution from 0 (inclusive) to 2 (exclusive). In other words,
0 with probability
0.5 and is
1 with probability
1 - 0.5 = 0.5. A a result, with probability
0.5, the recursive call on the tail of the list is skipped and with the same probability the recursion is called on the tail of the list.
We can see that the recursion can be infinite, e.g., the number of times such that
0 is infinite. However, we can calculate that the probability that the number of times such that
0 is infinite, is with which is
0. It is said that
pfind is almost surely terminated.
The question is what is the behavior of
pfind. For example,
what is the expected value of the variable
For this simple program, we can compute manually that this expected value is
|l| is the number of elements in the list in the worst-case. My on-going work is to automatically and statically infer a symbolic upper bound on this expected value. The following are the average values of
acc when we run
# test 10000000 find 10 [0;1;2;3;4;5;6];; - : float = 7. # test 10000000 pfind 10 [0;1;2;3;4;5;6];; - : float = 13.998363