Probability and OCaml
Posted by Daniel Lyons Thu, 22 Feb 2007 09:04:00 GMT
I’m reading a cool book called Understanding Probability. Pretty early on in chapter 2 there is a chart that looks something like this:
| n | Kn – np | fn |
| 10 | 1.0 | 0.6000 |
| 25 | 1.5 | 0.5600 |
| 50 | 2.0 | 0.5400 |
and so forth. In this case, the experiment is flipping a coin, n is the number of repetitions of the experiment, Kn is the number of times we get heads, p is the probability of getting heads (0.5), and fn is the relative frequency of getting heads. Obviously, this chart represents one particular set of trials. The author encouraged writing code so I busted out some OCaml on paper in the car, and implemented it from the hotel room a few hours ago.
type face = H | T
That’s just the coin type.
module ProbabilityLab =
struct
let run_experiment experiment times =
Array.init times (fun i -> experiment ())
Here I’m just applying some function over and over again with no parameter to run the experiment, and using that to fill in an array. The experiment gets invoked once for each time we want to do a trial. An array comes out. Simple.
let count_value value result =
float_of_int (Array.fold_right
(fun x y -> y + if x = value then 1 else 0)
result
0)
let count_total result =
float_of_int (Array.length result)
let relative_frequency value result =
count_value value result /. count_total result
These are some utility routines. count_value just counts up the number of times a particular value occurs in an array. count_total gives the length of the array. relative_frequency computes fn for a particular value. Notice that each of these takes the result array as the last parameter and returns a float.
let expected_actual_difference probability value result =
let actual = count_value value result in
let expected = (probability *. count_total result) in
expected -. actual
expected_actual_difference is for the Kn – fp. We’ll use it to see how many more heads we get than half: if the random number generator was completely unrandom and produced heads and then tails always, then this number would fluctuate between zero and either one or negative one.
let collect_summaries summaries result =
List.map (fun f -> f result) summaries
end;;
The last thing in my ProbabilityLab module in a function which takes a list of functions and applies each of them to the result in turn. It’s the opposite of the way you normally map; instead of one function of a simple parameter being invoked across a list, we’re going to invoke each function in the list across the same complex piece of data. This will simplify the table production later.
A simple coin flipping module:
module Coin =
struct
let flip () = match Random.int 2 with
| 0 -> H
| _ -> T
let flip_unfair () = match Random.int 6 with
| 0 -> H
| _ -> T
end;;
As you can see, it just implements two functions: flip and flip_unfair, which represent an unweighted coin and a coin that leans heavily towards tails, only getting heads about 1/6th of the time.
As an example, we can now run a simple experiment such as:
# ProbabilityLab.run_experiment Coin.flip 10;; - : face array = [|H; T; H; H; H; T; H; T; T; T|]
That’s the experiment result we’re going to have to work with.
let coin_flip_experiment coin times =
begin
Random.self_init ();
let result = ProbabilityLab.run_experiment coin times in
ProbabilityLab.collect_summaries [
ProbabilityLab.count_total;
ProbabilityLab.expected_actual_difference 0.5 H;
ProbabilityLab.relative_frequency H]
result
end;;
This function performs a coin flipping experiment. It takes one of the coins as a parameter (meaning, flip or flip_unfair) and a number of times to flip the coin. It creates the result using the run_experiment function in the ProbabilityLab module and then collects three summaries. The first summary is just n. The second is the number of heads we got beyond 50% (the expected amount), kn – np. The third is the computed relative frequency of obtaining heads with this experiment. Note that they’re all curried function calls that are expecting one more parameter—the result array.
Clearly, coin_flip_experiment does the bulk of the work. But now we need to see what it takes to drive it:
let process () =
let coins = [Coin.flip; Coin.flip_unfair] in
let trials = [ 10; 25; 50; 100; 250; 500;
1000; 2500; 5000; 7500; 10000; 15000;
20000; 25000; 30000] in
List.map (fun coin ->
List.map (fun trial -> coin_flip_experiment coin trial) trials)
coins
;;
Simple. A nested map across the coins and the numbers of trials we want. Here’s the output, unbeautified, for one run on my computer:
# process ();; - : float list list list = [[[10.; 0.; 0.5]; [25.; -1.5; 0.56]; [50.; -1.; 0.52]; [100.; -4.; 0.54]; [250.; -2.; 0.508]; [500.; 2.; 0.496]; [1000.; 21.; 0.479]; [2500.; 6.; 0.4976]; [5000.; -20.; 0.504]; [7500.; 30.; 0.496]; [10000.; 23.; 0.4977]; [15000.; 78.; 0.4948]; [20000.; 60.; 0.497]; [25000.; 139.; 0.49444]; [30000.; -53.; 0.501766666666666694]]; [[10.; 4.; 0.1]; [25.; 9.5; 0.12]; [50.; 18.; 0.14]; [100.; 35.; 0.15]; [250.; 88.; 0.148]; [500.; 180.; 0.14]; [1000.; 321.; 0.179]; [2500.; 861.; 0.1556]; [5000.; 1668.; 0.1664]; [7500.; 2516.; 0.164533333333333337]; [10000.; 3268.; 0.1732]; [15000.; 5033.; 0.164466666666666678]; [20000.; 6705.; 0.16475]; [25000.; 8413.; 0.16348]; [30000.; 9916.; 0.169466666666666654]]]

the prob. of a ball landing on all blacks nos. or all reds nos. in 9 continuous trys on a roulette wheel 1-36 with one 0 (always a loser)?