Hats and a capricious secretary: a Monte Carlo trick to compute “e”

A popular basic-probability question in quant job interviews is the classical matching problem, told in amusing forms: hats in a check-room are mixed up and returned at random to the guests; a secretary has shuffled a set of letters and their envelopes. What is the probability that no hat reaches the right guest? What is the probability that exactly 3 letters reach the right destination?

The most classical formulation of the same problem asks for a match between two identical decks of N cards, one of them randomly shuffled. Letting A_i be the event that the i^{th} card matches (regardless of other possible matchings!), we have

P(A_i) = \frac{1}{N}\,,\quad P(A_iA_j) = \frac{1}{N(N-1)}\,,\quad P(A_{i_1}\cdots A_{i_r}) = \frac{(N-r)!}{N!}
The probability of at least one match is given by
P(A_1\cup A_2 \cup ... \cup A_N)

= \sum_{i=1}^{N} P(A_i)- \sum_{i<j}P(A_iA_j) +...+(-1)^{N-1}P(A_1...A_N)

= 1- \frac{1}{2!}+\frac{1}{3!}-...+(-1)^{N-1}\frac{1}{N!}
Therefore, in the limit of large N the probability approaches 1-1/e, which is quite a surprising result. It also gives the probability of zero matches, which is simply 1 minus the probability of at least one match, i.e. approximately 1/e for large N.

From this simple result comes the amusing idea of computing e with a Monte Carlo, just as is usually done for \pi: we estimate the frequency of zero-match shuffles and take its reciprocal. Here is my Mathematica code which does the job:

MonteCarloNepero[NN_, L_] := Module[{mean, check, indicatorCount},
  indicatorCount = 0;
  Do[
   check = (# - RandomSample[#]) &@Table[l, {l, 1, L}];
   If[FreeQ[check, 0],
    indicatorCount = indicatorCount + 1;
    ], {NN}];
  mean = indicatorCount/NN;
  N[1/mean]
  ]

Here NN is the number of Monte Carlo runs and L is the length of the card deck. The convergence in the number of cards is very fast, so with L=15 the result is already fair. The convergence in NN is far slower... For example we may get

In[2]:= MonteCarloNepero[1000, 15]
Out[2]= 2.75482

In[3]:= E // N
Out[3]= 2.71828

Not that accurate, but fun.

Free Open-Source Statistics Cookbook

Data Science 101

Matthias Vallentin, a computer science PhD student at UC Berkeley, has published a Probability and Statistics Cookbook. The book can be freely downloaded in PDF format via the website. Also, the LaTeX source is available on GitHub. Matthias states that others are free to fork the source and make changes.

The book is not a textbook. It is more of a cheat sheet. It contains many of the common probability and statistics techniques and the associated formulas. I would consider this book an excellent resource to have around.


Sampling from Cauchy (and Gaussian) distribution

How do we construct a random draw from a Cauchy distribution if we start with a draw from a uniform distribution? That's what Exercise 7.3 of the book The Concepts and Practice of Mathematical Finance by Mark Joshi asks. We post here some C++ code as an answer. The direct way is to compute explicitly the inverse cumulative distribution function F^{-1}(x) of the Cauchy distribution and then plug in random numbers generated from a uniform distribution (this is done by the function GetOneRandomByICD).

f(x) = \frac{1}{\pi}\frac{1}{1+x^2},\; F(x) = \frac{1}{\pi}\arctan (x) + \frac{1}{2},
F^{-1}(x) = \tan [\pi (x-1/2)]

Trickier is the case in which we don't have F^{-1}(x) available. Then we can obtain our sample by using two uniform random numbers – GetOneCauchyByUniforms – or two Gaussian random numbers – GetOneCauchyByGaussian. In the latter case, of course, we also need a routine to compute Gaussian numbers. We present here the routine GetOneGaussianByRatio, which is similar to GetOneCauchyByUniforms (both are based on ratios of uniforms – see Probability and Random Processes by Grimmett and Stirzaker). The mathematical proofs are in the probability section of my notes.

#ifndef RANDOM_H
#define RANDOM_H

double CauchyICD(double uniform);
double GetOneRandomByICD(double (*func_ptr)(double));

double GetOneCauchyByUniforms();
double GetOneCauchyByGaussian();

double GetOneGaussianByRatio();

#endif // RANDOM_H

#include "Random.h"
#include <cmath>
#include <cstdlib>

double CauchyICD(double x){
	return std::tan(M_PI*(x-0.5));
}

double GetOneRandomByICD(double (*func_ptr)(double)){
	return (*func_ptr)(rand()/static_cast<double>(RAND_MAX));
}

double GetOneGaussianByRatio(){

	double x;
	double y;

	double ratio;

	// ratio-of-uniforms method: accept when |y| <= exp(-ratio^2/4);
	// the accepted ratio x/y is then a standard Gaussian draw
	do {
		x = 2.*rand()/static_cast<double>(RAND_MAX)-1.;
		y = 2.*rand()/static_cast<double>(RAND_MAX)-1.;
		ratio = x/y;
	} while((y > std::exp(-ratio*ratio*0.25))
           || (y < -std::exp(-ratio*ratio*0.25)) );

	return ratio;
}

double GetOneCauchyByUniforms(){

	double x;
	double y;

	double sizeSquared;

	// keep only points inside the unit disk: the angle is then uniform,
	// so y/x = tan(theta) is a standard Cauchy draw
	do {
		x = 2.*rand()/static_cast<double>(RAND_MAX)-1.;
		y = 2.*rand()/static_cast<double>(RAND_MAX)-1.;
		sizeSquared = x*x + y*y;
	} while(sizeSquared >= 1.0);

	return y/x;
}

double GetOneCauchyByGaussian(){
	double x = GetOneGaussianByRatio();
	double y = GetOneGaussianByRatio();

	// the ratio of two independent standard Gaussians is standard Cauchy
	return x/y;
}

A simple test program which writes samples of Cauchy-distributed numbers to files:

#include "Random.h"
#include <iostream>
#include <fstream>
#include <cmath>
#include <cstdlib>

int main(){

	long sampleLength = 10000;

	std::ofstream myfile1;
	myfile1.open("inverse_cauchy.txt");
	std::ofstream myfile2;
	myfile2.open("ratio_cauchy.txt");
	std::ofstream myfile3;
	myfile3.open("gauss_cauchy.txt");

	for(long n=0; n< sampleLength; n++){
		myfile1 << GetOneRandomByICD(&CauchyICD) << "\n";
		myfile2 << GetOneCauchyByUniforms() << "\n";
		myfile3 << GetOneCauchyByGaussian() << "\n";
	}

	return 0;
}

Now just compile and try it.