*WARNING: This post contains probability, which is not my strong point. If I have made mathematical or terminological errors, please correct them kindly in the comments.*

Probability can be a bit of a minefield. I try not to rely on my intuition at all when someone asks me a probability question, because so often it can be wrong. So imagine my consternation on Friday evening at the end of a long hard day working with summer school pupils on probability when I was given a question by a close friend who should have known better, and asked to apply my intuition to it.

In order to pose his question, I’d better explain some terminology – skip this paragraph if you’re familiar with the idea of a uniform distribution. Imagine a number line with all the numbers between 0 and 1. You can pick any point on the number line you like. It doesn’t have to be a terminating decimal, it doesn’t even have to be a recurring decimal – it can be rational or irrational, close to 0, close to 1, or anywhere in between. The uniform distribution is a probability distribution where each of these choices is equally likely.

Start by imagining a random variable uniformly distributed on (0,1). It feels pretty intuitive that the expected value of such a random variable should be 0.5. My good friend’s question was as follows: What happens if you have two independent random variables, both uniformly distributed on (0,1), and you want to know the expected value of the smaller of the two variables?

My intuition said that the answer had to be less than 0.5, because for each random variable the expectation was 0.5, so for the two together and taking the smaller of the two, it just had to be less. My thinking then went very arm-wavy: some of the time they will both be less than a half, some of the time one will be less and one greater, and some of the time they will both be greater than a half, so maybe it’s about a third? My arm-wavy gut instinct was confirmed by my friend’s algebra.
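For anyone who would rather experiment than wave their arms, here is a minimal simulation sketch of the question as posed. The function name `mean_of_smaller`, the number of trials and the fixed seed are all illustrative choices of mine, not anything from my friend's working; it simply draws two uniform values many times and averages the smaller one.

```python
import random

def mean_of_smaller(trials=200_000, seed=0):
    """Estimate E[min(X, Y)] for independent X, Y uniform on (0, 1)."""
    rng = random.Random(seed)  # fixed seed so reruns give the same estimate
    total = 0.0
    for _ in range(trials):
        # Draw two independent values and keep the smaller one.
        total += min(rng.random(), rng.random())
    return total / trials

if __name__ == "__main__":
    print(mean_of_smaller())  # should land close to 1/3
```

A simulation like this only suggests the answer; it does not explain why it should be a third.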

I’m not too confident at manipulating probability density functions, so I resorted to my usual tactic when scared off by continuous distributions: I looked for a discrete analogue. What if I had a set of WHOLE numbers, and I picked pairs of them (with replacement, so the same number can come out twice)? I started with the set of numbers from 0-6 (deliberately picking a multiple of 3 as the top value in case thirds came into it). Actually that’s not entirely true; I started with the numbers from 1-6 and then changed my mind… This was a small enough set that I could list the possibilities and calculate the average, and working on this special case gave me enough insight to try working through some algebra for the set of numbers from 0-n. I leave this as an exercise for the reader 🙂
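The listing of possibilities can also be handed to a computer. The sketch below is one way to do it, assuming ordered pairs drawn with replacement from 0 to n; the helper name `mean_min_discrete` is made up for illustration, and it uses exact fractions so no rounding creeps in. It does not do the algebra for you, so the exercise still stands.

```python
from fractions import Fraction
from itertools import product

def mean_min_discrete(n):
    """Exact mean of min(a, b) over all ordered pairs drawn from 0..n."""
    values = range(n + 1)
    pairs = list(product(values, repeat=2))  # with replacement: (n + 1)**2 pairs
    return Fraction(sum(min(a, b) for a, b in pairs), len(pairs))

if __name__ == "__main__":
    for n in (6, 60, 600):
        m = mean_min_discrete(n)
        # Dividing by n rescales the set 0..n onto the interval from 0 to 1.
        print(n, m, float(m / n))
```

For the set 0-6 the mean of the smaller number comes out as 13/7, which is about 0.31 of the top value, and the rescaled figure creeps towards a third as n grows.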

What I’d be most interested in though is if anyone has any insight into the original continuous version of the problem – how would you explain to a layman why the answer should be a third? Do you have a justification that doesn’t rely on integrating density functions and suchlike? Do you trust your instinct when it comes to probability, or are you cautious without calculation or experimentation to give you a feel for the problem?

August 31, 2010 at 18:08 |

This was interesting. Normally with continuous distributions you need to be careful about saying things like ‘each point occurs with equal probability’, since any single point has probability zero.

Anyway, I thought about this and thought that:

Since each point is just as likely to fall on either side of 0.5 and the distribution is symmetric, you can say the mean for one variable is 0.5.

For the smallest of two this is sort of intuitive:

Split (0,1) into thirds [A, B, C].

If you chuck two U(0,1) values onto this, they can fall into the thirds as

[XX, -, -] [X, X, -] [X, -, X]

[-, XX, -] [-, X, X] [-, -, XX]

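each treated here as equally likely (a rough simplification: the arrangements with one point in each of two different thirds are in fact twice as likely as those with both points in the same third). The top line gives a smallest value less than 1/3, the second a smallest value greater than 1/3. If both lines occurred half the time, that would suggest an expected value of about 1/3. It’s easier to show with a picture.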

August 31, 2010 at 18:10 |

The same construction also seems to work for three U(0,1) variables, giving an expectation of 0.25.

I wouldn’t trust this without algebra though! One without the other always seems dangerous in probability.
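In that spirit, here is a small numerical check of the 0.25 figure, offered as a sketch rather than a proof. It leans on two standard facts: all k values exceed t with probability (1 - t)^k, and the mean of a variable on (0,1) equals the area under P(value > t). The function name `expected_min` and the step count are illustrative assumptions.

```python
def expected_min(k, steps=100_000):
    """Approximate E[min of k independent U(0,1)] as the area under P(min > t).

    All k values exceed t with probability (1 - t)**k, and for a variable
    living on (0, 1) the mean equals the integral of P(value > t) over (0, 1).
    """
    width = 1.0 / steps
    # Midpoint rule: sample the integrand at the centre of each strip.
    return sum((1.0 - (i + 0.5) * width) ** k for i in range(steps)) * width

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(expected_min(k), 4))
```

For one, two and three variables this prints values of about 0.5, 0.3333 and 0.25.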

August 31, 2010 at 18:11 |

Great brain-teaser! I’ve got a nice algebraic way of doing it :o)

Let’s call our expected value M for mean.

We’ve got two random numbers, and let’s think about three possibilities: they’re both bigger than 0.5 (1/4 of the time), they’re both less than 0.5 (1/4 of the time) or one is bigger and one smaller (1/2 of the time).

The last case is the easiest to deal with: the smaller number is the one that fell below 0.5, and it is uniformly distributed between 0 and 0.5, so its EV is 0.25.

Now, if they’re both smaller than 0.5, we’ve just scaled the problem – the expected value has to be M/2. Similarly, if they’re both greater than 0.5, the expected value is 0.5 + M/2.

If we do the algebra:

M = (1/2)*0.25 + (1/4)*(M/2) + (1/4)*(0.5 + M/2)

8M = 1 + M + 1 + M

6M = 2

M = 1/3.

Neat, huh?
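The algebra above can be double-checked with exact fractions. This is only a sketch: the function name `solve_for_m` and the split into a constant part and a coefficient of M are my own bookkeeping, not part of the original argument.

```python
from fractions import Fraction

def solve_for_m():
    half, quarter = Fraction(1, 2), Fraction(1, 4)
    # Pieces of the right-hand side that do not involve M:
    # (1/2)*0.25 from the mixed case, plus (1/4)*0.5 from the both-large case.
    constant = half * Fraction(1, 4) + quarter * half
    # Total coefficient of M: (1/4)*(1/2) from each of the two scaled cases.
    slope = quarter * half + quarter * half
    # M = constant + slope*M  rearranges to  M = constant / (1 - slope).
    return constant / (1 - slope)

m = solve_for_m()
print(m)  # 1/3
```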

August 31, 2010 at 18:12 |

That :o) should have been a :), of course.

August 31, 2010 at 19:19 |

I think my intuition was trying to draw a picture like yours, Steve. I also wondered what would happen with more than two random variables.

Colin, I love your method for this, it’s much neater than the complicated integration methods, and makes sense to my poor non-probability-understanding brain 🙂

September 1, 2010 at 07:00 |

If you are interested in another way of thinking about it, Alison, you might also consider looking (in 3d) down onto a square labelling the values obtained – two sides are axes. Bottom left is ‘both turn out with zero’, top right is ‘both turn out with one’. Imagine the square is the base of an (irregular) pyramid where the height is the minimum obtained. The height along the two axes is zero and the height at (1,1) is 1. A pyramid’s volume is a third of base area times height, here 1/3 × 1 × 1, so the volume of this baby is 1/3, and that’s the answer.

(To visualise this, perhaps look at a discrete case where each variable can take whole values from 0 to 10. Imagine building a ziggurat sort of thing where the height at each point is the minimum value achieved. There will be 10 steps. If you slide each step to be centred in the middle of the square, you’ll see that you end up with a pyramid.)
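The ziggurat picture can be checked by counting blocks. The sketch below assumes unit blocks stacked to height min(i, j) on each grid point, and compares that count with a stack of square layers of sides 10, 9, ..., 1; the function names `ziggurat_volume` and `stacked_squares` are invented for illustration.

```python
def ziggurat_volume(n):
    """Sum of min(i, j) over the grid 0..n by 0..n: the ziggurat's block count."""
    return sum(min(i, j) for i in range(n + 1) for j in range(n + 1))

def stacked_squares(n):
    """Blocks in n square layers with sides n, n-1, ..., 1 (a stepped pyramid)."""
    return sum(side * side for side in range(1, n + 1))

if __name__ == "__main__":
    n = 10
    print(ziggurat_volume(n), stacked_squares(n))
    # Compare with the bounding box: base (n + 1)**2 grid points, height n.
    print(ziggurat_volume(n) / ((n + 1) ** 2 * n))
```

Both counts come to 385 for the 0 to 10 case, which is the sense in which the steps rearrange into a stepped pyramid; the ratio to the bounding box is a little under a third and gets closer as the grid gets finer.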

In terms of intuition, it seems to me that probability is all about counting in the end, making sure that you don’t double count. For continuous distributions you can try to count sets of possibilities which occur with the same probability.