*WARNING: This post contains probability, which is not my strong point. If I have made mathematical or terminological errors, please correct them kindly in the comments.*

Probability can be a bit of a minefield. I try not to rely on my intuition at all when someone asks me a probability question, because it is so often wrong. So imagine my consternation on Friday evening, at the end of a long hard day working with summer school pupils on probability, when a close friend who should have known better gave me a question and asked me to apply my intuition to it.

In order to pose his question, I’d better explain some terminology – skip this paragraph if you’re familiar with the idea of a uniform distribution. Imagine a number line with all the numbers between 0 and 1. You can pick any point on the number line you like. It doesn’t have to be a terminating decimal, it doesn’t even have to be a recurring decimal – it can be rational or irrational, close to 0, close to 1, or anywhere in between. The uniform distribution is a probability distribution where no part of the line is favoured over any other: the point is just as likely to land in one stretch of the line as in any other stretch of the same length.

Start by imagining a random variable uniformly distributed on (0,1). It feels pretty intuitive that the expected value of such a random variable should be 0.5. My good friend’s question was as follows: What happens if you have two independent random variables, both uniformly distributed on (0,1), and you want to know the expected value of the smaller of the two variables?
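For anyone who would rather experiment before trusting their intuition, here is a minimal simulation sketch of the question. The function name `estimate_expected_min`, the number of trials and the seed are all my own illustrative choices, and the result is only an estimate, not a proof.

```python
import random


def estimate_expected_min(trials, seed=0):
    """Estimate the expected value of the smaller of two independent
    uniform random numbers by averaging over many simulated pairs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        # rng.random() returns a float in [0, 1); draw two and keep the smaller
        total += min(rng.random(), rng.random())
    return total / trials


estimate = estimate_expected_min(200_000)
print(f"estimated expected value of the smaller number: {estimate:.4f}")
```

Increasing `trials` should make the estimate settle down more tightly, which gives a feel for the answer without any algebra.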

My intuition said that the answer had to be less than 0.5, because for each random variable the expectation was 0.5, so for the two together and taking the smaller of the two, it just had to be less. My thinking then went very arm-wavy: some of the time they will both be less than a half, some of the time one will be less and one greater, and some of the time they will both be greater than a half, so maybe it’s about a third? My arm-wavy gut instinct was confirmed by my friend’s algebra.
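I haven't reproduced my friend's algebra here, so the following is only one standard route to a third and may not be the one he took. The symbols $X$, $Y$ and $M$ are my own labels for the two random variables and the smaller of the two. It uses the fact that, for a random variable that is never negative, the expected value equals the integral of the probability that it exceeds $m$.

```latex
\begin{align*}
M &= \min(X, Y), \qquad X, Y \text{ independent and uniform on } (0,1) \\
P(M > m) &= P(X > m)\,P(Y > m) = (1 - m)^2 \qquad \text{for } 0 \le m \le 1 \\
E[M] &= \int_0^1 P(M > m)\,dm = \int_0^1 (1 - m)^2\,dm = \frac{1}{3}
\end{align*}
```

The middle line is where independence matters: the smaller number exceeds $m$ exactly when both numbers do.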

I’m not too confident at manipulating probability density functions, so I resorted to my usual tactic when scared off by continuous distributions: I looked for a discrete analogue. What if I had a set of WHOLE numbers, and I picked pairs of them (with replacement, so the same number can come out twice)? I started with the set of numbers from 0 to 6 (I deliberately made the largest number a multiple of 3 in case thirds came into it). Actually that’s not entirely true; I started with the numbers from 1 to 6 and then changed my mind… This was a small enough set that I could list the possibilities and calculate the average, and working on this special case gave me some insights to try working through some algebra for the set of numbers from 0 to n. I leave this as an exercise for the reader 🙂
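The listing-out step of the discrete analogue can be sketched in code. This is my own illustrative version, not the working I did by hand: the function name `mean_of_smaller` is made up, and it simply enumerates every ordered pair from the whole numbers 0 to n (with replacement) and averages the smaller number, using exact fractions. It does not do the algebra for general n, so the exercise is still open.

```python
from fractions import Fraction
from itertools import product


def mean_of_smaller(n):
    """Exact average of min(a, b) over all ordered pairs (a, b)
    drawn with replacement from the whole numbers 0, 1, ..., n."""
    values = range(n + 1)
    pairs = list(product(values, repeat=2))  # (n + 1) ** 2 equally likely pairs
    return Fraction(sum(min(a, b) for a, b in pairs), len(pairs))


# For n = 6 there are 49 pairs and the average of the smaller number is 13/7.
for n in (6, 60, 600):
    mean = mean_of_smaller(n)
    # Dividing by n rescales the whole numbers 0..n onto the interval from 0 to 1.
    print(n, mean, float(mean / n))
```

Watching the rescaled value as n grows is a cheap way to see whether the discrete version is heading towards the same answer as the continuous one.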

What I’d be most interested in, though, is whether anyone has any insight into the original continuous version of the problem – how would you explain to a layman why the answer should be a third? Do you have a justification that doesn’t rely on integrating density functions and suchlike? Do you trust your instinct when it comes to probability, or are you cautious until calculation or experimentation has given you a feel for the problem?