Wednesday, November 23, 2011

Amusing probability case

Write down a decimal point. Then choose a digit at random, with equal probability 1/10 of each possible digit. Repeat ad infinitum, with all the digits chosen independently. Let X be the number you've written down the infinite decimal expansion of.
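A finite truncation of this procedure can be sketched in Python. This is only an illustration: the function name `truncated_x`, the cutoff of 12 digits, and the seed are assumptions made here, and no finite run can produce the full infinite expansion.

```python
import random
from fractions import Fraction

def truncated_x(n_digits, rng):
    """Pick n_digits independent random digits and return them with the exact value 0.d1 d2 ... dn."""
    # Each digit is one of 0..9, chosen with probability 1/10, independently of the others.
    digits = [rng.randrange(10) for _ in range(n_digits)]
    # The k-th digit after the decimal point contributes d_k / 10^k.
    value = sum(Fraction(d, 10 ** k) for k, d in enumerate(digits, start=1))
    return digits, value

rng = random.Random(2011)  # arbitrary seed (an assumption) so the run is repeatable
digits, x = truncated_x(12, rng)
print("0." + "".join(map(str, digits)), "=", x)
```

Using `Fraction` keeps the truncated value exact, so no floating-point rounding is mixed into the illustration.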

Suppose you find out that X is going to be either 1/4 or 1/3. Which of the two is more likely? Answer: 1/4. For there are two ways of getting 1/4: 0.250000... and 0.249999.... But there is only one way of getting 1/3: 0.333333..., and each infinite sequence is equally likely. Thus, intuitively P(X=1/4 | X=1/3 or X=1/4)=2/3. Surprised?
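The counting argument can be checked on finite prefixes. The sketch below assumes we look only at the first n digits (with n at least 2, so that the two expansions of 1/4 have different prefixes); the function names are hypothetical. Every n-digit prefix has the same probability 10^-n, so the conditional probability reduces to counting prefixes.

```python
from fractions import Fraction

def expansion_prefixes(n):
    """Return the n-digit prefixes of the expansions of 1/4 and of 1/3 named in the post."""
    quarter_terminating = ("25" + "0" * n)[:n]  # 0.250000...
    quarter_repeating = ("24" + "9" * n)[:n]    # 0.249999...
    third = "3" * n                             # 0.333333...
    return {quarter_terminating, quarter_repeating}, {third}

def conditional_probability_of_quarter(n):
    """P(prefix fits an expansion of 1/4 | prefix fits an expansion of 1/4 or of 1/3)."""
    quarter, third = expansion_prefixes(n)
    per_prefix = Fraction(1, 10 ** n)  # every n-digit prefix is equally likely
    p_quarter = per_prefix * len(quarter)
    p_third = per_prefix * len(third)
    return p_quarter / (p_quarter + p_third)

for n in (2, 5, 20):
    print(n, conditional_probability_of_quarter(n))
```

For every n of at least 2 the result is 2/3, independent of n, which is what makes the limiting value 2/3 intuitive even though each individual expansion has probability zero.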

Here is another interesting fact. In the technical probability-theory sense, X is uniformly distributed on the interval [0,1]. But in the intuitive sense it is not, since numbers with two decimal expansions, such as 1/4, come out twice as likely as numbers with only one. So the technical probability-theory sense does not capture the intuitive notion of uniform distribution.

Similarly, the technical probability-theory sense of independence does not capture the intuitive notion of independence. Suppose that a random process uniformly picks out a number Y in the interval [0,1], and suppose you get a dollar if and only if the number is 1/2. Let A be the event that the number picked out is 1/2 and let B be the event that you get a dollar. Then P(A&B)=P(A)=0=P(A)P(B), and hence in the probability-theoretic sense A and B are independent. But intuitively they are far from independent: B is entirely determined by A.
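The product-rule check can be written out with exact arithmetic. The sketch below first uses the probabilities from the paragraph above, then contrasts them with a hypothetical discrete analogue (an assumption introduced here, not part of the post) in which Y is one of n equally likely values, exactly one of which is 1/2. The helper name `satisfies_product_rule` is also hypothetical.

```python
from fractions import Fraction

def satisfies_product_rule(p_a, p_b, p_a_and_b):
    """Return True when P(A&B) == P(A)P(B), the standard definition of independence."""
    return p_a_and_b == p_a * p_b

# Continuous case from the post: P(A) = P(B) = P(A&B) = 0.
print(satisfies_product_rule(Fraction(0), Fraction(0), Fraction(0)))

# Hypothetical discrete analogue: B still occurs exactly when A does,
# so P(A) = P(B) = P(A&B) = 1/n, which differs from 1/n^2 for n >= 2.
for n in (2, 10, 1000):
    p = Fraction(1, n)
    print(n, satisfies_product_rule(p, p, p))
```

The first line printed is `True` and the remaining lines report `False`: the dependence between A and B is visible to the product rule whenever the events have positive probability, and becomes invisible only when the probability is zero.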

Maybe a better definition of independence for philosophical (though maybe not mathematical) purposes is that both P(A|B)=P(A) and P(B|A)=P(B). But then conditional probabilities should not be defined by ratios of unconditional probabilities, since the ratio P(A&B)/P(B) is undefined when P(B)=0.

9 comments:

James Bejon said...

My worry about your argument based on counting the number of possible expansions a number has is that it depends on the base you choose to use.

Alexander R Pruss said...

Yes: the point is that the decimal (or binary or whatever) expansion method isn't a good way of uniformly generating a random real number if one cares about zero-probability issues.

John Jones said...

Can we respond to a command to write down a number? The idea of an independent, stand-alone number looks more like a numeral. In which case, numbers such as 3.333.. have no numeral counterpart (dots don't signify a numeral) and so cannot be counted.

James Bejon said...

Ah, I see...

James said...

There's a mistake, actually. What's uniformly distributed are the various decimal expansions, not the rational numbers they are equivalent to. 0.2499999... and 0.25 are different sequences of numbers, but they aren't different real numbers. The sequences are what is distributed uniformly.

Alexander R Pruss said...

Actually, in the technical mathematical sense, this procedure does generate a uniform distribution. The technical mathematical sense is that P(A) = P(A#a) for all measurable subsets A of [0,1] and real numbers a in [0,1], where A#a = { y : there is an x in A such that y = x#a }, and where x#a is addition modulo 1.

Alexander R Pruss said...

Probability measures that agree outside of a set of zero probability are equivalent, and the set of real numbers for which the decimal expansion is non-unique has probability zero, being merely countable.

John Jones said...

The post presents issues of semiotics rather than problems in mathematics and independence/probability theories.

For example, a sign that is taken at random is not a random sign, that is, is not context free. It is a symbol. And especially, that context and its symbol cannot be deemed to be mathematical (where you say "the number picked out is 1/2") simply on the grounds that the glyph has the same shape (1/2) as a mathematical symbol. So I think your investigation must first be grounded by undertaking a semiotic analysis. This would encompass the terms used and the appropriate context for their use - I think that you are applying terms outside appropriate contexts. Sorry if that sounds gloomy.

Alexander R Pruss said...

First of all, this can be given a simple, fully mathematical description, with all the stuff on "writing" removed: Let X_1, X_2, ... be a sequence of independent random variables uniformly distributed on {0,1,...,9} and let X be the sum over n ≥ 1 of 10^-n X_n.

And if you want to keep the writing, you can add all the context you like for the glyphs, and then ask: Under such-and-such an interpretation of the glyphs, let X be the number represented by this infinite sequence of glyphs.