Thursday 10 September 2015

The two envelopes problem

This post concerns the two envelopes problem.

To summarise, suppose we know that one envelope contains some money, and another envelope contains twice that amount. However, we do not know which is which. We choose one of the envelopes, which contains an unknown amount $x$. There is a 50% chance that this is the larger envelope, and a 50% chance it is the smaller. Thus it would seem that the expected value of the other envelope should be $\frac{1}{2}\left(2x+\frac{1}{2}x\right)=\frac{5}{4}x>x$, so that it always appears better to switch. But since this argument would apply equally well to either envelope, it is obviously incorrect.
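
Before dissecting the argument, it is easy to convince oneself numerically that it cannot be right. Here is a minimal Monte Carlo sketch (the amount of 100 and the number of trials are arbitrary choices): with the pair of envelopes fixed and one chosen at random, switching and keeping give exactly the same average payoff.

```python
import random

# Minimal sketch: one envelope holds `small`, the other twice that amount.
# We pick an envelope uniformly at random and compare the average payoff of
# keeping it against the average payoff of switching.
small = 100.0
trials = 1_000_000

keep_total = 0.0
switch_total = 0.0
for _ in range(trials):
    envelopes = [small, 2 * small]
    random.shuffle(envelopes)
    chosen, other = envelopes
    keep_total += chosen
    switch_total += other

print("average if we keep:  ", keep_total / trials)    # ~150
print("average if we switch:", switch_total / trials)  # ~150, not 1.25 times the kept amount
```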

The problem is a little tricky, but the error in the argument is clear once you spot it.

The mistake is that one cannot talk about expected values in the absence of a prescribed probability distribution. Suppose someone puts some money in an envelope. What is the expected value of the amount of money in the envelope? It's clearly a nonsense question. If every positive amount were equally likely, the expected value would be infinite (and in fact no such uniform distribution even exists). Similarly, suppose someone puts some money in one envelope, and twice that amount in another envelope. What is the expected value of the amount of money in either envelope? Again, no reasonable answer can be given. Therefore, having supposed that the first envelope contains $x$, the probability distribution for the second envelope is simply unknown. The only thing we can say about it is that values other than $x/2$ and $2x$ are not possible.

While it is true that the other envelope must contain either $x/2$ or $2x$, it is not true that these must be equally likely. For instance, if the person preparing the envelopes only ever uses one particular pair of amounts, then the possible contents span only a factor of two, so $x/2$ and $2x$ cannot both be possible. More generally, the amount contained in the first envelope and whether it is the larger or the smaller of the two are not independent random variables. Although it is correct to say that there is a 50% chance that $x$ is the smaller value and a 50% chance that it is the larger value, the distribution of $x$ may be different in each case!
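
To make this concrete, here is a small simulation under one assumed prior (the smaller amount is drawn uniformly from $\{1, 2, 4, 8\}$, an arbitrary choice). Conditioned on the amount $x$ we find in our envelope, the chance that the other envelope holds $2x$ is not 50% across the board: it is certain for the smallest possible $x$ and impossible for the largest.

```python
import random
from collections import defaultdict

# Sketch with an assumed prior: the smaller amount is drawn uniformly from
# {1, 2, 4, 8}, the other envelope holds double, and we open one at random.
# For each observed amount x we tabulate how often the unopened envelope
# holds 2x rather than x/2.
random.seed(0)
counts = defaultdict(lambda: [0, 0])  # x -> [other was 2x, other was x/2]

for _ in range(1_000_000):
    smaller = random.choice([1, 2, 4, 8])
    envelopes = [smaller, 2 * smaller]
    random.shuffle(envelopes)
    x, other = envelopes
    counts[x][0 if other == 2 * x else 1] += 1

for x in sorted(counts):
    double, half = counts[x]
    print(f"x = {x:2d}: P(other envelope holds 2x) = {double / (double + half):.2f}")
# x = 1 always doubles, x = 16 never does; only the interior values come out near 0.5.
```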

In summary, one must begin with the joint probability distribution for the two envelopes (whose contents are not independent) before one can talk about expected values.

Tuesday 25 August 2015

When does a smooth function fail to be analytic?

It's well known that not all smooth functions $f(x)$ are analytic, i.e. can be locally represented by a power series \[ f(x) = f(0) + f'(0)x + \frac{1}{2!}f''(0)x^2 + \frac{1}{3!}f^{(3)}(0)x^3 + \ldots \] The counterexample typically given is the function \[ f(x) = \left\{ \begin{array}{lr} 0 & : x \leq 0 \\ e^{-1/x} & : x > 0 \end{array} \right. \] However, I thought it would be worthwhile to make explicit why the failure occurs.
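
As a quick sanity check (purely illustrative, using sympy), one can verify that every derivative of the $x > 0$ branch $e^{-1/x}$ tends to zero at the origin. Together with the identically zero branch for $x \leq 0$, this means every Taylor coefficient of $f$ at $0$ vanishes, so the series sums to the zero function and cannot reproduce $f$ for $x > 0$.

```python
import sympy as sp

# The right-hand branch of the counterexample.
x = sp.symbols('x', positive=True)
f = sp.exp(-1 / x)

# Each derivative of e^{-1/x} tends to 0 as x -> 0+, so every Taylor
# coefficient of the piecewise function at the origin is zero.
for n in range(6):
    deriv = sp.diff(f, x, n)
    print(f"limit of the {n}th derivative as x -> 0+:", sp.limit(deriv, x, 0, dir='+'))
```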

Suppose we are interested in the interval $[0,\varepsilon]$. An $n$th-order polynomial is a function whose $n$th derivative is constant. So it stands to reason that, as long as a function's $n$th derivative doesn't change "too much" over this interval, the function should be well approximated by its $n$th-order Taylor polynomial.

Let's write \[ f^{(n)}(x) = f^{(n)}(0) + E_{n}(x), \] where $E_{n}(x)$ is the error term representing how much $f^{(n)}$ differs from a constant. Then we can integrate to get \[ f^{(n-1)}(x) = f^{(n-1)}(0) + f^{(n)}(0)x + \int_0^x{E_n(t)dt}, \] where the error in the $(n-1)$th derivative is \[ \left| E_{n-1}(x) \right| = \left| \int_0^x{E_n(t)dt} \right| \leq \sup_{t \in [0,\varepsilon]}{\left| E_n(t) \right|}\,x. \] Iterating this process, we eventually find that the original function differs from its $n$th-order Taylor polynomial by \[ \left| E_0(x) \right| \leq \frac{1}{n!} \sup_{t \in [0,\varepsilon]}{\left| E_n(t) \right|}\,x^n. \] For the function to be analytic, this expression needs to converge to zero as $n \to \infty$. Due to the $n!$ in the denominator, this can only fail if $E_n$, and hence the $n$th derivative, grows extremely quickly as $n$ increases. In this sense, non-analytic functions are pathological.
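
We can check this numerically for the counterexample above. The sketch below ($\varepsilon = 0.5$ and the evaluation grid are arbitrary choices) estimates $\sup_{t \in (0,\varepsilon]}|f^{(n)}(t)|$; since every derivative of the counterexample vanishes at the origin, this is exactly $\sup|E_n|$, and the resulting bound on $|E_0(\varepsilon)|$ grows rapidly with $n$ rather than tending to zero.

```python
import math
import numpy as np
import sympy as sp

# Rough numerical sketch (eps and the grid are arbitrary choices). For the
# counterexample, every derivative at 0 vanishes, so E_n(t) = f^(n)(t) on
# (0, eps]. We estimate sup|E_n| on a grid and evaluate the remainder bound
# sup|E_n| * eps**n / n! derived above.
x = sp.symbols('x')
f = sp.exp(-1 / x)
eps = 0.5
grid = np.linspace(1e-4, eps, 20_000)

for n in (2, 4, 6, 8, 10):
    nth_deriv = sp.lambdify(x, sp.diff(f, x, n), 'numpy')
    sup_En = float(np.max(np.abs(nth_deriv(grid))))
    bound = sup_En * eps**n / math.factorial(n)
    print(f"n = {n:2d}: sup|E_n| ~ {sup_En:.3e}, bound on |E_0(eps)| ~ {bound:.3e}")
# The bound increases with n: the nth derivative outruns the n! in the
# denominator, which is exactly the pathology described above.
```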

We can observe this pathological behaviour for our counterexample by plotting a few of the derivatives. While the 3rd derivative (left) shoots up to around 30 within $0.15$ of the origin, the 10th derivative (right) is already of order $10^{14}$ only about $0.04$ from the origin!
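
For completeness, here is a matplotlib sketch of how plots like these can be produced (the axis ranges are arbitrary choices made to roughly match the description above).

```python
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt

# Plot the 3rd and 10th derivatives of e^{-1/x} near the origin.
x = sp.symbols('x')
f = sp.exp(-1 / x)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, n in zip(axes, (3, 10)):
    nth_deriv = sp.lambdify(x, sp.diff(f, x, n), 'numpy')
    t = np.linspace(1e-3, 0.2, 2000)
    ax.plot(t, nth_deriv(t))
    ax.set_title(f"{n}th derivative of exp(-1/x)")
    ax.set_xlabel("x")
plt.tight_layout()
plt.show()
```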