unknown variance of concave distribution
A strange fact that was not immediately obvious to me is that the variance of a uniform distribution on \([a,b]\) can act as a conservative estimate for the unknown variance \(\sigma^2\) of many other distributions, namely concave distributions on \([a,b]\).
The uniform distribution on \([a,b]\) has variance \((b-a)^2/12\). For any concave distribution on \([a,b]\) (one whose PDF lies above every line segment joining two points on its graph), the variance \(\sigma^2\) satisfies \(\sigma^2 \le (b-a)^2/12\). The starting point is that any concave PDF is unimodal, and on \([0,1]\) the variance of a unimodal distribution with mean \(\mu\) cannot exceed \(\mu(2-3\mu)/3\) for \(\mu \le 1/2\) or \((1-\mu)(3\mu-1)/3\) for \(\mu \ge 1/2\). It takes some work to show this, which we’ll expand on shortly. Note that these bounds peak at \(1/9\) (at \(\mu = 1/3\) and \(\mu = 2/3\)), not \(1/12\), so unimodality alone is not enough; the unimodal maximizers place a point mass at an endpoint, which a concave density cannot imitate, and concavity tightens the constant to \(1/12\), with the uniform itself as the extreme case. Rescaling to \([a,b]\) gives the general result. When the mean \(\mu\) is known, the unimodal bound becomes: for \(\mu \le (a+b)/2\), \(\sigma^2 \le (\mu-a)(2b+a-3\mu)/3\), with a symmetric expression for \(\mu > (a+b)/2\).
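As a quick numerical spot check (illustrative examples, not a proof), two standard concave densities on \([0,1]\) with closed-form variances both fall below \(1/12\):

```python
# Spot check: variances of two concave densities on [0, 1] against 1/12.

# Beta(2, 2) has density 6x(1 - x), which is concave; its variance is
# alpha*beta / ((alpha + beta)^2 * (alpha + beta + 1)) = 1/20.
var_beta22 = (2 * 2) / ((2 + 2) ** 2 * (2 + 2 + 1))

# The triangular distribution on [0, 1] with mode c has a concave
# tent-shaped density and variance (1 + c^2 - c) / 18.
c = 0.3
var_tri = (1 + c**2 - c) / 18

bound = 1 / 12
print(var_beta22 <= bound, var_tri <= bound)  # True True
```

Both variances (\(1/20\) and \(\approx 0.0439\)) sit comfortably below \(1/12 \approx 0.0833\).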
The supremum of the variance of unimodal distributions on \([0,1]\) with mean \(\mu\) is \(\mu(2 - 3\mu)/3\) for \(0 \le \mu \le 1/2\) or \((1-\mu)(3\mu-1)/3\) for \(1/2 \le \mu \le 1\). This supremum is achieved by a distribution that, while lacking a density function, can be considered “unimodal” in a generalized sense. Specifically, it has a point mass at \(0\) (when \(\mu < 1/2\)) or at \(1\) (when \(\mu > 1/2\)), with the rest of the distribution being uniform.
To derive this, we optimize the second moment \(\mathbb{E}[x^2]\) of a unimodal distribution on \([0,1]\) under the constraints of normalization (\(\int_0^1 f(x) \, dx = 1\)), mean (\(\int_0^1 x f(x) \, dx = \mu\)), and unimodality (non-increasing density on either side of a mode \(\lambda\)). The optimal distribution is piecewise constant, with density level \(f_\ell = (1 + \lambda - 2\mu)/\lambda\) on \([0,\lambda)\) and \(f_r = (2\mu - \lambda)/(1 - \lambda)\) on \((\lambda,1]\). The second moment \(\mathbb{E}[x^2] = \frac{1}{3}(2\mu + (2\mu - 1)\lambda)\) is linear in \(\lambda\), so it is maximized at \(\lambda = 0\) (for \(\mu < 1/2\)) or \(\lambda = 1\) (for \(\mu > 1/2\)). When \(\mu = 1/2\), the second moment is constant in \(\lambda\).
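The algebra here is easy to verify numerically. In the sketch below (the helper name and the values \(\mu = 0.4\), \(\lambda = 0.25\) are my own choices), the two density levels are called `fl` and `fr` to avoid clashing with the interval endpoints, and closed-form integrals of the piecewise-constant density confirm normalization, the mean, and the second-moment formula:

```python
# Verify the piecewise-constant construction: for given mu and lam, build
# the two density levels, then check that the integral of f is 1, the
# integral of x*f is mu, and the integral of x^2*f matches
# (2*mu + (2*mu - 1)*lam) / 3.

def check(mu, lam):
    fl = (1 + lam - 2 * mu) / lam    # density level on [0, lam)
    fr = (2 * mu - lam) / (1 - lam)  # density level on (lam, 1]
    total = fl * lam + fr * (1 - lam)               # integral of f
    mean = fl * lam**2 / 2 + fr * (1 - lam**2) / 2  # integral of x f
    m2 = fl * lam**3 / 3 + fr * (1 - lam**3) / 3    # integral of x^2 f
    return total, mean, m2

mu, lam = 0.4, 0.25
total, mean, m2 = check(mu, lam)
m2_formula = (2 * mu + (2 * mu - 1) * lam) / 3
assert abs(total - 1) < 1e-12 and abs(mean - mu) < 1e-12
assert abs(m2 - m2_formula) < 1e-12
```

Since the second moment is linear in \(\lambda\) and here \(\mu < 1/2\), sweeping \(\lambda\) toward \(0\) only increases it, matching the argument above.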
In the limits, the optimal distribution approaches a uniform distribution with a point mass at \(0\) (for \(\mu < 1/2\)) or at \(1\) (for \(\mu > 1/2\)). These distributions, though not continuous, satisfy the unimodality condition and achieve the supremum of the variance \(\sigma^2_\mu\).
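This limiting case is easy to simulate. A Monte Carlo sketch (the choices \(\mu = 0.3\) and \(n = 200{,}000\) are arbitrary): draw from the mixture that puts mass \(1 - 2\mu\) at \(0\) and spreads the remaining \(2\mu\) uniformly over \([0,1]\), then compare the sample variance to \(\mu(2-3\mu)/3\):

```python
import random

random.seed(0)
mu, n = 0.3, 200_000

# Mixture: with probability 2*mu draw Uniform(0, 1), otherwise return 0.
xs = [random.random() if random.random() < 2 * mu else 0.0 for _ in range(n)]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
target = mu * (2 - 3 * mu) / 3  # = 0.11 for mu = 0.3
print(abs(mean - mu) < 0.01, abs(var - target) < 0.01)  # True True
```

The sample mean lands near \(\mu\) (the mixture mean is \(2\mu \cdot \tfrac{1}{2} = \mu\)) and the sample variance near the supremum \(\mu(2-3\mu)/3\).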
Note: Recall how to find the variance of a uniform distribution \(X\) on \([a,b]\).
We know variance is the difference of moments \(\mathbb{V}[X] = \mathbb{E}[X^2] - (\mathbb{E} [X])^2\).
As we see \(\mathbb{E} [X] = \frac{1}{b-a}\int_{[a,b]}x dx = \frac{a+b}{2}\), and \(\mathbb{E} [X^2] = \frac{1}{b-a}\int_{[a,b]}x^2 dx = \frac{b^3-a^3}{3(b-a)}=\frac{a^2+ab+b^2}{3}\), we then see \[ \mathbb{V} [X] = \frac{a^2+ab+b^2}{3} - \frac{a^2+2ab+b^2}{4} = \frac{a^2-2ab+b^2}{12} =\frac{(b-a)^2}{12} \]
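For a concrete interval, say \([2,5]\) (values chosen arbitrarily), the moment computation and the closed form agree:

```python
# Check V[X] = (b - a)^2 / 12 against the moment calculation above.
a, b = 2.0, 5.0
ex = (a + b) / 2                 # E[X] = (a + b) / 2
ex2 = (a**2 + a * b + b**2) / 3  # E[X^2] = (a^2 + ab + b^2) / 3
var = ex2 - ex**2
print(var, (b - a) ** 2 / 12)  # 0.75 0.75
```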
Inspired by this discussion and also this discussion, with help from this calculation.