Bias
The bias of an estimate is the difference between the expectation value of the point estimate and the value of the parameter,

\[\text{bias} = \left\langle\hat{\theta}\right\rangle - \theta.\]
Note that the expectation value of \(\hat{\theta}\) is computed over the (unknown) generative distribution whose PDF is \(f(x)\).
Bias of the plug-in estimate for the mean
We often want a small bias because we want to choose estimates that give us back the parameters we expect. Let’s first investigate the bias of the plug-in estimate of the mean. As a reminder, the plug-in estimate is

\[\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i,\]

where \(\bar{x}\) is the arithmetic mean of the observed data. To compute the bias of the plug-in estimate, we need to compute \(\langle \hat{\mu}\rangle\) and compare it to \(\mu\).
The expectation value is

\[\langle \hat{\mu}\rangle = \langle \bar{x}\rangle = \frac{1}{n}\sum_{i=1}^n \langle x_i\rangle = \frac{1}{n}\,n\mu = \mu.\]

Because \(\langle \hat{\mu}\rangle = \mu\), the bias in the plug-in estimate for the mean is zero. It is said to be unbiased.
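As a quick numerical check (a minimal sketch using NumPy; the Normal generative distribution, its parameters, and the sample size are arbitrary choices for illustration), we can average the plug-in estimate of the mean over many simulated datasets and see that it matches the true mean:

```python
import numpy as np

rng = np.random.default_rng(3252)

# True parameters of an (arbitrarily chosen) Normal generative distribution
mu, sigma = 2.0, 1.5
n = 10              # size of each simulated dataset
n_trials = 100_000  # number of simulated datasets

# Draw many datasets; each row is one dataset of n measurements
x = rng.normal(mu, sigma, size=(n_trials, n))

# Plug-in estimate of the mean for each dataset
mu_hat = x.mean(axis=1)

# Averaging over datasets approximates ⟨μ̂⟩, which equals μ
print(np.mean(mu_hat))  # ≈ 2.0
```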
Bias of the plug-in estimate for the variance
To compute the bias of the plug-in estimate for the variance, first recall that the plug-in estimate of the variance, as the second central moment, is computed as

\[\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \left(x_i - \bar{x}\right)^2 = \overline{x^2} - \bar{x}^2,\]

where \(\overline{x^2}\) denotes the arithmetic mean of the squares of the observed data. So, the expectation value of the plug-in estimate is

\[\left\langle\hat{\sigma}^2\right\rangle = \left\langle\overline{x^2}\right\rangle - \left\langle\bar{x}^2\right\rangle = \left\langle x^2\right\rangle - \left\langle\bar{x}^2\right\rangle.\]
We now need to compute \(\left\langle\bar{x}^2\right\rangle\), which is a little trickier. We will use the fact that the measurements are independent, so \(\left\langle x_i x_j\right\rangle = \langle x_i \rangle \langle x_j\rangle\) for \(i\ne j\).
Thus, we have

\[\left\langle\bar{x}^2\right\rangle = \frac{1}{n^2}\left\langle\left(\sum_{i=1}^n x_i\right)^2\right\rangle = \frac{1}{n^2}\left(\sum_{i=1}^n\left\langle x_i^2\right\rangle + \sum_{i\ne j}\langle x_i\rangle\langle x_j\rangle\right) = \frac{1}{n}\left\langle x^2\right\rangle + \frac{n-1}{n}\,\mu^2,\]

so that

\[\left\langle\hat{\sigma}^2\right\rangle = \left\langle x^2\right\rangle - \frac{1}{n}\left\langle x^2\right\rangle - \frac{n-1}{n}\,\mu^2 = \frac{n-1}{n}\left(\left\langle x^2\right\rangle - \mu^2\right) = \frac{n-1}{n}\,\sigma^2.\]
Therefore, the bias is

\[\left\langle\hat{\sigma}^2\right\rangle - \sigma^2 = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}.\]
If \(\hat{\sigma}^2\) is the plug-in estimate for the variance, an unbiased estimator would instead be

\[\frac{n}{n-1}\,\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^n \left(x_i - \bar{x}\right)^2.\]
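The \(-\sigma^2/n\) bias is easy to see in simulation (again a sketch with NumPy; the Normal distribution and sample size are arbitrary choices). NumPy exposes both estimators through the `ddof` argument of `np.var`: `ddof=0` divides by \(n\) (the plug-in estimate) and `ddof=1` divides by \(n-1\) (the unbiased estimate).

```python
import numpy as np

rng = np.random.default_rng(3252)

mu, sigma = 2.0, 1.5  # true parameters; σ² = 2.25
n = 10
n_trials = 100_000

# Draw many datasets; each row is one dataset of n measurements
x = rng.normal(mu, sigma, size=(n_trials, n))

# Plug-in (biased) estimate: divides by n
var_plugin = np.var(x, axis=1, ddof=0)

# Unbiased estimate: divides by n - 1
var_unbiased = np.var(x, axis=1, ddof=1)

# Averaging over datasets approximates the expectation values
print(np.mean(var_plugin))    # ≈ (n-1)/n · σ² = 0.9 · 2.25 ≈ 2.025
print(np.mean(var_unbiased))  # ≈ σ² = 2.25
```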
Justification of using plug-in estimates
Despite the apparent bias in the plug-in estimate for the variance, we will normally just use plug-in estimates going forward. (We will use the hat, e.g. \(\hat{\theta}\), to denote an estimate, which can be either a plug-in estimate or not.) Note that the bootstrap procedures we lay out in what follows do not need to use plug-in estimates, but we will use them for convenience. Why do this? The bias is typically small. We just saw that the biased and unbiased estimators of the variance differ by a factor of \(n/(n-1)\), which is negligible for large \(n\). In fact, the bias of a plug-in estimate tends to be much smaller than the width of the confidence interval for the parameter estimate, which we will discuss next.
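To see how quickly the \(n/(n-1)\) factor becomes negligible, we can compute the two variance estimates for a single dataset of moderate size (a minimal sketch; the dataset here is arbitrary simulated data):

```python
import numpy as np

rng = np.random.default_rng(3252)

# A single dataset of moderate size (arbitrary simulated data)
n = 1000
x = rng.normal(2.0, 1.5, size=n)

var_plugin = np.var(x, ddof=0)    # plug-in (biased) estimate
var_unbiased = np.var(x, ddof=1)  # unbiased estimate

# The two estimates differ only by the factor n / (n - 1)
print(var_unbiased / var_plugin)  # ≈ 1.001
print(n / (n - 1))                # ≈ 1.001
```

For \(n = 1000\), the two estimates differ by about 0.1%, far smaller than typical confidence interval widths.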