Mean of the bootstrap sample vs statistic of the sample

Say I have a sample and the bootstrap sample from this sample for a statistic $\chi$ (e.g. the mean). As we all know, this bootstrap sample estimates the sampling distribution of the estimator of the statistic. Now, is the mean of this bootstrap sample a better estimate of the population statistic than the statistic of the original sample? Under what conditions would that be the case?

Amelio Vazquez-Reina asked Jan 14, 2015 at 13:53

The mean of the bootstrap sample is the mean of the sample and you do not need a bootstrap sample in this case.

Commented Jan 14, 2015 at 14:58

Thanks @Xi'an I am not sure I follow. The mean of the bootstrap sample can be numerically different from the mean of the sample. Are you trying to say that the two are still theoretically equivalent? Can you confirm on both ends?

Commented Jan 14, 2015 at 15:06

Let's get our terminology clear: "bootstrap sample" could refer either to a specific sample-with-replacement from the data or it could refer to a (multivariate) random variable of which such a sample would be considered one realization. You are correct that the mean of a realization can differ from the mean of the data, but @Xi'an provides the more relevant observation that the mean of the random variable (which by definition is the bootstrap estimate of the population mean) must coincide with the mean of the data.

Commented Jan 14, 2015 at 15:55

Then your question is almost identical to stats.stackexchange.com/questions/126633/…; the only difference is that the bootstrap sample realizations can overlap, but the analysis given in the answer there is easily carried over to the bootstrap situation, with the same result.

Commented Jan 14, 2015 at 16:46

I see the connection @whuber, although in bootstrap one has "subsets with replacement" and the realizations may overlap, as you said. I would imagine that the distribution (e.g. pseudorandomness) used to get the re-samples in bootstrap can also affect the bias of the estimate from the bootstrap sample. Perhaps the answer is that for all practical matters the difference is negligible. This is what the question is after: conditions, subtleties, and the difference in practice.

Commented Jan 14, 2015 at 16:50

2 Answers


Let's generalize, so as to focus on the crux of the matter. I will spell out the tiniest details so as to leave no doubts. The analysis requires only the following:

  1. The arithmetic mean of a set of numbers $z_1, \ldots, z_m$ is defined to be $$\frac{1}{m}\left(z_1 + \cdots + z_m\right).$$
  2. Expectation is a linear operator. That is, when $Z_i, i=1,\ldots,m$ are random variables and $\alpha_i$ are numbers, then the expectation of a linear combination is the linear combination of the expectations, $$\mathbb{E}\left(\alpha_1 Z_1 + \cdots + \alpha_m Z_m\right) = \alpha_1 \mathbb{E}(Z_1) + \cdots + \alpha_m\mathbb{E}(Z_m).$$

Let $B$ be a sample $(B_1, \ldots, B_k)$ obtained from a dataset $x = (x_1, \ldots, x_n)$ by taking $k$ elements uniformly from $x$ with replacement. Let $m(B)$ be the arithmetic mean of $B$. This is a random variable. Then

$$\mathbb{E}(m(B)) = \mathbb{E}\left(\frac{1}{k}\left(B_1+\cdots+B_k\right)\right) = \frac{1}{k}\left(\mathbb{E}(B_1) + \cdots + \mathbb{E}(B_k)\right)$$

follows by linearity of expectation. Since the elements of $B$ are all obtained in the same fashion, they all have the same expectation, $b$ say:

$$\mathbb{E}(B_1) = \cdots = \mathbb{E}(B_k) = b.$$

This simplifies the foregoing to

$$\mathbb{E}(m(B)) = \frac{1}{k}\left(b + b + \cdots + b\right) = \frac{1}{k}\left(k b\right) = b.$$

By definition, the expectation is the probability-weighted sum of values. Since each value of $x$ is assumed to have an equal chance of $1/n$ of being selected,

$$\mathbb{E}(m(B)) = b = \mathbb{E}(B_1) = \frac{1}{n}x_1 + \cdots + \frac{1}{n}x_n = \frac{1}{n}\left(x_1 + \cdots + x_n\right) = \bar x,$$

the arithmetic mean of the data.

To answer the question: if one uses the data mean $\bar x$ to estimate the population mean, then the bootstrap mean (the case $k=n$) also equals $\bar x$, and the two are therefore identical as estimators of the population mean.
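This can be checked numerically. Below is a minimal Python sketch (the dataset and replication count are illustrative choices): any single bootstrap mean can differ from $\bar x$, but the average of many bootstrap means converges to $\bar x$, as the derivation shows.

```python
import random

random.seed(1)

# Illustrative dataset; its mean xbar is the quantity of interest.
x = [2.0, 5.0, 1.0, 7.0, 4.0]
xbar = sum(x) / len(x)

# Draw many bootstrap samples (size n, uniformly with replacement)
# and record the mean of each one.
n_boot = 100_000
boot_means = [
    sum(random.choice(x) for _ in x) / len(x)
    for _ in range(n_boot)
]

# The average of the bootstrap means matches xbar up to Monte Carlo error.
avg_boot_mean = sum(boot_means) / n_boot
print(xbar, avg_boot_mean)
```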

For statistics that are not linear functions of the data, the same result does not necessarily hold. It would be wrong, however, simply to substitute the bootstrap mean for the statistic's value on the data: that is not how bootstrapping works. Instead, comparing the bootstrap mean to the data statistic gives information about the bias of the statistic, which can be used to adjust the original statistic to remove the bias. The bias-corrected estimate thereby becomes an algebraic combination of the original statistic and the bootstrap mean. For more information, look up "BCa" (the bias-corrected and accelerated bootstrap) and "ABC". Wikipedia provides some references.
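As a concrete illustration, here is a sketch of the simple bias-corrected estimate (not the full BCa procedure; the dataset and choice of statistic are illustrative). The plug-in variance, with divisor $n$, is a biased statistic; the bootstrap estimates its bias as the average of the statistic over bootstrap samples minus its value on the data, and subtracting that bias combines both quantities algebraically.

```python
import random

random.seed(2)

def plugin_var(data):
    """Plug-in (divisor-n) variance: a biased estimator of population variance."""
    m = sum(data) / len(data)
    return sum((v - m) ** 2 for v in data) / len(data)

x = [2.0, 5.0, 1.0, 7.0, 4.0, 6.0, 3.0, 8.0]
theta_hat = plugin_var(x)

# Bootstrap estimate of bias: average of the statistic over bootstrap
# samples, minus the statistic on the original data.
n_boot = 50_000
boot_stats = [
    plugin_var([random.choice(x) for _ in x])
    for _ in range(n_boot)
]
boot_mean = sum(boot_stats) / n_boot
bias = boot_mean - theta_hat

# Simple bias-corrected estimate: theta_hat - bias = 2*theta_hat - boot_mean,
# an algebraic combination of the original statistic and the bootstrap mean.
theta_corrected = theta_hat - bias
print(theta_hat, theta_corrected)
```

Since the plug-in variance is biased downward, the estimated bias is negative and the corrected estimate is larger than the raw one, as expected.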