Two sophisms concerning finite populations
Many undergraduates ask the statistics teacher why bigger populations don’t require bigger samples, and the teacher shows them the Bernoulli model and the normal model with known variance. I show an example where the undergraduates are right and the teacher is wrong. Some graduate students majoring in statistics think that the Bahadur-Savage theorem states that nonrandomized nonparametric confidence intervals for the mean of an unbounded population must cover the whole domain. That is not what Bahadur and Savage said, and I give an example of what can go wrong. The two examples are one and the same.
First sophism
Let a finite population have 20 numbers with ties forbidden, and let us take from it a sample of 19 numbers chosen at random without replacement. Let the most negative number in the population be called “leftMost.” Then the probability that leftMost will be in the sample is 19/20, which is 95 percent. Let a confidence interval go from the sample minimum (inclusive) to positive infinity (exclusive). Whenever leftMost is in the sample, this interval contains every number in the population, so it is a 95% confidence interval for the whole population. If the population has instead 40 numbers (with ties forbidden as before), then the sample must have 38 numbers to keep the 95%.
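The arithmetic above can be checked by brute force. The following is only a sketch in Python: the function name coverage_of_population and the two example populations are my own illustrative choices, and any list of distinct numbers would do. It enumerates every possible sample and counts how often the interval from the sample minimum upward contains the whole population.

```python
from fractions import Fraction
from itertools import combinations


def coverage_of_population(population, sample_size):
    """Exact probability that [min(sample), +inf) contains every population value.

    Enumerates all equally likely samples drawn without replacement.
    The population is assumed to contain distinct numbers (no ties).
    """
    left_most = min(population)
    samples = list(combinations(population, sample_size))
    # The interval contains the whole population exactly when the
    # sample minimum is the population minimum.
    hits = sum(1 for s in samples if min(s) <= left_most)
    return Fraction(hits, len(samples))


# Hypothetical populations of distinct numbers, chosen only for illustration.
population_20 = list(range(-3, 17))   # 20 distinct numbers
population_40 = list(range(-3, 37))   # 40 distinct numbers

p20 = coverage_of_population(population_20, 19)   # 19 of 20 sampled
p40 = coverage_of_population(population_40, 38)   # 38 of 40 sampled

print(p20, p40)
```

Both probabilities equal the sampling fraction, sample size divided by population size, so doubling the population forces the sample to double if the 95% is to be kept.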
Solution of the first sophism
The purpose of sampling is to save money. Sampling 95% of the population saves nearly no money.
Second sophism
We copy the hypothesis and construction of the first sophism. Since the confidence interval has a 95% chance of containing the whole population, it has a 95% (or more) chance of containing the population mean.
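The claim "95% (or more)" can also be checked by enumeration. Again this is only a sketch: the function name coverage_of_mean and the two populations are my own hypothetical examples, not part of the original construction. One population has a leftMost so extreme that the mean falls below the second-smallest number, and the other does not.

```python
from fractions import Fraction
from itertools import combinations


def coverage_of_mean(population, sample_size):
    """Exact probability that [min(sample), +inf) contains the population mean.

    Enumerates all equally likely samples drawn without replacement.
    """
    mean = Fraction(sum(population), len(population))
    samples = list(combinations(population, sample_size))
    hits = sum(1 for s in samples if min(s) <= mean)
    return Fraction(hits, len(samples))


# Hypothetical population with one extreme value: the mean (-40.5) lies
# below the second-smallest number, so the interval misses the mean
# whenever leftMost is left out of the sample.
skewed = [-1000] + list(range(1, 20))    # 20 distinct numbers

# Hypothetical population with no extreme value: the mean (9.5) lies
# above the second-smallest number, so the interval never misses it.
balanced = list(range(20))               # 20 distinct numbers

p_skewed = coverage_of_mean(skewed, 19)
p_balanced = coverage_of_mean(balanced, 19)

print(p_skewed, p_balanced)
```

In the first case the coverage is exactly 19/20 and in the second it is 1, which agrees with "95% (or more)": the interval never covers the whole real line, yet it covers the mean at the stated level for either population.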
Solution of the second sophism
We are drawing from a finite population without replacement. That is, we are not drawing independently. I think the usual form of the Bahadur-Savage theorem requires independence. See E. L. Lehmann and Joseph P. Romano, Testing Statistical Hypotheses, Third Edition, Springer, fourth printing, 2008. Please look at the top of page 467. It says, “i. i. d.”
Non-originality, date, e-mail address, links
I cannot claim to have invented the above sophisms. I have merely assembled them from pieces known to everybody. I put the sophisms on this web page to amuse students of statistical inference and their teachers. The date of this page is 18 March 2010. Comments on all this, both constructive and destructive, come to me, Harold Kaplan,
smtw2gh at toadmail dot com
Harold Kaplan’s statistics.htm
John C. Pezzullo’s page