The following sections give examples of how to use the programs. The user is respectfully invited to try out the examples or to use any others. The only thing to remember is: follow the grammatical rules of JavaScript. (This is because the “eval” method of JavaScript is used in picking up the data from the text area.) In particular, remember that starting an integer with a zero, as in 010, will force the use of base 8.
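The base 8 pitfall can be checked before the text is handed to eval. The helper below is only a sketch and is not part of the programs on this page; the name findLeadingZeroIntegers is hypothetical. It reports integer tokens that begin with a zero and have further digits after it.

```javascript
// Hypothetical helper: list the integer tokens in the text that start with
// a zero and have more digits after it (for example 010). Non-strict
// JavaScript may read such tokens in base 8.
function findLeadingZeroIntegers(text) {
  // Pick out every numeric token, including decimals and exponents.
  var tokens = text.match(/[0-9.]+(?:[eE][+-]?[0-9]+)?/g) || [];
  // Keep only pure digit strings with a leading zero and at least two digits.
  return tokens.filter(function (token) {
    return /^0[0-9]+$/.test(token);
  });
}

var suspicious = findLeadingZeroIntegers("[ 2.5, [3,010,0,0.5,.05,1e3] ]");
// suspicious holds the single token "010"
```

Tokens such as 0, 0.5, and .05 are left alone, because they are not affected by the base 8 rule.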
[ 2.5, [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1] ]
This exact Bayes test is at the same time a conservative frequentist test. The first model is the null hypothesis, the second model is the alternative hypothesis, and the printed likelihood ratio is an upper bound on the p value. Optional stopping is permitted for Bayes inference. Also, optional stopping is permitted here for frequentist inference, because the sequence of likelihood ratios is a nonnegative martingale, if the null hypothesis is true.
Optional stopping refers to the future. If more money is available, the Bayesian or frequentist statistician has the option of spending that money, so as to get some more numbers and put them into the sample and make a new inference. But the statistician has the option of stopping, even if the money is available, and stopping may be wise, because the new data may cause the p value to go up instead of down. These two options, stopping and not stopping, are available for both Bayesians and frequentists. However, the frequentist may do something which the Bayesian may not do. The frequentist may go back into the past. That is, if the numbers in the array are in chronological order, earlier on the left and later on the right, she may tell the computer program to use not only the present form of the sample but also all earlier forms of the sample, trying to reduce the p value as much as possible. Doob’s theorem for nonnegative martingales says that this is permitted. To signal the computer program that she wishes to go back into the past in addition to using the present, she can call the back method, so:
back(); [ 2.5, [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1] ]
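As a rough illustration of what going back into the past means, the sketch below follows the sample from left to right, evaluates the printed ratio after each new number, and keeps the smallest value seen. It rests on assumptions: the name ratiosAlongTheSample is hypothetical, the ratio is the reciprocal of the product formula from the theory section averaged over k midpoints (as in the midpoint method described later on this page), and the back method in the actual program may be organized differently.

```javascript
// Hypothetical sketch. mu is the null-hypothesis mean, xs is the sample in
// chronological order (nonnegative numbers), k is the number of midpoints.
function ratiosAlongTheSample(mu, xs, k) {
  var i, j, s, total;
  var products = [];            // one running product per midpoint s
  for (i = 0; i < k; i++) {
    products.push(1);
  }
  var present = 1;              // printed ratio for the sample seen so far
  var smallest = Infinity;      // smallest printed ratio over all earlier forms
  for (j = 0; j < xs.length; j++) {
    total = 0;
    for (i = 0; i < k; i++) {
      s = (i + 0.5) / k;
      products[i] *= 1 - s + s * xs[j] / mu;
      total += products[i];
    }
    present = k / total;        // reciprocal of the averaged product
    smallest = Math.min(smallest, present);
  }
  return { present: present, smallest: smallest };
}

var result = ratiosAlongTheSample(1, [2, 0], 1000);
// result.smallest comes from the first number alone; result.present uses both.
```

In this small example the second number pushes the present ratio above 1, while the smallest ratio along the way stays near 2/3, which is the kind of gain the frequentist obtains by looking back.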
The material in this section is for infinite populations. For finite populations, the reader is respectfully invited to click on Finite population.

Midpoint method.

The test in the previous section uses a prior probability that is uniform on the unit interval, so the program for that test depends on symmetric polynomials (in the x values) and calculus and beta functions. It is complicated. One feels the need of a different, simpler program to check the results of that complicated program. The program of the present section is simple.
Let a positive integer k be given. The test belonging to the simple program uses a prior that is uniform on the numbers .5/k, 1.5/k, ... (k-.5)/k . That is, the simple program uses the midpoint method instead of integral calculus. The beta functions and symmetric polynomials disappear. The simple program is much slower. Its test is still exact in the Bayes sense and conservative in the frequentist sense, and optional stopping is still permitted.
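A minimal sketch of the midpoint idea follows. The function names midpoints and midpointRatio are hypothetical, and the product being averaged is the one given in the theory section of this page; the simple program itself may differ in detail.

```javascript
// The k midpoints .5/k, 1.5/k, ..., (k-.5)/k of the unit interval.
function midpoints(k) {
  var grid = [];
  for (var i = 0; i < k; i++) {
    grid.push((i + 0.5) / k);
  }
  return grid;
}

// Hypothetical sketch: average the product of (1 - s + s*x/mu) over the
// midpoints s, then return the reciprocal, which is the ratio that is printed.
function midpointRatio(mu, xs, k) {
  var grid = midpoints(k);
  var total = 0;
  for (var i = 0; i < grid.length; i++) {
    var product = 1;
    for (var j = 0; j < xs.length; j++) {
      product *= 1 - grid[i] + grid[i] * xs[j] / mu;
    }
    total += product;
  }
  return grid.length / total;
}

var oneNumber = midpointRatio(1, [2], 10);   // average of 1+s is 1.5, so about 2/3
var noEvidence = midpointRatio(1, [1, 1], 10); // every factor is 1, so the ratio is 1
```

The work grows in proportion to k times the sample size, which is why a program of this kind is slower than one that integrates the polynomial directly.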
It is necessary to define the value of k. Here is the way of doing it for the table in the previous section:
k=1e3; [ 2.5, [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1] ]
The back method mentioned in the previous section is available here too. Here is a use of it:
back(); k=1e3; [ 2.5, [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1] ]
The material in this section is for infinite populations. For finite populations, the reader is respectfully invited to click on Finite population.

From bins.

There was a lot of repetition in the numerical table used above. Let us collect the numbers into bins labeled 0, 1, 2, 3, 4, and 10972:
[ 2.5, [0,1,2,3,4,10972], [1,2,3,4,5, 1] ]
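For readers who would rather not count by hand, the collection into bins can be sketched as follows. The name collectIntoBins is hypothetical and is not one of the functions supplied by this page.

```javascript
// Hypothetical helper: sort a copy of the sample, then count how many
// times each distinct value occurs. Returns [labels, counts].
function collectIntoBins(xs) {
  var sorted = xs.slice().sort(function (a, b) { return a - b; });
  var labels = [];
  var counts = [];
  for (var i = 0; i < sorted.length; i++) {
    if (labels.length > 0 && labels[labels.length - 1] === sorted[i]) {
      counts[counts.length - 1] += 1;   // same value as before: extend the bin
    } else {
      labels.push(sorted[i]);           // new value: open a new bin
      counts.push(1);
    }
  }
  return [labels, counts];
}

var bins = collectIntoBins([3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1]);
// bins[0] is [0,1,2,3,4,10972] and bins[1] is [1,2,3,4,5,1]
```

Sorting is what destroys the chronological order of the sample.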
The back method is not permitted to be used here, because the data have been re-ordered, and the program knows this and will ignore any attempt to use the method.

Probability is symmetric.

The “From bins” button could be used to test whether the fictitious table of counts [1,6,11] came from symmetric distributions. One could assign labels [0,1,2] to those counts and say that mu was halfway between 0 and 2, namely, 1. That would be tedious typing. To speed up this important special case there is a button called “Probability is symmetric.” The user is respectfully invited to select and copy the table
[ [1,6,11] ]
On the other hand, the user might be thinking that the mean is smaller than symmetry permits. In that case, a function called reverseOrder might be applied to the array. It swaps all the numbers end to end, so the leftmost goes to the rightmost position, and so on. Here is the table:
[ reverseOrder( [1,6,11] ) ]
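The relation between the symmetric shortcut and the “From bins” form can be sketched like this. Both function names are hypothetical; reversedCopy only imitates the end-to-end swap described for reverseOrder and is not the page's own function.

```javascript
// Hypothetical helper: turn a table of counts into the longer
// "From bins" form, with labels 0,1,2,... and mu halfway between the
// first and last labels.
function symmetricAsBins(counts) {
  var labels = [];
  for (var i = 0; i < counts.length; i++) {
    labels.push(i);
  }
  var mu = (counts.length - 1) / 2;
  return [mu, labels, counts.slice()];
}

// Hypothetical stand-in for the end-to-end swap: the leftmost number
// goes to the rightmost position, and so on. The input is not changed.
function reversedCopy(array) {
  return array.slice().reverse();
}

var longForm = symmetricAsBins([1, 6, 11]);   // [1, [0,1,2], [1,6,11]]
var swapped = reversedCopy([1, 6, 11]);       // [11, 6, 1]
```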
The back method is not permitted here. If the user attempts to use the method, the program will ignore the attempt.

Combination of independent p values.

Continuous domain.

When doing a meta-analysis one wishes that experimenters would publish Bayesian likelihood ratios, because then one could merely multiply those ratios. Instead, most of the experimenters publish frequentist p values. (Indeed, some kinds of designed experiments are difficult or impossible to do by Bayesian methods.) For a fixed number of p values one can use Fisher’s sum of logarithms formula and look in the table of chi-square, but in meta-analysis the number is not fixed. One then thinks of finding a nonnegative function of p which will have a mean equal to unity if the p values are uniformly distributed, but a bigger mean if the p values tend toward zero. Partly copying Fisher we use the formula -log(p), but instead of adding these negated logarithms we do a Bayes inference on their population mean. (We are out by a factor of 2 from what Fisher did, but he was thinking of the chi-square table, not of expected value.) Here is a fictitious array of p values: .02, .45, .01, .10, .23, .16, .18, .30, .68, .14 . The user is respectfully invited to select and copy the table
[ [.02, .45, .01, .10, .23, .16, .18, .30, .68, .14] ]
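The scoring by negated logarithms can be sketched as below. The names negatedLogs, mean, and nullMeanByMidpoints are hypothetical; the last one merely checks numerically, with the midpoint rule, that the score has a mean close to 1 when p is uniform on the unit interval.

```javascript
// Score each p value by its negated natural logarithm.
function negatedLogs(pValues) {
  return pValues.map(function (p) { return -Math.log(p); });
}

// Ordinary arithmetic mean.
function mean(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total / values.length;
}

// Numerical check of the null-hypothesis mean: average -log(p) over the
// k midpoints of the unit interval.
function nullMeanByMidpoints(k) {
  var total = 0;
  for (var i = 0; i < k; i++) {
    total += -Math.log((i + 0.5) / k);
  }
  return total / k;
}

var scores = negatedLogs([.02, .45, .01, .10, .23, .16, .18, .30, .68, .14]);
var sampleMean = mean(scores);               // noticeably bigger than 1 for these p values
var nullMean = nullMeanByMidpoints(1000);    // close to 1
```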
Some experimenters do not publish p values but only say whether the five percent test boundary was crossed. Suppose that five experiments did not cross the boundary and three crossed. Then we might score zero for each that failed to cross and twenty for each that succeeded in crossing. This is to make the null-hypothesis expected value equal to 1. Here is the table:
[ 1, [20,20,0,0,20,0,0,0] ]
Of course, there is nothing magic about five percent. One can imagine that some experimenters always use four percent. Suppose that two of their experiments fail to cross the boundary and only one succeeds in crossing. Let us insert the new experimental results into the above table, this time scoring zero for failure and twenty-five for success:
[ 1, [20,20,0,0,20,0,0,0, 0,25,0] ]
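The scores in this table can be produced mechanically. In the sketch below the name boundaryScores is hypothetical; a crossing is scored 1/alpha and a failure is scored 0, so that under the null hypothesis the expected score is alpha times 1/alpha, which is 1.

```javascript
// Hypothetical helper: crossed is an array of true/false values, one per
// experiment; alpha is the test boundary used by those experimenters.
function boundaryScores(crossed, alpha) {
  return crossed.map(function (didCross) {
    return didCross ? 1 / alpha : 0;
  });
}

// Eight experiments at five percent, in the order used in the table above.
var fivePercent = boundaryScores(
  [true, true, false, false, true, false, false, false], .05);
// Three more experiments at four percent.
var fourPercent = boundaryScores([false, true, false], .04);

var table = [1, fivePercent.concat(fourPercent)];
```

The order of the scores inside the array does not matter here, since no chronological order is available anyway.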
There is no way to construct a martingale for this inference, because we are not able to put the data into chronological order, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.

Discrete domains.
In the previous paragraphs of this section I have assumed that the distribution of the frequentist statistic was continuous, so that Fisher’s negated logarithm of p value would have its desired descending exponential shape. For designed experiments this is seldom true, and the inference must be worked another way. For concreteness let us consider Fisher’s famous “Lady Tasting Tea” experiment. For shortness, let us suppose that the lady is presented with 6 cups of tea, half poured milk before tea and half poured tea before milk. The lady knows there are exactly 3 of each, but she does not know which are which, so she must guess. She might guess 0 right, 2 right, 4 right, or 6 right. The respective probabilities (if the null hypothesis is true) are 1/20, 9/20, 9/20, and 1/20. The respective p values are 20/20, 19/20, 10/20, and 1/20. We need to find the population mean value of the negated logarithm of the p value. It is -Math.log(20/20)*1/20 -Math.log(19/20)*9/20 -Math.log(10/20)*9/20 -Math.log( 1/20)*1/20 . This population mean is not equal to 1, so we must normalize the negated logarithms by dividing each by their population mean. When I say “we” I mean the computer.
Suppose now that four students of Fisher have each tested a volunteer lady, and that the numbers of cups that these ladies have right are 6, 6, 4, and 2. Recall that Fisher’s lady had all her cups right. Then the table for Fisher’s lady and these four other ladies is
var normalizer= -Math.log(20/20)*1/20 -Math.log(19/20)*9/20 -Math.log(10/20)*9/20 -Math.log( 1/20)*1/20 ; [ 1, [-Math.log( 1/20)/normalizer, -Math.log( 1/20)/normalizer, -Math.log( 1/20)/normalizer, -Math.log(10/20)/normalizer, -Math.log(19/20)/normalizer] ]
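The same normalization can be written for any discrete test statistic. The sketch below is an illustration under assumptions: the name normalizedNegatedLogs is hypothetical, and the null-hypothesis probabilities must be listed from the least extreme outcome to the most extreme one, so that each p value is the sum of the probabilities from that outcome onward.

```javascript
// Hypothetical helper. nullProbabilities[i] is the null-hypothesis
// probability of outcome i, ordered from least extreme to most extreme.
// Returns the normalized negated logarithm for each outcome.
function normalizedNegatedLogs(nullProbabilities) {
  var n = nullProbabilities.length;
  var pValues = new Array(n);
  var tail = 0;
  var i;
  for (i = n - 1; i >= 0; i--) {
    tail += nullProbabilities[i];
    pValues[i] = Math.min(1, tail);   // guard against rounding above 1
  }
  var normalizer = 0;
  for (i = 0; i < n; i++) {
    normalizer += -Math.log(pValues[i]) * nullProbabilities[i];
  }
  return pValues.map(function (p) { return -Math.log(p) / normalizer; });
}

// Six cups: 0, 2, 4, or 6 right, with probabilities 1/20, 9/20, 9/20, 1/20.
var teaScores = normalizedNegatedLogs([1/20, 9/20, 9/20, 1/20]);
// teaScores[3] belongs to 6 right, teaScores[2] to 4 right, and so on.
```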
There is no way to construct a martingale for this inference, because we have no way of putting the sample into chronological order, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.

Plea to programmers.
The calculation of these normalized negated logarithms is not quite easy for meta-analysts and much less easy for experimenters. Therefore, in the name of meta-analysis, I respectfully request programmers who write software for exact frequentist tests to enhance their programs, causing the programs to output the appropriate normalized negated logarithms along with the p values. Similarly, I respectfully request experimenters who use such enhanced programs to publish the normalized negated logarithms along with the p values. (Actually, I request the programmers to write Bayesian software where that is possible, so that the experimenters can use it.) When that day comes, the meta-analyst can just copy the normalized negated logarithms from the published papers. The above table will then look like this:
[1, [6.179509143459682, 6.179509143459682, 6.179509143459682, 1.429803783818095, 0.10580631135305135] ]
And, as I keep saying, there is no obvious way to construct a martingale for this inference, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.
In order to begin the process of enhancing statistical programs, I have enhanced my files symmetryAroundZero.htm, wdf.htm, twoSample.htm, and http://www.toad.net/~jkaplan2/BlockTreat.htm; I respectfully suggest them to software programmers and experimenters as examples of what I mean. The wdf.htm program has not only frequentist inference with p value and normalized negated logarithm output but also Bayesian inference with likelihood ratio output.

De-emphasis of experiments.
Now we come to the problem of de-emphasis of experiments. The ladies in these tea-tasting tests were meant to be bona fide volunteers. Suppose now that the meta-analyst gets suspicious and asks all the ladies about that. Perhaps one of them turns out to be the mother of her experimenter. Should her experiment be kept? Should it be struck from the table? There is a middle way. The analyst can take the common average of the doubtful normalized negated logarithm with 1. Doing so will not change the population mean of the normalized negated logarithms. Here is the table:
[1, [6.179509143459682, 6.179509143459682, 6.179509143459682, 1.429803783818095, ( 0.10580631135305135+1 )/2] ]
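That the averaging leaves the population mean alone can be checked on the six-cup example. The names deEmphasize and nullMean are hypothetical; the probabilities and scores are the ones given earlier in this section.

```javascript
// Common average of a doubtful score with 1.
function deEmphasize(score) {
  return (score + 1) / 2;
}

// Null-hypothesis mean of a score: sum of probability times score.
function nullMean(probabilities, scores) {
  var total = 0;
  for (var i = 0; i < scores.length; i++) {
    total += probabilities[i] * scores[i];
  }
  return total;
}

// Outcomes 0, 2, 4, and 6 right, with their normalized negated logarithms.
var probabilities = [1/20, 9/20, 9/20, 1/20];
var normalized = [0, 0.10580631135305135, 1.429803783818095, 6.179509143459682];

var before = nullMean(probabilities, normalized);                 // close to 1
var after = nullMean(probabilities, normalized.map(deEmphasize)); // still close to 1
```

The average of a quantity whose mean is 1 with the constant 1 again has mean 1, which is all the check amounts to.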
[ [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1], .05 ]
If the inference is frequentist, and if the numbers in the array are in chronological order, earlier on the left and later on the right, then the back method may be used:
back(); [ [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1], .05 ]
The material in this section is for infinite populations. For finite populations, the reader is respectfully invited to click on Finite population.

Frequentist confidence using KS.

The Bayes programs in this file do well for small samples, but for big samples they are much too slow and have a tendency to underflow and overflow. For big samples it may be better to use frequentist methods, if one does not need optional stopping, and if the x values all come from the same population. Fortunately, a speedy, numerically stable, frequentist nonparametric confidence boundary for a nonnegative variable is available, using the one-sided Kolmogorov-Smirnov formula. (Everybody keeps re-inventing it, so I do not know who thought of it first. Maybe Kolmogorov and Smirnov did, but perhaps they thought it was too obvious to publish.) It competes against the Bayes confidence in the previous section. Just use the same sample as in the previous section and click on the “Frequentist confidence using KS” button. For this small sample the KS gets a much worse confidence boundary, but for some big samples it can do much better. I have put it here so programmers can port it to speedy languages such as Java, C, C++, C#, Fortran, Pascal, and FreeBasic, for use on those big samples.
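For programmers who want a starting point, here is a sketch of the general idea, not a copy of the button's code. It swaps in the Dvoretzky-Kiefer-Wolfowitz inequality with Massart's constant as the one-sided band (this assumes alpha is at most one half), so its numbers may differ from those of the page's own program. The names lowerBoundaryFromBand and ksLowerBoundarySketch are hypothetical.

```javascript
// Lower boundary for the mean of a nonnegative variable, given that the
// true distribution function is at most eps above the empirical one.
// The mean is the integral of 1 - F(x) over x >= 0, so it is at least the
// integral of max(0, 1 - Fn(x) - eps).
function lowerBoundaryFromBand(xs, eps) {
  var sorted = xs.slice().sort(function (a, b) { return a - b; });
  var n = sorted.length;
  var previous = 0;
  var boundary = 0;
  for (var i = 0; i < n; i++) {
    // Between previous and sorted[i] the empirical distribution is i/n.
    var survival = Math.max(0, 1 - i / n - eps);
    boundary += (sorted[i] - previous) * survival;
    previous = sorted[i];
  }
  return boundary;
}

// Assumption: band half-width from the DKW inequality with Massart's
// constant, eps = sqrt(log(1/alpha) / (2n)), for alpha at most 0.5.
function ksLowerBoundarySketch(xs, alpha) {
  var eps = Math.sqrt(Math.log(1 / alpha) / (2 * xs.length));
  return lowerBoundaryFromBand(xs, eps);
}

var sample = [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1];
var boundary = ksLowerBoundarySketch(sample, .05);
```

With a band of width zero the boundary is just the sample mean; the band trims away the largest observations first, which is why a single huge value such as 10972 contributes nothing to the boundary for so small a sample.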
The back method does not make any difference at all for this inference. Optional stopping is forbidden. Finite population inference is impossible.

Run a program.

It may happen that a statistician has independently collected two samples from different populations and tested them for different means. Here are two fictitious examples:
[ 2.1, [2,3,4,3,0,4,1,4,4,1,3,2,2,3,4] ]
[ 2.7, [2,5,3,1,5,4,4,4,5,5,3,3,2,4,5] ]
asStated([ 2.1, [2,3,4,3,0,4,1,4,4,1,3,2,2,3,4] ]) * asStated([ 2.7, [2,5,3,1,5,4,4,4,5,5,3,3,2,4,5] ])
There is no obvious way to construct a martingale for this inference, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.
Six functions named asStated, fromBins, symmetric, combination, getConfidence, and meanUsingKS are available to do the work of the first six buttons below the upper text area.
Here is another example. Let us suppose that it is legitimate to combine the meta-analyses in the “Combination of independent p values” section. The tables are independent of each other, so the program is
combination( [ [.02, .45, .01, .10, .23, .16, .18, .30, .68, .14] ] ) * asStated( [ 1, [20,20,0,0,20,0,0,0, 0,25,0] ] )
There is no obvious way to construct a martingale for the inferences in this section, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.

Two-sample inference.

If all that we know about two random variables is that they are nonnegative, then there is no statistical way of proving that one has a bigger population mean than the other. The best that we can do is to get separate one-sided confidence intervals for the population means and compare the left boundaries of those one-sided confidence intervals for the population means. That takes too long to say or to type. Let us use the English word “promise” to mean “left boundary of the one-sided confidence interval for the population mean.” (The right boundary is of course infinity.) Actually there are three promises we ought to get: for the first sample, for the second sample, and for the concatenation of the two samples. For example, let a sample 0,0,0,0,0,2,2,2,2,2,2 be drawn from one population, and let a sample 1,1,1,1,1,1,1,1,1,1 be drawn from another. (These are fictitious.) Suppose that an overall α value of .05 is to be used, so that the overall confidence will be 95 percent. Here is a program using the Bonferroni division of α into three equal parts:
var alpha=.05; var spaces=" "; var a=[0,0,0,0,0,2,2,2,2,2,2]; var b=[1,1,1,1,1,1,1,1,1,1]; getConfidence([a, alpha/3]) + spaces + getConfidence([b, alpha/3]) + spaces + getConfidence([a.concat(b),alpha/3]);
Make no mistake: comparison of promises is not the same thing as comparison of population means. I wish I knew something better, but I do not.
There is no obvious way to construct a martingale for this inference, so the back method is not permitted. The computer program does not know this, so the user must take care not to use the method.

Finite population.
All of the above inferences are for infinite populations, or for finite populations sampled with replacement. Now it is time to think of a finite population (of nonnegatives, of course) sampled without replacement using equal probability. If we know how many things are in the population, the test can have a smaller p value, and the confidence boundary can be bigger. Let there be bigN things in the population. I suppose for example that bigN is 21056. I respectfully invite the reader to select and copy
bigN=21056; back(); k=1e3; [ 2.5, [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1] ]
This finite population inference is frequentist only. I cannot think how to make Bayes models for it. Chronological order of the sample is required for finite population, whether the back method is used or not. It is advantageous to use the method.
Something similar applies to confidence:
back(); bigN=21056; k=1e3; [ [3,10972,0,3,2,4,1,4,3,4,4,2,2,3,4,1], .05 ]
Of course, that is too special. Still keeping x≥0, let us instead suppose that E(x)=μ for f(x), where μ>0. Then E(x/μ)=1, so ∫(x/μ)f(x)dx=1. Then (x/μ)f(x) is a probability density on x≥0. We form the likelihood ratio (x/μ)f(x)/f(x), and it is merely x/μ.
That is only one x in the sample. Nearly always in practice there are more x values, say, n of them. Let them have possibly different densities fj(xj), but let them all have the same population mean μ. Then the numerator of the likelihood ratio is ∏(xj/μ)fj(xj) where the product is taken for j going from 0 to n-1, inclusive. (This is to agree with the way we program it on a computer.) Similarly the denominator is ∏fj(xj). The ratio is merely ∏(xj/μ).
That is still too special. A single xj equal to zero will make the whole likelihood ratio equal zero. We can do better. Choose any number s between 0 and 1 inclusive. Notice that if E(x/μ)=1 then also E(1-s + sx/μ)=1, so ∫(1-s + sx/μ)f(x)dx will be 1, and so (1-s + sx/μ)f(x) must be a probability density. Our numerator now becomes ∏(1-s + sxj/μ)fj(xj), but the denominator does not change. The likelihood ratio is now merely ∏(1-s + sxj/μ).
Well, but what value of s are we using? The present page is about Bayes inference, so we use all of the s values with equal weights and integrate from s=0 to s=1. The numerator is now ∫₀¹∏(1-s + sxj/μ)fj(xj)ds. As before, the denominator does not change. The likelihood ratio is now merely ∫₀¹∏(1-s + sxj/μ)ds.
A statistician using that likelihood ratio need not say what densities fj(xj) are meant, so the inference is nonparametric. It suffices to say what value of μ is meant. That integral looks a bit messy, but it is just the integral of a polynomial in s, so symmetric polynomials (in the x values) and calculus and beta functions will solve it, for small samples.
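One possible route through the symmetric polynomials and beta functions is sketched below; the page's own program may organize the calculation differently, and the name exactAveragedRatio is hypothetical. Writing a_j for x_j/μ, the product of (1-s) + s·a_j expands into a sum over m of e_m(a)·s^m·(1-s)^(n-m), where e_m is the elementary symmetric polynomial of degree m, and the integral of s^m·(1-s)^(n-m) from 0 to 1 is the beta function B(m+1, n-m+1) = 1/((n+1)·C(n,m)).

```javascript
// Hypothetical sketch: exact value of the integral from s=0 to s=1 of the
// product of (1 - s + s*x[j]/mu). The printed ratio would be its reciprocal.
function exactAveragedRatio(mu, xs) {
  var n = xs.length;
  var j, m;
  // e[m] becomes the elementary symmetric polynomial of degree m
  // in the values x[j]/mu.
  var e = [1];
  for (j = 0; j < n; j++) {
    var a = xs[j] / mu;
    e.push(0);
    for (m = e.length - 1; m >= 1; m--) {
      e[m] += a * e[m - 1];
    }
  }
  // Sum e[m] * B(m+1, n-m+1), using B(m+1, n-m+1) = 1 / ((n+1) * C(n, m)).
  var integral = 0;
  var binomial = 1;                       // C(n, 0)
  for (m = 0; m <= n; m++) {
    integral += e[m] / ((n + 1) * binomial);
    binomial = binomial * (n - m) / (m + 1);   // becomes C(n, m+1)
  }
  return integral;
}

var single = exactAveragedRatio(1, [2]);      // integral of 1+s, about 1.5
var pair = exactAveragedRatio(1, [2, 0]);     // integral of 1-s*s, about 2/3
```

Every term in the sum is nonnegative when the x values are nonnegative, so no cancellation occurs, although for big samples the terms can still overflow or underflow.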
There is just one little detail to add. The likelihood ratio I have described is not the one I print on the screen. Instead I print the other likelihood ratio, the reciprocal of that one. I do so for easier comparison with the frequentist p value.
Although the inference is nonparametric, it is not distribution-free. That is, the frequentist distribution of the likelihood ratio depends on the functions fj, so the frequentist power of the test is not known, even though we have an upper bound on the frequentist size.

Added material for finite population.

For infinite populations with mean μ, the null hypothesis expected value of xj/μ is unity, as I said above. Let me write that quotient in JavaScript form as x[j]/mu. That is how it was formerly in my midpoint method. For a finite population, I must talk about bigN*mu, which is the null hypothesis total of all the x values in the population. I write sumOfPreviousXValues to mean the sum of all the x values used already in the algorithm. Then bigN*mu-sumOfPreviousXValues is the null hypothesis sum of all the x values which are in the population but not yet seen. Also, bigN-j is the count of those x values not yet seen. So ( bigN*mu-sumOfPreviousXValues )/( bigN-j ) is the null hypothesis expected value of x[j]. Let us divide both numerator and denominator by bigN, getting ( mu-sumOfPreviousXValues/bigN )/( 1-j/bigN ). I call the new numerator reducedMu, and the new fraction I call expectedX. The new fraction will work correctly if bigN is finite. Also, it will work correctly if bigN is infinite.
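The bookkeeping for expectedX can be sketched as follows. The function name expectedXValues is hypothetical; the variable names mu, bigN, sumOfPreviousXValues, and reducedMu follow the text. The sample is assumed to be in chronological order and no longer than bigN.

```javascript
// Hypothetical sketch: null-hypothesis expected value of each x[j],
// given the x values already seen, for a population of bigN things
// sampled without replacement. Use bigN = Infinity for an infinite population.
function expectedXValues(mu, xs, bigN) {
  var expected = [];
  var sumOfPreviousXValues = 0;
  for (var j = 0; j < xs.length; j++) {
    var reducedMu = mu - sumOfPreviousXValues / bigN;
    expected.push(reducedMu / (1 - j / bigN));
    sumOfPreviousXValues += xs[j];
  }
  return expected;
}

// Four things with null-hypothesis total 4; after seeing a 2, the three
// unseen things must total 2, so the next expected value is 2/3.
var finite = expectedXValues(1, [2, 1], 4);
// With an infinite population every expected value is just mu.
var infinite = expectedXValues(2.5, [3, 10972, 0], Infinity);
```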
There is a remaining difficulty to handle. It may be that reducedMu is strictly negative. If so, then the null hypothesis is false, and the method can return zero as a p value immediately. More subtly, reducedMu may be exactly zero. Then the method ought to continue until either another strictly positive x is found in the sample, or else the sample is all used up. Of course, the likelihood ratio cannot change if reducedMu is zero and all the remaining sample x values are zero.

Bibliography.

The Bayes programs on this page are suggested by Abraham Wald, Sequential Analysis, John Wiley & Sons, Inc., New York, London, 1947. Pages 48-49 and 74 are particularly helpful.
For martingales see J. L. Doob, Stochastic Processes, John Wiley & Sons, Inc., New York, 1953. The definition of martingale is on page 91. The application to likelihood ratio is on page 93. The inequality theorem for nonnegative martingales begins at the bottom of page 314. For an easier-looking proof, see pages 235-236 of William Feller, An Introduction to Probability Theory and Its Applications, Volume II, John Wiley & Sons, Inc., New York, 1966. Another good place to look is pages 524-526 of Michel Loève, Probability Theory, second edition, D. van Nostrand Company, Inc., Princeton, New Jersey, 1960.
Any mistakes in the programs are mine: not Doob’s, Feller’s, Fisher’s, Loève’s, or Wald’s.

Insert commas.

It may be that the user is bringing in data copied from other web pages, or from files, and is pasting the data from the clipboard to the upper text area. In that case the numbers are perhaps separated by blanks or tabulation characters or the like instead of by commas. The “Insert commas” button is meant to change such other separators to commas. The button does not always guess rightly what is expected of it, so the user is respectfully asked to look at the button’s work to make sure everything is as desired. It is a good idea to type the left and right square brackets before clicking the “Insert commas” button.

License, revision date, and e-mail address.

All of this file is in the public domain. The date of this revision is 2 June 2008. Criticism both constructive and destructive comes to me, Harold Kaplan,
at dot smtw2gh toadmail com