The Bayes programs on this page are all exact in the Bayesian sense, but not in the frequentist sense. Two of them, “Probabilities are uniform” and “Probabilities are proportional to numerators,” are conservative in the frequentist sense: for those two the printed Bayesian likelihood ratio is an upper bound on the frequentist p-value. The frequentist null hypothesis for each of those two tests is its title in quotation marks. (Those two null hypotheses are simple, so under the null hypothesis the expected value of the reciprocal of the printed likelihood ratio is unity, and Markov’s inequality then gives the bound.)
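The Markov argument can be checked numerically for a table with two cells. The following is only a sketch, not the code behind the buttons. It assumes that the alternative to “Probabilities are uniform” is a uniform (Laplace) prior on the two probabilities, so that every split of n observations has probability 1/(n+1) under the alternative. The function names are made up for this illustration.

```javascript
// Log of n! by direct summation (adequate for the small n used here).
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Log of the null probability of k counts in the first cell out of n,
// when both cells have probability 1/2.
function logNullProbability(n, k) {
  return logFactorial(n) - logFactorial(k) - logFactorial(n - k) - n * Math.LN2;
}

// Log of the likelihood ratio, uniform model over alternative.
// Assumption: the alternative gives each split k probability 1/(n+1).
function logRatio(n, k) {
  return logNullProbability(n, k) + Math.log(n + 1);
}

// Null expected value of 1/ratio, and the exact null probability of a
// ratio as small as the observed one or smaller (the p-value).
function checkMarkov(n, observedK) {
  const observed = logRatio(n, observedK);
  let expectedReciprocal = 0;
  let pValue = 0;
  for (let k = 0; k <= n; k++) {
    const p = Math.exp(logNullProbability(n, k));
    expectedReciprocal += p * Math.exp(-logRatio(n, k));
    if (logRatio(n, k) <= observed + 1e-9) pValue += p;
  }
  return { ratio: Math.exp(observed), expectedReciprocal: expectedReciprocal, pValue: pValue };
}

const arbuthnottCheck = checkMarkov(82, 82);
```

Under these assumptions the table [82,0] has ratio 83/2^82 and p-value 2/2^82, so the ratio is indeed an upper bound.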
All the Bayes programs on this page, except one, make use of Dirichlet distributions and Laplace’s Law of Succession. The exception is the program for “One parameter” which uses formulas furnished by the user and integrates by Romberg’s method applied to the midpoint rule. Any mistakes in the programs are mine, not Dirichlet’s or Laplace’s or Romberg’s.
In the following paragraphs are some examples showing use of the programs on this page. The user is respectfully invited to try out these examples or to use any others. The only thing to remember is: follow the grammatical rules of JavaScript. (This is because the “eval” method of JavaScript is used in picking up the data from the text area.) In particular, remember that integers beginning with a zero digit are in base 8.
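Here is a small sketch of the base 8 pitfall. It is not the page’s own code. It uses an indirect call of eval only so that the text is read as ordinary non-strict script code wherever the snippet is run; how the page itself calls eval is not stated here.

```javascript
// Indirect eval: the text is evaluated as ordinary (non-strict) script code
// even if this snippet itself is loaded in strict mode or as a module.
const evaluate = (0, eval);

// A count typed as 010 is read in base 8, that is, as eight.
const withLeadingZero = evaluate("[010, 10]");

// Reading the same digits explicitly in base 10 gives ten.
const readAsDecimal = [parseInt("010", 10), 10];
```

The remedy is to delete leading zero digits from the data before pasting.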
If the user sees an answer “less than 5e-324”, then there was an underflow: 5e-324 is the smallest positive number that JavaScript can represent.
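The following lines sketch what underflow looks like in JavaScript, and why a very small probability is better handled through its logarithm. Whether the programs on this page work in logarithms internally is not stated here; the variable names are only for illustration.

```javascript
// The smallest positive number JavaScript can represent.
const smallest = Number.MIN_VALUE;            // 5e-324

// Anything that would be smaller is rounded to zero.
const underflowed = smallest / 2;             // 0
const tinyProduct = Math.pow(0.5, 1100);      // 2^-1100 underflows to 0

// The logarithm of the same quantity is an ordinary number.
const logOfTinyProduct = 1100 * Math.log(0.5);
```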
Which browsers?
Modern browsers such as Safari 3, Microsoft Internet Explorer 6, Netscape 7, and Opera 8 can use this page. Netscape 4 is out of date and cannot use this page.
Rows are independent of columns
The following contingency table is quoted from page 584 of Kendall, Maurice G., and Stuart, Alan, The Advanced Theory of Statistics, Volume 2, Inference and Relationship, Charles Griffin & Company Limited, London, 1961. They cite Ammon, Zur Anthropologie der Badener.
[ [1768, 807, 189, 47], [ 946, 1387, 746, 53], [ 115, 438, 288, 16] ]
The user is respectfully invited to select and copy that table, and to clear out the upper text area with the “Clear” button, and to paste the table into that text area, and to click on the “Rows are independent of columns” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of the independence model. Smaller ratio is in favor of the non-independence model.
Here is a table from Agresti, A., A Survey of Exact Inference for Contingency Tables, Statistical Science 1992, Vol. 7, No.1, 131-177. It is Table 1 on page 132.
[ [17066,14464,788,126,37], [ 48, 38, 5, 1, 1] ]
Agresti says that the exact p-value is 0.139 using the likelihood-ratio statistic. The user is respectfully invited to compare our present Bayesian likelihood ratio for that table.
The present program can do sparse tables. Here is a table copied from P. Diaconis and B. Sturmfels, Algebraic algorithms for sampling from conditional distributions, Annals of Statistics, 1998, Vol. 26, No. 1, 363-397. This table is on their page 364.
[ [1,0,0,0,1,2,0,0,1,0,1,0],
[1,0,0,1,0,0,0,0,0,1,0,2],
[1,0,0,0,2,1,0,0,0,0,0,1],
[3,0,2,0,0,0,1,0,1,3,1,1],
[2,1,1,1,1,1,1,1,1,1,1,0],
[2,0,0,0,1,0,0,0,0,0,0,0],
[2,0,2,1,0,0,0,0,1,1,1,2],
[0,0,0,3,0,0,1,0,0,1,0,2],
[0,0,0,1,1,0,0,0,0,0,1,0],
[1,1,0,2,0,0,1,0,0,1,1,0],
[0,1,1,1,2,0,0,2,0,1,1,0],
[0,1,1,0,0,0,1,0,0,0,0,0] ]
Diaconis and Sturmfels say, on their page 363, “The classical rules of thumb for validity of the chi-square approximation (minimum 5 per cell) are badly violated here, and there are too many tables with these margins to permit exact enumeration.” They are speaking about non-Bayes tests. The present Bayes program shows a strong preference for the “Rows are independent of columns” model for this table.
Any mistakes I have made in using Wilks’ formulas are my own mistakes, certainly not his.
Probabilities are uniform
All students of statistical inference know about “
An argument for Divine Providence,
taken from the constant Regularity observ’d in the Births of both Sexes.”
By Dr. John Arbuthnott,
Physitian in Ordinary to Her Majesty, and
Fellow of the College of Physitians and the Royal Society.
From Phil. Trans. (1710) 27, 186-90.
The user is respectfully invited to select and copy the table
[82,0]
and to click the “Clear” button to clear the upper text area, and to paste the table into that text area, and to click on the “Probabilities are uniform” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of uniformity, and smaller ratio is in favor of non-uniformity.
The program is not limited to only two cells. Here is a fictitious table for throwing a cheap variety-store plastic die one hundred times and counting how many times each face was on top. The counts for faces 1, 2, 3, 4, 5 and 6 are shown in that order. As before, likelihood ratio larger than 1 is in favor of uniformity of probabilities, and smaller ratio is in favor of non-uniformity.
[21,25,26,16,0,12]
[ [315,101,108,32], [ 9, 3, 3, 1] ]
The numbers 9 3 3 and 1 are to be understood as the numerators of the theoretical probability fractions 9/16 3/16 3/16 and 1/16. (In Kendall and Stuart the denominator 16 is actually printed. There is no advantage in making the user type the denominator over and over again, so the present program does the denominator without being told.) The user is respectfully invited to select and copy that table, to clear the upper text area, to paste the table into that text area, and to click on the “Probabilities are proportional to numerators” button. The likelihood ratio will be printed in the lower text area. Likelihood ratio larger than 1 is in favor of proportionality, and smaller ratio is in favor of non-proportionality.
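For readers who would like to see the arithmetic, here is a sketch of how such a ratio might be computed. It is not the code behind the button. It assumes that the competing model puts a uniform Dirichlet prior on the cell probabilities (Laplace’s Law of Succession), and the names proportionalSketch and uniformSketch are made up for this illustration.

```javascript
// Log of n! by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Log probability of one particular sequence of observations under a
// uniform Dirichlet prior on k cells: (k-1)! * prod(n_i!) / (n+k-1)!
function logDirichletMarginal(counts) {
  const k = counts.length;
  const n = counts.reduce((a, b) => a + b, 0);
  let result = logFactorial(k - 1) - logFactorial(n + k - 1);
  for (const c of counts) result += logFactorial(c);
  return result;
}

// Hypothetical stand-in for the "proportional" function.
// table[0] holds the counts, table[1] holds the numerators.
function proportionalSketch(table) {
  const counts = table[0];
  const numerators = table[1];
  // The denominator is worked out here, so the user need not type it.
  const denominator = numerators.reduce((a, b) => a + b, 0);
  let logNull = 0;
  for (let i = 0; i < counts.length; i++) {
    if (counts[i] > 0) logNull += counts[i] * Math.log(numerators[i] / denominator);
  }
  // The multinomial coefficient is common to both models and cancels.
  return Math.exp(logNull - logDirichletMarginal(counts));
}

// "Probabilities are uniform" is the special case of equal numerators.
function uniformSketch(counts) {
  return proportionalSketch([counts, counts.map(() => 1)]);
}

const peasRatio = proportionalSketch([[315, 101, 108, 32], [9, 3, 3, 1]]);
const arbuthnottRatio = uniformSketch([82, 0]);
```

Under these assumptions the ratio for the table [82,0] is 83/2^82, and the ratio for the table with numerators 9 3 3 and 1 is larger than 1.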
[
[ 102, function(t){ return (1-t)*(1-t)*(1-t) } ],
[ 301, function(t){ return 3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return 3*(1-t)*t*t } ],
[ 96, function(t){ return t*t*t } ]
]
The user is respectfully invited to copy the table, to clear the upper text area, to paste into that upper text area, and to click on the “One parameter” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of the one parameter model of the table. Smaller ratio is in favor of a Dirichlet model with more parameters. The likelihood ratio of this table is bigger than 90.
If instead the table is
[
[ 0, function(t){ return (1-t)*(1-t)*(1-t) } ],
[ 301, function(t){ return 3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return 3*(1-t)*t*t } ],
[ 96, function(t){ return t*t*t } ]
]
then the likelihood ratio is less than 1.5e-29. This is assuming that three tails could occur but did not. It is easy to change the model so that three tails is specially forbidden. That is, let us delete the line for three tails. Now the table looks like
[
[ 301, function(t){ return 3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return 3*(1-t)*t*t } ],
[ 96, function(t){ return t*t*t } ]
]
The likelihood ratio changes to more than 7.9. Let the user not worry about the denominator for these three numerators: the program is calculating that denominator, so the user need not do it.
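The three tables above can be worked through with the following sketch. It is not the code behind the “One parameter” button. It assumes that the parameter t has a uniform prior on the interval from 0 to 1, and it replaces Romberg’s method by a plain composite midpoint rule with many steps. The integrand is handled through its logarithm, because the likelihood itself would underflow. The name oneParameterSketch and the step count are made up for this illustration.

```javascript
// Log of n! by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Log probability of one particular sequence of observations under a
// uniform Dirichlet prior on k cells: (k-1)! * prod(n_i!) / (n+k-1)!
function logDirichletMarginal(counts) {
  const k = counts.length;
  const n = counts.reduce((a, b) => a + b, 0);
  let result = logFactorial(k - 1) - logFactorial(n + k - 1);
  for (const c of counts) result += logFactorial(c);
  return result;
}

// Hypothetical stand-in for the "oneParameter" function.
// Each row of the table is [count, function giving the numerator at t].
function oneParameterSketch(table, steps = 20000) {
  const counts = table.map(row => row[0]);
  const numerators = table.map(row => row[1]);
  const h = 1 / steps;
  // Log likelihood at t; the denominator is the sum of the numerators.
  const logLikelihood = t => {
    const values = numerators.map(f => f(t));
    const denominator = values.reduce((a, b) => a + b, 0);
    let sum = 0;
    for (let i = 0; i < counts.length; i++) {
      if (counts[i] > 0) sum += counts[i] * Math.log(values[i] / denominator);
    }
    return sum;
  };
  // Midpoints avoid t=0 and t=1, where some numerators vanish.
  const logs = [];
  for (let i = 0; i < steps; i++) logs.push(logLikelihood((i + 0.5) * h));
  let peak = -Infinity;
  for (const v of logs) if (v > peak) peak = v;
  // Integrate exp(log likelihood - peak) so that nothing underflows near the peak.
  let integral = 0;
  for (const v of logs) integral += Math.exp(v - peak) * h;
  const logModelMarginal = peak + Math.log(integral);
  return Math.exp(logModelMarginal - logDirichletMarginal(counts));
}

const threeTails = function (t) { return (1 - t) * (1 - t) * (1 - t); };
const oneHead = function (t) { return 3 * (1 - t) * (1 - t) * t; };
const twoHeads = function (t) { return 3 * (1 - t) * t * t; };
const threeHeads = function (t) { return t * t * t; };

const firstRatio = oneParameterSketch([[102, threeTails], [301, oneHead], [317, twoHeads], [96, threeHeads]]);
const secondRatio = oneParameterSketch([[0, threeTails], [301, oneHead], [317, twoHeads], [96, threeHeads]]);
const thirdRatio = oneParameterSketch([[301, oneHead], [317, twoHeads], [96, threeHeads]]);
```

Under these assumptions the three ratios come out close to the figures quoted above.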
[ [50,8], [92,50] ]
and
[ [50,92], [8,50] ]
The likelihood ratio for each is 0.056850990006292665, but we have two of them. How are they to be combined? Collapsing them into one table is a poor idea, because then the table will be
[ [100,100], [100,100] ]
which has a likelihood ratio slightly over 12. This happens even though the two original tables have the same cross-ratio as each other. It is an example of Simpson’s paradox. We need to think of another way of combining the data. Since the likelihood ratios are independent of each other, we may multiply the ratios together. This multiplication ought to be done by a computer, not a human. Here is a program to do the job:
independent( [[50,8],[92,50]] ) * independent( [[50,92],[8,50]] )
The “independent” function does what the “Rows are independent of columns” button does. The user is respectfully invited to select and copy that program, to clear the upper text area, to paste the program into that text area, and to click on the “Run a JavaScript program” button. The product of the two likelihood ratios will be printed in the lower text area. That product is satisfyingly small.
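The figures in this example can also be worked out away from this page with the following sketch. It is not the page’s own “independent” function. It assumes the priors described under “What priors?”: a uniform Dirichlet prior on the cells of the non-independence model, and for the independence model the row and column priors that are the margins of that uniform prior. The name independentSketch is made up for this illustration.

```javascript
// Log of n! by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Log probability of one particular sequence of observations under a
// symmetric Dirichlet prior whose k parameters all equal alpha
// (alpha a positive integer, so every Gamma function is a factorial).
function logDirichletMarginal(counts, alpha) {
  const k = counts.length;
  const n = counts.reduce((a, b) => a + b, 0);
  let result = logFactorial(k * alpha - 1) - logFactorial(k * alpha + n - 1);
  for (const c of counts) {
    result += logFactorial(alpha + c - 1) - logFactorial(alpha - 1);
  }
  return result;
}

// Hypothetical stand-in for the "independent" function.
function independentSketch(table) {
  const rows = table.length;
  const cols = table[0].length;
  const rowTotals = table.map(r => r.reduce((a, b) => a + b, 0));
  const colTotals = table[0].map((unused, j) => table.reduce((a, r) => a + r[j], 0));
  const cells = [].concat(...table);
  // Margins of a uniform prior on rows*cols cells: each row total gets
  // parameter cols, each column total gets parameter rows.
  const logIndependent =
    logDirichletMarginal(rowTotals, cols) + logDirichletMarginal(colTotals, rows);
  const logNonIndependent = logDirichletMarginal(cells, 1);
  return Math.exp(logIndependent - logNonIndependent);
}

const first = independentSketch([[50, 8], [92, 50]]);
const second = independentSketch([[50, 92], [8, 50]]);
const collapsed = independentSketch([[100, 100], [100, 100]]);
const product = first * second;
```

Under these assumptions the sketch gives about 0.0569 for each of the two tables and a little over 12 for the collapsed table, in agreement with the figures quoted above.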
Another use for programming is testing an array for symmetry around its center. Here is an artificial example:
[50,0,49,1]
It is not a good idea to combine the two cells on the left and combine the two cells on the right to get
[50,50]
That is perfectly symmetrical, but the array we started with did not look symmetrical at all. A better way is to compare the first and fourth cells, and compare the second and third cells, and then multiply the likelihood ratios:
uniform( [50,1] ) * uniform( [0,49] )
(Here the “uniform” function does what the “Probabilities are uniform” button does.) Symmetry is seen to have a very small likelihood compared with non-symmetry.
Besides the “independent” and “uniform” functions this file contains two other functions. There is a function called “proportional” to do what the “Probabilities are proportional to numerators” button does, and there is a function called “oneParameter” to do what the “One parameter” button does.
Comparisons with non-Bayes inferences
This section has bigger programs, so as to compare Bayes and non-Bayes inference.
The number 1.645 is copied out of the “normal” table used in many introductory courses in statistics. It is used in building a 5% one-sided boundary for a non-sequential test of the fairness of a coin. Here is a JavaScript program to find the number of heads that will cause that test to “reject,” and to use that number of heads in a Bayes inference.
var boundary=1.645;
var tosses=1e4;
var yatesCorrection=.5;
var heads=Math.ceil( yatesCorrection + tosses*.5 + boundary*Math.sqrt(tosses*.5*.5) );
var tails=tosses-heads;
var x=[heads,tails];
// throw( x );
uniform( x )
The user is respectfully invited to select and copy that program, and to click the “Clear” button to clear the upper text area, and to paste the program into that text area, and to click the “Run a JavaScript program” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of uniformity, that is, in favor of fairness of the coin. Smaller ratio is in favor of unfairness of the coin. To see the numerical data in the array, the user can uncomment the throw statement and click again.
The number 2.706 is copied out of the “chi-squared” table used in many introductory courses in statistics. It is used in building a 5% one-sided boundary for a non-sequential test of the independence of a 2 by 2 contingency table. Here is a JavaScript program to find the contingency table entries that will cause that test to “reject,” and to use those entries in a Bayes inference.
var boundary=2.706;
var n=2e4;
var expected=n/4;
var yatesCorrection=0.5;
var higherObserved=Math.ceil( expected+yatesCorrection+Math.sqrt( boundary*expected/4 ) );
var lowerObserved=2*expected-higherObserved;
var x=[[higherObserved,lowerObserved],[lowerObserved,higherObserved]];
// throw( x );
independent( x )
The user is respectfully invited to select and copy and paste this program and to click the “Run a JavaScript program” button. Likelihood ratio larger than 1 is in favor of independence of the table. Smaller ratio is in favor of non-independence.
The user will notice that for each of the two programs the Bayes decision and non-Bayes decision are 5% decisions, but in opposite directions. These are numerical examples of the Jeffreys-Lindley paradox.
Enclose in brackets
It may be that the user is bringing in data copied from other web pages, or from files, and is pasting the data from the clipboard to the upper text area. If the number of rows is large, there needs to be a quick way to put the left and right square brackets on each line, together with a comma after the right square bracket. The “Enclose in brackets” button does this. Actually, it does the top and bottom rows wrong, so I respectfully ask the user to repair those two rows. Also, the button will delete completely empty lines, those having not even any whitespace. If the “Enclose in brackets” button is by mistake clicked more than once, the “Un-enclose” button may help.
Un-enclose
This button is to repair the damage done by clicking the “Enclose in brackets” button more than once. Even worse damage can be done by clicking the “Insert commas” button after too much clicking of the “Enclose in brackets” button. One click of the “Un-enclose” button may suffice to repair the damage from those two buttons. However, the “Un-enclose” button cannot restore the completely empty lines deleted by the “Enclose in brackets” button.
Insert commas
It may be that the user is bringing in data copied from other web pages, or from files, and is pasting the data from the clipboard to the upper text area. In that case the numbers are perhaps separated by blanks or tabulation characters or the like instead of by commas. The “Insert commas” button is meant to change such other separators to commas. The button does not always guess rightly what is expected of it, so the user is respectfully asked to look at the button’s work to make sure everything is as desired.
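A user who would rather prepare the data before pasting might use something like the following sketch. It is not the code behind the “Enclose in brackets” or “Insert commas” buttons, and the name rowsToTable is made up for this illustration. It turns rows of numbers separated by blanks, tabulation characters, commas or semicolons into a table in the form this page expects. Unlike the button, it also drops lines containing only whitespace.

```javascript
// Hypothetical helper: convert pasted rows of numbers into table text.
function rowsToTable(text) {
  const rows = text
    .split(/\r?\n/)                       // one row per line
    .map(line => line.trim())
    .filter(line => line.length > 0)      // drop empty lines
    .map(line => "[" + line.split(/[\s,;]+/).join(",") + "]");
  return "[ " + rows.join(",\n  ") + " ]";
}

const pasted = "1768 807 189 47\n946\t1387\t746\t53\n\n115 438 288 16";
const tableText = rowsToTable(pasted);
```

The result is text, ready to be pasted into the upper text area.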
What priors? How much do they matter?
The prior probability distributions for the cells of the non-independence model in the test of independence are uniform in the present page. It follows by calculus that the priors for the margins of that model are not uniform. For “coherence” we ought to use the same marginal priors in the independence model as in the non-independence model, and this I have done. Some readers of this page have e-mailed me saying that other programmers have used uniform priors for the margins of the independence model. I respectfully suggest that this can make a big numerical difference in the result. To show this, I have written an incoherent function competing against the independent function. The incoherent function uses only uniform priors for the margins. Here is a short program which builds a ten-by-ten array, fills it with zeroes and ones in a checker-board fashion, and calls both the functions, independent on the left and incoherent on the right.
var k1=10;
var k2=10;
var x=[];
for( var eye=0;eye<k1;eye++ )
{
x[eye]=[];
for( var j=0;j<k2;j++ )x[eye][j]=(eye+j)%2;
}
independent( x )+" "+incoherent( x );
I respectfully invite the reader to select that program with the mouse, copy it, move up to the upper text area, clear the area if necessary, paste into the area, and click on the “Run a JavaScript program” button. One of the answers is strongly in favor of independence, and the other is strongly opposed.
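The size of the difference can also be seen in the following sketch, which does not use the page’s own functions. It assumes that both the coherent and the incoherent calculation compare the independence model with a uniform Dirichlet prior on the cells, and that they differ only in the Dirichlet parameter given to the row and column margins. The name independenceRatio and its arguments are made up for this illustration.

```javascript
// Log of n! by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Log probability of one particular sequence of observations under a
// symmetric Dirichlet prior whose k parameters all equal alpha
// (alpha a positive integer).
function logDirichletMarginal(counts, alpha) {
  const k = counts.length;
  const n = counts.reduce((a, b) => a + b, 0);
  let result = logFactorial(k * alpha - 1) - logFactorial(k * alpha + n - 1);
  for (const c of counts) {
    result += logFactorial(alpha + c - 1) - logFactorial(alpha - 1);
  }
  return result;
}

// Ratio of the independence model to the non-independence model, with the
// Dirichlet parameters of the row and column priors supplied by the caller.
function independenceRatio(table, rowAlpha, colAlpha) {
  const rowTotals = table.map(r => r.reduce((a, b) => a + b, 0));
  const colTotals = table[0].map((unused, j) => table.reduce((a, r) => a + r[j], 0));
  const cells = [].concat(...table);
  const logIndependent =
    logDirichletMarginal(rowTotals, rowAlpha) + logDirichletMarginal(colTotals, colAlpha);
  return Math.exp(logIndependent - logDirichletMarginal(cells, 1));
}

// The same ten-by-ten checker-board of zeroes and ones.
const k1 = 10;
const k2 = 10;
const board = [];
for (let i = 0; i < k1; i++) {
  board[i] = [];
  for (let j = 0; j < k2; j++) board[i][j] = (i + j) % 2;
}

// Coherent: margins of the uniform cell prior (parameter k2 for rows, k1 for columns).
const coherentRatio = independenceRatio(board, k2, k1);
// Incoherent: uniform priors on the margins (parameter 1 for both).
const incoherentRatio = independenceRatio(board, 1, 1);
```

Under these assumptions the coherent ratio is roughly 1000 and the incoherent ratio is roughly 0.001.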
smtw2gh at gmail dot com