Links within this page

Rows are independent of columns
Probabilities are uniform
Probabilities are proportional to numerators
One parameter
Run a JavaScript program
Introduction
Which browsers?
Theory
Comparisons with non-Bayes inferences
Enclose in brackets
Un-enclose
Insert commas
What priors? How much do they matter?
License, revision date, and e-mail address

Introduction

The Bayes programs on this page are all exact in the Bayesian sense, but not in the frequentist sense. Two of them are conservative in the frequentist sense. Those two are “Probabilities are uniform” and “Probabilities are proportional to numerators.” For those two the printed Bayesian likelihood ratio is an upper bound on the frequentist p-value. The frequentist null hypothesis for each of those two tests is its title in quotation marks. (Those two null hypotheses are simple, and the null hypothesis expected value of the reciprocal of the printed likelihood ratio is unity, and we use Markov’s inequality.)
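
The parenthetical argument can be written out in a few lines. The notation here is my own and is only a sketch: $x$ is an observed table, $H_0$ is the simple null hypothesis, $m(x)$ is the Bayes marginal probability of $x$ under the alternative model, and $\Lambda(x) = P(x \mid H_0)/m(x)$ is the printed ratio. The sums run over all tables with the same total count, each of which has positive probability under $H_0$.

```latex
\begin{align*}
\mathbb{E}_{H_0}\!\left[\frac{1}{\Lambda(X)}\right]
  &= \sum_{x} P(x \mid H_0)\,\frac{m(x)}{P(x \mid H_0)}
   = \sum_{x} m(x) = 1, \\
P_{H_0}\!\bigl(\Lambda(X) \le \lambda\bigr)
  &= P_{H_0}\!\left(\frac{1}{\Lambda(X)} \ge \frac{1}{\lambda}\right)
   \le \lambda\,\mathbb{E}_{H_0}\!\left[\frac{1}{\Lambda(X)}\right]
   = \lambda .
\end{align*}
```

Taking $\lambda$ to be the ratio actually printed gives the bound: the p-value of the test that uses $\Lambda$ as its statistic is at most the printed ratio.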

All the Bayes programs on this page, except one, make use of Dirichlet distributions and Laplace’s Law of Succession. The exception is the program for “One parameter” which uses formulas furnished by the user and integrates by Romberg’s method applied to the midpoint rule. Any mistakes in the programs are mine, not Dirichlet’s or Laplace’s or Romberg’s.

In the following paragraphs are some examples showing use of the programs on this page. The user is respectfully invited to try out these examples or to use any others. The only thing to remember is: follow the grammatical rules of JavaScript. (This is because the “eval” function of JavaScript is used in picking up the data from the text area.) In particular, remember that an integer written with a leading zero digit, such as 010, is read in base 8, so 010 means eight.
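
The leading-zero pitfall can be seen in a few lines. This is only an illustration: the page's own call to “eval” is not shown here, so the indirect call below and the sample input are my assumptions.

```javascript
// An indirect call to eval runs the text as ordinary (non-strict) global code,
// where a leading zero followed only by digits 0-7 marks a base-8 literal.
const indirectEval = eval;

const pasted = "[010, 10, 0]";          // hypothetical text typed by a user
const parsed = indirectEval(pasted);

// The first entry was typed as 010 but is read as eight, not ten.
console.log(parsed);
```

A count copied from a source that pads numbers with zeros should therefore have the padding removed before it is pasted.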

If the user sees an answer “less than 5e-324”, then there was an underflow. 5e-324 is the smallest positive number that JavaScript can represent.
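
Here is a small demonstration of that limit, together with the usual remedy of adding logarithms instead of multiplying. Whether the programs on this page work in logarithms internally is not stated, so that part is only a suggestion for users who write their own programs.

```javascript
// Number.MIN_VALUE is the smallest positive double; it prints as 5e-324.
const smallest = Number.MIN_VALUE;
const halved = smallest / 2;            // cannot be represented, rounds to 0

// A product of many small factors underflows to 0,
// while the sum of their logarithms stays an ordinary number.
let product = 1;
let logSum = 0;
for (let i = 0; i < 400; i++) {
  product *= 0.1;
  logSum += Math.log(0.1);
}

console.log(smallest, halved, product, logSum);
```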

Which browsers?

Modern browsers such as Safari 3, Microsoft Internet Explorer 6, Netscape 7, and Opera 8 can use this page. Netscape 4 is out of date and cannot use this page.

Rows are independent of columns

The following contingency table is quoted from page 584 of Kendall, Maurice G., and Stuart, Alan, The Advanced Theory of Statistics, Volume 2, Inference and Relationship, Charles Griffin & Company Limited, London, 1961. They cite Ammon, Zur Anthropologie der Badener.
[
[1768,  807, 189, 47],
[ 946, 1387, 746, 53],
[ 115,  438, 288, 16]
]
The user is respectfully invited to select and copy that table, and to clear out the upper text area with the “Clear” button, and to paste the table into that text area, and to click on the “Rows are independent of columns” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of the independence model. Smaller ratio is in favor of the non-independence model.

Here is a table from Agresti, A., A Survey of Exact Inference for Contingency Tables, Statistical Science 1992, Vol. 7, No. 1, 131-177. It is Table 1 on page 132.

[
[17066,14464,788,126,37],
[   48,   38,  5,  1, 1]
]
Agresti says that the exact p-value is 0.139 using the likelihood-ratio statistic. The user is respectfully invited to compare our present Bayesian likelihood-ratio for that table.

The present program can do sparse tables. Here is a table copied from P. Diaconis and B. Sturmfels, Algebraic algorithms for sampling from conditional distributions, Annals of Statistics, 1998, Vol. 26, No. 1, 363-397. This table is on their page 364.

  [ [1,0,0,0,1,2,0,0,1,0,1,0],
    [1,0,0,1,0,0,0,0,0,1,0,2],
    [1,0,0,0,2,1,0,0,0,0,0,1],
    [3,0,2,0,0,0,1,0,1,3,1,1],
    [2,1,1,1,1,1,1,1,1,1,1,0],
    [2,0,0,0,1,0,0,0,0,0,0,0],
    [2,0,2,1,0,0,0,0,1,1,1,2],
    [0,0,0,3,0,0,1,0,0,1,0,2],
    [0,0,0,1,1,0,0,0,0,0,1,0],
    [1,1,0,2,0,0,1,0,0,1,1,0],
    [0,1,1,1,2,0,0,2,0,1,1,0],
    [0,1,1,0,0,0,1,0,0,0,0,0] ]
Diaconis and Sturmfels say, on their page 363, “The classical rules of thumb for validity of the chi-square approximation (minimum 5 per cell) are badly violated here, and there are too many tables with these margins to permit exact enumeration.” They are speaking about non-Bayes tests. The present Bayes program shows a strong preference for the “Rows are independent of columns” model for this table.

Theory

The mathematics of the present file is copied out of Wilks, Samuel S., Mathematical Statistics, John Wiley & Sons, Inc., New York, London, Second printing with corrections, 1963. I respectfully direct the reader to pages 177ff, section 7.7, THE DIRICHLET DISTRIBUTION. The most useful formulas, for Bayes inference, are on page 178, formulas (7.7.4) and (7.7.5). That is, the Bayes probability is a product and ratio of gamma functions.

Any mistakes I have made in using Wilks’ formulas are my own mistakes, certainly not his.
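
For readers without Wilks at hand, the product and ratio of gamma functions can be stated as the standard Dirichlet integral. The notation below is mine and is not copied from Wilks' numbered formulas: $n_1,\dots,n_k$ are the cell counts, $N$ is their sum, $\alpha_1,\dots,\alpha_k$ are the prior parameters, and $A$ is the sum of the $\alpha_i$.

```latex
\begin{align*}
\int \prod_{i=1}^{k} p_i^{\,n_i}\;
  \frac{\Gamma(A)}{\prod_{i=1}^{k}\Gamma(\alpha_i)}
  \prod_{i=1}^{k} p_i^{\,\alpha_i-1}\,dp
  &= \frac{\Gamma(A)}{\Gamma(A+N)}
     \prod_{i=1}^{k}\frac{\Gamma(\alpha_i+n_i)}{\Gamma(\alpha_i)}, \\
\text{and with every } \alpha_i = 1:\qquad
  &= \frac{(k-1)!\,\prod_{i=1}^{k} n_i!}{(N+k-1)!}.
\end{align*}
```

With a uniform prior the predictive probability that the next observation falls in cell $i$ is $(n_i+1)/(N+k)$, which is Laplace's Law of Succession.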

Probabilities are uniform

All students of statistical inference know about “An argument for Divine Providence, taken from the constant Regularity observ’d in the Births of both Sexes.” By Dr. John Arbuthnott, Physitian in Ordinary to Her Majesty, and Fellow of the College of Physitians and the Royal Society. From Phil. Trans. (1710) 27, 186-90. The user is respectfully invited to select and copy the table
[82,0]
and to click the “Clear” button to clear the upper text area, and to paste the table into that text area, and to click on the “Probabilities are uniform” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of uniformity, and smaller ratio is in favor of non-uniformity.

The program is not limited to only two cells. Here is a fictitious table for throwing a cheap variety-store plastic die one hundred times and counting how many times each face was on top. The counts for faces 1, 2, 3, 4, 5, and 6 are shown in that order. As before, likelihood ratio larger than 1 is in favor of uniformity of probabilities, and smaller ratio is in favor of non-uniformity.

[21,25,26,16,0,12]
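
For readers who want to see the arithmetic, here is a minimal sketch of a ratio of this kind. It is not the code behind the button. It assumes a flat Dirichlet prior for the non-uniform model, and the names logFactorial and uniformSketch are made up for the illustration.

```javascript
// log(n!) by direct summation; adequate for counts of the size used here.
function logFactorial(n) {
  let sum = 0;
  for (let k = 2; k <= n; k++) sum += Math.log(k);
  return sum;
}

// Hypothetical stand-in for the "uniform" function, not the author's code.
function uniformSketch(counts) {
  const k = counts.length;
  const total = counts.reduce((a, b) => a + b, 0);
  // Uniform model: every cell has probability 1/k.
  const logNull = -total * Math.log(k);
  // Non-uniform model under a flat Dirichlet prior:
  // marginal = (k-1)! * product of n_i! / (N+k-1)!
  let logAlt = logFactorial(k - 1) - logFactorial(total + k - 1);
  for (const n of counts) logAlt += logFactorial(n);
  // The multinomial coefficient is common to both models and cancels.
  return Math.exp(logNull - logAlt);
}

console.log(uniformSketch([82, 0]));
console.log(uniformSketch([21, 25, 26, 16, 0, 12]));
```

For the table [82,0] the formula reduces to 83 divided by 2 to the power 82, which gives a quick check on the sketch.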

Probabilities are proportional to numerators

From page 422 of Kendall and Stuart, op. cit., comes the following table for one of Mendel’s plant-breeding experiments.
[
[315,101,108,32],
[  9,  3,  3, 1]
]
The numbers 9 3 3 and 1 are to be understood as the numerators of the theoretical probability fractions 9/16 3/16 3/16 and 1/16. (In Kendall and Stuart the denominator 16 is actually printed. There is no advantage in making the user type the denominator over and over again, so the present program does the denominator without being told.) The user is respectfully invited to select and copy that table, to clear the upper text area, to paste the table into that text area, and to click on the “Probabilities are proportional to numerators” button. The likelihood ratio will be printed in the lower text area. Likelihood ratio larger than 1 is in favor of proportionality, and smaller ratio is in favor of non-proportionality.
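
The same kind of calculation can be sketched for numerators. As before, this is an illustration under the assumption of a flat Dirichlet prior for the non-proportional model, not the code behind the button, and proportionalSketch is a made-up name.

```javascript
// log(n!) by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let k = 2; k <= n; k++) sum += Math.log(k);
  return sum;
}

// Hypothetical stand-in for the "proportional" function.
// table[0] holds the counts and table[1] holds the numerators.
function proportionalSketch(table) {
  const [counts, numerators] = table;
  const k = counts.length;
  const total = counts.reduce((a, b) => a + b, 0);
  // The denominator is computed here, so the user never types it.
  const denominator = numerators.reduce((a, b) => a + b, 0);
  let logNull = 0;
  counts.forEach((n, i) => {
    if (n > 0) logNull += n * Math.log(numerators[i] / denominator);
  });
  // Flat Dirichlet marginal: (k-1)! * product of n_i! / (N+k-1)!
  let logAlt = logFactorial(k - 1) - logFactorial(total + k - 1);
  for (const n of counts) logAlt += logFactorial(n);
  return Math.exp(logNull - logAlt);
}

console.log(proportionalSketch([[315, 101, 108, 32], [9, 3, 3, 1]]));
```

Equal numerators turn this into the uniform test of the previous section.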

One parameter

The previous section was about constant numerators. In this section they are generalized to numerators depending on an unknown value of a parameter t. The prior distribution of the t parameter is understood to be uniform from zero to unity. In the following example the first model supposes that Nature is tossing a bent coin three times, and the number of heads determines which cell has its count advanced. Let the cell counts for 0, 1, 2, and 3 heads be 102, 301, 317, and 96. Let the prior density for probability of heads on one toss be flat. Then the table to copy is
[
[ 102, function(t){ return (1-t)*(1-t)*(1-t) } ],
[ 301, function(t){ return   3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return       3*(1-t)*t*t } ],
[  96, function(t){ return             t*t*t } ]
]
The user is respectfully invited to copy the table, to clear the upper text area, to paste into that upper text area, and to click on the “One parameter” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of the one parameter model of the table. Smaller ratio is in favor of a Dirichlet model with more parameters. The likelihood ratio of this table is bigger than 90.

If instead the table is

[
[   0, function(t){ return (1-t)*(1-t)*(1-t) } ],
[ 301, function(t){ return   3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return       3*(1-t)*t*t } ],
[  96, function(t){ return             t*t*t } ]
]
then the likelihood ratio is less than 1.5e-29. This is assuming that three tails could occur but did not. It is easy to change the model so that three tails is specially forbidden. That is, let us delete the line for three tails. Now the table looks like
[
[ 301, function(t){ return   3*(1-t)*(1-t)*t } ],
[ 317, function(t){ return       3*(1-t)*t*t } ],
[  96, function(t){ return             t*t*t } ]
]
The likelihood ratio changes to more than 7.9. Let the user not worry about the denominator for these three numerators: the program is calculating that denominator, so the user need not do it.
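
Here is a hedged sketch of how a one-parameter ratio of this kind can be computed. It is not the program behind the button: it uses a plain midpoint rule with a fixed number of steps instead of Romberg's method, it works in logarithms to avoid underflow, and the names oneParameterSketch and steps are made up for the illustration.

```javascript
// log(n!) by direct summation.
function logFactorial(n) {
  let sum = 0;
  for (let k = 2; k <= n; k++) sum += Math.log(k);
  return sum;
}

// Hypothetical stand-in for "oneParameter". Each row is [count, function of t].
function oneParameterSketch(table, steps = 20000) {
  const counts = table.map(row => row[0]);
  const numerators = table.map(row => row[1]);
  const k = counts.length;
  const total = counts.reduce((a, b) => a + b, 0);

  // Log-likelihood at each midpoint; midpoints never touch t=0 or t=1.
  const logs = [];
  for (let s = 0; s < steps; s++) {
    const t = (s + 0.5) / steps;
    const values = numerators.map(f => f(t));
    const denominator = values.reduce((a, b) => a + b, 0);
    let logLik = 0;
    for (let i = 0; i < k; i++) {
      if (counts[i] > 0) logLik += counts[i] * Math.log(values[i] / denominator);
    }
    logs.push(logLik);
  }

  // Factor out the peak before exponentiating, so the sum cannot underflow.
  const peak = logs.reduce((a, b) => Math.max(a, b), -Infinity);
  const scaled = logs.reduce((a, l) => a + Math.exp(l - peak), 0);
  const logNull = peak + Math.log(scaled / steps);   // uniform prior on t

  // Competing model: flat Dirichlet prior on the cell probabilities.
  let logAlt = logFactorial(k - 1) - logFactorial(total + k - 1);
  for (const n of counts) logAlt += logFactorial(n);
  return Math.exp(logNull - logAlt);
}

console.log(oneParameterSketch([
  [102, t => (1 - t) * (1 - t) * (1 - t)],
  [301, t => 3 * (1 - t) * (1 - t) * t],
  [317, t => 3 * (1 - t) * t * t],
  [96, t => t * t * t]
]));
```

A simple check is the two-cell table with numerators t and 1-t, where the one-parameter model and the Dirichlet model coincide and the ratio should come out as 1.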

Run a JavaScript program

Sometimes a problem is more complicated than those above, so that a little programming is required. For example, it may happen that we have not one but two contingency tables for related data. For illustrative purposes let them be
[
[50,8],
[92,50]
]
and
[
[50,92],
[8,50]
]
The likelihood ratio for each is 0.056850990006292665, but we have two of them. How are they to be combined? Collapsing them into one table is a poor idea, because then the table will be
[
[100,100],
[100,100]
]
which has a likelihood ratio slightly over 12. This happens even though the two original tables have the same cross-ratio as each other. It is an example of Simpson’s paradox. We need to think of another way of combining the data. Since the likelihood ratios are independent of each other, we may multiply the ratios together. This multiplication ought to be done by a computer, not a human. Here is a program to do the job:
independent( [[50,8],[92,50]] )
*
independent( [[50,92],[8,50]] )
The “independent” function does what the “Rows are independent of columns” button does. The user is respectfully invited to select and copy that program, to clear the upper text area, to paste the program into that text area, and to click on the “Run a JavaScript program” button. The product of the two likelihood ratios will be printed in the lower text area. That product is satisfyingly small.
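
The remark about cross-ratios can be checked with ordinary JavaScript, without any of the functions of this page. The helper name crossRatio is my own.

```javascript
// Cross-ratio (odds ratio) of a 2 by 2 table [[a, b], [c, d]] is (a*d)/(b*c).
function crossRatio(t) {
  return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0]);
}

const first = [[50, 8], [92, 50]];
const second = [[50, 92], [8, 50]];

// Collapse the two tables by adding them cell by cell.
const collapsed = first.map((row, i) => row.map((cell, j) => cell + second[i][j]));

console.log(crossRatio(first), crossRatio(second), crossRatio(collapsed));
```

The two original tables share one cross-ratio, well away from 1, while the collapsed table has cross-ratio exactly 1, which is why collapsing hides the association.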

Another use for programming is testing an array for symmetry around its center. Here is an artificial example:

[50,0,49,1]
It is not a good idea to combine the two cells on the left and combine the two cells on the right to get
[50,50]
That is perfectly symmetrical, but the array we started with did not look symmetrical at all. A better way is to compare the first and fourth cells, and compare the second and third cells, and then multiply the likelihood ratios:
uniform( [50,1] )
*
uniform( [0,49] )
(Here the “uniform” function does what the “Probabilities are uniform” button does.) Symmetry is seen to have a very small likelihood compared with non-symmetry.

Besides the “independent” and “uniform” functions this file contains two other functions. There is a function called “proportional” to do what the “Probabilities are proportional to numerators” button does, and there is a function called “oneParameter” to do what the “One parameter” button does.

Comparisons with non-Bayes inferences

This section has bigger programs, so as to compare Bayes and non-Bayes inference.

The number 1.645 is copied out of the “normal” table used in many introductory courses in statistics. It is used in building a 5% one-sided boundary for a non-sequential test of the fairness of a coin. Here is a JavaScript program to find the number of heads that will cause that test to “reject,” and to use that number of heads in a Bayes inference.

var boundary=1.645;
var tosses=1e4;
var yatesCorrection=.5;
var heads=Math.ceil( yatesCorrection + tosses*.5 + boundary*Math.sqrt(tosses*.5*.5) );
var tails=tosses-heads;
var x=[heads,tails];
// throw( x );
uniform( x )
The user is respectfully invited to select and copy that program, and to click the “Clear” button to clear the upper text area, and to paste the program into that text area, and to click the “Run a JavaScript program” button. The likelihood ratio will be printed in the lower text area. Ratio larger than 1 is in favor of uniformity, that is, in favor of fairness of the coin. Smaller ratio is in favor of unfairness of the coin. To see the numerical data in the array, the user can uncomment the throw statement and click again.

The number 2.706 is copied out of the “chi-squared” table used in many introductory courses in statistics. It is used in building a 5% one-sided boundary for a non-sequential test of the independence of a 2 by 2 contingency table. Here is a JavaScript program to find the contingency table entries that will cause that test to “reject,” and to use those entries in a Bayes inference.

var boundary=2.706;
var n=2e4;
var expected=n/4;
var yatesCorrection=0.5;
var higherObserved=Math.ceil( expected+yatesCorrection+Math.sqrt( boundary*expected/4 ) );
var lowerObserved=2*expected-higherObserved;
var x=[[higherObserved,lowerObserved],[lowerObserved,higherObserved]];
// throw( x );
independent( x )
The user is respectfully invited to select and copy and paste this program and to click the “Run a JavaScript program” button. Likelihood ratio larger than 1 is in favor of independence of the table. Smaller ratio is in favor of non-independence.

The user will notice that for each of the two programs the Bayes decision and non-Bayes decision are 5% decisions, but in opposite directions. These are numerical examples of the Jeffreys-Lindley paradox.

Enclose in brackets

It may be that the user is bringing in data copied from other web pages, or from files, and is pasting the data from the clipboard to the upper text area. If the number of rows is large, there needs to be a quick way to put the left and right square brackets on each line, together with a comma after the right square bracket. The “Enclose in brackets” button does this. Actually, it does the top and bottom rows wrong, so I respectfully ask the user to repair those two rows. Also, the button will delete completely empty lines, those having not even any whitespace. If the “Enclose in brackets” button is by mistake clicked more than once, the “Un-enclose” button may help.
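
A user who prefers to prepare the text elsewhere can do the same job with a few lines of JavaScript. The helper below is hypothetical and is not the code behind the button. Unlike the button, it also writes the outer pair of brackets.

```javascript
// Hypothetical helper: wrap each non-empty line in square brackets and
// join the rows into a single array literal.
function encloseRows(text) {
  const rows = text
    .split(/\r?\n/)
    .filter(line => line.length > 0);   // drop completely empty lines
  return "[\n" + rows.map(line => "[" + line + "]").join(",\n") + "\n]";
}

console.log(encloseRows("1768, 807, 189, 47\n\n946, 1387, 746, 53"));
```

The result can be pasted into the upper text area as it stands.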

Un-enclose

This button is to repair the damage done by clicking the “Enclose in brackets” button more than once. Even worse damage can be done by clicking the “Insert commas” button after too much clicking of the “Enclose in brackets” button. One click of the “Un-enclose” button may suffice to repair the damage from those two buttons. However, the “Un-enclose” button cannot restore the completely empty lines deleted by the “Enclose in brackets” button.

Insert commas

It may be that the user is bringing in data copied from other web pages, or from files, and is pasting the data from the clipboard to the upper text area. In that case the numbers are perhaps separated by blanks or tabulation characters or the like instead of by commas. The “Insert commas” button is meant to change such other separators to commas. The button does not always guess rightly what is expected of it, so the user is respectfully asked to look at the button’s work to make sure everything is as desired.
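
The simplest version of that conversion is sketched below. It is a hypothetical helper, not the code behind the button, and it handles only blanks and tabulation characters between numbers.

```javascript
// Hypothetical helper: turn runs of blanks or tabs between numbers into commas,
// one line at a time, leaving the line breaks alone.
function insertCommasSketch(text) {
  return text
    .split("\n")
    .map(line => line.trim().split(/[ \t]+/).join(","))
    .join("\n");
}

console.log(insertCommasSketch("1768  807\t189 47\n 946 1387 746 53"));
```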

What priors? How much do they matter?

The prior probability distributions for the cells of the non-independence model in the test of independence are uniform in the present page. It follows by calculus that the priors for the margins of that model are not uniform. For “coherence” we ought to use the same marginal priors in the independence model as in the non-independence model, and this I have done. Some readers of this page have e-mailed me saying that other programmers have used uniform priors for the margins of the independence model. I respectfully suggest that this can make a big numerical difference in the result. To show this, I have written an incoherent function competing against the independent function. The incoherent function uses only uniform priors for the margins. Here is a short program which builds a ten-by-ten array, fills it with zeroes and ones in a checker-board fashion, and calls both the functions, independent on the left and incoherent on the right.
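var k1=10;   // number of rows
var k2=10;   // number of columns
var x=[];
for( var eye=0;eye<k1;eye++ )
{
	x[eye]=[];
	// checker-board: 0 where row plus column is even, 1 where it is odd
	for( var j=0;j<k2;j++ )x[eye][j]=(eye+j)%2;
}
// the value of this last expression is what gets printed
independent( x )+"   "+incoherent( x );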
I respectfully invite the reader to select that program with the mouse, copy it, move up to the upper text area, clear the area if necessary, paste into the area, and click on the “Run a JavaScript program” button. One of the answers is strongly in favor of independence, and the other is strongly opposed.
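
The difference between the two choices of marginal prior can be sketched in a few lines. This is my reading of the description above, not the code of the page: a uniform Dirichlet prior on the cells gives the row margins a Dirichlet prior whose parameters all equal the number of columns, and gives the column margins parameters all equal to the number of rows. The names independentSketch, incoherentSketch, and the helpers are made up for the illustration.

```javascript
// log(n!) by direct summation, and log Gamma at a positive integer.
function logFactorial(n) {
  let sum = 0;
  for (let k = 2; k <= n; k++) sum += Math.log(k);
  return sum;
}
const logGammaInt = n => logFactorial(n - 1);   // Gamma(n) = (n-1)!

// log of the Dirichlet integral of prod p_i^n_i, every prior parameter = alpha.
function logDirichletMarginal(counts, alpha) {
  const k = counts.length;
  const total = counts.reduce((a, b) => a + b, 0);
  let result = logGammaInt(k * alpha) - logGammaInt(k * alpha + total);
  for (const n of counts) result += logGammaInt(n + alpha) - logGammaInt(alpha);
  return result;
}

// Ratio of the independence model to the model with a uniform prior on cells.
function independenceRatio(table, rowAlpha, colAlpha) {
  const rowSums = table.map(row => row.reduce((a, b) => a + b, 0));
  const colSums = table[0].map((_, j) => table.reduce((a, row) => a + row[j], 0));
  const logNull = logDirichletMarginal(rowSums, rowAlpha)
                + logDirichletMarginal(colSums, colAlpha);
  const logAlt = logDirichletMarginal(table.flat(), 1);
  return Math.exp(logNull - logAlt);
}

// Coherent: marginal priors implied by the uniform prior on the cells.
const independentSketch = t => independenceRatio(t, t[0].length, t.length);
// Incoherent: uniform priors on both margins.
const incoherentSketch = t => independenceRatio(t, 1, 1);

// The ten-by-ten checker-board of the program above.
const board = Array.from({ length: 10 }, (_, i) =>
  Array.from({ length: 10 }, (_, j) => (i + j) % 2));
console.log(independentSketch(board), incoherentSketch(board));
```

The sketch can also be run on the 2 by 2 tables of the “Run a JavaScript program” section and compared with the ratios quoted there.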

License, revision date, and e-mail address

The tabular data quoted from journals and books and other web pages are copyrighted by their publishers. The remainder of this file, including the programs, is in the public domain. The date of this revision is 4 March 2012. Criticism both constructive and destructive comes to me, Harold Kaplan, at smtw2gh at gmail dot com.