The Java™ applet on this page is a frequentist exact (but non-deterministic) freeware Monte Carlo program to work two-dimensional contingency tables with structural zeroes, testing quasi-independence of rows against columns. The algorithm is due to Besag and Clifford (1989). I wrote and uploaded this applet merely because I could not find such an applet on the Web.
The AWT controls in order are a single-line TextField to hold the paths integer, a single-line TextField to hold the steps integer, a multi-line TextArea which I will call the upper text area, a Button called “goSwap,” a faded Button called “percent,” a faded Button called “stop,” and a multi-line TextArea which I will call the lower text area.
When loading is finished, paths and steps are already set to their default values: one thousand and one thousand. Nothing prevents the user from changing these. The paths number is the total number of genuine and fake paths. The steps number is the number of steps to each path.
The values of paths and steps are up to the user. If lunch will take an hour, then paths can be made larger, so as possibly to get a better p-value, or maybe steps can be made larger, so as to get more power.
From Das (1945) and White (1963) I copy
Classification of Purum marriages
               Sib of husband
Sib of wife    Marrim  Makan  Parpa  Thao  Kheyang
Marrim         [0]     5      1      [0]   6
Makan          5       [0]    0      16    2
Parpa          [0]     2      [0]    10    11
Thao           10      [0]    [0]    [0]   9
Kheyang        6       20     8      0     1
I use “-1” instead of “[0]” to signify a structural zero, so the table looks like
-1 5 17 -1 6
5 -1 0 16 2
-1 2 -1 10 11
10 -1 -1 -1 9
6 20 8 0 1
I respectfully suggest that the reader use the mouse to select that table, copy to the clipboard, move to the upper text area, clear that area if necessary, paste into the area, and click on the “goSwap” button. After less than a second, the program prints in the lower text area
pValue of this run is 0.001
time to run was 0.782 seconds
The p-value could not be smaller than 0.001, because paths was only one thousand. Perhaps a bigger paths would make a smaller p-value.
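The floor of 0.001 can be illustrated with a minimal sketch. It assumes the p-value is the usual Monte Carlo rank ratio: the number of samples (the genuine one included) whose statistic is at least as large as the genuine statistic, divided by paths. The class and method names here are hypothetical and are not taken from McmcBesag.java.

```java
final class MonteCarloPValue {
    /**
     * Returns (number of statistics, the genuine one included, that are at
     * least as large as the genuine statistic) divided by the total count.
     */
    static double pValue(double genuine, double[] fakes) {
        int atLeast = 1; // the genuine sample always counts itself
        for (double f : fakes) {
            if (f >= genuine) {
                atLeast++;
            }
        }
        return (double) atLeast / (fakes.length + 1);
    }

    public static void main(String[] args) {
        // 999 fakes + 1 genuine = 1000 paths; the fake values are made up.
        double[] fakes = new double[999];
        for (int i = 0; i < fakes.length; i++) {
            fakes[i] = i * 0.01;
        }
        // No fake reaches the genuine value, so the result is 1/1000.
        System.out.println(pValue(1000.0, fakes));
    }
}
```

Under this assumption the smallest attainable p-value is 1/paths, so ten thousand paths would allow values down to 0.0001.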
The present program sometimes fails. Here is an example to select and copy to the clipboard and paste into the upper text area and run with the “goSwap” button:
-1 2 30
4 -1 5
6 70 -1
The answer seems to be
pValue of this run is 1.0
time to run was 0.766 seconds
That is wrong. The answer using Determ2.htm is instead
pValue is 1.196092889749022E-5
time to run was 0.016 seconds
Speaking vaguely, there are too many structural zeroes for the size of the table. On the other hand, the table
-1 2 30 3
4 -1 5 4
6 70 -1 5
6 7 8 -1
gets the answer
pValue of this run is 0.001
time to run was 0.828 seconds
using the present program.
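One possible way to make “too many structural zeroes” concrete is sketched here, under an assumption that the page does not confirm: that the chain moves by 2×2 swaps, each of which needs two rows and two columns whose four cells are all free of structural zeroes. The check counts such rectangles. The class and method names are hypothetical.

```java
final class SwapCheck {
    /** Counts row pairs and column pairs whose four cells are all non-structural (>= 0). */
    static int countFreeRectangles(int[][] t) {
        int count = 0;
        for (int r1 = 0; r1 < t.length; r1++) {
            for (int r2 = r1 + 1; r2 < t.length; r2++) {
                for (int c1 = 0; c1 < t[0].length; c1++) {
                    for (int c2 = c1 + 1; c2 < t[0].length; c2++) {
                        if (t[r1][c1] >= 0 && t[r1][c2] >= 0
                                && t[r2][c1] >= 0 && t[r2][c2] >= 0) {
                            count++;
                        }
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] small = {{-1, 2, 30}, {4, -1, 5}, {6, 70, -1}};
        int[][] large = {{-1, 2, 30, 3}, {4, -1, 5, 4}, {6, 70, -1, 5}, {6, 7, 8, -1}};
        System.out.println(countFreeRectangles(small)); // the failing 3x3 table
        System.out.println(countFreeRectangles(large)); // the 4x4 table that works
    }
}
```

The 3×3 table has no free rectangle at all, while the 4×4 table has six. If the assumption holds, a swap chain started from the 3×3 table could never leave it, so every fake would equal the genuine sample, which would be consistent with the reported p-value of 1.0.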
If, on the other hand, the percent is small, please click the “stop” button. Then change the paths and/or the steps to something smaller, and click the “goSwap” button again.
Rules to keep in mind
The integer values for paths and steps must not have any decimal points or exponents. The same goes for the counts in the upper text area. These rules and the usual rules of number format for the Java language will be enforced and diagnosed in the lower text area. Each non-empty row in the upper text area must have the same number of counts as every other non-empty row. Empty rows are permitted, and they have no meaning. Counts in the same row may be separated by one or more blanks or one or more tabs or any combination. A count in the upper text area may be strictly negative to signify a structural zero.
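These rules can be sketched as a small parser. This is an illustration only, not the applet's own parse() method. The class name, the method names, and the error message are hypothetical, and Integer.parseInt stands in for “the usual rules of number format for the Java language.”

```java
import java.util.ArrayList;
import java.util.List;

final class TableParser {
    /** Parses the text into a rectangular table; throws on malformed input. */
    static int[][] parse(String text) {
        List<int[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // empty rows are permitted and carry no meaning
            }
            String[] tokens = trimmed.split("[ \\t]+"); // blanks and tabs, in any mix
            int[] row = new int[tokens.length];
            for (int j = 0; j < tokens.length; j++) {
                // Throws NumberFormatException for "1.5", "1e3", and the like.
                row[j] = Integer.parseInt(tokens[j]);
            }
            if (!rows.isEmpty() && rows.get(0).length != row.length) {
                throw new IllegalArgumentException("rows differ in length");
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    /** A strictly negative count marks a structural zero. */
    static boolean isStructuralZero(int count) {
        return count < 0;
    }

    public static void main(String[] args) {
        int[][] t = parse("-1 2\t30\n\n4   -1 5\n6 70 -1\n");
        System.out.println(t.length + " rows, " + t[0].length + " columns");
    }
}
```

In the applet, an exception of this kind would presumably be reported in the lower text area rather than thrown to the console.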
The experimenter ought to click the “goSwap” button only once. (Running the program ten times and taking the smallest p-value is clearly cheating.) Students will perhaps be told by their teacher how many times they may click.
Browsers and Java
All the famous modern browsers can do this page correctly, but the “zoom” must be set to 100% or reset to zero, depending on the browser. The zoom control will commonly be on the “view” dropdown menu. If the zoom is wrong, then the “layout” of the Java applet will be wrong.
However, the program can work only if Java is installed and Java is turned on. If there is no Java or if Java has been turned off, then the text fields and the text areas and the “goSwap” and “percent” and “stop” buttons will not even be visible. In that case, the user is respectfully requested to download and install Java and/or to turn Java back on. Users who have trouble doing this are respectfully asked to get help from their classmates, children, spouses, or teachers.
Download
A reader or user wishing to download the files of this applet is respectfully invited to click on
McmcBesag.jar
to save the “jar” file. All the other McmcBesag files are zipped into it, so it can be unzipped after download to see and change them all. Yes, a “jar” file is merely a special kind of “zip” file.
If the browser’s downloader renames the “jar” file to a “zip” file, please be sure to rename it back to a “jar”. I know a browser which takes too much on itself in this way.
The algorithm
Let the sample in the upper text area be called “genuine.” We wish to build some “fake” samples to compare against it, and we wish the fakes and the genuine to be exchangeable if the null hypothesis is true. This would be true if the genuine and the fakes arose, all in the same random way, from a “starting” sample.
Besag and Clifford (1989) remembered that the Markov chain made by a Metropolis algorithm had the same probabilities in both directions. (We are, of course, supposing that the null hypothesis is true.) That is, we may move backwards in time from the genuine sample to the starting sample. Then we may move forward in time from the starting sample to each of the fakes.
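The backward-then-forward idea can be sketched as follows. This is not the applet's own code. It assumes a Metropolis chain that proposes 2×2 swaps and targets a distribution proportional to one over the product of the cell factorials, which is the conditional null distribution given the margins. The class name, the method names, and the seed are hypothetical. Because such a chain is reversible, the backward walk is simulated exactly like a forward walk.

```java
import java.util.Arrays;
import java.util.Random;

final class BackwardForwardSketch {
    /** One Metropolis step: propose a 2x2 swap and accept it with the usual ratio. */
    static void step(int[][] t, Random rng) {
        int r1 = rng.nextInt(t.length), r2 = rng.nextInt(t.length);
        int c1 = rng.nextInt(t[0].length), c2 = rng.nextInt(t[0].length);
        if (r1 == r2 || c1 == c2) return;
        int a = t[r1][c1], b = t[r1][c2], c = t[r2][c1], d = t[r2][c2];
        if (a < 0 || b < 0 || c < 0 || d < 0) return; // touches a structural zero
        if (b == 0 || c == 0) return;                  // would make a count negative
        // Target is proportional to 1 / product of n_ij!, so the
        // acceptance ratio for this move is b*c / ((a+1)*(d+1)).
        double ratio = (double) b * c / ((double) (a + 1) * (d + 1));
        if (rng.nextDouble() < ratio) {
            t[r1][c1]++; t[r2][c2]++; t[r1][c2]--; t[r2][c1]--;
        }
    }

    /** Walks the given number of steps from a copy of the table. */
    static int[][] walk(int[][] from, int steps, Random rng) {
        int[][] t = new int[from.length][];
        for (int i = 0; i < from.length; i++) t[i] = from[i].clone();
        for (int s = 0; s < steps; s++) step(t, rng);
        return t;
    }

    /** Returns paths-1 fakes: one walk back to a start, then independent walks forward. */
    static int[][][] makeFakes(int[][] genuine, int paths, int steps, Random rng) {
        int[][] start = walk(genuine, steps, rng); // backward in time, same law as forward
        int[][][] out = new int[paths - 1][][];
        for (int p = 0; p < out.length; p++) out[p] = walk(start, steps, rng);
        return out;
    }

    public static void main(String[] args) {
        int[][] genuine = {{-1, 2, 30, 3}, {4, -1, 5, 4}, {6, 70, -1, 5}, {6, 7, 8, -1}};
        int[][][] fakes = makeFakes(genuine, 1000, 1000, new Random(12345L));
        System.out.println(Arrays.deepToString(fakes[0]));
    }
}
```

Every swap leaves the row totals, the column totals, and the structural zeroes unchanged, so each fake has the same margins as the genuine sample.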
My Java method to do this is called “backwardForward.” This method is called twice, the first time to find the expected values, and the second time to use those expected values to get chi-squares. The same random number seed is used both times.
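A two-pass scheme of this kind might look like the sketch below. It assumes, without confirmation from the page, that the expected values are cell-wise averages over the sampled tables and that the chi-square sum skips structural zeroes. The class name, the method names, and the two small tables are made up for illustration.

```java
final class TwoPassSketch {
    /** Cell-wise average of the sampled tables; structural zeroes stay negative. */
    static double[][] mean(int[][][] tables) {
        int rows = tables[0].length, cols = tables[0][0].length;
        double[][] m = new double[rows][cols];
        for (int[][] t : tables) {
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    m[i][j] += (double) t[i][j] / tables.length;
                }
            }
        }
        return m;
    }

    /** Pearson chi-square over the cells that are not structural zeroes. */
    static double chiSquare(int[][] counts, double[][] expected) {
        double sum = 0.0;
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[i].length; j++) {
                if (counts[i][j] < 0 || expected[i][j] <= 0.0) {
                    continue; // structural zero, or nothing expected in this cell
                }
                double diff = counts[i][j] - expected[i][j];
                sum += diff * diff / expected[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two made-up tables with the same margins and one structural zero.
        int[][] t1 = {{-1, 1, 3}, {2, 3, 1}};
        int[][] t2 = {{-1, 3, 1}, {2, 1, 3}};
        int[][][] tables = {t1, t2};
        double[][] expected = mean(tables);            // first pass
        for (int[][] t : tables) {
            System.out.println(chiSquare(t, expected)); // second pass
        }
    }
}
```

Two java.util.Random objects constructed from the same seed yield the same sequence of draws, which is one way a second pass can revisit exactly the tables seen in the first pass without storing them.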
Advantages and disadvantages
The test on this page is frequentist and exact. Though a frequentist exact test, it is speedy. Asymptotic methods are more speedy but not exact. Bayes methods are more speedy but not frequentist. Deterministic methods, such as Fisher’s, are usually slow.
However, the test on this page is not deterministic. Since it uses random numbers, the p-values can differ from run to run. This is why the experimenter may run only once. Students, of course, may run as many times as their teacher permits.
A note to developers
Statisticians who do not plan to change or repair the program need not read this. The attention of developers is respectfully drawn to the line
final boolean debugging=false;
in the McmcBesag.java file. The debugging variable is used in the try/catch statement
try
{
    parse();
}
catch( Throwable thro )
{
    String temp="";
    temp+=thro.getClass().getCanonicalName();
    temp+="\n\n";
    temp+=thro.getMessage();
    temp+="\n";
    lowerTextArea.setText( temp );
    if( debugging )
    {
        StackTraceElement[] ste=thro.getStackTrace();
        int n=ste.length;
        for( int j=0;j<n;j++ )
        {
            lowerTextArea.append( "\n"+ste[j] );
        }
    }
}
where the line “if( debugging )” is the one in which debugging is used. When debugging is true, the StackTraceElements will be shown in the lower text area. That is, the program will say not only what happened, but also where it happened. Doubtless the professional Java programmers know about this already, but I am an amateur and I just found out. After the program is changed or repaired, the value of debugging can of course be set back to false.
Besag, J., and Clifford, P. (1989). Generalized Monte Carlo significance tests. Biometrika 76, 633-642.
Das, T. (1945). The Purums: An Old Kuki Tribe of Manipur. Calcutta, University of Calcutta.
White, H. C. (1963). An Anatomy of Kinship, p. 138. Englewood Cliffs, N. J., Prentice-Hall.
Revision date, licenses, and e-mail address
This program and its files are revised 7 March 2012.
The tabular data quoted from journals and books and other web pages are copyrighted by their publishers.
All the rest of this file and all of the other files which I include with it are in the public domain.
The Java™ language is the property of Oracle.
Please send all criticism, both constructive and destructive, to me, Harold M. Kaplan,
smtw2gh at gmail dot com