The user is respectfully warned that the data and formulas typed into this page will be picked up by JavaScript, so the data and formulas must follow the grammatical rules of JavaScript. In particular, an integer with a leading zero will be understood to be in base 8. A decimal-pointed number with an unnecessary leading zero on the left of the point may be regarded as wrong.
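The following is a minimal sketch of the literal rules just mentioned, on the assumption that the page evaluates what is typed as ordinary non-strict JavaScript source. The helper name evaluate is hypothetical and is not part of this page. It uses the Function constructor only as a stand-in for whatever the page really does.

```javascript
// Hypothetical helper: evaluate a snippet as non-strict JavaScript source and
// report the name of the error if the snippet does not parse.
function evaluate(source) {
  try {
    return new Function("return " + source + ";")();
  } catch (e) {
    return e.name; // for example "SyntaxError"
  }
}

var octal = evaluate("010");      // leading zero: read in base 8
var decimal = evaluate("10");     // no leading zero: read in base 10
var oneZero = evaluate("0.03");   // a single zero before the point is accepted
var noZero = evaluate(".03");     // no zero before the point is accepted too
var extraZero = evaluate("00.03"); // an extra leading zero does not parse

console.log(octal, decimal, oneZero, noZero, extraZero);
```

On this reading, an "unnecessary" leading zero is one that comes in front of a digit, as in 00.03 or 01.5, and not the single zero of 0.03.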
Get p from x
This page is for the exact one-sample one-sided Smirnov test of fit to the uniform distribution on the unit interval. See Smirnov (1939). This is sometimes called the Kolmogorov-Smirnov test, but it appears that Kolmogorov (1933) did the two-sided test, and his formulas are asymptotic or recurrent. I have not read either of these references; I read about them in Kendall and Stuart (1961) pages 457-458. I have copied Smirnov’s formula from Walsh (1962) page 320, second line from the bottom of the page. Actually, I do not subtract the sum from 1, because I am seeking the p-value. I have checked Smirnov’s formula numerically by constructing an appropriate system of ordinary differential equations and solving by Euler’s method.
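The following is a sketch of the calculation, not the program behind this page. It assumes that the formula is the closed form usually quoted for the one-sided statistic, namely p = d Σ C(n,j) (d + j/n)^(j−1) (1 − d − j/n)^(n−j), summed over j from 0 to the whole part of n(1 − d). It also assumes that the asymptotic approximation is exp(−2nd²). The function names are hypothetical.

```javascript
// Exact one-sided p-value P(D >= d) for sample size n under the uniform null,
// using the closed form assumed in the paragraph above.
function smirnovExactP(n, d) {
  if (d <= 0) return 1;
  if (d >= 1) return 0;
  var jMax = Math.floor(n * (1 - d));
  var logC = 0; // logarithm of the binomial coefficient C(n, j)
  var sum = 0;
  for (var j = 0; j <= jMax; j++) {
    if (j > 0) logC += Math.log(n - j + 1) - Math.log(j);
    var a = d + j / n;
    var b = 1 - d - j / n;
    // When b is zero (or negative by rounding) the term contributes nothing.
    if (b > 0) {
      sum += Math.exp(logC + (j - 1) * Math.log(a) + (n - j) * Math.log(b));
    }
  }
  return d * sum;
}

// Asymptotic approximation, assumed here to be exp(-2 n d^2).
function smirnovAsymptoticP(n, d) {
  return Math.exp(-2 * n * d * d);
}

var exact = smirnovExactP(1, 0.97);       // close to 0.03
var approx = smirnovAsymptoticP(1, 0.97); // close to 0.152
console.log(exact, approx);
```

The terms are added up through logarithms so that the binomial coefficients do not overflow when n is large. For n = 1 the sum has a single term and the p-value reduces to 1 − d.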
Here is an example. Let a sample consist of only one number, say, 0.03. Let us write this as
[ .03 ]
The reader is respectfully invited to select this array, including its brackets, with the mouse, to copy it to the clipboard, to move up the file to the upper text area, to click on the “Clear” button if necessary, to paste into the text area, and to click on the “Get p from x” button. The exact p-value and an asymptotic approximation to it will appear in the lower text area. (“Exact” is in the statistical sense. The floating-point arithmetic is not exact.) The reader will notice that the exact value is smaller, here much smaller. (I have never heard of the exact p-value’s being bigger.)
The null hypothesis, as I said, is that the population is uniform from zero to one. The alternative hypothesis is that the numbers have a “leftward” tendency. That is, the sample distribution function has a tendency to be greater than the null hypothesis would suggest. In the present sample, .03 is much more to the left than would usually occur for the uniform distribution. If the user had in mind a “rightward” alternative, she/he could use instead
[ .03,rightward ]
and select and copy and so on. This time the exact p-value will be 97%, and the approximation will be even bigger, so the sample seems to fit the null hypothesis. The word rightward need not be on the right of the sample number(s). It can be on the left
[ rightward,.03 ]
or among the numbers if there are more numbers. Let us have some more numbers:
[.4,.7,.3,.8,.6,.1,.2,.5,.9 ]
Again the reader is respectfully invited. As the reader sees, the numbers need not be in order, because the program will sort them. This time the p-value and the approximate p-value are large, so the null hypothesis is not rejected.
It is easy to invent a sample which will reject both leftward and rightward. Here it is:
[ 0,0,0,0,0,1,1,1,1,1 ]
and here it is again:
[ 0,0,0,0,0,1,1,1,1,1,rightward ]
Again the reader is respectfully invited. Here is a different sample with much the same behavior:
[ .5,.5,.5,.5,.5,.5,.5,.5,.5,.5 ]
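The samples above can be checked with a short sketch of the one-sided statistic under the uniform null hypothesis. It assumes the usual definitions: for the leftward alternative, the largest amount by which the sample distribution function exceeds the uniform one, and for the rightward alternative, the largest amount by which it falls short. The name smirnovD is hypothetical.

```javascript
// One-sided distance between the sample distribution function and the
// uniform distribution function on the unit interval.
function smirnovD(sample, isRightward) {
  // Sort a copy numerically, so the caller's array is left alone.
  var x = sample.slice().sort(function (a, b) { return a - b; });
  var n = x.length;
  var d = 0;
  for (var i = 1; i <= n; i++) {
    var gap = isRightward ? x[i - 1] - (i - 1) / n : i / n - x[i - 1];
    if (gap > d) d = gap;
  }
  return d;
}

var single = [0.03];
var nine = [0.4, 0.7, 0.3, 0.8, 0.6, 0.1, 0.2, 0.5, 0.9];
var zerosAndOnes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1];
var halves = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5];

console.log(smirnovD(single, false), smirnovD(single, true));
console.log(smirnovD(nine, false), smirnovD(nine, true));
console.log(smirnovD(zerosAndOnes, false), smirnovD(zerosAndOnes, true));
console.log(smirnovD(halves, false), smirnovD(halves, true));
```

On these definitions the last two samples both give a distance of one half in either direction, which is why they behave alike.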
Of course, the uniform null hypothesis is not the only possible null hypothesis. Perhaps a user has in mind the descending exponential distribution with mean equal to one. Then the distribution function F(x) is, in JavaScript, function(x){ return 1-Math.exp(-x); }, so we insert it into the array where we might use a rightward:
[.4,.7,.3,.8,.6,.1,.2,.5,.9,
function(x){ return 1-Math.exp(-x); } ]
(It is permissible to break the array onto two lines, because JavaScript permits this.) This time the p-value rejects, and the approximate p-value almost rejects. The alternative hypothesis is, as before, leftward. To test rightward, we change the function by swapping left and right in the function’s range:
[.4,.7,.3,.8,.6,.1,.2,.5,.9,
function(x){ return Math.exp(-x); } ]
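The swap can be checked numerically. The sketch below assumes that the sample is first transformed by the function and then sorted, and that the leftward and rightward distances are defined in the usual one-sided way. The name distance is hypothetical. It shows that the rightward distance under F equals the leftward distance under 1 − F.

```javascript
// One-sided distance after transforming the sample by a distribution
// function F. The transformed values are sorted, not the raw sample.
function distance(sample, F, isRightward) {
  var u = sample.map(function (x) { return F(x); });
  u.sort(function (a, b) { return a - b; });
  var n = u.length;
  var d = 0;
  for (var i = 1; i <= n; i++) {
    var gap = isRightward ? u[i - 1] - (i - 1) / n : i / n - u[i - 1];
    if (gap > d) d = gap;
  }
  return d;
}

var sample = [0.4, 0.7, 0.3, 0.8, 0.6, 0.1, 0.2, 0.5, 0.9];
var F = function (x) { return 1 - Math.exp(-x); }; // exponential, mean 1
var G = function (x) { return Math.exp(-x); };     // 1 - F(x)

var leftWithF = distance(sample, F, false);
var rightWithF = distance(sample, F, true);
var leftWithG = distance(sample, G, false);

console.log(leftWithF);             // about 0.407
console.log(rightWithF, leftWithG); // equal up to rounding
```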
Maybe we ought to use a mean value of .5 instead of 1:
[.4,.7,.3,.8,.6,.1,.2,.5,.9,
function(x){ return 1-Math.exp(-x/.5); } ]
That seems to fit better.
The number of functions in the array must be zero or one. The position of the function, if any, among the numbers is what you will, beginning or ending or in between. The function, if any, must have exactly one argument. The grammar of JavaScript must be obeyed.
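These rules can be sketched as a small checking routine. Everything in it is hypothetical: the name splitInput, the shape of its result, and the way the word rightward is represented. It is assumed only that rightward must be some predefined value, since otherwise the bare word would not be valid JavaScript.

```javascript
// Hypothetical sentinel standing in for the page's own rightward.
var rightward = { direction: "rightward" };

// Separate an input array into numbers, an optional distribution function,
// and the rightward flag, enforcing the rules stated above.
function splitInput(items) {
  var result = { numbers: [], cdf: null, rightward: false };
  for (var i = 0; i < items.length; i++) {
    var item = items[i];
    if (typeof item === "number") {
      result.numbers.push(item);
    } else if (item === rightward) {
      result.rightward = true;
    } else if (typeof item === "function") {
      if (result.cdf !== null) {
        throw new Error("at most one function is allowed");
      }
      if (item.length !== 1) {
        throw new Error("the function must have exactly one argument");
      }
      result.cdf = item;
    } else {
      throw new Error("unexpected item at position " + i);
    }
  }
  return result;
}

var parsed = splitInput([
  0.4, function (x) { return 1 - Math.exp(-x); }, 0.7, rightward
]);
console.log(parsed.numbers, parsed.rightward, typeof parsed.cdf);
```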
Get p from n and d
Sometimes a sample is large, and somebody has already found the maximum difference, d, between the sample distribution function and the population distribution function. Then the user may wish to put the values of n and d into an array, for example
[ 1000,.04 ]
and click on some appropriate button. The user is respectfully invited to select that array, including its square brackets, with the mouse, to copy to the clipboard, to move up to the upper text area, to click on “Clear” if necessary, to paste into the upper text area, and to click on the “Get p from n and d” button. The user is respectfully reminded that this is a one-sided test. If both sides are being used, then the p-value ought to be multiplied by 2.
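As a rough check on the example, here is a sketch that uses only the asymptotic approximation, assumed to be exp(−2nd²), and then doubles it for a two-sided reading. The names are hypothetical, and the cap at 1 is an addition so that the doubled value remains a probability.

```javascript
// Asymptotic one-sided p-value, assumed to be exp(-2 n d^2).
function approximateP(n, d) {
  return Math.exp(-2 * n * d * d);
}

var oneSided = approximateP(1000, 0.04);   // exp(-3.2), about 0.041
var twoSided = Math.min(1, 2 * oneSided);  // about 0.082

console.log(oneSided, twoSided);
```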
Get d from n and p
[ 20,.05 ]
somewhat as before, clicking on the “Get d from n and p” button. This is, of course, a one-sided confidence, so the confidence boundary is only below the sample distribution function but not above.
Get n from p and d
[ .05,.1 ]
in the usual way, finally clicking on the “Get n from p and d” button. In the answers I have placed quotes around the numbers to signify that they have been rounded up to whole numbers. Remember, please, that this is for a one-sided test.
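Both of these inverse problems have a simple answer if the asymptotic approximation is taken to be p = exp(−2nd²). That is an assumption. The sketch below inverts only that approximation, and the function names are hypothetical. An exact answer would need a numerical search on the exact formula, which is not attempted here.

```javascript
// Solve p = exp(-2 n d^2) for d, given n and p.
function approximateD(n, p) {
  return Math.sqrt(-Math.log(p) / (2 * n));
}

// Solve p = exp(-2 n d^2) for n, given p and d, rounded up to a whole number.
function approximateN(p, d) {
  return Math.ceil(-Math.log(p) / (2 * d * d));
}

var d = approximateD(20, 0.05);   // about 0.274
var n = approximateN(0.05, 0.1);  // 150

console.log(d, n);
```

Rounding n upward keeps the approximate p-value at or below the requested level.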
var x=[]; for(var j=0;j<10;j++)x[j]=j; x;
The user is respectfully invited.
Kendall, Maurice G., and Stuart, Alan (1961). The Advanced Theory of Statistics, Volume 2, Inference and Relationship. Charles Griffin & Company Limited, London.
Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. G. Ist. Ital. Attuari, 4, 83.
Smirnov, N. V. (1939). Sur les écarts de la courbe de distribution empirique. Rec. Math. (Matemat. Sbornik), N.S., 6 (48), 3.
Walsh, John E. (1962). Handbook of Nonparametric Statistics. D. Van Nostrand Company, Inc., Princeton, New Jersey, Toronto, London, New York.
License, revision date, and e-mail address
The formulas of Smirnov are perhaps copyrighted by his publishers. The remainder of this file, including the programs, is in the public domain. The date of this revision is 6 March 2012. Criticism both constructive and destructive comes to me, Harold Kaplan: smtw2gh at gmail dot com