This is a one-sample inference. The corresponding two-sample inference is at randomBeta2.htm.
In the following sections are examples showing use of the programs. The user is respectfully invited to try out the examples or to use any others. The only thing to remember is: follow the grammatical rules of JavaScript. (This is because the “eval” method of JavaScript is used in picking up the data from the textarea.) In particular, remember that starting an integer with a zero may force the use of base 8.
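The base-8 pitfall can be seen in a small sketch. It assumes that the input is evaluated as non-strict JavaScript, as the use of “eval” suggests. The sketch uses `new Function` only so that it behaves the same way wherever it is run; that choice is not taken from the page's program.

```javascript
// Sketch of the leading-zero pitfall. Assumes non-strict evaluation;
// a body compiled by `new Function` is non-strict unless it says otherwise.
var withLeadingZero = new Function("return [ 010, 7 ];")();
var withoutLeadingZero = new Function("return [ 10, 7 ];")();

console.log(withLeadingZero[0]);    // 8, because 010 is read in base 8
console.log(withoutLeadingZero[0]); // 10
```

So a count of ten should be typed as 10, never as 010.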
Google Chrome and Safari are much speedier than Mozilla Firefox. However, in Mozilla Firefox the canvas is an image, so the canvas can be saved and viewed, and when viewed the canvas can be copied to the clipboard for later pasting into word processors and spread-sheets. Google Chrome and Safari do not treat the canvas as an image.
Random beta
Let us consider the following two JavaScript statements:
died= [ 6,4,3 ]; censored=[ 7,7,3 ];
As this page opens, there are three “textareas” visible. I will call them the top, middle, and bottom textareas. The top is for input. The middle and bottom are for output. Also, there will be a “canvas” output at the beginning of the file, but it is not visible yet. I respectfully invite the reader to use the mouse to select the two JavaScript statements and to “copy” them to the clipboard, then to move up to the top textarea, to click the “clear” button if necessary, to paste into that textarea, and finally to click on the “Random beta” button. After maybe five seconds, the canvas will open at the beginning of the file. The graph shows three descending curves. The curve in the middle, which I have colored blue, is the survival curve of Kaplan and Meier (1958). The top black horizontal line is at 1.0, and the bottom black horizontal line is at zero, so the middle thick black horizontal line is at 0.5, half-way between.
The top and bottom curves, which I have colored red, are the top and bottom edges of a “Bayes credible band” containing the Kaplan-Meier survival curve. I have built it to have 95% credibility. Readers unfamiliar with Bayesian inference are invited to look at Lehmann and Romano (2008). A later section of the present page, Modifiers, will show how to change the percent of credibility and the colors and thicknesses of the line segments.
It may be that some readers would rather have a numerical table instead of a graph. The bottom textarea is a table with tab characters separating the number fields. It is meant to be put into a spread-sheet. The way of doing this is first to move the mouse into the textarea, then click with the right-hand button of the mouse, then use the left-hand button of the mouse to “select all,” then “copy.” Then open one’s favorite spread-sheet. Then click the mouse on the cell at “A1.” Then paste. Most spread-sheets will then accept the table. A few will instead open a “wizard.” Then just please make sure the wizard’s circle or square for separating with a tab character is checked, and proceed.
The columns of the table from left to right are deaths, censorings, row number, lower edge of the band, Kaplan-Meier survival, and upper edge of the band. Row number increases as time increases. The reader will notice that there are four rows, not the three that would be expected. The fourth row contains the “fiction” data. They will be explained later in this section.
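For readers who want to see how a Kaplan-Meier column can be computed from the two arrays, here is a minimal sketch. It is not the page's program. It assumes that no “fiction” row is added and that, within each time value, the deaths come before the censorings; the function name `kaplanMeier` is hypothetical. Because the fiction data are left out, its numbers need not match the table in the bottom textarea.

```javascript
// Sketch: Kaplan-Meier survival from per-time counts.
// Assumptions (not taken from the page's source code): no "fiction" row,
// and within each time value the deaths come before the censorings.
function kaplanMeier(died, censored) {
  if (died.length !== censored.length) {
    throw new Error("died and censored must have the same length");
  }
  // Everybody is at risk before the first time value.
  var atRisk = 0;
  for (var i = 0; i < died.length; i++) atRisk += died[i] + censored[i];

  var survival = 1;
  var curve = [];
  for (var j = 0; j < died.length; j++) {
    // Multiply by the fraction of those at risk who survive this time value.
    if (atRisk > 0) survival *= 1 - died[j] / atRisk;
    curve.push(survival);
    // Deaths and censorings both leave the risk set.
    atRisk -= died[j] + censored[j];
  }
  return curve;
}

var curve = kaplanMeier([ 6,4,3 ], [ 7,7,3 ]);
console.log(curve.join("\t")); // roughly 0.8, 0.612, 0.306
```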
Most spread-sheet programs can make “charts.” These are like graphs. To make a chart, just select columns C through F, or columns D through F, depending on the kind of chart, and proceed. Readers who do not know what I am talking about are respectfully asked to consult friends. No two spread-sheet programs have exactly the same kinds of charts.
Let me return to the graph in the canvas at the top of the present file. The best browser to work on it is Mozilla Firefox. I invite the user who has that browser to click the canvas with the mouse and then click the mouse with the right-hand button. It will then be possible to click the left-hand button for “View Image” or “Save Image As.” I recommend the former. The graph will then be viewed all by itself, without the rest of the page. Also, the graph can then be copied to the clipboard: again click the right-hand button of the mouse and then click the left button for “Copy Image.” Then the graph can be pasted from the clipboard into a spread-sheet or into a word processor. Once the graph has been so pasted, its width and height can be changed. To return from the viewing, just use the keyboard’s “Backspace” key.
Now let us return to the topic of “fiction” data. They have been put in the arrays to reduce the bias and improve the stability of the Kaplan-Meier estimator, especially in small samples. Here is an innocent-looking data-set:
died= [ 1 ]; censored=[ 0 ];
died= [ 1 ]; censored=[ 0 ]; fiction=0;
This section does not explain anything about the middle textarea, but the Modifiers section does.
Two more examples
The previous example had only three time values. Here is an example with thirty. Again I respectfully invite the reader to copy and paste it, and this time it really will be necessary to click on the “clear” button before pasting:
died= [ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ];
The next example has 300 time values, so it looks sparse:
died= [ 0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; censored=[ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ];
This may be a good time to point out that each example had the same number of time values for its deaths as for its censorings. The first had 3 and 3, the next had 30 and 30, and the last had 300 and 300. This is required.
Modifiers
Now it is time to talk about the middle textarea. Let us again consider the thirty-time-value sample:
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ];
half width is 0.28583173176835713 and the time was 5.577 seconds
many=100000; oneOverAlpha=20; died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 ]; maxSeconds=30; separator="\t"; colors=[ "red","blue","red","black" ]; thicknesses=[ 5,5,5,1 ]; censoredBeforeDied="no"; swap="no"; fiction=1;
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ];
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ]; colors=[ "mediumblue","crimson","mediumblue","darkgray" ];
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ]; colors=[ "mediumblue","crimson","mediumblue","darkgray" ]; separator=",";
Now another thing: the program works by Monte Carlo, using a great number of times around the big loop. The value of “many” is one more than that number of times. To get more precision, one might use ten times the default value:
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ]; colors=[ "mediumblue","crimson","mediumblue","darkgray" ]; separator=","; many=1e6;
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ]; colors=[ "mediumblue","crimson","mediumblue","darkgray" ]; separator=","; many=1e6; maxSeconds=60;
So far the percentage of credibility has been 95. Perhaps somebody needs 99 percent. The way to do that is to change “oneOverAlpha” to 100. (I remark that “oneOverAlpha” must be a factor of “many.” This is enforced.)
died=[ 0,1,0,1,1,0,2,0,0,3,0,1,2,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0 ]; censored=[ 0,1,1,3,1,1,3,2,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]; thicknesses=[ 15,15,15,3 ]; colors=[ "mediumblue","crimson","mediumblue","darkgray" ]; separator=","; many=1e6; maxSeconds=60; oneOverAlpha=100;
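The relation between “oneOverAlpha” and the percent of credibility can be written down in a few lines. This is only a sketch: the function name `credibilityPercent` is hypothetical, and the factor check merely imitates the rule stated above, not the page's actual code.

```javascript
// Sketch: percent of credibility implied by "oneOverAlpha".
// The function name is hypothetical; it is not part of the page's program.
function credibilityPercent(many, oneOverAlpha) {
  if (many % oneOverAlpha !== 0) {
    throw new Error("oneOverAlpha must be a factor of many");
  }
  // alpha = 1/oneOverAlpha, and the credibility is 100*(1 - alpha) percent.
  return 100 - 100 / oneOverAlpha;
}

console.log(credibilityPercent(100000, 20)); // 95
console.log(credibilityPercent(1e6, 100));   // 99
```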
For the next two modifiers, let us change back to the original three-time-value sample:
died=[ 6,4,3 ]; censored=[ 7,7,3 ];
died=[ 6,4,3 ]; censored=[ 7,7,3 ]; censoredBeforeDied="yes";
died=[ 6,4,3 ]; censored=[ 7,7,3 ]; censoredBeforeDied="yes"; swap="yes";
I point out to the reader that these JavaScript statements may be in any order, not just the one I used. Since there may be as many as 11 statements, there may be as many as 11! = 39916800 different orders. Also, one may use blank lines between the statements, if desired.
The idea of the inference
Let there be four times; call them 1, 2, 3, and 4. Suppose that deaths can occur only at 1 and 3, and that censorings can occur only at 2 and 4.
Let the numbers of deaths or censorings for those four times be x, y, z, and w. Let the chances of deaths or censorings for those four times be p, q, r, and s. Write n for the total of all deaths and censorings. The probability of x deaths at time 1 is $p^{x}(1-p)^{n-x}$, except that I have not put in the normalizing factorials. Those factorials depend only on x, y, z, and w, but not on p, q, r, or s. The conditional probability of y censorings at time 2 given that x deaths occurred at time 1 is $q^{y}(1-q)^{n-x-y}$. Hence the joint probability that x deaths occurred at 1 and y censorings occurred at 2 is the product $p^{x}(1-p)^{n-x}\,q^{y}(1-q)^{n-x-y}$. Similarly we get the conditional probability $r^{z}(1-r)^{n-x-y-z}$ and the joint probability $p^{x}(1-p)^{n-x}\,q^{y}(1-q)^{n-x-y}\,r^{z}(1-r)^{n-x-y-z}$ and the conditional probability $s^{w}(1-s)^{n-x-y-z-w}$ and the joint probability $p^{x}(1-p)^{n-x}\,q^{y}(1-q)^{n-x-y}\,r^{z}(1-r)^{n-x-y-z}\,s^{w}(1-s)^{n-x-y-z-w}$.
This is the likelihood. Since we are doing a Bayes inference, we say that the exponents are constants, but that the chances p, q, r, and s are variables. We shall need a priori probabilities for those variables. Let us use the “improper” formulas $p^{-1}(1-p)^{-1}$, $q^{-1}(1-q)^{-1}$, $r^{-1}(1-r)^{-1}$, and $s^{-1}(1-s)^{-1}$. Multiplying them upon the likelihood, we get $p^{x}(1-p)^{n-x}\,p^{-1}(1-p)^{-1}$ times $q^{y}(1-q)^{n-x-y}\,q^{-1}(1-q)^{-1}$ times $r^{z}(1-r)^{n-x-y-z}\,r^{-1}(1-r)^{-1}$ times $s^{w}(1-s)^{n-x-y-z-w}\,s^{-1}(1-s)^{-1}$. That is, the a posteriori probability is a product of four factors, each in a different variable. Let us “integrate out” the factor in q. Similarly, let us integrate out the factor in s. The results of the integrals are constants, so there is no need to write them. The marginal a posteriori probability is now $p^{x}(1-p)^{n-x}\,p^{-1}(1-p)^{-1}$ times $r^{z}(1-r)^{n-x-y-z}\,r^{-1}(1-r)^{-1}$. The factor in p is, except for its normalizing factor, a “beta” density in p. Similarly, the factor in r is, except for its normalizing factor, a beta density in r. The program is a Monte Carlo, and each time around the big loop it must take a random point in ( p,r ) space. What is needed is a way of taking a random number from a beta distribution. It is known that the order statistics of a sorted sample from the uniform distribution between zero and one have beta distributions. The program makes use of that fact.
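The order-statistics fact can be sketched in a few lines of JavaScript. If a and b are positive integers, the a-th smallest of a+b-1 independent uniform numbers has the Beta(a, b) distribution. The factor in p above simplifies to $p^{x-1}(1-p)^{n-x-1}$, which is the Beta(x, n-x) density without its normalizing factor. The function name `randomBeta` and its optional `rng` argument are assumptions of this sketch, and the page's program may organize the work differently.

```javascript
// Sketch: draw from Beta(a, b), for positive integers a and b, as the a-th
// smallest of a+b-1 independent uniform numbers between zero and one.
// The name "randomBeta" and the optional "rng" argument are hypothetical.
function randomBeta(a, b, rng) {
  rng = rng || Math.random;
  var sample = [];
  for (var i = 0; i < a + b - 1; i++) sample.push(rng());
  // Numeric sort; the default sort would compare the numbers as strings.
  sample.sort(function (u, v) { return u - v; });
  return sample[a - 1];
}

// Example: x=6 deaths out of n=30, so the factor in p is Beta(6, 24).
var p = randomBeta(6, 24);
console.log(p);
```

The sketch needs both a and b to be at least 1, so it does not cover a time value with no deaths, where the improper prior leaves the factor improper.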
Let the random numbers so made be called p and r. Let km( 1 ) and km( 3 ) be the Kaplan-Meier values at time 1 and time 3. Let us define a function “f” of two variables p and r by $f(p,r)=\max\bigl(\,|\mathrm{km}(1)-(1-p)|,\ |\mathrm{km}(3)-(1-p)(1-r)|\,\bigr)$. Here I have used the vertical line segments for absolute value.
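The function f translates directly into JavaScript. In this sketch the Kaplan-Meier values are passed in as the arguments `km1` and `km3`, which is a choice made here for illustration and not necessarily how the page's program stores them.

```javascript
// Sketch of f(p, r) for the two death times 1 and 3.
// km1 and km3 stand for km(1) and km(3); passing them as arguments is an
// assumption of this sketch.
function f(p, r, km1, km3) {
  return Math.max(
    Math.abs(km1 - (1 - p)),            // distance at time 1
    Math.abs(km3 - (1 - p) * (1 - r))   // distance at time 3
  );
}

// With p=0.25, r=0.5, km(1)=0.8, km(3)=0.3 the two distances are about
// 0.05 and 0.075, so f is about 0.075.
console.log(f(0.25, 0.5, 0.8, 0.3));
```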
Then the program goes around its big loop “many”-1 times, making random numbers p and r, and calculating f( p,r ), and putting the values into an array. When the big loop is all finished, the array is sorted. Then the “half width” is found in the array by using “many” and “oneOverAlpha” to calculate the subscript.
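The last step, finding the half width in the sorted array, might look like the sketch below. The exact subscript used by the program is not stated above, so the choice here is an assumption: with “many”-1 sorted values, the k-th smallest estimates the k/“many” quantile, and k is taken to be “many” minus “many”/“oneOverAlpha”. This would also explain why “oneOverAlpha” must be a factor of “many.” The function name `halfWidth` is hypothetical.

```javascript
// Sketch: pick the half width from the Monte Carlo values of f.
// The subscript is an assumption of this sketch, not taken from the program:
// with many-1 sorted values, the k-th smallest estimates the k/many quantile.
function halfWidth(values, many, oneOverAlpha) {
  if (values.length !== many - 1) {
    throw new Error("expected many-1 values");
  }
  if (many % oneOverAlpha !== 0) {
    throw new Error("oneOverAlpha must be a factor of many");
  }
  var sorted = values.slice().sort(function (u, v) { return u - v; });
  var k = many - many / oneOverAlpha; // 1-based rank of the wanted value
  return sorted[k - 1];
}

// Example with many=20 and oneOverAlpha=20: 19 values, rank 19 (the largest).
var demo = [];
for (var i = 1; i <= 19; i++) demo.push(i / 20);
console.log(halfWidth(demo, 20, 20)); // 0.95
```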
This section’s description simplifies the program and leaves out some details. I hope that the idea of the inference is conveyed.
After I wrote the above, I found out about Lo (1993). My ideas are in his paper, but the notation is different. His ideas are improvements on and extensions of Rubin (1981).
Run a JavaScript program
While building this page I needed a way to run little JavaScript programs, so I constructed the “Run a JavaScript program” button. When I was done I left the button so users could practice JavaScript programming with it. If a program is in the top textarea, the button will run it. Here is a trivial example:
var x=[]; for(var j=0;j<10;j++)x[j]=j; x;
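The trivial example ends with the bare statement `x;` because “eval” returns the value of the last statement it runs. The sketch below shows that behavior; whether the button displays the result in exactly this way is an assumption.

```javascript
// Sketch: "eval" returns the value of the last statement of the program text.
// How the page's button shows the result is not stated; this only shows
// what "eval" itself gives back.
var program = "var x=[]; for(var j=0;j<10;j++)x[j]=j; x;";
var result = eval(program);
console.log(result.join(",")); // 0,1,2,3,4,5,6,7,8,9
```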
Bibliography
Kaplan, E. L., and Meier, Paul, Nonparametric Estimation from Incomplete Observations, Journal of the American Statistical Association, Volume 53, Number 282 (June 1958), pages 457-481. Stable URL: http://www.jstor.org/stable/2281868
Lehmann, E. L., and Romano, Joseph P., Testing Statistical Hypotheses, Third Edition, Springer, 2005, fourth printing, 2008. See especially Section 5.7, Bayesian Confidence Sets, which is on pages 171-175.
Lo, Albert Y., A Bayesian Bootstrap for Censored Data, Annals of Statistics, Volume 21, Number 1 (1993), pages 100-123. This is also available in the Web at http://projecteuclid.org/euclid.aos/1176349017
Rubin, Donald B., The Bayesian Bootstrap, Annals of Statistics, Volume 9, Number 1 (1981), pages 130-134. This is also available in the Web at http://projecteuclid.org/euclid.aos/1176345338
License, revision date, and e-mail address
All of this file is in the public domain. The date of this revision is 3 March 2012. Criticism both constructive and destructive comes to me, Harold Kaplan,
at dot smtw2gh gmail com