New Tables for the p-Product Meta-Analytic Method

Richard B. Darlington
Cornell University

The p-product method works with the significance level p computed for each of k mutually independent experiments. The meta-analyst ranks the p's, selects some positive integer s typically much smaller than k, and computes R, the product of the s smallest values of p. R is used to test the composite null hypothesis that all k individual null hypotheses are true. The critical value of R depends on k, s, and alpha. Previous work offered tables of critical values of R for every value of k from 2 to 30, for values of s from 2 to 5, and for 53 values of alpha ranging from .1 down to .00001.
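As an illustration of the basic test, here is a short sketch in Python (not the language or program used for the original computations). The p values and the critical value shown are placeholders, not entries from the tables.

    import numpy as np

    def p_product_R(p_values, s):
        # Product of the s smallest of the k independent p values.
        p_sorted = np.sort(np.asarray(p_values, dtype=float))
        return float(np.prod(p_sorted[:s]))

    # Hypothetical example: k = 12 p values from 12 independent experiments, s = 3.
    p_values = [.004, .03, .11, .20, .25, .40, .45, .50, .60, .70, .80, .90]
    R = p_product_R(p_values, s=3)
    # Reject the composite null hypothesis if R is at or below the critical
    # value tabled for this k, s, and alpha; the value below is a placeholder.
    critical_R = 1.0e-5
    print(R, R <= critical_R)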

The tables presented here are designed to supplement, not replace, those previous tables. The new tables cover much larger values of s and k, but fewer values of alpha. Altogether these new tables contain 7 x 19 x 970 = 129,010 critical values of R, one for each combination of 7 values of alpha, 19 values of s, and the 970 values of k from 31 to 1000. The 7 values of alpha are .1, .05, .025, .01, .005, .0025, and .001. Values of s are the 19 integers from 2 to 20. No special tables are needed for s = 1, since the Bonferroni method applies to that case.

The same 7 x 19 x 970 values are presented twice. If you think of these values as forming a block of size 7 x 19 x 970, then we present both the 19 slices of the block in which s is held constant, and the 970 slices of the block in which k is held constant. This was done because for file-drawer analyses it is most convenient to have a table with many values of k and a single value of s, while for non-file-drawer analyses it is most convenient to have a table with many values of s and a single value of k.

Tables for single values of k
These 970 tables are grouped into 10 files of about 100 tables each. Each table is 19 x 7, and each file is about 160K bytes.

k from 31 to 100
k from 101 to 200
k from 201 to 300
k from 301 to 400
k from 401 to 500
k from 501 to 600
k from 601 to 700
k from 701 to 800
k from 801 to 900
k from 901 to 1000

Tables for single values of s
Each table is 970 x 7, and each file is about 80K bytes.

s = 2
s = 3
s = 4
s = 5
s = 6
s = 7
s = 8
s = 9
s = 10
s = 11
s = 12
s = 13
s = 14
s = 15
s = 16
s = 17
s = 18
s = 19
s = 20

How these values were computed

Critical values of R were computed by simulation and then smoothed. First, each critical value of R was estimated from 50,000 or more artificial values of R. (See below for precise numbers.) For instance, for k = 800 and s = 20, each trial consisted of generating 800 artificial values of p, which were simply uniform random numbers from 0 to 1. These 800 values were sorted from low to high, and the first 20 values were then multiplied to find a value of R. This was repeated 50,000 times. The same random numbers, and indeed the same sort, were used to compute 19 different values of R: the product of the first 2 sorted p's was found, then multiplied by the third p to get the product of the first 3 p's, and so on up to the product of the first 20. But new random numbers were generated for each new value of k.
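A small Python sketch of this simulation step (again, not the program actually used) might look like the following, with k = 800 and 50,000 trials as in the example above:

    import numpy as np

    rng = np.random.default_rng(seed=0)
    k = 800                               # number of experiments in this example
    n_trials = 50_000                     # repetitions used for k = 800
    s_values = range(2, 21)               # s = 2, 3, ..., 20

    # R_sims[s] collects the n_trials simulated values of R for that s.
    R_sims = {s: np.empty(n_trials) for s in s_values}
    for t in range(n_trials):
        p = np.sort(rng.uniform(size=k))  # k artificial p values, sorted low to high
        cum = np.cumprod(p[:20])          # cum[i] = product of the first i + 1 sorted p's
        for s in s_values:
            R_sims[s][t] = cum[s - 1]     # product of the s smallest p's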

Once 50,000 values of R had been computed for, say, s = 20 and k = 800, they were sorted from low to high. Then the 50th value of R was taken as a preliminary estimate of the critical value for alpha = .001 (since 50 = .001 x 50,000), the 125th was taken for alpha = .0025 (since 125 = .0025 x 50,000), etc.
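Continuing the sketch above, the preliminary critical values for one value of s would then be read off as order statistics of the sorted simulated R's:

    alphas = [.1, .05, .025, .01, .005, .0025, .001]

    # Preliminary critical values for s = 20, using R_sims and n_trials from the sketch above.
    R_sorted = np.sort(R_sims[20])
    prelim_crit = {a: R_sorted[int(round(a * n_trials)) - 1] for a in alphas}
    # alpha = .001 -> the 50th smallest R, alpha = .0025 -> the 125th, and so on.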

Although 50,000 is fairly large, counts like 50 and 125 are small enough that these preliminary estimates contain more sampling error than one would like, particularly for the lowest values of alpha such as .001 and .0025. The great majority of this random error was eliminated by the next step: smoothing the critical values of R across the 970 different values of k for a given combination of s and alpha. As mentioned above, different random numbers had been used for each value of k, so the random errors in these 970 critical values were mutually independent. I found that when ln(critical value of R) was plotted against ln(k), the plot was virtually a straight line except for the random fluctuations.

I didn't assume the true curve was exactly a straight line, but a cubic polynomial in ln(k) appeared to approximate it very well, since the residuals from such a polynomial showed no visible trend. So for each of the 7 x 19 combinations of s and alpha, I fitted a cubic polynomial predicting the 970 critical values of ln(R) from ln(k). The original values of ln(R) were then replaced by the fitted values, and the fitted values were exponentiated to get back to values of R. These are the values appearing in the tables here. Thus each entry in the tables is a point on a cubic curve, exponentiated.
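A minimal sketch of this smoothing step, assuming the 970 preliminary critical values for one combination of s and alpha are already in hand (numpy's polynomial fit is used here purely for illustration):

    import numpy as np

    def smooth_critical_values(k_vals, prelim_R):
        # Fit ln(R) as a cubic polynomial in ln(k), then return the fitted
        # values exponentiated. prelim_R holds the unsmoothed critical values
        # for one combination of s and alpha, in the same order as k_vals.
        x = np.log(k_vals)
        y = np.log(prelim_R)
        coefs = np.polyfit(x, y, deg=3)       # cubic in ln(k)
        return np.exp(np.polyval(coefs, x))   # back from ln(R) to R

    # Usage: k_vals = np.arange(31, 1001); smoothed = smooth_critical_values(k_vals, prelim_R)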

Successive values of k become closer together on a log scale as k increases, so the smoothing process seems more effective for higher values of k. For instance, one is more comfortable interpolating between R values for k = 830 and k = 832, to estimate the R for k = 831, than when the values of k are 30, 31, and 32. Therefore it didn't seem necessary to use as many repetitions for high values of k as for lower values. The number of repetitions used for each value of k was 355K up to k = 70, then 305K up to k = 80, then 240K up to k = 100, then 175K up to k = 162, then 110K up to k = 260, then 60K up to k = 550, then 50K up to k = 1000. Altogether, each cubic curve was based on 83.58 million independently-generated artificial values of R. This smoothing was not used in the previous tables for which k < 31. But in those tables, 2 million values of R were used for each value of k, instead of the 50K to 355K used here.
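The quoted total can be checked directly from this schedule. The range boundaries below are as read from the text, with 31 taken as the smallest k covered by these tables:

    # (lowest k, highest k, repetitions per value of k)
    schedule = [(31, 70, 355_000), (71, 80, 305_000), (81, 100, 240_000),
                (101, 162, 175_000), (163, 260, 110_000),
                (261, 550, 60_000), (551, 1000, 50_000)]
    total = sum((hi - lo + 1) * reps for lo, hi, reps in schedule)
    print(total)   # 83,580,000, i.e. the 83.58 million values quoted above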