Estimating false discovery rates for contingency tables

MSR-TR-2009-53

When testing a large number of hypotheses, it can be helpful to estimate or control the false discovery rate (FDR), the expected proportion of tests called significant that are truly null. The FDR is closely linked to the probability that a truly null test is called significant, and thus a number of methods have been described that estimate or control the FDR directly from the p-values of the hypothesis tests. Most of these methods assume that the p-values are uniformly and continuously distributed under the null hypothesis, an assumption that often does not hold for finite data. In this paper, we consider the estimation of the FDR for contingency tables. We show how Fisher’s exact test can be extended to efficiently calculate the exact null distribution over a set of contingency tables. Using this exact null distribution, we explore the estimation of each term of the FDR estimator, characterize the asymptotic convergence of the estimator, and show how the conservative bias can be reduced by removing certain tests from consideration. The resulting estimator has substantially less conservative bias than traditional approaches.
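To illustrate why the uniform-p-value assumption fails for small contingency tables, here is a minimal Python sketch for 2x2 tables with fixed margins. It is not the algorithm from the report: the function names (`table_probs`, `fisher_p`, `null_prob_significant`, `estimate_fdr`) are hypothetical, the two-sided p-value uses the common "tables no more likely than the observed one" convention, and the proportion of true nulls `pi0` is simply assumed to be 1, which is the conservative choice.

```python
from math import comb

def table_probs(row1, row2, col1):
    """Hypergeometric probability of each feasible top-left cell count,
    given fixed row totals (row1, row2) and first column total col1."""
    lo, hi = max(0, col1 - row2), min(row1, col1)
    denom = comb(row1 + row2, col1)
    return {a: comb(row1, a) * comb(row2, col1 - a) / denom
            for a in range(lo, hi + 1)}

def fisher_p(a, row1, row2, col1):
    """Two-sided Fisher p-value: total probability of all tables that are
    no more likely than the observed one (assumed convention)."""
    probs = table_probs(row1, row2, col1)
    cutoff = probs[a] * (1 + 1e-12)  # small tolerance for float ties
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))

def null_prob_significant(row1, row2, col1, alpha):
    """Exact null probability that a table with these margins yields
    a p-value <= alpha. For discrete tests this is typically below alpha."""
    probs = table_probs(row1, row2, col1)
    return sum(pr for a, pr in probs.items()
               if fisher_p(a, row1, row2, col1) <= alpha)

def estimate_fdr(tables, alpha, pi0=1.0):
    """Plug-in FDR estimate for tables given as (a, row1, row2, col1):
    expected number of null tests called significant, divided by the
    number actually called significant. pi0 = 1 is an assumption."""
    expected_false = pi0 * sum(null_prob_significant(r1, r2, c1, alpha)
                               for _, r1, r2, c1 in tables)
    called = sum(1 for a, r1, r2, c1 in tables
                 if fisher_p(a, r1, r2, c1) <= alpha)
    return min(1.0, expected_false / max(called, 1))
```

For a table with margins `row1 = row2 = col1 = 5`, `null_prob_significant(5, 5, 5, 0.05)` is 2/252 (about 0.008), far below the nominal 0.05. An estimator that substitutes `alpha` for this probability therefore overstates the expected number of false discoveries, which is the conservative bias the abstract refers to.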