Abstract

The use of linear mixed models (LMMs) in genome-wide association studies (GWAS) is now widely accepted because LMMs have been shown to be capable of correcting for several forms of confounding due to genetic relatedness, such as population structure and familial relatedness, and because recent advances have made them computationally efficient. LMMs tackle confounding by using a matrix of pairwise genetic similarities to model the relatedness among subjects. The consensus until now has been that all available single-nucleotide polymorphisms (SNPs) should be used to determine these similarities. Here, however, we show theoretically and experimentally that carefully selecting a small number of SNPs systematically increases power (that is, it jointly reduces false positives and false negatives), improves calibration (lessens inflation or deflation of the test statistic) and reduces computational cost.
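
To make the notion of a matrix of pairwise genetic similarities concrete, the following is a minimal sketch, not the method of this paper. It assumes genotypes are coded as 0/1/2 allele counts and that similarity is computed as a realized relationship matrix from standardized SNPs. The function name `similarity_matrix`, the parameter `snp_idx` and the arbitrary choice of the first 20 SNPs are hypothetical; the abstract does not state how SNPs are selected.

```python
import numpy as np


def similarity_matrix(genotypes, snp_idx=None):
    """Pairwise genetic similarity from an (n_subjects, n_snps) array.

    Assumption (not from the source): genotypes are 0/1/2 allele counts
    and similarity is the realized relationship matrix Z Z^T / s, where
    Z holds the standardized genotypes of the s SNPs that are used.
    """
    G = np.asarray(genotypes, dtype=float)
    if snp_idx is not None:
        G = G[:, snp_idx]  # restrict to the chosen subset of SNPs

    std = G.std(axis=0)
    keep = std > 0  # monomorphic SNPs carry no similarity information
    if not keep.any():
        raise ValueError("no polymorphic SNPs left to build the matrix")

    # Standardize each remaining SNP to mean 0 and variance 1.
    Z = (G[:, keep] - G[:, keep].mean(axis=0)) / std[keep]
    return Z @ Z.T / Z.shape[1]


# Toy data only: 8 subjects, 200 random SNPs (no real genotypes).
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(8, 200))

K_all = similarity_matrix(genotypes)                         # all SNPs
K_sel = similarity_matrix(genotypes, snp_idx=np.arange(20))  # hypothetical subset
```

Both matrices are 8 x 8 and symmetric. Under this standardization the diagonal averages to one whichever SNPs are used, so the choice of SNPs changes only how similarity is distributed among pairs of subjects.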