Freedman's paradox
In statistical analysis, Freedman's paradox, named after David Freedman, is a problem in model selection whereby predictor variables with no relationship to the dependent variable can pass tests of significance – both individually via a t-test, and jointly via an F-test for the significance of the regression. Freedman demonstrated that this is a common occurrence when the number of variables is similar to the number of data points.
Specifically, if the dependent variable and k regressors are independent normal variables, and there are n observations, then as k and n jointly go to infinity in the ratio k/n = ρ,
- the R² goes to ρ,
- the F-statistic for the overall regression goes to 1.0, and
- the number of spuriously significant regressors goes to αk, where α is the chosen critical probability (the significance level of the individual tests). This third result is intuitive because it says that the expected number of Type I errors equals the probability of a Type I error on an individual parameter times the number of parameters for which significance is tested.
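The three limits can be checked numerically. The following is a minimal simulation sketch, not Freedman's own procedure. It assumes a regression without an intercept, an uncentered R² (1 − RSS/Σy²), and a normal approximation to the t critical value. The function names and the defaults (n = 200, k = 100, α = 0.05, 20 replications) are illustrative choices, not values from the source.

```python
import numpy as np
from statistics import NormalDist


def simulate_null_regression(n, k, alpha, rng):
    """Regress pure noise y on k pure-noise regressors; return (R^2, F, #significant)."""
    X = rng.standard_normal((n, k))   # regressors, independent of y by construction
    y = rng.standard_normal(n)        # dependent variable, unrelated to X

    # Ordinary least squares without an intercept (an assumption of this sketch).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)        # residual sum of squares
    tss = float(y @ y)                # uncentered total sum of squares

    r2 = 1.0 - rss / tss
    df_resid = n - k
    f_stat = ((tss - rss) / k) / (rss / df_resid)

    # t-statistics: coefficient divided by its estimated standard error.
    sigma2 = rss / df_resid
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    t_stats = beta / se

    # Two-sided critical value from the normal approximation to the t distribution
    # (slightly liberal for moderate residual degrees of freedom).
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_sig = int(np.sum(np.abs(t_stats) > crit))
    return r2, f_stat, n_sig


def run_experiment(n=200, k=100, alpha=0.05, reps=20, seed=0):
    """Average the three quantities over several independent replications."""
    rng = np.random.default_rng(seed)
    results = np.array(
        [simulate_null_regression(n, k, alpha, rng) for _ in range(reps)]
    )
    mean_r2, mean_f, mean_sig = results.mean(axis=0)
    return {
        "rho": k / n,
        "mean_r2": float(mean_r2),
        "mean_f": float(mean_f),
        "mean_sig": float(mean_sig),
        "frac_sig": float(mean_sig) / k,
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name}: {value:.3f}")
```

With these settings ρ = 0.5, so the average R² should land near 0.5, the average F-statistic near 1, and the fraction of individually "significant" regressors near α, even though no regressor has any relationship to y. Averaging over replications is only there to reduce sampling noise; a single regression shows the same pattern with more scatter.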