Rfast (version 1.7.3)

Many score based GLM regressions

Description

Score tests for many univariate GLM regressions (Poisson, binary logistic or multinomial), one regression per column of x.

Usage

score.glms(y, x, oiko = NULL, logged = FALSE)
score.multinomregs(y, x, logged = FALSE)

Arguments

y
A vector of discrete (count) data for the Poisson regression or binary data for the binary logistic regression. For the multinomial regression it is a vector with discrete values or a factor. If the vector is binary and the multinomial regression is chosen, the function detects this and switches to the binary logistic regression.
x
A matrix with data, the predictor variables.
oiko
This can be either "poisson" or "binomial". If you are not sure, leave it NULL and the function will determine the type internally.
logged
A boolean variable; if set to TRUE, the logarithm of the p-value is returned.
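
To illustrate why a logged p-value can be useful, the base R sketch below shows a tail probability that underflows to zero on the ordinary scale but stays finite on the log scale. The statistic value of 40 and the standard normal reference distribution are assumptions chosen purely for illustration; they are not taken from Rfast.

```r
## Hypothetical, very large test statistic (illustration only)
stat <- 40
## Upper tail probability on the ordinary scale: underflows in double precision
p_plain <- pnorm(stat, lower.tail = FALSE)
## The same probability on the log scale remains finite
p_log <- pnorm(stat, lower.tail = FALSE, log.p = TRUE)
c(p_plain = p_plain, p_log = p_log)
```

Working on the log scale keeps extremely small p-values distinguishable, which matters when ranking many predictors by significance.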

Value

A matrix with two columns, the test statistic and its associated p-value. For the Poisson and logistic regression the p-value is derived via the t distribution, whereas for the multinomial regression via the $\chi^2$ distribution.

Details

Instead of maximising the log-likelihood via the Newton-Raphson algorithm in order to test the hypothesis that $\beta_i=0$, we use the score test. This is dramatically faster, as no model needs to be fitted. The first derivative of the log-likelihood is known in closed form, and under the null hypothesis the fitted values are all equal to the mean of the response variable y. The test is not the same as the likelihood ratio test. It nonetheless attains the correct size, but it is slightly less efficient and less powerful. For large sample sizes (5000 or more), though, the results are the same. It is also much faster than the classical likelihood ratio test.
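
As a rough illustration of the idea, the sketch below computes a score statistic for many univariate logistic regressions in base R. For predictor $x_i$, the score at $\beta_i=0$ is $U_i=\sum_j x_{ji}(y_j-\bar{y})$ and its null variance is $\bar{y}(1-\bar{y})\sum_j (x_{ji}-\bar{x}_i)^2$. The helper name score_logistic_sketch is hypothetical, and the standard normal reference distribution is an assumption made for simplicity. This is not the Rfast implementation, which may differ in scaling and in the reference distribution.

```r
## Hypothetical helper (not part of Rfast): score test of beta_i = 0 for
## each column of x in a univariate logistic regression with an intercept.
score_logistic_sketch <- function(y, x, logged = FALSE) {
  m <- mean(y)                       # fitted value under the null hypothesis
  xc <- sweep(x, 2, colMeans(x))     # centre every predictor
  u <- colSums(xc * (y - m))         # score for each slope at beta_i = 0
  v <- m * (1 - m) * colSums(xc^2)   # null variance of each score
  stat <- u / sqrt(v)
  ## Assumption: two-sided p-value from the standard normal distribution
  logp <- log(2) + pnorm(abs(stat), lower.tail = FALSE, log.p = TRUE)
  pvalue <- if (logged) logp else exp(logp)
  cbind(stat = stat, pvalue = pvalue)
}

set.seed(1)
x <- matrix(rnorm(100 * 5), ncol = 5)
y <- rbinom(100, 1, 0.6)
res <- score_logistic_sketch(y, x)

## Cross-check the first predictor against the Rao score test from stats::glm
rao <- anova(glm(y ~ x[, 1], family = binomial), test = "Rao")$Rao[2]
all.equal(unname(res[1, "stat"])^2, rao, tolerance = 1e-6)
```

Only column sums and means are needed, so no iterative fitting is performed for any of the predictors; the single glm call is there purely as a check.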

References

Draper, N.R. and Smith H. (1998). Applied regression analysis. New York, Wiley, 3rd edition.

McCullagh P. and Nelder J.A. (1989). Generalized linear models. CRC press, USA, 2nd edition.

Agresti Alan (1996). An introduction to categorical data analysis. New York: Wiley.

See Also

univglms, logistic_only, poisson_only, regression

Examples

library(Rfast)
## 200 variables, hence 200 univariate regressions are to be fitted
x <- matrix( rnorm(100 * 200), ncol = 200 )
y <- rbinom(100, 1, 0.6)   ## binary logistic regression
system.time( univglms(y, x) )
a1 <- univglms(y, x) 
system.time( score.glms(y, x) )
a2 <- score.glms(y, x)
cor(a1, a2)
mean(a1 - a2)

#x <- matrix( rnorm(1000 * 2000), ncol = 2000 )
#y <- rbinom(1000, 1, 0.6)   ## binary logistic regression
#a1 <- univglms(y, x) 
#a2 <- score.glms(y, x)
#cor(a1, a2)
#mean(a1 - a2)

## x <- matrix( rnorm(500 * 2000), ncol = 2000 )
## y <- rbinom(500, 3, 0.5)   
## a <- score.multinomregs(y, x)
## hist(a[, 2])
## sum(a[, 2] < 0.05) / 2000  ## estimated type I error (uses the p-value column only)
