Learn R Programming

logistf (version 1.20)

logistf: Bias-reduced logistic regression

Description

Implements Firth's penalized-likelihood logistic regression

Usage

logistf(formula = attr(data, "formula"), data = sys.parent(), pl = TRUE, 
   alpha = 0.05, control, plcontrol, firth = TRUE, init, weights, 
   plconf = NULL, dataout = TRUE, ...)

Arguments

formula
a formula object, with the response on the left of the operator, and the model terms on the right. The response must be a vector with 0 and 1 or FALSE and TRUE for the outcome, where the higher value (1 or TRUE) is modeled. It is possible
data
a data.frame where the variables named in the formula can be found, i. e. the variables containing the binary response and the covariates.
pl
specifies if confidence intervals and tests should be based on the profile penalized log likelihood (pl=TRUE, the default) or on the Wald method (pl=FALSE).
alpha
the significance level (1-$\alpha$ the confidence level, 0.05 as default).
control
Controls Newton-Raphson iteration. Default is control= logistf.control(maxstep, maxit, maxhs, lconv, gconv, xconv)
plcontrol
Controls Newton-Raphson iteration for the estimation of the profile likelihood confidence intervals. Default is plcontrol= logistpl.control(maxstep, maxit, maxhs, lconv, xconv, ortho, pr
firth
use of Firth's penalized maximum likelihood (firth=TRUE, default) or the standard maximum likelihood method (firth=FALSE) for the logistic regression. Note that by specifying pl=TRUE and firth=FALSE
init
specifies the initial values of the coefficients for the fitting algorithm.
weights
specifies case weights. Each line of the input data set is multiplied by the corresponding element of weights.
plconf
specifies the variables (as vector of their indices) for which profile likelihood confidence intervals should be computed. Default is to compute for all variables.
dataout
If TRUE, copies the data set to the output object.
...
Further arguments to be passed to logistf.

Value

  • The object returned is of the class logistf and has the following attributes:
  • coefficientsthe coefficients of the parameter in the fitted model.
  • alphathe significance level (1- the confidence level) as specified in the input.
  • termsthe column names of the design matrix
  • varthe variance-covariance-matrix of the parameters.
  • dfthe number of degrees of freedom in the model.
  • loglika vector of the (penalized) log-likelihood of the full and the restricted models.
  • iterthe number of iterations needed in the fitting process.
  • nthe number of observations.
  • ythe response-vector, i. e. 1 for successes (events) and 0 for failures.
  • formulathe formula object.
  • callthe call object.
  • termsthe model terms (column names of design matrix).
  • linear.predictorsa vector with the linear predictor of each observation.
  • predicta vector with the predicted probability of each observation.
  • hat.diaga vector with the diagonal elements of the Hat Matrix.
  • convthe convergence status at last iteration: a vector of length 3 with elements: last change in log likelihood, max(abs(score vector)), max change in beta at last iteration.
  • methoddepending on the fitting method `Penalized ML' or `Standard ML'.
  • method.cithe method in calculating the confidence intervals, i.e. `profile likelihood' or `Wald', depending on the argument pl.
  • ci.lowerthe lower confidence limits of the parameter.
  • ci.upperthe upper confidence limits of the parameter.
  • probthe p-values of the specific parameters.
  • pl.iteronly if pl==TRUE: the number of iterations needed for each confidence limit.
  • betahistonly if pl==TRUE: the complete history of beta estimates for each confidence limit.
  • pl.convonly if pl==TRUE: the convergence status (deviation of log likelihood from target value, last maximum change in beta) for each confidence limit.
  • If dataout=TRUE, additionally:
  • dataa copy of the input data set
  • weightsthe weights variable (if applicable)

Details

logistf is the main function of the package. It fits a logistic regression model applying Firth's correction to the likelihood. The following generic methods are available for logistf`s output object: print, summary, coef, vcov, confint, anova, extractAIC, add1, drop1, profile, terms, nobs. Furthermore, forward and backward functions perform convenient variable selection. Note that anova, extractAIC, add1, drop1, forward and backward are based on penalized likelihood ratios.

References

Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika 80, 27--38. Heinze G, Schemper M (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21: 2409-2419. Heinze G, Ploner M (2003). Fixing the nonconvergence bug in logistic regression with SPLUS and SAS. Computer Methods and Programs in Biomedicine 71: 181-187. Heinze G, Ploner M (2004). Technical Report 2/2004: A SAS-macro, S-PLUS library and R package to perform logistic regression without convergence problems. Section of Clinical Biometrics, Department of Medical Computer Sciences, Medical University of Vienna, Vienna, Austria. http://www.meduniwien.ac.at/user/georg.heinze/techreps/tr2_2004.pdf Heinze G (2006). A comparative investigation of methods for logistic regression with separated or nearly separated data. Statistics in Medicine 25: 4216-4226. Venzon DJ, Moolgavkar AH (1988). A method for computing profile-likelihood based confidence intervals. Applied Statistics 37:87-94.

See Also

drop1.logistf add1.logistf anova.logistf

Examples

Run this code
data(sex2)
fit<-logistf(case ~ age+oc+vic+vicl+vis+dia, data=sex2)
summary(fit)
nobs(fit)
drop1(fit)
plot(profile(fit,variable="dia"))

extractAIC(fit)

fit1<-update(fit, case ~ age+oc+vic+vicl+vis)
extractAIC(fit1)
anova(fit,fit1)


data(sexagg)
fit2<-logistf(case ~ age+oc+vic+vicl+vis+dia, data=sexagg, weights=COUNT)
summary(fit2)


# simulated SNP example
# not run
set.seed(72341)
snpdata<-rbind(
    matrix(rbinom(2000,2,runif(2000)*0.3),100,20),
    matrix(rbinom(2000,2,runif(2000)*0.5),100,20))
colnames(snpdata)<-paste("SNP",1:20,"_",sep="")
snpdata<-as.data.frame(snpdata)
for(i in 1:20) snpdata[,i]<-as.factor(snpdata[,i])
snpdata$case<-c(rep(0,100),rep(1,100))


fitsnp<-logistf(data=snpdata, formula=case~1, pl=FALSE)
add1(fitsnp)
fitf<-forward(fitsnp)
fitf

Run the code above in your browser using DataLab