LogisticDx (version 0.1)

genLogi: Generate data for logistic regression

Description

Generates a data.frame or data.table with a binary outcome, and a logistic model to describe it.

Usage

genLogiDf(b = 2L, f = 2L, c = 1L, n = 20L, nlf = 3L,
    pb = 0.5, rc = 0.8, py = 0.5, asFactor = TRUE,
    model = TRUE, timelim = 5, speedglm = FALSE)

genLogiDt(b = 2L, f = 2L, c = 1L, n = 20L, nlf = 3L,
    pb = 0.5, rc = 0.8, py = 0.5, asFactor = TRUE,
    model = TRUE, timelim = 5, speedglm = FALSE)

Arguments

b
binomial predictors, the number of predictors which are binary, i.e. limited to $0$ or $1$
f
factors, the number of predictors which are factors
c
continuous predictors, the number of predictors which are continuous
n
number of observations in the data frame
nlf
the number of levels in each factor
pb
probability for binomial predictors: the probability that a binomial predictor equals $1$, e.g. if pb=0.3, 30% will be $1$s and 70% will be $0$s
rc
ratio for continuous variables: the ratio of the number of levels of continuous variables to the total number of observations n, e.g. if rc=0.8 and n=100, values will be in the range 1-80
py
ratio for y: the ratio of $1$s to total observations for the binary outcome y, e.g. if py=0.5, 50% will be $1$s and 50% will be $0$s
asFactor
If asFactor=TRUE (the default), predictors given as factors will be converted to factors in the data frame before the model is fit
model
If model=TRUE, the function also returns a model fitted with stats::glm or speedglm::speedglm
timelim
The function will time out after timelim seconds. This is present to prevent duplication of rows.
speedglm
If speedglm=TRUE, return a model fitted with speedglm instead of glm
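
The proportion-style arguments above can be illustrated with a short base R sketch. It is not the package's implementation: sim_binary is a hypothetical helper, and the sketch assumes that pb and py fix the exact share of 1s in a column, and that rc bounds the continuous values at round(rc * n).

```r
# Hypothetical helper (not part of LogisticDx): a 0/1 vector of length n
# in which a fixed share p of the entries are 1, in random order.
sim_binary <- function(n, p) {
  n_ones <- round(n * p)
  sample(rep(c(1L, 0L), times = c(n_ones, n - n_ones)))
}

set.seed(1)
n <- 100L

x_bin <- sim_binary(n, p = 0.3)   # plays the role of pb = 0.3
y     <- sim_binary(n, p = 0.5)   # plays the role of py = 0.5

# Plays the role of rc = 0.8: values drawn from 1..80 when n = 100
x_con <- sample.int(round(0.8 * n), size = n, replace = TRUE)

# Plays the role of nlf = 3: a factor with three levels
x_fac <- factor(sample.int(3L, size = n, replace = TRUE), levels = 1:3)

mean(x_bin)      # share of 1s in the binary predictor
range(x_con)     # falls within 1..80
nlevels(x_fac)   # number of factor levels
```

Whether the package draws values this way is an assumption; the sketch only makes the meaning of each argument concrete.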

Value

  • If model=TRUE: a list with the following values:
      • df or dt: A data.frame (for genLogiDf) or data.table (for genLogiDt). Predictors are labelled $x1, x2, ..., xn$. Outcome is $y$. Rows correspond to the $n$ observations.
      • model: A model fit with stats::glm or speedglm::speedglm.
  • If model=FALSE: a data.frame or data.table as above.
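
As a rough illustration of how a list of this shape might be consumed, the sketch below builds a stand-in with base R rather than calling the package. The element names df and model follow the description above; the simulated columns and the formula are assumptions made for the example only.

```r
set.seed(1)
n <- 50L

# Stand-in data: one binary predictor, one continuous predictor, binary outcome
df <- data.frame(
  x1 = rbinom(n, size = 1, prob = 0.5),
  x2 = sample.int(40L, size = n, replace = TRUE),
  y  = rbinom(n, size = 1, prob = 0.5)
)

# Stand-in for the model = TRUE return value: the data plus a fitted logistic model
res <- list(
  df    = df,
  model = stats::glm(y ~ x1 + x2, family = binomial(), data = df)
)

str(res$df)        # n rows; predictors x1, x2 and outcome y
coef(res$model)    # intercept plus one coefficient per predictor
```

With model=FALSE there would be no list to unpack, since the data.frame or data.table is returned directly.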

Examples

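library(LogisticDx)
set.seed(1)
# default arguments; model=TRUE, so a fitted model is returned with the data
genLogiDf()
# no binary predictors, two continuous predictors, data only (no model)
genLogiDt(b=0, c=2, n=100, rc=0.7, model=FALSE)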
