model.frame (a generic function) and its methods return a
  data.frame with the variables needed to use
  formula and any … arguments.
model.frame(formula, …)
# S3 method for default
model.frame(formula, data = NULL,
            subset = NULL, na.action = na.fail,
            drop.unused.levels = FALSE, xlev = NULL, …)
# S3 method for aovlist
model.frame(formula, data = NULL, …)
# S3 method for glm
model.frame(formula, …)
# S3 method for lm
model.frame(formula, …)
get_all_vars(formula, data, …)
data: a data.frame, list or environment (or object
    coercible by as.data.frame to a data.frame),
    containing the variables in formula.  Neither a matrix nor an
    array will be accepted.
subset: a specification of the rows to be used: defaults to all
    rows. This can be any valid indexing vector (see
    [.data.frame) for the rows of data or, if that is not
    supplied, a data frame made up of the variables used in formula.
drop.unused.levels: should factors have unused levels dropped?
    Defaults to FALSE.
xlev: a named list of character vectors giving the full set of levels
    to be assumed for each factor.
…: for model.frame methods, a mix of further
    arguments such as data, na.action, subset to pass
    to the default method.  Any additional arguments (such as
    offset and weights or other named arguments) which
    reach the default method are used to create further columns in the
    model frame, with parenthesised names such as "(offset)".
    For get_all_vars, further named columns to include
    in the model frame.
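As a rough illustration of the parenthesised columns described above, the following sketch passes offset and weights through … to the default method, using the built-in cars data. The particular offset and weights expressions are arbitrary choices made only for this example.

```r
## Extra named arguments that reach the default method become
## additional columns with parenthesised names.
mf_extra <- model.frame(dist ~ speed, data = cars,
                        offset = log(speed), weights = speed)
names(mf_extra)
## The extra columns are looked up like any other variable,
## first in 'data' and then in the environment of the formula.
head(mf_extra[["(weights)"]])
```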
A data.frame containing the variables used in
  formula plus those specified in ….  It will have
  additional attributes, including "terms" for an object of class
  "terms" derived from formula,
  and possibly "na.action" giving information on the handling of
  NAs (which will not be present if no special handling was done,
  e.g. by na.pass).
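A minimal sketch of inspecting these attributes, assuming the built-in airquality data (whose Ozone column contains missing values) and an explicit na.action so the result does not depend on the session's options:

```r
## With na.omit, rows containing NAs are dropped and recorded.
mf_omit <- model.frame(Ozone ~ Wind, data = airquality, na.action = na.omit)
tt <- attr(mf_omit, "terms")          # the "terms" object
dropped <- attr(mf_omit, "na.action") # indices of the removed rows
class(tt)
class(dropped)

## With na.pass, no special handling is done, so no attribute is set.
mf_pass <- model.frame(Ozone ~ Wind, data = airquality, na.action = na.pass)
is.null(attr(mf_pass, "na.action"))
```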
Exactly what happens depends on the class and attributes of the object
  formula.  If this is an object of fitted-model class such as
  "lm", the method will either return the saved model frame
  used when fitting the model (if any, often selected by argument
  model = TRUE) or pass the call used when fitting on to the
  default method.  The default method itself can cope with rather
  standard model objects such as those of class
  "lqs" from package MASS if no other
  arguments are supplied.
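The two paths for a fitted-model object might look like the sketch below, which fits lm to the built-in cars data once with the model frame saved (the default for lm) and once with model = FALSE. The object names are illustrative only.

```r
## Saved model frame: model.frame() simply returns it.
fit_saved <- lm(dist ~ speed, data = cars)
mf_saved <- model.frame(fit_saved)
identical(mf_saved, fit_saved$model)

## No saved model frame: the fitting call is re-evaluated
## via the default method.
fit_bare <- lm(dist ~ speed, data = cars, model = FALSE)
mf_rebuilt <- model.frame(fit_bare)
names(mf_rebuilt)
```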
The rest of this section applies only to the default method.
If either formula or data is already a model frame (a
  data frame with a "terms" attribute) and the other is missing,
  the model frame is returned.  Unless formula is a terms object,
  as.formula and then terms are called on it.  (If you wish
  to use the keep.order argument of terms.formula, pass a
  terms object rather than a formula.)
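The following sketch illustrates both points with the built-in warpbreaks data: a terms object built with keep.order = TRUE is passed in place of a formula, and an existing model frame is handed back to model.frame. The formula itself is chosen only to make the ordering visible.

```r
## Passing a terms object preserves the term order as written.
tt_kept <- terms(breaks ~ wool:tension + wool, keep.order = TRUE)
mf_kept <- model.frame(tt_kept, data = warpbreaks)
lab_kept <- attr(attr(mf_kept, "terms"), "term.labels")
lab_kept

## Passing the formula lets terms() reorder by interaction degree.
mf_default <- model.frame(breaks ~ wool:tension + wool, data = warpbreaks)
lab_default <- attr(attr(mf_default, "terms"), "term.labels")
lab_default

## A model frame supplied on its own is returned as is.
identical(model.frame(mf_kept), mf_kept)
```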
Row names for the model frame are taken from the data argument
  if present, then from the names of the response in the formula (or
  rownames if it is a matrix), if there is one.
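A small sketch of the two sources of row names, assuming the built-in mtcars data for the first case and two made-up vectors (resp and pred, hypothetical names) for the second:

```r
## Row names come from 'data' when it is supplied.
mf_data <- model.frame(mpg ~ wt, data = mtcars)
head(rownames(mf_data))

## Without 'data', the names of the response are used.
resp <- c(a = 1.5, b = 2.5, c = 4)
pred <- 1:3
mf_resp <- model.frame(resp ~ pred)
rownames(mf_resp)
```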
All the variables in formula, subset and in …
  are looked for first in data and then in the environment of
  formula (see the help for formula() for further
  details) and collected into a data frame.  Then the subset
  expression is evaluated, and it is used as a row index to the data
  frame.  Then the na.action function is applied to the data frame
  (and may well add attributes).  The levels of any factors in the data
  frame are adjusted according to the drop.unused.levels and
  xlev arguments: if xlev specifies a factor and a
  character variable is found, it is converted to a factor (as from R
  2.10.0).
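The effect of subset, drop.unused.levels and xlev can be sketched as follows. The first two calls use the built-in warpbreaks data; the small data frame d and its level set are invented for illustration.

```r
## 'subset' is evaluated in the collected data and used as a row index.
mf_sub <- model.frame(breaks ~ tension, data = warpbreaks,
                      subset = tension != "H")
levels(mf_sub$tension)   # unused level is still present

## drop.unused.levels = TRUE removes levels emptied by the subset.
mf_drop <- model.frame(breaks ~ tension, data = warpbreaks,
                       subset = tension != "H", drop.unused.levels = TRUE)
levels(mf_drop$tension)

## 'xlev' converts a character variable to a factor with the given levels.
d <- data.frame(y = c(1.5, 2.5, 3.5), g = c("a", "b", "a"),
                stringsAsFactors = FALSE)
mf_xlev <- model.frame(y ~ g, data = d, xlev = list(g = c("a", "b", "c")))
levels(mf_xlev$g)
```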
Unless na.action = NULL, time-series attributes will be removed
  from the variables found (since they will be wrong if NAs are
  removed).
Note that all the variables in the formula are included in the
  data frame, even those preceded by -.
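For instance, in the sketch below (using the built-in mtcars data) the variable hp is removed from the model terms by the - operator but still appears as a column of the model frame:

```r
mf_minus <- model.frame(mpg ~ wt + hp - hp, data = mtcars)
names(mf_minus)                               # hp is still a column
attr(attr(mf_minus, "terms"), "term.labels")  # but not a model term
```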
Only variables whose type is raw, logical, integer, real, complex or character can be included in a model frame: this includes classed variables such as factors (whose underlying type is integer), but excludes lists.
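A minimal sketch of this restriction, using a made-up data frame with a list column (the names d_list and lst are hypothetical). The call is wrapped in tryCatch so that the expected failure is captured rather than stopping the script:

```r
d_list <- data.frame(y = 1:3)
d_list$lst <- list(1, "a", TRUE)   # a list column

## A factor is fine (underlying type integer); a list column is rejected.
ok <- model.frame(y ~ factor(y), data = d_list)
err <- tryCatch(model.frame(y ~ lst, data = d_list),
                error = function(e) e)
inherits(err, "error")
```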
get_all_vars returns a data.frame containing the
  variables used in formula plus those specified in …
  which are recycled to the number of data frame rows.
  Unlike model.frame.default, it returns the input variables and
  not those resulting from function calls in formula.
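The difference can be seen by giving both functions the same formula with transformed variables, here on the built-in cars data:

```r
## model.frame() stores the evaluated function calls ...
mf_calls <- model.frame(log(dist) ~ sqrt(speed), data = cars)
names(mf_calls)

## ... whereas get_all_vars() returns the untransformed inputs.
gv_inputs <- get_all_vars(log(dist) ~ sqrt(speed), data = cars)
names(gv_inputs)
```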
Chambers, J. M. (1992). Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
model.matrix for the ‘design matrix’,
  formula for formulas and
  expand.model.frame for model.frame manipulation.
data.class(model.frame(dist ~ speed, data = cars))
## get_all_vars(): new var.s are recycled (iff length matches: 50 = 2*25)
ncars <- get_all_vars(sqrt(dist) ~ I(speed/2), data = cars, newVar = 2:3)
stopifnot(is.data.frame(ncars),
          identical(cars, ncars[,names(cars)]),
          ncol(ncars) == ncol(cars) + 1)