dummyVars
Create A Full Set of Dummy Variables
dummyVars creates a full set of dummy variables (i.e. less than full rank parameterization).
 Keywords
 models
Usage
dummyVars(formula, ...)
## Default S3 method:
dummyVars(formula, data, sep = ".", levelsOnly = FALSE, fullRank = FALSE, ...)
## S3 method for class 'dummyVars'
predict(object, newdata, na.action = na.pass, ...)
contr.dummy(n, ...) ## DEPRECATED
contr.ltfr(n, contrasts = TRUE, sparse = FALSE)
class2ind(x, drop2nd = FALSE)
Arguments
 formula
 An appropriate R model formula, see References
 data
 A data frame with the predictors of interest
 sep
 An optional separator between factor variable names and their levels. Use sep = NULL for no separator (i.e. the normal behavior of model.matrix, as shown in the Details section).
 levelsOnly
 A logical; TRUE means to completely remove the variable names from the column names.
 fullRank
 A logical; should a full rank or less than full rank parameterization be used? If TRUE, factors are encoded to be consistent with model.matrix and no linear dependencies are induced between the resulting columns.
 object
 An object of class dummyVars.
 newdata
 A data frame with the required columns.
 na.action
 A function determining what should be done with missing values in newdata. The default is to predict NA.
 n
 A vector of levels for a factor, or the number of levels.
 contrasts
 A logical indicating whether contrasts should be computed.
 sparse
 A logical indicating if the result should be sparse.
 x
 A factor vector.
 drop2nd
 A logical: when the factor x has two levels, should both dummy variables be returned (drop2nd = FALSE) or only the dummy variable for the first level (drop2nd = TRUE)?
 ...
 additional arguments to be passed to other methods
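
As a rough illustration of how sep, levelsOnly and fullRank change the encoding, the sketch below fits four encoders to the same one-column data frame and compares their output. The data frame d and the object names are hypothetical and chosen only for this illustration; the caret package is assumed to be installed.

```r
library(caret)

# Hypothetical toy data: a single three-level factor
d <- data.frame(day = factor(c("Mon", "Tue", "Wed")))

default_names <- dummyVars(~ day, data = d)                     # sep = "."
no_separator  <- dummyVars(~ day, data = d, sep = NULL)         # model.matrix-style names
levels_only   <- dummyVars(~ day, data = d, levelsOnly = TRUE)  # variable name dropped
full_rank     <- dummyVars(~ day, data = d, fullRank = TRUE)    # reference level dropped

# Compare the column names each encoder produces
colnames(predict(default_names, d))
colnames(predict(no_separator, d))
colnames(predict(levels_only, d))
colnames(predict(full_rank, d))

# The full set has one column per level; the full rank set has one fewer
ncol(predict(default_names, d))
ncol(predict(full_rank, d))
```

Comparing only the column names keeps the differences between the naming options easy to see; the underlying 0/1 indicators are the same apart from the column dropped under fullRank = TRUE.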
Details
Most of the contrasts functions in R produce full rank parameterizations of the predictor data. For example, contr.treatment creates a reference cell in the data and defines dummy variables for all factor levels except those in the reference cell. If a factor with 5 levels is used alone in a model formula, contr.treatment creates columns for the intercept and for all the factor levels except the first level of the factor. For a day-of-week factor such as the one in the Examples section below, this would produce output of the form:
  (Intercept) dayTue dayWed dayThu dayFri daySat daySun
1           1      1      0      0      0      0      0
2           1      1      0      0      0      0      0
3           1      1      0      0      0      0      0
4           1      0      0      1      0      0      0
5           1      0      0      1      0      0      0
6           1      0      0      0      0      0      0
7           1      0      1      0      0      0      0
8           1      0      1      0      0      0      0
9           1      0      0      0      0      0      0
In some situations, there may be a need for dummy variables for all the levels of the factor. For the same example:
  dayMon dayTue dayWed dayThu dayFri daySat daySun
1      0      1      0      0      0      0      0
2      0      1      0      0      0      0      0
3      0      1      0      0      0      0      0
4      0      0      0      1      0      0      0
5      0      0      0      1      0      0      0
6      1      0      0      0      0      0      0
7      0      0      1      0      0      0      0
8      0      0      1      0      0      0      0
9      1      0      0      0      0      0      0
Given a formula and initial data set, the class dummyVars gathers all the information needed to produce a full set of dummy variables for any data set. It uses contr.ltfr as the base function to do this.
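
The difference between the two parameterizations can be sketched in base R alone. The example below does not use contr.ltfr; it removes the intercept from the formula instead, which yields one indicator column per level for a single factor. The data frame d is hypothetical.

```r
# Hypothetical toy data: a three-level factor with one unused level ("Wed")
d <- data.frame(day = factor(c("Mon", "Tue", "Mon"),
                             levels = c("Mon", "Tue", "Wed")))

# Full rank: an intercept plus one column per non-reference level
full_rank <- model.matrix(~ day, data = d)

# Less than full rank: one indicator column per level, no intercept
all_levels <- model.matrix(~ day - 1, data = d)

colnames(full_rank)
colnames(all_levels)

# In the full set, every row has exactly one indicator switched on
rowSums(all_levels)
```

Removing the intercept only gives a full set for the first factor in a formula, which is one reason a dedicated encoder is convenient when several factors are involved.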
class2ind is most useful for converting a factor outcome vector to a matrix of dummy variables.
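
A minimal sketch of class2ind on a two-level outcome, assuming the caret package is installed. The outcome vector is hypothetical and only serves to show the effect of drop2nd.

```r
library(caret)

# Hypothetical two-level outcome; "no" is the first level
outcome <- factor(c("yes", "no", "yes"), levels = c("no", "yes"))

# One indicator column per level
ind <- class2ind(outcome)
ind

# Only the indicator for the first level
ind_first <- class2ind(outcome, drop2nd = TRUE)
ind_first
```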
Value

The output of dummyVars is a list of class 'dummyVars' with elements
 call
 the function call
 form
 the model formula
 vars
 names of all the variables in the model
 facVars
 names of all the factor variables in the model
 lvls
 levels of any factor variables
 sep
 NULL or a character separator
 terms
 the terms.formula object
 levelsOnly
 a logical
The predict function produces a data frame. contr.ltfr generates a design matrix.
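
The elements listed above can be inspected directly on a fitted object. The following is a sketch on hypothetical data (the data frame d and its temp column are made up for illustration), assuming the caret package is installed.

```r
library(caret)

# Hypothetical toy data: one factor and one numeric predictor
d <- data.frame(day  = factor(c("Mon", "Tue", "Mon")),
                temp = c(10.5, 12.1, 9.8))

dv <- dummyVars(~ day + temp, data = d)

class(dv)    # class of the fitted encoder
dv$vars      # all variables in the formula
dv$facVars   # the factor variables among them
dv$lvls      # the factor levels recorded at fitting time

# Numeric predictors pass through unchanged next to the dummy columns
out <- predict(dv, newdata = d)
dim(out)
```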
References
http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models
See Also
Examples
library(caret)

## stringsAsFactors = TRUE keeps the columns as factors
## (no longer the default since R 4.0.0)
when <- data.frame(time = c("afternoon", "night", "afternoon",
                            "morning", "morning", "morning",
                            "morning", "afternoon", "afternoon"),
                   day = c("Mon", "Mon", "Mon",
                           "Wed", "Wed", "Fri",
                           "Sat", "Sat", "Fri"),
                   stringsAsFactors = TRUE)

levels(when$time) <- list(morning = "morning",
                          afternoon = "afternoon",
                          night = "night")
levels(when$day) <- list(Mon = "Mon", Tue = "Tue", Wed = "Wed", Thu = "Thu",
                         Fri = "Fri", Sat = "Sat", Sun = "Sun")

## Default behavior:
model.matrix(~day, when)

mainEffects <- dummyVars(~ day + time, data = when)
mainEffects

predict(mainEffects, when[1:3,])

## Missing values: the default na.action = na.pass predicts NA
when2 <- when
when2[1, 1] <- NA
predict(mainEffects, when2[1:3,])
predict(mainEffects, when2[1:3,], na.action = na.omit)

## Interactions, with an explicit separator
interactionModel <- dummyVars(~ day + time + day:time,
                              data = when,
                              sep = ".")
predict(interactionModel, when[1:3,])

## Column names built from the factor levels only
noNames <- dummyVars(~ day + time + day:time,
                     data = when,
                     levelsOnly = TRUE)
predict(noNames, when)