# dummyVars

##### Create A Full Set of Dummy Variables

`dummyVars` creates a full set of dummy variables (i.e. less than full rank parameterization).

- Keywords
- models

##### Usage

dummyVars(formula, ...)

# S3 method for default
dummyVars(formula, data, sep = ".", levelsOnly = FALSE, fullRank = FALSE, ...)

# S3 method for dummyVars
print(x, ...)

# S3 method for dummyVars
predict(object, newdata, na.action = na.pass, ...)

contr.ltfr(n, contrasts = TRUE, sparse = FALSE)

class2ind(x, drop2nd = FALSE)

##### Arguments

- formula
An appropriate R model formula, see References

- ...
additional arguments to be passed to other methods

- data
A data frame with the predictors of interest

- sep
An optional separator between factor variable names and their levels. Use `sep = NULL` for no separator (i.e. normal behavior of `model.matrix` as shown in the Details section)

- levelsOnly
A logical; `TRUE` means to completely remove the variable names from the column names

- fullRank
A logical; should a full rank or less than full rank parameterization be used? If `TRUE`, factors are encoded to be consistent with `model.matrix` and there are no linear dependencies induced between the resulting columns.

- x
A factor vector.

- object
An object of class

`dummyVars`

- newdata
A data frame with the required columns

- na.action
A function determining what should be done with missing values in `newdata`. The default is to predict `NA`.

- n
A vector of levels for a factor, or the number of levels.

- contrasts
A logical indicating whether contrasts should be computed.

- sparse
A logical indicating if the result should be sparse.

- drop2nd
A logical: if the factor has two levels, should a single binary vector be returned?
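
As a rough illustration of how `sep`, `levelsOnly` and `fullRank` change the resulting columns, the sketch below builds the same encoding four ways. It assumes the caret package is installed; the data frame `toy` and its `size` factor are hypothetical and exist only for this example.

```r
library(caret)

# Hypothetical three-level factor, used only for illustration
toy <- data.frame(size = factor(c("small", "medium", "large", "small"),
                                levels = c("small", "medium", "large")))

# Default: one column per level, named <variable><sep><level>
default_cols <- colnames(predict(dummyVars(~ size, data = toy), toy))

# sep = NULL: no separator, as in model.matrix column names
nosep_cols <- colnames(predict(dummyVars(~ size, data = toy, sep = NULL), toy))

# levelsOnly = TRUE: the variable name is dropped from the column names
levels_cols <- colnames(predict(dummyVars(~ size, data = toy,
                                          levelsOnly = TRUE), toy))

# fullRank = TRUE: the first level acts as the reference and gets no column
fullrank_cols <- colnames(predict(dummyVars(~ size, data = toy,
                                            fullRank = TRUE), toy))

print(default_cols)
print(nosep_cols)
print(levels_cols)
print(fullrank_cols)
```

Comparing the four name vectors side by side is usually the quickest way to decide which naming scheme suits downstream code.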

##### Details

Most of the `contrasts` functions in R produce full rank parameterizations of the predictor data. For example, `contr.treatment` creates a reference cell in the data and defines dummy variables for all factor levels except those in the reference cell. If a factor with 5 levels is used in a model formula alone, `contr.treatment` creates columns for the intercept and all the factor levels except the first level of the factor. For the data in the Example section below, this would produce:

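| (Intercept) | dayTue | dayWed | dayThu | dayFri | daySat | daySun |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 |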

In some situations, there may be a need for dummy variables for all the levels of the factor. For the same example:

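| dayMon | dayTue | dayWed | dayThu | dayFri | daySat | daySun |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 0 | 0 | 0 | 0 | 1 | 0 | 0 |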
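
The difference between the two codings can be checked in base R. The sketch below is only an illustration: the factor `f` and the object names are hypothetical, and `model.matrix(~ f - 1)` stands in for the all-levels coding rather than reproducing what `dummyVars` does internally.

```r
# Hypothetical three-level factor in which every level is observed
f <- factor(c("a", "b", "c", "a", "b", "c"))

# Reference-cell coding: intercept plus indicators for levels b and c
reference_cell <- model.matrix(~ f)

# One indicator column per level (no intercept column)
all_levels <- model.matrix(~ f - 1)

# Every row of the all-levels coding sums to 1, so adding an intercept
# column makes the columns linearly dependent
with_intercept <- cbind(Intercept = 1, all_levels)

# Compare the number of columns with the numerical rank of each matrix
print(c(columns = ncol(reference_cell), rank = qr(reference_cell)$rank))
print(c(columns = ncol(with_intercept), rank = qr(with_intercept)$rank))
```

When the rank is lower than the number of columns, the matrix is less than full rank, which is why a full set of dummy variables is normally used without an intercept or with models that tolerate the redundancy.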

Given a formula and initial data set, the class `dummyVars` gathers all the information needed to produce a full set of dummy variables for any data set. It uses `contr.ltfr` as the base function to do this.

`class2ind` is most useful for converting a factor outcome vector to a matrix (or vector) of dummy variables.
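
To make the role of `contr.ltfr` more concrete, the following sketch compares it with base R's `contr.treatment` on the same set of levels. It assumes caret is installed; the level vector `lv` is hypothetical, and the comparison is an illustration rather than a description of how `dummyVars` calls the function internally.

```r
library(caret)

# Hypothetical factor levels, used only for illustration
lv <- c("a", "b", "c")

# Less than full rank contrasts: one column per level
ltfr <- contr.ltfr(lv)

# Full rank reference-cell contrasts from base R, for comparison
trt <- contr.treatment(lv)

print(ltfr)
print(trt)

# The less than full rank matrix has one more column than the
# treatment contrasts built from the same levels
print(dim(ltfr))
print(dim(trt))
```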

##### Value

The output of `dummyVars` is a list of class 'dummyVars' with elements

- call
the function call

- form
the model formula

- vars
names of all the variables in the model

- facVars
names of all the factor variables in the model

- lvls
levels of any factor variables

- sep
`NULL` or a character separator

- terms
the `terms.formula` object

- levelsOnly
a logical

The predict function produces a data frame.

class2ind returns a matrix (or a vector if drop2nd = TRUE).

contr.ltfr generates a design matrix.

##### References

https://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models

##### Examples

```
library(caret)

# Example data: time of day and day of week, stored as factors
when <- data.frame(time = c("afternoon", "night", "afternoon",
                            "morning", "morning", "morning",
                            "morning", "afternoon", "afternoon"),
                   day = c("Mon", "Mon", "Mon",
                           "Wed", "Wed", "Fri",
                           "Sat", "Sat", "Fri"),
                   stringsAsFactors = TRUE)

# Reorder the levels and add the days that do not occur in the data
levels(when$time) <- list(morning = "morning",
                          afternoon = "afternoon",
                          night = "night")
levels(when$day) <- list(Mon = "Mon", Tue = "Tue", Wed = "Wed", Thu = "Thu",
                         Fri = "Fri", Sat = "Sat", Sun = "Sun")

## Default behavior:
model.matrix(~day, when)

# Full set of dummy variables for the main effects
mainEffects <- dummyVars(~ day + time, data = when)
mainEffects
predict(mainEffects, when[1:3,])

# Missing values: passed through as NA by default, or dropped with na.omit
when2 <- when
when2[1, 1] <- NA
predict(mainEffects, when2[1:3,])
predict(mainEffects, when2[1:3,], na.action = na.omit)

# Main effects plus the day:time interaction
interactionModel <- dummyVars(~ day + time + day:time,
                              data = when,
                              sep = ".")
predict(interactionModel, when[1:3,])

# Column names built from the factor levels only
noNames <- dummyVars(~ day + time + day:time,
                     data = when,
                     levelsOnly = TRUE)
predict(noNames, when)

# class2ind: indicator columns for a factor outcome
head(class2ind(iris$Species))

two_levels <- factor(rep(letters[1:2], each = 5))
class2ind(two_levels)
class2ind(two_levels, drop2nd = TRUE)
```

*Documentation reproduced from package caret, version 6.0-86, License: GPL (>= 2)*