model.matrix
Construct Design Matrices
model.matrix creates a design (or model) matrix, e.g., by
expanding factors to a set of dummy variables (depending on the
contrasts) and expanding interactions similarly.
Usage
model.matrix(object, ...)

## Default S3 method:
model.matrix(object, data = environment(object),
             contrasts.arg = NULL, xlev = NULL, ...)
Arguments
object
    an object of an appropriate class. For the default method, a model
    formula or a terms object.

data
    a data frame created with model.frame. If another sort of object,
    model.frame is called first.

contrasts.arg
    a list, whose entries are values (numeric matrices or character
    strings naming functions) to be used as replacement values for the
    contrasts replacement function and whose names are the names of
    columns of data containing factors.

xlev
    to be used as argument of model.frame if data is such that
    model.frame is called.

...
    further arguments passed to or from other methods.
Details
model.matrix creates a design matrix from the description
given in terms(object), using the data in data, which
must supply variables with the same names as would be created by a
call to model.frame(object) or, more precisely, by evaluating
attr(terms(object), "variables"). If data is a data
frame, there may be other columns and the order of columns is not
important. Any character variables are coerced to factors. After
coercion, all the variables used on the right-hand side of the
formula must be logical, integer, numeric or factor.
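
As a minimal sketch of the coercion rule above, the following builds a design matrix from a plain data frame. The data frame d and its columns (y, x, grp, unused) are hypothetical and invented for illustration, and the column names that result assume the default treatment contrasts are in effect.

```r
# Hypothetical data: 'grp' is a character vector and 'unused' is a
# column that the formula never references.
d <- data.frame(y = c(1.2, 2.3, 3.1, 4.8),
                x = c(0.5, 1.5, 2.5, 3.5),
                grp = c("ctl", "trt", "ctl", "trt"),
                unused = 1:4,
                stringsAsFactors = FALSE)

# 'd' was not created by model.frame, so model.frame is called first;
# 'grp' is coerced to a factor and 'unused' is ignored.
mm <- model.matrix(y ~ x + grp, data = d)
colnames(mm)
mm
```

Extra columns such as unused do not affect the result because only the variables named in the formula are looked up in data.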
If contrasts.arg is specified for a factor it overrides the
default factor coding for that variable and any "contrasts"
attribute set by C or contrasts.
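
The override can be illustrated with a small sketch. It reuses the balanced two-way layout that also appears in the Examples; the object names m_attr and m_over are illustrative only.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Attach a "contrasts" attribute to factor 'a' via the replacement function.
contrasts(dd$a) <- "contr.helmert"

# Without contrasts.arg, the attribute on 'a' determines its coding.
m_attr <- model.matrix(~ a + b, dd)

# With contrasts.arg, the attribute on 'a' is overridden for this call only.
m_over <- model.matrix(~ a + b, dd,
                       contrasts.arg = list(a = "contr.treatment"))

colnames(m_attr)
colnames(m_over)
```

Comparing the two sets of column names, and the columns for a themselves, shows which coding was applied in each call; dd itself is left unchanged by contrasts.arg.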
In an interaction term, the variable whose levels vary fastest is the
first one to appear in the formula (and not in the term), so in
~ a + b + b:a the interaction will have a varying fastest.
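
A short sketch of this ordering rule, using a hypothetical 3 x 3 layout so that each factor contributes more than one contrast column (default treatment contrasts are assumed):

```r
# Hypothetical balanced layout with two three-level factors.
dd3 <- data.frame(a = gl(3, 3), b = gl(3, 1, 9))

# The interaction is written as b:a, but 'a' appears first in the formula.
m_ab <- model.matrix(~ a + b + b:a, dd3)

# Same terms, but now 'b' appears first in the formula.
m_ba <- model.matrix(~ b + a + b:a, dd3)

# Keep only the interaction columns for comparison.
int_ab <- grep(":", colnames(m_ab), value = TRUE)
int_ba <- grep(":", colnames(m_ba), value = TRUE)
int_ab
int_ba
```

Reading the interaction column names left to right shows which factor's levels change first in each case.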
By convention, if the response variable also appears on the right-hand side of the formula it is dropped (with a warning), although interactions involving the term are retained.
Value

The design matrix for a regression-like model with the specified formula
and data.

There is an attribute "assign", an integer vector with an entry
for each column in the matrix giving the term in the formula which
gave rise to the column. Value 0 corresponds to the intercept
(if any), and positive values to terms in the order given by the
term.labels attribute of the terms structure corresponding to object.

If there are any factors in terms in the model, there is an attribute
"contrasts", a named list with an entry for each factor. This
specifies the contrasts that would be used in terms in which the
factor is coded by contrasts (in some terms dummy coding may be used),
either as a character vector naming a function or as a numeric matrix.
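
The two attributes can be inspected directly. The sketch below, which again uses the balanced two-way layout from the Examples, maps each column back to its term; the helper names asgn, term_labels and col_terms are illustrative only.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))
mm <- model.matrix(~ a + b, dd)

# One entry per column: 0 for the intercept, k for the k-th term label.
asgn <- attr(mm, "assign")
asgn

# Named list with one entry per factor in the model.
attr(mm, "contrasts")

# Map each column to the term that produced it.
term_labels <- attr(terms(~ a + b), "term.labels")
col_terms <- c("(Intercept)", term_labels)[asgn + 1]
split(colnames(mm), col_terms)
```

Indexing with asgn + 1 shifts the intercept's value 0 onto the first element of the lookup vector, so every column receives a label.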
References
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
See Also
model.frame, model.extract, terms.

sparse.model.matrix from package Matrix for creating sparse model
matrices, which may be more efficient in large dimensions.
Examples
library(stats)
ff <- log(Volume) ~ log(Height) + log(Girth)
utils::str(m <- model.frame(ff, trees))
mat <- model.matrix(ff, m)

dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way
options("contrasts")
model.matrix(~ a + b, dd)
# contrasts.arg is spelled out in full rather than partially matched
model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.sum"))
model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.sum", b = "contr.poly"))
m.orth <- model.matrix(~a+b, dd, contrasts.arg = list(a = "contr.helmert"))
crossprod(m.orth) # m.orth is ALMOST orthogonal