assocSparse and cosSparse for such data are described here.

cosCol(X, colGroupX, Y = NULL, colGroupY = NULL, norm = norm2)
assocCol(X, colGroupX, Y = NULL, colGroupY = NULL, method = res, sparse = TRUE)

cosRow(X, rowGroup, Y = NULL, norm = norm2, weight = NULL)
assocRow(X, rowGroup, Y = NULL, method = res)

When Y = NULL, all methods return symmetric similarity matrices of class dsCMatrix, which specify only the upper triangle. The only exception is when sparse=T is chosen, in which case the result is of class dsyMatrix.

When a second matrix Y is specified, the result is of class dgCMatrix or dgeMatrix, respectively.

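
The difference between the symmetric and the general result classes can be seen with the Matrix package alone. The following is a minimal sketch and does not call the functions documented here: the toy matrices X and Y are hypothetical, and crossprod merely stands in for a similarity computation with one or two inputs.

```r
library(Matrix)

# Hypothetical toy matrices, not taken from the package examples
X <- sparseMatrix(i = c(1, 2, 3, 3), j = c(1, 1, 2, 3), x = 1, dims = c(3, 3))
Y <- sparseMatrix(i = c(1, 2, 3), j = c(2, 1, 2), x = 1, dims = c(3, 2))

# One input: the column-by-column product is symmetric,
# so Matrix can store it as a symmetric sparse matrix
S <- crossprod(X)

# Two inputs: the product is a general (rectangular) sparse matrix
G <- crossprod(X, Y)

class(S)
class(G)
```

With a single input the result should be a symmetric sparse class, with two inputs a general one, which parallels the Y = NULL and Y specified cases described above.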
assoc and cos are described in detail in assocSparse and cosSparse, respectively. Those methods are extended here for cases in which either the columns (Col) or the rows (Row) form groups. Specifically, this occurs with the sparse encoding of nominal variables (see splitTable). In such an encoding, the different values of a nominal variable are encoded in separate columns. However, these columns cannot be treated independently; they have to be treated as groups.

The Col methods should be used when similarities between the different values of nominal variables are to be computed. The Row methods should be used when similarities between the observations of nominal variables are to be computed.

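
To make the grouping of columns concrete, here is a hand-built sketch of such an encoding using only the Matrix package. The data frame d is hypothetical, and the names OV and AV are chosen only to mirror the example at the end of this page; the exact objects returned by splitTable may differ in class and ordering.

```r
library(Matrix)

# Hypothetical toy data: three observations of two nominal variables
d <- data.frame(
  colour = factor(c("red", "blue", "red")),
  size   = factor(c("S", "L", "L"))
)

# One indicator column per value of each variable (observations x values)
blocks <- lapply(d, function(v) {
  sparseMatrix(i = seq_along(v), j = as.integer(v), x = 1,
               dims = c(length(v), nlevels(v)))
})
OV <- do.call(cbind, unname(blocks))

# Grouping matrix (attributes x values): marks which columns of OV
# belong to the same nominal variable
nval <- sapply(d, nlevels)
AV <- sparseMatrix(i = rep(seq_along(nval), nval),
                   j = seq_len(sum(nval)), x = 1,
                   dims = c(length(nval), sum(nval)))

OV
AV
```

Every row of OV contains exactly one entry per variable, which is why the columns within one group are not independent of each other. Matrices of this shape are presumably what the X and colGroupX arguments expect, as in assocCol(OV, AV).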
Note that the calculations of the assoc functions really only make sense for binary data (i.e. matrices with only ones and zeros). Currently, all input is coerced to such data by as(X, "nMatrix")*1, meaning that all values that are not one or zero are turned into one (including negative values!).

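
The effect of this coercion can be checked directly with the Matrix package. The sketch below uses a hypothetical matrix X with the stored values 5, -2 and 1 and applies the same expression as quoted above.

```r
library(Matrix)

# Hypothetical input with values other than zero and one
X <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(5, -2, 1),
                  dims = c(3, 3))

# Same coercion as quoted in the note: keep only the pattern of
# non-zero entries, then turn it back into a numeric matrix
B <- as(X, "nMatrix") * 1

X
B
```

All three stored entries, including the negative one, become one in B, so any weighting present in the original values is lost before the association is computed.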
See sim.att and sim.obs for convenient shortcuts around these methods.

# convenience functions are easiest to use
# first a simple example using the farms-dataset from MASS
library(qlcMatrix)  # provides splitTable and assocCol
library(MASS)       # provides the farms data
# to investigate the relation between the individual values
# This is similar to Multiple Correspondence Analysis (see mca in MASS)
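f <- splitTable(farms)
# f$OV holds the encoded data, f$AV groups its columns
s <- assocCol(f$OV, f$AV)
rownames(s) <- f$values
# negate the associations so that strongly associated values end up close together
plot(hclust(as.dist(-s)))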