do.disr: Diversity-Induced Self-Representation

Description

Diversity-Induced Self-Representation (DISR) is a feature selection method that ranks features by both representativeness and diversity. The self-representation term, weighted by lbd1, favors the features that best reconstruct the others, while lbd2 penalizes inter-feature similarity to increase diversity among the chosen features.
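
To make the idea of self-representation concrete, the following base-R sketch scores features by how strongly they contribute to reconstructing all features. It is a deliberately simplified stand-in and not the DISR solver: a plain ridge penalty (the hypothetical weight lambda) replaces both the sparsity term governed by lbd1 and the diversity term governed by lbd2, so the diversity aspect is not modeled at all. The toy data and all variable names are assumptions made for illustration.

```r
# Simplified illustration of self-representation scoring (NOT the DISR solver):
# a ridge penalty stands in for the sparsity and diversity terms.
set.seed(1)
n <- 50; p <- 6
X <- matrix(rnorm(n * p), n, p)
X[, 6] <- X[, 1] + 0.01 * rnorm(n)    # feature 6 nearly duplicates feature 1

lambda <- 1                            # hypothetical ridge weight
G <- crossprod(X)                      # t(X) %*% X, a p x p Gram matrix
W <- solve(G + lambda * diag(p), G)    # minimizes ||X - XW||_F^2 + lambda * ||W||_F^2

scores  <- sqrt(rowSums(W^2))          # row norms: contribution of each feature to the reconstruction
ndim    <- 2
featidx <- order(scores, decreasing = TRUE)[1:ndim]
Y       <- X[, featidx, drop = FALSE]  # keep only the selected columns
```

Because nothing in this sketch discourages similar features from being picked together, near-duplicates such as features 1 and 6 can both rank highly; that is the situation the lbd2 penalty in DISR is described as addressing.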

Usage

do.disr(X, ndim = 2, preprocess = c("null", "center", "scale",
  "cscale", "whiten", "decorrelate"), lbd1 = 1, lbd2 = 1)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations and columns represent independent variables.

ndim

an integer-valued target dimension.

preprocess

an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.

lbd1

nonnegative number to control the degree of self-representation.

lbd2

nonnegative number to control the penalty on inter-feature similarity.

Value

a named list containing

Y: an \((n\times ndim)\) matrix whose rows are embedded observations.
featidx: a length-\(ndim\) vector of indices with highest scores.
trfinfo: a list containing information for out-of-sample prediction.
projection: a \((p\times ndim)\) matrix whose columns are basis vectors for projection.
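
As a rough illustration of how featidx, projection and Y can relate for a feature selection method, the base-R sketch below builds a column-selection indicator matrix and checks that projecting with it is the same as subsetting columns. That projection has this indicator form, and that Y is obtained without further transformation, are assumptions here (plausible when preprocess = "null") rather than something this page states. The data and the indices in featidx are made up.

```r
set.seed(42)
X <- matrix(rnorm(5 * 4), nrow = 5, ncol = 4)   # n = 5 observations, p = 4 features
featidx <- c(3, 1)                               # hypothetical selected indices (ndim = 2)

# Build a p x ndim indicator matrix: column j has a single 1 in row featidx[j].
projection <- matrix(0, nrow = ncol(X), ncol = length(featidx))
projection[cbind(featidx, seq_along(featidx))] <- 1

# Projecting onto the indicator basis picks out the selected columns.
Y <- X %*% projection
stopifnot(isTRUE(all.equal(Y, X[, featidx])))
```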

References

liu_unsupervised_2017Rdimtools

Examples

# NOT RUN {
#### load the package and generate R12in72 dataset
library(Rdimtools)
X = aux.gensamples(dname="R12in72")

#### try different lbd combinations
out1 = do.disr(X, lbd1=1, lbd2=1)
out2 = do.disr(X, lbd1=1, lbd2=5)
out3 = do.disr(X, lbd1=5, lbd2=1)
out4 = do.disr(X, lbd1=5, lbd2=5)

#### visualize (save the graphical parameters and restore them afterwards)
opar = par(no.readonly=TRUE)
par(mfrow=c(2,2))
plot(out1$Y[,1], out1$Y[,2], main="(lbd1,lbd2)=(1,1)")
plot(out2$Y[,1], out2$Y[,2], main="(lbd1,lbd2)=(1,5)")
plot(out3$Y[,1], out3$Y[,2], main="(lbd1,lbd2)=(5,1)")
plot(out4$Y[,1], out4$Y[,2], main="(lbd1,lbd2)=(5,5)")
par(opar)
# }