One possible drawback of the SIR (Sliced Inverse Regression) method is that, for high-dimensional data, it can suffer from rank deficiency of the scatter/covariance matrix. Instead of naive matrix inversion, several authors have proposed regularization schemes that borrow ideas from existing methods.
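To make the rank-deficiency issue concrete, the following base-R sketch builds a covariance matrix from fewer observations than variables and shows that adding a multiple of the identity makes it invertible. This is only an illustration of a ridge-type remedy under assumed toy data; it is not the internal code of do.rsir, and the exact form of each regularization scheme in the package may differ. The name `tau` is borrowed from the usage below purely for readability.

```r
# Sketch only: rank deficiency of a covariance matrix and a ridge-type fix.
# Toy data and variable names are illustrative assumptions.
set.seed(1)
n <- 20                                  # number of observations
p <- 50                                  # number of variables (p > n)
X <- matrix(rnorm(n * p), nrow = n)

S <- cov(X)                              # p x p covariance, rank at most n - 1
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
num_rank <- sum(ev > 1e-8)               # numerical rank of S

tau <- 1                                 # regularization strength (assumed value)
S_ridge <- S + tau * diag(p)             # shift every eigenvalue up by tau
ev_ridge <- eigen(S_ridge, symmetric = TRUE, only.values = TRUE)$values

S_ridge_inv <- solve(S_ridge)            # inversion now succeeds
```

Because every eigenvalue of `S_ridge` is at least `tau`, the inverse is well defined even though `S` itself is singular.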
do.rsir(
  X,
  response,
  ndim = 2,
  h = max(2, round(nrow(X)/5)),
  preprocess = c("center", "scale", "cscale", "decorrelate", "whiten"),
  regmethod = c("Ridge", "Tikhonov", "PCA", "PCARidge", "PCATikhonov"),
  tau = 1,
  numpc = ndim
)
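X: an (n x p) matrix or data frame whose rows are observations and whose columns represent independent variables.
response: a length-n vector of the response variable.
ndim: an integer-valued target dimension.
h: the number of slices into which the range of the response vector is divided.
preprocess: an additional option for preprocessing the data. Default is "center". See also aux.preprocess for more details.
regmethod: type of regularization scheme to be used.
tau: regularization parameter for adjusting the rank-deficient scatter matrix.
numpc: number of principal components to be used in the intermediate dimension reduction scheme.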
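As a rough illustration of what the slicing argument controls, the sketch below computes the default number of slices from the usage above and assigns each observation to a slice. It assumes equal-count slices formed by ordering the response; the package may partition the response differently, so treat this as a sketch rather than a description of do.rsir internals. The variable `slice_id` is a hypothetical name.

```r
# Sketch only: splitting a response vector into h slices of equal size.
# Equal-count slicing is an assumption made for illustration.
set.seed(100)
n <- 50
response <- rnorm(n)

h <- max(2, round(n / 5))                # default expression from the usage

# rank the observations by response and cut the ranking into h groups
slice_id <- integer(n)
slice_id[order(response)] <- ceiling(seq_len(n) * h / n)

slice_sizes <- table(slice_id)           # number of observations per slice
```

With few observations, a large `h` leaves very few points per slice, which is one reason the default ties the number of slices to the sample size.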
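a named list containing
Y: an (n x ndim) matrix whose rows are embedded observations.
trfinfo: a list containing information for out-of-sample prediction.
projection: a (p x ndim) matrix whose columns are basis vectors for projection.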
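The returned projection basis and out-of-sample information suggest that new observations can be embedded by applying the same preprocessing and then multiplying by the basis. The base-R sketch below shows that idea under the assumption that preprocessing is plain centering; the basis here is a stand-in computed with `svd`, not an actual do.rsir result, and the contents of the package's out-of-sample list are not assumed.

```r
# Sketch only: embedding new data with a (p x ndim) basis after centering.
# `center` and `projection` are stand-ins, not objects returned by do.rsir.
set.seed(7)
n <- 30
p <- 6
ndim <- 2
X_train <- matrix(rnorm(n * p), nrow = n)
X_new <- matrix(rnorm(5 * p), nrow = 5)

center <- colMeans(X_train)                        # centering information
Xc <- sweep(X_train, 2, center)                    # centered training data
projection <- svd(Xc)$v[, 1:ndim, drop = FALSE]    # stand-in orthonormal basis

Y <- Xc %*% projection                             # (n x ndim) embedding
Y_new <- sweep(X_new, 2, center) %*% projection    # embed unseen rows the same way
```

The key point is that the new rows are centered with the training means, not their own, so both embeddings live in the same coordinate system.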
# NOT RUN {
## load the package that provides do.rsir and do.sir
library(Rdimtools)
## generate a swiss roll with auxiliary dimensions,
## following the reference example from the LSIR paper
set.seed(100)
n = 50
theta = runif(n)
h = runif(n)
t = (1+2*theta)*(3*pi/2)
X = array(0,c(n,10))
X[,1] = t*cos(t)
X[,2] = 21*h
X[,3] = t*sin(t)
X[,4:10] = matrix(runif(7*n), nrow=n)
## corresponding response vector
y = sin(5*pi*theta)+(runif(n)*sqrt(0.1))
## try with different regularization methods
## use default number of slices
out1 = do.rsir(X, y, regmethod="Ridge")
out2 = do.rsir(X, y, regmethod="Tikhonov")
outsir = do.sir(X, y)
## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, main="RSIR::Ridge")
plot(out2$Y, main="RSIR::Tikhonov")
plot(outsir$Y, main="standard SIR")
par(opar)
# }