
Sparse canonical correlation analysis (sCCA) for high-dimensional (biomedical) data. The function takes two data sets and returns pairs of linear combinations (canonical variates), one per data set, that are maximally correlated. Elastic net penalization (with its variants UST, ridge and lasso penalization) is implemented for sparsity and smoothness, with a built-in cross-validation procedure to obtain the optimal penalization parameters. Multiple canonical variate pairs that are orthogonal to each other can also be obtained.
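To make the idea of a "maximally correlated canonical variate pair" concrete, the sketch below runs classical, unpenalized CCA with base R's cancor() on small simulated data. It does not use sCCA() and applies no penalization; the data and variable names are made up for illustration.

```r
# Classical (unpenalized) CCA with base R, for illustration only.
set.seed(1)
n <- 200
z <- rnorm(n)                                  # shared latent signal
X <- cbind(z + rnorm(n, sd = 0.5), rnorm(n), rnorm(n))
Y <- cbind(z + rnorm(n, sd = 0.5), rnorm(n))

fit <- cancor(X, Y)

# First canonical variate of each set: a linear combination of the
# centered columns, using the first column of the coefficient matrices.
xi  <- scale(X, center = fit$xcenter, scale = FALSE) %*% fit$xcoef[, 1]
eta <- scale(Y, center = fit$ycenter, scale = FALSE) %*% fit$ycoef[, 1]

# The correlation of this pair is the first canonical correlation.
first_cor <- abs(cor(xi, eta)[1, 1])
all.equal(first_cor, fit$cor[1])
```

In the sparse setting, the weights play the same role as xcoef and ycoef here, but most of them are forced to zero.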
sCCA(predictor, predicted, penalization = "enet", ridge_penalty = 1,
nonzero = 1, max_iterations = 100, tolerance = 1 * 10^-20,
cross_validate = FALSE, parallel_CV = TRUE, nr_subsets = 10,
multiple_LV = FALSE, nr_LVs = 1)
predictor: The n*p matrix of the predictor data set
predicted: The n*q matrix of the predicted data set
penalization: The penalization method applied during the analysis ("none", "enet" or "ust")
ridge_penalty: The ridge penalty parameter of the predictor set's latent variable used for enet or ust (a single value if cross_validate = FALSE, a vector of candidate values otherwise)
nonzero: The number of non-zero weights of the predictor set's latent variable (a single integer if cross_validate = FALSE, a vector of candidate values otherwise)
max_iterations: The maximum number of iterations of the algorithm
tolerance: Convergence criterion
cross_validate: Use k-fold cross-validation to find the optimal penalty parameters (TRUE or FALSE)
parallel_CV: Run the cross-validation in parallel (TRUE or FALSE)
nr_subsets: Number of subsets for k-fold cross-validation
multiple_LV: Obtain multiple latent variable pairs (TRUE or FALSE)
nr_LVs: Number of latent variables to be obtained
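The following base R sketch shows one way to picture the cross-validation inputs: the candidate values for ridge_penalty and nonzero form a grid of parameter combinations, and the n observations are split into nr_subsets folds. It assumes that every combination is evaluated on every fold; the package's internal bookkeeping may differ, and cv_grid and fold_id are hypothetical names.

```r
# Candidate values, as passed in the cross-validation example
ridge_penalty <- c(0.1, 1)
nonzero       <- c(5, 10, 15)

# All parameter combinations (assumed search space of the CV)
cv_grid <- expand.grid(ridge_penalty = ridge_penalty, nonzero = nonzero)
nrow(cv_grid)

# Random assignment of n observations to nr_subsets folds
set.seed(1)
n          <- 250
nr_subsets <- 10
fold_id    <- sample(rep(seq_len(nr_subsets), length.out = n))
table(fold_id)
```

Larger grids multiply the number of model fits, which is where parallel_CV = TRUE becomes useful.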
An object of class "sRDA".
Predictor set's latent variable scores
Predicted set's latent variable scores
Weights of the predictor set's latent variable
Weights of the predicted set's latent variable
Number of iterations run before convergence (or the maximum number of iterations)
Inverse of the predictor set's latent variable variance matrix
The convergence criterion value (a small positive tolerance)
Sum of the absolute values of beta weights
The ridge penalty parameter used for the model
The number of nonzero alpha weights in the model
The number of latent variable pairs in the model
The detailed results of cross validations (if cross_validate is TRUE)
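As a rough illustration of how a fixed number of nonzero weights can arise, the sketch below applies generic soft-thresholding to a weight vector so that only the k largest weights (in absolute value) remain nonzero. This is a textbook technique shown under that assumption, not necessarily how the package implements enet or ust; soft_threshold_k is a hypothetical helper.

```r
# Keep the k largest-magnitude weights and shrink them; zero out the rest.
soft_threshold_k <- function(w, k) {
  stopifnot(k >= 1, k <= length(w))
  a <- sort(abs(w), decreasing = TRUE)
  # Threshold at the (k+1)-th largest magnitude (0 if all weights are kept)
  lambda <- if (k < length(w)) a[k + 1] else 0
  sign(w) * pmax(abs(w) - lambda, 0)
}

w     <- c(0.9, -0.7, 0.05, 0.3, -0.02, 0.6)
alpha <- soft_threshold_k(w, k = 3)
alpha
sum(alpha != 0)   # number of nonzero weights
```

With tied magnitudes at the threshold, fewer than k weights may survive.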
# NOT RUN {
# generate data with a few highly correlated variables
dataXY <- generate_data(nr_LVs = 2,
                        n = 250,
                        nr_correlated_Xs = c(5, 20),
                        nr_uncorrelated_Xs = 250,
                        mean_reg_weights_assoc_X = c(0.9, 0.5),
                        sd_reg_weights_assoc_X = c(0.05, 0.05),
                        Xnoise_min = -0.3,
                        Xnoise_max = 0.3,
                        nr_correlated_Ys = c(10, 15),
                        nr_uncorrelated_Ys = 350,
                        mean_reg_weights_assoc_Y = c(0.9, 0.6),
                        sd_reg_weights_assoc_Y = c(0.05, 0.05),
                        Ynoise_min = -0.3,
                        Ynoise_max = 0.3)
# separate predictor and predicted sets
X <- dataXY$X
Y <- dataXY$Y
# run sCCA
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = 5,
                ridge_penalty = 1, penalization = "ust")
# check first 10 weights of X
CCA.res$ALPHA[1:10]
# }
# NOT RUN {
# run sCCA with cross-validation to determine the best penalization parameters
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5, 10, 15),
                ridge_penalty = c(0.1, 1), penalization = "enet",
                cross_validate = TRUE, parallel_CV = TRUE)
# check first 10 weights of X
CCA.res$ALPHA[1:10]
CCA.res$ridge_penalty
CCA.res$nr_nonzeros
# obtain multiple latent variables
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5, 10, 15),
                ridge_penalty = c(0.1, 1), penalization = "enet",
                cross_validate = TRUE, parallel_CV = TRUE,
                multiple_LV = TRUE, nr_LVs = 2, max_iterations = 5)
# check first 10 weights of X in the first two components
CCA.res$ALPHA[[1]][1:10]
CCA.res$ALPHA[[2]][1:10]
# latent variables are orthogonal to each other
t(CCA.res$XI[[1]]) %*% CCA.res$XI[[2]]
# }
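The orthogonality check in the last example computes the inner product of two score vectors; a value near zero means the latent variables are orthogonal. The base R sketch below reproduces that check on made-up vectors, with the second vector orthogonalized against the first by a Gram-Schmidt step. It illustrates only the check, not how the package derives later components; xi1 and xi2 are hypothetical stand-ins for the score vectors.

```r
set.seed(1)
xi1 <- rnorm(100)
raw <- rnorm(100)

# Remove the component of `raw` that lies along xi1 (Gram-Schmidt step)
xi2 <- raw - xi1 * sum(raw * xi1) / sum(xi1^2)

# Same form as t(XI[[1]]) %*% XI[[2]]: near zero up to rounding error
inner <- drop(t(xi1) %*% xi2)
abs(inner) < 1e-8
```

In practice the product is rarely exactly zero, so compare it against a small tolerance rather than testing for equality with 0.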