Permute the rows of the design matrix so that its columns are approximately correlated with the surrogate variables at the target levels given in target_cor.
permute_design(
design_perm,
sv,
target_cor,
method = c("optmatch", "hungarian", "marriage")
)
design_perm: A numeric design matrix whose rows are to be permuted (thus controlling the amount by which they are correlated with the surrogate variables). The rows index the samples and the columns index the variables. The intercept should not be included (though see Section "Unestimable Components").
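As an illustration of the "no intercept" requirement, here is a minimal sketch of building such a matrix with base R. The data frame samples and its columns group and age are hypothetical and serve only as an example; any numeric matrix with one row per sample would do.

```r
# Hypothetical sample-level data: 6 samples, one factor and one numeric covariate.
samples <- data.frame(
  group = factor(c("ctrl", "ctrl", "ctrl", "trt", "trt", "trt")),
  age   = c(31, 45, 52, 38, 47, 60)
)

# model.matrix() adds an "(Intercept)" column; drop it while keeping a matrix.
full_design <- model.matrix(~ group + age, data = samples)
design_perm <- full_design[, colnames(full_design) != "(Intercept)", drop = FALSE]

colnames(design_perm)  # "grouptrt" "age"
dim(design_perm)       # 6 2
```

Using drop = FALSE keeps the result a matrix even when only one covariate remains.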
sv: A matrix of surrogate variables. The rows index the samples and the columns index the surrogate variables.
target_cor: A numeric matrix of target correlations between the
variables in design_perm and the surrogate variables. The
rows index the observed covariates and the columns index the surrogate
variables. That is, target_cor[i, j] specifies the target
correlation between the ith column of design_perm and the
jth surrogate variable. The surrogate variables are estimated
either using factor analysis or surrogate variable analysis (see the
parameter use_sva).
The number of columns in target_cor specifies the number of
surrogate variables. Set target_cor to NULL to indicate
that design_perm and the surrogate variables are independent.
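The layout of target_cor can be sketched as follows. The inputs here (two covariates named treatment and age, three simulated surrogate variables, and the chosen correlation values) are assumptions made purely for illustration.

```r
set.seed(1)
n <- 100

# Hypothetical inputs: two covariates (no intercept) and three surrogate variables.
design_perm <- cbind(treatment = rep(c(0, 1), each = n / 2), age = rnorm(n))
sv <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# One row per column of design_perm, one column per surrogate variable.
target_cor <- matrix(
  c(0.5, 0.0, 0.0,   # treatment vs. surrogate variables 1 to 3
    0.0, 0.3, 0.0),  # age       vs. surrogate variables 1 to 3
  nrow = ncol(design_perm), ncol = ncol(sv), byrow = TRUE,
  dimnames = list(colnames(design_perm), paste0("sv", seq_len(ncol(sv))))
)

target_cor["age", "sv2"]  # 0.3

# Dimension check that mirrors the description of target_cor.
stopifnot(nrow(target_cor) == ncol(design_perm),
          ncol(target_cor) == ncol(sv))
```

Passing target_cor = NULL instead would request no association between design_perm and the surrogate variables.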
method: Should we use the optimal matching technique from Hansen and
Klopfer (2006) ("optmatch"), the Gale-Shapley algorithm
for stable marriages ("marriage") (Gale and Shapley, 1962)
as implemented in the matchingR package, or the Hungarian algorithm
(Papadimitriou and Steiglitz, 1982) ("hungarian")
as implemented in the clue package (Hornik, 2005)?
The "optmatch" method works well but takes considerably more
computational time when you have many samples (say, 1000). If you use
the "optmatch" option, note that the optmatch package is distributed
under a nonstandard license:
https://cran.r-project.org/package=optmatch/LICENSE. If this
license does not work for you (because you are not in academia, or
because you object to restrictive licenses), then
use the "hungarian" method instead.
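To give some intuition for why matching rows against a latent variable induces correlation, here is a simplified one-dimensional sketch in base R. It is not the package's implementation: it assumes a single covariate, a single surrogate variable, and a squared-distance cost, in which case the minimum-cost assignment reduces to matching by rank. The names x, rho, latent, and x_perm are hypothetical.

```r
set.seed(42)
n   <- 200
x   <- rnorm(n)   # one covariate (a single column of a design matrix)
sv  <- rnorm(n)   # one surrogate variable
rho <- 0.6        # target correlation

# Latent variable constructed to have population correlation rho with sv.
latent <- rho * as.numeric(scale(sv)) + sqrt(1 - rho^2) * rnorm(n)

# One-dimensional matching: the sample with the k-th smallest latent value
# receives the k-th smallest covariate value. In one dimension this pairing
# minimizes the sum of squared differences between matched values.
x_perm <- numeric(n)
x_perm[order(latent)] <- sort(x)

cor(x, sv)       # before permuting: near zero
cor(x_perm, sv)  # after permuting: roughly rho, up to sampling noise
```

With several covariates and surrogate variables, rank matching no longer applies, which is where general assignment methods such as those listed for method come in.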
A list with two elements:
design_perm: A row-permuted version of the user-provided
design_perm.
latent_var: A matrix of the latent variables on which
design_perm was matched.
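A hedged usage sketch of the returned list follows. The simulated inputs are assumptions for illustration, and the call is guarded so that it runs only when the seqgendiff and clue packages are installed. The realized correlations will generally differ somewhat from the target.

```r
set.seed(7)
n <- 50

# Hypothetical inputs: one binary covariate and two surrogate variables.
design_perm <- matrix(rep(c(0, 1), length.out = n), ncol = 1)
sv          <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
target_cor  <- matrix(c(0.7, 0), nrow = 1, ncol = 2)

# Run only when the needed packages are available ("hungarian" relies on clue).
if (requireNamespace("seqgendiff", quietly = TRUE) &&
    requireNamespace("clue", quietly = TRUE)) {
  out <- seqgendiff::permute_design(
    design_perm = design_perm,
    sv          = sv,
    target_cor  = target_cor,
    method      = "hungarian"
  )

  # The two documented elements of the result.
  print(names(out))

  # Same rows as the input design matrix, in a new order.
  print(dim(out$design_perm))

  # Realized correlations between the permuted covariate and each surrogate variable.
  print(cor(out$design_perm, sv))
}
```

Comparing cor(out$design_perm, sv) with target_cor is a quick way to see how close the permutation came to the requested correlations.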
Hansen, Ben B., and Stephanie Olsen Klopfer. "Optimal full matching and related designs via network flows." Journal of Computational and Graphical Statistics 15, no. 3 (2006): 609-627.
Gale, David, and Lloyd S. Shapley. "College admissions and the stability of marriage." The American Mathematical Monthly 69, no. 1 (1962): 9-15.
Papadimitriou, Christos H., and Kenneth Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs: Prentice Hall, 1982.
Hornik, Kurt. "A CLUE for CLUster Ensembles." Journal of Statistical Software 14, no. 12 (2005). doi: 10.18637/jss.v014.i12