Permute the design matrix so that it is approximately correlated with the surrogate variables.
permute_design(
design_perm,
sv,
target_cor,
method = c("optmatch", "hungarian", "marriage")
)
design_perm
A numeric design matrix whose rows are to be permuted (thus controlling the amount by which they are correlated with the surrogate variables). The rows index the samples and the columns index the variables. The intercept should not be included (though see Section "Unestimable Components").
sv
A matrix of surrogate variables.
target_cor
A numeric matrix of target correlations between the variables in design_perm and the surrogate variables. The rows index the observed covariates and the columns index the surrogate variables. That is, target_cor[i, j] specifies the target correlation between the ith column of design_perm and the jth surrogate variable. The surrogate variables are those supplied in sv, estimated beforehand using either factor analysis or surrogate variable analysis. The number of columns in target_cor specifies the number of surrogate variables. Set target_cor to NULL to indicate that design_perm and the surrogate variables are independent.
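The shape of target_cor described above can be sketched in base R. The data, the variable names treatment and age, and the correlation values below are made up for illustration only; the point is that target_cor has one row per column of design_perm and one column per surrogate variable.

```r
set.seed(1)
n <- 20  # hypothetical number of samples

# Hypothetical design matrix: two observed covariates, no intercept column.
design_perm <- cbind(
  treatment = rep(c(0, 1), each = n / 2),
  age       = rnorm(n)
)

# Hypothetical surrogate variables: three columns, one row per sample.
sv <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# Start from "no target correlation", then fill in the entries of interest.
target_cor <- matrix(0, nrow = ncol(design_perm), ncol = ncol(sv))
target_cor[1, 1] <- 0.5   # column 1 of design_perm vs. surrogate variable 1
target_cor[2, 3] <- -0.3  # column 2 of design_perm vs. surrogate variable 3

dim(target_cor)
```

Entries left at zero request no correlation for that covariate and surrogate variable pair.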
method
Which matching method to use: the optimal matching technique of Hansen and Klopfer (2006) ("optmatch"), the Gale-Shapley algorithm for stable marriages ("marriage") (Gale and Shapley, 1962) as implemented in the matchingR package, or the Hungarian algorithm (Papadimitriou and Steiglitz, 1982) ("hungarian") as implemented in the clue package (Hornik, 2005). The "optmatch" method works well but takes considerably more computation time when there are many samples (say, 1000). If you use the "optmatch" option, note that the optmatch package is distributed under an unusual license: https://cran.r-project.org/package=optmatch/LICENSE. If this license does not work for you (because you are not in academia, or because you object to restrictive licenses), try the "hungarian" method instead.
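To give a feel for what a matching method does, here is a toy base-R sketch of a minimum-cost assignment, the problem that the Hungarian algorithm solves efficiently. It uses exhaustive search rather than the Hungarian algorithm itself, so it is only feasible for a handful of rows. The squared-distance cost, the data, and the helper all_perms are assumptions made for illustration and are not taken from the internals of permute_design.

```r
# Toy data: match 4 rows of a one-column design to 4 latent values.
design <- c(0.9, -1.2, 0.1, 2.0)
latent <- c(-1.0, 0.0, 1.0, 2.1)

# Cost of assigning design row i to latent position j (assumed: squared distance).
cost <- outer(design, latent, function(a, b) (a - b)^2)

# Hypothetical helper: enumerate all permutations of v (tiny inputs only).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

perms <- all_perms(seq_along(design))

# Total cost of each assignment, where p[i] is the latent position of row i.
total_cost <- vapply(
  perms,
  function(p) sum(cost[cbind(seq_along(p), p)]),
  numeric(1)
)
best <- perms[[which.min(total_cost)]]

# Reorder the design so that position j holds the row assigned to latent[j].
design_matched <- design[order(best)]
design_matched
```

With more than a few rows the number of permutations explodes, which is why dedicated solvers such as those named above are used in practice.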
A list with two elements:
design_perm
A row-permuted version of the user-provided design_perm.
latent_var
A matrix of the latent variables on which design_perm was matched.
Hansen, Ben B., and Stephanie Olsen Klopfer. "Optimal full matching and related designs via network flows." Journal of Computational and Graphical Statistics 15, no. 3 (2006): 609-627.
Gale, David, and Lloyd S. Shapley. "College admissions and the stability of marriage." The American Mathematical Monthly 69, no. 1 (1962): 9-15.
C. Papadimitriou and K. Steiglitz (1982), Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs: Prentice Hall.
Hornik K (2005). "A CLUE for CLUster Ensembles." Journal of Statistical Software, 14(12). doi: 10.18637/jss.v014.i12
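A minimal sketch of a call, following the usage shown above. It assumes that the function is exported by a package named seqgendiff and that the clue package is installed for the "hungarian" method; both are assumptions, so the call is guarded and skipped when either package is missing. The simulated data and the target correlation of 0.6 are illustrative only.

```r
set.seed(1)
n <- 30  # hypothetical number of samples

# Hypothetical inputs: one binary covariate and two surrogate variables.
design_perm <- cbind(treatment = rep(c(0, 1), length.out = n))
sv <- matrix(rnorm(n * 2), nrow = n)

# One row (one covariate), two columns (two surrogate variables).
target_cor <- matrix(c(0.6, 0), nrow = 1)

# Assumption: the package name is seqgendiff; skip the call if unavailable.
if (requireNamespace("seqgendiff", quietly = TRUE) &&
    requireNamespace("clue", quietly = TRUE)) {
  out <- seqgendiff::permute_design(
    design_perm = design_perm,
    sv          = sv,
    target_cor  = target_cor,
    method      = "hungarian"
  )
  # Compare the correlations achieved after permutation with target_cor.
  print(cor(out$design_perm, sv))
}
```

Because only the rows of design_perm are rearranged, the achieved correlations are approximate and will generally differ somewhat from the targets.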