data.sim: High Dimensional Correlated Data Generation
Description
Generates a high-dimensional dataset in which a subset of columns is related to the response, while
controlling the maximum correlation between related and unrelated variables.
Usage
data.sim(n = 100, p = 1000, pr = 3, cor = 0.6)
Arguments
n
sample size
p
total number of variables
pr
the number of variables related to the response
cor
the maximum correlation between related and unrelated variables
Value
Returns an n x p matrix whose first pr columns (the related variables) have maximum correlation
cor with the remaining p - pr columns.
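Examples
The following is a minimal base-R sketch of one way a matrix with this layout could be generated. It is not the package's implementation of data.sim: the function name sim_sketch and the construction (each unrelated column is a noisy copy of one randomly chosen related column) are assumptions made purely for illustration. Only the argument names and defaults are taken from the Usage section above.

```r
# Hypothetical sketch, NOT the implementation behind data.sim().
# Assumption: each unrelated column is built from one related column so that
# their population correlation equals `cor`.
sim_sketch <- function(n = 100, p = 1000, pr = 3, cor = 0.6) {
  stopifnot(pr >= 1, pr < p, abs(cor) <= 1)
  # Related columns: independent standard normal draws.
  related <- matrix(rnorm(n * pr), nrow = n, ncol = pr)
  # Pick, for every unrelated column, the related column it is tied to.
  partner <- sample.int(pr, p - pr, replace = TRUE)
  noise <- matrix(rnorm(n * (p - pr)), nrow = n, ncol = p - pr)
  # Unit variance overall; covariance with the partner column equals `cor`.
  unrelated <- cor * related[, partner, drop = FALSE] + sqrt(1 - cor^2) * noise
  cbind(related, unrelated)
}

set.seed(1)
x <- sim_sketch(n = 100, p = 1000, pr = 3, cor = 0.6)
dim(x)  # n rows, p columns

# Sample correlations between the first pr columns and the remaining p - pr.
cross <- stats::cor(x[, 1:3], x[, 4:1000])
max(abs(cross))
```

In this sketch cor bounds the population correlation only. The sample correlations in cross scatter around their population values, so the observed maximum for a finite n can exceed cor.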