BinaryEMVS (version 0.1)

data.sim: High Dimensional Correlated Data Generation

Description

Generates a high-dimensional dataset in which a subset of columns is related to the response, while controlling the maximum correlation between related and unrelated variables.

Usage

data.sim(n = 100, p = 1000, pr = 3, cor = 0.6)

Arguments

n
sample size
p
total number of variables
pr
the number of variables related to the response
cor
the maximum correlation between related and unrelated variables

Value

Returns an n x p matrix whose first pr columns have maximum correlation cor with the remaining p - pr columns.

Examples

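library(BinaryEMVS)
data <- data.sim(n = 100, p = 1000, pr = 10, cor = 0.6)
cors <- abs(cor(data))  # absolute pairwise correlations between all columns
max(cors[cors < 1])     # largest one, excluding exact 1s such as the diagonal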
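
The check above looks at every pair of columns, whereas the Value section describes the correlation between the first pr columns and the remaining p - pr columns. A minimal sketch of a check restricted to that block is given below. The helper max_cross_cor is hypothetical and not part of BinaryEMVS, and the matrix X is stand-in data built with base R so that the sketch runs without the package installed.

```r
# Hypothetical helper (not part of BinaryEMVS): largest absolute correlation
# between the first `pr` columns of X and all remaining columns.
max_cross_cor <- function(X, pr) {
  stopifnot(is.matrix(X), pr >= 1, pr < ncol(X))
  related   <- X[, seq_len(pr), drop = FALSE]
  unrelated <- X[, -seq_len(pr), drop = FALSE]
  # cor(A, B) returns the pr x (p - pr) matrix of cross-correlations
  max(abs(cor(related, unrelated)))
}

# Stand-in data (assumption): 8 independent normal columns, with column 3
# rebuilt so that it is correlated with column 1.
set.seed(1)
X <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)
X[, 3] <- 0.6 * X[, 1] + sqrt(1 - 0.6^2) * rnorm(50)

max_cross_cor(X, pr = 2)
```

With the package loaded, the same helper could presumably be applied to the output of data.sim, for example max_cross_cor(data, pr = 10) for the call shown above.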