ADSIHT (version 0.2.1)

gen.data: Generate simulated data

Description

Generate simulated data for a sparse group linear model.

Usage

gen.data(
  n,
  m,
  d,
  s,
  s0,
  cor.type = 1,
  beta.type = 1,
  rho = 0.5,
  sigma1 = 1,
  sigma2 = 1,
  seed = 1
)

Value

A list object comprising:

x

Design matrix of predictors.

y

Response variable.

beta

The coefficients used in the underlying regression model.

group

The group index of each variable.

true.group

The important groups in the sparse group linear model.

true.variable

The important variables in the sparse group linear model.

Arguments

n

The number of observations.

m

The number of groups of interest.

d

The size of each group. All groups must have the same size.

s

The number of important groups in the underlying regression model.

s0

The number of important variables in each important group.

cor.type

The structure of correlation. cor.type = 1 denotes the independence structure, where the covariance matrix has \((i,j)\) entry equal to \(I(i = j)\), i.e. the identity matrix. cor.type = 2 denotes the exponential structure, where the covariance matrix has \((i,j)\) entry equal to \(rho^{|i-j|}\). cor.type = 3 denotes the constant structure, where the off-diagonal entries of the covariance matrix are \(rho\) and the diagonal entries are 1.

beta.type

The structure of coefficients. beta.type = 1 denotes the homogeneous setup, where each entry has the same magnitude. beta.type = 2 denotes the heterogeneous setup, where the coefficients are drawn from a normal distribution.

rho

A parameter used to characterize the pairwise correlation between predictors. Default is 0.5.

sigma1

The value controlling the strength of the Gaussian noise. A larger value implies stronger noise. Default sigma1 = 1.

sigma2

The value controlling the strength of the coefficients. A larger value implies larger coefficients. Default sigma2 = 1.

seed

Random seed. Default seed = 1.
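
To make the three cor.type structures concrete, the following base R sketch builds each covariance matrix for a small dimension. It illustrates the definitions given for cor.type and rho only; it is not the package's internal code, and the helper name make_sigma and the dimension p are hypothetical.

```r
# Hypothetical helper (not part of ADSIHT): build a p x p covariance matrix
# following the three structures described for cor.type.
make_sigma <- function(p, cor.type = 1, rho = 0.5) {
  idx <- seq_len(p)
  switch(as.character(cor.type),
    "1" = diag(p),                          # independence: entry is I(i = j)
    "2" = rho^abs(outer(idx, idx, "-")),    # exponential: entry is rho^|i - j|
    "3" = {                                 # constant: rho off-diagonal, 1 on diagonal
      sigma <- matrix(rho, p, p)
      diag(sigma) <- 1
      sigma
    },
    stop("cor.type must be 1, 2, or 3")
  )
}

# Inspect the exponential structure for four variables.
make_sigma(4, cor.type = 2, rho = 0.5)

# Draw 200 rows with the constant structure: Z %*% chol(sigma) has covariance sigma.
set.seed(1)
sigma <- make_sigma(4, cor.type = 3, rho = 0.5)
x_demo <- matrix(rnorm(200 * 4), 200, 4) %*% chol(sigma)
```

With rho = 0 the exponential and constant structures both reduce to the independence case, which is a quick sanity check on the definitions.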

Author

Yanhang Zhang, Zhifan Li, Jianxin Yin.

Examples

# Generate simulated data
n <- 200
m <- 100
d <- 10
s <- 5
s0 <- 5
data <- gen.data(n, m, d, s, s0)
str(data)
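
The call above relies on the defaults. As a further sketch, the non-default arguments from the Usage section can be passed by name, and the returned list inspected through the components listed under Value. This assumes the ADSIHT package is installed (the block is skipped otherwise); the variable name data2 and the chosen argument values are illustrative only.

```r
# Assumes the ADSIHT package is installed; the block does nothing otherwise.
if (requireNamespace("ADSIHT", quietly = TRUE)) {
  # Exponential correlation with rho = 0.3 and heterogeneous coefficients.
  data2 <- ADSIHT::gen.data(
    n = 200, m = 100, d = 10, s = 5, s0 = 5,
    cor.type = 2, beta.type = 2, rho = 0.3, seed = 123
  )

  dim(data2$x)            # dimensions of the design matrix
  length(data2$y)         # length of the response
  data2$true.group        # indices of the important groups
  data2$true.variable     # indices of the important variables
  table(data2$group)      # number of variables per group
}
```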