hierSDR (version 0.1)

simulate_data: Simulate data with hierarchical subspaces

Description

Simulates data with hierarchical subspaces. Data are generated with two stratifying factors that induce heterogeneity across subpopulations.

Usage

simulate_data(
  nobs,
  nvars,
  x.type = c("continuous", "some_categorical"),
  sd.y = 1,
  rho = 0.5,
  model = c("1", "2", "3")
)

Arguments

nobs

positive integer for the sample size per subpopulation

nvars

positive integer for the dimension of the covariates (the number of columns of x)

x.type

variable type for the covariates, either "continuous" (the covariates are multivariate normal with an AR(1) covariance matrix with parameter rho) or "some_categorical" (half of the covariates are continuous and the other half are binary, with dependence on the continuous covariates)

sd.y

standard deviation of the response

rho

correlation parameter for the AR(1) covariance structure of the continuous covariates

model

model number to use, one of "1", "2", or "3"; each corresponds to a different outcome model setting
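
The AR(1) covariance structure described for x.type = "continuous" can be sketched in base R. This is only an illustration of the structure, not the package's internal code; it assumes unit variances, so that entry (i, j) of the covariance matrix is rho^|i - j|, and the object names Sigma and x are chosen here for the example.

```r
# Illustrative sketch only; simulate_data() may construct x differently.
set.seed(1)
nobs  <- 100
nvars <- 6
rho   <- 0.5

# AR(1) covariance with unit variances (assumption): Sigma[i, j] = rho^|i - j|
idx   <- seq_len(nvars)
Sigma <- rho^abs(outer(idx, idx, "-"))

# Multivariate normal draws: rows of Z %*% chol(Sigma) have covariance Sigma,
# because chol() returns the upper-triangular R with t(R) %*% R == Sigma
Z <- matrix(rnorm(nobs * nvars), nrow = nobs, ncol = nvars)
x <- Z %*% chol(Sigma)

dim(x)        # nobs rows, nvars columns
Sigma[1, 2]   # adjacent covariates have correlation rho
```

Larger values of rho make neighboring covariates more strongly correlated, while covariates far apart in index stay nearly independent.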

Value

A list with the following elements:

  • x a matrix of covariates with number of rows equal to the total sample size and columns equal to the number of variables

  • z a matrix with number of rows equal to the total sample size and columns as dummy variables indicating presence of a stratifying factor

  • y a vector of all responses

  • beta a list of the true sufficient dimension reduction matrices, one for each subpopulation

  • z.combinations all possible combinations of the stratifying factors z

  • snr a scalar giving the observed signal-to-noise ratio for the response

  • d.correct the true dimensions of the dimension reduction spaces
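
To make the relationship between z, z.combinations, and snr more concrete, here is a toy sketch on hand-built data. It does not call simulate_data(); it assumes two binary factors with all four combinations present, and it assumes the signal-to-noise ratio is the variance of the noiseless signal divided by the variance of the noise, which is one common definition and may differ from what the package computes. The names z1, z2, beta, and signal are hypothetical.

```r
# Toy stand-ins for the returned elements; not output of simulate_data().
set.seed(2)

# z: two dummy columns, five observations per combination (assumption)
z <- as.matrix(expand.grid(z1 = 0:1, z2 = 0:1))[rep(1:4, each = 5), ]

# Distinct rows of z play the role of z.combinations
combos <- unique(z)

# Row indices belonging to each subpopulation
grp <- paste(z[, "z1"], z[, "z2"], sep = "_")
idx <- split(seq_len(nrow(z)), grp)

# One common signal-to-noise definition (assumption): var(signal) / var(noise)
x      <- matrix(rnorm(nrow(z) * 3), nrow = nrow(z), ncol = 3)
beta   <- c(1, -1, 0.5)            # hypothetical coefficients
signal <- drop(x %*% beta)
sd.y   <- 1
y      <- signal + rnorm(nrow(z), sd = sd.y)
snr    <- var(signal) / var(y - signal)

combos
lengths(idx)
snr
```

With real output, the same indexing idea lets you subset dat$x and dat$y by the rows of dat$z that match a given combination.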

Examples

# NOT RUN {
library(hierSDR)

set.seed(123)
dat <- simulate_data(nobs = 100, nvars = 6,
                     x.type = "some_categorical",
                     sd.y = 1, model = "2")

x <- dat$x ## covariates
z <- dat$z ## factor indicators
y <- dat$y ## response

dat$beta ## true coefficients that generate the subspaces

dat$snr ## signal-to-noise ratio

str(x)
str(z)

dat$z.combinations ## what combinations of z represent different subpops

## correct structural dimensions:
dat$d.correct


# }