Generate data from Gaussian, logistic and Poisson models used in the simulation part of Tian, Y., & Feng, Y. (2023).
models(
family = c("gaussian", "binomial", "poisson"),
type = c("all", "source", "target"),
cov.type = 1,
h = 5,
K = 5,
n.target = 200,
n.source = rep(100, K),
s = 5,
p = 500,
Ka = K
)
A list of data sets whose structure depends on the value of type.
type = "all": a list of two components named "target" and "source" storing the target and source data, respectively. Component source is a list containing K components, the first Ka of which are h-transferable and the remaining ones h-nontransferable. The target data set and each source data set have components "x" and "y", holding the predictors and responses, respectively.
type = "source": a list with a single component "source". This component contains a list of K components, the first Ka of which are h-transferable and the remaining ones h-nontransferable. Each source data set has components "x" and "y", holding the predictors and responses, respectively.
type = "target": a list with a single component "target". This component contains another list with components "x" and "y", holding the predictors and responses of the target data, respectively.
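As a quick illustration of the structure described above, the following sketch builds a small hand-made list with the same shape as the documented type = "all" output and reads off the sample sizes. The mock data and the helper name sample_sizes are assumptions made for illustration only; they are not part of glmtrans. With the package installed, the same helper should apply to the output of models(type = "all").

```r
# Hypothetical helper (not part of glmtrans): report the number of rows of
# the target data and of each source data set in a list shaped like the
# documented type = "all" output.
sample_sizes <- function(D) {
  list(
    target = nrow(D$target$x),
    source = vapply(D$source, function(d) nrow(d$x), integer(1))
  )
}

# Mock data with the documented layout (NOT generated by models()).
set.seed(1)
make_set <- function(n, p) list(x = matrix(rnorm(n * p), n, p), y = rnorm(n))
D.mock <- list(
  target = make_set(20, 4),
  source = list(make_set(10, 4), make_set(15, 4))
)

sizes <- sample_sizes(D.mock)
print(sizes)
```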
family: response type. Can be "gaussian", "binomial" or "poisson". Default = "gaussian".
"gaussian": Gaussian distribution.
"binomial": logistic distribution. When family = "binomial", the responses in both the target and source data are 0/1.
"poisson": Poisson distribution. When family = "poisson", the responses in both the target and source data are non-negative.
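The three response types can be sketched in base R as follows. This is only an illustration of the response ranges listed above: it assumes the canonical link for each family and unit noise variance in the Gaussian case, and the coefficient vector beta is hypothetical. How models() draws its responses internally is not shown on this page.

```r
set.seed(1)
n <- 50
p <- 3
x <- matrix(rnorm(n * p), n, p)
beta <- c(0.5, -0.5, 0.25)  # hypothetical coefficient vector
eta <- drop(x %*% beta)     # linear predictor

# Assumed canonical links; not taken from the glmtrans source code.
y.gaussian <- rnorm(n, mean = eta, sd = 1)             # real-valued responses
y.binomial <- rbinom(n, size = 1, prob = plogis(eta))  # 0/1 responses
y.poisson  <- rpois(n, lambda = exp(eta))              # non-negative counts
```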
type: the type of generated data. Can be "all", "source" or "target".
"all": generate a list with a target data set of size n.target and K source data sets of sizes n.source.
"source": generate a list with K source data sets of sizes n.source.
"target": generate a list with a target data set of size n.target.
cov.type: the type of covariates. Can be 1 or 2 (numeric). If it equals 1, the predictors are generated from the distribution used in Section 4.1.1 (Ah-Trans-GLM) of the latest version of Tian, Y., & Feng, Y. (2023). If it equals 2, the predictors are generated from the distribution used in Section 4.1.2 (When transferable sources are unknown).
h: measures the deviation (in \(l_1\)-norm) of the transferable source coefficients from the target coefficient. Default = 5.
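To make the role of h concrete, here is a minimal base R sketch. It assumes that a source counts as h-transferable when the \(l_1\) distance between its coefficient vector and the target coefficient is at most h. The vectors beta, w1 and w2 and the helper l1_deviation are hypothetical and are not the coefficients that models() generates.

```r
# Hypothetical helper: l1 distance between a source coefficient w and the
# target coefficient beta.
l1_deviation <- function(w, beta) sum(abs(w - beta))

beta <- c(1, 1, 0, 0, 0)      # hypothetical target coefficient
w1 <- c(1.5, 0.5, 0.5, 0, 0)  # close to beta
w2 <- c(4, -3, 2, 1, 0)       # far from beta
h <- 5

dev <- c(l1_deviation(w1, beta), l1_deviation(w2, beta))
transferable <- dev <= h      # assumed rule: deviation of at most h
print(data.frame(dev = dev, transferable = transferable))
```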
K: the number of source data sets. Default = 5.
n.target: the sample size of the target data. Should be a positive integer. Default = 200.
n.source: the sample size of each source data set. Should be a vector of length K. Default is a K-vector with all elements 100.
s: the number of non-zero components in the target coefficient, which controls the sparsity of the target problem. Default = 5.
p: the dimension of the data. Default = 500.
Ka: the number of transferable sources. Should be an integer between 0 and K. Default = K.
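The constraints stated for K, n.source and Ka can be checked before simulating. The helper below, check_sim_args, is a hypothetical convenience written for this illustration and is not part of glmtrans. It only encodes the requirements listed above.

```r
# Hypothetical helper (not part of glmtrans): stop with an error unless
# n.source has length K and Ka is an integer between 0 and K.
check_sim_args <- function(K, n.source, Ka) {
  stopifnot(
    length(n.source) == K,
    all(n.source > 0),
    Ka >= 0, Ka <= K, Ka == round(Ka)
  )
  invisible(TRUE)
}

K <- 3
ok <- check_sim_args(K = K, n.source = c(50, 60, 70), Ka = 2)

# n.source of length 2 does not match K = 3, so the check fails.
bad <- tryCatch(check_sim_args(K = K, n.source = rep(100, 2), Ka = 2),
                error = function(e) FALSE)

# With glmtrans installed, the validated values could then be passed on:
# D <- glmtrans::models("gaussian", K = K, n.source = c(50, 60, 70), Ka = 2)
```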
Tian, Y., & Feng, Y. (2023). Transfer learning under high-dimensional generalized linear models. Journal of the American Statistical Association, 118(544), 2684-2697.
glmtrans.
library(glmtrans)
set.seed(0, kind = "L'Ecuyer-CMRG")
# target data plus K source data sets
D.all <- models("binomial", type = "all")
# target data only
D.target <- models("binomial", type = "target")
# source data only
D.source <- models("binomial", type = "source")