Usage
gen_informative_sample(clustering = TRUE, two_stage = FALSE,
theta = c(0.2, 0.7, 1), M = 3, theta_star = matrix(c(0.3, 0.3, 0.3,
0.31, 0.72, 2.04, 0.58, 0.83, 1), 3, 3, byrow = TRUE), gp_type = "rq",
N = 10000, T = 15, L = 10, R = 8, I = 4, n = 750,
noise_to_signal = 0.05, incl_gradient = "medium")
Arguments
clustering
Boolean input on whether to generate the population from clusters of covariance
parameters. Defaults to clustering = TRUE
two_stage
Boolean input on whether to use two-stage sampling, in which the first stage defines a set
of L blocks, with block membership determined by quantiles of observation unit
variance functions. (They are structured like strata, though they are sub-s
theta
A numeric vector of global covariance parameters in the case of clustering = FALSE.
The length, P, of theta must be consistent with the selected gp_type.
Defaults to theta = c(0.2,0.7,1.0)
M
Scalar input denoting number of clusters to employ if clustering = TRUE. Defaults to
M = 3
theta_star
A P x M matrix of cluster location values associated with the choice of
M and the selected gp_type. Defaults to
matrix(c(0.3,0.3,0.3,0.31,0.72,2.04,0.58,0.83,1.00),3,3,byrow=TRUE).
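As a rough sketch of how theta_star and M fit together, the default matrix can be built in base R and one column picked per population unit. The names cluster_id and theta_by_unit are hypothetical, and assigning units to clusters with equal probability is an assumption made only for illustration.

```r
# Default cluster locations from the Usage section: P = 3 rows, M = 3 columns.
theta_star <- matrix(c(0.3, 0.3, 0.3,
                       0.31, 0.72, 2.04,
                       0.58, 0.83, 1), 3, 3, byrow = TRUE)
P <- nrow(theta_star)
M <- ncol(theta_star)

# Assumption: each unit is assigned to one of the M clusters with equal probability.
set.seed(1)
n_units <- 10
cluster_id <- sample(seq_len(M), size = n_units, replace = TRUE)

# Each unit takes the P covariance parameters of its cluster (a P x n_units matrix).
theta_by_unit <- theta_star[, cluster_id, drop = FALSE]
```

Column m of theta_star holds the P parameters for cluster m, so the number of rows must agree with the selected gp_type.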
gp_type
The covariance matrix formulation used to generate the functions
for the N population units. Choices are c("se","rq"), where "se" denotes
the squared exponential covariance function and
N
A scalar input denoting the number of population units (or establishments).
Defaults to N = 10000.
T
A scalar input denoting the number of time points in each of the N functions, each T x 1,
that contribute to the N x T population data matrix, y. Defaults to T = 15.
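To illustrate how N functions of length T form the N x T matrix y, here is a minimal base R sketch that draws functions from a Gaussian process with a squared exponential covariance. The helper se_cov and its (length_scale, signal_var) parameterization are assumptions for illustration and may not match how this function maps theta onto the covariance.

```r
# Squared exponential covariance over a grid of time points.
# Assumption: this parameterization is illustrative only.
se_cov <- function(t, length_scale, signal_var) {
  d2 <- outer(t, t, "-")^2
  signal_var * exp(-d2 / (2 * length_scale^2))
}

set.seed(1)
n_units <- 5      # small stand-in for N
n_time <- 15      # T
t_grid <- seq_len(n_time)

# A small jitter on the diagonal keeps the Cholesky factorization stable.
K <- se_cov(t_grid, length_scale = 2, signal_var = 1) + diag(1e-6, n_time)

# chol() returns upper-triangular U with K = t(U) %*% U,
# so each row of Z %*% U has covariance K.
U <- chol(K)
Z <- matrix(rnorm(n_units * n_time), n_units, n_time)
y <- Z %*% U      # n_units x n_time population data matrix
```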
L
A scalar input that denotes the number of blocks in which to assign the population
units to be sub-sampled in the first stage of sampling.
Defaults to L = 10.
R
A scalar input that denotes the number of blocks to sample from the L blocks, with
probability proportional to the average variance of member functions in each block.
Defaults to R = 8.
I
A scalar input denoting the number of strata to form within each block. Population units
are divided into equally-sized strata based on variance quantiles. Defaults to I = 4.
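The roles of L, R, and I can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's own code: unit_var is a simulated stand-in for the per-unit function variances, quantile_group is a hypothetical helper, and sample() with prob and without replacement draws blocks sequentially, which only approximates strict probability-proportional-to-size selection.

```r
set.seed(1)
n_units <- 1000   # stand-in for N
n_blocks <- 10    # L
n_sampled <- 8    # R
n_strata <- 4     # I

# Stand-in for the variance of each unit's generated function (hypothetical).
unit_var <- rgamma(n_units, shape = 2, rate = 2)

# Hypothetical helper: split values into k equally sized groups by quantile.
quantile_group <- function(x, k) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = k + 1))
  cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# First stage: form blocks from variance quantiles, then sample blocks with
# probability proportional to the average variance in each block.
block <- quantile_group(unit_var, n_blocks)
avg_var <- tapply(unit_var, block, mean)
sampled_blocks <- sample(seq_len(n_blocks), size = n_sampled, prob = avg_var)

# Within each block, form equally sized strata from variance quantiles.
stratum <- ave(unit_var, block, FUN = function(v) quantile_group(v, n_strata))
```

With 1000 units this gives 100 units per block and 25 units per stratum within each block.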
n
Sample size to be generated. An informative sample is taken under either single-stage
(two_stage = FALSE) or two-stage (two_stage = TRUE) sampling, along with
a non-informative, iid sample of the same size (n). Defaults to n = 750.
incl_gradient
A character input controlling the gradient of stratum inclusion probabilities from
lowest to highest. If set to "high", they are proportional to the exponential of the
cluster number. If set to "medium", the inclusion probabilities are proport
noise_to_signal
A numeric input in the interval (0,1) denoting the ratio of noise
variance to the average variance of the generated functions, bb_i. Defaults to
noise_to_signal = 0.05
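A minimal sketch of what this ratio implies, assuming the variance of each function is taken across its time points and then averaged over units (an interpretation, not something the documentation confirms). The matrix f is a simulated stand-in for the noise-free functions.

```r
set.seed(1)
noise_to_signal <- 0.05

# Stand-in for the matrix of noise-free functions (hypothetical values).
n_units <- 200
n_time <- 15
f <- matrix(rnorm(n_units * n_time, sd = 2), n_units, n_time)

# Average, over units, of each function's variance across time points.
avg_fun_var <- mean(apply(f, 1, var))

# Noise variance is the stated ratio times the average function variance.
noise_var <- noise_to_signal * avg_fun_var
y <- f + matrix(rnorm(n_units * n_time, sd = sqrt(noise_var)), n_units, n_time)
```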