mase (version 0.1.2)

gregTree: Compute a regression tree estimator

Description

Calculates a regression tree estimator for a finite population mean or total based on sample data collected from a complex sampling design and auxiliary population data.

Usage

gregTree(y, x_sample, x_pop, pi = NULL, pi2 = NULL, var_est = FALSE,
  var_method = "lin_HB", B = 1000, p_value = 0.05, perm_reps = 500,
  bin_size = NULL, strata = NULL)

Arguments

y

A numeric vector of the sampled response variable.

x_sample

A data frame of the auxiliary data in the sample.

x_pop

A data frame of population-level auxiliary information. It must contain the same column names as x_sample.

pi

A numeric vector of inclusion probabilities for each sampled unit in y. If NULL, then simple random sampling without replacement is assumed.
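As an illustration of what the NULL default stands for, the sketch below builds the inclusion probabilities explicitly for simple random sampling without replacement, where every unit has probability n / N. The sizes N and n are hypothetical, and whether the package constructs the vector in exactly this way internally is an assumption.

```r
# Hypothetical sizes, chosen only for illustration.
N <- 6194   # population size
n <- 200    # sample size

# Under simple random sampling without replacement, every sampled unit
# has the same first-order inclusion probability n / N.
pi_srs <- rep(n / N, n)

# The inverse probabilities are the design weights; they sum to N.
sum(1 / pi_srs)
```

A vector like pi_srs could be passed as pi and should be equivalent to leaving pi = NULL under this design.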

pi2

A square matrix of the joint inclusion probabilities. Needed for the "lin_HT" variance estimator.
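For reference, a minimal sketch of such a matrix under simple random sampling without replacement, where any two distinct units are jointly included with probability n(n - 1) / (N(N - 1)). The sizes are hypothetical, and placing the first-order probabilities on the diagonal follows the usual convention; that the function expects the diagonal in this form is an assumption.

```r
# Hypothetical sizes, chosen only for illustration.
N <- 6194   # population size
n <- 200    # sample size

# Joint inclusion probability for any pair of distinct units under SRSWOR.
pi2_srs <- matrix(n * (n - 1) / (N * (N - 1)), nrow = n, ncol = n)

# By convention, the diagonal holds the first-order probabilities.
diag(pi2_srs) <- n / N

dim(pi2_srs)
```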

var_est

A logical indicating whether or not to compute a variance estimator. Default is FALSE.

var_method

The method to use when computing the variance estimator. Options are the Taylor linearized techniques "lin_HB" (Hajek-Berger estimator), "lin_HH" (Hansen-Hurwitz estimator), "lin_HTSRS" (Horvitz-Thompson estimator under simple random sampling without replacement), and "lin_HT" (Horvitz-Thompson estimator), or the resampling technique "bootstrap_SRS" (bootstrap variance estimator under simple random sampling without replacement). The default is "lin_HB".

B

The number of bootstrap samples if computing the bootstrap variance estimator. Default is 1000.
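To make the role of B concrete, here is a generic bootstrap variance sketch in base R. It resamples a hypothetical sample with replacement B times and takes the variance of the replicate estimates. It uses a plain expansion estimator rather than a regression tree, and it is not claimed to match the package's "bootstrap_SRS" implementation, which may treat resampling and finite population corrections differently.

```r
set.seed(1)

# Hypothetical population size and sample, for illustration only.
N <- 1000
y <- rnorm(50, mean = 10, sd = 2)
B <- 1000   # number of bootstrap samples

# Stand-in estimator: expansion estimate of the population total.
estimate_total <- function(y_s) N * mean(y_s)

# Re-estimate the total on B resamples drawn with replacement.
boot_totals <- replicate(B, estimate_total(sample(y, replace = TRUE)))

# The bootstrap variance is the variance across replicate estimates.
boot_var <- var(boot_totals)
boot_var
```

Larger values of B reduce the Monte Carlo noise in the variance estimate at the cost of run time.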

p_value

The significance level at which the null hypothesis is rejected in the permutation tests used to fit the regression tree. Default is 0.05.

perm_reps

An integer specifying the number of permutations for each permutation test run to fit the regression tree. Default value is 500.
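The interplay between p_value and perm_reps can be sketched with a generic permutation test for one candidate split. The data, the split point, and the test statistic (absolute difference in node means) are all hypothetical choices made for illustration; the statistic actually used when the tree is fitted is not described here and may differ.

```r
set.seed(1)

# Hypothetical data with a true shift in the mean of y at x = 0.5.
x <- runif(100)
y <- 5 + 3 * (x > 0.5) + rnorm(100)
left <- x <= 0.5   # candidate split

# Illustrative test statistic: absolute difference in node means.
split_stat <- function(y, left) abs(mean(y[left]) - mean(y[!left]))

perm_reps <- 500
p_value <- 0.05

observed <- split_stat(y, left)
# Permuting y breaks any association between y and the split.
permuted <- replicate(perm_reps, split_stat(sample(y), left))

# The add-one correction keeps the estimated p-value strictly positive.
p_hat <- (1 + sum(permuted >= observed)) / (perm_reps + 1)

# The split is kept only if the null hypothesis is rejected.
accept_split <- p_hat < p_value
```

With this construction the smallest attainable estimated p-value is 1 / (perm_reps + 1), so perm_reps bounds how small a p_value threshold can meaningfully be.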

bin_size

An integer specifying the minimum number of observations in each node.

strata

A factor vector of the stratum membership. If NULL, all units are put into the same stratum. Must have the same length as y.

Value

A list of output containing:

  • pop_total: Estimate of population total

  • pop_mean: Estimate of the population mean

  • pop_total_var: Estimated variance of population total estimate

  • pop_mean_var: Estimated variance of population mean estimate

  • weights: Survey weights produced by regression tree

  • tree: rpms object
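The sketch below shows one way the returned quantities can relate to each other, treating the terminal nodes of the tree as post-strata. The data and node counts are hypothetical, and it is an assumption (not stated above) that the returned weights satisfy pop_total = sum(weights * y) in this way.

```r
# Hypothetical sample: response, inclusion probabilities, terminal node.
y         <- c(10, 12, 11, 20, 22, 21, 19)
incl_prob <- rep(7 / 70, 7)
node      <- factor(c("a", "a", "a", "b", "b", "b", "b"))

# Hypothetical population counts per node, as would be obtained by
# dropping the population auxiliary data down the tree.
N_node <- c(a = 40, b = 30)

# Design-weighted estimate of each node's population size.
N_hat_node <- tapply(1 / incl_prob, node, sum)

# Post-stratification adjustment: scale design weights so they sum to
# the known node counts.
ratio   <- N_node[levels(node)] / as.numeric(N_hat_node[levels(node)])
weights <- unname((1 / incl_prob) * ratio[as.character(node)])

pop_total <- sum(weights * y)
pop_mean  <- pop_total / sum(N_node)
```

In this construction the weights within each node sum to that node's population count, so the total equals the sum over nodes of the node count times the weighted node mean.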

References

mcc17bmase

See Also

greg for a linear or logistic regression model.

Examples

library(mase)
library(survey)
data(api)

# Auxiliary variables present in both the sample (apisrs) and the population (apipop)
aux_vars <- c("col.grad", "awards", "snum", "dnum", "cnum", "pcttest", "meals", "sch.wide")

gregTree(y = apisrs$api00,
         x_sample = apisrs[aux_vars],
         x_pop = apipop[aux_vars])
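A variant of the example that also requests a variance estimate might look like the following. It uses only arguments listed under Usage. The choice of "lin_HTSRS" is an illustrative assumption based on apisrs being a simple random sample, and the guard simply skips the call when the packages are not installed.

```r
if (requireNamespace("mase", quietly = TRUE) &&
    requireNamespace("survey", quietly = TRUE)) {
  data(api, package = "survey")

  # Same auxiliary variables as in the example above.
  aux_vars <- c("col.grad", "awards", "snum", "dnum", "cnum",
                "pcttest", "meals", "sch.wide")

  est <- mase::gregTree(y = apisrs$api00,
                        x_sample = apisrs[aux_vars],
                        x_pop = apipop[aux_vars],
                        var_est = TRUE,
                        var_method = "lin_HTSRS")

  # Point estimate of the population mean and its estimated variance.
  print(est$pop_mean)
  print(est$pop_mean_var)
}
```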
