mase (version 0.1.3)

gregTree: Compute a regression tree estimator

Description

Calculates a regression tree estimator for a finite population mean/proportion or total based on sample data collected from a complex sampling design and auxiliary population data.

Usage

gregTree(
  y,
  xsample,
  xpop,
  pi = NULL,
  pi2 = NULL,
  var_est = FALSE,
  var_method = "LinHB",
  B = 1000,
  pval = 0.05,
  perm_reps = 500,
  bin_size = NULL
)

Value

A list of output containing:

  • pop_total: Estimate of the population total

  • pop_mean: Estimate of the population mean (or proportion)

  • weights: Survey weights produced by gregTree

  • pop_total_var: Estimated variance of the population total estimate

  • pop_mean_var: Estimated variance of the population mean estimate

Arguments

y

A numeric vector of the sampled response variable.

xsample

A data frame of the auxiliary data in the sample.

xpop

A data frame of population-level auxiliary information. It must contain the same column names as xsample. If datatype = "raw", it must contain unit-level data. If datatype = "totals" or "means", it contains one row of aggregated population totals or means for the auxiliary data. Default is "raw".

pi

A numeric vector of inclusion probabilities for each sampled unit in y. If NULL, then simple random sampling without replacement is assumed.

pi2

A square matrix of the joint inclusion probabilities. Needed for the "LinHT" variance estimator.

var_est

A logical indicating whether or not to compute a variance estimator. Default is FALSE.

var_method

The method to use when computing the variance estimator. The options are either a Taylor linearization technique ("LinHB" = Hajek-Berger estimator, "LinHH" = Hansen-Hurwitz estimator, "LinHTSRS" = Horvitz-Thompson estimator under simple random sampling without replacement, "LinHT" = Horvitz-Thompson estimator) or a resampling technique ("bootstrapSRS" = bootstrap variance estimator under simple random sampling without replacement). The default is "LinHB".

B

The number of bootstrap samples if computing the bootstrap variance estimator. Default is 1000.

pval

The significance level at which the null hypothesis is rejected in the permutation tests used to fit the regression tree. Default value is 0.05.

perm_reps

An integer specifying the number of permutations for each permutation test run to fit the regression tree. Default value is 500.

bin_size

An integer specifying the minimum number of observations in each node.
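The pi and pi2 arguments can be built by hand when the design is simple. The following is a minimal base R sketch, assuming simple random sampling without replacement; the sizes n and N and the object names pi_srs and pi2_srs are hypothetical and chosen only for illustration.

```r
# Hypothetical sizes: a sample of n units drawn from a population of N units.
n <- 200
N <- 6194

# First-order inclusion probabilities under SRSWOR: pi_i = n / N for every unit.
pi_srs <- rep(n / N, n)

# Joint inclusion probabilities under SRSWOR:
# pi_ij = n (n - 1) / (N (N - 1)) for i != j, with pi_ii = pi_i on the diagonal.
pi2_srs <- matrix(n * (n - 1) / (N * (N - 1)), nrow = n, ncol = n)
diag(pi2_srs) <- pi_srs
```

Objects shaped like these would be passed as pi = pi_srs and pi2 = pi2_srs. Under simple random sampling, pi can instead be left as NULL, and pi2 matters only when var_method = "LinHT".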

Examples

library(survey)  # provides the api data sets
library(mase)    # provides gregTree()
data(api)

# Auxiliary variables available in both the sample and the population
aux_vars <- c("col.grad", "awards", "snum", "dnum", "cnum", "pcttest", "meals", "sch.wide")

gregTree(y = apisrs$api00,
         xsample = apisrs[aux_vars],
         xpop = apipop[aux_vars])
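To also obtain variance estimates, var_est can be switched on. The sketch below relies only on the arguments and output names documented above; the object names aux_vars and fit are illustrative, and "LinHTSRS" is chosen on the assumption that apisrs is treated as a simple random sample without replacement.

```r
library(survey)  # provides the api data sets
library(mase)    # provides gregTree()
data(api)

# Auxiliary variables present in both the sample and the population
aux_vars <- c("col.grad", "awards", "snum", "dnum", "cnum", "pcttest", "meals", "sch.wide")

# Request a variance estimate alongside the point estimates
fit <- gregTree(
  y = apisrs$api00,
  xsample = apisrs[aux_vars],
  xpop = apipop[aux_vars],
  var_est = TRUE,
  var_method = "LinHTSRS"
)

fit$pop_mean            # estimated population mean
sqrt(fit$pop_mean_var)  # estimated standard error of the mean
```

The square root of pop_mean_var gives a standard error on the scale of the response, which is usually easier to interpret than the variance itself.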