Calculates a regression tree estimator for a finite population mean/proportion or total based on sample data collected from a complex sampling design and auxiliary population data.
gregTree(
  y,
  xsample,
  xpop,
  pi = NULL,
  pi2 = NULL,
  var_est = FALSE,
  var_method = "LinHB",
  B = 1000,
  pval = 0.05,
  perm_reps = 500,
  bin_size = NULL,
  fpc = TRUE,
  messages = TRUE
)
A list of output containing:
* pop_total: Estimate of population total.
* pop_mean: Estimate of the population mean (or proportion).
* weights: Survey weights produced by gregTree.
* pop_total_var: Estimated variance of population total estimate.
* pop_mean_var: Estimated variance of population mean estimate.
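To make the relationship between weights, pop_total, and pop_mean concrete, here is a minimal base R sketch of a tree-style estimator. It assumes a single fixed split at x = 40 as a stand-in for the terminal nodes that gregTree would choose by permutation tests. The data and all names (x_pop, node_s, w, and so on) are invented for illustration, and this is not mase's internal code. Each weight is the design weight rescaled so that the weights in a node sum to that node's population count.

```r
# Toy population with one auxiliary variable; all values are invented for illustration.
set.seed(1)
N <- 1000
x_pop <- runif(N, 0, 100)
y_pop <- ifelse(x_pop < 40, 10, 25) + rnorm(N)

# Simple random sample without replacement, so every unit has inclusion probability n / N.
n <- 100
s <- sample(N, n)
x_s <- x_pop[s]
y_s <- y_pop[s]
pi_s <- rep(n / N, n)

# Hypothetical fixed split at x = 40, standing in for a fitted tree's terminal nodes.
node_pop <- ifelse(x_pop < 40, "left", "right")
node_s <- ifelse(x_s < 40, "left", "right")

# Population count in each node, and the design-weighted sample estimate of that count.
N_node <- table(node_pop)
N_hat_node <- tapply(1 / pi_s, node_s, sum)

# Weight: design weight rescaled so each node's weights sum to its population count.
w <- (1 / pi_s) * as.numeric(N_node[node_s] / N_hat_node[node_s])

# Estimates analogous to pop_total and pop_mean.
pop_total <- sum(w * y_s)
pop_mean <- pop_total / N
```

Under this construction the weights sum to N, and with equal inclusion probabilities the total reduces to the sum over nodes of the node's population count times its sample mean, which is the familiar post-stratified form.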
* y: A numeric vector of the sampled response variable.
* xsample: A data frame of the auxiliary data in the sample.
* xpop: A data frame of population-level auxiliary information. It must contain the same column names as xsample and, since gregTree has no datatype argument, unit-level data (one row per population unit).
* pi: A numeric vector of inclusion probabilities for each sampled unit in y. If NULL, simple random sampling without replacement is assumed.
* pi2: A square matrix of the joint inclusion probabilities. Needed for the "LinHT" variance estimator.
* var_est: A logical indicating whether to compute a variance estimator. Default is FALSE.
* var_method: The method to use when computing the variance estimator. Options are the Taylor linearization techniques "LinHB" (Hajek-Berger estimator), "LinHH" (Hansen-Hurwitz estimator), "LinHTSRS" (Horvitz-Thompson estimator under simple random sampling without replacement), and "LinHT" (Horvitz-Thompson estimator), or the resampling technique "bootstrapSRS" (bootstrap variance estimator under simple random sampling without replacement). Default is "LinHB".
* B: The number of bootstrap samples when computing the bootstrap variance estimator. Default is 1000.
* pval: The significance level at which the null hypothesis is rejected in the permutation tests used to fit the regression tree. Default is 0.05.
* perm_reps: An integer specifying the number of permutations for each permutation test run to fit the regression tree. Default is 500.
* bin_size: An integer specifying the minimum number of observations in each node. Default is NULL.
* fpc: A logical indicating whether the variance calculation should include a finite population correction when calculating the "LinHTSRS" or "bootstrapSRS" variance estimator. Default is TRUE.
* messages: A logical indicating whether to output the messages internal to mase. Default is TRUE.
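The simple random sampling without replacement (SRSWOR) assumption behind the pi default, the "LinHTSRS" estimator, and the fpc argument can be illustrated with textbook quantities. The sketch below uses made-up sizes and data, and hypothetical names (pi_s, pi2_s) chosen to avoid masking R's built-in pi constant. It shows the standard SRSWOR inclusion probabilities and the effect of the finite population correction on the variance of a plain sample mean, which is not the linearized variance that mase computes for the tree estimator.

```r
# Hypothetical sizes: n sampled units out of N population units.
n <- 5
N <- 50

# First-order inclusion probabilities under SRSWOR: every unit has probability n / N.
pi_s <- rep(n / N, n)

# Joint inclusion probabilities: n(n - 1) / (N(N - 1)) off the diagonal,
# and the first-order probabilities on the diagonal.
pi2_s <- matrix(n * (n - 1) / (N * (N - 1)), nrow = n, ncol = n)
diag(pi2_s) <- pi_s

# Variance of a sample mean under SRSWOR, without and with the correction (1 - n / N).
y <- c(12, 15, 9, 20, 14)
var_no_fpc <- var(y) / n
var_fpc <- (1 - n / N) * var(y) / n
```

A matrix built this way has the shape expected for pi2 (square, one row and column per sampled unit). The correction shrinks the variance by the sampling fraction, so it matters most when the sample is a sizeable share of the population.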
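library(mase)
library(dplyr)

data(IdahoPop)
data(IdahoSamp)

# Restrict both the sample and the population to one county.
xsample <- filter(IdahoSamp, COUNTYFIPS == "16055")
xpop <- filter(IdahoPop, COUNTYFIPS == "16055")

# Regression tree estimate of BA_TPA_ADJ using tcc and elev as auxiliary variables.
gregTree(y = xsample$BA_TPA_ADJ,
         xsample = xsample[c("tcc", "elev")],
         xpop = xpop[c("tcc", "elev")],
         var_est = TRUE)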
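Once var_est = TRUE has been requested, the returned pop_mean and pop_mean_var can be turned into a standard error and an approximate interval. The helper below is hypothetical and not part of mase. It assumes only the documented element names and a normal approximation, and it is demonstrated on a stand-in list with made-up numbers rather than on real gregTree output.

```r
# Hypothetical helper (not part of mase): standard error and a
# normal-approximation confidence interval for the estimated mean.
summarise_mean <- function(fit, level = 0.95) {
  se <- sqrt(fit$pop_mean_var)
  z <- qnorm(1 - (1 - level) / 2)
  c(estimate = fit$pop_mean,
    se = se,
    lower = fit$pop_mean - z * se,
    upper = fit$pop_mean + z * se)
}

# Stand-in list with made-up numbers, shaped like the documented output.
fit <- list(pop_mean = 100, pop_mean_var = 4)
res <- summarise_mean(fit)
```

The same function could be applied to the list returned by the gregTree call above, provided the variance elements are present.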