subpop conducts set inference on the groups of the most and least affected units. When subgroup = NULL, the output is for the whole sample; otherwise the results are for the specified subgroup. The output of subpop is a list containing six components: cs_most, cs_least, u, subgroup, most and least. As the names indicate, cs_most and cs_least denote the confidence sets for the most and least affected units. u stores the index u that defines the u-th most and least affected groups. subgroup stores the indicators for the subpopulation. most and least store the data of the most and least affected groups. The confidence sets can be visualized using the plot.subpop command, while the two groups can be tabulated via the summary.subpop command.
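To make the notion of "most and least affected" concrete, the following base-R sketch classifies units by their estimated partial effect. It is only an illustration under stated assumptions, not the package's internal implementation: the data frame sim, its columns d, x and y, and the quantile-cutoff rule are all hypothetical choices made for this example.

```r
set.seed(1)
n <- 500

# Hypothetical simulated data: binary outcome y, binary variable d, covariate x
sim <- data.frame(d = rbinom(n, 1, 0.5), x = rnorm(n))
sim$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * sim$d + 0.6 * sim$x))

# Logit model, analogous in spirit to method = "logit"
fit <- glm(y ~ d + x, data = sim, family = binomial(link = "logit"))

# Partial effect of d for each unit: P(y = 1 | d = 1, x) - P(y = 1 | d = 0, x)
pe <- predict(fit, newdata = transform(sim, d = 1), type = "response") -
  predict(fit, newdata = transform(sim, d = 0), type = "response")

# Assumed rule: units at or below the u-quantile of the partial effects are
# "least affected"; units at or above the (1 - u)-quantile are "most affected"
u <- 0.1
cut_least <- quantile(pe, probs = u)
cut_most <- quantile(pe, probs = 1 - u)
least <- sim[pe <= cut_least, ]
most <- sim[pe >= cut_most, ]

# Compare average characteristics of the two groups
colMeans(most)
colMeans(least)
```

This sketch only produces point classifications; subpop additionally uses the bootstrap to build confidence sets around these groups.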
subpop(
  fm,
  data,
  method = c("ols", "logit", "probit", "QR"),
  var_type = c("binary", "continuous", "categorical"),
  var,
  compare,
  subgroup = NULL,
  samp_weight = NULL,
  taus = c(5:95)/100,
  u = 0.1,
  alpha = 0.1,
  b = 500,
  seed = 1,
  parallel = FALSE,
  ncores = detectCores(),
  boot_type = c("nonpar", "weighted")
)
fm: Regression formula.
data: The data in use.
method: Model to be used for estimating partial effects. Four options: "logit" (binary response), "probit" (binary response), "ols" (interactive linear with additive errors), "QR" (linear model with non-additive errors). Default is "ols".
var_type: The type of the parameter of interest. Three options: "binary", "categorical", "continuous". Default is "binary".
var: Variable T of interest. Should be a character string.
compare: If the parameter of interest is categorical, the user needs to specify which two categories to compare. Should be a 1 by 2 character vector. For example, if the two levels to compare are 1 and 3, then compare = c("1", "3"), which will calculate the partial effect from 1 to 3. To use this option, users first need to specify var as a factor variable.
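As a minimal sketch of the preparation described above, the following base-R snippet converts a categorical variable to a factor and checks that the two levels to compare exist. The data frame dat and its columns educ and y are hypothetical, and the subpop call is shown only as a comment.

```r
# Hypothetical data: a categorical variable coded with the values 1, 2, 3
dat <- data.frame(educ = c(1, 3, 2, 3, 1, 2), y = c(0, 1, 0, 1, 1, 0))

# The variable of interest must be a factor before it is passed as var
dat$educ <- factor(dat$educ)

# The two levels to compare, as a character vector of length 2 (effect from 1 to 3)
compare <- c("1", "3")

# Sanity check: both levels must exist in the factor
stopifnot(is.factor(dat$educ), all(compare %in% levels(dat$educ)))

# Hypothetical call, not run here:
# subpop(y ~ educ, data = dat, method = "logit", var_type = "categorical",
#        var = "educ", compare = compare)
```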
subgroup: Subgroup of interest. Default is NULL. The specification should be a logical vector. For example, suppose data contains an indicator variable for women (female if 1, male if 0). If users are interested in the SPE for women, then they should specify subgroup = data[, "female"] == 1.
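The subgroup indicator described above can be built with an ordinary logical comparison. A minimal sketch, in which the data frame dat and its columns are hypothetical:

```r
# Hypothetical data with an indicator for women (1 = female, 0 = male)
dat <- data.frame(female = c(1, 0, 1, 1, 0), y = c(0, 1, 0, 1, 1))

# Logical vector with one entry per row; TRUE marks members of the subgroup
subgroup <- dat[, "female"] == 1

# Number of observations in the subgroup
n_women <- sum(subgroup)
n_women
```

The vector must have one entry per row of the data, so it should be built from the same data frame that is passed to subpop.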
samp_weight: Sampling weights of the data. Input should be an n by 1 vector, where n denotes the sample size. Default is NULL.
taus: Quantile indexes for quantile regression. Default is c(5:95)/100.
u: Percentile that defines the most and least affected groups. Default is 0.1.
alpha: Size (significance level) of the confidence sets. Should be between 0 and 1. Default is 0.1.
b: Number of bootstrap draws. Default is 500.
seed: Seed for pseudo-random number generation, for reproducibility. Default is 1.
parallel: Whether the user wants to use parallel computation. The default is FALSE, in which case only 1 CPU is used. The other option is TRUE, in which case the user can specify the number of CPUs in the ncores option.
ncores: Number of cores for computation. Default is detectCores(), a function from package parallel that detects the number of CPUs on the current host. For large datasets, parallel computing is highly recommended since the bootstrap is time-consuming.
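One way to choose a value for ncores is sketched below. Leaving one core free for the rest of the system is a common convention assumed here, not a package requirement, and the subpop call is shown only as a comment.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to a single core
available <- detectCores()
if (is.na(available)) available <- 1L

# Assumed convention: leave one core free, but always use at least one
ncores <- max(1L, available - 1L)
ncores

# Hypothetical call, not run here:
# subpop(fm, data = mortgage, method = "logit", var = "black",
#        parallel = TRUE, ncores = ncores)
```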
boot_type: Type of bootstrap. Default is "nonpar", in which case the package implements the nonparametric bootstrap. The alternative is "weighted", in which case the package implements the weighted bootstrap.
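The difference between the two bootstrap types can be illustrated on a simple statistic. This is a generic sketch for a sample mean, not the package's code. In particular, the standard exponential weights used for the weighted bootstrap are an assumption, since the weight distribution is not stated here.

```r
set.seed(1)
x <- rnorm(200, mean = 2)  # hypothetical sample
b <- 500                   # number of bootstrap draws
alpha <- 0.1

# Nonparametric bootstrap: resample observations with replacement
boot_nonpar <- replicate(b, mean(sample(x, replace = TRUE)))

# Weighted bootstrap: keep every observation, draw random positive weights
# (assumed here to be i.i.d. standard exponential)
boot_weighted <- replicate(b, {
  w <- rexp(length(x))
  weighted.mean(x, w)
})

# Percentile intervals with nominal coverage 1 - alpha
ci_nonpar <- quantile(boot_nonpar, c(alpha / 2, 1 - alpha / 2))
ci_weighted <- quantile(boot_weighted, c(alpha / 2, 1 - alpha / 2))
ci_nonpar
ci_weighted
```

A practical difference is that the weighted bootstrap never drops an observation from a draw, which can help when some covariate cells are small.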
# NOT RUN {
library(SortedEffects)
data("mortgage")
### Regression specification
fm <- deny ~ black + p_irat + hse_inc + ccred + mcred + pubrec +
  ltv_med + ltv_high + denpmi + selfemp + single + hischl
### Issue the subpop command (b = 50 bootstrap draws keeps the example fast)
set_b <- subpop(fm, data = mortgage, method = "logit", var = "black",
                u = 0.1, alpha = 0.1, b = 50)
# }
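Once the call has run, the returned list can be inspected by component name. The sketch below assumes the package providing subpop and the mortgage data is SortedEffects, and it skips the computation when that package is not installed. The component names are the ones listed in the description above.

```r
# Component names as listed in the description of the return value
expected <- c("cs_most", "cs_least", "u", "subgroup", "most", "least")

# Assumption: subpop and the mortgage data come from the SortedEffects package
if (requireNamespace("SortedEffects", quietly = TRUE)) {
  data("mortgage", package = "SortedEffects")
  fm <- deny ~ black + p_irat + hse_inc + ccred + mcred + pubrec +
    ltv_med + ltv_high + denpmi + selfemp + single + hischl
  set_b <- SortedEffects::subpop(fm, data = mortgage, method = "logit",
                                 var = "black", u = 0.1, alpha = 0.1, b = 50)

  # List the components that were returned
  print(names(set_b))

  # Any documented component that is missing would show up here
  print(setdiff(expected, names(set_b)))

  # Inspect the stored percentile index
  str(set_b$u)
}
```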