tilt.boot
Non-parametric Tilted Bootstrap
This function will run an initial bootstrap with equal resampling probabilities (if required) and will use the output of the initial run to find resampling probabilities which put the value of the statistic at required values. It then runs an importance resampling bootstrap using the calculated probabilities as the resampling distribution.
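As an illustration of the reweighting step, the following is a minimal base R sketch of exponential tilting for the sample mean. It is not the package's implementation: the data vector x, the target value theta and the helper names tilted.weights and gap are assumptions made for this example, and the empirical influence values are taken to be x - mean(x), which holds for the mean.

```r
# Hypothetical data and target value, chosen only for illustration.
x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7)
t0 <- mean(x)      # observed statistic
L <- x - t0        # empirical influence values of the sample mean
theta <- 4         # value at which the tilted distribution should centre the mean

# Exponentially tilted resampling probabilities: p_j proportional to exp(lambda * L_j).
tilted.weights <- function(lambda, L) {
  w <- exp(lambda * L)
  w / sum(w)
}

# Choose lambda so that the tilted expectation of L equals theta - t0.
gap <- function(lambda) sum(tilted.weights(lambda, L) * L) - (theta - t0)
lambda <- uniroot(gap, interval = c(-10, 10), tol = 1e-10)$root
p <- tilted.weights(lambda, L)

# Importance resampling then draws indices with these unequal probabilities.
set.seed(1)
idx <- sample(length(x), replace = TRUE, prob = p)
```

Under the tilted probabilities p the weighted mean of x equals theta, so resamples drawn with prob = p concentrate around the required value rather than around t0.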
- Keywords
- nonparametric
Usage
tilt.boot(data, statistic, R, sim = "ordinary", stype = "i",
          strata = rep(1, n), L = NULL, theta = NULL,
          alpha = c(0.025, 0.975), tilt = TRUE, width = 0.5,
          index = 1, ...)
Arguments
- data
The data as a vector, matrix or data frame. If it is a matrix or data frame then each row is considered as one (multivariate) observation.
- statistic
A function which, when applied to data, returns a vector containing the statistic(s) of interest. It must take at least two arguments. The first argument will always be data and the second should be a vector of indices, weights or frequencies describing the bootstrap sample. Any other arguments must be supplied to tilt.boot and will be passed unchanged to statistic each time it is called.
- R
The number of bootstrap replicates required. This will generally be a vector, the first value stating how many uniform bootstrap simulations are to be performed at the initial stage. The remaining values of R are the numbers of simulations to be performed resampling from each reweighted distribution. The first value of R must always be present, a value of 0 implying that no uniform resampling is to be carried out. Thus length(R) should always equal 1 + length(theta).
- sim
A character string indicating the type of bootstrap simulation required. Only two values are possible: "ordinary" and "balanced". If other simulation types are required for the initial unweighted bootstrap then it will be necessary to run boot, calculate the weights appropriately, and run boot again using the calculated weights.
- stype
A character string indicating the type of second argument expected by statistic. The possible values of stype are "i" (indices), "w" (weights) and "f" (frequencies).
- strata
An integer vector or factor representing the strata for multi-sample problems.
- L
The empirical influence values for the statistic of interest. They are used only for exponential tilting, that is, when tilt is TRUE. If tilt is TRUE and they are not supplied then tilt.boot uses empinf to calculate them.
- theta
The required parameter value(s) for the tilted distribution(s). There should be one value of theta for each of the non-uniform distributions. If R[1] is 0 then theta is a required argument. Otherwise theta values can be estimated from the initial uniform bootstrap and the values in alpha.
- alpha
The alpha level to which tilting is required. This parameter is ignored if R[1] is 0 or if theta is supplied; otherwise it is used to find the values of theta as quantiles of the initial uniform bootstrap. In this case R[1] should be large enough that min(c(alpha, 1-alpha))*R[1] > 5. If this is not the case then a warning is generated to the effect that the theta are extreme values and so the tilted output may be unreliable.
- tilt
A logical variable which, if TRUE (the default), indicates that exponential tilting should be used; otherwise local frequency smoothing (smooth.f) is used. If tilt is FALSE then R[1] must be positive. In fact, in this case the value of R[1] should be fairly large (in the region of 500 or more).
- width
This argument is used only if tilt is FALSE, in which case it is passed unchanged to smooth.f as the standardized bandwidth for the smoothing operation. The value should generally be in the range (0.2, 1). See smooth.f for more details.
- index
The index of the statistic of interest in the output from statistic. By default the first element of the output of statistic is used.
- ...
Any additional arguments required by statistic. These are passed unchanged to statistic each time it is called.
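The three stype conventions describe the same bootstrap sample in different ways. The sketch below uses only base R to show a mean statistic written for each convention; the function names mean.i, mean.f and mean.w and the data vector x are assumptions made for this example, and the weights are taken to be frequencies scaled to sum to one.

```r
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # hypothetical data
n <- length(x)

# One statistic, written for each type of second argument.
mean.i <- function(data, i) mean(data[i])                # stype = "i": indices
mean.f <- function(data, f) sum(data * f) / sum(f)       # stype = "f": frequencies
mean.w <- function(data, w) sum(data * w) / sum(w)       # stype = "w": weights

# A single bootstrap sample expressed in the three forms.
set.seed(42)
i <- sample(n, replace = TRUE)    # resampled indices
f <- tabulate(i, nbins = n)       # how often each observation was drawn
w <- f / n                        # frequencies scaled to sum to one

res <- c(mean.i(x, i), mean.f(x, f), mean.w(x, w))
```

All three elements of res agree, which is why the choice of stype is a matter of which form makes statistic easiest to write.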
Value
An object of class "boot" with the following components:
- t0
The observed value of the statistic on the original data.
- t
The values of the bootstrap replicates of the statistic. There will be sum(R) of these, the first R[1] corresponding to the uniform bootstrap and the remainder to the tilted bootstrap(s).
- R
The input vector of the number of bootstrap replicates.
- data
The original data as supplied.
- statistic
The statistic function as supplied.
- sim
The simulation type used in the bootstrap(s); it is either "ordinary" or "balanced".
- stype
The type of statistic supplied; it is the same as the input value stype.
- call
A copy of the original call to tilt.boot.
- strata
The strata as supplied.
- weights
The matrix of weights used. If R[1] is greater than 0 then the first row will be the uniform weights and each subsequent row the tilted weights. If R[1] equals 0 then the uniform weights are omitted and only the tilted weights are output.
- theta
The values of theta used for the tilted distributions. These are either the input values or the values derived from the uniform bootstrap and alpha.
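The layout of the returned object can be checked on a small problem. The following is a sketch only, assuming the boot package is installed; the simulated data x and the weighted-mean statistic wmean are hypothetical and were chosen so that the call runs quickly.

```r
library(boot)

set.seed(1)
x <- rexp(30)                                   # hypothetical data
wmean <- function(d, w) sum(d * w) / sum(w)     # statistic for stype = "w"

# 249 uniform replicates, then 100 from each of two tilted distributions.
# theta is not supplied, so it is taken from quantiles of the uniform
# replicates at the default alpha = c(0.025, 0.975).
R <- c(249, 100, 100)
out <- tilt.boot(x, wmean, R = R, stype = "w")

nrow(out$t)        # number of replicates, sum(R)
dim(out$weights)   # one row of uniform weights plus one row per tilted distribution
out$theta          # the two values of theta derived from alpha
```

Rows 1 to R[1] of out$t come from the uniform bootstrap; the remaining rows come from the tilted distributions in the order given by out$theta.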
References
Booth, J.G., Hall, P. and Wood, A.T.A. (1993) Balanced importance resampling for the bootstrap. Annals of Statistics, 21, 286–298.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Hinkley, D.V. and Shi, S. (1989) Importance sampling and the nested bootstrap. Biometrika, 76, 435–446.
See Also
Examples
# NOT RUN {
# Note that these examples can take a while to run.
# Example 9.9 of Davison and Hinkley (1997).
library(boot)  # provides tilt.boot and the gravity and acme data sets
grav1 <- gravity[as.numeric(gravity[,2]) >= 7, ]
grav.fun <- function(dat, w, orig) {
    strata <- tapply(dat[, 2], as.numeric(dat[, 2]))
    d <- dat[, 1]
    ns <- tabulate(strata)
    w <- w/tapply(w, strata, sum)[strata]  # normalise weights within strata
    mns <- as.vector(tapply(d * w, strata, sum)) # drop names
    mn2 <- tapply(d * d * w, strata, sum)
    s2hat <- sum((mn2 - mns^2)/ns)
    c(mns[2]-mns[1], s2hat, (mns[2]-mns[1]-orig)/sqrt(s2hat))
}
grav.z0 <- grav.fun(grav1, rep(1, 26), 0)
tilt.boot(grav1, grav.fun, R = c(249, 375, 375), stype = "w",
          strata = grav1[,2], tilt = TRUE, index = 3, orig = grav.z0[1])
# Example 9.10 of Davison and Hinkley (1997) requires a balanced
# importance resampling bootstrap to be run. In this example we
# show how this might be run.
acme.fun <- function(data, i, bhat) {
    d <- data[i,]
    n <- nrow(d)
    d.lm <- glm(d$acme~d$market)
    beta.b <- coef(d.lm)[2]
    d.diag <- boot::glm.diag(d.lm)
    SSx <- (n-1)*var(d$market)
    tmp <- (d$market-mean(d$market))*d.diag$res*d.diag$sd
    sr <- sqrt(sum(tmp^2))/SSx
    c(beta.b, sr, (beta.b-bhat)/sr)
}
acme.b <- acme.fun(acme, 1:nrow(acme), 0)
acme.boot1 <- tilt.boot(acme, acme.fun, R = c(499, 250, 250),
                        stype = "i", sim = "balanced", alpha = c(0.05, 0.95),
                        tilt = TRUE, index = 3, bhat = acme.b[1])
# }