resample-package
Overview of the resample package
Resampling functions, including one- and two-sample bootstrap and permutation tests, with an easy-to-use syntax.
- Keywords
- htest, nonparametric
Details
See library(help = resample) for version number, date, etc.
Data Sets
A list of datasets is at resample-data.
Main resampling functions
The main resampling functions are: bootstrap, bootstrap2, permutationTest, permutationTest2.
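For readers new to resampling, the following base-R sketch shows the idea behind a one-sample bootstrap and a two-sample permutation test. It does not use the package and is not how the package implements these functions. The helpers boot_one() and perm_two() are hypothetical names, and the data are simulated purely for illustration.

```r
# Base-R sketch only; NOT the resample package's implementation.
# boot_one() and perm_two() are hypothetical helper names.
set.seed(1)
x <- rexp(30, rate = 1 / 10)   # made-up skewed sample
y <- rexp(40, rate = 1 / 8)    # made-up second sample

# One-sample bootstrap: resample x with replacement, recompute the statistic.
boot_one <- function(x, statistic, R = 2000) {
  replicate(R, statistic(sample(x, replace = TRUE)))
}
reps <- boot_one(x, mean)
c(observed = mean(x), se = sd(reps), bias = mean(reps) - mean(x))

# Two-sample permutation test for a difference in means:
# shuffle the pooled values and split them into groups of the original sizes.
perm_two <- function(x, y, R = 2000) {
  pooled <- c(x, y)
  n <- length(x)
  observed <- mean(x) - mean(y)
  null <- replicate(R, {
    idx <- sample(length(pooled), n)
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # One-sided p-value (alternative "greater"), counting the observed value.
  p <- (sum(null >= observed) + 1) / (R + 1)
  list(observed = observed, p.value = p)
}
perm_two(x, y)
```

The package functions wrap this kind of loop and add printing, plotting, and confidence interval support.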
Methods
Methods for generic functions include: print.resample, plot.resample, hist.resample, qqnorm.resample, and quantile.resample.
Confidence Intervals
Functions that calculate confidence intervals for bootstrap and bootstrap2 objects: CI.bca, CI.bootstrapT, CI.percentile, CI.t.
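As a rough illustration of what the two simplest intervals compute, here is a base-R sketch that builds a plain percentile interval and a t interval from a vector of bootstrap replicates. It is not the package code: the percentile interval here is the unexpanded version, the choice of n - 1 degrees of freedom is an assumption of this sketch, and the data are simulated.

```r
# Base-R sketch only; NOT the package's CI.percentile or CI.t.
set.seed(2)
x <- rexp(25, rate = 1 / 10)                            # made-up sample
reps <- replicate(2000, mean(sample(x, replace = TRUE)))

probs <- c(0.025, 0.975)

# Plain (unexpanded) percentile interval: quantiles of the replicates.
pct <- quantile(reps, probs)

# t interval: observed statistic plus t quantiles times the bootstrap SE.
# Using n - 1 degrees of freedom is an assumption of this sketch.
n <- length(x)
t_int <- mean(x) + qt(probs, df = n - 1) * sd(reps)

pct
t_int
```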
Samplers
Functions that generate indices for random samples: samp.bootstrap, samp.permute.
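To make "indices for random samples" concrete, the sketch below generates index matrices in base R. The functions sketch_samp_bootstrap() and sketch_samp_permute() are hypothetical stand-ins, not the package's samplers, and the layout (one column per resample) is an assumption of this sketch.

```r
# Base-R sketch only; hypothetical stand-ins for the package samplers.

# Bootstrap indices: each column holds n draws from 1:n with replacement.
sketch_samp_bootstrap <- function(n, R) {
  matrix(sample.int(n, n * R, replace = TRUE), nrow = n, ncol = R)
}

# Permutation indices: each column is an independent permutation of 1:n.
sketch_samp_permute <- function(n, R) {
  replicate(R, sample.int(n))
}

set.seed(3)
b <- sketch_samp_bootstrap(5, 4)
p <- sketch_samp_permute(5, 4)

# Applying one column of indices to a data vector gives one resample.
x <- c(10, 20, 30, 40, 50)
x[b[, 1]]
x[p[, 1]]
```

Separating index generation from the statistic lets the same resampling loop serve both bootstrap and permutation procedures.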
Low-level Resampling Function
This is called by the main resampling functions, but can also be called directly: resample.
New Versions
I will post the newest versions to http://www.timhesterberg.net/r-packages. See that page to join a list for announcements of new versions.
Examples
data(Verizon)
ILEC <- with(Verizon, Time[Group == "ILEC"])
CLEC <- with(Verizon, Time[Group == "CLEC"])
#### Sections in this set of examples
### Different ways to specify the data and statistic
### Example with plots and confidence intervals.
### Different ways to specify the data and statistic
# This code is flexible; there are different ways to call it,
# depending on how the data are stored and on the statistic.
## One-sample Bootstrap
## Not run:
# # Ordinary vector, give statistic as a function
# bootstrap(CLEC, mean)
#
# # Vector by name, give statistic as an expression
# bootstrap(CLEC, mean(CLEC))
#
# # Vector created by an expression, use the name 'data'
# bootstrap(with(Verizon, Time[Group == "CLEC"]), mean(data))
#
# # A column in a data frame; use the name of the column
# temp <- data.frame(foo = CLEC)
# bootstrap(temp, mean(foo))
#
# # Put function arguments into an expression
# bootstrap(CLEC, mean(CLEC, trim = .25))
#
# # Put function arguments into a separate list
# bootstrap(CLEC, mean, args.stat = list(trim = .25))
# ## End(Not run)
## One-sample jackknife
# Like bootstrap. E.g.
jackknife(CLEC, mean)
## One-sample permutation test
# To test H0: two variables are independent, exactly
# one of them must be permuted. For the CLEC data,
# we'll create an artificial variable.
CLEC2 <- data.frame(Time = CLEC, index = 1:length(CLEC))
## Not run:
# permutationTest(CLEC2, cor(Time, index),
# resampleColumns = "index")
# # Could permute "Time" instead.
#
# # resampleColumns not needed for variables outside 'data'
# permutationTest(CLEC, cor(CLEC, 1:length(CLEC)))
# ## End(Not run)
### Two-sample problems
## Different ways to specify data and statistic
## Two-sample bootstrap
# Two data objects (one for each group)
## Not run: bootstrap2(CLEC, data2 = ILEC, mean)
# data frame containing y variable(s) and a treatment variable
## Not run: bootstrap2(Verizon, mean(Time), treatment = Group)
# treatment variable as a separate object
temp <- Verizon$Group
## Not run: bootstrap2(Verizon$Time, mean, treatment = temp)
## Two-sample permutation test
# Like bootstrap2. E.g.
## Not run: permutationTest2(CLEC, data2 = ILEC, mean)
### Example with plots and confidence intervals.
## Not run:
# boot <- bootstrap2(CLEC, data2 = ILEC, mean)
# perm <- permutationTest2(CLEC, data2 = ILEC, mean,
# alternative = "greater")
# ## End(Not run)
## Not run:
# par(mfrow = c(2,2))
# hist(boot)
# qqnorm(boot)
# qqline(boot$replicates)
# hist(perm)
# ## End(Not run)
# P-value ('perm' and 'boot' are created in the "Not run" block above)
perm
# Standard error, and bias estimate
boot
# Confidence intervals
CI.percentile(boot) # Percentile interval
CI.t(boot) # t interval using bootstrap SE
# CI.bootstrapT and CI.bca don't currently support two-sample problems.
# Statistic can be multivariate.
# For the bootstrap t interval (CI.bootstrapT), it must have the estimate
# first, and a standard error second (don't need to divide by sqrt(n),
# that cancels out).
bootC <- bootstrap(CLEC, mean, seed = 0)
bootC2 <- bootstrap(CLEC, c(mean = mean(CLEC), sd = sd(CLEC)), seed = 0)
identical(bootC$replicates[, 1], bootC2$replicates[, 1])
CI.percentile(bootC)
CI.t(bootC)
CI.bca(bootC)
CI.bootstrapT(bootC2)
# The bootstrapT is the most accurate for skewed data, especially
# for small samples.
# By default the percentile interval is "expanded", for better coverage
# in small samples. To turn this off:
CI.percentile(bootC, expand = FALSE)
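The remark above that dividing by sqrt(n) "cancels out" for the bootstrap t can be checked numerically. The base-R sketch below is not the package's CI.bootstrapT; it uses simulated data and computes a bootstrap t interval for the mean twice, once studentizing with s / sqrt(n) and once with s alone. The two intervals should agree up to floating-point error.

```r
# Base-R sketch only; NOT the package's CI.bootstrapT.
set.seed(4)
x <- rexp(20, rate = 1 / 10)   # made-up sample
n <- length(x)
R <- 2000

# One column of indices per bootstrap sample.
idx <- matrix(sample.int(n, n * R, replace = TRUE), nrow = n)
m_star <- apply(idx, 2, function(i) mean(x[i]))
s_star <- apply(idx, 2, function(i) sd(x[i]))

# Version A: studentize with the standard error s / sqrt(n).
tA <- (m_star - mean(x)) / (s_star / sqrt(n))
ciA <- mean(x) - quantile(tA, c(0.975, 0.025)) * sd(x) / sqrt(n)

# Version B: studentize with the standard deviation s only.
tB <- (m_star - mean(x)) / s_star
ciB <- mean(x) - quantile(tB, c(0.975, 0.025)) * sd(x)

# The sqrt(n) factor scales the t quantiles up and the multiplier down
# by the same amount, so the endpoints match.
all.equal(unname(ciA), unname(ciB))
```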