simex: Measurement error in models using SIMEX

Description

Implementation of the SIMEX algorithm for measurement error models according to Cook and Stefanski

Usage

simex(model, SIMEXvariable, measurement.error, lambda = c(0.5, 1, 1.5,
  2), B = 100, fitting.method = "quadratic",
  jackknife.estimation = "quadratic", asymptotic = TRUE)
# S3 method for simex
plot(x, xlab = expression((1 + lambda)),
  ylab = colnames(b)[-1], ask = FALSE, show = rep(TRUE, NCOL(b) - 1),
  ...)
# S3 method for simex
predict(object, newdata, ...)
# S3 method for simex
print(x, digits = max(3, getOption("digits") - 3), ...)
# S3 method for summary.simex
print(x, digits = max(3, getOption("digits") -
  3), ...)
# S3 method for simex
refit(object, fitting.method = "quadratic",
  jackknife.estimation = "quadratic", asymptotic = TRUE, ...)
# S3 method for simex
summary(object, ...)

Arguments

model

the naive model

SIMEXvariable

character or vector of characters containing the names of the variables with measurement error

measurement.error

given standard deviations of measurement errors. In case of homoskedastic measurement error it is a matrix with dimension 1xlength(SIMEXvariable). In case of heteroskedastic error for at least one SIMEXvariable it is a matrix of dimension nx

lambda

vector of lambdas for which the simulation step should be done (without 0)

number of iterations for each lambda

fitting.method

fitting method for the extrapolation. linear, quadratic, nonlinear are implemented. (first 4 letters are enough)

jackknife.estimation

specifying the extrapolation method for jackknife variance estimation. Can be set to FALSE if it should not be performed

asymptotic

logical, indicating if asymptotic variance estimation should be done, in the naive model the option x = TRUE has to be set

object of class 'simex'

xlab

optional name for the X-Axis

ylab

vector containing the names for the Y-Axis

ask

logical. If TRUE, the user is asked for input, before a new figure is drawn

show

vector of logicals indicating for wich variables a plot should be produced

…

arguments passed to other functions

object

of class 'simex'

newdata

optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used

digits

number of digits to be printed

Value

An object of class 'simex' which contains:

coefficients

the corrected coefficients of the SIMEX model,

SIMEX.estimates

the estimates for every lambda,

model

the naive model,

measurement.error

the known error standard deviations,

the number of iterations,

extrapolation

the model object of the extrapolation step,

fitting.method

the fitting method used in the extrapolation step,

residuals

the residuals of the main model,

fitted.values

the fitted values of the main model,

call

the function call,

variance.jackknife

the jackknife variance estimate,

extrapolation.variance

the model object of the variance extrapolation,

variance.jackknife.lambda

the data set for the extrapolation,

variance.asymptotic

the asymptotic variance estimates,

theta

the estimates for every B and lambda,

...

Methods (by generic)

plot: Plot the simulation and extrapolation step
predict: Predict using simex correction
print: Print simex nicely
print: Print summary nicely
refit: Refits the model with a different extrapolation function
summary: Summary of simulation and extrapolation

Details

Nonlinear is implemented as described in Cook and Stefanski, but is numerically instable. It is not advisable to use this feature. If a nonlinear extrapolation is desired please use the refit() method.

Asymptotic is only implemented for naive models of class lm or glm with homoscedastic measurement error.

refit() refits the object with a different extrapolation function.

References

Cook, J.R. and Stefanski, L.A. (1994) Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association, 89, 1314 -- 1328

Carroll, R.J., K<U+00FC>chenhoff, H., Lombard, F. and Stefanski L.A. (1996) Asymptotics for the SIMEX estimator in nonlinear measurement error models. Journal of the American Statistical Association, 91, 242 -- 250

Carrol, R.J., Ruppert, D., Stefanski, L.A. and Crainiceanu, C. (2006). Measurement error in nonlinear models: A modern perspective., Second Edition. London: Chapman and Hall.

Lederer, W. and K<U+00FC>chenhoff, H. (2006) A short introduction to the SIMEX and MCSIMEX. R News, 6(4), 26--31

Examples

Run this code

# NOT RUN {
## Seed
set.seed(49494)

## simulating the measurement error standard deviations
sd_me <- 0.3
sd_me2 <- 0.4
temp <- runif(100, min = 0, max = 0.6)
sd_me_het1 <- sort(temp)
temp2 <- rnorm(100, sd = 0.1)
sd_me_het2 <- abs(sd_me_het1 + temp2)

## simulating the independent variables x (real and with measurement error):

x_real <- rnorm(100)
x_real2 <- rpois(100, lambda = 2)
x_real3 <- -4*x_real + runif(100, min = -10, max = 10)  # correlated to x_real

x_measured <- x_real + sd_me * rnorm(100)
x_measured2 <- x_real2 + sd_me2 * rnorm(100)
x_het1 <- x_real + sd_me_het1 * rnorm(100)
x_het2 <- x_real3 + sd_me_het2 * rnorm(100)

## calculating dependent variable y:
y <- x_real + rnorm(100, sd = 0.05)
y2 <- x_real + 2*x_real2 + rnorm(100, sd = 0.08)
y3 <- x_real + 2*x_real3 + rnorm(100, sd = 0.08)

### one variable with homoscedastic measurement error
(model_real <- lm(y ~ x_real))

(model_naiv <- lm(y ~ x_measured, x = TRUE))

(model_simex <- simex(model_naiv, SIMEXvariable = "x_measured", measurement.error = sd_me))
plot(model_simex)

### two variables with homoscedastic measurement errors
(model_real2 <- lm(y2 ~ x_real + x_real2))
(model_naiv2 <- lm(y2 ~ x_measured + x_measured2, x = TRUE))
(model_simex2 <- simex(model_naiv2, SIMEXvariable = c("x_measured", "x_measured2"),
         measurement.error = cbind(sd_me, sd_me2)))

plot(model_simex2)

# }
# NOT RUN {
### one variable with increasing heteroscedastic measurement error
model_real

(mod_naiv1 <- lm(y ~ x_het1, x = TRUE))
(mod_simex1 <- simex(mod_naiv1, SIMEXvariable = "x_het1",
                measurement.error = sd_me_het1, asymptotic = FALSE))

plot(mod_simex1)

### two correlated variables with heteroscedastic measurement errors
(model_real3 <- lm(y3 ~ x_real + x_real3))
(mod_naiv2 <- lm(y3 ~ x_het1 + x_het2, x = TRUE))
(mod_simex2 <- simex(mod_naiv2, SIMEXvariable = c("x_het1", "x_het2"),
              measurement.error = cbind(sd_me_het1, sd_me_het2), asymptotic = FALSE))

plot(mod_simex2)

### two variables, one with homoscedastic, one with heteroscedastic measurement error
model_real2
(mod_naiv3 <- lm(y2 ~ x_measured + x_het2, x = TRUE))
(mod_simex3 <- simex(mod_naiv3, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

### glm: two variables, one with homoscedastic, one with heteroscedastic measurement error
t <- x_real + 2*x_real2 + rnorm(100, sd = 0.01)
g <- 1 / (1 + exp(t))
u <- runif(100)
ybin <- as.numeric(u < g)

(logit_real <- glm(ybin ~ x_real + x_real2, family = binomial))
(logit_naiv <- glm(ybin ~ x_measured + x_het2, x = TRUE, family = binomial))
(logit_simex <- simex(logit_naiv, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

summary(logit_simex)
print(logit_simex)
plot(logit_simex)

### polr: two variables, one with homoscedastic, one with heteroscedastic measurement error

if(require("MASS")) {# Requires MASS
yerr <- jitter(y, amount=1)
yfactor <- cut(yerr, 3, ordered_result=TRUE)

(polr_real <- polr(yfactor ~ x_real + x_real2))
(polr_naiv <- polr(yfactor ~ x_measured + x_het2, Hess = TRUE))
(polr_simex <- simex(polr_naiv, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

summary(polr_simex)
print(polr_simex)
plot(polr_simex)
}
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab