simex (version 1.8)

simex: Measurement error in models using SIMEX

Description

Implementation of the SIMEX algorithm for measurement error models according to Cook and Stefanski

Usage

simex(model, SIMEXvariable, measurement.error, lambda = c(0.5, 1, 1.5,
  2), B = 100, fitting.method = "quadratic",
  jackknife.estimation = "quadratic", asymptotic = TRUE)

# S3 method for simex plot(x, xlab = expression((1 + lambda)), ylab = colnames(b)[-1], ask = FALSE, show = rep(TRUE, NCOL(b) - 1), ...)

# S3 method for simex predict(object, newdata, ...)

# S3 method for simex print(x, digits = max(3, getOption("digits") - 3), ...)

# S3 method for summary.simex print(x, digits = max(3, getOption("digits") - 3), ...)

# S3 method for simex refit(object, fitting.method = "quadratic", jackknife.estimation = "quadratic", asymptotic = TRUE, ...)

# S3 method for simex summary(object, ...)

Arguments

model

the naive model

SIMEXvariable

character or vector of characters containing the names of the variables with measurement error

measurement.error

given standard deviations of measurement errors. In case of homoskedastic measurement error it is a matrix with dimension 1xlength(SIMEXvariable). In case of heteroskedastic error for at least one SIMEXvariable it is a matrix of dimension nx

lambda

vector of lambdas for which the simulation step should be done (without 0)

B

number of iterations for each lambda

fitting.method

fitting method for the extrapolation. linear, quadratic, nonlinear are implemented. (first 4 letters are enough)

jackknife.estimation

specifying the extrapolation method for jackknife variance estimation. Can be set to FALSE if it should not be performed

asymptotic

logical, indicating if asymptotic variance estimation should be done, in the naive model the option x = TRUE has to be set

x

object of class 'simex'

xlab

optional name for the X-Axis

ylab

vector containing the names for the Y-Axis

ask

logical. If TRUE, the user is asked for input, before a new figure is drawn

show

vector of logicals indicating for wich variables a plot should be produced

arguments passed to other functions

object

of class 'simex'

newdata

optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used

digits

number of digits to be printed

Value

An object of class 'simex' which contains:

coefficients

the corrected coefficients of the SIMEX model,

SIMEX.estimates

the estimates for every lambda,

model

the naive model,

measurement.error

the known error standard deviations,

B

the number of iterations,

extrapolation

the model object of the extrapolation step,

fitting.method

the fitting method used in the extrapolation step,

residuals

the residuals of the main model,

fitted.values

the fitted values of the main model,

call

the function call,

variance.jackknife

the jackknife variance estimate,

extrapolation.variance

the model object of the variance extrapolation,

variance.jackknife.lambda

the data set for the extrapolation,

variance.asymptotic

the asymptotic variance estimates,

theta

the estimates for every B and lambda,

...

Methods (by generic)

  • plot: Plot the simulation and extrapolation step

  • predict: Predict using simex correction

  • print: Print simex nicely

  • print: Print summary nicely

  • refit: Refits the model with a different extrapolation function

  • summary: Summary of simulation and extrapolation

Details

Nonlinear is implemented as described in Cook and Stefanski, but is numerically instable. It is not advisable to use this feature. If a nonlinear extrapolation is desired please use the refit() method.

Asymptotic is only implemented for naive models of class lm or glm with homoscedastic measurement error.

refit() refits the object with a different extrapolation function.

References

Cook, J.R. and Stefanski, L.A. (1994) Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association, 89, 1314 -- 1328

Carroll, R.J., K<U+00FC>chenhoff, H., Lombard, F. and Stefanski L.A. (1996) Asymptotics for the SIMEX estimator in nonlinear measurement error models. Journal of the American Statistical Association, 91, 242 -- 250

Carrol, R.J., Ruppert, D., Stefanski, L.A. and Crainiceanu, C. (2006). Measurement error in nonlinear models: A modern perspective., Second Edition. London: Chapman and Hall.

Lederer, W. and K<U+00FC>chenhoff, H. (2006) A short introduction to the SIMEX and MCSIMEX. R News, 6(4), 26--31

See Also

mcsimex for discrete data with misclassification, lm, glm

Examples

Run this code
# NOT RUN {
## Seed
set.seed(49494)

## simulating the measurement error standard deviations
sd_me <- 0.3
sd_me2 <- 0.4
temp <- runif(100, min = 0, max = 0.6)
sd_me_het1 <- sort(temp)
temp2 <- rnorm(100, sd = 0.1)
sd_me_het2 <- abs(sd_me_het1 + temp2)

## simulating the independent variables x (real and with measurement error):

x_real <- rnorm(100)
x_real2 <- rpois(100, lambda = 2)
x_real3 <- -4*x_real + runif(100, min = -10, max = 10)  # correlated to x_real

x_measured <- x_real + sd_me * rnorm(100)
x_measured2 <- x_real2 + sd_me2 * rnorm(100)
x_het1 <- x_real + sd_me_het1 * rnorm(100)
x_het2 <- x_real3 + sd_me_het2 * rnorm(100)

## calculating dependent variable y:
y <- x_real + rnorm(100, sd = 0.05)
y2 <- x_real + 2*x_real2 + rnorm(100, sd = 0.08)
y3 <- x_real + 2*x_real3 + rnorm(100, sd = 0.08)

### one variable with homoscedastic measurement error
(model_real <- lm(y ~ x_real))

(model_naiv <- lm(y ~ x_measured, x = TRUE))

(model_simex <- simex(model_naiv, SIMEXvariable = "x_measured", measurement.error = sd_me))
plot(model_simex)

### two variables with homoscedastic measurement errors
(model_real2 <- lm(y2 ~ x_real + x_real2))
(model_naiv2 <- lm(y2 ~ x_measured + x_measured2, x = TRUE))
(model_simex2 <- simex(model_naiv2, SIMEXvariable = c("x_measured", "x_measured2"),
         measurement.error = cbind(sd_me, sd_me2)))

plot(model_simex2)

# }
# NOT RUN {
### one variable with increasing heteroscedastic measurement error
model_real

(mod_naiv1 <- lm(y ~ x_het1, x = TRUE))
(mod_simex1 <- simex(mod_naiv1, SIMEXvariable = "x_het1",
                measurement.error = sd_me_het1, asymptotic = FALSE))

plot(mod_simex1)

### two correlated variables with heteroscedastic measurement errors
(model_real3 <- lm(y3 ~ x_real + x_real3))
(mod_naiv2 <- lm(y3 ~ x_het1 + x_het2, x = TRUE))
(mod_simex2 <- simex(mod_naiv2, SIMEXvariable = c("x_het1", "x_het2"),
              measurement.error = cbind(sd_me_het1, sd_me_het2), asymptotic = FALSE))

plot(mod_simex2)

### two variables, one with homoscedastic, one with heteroscedastic measurement error
model_real2
(mod_naiv3 <- lm(y2 ~ x_measured + x_het2, x = TRUE))
(mod_simex3 <- simex(mod_naiv3, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

### glm: two variables, one with homoscedastic, one with heteroscedastic measurement error
t <- x_real + 2*x_real2 + rnorm(100, sd = 0.01)
g <- 1 / (1 + exp(t))
u <- runif(100)
ybin <- as.numeric(u < g)

(logit_real <- glm(ybin ~ x_real + x_real2, family = binomial))
(logit_naiv <- glm(ybin ~ x_measured + x_het2, x = TRUE, family = binomial))
(logit_simex <- simex(logit_naiv, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

summary(logit_simex)
print(logit_simex)
plot(logit_simex)

### polr: two variables, one with homoscedastic, one with heteroscedastic measurement error

if(require("MASS")) {# Requires MASS
yerr <- jitter(y, amount=1)
yfactor <- cut(yerr, 3, ordered_result=TRUE)

(polr_real <- polr(yfactor ~ x_real + x_real2))
(polr_naiv <- polr(yfactor ~ x_measured + x_het2, Hess = TRUE))
(polr_simex <- simex(polr_naiv, SIMEXvariable = c("x_measured", "x_het2"),
                    measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE))

summary(polr_simex)
print(polr_simex)
plot(polr_simex)
}
# }
# NOT RUN {
# }

Run the code above in your browser using DataCamp Workspace