RRlin: Linear randomized response regression

Description

Linear regression for a continuous criterion, using randomized-response (RR) variables as predictors.

Usage

RRlin(formula, data, models, p.list, group = NULL, Kukrep = 1, bs.n = 0,
  nCPU = 1, maxit = 1000, fit.n = 3, pibeta = 0.05)

Arguments

formula

a continuous criterion is predicted by one or more categorical RR variables defined by models. If the number of predictors exceeds the number of entries in the vector models, the remaining predictors are treated as non-randomized variables
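
A minimal base R sketch of how predictors line up with models by position. The variable names (depvar, response, FR, nonRR) are borrowed from the Examples section; the helper objects preds, rr_terms and other_terms are hypothetical and only illustrate the matching rule described above, not how RRlin works internally.

```r
# Formula from the Examples section: two RR predictors, then one ordinary predictor
f <- depvar ~ response + FR + nonRR
models <- c("Warner", "FR")

# Predictor names in their order of appearance in the formula
preds <- attr(terms(f), "term.labels")

# The first length(models) predictors are paired with an RR model (illustration only)
rr_terms <- setNames(models, preds[seq_along(models)])

# Any remaining predictors would be treated as non-randomized
other_terms <- preds[-seq_along(models)]

print(rr_terms)
print(other_terms)
```

Here response and FR are paired with "Warner" and "FR", and nonRR is left over as an ordinary predictor.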

data

an optional data frame, list or environment, containing the variables in the model.

models

character vector specifying RR model(s) in order of appearance in formula. Available models: "Warner", "UQTknown", "UQTunknown", "Mangat", "Kuk", "FR", "Crosswise",

p.list

list of randomization probabilities for the RR models, in the same order as specified in models. Note that the randomization probabilities p must be provided in a list, e.g., list(p=c(.2, .3))
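
As a hedged sketch, models and p.list can be built side by side so that their order matches. The values are the ones used in the Examples section; the sanity checks are an illustrative addition and not part of the package.

```r
# One entry in p.list per entry in models, in the same order
models <- c("Warner", "FR")
p.list <- list(.3,           # probability used for the "Warner" predictor
               c(.1, .15))   # probabilities used for the "FR" predictor

# Illustrative checks before calling RRlin (assumed, not performed by this snippet's caller)
stopifnot(is.list(p.list),
          length(p.list) == length(models),
          all(unlist(p.list) >= 0 & unlist(p.list) <= 1))
```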

group

vector or matrix specifying group membership by the indices 1 and 2. Only for multigroup RR models, e.g., UQTunknown, CDM or SLD
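
A small base R sketch of a group vector coded with the indices 1 and 2. The sample size and the random assignment are assumptions made for illustration; in practice the vector would come from the study design.

```r
set.seed(1)
n <- 500

# Randomly assign each respondent to group 1 or group 2 (illustrative design)
group <- sample(1:2, n, replace = TRUE)

# Check the coding: only the indices 1 and 2 may occur
stopifnot(all(group %in% 1:2))
table(group)
```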

Kukrep

defines the number of repetitions in Kuk's card playing method

bs.n

Number of samples used for the non-parametric bootstrap

nCPU

Number of cores used for the bootstrap
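
The following sketch shows one way bs.n and nCPU might be set together. It assumes the RRreg package is installed (the call is skipped otherwise); the cap of two cores, the 50 bootstrap samples and the helper name n_cores are arbitrary choices for illustration.

```r
# Use at most two cores, and never more than the machine reports
n_cores <- max(1L, min(2L, parallel::detectCores(), na.rm = TRUE))

if (requireNamespace("RRreg", quietly = TRUE)) {
  library(RRreg)
  # Simulated data with a single Warner predictor, as in the Examples section
  dat <- RRgen(n = 500, pi = .4, model = "Warner", p = .3)
  dat$depvar <- 2 * dat$true + rnorm(500, 1, 7)

  # Request 50 bootstrap samples, distributed over n_cores cores
  fit <- RRlin(depvar ~ response, data = dat, models = "Warner",
               p.list = list(.3), bs.n = 50, nCPU = n_cores, fit.n = 1)
  print(summary(fit))
}
```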

maxit

maximum number of iterations in optimization routine

fit.n

number of fitting runs with random starting values

pibeta

approximate ratio of probabilities pi to regression weights beta (to adjust scaling). Can be used to speed up and fine-tune ML estimation (i.e., by choosing a smaller value for larger beta values).

Value

Returns an object of class RRlin, which can be analysed with the generic method summary

References

van den Hout, A., & Kooiman, P. (2006). Estimating the linear regression model with categorical covariates subject to randomized response. Computational Statistics & Data Analysis, 50, 3311-3323.

See Also

vignette('RRreg') or https://dl.dropboxusercontent.com/u/21456540/RRreg/index.html for a detailed description of the RR models and the appropriate definition of p

Examples

# generate two RR predictors
dat <- RRgen(n=500, pi=.4, model="Warner", p=.3)
dat2 <- RRgen(n=500, pi=c(.4,.6), model="FR", p=c(.1,.15))
dat$FR <- dat2$response
dat$trueFR <- dat2$true

# generate a third predictor and continuous dependent variables
dat$nonRR <- rnorm(500, 5, 1)
dat$depvar <- 2*dat$true - 3*dat2$true +
  .5*dat$nonRR + rnorm(500, 1, 7)

# use RRlin and compare to regression on non-RR variables
linreg <- RRlin(depvar~response+FR+nonRR, data=dat,
                models=c("Warner","FR"),
                p.list=list(.3, c(.1,.15)), fit.n=1)
summary(linreg)
summary(lm(depvar~true+trueFR+nonRR, data=dat))