RRcor: Bivariate correlations including randomized response variables

Description

RRcor calculates bivariate Pearson correlations of variables measured with or without RR.

Usage

RRcor(x, y = NULL, models, p.list, group = NULL, bs.n = 0,
  bs.type = c("se.n", "se.p", "pval"), nCPU = 1)

Arguments

x

a numeric vector, matrix or data frame.

y

NULL (default) or a vector, matrix or data frame with dimensions compatible with x.

models

a vector defining which RR design is used for each variable. Must be in the same order as variables appear in x and y (by columns). Available discrete models: Warner, Kuk, FR, Mangat

p.list

a list containing the randomization probabilities of the RR models defined in models. Either all directly measured variables (i.e., those without randomized response) in models can be excluded from p.list; or, if

group

a matrix defining the group membership of each participant (values 1 and 2) for all multiple-group models (SLD, UQTunknown). If only one of these models is included in models, a vector can be used. For more than one m

bs.n

number of samples used to get bootstrapped standard errors

bs.type

to get bootstrapped standard errors, use "se.p" for the parametric and/or "se.n" for the nonparametric bootstrap. Use "pval" to get p-values from the parametric bootstrap (assuming a true correlation of zero). Note th

nCPU

number of CPUs used for the bootstrap

Value

RRcor returns a list with the following components:

r

estimated correlation matrix

rSE.p, rSE.n

standard errors from parametric/nonparametric bootstrap

prob

two-sided p-values from parametric bootstrap

samples.p, samples.n

sampled correlations from parametric/nonparametric bootstrap (for the standard errors)
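As an illustration of how the components listed above might be inspected, here is a minimal sketch that requests parametric bootstrap standard errors and p-values. It assumes the RRreg package is installed and uses only the arguments shown in Usage; the small bs.n = 50 and the object name fit are arbitrary choices for the sketch, not recommendations.

```r
library(RRreg)
set.seed(123)

# simulate one Kuk RR variable and one directly measured covariate
n <- 1000
p1 <- c(.3, .7)
gData <- RRgen(n, pi = .3, model = "Kuk", p = p1)
gData$cov <- rnorm(n, 0, 4) + gData$true

# request parametric bootstrap SEs and p-values (small bs.n only to keep this quick)
fit <- RRcor(x = gData[, c("response", "cov")],
             models = c("Kuk", "d"), p.list = list(p1),
             bs.n = 50, bs.type = c("se.p", "pval"), nCPU = 1)

fit$r      # estimated correlation matrix
fit$rSE.p  # standard errors from the parametric bootstrap
fit$prob   # two-sided p-values from the parametric bootstrap
```

In practice a much larger bs.n would normally be used for stable standard errors.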

Details

Correlations of RR variables are calculated by the method of Fox & Tracy (1984) by interpreting the variance induced by the RR procedure as uncorrelated measurement error. Since the error is independent, the correlation can be corrected to obtain an unbiased estimator.
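The correction idea can be sketched in base R for a single Warner-type variable. This is an illustration of the measurement-error argument, not the package's internal code: the simulation settings and all variable names (truth, resp, s, pi_hat) are assumptions made for the sketch. The response is rescaled so that its expectation given the true state equals the true state; the randomization then only adds uncorrelated noise, which inflates the variance but leaves the covariance with y unchanged.

```r
set.seed(1)
n <- 100000
p <- 0.8                       # probability of answering the sensitive question truthfully

truth <- rbinom(n, 1, 0.3)     # unobserved true state
y <- rnorm(n) + truth          # directly measured covariate

# Warner-type randomization: with probability 1 - p the answer is inverted
tell_truth <- rbinom(n, 1, p)
resp <- ifelse(tell_truth == 1, truth, 1 - truth)

# rescale the response so that E[s | truth] = truth
s <- (resp - (1 - p)) / (2 * p - 1)
pi_hat <- mean(s)              # estimated prevalence of the sensitive attribute

r_naive <- cor(resp, y)        # attenuated by the randomization noise
# covariance is unaffected by the noise; use the variance of the true state instead of var(s)
r_corrected <- cov(s, y) / sqrt(pi_hat * (1 - pi_hat) * var(y))
r_true <- cor(truth, y)        # only available in a simulation

round(c(naive = r_naive, corrected = r_corrected, true = r_true), 3)
```

In a simulation like this, the naive correlation is pulled toward zero, while the corrected value should lie close to the correlation computed from the true states.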

Note that the continuous RR model mix.norm with the randomization parameter p=c(p.truth, mean, SD) assumes that participants respond either to the sensitive question with probability p.truth or otherwise to a known masking distribution with known mean and SD. The estimated correlation only depends on the mean and SD and does not require normality. However, the assumption of normality is used in the parametric bootstrap to obtain standard errors.
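The claim that only the mean and SD of the masking distribution matter can be illustrated with a moment-based sketch in base R. This is not the package's implementation: the uniform masking distribution, the simulation settings, and all variable names are assumptions chosen to show that normality of the masking distribution is not needed for the point estimate.

```r
set.seed(2)
n <- 200000
p_truth <- 0.7                          # probability of responding to the sensitive question

truth <- rnorm(n, mean = 2, sd = 1)     # unobserved sensitive continuous variable
y <- 0.5 * truth + rnorm(n)             # directly measured covariate

# masking distribution: deliberately non-normal, but with known mean and SD
mask <- runif(n, 0, 4)
m_mean <- 2
m_sd <- 4 / sqrt(12)

answer_truth <- rbinom(n, 1, p_truth)
x <- ifelse(answer_truth == 1, truth, mask)

# recover the first two moments of the true variable from the mixture
mean_t <- (mean(x) - (1 - p_truth) * m_mean) / p_truth
mean_t2 <- (mean(x^2) - (1 - p_truth) * (m_sd^2 + m_mean^2)) / p_truth
var_t <- mean_t2 - mean_t^2

# the masking draws are independent of y, so cov(x, y) = p_truth * cov(truth, y)
r_corrected <- (cov(x, y) / p_truth) / sqrt(var_t * var(y))
r_naive <- cor(x, y)
r_true <- cor(truth, y)                 # only available in a simulation

round(c(naive = r_naive, corrected = r_corrected, true = r_true), 3)
```

Only m_mean and m_sd enter the correction, which is why the shape of the masking distribution does not affect the point estimate in this sketch.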

References

Fox, J. A., & Tracy, P. E. (1984). Measuring associations with randomized response. Social Science Research, 13, 188-197.

See Also

vignette('RRreg') or https://dl.dropboxusercontent.com/u/21456540/RRreg/index.html for a detailed description of the RR models and the appropriate definition of p

Examples

# generate first RR variable
n <- 1000
p1 <- c(.3,.7)
gData <- RRgen(n,pi=.3,model="Kuk",p1)

# generate second RR variable
p2 <- c(.8,.5)
t2 <- rbinom(n=n, size=1, prob=(gData$true+1)/2)
temp <- RRgen(model="UQTknown",p=p2, trueState=t2)
gData$UQTresp <- temp$response
gData$UQTtrue <- temp$true

# generate continuous covariate
gData$cov <- rnorm(n,0,4) + gData$UQTtrue + gData$true

# estimate correlations using directly measured / RR variables
cor(gData[,c("true","cov","UQTtrue")])
RRcor(x=gData[,c("response","cov","UQTresp")],
      models=c("Kuk","d","UQTknown"),p.list= list(p1,p2) )