sdcMicro (version 4.1.0)

measure_risk: Disclosure Risk for Categorical Variables

Description

The function measures the disclosure risk for weighted or unweighted data. It computes the individual risk (and the household risk, where applicable) and the global risk. It also computes a risk threshold based on a global risk value. The household risk should be used only when the disclosure risks of individuals within a family can be considered statistically independent. Internally, the functions freqCalc() and indivRisk() are used for estimation.

Usage

measure_risk(obj,...)
#measure_risk(data,keyVars,w=NULL,missing=-999,
#hid=NULL,max_global_risk=.01,fast_hier=TRUE)
ldiversity(obj,ldiv_index,l_recurs_c=2,missing=-999,...)
## S3 method for class 'measure_risk':
print(x, ...)
## S3 method for class 'ldiversity':
print(x, ...)

Arguments

obj
Object of class sdcMicroObj
...
see arguments below
data
Input data, either a matrix or a data.frame.
keyVars
Names of the categorical key variables
w
Name of the variable containing the sample weights
hid
Name of the variable containing the household ID
missing
An integer value to be used as the missing value in the C++ routine
ldiv_index
Indices (or names) of the sensitive variables used for l-diversity
l_recurs_c
l-diversity constant (the constant c used for recursive l-diversity)
x
Output of measure_risk, measure_hier or measure_thres
max_global_risk
Maximal global risk for threshold computation
fast_hier
If TRUE, a faster approximation is computed when household data are provided.
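
To illustrate what the variables passed in ldiv_index are used for, the following is a minimal base R sketch of distinct l-diversity: the number of distinct values of a sensitive variable within each combination of key variables. The toy data and the column names (region, roof, illness, distinct_l) are assumptions made for illustration. This is not the routine used by ldiversity(), which also reports entropy and recursive variants.

```r
# Toy data: two key variables and one sensitive variable (all hypothetical).
toy <- data.frame(
  region  = c("urban", "urban", "urban", "rural", "rural"),
  roof    = c("tile",  "tile",  "tile",  "straw", "straw"),
  illness = c("flu",   "flu",   "asthma", "flu",  "flu")
)

# One group per observed combination of key-variable values.
key <- interaction(toy$region, toy$roof, drop = TRUE)

# Distinct l-diversity: number of distinct sensitive values within each group,
# repeated for every observation of that group.
toy$distinct_l <- ave(
  as.integer(factor(toy$illness)), key,
  FUN = function(v) length(unique(v))
)

print(toy)
```

A group whose distinct_l equals 1 reveals the sensitive value of every member once the group is identified, even if the group itself is large.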

Value

A modified sdcMicroObj object or a list with the following elements:

  • global_risk_ER: expected number of re-identifications.
  • global_risk: global risk (sum of individual risks).
  • global_risk_pct: global risk in percent.
  • Res: matrix with the risk, the frequency in the sample and the grossed-up frequency in the population (and the hierarchical risk) for each observation.
  • global_threshold: for a given max_global_risk, the threshold for the risk of observations.
  • max_global_risk: the input max_global_risk of the function.
  • hier_risk_ER: expected number of re-identifications with household structure.
  • hier_risk: global risk with household structure (sum of individual risks).
  • hier_risk_pct: global risk with household structure in percent.
  • ldiversity: matrix with Distinct_Ldiversity, Entropy_Ldiversity and Recursive_Ldiversity for each sensitive variable.
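
The relation between the per-observation columns of Res and the global figures can be sketched in base R. This is a simplified illustration only: it assumes a naive individual risk of 1 / Fk, which is not the estimator that measure_risk() obtains through indivRisk(), and all data and names (toy, fk, Fk, risk, expected_reident, risk_share) are hypothetical.

```r
# Toy data: two categorical key variables and a sampling weight (all hypothetical).
toy <- data.frame(
  region = c("urban", "urban", "rural", "rural", "rural"),
  roof   = c("tile",  "tile",  "tile",  "straw", "straw"),
  weight = c(10, 20, 5, 2, 3)
)

# One group per observed combination of key-variable values.
key <- interaction(toy$region, toy$roof, drop = TRUE)

# fk: frequency of the combination in the sample.
# Fk: grossed-up frequency in the population (sum of the sampling weights).
toy$fk <- ave(toy$weight, key, FUN = length)
toy$Fk <- ave(toy$weight, key, FUN = sum)

# Naive individual risk (assumption: 1 / Fk, not the sdcMicro estimator).
toy$risk <- 1 / toy$Fk

# Expected number of re-identifications: the sum of the individual risks.
expected_reident <- sum(toy$risk)

# The same quantity expressed as a share of the sample size.
risk_share <- expected_reident / nrow(toy)

print(toy)
print(c(expected = expected_reident, share = risk_share))
```

Observations with a small Fk dominate the sum, which is why a threshold on the individual risk (compare global_threshold) is a natural way to cap the global figure.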

References

http://neon.vb.cbs.nl/casc/Software/MuManual4.1.pdf

See Also

freqCalc, indivRisk

Examples

## measure_risk with sdcMicro objects:
data(testdata)
sdc <- createSdcObj(testdata,
  keyVars=c('urbrur','roof','walls','water','electcon'),
  numVars=c('expend','income','savings'), w='sampling_weight')
## risk is already estimated and available in...
names(sdc@risk)
## measure risk on data frames or matrices:
res <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"))
print(res)
head(res$Res)
resw <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),w="sampling_weight")
print(resw)
head(resw$Res)
res1 <- ldiversity(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),ldiv_index="electcon")
print(res1)
head(res1)
res2 <- ldiversity(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),ldiv_index=c("electcon","relat"))
print(res2)
head(res2)
# measure risk with household risk
resh <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),w="sampling_weight",hid="ori_hid")
print(resh)
# change max_global_risk
rest <- measure_risk(testdata,
  keyVars=c("urbrur","roof","walls","water","sex"),
  w="sampling_weight",max_global_risk=0.0001)
print(rest)
  
## for objects of class sdcMicroObj:
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'), 
  numVars=c('expend','income','savings'), w='sampling_weight')
## already internally applied and available in object sdc:
## sdc <- measure_risk(sdc)
