sdcMicro (version 4.6.0)

localSuppression: Local Suppression to obtain k-anonymity

Description

Algorithm to achieve k-anonymity by performing local suppression.

Usage

localSuppression(obj, k = 2, importance = NULL, combs = NULL, ...)

Arguments

obj
an object of class sdcMicroObj or a data frame or matrix
k
threshold for k-anonymity
importance
numeric vector of numbers between 1 and n (n = length of the vector keyVars). This vector represents the "importance" of the variables used for local suppression to obtain k-anonymity. Key variables with importance=1 will, if possible, not be suppressed; key variables with importance=n will be used for suppression whenever possible.
combs
numeric vector. If specified, the algorithm will provide k-anonymity for each combination of n key variables, with n being the value of the i-th element of this parameter. For example, if combs=c(4,3), the algorithm will provide k-anonymity for all combinations of 4 key variables and then for all combinations of 3 key variables. Different values of k can be applied to these subsets by specifying k as a vector. If k has only one element, the same value of k is used for all subsets.
...
see arguments below
  • keyVars: numeric vector specifying indices of (categorical) key-variables
  • strataVars: numeric vector specifying indices of variables that should be used for stratification within 'obj'
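
The pairing of combs and k described above can be illustrated in base R. This is a minimal sketch that only enumerates the subsets of key variables implied by combs; it does not call sdcMicro, and the order in which the package processes the subsets internally is an assumption. The thresholds in k are hypothetical.

```r
# Key variables as used in the examples on this page
key_vars <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")

combs <- c(4, 3)   # subset sizes, as in the combs=c(4,3) example
k     <- c(5, 10)  # hypothetical thresholds, one per element of combs

# All subsets of key variables of each requested size
subsets <- lapply(combs, function(m) combn(key_vars, m, simplify = FALSE))

# One row per element of combs: subset size, threshold, number of subsets
overview <- data.frame(size = combs, k = k, n_subsets = lengths(subsets))
print(overview)
```

With 7 key variables there are choose(7, 4) = 35 subsets of size 4 and choose(7, 3) = 35 subsets of size 3, so k-anonymity is requested for 70 subsets in total.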

Value

Manipulated data set with suppressions that has k-anonymity with respect to specified key-variables or the manipulated data stored in the sdcMicroObj-class.

Methods

signature(obj = "data.frame")
signature(obj = "matrix")
signature(obj = "sdcMicroObj")

Details

The algorithm provides a k-anonymized data set by suppressing values in key variables. It tries to find a solution that suppresses as few values as possible while taking the specified importance vector into account. If no importance vector is specified, one is constructed such that key variables with a high number of characteristics (categories) are considered less important than key variables with a low number of characteristics.
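
The default ordering described above can be approximated in base R. The following is a sketch under stated assumptions, not the package's implementation: the toy data and variable names are hypothetical, and how sdcMicro breaks ties between variables with equally many categories is not documented here.

```r
# Toy data: three hypothetical key variables with 2, 3 and 5 categories
toy <- data.frame(
  sex    = c("f", "m", "f", "m", "f", "m"),
  region = c("N", "S", "E", "N", "S", "E"),
  age    = c("a1", "a2", "a3", "a4", "a5", "a1")
)

# Number of distinct categories per key variable
n_categories <- sapply(toy, function(x) length(unique(x)))

# Fewer categories -> lower number -> "more important" (suppressed last).
# ties.method = "first" is an assumption made for this sketch only.
importance <- rank(n_categories, ties.method = "first")
print(importance)
```

A vector of this form, with one entry per key variable, is what the importance argument expects when the default is to be overridden.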

The implementation provides k-anonymity within each stratum if slot 'strataVar' has been set in the sdcMicroObj-class object or if parameter 'strataVars' is used when applying the data.frame or matrix method. For details, have a look at the examples provided.
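
Why stratification matters can be seen in a small base R sketch. The helper count_violations is hypothetical (it is not part of sdcMicro) and assumes there are no missing values; it counts records whose key combination occurs fewer than k times, first over the whole data set and then within each stratum.

```r
# Toy data: two key variables and one stratification variable
toy <- data.frame(
  sex    = c("f", "f", "m", "m", "f", "f"),
  region = c("N", "N", "N", "N", "S", "S"),
  ageG   = c("AG1", "AG1", "AG1", "AG2", "AG2", "AG2")
)

# Hypothetical helper: number of records violating k-anonymity
count_violations <- function(df, key_vars, k = 2) {
  # One label per record, built from its key-variable values
  combo <- interaction(df[key_vars], drop = TRUE)
  # Frequency of each record's own combination
  freq <- ave(seq_along(combo), combo, FUN = length)
  sum(freq < k)
}

keys <- c("sex", "region")
overall     <- count_violations(toy, keys)
per_stratum <- sapply(split(toy, toy$ageG), count_violations, key_vars = keys)
print(overall)
print(per_stratum)
```

In this toy data every key combination occurs twice overall, so the data are 2-anonymous as a whole, yet each stratum contains one record with a unique combination. Requiring k-anonymity per stratum is therefore the stricter condition.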

Examples

data(francdat)
## Local Suppression
localS <- localSuppression(francdat, keyVars=c(4,5,6))
localS
plot(localS)

## for objects of class sdcMicroObj, no stratification
data(testdata2)
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'),
  numVars=c('expend','income','savings'), w='sampling_weight')
sdc <- localSuppression(sdc)

## for objects of class sdcMicroObj, with stratification
testdata2$ageG <- cut(testdata2$age, 5, labels=paste0("AG",1:5))
sdc <- createSdcObj(testdata2,
  keyVars=c('urbrur','roof','walls','water','electcon','relat','sex'),
  numVars=c('expend','income','savings'), w='sampling_weight',
  strataVar='ageG')
sdc <- localSuppression(sdc)

## it is also possible to provide k-anonymity for subsets of key-variables
## with different parameter k!
## in this case we want to provide 10-anonymity for all combinations
## of 5 key variables, 20-anonymity for all combinations with 4 key variables
## and 30-anonymity for all combinations of 3 key variables.
## note: strata are automatically considered!
combs <- 5:3
k <- c(10,20,30)
sdc <- localSuppression(sdc, k=k, combs=combs)

## data.frame method (no stratification)
keyVars <- c("urbrur","roof","walls","water","electcon","relat","sex")
strataVars <- c("ageG")
inp <- testdata2[,c(keyVars, strataVars)]
ls <- localSuppression(inp, keyVars=1:7)
print(ls)
plot(ls)

## data.frame method (with stratification)
ls <- localSuppression(inp, keyVars=1:7, strataVars=8)
print(ls)
plot(ls, showTotalSupps=TRUE)
