Categorical (key) variables. Either the column names or the column indices of the variables to be used for
risk measurement.
missing
Missing value coding in the given data set.
DisFraction
The sampling fraction under simple random sampling, or the common
sampling fraction under stratified sampling. Defaults to 0.01.
Value
ContributionPercent
The contribution of each key variable to the SUDA score, calculated for each row.
score
The SUDA score.
disscore
The DIS-SUDA score.
Details
SUDA2 is a recursive algorithm for finding
Minimal Sample Uniques (MSUs). The algorithm generates all possible
subsets of the defined categorical key variables
and scans each subset for patterns that occur in only one observation. The fewer
variables needed to make an observation unique, the higher the risk of the
corresponding observation.
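The idea of a minimal sample unique can be illustrated with a small brute-force sketch in Python. This is not the recursive SUDA2 search and not this package's implementation: it simply enumerates every subset of the key variables, which is only feasible for a handful of keys. The function name `minimal_sample_uniques` and the toy data are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from collections import Counter


def minimal_sample_uniques(rows, keys):
    """For each row index, list the minimal key subsets on which the row is unique.

    Brute-force illustration only (hypothetical helper, not the SUDA2 recursion).
    """
    msus = {i: [] for i in range(len(rows))}
    # Visit subsets by increasing size so that smaller unique subsets are found first.
    for size in range(1, len(keys) + 1):
        for subset in combinations(keys, size):
            # Count how often each value pattern occurs on this subset.
            counts = Counter(tuple(r[k] for k in subset) for r in rows)
            for i, r in enumerate(rows):
                if counts[tuple(r[k] for k in subset)] != 1:
                    continue  # the row is not unique on this subset
                # Keep the subset only if no smaller MSU of this row is contained in it.
                if not any(set(m) < set(subset) for m in msus[i]):
                    msus[i].append(subset)
    return msus


# Toy data set (made up for this example) with three categorical key variables.
rows = [
    {"sex": "m", "region": "north", "age": "young"},
    {"sex": "m", "region": "north", "age": "old"},
    {"sex": "f", "region": "north", "age": "young"},
    {"sex": "f", "region": "south", "age": "old"},
]
keys = ["sex", "region", "age"]

msus = minimal_sample_uniques(rows, keys)
# Size of the smallest MSU per row; a smaller size indicates a higher risk.
smallest = {i: min(len(m) for m in found) for i, found in msus.items()}
print(msus)
print(smallest)
```

In this toy data the last row is already unique on `region` alone, so it has an MSU of size 1 and would be considered the riskiest observation, while the other rows need two variables to become unique.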
References
C. J. Skinner; M. J. Elliot (20xx)
A Measure of Disclosure Risk for Microdata. Journal of the Royal
Statistical Society: Series B (Statistical Methodology), Vol. 64 (4), pp 855--867.
M. J. Elliot, A. Manning, K. Mayes, J. Gurd and M. Bane (20xx)
SUDA: A Program for Detecting Special Uniques,
Using DIS to Modify the Classification of Special Uniques
Anna M. Manning, David J. Haglin, John A. Keane (2008)
A recursive search algorithm for statistical disclosure assessment.
Data Min Knowl Disc 16:165--196