rankSwap: Rank Swapping

Description

Rank swapping ranks the values of a numeric variable and then swaps each ranked value with another ranked value chosen at random within a restricted range.

Usage

rankSwap(data, variables, TopPercent = 5, BottomPercent = 5, K0 = -1, R0 = 0.95, P = 0, missing = -999, seed = NULL)

Arguments

data

matrix or data frame

variables

Names or indices of the variables to which rank swapping is applied.

TopPercent

Percentage of the largest values that are grouped together before rank swapping is applied.

BottomPercent

Percentage of the lowest values that are grouped together before rank swapping is applied.

K0

Subset-mean preservation factor. Preserves the means before and after rank swapping within a range based on K0.

R0

Multivariate preservation factor. Preserves the correlation between variables within a certain range based on the given constant R0.

P

Rank range as a percentage of the total sample size.

missing

Missing value code.

seed

Seed for the random number generator.

Value

The rank-swapped data set.

Details

Rank swapping sorts the values of one numeric variable by their numerical values (ranking). Each ranked value is then swapped with another ranked value chosen at random within a restricted range. The restricted range is determined by the ranks of the two swapped values, which by definition cannot differ by more than P percent of the total number of observations.
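The procedure described above can be illustrated with a minimal base R sketch for a single numeric variable. This is not the implementation behind rankSwap: the function name rank_swap_sketch and its arguments x and p are hypothetical, and the sketch leaves out TopPercent, BottomPercent, K0, R0 and missing value handling. It only shows how a swap partner is drawn at random from the ranks that lie at most p percent of the sample size away.

```r
# Illustrative univariate rank swap; NOT the package's implementation.
# `rank_swap_sketch`, `x` and `p` are hypothetical names used for this sketch.
rank_swap_sketch <- function(x, p = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(x)
  ord <- order(x)                           # ord[r] = position of the value with rank r
  max_dist <- max(1L, floor(n * p / 100))   # largest allowed rank difference
  swapped <- logical(n)                     # marks ranks that have already been swapped
  out <- x
  for (r in seq_len(n)) {
    if (swapped[r]) next
    upper <- min(n, r + max_dist)
    candidates <- if (r < upper) (r + 1L):upper else integer(0)
    candidates <- candidates[!swapped[candidates]]
    if (length(candidates) == 0L) next      # no partner left; the value stays in place
    # Index explicitly: sample(v, 1) does not behave as wanted when length(v) == 1
    partner <- candidates[sample.int(length(candidates), 1L)]
    i <- ord[r]
    j <- ord[partner]
    out[i] <- x[j]                          # exchange the two values
    out[j] <- x[i]
    swapped[c(r, partner)] <- TRUE
  }
  out
}

# Usage on simulated data: with 200 observations and p = 5,
# swapped values are at most 10 ranks apart.
x <- rnorm(200)
y <- rank_swap_sketch(x, p = 5, seed = 42)
```

Because values are only exchanged between records, the sketch leaves the marginal distribution of the variable unchanged, while the value held by an individual record moves by a bounded number of ranks.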

References

Moore, R., Jr. (1996). Controlled data-swapping techniques for masking public use microdata. U.S. Bureau of the Census, Statistical Research Division Report Series, RR 96-04.

Examples


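library(sdcMicro)  # provides rankSwap() and the testdata example data set
data(testdata)
# Apply rank swapping to four numeric variables, using the default settings
data_swap <- rankSwap(testdata, variables = c("age", "income", "expend", "savings"))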
