GaussSuppression: Secondary suppression by Gaussian elimination

Description

The secondary suppression candidates (columns in x) are used sequentially to reduce the x-matrix by Gaussian elimination. Candidates that completely eliminate one or more primary suppressed cells (columns in x) are omitted and made secondary suppressed. This ensures that the primary suppressed cells do not depend linearly on the non-suppressed cells. How to order the input candidates is an important choice. The singleton problem and the related problem of zeros are also handled.
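The linear-dependence property described above can be checked directly on a small table. The following is a minimal base R sketch, not the elimination algorithm used by GaussSuppression itself. The helper name is_revealed and the 2 x 2 table are assumptions made for illustration. A primary cell counts as revealed when adding its column to the published columns does not increase the matrix rank.

```r
# Rows of x are the four inner cells of a 2 x 2 table; columns are the
# nine cells to be published (inner cells, row totals, column totals, total).
x <- cbind(
  a1b1 = c(1, 0, 0, 0), a1b2 = c(0, 1, 0, 0),
  a2b1 = c(0, 0, 1, 0), a2b2 = c(0, 0, 0, 1),
  a1 = c(1, 1, 0, 0), a2 = c(0, 0, 1, 1),
  b1 = c(1, 0, 1, 0), b2 = c(0, 1, 0, 1),
  total = c(1, 1, 1, 1)
)

# Hypothetical helper: TRUE for each primary column that is a linear
# combination of the columns that remain published.
is_revealed <- function(x, primary, suppressed) {
  published <- setdiff(seq_len(ncol(x)), suppressed)
  xPub <- x[, published, drop = FALSE]
  rankPub <- qr(xPub)$rank
  vapply(primary, function(j) {
    qr(cbind(xPub, x[, j]))$rank == rankPub
  }, logical(1))
}

# Only the primary cell a1b1 is suppressed: a1 - a1b2 recovers it.
revealed_before <- is_revealed(x, primary = 1, suppressed = 1)

# All four inner cells are suppressed: a1b1 is no longer in the span
# of the published margins.
revealed_after <- is_revealed(x, primary = 1, suppressed = 1:4)

print(c(before = revealed_before, after = revealed_after))
```

Suppressing all inner cells is used here only because it makes the check easy to verify by hand; it is not claimed to be the set GaussSuppression would return for this table.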

Usage

GaussSuppression(
  x,
  candidates = 1:ncol(x),
  primary = NULL,
  forced = NULL,
  hidden = NULL,
  singleton = rep(FALSE, NROW(x)),
  singletonMethod = "anySum",
  printInc = TRUE,
  ...
)

Arguments

x

Matrix that relates cells to be published or suppressed to inner cells: yPublish = crossprod(x, yInner)
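As a small illustration of this relation, the base R sketch below uses made-up values for three inner cells and a four-column x (the three inner cells plus their total). The names yInner and yPublish follow the formula above; the numbers are assumptions.

```r
# Three inner cells; published cells are the inner cells and their total.
x <- cbind(diag(3), total = 1)

# Illustrative inner cell values.
yInner <- c(2, 5, 8)

# Each published value is the sum of the inner cells flagged in its column.
yPublish <- crossprod(x, yInner)
print(as.vector(yPublish))
```

Each column of x therefore describes one published cell, which is why candidates, primary, forced and hidden are all given as column indices.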

candidates

Indices of candidates for secondary suppression

primary

Indices of primary suppressed cells

forced

Indices forced to be not suppressed

hidden

Indices to be removed from the above candidates input (see details)

singleton

Logical vector specifying inner cells for singleton handling. Normally, this means cells with 1s when 0s are non-suppressed and cells with 0s when 0s are suppressed.

singletonMethod

Method for handling the problem of singletons and zeros: "anySum" (default), "subSum", "subSpace" or "none" (see details).

printInc

Printing "..." to console when TRUE

...

Extra unused parameters

Value

Secondary suppression indices

Details

It is possible to specify too many (all) indices as candidates. Indices specified as primary or hidden will be removed. Hidden indices (not candidates or primary) refer to cells that will not be published, but do not need protection.

The singleton method "subSum" makes new imaginary primary suppressed cells, which are the sum of the singletons within each group. The "subSpace" method is conservative and ignores the singleton dimensions when looking for linear dependency. The default method, "anySum", is between the other two. Instead of making imaginary cells of sums within groups, the aim is to handle all possible sums, also across groups.

In addition, "subSumSpace" and "subSumAny" are possible methods, primarily for testing. These methods are similar to "subSpace" and "anySum", and additional cells are created as in "subSum". It is believed that the extra cells are redundant.

Examples


# NOT RUN {
library(SSBtools)  # provides FormulaSums() and GaussSuppression()

# Input data
df <- data.frame(values = c(1, 1, 1, 5, 5, 9, 9, 9, 9, 9, 0, 0, 0, 7, 7), 
                 var1 = rep(1:3, each = 5), 
                 var2 = c("A", "B", "C", "D", "E"), stringsAsFactors = FALSE)

# Make output data frame and x 
fs <- FormulaSums(df, values ~ var1 * var2, crossTable = TRUE, makeModelMatrix = TRUE)
x <- fs$modelMatrix
datF <- data.frame(fs$crossTable, values = as.vector(fs$allSums))

# Add primary suppression 
datF$primary <- datF$values
datF$primary[datF$values < 5 & datF$values > 0] <- NA
datF$suppressedA <- datF$primary
datF$suppressedB <- datF$primary
datF$suppressedC <- datF$primary

# Default candidate order: zero secondary suppressed
datF$suppressedA[GaussSuppression(x, primary = is.na(datF$primary))] <- NA

# Zeros placed first in the candidate ordering, so zero is not secondary suppressed
datF$suppressedB[GaussSuppression(x, c(which(datF$values == 0), which(datF$values > 0)), 
                            primary = is.na(datF$primary))] <- NA

# with singleton
datF$suppressedC[GaussSuppression(x, c(which(datF$values == 0), which(datF$values > 0)), 
                            primary = is.na(datF$primary), singleton = df$values == 1)] <- NA

datF

# }
