
GaussSuppression (version 1.1.0)

SuppressLinkedTables: Consistent Suppression of Linked Tables

Description

Provides alternatives to global protection for linked tables through methods that may reduce the computational burden.

Usage

SuppressLinkedTables(
  data = NULL,
  fun,
  ...,
  withinArg = NULL,
  linkedGauss = "consistent",
  recordAware = TRUE,
  iterBackTracking = Inf,
  whenEmptyUnsuppressed = NULL,
  lpPackage = NULL
)

Value

A list of data frames, or, if withinArg is NULL, the ordinary output from fun.

Arguments

data

The data argument to fun. When NULL, data must be supplied in withinArg.

fun

A function: GaussSuppressionFromData or one of its wrappers such as SuppressSmallCounts and SuppressDominantCells.

...

Arguments to fun that are kept constant.

withinArg

A list of named lists, one per linked table, each holding the arguments to fun that vary between tables. If withinArg itself is named, those names are used as the names of the output list.
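The split between constant and varying arguments can be sketched as follows. This is a minimal illustration built from the magnitude1 data used in the Examples section; the labels geo_by_sector2 and eu_by_sector4 are hypothetical names chosen here, and SSBtoolsData() is called through the SSBtools namespace on the assumption that SSBtools is installed alongside GaussSuppression.

```r
library(GaussSuppression)

# Example data shipped with SSBtools
magnitude1 <- SSBtools::SSBtoolsData("magnitude1")

# Arguments shared by every table are passed through `...`;
# only `formula` varies, so it is the only entry in each inner list.
out <- SuppressLinkedTables(
  data = magnitude1,
  fun = SuppressDominantCells,
  withinArg = list(
    geo_by_sector2 = list(formula = ~(geo + eu) * sector2),  # hypothetical label
    eu_by_sector4  = list(formula = ~eu:sector4 - 1)         # hypothetical label
  ),
  dominanceVar = "value",
  pPercent = 10,
  contributorVar = "company",
  linkedGauss = "consistent"
)

names(out)         # per the description above, the names given in withinArg
sapply(out, nrow)  # number of rows (cells) in each output table
```

Naming the inner lists makes it easier to pick a table out of the result, for example out$eu_by_sector4.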

linkedGauss

Specifies the strategy for protecting linked tables. Possible values are:

  • "consistent" (default): All linked tables are protected by a single call to GaussSuppression(). The algorithm internally constructs a block diagonal model matrix and handles common cells consistently across tables.

  • "local": Each table is protected independently by a separate call to GaussSuppression().

  • "back-tracking": Iterative approach where each table is protected via GaussSuppression(), and primary suppressions are adjusted based on secondary suppressions from other tables across iterations.

  • "local-bdiag": Produces the same result as "local", but uses a single call to GaussSuppression() with a block diagonal matrix. It does not apply the linked-table methodology.
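One way to see what a strategy changes is to run the same linked tables under several values of linkedGauss and count the suppressed cells. The sketch below does this for "local" and "consistent". The helper run_strategy is hypothetical, and the code assumes that each output data frame carries a logical suppressed column, as the output of GaussSuppressionFromData normally does.

```r
library(GaussSuppression)

magnitude1 <- SSBtools::SSBtoolsData("magnitude1")

# Hypothetical helper: protect the same three linked tables under one strategy
run_strategy <- function(strategy) {
  SuppressLinkedTables(
    data = magnitude1,
    fun = SuppressDominantCells,
    withinArg = list(
      table_1 = list(formula = ~(geo + eu) * sector2),
      table_2 = list(formula = ~eu:sector4 - 1),
      table_3 = list(formula = ~geo + eu + sector4 - 1)
    ),
    dominanceVar = "value",
    pPercent = 10,
    contributorVar = "company",
    linkedGauss = strategy
  )
}

strategies <- c("local", "consistent")
results <- lapply(setNames(strategies, strategies), run_strategy)

# Suppressed cells per table (rows) and strategy (columns);
# assumes a logical `suppressed` column in every output data frame
n_suppressed <- sapply(results, function(tabs) {
  sapply(tabs, function(tab) sum(tab$suppressed))
})
print(n_suppressed)
```

Other documented values such as "back-tracking" can be added to strategies in the same way.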

recordAware

If TRUE (default), the suppression procedure will ensure consistency across cells that aggregate the same underlying records, even when their variable combinations differ. When TRUE, data cannot be included in withinArg.

iterBackTracking

Maximum number of back-tracking iterations.

whenEmptyUnsuppressed

Parameter to GaussSuppression. It controls the informational message "Cells with empty input will never be secondary suppressed. Extend input data with zeros?". Here the default is NULL (no message), since preprocessing of the model matrix may invalidate the assumptions behind this message.

lpPackage

Currently ignored. If specified, a warning will be issued.

Details

The new method "consistent", which has not yet been extensively tested in practice, was introduced to provide an alternative that works better than "back-tracking" while still offering equally strong protection.

Note that for singleton methods of the elimination type (see SSBtools::NumSingleton()), "back-tracking" may lead to the creation of a large number of redundant secondary cells. This is because, during the method's iterations, all secondary cells are eventually treated as primary. As a result, protection is applied to prevent a singleton contributor from inferring a secondary cell that was only included to protect that same contributor.

Note that the frequency singleton methods "subSpace", "anySum0", and "anySumNOTprimary" are currently not implemented and will result in an error. As a result, the singletonZeros parameter in the SuppressDominantCells() function cannot be set to TRUE, and the SuppressKDisclosure() function is not available for use. Also note that automatic forcing of "anySumNOTprimary" is disabled. That is, SSBtools::GaussSuppression() is called with auto_anySumNOTprimary = FALSE. See the parameter documentation for an explanation of why FALSE is required.
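The restriction on frequency singleton methods can be probed defensively. The sketch below requests "anySumNOTprimary" explicitly and captures any error instead of stopping. It assumes that singletonMethod is forwarded through ... to fun, and it deliberately makes no claim about the wording of the error message.

```r
library(GaussSuppression)

z1 <- SSBtools::SSBtoolsData("z1")
z2 <- SSBtools::SSBtoolsData("z2")

# Request a frequency singleton method that, per the note above, is not
# implemented for linked tables. tryCatch() converts an error to its message.
res <- tryCatch(
  SuppressLinkedTables(
    fun = SuppressSmallCounts,
    linkedGauss = "consistent",
    recordAware = FALSE,
    singletonMethod = "anySumNOTprimary",  # assumed to be passed on to fun
    withinArg = list(
      list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5),
      list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)
    )
  ),
  error = function(e) conditionMessage(e)
)

if (is.character(res)) {
  message("Rejected: ", res)
} else {
  message("Call succeeded; inspect `res` as usual")
}
```

Wrapping the call this way lets a batch job record the failure and continue with a supported singleton method.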

The combination of intervals with the various linked table strategies is not yet implemented, so the lpPackage parameter is currently ignored.

Examples


### The first example can be performed in three ways
### Alternatives are possible since only the formula parameter varies between the linked tables
 
# The tricks "sector4 - sector4" and "geo - geo" ensure the same names in the output
a <- SuppressLinkedTables(data = SSBtoolsData("magnitude1"),
                 fun = SuppressDominantCells,
                 withinArg = list(list(formula = ~(geo + eu) * sector2 + sector4 - sector4),
                                  list(formula = ~eu:sector4 - 1 + geo - geo),
                                  list(formula = ~geo + eu + sector4 - 1)),
                 dominanceVar = "value",
                 pPercent = 10,
                 contributorVar = "company",
                 linkedGauss = "consistent")
print(a)

# Alternatively, SuppressDominantCells() can be run directly using the linkedGauss parameter  
a1 <- SuppressDominantCells(SSBtoolsData("magnitude1"), 
               formula = list(table_1 = ~(geo + eu) * sector2, 
                              table_2 = ~eu:sector4 - 1,
                              table_3 = ~(geo + eu) + sector4 - 1), 
               dominanceVar = "value", 
               pPercent = 10, 
               contributorVar = "company", 
               linkedGauss = "consistent")
print(a1)

# In fact, tables_by_formulas() is also a possibility
a2 <- tables_by_formulas(SSBtoolsData("magnitude1"),
               table_fun = SuppressDominantCells, 
               table_formulas = list(table_1 = ~region * sector2, 
                                    table_2 = ~region1:sector4 - 1, 
                                    table_3 = ~region + sector4 - 1), 
               substitute_vars = list(region = c("geo", "eu"), region1 = "eu"), 
               collapse_vars = list(sector = c("sector2", "sector4")), 
               dominanceVar  = "value", 
               pPercent = 10, 
               contributorVar = "company",
               linkedGauss = "consistent") 
print(a2)

####  The second example cannot be handled using the alternative methods.
####  This is similar to the (old) LazyLinkedTables() example.

z1 <- SSBtoolsData("z1")
z2 <- SSBtoolsData("z2")
z2b <- z2[3:5]  # As in ChainedSuppression example 
names(z2b)[1] <- "region" 
# As 'f' and 'e' in ChainedSuppression example. 
# 'A' 'annet'/'arbeid' suppressed in b[[1]], since suppressed in b[[3]].
b <- SuppressLinkedTables(fun = SuppressSmallCounts,
              linkedGauss = "consistent",  
              recordAware = FALSE,
              withinArg = list(
                list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5), 
                list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5), 
                list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))
print(b)        
       
