simPop (version 0.2.6)

simCategorical: Simulate categorical variables of population data

Description

Simulate categorical variables of population data. The household structure of the population data needs to be simulated beforehand.

Usage

simCategorical(synthPopObj, additional,
  method=c("multinom", "distribution", "naivebayes"),
  limit=NULL, censor=NULL, maxit=500, MaxNWts=1500, eps=NULL, nr_cpus=NULL, seed=1)

Arguments

synthPopObj
a synthPopObj containing population and household survey data and, optionally, margins in standardized format.
additional
a character vector specifying additional categorical variables available in the sample object of synthPopObj that should be simulated for the population data.
method
a character string specifying the method to be used for simulating the additional categorical variables. Accepted values are "multinom" (estimation of the conditional probabilities using multinomial log-linear models and random draws from the
limit
if method is "multinom", this can be used to account for structural zeros. If only one additional variable is requested, a named list of lists should be supplied. The names of the list components specify the predictor variables f
censor
if method is "multinom", this can be used to account for structural zeros. If only one additional variable is requested, a named list of lists or data.frames should be supplied. The names of the list components speci
maxit, MaxNWts
control parameters to be passed to multinom and nnet. See the help file for nnet.
eps
a small positive numeric value, or NULL (the default). In the former case and if method is "multinom", estimated probabilities smaller than this are assumed to result from structural zeros and are set to exactly 0.
nr_cpus
if specified, an integer number defining the number of cpus that should be used for parallel processing.
seed
optional; an integer value to be used as the seed of the random number generator, or an integer vector containing the state of the random number generator to be restored.
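
The effect of the eps argument can be illustrated with a small base-R sketch. This is not simPop's internal code: the probability vector and the threshold value are made up for illustration, and only the thresholding rule itself (probabilities below eps are set to exactly 0) comes from the argument description above.

```r
# Illustrative only: hypothetical estimated probabilities for three categories.
probs <- c(a = 0.70, b = 0.2995, c = 0.0005)
eps <- 0.001  # assumed threshold, chosen for this example

# Probabilities smaller than eps are treated as structural zeros.
probs[probs < eps] <- 0

# sample() rescales the remaining weights internally, so category "c"
# can no longer be drawn.
set.seed(1)
draws <- sample(names(probs), size = 1000, replace = TRUE, prob = probs)
table(draws)
```

With eps left at NULL, no thresholding takes place and a category with a very small estimated probability may still be drawn occasionally.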

Value

  • An object of class synthPopObj containing survey data as well as the simulated population data including the categorical variables specified by argument additional.

Details

The number of CPUs is selected automatically as follows. By default, the number of CPUs used equals the number of strata. However, if fewer CPUs are available than there are strata, the number of available CPUs minus one is used. This should be the best strategy, but the user can override it with nr_cpus.
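
The selection rule described above can be sketched in a few lines of R. The helper choose_cpus is hypothetical and not part of simPop. It assumes that "the number of cpus" in the second case refers to the CPUs available on the machine, as reported by parallel::detectCores().

```r
# Hypothetical helper (not a simPop function) mirroring the documented rule.
choose_cpus <- function(n_strata, n_available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one CPU.
  if (is.na(n_available)) n_available <- 1L
  if (n_available >= n_strata) {
    # Enough CPUs: one per stratum.
    as.integer(n_strata)
  } else {
    # Fewer CPUs than strata: leave one CPU free, but use at least one.
    max(1L, as.integer(n_available) - 1L)
  }
}

choose_cpus(n_strata = 9, n_available = 4)  # fewer CPUs than strata
choose_cpus(n_strata = 3, n_available = 8)  # one CPU per stratum
```

Passing an explicit value for nr_cpus bypasses any such automatic choice.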

See Also

simStructure, simRelation, simContinuous, simComponents

Examples

library(simPop) # provides eusilcS and the functions used below
data(eusilcS) # load sample data
inp <- specifyInput(data=eusilcS, hhid="db030", hhsize="hsize", strata="db040", weight="db090")
## in the following, nr_cpus is selected automatically
synthPop <- simStructure(data=inp, method="direct", basicHHvars=c("age", "rb090"), nr_cpus=NULL)
synthPop <- simCategorical(synthPop, additional=c("pl030", "pb220a"), method="multinom")
summary(synthPop)