"bal.tab"(x, treat, data = NULL, weights = NULL,  distance = NULL, subclass = NULL, method, int = FALSE, addl = NULL, continuous = c("std", "raw"),  binary = c("raw","std"), s.d.denom = c("treated",  "pooled", "control"), m.threshold = NULL,  v.threshold = NULL, r.threshold = NULL, un = FALSE,  disp.means = FALSE,  disp.v.ratio = FALSE, disp.subclass = FALSE, cluster = NULL, which.cluster = NULL, cluster.summary = TRUE,  quick = FALSE, ...)
"bal.tab"(formula, data, weights = NULL, distance = NULL,  subclass = NULL, method, int = FALSE, addl = NULL,  continuous = c("std", "raw"), binary = c("raw", "std"), s.d.denom = c("treated", "pooled", "control"),  m.threshold = NULL, v.threshold = NULL, r.threshold = NULL, un = FALSE,  disp.means = FALSE, disp.v.ratio = FALSE,  disp.subclass = FALSE, cluster = NULL, which.cluster = NULL,  cluster.summary = TRUE, quick = FALSE, ...)data.
A formula with the treatment variable as the response and the covariates for which balance is to be assessed as the terms. All variables named in the formula must be present as variable names in data.
For the data frame method: Optional; a data frame containing variables with the names used in treat, weights, distance, and/or subclass, if any. For the formula method: Required; a data frame containing all covariates named in formula and variables with the names used in weights, distance, and/or subclass, if any.
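Optional; either a vector containing weights for each unit or the name of the weights variable in data.  These can be weights generated by, e.g., inverse probability weighting or matching weights resulting from a matching algorithm.  Which kind of weights these are must be indicated with method.  If weights = NULL and subclass = NULL, balance information will be presented only for the unadjusted sample.
Optional; either a vector containing distance values (e.g., propensity scores) for each unit or the name of the distance variable in data.
Optional; either a vector containing subclass membership for each unit or the name of the subclass variable in data.  If weights = NULL and subclass = NULL, balance information will be presented only for the unadjusted sample.
A character string naming the method of adjustment, if any.  If weights are specified, method should be either "matching" or "weighting"; "weighting" is the default.  If subclass is specified, "subclassification" is the default.  Abbreviations allowed.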
logical; whether or not to include 2-way interactions of covariates included in covs and in addl.
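A data frame of additional covariates for which to present balance, beyond those included in covs.  In general, it makes more sense to include all desired variables in covs than in addl.  See the note in Details on using addl.
The denominator used when standardizing mean differences: the standard deviation of the treated group ("treated"), the pooled standard deviation ("pooled"), or the standard deviation of the control group ("control").  The default is "treated".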
logical; whether to print statistics for the unadjusted sample as well as for the adjusted sample.  If weights = NULL and subclass = NULL, un will be set to TRUE.  
logical; whether to print the group means in balance output.
logical; whether to display variance ratios in balance output.
logical; whether to display balance information for individual subclasses if subclassification is used in conditioning.
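Optional; either a vector containing cluster membership for each unit or the name of the cluster variable in data.
Which clusters to display if cluster is specified. If NULL, all clusters in cluster will be displayed. If NA, no clusters will be displayed. Otherwise, can be a vector of cluster names or numerical indices for which to display balance. Indices correspond to the alphabetical order of cluster names.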
logical; whether to display the cluster summary table if cluster is specified. If which.cluster is NA, cluster.summary will be set to TRUE.
logical; if TRUE, will not compute any values that will not be displayed. Leave as FALSE if values that are computed but not displayed will be needed later.
An object of class "bal.tab" containing balance summaries for the data object.  If subclassification is not used, the following are the elements of bal.tab:
If clusters are specified, an object of class "bal.tab.cluster" containing balance summaries within each cluster and a summary of balance across clusters. Each balance summary is a balance table as described in Balance above. The summary of balance across clusters displays the mean, median, and maximum mean difference and variance ratio after adjustment for each covariate across clusters. Minimum statistics are calculated as well, but not displayed. To see these, use the options in print.bal.tab.cluster.
If subclassification is used, the following are the elements of bal.tab:
If treatment is continuous, means, mean differences, and variance ratios are replaced by (weighted) Pearson correlations between each covariate and treatment. The r.threshold argument works the same as m.threshold or v.threshold, adding an extra column to the balance table output and creating additional summaries for balance tallies and maximum imbalances. All arguments related to the calculation or display of mean differences or variance ratios are ignored. The int, addl, un, and cluster arguments are still used as described above.
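As a rough illustration of the continuous-treatment case, the sketch below computes a weighted Pearson correlation between one covariate and a continuous treatment using base R's cov.wt(), then compares it with a threshold. The helper weighted_cor(), the simulated data, and the random weights are assumptions made for this example; cobalt's internal calculation may differ.

```r
# Hypothetical helper (not part of cobalt): weighted Pearson correlation
# between a covariate x and a continuous treatment, via stats::cov.wt().
weighted_cor <- function(x, treat, w = rep(1, length(x))) {
  cov.wt(cbind(x, treat), wt = w, cor = TRUE)$cor[1, 2]
}

# Simulated data, for illustration only
set.seed(1)
dose <- rnorm(50)               # continuous treatment
age  <- 0.5 * dose + rnorm(50)  # covariate related to treatment

r_unweighted <- weighted_cor(age, dose)
r_weighted   <- weighted_cor(age, dose, w = runif(50))  # arbitrary weights

# Flag the covariate against a correlation threshold, as r.threshold would
r.threshold <- 0.1
within_threshold <- abs(r_weighted) < r.threshold
```

With equal weights the helper reproduces cor(age, dose), which is a quick way to check it before supplying real weights.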
bal.tab.data.frame() generates a list of balance summaries for the data frame of covariates and treatment status values given.  bal.tab.formula() does the same but uses a formula interface instead.  When the formula interface is used, the formula and data are reshaped into a treatment vector and data frame of covariates and then simply passed through the data frame method.  bal.tab() behaves differently depending on whether subclasses are used in conditioning or not.  If they are used, bal.tab creates balance statistics for each subclass and for the sample in aggregate.  If weights are specified, subclass will be ignored unless method is specified as "subclassification".
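The reshaping of a formula and data into a treatment vector and a data frame of covariates can be pictured with base R's model.frame(). This is only a sketch of the idea on a small made-up data frame; the object names are hypothetical and cobalt's own code may handle this differently.

```r
# Toy data, made up for illustration
df <- data.frame(treat = c(1, 0, 1, 0),
                 age   = c(23, 35, 41, 29),
                 educ  = c(12, 16, 10, 14))

f  <- treat ~ age + educ
mf <- model.frame(f, data = df)

# Response of the formula: the treatment vector
treat_vec <- model.response(mf)

# Remaining columns: the covariates, kept as a data frame
covs_df <- mf[, -1, drop = FALSE]
```

The pair (covs_df, treat_vec) is the kind of input the data frame method expects as its first two arguments.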
The display arguments of bal.tab (un, disp.means, disp.v.ratio, and disp.subclass) affect display only; they are passed directly to print.bal.tab or print.bal.tab.subclass, and do not affect any calculations or the contents of the bal.tab object.  Unless quick = TRUE, all balance statistics are calculated whether they are displayed by print or not.
The threshold values (m.threshold, v.threshold, and r.threshold) control whether extra columns should be inserted into the Balance table describing whether the balance statistics in question exceeded or were within the threshold.  Including these thresholds also creates summary tables tallying the number of variables that exceeded and were within the threshold and displaying the variables with the greatest imbalance on that balance measure.  When subclassification is used, the extra threshold columns are placed within the balance tables for each subclass as well as in the aggregate balance table, and the summary tables display balance for each subclass.
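To make the threshold logic concrete, the following base R sketch computes a standardized mean difference for two covariates, flags each against a value of m.threshold, tallies the flags, and picks out the covariate with the greatest imbalance. The helper std_diff(), the toy data, the column names, and the pooling rule used for "pooled" are assumptions for illustration, not cobalt's implementation.

```r
# Hypothetical helper (not part of cobalt): standardized mean difference
# for one covariate, with the denominator chosen as in s.d.denom.
std_diff <- function(x, treat,
                     s.d.denom = c("treated", "pooled", "control")) {
  s.d.denom <- match.arg(s.d.denom)
  x1 <- x[treat == 1]
  x0 <- x[treat == 0]
  denom <- switch(s.d.denom,
                  treated = sd(x1),
                  control = sd(x0),
                  # assumed pooling rule: root of the average variance
                  pooled  = sqrt((var(x1) + var(x0)) / 2))
  (mean(x1) - mean(x0)) / denom
}

# Toy data, made up for illustration
toy <- data.frame(treat = c(1, 1, 1, 0, 0, 0, 0, 0),
                  v1    = c(2, 4, 6, 1, 2, 3, 4, 5),
                  v2    = c(3, 3.1, 2.9, 3, 3.2, 2.8, 3.1, 2.9))

m.threshold <- 0.1
diffs  <- sapply(toy[c("v1", "v2")], std_diff,
                 treat = toy$treat, s.d.denom = "treated")
within <- abs(diffs) < m.threshold

# Extra column in the balance table, tally, and greatest imbalance
balance <- data.frame(Diff = diffs, Within.Threshold = within)
tally   <- table(factor(within, levels = c(TRUE, FALSE),
                        labels = c("within", "exceeded")))
worst   <- names(diffs)[which.max(abs(diffs))]
```

Here v1 exceeds the threshold and v2 falls within it, so the tally has one covariate in each category and worst is "v1".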
The input to addl must be a data frame.  If more than one variable is included, this is straightforward, because data[,c("v1", "v2")] is already a data frame and nothing more needs to be done.  If only one variable is used (e.g., data[,"v1"]), R will coerce it to a vector, making it unfit for input to addl.  To avoid this, wrap the input to addl in data.frame() or use subset() when only one variable is to be added.  It is recommended to include all desired variables in formula or covs rather than specifying additional variables using addl.
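The coercion described above can be checked directly in base R. The data frame dat below is made up for illustration; the last line uses drop = FALSE, a standard base R alternative that is not mentioned in the text above.

```r
# Toy data, made up for illustration
dat <- data.frame(v1 = 1:3, v2 = c(2.5, 3.5, 4.5))

two_vars   <- dat[, c("v1", "v2")]               # stays a data frame
one_var    <- dat[, "v1"]                        # coerced to a vector
wrapped    <- data.frame(v1 = dat[, "v1"])       # wrapped back into a data frame
via_subset <- subset(dat, select = v1)           # subset() keeps a data frame
via_drop   <- dat[, "v1", drop = FALSE]          # base R alternative

sapply(list(two_vars = two_vars, one_var = one_var, wrapped = wrapped,
            via_subset = via_subset, via_drop = via_drop),
       is.data.frame)
```

Only one_var fails the is.data.frame() check; any of the other single-variable forms would be suitable input for addl.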
See bal.tab for details of calculations.
data("lalonde", package = "cobalt")
## Propensity score weighting using IPTW
glm1 <- glm(treat ~ age + educ + black + hispan, data = lalonde, 
            family = "binomial")
lalonde$distance <- glm1$fitted.values
lalonde$iptw.weights <- ifelse(lalonde$treat==1, 
                               1/lalonde$distance, 
                               1/(1-lalonde$distance))
covariates <- subset(lalonde, 
                     select = c(age, educ, black, hispan))
# data frame interface:
bal.tab(covariates, treat = "treat", data = lalonde, 
      weights = "iptw.weights", method = "weighting", 
      s.d.denom = "pooled")
# Formula interface:
bal.tab(treat ~ age + educ + black + hispan, data = lalonde, 
      weights = "iptw.weights", method = "weighting", 
      s.d.denom = "pooled")