add_risk_diff can only be attached to a count layer, so the count layer must be constructed
first. add_risk_diff compares the difference between treatment groups, so all
comparisons must be based on values within the treat_var specified in your
tplyr_table object.
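As a minimal sketch of that ordering, the example below builds the count layer first and only then attaches the risk difference. It assumes the built-in mtcars data as a stand-in for real trial data, with gear playing the role of treat_var. The native pipe requires R 4.1 or later.

```r
library(Tplyr)

# gear is the treat_var, so comparisons must use its values (3, 4, 5)
t <- tplyr_table(mtcars, gear)

# Construct the count layer first, then attach the risk difference to it
l <- group_count(t, carb) |>
  add_risk_diff(
    c('4', '3'),  # 4 - 3
    c('5', '3')   # 5 - 3
  )

# Layers are evaluated at build time; expect prop.test warnings on data this small
out <- build(add_layers(t, l))
```

The built data frame should carry one additional risk difference column per comparison alongside the usual count columns.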
Comparisons are specified by providing two-element character vectors. You can provide as many of
these comparisons as you want. You can also use groups that have been constructed using
add_treat_grps or add_total_group. The first element provided will be considered
the 'comparison' group (i.e. the left side of the comparison), and the second element will be considered
the 'reference' group. So if you'd like to see the risk difference of 'T1 - Placebo', you would specify
this as c('T1', 'Placebo').
Tplyr forms your two-way table in the background, and then runs prop.test appropriately.
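To make the underlying calculation concrete, here is a base R sketch of a single comparison run through prop.test directly. The counts are hypothetical, and how Tplyr assembles its two-way table internally is not shown here; this only illustrates what prop.test returns for two groups.

```r
# Hypothetical counts: subjects with the event and total subjects per group.
# The first entry is the left side of the comparison (T1 - Placebo).
events <- c(T1 = 30, Placebo = 18)
totals <- c(T1 = 100, Placebo = 100)

res <- prop.test(x = events, n = totals)

comp_prop <- unname(res$estimate[1])  # proportion in the left hand side group
ref_prop  <- unname(res$estimate[2])  # proportion in the right hand side group
diff_prop <- comp_prop - ref_prop     # risk difference
ci        <- res$conf.int             # interval for the difference (95% by default)

sprintf("%.3f (%.3f, %.3f)", diff_prop, ci[1], ci[2])
```

For two groups, the confidence interval reported by prop.test is the interval for the difference of the first proportion minus the second, which is why the order of the two elements matters.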
Similar to the way that the display of layers is specified, the values and format of the
risk difference display are set using set_format_strings. Risk difference formats are set within
set_format_strings by using the name 'riskdiff'.
You have 5 variables to choose from in your data presentation:
- comp
Probability of the left hand side group (i.e. comparison)
- ref
Probability of the right hand side group (i.e. reference)
- dif
Difference of comparison - reference
- low
Lower end of the confidence interval (default is 95%, override with the args parameter)
- high
Upper end of the confidence interval (default is 95%, override with the args parameter)
Use these variable names when forming your f_str objects. The default presentation, if no
string format is specified, will be:
f_str('xx.xxx (xx.xxx, xx.xxx)', dif, low, high)
Note - within Tplyr, you can account for negative values by allowing an extra space in the integer
portion of your format (e.g. 'xx.xxx' rather than 'x.xxx'). This will help with your alignment.
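The sketch below shows one way these pieces might be combined: a custom 'riskdiff' format and a confidence level overridden through args. It again assumes mtcars as stand-in data with gear as treat_var; the particular format strings and prop.test arguments are illustrative choices, not recommendations.

```r
library(Tplyr)

t <- tplyr_table(mtcars, gear)

l <- group_count(t, carb) |>
  add_risk_diff(
    c('4', '3'),
    # Passed through to prop.test: 90% interval, no continuity correction
    args = list(conf.level = 0.9, correct = FALSE)
  ) |>
  set_format_strings(
    'n_counts' = f_str('xx (xx.x)', n, pct),
    # Two integer positions leave room for a minus sign
    'riskdiff' = f_str('xx.xxx, xx.xxx, xx.xxx (xx.xxx, xx.xxx)', comp, ref, dif, low, high)
  )

out <- build(add_layers(t, l))
```

Any subset of comp, ref, dif, low, and high can be used in the 'riskdiff' f_str, in whatever order the display calls for.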
If columns are specified on a Tplyr table, risk difference comparisons still only take place between
groups within the treat_var variable, but they are calculated treating the cols
variables as by variables. Just like the Tplyr layers themselves, the risk differences are then transposed
so that each one is displayed as a separate variable for each value of the cols variables.
If distinct is TRUE (the default), all calculations will take place on the distinct counts, if
they are available. Otherwise, non-distinct counts will be used.
One final note - prop.test may throw quite a few warnings. This is expected, because it
alerts you when there isn't enough data for its approximations to be reliable. This may be unnerving
coming from a SAS programming world, but it is simply R alerting you that the values provided
don't have enough data behind them to be statistically accurate.
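The base R sketch below reproduces this kind of warning with hypothetical sparse counts, and shows that a result is still returned alongside it. Collecting the messages with withCallingHandlers is just one option for reviewing them; it is not something Tplyr does for you.

```r
# Hypothetical small counts: 1 of 5 vs. 2 of 6 subjects with the event
events <- c(1, 2)
totals <- c(5, 6)

# Collect warning messages instead of printing them, but keep the result
msgs <- character(0)
res <- withCallingHandlers(
  prop.test(x = events, n = totals),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

msgs          # the approximation warning raised for sparse data
res$estimate  # the proportions are still computed
res$conf.int  # as is the interval for their difference
```

Reviewing the collected messages, rather than suppressing them outright, keeps the signal that the affected rows rest on very little data.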