confounderPlot: Confounder Plot

Description

Plots the similarity of each variable to an outcome variable against its similarity to a predictor of interest.

Usage

confounderPlot(data, S, x, y, labels, method = c("associationMeasures", "distcor"), 
returnS = FALSE, plotLegend = TRUE, col, pch, font, cex.text, xlim, ylim, ...)

Arguments

data

data frame with variables of interest

S

similarity matrix; if missing, it is calculated from data by similarity.variables

x

name of the predictor variable (as used in data and S) of main interest, for which confounders or collinear variables are to be detected

y

name of the outcome variable (as used in data and S)

labels

variable names used for plotting; must be in the same order as the columns of data; if missing, the column names of data are used

method

method to calculate similarities: combination of association measures ('associationMeasures') or distance correlation ('distcor')

returnS

should the similarity matrix be returned?

plotLegend

should the default legend, which distinguishes categorical from continuous variables, be shown?

col

symbol and label color; by default categorical variables are shown in purple, continuous variables in black

pch

plotting symbol, default 16

font

font of plotted labels; by default names of variables x and y are shown in bold

cex.text

size of plotted labels

xlim, ylim

axis limits

...

graphical parameters passed to plot

Value

A scatterplot of variable similarities. The chosen predictor and outcome variables are highlighted in bold. By default, categorical variables are shown in purple and continuous variables in black. If returnS = TRUE, the similarity matrix is returned.

Details

The similarities of all variables in a dataset to two variables of special interest (i.e. the predictor and the outcome of a regression model) are visualized together in a scatterplot, where the x-axis shows similarities to the predictor and the y-axis shows similarities to the outcome. The height of the predictor variable's point indicates its association with the outcome and hence its predictive ability. Variables in the upper right part are potential confounders for which the prediction model should be adjusted, or collinear variables that should be removed. Variables in the lower right part are strongly related to the predictor but not associated with the outcome. Variables very close to the outcome variable's point are potential surrogate outcomes.
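The layout described above can be reproduced by hand to see how the plot is read. The following is a minimal base R sketch, not the package's implementation: it assumes purely continuous toy data and uses the absolute Spearman correlation as a stand-in similarity, whereas confounderPlot uses association measures or distance correlation. The variable names conf and noise are hypothetical.

```r
# Conceptual sketch only: absolute Spearman correlation stands in for the
# package's similarity measure, so values will differ from confounderPlot().
set.seed(1)
n <- 200
conf  <- rnorm(n)                      # hypothetical confounder
x     <- conf + rnorm(n, sd = 0.7)     # predictor of interest, driven by conf
y     <- x + conf + rnorm(n)           # outcome, driven by x and conf
noise <- rnorm(n)                      # unrelated variable

toy <- data.frame(x, y, conf, noise)

# Stand-in similarity matrix (assumption, not the CluMix measure)
S <- abs(cor(toy, method = "spearman"))

sim_x <- S[, "x"]   # similarity of every variable to the predictor
sim_y <- S[, "y"]   # similarity of every variable to the outcome

plot(sim_x, sim_y, xlim = c(0, 1), ylim = c(0, 1), pch = 16,
     xlab = "similarity to predictor x", ylab = "similarity to outcome y")

# Label the points; predictor and outcome in bold (font = 2)
text(sim_x, sim_y, labels = names(sim_x), pos = 1,
     font = ifelse(names(sim_x) %in% c("x", "y"), 2, 1))
```

In this toy setup, conf should land in the upper right (related to both predictor and outcome), while noise should sit near the lower left.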

Examples

# NOT RUN {
data(mixdata)

confounderPlot(mixdata, x="X2.quant", y="X1.cat")
# }
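
The S and returnS arguments described above suggest two ways to avoid recomputing similarities. The sketch below is untested and assumes that similarity.variables can be called on mixdata with its default settings; the call is skipped when CluMix is not installed.

```r
# Sketch only: requires the CluMix package; skipped if it is not installed.
has_clumix <- requireNamespace("CluMix", quietly = TRUE)

if (has_clumix) {
  data("mixdata", package = "CluMix")

  # Option 1: compute the similarity matrix once and pass it in via S
  # (assumes similarity.variables() works on mixdata with default settings)
  S <- CluMix::similarity.variables(mixdata)
  CluMix::confounderPlot(mixdata, S = S, x = "X2.quant", y = "X1.cat")

  # Option 2: let confounderPlot() compute the matrix and return it
  S2 <- CluMix::confounderPlot(mixdata, x = "X2.quant", y = "X1.cat",
                               returnS = TRUE)
}
```

Passing a precomputed S is likely to pay off mainly when several predictor and outcome pairs are inspected on the same data.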