Plots similarities of all variables to an outcome variable against similarities of all variables to a predictor of interest
confounderPlot(data, S, x, y, labels, method = c("associationMeasures", "distcor"),
returnS = FALSE, plotLegend = TRUE, col, pch, font, cex.text, xlim, ylim, ...)
data: data frame with variables of interest
S: similarity matrix; if missing, it is calculated from data by similarity.variables
x: name of the predictor variable of main interest (as used in data and S), for which confounders / collinearities shall be detected
y: name of the outcome variable (as used in data and S)
labels: variable names used for plotting; must be in the same order as the columns of data; if missing, names of data are used
method: method to calculate similarities: combination of association measures ('associationMeasures') or distance correlation ('distcor')
returnS: shall the similarity matrix be returned?
plotLegend: shall the (default) legend be shown, indicating categorical and continuous variables?
col: symbol and label color; by default categorical variables are shown in purple, continuous variables in black
pch: plotting symbol, default 16
font: font of plotted labels; by default names of variables x and y are shown in bold
cex.text: size of plotted labels
xlim, ylim: axis limits
...: graphical parameters passed to plot
Scatterplot of variable similarities. Chosen predictor and outcome variables are highlighted in bold. Categorical/quantitative variables are shown in purple/black by default.
The similarities of all variables in a dataset to two variables of special interest (i.e. the predictor and outcome of a regression model) are visualized simultaneously in a scatter plot, where the x-axis shows similarities to the predictor and the y-axis shows similarities to the outcome. The height of the predictor variable's point indicates its association with the outcome and hence its predictive ability. Variables in the upper right part are potential confounders for which the prediction model should be adjusted, or collinear variables that should be removed. Variables in the lower right part are strongly related to the predictor but not associated with the outcome. Variables very close to the outcome variable's point are potential surrogate outcomes.
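The layout described above can be illustrated with a minimal base R sketch. It is not the implementation of confounderPlot: it assumes purely numeric data and uses the absolute Pearson correlation as a simplified stand-in for the similarity measure, and the variable names (pred, out, conf, noise) and the simulated data are hypothetical.

```r
# Simplified stand-in: absolute Pearson correlation as the similarity.
# This is NOT the measure used by confounderPlot and only suits numeric data.
set.seed(1)
n <- 200
conf  <- rnorm(n)                    # hypothetical confounder
pred  <- conf + rnorm(n, sd = 0.5)   # predictor, driven by the confounder
out   <- conf + rnorm(n, sd = 0.5)   # outcome, also driven by the confounder
noise <- rnorm(n)                    # variable unrelated to everything else
toy <- data.frame(pred, out, conf, noise)

S_toy <- abs(cor(toy))               # symmetric similarity matrix

# x-axis: similarity to the predictor; y-axis: similarity to the outcome
sim_x <- S_toy[, "pred"]
sim_y <- S_toy[, "out"]

plot(sim_x, sim_y, type = "n", xlim = c(0, 1.1), ylim = c(0, 1.1),
     xlab = "similarity to pred", ylab = "similarity to out")
# predictor and outcome labels in bold (font = 2), all others plain
text(sim_x, sim_y, labels = names(sim_x),
     font = ifelse(names(sim_x) %in% c("pred", "out"), 2, 1))
```

In this toy setting conf should land in the upper right part of the plot (similar to both pred and out), while noise should stay near the lower left corner.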
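# NOT RUN {
library(CluMix)  # assumed: package providing confounderPlot and mixdata
data(mixdata)
# similarities to predictor X2.quant (x-axis) vs. outcome X1.cat (y-axis)
confounderPlot(mixdata, x = "X2.quant", y = "X1.cat")
# }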