arrangeC: Make a list of variable pairings for the condition-selecting plots produced by plotxc

Description

This function arranges a number of variables in pairs, ordered by their bivariate relationships. The goal is to discover which variable pairings are most helpful in avoiding extrapolations when exploring the data space. Variable pairs with strong bivariate dependencies (not necessarily linear) are chosen first. The bivariate dependency is measured using savingby2d. Each variable appears in the output only once.

Usage

arrangeC(data, method = "default")

Arguments

data

A data frame.

method

A character string naming the method used to measure bivariate dependency; passed to savingby2d.

Value

A list containing character vectors giving variable pairings.

Details

If data is large enough to make arrangeC very slow, a random sample of rows is used instead. The bivariate dependency measures are rough and the ordering algorithm is a simple greedy one, so spending much computing time on it is not worthwhile. This function exists mainly to provide a helpful default ordering/pairing for ceplot.
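
The greedy pairing idea can be illustrated with a minimal base R sketch. This is not the condvis implementation: the helper name greedy_pairs, its max_rows and seed arguments, the restriction to numeric columns, and the handling of a leftover variable are all assumptions made for illustration, and the absolute Spearman correlation is swapped in as a stand-in for the dependency measure computed by savingby2d.

```r
# Hypothetical helper, not part of condvis: greedy pairing of variables.
greedy_pairs <- function(data, max_rows = 1000, seed = 1) {
  # Assumption: keep numeric columns only, since the stand-in score needs numbers.
  data <- data[, vapply(data, is.numeric, logical(1)), drop = FALSE]

  # Use a random sample of rows when the data is large.
  if (nrow(data) > max_rows) {
    set.seed(seed)
    data <- data[sample(nrow(data), max_rows), , drop = FALSE]
  }

  # Stand-in dependency score (NOT savingby2d): absolute Spearman correlation.
  score <- abs(cor(data, method = "spearman"))
  score[is.na(score)] <- 0   # constant columns give NA correlations
  diag(score) <- -Inf        # never pair a variable with itself

  remaining <- colnames(data)
  pairs <- list()
  while (length(remaining) >= 2) {
    # Greedy step: take the strongest pair among the unused variables.
    sub <- score[remaining, remaining, drop = FALSE]
    best <- which(sub == max(sub), arr.ind = TRUE)[1, ]
    chosen <- remaining[best]
    pairs[[length(pairs) + 1]] <- chosen
    remaining <- setdiff(remaining, chosen)
  }
  # Assumption: with an odd number of variables, the last one stands alone.
  if (length(remaining) == 1) pairs[[length(pairs) + 1]] <- remaining
  pairs
}

pairings <- greedy_pairs(mtcars)
str(pairings)
```

Because each chosen pair is removed from the pool before the next step, every variable appears in the result at most once, and the pairs are returned in order of decreasing score under this stand-in measure.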

References

O'Connell M, Hurley CB and Domijan K (2017). "Conditional Visualization for Statistical Models: An Introduction to the condvis Package in R." Journal of Statistical Software, 81(5), pp. 1-20. <URL:http://dx.doi.org/10.18637/jss.v081.i05>.

Examples

# NOT RUN {
library(condvis)
data(powerplant)

pairings <- arrangeC(powerplant)

dev.new(height = 2, width = 2 * length(pairings))
par(mfrow = c(1, length(pairings)))

for (i in seq_along(pairings)) {
  plotxc(powerplant[, pairings[[i]]], powerplant[1, pairings[[i]]],
    select.col = NA)
}

# }