This function correlates a whole data frame, automatically keeping only its numerical columns.
corr(df, method = "pearson", ignore = NA, dummy = TRUE,
dates = FALSE, redundant = TRUE, logs = FALSE, plot = FALSE,
top = NA)
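As a rough illustration of the idea described above (keep the numerical columns, then correlate them), here is a minimal base R sketch. It is not the function's actual implementation, and `toy` is made-up data used only for this example.

```r
# Hypothetical toy data: two numeric columns and one character column
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 6, 8, 10),
  g = c("a", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns (the character column is dropped)
num <- toy[, sapply(toy, is.numeric), drop = FALSE]

# Correlate every remaining column against every other one
res <- cor(num, method = "pearson")
print(res)
```

Swapping `method` for `"kendall"` or `"spearman"` in `cor()` mirrors the choices listed for the `method` argument.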
df: Data frame. Non-numerical columns are allowed; they are filtered out automatically.
method: Character. Any of: c("pearson", "kendall", "spearman")
ignore: Character vector. Which columns do you wish to exclude?
dummy: Boolean. Should One Hot Encoding be applied to categorical columns?
dates: Boolean. Do you want the function to create more features out of the date/time columns?
redundant: Boolean. Should we keep redundant columns? i.e. if a column only has two different values, should we keep both new columns?
logs: Boolean. Automatically calculate log(values) for numerical variables (not binaries)
plot: Boolean. Do you wish to see a plot?
top: Integer. Select top N most relevant variables? Filtered and sorted by mean of each variable's correlations
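To make the `dummy`, `redundant` and `top` options more concrete, the sketch below reproduces the general idea in base R. It is an approximation under stated assumptions, not the package's own code: `toy` and `one_hot()` are hypothetical, and ranking by the plain mean of each variable's correlations with the others is one reading of the `top` description.

```r
# Hypothetical toy data with a two-level categorical column
toy <- data.frame(
  price = c(10, 12, 15, 11, 18, 20),
  size  = c(1.0, 1.3, 1.2, 1.1, 2.1, 2.0),
  noise = c(5, 1, 4, 2, 6, 3),
  type  = c("a", "b", "a", "a", "b", "b"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one 0/1 indicator column per level of a categorical vector
one_hot <- function(x, name) {
  lv <- sort(unique(x))
  out <- sapply(lv, function(l) as.numeric(x == l))
  colnames(out) <- paste(name, lv, sep = "_")
  as.data.frame(out)
}

# Both indicator columns are kept here (the redundant = TRUE idea)
dummies <- one_hot(toy$type, "type")

# With only two levels, type_b equals 1 - type_a, so one column carries all
# the information; dropping the other mimics the redundant = FALSE idea
dummies_min <- dummies[, -ncol(dummies), drop = FALSE]

# Combine the numeric columns with the encoded ones and correlate
num <- cbind(toy[, sapply(toy, is.numeric), drop = FALSE], dummies)
res <- cor(num, method = "spearman")

# Assumption: rank variables by the mean of their correlations with the others
diag(res) <- NA
ranking <- sort(rowMeans(res, na.rm = TRUE), decreasing = TRUE)
top2 <- names(ranking)[1:2]
print(ranking)
```

The two indicator columns are perfectly negatively correlated with each other, which is why keeping both adds no information to the correlation matrix.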
Other Calculus: ROC, conf_mat, deg2num, dist2d, errors, loglossBinary, mae, mape, model_metrics, mse, quants, rmse, rsqa, rsq
Other Correlations: corr_cross, corr_plot, corr_var