`fastexplore()` is an optional, lightweight exploratory data analysis (EDA) helper. It returns summary tables and plot objects; it only writes to disk or renders a report when you explicitly request it via `save_results` or `render_report`.

    fastexplore(
      data,
      label = NULL,
      visualize = c("histogram", "boxplot", "barplot", "heatmap", "scatterplot"),
      save_results = FALSE,
      render_report = FALSE,
      output_dir = NULL,
      sample_size = NULL,
      interactive = FALSE,
      corr_threshold = 0.9,
      auto_convert_numeric = TRUE,
      visualize_missing = TRUE,
      imputation_suggestions = FALSE,
      report_duplicate_details = TRUE,
      detect_near_duplicates = FALSE,
      auto_convert_dates = FALSE,
      feature_engineering = FALSE,
      outlier_method = c("iqr", "zscore", "dbscan", "lof"),
      run_distribution_checks = TRUE,
      normality_tests = c("shapiro"),
      pairwise_matrix = TRUE,
      max_scatter_cols = 5,
      grouped_plots = TRUE,
      use_upset_missing = TRUE
    )

Returns a list of summaries (tables/tibbles) and plot objects (ggplot/plotly), plus any saved file paths when `save_results`/`render_report` are enabled.
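
A minimal call sketch, using only arguments listed in the signature above. It assumes `fastexplore()` is already attached in the session (the providing package is not named in this section, so the example does not load one) and uses the built-in `iris` data purely for illustration. The names of the elements in the returned list are not documented here, so the example only inspects the top level.

```r
# Assumption: fastexplore() is already available in the session.
# If it is not, the call is skipped and `res` stays NULL.
res <- NULL
if (exists("fastexplore", mode = "function")) {
  res <- fastexplore(
    iris,
    label = "Species",
    visualize = c("histogram", "boxplot"),
    save_results = FALSE,   # documented default: nothing is written to disk
    render_report = FALSE   # documented default: no HTML report is rendered
  )
  # Inspect the top-level structure of the returned list
  str(res, max.level = 1)
}
```
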
`data`: A `data.frame` to explore.
`label`: Optional column name of the target/label. If supplied and categorical, grouped plots and class balance summaries are produced.
`visualize`: Character vector indicating which plot families to build. Defaults to `c("histogram", "boxplot", "barplot", "heatmap", "scatterplot")`.
`save_results`: Logical; if `TRUE`, plots/results are saved under `output_dir` (defaults to the working directory). Default is `FALSE`.
`render_report`: Logical; if `TRUE`, a short HTML report is rendered via `rmarkdown` (if available). Default is `FALSE`.
`output_dir`: Directory to save results/report when `save_results` or `render_report` is `TRUE`.
`sample_size`: Optional integer; if supplied, visualizations are produced on a random sample of this size.
`interactive`: Logical; if `TRUE` and `plotly` is available, an interactive correlation heatmap is produced. Falls back to static ggplot output otherwise.
`corr_threshold`: Absolute correlation threshold for flagging high correlations.
`auto_convert_numeric`: Logical; convert factor/character columns that look numeric into numeric.
`visualize_missing`: Logical; if `TRUE`, include simple missingness visualizations.
`imputation_suggestions`: Logical; if `TRUE`, prints lightweight suggestions based on missingness patterns.
`report_duplicate_details`: Logical; if `TRUE`, returns a small sample of duplicated rows when present.
`detect_near_duplicates`: Placeholder for future fuzzy duplicate checks.
`auto_convert_dates`: Logical; convert YYYY-MM-DD strings to `Date`.
`feature_engineering`: Logical; if `TRUE`, derive day/month/year from date columns to aid inspection of temporal structure.
`outlier_method`: One of `"iqr"`, `"zscore"`, `"dbscan"`, `"lof"`.
`run_distribution_checks`: Logical; if `TRUE`, run normality tests on numeric columns.
`normality_tests`: Character vector of normality tests to run; currently supports `"shapiro"` and `"ks"`.
`pairwise_matrix`: Logical; if `TRUE` and `GGally` is available, returns a ggpairs scatterplot matrix for a subset of numeric columns.
`max_scatter_cols`: Maximum number of numeric columns to include in the pairwise matrix.
`grouped_plots`: Logical; if `TRUE` and `label` is a factor, group histograms/boxplots/density plots by label.
`use_upset_missing`: Logical; retained for compatibility. When `TRUE` and `UpSetR` is installed, an UpSet plot of missingness is returned; otherwise a simpler missingness heatmap is used.
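
To make two of the checks above concrete, here is a base-R sketch of what flagging by an absolute correlation threshold and the `"iqr"` outlier rule usually mean. It is an approximation for illustration, not the function's internal code: the helper `iqr_outliers()`, the fence multiplier `k = 1.5`, and the use of Pearson correlation are assumptions.

```r
# Illustrative only: base-R approximations, not fastexplore() internals.
num <- iris[, 1:4]  # the four numeric columns of the built-in iris data

# Flag variable pairs whose absolute correlation exceeds the threshold.
corr_threshold <- 0.9
cm <- cor(num, use = "pairwise.complete.obs")  # Pearson by default
idx <- which(abs(cm) > corr_threshold & upper.tri(cm), arr.ind = TRUE)
high_corr <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Hypothetical helper: indices of values outside the k * IQR fences.
iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[2] - q[1])
  which(x < q[1] - fence | x > q[2] + fence)
}
outlier_rows <- lapply(num, iqr_outliers)

print(high_corr)
print(lengths(outlier_rows))
```

Using `upper.tri()` keeps each pair once and drops the diagonal, so a variable is never reported as correlated with itself.
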
This helper is intentionally decoupled from the core modeling workflow. Most of its heavy dependencies are treated as optional and loaded via `requireNamespace()` when requested features are used.
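
The optional-dependency pattern described above can be sketched as follows. `pick_heatmap_backend()` is a hypothetical helper written for this illustration and is not part of the package; it only shows how `requireNamespace()` lets a feature degrade gracefully when a suggested package is missing.

```r
# Hypothetical helper (not part of the package): pick a plotting backend
# depending on which optional packages are installed.
pick_heatmap_backend <- function(interactive = FALSE) {
  if (interactive && requireNamespace("plotly", quietly = TRUE)) {
    "plotly"    # interactive output requested and plotly is installed
  } else if (requireNamespace("ggplot2", quietly = TRUE)) {
    "ggplot2"   # static fallback
  } else {
    "none"      # neither package is installed; skip the plot
  }
}

backend <- pick_heatmap_backend(interactive = TRUE)
message("Heatmap backend: ", backend)
```

`requireNamespace()` returns `FALSE` instead of raising an error when a package is absent, which is why it suits optional features better than `library()`.
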