NOVA (version 0.1.1)

analyze_pca_variable_importance_general: Analyze and Visualize PCA Variable Importance

Description

This function performs comprehensive analysis of variable importance in Principal Component Analysis, generating multiple visualization types including loading biplots, importance rankings, PC comparisons, and heatmaps. It extracts variable contributions to specified principal components and creates publication-ready plots with detailed statistical summaries.

Usage

analyze_pca_variable_importance_general(
  pca_result = NULL,
  output_dir = tempdir(),
  experiment_name = "PCA_Analysis",
  pc_x = "PC1",
  pc_y = "PC2",
  color_scheme = "default",
  top_n = 15,
  min_loading_threshold = 0.1,
  save_plots = TRUE,
  show_labels = TRUE,
  verbose = TRUE
)

Value

A list containing:

plots

Named list of ggplot objects: 'biplot', 'importance_bar', 'pc_comparison', 'heatmap'

variable_importance

Data frame with comprehensive variable importance metrics for all variables

selected_variables

Data frame containing the top N most important variables with detailed statistics

analysis_summary

List with key analysis metrics and variance explained information

config_used

List documenting all parameters used in the analysis

Arguments

pca_result

A PCA result object. Can be either a prcomp object directly, or a list containing a PCA object in fields named 'pca_result', 'pca', 'result', or 'prcomp'.

output_dir

Character string specifying the directory for saving plots and results (default: tempdir()).

experiment_name

Character string used as a prefix for output files and plot titles (default: "PCA_Analysis").

pc_x

Character string specifying the principal component for x-axis analysis (default: "PC1").

pc_y

Character string specifying the principal component for y-axis analysis (default: "PC2").

color_scheme

Character string specifying the color palette. Options: "default", "viridis", "colorbrewer" (default: "default").

top_n

Numeric value specifying the number of top variables to focus on in detailed analyses (default: 15).

min_loading_threshold

Numeric value specifying the minimum loading threshold for importance filtering (default: 0.1).

save_plots

Logical indicating whether to save plots and results to disk (default: TRUE).

show_labels

Logical indicating whether to show variable labels on the biplot (default: TRUE).

verbose

Logical indicating whether to print detailed progress messages (default: TRUE).
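A minimal usage sketch follows, assuming the NOVA package is installed and exports the function as shown under Usage. The USArrests data set and the object names pca_fit, wrapped, and results are illustrative choices, not taken from the package. It shows the two input forms described for pca_result.

```r
# Fit a PCA with base R; USArrests ships with R (illustrative data choice).
pca_fit <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# The same object wrapped in a list under one of the documented field names.
wrapped <- list(pca = pca_fit)

# Run the analysis only if NOVA is available (assumption: installed by the reader).
if (requireNamespace("NOVA", quietly = TRUE)) {
  results <- NOVA::analyze_pca_variable_importance_general(
    pca_result = pca_fit,   # passing `wrapped` should be equivalent per the docs
    output_dir = tempdir(),
    top_n = 4,              # USArrests has only four variables
    save_plots = FALSE,     # keep the sketch free of file output
    verbose = FALSE
  )
  print(names(results))
}
```

Setting save_plots = FALSE keeps the call free of disk writes, which is convenient when only the returned list is of interest.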

Output Files

When save_plots = TRUE, the function writes files to the directory given by output_dir (default: tempdir()). For CRAN compliance, examples should keep output under tempdir(). The files created are:

  • PNG files for each visualization type

  • CSV file with complete variable importance rankings

  • CSV file with selected top variables and detailed metrics

  • CSV file with analysis summary and metadata
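The tempdir() pattern mentioned above can be sketched with base R alone. The subdirectory name and the CSV file name below are hypothetical; the file names NOVA actually writes are not listed on this page.

```r
# Work inside a subdirectory of the session's temporary directory.
out_dir <- file.path(tempdir(), "pca_plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

pca_fit <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Hypothetical file name, used only to illustrate writing results to out_dir.
csv_path <- file.path(out_dir, "PCA_Analysis_loadings_example.csv")
write.csv(as.data.frame(pca_fit$rotation), csv_path)

# Confirm the file was written, then list the directory contents.
written <- file.exists(csv_path)
print(list.files(out_dir))

# Clean up so nothing is left behind after the example.
unlink(out_dir, recursive = TRUE)
```

Passing the same out_dir as output_dir to analyze_pca_variable_importance_general() would keep all of its files in one removable location.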

Details

The function calculates multiple importance metrics for each variable:

  • PC loadings: Direct loading values for specified principal components

  • Combined importance: Euclidean distance from the origin computed from the two selected PC loadings

  • Contribution percentages: Percent contribution to each PC's total variance

  • Ranking: Variables ranked by combined importance score
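The metrics above can be approximated from a prcomp object as in the sketch below. The exact formulas NOVA uses are not given on this page, so two assumptions are made: combined importance is taken as sqrt(loading_x^2 + loading_y^2), and the contribution percentage as the squared loading divided by the sum of squared loadings for that PC. Column names are illustrative.

```r
pca_fit <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# PC loadings for the two components of interest.
loadings <- pca_fit$rotation[, c("PC1", "PC2")]

# Assumed formula: Euclidean norm of the two loadings per variable.
combined <- sqrt(loadings[, "PC1"]^2 + loadings[, "PC2"]^2)

# Assumed formula: squared loading as a percentage of each PC's total.
contrib <- sweep(loadings^2, 2, colSums(loadings^2), "/") * 100

importance <- data.frame(
  variable            = rownames(loadings),
  PC1_loading         = loadings[, "PC1"],
  PC2_loading         = loadings[, "PC2"],
  combined_importance = combined,
  PC1_contribution    = contrib[, "PC1"],
  PC2_contribution    = contrib[, "PC2"],
  row.names           = NULL
)

# Rank variables by combined importance, largest first.
importance <- importance[order(-importance$combined_importance), ]
importance$rank <- seq_len(nrow(importance))
print(importance)
```

Under these assumptions the contribution columns each sum to 100, which gives a quick sanity check on the calculation.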

Four visualization types are generated:

  • Loading Biplot: Scatter plot showing variable loadings on both PCs with size indicating importance

  • Importance Bar Chart: Ranked bar chart of top variables by combined importance

  • PC Comparison: Side-by-side comparison of absolute loadings for both PCs

  • Loading Heatmap: Color-coded matrix showing loading values and directions
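As an illustration of the first plot type, a loading scatter plot with point size mapped to importance can be sketched with ggplot2. This is not the package's own plotting code; the data frame, colours, and theme are assumptions, and the importance score uses the Euclidean norm of the two loadings.

```r
library(ggplot2)

pca_fit <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# One row per variable with its loadings on the two chosen PCs.
load_df <- data.frame(
  variable = rownames(pca_fit$rotation),
  PC1 = pca_fit$rotation[, "PC1"],
  PC2 = pca_fit$rotation[, "PC2"]
)
# Assumed importance score: distance from the origin in the loading plane.
load_df$importance <- sqrt(load_df$PC1^2 + load_df$PC2^2)

loading_plot <- ggplot(load_df, aes(x = PC1, y = PC2)) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_point(aes(size = importance), colour = "steelblue") +
  geom_text(aes(label = variable), vjust = -1) +
  labs(x = "PC1 loading", y = "PC2 loading", size = "Importance") +
  theme_minimal()
```

Printing loading_plot draws the figure; the object can also be passed to ggsave() to write it to a file.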

The function automatically:

  • Validates input PCA objects from various sources

  • Calculates variance explained by each principal component

  • Creates publication-ready plots with consistent theming

  • Exports detailed CSV files with variable rankings and analysis summaries

  • Provides comprehensive statistical summaries
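The first two automatic steps can be sketched as follows. The helper extract_prcomp() is hypothetical and only mirrors the field names listed for the pca_result argument; NOVA's internal validation may differ. Variance explained is computed in the standard way from the prcomp standard deviations.

```r
# Hypothetical helper: return a prcomp object from either accepted input form.
extract_prcomp <- function(x) {
  if (inherits(x, "prcomp")) {
    return(x)
  }
  if (is.list(x)) {
    # Field names taken from the pca_result argument description.
    for (field in c("pca_result", "pca", "result", "prcomp")) {
      if (inherits(x[[field]], "prcomp")) {
        return(x[[field]])
      }
    }
  }
  stop("No prcomp object found in the input.")
}

# Works for a list wrapper as well as for a bare prcomp object.
pca_fit <- extract_prcomp(list(result = prcomp(USArrests, scale. = TRUE)))

# Proportion of variance explained by each principal component.
var_explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
names(var_explained) <- colnames(pca_fit$rotation)
print(round(100 * var_explained, 1))
```

Checking the class with inherits() rather than inspecting list names avoids mistaking an unrelated list element for a PCA result.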

Color schemes provide different aesthetic options:

  • default: Blue/red palette suitable for most publications

  • viridis: Colorblind-friendly viridis color scale

  • colorbrewer: ColorBrewer palettes optimized for scientific visualization

If the return value is stored in an object named results, the top variables can be viewed with head(results$selected_variables).

See Also

prcomp for PCA computation, biplot for basic PCA plotting