SignalY (version 1.1.1)

validate_input: Validate Input Data Structure

Description

Validates that input data meets the requirements for SignalY analysis. Performs comprehensive checks on data types, dimensions, missing values, and numeric properties.

Usage

validate_input(
  y,
  X = NULL,
  time_index = NULL,
  na_action = c("fail", "omit", "interpolate"),
  min_obs = 10,
  verbose = FALSE
)

Value

A list with validated and potentially transformed data:

y

Validated numeric vector of target signal

X

Validated matrix of predictors (or NULL)

time_index

Vector of time indices

n_obs

Number of observations

n_vars

Number of predictor variables (or 0)

var_names

Names of predictor variables (or NULL)

na_indices

Indices of removed observations (if any)

Arguments

y

Numeric vector representing the target signal. Must not contain infinite values. Missing values (NA) are handled according to the na_action argument.

X

Optional matrix or data frame of candidate predictors. If provided, must have the same number of rows as length(y).

time_index

Optional vector of time indices. If NULL, sequential integers will be used.

na_action

Character string specifying how to handle missing values. Options are "fail" (stop with error), "omit" (remove observations with NA), or "interpolate" (linear interpolation).

min_obs

Minimum number of observations required. Default is 10.

verbose

Logical indicating whether to print diagnostic messages.
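
The three na_action options can be illustrated with base R alone. The sketch below is not SignalY's implementation: the helper name handle_na_sketch is hypothetical, and it assumes that "interpolate" means linear interpolation over the observation index via stats::approx, leaving any leading or trailing NA unfilled (this page does not say how those are treated).

```r
# Hypothetical helper, not part of SignalY: mimics the three documented
# na_action options for a single numeric vector.
handle_na_sketch <- function(y, na_action = c("fail", "omit", "interpolate")) {
  na_action <- match.arg(na_action)   # defaults to the first choice, "fail"
  na_idx <- which(is.na(y))
  if (length(na_idx) == 0L) {
    return(list(y = y, na_indices = integer(0)))
  }
  switch(na_action,
    fail = stop("y contains ", length(na_idx), " missing value(s)"),
    omit = list(y = y[-na_idx], na_indices = na_idx),
    interpolate = {
      idx <- seq_along(y)
      # Linear interpolation between the nearest non-missing neighbours.
      # Assumption: leading/trailing NA stay NA (approx default, rule = 1).
      filled <- stats::approx(idx[-na_idx], y[-na_idx], xout = idx)$y
      list(y = filled, na_indices = integer(0))
    }
  )
}

y <- c(1, 2, NA, 4, 5)
handle_na_sketch(y, "omit")$y         # 1 2 4 5
handle_na_sketch(y, "interpolate")$y  # 1 2 3 4 5
```

Calling handle_na_sketch(y) without a second argument selects "fail" through match.arg and stops with an error, which mirrors the order of choices shown in the Usage section.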

Details

This function implements defensive programming principles to ensure data integrity before computationally intensive analyses. The validation process includes:

  1. Type checking: Ensures y is numeric and X (if provided) is a numeric matrix or data frame

  2. Dimension checking: Verifies compatible dimensions between y and X

  3. Missing value handling: Processes NA values according to specified action

  4. Finiteness checking: Removes or flags infinite values

  5. Minimum sample size: Ensures sufficient observations for analysis
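
The five steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the SignalY source: the function name validate_sketch is hypothetical, missing values are always dropped (comparable to na_action = "omit"), and infinite values are treated as an error rather than flagged.

```r
# Hypothetical sketch of the five checks; not the SignalY implementation.
validate_sketch <- function(y, X = NULL, min_obs = 10) {
  # 1. Type checking
  if (!is.numeric(y)) stop("y must be a numeric vector")
  if (!is.null(X)) {
    X <- as.matrix(X)                      # accepts a matrix or data frame
    if (!is.numeric(X)) stop("X must contain only numeric columns")
    # 2. Dimension checking
    if (nrow(X) != length(y)) stop("nrow(X) must equal length(y)")
  }
  # 3. Missing value handling (assumption: drop incomplete rows)
  keep <- !is.na(y)
  if (!is.null(X)) keep <- keep & stats::complete.cases(X)
  # 4. Finiteness checking (assumption: infinite values are an error)
  if (any(is.infinite(y[keep]))) stop("y contains infinite values")
  if (!is.null(X) && any(is.infinite(X[keep, ]))) stop("X contains infinite values")
  # 5. Minimum sample size
  if (sum(keep) < min_obs) stop("fewer than ", min_obs, " usable observations")
  list(
    y = y[keep],
    X = if (is.null(X)) NULL else X[keep, , drop = FALSE],
    n_obs = sum(keep),
    n_vars = if (is.null(X)) 0L else ncol(X),
    na_indices = which(!keep)
  )
}

set.seed(1)
y <- c(rnorm(11), NA)
X <- data.frame(a = rnorm(12), b = rnorm(12))
res <- validate_sketch(y, X)
res$n_obs       # 11
res$na_indices  # 12
```

The returned list reuses several element names from the Value section (y, X, n_obs, n_vars, na_indices) so the sketch is easy to compare with the documented output; time_index and var_names are omitted for brevity.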