valytics (version 0.4.0)

precision_study: Precision Study Analysis

Description

Performs variance component analysis for precision experiments, following established methodology for clinical laboratory method validation. Estimates repeatability, within-laboratory precision, and reproducibility from nested experimental designs.

Usage

precision_study(
  data,
  value = "value",
  sample = NULL,
  site = NULL,
  day = "day",
  run = NULL,
  replicate = NULL,
  conf_level = 0.95,
  ci_method = c("satterthwaite", "mls", "bootstrap"),
  boot_n = 1999,
  method = c("anova", "reml")
)

Value

An object of class c("precision_study", "valytics_precision", "valytics_result"), which is a list containing:

input

List with original data and metadata:

  • data: The input data frame (after validation)

  • n: Total number of observations

  • n_excluded: Number of observations excluded due to NAs

  • factors: Named list of factor column names used

  • value_col: Name of the value column

design

List describing the experimental design:

  • type: Design type (e.g., "single_site", "multi_site")

  • structure: Character string describing nesting (e.g., "day/run")

  • levels: Named list with number of levels for each factor

  • balanced: Logical; TRUE if design is balanced

  • n_samples: Number of distinct samples/concentration levels

variance_components

Data frame with variance component estimates:

  • component: Name of variance component

  • variance: Estimated variance

  • sd: Standard deviation (square root of variance)

  • pct_total: Percentage of total variance

  • df: Degrees of freedom

precision

Data frame with precision estimates:

  • measure: Precision measure name (repeatability, intermediate, etc.)

  • sd: Standard deviation

  • cv_pct: Coefficient of variation (percent)

  • ci_lower: Lower confidence limit

  • ci_upper: Upper confidence limit

anova_table

ANOVA table with SS, MS, DF for each source of variation

by_sample

If multiple samples: list of results per sample

settings

List with analysis settings

call

The matched function call

Arguments

data

A data frame containing the precision experiment data.

value

Character string specifying the column name containing measurement values. Default is "value".

sample

Character string specifying the column name for sample/level identifier. Use when multiple concentration levels are tested. Default is NULL (single sample).

site

Character string specifying the column name for site/device identifier. Use for multi-site reproducibility studies. Default is NULL (single site).

day

Character string specifying the column name for day identifier. Default is "day".

run

Character string specifying the column name for run identifier (within day). Default is NULL (assumes single run per day).

replicate

Character string specifying the column name for replicate identifier. If NULL (default), replicates are inferred from the data structure.

conf_level

Confidence level for intervals (default: 0.95).

ci_method

Method for calculating confidence intervals: "satterthwaite" (default) uses the Satterthwaite approximation, "mls" uses the Modified Large Sample method, "bootstrap" uses BCa bootstrap resampling.

boot_n

Number of bootstrap resamples when ci_method = "bootstrap" (default: 1999).

method

Estimation method for variance components: "anova" (default) uses ANOVA-based method of moments, "reml" uses Restricted Maximum Likelihood (requires lme4 package).

Confidence Intervals

Three methods are available for confidence interval estimation:

  • Satterthwaite (default): Uses Satterthwaite's approximation for degrees of freedom of linear combinations of variance components.

  • MLS: Modified Large Sample method, which can provide better coverage when variance components are small and their estimates may be negative.

  • Bootstrap: BCa bootstrap resampling. Most robust but computationally intensive.
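As an illustration of the Satterthwaite option, the following base R sketch computes an interval for within-laboratory precision in a balanced day/run/replicate design. It assumes the textbook form of the approximation, in which the within-laboratory variance is written as a linear combination of the ANOVA mean squares; the helper name satterthwaite_ci, the variable names, and the mean-square values are hypothetical, and the package's internal computation may differ.

```r
# Satterthwaite interval for a variance that is a linear combination of
# independent mean squares: V = sum(w * mean_sq)
satterthwaite_ci <- function(mean_sq, dof, w, conf_level = 0.95) {
  v <- sum(w * mean_sq)
  df_eff <- v^2 / sum((w * mean_sq)^2 / dof)   # effective degrees of freedom
  alpha <- 1 - conf_level
  # Chi-square limits for the variance, then square roots for the SD
  var_ci <- df_eff * v / qchisq(c(1 - alpha / 2, alpha / 2), df_eff)
  list(sd = sqrt(v), df = df_eff,
       ci_lower = sqrt(var_ci[1]), ci_upper = sqrt(var_ci[2]))
}

# Hypothetical mean squares from a balanced 20 x 2 x 2 design
b <- 2  # runs per day
n <- 2  # replicates per run
mean_sq <- c(day = 20, run = 6, error = 4)
dof     <- c(day = 19, run = 20, error = 40)

# Weights that turn mean squares into day + run + error variance
w <- c(day = 1 / (b * n), run = (b - 1) / (b * n), error = (n - 1) / n)

wl <- satterthwaite_ci(mean_sq, dof, w)
str(wl)
```

Because all weights are positive here, the effective degrees of freedom fall between the smallest individual degrees of freedom and their sum.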

ANOVA vs REML

  • ANOVA (default): Method of moments estimation. Works well for balanced designs. May produce negative variance estimates for small variance components (set to zero by default).

  • REML: Restricted Maximum Likelihood. Preferred for unbalanced designs. Requires the lme4 package. Always produces non-negative estimates.
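For comparison, a REML fit of the same kind of nested model can be sketched directly with lme4. This is an assumption about what such a fit might look like, not the package's actual model call: the data are simulated, the variable names are hypothetical, and the block only fits the model when lme4 is installed.

```r
# Simulate balanced nested data (20 days x 2 runs x 2 replicates)
set.seed(2)
d <- expand.grid(replicate = 1:2, run = 1:2, day = 1:20)
cell <- (d$day - 1) * 2 + d$run              # unique id for run within day
d$value <- 100 + rnorm(20, 0, 1.5)[d$day] + rnorm(40, 0, 1.0)[cell] +
  rnorm(nrow(d), 0, 2.0)
d$day <- factor(d$day)
d$run <- factor(d$run)

reml_vc <- NULL
if (requireNamespace("lme4", quietly = TRUE)) {
  # Random intercepts for day and for run nested within day
  fit <- lme4::lmer(value ~ 1 + (1 | day/run), data = d, REML = TRUE)
  # One row per variance component: run-within-day, day, residual
  reml_vc <- as.data.frame(lme4::VarCorr(fit))[, c("grp", "vcov", "sdcor")]
  print(reml_vc)
}
```

A component estimated at the boundary is reported as zero, in which case lme4 may print a singular-fit message.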

Details

This function implements variance component analysis for nested experimental designs commonly used in clinical laboratory precision studies. The analysis follows methodology consistent with the standards and literature listed under References.

Supported Experimental Designs:

  • Single-site, day/run/replicate: Classic 20 x 2 x 2 design (20 days, 2 runs per day, 2 replicates per run)

  • Single-site, day/replicate: Simplified design without run factor (e.g., 5 days x 5 replicates for verification)

  • Multi-site: 3 sites x 5 days x 5 replicates for reproducibility

  • Custom designs: Any fully-nested combination of factors
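As a sketch of the multi-site case, the code below simulates a 3 sites x 5 days x 5 replicates data set in base R and passes it to precision_study() using the arguments listed under Usage. The effect sizes and variable names are hypothetical. It also assumes that repeated day labels across sites are treated as nested within site, which the model under Variance Components suggests but this page does not state explicitly. The call runs only if valytics is installed.

```r
# Simulate a 3 sites x 5 days x 5 replicates reproducibility study
set.seed(3)
n_sites <- 3
n_days  <- 5
n_reps  <- 5

multi <- expand.grid(
  replicate = seq_len(n_reps),
  day = seq_len(n_days),
  site = seq_len(n_sites)
)

# Days are nested in sites: day 1 at site 1 is unrelated to day 1 at site 2
site_day <- (multi$site - 1) * n_days + multi$day
multi$value <- 50 +
  rnorm(n_sites, 0, 1.0)[multi$site] +
  rnorm(n_sites * n_days, 0, 0.8)[site_day] +
  rnorm(nrow(multi), 0, 1.2)

# Call as documented under Usage; skipped when valytics is not installed
if (requireNamespace("valytics", quietly = TRUE)) {
  repro <- valytics::precision_study(
    data = multi, value = "value", site = "site", day = "day"
  )
  print(repro)
}
```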

Variance Components:

For a design with site/day/run/replicate, the model is: $$y_{ijkl} = \mu + S_i + D_{j(i)} + R_{k(ij)} + \epsilon_{l(ijk)}$$

where S = site, D = day (nested in site), R = run (nested in day), and epsilon = residual error.

Precision Measures:

  • Repeatability: Within-run precision (sqrt of error variance)

  • Between-run precision: Additional variability between runs

  • Between-day precision: Additional variability between days

  • Within-laboratory precision: Total precision within one laboratory (combines between-day, between-run, and error variance)

  • Reproducibility: Total precision including between-site variability (for multi-site designs)
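The link between the model terms and the precision measures listed above can be sketched in base R for a balanced single-site day/run/replicate design. This is a minimal illustration of ANOVA method-of-moments estimation, assuming a balanced design and truncation of negative estimates at zero. It is not the package's internal code, and all variable names are hypothetical.

```r
# Simulate a balanced 20 x 2 x 2 design (days x runs x replicates)
set.seed(1)
a <- 20  # days
b <- 2   # runs per day
n <- 2   # replicates per run

d <- expand.grid(replicate = seq_len(n), run = seq_len(b), day = seq_len(a))
d$cell <- (d$day - 1) * b + d$run            # unique id for each run within day
day_eff <- rnorm(a, 0, 1.5)
run_eff <- rnorm(a * b, 0, 1.0)
d$value <- 100 + day_eff[d$day] + run_eff[d$cell] + rnorm(nrow(d), 0, 2.0)

# Group means, expanded to one entry per observation
grand     <- mean(d$value)
day_mean  <- ave(d$value, d$day)
cell_mean <- ave(d$value, d$cell)

# Sums of squares, degrees of freedom, and mean squares for the nested design
ss <- c(day   = sum((day_mean - grand)^2),
        run   = sum((cell_mean - day_mean)^2),
        error = sum((d$value - cell_mean)^2))
dof <- c(day = a - 1, run = a * (b - 1), error = a * b * (n - 1))
ms <- ss / dof

# Method-of-moments estimates; negative estimates are truncated at zero
vc <- c(day   = max(0, (ms[["day"]] - ms[["run"]]) / (b * n)),
        run   = max(0, (ms[["run"]] - ms[["error"]]) / n),
        error = ms[["error"]])

# Precision measures expressed as standard deviations
precision_sd <- c(repeatability = sqrt(vc[["error"]]),
                  within_lab    = sqrt(sum(vc)))
print(round(vc, 3))
print(round(precision_sd, 3))
```

Within-laboratory precision can never be smaller than repeatability, because it adds the non-negative day and run components to the error variance.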

References

Chesher D (2008). Evaluating assay precision. Clinical Biochemist Reviews, 29(Suppl 1):S23-S26.

ISO 5725-2:2019. Accuracy (trueness and precision) of measurement methods and results - Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method.

Searle SR, Casella G, McCulloch CE (1992). Variance Components. Wiley, New York.

Satterthwaite FE (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2:110-114.

See Also

verify_precision() for comparing results to manufacturer claims, plot.precision_study() for visualization, and summary.precision_study() for a detailed summary.

Examples

# Example with simulated precision data
set.seed(42)

# Generate study design: 20 days x 2 runs x 2 replicates
n_days <- 20
n_runs <- 2
n_reps <- 2

prec_data <- expand.grid(
  day = 1:n_days,
  run = 1:n_runs,
  replicate = 1:n_reps
)

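# Add realistic variance components.
# expand.grid() varies day fastest, so index effects by day and by
# run-within-day rather than by row position.
day_effect <- rnorm(n_days, 0, 1.5)[prec_data$day]
run_id <- (prec_data$day - 1) * n_runs + prec_data$run
run_effect <- rnorm(n_days * n_runs, 0, 1.0)[run_id]
error <- rnorm(nrow(prec_data), 0, 2.0)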

prec_data$value <- 100 + day_effect + run_effect + error

# Run precision study
prec <- precision_study(
  data = prec_data,
  value = "value",
  day = "day",
  run = "run"
)

print(prec)
summary(prec)
