Performs variance component analysis for precision experiments following established methodology for clinical laboratory method validation. Estimates repeatability, within-laboratory precision, and reproducibility from nested experimental designs.
precision_study(
  data,
  value = "value",
  sample = NULL,
  site = NULL,
  day = "day",
  run = NULL,
  replicate = NULL,
  conf_level = 0.95,
  ci_method = c("satterthwaite", "mls", "bootstrap"),
  boot_n = 1999,
  method = c("anova", "reml")
)
An object of class c("precision_study", "valytics_precision", "valytics_result"),
which is a list containing:
List with original data and metadata:
data: The input data frame (after validation)
n: Total number of observations
n_excluded: Number of observations excluded due to NAs
factors: Named list of factor column names used
value_col: Name of the value column
List describing the experimental design:
type: Design type (e.g., "single_site", "multi_site")
structure: Character string describing nesting (e.g., "day/run")
levels: Named list with number of levels for each factor
balanced: Logical; TRUE if design is balanced
n_samples: Number of distinct samples/concentration levels
Data frame with variance component estimates:
component: Name of variance component
variance: Estimated variance
sd: Standard deviation (sqrt of variance)
pct_total: Percentage of total variance
df: Degrees of freedom
Data frame with precision estimates:
measure: Precision measure name (repeatability, intermediate, etc.)
sd: Standard deviation
cv_pct: Coefficient of variation (percent)
ci_lower: Lower confidence limit
ci_upper: Upper confidence limit
ANOVA table with SS, MS, DF for each source of variation
If multiple samples: list of results per sample
List with analysis settings
The matched function call
data: A data frame containing the precision experiment data.
value: Character string specifying the column name containing measurement values. Default is "value".
sample: Character string specifying the column name for sample/level identifier. Use when multiple concentration levels are tested. Default is NULL (single sample).
site: Character string specifying the column name for site/device identifier. Use for multi-site reproducibility studies. Default is NULL (single site).
day: Character string specifying the column name for day identifier. Default is "day".
run: Character string specifying the column name for run identifier (within day). Default is NULL (assumes single run per day).
replicate: Character string specifying the column name for replicate identifier. If NULL (default), replicates are inferred from the data structure.
conf_level: Confidence level for intervals (default: 0.95).
ci_method: Method for calculating confidence intervals:
"satterthwaite" (default) uses the Satterthwaite approximation,
"mls" uses the Modified Large Sample method,
"bootstrap" uses BCa bootstrap resampling.
boot_n: Number of bootstrap resamples when ci_method = "bootstrap" (default: 1999).
method: Estimation method for variance components:
"anova" (default) uses ANOVA-based method of moments,
"reml" uses Restricted Maximum Likelihood (requires lme4 package).
Three methods are available for confidence interval estimation:
Satterthwaite (default): Uses Satterthwaite's approximation for degrees of freedom of linear combinations of variance components.
MLS: Modified Large Sample method, which can provide better coverage when variance components may be estimated as negative.
Bootstrap: BCa bootstrap resampling. Most robust but computationally intensive.
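As an illustration of the Satterthwaite approach, the following base R sketch computes effective degrees of freedom and a chi-square interval for a variance written as a linear combination of mean squares. The function name `satterthwaite_ci` and its arguments are hypothetical helpers for this example, not part of the package, and the package's internal computation may differ in detail.

```r
# Satterthwaite interval for a variance estimated as sum(coef * ms).
# `ms`, `df`, and `coef` are hypothetical inputs used only for illustration.
satterthwaite_ci <- function(ms, df, coef, conf_level = 0.95) {
  terms <- coef * ms
  est <- sum(terms)                     # point estimate of the variance
  df_eff <- est^2 / sum(terms^2 / df)   # effective degrees of freedom
  alpha <- 1 - conf_level
  # Chi-square interval for the variance (lower limit first)
  var_ci <- df_eff * est / qchisq(c(1 - alpha / 2, alpha / 2), df_eff)
  list(variance = est, df = df_eff, sd_ci = sqrt(var_ci))
}

# Within-laboratory variance in a balanced 20 x 2 x 2 design:
# var_WL = MS_day / 4 + MS_run / 4 + MS_error / 2
wl <- satterthwaite_ci(
  ms   = c(day = 20, run = 8, error = 4),  # made-up mean squares
  df   = c(19, 20, 40),
  coef = c(1 / 4, 1 / 4, 1 / 2)
)
wl$sd_ci
```

The approximation assumes all coefficients are positive, which holds for within-laboratory and reproducibility variances expressed this way.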
Two methods are available for variance component estimation:
ANOVA (default): Method of moments estimation. Works well for balanced designs. May produce negative variance estimates for small variance components (set to zero by default).
REML: Restricted Maximum Likelihood. Preferred for unbalanced designs. Requires the lme4 package. Always produces non-negative estimates.
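To make the method of moments estimation concrete, here is a minimal base R sketch for a balanced day/run/replicate design. It is not the package's code: the simulated data, variable names, and the explicit truncation with `max(0, ...)` are assumptions chosen to mirror the description above.

```r
set.seed(1)
n_days <- 20; n_runs <- 2; n_reps <- 2
d <- expand.grid(replicate = 1:n_reps, run = 1:n_runs, day = 1:n_days)

# Simulated values: day and run effects are looked up by index
d$value <- 100 +
  rnorm(n_days, 0, 1.5)[d$day] +
  rnorm(n_days * n_runs, 0, 1.0)[(d$day - 1) * n_runs + d$run] +
  rnorm(nrow(d), 0, 2.0)

# Means at each nesting level, repeated per observation
run_mean <- ave(d$value, d$day, d$run)
day_mean <- ave(d$value, d$day)
grand    <- mean(d$value)

# Sums of squares for the nested design
ss_error <- sum((d$value - run_mean)^2)
ss_run   <- sum((run_mean - day_mean)^2)
ss_day   <- sum((day_mean - grand)^2)

# Mean squares
ms_error <- ss_error / (n_days * n_runs * (n_reps - 1))
ms_run   <- ss_run / (n_days * (n_runs - 1))
ms_day   <- ss_day / (n_days - 1)

# Method of moments: solve the expected mean squares, truncating at zero
vc <- c(
  day   = max(0, (ms_day - ms_run) / (n_runs * n_reps)),
  run   = max(0, (ms_run - ms_error) / n_reps),
  error = ms_error
)
vc
```

For unbalanced data these simple formulas no longer apply directly, which is one reason REML is preferred there.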
This function implements variance component analysis for nested experimental designs commonly used in clinical laboratory precision studies. The analysis follows methodology consistent with international standards such as ISO 5725-2 (see the references).
Supported Experimental Designs:
Single-site, day/run/replicate: Classic 20 x 2 x 2 design (20 days, 2 runs per day, 2 replicates per run)
Single-site, day/replicate: Simplified design without run factor (e.g., 5 days x 5 replicates for verification)
Multi-site: Design for reproducibility studies (e.g., 3 sites x 5 days x 5 replicates)
Custom designs: Any fully-nested combination of factors
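As a rough illustration of the multi-site layout listed above, the following base R sketch builds a 3 x 5 x 5 design table and checks that it is balanced. The column names `site`, `day`, and `replicate` and the site labels are assumptions for this example; the balance check is a simple stand-in, not the package's own validation.

```r
# Hypothetical 3 sites x 5 days x 5 replicates layout (no run factor)
design <- expand.grid(
  replicate = 1:5,
  day = 1:5,
  site = c("A", "B", "C")
)

# A design is balanced when every site/day cell holds the same
# number of replicates
cell_n <- table(design$site, design$day)
balanced <- length(unique(as.vector(cell_n))) == 1

nrow(design)
balanced
```

Dropping rows from `design` (for example, a failed measurement) would make `balanced` FALSE, which is the situation where REML estimation is preferred.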
Variance Components:
For a design with site/day/run/replicate, the model is: $$y_{ijkl} = \mu + S_i + D_{j(i)} + R_{k(ij)} + \epsilon_{l(ijk)}$$
where mu = overall mean, S = site, D = day (nested in site), R = run (nested in day), and epsilon = residual error. Each random effect is assumed to have mean zero and its own variance component.
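For a balanced design, the standard expected mean squares for this nested model link the ANOVA table to the variance components. The symbols b (days per site), c (runs per day), and n (replicates per run) are notation introduced here for illustration and do not appear elsewhere in this page.

```latex
$$\begin{aligned}
E[MS_{\epsilon}] &= \sigma^2_{\epsilon} \\
E[MS_R] &= \sigma^2_{\epsilon} + n\,\sigma^2_R \\
E[MS_D] &= \sigma^2_{\epsilon} + n\,\sigma^2_R + cn\,\sigma^2_D \\
E[MS_S] &= \sigma^2_{\epsilon} + n\,\sigma^2_R + cn\,\sigma^2_D + bcn\,\sigma^2_S
\end{aligned}$$
```

Solving these equations from the bottom of the hierarchy upward gives the method of moments estimates, for example $\hat\sigma^2_R = (MS_R - MS_{\epsilon})/n$.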
Precision Measures:
Repeatability: Within-run precision (sqrt of error variance)
Between-run precision: Additional variability between runs
Between-day precision: Additional variability between days
Within-laboratory precision: Total precision within a single laboratory (combines between-day, between-run, and error variance)
Reproducibility: Total precision including between-site variability (for multi-site designs)
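In terms of the variance components of the model, the measures above can be written as follows. This is a sketch using the conventional definitions; the subscripts are notation chosen here, and the use of the grand mean for the coefficient of variation is an assumption about how cv_pct is computed.

```latex
$$\begin{aligned}
s_{\text{repeatability}} &= \sqrt{\sigma^2_{\epsilon}} \\
s_{\text{between-run}} &= \sqrt{\sigma^2_R} \\
s_{\text{between-day}} &= \sqrt{\sigma^2_D} \\
s_{\text{within-lab}} &= \sqrt{\sigma^2_D + \sigma^2_R + \sigma^2_{\epsilon}} \\
s_{\text{reproducibility}} &= \sqrt{\sigma^2_S + \sigma^2_D + \sigma^2_R + \sigma^2_{\epsilon}} \\
CV\% &= 100 \cdot s / \bar{y}
\end{aligned}$$
```

For example, with component SDs of 1.5 (day), 1.0 (run), and 2.0 (error), the within-laboratory SD is $\sqrt{2.25 + 1 + 4} = \sqrt{7.25} \approx 2.69$.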
Chesher D (2008). Evaluating assay precision. Clinical Biochemist Reviews, 29(Suppl 1):S23-S26.
ISO 5725-2:2019. Accuracy (trueness and precision) of measurement methods and results - Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method.
Searle SR, Casella G, McCulloch CE (1992). Variance Components. Wiley, New York.
Satterthwaite FE (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2:110-114.
verify_precision() for comparing results to manufacturer claims,
plot.precision_study() for visualization,
summary.precision_study() for detailed summary
# Example with simulated precision data
set.seed(42)

# Generate study design: 20 days x 2 runs x 2 replicates
n_days <- 20
n_runs <- 2
n_reps <- 2
prec_data <- expand.grid(
  day = 1:n_days,
  run = 1:n_runs,
  replicate = 1:n_reps
)

# Add realistic variance components. expand.grid() varies `day` fastest,
# so effects are looked up by day and run index instead of using rep(),
# which would assign different "day" effects to rows from the same day.
day_draws <- rnorm(n_days, 0, 1.5)
run_draws <- rnorm(n_days * n_runs, 0, 1.0)
day_effect <- day_draws[prec_data$day]
run_effect <- run_draws[(prec_data$day - 1) * n_runs + prec_data$run]
error <- rnorm(nrow(prec_data), 0, 2.0)
prec_data$value <- 100 + day_effect + run_effect + error

# Run precision study
prec <- precision_study(
  data = prec_data,
  value = "value",
  day = "day",
  run = "run"
)
print(prec)
summary(prec)