
agridat (version 1.8.1)

yates.missing: Factorial experiment with missing values

Description

Potato factorial experiment with missing values

Source

F. Yates, 1933. The analysis of replicated experiments when the field results are incomplete. Emp. J. Exp. Agric., 1, 129-142.

Details

The response variable y is the intensity of infection of potato tubers inoculated with Phytophthora erythroseptica. Yates (1933) presents an iterative algorithm for estimating missing values in a matrix, using this data as an example.
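
The idea of iteratively estimating missing cells can be sketched as follows. This is a minimal illustration that assumes an additive treatment + block model and is not necessarily the exact procedure in Yates (1933). The helper name fill_additive and the small table are hypothetical and are not part of agridat. Each pass fits row mean + column mean - grand mean to the current table and overwrites only the missing cells, stopping when the filled values stop changing.

```r
# Hypothetical helper (not part of agridat): iteratively fill the missing
# cells of a two-way table under an additive row + column model.
fill_additive <- function(mat, tol = 1e-8, max_iter = 500) {
  miss <- is.na(mat)
  if (!any(miss)) return(mat)               # nothing to estimate
  filled <- mat
  filled[miss] <- mean(mat, na.rm = TRUE)   # starting values
  for (iter in seq_len(max_iter)) {
    # Additive fit: row mean + column mean - grand mean
    fit <- outer(rowMeans(filled), colMeans(filled), "+") - mean(filled)
    change <- max(abs(filled[miss] - fit[miss]))
    filled[miss] <- fit[miss]               # update only the missing cells
    if (change < tol) break
  }
  filled
}

# Made-up table that is exactly additive, with one cell removed
full <- outer(c(1, 2, 4), c(10, 20, 30, 50), "+")
obs <- full
obs[2, 3] <- NA
est <- fill_additive(obs)
round(est[2, 3], 4)   # the removed value was 32
```

Because the toy table is exactly additive, the iteration should return the removed value. On real data the converged values would be expected to match the least-squares predictions from an additive model such as lm(y ~ trt + block).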

References

Steel & Torrie, 1980, Principles and Procedures of Statistics, 2nd Edition, page 212.

Examples

dat <- yates.missing

library(reshape2)
# Reshape to a treatment-by-block matrix; missing cells stay NA
mat0 <- acast(dat[, c('trt','block','y')], trt~block, value.var='y')

# Use lm to estimate missing values.  The estimated missing values
# are the same as in Yates (1933)
m1 <- lm(y~trt+block, dat)
dat$pred <- predict(m1, newdata=dat[, c('trt','block')])
# Observed values, with missing cells replaced by their predictions
dat$filled <- ifelse(is.na(dat$y), dat$pred, dat$y)
# Matrix of fitted values from the additive model (all cells)
mat1 <- acast(dat[, c('trt','block','pred')], trt~block, value.var='pred')

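# Another method to estimate missing values via PCA
# (pcaMethods is a Bioconductor package)
library(pcaMethods)
m2 <- pca(mat0, method="nipals", center=FALSE, nPcs=3)
# Reconstruct the full matrix from scores and loadings
mat2 <- m2@scores %*% t(m2@loadings)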
# Compare
ord <- c("0","N","K","P","NK","NP","KP","NKP")
print(mat0[ord,], na.print=".")
round(mat1[ord,] ,2)
round(mat2[ord,] ,2)

# Sum of squared differences over the observed cells.
# The 3-component PCA reconstruction recovers the original data better
sum((mat0-mat1)^2, na.rm=TRUE) # additive lm fit
sum((mat0-mat2)^2, na.rm=TRUE) # Smaller SS => better fit
