makeTPPSplineMats.data.frame: Make the spline basis matrices and data needed to fit Tensor Product P-splines.

Description

Prepares the fixed and random P-spline basis matrices, and associated information, that are needed for fitting Tensor Product P-splines (TPPS) as described by Rodriguez-Alvarez et al. (2018). When asreml.option is set to mbf, makeTPPSplineMats.data.frame must be run prior to fitting TPPS models for local spatial variation using addSpatialModelOnIC.asrtests and chooseSpatialModelOnIC.asrtests. The spline basis matrices are created in the parent environment of makeTPPSplineMats.data.frame when it is called. If the grp option is to be used to supply the basis functions to asreml-R, then this function need not be called; the spatial spline fitting functions will set up the basis functions.

Usage

# S3 method for data.frame
makeTPPSplineMats(data, sections = NULL, 
                  row.covar, col.covar, 
                  nsegs = NULL, nestorder = c(1,1), 
                  degree = c(3,3), difforder = c(2,2),
                  asreml.option = "mbf", ...)

Value

A list of length equal to the number of sections is produced. Each of these components is a list with 8 or 9 components. One of them is named data.plus, being the input data.frame to which has been added the columns required to fit the TPPS model (the data.frame stored in the data component holds only the covariates from data).

The list for a section has 8 or 9 components (according to the asreml.option):

data = the input data frame augmented with structures required to fit tensor product splines in asreml-R. This data frame can be used to fit the TPS model.

Added columns:
- TP.col, TP.row = column and row coordinates
- TP.CxR = combined index for use with smooth x smooth term
- TP.C.n for n=1:diff.c = X parts of column spline for use in random model (where diff.c is the order of column differencing)
- TP.R.n for n=1:diff.r = X parts of row spline for use in random model (where diff.r is the order of row differencing)
- TP.CR.n for n=1:(diff.c*diff.r) = interaction between the two X parts for use in fixed model. The first variate is a constant term which should be omitted from the model when the constant (1) is present. If all elements are included in the model then the constant term should be omitted, e.g. y ~ -1 + TP.CR.1 + TP.CR.2 + TP.CR.3 + TP.CR.4 + other terms...
- when asreml="grp" or "sepgrp", the spline basis functions are also added into the data frame. Column numbers for each term are given in the grp list structure.
mbflist = list that can be used in the call to asreml (so long as the Z matrix data frames are extracted with the right names, e.g. BcZ<stub>.df)
BcZ.df = mbf data frame mapping onto smooth part of column spline, last column (labelled TP.col) gives column index
BrZ.df = mbf data frame mapping onto smooth part of row spline, last column (labelled TP.row) gives row index
BcrZ.df = mbf data frame mapping onto smooth x smooth term, last column (labelled TP.CxR) maps onto col x row combined index
dim = list structure, holding dimension values relating to the model:
1. "diff.c" = order of differencing used in column dimension
2. "nbc" = number of random basis functions in column dimension
3. "nbcn" = number of nested random basis functions in column dimension used in smooth x smooth term
4. "diff.r" = order of differencing used in row dimension
5. "nbr" = number of random basis functions in row dimension
6. "nbrn" = number of nested random basis functions in row dimension used in smooth x smooth term
trace = list of trace values for ZGZ' for the random TPspline terms, where Z is the design matrix and G is the known diagonal variance matrix derived from eigenvalues. This can be used to rescale the spline design matrix (or equivalently variance components).
grp = list structure, only added for setting asreml="grp". For asreml="grp", provides column indexes for each of the 5 random components of the 2D splines in data.plus. Dimensions of the components can be derived from the values in the dim item.
data.plus = the input data.frame to which has been added the columns required to fit tensor product splines in asreml-R. This data.frame can be used to fit the TPS model. For multiple sections, this data.frame will occur in the component for each section.
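
The naming pattern of the added TP.C.n, TP.R.n and TP.CR.n columns, and the rule about dropping the constant TP.CR.1 when an intercept is fitted, can be illustrated with a small base-R sketch. The differencing orders are assumed to be the defaults and the response name y is hypothetical; the sketch only builds names and a formula, it does not call asreml.

```r
# Assumed differencing orders for the column and row dimensions (the defaults)
diff.c <- 2
diff.r <- 2

# Names of the added "X part" columns, following the pattern described above
col.terms <- paste0("TP.C.", seq_len(diff.c))
row.terms <- paste0("TP.R.", seq_len(diff.r))
int.terms <- paste0("TP.CR.", seq_len(diff.c * diff.r))

# With an intercept in the model, the constant TP.CR.1 is left out
fixed.terms <- int.terms[-1]

# Hypothetical response "y"; reformulate() keeps the intercept by default
fixed.form <- reformulate(fixed.terms, response = "y")
```

With difforder = c(2,2) this gives four TP.CR columns, of which three remain in a fixed formula that includes the intercept.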

Arguments

data: A data.frame that holds the spline bases for a section. It is indexed by columns named col and row.
sections: A single character string that specifies the name of the column in the data.frame that contains the factor that identifies different sections of the data to which separate spatial models are to be fitted.
row.covar: A single character string nominating a numeric column in the data.frame that contains the values of a covariate indexing the rows of the grid.
col.covar: A single character string nominating a numeric column in the data.frame that contains the values of a covariate indexing the columns of the grid.
nsegs: A pair of numeric values giving the number of segments into which the column and row ranges are to be split, respectively (each value specifies the number of internal knots + 1). If only one number is specified, that value is used in both dimensions. If not specified, then (number of unique values - 1) is used in each dimension; for a grid layout with equal spacing, this gives a knot at each data value.
nestorder: A numeric of length 2. The order of nesting for the column and row dimensions, respectively; default=1 (no nesting). A value of 2 generates a spline with half the number of segments in that dimension, etc. The number of segments in each direction must be a multiple of the order of nesting.
degree: A numeric of length 2. The degree of polynomial spline to be used for the column and row dimensions, respectively; default=3.
difforder: A numeric of length 2. The order of differencing for the column and row dimensions, respectively; default=2.
asreml.option: A single character string whose value may be mbf or grp, indicating the method to be used to supply externally formed covariate matrices to asreml.
...: Further arguments passed to tpsmmb from package TPSbits.
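
The default for nsegs and the constraint linking nsegs and nestorder can be checked before calling the function. The following is a minimal base-R sketch of the rules stated above; the data frame grid.dat and the helper default_nsegs are hypothetical names introduced for illustration and are not part of asremlPlus or TPSbits.

```r
# Hypothetical grid of 6 rows by 9 columns with numeric covariates
grid.dat <- expand.grid(cRow = 1:6, cColumn = 1:9)

# Default number of segments: (number of unique values - 1) per dimension,
# given in the order column, row (the order used by nsegs)
default_nsegs <- function(data, col.covar, row.covar) {
  c(length(unique(data[[col.covar]])) - 1,
    length(unique(data[[row.covar]])) - 1)
}

nsegs <- default_nsegs(grid.dat, col.covar = "cColumn", row.covar = "cRow")

# The number of segments in each direction must be a multiple of nestorder
nestorder <- c(2, 1)
nest.ok <- all(nsegs %% nestorder == 0)
```

Here the column dimension has 8 segments, so a nesting order of 2 is admissible for columns, whereas it would not be for the 5 segments in the row dimension.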

Author

Chris Brien

Details

The objects are formed using the function tpsmmb from the R package TPSbits authored by Sue Welham (2022).
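
For orientation, the basic ingredients of a one-dimensional P-spline (a B-spline basis on equally spaced knots plus a difference penalty) can be sketched with the splines package that ships with R. This is an illustrative sketch under the default degree and difforder, with a hypothetical covariate x; it is not the tpsmmb implementation and does not reproduce the fixed and random decomposition that tpsmmb returns.

```r
library(splines)

# Hypothetical one-dimensional covariate: 10 equally spaced positions
x <- 1:10
degree <- 3                      # cubic B-splines (the default degree)
difforder <- 2                   # second-order differences (the default difforder)
nseg <- length(unique(x)) - 1    # default number of segments

# Equally spaced knots, extended by `degree` intervals beyond each end
dx <- (max(x) - min(x)) / nseg
knots <- seq(min(x) - degree * dx, max(x) + degree * dx, by = dx)

# B-spline basis: one row per position, nseg + degree columns
B <- splineDesign(knots, x, ord = degree + 1, outer.ok = TRUE)

# Difference matrix and the penalty matrix built from it
D <- diff(diag(ncol(B)), differences = difforder)
P <- crossprod(D)
```

Second-order differences leave constant and linear coefficient sequences unpenalised, which is why a differencing order of 2 corresponds to two unpenalised terms per dimension.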

Each combination of a row.covar and a col.covar does not have to specify a single observation; for example, to fit a local spatial variation model to the main units of a split-unit design, each combination would correspond to a main unit and all subunits of the main unit would have the same combination.
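
A layout of this kind can be mocked up in base R to see what repeated covariate combinations look like. The data frame, its dimensions and the column names below are hypothetical and serve only to illustrate the data structure described above.

```r
# Hypothetical split-unit layout: 4 rows x 5 columns of main units,
# each main unit containing 3 subunits
main.units <- expand.grid(cRow = 1:4, cColumn = 1:5)
split.dat <- main.units[rep(seq_len(nrow(main.units)), each = 3), ]
split.dat$Subunit <- rep(1:3, times = nrow(main.units))
rownames(split.dat) <- NULL

# Every (cRow, cColumn) combination occurs once per subunit
combo.counts <- table(split.dat$cRow, split.dat$cColumn)
```

All subunits of a main unit share the same cRow and cColumn values, so each combination appears three times in the data.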

The data for an experiment can be divided into sections, and the spline bases and associated data will be produced for each section. If there is more than one section, then a list is returned that has a component for each section. The contents of the component for a section are described under Value.

References

Rodriguez-Alvarez, M. X., Boer, M. P., van Eeuwijk, F. A., & Eilers, P. H. C. (2018). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics, 23, 52-71.

Welham, S. J. (2022) TPSbits: Creates Structures to Enable Fitting and Examination of 2D Tensor-Product Splines using ASReml-R. Version 1.0.0 https://mmade.org/tpsbits/

Examples

if (FALSE) {

data(Wheat.dat)

#Add row and column covariates
Wheat.dat <- within(Wheat.dat, 
                    {
                      cColumn <- dae::as.numfac(Column)
                      cColumn <- cColumn  - mean(unique(cColumn))
                      cRow <- dae::as.numfac(Row)
                      cRow <- cRow - mean(unique(cRow))
                    })

#Set up the matrices
tps.XZmat <- makeTPPSplineMats(Wheat.dat, 
                                row.covar = "cRow", col.covar = "cColumn")
}
