Adds either a correlation, two-dimensional tensor-product natural cubic
smoothing spline (TPNCSS), or a two-dimensional tensor-product penalized P-spline
model (TPPS) to account for the local spatial variation exhibited by a response variable
measured on a potentially irregular grid of rows and columns of the units. The data may
be arranged in sections for each of which there is a grid and for which the model is to
be fitted separately. Also, the rows and columns of a grid are not necessarily one
observational unit wide. The spatial model is only added if the information criterion of
the supplied model is decreased with the addition of the local spatial model. For
TPPS
models for which the order of differencing the penalty matrix is two, the
improvement in the fit from rotating the eigenvectors of the penalty matrix can be
investigated; if there is no improvement, the unrotated fit will be returned.
A row is added for each section
to the test.summary
data.frame
of the asrtests.object
stating whether or not the new model has been
swapped for a model in which the spatial model has been add to the supplied model.
Convergence and the occurrence of fixed correlations in fitting the
model is checked and a note included in the action
if there was not.
All components of the asrtests.object
are updated to exhibit the
differences between the supplied and the new model, if a spatial model is added.
# S3 method for asrtests
addSpatialModelOnIC(asrtests.obj, spatial.model = "TPPS",
sections = NULL,
row.covar = "cRow", col.covar = "cCol",
row.factor = "Row", col.factor = "Col",
corr.funcs = c("ar1", "ar1"), corr.orders = c(0, 0),
row.corrFitfirst = TRUE,
allow.corrsJointFit = TRUE, nugget.variance = TRUE,
dropFixed = NULL, dropRandom = NULL,
nsegs = NULL, nestorder = c(1,1),
degree = c(3,3), difforder = c(2,2),
usRandLinCoeffs = TRUE,
rotateX = FALSE, ngridangles = NULL,
which.rotacriterion = "AIC", nrotacores = 1,
asreml.option = "grp", tpps4mbf.obj = NULL,
allow.unconverged = TRUE, allow.fixedcorrelation = TRUE,
checkboundaryonly = FALSE, update = TRUE, trace = FALSE,
maxit = 30, IClikelihood = "full", which.IC = "AIC", ...)
An asrtests.object
containing the components (i) asreml.obj
,
possibly with attribute theta.opt
,
(ii) wald.tab
, and (iii) test.summary
for the model whose fit has
the smallest information criterion between the supplied and spatial model. The values
of the degrees of freedom and the information criteria in the test.summary
are
differences between those of the changed model and those of the model supplied to
addSpatialModelOnIC
. If the
asrtests.object
is the result of fitting a TPPCS
model with
an exploration of the rotation of the eigenvectors of the penalty matrix for the linear
components, then the asreml.obj
will have an attribute theta.opt
that contains
the optimal rotation angles of the eigenvectors.
An asrtests.object
containing the components
(i) asreml.obj
, (ii) wald.tab
, and (iii) test.summary
.
A single character
string nominating the type of spatial
model to fit. Possible values are corr
, TPNCSS
and
TPPS
.
A single character
string that specifies the name of the column
in the data.frame
that contains the factor
that identifies different sections of the data to which separate spatial
models are to be fitted. Note that, for other terms that involve
sections
in the random formula, there should be separate terms for
each level of sections
. For example, in a blocked experiment
involving multiple sites, there should be the sum of separate terms for
the Blocks at each Site i.e. a formula that contains terms like
at(Site, i):Block for each site and these are separated by '+'
.
Otherwise, the combined term (e.g. Site:Block) will impact on the
fitting of the local spatial models for the different Sites. Similarly,
a separate residual variance for each of the sections
should be
fitted, unless there is a need to fit a different variance structure to
the residual, e.g. heterogeneous residual variances depending on
treatments. Separate residual variances for sections
can be
achieved using the asreml
functions dsum
or idh
.
Because, unlike random terms, terms for residual variances are not
removed from the model, compound residual terms can be used to include
them in the model, e.g. terms with idh
or dsum
with multiple
levels
in the list
or leaving levels
out altogether.
In addition to allowing the independent fitting of models to the
sections
, separate residual variance terms allows a nugget variance
to be fitted in a correlation model for each of the sections
.
A single character
string nominating a numeric
that contains the values of a centred covariate indexing the rows of a
grid. The numeric
must be a column in the
data.frame
stored in the asreml.obj
that is a
component of the supplied asrtests.obj
.
A single character
string nominating a numeric
that contains the values of a centred covariate indexing the columns
of a grid. The numeric
must be a column in the
data.frame
stored in the asreml.obj
that is a
component of the supplied asrtests.obj
.
A single character
string nominating a factor
that indexes the rows of a grid that are to be one dimension of a
spatial correlation model. The factor
must a column in
the data.frame
stored in the asreml.obj
that is
a component of the supplied asrtests.obj
.
A single character
string nominating a factor
that indexes the columns of a grid that are to be one dimension of a
spatial correlation model. The factor
must a column in
the data.frame
stored in the asreml.obj
that is
a component of the supplied asrtests.obj
.
A single character
string of length two that specifies the
asreml
one-dimensional correlation or variance model function
for the row and column dimensions of a two-dimensional separable spatial
correlation model to be fitted when spatial.model
is corr
;
the two-dimensional model is fitted as a random term. If a correlation
or variance model is not to be investigated for one of the dimensions,
specify ""
for that dimension. If the correlation model is
corb
, the values of corr.orders
are used for its order
argument (b
).
A numeric
of length two that specifies the order argument
(b
) values for the row and column dimensions of a two-dimensional
separable spatial correlation model when spatial.model
is
corr
and the corr.funcs
for a dimension is corb
,
the asreml
banded correlation model. If one of the dimensions
does not involve an order argument, set the value of corr.orders
for that dimension to zero. For a dimension for which the
corr.funcs
is corb
and corr.orders
is zero, a
model with a single band, the correlation between immediate neighbours,
will be fitted and then further bands, up to a maximum of 10 bands, will
be added until the addition of an extra band does not reduce the
information criterion nominated using which.IC
. Note that the
two-dimensional spatial model is fitted as a random term.
A logical
. If TRUE
then, in fitting the
model for spatial.model
set to corr
, the row correlation
or variance function is fitted first, followed by the addition of the
column correlation or variance function. If FALSE
, the order
of fitting is reversed.
A logical
which, if TRUE
, will allow
the simultaneous fitting of correlation functions for the two dimensions
of the grid when separate fits have failed to fit any correlation
functions. This argument is available for when a joint fit
hangs the system.
A logical
which, if TRUE
, will result in
an attempt to fit a nugget or unit-specific variance. Otherwise,
a nugget or unit-specific variance will not be fitted.
A single character
string or a character
vector of strings
with an element for each level of sections
in the same order as the
sections
levels. Each string, which if it is not NA
and after
the addition of ". ~ . -"
and conversion to a formula that is then
expanded, specifies the sum of a set of terms to be dropped from the fixed
formula in fitting splines (TPPS
and TPNCSS
). The result is
that the fitted model supplied in the asrtests.obj
, that includes these
terms, will be compared with the fitted model that has had them removed and
a spatial model added.
An element that is NA
indicates that no term pertaining to the
corresponding sections
level is to be removed. If sections
is not NULL
and a single character
string has been supplied,
the terms specified by the string are taken to be terms that are
independent of the sections
and will be removed when adding the
spatial model for the first sections
.
The terms must match those in the wald.tab
component of the
asrtests.obj
. The fixed terms will be reordered so that
single-variable terms come first, followed by two-variable terms and
so on. Note also that multiple terms specified using a single
asreml::at
function can only be dropped as a whole. If the term
was specified using an asreml::at
function with a single level,
then it can be removed and either the level itself or its
numeric
position in the levels returned by the
levels
function can be specified.
A single character
string or a character
vector of strings
with an element for each level of sections
in the same order as the
sections
levels. Each string, which if it is not NA
and after
the addition of " ~ . -"
and conversion to a formula that is then
expanded, specifies the sum of a set of terms to be dropped from the random
formula in fitting splines (TPPS
and TPNCSS
). The result is
that the fitted model supplied in the asrtests.obj
, that includes
these terms, will be compared with the fitted model that has had them removed
and a spatial model added.
An element that is NA
indicates that no term pertaining to the
corresponding sections
level is to be removed. If sections
is not NULL
and a single character
string has been supplied,
the terms specified by the string are taken to be terms that are
independent of the sections
and will be removed when adding the
spatial model for the first sections
.
The terms must match those in the vparameters
component of the
asreml.obj
component in the asrtests.obj
. Note also that
multiple terms specified using a single asreml::at
function
can only be dropped as a whole. If the term was specified using
an asreml::at
function with a single level, then it can be
removed and either the level itself or its numeric
position in the levels returned by the levels
function
can be specified.
A pair of numeric
values giving the number of segments into
which the column and row ranges are to be split, respectively, for fitting
a P-spline model (TPPS
) (each value specifies the number of internal
knots + 1). If not specified, then (number of unique values - 1) is used
in each dimension; for a grid layout with equal spacing, this gives a knot
at each data value. If sections
is not NULL
and the grid
differs between the sections
, then nsegs
will differ between
the sections
.
A numeric
of length 2. The order of nesting for column
and row dimensions, respectively, in fitting a P-spline model (TPPS
).
A value of 1 specifies no nesting, a value of 2 generates a spline with
half the number of segments in that
dimension, etc. The number of segments in each direction must be a
multiple of the order of nesting.
A numeric
of length 2. The degree of polynomial spline to
be used for column and row dimensions respectively, in fitting a P-spline
(TPPS
).
A numeric
of length 2. The order of differencing for
column and row dimensions, respectively, in fitting a P-spline
(TPPS
).
A logical
which, if TRUE
, will attempt to
fit an unstructured variance model to the constant and linear terms in
the interactions for constant and linear terms in one grid dimension
interacting with smooth terms in the second grid dimension. The
unstructured variance model can only be fitted if both the constant and
linear interaction terms have been retained in the fitted model.
This argument can be used to omit the attempt to fit an unstructured
variance model when the attempt results in a system error.
A logical
indicating whether to rotate the eigenvectors of the
penalty matrix, as described by Piepho, Boer and Williams (2022), when
fitting a P-spline (TPPS
). Setting rotateX
to TRUE
results in a search for an optimized rotation under a model that omits
the random spline interaction terms. If ngridangles
is set to
NULL
, the optimal rotation us found using an optimizer
(nloptr::bobyqa
). Otherwise, the optimal rotation is found by
exploring the fit over a two-dimensional grid of rotation angle pairs.
The optimization seeks to optimize the criterion nominated in
which.rotacriterion
. Rotation of the eigenvectors is only relevant
for difforder
values greater than 1 and has only been implemented
for difforder
equal to 2.
A numeric
of length 2. If NULL
(the default),
the optimal pair of angles for rotating the eigenvectors of the penalty
matrix of a P-spline (TPPS
) will be determined using a nonlinear
optimizer (nloptr::bobyqa
). Otherwise, its two values specify the
numbers of angles between 0 and 90 degrees for each of the row and column
dimensions to be used in determining the optimal pair of angles. Specifying
factors of 90 will result in integer-valued angles. The number of grid
points, and hence re-analyses will be the product of the values of
(ngridangles
+ 1).
A single character
string nominating which of the
criteria out of the deviance
, the likelihood
, the
AIC
and the BIC
in determining the optimal rotation of the
eigenvectors of the penalty matrix. The deviance
uses the
REML value computed by asreml
; the other criteria use the
full likelihood, evaluated using the REML estimates, that is computed
by infoCriteria.asreml
.
A numeric
specifying the number of cores to deploy for
running the analyses required to search the two-dimensional grid of rotation
angles when rotateX
is TRUE
. Parallel processing has been
implemented for analyzing, for each column angle, the set of angles to be
investigated for the row dimension. The default value of one means that
parallel processing will not be used. The value chosen for nrotacores
needs to balanced against the other processes that are using parallel
processing at the same time.
A single character
string specifying whether the grp
or
mbf
methods are to be used to supply externally formed covariate
matrices to asreml
when fitting a P-spline (TPPS
). Compared
to the mbf
method, the grp
method is somewhat faster, but
creates large asrtests.object
s for which the time it takes
to save them can exceed any gains in execution speed. The grp
method adds columns to the data.frame
containing the data
.
On the other hand, the mbf
method adds only the fixed covariates
to data
and stores the random covariates in the environment of the
internal function that calls the spline-fitting function; there are three smaller
data.frames
for each section
that are not stored in the
asreml.object
resulting from the fitted model.
An object made with makeTPPSplineMats.data.frame
that contains the spline basis information for fitting P-splines.
The argument tpps4mbf.obj
only needs to be set when the mbf
option of asreml.option
is being used and it is desired to use
mbf
data.frames
that have been created and stored
prior to calling
addSpatialModelOnIC.asrtests
. If tpps4mbf.obj
is NULL
, makeTPPSplineMats.data.frame
will be called
internally to produce the required mbf
data.frames
.
A logical
indicating whether to accept a new model
even when it does not converge. If FALSE
and the fit of the new
model does not converge, the supplied asrtests.obj
is returned.
Also, if FALSE
and the fit of the new model has converged, but that
of the old model has not, the new model will be accepted.
A logical
indicating whether to accept a new model
even when it contains correlations in the model whose values have been
designated as fixed, bound or singular. If FALSE
and the new model
contains correlations whose values have not been able to be estimated,
the supplied asrtests.obj
is returned. The fit in the
asreml.obj
component of the supplied asrtests.obj
will
also be tested and a warning issued if both fixed correlations are found
in it and allow.fixedcorrelation
is FALSE
.
If TRUE
then boundary and singular terms are not removed by
rmboundary.asrtests
; a warning is issued instead. Note that, for
correlation models, the fitting of each dimension and the test for a nugget
term are performed with checkboundaryonly
set to TRUE
and its
supplied setting only honoured using a call to rmboundary.asrtests
immediately prior to returning the final result of the fitting.
If TRUE
, then newfit.asreml
is called to fit the
model to be tested, using the values of the variance parameters stored in
the asreml.object
, that is stored in asrtests.obj
, as starting values.
If FALSE
, then newfit.asreml
will not use the stored
variance parameter values as starting values when fitting the new model,
the only modifications being (i) to add the terms for the spatial models
and (ii) those specified via ...
.
If TRUE
then the stages in fitting a correlation model are displayed.
A character
specifying the information criterion to be used in
selecting the best model. Possible values are AIC
and BIC
.
The value of the criterion for supplied model must exceed that for changed
model for the changed model to be returned. (For choosing the rotation angle
of the eigenvectors of the penalty matrix, see which.rotacriterion
.
A numeric
specifying the maximum number of iterations that
asreml
should perform in fitting a model.
A character
specifying whether Restricted Maximum Likelihood
(REML
) or the full likelihood, evaluated using REML estimates,
(full
) are to be used in calculating the information criteria
to be included in the test.summary
of an
asrtests.object
or to be used in choosing the best model.
Further arguments passed to changeModelOnIC.asrtests
, asreml
and
tpsmmb
.
Chris Brien
A fitted spatial model is only returned if it improves the fit over and above that of achieved with the model fit supplied in the asrtests.obj
. To fit the spatial model without any hypotheses testing or comparison of information criteria use addSpatialModel.asrtests
. The model fit supplied in the asrtests.obj
should not include terms that will be included in the local spatial model. All spatial model terms are fitted as fixed or random. Consequently, the residual model does not have to be iid. Note that the data must be in the order that corresponds to the residual
argument with a variable to the right of another variable changes levels in the data frame faster than those of the other variable e.g. Row:Column
implies that all levels for Column
in consecutive rows of the data.frame
with a single Row
level.
For the corr
spatial model, the default model is an autocorrelation model of order one (ar1
) for each dimension. However, any of the single dimension correlation/variance models from asreml
can be specified for each dimension, as can no correlation model for a dimension; the models for the two dimensions can differ. Using a forward selection procedure, a series of models are tried, without removing boundary or singular terms, beginning with the addition of row correlation and followed by the addition of column correlation or, if the row.corrFitfirst
is set to FALSE
, the reverse order. If the fitting of the first-fitted correlation did not result in a model change because the fitting did not converge or correlations were fixed, but the fit of the second correlation was successful, then adding the first correlation will be retried. If one of the metric correlation functions is specified (e.g. exp
), then the row.covar
or col.covar
will be used in the spatial model. However, because the correlations are fitted separately for the two dimensions, the row.factor
and col.factor
are needed for all models and is used for a dimension that does not involve a correlation/variance function for the fit being performed. Also, the correlation models are fitted as random
terms and so the correlation model will include a variance parameter for the grid even when ar1
is used to specify the correlation model, i.e. the model fitted is a variance model and there is no difference between ar1
and ar1v
in fitting the model. The variance parameter for this term represents the spatial variance and the fit necessarily includes a nugget term, this being the residual variance. If any correlation is retained in the model, for a section if sections
is not NULL
, then the need for a nuggest term is assessed by fixing the corresponding residual variance to one, unless there are multiple residual variances and these are not related to the sections
. Once the fitting of the correlation model has been completed, the rmboundary
function will be executed with the checkboundaryonly
value supplied in the addSpatialModelOnIC.asrtests
call. Finally, checking for bound and singular random terms associated with the correlation model and residual terms will be carried out when there are correlation terms in the model and checkboundaryonly
has been set to FALSE
; as many as possible will be removed from the fitted model, in some cases by fixing variance terms to one.
The tensor-product natural-cubic-smoothing-spline (TPNCSS
) spatial model is as described by Verbyla et al. (2018), the tensor-product penalized-cubic-spline (TPPSC2
) model with second-order differencing of the penalty is similar to that described by Rodriguez-Alvarez et al. (2018), and the tensor-product, first-difference-penalty, linear spline (TPPSL1
) model is amongst those described by Piepho, Boer and Williams (2022). The fixed terms for the spline models are row.covar + col.covar + row.covar:col.covar
and the random terms are spl(row.covar) + spl(col.covar) + dev(row.covar) + dev(col.covar) + spl(row.covar):col.covar + row.covar:spl(col.covar) + spl(row.covar):spl(col.covar)
, except that spl(row.covar) + spl(col.covar)
is replaced with spl(row.covar):int(col.covar) + int(row.covar):spl(col.covar)
in the TPPSC2
model, where int(.)
indicates an intercept or constant value specific to its argument. For TPPSL1
models, the terms spl(row.covar):col.covar + row.covar:spl(col.covar)
are omitted, The supplied model should not include any of these terms. However, any fixed or random main-effect Row or Column term that has been included as an initial model for comparison with a spatial model can be removed prior to fitting the spatial model using dropFixed
or dropRandom
. For the P-spline models with second-order differencing, the model matrices used to fit the pairs of random terms (i) spl(row.covar):int(col.covar)
and spl(row.covar):col.covar
and (ii) int(row.covar):spl(col.covar)
and row.covar:spl(col.covar)
are transformed using the spectral decomposition of their penalty matrices. An unstructured variance model is tried for each of these pairs and retained if it improves the fit. For TPPSC2
, it is also possible to optimize the rotation of the null-space eigenvectors of the penalty matrix for each of these random-term pairs (for more information see Piepho, Boer and Williams, 2022). The optimization is achieved either using an optimizer or takes the form of a search over a grid of rotation angles for a reduced model; the fit of the full model with rotation using the optimal rotation angles will only be returned if it improves on the fit of the full, unrotated model.
The TPPCS
and TPP1LS
models are fitted using functions from the R
package TPSbits
authored by Sue Welham (2022). There are two methods for supplying the spline basis information produced by tpsmmb
to asreml
. The grp
method adds it to the data.frame
supplied in the data
argument of the asreml
call. The mbf
method creates smaller data.frames
with the spline basis information in the same environment as the internal function that calls the spline-fitting function. If it is desired to use in a later session, an asreml
function, or asrtests
function that calls asreml
, (e.g. predict.asreml
, predictPlus.asreml
, or changeTerms.asrtests
) on an asreml.object
created using mbf
terms, then the mbf
data.frames
will need to be recreated using makeTPPSplineMats.data.frame
in the new session, supplying, if there has been rotation of the penalty matrix eigenvectors, the theta
values that are returned as the attribute theta.opt
of the asreml.obj
.
All models utlize the function changeModelOnIC.asrtests
to assess the model fit, the information criteria used in assessing the fit being calculated using infoCriteria
. Any bound
terms are removed from the model. Arguments from tpsmmb
and changeModelOnIC.asrtests
can be supplied in calls to addSpatialModelOnIC.asrtests
and will be passed on to the relevant function through the ellipses argument (...).
The data for experiment can be divided sections
and the same spatial model fitted separately to each. The fit over all of the sections is assessed. For more detail see sections
above.
Each combination of a row.coords and a col.coords does not have to specify a single observation; for example, to fit a local spatial model to the main units of a split-unit design, each combination would correspond to a main unit and all subunits of the main unit would have the same combination.
Piepho, H.-P., Boer, M. P., & Williams, E. R. (2022). Two-dimensional P-spline smoothing for spatial analysis of plant breeding trials. Biometrical Journal, 64, 835-857.
Rodriguez-Alvarez, M. X., Boer, M. P., van Eeuwijk, F. A., & Eilers, P. H. C. (2018). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics, 23, 52-71.
Verbyla, A. P., De Faveri, J., Wilkie, J. D., & Lewis, T. (2018). Tensor Cubic Smoothing Splines in Designed Experiments Requiring Residual Modelling. Journal of Agricultural, Biological and Environmental Statistics, 23(4), 478-508.
Welham, S. J. (2022) TPSbits
: Creates Structures to Enable Fitting and Examination of 2D Tensor-Product Splines using ASReml-R. Version 1.0.0 https://mmade.org/tpsbits/
as.asrtests
,
makeTPPSplineMats.data.frame
,
addSpatialModel.asrtests
,
chooseSpatialModelOnIC.asrtests
,
changeModelOnIC.asrtests
,
changeTerms.asrtests
,
rmboundary.asrtests
,
testranfix.asrtests
,
testresidual.asrtests
,
newfit.asreml
,
reparamSigDevn.asrtests
,
changeTerms.asrtests
,
infoCriteria.asreml
if (FALSE) {
data(Wheat.dat)
#Add row and column covariates
Wheat.dat <- within(Wheat.dat,
{
cColumn <- dae::as.numfac(Column)
cColumn <- cColumn - mean(unique(cColumn))
cRow <- dae::as.numfac(Row)
cRow <- cRow - mean(unique(cRow))
})
#Fit initial model
current.asr <- asreml(yield ~ Rep + WithinColPairs + Variety,
random = ~ Row + Column,
data=Wheat.dat)
#Create an asrtests object, removing boundary terms
current.asrt <- as.asrtests(current.asr, NULL, NULL,
label = "Random Row and Column effects")
current.asrt <- rmboundary(current.asrt)
current.asrt <- addSpatialModelOnIC(current.asrt, spatial.model = "TPPS",
row.covar = "cRow", col.covar = "cColumn",
dropRowterm = "Row", dropColterm = "Column",
asreml.option = "grp")
infoCriteria(current.asrt$asreml.obj)
}
Run the code above in your browser using DataLab