Learn R Programming

asremlPlus (version 4.4.15)

chooseSpatialModelOnIC.asrtests: Uses information criteria to choose the best fitting spatial model for accounting for local spatial variation.

Description

For a response variable measured on a potentially irregular grid of rows and columns of the units, uses information criteria to decide whether to add to the fitted model stored in the supplied asrtests.object either a two-dimensional exponential correlation model, a two-dimensional tensor-product natural cubic smoothing spline model (TPNCSS), a two-dimensional tensor-product penalized P-spline model (TPPCS) model, or a two-dimensional tensor-product penalized linear spline model with first-difference penalties (TPP1LS) to account for the local spatial variation. The models from which to select can be reduced to a subset of these four models. The data can be arranged in sections, for each of which there is a grid and for which the model is to be fitted separately. Also, the rows and columns of a grid are not necessarily one observational unit wide. The spatial model is only added if the information criterion of the supplied model is decreased with the addition of the local spatial model.

One or more rows is added to the test.summary data.frame of the asrtests.object, for each section and each spatial model, stating whether or not the new model has been swapped for a model in which the spatial model has been added to the supplied model. Convergence in fitting the model is checked and a note included in the action if there was not. All components of the asrtests.object are updated to exhibit the differences between the supplied and any new model.

Usage

# S3 method for asrtests
chooseSpatialModelOnIC(asrtests.obj, trySpatial = "all", 
                       sections = NULL, 
                       row.covar = "cRow", col.covar = "cCol", 
                       row.factor = "Row", col.factor = "Col", 
                       corr.funcs = c("ar1", "ar1"), 
                       row.corrFitfirst = TRUE, 
                       dropRowterm = NULL, dropColterm = NULL, 
                       nsegs = NULL, nestorder = c(1,1), 
                       rotateX = FALSE, ngridangles = c(18, 18), 
                       which.rotacriterion = "AIC", nrotacores = 1, 
                       asreml.option = "mbf", tpps4mbf.obj = NULL, 
                       allow.unconverged = FALSE, 
                       allow.fixedcorrelation = FALSE,
                       checkboundaryonly = FALSE, update = FALSE, 
                       IClikelihood = "full", which.IC = "AIC", 
                       return.asrts = "best", ...)

Value

A list containing four components: (i) asrts, (ii) spatial.IC, (iii) best.spatial.mod, and (iv) best.spatial.IC.

The component asrts itself holds a list of one or more

asrtests.objects, either the best overall out of the supplied model and the spatial models, or, for each spatial model, the best out of the supplied model and that spatial model. Each asrtests.object contains the components: (i) asreml.obj, (ii) wald.tab, and (iii) test.summary. If the

asrtests.object is the result of fitting a TPPCS model with an exploration of the rotation of the eigenvectors of the penalty matrix for the linear components, then the asreml.obj will have an attribute theta.opt that contains the optimal rotation angles of the eigenvectors.

The spatial.IC component holds a data.frame with summary of the values of the information criteria for the supplied model and those resulting from adding the spatial model to the supplied model. In the se of a sptial correlation model, the information criteira for the selected spatial correlation model is returned. If a spatial model could not be fitted, then all returned values will be NA).

The best.spatial component is a character giving the name of the best spatial model, and best.spatial.AIC gives the value of its AIC.

Arguments

asrtests.obj

An asrtests.object containing the components (i) asreml.obj, (ii) wald.tab, and (iii) test.summary.

trySpatial

A character string nominating the types of spatial model whose fits are to be assessed. Possible values are none, corr, TPNCSS, TPPCS, and TPP1LS. If set to none, then just the supplied nonspatial model and the information about its information criteria will be returned.

sections

A single character string that specifies the name of the column in the data.frame that contains the factor that identifies different sections of the data to which separate spatial models are to be fitted. Note that, if there are other terms that involve sections in the random formula, there should be separate terms for each level of sections. For example, in a blocked experiment involving multiple sites, there should be a separate term for the Blocks at each Site i.e. terms of the form at(Site, i):Block, or equivalent, for each site. Otherwise, the combined term (e.g. Site:Block) will affect the fitting of the local spatial model for a site.

row.covar

A single character string nominating a numeric that contains the values of a centred covariate indexing the rows of a grid. The numeric must be a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

col.covar

A single character string nominating a numeric that contains the values of a centred covariate indexing the columns of a grid. The numeric must be a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

row.factor

A single character string nominating a factor that indexes the rows of a grid that are to be one dimension of a spatial correlation model. The factor must a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

col.factor

A single character string nominating a factor that indexes the columns of a grid that are to be one dimension of a spatial correlation model. The factor must a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

corr.funcs

A single character string of length two that specifies the asreml one-dimensional correlation or variance model function for the row and column dimensions of a two-dimensional separable spatial correlation model to be fitted when spatial.model is corr; the two-dimensional model is fitted as a random term. If a correlation or variance model is not to be investigated for one of the dimensions, specify "" for that dimension.

row.corrFitfirst

If TRUE then, in fitting the model for spatial.model set to corr, the row correlation or variance function is fitted first, followed by the addition of the column correlation or variance function. If FALSE, the order of fitting is reversed.

dropRowterm

A single character string nominating a factor in the data.frame that has as many levels as there are unique values in row.covar. This argument is applicable for spatial.model set to TPNCSS or TPPS. It is used to remove a term corresponding to the dropRowterm and a random row deviations term based on row.covar will be included in the model. If the argument is NULL, it is assumed that such a term is not included in the fitted model stored in asrtests.obj.

dropColterm

A single character string nominating a factor in the data.frame that has as many levels as there are unique values in col.covar. This argument is applicable for spatial.model set to TPNCSS or TPPS. It is used to remove a term corresponding to the dropColterm and a random column deviations term based on col.covar will be included in the model. If the argument is NULL, it is assumed that such a term is not included in the fitted model stored in asrtests.obj.

nsegs

A pair of numeric values giving the number of segments into which the column and row ranges are to be split, respectively, for fitting a P-spline model (TPPS) (each value specifies the number of internal knots + 1). If not specified, then (number of unique values - 1) is used in each dimension; for a grid layout with equal spacing, this gives a knot at each data value. If sections is not NULL and the grid differs between the sections, then nsegs will differ between the sections.

nestorder

A numeric of length 2. The order of nesting for column and row dimensions, respectively, in fitting a P-spline model (TPPS). A value of 1 specifies no nesting, a value of 2 generates a spline with half the number of segments in that dimension, etc. The number of segments in each direction must be a multiple of the order of nesting.

rotateX

A logical indicating whether to rotate the eigenvectors of the penalty matrix, as described by Piepho, Boer and Williams (2022), when fitting a P-spline (TPSS). Setting rotateX to TRUE results in an optimized rotation produced by exploring a two-dimensional grid of rotation angle pairs, and for each pair analyzing the data under a model that omits the random interaction terms. The angle pair with the minimum deviance is used to apply an optimized rotation. Rotation of the eigenvectors is only relevant for difforder values greater than 1 and has only been implemented for difforder equal to 2.

ngridangles

A numeric of length 2. The numbers of angles between 0 and 90 degrees for row and column dimensions to be used in determining the optimal pair of angles for rotating the eignevectors of the penalty matrix of a P-spline (TPSS). Specifying factors of 90 will result in integer-valued angles. The number of grid points, and hence re-analyses will be the product of the two sums. a sum being one plus each of the values for ngridangles.

which.rotacriterion

A single character string nominating which of the criteria, out of the deviance, the likelihood, the AIC and the BIC, is to be used in determining the optimal rotation of the eigenvectors of the penalty matrix. The deviance uses the REML value computed by asreml; the other criteria use the full likelihood, evaluated using the REML estimates, that is computed by infoCriteria.asreml.

nrotacores

A numeric specifying the number of cores to deploy for running the analyses required to search the two-diemsional grid of rotation angles when rotateX is TRUE. Parallel processing has been implemented for analyzing, for each column angle, the set of angles to be investigated for the row dimension. The default value of one means that parallel processing will not be used. The value chosen for nrotacores needs to balanced against the other processes that are using parallel processing at the same time.

asreml.option

A single character string specifying whether the grp or mbf methods are to be used to supply externally formed covariate matrices to asreml when fitting a P-spline (TPSS). Compared to the mbf method, the grp method is somewhat faster, but creates large asrtests.objects for which the time it takes to save them can exceed any gains in execution speed. The grp method adds columns to the data.frame containing the data. On the other hand, the mbf method adds only the fixed covariates to data and stores the random covariates in the environment of the internal function that calls the spline-fitting function; there are three smaller data.frames for each section that are not stored in the asreml.object resulting from the fitted model.

tpps4mbf.obj

An object made with makeTPPSplineMats.data.frame that contains the spline basis information for fitting P-splines. The argument tpps4mbf.obj only needs to be set when the mbf option of asreml.option is being used and it is desired to use mbf data.frames that have been created and stored prior to calling chooseSpatialModelOnIC.asrtests. If tpps4mbf.obj is NULL,
makeTPPSplineMats.data.frame will be called internally to produce the required mbf data.frames.

allow.unconverged

A logical indicating whether to accept a new model even when it does not converge. If FALSE and the fit of the new model does not converge, the supplied asrtests.obj is returned. Also, if FALSE and the fit of the new model has converged, but that of the old model has not, the new model will be accepted.

allow.fixedcorrelation

A logical indicating whether to accept a new model even when it contains correlations in the model whose values have been designated as fixed, bound or singular. If FALSE and the new model contains correlations whose values have not been able to be estimated, the supplied asrtests.obj is returned. The fit in the asreml.obj component of the supplied asrtests.obj will also be tested and a warning issued if both fixed correlations are found in it and allow.fixedcorrelation is FALSE.

checkboundaryonly

If TRUE then boundary and singular terms are not removed by rmboundary.asrtests; a warning is issued instead. Note that, for correlation models, the fitting of each dimension and the test for a nugget term are performed with checkboundaryonly set to TRUE and its supplied setting only honoured using a call to rmboundary.asrtests immediately prior to returning the final result of the fitting.

update

If TRUE, and set.terms is NULL, then newfit.asreml is called to fit the model to be tested, using the values of the variance parameters stored in the asreml.object, that is stored in asrtests.obj, as starting values. If FALSE or set.terms is not NULL, then newfit.asreml will not use the stored variance parameter values as starting values when fitting the new model, the only modifications being (i) to fit aptial terms and (ii) those specified via ....

which.IC

A character specifying the information criterion to be used in selecting the best model. Possible values are AIC and BIC. The value of the criterion for supplied model must exceed that for changed model for the changed model to be returned. (For choosing the rotation angle of the eigenvectors of the penalty matrix, see which.rotacriterion.

IClikelihood

A character specifying whether Restricted Maximum Likelihood (REML) or the full likelihood, evaluated using REML estimates, (full) are to be used in calculating the information criteria to be included in the test.summary of an asrtests.object or to be used in choosing the best model.

return.asrts

A character string specifying whether the asrtests.object for the best fitting model (smallest AIC or BIC) is returned or the asrtests.objects resulting from the attempted fits of all of the models specifed using trySpatial are returned.

...

Further arguments passed to changeModelOnIC.asrtests, asreml and tpsmmb.

Author

Chris Brien

Details

A fitted spatial model is only returned if it improves the fit over an above that achieved with the model fit supplied in the asrtests.obj. If return.asrts is all, then this applies to each spatial model specified by trySpatial. The model fit supplied in the asrtests.obj should not include terms that will be included in any local spatial model. All spatial model terms are fitted as fixed or random. Consequently, the residual model does not have to be iid. The improvement in the fit resulting from the addition of a spatial model to the supplied model is evaluated. Note that the data must be in the order that correponds to the residual argument with a variable to the right of another variable changes levels in the data frame faster than those of the other variable e.g. Row:Column implies that all levels for Column in consecutive rows of the data.frame with a single Row level.

For the corr spatial model, the default model is an autocorrelation model of order one (ar1) for each dimension. However, any of the single dimension correlation/variance models from asreml can be specified for each dimension, as can no correlation model for a dimension; the models for the two dimensions can differ. Using a forward selection procedure, a series of models are tried, without removing boundary or singular terms, beginning with the addition of row correlation and followed by the addition of column correlation or, if the row.corrFitfirst is set to FALSE, the reverse order. If the fitting of the first-fitted correlation did not result in a model change because the fitting did not converge or correlations were fixed, but the fit of the second correlation was successful, then adding the first correlation will be retried. If one of the metric correlation functions is specified (e.g. exp), then the row.covar or col.covar will be used in the spatial model. However, because the correlations are fitted separately for the two dimensions, the row.factor and col.factor are needed for all models and is used for a dimension that does not involve a correlation/variance function for the fit being performed. Also, the correlation models are fitted as random terms and so the correlation model will include a variance parameter for the grid even when ar1 is used to specify the correlation model, i.e. the model fitted is a variance model and there is no difference between ar1 and ar1v in fitting the model. The variance parameter for this term represents the spatial variance and the fit necessarily includes a nugget term, this being the residual variance. If any correlation is retained, the need for a nuggest term is assessed by fixing the residual variance to one, which will have no effect if heterogeneous residual variances have been specified. Once the fitting of the correlation model has been completed, the rmboundary function will be executed with the checkboundaryonly value suppied in the chooseSpatialModelOnIC.asrtests call.

The tensor-product natural-cubic-smoothing-spline (TPNCSS) spatial model is as decribed by Verbyla et al. (2018), the tensor-product penalized-cubic-spline (TPPCS) model is similar to that described by Rodriguez-Alvarez et al. (2018), and the tensor-product, first-difference-penalty, linear spline (TPP1LS) model is amongst those described by Piepho, Boer and Williams (2022). The fixed terms for the spline models are row.covar + col.covar + row.covar:col.covar and the random terms are spl(row.covar) + spl(col.covar) + dev(row.covar) + dev(col.covar) + spl(row.covar):col.covar + row.covar:spl(col.covar) + spl(row.covar):spl(col.covar), except that spl(row.covar) + spl(col.covar) is not included in TPPCS and TPP1LS models. The supplied model should not include any of these terms. However, any fixed or random main-effect Row or Column term that has been included as an initial model for comparison with a spatial model can be removed prior to fitting the spatial model using dropRowterm or dropColterm. For the penalized P-spline models with second-order differencing, the model matrices used to fit the random terms spl(row.covar):col.covar and row.covar:spl(col.covar) are transformed using the spectral decomposition of their penalty matrices, and unstructured variance models across the columns of each of them will be tried.

The TPPCS and TPP1LS models are fitted using functions from the R package TPSbits authored by Sue Welham (2022). There are two methods for supplying the spline basis information produced by tpsmmb to asreml. The grp method adds it to the data.frame supplied in the data argument of the asreml call. The mbf method creates smaller data.frames with the spline basis information in the same environment as the internal function that calls the spline-fitting function. For TPPS with second-order differencing, it is also possible to investigate the rotation of the penalty matrix eigenvectors for the random terms spl(row.covar):col.covar and row.covar:spl(col.covar) (for more information see Piepho, Boer and Williams, 2022). If it is desired to use in a later session, an asreml function, or asrtests function that calls asreml, (e.g. predict.asreml, predictPlus.asreml, or changeTerms.asrtests) on an asreml.object created using mbf terms, then the mbf data.frames will need to be recreated using makeTPPSplineMats.data.frame in the new session, supplying, if there has been rotation of the penalty matrix eigenvectors, the theta values that are returned as the attribute theta.opt of the asreml.obj.

All models utlize the function changeModelOnIC.asrtests to assess the model fit, the information critera used in assessing the fit being calculated using infoCriteria. Arguments from tpsmmb and changeModelOnIC.asrtests can be supplied in calls to chooseSpatialModelOnIC.asrtests and will be passed on to the relevant function throught the ellipses argument (...).

The data for experiment can be divided into sections and an attempt to fit the same spatial model to each is made. The fit may differ for each of the sections, but the fit over all of the sections is assessed.

Each combination of a row.coords and a col.coords does not have to specify a single observation; for example, to fit a local spatial model to the main units of a split-unit design, each combination would correspond to a main unit and all subunits of the main unit would would have the same combination.

References

Piepho, H.-P., Boer, M. P., & Williams, E. R. (2022). Two-dimensional P-spline smoothing for spatial analysis of plant breeding trials. Biometrical Journal, 64, 835-857.

Rodriguez-Alvarez, M. X., Boer, M. P., van Eeuwijk, F. A., & Eilers, P. H. C. (2018). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics, 23, 52-71.

Verbyla, A. P., De Faveri, J., Wilkie, J. D., & Lewis, T. (2018). Tensor Cubic Smoothing Splines in Designed Experiments Requiring Residual Modelling. Journal of Agricultural, Biological and Environmental Statistics, 23(4), 478-508.

Welham, S. J. (2022) TPSbits: Creates Structures to Enable Fitting and Examination of 2D Tensor-Product Splines using ASReml-R. Version 1.0.0 https://mmade.org/tpsbits/

See Also

as.asrtests, makeTPPSplineMats.data.frame, addSpatialModelOnIC.asrtests,
addSpatialModel.asrtests, changeModelOnIC.asrtests, changeTerms.asrtests,
rmboundary.asrtests, testranfix.asrtests, testresidual.asrtests, newfit.asreml,
reparamSigDevn.asrtests, changeTerms.asrtests, infoCriteria.asreml

Examples

Run this code
if (FALSE) {

data(Wheat.dat)

#Add row and column covariates
Wheat.dat <- within(Wheat.dat, 
                    {
                      cColumn <- dae::as.numfac(Column)
                      cColumn <- cColumn  - mean(unique(cColumn))
                      cRow <- dae::as.numfac(Row)
                      cRow <- cRow - mean(unique(cRow))
                    })

#Fit initial model
current.asr <- asreml(yield ~ Rep + WithinColPairs + Variety, 
                      random = ~ Row + Column,
                      data=Wheat.dat)

#Create an asrtests object, removing boundary terms
current.asrt <- as.asrtests(current.asr, NULL, NULL, 
                            label = "Random Row and Column effects")
current.asrt <- rmboundary(current.asrt)

# Choose the best of four models for the local spatial variation
current.asrt <- chooseSpatialModelOnIC(current.asrt, 
                                       row.covar = "cRow", col.covar = "cColumn",
                                       dropRowterm = "Row", dropColterm = "Column",
                                       asreml.option = "grp")
}

Run the code above in your browser using DataLab