lgnpp: Unbiased Back-Transformations for Log-normal Kriging

Description

The function lgnpp back-transforms point or block Kriging predictions of a log-transformed response variable computed by predict.georob. Alternatively, the function averages log-normal point Kriging predictions for a block and approximates the mean squared prediction error of the block mean.

Usage

lgnpp(object, newdata, locations, is.block = FALSE, all.pred = NULL, 
    extended.output = FALSE)

Arguments

object

an object with Kriging predictions of a log-transformed response variable as obtained by predict(georob-object, ...).

newdata

an optional object as passed as argument newdata to predict.georob, see Details.

locations

an optional one-sided formula specifying what variables of newdata are the coordinates of the prediction points, see predict.georob.

is.block

an optional logical (default FALSE) specifying whether point predictions contained in object are considered to belong to a single block and should be averaged after back-transformation. Ignored if object contains block Kriging predictions, see Details.

all.pred

an optional positive integer or an object as obtained by lgnpp(predict(georob-object, ...)), see Details.

extended.output

logical controlling whether the covariance matrix of the errors of the back-transformed point predictions is added as an attribute to the result, see Details.

Value

If is.block is FALSE and all.pred is equal to NULL an updated object of the same class as object (see section Value of predict.georob). The data frame with the point or block Kriging predictions is complemented by lgnpp with the following new components:

lgn.pred: the back-transformed Kriging predictions of a log-transformed response.
lgn.se: the standard errors of the back-transformed predictions.
lgn.lower, lgn.upper: the bounds of the back-transformed prediction intervals.

If is.block is TRUE or all.pred not equal to NULL a named numeric vector with two elements:

mean: the back-transformed block Kriging estimate, see Details.
se: the (approximated) block Kriging standard error, see Details.

If extended.output is TRUE then the vector is supplemented with the attribute mse.lgn.pred that contains the full covariance matrix of the back-transformed point prediction errors.

Details

The function lgnpp performs three tasks:

1. Back-transformation of point Kriging predictions of a log-transformed response

The usual, marginally unbiased back-transformation for log-normal point Kriging is used:

$$\widehat{U}(\mbox{\boldmath$s$\unboldmath}) = \exp( \widehat{Z}(\mbox{\boldmath$s$\unboldmath}) + 1/2 ( \mathrm{Var}_{\hat{\theta}}[ Z(\mbox{\boldmath$s$\unboldmath})] - \mathrm{Var}_{\hat{\theta}}[\widehat{Z}(\mbox{\boldmath$s$\unboldmath})])),$$

$$\mathrm{Cov}_{\hat{\theta}}[ U(\mbox{\boldmath$s$\unboldmath}_i) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i), U(\mbox{\boldmath$s$\unboldmath}_j) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_j) ] = \mu_{\hat{\theta}}(\mbox{\boldmath$s$\unboldmath}_i) \mu_{\hat{\theta}}(\mbox{\boldmath$s$\unboldmath}_j) $$ $$ \times \{ \exp(\mathrm{Cov}_{\hat{\theta}}[Z(\mbox{\boldmath$s$\unboldmath}_i),Z(\mbox{\boldmath$s$\unboldmath}_j)]) -2\exp(\mathrm{Cov}_{\hat{\theta}}[\widehat{Z}(\mbox{\boldmath$s$\unboldmath}_i),Z(\mbox{\boldmath$s$\unboldmath}_j)]) +\exp(\mathrm{Cov}_{\hat{\theta}}[\widehat{Z}(\mbox{\boldmath$s$\unboldmath}_i),\widehat{Z}(\mbox{\boldmath$s$\unboldmath}_j)]) \}, $$

where $\widehat{Z}$ and $\widehat{U}$ denote the log- and back-transformed predictions of the signal, and

$$\mu_{\hat{\theta}}(\mbox{\boldmath$s$\unboldmath}) \approx \exp(\mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath})\mathrm{^T}\widehat{\mbox{\boldmath$\beta$\unboldmath}} + 1/2 \mathrm{Var}_{\hat{\theta}}[Z(\mbox{\boldmath$s$\unboldmath})]). $$

The expressions for the required covariance terms can be found in the Appendices of Nussbaum et al. (2012). Instead of the signal $Z(\mbox{\boldmath$s$\unboldmath})$, predictions of the log-transformed response $Y(\mbox{\boldmath$s$\unboldmath})$ or the estimated trend $\mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath})^\mathrm{T}\widehat{\mbox{\boldmath$\beta$\unboldmath}}$ of the log-transformed data can be back-transformed (see georobIntro). The above transformations are used if object contains point Kriging predictions (see predict.georob, Value) and if is.block = FALSE and all.pred is missing.

2. Back-transformation of block Kriging predictions of a log-transformed response

Block Kriging predictions of a log-transformed response variable are back-transformed by the approximately unbiased transformation proposed by Cressie (2006, Appendix C)

$$\widehat{U}(A) = \exp( \widehat{Z}(A) + 1/2 \{ \mathrm{Var}_{\hat{\theta}}[Z(\mbox{\boldmath$s$\unboldmath})] + \widehat{\mbox{\boldmath$\beta$\unboldmath}}\mathrm{^T} \mbox{\boldmath$M$\unboldmath}(A) \widehat{\mbox{\boldmath$\beta$\unboldmath}} - \mathrm{Var}_{\hat{\theta}}[\widehat{Z}(A)] \}), $$

$$\mathrm{E}_{\hat{\theta}}[\{U(A) - \widehat{U}(A))^2] = \mu_{\hat{\theta}}(A)^2 \{ \exp(\mathrm{Var}_{\hat{\theta}}[Z(A)]) - 2 \exp(\mathrm{Cov}_{\hat{\theta}}[\widehat{Z}(A),Z(A)]) + \exp(\mathrm{Var}_{\hat{\theta}}[\widehat{Z}(A)]) \} $$

where $\widehat{Z}(A)$ and $\widehat{U}(A)$ are the log- and back-transformed predictions of the block mean $U(A)$, respectively, $\mbox{\boldmath$M$\unboldmath}(A)$ is the spatial covariance matrix of the covariates

$$ \mbox{\boldmath$M$\unboldmath}(A) = 1/|A| \int_A ( \mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath}) - \mbox{\boldmath$x$\unboldmath}(A) ) ( \mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath}) - \mbox{\boldmath$x$\unboldmath}(A) )\mathrm{^T} \,d\mbox{\boldmath$s$\unboldmath} $$

within the block $A$ where

$$ \mbox{\boldmath$x$\unboldmath}(A) = 1/|A| \int_A \mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath}) \,d\mbox{\boldmath$s$\unboldmath} $$

and

$$ \mu_{\hat{\theta}}(A) \approx \exp(\mbox{\boldmath$x$\unboldmath}(A)\mathrm{^T} \widehat{\mbox{\boldmath$\beta$\unboldmath}} + 1/2 \mathrm{Var}_{\hat{\theta}}[Z(A)]). $$

This back-transformation is based on the assumption that both the point data $U(\mbox{\boldmath$s$\unboldmath})$ and the block means $U(A)$ follow log-normal laws, which strictly cannot hold. But for small blocks the assumption works well as the bias and the loss of efficiency caused by this assumption are small (Cressie, 2006; Hofer et al., 2013).

The above formulae are used by lgnpp if object contains block Kriging predictions in the form of a SpatialPolygonsDataFrame. To approximate $\mbox{\boldmath$M$\unboldmath}(A)$, one needs the covariates on a fine grid for the whole study domain in which the blocks lie. The covariates are passed lgnpp as argument newdata, where newdata can be any spatial data frame accepted by predict.georob. For evaluating $\mbox{\boldmath$M$\unboldmath}(A)$ the geometry of the blocks is taken from the polygons slot of the SpatialPolygonsDataFrame passed as object to lgnpp.

3. Back-transformation and averaging of point Kriging predictions of a log-transformed response

lgnpp allows as a further option to back-transform and average point Kriging predictions passed as object to the function. One then assumes that the predictions in object refer to points that lie in a single block. Hence, one uses the approximation

$$\widehat{U}(A) \approx \frac{1}{K} \sum_{s_i \in A} \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i) $$

to predict the block mean $U(A)$, where $K$ is the number of points in $A$. The mean squared prediction error can be approximated by

$$\mathrm{E}_{\hat{\theta}}[\{U(A) - \widehat{U}(A)\}^2] \approx \frac{1}{K^2} \sum_{s_i \in A} \sum_{s_j \in A} \mathrm{Cov}_{\hat{\theta}}[ U(\mbox{\boldmath$s$\unboldmath}_i) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i), U(\mbox{\boldmath$s$\unboldmath}_j) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_j) ]. $$

In most instances, the evaluation of the above double sum is not feasible because a large number of points is used to discretize the block $A$. lgnpp then uses the following approximations to compute the mean squared error (see also Appendix E of Nussbaum et al., 2012):

Point prediction results are passed as object to lgnpp only for a random sample of points in $A$ (of size $k$), for which the evaluation of the above double sum is feasible.
The prediction results for the complete set of points within the block are passed as argument all.pred to lgnpp. These results are used to compute $\widehat{U}(A)$.
The mean squared error is then approximated by

$$ \mathrm{E}_{\hat{\theta}}[\{U(A) - \widehat{U}(A)\}^2] \approx \frac{1}{K^2} \sum_{s_i \in A} \mathrm{E}_{\hat{\theta}}[ \{ U(\mbox{\boldmath$s$\unboldmath}_i) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i)\}^2] $$

$$+ \frac{K-1}{K k (k-1)} \sum_{s_i \in \mathrm{sample}}\sum_{s_j \in \mathrm{sample}, s_j \neq s_i} \mathrm{Cov}_{\hat{\theta}}[ U(\mbox{\boldmath$s$\unboldmath}_i) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i), U(\mbox{\boldmath$s$\unboldmath}_j) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_j) ]. $$

The first term of the RHS (and $\widehat{U}(A)$) can be computed from the point Kriging results contained in all.pred, and the double sum is evaluated from the full covariance matrices of the predictions and the respective targets, passed to lgnpp as object (one has to use the arguments control=control.predict.georob(full.covmat=TRUE) for predict.georob when computing the point Kriging predictions stored in object).
If the prediction results are not available for the complete set of points in $A$ then all.pred may be equal to $K$. The block mean is then approximated by

$$\widehat{U}(A) \approx \frac{1}{k} \sum_{s_i \in \mathrm{sample}} \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i) $$

and the first term of the RHS of the expression for the mean squared error by

$$ \frac{1}{kK} \sum_{s_i \in \mathrm{sample}} \mathrm{E}_{\hat{\theta}}[ \{ U(\mbox{\boldmath$s$\unboldmath}_i) - \widehat{U}(\mbox{\boldmath$s$\unboldmath}_i)\}^2]. $$
By drawing samples repeatedly and passing the related Kriging results as object to lgnpp, one can reduce the error of the approximation of the mean squared error.

References

Cressie, N. (2006) Block Kriging for Lognormal Spatial Processes. Mathematical Geology, 38, 413--443.

Hofer, C., Borer, F., Bono, R., Kayser, A. and Papritz, A. 2013. Predicting topsoil heavy metal content of parcels of land: An empirical validation of customary and constrained lognormal block Kriging and conditional simulations. Geoderma, 193--194, 200--212.

Nussbaum, M., Papritz, A., Baltensweiler, A. and Walthert, L. (2012) Organic Carbon Stocks of Swiss Forest Soils, Institute of Terrestrial Ecosystems, ETH Zurich and Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), pp. 51. http://dx.doi.org/10.3929/ethz-a-007555133

Examples

Run this code

# NOT RUN {
data(meuse)

data(meuse.grid)
coordinates(meuse.grid) <- ~x+y
meuse.grid.pixdf <- meuse.grid
gridded(meuse.grid.pixdf) <- TRUE

library(constrainedKriging)
data(meuse.blocks)

r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
    variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200),
    tuning.psi = 1., control = control.georob(cov.bhat = TRUE, full.cov.bhat = TRUE))
        
## point predictions of log(Zn)
r.pred.points <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, 
    control = control.predict.georob(extended.output = TRUE, full.covmat = TRUE))
str(r.pred.points$pred@data)

## back-transformation of point predictions
r.backtf.pred.points <- lgnpp(r.pred.points)
str(r.backtf.pred.points$pred@data)

spplot(r.backtf.pred.points[["pred"]], zcol = "lgn.pred", main = "Zn content")

## predicting mean Zn content for whole area
r.block <- lgnpp(r.pred.points, is.block = TRUE, all.pred = r.backtf.pred.points[["pred"]])
r.block

## block predictions of log(Zn)
r.pred.block <- predict(r.logzn.rob, newdata = meuse.blocks, 
    control = control.predict.georob(extended.output = TRUE,
        pwidth = 75, pheight = 75))
r.backtf.pred.block <- lgnpp(r.pred.block, newdata = meuse.grid)

spplot(r.backtf.pred.block, zcol = "lgn.pred", main = "block means Zn content")
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab