optimACDC: Optimization of sample configurations for spatial trend identification and estimation (III)

Description

Optimize a sample configuration for spatial trend identification and estimation. A utility function U is defined so that the sample reproduces the bivariate association/correlation between the covariates, as well as their marginal distribution (ACDC). The utility function is obtained by aggregating two objective functions: CORR and DIST.

Usage

optimACDC(points, candi, covars, strata.type = "area",
  use.coords = FALSE, schedule = scheduleSPSANN(), plotit = FALSE,
  track = FALSE, boundary, progress = "txt", verbose = FALSE,
  weights, nadir = list(sim = NULL, seeds = NULL, user = NULL, abs =
  NULL), utopia = list(user = NULL, abs = NULL))
objACDC(points, candi, covars, strata.type = "area",
  use.coords = FALSE, weights, nadir = list(sim = NULL, seeds = NULL,
  user = NULL, abs = NULL), utopia = list(user = NULL, abs = NULL))

Arguments

points

Integer value, integer vector, data frame or matrix, or list.

Integer value. The number of points. These points will be randomly sampled from candi to form the starting sample configuration.
Integer vector. The row indexes of candi that correspond to the points that form the starting sample configuration. The length of the vector defines the number of points.
Data frame or matrix. An object with three columns in the following order: [, "id"], the row indexes of candi that correspond to each point, [, "x"], the projected x-coordinates, and [, "y"], the projected y-coordinates.
List. An object with two named sub-arguments: fixed, a data frame or matrix with the projected x- and y-coordinates of the existing sample configuration -- kept fixed during the optimization --, and free, an integer value defining the number of points that should be added to the existing sample configuration -- free to move during the optimization.
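
The four accepted forms of points can be sketched in base R as follows. This is only an illustration: the candidate grid and all object names (points_n, points_idx, points_df, points_list) are hypothetical and are not part of spsann.

```r
# Hypothetical candidate grid: 100 locations with projected coordinates.
candi <- expand.grid(x = seq(0, 90, by = 10), y = seq(0, 90, by = 10))

# 1) Integer value: the number of points to be sampled at random from candi.
points_n <- 10

# 2) Integer vector: row indexes of candi forming the starting configuration.
set.seed(1)
points_idx <- sample(nrow(candi), size = 10)

# 3) Data frame with the columns id, x, and y, in that order.
points_df <- data.frame(id = points_idx, candi[points_idx, c("x", "y")])

# 4) List: existing points (kept fixed) plus the number of points to add.
points_list <- list(fixed = as.matrix(candi[1:5, c("x", "y")]), free = 5)
```

Any one of these objects could then be passed as the points argument.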

candi

Data frame or matrix with the candidate locations for the jittered points. candi must have two columns in the following order: [, "x"], the projected x-coordinates, and [, "y"], the projected y-coordinates.

covars

Data frame or matrix with the covariates in the columns.

strata.type

(Optional) Character value setting the type of stratification that should be used to create the marginal sampling strata (or factor levels) for the numeric covariates. Available options are "area", for equal-area, and "range", for equal-range. Defaults to strata.type = "area".
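
The difference between the two stratification types can be illustrated with base R. This is a sketch under assumptions, not the code used internally by spsann: it assumes that equal-area strata are delimited by equally spaced quantiles and equal-range strata by equally spaced values; the covariate z and the number of strata are hypothetical.

```r
set.seed(42)
z <- rexp(1000)   # hypothetical, strongly skewed numeric covariate
n_strata <- 5     # hypothetical number of marginal strata

# "area": breaks at equally spaced quantiles, so strata hold similar counts.
breaks_area <- quantile(z, probs = seq(0, 1, length.out = n_strata + 1))

# "range": equally spaced breaks between the minimum and the maximum,
# so strata have equal widths but possibly very different counts.
breaks_range <- seq(min(z), max(z), length.out = n_strata + 1)

counts_area <- table(cut(z, breaks_area, include.lowest = TRUE))
counts_range <- table(cut(z, breaks_range, include.lowest = TRUE))
```

For a skewed covariate such as this one, counts_area is nearly uniform, whereas counts_range concentrates most values in the first strata.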

use.coords

(Optional) Logical value. Should the spatial x- and y-coordinates be used as covariates? Defaults to use.coords = FALSE.

schedule

List with 11 named sub-arguments defining the control parameters of the cooling schedule. See scheduleSPSANN.

plotit

(Optional) Logical for plotting the optimization results, including a) the progress of the objective function, and b) the starting (gray circles) and current sample configuration (black dots), and the maximum jitter in the x- and y-coordinates. The plots are updated every 10 jitters. When adding points to an existing sample configuration, fixed points are indicated using black crosses. Defaults to plotit = FALSE.

track

(Optional) Logical value. Should the evolution of the energy state be recorded and returned along with the result? If track = FALSE (the default), only the starting and ending energy states are returned along with the results.

boundary

(Optional) SpatialPolygon defining the boundary of the spatial domain. If missing and plotit = TRUE, boundary is estimated from candi.

progress

(Optional) Type of progress bar that should be used, with options "txt", for a text progress bar in the R console, "tk", to put up a Tk progress bar widget, and NULL to omit the progress bar. A Tk progress bar widget is useful when using parallel processors. Defaults to progress = "txt".

verbose

(Optional) Logical for printing messages about the progress of the optimization. Defaults to verbose = FALSE.

weights

List with named sub-arguments. The weights assigned to each one of the objective functions that form the multi-objective combinatorial optimization problem. They must be named after the respective objective function to which they apply. The weights must be equal to or larger than 0 and sum to 1.
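
The constraints on weights stated above can be checked before calling the function. The helper below, check_weights, is hypothetical and not part of spsann; it simply encodes the three stated rules (named after the objective functions, non-negative, summing to 1).

```r
# Hypothetical helper, not part of spsann: validates a weights list.
check_weights <- function(weights, objectives = c("CORR", "DIST")) {
  w <- unlist(weights)
  is.list(weights) &&
    setequal(names(weights), objectives) &&  # named after the objectives
    all(w >= 0) &&                           # equal to or larger than 0
    isTRUE(all.equal(sum(w), 1))             # sum to 1 (within tolerance)
}

ok <- check_weights(list(DIST = 1/2, CORR = 1/2))   # TRUE
bad <- check_weights(list(DIST = 0.7, CORR = 0.7))  # FALSE: sums to 1.4
```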

nadir

List with named sub-arguments. Three options are available: 1) sim -- the number of simulations that should be used to estimate the nadir point, and seeds -- vector defining the random seeds for each simulation; 2) user -- a list of user-defined nadir values named after the respective objective functions to which they apply; 3) abs -- logical for calculating the nadir point internally (experimental).

utopia

List with named sub-arguments. Two options are available: 1) user -- a list of user-defined values named after the respective objective functions to which they apply; 2) abs -- logical for calculating the utopia point internally (experimental).
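
The role of the nadir and utopia points can be illustrated with a common weighted-sum scheme in which each objective function value is rescaled by its utopia and nadir values before aggregation. This is a sketch under that assumption; all numbers are hypothetical, and the help page of minmaxPareto describes what spsann actually does.

```r
# Hypothetical raw objective function values for one sample configuration.
raw <- c(CORR = 0.30, DIST = 4.0)

# Hypothetical bounds: utopia (best) and nadir (worst) value per objective.
utopia <- c(CORR = 0, DIST = 0)
nadir <- c(CORR = 1.5, DIST = 10)

# Weights: non-negative and summing to 1.
weights <- c(CORR = 0.5, DIST = 0.5)

# Rescale each objective to a comparable range, then aggregate.
scaled <- (raw - utopia) / (nadir - utopia)
U <- sum(weights * scaled)
```

Without the rescaling step, the objective with the largest numeric range (here DIST) would dominate the aggregated value regardless of the weights.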

Value

optimACDC returns an object of class OptimizedSampleConfiguration: the optimized sample configuration with details about the optimization.

objACDC returns a numeric value: the energy state of the sample configuration -- the objective function value.

Details

The help page of minmaxPareto contains details on how spsann solves the multi-objective combinatorial optimization problem of finding a globally optimal sample configuration that meets multiple, possibly conflicting, sampling objectives.

Details about the mechanism used to generate a new sample configuration out of the current sample configuration by randomly perturbing the coordinates of a sample point are available in the help page of spJitter.

Visit the help pages of optimCORR and optimDIST to see the details of the objective functions that compose ACDC.
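
As a rough intuition for the two components, the sketch below computes a CORR-like and a DIST-like mismatch in base R. It assumes that CORR compares the correlation matrices of the population and the sample, and that DIST compares the share of sample points per marginal stratum with the expected share; these are simplifying assumptions, all data are simulated, and the authoritative definitions are in the help pages of optimCORR and optimDIST.

```r
set.seed(2001)
# Hypothetical population of two correlated covariates and a sample of 50 rows.
pop <- data.frame(a = rnorm(500))
pop$b <- 0.6 * pop$a + rnorm(500, sd = 0.8)
smp <- pop[sample(nrow(pop), 50), ]

# CORR-like term: mismatch between population and sample correlations.
corr_term <- sum(abs(cor(pop) - cor(smp)))

# DIST-like term: mismatch between the share of sample points falling in
# each equal-area stratum and the share expected from the population.
n_strata <- 5
dist_term <- sum(sapply(names(pop), function(v) {
  br <- quantile(pop[[v]], probs = seq(0, 1, length.out = n_strata + 1))
  p_smp <- table(cut(smp[[v]], br, include.lowest = TRUE)) / nrow(smp)
  sum(abs(p_smp - 1 / n_strata))
}))
```

Both terms are zero when the sample matches the population exactly and grow as the sample departs from it.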

References

Minasny, B.; McBratney, A. B. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Computers & Geosciences, v. 32, p. 1378-1388, 2006.

Minasny, B.; McBratney, A. B. Conditioned Latin Hypercube Sampling for calibrating soil sensor data to soil properties. Chapter 9. Viscarra Rossel, R. A.; McBratney, A. B.; Minasny, B. (Eds.) Proximal Soil Sensing. Amsterdam: Springer, p. 111-119, 2010.

Roudier, P.; Beaudette, D.; Hewitt, A. A conditioned Latin hypercube sampling algorithm incorporating operational constraints. 5th Global Workshop on Digital Soil Mapping. Sydney, p. 227-231, 2012.

Examples

# NOT RUN {
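# Candidate locations: x- and y-coordinates of the first 1000 grid cells
data(meuse.grid, package = "sp")
candi <- meuse.grid[1:1000, 1:2]
# Nadir point estimated from 10 simulations; user-defined utopia point
nadir <- list(sim = 10, seeds = 1:10)
utopia <- list(user = list(DIST = 0, CORR = 0))
# Covariate: fifth column of meuse.grid
covars <- meuse.grid[1:1000, 5]
# Cooling schedule of the simulated annealing
schedule <- scheduleSPSANN(
  chains = 1, initial.temperature = 5, x.max = 1540, y.max = 2060,
  x.min = 0, y.min = 0, cellsize = 40)
set.seed(2001)
# Optimize 10 points with equal weights for DIST and CORR
res <- optimACDC(
  points = 10, candi = candi, covars = covars, nadir = nadir, use.coords = TRUE,
  utopia = utopia, schedule = schedule, weights = list(DIST = 1/2, CORR = 1/2))
# Compare the stored energy state with the value recomputed by objACDC
objSPSANN(res) - objACDC(
  points = res, candi = candi, covars = covars, use.coords = TRUE, nadir = nadir,
  utopia = utopia, weights = list(DIST = 1/2, CORR = 1/2))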
# }
