spJitter: Random perturbation of spatial points

Description

Randomly perturb (‘jitter’) the coordinates of spatial points.

Usage

spJitter(points, candi, x.max, x.min, y.max, y.min, which.point, cellsize)

Arguments

points

Data frame or matrix with three columns in the following order: [, "id"] the row indexes of candi that correspond to each point, [, "x"] the projected x-coordinates, and [, "y"] the projected y-coordinates. Note that points must be a subset of candi.

candi

Data frame or matrix with the candidate locations for the jittered points. candi must have two columns in the following order: [, "x"] the projected x-coordinates, and [, "y"] the projected y-coordinates.

x.max, x.min, y.max, y.min

Numeric values defining the minimum and maximum quantity of random noise to be added to the projected x- and y-coordinates. The minimum quantity should be at least equal to the minimum distance between two neighbouring candidate locations. The units are the same as those of the projected x- and y-coordinates. If missing, they are estimated from candi.

which.point

Integer values defining which point should be perturbed.

cellsize

Vector with two numeric values defining the horizontal (x) and vertical (y) spacing between the candidate locations in candi. A single value can be used if the spacing in the x- and y-coordinates is the same. If cellsize = 0, then spsann assumes that a finite set of candidate locations is being used (see Details).

Value

A matrix with the jittered projected coordinates of the points.

Details

Jittering methods

There are multiple mechanisms to generate a new sample configuration out of the current sample configuration. The main step consists of randomly perturbing the coordinates of a sample point, a process known as ‘jittering’. These mechanisms can be classified based on how the set of candidate locations is defined. For example, one could use an infinite set of candidate locations, that is, any location in the sampling region can be selected as the new location of a jittered point. All that is needed is a polygon indicating the boundary of the sampling region. This method is the most computationally demanding because every time a point is jittered, it is necessary to check whether the point falls within the sampling region.
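The infinite-set approach can be sketched in base R as follows. This is only an illustration under stated assumptions, not code from spsann: the helpers `in_polygon` and `jitter_infinite`, the square boundary, and the `max_shift` and `max_tries` parameters are all hypothetical. It shows why the method is costly: every proposed location requires a point-in-polygon test.

```r
# Hypothetical helper: ray-casting test of whether (x, y) lies inside the
# polygon with vertices (px, py). Not part of spsann.
in_polygon <- function(x, y, px, py) {
  n <- length(px)
  j <- c(n, seq_len(n - 1))  # index of the previous vertex for each edge
  crosses <- ((py > y) != (py[j] > y)) &
    (x < (px[j] - px) * (y - py) / (py[j] - py) + px)
  sum(crosses) %% 2 == 1
}

# Hypothetical helper: shift a point by uniform noise and keep the first
# proposal that falls inside the boundary polygon.
jitter_infinite <- function(xy, max_shift, px, py, max_tries = 100) {
  for (k in seq_len(max_tries)) {
    new_xy <- xy + runif(2, -max_shift, max_shift)
    if (in_polygon(new_xy[1], new_xy[2], px, py)) return(new_xy)
  }
  xy  # no valid proposal found: keep the original location
}

set.seed(1)
px <- c(0, 100, 100, 0)  # assumed square sampling region
py <- c(0, 0, 100, 100)
new_xy <- jitter_infinite(c(95, 95), max_shift = 20, px = px, py = py)
```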

Another approach consists of using a finite set of candidate locations for the jittered points. A finite set of candidate locations is created by discretising the sampling region, that is, creating a fine grid of points that serve as candidate locations for the jittered point. This is the least computationally demanding jittering method because, by definition, the jittered point will always fall in the sampling region.

Using a finite set of candidate locations has two important inconveniences. First, not all locations in the sampling region can be selected as the new location for a jittered point. Second, when a point is jittered, the new location may already be occupied by another point. If this happens, another location has to be sought iteratively, say, as many times as there are points in the sample. In general, the more points there are in the sample, the more likely it is that the new location is already occupied by another point. If a solution is not found in a reasonable time, the point selected to be jittered is kept in its original location. Such a procedure is clearly suboptimal.
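The retry procedure for a plain finite set can be sketched as below. This is a minimal illustration, not the spsann implementation: the function `jitter_finite` and its arguments are hypothetical, and points are represented only by the row indexes of their candidate locations.

```r
# Hypothetical helper: move point `which_point` to a candidate location that
# is not yet occupied, trying at most `max_tries` times. Not part of spsann.
jitter_finite <- function(ids, n_candi, which_point,
                          max_tries = length(ids)) {
  for (k in seq_len(max_tries)) {
    new_id <- sample.int(n_candi, 1)  # propose a candidate location
    if (!(new_id %in% ids)) {         # accept only if it is free
      ids[which_point] <- new_id
      return(ids)
    }
  }
  ids  # no free location found: the point stays where it was
}

set.seed(42)
ids <- c(3, 7, 12, 25)  # candidate rows currently occupied by the sample
new_ids <- jitter_finite(ids, n_candi = 30, which_point = 2)
```

The loop makes the cost visible: as the sample fills a larger share of the candidate set, more proposals are rejected before a free location is found.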

spsann uses a more elegant method which is based on using a finite set of candidate locations coupled with a form of two-stage random sampling as implemented in [spsample](https://CRAN.R-project.org/package=spcosa). Because the candidate locations are placed on a finite regular grid, they can be seen as the centre nodes of a finite set of grid cells (or pixels of a raster image). In the first stage, one of the “grid cells” is selected with replacement, i.e. regardless of whether it is already occupied by another sample point. In the second stage, the new location for the point chosen to be jittered is selected within that “grid cell” by simple random sampling. This method guarantees that virtually any location in the sampling region can be selected. It also removes the need to check whether the new location is already occupied by another point, speeding up the computations when compared to the first two approaches.
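The two stages can be sketched in base R as follows. This is an illustration only, assuming a regular grid whose candidate locations are cell centres. The helper `jitter_two_stage` and the toy grid are hypothetical, and the sketch ignores the neighbourhood limits that spJitter applies through x.min, x.max, y.min and y.max.

```r
# Hypothetical helper: two-stage random sampling of a new location.
# Not part of spsann.
jitter_two_stage <- function(candi, cellsize) {
  cellsize <- rep(cellsize, length.out = 2)  # allow a single value
  # Stage 1: select one grid cell, with replacement (occupancy is ignored)
  i <- sample.int(nrow(candi), 1)
  # Stage 2: simple random sampling within the selected cell
  x <- candi[i, 1] + runif(1, -0.5, 0.5) * cellsize[1]
  y <- candi[i, 2] + runif(1, -0.5, 0.5) * cellsize[2]
  c(x = unname(x), y = unname(y))
}

set.seed(1)
# Assumed toy grid: 5 x 5 cell centres spaced 40 units apart
candi <- as.matrix(expand.grid(x = seq(20, 180, by = 40),
                               y = seq(20, 180, by = 40)))
new_xy <- jitter_two_stage(candi, cellsize = 40)
```

Because the second stage draws a continuous offset inside the cell, two points landing in the same cell almost never share exactly the same coordinates, which is why no occupancy check is needed.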

References

Edzer Pebesma, Jon Skoien with contributions from Olivier Baume, A. Chorti, D.T. Hristopulos, S.J. Melles and G. Spiliopoulos (2013). intamapInteractive: procedures for automated interpolation - methods only to be used interactively, not included in intamap package. R package version 1.1-10.

van Groenigen, J.-W. Constrained optimization of spatial sampling: a geostatistical approach. Wageningen: Wageningen University, p. 148, 1999.

Walvoort, D. J. J.; Brus, D. J.; de Gruijter, J. J. An R package for spatial coverage sampling and random sampling from compact geographical strata by k-means. Computers & Geosciences. v. 36, p. 1261-1267, 2010.

Examples

# NOT RUN {
require(sp)
data(meuse.grid)
meuse.grid <- as.matrix(meuse.grid[, 1:2])
meuse.grid <- matrix(cbind(1:dim(meuse.grid)[1], meuse.grid), ncol = 3)
pts1 <- sample(c(1:dim(meuse.grid)[1]), 155)
pts2 <- meuse.grid[pts1, ]
pts3 <- spJitter(points = pts2, candi = meuse.grid, x.min = 40,
                 x.max = 100, y.min = 40, y.max = 100,
                 which.point = 10, cellsize = 40)
plot(meuse.grid[, 2:3], asp = 1, pch = 15, col = "gray")
points(pts2[, 2:3], col = "red", cex = 0.5)
points(pts3[, 2:3], pch = 19, col = "blue", cex = 0.5)

# Cluster of points
pts1 <- c(1:55)
pts2 <- meuse.grid[pts1, ]
pts3 <- spJitter(points = pts2, candi = meuse.grid, x.min = 40,
                x.max = 80, y.min = 40, y.max = 80,
                which.point = 1, cellsize = 40)
plot(meuse.grid[, 2:3], asp = 1, pch = 15, col = "gray")
points(pts2[, 2:3], col = "red", cex = 0.5)
points(pts3[, 2:3], pch = 19, col = "blue", cex = 0.5)
# }
