Pair-wise distance sampling
Select pairs of points from two sets (without replacement) that have a similar distance to their nearest point in another set of points.
For each point in "fixed", a point is selected from "sample" that has a similar distance (as defined by the threshold tr) to its nearest point in "reference" (note that these are likely to be different points in reference). The selected point is either the nearest point (nearest=TRUE) or a randomly selected point (nearest=FALSE) that is within the threshold distance. If no point within the threshold distance is found in sample, the point in fixed is dropped.
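The procedure described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes planar coordinates, processes the points in fixed in row order, and uses two hypothetical helpers (nearest_dist and pair_sketch) that exist only for this example. The validity rule is the one given for the tr argument.

```r
# Hypothetical helper: planar distance from each row of `a`
# to its nearest row in `b`.
nearest_dist <- function(a, b) {
  apply(a, 1, function(p) min(sqrt((b[, 1] - p[1])^2 + (b[, 2] - p[2])^2)))
}

# Hypothetical sketch of the pairing rule (planar coordinates, one pair each).
pair_sketch <- function(fixed, sample, reference, tr = 0.33, nearest = TRUE) {
  dfr <- nearest_dist(fixed, reference)    # fixed  -> nearest reference point
  dsr <- nearest_dist(sample, reference)   # sample -> nearest reference point
  used <- rep(FALSE, nrow(sample))         # sampling is without replacement
  out <- rep(NA_integer_, nrow(fixed))
  for (i in seq_len(nrow(fixed))) {
    gap <- abs(dfr[i] - dsr)
    ok <- which(!used & gap < tr * dfr[i]) # candidates within the threshold
    if (length(ok) == 0) next              # no valid partner: stays NA
    j <- if (nearest) ok[which.min(gap[ok])] else ok[sample.int(length(ok), 1)]
    out[i] <- j
    used[j] <- TRUE
  }
  out
}

# Made-up coordinates, for illustration only
rf <- matrix(c(0, 0), ncol = 2)
fx <- matrix(c(10, 0,  50, 0,  200, 0), ncol = 2, byrow = TRUE)
sm <- matrix(c(0, 11,  0, 100,  0, 45), ncol = 2, byrow = TRUE)
res <- pair_sketch(fx, sm, rf)
res
# 1 3 NA
```

The third point in fx is 200 units from the reference point, and no point in sm has a distance within 0.33 * 200 of that, so it gets NA.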
Hijmans (2012) proposed this sampling approach to remove 'spatial sorting bias' (ssb) from evaluation data used in cross-validation of presence-only species distribution models. In that context, fixed are the testing-presence points, sample the testing-absence (or testing-background) points, and reference the training-presence points.
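To make the roles of the three point sets concrete, the sketch below compares how far testing-presence and testing-absence points lie from the training-presence points. It assumes, as a reading of the approach rather than something stated on this page, that sorting bias can be summarized by the ratio of the two mean nearest-neighbour distances, with values near 1 suggesting little bias and values near 0 suggesting strong bias. The helper nearest_dist and all coordinates are hypothetical, and distances are planar.

```r
# Hypothetical helper: planar distance from each row of `a`
# to its nearest row in `b`.
nearest_dist <- function(a, b) {
  apply(a, 1, function(p) min(sqrt((b[, 1] - p[1])^2 + (b[, 2] - p[2])^2)))
}

# Made-up coordinates, for illustration only
train_pres <- matrix(c(0, 0), ncol = 2)                         # reference
test_pres  <- matrix(c(1, 0,  0, 2), ncol = 2, byrow = TRUE)    # fixed
test_abs   <- matrix(c(10, 0, 0, 20), ncol = 2, byrow = TRUE)   # sample

d_pres <- mean(nearest_dist(test_pres, train_pres))  # 1.5
d_abs  <- mean(nearest_dist(test_abs, train_pres))   # 15
bias_ratio <- d_pres / d_abs                         # 0.1
```

Pairing each testing-presence point with a testing-absence point at a similar distance from the training data is meant to bring the two mean distances closer together.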
pwdSample(fixed, sample, reference, tr=0.33, nearest=TRUE, n=1, lonlat=TRUE, warn=TRUE)
- fixed: two-column matrix (x, y) or (longitude, latitude), or a SpatialPoints object, with the point locations for which a pair should be found in sample
- sample: as above, for the point locations from which to sample to make a pair with a point from fixed
- reference: as above, for the reference point locations to which distances are computed
- n: How many pairs do you want for each point in fixed
- tr: Numeric, normally below 1. The threshold for a pair of points (one from fixed and one from sample) to be considered a valid pair, based on the distances to their respective nearest points in reference. The absolute difference between the distance from the fixed point to its nearest reference point (dfr) and the distance from the sample point to its nearest reference point (dsr) must be smaller than tr * dfr. For example, if dfr = 100 km and tr = 0.1, dsr must be between >90 and <110 km
- nearest: Logical. If TRUE, the pair with the smallest difference in distance to their nearest reference point is selected. If FALSE, a random point from the valid pairs (with a difference in distance below the threshold defined by tr) is selected (generally leading to higher ssb)
- lonlat: Logical. Use TRUE if the coordinates are spherical (in degrees), and use FALSE if they are planar
- warn: Logical. If TRUE, a warning is given if nrow(fixed) < nrow(sample)
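The threshold rule for tr can be checked with a few lines of arithmetic. The sketch below reuses the numbers from the tr description (dfr = 100 km, tr = 0.1); the candidate dsr values are hypothetical and chosen to include both boundaries.

```r
dfr <- 100                            # km, fixed point to its nearest reference point
tr  <- 0.1
dsr <- c(85, 90, 95, 109, 110, 115)   # km, hypothetical candidates from sample

gap   <- abs(dfr - dsr)
valid <- gap < tr * dfr               # strict inequality: 90 and 110 are excluded
valid
# FALSE FALSE  TRUE  TRUE FALSE FALSE

# With nearest=TRUE the candidate with the smallest gap is preferred
best <- which(valid)[which.min(gap[valid])]
dsr[best]
# 95
```

Because the threshold scales with dfr, points in fixed that lie far from the reference points tolerate a larger absolute difference than points that lie close to them.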
A matrix with nrow(fixed) rows and n columns that indicates, for each point (row) in fixed, which point(s) in sample it is paired to (as row numbers of sample), or NA if no suitable pair was available.
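When n is larger than 1 the result has several columns, and it can help to flatten it into one row per pair. The sketch below does this in base R. The index matrix i is made up for illustration (it is not actual pwdSample output), and the column names fixed_row and sample_row are arbitrary.

```r
# Hypothetical result for 3 points in fixed and n = 2:
# row 1 has two partners, row 2 has none, row 3 has one
i <- matrix(c(3, NA, 7,
              5, NA, NA), ncol = 2)

keep  <- !is.na(i)
pairs <- data.frame(fixed_row  = row(i)[keep],  # row of fixed for each valid pair
                    sample_row = i[keep])       # matching row of sample
pairs
#   fixed_row sample_row
# 1         1          3
# 2         3          7
# 3         1          5
```

The two columns can then be used to index the coordinate matrices, for example fixed[pairs$fixed_row, ] and sample[pairs$sample_row, ].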
Hijmans, R.J., 2012. Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null-model. Ecology 93: 679-688
library(dismo)

# reference and fixed point locations (longitude, latitude)
ref <- matrix(c(-54.5,-38.5, 2.5, -9.5, -45.5, 1.5, 9.5, 4.5, -10.5, -10.5), ncol=2)
fix <- matrix(c(-56.5, -30.5, -6.5, 14.5, -25.5, -48.5, 14.5, -2.5, 14.5,
                -11.5, -17.5, -11.5), ncol=2)

# raster used only to draw random candidate points
r <- raster()
extent(r) <- c(-110, 110, -45, 45)
r[] <- 1
set.seed(0)
sam <- randomPoints(r, n=50)

par(mfrow=c(1,2))
plot(sam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(fix, col='blue', pch=20, cex=2)

# one pair per point in 'fixed'
i <- pwdSample(fix, sam, ref, lonlat=TRUE)
i
# keep only the points for which a pair was found
sfix <- fix[!is.na(i), ]
ssam <- sam[i[!is.na(i)], ]
ssam

plot(sam, pch='x', cex=0)
points(ssam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(sfix, col='blue', pch=20, cex=2)

# try to get 3 pairs for each point in 'fixed'
pwdSample(fix, sam, ref, lonlat=TRUE, n=3)