# pwdSample

##### Pair-wise distance sampling

Select pairs of points from two sets (without replacement) that have a similar distance to their nearest point in another set of points.

For each point in `fixed`, a point is selected from `sample` that has a similar distance (as defined by the threshold `tr`) to its nearest point in `reference` (note that these are likely to be different points in `reference`). The selected point is either the best-matching point (`nearest=TRUE`), that is, the one with the smallest difference in distance, or a randomly selected point (`nearest=FALSE`) that is within the threshold distance. If no point within the threshold distance is found in `sample`, the point in `fixed` is dropped.
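
The selection procedure described above can be sketched in base R. This is a simplified illustration, not the dismo implementation: the helper names `nearest_dist` and `pwd_sketch` are hypothetical, distances are planar (Euclidean) only, a single pair is sought per point, and points in `fixed` are visited in row order. The real function may differ in these details.

```r
# Hypothetical helper (not part of dismo): distance from each row of `a`
# to its nearest row of `b`, using planar Euclidean distance.
nearest_dist <- function(a, b) {
  apply(a, 1, function(p) min(sqrt((b[, 1] - p[1])^2 + (b[, 2] - p[2])^2)))
}

# Hypothetical sketch of the pairing logic for one pair per fixed point.
pwd_sketch <- function(fixed, sample, reference, tr = 0.33, nearest = TRUE) {
  dfr <- nearest_dist(fixed, reference)   # fixed  -> nearest reference point
  dsr <- nearest_dist(sample, reference)  # sample -> nearest reference point
  pairs <- rep(NA_integer_, nrow(fixed))
  for (i in seq_len(nrow(fixed))) {
    d <- abs(dsr - dfr[i])                # difference in distance
    valid <- which(d < tr * dfr[i])       # candidates within the threshold
    if (length(valid) == 0) next          # no candidate: leave NA
    if (nearest) {
      pick <- valid[which.min(d[valid])]  # smallest difference
    } else {
      pick <- valid[sample.int(length(valid), 1)]  # random valid candidate
    }
    pairs[i] <- pick
    dsr[pick] <- Inf                      # sampling without replacement
  }
  pairs
}

# Toy data: one reference point at the origin
toy_ref <- matrix(c(0, 0), ncol = 2)
toy_fix <- matrix(c(10, 0,  0, 20), ncol = 2, byrow = TRUE)         # dfr = 10, 20
toy_sam <- matrix(c(0, 11,  19, 0,  50, 0), ncol = 2, byrow = TRUE) # dsr = 11, 19, 50
toy_pairs <- pwd_sketch(toy_fix, toy_sam, toy_ref)
toy_pairs
# 1 2
```

Setting a used sample point's distance to `Inf` is one simple way to guarantee that it cannot be matched again.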

Hijmans (2012) proposed this sampling approach to remove 'spatial sorting bias' (`ssb`) from evaluation data used in cross-validation of presence-only species distribution models. In that context, `fixed` are the testing-presence points, `sample` the testing-absence (or testing-background) points, and `reference` the training-presence points.

- Keywords
- spatial

##### Usage

`pwdSample(fixed, sample, reference, tr=0.33, nearest=TRUE, n=1, lonlat=TRUE, warn=TRUE)`

##### Arguments

- fixed
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations for which a pair should be found in `sample`

- sample
as above for point locations from which to sample to make a pair with a point from `fixed`

- reference
as above for reference point locations to which distances are computed

- n
Integer. The number of pairs to find for each point in `fixed`

- tr
Numeric, normally below 1. The threshold distance for a pair of points (one of `fixed` and one of `sample`) to their respective nearest points in `reference` to be considered a valid pair. The absolute difference between the distance from the candidate point in `fixed` to its nearest point in `reference` (dfr) and the distance from the candidate point in `sample` to its nearest point in `reference` (dsr) must be smaller than `tr` * dfr. For example, if dfr = 100 km and tr = 0.1, dsr must be more than 90 km and less than 110 km for the pair to be considered valid.

- nearest
Logical. If `TRUE`, the pair with the smallest difference in distance to their nearest `reference` point is selected. If `FALSE`, a random point from the valid pairs (with a difference in distance below the threshold defined by `tr`) is selected (generally leading to higher `ssb`)

- lonlat
Logical. Use `TRUE` if the coordinates are spherical (in degrees), and use `FALSE` if they are planar

- warn
Logical. If `TRUE` a warning is given if `nrow(fixed) < nrow(sample)`
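
The threshold rule for `tr` can be checked with a few lines of arithmetic. The candidate distances below are made-up values for illustration only; `dfr` and `dsr` follow the definitions given for `tr`.

```r
dfr <- 100                      # km, fixed point to its nearest reference point
tr  <- 0.1
dsr <- c(85, 95, 105, 115)      # km, hypothetical sample-to-reference distances

# A candidate is valid when |dsr - dfr| < tr * dfr, here when it lies
# strictly between 90 and 110 km.
valid <- abs(dsr - dfr) < tr * dfr
dsr[valid]
# 95 105
```

Because the tolerance is `tr * dfr`, it scales with distance: points far from `reference` accept a wider absolute band than points close to it.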

##### Value

A matrix with `nrow(fixed)` rows and `n` columns that indicates, for each point (row) in `fixed`, which point(s) in `sample` it is paired to; or `NA` if no suitable pair was available.
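
A short sketch of how such an index matrix might be used to extract the paired coordinates. The matrix `idx` here is hand-made to have the shape described above (one column, as for `n=1`); it is not actual `pwdSample` output, and all object names are hypothetical.

```r
fixed_pts  <- matrix(c(1, 1,  2, 2,  3, 3), ncol = 2, byrow = TRUE)
sample_pts <- matrix(c(10, 10,  20, 20,  30, 30,  40, 40), ncol = 2, byrow = TRUE)

# Hand-made index matrix: fixed point 1 pairs with sample point 4,
# fixed point 2 has no pair, fixed point 3 pairs with sample point 2.
idx <- matrix(c(4L, NA, 2L), ncol = 1)

keep <- !is.na(idx[, 1])                                   # rows with a pair
paired_fixed  <- fixed_pts[keep, , drop = FALSE]           # rows 1 and 3
paired_sample <- sample_pts[idx[keep, 1], , drop = FALSE]  # rows 4 and 2
```

Using `drop = FALSE` keeps the results as matrices even when only one pair is found.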

##### References

Hijmans, R.J., 2012. Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null-model. Ecology 93: 679-688

##### Examples

```
library(dismo)  # also loads raster, which provides raster() and extent()

# reference (e.g. training-presence) and fixed (e.g. testing-presence) points
ref <- matrix(c(-54.5, -38.5, 2.5, -9.5, -45.5, 1.5, 9.5, 4.5, -10.5, -10.5), ncol=2)
fix <- matrix(c(-56.5, -30.5, -6.5, 14.5, -25.5, -48.5, 14.5, -2.5, 14.5,
                -11.5, -17.5, -11.5), ncol=2)

# 50 random candidate points to sample from
r <- raster()
extent(r) <- c(-110, 110, -45, 45)
r[] <- 1
set.seed(0)
sam <- randomPoints(r, n=50)

# left panel: all points
par(mfrow=c(1,2))
plot(sam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(fix, col='blue', pch=20, cex=2)

# one pair per point in 'fix'; NA where no valid pair was found
i <- pwdSample(fix, sam, ref, lonlat=TRUE)
i
sfix <- fix[!is.na(i), ]
ssam <- sam[i[!is.na(i)], ]
ssam

# right panel: only the paired points
plot(sam, pch='x', cex=0)
points(ssam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(sfix, col='blue', pch=20, cex=2)

# try to get 3 pairs for each point in 'fixed'
pwdSample(fix, sam, ref, lonlat=TRUE, n=3)
```

*Documentation reproduced from package dismo, version 1.1-4, License: GPL (>= 3)*