nngeo (version 0.4.8)

st_nn: Nearest Neighbor Search for Simple Features

Description

Returns the indices of layer y which are nearest neighbors of each feature of layer x. The number of nearest neighbors k and the search radius maxdist can be modified.

The function has three modes of operation:

  • lon-lat points—Calculation using C code from GeographicLib, similar to sf::st_distance

  • projected points—Calculation using nabor::knn, a fast search method based on the libnabo C++ library

  • lines or polygons, either lon-lat or projected—Calculation based on sf::st_distance

Usage

st_nn(
  x,
  y,
  sparse = TRUE,
  k = 1,
  maxdist = Inf,
  returnDist = FALSE,
  progress = TRUE,
  parallel = 1
)

Value

  • If sparse=TRUE (the default), a sparse list with list element i being a numeric vector with the indices j of neighboring features from y for the feature x[i,], or an empty vector (integer(0)) in case there are no neighbors.

  • If sparse=FALSE, a logical matrix with element [i,j] being TRUE when y[j,] is a neighbor of x[i].

  • If returnDists=TRUE the function returns a list, with the first element as specified above, and the second element a sparse list with the distances (as numeric vectors, in meters) between each pair of detected neighbors corresponding to the sparse list of indices.

Arguments

x

Object of class sf or sfc

y

Object of class sf or sfc

sparse

logical; should a sparse index list be returned (TRUE, the default) or a dense logical matrix? See "Value" section below.

k

The maximum number of nearest neighbors to compute. Default is 1, meaning that only a single point (nearest neighbor) is returned.

maxdist

Search radius (in meters). Points farther than search radius are not considered. Default is Inf, meaning that search is unconstrained.

returnDist

logical; whether to return a second list with the distances between detected neighbors.

progress

Display progress bar? The default is TRUE. When using parallel>1 or when input is projected points, a progress bar is not displayed regardless of progress argument.

parallel

Number of parallel processes. The default parallel=1 implies ordinary non-parallel processing. Parallel processing is not applicable for projected points, where calculation is already highly optimized through the use of nabor::knn. Parallel processing is done with the parallel package.

References

C. F. F. Karney, GeographicLib, Version 1.49 (2017-mm-dd), https://geographiclib.sourceforge.io/1.49/

Examples

Run this code
data(cities)
data(towns)

cities = st_transform(cities, 32636)
towns = st_transform(towns, 32636)
water = st_transform(water, 32636)

# Nearest town
st_nn(cities, towns, progress = FALSE)

# Using 'sfc' objects
st_nn(st_geometry(cities), st_geometry(towns), progress = FALSE)
st_nn(cities, st_geometry(towns), progress = FALSE)
st_nn(st_geometry(cities), towns, progress = FALSE)

# With distances
st_nn(cities, towns, returnDist = TRUE, progress = FALSE)

if (FALSE) {

# Distance limit
st_nn(cities, towns, maxdist = 7200)
st_nn(cities, towns, k = 3, maxdist = 12000)
st_nn(cities, towns, k = 3, maxdist = 12000, returnDist = TRUE)

# 3 nearest towns
st_nn(cities, towns, k = 3)

# Spatial join
st_join(cities, towns, st_nn, k = 1)
st_join(cities, towns, st_nn, k = 2)
st_join(cities, towns, st_nn, k = 1, maxdist = 7200)
st_join(towns, cities, st_nn, k = 1)

# Polygons to polygons
st_nn(water, towns, k = 4)

# Large example - Geo points
n = 1000
x = data.frame(
  lon = (runif(n) * 2 - 1) * 70,
  lat = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("lon", "lat"), crs = 4326)
start = Sys.time()
result1 = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Geo points - Parallel processing
start = Sys.time()
result2 = st_nn(x, x, k = 3, parallel = 4)
end = Sys.time()
end - start
all.equal(result1, result2)

# Large example - Proj points
n = 1000
x = data.frame(
  x = (runif(n) * 2 - 1) * 70,
  y = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("x", "y"), crs = 4326)
x = st_transform(x, 32630)
start = Sys.time()
result = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Polygons
set.seed(1)
n = 150
x = data.frame(
  lon = (runif(n) * 2 - 1) * 70,
  lat = (runif(n) * 2 - 1) * 70
)
x = st_as_sf(x, coords = c("lon", "lat"), crs = 4326)
x = st_transform(x, 32630)
x = st_buffer(x, 1000000)
start = Sys.time()
result1 = st_nn(x, x, k = 3)
end = Sys.time()
end - start

# Large example - Polygons - Parallel processing
start = Sys.time()
result2 = st_nn(x, x, k = 3, parallel = 4)
end = Sys.time()
end - start
all.equal(result1, result2)

}

Run the code above in your browser using DataLab