RSAGA (version 0.94-5)

pick.from.points: Pick Variable from Spatial Dataset

Description

These functions pick (i.e. interpolate without worrying too much about theory) values of a spatial variable from data stored in a data.frame, a point shapefile, or an ASCII or SAGA grid, using nearest neighbour or kriging interpolation. pick.from.points and [internal.]pick.from.ascii.grid are the core functions that are called by the different wrappers.

Usage

pick.from.points(data, src, pick, method = c("nearest.neighbour", "krige"), set.na = FALSE, radius = 200, nmin = 0, nmax = 100, sill = 1, range = radius, nugget = 0, model = vgm(sill - nugget, "Sph", range = range, nugget = nugget), log = rep(FALSE, length(pick)), X.name = "x", Y.name = "y", cbind = TRUE)
pick.from.shapefile(data, shapefile, X.name = "x", Y.name = "y", ...)
pick.from.ascii.grid(data, file, path = NULL, varname = NULL, prefix = NULL, method = c("nearest.neighbour", "krige"), cbind = TRUE, parallel = FALSE, nsplit, quiet = TRUE, ...)
pick.from.ascii.grids(data, file, path = NULL, varname = NULL, prefix = NULL, cbind = TRUE, quiet = TRUE, ...)
internal.pick.from.ascii.grid(data, file, path = NULL, varname = NULL, prefix = NULL, method = c("nearest.neighbour", "krige"), nodata.values = c(-9999, -99999), at.once, quiet = TRUE, X.name = "x", Y.name = "y", nlines = Inf, cbind = TRUE, range, radius, na.strings = "NA", ...)
pick.from.saga.grid(data, filename, path, varname, prec = 7, show.output.on.console = FALSE, env = rsaga.env(), ...)

Arguments

data
data.frame giving the coordinates (in columns specified by X.name, Y.name) of point locations at which to interpolate the specified variables or grid values
src
data.frame
pick
variables to be picked (interpolated) from src; if missing, use all available variables, except those specified by X.name and Y.name
method
interpolation method to be used; partially matched against the alternatives "nearest.neighbour" (currently the default) and "krige"
set.na
logical: if a column with a name specified in pick already exists in data, how should it be dealt with? set.na=FALSE (default) only overwrites existing data if the interpolator yields a non-NA result; set.na=TRUE passes NA values returned by the interpolator on to the results data.frame
radius
numeric value specifying the radius of the local neighborhood to be used for interpolation; defaults to 200 map units (presumably meters), or, in the functions for grid files, 2.5*cellsize.
nmin
numeric, for method="krige" only: see krige function in package gstat
nmax
numeric, for method="krige" only: see krige function in package gstat
sill
numeric, for method="krige" only: the overall sill parameter to be used for the variogram
range
numeric, for method="krige" only: the variogram range
nugget
numeric, for method="krige" only: the nugget effect
model
for method="krige" only: the variogram model to be used for interpolation; defaults to a spherical variogram with parameters specified by the range, sill, and nugget arguments; see vgm in package gstat for details
log
logical vector, specifying for each variable in pick if interpolation should take place on the logarithmic scale (default: FALSE)
X.name
name of the variable containing the x coordinates
Y.name
name of the variable containing the y coordinates
cbind
logical: should the new variables be added to the input data.frame (cbind=TRUE, the default), or should they be returned as a separate vector or data.frame (cbind=FALSE)?
shapefile
point shapefile
...
arguments to be passed to pick.from.points, and to internal.pick.from.ascii.grid in the case of pick.from.ascii.grid
file
file name (relative to path, default file extension .asc) of an ASCII grid from which to pick a variable, or an open connection to such a file
path
optional path to file
varname
character string: a variable name for the variable interpolated from grid file file in pick.from.*.grid; if missing, variable name will be determined from filename by a call to create.variable.name
prefix
an optional prefix to be added to the varname
parallel
logical (default: FALSE): enable parallel processing; requires additional packages such as doSNOW or doMC. See example below and ddply
nsplit
split the data.frame data in nsplit disjoint subsets in order to increase efficiency by using ddply in package plyr. The default seems to perform well in many situations.
quiet
logical: provide information on the progress of grid processing on screen? (only relevant if at.once=FALSE and method="nearest.neighbour")
nodata.values
numeric vector specifying grid values that should be converted to NA; in addition to the values specified here, the nodata value given in the input grid's header will be used
at.once
logical: should the grid be read as a whole or line by line? at.once=FALSE is useful for processing large grids that do not fit into memory; the argument is currently by default FALSE for method="nearest.neighbour", and it currently MUST be TRUE for all other methods (in these cases, TRUE is the default value); piecewise processing with at.once=FALSE is always faster than processing the whole grid at.once
nlines
numeric: stop after processing nlines lines of the input grid; useful for testing purposes
na.strings
passed on to scan
filename
character: name of a SAGA grid file, default extension .sgrd
prec
numeric, specifying the number of digits to be used in converting a SAGA grid to an ASCII grid in pick.from.saga.grid
show.output.on.console
a logical (default: FALSE), indicates whether to capture the output of the command and show it on the R console (see system, rsaga.geoprocessor).
env
list: RSAGA geoprocessing environment created by rsaga.env
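
For readers unfamiliar with the variogram arguments listed above, the following base R sketch evaluates the standard spherical semivariance function with a partial sill of sill - nugget, mirroring the default of the model argument. The helper name sph_semivariance is hypothetical and is not part of RSAGA or gstat; it is only meant to show how sill, range and nugget shape the curve.

```r
# Standard spherical semivariogram, written in base R for illustration only.
# 'sph_semivariance' is a hypothetical helper, not an RSAGA or gstat function.
sph_semivariance <- function(h, sill = 1, range = 200, nugget = 0) {
  psill <- sill - nugget                 # partial sill, as in the default 'model'
  hr <- pmin(h / range, 1)               # beyond the range the curve stays flat
  gamma <- nugget + psill * (1.5 * hr - 0.5 * hr^3)
  gamma[h == 0] <- 0                     # semivariance is zero at distance zero
  gamma
}

h <- c(0, 50, 100, 200, 400)
gam <- sph_semivariance(h, sill = 1, range = 200, nugget = 0.1)
print(round(gam, 4))
```

The curve rises from the nugget towards the overall sill and stays at the sill for distances at or beyond the range.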

Value

If cbind=TRUE, columns with the new, interpolated variables are added to the input data.frame data. If cbind=FALSE, a data.frame containing only the new variables is returned (possibly coerced to a vector if only one variable is processed).
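
The interplay between the return value and the set.na argument can be sketched in base R as follows. The helper merge_picked and its varname parameter are hypothetical and only mimic the documented semantics for a single variable; this is not RSAGA's code.

```r
# Hypothetical helper illustrating the documented 'set.na' and 'cbind' behaviour;
# this is not RSAGA code, only a sketch of the semantics described above.
merge_picked <- function(data, picked, varname, set.na = FALSE, cbind = TRUE) {
  if (!cbind) return(picked)             # cbind=FALSE: return only the new values
  if (set.na || is.null(data[[varname]])) {
    data[[varname]] <- picked            # NA results are passed on unchanged
  } else {
    ok <- !is.na(picked)                 # set.na=FALSE: keep old values where NA
    data[[varname]][ok] <- picked[ok]
  }
  data
}

d <- data.frame(x = 1:3, y = 1:3, elev = c(10, 20, 30))
picked <- c(11, NA, 33)
keep_old <- merge_picked(d, picked, "elev", set.na = FALSE)
pass_na  <- merge_picked(d, picked, "elev", set.na = TRUE)
print(keep_old$elev)
print(pass_na$elev)
```

With set.na=FALSE the existing value 20 survives the NA result, whereas set.na=TRUE replaces it with NA.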

Details

pick.from.points interpolates the variables defined by pick in the src data.frame to the locations provided by the data data.frame. Only nearest neighbour and ordinary kriging interpolation are currently available. This function is intended for 'data-rich' situations in which not much thought needs to be put into a geostatistical analysis of the spatial structure of a variable. In particular, this function is supposed to provide a simple, 'quick-and-dirty' interface for situations where the src data points are very densely distributed compared to the data locations.
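
As a rough illustration of what nearest neighbour picking amounts to, here is a brute-force base R sketch. The function nn_pick is a hypothetical name, and the rule that locations with no source point within radius receive NA is an assumption made for this example; RSAGA's actual implementation may differ.

```r
# Brute-force nearest-neighbour pick in base R, for illustration only.
# 'nn_pick' is a hypothetical name; RSAGA's own implementation may differ.
nn_pick <- function(data, src, pick, radius = 200, X.name = "x", Y.name = "y") {
  out <- rep(NA_real_, nrow(data))
  for (i in seq_len(nrow(data))) {
    d2 <- (src[[X.name]] - data[[X.name]][i])^2 +
          (src[[Y.name]] - data[[Y.name]][i])^2
    j <- which.min(d2)                   # index of the closest source point
    # Assumption: points with no neighbour within 'radius' stay NA
    if (sqrt(d2[j]) <= radius) out[i] <- src[[pick]][j]
  }
  out
}

src  <- data.frame(x = c(0, 100, 1000), y = c(0, 0, 0), z = c(1.5, 2.5, 9))
data <- data.frame(x = c(10, 90, 500), y = c(0, 0, 0))
z_picked <- nn_pick(data, src, "z", radius = 200)
print(z_picked)
```

The third location lies 400 map units from its closest source point, so it is left as NA under the assumed radius rule.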

pick.from.shapefile is a front-end of pick.from.points for point shapefiles.

pick.from.ascii.grid retrieves data values from an ASCII raster file using either nearest neighbour or ordinary kriging interpolation. The latter may not be possible for large raster data sets because the entire grid needs to be read into an R matrix. Split-apply-combine strategies are used to improve efficiency and allow for parallelization.
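
The nearest neighbour case for a grid boils down to converting coordinates into row and column indices. The base R sketch below assumes a grid already held in memory as a matrix whose first row is the northernmost row, together with a header list using the common xllcorner, yllcorner, cellsize, ncols and nrows fields. The function grid_pick and this header layout are assumptions made for illustration and do not reproduce RSAGA's internals.

```r
# Nearest-cell lookup in an in-memory grid, base R sketch only.
# 'grid_pick' and the 'header' list layout are assumptions made for this example.
grid_pick <- function(x, y, m, header, nodata.values = c(-9999, -99999)) {
  col <- floor((x - header$xllcorner) / header$cellsize) + 1
  row <- header$nrows - floor((y - header$yllcorner) / header$cellsize)
  inside <- col >= 1 & col <= header$ncols & row >= 1 & row <= header$nrows
  out <- rep(NA_real_, length(x))        # points outside the grid stay NA
  out[inside] <- m[cbind(row[inside], col[inside])]
  out[out %in% nodata.values] <- NA      # nodata codes are converted to NA
  out
}

header <- list(ncols = 3, nrows = 2, xllcorner = 0, yllcorner = 0, cellsize = 10)
m <- matrix(c(1, 2, 3,
              4, -9999, 6), nrow = 2, byrow = TRUE)  # first row = northern row
vals <- grid_pick(x = c(5, 15, 25, 35), y = c(15, 5, 5, 5), m, header)
print(vals)
```

The second point falls on a nodata cell and the fourth lies outside the grid extent, so both come back as NA.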

The optional parallelization of pick.from.ascii.grid requires a parallel backend package such as doSNOW or doMC, and the backend must be registered before calling this function with parallel=TRUE. The example section provides an example using doSNOW on Windows. I have seen 25-40% faster processing in that example on a dual-core notebook.

pick.from.ascii.grids performs multiple pick.from.ascii.grid calls. File path and prefix arguments may be specific to each file (i.e. each may be a character vector), but all interpolation settings will be the same for each file, limiting the flexibility a bit compared to individual pick.from.ascii.grid calls by the user. pick.from.ascii.grids currently processes the files sequentially (i.e. parallelization is limited to the pick.from.ascii.grid calls within this function).

pick.from.saga.grid is the equivalent to pick.from.ascii.grid for SAGA grid files. It simply converts the SAGA grid file to a (temporary) ASCII raster file and applies pick.from.ascii.grid.

internal.pick.from.ascii.grid is an internal 'workhorse' function that by itself would be very inefficient when the data.frame data is large. This function is called by pick.from.ascii.grid, which uses a split-apply-combine strategy implemented in the plyr package.
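
The split-apply-combine idea can be sketched in base R without plyr. The names pick_in_chunks and worker are hypothetical, and the worker used here is a trivial stand-in for the per-chunk interpolation. RSAGA itself relies on ddply, so this only conveys the general pattern.

```r
# Split-apply-combine in base R; RSAGA itself uses plyr for this step.
# 'pick_in_chunks' and 'worker' are hypothetical names used for illustration.
pick_in_chunks <- function(data, nsplit, worker) {
  n <- nrow(data)
  nsplit <- max(1, min(nsplit, n))
  chunk_id <- ceiling(seq_len(n) / ceiling(n / nsplit))  # contiguous chunks
  parts <- lapply(split(data, chunk_id), worker)          # apply per chunk
  res <- do.call(rbind, unname(parts))                    # combine in order
  rownames(res) <- NULL
  res
}

d <- data.frame(x = 1:10, y = 10:1)
worker <- function(chunk) { chunk$z <- chunk$x * 2; chunk }  # stand-in worker
res <- pick_in_chunks(d, nsplit = 3, worker)
print(res$z)
```

Because each chunk is processed independently, the lapply step is the natural place where a parallel backend could take over.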

References

Brenning, A. (2008): Statistical geocomputing combining R and SAGA: The example of landslide susceptibility analysis with generalized additive models. In: J. Boehner, T. Blaschke, L. Montanarella (eds.), SAGA - Seconds Out (= Hamburger Beitraege zur Physischen Geographie und Landschaftsoekologie, 19), 23-32.

See Also

grid.to.xyz

Examples

## Not run: 
# assume that 'dem' is an ASCII grid and d a data.frame with variables x and y
pick.from.ascii.grid(d, "dem")
# parallel processing on Windows using the doSNOW package:
require(doSNOW)
registerDoSNOW(cl <- makeCluster(2, type = "SOCK")) # DualCore processor
pick.from.ascii.grid(d, "dem", parallel = TRUE)
# produces two (ignorable) warning messages when using doSNOW
# typically 25-40% faster than the above on my DualCore notebook
stopCluster(cl)
## End(Not run)

## Not run: 
# use the meuse data for some tests:
require(gstat)
data(meuse)
data(meuse.grid)
meuse.nn = pick.from.points(data=meuse.grid, src=meuse,
    pick=c("cadmium","copper","elev"), method="nearest.neighbour")
meuse.kr = pick.from.points(data=meuse.grid, src=meuse,
    pick=c("cadmium","copper","elev"), method="krige", radius=100)
# it does make a difference:
plot(meuse.kr$cadmium,meuse.nn$cadmium)
plot(meuse.kr$copper,meuse.nn$copper)
plot(meuse.kr$elev,meuse.nn$elev)
## End(Not run)
