focal_hpc is designed to execute a function on a Raster* object using foreach, to
achieve parallel reads, executions and writes. Parallel random writes are achieved through the use of
mmap, so individual image chunks can finish and write their outputs without having to wait for
all nodes in the cluster to finish and then perform sequential writing. On Windows systems,
random writes are possible, but simultaneous writes from multiple workers apparently are not.
focal_hpc works around this by attempting to write to a portion of the image file and, if the
write fails with an error (a race condition has occurred), retrying until the write succeeds.
On Unix-alikes, truly parallel writes should be possible.
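
The retry behavior described above can be sketched in base R. This is only an illustration of the retry-on-error pattern, not focal_hpc's actual code: retry_write, flaky_write, max_tries and wait are hypothetical names introduced here, and the "write" is simulated rather than performed through mmap.

```r
# Hypothetical helper (not part of spatial.tools): call write_fun() until it
# succeeds, up to max_tries attempts, pausing briefly between attempts.
retry_write <- function(write_fun, max_tries = 100, wait = 0.01) {
  for (attempt in seq_len(max_tries)) {
    ok <- tryCatch({
      write_fun()
      TRUE
    }, error = function(e) FALSE)
    if (ok) return(attempt)   # number of attempts that were needed
    Sys.sleep(wait)           # give the competing writer time to finish
  }
  stop("write did not succeed after ", max_tries, " attempts")
}

# Simulated writer that fails twice (as if the file were busy), then succeeds.
calls <- 0
flaky_write <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("file is busy")
  invisible(NULL)
}

attempts <- retry_write(flaky_write)
```

Here the simulated write succeeds on the third attempt; a real writer would replace flaky_write with the code that writes one chunk to its region of the output file.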
Note that rasterEngine
is a convenience wrapper for focal_hpc and, in general, should be used instead
of focal_hpc directly.
focal_hpc operates in two modes, which have different inputs and outputs to the function:
Pixel based processing:
1) If chunk_format=="array" (default), the input to the function should assume an array of dimensions
x,y,z where x = the number of columns in a chunk, y = the number of rows in the chunk, and
z = the number of bands in the chunk. If chunk_format=="raster", the input to the function
will be a raster subset.
Note that the array is ordered using the convention for geographic data (columns, rows, bands),
not the way R usually orders arrays (rows, columns, bands).
2) The output of the function should always be an array with the x and y dimensions matching
the input, and an arbitrary number of band outputs. Remember to order the dimensions as
columns, rows, bands (x,y,z).
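
A minimal sketch of a function that follows the pixel based contract above, using base R only. The function name norm_diff, the band positions and the simulated chunk are assumptions made for illustration; in practice focal_hpc (or rasterEngine) would supply the chunk.

```r
# Hypothetical pixel based function: normalized difference of bands 2 and 1.
# Input x: array with dimensions (columns, rows, bands).
norm_diff <- function(x) {
  band1 <- x[, , 1]
  band2 <- x[, , 2]
  nd <- (band2 - band1) / (band2 + band1)
  # Return an array with the same x and y dimensions and one output band,
  # ordered as (columns, rows, bands).
  array(nd, dim = c(dim(x)[1], dim(x)[2], 1))
}

# Simulated chunk: 4 columns, 3 rows, 2 bands.
chunk <- array(seq_len(24), dim = c(4, 3, 2))
result <- norm_diff(chunk)

# If the usual R ordering (rows, columns, bands) is needed inside a function,
# aperm() swaps the first two dimensions.
r_style <- aperm(chunk, c(2, 1, 3))
```

The output keeps the 4 x 3 footprint of the input and has a single band, which is the shape the pixel based mode expects back.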
Local window processing:
1) The function should be written to process a SINGLE window at a time, given the dimensions
of window_dims, so the input to the function should assume a window of dimensions window_dims
with a local center defined by window_center. As before, the input can be passed to
the function as an array (suggested) or a small raster.
2) The output should represent a single pixel, so it can be either a single value or a vector
(which is assumed to be multiple bands of a single pixel).
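
A minimal sketch of a function written for local window processing, again in base R. The name window_mean and the simulated 3 x 3 window with 2 bands are assumptions for illustration; the function sees one window at a time and returns one value per band for the pixel at the window center.

```r
# Hypothetical local window function: mean of each band within the window.
# Input x: array with dimensions (window columns, window rows, bands).
window_mean <- function(x) {
  # Apply over the band dimension; the result is a vector with one value
  # per band, interpreted as the bands of a single output pixel.
  apply(x, 3, mean, na.rm = TRUE)
}

# Simulated window: 3 columns, 3 rows, 2 bands.
window <- array(seq_len(18), dim = c(3, 3, 2))
pixel <- window_mean(window)
```

With this simulated window, pixel is a vector of length 2 (one mean per band), so the output image would have two bands.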
The speed of the execution when running in parallel will vary based on the specific setup,
and may, indeed, be slower than a sequential execution (e.g. with calc() ),
particularly on smaller files. Note that after simply calling sfQuickStop(), focal_hpc
will run in sequential mode.