A LAScatalog object is a representation of a set of las/laz files. Because a computer cannot load
all the data at once, a LAScatalog is a simple way to manage the entire dataset by reading only
the file headers. It enables the user to process a large area, or to selectively clip data from
it, without loading the whole area into memory. A LAScatalog can be built with the function
catalog. It also contains extra information that enables users to control how the catalog is
processed (see details).
data: data.table. A table representing the header of each file.
crs: A CRS object.
cores: integer. Number of cores used for parallel computation in functions that
support a LAScatalog as input. Default is 1.
buffer: numeric. When applying a function to an entire catalog by sequentially processing sub-areas (clusters), some algorithms (such as grid_terrain) require a buffer around each area to avoid edge effects. Default is 15 units.
progress: logical. Display an estimate of progress while processing. Default is TRUE.
by_file: logical. This option overrides the option tiling_size. Instead of processing
the catalog by arbitrarily split areas, it forces processing by file. Buffering around each file is
still available. Default is FALSE.
tiling_size: numeric. To process an entire catalog, the algorithm splits the dataset into several square sub-areas (called clusters) and processes them sequentially. This is the width of each square cluster. Default is 1000 units.
vrt: character. Path to a folder. In grid_* functions such as grid_metrics,
grid_terrain and others, the functions can write RasterLayers to this folder and
return a lightweight virtual raster mosaic (VRT). In functions where it is not relevant,
it is not used.
stop_early: logical. If TRUE, catalog processing stops as soon as an error occurs during the
computation. If FALSE, the catalog is processed to the end anyway and clusters with
errors are skipped.
opt_changed: Internal use only, for compatibility with older deprecated code.
A LAScatalog contains a slot @data that holds the information about the point cloud
that is used internally, as well as several other slots that hold processing options. Each
lidR function that supports a LAScatalog as input respects these processing options
when they are relevant. When they are not relevant, they are ignored. Examples of situations
where an option is not relevant or is adapted:
@vrt is not relevant in functions that do not rasterize the point cloud.
@tiling_size is always respected but can be slightly modified to align the clusters with
the grid in grid_* functions.
@buffer is not relevant in grid_metrics because lidR aligns the
clusters with the resolution to get a continuous output. However, it is relevant in grid_terrain,
for example, to avoid edge artifacts.
@cores may not be respected if it is known internally that a single core is better
than four (no such case currently exists).
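To illustrate what "aligning the clusters with the grid" can mean, here is a minimal base R sketch. It is only an assumption about the general idea, not lidR's actual code: align_tiling_size and align_origin are hypothetical helpers that do not exist in lidR, and the rounding rule (enlarge to the next multiple of the resolution) is chosen for illustration.

```r
# Hypothetical helper, not a lidR function: enlarge a requested cluster width
# to the next multiple of the raster resolution, so that cluster borders
# never cut through a raster cell.
align_tiling_size <- function(tiling_size, res) {
  ceiling(tiling_size / res) * res
}

# Hypothetical helper, not a lidR function: snap a lower-left coordinate
# down onto the grid defined by the resolution.
align_origin <- function(x, res) {
  floor(x / res) * res
}

size_30 <- align_tiling_size(1000, 30)  # 1000 is not a multiple of 30
size_20 <- align_tiling_size(1000, 20)  # 1000 is already a multiple of 20
x_left  <- align_origin(684766.4, 20)   # snapped down to a multiple of 20
```

With a 30-unit resolution the requested width of 1000 is enlarged, whereas with a 20-unit resolution it is left unchanged, which is the sense in which the option is respected but "slightly modified".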
Internally, processing a catalog almost always follows the same few steps:
Create a set of clusters. A cluster is the representation of a region of interest that can be buffered or not.
Loop over each cluster (in parallel or not).
For each cluster, load the points inside the region of interest into R, run some R functions, and return the expected output.
Merge the outputs of the different clusters once they are all processed.
In short, a LAScatalog is a built-in batch process, with the particularity that lidR
does not loop through files but loops seamlessly through clusters that do not necessarily match
the files. This is why indexing the point cloud with lax files may significantly speed up the
processing.
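The steps described above can be sketched in base R as follows. This is an illustration of the idea only, not lidR's internal code: make_clusters and process_clusters are hypothetical helpers, the point cloud is a plain data.frame held in memory rather than a set of files read from disk, and the loop is sequential. The stop_early argument mimics the behaviour of the option of the same name.

```r
# Hypothetical helper (not part of lidR): split a bounding box into square
# clusters of width `size`; the buffer is stored alongside each cluster.
make_clusters <- function(xmin, ymin, xmax, ymax, size, buffer = 0) {
  nx <- ceiling((xmax - xmin) / size)
  ny <- ceiling((ymax - ymin) / size)
  idx <- expand.grid(ix = seq_len(nx) - 1, iy = seq_len(ny) - 1)
  data.frame(
    xleft   = xmin + idx$ix * size,
    ybottom = ymin + idx$iy * size,
    xright  = xmin + (idx$ix + 1) * size,
    ytop    = ymin + (idx$iy + 1) * size,
    buffer  = buffer
  )
}

# Hypothetical helper (not part of lidR): loop over the clusters, "load" the
# points of each buffered region, apply `fun`, then merge the outputs.
process_clusters <- function(points, clusters, fun, stop_early = TRUE) {
  out <- vector("list", nrow(clusters))
  for (i in seq_len(nrow(clusters))) {
    cl <- clusters[i, ]
    # Select the points inside the buffered region of interest
    keep <- points$x >= cl$xleft   - cl$buffer & points$x < cl$xright + cl$buffer &
            points$y >= cl$ybottom - cl$buffer & points$y < cl$ytop   + cl$buffer
    sub <- points[keep, , drop = FALSE]
    # Run the user function; on error either stop or skip this cluster
    res <- tryCatch(fun(sub, cl), error = function(e) {
      if (stop_early) stop(e)
      NULL
    })
    if (!is.null(res)) out[[i]] <- res
  }
  # Merge the outputs of the clusters that were processed successfully
  do.call(rbind, Filter(Negate(is.null), out))
}

# Toy data: a regular lattice of points over a 2000 x 2000 area
pts <- expand.grid(x = seq(5, 1995, by = 10), y = seq(5, 1995, by = 10))
clusters <- make_clusters(0, 0, 2000, 2000, size = 1000, buffer = 15)

# Example function: count loaded points and points in the unbuffered core
count_points <- function(sub, cl) {
  core <- sub$x >= cl$xleft   & sub$x < cl$xright &
          sub$y >= cl$ybottom & sub$y < cl$ytop
  data.frame(xleft = cl$xleft, ybottom = cl$ybottom,
             n_loaded = nrow(sub), n_core = sum(core))
}

res <- process_clusters(pts, clusters, count_points)
```

In this toy run every cluster loads more points than lie in its core, because of the buffer, while the core counts add up to the size of the full dataset. Nothing in the loop depends on how the points are distributed across files, only on the cluster extents.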
It is important to note that buffered datasets (i.e. files that overlap each other) are not natively
supported by lidR. When encountering such datasets, the user should always filter out the
overlapping points if possible. This is possible if the overlapping points are flagged, for example in the
'withheld' field. Otherwise lidR will not be able to process the dataset correctly.