Parallelize applying a function over a list or vector according to the registered parallelization engine.
tm_parLapply(X, FUN, ...)
tm_parLapply_engine(new)
X: a vector (atomic or list), or other objects suitable for the engine in use.
FUN: the function to be applied to each element of X.
...: optional arguments to FUN.
new: an object inheriting from class cluster as created
by makeCluster() from package
parallel, or a function with formals X, FUN and
..., or NULL corresponding to the default of using no
parallelization engine.
A list the length of X, with the result of applying FUN
together with the ... arguments to each element of X.
Parallelization can be employed to speed up some of the embarrassingly
parallel computations performed in package tm, specifically
tm_index(), tm_map() on a non-lazy-mapped
VCorpus, and TermDocumentMatrix() on a
VCorpus or PCorpus. Functions
tm_parLapply() and tm_parLapply_engine() can be used to
customize parallelization according to the available resources.
tm_parLapply_engine() is used for getting (with no arguments)
or setting (with argument new) the parallelization engine
employed (see below for examples).
If an engine is set to an object inheriting from class cluster,
tm_parLapply() calls
parLapply() with this cluster and
the given arguments. If set to a function, tm_parLapply()
calls the function with the given arguments. Otherwise, it simply
calls lapply().
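The dispatch rule just described can be sketched in a few lines. This is only an illustration of the documented behavior, not the package source: the helper name dispatch_lapply and the square workload are hypothetical, and the engine is passed as an explicit argument here instead of being read from the registered setting.

```r
library(parallel)

# Hypothetical stand-in for the documented dispatch rule; dispatch_lapply
# is an illustrative name and is not part of tm.
dispatch_lapply <- function(engine, X, FUN, ...) {
    if (inherits(engine, "cluster")) {
        # Cluster engine: hand the work to parLapply() on that cluster.
        parallel::parLapply(engine, X, FUN, ...)
    } else if (is.function(engine)) {
        # Function engine: call it with the given arguments.
        engine(X, FUN, ...)
    } else {
        # No engine (NULL): plain serial lapply().
        lapply(X, FUN, ...)
    }
}

square <- function(x) x^2

# No engine registered.
res_serial <- dispatch_lapply(NULL, 1:3, square)

# A function with formals X, FUN and ...
res_fun <- dispatch_lapply(function(X, FUN, ...) lapply(X, FUN, ...),
                           1:3, square)

# A cluster created by makeCluster(); two workers is an arbitrary choice.
cl <- makeCluster(2)
res_cluster <- dispatch_lapply(cl, 1:3, square)
stopCluster(cl)
```

All three calls should return the same list, since only the execution strategy differs.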
Hence, to achieve parallelization via
parLapply()
and a default cluster registered via
setDefaultCluster(), one
can use
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapply(NULL, X, FUN, ...))
or re-register the cluster, say cl, using
tm_parLapply_engine(cl)
(note that there is no mechanism for programmatically getting the registered default cluster). Using
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapplyLB(NULL, X, FUN, ...))
or
tm_parLapply_engine(function(X, FUN, ...)
    parallel::parLapplyLB(cl, X, FUN, ...))
gives load-balancing parallelization with the registered default or
given cluster, respectively. To achieve parallelization via forking
(on Unix-alike platforms), one can use the above with clusters created
by makeForkCluster(), or use
tm_parLapply_engine(parallel::mclapply)
or
tm_parLapply_engine(function(X, FUN, ...)
    parallel::mclapply(X, FUN, ..., mc.cores = n))
to use mclapply() with the default or
given number n of cores.
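Engine functions of the kind shown above are ordinary R functions, so they can be tried on their own before being registered. The following sketch uses only package parallel; the names lb_engine and fork_engine, the two-worker cluster, the value of n and the nchar workload are all assumptions made for illustration.

```r
library(parallel)

words <- c("alpha", "be", "gam")

# Load-balancing engine using the registered default cluster.
cl <- makeCluster(2)          # two workers, an arbitrary choice
setDefaultCluster(cl)
lb_engine <- function(X, FUN, ...)
    parallel::parLapplyLB(NULL, X, FUN, ...)
lens_lb <- lb_engine(words, nchar)
stopCluster(cl)

# Forking engine with a given number n of cores. Forking is only
# available on Unix-alike platforms, so fall back to one core elsewhere.
n <- if (.Platform$OS.type == "unix") 2L else 1L
fork_engine <- function(X, FUN, ...)
    parallel::mclapply(X, FUN, ..., mc.cores = n)
lens_fork <- fork_engine(words, nchar)
```

Either function could then be passed to tm_parLapply_engine() as described above.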
makeCluster(),
parLapply(),
parLapplyLB(), and
mclapply().
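A minimal usage sketch, assuming package tm is installed: it reads the current engine, registers a function engine, applies a function through tm_parLapply(), and restores the previous setting. The serial lapply() wrapper is a placeholder that keeps the example portable; the offset argument and the variable names are illustrative.

```r
library(tm)

# Get the currently registered engine (NULL unless one has been set).
old_engine <- tm_parLapply_engine()

# Register a function engine with formals X, FUN and ...
tm_parLapply_engine(function(X, FUN, ...) lapply(X, FUN, ...))

# Extra arguments are passed on to FUN via ...
res <- tm_parLapply(1:3, function(x, offset) x + offset, offset = 10)

# Restore the previous engine so later code is unaffected.
tm_parLapply_engine(old_engine)
```

Saving and restoring the engine is a design choice for scripts that should not change the session-wide setting.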