Set or get number of threads that data.table should use
Set and get the number of threads to be used in data.table functions that are parallelized with OpenMP. The default value 0 means use all available CPUs, with an appropriate number of threads calculated by OpenMP.
getDTthreads() returns the number of threads that will be used. This affects data.table only and does not change R itself or other packages using OpenMP. We have followed the advice of section 1.2.1.1 in the R-exts manual: "… or, better, for the regions in your code as part of their specification… num_threads(nthreads)… That way you only control your own code and not that of other OpenMP users." Every parallel region in data.table contains this directive. This is enforced by a grep in data.table's quality control CRAN release procedure script.
setDTthreads(threads = 0, restore_after_fork = NULL)
getDTthreads(verbose = getOption("datatable.verbose", FALSE))
threads: An integer >= 0. The default, 0, means use all available CPUs and leave the operating system to multitask.
restore_after_fork: Should data.table be multi-threaded again after a fork has completed? NULL leaves the current setting unchanged, which by default is TRUE. See details below.
verbose: Display the value of some OpenMP settings, including the internal restore_after_fork option.
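As a minimal sketch of how these arguments are typically used (assuming data.table is installed; the variable names are illustrative only and the reported counts depend on the machine):

```r
library(data.table)

# setDTthreads() returns the previous thread count (invisibly), so capture it.
old <- setDTthreads(threads = 1L)
single <- getDTthreads()          # data.table now uses a single thread

setDTthreads(threads = 0L)        # 0 requests all available CPUs
all_cpus <- getDTthreads()

# verbose = TRUE additionally prints some OpenMP settings to the console.
getDTthreads(verbose = TRUE)

setDTthreads(old)                 # put the original setting back
```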
data.table automatically switches to single-threaded mode upon fork (the mechanism used by mclapply and the foreach package). Otherwise, nested parallelism would very likely overload your CPUs and result in much slower execution. As data.table becomes more parallel internally, we expect explicit user parallelism to be needed less often. The restore_after_fork option controls what happens after the explicit fork parallelism completes. It must be held at C level, so it is not a regular R option set via options(). By default data.table becomes multi-threaded again, restoring the prior setting of getDTthreads(). However, problems have been reported in the past with Mac and Intel OpenMP libraries, whereas success has been reported on Linux. If you experience problems after a fork, start a new R session and change the default behaviour by calling setDTthreads(restore_after_fork=FALSE) before retrying. Please raise issues on the data.table GitHub issues page.
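The fork behaviour described above can be observed with a small sketch like the following. It assumes a Unix-alike system where parallel::mclapply() really forks, and it relies on the default restore_after_fork setting; the variable names are illustrative only.

```r
library(data.table)
library(parallel)

before <- getDTthreads()

# mclapply() forks the R process (Unix-alikes only); each child asks
# data.table how many threads it would use.
in_children <- unlist(mclapply(1:2, function(i) getDTthreads(), mc.cores = 2L))

# Back in the parent, after the forked work has completed.
after <- getDTthreads()
```

Going by the description above, in_children should report single-threaded mode, and after should match before unless restore_after_fork has been set to FALSE.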
Calling setDTthreads() with more than the number of logical CPUs is intended to have no effect beyond that limit; i.e., getDTthreads() will still return the number of logical CPUs in that case. Further, there is a hard-coded limit of 1024 threads (with a warning when it is imposed) to prevent accidentally picking up the value INT_MAX (2 billion; i.e. unlimited), which we have seen returned by OpenMP's omp_get_thread_limit() in some cases.
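A short sketch of the capping behaviour, assuming data.table is installed (variable names are illustrative, and the actual counts depend on the machine):

```r
library(data.table)

old <- setDTthreads(0L)            # request all available CPUs
all_cpus <- getDTthreads()

# Ask for one more thread than data.table just reported as the maximum.
setDTthreads(all_cpus + 1L)
capped <- getDTthreads()           # expected to stay at all_cpus

setDTthreads(old)                  # put the original setting back
```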
A length 1 integer. The old value is returned by setDTthreads() so that you can store it and pass it to setDTthreads() again after the section of your code where you control the number of threads.
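One way to apply this store-and-restore pattern is a small wrapper. The helper with_dt_threads() below is hypothetical (it is not part of data.table) and is only a sketch of the idea:

```r
library(data.table)

# Hypothetical helper: evaluate `expr` with `n` data.table threads, then
# restore whatever setting was in place before, even if `expr` errors.
with_dt_threads <- function(n, expr) {
  old <- setDTthreads(n)                  # previous value is returned
  on.exit(setDTthreads(old), add = TRUE)  # restore on exit
  force(expr)                             # lazy argument is evaluated here
}

before <- getDTthreads()
inside <- with_dt_threads(1L, getDTthreads())
after  <- getDTthreads()
```

Because R evaluates arguments lazily, expr runs only after the thread count has been changed, and on.exit() hands the stored value back to setDTthreads() afterwards.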