Big data is defined loosely here as data that is too large for
  computer memory (RAM). The BigData function uses the
  split-apply-combine strategy with a big data set. The unmanageable
  big data set is split into smaller, manageable pieces (batches),
  a function is applied to each batch, and results are combined.
In each iteration, the BigData function opens a connection to a
  big data set and keeps the connection open while the scan
  function reads in each batch of data (elsewhere, batches are often
  referred to as chunks). A user-specified function is applied to each
  batch of data, the results are combined, the connection is
  closed, and the results are returned.
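To make the mechanics concrete, the following is a minimal sketch of
  this connection/scan/apply/combine cycle in base R. It is not the
  BigData implementation itself, and the file name, dimensions, batch
  size, and batch-level function are hypothetical.

    con <- file("X.csv", open="r")      # open a connection to the big data set
    N <- 1e8; J <- 11; size <- 1e6      # rows, columns, and batch size (assumed)
    total <- 0
    for (b in seq_len(ceiling(N / size))) {
      x <- matrix(scan(con, sep=",", nlines=size, quiet=TRUE),
                  ncol=J, byrow=TRUE)   # read one batch of rows with scan
      total <- total + sum(x[, 1])      # apply a function to the batch and combine
    }
    close(con)                          # close the connection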
As an introductory example, suppose a statistician updates a linear
  regression model, but the design matrix \(\textbf{X}\) is too
  large for computer memory. Suppose the design matrix has 100 million
  rows, and the statistician specifies size=1e6. The statistician
  combines the dependent variable \(\textbf{y}\) with the design matrix
  \(\textbf{X}\). In each iteration of IterativeQuadrature,
  LaplaceApproximation, LaplacesDemon,
  PMC, or VariationalBayes, the
  BigData function sequentially reads in one million rows of the
  combined data, calculates the expectation vector
  \(\mu\), and returns the sum of the log-likelihood for that batch.
  The batch sums are added together across all 100 batches, and the
  total log-likelihood is returned.
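As a concrete illustration, the batch-level calculation in this
  example could be written as follows. This is a hypothetical sketch:
  the column layout (first column \(\textbf{y}\), remaining columns
  \(\textbf{X}\)), the fixed residual standard deviation of 1, and the
  name batch.LL are assumptions, not part of the package.

    batch.LL <- function(x, beta) {
      y  <- x[, 1]                      # dependent variable for this batch
      X  <- x[, -1]                     # design matrix for this batch
      mu <- tcrossprod(X, t(beta))      # expectation vector
      sum(dnorm(y, mu, 1, log=TRUE))    # log-likelihood contribution of the batch
    }

  The 100 batch-level sums are then added together to give the full
  log-likelihood.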
  
There are many limitations of this function.
This function is not fast, in the sense that the entire big data set
  is processed in batches in each iteration. With iterative methods,
  this may perform well, albeit slowly.
There are many functions that cannot be applied to batches, though
  most models in the Examples vignette may easily be updated with big
  data.
This function addresses only big data, not large matrices of posterior
  samples.
Although many (but not all) models may be estimated, many additional
  functions in this package will not work when applied after the model
  has been updated. Instead, a batch or random sample of data (see the
  read.matrix function for sampling from big data) should
  be used in the usual way, in the Data argument, and the
  Model function coded in the usual way without the
  BigData function.
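For example, a random sample of the big data set could be drawn and
  passed through the Data argument as usual. This is a hypothetical
  sketch: the file name, dimensions, and the arguments given to
  read.matrix (nrow, samples) are assumptions, so consult the
  read.matrix documentation before use.

    library(LaplacesDemon)
    xy <- read.matrix("X.csv", sep=",", nrow=1e8, samples=1e4)  # sample 10,000 rows
    MyData <- list(y=xy[, 1], X=xy[, -1], J=ncol(xy) - 1,
                   mon.names="LP",
                   parm.names=paste0("beta[", seq_len(ncol(xy) - 1), "]"))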
Parallel processing may be performed when the user specifies
  CPUs to be greater than one, implying that the specified number
  of CPUs exists and is available. Parallelization may be performed on a
  multicore computer or a computer cluster. Either a Simple Network of
  Workstations (SNOW) or Message Passing Interface (MPI) is used. Each
  call to BigData establishes and closes the parallelization,
  which is costly, and unfortunately results in copious output to the
  console. With small data sets, parallel processing may be slower, due
  to computer network communication. With larger data sets, the user
  should experience a faster run-time.
There have been several alternative approaches suggested for big data.
Huang and Gelman (2005) propose creating batches by
  sampling from big data, updating a separate Bayesian model on each
  batch, and combining the results into a consensus posterior. This
  many-mini-model approach may be faster when feasible, because multiple
  models may be updated in parallel, say one per CPU. Such results will
  work with all functions in this package. With the many-mini-model
  approach, several methods are proposed for combining posterior samples
  from batch-level models, such as using a normal approximation,
  updating from prior to posterior sequentially (the posterior from the
  last batch becomes the prior of the next batch), sampling from the
  full posterior via importance sampling from the batched posteriors,
  and more.
Scott et al. (2013) propose a method that they call Consensus Monte
  Carlo, which consists of breaking the data down into chunks, calling
  each chunk a shard, and using a many-mini-model approach as well, but
  with their own method of weighting the posteriors back together.
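Under a normal approximation, one common way to weight shard-level
  posterior draws back together is a variance-weighted average, as in
  the following hypothetical sketch (the object draws, a list of
  iterations-by-parameters matrices, one per shard, is an assumption
  and not part of the package).

    consensus <- function(draws) {
      W   <- lapply(draws, function(d) 1 / apply(d, 2, var)) # per-shard precision weights
      num <- Reduce(`+`, Map(function(d, w) sweep(d, 2, w, `*`), draws, W))
      den <- Reduce(`+`, W)
      sweep(num, 2, den, `/`)           # precision-weighted average of the draws
    }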
  
Balakrishnan and Madigan (2006) introduced a Sequential Monte Carlo
  (SMC) sampler, a refinement of an earlier proposal, which was designed
  for big data. It makes one pass through the massive data set, after an
  initial MCMC estimation on a small sample. Each particle is updated
  for each record, resulting in numerous evaluations per record.
Welling and Teh (2011) proposed a new class of MCMC sampler in which
  only a random sample of big data is used each iteration. The
  stochastic gradient Langevin dynamics (SGLD) algorithm is available
  in the LaplacesDemon function.
An important alternative to consider is using the ff package,
  where "ff" stands for fast access file. The ff package has been
  tested successfully with updating a model in LaplacesDemon.
  Once the big data set, say \(\textbf{X}\), is an object of
  class ff_matrix, simply include it in the list of data as
  usual, and modify the Model specification function
  appropriately. For example, change mu <- tcrossprod(X, t(beta))
  to mu <- tcrossprod(X[], t(beta)). The ff package is
  not included as a dependency in the LaplacesDemon package, so
  it must be installed and activated.
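For orientation, a minimal sketch of this workflow follows. The object
  names and dimensions are hypothetical; only library(ff), coercion with
  as.ff, and subsetting with X[] inside the Model specification
  function correspond to the description above.

    library(ff)                         # install and activate the ff package
    Xff <- as.ff(X)                     # X is an ordinary matrix; Xff has class ff_matrix
    MyData$X <- Xff                     # include it in the list of data as usual
    ## Inside the Model specification function:
    ## mu <- tcrossprod(X[], t(beta))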