bamlss (version 1.1-2)

bbfit: Batchwise Backfitting

Description

Batchwise backfitting estimation engine for GAMLSS using very large data sets.

Usage

## Batchwise backfitting engine.
bbfit(x, y, family, shuffle = TRUE, start = NULL, offset = NULL,
  epochs = 1, nbatch = 10, verbose = TRUE, ...)

## Parallel version.
bbfitp(x, y, family, mc.cores = 1, ...)

## Loglik contribution plot.
contribplot(x, ...)

Arguments

x

For function bbfit(), the x list as returned from function bamlss.frame, holding all model matrices and other information used for fitting the model. For the updating functions, an object as returned from function smooth.construct or smoothCon. For function contribplot(), a "bamlss" object estimated using bbfit() with argument select = TRUE.

y

The model response, as returned from function bamlss.frame.

family

A bamlss family object, see family.bamlss.

shuffle

Should observations be shuffled?

start

A named numeric vector of possible starting values; the names are based on function parameters.

offset

Model offsets to use in fitting, as returned from function bamlss.frame.

epochs

For how many epochs should the algorithm run?

nbatch

Number of batches. Can also be a number between 0 and 1, in which case it determines the fraction of observations that should be used for fitting.

verbose

Print information during runtime of the algorithm.

mc.cores

On how many cores should estimation be started?

...

For bbfitp(), all arguments to be passed to bbfit().
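As a rough illustration of how shuffle, nbatch and epochs interact, the following base R sketch splits row indices into batches. The helper make_batches() is hypothetical and not part of bamlss, and the handling of a fractional nbatch (one subset holding that share of the observations) is an assumed reading of the argument description, not the package's internal logic.

```r
## Hypothetical helper (not a bamlss function): split n row indices into
## batches, optionally after shuffling the observations.
make_batches <- function(n, nbatch = 10, shuffle = TRUE) {
  stopifnot(n >= 1, nbatch > 0)
  idx <- if (shuffle) sample.int(n) else seq_len(n)
  if (nbatch < 1) {
    ## Assumed reading of a fractional nbatch: keep only that share
    ## of the observations as a single batch.
    return(list(idx[seq_len(max(1L, floor(n * nbatch)))]))
  }
  ## Integer nbatch: groups of (nearly) equal size.
  group <- sort(rep(seq_len(nbatch), length.out = n))
  unname(split(idx, group))
}

set.seed(123)
batches <- make_batches(27000, nbatch = 10)

## One pass over all batches corresponds to one epoch.
epochs <- 2
for (e in seq_len(epochs)) {
  for (b in seq_along(batches)) {
    rows <- batches[[b]]
    ## ... an update step would use only the rows in this batch ...
  }
}
```

With 27000 observations and nbatch = 10, each batch in this sketch holds 2700 row indices, so only a tenth of the data is touched per update step.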

Value

For function bbfit() a list containing the following objects:

fitted.values

A named list of the fitted values of the modeled parameters of the selected distribution.

parameters

The estimated set of regression coefficients and smoothing variances.

shuffle

Logical

runtime

The runtime of the algorithm.

Details

The algorithm uses batch-wise estimation of smoothing variances, which are estimated on a hold-out batch. This way, models for very large data sets can be estimated. Note that the algorithm only works in combination with the ff and ffbase packages. The data needs to be stored as a comma-separated file on disk; see the example.
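The idea of tuning a penalty on a hold-out batch can be sketched with a toy example in base R. This is not the bbfit() algorithm: it swaps in a simple ridge-type penalised least squares fit on a polynomial basis, picks the penalty from a fixed grid by hold-out squared error, and averages coefficients across batches. All names (fit_ridge, lambdas, the averaging weight) are assumptions made for illustration.

```r
set.seed(1)
n <- 2000
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

## Polynomial basis as a stand-in for a spline design matrix.
X <- cbind(1, poly(x, 8))
P <- diag(c(0, rep(1, 8)))  # penalise everything except the intercept

## Penalised least squares on a subset of rows.
fit_ridge <- function(X, y, lambda, P) {
  drop(solve(crossprod(X) + lambda * P, crossprod(X, y)))
}

## Shuffle the rows and split them into 10 batches.
idx <- sample.int(n)
batches <- unname(split(idx, sort(rep(1:10, length.out = n))))

lambdas <- 10^seq(-4, 2, length.out = 13)
beta <- NULL
for (i in seq_len(length(batches) - 1)) {
  train <- batches[[i]]
  hold <- batches[[i + 1]]  # the next batch serves as hold-out
  ## Hold-out error for every candidate penalty.
  err <- sapply(lambdas, function(l) {
    b <- fit_ridge(X[train, , drop = FALSE], y[train], l, P)
    mean((y[hold] - X[hold, , drop = FALSE] %*% b)^2)
  })
  best <- lambdas[which.min(err)]
  b_new <- fit_ridge(X[train, , drop = FALSE], y[train], best, P)
  ## Simple running average of the batch-wise coefficient estimates.
  beta <- if (is.null(beta)) b_new else 0.5 * beta + 0.5 * b_new
}

## Mean squared error of the final coefficients on all data.
mean((y - X %*% beta)^2)
```

Each update only ever touches two batches at a time, which is what makes this kind of scheme attractive when the full data set does not fit into memory.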

See Also

bamlss, bfit

Examples

# NOT RUN {
## Load the package (assumes bamlss is installed).
library("bamlss")

## Simulate data.
set.seed(123)
d <- GAMart(n = 27000, sd = -1)

## Write data to disk as a comma-separated file.
tf <- tempdir()
write.table(d, file.path(tf, "d.raw"), quote = FALSE, row.names = FALSE, sep = ",")

## Estimation using batch-wise backfitting.
f <- list(
  num ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10),
  sigma ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10)
)

b <- bamlss(f, data = file.path(tf, "d.raw"), optimizer = bbfit,
  sampler = FALSE, nbatch = 10, epochs = 2, loglik = TRUE)

## Show estimated effects.
plot(b)
# }