# bbfit

##### Batchwise Backfitting

Batchwise backfitting estimation engine for fitting GAMLSS to very large data sets.

- Keywords
- regression

##### Usage

```
## Batchwise backfitting engine.
bbfit(x, y, family, shuffle = TRUE, start = NULL, offset = NULL,
  epochs = 1, nbatch = 10, verbose = TRUE, ...)

## Parallel version.
bbfitp(x, y, family, mc.cores = 1, ...)

## Loglik contribution plot.
contribplot(x, ...)
```

##### Arguments

- x
For function `bbfit()`, the `x` list, as returned from function `bamlss.frame`, holding all model matrices and other information that is used for fitting the model. For the updating functions, an object as returned from function `smooth.construct` or `smoothCon`. For function `contribplot()`, a `"bamlss"` object fitted using `bbfit()` with argument `select = TRUE`.

- y
The model response, as returned from function `bamlss.frame`.

- family
A bamlss family object, see `family.bamlss`.

- shuffle
Should observations be shuffled?

- start
A named numeric vector containing possible starting values; the names are based on function `parameters`.

- offset
Can be used to supply model offsets for use in fitting, as returned from function `bamlss.frame`.

- epochs
For how many epochs should the algorithm run?

- nbatch
Number of batches. Can also be a number between 0 and 1, i.e., determining the fraction of observations that should be used for fitting.

- verbose
Print information during runtime of the algorithm.

- mc.cores
On how many cores should estimation be started?

- …
For `bbfitp()`, all arguments to be passed to `bbfit()`.
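
To make the roles of `shuffle` and `nbatch` more concrete, the following base R sketch partitions observation indices into batches. It is only an illustration under assumed semantics: the helper `make_batches()` is hypothetical and not part of bamlss, the treatment of a fractional `nbatch` is one possible reading of the description above, and the package's internal batching may differ.

```r
## Hypothetical helper, NOT part of bamlss: split n observation indices
## into batches, optionally shuffling them first.
make_batches <- function(n, nbatch = 10, shuffle = TRUE) {
  idx <- if (shuffle) sample.int(n) else seq_len(n)
  if (nbatch > 0 && nbatch < 1) {
    ## Assumed reading of a fractional nbatch: a single batch that
    ## holds this fraction of the observations.
    return(list(idx[seq_len(floor(nbatch * n))]))
  }
  ## Assign positions 1..n to nbatch groups of nearly equal size.
  grp <- ceiling(seq_len(n) * nbatch / n)
  unname(split(idx, grp))
}

set.seed(123)
batches <- make_batches(n = 27000, nbatch = 10)
lengths(batches)
```

With `epochs = 2`, an algorithm of this kind would typically pass over all batches twice.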

##### Details

The algorithm uses batch-wise estimation of smoothing variances, which are estimated on a hold-out batch. This way, models for very large data sets can be estimated. Note that the algorithm only works in combination with the ff and ffbase packages. The data need to be stored as a comma-separated file on disc; see the example.
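
The idea of tuning smoothness on a hold-out batch can be illustrated with a toy example in base R. This is a sketch only and not the bamlss implementation: it assumes a Gaussian response, uses a polynomial basis as a stand-in for a spline basis, and picks a ridge-type smoothing parameter `lambda` (which here plays the role of an inverse smoothing variance) by squared error on a second, disjoint batch. All names in the block are hypothetical.

```r
## Toy illustration only; this is NOT the bamlss algorithm.
set.seed(1)
n <- 2000
x <- runif(n, -3, 3)
y <- sin(x) + rnorm(n, sd = 0.3)

## Polynomial basis built once for all rows (stand-in for a spline basis).
X <- cbind(1, poly(x, degree = 15))

## Two disjoint batches: one for fitting, one held out.
idx <- sample.int(n)
fit_id <- idx[1:1000]
hold_id <- idx[1001:2000]

## Ridge-type penalized least squares for a given smoothing parameter.
ridge_fit <- function(X, y, lambda) {
  P <- diag(c(0, rep(1, ncol(X) - 1)))  # intercept is left unpenalized
  solve(crossprod(X) + lambda * P, crossprod(X, y))
}

## Choose lambda by mean squared error on the hold-out batch.
lambdas <- 10^seq(-4, 2, length.out = 25)
holdout_mse <- sapply(lambdas, function(l) {
  beta <- ridge_fit(X[fit_id, ], y[fit_id], l)
  mean((y[hold_id] - X[hold_id, ] %*% beta)^2)
})
best_lambda <- lambdas[which.min(holdout_mse)]
best_lambda
```

Because the coefficients are fitted on one batch and the smoothing parameter is judged on another, no step needs the full data set in memory at once.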

##### Value

For function `bbfit()`, a list containing the following objects:

- A named list of the fitted values of the modeled parameters of the selected distribution.
- The estimated set of regression coefficients and smoothing variances.
- Logical.
- The runtime of the algorithm.

##### Examples

```
# NOT RUN {
library("bamlss")

## Simulate data.
set.seed(123)
d <- GAMart(n = 27000, sd = -1)

## Write data to disc as a comma-separated file.
tf <- tempdir()
write.table(d, file.path(tf, "d.raw"), quote = FALSE, row.names = FALSE, sep = ",")

## Estimation using batch-wise backfitting.
f <- list(
  num ~ s(x1, k = 40) + s(x2, k = 40) + s(x3, k = 40) + te(lon, lat, k = 10),
  sigma ~ s(x1, k = 40) + s(x2, k = 40) + s(x3, k = 40) + te(lon, lat, k = 10)
)
b <- bamlss(f, data = file.path(tf, "d.raw"), optimizer = bbfit,
  sampler = FALSE, nbatch = 10, epochs = 2, loglik = TRUE)

## Show estimated effects.
plot(b)
# }
```

*Documentation reproduced from package bamlss, version 1.1-2, License: GPL-2 | GPL-3*