VGAM (version 1.0-4)

# s: Defining Smooths in VGAM Formulas

## Description

`s` is used in the definition of (vector) smooth terms within `vgam` formulas. This corresponds to 1st-generation VGAMs that use backfitting for their estimation. The effective degrees of freedom is prespecified.

## Usage

`s(x, df = 4, spar = 0, ...)`

## Arguments

x

covariate (abscissae) to be smoothed. Note that `x` must be a single variable and not a function of a variable. For example, `s(x)` is fine but `s(log(x))` will fail. In this case, let `logx <- log(x)` (in the data frame), say, and then use `s(logx)`. At this stage bivariate smoothers (`x` would be a two-column matrix) are not implemented.
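The workaround described above can be sketched as follows. This is a minimal illustration on simulated data; the data frame `pdata`, its columns, and the choice of `poissonff` are hypothetical and not part of the original help page.

```r
library(VGAM)

set.seed(1)
# Hypothetical simulated data: a Poisson response whose mean depends on log(x)
pdata <- data.frame(x = runif(200, min = 1, max = 10))
pdata <- transform(pdata, y = rpois(200, lambda = exp(0.5 + sin(log(x)))))

# Store the transformed covariate as its own column in the data frame ...
pdata <- transform(pdata, logx = log(x))

# ... and smooth that single variable, rather than writing s(log(x))
fit <- vgam(y ~ s(logx, df = 3), poissonff, data = pdata)
```

Keeping `logx` in the data frame also means that later calls such as `predict()` with new data need a `logx` column rather than `x`.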

df

numerical vector of length \(r\). Effective degrees of freedom: must lie between 1 (linear fit) and \(n\) (interpolation). Thus one could say that `df-1` is the effective nonlinear degrees of freedom (ENDF) of the smooth. Recycling of values will be used if `df` is not of length \(r\). If `spar` is positive then this argument is ignored. Thus `s()` means that the effective degrees of freedom is prespecified. If it is known that the component function(s) are more wiggly than usual then try increasing the value of this argument.
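As a hedged illustration of a vector-valued `df`, the sketch below fits two responses at once so that there are two component functions for `altitude`, each with its own effective degrees of freedom. It assumes the `hunua` data shipped with VGAM and that `binomialff()` accepts `multiple.responses = TRUE` in the installed version.

```r
library(VGAM)

# Two binary responses, so M = 2 additive predictors.
# df = c(2, 3): the first component function gets 2 effective degrees of
# freedom and the second gets 3.
fit <- vgam(cbind(agaaus, kniexc) ~ s(altitude, df = c(2, 3)),
            binomialff(multiple.responses = TRUE), data = hunua)

# One column of coefficients per additive predictor
coef(fit, matrix = TRUE)
```

Supplying a single value such as `df = 2` would, by the recycling rule above, apply the same effective degrees of freedom to both component functions.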

spar

numerical vector of length \(r\). Positive smoothing parameters (after scaling). Larger values mean more smoothing so that the solution approaches a linear fit for that component function. A zero value means that `df` is used. Recycling of values will be used if `spar` is not of length \(r\).

...

Ignored for now.

## Value

A vector with attributes that are (only) used by `vgam`.
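A small sketch of what this means in practice: calling `s` directly returns the covariate values unchanged, with extra attributes attached. The variable names below are hypothetical, and the exact attribute names are internal to VGAM, so the code only inspects them rather than relying on them.

```r
library(VGAM)

x2 <- seq(0, 1, length.out = 10)   # hypothetical covariate
sx <- s(x2, df = 3)

# Inspect the bookkeeping attributes that vgam() reads later
str(attributes(sx))

# Stripping the attributes recovers the original values: no smoothing occurred
all.equal(as.vector(sx), x2)
```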

## Details

In this help file \(M\) is the number of additive predictors and \(r\) is the number of component functions to be estimated (so that \(r\) is an element from the set {1,2,…,\(M\)}). Also, if \(n\) is the number of distinct abscissae, then `s` will fail if \(n < 7\).

`s`, which is symbolic and does not perform any smoothing itself, only handles a single covariate. Note that `s` works in `vgam` only. It has no effect in `vglm` (actually, it is similar to the identity function `I` so that `s(x2)` is the same as `x2` in the LM model matrix). It differs from the `s()` of the gam package and the `s` of the mgcv package; they should not be mixed together. Also, terms involving `s` should be simple additive terms, and not involving interactions and nesting etc. For example, `myfactor:s(x2)` is not a good idea.
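The statement that `s` has no effect in `vglm` can be checked with a sketch like the one below, which fits the same model with and without `s()` and compares the estimates. The simulated data frame `vdata` and the use of `poissonff` are assumptions made purely for illustration.

```r
library(VGAM)

set.seed(2)
# Hypothetical simulated data with a log-linear mean
vdata <- data.frame(x2 = runif(100))
vdata <- transform(vdata, y = rpois(100, lambda = exp(1 + x2)))

# In vglm(), s(x2) should enter the model matrix as a plain linear term
fit.s   <- vglm(y ~ s(x2), poissonff, data = vdata)
fit.lin <- vglm(y ~ x2,    poissonff, data = vdata)

# Compare the estimates side by side (only the coefficient names differ)
cbind(with.s = unname(coef(fit.s)), plain = unname(coef(fit.lin)))
```

To obtain an actual smooth of `x2`, the same formula would be passed to `vgam()` instead.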

## References

Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481--493.

## See Also

`vgam`, `is.buggy`, `sm.os`, `sm.ps`, `vsmooth.spline`.

## Examples

```
library(VGAM)

# Nonparametric logistic regression
fit1 <- vgam(agaaus ~ s(altitude, df = 2), binomialff, data = hunua)
plot(fit1, se = TRUE)

# Bivariate logistic model with artificial data
nn <- 300
bdata <- data.frame(x1 = runif(nn), x2 = runif(nn))
bdata <- transform(bdata,
  y1 = rbinom(nn, size = 1, prob = logit(sin(2 * x2), inverse = TRUE)),
  y2 = rbinom(nn, size = 1, prob = logit(sin(2 * x2), inverse = TRUE)))
# x1 enters linearly; x2 is smoothed with 3 effective degrees of freedom
fit2 <- vgam(cbind(y1, y2) ~ x1 + s(x2, df = 3), trace = TRUE,
             binom2.or(exchangeable = TRUE), data = bdata)
coef(fit2, matrix = TRUE)  # Hard to interpret
# Plot only the smooth term for x2, with standard-error bands
plot(fit2, se = TRUE, which.term = 2, scol = "blue")
```
