GAMsetup(G)

Sets up the design matrix, penalty matrices and related index information needed for GAM fitting by mgcv(). This is usually called by gam.
GAMsetup takes a single argument: a list G containing the following elements.

G$m : the number of smooth terms in the model.

G$nsdf : the number of user supplied columns of the design matrix for the parametric part of the model.

G$df : an array of G$m integers specifying the maximum d.f. for each spline term.

G$dim : G$dim[i] is the number of covariates that smooth i is a function of.

G$n : the number of data.

G$x : an array whose rows are n element arrays of data and (optionally) design matrix columns. The first G$nsdf rows of G$x should contain the columns of the design matrix corresponding to the parametric part of the model; the remaining rows are the covariates of the smooth terms.

G$s.type : an array giving the type of each smooth term: 0 for a cubic regression spline, 1 for a t.p.r.s.

G$p.order : an array giving the order of the penalty for each term (0 for automatic selection).

G$by : a 2-d array of by variables (i.e. covariates that multiply a smooth term): by[i,j] is the jth value for the ith by variable. There are only as many rows of this array as there are by variables in the model.

G$by.exists : by.exists[i] is TRUE if the ith smooth has a by variable associated with it, FALSE otherwise.

G$n.knots : an array giving the number of user supplied knots for each smooth term (0 indicates no user supplied knots).

G$knots : user supplied knot locations, in the order corresponding to the row order of G$x. There are G$dim[i] arrays of length G$n.knots[i] for the ith smooth, and all these arrays are packed end to end in this single 1-d array.
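As an illustration of this end-to-end packing (a sketch only; the knot.list structure below is hypothetical and not part of GAMsetup's interface):

```r
# Sketch: knot.list is a hypothetical per-term structure, where
# knot.list[[i]] holds G$dim[i] arrays, each of length n.knots[i].
knot.list <- list(list(c(0.1, 0.5, 0.9)),           # smooth 1: 1 covariate, 3 knots
                  list(c(0.2, 0.8), c(0.3, 0.7)))   # smooth 2: 2 covariates, 2 knots
n.knots <- sapply(knot.list, function(kl) length(kl[[1]]))
knots <- unlist(knot.list)   # all arrays packed end to end, as G$knots expects
```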
Value: a list H, containing the elements of G (the input list) plus the following.

H$S : a 1-d array containing the elements of the penalty matrices, packed end to end. Let start[1]<-0 and start[k+1]<-start[k]+H$df[k]^2. Then penalty matrix k has H$S[start[k]+i+H$df[k]*(j-1)] on its ith row and jth column. To get the kth full penalty matrix, the matrix so obtained would be inserted into a full matrix of zeroes with its 1,1 element at H$off[k],H$off[k].

H$off : an array of offsets locating each term's block within the penalty matrices and the overall parameter vector (e.g. the first parameter of the ith smooth is p[off[i]+1]
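As a sketch of this storage convention, the kth penalty matrix could be unpacked and embedded at its offset as follows (unpack.penalty is a hypothetical helper, not part of the package; it assumes off holds zero-based offsets, consistent with p[off[i]+1] above):

```r
# Hypothetical helper: unpack the kth penalty matrix from the packed
# 1-d array H$S and embed it in a q x q matrix of zeroes, with its
# 1,1 element at row/column H$off[k] + 1 (off[] taken as zero-based).
unpack.penalty <- function(H, k, q) {
  start <- c(0, cumsum(H$df^2))            # start[k] is term k's offset into S
  Sk <- matrix(0, H$df[k], H$df[k])
  for (j in 1:H$df[k]) for (i in 1:H$df[k])
    Sk[i, j] <- H$S[start[k] + i + H$df[k] * (j - 1)]
  S.full <- matrix(0, q, q)                # full penalty in coefficient space
  ind <- H$off[k] + 1:H$df[k]
  S.full[ind, ind] <- Sk
  S.full
}

# e.g. one 2 x 2 penalty, offset 1, in a 4-coefficient model:
H <- list(df = 2, S = c(1, 2, 3, 4), off = 1)
S1 <- unpack.penalty(H, 1, 4)   # S1[2:3, 2:3] holds the packed matrix
```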
).

H$UZ : the matrices transforming from the t.p.r.s. basis to the equivalent t.p.s. basis, packed end to end in a 1-d array. Let start[1]<-0 and start[k+1]<-start[k]+(M[k]+n)*tp.bs[k], where n is the number of data, M[k] is the penalty null space dimension and tp.bs[k] is zero for a cubic regression spline and the basis dimension for a t.p.r.s. Then element i,j of the UZ matrix for model term k is UZ[start[k]+i+(j-1)*(M[k]+n)].
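The same start-offset arithmetic can recover a term's UZ matrix (a sketch; unpack.UZ and its explicit M, tp.bs arguments are hypothetical, not part of the package's interface):

```r
# Hypothetical helper: recover the UZ matrix for model term k from the
# packed 1-d array UZ. M[k] is the penalty null space dimension and
# tp.bs[k] the t.p.r.s. basis dimension (0 for a cubic regression spline).
unpack.UZ <- function(UZ, k, M, tp.bs, n) {
  start <- c(0, cumsum((M + n) * tp.bs))   # start[k]: offset of term k
  nr <- M[k] + n                           # rows of term k's UZ matrix
  nc <- tp.bs[k]                           # columns; 0 means no UZ matrix
  if (nc == 0) return(NULL)                # cubic regression spline term
  matrix(UZ[start[k] + 1:(nr * nc)], nr, nc)  # column-major, as packed
}

# e.g. one t.p.r.s. term with M = 1, basis dimension 2 and n = 2 data:
Z <- unpack.UZ(1:6, 1, M = 1, tp.bs = 2, n = 2)  # a 3 x 2 matrix
```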
H$Xu : the set of unique covariate combinations for each term, packed end to end in a 1-d array. Let start[1]<-0 and start[k+1]<-start[k]+(xu.length[k])*tp.dim[k], where xu.length[k] is the number of unique covariate combinations and tp.dim[k] is zero for a cubic regression spline and the dimension of the smooth (i.e. the number of covariates it is a function of) for a t.p.r.s. Then element i,j of the Xu matrix for model term k is Xu[start[k]+i+(j-1)*(xu.length[k])].
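Likewise for Xu (a sketch; the helper unpack.Xu is hypothetical, with xu.length and tp.dim passed explicitly as described above):

```r
# Hypothetical helper: recover the matrix of unique covariate
# combinations for term k from the packed 1-d array Xu.
unpack.Xu <- function(Xu, k, xu.length, tp.dim) {
  start <- c(0, cumsum(xu.length * tp.dim))   # start[k]: offset of term k
  nr <- xu.length[k]                          # unique covariate combinations
  nc <- tp.dim[k]                             # 0 for a cubic regression spline
  if (nc == 0) return(NULL)
  matrix(Xu[start[k] + 1:(nr * nc)], nr, nc)  # element i,j at start[k]+i+(j-1)*nr
}

# e.g. a 2-covariate t.p.r.s. term with 2 unique covariate combinations:
U <- unpack.Xu(1:4, 1, xu.length = 2, tp.dim = 2)  # a 2 x 2 matrix
```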
References: Wood, S.N. (2003) Thin plate regression splines. JRSSB, in press.
See Also: mgcv, gam
# This example modified from routine SANtest()
set.seed(0)
n<-100 # number of observations to simulate
x <- runif(5 * n, 0, 1) # simulate covariates
x <- array(x, dim = c(5, n)) # put into array for passing to GAMsetup
pi <- asin(1) * 2 # begin simulating some data
y <- 2 * sin(pi * x[2, ])
y <- y + exp(2 * x[3, ]) - 3.75887
y <- y + 0.2 * x[4, ]^11 * (10 * (1 - x[4, ]))^6 + 10 * (10 *
x[4, ])^3 * (1 - x[4, ])^10 - 1.396
sig2 <- -1 # error variance; a negative value signals that GCV should be used
e <- rnorm(n, 0, sqrt(abs(sig2)))
y <- y + e # simulated data
w <- matrix(1, n, 1) # weight matrix
par(mfrow = c(2, 2)) # scatter plots of simulated data
plot(x[2, ], y);plot(x[3, ], y);plot(x[4, ], y);plot(x[5, ], y)
x[1,]<-1
# create list for passing to GAMsetup....
G <- list(m = 4, n = n, nsdf = 0, df = c(15, 15, 15, 15),dim=c(1,1,1,1),
s.type=c(0,0,0,0),by=0,by.exists=c(FALSE,FALSE,FALSE,FALSE),
p.order=c(0,0,0,0),x = x,n.knots=rep(0,4))
H <- GAMsetup(G)
H$y <- y # add data to H
H$sig2 <- sig2 # add variance (signalling GCV use in this case) to H
H$w <- w # add weights to H
H$sp<-array(-1,H$m)
H$fix<-array(FALSE,H$m)
H$conv.tol<-1e-6;H$max.half<-15
H$min.edf<-5;H$fixed.sp<-0
H <- mgcv(H) # select smoothing parameters and fit model