psSignal: Smooth signal (multivariate calibration) regression using P-splines.

Description

Smooth signal (multivariate calibration) regression using P-splines.

Usage

psSignal(
  y,
  x_signal,
  x_index = c(1:ncol(x_signal)),
  nseg = 10,
  bdeg = 3,
  pord = 3,
  lambda = 1,
  wts = 1 + 0 * y,
  family = "gaussian",
  link = "default",
  m_binomial = 1 + 0 * y,
  r_gamma = wts,
  y_predicted = NULL,
  x_predicted = x_signal,
  ridge_adj = 0,
  int = TRUE
)

Value

coef: a vector of length n containing the estimated P-spline coefficients.
mu: a vector of length m containing the estimated means.
eta: a vector of length m containing the estimated linear predictors.
B: the B-spline basis (for the coefficients), with dimension p by n.
deviance: the deviance of fit.
eff_df: the approximate effective dimension of fit.
aic: AIC.
df_resid: approximate df residual.
beta: a vector of length p, containing estimated smooth signal coefficients.
std_beta: a vector of length p, containing standard errors of smooth signal coefficients.
cv: leave-one-out standard error of prediction, when family = "gaussian".
cv_predicted: standard error of prediction for y_predicted, when family = "gaussian", NULL otherwise.
nseg: the number of evenly spaced B-spline segments.
bdeg: the degree of B-splines.
pord: the order of the difference penalty.
lambda: the positive tuning parameter.
family: the family of the response.
link: the link function.
y_intercept: the estimated y-intercept (when int = TRUE).
int: a logical indicating whether a y-intercept is included in the model.
dispersion_param: estimate of dispersion, Dev/df_resid.
summary_predicted: predictions on the response scale (inverse link), with bands of plus or minus twice the standard error.
eta_predicted: estimated linear predictor of length(y).
press_mu: leave-one-out prediction of mean, when family = "gaussian", NULL otherwise.
bin_percent_correct: percent correct classification based on a 0.5 cut-off, when family = "binomial", NULL otherwise.
x_index: a vector of length ncol(x_signal) == p, associated with the ordering of the signal.
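
For the gaussian family with identity link, quantities such as eff_df, press_mu and cv are commonly derived from the hat matrix of a penalized least squares fit. The base R sketch below illustrates that idea on simulated data. It is not taken from the JOPS source: the function name pls_diagnostics, the inputs U and P, and the exact normalization of cv are assumptions made for illustration.

```r
# Hat-matrix diagnostics for a penalized least squares fit (illustrative sketch).
# U is an m by n regressor matrix and P a fixed n by n penalty matrix;
# both names are hypothetical and not part of the psSignal interface.
pls_diagnostics <- function(U, y, P) {
  G <- solve(crossprod(U) + P, t(U))   # n by m, so that H = U %*% G
  h <- rowSums(U * t(G))               # diagonal of the hat matrix H
  mu <- as.vector(U %*% (G %*% y))     # fitted means
  press_mu <- y - (y - mu) / (1 - h)   # leave-one-out predictions
  list(
    mu = mu,
    eff_df = sum(h),                   # trace of the hat matrix
    press_mu = press_mu,
    cv = sqrt(mean((y - press_mu)^2))  # assumed form of the leave-one-out error
  )
}

# Simulated example with a second-order difference penalty
set.seed(2)
m <- 30
n <- 8
U <- matrix(rnorm(m * n), m, n)
y <- as.vector(U %*% seq(-1, 1, length.out = n) + rnorm(m, sd = 0.2))
D <- diff(diag(n), differences = 2)
P <- 0.5 * crossprod(D)
diag_out <- pls_diagnostics(U, y, P)
```

The shortcut (y - mu) / (1 - h) avoids refitting the model m times. It is exact for a linear smoother with a fixed penalty, which is the setting assumed here.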

Arguments

y: a (glm) response vector, usually continuous, binomial or count data.
x_signal: a matrix of continuous regressors with nrow(x_signal) == length(y), often a discrete digitization of a signal, histogram, or time series.
x_index: a vector of length ncol(x_signal) == p, associated with the ordering index of the signal. Default is 1:ncol(x_signal).
nseg: the number of evenly spaced segments between xl and xr (default 10).
bdeg: the degree of the basis, usually 1, 2, or 3 (default).
pord: the order of the difference penalty, usually 1, 2, or 3 (default).
lambda: the (positive) tuning parameter for the penalty (default 1).
wts: the weight vector, of the same length as y; default is a vector of ones.
family: the response distribution, e.g. "gaussian", "binomial", "poisson", "Gamma"; quotes are needed. Default is "gaussian".
link: the link function, one of "identity", "log", "sqrt", "logit", "probit", "cloglog", "loglog", "reciprocal"; quotes are needed (default "identity").
m_binomial: a vector of binomial trials, of the same length as y; default is a vector of ones for family = "binomial", NULL otherwise.
r_gamma: a vector of gamma shape parameters. Default is a vector of ones for family = "Gamma", NULL otherwise.
y_predicted: a vector of responses associated with x_predicted which are used to calculate standard error of external prediction. Default is NULL.
x_predicted: a matrix of external signals used for external prediction. Default is x_signal.
ridge_adj: a ridge penalty tuning parameter, which can be set to a small value, e.g. 1e-8, to stabilize estimation (default 0).
int: set to TRUE or FALSE to include intercept term in linear predictor (default TRUE).
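
To show how nseg, bdeg, pord and lambda interact, here is a minimal base R sketch of P-spline signal regression for the gaussian case with identity link, run on simulated data. It is an illustration under stated assumptions, not the psSignal implementation: the helper names tpower, bspline_basis and sketch_signal_fit are hypothetical, and weights, other families, the ridge adjustment and prediction are all left out.

```r
# Truncated power function, used to build B-splines by differencing
tpower <- function(x, t, p) (x - t)^p * (x > t)

# Equally spaced B-spline basis on [xl, xr]; length(x) by (nseg + bdeg)
bspline_basis <- function(x, xl = min(x), xr = max(x), nseg = 10, bdeg = 3) {
  dx <- (xr - xl) / nseg
  knots <- xl + dx * (-bdeg:(nseg + bdeg))
  P <- outer(x, knots, tpower, bdeg)
  D <- diff(diag(length(knots)), differences = bdeg + 1) /
    (gamma(bdeg + 1) * dx^bdeg)
  (-1)^(bdeg + 1) * P %*% t(D)
}

# Penalized least squares for y = intercept + x_signal %*% beta + error,
# with the smooth coefficient vector written as beta = B %*% alpha
sketch_signal_fit <- function(y, x_signal, x_index = seq_len(ncol(x_signal)),
                              nseg = 10, bdeg = 3, pord = 3, lambda = 1) {
  B <- bspline_basis(x_index, nseg = nseg, bdeg = bdeg)  # p by n
  U <- cbind(1, x_signal %*% B)                          # m by (n + 1)
  n <- ncol(B)
  D <- diff(diag(n), differences = pord)                 # difference penalty
  P <- matrix(0, n + 1, n + 1)
  P[-1, -1] <- lambda * crossprod(D)                     # intercept unpenalized
  theta <- solve(crossprod(U) + P, crossprod(U, y))
  alpha <- theta[-1]
  list(intercept = theta[1], coef = alpha, B = B,
       beta = as.vector(B %*% alpha), mu = as.vector(U %*% theta))
}

# Simulated signals with a smooth true coefficient vector
set.seed(1)
m <- 40
p <- 60
x_index <- seq(0, 1, length.out = p)
x_signal <- matrix(rnorm(m * p), m, p)
beta_true <- sin(2 * pi * x_index)
y <- as.vector(2 + x_signal %*% beta_true + rnorm(m, sd = 0.1))
fit <- sketch_signal_fit(y, x_signal, x_index, nseg = 10, pord = 2, lambda = 0.01)
```

In this sketch the basis has nseg + bdeg columns, so only that many coefficients are estimated no matter how many columns x_signal has. Larger values of lambda pull the coefficients toward a polynomial of degree pord - 1.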

Author

Brian Marx

Details

Support functions needed: pspline_fitter, bbase and pspline_checker.

References

Marx, B.D. and Eilers, P.H.C. (1999). Generalized linear regression for sampled signals and curves: A P-spline approach. Technometrics, 41(1): 1-13.

Eilers, P.H.C. and Marx, B.D. (2021). Practical Smoothing, The Joys of P-splines. Cambridge University Press.

Examples


library(JOPS)
# Get the data
library(fds)
data(nirc)
iindex <- nirc$x
X <- nirc$y
sel <- 50:650 # 1200 <= x & x<= 2400
X <- X[sel, ]
iindex <- iindex[sel]
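dX <- diff(X) # first differences along the index (rows of X)
diindex <- iindex[-1] # index values matching the differenced signal
y <- as.vector(labc[1, 1:40]) # percent fat
oout <- 23 # observation excluded from the fit
dX <- t(dX[, -oout]) # drop that observation; rows become observations
y <- y[-oout]
fit1 <- psSignal(y, dX, diindex, nseg = 25, bdeg = 3, lambda = 0.0001,
  pord = 2, family = "gaussian", link = "identity", x_predicted = dX, int = TRUE)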
plot(fit1, xlab = "Coefficient Index", ylab = "ps Smooth Coeff")
title(main = "25 B-spline segments with tuning = 0.0001")
names(fit1)
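
Because the returned cv component is the leave-one-out standard error of prediction for the gaussian family, one possible way to choose lambda is to compare cv over a grid. The sketch below does this on simulated data rather than the nirc data. The grid, the simulated signals and the variable names are assumptions for illustration only, and the loop runs only if JOPS is installed.

```r
# Hypothetical tuning loop on simulated data; requires the JOPS package.
set.seed(3)
m <- 40
p <- 60
x_signal <- matrix(rnorm(m * p), m, p)
beta_true <- sin(2 * pi * seq(0, 1, length.out = p))
y <- as.vector(2 + x_signal %*% beta_true + rnorm(m, sd = 0.1))

lambdas <- 10^seq(-4, 2, by = 1) # illustrative grid, not a recommendation
best_lambda <- NULL
if (requireNamespace("JOPS", quietly = TRUE)) {
  # Refit for each lambda and collect the leave-one-out error
  cv_values <- sapply(lambdas, function(lam) {
    fit <- JOPS::psSignal(y, x_signal, nseg = 10, pord = 2, lambda = lam)
    fit$cv
  })
  best_lambda <- lambdas[which.min(cv_values)]
}
```

A finer grid around the selected value can then be searched in the same way.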
