qrnn (version 2.0.5)

qrnn.fit: Main function used to fit a QRNN model or ensemble of QRNN models

Description

Function used to fit a QRNN model or ensemble of QRNN models.

Usage

qrnn.fit(x, y, n.hidden, w=NULL, tau=0.5, n.ensemble=1,
         iter.max=5000, n.trials=5, bag=FALSE, lower=-Inf,
         init.range=c(-0.5, 0.5, -0.5, 0.5), monotone=NULL,
         additive=FALSE, eps.seq=2^seq(-8, -32, by=-4),
         Th=sigmoid, Th.prime=sigmoid.prime, penalty=0,
         unpenalized=NULL, n.errors.max=10, trace=TRUE, ...)

Value

a list containing elements

weights

a list containing fitted weight matrices

lower

left censoring point

eps.seq

sequence of eps values for the finite smoothing algorithm

tau

desired tau-quantile(s)

Th

hidden layer transfer function

x.center

vector of column means for x

x.scale

vector of column standard deviations for x

y.center

vector of column means for y

y.scale

vector of column standard deviations for y

monotone

column indices indicating covariate monotonicity constraints

additive

flag indicating whether additive relationships are forced

Arguments

x

covariate matrix with number of rows equal to the number of samples and number of columns equal to the number of variables.

y

response column matrix with number of rows equal to the number of samples.

n.hidden

number of hidden nodes in the QRNN model.

w

vector of weights with length equal to the number of samples; NULL gives equal weight to each sample.

tau

desired tau-quantile(s).

n.ensemble

number of ensemble members to fit.

iter.max

maximum number of iterations of the optimization algorithm.

n.trials

number of repeated trials used to avoid local minima.

bag

logical variable indicating whether or not bootstrap aggregation (bagging) should be used.

lower

left censoring point.

init.range

initial weight range for input-hidden and hidden-output weight matrices.

monotone

column indices of covariates for which the monotonicity constraint should hold.

additive

force additive relationships.

eps.seq

sequence of eps values for the finite smoothing algorithm.

Th

hidden layer transfer function; use sigmoid, elu, or softplus for a nonlinear model and linear for a linear model.

Th.prime

derivative of the hidden layer transfer function Th.

penalty

weight penalty for weight decay regularization.

unpenalized

column indices of covariates for which the weight penalty should not be applied to input-hidden layer weights.

n.errors.max

maximum number of nlm optimization failures allowed before quitting.

trace

logical variable indicating whether or not diagnostic messages are printed during optimization.

...

additional parameters passed to the nlm optimization routine.

Details

Fit a censored quantile regression neural network model for the tau-quantile by minimizing a cost function based on smooth Huber-norm approximations to the tilted absolute value and ramp functions. Left censoring can be turned on by setting lower to a value greater than -Inf. A simplified form of the finite smoothing algorithm is used to set the QRNN weights and biases: the nlm optimization algorithm is run repeatedly, with the magnitude of the eps approximation tolerance progressively reduced over the sequence eps.seq. Local minima of the cost function can be avoided by setting n.trials, the number of repeated runs from different starting weights and biases, to a value greater than one.
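
For instance, a minimal sketch on synthetic data (the data and object names here are hypothetical, not part of the package examples) fitting a left-censored 0.9-quantile model with repeated trials:

## Hypothetical sketch: response left-censored at zero; n.trials > 1
## repeats fitting from different starting weights to avoid local minima
set.seed(123)
x.cens <- matrix(runif(200), ncol=1)
y.cens <- matrix(pmax(0, 2*x.cens - 0.5 + rnorm(200, sd=0.2)), ncol=1)
fit.cens <- qrnn.fit(x.cens, y.cens, n.hidden=2, tau=0.9, lower=0,
                     iter.max=1000, n.trials=3)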

(Note: if eps.seq is set to a single, sufficiently large value and tau is set to 0.5, the result will be a standard least squares regression model. The same value of eps.seq combined with other values of tau leads to expectile regression.)
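
As a rough illustration, with x and y as in the Examples section below (the choice eps.seq=1 as a "sufficiently large" value is an assumption; recall that eps values are relative to residuals in standard deviation units):

## Sketch: a single large eps with tau=0.5 approximates least squares;
## the same eps with tau=0.9 gives 0.9-expectile regression
fit.ls  <- qrnn.fit(x, y, n.hidden=2, tau=0.5, eps.seq=1,
                    iter.max=500, n.trials=1)
fit.exp <- qrnn.fit(x, y, n.hidden=2, tau=0.9, eps.seq=1,
                    iter.max=500, n.trials=1)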

For a nonlinear model, set the hidden layer transfer function Th to sigmoid, elu, or softplus and its derivative Th.prime to the matching sigmoid.prime, elu.prime, or softplus.prime; for a linear model, use linear and linear.prime.
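
For example (a sketch, again with x and y as in the Examples section):

## Nonlinear model with the softplus hidden layer transfer function
fit.nl  <- qrnn.fit(x, y, n.hidden=3, tau=0.5, Th=softplus,
                    Th.prime=softplus.prime, iter.max=500, n.trials=1)
## Linear quantile regression via linear/linear.prime
fit.lin <- qrnn.fit(x, y, n.hidden=1, tau=0.5, Th=linear,
                    Th.prime=linear.prime, iter.max=500, n.trials=1)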

If specified, the monotone argument enforces non-decreasing behaviour between the given columns of x and the model outputs. This holds if Th and To are monotone non-decreasing functions. In this case, the exp function is applied to the relevant weights following initialization and during optimization; manual adjustment of init.weights or qrnn.initialize may be needed due to differences in scaling of the constrained and unconstrained weights. Non-increasing behaviour can be forced by transforming the relevant covariates, e.g., by reversing their sign.
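
A sketch of both directions (x and y as in the Examples section; constraining the first column is an arbitrary choice here):

## Non-decreasing response in the first covariate
fit.up   <- qrnn.fit(x, y, n.hidden=2, tau=0.5, monotone=1,
                     iter.max=500, n.trials=1)
## Non-increasing response: negate the covariate before fitting
## (remember to negate it again when predicting)
fit.down <- qrnn.fit(-x, y, n.hidden=2, tau=0.5, monotone=1,
                     iter.max=500, n.trials=1)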

The additive argument sets relevant input-hidden layer weights to zero, resulting in a purely additive model. Interactions between covariates are thus suppressed, leading to a compromise in flexibility between linear quantile regression and the quantile regression neural network.
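
A sketch with a hypothetical second covariate added so that the suppressed interaction is meaningful (n.hidden=4 is chosen as a multiple of the number of covariates, which is assumed to suit the additive weight mask):

## Additive QRNN: the relevant input-hidden weights are set to zero,
## suppressing interactions between the columns of x2
x2 <- cbind(x, runif(nrow(x)))
fit.add <- qrnn.fit(x2, y, n.hidden=4, tau=0.5, additive=TRUE,
                    iter.max=500, n.trials=1)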

Borrowing strength by using a composite model for multiple regression quantiles is also possible (see composite.stack). Applying the monotone constraint in combination with the composite model allows one to simultaneously estimate multiple non-crossing quantiles; the resulting monotone composite QRNN (MCQRNN) is demonstrated in mcqrnn.
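
A sketch using the mcqrnn.fit and mcqrnn.predict wrappers, which handle the composite stacking and the monotone constraint on the stacked tau covariate internally (compare the manual composite.stack version in the Examples section):

## MCQRNN: simultaneous, non-crossing 10th/50th/90th percentiles
fit.mc  <- mcqrnn.fit(x, y, tau=c(0.1, 0.5, 0.9), n.hidden=2,
                      iter.max=500, n.trials=1)
pred.mc <- mcqrnn.predict(x, fit.mc)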

In the linear case, model complexity does not depend on the number of hidden nodes; the value of n.hidden is ignored and is instead set to one internally. In the nonlinear case, n.hidden controls the overall complexity of the model. As an added means of avoiding overfitting, weight penalty regularization for the magnitude of the input-hidden layer weights (excluding biases) can be applied by setting penalty to a nonzero value. (For the linear model, this penalizes both input-hidden and hidden-output layer weights, leading to a quantile ridge regression model. In this case, kernel quantile ridge regression can be performed with the aid of the qrnn.rbf function.) Finally, if the bag argument is set to TRUE, models are trained on bootstrapped x and y sample pairs; to perform full bootstrap aggregation (bagging), also set n.ensemble to a value greater than one. Averaging over an ensemble of bagged models will also tend to alleviate overfitting.
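
A sketch combining both safeguards (the penalty value 0.01 and ensemble size 5 are arbitrary illustrative choices):

## Weight-decay penalty plus a small bagged ensemble
fit.ens  <- qrnn.fit(x, y, n.hidden=3, tau=0.5, penalty=0.01,
                     bag=TRUE, n.ensemble=5, iter.max=500, n.trials=1)
## qrnn.predict returns one column per ensemble member; average them
pred.ens <- rowMeans(qrnn.predict(x, fit.ens))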

The gam.style function produces modified generalized additive model effects plots, which are useful for visualizing the modelled covariate-response relationships.
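
A minimal sketch (assuming gam.style takes the covariate matrix, the fitted parameters, and a column index, as in its own help page):

## Effects plot for the first covariate of a fitted model
fit.gs <- qrnn.fit(x, y, n.hidden=2, tau=0.5, iter.max=500, n.trials=1)
gam.style(x, parms=fit.gs, column=1)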

Note: values of x and y need not be standardized or rescaled by the user. All variables are automatically scaled to zero mean and unit standard deviation prior to fitting and parameters are automatically rescaled by qrnn.predict and other prediction functions. Values of eps.seq are relative to the residuals in standard deviation units.

References

Cannon, A.J., 2011. Quantile regression neural networks: implementation in R and application to precipitation downscaling. Computers & Geosciences, 37: 1277-1284. doi:10.1016/j.cageo.2010.07.005

Cannon, A.J., 2018. Non-crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stochastic Environmental Research and Risk Assessment, 32(11): 3207-3225. doi:10.1007/s00477-018-1573-6

See Also

qrnn.predict, qrnn.cost, composite.stack, mcqrnn, gam.style

Examples

library(qrnn)
x <- as.matrix(iris[,"Petal.Length",drop=FALSE])
y <- as.matrix(iris[,"Petal.Width",drop=FALSE])

cases <- order(x)
x <- x[cases,,drop=FALSE]
y <- y[cases,,drop=FALSE]

tau <- c(0.05, 0.5, 0.95)
 
set.seed(1)

## QRNN models for conditional 5th, 50th, and 95th percentiles
w <- p <- vector("list", length(tau))
for(i in seq_along(tau)){
    w[[i]] <- qrnn.fit(x=x, y=y, n.hidden=3, tau=tau[i],
                       iter.max=200, n.trials=1)
    p[[i]] <- qrnn.predict(x, w[[i]])
}

## Monotone composite QRNN (MCQRNN) for simultaneous estimation of
## multiple non-crossing quantile functions
x.y.tau <- composite.stack(x, y, tau)
fit.mcqrnn <- qrnn.fit(cbind(x.y.tau$tau, x.y.tau$x), x.y.tau$y,
                       tau=x.y.tau$tau, n.hidden=3, n.trials=1,
                       iter.max=500, monotone=1)
pred.mcqrnn <- matrix(qrnn.predict(cbind(x.y.tau$tau, x.y.tau$x),
                      fit.mcqrnn), ncol=length(tau))

par(mfrow=c(1, 2))
matplot(x, matrix(unlist(p), nrow=nrow(x), ncol=length(p)), col="red",
        type="l")
points(x, y)
matplot(x, pred.mcqrnn, col="blue", type="l")
points(x, y)
