quad_spline_est_kn: AIC (or BIC) criterion for choosing the number of knots for quadratic splines

Description

Computes the optimal number of knots for the constrained quadratic spline fit proposed by Daouia, Noh and Park (2013), by minimizing an AIC or BIC criterion over a range of candidate values.

Usage

quad_spline_est_kn(xtab, ytab, x, cv, krange = 1:20, type = "AIC")

Arguments

xtab

a numeric vector containing the observed inputs $x_1,\ldots,x_n$

ytab

a numeric vector of the same length as xtab containing the observed outputs $y_1,\ldots,y_n$

x

a numeric vector of evaluation points at which the estimator is to be computed

cv

an integer equal to 0 (monotonicity constraint only) or 1 (both monotonicity and concavity constraints)

krange

a vector of integers specifying the candidate numbers of knots at which the spline estimate will be computed

type

a character string equal to "AIC" or "BIC", selecting the information criterion to minimize

Value

Returns an integer: the number of knots in krange that minimizes the selected criterion.

Details

For the implementation of the monotone quadratic spline smoother $\hat\varphi_n$, Daouia et al. (2013) first suggest using the set of knots $\{ t_j = \mathcal{X}_{[j \mathcal{N}/k_n]},~j=1,\ldots,k_n-1 \}$ among the FDH points $(\mathcal{X}_{\ell},\mathcal{Y}_{\ell})$, $\ell=1,\ldots,\mathcal{N}$ (see quad_spline_est). Because the number of knots $k_n$ determines the complexity of the spline approximation, its choice may be viewed as a model selection problem, solved by minimizing one of the following two information criteria:

$$AIC(k) = \log \left( \sum_{i=1}^{n} |y_i- \hat \varphi_n(x_i)|\right) + 2(k+2)/n,$$

$$BIC(k) = \log \left( \sum_{i=1}^{n} |y_i- \hat \varphi_n(x_i)|\right) + \log n \cdot (k+2)/n.$$

The first criterion (option type = "AIC") is similar to the Akaike information criterion (Akaike, 1973) and the second (option type = "BIC") to the Bayesian information criterion (Schwarz, 1978). A small number of knots is typically needed, as suggested by the asymptotic theory.

For the implementation of the monotone and concave spline estimator $\hat\varphi^{\star}_n$, the same scheme applies, with the FDH points $(\mathcal{X}_{\ell},\mathcal{Y}_{\ell})$ replaced by the DEA points $(\mathcal{X}^*_{\ell},\mathcal{Y}^*_{\ell})$ (see dea_est).
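The two criteria can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper info_criterion is hypothetical, and the fitted values are made-up numbers standing in for $\hat\varphi_n(x_i)$ at three candidate values of $k$. In practice the fitted values would come from the spline estimator itself.

```r
# Hypothetical helper (not part of the package): evaluates the criterion
# for one candidate k, given observed outputs y and fitted frontier values.
info_criterion <- function(y, fitted, k, type = c("AIC", "BIC")) {
  type <- match.arg(type)
  n <- length(y)
  # Penalty term: 2(k+2)/n for AIC, log(n)(k+2)/n for BIC
  penalty <- if (type == "AIC") 2 * (k + 2) / n else log(n) * (k + 2) / n
  # Log of the summed absolute residuals plus the complexity penalty
  log(sum(abs(y - fitted))) + penalty
}

# Toy data: observed outputs and invented fitted values for k = 1, 2, 3.
y <- c(1.0, 1.5, 2.1, 2.4, 3.0)
fits <- list(
  `1` = c(1.4, 1.9, 2.4, 2.9, 3.4),
  `2` = c(1.2, 1.7, 2.2, 2.6, 3.1),
  `3` = c(1.2, 1.7, 2.2, 2.6, 3.05)
)
krange <- as.integer(names(fits))

# Evaluate both criteria over the candidate range and keep the minimizer.
aic <- mapply(function(f, k) info_criterion(y, f, k, "AIC"), fits, krange)
bic <- mapply(function(f, k) info_criterion(y, f, k, "BIC"), fits, krange)
best_k_aic <- krange[which.min(aic)]
best_k_bic <- krange[which.min(bic)]
```

In this toy setting the fit improves sharply from one to two knots but only marginally from two to three, so the penalty term outweighs the gain and both criteria select the intermediate value.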

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle, in Second International Symposium of Information Theory, eds. B. N. Petrov and F. Csaki, Budapest: Akademia Kiado, 267--281.

Daouia, A., Noh, H. and Park, B.U. (2013). Data Envelope Fitting with Constrained Polynomial Splines. TSE Working Paper, http://www.tse-fr.eu/images/doc/wp/etrie/wp_tse_449.pdf.

Schwarz, G. (1978). Estimating the dimension of a model, Annals of Statistics, 6, 461--464.

Examples

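data("green")
# Evaluation grid spanning the observed range of log-costs
x <- seq(min(log(green$COST)), max(log(green$COST)), length.out=1001)
# Optimal number of knots under monotonicity and concavity (cv=1), by AIC
quad_spline_est_kn(log(green$COST), log(green$OUTPUT), x, cv=1, type="AIC")
# Same selection using the BIC penalty
quad_spline_est_kn(log(green$COST), log(green$OUTPUT), x, cv=1, type="BIC")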