`frscvNOMAD` computes NOMAD-based (Nonsmooth Optimization by Mesh
Adaptive Direct Search, Abramson, Audet, Couture and Le Digabel (2011))
cross-validation directed search for a regression spline estimate of a
one (1) dimensional dependent variable on an `r`-dimensional vector of
continuous predictors and nominal/ordinal (`factor`/`ordered`)
predictors.

```
frscvNOMAD(xz,
           y,
           degree.max = 10,
           segments.max = 10,
           degree.min = 0,
           segments.min = 1,
           cv.df.min = 1,
           complexity = c("degree-knots","degree","knots"),
           knots = c("quantiles","uniform","auto"),
           basis = c("additive","tensor","glp","auto"),
           cv.func = c("cv.ls","cv.gcv","cv.aic"),
           degree = degree,
           segments = segments,
           include = include,
           random.seed = 42,
           max.bb.eval = 10000,
           initial.mesh.size.integer = "1",
           min.mesh.size.integer = "1",
           min.poll.size.integer = "1",
           opts = list(),
           nmulti = 0,
           tau = NULL,
           weights = NULL,
           singular.ok = FALSE)
```

xz

continuous and/or nominal/ordinal (`factor`/`ordered`) predictors

y

continuous univariate vector

degree.max

the maximum degree of the B-spline basis for
each of the continuous predictors (default `degree.max=10`)

segments.max

the maximum segments of the B-spline basis for
each of the continuous predictors (default `segments.max=10`)

degree.min

the minimum degree of the B-spline basis for
each of the continuous predictors (default `degree.min=0`)

segments.min

the minimum segments of the B-spline basis for
each of the continuous predictors (default `segments.min=1`)

cv.df.min

the minimum degrees of freedom to allow when
conducting cross-validation (default `cv.df.min=1`)

complexity

a character string (default `complexity="degree-knots"`) indicating
whether model ‘complexity’ is determined by the degree of the spline or
by the number of segments (‘knots’). This option allows the user
to use cross-validation to select either the spline degree (number
of knots held fixed), the number of knots (spline degree held
fixed), or both the spline degree and the number of knots

knots

a character string (default `knots="quantiles"`) specifying where
knots are to be placed. ‘quantiles’ specifies knots placed at equally
spaced quantiles (an equal number of observations lies in each segment)
and ‘uniform’ specifies knots placed at equally spaced intervals. If
`knots="auto"`, the knot type will be automatically determined by
cross-validation

basis

a character string (default `basis="additive"`) indicating whether the
additive or tensor product B-spline basis matrix for a multivariate
polynomial spline or generalized B-spline polynomial basis should be
used. Note this can be automatically determined by cross-validation if
`cv=TRUE` and `basis="auto"`, and is an ‘all or none’ proposition
(i.e. interaction terms for all predictors or for no predictors
given the nature of ‘tensor products’). Note also that if
there is only one predictor this defaults to `basis="additive"`
to avoid unnecessary computation as the spline bases are equivalent
in this case

cv.func

a character string (default `cv.func="cv.ls"`) indicating which method
to use to select smoothing parameters. `cv.gcv` specifies generalized
cross-validation (Craven and Wahba (1979)), `cv.aic` specifies expected
Kullback-Leibler cross-validation (Hurvich, Simonoff, and Tsai
(1998)), and `cv.ls` specifies least-squares cross-validation

degree

integer/vector specifying the degree of the B-spline
basis for each dimension of the continuous `x`

segments

integer/vector specifying the number of segments of
the B-spline basis for each dimension of the continuous `x`
(i.e. number of knots minus one)

include

integer/vector for the categorical predictors. If it is not NULL, it will be the initial value for the fitting

random.seed

when it is not missing and not equal to 0, the initial points will
be generated using this seed when `nmulti > 0`

max.bb.eval

argument passed to the NOMAD solver (see `snomadr` for
further details)

initial.mesh.size.integer

argument passed to the NOMAD solver (see `snomadr` for
further details)

min.mesh.size.integer

argument passed to the NOMAD solver (see `snomadr` for
further details)

min.poll.size.integer

argument passed to the NOMAD solver (see `snomadr` for
further details)

opts

list of optional arguments to be passed to `snomadr`

nmulti

integer number of times to restart the process of finding extrema of
the cross-validation function from different (random) initial
points (default `nmulti=0`)

tau

if non-null a number in (0,1) denoting the quantile for which a quantile
regression spline is to be estimated rather than estimating the
conditional mean (default `tau=NULL`)

weights

an optional vector of weights to be used in the fitting process. Should be `NULL` or a numeric vector. If non-NULL, weighted least squares is used with weights `weights` (that is, minimizing `sum(w*e^2)`); otherwise ordinary least squares is used.

singular.ok

a logical value (default `singular.ok=FALSE`) that, when `FALSE`,
discards singular bases during cross-validation (a check
for ill-conditioned bases is performed).
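The `knots` and `cv.func` arguments above can be illustrated with a small base-R sketch. It is not the package's implementation: it assumes the `splines` package (shipped with R) as a stand-in for the GSL B-spline routines, uses a hypothetical helper `cv.criteria`, and writes the three criteria in their common textbook forms, which may differ in detail from what `frscvNOMAD` evaluates internally.

```r
library(splines)  # stand-in for the package's GSL-based B-spline routines

set.seed(42)
n <- 200
x <- runif(n)
y <- cos(2 * pi * x) + rnorm(n, sd = 0.1)

degree   <- 3
segments <- 5

# segments + 1 knots in total; the first and last are the boundary knots
probs <- seq(0, 1, length.out = segments + 1)
knots.quantile <- quantile(x, probs = probs, names = FALSE)       # equal counts per segment
knots.uniform  <- seq(min(x), max(x), length.out = segments + 1)  # equal widths per segment

# Hypothetical helper: fit a B-spline by least squares and return the
# three criteria computed from the residuals and the hat matrix diagonal
cv.criteria <- function(x, y, knots, degree) {
  interior <- knots[-c(1, length(knots))]
  B <- bs(x, knots = interior, degree = degree,
          Boundary.knots = range(knots), intercept = TRUE)
  fit <- lm(y ~ B - 1)
  h <- hatvalues(fit)          # diagonal of the hat matrix
  e <- residuals(fit)
  n <- length(y)
  trH <- sum(h)                # trace of the hat matrix
  sigmasq <- mean(e^2)
  c(cv.ls  = mean((e / (1 - h))^2),                 # leave-one-out least squares
    cv.gcv = sigmasq / (1 - trH / n)^2,             # generalized cross-validation
    cv.aic = log(sigmasq) + (1 + trH / n) / (1 - (trH + 2) / n))  # improved AIC
}

res.quantile <- cv.criteria(x, y, knots.quantile, degree)
res.uniform  <- cv.criteria(x, y, knots.uniform, degree)
print(rbind(quantile = res.quantile, uniform = res.uniform))
```

Comparing the two rows for a given criterion mimics, in a very reduced way, what `knots="auto"` is described as doing: choosing the knot type by cross-validation.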

`frscvNOMAD` returns a `crscv` object. Furthermore, the function
`summary` supports objects of this type. The returned
objects have the following components:

K

scalar/vector containing optimal degree(s) of spline or number of segments

I

scalar/vector containing an indicator of whether the
predictor is included or not for each dimension of the
nominal/ordinal (`factor`/`ordered`) predictors

K.mat

vector/matrix of values of `K` evaluated during search

degree.max

the maximum degree of the B-spline basis for
each of the continuous predictors (default `degree.max=10`)

segments.max

the maximum segments of the B-spline basis for
each of the continuous predictors (default `segments.max=10`)

degree.min

the minimum degree of the B-spline basis for
each of the continuous predictors (default `degree.min=0`)

segments.min

the minimum segments of the B-spline basis for
each of the continuous predictors (default `segments.min=1`)

cv.func

objective function value at optimum

cv.func.vec

vector of objective function values at each degree
of spline or number of segments in `K.mat`

`frscvNOMAD` computes NOMAD-based cross-validation for a
regression spline estimate of a one (1) dimensional dependent variable
on an `r`-dimensional vector of continuous and nominal/ordinal
(`factor`/`ordered`) predictors. Numerical search for the optimal
`degree`/`segments`/`I` is undertaken using `snomadr`.

The optimal `K`/`I` combination is returned along with other
results (see the list of returned components for details).
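To make the idea of searching for an optimal `K` concrete, the sketch below runs an exhaustive grid search over degree and segments for a single continuous predictor. This is a deliberately swapped-in technique: `frscvNOMAD` uses the NOMAD mesh adaptive direct search via `snomadr`, not a full grid, and the helper `cv.ls` here is a hypothetical leave-one-out criterion built on `splines::bs` rather than the package's own code.

```r
library(splines)  # stand-in for the package's GSL-based B-spline routines

set.seed(42)
n <- 300
x <- runif(n)
y <- cos(2 * pi * x) + rnorm(n, sd = 0.1)

# Hypothetical helper: leave-one-out least-squares cross-validation
# for one (degree, segments) pair, with knots at equally spaced quantiles
cv.ls <- function(degree, segments) {
  interior <- NULL
  if (segments > 1) {
    probs <- seq(0, 1, length.out = segments + 1)
    knots <- quantile(x, probs = probs, names = FALSE)
    interior <- knots[-c(1, length(knots))]
  }
  B <- bs(x, knots = interior, degree = degree,
          Boundary.knots = range(x), intercept = TRUE)
  fit <- lm(y ~ B - 1)
  mean((residuals(fit) / (1 - hatvalues(fit)))^2)
}

# Brute-force evaluation of every candidate; NOMAD instead explores
# this integer lattice adaptively and evaluates far fewer points
grid <- expand.grid(degree = 1:5, segments = 1:5)
grid$cv <- mapply(cv.ls, grid$degree, grid$segments)

best <- grid[which.min(grid$cv), ]
print(best)
```

A grid of this size is cheap, but the number of candidates grows multiplicatively with each additional predictor, which is the situation a directed search is meant to address.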

For the continuous predictors the regression spline model employs
either the additive or tensor product B-spline basis matrix for a
multivariate polynomial spline via the B-spline routines in the GNU
Scientific Library (https://www.gnu.org/software/gsl/) and the
`tensor.prod.model.matrix` function.
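The difference between the additive and tensor product bases shows up most clearly in their dimensions. The following sketch assumes `splines::bs` in place of the GSL routines and builds the tensor product by hand rather than calling `tensor.prod.model.matrix`; the variable names are illustrative only, and the `glp` basis is not covered.

```r
library(splines)  # stand-in for the package's GSL-based B-spline routines

set.seed(42)
n  <- 100
x1 <- runif(n)
x2 <- runif(n)

# Univariate cubic B-spline bases with 5 columns each, stored as plain matrices
B1 <- matrix(bs(x1, degree = 3, df = 5), nrow = n)
B2 <- matrix(bs(x2, degree = 3, df = 5), nrow = n)

# Additive basis: the univariate bases side by side (no interactions)
B.additive <- cbind(B1, B2)

# Tensor product basis: every column of B1 multiplied elementwise
# by every column of B2 (all pairwise interactions)
B.tensor <- do.call(cbind, lapply(seq_len(ncol(B1)), function(j) B1[, j] * B2))

dim(B.additive)  # 100 rows, 5 + 5 = 10 columns
dim(B.tensor)    # 100 rows, 5 * 5 = 25 columns
```

The additive basis grows with the sum of the univariate dimensions while the tensor product grows with their product, which is why the choice is described as an ‘all or none’ proposition for interactions.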

For the nominal/ordinal (`factor`/`ordered`) predictors the
regression spline model uses indicator basis functions.
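As a rough illustration of what an indicator basis looks like, the sketch below expands a small factor into one 0/1 column per level using base R's `model.matrix`. This is an assumption about the general form only; the package may construct and parameterize its indicator basis differently.

```r
# A toy categorical predictor with three levels
z <- factor(c("a", "b", "c", "a", "b"))

# One indicator (0/1) column per level; "- 1" drops the intercept so
# that no level is absorbed into a baseline
Z <- model.matrix(~ z - 1)

print(Z)
```

Each row contains a single 1 marking the level observed for that case, so the fitted coefficient on a column acts as a level-specific shift.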

Abramson, M.A. and C. Audet and G. Couture and J.E. Dennis Jr. and S. Le Digabel (2011), “The NOMAD project”. Software available at https://www.gerad.ca/nomad.

Craven, P. and G. Wahba (1979), “Smoothing Noisy Data With Spline Functions,” Numerische Mathematik, 31, 377-403.

Hurvich, C.M. and J.S. Simonoff and C.L. Tsai (1998), “Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion,” Journal of the Royal Statistical Society B, 60, 271-293.

Le Digabel, S. (2011), “Algorithm 909: NOMAD: Nonlinear Optimization With the MADS Algorithm,” ACM Transactions on Mathematical Software, 37(4), 44:1-44:15.

Li, Q. and J.S. Racine (2007), *Nonparametric Econometrics:
Theory and Practice,* Princeton University Press.

Ma, S. and J.S. Racine and L. Yang (2015), “Spline Regression in the Presence of Categorical Predictors,” Journal of Applied Econometrics, 30, 705-717.

Ma, S. and J.S. Racine (2013), “Additive Regression Splines with Irrelevant Categorical and Continuous Regressors,” Statistica Sinica, 23, 515-541.

```
# NOT RUN {
library(crs)

set.seed(42)

## Simulated data
n <- 1000

x <- runif(n)
z <- round(runif(n,min=-0.5,max=1.5))
z.unique <- uniquecombs(as.matrix(z))
ind <- attr(z.unique,"index")
ind.vals <- sort(unique(ind))
dgp <- numeric(length=n)
for(i in 1:nrow(z.unique)) {
  zz <- ind == ind.vals[i]
  dgp[zz] <- z[zz]+cos(2*pi*x[zz])
}
y <- dgp + rnorm(n,sd=.1)

xdata <- data.frame(x,z=factor(z))

## Compute the optimal K and I, determine optimal number of knots, set
## spline degree for x to 3

cv <- frscvNOMAD(xz=xdata,y=y,complexity="knots",degree=c(3),segments=c(5))
summary(cv)
# }
```