frscvNOMAD uses NOMAD (Nonsmooth Optimization by Mesh Adaptive Direct
  Search; Abramson, Audet, Couture, Dennis Jr. and Le Digabel (2011)) to
  conduct a cross-validation directed search for a regression spline
  estimate of a one (1) dimensional dependent variable on an
  r-dimensional vector of continuous and nominal/ordinal
  (factor/ordered) predictors.
frscvNOMAD(xz,
           y,
           degree.max = 10, 
           segments.max = 10, 
           degree.min = 0, 
           segments.min = 1,
           cv.df.min = 1,
           complexity = c("degree-knots","degree","knots"),
           knots = c("quantiles","uniform","auto"),
           basis = c("additive","tensor","glp","auto"),
           cv.func = c("cv.ls","cv.gcv","cv.aic"),
           degree = degree,
           segments = segments, 
           include = include, 
           random.seed = 42,
           max.bb.eval = 10000,
           initial.mesh.size.integer = "1",
           min.mesh.size.integer = "1", 
           min.poll.size.integer = "1", 
           opts=list(),
           nmulti = 0,
           tau = NULL,
           weights = NULL,
           singular.ok = FALSE)
frscvNOMAD returns a crscv object, and the function summary supports
  objects of this type. The returned object has the following components:
K: scalar/vector containing optimal degree(s) of spline or number of segments
I: scalar/vector containing an indicator of whether the predictor is included or not for each dimension of the nominal/ordinal (factor/ordered) predictors
K.mat: vector/matrix of values of K evaluated during search
degree.max: the maximum degree of the B-spline basis for each of the continuous predictors (default degree.max=10)
segments.max: the maximum segments of the B-spline basis for each of the continuous predictors (default segments.max=10)
degree.min: the minimum degree of the B-spline basis for each of the continuous predictors (default degree.min=0)
segments.min: the minimum segments of the B-spline basis for each of the continuous predictors (default segments.min=1)
cv.func: objective function value at optimum
cv.func.vec: vector of objective function values at each degree of spline or number of segments in K.mat
y: continuous univariate vector
xz: continuous and/or nominal/ordinal (factor/ordered) predictors
degree.max: the maximum degree of the B-spline basis for each of the continuous predictors (default degree.max=10)
segments.max: the maximum segments of the B-spline basis for each of the continuous predictors (default segments.max=10)
degree.min: the minimum degree of the B-spline basis for each of the continuous predictors (default degree.min=0)
segments.min: the minimum segments of the B-spline basis for each of the continuous predictors (default segments.min=1)
cv.df.min: the minimum degrees of freedom to allow when conducting cross-validation (default cv.df.min=1)
complexity: a character string (default complexity="degree-knots") indicating whether model ‘complexity’ is determined by the degree of the spline or by the number of segments (‘knots’). This option allows the user to use cross-validation to select either the spline degree (number of knots held fixed), the number of knots (spline degree held fixed), or both the spline degree and the number of knots
knots: a character string (default knots="quantiles") specifying where knots are to be placed. ‘quantiles’ specifies knots placed at equally spaced quantiles (an equal number of observations lies in each segment) and ‘uniform’ specifies knots placed at equally spaced intervals. If knots="auto", the knot type will be automatically determined by cross-validation
basis: a character string (default basis="additive") indicating whether the additive B-spline basis, the tensor product B-spline basis, or the generalized B-spline polynomial basis ("glp") should be used for the multivariate polynomial spline. If basis="auto", the basis is determined automatically by cross-validation. The choice is an ‘all or none’ proposition (i.e. interaction terms for all predictors or for no predictors, given the nature of ‘tensor products’). If there is only one predictor this defaults to basis="additive" to avoid unnecessary computation, as the spline bases are equivalent in this case
cv.func: a character string (default cv.func="cv.ls") indicating which method to use to select smoothing parameters. cv.gcv specifies generalized cross-validation (Craven and Wahba (1979)), cv.aic specifies expected Kullback-Leibler cross-validation (Hurvich, Simonoff, and Tsai (1998)), and cv.ls specifies least-squares cross-validation
degree: integer/vector specifying the degree of the B-spline basis for each dimension of the continuous x
segments: integer/vector specifying the number of segments of the B-spline basis for each dimension of the continuous x (i.e. number of knots minus one)
include: integer/vector for the categorical predictors. If it is not NULL, it is used as the initial value for the fitting
random.seed: when it is not missing and not equal to 0, this seed is used to generate the initial points when nmulti > 0 (default random.seed=42)
max.bb.eval: argument passed to the NOMAD solver (see snomadr for further details)
initial.mesh.size.integer: argument passed to the NOMAD solver (see snomadr for further details)
min.mesh.size.integer: argument passed to the NOMAD solver (see snomadr for further details)
min.poll.size.integer: argument passed to the NOMAD solver (see snomadr for further details)
opts: list of optional arguments to be passed to snomadr
nmulti: integer number of times to restart the process of finding extrema of the cross-validation function from different (random) initial points (default nmulti=0)
tau: if non-NULL, a number in (0,1) denoting the quantile for which a quantile regression spline is to be estimated rather than estimating the conditional mean (default tau=NULL)
weights: an optional numeric vector of weights to be used in the fitting process (default weights=NULL). If non-NULL, weighted least squares is used with these weights (that is, sum(w*e^2) is minimized); otherwise ordinary least squares is used
singular.ok: a logical value (default singular.ok=FALSE) that, when FALSE, discards singular bases during cross-validation (a check for ill-conditioned bases is performed)
Jeffrey S. Racine racinej@mcmaster.ca and Zhenghua Nie niez@mcmaster.ca
frscvNOMAD computes NOMAD-based cross-validation for a
  regression spline estimate of a one (1) dimensional dependent variable
  on an r-dimensional vector of continuous and nominal/ordinal
  (factor/ordered) predictors.  Numerical
  search for the optimal degree/segments (K) and inclusion indicators (I)
  is undertaken using snomadr.
The optimal K/I combination is returned along with other
  results (see the components of the returned object).
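To make the three criteria named under cv.func concrete, the sketch below evaluates them for a univariate least-squares spline using the standard textbook formulas for a linear smoother with hat matrix H. Several things here are assumptions for illustration only: cv_criteria is a hypothetical helper, splines::bs stands in for the crs basis routines, and an exhaustive grid stands in for the NOMAD search that frscvNOMAD actually performs. The values crs reports may differ in detail.

```r
library(splines)  # ships with R; a stand-in for the crs basis routines

# Hypothetical helper: three criteria for a least-squares spline fit
cv_criteria <- function(y, x, degree, segments) {
  n <- length(y)
  # segments - 1 interior knots at equally spaced quantiles (none if segments = 1)
  interior <- if (segments > 1) {
    unname(quantile(x, seq(0, 1, length.out = segments + 1)))[2:segments]
  } else {
    NULL
  }
  fit <- lm(y ~ bs(x, degree = degree, knots = interior))
  e <- residuals(fit)
  h <- hatvalues(fit)   # diagonal of the hat matrix H
  trH <- sum(h)         # trace of H
  sigma2 <- mean(e^2)
  c(cv.ls  = mean((e / (1 - h))^2),             # leave-one-out least squares
    cv.gcv = sigma2 / (1 - trH / n)^2,          # generalized cross-validation
    cv.aic = log(sigma2) +                      # improved AIC of Hurvich et al.
      (1 + trH / n) / (1 - (trH + 2) / n))
}

set.seed(42)
n <- 500
x <- runif(n)
y <- cos(2 * pi * x) + rnorm(n, sd = 0.1)

# Exhaustive grid over degree and segments (bs() requires degree >= 1)
grid <- expand.grid(degree = 1:5, segments = 1:8)
scores <- t(mapply(function(d, s) cv_criteria(y, x, d, s),
                   grid$degree, grid$segments))

# Minimizing degree/segments pair under each criterion
best <- grid[apply(scores, 2, which.min), ]
rownames(best) <- colnames(scores)
best
```

A grid like this grows quickly with the number of predictors, which is the motivation for a directed search over the integer degree/segments space.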
For the continuous predictors the regression spline model employs
  either the additive or tensor product B-spline basis matrix for a
  multivariate polynomial spline via the B-spline routines in the GNU
  Scientific Library (https://www.gnu.org/software/gsl/) and the
  tensor.prod.model.matrix function.
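The difference between the additive and tensor product bases can be seen from their dimensions. The sketch below is an assumption-laden illustration: it uses splines::bs from base R instead of the GSL-backed crs routines, and builds the tensor product by hand as all pairwise column products rather than calling tensor.prod.model.matrix.

```r
library(splines)  # ships with R; a stand-in for the GSL-backed crs routines

set.seed(42)
n <- 200
x1 <- runif(n)
x2 <- runif(n)

# Univariate cubic B-spline bases with 5 columns each (no intercept column)
B1 <- bs(x1, degree = 3, df = 5)
B2 <- bs(x2, degree = 3, df = 5)
# Drop the "bs" class and attributes so these behave as plain matrices
B1 <- matrix(as.vector(B1), nrow = n)
B2 <- matrix(as.vector(B2), nrow = n)

# Additive basis: the univariate bases side by side, no interactions
B_add <- cbind(B1, B2)

# Tensor product basis: every column of B1 times every column of B2
idx <- expand.grid(j2 = seq_len(ncol(B2)), j1 = seq_len(ncol(B1)))
B_ten <- B1[, idx$j1] * B2[, idx$j2]

dim(B_add)  # 200 10
dim(B_ten)  # 200 25
```

The additive basis grows with the sum of the univariate dimensions and the tensor product with their product, so the tensor basis uses many more degrees of freedom as predictors are added.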
For the nominal/ordinal (factor/ordered)
  predictors the regression spline model uses indicator basis functions.
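As a rough picture of what indicator basis functions look like, base R's model.matrix expands a factor into 0/1 columns. This is a generic sketch; how crs codes the indicators internally is not described on this page, so the exact columns it uses are an assumption here.

```r
# A factor with three levels
z <- factor(c(0, 1, 1, 0, 2))

# Full set of indicators: one 0/1 column per level
Z <- model.matrix(~ z - 1)
Z

# With an intercept in the model, one level serves as the reference
# and is dropped to avoid collinearity
Z_ref <- model.matrix(~ z)[, -1, drop = FALSE]
Z_ref
```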
Abramson, M.A., C. Audet, G. Couture, J.E. Dennis Jr. and S. Le Digabel (2011), “The NOMAD project.” Software available at https://www.gerad.ca/nomad.
Craven, P. and G. Wahba (1979), “Smoothing Noisy Data With Spline Functions,” Numerische Mathematik, 31, 377-403.
Hurvich, C.M., J.S. Simonoff and C.L. Tsai (1998), “Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion,” Journal of the Royal Statistical Society B, 60, 271-293.
Le Digabel, S. (2011), “Algorithm 909: NOMAD: Nonlinear Optimization With the MADS Algorithm,” ACM Transactions on Mathematical Software, 37(4), 44:1-44:15.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Ma, S., J.S. Racine and L. Yang (2015), “Spline Regression in the Presence of Categorical Predictors,” Journal of Applied Econometrics, 30, 705-717.
Ma, S. and J.S. Racine (2013), “Additive Regression Splines with Irrelevant Categorical and Continuous Regressors,” Statistica Sinica, 23, 515-541.
loess, npregbw
library(crs)
set.seed(42)
## Simulated data
n <- 1000
x <- runif(n)
z <- round(runif(n,min=-0.5,max=1.5))
z.unique <- uniquecombs(as.matrix(z))
ind <-  attr(z.unique,"index")
ind.vals <-  sort(unique(ind))
dgp <- numeric(length=n)
for(i in 1:nrow(z.unique)) {
  zz <- ind == ind.vals[i]
  dgp[zz] <- z[zz]+cos(2*pi*x[zz])
}
y <- dgp + rnorm(n,sd=.1)
xdata <- data.frame(x,z=factor(z))
## Compute the optimal K and I: cross-validate the number of knots while
## holding the spline degree for x fixed at 3
cv <- frscvNOMAD(xz=xdata,y=y,complexity="knots",degree=c(3),segments=c(5))
summary(cv)