simest (version 0.4-1-1)

sim.est: Single Index Model Estimation: Objective Function Approach

Description

Provides an estimate of the non-parametric function and the index vector by minimizing an objective function specified by the method argument.

Usage

sim.est(x, y, w = NULL, beta.init = NULL, nmulti = NULL, L = NULL,
        lambda = NULL, maxit = 100, bin.tol = 1e-5, beta.tol = 1e-5,
        method = c("cvx.pen", "cvx.lip", "cvx.lse.con", "cvx.lse", "smooth.pen"),
        progress = TRUE, force = FALSE)

# S3 method for sim.est
plot(x, pch = 20, cex = 1, lwd = 2, col2 = "red", ...)

# S3 method for sim.est
print(x, digits = getOption("digits"), ...)

# S3 method for sim.est
predict(object, newdata = NULL, deriv = 0, ...)

Value

An object of class sim.est, which is a list containing the elements described below (a short access sketch follows the list).

beta

A numeric vector storing the estimate of the index vector.

nmulti

Number of multistarts used.

x.mat

the input x matrix with possibly aggregated rows.

BetaInit

a matrix storing the initial vectors for the index parameter, either randomly generated or supplied.

lambda

Given input lambda.

L

Given input L.

K

an integer storing the row index of BetaInit which led to the estimator beta.

BetaPath

a list containing the path taken by each of the nmulti initial index vectors.

ObjValPath

a matrix with nmulti rows storing the path of the objective function value for each of the multiple starts.

convergence

a numeric storing convergence status for the index parameter.

itervec

a vector of length nmulti storing the number of iterations taken by each of the multiple starts.

iter

a numeric giving the total number of iterations taken.

method

method given as input.

regress

the output of the regression function used; this is needed by the predict() method.

x.values

the sorted values of \(x^{\top}\hat{\beta}\) obtained by the algorithm.

y.values

the corresponding y values from the input.

fit.values

the corresponding fitted values, of the same length as x.values.

deriv

corresponding values of the derivative (of the same length).

residuals

residuals obtained from the fit.

minvalue

minimum value of the objective function attained.
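
A minimal sketch (assuming the simest package is attached; the simulated data and the value lambda = 0.01 are illustrative and are reused in later sketches) of how these elements can be accessed from the fitted object:

set.seed(1)
x <- matrix(runif(50*3, -1, 1), ncol = 3)
b0 <- c(1, 1, 1)/sqrt(3)
y <- (x %*% b0)^2 + rnorm(50, 0, 0.3)
fit <- sim.est(x, y, lambda = 0.01, method = "cvx.pen", nmulti = 2)
fit$beta         # estimated index vector
fit$minvalue     # minimum value of the objective function attained
fit$convergence  # convergence status of the index parameter
head(cbind(fit$x.values, fit$fit.values))  # sorted x'beta values and their fitted values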

Arguments

x

a numeric matrix giving the values of the predictor variables or covariates. For plot() and print() methods, x is an object of class sim.est.

y

a numeric vector giving the values of the response variable (with length equal to the number of rows of x).

w

an optional numeric vector of weights with the same length as y; defaults to a vector of ones.

beta.init

a numeric vector giving the initial value for the index vector.

nmulti

an integer giving the number of multiple starts to be used for the iterative algorithm. If beta.init is provided, then nmulti is set to 1 (see the sketch after this list of arguments).

L

a numeric value giving the Lipschitz bound for cvx.lip.

lambda

a numeric value giving the penalty parameter for cvx.pen and smooth.pen.

maxit

an integer specifying the maximum number of iterations for each initial \(\beta\) vector.

bin.tol

a tolerance level up to which the x values used in the regression are recognized as distinct.

beta.tol

a tolerance level for stopping iterative algorithm for the index vector.

method

a string indicating which method to use for regression.

progress

a logical denoting if progress of the algorithm is to be printed. Defaults to TRUE.

force

a logical choosing between cvx.lse.reg() and cvx.lse.con.reg(); defaults to FALSE, which uses cvx.lse.con.reg(). This argument is deprecated; choose the regression function via method = "cvx.lse.con" or method = "cvx.lse" instead.

object

the result of sim.est(), of class sim.est.

pch, cex, lwd, col2

further optional arguments to plot() method, passed to underlying plot() or lines() calls.

digits

the number of significant digits, for numbers in the print() method.

...

additional arguments to be passed.

newdata

a matrix of new data points in the predict() method.

deriv

either 0 or 1, the order of the derivative to evaluate.
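
A brief sketch of the interplay between beta.init and nmulti, reusing x and y from the sketch in the Value section (the starting vector c(1, 0, 0) is arbitrary):

## With an explicit starting value, nmulti is set to 1 internally.
fitB <- sim.est(x, y, lambda = 0.01, method = "cvx.pen", beta.init = c(1, 0, 0))
## Without beta.init, nmulti initial index vectors are tried
## (see BetaInit, BetaPath and K in the Value section).
fitM <- sim.est(x, y, lambda = 0.01, method = "cvx.pen", nmulti = 3)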

Author

Arun Kumar Kuchibhotla

Details

The function minimizes $$\sum_{i=1}^n w_i(y_i - f(x_i^{\top}\beta))^2 + \lambda\int\{f''(x)\}^2dx$$ with constraints on \(f\) dictated by method = "cvx.pen" or "smooth.pen". For method = "cvx.lip" or "cvx.lse", the function minimizes $$\sum_{i=1}^n w_i(y_i - f(x_i^{\top}\beta))^2$$ with constraints on \(f\) dictated by the chosen method. The penalty parameter \(\lambda\) is not chosen by any automatic criterion; it has to be specified when using method = "cvx.pen" or "smooth.pen". For method = "cvx.lip", the Lipschitz bound is supplied through the L argument.
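
A short sketch of how the tuning arguments pair with the methods, again reusing x and y from the sketch in the Value section (the values 0.1 and 10 are illustrative only):

fitPen <- sim.est(x, y, lambda = 0.1, method = "cvx.pen")    # penalty parameter lambda
fitSm  <- sim.est(x, y, lambda = 0.1, method = "smooth.pen") # penalty parameter lambda
fitLip <- sim.est(x, y, L = 10, method = "cvx.lip")          # Lipschitz bound L
fitLse <- sim.est(x, y, method = "cvx.lse")                  # no tuning parameter needed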

The plot() method provides the scatterplot along with the fitted curve; it also includes some diagnostic plots for residuals and progression of the algorithm. The predict() method now allows calculation of the first derivative.
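
For instance, reusing fit from the sketch in the Value section (the new data point is arbitrary):

plot(fit)                                             # scatterplot, fitted curve and diagnostics
predict(fit, newdata = c(0.2, -0.1, 0.4))             # fitted value at a new covariate vector
predict(fit, newdata = c(0.2, -0.1, 0.4), deriv = 1)  # first derivative at the same point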

In applications, it might be advantageous to scale the covariate matrix x before passing it to the function, as this brings more stability to the algorithm.
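
For example, one way to scale the columns of x before fitting (a sketch reusing x and y from the sketch in the Value section; whether to center as well as scale is left to the user):

xs <- scale(x)   # center and scale each column of x
fitScaled <- sim.est(xs, y, lambda = 0.01, method = "cvx.pen", nmulti = 2)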

References

Arun K. Kuchibhotla and Rohit K. Patra (2020). Efficient estimation in single index models through smoothing splines. Bernoulli 26(2), 1587--1618. doi:10.3150/19-BEJ1183

Examples

set.seed(2017)
x <- matrix(runif(50*3, -1,1), ncol = 3)
b0 <- c(1, 1, 1)/sqrt(3)
y <- (x %*% b0)^2 + rnorm(50,0,0.3) 
(mCP  <- sim.est(x, y, lambda = 0.01, method = "cvx.pen",   nmulti = 5))
(mCLi <- sim.est(x, y, L = 10,        method = "cvx.lip",   nmulti = 3))
(mSP  <- sim.est(x, y, lambda = 0.01, method = "smooth.pen",nmulti = 5))
(mCLs <- sim.est(x, y,                method = "cvx.lse",   nmulti = 1))
## Compare the 4 models on the same data point:
pr000 <- sapply(list(mCP, mCLi, mSP, mCLs), predict, newdata = c(0,0,0))
pr000 # values close to 0
