shape: Shape Selection

Description

Given a predictor vector $\bold{x}$ (e.g., years) and a matrix $\bold{ymat}$ whose columns are response vectors (e.g., Landsat signals), the shape routine selects the best-fitting shape for each response vector according to the Bayes information criterion (BIC) or the cone information criterion (CIC).

Usage

shape(x, ymat, infocrit = "CIC", flat = TRUE, dec = TRUE, jp = TRUE, 
invee = TRUE, vee = TRUE, inc = TRUE, db = TRUE, nsim = 1e+3, 
edf0 = NULL, get.edf0 = FALSE, random = FALSE, msg = TRUE)

Arguments

x

An $n$ by $1$ predictor vector, for example, years.

ymat

An $n$ by $N$ matrix whose columns are response vectors corresponding to x, for example, Landsat signals.

infocrit

The criterion used to select the best shape for each scatterplot. It can be either the Bayes information criterion ("BIC") or the cone information criterion ("CIC").

flat

A logical flag. If it is TRUE, the flat shape is one of the candidate shapes; otherwise, it is excluded.

dec

A logical flag. If it is TRUE, the decreasing shape is one of the candidate shapes; otherwise, it is excluded.

jp

A logical flag. If it is TRUE, the one-jump shape is one of the candidate shapes; otherwise, it is excluded.

invee

A logical flag. If it is TRUE, the inverted-vee shape is one of the candidate shapes; otherwise, it is excluded.

vee

A logical flag. If it is TRUE, the vee shape is one of the candidate shapes; otherwise, it is excluded.

inc

A logical flag. If it is TRUE, the increasing shape is one of the candidate shapes; otherwise, it is excluded.

db

A logical flag. If it is TRUE, the double-jump shape is one of the candidate shapes; otherwise, it is excluded. The routine is usually slower when the double-jump shape is included.

nsim

Number of simulations used to get the edf0 vector. The default is nsim = 1e+3. See references in this section for more details about edf0.

edf0

The edf0 vector supplied by the user. When $\bold{x}$ is an equally spaced vector whose number of elements is between $20$ and $40$, the user doesn't need to provide an edf0 vector; otherwise, the user has to set get.edf0 to TRUE so that the shape routine will simulate an edf0 vector.

get.edf0

A logical flag. When $\bold{x}$ is not an equally spaced vector whose number of elements is between $20$ and $40$, the user has to set get.edf0 to TRUE so that the shape routine will simulate an edf0 vector, or the user can choose to simulate an edf0
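The condition described for edf0 and get.edf0 can be checked before calling the routine. The following is a minimal sketch in base R; the helper name `has_default_edf0`, the tolerance `tol`, and the reading of "between 20 and 40" as inclusive are assumptions made for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): returns TRUE when x is
# equally spaced and has between 20 and 40 elements (assumed inclusive),
# i.e. the case in which no edf0 vector has to be supplied or simulated.
has_default_edf0 <- function(x, tol = 1e-8) {
  n <- length(x)
  if (n < 2) return(FALSE)
  gaps <- diff(x)
  equally_spaced <- all(abs(gaps - gaps[1]) <= tol)
  equally_spaced && n >= 20 && n <= 40
}

x_regular <- 1985:2010                # 26 equally spaced years
x_gappy   <- c(1985:1999, 2001:2011)  # 26 years with one year missing

has_default_edf0(x_regular)  # TRUE
has_default_edf0(x_gappy)    # FALSE
```

When the helper returns FALSE, the text above suggests calling shape with get.edf0 = TRUE or passing a previously simulated vector through edf0.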

random

A parameter used by the maintainer to test if each shape option can be both included and excluded.

msg

A logical flag. If msg is TRUE, a warning message is printed when there is a non-convergence problem; otherwise, no warning message is printed. The default is msg = TRUE.

Value

shape

An $N$ by $1$ vector. The $i$th element is the best shape for the $i$th scatterplot.

ic

A $k$ by $N$ matrix where the $i$th column is the vector of "BIC" or "CIC" values used to choose the best shape for the $i$th scatterplot. $k$ is the number of shapes allowed by the user.

thetab

An $n$ by $N$ matrix where the $i$th column is the vector of predicted values for the chosen shape for the $i$th scatterplot.

x

The argument x.

ymat

The argument ymat.

infocrit

The argument infocrit.

k

The number of knots used.

bs

A list of coefficient vectors. Each vector holds the coefficients of the regression basis functions for one scatterplot.

ijps

A list storing the position of the first jump for scatterplots whose best shape is one-jump or double-jump. It also stores the position of the knot from which $\bold{f}$ starts increasing (decreasing) for scatterplots whose best shape is vee (inverted vee).

jjps

A list storing the position of the second jump for scatterplots whose best shape is double-jump.

m_is

A vector storing the centering values for the first ramp edge for scatterplots whose best shape is one-jump or double-jump.

m_js

A vector storing the centering values for the second ramp edge for scatterplots whose best shape is double-jump.
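As a rough illustration of how the shape and ic components might be summarized, the sketch below works on a mock list that only mimics the documented layout (an $N$-vector shape and a $k$ by $N$ matrix ic). The helper `summarize_shapes`, the numeric shape codes, and the mock values are all assumptions; an object actually returned by the routine may code shapes differently.

```r
# Hypothetical helper: counts how often each shape was selected and
# extracts the smallest criterion value for each scatterplot (column).
summarize_shapes <- function(res) {
  list(
    counts  = table(res$shape),
    best_ic = apply(res$ic, 2, min)
  )
}

# Mock result with N = 3 scatterplots and k = 2 candidate shapes.
# The shape codes (1 and 6) are illustrative only.
mock <- list(
  shape = c(1, 6, 6),
  ic    = matrix(c(2.0, 3.5,
                   4.1, 1.2,
                   5.0, 0.7), nrow = 2)
)

s <- summarize_shapes(mock)
s$counts   # number of scatterplots assigned to each shape code
s$best_ic  # smallest criterion value in each column of ic
```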

https://cran.r-project.org/package=coneproj

Details

Given a scatterplot of $(x_i, y_i)$, $i=1,\ldots,n$, where $\bold{x}$ could be a vector of years and $\bold{y}$ could be a vector of Landsat signals, constrained least-squares spline fits are obtained for the following shapes:

1. flat

2. decreasing

3. one-jump, i.e., decreasing, jump up, decreasing

4. inverted vee (increasing then decreasing)

5. vee (decreasing then increasing)

6. linear increasing

7. double-jump, i.e., decreasing, jump up, decreasing, jump up, decreasing.
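The idea of fitting several candidate shapes and keeping the one with the smallest information criterion can be sketched with base R alone. This is a simplified stand-in, not the routine's method: it uses ordinary unconstrained `lm` fits for only two of the shapes (flat and linear) and the `BIC` function from the stats package, whose definition may differ from the criterion computed by the shape routine. The simulated data and all variable names are illustrative.

```r
set.seed(1)

# Simulated increasing signal over the years 1985 to 2010 (illustrative data)
x <- 1985:2010
y <- 0.5 * (x - 1985) + rnorm(length(x), sd = 0.5)

# Candidate fits: unconstrained stand-ins for the "flat" and
# "linear increasing" shapes
fits <- list(
  flat   = lm(y ~ 1),
  linear = lm(y ~ x)
)

# One criterion value per candidate; the smallest value wins
ic   <- sapply(fits, BIC)
best <- names(which.min(ic))

ic
best
```

With a clear upward trend in the simulated data, the linear candidate has the smaller BIC and is selected.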

References

Meyer, M. C. (2013a) Semi-parametric additive constrained regression. Journal of Nonparametric Statistics 25(3), 715.

Meyer, M. C. (2013b) A simple new algorithm for quadratic programming with applications in statistics. Communications in Statistics 42(5), 1126--1139.

Liao, X. and M. C. Meyer (2014) coneproj: An R package for the primal or dual cone projections with routines for constrained regression. Journal of Statistical Software 61(12), 1--22.

Examples


# load the package that provides shape(), plotshape(), and the ymat data
# (assumed here to be ShapeSelectForest)
library(ShapeSelectForest)

# import the matrix of Landsat signals
data("ymat")

# define the predictor vector: the years 1985 to 2010
x <- 1985:2010

# Example 1:
# call the shape routine allowing a double-jump shape, using "BIC"
ans <- shape(x, ymat, "BIC")
plotshape(ans, ids = 1:6, both = TRUE, form = TRUE)

# Example 2:
# call the shape routine not allowing a double-jump shape, using "CIC"
ans <- shape(x, ymat, "CIC", db = FALSE)
plotshape(ans, ids = 1:6, both = TRUE, form = TRUE)
