rtpsdr: Real time sufficient dimension reduction through principal least squares SVM

Description

With streaming data, the estimate must be updated continually as new observations arrive, and refitting on all available data can be computationally demanding even for efficient algorithms. It is therefore important to develop real-time SDR algorithms that handle data streams efficiently. The basic idea of the real-time method is to obtain an initial estimator from the currently available data and then update it efficiently as new data are collected. This function implements the real-time principal least squares SVM SDR method for both regression and classification problems. Efficient algorithms are provided for both adding new data and removing old data.
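The update idea can be illustrated on the simplest running quantities, the sample size and the predictor mean (returned as N and Xbar). The following is a minimal base R sketch, not the package's implementation; the helper name update_mean is hypothetical and introduced only for illustration.

```r
# Hypothetical helper (not part of the package): merge a new batch into a
# running sample size and running mean without revisiting the old data.
update_mean <- function(n_old, xbar_old, x_new) {
  n_new <- nrow(x_new)
  n_tot <- n_old + n_new
  # weighted average of the old mean and the new batch mean
  xbar_tot <- (n_old * xbar_old + n_new * colMeans(x_new)) / n_tot
  list(N = n_tot, Xbar = xbar_tot)
}

set.seed(1)
x1 <- matrix(rnorm(200 * 3), 200, 3)  # initial batch
x2 <- matrix(rnorm(100 * 3), 100, 3)  # newly arrived batch

state <- list(N = nrow(x1), Xbar = colMeans(x1))
state <- update_mean(state$N, state$Xbar, x2)

# The streamed mean agrees with the mean computed on the pooled data.
all.equal(state$Xbar, colMeans(rbind(x1, x2)))
```

Only the summary statistics are carried between batches, which is what keeps the cost of an update independent of how much data has already been seen.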

Usage

rtpsdr(x, y, obj = NULL, h = 10, lambda = 1)

Value

An object with S3 class "rtpsdr". Details are listed below.

x: input data matrix
y: input response vector
Mn: the estimated working matrix, obtained as the cumulative outer product of the estimated parameters over H
evalues: eigenvalues of Mn
evectors: eigenvectors of Mn; the first d leading eigenvectors form a basis of the central subspace
N: total number of observations, \(n_1 + n_2\)
Xbar: mean of all \(\mathbf{x}\)
r: updated matrix of estimated coefficients
A: new A part for the update. See Artemiou et al. (2021)
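How a working matrix leads to a basis can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the coefficient matrix B below is toy data, and d = 2 is chosen by hand.

```r
set.seed(1)
p <- 5  # number of predictors
d <- 2  # assumed structural dimension (chosen for this toy example)

# Toy coefficient matrix (an assumption): 9 coefficient vectors that lie
# close to the span of the first two coordinate axes.
B <- rbind(
  matrix(rnorm(d * 9), d, 9),
  matrix(rnorm((p - d) * 9, sd = 0.01), p - d, 9)
)

# Sum of the outer products of the columns of B.
Mn <- B %*% t(B)

# Eigen-decomposition of the symmetric working matrix; eigenvalues are
# returned in decreasing order.
eig <- eigen(Mn, symmetric = TRUE)
basis <- eig$vectors[, 1:d, drop = FALSE]  # first d leading eigenvectors

round(eig$values, 4)
```

With a fitted object, the analogous quantities would be read from the evalues and evectors components listed above.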

Arguments

x: predictor matrix of the new data
y: response vector of the new data; y is continuous
obj: the most recent output object returned by rtpsdr (NULL for the first batch)
h: number of slices. Default is 10.
lambda: hyperparameter for the loss function. Default is 1.
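To give a feel for what a number of slices means for a continuous response, the sketch below dichotomizes y at h - 1 interior quantiles. This is an assumption about how slicing is commonly done in principal-machine methods, written in base R; it is not taken from the package source, and the variable names are illustrative.

```r
set.seed(1)
y <- rnorm(500)  # toy continuous response
h <- 10          # number of slices

# h - 1 interior cutoffs at equally spaced quantile levels (assumed scheme)
cutoffs <- quantile(y, probs = (1:(h - 1)) / h)

# One column of pseudo-labels per cutoff: +1 if y exceeds it, -1 otherwise
labels <- sapply(cutoffs, function(q) ifelse(y > q, 1, -1))

dim(labels)
```

Under this scheme, a larger h produces more binary problems per batch, so h trades computation against how finely the response is resolved.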

Author

Jungmin Shin, jungminshin@korea.ac.kr, Seung Jun Shin, sjshin@korea.ac.kr, Andreas Artemiou, artemiou@uol.ac.cy

References

Artemiou, A. and Dong, Y. (2016) Sufficient dimension reduction via principal lq support vector machine, Electronic Journal of Statistics 10: 783–805.
Artemiou, A., Dong, Y. and Shin, S. J. (2021) Real-time sufficient dimension reduction through principal least squares support vector machines, Pattern Recognition 112: 107768.
Kim, B. and Shin, S. J. (2019) Principal weighted logistic regression for sufficient dimension reduction in binary classification, Journal of the Korean Statistical Society 48(2): 194–206.
Li, B., Artemiou, A. and Li, L. (2011) Principal support vector machines for linear and nonlinear sufficient dimension reduction, Annals of Statistics 39(6): 3182–3210.
Soale, A.-N. and Dong, Y. (2022) On sufficient dimension reduction via principal asymmetric least squares, Journal of Nonparametric Statistics 34(1): 77–94.
Wang, C., Shin, S. J. and Wu, Y. (2018) Principal quantile regression for sufficient dimension reduction with heteroscedasticity, Electronic Journal of Statistics 12(2): 2114–2140.
Shin, S. J., Wu, Y., Zhang, H. H. and Liu, Y. (2017) Principal weighted support vector machines for sufficient dimension reduction in binary classification, Biometrika 104(1): 67–81.
Li, L. (2007) Sparse sufficient dimension reduction, Biometrika 94(3): 603–613.

Examples

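p <- 5       # number of predictors
m <- 500     # batch size
N <- 10      # number of batches
obj <- NULL  # no previous fit before the first batch
for (iter in 1:N) {
  set.seed(iter)  # a different, reproducible seed for each batch
  x <- matrix(rnorm(m * p), m, p)
  y <- x[, 1] / (0.5 + (x[, 2] + 1)^2) + 0.2 * rnorm(m)
  # update the estimator with the new batch, starting from the latest fit
  obj <- rtpsdr(x = x, y = y, obj = obj)
}
print(obj)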