# selectLambda

##### Selection of sparsity parameter using IC

Selection of the sparsity parameter for ROSPCA and SCoTLASS using BIC of Hubert et al. (2016), and for SRPCA using BIC of Croux et al. (2013).

- Keywords
- optimize

##### Usage

```
selectLambda(X, k, kmax = 10, method = "ROSPCA", lmin = 0, lmax = 2, lstep = 0.02,
alpha = 0.75, stand = TRUE, skew = FALSE, multicore = FALSE,
mc.cores = NULL, P = NULL, ndir = "all")
```

##### Arguments

- X
An \(n\) by \(p\) matrix or data matrix with observations in the rows and variables in the columns.

- k
Number of Principal Components (PCs).

- kmax
Maximal number of PCs to be computed, only used when `method = "ROSPCA"` or `method = "ROSPCAg"`. Default is 10.

- method
PCA method to use: ROSPCA (`"ROSPCA"` or `"ROSPCAg"`), SCoTLASS (`"SCoTLASS"` or `"SPCAg"`) or SRPCA (`"SRPCA"`). Default is `"ROSPCA"`.

- lmin
Minimal value of \(\lambda\) to look at, default is 0.

- lmax
Maximal value of \(\lambda\) to look at, default is 2.

- lstep
Difference between two consecutive values of \(\lambda\), i.e. the step size, default is 0.02.

- alpha
Robustness parameter for ROSPCA, default is 0.75.

- stand
Logical indicating if the data should be standardised, default is `TRUE`.

- skew
Logical indicating if the skewed version of ROSPCA should be applied, default is `FALSE`.

- multicore
Logical indicating if multiple cores can be used, default is `FALSE`. Note that this is not possible for the Windows platform, so `multicore` is always `FALSE` there.

- mc.cores
Number of cores to use if `multicore=TRUE`, default is `NULL` which corresponds to the number of cores minus 1.

- P
True loadings matrix, a numeric matrix of size \(p\) by \(k\). The default is `NULL` which means that no true loadings matrix is specified.

- ndir
Number of directions used when computing the outlyingness (or the adjusted outlyingness when `skew=TRUE`) in `rospca`, see `outlyingness` and `adjOutl` for more details.
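The three grid arguments `lmin`, `lmax` and `lstep` together describe an equidistant sequence of \(\lambda\) values. The base R sketch below shows what such a grid looks like for the default values. It assumes the grid is built like `seq(lmin, lmax, by = lstep)`, which this page does not state explicitly, and the name `lambda_grid` is only illustrative.

```r
# Default grid arguments from the Usage section
lmin <- 0
lmax <- 2
lstep <- 0.02

# Assumed grid construction (not confirmed by this page)
lambda_grid <- seq(lmin, lmax, by = lstep)

length(lambda_grid)  # number of candidate lambda values, i.e. number of model fits
range(lambda_grid)   # smallest and largest candidate
```

Since one model is fitted per grid point, a larger `lstep` (such as `lstep = 0.1` in the example at the end of this page) reduces the number of fits accordingly.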

##### Details

We select an optimal value of \(\lambda\) for a given method on a given dataset by evaluating an equidistant grid of \(\lambda\) values. For each value of \(\lambda\), we apply the method to the dataset using this sparsity parameter and compute an Information Criterion (IC). The optimal value of \(\lambda\) is the one corresponding to the minimal IC. The ICs we consider are the BIC of Hubert et al. (2016) for ROSPCA and SCoTLASS, and the BIC of Croux et al. (2013) for SRPCA.

The BIC of Hubert et al. (2016) is defined as $$BIC(\lambda)=\ln\left(\frac{1}{h_1 p}\sum_{i=1}^{h_1} OD^2_{(i)}(\lambda)\right)+df(\lambda)\frac{\ln(h_1 p)}{h_1 p},$$ where \(h_1\) is the size of \(H_1\) (the subset of observations that are kept in the non-sparse reweighting step) and \(OD_{(i)}(\lambda)\) is the \(i\)th smallest orthogonal distance for the model when using \(\lambda\) as the sparsity parameter. The degrees of freedom \(df(\lambda)\) are the number of non-zero loadings when \(\lambda\) is used as the sparsity parameter.
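To make the formula concrete, the following base R sketch evaluates it for given orthogonal distances and loadings. The function name `bic_od` and the toy inputs are hypothetical and chosen for illustration only. This is a direct transcription of the formula above, not the package's internal code.

```r
# BIC of Hubert et al. (2016), transcribed from the formula above.
# od       : orthogonal distances of the observations for one lambda
# h1       : size of H_1 (number of observations kept)
# loadings : p x k loadings matrix obtained with that lambda
bic_od <- function(od, h1, loadings) {
  p <- nrow(loadings)
  # Sum of the h1 smallest squared orthogonal distances
  od2 <- sort(od^2)[seq_len(h1)]
  # Degrees of freedom: number of non-zero loadings
  df <- sum(loadings != 0)
  log(sum(od2) / (h1 * p)) + df * log(h1 * p) / (h1 * p)
}

# Toy inputs (made up): 4 observations, p = 2 variables, k = 1 component
od <- c(1, 2, 3, 10)
dense  <- matrix(c(0.8, 0.6), ncol = 1)  # two non-zero loadings
sparse <- matrix(c(1, 0), ncol = 1)      # one non-zero loading

bic_dense  <- bic_od(od, h1 = 3, loadings = dense)
bic_sparse <- bic_od(od, h1 = 3, loadings = sparse)
```

With identical orthogonal distances, the sparser loadings matrix gets the smaller BIC because only the penalty term differs. In practice the orthogonal distances also change with \(\lambda\), so the criterion trades off fit against sparsity.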

##### Value

A list with components:

- Value of \(\lambda\) corresponding to minimal IC.

- Minimal value of IC.

- Numeric vector containing the used values of \(\lambda\).

- Numeric vector containing the IC values corresponding to all values of \(\lambda\) in `Lambda`.

- Loadings obtained using method with sparsity parameter `opt.lambda`, a numeric matrix of size \(p\) by \(k\).

- Fit obtained using method with sparsity parameter `opt.lambda`. This is a list containing the loadings (`loadings`), the eigenvalues (`eigenvalues`), the standardised data matrix used as input (`Xst`), the scores matrix (`scores`), the orthogonal distances (`od`) and the score distances (`sd`).

- Type of IC used: `BICod` (BIC of Hubert et al. (2016)) or `BIC` (BIC of Croux et al. (2013)).

- A numeric vector containing the standardised angles between the true and the estimated loadings matrix for each value of \(\lambda\) if a loadings matrix is given. When no loadings matrix is given as input (`P=NULL`), `measure` is equal to `NULL`.
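The relation between the grid of \(\lambda\) values, the IC values, and the two returned optima can be sketched in base R as follows. The numbers are made up for illustration, and the variable names `ic`, `opt_lambda` and `min_ic` are hypothetical stand-ins rather than the component names of the returned list.

```r
# Made-up grid and IC values, for illustration only
Lambda <- seq(0, 2, by = 0.5)
ic <- c(1.9, 1.2, 0.7, 0.9, 1.5)

# Position of the minimal IC on the grid
best <- which.min(ic)

opt_lambda <- Lambda[best]  # lambda with minimal IC
min_ic <- ic[best]          # the minimal IC itself
```

Note that `which.min` returns the first minimum, so in this sketch ties are resolved in favour of the smaller \(\lambda\). Whether the package resolves ties the same way is not stated here.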

##### References

Hubert, M., Reynkens, T., Schmitt, E. and Verdonck, T. (2016). "Sparse PCA for High-Dimensional Data with Outliers," *Technometrics*, 58, 424–434.

Croux, C., Filzmoser, P. and Fritz, H. (2013). "Robust Sparse Principal Component Analysis," *Technometrics*, 55, 202–214.

##### Examples

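```
library(rospca)

# Generate one simulated dataset with contamination (eps = 0.2)
X <- dataGen(m=1, n=100, p=10, eps=0.2, bLength=4)$data[[1]]

# Select lambda for ROSPCA with k = 2 PCs on a coarse grid (step size 0.1)
sl <- selectLambda(X, k=2, method="ROSPCA", lstep=0.1)

# Plot the IC values against lambda
selectPlot(sl)
```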

*Documentation reproduced from package rospca, version 1.0.4, License: GPL (>= 2)*