dCUR (version 1.0.1)

optimal_stage: optimal_stage

Description

optimal_stage selects the optimal k (the number of components used to compute leverages) together with the optimal number of columns and rows of a dynamic CUR object; it also returns a data frame and the corresponding plots.

Usage

optimal_stage(data, limit = 80)

Value

data

a data frame which specifies the relative error for each stage of CUR decomposition.

rows_plot

a plot where the average relative error is shown for each number of relevant rows selected.

columns_plot

a plot where the average relative error is shown for each number of relevant columns selected.

k_plot

a plot where the average relative error is shown for each k (number of components to compute leverage), given the optimal number of relevant columns and rows.

optimal

a data frame where the average relative error is shown for optimal k (number of components to compute leverage), given the optimal number of relevant columns and rows.

Arguments

data

An object resulting from a call to dCUR.

limit

Cumulative percentage of the average relative error rate. Defaults to 80.
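The page does not spell out how limit is applied internally. As a rough illustration only, the sketch below shows one possible reading of a cumulative-percentage cutoff: accumulate the average relative errors across stages and find the first stage at which the running share reaches the limit. The vector avg_error holds made-up values, and none of these names come from dCUR; this is an assumption, not the package's algorithm.

```r
# Hypothetical average relative errors per stage (made-up values, not dCUR output)
avg_error <- c(0.40, 0.25, 0.12, 0.10, 0.08, 0.05)

# Running share of the total error, expressed as a percentage
cum_pct <- 100 * cumsum(avg_error) / sum(avg_error)

# Assumed interpretation of `limit`: first stage whose cumulative share reaches it
limit <- 80
first_stage <- which(cum_pct >= limit)[1]

data.frame(stage = seq_along(avg_error), avg_error, cum_pct)
first_stage
```

Lowering limit in this sketch moves the cutoff to an earlier stage; whether optimal_stage behaves the same way should be checked against the package source.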

Author

Cesar Gamboa-Sanabria, Stefany Matarrita-Munoz, Katherine Barquero-Mejias, Greibin Villegas-Barahona, Mercedes Sanchez-Barba and Maria Purificacion Galindo-Villardon.

Details

Select the optimal stage of dynamic CUR decomposition

The objective of CUR decomposition is to find the most relevant variables and observations within a data matrix in order to reduce its dimensionality. As more columns (variables) and rows (observations) are selected, the relative error decreases; the same does not hold for k (the number of components used to compute leverages), where a larger value does not guarantee a smaller error. This function therefore looks for the best-balanced stage, that is, the combination of k, number of relevant columns, and number of relevant rows whose error is very close to the minimum while still maintaining the low-rank fit of the data matrix.
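To make the quantities in this description concrete, here is a minimal base-R sketch of a CUR approximation and its relative error. It is not the dCUR implementation: it picks the columns and rows with the highest leverage scores deterministically (dCUR offers several selection methods), it uses the built-in mtcars data, and the helpers pinv and cur_rel_error are hypothetical names introduced only for this illustration.

```r
# Illustrative sketch only: base R, not the code used inside dCUR.
A <- scale(as.matrix(mtcars))  # built-in dataset, standardized columns

# Moore-Penrose pseudoinverse via the SVD (hypothetical helper)
pinv <- function(M, tol = 1e-10) {
  s <- svd(M)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Relative error of a CUR approximation built from the top-leverage
# columns and rows, with leverages computed from the first k components
cur_rel_error <- function(A, k, n_cols, n_rows) {
  s <- svd(A)
  col_lev <- rowSums(s$v[, 1:k, drop = FALSE]^2) / k  # column leverage scores
  row_lev <- rowSums(s$u[, 1:k, drop = FALSE]^2) / k  # row leverage scores
  cols <- order(col_lev, decreasing = TRUE)[seq_len(n_cols)]
  rows <- order(row_lev, decreasing = TRUE)[seq_len(n_rows)]
  C <- A[, cols, drop = FALSE]
  R <- A[rows, , drop = FALSE]
  U <- pinv(C) %*% A %*% pinv(R)
  norm(A - C %*% U %*% R, type = "F") / norm(A, type = "F")
}

# Error as more columns are kept (k and the number of rows held fixed)
err_by_cols <- sapply(1:ncol(A), function(m) cur_rel_error(A, k = 2, m, n_rows = 10))

# Error as k varies (number of columns and rows held fixed)
err_by_k <- sapply(1:5, function(k) cur_rel_error(A, k, n_cols = 5, n_rows = 10))
```

In this sketch err_by_cols cannot increase as columns are added, because the selected column sets are nested. err_by_k carries no such guarantee, since changing k changes which columns and rows are selected, which is the asymmetry the description refers to.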

See Also

dCUR, CUR

Examples

# \donttest{
library(dCUR)

# Fit a dynamic CUR decomposition on the AASP dataset
results <- dCUR(
  data = AASP, variables = hoessem:notabachillerato,
  k = 15, rows = 0.25, columns = 0.25, skip = 0.1, standardize = TRUE,
  cur_method = "sample_cur",
  parallelize = TRUE, dynamic_columns = TRUE,
  dynamic_rows = TRUE
)

# Select the optimal stage, then inspect each returned component
result <- optimal_stage(results, limit = 80)
result
result$k_plot
result$columns_plot
result$rows_plot
result$data
result$optimal
# }