RCTS (version 0.2.4)

Clustering Time Series While Resisting Outliers

Description

Robust Clustering of Time Series (RCTS) clusters time series using both the classical and the robust interactive fixed effects framework. The classical framework is developed in Ando & Bai (2017); the implementation in this package excludes the SCAD penalty on the estimates of beta. The robust framework is developed in Boudt & Heyndels (2022) and is made robust against different kinds of outliers. The algorithm iteratively updates beta (the coefficients of the observable variables), group membership, and the latent factors (which can be common and/or group-specific) along with their loadings. The number of groups and factors can be estimated if they are unknown.
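The alternating scheme described above can be illustrated with a simplified, non-robust sketch in base R. It is not the package's implementation: it assumes no observable variables (so beta is dropped), no common factors, exactly one factor per group, and a known number of groups. Factors are estimated with an ordinary SVD instead of robust PCA, and the function name `toy_grouped_factors` and all variable names are hypothetical.

```r
set.seed(42)
N <- 40; T_len <- 30; G <- 2
g_true <- rep(1:G, each = N / G)

# Simulate a panel (N series x T_len periods): each group has its own single factor.
Y <- matrix(0, N, T_len)
for (k in 1:G) {
  Fg <- rnorm(T_len)                      # group factor
  Lg <- runif(sum(g_true == k), 1, 2)     # loadings of the group's members
  Y[g_true == k, ] <- outer(Lg, Fg)
}
Y <- Y + matrix(rnorm(N * T_len, sd = 0.1), N, T_len)

toy_grouped_factors <- function(Y, g, G, max_iter = 20) {
  for (iter in seq_len(max_iter)) {
    # This sketch does not handle empty groups; it simply stops.
    if (any(tabulate(g, nbins = G) == 0)) stop("a group became empty")
    # Step 1: per group, take the leading right singular vector as the factor.
    Fg <- sapply(seq_len(G), function(k) {
      svd(Y[g == k, , drop = FALSE], nu = 0, nv = 1)$v[, 1]
    })
    # Step 2: reassign each series to the group whose (unit-norm) factor explains
    # most of it; the squared residual is sum(y^2) - (y . f)^2, so maximising
    # the squared projection minimises the residual.
    proj <- (Y %*% Fg)^2                  # N x G
    g_new <- max.col(proj, ties.method = "first")
    if (all(g_new == g)) break            # memberships stable: stop
    g <- g_new
  }
  list(g = g, factors = Fg, iterations = iter)
}

# Start from a k-means clustering of the rows of Y, then alternate.
g_init <- kmeans(Y, centers = G, nstart = 5)$cluster
res <- toy_grouped_factors(Y, g_init, G)
table(estimated = res$g, true = g_true)
```

The cross-table compares estimated and true memberships; group labels are arbitrary, so a perfect clustering may appear with the labels swapped.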

Install

install.packages('RCTS')

Monthly Downloads

227

Version

0.2.4

License

GPL (>= 2)

Maintainer

Ewoud Heyndels

Last Published

May 18th, 2023

Functions in RCTS (0.2.4)

calculate_FL_group_estimated

Returns the estimated group factor structure.
calculate_TN_factor

Helper function. Calculates part of the fourth term of the PIC.
calculate_VCsquared

Calculates VC², to determine the stability of the found number of groups and factors over the subsamples.
add_pic_parallel

Calculates the PIC for the current configuration.
beta_true_heterogroups

Helper function in create_true_beta() for the option beta_true_heterogeneous_groups. (This is the default option.)
add_pic

Fills in df_pic: adds a row with the calculated PIC for the current configuration.
calculate_W

Calculates W = Y - X*beta_est. It is used in the initialization step of the algorithm to initialise the factor structures.
calculate_PIC_term1

Calculates the first term of the PIC (panel information criterion).
calculate_FL_group_true

Calculates the true group factor structure.
calculate_PIC

Determines the PIC (panel information criterion).
calculate_lambda_group

Calculates the factor loadings of the group factors.
calculate_Z_group

Calculates Z = Y - X*beta_est - LF. It is used to estimate the group factor structure.
calculate_XB_estimated

Calculates (the estimated value of) the matrix X*beta_est.
calculate_best_config

Returns, for each candidate C, the best number of groups and factors based on the PIC.
calculate_Z_common

Calculates Z = Y - X*beta_est - LgFg. It is used to estimate the common factor structure.
calculate_error_term

Calculates the error term Y - X*beta_est - LF - LgFg.
calculate_XB_true

Calculates the product X*beta_true.
define_C_candidates

Defines the candidate values for C.
check_stopping_rules

Checks the rules for stopping the algorithm, based on its convergence speed.
clustering_with_robust_distances

Puts individuals in a separate "class zero" when their distance to all possible groups is larger than a certain threshold.
create_true_beta

Creates beta_true, which contains the true values of beta (= the coefficients of X)
calculate_lgfg

Calculates the group factor structure: the matrix product of the group factors and their loadings.
define_rho_parameters

Determines parameters of rho-function.
determine_beta

Helper function in estimate_beta() for estimating beta_est.
calculate_virtual_factor_and_lambda_group

Helper function used in update_g().
calculate_sigma2maxmodel

Calculates sigma2maxmodel
estimate_factor

Estimates common factor(s) F.
evade_floating_point_errors

Function to evade floating point errors.
define_object_for_initial_clustering_macropca

Defines the object that will be used to define an initial clustering.
evade_crashes_macropca

Solves a very specific issue with MacroPCA.
define_number_subsets

Returns a vector with the indices of the subsets. Must start with zero.
determine_robust_lambda

Helper function for return_robust_lambdaobject().
fill_rcj

Fills in the optimized number of groups and group specific factors for each C.
final_estimations_filter_kg

Filters dataframe on the requested group specific factors configuration.
estimate_factor_group

Estimates group factors Fg.
generate_Y

Generate panel data Y for simulations.
g_true_dgp3

g_true_dgp3 contains the true group memberships of the elements of Y_dgp3
create_data_dgp2

Creates an instance of DGP 2, as defined in Boudt & Heyndels (2022).
create_covMat_crosssectional_dependence

Function used in generating simulated data with non normal errors.
generate_grouped_factorstructure

Generates the true group factor structure, to use in simulations.
get_best_configuration

Finds the first stable interval after the first unstable point. It then defines the value of C at the beginning, middle and end of this interval.
initialise_df_pic

Initialises a dataframe which will contain the PIC for each configuration and for each value of C.
factor_group_true_dgp3

factor_group_true_dgp3 contains the values of the true group factors on which Y_dgp3 is based
define_kg_candidates

Defines the set of combinations of group specific factors.
get_convergence_speed

Defines the convergence speed.
handleNA

Takes a dataframe as input (this will be "Y" or "to_divide") and filters out rows with NA.
grid_add_variables

Makes a dataframe (called "grid") available with the data (individualindex, timeindex, XT and LF).
initialise_df_results

Initialises a dataframe that will contain an overview of metrics for each estimated configuration (for example, the adjusted Rand index).
define_configurations

Constructs a dataframe whose rows contain all configurations that are included and for which the estimators will be estimated.
plot_VCsquared

Plots VC² along with the corresponding number of groups (orange), common factors (darkblue) and group factors of the first group (lightblue).
fill_rc

Fills in the optimized number of common factors for each C.
initialise_clustering

Function that clusters time series in a dataframe with kmeans.
df_results_example

An example for df_results. This dataframe contains the estimators for each configuration.
initialise_commonfactorstructure_macropca

Initialises the estimation of the common factors and their loadings.
kg_candidates_expand

Returns the set of combinations of group factors for which the algorithm needs to run.
reassign_if_empty_groups

Randomly reassign individual(s) if there are empty groups. This can happen if the total number of time series is low compared to the number of desired groups.
iterate

Wrapper around estimate_beta(), update_g(), and the estimation of the factor structures.
calculate_errors_virtual_groups

Helper function for update_g(). Calculates the errors for one of the possible groups in which a time series can be placed.
calculate_obj_for_g

Calculates objective function for individual i and group k in order to estimate group membership.
restructure_X_to_order_slowN_fastT

Restructures X (which is a 3D array of dimensions (N,T,p)) to a 2D matrix of dimensions (NxT,p).
calculate_lambda

Calculates the factor loadings of the common factors.
initialise_X

Creates X (the observable variables) to use in simulations.
get_final_estimation

Function that returns the final clustering, based on the estimated number of groups and common and group specific factors.
do_we_estimate_common_factors

Helper function to shorten code: checks whether common factors are being estimated.
initialise_beta

Initialises the estimation of beta (the coefficients of the observable variables).
solveFG

Helper function in update_g(), to calculate solve(FG x t(FG)) x FG.
calculate_sigma2

Calculates sum of squared errors, divided by NT
scaling_X

Scaling of X.
prepare_for_robpca

Helper function: prepares the object on which robust PCA is performed.
run_config

Wrapper around the non-parallel algorithm, to estimate beta, group membership and the factor structures.
handleNA_LG

Removes NAs in LG (in the function calculate_virtual_factor_and_lambda_group()).
do_we_estimate_group_factors

Helper function to shorten code: checks whether group factors are being estimated.
estimate_algorithm

Wrapper around the initialization and the estimation part of the algorithm, for one configuration. It is only used for the serial (non-parallel) algorithm.
estimate_beta

Estimates beta.
parallel_algorithm

Wrapper of the loop over the subsets which in turn use the parallelised algorithm.
update_g

Function that estimates group membership.
make_df_results_parallel

Makes a dataframe with information on each configuration.
matrixnorm

Function to calculate the norm of a matrix.
initialise_rc

Initialises rc.
handle_macropca_errors

Helper function in robustpca().
initialise_rcj

Initialises rcj.
lambda_group_true_dgp3

lambda_group_true_dgp3 contains the values of the loadings to the group factors on which Y_dgp3 is based
return_robust_lambdaobject

Calculates robust loadings
make_df_pic_parallel

Makes a dataframe with the PIC for each configuration and each candidate C.
tabulate_potential_C

Shows the configurations for the potential values of C in the first stable interval (beginning, middle and end point).
robustpca

Function that uses robust PCA and estimates robust factors and loadings.
make_subsamples

Selects a subsample of the time series, and of the length of the time series. Based on this it returns a list with a subsample of Y, the corresponding subsample of X, and, if applicable, of the true group membership and factor structures.
OF_vectorized3

Calculates objective function for the classical algorithm: used in iterate() and in local_search.
add_configuration

Adds the current configuration (number of groups and factors) to df_results.
add_metrics

Adds several metrics to df_results.
adapt_pic_with_sigma2maxmodel

Adapts the object that contains PIC for all candidate C's and all subsamples with sigma2_max_model.
LMROB

Wrapper around lmrob.
RCTS

RCTS
X_dgp3

The dataset X_dgp3 contains the values of the 3 observable variables on which Y_dgp3 is based.
adapt_X_estimating_less_variables

Reformats X when the algorithm is run with a different number of observable variables than the number that is available. (Mainly used for testing.)
OF_vectorized_helpfunction3

Helper function in OF_vectorized3().
Y_dgp3

Y_dgp3 contains a simulated dataset for DGP 3.
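
Several entries in the index above define quantities by formula: W = Y - X*beta_est, the two Z matrices, the error term Y - X*beta_est - LF - LgFg, the sum of squared errors divided by NT, and the reshaping of X from (N,T,p) to (NxT,p). The base R sketch below works through those formulas on a small simulated panel. It does not call the RCTS functions. The array layouts (Y as N x T, beta with one column per series) and all variable names are assumptions made for illustration.

```r
set.seed(1)
N <- 6; T_len <- 10; p <- 2

# Observable variables: 3D array of dimensions (N, T, p).
X <- array(rnorm(N * T_len * p), dim = c(N, T_len, p))
# Coefficients: one column per series (assumed layout, p x N).
beta <- matrix(rnorm(p * N), nrow = p)

# X*beta as an N x T matrix: entry (i, t) is sum over k of X[i, t, k] * beta[k, i].
XB <- sapply(seq_len(T_len), function(t) rowSums(X[, t, ] * t(beta)))

# Common factor structure LF: loadings (N x 1) times one common factor (1 x T).
Lambda <- matrix(rnorm(N), ncol = 1)
F_common <- matrix(rnorm(T_len), nrow = 1)
LF <- Lambda %*% F_common

# Group factor structure LgFg: each group has its own factor and loadings.
g <- rep(1:2, each = N / 2)
LgFg <- matrix(0, N, T_len)
for (k in unique(g)) {
  Fg <- matrix(rnorm(T_len), nrow = 1)
  Lg <- matrix(rnorm(sum(g == k)), ncol = 1)
  LgFg[g == k, ] <- Lg %*% Fg
}

E <- matrix(rnorm(N * T_len, sd = 0.1), N, T_len)
Y <- XB + LF + LgFg + E

# The residual quantities named in the index, using the true values as "estimates".
W        <- Y - XB                 # used to initialise the factor structures
Z_group  <- Y - XB - LF            # used to estimate the group factor structure
Z_common <- Y - XB - LgFg          # used to estimate the common factor structure
err      <- Y - XB - LF - LgFg     # error term
sigma2   <- sum(err^2) / (N * T_len)

# Reshape X from (N, T, p) to (N*T, p) with N varying slowest and T fastest:
# row (i - 1) * T_len + t holds X[i, t, ].
X2d <- matrix(aperm(X, c(2, 1, 3)), nrow = N * T_len, ncol = p)
```

Because the true beta, factors and loadings are plugged in here, `err` reproduces the simulated noise `E` up to floating-point error; with estimated quantities it would not.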