generate_reduced_graph: [INTERNAL] Generate a reduced iGraph from adjacency matrices

Description

[INTERNAL] A wrapper function that generates a network from correlation data and reduces it by a given method. Correlation/adjacency matrices are computed in compute_correlation_matrices. Graph generation uses graph.adjacency internally. The implemented reduction methods are network_reduction_by_p_value (reduction by statistical significance of the correlation) and network_reduction_by_pickHardThreshold (uses the WGCNA function pickHardThreshold.fromSimilarity to find a cutoff value that yields a scale-free network). If no method is given, no reduction is performed. With the reduction method `p_value`, the user can specify an alpha significance value and a method for p-value adjustment. With the reduction method `pickHardThreshold`, an R^2 cutoff and a cut vector can be specified.
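
The idea behind the `p_value` reduction can be illustrated with base R alone. The following is a minimal sketch under stated assumptions, not the code of network_reduction_by_p_value: the toy data, the variable names, and the t-statistic formula for Pearson correlation p-values are chosen for illustration only.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(1)

# Toy data: 5 components (rows) measured in 20 samples (columns),
# matching the layout described for `measurement_data`.
n_samples <- 20
measurement_data <- matrix(rnorm(5 * n_samples), nrow = 5,
                           dimnames = list(paste0("g", 1:5), NULL))
# Make g2 strongly correlated with g1 so that at least one edge survives.
measurement_data["g2", ] <- measurement_data["g1", ] + rnorm(n_samples, sd = 0.1)

# Correlation matrix between components (cor works column-wise, hence t()).
adjacency_matrix <- cor(t(measurement_data), use = "all.obs")

# Two-sided p-values from the t-statistic of a Pearson correlation.
upper <- upper.tri(adjacency_matrix)
r <- adjacency_matrix[upper]
t_stat <- r * sqrt((n_samples - 2) / (1 - r^2))
p_values <- 2 * pt(-abs(t_stat), df = n_samples - 2)

# Adjust for multiple testing, then set non-significant edges to zero.
p_adjusted <- p.adjust(p_values, method = "BH")
reduced <- matrix(0, nrow = 5, ncol = 5, dimnames = dimnames(adjacency_matrix))
reduced[upper] <- ifelse(p_adjusted < 0.05, r, 0)
reduced <- reduced + t(reduced)  # restore symmetry; the diagonal stays zero
```

Here the `"BH"` method and the 0.05 threshold mirror the defaults of p_value_adjustment_method and reduction_alpha.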

Usage

generate_reduced_graph(
  adjacency_matrix,
  measurement_data,
  identifiers,
  handling_missing_data = "all.obs",
  reduction_method = "pickHardThreshold",
  r_squared_cutoff = 0.85,
  cut_vector = seq(0.2, 0.8, by = 0.01),
  mean_number_edges = NULL,
  edge_density = NULL,
  p_value_adjustment_method = "BH",
  reduction_alpha = 0.05,
  n_threads = 1,
  parallel_chunk_size = 10^6,
  print_graph_info = TRUE
)

Value

iGraph graph object of the reduced network.
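
As a rough illustration of the returned object, a reduced adjacency matrix can be turned into a weighted, undirected graph with the igraph package. This sketch assumes igraph is installed and uses graph_from_adjacency_matrix, the current name of graph.adjacency; the toy matrix and the chosen options are assumptions, not the settings used inside generate_reduced_graph.

```r
# Illustrative sketch only; options may differ from the package internals.
library(igraph)

# Toy reduced adjacency matrix: zero entries represent dropped edges.
reduced <- matrix(c( 0.0, 0.9,  0.0,
                     0.9, 0.0, -0.7,
                     0.0, -0.7, 0.0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("n1", "n2", "n3"), c("n1", "n2", "n3")))

# Build an undirected graph; non-zero entries become weighted edges.
g <- graph_from_adjacency_matrix(reduced, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

vcount(g)    # number of nodes
ecount(g)    # number of edges kept after reduction
E(g)$weight  # correlation values stored as edge weights
```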

Arguments

adjacency_matrix: [matrix] Adjacency matrix of correlations computed using cor in compute_correlation_matrices
measurement_data: [data.frame] Data frame containing the raw data (e.g. mRNA expression data, protein abundance, etc.) from which the adjacency matrix was computed. Analyzed components (e.g. genes) in rows, samples (e.g. patients) in columns.
identifiers: [data.frame] Data frame containing biological identifiers and the corresponding node ID created in compute_correlation_matrices via create_unique_layer_node_ids. The column containing node IDs has to be named `node_id`.
handling_missing_data: ["all.obs"|"pairwise.complete.obs"] Specifies how missing data are handled during correlation matrix computation. (default: all.obs)
reduction_method: ["pickHardThreshold"|"p_value"] A character string specifying the method to be used for network reduction. `p_value` for hard thresholding based on the statistical significance of the computed correlation. `pickHardThreshold` for a cutoff based on the scale-freeness criterion (calls pickHardThreshold). (default: pickHardThreshold)
r_squared_cutoff: [float] A number indicating the desired minimum scale free topology fitting index R^2 for reduction using pickHardThreshold. (default: 0.85)
cut_vector: [sequence of float] A vector of hard threshold cuts for which the scale free topology fit indices are to be calculated during reduction with pickHardThreshold. (default: seq(0.2, 0.8, by = 0.01))
mean_number_edges: [int] Find a suitable edge weight cutoff employing pickHardThreshold to reduce the network to at most the specified mean number of edges. Attention: If not set to NULL, this parameter overrides the 'r_squared_cutoff' and 'edge_density' parameters. (default: NULL)
edge_density: [float] Find a suitable edge weight cutoff employing pickHardThreshold to reduce the network to at most the specified edge density. Attention: If not set to NULL, this parameter overrides the 'r_squared_cutoff' parameter. (default: NULL)
p_value_adjustment_method: ["holm"|"hochberg"|"hommel"|"bonferroni"|"BH"|"BY"|"fdr"|"none"] String of the correction method applied to p-values. Passed to p.adjust. (default: "BH")
reduction_alpha: [float] A number indicating the significance level for correlation p-values during reduction. Non-significant edges are dropped. (default: 0.05)
n_threads: [int] Number of threads for parallel computation of p-values during p-value reduction. (default: 1)
parallel_chunk_size: [int] Number of p-values in smallest work unit when computing in parallel during network reduction with method `p_value`. (default: 10^6)
print_graph_info: [bool] Whether a summary of the reduced graph should be printed to the console after network generation. (default: TRUE)
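
How cut_vector and edge_density might interact can be sketched in base R by scanning the candidate cuts and recording how many edges survive each one. This is a hypothetical illustration that does not call WGCNA's pickHardThreshold; the toy similarity matrix, the `scan` table, and the reading of "mean number of edges" as the mean node degree are assumptions.

```r
# Illustrative sketch only; WGCNA's pickHardThreshold is not called here.
set.seed(42)
n_nodes <- 30

# Toy symmetric similarity matrix with values in [0, 1].
similarity <- matrix(0, n_nodes, n_nodes)
similarity[upper.tri(similarity)] <- runif(n_nodes * (n_nodes - 1) / 2)
similarity <- similarity + t(similarity)

cut_vector <- seq(0.2, 0.8, by = 0.01)  # default from the Usage section
weights <- similarity[upper.tri(similarity)]
max_edges <- n_nodes * (n_nodes - 1) / 2

# For each candidate cut, count surviving edges and derive density and mean degree.
n_edges <- vapply(cut_vector, function(cut) sum(weights >= cut), numeric(1))
scan <- data.frame(cut = cut_vector,
                   edge_density = n_edges / max_edges,
                   mean_degree = 2 * n_edges / n_nodes)

# Smallest cut whose network stays at or below a target edge density.
target_density <- 0.5
chosen_cut <- min(scan$cut[scan$edge_density <= target_density])
```

Choosing the smallest qualifying cut keeps as many edges as the constraint allows; the same table could be filtered on `mean_degree` to mimic a limit on the mean number of edges.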