estimate_state_adj_matrix: Estimate adjacency matrix for equivalent FLC distributions based on states

Description

This function estimates the adjacency matrix $\mathbf{A}$ of all pairwise equivalent FLC distributions given the states $s_1, \ldots, s_K$. See Details below.

Usage

estimate_state_adj_matrix(states = NULL, FLCs = NULL, pdfs.FLC = NULL, alpha = NULL, 
    distance = function(f, g) return(mean(abs(f - g))))

Arguments

states

vector of length $N$ with entry $i$ being the label $k = 1, \ldots, K$ of PLC $i$

FLCs

$N \times n_f$ matrix of FLCs (only necessary if distance = "KS")

pdfs.FLC

$N \times K$ matrix of all $K$ state-conditional FLC densities evaluated at each FLC $\ell^{+}_i$, $i=1, \ldots, N$ (only necessary if distance = function(f, g) return(...)).

alpha

significance level for testing. Default: alpha = NULL (this will return a p-value matrix if distance = "KS")

distance

either a Kolmogorov-Smirnov test (distance = "KS") or a function metric (e.g. $L_q$ distance). For a distance function, distance requires as input a function of $f$ and $g$ that returns one value.

Default: distance = function(f, g) return(mean(abs(f-g))) $\rightarrow$ $L_1$ distance.
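As a small illustration of the interface described above, the following sketch defines the documented default next to two alternatives. The names dist_L1, dist_L2 and dist_sup are hypothetical and not part of the package; the only requirement taken from this page is that the function accepts two vectors and returns a single non-negative number.

```r
# The documented default: mean absolute difference (L1-type distance).
dist_L1 <- function(f, g) mean(abs(f - g))

# Hypothetical alternatives with the same interface (not part of the package).
dist_L2 <- function(f, g) sqrt(mean((f - g)^2))  # root mean squared difference
dist_sup <- function(f, g) max(abs(f - g))       # largest pointwise difference

# Two toy "density" vectors evaluated at the same three points.
f <- c(0.2, 0.5, 0.3)
g <- c(0.1, 0.5, 0.4)

d_L1 <- dist_L1(f, g)
d_L2 <- dist_L2(f, g)
d_sup <- dist_sup(f, g)
```

Any such function could then be passed on, e.g. as distance = dist_L2.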

Value

A $K \times K$ adjacency matrix with a trimmed version of exp(-distance) or p-values. If alpha is not NULL, the thresholded $0/1$ matrix is returned instead. Note that here $1$ stands for equivalent, i.e. not rejecting the null hypothesis: the matrix is obtained by checking for pval > alpha (rather than the usual pval < alpha).
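The thresholding convention can be made concrete with a minimal base R sketch. The p-values below are made up for illustration and the variable names are hypothetical; this is not the package's internal code.

```r
# Made-up symmetric p-value matrix for K = 3 states (illustrative values only).
pvals <- matrix(c(1.00, 0.40, 0.01,
                  0.40, 1.00, 0.03,
                  0.01, 0.03, 1.00), nrow = 3, byrow = TRUE)
alpha <- 0.05

# 1 = "equivalent" (H0 not rejected), 0 = "different" (H0 rejected).
adj <- (pvals > alpha) * 1
```

With these values, states 1 and 2 are linked, while state 3 is only linked to itself.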

Details and user-defined distance function

The $(i,j)$th element of the adjacency matrix is defined as $$ \mathbf{A}_{ij} = \operatorname{distance}(P(X \mid s_i), P(X \mid s_j)) = \operatorname{distance}(f, g), $$ where distance is either

a metric: in the function space of pdfs $f$ and $g$, or
a two sample test: for $H_0: f=g$, e.g. a Kolmogorov-Smirnov test (distance="KS").
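For the two sample test variant, a rough sketch of how pairwise p-values could be assembled with base R's ks.test is given below. It assumes one-dimensional FLCs that are grouped by state label; how the package itself groups the FLCs and runs the test is not shown on this page, so this is an illustration only.

```r
set.seed(1)
# Toy data: one-dimensional FLCs for N = 300 PLCs in K = 3 states.
states <- rep(1:3, each = 100)
flcs <- c(rnorm(100), rnorm(100), rnorm(100, mean = 3))

K <- length(unique(states))
pvals <- diag(1, K)  # a distribution is trivially equivalent to itself
for (i in seq_len(K - 1)) {
  for (j in (i + 1):K) {
    # Two-sample Kolmogorov-Smirnov test of H0: f = g for states i and j.
    p <- ks.test(flcs[states == i], flcs[states == j])$p.value
    pvals[i, j] <- pvals[j, i] <- p
  }
}
```

States 1 and 2 are drawn from the same distribution here, whereas state 3 is shifted, so its p-values against the other two states should be very small.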

Again we use a functional programming approach and allow the user to specify any valid distance/similarity function distance = function(f, g) return(...).

If distance = "KS" the adjacency matrix contains p-values of a Kolmogorov-Smirnov test or the thresholded versions (if alpha is not NULL); see Value for details.

Otherwise distance is an R function that takes as an input two vectors f and g (e.g. the wKDE estimates for two states), and returns a non-negative, real number to estimate their distance. Default is the $L_1$ distance distance = function(f, g) return(mean(abs(f-g))).
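To make the metric variant concrete, here is a minimal sketch that applies a distance function to every pair of columns of a small density matrix and then maps the result through exp(-distance). The matrix pdfs is a toy stand-in for pdfs.FLC, and comparing columns pairwise in this way is an assumption about the general idea, not the package's actual code; the "trimmed" step mentioned under Value is not reproduced.

```r
# Toy stand-in for pdfs.FLC: N = 4 evaluation points, K = 3 states.
pdfs <- cbind(c(0.1, 0.2, 0.3, 0.4),
              c(0.1, 0.2, 0.3, 0.4),
              c(0.4, 0.3, 0.2, 0.1))
distance <- function(f, g) mean(abs(f - g))

K <- ncol(pdfs)
D <- matrix(0, K, K)
for (i in seq_len(K)) {
  for (j in seq_len(K)) {
    # Distance between the density estimates of states i and j.
    D[i, j] <- distance(pdfs[, i], pdfs[, j])
  }
}

# Turn distances into similarities in (0, 1]; identical columns give 1.
S <- exp(-D)
```

Columns 1 and 2 are identical, so their distance is 0 and their similarity is 1; column 3 is further away from both.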

Examples


# NOT RUN {
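# Random weight matrix: N = 1000 PLCs (rows), K = 10 states (columns)
WW <- matrix(runif(10000), ncol = 10)
WW <- normalize(WW)
# One-dimensional FLCs (n_f = 1), one row per PLC
temp_flcs <- cbind(rnorm(nrow(WW)))
# State-conditional FLC densities estimated from the FLCs and the weights
temp_pdfs.FLC <- estimate_LC_pdfs(temp_flcs, WW)
# Adjacency matrix of KS-test p-values (alpha = NULL, so no thresholding)
AA_ks <- estimate_state_adj_matrix(states = weight_matrix2states(WW), FLCs = temp_flcs, 
    distance = "KS")
# Adjacency matrix based on the default L1 distance between the densities
AA_L1 <- estimate_state_adj_matrix(pdfs.FLC = temp_pdfs.FLC)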

par(mfrow = c(1, 2), mar = c(1, 1, 2, 1))
image2(AA_ks, zlim = c(0, 1), legend = FALSE, main = "Kolmogorov-Smirnov")
image2(AA_L1, legend = FALSE, main = "L1 distance")
# }
