hdMTD (version 0.1.1)

dTV_sample: The total variation distance between distributions

Description

Calculates the total variation distance between distributions conditioned on a given past sequence.

Usage

dTV_sample(S, j, A = NULL, base, lenA = NULL, A_pairs = NULL, x_S)

Value

Returns a vector of total variation distances. Each entry is the distance between a pair of distributions conditioned on the same fixed past x_S and differing only in the symbol at lag j, which ranges over all distinct pairs of elements of A. The output has length equal to the number of unique pairs in A_pairs.
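To make the shape of the returned vector concrete, here is a minimal base-R sketch. It assumes the standard definition of total variation distance for discrete distributions (half the sum of absolute differences) and is not taken from the package source. The helper tv_distance and the matrix cond_dist are hypothetical, with made-up probabilities.

```r
# Total variation distance between two discrete distributions on the same
# state space: half the sum of absolute differences (standard definition).
tv_distance <- function(p, q) {
  stopifnot(length(p) == length(q))
  0.5 * sum(abs(p - q))
}

A <- c(1, 2, 3)

# Hypothetical conditional distributions: row b holds the distribution of the
# next symbol given that the symbol at lag j equals A[b] (past x_S held fixed).
cond_dist <- rbind(
  c(0.5, 0.3, 0.2),
  c(0.2, 0.3, 0.5),
  c(0.1, 0.6, 0.3)
)

# All unique pairs of symbols, one pair per row.
A_pairs <- t(utils::combn(A, 2))

# One distance per pair, in the row order of A_pairs.
dists <- apply(A_pairs, 1, function(pair) {
  tv_distance(cond_dist[match(pair[1], A), ], cond_dist[match(pair[2], A), ])
})
names(dists) <- paste(A_pairs[, 1], A_pairs[, 2], sep = "-")
dists
```

With three symbols there are three unique pairs, so dists has three entries, one per row of A_pairs.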

Arguments

S

A numeric vector of positive integers (or NULL) representing a set of past lags. The distributions whose total variation distance is computed are conditioned on a fixed sequence indexed by S (the user must also supply this sequence through the argument x_S).

j

A positive integer representing a lag in the complement of S. The symbol at lag j varies over the state space A, changing the distribution through this single lag; the size of that change is what this function measures.

A

A vector of unique positive integers with at least two elements, representing the state space. You may leave A=NULL (default) if you provide the arguments lenA and A_pairs (see Details below).

base

A data frame with sequences of elements from A and their transition probabilities. base is meant to be an output from function freqTab(), and must be structured as such. The data frame must contain all required transitions conditioned on x_S (i.e. length(A)^2 rows with sequence x_S). See Details section for further information.

lenA

An integer >= 2, representing length(A). Required if A is not provided.

A_pairs

A two-column matrix with all unique pairs of elements from A. Required if A is not provided.

x_S

A vector of length length(S) or NULL. If S is NULL, x_S will be set to NULL. x_S represents a sequence of symbols from A indexed by S. This sequence is held constant across the conditional distributions being compared, representing the fixed configuration of the past.

Details

This function computes the total variation distance between distributions found in base, which is expected to be the output of the function freqTab(). Therefore, base must follow a specific structure (e.g., column names must match, and a column named qax_Sj, containing transition distributions, must be present). For more details on the output structure of freqTab(), refer to its documentation.

If you provide the state space A, the function calculates: lenA <- length(A) and A_pairs <- t(utils::combn(A, 2)). Alternatively, you can input lenA and A_pairs directly and let A <- NULL, which is useful in loops to improve efficiency.
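As a sketch of the precomputation described above, the two quantities can be built once in base R and reused. The state space here is only an illustrative value.

```r
A <- c(1, 2, 3)

# Computed once, outside any loop; per the Details, these match what the
# function derives itself when A is supplied.
lenA <- length(A)
A_pairs <- t(utils::combn(A, 2))

A_pairs        # one unique pair per row
nrow(A_pairs)  # lenA * (lenA - 1) / 2 pairs
```

In repeated calls these could then be passed as lenA = lenA and A_pairs = A_pairs, leaving A at its NULL default, so the pairs are not rebuilt on every iteration.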

Examples

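library(hdMTD)
# Build the base argument with freqTab(), using the same S and j as in dTV_sample().
pbase <- freqTab(S = c(1, 4), j = 2, A = c(1, 2, 3), countsTab = countsTab(testChains[, 2], d = 5))
dTV_sample(S = c(1, 4), j = 2, A = c(1, 2, 3), base = pbase, x_S = c(2, 3))
# With S = NULL there is no fixed past, so x_S is not supplied.
pbase <- freqTab(S = NULL, j = 1, A = c(1, 2, 3), countsTab = countsTab(testChains[, 2], d = 5))
dTV_sample(S = NULL, j = 1, A = c(1, 2, 3), base = pbase)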