dsmmR (version 1.0.7)

nonparametric_dsmm: Non-parametric Drifting semi-Markov model specification

Description

Creates a non-parametric model specification for a drifting semi-Markov model. Returns an object of class (dsmm_nonparametric, dsmm).

Usage

nonparametric_dsmm(
  model_size,
  states,
  initial_dist,
  degree,
  k_max,
  f_is_drifting,
  p_is_drifting,
  p_dist,
  f_dist
)

Value

Returns an object of the S3 class dsmm_nonparametric, dsmm.

  • dist : List. Contains 2 arrays, passing down from the arguments:

    • p_drift or p_notdrift, corresponding to whether the defined \(p\) transition matrix is drifting or not.

    • f_drift or f_notdrift, corresponding to whether the defined \(f\) sojourn time distribution is drifting or not.

  • initial_dist : Numerical vector. Passing down from the arguments. It contains the initial distribution of the drifting semi-Markov model.

  • states : Character vector. Passing down from the arguments. It contains the state space \(E\).

  • s : Positive integer. It contains the number of states in the state space, \(s = |E|\), which is given in the attribute states.

  • degree : Positive integer. Passing down from the arguments. It contains the polynomial degree \(d\) considered for the drifting of the model.

  • k_max : Numerical value. Passing down from the arguments. It contains the maximum sojourn time, for the drifting semi-Markov model.

  • model_size : Positive integer. Passing down from the arguments. It contains the size of the drifting semi-Markov model \(n\), which represents the length of the embedded Markov chain \((J_{t})_{t\in \{0,\dots,n\}}\), without the last state.

  • f_is_drifting : Logical. Passing down from the arguments. Specifies if \(f\) is drifting or not.

  • p_is_drifting : Logical. Passing down from the arguments. Specifies if \(p\) is drifting or not.

  • Model : Character. Possible values:

    • "Model_1" : Both \(p\) and \(f\) are drifting.

    • "Model_2" : \(p\) is drifting and \(f\) is not drifting.

    • "Model_3" : \(f\) is drifting and \(p\) is not drifting.

  • A_i : Numerical Matrix. Represents the polynomials \(A_i(t)\) with degree \(d\) that are used for solving the system \(MJ = P\). Used for the methods defined for the object. Not printed when viewing the object.
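The relationship between the two logical flags and the Model value listed above can be sketched in a few lines of base R. The helper `model_label` is hypothetical and is not part of dsmmR. Rejecting the case where neither \(p\) nor \(f\) drifts is an assumption of this sketch, made because the list above names no model for it.

```r
# Hypothetical helper (not part of dsmmR): map the two drift flags to the
# model labels listed in the Value section.
model_label <- function(p_is_drifting, f_is_drifting) {
  stopifnot(is.logical(p_is_drifting), length(p_is_drifting) == 1L,
            is.logical(f_is_drifting), length(f_is_drifting) == 1L,
            !is.na(p_is_drifting), !is.na(f_is_drifting))
  if (p_is_drifting && f_is_drifting) {
    "Model_1"
  } else if (p_is_drifting) {
    "Model_2"
  } else if (f_is_drifting) {
    "Model_3"
  } else {
    # Assumption: no label is documented when neither p nor f is drifting,
    # so this sketch refuses that combination.
    stop("At least one of p and f must be drifting.")
  }
}

model_label(TRUE, TRUE)    # both p and f drifting
model_label(TRUE, FALSE)   # only p drifting
model_label(FALSE, TRUE)   # only f drifting
```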

Arguments

model_size

Positive integer that represents the size of the drifting semi-Markov model \(n\). It is equal to the length of a theoretical embedded Markov chain \((J_{t})_{t\in \{0,\dots,n\}}\), without the last state.

states

Character vector that represents the state space \(E\). It has length equal to \(s = |E|\).

initial_dist

Numerical vector of \(s\) probabilities, that represents the initial distribution for each state in the state space \(E\).

degree

Positive integer that represents the polynomial degree \(d\) for the drifting semi-Markov model.

k_max

Positive integer that represents the maximum sojourn time of choice, for the drifting semi-Markov model.

f_is_drifting

Logical. Specifies if \(f\) is drifting or not.

p_is_drifting

Logical. Specifies if \(p\) is drifting or not.

p_dist

Numerical array, that represents the probabilities of the transition matrix \(p\) of the embedded Markov chain \((J_{t})_{t\in \{0,\dots,n\}}\) (it is defined the same way in the parametric_dsmm function). It can be defined in two ways:

  • If \(p\) is not drifting, it has dimensions of \(s \times s\).

  • If \(p\) is drifting, it has dimensions of \(s \times s \times (d+1)\) (see more in Details, Defined Arguments).

f_dist

Numerical array, that represents the probabilities of the conditional sojourn time distributions \(f\). \(0\) is allowed for state transitions that we do not wish to have a sojourn time distribution (e.g. all state transitions to the same state should have \(0\) as their value). It can be defined in two ways:

  • If \(f\) is not drifting, it has dimensions of \(s \times s \times k_{max}\).

  • If \(f\) is drifting, it has dimensions of \(s \times s \times k_{max} \times (d+1)\) (see more in Details, Defined Arguments).
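The dimension rules for p_dist and f_dist can be checked before calling the function. The sketch below uses only base R. The helper `expected_dims` is hypothetical and not part of dsmmR, and the zero-filled arrays are placeholders used only to compare shapes.

```r
# Hypothetical helper (not part of dsmmR): the dim() each array should have,
# following the rules stated for the p_dist and f_dist arguments.
expected_dims <- function(s, d, k_max, p_is_drifting, f_is_drifting) {
  list(
    p_dist = as.integer(if (p_is_drifting) c(s, s, d + 1) else c(s, s)),
    f_dist = as.integer(if (f_is_drifting) c(s, s, k_max, d + 1) else c(s, s, k_max))
  )
}

s <- 3
d <- 2
k_max <- 3

# Example: p is drifting, f is not drifting.
dims <- expected_dims(s, d, k_max, p_is_drifting = TRUE, f_is_drifting = FALSE)

# Placeholder arrays (all zeros), used here only for their shapes.
p_dist <- array(0, dim = c(s, s, d + 1))
f_dist <- array(0, dim = c(s, s, k_max))

p_ok <- identical(dim(p_dist), dims$p_dist)
f_ok <- identical(dim(f_dist), dims$f_dist)
p_ok
f_ok
```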

Details

Defined Arguments

For the non-parametric case, we explicitly define:

  1. The transition matrix of the embedded Markov chain \((J_{t})_{t\in \{0,\dots,n\}}\), given in the attribute p_dist:

    • If \(p\) is not drifting, it contains the values: $$p(u, v), \forall u, v \in E,$$ given in an array with dimensions of \(s \times s\), where the first dimension corresponds to the previous state \(u\) and the second dimension corresponds to the current state \(v\).

    • If \(p\) is drifting then, for \(i \in\{0,\dots,d\}\), it contains the values: $$p_{\frac{i}{d}}(u,v), \forall u, v \in E,$$ given in an array with dimensions of \(s \times s \times (d + 1)\), where the first and second dimensions are defined as in the non-drifting case, and the third dimension corresponds to the \(d+1\) different matrices \(p_{\frac{i}{d}}.\)

  2. The conditional sojourn time distribution, given in the attribute f_dist:

    • If \(f\) is not drifting, it contains the values: $$f(u,v,l), \forall u,v\in E,\forall l\in \{1,\dots,k_{max}\},$$ given in an array with dimensions of \(s \times s \times k_{max}\), where the first dimension corresponds to the previous state \(u\), the second dimension corresponds to the current state \(v\), and the third dimension corresponds to the sojourn time \(l\).

    • If \(f\) is drifting then, for \(i\in \{0,\dots,d\}\), it contains the values: $$f_{\frac{i}{d}}(u,v,l),\forall u,v\in E, \forall l\in \{1,\dots,k_{max}\},$$ given in an array with dimensions of \(s \times s \times k_{max} \times (d + 1)\), where the first, second and third dimensions are defined as in the non-drifting case, and the fourth dimension corresponds to the \(d+1\) different arrays \(f_{\frac{i}{d}}.\)
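To make the indexing convention concrete, here is a minimal base R sketch with a drifting \(p\) of degree \(d = 1\) and a non-drifting \(f\). All numeric values and the variable names are illustrative assumptions. The checks apply the constraints that each row of \(p\) sums to 1 over \(v\), and that \(f\) sums to 1 over \(l\) for every transition that carries a sojourn time distribution.

```r
states <- c("AA", "AC", "CC")
s <- length(states)
d <- 1
k_max <- 3

# Two s x s transition matrices, p_{0/d} and p_{d/d} (illustrative values).
p_0 <- matrix(c(0,   0.1, 0.9,
                0.5, 0,   0.5,
                0.3, 0.7, 0), ncol = s, byrow = TRUE)
p_1 <- matrix(c(0,   0.6, 0.4,
                0.7, 0,   0.3,
                0.6, 0.4, 0), ncol = s, byrow = TRUE)

# Stack them along the third dimension: p_dist[u, v, i + 1] is p_{i/d}(u, v).
p_dist <- array(c(p_0, p_1), dim = c(s, s, d + 1),
                dimnames = list(states, states, NULL))
p_dist["AA", "AC", 1]   # p_{0/d}(AA, AC)

# Non-drifting f: f_dist[u, v, l] is f(u, v, l); the diagonal stays 0.
off_diag <- row(diag(s)) != col(diag(s))
probs <- c(0.2, 0.3, 0.5)   # one sojourn time distribution, reused for all (u, v)
f_dist <- array(0, dim = c(s, s, k_max))
for (l in seq_len(k_max)) {
  f_dist[, , l] <- ifelse(off_diag, probs[l], 0)
}

tol <- 1e-8
# Rows of every p_{i/d} must sum to 1 over v.
p_rows_ok <- all(abs(apply(p_dist, c(1, 3), sum) - 1) < tol)
# f must sum to 1 over l off the diagonal and be 0 on it.
f_sums <- apply(f_dist, c(1, 2), sum)
f_ok <- all(abs(f_sums[off_diag] - 1) < tol) && all(f_sums[!off_diag] == 0)
p_rows_ok
f_ok
```

Because R fills arrays in column-major order, concatenating whole matrices with `c()` and reshaping with `array()` keeps each matrix intact as one slice of the last dimension.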

References

Barbu, V. S., Limnios, N. (2008). Semi-Markov Chains and Hidden Semi-Markov Models Toward Applications: Their Use in Reliability and DNA Analysis. Lecture Notes in Statistics, vol. 191. New York: Springer.

Vergne, N. (2008). Drifting Markov models with Polynomial Drift and Applications to DNA Sequences. Statistical Applications in Genetics and Molecular Biology, 7(1).

Barbu, V. S., Vergne, N. (2019). Reliability and survival analysis for drifting Markov models: modeling and estimation. Methodology and Computing in Applied Probability, 21(4), 1407-1429.

See Also

Methods applied to this object: simulate.dsmm, get_kernel.

For the parametric drifting semi-Markov model specification: parametric_dsmm.

For the theoretical background of drifting semi-Markov models: dsmmR.

Examples

# Setup.
states <- c("AA", "AC", "CC")
s <- length(states)
d <- 2
k_max <- 3

# ===========================================================================
# Defining non-parametric drifting semi-Markov models.
# ===========================================================================

# ---------------------------------------------------------------------------
# Defining distributions for Model 1 - both p and f are drifting.
# ---------------------------------------------------------------------------

# `p_dist` has dimensions of: (s, s, d + 1).
# Sums over v must be 1 for all u and i = 0, ..., d.
p_dist_1 <- matrix(c(0,   0.1, 0.9,
                     0.5, 0,   0.5,
                     0.3, 0.7, 0),
                   ncol = s, byrow = TRUE)

p_dist_2 <- matrix(c(0,   0.6, 0.4,
                     0.7, 0,   0.3,
                     0.6, 0.4, 0),
                   ncol = s, byrow = TRUE)

p_dist_3 <- matrix(c(0,   0.2, 0.8,
                     0.6, 0,   0.4,
                     0.7, 0.3, 0),
                   ncol = s, byrow = TRUE)

# Get `p_dist` as an array of p_dist_1, p_dist_2 and p_dist_3.
p_dist <- array(c(p_dist_1, p_dist_2, p_dist_3),
                dim = c(s, s, d + 1))

# `f_dist` has dimensions of: (s, s, k_max, d + 1).
# First f distribution. Dimensions: (s, s, k_max).
# Sums over l must be 1, for every u, v and i = 0, ..., d.
f_dist_1_l_1 <- matrix(c(0,   0.2, 0.7,
                         0.3, 0,   0.4,
                         0.2, 0.8, 0),
                       ncol = s, byrow = TRUE)

f_dist_1_l_2 <- matrix(c(0,   0.3,  0.2,
                         0.2, 0,    0.5,
                         0.1, 0.15, 0),
                       ncol = s, byrow = TRUE)

f_dist_1_l_3 <- matrix(c(0,   0.5,  0.1,
                         0.5, 0,    0.1,
                         0.7, 0.05, 0),
                       ncol = s, byrow = TRUE)
# Get f_dist_1
f_dist_1 <- array(c(f_dist_1_l_1, f_dist_1_l_2, f_dist_1_l_3),
                  dim = c(s, s, k_max))

# Second f distribution. Dimensions: (s, s, k_max)
f_dist_2_l_1 <- matrix(c(0,   1/3, 0.4,
                         0.3, 0,   0.4,
                         0.2, 0.1, 0),
                       ncol = s, byrow = TRUE)

f_dist_2_l_2 <- matrix(c(0,   1/3, 0.4,
                         0.4, 0,   0.2,
                         0.3, 0.4, 0),
                       ncol = s, byrow = TRUE)

f_dist_2_l_3 <- matrix(c(0,   1/3, 0.2,
                         0.3, 0,   0.4,
                         0.5, 0.5, 0),
                       ncol = s, byrow = TRUE)

# Get f_dist_2
f_dist_2 <- array(c(f_dist_2_l_1, f_dist_2_l_2, f_dist_2_l_3),
                  dim = c(s, s, k_max))

# Third f distribution. Dimensions: (s, s, k_max)
f_dist_3_l_1 <- matrix(c(0,    0.3, 0.3,
                         0.3,  0,   0.5,
                         0.05, 0.1, 0),
                       ncol = s, byrow = TRUE)

f_dist_3_l_2 <- matrix(c(0,   0.2, 0.6,
                         0.3, 0,   0.35,
                         0.9, 0.2, 0),
                       ncol = s, byrow = TRUE)

f_dist_3_l_3 <- matrix(c(0,    0.5, 0.1,
                         0.4,  0,   0.15,
                         0.05, 0.7, 0),
                       ncol = s, byrow = TRUE)

# Get f_dist_3
f_dist_3 <- array(c(f_dist_3_l_1, f_dist_3_l_2, f_dist_3_l_3),
                  dim = c(s, s, k_max))

# Get f_dist as an array of f_dist_1, f_dist_2 and f_dist_3.
f_dist <- array(c(f_dist_1, f_dist_2, f_dist_3),
                dim = c(s, s, k_max, d + 1))

# ---------------------------------------------------------------------------
# Non-Parametric object for Model 1.
# ---------------------------------------------------------------------------

obj_nonpar_model_1 <- nonparametric_dsmm(
    model_size = 8000,
    states = states,
    initial_dist = c(0.3, 0.5, 0.2),
    degree = d,
    k_max = k_max,
    p_dist = p_dist,
    f_dist = f_dist,
    p_is_drifting = TRUE,
    f_is_drifting = TRUE
)

# p drifting array.
p_drift <- obj_nonpar_model_1$dist$p_drift
p_drift

# f distribution.
f_drift <- obj_nonpar_model_1$dist$f_drift
f_drift

# ---------------------------------------------------------------------------
# Defining Model 2 - p is drifting, f is not drifting.
# ---------------------------------------------------------------------------

# p_dist has the same dimensions as in Model 1: (s, s, d + 1).
p_dist_model_2 <- array(c(p_dist_1, p_dist_2, p_dist_3),
                        dim = c(s, s, d + 1))

# `f_dist` has dimensions of: (s, s, k_max).
f_dist_model_2 <- f_dist_2


# ---------------------------------------------------------------------------
# Non-Parametric object for Model 2.
# ---------------------------------------------------------------------------

obj_nonpar_model_2 <- nonparametric_dsmm(
    model_size = 10000,
    states = states,
    initial_dist = c(0.7, 0.1, 0.2),
    degree = d,
    k_max = k_max,
    p_dist = p_dist_model_2,
    f_dist = f_dist_model_2,
    p_is_drifting = TRUE,
    f_is_drifting = FALSE
)

# p drifting array.
p_drift <- obj_nonpar_model_2$dist$p_drift
p_drift

# f distribution array.
f_notdrift <- obj_nonpar_model_2$dist$f_notdrift
f_notdrift


# ---------------------------------------------------------------------------
# Defining Model 3 - f is drifting, p is not drifting.
# ---------------------------------------------------------------------------


# `p_dist` is not drifting, so it has dimensions of: (s, s).
p_dist_model_3 <- p_dist_3


# `f_dist` has the same dimensions as in Model 1: (s, s, k_max, d + 1).
f_dist_model_3 <- array(c(f_dist_1, f_dist_2, f_dist_3),
                        dim = c(s, s, k_max, d + 1))


# ---------------------------------------------------------------------------
# Non-Parametric object for Model 3.
# ---------------------------------------------------------------------------

obj_nonpar_model_3 <- nonparametric_dsmm(
    model_size = 10000,
    states = states,
    initial_dist = c(0.3, 0.4, 0.3),
    degree = d,
    k_max = k_max,
    p_dist = p_dist_model_3,
    f_dist = f_dist_model_3,
    p_is_drifting = FALSE,
    f_is_drifting = TRUE
)

# p distribution matrix.
p_notdrift <- obj_nonpar_model_3$dist$p_notdrift
p_notdrift

# f distribution array.
f_drift <- obj_nonpar_model_3$dist$f_drift
f_drift

# ===========================================================================
# Using methods for non-parametric objects.
# ===========================================================================

# The objects are non-parametric, so the results are named accordingly.
kernel_nonparametric <- get_kernel(obj_nonpar_model_3)
str(kernel_nonparametric)

sim_seq_nonpar <- simulate(obj_nonpar_model_3, nsim = 50)
str(sim_seq_nonpar)