dynamichazard (version 0.3.0)

ddhazard: Function to fit dynamic discrete hazard models

Description

Function to fit dynamic discrete hazard models using state space models

Usage

ddhazard(formula, data, model = "logit", by, max_T, id, a_0, Q_0, Q = Q_0,
  order = 1, weights, control = list(), verbose = F)

Arguments

formula

coxph-like formula with Surv(tstart, tstop, event) on the left-hand side of ~

data

Data frame or environment containing the outcome and covariates

model

Either "logit", "exp_clip_time_w_jump", "exp_clip_time" or "exp_bin". "logit" fits the discrete time model with the logistic link function. The three other values fit the continuous time model, each with a different estimation method (see the ddhazard vignette for details of the methods)

by

Interval length of the bins in which parameters are fixed

max_T

End of the last interval. The last stop time with an event is selected if the parameter is omitted

id

Vector of ids for each row of the design matrix

a_0

Initial coefficient vector \(a_0\) for the first iteration (optional). The default is the estimates from a static model (see static_glm)

Q_0

Covariance matrix for the prior distribution

Q

Initial covariance matrix for the state equation

order

Order of the random walk

weights

Weights to use if e.g. a skewed sample is used

control

List of control variables (see details below)

verbose

TRUE if you want status messages during execution

Value

A list with class fahrmeier_94. The list contains:

formula

The passed formula

state_vecs

2D matrix with the estimated state vectors (regression parameters) in each bin

state_vars

3D array with smoothed variance estimates for each state vector

lag_one_cov

3D array with the lagged correlation matrix for each change in the state vector. Only present when the model is logit and the method is EKF

n_risk

The number of observations in each interval

times

The interval borders

risk_set

The object from get_risk_obj if saved

data

The data argument if saved

id

ids used to match rows in data to individuals

order

Order of the random walk

F_

Matrix that maps the transition from one state vector to the next

method

Method used in the E-step

est_Q_0

TRUE if Q_0 was estimated in the EM-algorithm

hazard_func

Hazard function

hazard_first_deriv

First derivative of the hazard function with respect to the linear predictor
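
To make the last two elements concrete, the sketch below writes out the hazard and its first derivative for the logit case in base R, assuming the hazard is the inverse logit of the linear predictor. The names logit_hazard and logit_hazard_deriv are illustrative and are not the objects stored in the returned list.

```r
# Illustrative stand-ins for the logit case; not the package's own objects.
logit_hazard <- function(eta) 1 / (1 + exp(-eta))

# Derivative of the inverse logit with respect to eta: h(eta) * (1 - h(eta))
logit_hazard_deriv <- function(eta) {
  h <- logit_hazard(eta)
  h * (1 - h)
}

eta <- c(-2, 0, 2)          # example values of the linear predictor
h <- logit_hazard(eta)
dh <- logit_hazard_deriv(eta)

# Check the analytic derivative against a central finite difference.
step <- 1e-6
dh_numeric <- (logit_hazard(eta + step) - logit_hazard(eta - step)) / (2 * step)
max_abs_diff <- max(abs(dh - dh_numeric))
```

The finite difference check is only there to show that the two functions are consistent with each other.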

Control

The control argument allows you to pass a list to select additional parameters. See the vignette 'ddhazard' for more information on hyper parameters. Elements left out of the list take their default values

method

The method to use in the E-step. Either "EKF" for the Extended Kalman Filter, "UKF" for the Unscented Kalman Filter, "SMA" for the sequential posterior mode approximation method or "GMA" for the global mode approximation method. "EKF" is the default

LR

Learning rate for the Extended Kalman filter

NR_eps

Tolerance for the Extended Kalman filter. Default is NULL, which means that no extra iterations are made in the correction step

alpha

Hyper parameter \(\alpha\) in the Unscented Kalman Filter

beta

Hyper parameter \(\beta\) in the Unscented Kalman Filter

kappa

Hyper parameter \(\kappa\) in the Unscented Kalman Filter

n_max

Maximum number of iterations in the EM-algorithm

eps

Tolerance parameter for the EM-algorithm

est_Q_0

TRUE if you want the EM-algorithm to estimate Q_0. Default is FALSE

save_risk_set

TRUE if you want to save the list from get_risk_obj used to estimate the model. It may be needed for later calls to residuals, plot and logLike. Can be set to FALSE to save memory

save_data

TRUE if you want to save the data argument. It may be needed for later calls to residuals, plot and logLike. Can be set to FALSE to save memory

denom_term

Term added to denominators in either the EKF or UKF

fixed_parems_start

Starting value for fixed terms

fixed_terms_method

The method used to estimate the fixed effects. Either 'M_step' or 'E_step' for estimation in the M-step or E-step respectively

Q_0_term_for_fixed_E_step

The diagonal value of the initial covariance matrix, Q_0, for the fixed effects if fixed effects are estimated in the E-step

eps_fixed_parems

Tolerance used in the Fisher scoring algorithm for the fixed effects in the M-step

permu

TRUE if the risk sets should be permuted before computation. This is TRUE by default for the posterior mode approximation method and FALSE for all other methods

posterior_version

The implementation version of the posterior approximation method. Either "woodbury" or "cholesky"

GMA_max_rep

Maximum number of iterations in the correction step if method = 'GMA'

GMA_NR_eps

Tolerance for the convergence criteria for the relative change in the norm of the coefficients in the correction step if method = 'GMA'
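
A control list might be put together as in the sketch below. The element names are taken from the entries above; every value is a placeholder chosen for illustration and not a recommended or default setting.

```r
# Element names come from the Control section; the values are placeholders.
ctrl <- list(
  method = "GMA",        # E-step method: "EKF", "UKF", "SMA" or "GMA"
  GMA_max_rep = 25,      # cap on correction step iterations for "GMA"
  GMA_NR_eps = 1e-4,     # tolerance in the "GMA" correction step
  n_max = 100,           # cap on EM iterations
  eps = 1e-3,            # EM tolerance
  est_Q_0 = FALSE,       # keep Q_0 fixed
  save_risk_set = TRUE,  # keep the risk set for later calls
  save_data = TRUE       # keep the data for later calls
)

# The list would then be passed as ddhazard(..., control = ctrl).
```

Elements that are left out, such as LR or denom_term here, are not set by this list.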

Details

This function can be used to estimate a binary regression where the regression parameters follow a random walk of a given order. The order is specified by the order argument. First and second order random walks are implemented. The regression parameters are updated at times by, 2by, ..., max_T. See the vignette 'ddhazard' for more details
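
The update times and the two random walk orders can be sketched in base R as follows. This is a textbook construction under stated assumptions: the first bin is assumed to start at time 0, the number of coefficients p is arbitrary, and the stacked layout of F_order2 is one common convention that need not match the F_ element returned by ddhazard.

```r
by <- 1
max_T <- 5
bin_borders <- seq(0, max_T, by = by)   # 0, by, 2by, ..., max_T (start at 0 assumed)
n_bins <- length(bin_borders) - 1

p <- 2  # number of time-varying coefficients (illustrative)

# First order: alpha_t = alpha_{t-1} + noise, so the transition is the identity.
F_order1 <- diag(p)

# Second order: alpha_t = 2 * alpha_{t-1} - alpha_{t-2} + noise.
# Stack (alpha_t, alpha_{t-1}) into one state vector of length 2p.
F_order2 <- rbind(
  cbind(2 * diag(p), -diag(p)),
  cbind(diag(p), matrix(0, p, p)))

# One noise-free step of the second order walk continues the linear trend.
state <- c(1.0, 0.5,   # alpha_{t-1}
           0.8, 0.4)   # alpha_{t-2}
next_state <- drop(F_order2 %*% state)
```

With no noise, a first order walk keeps the coefficients constant, while a second order walk extrapolates the most recent change in each coefficient.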

All filter methods need a state covariance matrix Q_0 and a state vector a_0. If a_0 is not supplied, an estimate from a time-invariant model is used (the same model you would get from the static_glm function). A diagonal matrix with large entries is recommended for Q_0. What counts as large depends on the data set and model. A covariance matrix Q for the first iteration is also needed, and a diagonal matrix with small values is recommended for it. Q, a_0 and optionally Q_0 are estimated with an EM-algorithm
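
A call that follows this advice might look like the sketch below. It assumes the dynamichazard and survival packages are installed and uses the pbc data set from survival purely as an example. The values for Q_0, Q, by and max_T and the choice of method are illustrative and may need tuning for other data.

```r
n_coef <- 2L                 # intercept + log(bili)
Q_0 <- diag(10, n_coef)      # large diagonal entries (illustrative)
Q   <- diag(0.01, n_coef)    # small diagonal starting value (illustrative)

# The fit only runs when both packages are installed.
if (requireNamespace("dynamichazard", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)          # provides Surv()
  dat <- survival::pbc
  dat$tstart <- 0            # every individual is followed from time 0

  fit <- dynamichazard::ddhazard(
    Surv(tstart, time, status == 2) ~ log(bili),
    data = dat, id = dat$id,
    model = "logit", by = 100, max_T = 3600,
    Q_0 = Q_0, Q = Q, order = 1,
    control = list(method = "GMA"))

  str(fit$state_vecs)        # estimated coefficients over the bins
}
```

a_0 is left out on purpose so that the starting value comes from the static model.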

The model is specified through the model argument; see its description in the Arguments section above for details. In the logistic model, outcomes are binned into the intervals. Be aware that information can be lost due to binning. For the logit model, it is key that the id argument is provided if individuals in the data set have time-varying covariates. The exponential models use an exponential model for the arrival times, so no information is lost due to binning
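
The loss of information from binning and the role of id can be illustrated with base R alone. The bin borders, event times and the small data frame below are made up for the example, and the bins are assumed to be left-open intervals starting at 0.

```r
by <- 1
max_T <- 3
borders <- seq(0, max_T, by = by)

# Three events; the first two occur at different times in the same bin.
event_time <- c(1.1, 1.9, 2.5)

# Bin index k means the event fell in (borders[k], borders[k + 1]].
bin <- findInterval(event_time, borders, left.open = TRUE)

# After binning, the first two events cannot be told apart.
same_bin <- bin[1] == bin[2]

# Start-stop rows for one individual whose covariate x changes at time 1.5.
# The shared id is what ties the two rows to the same individual.
rows <- data.frame(
  id = c(1, 1),
  tstart = c(0, 1.5),
  tstop = c(1.5, 2.5),
  event = c(0, 1),
  x = c(0.2, 0.7))
n_rows_per_id <- table(rows$id)
```

A continuous time model would instead use the exact times 1.1 and 1.9, which is why the exponential models avoid this particular loss.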

It is recommended to see the Shiny app demo for this function by calling ddhazard_app()

References

Fahrmeir, Ludwig. Dynamic modelling and penalized likelihood estimation for discrete time survival data. Biometrika 81.2 (1994): 317-330.

Durbin, James, and Siem Jan Koopman. Time series analysis by state space methods. No. 38. Oxford University Press, 2012.

See Also

plot, residuals, predict, static_glm, ddhazard_app, ddhazard_boot