# get_survival_case_weights_and_data

##### Get data.frame for Discrete Time Survival Models

Function used to get a `data.frame` with weights for a static fit of survival data.

##### Usage

```
get_survival_case_weights_and_data(formula, data, by, max_T, id,
  init_weights, risk_obj, use_weights = T, is_for_discrete_model = T,
  c_outcome = "Y", c_weights = "weights", c_end_t = "t")
```

##### Arguments

- formula
`coxph`-like formula with `Surv(tstart, tstop, event)` on the left-hand side of `~`.

- data
`data.frame` or environment containing the outcome and covariates.

- by
interval length of the bins in which parameters are fixed.

- max_T
end of the last interval.

- id
vector of ids for each row of the design matrix.

- init_weights
weights for the rows in `data`. Useful e.g. with skewed sampling.

- risk_obj
a pre-computed result from `get_risk_obj`. Will be used to skip some computations.

- use_weights
`TRUE` if weights should be used. See details.

- is_for_discrete_model
`TRUE` if the model is a discrete hazard model, such as the logistic model.

- c_outcome, c_weights, c_end_t
alternative names to use for the added columns described in the return section. Useful if you already have a column named `Y`, `t` or `weights`.

##### Details

This function is used to get the `data.frame` for e.g. a `glm` fit that is comparable to a `ddhazard` fit in the sense that it is a static version. For example, say that we bin our time periods into `(0,1]`, `(1,2]` and `(2,3]`. Next, consider an individual who dies at time 2.5. He should be a control in the first two bins and a case in the last bin. Thus, the rows in the final data frame for this individual are `c(Y = 1, ..., weights = 1)` and `c(Y = 0, ..., weights = 2)`, where `Y` is the outcome, `...` is the covariates and `weights` is the weights for the regression. Consider another individual who does not die and whom we observe for all three periods. He will yield one row with `c(Y = 0, ..., weights = 3)`.
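The weighting logic for the two individuals can be sketched in base R. This is only an illustration under simplifying assumptions (follow-up starts at time 0 and covariates are time-invariant); `weighted_rows` is a hypothetical helper, not a function in `dynamichazard`, and the covariate columns are left out.

```r
# Hypothetical helper (not part of dynamichazard): weighted rows for one
# individual who is followed from time 0 and has time-invariant covariates.
weighted_rows <- function(stop_time, event, by = 1, max_T = 3) {
  # number of bins in which the individual is at risk
  n_bins <- ceiling(min(stop_time, max_T) / by)
  # an event only counts if it happens no later than max_T
  is_case <- event == 1 && stop_time <= max_T
  if (is_case) {
    # one case row for the last bin, one control row for the earlier bins
    out <- data.frame(Y = c(1, 0), weights = c(1, n_bins - 1))
    out[out$weights > 0, , drop = FALSE]
  } else {
    # a single control row that counts once for every bin at risk
    data.frame(Y = 0, weights = n_bins)
  }
}

weighted_rows(stop_time = 2.5, event = 1)  # dies at 2.5: case row plus control row
weighted_rows(stop_time = 3,   event = 0)  # survives all three bins: one control row
```

Collapsing the control bins into a single weighted row keeps the data frame small while leaving the likelihood of a weighted regression unchanged.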

This function uses logic similar to that of `ddhazard` for individuals with time-varying covariates (see the vignette `vignette("ddhazard", "dynamichazard")` for details).

If `use_weights = FALSE`, then the two previously mentioned individuals will yield three rows each. The first individual will have `c(Y = 0, t = 1, ..., weights = 1)`, `c(Y = 0, t = 2, ..., weights = 1)` and `c(Y = 1, t = 3, ..., weights = 1)`, while the latter will have the three rows `c(Y = 0, t = 1, ..., weights = 1)`, `c(Y = 0, t = 2, ..., weights = 1)` and `c(Y = 0, t = 3, ..., weights = 1)`. This kind of data frame is useful if you want to make a fit with e.g. the `gam` function in the `mgcv` package as described in Tutz and Schmid (2016).
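The one-row-per-bin layout can be sketched in base R in the same spirit. Again, this is a minimal illustration that assumes follow-up from time 0 and time-invariant covariates; `expand_rows` is a hypothetical helper and not part of `dynamichazard`.

```r
# Hypothetical helper (not part of dynamichazard): one row per bin in which
# the individual is at risk, each with weight 1.
expand_rows <- function(stop_time, event, by = 1, max_T = 3) {
  # number of bins in which the individual is at risk
  n_bins <- ceiling(min(stop_time, max_T) / by)
  # stop time of each bin
  t_end <- seq_len(n_bins) * by
  # the outcome is 0 in every bin except the one containing the event
  Y <- rep(0, n_bins)
  if (event == 1 && stop_time <= max_T) Y[n_bins] <- 1
  data.frame(Y = Y, t = t_end, weights = 1)
}

expand_rows(stop_time = 2.5, event = 1)  # Y is 0, 0, 1 for t = 1, 2, 3
expand_rows(stop_time = 3,   event = 0)  # Y is 0, 0, 0 for t = 1, 2, 3
```

Because every row keeps its own bin stop time, `t` can enter the model as a covariate, for instance through a smooth term such as `s(t)` in `mgcv::gam`.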

##### Value

Returns a `data.frame` where the following is added (column names will differ if you specified them): column `Y` for the binary outcome, column `weights` for weights of each row and additional rows if applicable. A column `t` is added for the stop time of the bin if `use_weights = FALSE`. An element `Y` with the used `Surv` object is added if `is_for_discrete_model = FALSE`.
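As a rough sketch of how the `Y` and `weights` columns might be used, the following fits a static logistic model with `glm` on a hand-built data frame in the layout described above. The data values are made up for illustration and are not output of `get_survival_case_weights_and_data`; the second fit on the expanded rows is only there to show that the weighted and the unweighted layouts give the same estimates.

```r
# Hand-built data in the weighted layout (hypothetical values):
# five individuals with a binary covariate x1, binned into three intervals.
weighted <- data.frame(
  Y       = c(1, 0, 0, 1, 0, 1, 0),
  x1      = c(0, 0, 0, 1, 1, 1, 1),
  weights = c(1, 2, 3, 1, 1, 1, 3))

# static logistic fit using the case weights
fit_weighted <- glm(Y ~ x1, family = binomial, data = weighted,
                    weights = weights)

# same fit on the expanded data: repeat each row `weights` times
expanded <- weighted[rep(seq_len(nrow(weighted)), weighted$weights), ]
fit_expanded <- glm(Y ~ x1, family = binomial, data = expanded)

# the two sets of coefficients agree up to numerical tolerance
all.equal(coef(fit_weighted), coef(fit_expanded))
```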

##### References

Tutz, Gerhard, and Matthias Schmid. *Nonparametric Modeling and Smooth Effects*. Modeling Discrete Time-to-Event Data. Springer International Publishing, 2016. 105-127.

##### Examples

```
# NOT RUN {
library(dynamichazard)

# small toy example with time-varying covariates
dat <- data.frame(
  id     = c(   1,    1, 2,     2),
  tstart = c(   0,    4, 0,     2),
  tstop  = c(   4,    6, 2,     6),
  event  = c(   0,    1, 0,     0),
  x1     = c(1.09, 1.29, 0, -1.16))

# weighted rows (use_weights defaults to TRUE)
get_survival_case_weights_and_data(
  Surv(tstart, tstop, event) ~ x1, dat, by = 1, id = dat$id)$X

# unweighted rows with the bin stop time in column t
get_survival_case_weights_and_data(
  Surv(tstart, tstop, event) ~ x1, dat, by = 1, id = dat$id,
  use_weights = FALSE)$X
# }
```

*Documentation reproduced from package dynamichazard, version 0.6.5, License: GPL-2*