# get_survival_case_weights_and_data

##### Static GLM fit for survival models

Function used to get the design matrix and weights for a static fit for survival models where observations are binned into intervals.

##### Usage

```
get_survival_case_weights_and_data(formula, data, by, max_T, id, init_weights,
risk_obj, use_weights = T, is_for_discrete_model = T, c_outcome = "Y",
c_weights = "weights", c_end_t = "t")
```

##### Arguments

- formula
`coxph`-like formula with `Surv(tstart, tstop, event)` on the left-hand side of `~`

- data
Data frame or environment containing the outcome and co-variates

- by
Length of each interval that cases are binned into

- max_T
The end time of the last bin

- id
The id for each row in `data`. This is important when variables are time varying

- init_weights
Weights for the rows in `data`. Useful with skewed sampling and will be used when computing the final weights

- risk_obj
A pre-computed result from `get_risk_obj`. Will be used to skip some computations

- use_weights
`TRUE` if weights should be used. See details

- is_for_discrete_model
`TRUE` if the model is for a discrete hazard model like the logistic model. Affects how deaths are included when individuals have time varying coefficients

- c_outcome, c_weights, c_end_t
Alternative names to use for the added columns described in the Value section. Useful if you already have a column named `Y`, `t` or `weights`

##### Details

This function is used to get the data frame for e.g. a `glm` fit that is comparable to a `ddhazard` fit in the sense that it is a static version. For example, say that we bin our time periods into `(0,1]`, `(1,2]` and `(2,3]`. Next, consider an individual who dies at time 2.5. He should be a control in the first two bins and a case in the last bin. Thus, the rows in the final data frame for this individual are `c(Y = 1, ..., weights = 1)` and `c(Y = 0, ..., weights = 2)`, where `Y` is the outcome, `...` is the co-variates and `weights` is the weights for the regression. Consider another individual who does not die and whom we observe for all three periods. He will yield one row with `c(Y = 0, ..., weights = 3)`.
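The weighting logic for the two individuals in this example can be sketched in base R. This is only an illustration and not the package's implementation: `collapse_individual` is a hypothetical helper, co-variates are left out, and it assumes that an individual who does not die is observed to the end of a bin.

```r
# Hypothetical helper (not part of dynamichazard): collapse one individual's
# follow-up into weighted rows for bins of length `by` that end at `max_T`.
collapse_individual <- function(stop_time, event, by = 1, max_T = 3) {
  bin_ends <- seq(by, max_T, by = by)   # right end points: 1, 2, 3
  bin_starts <- bin_ends - by           # left end points:  0, 1, 2
  n_bins <- sum(bin_starts < stop_time) # number of bins the individual enters
  if (event) {
    # Control in every bin but the last one entered, case in that last bin
    out <- data.frame(Y = c(0, 1), weights = c(n_bins - 1, 1))
    out[out$weights > 0, ]              # drop the control row if there is none
  } else {
    # Control in all bins, collapsed into a single weighted row
    data.frame(Y = 0, weights = n_bins)
  }
}

# Individual who dies at time 2.5: one control row with weight 2, one case row
collapse_individual(stop_time = 2.5, event = TRUE)

# Individual observed for all three bins without dying: one row with weight 3
collapse_individual(stop_time = 3, event = FALSE)
```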

This function uses similar logic to `ddhazard` for individuals with time varying co-variates (see the vignette "ddhazard" for details).

If `use_weights = FALSE` then the two individuals will yield three rows each. The first individual will have `c(Y = 0, t = 1, ..., weights = 1)`, `c(Y = 0, t = 2, ..., weights = 1)` and `c(Y = 1, t = 3, ..., weights = 1)`, while the latter will have the three rows `c(Y = 0, t = 1, ..., weights = 1)`, `c(Y = 0, t = 2, ..., weights = 1)` and `c(Y = 0, t = 3, ..., weights = 1)`. This kind of data frame is useful if you want to make a fit with e.g. the `gam` function in the `mgcv` package as described in Tutz and Schmid (2016) (see References).
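For a static `glm` fit, the one-row-per-bin layout and the collapsed, weighted layout carry the same information. The sketch below uses simulated data and base R only, not the package function, to check this; the simulation setup and every variable name in it are assumptions made for illustration.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                        # a single time-invariant co-variate

# Simulate discrete-time outcomes over the bins (0,1], (1,2] and (2,3]
hazard <- plogis(-1 + 0.5 * x)       # per-bin probability of dying
death_bin <- rgeom(n, hazard) + 1    # bin in which the individual would die
event <- death_bin <= 3              # deaths after bin 3 are not observed
last_bin <- pmin(death_bin, 3)       # last bin in which we observe each one

# Long layout (as with use_weights = FALSE): one row per individual and bin
long <- do.call(rbind, lapply(seq_len(n), function(i) {
  t <- seq_len(last_bin[i])
  data.frame(Y = as.numeric(event[i] & t == last_bin[i]),
             t = t, x = x[i], weights = 1)
}))

# Collapsed layout: identical rows (same Y and x) are merged and counted
collapsed <- aggregate(weights ~ Y + x, data = long, FUN = sum)

fit_long <- glm(Y ~ x, family = binomial, data = long)
fit_collapsed <- glm(Y ~ x, family = binomial, data = collapsed,
                     weights = weights)

# The two fits maximise the same likelihood, so the estimates should agree
# up to numerical tolerance
all.equal(coef(fit_long), coef(fit_collapsed), tolerance = 1e-6)
```

The long layout keeps the column `t`, which is what makes it suitable for models where the effect of time itself is estimated, while the collapsed layout has fewer rows.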

##### Value

Returns a data frame with the design matrix from the formula, to which the following is added (column names will differ if you specified them): a column `Y` for the binary outcome, a column `weights` for the weight of each row, and additional rows if applicable. A column `t` with the stop time of the bin is added if `use_weights = FALSE`.

##### References

Tutz, Gerhard, and Matthias Schmid. "Nonparametric Modeling and Smooth Effects." In *Modeling Discrete Time-to-Event Data*, 105-127. Springer International Publishing, 2016.

*Documentation reproduced from package dynamichazard, version 0.3.1, License: GPL-2*