Performs g-estimation of a structural nested mean model (SNMM), based on the outcome regression methods described in Vansteelandt and Sjolander (2016) and Dukes and Vansteelandt (2018). The function expects a dataset that holds an end of study outcome that is either binary or continuous, time-varying and/or baseline confounders, and a time-varying exposure that is either binary or continuous.

```
gest(
data,
idvar,
timevar,
Yn,
An,
Ybin,
Abin,
Lny,
Lnp,
type = 1,
Cn = NA,
LnC = NA,
...
)
```

data

A data frame in long format containing the data to be analysed. See description for details.

idvar

Character string specifying the name of the ID variable in the data.

timevar

Character string specifying the name of the time variable in the data. Note that timevar must specify time periods as integer values starting from 1 (must not begin at 0).

Yn

Character string specifying the name of the end of study outcome variable.

An

Character string specifying the name of the time-varying exposure variable.

Ybin

TRUE or FALSE indicator of whether the outcome is binary.

Abin

TRUE or FALSE indicator of whether the exposure is binary. Note that if `Abin==TRUE` then the variable specified in `An` MUST be written as a numeric variable taking values 0 or 1. If not, use `gestcat`.

Lny

Vector of character strings (quoted names) specifying the time-varying and/or baseline confounders to be included in the outcome model.

Lnp

Vector of character strings specifying the names of the time-varying and/or baseline confounders to be included in the model calculating the propensity scores.

type

Value from 1-4 specifying SNMM type to fit. See details.

Cn

Optional character string specifying the name of the censoring indicator variable. The variable specified in Cn should be a numeric variable taking values 0 or 1, with 1 indicating censored.

LnC

Vector of character strings specifying the names of the time-varying and/or baseline covariates to be used in the censoring score model to calculate the censoring weights. Note that any variable in `LnC` should also be in `Lnp` for the validity of the censoring and propensity weights.

...

Additional arguments, currently not in use.
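
Several of the argument requirements above can be checked before fitting. The following is a minimal sketch in base R; `check_gest_inputs` and the `toy` data frame are hypothetical names introduced here for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): checks a few of the
# documented input requirements and returns a vector of problems found.
check_gest_inputs <- function(data, timevar, An, Abin, Lnp, LnC = NA) {
  problems <- character(0)

  # timevar must hold integer time periods starting from 1, not 0
  tvals <- data[[timevar]]
  if (!is.numeric(tvals) ||
      any(tvals != round(tvals), na.rm = TRUE) ||
      min(tvals, na.rm = TRUE) != 1) {
    problems <- c(problems, "timevar must be integer-valued and start at 1")
  }

  # a binary exposure must be a numeric variable coded 0/1
  if (isTRUE(Abin)) {
    avals <- data[[An]]
    if (!is.numeric(avals) || !all(avals[!is.na(avals)] %in% c(0, 1))) {
      problems <- c(problems, "binary exposure must be numeric with values 0 or 1")
    }
  }

  # every censoring-model covariate should also be in the propensity model
  if (!all(is.na(LnC)) && !all(LnC %in% Lnp)) {
    problems <- c(problems, "all variables in LnC should also appear in Lnp")
  }

  problems
}

# Toy data that violates all three requirements
toy <- data.frame(id = rep(1:2, each = 2), time = rep(0:1, 2),
                  A = factor(c(0, 1, 1, 0)), L = c(0.2, 0.5, 0.1, 0.9))
before <- check_gest_inputs(toy, "time", "A", Abin = TRUE,
                            Lnp = "L", LnC = c("L", "U"))

# Recode so that time starts at 1 and the exposure is numeric 0/1
toy$time <- toy$time + 1L
toy$A <- as.numeric(as.character(toy$A))
after <- check_gest_inputs(toy, "time", "A", Abin = TRUE,
                           Lnp = "L", LnC = "L")
```

Here `before` lists three problems and `after` is empty; the checks cover only the requirements stated in the argument descriptions, not everything the function may validate internally.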

List of the fitted causal parameters of the posited SNMM. These are labeled as follows for each SNMM type, where `An` is set to the name of the exposure variable, i is the current time period, and `Lny[1]` is set to the name of the first confounder in `Lny`.

`type=1`

`An`: The effect of exposure at any time t on outcome.

`type=2`

`An`: The effect of exposure at any time t on outcome, when `Lny[1]` is set to zero.

`An:Lny[1]`: The effect modification by `Lny[1]`, the additional effect of A on Y for each unit increase in `Lny[1]`.

`type=3`

`t=i.An`: The effect of exposure at time t=i on outcome.

`type=4`

`t=i.An`: The effect of exposure at time t=i on outcome, when `Lny[1]` is set to zero.

`t=i.An:Lny[1]`: The effect modification by `Lny[1]`, the additional effect of A on Y at time t=i for each unit increase in `Lny[1]`.
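
To make the labeling scheme concrete, the sketch below builds the label strings described above for a given SNMM type. `expected_labels` is a hypothetical helper written for this illustration; the exact strings and the order in which the function returns them are assumptions based only on the descriptions in this section.

```r
# Hypothetical helper: label strings implied by the descriptions above.
# An = exposure name, Lny = outcome-model confounders, n_periods = T.
expected_labels <- function(type, An, Lny, n_periods) {
  main <- paste0("t=", seq_len(n_periods), ".", An)   # per-period effects
  modified <- paste0(main, ":", Lny[1])               # per-period modification
  switch(as.character(type),
    "1" = An,
    "2" = c(An, paste0(An, ":", Lny[1])),
    "3" = main,
    # interleave so each period's main effect is followed by its modifier
    "4" = as.vector(rbind(main, modified)),
    stop("type must be 1, 2, 3 or 4")
  )
}

labels_type2 <- expected_labels(2, "A", c("L", "U"), n_periods = 2)
labels_type4 <- expected_labels(4, "A", c("L", "U"), n_periods = 2)
```

With an exposure named `A` and `Lny = c("L", "U")`, type 2 gives `A` and `A:L`, while type 4 gives one pair of labels per time period.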

Given a time-varying exposure variable \(A_t\) and time-varying confounders \(L_t\), measured over time periods \(t=1,\ldots,T\), and an end of study outcome \(Y\) measured at time \(T+1\), `gest` estimates the causal parameters \(\psi\) of an SNMM of the form
$$E(Y(\bar{a}_{t},0)-Y(\bar{a}_{t-1},0)|\bar{a}_{t-1},\bar{l}_{t})=\psi z_ta_t \;\forall\; t=1,\ldots,T$$
if Y is continuous or
$$\frac{E(Y(\bar{a}_{t},0)|\bar{a}_{t-1},\bar{l}_{t})}{E(Y(\bar{a}_{t-1},0)|\bar{a}_{t-1},\bar{l}_{t})}=\exp(\psi z_ta_t)\;\forall\; t=1,\ldots,T$$
if Y is binary. The form of the SNMM is determined by \(z_t\), which is controlled by the input `type` as follows:

`type=1` sets \(z_t=1\). This implies that \(\psi\) is the effect of exposure at any time t on Y.

`type=2` sets \(z_t=c(1,l_t)\), and adds effect modification by the first named variable in `Lny`, which we denote \(L_t\). Now \(\psi=c(\psi_0,\psi_1)\) where \(\psi_0\) is the effect of exposure at any time t on Y when \(l_t=0\) for all t, modified by \(\psi_1\) for each unit increase in \(l_t\) at all times t. Note that effect modification is currently only supported for binary (written as a numeric 0,1 vector) or continuous confounders.

`type=3` allows for time-varying causal effects. It sets \(z_t\) to a vector of zeros of length T with a 1 in the t'th position. Now \(\psi=c(\psi_1,\ldots,\psi_T)\) where \(\psi_t\) is the effect of \(A_t\) on Y.

`type=4` allows for a time-varying causal effect that can be modified by the first named variable in `Lny`; that is, it allows for both time-varying effects and effect modification. It sets \(z_t\) to a vector of zeros of length T with \(c(1,l_t)\) in the t'th position. Now \(\psi=(\underline{\psi_1},\ldots,\underline{\psi_T})\) where \(\underline{\psi_t}=c(\psi_{0t},\psi_{1t})\). Here \(\psi_{0t}\) is the effect of exposure at time t on Y when \(l_t=0\), modified by \(\psi_{1t}\) for each unit increase in \(l_t\). Note that effect modification is currently only supported for binary (written as a numeric 0,1 vector) or continuous confounders.
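
The four choices of \(z_t\) can be illustrated by building the columns of \(z_t a_t\) directly from long-format vectors. This is a sketch in base R under the definitions above, not the package's internal code; `snmm_design`, the toy vectors and the values in `psi` are all made up for illustration.

```r
# Sketch: columns of z_t * a_t for each SNMM type (illustrative only).
# a = exposure, l = effect modifier, time = period index, all in long format.
snmm_design <- function(a, l, time, type) {
  n_periods <- max(time)
  # period[i, t] is 1 when row i belongs to time period t, else 0
  period <- outer(time, seq_len(n_periods), "==") * 1
  a_cols <- period * a          # exposure, one column per period
  al_cols <- period * (a * l)   # exposure-by-modifier, one column per period
  colnames(a_cols) <- paste0("t=", seq_len(n_periods), ".A")
  colnames(al_cols) <- paste0("t=", seq_len(n_periods), ".A:L")
  # column order that pairs each period's main effect with its modifier
  interleave <- as.vector(rbind(seq_len(n_periods),
                                n_periods + seq_len(n_periods)))
  switch(as.character(type),
    "1" = cbind(A = a),
    "2" = cbind(A = a, "A:L" = a * l),
    "3" = a_cols,
    "4" = cbind(a_cols, al_cols)[, interleave, drop = FALSE],
    stop("type must be 1, 2, 3 or 4")
  )
}

# Two individuals observed at T = 2 time periods
a <- c(1, 0, 1, 1)
l <- c(2, 3, 0, 1)
time <- c(1, 2, 1, 2)

design2 <- snmm_design(a, l, time, type = 2)
psi <- c(0.5, 0.1)                # hypothetical (psi_0, psi_1)
blip2 <- drop(design2 %*% psi)    # psi z_t a_t evaluated on each row

design3 <- snmm_design(a, l, time, type = 3)
design4 <- snmm_design(a, l, time, type = 4)
```

For `type=2` the first row gives \(0.5 \cdot 1 + 0.1 \cdot 2 = 0.7\), and any row with \(a_t=0\) contributes zero regardless of \(l_t\).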

The data must be in long format, where we assume the convention that each row with `time=t` contains \(A_t\), \(L_t\), \(C_{t+1}\) and \(Y_{T+1}\). Thus the censoring indicator on each row should indicate whether an individual is censored AFTER time t. The end of study outcome \(Y_{T+1}\) should be repeated on each row. If either A or Y is binary, it must be written as a numeric vector taking values 0 or 1. The same is true for any covariate that is used for effect modification. The data must be rectangular, with a row for every individual at each exposure time from 1 up to T. Data rows after censoring should be empty apart from the ID and time variables. This can be done using the function `FormatData`.
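
As an illustration of this layout, the following sketch builds a small rectangular long-format data frame by hand in base R, without using `FormatData`. The variable names and values are made up; setting `Y` to `NA` on every row of the censored individual is an assumption of this sketch, since the outcome is unobserved for them.

```r
# Toy long-format data: 2 individuals, T = 3 time periods (values made up)
n_periods <- 3
long <- expand.grid(time = seq_len(n_periods), id = 1:2)[, c("id", "time")]

# Row with time = t holds L_t, A_t and the censoring indicator C_{t+1}.
# Individual 2 is censored AFTER time 2, so the time = 3 row is empty
# apart from id and time.
long$L <- c(0.3, 0.1, 0.7,  0.5, 0.9, NA)
long$A <- c(0,   1,   1,    1,   0,   NA)
long$C <- c(0,   0,   0,    0,   1,   NA)

# End of study outcome repeated on each row; NA for the censored
# individual (an assumption of this sketch)
long$Y <- c(rep(4.2, n_periods), rep(NA, n_periods))

# Rectangular: every individual has a row for every time 1..T
rows_per_id <- table(long$id)
```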

By default the censoring, propensity and outcome models include the exposure history at the previous time as an explanatory variable. One may also consider including all previous exposure and confounder history as variables in `Lny`, `Lnp`, and `LnC`; these variables can be generated using `FormatData`.
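
For readers who want to see what such a history variable looks like, here is a minimal base R sketch of a one-period lag computed within each individual. `lag_within_id` is a hypothetical helper, and filling the first period with 0 is an assumption of this sketch; it does not claim to reproduce what `FormatData` generates.

```r
# Hypothetical helper: value of x at the previous time within each id.
# Assumes rows are already sorted by id and then by time.
# `fill` is used at the first period; 0 is a choice made for this sketch
# (an NA there would cause the row to be dropped as missing data).
lag_within_id <- function(x, id, fill = 0) {
  ave(x, id, FUN = function(v) c(fill, head(v, -1)))
}

dat <- data.frame(id = rep(1:2, each = 3), time = rep(1:3, 2),
                  A = c(0, 1, 1, 1, 0, 1),
                  L = c(0.3, 0.1, 0.7, 0.5, 0.9, 0.2))
dat <- dat[order(dat$id, dat$time), ]

dat$A_lag1 <- lag_within_id(dat$A, dat$id)   # A_{t-1}
dat$L_lag1 <- lag_within_id(dat$L, dat$id)   # L_{t-1}

# The new names could then be appended to the confounder vectors
Lny <- c("L", "A_lag1", "L_lag1")
```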

Censoring weights are handled as described in Vansteelandt and Sjolander (2016). Any variable included in `LnC` must also be in `Lnp`. Missing data not due to censoring are handled automatically by removing rows with missing data prior to fitting the model. If outcome models fail to fit, consider removing covariates from `Lny` but keeping them in `Lnp` to reduce collinearity issues.
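
One way to spot candidates for removal from `Lny` is to check the rank of the covariate design matrix. The sketch below is a generic base R diagnostic on simulated data, not something the function itself performs; the variable `L2` is deliberately constructed to be collinear with `L`.

```r
set.seed(42)
dat <- data.frame(L = rnorm(50), U = rnorm(50))
dat$L2 <- 2 * dat$L                 # perfectly collinear with L (by design)

Lny <- c("L", "U", "L2")
X <- model.matrix(reformulate(Lny), data = dat)   # intercept + covariates

# Base R's default QR moves near-collinear columns to the end of the pivot
qr_X <- qr(X)
rank_deficient <- qr_X$rank < ncol(X)
redundant <- colnames(X)[qr_X$pivot[-seq_len(qr_X$rank)]]

# Drop the redundant covariates from the outcome model only
Lny_reduced <- setdiff(Lny, redundant)
Lnp <- Lny                          # keep them in the propensity model
```

In this toy case the check flags `L2`, which is dropped from the outcome-model covariates while remaining in `Lnp`.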

Vansteelandt, S., & Sjolander, A. (2016). Revisiting g-estimation of the Effect of a Time-varying Exposure Subject to Time-varying Confounding, Epidemiologic Methods, 5(1), 37-56. <doi:10.1515/em-2015-0005>.

Dukes, O., & Vansteelandt, S. (2018). A Note on g-Estimation of Causal Risk Ratios, American Journal of Epidemiology, 187(5), 1079–1084. <doi:10.1093/aje/kwx347>.

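```
# NOT RUN {
# Simulated example data without censoring
datas <- dataexamples(n = 1000, seed = 123, Censoring = FALSE)
data <- datas$datagest
idvar <- "id"
timevar <- "time"
Yn <- "Y"
An <- "A"
Ybin <- FALSE
Abin <- TRUE
Lny <- c("L", "U")
Lnp <- c("L", "U")
type <- 1
Cn <- NA
LnC <- NA
gest(data, idvar = idvar, timevar, Yn, An, Ybin, Abin, Lny, Lnp, type = 1)

# Example with censoring
datas <- dataexamples(n = 1000, seed = 123, Censoring = TRUE)
data <- datas$datagest
Cn <- "C"
LnC <- c("L", "U")
# Cn and LnC are passed by name because type precedes them in the signature
gest(data, idvar, timevar, Yn, An, Ybin, Abin, Lny, Lnp, type = 3, Cn = Cn, LnC = LnC)
# }
```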