kernel_sievePH implements the estimation and hypothesis testing methods of
Sun et al. (2009) for a mark-specific proportional hazards model. The methods
allow separate baseline mark-specific hazard functions for different baseline
subgroups.
kernel_sievePH(
  eventTime,
  eventInd,
  mark,
  tx,
  zcov = NULL,
  strata = NULL,
  formulaPH = ~tx,
  tau = NULL,
  tband = NULL,
  hband = NULL,
  nvgrid = 100,
  a = NULL,
  b = NULL,
  ntgrid = NULL,
  nboot = 500,
  seed = NULL,
  maxit = 6
)
An object of class kernel_sievePH, which can be processed by
summary.kernel_sievePH to obtain or print a summary of the results. An object
of class kernel_sievePH is a list containing the following components:
H10: a data frame with test statistics (first row) and corresponding p-values
(second row) for testing \(H_{10}: HR(v) = 1\) for \(v \in [a, b]\). Columns
TSUP1 and Tint1 contain test statistics and p-values for testing \(H_{10}\)
vs. \(H_{1a}: HR(v) \neq 1\) for some \(v \in [a, b]\) (general alternative).
Columns TSUP1m and Tint1m contain test statistics and p-values for testing
\(H_{10}\) vs. \(H_{1m}: HR(v) \leq 1\) with strict inequality for some
\(v \in [a, b]\) (monotone alternative). TSUP1 and TSUP1m are based on
extensions of the classic Kolmogorov-Smirnov supremum-based test. Tint1 and
Tint1m are based on generalizations of the integration-based Cramér-von Mises
test and involve integration of deviations over the whole range of the mark.
If nboot is NULL, H10 is returned as NULL.
H20: a data frame with test statistics (first row) and corresponding p-values
(second row) for testing \(H_{20}\): \(HR(v)\) does not depend on
\(v \in [a, b]\). Columns TSUP2 and Tint2 contain test statistics and p-values
for testing \(H_{20}\) vs. \(H_{2a}\): \(HR(v)\) depends on \(v \in [a, b]\)
(general alternative). Columns TSUP2m and Tint2m contain test statistics and
p-values for testing \(H_{20}\) vs. \(H_{2m}\): \(HR(v)\) increases as \(v\)
increases over \([a, b]\) (monotone alternative). TSUP2 and TSUP2m are based
on extensions of the classic Kolmogorov-Smirnov supremum-based test. Tint2 and
Tint2m are based on generalizations of the integration-based Cramér-von Mises
test and involve integration of deviations over the whole range of the mark.
If nboot is NULL, H20 is returned as NULL.
estBeta: a data frame summarizing point estimates and standard errors of the
mark-specific coefficients for treatment at equally spaced values between the
minimum and the maximum of the observed mark values.
cBproc1: a data frame containing equally spaced mark values in the column
Mark, the test process \(Q^{(1)}(v)\) for the observed data in the column
Observed, and \(Q^{(1)}(v)\) for nboot independent sets of normal samples in
the columns S1, S2, \(\dots\). If nboot is NULL, cBproc1 is returned as NULL.
cBproc2: a data frame containing equally spaced mark values in the column
Mark, the test process \(Q^{(2)}(v)\) for the observed data in the column
Observed, and \(Q^{(2)}(v)\) for nboot independent sets of normal samples in
the columns S1, S2, \(\dots\). If nboot is NULL, cBproc2 is returned as NULL.
Lambda0: an array of dimension K x nvgrid x ntgrid for the kernel-smoothed
baseline hazard functions \(\lambda_{0k}\), \(k = 1, \dots, K\), where \(K\)
is the number of strata. If ntgrid is NULL (the default), Lambda0 is returned
as NULL.
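The relationship between the Observed and S1, S2, \(\dots\) columns and the reported p-values can be sketched in base R. The block below builds a synthetic data frame shaped like cBproc1 and applies generic supremum-type and integral-type functionals to it. The helper names sup_stat and int_stat, the grid sizes, and the simulated values are all hypothetical; the exact form and weighting of the statistics used by the package follow Sun et al. (2009) and may differ from this simplified version.

```r
# Toy stand-in for cBproc1: mark grid, observed process, simulated processes.
# All values are synthetic; nothing here is produced by kernel_sievePH.
set.seed(1)
nv <- 50      # hypothetical number of mark grid points
nsim <- 200   # hypothetical number of simulated processes
proc <- data.frame(Mark = seq(0.05, 1, length.out = nv),
                   Observed = cumsum(rnorm(nv, mean = 0.3)) / sqrt(nv))
sims <- replicate(nsim, cumsum(rnorm(nv)) / sqrt(nv))  # nv x nsim matrix
colnames(sims) <- paste0("S", seq_len(nsim))
proc <- cbind(proc, sims)

# Supremum-type (Kolmogorov-Smirnov style) functional: largest absolute value
sup_stat <- function(q) max(abs(q))
# Integral-type (Cramer-von Mises style) functional: trapezoidal integral of q^2
int_stat <- function(q, v) sum(diff(v) * (head(q, -1)^2 + tail(q, -1)^2) / 2)

sim_cols <- grep("^S[0-9]+$", names(proc))
obs_sup <- sup_stat(proc$Observed)
obs_int <- int_stat(proc$Observed, proc$Mark)
sim_sup <- vapply(proc[sim_cols], sup_stat, numeric(1))
sim_int <- vapply(proc[sim_cols], int_stat, numeric(1), v = proc$Mark)

# Resampling p-value: share of simulated statistics at least as large as observed
p_sup <- mean(sim_sup >= obs_sup)
p_int <- mean(sim_int >= obs_int)
```

A small p_sup or p_int indicates that the observed process deviates more than most of the simulated ones, which is the logic behind the second row of H10 and H20.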
eventTime: a numeric vector specifying the observed right-censored event time.
eventInd: a numeric vector indicating the event of interest (1 if event, 0 if right-censored).
mark: a numeric vector specifying a univariate continuous mark. No missing
values are permitted for subjects with eventInd = 1. For subjects with
eventInd = 0, the value(s) in mark should be set to NA.
tx: a numeric vector indicating the treatment group (1 if treatment, 0 if placebo).
zcov: a data frame with one row per subject specifying possibly time-dependent
covariate(s) (not including tx). If no covariate is used, zcov should be left
at its default of NULL.
strata: a numeric vector specifying baseline strata (NULL by default). If
specified, a separate mark-specific baseline hazard is assumed for each
stratum.
formulaPH: a one-sided formula object (on the right side of the ~ operator)
specifying the linear predictor in the proportional hazards model. Variables
available for use in the formula are tx and the variable(s) in zcov. By
default, formulaPH is specified as ~ tx.
tau: a numeric value specifying the duration of the study follow-up period.
Failures beyond tau are treated as right-censored. As a rule of thumb, at
least \(10\%\) of subjects should remain uncensored by tau for the estimation
to be stable. By default, tau is set to the maximum of eventTime.
tband: a numeric value between 0 and tau specifying the bandwidth of the
kernel smoothing function over time. By default, tband is set to
(tau - min(eventTime))/5.
hband: a numeric value between 0 and 1 specifying the bandwidth of the kernel
smoothing function over the mark. By default, hband is set to
\(4\sigma n^{-1/3}\), where \(\sigma\) is the estimated standard deviation of
the observed marks for uncensored failure times and \(n\) is the number of
subjects in the dataset. Larger bandwidths are recommended for higher
percentages of missing marks.
nvgrid: an integer value (100 by default) specifying the number of equally spaced mark values between the minimum and maximum of the observed mark at which the treatment effects are evaluated.
a: a numeric value between the minimum and maximum of the observed mark
values specifying the lower bound of the range for testing the null
hypotheses \(H_{10}: HR(v) = 1\) and \(H_{20}\): \(HR(v)\) does not depend
on \(v\), for \(v \in [a, b]\). By default, a is set to
(max(mark) - min(mark))/nvgrid + min(mark).
b: a numeric value between the minimum and maximum of the observed mark
values specifying the upper bound of the range for testing the null
hypotheses \(H_{10}: HR(v) = 1\) and \(H_{20}\): \(HR(v)\) does not depend
on \(v\), for \(v \in [a, b]\). By default, b is set to max(mark).
ntgrid: an integer value (NULL by default) specifying the number of equally
spaced time points at which the mark-specific baseline hazard functions are
evaluated. If NULL, baseline hazard functions are not evaluated.
nboot: the number of bootstrap iterations (500 by default) for simulating the
distributions of the test statistics. If NULL, the hypothesis tests are not
performed.
seed: an integer specifying the random number generation seed for reproducing the test statistics and p-values. By default, no specific seed is set.
maxit: the maximum number of iterations to attempt for convergence in estimation (6 by default).
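The documented defaults for tau, tband, hband, a, and b can be reproduced by hand, which helps when choosing values to override them. The sketch below uses synthetic inputs and base R only. It assumes the formulas are applied to the marks exactly as supplied; whether the package rescales the marks internally before applying them is not stated here, so treat the numbers as approximate guides.

```r
# Synthetic inputs for illustration only
set.seed(2)
n <- 100
eventTime <- pmin(rexp(n, 0.2), 3)      # follow-up capped at 3
eventInd <- as.numeric(eventTime < 3)   # 1 if event, 0 if right-censored
mark <- runif(n)
mark[eventInd == 0] <- NA               # censored subjects carry no mark

nvgrid <- 100
tau <- max(eventTime)                   # default tau
tband <- (tau - min(eventTime)) / 5     # default time bandwidth

obs_mark <- mark[eventInd == 1]         # marks of observed failures
hband <- 4 * sd(obs_mark) * n^(-1/3)    # default mark bandwidth
a <- (max(obs_mark) - min(obs_mark)) / nvgrid + min(obs_mark)  # default lower bound
b <- max(obs_mark)                      # default upper bound
```

Note that the default a sits one grid step above the smallest observed mark, so the testing range \([a, b]\) excludes the lower boundary of the mark grid.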
kernel_sievePH analyzes data from a randomized placebo-controlled trial that
evaluates treatment efficacy for a time-to-event endpoint with a continuous
mark. The parameter of interest is the ratio of the conditional mark-specific
hazard functions (treatment/placebo), which is based on a stratified
mark-specific proportional hazards model. This model assumes no parametric
form for either the baseline hazard function or the treatment effect across
mark values.
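For orientation, the stratified mark-specific proportional hazards model described above can be written out as follows. The notation (\(\beta(v)\) for the mark-specific coefficient vector, \(z(t)\) for the covariate vector built from formulaPH, and \(\beta_{tx}(v)\) for its treatment component) is assumed here in the style of Sun et al. (2009) and is not taken from this page.

```latex
\lambda_k\bigl(t, v \mid z(t)\bigr)
  = \lambda_{0k}(t, v)\,\exp\bigl\{\beta(v)^{\top} z(t)\bigr\},
  \qquad k = 1, \dots, K,

HR(v) = \exp\bigl\{\beta_{tx}(v)\bigr\}.
```

Under this reading, \(H_{10}: HR(v) = 1\) corresponds to \(\beta_{tx}(v) = 0\) on \([a, b]\), and \(H_{20}\) corresponds to \(\beta_{tx}(v)\) being constant in \(v\) on \([a, b]\).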
Sun, Y., Gilbert, P. B., & McKeague, I. W. (2009). Proportional hazards models with continuous marks. Annals of Statistics, 37(1), 394.
Yang, G., Sun, Y., Qi, L., & Gilbert, P. B. (2017). Estimation of stratified mark-specific proportional hazards models under two-phase sampling with application to HIV vaccine efficacy trials. Statistics in Biosciences, 9, 259-283.
library(sievePH)  # package that provides kernel_sievePH
set.seed(20240410)
beta <- 2.1
gamma <- -1.3
n <- 200
tx <- rep(0:1, each = n / 2)
tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma)))
cens <- runif(n, 0, 15)
eventTime <- pmin(tm, cens, 3)
eventInd <- as.numeric(tm <= pmin(cens, 3))
alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) }
mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2)
mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) /
  (beta - 2)
mark <- ifelse(eventInd == 1, c(mark0, mark1), NA)
# the true TE(v) curve underlying the data-generating mechanism is:
# TE(v) = 1 - exp{alpha(beta) + beta * v + gamma}
# complete-case estimation discards rows with a missing mark
fit <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5,
                      hband = 0.3, nvgrid = 20, nboot = 20)