Compute the Kaplan-Meier and Reduced Sample estimators of a survival time distribution function, using histogram techniques

`km.rs(o, cc, d, breaks)`

A list with five elements

- rs
Reduced-sample estimate of the survival time c.d.f. \(F(t)\)

- km
Kaplan-Meier estimate of the survival time c.d.f. \(F(t)\)

- hazard
corresponding Nelson-Aalen estimate of the hazard rate \(\lambda(t)\)

- r
values of \(t\) for which \(F(t)\) is estimated

- breaks
the breakpoints vector

- o
vector of observed survival times

- cc
vector of censoring times

- d
vector of non-censoring indicators

- breaks
Vector of breakpoints to be used to form histograms.

Adrian Baddeley Adrian.Baddeley@curtin.edu.au and Rolf Turner r.turner@auckland.ac.nz

This function is needed mainly for internal use in spatstat, but may be useful in other applications where you want to form the Kaplan-Meier estimator from a huge dataset.

Suppose \(T_i\) are the survival times of individuals \(i=1,\ldots,M\) with unknown distribution function \(F(t)\) which we wish to estimate. Suppose these times are right-censored by random censoring times \(C_i\). Thus the observations consist of right-censored survival times \(\tilde T_i = \min(T_i,C_i)\) and non-censoring indicators \(D_i = 1\{T_i \le C_i\}\) for each \(i\).
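As an illustration of this setup, the following minimal sketch simulates right-censored data in base R. The exponential survival times, the uniform censoring times and the variable names `M` and `tt` are assumptions chosen for the example only; the vectors `o`, `cc`, `d` follow the argument names of `km.rs`. Note that the censoring time \(C_i\) is taken to be known for every individual, including the uncensored ones.

```r
set.seed(42)

M  <- 1000                     # number of individuals (arbitrary choice)
tt <- rexp(M, rate = 1)        # latent survival times T_i (assumed exponential)
cc <- runif(M, 0, 2)           # censoring times C_i (assumed uniform on [0, 2])

o <- pmin(tt, cc)              # observed times: min(T_i, C_i)
d <- tt <= cc                  # non-censoring indicators: TRUE if T_i <= C_i

mean(d)                        # fraction of uncensored observations
```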

The arguments to this function are vectors `o`, `cc`, `d` of observed values of \(\tilde T_i\), \(C_i\) and \(D_i\) respectively. The function computes histograms and forms the reduced-sample and Kaplan-Meier estimates of \(F(t)\) by invoking the functions `kaplan.meier` and `reduced.sample`. This is efficient if the number of observations (i.e. the common length of `o`, `cc`, `d`) is large.
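A typical call might look like the sketch below. It assumes that the spatstat package is installed and that attaching it with `library(spatstat)` makes `km.rs` available; the simulated data, the sample size and the choice of 200 equal-width histogram cells are arbitrary. The indicator `d` is supplied as a logical vector.

```r
library(spatstat)   # assumption: attaching spatstat makes km.rs available

set.seed(42)
M  <- 1e5                        # a large dataset, where histograms pay off
tt <- rexp(M)                    # latent survival times (assumed exponential)
cc <- runif(M, 0, 2)             # censoring times (assumed uniform on [0, 2])
o  <- pmin(tt, cc)               # observed, right-censored times
d  <- tt <= cc                   # non-censoring indicators (logical)

breaks <- seq(0, 2, length.out = 201)   # breakpoints covering the data range
est <- km.rs(o, cc, d, breaks)

# Compare the first few estimates with the true c.d.f. of the simulated T_i
head(cbind(t = est$r, km = est$km, rs = est$rs, truth = pexp(est$r)))
```

Because the data enter only through histogram counts, the cost of forming the estimates depends on the number of breakpoints rather than on the number of observations once the histograms are built.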

The vectors `km` and `hazard`, computed by `kaplan.meier`, are (histogram approximations to) the Kaplan-Meier estimator of \(F(t)\) and its hazard rate \(\lambda(t)\). Specifically, `km[k]` is an estimate of `F(breaks[k+1])`, and `hazard[k]` is an estimate of the average of \(\lambda(t)\) over the interval `(breaks[k],breaks[k+1])`. This approximation is exact only if the survival times are discrete and the histogram breaks are fine enough to ensure that each interval `(breaks[k],breaks[k+1])` contains only one possible value of the survival time.
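The histogram form of the Kaplan-Meier estimator can be sketched in base R as follows. This is not the spatstat implementation: `km_hist` is a hypothetical helper written for this illustration, it assumes that `d` is logical and that `breaks` span all the data, and the hazard formula used (the Nelson-Aalen increment in each cell divided by the cell width) is one plausible reading of the description above. The small dataset is discrete with one possible value per cell, so the histogram estimate should agree with the Kaplan-Meier estimator computed directly.

```r
# Hypothetical helper, not part of spatstat: histogram-based Kaplan-Meier.
# Assumes d is logical and breaks span all values of o.
km_hist <- function(o, d, breaks) {
  obs <- hist(o,    breaks = breaks, plot = FALSE)$counts  # all observed times
  nco <- hist(o[d], breaks = breaks, plot = FALSE)$counts  # uncensored times only
  atrisk <- rev(cumsum(rev(obs)))            # number with o in cell k or later
  p <- ifelse(atrisk > 0, nco / atrisk, 0)   # conditional failure probability
  list(km     = 1 - cumprod(1 - p),          # estimate of F(breaks[k+1])
       hazard = p / diff(breaks))            # assumed form: increment / width
}

# Discrete survival times on 1..5, one possible value per cell
o      <- c(1, 2, 2, 3, 4, 5)
d      <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
breaks <- seq(0.5, 5.5, by = 1)

est <- km_hist(o, d, breaks)

# Kaplan-Meier estimate computed directly from the product-limit formula
exact <- 1 - cumprod(sapply(1:5, function(s) 1 - sum(o == s & d) / sum(o >= s)))

all.equal(est$km, exact)
```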

The vector `rs` is the reduced-sample estimator, `rs[k]` being the reduced-sample estimate of `F(breaks[k+1])`.
This value is exact, i.e. the use of histograms does not introduce any
approximation error in the reduced-sample estimator.
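The exactness can be checked numerically. The sketch below takes the reduced-sample estimate at time \(t\) to be the fraction of individuals with \(T_i \le t\) among those with \(C_i > t\); this definition, the helper `bin_counts` and the simulated data are assumptions made for the illustration and are not taken from the spatstat source. Both the numerator and the denominator are counts of values on either side of a breakpoint, so cumulative histogram counts reproduce them without error.

```r
set.seed(1)
n      <- 200
tt     <- rexp(n)                 # latent survival times (assumed exponential)
cc     <- runif(n, 0, 3)          # censoring times (assumed uniform on [0, 3])
o      <- pmin(tt, cc)            # observed, right-censored times
d      <- tt <= cc                # non-censoring indicators (logical)
breaks <- seq(0, 2, by = 0.1)

# Hypothetical helper: counts in cells (breaks[k], breaks[k+1]];
# values outside the range of breaks are silently dropped by tabulate()
bin_counts <- function(x, breaks) {
  tabulate(findInterval(x, breaks, left.open = TRUE),
           nbins = length(breaks) - 1)
}

nco <- bin_counts(o[d],  breaks)  # observed times of uncensored individuals
cen <- bin_counts(cc,    breaks)  # censoring times of all individuals
ncc <- bin_counts(cc[d], breaks)  # censoring times of uncensored individuals

# Histogram version: #{T_i <= t < C_i} / #{C_i > t} at t = breaks[k+1]
rs_hist <- cumsum(nco - ncc) / (n - cumsum(cen))

# Direct version computed from the raw data
rs_direct <- sapply(breaks[-1], function(t) sum(o <= t & cc > t) / sum(cc > t))

all.equal(rs_hist, rs_direct)
```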

`reduced.sample`, `kaplan.meier`