Compute the Kaplan-Meier and Reduced Sample estimators of a survival time distribution function, using histogram techniques

`km.rs(o, cc, d, breaks)`

o

vector of observed survival times

cc

vector of censoring times

d

vector of non-censoring indicators

breaks

Vector of breakpoints to be used to form histograms.

A list with five elements

Reduced-sample estimate of the survival time c.d.f. \(F(t)\)

Kaplan-Meier estimate of the survival time c.d.f. \(F(t)\)

corresponding Nelson-Aalen estimate of the hazard rate \(\lambda(t)\)

values of \(t\) for which \(F(t)\) is estimated

the breakpoints vector

This function is needed mainly for internal use in spatstat, but may be useful in other applications where you want to form the Kaplan-Meier estimator from a huge dataset.

Suppose \(T_i\) are the survival times of individuals \(i=1,\ldots,M\) with unknown distribution function \(F(t)\) which we wish to estimate. Suppose these times are right-censored by random censoring times \(C_i\). Thus the observations consist of right-censored survival times \(\tilde T_i = \min(T_i,C_i)\) and non-censoring indicators \(D_i = 1\{T_i \le C_i\}\) for each \(i\).
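The censoring model above can be illustrated with a small sketch. It is written in Python purely for illustration and is not part of spatstat; the helper name `censor` is hypothetical.

```python
def censor(survival_times, censoring_times):
    """Apply right-censoring to paired survival and censoring times.

    Returns (observed, indicators), where observed[i] = min(T_i, C_i)
    and indicators[i] = 1 if T_i <= C_i (uncensored), else 0.
    """
    pairs = list(zip(survival_times, censoring_times))
    observed = [min(t, c) for t, c in pairs]
    indicators = [1 if t <= c else 0 for t, c in pairs]
    return observed, indicators


# Three individuals: the second is censored at time 1.5 before its
# survival time 3.0 could be observed.
o, d = censor([1.0, 3.0, 2.0], [2.0, 1.5, 2.0])
print(o, d)  # [1.0, 1.5, 2.0] [1, 0, 1]
```

In practice only the observed times, the censoring times, and the indicators are available; the true \(T_i\) of a censored individual is not known.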

The arguments to this function are vectors `o`, `cc`, `d` of observed values of \(\tilde T_i\), \(C_i\) and \(D_i\) respectively. The function computes histograms and forms the reduced-sample and Kaplan-Meier estimates of \(F(t)\) by invoking the functions `kaplan.meier` and `reduced.sample`. This is efficient if the length of `o`, `cc`, `d` (i.e. the number of observations) is large.
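To make the histogram step concrete, here is a minimal Python sketch of binning the observations on a set of breakpoints. It is an illustration, not the spatstat code. The helper name `bin_counts` is hypothetical, and the interval convention (right-closed bins, with the lowest breakpoint included in the first bin) is an assumption of the sketch.

```python
from bisect import bisect_left


def bin_counts(values, breaks):
    """Count the values falling in each interval (breaks[k], breaks[k+1]].

    `breaks` must be sorted in increasing order. A value equal to
    breaks[0] is counted in the first bin; values outside the range of
    `breaks` are ignored.
    """
    counts = [0] * (len(breaks) - 1)
    for v in values:
        if v == breaks[0]:
            k = 0
        else:
            # bisect_left returns the number of breakpoints strictly below v
            k = bisect_left(breaks, v) - 1
        if 0 <= k < len(counts):
            counts[k] += 1
    return counts


breaks = [0, 1, 2, 3]
o = [0.5, 1.5, 1.5, 2.5]   # observed (possibly censored) times
d = [1, 0, 1, 1]           # non-censoring indicators

obs = bin_counts(o, breaks)                                  # all observations
nco = bin_counts([x for x, di in zip(o, d) if di], breaks)   # uncensored only
print(obs, nco)  # [1, 2, 1] [1, 1, 1]
```

After this step the cost of forming the estimators depends on the number of bins rather than the number of observations, which is why the approach suits very large datasets.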

The vectors `km` and `hazard` returned by `kaplan.meier` are (histogram approximations to) the Kaplan-Meier estimator of \(F(t)\) and its hazard rate \(\lambda(t)\). Specifically, `km[k]` is an estimate of `F(breaks[k+1])`, and `hazard[k]` is an estimate of the average of \(\lambda(t)\) over the interval `(breaks[k],breaks[k+1])`. This approximation is exact only if the survival times are discrete and the histogram breaks are fine enough to ensure that each interval `(breaks[k],breaks[k+1])` contains only one possible value of the survival time.
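The following Python sketch shows one way a Kaplan-Meier estimate can be formed from histogram counts alone. It is an illustration under stated assumptions and not necessarily how `kaplan.meier` is implemented. The function name `km_from_histograms` is hypothetical. The hazard is taken to be the Nelson-Aalen increment (deaths divided by number at risk) per unit width of the bin, which is an assumption of the sketch.

```python
from itertools import accumulate


def km_from_histograms(obs, nco, breaks):
    """Histogram-based Kaplan-Meier estimate of the c.d.f. F.

    obs[k]: number of observations (censored or not) in bin k
    nco[k]: number of uncensored observations in bin k
    Returns (km, hazard): km[k] estimates F at the right end of bin k,
    hazard[k] estimates the average hazard rate over bin k.
    """
    # at_risk[k] = number of observations falling in bin k or any later bin
    at_risk = list(accumulate(reversed(obs)))[::-1]
    km, hazard = [], []
    surviving = 1.0
    for k in range(len(obs)):
        frac = nco[k] / at_risk[k] if at_risk[k] > 0 else 0.0
        surviving *= 1.0 - frac          # product-limit update
        km.append(1.0 - surviving)
        hazard.append(frac / (breaks[k + 1] - breaks[k]))
    return km, hazard


# Four observations on breaks 0, 1, 2, 3; one of the two in the middle
# bin is censored.
km, hazard = km_from_histograms([1, 2, 1], [1, 1, 1], [0, 1, 2, 3])
print(km)  # approximately [0.25, 0.5, 1.0]
```

Within a bin, the sketch treats every observation in that bin as at risk when each death occurs, which is the source of the approximation error described above. When no observation is censored (`nco` equal to `obs`), the result reduces to the empirical distribution function evaluated at the breakpoints.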

The vector `rs` is the reduced-sample estimator, `rs[k]` being the reduced-sample estimate of `F(breaks[k+1])`. This value is exact, i.e. the use of histograms does not introduce any approximation error in the reduced-sample estimator.
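For comparison, the reduced-sample estimate can be evaluated directly from the raw data. The Python sketch below assumes the usual definition, in which \(F(t)\) is estimated by the fraction of individuals with \(C_i \ge t\) whose survival time satisfies \(T_i \le t\). It does not use histograms and is not the spatstat implementation. The function name `reduced_sample_direct` is hypothetical, and returning NaN when no individual has \(C_i \ge t\) is an arbitrary choice of the sketch.

```python
def reduced_sample_direct(o, cc, d, breaks):
    """Reduced-sample estimate of F at each of breaks[1:], from raw data.

    o: observed times min(T_i, C_i); cc: censoring times C_i;
    d: non-censoring indicators (1 if T_i <= C_i, else 0).
    """
    rs = []
    for t in breaks[1:]:
        # individuals whose censoring time is at least t
        denom = sum(1 for c in cc if c >= t)
        # of those, the ones known to have died by time t
        numer = sum(1 for oi, ci, di in zip(o, cc, d) if di and oi <= t <= ci)
        rs.append(numer / denom if denom > 0 else float("nan"))
    return rs


o = [0.5, 1.5, 1.5, 2.5]
cc = [2.2, 1.5, 2.8, 2.9]
d = [1, 0, 1, 1]
rs = reduced_sample_direct(o, cc, d, [0, 1, 2, 3])
print(rs[:2])  # approximately [0.25, 0.667]
```

Each pass over the data costs time proportional to the number of observations, so this direct form is slow for many breakpoints. Per the text above, the histogram-based computation gives the same values at the breakpoints.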