# km.rs

##### Kaplan-Meier and Reduced Sample Estimator using Histograms

Compute the Kaplan-Meier and Reduced Sample estimators of a survival time distribution function, using histogram techniques.

Keywords
spatial, nonparametric
##### Usage
km.rs(o, cc, d, breaks)
##### Arguments
o

vector of observed survival times

cc

vector of censoring times

d

vector of non-censoring indicators

breaks

vector of breakpoints to be used to form histograms

##### Details

This function is needed mainly for internal use in spatstat, but may be useful in other applications where you want to form the Kaplan-Meier estimator from a huge dataset.

Suppose $$T_i$$ are the survival times of individuals $$i=1,\ldots,M$$ with unknown distribution function $$F(t)$$ which we wish to estimate. Suppose these times are right-censored by random censoring times $$C_i$$. Thus the observations consist of right-censored survival times $$\tilde T_i = \min(T_i,C_i)$$ and non-censoring indicators $$D_i = 1\{T_i \le C_i\}$$ for each $$i$$.
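As a rough illustration of this setup, the following base R sketch simulates right-censored data and builds vectors named `o`, `cc` and `d` to match the function's signature. The exponential survival times, uniform censoring times, sample size and seed are all assumptions chosen for the example, not part of the original documentation.

```r
# Illustrative simulation of right-censored survival data (assumed distributions).
set.seed(42)                       # arbitrary seed, for reproducibility only
M <- 1000                          # number of individuals
surv_time <- rexp(M, rate = 1)     # T_i: true survival times (assumed exponential)
cc <- runif(M, min = 0, max = 3)   # C_i: censoring times (assumed uniform)

o <- pmin(surv_time, cc)           # observed times: min(T_i, C_i)
d <- as.integer(surv_time <= cc)   # non-censoring indicators: 1{T_i <= C_i}

mean(d)                            # fraction of uncensored observations
```

Whenever `d[i]` is 0 the observed time equals the censoring time, so `o[i] == cc[i]`; whenever `d[i]` is 1 the true survival time was observed.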

The arguments to this function are vectors o, cc, d of observed values of $$\tilde T_i$$, $$C_i$$ and $$D_i$$ respectively. The function computes histograms and forms the reduced-sample and Kaplan-Meier estimates of $$F(t)$$ by invoking the functions kaplan.meier and reduced.sample. This approach is efficient when the number of observations (the common length of o, cc and d) is large.

The vectors km and hazard returned by kaplan.meier are (histogram approximations to) the Kaplan-Meier estimator of $$F(t)$$ and its hazard rate $$\lambda(t)$$. Specifically, km[k] is an estimate of F(breaks[k+1]), and hazard[k] is an estimate of the average of $$\lambda(t)$$ over the interval (breaks[k],breaks[k+1]). This approximation is exact only if the survival times are discrete and the histogram breaks are fine enough to ensure that each interval (breaks[k],breaks[k+1]) contains only one possible value of the survival time.
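The idea of forming a Kaplan-Meier estimate from histogram counts can be sketched in base R as follows. This is a minimal illustration, not the spatstat implementation: the helper `km_hist` is hypothetical, and it assumes that every observed time lies within the range of `breaks` and that intervals are of the form (breaks[k], breaks[k+1]].

```r
# Hypothetical helper (not spatstat code): Kaplan-Meier estimate from histogram counts.
km_hist <- function(o, d, breaks) {
  # counts per interval (breaks[k], breaks[k+1]]; the first interval includes breaks[1]
  count <- function(x) as.vector(table(cut(x, breaks, include.lowest = TRUE)))
  n_obs   <- count(o)            # all observed times (censored or not)
  n_death <- count(o[d == 1])    # uncensored observations only
  # number of individuals still at risk at the start of each interval
  at_risk <- length(o) - cumsum(c(0, n_obs[-length(n_obs)]))
  # estimated probability of surviving each interval, given survival up to its start
  surv <- ifelse(at_risk > 0, 1 - n_death / at_risk, 1)
  list(km = 1 - cumprod(surv), at_risk = at_risk)
}

# Discrete survival times with one possible value per interval,
# so the histogram approximation coincides with the usual estimator here.
est <- km_hist(o = c(1, 2, 3, 4), d = c(1, 0, 1, 1), breaks = 0:4)
est$km   # 0.25, 0.25, 0.625, 1
```

In this toy example the second individual is censored at time 2, so the estimate does not change over the second interval, and the risk set shrinks from 4 to 1 across the four intervals.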

The vector rs is the reduced-sample estimator, rs[k] being the reduced sample estimate of F(breaks[k+1]). This value is exact, i.e. the use of histograms does not introduce any approximation error in the reduced-sample estimator.
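To see why histograms lose nothing here, the sketch below compares a histogram-based computation with a direct count at each breakpoint. Both helpers, `rs_hist` and `rs_direct`, are hypothetical and are not the spatstat code. They assume one common form of the reduced-sample estimator, namely the proportion of individuals with $$\tilde T_i \le t$$ among those with $$C_i > t$$; the exact inequality convention used by reduced.sample may differ. All times are assumed to be no smaller than breaks[1].

```r
# Hypothetical helpers (not spatstat code); inequality convention is an assumption.
rs_hist <- function(o, cc, d, breaks) {
  # counts per interval; values beyond the last breakpoint are simply not counted
  count <- function(x) as.vector(table(cut(x, breaks, include.lowest = TRUE)))
  unc <- d == 1
  # uncensored cases with o <= t, minus those whose censoring time is also <= t
  numer <- cumsum(count(o[unc])) - cumsum(count(cc[unc]))
  # cases whose censoring time exceeds t
  denom <- length(cc) - cumsum(count(cc))
  numer / denom
}

rs_direct <- function(o, cc, breaks) {
  # direct count at each upper breakpoint, with no binning
  sapply(breaks[-1], function(s) sum(o <= s & cc > s) / sum(cc > s))
}

o  <- c(1, 2, 3, 2.5)     # observed times
cc <- c(4, 2, 3.5, 5)     # censoring times
d  <- c(1, 0, 1, 1)       # second individual is censored
breaks <- 0:3

rs_hist(o, cc, d, breaks)
rs_direct(o, cc, breaks)
```

The two results agree because cumulative histogram counts evaluated at a breakpoint are exact counts of values not exceeding that breakpoint, so no information is lost by binning.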

##### Value

A list with five elements:

rs

Reduced-sample estimate of the survival time c.d.f. $$F(t)$$

km

Kaplan-Meier estimate of the survival time c.d.f. $$F(t)$$

hazard

corresponding Nelson-Aalen estimate of the hazard rate $$\lambda(t)$$

r

values of $$t$$ for which $$F(t)$$ is estimated

breaks

the breakpoints vector

##### See Also

reduced.sample, kaplan.meier