estimate_cdf.default: Estimation of Failure Probabilities

Description

This function applies a non-parametric method to estimate failure probabilities for lifetime data, either complete or containing (multiple) right-censored observations.

Usage

# S3 method for default
estimate_cdf(
  x,
  status,
  id = NULL,
  method = c("mr", "johnson", "kaplan", "nelson"),
  options = list(),
  ...
)

Value

A tibble containing the following columns:

id : Identification for every unit.
x : Lifetime characteristic.
status : Binary data (0 or 1) indicating whether a unit is a right-censored observation (= 0) or a failure (= 1).
rank : The (computed) ranks. Determined for methods "mr" and "johnson", filled with NA for other methods or if status = 0.
prob : Estimated failure probabilities, NA if status = 0.
cdf_estimation_method : Specified method for the estimation of failure probabilities.

Arguments

x: A numeric vector of lifetime data. Lifetime data can be any characteristic that influences the reliability of a product, e.g. operating time (days/months in service), mileage (km, miles), load cycles.
status: A vector of binary data (0 or 1) indicating whether unit i is a right-censored observation (= 0) or a failure (= 1).
id: A vector for the identification of every unit. Default is NULL.
method: Method used for the estimation of failure probabilities. See 'Details'.
options: A list of named options. See 'Options'.
...: Further arguments passed to or from other methods. Currently not used.

Options

The following options apply only to method "mr":

mr_method : "benard" (default) or "invbeta".
mr_ties.method : "max" (default), "min" or "average".
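
As a rough illustration of these options, the following base R sketch computes ranks under the three tie rules and applies both median-rank formulas to a small complete sample. The data and the helper names benard() and invbeta() are made up for this example and are not part of the package; the package's internal implementation may differ in detail.

```r
# Hypothetical complete sample (no censoring) with one tie at 150.
x <- c(100, 150, 150, 220, 300)
n <- length(x)

# Ranks under the three tie-handling rules, using base R's rank().
rank_max <- rank(x, ties.method = "max")
rank_min <- rank(x, ties.method = "min")
rank_avg <- rank(x, ties.method = "average")

# Benard's approximation for median ranks: (i - 0.3) / (n + 0.4).
benard <- function(i, n) (i - 0.3) / (n + 0.4)

# Exact median ranks: the median of a Beta(i, n - i + 1) distribution.
invbeta <- function(i, n) stats::qbeta(0.5, i, n - i + 1)

# Side-by-side view of the tie rules and both formulas (ranks from "max").
result <- data.frame(
  x = x,
  rank_max = rank_max,
  rank_min = rank_min,
  rank_avg = rank_avg,
  prob_benard = benard(rank_max, n),
  prob_invbeta = invbeta(rank_max, n)
)
print(result)
```

For samples of this size the two formulas differ only in the third decimal place, so the choice between them rarely changes a fitted distribution noticeably.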

Details

The following techniques can be used for the method argument:

"mr" : Median ranks are used to estimate the failure probabilities of failed units; censored units are not considered. Tied observations can be handled in three ways (see 'Options'):
- "max" : Highest observed rank is assigned to tied observations.
- "min" : Lowest observed rank is assigned to tied observations.
- "average" : Mean rank is assigned to tied observations.
Two formulas can be used to determine the cumulative failure probabilities F(t) (see 'Options'):
- "benard" : Benard's approximation for Median Ranks.
- "invbeta" : Exact Median Ranks using the inverse beta distribution.
"johnson" : The Johnson method is used to estimate the failure probabilities of failed units, taking censored units into account. In contrast to the complete-data case, the probabilities are corrected by computing adjusted ranks.
"kaplan" : The method of Kaplan and Meier is used to estimate the survival function S(t) for (multiple) right-censored data. The complement of S(t), i.e. F(t), is returned. In contrast to the original Kaplan-Meier estimator, one modification is made (see 'References').

Note : The Kaplan-Meier estimator does not assign ranks to observations, so the beta-binomial confidence intervals cannot be calculated using this method.
"nelson" : The Nelson-Aalen estimator models the cumulative hazard rate function for (multiple) right-censored data. Equating the formal definition of the cumulative hazard with its Nelson-Aalen estimate yields a formula for the failure probabilities.

Note : The Nelson-Aalen estimator does not assign ranks to observations, so the beta-binomial confidence intervals cannot be calculated using this method.
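
The three censoring-aware methods above can be compared with a minimal base R sketch. It uses a made-up sample without ties and textbook forms of the estimators: Johnson's adjusted ranks (assuming Benard's approximation is then applied to them), the classic Kaplan-Meier product without the modification mentioned above, and F(t) = 1 - exp(-H(t)) for Nelson-Aalen. All variable names are hypothetical and the package's own implementation may differ.

```r
# Hypothetical sample: status 1 = failure, 0 = right-censored (no ties).
x      <- c(50, 80, 110, 140, 200, 260)
status <- c(1, 0, 1, 1, 0, 1)

# Sort by lifetime; n is the total number of units.
ord    <- order(x)
x      <- x[ord]
status <- status[ord]
n      <- length(x)

# Number of units still at risk at each observation (including itself).
at_risk <- n - seq_len(n) + 1

# Johnson: each failure raises the adjusted rank by
# (n + 1 - previous adjusted rank) / (1 + units at risk).
johnson_rank <- rep(NA_real_, n)
prev_rank <- 0
for (i in seq_len(n)) {
  if (status[i] == 1) {
    prev_rank <- prev_rank + (n + 1 - prev_rank) / (1 + at_risk[i])
    johnson_rank[i] <- prev_rank
  }
}
# Assumption: Benard's approximation is applied to the adjusted ranks.
prob_johnson <- (johnson_rank - 0.3) / (n + 0.4)

# Kaplan-Meier (classic form): S(t) is the running product of
# (1 - d_i / n_i); censored units contribute a factor of 1.
surv_km <- cumprod(1 - status / at_risk)
prob_km <- ifelse(status == 1, 1 - surv_km, NA_real_)

# Nelson-Aalen: cumulative hazard H(t) is the running sum of d_i / n_i,
# converted to a failure probability via F(t) = 1 - exp(-H(t)).
cum_hazard <- cumsum(status / at_risk)
prob_na <- ifelse(status == 1, 1 - exp(-cum_hazard), NA_real_)

comparison <- data.frame(x, status, johnson_rank,
                         prob_johnson, prob_km, prob_na)
print(comparison)
```

In this sketch the unmodified Kaplan-Meier estimate reaches exactly 1 at the last failure, a value that cannot be drawn on a probability plot; the other two estimates stay below 1.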

References

NIST/SEMATECH e-Handbook of Statistical Methods, 8.2.1.5. Empirical model fitting - distribution free (Kaplan-Meier) approach, NIST SEMATECH, December 3, 2020

Examples

# Load the package that provides estimate_cdf() and the alloy data set:
library(weibulltools)

# Vectors:
cycles <- alloy$cycles
status <- alloy$status

# Example 1 - Johnson method:
prob_tbl <- estimate_cdf(
  x = cycles,
  status = status,
  method = "johnson"
)


# Example 2 - Method 'mr' with options:
prob_tbl_2 <- estimate_cdf(
  x = cycles,
  status = status,
  method = "mr",
  options = list(
    mr_method = "invbeta",
    mr_ties.method = "average"
  )
)
