
NPHazardRate (version 0.1)

DiscretizeData: Discretize the available data set

Description

Defines equispaced, disjoint intervals based on the range of the sample and calculates the empirical hazard rate estimate at each interval center.

Usage

DiscretizeData(xin, xout)

Arguments

xin

A vector of input values

xout

Grid points where the function will be evaluated

Value

A list with the components BinCenters (the bin centers), ci (the empirical hazard rate estimates at the bin centers) and Delta (the subinterval length), as accessed in the example below.

Details

The function sets the subinterval length to \(\Delta = (0.8\max(X_i) - \min(X_i))/N\), where \(N\) is the sample size. At each bin (subinterval) center, the empirical hazard rate estimate is then calculated as $$ c_i = \frac{f_i}{\Delta(N-F_i +1) } $$ where \(f_i\) is the frequency of observations in the \(i\)th bin and \(F_i = \sum_{j\leq i} f_j\) is the cumulative frequency up to and including bin \(i\) (the unnormalized empirical cumulative distribution estimate).
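The computation described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `discretize_sketch` is hypothetical, and it assumes that the \(N\) bins start at \(\min(X_i)\), are left-closed, and that observations beyond the last bin edge are simply not counted.

```r
# Hypothetical helper illustrating the formulas in Details;
# NOT the NPHazardRate implementation.
discretize_sketch <- function(xin) {
  n <- length(xin)
  delta <- (0.8 * max(xin) - min(xin)) / n      # subinterval length Delta
  stopifnot(delta > 0)                          # needs 0.8 * max(xin) > min(xin)

  # Assumption: n bins of width delta starting at min(xin)
  breaks  <- min(xin) + delta * (0:n)           # n + 1 bin edges
  centers <- (breaks[-1] + breaks[-(n + 1)]) / 2

  # Left-closed bins [b_i, b_{i+1}); findInterval() returns n + 1 for
  # values at or beyond the last edge, which are dropped here
  idx <- findInterval(xin, breaks)
  fi  <- tabulate(idx[idx >= 1 & idx <= n], nbins = n)  # bin frequencies f_i
  Fi  <- cumsum(fi)                                     # cumulative frequencies F_i

  ci <- fi / (delta * (n - Fi + 1))             # empirical hazard rate estimates c_i
  list(BinCenters = centers, ci = ci, Delta = delta)
}

# Small usage example on simulated Weibull data
set.seed(1)
sketch <- discretize_sketch(rweibull(100, 0.6, 1))
head(cbind(center = sketch$BinCenters, hazard = sketch$ci))
```

Since \(F_i \leq N\), the denominator \(\Delta(N - F_i + 1)\) is always at least \(\Delta\), so the estimate is well defined in every bin, including empty ones, where it equals zero.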

Examples

# NOT RUN {
x<-seq(0, 5,length=100) #design points where the estimate will be calculated
SampleSize<-100 #amount of data to be generated
ti<- rweibull(SampleSize, .6, 1) # draw a random sample
ui<-rexp(SampleSize, .2)         # censoring sample
cat("\n AMOUNT OF CENSORING: ", length(which(ti>ui))/length(ti)*100, "\n")
x1<-pmin(ti,ui)                  # observed data
cen<-rep.int(1, SampleSize)      # initialize censoring indicators
cen[which(ti>ui)]<-0             # 0's correspond to censored observations

a.use<-DiscretizeData(ti, x)     # discretize the data
BinCenters<-a.use$BinCenters     # get the bin centers
ci<-a.use$ci                     # get empirical hazard rate estimates
Delta<-a.use$Delta               # bin (subinterval) length


# }
