Learn R Programming

discSurv (version 2.0.0)

calibrationPlot: Calibration Plots

Description

Calibration plot based on predictions. Overall root mean squared error (RMSE) of predicted and observed discrete hazards is calculated.

Usage

calibrationPlot(
  testPreds,
  testDataLong,
  weights = NULL,
  K = 10,
  event = "e1",
  ...
)

Arguments

testPreds

Predictions on the validation data with model fitted on training data ("numeric vector").

testDataLong

Validation data set in long format ("class data.frame").

weights

optional vector of weights ("numeric vector"). The length of weights must be equal to the number of observations of the validation data set.

K

Number of subsets for plotting ("integer vector").

event

Column names of the event to be considered for plotting (only in case of cause-specific hazards) ("character vector").

...

Additional arguments passed to plot.

Value

Calibration plot

References

bergerTutorialdiscSurv heyardValCompRisksdiscSurv bergerAssessingdiscSurv

See Also

estRecal, dataLong, dataLongCompRisks, dataLongSubDist

Examples

Run this code
# NOT RUN {
####################
# Data preprocessing

# Example unemployment data
library(Ecdat)
data(UnempDur)

# Select subsample
selectInd1 <- 1:100
selectInd2 <- 101:200
trainSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd1], ]
valSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd2], ]

####################
# One event

# Convert to long format
trainSet_long <- dataLong(dataShort = trainSet, timeColumn = "spell", eventColumn = "censor1")
valSet_long <- dataLong(dataShort = valSet, timeColumn = "spell", eventColumn = "censor1")

# Estimate continuation ratio model with logit link
glmFit <- glm(formula = y ~ timeInt + age + logwage, data = trainSet_long, family = binomial())

# Calculate predicted hazards
predHazards <- predict(glmFit, newdata = valSet_long, type = "response")

# Calibration plot
calibrationPlot(predHazards, testDataLong = valSet_long)

############################
# Two cause specific hazards 

# Convert to long format
trainSet_long <- dataLongCompRisks(dataShort = trainSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"))
valSet_long <- dataLongCompRisks(dataShort = valSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"))

# Estimate continuation ratio model with logit link
vglmFit <- VGAM::vglm(formula = cbind(e0, e1, e2) ~ timeInt + age + logwage, data = trainSet_long, 
family = VGAM::multinomial(refLevel = "e0"))

# Calculate predicted hazards
predHazards <- VGAM::predictvglm(vglmFit, newdata = valSet_long, type = "response")

# Calibration plots
calibrationPlot(predHazards, testDataLong = valSet_long)
calibrationPlot(predHazards, testDataLong = valSet_long, event = "e2")

###############################
# Subdistribution hazards model

# Convert to long format
trainSet_long <- dataLongSubDist(dataShort = trainSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"), eventFocus = "censor1")
valSet_long <- dataLongSubDist(dataShort = valSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"), eventFocus = "censor1")

# Estimate continuation ratio model with logit link
glmFit <- glm(formula = y ~ timeInt + age + logwage, data = trainSet_long, 
family = binomial(), weights = trainSet_long$subDistWeights)

# Calculate predicted  hazards 
predHazards <- predict(glmFit, newdata = valSet_long, type = "response")

# Calibration plot 
calibrationPlot(predHazards, testDataLong = valSet_long, weights = valSet_long$subDistWeights)

# }

Run the code above in your browser using DataLab