h2o.coxph: Trains a Cox Proportional Hazards Model (CoxPH) on an H2O dataset

Description

Trains a Cox Proportional Hazards (CoxPH) model on an H2O dataset.

Usage

h2o.coxph(
  x,
  event_column,
  training_frame,
  model_id = NULL,
  start_column = NULL,
  stop_column = NULL,
  weights_column = NULL,
  offset_column = NULL,
  stratify_by = NULL,
  ties = c("efron", "breslow"),
  init = 0,
  lre_min = 9,
  max_iterations = 20,
  interactions = NULL,
  interaction_pairs = NULL,
  interactions_only = NULL,
  use_all_factor_levels = FALSE,
  export_checkpoints_dir = NULL,
  single_node_mode = FALSE
)

Arguments

x: (Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except event_column, start_column and stop_column are used.
event_column: The name of the binary data column in the training frame that indicates whether an event occurred.
training_frame: Id of the training data frame.
model_id: Destination id for this model; auto-generated if not specified.
start_column: Name of the column containing the start time.
stop_column: Name of the column containing the stop time.
weights_column: Column with observation weights. Giving an observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. A weight is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more because of the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame is zero at that row, which is incorrect. To get an accurate prediction, remove all rows with weight == 0.
offset_column: Offset column. This will be added to the combination of columns before applying the link function.
stratify_by: List of columns to use for stratification.
ties: Method for Handling Ties. Must be one of: "efron", "breslow". Defaults to efron.
init: Coefficient starting value. Defaults to 0.
lre_min: Minimum log-relative error. Defaults to 9.
max_iterations: Maximum number of iterations. Defaults to 20.
interactions: A list of predictor column indices to interact. All pairwise combinations will be computed for the list.
interaction_pairs: A list of pairwise (first order) column interactions.
interactions_only: A list of columns that should only be used to create interactions but should not themselves participate in model training.
use_all_factor_levels: Logical. (Internal. For development only!) Indicates whether to use all factor levels. Defaults to FALSE.
export_checkpoints_dir: Automatically export generated models to this directory.
single_node_mode: Logical. Run on a single node to reduce the effect of network overhead (for smaller datasets). Defaults to FALSE.

Examples

if (FALSE) {
library(h2o)
h2o.init()

# Import the heart dataset
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv"
heart <- h2o.importFile(f)

# Set the predictor and the event (response) column
predictor <- "age"
response <- "event"

# Train a Cox Proportional Hazards model
heart_coxph <- h2o.coxph(x = predictor, training_frame = heart,
                         event_column = response,
                         start_column = "start",
                         stop_column = "stop")
}
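
The ties and max_iterations arguments described above can be set explicitly instead of relying on their defaults. The following is a minimal sketch, not a definitive recipe: it assumes a running H2O cluster with network access to the same heart dataset, and the object name heart_breslow is hypothetical. Scoring the training frame with h2o.predict is done here only for illustration.

```r
library(h2o)
h2o.init()

# Import the same heart dataset used in the example above
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv"
heart <- h2o.importFile(f)

# Train with Breslow tie handling instead of the default ("efron").
# max_iterations is passed explicitly; 20 is the documented default.
# heart_breslow is a hypothetical name chosen for this sketch.
heart_breslow <- h2o.coxph(x = "age",
                           training_frame = heart,
                           event_column = "event",
                           start_column = "start",
                           stop_column = "stop",
                           ties = "breslow",
                           max_iterations = 20)

# Print the fitted model
print(heart_breslow)

# Score the training frame (for illustration only)
pred <- h2o.predict(heart_breslow, newdata = heart)
head(pred)
```

Fitting the model with each ties setting and comparing the printed output is a simple way to see whether the tie-handling method matters for a given dataset.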