Performs stochastic gradient descent optimisation for large-scale survival models after removing observations with missing values.
bigSurvSGD.na.omit(
formula = survival::Surv(time = time, status = status) ~ .,
data,
norm.method = "standardize",
features.mean = NULL,
features.sd = NULL,
opt.method = "AMSGrad",
beta.init = NULL,
beta.type = "averaged",
lr.const = 0.12,
lr.tau = 0.5,
strata.size = 20,
batch.size = 1,
num.epoch = 100,
b1 = 0.9,
b2 = 0.99,
eps = 1e-08,
inference.method = "plugin",
num.boot = 1000,
num.epoch.boot = 100,
boot.method = "SGD",
lr.const.boot = 0.12,
lr.tau.boot = 0.5,
num.sample.strata = 1000,
sig.level = 0.05,
beta0 = 0,
alpha = NULL,
lambda = NULL,
nlambda = 100,
num.strata.lambda = 10,
lambda.scale = 1,
parallel.flag = FALSE,
num.cores = NULL,
bigmemory.flag = FALSE,
num.rows.chunk = 1e+06,
col.names = NULL,
type = "float"
)

A fitted model object storing the learned coefficients, optimisation metadata, and any requested inference summaries. Its components include:

coef: Log hazard ratios. If no inference is used, a vector of estimated coefficients; if inference is used, a matrix of estimates and confidence intervals. Under penalisation, a matrix with one column per lambda.
coef.exp: Exponentiated version of coef (hazard ratios).
lambda: The lambda value(s) used for penalisation.
alpha: The alpha value used for penalisation.
features.mean: Means of the features, if supplied or calculated.
features.sd: Standard deviations of the features, if supplied or calculated.
formula: Model formula describing the survival outcome and the set of predictors to include in the optimisation.
data: Input data set, or a connection to a bigmemory-backed design matrix, containing the variables referenced in formula.
norm.method: Normalisation strategy applied to the feature matrix before optimisation, for example centring or standardising columns.
features.mean: Optional pre-computed column means used when normalising the features, so that repeated fits can reuse shared statistics.
features.sd: Optional pre-computed column standard deviations used together with features.mean for scaling the predictors.
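As a sketch of how precomputed statistics can be reused across fits (the helper below is illustrative; the package's internal normalisation routine may differ), standardising with shared means and standard deviations looks like:

```r
# Standardise a feature matrix, optionally reusing precomputed statistics
standardize_features <- function(X, features.mean = NULL, features.sd = NULL) {
  if (is.null(features.mean)) features.mean <- colMeans(X)
  if (is.null(features.sd))   features.sd   <- apply(X, 2, sd)
  Xs <- sweep(X,  2, features.mean, "-")   # centre each column
  Xs <- sweep(Xs, 2, features.sd,   "/")   # scale each column
  list(X = Xs, features.mean = features.mean, features.sd = features.sd)
}

set.seed(1)
X    <- matrix(rnorm(20), nrow = 5)
fit1 <- standardize_features(X)
# A second fit can reuse the stored statistics for identical scaling
fit2 <- standardize_features(X, fit1$features.mean, fit1$features.sd)
```

Reusing the stored statistics guarantees that a refit or a prediction step scales new columns exactly as the original fit did.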
opt.method: Gradient-based optimisation routine to employ, such as vanilla SGD or adaptive methods like Adam and AMSGrad.
beta.init: Vector of starting values for the regression coefficients, supplied when warm-starting the optimisation.
beta.type: Indicator controlling how the reported coefficients are formed, for example "averaged" over SGD iterates rather than taken from the final iterate.
lr.const: Base learning-rate constant used by the stochastic gradient descent routine.
lr.tau: Learning-rate decay exponent that moderates the step-size schedule.
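One common way these two parameters combine (an assumption here, not necessarily the package's exact formula) is a polynomially decaying step size, eta_t = lr.const / t^lr.tau:

```r
# Hypothetical step-size schedule: eta_t = lr.const / t^lr.tau
learning_rate <- function(t, lr.const = 0.12, lr.tau = 0.5) {
  lr.const / t^lr.tau
}

learning_rate(1)    # first step uses the full constant, 0.12
learning_rate(100)  # decayed by a factor of sqrt(100) = 10, i.e. 0.012
```

With lr.tau = 0.5 the step size shrinks like 1/sqrt(t), a standard choice that balances fast early progress against stable late-stage convergence.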
strata.size: Number of observations drawn per stratum when building mini-batches for the optimisation loop.
batch.size: Number of strata combined into each stochastic-gradient batch.
num.epoch: Number of passes over the training data during optimisation.
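The batching scheme can be pictured as drawing batch.size small strata of strata.size observations each; the sketch below is a simplified illustration of that sampling step (the package's internal sampler is an assumption here):

```r
# Sketch: draw a mini-batch of strata, each holding strata.size observations
sample_batch <- function(n, strata.size = 20, batch.size = 1) {
  replicate(batch.size,
            sample.int(n, strata.size, replace = FALSE),
            simplify = FALSE)
}

set.seed(42)
batch <- sample_batch(n = 1000, strata.size = 20, batch.size = 4)
length(batch)   # 4 strata in this batch
lengths(batch)  # each stratum holds 20 observation indices
```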
b1: First exponential moving-average rate used by adaptive methods such as Adam and AMSGrad to smooth gradients.
b2: Second exponential moving-average rate used by adaptive methods to smooth squared gradients.
eps: Numerical-stabilisation constant added to denominators when updating the adaptive moments.
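To make the roles of b1, b2, and eps concrete, here is a minimal sketch of a textbook AMSGrad update (standard form, assumed rather than taken from the package source):

```r
# One AMSGrad-style update for a coefficient vector (sketch)
amsgrad_step <- function(beta, grad, state, lr, b1 = 0.9, b2 = 0.99, eps = 1e-8) {
  state$m    <- b1 * state$m + (1 - b1) * grad    # smoothed gradient (b1)
  state$v    <- b2 * state$v + (1 - b2) * grad^2  # smoothed squared gradient (b2)
  state$vhat <- pmax(state$vhat, state$v)         # AMSGrad: non-decreasing second moment
  beta <- beta - lr * state$m / (sqrt(state$vhat) + eps)  # eps guards the division
  list(beta = beta, state = state)
}

state <- list(m = 0, v = 0, vhat = 0)
step  <- amsgrad_step(beta = c(0, 0), grad = c(1, -1), state = state, lr = 0.1)
```

The eps term only matters when the accumulated squared gradients are near zero, which is why its default is tiny.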
inference.method: Inference approach applied after fitting, for example plug-in asymptotics or bootstrap resampling.
num.boot: Number of bootstrap replicates to draw when inference.method relies on resampling.
num.epoch.boot: Number of optimisation epochs to run within each bootstrap replicate.
boot.method: Optimisation routine used within each bootstrap refit, for example SGD.
lr.const.boot: Learning-rate constant used during bootstrap refits.
lr.tau.boot: Learning-rate decay factor applied during bootstrap refits.
num.sample.strata: Number of strata sampled without replacement during each bootstrap iteration when stratified resampling is selected.
sig.level: Significance level used when constructing confidence intervals or hypothesis tests.
beta0: Optional vector of coefficients under the null hypothesis when performing hypothesis tests.
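For the plug-in case, sig.level enters a standard Wald-style interval; the helper below is a generic sketch of that construction (the package's own interval code is an assumption):

```r
# Wald-style confidence interval from an estimate and its standard error
plugin_ci <- function(beta.hat, se, sig.level = 0.05) {
  z <- qnorm(1 - sig.level / 2)  # e.g. 1.96 for sig.level = 0.05
  cbind(lower    = beta.hat - z * se,
        estimate = beta.hat,
        upper    = beta.hat + z * se)
}

plugin_ci(beta.hat = 0.5, se = 0.1)
```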
alpha: Elastic-net mixing parameter controlling the relative weight of the \(\ell_1\) and \(\ell_2\) regularisation penalties.
lambda: Sequence of regularisation strengths supplied explicitly for penalised estimation.
nlambda: Number of automatically generated lambda values when a grid is produced internally.
num.strata.lambda: Number of strata used when tuning lambda via cross-validation or other search procedures.
lambda.scale: Scale on which the lambda grid is generated, for example logarithmic or linear spacing.
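The standard elastic-net penalty combines the two norms via alpha, and a log-spaced grid is a common way to generate nlambda values; both helpers below are generic sketches, not the package's internal code:

```r
# Elastic-net penalty: lambda * (alpha * ||beta||_1 + (1 - alpha)/2 * ||beta||_2^2)
elastic_net_penalty <- function(beta, lambda, alpha) {
  lambda * (alpha * sum(abs(beta)) + (1 - alpha) / 2 * sum(beta^2))
}

# A log-spaced grid of nlambda values from lambda.max down to lambda.max * ratio
lambda_grid <- function(lambda.max, nlambda = 100, ratio = 1e-3) {
  exp(seq(log(lambda.max), log(lambda.max * ratio), length.out = nlambda))
}

elastic_net_penalty(c(1, -2), lambda = 0.5, alpha = 1)  # pure lasso: 0.5 * 3 = 1.5
```

alpha = 1 gives the pure lasso penalty, alpha = 0 the pure ridge penalty, and intermediate values blend the two.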
parallel.flag: Logical flag enabling parallel computation of gradients or bootstrap replicates.
num.cores: Number of processing cores to use when parallel execution is enabled.
bigmemory.flag: Logical flag indicating whether intermediate matrices should be stored as bigmemory-backed objects.
num.rows.chunk: Row chunk size used when streaming data from an on-disk matrix representation.
col.names: Optional character vector of column names for the feature matrix.
type: Storage type used for the bigmemory-backed matrix, for example "float" or "double".
See Also: bigSurvSGD, bigscale for constructing normalised design matrices, and partialbigSurvSGDv0 for partial fitting pipelines.
# \donttest{
data(micro.censure, package = "bigPLScox")
surv_data <- stats::na.omit(micro.censure[, c("survyear", "DC", "sexe", "Agediag")])
# Increase num.epoch and num.boot for real use
fit <- bigSurvSGD.na.omit(
  survival::Surv(survyear, DC) ~ .,
  data = surv_data,
  norm.method = "standardize",
  opt.method = "adam",
  batch.size = 16,
  num.epoch = 2
)
# }