coxph_mpl.control: Ancillary arguments for controlling coxph_mpl fits

Description

This is used to set various numeric parameters controlling a Cox model fit using coxph_mpl. Typically it would only be used in a call to coxph_mpl. Some basic checks are performed on the inputs, so that impossible argument values (for example, a negative number of events per base) are avoided.

Usage

coxph_mpl.control(n.obs=NULL, basis = "uniform", 
        smooth = NULL, max.iter=c(150,7.5e+04,1e+06),
        tol=1e-7, n.knots = NULL, n.events_basis = NULL, 
        range.quant = c(0.075,.9), cover.sigma.quant = .25, 
        cover.sigma.fixed=.25, min.theta = 1e-10, penalty = 2L,
        order = 3L, kappa = 1/.6, epsilon = c(1e-16, 1e-10), 
        ties = "epsilon", seed = NULL)

Arguments

n.obs

the number of fully observed (i.e., non-censored) outcomes. This argument is only required when basis=="uniform", where it defines the acceptable range of values for n.events_basis.

basis

the name of the basis used to approximate the baseline hazard function. Available options are "uniform", for a step function approximation; "gaussian", using truncated Gaussian densities; "msplines", as defined by Ramsay (1988); and "epanechikov". Default is basis="uniform".

smooth

the smoothing parameter value. When specified, it should be greater than or equal to zero. By default, the smoothing value is set to NULL and its optimal value is estimated via REML. Maximum likelihood estimates are obtained by specifying smooth=0. The effect of the smoothing parameter on the estimates currently depends on the response range, so its value is difficult to interpret.

max.iter

a vector of 3 integers defining the maximum number of iterations allowed when seeking convergence: the first value applies to the smoothing parameter, the second to the Beta and Theta parameters, and the third is the total number of iterations allowed. Default is max.iter=c(150,7.5e+04,1e+06).

tol

the convergence tolerance value. Convergence is achieved when the maximum absolute difference between the parameter estimates at iteration k and iteration k-1 is smaller than tol. Default is tol=1e-7.
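The stopping rule described for tol can be sketched in base R as follows. This is only an illustration of the criterion as stated above: the helper name has_converged and the example estimates are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): TRUE when the largest
# absolute change between two successive sets of estimates is below tol.
has_converged <- function(current, previous, tol = 1e-7) {
  max(abs(current - previous)) < tol
}

# Made-up estimates at iterations k-1 and k
previous <- c(0.50000000, 1.20000000)
current  <- c(0.50000004, 1.20000001)

has_converged(current, previous)              # changes are below 1e-7
has_converged(current, previous, tol = 1e-9)  # a stricter tolerance is not met
```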

n.knots

a vector of 2 integers defining how the internal knot sequence of non-uniform bases should be set (the minimum and maximum observations define the external knots). The first value specifies the number of quantile knots to be set between the range.quant quantiles of the fully observed (i.e., non-censored) outcomes. The second value specifies the number of equally spaced knots to be set outside the range of the quantile knots. The first and last equally spaced knots equal the minimum and maximum response values. When the number of quantile knots is larger than 0, the other equally spaced knots are set between the largest quantile knot and the maximum outcome value. The minimal total number of knots is 3. Default is n.knots=c(8,2) when basis=="msplines" and n.knots=c(0,20) otherwise.

n.events_basis

an integer specifying the number of fully observed (i.e., non-censored) outcomes per uniform base. The value has to be greater than or equal to one and smaller than n.obs divided by 2. Default is round(3.5*log(n.obs)-7.5) if it belongs to the accepted range of values.
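As a rough illustration of the default rule for n.events_basis, the following base R sketch evaluates round(3.5*log(n.obs)-7.5) and checks it against the accepted range stated above. The helper name default_events_per_base is hypothetical; what the package does when the candidate falls outside the range is not documented here, so the sketch only reports the check.

```r
# Hypothetical helper (not part of the package): compute the documented
# default and report whether it lies in [1, n.obs / 2).
default_events_per_base <- function(n.obs) {
  candidate <- round(3.5 * log(n.obs) - 7.5)
  in_range <- candidate >= 1 && candidate < n.obs / 2
  list(candidate = candidate, in_range = in_range)
}

default_events_per_base(100)  # candidate 9, inside the accepted range
default_events_per_base(5)    # candidate -2, outside the accepted range
```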

range.quant

a vector of length 2 defining the range of the quantile knots when a non-uniform basis is chosen. By default, range.quant = c(0.075,.9) such that n.knots[1] quantile knots are set between the quantiles 0.075 and 0.9 of the fully observed (i.e., non-censored) outcomes.
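One way to picture the role of range.quant is the base R sketch below. It assumes the quantile knots sit at evenly spaced probabilities between the two range.quant values; the exact placement used by the package is not stated in this page, and the variable names and data are made up for illustration.

```r
# Made-up fully observed outcomes
events <- (1:200) / 10

n.knots     <- c(8, 2)        # documented default for basis == "msplines"
range.quant <- c(0.075, 0.9)  # documented default

# Assumption: probabilities are evenly spaced between the two quantile bounds
probs <- seq(range.quant[1], range.quant[2], length.out = n.knots[1])
quantile_knots <- quantile(events, probs = probs, names = FALSE)

quantile_knots
```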

cover.sigma.quant

the proportion of fully observed (i.e., non-censored) outcomes that should belong to the interval defined by the quantiles 0.025 and 0.975 of each truncated Gaussian base corresponding to a quantile knot (see n.knots). This value is used to define the standard deviation of these bases. Default is cover.sigma.quant=.25.

cover.sigma.fixed

the proportion of the outcome range that should belong to the interval defined by the quantiles 0.025 and 0.975 of each untruncated Gaussian base corresponding to each fixed knot (see n.knots). Default is cover.sigma.fixed=.25.

min.theta

a value indicating the minimal baseline hazard parameter value in the output (i.e., after the fit). Baseline hazard parameter estimates lower than min.theta will be considered as zero. Consequently, in the inference, these zero estimates will correspond to active constraints as defined by Moore, Sadler and Kozick (2008). Default is 1e-10.
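The thresholding described for min.theta can be illustrated with a short base R sketch. The estimates below are made up, and the variable names theta_hat, theta_out and active are hypothetical; the sketch only shows which estimates would be treated as zero and therefore flagged as active constraints.

```r
# Made-up baseline hazard parameter estimates
theta_hat <- c(0.8, 3e-12, 0.05, 0, 2e-10)
min.theta <- 1e-10  # documented default

# Estimates below min.theta are treated as zero
theta_out <- ifelse(theta_hat < min.theta, 0, theta_hat)

# Zero estimates correspond to active constraints in the inference
active <- theta_out == 0

theta_out
which(active)
```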

penalty

an integer specifying the order of the penalty matrix (see Ma, Heritier and Lo (2014)). Currently, the first and second order penalty matrices are available for the "uniform" and "gaussian" bases, the second order penalty matrix is available for the "epanechikov" basis, and the penalty matrix of the "msplines" basis is set to order-1 (see order below). Default is penalty=2.

order

an integer specifying the order of the "msplines" (as defined by Ramsay (1988)) and "epanechikov" bases. Default is order=3. M-splines of order 1 correspond to a uniform base (with density equal to one) and M-splines of order 2 correspond to a triangular base.

kappa

a value larger than 1 used in the fitting algorithm to decrease the step size when the penalised likelihood doesn't increase during the iterative process. Default is kappa=1/.6.
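The role of kappa can be pictured with a toy step-shrinking loop in base R. This is not the package's fitting algorithm: the one-dimensional objective below merely stands in for the penalised likelihood, and the function names, the starting step of 1 and the max.shrink cap are all assumptions made for illustration.

```r
# Toy objective to maximise (stand-in for the penalised likelihood)
objective <- function(x) -(x - 2)^2

# Hypothetical helper: divide the step by kappa until the objective increases
shrink_step <- function(x, direction, kappa = 1/.6, max.shrink = 50) {
  step <- 1
  for (i in seq_len(max.shrink)) {
    candidate <- x + step * direction
    if (objective(candidate) > objective(x)) {
      return(list(x = candidate, step = step))
    }
    step <- step / kappa  # kappa > 1, so the step gets smaller
  }
  list(x = x, step = 0)   # no improving step found within max.shrink tries
}

# The full step overshoots (x = 4 is no better than x = 0), so the step is
# shrunk once, to about 0.6, and x = 2.4 is accepted.
res <- shrink_step(x = 0, direction = 4)
res
```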

epsilon

a vector of 2 values indicating the minimum distance from 1 for the survival function and from 0 for the baseline parameter estimates, respectively, in order to avoid problems with logarithms in the fitting algorithm. Default is epsilon=c(1e-16, 1e-10).

ties

a character string indicating the method used to handle duplicated outcomes when defining the knot sequence (see n.events_basis and n.knots). Currently available options are "epsilon", which adds a random noise smaller than 1e-10 to each duplicated fully observed (i.e., non-censored) outcome, and "unique", which deletes duplicated fully observed (i.e., non-censored) outcomes when defining the knot sequence. Default is ties="epsilon".
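The two options for ties can be sketched in base R as below. The helper handle_ties is hypothetical and not part of the package. It assumes that only the repeated occurrences of a value (as flagged by duplicated()) receive noise and that the noise is drawn with runif(); the package may treat duplicates differently.

```r
# Hypothetical helper (not part of the package)
handle_ties <- function(events, ties = c("epsilon", "unique")) {
  ties <- match.arg(ties)
  dup <- duplicated(events)  # flags second and later occurrences
  if (ties == "unique") {
    return(events[!dup])     # drop the repeated outcomes
  }
  # "epsilon": add a small random noise (below 1e-10) to the repeats
  events[dup] <- events[dup] + runif(sum(dup), min = 0, max = 1e-10)
  events
}

# Made-up fully observed outcomes with ties
events <- c(1.5, 2.0, 2.0, 3.2, 3.2, 3.2)

handle_ties(events, "unique")

set.seed(42)  # only to make the illustration reproducible
jittered <- handle_ties(events, "epsilon")
jittered - events
```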

seed

NULL or an integer vector compatible with .Random.seed: the seed to be used when adding a random noise to duplicate events when ties="epsilon". The current value of .Random.seed will be preserved if seed is set, i.e. non-NULL; otherwise, as by default, .Random.seed will be used and modified as usual from calls to runif() etc.

Value

a list containing the values of each of the above arguments (except n.obs).

References

Ma, J. and Heritier, S. and Lo, S. (2014), On the Maximum Penalised Likelihood Approach for Proportional Hazard Models with Right Censored Survival Data. Computational Statistics and Data Analysis 74, 142-156.

Moore, T. J. and Sadler, B. M. and Kozick, R. J. (2008), Maximum-Likelihood Estimation, the Cramer-Rao Bound, and the Method of Scoring With Parameter Constraints. IEEE Transactions on Signal Processing 56, 3, 895-907.

Ramsay, J. O. (1988), Monotone Regression Splines in Action. Statistical Science 3, 4, 425-441.