This function creates an object of class reg, which holds the regularisation type, tuning parameter scales, penalty factors and a control list for the optimisation. It is, in effect, the control panel for the optimisation in rodeo and aim.
reg(reg_type = "l1", a = NULL, lower = -Inf, upper = Inf,
lambda_factor = 1, exclude = NULL, penalty_factor = NULL,
contexts = NULL, scales = NULL, fixed = FALSE, screen = NULL,
exact_gradient = NULL, step_max = 200,
step_screen = ceiling(step_max/10), step_cycle = step_screen,
backtrack_max = 50, tol = 1e-05, tau_init = 0.1, tau_min = 1e-10,
tau_scale = 0.5)
reg_type: Character string determining the regularisation. Must be one of: "l1" (default), "l2", "elnet", "scad", "mcp" or "none". See details.
a: Numeric value of the tuning parameter in elnet (must be between 0 and 1, default is 0.5), scad (must be larger than 2, default is 3.7) or mcp (must be larger than 1, default is 3).
lower: Either a numeric vector of length 1 or p, giving the lower limit(s) for the parameters; must be smaller than or equal to upper.
upper: Either a numeric vector of length 1 or p, giving the upper limit(s) for the parameters; must be larger than or equal to lower.
lambda_factor: Non-negative numeric value which will be multiplied with the tuning parameter in opt.
exclude: A vector of indices to exclude from the model (this is how to specify an infinite penalty factor). Default is none.
penalty_factor: A non-negative vector (p) of individual penalty weights. Defaults to 1 for each parameter. Will always be rescaled so its mean is 1.
contexts: A non-negative matrix (p x s) specifying the design at context level, see details. Defaults to a matrix of ones.
scales: A positive vector (p) of scales, see details. Defaults to a vector of ones. Will be rescaled so its mean is 1.
fixed: Logical indicating whether this parameter is fixed or should be optimised.
screen: Logical indicating whether a faster optimisation relying on screening should be used. If NULL, it is set to TRUE iff p > 20.
exact_gradient: Logical indicating whether the exact gradient should be used. If NULL, it is set to TRUE iff p > 20.
step_max: Positive integer giving the maximal number of steps in the optimisation procedure, per lambda.
step_screen: Positive integer giving the number of steps between screenings (defaults to ceiling(step_max / 10)).
step_cycle: Positive integer giving the number of steps per optimisation cycle; defaults to step_screen.
backtrack_max: Positive integer giving the maximal number of backtracking steps taken in each optimisation step.
tol: Positive numeric tolerance level used for the parameter space.
tau_init: Positive initial step length for backtracking.
tau_min: Positive numeric giving the minimal value of the step length in backtracking.
tau_scale: Scaling parameter of the step length; must be in (0, 1).
The function returns an object with S3 class "reg".
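A minimal construction sketch (assuming these functions come from the episode package and that it is attached):

# Minimal sketch: build a 'reg' object with elastic net regularisation
library(episode)
r <- reg(reg_type = "elnet", a = 0.25, step_max = 500)
class(r)   # "reg", per the value described above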
The data (y in opt) is assumed generated from s different contexts. A new context begins whenever the time (the first column of y in opt) decreases. Hence s - 1 is the number of decreases in the time points.
Each context has its own initial condition and parameter vector specified through contexts in reg. More precisely, the effective parameter in context l is the element-wise product of a baseline parameter (scaled by scales in reg) and the l'th column of contexts. This enables the user to pre-specify different scales for each parameter, as well as different scales for the same parameter across contexts.
The default setup sets contexts to a matrix of ones. The following are examples of case-specific modifications for the mak ODE class (a sketch of a resulting contexts matrix follows the list):
- If reaction j does not take place in context l, set \(contexts_{j,l} = 0\).
- If reaction j has a \(50\%\) increase in rate in context l, set \(contexts_{j,l} = 1.5\).
- If reaction j has independent rates in contexts l and l', write that reaction twice in the mak object (with j' denoting its second appearance) and set \(contexts_{j,l'} = 0\) and \(contexts_{j',l} = 0\).
The loss function optimised in rodeo is:
$$RSS/(2(n - s)) + \lambda \sum_{parameter\ arguments} \lambda_{factor} \sum_{j=1}^p v_j \, pen(param_j)$$
where \(\lambda\) belongs to the lambda sequence in the opt object and v is penalty_factor in reg. Moreover, the residual sum of squares, RSS, is given as:
$$RSS = \sum_{i=1}^{n} ||(y(t_i) - x(t_i, {x_0}_l, context_l * param)) * \sqrt{w(t_i)}||_2^2$$
where param has been (internally) scaled with scales, and \(w(t_i)\) and \(y(t_i)\) refer to the i'th rows of weights and y in opt (y with its first column removed), respectively. The solution to the ODE system is the x()-function. The subscript 'l' refers to the context, i.e., the columns of contexts and x0 in the rodeo functions (x0 is the initial state of the system at the first time point after a decrease in the time vector). All products are Hadamard products.
The type of regularisation is chosen via reg_type in reg (the penalties listed below are written out in the sketch following the list):
l1: Least Absolute Shrinkage and Selection Operator (Lasso) penalty. The penalty is the absolute value: \(pen(param_j) = |param_j|\).
l2: Ridge penalty. The penalty is the square: \(pen(param_j) = param_j^2/2\).
elnet: Elastic net. A convex combination of the l1 and l2 penalties: \(pen(param_j) = (1-a)param_j^2/2 + a|param_j|\), \(0 \le a \le 1\). Note that if a = 0 or a = 1, the penalty automatically reduces to "l2" or "l1", respectively.
scad: Smoothly Clipped Absolute Deviation penalty: $$pen(param_j) = \int^{param_j}_{0}{\min(\max((a\lambda - |param|)/(\lambda(a-1)), 0), 1)\, dparam}, \quad a > 2.$$
mcp: Minimax Concave Penalty: $$pen(param_j) = \int^{param_j}_{0}{\max(1 - |param|/(\lambda a), 0)\, dparam}, \quad a > 1.$$
none: No penalty. Not recommended for large systems. This overwrites both user-supplied and automatically generated lambda sequences.
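The penalty functions can be written out directly; the following is a sketch under the definitions above (the helper names are illustrative, not part of the package, and SCAD/MCP are evaluated by numerically integrating their derivatives):

pen_l1    <- function(x) abs(x)
pen_l2    <- function(x) x^2 / 2
pen_elnet <- function(x, a = 0.5) (1 - a) * x^2 / 2 + a * abs(x)

# SCAD and MCP are defined through integrals of their derivatives
scad_deriv <- function(x, lambda, a = 3.7)
  pmin(pmax((a * lambda - abs(x)) / (lambda * (a - 1)), 0), 1)
mcp_deriv <- function(x, lambda, a = 3)
  pmax(1 - abs(x) / (lambda * a), 0)
pen_scad <- function(x, lambda = 1, a = 3.7)
  integrate(scad_deriv, 0, x, lambda = lambda, a = a)$value
pen_mcp <- function(x, lambda = 1, a = 3)
  integrate(mcp_deriv, 0, x, lambda = lambda, a = a)$value

pen_scad(0.5)   # 0.5: SCAD equals |x| when |x| <= lambda
pen_mcp(0.5)    # 0.5 - 0.5^2/(2 * 3), approx. 0.458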
A proximal-gradient type of optimisation is employed. The step length is denoted \(\tau\).
The flag screen (in reg) is passed on to the optimisation; it tells the optimisation to focus entirely on optimising a subset of the parameters, selected for having large gradients. At most step_screen (in reg) steps are taken before the screened subset is re-evaluated.
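For instance, a quick sketch enabling screening explicitly with a tighter re-screening interval:

# Force screening on and re-evaluate the screened subset every 10 steps
r <- reg("l1", screen = TRUE, step_screen = 10)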
The convergence criterion is \(\Delta \eta < 10^3 * \max(|(\Delta param \neq 0)| + |(\Delta x0 \neq 0)|, 1)\) AND \(\Delta loss / \Delta \eta < tol_l\), where $$\Delta \eta = ||\Delta param||/tol_{param} + ||\Delta x0|| / tol_{x_0}$$
The lambda sequence can either be fully customised through lambda in opt or automatically generated. In the former case, a monotonic lambda sequence is highly recommended. Throughout all optimisations, each optimisation step re-uses the old optimum when sweeping through the lambda sequence.
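A sketch of a fully customised, monotonically decreasing sequence (assuming opt() from the same package takes the data and a lambda argument as described; y is the hypothetical data from the earlier sketch):

# Log-equidistant, decreasing custom lambda sequence
o <- opt(y, lambda = 10^seq(1, -3, length.out = 25))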
If lambda is NULL, an automatically generated sequence is used. A maximal value of lambda (the smallest at which 0 is an optimum in the rate parameter) is calculated, and a log-equidistant sequence of length nlambda is generated, with the ratio between the smallest and largest lambda value being lambda.min.ratio. Note: when the opt object is passed to rodeo.ode, one may want to initialise the optimisation at a non-zero parameter and run the optimisation on the reversed lambda sequence. This is indicated by setting decrease.lambda = FALSE. If, however, the opt object is passed to aim, glmnet ultimately decides on the lambda sequence and may cut it short.
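Correspondingly, a sketch of an automatically generated sequence run in increasing order (nlambda, lambda.min.ratio and decrease.lambda are the arguments referenced above; their exact placement in opt() is assumed here):

# Auto-generated sequence of 50 values, swept from smallest to largest
o <- opt(y, nlambda = 50, lambda.min.ratio = 1e-4, decrease.lambda = FALSE)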
Two gradient evaluations are implemented: exact and approximate. The computational time of the exact gradient is generally longer than that of the approximate gradient, but its relative magnitude depends on the solver in the ode object passed to rodeo.
See also: aim, rodeo, rodeo.aim, rodeo.ode
# Use 'reg' object to specify regularisation when creating 'ode' objects
# Example: power law kinetics with SCAD penalty on rate parameter
A <- matrix(
  c(1, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 1, 0), ncol = 4, byrow = TRUE)
p <- plk(A, r = reg("scad"))
# Example: power law kinetics as above, with lower bound of -1
p <- plk(A, r = reg("scad", lower = -1))
# Example: rational mass action kinetics with
# MCP penalty on first parameter argument and l2 on second
B <- matrix(
  c(0, 0, 1, 0,
    1, 1, 0, 0,
    1, 0, 0, 1), ncol = 4, byrow = TRUE)
rmak <- ratmak(A, B, r1 = reg("mcp"), r2 = reg("l2"))
# Example: mass action kinetics with
# no penalty on rate parameter and l2 on initial state
m <- mak(A, B, r = reg("none"), rx0 = reg("l2"))