This function creates an object of class reg, which holds the regularisation type, tuning parameter scales, penalty factors and a control list for the optimisation. It is, in effect, the control panel for the optimisation in rodeo and aim.
reg(reg_type = "l1", a = NULL, lower = -Inf, upper = Inf,
lambda_factor = 1, exclude = NULL, penalty_factor = NULL,
contexts = NULL, scales = NULL, fixed = FALSE, screen = NULL,
exact_gradient = NULL, step_max = 200,
step_screen = ceiling(step_max/10), step_cycle = step_screen,
backtrack_max = 50, tol = 1e-05, tau_init = 0.1, tau_min = 1e-10,
tau_scale = 0.5)
reg_type: Character string determining the regularisation. Must be one of: "l1" (default), "l2", "elnet", "scad", "mcp" or "none". See details.
a: Numeric value of the tuning parameter in elnet (must be between 0 and 1, default is 0.5), scad (must be larger than 2, default is 3.7) or mcp (must be larger than 1, default is 3).
lower: Either a numeric vector of length 1 or p, giving the lower limit(s) for the parameters; must be smaller than or equal to upper.
upper: Either a numeric vector of length 1 or p, giving the upper limit(s) for the parameters; must be larger than or equal to lower.
lambda_factor: Non-negative numeric value which will be multiplied with the tuning parameter in opt.
exclude: A vector of indices to exclude from the model (this is how to specify an infinite penalty factor). Default is none.
penalty_factor: A non-negative vector (p) of individual penalty weights. Defaults to 1 for each parameter. Will always be rescaled so its mean is 1.
contexts: A non-negative matrix (p x s) specifying the design at context level, see details. Defaults to a matrix of ones.
scales: A positive vector (p) of scales, see details. Defaults to a vector of ones. Will be rescaled so its mean is 1.
fixed: Logical indicating whether this parameter is fixed or should be optimised.
screen: Logical indicating whether a faster optimisation relying on screening should be used. If NULL, it is set to TRUE iff p > 20.
exact_gradient: Logical indicating whether the exact gradient should be used. If NULL, it is set to TRUE iff p > 20.
step_max: Positive integer giving the maximal number of steps in the optimisation procedure, per lambda.
step_screen: Positive integer giving the number of steps between screenings (defaults to ceiling(step_max / 10)).
step_cycle: Positive integer giving the number of steps per optimisation cycle; defaults to step_screen.
backtrack_max: Positive integer giving the maximal number of backtracking steps taken in each optimisation step.
tol: Positive numeric tolerance level used for the parameter space.
tau_init: Positive initial step length for backtracking.
tau_min: Positive numeric giving the minimal value of the step length in backtracking.
tau_scale: Scaling parameter of the step length; must be in (0, 1).
The function returns an object with S3 class "reg".
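A minimal construction sketch (assuming these functions come from the episode package and that it is attached):

# Minimal sketch: build a 'reg' object with elastic net regularisation
library(episode)
r <- reg(reg_type = "elnet", a = 0.25, step_max = 500)
class(r)   # "reg", per the value described above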
The data (y in opt) is assumed generated from s different contexts. A new context begins whenever the time (the first column of y in opt) decreases. Hence s - 1 is the number of decreases in the time points.
Each context has its own initial condition and parameter vector specified through contexts in reg. More precisely, the effective parameter in context l is the element-wise product of a baseline parameter (scaled by scales in reg) and the l'th column of contexts. This enables the user to pre-specify different scales for each parameter, as well as different scales for the same parameter across contexts.
The default setup sets contexts to a matrix of ones. The following are examples of case-specific modifications for the mak ODE class (a sketch of a resulting contexts matrix follows the list):
- If reaction j does not take place in context l, set \(contexts_{j,l} = 0\).
- If reaction j has a \(50\%\) increase in rate in context l, set \(contexts_{j,l} = 1.5\).
- If reaction j has independent rates in contexts l and l', write that reaction twice in the mak object (with j' denoting its second appearance) and set \(contexts_{j,l'} = 0\) and \(contexts_{j',l} = 0\).
The loss function optimised in rodeo is:
$$RSS/(2(n - s)) + \lambda \sum_{parameter\ arguments} \lambda_{factor} \sum_{j=1}^p v_j \, pen(param_j)$$
where \(\lambda\) belongs to the lambda sequence in the opt object and v is penalty_factor in reg. Moreover, the residual sum of squares, RSS, is given as:
$$RSS = \sum_{i=1}^{n} ||(y(t_i) - x(t_i, {x_0}_l, context_l * param)) * \sqrt{w(t_i)}||_2^2$$
where param has been (internally) scaled with scales, and \(w(t_i)\) and \(y(t_i)\) refer to the i'th rows of weights and y in opt (y with its first column removed), respectively. The solution to the ODE system is the x()-function. The subscript 'l' refers to the context, i.e., the columns of contexts and x0 in the rodeo functions (x0 is the initial state of the system at the first time point after a decrease in the time vector). All products are Hadamard products.
The type of regularisation is chosen via reg_type in reg (the penalties listed below are written out in the sketch following the list):
l1: Least Absolute Shrinkage and Selection Operator (Lasso) penalty. The penalty is the absolute value: \(pen(param_j) = |param_j|\).
l2: Ridge penalty. The penalty is the square: \(pen(param_j) = param_j^2/2\).
elnet: Elastic net. A convex combination of the l1 and l2 penalties: \(pen(param_j) = (1-a)param_j^2/2 + a|param_j|\), \(0 \le a \le 1\). Note that if a = 0 or a = 1, the penalty automatically reduces to "l2" or "l1", respectively.
scad: Smoothly Clipped Absolute Deviation penalty: $$pen(param_j) = \int^{param_j}_{0}{\min(\max((a\lambda - |param|)/(\lambda(a-1)), 0), 1)\, dparam}, \quad a > 2.$$
mcp: Minimax Concave Penalty: $$pen(param_j) = \int^{param_j}_{0}{\max(1 - |param|/(\lambda a), 0)\, dparam}, \quad a > 1.$$
none: No penalty. Not recommended for large systems. This overwrites both user-supplied and automatically generated lambda sequences.
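The penalty functions can be written out directly; the following is a sketch under the definitions above (the helper names are illustrative, not part of the package, and SCAD/MCP are evaluated by numerically integrating their derivatives):

pen_l1    <- function(x) abs(x)
pen_l2    <- function(x) x^2 / 2
pen_elnet <- function(x, a = 0.5) (1 - a) * x^2 / 2 + a * abs(x)

# SCAD and MCP are defined through integrals of their derivatives
scad_deriv <- function(x, lambda, a = 3.7)
  pmin(pmax((a * lambda - abs(x)) / (lambda * (a - 1)), 0), 1)
mcp_deriv <- function(x, lambda, a = 3)
  pmax(1 - abs(x) / (lambda * a), 0)
pen_scad <- function(x, lambda = 1, a = 3.7)
  integrate(scad_deriv, 0, x, lambda = lambda, a = a)$value
pen_mcp <- function(x, lambda = 1, a = 3)
  integrate(mcp_deriv, 0, x, lambda = lambda, a = a)$value

pen_scad(0.5)   # 0.5: SCAD equals |x| when |x| <= lambda
pen_mcp(0.5)    # 0.5 - 0.5^2/(2 * 3), approx. 0.458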
A proximal-gradient type of optimisation is employed. The step length is denoted \(\tau\).
The flag screen (in reg) is passed on to the optimisation; it tells the optimisation to focus entirely on optimising a subset of the parameters, selected for having large gradients. At most step_screen (in reg) steps are taken before the screened subset is re-evaluated.
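For instance, a quick sketch enabling screening explicitly with a tighter re-screening interval:

# Force screening on and re-evaluate the screened subset every 10 steps
r <- reg("l1", screen = TRUE, step_screen = 10)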
The convergence criterion is \(\Delta \eta < 10^3 * \max(|(\Delta param \neq 0)| + |(\Delta x0 \neq 0)|, 1)\) AND \(\Delta loss / \Delta \eta < tol_l\), where $$\Delta \eta = ||\Delta param||/tol_{param} + ||\Delta x0|| / tol_{x_0}$$
The lambda sequence can either be fully customised through lambda in opt or automatically generated. In the former case, a monotonic lambda sequence is highly recommended. Throughout all optimisations, each optimisation step re-uses the old optimum when sweeping through the lambda sequence.
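A sketch of a fully customised, monotonically decreasing sequence (assuming opt() from the same package takes the data and a lambda argument as described; y is the hypothetical data from the earlier sketch):

# Log-equidistant, decreasing custom lambda sequence
o <- opt(y, lambda = 10^seq(1, -3, length.out = 25))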
If lambda is NULL, an automatically generated sequence is used. A maximal value of lambda (the smallest at which 0 is an optimum in the rate parameter) is calculated, and a log-equidistant sequence of length nlambda is generated, with the ratio between the smallest and largest lambda value being lambda.min.ratio. Note: when the opt object is passed to rodeo.ode, one may want to initialise the optimisation at a non-zero parameter and run the optimisation on the reversed lambda sequence. This is indicated by setting decrease.lambda = FALSE. If, however, the opt object is passed to aim, glmnet ultimately decides on the lambda sequence and may cut it short.
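Correspondingly, a sketch of an automatically generated sequence run in increasing order (nlambda, lambda.min.ratio and decrease.lambda are the arguments referenced above; their exact placement in opt() is assumed here):

# Auto-generated sequence of 50 values, swept from smallest to largest
o <- opt(y, nlambda = 50, lambda.min.ratio = 1e-4, decrease.lambda = FALSE)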
Two gradient evaluations are implemented: exact and approximate. The computational time of the exact gradient is generally longer than that of the approximate gradient, but its relative magnitude depends on the solver in the ode object passed to rodeo.
See also: aim, rodeo, rodeo.aim, rodeo.ode
# Use 'reg' object to specify regularisation when creating 'ode' objects
# Example: power law kinetics with SCAD penalty on rate parameter
A <- matrix(
  c(1, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 1, 0), ncol = 4, byrow = TRUE)
p <- plk(A, r = reg("scad"))
# Example: power law kinetics as above, with lower bound of -1
p <- plk(A, r = reg("scad", lower = -1))
# Example: rational mass action kinetics with
# MCP penalty on first parameter argument and l2 on second
B <- matrix(
  c(0, 0, 1, 0,
    1, 1, 0, 0,
    1, 0, 0, 1), ncol = 4, byrow = TRUE)
rmak <- ratmak(A, B, r1 = reg("mcp"), r2 = reg("l2"))
# Example: mass action kinetics with
# no penalty on rate parameter and l2 on initial state
m <- mak(A, B, r = reg("none"), rx0 = reg("l2"))