The package creates an array with specified minimum resolution and optimized word length pattern, using mixed integer programming (MIP). This is a step in
generalized minimum aberration (GMA) according to Xu and Wu (2001), which is a way to minimize the confounding potential in a factorial design, as measured by the generalized word length pattern (GWLP). Reduction of short words has priority over reduction of long words, because it pertains to confounding of lower order interactions, which is assumed to be more severe than confounding of higher order interactions.
There is always one word of length zero. An array is said to have strength t (resolution R=t+1), if it has no words of lengths 1,...,t, but has words of length R=t+1; in that case, t-factor interactions are not confounded with the overall mean, (t-1)-factor interactions are not confounded with main effects, and so forth. Groemping and Xu (2014) provided an interpretation of the number of shortest words (i.e. words of length R) in terms of coefficients of determination of linear models with main effects model matrix columns on the LHS and full factorial t-factor models on the right-hand side; for example, in an array of strength 2 with three 3-level factors, the number of words of length 3 is the sum of the two R^2 values obtained from regressing the two main effects model matrix columns of one of the factors on a full model in the other two factors, provided main effect model matrix columns are coded orthogonally (to each other and the overall mean). GMA considers a design better than another one, if it has larger strength or resolution. In case of ties, a design is better if it has fewer shortest words; further ties are resolved by comparing words of lengths t+2, t+3, ...
Any array found by a function of this package will have the requested resolution (if not possible, an error will be thrown). If desired (default, but can be suppressed), optimization of the number of shortest words will be attempted. If kmax
is chosen larger than the resolution, optimization of longer words will also be attempted. Mixed integer optimization is very resource-intensive and often fails to provide a confirmed optimum (see also below). Choosing kmax
larger than the resolution is therefore advisable for very small problems only; in most cases, one should attempt an optimization of the number of shortest words only, or even suppress that by find.only=TRUE
. Only after this has been achieved, possibly with several sequential attempts, a subsequent attempt to improve the number of longer words should be undertaken.
Functions gurobi_MIParray
and mosek_MIParray
create an array from scratch (start=NULL
and forced=NULL
), by trying to improve a starting array specified with the argument start
, or by restricting a portion of the array to correspond to a pre-existing array with the argument forced
. If no starting array is given, the functions initially obtain one by solving a linear optimization problem for obtaining a design with the requested resolution. Subsequently, the number of shortest words of the starting array is optimized, followed by the numbers of words of length up to kmax
.
Where functions gurobi_MIParray
and mosek_MIParray
do not easily find an optimal array, it may be worthwhile to consider using functions gurobi_MIPsearch
and mosek_MIPsearch
, which can search over different orderings of the factor levels; these have been provided because a brief search for a fortunate level orderings may be successful where a very long search for an unfortunate level ordering fails.
The algorithm implemented in the package is explained in Groemping and Fontana (2019); it is a modification of Fontana (2017). Modifications include enforcing the requested resolution via a linear optimization step (much faster than sequentially optimizing all those word lengths until they are zeroes), and using a result on coding invariance from Groemping (2018) for a more parsimonious formulation of constraints. Mixed integer programming can use a lot of time and resources; in particular, the confirmation of the optimality of a solution that has been found can take prohibitively long, even if the optimal solution itself has been found fast (which is also not necessarily so). Functions gurobi_MIPcontinue
and mosek_MIPcontinue
can be used to continue optimization for larger problems,
where optimization was previously aborted, e.g. due to a time limit (however, a lot of effort is lost and has to be repeated when continuing a previous attempt). Note that it is possible to continue an optimization effort started with Gurobi using Mosek and vice versa, because the functions gurobi_MIPcontinue
and mosek_MIPcontinue
can convert problems using the internal functions mosek2gurobi
and gurobi2mosek
. Groemping (2020) explains the package itself.
The solver functions in the package use the commercial solvers Mosek and/or Gurobi, which provide free academic licenses. These solvers and their corresponding vendor-provided R packages have to be installed for using the package's solver functions. For users who do not have access to these but to other solvers, the package provides export functions to mps files that can be read by other solvers, for example by IBM CPLEX (write_MPSMIQP
, write_MPSILPlist
).