generalized.word.length: Functions for calculating the generalized word length pattern, projection frequencies or optimizing column selection within an array

Description

Functions length2, length3, length4 and length5 calculate the numbers of generalized words of lengths 2, 3, 4, and 5 respectively; lengths calculates them all. Functions P3.3 and P4.4 calculate projection frequencies. Functions oa.min3, oa.min34, oa.maxGR, oa.minRelProjAberr, oa.max3 and oa.max4 determine column allocations with minimum or maximum aliasing. Function nchoosek is an auxiliary function for calculating all subsets without replacement.

Usage

length2(design, with.blocks = FALSE, J = FALSE)
length3(design, with.blocks = FALSE, J = FALSE, rela = FALSE)
length4(design, with.blocks = FALSE, separate = FALSE, J = FALSE, rela = FALSE)
length5(design, with.blocks = FALSE, J = FALSE, rela = FALSE)
lengths(design, with.blocks = FALSE, J = FALSE)
contr.XuWu(n, contrasts = TRUE, sparse = FALSE)
oa.min3(ID, nlevels, all, rela = FALSE, variants = NULL, crit = "total")
oa.min34(ID, nlevels, variants = NULL, min3=NULL, all = FALSE, rela = FALSE)
oa.max3(ID, nlevels, rela = FALSE)
oa.max4(ID, nlevels, rela = FALSE)
oa.maxGR(ID, nlevels, variants = NULL)
oa.minRelProjAberr(ID, nlevels, maxGR = NULL)
P3.3(ID, digits = 4, rela=FALSE)
P4.4(ID, digits = 4, rela=FALSE)
GR(ID, digits=2)
nchoosek(n, k)

Arguments

design

an experimental design. This can either be a matrix or a data frame in which all columns are experimental factors, or a special data frame of class design, which may also

with.blocks

a logical, indicating whether or not an existing block factor is to be included into word counting. This option is ignored if design is not of class design. Per default, an existing block factor is ignored. For de

J

a logical, indicating whether or not a vector of contributions from individual degrees of freedom is produced. If TRUE, the entries of the vector are absolute normalized J-characteristics from all 3- or 4-factor products respe

rela

logical indicating whether the word lengths are to be calculated in absolute terms (as usual) or relative to the maximum possible word length in case of complete aliasing; if TRUE, each word length is divided by the worst

separate

a logical, indicating whether or not separate (and overlapping) sums are requested for each two-factor interaction; the idea is to be able to identify clear two-factor interactions; this may be useful for a design for which l

n

integer; for function contr.XuWu: number of levels of the factor for which to determine contrasts; for function nchoosek: number of elements to choose from

contrasts

output a contrast matrix ?

sparse

return a sparse contrast matrix ?

ID

an orthogonal array, either a matrix or a data frame; need not be of class oa; can also be a character string containing the name of an array listed in data frame oacat

nlevels

a vector of requested numbers of levels (one entry for each factor)

all

logical; if FALSE, the search stops whenever a design with 0 generalized words of highest requested length is found; otherwise, the function always determines all best designs

variants

matrix of integer column number entries; each row gives the column numbers for one variant to be compared; the matrix columns must correspond to the entries of the nlevels option

crit

character string that requests "total" or "worst" triple optimization; "total" corresponds to the previous version that optimizes the overall number of length 3 words; "worst" minimizes

min3

the outcome of a call to oa.min3 with crit="total", which is to be used for a call to oa.min34

maxGR

the outcome of a call to oa.min3 with crit="worst" and rela=TRUE (or the outcome of a call to oa.maxGR), which is to be used for a call to oa.minRelProjAberr

digits

number of decimal places to which the result is rounded

k

number of elements to be chosen, integer from 0 to n

Value

The functions length3 and length4 (currently) return the number of words by default. If option J=TRUE is set, their value is a named vector of normalized absolute J-characteristics (cf. Ai and Zhang 2004) for the respective length, based on normalized Helmert contrasts, with names indicating factor indices. (For blocked designs with the with.blocks=TRUE option, the block factor has index 1.)

Functions P3.3 and P4.4 return a matrix with the numbers of generalized words of length 3 (4) that occur for 3 (4) factor projections (column length3 or length4, respectively) and their frequencies. If option rela=TRUE is set, the numbers of generalized words are normalized by dividing them by the number of words that corresponds to perfect aliasing among the factors for each projection. For P4.4, the relative version is only reasonable for resolution IV designs. The matrix of projection frequencies has the overall number of generalized words of the respective length as an attribute; in the case rela=TRUE it also has the generalized resolution and the overall absolute number of generalized words of the respective length as attributes.

The functions oa.min3, oa.min34, oa.max3 and oa.max4 (currently) return a list with elements GWP (the number(s) of generalized words of length 3 (lengths 3 and 4)), column.variants (the columns to be used for design creation, ordered with ascending nlevels) and complete (logical indicating whether or not the list is guaranteed to be complete). For oa.min3, the name of the first element is either GWP3 (crit="total"), worst.a3 (rela=FALSE, crit="worst") or GR (rela=TRUE, crit="worst"). The function oa.maxGR returns a list with elements GR, column.variants and complete, and the function oa.minRelProjAberr returns a list with elements GR, GWP, column.variants and complete.

Function GR returns a list with elements GR (the generalized resolution of the array, a not necessarily integer number between 3 and 5) and RPFT (the relative projection frequency table). GR values smaller than 5 are exact, while the number 5 stands for at least 5. The resolution itself is the integer portion of GR. The RPFT element is NULL for GR=5.

The function nchoosek returns a matrix with k rows and choose(n, k) columns, each of which contains a different subset of k elements.
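The shape of such a subset matrix can be illustrated with base R's combn, which also returns one subset per column. This is only a sketch of the output layout; whether the column order matches that of nchoosek is an assumption that has not been checked here.

```r
# All 3-element subsets of 1:5, one subset per column (base R, not DoE.base)
subsets <- combn(5, 3)
dim(subsets)      # 3 rows and choose(5, 3) = 10 columns
subsets[, 1]      # first subset: 1 2 3
```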

Warning

The functions have been checked on the types of designs for which they are intended (especially orthogonal arrays produced with oa.design) and on 2-level fractional factorial designs produced with package FrF2. They may produce meaningless results for some other types of designs. Furthermore, all optimizing functions work for relatively small problems only and will break down for larger problems because of storage space requirements (size depends on the number of possible selections among columns; for example, selecting 9 out of 31 columns is not doable on my computer because of storage space issues, while selecting 29 out of 31 columns is doable within the available storage space). Programming of a less storage-intensive algorithm is underway.
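The difference between the two selection problems mentioned above can be made concrete by counting the candidate column sets. The byte estimate in the following sketch rests on a labeled assumption (one 4-byte integer per stored column index, all candidates held at once); the actual storage scheme of the optimizing functions is not documented here.

```r
# Number of ways to select columns from a 31-column array
n_sel_9  <- choose(31, 9)     # 20160075 candidate column sets
n_sel_29 <- choose(31, 29)    # 465 candidate column sets

# Assumption for illustration only: every candidate set is stored as
# 4-byte integers, one per selected column
bytes_9  <- 9  * n_sel_9  * 4
bytes_29 <- 29 * n_sel_29 * 4
round(c(nine = bytes_9, twentynine = bytes_29) / 1024^2, 2)   # size in MiB
```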

Details

These functions work for factors only and are not intended for quantitative variables. Nevertheless, it is possible in some situations to apply them to class design plans that contain quantitative variables.

The generalized word length pattern as introduced in Xu and Wu (2001) is the basis for the functions described here. Consult their article or Groemping (2011) for rigorous mathematical detail of this concept. A brief explanation is given here before the details of the functions.

Assume a design with qualitative factors, for which all factors are coded with specially normalized Helmert contrasts (which orthogonalizes the model matrix columns to the intercept column). Function contr.XuWu provides such contrasts, normalized according to the prescription by Xu and Wu (2001), which implies that all model matrix columns have Euclidean norm sqrt(n), provided that each individual factor is balanced. Then, the number of generalized words of length 3 is determined by taking the sum of squares of the column sums of all three-factor interaction columns (from a model matrix with all three-factor interactions included), divided by the squared number of runs. Likewise, the number of generalized words of length 4 is determined by taking the sum of squares of the column sums of all four-factor interaction columns (from a model matrix with all four-factor interactions included), divided by the squared number of runs, and so on. These numbers are plausible because they reproduce the well-known word length pattern for regular fractional factorial 2-level designs: the number of words of length 3 is exactly zero for resolution IV designs, and the numbers of words of lengths 3 and 4 are both exactly zero for resolution V designs.

Function lengths calculates the generalized word length pattern (numbers of generalized words of lengths 2, 3, 4 and 5 respectively); functions length2, length3, length4 and length5 calculate each length separately.
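The sum-of-squares definition above can be sketched in base R. The helpers norm_helmert and gen_words are hypothetical names introduced for this illustration; they are not part of the package and are not the implementation used by length3 or length4. The sketch assumes a small design with balanced factors.

```r
# Hypothetical helper: Helmert contrasts rescaled so that every contrast
# column has sum of squares equal to the number of levels k
norm_helmert <- function(k) {
  H <- contr.helmert(k)
  sweep(H, 2, sqrt(colSums(H^2) / k), "/")
}

# Hypothetical helper: number of generalized words of length j
gen_words <- function(design, j) {
  fac <- lapply(as.data.frame(design), factor)
  n <- length(fac[[1]])
  # coded contrast columns per factor (n rows, number of levels - 1 columns)
  X <- lapply(fac, function(f)
    norm_helmert(nlevels(f))[as.integer(f), , drop = FALSE])
  total <- 0
  for (s in combn(length(X), j, simplify = FALSE)) {
    # all products of one contrast column from each factor in the set s
    prod_cols <- matrix(1, n, 1)
    for (i in s) {
      prod_cols <- do.call(cbind, lapply(seq_len(ncol(X[[i]])),
                                         function(col) prod_cols * X[[i]][, col]))
    }
    # squared column sums of the interaction columns
    total <- total + sum(colSums(prod_cols)^2)
  }
  total / n^2
}

# Regular half fraction of a 2^3 design with C = AB
half <- expand.grid(A = c(-1, 1), B = c(-1, 1))
half$C <- half$A * half$B
gen_words(half, 3)   # one word of length 3 (I = ABC)
gen_words(half, 2)   # columns are orthogonal, so no words of length 2
```

Looping over all factor subsets and all contrast products is easy to follow, but the work grows quickly with the number of factors and levels, so this is only practical for small designs.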
The most important length functions are length3 and length4; length2 should yield zero for all orthogonal arrays, and length5 will in most cases not be of interest either. The number of shortest possible words, e.g. length 4 for resolution IV designs, can be calculated in relative terms, if interest is in the extent of complete aliasing (cf. Groemping 2011). The length functions are fast for small numbers of factors but can take a long time if the number of factors is large. Note that an orthogonal array based design is called resolution III if the result of function length3 is non-zero, resolution IV if the result of function length3 is zero and the result of function length4 is non-zero, and resolution V+ (at least V) if the results of both length3 and length4 are zero.

Functions P3.3 and P4.4 calculate the pattern of generalized words of length 3 for all three-factor projections of an array and of generalized words of length 4 for all four-factor projections of an array, respectively. Calculation of such projection frequencies has been proposed by Xu, Cheng and Wu (2004). The relative version for P3.3 and P4.4 has been introduced by Groemping (2011) for better assessment of the projective properties of a design. It divides each absolute number of words by the maximum possible number in case one factor is completely determined by the combinations of the other two factors. For P4.4, the relative version is valid only for resolution IV designs.

The functions can be used in selecting among different possibilities to accommodate factors within a given orthogonal array (cf. examples). For general purposes, it is recommended to use designs with as small an outcome of length3 as possible (either absolute or relative, either total or worst case), and within the same result for length3 (particularly 0), with as small a result for length4 as possible. This corresponds to (a step towards) generalized minimum aberration.
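The idea of a projection frequency table for three-factor projections can be sketched in base R for 2-level designs coded as +1/-1. The 12-run Plackett-Burman matrix built below stands in for an array such as L12.2.11 (an assumption for illustration), and the code is not the implementation of P3.3.

```r
# 12-run Plackett-Burman design: cyclic shifts of a generator row
# plus one row consisting of -1 only
gen <- c(1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1)
pb12 <- rbind(t(sapply(0:10, function(s) gen[((0:10 - s) %% 11) + 1])),
              rep(-1, 11))

# For +1/-1 coded 2-level factors, the number of length-3 words of a triple is
# (sum of the elementwise product of its three columns / number of runs)^2
triples <- combn(ncol(pb12), 3)
a3 <- apply(triples, 2, function(tr)
  (sum(pb12[, tr[1]] * pb12[, tr[2]] * pb12[, tr[3]]) / nrow(pb12))^2)

# Frequency table of the rounded values over all three-factor projections
table(round(a3, 4))
```

For three 2-level factors the number of words under complete aliasing is 1, so in this special case the relative values coincide with the absolute ones.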
It can also be useful to consider the patterns, particularly P3.3.

Function GR calculates the generalized resolution according to Deng and Tang (1999) for 2-level designs or a generalization thereof according to Groemping (2011) for general orthogonal arrays. It returns a value between 3 and 5, where the numeric value 5 stands for at least 5. Roughly, generalized resolution measures the closeness of a design to the next higher resolution (worst-case based, e.g. one completely aliased triple of factors implies resolution 3).

Functions oa.min3 and oa.min34 optimize column allocation for a given array in which a certain factor combination must be accommodated. They return designs that allocate columns such that the number of generalized words of length 3 is minimized (oa.min3; with a choice between minimizing the total number or minimizing the number for the worst-case triple of factors), or the number of generalized words of length 4 is minimized within all designs for which the number of generalized words of length 3 is minimal (oa.min34, total number only). Option rela allows switching from the default consideration of absolute numbers of words to relative numbers of words according to Groemping (2011).

Function oa.maxGR maximizes generalized resolution according to Deng and Tang (1999) as generalized by Groemping (2011). Function oa.minRelProjAberr conducts minimum relative projection aberration according to Groemping (2011), with the four steps (a) maximize GR, (b) minimize rA3 or rA4 (depending on resolution), (c) optimize RPFT (as obtained by P3.3 or P4.4) and (d) minimize absolute words of lengths 4 etc. (only carried through to length 4 by the function). Note that function oa.maxGR can be replaced by the much faster function oa.min3 with options crit="worst" and rela=TRUE, whenever GR<=4. Only for designs with GR > 4, the extra effort with function oa.maxGR is useful.
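For 2-level designs coded as +1/-1, the worst-case idea behind generalized resolution can be sketched as follows. The helper gr_2level is a hypothetical name for this illustration and not the package function GR; it uses the form r + 1 - max|J_r|/n from Deng and Tang (1999), assumes a design of resolution at least III, and does not cover the generalization to mixed-level arrays.

```r
# Hypothetical helper: generalized resolution of a +1/-1 coded 2-level design.
# r is the smallest word length in {3, 4} with a non-zero J-characteristic;
# the value 5 stands for "at least 5".
gr_2level <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  for (r in 3:4) {
    if (ncol(x) < r) break
    # absolute J-characteristics of all r-factor sets
    J <- apply(combn(ncol(x), r), 2, function(s)
      abs(sum(apply(x[, s, drop = FALSE], 1, prod))))
    if (max(J) > 1e-8) return(r + 1 - max(J) / n)
  }
  5
}

full <- as.matrix(expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1)))
res3 <- cbind(full[1:4, 1:2], C = full[1:4, 1] * full[1:4, 2])   # C = AB
res4 <- cbind(full, D = full[, 1] * full[, 2] * full[, 3])       # D = ABC
gr_2level(res3)   # completely aliased triple: 3
gr_2level(res4)   # completely aliased quadruple: 4
gr_2level(full)   # no words of length 3 or 4: 5
```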
Functions oa.max3 and oa.max4 do the opposite: they search for the worst design in terms of the number of generalized words of lengths 3 or 4. Such a design can e.g. be used for demonstrating the benefit of optimizing the number of words, or for exemplifying theoretical properties. Occasionally, it may also be useful, if there are severe restrictions on possible combinations. (oa.max4 should only be used for resolution IV designs.)

References

Ai, M.-Y. and Zhang, R.-C. (2004). Projection justification of generalized minimum aberration for asymmetrical fractional factorial designs. Metrika 60, 279--285.

Groemping, U. (2011). Relative projection frequency tables for orthogonal arrays. Report 1/2011, Reports in Mathematics, Physics and Chemistry http://www1.beuth-hochschule.de/FB_II/reports/welcome.htm, Department II, Beuth University of Applied Sciences, Berlin.

Xu, H.-Q. and Wu, C.F.J. (2001). Generalized minimum aberration for asymmetrical fractional factorial designs. Annals of Statistics 29, 1066--1077.

Xu, H., Cheng, S., and Wu, C.F.J. (2004). Optimal projective three-level designs for factor screening and interaction detection. Technometrics 46, 280--292.

Examples

   ## the examples assume that package DoE.base is attached
   library(DoE.base)

   ## check a small design
   oa12 <- oa.design(nlevels=c(2,2,6))
   length3(oa12)
   ## length4 is of course 0, because there are only 3 factors
   P3.3(oa12)

   ## the results need not be an integer
   oa12 <- oa.design(L12.2.11,columns=1:6)
   length3(oa12)
   length4(oa12)
   P3.3(oa12)  ## all projections have the same pattern
             ## which is known to be true for the complete L12.2.11 as well
   P3.3(L18)   ## this is the pattern of the Taguchi L18
             ## also published by Schoen 2009
   P3.3(L18[,-2])  ## without the 2nd column (= the 1st 3-level column)
   P3.3(L18[,-2], rela=TRUE)  ## relative pattern, divided by theoretical upper 
                              ## bound for each 3-factor projection
   
   ## choosing among different assignment possibilities
   ## for two 2-level factors and one 3- and 4-level factor each
   show.oas(nlevels=c(2,2,3,4))
   ## default allocation: first two columns for the 2-level factors
   oa24.bad <- oa.design(L24.2.13.3.1.4.1, columns=c(1,2,14,15))
   length3(oa24.bad)
   ## much better: columns 3 and 10
   oa24.good <- oa.design(L24.2.13.3.1.4.1, columns=c(3,10,14,15))
   length3(oa24.good)
   length4(oa24.good)  ## there are several variants, 
                       ## which produce the same pattern for lengths 3 and 4
                       
   ## the difference matters
   plot(oa24.bad, select=c(2,3,4))
   plot(oa24.good, select=c(2,3,4))
   
   ## generalized resolution differs as well (resolution is III in both cases)
   GR(oa24.bad)
   GR(oa24.good)

   ## choices for columns can be explored with functions oa.min3, oa.min34 or oa.max3
   oa.min3(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4))
   oa.min34(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4))
   ## columns for designs with maximum generalized resolution 
   ##    (can take very long, if all designs have worst-case aliasing) 
      ## then optimize these for overall relative number of words of length 3
      ##     and in addition absolute number of words of length 4 
   mGR <- oa.maxGR(L18, c(2,3,3,3,3,3,3))
   oa.minRelProjAberr(L18, c(2,3,3,3,3,3,3), maxGR=mGR)
   
   oa.max3(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4))    ## this is not for finding 
                                                    ## a good design!!!
                                                    
   ## play with selection of optimum design
   ## somewhat experimental at present
   oa.min3(L32.2.10.4.7, nlevels=c(2,2,2,4,4,4,4,4))
   best3 <- oa.min3(L32.2.10.4.7, nlevels=c(2,2,2,4,4,4,4,4), rela=TRUE)
   oa.min34(L32.2.10.4.7, nlevels=c(2,2,2,4,4,4,4,4))
   oa.min34(L32.2.10.4.7, nlevels=c(2,2,2,4,4,4,4,4), min3=best3)
   
   ## generalized resolution according to Groemping 2011, manually
   best3GR <- oa.min3(L36.2.11.3.12, c(rep(2,3),rep(3,3)), rela=TRUE, crit="worst")
      ## optimum GR is 3.59
   ## subsequent optimization w.r.t. rA3
   best3reltot.GR <- oa.min3(L36.2.11.3.12, c(rep(2,3),rep(3,3)), rela=TRUE, 
           variants=best3GR$column.variants)
      ## optimum rA3 is 0.5069
   ## (note: different from first optimizing rA3 (0.3611) and then GR (3.5))
   ## remaining nine designs: optimize RPFTs
   L36 <- oa.design(L36.2.11.3.12, randomize=FALSE)
   lapply(1:9, function(obj) P3.3(L36[,best3reltot.GR$column.variants[obj,]]))
      ## all identical
   oa.min34(L36, nlevels=c(rep(2,3),rep(3,3)), min3=best3reltot.GR)
      ## still all identical

   ## select among column variants with projection frequencies 
   ## here, all variants have identical projection frequencies
   ## for larger problems, this may sometimes be relevant
   variants <- oa.min34(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4))
   for (i in 1:nrow(variants$column.variants)){
      cat("variant ", i, "\n")
      print(P3.3(oa.design(L24.2.13.3.1.4.1, columns=variants$column.variants[i,])))
      }
   
   ## automatic optimization is possible, but can be time-consuming
   ## (cf. help for oa.design)
   plan <- oa.design(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4), columns="min3")
   length3(plan)
   length4(plan)
   plan <- oa.design(L24.2.13.3.1.4.1, nlevels=c(2,2,3,4), columns="min34")
   length3(plan)
   length4(plan)

   ## blocked design from FrF2
   ## the design is of resolution IV
   ## there is one (generalized) 4-letter word that does not involve the block factor
   ## there are four more 4-letter words involving the block factor
   ## all this and more can also be learnt from design.info(plan)
   require(FrF2)
   plan <- FrF2(32,6,blocks=4)
   length3(plan)
   length3(plan, with.blocks=TRUE)
   length4(plan)
   length4(plan, with.blocks=TRUE)
   design.info(plan)
