Learn R Programming

RcppAlgos (version 0.2.1)

comboGeneral: Generate all Combinations/Permutations of a Vector with/without Constraints

Description

Quickly generate all combinations or permutations of a vector, chosen \(m\) at a time, with or without constraints using Rcpp.

Usage

comboGeneral(v, m, repetition = FALSE, constraintFun = NULL,
                comparisonFun = NULL, limitConstraints = NULL,
                rowCap = NULL, keepResults = FALSE)

permuteGeneral(v, m = NULL, repetition = FALSE, constraintFun = NULL, comparisonFun = NULL, limitConstraints = NULL, rowCap = NULL, keepResults = FALSE, freqs = NULL)

Arguments

v

Source vector. If v is an integer, it will be converted to the sequence 1:v.

m

Number of elements to choose. If repetition = TRUE, m can exceed the length of v. For permuteGeneral, the default is NULL. This is to allow for permutations of multisets utilizing the freqs argument (See below).

repetition

Logical value indicating whether combinations should be with or without repetition. The default is FALSE.

constraintFun

Function to be applied to the elements of v that should be passed as a string (E.g. constraintFun = "sum"). The possible contraint functions are: "sum", "prod", "mean", "max", & "min". The default is NULL, meaning no function is applied.

comparisonFun

Comparison operator that will be used to compare limitConstraints with the result of contraintFun applied to v. It should be passed as a string (E.g. comparisonFun = "<="). The possible comparison operators are: "<", ">", "<=", ">=", "==". The default is NULL.

limitConstraints

This is the value that will be used for comparison. The default is NULL.

rowCap

The maximal number of expected results when a contraint is applied. Can also be used if you only need a specific number of combinations/permutations. This is useful when the total number of combinations/permutations without contraint is large and you expect/need a small number of combinations/permutations that meet the criteria. Using rowCap can drastically improve run time and avoid unnecessary crashes due to lack of memory. See examples below.

keepResults

A logical flag indicating if the result of constraintFun applied to v should be displayed; if TRUE, an additional column of results will be added to the resulting matrix. The default is FALSE.

freqs

Only valid for permuteGeneral. A vector of frequencies used for producing all permutations of a multiset of v. Each element of freqs represents how many times each element of the source vector, v, is repeated. It is analogous to the times argument in rep. The default value is NULL. Ignored if m is supplied.

Value

In general, a matrix is returned with each row containing a vector of length \(m\) or \(m + 1\) depending on the value of keepResults. For permuteGeneral, if m isn't supplied and freqs is given, a matrix is returned with each row containing a vector of length sum(freqs).

Details

Finding all combinations/permutations with constraints is optimized by organizing them in such a way that when constraintFun is applied, a monotonic sequence is produced. Combinations/permutations are added successively, until a particular combination exceeds the given constraint value for a given constraint/comparison function combo. After this point, we can safely skip several combinations knowing that they will exceed the given constraint value.

When there are any negative values in v and constraintFun = "prod", producing a monotonic set is non-trivial for the general case. As a result, performance will suffer as all combinations/permutations must be tested against the constraint criteria. Additionally, rowCap will not have its normal effectiveness (i.e. it will only limit the number of rows after producing all combinations/permutations).

References

Examples

Run this code
# NOT RUN {
system.time(comboGeneral(17,8))
system.time(permuteGeneral(13,6))

system.time(comboGeneral(13,10,repetition = TRUE))
system.time(permuteGeneral(factor(letters[1:9]),6,TRUE))

## permutations of the multiset :
## c(1,1,1,1,2,2,2,3,3,4,4,4,4,4)
system.time(permuteGeneral(4, freqs = c(4,3,2,5)))

## Generate some random data
set.seed(1009)
mySamp <- rnorm(75, 997, 23)

## How to use rowCap example:
## Researcher only needs 1000 7-tuples of mySamp
## such that the sum is greater than 7200.
system.time(comboGeneral(mySamp,7,FALSE,"sum",">",7200,1000))

## If you leave rowCap as NULL, it can take much longer
## (still fast enough most of the time) and in some cases
## crash your computer as the underlying code allocates
## enough space to account for every combination
## (e.g. In our example, there are choose(75, 7)
## = 1984829850 rows, 7 columns, with each cell occupying
## 8 bytes. This gives a total over 100 GB). 
## (i.e. choose(75, 7)*7*8/(2^30)).

## class of the source vector is preserved
class(comboGeneral(5,3)[1,]) == class(1:5)
class(comboGeneral(c(1,2:5),3)[1,]) == class(c(1,2:5))

# }
# NOT RUN {
## Using keepResults will add a column of results
permuteGeneral(-3,6,TRUE,"prod",keepResults = TRUE)
comboGeneral(-3,6,TRUE,"sum","==",-8,keepResults = TRUE)

## Find 13-tuple combinations of 1:25 such
## that the mean is less than 10
system.time(myComb <- comboGeneral(25,13,FALSE,"mean","<",10))

## Alternatively, you must generate all combinations and subsequently
## subset to obtain the combinations that meet the criteria
system.time(myComb2 <- combn(25,13))
system.time(myCols <- which(colMeans(myComb2) < 10))
system.time(myComb2 <- myComb2[,myCols])

## Any variation is much slower
system.time(myComb2 <- combn(25,13)[,combn(25,13,mean) < 10])

## Test equality with myComb above
all.equal(myComb,t(myComb2))

## Fun example... see stackoverflow:
## https://stackoverflow.com/q/22218640/4408538
system.time(permuteGeneral(seq(0L,100L,10L),8,TRUE,"sum","==",100,10^5))
# }

Run the code above in your browser using DataLab