Learn R Programming

sfsmisc (version 1.0-15)

roundfixS: Round to Integer Keeping the Sum Fixed

Description

Given a real numbers $y_i$ with the particular property that $\sum_i y_i$ is integer, find integer numbers $x_i$ which are close to $y_i$ ($\left|x_i - y_i\right| < 1 \forall i$), and have identical marginal sum, sum(x) == sum(y).

Usage

roundfixS(x, method = c("offset-round", "round+fix", "1greedy"))

Arguments

x
a numeric vector which must sum to an integer
method
character string specifying the algorithm to be used.

Value

  • a numeric vector, say r, of the same length as x, but with integer values and fulfulling sum(r) == sum(x).

Details

Without hindsight, it may be surprising that all three methods give identical results (in all situations and simulations considered), notably that the idea of mass shifting employed in the iterative "1greedy" algorithm seems equivalent to the much simpler idea used in "offset-round".

I am pretty sure that these algorithms solve the $L_p$ optimization problem, $\min_x \left\|y - x\right\|_p$, typically for all $p \in [1,\infty]$ simultaneously, but have not bothered to find a formal proof.

See Also

round etc

Examples

Run this code
## trivial example
kk <- c(0,1,7)
stopifnot(identical(kk, roundfixS(kk))) # failed at some point

x <- c(-1.4, -1, 0.244, 0.493, 1.222, 1.222, 2, 2, 2.2, 2.444, 3.625, 3.95)
sum(x) # an integer
r <- roundfixS(x)
stopifnot(all.equal(sum(r), sum(x)))
m <- cbind(x=x, `r2i(x)` = r, resid = x - r, `|res|` = abs(x-r))
rbind(m, c(colSums(m[,1:2]), 0, sum(abs(m[,"|res|"]))))

chk <- function(y) {
  cat("sum(y) =", format(S <- sum(y)),"")
  r2  <- roundfixS(y, method="offset")
  r2. <- roundfixS(y, method="round")
  r2_ <- roundfixS(y, method="1g")
  stopifnot(all.equal(sum(r2 ), S),
            all.equal(sum(r2.), S),
            all.equal(sum(r2_), S))
  all(r2 == r2. && r2. == r2_) # TRUE if all give the same result
}

makeIntSum <- function(y) {
   n <- length(y)
   y[n] <- ceiling(y[n]) - (sum(y[-n]) %% 1)
   y
}
set.seed(11)
y <- makeIntSum(rnorm(100))
chk(y)

## nastier example:
set.seed(7)
y <- makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2),
                                   runif(25, min=.5, max=.9)))
chk(y)

for(i in 1:1000)
    stopifnot(chk(makeIntSum(rpois(100, 10) +
                             c(runif(75, min= 0, max=.2),
                               runif(25, min=.5, max=.9)))))

Run the code above in your browser using DataLab