colBinnedSmoothing.matrix: Binned smoothing of a matrix column by column

Description

Binned smoothing of a matrix column by column.

Usage

# S3 method for matrix
colBinnedSmoothing(Y, x=seq_len(nrow(Y)), w=NULL, xOut=NULL, xOutRange=NULL,
  from=min(x, na.rm = TRUE), to=max(x, na.rm = TRUE), by=NULL, length.out=length(x),
  na.rm=TRUE, FUN="median", ..., verbose=FALSE)

Arguments

Y

A numeric JxI matrix (or a vector of length J).

x

An (optional) numeric vector specifying the positions of the J entries. The default is to assume uniformly distributed positions.

w

An optional numeric vector of prior weights for each of the J entries.

xOut

Optional numeric vector of K bin center locations.

xOutRange

Optional Kx2 matrix specifying the boundary locations for K bins, where each row represents a bin \([x0,x1)\). If not specified, the boundaries are set to the midpoints between neighboring bin centers, such that the bins have maximum lengths without overlapping. Conversely, if xOut is not specified, then xOut is set to the midpoints of the xOutRange boundaries.

from, to, by, length.out

If neither xOut nor xOutRange is specified, then xOut is generated uniformly from these arguments using the seq() function. Arguments from and to specify the center locations of the first and the last bin, and by specifies the distance between neighboring center locations. Argument length.out can be used as an alternative to by, in which case it specifies the total number of bins instead.

FUN

A function, or the name of one, used to summarize the data points within each bin. The default is "median".

na.rm

If TRUE, missing values are excluded, otherwise not.

...

Not used.

verbose

See Verbose.
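
The relationship between bin centers (xOut), bin boundaries (xOutRange) and the from, to, by arguments can be sketched in base R. This is only an illustration and not the package's implementation: the helper name centersToRanges() is hypothetical, and the placement of the two outermost boundaries (mirrored around the first and the last center) is an assumption that the argument descriptions do not specify.

```r
# Hypothetical helper: derive half-open [x0,x1) boundaries from bin centers.
centersToRanges <- function(xOut) {
  K <- length(xOut)
  stopifnot(K >= 2)
  # Interior boundaries are the midpoints between neighboring centers
  mids <- (xOut[-1] + xOut[-K]) / 2
  # ASSUMPTION: the outermost boundaries mirror the nearest interior
  # boundary around the first and the last center.
  x0 <- c(2 * xOut[1] - mids[1], mids)
  x1 <- c(mids, 2 * xOut[K] - mids[K - 1])
  cbind(x0 = x0, x1 = x1)
}

# Centers generated uniformly with seq(), as with from=2, to=100, by=3
xOut <- seq(from = 2, to = 100, by = 3)
xOutRange <- centersToRanges(xOut)

# Conversely, the centers are the midpoints of the boundaries
xOutBack <- rowMeans(xOutRange)
stopifnot(isTRUE(all.equal(xOutBack, xOut)))
```

Under these assumptions, neighboring bins share a boundary, so no position can fall into two bins and no gap is left between them.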

Value

Returns a numeric KxI matrix (or a vector of length K) where K is the total number of bins. The following attributes are also returned:

xOut

The center locations of each bin.

xOutRange

The bin boundaries.

count

The number of data points within each bin (based solely on argument x).

binWidth

The average bin width.
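
To make the count attribute and the half-open bins \([x0,x1)\) concrete, the following base-R sketch counts the positions falling into each bin and takes a per-bin median of each column. It is a minimal illustration under stated assumptions, not the package's code: binnedMedian() is a hypothetical name, prior weights are ignored, and binWidth is taken to be the plain mean of x1 - x0.

```r
# Hypothetical base-R sketch of binned median smoothing, column by column.
binnedMedian <- function(Y, x, xOutRange) {
  K <- nrow(xOutRange)
  Ys <- matrix(NA_real_, nrow = K, ncol = ncol(Y))
  count <- integer(K)
  for (kk in seq_len(K)) {
    # Positions inside the half-open bin [x0,x1); depends on x only
    keep <- which(x >= xOutRange[kk, 1] & x < xOutRange[kk, 2])
    count[kk] <- length(keep)
    if (length(keep) > 0) {
      Ys[kk, ] <- apply(Y[keep, , drop = FALSE], 2, median, na.rm = TRUE)
    }
  }
  attr(Ys, "count") <- count
  # ASSUMPTION: the average bin width is the mean of (x1 - x0)
  attr(Ys, "binWidth") <- mean(xOutRange[, 2] - xOutRange[, 1])
  Ys
}

set.seed(1)
x <- 1:10
Y <- matrix(rnorm(20), ncol = 2)
xOutRange <- cbind(c(1, 4, 7), c(4, 7, 11))
Ys <- binnedMedian(Y, x = x, xOutRange = xOutRange)
```

Here the three bins hold 3, 3 and 4 positions respectively, and bins without any positions keep their NA rows.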

Details

Note that all zero-length bins \([x0,x1)\) will result in an NA value, because such bins contain no data points. This also means that colBinnedSmoothing(Y, x=x, xOut=xOut), where xOut contains duplicated values, will result in some zero-length bins and hence NA values.
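
The reason a zero-length bin is always empty follows from the half-open interval alone, as this small base-R sketch shows. It does not call the package; inBin() is a hypothetical helper introduced only for the illustration.

```r
x <- 1:10
y <- rnorm(10)

# Hypothetical helper: membership in the half-open bin [x0,x1)
inBin <- function(x, x0, x1) x >= x0 & x < x1

# A regular bin [4,7) holds the positions 4, 5 and 6
nRegular <- sum(inBin(x, 4, 7))

# A zero-length bin [5,5) holds nothing, even though x contains 5,
# because no value satisfies both x >= 5 and x < 5
idx <- inBin(x, 5, 5)
nEmpty <- sum(idx)

# Summarizing an empty set of values yields NA
ysEmpty <- median(y[idx])
```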

Examples

# NOT RUN {
# Number of tracks
I <- 4

# Number of data points per track
J <- 100

# Simulate data with a gain in track 2 and 3
x <- 1:J
Y <- matrix(rnorm(I*J, sd=1/2), ncol=I)
Y[30:50,2:3] <- Y[30:50,2:3] + 3

# Uniformly distributed equal-sized bins
Ys3 <- colBinnedSmoothing(Y, x=x, from=2, by=3)
Ys5 <- colBinnedSmoothing(Y, x=x, from=3, by=5)

# Custom bins
xOutRange <- t(matrix(c(
  1, 11,
 11, 31,
 31, 41,
 41, 51,
 51, 81,
 81, 91,
 91,101
), nrow=2))
YsC <- colBinnedSmoothing(Y, x=x, xOutRange=xOutRange)

# Custom bins specified by center locations with
# maximized width relative to the neighboring bins.
xOut <- c(6, 21, 36, 46, 66, 86, 96)
YsD <- colBinnedSmoothing(Y, x=x, xOut=xOut)

xlim <- range(x)
ylim <- c(-3,5)
layout(matrix(1:I, ncol=1))
par(mar=c(3,3,1,1)+0.1, pch=19)
for (ii in 1:I) {
  plot(NA, xlim=xlim, ylim=ylim)
  points(x, Y[,ii], col="#999999")

  xOut <- attr(Ys3, "xOut")
  lines(xOut, Ys3[,ii], col=2)
  points(xOut, Ys3[,ii], col=2)

  xOut <- attr(Ys5, "xOut")
  lines(xOut, Ys5[,ii], col=3)
  points(xOut, Ys5[,ii], col=3)

  xOut <- attr(YsC, "xOut")
  lines(xOut, YsC[,ii], col=4)
  points(xOut, YsC[,ii], col=4, pch=15)

  xOut <- attr(YsD, "xOut")
  lines(xOut, YsD[,ii], col=5)
  points(xOut, YsD[,ii], col=5, pch=15)

  if (ii == 1) {
    legend("topright", pch=c(19,19,15,15), col=c(2,3,4,5),
           c("by=3", "by=5", "Custom #1", "Custom #2"), horiz=TRUE, bty="n")
  }
}


# Sanity checks
xOut <- x
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Y))
stopifnot(all(attr(YsT, "counts") == 1))

xOut <- attr(YsD, "xOut")
YsE <- colBinnedSmoothing(YsD, x=xOut, xOut=xOut)
stopifnot(all(YsE == YsD))
stopifnot(all(attr(YsE, "xOutRange") == attr(YsD, "xOutRange")))
stopifnot(all(attr(YsE, "counts") == 1))

# Scramble ordering of loci
idxs <- sample(x)
x2 <- x[idxs]
Y2 <- Y[idxs,,drop=FALSE]
Y2s <- colBinnedSmoothing(Y2, x=x2, xOut=x2)
stopifnot(all(attr(Y2s, "xOut") == x2))
stopifnot(all(attr(Y2s, "counts") == 1))
stopifnot(all(Y2s == Y2))

xOut <- x[seq(from=2, to=J, by=3)]
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Ys3))
stopifnot(all(attr(YsT, "counts") == 3))

xOut <- x[seq(from=3, to=J, by=5)]
YsT <- colBinnedSmoothing(Y, x=x, xOut=xOut)
stopifnot(all(YsT == Ys5))
stopifnot(all(attr(YsT, "counts") == 5))

# }