SeqArray (version 1.8.0)

seqSlidingWindow: Apply functions via a sliding window over variants

Description

Returns a vector or list of values obtained by applying a function to a sliding window over variants.

Usage

seqSlidingWindow(gdsfile, var.name, win.size, shift=1, FUN,
    as.is=c("list", "integer", "double", "character", "none"),
    var.index=c("none", "relative", "absolute"), ...)

Arguments

gdsfile
a SeqVarGDSClass object, e.g. as returned by seqOpen
var.name
the variable name(s), see details
win.size
the size of the sliding window, in number of variants
shift
the number of variants to shift the window at each step
FUN
the function to be applied
as.is
the type of the returned value: "list", "integer", "double", "character" or "none"
var.index
if "none", FUN is called as FUN(x, ...) without a variant index; if "relative" or "absolute", FUN is called as FUN(index, x, ...), where index is the index of a variant, starting from 1: "relative" indexes within the selection defined by seqSetFilter, "absolute" indexes with respect to all data
...
optional arguments to FUN
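
The relationship between win.size, shift and the two var.index modes can be illustrated in base R. This is only a sketch under stated assumptions, not SeqArray's implementation: the vector `selected` is hypothetical data, and the sketch assumes that only complete windows are evaluated.

```r
# Hypothetical selection: absolute positions (in the full file) of the
# six variants that pass a filter such as the one set by seqSetFilter.
selected <- c(3L, 7L, 8L, 12L, 15L, 20L)

win.size <- 3L
shift    <- 2L

# Assumption: only complete windows are evaluated, so the last window
# starts at length(selected) - win.size + 1.
relative_start <- seq(1L, length(selected) - win.size + 1L, by = shift)

# "relative": the window start counted within the filtered selection.
# "absolute": the same window start, counted over all variants.
absolute_start <- selected[relative_start]

print(relative_start)
print(absolute_start)
```

With these values the windows start at the first and third selected variants, which correspond to the third and eighth variants of the full data set.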

Value

A vector or list of values.
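
How as.is shapes the result can be mimicked in base R. The values in `per_window` are made up for illustration, and the sketch assumes that FUN returns exactly one value per window whenever a vector type such as "double" is requested; it does not call SeqArray itself.

```r
# Hypothetical per-window results, as FUN might return them for four windows.
per_window <- list(0.25, 0.50, 0.125, 0.75)

# Analogue of as.is = "list": keep one list element per window.
as_list <- per_window

# Analogue of as.is = "double": collapse to a numeric vector.
# Assumption: FUN returned exactly one numeric value per window.
as_double <- vapply(per_window, function(v) as.double(v), double(1))

str(as_list)
print(as_double)
```

The examples on this page use as.is="none", where FUN is called for its side effects (printing) rather than for a collected result.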

Details

The variable name should be "sample.id", "variant.id", "position", "chromosome", "allele", "annotation/id", "annotation/qual", "annotation/filter", "annotation/info/VARIABLE_NAME", or "annotation/format/VARIABLE_NAME".

In the user-defined function FUN(x, ...) or FUN(index, x, ...), x is a list with win.size elements, one per variant in the window, and each element holds the values of the variable(s) given in var.name; index is the starting position of the sliding window.
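
The calling convention described above can be emulated in base R to see what FUN receives. The helper `sliding_window_sketch` and the IDs in `ids` are hypothetical and exist only for this illustration; the sketch assumes that only complete windows are evaluated and handles a single variable, so it is not a substitute for seqSlidingWindow.

```r
# Base-R sketch of a sliding window over variants (not the SeqArray code).
# `values` holds one entry per variant; FUN receives the window's starting
# index and a list `x` with `win.size` elements.
sliding_window_sketch <- function(values, win.size, shift = 1L, FUN, ...) {
  n <- length(values)
  if (n < win.size) return(list())
  # Assumption: only complete windows are evaluated.
  starts <- seq(1L, n - win.size + 1L, by = shift)
  lapply(starts, function(index) {
    x <- as.list(values[index:(index + win.size - 1L)])
    FUN(index, x, ...)
  })
}

# Hypothetical variant IDs standing in for "annotation/id".
ids <- c("rs1", "rs2", "rs3", "rs4", "rs5")

res <- sliding_window_sketch(ids, win.size = 3L, shift = 1L,
  FUN = function(index, x) {
    paste0(index, ":", paste(unlist(x), collapse = ","))
  })
print(unlist(res))
```

Five variants with a window of three and a shift of one give three windows, starting at variants 1, 2 and 3.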

The algorithm is optimized by processing the computations in blocks, so that it works from high-speed memory rather than from disk.

See Also

seqSetFilter, seqGetData, seqApply

Examples

# the GDS file
gds.fn <- seqExampleFileName("gds")
# or gds.fn <- "C:/YourFolder/Your_GDS_File.gds"

# open the GDS file and display its contents
(f <- seqOpen(gds.fn))

# get 'sample.id'
(samp.id <- seqGetData(f, "sample.id"))
# "NA06984" "NA06985" "NA06986" ...

# get 'variant.id'
head(variant.id <- seqGetData(f, "variant.id"))


# set sample and variant filters
set.seed(100)
seqSetFilter(f, sample.id=samp.id[seq(2, 16, 2)],
	variant.id=sample(variant.id, 10))


# print the contents of each window of three variants
seqSlidingWindow(f, c(qual="annotation/id"), win.size=3,
	FUN = function(x) {
		# x is a list with 'win.size' elements
		print(x)
	}, as.is="none")

# print the variant IDs of each window on one tab-separated line
seqSlidingWindow(f, c(qual="annotation/id"), win.size=3,
	FUN = function(x) {
		cat(unlist(x), sep="\t"); cat("\n")
	}, as.is="none")


# request several variables at once and pass the relative window index to FUN
seqSlidingWindow(f, c(geno="genotype", phase="phase", qual="annotation/id"),
	FUN = function(index, x) {
		cat("Window ", index, ":\n", sep="")
		print(x)
	},
	win.size=3, as.is="none", var.index="relative")


# for each window of four variants, print the mean genotype value of each variant
seqSlidingWindow(f, "genotype", win.size=4,
	FUN = function(index, x) {
		z <- unlist(lapply(x, function(z) mean(z, na.rm=TRUE)))
		cat("Window ", index, ", starting from Variant ", index,
			"\n    ", format(round(z,3), nsmall=3, width=8), "\n", sep="")
	},
	as.is="none", var.index="relative")


# close the GDS file
seqClose(f)