SeqArray (version 1.8.0)

seqParallel: Apply Functions in Parallel

Description

Apply a user-defined function in parallel, optionally splitting the dataset by variant or by sample.

Usage

seqParallel(cl, gdsfile, FUN = function(gdsfile, ...) NULL, split=c("by.variant", "by.sample", "none"), .combine=NULL, ...)

Arguments

cl
NULL or a cluster object, created by the package parallel or snow
gdsfile
a SeqVarGDSClass object, as returned by seqOpen
FUN
the function to be applied
split
how to divide the dataset among the processes: "by.variant" or "by.sample" to split by variant or by sample, or "none" for no split
.combine
a function for combining the results from different processes; by default, 'c' is used; if .combine=="", return invisible()
...
optional arguments to FUN

Value

A vector or list of values.

Details

If cl = NULL or length(cl) == 0, the function simply calls FUN(gdsfile, ...). Otherwise, it splits the jobs across the processes and calls FUN(gdsfile, ...) on each process; the optional arguments in ... are passed to every process.
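The split-then-combine behaviour described above can be illustrated with base R and the parallel package alone. This is only a conceptual sketch, not SeqArray's implementation: the matrix geno and the function chunk_fun are hypothetical stand-ins for a GDS file and for FUN, and the variants are assumed to be divided into one contiguous chunk per worker.

```r
library(parallel)

# Hypothetical stand-in for genotype data: 10 variants (rows) x 4 samples (columns)
set.seed(1)
geno <- matrix(rbinom(40, 1, 0.5), nrow = 10)

# Plays the role of FUN: proportion of 0 calls for each variant in a chunk
chunk_fun <- function(rows, geno) rowMeans(geno[rows, , drop = FALSE] == 0)

cl <- makeCluster(2)

# Split the variant indices into one chunk per worker (assumed analogue of "by.variant")
chunks <- splitIndices(nrow(geno), length(cl))

# Each worker processes its own chunk; the extra argument is sent to every worker
parts <- clusterApply(cl, chunks, chunk_fun, geno = geno)

stopCluster(cl)

# Combine the per-worker results with c(), the documented default for .combine
par_result <- do.call(c, parts)

# Single-process reference, analogous to the cl = NULL case
seq_result <- chunk_fun(seq_len(nrow(geno)), geno)
```

Because the chunks are processed in order and concatenated with c(), par_result should have one value per variant and agree with seq_result.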

See Also

seqSetFilter, seqGetData, seqApply

Examples

library(SeqArray)
library(parallel)

# Use the option cl.cores to choose an appropriate cluster size or number of cores
cl <- makeCluster(getOption("cl.cores", 2))


# the GDS file name
gds.fn <- seqExampleFileName("gds")
# or gds.fn <- "C:/YourFolder/Your_GDS_File.gds"

# open the GDS file and display it
(f <- seqOpen(gds.fn))

# the uniprocessor version
afreq1 <- seqParallel(NULL, f, FUN = function(gdsfile) {
		seqApply(gdsfile, "genotype", as.is="double",
			FUN=function(x) mean(x==0, na.rm=TRUE))
	}, split = "by.variant")

length(afreq1)
summary(afreq1)


# run in parallel
afreq2 <- seqParallel(cl, f, FUN = function(gdsfile) {
		seqApply(gdsfile, "genotype", as.is="double",
			FUN=function(x) mean(x==0, na.rm=TRUE))
	}, split = "by.variant")

length(afreq2)
summary(afreq2)


# check that the uniprocessor and parallel results agree
all(afreq1 == afreq2)


################################################################
# check -- variant splits: number of variants selected on each process

seqParallel(cl, f, FUN = function(gdsfile) {
		v <- seqGetFilter(gdsfile)
		sum(v$variant.sel)
	}, split = "by.variant")
# [1] 674 674


################################################################

stopCluster(cl)

# close the GDS file
seqClose(f)