haplinSlide: Fit `haplin` in a series of sliding windows over a sequence of markers/SNPs

Description

Produces a list, each element of which is an object of class haplin, which is the result of fitting the log-linear haplin models to the data one "window" at a time.

Usage

haplinSlide(filename, data, pedIndex, markers = "ALL", winlength = 1,
    strata = NULL, table.output = TRUE, cpus = 1, slaveOutfile = "",
    printout = FALSE, verbose = FALSE, ...)

Arguments

filename

A character string giving the name and path of the ASCII data file to be read. The file should be in the Haplin data format.

data

An R object that is the result of using load.gwaa.data to load data into R. See the web page for a description of how to convert a ped file into a file that can be loaded. The conversion uses prepPed and convert.snp.ped.

pedIndex

A file of family indexes constructed by using prepPed on the original ped file. This file is used by Haplin to extract and store family information.

markers

Default is "ALL", which means haplinSlide uses all available markers in the data set in the analysis. Alternatively, the relevant markers can be specified by, for instance, markers = c(1, 3:10), which would use the first 10 markers except marker 2. haplinSlide will then run haplin on a series of windows selected from the supplied markers. The winlength argument decides the length of the windows. See details.

winlength

Length of the sliding, overlapping windows to be run along the markers. See details.

strata

A single numeric value specifying which data column contains the stratification variable.

table.output

If TRUE, the haptable function will be applied to each result after estimation, greatly reducing the size of the output. If FALSE, each element of the output list is a standard haplin object. To preserve memory, default is set to TRUE.

cpus

haplinSlide allows parallel processing of its analyses. The cpus argument should preferably be set to the number of available CPUs. Setting it lower leaves some capacity for other processes to run. Setting it too high should not cause any serious problems.
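As a rough sketch of how a value for cpus might be chosen, the following uses detectCores from the base parallel package and leaves one core free. The variable names n_cores and cpus_to_use are illustrative only and are not part of Haplin; the commented call reuses the file path from the Examples section.

```r
# Count the cores reported by the system. detectCores() may return NA
# on some platforms, so fall back to a single core in that case.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Leave one core free for other processes, but never go below 1.
cpus_to_use <- max(1L, n_cores - 1L)

# Assumed usage (not run here):
# result <- haplinSlide("C:/work/data.dat", cpus = cpus_to_use)
```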

slaveOutfile

Character. To be used when cpus > 1. If slaveOutfile = "" (default), output from all running cores will be printed in the standard R session window. Alternatively, the output can be saved to a file by specifying the file path and name.

printout

Default is FALSE. If TRUE, provides a full summary of each haplin result during the run of haplinSlide.

verbose

Same as for haplin, but defaults to FALSE to reduce output size.

...

Remaining arguments to be used by haplin in each run.

Value

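A list is returned, with one element per window. If table.output = TRUE (the default), each element is a haptable. If table.output = FALSE, each element is an object of class haplin.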

Details

haplinSlide runs haplin on a series of overlapping windows of the chosen markers. Except for the markers and winlength arguments, all arguments are used exactly as in haplin itself. For instance, if markers = c(1, 3, 4, 5, 7, 8) and winlength = 4, haplinSlide will run haplin first on the markers c(1, 3, 4, 5), then on c(3, 4, 5, 7), and finally on c(4, 5, 7, 8).

The results are returned in a list. The elements are named "1-3-4-5" etc. and can be extracted with, say, summary(res[["1-3-4-5"]]), where res is the saved result. Alternatively, the output can be examined with, for instance, lapply(res, summary) and lapply(res, plot).

When running haplinSlide on a large number of markers, the output can become prohibitively large. In that case table.output should be set to TRUE, and haplinSlide will return a list of summary "haptables". This list can then be stacked into a single data frame using toDataFrame. To avoid excessive memory use, the default is table.output = TRUE.

When multiple cores are available, set cpus to the number of cores that should be used. This will run haplinSlide in parallel on the chosen number of cores. Note that feedback is provided by each of the cores separately, and some cores may start working on markers far out in the sequence.
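The window selection described above can be illustrated in base R. This is only a sketch of the documented behaviour, not the code haplinSlide uses internally; make_windows is a hypothetical helper that is not part of Haplin.

```r
# Hypothetical helper (not part of Haplin): list the overlapping windows
# of length `winlength` that a sliding scan over `markers` would visit.
make_windows <- function(markers, winlength) {
  n_windows <- length(markers) - winlength + 1
  if (n_windows < 1) stop("winlength exceeds the number of markers")
  # Window i covers positions i, i + 1, ..., i + winlength - 1.
  windows <- lapply(seq_len(n_windows),
                    function(i) markers[i:(i + winlength - 1)])
  # Name each window by its markers joined with "-", as in the result list.
  names(windows) <- vapply(windows, paste, character(1), collapse = "-")
  windows
}

windows <- make_windows(c(1, 3, 4, 5, 7, 8), winlength = 4)
names(windows)
```

With the markers and winlength from the example above, this gives three windows whose names match the element names described for the result list.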

References

Gjessing HK and Lie RT. Case-parent triads: Estimating single- and double-dose effects of fetal and maternal disease gene haplotypes. Annals of Human Genetics (2006) 70, pp. 382-396. Web Site: http://folk.uib.no/gjessing/genetics/software/haplin/

Examples

# NOT RUN {
# (Almost) all standard haplin runs can be done with haplinSlide. 
# Below is an illustration. See the haplin help page for more 
# examples.
# 
# Analyzing the effect of fetal genes, including triads with missing data,
# using a multiplicative response model. When winlength = 1, separate
# markers are used. To make longer windows, winlength can be increased
# correspondingly:
result.1 <- haplinSlide("C:/work/data.dat", use.missing = TRUE, response = "mult",
    reference = "ref.cat", winlength = 1, table.output = FALSE)
# Provide summary of separate results:
lapply(result.1, summary)
# Plot results:
par(ask = TRUE)
lapply(result.1, plot)

# }