GWASTools (version 1.12.2)

ncdfSetMissingGenotypes: Write a new netCDF or GDS file, setting certain SNPs to missing - deprecated

Description

Deprecated - use setMissingGenotypes

ncdfSetMissingGenotypes copies an existing netCDF genotype file to a new one, setting SNPs in specified regions to missing. gdsSetMissingGenotypes copies an existing GDS genotype file to a new one, setting SNPs in specified regions to missing.

Usage

ncdfSetMissingGenotypes(parent.file, new.file, regions, sample.include=NULL, verbose=TRUE)
gdsSetMissingGenotypes(parent.file, new.file, regions, sample.include=NULL, zipflag="ZIP.max", verbose=TRUE)

Arguments

parent.file
Name of the parent file
new.file
Name of the new file
regions
Data.frame of chromosome regions with columns "scanID", "chromosome", "left.base", "right.base", "whole.chrom".
sample.include
Vector of sampleIDs to include in new.file
zipflag
the compression format for the GDS file, one of "", "ZIP", "ZIP.fast", "ZIP.default", or "ZIP.max"
verbose
Logical value specifying whether to show progress information.
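
The regions argument is the easiest one to get wrong, so a small check before calling the function may help. The sketch below builds a regions data.frame with the five documented columns and validates it. check_regions is a hypothetical helper, not part of GWASTools, the scan IDs are made up, and the constraints it enforces (boundaries required when whole.chrom is FALSE, left.base not exceeding right.base) are assumptions about sensible input rather than documented requirements.

```r
# Hypothetical helper (not part of GWASTools): check that a regions
# data.frame has the documented columns and plausible values.
check_regions <- function(regions) {
  required <- c("scanID", "chromosome", "left.base", "right.base", "whole.chrom")
  missing.cols <- setdiff(required, names(regions))
  if (length(missing.cols) > 0) {
    stop("regions is missing columns: ", paste(missing.cols, collapse=", "))
  }
  stopifnot(is.logical(regions$whole.chrom), !anyNA(regions$whole.chrom))
  # Assumption: rows that are not whole-chromosome need both boundaries
  partial <- !regions$whole.chrom
  if (anyNA(regions$left.base[partial]) || anyNA(regions$right.base[partial])) {
    stop("left.base and right.base are required when whole.chrom is FALSE")
  }
  # Assumption: left.base should not exceed right.base
  if (any(regions$left.base[partial] > regions$right.base[partial])) {
    stop("left.base must not exceed right.base")
  }
  invisible(TRUE)
}

regions <- data.frame(
  scanID      = c(101L, 102L, 103L),        # made-up scan IDs
  chromosome  = c(21, 22, 23),
  left.base   = c(14000000, 30000000, NA),  # NA is fine for whole.chrom rows
  right.base  = c(28000000, 45000000, NA),
  whole.chrom = c(FALSE, FALSE, TRUE)
)
regions.ok <- check_regions(regions)
```

In real use the scanID values would come from the parent file, for example via getScanID as in the Examples section.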

Details

ncdfSetMissingGenotypes and gdsSetMissingGenotypes remove chromosome regions by setting SNPs that fall within the anomaly regions to NA (i.e., the missing value in the netCDF/GDS file). Optionally, entire samples may be excluded from the netCDF/GDS file as well: if the sample.include argument is given, only the scanIDs in this vector will be written to the new file, so the sample dimension will be length(sample.include).

For regions with whole.chrom=TRUE, the entire chromosome will be set to NA for that sample. For other regions, only the region between left.base and right.base will be set to NA.
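
The behavior described above can be illustrated on an in-memory matrix. This is a minimal base R sketch of the masking logic, not the GWASTools implementation: mask_regions and the toy data are hypothetical, genotypes are assumed to be stored SNP-by-scan, and region boundaries are assumed to be inclusive, which this page does not confirm.

```r
# Toy data: 6 SNPs x 3 scans, stored SNP-by-scan (an assumption).
geno <- matrix(c(0L, 1L, 2L), nrow=6, ncol=3)
scanID    <- c(101L, 102L, 103L)             # made-up scan IDs
snp.chrom <- c(21, 21, 21, 22, 22, 22)
snp.pos   <- c(10e6, 20e6, 30e6, 10e6, 20e6, 30e6)

regions <- data.frame(
  scanID      = c(101L, 102L),
  chromosome  = c(21, 22),
  left.base   = c(14e6, NA),
  right.base  = c(28e6, NA),
  whole.chrom = c(FALSE, TRUE)
)

# Hypothetical helper: set genotypes in each region to NA for that scan,
# then optionally keep only the scans listed in sample.include.
mask_regions <- function(geno, scanID, snp.chrom, snp.pos, regions,
                         sample.include=NULL) {
  for (i in seq_len(nrow(regions))) {
    r <- regions[i, ]
    col <- match(r$scanID, scanID)
    if (is.na(col)) next                     # region for a scan we do not have
    on.chrom <- snp.chrom == r$chromosome
    if (r$whole.chrom) {
      hit <- on.chrom                        # entire chromosome for this scan
    } else {
      # Boundaries treated as inclusive here (an assumption)
      hit <- on.chrom & snp.pos >= r$left.base & snp.pos <= r$right.base
    }
    geno[hit, col] <- NA
  }
  if (!is.null(sample.include)) {
    geno <- geno[, scanID %in% sample.include, drop=FALSE]
  }
  geno
}

masked <- mask_regions(geno, scanID, snp.chrom, snp.pos, regions,
                       sample.include=c(101L, 102L))
```

Here scan 101 loses only the chromosome 21 SNP at 20 Mb, scan 102 loses every chromosome 22 SNP, and scan 103 is dropped, so the sample dimension of masked equals length(sample.include).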

See Also

ncdfSubset, anomSegStats for chromosome anomaly regions

Examples

## Not run: 
library(GWASTools)

# Open the example netCDF genotype file and take the first 10 scan IDs
ncfile <- system.file("extdata", "affy_geno.nc", package="GWASdata")
nc <- NcdfGenotypeReader(ncfile)
sample.sel <- getScanID(nc, index=1:10)
close(nc)

# Regions to set to missing: two partial regions and one whole chromosome
regions <- data.frame("scanID"=sample.sel[1:3], "chromosome"=c(21,22,23),
  "left.base"=c(14000000, 30000000, NA), "right.base"=c(28000000, 450000000, NA),
  whole.chrom=c(FALSE, FALSE, TRUE))

# Write the new file with only the selected samples, then clean up
newnc <- tempfile()
ncdfSetMissingGenotypes(ncfile, newnc, regions, sample.include=sample.sel)
file.remove(newnc)
## End(Not run)