readVCF: Reading SNP data from the 1000 Genomes Project

Description

This function reads tabix-indexed VCF files from the 1000 Genomes Project.

Usage

readVCF(filename, numcols, tid, frompos, topos,
        samplenames=NA, gffpath = FALSE, include.unknown=FALSE )

Arguments

filename

the corresponding VCF-file

numcols

number of SNPs to read in as one chunk

tid

the chromosome to read (character)

frompos

start of the region

topos

end of the region

samplenames

a vector of the names of the individuals to read

gffpath

the corresponding GFF-file

include.unknown

whether to include unknown positions

Value

The function creates an object of class "GENOME".

The following slots will be filled in the "GENOME" object:

      Slot                 Description
  1.  n.sites              total number of sites
  2.  n.biallelic.sites    number of biallelic sites
  3.  region.data          some detailed data for each region
  4.  region.names         names of each region

Details

The ff package we use to store the SNP information is limited by individuals * (number of SNPs) <= .Machine$integer.max; otherwise the bigmemory package will be applied (slower).

Use the function vcf_handle <- .Call("VCF_open", filename) to open a VCF file and .Call("VCF_getSampleNames", vcf_handle) to get the individual names.

See also readData(..., format="VCF").
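
The size limit described above can be checked before reading a large region. The following is a minimal sketch in base R; fits_in_ff is a hypothetical helper, not a PopGenome function, and the sample and SNP counts are illustrative only.

```r
# Hypothetical helper (not part of PopGenome): returns TRUE when
# individuals * (number of SNPs) stays within .Machine$integer.max.
fits_in_ff <- function(n_individuals, n_snps) {
  # Convert to double first so that the product cannot overflow
  # the integer range (integer overflow would yield NA in R).
  as.numeric(n_individuals) * as.numeric(n_snps) <= .Machine$integer.max
}

# Illustrative dimensions only
fits_in_ff(1092, 1e6)   # 1.092e9 is below the limit
fits_in_ff(2504, 1e6)   # 2.504e9 exceeds the limit
```

When the check returns FALSE, the text above implies that the slower bigmemory storage would be used; requesting a smaller region via frompos and topos, or fewer individuals via samplenames, reduces the product.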

Examples

# GENOME.class <- readVCF(".../chr1.vcf.gz", 1000, "1", 1, 100000)
# GENOME.class
# GENOME.class@region.names
# GENOME.class <- neutrality.stats(GENOME.class,FAST=TRUE)
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data
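
The samplenames argument can be combined with the two .Call statements quoted in Details to read only a subset of individuals. The sketch below is an illustration under assumptions: the file name "chr1.vcf.gz" is a placeholder for a local tabix-indexed VCF file, the choice of the first ten individuals is arbitrary, and the .Call statements are assumed to resolve once PopGenome is attached. The code does nothing if the package or the file is missing.

```r
# Assumption: placeholder path to a local bgzip-compressed,
# tabix-indexed VCF file; replace it with a real path.
vcf_file <- "chr1.vcf.gz"

if (requireNamespace("PopGenome", quietly = TRUE) && file.exists(vcf_file)) {
  library(PopGenome)

  # Open the file and list the individuals it contains
  # (the two calls quoted in the Details section).
  vcf_handle  <- .Call("VCF_open", vcf_file)
  all_samples <- .Call("VCF_getSampleNames", vcf_handle)

  # Keep the first ten individuals, or fewer if the file has fewer.
  subset_samples <- head(all_samples, 10)

  # Read positions 1 to 100000 of chromosome "1" in chunks of 1000 SNPs,
  # restricted to the selected individuals.
  GENOME.class <- readVCF(vcf_file, 1000, "1", 1, 100000,
                          samplenames = subset_samples)
  GENOME.class@n.sites
}
```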
