This function calls a C++ function that does all the heavy lifting. It passes the arguments the C++ function needs: some come from the caller's arguments and some from data frames in the "global" environment, envir. From its markers_arg argument, it takes the locus_index and the index into the unified_genotype_table. From envir, it takes a bit vector of compressed genotype information, allele information, and some bookkeeping data. Note: this function also dispatches on the type of compression in the genotype vector. One C++ function is called when the vector is compressed and a different one when it is not.
getgenotypesgenabel(markers_arg, envir = ENV)
markers_arg: a data.frame with the following 5 observations:
is the ordinal ranking of this marker among all loci
is the position of corresponding genotype data in the unified_genotype_table
is the text name of the marker
is the integer chromosome number
is the integer base pair position of marker
envir: an environment that contains all the data frames created from the SQLite database.
the GenABEL gwaa.data-class object component that contains the genotype data.
This function reads the genotype data in Mega2 compressed format and converts it to the GenABEL compressed format. The unified_genotype_table contains one raw vector for each person. In that vector, there are two bits for each genotype, so each byte holds the data for 4 markers. In GenABEL, there is one raw vector per marker, and each byte holds the data for 4 persons. The C++ function performs this conversion and also adjusts the contents of the bits. For example, the genotype that GenABEL represents with bits == 0 is the one Mega2 represents with 2. Doing the conversion in C++ is 10 to 20 times faster than converting the Mega2 data to PLINK .tped files and then having GenABEL read in and process/convert those files.
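The repacking described above can be illustrated with a small pure-R sketch. It is not the package's C++ code, only a model of the idea: unpack each person's two-bit codes, recode them, and repack them one marker at a time. The helper names unpack2, pack2 and person_to_marker are hypothetical. The bit order within a byte is an assumption, and so is the recode table, apart from the one mapping stated above (Mega2 2 becomes GenABEL 0).

```r
# Unpack a raw vector holding four 2-bit codes per byte into integer codes.
# Assumption: the first code sits in the two lowest bits of each byte.
unpack2 <- function(bytes, n) {
  b <- as.integer(bytes)
  codes <- rbind(b %% 4L, (b %/% 4L) %% 4L, (b %/% 16L) %% 4L, b %/% 64L)
  as.vector(codes)[seq_len(n)]
}

# Pack integer codes (0..3) into a raw vector, four codes per byte.
# The last byte is padded with code 0 when needed.
pack2 <- function(codes) {
  padded <- c(codes, integer((-length(codes)) %% 4L))
  m <- matrix(padded, nrow = 4L)
  as.raw(m[1, ] + 4L * m[2, ] + 16L * m[3, ] + 64L * m[4, ])
}

# Hypothetical recode table, indexed by (Mega2 code + 1). Only the entry
# Mega2 2 -> GenABEL 0 comes from the text above; the rest are placeholders.
recode <- c(1L, 3L, 0L, 2L)

# Convert one raw vector per person into one raw vector per marker.
person_to_marker <- function(per_person, n_markers) {
  unpacked <- lapply(per_person, unpack2, n = n_markers)
  # rows = persons, columns = markers
  g <- matrix(unlist(unpacked), nrow = length(per_person), byrow = TRUE)
  g[] <- recode[g + 1L]
  lapply(seq_len(n_markers), function(j) pack2(g[, j]))
}

# Three persons, five markers, given as Mega2-style integer codes.
codes <- rbind(c(0L, 1L, 2L, 3L, 2L),
               c(2L, 2L, 0L, 1L, 3L),
               c(3L, 0L, 1L, 2L, 0L))
per_person <- lapply(seq_len(nrow(codes)), function(i) pack2(codes[i, ]))
per_marker <- person_to_marker(per_person, n_markers = ncol(codes))
```

Here each person occupies two bytes (five markers, four per byte), while each marker in the result occupies one byte (three persons, four per byte). Working on whole vectors in R like this is only meant to show the layout change; the package does the same kind of work in C++ for speed.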
# NOT RUN {
db = system.file("exdata", "seqsimm.db", package="Mega2R")
ENV = read.Mega2DB(db)
aa = getgenotypesgenabel(ENV$markers[ENV$markers$chromosome == 1,])
aa
# }