
beadarrayMSV (version 1.0.3)

translateTheta: Convert genotype calls to allele information

Description

Genotype calls represented as numeric values (allele ratios) within [0, 1] are converted to character strings containing the allele information (A, T, C, and G).

Usage

translateTheta(calls, resInfo, type = "regular")

translateThetaCombined(BSRed, mergedCalls = NULL)

translateThetaFromFiles(dataFiles, mergedCalls = NULL, markerStep = 1000, sep = "", quote = "")

Arguments

calls: Numeric matrix with calls {0, 1/2, 1} representing allele ratios for each sample. Each row is a unique marker or paralogue (specified with type).

resInfo: Data table containing featureData, including the columns Classification, SNP, and ILMN.Strand. These hold the genotype categories from callGenotypes and the SNP and TOP/BOT category of the BeadArray markers (see createAlleleSet).

type: One of "regular", "single", or "merged" (see Details).

BSRed: "AlleleSetIllumina" object containing an assayData entry call and a featureData column Classification (see callGenotypes).

mergedCalls: Matrix with calls from resolved MSV-5 paralogs (see assignParalogues).

dataFiles: Character vector containing file names (see makeFilenames).

markerStep: The maximum number of markers loaded into the workspace at a time.

sep: Field delimiter in text files (see read.table).

quote: Quote marks used for character strings (see read.table).

Details

The main difference between translateTheta and translateThetaCombined is that the former can only handle call values {0, 1/2, 1}, whereas the latter handles values {0, 1/4, 1/2, 3/4, 1}. In effect, this means that markers from duplicated genome regions have to be handled in a special way if analysed with translateTheta.

If type == "regular", the markers are treated as if they were all from a diploid region. This implies that all non-segregating paralogs of MSV-a and MSV-b markers are ignored, effectively turning these markers into SNPs. Markers classified as MSV-5 or PSV are set to missing (see makeDiploidCalls).

If type == "single", calls is expected to contain resolved MSV-5 paralogs named with -Para1 or -Para2 (see unmixParalogues).

If type == "merged", resolved MSV-5 paralogs named according to their respective chromosomes, -ChromX, are expected (see assignParalogues).

The main use of translateTheta is to prepare genotype calls for mapping software that requires diploid markers.

With translateThetaCombined, there is always one element per marker, as required by the "AlleleSetIllumina" class. If mergedCalls is given, the MSV-5 paralogs will be resolved; otherwise only the ratio of the alleles across paralogs will be returned. The function translateThetaFromFiles performs the same operations on data sequentially loaded into the workspace, and the genotypes are written to the file dataFiles$genoFile as they are found.

Value

Output from translateTheta is a matrix whose dimensions depend on the input data. If calls has one row per marker (i.e. type == "regular"), the number of rows in the output matrix equals the number of markers. If calls has one row per paralogue (i.e. type != "regular"), the number of rows in the output matrix equals the number of paralogs. Each element is a character string x y denoting the two alleles (A, T, C, G, or - for missing).
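To make the "x y" output format concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name ratioToGenotype and its arguments are hypothetical, and it assumes that a call of 0 means homozygous for the first allele given and a call of 1 homozygous for the second. The real function derives the alleles from the SNP and ILMN.Strand columns of resInfo, which this sketch does not attempt to reproduce. Writing missing calls as "- -" is likewise an assumption based on the description above.

```r
# Hypothetical helper (not part of beadarrayMSV): map allele ratios
# {0, 1/2, 1} to "x y" genotype strings for a single marker.
# Assumption: the ratio is the fraction of allele2; anything else,
# including NA, is written as "- -".
ratioToGenotype <- function(ratio, allele1, allele2) {
  out <- rep("- -", length(ratio))
  ok <- !is.na(ratio)
  out[ok & ratio == 0]   <- paste(allele1, allele1)
  out[ok & ratio == 0.5] <- paste(allele1, allele2)
  out[ok & ratio == 1]   <- paste(allele2, allele2)
  out
}

# Made-up calls for four samples at one A/G marker
calls <- c(0, 0.5, 1, NA)
genos <- ratioToGenotype(calls, "A", "G")
print(genos)
```

For real data, use translateTheta itself so that the TOP/BOT strand information is taken into account.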

In contrast, the output from translateThetaCombined is an AlleleSetIllumina object with an added assayData entry genotype. The elements of this matrix that represent diploid markers are given as xy, unresolved tetraploid markers are given as xyzw, and resolved tetraploid markers are given as xy,zw (paralogs separated by a comma). The letters correspond to any of the four bases, or - for missing.
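The three string layouts described above can be told apart downstream with a few lines of base R. The sketch below is illustrative only: splitGenotype is a hypothetical helper, not a function from beadarrayMSV, and the example strings are made up rather than taken from real output. It assumes that any string without a comma and with fewer than four characters is a diploid genotype.

```r
# Hypothetical helper (not part of beadarrayMSV): classify one combined
# genotype string and split resolved paralogs at the comma.
splitGenotype <- function(g) {
  if (grepl(",", g, fixed = TRUE)) {
    # "xy,zw": resolved tetraploid, one element per paralog
    paralogs <- strsplit(g, ",", fixed = TRUE)[[1]]
    kind <- "resolved tetraploid"
  } else if (nchar(g) == 4) {
    # "xyzw": allele ratio across paralogs only
    paralogs <- g
    kind <- "unresolved tetraploid"
  } else {
    # "xy": diploid marker (assumed for all remaining strings)
    paralogs <- g
    kind <- "diploid"
  }
  list(kind = kind, paralogs = paralogs)
}

# Made-up example strings, one per layout plus a missing diploid call
examples <- c("AG", "AAGG", "AA,AG", "--")
parsed <- lapply(examples, splitGenotype)
kinds <- vapply(parsed, function(p) p$kind, character(1))
print(kinds)
```

In practice the helper would be applied to the elements of assayData(BSRed)$genotype after calling translateThetaCombined.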

The function translateThetaFromFiles is used for its side effects.

See Also

makeDiploidCalls, unmixParalogues, assignParalogues, makeFilenames, callGenotypes

Examples

#Read 25 markers into an AlleleSetIllumina object
rPath <- system.file("extdata", package="beadarrayMSV")
normOpts <- setNormOptions()
dataFiles <- makeFilenames('testdata',normOpts,rPath)
beadFile <- paste(rPath,'beadData_testdata.txt',sep='/')
beadInfo <- read.table(beadFile,sep='\t',header=TRUE,as.is=TRUE)
BSRed <- createAlleleSetFromFiles(dataFiles[1:4],markers=1:25,beadInfo=beadInfo)

#Genotype calling
BSRed <- callGenotypes(BSRed)
genotypes <- translateTheta(assayData(BSRed)$call,fData(BSRed),type='regular')
print(cbind(fData(BSRed)$Classification,genotypes[,1:3])[1:10,])

#Alternative output
BSRed <- translateThetaCombined(BSRed)
print(cbind(fData(BSRed)$Classification,assayData(BSRed)$genotype[,1:3])[1:10,])
