Learn R Programming

aroma.light (version 3.2.0)

callNaiveGenotypes: Calls genotypes in a normal sample

Description

Calls genotypes in a normal sample.

Usage

"callNaiveGenotypes"(y, cn=rep(2L, times = length(y)), ..., modelFit=NULL, verbose=FALSE)

Arguments

y
A numeric vector of length J containing allele B fractions for a normal sample.
cn
An optional numeric vector of length J specifying the true total copy number in $\{0,1,2,NA\}$ at each locus. This can be used to specify which loci are diploid and which are not, e.g. autosomal and sex chromosome copy numbers.
...
Additional arguments passed to fitNaiveGenotypes().
modelFit
A optional model fit as returned by fitNaiveGenotypes().
verbose
A logical or a Verbose object.

Value

Returns a numeric vector of length J containing the genotype calls in allele B fraction space, that is, in [0,1] where 1/2 corresponds to a heterozygous call, and 0 and 1 corresponds to homozygous A and B, respectively. Non called genotypes have value NA.

Missing and non-finite values

A missing value always gives a missing (NA) genotype call. Negative infinity (-Inf) always gives genotype call 0. Positive infinity (+Inf) always gives genotype call 1.

See Also

Internally fitNaiveGenotypes() is used to identify the thresholds.

Examples

Run this code
layout(matrix(1:3, ncol=1))
par(mar=c(2,4,4,1)+0.1)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# A bimodal distribution
# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
xAA <- rnorm(n=10000, mean=0, sd=0.1)
xBB <- rnorm(n=10000, mean=1, sd=0.1)
x <- c(xAA,xBB)
fit <- findPeaksAndValleys(x)
print(fit)
calls <- callNaiveGenotypes(x, cn=rep(1,length(x)), verbose=-20)
xc <- split(x, calls)
print(table(calls))
xx <- c(list(x),xc)
plotDensity(xx, adjust=1.5, lwd=2, col=seq(along=xx), main="(AA,BB)")
abline(v=fit$x)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# A trimodal distribution with missing values
# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
xAB <- rnorm(n=10000, mean=1/2, sd=0.1)
x <- c(xAA,xAB,xBB)
x[sample(length(x), size=0.05*length(x))] <- NA;
x[sample(length(x), size=0.01*length(x))] <- -Inf;
x[sample(length(x), size=0.01*length(x))] <- +Inf;
fit <- findPeaksAndValleys(x)
print(fit)
calls <- callNaiveGenotypes(x)
xc <- split(x, calls)
print(table(calls))
xx <- c(list(x),xc)
plotDensity(xx, adjust=1.5, lwd=2, col=seq(along=xx), main="(AA,AB,BB)")
abline(v=fit$x)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# A trimodal distribution with clear separation
# - - - - - - - - - - - - - - - - - - - - - - - - - - - -
xAA <- rnorm(n=10000, mean=0, sd=0.02)
xAB <- rnorm(n=10000, mean=1/2, sd=0.02)
xBB <- rnorm(n=10000, mean=1, sd=0.02)
x <- c(xAA,xAB,xBB)
fit <- findPeaksAndValleys(x)
print(fit)
calls <- callNaiveGenotypes(x)
xc <- split(x, calls)
print(table(calls))
xx <- c(list(x),xc)
plotDensity(xx, adjust=1.5, lwd=2, col=seq(along=xx), main="(AA',AB',BB')")
abline(v=fit$x)

Run the code above in your browser using DataLab