A local region of the tumor is sampled by constructing a cube with side length cube.length around the center point pos. Each cell within the cube is sampled, and the reported quantity is the variant (or mutation) allele frequency (VAF or MAF). Lattice sites without cells are assumed to be normal tissue, so the reported MAF may be less than 1.0 even if the mutation is present in all cancerous cells.
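
The cube sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's implementation: the function name cube_vaf and the dictionary layout (lattice coordinates mapped to sets of mutation ids) are hypothetical, and the frequency is taken to be the fraction of lattice sites in the cube that carry the mutation, with ploidy ignored.

```python
from itertools import product


def cube_vaf(cells, pos, cube_length):
    """Fraction of lattice sites in the cube that carry each mutation.

    cells: dict mapping (x, y, z) -> set of mutation ids (hypothetical layout).
    Sites absent from `cells` are empty and count as normal tissue.
    """
    if cube_length < 1 or cube_length % 2 == 0:
        raise ValueError("cube_length must be a positive odd integer")
    half = cube_length // 2
    counts = {}
    # Visit every lattice site in the cube centered on pos.
    for dx, dy, dz in product(range(-half, half + 1), repeat=3):
        site = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        for mutation in cells.get(site, ()):
            counts[mutation] = counts.get(mutation, 0) + 1
    # Empty sites stay in the denominator, so a mutation present in every
    # cancerous cell can still have a frequency below 1.0.
    n_sites = cube_length ** 3
    return {m: c / n_sites for m, c in counts.items()}


# Two occupied sites inside a 3 x 3 x 3 cube (27 lattice sites in total).
example_cells = {(0, 0, 0): {1}, (1, 0, 0): {1, 2}}
example_vaf = cube_vaf(example_cells, (0, 0, 0), 3)
```

In this toy example mutation 1 is present in every cell, yet its reported frequency is 2/27 because the 25 empty sites are treated as normal tissue.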
If coverage is non-zero, deep sequencing is simulated. For a chosen coverage \(C\), the number of times a base is read is modeled as a \(Pois(C)\) random variable (see Illumina's website). Let \(d\) be the realized read depth sampled from this distribution, and let \(p\) be the true VAF in the cube. The estimated VAF is then drawn from a \(Bin(d,p)/d\) distribution.
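
A minimal sketch of this two-stage noise model, using only the Python standard library, might look as follows. The names sample_poisson and noisy_vaf are hypothetical, and the handling of a zero read depth (returning NaN) is an assumption, since the text does not say what happens when \(d = 0\).

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw from Pois(mean) with Knuth's multiplication method.

    Suitable for moderate means only: exp(-mean) underflows to zero
    for means above roughly 700.
    """
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def noisy_vaf(true_vaf, coverage, rng=random):
    """Estimated VAF after simulated sequencing at the given mean coverage."""
    if coverage <= 0:
        return true_vaf  # coverage of zero: report the true frequency
    depth = sample_poisson(rng, coverage)  # d ~ Pois(C)
    if depth == 0:
        return float("nan")  # assumption: no reads, so no estimate
    # Bin(d, p) drawn as the sum of d independent Bernoulli(p) trials.
    alt_reads = sum(rng.random() < true_vaf for _ in range(depth))
    return alt_reads / depth


rng = random.Random(0)
estimate = noisy_vaf(0.4, 100, rng)
```

Because the read depth is itself random, repeated calls with the same true VAF and coverage give different estimates, with the spread shrinking as the coverage grows.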
Note that cube.length must be an odd integer so that the cube has a well-defined center point.