Run PHASE to estimate the phase of loci in diploid data.
phase(
g,
loci,
positions = NULL,
type = NULL,
num.iter = 1e+05,
thinning = 100,
burnin = 1e+05,
model = "new",
ran.seed = NULL,
final.run.factor = NULL,
save.posterior = FALSE,
in.file = "phase_in",
out.file = "phase_out",
delete.files = TRUE
)
phaseReadSample(out.file, type)
phaseReadPair(out.file)
phaseWrite(
g,
loci,
positions = NULL,
type = rep("S", length(loci)),
in.file = "phase_in"
)
phasePosterior(ph.res, keep.missing = TRUE)
phaseFilter(ph.res, thresh = 0.5, keep.missing = TRUE)
g |
a gtypes object. |
loci |
vector or data.frame of loci in 'g' that are to be phased. If a data.frame, it should have columns named locus (name of locus in 'g'), group (number identifying loci in the same linkage group), and position (integer identifying the location of each locus in a linkage group). |
positions |
position along chromosome of each locus. |
type |
type of each locus. |
num.iter |
number of PHASE MCMC iterations. |
thinning |
number of PHASE MCMC iterations to thin by. |
burnin |
number of PHASE MCMC iterations for burnin. |
model |
PHASE model type. |
ran.seed |
PHASE random number seed. |
final.run.factor |
optional. |
save.posterior |
logical. Save posterior sample in output list? |
in.file |
name to use for PHASE input file. |
out.file |
name to use for PHASE output files. |
delete.files |
logical. Delete PHASE input and output files when done? |
ph.res |
result from phase. |
keep.missing |
logical. If TRUE, keep missing data from the original data set. If FALSE, use the genotypes estimated by PHASE. |
thresh |
minimum probability for a genotype to be selected (0.5 - 1). |
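The data.frame form of loci described above can be sketched as follows. This is a minimal illustration only: the locus names, group numbers, and positions are hypothetical and would need to match the loci actually present in 'g'.

```r
# Hypothetical loci table: three SNPs in two linkage groups.
# Locus names are made up for illustration; use the names found in 'g'.
loci <- data.frame(
  locus = c("snp_A1", "snp_A2", "snp_B1"),  # name of locus in 'g'
  group = c(1, 1, 2),                       # loci sharing a number are phased together
  position = c(120L, 450L, 80L),            # integer location within the linkage group
  stringsAsFactors = FALSE
)
str(loci)
```

A table of this shape would be passed as the loci argument in place of a plain vector of locus names.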
phase returns a list containing:
locus.name |
new locus name, which is a combination of loci in group. |
gtype.probs |
a data.frame listing the estimated genotype for every sample along with probability. |
orig.gtypes |
the original gtypes object for the composite loci. |
posterior |
a list of num.iter data.frames representing posterior sample of genotypes for each sample. |
phaseWrite returns a list with the input filename and the gtypes object used.
phaseReadPair returns a data.frame of genotype probabilities.
phaseReadSample returns a list of data.frames representing the posterior sample of genotypes for one set of loci for each sample.
phaseFilter returns a matrix of genotypes for each sample.
phasePosterior returns a list of data.frames representing the posterior sample of all genotypes for each sample.
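The role of thresh can be illustrated with a small base R sketch. It is not the package's implementation: the table layout, the column names (id, a1, a2, pr), and the use of >= for the comparison are all assumptions made for the example.

```r
# Hypothetical genotype probability table. The column names are assumptions
# and do not reflect the actual layout of the phase() result.
gtype.probs <- data.frame(
  id = c("s1", "s2", "s3"),
  a1 = c("AC", "AT", "GC"),    # first phased haplotype
  a2 = c("AT", "AT", "GT"),    # second phased haplotype
  pr = c(0.98, 0.42, 0.71),    # probability of the estimated genotype
  stringsAsFactors = FALSE
)
thresh <- 0.5

# Keep genotypes whose probability meets the threshold; mask the rest as NA.
keep <- gtype.probs$pr >= thresh
filtered <- gtype.probs[, c("a1", "a2")]
filtered[!keep, ] <- NA
rownames(filtered) <- gtype.probs$id
filtered <- as.matrix(filtered)
filtered
```

Under these assumptions, a sample whose best genotype falls below thresh ends up with missing values in the resulting matrix.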
phase |
runs PHASE assuming that the executable is installed properly and available on the command line. |
phaseWrite |
writes a PHASE formatted file. |
phaseReadPair |
reads the '_pair' output file. |
phaseReadSample |
reads the '_sample' output file. |
phaseFilter |
filters the result from phase to extract one genotype for each sample. |
phasePosterior |
creates a data.frame of all genotypes for each posterior sample. |
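Because phase assumes the PHASE executable is available on the command line, it may help to check for it before starting a run. The sketch below uses base R only. The executable name "PHASE" is an assumption and may differ on your system.

```r
# "PHASE" is an assumed executable name; adjust it to match your installation.
phase.path <- unname(Sys.which("PHASE"))
phase.available <- nzchar(phase.path)

if (phase.available) {
  message("Found PHASE at: ", phase.path)
} else {
  message("PHASE was not found on the PATH; phase() would likely fail.")
}
```

Sys.which returns an empty string when the program cannot be found, so nzchar gives a simple TRUE or FALSE answer.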
Stephens, M., and Donnelly, P. (2003). A comparison of Bayesian methods for haplotype reconstruction from population genotype data. American Journal of Human Genetics 73:1162-1169. Available at: http://stephenslab.uchicago.edu/software.html#phase
# NOT RUN {
data(bowhead.snps)
data(bowhead.snp.position)
snps <- df2gtypes(bowhead.snps, ploidy = 2, description = "Bowhead SNPS")
summary(snps)
# Run PHASE on all data
phase.results <- phase(
  snps, bowhead.snp.position,
  num.iter = 100, save.posterior = FALSE
)
# Filter phase results
filtered.results <- phaseFilter(phase.results, thresh = 0.5)
# Convert phased genotypes to gtypes
ids <- rownames(filtered.results)
strata <- bowhead.snps$Stock[match(ids, bowhead.snps$LABID)]
filtered.df <- cbind(id = ids, strata = strata, filtered.results)
phased.snps <- df2gtypes(filtered.df, ploidy = 2, description = "Bowhead phased SNPs")
summary(phased.snps)
# }