polyRAD (version 1.2)

ExportGAPIT: Export RADdata Object for Use by Other R Packages

Description

After a "RADdata" object has been run through a pipeline such as IteratePopStruct, these functions can be used to export the genotypes to R packages and other software that can perform genome-wide association and genomic prediction. ExportGAPIT, Export_rrBLUP_Amat, Export_rrBLUP_GWAS, and Export_TASSEL_Numeric all export continuous numerical genotypes generated by GetWeightedMeanGenotypes. Export_polymapR and Export_GWASpoly use GetProbableGenotypes to export discrete genotypes. Export_MAPpoly writes genotype posterior probabilities to a file for import by MAPpoly.

Usage

ExportGAPIT(object, onePloidyPerAllele = FALSE)
Export_rrBLUP_Amat(object, naIfZeroReads = FALSE, 
                   onePloidyPerAllele = FALSE)
Export_rrBLUP_GWAS(object, naIfZeroReads = FALSE, 
                   onePloidyPerAllele = FALSE)
Export_TASSEL_Numeric(object, file, naIfZeroReads = FALSE,
                      onePloidyPerAllele = FALSE)
Export_polymapR(object, naIfZeroReads = TRUE,
                progeny = GetTaxa(object)[!GetTaxa(object) %in% 
                  c(GetDonorParent(object), GetRecurrentParent(object),
                    GetBlankTaxa(object))])
Export_MAPpoly(object, file, pheno = NULL, ploidyIndex = 1,
               progeny = GetTaxa(object)[!GetTaxa(object) %in% 
                 c(GetDonorParent(object), GetRecurrentParent(object),
                   GetBlankTaxa(object))],
               digits = 3)
Export_GWASpoly(object, file, naIfZeroReads = TRUE)

Arguments

object

A "RADdata" object with posterior genotype probabilities already estimated.

onePloidyPerAllele

Logical. If TRUE, for each allele the inheritance mode with the lowest \(\chi ^ 2\) value is selected and is assumed to be the true inheritance mode. If FALSE, inheritance modes are weighted by inverse \(\chi ^ 2\) values for each allele, and mean genotypes that have been weighted across inheritance modes are returned.

naIfZeroReads

A logical indicating whether NA should be inserted into the output matrix for any taxa and loci where the total read depth for the locus is zero. If FALSE, the output for these genotypes is essentially the mode (for Export_polymapR and Export_GWASpoly) or mean (for others) across prior genotype probabilities, since prior and posterior genotype probabilities are equal when there are no reads.

file

A character string indicating a file path to which to write.

pheno

A data frame or matrix of phenotypic values, with progeny in rows and traits in columns. Columns should be named.

ploidyIndex

Index, within object$priorProbPloidies, of the ploidy to be exported.

progeny

A character vector indicating which individuals to export as progeny of the cross.

digits

Number of decimal places to which to round genotype probabilities in the output file.

Value

For ExportGAPIT, a list:

GD

A data frame with taxa in the first column and alleles (markers) in subsequent columns, containing the genotypes. To be passed to the GD argument for GAPIT or FarmCPU.

GM

A data frame with the name, chromosome number, and position of every allele (marker). To be passed to the GM argument for GAPIT or FarmCPU.

For Export_rrBLUP_Amat, a matrix with taxa in rows and alleles (markers) in columns, containing genotype data. This can be passed to A.mat in rrBLUP.

For Export_rrBLUP_GWAS, a data frame with alleles (markers) in rows. The first three columns contain the marker names, chromosomes, and positions, and the remaining columns each represent one taxon and contain the genotype data. This can be passed to the GWAS function in rrBLUP.

Export_TASSEL_Numeric and Export_MAPpoly write a file but does not return an object.

For Export_polymapR, a matrix of integers indicating the most probable allele copy number, with markers in rows and individuals in columns. The parents are listed first, followed by all progeny.

Export_MAPpoly and Export_GWASpoly write files but do not return an object. Files output by Export_GWASpoly are comma delimited and in numeric format.

Details

GAPIT, FarmCPU, rrBLUP, and TASSEL allow genotypes to be a continuous numeric variable. MAPpoly allows for import of genotype probabilities. GAPIT does not allow missing data, hence there is no naIfZeroReads argument for ExportGAPIT. Genotypes are exported on a scale of -1 to 1 for rrBLUP, on a scale of 0 to 2 for GAPIT and FarmCPU, and on a scale of 0 to 1 for TASSEL.

For all functions, one allele per marker is dropped. Export_MAPpoly also drops alleles where one or both parental genotypes could not be determined, and where both parents are homozygotes.

For ExportGAPIT and Export_rrBLUP_GWAS, chromosome and position are filled with dummy data if they do not exist in object$locTable. For Export_TASSEL_Numeric, allele names are exported, but no chromosome or position information per se.

If the chromosomes in object$locTable are in character format, ExportGAPIT and Export_MAPpoly will attempt to extract chromosome numbers.

Because polymapR allows only one ploidy, there must only be one possible ploidy for the progeny in the RADdata object (possibly with different ploidies for the parents, e.g. 4x x 2x = 3x). MAPpoly also only allows one ploidy, but Export_MAPpoly allows the user to select which ploidy from the RADdata object to use. (This is due to how internal polyRAD functions are coded.)

References

http://zzlab.net/GAPIT/

Lipka, A. E., Tian, F., Wang, Q., Peiffer, J., Li, M., Bradbury, P. J., Gore, M. A., Buckler, E. S. and Zhang, Z. (2012) GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397--2399.

http://www.zzlab.net/FarmCPU/

Liu, X., Huang, M., Fan, B., Buckler, E. S., Zhang, Z. (2016) Iterative usage of fixed and random effects models for powerful and efficient genome-wide association studies. PLoS Genetics 12, e1005767.

Endelman, J.B. (2011) Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP. The Plant Genome 4, 250--255.

http://www.maizegenetics.net/tassel

Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y. and Buckler, E. S. (2007) TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633--2635.

Bourke, P., van Geest, G., Voorrips, R. E., Jansen, J., Kranenberg, T., Shahin, A., Visser, R. G. F., Arens, P., Smulders, M. J. M. and Maliepaard, C. (2018) polymapR: linkage analysis and genetic map construction from F1 populations of outcrossing polyploids. Bioinformatics 34, 3496--3502.

https://github.com/mmollina/MAPpoly

Mollinari, M. and Garcia, A. A. F. (2018) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models. bioRxiv doi: https://doi.org/10.1101/415232.

https://potatobreeding.cals.wisc.edu/software/

Rosyara, U. R., De Jong, W. S., Douches, D. S., and Endelman, J. B. (2016) Software for Genome-Wide Association Studies in Autopolyploids and Its Application to Potato. Plant Genome 9.

See Also

GetWeightedMeanGenotypes, RADdata2VCF

Examples

Run this code
# NOT RUN {
# load example dataset
data(exampleRAD)
# get genotype posterior probabilities
exampleRAD <- IterateHWE(exampleRAD)

# export to GAPIT
exampleGAPIT <- ExportGAPIT(exampleRAD)

# export to rrBLUP
example_rrBLUP_A <- Export_rrBLUP_Amat(exampleRAD)
example_rrBLUP_GWAS <- Export_rrBLUP_GWAS(exampleRAD)

# export to TASSEL
outfile <- tempfile() # temporary file for example
Export_TASSEL_Numeric(exampleRAD, outfile)

# for mapping populations
data(exampleRAD_mapping)

# specify donor and recurrent parents
exampleRAD_mapping <- SetDonorParent(exampleRAD_mapping, "parent1")
exampleRAD_mapping <- SetRecurrentParent(exampleRAD_mapping, "parent2")

# run the pipeline
exampleRAD_mapping <- PipelineMapping2Parents(exampleRAD_mapping)

# convert to polymapR format
examplePMR <- Export_polymapR(exampleRAD_mapping)

# export to MAPpoly
outfile2 <- tempfile() # temporary file for example
# generate a dummy phenotype matrix containing random numbers
mypheno <- matrix(rnorm(200), nrow = 100, ncol = 2,
                  dimnames = list(GetTaxa(exampleRAD_mapping)[-(1:2)],
                                  c("Height", "Yield")))
Export_MAPpoly(exampleRAD_mapping, file = outfile2, pheno = mypheno)

# load data into MAPpoly
# require(mappoly) # can uncomment once mappoly is on CRAN
# mydata <- read_geno_dist(outfile2)

# export to GWASpoly
outfile3 <- tempfile() # temporary file for example
Export_GWASpoly(exampleRAD, outfile3)
# }

Run the code above in your browser using DataLab