
OriGen (version 1.3.1)

ConvertMicrosatData: Microsatellite file conversion for known and unknown data

Description

This function converts two microsatellite data files (one with the genotypes and one with the locations) into the data format required by OriGen.

Usage

ConvertMicrosatData(DataFileName,LocationFileName)

Arguments

DataFileName
Name of file containing the genotypes of the various locations. The columns here would be LocationName, LocationNumber, Locus1, Locus2, etc. Each individual would take up 2 rows (one for each allele) with the same LocationName and LocationNumber. The v
LocationFileName
Space- or tab-delimited text file with the location information for the individuals. The columns are LocationName, LocationNumber, Latitude, and Longitude. Note that the first two columns must list the locations in the same order as the data file (DataFileName).
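
The two file layouts described above can be sketched with base R alone. This is only an illustration under stated assumptions: the site names, allele sizes, and coordinates are invented, and writing a header row is an assumption (the bundled sample files are the authoritative reference for the exact format). Nothing here calls OriGen.

```r
# Hypothetical toy data: 2 sites, 2 individuals per site, 2 loci.
# Each individual occupies two consecutive rows (one row per allele),
# and both rows carry the same LocationName and LocationNumber.
genotypes <- data.frame(
  LocationName   = rep(c("SiteA", "SiteB"), each = 4),
  LocationNumber = rep(c(1L, 2L), each = 4),
  Locus1         = c(120L, 122L, 120L, 120L, 124L, 122L, 124L, 124L),
  Locus2         = c(200L, 204L, 204L, 204L, 200L, 200L, 202L, 200L)
)

# One row per site; the first two columns follow the same order as above.
locations <- data.frame(
  LocationName   = c("SiteA", "SiteB"),
  LocationNumber = c(1L, 2L),
  Latitude       = c(45.0, 50.5),
  Longitude      = c(-120.0, -110.25)
)

data_file     <- tempfile(fileext = ".txt")
location_file <- tempfile(fileext = ".txt")

# Tab-delimited, unquoted text. Including a header row is an assumption.
write.table(genotypes, data_file, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(locations, location_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the files back and run two basic layout checks.
geno_check <- read.table(data_file, header = TRUE, sep = "\t")
loc_check  <- read.table(location_file, header = TRUE, sep = "\t")

rows_are_paired <- nrow(geno_check) %% 2 == 0
same_site_order <- all(unique(geno_check$LocationNumber) == loc_check$LocationNumber)
```

Checks like `rows_are_paired` and `same_site_order` are cheap to run on real input files before passing their names to ConvertMicrosatData.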

Value

List with the following components:

  • DataArray: An array giving the number of alleles grouped by sample sites for each locus. The dimension of this array is [MaxAlleles,SampleSites,NumberSNPs].
  • SampleCoordinates: An array giving the longitude and latitude of each of the sample sites found. The dimension of this array is [SampleSites,2], where the second dimension represents longitude and latitude respectively.
  • AllelesAtLocus: An integer vector giving the number of alleles found at each locus.
  • MaxAlleles: The maximum of AllelesAtLocus, that is, the largest number of alleles found at any locus.
  • SampleSites: The integer number of sample sites found.
  • NumberLoci: The integer number of loci found.
  • NumberUnknowns: The integer number of unknown individuals found.
  • UnknownDataArray: An array containing the genetic data of the unknown individuals. The dimension of this array is [NumberUnknowns,2,NumberLoci].
  • LocationNames: A list of all the LocationNames (the first column of the input files).
  • DataFileName: The DataFileName that was supplied.
  • LocationFileName: The LocationFileName that was supplied.
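
As a rough illustration of how these components relate, the sketch below builds a hypothetical stand-in for the returned list (it does not call ConvertMicrosatData) and checks the documented dimensions against each other. It assumes that the third dimension of DataArray equals NumberLoci for microsatellite data, that DataArray holds allele counts, and that unused allele slots are zero-padded; `check_converted` is an illustrative helper, not part of OriGen.

```r
# Hypothetical stand-in for the list returned by ConvertMicrosatData:
# 3 allele slots, 2 sample sites, 2 loci. All values are invented.
data_array <- array(0L, dim = c(3L, 2L, 2L))
data_array[, , 1] <- matrix(c(3L, 1L, 0L,  0L, 1L, 3L), nrow = 3)  # locus 1
data_array[, , 2] <- matrix(c(2L, 2L, 0L,  4L, 0L, 0L), nrow = 3)  # locus 2 (2 alleles)

converted <- list(
  DataArray         = data_array,
  # Column 1 is longitude, column 2 is latitude, one row per site.
  SampleCoordinates = matrix(c(-120.0, -110.25, 45.0, 50.5), ncol = 2),
  AllelesAtLocus    = c(3L, 2L),
  MaxAlleles        = 3L,
  SampleSites       = 2L,
  NumberLoci        = 2L
)

# Illustrative helper: stops with an error if the components disagree.
check_converted <- function(x) {
  stopifnot(
    all(dim(x$DataArray) == c(x$MaxAlleles, x$SampleSites, x$NumberLoci)),
    all(dim(x$SampleCoordinates) == c(x$SampleSites, 2)),
    x$MaxAlleles == max(x$AllelesAtLocus),
    length(x$AllelesAtLocus) == x$NumberLoci
  )
  invisible(TRUE)
}

check_converted(converted)

# Per-site allele frequencies at locus 1: each column (site) sums to 1.
freq_locus1 <- prop.table(converted$DataArray[, , 1], margin = 2)
```

If DataArray does hold counts as assumed, `prop.table` with `margin = 2` gives the observed allele frequencies at each sample site, which is a convenient sanity check before fitting a model.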

References

Ranola J, Novembre J, Lange K (2014) Fast Spatial Ancestry via Flexible Allele Frequency Surfaces. Bioinformatics, in press.

See Also

ConvertMicrosatData for converting microsatellite data files into a format appropriate for analysis,

ConvertPEDData for converting Plink PED files into a format appropriate for analysis,

FitMultinomialModel for fitting allele surfaces to the converted microsatellite data,

PlotAlleleFrequencySurface for a quick way to plot the resulting allele frequency surfaces from FitOriGenModel or FitMultinomialModel.

Examples

# Note that the sample files MicrosatTrialDataSmall.txt and
# LocationTrialDataSmall.txt are included in data to illustrate the formatting.
# Note that this was done to allow inclusion of the test data in the package.

# Convert the two input files into the list format used by OriGen
MicrosatDataSmall <- ConvertMicrosatData("MicrosatTrialDataSmall.txt",
                                         "LocationTrialDataSmall.txt")
str(MicrosatDataSmall)

# Fit allele frequency surfaces to the converted data
MicrosatAnalysisSmall <- FitMultinomialModel(MicrosatDataSmall$DataArray,
                                             MicrosatDataSmall$SampleCoordinates,
                                             MaxGridLength = 20)
str(MicrosatAnalysisSmall)

# Plot the fitted surfaces
PlotAlleleFrequencySurface(MicrosatAnalysisSmall)
