
PopGenome (version 2.1.6)

read.big.fasta: Reading large FASTA alignments

Description

This function reads FASTA alignments that are too large to fit into memory by splitting them into chunks.

Usage

read.big.fasta(filename,populations=FALSE,outgroup=FALSE,window=2000,
               SNP.DATA=FALSE,include.unknown=FALSE,
               parallized=FALSE,FAST=FALSE,big.data=TRUE)

Arguments

filename
the path to the FASTA alignment file
outgroup
vector of outgroup sequences
populations
list of populations
window
chunk size: number of columns/nucleotide sites
SNP.DATA
set to TRUE if the input is SNP data in alignment format
include.unknown
include unknown positions in the biallelic.matrix
parallized
use parallel computation to speed up reading (works only on UNIX systems)
FAST
fast computation; see readData()
big.data
use the ff-package

Value

The function creates an object of class "GENOME". The following slots are filled in the "GENOME" object:

     Slot                 Description
  1. n.sites              total number of sites
  2. n.biallelic.sites    number of biallelic sites
  3. region.names         names of regions
  4. region.data          some detailed information about the data

Details

The algorithm reads the data for each individual and stores the information on disk. The data can be analyzed as regions of the defined window size, or the regions can be concatenated within the PopGenome framework via the function concatenate.regions. This function should only be used when the FASTA file does not fit into RAM; otherwise, use the function readData.
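
To make the role of the window argument concrete, the following base R sketch computes which alignment columns would fall into each chunk for a given window size. It is an illustration only and not PopGenome's internal code: the helper chunk.bounds and the example value of 5500 sites are hypothetical, and the assumption that the last chunk is simply shorter when the alignment length is not a multiple of window is not confirmed by this page.

```r
# Hypothetical helper (not part of PopGenome): start and end column of
# each chunk for an alignment with n.sites columns.
chunk.bounds <- function(n.sites, window = 2000) {
  stopifnot(n.sites >= 1, window >= 1)
  # First column of every chunk: 1, 1 + window, 1 + 2 * window, ...
  starts <- seq(1, n.sites, by = window)
  # Last column of every chunk, capped at the alignment length
  # (assumption: the final chunk may be shorter than window).
  ends <- pmin(starts + window - 1, n.sites)
  data.frame(region = seq_along(starts), start = starts, end = ends)
}

# Example with an assumed alignment length of 5500 sites.
bounds <- chunk.bounds(n.sites = 5500, window = 2000)
print(bounds)
```

Under these assumptions, a smaller window produces more regions, each covering fewer sites. The regions can later be merged with concatenate.regions, as described above.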

Examples

# GENOME.class <- read.big.fasta("Alignment.fas", big.data=TRUE)
# GENOME.class
# GENOME.class@region.names
# CON <- concatenate.regions(GENOME.class)
# CON@region.data@biallelic.sites
# GENOME.class.slide <- sliding.window.transform(GENOME.class,100,100)
# GENOME.class <- neutrality.stats(GENOME.class,FAST=TRUE)
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data
