poolABC (version 1.0.0)

importData: Import a single file containing data in popoolation2 format

Description

Load a file that is in the _rc format of the popoolation2 software.

Usage

importData(file, pops = NA, header = NA, remove = NA)

Value

A matrix with general information about the data in the first 9 columns and the number of major- and minor-allele reads for the requested populations in the remaining columns. If a header was supplied, the matrix will also contain column names as defined by the header input.

Arguments

file

is a character string indicating the path to the file you wish to import.

pops

is a vector with the indices of the populations that should be imported. Defaults to NA, meaning that data are imported for all populations.

header

is a character vector containing the names for the columns. If set to NA (default), no column names will be added to the output.

remove

is a character vector where each entry is the name of a contig to be removed from the imported dataset. If NA (default), all contigs are kept in the output.
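The effect of the remove argument can be illustrated with a small base R sketch. This is not the package's internal code: `drop_contigs` is a hypothetical helper, and the sketch assumes that the contig name is stored in the first column of the data.

```r
# Hypothetical helper illustrating contig removal (not part of poolABC).
# Assumption: the contig name is stored in the first column of `dat`.
drop_contigs <- function(dat, remove = NA) {
  # With the default NA, nothing is removed
  if (all(is.na(remove))) return(dat)
  # Keep only rows whose contig name is not listed in `remove`
  dat[!(dat[, 1] %in% remove), , drop = FALSE]
}

# Toy character matrix: contig name in column 1, a position in column 2
toy <- matrix(c("contig_1", "contig_2", "contig_3", "10", "20", "30"), ncol = 2)
kept <- drop_contigs(toy, remove = "contig_2")
unchanged <- drop_contigs(toy)
```

Here `kept` holds the rows for contig_1 and contig_3, while `unchanged` is identical to `toy`.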

Details

This function imports a single file containing data in the _rc format. Note that it removes all non-biallelic sites and all sites where the sum of deletions across all populations is not zero.
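The site filter described here can be sketched on a toy data frame. The column names `allele_count` and `deletion_sum` are assumptions chosen for illustration; the function's actual implementation may differ.

```r
# Toy data; the column names are illustrative assumptions.
toy <- data.frame(
  contig = c("c1", "c1", "c2", "c3"),
  allele_count = c(2, 3, 2, 2),   # number of alleles observed at the site
  deletion_sum = c(0, 0, 1, 0),   # sum of deletions across all populations
  stringsAsFactors = FALSE
)

# Keep biallelic sites (exactly 2 alleles) with no deletions in any population
keep <- toy$allele_count == 2 & toy$deletion_sum == 0
filtered <- toy[keep, ]
```

In this toy example, the second site is dropped because it has three alleles and the third because its deletion sum is not zero.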

The first 9 columns of the matrix contain general information about the data, and the major-allele read counts start at the 10th column. The 10th column contains the number of major-allele reads for the first population, the 11th column contains the number for the second population, and so on, so that population k is found in column 9 + k. For example, to import the data for the 5th and 6th populations, define the pops input as pops = c(5, 6). This keeps only the first 9 columns of the matrix, plus the 14th and 15th columns and the corresponding columns with the number of minor-allele reads for those populations.
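The column arithmetic can be sketched in base R. With the first population in column 10, population k sits in column 9 + k. This is an illustration on a toy matrix, not the package's internal code: `pops_to_columns` and `n_pops` are hypothetical names, and the sketch assumes that the minor-allele columns follow immediately after the major-allele columns, one per population.

```r
# Hypothetical helper: map population indices to column indices.
# Assumptions: 9 leading information columns, then one major-allele column
# per population, then one minor-allele column per population.
pops_to_columns <- function(pops, n_pops, n_info = 9) {
  stopifnot(all(pops >= 1), all(pops <= n_pops))
  major <- n_info + pops            # population k -> column 9 + k
  minor <- n_info + n_pops + pops   # matching minor-allele column
  c(seq_len(n_info), major, minor)
}

# Toy matrix standing in for a dataset with 6 populations (9 + 2 * 6 columns)
n_pops <- 6
toy <- matrix(seq_len(2 * (9 + 2 * n_pops)), nrow = 2)

# Columns kept when selecting the 5th and 6th populations
cols <- pops_to_columns(c(5, 6), n_pops)
subset_toy <- toy[, cols, drop = FALSE]
```

Under these assumptions, `cols` contains the 9 information columns followed by the major-allele and minor-allele columns for the two selected populations, and `subset_toy` has 13 columns.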