poolABC (version 1.0.0)

checkMissing: Remove sites with missing data

Description

This function checks whether any population has an "N" as the reference character for the major allele and removes those sites from the dataset.

Usage

checkMissing(info, major, minor, rMajor, rMinor, coverage)

Value

a list with the following elements:

info

a matrix with the general information about the dataset. Each row of this matrix corresponds to a different site.

major

a matrix with the reference character of the major allele. Each column of this matrix corresponds to a different population and each row to a different site.

minor

a matrix with the reference character of the minor allele. Each column of this matrix corresponds to a different population and each row to a different site.

rMajor

a matrix with the number of major-allele reads. Each row of this matrix is a different site and each column a different population.

rMinor

a matrix with the number of minor-allele reads. Each row of this matrix is a different site and each column a different population.

coverage

a matrix with the total coverage. Each row of this matrix is a different site and each column a different population.

Each of these matrices matches the corresponding input, except that every site where at least one population has an "N" as the reference character for the major allele has been removed.

Arguments

info

is a matrix containing information about the dataset. This matrix might contain several columns, including the reference contig (chromosome), the position of the SNP in the reference contig and the reference character of the SNP. Note that each row of the matrix should be a different SNP and each column a different type of information.

major

is a matrix with the reference character of the major allele. Each column of the matrix should be a different population and each row a different SNP.

minor

is a matrix with the reference character of the minor allele. Each column of the matrix should be a different population and each row a different SNP.

rMajor

is a matrix with the number of major-allele reads. Each column of the matrix should be a different population and each row a different SNP.

rMinor

is a matrix with the number of minor-allele reads. Each column of the matrix should be a different population and each row a different SNP.

coverage

is a matrix with the total number of reads, i.e. the depth of coverage. Each column of the matrix should be a different population and each row a different SNP.
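
The layout described for the arguments can be illustrated with a small toy dataset. This is only a sketch: all values, the column names of info and the helper variables nSites, nPops, perPop and sameShape are made up for illustration, and taking the coverage as the sum of the two read counts is an assumption of this example, not something the function is documented to require.

```r
# Toy inputs: 4 SNPs (rows) in 3 populations (columns); all values are made up
nSites <- 4
nPops  <- 3

# info: one row per SNP, one column per type of information
info <- cbind(contig   = c("ctg1", "ctg1", "ctg2", "ctg2"),
              position = c("101", "250", "33", "870"),
              ref      = c("A", "C", "G", "T"))

# matrix() fills column by column, so each group of 4 values is one population
major <- matrix(c("A", "C", "G", "T",
                  "A", "C", "N", "T",   # population 2 has an "N" at site 3
                  "A", "C", "G", "T"), nrow = nSites, ncol = nPops)
minor <- matrix(c("G", "T", "A", "C",
                  "G", "T", "A", "C",
                  "G", "T", "A", "C"), nrow = nSites, ncol = nPops)

rMajor <- matrix(c(30, 22, 18, 40,
                   25, 19,  0, 33,
                   28, 24, 21, 37), nrow = nSites, ncol = nPops)
rMinor <- matrix(c(4, 6, 2, 5,
                   3, 8, 7, 1,
                   6, 2, 4, 3), nrow = nSites, ncol = nPops)

# Assumption for this toy example only: coverage is the sum of both read counts
coverage <- rMajor + rMinor

# Every per-population matrix should have one row per SNP and one column per population
perPop <- list(major = major, minor = minor, rMajor = rMajor,
               rMinor = rMinor, coverage = coverage)
sameShape <- all(vapply(perPop,
                        function(m) all(dim(m) == c(nSites, nPops)),
                        logical(1)))
stopifnot(sameShape, nrow(info) == nSites)
```

Objects shaped like these could then be passed, in the order shown under Usage, to checkMissing.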

Details

The check for an "N" major allele is performed for every population included in the dataset. Any site where the check fails for at least one population is removed from the dataset. More precisely, if a single population has an "N" as the reference character of its major allele, then that site is removed from the data for all the populations.
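
The filtering rule described above can be sketched in a few lines of base R. The helper drop_missing_sites is hypothetical and written only for illustration; it is not the poolABC source code, and it assumes that missing data is coded as the character "N" (not as NA) in the major matrix.

```r
# Hypothetical base-R sketch of the rule described above (not the poolABC code)
drop_missing_sites <- function(info, major, minor, rMajor, rMinor, coverage) {
  # TRUE for every site (row) where no population has "N" as the major allele
  keep <- rowSums(major == "N") == 0

  # Apply the same row filter to every matrix; drop = FALSE keeps them matrices
  list(
    info     = info[keep, , drop = FALSE],
    major    = major[keep, , drop = FALSE],
    minor    = minor[keep, , drop = FALSE],
    rMajor   = rMajor[keep, , drop = FALSE],
    rMinor   = rMinor[keep, , drop = FALSE],
    coverage = coverage[keep, , drop = FALSE]
  )
}

# Toy data: 3 sites (rows) and 2 populations (columns); values are made up
info   <- matrix(c("chr1", "chr1", "chr2", "10", "25", "7"), nrow = 3)
major  <- matrix(c("A", "N", "G", "A", "C", "G"), nrow = 3)  # "N" at site 2, population 1
minor  <- matrix(c("T", "T", "C", "T", "T", "C"), nrow = 3)
rMajor <- matrix(c(30, 0, 18, 25, 12, 20), nrow = 3)
rMinor <- matrix(c(5, 4, 2, 3, 6, 1), nrow = 3)
coverage <- rMajor + rMinor  # toy coverage, taken here as the sum of both counts

filtered <- drop_missing_sites(info, major, minor, rMajor, rMinor, coverage)
nrow(filtered$major)  # 2: site 2 is dropped for both populations
```

Because a single logical vector of rows is used to subset every matrix, the six returned elements stay aligned site by site, which is the behaviour described under Value.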