BEDMatrix (version 1.3.0)

BEDMatrix: Creates a matrix wrapper around binary PED files.

Description

BEDMatrix is an S3 class that behaves similarly to a regular matrix by implementing key methods such as [, dim, and dimnames. Subsets are extracted directly and on demand from the binary PED file through memory mapping, without loading the entire file into memory. The subsets are coded similarly to RAW files generated with the --recodeA argument in PLINK: 0 indicates homozygous for the major allele, 1 indicates heterozygous, and 2 indicates homozygous for the minor allele.
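As a rough illustration of the coding described above, the following sketch extracts a single marker from the example file shipped with the package and tabulates its genotype codes. The variable names are illustrative only, and as.vector is used so that the code does not depend on whether the subset comes back as a vector or as a one-column matrix.

```r
library(BEDMatrix)

# Open the example BED file that ships with the package
m <- BEDMatrix(system.file("extdata", "example.bed", package = "BEDMatrix"))

# Extract the first marker for all individuals; only this subset is
# requested from the memory-mapped file
g <- as.vector(m[, 1])

# Count how often each genotype code occurs, including missing values if any
table(g, useNA = "ifany")

# Mean of the 0/1/2 codes divided by 2: the frequency of the counted allele
mean(g, na.rm = TRUE) / 2
```

Because each code counts copies of one allele, halving the mean gives that allele's frequency among the non-missing genotypes.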

Usage

BEDMatrix(path, n = NULL, p = NULL)

Arguments

path
Path to the binary PED file, with or without extension.
n
The number of individuals. Optional if a FAM file with the same name as the BED file exists. If provided, rownames will be set to NULL and have to be provided manually.
p
The number of markers. Optional if a BIM file with the same name as the BED file exists. If provided, colnames will be set to NULL and have to be provided manually.

Details

A BEDMatrix instance can be created by providing the path to the BED file (with or without extension) as path, the number of individuals as n, and the number of markers as p.

If a FAM file (which corresponds to the first six columns of a PED file) of the same name and in the same directory as the BED file exists, n is optional: the number of individuals and the rownames of the BEDMatrix are detected automatically. The rownames are generated from the IID and FID of each individual, concatenated by _.

If a BIM file (which corresponds to the MAP file that accompanies a PED file) of the same name and in the same directory as the BED file exists, p is optional: the number of markers and the colnames of the BEDMatrix are detected automatically. The colnames are generated from the SNP name and the minor allele, concatenated by _ (similar to the colnames in a RAW file).

For very large BED files, it is advised to provide n and p manually to speed up object creation. In that case, rownames and colnames are set to NULL and have to be specified manually.
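The two ways of creating an object can be sketched as follows, using the example file shipped with the package. The sketch assumes that dimnames can be assigned on a BEDMatrix object through the usual dimnames<- replacement form, which is how "specified manually" is read here. To avoid hard-coding values, n and p are taken from an automatically detected object; in practice they would come from prior knowledge of the data set.

```r
library(BEDMatrix)

path <- system.file("extdata", "example.bed", package = "BEDMatrix")

# Automatic detection: dimensions and names come from the FAM and BIM files
auto <- BEDMatrix(path)
dims <- dim(auto)

# Manual specification: pass n and p directly
manual <- BEDMatrix(path, n = dims[1], p = dims[2])

# With n and p supplied, no row or column names are attached
names_missing <- is.null(rownames(manual)) && is.null(colnames(manual))
names_missing

# Attach names manually (assumption: the dimnames<- replacement form is
# supported for BEDMatrix objects)
dimnames(manual) <- dimnames(auto)
```

Reusing the names of the automatically detected object is only a convenience for this sketch; for a very large file the names would typically be prepared once and stored separately.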

A BED file can be created from a PED file with PLINK using plink --file myfile --make-bed. BED files are efficient to store and query, and can be transformed back into the original PED file with PLINK using plink --bfile myfile --recode.

Internally, BEDMatrix inherits from list and exposes a few attributes that should not be relied upon in actual code: path, dims, dnames, and _instance. path stores the path to the BED file. dims and dnames contain the dimensions and dimnames of the BEDMatrix object. _instance points to the underlying Rcpp module. The Rcpp module exposes an S4 class called BEDMatrix_ that memory maps the BED file via Boost.Interprocess of the BH package.

Examples

# Create an example BEDMatrix object
m <- BEDMatrix(system.file("extdata", "example.bed", package = "BEDMatrix"))

# Get the dimensions of the example BEDMatrix object
dim(m)

# Extract a subset of the example BEDMatrix object
m[1:3, ]
