locStra (version 1.9)

Fast Implementation of (Local) Population Stratification Methods

Description

Fast implementations to compute the genetic covariance matrix, the Jaccard similarity matrix, the s-matrix (the weighted Jaccard similarity matrix), and the (classic or robust) genomic relationship matrix of a (dense or sparse) input matrix (see Hahn, Lutz, Hecker, Prokopenko, Cho, Silverman, Weiss, and Lange (2020)). Sparse matrices from the R package 'Matrix' are fully supported. The package also includes an implementation of the power method (von Mises iteration) to compute the leading eigenvector of a matrix (the eigenvector belonging to the largest eigenvalue), a function to perform an automated full run of global and local correlations in population stratification data, a function to compute sliding windows, and a function to invert minor alleles and select those variants/loci exceeding a minimal cutoff value. New functionality in locStra extracts the k leading eigenvectors of the genetic covariance matrix, Jaccard similarity matrix, s-matrix, and genomic relationship matrix via fast PCA without actually computing the similarity matrices. The fast PCA can also be run directly on 'bed'+'bim'+'fam' files.

Install

install.packages('locStra')

Monthly Downloads

250

Version

1.9

License

GPL (>= 2)

Maintainer

Georg Hahn

Last Published

April 12th, 2022

Functions in locStra (1.9)

fastCovEVs

Computation of the k leading eigenvectors of the covariance matrix for a (sparse) input matrix.
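
The idea of obtaining leading eigenvectors without forming the covariance matrix can be illustrated in base R: the right singular vectors of the column-centered data are the eigenvectors of its covariance matrix. The following is a minimal sketch under that identity, using dense base R functions on random data. The helper name `leading_cov_evs` is hypothetical and this is not the package's implementation.

```r
# Hypothetical helper (not part of locStra): k leading eigenvectors of cov(m)
# obtained from an SVD of the column-centered data, without forming cov(m).
leading_cov_evs <- function(m, k) {
  centered <- sweep(m, 2, colMeans(m))   # subtract each column mean
  svd(centered, nu = 0, nv = k)$v        # first k right singular vectors
}

set.seed(1)
m <- matrix(rnorm(30 * 6), nrow = 30, ncol = 6)
evs_svd <- leading_cov_evs(m, k = 2)

# Reference: eigenvectors of the explicitly computed covariance matrix.
evs_ref <- eigen(cov(m), symmetric = TRUE)$vectors[, 1:2]

# Eigenvectors are only defined up to sign, so compare absolute inner products.
agreement <- abs(colSums(evs_svd * evs_ref))
```

Each entry of `agreement` should be close to 1 when the two approaches recover the same directions.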
fastJaccardEVs

Computation of the k leading eigenvectors of the Jaccard similarity matrix for a (sparse) input matrix. Note that this computation is only approximate and does not necessarily coincide with the result obtained by extracting the k leading eigenvectors of the Jaccard matrix computed with the function jaccardMatrix.
bed_fastSMatrixEVs

Computation of the k leading eigenvectors of the s-matrix (the weighted Jaccard similarity matrix) directly from a bed+bim+fam file. Note that in contrast to the parameters of the function sMatrix, the choice phased=FALSE cannot be modified for the fast eigenvector computation. Moreover, inverting the minor allele is not possible when reading directly from external files.
covMatrix

C++ implementation to compute the covariance matrix for a (sparse) input matrix. The function is equivalent to the R command 'cov' applied to matrices.
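
As an illustration of why a covariance computation can stay sparse-friendly, the sample covariance can be written in terms of the cross product and the column means, so the input never has to be centered explicitly. The sketch below uses base R only and a hypothetical helper name, `cov_from_crossprod`. It shows the identity rather than the package's C++ code.

```r
# Sample covariance of the columns of m from the cross product alone:
# cov = (t(m) %*% m - n * mu %*% t(mu)) / (n - 1), with mu the column means.
cov_from_crossprod <- function(m) {
  n <- nrow(m)
  mu <- colMeans(m)
  (crossprod(m) - n * tcrossprod(mu)) / (n - 1)
}

set.seed(1)
m <- matrix(rbinom(20 * 5, size = 1, prob = 0.3), nrow = 20, ncol = 5)
result <- cov_from_crossprod(m)

# Agrees with base R's cov() up to floating-point error.
matches_cov <- isTRUE(all.equal(result, cov(m)))
```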
bed_fastGrmEVs

Computation of the k leading eigenvectors of the genomic relationship matrix, defined in Yang et al. (2011), directly from a bed+bim+fam file.
bed_fastJaccardEVs

Computation of the k leading eigenvectors of the Jaccard similarity matrix directly from a bed+bim+fam file. Note that this computation is only approximate and does not necessarily coincide with the result obtained by extracting the k leading eigenvectors of the Jaccard matrix computed with the function jaccardMatrix.
fastGrmEVs

Computation of the k leading eigenvectors of the genomic relationship matrix, defined in Yang et al. (2011), for a (sparse) input matrix.
bed_fastCovEVs

Computation of the k leading eigenvectors of the covariance matrix directly from a bed+bim+fam file.
fastSMatrixEVs

Computation of the k leading eigenvectors of the s-matrix (the weighted Jaccard similarity matrix) for a (sparse) input matrix. Note that in contrast to the parameters of the function sMatrix, the choice phased=FALSE cannot be modified for the fast eigenvector computation.
fullscan

A full scan of the input data m using a collection of windows given by the two-column matrix windows. For each window, the data is processed using the function matrixFunction (this could be, e.g., the covMatrix function), then the processed data is summarized using the function summaryFunction (e.g., the largest eigenvector computed with the function powerMethod), and finally the global and local summaries are compared using the function comparisonFunction (e.g., the vector correlation with R's function cor). The function returns a two-column matrix which contains per row the global summary statistics (e.g., the correlation between the global and local eigenvectors) and the local summary statistics (e.g., the correlation between the local eigenvectors of the previous and current windows) for each window.
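
The three-stage pipeline described for fullscan can be sketched in base R as follows. The sketch assumes that windows index rows of the input (variants/loci) and that the first window has no predecessor, so its local statistic is set to NA. Both are assumptions for illustration, and `full_scan_sketch` and `leading_ev` are hypothetical names, not the package's functions.

```r
# Hypothetical re-implementation of the scan logic, for illustration only.
full_scan_sketch <- function(m, windows, matrix_fn, summary_fn, compare_fn) {
  global <- summary_fn(matrix_fn(m))           # summary of the full data
  out <- matrix(NA_real_, nrow(windows), 2,
                dimnames = list(NULL, c("global", "local")))
  previous <- NULL
  for (i in seq_len(nrow(windows))) {
    rows <- windows[i, 1]:windows[i, 2]
    current <- summary_fn(matrix_fn(m[rows, , drop = FALSE]))
    out[i, 1] <- compare_fn(global, current)   # global vs. current window
    if (!is.null(previous)) {
      out[i, 2] <- compare_fn(previous, current)  # previous vs. current window
    }
    previous <- current
  }
  out
}

set.seed(42)
m <- matrix(rbinom(60 * 8, size = 1, prob = 0.3), nrow = 60, ncol = 8)
windows <- cbind(seq(1, 41, by = 20), seq(20, 60, by = 20))  # rows 1-20, 21-40, 41-60

# Leading eigenvector of a symmetric matrix via base R's eigen().
leading_ev <- function(s) eigen(s, symmetric = TRUE)$vectors[, 1]

scan <- full_scan_sketch(m, windows, cov, leading_ev, cor)
```

Because eigenvectors are only defined up to sign, the correlations in this sketch may come out negative. Taking absolute values is one way to handle that.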
selectVariants

Auxiliary function to invert minor alleles and to select those variants/loci exceeding a minimal cutoff value.
sMatrix

C++ implementation to compute the s-matrix (the weighted Jaccard similarity matrix) for a (sparse) input matrix as in the 'Stego' package: https://github.com/dschlauch/stego
powerMethod

C++ implementation of the power method (von Mises iteration) to compute the leading eigenvector (the eigenvector belonging to the largest eigenvalue) of a dense input matrix.
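
For readers unfamiliar with the power method, the iteration can be sketched in a few lines of base R: repeatedly multiply a vector by the matrix and renormalize until it stops changing. The helper `power_iteration`, its starting vector, and its stopping rule are assumptions made for this sketch and do not reflect the package's C++ code. The sketch also assumes that the dominant eigenvalue is positive, as for a covariance matrix.

```r
# Hypothetical helper: power iteration for the dominant eigenvector.
power_iteration <- function(a, tol = 1e-10, max_iter = 1000) {
  v <- rep(1 / sqrt(ncol(a)), ncol(a))     # normalized starting vector
  for (i in seq_len(max_iter)) {
    w <- as.vector(a %*% v)                # multiply by the matrix
    w <- w / sqrt(sum(w^2))                # renormalize to unit length
    converged <- sqrt(sum((w - v)^2)) < tol
    v <- w
    if (converged) break
  }
  v
}

a <- matrix(c(2, 1, 1, 3), nrow = 2)       # symmetric, positive definite
v <- power_iteration(a)

# Reference eigenvector from base R; equal to v up to sign.
ref <- eigen(a, symmetric = TRUE)$vectors[, 1]
```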
makeWindows

Auxiliary function to generate a two-column matrix of windows to be used in the function 'fullscan'.
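
A two-column matrix of windows holds one window per row, with its start index in the first column and its end index in the second. The sketch below builds such a matrix in base R. The helper `make_windows_sketch` and its arguments `n`, `size`, and `step` are hypothetical and need not match the arguments of makeWindows. In particular, how a trailing partial window is handled is an assumption: here it is dropped.

```r
# Hypothetical helper: sliding windows of fixed size over indices 1..n.
make_windows_sketch <- function(n, size, step) {
  starts <- seq(1, n - size + 1, by = step)  # only windows that fit entirely
  cbind(start = starts, end = starts + size - 1)
}

# Windows of 4 indices, moved forward by 2 each time, over 10 indices.
w <- make_windows_sketch(n = 10, size = 4, step = 2)
```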
grMatrix

C++ implementation to compute the genomic relationship matrix (GRM) for a (sparse) input matrix as defined in Yang et al. (2011).
jaccardMatrix

C++ implementation to compute the Jaccard similarity matrix for a (sparse) input matrix.
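
For a binary input, the Jaccard similarity of two columns is the number of positions where both are 1 divided by the number of positions where at least one is 1. A minimal base R sketch follows, assuming a 0/1 matrix with one individual per column. The helper name `jaccard_sketch` is hypothetical, and the package's own implementation may treat edge cases such as all-zero columns differently.

```r
# Hypothetical helper: pairwise Jaccard similarity between columns of a 0/1 matrix.
jaccard_sketch <- function(m) {
  intersection <- crossprod(m)                  # shared 1-entries per column pair
  counts <- colSums(m)                          # number of 1-entries per column
  union <- outer(counts, counts, "+") - intersection
  intersection / union                          # NaN if both columns are all zero
}

m <- cbind(c(1, 1, 0, 0, 1),
           c(1, 0, 0, 1, 1),
           c(0, 0, 1, 1, 0))
jac <- jaccard_sketch(m)
```

In this example, columns 1 and 2 share two 1-entries out of four positions where either is 1, which gives a similarity of 0.5.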
testdata

Simulated test data.