adwave (version 1.1)

signal: Compute Localized Admixture Signals

Description

Produces estimates of localized ancestry for each individual.

Usage

signal(table, who = colnames(table), populations, popA = NA, popB = NA, normalize = FALSE, n.pca = 5, PCAonly = FALSE, verbose = TRUE, tol = 0.001, n.signal = NULL, window.size = NULL, genmap = NULL)

Arguments

table
matrix of genotype calls, with SNPs in rows (T rows) and individuals in columns (n columns).
who
individuals to include in the analysis.
populations
list containing vector of IDs for each population in the analysis.
popA
name of ancestral population 1 (used for forming the axes of variation). Must match one of the names in populations.
popB
name of ancestral population 2 (used for forming the axes of variation). Must match one of the names in populations.
normalize
if TRUE, normalize the data matrix. Default is FALSE.
n.pca
number of PCA axes to compute (only the first principal component is used for forming the signals, but additional components may be desired for visualization). Default is 5.
PCAonly
if TRUE, only compute the PCA, do not compute the signals. Default is FALSE.
verbose
if TRUE, print summary to screen. Default is TRUE.
tol
tolerance for normalization of admixture signals ($\epsilon$ in accompanying paper). Default is 0.001.
n.signal
(optional) number of data points in windowed signal.
window.size
(optional) size of window specified as proportion of total length; e.g., window.size = 0.01 with signal of length $T = 3000$ SNPs generates windows of $0.01 \times 3000 = 30$ SNPs. Value need not be a round number.
genmap
(optional) genetic distance of genotype calls, supplied as vector of length T. If specified, signals will be formulated in terms of genetic distance along the chromosome (rather than physical position).
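
The window.size arithmetic above, and the overlapping versus disjoint distinction used in the examples, can be sketched in a few lines. This is only an illustration: window_summary is a hypothetical helper, not part of adwave, and it assumes the n.signal windows are spread evenly along the signal, so they must overlap whenever their combined length exceeds the signal length.

```r
# Hypothetical helper (not part of adwave): window width in SNPs and
# whether n_signal windows of that width have to overlap.
window_summary <- function(n_snps, window_size, n_signal) {
  width <- window_size * n_snps        # width in SNPs; need not be a round number
  coverage <- n_signal * window_size   # combined window length / signal length
  list(width = width,
       coverage = coverage,
       overlapping = coverage > 1 + 1e-9)  # small tolerance for rounding error
}

# Settings of EXAMPLE 2 and EXAMPLE 3, assuming a signal of 3000 SNPs
w2 <- window_summary(n_snps = 3000, window_size = 0.01, n_signal = 1000)
w3 <- window_summary(n_snps = 3000, window_size = 0.01, n_signal = 100)
```

Under these assumptions both settings give windows of 30 SNPs; 1000 windows cover the signal ten times over and therefore overlap, while 100 windows cover it once and can be disjoint.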

Value

Returns an object of class adsig, a list with the following components:
call
function call.
date
date of function call.
individuals
individuals for whom projections on the first principal component are calculated.
n.snps
number of SNPs in the table.
signals
the admixture signals, output as a $T \times n$ data matrix, where n is the number of individuals and T is the number of data points (the number of SNPs if n.signal = NULL, otherwise n.signal).
n.tol
the number of entries replaced by zero in the normalization procedure. This is dependent on the value set for the tolerance, tol.
popP
estimated proportion of admixture for each population.
indP
estimated proportion of admixture for each individual.
pa.ind
columns are principal axes in individual coordinates ($n_A+n_B$ rows, n.pca columns).
pa.snp
columns are principal axes in SNP coordinates (T rows, n.pca columns).
G
matrix of quadratic form in individual coordinates.
ev
vector of eigenvalues.
gendist
(only if genmap is specified in the input) vector of genetic distances along the chromosome, length n.signal.
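
To make the shape of the signals component concrete, the sketch below averages a per-SNP signal matrix over evenly spaced windows, giving one row per window and one column per individual. It is a rough illustration only: window_signal is a hypothetical helper, and the window placement (evenly spaced centres, simple means) is an assumption rather than the scheme adwave uses.

```r
# Hypothetical helper, not adwave's code: average a per-SNP signal matrix
# (rows = SNPs, columns = individuals) over n_signal evenly spaced windows.
window_signal <- function(sig, n_signal, window_size) {
  n_snps <- nrow(sig)
  half <- window_size * n_snps / 2                 # half window width in SNPs
  centres <- seq(half, n_snps - half, length.out = n_signal)
  idx <- seq_len(n_snps)
  out <- matrix(NA_real_, nrow = n_signal, ncol = ncol(sig))
  for (k in seq_len(n_signal)) {
    # SNPs falling in the half-open window (centre - half, centre + half]
    keep <- idx > centres[k] - half & idx <= centres[k] + half
    out[k, ] <- colMeans(sig[keep, , drop = FALSE])
  }
  out
}

# Toy signal: 3000 SNPs, 2 individuals
toy <- cbind(seq_len(3000), rep(1, 3000))
win <- window_signal(toy, n_signal = 100, window_size = 0.01)
dim(win)  # 100 windows by 2 individuals
```

With n.signal = NULL no windowing is applied, so the signals matrix keeps one row per SNP.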

Details

Applies PCA to genome-wide data using the ancestral reference populations (popA and popB). The first eigenvector reflects the population structure separating the two references. All individuals are then projected onto this axis to form the SNP-level admixture signals. PCA scores are used to estimate the proportion of admixture at the level of individuals (indP) and populations (popP).

There is no restriction on the length of the data (number of SNPs), and by default an estimate of localized ancestry is provided at each SNP. Optionally, the signals can be windowed, producing processed signals of length n.signal. The windows may be overlapping or disjoint, with their width specified through the window.size option (see examples). If genmap is specified, the signals are formulated in terms of genetic distance along the chromosome (an option not implemented in the paper).
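
The procedure described above can be sketched in base R on simulated genotypes. This is a minimal illustration under stated assumptions, not the package's implementation: the simulated allele frequencies, the centring on reference means, the per-SNP signal taken as the elementwise product of centred genotype and first axis, and the admixture proportion taken as the relative position of a score between the two reference means are all choices made for this example. The normalization controlled by tol is omitted.

```r
set.seed(1)
n_snps <- 200; n_a <- 10; n_b <- 10; n_ab <- 5

# Simulated allele frequencies for two reference populations (assumed values)
fa <- runif(n_snps, 0.05, 0.45)
fb <- runif(n_snps, 0.55, 0.95)

# Genotype calls (0/1/2), SNPs in rows and individuals in columns
sim_genotypes <- function(freq, n) {
  matrix(rbinom(length(freq) * n, size = 2, prob = freq), nrow = length(freq))
}
geno <- cbind(sim_genotypes(fa, n_a),
              sim_genotypes(fb, n_b),
              sim_genotypes((fa + fb) / 2, n_ab))   # 50/50 admixed group
pop <- rep(c("popA", "popB", "popAB"), times = c(n_a, n_b, n_ab))

# PCA on the reference individuals only, centred per SNP
ref <- geno[, pop != "popAB"]
mu <- rowMeans(ref)
pca <- prcomp(t(ref - mu), center = FALSE)
axis1 <- pca$rotation[, 1]            # first principal axis in SNP coordinates

# Per-SNP contribution of every individual to its score on the first axis
snp_signal <- (geno - mu) * axis1     # n_snps x n matrix
score <- colSums(snp_signal)          # projection of each individual

# Position of each score between the two reference means (0 = popA, 1 = popB)
mean_a <- mean(score[pop == "popA"])
mean_b <- mean(score[pop == "popB"])
prop_b <- (score - mean_a) / (mean_b - mean_a)
tapply(prop_b, pop, mean)
```

Because the proportion is a ratio of differences, it does not depend on the arbitrary sign of the first principal axis.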

References

Sanderson, J., H. Sudoyo, T.M. Karafet, M.F. Hammer and M.P. Cox. 2015. Reconstructing past admixture processes from local genomic ancestry using wavelet transformation. Genetics 200:469-481.

See Also

wavesum, plotsignal

Examples

data(admix)

# EXAMPLE 1. 
# Generate the admixture signal 
AdexPCA <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=NULL)

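# Plot the resulting PCA
# (the Value section lists the individual-coordinate axes as pa.ind; if pc.ind
# is NULL, check names(AdexPCA) for the component name)
plot(AdexPCA$pc.ind[,1],AdexPCA$pc.ind[,2],col=admix$colplot,xlab="PC1",ylab="PC2",pch=16)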
legend("bottomright",c("popA","popB","popAB"),col=c(3,4,2),pch=16)



# EXAMPLE 2. 
# Generate the admixture signal with windowing
AdexPCA2 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=1000,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA2,ind="AD00001",popA=AdexPCA2$popA,popB=AdexPCA2$popB)



# EXAMPLE 3. 
# Generate the admixture signal with windowing
# As EXAMPLE 2 but with n.signal reduced to 100 to provide disjoint windows
AdexPCA3 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=100,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA3,ind="AD00001",popA=AdexPCA3$popA,popB=AdexPCA3$popB)



# EXAMPLE 4. 
# Generate the admixture signal in terms of genetic distance
# As EXAMPLE 2 but with genmap specified so that signals are formulated using genetic distances.
AdexPCA4 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
	n.signal=1000,window.size=0.01,genmap=admix$map[,2])

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA4,ind="AD00001",popA=AdexPCA4$popA,popB=AdexPCA4$popB)