adwave (version 1.0)

signal: Compute Localized Admixture Signals

Description

Produces estimates of localized ancestry for each individual.

Usage

signal(table, who = colnames(table), populations, popA = NA, popB = NA, 
	normalize = FALSE, n.pca = 5, PCAonly = FALSE, verbose = TRUE, tol = 0.001,
	n.signal = NULL, window.size = NULL, genmap = NULL)

Arguments

table
matrix of genotype calls (rows, length T) versus individuals (columns, length n)
who
individuals to include in the analysis.
populations
list containing vector of IDs for each population in the analysis.
popA
name of ancestral population 1 (used for forming the axes of variation). Must match one of the names in populations.
popB
name of ancestral population 2 (used for forming the axes of variation). Must match one of the names in populations.
normalize
if TRUE, normalize the data matrix. Default is FALSE.
n.pca
number of PCA axes to compute (only the first principal component is used for forming the signals, but additional components may be desired for visualization). Default is 5.
PCAonly
if TRUE, only compute the PCA and do not compute the signals. Default is FALSE.
verbose
if TRUE, print summary to screen. Default is TRUE.
tol
tolerance for normalization of admixture signals ($\epsilon$ in accompanying paper). Default is 0.001.
n.signal
(optional) number of data points in windowed signal.
window.size
(optional) size of window specified as proportion of total length; e.g., window.size = 0.01 with signal of length $T = 3000$ SNPs generates windows of $0.01 \times 3000 = 30$ SNPs. Value need not be a round number.
genmap
(optional) genetic distance of genotype calls, supplied as vector of length T. If specified, signals will be formulated in terms of genetic distance along the chromosome (rather than physical position).
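
The relationship between window.size and n.signal can be checked with a little arithmetic. The sketch below uses a hypothetical helper, window_layout, which is not part of adwave. It assumes the n.signal windows are spread evenly along the signal, so they overlap when their combined length exceeds the length of the signal.

```r
# Hypothetical helper (not part of adwave): summarise the window layout
# implied by window.size and n.signal for a signal of n_snps SNPs.
# Assumption: windows are spread evenly along the signal.
window_layout <- function(n_snps, window_size, n_signal) {
  width    <- window_size * n_snps     # window width in SNPs (may be fractional)
  coverage <- n_signal * window_size   # combined window length / signal length
  list(width = width,
       coverage = coverage,
       overlapping = coverage > 1 + 1e-9)  # small tolerance for rounding
}

# T = 3000 SNPs, as in the description of window.size
ex2 <- window_layout(3000, window_size = 0.01, n_signal = 1000)
ex3 <- window_layout(3000, window_size = 0.01, n_signal = 100)

ex2$width        # 30 SNPs per window
ex2$overlapping  # TRUE: 1000 windows of 1% each cover the signal ten times over
ex3$overlapping  # FALSE: 100 windows of 1% each tile the signal exactly once
```

Under this assumption, lowering n.signal or window.size until their product is at most 1 yields disjoint windows.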

Value

Returns an object of class adsig, a list with the following components:

  • call: function call.
  • date: date of function call.
  • individuals: individuals for whom projections on the first principal component are calculated.
  • n.snps: number of SNPs in the table.
  • signals: the admixture signals, output as a $T \times n$ data matrix, where n is the number of individuals and T is the number of data points (either the number of SNPs if n.signal = NULL or n.signal otherwise).
  • n.tol: the number of entries replaced by zero in the normalization procedure. This depends on the value set for the tolerance, tol.
  • popP: estimated proportion of admixture for each population.
  • indP: estimated proportion of admixture for each individual.
  • pa.ind: columns are principal axes in individual coordinates ($n_A+n_B$ rows, n.pca columns).
  • pa.snp: columns are principal axes in SNP coordinates (T rows, n.pca columns).
  • G: matrix of quadratic form in individual coordinates.
  • ev: vector of eigenvalues.
  • gendist: (only if genmap is specified in input) vector of genetic distances along the chromosome, length n.signal.

Details

Applies PCA to genome-wide data using the ancestral reference populations popA and popB. The first eigenvector reflects the population structure. All individuals are then projected onto this axis to form the SNP-level admixture signals. PCA scores are used to estimate the proportion of admixture at the level of individuals (indP) and populations (popP).

There is no restriction on the length of the data (number of SNPs), and by default an estimate of localized ancestry is provided at each SNP. Optionally, the signals can be windowed, producing processed signals of length n.signal. The windows may be overlapping or disjoint, with their width specified through the window.size option (see examples). If genmap is specified, the signals are formulated in terms of genetic distance along the chromosome (not implemented in the paper).
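
The projection step can be illustrated in base R. This is a minimal sketch under stated assumptions and not the adwave implementation. The genotypes are simulated as 0/1/2 allele counts, all object names are hypothetical, the per-SNP signal is taken to be each SNP's contribution to the PC1 score, and the admixture proportion is read off as the position of an individual's score between the two reference means. The package's normalization (tol) and windowing are not reproduced.

```r
set.seed(1)
n_snps <- 200                                  # T: SNPs (rows)
ids_A  <- paste0("A", 1:10)                    # reference population A
ids_B  <- paste0("B", 1:10)                    # reference population B
ids_AB <- paste0("AB", 1:5)                    # admixed individuals

# Simulated allele frequencies; the admixed group draws from the midpoint
fA <- runif(n_snps, 0.05, 0.45)
fB <- runif(n_snps, 0.55, 0.95)
sim <- function(f, n) matrix(rbinom(n_snps * n, 2, f), nrow = n_snps)
geno <- cbind(sim(fA, 10), sim(fB, 10), sim((fA + fB) / 2, 5))
colnames(geno) <- c(ids_A, ids_B, ids_AB)

# PCA on the reference individuals only (prcomp wants rows = observations)
ref      <- geno[, c(ids_A, ids_B)]
snp_mean <- rowMeans(ref)
pca      <- prcomp(t(ref), center = TRUE, scale. = FALSE)
axis1    <- pca$rotation[, 1]                  # first axis in SNP coordinates

# Per-SNP contributions (T x n); their column sums are the PC1 scores
signals <- (geno - snp_mean) * axis1
scores  <- colSums(signals)

# Position of each score between the two reference means (0 = A, 1 = B)
mA <- mean(scores[ids_A])
mB <- mean(scores[ids_B])
prop_B <- (scores - mA) / (mB - mA)

round(tapply(prop_B, rep(c("A", "B", "AB"), c(10, 10, 5)), mean), 2)
```

Because the sign of a principal axis is arbitrary, the proportion is expressed relative to the two reference means rather than to the raw scores.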

References

For further details, see accompanying paper.

See Also

wavesum, plotsignal

Examples

data(admix)

# EXAMPLE 1. 
# Generate the admixture signal 
AdexPCA <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=NULL)

# Plot the resulting PCA
plot(AdexPCA$pa.ind[,1],AdexPCA$pa.ind[,2],col=admix$colplot,xlab="PC1",ylab="PC2",pch=16)
legend("bottomright",c("popA","popB","popAB"),col=c(3,4,2),pch=16)



# EXAMPLE 2. 
# Generate the admixture signal with windowing
AdexPCA2 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=1000,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA2,ind="AD00001",popA=AdexPCA2$popA,popB=AdexPCA2$popB)



# EXAMPLE 3. 
# Generate the admixture signal with windowing
# As EXAMPLE 2 but with n.signal reduced to 100 to provide disjoint windows
AdexPCA3 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
		n.signal=100,window.size=0.01)

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA3,ind="AD00001",popA=AdexPCA3$popA,popB=AdexPCA3$popB)



# EXAMPLE 4. 
# Generate the admixture signal in terms of genetic distance
# As EXAMPLE 2 but with genmap specified so that signals are formulated using genetic distances.
AdexPCA4 <- signal(admix$data,popA="popA",popB="popB",populations=admix$populations,tol=0.001,
	n.signal=1000,window.size=0.01,genmap=admix$map[,2])

# Plot resulting admixture signal for one individual
plotsignal(AdexPCA4,ind="AD00001",popA=AdexPCA4$popA,popB=AdexPCA4$popB)