popkin (version 1.3.0)

plot_popkin: Visualize one or more kinship matrices

Description

This function plots one or more kinship matrices and a shared legend for the color key. Many options allow for fine control of individual or subpopulation labeling. This code assumes input matrices are symmetric.

Usage

plot_popkin(
  kinship,
  titles = NULL,
  col = NULL,
  col_n = 100,
  mar = NULL,
  mar_pad = 0.2,
  oma = 1.5,
  diag_line = FALSE,
  panel_letters = toupper(letters),
  panel_letters_cex = 1.5,
  ylab = "Individuals",
  ylab_adj = NA,
  ylab_line = 0,
  layout_add = TRUE,
  layout_rows = 1,
  leg_per_panel = FALSE,
  leg_title = "Kinship",
  leg_cex = 1,
  leg_n = 5,
  leg_mar = 3,
  leg_width = 0.3,
  names = FALSE,
  names_cex = 1,
  names_line = NA,
  names_las = 2,
  labs = NULL,
  labs_cex = 1,
  labs_las = 0,
  labs_line = 0,
  labs_sep = TRUE,
  labs_lwd = 1,
  labs_col = "black",
  labs_ticks = FALSE,
  labs_text = TRUE,
  labs_even = FALSE,
  null_panel_data = FALSE,
  weights = NULL,
  raster = is.null(weights),
  ...
)

Arguments

kinship

A numeric kinship matrix or a list of such matrices. A list may contain NULL elements, which produce blank panels (optionally with titles; see null_panel_data below). These are useful as placeholders or for non-existent data.

titles

Titles to add to each matrix panel (default is no titles).

col

Colors for heatmap (default is a red-white-blue palette symmetric about zero constructed using RColorBrewer).

col_n

The number of colors to use in the heatmap (applies if col = NULL).
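As a hedged illustration of supplying col directly, the sketch below builds a custom palette with colorRampPalette from base R (grDevices). The toy matrix and the palette are assumptions made for this example only; how plot_popkin maps these colors onto kinship values is up to the function itself and is not shown here.

```r
# Toy symmetric matrix standing in for a real kinship estimate
kinship <- matrix(0.1, nrow = 4, ncol = 4)
diag(kinship) <- 0.5

# Custom palette with 100 colors; since `col` is supplied, `col_n` does not apply
my_col <- colorRampPalette(c("blue", "white", "red"))(100)

# The plotting call only runs if popkin is installed
if (requireNamespace("popkin", quietly = TRUE)) {
  popkin::plot_popkin(kinship, col = my_col)
}
```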

mar

Margins shared by all panels (if a vector) or for each panel (if a list of such vectors). If the vector has length 1, mar corresponds to the shared lower and left margins, while the top and right margins are set to zero. If this length is 2, mar[1] is the same as above, while mar[2] is the top margin. If this length is 4, then mar is a fully specified margin vector in the standard format c(bottom, left, top, right) that par('mar') expects. Vectors of invalid lengths produce a warning. Note the padding mar_pad below is added to every margin if set. If NULL, the original margin values are used without change, and are reset for every panel that has a NULL value. The original margins are also reset after plotting is complete.
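The length rules above can be summarized in a small sketch. The helper expand_mar below is hypothetical: it is not a popkin function and may differ from the package's internal code. It only mirrors the documented behavior for vectors of length 1, 2, and 4, including the mar_pad padding; returning NULL for invalid lengths is an assumption of this sketch.

```r
# Hypothetical helper (not part of popkin): expands a `mar` value into the
# c(bottom, left, top, right) form, following the rules documented above
expand_mar <- function(mar, mar_pad = 0.2) {
  full <- switch(
    as.character(length(mar)),
    "1" = c(mar, mar, 0, 0),            # shared bottom and left; top and right are zero
    "2" = c(mar[1], mar[1], mar[2], 0), # mar[2] sets the top margin
    "4" = mar,                          # already fully specified
    NULL                                # any other length is invalid
  )
  if (is.null(full)) {
    warning("`mar` must have length 1, 2, or 4")
    return(NULL)
  }
  full + mar_pad # padding is added to every margin
}

expand_mar(3)             # bottom and left margins of 3, plus padding
expand_mar(c(3, 2))       # additionally sets the top margin to 2
expand_mar(c(3, 3, 2, 1)) # fully specified margins
```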

mar_pad

Margin padding added to all panels (mar above and leg_mar below). Default 0.2. Must be a scalar or a vector of length 4 to match par('mar').

oma

Outer margin vector. If length 1, the value of oma is applied to the left outer margin only (so ylab below displays correctly), with zero outer margins elsewhere. If length 4, all outer margins are expected in the standard c(bottom, left, top, right) format that par('mar') expects (see mar above). mar_pad above is never added to outer margins. If NULL, no outer margins are set (previous settings are preserved). Vectors of invalid lengths produce a warning.

diag_line

If TRUE, adds a line along the diagonal (default no line). May also be a vector of logicals to set per panel (lengths must agree).

panel_letters

Vector of strings for labeling panels (default A-Z). No labels are added if NULL. When there is only one panel, no label is added either, unless panel_letters is set to a single letter (this is useful when the goal is a figure with multiple external panels of which popkin creates only one).

panel_letters_cex

Scaling factor of panel letters (default 1.5).

ylab

The y-axis label (default "Individuals"). If length(ylab) == 1, the label is placed in the outer margin (shared across panels); otherwise length(ylab) must equal the number of panels and each label is placed in the inner margin of the respective panel.

ylab_adj

The value of "adj" passed to mtext. If length(ylab) == 1, only the first value is used, otherwise length(ylab_adj) must equal the number of panels.

ylab_line

The value of "line" passed to mtext. If length(ylab) == 1, only the first value is used, otherwise length(ylab_line) must equal the number of panels.

LAYOUT OPTIONS

layout_add

If TRUE (default), layout is called internally with appropriate values for the required number of panels (one per matrix), the desired number of rows (see layout_rows below), and the color key legend. In that case the original layout is also reset when plotting is complete. If a non-standard layout or additional panels (beyond those provided by plot_popkin) are desired, set to FALSE and call layout yourself beforehand.

layout_rows

Number of rows in layout, used only if layout_add = TRUE.
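A minimal sketch of the layout_add = FALSE workflow follows. The toy matrices and the extra bar plot are assumptions made for illustration, and the sketch assumes that plot_popkin fills the kinship panels in order and draws the legend panel last.

```r
# Toy symmetric matrices standing in for real kinship estimates
k1 <- matrix(c(0.6, 0.2, 0.0,
               0.2, 0.6, 0.0,
               0.0, 0.0, 0.5), nrow = 3, byrow = TRUE)
k2 <- k1 / 2

# Custom layout: two kinship panels and a narrow legend panel on the top row,
# plus one extra panel (number 4) spanning the bottom row
layout_mat <- rbind(c(1, 2, 3),
                    c(4, 4, 4))

# The plotting calls only run if popkin is installed
if (requireNamespace("popkin", quietly = TRUE)) {
  layout(layout_mat, widths = c(1, 1, 0.3))
  # panels 1 to 3: two heatmaps followed by the shared legend (assumed order)
  popkin::plot_popkin(list(k1, k2), layout_add = FALSE)
  # panel 4 is left free for any other plot
  par(mar = c(4, 4, 1, 1))
  barplot(diag(k1), ylab = "Self-kinship")
  layout(1) # restore a single-panel layout
}
```

Because plot_popkin does not reset a layout it did not create, the final layout(1) call restores the default explicitly.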

LEGEND (COLOR KEY) OPTIONS

leg_per_panel

If TRUE, every kinship matrix gets its own legend/color key (best for matrices with very different scales). If FALSE (default), a single legend/color key is shared by all kinship matrix panels.

leg_title

The name of the variable that the heatmap colors measure (default "Kinship"), or a vector of such values if they vary per panel.

leg_cex

Scaling factor for leg_title (default 1), or a vector of such values if they vary per panel.

leg_n

The desired number of ticks in the legend y-axis (input to pretty, see that for more details), or a vector of such values if they vary per panel.
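Since leg_n is passed to pretty, it is a target rather than an exact count. The base R sketch below shows this on a toy range of values (the range is an assumption for illustration; how plot_popkin derives the legend range from the data is not shown here).

```r
# Toy range of kinship values shown in the color key
kin_range <- c(0, 0.37)

# Ask for about 5 tick intervals; pretty() picks round values covering the
# range, so the number of ticks returned may differ from the request
ticks <- pretty(kin_range, n = 5)
ticks
```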

leg_mar

Margin values for the legend panel only, or a list of such values if they vary per panel. A length-4 vector (in the c(bottom, left, top, right) format that par('mar') expects) specifies the full margins, to which mar_pad is added. Otherwise, the margins used in the last panel are preserved with the exception that the left margin is set to zero, and if leg_mar is length-1, it is used to specify the right margin (plus the value of mar_pad, see above).

leg_width

The width of the legend panel, relative to the width of the kinship panel. This value is passed to layout (ignored if layout_add = FALSE).

INDIVIDUAL LABEL OPTIONS

names

If TRUE, the column and row names are plotted in the heatmap, or a vector of such values if they vary per panel.

names_cex

Scaling factor for the column and row names, or a vector of such values if they vary per panel.

names_line

Line where column and row names are placed, or a vector of such values if they vary per panel.

names_las

Orientation of labels relative to axis. Default (2) makes labels perpendicular to axis.

SUBPOPULATION LABEL OPTIONS

labs

Subpopulation labels for individuals. Use a matrix of labels to show groupings at more than one level (for a hierarchy or otherwise). If input is a vector or a matrix, the same subpopulation labels are shown for every heatmap panel; the input must be a list of such vectors or matrices if the labels vary per panel.
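A hedged sketch of a two-level labs matrix follows. The label values, the toy matrix, and the per-level settings are assumptions made for illustration; each column of the matrix is one level of labels, with one row per individual.

```r
# Toy symmetric matrix for 5 individuals
kinship <- matrix(0.1, nrow = 5, ncol = 5)
diag(kinship) <- 0.5

# Two levels of labels per individual (hypothetical names)
subpop <- c("S1", "S1", "S2", "S3", "S3") # finer grouping
region <- c("R1", "R1", "R1", "R2", "R2") # coarser grouping
labs <- cbind(subpop, region)             # one column per level

# The plotting call only runs if popkin is installed
if (requireNamespace("popkin", quietly = TRUE)) {
  # per-level options take one value per column of `labs`
  popkin::plot_popkin(
    kinship,
    labs = labs,
    labs_line = c(0, 1), # place the second level one line further out
    labs_lwd = c(1, 2)   # thicker separating lines for the second level
  )
}
```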

labs_cex

A vector of label scaling factors for each level of labs, or a list of such vectors if labels vary per panel.

labs_las

A vector of label orientations (in the format that mtext expects) for each level of labs, or a list of such vectors if labels vary per panel.

labs_line

A vector of lines where labels are placed (in the format that mtext expects) for each level of labs, or a list of such vectors if labels vary per panel.

labs_sep

A vector of logicals that specify whether lines separating the subpopulations are drawn for each level of labs, or a list of such vectors if labels vary per panel.

labs_lwd

A vector of line widths for the lines that divide subpopulations (if labs_sep = TRUE) for each level of labs, or a list of such vectors if labels vary per panel.

labs_col

A vector of colors for the lines that divide subpopulations (if labs_sep = TRUE) for each level of labs, or a list of such vectors if labels vary per panel.

labs_ticks

A vector of logicals that specify whether ticks separating the subpopulations are drawn for each level of labs, or a list of such vectors if labels vary per panel.

labs_text

A vector of logicals that specify whether the subpopulation labels are shown for each level of labs, or a list of such vectors if labels vary per panel. Useful for including separating lines or ticks without text.

labs_even

A vector of logicals that specify whether the subpopulation labels are drawn with equal spacing for each level of labs, or a list of such vectors if labels vary per panel. When TRUE, lines mapping the equally-spaced labels to the unequally-spaced subsections of the heatmap are also drawn.

null_panel_data

If FALSE (default), panels with NULL kinship matrices must not have titles or other parameters set, and no panel letters are used in these cases. If TRUE, panels with NULL kinship matrices must have titles and other parameters set. In the latter case, these NULL panels also get panel letters. The difference is important when checking that lengths of non-singleton parameters agree.
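The effect on length checks can be sketched with titles. The toy matrices and title strings below are assumptions for illustration: with the default, per-panel vectors have one entry per non-NULL matrix, while with null_panel_data = TRUE they have one entry per panel, NULL panels included.

```r
# Toy symmetric matrix standing in for a real kinship estimate
k1 <- matrix(0.1, nrow = 3, ncol = 3)
diag(k1) <- 0.5

# Three panels, the middle one is an empty placeholder
panels <- list(k1, NULL, k1 / 2)
n_panels <- length(panels)                          # 3
n_data <- sum(!vapply(panels, is.null, logical(1))) # 2

titles_data_only <- c("First", "Third")                # length n_data
titles_all <- c("First", "(no data)", "Third")         # length n_panels

# The plotting calls only run if popkin is installed
if (requireNamespace("popkin", quietly = TRUE)) {
  # default: NULL panels carry no titles or letters
  popkin::plot_popkin(panels, titles = titles_data_only)
  # NULL panels also receive a title and a panel letter
  popkin::plot_popkin(panels, titles = titles_all, null_panel_data = TRUE)
}
```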

weights

A vector with weights for every individual, or a list of such vectors if they vary per panel. The width of every individual becomes proportional to their weight. Individuals with zero or negative weights are omitted.
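One possible use of weights, sketched below in base R, is to give every subpopulation the same total width regardless of its sample size. This particular weighting scheme, the toy matrix, and the subpopulation assignments are assumptions for illustration, not a requirement of plot_popkin.

```r
# Toy data: 4 individuals, the first subpopulation is oversampled
subpops <- c(1, 1, 1, 2)
kinship <- matrix(0.1, nrow = 4, ncol = 4)
diag(kinship) <- 0.5

# Weigh each individual inversely to the size of its subpopulation,
# then normalize so the weights sum to one
counts <- table(subpops)
weights <- as.numeric(1 / counts[as.character(subpops)])
weights <- weights / sum(weights)

# Each subpopulation now has the same total weight
tapply(weights, subpops, sum)

# The plotting call only runs if popkin is installed
if (requireNamespace("popkin", quietly = TRUE)) {
  popkin::plot_popkin(kinship, labs = subpops, weights = weights)
}
```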

raster

A logical equivalent to the useRaster option of the image function used internally, or a vector of such logicals if the choice varies per panel. If weights are non-NULL in a given panel, raster = FALSE is forced (this is necessary to plot images where columns and rows have variable width). If weights are NULL, the default is raster = TRUE, but in this case the user may override it (for example, so panels are visually coherent when some use weights while others do not, as there are small differences in rendering implementation for each value of raster). Note that a multipanel figure with a list of weights sets raster = FALSE for all panels by default, even if the weights were only applied to a subset of panels.
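The reason raster = FALSE is forced with weights can be reproduced with image from base R alone. The sketch below uses toy cell boundaries (an assumption for illustration) and shows that image rejects useRaster = TRUE when cells have unequal widths.

```r
z <- matrix(1:9, nrow = 3)

# Cell boundaries: equal widths versus widths proportional to unequal weights
breaks_even <- c(0, 1, 2, 3)
breaks_uneven <- c(0, 0.5, 1, 3)

# Regular grid: raster rendering is allowed
image(breaks_even, breaks_even, z, useRaster = TRUE)

# Irregular grid: image() signals an error when raster rendering is requested
res <- tryCatch(
  image(breaks_uneven, breaks_uneven, z, useRaster = TRUE),
  error = function(e) e
)
inherits(res, "error")

# The same irregular grid works without raster rendering
image(breaks_uneven, breaks_uneven, z, useRaster = FALSE)
```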

...

Additional options passed to image. These are shared across panels.

Details

plot_popkin plots the input kinship matrices as-is. For best results, a standard kinship matrix (such as the output of popkin) should have its diagonal rescaled to contain inbreeding coefficients (inbr_diag does this) before plot_popkin is used.
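The diagonal rescaling can be sketched in base R. In a standard kinship matrix the self-kinship of individual j is (1 + f_j) / 2, where f_j is its inbreeding coefficient, so f_j = 2 * kinship[j, j] - 1. The toy values below are assumptions for illustration; in practice inbr_diag should be used rather than this manual version.

```r
# Toy kinship matrix for two individuals
kinship <- matrix(c(0.55, 0.10,
                    0.10, 0.50), nrow = 2, byrow = TRUE)

# Replace self-kinship values on the diagonal with inbreeding coefficients;
# off-diagonal kinship values are left unchanged
kinship_plot <- kinship
diag(kinship_plot) <- 2 * diag(kinship) - 1

kinship_plot
```

After this step the diagonal is on the same scale as the off-diagonal values, which keeps the heatmap colors comparable across the whole matrix.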

This function permits the labeling of individuals (from row and column names when names = TRUE) and of subpopulations (passed through labs). The difference is that the labels passed through labs are assumed to be shared by many individuals, and lines (or other optional visual aids) are added to demarcate these subgroups.

Examples

# NOT RUN {
library(popkin) # provides popkin(), inbr_diag(), and plot_popkin()

# Construct toy data
X <- matrix(c(0,1,2,1,0,1,1,0,2), nrow = 3, byrow = TRUE) # genotype matrix
subpops <- c(1,1,2) # subpopulation assignments for individuals

# NOTE: for BED-formatted input, use BEDMatrix!
# "file" is path to BED file (excluding .bed extension)
## library(BEDMatrix)
## X <- BEDMatrix(file) # load genotype matrix object

# estimate the kinship matrix from the genotypes "X"!
kinship <- popkin(X, subpops) # calculate kinship from X and optional subpop labels

# simple plot of the kinship matrix, marking the subpopulations only
# note inbr_diag replaces the diagonal of kinship with inbreeding coefficients
# (see vignette for more elaborate examples)
plot_popkin( inbr_diag(kinship), labs = subpops )

# }