wrGraph (version 1.3.7)

plotPCAw: PCA plot with bag-plot to highlight groups

Description

This function plots a principal components analysis (PCA), with options to show the center and potential outliers for each of the groups of columns (samples) of the data. The main features of this implementation are bagplots to highlight groups of columns/samples and support for (object-oriented) output from limma and wrProteo.

Usage

plotPCAw(
  dat,
  sampleGrp,
  tit = NULL,
  useSymb = c(21:25, 9:12, 3:4),
  center = TRUE,
  scale. = TRUE,
  colBase = NULL,
  useSymb2 = NULL,
  cexTxt = 1,
  cexSub = 0.6,
  displBagPl = TRUE,
  outCoef = 2,
  getOutL = FALSE,
  showLegend = TRUE,
  nGrpForMedian = 6,
  pointLabelPar = NULL,
  rowTyName = "genes",
  rotatePC = NULL,
  suplFig = TRUE,
  callFrom = NULL,
  silent = FALSE,
  debug = FALSE
)

Value

This function makes a plot and may optionally return a matrix of outlier data (depending on the argument getOutL).

Arguments

dat

(matrix, data.frame, MArrayLM-object or list) data to plot. Note: NA-values cannot be processed; all lines with non-finite data (eg NA) will be omitted! In case of an MArrayLM-object or list, dat must contain a list-element named 'datImp', 'dat' or 'data'.

sampleGrp

(character or factor) should be a factor describing groups of replicates; NAs are not supported

tit

(character) custom title

useSymb

(integer) symbols to use (see also par)

center

(logical or numeric) decide if variables should be shifted to be zero centered, argument passed to prcomp

scale.

(logical or numeric) decide if variables should be scaled to unit variance, argument passed to prcomp. Alternatively, a vector of length equal to the number of columns of the data given to prcomp can be supplied. The value is passed to scale.

colBase

(character or integer) use custom colors

useSymb2

(integer) symbol to mark group-center (no mark of group-center if default NULL) (equivalent to pch, see also par)

cexTxt

(numeric) expansion factor for text (see also par)

cexSub

(numeric) expansion factor for subtitle line text (see also par)

displBagPl

(logical) if TRUE, show bagPlot (group-center) if >3 points per group otherwise the average-confidence-interval

outCoef

(numeric) parameter for defining outliers, see addBagPlot (equivalent to range in boxplot)

getOutL

(logical) return outlier samples/values

showLegend

(logical or character) toggle to display the legend; if character, it designates the location within the plot where the legend is displayed ('bottomleft', 'topright', etc.)

nGrpForMedian

(integer) decide if the group center should be displayed via its average or median value: if a group has fewer than 'nGrpForMedian' values, the average will be used, otherwise the median; if NULL, no group centers will be displayed

pointLabelPar

(logical or list) define formatting for optional labels next to points in the main figure (ie PC1 vs PC2); may be TRUE or a list containing the elements 'textLabel', 'textCol', 'textCex', 'textOffSet', 'textAdj' for fine-tuning

rowTyName

(character) for subtitle : specify nature of rows (genes, proteins, probesets,...)

rotatePC

(integer) optional rotation (by -1) for figure of the principal components specified by index

suplFig

(logical) to include plots vs 3rd principal component (PC) and Screeplot

callFrom

(character) allow easier tracking of messages produced

silent

(logical) suppress messages

debug

(logical) display additional messages for debugging
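
As an illustration of the rotatePC argument, here is a minimal base-R sketch, assuming that "rotation (by -1)" simply means flipping the sign of the scores of the chosen component. The variable names are hypothetical and this is not the package's internal code. Because the sign of a principal component is arbitrary, the flip mirrors the plot along that axis without changing the variance the component captures.

```r
set.seed(2019)
x <- matrix(rnorm(200), ncol = 10)        # 20 rows, 10 columns (samples)
pca <- prcomp(t(x), center = TRUE)         # assumption: samples are treated as observations
rotatePC <- 2                              # index of the component to flip
scores <- pca$x
scores[, rotatePC] <- -1 * scores[, rotatePC]
# the variance per component is unchanged by the sign flip
all.equal(apply(scores, 2, var), apply(pca$x, 2, var))
```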

Details

One motivation for this implementation of plotting PCA was to provide a convenient way of doing so with MArrayLM-objects or lists as created by limma and wrProteo.

Another motivation for this implementation comes from integrating the idea of bag-plots to better visualize different groups of points (if they can be organized beforehand as distinct groups): The main body of data is shown as 'bag-plots' (a bivariate boxplot, see Bagplot) with different transparent colors to highlight the core part of the different groups (if they contain more than 2 values per group). Furthermore, group centers are shown as average or median (see 'nGrpForMedian') with stars & index-number (if <25 groups).
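
The choice between average and median for a group center, as described for 'nGrpForMedian', can be sketched as follows. The helper groupCenter() is hypothetical (it is not a wrGraph function) and assumes the rule is applied to one coordinate at a time.

```r
# hypothetical helper, not part of wrGraph
groupCenter <- function(val, nGrpForMedian = 6) {
  if (is.null(nGrpForMedian)) return(NULL)     # NULL: no group center is displayed
  # fewer values than the threshold: average, otherwise median
  if (length(val) < nGrpForMedian) mean(val) else median(val)
}
groupCenter(c(1, 2, 9))              # 3 values: the average is used
groupCenter(c(1, 2, 3, 4, 5, 60))    # 6 values: the median is used
```

With larger groups the median keeps a single extreme point (here 60) from pulling the displayed center away from the bulk of the group.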

Layout is automatically set to 2 or 4 subplots (if plotting more than 2 principal components makes sense).

Note: This function uses prcomp for calculating eigenvectors and principal components. prcomp itself defaults to center=TRUE and scale.=FALSE, whereas plotPCAw passes scale.=TRUE by default (see Usage); princomp() is an alternative implementation based on the covariance or correlation matrix. The user has the option to intervene on the arguments center and scale.; however, this should be done with care.
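
Why the scale. argument deserves care can be seen with prcomp directly. The sketch below uses simulated data with hypothetical variable names: three variables on very different scales are analysed with and without scaling, and the share of variance captured by the first component is compared.

```r
set.seed(2019)
x <- cbind(a = rnorm(50, sd = 1), b = rnorm(50, sd = 10), c = rnorm(50, sd = 100))
pcaRaw <- prcomp(x, center = TRUE, scale. = FALSE)   # variables keep their original variance
pcaStd <- prcomp(x, center = TRUE, scale. = TRUE)    # variables scaled to unit variance
# proportion of total variance captured by PC1 in each case
propRaw <- pcaRaw$sdev[1]^2 / sum(pcaRaw$sdev^2)
propStd <- pcaStd$sdev[1]^2 / sum(pcaStd$sdev^2)
round(c(unscaled = propRaw, scaled = propStd), 3)
```

Without scaling, the variable with the largest spread dominates the first component; with scaling, all variables contribute on an equal footing.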

Note: NA-values cannot (by definition) be processed by (any) PCA - all lines with any non-finite values/content (eg NA) will be omitted !
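
To check beforehand how many lines would be lost to this filtering, the rows with only finite values can be identified in base R. This is a minimal sketch with hypothetical variable names, assuming that "lines" refers to rows of the input matrix; it is not the package's internal code.

```r
dat <- matrix(c( 1,   2, NA,  4,
                 5,   6,  7,  8,
                 9, Inf, 11, 12,
                13,  14, 15, 16), ncol = 4, byrow = TRUE)
keep <- rowSums(!is.finite(dat)) == 0     # TRUE for rows containing only finite values
datFinite <- dat[keep, , drop = FALSE]    # rows with NA or Inf are dropped
dim(datFinite)
```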

Note : Package RColorBrewer may be used if available.

For more options with PCA (and related methods) you may also see the package FactoMineR, which provides a very wide spectrum of possibilities, in particular for combined numeric and categorical data.

See Also

prcomp (used here for the underlying PCA), princomp; see the package FactoMineR for multiple plotting options or ways of combining categorical and numeric data

Examples

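library(wrGraph)
# simulate 100 rows x 20 samples forming 5 groups of 6, 5, 4, 3 and 2 replicates
set.seed(2019)
dat1 <- matrix(round(c(rnorm(1000), runif(1000,-0.9,0.9)),2),
  ncol=20, byrow=TRUE) + matrix(rep(rep(1:5,6:2), each=100), ncol=20)
biplot(prcomp(dat1))        # traditional plot
(grp <- factor(rep(LETTERS[5:1],6:2)))   # group label for each column of dat1
plotPCAw(dat1, grp)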
