wrGraph (version 1.1.0)

plotPCAw: PCA plot with bag-plot to highlight groups

Description

Function to plot a principal components analysis (PCA), with options to show the center and potential outliers for each of the groups (columns of data). A particular feature of this implementation is the integration of bag-plots to better visualize different groups of points (if they can be organized beforehand as distinct groups): the main body of data is shown as 'bag-plots' (a bivariate boxplot, see Bagplot) with different transparent colors to highlight the core part of each group (if it contains more than 2 values). Furthermore, group centers are shown as average or median (see 'nGrpForMedian') with stars and index-number (if <25 groups). The layout is automatically set to 2 or 4 subplots (if plotting more than 2 principal components makes sense).

Note: This function calculates the PCA using prcomp. The defaults of prcomp itself are center=TRUE and scale.=FALSE, whereas the defaults shown under Usage for this function are center=TRUE and scale.=TRUE. princomp() is a separate implementation with its own conventions (see its help page).

Note: NA-values cannot (by definition) be processed by PCA; all lines with any non-finite values (eg NA) will be omitted!

Note: Package RColorBrewer may be used if available.

Finally, note that several other packages dedicated to PCA exist; for example, FactoMineR offers a very wide spectrum of possibilities, in particular for combined numeric and categorical data.
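
The preprocessing described above can be sketched with base R alone. This is a minimal illustration and not the package's internal code: the object names (datDemo, keep, datFin, pcaDemo) are hypothetical, and it assumes that samples are in columns, so the matrix is transposed before being passed to prcomp.

```r
# Hypothetical demo data: 10 rows (eg genes) x 6 columns (samples)
set.seed(1)
datDemo <- matrix(rnorm(60), nrow=10, ncol=6)
datDemo[3, 2] <- NA      # introduce a missing value
datDemo[7, 5] <- Inf     # introduce another non-finite value

# keep only rows where every value is finite (drops rows 3 and 7)
keep <- rowSums(!is.finite(datDemo)) == 0
datFin <- datDemo[keep, , drop=FALSE]

# assumption: samples (columns) become the observations of the PCA
pcaDemo <- prcomp(t(datFin), center=TRUE, scale.=TRUE)
head(pcaDemo$x[, 1:2])   # scores of the samples on PC1 and PC2
```

Setting scale.=FALSE in the call to prcomp leaves the rows on their original scale, so rows with large variance dominate the first components.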

Usage

plotPCAw(
  dat,
  sampleGrp,
  tit = NULL,
  useSymb = c(21:25, 9:12, 3:4),
  center = TRUE,
  scale. = TRUE,
  colBase = NULL,
  useSymb2 = NULL,
  displBagPl = TRUE,
  getOutL = FALSE,
  cexTxt = 1,
  showLegend = TRUE,
  nGrpForMedian = 6,
  pointLabelPar = NULL,
  rowTyName = "genes",
  rotatePC = NULL,
  suplFig = TRUE,
  callFrom = NULL,
  silent = FALSE
)

Arguments

dat

(matrix, list or data.frame) data to plot. Note: NA-values cannot be processed - all lines with non-finite data (eg NA) will be omitted !

sampleGrp

(character or factor) grouping of replicates (should be a factor); NAs are not supported

tit

(character) custom title

useSymb

(integer) symbols to use (see also par)

center

(logical or numeric) decide if variables should be shifted to be zero centered, argument passed to prcomp

scale.

(logical or numeric) decide if variables should be scaled to unit variance, argument passed to prcomp. Alternatively, a vector of length equal to the number of columns of x (the data as given to prcomp) can be supplied. The value is passed to scale.

colBase

(character or integer) use custom colors

useSymb2

(integer) symbol to mark group-center (no mark of group-center if default NULL) (equivalent to pch, see also par)

displBagPl

(logical) if TRUE, show bagPlot (group-center) if >3 points per group otherwise the average-confidence-interval

getOutL

(logical) return outlier samples/values

cexTxt

(integer) expansion factor for text (see also par)

showLegend

(logical) toggle to display legend

nGrpForMedian

(integer) decide if group center should be displayed via its average or median value: if a group has fewer than 'nGrpForMedian' values, the average will be used, otherwise the median; if NULL, no group centers will be displayed

pointLabelPar

(character) define formatting for optional labels next to points in main figure (ie PC1 vs PC2); may be TRUE or a list containing elements 'textLabel', 'textCol', 'textCex', 'textOffSet', 'textAdj' for fine-tuning

rowTyName

(character) for subtitle : specify nature of rows (genes, proteins, probesets,...)

rotatePC

(integer) optional rotation (multiplication by -1, ie sign flip) in the figure for the principal components specified by index

suplFig

(logical) to include plots vs 3rd principal component (PC) and Screeplot

callFrom

(character) allow easier tracking of message(s) produced

silent

(logical) suppress messages
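
The rule described for 'nGrpForMedian' can be illustrated with a short base R sketch. It is only an illustration under stated assumptions, not the package's own code: the score matrix scoresDemo, the grouping grpDemo and the helper centerFun are hypothetical names introduced here.

```r
# Hypothetical PCA scores for 9 samples on two components
set.seed(2)
scoresDemo <- cbind(PC1=rnorm(9), PC2=rnorm(9))
grpDemo <- factor(rep(c("A","B"), c(7,2)))   # group A: 7 samples, group B: 2 samples
nGrpForMedian <- 6

# fewer than nGrpForMedian values: average, otherwise: median
centerFun <- function(v) if (length(v) < nGrpForMedian) mean(v) else median(v)

# one row per group, one column per principal component
grpCenters <- apply(scoresDemo, 2, function(pc) tapply(pc, grpDemo, centerFun))
grpCenters
```

Here group A (7 values) gets its median as center, while group B (2 values) gets its average.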

Value

plot and optional matrix of outlier-data

See Also

prcomp (used underneath in this function for the PCA), princomp, the package FactoMineR

Examples

# NOT RUN {
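# simulate 100 rows x 20 samples; the column-wise shift (1 to 5) defines 5 groups of 6 to 2 samples
set.seed(2019)
dat1 <- matrix(round(c(rnorm(1000), runif(1000,-0.9,0.9)),2),
  ncol=20, byrow=TRUE) + matrix(rep(rep(1:5,6:2), each=100), ncol=20)
biplot(prcomp(dat1))      # traditional plot
# group labels, one per column of dat1
(grp <- factor(rep(LETTERS[5:1],6:2)))
plotPCAw(dat1,grp)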
# }
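
The sign of a principal component is arbitrary, which is why an option like 'rotatePC' can be useful for matching the orientation of another figure. The following base R sketch shows what such a sign flip amounts to; it is not taken from the package source, and the objects datRot, pcaRot and scoresRot are hypothetical names. It also assumes that samples are in columns and are transposed for prcomp.

```r
# Hypothetical data: 10 rows x 20 samples
set.seed(2019)
datRot <- matrix(rnorm(200), ncol=20)
pcaRot <- prcomp(t(datRot))          # assumption: samples become observations

rotatePC <- 2                        # index of the component to flip
scoresRot <- pcaRot$x
scoresRot[, rotatePC] <- -1 * scoresRot[, rotatePC]

# the spread along each component is unchanged by the sign flip
all.equal(apply(scoresRot, 2, sd), pcaRot$sdev, check.attributes=FALSE)
```

Only the orientation of the flipped axis changes; distances between samples and the proportion of variance per component stay the same.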
