This function creates a 'volc3d' object of S4 class for downstream plots,
containing the p-values from a 2x3 factor analysis, expression data,
sample data and polar coordinates. For RNA-Seq count data, the two functions
deseq_2x3() followed by deseq_2x3_polar() can be used instead.
polar_coords_2x3(
data,
metadata = NULL,
outcome,
group,
pvals = NULL,
padj = pvals,
pcutoff = 0.05,
fc_cutoff = NULL,
padj.method = "BH",
process = c("positive", "negative", "two.sided"),
scheme = c("grey60", "red", "gold2", "green3", "cyan", "blue", "purple", "black"),
labs = NULL,
...
)
Returns an S4 'volc3d' object containing:
'df' A list of 2 dataframes. Each dataframe contains x, y, z coordinates as
well as polar coordinates r and angle. The first dataframe has coordinates
based on scaled data. The second dataframe has unscaled data (e.g. log2
fold change for gene expression). The type argument in volcano3D,
radial_plotly and radial_ggplot corresponds to these dataframes.
'outcome' The three-group contrast factor used for comparisons,
linked to the group column
'data' Dataframe or matrix containing the expression data
'pvals' A dataframe containing p-values in 3 columns representing the binary comparison for the outcome for each of the 3 groups.
'padj' A dataframe containing p-values adjusted for multiple testing
'pcutoff' Numeric value for cut-off for p-value significance
'scheme' Character vector with colour scheme for plotting
'labs' Character vector with labels for colour groups
'data' Dataframe or matrix with variables in columns and samples in rows
'metadata' Dataframe of sample information with samples in rows
'outcome' Either the name of the column in metadata containing the binary
outcome data, or a vector with 2 groups, ideally a factor. If it is not a
factor, it will be coerced to a factor. This must have exactly 2 levels.
'group' Either the name of the column in metadata containing the 3-way
grouping data, or a vector with 3 groups, ideally a factor. If it is not a
factor, it will be coerced to a factor. This must have exactly 3 levels.
NOTE: if pvals is given, the order of the levels in group must correspond to
the order of columns in pvals.
'pvals' Optional matrix or dataframe with p-values in 3 columns. If pvals is
not given, it is calculated using the function calc_stats_2x3. Each of the 3
columns holds the p-values for the binary outcome comparison within one of
the 3 groups specified in group.
'padj' Matrix or dataframe with adjusted p-values. If not supplied,
defaults to use nominal p-values from pvals.
'pcutoff' Cut-off for p-value significance
'fc_cutoff' Cut-off for fold change on radial axis
'padj.method' Can be "qvalue" or any method available in p.adjust.
The option "none" is a pass-through.
'process' Character value specifying the colour process for statistically significant genes: "positive" specifies that genes are coloured if fold change is >0; "negative" is for genes with fold change <0 (note that for clarity the polar position is altered so that genes along each axis have the most strongly negative fold change values); "two.sided" is a compromise in which positive genes are labelled as before but genes with negative fold changes and significant p-values have an inverted colour scheme.
'scheme' Vector of colours starting with non-significant variables
'labs' Optional character vector for labelling groups. Default NULL
leads to abbreviated labels based on levels in outcome using abbreviate().
A vector of length 3 with custom abbreviated names for the outcome levels
can be supplied. Otherwise a vector of length 8 is expected, of the form
"ns", "A+", "A+B+", "B+", "B+C+", "C+", "A+C+", "A+B+C+", where "ns" means
non-significant and A, B, C refer to levels 1, 2, 3 in outcome, and must be
in the correct order.
'...' Optional arguments passed to calc_stats_2x3
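As a rough illustration of what padj.method = "BH" amounts to, the sketch below applies base R's p.adjust() to each column of a small p-value matrix. The matrix pv and its gene and group names are made up for illustration, and adjusting each column separately is an assumption here rather than something the argument description states.

```r
# Hypothetical p-values: 4 genes (rows) x 3 groups (columns)
pv <- matrix(c(0.01,  0.04, 0.03, 0.20,
               0.50,  0.02, 0.80, 0.60,
               0.001, 0.30, 0.05, 0.90),
             ncol = 3,
             dimnames = list(paste0("gene", 1:4), c("A", "B", "C")))

# Assumption: each comparison (column) is adjusted on its own
padj_bh <- apply(pv, 2, p.adjust, method = "BH")

# method = "none" returns the p-values unchanged (a pass-through)
padj_none <- apply(pv, 2, p.adjust, method = "none")

padj_bh
```

A matrix built this way has the same shape as pvals, so it could in principle be supplied as padj when custom adjustment is wanted.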
This function is designed for manually generating a 'volc3d' class object for
visualising a 2x3 way analysis comparing a large number of attributes such as
genes. For RNA-Seq data we suggest using the deseq_2x3() and
deseq_2x3_polar() functions in sequence instead.
Scaled polar coordinates are generated using the t-score for each group
comparison. Unscaled polar coordinates are generated as the difference
between means for each group comparison. If p-values are not supplied they
are calculated by calc_stats_2x3() using either t-tests or Wilcoxon tests.
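A minimal sketch of the two quantities described above, for a single simulated gene: within each of the 3 groups, the t statistic of the outcome comparison (the scaled value) and the difference between outcome means (the unscaled value). The simulated data, the level names, the use of a Welch t-test and the direction of the contrast (second outcome level minus first) are assumptions for illustration; calc_stats_2x3() may organise this differently. The helper to_polar() is hypothetical: it places three values on axes 120 degrees apart, which is one plausible reading of "polar coordinates" here and not a statement of the package's exact transform.

```r
set.seed(1)

# Simulated design: 3 groups x 2 outcome levels, 5 samples per cell
group   <- factor(rep(c("A", "B", "C"), each = 10))
outcome <- factor(rep(rep(c("non-responder", "responder"), each = 5), times = 3))
expr    <- rnorm(30) + ifelse(group == "A" & outcome == "responder", 2, 0)

# Within each group, compare the second outcome level against the first
stats <- sapply(levels(group), function(g) {
  x  <- expr[group == g & outcome == levels(outcome)[2]]
  y  <- expr[group == g & outcome == levels(outcome)[1]]
  tt <- t.test(x, y)  # Welch two-sample t-test
  c(t_score   = unname(tt$statistic),
    mean_diff = mean(x) - mean(y),
    p         = tt$p.value)
})

# Hypothetical helper: project three values onto axes at 0, 120 and 240 degrees
to_polar <- function(v) {
  v <- unname(v)
  x <- v[1] - cospi(1/3) * (v[2] + v[3])
  y <- sinpi(1/3) * (v[2] - v[3])
  c(x = x, y = y, r = sqrt(x^2 + y^2),
    angle = (atan2(y, x) * 180 / pi) %% 360)
}

stats
to_polar(stats["t_score", ])    # scaled coordinates
to_polar(stats["mean_diff", ])  # unscaled coordinates
```

With this projection, a value that is non-zero in only one group lies exactly on that group's axis, for example to_polar(c(0, 1, 0)) has r = 1 and an angle of 120 degrees.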
The z axis for 3d volcano plots does not have as clear a corollary in 2x3 analysis as for the standard 3-way analysis (which uses the likelihood ratio test for the 3 groups). For 2x3 polar analysis the smallest p-value from the 3 group pairwise comparisons for each gene is used to generate a z coordinate as -log10(p-value).
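The z coordinate described above can be sketched in a few lines of base R. The p-value matrix is made up for illustration, and whether the package takes the minimum over nominal or adjusted p-values is not stated here; nominal values are assumed.

```r
# Hypothetical p-values: 3 genes (rows) x 3 group comparisons (columns)
pv <- cbind(A = c(0.01, 0.5,   0.2),
            B = c(0.3,  0.002, 0.9),
            C = c(0.04, 0.6,   0.1))
rownames(pv) <- paste0("gene", 1:3)

# Smallest p-value across the 3 comparisons for each gene
min_p <- apply(pv, 1, min)

# z coordinate: larger values mean stronger evidence in at least one group
z <- -log10(min_p)
z
```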
The colour scheme is not as straightforward as for the standard polar plot
and volcano3D plot since genes (or attributes) can be significantly up- or
downregulated in the response comparison for each of the 3 groups.
process = "positive" means that genes are labelled with colours if a gene
is significantly upregulated in the response for that group. This uses the
primary colours (RGB), so that a gene upregulated in both the red and blue
groups becomes purple, and so on with the other secondary colours. If the
gene is upregulated in all 3 groups it is labelled black. Non-significant
genes are in grey.
With process = "negative" genes are coloured when they are significantly
downregulated. With process = "two.sided" both significantly up- and
downregulated genes are coloured, with downregulated genes labelled with
inverted colours (i.e. cyan is the inverse of red etc).
However, significant upregulation in a group takes precedence.
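The colour assignment for process = "positive" can be sketched as follows. This assumes the eight default scheme colours line up position by position with the eight labels listed for labs, and that significance means an adjusted p-value strictly below pcutoff; the padj and fc matrices are made-up inputs, and the code is an illustration rather than the package's implementation.

```r
scheme <- c("grey60", "red", "gold2", "green3", "cyan", "blue", "purple", "black")
labs   <- c("ns", "A+", "A+B+", "B+", "B+C+", "C+", "A+C+", "A+B+C+")

# Hypothetical inputs: 5 genes (rows) x 3 groups (columns)
padj <- cbind(A = c(0.01, 0.01, 0.50, 0.01, 0.01),
              B = c(0.50, 0.02, 0.50, 0.01, 0.03),
              C = c(0.50, 0.50, 0.50, 0.01, 0.50))
fc   <- cbind(A = c( 1.2, 0.8, 0.1, 2.0, -1.5),
              B = c( 0.1, 1.1, 0.0, 1.5,  0.9),
              C = c(-0.2, 0.3, 0.2, 1.8,  0.1))
pcutoff <- 0.05

# A gene counts as "up" in a group if it is significant with positive fold change
up <- padj < pcutoff & fc > 0

# Build a label such as "A+B+" from the groups in which the gene is up
pattern <- apply(up, 1, function(u) {
  if (!any(u)) "ns" else paste0(colnames(up)[u], "+", collapse = "")
})

# Look up the colour at the matching position in the scheme
colour <- scheme[match(pattern, labs)]
data.frame(pattern, colour)
```

The fifth gene shows why direction matters: it is significant in group A but downregulated there, so under "positive" it is labelled only for group B.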
deseq_2x3, deseq_2x3_polar, calc_stats_2x3