Make a plot of effect size (y-axis) vs. log10(read coverage) (x-axis), coloring points by position relative to user-defined decision cutoffs.
EZpcaPlot(
obj,
data_type = c("fraction_labeled", "reads"),
features = NULL,
exactMatch = TRUE,
variance_decile = 7,
center = TRUE,
scale = TRUE,
point_size = 3,
metadf_cols_to_use = "all"
)A ggplot2 object.
An object of class EZbakRCompare, which is an EZbakRData object
on which you have run CompareParameters
Specifies what data to use for the PCA. Options are "fraction_labeled" (default; means using fraction high T-to-C or other mutation type estimate from EZbakR) or "reads" (means using log10(read counts + 1)).
Character vector of feature names for which comparisons were made.
If TRUE, then features and populations have to exactly match
those for a given fractions table for that table to be used. Means that you can't
specify a subset of features or populations by default, since this is TRUE
by default.
Integer between (inclusive) 1 and 9. Features with sample-to-sample
variance greater than the nth decile (n = variance_decile) will go into PCA.
From prcomp(): a logical value indicating whether the variables should be shifted to be zero centered.
Alternately, a vector of length equal the number of columns of x can be supplied. The value is passed to scale.
From prcomp(): a logical value indicating whether the variables should be scaled to have unit variance
before the analysis takes place. Alternatively, a vector of length equal the number of columns of x can be supplied.
The value is passed to scale.
Size of points in PCA plot
Columns in the EZbakR metadf that will be used to color points in the PCA plot. Points will be colored by the interaction between all of these columns (i.e., samples with unique combinations of values of these columns will get unique colors). Default is to use all columns (except "sample"), specified as "all".
EZMAPlot() accepts as input the output of CompareParameters(), i.e.,
an EZbakRData object with at least one "comparisons" table. It will plot
the "avg_coverage" column in this table vs. the "difference" column.
In the simplest case, "difference" represents a log-fold change in a kinetic
parameter (e.g., kdeg) estimate. More complicated linear model fits and
comparisons can yield different parameter estimates.
NOTE: some outputs of CompareParameters() are not meant for visualization
via an MA plot. For example, when fitting certain interaction models,
some of the parameter estimates may represent average log(kinetic paramter)
in one condition. See discussion of one example of this here.
EZbakR estimates kinetic parameters in EstimateKinetics() and EZDynamics()
on a log-scale. By default, since log2-fold changes are a bit easier to interpret
and more common for these kind of visualizations, EZMAPlot() multiplies
the y-axis value by log2(exp(1)), which is the factor required to convert from
a log to a log2 scale. You can turn this off by setting plotlog2 to FALSE.
# Simulate data to analyze
simdata <- EZSimulate(30)
# Create EZbakR input
ezbdo <- EZbakRData(simdata$cB, simdata$metadf)
# Estimate Fractions
ezbdo <- EstimateFractions(ezbdo)
# Make MA plot (ggplot object that you can save and add/modify layers)
EZpcaPlot(ezbdo)
Run the code above in your browser using DataLab