Display data as a sorted or cumulative frequency plot. This type of plot is an alternative to plotting data as histograms.
Histograms are widely used and very intuitive. However, fine-tuning the bandwidth (i.e. the width of the bars) can be delicate, and
fine-resolution details often remain hidden.
One advantage of directly displaying all data points is that subtle differences may be revealed more easily than with classical histograms.
Furthermore, the plot presented here offers more options for displaying multiple series of data simultaneously.
Thus, this type of plot may be useful for comparing e.g. results of data normalization.
Of course, with very large data sets (e.g. > 3000 values) this gain in detail becomes less important (compared to histograms) and comes at the cost of speed.
In such cases the argument thisResol becomes useful, as it allows reducing the resolution and introducing binning.
Alternatively, for very large data sets one may look into density plots or violin plots (e.g. vioplotW).
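The effect of reducing the resolution on a large data set can be illustrated with a minimal base-R sketch. It does not reproduce what thisResol does internally (that is not described on this page). It only shows the general idea of summarising many sorted values by a smaller number of points. The names big, nPoints, probs and reduced are hypothetical and chosen for illustration.

```r
# Illustrative base-R sketch only; NOT the internals of cumFrqPlot/thisResol.
set.seed(2017)
big <- rnorm(50000)                        # one large series of values
nPoints <- 200                             # hypothetical target resolution
probs <- seq(0, 1, length.out = nPoints)   # evenly spaced cumulative fractions
# quantiles at these fractions: a reduced version of the sorted values
reduced <- quantile(big, probs = probs, names = FALSE)
# step plot with 200 points instead of 50000
plot(reduced, probs, type = "s", xlab = "value",
     ylab = "cumulative fraction", main = "Reduced-resolution sorted values")
```

With 200 points the overall shape of the distribution is retained, while fine detail between neighbouring values is traded for plotting speed.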
The argument CVlimit optionally allows excluding extreme values.
If numeric (and the data have more than 2 columns), its value will be used by exclExtrValues to identify series with column-median > 'CVlimit'.
Of course, exclusion of extreme values should be done with great care, since important features of the data may get lost.
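The basic idea of a sorted-value plot for several series, as described at the top of this page, can be sketched in base R. This is an illustration under simple assumptions, not the source code of cumFrqPlot. The object names (dat, sorted, frac) and the small shift added to one series are hypothetical.

```r
# Base-R sketch of a sorted-value plot for several series (illustrative only).
set.seed(2017)
dat <- matrix(rnorm(500), ncol = 5, dimnames = list(NULL, paste0("s", 1:5)))
dat[, 5] <- dat[, 5] + 0.3                     # small shift in one series
sorted <- apply(dat, 2, sort)                  # sort each column independently
frac <- seq_len(nrow(sorted)) / nrow(sorted)   # fraction of values <= each sorted value
# one step line per series; every data point is shown, no binning involved
matplot(sorted, frac, type = "s", lty = 1, col = 1:5,
        xlab = "value", ylab = "fraction of values",
        main = "Sorted values per series")
legend("topleft", legend = colnames(dat), col = 1:5, lty = 1, bty = "n")
```

A shifted series shows up as a curve displaced along the x-axis, which may be harder to spot in overlaid histograms where the result depends on the chosen bar width.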
cumFrqPlot(
dat,
cumSum = FALSE,
exclCol = NULL,
colNames = NULL,
displColNa = TRUE,
tit = NULL,
xLim = NULL,
yLim = NULL,
xLab = NULL,
yLab = NULL,
col = NULL,
CVlimit = NULL,
thisResol = NULL,
supTxtAdj = 0,
supTxtYOffs = 0,
useLog = "",
silent = FALSE,
callFrom = NULL
)
dat: (matrix or data.frame) data to plot/inspect
cumSum: (logical) if TRUE, plot cumulative sums (thisResol then sets the number of breaks); if FALSE, simply plot the sorted values (maximum resolution)
exclCol: (integer) columns to exclude
colNames: (character) alternative column/series names for display, used as long as displColNa=TRUE
displColNa: (logical) display column-names
tit: (character) custom title
xLim: (numeric) custom limits for the x-axis (see also par)
yLim: (numeric) custom limits for the y-axis (see also par)
xLab: (character) custom x-axis label
yLab: (character) custom y-axis label
col: (integer or character) custom colors
CVlimit: (numeric) for tagging 'outlier columns' (uses exclExtrValues): identify and mark columns with median row-CV > CVlimit
thisResol: (integer) resolution (res) for binning large data-sets
supTxtAdj: (numeric) parameter adj for supplemental text
supTxtYOffs: (numeric) supplemental offset for text on the y-axis
useLog: (character) default="", otherwise "x", "y" or "xy" for setting the corresponding axis/axes in log-scale
silent: (logical) suppress messages
callFrom: (character) allows easier tracking of message(s) produced
This function produces a plot only.
layout, exclExtrValues for decision of potential outliers; hist, vioplotW
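set.seed(2017)
# 5 series of 100 normally distributed values each
dat0 <- matrix(rnorm(500), ncol=5, dimnames=list(NULL,1:5))
# plain sorted values of each series (maximum resolution)
cumFrqPlot(dat0, tit="Sorted values")
# cumulative sums instead of plain sorted values
cumFrqPlot(dat0, cumSum=TRUE, tit="Sum of sorted values")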
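As a further sketch, the description mentions comparing results of data normalization. The example below builds a hypothetical data set with column-wise offsets, median-centers it, and plots both versions. It assumes that cumFrqPlot is provided by the package wrGraph (the package is not named on this page), so the plotting calls are guarded and only run if that package is installed. The names datRaw and datNorm are illustrative.

```r
# Hypothetical example: compare series before and after median-centering.
set.seed(2017)
datRaw <- matrix(rnorm(500), ncol = 5, dimnames = list(NULL, paste0("s", 1:5)))
# add a different offset to each column to mimic un-normalized series
datRaw <- sweep(datRaw, 2, c(0, 0.2, 0.4, 0.6, 0.8), "+")
# simple normalization: subtract each column's median
datNorm <- sweep(datRaw, 2, apply(datRaw, 2, median), "-")
# Assumption: cumFrqPlot lives in package 'wrGraph'; skip plotting otherwise
if (requireNamespace("wrGraph", quietly = TRUE)) {
  wrGraph::cumFrqPlot(datRaw, tit = "Before median-centering")
  wrGraph::cumFrqPlot(datNorm, tit = "After median-centering")
}
```

Before centering the series should appear displaced from one another along the value axis; after centering they should largely overlap.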