Broom tidies a number of lists that are effectively S3 objects without a class attribute. For example, stats::optim(), svd() and akima::interp() produce consistent output, but because they do not have a class attribute, they cannot be handled by S3 dispatch.
These functions look at the elements of a list and determine if there is an appropriate tidying method to apply to the list. Those tidiers are implemented as functions of the form tidy_<function> or glance_<function> and are not exported (but they are documented!).
If no appropriate tidying method is found, an error is thrown.
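The element-based lookup described above can be pictured with a small base R sketch. The helper name guess_list_tidier and the exact sets of element names it checks are assumptions made for illustration; this is not broom's internal code.

```r
# Hypothetical helper: pick a tidier by inspecting the element names of an
# unclassed list. Illustrative only, not broom's implementation.
guess_list_tidier <- function(x) {
  stopifnot(is.list(x))
  nms <- names(x)
  if (all(c("par", "value", "counts", "convergence") %in% nms)) {
    "tidy_optim"
  } else if (all(c("d", "u", "v") %in% nms)) {
    "tidy_svd"
  } else if (all(c("x", "y", "z") %in% nms)) {
    "tidy_xyz"
  } else {
    # Mirrors the documented behaviour: no matching tidier means an error
    stop("No tidy method recognized for this list.", call. = FALSE)
  }
}

s <- svd(matrix(rnorm(20), nrow = 5))
o <- optim(c(1, 1), function(p) sum((p - c(2, 3))^2))

svd_choice <- guess_list_tidier(s)
optim_choice <- guess_list_tidier(o)
```

Because the decision rests only on element names, any list that happens to carry the same names would be routed to the same tidier, which is why the output of these functions needs to be consistent.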
tidy_svd(x, matrix = "u", ...)
x
A list with components u, d, v returned by svd().
matrix
Character specifying which component of the PCA should be tidied.
"u", "samples", or "x": returns information about the map from the original space into principal components space.
"v", "rotation", or "variables": returns information about the map from principal components space back into the original space.
"d" or "pcs": returns information about the eigenvalues.
...
Additional arguments. Not used. Needed to match generic signature only. Cautionary note: Misspelled arguments will be absorbed in ..., where they will be ignored. If the misspelled argument has a default value, the default value will be used. For example, if you pass conf.lvel = 0.9, all computation will proceed using conf.level = 0.95. Additionally, if you pass newdata = my_tibble to an augment() method that does not accept a newdata argument, it will use the default value for the data argument.
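The cautionary note can be reproduced in plain R. The function interval_level below is a hypothetical stand-in with the same signature shape as a tidier (a defaulted argument followed by ...); it is not part of broom.

```r
# Hypothetical function: a defaulted argument followed by `...`,
# which silently absorbs anything that does not match a formal.
interval_level <- function(x, conf.level = 0.95, ...) {
  conf.level
}

# Correctly spelled: the supplied value is used
correct <- interval_level(1, conf.level = 0.9)

# Misspelled: `conf.lvel` lands in `...` and the default is used instead
misspelled <- interval_level(1, conf.lvel = 0.9)
```

No warning or error is raised in the second call, so a typo of this kind can go unnoticed unless the result is checked.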
A tibble::tibble with columns depending on the component of PCA being tidied.
If matrix is "u", "samples", or "x", each row in the tidied output corresponds to the original data in PCA space. The columns are:
row
ID of the original observation (i.e. rowname from original data).
PC
Integer indicating a principal component.
value
The score of the observation for that particular principal component. That is, the location of the observation in PCA space.
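As a rough picture of this long format, the u matrix from svd() can be reshaped in base R into one row per observation and component. The column names follow the description above; the construction itself, and the row ordering it produces, are assumptions and may differ from what the tidier returns.

```r
mat <- scale(as.matrix(iris[, 1:4]))
s <- svd(mat)

# One row per (observation, component) pair. row() and col() give the
# row and column index of every cell of the matrix.
u_long <- data.frame(
  row = as.vector(row(s$u)),
  PC = as.vector(col(s$u)),
  value = as.vector(s$u)
)

head(u_long)
```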
If matrix is "v", "rotation", or "variables", each row in the tidied output corresponds to information about the principal components in the original space. The columns are:
row
The variable labels (colnames) of the data set on which PCA was performed.
PC
An integer vector indicating the principal component
value
The value of the eigenvector (axis score) on the indicated principal component
If matrix is "d" or "pcs", the columns are:
PC
An integer vector indicating the principal component
std.dev
Standard deviation explained by this PC
percent
Percentage of variation explained
cumulative
Cumulative percentage of variation explained
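The quantities in this table can be derived from the singular values alone. The sketch below uses base R and assumes centered data; the scaling of std.dev by sqrt(n - 1) and the use of fractions rather than percentages are choices made here for illustration, and the tidier's own scaling may differ.

```r
mat <- scale(as.matrix(iris[, 1:4]))
s <- svd(mat)

# Share of total variance carried by each component
var_explained <- s$d^2 / sum(s$d^2)

d_summary <- data.frame(
  PC = seq_along(s$d),
  # Dividing by sqrt(n - 1) gives the same values as prcomp()'s sdev
  # when the data are centered
  std.dev = s$d / sqrt(nrow(mat) - 1),
  percent = var_explained,
  cumulative = cumsum(var_explained)
)

d_summary
```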
See https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca for information on how to interpret the various tidied matrices. Note that SVD is only equivalent to PCA on centered data.
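The note about centering can be checked directly with base R. The sketch below compares the scores from prcomp(), which centers by default, with U %*% diag(d) from an SVD of the centered and of the raw data. Component signs are arbitrary, so absolute values are compared.

```r
raw <- as.matrix(iris[, 1:4])
centered <- scale(raw, center = TRUE, scale = FALSE)

pca <- prcomp(raw)           # centers the data by default
s_centered <- svd(centered)
s_raw <- svd(raw)

# Scores implied by each decomposition
scores_centered <- s_centered$u %*% diag(s_centered$d)
scores_raw <- s_raw$u %*% diag(s_raw$d)

# Compare against the PCA scores, ignoring sign and dimnames
same_when_centered <- isTRUE(all.equal(abs(scores_centered), abs(unname(pca$x))))
same_when_raw <- isTRUE(all.equal(abs(scores_raw), abs(unname(pca$x))))
```

Only the centered decomposition reproduces the PCA scores, so tidied SVD output should be read as PCA output only when the input matrix was centered first.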
Other svd tidiers: augment.prcomp, tidy.prcomp, tidy_irlba
Other list tidiers: glance_optim, list_tidiers, tidy_irlba, tidy_optim, tidy_xyz
library(broom)
library(dplyr)
library(ggplot2)

# Center and scale the numeric columns, then take the SVD
mat <- scale(as.matrix(iris[, 1:4]))
s <- svd(mat)

# Tidy each component of the decomposition
tidy_u <- tidy(s, matrix = "u")
tidy_u

tidy_d <- tidy(s, matrix = "d")
tidy_d

tidy_v <- tidy(s, matrix = "v")
tidy_v

# Variance explained by each principal component
ggplot(tidy_d, aes(PC, percent)) +
  geom_point() +
  ylab("% of variance explained")

# Scores by species, one panel per principal component
tidy_u %>%
  mutate(Species = iris$Species[row]) %>%
  ggplot(aes(Species, value)) +
  geom_boxplot() +
  facet_wrap(~ PC, scales = "free_y")