
polyreg (version 0.0.0.9600)

prVis: Polynomial-Based Manifold Exploration

Description

Polynomial-based alternative to t-SNE, UMAP etc.

Usage

prVis(xy, labels = FALSE, deg = 2, scale = FALSE, nSubSam = 0, 
    nIntervals = NULL, saveOutputs = FALSE, cex = 0.5)
addRowNums(np, savedPrVisOut)

Arguments

xy

Data frame with labels, if any, in the last column.

labels

If TRUE, the last column of xy contains class labels.

deg

Degree of polynomial.

scale

If TRUE, call scale() on the non-label columns before generating polynomial terms.

nSubSam

Number of random rows of xy to sample; 0 means use the full dataset.

nIntervals

If the labels column is continuous, discretize it into this many levels.

saveOutputs

Save outputs for use in addRowNums.

cex

Point size for plot.

np

Number of points to label in plot.

savedPrVisOut

Output saved from a previous call to prVis, so that duplicate computation can be avoided.

Value

If saveOutputs is set, an R list is returned, with components gpOut, the generated polynomial matrix, and prout, the return value from the call to prcomp.

Details

A number of "nonlinear" analogs of Principal Components Analysis (PCA) have emerged, such as ICA, t-SNE, UMAP and so on. Intuitively, an approach based on polynomials may be effective too. Specifically, prVis first expands the non-label columns of xy to polynomial terms of degree deg, then applies PCA to the result.
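The expand-then-PCA idea can be sketched in base R. This is an illustration only, not the package's implementation: it assumes the built-in iris data as a stand-in for xy, uses stats::poly() for the degree-2 expansion in place of polyreg's own polynomial generator, and plots the first two principal components.

```r
# Sketch only: base-R stand-in for "expand to polynomial terms, then PCA".
# iris and the variable names below are illustrative assumptions.
x <- scale(as.matrix(iris[, 1:4]))          # non-label columns, centered and scaled

# All raw polynomial terms of total degree 1 or 2 in the 4 predictors
polyMat <- poly(x, degree = 2, raw = TRUE)

# Ordinary PCA on the expanded matrix
prout <- prcomp(polyMat)

# Plot the first two principal components, colored by class label
plot(prout$x[, 1:2], col = as.integer(iris$Species), pch = 16, cex = 0.5)
```

With 4 predictors and degree 2 the expansion has 14 columns, so the number of terms grows quickly with the number of predictors and with the degree.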

Once a plot is displayed, addRowNums can be used to add row-number IDs of random points, to gain further insight into the data.
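The row-number labeling step can be imitated in base R as a rough sketch. It does not call addRowNums itself: it assumes the built-in iris data, a base-R polynomial expansion via stats::poly(), and a hypothetical choice of np = 10 points.

```r
# Sketch only: label a random subset of plotted points with their row numbers.
set.seed(1)                                  # reproducible choice of points
x <- scale(as.matrix(iris[, 1:4]))
pcs <- prcomp(poly(x, degree = 2, raw = TRUE))$x[, 1:2]

plot(pcs, pch = 16, cex = 0.5)

np <- 10                                     # number of points to label (assumed)
idx <- sample(nrow(pcs), np)                 # random row indices, no repeats
text(pcs[idx, 1], pcs[idx, 2], labels = idx, pos = 3, cex = 0.7)

# Inspect the labeled rows to see what nearby points have in common
iris[idx, ]
```

Looking up the labeled rows in the original data frame is what connects a visual pattern, such as a streak, to the variables that produce it.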

Examples

# NOT RUN {
library(polyreg)
getPE()  # prgeng data, included in pkg; creates the data frame pe
# may want to predict wage; look at some predictors
pe1 <- pe[, c(1, 2, 6:16)]
z <- prVis(pe1, nSubSam = 2000, saveOutputs = TRUE, labels = FALSE)
# get a bunch of streaks; why?
# call addRowNums() (not shown); discover that points on the same streak
# tend to have same combination of sex, education and occupation; moving
# along a streak mainly consists of varying age

print('see data/SwissRoll for another example')

# }
