logisticPCA (version 0.2)

logisticPCA: Logistic Principal Component Analysis

Description

Dimensionality reduction for binary data, obtained by extending Pearson's PCA formulation to minimize the binomial deviance.

Usage

logisticPCA(x, k = 2, m = 4, quiet = TRUE, partial_decomp = FALSE, max_iters = 1000, conv_criteria = 1e-05, random_start = FALSE, start_U, start_mu, main_effects = TRUE, validation, M, use_irlba)

Arguments

x
matrix with all binary entries
k
number of principal components to return
m
value used to approximate the natural parameters of the saturated model. If m = 0, m is solved for
quiet
logical; if FALSE, the calculation prints feedback as it runs
partial_decomp
logical; if TRUE, the function uses the rARPACK package to calculate the eigen-decomposition more quickly. This is usually faster than the standard eigen-decomposition when ncol(x) > 100 and k is small
max_iters
maximum number of iterations
conv_criteria
convergence criterion. The algorithm stops when the difference in average deviance between successive iterations falls below this value
random_start
logical; whether to randomly initialize the parameters. If FALSE, the function uses an eigen-decomposition as the starting value
start_U
starting value for the orthogonal matrix
start_mu
starting value for mu. Only used if main_effects = TRUE
main_effects
logical; whether to include main effects in the model
validation
optional validation matrix. If supplied and m = 0, the validation data is used to solve for m
M
deprecated. Use m instead
use_irlba
deprecated. Use partial_decomp instead
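
To make the roles of m and k more concrete, here is a minimal base-R sketch based on the formulation in the paper listed under References, not on the package source. It approximates the saturated-model natural parameters by m * (2 * x - 1) and projects them onto k orthonormal directions. The helper name project_natural_params, the use of column-mean logits for the main effects, and the single eigen-decomposition used to pick U are all illustrative assumptions; logisticPCA itself iterates to minimize the deviance.

```r
# Illustrative sketch only; not the package's implementation.
# project_natural_params() is a hypothetical helper name.
project_natural_params <- function(x, k = 2, m = 4) {
  # Saturated-model natural parameters are +/-Inf for binary data;
  # m * (2 * x - 1) is a finite stand-in: +m where x is 1, -m where x is 0.
  theta_sat <- m * (2 * x - 1)

  # Main effects, taken here as the logit of the column means (an assumption).
  # This is infinite if a column is all 0s or all 1s.
  mu <- qlogis(colMeans(x))

  # Center, then use the top-k eigenvectors of the cross-product as a
  # simple, non-optimized choice of orthonormal loadings U.
  centered <- sweep(theta_sat, 2, mu)
  U <- eigen(crossprod(centered), symmetric = TRUE)$vectors[, seq_len(k), drop = FALSE]

  # Projected natural parameters: mu + (theta_sat - mu) U U^T.
  theta_hat <- sweep(centered %*% U %*% t(U), 2, mu, FUN = "+")
  list(mu = mu, U = U, PCs = centered %*% U, theta = theta_hat)
}

set.seed(1)
x <- matrix(rbinom(100 * 10, 1, 0.4), 100, 10)
fit <- project_natural_params(x, k = 2, m = 4)
dim(fit$PCs)
```

A larger m pushes the stand-in parameters closer to the saturated model, which is why m = 0 asks the function to solve for a suitable value instead.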

Value

An S3 object of class lpca which is a list with the following components:
mu
the main effects
U
a k-dimensional orthonormal matrix with the loadings
PCs
the principal component scores
m
the value of m that was supplied or solved for
iters
number of iterations required for convergence
loss_trace
the average negative log likelihood at each iteration of the algorithm. Should be non-increasing
prop_deviance_expl
the proportion of deviance explained by this model. If main_effects = TRUE, the null model is just the main effects, otherwise the null model estimates 0 for all natural parameters.
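
As a rough illustration of how a proportion of deviance explained can be computed against the two null models described above, here is a base-R sketch. The helper names bernoulli_deviance and prop_deviance_explained are hypothetical and the code is not taken from the package; it assumes the usual Bernoulli deviance, which equals -2 times the log likelihood because the saturated model fits binary data perfectly.

```r
# Illustrative sketch only; helper names are hypothetical.
# Bernoulli deviance of binary x given a matrix of natural parameters theta.
bernoulli_deviance <- function(x, theta) {
  # plogis(theta, log.p = TRUE) is log(p); plogis(-theta, log.p = TRUE) is log(1 - p).
  -2 * sum(x * plogis(theta, log.p = TRUE) + (1 - x) * plogis(-theta, log.p = TRUE))
}

# One minus the ratio of model deviance to null-model deviance.
prop_deviance_explained <- function(x, theta_fit, theta_null) {
  1 - bernoulli_deviance(x, theta_fit) / bernoulli_deviance(x, theta_null)
}

set.seed(1)
x <- matrix(rbinom(50 * 4, 1, 0.3), 50, 4)

# Null model without main effects: every natural parameter is 0.
theta_zero <- matrix(0, nrow(x), ncol(x))
# Null model with main effects: one logit per column.
theta_main <- matrix(qlogis(colMeans(x)), nrow(x), ncol(x), byrow = TRUE)

# Deviance explained by the main effects alone, relative to the all-zero model.
prop_main <- prop_deviance_explained(x, theta_main, theta_zero)
prop_main
```

In practice theta_fit would be the fitted natural parameters of the model being evaluated, with theta_null chosen according to whether main effects are included.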

References

Landgraf, A.J. & Lee, Y., 2015. Dimensionality reduction for binary data through the projection of natural parameters. arXiv preprint arXiv:1510.06112.

Examples

library(logisticPCA)

# construct a low-rank matrix on the logit scale
rows = 100
cols = 10
set.seed(1)
mat_logit = outer(rnorm(rows), rnorm(cols))

# generate a binary matrix
mat = (matrix(runif(rows * cols), rows, cols) <= inv.logit.mat(mat_logit)) * 1.0

# run logistic PCA on it
lpca = logisticPCA(mat, k = 1, m = 4, main_effects = FALSE)

# Logistic PCA likely does a better job finding latent features
# than standard PCA
plot(svd(mat_logit)$u[, 1], lpca$PCs[, 1])
plot(svd(mat_logit)$u[, 1], svd(mat)$u[, 1])