enron

This data set is a subset of the Enron e-mail corpus from the UCI Machine Learning Repository (Lichman, 2013). The original corpus is a collection of 39,861 e-mail messages with roughly 6 million tokens and a 28,102-term vocabulary. The subset is a binary (presence/absence) data set that records, for every message, which of the 80 most frequent words in the original corpus it contains.
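To make the presence/absence coding concrete, the following base R sketch derives a binary subset from a toy term-count matrix. It is not the procedure used to build this data set: the matrix `counts`, its word labels, and the cut-off `k` are made up for illustration, and ranking words by their total count is only one possible reading of "most frequent".

```r
# Toy term-count matrix: rows are messages, columns are words (made-up values).
counts <- matrix(
  c(2, 0, 1, 0,
    0, 0, 3, 1,
    1, 0, 0, 0,
    4, 1, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("msg", 1:4), c("meeting", "gas", "power", "deal"))
)

# The enron subset keeps 80 words; 2 is enough for this toy matrix.
k <- 2

# Assumption: "most frequent" means the largest total count over all messages.
totals <- colSums(counts)
top_words <- names(sort(totals, decreasing = TRUE))[seq_len(k)]

# Recode the retained counts as presence (1) / absence (0).
binary <- as.data.frame((counts[, top_words, drop = FALSE] > 0) * 1L)
```

The result has one row per message and one 0/1 column per retained word, which mirrors the layout described for this data set (messages as observations, words as variables).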


datasets

Incremental Multiple Correspondence Analysis and Principal Component Analysis.

Angelos Markos

Incremental Decomposition Methods


A binary data frame with 39,861 observations (e-mail messages) on 80 variables (words).
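A quick way to confirm the documented format after loading the data might look like the sketch below. The helper `looks_like_enron` is hypothetical (it is not part of the package), and the package name `idm` is an assumption, since this page does not state it; the loading step is skipped when no package of that name is installed.

```r
# Hypothetical helper (not part of the package): checks that an object is a
# data frame of the documented size whose columns take at most two values.
looks_like_enron <- function(x, n_obs = 39861L, n_vars = 80L) {
  is.data.frame(x) &&
    nrow(x) == n_obs &&
    ncol(x) == n_vars &&
    all(vapply(x, function(col) length(unique(col)) <= 2L, logical(1)))
}

# Assumption: the data set ships with a package named "idm".
if (requireNamespace("idm", quietly = TRUE)) {
  data("enron", package = "idm")
  print(dim(enron))               # number of observations and variables
  print(looks_like_enron(enron))  # TRUE if the shape matches the description
}
```

The check counts distinct values per column rather than testing for 0 and 1, because this page does not say how presence and absence are encoded.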

enron: enron data set
