Patrick Perry

7 packages on CRAN

bcv

CRAN, 99.99th percentile

Methods for choosing the rank of an SVD approximation via cross-validation. The package provides both Gabriel-style "block" holdouts and Wold-style "speckled" holdouts. It also includes an implementation of the SVDImpute algorithm. For more information about bi-cross-validation, see Owen & Perry's 2009 Annals of Applied Statistics article (at http://arxiv.org/abs/0908.2062) and Perry's 2009 PhD thesis (at http://arxiv.org/abs/0909.3052).

corpus

CRAN, 99.99th percentile

Text corpus data analysis, with full support for Unicode. Functions for reading data from newline-delimited JSON files, for normalizing and tokenizing text, for searching for term occurrences, and for computing term occurrence frequencies (including n-grams).
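
To make the description concrete, here is a minimal sketch of the same kind of pipeline (read newline-delimited JSON, normalize and tokenize, count n-grams) using only the Python standard library. It is not the corpus package's R API: the helper names `parse_ndjson_lines`, `tokenize`, and `ngram_counts` are hypothetical, the `"text"` field name is an assumption, and whitespace splitting is a deliberate simplification of real Unicode-aware tokenization.

```python
import json
import unicodedata
from collections import Counter


def parse_ndjson_lines(lines):
    """Parse newline-delimited JSON: one JSON value per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def tokenize(text):
    """Normalize to NFKC, case-fold, then split on whitespace (a simplification)."""
    return unicodedata.normalize("NFKC", text).casefold().split()


def ngram_counts(texts, n=1):
    """Count n-grams (as tuples of tokens) across an iterable of texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted views of the token list to form n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical input: each line is a JSON object with a "text" field.
records = parse_ndjson_lines(['{"text": "The cat sat"}', "", '{"text": "the cat ran"}'])
bigrams = ngram_counts((r["text"] for r in records), n=2)
```

Case folding and NFKC normalization are what let "The cat" and "the cat" count as the same bigram here; a real tokenizer would also handle punctuation and word boundaries.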

mbest

CRAN, 99.99th percentile

Fast moment-based hierarchical model fitting. Implements methods from the papers "Fast Moment-Based Estimation for Hierarchical Models," by Perry (2017) and "Fitting a Deeply Nested Hierarchical Model to a Large Book Review Dataset Using a Moment-Based Estimator," by Zhang, Schmaus, and Perry (2018).

RMTstat

CRAN, 99.99th percentile

Functions for working with the Tracy-Widom laws and other distributions related to the eigenvalues of large Wishart matrices. The tables for computing the Tracy-Widom densities and distribution functions were computed using Momar Dieng's MATLAB package "RMLab" (formerly available on his homepage at http://math.arizona.edu/~momar/research.htm). This package is part of a collaboration between Iain Johnstone, Zongming Ma, Patrick Perry, and Morteza Shahram. It will soon be replaced by a package with more accuracy and built-in support for relevant statistical tests.

utf8

CRAN, 99.99th percentile

Process and print 'UTF-8' encoded international text (Unicode). Input, validate, normalize, encode, format, and display.
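
As an illustration of what validating, normalizing, and encoding text involve, the sketch below uses only the Python standard library. It does not reproduce the utf8 package's R functions; the helper names `is_valid_utf8`, `normalize_nfc`, and `escape_non_ascii` are hypothetical, and NFC is just one possible choice of normalization form.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def normalize_nfc(text: str) -> str:
    """Compose characters into canonical (NFC) form."""
    return unicodedata.normalize("NFC", text)


def escape_non_ascii(text: str) -> str:
    """Replace non-ASCII characters with backslash escapes for safe display."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


valid = is_valid_utf8(b"caf\xc3\xa9")      # well-formed two-byte sequence
invalid = is_valid_utf8(b"caf\xe9")        # Latin-1 byte, truncated in UTF-8
composed = normalize_nfc("e\u0301")        # 'e' + combining acute accent
escaped = escape_non_ascii("caf\u00e9")
```

Normalizing before comparison matters because the same visible character can be stored either as one precomposed code point or as a base letter plus a combining mark.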

quanteda

CRAN, 99.99th percentile

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and ngrams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.

stylest

CRAN, 99.99th percentile

Estimates distinctiveness in speakers' (authors') style. Fits models that can be used for predicting speakers of new texts. Methods developed in Spirling et al. (2018) <doi:10.2139/ssrn.3235506> (working paper).