genscore

This repository contains the generalized score matching estimator introduced in the paper "Generalized Score Matching for Non-Negative Data" (http://www.jmlr.org/papers/volume20/18-278/18-278.pdf), an estimator for high-dimensional graphical models or parameters in truncated distributions. It is a generalization of the regularized score matching estimator in "Estimation of High-Dimensional Graphical Models Using Regularized Score Matching" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5476334/, https://github.com/linlina/highscore).

The current (second) version further extends the supported distributions to more general domain types:

  1. The real space,
  2. The non-negative orthant of the real space,
  3. A union of intervals as the uniform domain for each component,
  4. The (p-1)-dimensional simplex (with all components positive and summing to 1), and
  5. Intersections and unions of domains defined by polynomial inequalities.
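
To make the first few domain types concrete, the following base-R sketch checks whether a point lies in the non-negative orthant, in a union of intervals applied to each component, or in the simplex. The helper names (`in_orthant`, `in_intervals`, `in_simplex`), the use of closed intervals, and the numerical tolerance are assumptions made for illustration only; they are not part of the genscore API, and the package's own `make_domain` and `in_bound` functions should be preferred in practice.

```r
# Illustrative membership checks for some domain types.
# These are hypothetical helpers, not genscore functions.

# Non-negative orthant: every component is >= 0.
in_orthant <- function(x) all(x >= 0)

# Union of intervals shared by every component: each component must fall in
# at least one closed interval [lefts[i], rights[i]] (closedness is assumed).
in_intervals <- function(x, lefts, rights) {
  stopifnot(length(lefts) == length(rights), all(lefts <= rights))
  all(vapply(x, function(xi) any(xi >= lefts & xi <= rights), logical(1)))
}

# (p-1)-dimensional simplex: all components positive and summing to 1,
# up to a numerical tolerance (the tolerance is an assumption of this sketch).
in_simplex <- function(x, tol = 1e-8) all(x > 0) && abs(sum(x) - 1) < tol

x <- c(0.2, 0.3, 0.5)
in_orthant(x)                                       # TRUE
in_intervals(x, lefts = c(0, 2), rights = c(1, 3))  # TRUE
in_simplex(x)                                       # TRUE
in_simplex(c(0.5, 0.6))                             # FALSE: sums to 1.1
```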

The distributions covered include

  1. the univariate truncated normal distribution,
  2. Gaussian graphical models,
  3. truncated Gaussian graphical models,
  4. exponential square-root graphical models (Inouye et al, 2016),
  5. "gamma graphical models" (Yu et al, 2019),
  6. "a-b models" (Yu et al, 2019), and
  7. the A^d model (Aitchison, 1985).
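
As a point of reference for the first distribution in this list, the sketch below draws from a univariate normal distribution truncated to an interval using inverse-CDF sampling in base R. This is a textbook technique shown purely for illustration: it is not the package's sampler, `rtruncnorm_icdf` is a hypothetical name, and the approach can lose accuracy when the interval lies far in a tail of the distribution.

```r
# Inverse-CDF sampler for a normal truncated to (left, right).
# Hypothetical helper for illustration; not a genscore function.
rtruncnorm_icdf <- function(n, mu = 0, sigma = 1, left = 0, right = Inf) {
  stopifnot(sigma > 0, left < right)
  # Probability mass of the untruncated normal at each endpoint.
  p_left <- pnorm(left, mean = mu, sd = sigma)
  p_right <- pnorm(right, mean = mu, sd = sigma)
  # Draw uniformly between the two CDF values, then map back with the quantile.
  u <- runif(n, min = p_left, max = p_right)
  qnorm(u, mean = mu, sd = sigma)
}

set.seed(1)
x <- rtruncnorm_icdf(1000, mu = 0, sigma = 1, left = 0, right = 2)
range(x)  # both values should lie within [0, 2]
```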

Installation from CRAN

install.packages("genscore")

Installation from GitHub

install.packages(c("devtools", "knitr"))
devtools::install_github("sqyu/genscore")

Usage

For a complete guide to usage, please consult the package vignette (a pre-compiled HTML version is also available).

vignette("gen_vignette")

References

Some parts of the code were initially derived from https://github.com/linlina/highscore and http://www1.maths.leeds.ac.uk/~wally.gilks/adaptive.rejection/web_page/Welcome.html.

John Aitchison. A general class of distributions on the simplex. Journal of the Royal Statistical Society: Series B (Methodological), 47(1):136–146, 1985. https://doi.org/10.1111/j.2517-6161.1985.tb01341.x

David Inouye, Pradeep Ravikumar, and Inderjit Dhillon. Square root graphical models: Multivariate generalizations of univariate exponential families that permit positive dependencies. In Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 2445–2453, 2016. http://proceedings.mlr.press/v48/inouye16.html

Lina Lin, Mathias Drton, and Ali Shojaie. Estimation of high-dimensional graphical models using regularized score matching. Electronic Journal of Statistics, 10(1):806–854, 2016. https://doi.org/10.1214/16-EJS1126

Shiqing Yu, Mathias Drton, and Ali Shojaie. Graphical models for non-negative data using generalized score matching. In International Conference on Artificial Intelligence and Statistics, pages 1781–1790, 2018. http://proceedings.mlr.press/v84/yu18b.html

Shiqing Yu, Mathias Drton, and Ali Shojaie. Generalized score matching for non-negative data. Journal of Machine Learning Research, 20(76):1–70, 2019. http://jmlr.org/papers/v20/18-278.html


Monthly Downloads: 202

Version: 1.0.2

License: GPL-3

Maintainer: Shiqing Yu

Last Published: April 27th, 2020

Functions in genscore (1.0.2)

beautify_rule

Replaces consecutive "&"s and "|"s in a string with a single & and |, respectively.
avgrocs

Takes the vertical average of ROC curves.
calc_crit

Calculates penalized or unpenalized loss in K and eta given arbitrary data.
cov_cons

Random generator of inverse covariance matrices.
compare_two_sub_results

Compares two lists returned from get_results().
binarySearch_bin

Finds the index of the bin a number belongs to using binary search.
AUC

Calculates the AUC of an ROC curve.
compare_two_results

Compares two lists returned from estimate().
crbound_mu

The Cramér-Rao lower bound (times n) for estimating the mean parameter from a univariate truncated normal sample with known variance parameter.
check_endpoints

Checks if two equally sized numeric vectors satisfy the requirements for being left and right endpoints of a domain defined as a union of intervals.
crbound_sigma

The Cramér-Rao lower bound (times n) for estimating the variance parameter from a univariate truncated normal sample with known mean parameter.
diff_lists

Computes the sum of absolute differences between two lists.
gen

Random data generator from general a-b distributions with general domain types, assuming a and b are rational numbers.
gcd

Finds the greatest (positive) common divisor of two integers.
diff_vecs

Computes the sum of absolute differences in the finite non-NA/NULL elements between two vectors.
domain_for_C

Returns a list to be passed to C that represents the domain.
frac_pow

Evaluate x^(a/b) and |x|^(a/b) with integer a and b with extension to conventional operations.
find_max_ind

Finds the max index in a vector that does not exceed a target number.
eBIC

eBIC score with or without refitting.
estimate

The main function for the generalized score-matching estimator for graphical models.
get_elts_exp

The R implementation to get the elements necessary for calculations for the exponential square-root setting (a=0.5, b=0.5).
get_elts_loglog

The R implementation to get the elements necessary for calculations for the log-log setting (a=0, b=0).
get_crit_nopenalty

Minimized loss for unpenalized restricted asymmetric models.
get_dist

Finds the distance of each element in a matrix x to the boundary of the domain while fixing the others in the same row.
get_elts_loglog_simplex

The R implementation to get the elements necessary for calculations for the log-log setting (a=0, b=0) on the p-simplex.
get_elts_gauss

The R implementation to get the elements necessary for calculations for the Gaussian setting on R^p.
get_elts_ab

The R implementation to get the elements necessary for calculations for general a and b.
get_elts

The function wrapper to get the elements necessary for calculations for all settings.
get_elts_gamma

The R implementation to get the elements necessary for calculations for the gamma setting (a=0.5, b=0).
get_elts_trun_gauss

The R implementation to get the elements necessary for calculations for the Gaussian setting (a=1, b=1) on domains other than R^p.
get_g0

Calculates the l2 distance to the boundary of the domain and its gradient for some domains.
get_h_hp_vector

Generator of h and hp (derivative of h) functions.
h_of_dist

Finds the distance of each element in a matrix x to the boundary of the domain while fixing the others in the same row (dist(x, domain)), and calculates element-wise h(dist(x, domain)) and h'(dist(x, domain)) (w.r.t. each element in x).
get_trun

The truncation point for h when h is truncated (bounded, but not naturally bounded).
in_bound

Returns whether a vector or each row of a matrix falls inside a domain.
get_g0_ada

Adaptively truncates the l2 distance to the boundary of the domain and its gradient for some domains.
interval_intersection

Finds the intersection between two unions of intervals.
get_postfix_rule

Changes a logical expression in infix notation to postfix notation using the shunting-yard algorithm.
naiveSearch_bin

Finds the index of the bin a number belongs to using naive search.
parse_ab

Parses an ab setting into rational numbers a and b.
mu_sigmasqhat

Estimates the mu and sigma squared parameters from a univariate truncated normal sample.
makecoprime

Makes two integers coprime.
ran_mat

Random generator of matrices with given eigenvalues.
parse_ineq

Parses an ineq expression into a list of elements that represents the ineq.
read_exponential

Parses the integer coefficient in an exponential term.
random_init_polynomial

Randomly generate an initial point in the domain defined by a single polynomial with no negative coefficient.
read_one_term

Parses the first term of a non-uniform expression.
tp_fp

Calculates the true and false positive rates given the estimated and true edges.
update_finite_infinity_for_uniform

Maximum between finite_infinity and 10 times the max abs value of finite elements in lefts and rights.
get_h_hp

Generator of h and hp (derivative of h) functions.
get_h_hp_adaptive

Generator of adaptive h and hp (derivative of h) functions.
random_init_uniform

Generates random numbers from a finite union of intervals.
test_lambda_bounds

Searches for a tight bound for \(\lambda_{\boldsymbol{K}}\) that gives the empty or complete graph starting from a given lambda with a given step size.
get_results

Estimate \(\mathbf{K}\) and \(\boldsymbol{\eta}\) using elts from get_elts() given one \(\lambda_{\mathbf{K}}\) (and \(\lambda_{\boldsymbol{\eta}}\) if non-profiled non-centered) and applying warm-start with strong screening rules.
make_domain

Creates a list of elements that defines the domain for a multivariate distribution.
make_folds

Helper function for making fold IDs for cross validation.
random_init_simplex

Generates a random point in the (p-1)-simplex.
s_output

Helper function for outputting if verbose.
search_bin

Finds the index of the bin a number belongs to.
rlaplace_truncated

Generates Laplace variables truncated to a finite union of intervals.
rexp_truncated

Generates translated and truncated exponential variables.
rlaplace_truncated_centered

Generates centered Laplace variables with scale 1.
varhat

Asymptotic variance (times n) of the estimator for mu or sigmasq for the univariate normal on a general domain assuming the other parameter is known.
read_exponent

Parses the exponent part into power_numer and power_denom.
s_at

Returns the character at a position of a string.
lambda_max

Analytic solution for the minimum \(\lambda_{\mathbf{K}}\) that gives the empty graph.
get_safe_log_h_hp

Asymptotic log of h and hp functions for large x for modes with an unbounded h.
interval_union

Finds the union between two unions of intervals.
read_uniform_term

Attempts to parse a single term in x into power_numer and power_denom.
refit

Loss for a refitted (restricted) unpenalized model.
test_lambda_bounds2

Searches for a tight bound for \(\lambda_{\boldsymbol{K}}\) that gives the empty or complete graph starting from a given lambda.
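
Several of the helpers listed above, such as binarySearch_bin and naiveSearch_bin, locate the bin that a number falls into. The sketch below shows the general idea with a binary search over sorted breakpoints in base R. The function name `find_bin`, its arguments, and its boundary conventions (half-open bins, NA outside the range) are assumptions made for illustration and may differ from the package's implementation.

```r
# Returns the index i such that breaks[i] <= x < breaks[i + 1],
# or NA if x lies outside [breaks[1], breaks[length(breaks)]).
# Hypothetical helper for illustration; not a genscore function.
find_bin <- function(breaks, x) {
  stopifnot(length(breaks) >= 2, !is.unsorted(breaks, strictly = TRUE))
  n <- length(breaks)
  if (x < breaks[1] || x >= breaks[n]) return(NA_integer_)
  lo <- 1L
  hi <- n  # loop invariant: breaks[lo] <= x < breaks[hi]
  while (hi - lo > 1L) {
    mid <- (lo + hi) %/% 2L
    if (x < breaks[mid]) hi <- mid else lo <- mid
  }
  lo
}

breaks <- c(0, 1, 2.5, 4, 10)
find_bin(breaks, 3)   # 3, since 2.5 <= 3 < 4
find_bin(breaks, 0)   # 1
find_bin(breaks, 10)  # NA, outside the half-open range
```

For values inside the range, the result agrees with base R's `findInterval(x, breaks)`, which is a convenient cross-check when experimenting with the sketch.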