Note: a newer version (2.3.4) of this package is available.

Welcome to the R package admix

The goal of admix is to provide code for estimation, hypothesis testing and clustering methods in admixture models.

Recall that an admixture model has the following cumulative distribution function (cdf): L(x) = p F(x) + (1 − p) G(x),   x ∈ ℝ,

where G is a perfectly known cdf, and p and F are unknown.

The cdf F describes the contamination that is added to the known signal G, in proportion p.

The proportion of the unknown component in the two-component mixture model can be estimated under weak nonparametric assumptions on the related distribution. The decontaminated version of this unknown component distribution can then be tested against some other specified distribution (including another decontaminated unknown component). Finally, clustering of K populations is made possible, based on hypothesis tests that compare unknown component distributions. The package is suited to one-sample as well as multi-sample analysis.
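
To make the notion of a decontaminated distribution concrete: when p > 0, the mixture identity L(x) = p F(x) + (1 − p) G(x) can be solved for F. The plug-in version shown next to it is only a sketch of the idea, assuming an estimate of the weight (written \hat p) and the empirical cdf of the sample (written \hat L_n). Both symbols are notation introduced here, and the estimators implemented in admix may differ in detail.

```latex
F(x) = \frac{L(x) - (1-p)\,G(x)}{p},
\qquad
\hat F_n(x) = \frac{\hat L_n(x) - (1-\hat p)\,G(x)}{\hat p},
\qquad x \in \mathbb{R}.
```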

Installation

You can install the released version of admix from GitHub with:

# once on CRAN: install.packages("admix")
# for now, from GitHub:
remotes::install_github(repo = "XavierMilhaud/admix@main", build_manual = TRUE, build_vignettes = FALSE)

The optional argument build_vignettes can be set to TRUE to build the vignettes, which explain the functionalities of the package.

Once the package is installed, browse its documentation with:

help(package = 'admix')

More details can be found in the vignettes, available on the admix GitHub Pages site (see https://xaviermilhaud.github.io/admix/, under the Articles menu).

Example

This basic example shows how to estimate the unknown component proportion and the location shift parameter in an admixture model where the unknown component density is assumed to be symmetric. In this setting, the cdf L is given by L(x) = p F(x − μ) + (1 − p) G(x),   x ∈ ℝ, where p is the unknown component weight and μ is the location shift parameter of the unknown cdf F with symmetric density.

The estimation can be performed with the following commands:

library(admix)
## Simulate data:
list.comp <- list(f = 'norm', g = 'norm')
list.param <- list(f = list(mean = 3, sd = 0.5),
                   g = list(mean = 0, sd = 1))
data1 <- rsimmix(n = 1000, unknownComp_weight = 0.8, list.comp, list.param)[['mixt.data']]
## Estimate the parameters as in a real-life setting (unknown component f left unspecified):
list.comp <- list(f = NULL, g = 'norm')
list.param <- list(f = NULL, g = list(mean = 0, sd = 1))
BVdk_estimParam(data1, method = 'L-BFGS-B', list.comp, list.param)
#> [1] 0.7977239 3.0114174
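
For reference, the sample simulated above corresponds to p = 0.8 and μ = 3, with a normal unknown component of standard deviation 0.5 and a standard normal known component. Writing Φ for the standard normal cdf (notation introduced here, not used in the package documentation), the simulated mixture reads:

```latex
L(x) = 0.8\,\Phi\!\left(\frac{x - 3}{0.5}\right) + 0.2\,\Phi(x),
\qquad x \in \mathbb{R}.
```

Assuming the two values returned are, in order, the estimated weight and the estimated location shift, the output (about 0.798 and 3.011) is close to the values 0.8 and 3 used in the simulation.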

Install: install.packages('admix')
Monthly Downloads: 651
Version: 2.1-3
License: GPL (>= 3)
Maintainer: Xavier Milhaud
Last Published: April 25th, 2024
Functions in admix (2.1-3)

IBM_greenLight_criterion: Green-light criterion to decide whether to perform the full equality test between unknown components of two admixture models
PatraSen_dist_calc: Compute the distance to be minimized using Patra and Sen estimation technique in admixture models
PatraSen_density_est: Compute the estimate of the density of the unknown component in an admixture model
PatraSen_est_mix_model: Estimate by Patra and Sen the unknown component weight as well as the unknown distribution in admixture models
IBM_k_samples_test: Equality test of unknown component distributions in K admixture models, with IBM approach
PatraSen_cv_mixmodel: Cross-validation estimate (by Patra and Sen) of the unknown component weight as well as the unknown distribution in an admixture model
IBM_theoretical_contrast: Theoretical contrast in the Inversion - Best Matching (IBM) method
IBM_theoretical_gap: Difference between unknown cumulative distribution functions of admixture models at some given point
IBM_hessian_contrast: Hessian matrix of the contrast function in the Inversion - Best Matching (IBM) method
IBM_tabul_stochasticInteg: Distribution of the contrast in the Inversion - Best Matching (IBM) method
estimVarCov_empProcess: Variance-covariance matrix of the empirical process in an admixture model
admix_estim: Estimate the unknown parameters of the admixture model(s) under study
admix_test: Hypothesis test between unknown components of the admixture models under study
admix_clustering: Clustering of K populations following admixture models
admix-package: admix: Package Admix for Admixture (aka Contamination) Models
decontaminated_density: Provide the decontaminated density of the unknown component in an admixture model
gaussianity_test: One-sample Gaussianity test in admixture models using Bordes and Vandekerkhove estimation method
detect_support_type: Detect the support of the random variables under study
decontaminated_cdf: Provide the decontaminated cumulative distribution function (CDF) of the unknown component in an admixture model
allGalaxies: Four galaxies (Carina, Sextans, Sculptor, Fornax) measurements of heliocentric velocities from SIMBAD astronomical database
orthoBasis_coef: Compute expansion coefficients in a given orthonormal polynomial basis
plot_mixt_density: Plot the density of some given sample(s) with mixture distributions
knownComp_to_uniform: Transforms the known component of the admixture distribution to a Uniform distribution
kernel_cdf: Kernel estimation
is_equal_knownComp: Test for equality of the known components between two admixture models
plot.decontaminated_density: Plot the decontaminated density of the unknown component for an estimated admixture model
kernel_density: Kernel estimation
milkyWay: Heliocentric velocity measured for the Milky Way (from Walker, M. G., M. Mateo, E. W. Olszewski, O. Y. Gnedin, X. Wang, B. Sen, and M. Woodroofe (2007). Velocity dispersion profiles of seven dwarf spheroidal galaxies. Astrophysical J. 667(1), L53–L56)
mortality_sample: Dataset giving exposure-to-death (population size) and number of deaths for males in eleven European countries, with ages ranging from 30 years old to 85 years old
orthoBasis_test_H0: Equality test of unknown components between two admixture models using polynomial basis expansions
print.admix_estim: Print the results of estimated parameters from K admixture models
sim_gaussianProcess: Simulation of a Gaussian process
poly_orthonormal_basis: Build an orthonormal basis to decompose some given probability density function
print.admix_test: Print the results of statistical test for equality of unknown component distributions in admixture models
print.admix_cluster: Results of the clustering algorithm performed over the K populations following admixture models
stmf_small: Short-term Mortality Fluctuations (STMF) data series, restricted to 6 countries (Belgium, France, Italy, Netherlands, Spain, Germany)
rsimmix_mix: Simulation of a two-component Gaussian mixture with one component following a two-component Gaussian mixture
rsimmix: Simulation of a two-component mixture model
two_samples_test: Two-sample hypothesis test on the unknown component in admixture models
IBM_estimVarCov_gaussVect: Nonparametric estimation of the variance-covariance matrix of the Gaussian vector in IBM approach
BVdk_ML_varCov_estimators: Maximum Likelihood estimation of the variance of the unknown density variance estimator in an admixture model
IBM_2samples_test: Equality test of unknown component distributions in two admixture models with IBM approach
BVdk_varCov_estimators: Estimation of the variance of the estimators in admixture models with symmetric unknown density
BVdk_contrast_gradient: Gradient of the contrast as defined in Bordes & Vandekerkhove (2010)
BVdk_estimParam: Estimation of the parameters in a two-component admixture model with symmetric unknown density
IBM_empirical_contrast: Empirical computation of the contrast in the Inversion - Best Matching (IBM) method
BVdk_contrast: Contrast as defined in Bordes & Vandekerkhove (2010)
IBM_gap: Difference between the unknown empirical cumulative distribution functions in two admixture models
IBM_estimProp: Estimate the weights related to the proportions of the unknown components of the two admixture models