PLNfit: An R6 Class to represent a PLNfit in a standard, general framework

Description

The function PLN() fits a model which is an instance of an object of class PLNfit. Objects produced by the functions PLNnetwork(), PLNPCA(), PLNmixture() and PLNLDA() also inherit the methods of PLNfit.

This class comes with a set of R6 methods, some of which are useful for the user and exported as S3 methods. See the documentation for coef(), sigma(), predict(), vcov() and standard_error().

Fields are accessed via active bindings and cannot be changed by the user.

Arguments

Active bindings

n: number of samples
q: number of dimensions of the latent space
p: number of species
d: number of covariates
model_par: a list with the matrices of parameters found in the model (Theta, Sigma, plus some others depending on the variant)
fisher: Variational approximation of the Fisher Information matrix
std_err: Variational approximation of the variance-covariance matrix of the model parameter estimates.
var_par: a list with two matrices, M and S2, which are the estimated parameters in the variational approximation
gen_par: a list with two parameters, sigma2 and rho, only used with the genetic covariance model
latent: a matrix: values of the latent vector (Z in the model)
latent_pos: a matrix: values of the latent position vector (Z) without covariates effects or offset
fitted: a matrix: fitted values of the observations (A in the model)
nb_param: number of parameters in the current PLN model
vcov_model: character: the model used for the covariance (either "spherical", "diagonal" or "full")
optim_par: a list with parameters useful for monitoring the optimization
weights: observational weights
loglik: (weighted) variational lower bound of the loglikelihood
loglik_vec: element-wise variational lower bound of the loglikelihood
BIC: variational lower bound of the BIC
entropy: Entropy of the variational distribution
ICL: variational lower bound of the ICL
R_squared: approximated goodness-of-fit criterion
criteria: a vector with loglik, BIC, ICL and number of parameters
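
The criteria listed above are related to one another. As a rough sketch, assuming the penalized-likelihood convention in which larger values are better (BIC = loglik - 0.5 * nb_param * log(n), ICL = BIC - entropy), they could be assembled as follows. The helper name `pln_criteria` is hypothetical and the numeric inputs are made up for illustration; this is not the package's code.

```r
# Hypothetical helper: assemble model selection criteria from a
# log-likelihood lower bound. The sign convention (larger is better)
# is an assumption of this sketch.
pln_criteria <- function(loglik, nb_param, n, entropy) {
  BIC <- loglik - 0.5 * nb_param * log(n)  # penalize the number of parameters
  ICL <- BIC - entropy                     # additionally penalize the entropy
  c(nb_param = nb_param, loglik = loglik, BIC = BIC, ICL = ICL)
}

# Made-up values, for illustration only
crit <- pln_criteria(loglik = -1200, nb_param = 170, n = 49, entropy = 35)
print(crit)
```

Under this convention the three quantities are ordered as ICL <= BIC <= loglik whenever the entropy is non-negative.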

Methods

Public methods

Method `update()`

Update a PLNfit object

Usage

PLNfit$update(
  Theta = NA,
  Sigma = NA,
  M = NA,
  S2 = NA,
  Ji = NA,
  R2 = NA,
  Z = NA,
  A = NA,
  monitoring = NA
)

Arguments

Theta: matrix of regression parameters
Sigma: variance-covariance matrix of the latent variables
M: matrix of mean vectors for the variational approximation
S2: matrix of variance vectors for the variational approximation
Ji: vector of variational lower bounds of the log-likelihoods (one value per sample)
R2: approximate R^2 goodness-of-fit criterion
Z: matrix of latent vectors (includes covariates and offset effects)
A: matrix of fitted values
monitoring: a list with optimization monitoring quantities

Returns

Update the current PLNfit object

Method `new()`

Initialize a PLNfit model

Usage

PLNfit$new(responses, covariates, offsets, weights, formula, xlevels, control)

Arguments

responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights: an optional vector of observation weights to be used in the fitting process.
formula: model formula used for fitting, extracted from the formula in the upper-level call
xlevels: named list of factor levels included in the model, extracted from the formula in the upper-level call and used for predictions.
control: a list for controlling the optimization. See details.

Method `optimize()`

Call to the C++ optimizer and update of the relevant fields

Usage

PLNfit$optimize(responses, covariates, offsets, weights, control)

Arguments

responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights: an optional vector of observation weights to be used in the fitting process.
control: a list for controlling the optimization. See details.

Method `VEstep()`

Result of one call to the VE step of the optimization procedure: optimal variational parameters (M, S2) and corresponding log-likelihood values for fixed model parameters (Sigma, Theta). Intended to position new data in the latent space.

Usage

PLNfit$VEstep(
  covariates,
  offsets,
  responses,
  weights,
  Theta = self$model_par$Theta,
  Sigma = self$model_par$Sigma,
  control = list()
)

Arguments

covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights: an optional vector of observation weights to be used in the fitting process.
Theta: optional fixed value of the regression parameters
Sigma: optional fixed value of the covariance parameters
control: a list for controlling the optimization. See details.

Returns

A list with three components:

the matrix M of variational means,
the matrix S2 of variational variances
the vector log.lik of (variational) log-likelihood of each new observation
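
To make the idea of a VE step concrete, here is a minimal base R sketch that maximizes a per-observation variational lower bound over the variational means and variances while the model parameters stay fixed. It is not the package's C++ optimizer. The function names `elbo_i` and `ve_step`, the d x p orientation of Theta, the centering of M around the linear predictor, the omission of weights, and the use of `optim()` are all assumptions made for illustration.

```r
# Variational lower bound for one observation, with q(Z) = N(mu + m, diag(s2))
# and prior Z ~ N(mu, Sigma). Variances are optimized on the log scale so
# that they stay positive.
elbo_i <- function(par, y, mu, Omega, logdet_Sigma) {
  p  <- length(y)
  m  <- par[1:p]
  s2 <- exp(par[(p + 1):(2 * p)])
  A  <- exp(mu + m + s2 / 2)                      # expected counts under q
  sum(y * (mu + m) - A - lgamma(y + 1)) -
    0.5 * (drop(crossprod(m, Omega %*% m)) + sum(diag(Omega) * s2)) -
    0.5 * logdet_Sigma + 0.5 * sum(log(s2)) + p / 2
}

# One VE step: optimize (M, S2) row by row for fixed Theta and Sigma
ve_step <- function(Y, X, O, Theta, Sigma) {
  n <- nrow(Y); p <- ncol(Y)
  Omega <- solve(Sigma)
  logdet_Sigma <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  Mu <- O + X %*% Theta                           # n x p linear predictor
  M <- S2 <- matrix(0, n, p)
  log.lik <- numeric(n)
  for (i in seq_len(n)) {
    fit <- optim(rep(0, 2 * p), elbo_i, y = Y[i, ], mu = Mu[i, ],
                 Omega = Omega, logdet_Sigma = logdet_Sigma,
                 method = "BFGS", control = list(fnscale = -1))  # maximize
    M[i, ]     <- fit$par[1:p]
    S2[i, ]    <- exp(fit$par[(p + 1):(2 * p)])
    log.lik[i] <- fit$value
  }
  list(M = M, S2 = S2, log.lik = log.lik)
}

# Toy data, simulated for illustration only
set.seed(1)
n <- 5; p <- 3; d <- 2
X <- cbind(1, rnorm(n))
O <- matrix(0, n, p)
Theta <- matrix(c(1, 0.5, 0.2, -0.3, 0.8, 0.1), d, p)
Sigma <- diag(0.5, p)
Y <- matrix(rpois(n * p, exp(X %*% Theta)), n, p)
res <- ve_step(Y, X, O, Theta, Sigma)
str(res)
```

The result has the same three components as the list described above: one row of M and S2 and one lower-bound value per observation.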

Method `set_R2()`

Update R2 field after optimization

Usage

PLNfit$set_R2(responses, covariates, offsets, weights, nullModel = NULL)

Arguments

responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights: an optional vector of observation weights to be used in the fitting process.
nullModel: null model used for approximate R2 computations. Defaults to a GLM model with the same design matrix but no latent variable.

Method `compute_fisher()`

Safely compute the fisher information matrix (FIM)

Usage

PLNfit$compute_fisher(type = c("wald", "louis"), X = NULL)

Arguments

type: approximation scheme to compute the Fisher information matrix. Either wald (default) or louis. type = "louis" results in smaller confidence intervals.
X: design matrix used to compute the FIM

Returns

a sparse matrix with sensible dimension names

Method `compute_standard_error()`

Compute univariate standard error for coefficients of Theta from the FIM

Usage

PLNfit$compute_standard_error()

Returns

a matrix of standard deviations.
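
As a generic illustration of how standard errors are usually derived from an information matrix, the sketch below inverts a toy FIM and takes the square root of its diagonal. The numbers, the parameter naming scheme and the p x d layout are made up, and whether the package applies any additional scaling (for instance by the sample size) is not stated here, so treat this as an assumption-laden sketch rather than the method's actual code.

```r
# Toy Fisher information matrix for p = 2 species and d = 2 covariates
# (made-up numbers; the row/column naming scheme is hypothetical)
par_names <- c("sp1_int", "sp2_int", "sp1_x", "sp2_x")
fim <- matrix(c(40,  2,  5,  0,
                 2, 35,  0,  4,
                 5,  0, 60,  3,
                 0,  4,  3, 50),
              4, 4, dimnames = list(par_names, par_names))

# Asymptotic variances are the diagonal of the inverse information matrix
std_err <- sqrt(diag(solve(fim)))

# Arrange as a p x d matrix, mirroring the shape of a coefficient matrix
std_err_mat <- matrix(std_err, nrow = 2, ncol = 2,
                      dimnames = list(c("sp1", "sp2"), c("(Intercept)", "x")))
print(std_err_mat)
```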

Method `postTreatment()`

Update R2, fisher and std_err fields after optimization

Usage

PLNfit$postTreatment(
  responses,
  covariates,
  offsets,
  weights = rep(1, nrow(responses)),
  type = c("wald", "louis", "none"),
  nullModel = NULL
)

Arguments

responses: the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates: design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets: offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights: an optional vector of observation weights to be used in the fitting process.
type: approximation scheme to compute the Fisher information matrix. Either wald (default) or louis. type = "louis" results in smaller confidence intervals.
nullModel: null model used for approximate R2 computations. Defaults to a GLM model with the same design matrix but no latent variable.

Method `predict()`

Predict position, scores or observations of new data.

Usage

PLNfit$predict(newdata, type = c("link", "response"), envir = parent.frame())

Arguments

newdata: A data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
type: Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).
envir: Environment in which the prediction is evaluated

Returns

A matrix of predicted scores or counts.
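
The difference between the two scales can be sketched in base R. On the link scale, the prediction is the linear predictor O + X Theta. On the response scale, one natural choice is the marginal mean of a Poisson lognormal variable, exp(mean + variance / 2). The function name `predict_pln_sketch` and the d x p orientation of Theta are assumptions for illustration; this is not the package's implementation.

```r
# Hypothetical sketch of link-scale vs response-scale predictions
predict_pln_sketch <- function(X, O, Theta, Sigma,
                               type = c("link", "response")) {
  type <- match.arg(type)
  link <- O + X %*% Theta               # n x p matrix of latent means
  if (type == "link") return(link)
  # Marginal mean of a Poisson lognormal variable: exp(mean + variance / 2)
  exp(sweep(link, 2, diag(Sigma) / 2, "+"))
}

# Made-up inputs: 3 new sites, 2 species, intercept plus one covariate
X <- cbind(1, c(-1, 0, 1))
O <- matrix(0, 3, 2)
Theta <- matrix(c(1, 0.5, 0, -0.5), 2, 2)
Sigma <- diag(c(0.2, 0.4))

pred_link <- predict_pln_sketch(X, O, Theta, Sigma, "link")
pred_resp <- predict_pln_sketch(X, O, Theta, Sigma, "response")
print(pred_link)
print(pred_resp)
```

Because the latent variances are positive, the response-scale predictions in this sketch are always larger than the exponentiated link-scale ones.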

Method `predict_cond()`

Predict position, scores or observations of new data, conditionally on the observation of a (set of) variables

Usage

PLNfit$predict_cond(
  newdata,
  cond_responses,
  type = c("link", "response"),
  var_par = FALSE,
  envir = parent.frame()
)

Arguments

newdata: a data frame containing the covariates of the sites where to predict
cond_responses: a data frame containing the counts of the observed variables (matching the names provided as data in the PLN function)
type: Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).
var_par: Boolean. Should new estimates of the variational parameters of mean and variance be sent back, as attributes of the matrix of predictions? Defaults to FALSE.
envir: Environment in which the prediction is evaluated

Returns

A matrix of predicted scores or counts.

Method `show()`

User friendly print method

Usage

PLNfit$show(
  model = paste("A multivariate Poisson Lognormal fit with", private$covariance,
    "covariance model.\n")
)

Arguments

model: First line of the print output

Method `print()`

User friendly print method

Usage

PLNfit$print()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

PLNfit$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Details

The parameter control is a list controlling the optimization with the following entries:

"covariance": character setting the model for the covariance matrix. Either "full", "diagonal", "spherical" or "genetic". Default is "full".
"corr_matrix": a symmetric positive definite correlation matrix used for the "genetic" model of covariance. Ignored in other cases.
"trace": integer for verbosity.
"inception": sets up the initialization. By default, the model is initialized with a multivariate linear model applied on log-transformed data, and with the same formula as the one provided by the user. However, the user can provide a PLNfit (typically obtained from a previous fit), which sometimes speeds up the inference.
"ftol_rel": stop when an optimization step changes the objective function by less than ftol_rel multiplied by the absolute value of the objective function. Default is 1e-6 when n < p, 1e-8 otherwise.
"ftol_abs": stop when an optimization step changes the objective function by less than ftol_abs. Default is 0.
"xtol_rel": stop when an optimization step changes every parameter by less than xtol_rel multiplied by the absolute value of the parameter. Default is 1e-4.
"xtol_abs": stop when an optimization step changes every parameter by less than xtol_abs. Default is 0.
"maxeval": stop when the number of iterations exceeds maxeval. Default is 10000.
"maxtime": stop when the optimization time (in seconds) exceeds maxtime. Default is -1 (no restriction).
"algorithm": the optimization method used by NLOPT among LD type, i.e. "CCSAQ", "MMA", "LBFGS", "VAR1", "VAR2". See NLOPT documentation for further details. Default is "CCSAQ".

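As a rough sketch of how the tolerance entries of control interact, the base R helper below checks the stopping rules between two successive iterates, assuming the usual NLopt semantics: relative tolerances are scaled by the current value, absolute ones are not. The helper name `should_stop` and the example iterates are hypothetical, and the actual optimizer is not implemented this way line for line.

```r
# Control list using entry names and default values from the list above
control <- list(ftol_rel = 1e-8, ftol_abs = 0,
                xtol_rel = 1e-4, xtol_abs = 0,
                maxeval = 10000)

# Hypothetical helper: TRUE when any stopping rule fires between two iterates
should_stop <- function(f_old, f_new, x_old, x_new, n_eval, control) {
  df <- abs(f_new - f_old)   # change in the objective function
  dx <- abs(x_new - x_old)   # change in each parameter
  df < control$ftol_rel * abs(f_new) ||
    df < control$ftol_abs ||
    all(dx < control$xtol_rel * abs(x_new)) ||
    all(dx < control$xtol_abs) ||
    n_eval >= control$maxeval
}

# Tiny change in the objective: the relative function tolerance fires
stop_small <- should_stop(-100, -100 + 1e-9, c(1, 2), c(1.5, 2.5),
                          n_eval = 10, control = control)
# Large changes everywhere and few evaluations: no rule fires
stop_large <- should_stop(-100, -90, c(1, 2), c(1.5, 2.5),
                          n_eval = 10, control = control)
print(c(stop_small = stop_small, stop_large = stop_large))
```

With the default value of 0, the absolute tolerances never fire, so in practice the relative tolerances and maxeval decide when the optimization ends.
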
Examples

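if (FALSE) {
library(PLNmodels)  # provides PLN(), prepare_data() and the trichoptera data set
data(trichoptera)
# Combine the count table and the covariates into a single data frame
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
# Fit a PLN model with an intercept only
myPLN <- PLN(Abundance ~ 1, data = trichoptera)
class(myPLN)
print(myPLN)
}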
