PLNmodels (version 0.11.7)

PLNfit: An R6 Class to represent a PLNfit in a standard, general framework

Description

The function PLN() fits a model and returns an object of class PLNfit. Objects produced by the functions PLNnetwork(), PLNPCA(), PLNmixture() and PLNLDA() also have access to the methods of PLNfit by inheritance.

This class comes with a set of R6 methods, some of which are useful to the user and are exported as S3 methods. See the documentation for coef(), sigma(), predict(), vcov() and standard_error().

Fields are accessed via active bindings and cannot be changed by the user.
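As a minimal sketch of how a fitted object is typically queried (assuming the PLNmodels package is installed, and reusing the trichoptera data from the Examples section), the exported S3 methods and the read-only fields can be used side by side. The variable names below are illustrative only.

```r
library(PLNmodels)

# Fit an intercept-only PLN model on the trichoptera data shipped with the package
data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
myPLN <- PLN(Abundance ~ 1, data = trichoptera, control = list(trace = 0))

# Exported S3 methods
Theta_hat <- coef(myPLN)    # regression parameters (Theta)
Sigma_hat <- sigma(myPLN)   # covariance matrix of the latent variables (Sigma)

# Read-only fields exposed as active bindings
dims <- c(n = myPLN$n, p = myPLN$p, d = myPLN$d)
print(dims)
print(myPLN$criteria)
```

The fields are read with the usual `$` operator; no setter is involved.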

Active bindings

n

number of samples

q

number of dimensions of the latent space

p

number of species

d

number of covariates

model_par

a list with the matrices of parameters found in the model (Theta, Sigma, plus some others depending on the variant)

fisher

Variational approximation of the Fisher Information matrix

std_err

Variational approximation of the variance-covariance matrix of the model parameter estimates.

var_par

a list with two matrices, M and S2, which are the estimated parameters in the variational approximation

gen_par

a list with two parameters, sigma2 and rho, only used with the genetic covariance model

latent

a matrix: values of the latent vector (Z in the model)

latent_pos

a matrix: values of the latent position vector (Z) without covariate effects or offsets

fitted

a matrix: fitted values of the observations (A in the model)

nb_param

number of parameters in the current PLN model

vcov_model

character: the model used for the covariance (either "spherical", "diagonal" or "full")

optim_par

a list with parameters useful for monitoring the optimization

weights

observational weights

loglik

(weighted) variational lower bound of the log-likelihood

loglik_vec

element-wise variational lower bound of the log-likelihood

BIC

variational lower bound of the BIC

entropy

Entropy of the variational distribution

ICL

variational lower bound of the ICL

R_squared

approximated goodness-of-fit criterion

criteria

a vector with loglik, BIC, ICL and number of parameters
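The letters Y, X, O, Z, Theta and Sigma used on this page refer to the quantities of the Poisson lognormal model. The base R sketch below simulates one data set under the usual formulation, assuming that Z is Gaussian with mean O + X Theta' and covariance Sigma, and that Y given Z is Poisson with mean exp(Z). All sizes and parameter values are arbitrary, and the p x d orientation of Theta is an assumption made for the illustration.

```r
set.seed(1)

n <- 50; p <- 4; d <- 2            # samples, species, covariates (arbitrary sizes)

X <- cbind(1, rnorm(n))            # n x d design matrix with an intercept
O <- matrix(0, n, p)               # n x p offsets (none here)
Theta <- matrix(c(1.0,  0.5,
                  2.0, -0.5,
                  0.0,  0.3,
                  1.5,  0.2), nrow = p, byrow = TRUE)  # p x d (assumed orientation)
Sigma <- 0.5 * diag(p) + 0.1       # p x p positive definite covariance

# Latent Gaussian layer: rows of E are draws from N(0, Sigma)
E <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)
Z <- O + X %*% t(Theta) + E

# Observation layer: counts are Poisson with mean exp(Z), entry by entry
Y <- matrix(rpois(n * p, lambda = exp(Z)), n, p)
```

Here Y plays the role of the responses, X of the covariates and O of the offsets passed to the methods documented below.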

Methods


Method update()

Update a PLNfit object

Usage

PLNfit$update(
  Theta = NA,
  Sigma = NA,
  M = NA,
  S2 = NA,
  Ji = NA,
  R2 = NA,
  Z = NA,
  A = NA,
  monitoring = NA
)

Arguments

Theta

matrix of regression parameters

Sigma

variance-covariance matrix of the latent variables

M

matrix of mean vectors for the variational approximation

S2

matrix of variance vectors for the variational approximation

Ji

vector of variational lower bounds of the log-likelihoods (one value per sample)

R2

approximate R^2 goodness-of-fit criterion

Z

matrix of latent vectors (includes covariates and offset effects)

A

matrix of fitted values

monitoring

a list with optimization monitoring quantities

Returns

Update the current PLNfit object


Method new()

Initialize a PLNfit model

Usage

PLNfit$new(responses, covariates, offsets, weights, formula, xlevels, control)

Arguments

responses

the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class

covariates

design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class

offsets

offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class

weights

an optional vector of observation weights to be used in the fitting process.

formula

model formula used for fitting, extracted from the formula in the upper-level call

xlevels

named list of factor levels included in the model, extracted from the formula in the upper-level call and used for predictions.

control

a list for controlling the optimization. See details.


Method optimize()

Call to the C++ optimizer and update of the relevant fields

Usage

PLNfit$optimize(responses, covariates, offsets, weights, control)

Arguments

responses

the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class

covariates

design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class

offsets

offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class

weights

an optional vector of observation weights to be used in the fitting process.

control

a list for controlling the optimization. See details.


Method VEstep()

Result of one call to the VE step of the optimization procedure: optimal variational parameters (M, S2) and corresponding log-likelihood values for fixed model parameters (Sigma, Theta). Intended to position new data in the latent space.

Usage

PLNfit$VEstep(
  covariates,
  offsets,
  responses,
  weights,
  Theta = self$model_par$Theta,
  Sigma = self$model_par$Sigma,
  control = list()
)

Arguments

covariates

design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class

offsets

offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class

responses

the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class

weights

an optional vector of observation weights to be used in the fitting process.

Theta

Optional fixed value of the regression parameters

Sigma

Optional fixed value of the covariance parameters.

control

a list for controlling the optimization. See details.

Returns

A list with three components:

  • the matrix M of variational means,

  • the matrix S2 of variational variances

  • the vector log.lik of (variational) log-likelihood of each new observation


Method set_R2()

Update R2 field after optimization

Usage

PLNfit$set_R2(responses, covariates, offsets, weights, nullModel = NULL)

Arguments

responses

the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class

covariates

design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class

offsets

offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class

weights

an optional vector of observation weights to be used in the fitting process.

nullModel

null model used for approximate R2 computations. Defaults to a GLM model with the same design matrix but no latent variable.


Method compute_fisher()

Safely compute the Fisher information matrix (FIM)

Usage

PLNfit$compute_fisher(type = c("wald", "louis"), X = NULL)

Arguments

type

approximation scheme used to compute the Fisher information matrix. Either "wald" (default) or "louis". type = "louis" results in smaller confidence intervals.

X

design matrix used to compute the FIM

Returns

a sparse matrix with sensible dimension names


Method compute_standard_error()

Compute univariate standard error for coefficients of Theta from the FIM

Usage

PLNfit$compute_standard_error()

Returns

a matrix of standard deviations.


Method postTreatment()

Update R2, fisher and std_err fields after optimization

Usage

PLNfit$postTreatment(
  responses,
  covariates,
  offsets,
  weights = rep(1, nrow(responses)),
  type = c("wald", "louis", "none"),
  nullModel = NULL
)

Arguments

responses

the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class

covariates

design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class

offsets

offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class

weights

an optional vector of observation weights to be used in the fitting process.

type

approximation scheme used to compute the Fisher information matrix. One of "wald" (default), "louis" or "none", as listed in the usage above. type = "louis" results in smaller confidence intervals.

nullModel

null model used for approximate R2 computations. Defaults to a GLM model with the same design matrix but no latent variable.


Method predict()

Predict position, scores or observations of new data.

Usage

PLNfit$predict(newdata, type = c("link", "response"), envir = parent.frame())

Arguments

newdata

A data frame in which to look for variables with which to predict. If omitted, the fitted values are used.

type

Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).

envir

Environment in which the prediction is evaluated

Returns

A matrix of predicted scores or counts.
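The following is a minimal sketch of a call to this method, assuming the PLNmodels package is installed. For simplicity it reuses the training data as newdata; in practice newdata would hold the covariates of new sites.

```r
library(PLNmodels)

data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
myPLN <- PLN(Abundance ~ 1, data = trichoptera, control = list(trace = 0))

# Predictions on the link scale (positions in the latent space)
pred_link <- myPLN$predict(newdata = trichoptera, type = "link")

# Predictions on the response scale (counts)
pred_count <- myPLN$predict(newdata = trichoptera, type = "response")

dim(pred_count)
```

Both calls return one row per row of newdata and one column per species.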


Method predict_cond()

Predict position, scores or observations of new data, conditionally on the observation of a (set of) variables

Usage

PLNfit$predict_cond(
  newdata,
  cond_responses,
  type = c("link", "response"),
  var_par = FALSE,
  envir = parent.frame()
)

Arguments

newdata

a data frame containing the covariates of the sites at which to predict

cond_responses

a data frame containing the counts of the observed variables (with names matching those of the data provided to the PLN function)

type

Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).

var_par

Boolean. Should the new estimates of the variational parameters (mean and variance) be returned as attributes of the matrix of predictions? Defaults to FALSE.

envir

Environment in which the prediction is evaluated

Returns

A matrix of predicted scores or counts.


Method show()

User-friendly print method

Usage

PLNfit$show(
  model = paste("A multivariate Poisson Lognormal fit with", private$covariance,
    "covariance model.\n")
)

Arguments

model

First line of the print output


Method print()

User-friendly print method

Usage

PLNfit$print()


Method clone()

The objects of this class are cloneable with this method.

Usage

PLNfit$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Details

The parameter control is a list controlling the optimization with the following entries:

  • "covariance": character setting the model for the covariance matrix. Either "full", "diagonal", "spherical" or "genetic". Default is "full".

  • "corr_matrix": a symmetric positive definite correlation matrix used for the "genetic" model of covariance. Ignored otherwise.

  • "trace": integer for verbosity.

  • "inception": sets up the initialization. By default, the model is initialized with a multivariate linear model applied to log-transformed data, with the same formula as the one provided by the user. However, the user can provide a PLNfit (typically obtained from a previous fit), which sometimes speeds up the inference.

  • "ftol_rel": stop when an optimization step changes the objective function by less than ftol_rel multiplied by the absolute value of the objective function. Default is 1e-6 when n < p, 1e-8 otherwise.

  • "ftol_abs": stop when an optimization step changes the objective function by less than ftol_abs. Default is 0.

  • "xtol_rel": stop when an optimization step changes every parameter by less than xtol_rel multiplied by the absolute value of that parameter. Default is 1e-4.

  • "xtol_abs": stop when an optimization step changes every parameter by less than xtol_abs. Default is 0.

  • "maxeval": stop when the number of iterations exceeds maxeval. Default is 10000.

  • "maxtime": stop when the optimization time (in seconds) exceeds maxtime. Default is -1 (no restriction).

  • "algorithm": the optimization method used by NLopt, among the gradient-based (LD) algorithms, i.e. "CCSAQ", "MMA", "LBFGS", "VAR1", "VAR2". See the NLopt documentation for further details. Default is "CCSAQ".
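As a minimal sketch of how these entries are passed, assuming the PLNmodels package is installed in the version documented here (the control interface may differ in other releases), a control list can be handed to PLN() as follows. The chosen values are arbitrary and only meant as an illustration.

```r
library(PLNmodels)

data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)

# Diagonal covariance, silent fit, lower cap on the number of iterations
ctrl <- list(covariance = "diagonal", trace = 0, maxeval = 5000)
myPLN_diag <- PLN(Abundance ~ 1, data = trichoptera, control = ctrl)

# Covariance model actually used by the fit
myPLN_diag$vcov_model
```

Entries left out of the list keep the defaults described above.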

Examples

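library(PLNmodels)

# Load the example data and format it for PLN()
data(trichoptera)
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)

# Fit an intercept-only model; the result is a PLNfit object
myPLN <- PLN(Abundance ~ 1, data = trichoptera)
class(myPLN)
print(myPLN)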
