The function PLN() fits a model which is an instance of an object with class PLNfit.
Objects produced by the functions PLNnetwork(), PLNPCA(), PLNmixture() and PLNLDA() also enjoy the methods of PLNfit by inheritance.
This class comes with a set of R6 methods, some of them being useful for the user and exported as S3 methods.
See the documentation for coef(), sigma(), predict(), vcov() and standard_error().
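As a brief illustration of the S3 accessors named above, the sketch below fits the intercept-only model from this page's example and calls coef() and sigma() on the result. It assumes the PLNmodels package is installed and is skipped otherwise; only dimensions are printed, since the actual values depend on the fit.

```r
# Sketch only: runs when PLNmodels is available, otherwise does nothing.
if (requireNamespace("PLNmodels", quietly = TRUE)) {
  library(PLNmodels)
  data(trichoptera)
  trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
  myPLN <- PLN(Abundance ~ 1, data = trichoptera)

  # S3 accessors exported for PLNfit objects
  print(class(myPLN))
  print(dim(coef(myPLN)))   # regression parameters
  print(dim(sigma(myPLN)))  # covariance of the latent variables
}
```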
Fields are accessed via active binding and cannot be changed by the user.
n
number of samples
q
number of dimensions of the latent space
p
number of species
d
number of covariates
model_par
a list with the matrices of parameters found in the model (Theta, Sigma, plus some others depending on the variant)
fisher
Variational approximation of the Fisher Information matrix
std_err
Variational approximation of the variance-covariance matrix of the model parameter estimates.
var_par
a list with two matrices, M and S2, which are the estimated parameters in the variational approximation
gen_par
a list with two parameters, sigma2 and rho, only used with the genetic covariance model
latent
a matrix: values of the latent vector (Z in the model)
latent_pos
a matrix: values of the latent position vector (Z) without covariates effects or offset
fitted
a matrix: fitted values of the observations (A in the model)
nb_param
number of parameters in the current PLN model
vcov_model
character: the model used for the covariance (either "spherical", "diagonal" or "full")
optim_par
a list with parameters useful for monitoring the optimization
weights
observational weights
loglik
(weighted) variational lower bound of the loglikelihood
loglik_vec
element-wise variational lower bound of the loglikelihood
BIC
variational lower bound of the BIC
entropy
Entropy of the variational distribution
ICL
variational lower bound of the ICL
R_squared
approximated goodness-of-fit criterion
criteria
a vector with loglik, BIC, ICL and number of parameters
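The read-only behaviour of these fields comes from active bindings. The sketch below reproduces the mechanism in base R with makeActiveBinding(). It illustrates the idea only and is not the PLNmodels implementation; the names fit_sketch and stored_n are hypothetical.

```r
# Illustration only: a read-only field built with a base R active binding.
fit_sketch <- new.env()
stored_n <- 49L  # hypothetical stored value

makeActiveBinding(
  "n",
  function(value) {
    # The function receives an argument only when someone assigns to the field.
    if (!missing(value)) stop("`n` is read-only", call. = FALSE)
    stored_n
  },
  fit_sketch
)

print(fit_sketch$n)  # reading calls the function with no argument

# Attempting to overwrite the field raises an error instead of changing it.
outcome <- tryCatch(
  {
    fit_sketch$n <- 10L
    "modified"
  },
  error = function(e) conditionMessage(e)
)
print(outcome)
```

Reading a field therefore always returns the value held by the object, while an assignment fails rather than silently altering the fit.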
update()
Update a PLNfit object
PLNfit$update(
Theta = NA,
Sigma = NA,
M = NA,
S2 = NA,
Ji = NA,
R2 = NA,
Z = NA,
A = NA,
monitoring = NA
)
Theta
matrix of regression parameters
Sigma
variance-covariance matrix of the latent variables
M
matrix of mean vectors for the variational approximation
S2
matrix of variance vectors for the variational approximation
Ji
vector of variational lower bounds of the log-likelihoods (one value per sample)
R2
approximate R^2 goodness-of-fit criterion
Z
matrix of latent vectors (includes covariates and offset effects)
A
matrix of fitted values
monitoring
a list with optimization monitoring quantities
Update the current PLNfit object
new()
Initialize a PLNfit model
PLNfit$new(responses, covariates, offsets, weights, formula, xlevels, control)
responses
the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates
design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets
offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights
an optional vector of observation weights to be used in the fitting process.
formula
model formula used for fitting, extracted from the formula in the upper-level call
xlevels
named list of factor levels included in the model, extracted from the formula in the upper-level call and used for predictions.
control
a list for controlling the optimization. See details.
optimize()
Call to the C++ optimizer and update of the relevant fields
PLNfit$optimize(responses, covariates, offsets, weights, control)
responses
the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates
design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets
offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights
an optional vector of observation weights to be used in the fitting process.
control
a list for controlling the optimization. See details.
VEstep()
Result of one call to the VE step of the optimization procedure: optimal variational parameters (M, S2) and corresponding log-likelihood values for fixed model parameters (Sigma, Theta). Intended to position new data in the latent space.
PLNfit$VEstep(
covariates,
offsets,
responses,
weights,
Theta = self$model_par$Theta,
Sigma = self$model_par$Sigma,
control = list()
)
covariates
design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets
offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
responses
the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights
an optional vector of observation weights to be used in the fitting process.
Theta
Optional fixed value of the regression parameters
Sigma
Optional fixed value of the covariance parameters.
control
a list for controlling the optimization. See details.
A list with three components:
the matrix M of variational means,
the matrix S2 of variational variances,
the vector log.lik of (variational) log-likelihoods, one value per new observation
set_R2()
Update R2 field after optimization
PLNfit$set_R2(responses, covariates, offsets, weights, nullModel = NULL)
responses
the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates
design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets
offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights
an optional vector of observation weights to be used in the fitting process.
nullModel
null model used for approximate R2 computations. Defaults to a GLM with the same design matrix but no latent variable.
compute_fisher()
Safely compute the Fisher information matrix (FIM)
PLNfit$compute_fisher(type = c("wald", "louis"), X = NULL)
type
approximation scheme used to compute the Fisher information matrix. Either wald (default) or louis. type = "louis" results in smaller confidence intervals.
X
design matrix used to compute the FIM
a sparse matrix with sensible dimension names
compute_standard_error()
Compute univariate standard error for coefficients of Theta from the FIM
PLNfit$compute_standard_error()
a matrix of standard deviations.
postTreatment()
Update R2, fisher and std_err fields after optimization
PLNfit$postTreatment(
responses,
covariates,
offsets,
weights = rep(1, nrow(responses)),
type = c("wald", "louis", "none"),
nullModel = NULL
)
responses
the matrix of responses (called Y in the model). Will usually be extracted from the corresponding field in PLNfamily-class
covariates
design matrix (called X in the model). Will usually be extracted from the corresponding field in PLNfamily-class
offsets
offset matrix (called O in the model). Will usually be extracted from the corresponding field in PLNfamily-class
weights
an optional vector of observation weights to be used in the fitting process.
type
approximation scheme used to compute the Fisher information matrix. Either wald (default) or louis. type = "louis" results in smaller confidence intervals.
nullModel
null model used for approximate R2 computations. Defaults to a GLM with the same design matrix but no latent variable.
predict()
Predict position, scores or observations of new data.
PLNfit$predict(newdata, type = c("link", "response"), envir = parent.frame())
newdata
A data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
type
Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).
envir
Environment in which the prediction is evaluated
A matrix of predicted scores or counts.
predict_cond()
Predict position, scores or observations of new data, conditionally on the observation of a (set of) variables
PLNfit$predict_cond(
newdata,
cond_responses,
type = c("link", "response"),
var_par = FALSE,
envir = parent.frame()
)
newdata
a data frame containing the covariates of the sites at which to predict
cond_responses
a data frame containing the counts of the observed variables (matching the names of the variables provided as data in the PLN function)
type
Scale used for the prediction. Either link (default, predicted positions in the latent space) or response (predicted counts).
var_par
Boolean. Should new estimates of the variational parameters (means and variances) be returned as attributes of the matrix of predictions? Defaults to FALSE.
envir
Environment in which the prediction is evaluated
A matrix of predicted scores or counts.
show()
User friendly print method
PLNfit$show(
model = paste("A multivariate Poisson Lognormal fit with", private$covariance,
"covariance model.\n")
)
model
First line of the print output
clone()
The objects of this class are cloneable with this method.
PLNfit$clone(deep = FALSE)
deep
Whether to make a deep clone.
The parameter control is a list controlling the optimization with the following entries:
"covariance": character setting the model for the covariance matrix. Either "full", "diagonal", "spherical" or "genetic". Default is "full".
"corr_matrix": a symmetric positive definite correlation matrix used for the "genetic" model of covariance. Ignored in other cases.
"trace": integer for verbosity.
"inception": sets up the initialization. By default, the model is initialized with a multivariate linear model applied on log-transformed data, and with the same formula as the one provided by the user. However, the user can provide a PLNfit (typically obtained from a previous fit), which sometimes speeds up the inference.
"ftol_rel": stop when an optimization step changes the objective function by less than ftol_rel multiplied by the absolute value of the objective function. Default is 1e-6 when n < p, 1e-8 otherwise.
"ftol_abs": stop when an optimization step changes the objective function by less than ftol_abs. Default is 0.
"xtol_rel": stop when an optimization step changes every parameter by less than xtol_rel multiplied by the absolute value of the parameter. Default is 1e-4.
"xtol_abs": stop when an optimization step changes every parameter by less than xtol_abs. Default is 0.
"maxeval": stop when the number of iterations exceeds maxeval. Default is 10000.
"maxtime": stop when the optimization time (in seconds) exceeds maxtime. Default is -1 (no restriction).
"algorithm": the optimization method used by NLOPT, among the LD (local, gradient-based) types, i.e. "CCSAQ", "MMA", "LBFGS", "VAR1", "VAR2". See NLOPT documentation for further details. Default is "CCSAQ".
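if (FALSE) {
# Not run: requires the PLNmodels package
library(PLNmodels)
data(trichoptera)
# Bundle abundances and covariates into a single data frame
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
# Fit an intercept-only PLN model
myPLN <- PLN(Abundance ~ 1, data = trichoptera)
class(myPLN)
print(myPLN)
}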
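The control entries listed in the details above can be gathered in a plain list. The helper below is a hypothetical sketch written in base R, not a PLNmodels function: it pre-fills only the defaults stated on this page, accepts the entries documented without a default (corr_matrix, trace, inception) when supplied, and rejects unknown names. The values of n and p in the call are arbitrary.

```r
# Hypothetical helper (not part of PLNmodels): documented defaults plus overrides.
pln_control_sketch <- function(n, p, ...) {
  defaults <- list(
    covariance = "full",
    ftol_rel   = if (n < p) 1e-6 else 1e-8,  # default depends on n and p
    ftol_abs   = 0,
    xtol_rel   = 1e-4,
    xtol_abs   = 0,
    maxeval    = 10000,
    maxtime    = -1,                          # -1 means no time restriction
    algorithm  = "CCSAQ"
  )
  overrides <- list(...)
  # Entries documented without a default value are accepted but not pre-filled.
  allowed <- c(names(defaults), "corr_matrix", "trace", "inception")
  unknown <- setdiff(names(overrides), allowed)
  if (length(unknown) > 0) {
    stop("Unknown control entries: ", paste(unknown, collapse = ", "))
  }
  modifyList(defaults, overrides)
}

ctrl <- pln_control_sketch(n = 50, p = 20, covariance = "diagonal", maxeval = 5000)
str(ctrl)
```

Assuming PLN() forwards its control argument as described on this page, such a list would be supplied as PLN(Abundance ~ 1, data = trichoptera, control = ctrl).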