modelsearch2: Data-driven Extension of a Latent Variable Model

Description

Procedure adding relationship between variables that are supported by the data.

Usage

modelsearch2(object, link, data, method.p.adjust, type.information, alpha,
  nStep, na.omit, trace, cpus)
# S3 method for lvmfit
modelsearch2(object, link = NULL, data = NULL,
  method.p.adjust = "fastmax", type.information = "E", alpha = 0.05,
  nStep = NULL, na.omit = TRUE, trace = TRUE, cpus = 1)

Arguments

object

a lvmfit object.

link

[character, optional for lvmfit objects] the name of the additional relationships to consider when expanding the model. Should be a vector containing strings like "Y~X". See the details section.

data

[data.frame, optional] the dataset used to identify the model

method.p.adjust

[character] the method used to adjust the p.values for multiple comparisons. Can be any method that is valid for the stats::p.adjust function (e.g. "fdr"), or "max" or "fastmax".

type.information

[character] the method used by lava::information to compute the information matrix.

alpha

[numeric 0-1] the significance cutoff for the p-values. When the p-value is below, the corresponding link will be added to the model and the search will continue. Otherwise the search will stop.

nStep

the maximum number of links that can be added to the model.

na.omit

should tests leading to NA for the test statistic be ignored. Otherwise this will stop the selection process.

trace

[logical] should the execution of the function be traced?

cpus

the number of cpus that can be used for the computations.

Value

A list containing:

sequenceTest: the sequence of test that has been performed.
sequenceModel: the sequence of models that has been obtained.
sequenceQuantile: the sequence of rejection threshold. Optional.
sequenceIID: the influence functions relative to each test. Optional.
sequenceSigma: the covariance matrix relative to each test. Optional.
initialModel: the model before the sequential search.
statistic: the argument statistic.
method.p.adjust: the argument method.p.adjust.
alpha: [numeric 0-1] the significance cutoff for the p-values.
cv: whether the procedure has converged.

Details

method.p.adjust = "max" computes the p-values based on the distribution of the max statistic. This max statistic is the max of the square root of the score statistic. The p-value are computed integrating the multivariate normal distribution.

method.p.adjust = "fastmax" only compute the p-value for the largest statistic. It is faster than "max" and lead to identical results.

Examples

Run this code

# NOT RUN {
## simulate data
mSim <- lvm()
regression(mSim) <- c(y1,y2,y3,y4)~u
regression(mSim) <- u~x1+x2
categorical(mSim,labels=c("A","B","C")) <- "x2"
latent(mSim) <- ~u
covariance(mSim) <- y1~y2
transform(mSim, Id~u) <- function(x){1:NROW(x)}
df.data <- lava::sim(mSim, n = 1e2, latent = FALSE)

## only identifiable extensions
m <- lvm(c(y1,y2,y3,y4)~u)
latent(m) <- ~u
addvar(m) <- ~x1+x2

e <- estimate(m, df.data)

# }
# NOT RUN {
resSearch <- modelsearch(e)
resSearch

resSearch2 <- modelsearch2(e, nStep = 2)
resSearch2
# }
# NOT RUN {
## some extensions are not identifiable
m <- lvm(c(y1,y2,y3)~u)
latent(m) <- ~u
addvar(m) <- ~x1+x2 

e <- estimate(m, df.data)

# }
# NOT RUN {
resSearch <- modelsearch(e)
resSearch
resSearch2 <- modelsearch2(e)
resSearch2
# }
# NOT RUN {
## for instance
mNI <- lvm(c(y1,y2,y3)~u)
latent(mNI) <- ~u
covariance(mNI) <- y1~y2
## estimate(mNI, data = df.data)
## does not converge



# }

Run the code above in your browser using DataLab