modelsearch2: Data-driven Extension of a Latent Variable Model

Description

Procedure that adds relationships between variables when they are supported by the data.

Usage

modelsearch2(object, ...)
# S3 method for lvmfit
modelsearch2(object, link = NULL, data = NULL,
  statistic = "Wald", method.p.adjust = "max", typeSD = "information",
  df = TRUE, bias.correct = TRUE, trace = TRUE, ...)
# S3 method for default
modelsearch2(object, link, data = NULL,
  statistic = "Wald", method.p.adjust = "max", typeSD = "information",
  df = TRUE, bias.correct = TRUE, trace = TRUE, ...)

Arguments

object

an lvmfit object.

...

additional arguments to be passed to findNewLink and .modelsearch2, see details.

link

[character, optional for lvmfit objects] the names of the additional relationships to consider when expanding the model. Should be a vector containing strings like "Y~X". See the details section.
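
Since the candidate links are plain strings of the form "Y~X", they can be generated instead of typed by hand. The following is a minimal sketch in base R; the outcome and covariate names are hypothetical and simply mirror those used in the examples of this page.

```r
# Hypothetical variable names (assumption: one outcome, five candidate covariates)
outcome <- "Y"
candidates <- c("X2", "X3", "X4", "Z1", "Z2")

# Build one "Y~X" string per candidate covariate
possible.link <- paste0(outcome, "~", candidates)
print(possible.link)

# Each string can be parsed back into a formula when needed
link.formula <- lapply(possible.link, stats::as.formula)
```

The resulting character vector has the shape expected by the link argument.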

data

[data.frame, optional] the dataset used to identify the model

statistic

[character] statistic used to perform the test. Can be the likelihood ratio test ("LR"), the score test ("score"), or the max statistic ("max"); the Usage section also shows "Wald" as the default value.
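
To make the test statistics concrete, here is a small base R sketch that tests a single candidate link (Y~X2) in a linear model, once with a likelihood ratio statistic and once with a Wald-type statistic. The simulated data and object names are assumptions made for illustration; this is not how modelsearch2 computes its statistics internally.

```r
# Simulated data (hypothetical): Y depends on X1 and X2
set.seed(1)
n <- 100
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
d$Y <- d$X1 + 0.5 * d$X2 + rnorm(n)

m0 <- lm(Y ~ X1, data = d)       # current model
m1 <- lm(Y ~ X1 + X2, data = d)  # model extended with the candidate link Y~X2

# Likelihood ratio statistic, compared to a chi-squared with 1 degree of freedom
lr.stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
lr.p <- stats::pchisq(lr.stat, df = 1, lower.tail = FALSE)

# Wald-type statistic: estimate of the new coefficient divided by its standard error
wald.stat <- coef(summary(m1))["X2", "t value"]

print(c(LR = lr.stat, p.LR = lr.p, Wald = wald.stat))
```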

method.p.adjust

[character] the method used to adjust the p-values for multiple comparisons. Ignored when using the max statistic. Can be any method that is valid for the stats::p.adjust function (e.g. "fdr").
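
As an illustration of what such an adjustment does, the sketch below applies stats::p.adjust to a vector of unadjusted p-values, one per candidate link. The p-values are made up for the example.

```r
# Hypothetical unadjusted p-values, one per candidate link
p.raw <- c("Y~X2" = 0.001, "Y~X3" = 0.012, "Y~X4" = 0.03,
           "Y~Z1" = 0.4, "Y~Z2" = 0.8)

# "holm" controls the family-wise error rate
p.holm <- stats::p.adjust(p.raw, method = "holm")
# "fdr" (Benjamini-Hochberg) controls the false discovery rate
p.fdr <- stats::p.adjust(p.raw, method = "fdr")

print(rbind(raw = p.raw, holm = p.holm, fdr = p.fdr))
```

Adjusted p-values are never smaller than the raw ones, so fewer links pass a given significance threshold after adjustment.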

typeSD

[character] the type of standard error to be used to compute the Wald statistic. Can be "information", "robust" or "jackknife".

df

[logical] should the degrees of freedom be computed using the Satterthwaite approximation? Only relevant when the argument statistic is set to "Wald".

bias.correct

[logical] should the standard errors of the coefficients be corrected for small sample bias? Only relevant when the argument statistic is set to "Wald".

trace

[logical] should the execution of the function be traced?

Value

A list containing:

sequenceTest: the sequence of tests that have been performed.
sequenceModel: the sequence of models that have been obtained.
sequenceQuantile: the sequence of rejection thresholds. Optional.
sequenceIID: the influence functions relative to each test. Optional.
sequenceSigma: the covariance matrix relative to each test. Optional.
statistic: the argument statistic.
method.p.adjust: the argument method.p.adjust.
typeSD: the argument typeSD.
alpha: [numeric 0-1] the significance cutoff for the p-values.
cv: whether the procedure has converged.

Details

Argument link:

lvmfit object: when not specified all possible additional links are considered.
other objects: this argument must be specified.

Argument ... passed to findNewLink, see the documentation of this function:

exclude.var
rm.latent_latent
rm.endo_endo
rm.latent_endo

Argument ... passed to .modelsearch2:

alpha: the significance threshold for retaining a new link.
method.max: the method used to compute the distribution of the max statistic. See lava.options()$search.calcMaxDist.
cpus: the number of cpus that can be used for the computations.
nStep: the maximum number of links that can be added to the model.
na.omit: should models leading to NA for the test statistic be ignored? Otherwise such a model stops the selection process.
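
The roles of alpha and nStep can be pictured with a simplified forward search written in base R. This is only a sketch under stated assumptions: forwardSearch is a hypothetical helper, it handles lm objects only, and it uses likelihood ratio tests with a Holm adjustment. It is not the implementation used by modelsearch2.

```r
# Simulated data (hypothetical): Y depends on X1, X2, X3; Z1 is pure noise
set.seed(10)
n <- 200
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n), Z1 = rnorm(n))
d$Y <- d$X1 + d$X2 + d$X3 + rnorm(n)

# Hypothetical helper: add covariates one at a time while they are supported
forwardSearch <- function(start, candidates, data, alpha = 0.05, nStep = NULL) {
  model <- start
  added <- character(0)
  if (is.null(nStep)) nStep <- length(candidates)
  while (length(candidates) > 0 && length(added) < nStep) {
    # Likelihood ratio p-value for each remaining candidate
    p.raw <- sapply(candidates, function(x) {
      bigger <- update(model, as.formula(paste(". ~ . +", x)), data = data)
      stat <- as.numeric(2 * (logLik(bigger) - logLik(model)))
      pchisq(stat, df = 1, lower.tail = FALSE)
    })
    # Adjust for testing several candidates at the same step
    p.adj <- p.adjust(p.raw, method = "holm")
    best <- which.min(p.adj)
    if (p.adj[best] > alpha) break  # no candidate is supported: stop
    model <- update(model, as.formula(paste(". ~ . +", candidates[best])),
                    data = data)
    added <- c(added, candidates[best])
    candidates <- candidates[-best]
  }
  list(model = model, added = added)
}

res <- forwardSearch(lm(Y ~ X1, data = d), c("X2", "X3", "Z1"), data = d)
print(res$added)

# With nStep = 1 the search stops after the first retained link
res1 <- forwardSearch(lm(Y ~ X1, data = d), c("X2", "X3", "Z1"), data = d,
                      nStep = 1)
print(res1$added)
```

The loop stops either when no adjusted p-value falls below alpha or when nStep links have been added, which matches the description of these two arguments above.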

Examples

library(lava)
library(lavaSearch2)

#### linear regression ####
set.seed(10)
mSim <- lvm(Y~X1+X2+X3+X4)
addvar(mSim) <- ~Z1+Z2
df.data <- lava::sim(mSim, n = 1e2)
eLM <- lm(Y~X1, data = df.data)
possible.link <- c("Y~X2","Y~X3","Y~X4","Y~Z1","Y~Z2")

## likelihood ratio tests with Holm adjustment
res <- modelsearch2(eLM, link = possible.link, data = df.data,
             statistic = "LR", method.p.adjust = "holm")
## Wald tests with Holm adjustment, at most one link added
res <- modelsearch2(eLM, link = possible.link, data = df.data,
             statistic = "Wald", method.p.adjust = "holm", nStep = 1)
## default statistic and adjustment
res <- modelsearch2(eLM, data = df.data, link = possible.link)

#### Cox model ####
library(survival)
data(Melanoma, package = "riskRegression")
m <- coxph(Surv(time,status==1)~ici+age, data = Melanoma, x = TRUE, y = TRUE)

res <- modelsearch2(m, link = c(status~epicel,status~sex),
                    packages = "survival", nStep = 1)
res

#### LVM ####
## simulate data from a latent variable model
mSim <- lvm()
regression(mSim) <- c(y1,y2,y3)~u
regression(mSim) <- u~x1+x2
categorical(mSim,labels=c("A","B","C")) <- "x2"
latent(mSim) <- ~u
covariance(mSim) <- y1~y2
transform(mSim, Id~u) <- function(x){1:NROW(x)}
df.data <- lava::sim(mSim, n = 1e2, latent = FALSE)

## fit a model that omits the effects of x1 and x2
m <- lvm(c(y1,y2,y3)~u)
latent(m) <- ~u
addvar(m) <- ~x1+x2

e <- estimate(m, df.data)

## search restricted to user-specified links
links <- c(u~x1,u~x2C,y3~x2C)
resScore <- modelsearch2(e, statistic = "score", link = links, method.p.adjust = "holm")
resLR <- modelsearch2(e, statistic = "LR", link = links, method.p.adjust = "holm", nStep = 1)
resMax <- modelsearch2(e, rm.endo_endo = TRUE, statistic = "Wald", link = links, nStep = 1)

## search among all possible additional links
resScore <- modelsearch2(e, statistic = "score", method.p.adjust = "holm")
resLR <- modelsearch2(e, statistic = "LR", method.p.adjust = "holm")
resMax <- modelsearch2(e, rm.endo_endo = TRUE, statistic = "Wald")
