modelsearch2: Data-driven Extension of a Latent Variable Model

Description

Procedure adding relationships between variables that are supported by the data.

Usage

modelsearch2(object, ...)
# S3 method for lvmfit
modelsearch2(object, link = NULL, data = NULL,
  statistic = "Wald", method.p.adjust = "max", typeSD = "information",
  df = FALSE, adjust.residuals = FALSE, trace = TRUE, ...)
# S3 method for default
modelsearch2(object, link, data = NULL,
  statistic = "Wald", method.p.adjust = "max", typeSD = "information",
  df = FALSE, adjust.residuals = FALSE, trace = TRUE, ...)

Arguments

object

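an lvmfit object. The default method also accepts other fitted models (e.g. lm or coxph objects, as in the examples).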

...

additional arguments to be passed to findNewLink and .modelsearch2, see details.

link

the name of the additional relationships to consider when expanding the model. Should be a vector containing strings like "Y~X". Optional for lvmfit objects, see details.
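A vector of candidate links can be assembled with base R string functions. The sketch below is only an illustration: the variable names are taken from the examples on this page, and the latent variable grid is a hypothetical set of candidates, not a recommendation.

```r
# Candidate covariates for a single outcome Y (names as in the examples)
covariates <- c("X2", "X3", "X4", "Z1", "Z2")
possible.link <- paste0("Y~", covariates)
print(possible.link)

# Several outcomes and covariates: enumerate every pair (hypothetical set)
grid <- expand.grid(from = c("x1", "x2"), to = c("u", "y3"),
                    stringsAsFactors = FALSE)
links <- paste0(grid$to, "~", grid$from)
print(links)
```

The resulting character vectors have the "Y~X" form described for the link argument.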

data

[optional] the dataset used to identify the model

statistic

statistic used to perform the test. Can be the likelihood ratio test ("LR"), the score test ("score") or the max statistic ("max"). The usage above and the examples also use "Wald", which is the default value.
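To give an idea of how two of these statistics relate, the sketch below tests a single additional link in a plain linear regression using base R only. It is an illustration on simulated data under assumed settings (sample size, effect sizes), not the computation performed inside modelsearch2.

```r
set.seed(1)
n <- 100
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
d$Y <- d$X1 + 0.5 * d$X2 + rnorm(n)

small <- lm(Y ~ X1, data = d)       # current model
large <- lm(Y ~ X1 + X2, data = d)  # model with the candidate link Y~X2

# Wald statistic: squared ratio of the estimate to its standard error,
# computed in the larger model
est <- coef(summary(large))["X2", "Estimate"]
se <- coef(summary(large))["X2", "Std. Error"]
wald <- (est / se)^2
p.wald <- pchisq(wald, df = 1, lower.tail = FALSE)

# Likelihood ratio statistic: twice the gain in log-likelihood
lr <- 2 * (as.numeric(logLik(large)) - as.numeric(logLik(small)))
p.lr <- pchisq(lr, df = 1, lower.tail = FALSE)

print(c(wald = wald, lr = lr, p.wald = p.wald, p.lr = p.lr))
```

The Wald test only requires fitting the larger model, whereas the likelihood ratio test requires both fits. A score test would only require the smaller model.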

method.p.adjust

the method used to adjust the p-values for multiple comparisons. Ignored when using the max statistic. Can be any method that is valid for the stats::p.adjust function (e.g. "fdr").
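As a rough illustration of the adjustment step, the sketch below applies stats::p.adjust to a set of made-up p-values. The numbers are hypothetical and were not produced by modelsearch2.

```r
# Hypothetical p-values, one per candidate link (illustration only)
p.raw <- c("Y~X2" = 0.01, "Y~X3" = 0.02, "Y~X4" = 0.03, "Y~Z1" = 0.5)

# Holm adjustment (controls the family-wise error rate)
p.holm <- p.adjust(p.raw, method = "holm")

# "fdr" adjustment (Benjamini-Hochberg, controls the false discovery rate)
p.fdr <- p.adjust(p.raw, method = "fdr")

print(rbind(raw = p.raw, holm = p.holm, fdr = p.fdr))
```

With these values the Holm adjustment is the more conservative of the two for the smaller p-values.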

typeSD

[relevant when statistic is Wald] the type of standard error to be used to compute the Wald statistic. Can be "information", "robust" or "jackknife".

df

[relevant when statistic is Wald] small sample correction: should the degrees of freedom be computed using the Satterthwaite approximation?

adjust.residuals

[relevant when statistic is Wald] small sample correction: should the leverage-adjusted residuals be used to compute the influence function? Otherwise the raw residuals will be used.

trace

should the execution be traced?

Value

a latent variable model

Details

Argument link:

lvmfit object: when not specified all possible additional links are considered.
other objects: this argument must be specified.

Argument ... passed to findNewLink, see the documentation of this function:

exclude.var
rm.latent_latent
rm.endo_endo
rm.latent_endo

Argument ... passed to .modelsearch2:

alpha: the significance threshold for retaining a new link.
method.max: the method used to compute the distribution of the max statistic. See lava.options()$search.calcMaxDist.
ncpus: the number of cpus that can be used for the computations.
nStep: the maximum number of links that can be added to the model.
na.omit: should models leading to NA for the test statistic be ignored? Otherwise such a model will stop the selection process.
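The roles of alpha and nStep can be pictured with a simplified forward search on a linear model. The sketch below uses base R only. forward_search is a hypothetical helper written for this illustration, and it assumes likelihood ratio tests with a Holm adjustment. It is not the implementation used by modelsearch2.

```r
# Hypothetical helper (illustration only): at each step, test every remaining
# candidate, adjust the p-values, and add the most significant candidate
# if its adjusted p-value does not exceed alpha.
forward_search <- function(response, base, candidates, data,
                           alpha = 0.05, nStep = Inf) {
  selected <- base
  fit <- lm(reformulate(selected, response = response), data = data)
  n.added <- 0
  while (length(candidates) > 0 && n.added < nStep) {
    # likelihood ratio p-value for adding each candidate to the current model
    p <- sapply(candidates, function(v) {
      bigger <- lm(reformulate(c(selected, v), response = response), data = data)
      stat <- 2 * (as.numeric(logLik(bigger)) - as.numeric(logLik(fit)))
      pchisq(stat, df = 1, lower.tail = FALSE)
    })
    p.adj <- p.adjust(p, method = "holm")
    best <- which.min(p.adj)
    if (p.adj[best] > alpha) break  # no remaining candidate is retained
    selected <- c(selected, candidates[best])
    candidates <- candidates[-best]
    fit <- lm(reformulate(selected, response = response), data = data)
    n.added <- n.added + 1
  }
  list(fit = fit, selected = selected)
}

# Simulated data (assumed settings): Y depends on X1 and X2 only
set.seed(10)
n <- 200
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n), Z1 = rnorm(n))
d$Y <- d$X1 + d$X2 + rnorm(n)

res1 <- forward_search("Y", base = "X1", candidates = c("X2", "X3", "Z1"),
                       data = d, alpha = 0.05, nStep = 1)
print(res1$selected)
```

Setting nStep = 1 stops the search after the first added link, which mirrors how nStep is used in the examples on this page.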

Examples

# NOT RUN {
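library(lava)         # provides lvm(), addvar(), sim(), estimate()
library(lavaSearch2)  # assumed to be the package providing modelsearch2()
#### linear regression ####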
set.seed(10)
mSim <- lvm(Y~X1+X2+X3+X4)
addvar(mSim) <- ~Z1+Z2
df.data <- lava::sim(mSim, n = 1e2)
eLM <- lm(Y~X1, data = df.data)
possible.link <- c("Y~X2","Y~X3","Y~X4","Y~Z1","Y~Z2")

res <- modelsearch2(eLM, link = possible.link, data = df.data,
             statistic = "LR", method.p.adjust = "holm")
res <- modelsearch2(eLM, link = possible.link, data = df.data,
             statistic = "Wald", method.p.adjust = "holm", nStep = 1)
res <- modelsearch2(eLM, data = df.data, link = possible.link)

#### Cox model ####
library(survival)
library(survival)
data(Melanoma, package = "riskRegression")
m <- coxph(Surv(time,status==1)~ici+age, data = Melanoma, x = TRUE, y = TRUE)

res <- modelsearch2(m, link = c(status~epicel,status~sex),
                    packages = "survival", nStep = 1)
res

#### LVM ####
mSim <- lvm()
regression(mSim) <- c(y1,y2,y3)~u
regression(mSim) <- u~x1+x2
categorical(mSim,labels=c("A","B","C")) <- "x2"
latent(mSim) <- ~u
covariance(mSim) <- y1~y2
transform(mSim, Id~u) <- function(x){1:NROW(x)}
df.data <- lava::sim(mSim, n = 1e2, latent = FALSE)

m <- lvm(c(y1,y2,y3)~u)
latent(m) <- ~u
addvar(m) <- ~x1+x2 

e <- estimate(m, df.data)

links <- c(u~x1,u~x2C,y3~x2C)
# search restricted to the candidate links defined above
resScore <- modelsearch2(e, statistic = "score", link = links, method.p.adjust = "holm")
resLR <- modelsearch2(e, statistic = "LR", link = links, method.p.adjust = "holm", nStep = 1)
resMax <- modelsearch2(e, rm.endo_endo = TRUE, statistic = "Wald", link = links, nStep = 1)
# link not specified: all possible additional links are considered
resScore <- modelsearch2(e, statistic = "score", method.p.adjust = "holm")
resLR <- modelsearch2(e, statistic = "LR", method.p.adjust = "holm")
resMax <- modelsearch2(e, rm.endo_endo = TRUE, statistic = "Wald")
# }