modSelect: Select a model based on output from resample
Creates the necessary input for fitNetwork when selecting variables based on
the output of the resample function. This function is made available to the
user so that different decisions can be made about exactly how to use the
resample output to select a model, as there is sometimes more than one option
for choosing a final model.
modSelect(
obj,
data = NULL,
fit = FALSE,
select = "select",
thresh = NULL,
ascall = TRUE,
type = "gaussian",
...
)
obj: resample output.
data: The data frame used to create the resample object. Necessary if
ascall = TRUE or fit = TRUE.
fit: Logical. Determines whether to fit the selected model to the data or
just return the model specifications. If TRUE, a dataset must also be
supplied in the data argument.
select: Character string referring to which variable of the output should be
used as the basis for selecting variables. If the resampling method was
either "bootstrap" or "split", then setting select = "select" will select
variables whose aggregated p-values fall below a pre-specified threshold.
Setting select = "select_ci", however, will use the adjusted confidence
intervals rather than p-values to select variables. Alternatively, if
select = "freq", then the thresh argument can be used to indicate the minimum
selection frequency across iterations. In this case, variables are selected
based on how frequently they were selected in the resampling procedure. This
also works if select is simply set to a numeric value (this value will serve
as the value for thresh).
When the resampling method was "stability", the default option of
select = "select" chooses variables based on the original threshold provided
to the resample function, and relies on the simultaneous selection proportion
(the "freq" column in the "results" element). Alternatively, if select is a
numeric value, or a value for thresh is provided, that new selection
frequency threshold will determine the choice of variables. One can also
specify select = "split1" or select = "split2" to base the threshold on the
selection frequency in one of the two splits rather than on the simultaneous
selection frequency, which is likely to be the most conservative.
For all types of resample objects, when select = "Pvalue", thresh can be set
to a numeric value in order to select variables based on aggregated p-values.
For the "bootstrap" and "split" methods, this allows one to override the
original threshold (set as part of resample) if desired.
thresh: Numeric value. If select = "Pvalue", then this value will be the
p-value threshold. Otherwise, this value will determine the minimum selection
frequency threshold.
ascall: Logical. Determines whether to return a list with the arguments
necessary for fitting the model with do.call to fitNetwork. Only possible if
a dataset is supplied.
type: Should be left as-is. Automatically taken from the resample object.
...: Additional arguments.
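The thresholding rules described for select and thresh can be sketched in
base R. This is only an illustration of the logic, not the package's
implementation: the results table, its values, and the select_vars helper are
hypothetical, and the column names simply mirror the select options named
above. The real resample output may be organised differently.

```r
# Hypothetical per-predictor summary for a single node (values are made up).
results <- data.frame(
  predictor = c("V2", "V3", "V4", "V5"),
  split1    = c(0.95, 0.60, 0.80, 0.20),    # selection frequency in split 1
  split2    = c(0.90, 0.45, 0.75, 0.15),    # selection frequency in split 2
  freq      = c(0.90, 0.30, 0.65, 0.05),    # simultaneous selection frequency
  Pvalue    = c(0.001, 0.200, 0.080, 0.600), # aggregated p-value
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of the package): keep the predictors whose
# chosen statistic passes the threshold.
select_vars <- function(results, select = "freq", thresh = 0.5) {
  stopifnot(select %in% c("freq", "split1", "split2", "Pvalue"))
  if (select == "Pvalue") {
    keep <- results$Pvalue < thresh      # p-values must fall below thresh
  } else {
    keep <- results[[select]] >= thresh  # frequencies must reach thresh
  }
  results$predictor[keep]
}

by_freq   <- select_vars(results, select = "freq",   thresh = 0.7)
by_split1 <- select_vars(results, select = "split1", thresh = 0.7)
by_p      <- select_vars(results, select = "Pvalue", thresh = 0.05)

print(by_freq)
print(by_split1)
print(by_p)
```

In this toy table the simultaneous frequency never exceeds either split-wise
frequency, so the same threshold keeps fewer predictors under "freq" than
under "split1", which is the sense in which the simultaneous criterion is the
more conservative one.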
A call ready for fitNetwork, a fitted network model, or a list of selected
variables for each node along with relevant attributes. Essentially, the
output is either the selected model itself or a list of the necessary
parameters to fit it.
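As a rough sketch of the "call ready for fitNetwork" case, the returned list
can be passed to do.call. This assumes the modnets package (which provides
ggmDat, resample, modSelect and fitNetwork) is installed, and that supplying
data with ascall = TRUE returns a plain list of arguments as described above.
The object names res, args and fit are illustrative only.

```r
library(modnets)  # assumed to be installed

# Resample with the moderator 'M', as in the documented example
res <- resample(ggmDat, m = 'M', niter = 10)

# With a dataset supplied and ascall = TRUE, the result is described as a
# list of arguments for fitNetwork rather than a fitted model
args <- modSelect(res, data = ggmDat, ascall = TRUE)

# Fit the selected model from that argument list
fit <- do.call(fitNetwork, args)
```

Setting fit = TRUE in modSelect should give the fitted model in one step;
keeping the argument list separate is mainly useful when it needs to be
inspected or adjusted before fitting.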
# NOT RUN {
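# Resample, select a model, then fit it in a separate step
res1 <- resample(ggmDat, m = 'M', niter = 10)
mods1 <- modSelect(res1)
fit1 <- fitNetwork(ggmDat, moderators = 'M', type = mods1)

# Stability selection: select and fit in one step with a new frequency threshold
res2 <- resample(ggmDat, m = 'M', sampMethod = 'stability')
fit2 <- modSelect(res2, data = ggmDat, fit = TRUE, thresh = .7)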
# }