loggle.cv.select: A function to conduct model selection based on cross-validation results

Description

This function conducts model selection for time-varying graphical models based on the cross-validation results returned by loggle.cv.

Usage

loggle.cv.select(cv.result, select.type = "all_flexible", 
cv.vote.thres = 0.8)

Arguments

cv.result

a list: results from loggle.cv

select.type

a string specifying how the optimal d and lambda are chosen across the time points specified by pos, default = "all_flexible":
"all_flexible" -- optimal d and lambda can both vary across time points;
"d_fixed" -- optimal d is fixed across time points, while optimal lambda can vary across time points;
"all_fixed" -- optimal d and lambda are both fixed across time points.

cv.vote.thres

a scalar between 0 and 1: an edge is kept after cv.vote if and only if it appears in at least cv.vote.thres*cv.fold of the cv folds, default = 0.8

Value

h.opt

optimal value of h

d.opt

a vector of optimal values of d for each estimated graph

lambda.opt

a vector of optimal values of lambda for each estimated graph

cv.score.opt

optimal cv score (averaged over time points and cv folds)

edge.num.opt

a vector of numbers of edges for each estimated graph

edge.opt

a list of edges for each estimated graph

adj.mat.opt

a list of adjacency matrices for each estimated graph

Details

select.type = "all_flexible" is for the situation where both the extent of structure smoothness (controlled by d) and the extent of graph sparsity (controlled by lambda) are expected to vary across time points. If only the extent of graph sparsity is expected to vary across time points, select.type = "d_fixed" should be used. If both are expected to be similar across time points, select.type = "all_fixed" should be used.
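The difference between the three settings can be illustrated on a toy array of cv scores. The sketch below is not the package's implementation: the array cv.score (indexed by d, lambda and time point), the helper functions select.toy and total.score, and the assumption that a lower cv score is better are all hypothetical and used only for illustration.

```r
# Hypothetical cv scores with dimensions: d x lambda x time point.
# Assumption for this sketch: a lower cv score is better.
set.seed(1)
d.list <- c(0, 0.05, 0.15, 1)
lambda.list <- c(0.2, 0.25)
n.pos <- 9
cv.score <- array(runif(length(d.list) * length(lambda.list) * n.pos),
                  dim = c(length(d.list), length(lambda.list), n.pos))

# Return, for each time point, the indices of the selected d and lambda.
select.toy <- function(cv.score, select.type) {
  n.pos <- dim(cv.score)[3]
  if (select.type == "all_flexible") {
    # best (d, lambda) pair chosen separately at each time point
    idx <- t(apply(cv.score, 3, function(s) {
      which(s == min(s), arr.ind = TRUE)[1, ]
    }))
  } else if (select.type == "d_fixed") {
    # one d for all time points; lambda chosen separately at each time point
    d.total <- rowSums(apply(cv.score, c(1, 3), min))
    d.best <- which.min(d.total)
    l.best <- apply(cv.score[d.best, , , drop = FALSE], 3, which.min)
    idx <- cbind(rep(d.best, n.pos), l.best)
  } else {
    # "all_fixed": one (d, lambda) pair for all time points
    avg <- apply(cv.score, c(1, 2), mean)
    best <- unname(which(avg == min(avg), arr.ind = TRUE)[1, ])
    idx <- cbind(rep(best[1], n.pos), rep(best[2], n.pos))
  }
  idx <- unname(idx)
  colnames(idx) <- c("d", "lambda")
  idx
}

# Sum of the cv scores attained by a selection, over all time points.
total.score <- function(cv.score, idx) {
  sum(cv.score[cbind(idx[, "d"], idx[, "lambda"], seq_len(nrow(idx)))])
}

idx.flex <- select.toy(cv.score, "all_flexible")
idx.dfix <- select.toy(cv.score, "d_fixed")
idx.afix <- select.toy(cv.score, "all_fixed")

# selected parameter values per time point under "d_fixed"
print(data.frame(d = d.list[idx.dfix[, "d"]],
                 lambda = lambda.list[idx.dfix[, "lambda"]]))
```

In this sketch the more flexible settings can never attain a worse total cv score than the more restricted ones, which is why the restricted settings are mainly useful when d or lambda is believed to be similar across time points.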

cv.vote.thres controls the tradeoff between false discovery rate and power in model selection. A larger value of cv.vote.thres lowers the false discovery rate but also reduces power.
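The voting rule described for cv.vote.thres can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the per-fold adjacency matrices in fold.adj, and the helpers add.edge and cv.vote.toy, are hypothetical names introduced here.

```r
# Hypothetical input: one 0/1 adjacency matrix per cv fold.
p <- 4
cv.fold <- 5
fold.adj <- replicate(cv.fold, matrix(0, p, p), simplify = FALSE)

# Helper that adds an undirected edge (i, j) to an adjacency matrix.
add.edge <- function(m, i, j) {
  m[i, j] <- 1
  m[j, i] <- 1
  m
}

# Edge (1,2) appears in 5 folds, edge (2,3) in 4 folds, edge (3,4) in 3 folds.
for (k in 1:5) fold.adj[[k]] <- add.edge(fold.adj[[k]], 1, 2)
for (k in 1:4) fold.adj[[k]] <- add.edge(fold.adj[[k]], 2, 3)
for (k in 1:3) fold.adj[[k]] <- add.edge(fold.adj[[k]], 3, 4)

# Keep an edge if and only if it appears in at least
# cv.vote.thres * (number of folds) folds.
cv.vote.toy <- function(fold.adj, cv.vote.thres) {
  votes <- Reduce(`+`, fold.adj)
  # small tolerance guards against floating-point error in the product
  (votes >= cv.vote.thres * length(fold.adj) - 1e-8) * 1
}

adj.08 <- cv.vote.toy(fold.adj, 0.8)
adj.10 <- cv.vote.toy(fold.adj, 1)

# number of edges kept at each threshold
print(c("thres.0.8" = sum(adj.08[upper.tri(adj.08)]),
        "thres.1" = sum(adj.10[upper.tri(adj.10)])))
```

Raising the threshold from 0.8 to 1 drops the edge that is missing from one fold, which mirrors the tradeoff above: fewer edges are kept, so fewer false discoveries are expected, at the cost of possibly missing true edges.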

References

Yang, J. & Peng, J. (2018), 'Estimating Time-Varying Graphical Models', arXiv preprint arXiv:1804.03811

Examples


# NOT RUN {
data(example)  # load example dataset
X <- example$X  # data matrix
dim(X)  # dimension of data matrix

# positions of time points to estimate graphs
pos <- round(seq(0.1, 0.9, length=9)*(ncol(X)-1)+1)
# estimate time-varying graphs via cross-validation
result <- loggle.cv(X, pos, h.list = c(0.2, 0.25),
                    d.list = c(0, 0.05, 0.15, 1),
                    lambda.list = c(0.2, 0.25), cv.fold = 3,
                    fit.type = "pseudo", cv.vote.thres = 1,
                    num.thread = 1)

# conduct model selection using cross-validation results
select.result <- loggle.cv.select(cv.result = result,
                                  select.type = "all_flexible",
                                  cv.vote.thres = 0.8)

# optimal values of h, d and lambda, and number of 
# selected edges at each time point
print(cbind("time" = seq(0.1, 0.9, length=9),
            "h.opt" = rep(select.result$h.opt, length(pos)),
            "d.opt" = select.result$d.opt,
            "lambda.opt" = select.result$lambda.opt,
            "edge.num.opt" = select.result$edge.num.opt))
# }
