The function elasticNetSEMcv lets users define their own grid search by combining user-provided sets of alphas and lambdas.
elasticNetSEMcv(Y, X, Missing, B, alpha_factors, lambda_factors, kFold, verbose)
A data frame storing the minimum Mean Square Error (MSE) for each alpha and the corresponding lambda from the selection path [lambda_max, ..., lambda_min].
col1: alpha
col2: lambda (for the given alpha, the lambda with the minimum MSE)
col3: MSE
col4: STE
The final (alpha, lambda) is the pair whose MSE is within one standard error (1 STE) of min(MSE) and that places the higher level of penalty on the likelihood function.
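The selection rule above can be sketched in base R. This is only an illustration under stated assumptions: the column names `alpha`, `lambda`, `MSE`, and `STE` are assumed labels for col1 to col4, the helper `select_one_se` is hypothetical and not part of sparseSEM, and "higher level of penalty" is interpreted here as the larger lambda, with ties broken by the larger alpha. The package may resolve this differently.

```r
# Toy CV summary table; all values are made up for illustration.
cv <- data.frame(
  alpha  = c(0.75, 0.50, 0.25),
  lambda = c(0.10, 0.01, 0.01),
  MSE    = c(1.05, 1.00, 1.20),
  STE    = c(0.08, 0.10, 0.09)
)

# Hypothetical helper (not a sparseSEM function): one-standard-error rule.
select_one_se <- function(cv) {
  best <- which.min(cv$MSE)                    # row with the minimum MSE
  threshold <- cv$MSE[best] + cv$STE[best]     # min(MSE) + 1 STE
  candidates <- cv[cv$MSE <= threshold, , drop = FALSE]
  # Assumption: heavier penalty = larger lambda, then larger alpha.
  ord <- order(-candidates$lambda, -candidates$alpha)
  candidates[ord[1], c("alpha", "lambda")]
}

chosen <- select_one_se(cv)
print(chosen)
```

In this toy table the first two rows fall within one STE of the minimum, and the sketch picks the one with the larger lambda.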
fit: the model fit with the optimal (alpha, lambda) from CV
Bout: the computed weights for the network topology. B[i,j] = 0 means there is no edge between nodes i and j; B[i,j] != 0 denotes an (undirected) edge between nodes i and j, with B[i,j] being the weight of the edge.
fout: f is a 1 by M array holding the weights for X (in the SEM: Y = BY + FX + e). Theoretically, F can be an M by L matrix, with M being the number of nodes and L being the total number of node attributes. However, in the current implementation, each node is allowed one and only one attribute. If some nodes have more than one attribute, consider selecting the top one by either correlation or principal component methods.
stat
statistics is a 1x6 array recording:
1. correct positive
2. total positive
3. false positive
4. positive detected
5. Power of detection (PD) = correct positive/total positive
6. False Discovery Rate (FDR) = false positive/positive detected
simTime: computational time
call: the call that produced this object
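The PD and FDR entries listed under stat can be reproduced by hand when a true adjacency matrix is available. The sketch below uses made-up 3-node matrices and assumes an edge is any nonzero entry; the names `B_true` and `B_est` are illustrative, and the package may count edges differently (for example, in how it treats the diagonal).

```r
# Toy 3-node example; both matrices are made up for illustration.
B_true <- matrix(c(0, 1, 0,
                   0, 0, 1,
                   0, 0, 0), nrow = 3, byrow = TRUE)
B_est  <- matrix(c(0, 0.8, 0.2,
                   0, 0,   0,
                   0, 0,   0), nrow = 3, byrow = TRUE)

# Assumption: an edge is present wherever the entry is nonzero.
true_edge <- B_true != 0
est_edge  <- B_est  != 0

correct_positive  <- sum(true_edge & est_edge)    # true edges that were detected
total_positive    <- sum(true_edge)               # all true edges
false_positive    <- sum(!true_edge & est_edge)   # detected edges that are not true
positive_detected <- sum(est_edge)                # all detected edges

PD  <- correct_positive / total_positive          # power of detection
FDR <- false_positive / positive_detected         # false discovery rate

print(c(PD = PD, FDR = FDR))
```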
Y: The observed node response data, with dimension M (nodes) by N (samples). Y is normalized inside the function.
X: The network node attribute matrix, with dimension M by N. Theoretically, X can be an L by N matrix, with L being the total number of node attributes. In the current implementation, each node is allowed one and only one attribute. If some nodes have more than one attribute, consider selecting the top one by either correlation or principal component methods. If no attribute is available for some nodes, fill the corresponding rows with zeros. See the yeast data `yeast.rda` for an example. X is normalized inside the function.
Missing: Optional M by N matrix corresponding to the elements of Y. 0 denotes not missing, and 1 denotes missing. If node i in sample j is labeled missing (Missing[i,j] = 1), then Y[i,j] is set to 0.
B: Optional input. For a network with M nodes, B is the M by M adjacency matrix. If the data are simulated or the true network topology is otherwise known (i.e., the adjacency matrix is known), the Power of Detection (PD) and False Discovery Rate (FDR) are computed in the output parameter 'statistics'.
If the true network topology is unknown, B can be omitted, and the PD/FDR in the output parameter 'statistics' should be ignored.
alpha_factors: The set of candidate alpha values. Default is seq(from = 0.95, to = 0.05, by = -0.05)
lambda_factors: The set of candidate lambda values. Default is 10^seq(from = 1, to = 0.001, by = -0.2)
kFold: The number of folds k for k-fold cross validation; default k = 5
verbose: Controls the amount of information printed, from -1 to 10; a larger number means more output
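As a rough illustration of preparing the arguments described above, the sketch below builds a missing-value indicator from a toy response matrix and lays out a candidate (alpha, lambda) grid. All object names (`Y_toy`, `Missing_toy`, `alpha_candidates`, `lambda_candidates`, `grid`) are hypothetical. Zero-filling the NA entries before the call is an assumption made here for safety, and `expand.grid` is used only to show the combinations; it is not claimed to be how the package iterates over them.

```r
set.seed(1)

# Toy response matrix: M = 4 nodes by N = 6 samples, with two missing entries.
Y_toy <- matrix(rnorm(24), nrow = 4, ncol = 6)
Y_toy[2, 3] <- NA
Y_toy[4, 1] <- NA

# Indicator with the same shape as Y: 1 = missing, 0 = observed.
Missing_toy <- 1 * is.na(Y_toy)

# Assumption: replace NA by 0 up front, mirroring the documented behaviour
# that Y[i,j] is treated as 0 wherever Missing[i,j] = 1.
Y_toy[is.na(Y_toy)] <- 0

# Candidate values, taken from the example call on this page.
alpha_candidates  <- c(0.75, 0.5, 0.25)
lambda_candidates <- c(0.1, 0.01, 0.001)

# Every (alpha, lambda) combination in the user-defined grid.
grid <- expand.grid(alpha = alpha_candidates, lambda = lambda_candidates)
print(nrow(grid))
```

Three alphas and three lambdas give nine combinations, so the cost of the cross validation grows with the product of the two set sizes and with kFold.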
Anhui Huang; Dept of Electrical and Computer Engineering, Univ of Miami, Coral Gables, FL
The function performs CV and parameter inference, and calculates power and FDR.
1. Cai, X., Bazerque, J.A., and Giannakis, G.B. (2013). Inference of Gene Regulatory Networks with Sparse Structural Equation Models Exploiting Genetic Perturbations. PLoS Comput Biol 9, e1003068.
2. Huang, A. (2014). "Sparse model learning for inferring genotype and phenotype associations." Ph.D. Dissertation. University of Miami (1186).
library(sparseSEM)
data(B);
data(Y);
data(X);
data(Missing);
if (FALSE) OUT <- elasticNetSEMcv(Y, X, Missing, B, alpha_factors = c(0.75, 0.5, 0.25),
lambda_factors=c(0.1, 0.01, 0.001), kFold = 5, verbose = 1);