Fits the gesso model over a two-dimensional grid of hyperparameters lambda_1 and lambda_2 and returns the estimated coefficients for each pair of hyperparameters.
gesso.fit(G, E, Y, C = NULL, normalize = TRUE, normalize_response = FALSE,
grid = NULL, grid_size = 20, grid_min_ratio = NULL,
alpha = NULL, family = "gaussian", weights = NULL,
tolerance = 1e-3, max_iterations = 5000,
min_working_set_size = 100,
verbose = FALSE)
G: matrix of main effects of size n x p, with variables organized by columns
E: vector of environmental measurements
Y: outcome vector. Set family="gaussian" for a continuous outcome and family="binomial" for a binary outcome with 0/1 levels
C: matrix of confounders of size n x m, with variables organized by columns
normalize: TRUE to normalize matrix G and vector E
normalize_response: TRUE to normalize vector Y
grid: grid sequence for tuning hyperparameters; the same grid is used for lambda_1 and lambda_2
grid_size: specify grid_size to generate the grid automatically. The grid is built by computing max_lambda from the data (the smallest lambda for which all coefficients are zero) and setting min_lambda to the product of max_lambda and grid_min_ratio. The program then generates grid_size values equidistant on the log10 scale from min_lambda to max_lambda
grid_min_ratio: parameter that determines min_lambda (the smallest value in the grid of lambdas); default is 0.1 for p > n, 0.01 otherwise
alpha: if NULL, an independent 2D grid is used for (lambda_1, lambda_2); otherwise a 1D grid is used where lambda_2 = alpha * lambda_1, i.e. (lambda_1, alpha * lambda_1)
family: "gaussian" for a continuous outcome and "binomial" for a binary outcome
tolerance: tolerance for the dual gap convergence criterion
max_iterations: maximum number of iterations
min_working_set_size: minimum size of the working set
weights: inner fitting parameter
verbose: TRUE to print messages
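The grid construction and the role of alpha described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper make_lambda_grid and the value max_lambda = 2.5 are hypothetical (gesso.fit computes max_lambda from the data), and the order in which the package visits the pairs may differ from the order shown here.

```r
# Hypothetical helper (not part of gesso): grid_size values equidistant on
# the log10 scale from min_lambda = max_lambda * grid_min_ratio to max_lambda.
make_lambda_grid <- function(max_lambda, grid_size = 20, grid_min_ratio = 0.01) {
  min_lambda <- max_lambda * grid_min_ratio
  10^seq(log10(min_lambda), log10(max_lambda), length.out = grid_size)
}

# max_lambda = 2.5 is an assumed value chosen for illustration
lambda_grid <- make_lambda_grid(max_lambda = 2.5, grid_size = 5, grid_min_ratio = 0.01)

# alpha = NULL: every (lambda_1, lambda_2) combination drawn from the same grid
pairs_2d <- expand.grid(lambda_1 = lambda_grid, lambda_2 = lambda_grid)

# alpha supplied: lambda_2 is tied to lambda_1, so the search is one-dimensional
alpha <- 0.5
pairs_1d <- data.frame(lambda_1 = lambda_grid, lambda_2 = alpha * lambda_grid)

nrow(pairs_2d)  # 5 * 5 = 25 pairs
nrow(pairs_1d)  # 5 pairs
```

Tying lambda_2 to lambda_1 through alpha reduces the number of fits from grid_size squared to grid_size, which can matter when each fit is expensive.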
A list of estimated coefficients and other model fit metrics for each pair of hyperparameters (lambda_1, lambda_2)
vector of estimated intercept values of size lambda_1*lambda_2
vector of estimated environment coefficients of size lambda_1*lambda_2
matrix of estimated main-effect coefficients organized by rows, of size (lambda_1*lambda_2) by p
matrix of estimated interaction coefficients organized by rows, of size (lambda_1*lambda_2) by p
matrix of estimated confounder coefficients organized by rows, of size (lambda_1*lambda_2) by m, where m is the number of confounders
number of iterations until convergence for each fit
maximum number of variables in the working set for each fit
1 if the model converged within given max_iterations, 0 otherwise
objective function (loss) value for each fit
number of estimated non-zero main effects for each fit
number of estimated non-zero interactions for each fit
lambda_1 path values, decreasing
lambda_2 path values, oscillating
vector of values used for hyperparameter tuning
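The per-fit components listed above appear to share one ordering, so a position found in one vector can be used to look up the others. The sketch below uses a small mock list in place of a real gesso.fit result so that it runs without the package. Only the component names beta_g_nonzero and beta_gxe_nonzero are taken from the example on this page; the counts are invented for illustration.

```r
# Mock stand-in for a fitted object; the counts are invented for illustration
fit <- list(
  beta_g_nonzero   = c(0, 2, 5, 9),
  beta_gxe_nonzero = c(0, 1, 3, 6)
)

# Position of the first fit with at least `target` non-zero interactions
target <- 3
idx <- which(fit$beta_gxe_nonzero >= target)[1]

# The same position indexes the other per-fit components
n_main <- fit$beta_g_nonzero[idx]
```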
# NOT RUN {
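library(gesso)
# Simulate a dataset with the default arguments of data.gen()
data = data.gen()
# Fit the model over the automatically generated (lambda_1, lambda_2) grid
fit = gesso.fit(G=data$G_train, E=data$E_train, Y=data$Y_train, normalize=TRUE)
# Non-zero main effects (black) and interactions (red) along the lambda path
plot(fit$beta_g_nonzero, pch=19, cex=0.4,
     ylab="num of non-zero features", xlab="lambdas path")
points(fit$beta_gxe_nonzero, pch=19, cex=0.4, col="red")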
# }