ADSIHT (version 0.2.1)

ADSIHT.ML: ADSIHT in multi-task learning framework

Description

An implementation of sparse group selection in the linear regression model via ADSIHT, applied in a multi-task learning framework.

Usage

ADSIHT.ML(
  x_list,
  y_list,
  group_list,
  s0,
  kappa = 0.9,
  ic.type = c("dsic", "loss"),
  ic.scale = 3,
  ic.coef = 3,
  L = 5,
  weight,
  coef1 = 1,
  coef2 = 1,
  eta = 0.8,
  max_iter = 20,
  method = "ols",
  center = TRUE,
  scale = 1
)

Value

A list object comprising:

beta

A \(p\)-by-length(s0) matrix of coefficients, stored in column format.

intercept

A length(s0) vector of intercepts.

lambda

A length(s0) vector of threshold values.

A_out

The selected variables for each threshold value in lambda.

ic

The values of the specified criterion for each fitted model, one per threshold value in lambda.
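
As a rough illustration of how the returned components fit together, the sketch below picks the model with the smallest criterion value from a mock object. The object `fit_mock` and all of its numbers are made up for illustration; it only mimics the documented structure (A_out is left out) and is not output from ADSIHT.ML.

```r
# Mock object mimicking the documented return structure (values are made up,
# not produced by ADSIHT.ML): p = 4 coefficients, 3 candidate models.
fit_mock <- list(
  beta      = cbind(c(2, 0, 0, 0), c(2, 0, 1.5, 0), c(2, 0.1, 1.5, 0.2)),
  intercept = c(0.3, 0.1, 0.1),
  lambda    = c(1.0, 0.5, 0.1),
  ic        = c(120.4, 98.7, 105.2)
)

best <- which.min(fit_mock$ic)         # index of the model with the smallest criterion
beta_best <- fit_mock$beta[, best]     # coefficient vector of that model
support_best <- which(beta_best != 0)  # indices of its non-zero coefficients
```

With a real fit, the same indexing by `which.min(fit$ic)` applies to the columns of beta and to the entries of intercept and lambda.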

Arguments

x_list

A list of input matrices, one per task.

y_list

A list of response variables, one per task.

group_list

A vector indicating which group each variable belongs to. Variables in the same group should be located in adjacent columns of x, and their corresponding indices in group should be the same. Denote the first group as 1, the second as 2, etc.
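
The layout rule above can be checked with a few lines of base R. This is only a sketch: the helper `is_valid_group` and the example vectors are hypothetical and are not part of the package.

```r
# Hypothetical layout: 12 variables in 4 groups of 3 adjacent columns each
group <- rep(1:4, each = 3)

# Check the documented layout: labels start at 1, never decrease, and
# increase by at most 1 from one column to the next.
is_valid_group <- function(g) {
  g[1] == 1 && all(diff(g) %in% c(0, 1))
}

is_valid_group(group)          # adjacent, consecutively numbered groups
is_valid_group(c(1, 2, 1, 2))  # same-group columns are not adjacent
```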

s0

A vector that controls the degree of sparsity within groups. Default is \(d^{(l-1)/(L-1)}\) for \(1 \leq l \leq L\), where \(d\) is the maximum group size.
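
To make the default sequence concrete, the base R sketch below evaluates d^((l-1)/(L-1)) for l = 1, ..., L. The `group` vector is a made-up example, and whether the package rounds or otherwise adjusts these values internally is not stated in this page.

```r
# Hypothetical group index: 3 groups of sizes 4, 8, and 16
group <- rep(1:3, times = c(4, 8, 16))

d <- max(table(group))  # maximum group size
L <- 5                  # length of the s0 sequence (the documented default)

# d^((l - 1) / (L - 1)) for l = 1, ..., L
s0_default <- d^((seq_len(L) - 1) / (L - 1))
s0_default
```

The sequence always starts at 1 and ends at d, with geometric spacing in between.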

kappa

A parameter that controls how rapidly the threshold decreases. Default is 0.9.

ic.type

The type of criterion for choosing the support size. Available options are "dsic", "loss". Default is "dsic".

ic.scale

A non-negative value used for multiplying the penalty term in information criterion. Default: ic.scale = 3.

ic.coef

A non-negative value used for multiplying the penalty term for choosing the optimal stopping time. Default: ic.coef = 3.

L

The length of the sequence of s0. Default: L = 5.

weight

The weight of the samples, with the default value set to 1 for each sample.

coef1

A positive value to control the sub-optimal stopping time.

coef2

A positive value to control the overall stopping time. A smaller value leads to a larger search range.

eta

A parameter that controls the step size in the gradient descent step. Default: eta = 0.8.

max_iter

A parameter that controls the maximum number of line search iterations; ignored if OLS is employed.

method

Whether ols (default) or linesearch method should be employed.

center

A boolean value indicating whether centralization is required. Default: center = TRUE.

scale

A positive value to control the column-wise L2 norm of each observation matrix. Default: scale=1.
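
One plausible reading of center and scale is that each column of an observation matrix is centered and then rescaled to a common L2 norm. The base R sketch below illustrates that reading only; it is an assumption about the preprocessing, not the package's internal code, and `scale_value` is a hypothetical name.

```r
set.seed(1)
x <- matrix(rnorm(20 * 3, mean = 5), nrow = 20)  # toy observation matrix

scale_value <- 1  # hypothetical target for each column's L2 norm

x_centered <- sweep(x, 2, colMeans(x))     # subtract column means
col_norms  <- sqrt(colSums(x_centered^2))  # L2 norm of each centered column
x_scaled   <- sweep(x_centered, 2, col_norms / scale_value, "/")

sqrt(colSums(x_scaled^2))  # L2 norm of each column after rescaling
```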

Author

Yanhang Zhang, Zhifan Li, Shixiang Liu, Jianxin Yin.

Examples

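library(ADSIHT)

set.seed(1)
n <- 200   # observations per task
p <- 100   # variables per task
K <- 4     # number of tasks
s <- 5     # number of non-sparse groups
s0 <- 2    # non-zero coefficients within each non-sparse group
x_list <- lapply(1:K, function(x) matrix(rnorm(n * p, 0, 1), nrow = n))
# True coefficients for all K tasks, stacked task by task
vec <- rep(0, K * p)
non_sparse_groups <- sample(1:p, size = s, replace = FALSE)
for (group in non_sparse_groups) {
  # A group collects the same variable across the K tasks
  group_indices <- seq(group, K * p, by = p)
  non_zero_indices <- sample(group_indices, size = s0, replace = FALSE)
  vec[non_zero_indices] <- rep(2, s0)
}
# Responses: task i uses its own block of p coefficients plus Gaussian noise
y_list <- lapply(1:K, function(i) {
  x_list[[i]] %*% vec[((i - 1) * p + 1):(i * p)] + rnorm(n, 0, 0.5)
})
fit <- ADSIHT.ML(x_list, y_list)
# Variables selected by the model with the smallest criterion value
fit$A_out[, which.min(fit$ic)]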