PPSFS (version 0.1.1)

PPS-method: Partial Profile Score Feature Selection for GLMs

Description

ppsfs: PPSFS for main-effects.

ppsfsi: PPSFS for interaction effects.

Usage

ppsfs(
  x,
  y,
  family,
  keep = NULL,
  I0 = NULL,
  ...,
  ebicFlag = 1,
  maxK = min(NROW(x) - 1, NCOL(x) + length(I0)),
  verbose = FALSE
)

ppsfsi(
  x,
  y,
  family,
  keep = NULL,
  ...,
  ebicFlag = 1,
  maxK = min(NROW(x) - 1, choose(NCOL(x), 2)),
  verbose = FALSE
)

Value

Index set of identified features.

Arguments

x

Matrix.

y

Vector.

family

See glm and family.

keep

Initial set of features that are included in model fitting.

I0

Index set of interaction effects to be identified.

...

Additional parameters for glm.fit.

ebicFlag

The procedure stops once the EBIC has increased ebicFlag times.

maxK

Maximum number of identified features.

verbose

Print the procedure path?
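
The default values of maxK shown under Usage can be evaluated by hand to see which term is the binding one. The snippet below is only an illustration in base R: the matrices are zero-filled placeholders with made-up dimensions, and no PPSFS function is called.

```r
# Placeholder design matrices; only their dimensions matter here.
x_wide  <- matrix(0, nrow = 300, ncol = 1000)
x_small <- matrix(0, nrow = 300, ncol = 10)
I0 <- NULL  # no interaction effects supplied

# Default for ppsfs: min(NROW(x) - 1, NCOL(x) + length(I0))
maxK_main <- min(NROW(x_wide) - 1, NCOL(x_wide) + length(I0))

# Default for ppsfsi: min(NROW(x) - 1, choose(NCOL(x), 2))
maxK_inter_wide  <- min(NROW(x_wide) - 1, choose(NCOL(x_wide), 2))
maxK_inter_small <- min(NROW(x_small) - 1, choose(NCOL(x_small), 2))

print(c(maxK_main, maxK_inter_wide, maxK_inter_small))
```

With more columns than rows, the sample-size term NROW(x) - 1 is the cap. The number of candidate pairs choose(NCOL(x), 2) only becomes the cap when x has few columns.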

Details

ppsfs(x, y, family = "gaussian") is an implementation of the sequential lasso method proposed by Luo and Chen (2014, doi:10/f6kfr6).
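
To give a feel for how a forward procedure with an EBIC stopping rule behaves in the Gaussian case, here is a minimal base-R sketch. It is not the PPSFS source code. The helpers ebic_gauss() and forward_ebic() are hypothetical. The sketch assumes the EBIC form -2 logLik + k log(n) + 2 gamma log choose(p, k) with gamma = 1, and it reads "increased ebicFlag times" as consecutive non-improvements over the best EBIC seen so far. Each step adds the feature with the largest absolute inner product with the current residuals.

```r
# Hypothetical helper: Gaussian EBIC up to an additive constant.
ebic_gauss <- function(rss, n, k, p, gamma = 1) {
  n * log(rss / n) + k * log(n) + 2 * gamma * lchoose(p, k)
}

# Hypothetical forward selection with an EBIC stopping rule (illustration only).
forward_ebic <- function(x, y, ebic_flag = 1,
                         max_k = min(NROW(x) - 1, NCOL(x))) {
  n <- NROW(x)
  p <- NCOL(x)
  selected <- integer(0)   # features entered so far
  best <- integer(0)       # feature set with the smallest EBIC so far
  resid <- y - mean(y)     # residuals of the intercept-only model
  best_ebic <- ebic_gauss(sum(resid^2), n, 0, p)
  increases <- 0
  while (length(selected) < max_k) {
    # Score each candidate by |x_j' r|; exclude features already selected.
    score <- abs(drop(crossprod(x, resid)))
    score[selected] <- -Inf
    selected <- c(selected, which.max(score))
    # Refit by least squares on the intercept plus the selected features.
    fit <- lm.fit(cbind(1, x[, selected, drop = FALSE]), y)
    resid <- fit$residuals
    cur <- ebic_gauss(sum(resid^2), n, length(selected), p)
    if (cur < best_ebic) {
      best_ebic <- cur
      best <- selected
      increases <- 0
    } else {
      increases <- increases + 1
      if (increases >= ebic_flag) break
    }
  }
  sort(best)
}

# Simulated data with three active features, as in the Examples section.
set.seed(2022)
n <- 300
p <- 1000
x <- matrix(rnorm(n * p), n)
eta <- drop(x[, 1:3] %*% runif(3, 1.0, 1.5))
y <- eta + rnorm(n, sd = sd(eta) / 5)
sel <- forward_ebic(x, y)
print(sel)
```

A larger ebic_flag lets the search continue past a single non-improving step, at the cost of more model fits. The package itself may score candidates and count EBIC increases differently.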

References

Z. Xu, S. Luo and Z. Chen (2022). Partial profile score feature selection in high-dimensional generalized linear interaction models. Statistics and Its Interface. doi:10.4310/21-SII706

Examples

## ***************************************************
## Identify main-effect features
## ***************************************************
library(PPSFS)
set.seed(2022)
n <- 300
p <- 1000
x <- matrix(rnorm(n*p), n)
eta <- drop( x[, 1:3] %*% runif(3, 1.0, 1.5) )
y <- eta + rnorm(n, sd=sd(eta)/5)
print( A <- ppsfs(x, y, 'gaussian', verbose=TRUE) )

## ***************************************************
## Identify interaction effects
## ***************************************************
set.seed(2022)
n <- 300
p <- 150
x <- matrix(rnorm(n*p), n)
eta <- drop( cbind(x[, 1:3], x[, 4:6]*x[, 7:9]) %*% runif(6, 1.0, 1.5) )
y <- eta + rnorm(n, sd=sd(eta)/5)
print( group <- ppsfsi(x, y, 'gaussian', verbose=TRUE) )
print( A <- ppsfs(x, y, "gaussian", I0=group, verbose=TRUE) )

print( A <- ppsfs(x, y, "gaussian", keep=c(1, "5:8"), 
                  I0=group, verbose=TRUE) )
