CNVreg (version 1.0)

fit_WTSMTH: Penalized Regression with Lasso and Weighted Fusion Penalties with Given Parameters

Description

Performs penalized regression with a Lasso penalty and a weighted fusion penalty for a given pair of tuning parameters (lambda1 and lambda2). The user chooses the pair, either based on prior knowledge or as arbitrary values for testing purposes.

Usage

fit_WTSMTH(
  data,
  lambda1,
  lambda2,
  weight = NULL,
  family = c("gaussian", "binomial"),
  iter.control = list(max.iter = 8L, tol.beta = 10^(-3), tol.loss = 10^(-6)),
  ...
)

Value

A numeric vector containing the estimated model parameters.

Arguments

data

An object of class "WTsmth.data" as generated by prep()

lambda1

A scalar numeric. Lambda_1 value to be considered. Provided value will be transformed to 2^(lambda1).

lambda2

A scalar numeric. Lambda_2 value to be considered. Provided value will be transformed to 2^(lambda2).
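
Because both tuning parameters are supplied as base-2 exponents, it can help to check the penalty values they imply. The short sketch below uses only base R and the exponents from the Examples section; the variable names penalty1 and penalty2 are illustrative and are not part of the package.

```r
# Exponents as passed to fit_WTSMTH() in the Examples section
lambda1 <- -5
lambda2 <- 21

# Values after the documented 2^(lambda) transformation
penalty1 <- 2^lambda1   # 0.03125
penalty2 <- 2^lambda2   # 2097152

# To target a specific penalty value, pass its base-2 logarithm instead
target_penalty <- 0.5
lambda_for_target <- log2(target_penalty)   # -1
```

A negative exponent therefore gives a penalty below 1, and each unit increase in the exponent doubles the penalty.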

weight

A character. The type of weighting. Must be one of "eql", "keql", "wcs", "kwcs", "wif", or "kwif", indicating equal weight, K x equal weight, cosine similarity, K x cosine similarity, inverse frequency, and K x inverse frequency, respectively, where K is the number of individuals with any CNV activity in a given CNV-active region. "eql" and "keql" give equal weight to adjacent CNVs. "wcs" and "kwcs" allow similar CNV fragments to have more similar effect sizes. "wif" and "kwif" encourage CNVs with lower frequency to borrow information from nearby, more frequent CNV fragments. CNVs usually occur in CNV-active regions separated by large regions with no CNV at all; the K-scaled options vary the weight across regions according to the number of individuals with CNV activity in each region.
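
As a rough illustration of the idea behind the cosine-similarity options, the sketch below computes the cosine similarity between hypothetical dosage vectors of adjacent CNV fragments. The vectors and the cosine_sim() helper are made up for this example; how the package turns such a similarity into a fusion weight is not described on this page.

```r
# Hypothetical dosage vectors for three adjacent CNV fragments
# across six individuals (made-up data, not from CNVCOVY)
frag_a <- c(1, 1, 0, 0, 1, 0)
frag_b <- c(1, 1, 0, 0, 0, 0)
frag_c <- c(0, 0, 1, 0, 0, 0)

# Standard cosine similarity between two numeric vectors
cosine_sim <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

sim_ab <- cosine_sim(frag_a, frag_b)  # 2 / sqrt(6), about 0.816
sim_bc <- cosine_sim(frag_b, frag_c)  # 0: no individuals in common
```

Under a similarity-based weighting, fragments like frag_a and frag_b, which are carried by mostly the same individuals, would be expected to be pulled toward similar effect sizes more strongly than frag_b and frag_c.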

family

A character. The family of the outcome. Must be one of "gaussian" (Y is continuous) or "binomial" (Y is binary).

iter.control

A list object. Allows the user to control the iterative update procedure. Allowed elements are "max.iter", the maximum number of iterations; "tol.beta", the difference between consecutive beta updates below which the procedure is deemed converged; and "tol.loss", the difference between consecutive loss values below which the procedure is deemed converged.
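
The sketch below shows one way to build a complete control list and how stopping rules of this kind typically behave. The loop is a toy Newton iteration toward sqrt(2) that stands in for the package's internal update, which is not shown on this page; stopping when either tolerance is met is an assumption, and the commented call assumes frag_data from the Examples section.

```r
# Tighter settings than the defaults shown under Usage;
# all three documented elements are supplied explicitly
ctrl <- list(max.iter = 20L, tol.beta = 10^(-4), tol.loss = 10^(-8))

# Toy iterative update (Newton steps toward sqrt(2)); a stand-in only
beta <- 1
loss <- (beta^2 - 2)^2
converged <- FALSE
for (iter in seq_len(ctrl$max.iter)) {
  beta_new <- 0.5 * (beta + 2 / beta)
  loss_new <- (beta_new^2 - 2)^2
  # Assumed rule: stop when either change falls below its tolerance
  converged <- abs(beta_new - beta) < ctrl$tol.beta ||
    abs(loss_new - loss) < ctrl$tol.loss
  beta <- beta_new
  loss <- loss_new
  if (converged) break
}

# The same list would be passed to the fitting function, for example:
# fit <- fit_WTSMTH(frag_data, lambda1 = -5, lambda2 = 6, weight = "eql",
#                   family = "binomial", iter.control = ctrl)
```

If the loop ends with converged still FALSE, the iteration limit was reached first, which in practice suggests raising max.iter or loosening the tolerances.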

...

Ignored.

Examples

# Note: we use a very small example data set here to expedite the examples.

# load the package and the toy dataset
library(CNVreg)
data("CNVCOVY")

# prepare data format for regression analysis

## Continuous outcome Y_QT
frag_data <- prep(CNV = CNV, Y = Y_QT, Z = Cov, rare.out = 0.05)
QT_fit <- fit_WTSMTH(frag_data, 
                     lambda1 = -5, 
                     lambda2 = 21, 
                     weight = "eql", 
                     family = "gaussian")
                        
## Binary outcome Y_BT

# We can directly replace frag_data$Y with Y_BT in the correct format,
# ensuring that the ordering matches that of the prepared object.

rownames(Y_BT) <- Y_BT$ID
frag_data$Y <- Y_BT[names(frag_data$Y), "Y"] |> drop()
names(frag_data$Y) <- rownames(frag_data$Z) 

# Or, we can also repeat the prep() call
# frag_data <- prep(CNV = CNV, Y = Y_BT, Z = Cov, rare.out = 0.05)

BT_fit <- fit_WTSMTH(frag_data,
                     lambda1 = -5,
                     lambda2 = 6,
                     weight = "eql",
                     family = "binomial")