Computes precision weights that account for heteroscedasticity in RNA-seq count data based on non-parametric local linear regression estimates.
sp_weights(
y,
x,
phi,
use_phi = TRUE,
preprocessed = FALSE,
doPlot = FALSE,
gene_based = FALSE,
bw = c("nrd", "ucv", "SJ", "nrd0", "bcv"),
kernel = c("gaussian", "epanechnikov", "rectangular", "triangular", "biweight",
"tricube", "cosine", "optcosine"),
exact = FALSE,
transform = TRUE,
verbose = TRUE,
na.rm = FALSE
)
y: a numeric matrix of size G x n containing the raw RNA-seq counts or
preprocessed expression from n samples for G genes.
x: a numeric matrix of size n x p containing the model covariate(s) from
n samples (design matrix).
phi: a numeric design matrix of size n x K containing the K
variable(s) of interest (e.g. bases of time).
use_phi: a logical flag indicating whether conditional means should be conditioned
on phi and on covariate(s) x, or on x alone. Default is
TRUE, in which case conditional means are estimated conditionally on both
x and phi.
preprocessed: a logical flag indicating whether the expression data have
already been preprocessed (e.g. log2 transformed). Default is FALSE, in
which case y is assumed to contain raw counts and is normalized into
log(counts) per million.
doPlot: a logical flag indicating whether the mean-variance plot should be drawn.
Default is FALSE.
gene_based: a logical flag indicating whether to estimate weights at the gene level.
Default is FALSE, in which case weights are estimated at the observation level.
bw: a character string indicating the smoothing bandwidth selection method to use. See
bandwidth in the stats package for details. Possible values are "ucv", "SJ",
"bcv", "nrd" or "nrd0". Default is "nrd".
kernel: a character string indicating which kernel should be used.
Possibilities are "gaussian", "epanechnikov", "rectangular",
"triangular", "biweight", "tricube", "cosine",
"optcosine". Default is "gaussian" (NB: the "tricube" kernel
corresponds to the loess method).
exact: a logical flag indicating whether the non-parametric weights accounting
for the mean-variance relationship should be computed exactly or approximated
by interpolating the local regression of the mean against the
variance. Default is FALSE, which uses interpolation (faster).
transform: a logical flag indicating whether values should be transformed to uniform
for the purpose of local linear smoothing. This may be helpful if tail observations are sparse and
the specified bandwidth gives suboptimal performance there. Default is TRUE.
verbose: a logical flag indicating whether informative messages are printed
during the computation. Default is TRUE.
na.rm: a logical flag indicating whether missing values (including NA and NaN)
should be omitted from the calculations. Default is FALSE.
an n x G matrix containing the computed precision weights.
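To give a sense of what a precision weight derived from a mean-variance trend looks like, here is a minimal base-R sketch at the gene level. It is not the implementation used by sp_weights: the offsets in the log-cpm step, the use of stats::loess (local linear, tricube kernel) as the smoother, and all variable names are assumptions chosen for illustration.

```r
set.seed(1)
G <- 500
n <- 12
# Simulated raw counts, G genes (rows) x n samples (columns)
counts <- matrix(rnbinom(G * n, size = 0.5, mu = 200), nrow = G, ncol = n)

# log2 counts per million; the 0.5 and 1 offsets are illustrative assumptions
lib_size <- colSums(counts)
logcpm <- log2(t((t(counts) + 0.5) / (lib_size + 1)) * 1e6)

# Per-gene mean and variance of the transformed expression
gene_mean <- rowMeans(logcpm)
gene_var <- apply(logcpm, 1, var)

# Local linear fit (degree = 1) of the variance against the mean
fit <- loess(v ~ m, data = data.frame(m = gene_mean, v = gene_var), degree = 1)

# Precision weight = inverse of the smoothed variance (kept strictly positive)
smoothed_var <- pmax(predict(fit), .Machine$double.eps)
w_gene <- 1 / smoothed_var
```

Genes whose smoothed variance is large receive a small weight, which is how weighting compensates for heteroscedasticity in a downstream weighted regression.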
# NOT RUN {
#rm(list = ls())
set.seed(123)
G <- 10000
n <- 12
p <- 2
# one column per sample, so that y is a G x n matrix as documented
y <- sapply(1:n, FUN = function(x){rnbinom(n = G, size = 0.07, mu = 200)})
x <- sapply(1:p, FUN = function(x){rnorm(n = n, mean = n, sd = 1)})
# }
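The bw, kernel and exact arguments can be illustrated without the package. The sketch below is an assumption-laden illustration, not the code behind sp_weights: it computes the five bandwidth selectors from the stats package that share names with the bw options, defines a hypothetical helper local_linear() implementing a Gaussian-kernel local linear smoother, and compares evaluating the smoother at every point with interpolating from a coarse grid, which is the idea behind exact = FALSE. The simulated mu and v vectors are made up for the example.

```r
set.seed(123)
mu <- rnorm(300, mean = 5, sd = 2)            # hypothetical mean expression values
v <- exp(-0.3 * mu) + rexp(300, rate = 20)    # hypothetical variances, decreasing in mu

# Bandwidth selectors from the stats package (see ?bandwidth)
bws <- c(nrd = bw.nrd(mu), nrd0 = bw.nrd0(mu), ucv = bw.ucv(mu),
         bcv = bw.bcv(mu), SJ = bw.SJ(mu))

# Hypothetical helper: local linear estimate of E[y | x = x0], Gaussian kernel
local_linear <- function(x0, x, y, h) {
  vapply(x0, function(pt) {
    k <- dnorm((x - pt) / h)                      # kernel weights around pt
    fit <- lm.wfit(cbind(1, x - pt), y, w = k)    # weighted least squares line
    unname(fit$coefficients[1])                   # intercept = fitted value at pt
  }, numeric(1))
}

h <- bws[["nrd"]]

# "Exact": evaluate the smoother at every observed point
v_exact <- local_linear(mu, mu, v, h)

# Interpolated: evaluate on a coarse grid, then interpolate linearly (faster)
grid <- seq(min(mu), max(mu), length.out = 50)
v_grid <- local_linear(grid, mu, v, h)
v_interp <- approx(grid, v_grid, xout = mu, rule = 2)$y

# Largest discrepancy between the two approaches
max(abs(v_interp - v_exact))
```

The grid version fits 50 local regressions instead of 300, and the saving grows with the number of points, which is why interpolation is attractive when G x n values have to be smoothed.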