DRIMSeq (version 1.0.2)

dmDispersion: Estimate dispersions in Dirichlet-multinomial model

Description

Maximum likelihood estimates of dispersion parameters in the Dirichlet-multinomial model used in differential splicing or sQTL analysis.

Usage

dmDispersion(x, ...)

## S4 method for signature 'dmDSdata'
dmDispersion(x, mean_expression = TRUE, common_dispersion = TRUE,
  genewise_dispersion = TRUE, disp_adjust = TRUE, disp_mode = "grid",
  disp_interval = c(0, 1e+05), disp_tol = 1e-08, disp_init = 100,
  disp_init_weirMoM = TRUE, disp_grid_length = 21,
  disp_grid_range = c(-10, 10), disp_moderation = "common",
  disp_prior_df = 0.1, disp_span = 0.3, prop_mode = "constrOptimG",
  prop_tol = 1e-12, verbose = 0,
  BPPARAM = BiocParallel::MulticoreParam(workers = 1))

## S4 method for signature 'dmSQTLdata'
dmDispersion(x, mean_expression = TRUE, common_dispersion = TRUE,
  genewise_dispersion = TRUE, disp_adjust = TRUE, disp_mode = "grid",
  disp_interval = c(0, 10000), disp_tol = 1e-08, disp_init = 100,
  disp_init_weirMoM = TRUE, disp_grid_length = 21,
  disp_grid_range = c(-10, 10), disp_moderation = "none",
  disp_prior_df = 0.1, disp_span = 0.3, prop_mode = "constrOptimG",
  prop_tol = 1e-12, verbose = 0, speed = TRUE,
  BPPARAM = BiocParallel::MulticoreParam(workers = 1))

Arguments

x
dmDSdata or dmSQTLdata object.
...
Other parameters that can be defined by methods using this generic.
mean_expression
Logical. Whether to estimate the mean expression of genes.
common_dispersion
Logical. Whether to estimate the common dispersion.
genewise_dispersion
Logical. Whether to estimate the gene-wise dispersion.
disp_adjust
Logical. Whether to use the Cox-Reid adjusted (TRUE) or the non-adjusted (FALSE) profile likelihood.
disp_mode
Optimization method used to maximize the profile likelihood. Possible values are "optimize", "optim", "constrOptim", "grid". See Details.
disp_interval
Numeric vector of length 2 defining the interval of possible values for the dispersion.
disp_tol
The desired accuracy when estimating dispersion.
disp_init
Initial dispersion. If common_dispersion is TRUE, then disp_init is overwritten by the common dispersion estimate.
disp_init_weirMoM
Logical. Whether to use the Weir moment estimator as an initial value for dispersion. If TRUE, then disp_init is replaced by Weir estimates.
disp_grid_length
Length of the search grid.
disp_grid_range
Vector giving the limits of grid interval.
disp_moderation
Dispersion moderation method. Dispersion estimates can be shrunk toward the common dispersion ("common") or toward the dispersion versus mean expression trend ("trended").
disp_prior_df
Degree of moderation (shrinkage).
disp_span
Value from 0 to 1 defining the fraction of genes used in the sliding smoothing window when calculating the dispersion versus mean expression trend.
prop_mode
Optimization method used to estimate proportions. Possible values are "constrOptim" and "constrOptimG".
prop_tol
The desired accuracy when estimating proportions.
verbose
Numeric. Defines the level of progress messages displayed: 0 - no messages, 1 - main messages, 2 - a message for every gene fitting.
BPPARAM
Parallelization method used by bplapply.
speed
Logical. If FALSE, dispersion is calculated for each gene-block. Such a calculation may take a long time, since there can be hundreds of SNPs/blocks per gene. If TRUE, only one dispersion is calculated per gene and it is assigned to all the blocks matched with this gene.

Value

Returns a dmDSdispersion or dmSQTLdispersion object.

Details

Parameters that are used in the dispersion estimation start with prefix disp_, and those that are used for the proportion estimation start with prop_.

There are 4 optimization methods implemented within dmDispersion ("optimize", "optim", "constrOptim" and "grid") that can be used to estimate the gene-wise dispersion. Common dispersion is estimated with "optimize".

Arguments that are used by all the methods are:

  • disp_adjust
  • prop_mode: Both "constrOptim" and "constrOptimG" use the constrOptim function to maximize the likelihood of Dirichlet-multinomial proportions. The difference lies in the way the likelihood and score are computed. "constrOptim" uses a likelihood and score that are calculated based on the fact that x*Gamma(x) = Gamma(x+1). In "constrOptimG", they are computed using the lgamma function. We recommend using the second approach, since it is much faster than the first one.
  • prop_tol: The accuracy for proportions estimation defined as reltol in constrOptim.
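
The difference between the two prop_mode formulations can be illustrated with a minimal sketch. This is not DRIMSeq's internal code: the function names dm_loglik_lgamma, log_rising and dm_loglik_recurrence are hypothetical, the multinomial coefficient is dropped, and the dispersion (gamma0) is assumed to enter the model as the sum of the Dirichlet parameters. Both versions should return the same log-likelihood for a single sample; the recurrence version replaces each ratio Gamma(a + m) / Gamma(a) by a product of m terms, which follows from x*Gamma(x) = Gamma(x+1).

```r
# Sketch only: hypothetical helpers, not the functions used inside DRIMSeq.
# y: integer counts of the features of one gene in one sample
# prop: feature proportions (sum to 1); gamma0: assumed Dirichlet concentration

# Version based on lgamma
dm_loglik_lgamma <- function(y, prop, gamma0) {
  a <- prop * gamma0
  lgamma(gamma0) - lgamma(sum(y) + gamma0) + sum(lgamma(y + a) - lgamma(a))
}

# log(Gamma(a + m) / Gamma(a)) = sum_{k = 0}^{m - 1} log(a + k),
# obtained by applying x * Gamma(x) = Gamma(x + 1) m times
log_rising <- function(a, m) {
  if (m == 0) return(0)
  sum(log(a + seq_len(m) - 1))
}

# Version based on the recurrence
dm_loglik_recurrence <- function(y, prop, gamma0) {
  a <- prop * gamma0
  sum(mapply(log_rising, a, y)) - log_rising(gamma0, sum(y))
}

y <- c(12, 3, 0, 7)
prop <- c(0.5, 0.2, 0.1, 0.2)
all.equal(dm_loglik_lgamma(y, prop, 50), dm_loglik_recurrence(y, prop, 50))
```

The recurrence version loops over every count, so its cost grows with the library size, whereas the lgamma version needs a fixed number of function calls per feature. This is consistent with the recommendation above, although the actual timings in DRIMSeq depend on its own implementation.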

Of the remaining dispersion parameters in dmDispersion, only some influence the output for a given disp_mode. The active parameters for each mode are listed below:

"optimize", which uses optimize to maximize the profile likelihood.

  • disp_interval: Passed as interval.
  • disp_tol: The accuracy defined as tol.
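
As a rough illustration of how disp_interval and disp_tol map onto optimize, the sketch below maximizes a toy Dirichlet-multinomial log-likelihood over the dispersion. It makes several assumptions that are not taken from DRIMSeq: the counts are made up, the proportions are held fixed at the pooled estimates instead of being re-estimated for each dispersion value, no Cox-Reid adjustment is applied, and profile_loglik is a hypothetical name.

```r
# Sketch only: toy data and a simplified likelihood, not DRIMSeq internals.
# Rows are features of one gene, columns are samples (made-up counts).
counts <- matrix(c(30, 10,  5,
                   12, 20,  9,
                   25,  4, 15,
                    8, 18,  3), nrow = 3)

# Proportions are held fixed here; the real profile likelihood would
# re-estimate them for every candidate dispersion.
prop <- rowSums(counts) / sum(counts)

# Log-likelihood as a function of the dispersion, summed over samples
profile_loglik <- function(gamma0) {
  a <- prop * gamma0
  sum(apply(counts, 2, function(y) {
    lgamma(gamma0) - lgamma(sum(y) + gamma0) + sum(lgamma(y + a) - lgamma(a))
  }))
}

disp_interval <- c(0, 1e+05)  # default for dmDSdata in the Usage section
disp_tol <- 1e-08             # default in the Usage section

fit <- optimize(profile_loglik, interval = disp_interval,
                maximum = TRUE, tol = disp_tol)
fit$maximum    # estimated dispersion for the toy gene
fit$objective  # log-likelihood at that value
```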

"optim", which uses optim to maximize the profile likelihood.

  • disp_init and disp_init_weirMoM: The initial value par.
  • disp_tol: The accuracy defined as factr.

"constrOptim", which uses constrOptim to maximize the profile likelihood.

  • disp_init and disp_init_weirMoM: The initial value theta.
  • disp_tol: The accuracy defined as reltol.

"grid", which uses the grid approach from edgeR.

  • disp_init, disp_grid_length, disp_grid_range: Parameters used to construct the search grid disp_init * 2^seq(from = disp_grid_range[1], to = disp_grid_range[2], length = disp_grid_length).
  • disp_moderation: Dispersion shrinkage is available only with the "grid" method.
  • disp_prior_df: Used only when dispersion shrinkage is activated. The moderated likelihood is equal to loglik + disp_prior_df * moderation. The higher disp_prior_df, the more shrinkage toward the common or trended dispersion is applied.
  • disp_span: Used only when dispersion moderation toward trend is activated.
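
The grid construction and the effect of disp_prior_df can be sketched as follows. The grid formula and default values come from this page; everything else is an assumption made for illustration: the profile log-likelihoods are made-up parabolas, the moderation term is taken to be the average profile over genes (in the spirit of edgeR), and pick is a hypothetical helper.

```r
# Search grid built from the defaults in the Usage section
disp_init <- 100
disp_grid_range <- c(-10, 10)
disp_grid_length <- 21

grid <- disp_init * 2^seq(from = disp_grid_range[1], to = disp_grid_range[2],
                          length.out = disp_grid_length)
range(grid)  # 0.09765625 102400

# Toy profile log-likelihoods (made up): one row per gene, one column per
# grid point, each gene peaking at a different grid index.
peaks <- c(gene1 = 4, gene2 = 11, gene3 = 18)
loglik <- t(sapply(peaks, function(p) -(seq_along(grid) - p)^2))

# ASSUMPTION: moderation toward "common" is the mean profile over all genes,
# repeated for every gene. DRIMSeq's actual moderation term may differ.
moderation <- matrix(colMeans(loglik), nrow = nrow(loglik),
                     ncol = ncol(loglik), byrow = TRUE)

# Hypothetical helper: grid value maximizing the moderated likelihood
pick <- function(disp_prior_df) {
  grid[apply(loglik + disp_prior_df * moderation, 1, which.max)]
}

pick(0)   # no shrinkage: each gene keeps its own maximum
pick(10)  # strong shrinkage: estimates move toward the shared maximum
```

In this toy setting, increasing disp_prior_df pulls the gene-wise estimates toward the grid point favored by the averaged profile, which is the behavior described for the moderated likelihood above.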

See Also

data_dmDSdata, data_dmSQTLdata, plotDispersion, dmFit, dmTest

Examples

###################################
### Differential splicing analysis
###################################
# If possible, use BPPARAM = BiocParallel::MulticoreParam() with more workers

d <- data_dmDSdata

### Filtering
# Check the minimal number of replicates per condition
table(samples(d)$group)
d <- dmFilter(d, min_samps_gene_expr = 7, min_samps_feature_expr = 3, 
 min_samps_feature_prop = 0)

### Calculate dispersion
d <- dmDispersion(d, BPPARAM = BiocParallel::SerialParam())
plotDispersion(d)

head(mean_expression(d))
common_dispersion(d)
head(genewise_dispersion(d))


