# Gordon Smyth

#### 5 packages on CRAN

#### 6 packages on Bioconductor

A collection of algorithms and functions to aid statistical modeling. Includes limiting dilution analysis (aka ELDA), growth curve comparisons, mixed linear models, heteroscedastic regression, inverse-Gaussian probability calculations, Gauss quadrature and a secure convergence algorithm for nonlinear models. Also includes advanced generalized linear model functions including Tweedie and Digamma distributional families and a secure convergence algorithm.

Model fitting and evaluation tools for double generalized linear models (DGLMs). This class of models uses one generalized linear model (GLM) to fit the specified response and a second GLM to fit the deviance of the first model.

Data sets from the book Generalized Linear Models with Examples in R by Dunn and Smyth.

A Graphical User Interface for analysis of Affymetrix microarray gene expression data using the affy and limma packages.

Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.

Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it be applied to differential signal analysis of other types of genomic data that produce counts, including ChIP-seq, SAGE and CAGE.

A differential expression analysis method for RNA-seq data from repeated-measures design using general linear model framework and parametric bootstrap inference. The method accounts for the dependence of gene expression levels due to the repeated-measures through continuous autoregressive correlation structure. The method is described in Chapter 4 of Nguyen (2018) <https://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=7433&context=etd>.

R implementation of generalized survival models (GSMs), smooth accelerated failure time (AFT) models and Markov multi-state models. For the GSMs, g(S(t|x))=eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). The main assumption is that the time effect(s) are smooth <doi:10.1177/0962280216664760>. For fully parametric models with natural splines, this re-implements Stata's 'stpm2' function, which are flexible parametric survival models developed by Royston and colleagues. We have extended the parametric models to include any smooth parametric smoothers for time. We have also extended the model to include any smooth penalized smoothers from the 'mgcv' package, using penalized likelihood. These models include left truncation, right censoring, interval censoring, gamma frailties and normal random effects <doi:10.1002/sim.7451>. For the smooth AFTs, S(t|x) = S_0(t*eta(t,x)), where the baseline survival function S_0(t)=exp(-exp(eta_0(t))) is modelled for natural splines for eta_0, and the time-dependent cumulative acceleration factor eta(t,x)=\int_0^t exp(eta_1(u,x)) du for log acceleration factor eta_1(u,x). The Markov multi-state models allow for a range of models with smooth transitions to predict transition probabilities, length of stay, utilities and costs, with differences, ratios and standardisation.

A Graphical User Interface for differential expression analysis of two-color microarray data using the limma package.