mlmc: mlmc for missing and censored response in multilevel model.

Description

mlmc() fits a Bayesian multilevel model to responses that are left-censored and not missing at random. It was motivated by the analysis of mass-spectrometry data with response-dependent, not-missing-at-random missingness, where known covariates are associated with the missingness. It also copes with a known detection limit in the response values.

Usage

mlmc(formula_completed, formula_missing, formula_censor = NULL,
  formula_subject, pdata, respond_dep_missing = TRUE,
  response_censorlim = NULL, pidname, sidname, iterno = 100, chains = 3,
  pathname, thin = 1, seed = 125, algorithm = "NUTS",
  warmup = floor(iterno/2), adapt_delta_value = 0.85, savefile = TRUE,
  usefit = T, stanfit)

Arguments

formula_completed

The main regression model formula; it has the same formula format as lmr() and is used to define the first-level response and its explanatory variables.

formula_missing

The logistic regression model formula for missingness; it has the same formula format as formula_completed.

formula_censor

The formula used in the program to define the observations with censored values.

formula_subject

The second level formula in the multilevel model which is used to define responses such as subject and its explanatory variables.

pdata

The dataset, containing the response and predictors in long format. The response is a vector with an indicator variable that identifies the corresponding unit. The data need the following basic variables: an indicator variable for the first-level response unit, a second-level indicator variable for the subject or sampling unit, an indicator for missingness, and an indicator for censoring. Missingness and censoring are two different classifications, so no observation should be flagged as both missing and censored. The data structure is illustrated in the example and described in the reference papers.
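
As an illustration of the layout described above, the following sketch builds a tiny long-format data frame and checks that no row is flagged as both missing and censored. The column names (var1, miss, censor, geneid, sid) follow the example at the end of this page; pdata_toy, its values, and the limit of 0.002 are made up for illustration and are not required by the package.

```r
# Hypothetical toy data in long format: one row per (unit, subject) pair.
pdata_toy <- data.frame(
  geneid = rep(1:2, each = 3),                    # first-level unit (e.g. protein or gene)
  sid    = rep(1:3, times = 2),                   # second-level unit (e.g. subject)
  var1   = c(1.20, NA, 0.002, 0.85, 0.002, NA),   # response
  miss   = c(0, 1, 0, 0, 0, 1),                   # 1 = response is missing
  censor = c(0, 0, 1, 0, 1, 0)                    # 1 = response is censored
)

# Missingness and censoring must be mutually exclusive.
overlap <- pdata_toy$miss == 1 & pdata_toy$censor == 1
stopifnot(!any(overlap))

# As in the example on this page: missing rows hold NA,
# censored rows hold the detection limit.
stopifnot(all(is.na(pdata_toy$var1[pdata_toy$miss == 1])))
stopifnot(all(pdata_toy$var1[pdata_toy$censor == 1] == 0.002))
```

Running a check like this before calling mlmc() catches overlapping flags early, before any sampling time is spent.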

respond_dep_missing

A logical variable indicating whether missingness depends on the response value.
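
To make the idea of response-dependent missingness concrete, here is a minimal simulation sketch in base R. It assumes a logistic model in which the probability of missingness depends on both a covariate and the response itself, in the spirit of the example at the end of this page; the coefficients and the names x, y, logit_p, and p_miss are hypothetical and do not describe the package's internal model.

```r
set.seed(1)
n <- 1000
x <- abs(rnorm(n))          # covariate
y <- 2 * x + rnorm(n)       # response

# Hypothetical coefficients: the logit of the missingness probability
# depends on the covariate x and on the response y itself.
logit_p <- -0.9 * x + 0.5 * y

# plogis() is the inverse logit, i.e. exp(l) / (1 + exp(l)).
p_miss <- plogis(logit_p)
miss   <- rbinom(n, 1, p_miss)

# Because missingness depends on y, the observed responses
# are not a random subsample of all responses.
y_obs <- ifelse(miss == 1, NA, y)
```

If the coefficient on y were zero, missingness would depend only on the covariate, which presumably corresponds to setting respond_dep_missing = FALSE.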

response_censorlim

The detection limit for the response value, e.g. 1 mg per liter for an intensity value.

pidname

Variable name that defines the multilevel response unit, e.g. protein name or gene name.

sidname

Variable name that defines the subject unit, e.g. patient id or sampling id.

iterno

Number of iterations for posterior sampling.

chains

rstan parameter that defines the number of chains used for posterior sampling.

pathname

Path to save output summary results.

thin

rstan parameter that defines how often iterations are saved (every thin-th iteration is kept).

seed

Random seed for the rstan function.

algorithm

rstan parameter with three options: "NUTS", "HMC", and "Fixed_param".

warmup

Number of iterations used for warmup (burn-in) in Stan.
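
The interplay of iterno, warmup, thin, and chains can be worked through with a little arithmetic. The sketch below assumes that iterno is passed on as the total per-chain iteration count including warmup, as with the iter argument in rstan; whether mlmc() does exactly this is an assumption. The variable names kept_per_chain and total_draws are illustrative only.

```r
# Default settings from the Usage section.
iterno <- 100
chains <- 3
thin   <- 1
warmup <- floor(iterno / 2)

# Post-warmup iterations kept per chain (exact for thin = 1;
# for thin > 1 treat this as an approximation).
kept_per_chain <- (iterno - warmup) %/% thin

# Total posterior draws across all chains.
total_draws <- chains * kept_per_chain
print(total_draws)
```

Under these assumptions the defaults give 50 retained draws per chain and 150 in total, which is small for final inference; larger values of iterno are likely needed in practice.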

adapt_delta_value

The adapt_delta value, an adaptation parameter for the sampling algorithm; it must lie between 0 and 1, and the default is 0.85.

savefile

A logical variable indicating whether the sampling files are to be saved.

usefit

A logical variable indicating whether the model uses an existing fit.

stanfit

The name of the fitted Stan model read from an .rds file.

Value

The function returns the result fitted by Stan. It contains the parameter summaries across all chains and the summary results for each chain. The plot function returns a visualization of the mean and parameters.

Examples

# NOT RUN {
set.seed(150)
library(MASS)
var2=abs(rnorm(800,0,1));treatment=c(rep(0,400),rep(1,400));
var1=(1/0.85)*var2+2*treatment;
geneid=rep(c(1:50),16);sid=c(rep(c(1:25),16),rep(c(26:50),16));
cov1=rWishart(1,df=50,Sigma=diag(rep(1,50)))
u=rnorm(50,0,1);mu=mvrnorm(n=1,mu=u,cov1[,,1])
sdd=rgamma(1,shape=1,scale=1/10);
for (i in 1:800) {var1[i]=var1[i]+rnorm(1,mu[geneid[i]],sdd)}
miss_logit=var2*(-0.9)+var1*(0.001);
miss=rbinom(800,1,exp(miss_logit)/(exp(miss_logit)+1));
censor=rep(0,800)
for (i in 1:800) {if (var1[i]<0.002) censor[i]=1}
pdata=data.frame(var1,var2,treatment,miss,censor,geneid,sid);
for ( i in 1:800) {if ((pdata$miss[i]==1) & (pdata$censor[i]==1)) pdata$miss[i]=0};
for ( i in 1:800) {if (pdata$miss[i]==1) pdata$var1[i]=NA;
                   if (pdata$censor[i]==1) pdata$var1[i]=0.002};
pidname="geneid";sidname="sid";
# copy and paste the following formulas into the mlmc() function respectively
formula_completed=var1~var2+treatment;
formula_missing=miss~var2;
formula_censor=censor~1;
formula_subject=~treatment;
response_censorlim=0.002;
pathdir=getwd()
fp=system.file("demo",package="mlmm")
mcfit=readRDS(paste0(fp,"/demo/mlmc.rds"))
model1=mlmc(formula_completed=var1~var2+treatment,formula_missing=miss~var2,
formula_censor=censor~1,formula_subject=~treatment,pdata=pdata,response_censorlim=0.002,
respond_dep_missing=TRUE,pidname="geneid",sidname="sid",pathname=pathdir,
iterno=5,chains=1,savefile=FALSE,usefit=T,stanfit=mcfit)
# }