For a list of dataframes, where each frame is of the form (Y_1, Y_2, ..., Y_K) and each Y_t takes the value 0, 1, or 2 (missing), salbmM estimates E[ Y_t | alpha ], where alpha is one of a number of sensitivity parameters, under a Markovian assumption of order m.
salbmM( data, Narm = length(data), m, K, ntree,
EmpEst=FALSE, NEst=0,
seeds = 1:length(data), seeds2 = -1 - 1:length(data),
alphas, NBootstraps = 0, bBS = 1,
returnJP = TRUE, returnSamples = FALSE )
data: a list of dataframes
Narm: the number of dataframes to process
m: order of the Markov assumption; note that 2m+2 < K
K: the number of time-points
ntree: the number of trees in the random forest, passed to randomForestSRC
EmpEst: logical, indicating if empirical estimation should be used when calculating the mean value of Y_t
NEst: the number of values of Y_t to use in calculating the mean of Y_t
seeds: vector of positive numbers used as seeds in producing bootstrap samples. There should be at least one seed for each treatment arm.
seeds2: vector of negative numbers passed to randomForestSRC. There should be at least one seed for each treatment arm.
alphas: vector of sensitivity parameters
NBootstraps: number of bootstrap samples to be created and analyzed
bBS: starting bootstrap number. Bootstrap IDs are given as bBS:eBS, where eBS = bBS + NBootstraps - 1. Setting bBS and eBS is useful when running salbmM in parallel.
returnJP: logical, indicating if the list of joint probability distributions returned by the random forest for each treatment group should be returned. This is used by addSamples to create bootstrap samples.
returnSamples: logical, indicating if the generated bootstrap samples should be returned
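The bBS:eBS bookkeeping described above can be sketched in base R. This is only an illustration of the rule eBS = bBS + NBootstraps - 1: the helper name `bootstrap_chunks` is hypothetical and not part of the package, and salbmM itself is not called.

```r
# Hypothetical helper (not part of the package): split a total number of
# bootstrap samples into per-worker (bBS, NBootstraps) pairs so that the
# ID ranges bBS:eBS do not overlap.
bootstrap_chunks <- function(total, workers) {
  sizes <- rep(total %/% workers, workers)
  extra <- total %% workers
  # Spread any remainder over the first few workers.
  if (extra > 0) sizes[seq_len(extra)] <- sizes[seq_len(extra)] + 1
  bBS <- cumsum(c(1, head(sizes, -1)))
  data.frame(bBS = bBS, NBootstraps = sizes, eBS = bBS + sizes - 1)
}

chunks <- bootstrap_chunks(total = 10, workers = 3)
print(chunks)
```

Under this assumption, each row would supply the bBS and NBootstraps values for one separate call to salbmM.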
salbmM returns a list which contains the following:
results for treatment group 1 in wide format
results for treatment group 1 in long format
means and standard deviations for trt1
joint distribution returned from randomForestSRC, trt1
results for bootstrap samples trt1 in wide format
results for bootstrap samples trt1 in long format
means and standard deviations of bootstrap samples trt1.
results for treatment group 2 in wide format
results for treatment group 2 in long format
means and standard deviations for trt2
joint distribution returned from randomForestSRC, trt2
results for bootstrap samples trt2 in wide format
results for bootstrap samples trt2 in long format
means and standard deviations of bootstrap samples trt2.
the salbm data object supplied in the call to salbmM
the Markov parameter supplied in the call to salbmM
the value of K supplied in the call to salbmM
the value of ntree supplied in the call to salbmM
the value of NEst supplied in the call to salbmM
the value of alphas supplied in the call to salbmM
the value of seeds supplied in the call to salbmM
the value of seeds2 supplied in the call to salbmM
the value of bBS supplied in the call to salbmM
the value of eBS supplied in the call to salbmM
the value of NBootstraps supplied in the call to salbmM
For each dataframe separately, randomForestSRC is used to create a set of joint distributions f( Y_{n-m}, Y_{n-m+1}, ..., Y_{n-1}, Y_n, Y_{n+1}, ..., Y_{n+m+1} ), where each Y_i can take three possible values: 0, 1, or missing (represented by the value 2). The Markovian assumption of order m can be summarized as f( Y_n | Y_i, i = 1, 2, ..., n-1, n+1, ..., K ) = f( Y_n | Y_i, i = max(1,n-m), ..., n-1, n+1, ..., min(n+m+1,K) ) for n > 1.
randomForestSRC is used to estimate the joint distributions f_i( Y_n | Y_{n-m}, ..., Y_{n-1}, Y_{n+1}, ..., Y_{n+m+1} ). For each sensitivity parameter alpha, these distributions are used to compute E[ Y_K | alpha ]. Bootstrapping is carried out using the f_i.
Because of the Markov assumption, the full distribution f can be replaced by a set of distributions, each over no more than 2m+2 variables. This allows estimation in situations where K is large and estimation of the full joint distribution is infeasible.
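The index set in the Markov assumption can be made concrete with a short base R sketch. The function name `markov_window` is hypothetical and used only for illustration; it lists the neighbours of Y_n that are kept, namely max(1,n-m), ..., n-1, n+1, ..., min(n+m+1,K).

```r
# Hypothetical illustration (not part of the package): indices i such that
# Y_i is retained when conditioning Y_n under a Markov assumption of order m.
markov_window <- function(n, m, K) {
  idx <- seq(max(1, n - m), min(n + m + 1, K))
  idx[idx != n]
}

K <- 10
m <- 2

# The assumption is stated for n > 1, so windows are built for n = 2, ..., K.
windows <- lapply(2:K, markov_window, m = m, K = K)
names(windows) <- paste0("Y_", 2:K)

# Each window together with Y_n itself involves at most 2m + 2 variables.
sizes <- vapply(windows, length, integer(1)) + 1L

print(windows[["Y_5"]])
print(max(sizes))
```

Near the ends of the sequence the windows are truncated at 1 and K, so the corresponding distributions involve fewer than 2m+2 variables.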
# NOT RUN {
library(salbm)

# Clinical trial data with two arms.
data(trt1)
data(trt2)
data <- list( trt1 = trt1, trt2 = trt2 )

# One seed of each kind per arm; no bootstrap samples are requested.
R <- salbmM( data = data, m = 2, K = 6, ntree = 1000,
             seeds = c(22,18), seeds2 = c(-2,-3),
             alphas = -8:8, NBootstraps = 0 )
# }
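For readers without the packaged trt1 and trt2 data, the input format from the description (K columns per frame, values 0, 1, or 2 for missing) can be mimicked with simulated data. This is a sketch under stated assumptions: the simulator `simulate_arm`, the column names Y1, ..., YK, the event probabilities, and the monotone dropout pattern are all invented for illustration and do not come from the package.

```r
set.seed(1)
K <- 6
n_subjects <- 20

# Hypothetical simulator: binary outcomes, with every visit from the first
# dropout onward coded as 2 (missing). Monotone dropout is an assumption.
simulate_arm <- function(n, K, p_event, p_drop) {
  y <- matrix(rbinom(n * K, size = 1, prob = p_event), nrow = n, ncol = K)
  for (i in seq_len(n)) {
    drop_at <- which(runif(K) < p_drop)[1]  # first visit at which dropout occurs
    if (!is.na(drop_at)) y[i, drop_at:K] <- 2
  }
  out <- as.data.frame(y)
  names(out) <- paste0("Y", seq_len(K))     # column names are an assumption
  out
}

# A list with one dataframe per treatment arm.
toy <- list(trt1 = simulate_arm(n_subjects, K, p_event = 0.4, p_drop = 0.1),
            trt2 = simulate_arm(n_subjects, K, p_event = 0.6, p_drop = 0.1))
str(toy$trt1)
```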