FitSingleMod: FitSingleMod

Description

Function to fit a model to the diversity values of subsamples of a given sample and its nested samples.

Usage

FitSingleMod(model.list, init.param, param.range,
             main.samp, tot.pop=(100*(DivSampleNum(main.samp,2)[1])),
             numit=10^5, varleft=1e-8, data.default=TRUE,
             subsizes = 6, dssamps = list(), nrf = 1,
             minrarefac=1, NResamples=1000, minplaus=10,
             fitloops=2)

Value

A list of class FitSingleMod containing the results of the fit of the model to the diversity samples. This includes the following:

param: matrix of fitted parameters for each nested sample
ssr: sum-of-squared residuals for the fits for each nested sample
ms: mean sum-of-squared residuals for the fits for each nested sample
discrep: goodness-of-fit values for the fits for each nested sample; this is expressed as the average, across the subsamples in each nested sample, of all the percentage residuals
local: prediction of main sample sizes according to fitted curves for each of the nested samples
global: prediction of population diversity at popsize according to fitted curves for each of the nested subsamples
AccuracyToObserved: vector of percentage errors between the observed diversity of full sample data and the estimated diversity of full sample data from subsamples
subsamplesizes: vector of nested subsample sizes
datapoints: the list of divsubsample objects used in the fitting. The length of the list is equal to the number of nested samples
modelname: name of the model used
numparam: number of parameters in the model
sampvar: the mean squared distances between subsample curves, local and global
mono.local: matrix of logical values: is the curve monotonically increasing, up to the main sample size?
mono.global: matrix of logical values: is the curve monotonically increasing, up to the population size?
slowing.local: matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), up to the main sample size?
slowing.global: matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), from minplaus to the population size (popsize)?
plausibility: matrix of logical values: is the curve plausible (i.e. monotonically increasing and with decreasing second derivative)?
dist.local: matrix of distances between curves fitted to the nested samples. Distances are calculated as areas between curves bounded by 0 and the main sample size
dist.global: similar to dist.local, but with curve upper bound the population size
local.ref.dist: distances of nested curves to the curve fitted to the whole sample, with the curves bounded by 0 and the main sample size
global.ref.dist: similar to local.ref.dist but with curve upper bound the population size
popsize: user defined population size
the model: the function corresponding to the user-selected modelname
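
The monotonicity and slowing checks reported in mono.*, slowing.* and plausibility can be illustrated with a minimal base-R sketch. This is not the package's internal implementation: the helper is_plausible, the example curve and the integer evaluation grid are all assumptions made for illustration, and "slowing" is read here as negative second differences.

```r
# Hypothetical fitted curve; the functional form and values are assumptions.
curve_fun <- function(x) 200 * x / (50 + x)

# Numerical check on an integer grid from `lower` to `upper`.
# is_plausible is an illustrative helper, not a DivE function.
is_plausible <- function(f, lower, upper) {
  y  <- f(seq(lower, upper))
  d1 <- diff(y)                    # first differences (slope)
  d2 <- diff(y, differences = 2)   # second differences (curvature)
  mono    <- all(d1 > 0)           # curve is monotonically increasing
  slowing <- all(d2 < 0)           # rate of increase is slowing
  c(mono = mono, slowing = slowing, plausible = mono && slowing)
}

# A saturating curve passes both checks.
res <- is_plausible(curve_fun, lower = 10, upper = 1000)

# A convex curve is increasing but not slowing, so it is not plausible.
res_convex <- is_plausible(function(x) x^2, lower = 1, upper = 10)
```

In this sketch, lower plays the role that minplaus plays for the global check, and upper stands in for either the main sample size or the population size.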

Arguments

model.list: model; written as a function: function(x, params) with(as.list(params), FunctionOfParams). Examples are given in the ModelSet data file as part of the DivE package. Used in the modFit function.
init.param: matrix of initial seed model parameters. For each matrix, each row represents a given parameter set; each column represents a parameter value. Column names must match parameter names (params) in the corresponding model in the list models. Examples are given in the ParamSeeds data file as part of the DivE package.
param.range: matrix of lower and upper model parameter bounds. Used for the modFit function. The first and second rows correspond to the lower and upper bounds respectively; each column represents a parameter value. Column names must match parameter names (params) in the corresponding model in the list models. Examples are given in the ParamRanges data file as part of the DivE package.
main.samp: the main sample, either as a 2-column data.frame (species ID, count of species), or a vector of species IDs.
tot.pop: total population (integer); default set to 100x the main.samp size.
numit: control argument passed to optimisation routine; the maximum number of iterations that modFit will perform. See modFit for details.
varleft: control argument passed to optimisation routine; see modFit for details.
data.default: if TRUE, the function uses the list of vectors of nested rarefaction data (divsubsample objects) generated by the DivSampleNum and divsubsample functions; if FALSE, the function uses the user-specified list of nested rarefaction data, dssamps.
subsizes: either number of subsamples of main.samp (integer), or a vector of subsample lengths. If the former, then the vector of sample lengths will be created using the DivSampleNum function.
dssamps: list of user specified rarefaction data DivSubsamples objects. The length of each component vector of each object in the list must correspond to the vector of subsample lengths (as defined by the user in subsizes).
nrf: difference between lengths of successive rarefaction datapoints.
minrarefac: minimum rarefaction x-axis value. This argument is not used if list of DivSubsamples object is specified in dssamps.
NResamples: number of resamples used to calculate the rarefaction data. This parameter is not used if list of DivSubsamples object is specified in dssamps. NB: different from numit parameter, which is specific to the fitting process.
minplaus: lower x-axis bound for plausibility check.
fitloops: number of fitting rounds performed for each model. In each round of fitting, the initial seed parameter values for each model will be the fitted parameters of the previous fitting run. This parameter has a significant impact on the computational time. The 'sweet spot' is 2.
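
The expected shapes of model.list, init.param, param.range and main.samp can be sketched in base R as follows. The model sat_model, the parameter names a1 and a2, and all numeric values are hypothetical and chosen only for illustration; the models and seeds shipped with DivE are in ModelSet, ParamSeeds and ParamRanges.

```r
# Hypothetical two-parameter saturating model in the
# function(x, params) with(as.list(params), ...) form described above.
sat_model <- function(x, params) with(as.list(params), a1 * x / (a2 + x))

# Seed matrix: one row per parameter set, one column per parameter.
seeds <- matrix(c(100,  50,
                  500, 200),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("a1", "a2")))

# Range matrix: row 1 holds lower bounds, row 2 holds upper bounds.
ranges <- matrix(c(0,   0,
                   1e6, 1e6),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("lower", "upper"), c("a1", "a2")))

# Column names of both matrices must match the model's parameter names.
stopifnot(identical(colnames(seeds), colnames(ranges)))

# Every seed should lie inside its bounds (t() puts parameters in rows,
# so the bound vectors recycle down each parameter set).
stopifnot(all(t(seeds) >= ranges["lower", ]),
          all(t(seeds) <= ranges["upper", ]))

# Evaluate the model at the first seed to confirm the names resolve.
pred <- sat_model(c(10, 100, 1000), seeds[1, ])

# main.samp as a vector of species IDs, and the equivalent
# 2-column data.frame of (species ID, count).
ids    <- c("A", "A", "B", "C", "C", "C")
counts <- as.data.frame(table(ids), stringsAsFactors = FALSE)
```

Checking names and bounds before fitting catches mismatches early, since a misnamed column would otherwise only surface as an error inside the optimisation routine.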

Author

Daniel J. Laydon, Aaron Sim, Charles R.M. Bangham, Becca Asquith

Details

This function fits a single specified model to the diversity values of the subsamples of a set of nested samples. The output is a list of raw fitting results (pre-scoring). Use this function to fit a specific parametric rarefaction curve to a sample and examine its performance, rather than to select the most appropriate model.

References

Laydon, D. J., Melamed, A., Sim, A., Gillet, N. A., Sim, K., Darko, S., Kroll, S., Douek, D. C., Price, D., Bangham, C. R. M., Asquith, B., Quantification of HTLV-1 clonality and TCR diversity, PLOS Comput. Biol. 2014

Examples

# See documentation of ScoreSingleMod for examples
