bayescount: Analyse Count data using MCMC

Description

Apply a Bayesian [zero-inflated] gamma / Weibull / lognormal / independant / simple Poisson model to count data to return possible values for mean count, coefficient of variation, and zero-inflation. A text .csv file named [name].[model].csv with the results is optionally written to the working directory (before checking if the file already exists), and an object [name].[model].results is copied to the Global environment within R. Where more than 1 model is used, a results file and object is created for each model. Convergence is assessed for each dataset by calculating the Gelman-Rubin statistic for each parameter, see autorun.jags. Optionally, the log likelihood for the model fit is also calculated. The time taken to complete each analysis (not including calculation of the likelihood) is also recorded. This function is a wrapper for bayescount.single(), allowing extra automation. The lower level functions in the runjags package are used for calling JAGS.

Note: The GUI interface for R in Windows may not continually refresh the output window, making it difficult to track the progress of the simulation (if silent.jags is FALSE). To avoid this, you can run the function from the terminal version of R (located in the Program Files/R/bin/ folder).

*THIS SOFTWARE IS INTENDED FOR EDUCATIONAL PURPOSES ONLY AND SHOULD NOT BE RELIED UPON FOR REAL WORLD APPLICATIONS*

Usage

bayescount(name = NA, data = NA, setnames = NA,
   model = c("ZILP"), divide.data = 1, 
   scale.mean = divide.data, remove.all.zeros = TRUE, 
   test = TRUE, alt.prior = FALSE, write.file = TRUE, 
   adjust.zi.mean = FALSE, likelihood = FALSE, 
   record.chains = FALSE, ...)

Arguments

Value

No value is returned by this function. Instead, a text .csv file named *name*.*model*.csv with the results is optionally written to the working directory (before checking if the file already exists), and an object *name*.*model*.results is copied to the Global environment within R (this will over-write any existing object of the same name). Where more than 1 model is used, a results file and object is created for each model.

The results files consist of a table (or matrix) with the following results for each dataset:namethe name of the dataset. This is a dimnames attribute for matrices.convergeda logical flag indicating a PSRF of below the target indicating successful convergence (1), or the opposite (0). Datasets where the model initially converged but the subsequent final chains had a PSRF of more than the target are classified as 'converged', but a warning message is printed in the function and the datasets will have an error code of 4.error.codea numeric code indicating the exit status of the simulation. 0 indicates no errors encountered, 1 indicates a repeatedly crashing model, 2 indicates a failure to converge within the specified time limit, 3 indicates a dataset with no observations that was removed from analysis, 4 indicates a model that initially converged but the subsequent final chains had a PSRF of more than the target, and 5 indicates an unknown error in the function not related to the model (i.e. a bug in the runjags/bayescount code...).samplesthe number of sampled iterations calculated by raftery.diag to reduce the MC error to the specified level.samples.to.convthe number of sampled iterations discarded as a result of slow convergence (will be 0 if the pilot chain converged).parameter estimatesthe lower 95% credible interval, median estimate and upper 95% credible interval for the mean, coefficient of variation, zero-inflation, log mean, log variance, shape parameter, and scale parameter as appropriate to the model.multivariate.psrfthe multivariate PSRF of the final mcmc chains. If this couldn't be calculated, NA.likelihood estimatesthe lower 95% credible interval, median estimate, upper 95% credible interval and maximum observed value (this is NOT equivalent to the maximum likelihood) of the likelihood for the model. For the compound distributions these likelihoods are integrated for all possible values of the Poisson means (calculated using likelihood), and may well be different to the deviance calculated using JAGS.time.takenthe total time in seconds taken to run the simulation (not including calculating the likelihood and other summary statistics).

Examples

Run this code

# run the function with all values as default, and 'name' and 'data' (from a local .csv file) to be input by the user when prompted:
bayescount()

# analyse data using zero-inflated gamma Poisson and zero-inflated lognormal Poisson models in 5 text .csv files named 'mydata/data.*numer*.csv' with column labels, and calculating the likelihoods, and limiting each dataset to 30 minutes to achieve convergence:
for (i in 1:5){
	bayescount(name=paste("Data ", i, sep=""), data=paste("mydata/data.", i, ".csv", sep=""), model=c("ZIGP", "ZILP"), setnames=TRUE, test = FALSE, likelihood = TRUE, silent.jags=TRUE, max.time="30mins")
}

# analyse local data (2 datasets with 20 animals each with 10 repeat samples) using a zero-inflated lognormal Poisson model:
# Simulate some data:
data <- array(dim=c(20,10,2))
means1 <- rgamma(20, 10, 1)
means2 <- rgamma(20, 5, 1)
for(i in 1:20){
	data[i,,1] <- rpois(10, means1[i])
	data[i,,2] <- rpois(10, means2[i])
}
# Missing data is permissible but means the likelihood cannot be calculated - a warning will be printed:
data[sample(1:(20*10*2), 10)] <- NA
# Run the analysis:
bayescount(name="analysis", data=data, model = "ZILP", setnames=c("Simulated group A", "Simulated group B"), likelihood=TRUE)

Run the code above in your browser using DataLab

Description

Usage

Arguments

Value

See Also

Examples