Running automated method on VA data
codeVA(
  data,
  data.type = c("WHO2012", "WHO2016", "PHMRC", "EAVA", "customize")[2],
  data.train = NULL,
  causes.train = NULL,
  causes.table = NULL,
  model = c("InSilicoVA", "InterVA", "Tariff", "NBC", "EAVA")[1],
  Nchain = 1,
  Nsim = 10000,
  version = c("4.02", "4.03", "5")[2],
  HIV = "h",
  Malaria = "h",
  phmrc.type = c("adult", "child", "neonate")[1],
  convert.type = c("quantile", "fixed", "empirical")[1],
  age_group = c("neonate", "child")[1],
  ...
)
Returns a fitted object.
data: Input VA data. See data.type below for more information about the format.
data.type: Format of the input data. The codeVA function currently supports five data input types:
WHO2012: InterVA-4 input format using WHO 2012 questionnaire. For example see data(RandomVA1). The first column should be death ID.
WHO2016: InterVA-5 input format using WHO 2016 questionnaire. For example see data(RandomVA5). The first column should be death ID.
PHMRC: PHMRC data format. The raw PHMRC long format data will be processed internally following the steps described in McCormick et al. (2016). For an example, see ConvertData.phmrc.
EAVA: EAVA data format using WHO 2016 questionnaire, as produced by EAVA::odk2EAVA().
customize: Any dichotomized dataset with ``Y'' denoting ``presence'', ``'' denoting ``absence'', and ``.'' denoting ``missing''. The first column should be death ID.
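As a rough illustration of the dichotomized coding described for the last input type, the sketch below recodes a small raw table into ``Y'' / ``'' / ``.'' values using base R only. The column names (ID, fever, cough), the raw yes/no answers, and the helper dichotomize() are all hypothetical and are not part of openVA.

```r
# Hypothetical raw answers: "yes", "no", or NA for a missing answer.
raw <- data.frame(
  ID    = c("d1", "d2", "d3"),   # death ID must be the first column
  fever = c("yes", "no", NA),
  cough = c("no", NA, "yes"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "Y" = presence, "" = absence, "." = missing.
dichotomize <- function(x) {
  ifelse(is.na(x), ".", ifelse(x == "yes", "Y", ""))
}

# Recode every symptom column and leave the ID column untouched.
custom <- raw
custom[-1] <- lapply(raw[-1], dichotomize)

print(custom)
```

A table built this way has the shape expected for the dichotomized input type; whether a given model accepts it still depends on the training data and symptom names supplied.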
data.train: Training data with the same columns as data, plus an additional column giving the cause-of-death label. It is not used if data.type is ``WHO2012'' or ``WHO2016'' and model is ``InterVA'' or ``InSilicoVA''. The first column must also be the death ID for the WHO and ``customize'' types.
causes.train: The column name of the cause-of-death label in the training data.
causes.table: List of causes to consider in the training data. Defaults to NULL, which uses all the causes present in the training data.
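The relationship between the input data, the training data, and the cause label can be checked before fitting. The following is a minimal base R sketch on toy data; the symptom columns s1 and s2, the cause labels ``A'' and ``B'', and the checks themselves are illustrative assumptions, not code from openVA.

```r
# Toy input data: death ID first, then dichotomized symptoms.
test <- data.frame(
  ID = c("d1", "d2"),
  s1 = c("Y", ""),
  s2 = c(".", "Y"),
  stringsAsFactors = FALSE
)

# Toy training data: same columns plus one cause-of-death label column.
train <- data.frame(
  ID    = c("t1", "t2", "t3"),
  cause = c("A", "B", "A"),
  s1    = c("Y", "", "Y"),
  s2    = c("", "Y", "."),
  stringsAsFactors = FALSE
)

# Name of the label column, as it would be given to causes.train.
label_col <- "cause"

# The only column in train that is absent from test should be the label.
extra_cols <- setdiff(names(train), names(test))
stopifnot(identical(extra_cols, label_col))

# With causes.table left as NULL, all causes present in the training
# data are used; this is the set that would be considered.
causes_used <- sort(unique(train[[label_col]]))
print(causes_used)
```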
model: Currently supports five models: ``InSilicoVA'', ``InterVA'', ``Tariff'', ``NBC'', and ``EAVA''.
Nchain: Parameter specific to ``InSilicoVA'' model. Currently not used.
Nsim: Parameter specific to ``InSilicoVA'' model. Number of iterations to run the sampler.
version: Parameter specific to ``InterVA'' model. Currently supports ``4.02'', ``4.03'', and ``5''. For InterVA-4, ``4.03'' is strongly recommended as it fixes several major bugs in ``4.02'' version. ``4.02'' is only included for backward compatibility. ``5'' version implements the InterVA-5 model, which requires different data input format.
HIV: Parameter specific to ``InterVA'' model. HIV prevalence level, can take values ``h'' (high), ``l'' (low), and ``v'' (very low).
Malaria: Parameter specific to ``InterVA'' model. Malaria prevalence level, can take values ``h'' (high), ``l'' (low), and ``v'' (very low).
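To show how the InterVA-specific settings fit together, here is a hedged sketch of an InterVA-4 call on WHO 2012 format data. It assumes the openVA package is installed and that write = FALSE is forwarded through ... to the underlying InterVA function; the 50-row subset is only there to keep the run short. The call is skipped when openVA is not available.

```r
# Prevalence codes documented for the InterVA model.
valid_levels <- c("h", "l", "v")

hiv_level     <- "h"   # high HIV prevalence
malaria_level <- "l"   # low malaria prevalence

# Fail early on a code outside the documented set.
stopifnot(hiv_level %in% valid_levels, malaria_level %in% valid_levels)

if (requireNamespace("openVA", quietly = TRUE)) {
  library(openVA)
  data(RandomVA1)                    # WHO 2012 format example data
  va_subset <- RandomVA1[1:50, ]     # small subset to keep the run short

  fit_interva <- codeVA(
    data      = va_subset,
    data.type = "WHO2012",
    model     = "InterVA",
    version   = "4.03",              # recommended over "4.02" for InterVA-4
    HIV       = hiv_level,
    Malaria   = malaria_level,
    write     = FALSE                # assumption: passed on via ...
  )
  print(summary(fit_interva))
}
```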
phmrc.type: Which PHMRC data format is used. Currently only ``adult'' and ``child'' are supported; ``neonate'' will be supported in the next release.
convert.type: Type of data conversion used when calculating the conditional probability of each symptom given each cause of death, P(S|C), for the InterVA and InSilicoVA models. ``quantile'' and ``fixed'' usually give similar results empirically.
quantile: the rankings of the P(S|C) are obtained by matching the same quantile distributions in the default InterVA P(S|C).
fixed: the P(S|C) are matched to the closest values in the default InterVA P(S|C) table.
empirical: no ranking is calculated; the empirical conditional probabilities are used directly, which forces updateCondProb to be FALSE for the InSilicoVA algorithm.
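The difference between the three conversion types can be illustrated on a toy vector. The sketch below is a simplified base R illustration and not the package's implementation: the probability levels, the stand-in default table, and the empirical values are all hypothetical, and the quantile step uses a plain rank-to-quantile mapping as one possible reading of the description.

```r
# Hypothetical distinct levels of a default P(S|C) table.
default_levels <- c(0.8, 0.5, 0.2, 0.05, 0.01)

# Hypothetical stand-in for all entries of that default table.
default_table <- c(0.8, 0.5, 0.5, 0.2, 0.2, 0.2, 0.05, 0.05, 0.01, 0.01)

# Hypothetical empirical P(S|C) estimated from training data.
p_emp <- c(0.72, 0.31, 0.04, 0.55, 0.12)

# "fixed": snap each empirical value to the closest default level.
nearest <- vapply(
  p_emp,
  function(p) which.min(abs(default_levels - p)),
  integer(1)
)
p_fixed <- default_levels[nearest]

# "quantile": place each empirical value at its rank-based quantile,
# then read off the same quantile of the default table.
probs <- rank(p_emp) / length(p_emp)
p_quantile <- unname(quantile(default_table, probs = probs, type = 1))

# "empirical": use the estimated probabilities as they are.
p_empirical <- p_emp

print(rbind(fixed = p_fixed, quantile = p_quantile, empirical = p_empirical))
```

In this sketch both ``fixed'' and ``quantile'' return only values that occur in the default table, while ``empirical'' keeps the raw estimates.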
age_group: Parameter specific to ``EAVA'' model, which identifies the age group of the input VA data. Possible values are ``neonate'' or ``child''.
...: Other arguments passed to insilico, InterVA, interVA_train, tariff, and the nbc function in the nbc4va package. See the respective package documentation for details.
Tyler H. McCormick, Zehang R. Li, Clara Calvert, Amelia C. Crampin, Kathleen Kahn and Samuel J. Clark (2016) Probabilistic cause-of-death assignment using verbal autopsies. Journal of the American Statistical Association. https://arxiv.org/abs/1411.3042
James, S. L., Flaxman, A. D., Murray, C. J., & Population Health Metrics Research Consortium. (2011). Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies. Population Health Metrics, 9(1), 1-16.
Zehang R. Li, Tyler H. McCormick, Samuel J. Clark (2014) InterVA4: An R package to analyze verbal autopsy data. Center for Statistics and the Social Sciences Working Paper, No.146
http://www.interva.net/
Miasnikof P, Giannakeas V, Gomes M, Aleksandrowicz L, Shestopaloff AY, Alam D, Tollman S, Samarikhalaj, Jha P. Naive Bayes classifiers for verbal autopsies: comparison to physician-based classification for 21,000 child and adult deaths. BMC Medicine. 2015;13:286.
Henry D. Kalter, Abdoulaye-Mamadou Roubanatou, Alain Koffi, and Robert E. Black. (2015). Direct estimates of national neonatal and child cause-specific mortality proportions in Niger by expert algorithm and physician-coded analysis of verbal autopsy interviews. Journal of Global Health 5(1):010415.
insilico in package InSilicoVA, InterVA in package InterVA4, InterVA5 in package InterVA5, interVA_train, tariff in package Tariff, nbc function in package nbc4va, and codEAVA function in package EAVA.
library(openVA)

# Split the example data into a test set and a training set.
data(RandomVA3)
test <- RandomVA3[1:200, ]
train <- RandomVA3[201:400, ]

# InSilicoVA trained on the labelled data.
fit1 <- codeVA(data = test, data.type = "customize", model = "InSilicoVA",
               data.train = train, causes.train = "cause",
               Nsim = 1000, auto.length = FALSE)

# InterVA trained on the labelled data, without writing results to disk.
fit2 <- codeVA(data = test, data.type = "customize", model = "InterVA",
               data.train = train, causes.train = "cause", write = FALSE,
               version = "4.02", HIV = "h", Malaria = "l")

# Tariff trained on the labelled data.
fit3 <- codeVA(data = test, data.type = "customize", model = "Tariff",
               data.train = train, causes.train = "cause",
               nboot.sig = 100)