rprev (version 0.2.4)

prevalence_simulated: Estimate prevalence using Monte Carlo simulation.

Description

Estimates the number of prevalent cases at a specific index date using Monte Carlo simulation. Simulated cases are assigned an age and sex so that they can be matched against population survival data when a cure model is used, and so that the posterior distributions of age and sex can be calculated.

Usage

prevalence_simulated(survobj, age, sex, entry, num_years_to_estimate,
  index_date, num_reg_years, cure = 10, start = NULL, N_boot = 1000,
  population_data = NULL, n_cores = 1)

Arguments

survobj

A Surv object from the survival package. Currently only right censoring is supported.

age

A vector of ages from the registry.

sex

A vector of sexes, encoded as 0 for males and 1 for females.
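If the registry stores sex as labels rather than as 0 and 1, it can be recoded before the call. A minimal sketch in base R, assuming a hypothetical character vector sex_label with the values "M" and "F" (the name and the labels are illustrative and not part of rprev):

```r
# Hypothetical registry column; the "M"/"F" labels are assumed for illustration
sex_label <- c("M", "F", "F", "M")

# Encode males as 0 and females as 1, matching the encoding described above
sex <- ifelse(sex_label == "F", 1L, 0L)
```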

entry

A vector of entry dates into the registry, in the format YYYY-MM-DD.

num_years_to_estimate

Number of years of data to consider when estimating point prevalence; multiple values can be specified in a vector. If any values are greater than the number of years of registry data available before index_date, incident cases for the difference will be simulated.
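The arithmetic behind "the difference" can be sketched in base R. This only illustrates how many years of incidence would need to be simulated for each requested value; it assumes 8 years of registry data before index_date and does not reproduce the package's internal code:

```r
# Assumed for illustration: 8 years of registry data precede index_date
num_reg_years <- 8
num_years_to_estimate <- c(3, 5, 10)

# Years of incidence that are not covered by the registry and so must be simulated
years_simulated <- pmax(num_years_to_estimate - num_reg_years, 0)
years_simulated
```

Under these assumptions only the 10-year estimate exceeds the registry window, so only that estimate involves simulated incident cases.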

index_date

The date at which to estimate point prevalence. Defaults to the latest registry entry date.

num_reg_years

The number of years of the registry for which incidence is to be calculated. Defaults to using all available complete years. Note that if more registry years are supplied than the number of years to estimate prevalence for, the survival data from the surplus registry years are still involved in the survival model fitting.

cure

Integer giving the cure time, in years, assumed by the cure model. A patient who has survived beyond the cure time is assigned a survival probability derived from the mortality rate of the general population.

start

Deprecated: use index_date instead, and specify the number of years of registry data to use with num_reg_years. Formerly the date from which incident cases were included, in the format YYYY-MM-DD, defaulting to the earliest entry date. This value is now inferred by counting back num_reg_years years of registry data from index_date.
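As a rough illustration of counting back from the index date, the following base R sketch steps back a whole number of calendar years. The variable names are hypothetical, and the package may count the registry window differently (for example in days), so treat this as an approximation of the idea rather than the package's rule:

```r
index_date <- as.Date("2013-09-01")
num_reg_years <- 8

# Step back num_reg_years calendar years from the index date;
# seq() on Date objects accepts steps such as "-8 years"
start_date <- seq(index_date,
                  by = paste0("-", num_reg_years, " years"),
                  length.out = 2)[2]
start_date
```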

N_boot

Number of bootstrapped calculations to perform.

population_data

A data frame that must contain the columns age, rate, and sex, where each row gives the mortality rate for a person of that age and sex. Ideally, age covers the range 0 to 100. Defaults to the supplied data; see UKmortality for the format required for custom datasets.
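A custom dataset with the three required columns might be assembled as below. This is a sketch only: the rates are made up, and the 0/1 coding of sex simply mirrors the sex argument, which is an assumption; compare against UKmortality before relying on it.

```r
ages <- 0:100

# Made-up rates for illustration only; real values should come from life tables
male_rate   <- seq(0.001, 0.4, length.out = length(ages))
female_rate <- seq(0.001, 0.3, length.out = length(ages))

# One row per age and sex, with the column names the argument requires
population_data <- data.frame(
  age  = rep(ages, times = 2),
  rate = c(male_rate, female_rate),
  sex  = rep(c(0, 1), each = length(ages))  # assumed coding: 0 = male, 1 = female
)

# Check that the required columns are present
stopifnot(all(c("age", "rate", "sex") %in% names(population_data)))
```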

n_cores

Number of CPU cores to use when fitting the bootstrapped survival models. Defaults to 1; multi-core functionality is provided by the doParallel package.
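One way to pick a value for n_cores is to query the machine with the base parallel package and leave one core free. This is a convention assumed here for illustration, not a recommendation from the package:

```r
# detectCores() can return NA on some platforms, so fall back to a single core
available <- parallel::detectCores()
n_cores <- if (is.na(available)) 1L else max(1L, available - 1L)
n_cores
```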

Value

A list with the following attributes:

mean_yearly_contributions

A vector of length num_years_to_estimate giving the number of prevalent cases by year of diagnosis, averaged across the bootstrap iterations.

posterior_age

Posterior distributions of age, sampled at every bootstrap iteration.

yearly_contributions

Total simulated prevalent cases from every bootstrapped sample.

pop_mortality

Population survival rates in the format of a list, stratified by sex.

nbootstraps

Number of bootstrapped samples used in the prevalence estimation.

coefs

The bootstrapped Weibull coefficients used by the survival models.

full_coefs

The Weibull coefficients from a model fitted to the full dataset.

See Also

Other prevalence functions: prevalence_counted, prevalence, test_prevalence_fit

Examples

library(rprev)
library(survival)  # provides Surv()

data(prevsim)

# NOT RUN {
# 10-year prevalence from 8 registry years, with a 5-year cure time.
prevalence_simulated(Surv(prevsim$time, prevsim$status), prevsim$age,
                     prevsim$sex, prevsim$entrydate,
                     num_years_to_estimate = 10,
                     index_date = "2013-09-01",
                     num_reg_years = 8, cure = 5)

# 5-year prevalence from 5 registry years, using the default cure time.
prevalence_simulated(Surv(prevsim$time, prevsim$status), prevsim$age,
                     prevsim$sex, prevsim$entrydate,
                     num_years_to_estimate = 5,
                     index_date = "2009-01-01",
                     num_reg_years = 5)

# The bootstrapped models can be fitted in parallel, here on 4 cores.
prevalence_simulated(Surv(prevsim$time, prevsim$status), prevsim$age,
                     prevsim$sex, prevsim$entrydate,
                     num_years_to_estimate = 10,
                     index_date = "2013-01-01",
                     num_reg_years = 8, n_cores = 4)
# }