popEpi: Epidemiology with population data

The purpose of popEpi is to facilitate computing certain epidemiological statistics where population data is used. Current main attractions:

Splitting, merging population hazards, and aggregating

The lexpand function splits subject-level follow-up data into sub-intervals along age, follow-up time, and calendar time, merges the corresponding population hazards onto those intervals, and aggregates the resulting data if needed.

data(sire)
sr <- sire[1,]
print(sr)
#>    sex    bi_date    dg_date    ex_date status   dg_age
#> 1:   1 1952-05-27 1994-02-03 2012-12-31      0 41.68877
x <- lexpand(sr, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2, 
             fot = 0:5, per = 1994:2000)
print(x)
#>     lex.id      fot     per      age    lex.dur lex.Cst lex.Xst sex
#>  1:      1 0.000000 1994.09 41.68877 0.90958904       0       0   1
#>  2:      1 0.909589 1995.00 42.59836 0.09041096       0       0   1
#>  3:      1 1.000000 1995.09 42.68877 0.90958904       0       0   1
#>  4:      1 1.909589 1996.00 43.59836 0.09041096       0       0   1
#>  5:      1 2.000000 1996.09 43.68877 0.90958904       0       0   1
#>  6:      1 2.909589 1997.00 44.59836 0.09041096       0       0   1
#>  7:      1 3.000000 1997.09 44.68877 0.90958904       0       0   1
#>  8:      1 3.909589 1998.00 45.59836 0.09041096       0       0   1
#>  9:      1 4.000000 1998.09 45.68877 0.90958904       0       0   1
#> 10:      1 4.909589 1999.00 46.59836 0.09041096       0       0   1
#>        bi_date    dg_date    ex_date status   dg_age
#>  1: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  2: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  3: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  4: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  5: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  6: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  7: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  8: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#>  9: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
#> 10: 1952-05-27 1994-02-03 2012-12-31      0 41.68877
data(popmort)
x <- lexpand(sr, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2, 
             fot = 0:5, per = 1994:2000, pophaz = popmort)
print(x)
#>     lex.id      fot     per      age    lex.dur lex.Cst lex.Xst sex
#>  1:      1 0.000000 1994.09 41.68877 0.90958904       0       0   1
#>  2:      1 0.909589 1995.00 42.59836 0.09041096       0       0   1
#>  3:      1 1.000000 1995.09 42.68877 0.90958904       0       0   1
#>  4:      1 1.909589 1996.00 43.59836 0.09041096       0       0   1
#>  5:      1 2.000000 1996.09 43.68877 0.90958904       0       0   1
#>  6:      1 2.909589 1997.00 44.59836 0.09041096       0       0   1
#>  7:      1 3.000000 1997.09 44.68877 0.90958904       0       0   1
#>  8:      1 3.909589 1998.00 45.59836 0.09041096       0       0   1
#>  9:      1 4.000000 1998.09 45.68877 0.90958904       0       0   1
#> 10:      1 4.909589 1999.00 46.59836 0.09041096       0       0   1
#>        bi_date    dg_date    ex_date status   dg_age     pop.haz       pp
#>  1: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001170685 1.000651
#>  2: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001441038 1.000651
#>  3: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001200721 1.001856
#>  4: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001300846 1.001856
#>  5: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001400981 1.003207
#>  6: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.002142293 1.003207
#>  7: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.002202424 1.005067
#>  8: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.001771568 1.005067
#>  9: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.002222468 1.007277
#> 10: 1952-05-27 1994-02-03 2012-12-31      0 41.68877 0.002282603 1.007277
a <- lexpand(sr, birth = bi_date, entry = dg_date, exit = ex_date,
             status = status %in% 1:2,
             fot = 0:5, per = 1994:2000, aggre = list(fot, per))
print(a)
#>     fot  per       pyrs at.risk from0to0
#>  1:   0 1994 0.90958904       0        0
#>  2:   0 1995 0.09041096       1        0
#>  3:   1 1995 0.90958904       0        0
#>  4:   1 1996 0.09041096       1        0
#>  5:   2 1996 0.90958904       0        0
#>  6:   2 1997 0.09041096       1        0
#>  7:   3 1997 0.90958904       0        0
#>  8:   3 1998 0.09041096       1        0
#>  9:   4 1998 0.90958904       0        0
#> 10:   4 1999 0.09041096       1        1
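
The split in the first example can be reproduced by hand. The following base R sketch is an illustration under stated assumptions, not popEpi's implementation: frac_year is a hypothetical helper that converts a date to a fractional year as year + (day of year - 1) / days in year, a convention that appears consistent with the per and lex.dur values printed above for this subject.

```r
# Hypothetical helper (not part of popEpi): fractional calendar year of a Date,
# computed as year + (day-of-year - 1) / days-in-year.
frac_year <- function(d) {
  y <- as.integer(format(d, "%Y"))
  doy <- as.integer(format(d, "%j"))
  leap <- (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
  y + (doy - 1) / ifelse(leap, 366, 365)
}

entry <- frac_year(as.Date("1994-02-03"))  # dg_date of the example subject
exit  <- frac_year(as.Date("2012-12-31"))  # ex_date of the example subject

fot_breaks <- 0:5        # follow-up time breaks, in years since entry
per_breaks <- 1994:2000  # calendar period breaks

# Follow-up is kept only inside the window covered by both sets of breaks.
start <- max(entry, entry + min(fot_breaks), min(per_breaks))
stop  <- min(exit, entry + max(fot_breaks), max(per_breaks))

# Express the fot breaks on the calendar scale and pool them with the per breaks.
cuts <- sort(unique(c(start, entry + fot_breaks, per_breaks, stop)))
cuts <- cuts[cuts >= start & cuts <= stop]

# Each pair of consecutive cut points is one sub-interval.
pieces <- data.frame(
  fot     = head(cuts, -1) - entry,
  per     = head(cuts, -1),
  lex.dur = diff(cuts)
)
print(pieces)
```

Under these assumptions the sketch yields ten pieces whose durations alternate between 332/365 and 33/365 years, the same pattern as the lex.dur column above. It ignores the age scale and the status columns, which lexpand also handles.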

SIRs / SMRs

One can make use of the sir function to estimate indirectly standardised incidence or mortality ratios (SIRs/SMRs). The data can be aggregated by lexpand or by other means. While sir is simple and flexible in itself, one may also use sirspline to fit spline functions for the effect of e.g. age as a continuous variable on SIRs.

data(popmort)
data(sire)
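# Aggregate the split follow-up by follow-up time, age group, year, and sex.
# The result is named ag rather than c to avoid masking base::c.
ag <- lexpand(sire, status = status %in% 1:2, birth = bi_date, exit = ex_date, entry = dg_date,
              breaks = list(per = 1950:2013, age = 1:100, fot = c(0,10,20,Inf)), 
              aggre = list(fot, agegroup = age, year = per, sex))
#> dropped 16 rows where entry == exit

# Compare observed deaths with those expected from popmort, reported by follow-up time.
se <- sir(coh.data = ag, coh.obs = 'from0to1', coh.pyrs = 'pyrs', 
          ref.data = popmort, ref.rate = 'haz', 
          adjust = c('agegroup', 'year', 'sex'), print = 'fot')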
se
#> SIR (adjusted by agegroup, year, sex) with 95% confidence intervals (profile) 
#> Test for homogeneity: p < 0.001 
#> 
#>  Total sir: 3.08 (2.99-3.17)
#>  Total observed: 4559
#>  Total expected: 1482.13
#>  Total person-years: 39906 
#> 
#> 
#>    fot observed expected     pyrs  sir sir.lo sir.hi p_value
#> 1:   0     4264  1214.54 34445.96 3.51   3.41   3.62   0.000
#> 2:  10      295   267.59  5459.96 1.10   0.98   1.23   0.094
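
The ratios in this table are observed counts divided by expected counts. The base R sketch below recomputes them from the printed observed and expected values. The confidence limits use the exact Poisson (chi-squared) formula, which is a substitute chosen for illustration and not the profile-likelihood method reported in the output, so the limits may differ slightly from those printed.

```r
# Observed and expected counts copied from the printed sir output.
tab <- data.frame(
  fot      = c(0, 10),
  observed = c(4264, 295),
  expected = c(1214.54, 267.59)
)

# SIR = observed / expected.
tab$sir <- tab$observed / tab$expected

# Exact Poisson limits for the observed count, scaled by the expected count.
# Assumption: shown as an alternative to the profile intervals used by sir.
alpha <- 0.05
tab$lo <- qchisq(alpha / 2, 2 * tab$observed) / 2 / tab$expected
tab$hi <- qchisq(1 - alpha / 2, 2 * (tab$observed + 1)) / 2 / tab$expected

# The total SIR pools observed and expected counts before dividing.
total_sir <- sum(tab$observed) / sum(tab$expected)

print(round(tab, 2))
print(round(total_sir, 2))
```

The point estimates round to 3.51 and 1.10 per stratum and 3.08 overall, in agreement with the table.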

(Relative) survival

The survtab function computes observed, net/relative and cause-specific survivals as well as cumulative incidence functions for Lexis data. Any of the supported survival time functions can be easily adjusted by any number of categorical variables if needed.

One can also use survtab_ag for aggregated data, so the data need not be at the subject level to compute survival time function estimates.

library(Epi)
#> 
#> Attaching package: 'Epi'
#> The following object is masked from 'package:base':
#> 
#>     merge.data.frame

data(sibr)
sire$cancer <- "rectal"
sibr$cancer <- "breast"
sr <- rbind(sire, sibr)

sr$cancer <- factor(sr$cancer)
sr <- sr[sr$dg_date < sr$ex_date, ]

sr$status <- factor(sr$status, levels = 0:2, 
                    labels = c("alive", "canD", "othD"))

x <- Lexis(entry = list(FUT = 0, AGE = dg_age, CAL = get.yrs(dg_date)), 
           exit = list(CAL = get.yrs(ex_date)), 
           data = sr,
           exit.status = status)
#> NOTE: entry.status has been set to "alive" for all.

st <- survtab(FUT ~ cancer, data = x,
              breaks = list(FUT = seq(0, 5, 1/12)),
              surv.type = "cif.obs")
st
#> 
#> Call: 
#>  survtab(formula = FUT ~ cancer, data = x, breaks = list(FUT = seq(0, 5, 1/12)), surv.type = "cif.obs") 
#> 
#> Type arguments: 
#>  surv.type: cif.obs --- surv.method: hazard
#>  
#> Confidence interval arguments: 
#>  level: 95 % --- transformation: log-log
#>  
#> Totals:
#>  person-time:62120 --- events: 5375
#>  
#> Stratified by: 'cancer'
#>    cancer Tstop surv.obs.lo surv.obs surv.obs.hi SE.surv.obs CIF_canD
#> 1: breast   2.5      0.8804   0.8870      0.8933    0.003290   0.0687
#> 2: breast   5.0      0.7899   0.7986      0.8070    0.004368   0.1162
#> 3: rectal   2.5      0.6250   0.6359      0.6465    0.005480   0.2981
#> 4: rectal   5.0      0.5032   0.5148      0.5263    0.005901   0.3727
#>    CIF_othD
#> 1:   0.0442
#> 2:   0.0852
#> 3:   0.0660
#> 4:   0.1125
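
With competing causes of death, the overall survival and the cause-specific cumulative incidence functions (CIFs) should sum to one. The base R sketch below first checks this on the printed values, then shows one common piecewise-constant hazard formulation of the CIF. The counts in the second part are hypothetical, and the formulation is an assumption for illustration, not necessarily what survtab does internally.

```r
# Values copied from the printed survtab output (Tstop = 2.5 and 5.0).
out <- data.frame(
  cancer   = c("breast", "breast", "rectal", "rectal"),
  surv.obs = c(0.8870, 0.7986, 0.6359, 0.5148),
  CIF_canD = c(0.0687, 0.1162, 0.2981, 0.3727),
  CIF_othD = c(0.0442, 0.0852, 0.0660, 0.1125)
)
# Survival plus both CIFs should equal one, up to rounding of the printout.
out$total <- out$surv.obs + out$CIF_canD + out$CIF_othD
print(out)

# Toy example with HYPOTHETICAL counts in four monthly intervals.
width <- rep(1 / 12, 4)      # interval widths in years
pyrs  <- c(100, 95, 90, 86)  # person-years per interval
d_can <- c(3, 2, 2, 1)       # deaths from cancer
d_oth <- c(1, 1, 2, 1)       # deaths from other causes

# Cause-specific and all-cause hazards, constant within each interval.
h_can <- d_can / pyrs
h_oth <- d_oth / pyrs
h_all <- h_can + h_oth

# Overall survival at the end of each interval.
surv      <- exp(-cumsum(h_all * width))
surv_prev <- c(1, head(surv, -1))

# Each interval's drop in survival is shared between causes
# in proportion to their hazards.
cif_can <- cumsum(h_can / h_all * (surv_prev - surv))
cif_oth <- cumsum(h_oth / h_all * (surv_prev - surv))

print(cbind(surv, cif_can, cif_oth, total = surv + cif_can + cif_oth))
```

In the toy part the total column equals one by construction, because each interval's loss in survival is allocated entirely to one of the two causes.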

Install

install.packages('popEpi')

Version

0.4.3

License

GPL-3

Maintainer

Joonas Miettinen

Last Published

September 5th, 2017

Functions in popEpi (0.4.3)

RPL: Relative Poisson family object
adjust: Adjust Estimates by Categorical Variables
ICSS: Age standardisation weights from the ICSS scheme.
Lexis_fpa: Create a Lexis Object with Follow-up Time, Period, and Age Time Scales
as.Date.yrs: Coerce Fractional Year Values to Date Values
as.aggre: Coercion to Class aggre
as.data.frame.ratetable: Coerce a ratetable Object to Class data.frame
as.data.table.ratetable: Coerce a ratetable Object to Class data.table
aggre: Aggregation of split Lexis data
all_names_present: Check if all names are present in given data
is.Date: Test if object is a Date object
direct_standardization: Direct Adjusting in popEpi Using Weights
cast_simple: Cast data.table/data.frame from long format to wide format
is_leap_year: Detect leap years
meanpop_fi: Mean population counts in Finland by year, sex, and age group.
na2zero: Convert NA's to zero in data.table
plot.survtab: plot method for survtab objects
cut_bound: Change output values from cut(..., labels = NULL) output
longDF2ratetable: Experimental: Coerce a long-format data.frame to a ratetable object
lower_bound: Return lower_bound value from char string (20,30]
print.aggre: Print an aggre Object
print.rate: Print a rate object
relpois_ag: Excess hazard Poisson model
robust_values: Convert values to numeric robustly
sir_exp: Calculate SMR
sir_ratio: Confidence intervals for the ratio of two SIRs/SMRs
lexpand: Split case-level observations
lines.sirspline: lines method for sirspline-object
ltable: Tabulate Counts and Other Functions by Multiple Variables into a Long-Format Table
makeWeightsDT: Make a data.table of Tabulated, Aggregated Values and Weights
fac2num: Convert factor variable to numeric
lines.survmean: Graphically Inspect Curves Used in Mean Survival Computation
lines.survtab: lines method for survtab objects
poisson.ci: Get rate and exact Poisson confidence intervals
print.survtab: Print a survtab Object
rate: Direct-Standardised Incidence/Mortality Rates
stdpop101: World standard population by 1 year age groups from 1 to 101. Sums to 100 000.
stdpop18: Standard populations from 2000: world, Europe and Nordic.
summary.aggre: Summarize an aggre Object
summary.survtab: Summarize a survtab Object
survtab_ag: Estimate Survival Time Functions
try2int: Attempt coercion to integer
popEpi: Functions for large-scale epidemiological analysis
pophaz: Expected / Population Hazard Data Sets Usage in popEpi
rpcurve: Marginal piecewise parametric relative survival curve
setaggre: Set aggre attributes to an object by modifying in place
sire: a simulated cohort of Finnish female rectal cancer patients
sirspline: Estimate splines for SIR or SMR
flexible_argument: Flexible Variable Usage in popEpi Functions
get.yrs: Convert date objects to fractional years
plot.rate: plot method for rate object
plot.sir: Plot method for sir-object
popmort: Population mortality rates in Finland 1951 - 2013 in 101 age groups and by gender
prepExpo: Prepare Exposure Data for Aggregation
rate_ratio: Confidence intervals for the rate ratios
relpois: Excess hazard Poisson model
sibr: a simulated cohort of Finnish female breast cancer patients
sir: Calculate SIR or SMR
plot.sirspline: plot method for sirspline-object
plot.survmean: Graphically Inspect Curves Used in Mean Survival Computation
setclass: Set the class of an object (convenience function for setattr(obj, "class", CLASS)); can add instead of replace
setcolsnull: Delete data.table columns if present
splitLexisDT: Split case-level observations
splitMulti: Split case-level observations
survmean: Compute Mean Survival Times Using Extrapolation
survtab: Estimate Survival Time Functions