Apply established comorbidity algorithms to ICD-coded data. Supported methods include several variants of the Charlson comorbidity system, Elixhauser, and the Pediatric Complex Chronic Conditions (PCCC).
comorbidities(
  data,
  icd.codes,
  method,
  id.vars = NULL,
  icdv.var = NULL,
  icdv = NULL,
  dx.var = NULL,
  dx = NULL,
  poa.var = NULL,
  poa = NULL,
  age.var = NULL,
  primarydx.var = NULL,
  primarydx = NULL,
  flag.method = c("current", "cumulative"),
  full.codes = TRUE,
  compact.codes = TRUE,
  subconditions = FALSE
)
The return object will be slightly different depending on the value of
method and subconditions.
When subconditions = FALSE, a medicalcoder_comorbidities object (a
data.frame with attributes) is returned. It contains the column(s) named in
id.vars, if defined in the function call. For all methods there will be the
following columns:
num_cmrb: a count of comorbidities/conditions flagged
cmrb_flag: a 0/1 integer indicator for at least one
comorbidity/condition.
Additional columns:
PCCC methods:
For method = "pccc_v2.0" and method = "pccc_v2.1", there is one
indicator column per condition.
For method = "pccc_v3.0" and method = "pccc_v3.1",
there are four columns per condition:
<condition>_dxpr_or_tech: the condition was flagged due to the
presence of either a diagnostic or procedure code, or was flagged due to
the presence of a technology dependence code along with at least one
comorbidity being flagged by a diagnostic or procedure code.
<condition>_dxpr_only: the condition was flagged due to the
presence of a non-technology dependent diagnostic or procedure code
only.
<condition>_tech_only: the condition was flagged due to the
presence of a technology dependent code only and at least one other
comorbidity was flagged by a non-technology dependent code.
<condition>_dxpr_and_tech: the patient had both diagnostic or
procedure codes and a technology dependence code for the condition.
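As a rough illustration of how the four columns relate to one another, the following base R sketch derives them from two hypothetical logical vectors for a single patient. The condition names (cvd, gi, renal, neuro), the helper pccc_v3_columns(), and the input vectors are assumptions made for illustration only. This is one reading of the column descriptions above, not the package's implementation.

```r
# Hypothetical inputs for ONE patient (illustrative only, not package output).
# dxpr: the condition has a non-technology diagnostic or procedure code
# tech: the condition has a technology dependence code
dxpr <- c(cvd = TRUE,  gi = FALSE, renal = TRUE, neuro = FALSE)
tech <- c(cvd = FALSE, gi = TRUE,  renal = TRUE, neuro = FALSE)

# Hypothetical helper: derive the four columns as described in the text.
pccc_v3_columns <- function(dxpr, tech) {
  stopifnot(is.logical(dxpr), is.logical(tech), length(dxpr) == length(tech))

  # For each condition: is at least one OTHER condition flagged by a
  # non-technology diagnostic or procedure code?
  other_dxpr <- (sum(dxpr) - dxpr) > 0

  dxpr_only     <- dxpr & !tech
  tech_only     <- tech & !dxpr & other_dxpr
  dxpr_and_tech <- dxpr & tech
  dxpr_or_tech  <- dxpr_only | tech_only | dxpr_and_tech

  data.frame(
    condition     = names(dxpr),
    dxpr_or_tech  = as.integer(dxpr_or_tech),
    dxpr_only     = as.integer(dxpr_only),
    tech_only     = as.integer(tech_only),
    dxpr_and_tech = as.integer(dxpr_and_tech),
    stringsAsFactors = FALSE
  )
}

flags <- pccc_v3_columns(dxpr, tech)
print(flags)
```

Under this reading, a technology dependence code on its own does not flag a condition: in the sketch, a patient whose only code is a technology code would have zeros in every column.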
For Charlson variants, indicator columns are returned for the relevant
conditions, cci (Charlson Comorbidity Index), and age_score.
For Elixhauser variants, indicator columns are returned for all relevant comorbidities, mortality, and readmission indices.
When subconditions = TRUE and the method is a PCCC variant,
a list of length two is returned: the first element contains condition
indicators; the second element is a named list of data.frames with
indicators for subconditions within each condition.
data: A data.frame in a "long" format. The input data.frame is
expected to have one column of ICD codes (one code per row) with additional
(optional) columns for patient/encounter ids, ICD version,
diagnostic/procedure status, present-on-admission flags, primary
diagnostic flags, or age.
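Source data often arrives with one row per encounter and several code columns. A minimal base R sketch of reshaping such a table into the one-code-per-row layout described above follows. The wide table and its column names (dx1, dx2) are hypothetical; the output column names patid, code, and dx simply mirror those used in the examples on this page.

```r
# Hypothetical wide input: one row per patient, one column per diagnosis slot.
wide <- data.frame(
  patid = c("P1", "P2"),
  dx1   = c("C84.2", "B95.0"),
  dx2   = c("B950", NA),
  stringsAsFactors = FALSE
)

code_cols <- c("dx1", "dx2")

# Stack the code columns: unlist() concatenates them column by column, so the
# patient ids are repeated once per code column to stay aligned.
long <- data.frame(
  patid = rep(wide$patid, times = length(code_cols)),
  code  = unlist(wide[code_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop empty slots and mark every remaining row as a diagnostic code.
long <- long[!is.na(long$code), ]
long$dx <- 1L

# Sort by patient for readability and reset the row names.
long <- long[order(long$patid), ]
rownames(long) <- NULL
print(long)
```

A table shaped like long could then be passed as data with icd.codes = "code", id.vars = "patid", and dx.var = "dx".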
icd.codes: Character scalar naming the column in data that contains
ICD codes (character strings). Codes may be provided in full form (with
decimal points, e.g., C84.2), compact form (dots omitted, e.g., C842), or
any mix of the two. Matching against lookup tables is governed by
icdv.var/icdv, dx.var/dx, and the full.codes / compact.codes
flags.
method: Character string indicating the comorbidity algorithm to
apply to data.
id.vars: Optional character vector of column names. When
missing, the entire input data is treated as a single encounter from a
single patient. If you want to set flag.method = "cumulative" then
length(id.vars) >= 2 is expected. The last element should be the
encounter order (must be sortable).
icdv.var: Character scalar naming the column in data that indicates
the ICD version (9 or 10). If present it must be integer values 9 or
10. icdv.var takes precedence over icdv if both are provided.
icdv: An integer value of 9L or 10L indicating that all
data[[icd.codes]] are ICD version 9 or 10, respectively. Ignored
(with a warning) if icdv.var is provided.
dx.var: Character scalar naming the column in data that indicates
diagnostic (1) vs procedural (0) codes. If present it must be integer
values 0 or 1. dx.var takes precedence over dx if both are
provided.
dx: An integer indicating that all data[[icd.codes]] are
diagnostic (1) or procedure (0) codes. Ignored (with a
warning) if dx.var is provided.
poa.var: Character scalar naming the column with present-on-admission
flags: integer 1L (present), 0L (not present), or NA.
PCCC and Charlson will only flag conditions when the code is
present-on-admission. Elixhauser has a mix of conditions; some require
present-on-admission while others do not. poa.var takes precedence over
poa if both are provided.
poa: Integer scalar 0 or 1. Use when all icd.codes share the same
present-on-admission status. Ignored with a warning if poa and poa.var
are both provided.
age.var: Character scalar naming the column in data that contains
patient age in years. Only applicable to Charlson comorbidities.
primarydx.var: Character scalar naming the column in data that
indicates whether data[[icd.codes]] are primary diagnostic codes (1L)
or not (0L). Primary diagnosis is used only for Elixhauser and Charlson
comorbidities and is ignored when the method is a PCCC variant.
primarydx.var takes precedence over primarydx if both are provided.
primarydx: An integer value of 0 or 1. If 0,
treat all codes as non-primary diagnoses; if 1, treat all codes as
primary diagnoses. Ignored, with a warning, if primarydx.var is provided.
flag.method: When flag.method = 'current' (default) only codes
associated with the current id.vars are considered when flagging
comorbidities. When flag.method = 'cumulative' then all prior encounters
are considered when flagging comorbidities. See Details.
full.codes, compact.codes: Logical; when TRUE, compare
data[[icd.codes]] against the full and/or compact ICD codes, respectively,
in the method’s lookup tables. Full ICD codes include a decimal point (when
applicable) and compact codes omit the decimal point. For example:
B95.0 is the full ICD-10-CM diagnostic code for “Streptococcus,
group A, as the cause of disease classified elsewhere,” whereas B950
is the associated compact code.
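The relationship between the two forms can be shown in a few lines of base R. This is only a sketch of the string relationship between full and compact codes, using the example codes from this page; it does not reproduce how the package matches codes against its lookup tables.

```r
# Example codes taken from the text above, in a mix of full and compact form.
codes <- c("C84.2", "C842", "B95.0", "B950")

# A code written in full form contains a literal decimal point.
has_dot <- grepl(".", codes, fixed = TRUE)

# The compact form is the full form with the decimal point removed.
# fixed = TRUE treats "." as a literal character, not a regex wildcard.
compact <- gsub(".", "", codes, fixed = TRUE)

print(data.frame(code = codes, has_dot = has_dot, compact = compact))
```

Note that the absence of a dot does not prove a code is compact, because some full codes have no decimal point to begin with.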
subconditions: Logical scalar; when TRUE, report both conditions and
subconditions (PCCC only).
When flag.method = "current", only codes from the index encounter
contribute to flags. When a longitudinal method is selected (e.g.,
"cumulative"), prior encounters for the same id.vars
combination may contribute to condition flags. For the cumulative method to
work, id.vars needs to be a character vector of length 2 or more. The last
element is treated as the encounter identifier and must be sortable. For
example, say you have data with a hospital, patient, and encounter id. The
id.vars could be one of two entries: c("hospital", "patient", "encounter")
or c("patient", "hospital", "encounter"). In both cases the return will be
the same because the encounter identifier is unchanged regardless of whether
hospital or patient is listed first.
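The difference between the two flagging modes can be pictured with a small base R sketch. The data frame, its values, and the column names flag_current and flag_cumulative are hypothetical. The sketch assumes the simplest reading of "cumulative", namely that a condition flagged at one encounter stays flagged at every later encounter of the same patient; it is an illustration of the idea, not the package's implementation.

```r
# Hypothetical per-encounter flags for one condition (illustrative only).
enc <- data.frame(
  patid        = c("P1", "P1", "P1", "P2", "P2"),
  enc_seq      = c(1L, 2L, 3L, 1L, 2L),
  flag_current = c(0L, 1L, 0L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Sort by patient, then by the sortable encounter identifier.
enc <- enc[order(enc$patid, enc$enc_seq), ]

# Within each patient, carry a flag forward once it has been seen:
# the running maximum of a 0/1 indicator does exactly that.
enc$flag_cumulative <- ave(enc$flag_current, enc$patid, FUN = cummax)

print(enc)
```

The sort step is the reason the last element of id.vars must be sortable: the running maximum is only meaningful when encounters are in their true order.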
It is critically important that data[[tail(id.vars, 1)]] sorts into the true
encounter order. Having the rows of data in temporal order is not enough: the
results will be wrong if sorting by tail(id.vars, 1) produces a different
order. For example, say you had the following:
| patid | enc_id | date |
| P1 | 10823090 | Aug 2023 |
| P1 | 10725138 | Jul 2025 |
id.vars = c("patid", "enc_id") will give the wrong result as enc_id
10725138 would be sorted to come before enc_id 10823090. id.vars = c("patid", "date") would be sufficient input, assuming that date has been
correctly stored. Adding a column enc_seq, e.g.,
| patid | enc_id | date | enc_seq |
| P1 | 10823090 | Aug 2023 | 1 |
| P1 | 10725138 | Jul 2025 | 2 |
and calling comorbidities() with id.vars = c("patid", "enc_seq") will
have better performance than using the date and will clear up any possible
issues with non-sequential encounter ids from the source data.
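One way to build such an enc_seq column in base R is sketched below, using the two encounters from the table above. The exact dates are an assumption: the table only gives month and year, so the first day of each month is used here.

```r
# The two encounters from the table above. Day of month is assumed (the table
# only gives month and year).
enc <- data.frame(
  patid  = c("P1", "P1"),
  enc_id = c(10823090, 10725138),
  date   = as.Date(c("2023-08-01", "2025-07-01")),
  stringsAsFactors = FALSE
)

# Sorting by enc_id would put the Jul 2025 encounter first, which is wrong.
wrong_order <- enc$enc_id[order(enc$patid, enc$enc_id)]

# Sort by patient and date, then number the encounters within each patient.
enc <- enc[order(enc$patid, enc$date), ]
enc$enc_seq <- ave(seq_along(enc$patid), enc$patid, FUN = seq_along)

print(wrong_order)
print(enc)
```

The resulting table can then be used with id.vars = c("patid", "enc_seq") as described above.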
Pediatric Complex Chronic Conditions:
Feudtner, C., Feinstein, J.A., Zhong, W. et al. Pediatric complex chronic conditions classification system version 2: updated for ICD-10 and complex medical technology dependence and transplantation. BMC Pediatr 14, 199 (2014). https://doi.org/10.1186/1471-2431-14-199
Feinstein JA, Hall M, Davidson A, Feudtner C. Pediatric Complex Chronic Condition System Version 3. JAMA Netw Open. 2024;7(7):e2420579. https://doi.org/10.1001/jamanetworkopen.2024.20579
Charlson Comorbidities:
Mary E. Charlson, Peter Pompei, Kathy L. Ales, C. Ronald MacKenzie, A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation, Journal of Chronic Diseases, Volume 40, Issue 5, 1987, Pages 373-383, ISSN 0021-9681, https://doi.org/10.1016/0021-9681(87)90171-8.
Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992 Jun;45(6):613-9. https://doi.org/10.1016/0895-4356(92)90133-8. PMID: 1607900.
Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, Saunders LD, Beck CA, Feasby TE, Ghali WA. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005 Nov;43(11):1130-9. https://doi.org/10.1097/01.mlr.0000182534.19832.83. PMID: 16224307.
Quan H, Li B, Couris CM, Fushimi K, Graham P, Hider P, Januel JM, Sundararajan V. Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. Am J Epidemiol. 2011 Mar 15;173(6):676-82. https://doi.org/10.1093/aje/kwq433. Epub 2011 Feb 17. PMID: 21330339.
Glasheen WP, Cordier T, Gumpina R, Haugh G, Davis J, Renda A. Charlson Comorbidity Index: ICD-9 Update and ICD-10 Translation. Am Health Drug Benefits. 2019 Jun-Jul;12(4):188-197. PMID: 31428236; PMCID: PMC6684052.
Elixhauser Comorbidities:
Agency for Healthcare Research and Quality (AHRQ). Elixhauser Comorbidity Software Refined for ICD-10-CM Diagnoses, v2025.1 [Internet]. 2025. Available from: https://www.hcup-us.ahrq.gov/toolssoftware/comorbidityicd10/comorbidity_icd10.jsp
vignette(topic = "comorbidities", package = "medicalcoder")
vignette(topic = "pccc", package = "medicalcoder")
vignette(topic = "charlson", package = "medicalcoder")
vignette(topic = "elixhauser", package = "medicalcoder")
pccc_v3.1_results <-
  comorbidities(data = mdcr,
                icd.codes = "code",
                id.vars = "patid",
                dx.var = "dx",
                method = "pccc_v3.1",
                flag.method = 'current',
                poa = 1)
summary(pccc_v3.1_results)

pccc_v3.1_subcondition_results <-
  comorbidities(data = mdcr,
                icd.codes = "code",
                id.vars = "patid",
                dx.var = "dx",
                method = "pccc_v3.1",
                flag.method = 'current',
                poa = 1,
                subconditions = TRUE)
summary(pccc_v3.1_subcondition_results)

charlson_results <-
  comorbidities(data = mdcr,
                icd.codes = "code",
                id.vars = "patid",
                dx.var = "dx",
                method = "charlson_quan2011",
                flag.method = 'current',
                poa = 1)
summary(charlson_results)

elixhauser_results <-
  comorbidities(data = mdcr,
                icd.codes = "code",
                id.vars = "patid",
                dx.var = "dx",
                method = "elixhauser_ahrq2025",
                primarydx = 1,
                flag.method = 'current',
                poa = 1)
summary(elixhauser_results)