Converting long-form ICD-9 data to wide form is more complicated than reshape or reshape2::dcast allows. This function is a reasonably simple solution using built-in functions.
icd9LongToWide(icd9df, visitId = NULL, icd9Field = NULL, prefix = "icd_",
min.width = 0, aggregate = TRUE, return.df = FALSE)
icd9df: data.frame of long-form data, with one column for visitId and one for the ICD code
visitId: The name of the column in the data frame which contains the
patient or visit identifier. Typically this is the visit identifier, since
patients leave and re-enter hospital with different ICD-9 codes. It is a
character vector of length one. If left empty, or NULL, then an
attempt is made to guess which field has the ID for the patient encounter
(not a patient ID, although this can of course be specified directly). The
guesses proceed until a single match is made. Data frames may be wide with
many matching fields, so to avoid false positives, anything but a single
match is rejected. If there are no successful guesses, and visitId
was not specified, then the first column of the data frame is used.
icd9Field: The column in the data frame which contains the ICD codes.
This is a character vector of length one. If it is NULL, icd9
will attempt to guess the column name, looking for progressively less
likely possibilities until it matches a single column. Failing this, it will
take the first column in the data frame. Specifying the column using this
argument avoids the guesswork.
prefix: character, default "icd_", used to prefix the names of the new columns
min.width: single integer; if specified, writes out this many columns even if no patient has that many codes. Must be greater than or equal to the maximum number of codes per patient.
aggregate: single logical value. If TRUE (the default), the function takes more
time to find out-of-order visitIds and combines all the codes for each
unique visitId. If FALSE, then out-of-order visitIds will result in
one row in the output data per contiguous block of identical visitIds.
return.df: single logical value. If TRUE, return a data frame
with a field for the visitId. This may be more convenient, but the default
of FALSE gives the more natural return data of a matrix with
rownames being the visitIds.
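The difference between the two aggregate settings described above can be illustrated with a short base R sketch. It does not call icd9LongToWide and is not the package's implementation; the vector visit_ids and the variables n_aggregated and n_blocks are hypothetical names used only for illustration. It simply counts how many output rows each grouping rule would imply for out-of-order identifiers.

```r
# Hypothetical visit identifiers where "a" appears out of order.
visit_ids <- c("a", "b", "a")

# Analogue of aggregate = TRUE: one group per unique identifier,
# so the two separate "a" entries are combined.
n_aggregated <- length(unique(visit_ids))

# Analogue of aggregate = FALSE: one group per contiguous run of
# identical identifiers, so "a" contributes two separate groups.
n_blocks <- length(rle(visit_ids)$values)

n_aggregated  # 2
n_blocks      # 3
```

If the input is already sorted by visit identifier, the two rules give the same grouping, which is when skipping the extra work of aggregate = TRUE is harmless.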
Other ICD-9 convert: convert, icd9DecimalToParts, icd9MajMinToCode, icd9MajMinToDecimal, icd9MajMinToParts, icd9MajMinToParts_list, icd9MajMinToShort, icd9PartsToDecimal, icd9PartsToShort, icd9ShortToParts; icd9ChaptersToMap; icd9DropLeadingZeroes, icd9DropLeadingZeroesDecimal, icd9DropLeadingZeroesMajor, icd9DropLeadingZeroesShort; icd9WideToLong
# NOT RUN {
longdf <- data.frame(visitId = c("a", "b", "b", "c"),
icd9 = c("441", "4424", "443", "441"))
icd9LongToWide(longdf)
icd9LongToWide(longdf, prefix = "ICD10_")
# }
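To make the shape of the result concrete without relying on the package, the following is a minimal base R sketch of a long-to-wide reshape along the lines described in the arguments. The function long_to_wide_sketch and its parameters are hypothetical and are not part of icd9. The column naming (prefix followed by a running number) is an assumption, and the sketch always groups by unique identifier and returns a character matrix with the visit identifiers as row names.

```r
# Hypothetical helper for illustration only; not the icd9 implementation.
long_to_wide_sketch <- function(df, visit_col = "visitId", code_col = "icd9",
                                prefix = "icd_", min_width = 0) {
  # Unique visit identifiers in order of first appearance.
  ids <- unique(as.character(df[[visit_col]]))
  # Collect the codes belonging to each visit identifier.
  codes <- split(as.character(df[[code_col]]),
                 factor(as.character(df[[visit_col]]), levels = ids))
  # Width is the largest number of codes for any visit, or min_width if larger.
  width <- max(vapply(codes, length, integer(1)), min_width)
  # Pad each visit's codes with NA so that all rows have equal length.
  padded <- lapply(codes, function(x) {
    c(x, rep(NA_character_, width - length(x)))
  })
  # Stack the padded vectors into a matrix; row names come from the list names.
  wide <- do.call(rbind, padded)
  # Assumed naming scheme: prefix plus column index.
  colnames(wide) <- paste0(prefix, seq_len(width))
  wide
}

longdf <- data.frame(visitId = c("a", "b", "b", "c"),
                     icd9 = c("441", "4424", "443", "441"),
                     stringsAsFactors = FALSE)
wide <- long_to_wide_sketch(longdf)
wide_padded <- long_to_wide_sketch(longdf, min_width = 4)
```

Here wide has one row per visit ("a", "b", "c") and two code columns, because visit "b" has two codes; visits with fewer codes are padded with NA. With min_width = 4 the matrix is widened to four columns.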