extract_entities for Tapering application
This function searches a phrase for medication dosing entities of interest. It is called within medExtractR_tapering and generally not intended for use outside that function.
extract_entities_tapering(
phrase,
p_start,
d_stop,
unit,
frequency_fun = NULL,
intaketime_fun = NULL,
duration_fun = NULL,
route_fun = NULL,
doseschedule_fun = NULL,
preposition_fun = NULL,
timekeyword_fun = NULL,
transition_fun = NULL,
dosechange_fun = NULL,
strength_sep = NULL,
...
)
phrase: Text to search.
p_start: Start position of phrase within original text.
d_stop: End position of drug name within original text.
unit: Unit of measurement for medication strength, e.g., ‘mg’.
frequency_fun: Function used to extract frequency.
intaketime_fun: Function used to extract intake time.
duration_fun: Function used to extract duration.
route_fun: Function used to extract route.
doseschedule_fun: Function used to extract dose schedule.
preposition_fun: Function used to extract preposition.
timekeyword_fun: Function used to extract time keyword.
transition_fun: Function used to extract transition.
dosechange_fun: Function used to extract dose change.
strength_sep: Delimiter for contiguous medication strengths.
...: Parameter settings used in extracting frequency and intake time, including additional arguments to frequency_fun and intaketime_fun. Use frequency_dict to identify custom frequency dictionaries and intaketime_dict to identify custom intake time dictionaries. Similarly, for all other entities with a corresponding <entity>_fun, a custom dictionary can be supplied with the argument <entity>_dict.
data.frame with entities information. At least one row per entity is returned, using NA when no expression was found for a given entity.
The “entity” column of the output contains the formatted label for that entity, according to the following mapping.
strength: “Strength”
dose amount: “DoseAmt”
dose strength: “DoseStrength”
frequency: “Frequency”
intake time: “IntakeTime”
duration: “Duration”
route: “Route”
dose change: “DoseChange”
dose schedule: “DoseSchedule”
time keyword: “TimeKeyword”
transition: “Transition”
preposition: “Preposition”
dispense amount: “DispenseAmt”
refill: “Refill”
Sample output for the phrase “Lamotrigine 200mg bid for 14 days” would look like:
entity       | expr
IntakeTime   | <NA>
Strength     | <NA>
DoseAmt      | <NA>
DoseChange   | <NA>
DoseSchedule | <NA>
TimeKeyword  | <NA>
Transition   | <NA>
Preposition  | <NA>
DispenseAmt  | <NA>
Refill       | <NA>
Frequency    | bid;19:22
DoseStrength | 200mg;13:18
Preposition  | for;23:26
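Each non-missing expr value in the sample above packs the matched text and its position into one string of the form text;start:stop. The following is a minimal base R sketch for splitting such values into separate columns. The data frame is typed in by hand from the sample rows rather than produced by the package, and treating stop as one position past the last matched character is an assumption inferred from those rows.

```r
# Rows copied by hand from the sample output above (not generated by the package)
phrase <- "Lamotrigine 200mg bid for 14 days"
out <- data.frame(
  entity = c("Frequency", "DoseStrength", "Preposition", "IntakeTime"),
  expr   = c("bid;19:22", "200mg;13:18", "for;23:26", NA),
  stringsAsFactors = FALSE
)

# Keep only the entities that were actually found
found <- out[!is.na(out$expr), ]

# Split "text;start:stop" at the final semicolon
found$text  <- sub(";[0-9]+:[0-9]+$", "", found$expr)
pos         <- sub("^.*;", "", found$expr)
found$start <- as.integer(sub(":.*$", "", pos))
found$stop  <- as.integer(sub("^.*:", "", pos))

# Order the entities by where they occur in the phrase
found <- found[order(found$start), ]

# Assumption: stop is exclusive, so the matched text spans start..(stop - 1)
found$check <- substr(phrase, found$start, found$stop - 1)
print(found[, c("entity", "text", "start", "stop")])
```

Under that assumption, the check column reproduces the text column, which is a quick way to confirm that positions refer to the original phrase.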
Various medication dosing entities are extracted within this function including the following:
strength: The amount of drug in a given dosage form (e.g., tablet, capsule).
dose amount: The number of tablets, capsules, etc. taken at a given intake time.
dose strength: The total amount of drug taken at a given intake. This quantity is equivalent to strength x dose amount, and appears similar to strength when dose amount is absent.
frequency: The number of times per day a dose is taken, e.g., ‘once daily’ or ‘2x/day’.
intake time: The time period of the day during which a dose is taken, e.g., ‘morning’, ‘lunch’, ‘in the pm’.
duration: How long a patient is on a drug regimen, e.g., ‘2 weeks’, ‘mid-April’, ‘another 3 days’.
route: The administration route of the drug, e.g., ‘by mouth’, ‘IV’, ‘topical’.
dose change: Whether the dosage of the drug was changed, e.g., ‘increase’, ‘adjust’, ‘reduce’.
dose schedule: Keywords which represent special dosing regimens, such as tapering schedules, alternating doses, or stopping keywords, e.g., ‘weaning’, ‘even days’ or ‘odd days’, ‘discontinue’.
time keyword: Whether the dosing regimen is a past dose, current dose, or future dose, e.g., ‘currently’, ‘remain’, ‘yesterday’.
transition: Words or symbols that link consecutive doses of a tapering regimen, e.g., ‘then’, ‘followed by’, or a comma ‘,’.
preposition: Prepositions that occur immediately next to another identified entity, e.g., ‘to’, ‘until’, ‘for’.
dispense amount: The number of pills prescribed to the patient.
refill: The number of refills allowed for the patient's prescription.
Similar to the basic implementation, drug name and time of last dose are not handled by the extract_entities_tapering function. Those entities are extracted separately and appended to the extract_entities_tapering output within the main medExtractR_tapering function. In the tapering extension, however, dose change is treated the same as other dictionary-based entities and extracted within extract_entities_tapering. Strength, dose amount, dose strength, dispense amount, and refill are primarily numeric quantities, and are identified using a combination of regular expressions and rule-based approaches. All other entities use dictionaries for identification. For more information about the default dictionary for a specific entity, view the documentation file for the object <entity>_vals.
By default and when an argument <entity>_fun is NULL, the extract_generic function will be used to extract that entity. This function can also inherit user-defined entity dictionaries for each entity, supplied as arguments <entity>_dict to medExtractR or medExtractR_tapering (see documentation files for main function(s) for details).
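To make the idea of dictionary-based extraction concrete, here is a rough base R sketch of matching a user-defined dictionary against a phrase. The helper find_dict_matches is hypothetical and is not part of medExtractR; it does not reproduce extract_generic or its return format. It assumes the dictionary expressions sit in the first column of a data.frame and reports an exclusive stop position.

```r
# Hypothetical helper for illustration only; NOT a medExtractR function
find_dict_matches <- function(phrase, dict) {
  exprs <- as.character(dict[[1]])  # assumption: expressions are in column 1
  # Try longer expressions first so "twice daily" takes priority over "daily"
  exprs <- exprs[order(nchar(exprs), decreasing = TRUE)]
  hits <- data.frame(expr = character(0), start = integer(0),
                     stop = integer(0), stringsAsFactors = FALSE)
  for (e in exprs) {
    # Case-insensitive literal matching
    m <- gregexpr(tolower(e), tolower(phrase), fixed = TRUE)[[1]]
    if (m[1] == -1) next
    starts <- as.integer(m)
    stops  <- starts + attr(m, "match.length")  # exclusive end position
    for (i in seq_along(starts)) {
      # Skip a match that lies inside a longer match already recorded
      inside <- any(hits$start <= starts[i] & hits$stop >= stops[i])
      if (!inside) {
        hits <- rbind(hits, data.frame(expr = e, start = starts[i],
                                       stop = stops[i],
                                       stringsAsFactors = FALSE))
      }
    }
  }
  hits[order(hits$start), ]
}

my_dictionary <- data.frame(c("daily", "twice daily"))
res <- find_dict_matches("prednisone 20mg twice daily then 10mg daily",
                         my_dictionary)
print(res)
```

Matching longer expressions first is one simple way to keep overlapping dictionary entries from being reported twice; the package may resolve overlaps differently.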
Note that extract_entities_tapering has the argument d_stop, the end position of the drug name. This differs from extract_entities, which uses the end position of the full search window. This is a consequence of medExtractR using a fixed search window length and medExtractR_tapering dynamically constructing a search window.
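When calling the function directly, the position arguments have to be supplied by hand. The sketch below shows one way to locate a drug name with base R. The variable names are illustrative, and taking d_stop as one position past the last character of the drug name is an assumption, chosen because it matches the value 11 used for the 10-letter name "prednisone" in the example call for this function.

```r
note <- "prednisone 20mg daily tapering to 5mg daily over 2 weeks"
drug <- "prednisone"

# Locate the first literal occurrence of the drug name in the note
m <- regexpr(drug, note, fixed = TRUE)

d_start <- as.integer(m)                     # first character of the drug name
d_stop  <- d_start + attr(m, "match.length") # assumption: exclusive end position

print(c(d_start = d_start, d_stop = d_stop))
```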
library(medExtractR)

# Tapering note; the drug name "prednisone" starts the phrase
note <- "prednisone 20mg daily tapering to 5mg daily over 2 weeks"
# p_start = 1 (start of phrase), d_stop = 11 (end position of the drug name)
extract_entities_tapering(note, 1, 11, "mg")
# A user-defined dictionary can be used instead of the default
my_dictionary <- data.frame(c("daily", "twice daily"))
extract_entities_tapering(note, 1, 11, "mg", frequency_dict = my_dictionary)