
clinspacy

The goal of clinspacy is to perform biomedical named entity recognition, Unified Medical Language System (UMLS) concept mapping, and negation detection using the Python spaCy, scispacy, and medspacy packages.

Installation

You can install the GitHub version of clinspacy with:

remotes::install_github('ML4LHS/clinspacy', INSTALL_opts = '--no-multiarch')

How to load clinspacy

library(clinspacy)

Initiating clinspacy

Note: The very first time you run clinspacy_init() or clinspacy() after installing the package, you may receive an error stating that spaCy could not be imported because it was not found. Restarting your R session should resolve the issue.

Initiating clinspacy is optional. If you do not call clinspacy_init() yourself, the package is initiated automatically without the UMLS linker. The UMLS linker takes up ~12 GB of RAM, so it is loaded only on request: to use it, initiate clinspacy with the use_linker argument set to TRUE. The linker can also be added later by reinitiating with use_linker set to TRUE.

clinspacy_init() # This is optional! The default is to initiate clinspacy without the UMLS linker
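To start with the linker loaded instead, the call might look like the sketch below. It assumes clinspacy is already installed and that the machine has enough free memory for the linker; only the use_linker argument named above is used.

```r
library(clinspacy)

# Load the UMLS linker up front. This needs roughly 12 GB of RAM,
# so run it only on a machine with enough free memory.
clinspacy_init(use_linker = TRUE)
```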

Named entity recognition (without the UMLS linker)

The clinspacy() function can take a single string, a character vector, or a data frame. It can output either a data frame or a file name.

A single character string as input

clinspacy('This patient has diabetes and CKD stage 3 but no HTN.')
#>   |==================================================| 100%
#>   clinspacy_id      entity       lemma is_family is_historical is_hypothetical is_negated is_uncertain section_category
#> 1            1     patient     patient     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 2            1    diabetes    diabetes     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 3            1 CKD stage 3 ckd stage 3     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 4            1         HTN         htn     FALSE         FALSE           FALSE       TRUE        FALSE             <NA>

clinspacy('HISTORY: He presents with chest pain. PMH: HTN. MEDICATIONS: This patient with diabetes is taking omeprazole, aspirin, and lisinopril 10 mg but is not taking albuterol anymore as his asthma has resolved. ALLERGIES: penicillin.', verbose = FALSE)
#>    clinspacy_id     entity      lemma is_family is_historical is_hypothetical is_negated is_uncertain     section_category
#> 1             1 chest pain chest pain     FALSE          TRUE           FALSE      FALSE        FALSE                 <NA>
#> 2             1        PMH        PMH     FALSE         FALSE           FALSE      FALSE        FALSE past_medical_history
#> 3             1        HTN        htn     FALSE         FALSE           FALSE      FALSE        FALSE past_medical_history
#> 4             1    patient    patient     FALSE         FALSE           FALSE      FALSE        FALSE          medications
#> 5             1   diabetes   diabetes     FALSE         FALSE           FALSE      FALSE        FALSE          medications
#> 6             1 omeprazole omeprazole     FALSE         FALSE           FALSE      FALSE        FALSE          medications
#> 7             1    aspirin    aspirin     FALSE         FALSE           FALSE      FALSE        FALSE          medications
#> 8             1 lisinopril lisinopril     FALSE         FALSE           FALSE      FALSE        FALSE          medications
#> 9             1  albuterol  albuterol     FALSE         FALSE           FALSE       TRUE        FALSE          medications
#> 10            1     asthma     asthma     FALSE         FALSE           FALSE       TRUE        FALSE          medications
#> 11            1 penicillin penicillin     FALSE         FALSE           FALSE      FALSE        FALSE            allergies
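Because the result is an ordinary data frame, it can be subset with base R. The sketch below is illustrative only: it rebuilds a few rows of the output above by hand (in practice the data frame would come from clinspacy()) and keeps the non-negated entities from the medications section. It uses %in% rather than == so that rows with a missing section_category are dropped instead of producing NA rows.

```r
# Hand-copied subset of the output above; normally this is the
# data frame returned by clinspacy().
entities <- data.frame(
  clinspacy_id = 1L,
  entity = c("chest pain", "HTN", "omeprazole", "aspirin",
             "lisinopril", "albuterol", "penicillin"),
  is_negated = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),
  section_category = c(NA, "past_medical_history", "medications",
                       "medications", "medications", "medications",
                       "allergies"),
  stringsAsFactors = FALSE
)

# Keep entities found in the medications section that are not negated.
# %in% treats NA as "no match", so rows without a section are excluded.
in_meds <- entities$section_category %in% "medications"
current_meds <- entities[in_meds & !entities$is_negated, ]

current_meds$entity
```

Here albuterol is excluded because it is flagged as negated, and chest pain is excluded because it has no section category.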

A character vector as input

clinspacy(c('This pt has CKD and HTN', 'Pt only has CKD but no HTN'),
          verbose = FALSE)
#>   clinspacy_id entity lemma is_family is_historical is_hypothetical is_negated is_uncertain section_category
#> 1            1    CKD   ckd     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 2            1    HTN   htn     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 3            2     Pt    pt     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 4            2    CKD   ckd     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 5            2    HTN   htn     FALSE         FALSE           FALSE       TRUE        FALSE             <NA>
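The clinspacy_id column identifies which input string each entity came from, so the long output can be summarized per document. The following is a minimal base R sketch, not the package's own helper: it copies the rows above into a data frame by hand, drops negated entities, and cross-tabulates the remaining lemmas by document.

```r
# Hand-copied rows from the output above.
entities <- data.frame(
  clinspacy_id = c(1L, 1L, 2L, 2L, 2L),
  entity = c("CKD", "HTN", "Pt", "CKD", "HTN"),
  lemma = c("ckd", "htn", "pt", "ckd", "htn"),
  is_negated = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Drop negated mentions, then count each lemma per document.
affirmed <- entities[!entities$is_negated, ]
counts <- table(clinspacy_id = affirmed$clinspacy_id,
                lemma = affirmed$lemma)

counts
```

In the resulting table, document 2 has a count of zero for htn because its only HTN mention is negated.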

A data frame as input

library(magrittr) # provides the %>% pipe used below

data.frame(text = c('This pt has CKD and HTN', 'Diabetes is present'),
           stringsAsFactors = FALSE) %>%
  clinspacy(df_col = 'text', verbose = FALSE)
#>   clinspacy_id   entity    lemma is_family is_historical is_hypothetical is_negated is_uncertain section_category
#> 1            1      CKD      ckd     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 2            1      HTN      htn     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
#> 3            2 Diabetes Diabetes     FALSE         FALSE           FALSE      FALSE        FALSE             <NA>
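To see each entity next to the text it came from, the output can be joined back to the input. This sketch assumes that clinspacy_id matches the row order of the input data frame, as the output above suggests; the entities data frame is copied by hand rather than produced by clinspacy().

```r
# Input data frame, as in the example above.
notes <- data.frame(text = c("This pt has CKD and HTN", "Diabetes is present"),
                    stringsAsFactors = FALSE)

# Hand-copied columns from the output above.
entities <- data.frame(clinspacy_id = c(1L, 1L, 2L),
                       entity = c("CKD", "HTN", "Diabetes"),
                       stringsAsFactors = FALSE)

# Assumption: clinspacy_id corresponds to the input row number.
notes$clinspacy_id <- seq_len(nrow(notes))

# One row per entity, with the source text alongside it.
joined <- merge(entities, notes, by = "clinspacy_id")

joined
```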

Saving the output to file

When clinspacy() writes its results to a file, it returns the file name instead of a data frame. This output_file can then be piped into bind_clinspacy() or bind_clinspacy_embeddings(). This saves a lot of time because you can try different subsetting strategies in both of these functions without re-processing the original data.

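# Create a directory for clinspacy output if it does not already exist
if (!dir.exists(rappdirs::user_data_dir('clinspacy'))) {
  dir.create(rappdirs::user_data_dir('clinspacy'), recursive = TRUE)
}

# Load the mtsamples example dataset
mtsamples <- dataset_mtsamples()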

mtsamples[1:5,]
#>   note_id                                                      description          medical_specialty
#> 1       1 A 23-year-old white female presents with complaint of allergies.       Allergy / Immunology
#> 2       2                         Consult for laparoscopic gastric bypass.                 Bariatrics
#> 3       3                         Consult for laparoscopic gastric bypass.                 Bariatrics
#> 4       4                                             2-D M-Mode. Doppler. Cardiovascular / Pulmonary
#> 5       5                                               2-D Echocardiogram Cardiovascular / Pulmonary
#>                               sample_name
#> 1                       Allergic Rhinitis
#> 2 Laparoscopic Gastric Bypass Consult - 2
#> 3 Laparoscopic Gastric Bypass Consult - 1
#> 4                  2-D Echocardiogram - 1
#> 5                  2-D Echocardiogram - 2
#>   transcription
#> 1 SUBJECTIVE:,  This 23-year-old white female presents with complaint of allergies.  She used to have allergies when she lived in Seattle but she thinks they are worse here.  In the past, she has tried Claritin, and Zyrtec.  Both worked for short time but then seemed to lose effectiveness.  She has used Allegra also.  She used that last summer and she began using it again two weeks ago.  It does not appear to be working very well.  She has used over-the-counter sprays but no prescription nasal sprays.  She does have asthma but doest not require daily medication for this and does not think it is flaring up.,MEDICATIONS: , Her only medication currently is Ortho Tri-Cyclen and the Allegra.,ALLERGIES: , She has no known medicine allergies.,OBJECTIVE:,Vitals:  Weight was 130 pounds and blood pressure 124/78.,HEENT:  Her throat was mildly erythematous without exudate.  Nasal mucosa was erythematous and swollen.  Only clear drainage was seen.  TMs were clear.,Neck:  Supple without adenopathy.,Lungs:  Clear.,ASSESSMENT:,  Allergic rhinitis.,PLAN:,1.  She will try Zyrtec instead of Allegra again.  Another option will be to use loratadine.  She does not think she has prescription coverage so that might be cheaper.,2.  Samples of Nasonex two sprays in each nostril given for three weeks.  A prescription was written as well.
#> 2 PAST MEDICAL HISTORY:, He has difficulty climbing stairs, difficulty with airline seats, tying shoes, used to public seating, and lifting objects off the floor.  He exercises three times a week at home and does cardio.  He has difficulty walking two blocks or five flights of stairs.  Difficulty with snoring.  He has muscle and joint pains including knee pain, back pain, foot and ankle pain, and swelling.  He has gastroesophageal reflux disease.,PAST SURGICAL HISTORY:, Includes reconstructive surgery on his right hand 13 years ago.  ,SOCIAL HISTORY:, He is currently single.  He has about ten drinks a year.  He had smoked significantly up until several months ago.  He now smokes less than three cigarettes a day.,FAMILY HISTORY:, Heart disease in both grandfathers, grandmother with stroke, and a grandmother with diabetes.  Denies obesity and hypertension in other family members.,CURRENT MEDICATIONS:, None.,ALLERGIES:,  He is allergic to Penicillin.,MISCELLANEOUS/EATING HISTORY:, He has been going to support groups for seven months with Lynn Holmberg in Greenwich and he is from Eastchester, New York and he feels that we are the appropriate program.  He had a poor experience with the Greenwich program.  Eating history, he is not an emotional eater.  Does not like sweets.  He likes big portions and carbohydrates.  He likes chicken and not steak.  He currently weighs 312 pounds.  Ideal body weight would be 170 pounds.  He is 142 pounds overweight.  If ,he lost 60% of his excess body weight that would be 84 pounds and he should weigh about 228.,REVIEW OF SYSTEMS: ,Negative for head, neck, heart, lungs, GI, GU, orthopedic, and skin.  Specifically denies chest pain, heart attack, coronary artery disease, congestive heart failure, arrhythmia, atrial fibrillation, pacemaker, high cholesterol, pulmonary embolism, high blood pressure, CVA, venous insufficiency, thrombophlebitis, asthma, shortness of breath, COPD, emphysema, sleep apnea, diabetes, leg and foot swelling, osteoarthritis, rheumatoid arthritis, hiatal hernia, peptic ulcer disease, gallstones, infected gallbladder, pancreatitis, fatty liver, hepatitis, hemorrhoids, rectal bleeding, polyps, incontinence of stool, urinary stress incontinence, or cancer.  Denies cellulitis, pseudotumor cerebri, meningitis, or encephalitis.,PHYSICAL EXAMINATION:, He is alert and oriented x 3.  Cranial nerves II-XII are intact.  Afebrile.  Vital Signs are stable.
#> 3 HISTORY OF PRESENT ILLNESS: , I have seen ABC today.  He is a very pleasant gentleman who is 42 years old, 344 pounds.  He is 5'9".  He has a BMI of 51.  He has been overweight for ten years since the age of 33, at his highest he was 358 pounds, at his lowest 260.  He is pursuing surgical attempts of weight loss to feel good, get healthy, and begin to exercise again.  He wants to be able to exercise and play volleyball.  Physically, he is sluggish.  He gets tired quickly.  He does not go out often.  When he loses weight he always regains it and he gains back more than he lost.  His biggest weight loss is 25 pounds and it was three months before he gained it back.  He did six months of not drinking alcohol and not taking in many calories.  He has been on multiple commercial weight loss programs including Slim Fast for one month one year ago and Atkin's Diet for one month two years ago.,PAST MEDICAL HISTORY: , He has difficulty climbing stairs, difficulty with airline seats, tying shoes, used to public seating, difficulty walking, high cholesterol, and high blood pressure.  He has asthma and difficulty walking two blocks or going eight to ten steps.  He has sleep apnea and snoring.  He is a diabetic, on medication.  He has joint pain, knee pain, back pain, foot and ankle pain, leg and foot swelling.  He has hemorrhoids.,PAST SURGICAL HISTORY: , Includes orthopedic or knee surgery.,SOCIAL HISTORY: , He is currently single.  He drinks alcohol ten to twelve drinks a week, but does not drink five days a week and then will binge drink.  He smokes one and a half pack a day for 15 years, but he has recently stopped smoking for the past two weeks.,FAMILY HISTORY: , Obesity, heart disease, and diabetes.  Family history is negative for hypertension and stroke.,CURRENT MEDICATIONS:,  Include Diovan, Crestor, and Tricor.,MISCELLANEOUS/EATING HISTORY:  ,He says a couple of friends of his have had heart attacks and have had died.  He used to drink everyday, but stopped two years ago.  He now only drinks on weekends.  He is on his second week of Chantix, which is a medication to come off smoking completely.  Eating, he eats bad food.  He is single.  He eats things like bacon, eggs, and cheese, cheeseburgers, fast food, eats four times a day, seven in the morning, at noon, 9 p.m., and 2 a.m.  He currently weighs 344 pounds and 5'9".  His ideal body weight is 160 pounds.  He is 184 pounds overweight.  If he lost 70% of his excess body weight that would be 129 pounds and that would get him down to 215.,REVIEW OF SYSTEMS: , Negative for head, neck, heart, lungs, GI, GU, orthopedic, or skin.  He also is positive for gout.  He denies chest pain, heart attack, coronary artery disease, congestive heart failure, arrhythmia, atrial fibrillation, pacemaker, pulmonary embolism, or CVA.  He denies venous insufficiency or thrombophlebitis.  Denies shortness of breath, COPD, or emphysema.  Denies thyroid problems, hip pain, osteoarthritis, rheumatoid arthritis, GERD, hiatal hernia, peptic ulcer disease, gallstones, infected gallbladder, pancreatitis, fatty liver, hepatitis, rectal bleeding, polyps, incontinence of stool, urinary stress incontinence, or cancer.  He denies cellulitis, pseudotumor cerebri, meningitis, or encephalitis.,PHYSICAL EXAMINATION:  ,He is alert and oriented x 3.  Cranial nerves II-XII are intact.  Neck is soft and supple.  Lungs:  He has positive wheezing bilaterally.  Heart is regular rhythm and rate.  His abdomen is soft.  Extremities:  He has 1+ pitting edema.,IMPRESSION/PLAN:,  I have explained to him the risks and potential complications of laparoscopic gastric bypass in detail and these include bleeding, infection, deep venous thrombosis, pulmonary embolism, leakage from the gastrojejuno-anastomosis, jejunojejuno-anastomosis, and possible bowel obstruction among other potential complications.  He understands.  He wants to proceed with workup and evaluation for laparoscopic Roux-en-Y gastric bypass.  He will need to get a letter of approval from Dr. XYZ.  He will need to see a nutritionist and mental health worker.  He will need an upper endoscopy by either Dr. XYZ.  He will need to go to Dr. XYZ as he previously had a sleep study.  We will need another sleep study.  He will need H. pylori testing, thyroid function tests, LFTs, glycosylated hemoglobin, and fasting blood sugar.  After this is performed, we will submit him for insurance approval.
#> 4 2-D M-MODE: , ,1.  Left atrial enlargement with left atrial diameter of 4.7 cm.,2.  Normal size right and left ventricle.,3.  Normal LV systolic function with left ventricular ejection fraction of 51%.,4.  Normal LV diastolic function.,5.  No pericardial effusion.,6.  Normal morphology of aortic valve, mitral valve, tricuspid valve, and pulmonary valve.,7.  PA systolic pressure is 36 mmHg.,DOPPLER: , ,1.  Mild mitral and tricuspid regurgitation.,2.  Trace aortic and pulmonary regurgitation.
#> 5 1.  The left ventricular cavity size and wall thickness appear normal.  The wall motion and left ventricular systolic function appears hyperdynamic with estimated ejection fraction of 70% to 75%.  There is near-cavity obliteration seen.  There also appears to be increased left ventricular outflow tract gradient at the mid cavity level consistent with hyperdynamic left ventricular systolic function.  There is abnormal left ventricular relaxation pattern seen as well as elevated left atrial pressures seen by Doppler examination.,2.  The left atrium appears mildly dilated.,3.  The right atrium and right ventricle appear normal.,4.  The aortic root appears normal.,5.  The aortic valve appears calcified with mild aortic valve stenosis, calculated aortic valve area is 1.3 cm square with a maximum instantaneous gradient of 34 and a mean gradient of 19 mm.,6.  There is mitral annular calcification extending to leaflets and supportive structures with thickening of mitral valve leaflets with mild mitral regurgitation.,7.  The tricuspid valve appears normal with trace tricuspid regurgitation with moderate pulmonary artery hypertension.  Estimated pulmonary artery systolic pressure is 49 mmHg.  Estimated right atrial pressure of 10 mmHg.,8.  The pulmonary valve appears normal with trace pulmonary insufficiency.,9.  There is no pericardial effusion or intracardiac mass seen.,10.  There is a color Doppler suggestive of a patent foramen ovale with lipomatous hypertrophy of the interatrial septum.,11.  The study was somewhat technically limited and hence subtle abnormalities could be missed from the study.,
#>                                                                                                                                                                                                                                                                                                                                  keywords
#> 1                                                                                                                                                                                                     allergy / immunology, allergic rhinitis, allergies, asthma, nasal sprays, rhinitis, nasal, erythematous, allegra, sprays, allergic,
#> 2                                                                                                bariatrics, laparoscopic gastric bypass, weight loss programs, gastric bypass, atkin's diet, weight watcher's, body weight, laparoscopic gastric, weight loss, pounds, months, weight, laparoscopic, band, loss, diets, overweight, lost
#> 3                                                                                             bariatrics, laparoscopic gastric bypass, heart attacks, body weight, pulmonary embolism, potential complications, sleep study, weight loss, gastric bypass, anastomosis, loss, sleep, laparoscopic, gastric, bypass, heart, pounds, weight,
#> 4                                                                          cardiovascular / pulmonary, 2-d m-mode, doppler, aortic valve, atrial enlargement, diastolic function, ejection fraction, mitral, mitral valve, pericardial effusion, pulmonary valve, regurgitation, systolic function, tricuspid, tricuspid valve, normal lv
#> 5 cardiovascular / pulmonary, 2-d, doppler, echocardiogram, annular, aortic root, aortic valve, atrial, atrium, calcification, cavity, ejection fraction, mitral, obliteration, outflow, regurgitation, relaxation pattern, stenosis, systolic function, tricuspid, valve, ventricular, ventricular cavity, wall motion, pulmonary artery

clinspacy_output_file = 
  mtsamples[1:5, 1:2] %>% 
  clinspacy(df_col = 'description',
            verbose = FALSE,
            output_file = file.path(rappdirs::user_data_dir('clinspacy'),
                                    'output.csv'),
            overwrite = TRUE)

clinspacy_output_file
#> [1] "C:\\Users\\kdpsingh\\AppData\\Local\\clinspacy\\clinspacy/output.csv"
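
The returned value is only a file path, so the results have to be read back in before they can be inspected. The following is a minimal sketch in base R, assuming the output is an ordinary comma-separated file (the `.csv` extension suggests this, but the source does not state the exact format). The file, `toy_path`, and the values in `toy_output` are made up for illustration and are not produced by clinspacy.

```r
# Hypothetical stand-in for a clinspacy output file, written to a temp directory.
toy_path <- file.path(tempdir(), "output.csv")
toy_output <- data.frame(
  clinspacy_id = c(1, 2),
  entity       = c("allergies", "Consult"),
  stringsAsFactors = FALSE
)
write.csv(toy_output, toy_path, row.names = FALSE)

# Confirm the file exists before reading it back into a data frame.
stopifnot(file.exists(toy_path))
reloaded <- read.csv(toy_path, stringsAsFactors = FALSE)
print(reloaded)
```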

Binding named entities to a data frame (without the UMLS linker)

Negated concepts, as identified by the medspacy cycontext flag, are ignored by default and do not count towards the frequencies. You can change this by passing different subsetting criteria through the subset argument of bind_clinspacy().

Note that you now need to re-provide the original dataset to the bind_clinspacy() function.

mtsamples[1:5, 1:2] %>% 
  clinspacy(df_col = 'description', verbose = FALSE) %>% 
  bind_clinspacy(mtsamples[1:5, 1:2])
#>   clinspacy_id note_id                                                      description 2-d 2-d m-mode allergy complaint consult
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       1         1       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0         0       0
#> 5            5       5                                               2-D Echocardiogram   1          0       0         0       0
#>   doppler echocardiogram laparoscopic gastric bypass white female
#> 1       0              0                           0            1
#> 2       0              0                           1            0
#> 3       0              0                           1            0
#> 4       1              0                           0            0
#> 5       0              1                           0            0
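
To make the binding step concrete, here is a rough base R sketch of the kind of reshaping involved: negated entities are dropped, the remaining lemmas are counted per note, and the counts are attached as columns. This is an illustration under stated assumptions, not the package's implementation; `toy_entities`, `toy_notes`, and their values are hypothetical.

```r
# Toy long-format entity table: one row per (note, entity). Values are made up.
toy_entities <- data.frame(
  clinspacy_id = c(1, 1, 2, 3),
  lemma        = c("allergy", "complaint", "consult", "consult"),
  is_negated   = c(FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Toy original data frame: one row per note.
toy_notes <- data.frame(
  clinspacy_id = 1:3,
  description  = c("note one", "note two", "note three"),
  stringsAsFactors = FALSE
)

# Drop negated entities so they do not count towards the frequencies.
kept <- toy_entities[!toy_entities$is_negated, ]

# Count each lemma per note. Using all note ids as factor levels keeps
# notes with no remaining entities as all-zero rows.
counts <- table(factor(kept$clinspacy_id, levels = toy_notes$clinspacy_id),
                kept$lemma)

# Attach the counts to the original data as one column per lemma.
wide <- cbind(toy_notes, as.data.frame.matrix(counts))
print(wide)
```

Note three ends up with a zero in the consult column because its only entity was negated.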

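We can also store the intermediate result so that bind_clinspacy() does not need to re-process the text. The examples below reuse the stored result three ways: with the defaults, with cs_col = 'entity' (the column names then follow the entity text as written rather than its lemma, as the outputs show), and with a custom subset expression.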

clinspacy_output_data = 
  mtsamples[1:5, 1:2] %>% 
  clinspacy(df_col = 'description', verbose = FALSE)

clinspacy_output_data %>% 
  bind_clinspacy(mtsamples[1:5, 1:2])
#>   clinspacy_id note_id                                                      description 2-d 2-d m-mode allergy complaint consult
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       1         1       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0         0       0
#> 5            5       5                                               2-D Echocardiogram   1          0       0         0       0
#>   doppler echocardiogram laparoscopic gastric bypass white female
#> 1       0              0                           0            1
#> 2       0              0                           1            0
#> 3       0              0                           1            0
#> 4       1              0                           0            0
#> 5       0              1                           0            0

clinspacy_output_data %>% 
  bind_clinspacy(mtsamples[1:5, 1:2],
                 cs_col = 'entity')
#>   clinspacy_id note_id                                                      description 2-D 2-D M-Mode Consult Doppler
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       0       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       1       0
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       1       0
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0       1
#> 5            5       5                                               2-D Echocardiogram   1          0       0       0
#>   Echocardiogram allergies complaint laparoscopic gastric bypass white female
#> 1              0         1         1                           0            1
#> 2              0         0         0                           1            0
#> 3              0         0         0                           1            0
#> 4              0         0         0                           0            0
#> 5              1         0         0                           0            0

clinspacy_output_data %>% 
  bind_clinspacy(mtsamples[1:5, 1:2],
                 subset = 'is_uncertain == FALSE & is_negated == FALSE')
#>   clinspacy_id note_id                                                      description 2-d 2-d m-mode allergy complaint consult
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       1         1       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0         0       0
#> 5            5       5                                               2-D Echocardiogram   1          0       0         0       0
#>   doppler echocardiogram laparoscopic gastric bypass white female
#> 1       0              0                           0            1
#> 2       0              0                           1            0
#> 3       0              0                           1            0
#> 4       1              0                           0            0
#> 5       0              1                           0            0
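
The subset argument above is a condition written as a string. One common way to apply such a string to a data frame in base R is sketched below. The helper `filter_by_string` and the toy data are hypothetical, and the source does not say how bind_clinspacy() evaluates the condition internally.

```r
# Toy entity table with the two context flags used in the condition.
toy_entities <- data.frame(
  lemma        = c("diabetes", "htn", "asthma"),
  is_negated   = c(FALSE, TRUE, FALSE),
  is_uncertain = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse the condition string and evaluate it with the
# data frame's columns in scope, then keep only the rows where it is TRUE.
filter_by_string <- function(df, condition) {
  keep <- eval(parse(text = condition), envir = df, enclos = parent.frame())
  df[keep, , drop = FALSE]
}

kept <- filter_by_string(toy_entities,
                         'is_uncertain == FALSE & is_negated == FALSE')
print(kept$lemma)
```

Only rows that are neither negated nor uncertain survive, which in this toy table leaves the diabetes row.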

We can also re-use the output file created earlier by piping its path directly into bind_clinspacy().

clinspacy_output_file
#> [1] "C:\\Users\\kdpsingh\\AppData\\Local\\clinspacy\\clinspacy/output.csv"

clinspacy_output_file %>% 
  bind_clinspacy(mtsamples[1:5, 1:2])
#>   clinspacy_id note_id                                                      description 2-d 2-d m-mode allergy complaint consult
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       1         1       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0         0       0
#> 5            5       5                                               2-D Echocardiogram   1          0       0         0       0
#>   doppler echocardiogram laparoscopic gastric bypass white female
#> 1       0              0                           0            1
#> 2       0              0                           1            0
#> 3       0              0                           1            0
#> 4       1              0                           0            0
#> 5       0              1                           0            0

clinspacy_output_file %>% 
  bind_clinspacy(mtsamples[1:5, 1:2],
                 cs_col = 'entity')
#>   clinspacy_id note_id                                                      description 2-D 2-D M-Mode Consult Doppler
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       0       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       1       0
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       1       0
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0       1
#> 5            5       5                                               2-D Echocardiogram   1          0       0       0
#>   Echocardiogram allergies complaint laparoscopic gastric bypass white female
#> 1              0         1         1                           0            1
#> 2              0         0         0                           1            0
#> 3              0         0         0                           1            0
#> 4              0         0         0                           0            0
#> 5              1         0         0                           0            0

clinspacy_output_file %>% 
  bind_clinspacy(mtsamples[1:5, 1:2],
                 subset = 'is_uncertain == FALSE & is_negated == FALSE')
#>   clinspacy_id note_id                                                      description 2-d 2-d m-mode allergy complaint consult
#> 1            1       1 A 23-year-old white female presents with complaint of allergies.   0          0       1         1       0
#> 2            2       2                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 3            3       3                         Consult for laparoscopic gastric bypass.   0          0       0         0       1
#> 4            4       4                                             2-D M-Mode. Doppler.   0          1       0         0       0
#> 5            5       5                                               2-D Echocardiogram   1          0       0         0       0
#>   doppler echocardiogram laparoscopic gastric bypass white female
#> 1       0              0                           0            1
#> 2       0              0                           1            0
#> 3       0              0                           1            0
#> 4       1              0                           0            0
#> 5       0              1                           0            0

Binding entity embeddings to a data frame (without the UMLS linker)

With the UMLS linker disabled, 200-dimensional entity embeddings can be extracted from the scispacy Python package. For this to work, you must set return_scispacy_embeddings to TRUE when running clinspacy(). It’s also a good idea to write the output directly to a file because the embeddings can be quite large.

clinspacy_output_file = 
  mtsamples[1:5, 1:2] %>% 
  clinspacy(df_col = 'description',
            return_scispacy_embeddings = TRUE,
            verbose = FALSE,
            output_file = file.path(rappdirs::user_data_dir('clinspacy'),
                                    'output.csv'),
            overwrite = TRUE)

clinspacy_output_file %>% 
  bind_clinspacy_embeddings(mtsamples[1:5, 1:2])
#>   clinspacy_id note_id                                                      description    emb_001    emb_002     emb_003
#> 1            1       1 A 23-year-old white female presents with complaint of allergies. -0.1959790 0.28813400  0.09685702
#> 2            2       2                         Consult for laparoscopic gastric bypass. -0.1115363 0.01725144 -0.13519235
#> 3            3       3                         Consult for laparoscopic gastric bypass. -0.1115363 0.01725144 -0.13519235
#> 4            4       4                                             2-D M-Mode. Doppler. -0.3077586 0.25928350 -0.37220851
#>       emb_004    emb_005     emb_006      emb_007     emb_008    emb_009    emb_010    emb_011     emb_012   emb_013    emb_014
#> 1 -0.20641684 -0.1554238 -0.01624470  0.027011001  0.05331314 -0.1006668  0.3682853  0.0581439 -0.29079599 0.1611375 -0.1118952
#> 2 -0.05496463  0.1488807 -0.19577999  0.052658666 -0.10433200 -0.0763495  0.1199215 -0.1860092  0.05465447 0.1267057 -0.2041533
#> 3 -0.05496463  0.1488807 -0.19577999  0.052658666 -0.10433200 -0.0763495  0.1199215 -0.1860092  0.05465447 0.1267057 -0.2041533
#> 4 -0.06021732  0.0386426 -0.07756314 -0.002676249  0.22511028  0.3279995 -0.2274373 -0.1656060 -0.30020200 0.5237787 -0.1472114
#>       emb_015     emb_016    emb_017    emb_018    emb_019      emb_020    emb_021    emb_022     emb_023     emb_024     emb_025
#> 1 -0.03922822  0.06888010 -0.1862742 -0.1454458 0.04115367  0.049065500 0.39795328 0.05879098  0.05246135 -0.19981400 -0.03346085
#> 2  0.01984984 -0.01107489  0.1080266  0.1128684 0.23062316 -0.005933613 0.06126638 0.05048515  0.12351524 -0.02489970 -0.26744565
#> 3  0.01984984 -0.01107489  0.1080266  0.1128684 0.23062316 -0.005933613 0.06126638 0.05048515  0.12351524 -0.02489970 -0.26744565
#> 4 -0.02312062 -0.11272645 -0.3415540 -0.2255931 0.02385290  0.074861225 0.12910485 0.02176433 -0.21616454  0.08218845  0.33230226
#>     emb_026     emb_027     emb_028     emb_029     emb_030     emb_031     emb_032     emb_033     emb_034     emb_035
#> 1 0.1395520  0.01792375 -0.06969561 -0.04942485  0.06613978  0.08035761 -0.12418544 -0.11839510  0.04266573 -0.04319873
#> 2 0.3418240 -0.12783451  0.38420413 -0.20168215 -0.06550949  0.26997083 -0.07201438  0.13039007 -0.13608095  0.10342984
#> 3 0.3418240 -0.12783451  0.38420413 -0.20168215 -0.06550949  0.26997083 -0.07201438  0.13039007 -0.13608095  0.10342984
#> 4 0.2420833  0.08455360  0.22111987 -0.57962301  0.32054099 -0.26178523 -0.46501200  0.05091595 -0.22430425 -0.07319695
#>       emb_036     emb_037    emb_038    emb_039      emb_040     emb_041     emb_042      emb_043    emb_044     emb_045
#> 1  0.06394462  0.02425202 -0.2158322 -0.1064802  0.005398401  0.01459978 -0.03936125 -0.216860471 0.01146569 -0.01707370
#> 2  0.03349850 -0.06359592 -0.2497478 -0.1312915 -0.068015995  0.12897950  0.20849532 -0.001854315 0.02034700  0.04105476
#> 3  0.03349850 -0.06359592 -0.2497478 -0.1312915 -0.068015995  0.12897950  0.20849532 -0.001854315 0.02034700  0.04105476
#> 4 -0.19518739 -0.21279503 -0.1980325 -0.3900315  0.214830723 -0.03985715  0.32672650 -0.067201529 0.43131340 -0.10445137
#>       emb_046     emb_047     emb_048     emb_049     emb_050   emb_051       emb_052     emb_053    emb_054     emb_055
#> 1 -0.08789315 -0.48977432  0.11840488 -0.24063642 -0.23959090 0.1258371 -0.0001312072 -0.15632193  0.2063196 -0.02019964
#> 2 -0.26218344  0.05762917 -0.08367021 -0.01368977  0.02369371 0.1266086 -0.1197809521  0.04324770 -0.2046735 -0.21317951
#> 3 -0.26218344  0.05762917 -0.08367021 -0.01368977  0.02369371 0.1266086 -0.1197809521  0.04324770 -0.2046735 -0.21317951
#> 4 -0.36873272  0.39958726  0.03923560  0.06519943 -0.12042060 0.1947917  0.5587487221  0.02909975 -0.1112386 -0.29085600
#>        emb_056     emb_057      emb_058     emb_059    emb_060     emb_061     emb_062     emb_063      emb_064     emb_065
#> 1 -0.002069766 -0.14390510 -0.112056380 -0.12671516 -0.3076788  0.01722672 -0.04037631  0.14633203  0.072336150  0.04734538
#> 2  0.029707700 -0.04107177 -0.003977332  0.03327019  0.1377243  0.18907296 -0.26335296  0.01884718 -0.009265006 -0.16859459
#> 3  0.029707700 -0.04107177 -0.003977332  0.03327019  0.1377243  0.18907296 -0.26335296  0.01884718 -0.009265006 -0.16859459
#> 4  0.051582206  0.03322158 -0.090760550 -0.01738100  0.4675597 -0.29520441  0.62886798 -0.14435785  0.002738898 -0.03027805
#>      emb_066     emb_067     emb_068   emb_069     emb_070     emb_071     emb_072    emb_073     emb_074    emb_075    emb_076
#> 1  0.2444712 0.005439494  0.07232769 0.1972760 0.007281476 -0.03698583 -0.07433472 -0.0170116  0.15559705 -0.0142159 0.03095377
#> 2 -0.2767420 0.048937336 -0.35522249 0.1164578 0.345116988 -0.03482347 -0.09575927 -0.1530600 -0.08885341  0.1138750 0.24408367
#> 3 -0.2767420 0.048937336 -0.35522249 0.1164578 0.345116988 -0.03482347 -0.09575927 -0.1530600 -0.08885341  0.1138750 0.24408367
#> 4 -0.4466182 0.080596073  0.29857932 0.2307856 0.032678135 -0.02464749 -0.05315572  0.2278580  0.05121428  0.3368990 0.12042545
#>      emb_077     emb_078    emb_079    emb_080    emb_081     emb_082    emb_083     emb_084    emb_085     emb_086    emb_087
#> 1 0.14973202 -0.07275485 -0.1265165  0.0756736 -0.1064746 -0.04138183  0.1262948 -0.07008250 -0.0581785 -0.08323197 -0.1252120
#> 2 0.01405296 -0.00684475 -0.1356777 -0.1306460  0.2395754 -0.24276201  0.1975068 -0.03769429 -0.2019527  0.09356334 -0.2311737
#> 3 0.01405296 -0.00684475 -0.1356777 -0.1306460  0.2395754 -0.24276201  0.1975068 -0.03769429 -0.2019527  0.09356334 -0.2311737
#> 4 0.05976460  0.20906300 -0.3898960 -0.2403080 -0.2094990 -0.43718034 -0.2580445 -0.36398449 -0.1863167 -0.38763523  0.1124806
#>       emb_088     emb_089     emb_090    emb_091    emb_092     emb_093     emb_094      emb_095     emb_096    emb_097
#> 1  0.10060352 -0.01839051 -0.24945817  0.2108233  0.2314818 -0.07174893  0.03378552  0.002213914  0.22163883 0.30331765
#> 2  0.01929579 -0.18456985  0.16967812 -0.3636869 -0.1134262  0.07241845  0.29899751  0.111884147 -0.04911397 0.05792167
#> 3  0.01929579 -0.18456985  0.16967812 -0.3636869 -0.1134262  0.07241845  0.29899751  0.111884147 -0.04911397 0.05792167
#> 4 -0.25680842 -0.21670937 -0.02249805  0.2278338 -0.1409704  0.17529125 -0.05521812 -0.186143875  0.54336450 0.13775243
#>        emb_098     emb_099     emb_100     emb_101     emb_102    emb_103     emb_104       emb_105   emb_106     emb_107
#> 1  0.009472401 -0.14205784  0.12607630 -0.19062089 -0.08417289 -0.0868922  0.08520973  0.1095840322 0.0911104 -0.11639215
#> 2 -0.125230156 -0.27682150 -0.03230023  0.09556636 -0.01811487  0.2020687 -0.28405397 -0.2379808277 0.0503400  0.07255385
#> 3 -0.125230156 -0.27682150 -0.03230023  0.09556636 -0.01811487  0.2020687 -0.28405397 -0.2379808277 0.0503400  0.07255385
#> 4 -0.269951746  0.01101355  0.12618919  0.24217032  0.19674813  0.1094553 -0.02718710 -0.0006717525 0.1023474  0.30398776
#>      emb_108     emb_109     emb_110     emb_111     emb_112    emb_113     emb_114     emb_115     emb_116     emb_117
#> 1 -0.1988509 -0.02318672 -0.03355397  0.06281934  0.09064088 -0.1812218 -0.08294683  0.09746995  0.16949679 0.001256246
#> 2 -0.3391048  0.29906577 -0.28191616  0.04745353 -0.04532966 -0.1529041  0.04579017  0.02364063 -0.31116034 0.160783665
#> 3 -0.3391048  0.29906577 -0.28191616  0.04745353 -0.04532966 -0.1529041  0.04579017  0.02364063 -0.31116034 0.160783665
#> 4  0.0299391  0.38101604 -0.07525725 -0.19109026 -0.09757482 -0.3430861  0.07392349 -0.34514988 -0.05409198 0.021575954
#>       emb_118     emb_119    emb_120    emb_121     emb_122    emb_123     emb_124     emb_125     emb_126     emb_127    emb_128
#> 1 -0.09206300 -0.27094193  0.1914412 0.10522338  0.01736773 -0.1658078 -0.24409867 -0.20621473 -0.35578349  0.19991713 -0.1075110
#> 2 -0.07702465 -0.02175729 -0.1156647 0.01362599 -0.20085029  0.3362202 -0.03874875 -0.02545092  0.21585878 -0.04820869  0.1341518
#> 3 -0.07702465 -0.02175729 -0.1156647 0.01362599 -0.20085029  0.3362202 -0.03874875 -0.02545092  0.21585878 -0.04820869  0.1341518
#> 4  0.24660901 -0.25714830 -0.3096262 0.14711675 -0.09584628 -0.2465328  0.02228437 -0.05287175  0.04758008  0.13082074 -0.4366458
#>       emb_129    emb_130     emb_131     emb_132    emb_133     emb_134     emb_135     emb_136    emb_137     emb_138     emb_139
#> 1 0.050961102 0.08590268 -0.07344585 -0.11005830  0.2082962 -0.03440777 -0.15951183  0.04417117 -0.1002716 -0.07090355 -0.09013366
#> 2 0.084913827 0.21485816 -0.26201880 -0.04661880  0.1594945  0.24577541 -0.04687785  0.02120483 -0.2707188 -0.05038439 -0.21531074
#> 3 0.084913827 0.21485816 -0.26201880 -0.04661880  0.1594945  0.24577541 -0.04687785  0.02120483 -0.2707188 -0.05038439 -0.21531074
#> 4 0.002557264 0.30628723 -0.24981013 -0.01674807 -0.3169997  0.12056302 -0.09506032 -0.01222125 -0.4409042  0.23120450  0.01691840
#>        emb_140     emb_141     emb_142     emb_143     emb_144    emb_145     emb_146     emb_147    emb_148    emb_149    emb_150
#> 1  0.004567102 -0.04074124 -0.09970398 -0.07412403  0.08118367 0.04151318  0.01023637 -0.02712608  0.1120797 0.07420963  0.2022959
#> 2 -0.214246295  0.12730155  0.04358483 -0.04084410  0.08556246 0.37193301 -0.23297635  0.16786779 -0.1552295 0.13361997  0.4047717
#> 3 -0.214246295  0.12730155  0.04358483 -0.04084410  0.08556246 0.37193301 -0.23297635  0.16786779 -0.1552295 0.13361997  0.4047717
#> 4  0.127434801  0.19368662  0.02984041 -0.14155845 -0.15326020 0.02936405  0.05187999  0.06006772  0.0758267 0.04905358 -0.0133047
#>       emb_151    emb_152     emb_153     emb_154     emb_155    emb_156    emb_157     emb_158     emb_159    emb_160     emb_161
#> 1 -0.02539130 -0.1542052  0.09878749  0.11210436 0.190853971 -0.2355878  0.1032905 -0.21532827  0.09456767 -0.1445503 -0.33522494
#> 2 -0.07385027  0.2168649  0.08279617  0.02853568 0.007983398 -0.2673024 -0.3518553  0.07097678  0.08358909 -0.1986835 -0.29901644
#> 3 -0.07385027  0.2168649  0.08279617  0.02853568 0.007983398 -0.2673024 -0.3518553  0.07097678  0.08358909 -0.1986835 -0.29901644
#> 4  0.25728051  0.2761333 -0.10433040 -0.02122432 0.066375951 -0.3625118 -0.2547615  0.13501658 -0.28645951 -0.1917117 -0.01892012
#>       emb_162      emb_163    emb_164     emb_165   emb_166     emb_167    emb_168     emb_169    emb_170    emb_171     emb_172
#> 1  0.15268593 -0.001686232  0.2152747 -0.10312133 0.1135696 -0.02624894  0.1098730  0.09047928 0.12684340 -0.0694985 -0.11949543
#> 2 -0.01896982 -0.052200415  0.1262764  0.10607937 0.0321700 -0.25643115 -0.1073976  0.26462262 0.03679075 -0.2173935  0.07656907
#> 3 -0.01896982 -0.052200415  0.1262764  0.10607937 0.0321700 -0.25643115 -0.1073976  0.26462262 0.03679075 -0.2173935  0.07656907
#> 4 -0.02507000 -0.031375002 -0.2519416  0.08888888 0.3796148 -0.25476800 -0.1437821 -0.15589955 0.23368900  0.1311810  0.52442150
#>      emb_173     emb_174     emb_175    emb_176    emb_177     emb_178     emb_179     emb_180      emb_181    emb_182    emb_183
#> 1  0.2164041 -0.29396720 -0.16588253 -0.1348005 -0.1148055 -0.08968537  0.05097483  0.09355133  0.008875800  0.1106400 -0.1088511
#> 2 -0.1012526 -0.02410151 -0.02048860 -0.1179298  0.2362113  0.30876314 -0.22625668  0.07487945  0.008851715 -0.1024263 -0.2249113
#> 3 -0.1012526 -0.02410151 -0.02048860 -0.1179298  0.2362113  0.30876314 -0.22625668  0.07487945  0.008851715 -0.1024263 -0.2249113
#> 4 -0.0487657  0.25153150  0.02299049 -0.1953604 -0.1572996  0.29195935 -0.05653973 -0.12341889 -0.312314242 -0.1885454 -0.2873893
#>       emb_184     emb_185     emb_186    emb_187     emb_188      emb_189    emb_190     emb_191      emb_192     emb_193
#> 1 -0.02326688  0.17733055 -0.07351807  0.0222525 -0.12066887 -0.179350998 0.01909462  0.13228424  0.024832169  0.05002003
#> 2 -0.06455390  0.07631866  0.01623236 -0.1098196 -0.04689731 -0.033685058 0.16270872 -0.05825762  0.069446986 -0.05563271
#> 3 -0.06455390  0.07631866  0.01623236 -0.1098196 -0.04689731 -0.033685058 0.16270872 -0.05825762  0.069446986 -0.05563271
#> 4 -0.02149600 -0.16462975  0.14877875  0.2350687  0.36260483  0.004200405 0.20571376  0.09558415 -0.006550124 -0.30820300
#>       emb_194     emb_195    emb_196     emb_197     emb_198     emb_199     emb_200
#> 1 -0.20531311 -0.00853500  0.0639337  0.29886368  0.01618892 -0.08192083 -0.37027851
#> 2 -0.17479033 -0.13635058  0.1291080 -0.09743453 -0.09941812 -0.05773153 -0.09702638
#> 3 -0.17479033 -0.13635058  0.1291080 -0.09743453 -0.09941812 -0.05773153 -0.09702638
#> 4  0.01686265 -0.05414012 -0.1694009 -0.13313706 -0.15822850  0.14830773 -0.34555282
#>  [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
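
The output above has one embedding row per note even though a note can contain several entities, and notes 2 and 3 (identical text) receive identical vectors. One plausible way to get from per-entity vectors to per-note vectors is to average them. The sketch below shows that pooling in base R with two made-up dimensions. Whether bind_clinspacy_embeddings() uses a mean is an assumption here and is not confirmed by the source.

```r
# Toy per-entity embeddings: note 1 has two entities, note 2 has one.
# Column names mimic the emb_### pattern; the values are made up.
toy_emb <- data.frame(
  clinspacy_id = c(1, 1, 2),
  emb_001      = c(0.2, 0.4, -0.1),
  emb_002      = c(1.0, 0.0, 0.5)
)

# Average every embedding column within each note (assumed pooling rule).
pooled <- aggregate(. ~ clinspacy_id, data = toy_emb, FUN = mean)
print(pooled)
```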

Adding the UMLS linker

The UMLS linker can be turned on (and off) even if clinspacy_init() has already been called. The first time you turn it on, it takes a while because the linker needs to be loaded into memory. Turning it off and on again afterwards is much faster because the linker is only removed from or added to the pipeline and does not need to be reloaded into memory.

clinspacy_init(use_linker = TRUE)

Named entity recognition (with the UMLS linker)

Turning on the UMLS linker lets you restrict the results by semantic type. In general, restricting the results within clinspacy() is not a good idea because you can always subset them later within bind_clinspacy() and bind_clinspacy_embeddings().

clinspacy('This patient has diabetes and CKD stage 3 but no HTN.')
#>   clinspacy_id      cui      entity       lemma             semantic_type                      definition is_family is_historical
#> 1            1 C0030705     patient     patient Patient or Disabled Group                        Patients     FALSE         FALSE
#> 2            1 C1578481     patient     patient           Idea or Concept      Mail Claim Party - Patient     FALSE         FALSE
#> 3            1 C1578484     patient     patient           Idea or Concept Relationship modifier - Patient     FALSE         FALSE
#> 4            1 C1578485     patient     patient      Intellectual Product Specimen Source Codes - Patient     FALSE         FALSE
#> 5            1 C1578486     patient     patient      Intellectual Product  Disabled Person Code - Patient     FALSE         FALSE
#> 6            1 C0011847    diabetes    diabetes       Disease or Syndrome                        Diabetes     FALSE         FALSE
#> 7            1 C0011849    diabetes    diabetes       Disease or Syndrome               Diabetes Mellitus     FALSE         FALSE
#> 8            1 C2316787 CKD stage 3 ckd stage 3       Disease or Syndrome  Chronic kidney disease stage 3     FALSE         FALSE
#> 9            1 C0020538         HTN         htn       Disease or Syndrome            Hypertensive disease     FALSE         FALSE
#>   is_hypothetical is_negated is_uncertain section_category
#> 1           FALSE      FALSE        FALSE             <NA>
#> 2           FALSE      FALSE        FALSE             <NA>
#> 3           FALSE      FALSE        FALSE             <NA>
#> 4           FALSE      FALSE        FALSE             <NA>
#> 5           FALSE      FALSE        FALSE             <NA>
#> 6           FALSE      FALSE        FALSE             <NA>
#> 7           FALSE      FALSE        FALSE             <NA>
#> 8           FALSE      FALSE        FALSE             <NA>
#> 9           FALSE       TRUE        FALSE             <NA>
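
Because the unrestricted output keeps every semantic type, it can be narrowed after the fact. The sketch below rebuilds a few of the rows shown above as a toy data frame (a hand-copied subset of the columns, not a live clinspacy result) and filters them by semantic type and negation in base R.

```r
# A few rows copied by hand from the output above, with only four columns kept.
toy_linked <- data.frame(
  cui           = c("C0030705", "C0011849", "C2316787", "C0020538"),
  entity        = c("patient", "diabetes", "CKD stage 3", "HTN"),
  semantic_type = c("Patient or Disabled Group", "Disease or Syndrome",
                    "Disease or Syndrome", "Disease or Syndrome"),
  is_negated    = c(FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep non-negated diseases only; the full table remains available
# for other subsets later.
diseases <- toy_linked[toy_linked$semantic_type == "Disease or Syndrome" &
                         !toy_linked$is_negated, ]
print(diseases$entity)
```

HTN is dropped because it is flagged as negated, and the patient row is dropped because of its semantic type.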

clinspacy('This patient with diabetes is taking omeprazole, aspirin, and lisinopril 10 mg but is not taking albuterol anymore as his asthma has resolved.',
          semantic_types = 'Pharmacologic Substance')
#>   clinspacy_id      cui     entity      lemma           semantic_type definition is_family is_historical is_hypothetical
#> 1            1 C0028978 omeprazole omeprazole Pharmacologic Substance Omeprazole     FALSE         FALSE           FALSE
#> 2            1 C0004057    aspirin    aspirin Pharmacologic Substance    Aspirin     FALSE         FALSE           FALSE
#> 3            1 C0065374 lisinopril lisinopril Pharmacologic Substance Lisinopril     FALSE         FALSE           FALSE
#> 4            1 C0001927  albuterol  albuterol Pharmacologic Substance  Albuterol     FALSE         FALSE           FALSE
#>   is_negated is_uncertain section_category
#> 1      FALSE        FALSE             <NA>
#> 2      FALSE        FALSE             <NA>
#> 3      FALSE        FALSE             <NA>
#> 4       TRUE        FALSE             <NA>

clinspacy('This patient with diabetes is taking omeprazole, aspirin, and lisinopril 10 mg but is not taking albuterol anymore as his asthma has resolved.',
          semantic_types = 'Disease or Syndrome')

Install

install.packages('clinspacy')

Version: 1.0.2
License: MIT + file LICENSE
Maintainer: Karandeep Singh
Last Published: March 20th, 2021
Monthly Downloads: 288

Functions in clinspacy (1.0.2)

bind_clinspacy_embeddings

bind_clinspacy

This function binds columns containing either the lemma of the entity or the UMLS concept unique identifier (CUI) with frequencies to a data frame. The resulting data frame can be used to train a machine learning model or for additional feature selection.

dataset_mtsamples

Medical transcription samples.

%>%

Pipe operator

dataset_cui2vec_embeddings

Cui2vec concept embeddings

dataset_cui2vec_definitions

Cui2vec concept definitions

clinspacy

This is the primary function for processing both data frames and character vectors in the clinspacy package.

clinspacy_init

Initializes clinspacy. This function is optional to run but gives you more control over the parameters used by scispacy at initiation. If you do not run this function, it will be run with default parameters the first time that any of the package functions are run.