
icd (version 2.2)

rtf_strip: Strip RTF

Description

Takes a vector of character strings containing RTF, replaces each \tab with a space, and removes all other RTF symbols.

Usage

rtf_strip(x, ...)

Arguments

x

vector of character strings containing RTF

...

additional arguments, such as perl and useBytes (see Examples)

Details

Only \tab is replaced (with a space); all other RTF tags are dropped entirely.
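
The behaviour described above can be illustrated with a minimal sketch. This is not the icd implementation: the helper name strip_rtf_sketch and its three regular expressions are assumptions made for illustration, and they handle only simple control words and braces. The sketch assumes that extra arguments are forwarded to gsub, which would explain why perl and useBytes can be supplied in the Examples.

```r
# Hypothetical helper, NOT the icd source: a simplified illustration of
# "replace \tab with a space, drop every other RTF tag".
strip_rtf_sketch <- function(x, ...) {
  # Replace the \tab control word (plus one optional delimiter space) with a space.
  x <- gsub("\\\\tab ?", " ", x, ...)
  # Drop remaining control words such as \par or \fs24, including an optional
  # numeric parameter and one trailing delimiter space.
  x <- gsub("\\\\[a-z]+-?[0-9]* ?", "", x, ...)
  # Drop the braces that delimit RTF groups.
  x <- gsub("[{}]", "", x, ...)
  x
}

strip_rtf_sketch("{\\b 001.0}\\tab Cholera\\par")
#> [1] "001.0 Cholera"
```

Real RTF also contains escaped characters, hexadecimal escapes and nested destinations, which this sketch deliberately ignores.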

Examples

# NOT RUN {
# rtf_strip is a slow step; using useBytes and perl together is five times faster
f_info_rtf <- rtf_fetch_year("2011", offline = FALSE)
rtf_lines <- readLines(f_info_rtf$file_path, warn = FALSE, encoding = "ASCII")
microbenchmark::microbenchmark(
  res_both <- rtf_strip(rtf_lines, perl = TRUE, useBytes = TRUE),
  res_none <- rtf_strip(rtf_lines, perl = FALSE, useBytes = FALSE),
  res_bytes <- rtf_strip(rtf_lines, perl = FALSE, useBytes = TRUE),
  res_perl <- rtf_strip(rtf_lines, perl = TRUE, useBytes = FALSE),
  times = 5
)
stopifnot(identical(res_both, res_none))
# }
