
fansi - ANSI Control Sequence Aware String Functions

Counterparts to R string manipulation functions that account for the effects of ANSI text formatting control sequences.

Formatting Strings with Control Sequences

Many terminals will recognize special sequences of characters in strings and change display behavior as a result. For example, on my terminal the sequences "\033[3?m" and "\033[4?m", where "?" is a digit in 1-7, change the foreground and background colors of text respectively:

fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

This type of sequence is called an ANSI CSI SGR control sequence. Most *nix terminals support them, and newer versions of the Windows and RStudio consoles do too. You can check whether your display supports them by running term_cap_test().
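As a minimal sketch of the numbering scheme described above, the seven foreground and background codes can be generated with base R's sprintf. The names fg, bg, and samples are illustrative only, and what you see depends on your display honoring these sequences.

```r
# Build the foreground ("\033[3?m") and background ("\033[4?m") sequences
# for the digits 1-7. "\033" is the ESC character.
fg <- sprintf("\033[3%dm", 1:7)
bg <- sprintf("\033[4%dm", 1:7)

# "\033[m" resets all formatting so colors do not bleed into later output.
samples <- c(
  paste0(fg, "foreground ", 1:7, "\033[m"),
  paste0(bg, "background ", 1:7, "\033[m")
)
writeLines(samples)
```

On a display without support for these sequences, the raw escape characters are printed instead of colors.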

Whether the fansi functions behave as expected depends on many factors, including how your particular display handles Control Sequences. See ?fansi for details, particularly if you are getting unexpected results.

Manipulation of Formatted Strings

ANSI control characters and sequences (Control Sequences hereafter) break the relationship between byte/character position in a string and display position. For example, to extract the “ANS” part of our colored “FANSI”, we would need to carefully compute the character positions:
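A minimal sketch of that bookkeeping with base substr, assuming the fansi string defined earlier. The positions are counted by hand, and ans.base is an illustrative name.

```r
fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# Each "\033[NNm" sequence occupies 5 characters (ESC, "[", two digits, "m"),
# so "F" is character 11, "A" is 17, "N" is 23, and "S" is 29.
# Starting at 12 keeps the background sequence that precedes "A".
ans.base <- substr(fansi, 12, 29)
writeLines(ans.base)
```

Note that the extracted piece carries neither the leading foreground sequence nor a closing reset.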

With fansi we can select directly based on display position:
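A minimal sketch of the display-position approach, assuming the fansi package is installed and using the same string as before. The name ans.fansi is illustrative, and the exact sequences in the result may vary between fansi versions.

```r
library(fansi)

fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# Positions 2 through 4 refer to displayed characters, not raw characters.
ans.fansi <- substr_ctl(fansi, 2, 4)
writeLines(ans.fansi)

# Removing the Control Sequences leaves only the selected letters.
strip_ctl(ans.fansi)
```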

If you look closely you’ll notice that the text color for the substr version is wrong, as the naïve string extraction loses the initial "\033[30m" that sets the foreground color. Additionally, the color from the last letter bleeds out into the next line.

fansi Functions

fansi provides counterparts to the following string functions:

  • substr (and substr<-)
  • strsplit
  • strtrim
  • strwrap
  • nchar / nzchar
  • trimws

These are drop-in replacements that behave (almost) identically to the base counterparts, except for the Control Sequence awareness. There are also utility functions such as strip_ctl to remove Control Sequences and has_ctl to detect whether strings contain them.
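The sketch below contrasts a few of these counterparts with their base equivalents, assuming the fansi package is installed. The variable names (plain, trimmed, pieces) are illustrative only.

```r
library(fansi)

fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

nchar(fansi)       # base: counts the Control Sequence characters too
nchar_ctl(fansi)   # fansi: counts only the displayed characters

has_ctl(fansi)                 # does the string contain Control Sequences?
plain <- strip_ctl(fansi)      # the string with Control Sequences removed

# Trim to three displayed characters, then split on a displayed character.
trimmed <- strtrim_ctl(fansi, 3)
pieces <- strsplit_ctl(fansi, "N")[[1]]
writeLines(c(trimmed, pieces))
```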

Much of fansi is written in C, so the fansi functions should be only slightly slower than the corresponding base functions; the exception is strwrap_ctl, which is much faster than base strwrap. Operations involving type = "width" will be slower still. We have prioritized convenience and safety over raw speed in the C code, but unless your code is primarily engaged in string manipulation, fansi should be fast enough to avoid attention in benchmarking traces.
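If you want to check the timing on your own machine, a rough sketch along these lines may help. It assumes the fansi package is installed; the test text and the names txt, t.base, and t.fansi are made up for illustration, and results will vary with hardware, R version, and input.

```r
library(fansi)

# Hypothetical input: 200 long paragraphs of plain text.
txt <- rep(paste(rep("lorem ipsum dolor sit amet", 40), collapse = " "), 200)

# Time base strwrap against its fansi counterpart on the same input.
t.base  <- system.time(res.base  <- strwrap(txt, 60))
t.fansi <- system.time(res.fansi <- strwrap_ctl(txt, 60))

# Elapsed seconds for each version.
c(base = t.base[["elapsed"]], fansi = t.fansi[["elapsed"]])
```

A single system.time call is noisy, so repeat the measurement a few times before drawing conclusions.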

Width Based Substrings

fansi also includes improved versions of some of those functions, such as substr2_ctl, which allows for width-based substrings. To illustrate, let’s create an emoji string made up of two-wide characters:

pizza.grin <- sprintf("\033[46m%s\033[m", strrep("\U1F355\U1F600", 10))

And a colorful background made up of one-wide characters:

raw <- paste0("\033[45m", strrep("FANSI", 40))
wrapped <- strwrap2_ctl(raw, 41, wrap.always=TRUE)

When we inject the 2-wide emoji into the 1-wide background their widths are accounted for as shown by the result remaining rectangular:

starts <- c(18, 13, 8, 13, 18)
ends <-   c(23, 28, 33, 28, 23)
substr2_ctl(wrapped, type='width', starts, ends) <- pizza.grin
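To check the rectangular result numerically rather than by eye, the display width of each line can be compared before and after the injection. This is a minimal sketch that repeats the steps above so it runs on its own; it assumes the fansi package is installed and a UTF-8 session, and width.before and width.after are illustrative names.

```r
library(fansi)

pizza.grin <- sprintf("\033[46m%s\033[m", strrep("\U1F355\U1F600", 10))
raw <- paste0("\033[45m", strrep("FANSI", 40))
wrapped <- strwrap2_ctl(raw, 41, wrap.always = TRUE)

# Display width of each background line before the emoji are injected.
width.before <- nchar_ctl(wrapped, type = "width")

starts <- c(18, 13, 8, 13, 18)
ends <-   c(23, 28, 33, 28, 23)
substr2_ctl(wrapped, type = "width", starts, ends) <- pizza.grin

# If widths are accounted for, every line should still have the same width.
width.after <- nchar_ctl(wrapped, type = "width")
rbind(width.before, width.after)

writeLines(wrapped)
```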

fansi width calculations use heuristics to account for graphemes, including combining emoji:

emo <- c(
  "\U1F468",
  "\U1F468\U1F3FD",
  "\U1F468\U1F3FD\u200D\U1F9B3",
  "\U1F468\u200D\U1F469\u200D\U1F467\u200D\U1F466"
)
writeLines(
  paste(
    emo,
    paste("base:", nchar(emo, type='width')),
    paste("fansi:", nchar_ctl(emo, type='width'))
) )

Install

install.packages('fansi')

  • Version: 1.0.6
  • License: GPL-2 | GPL-3
  • Maintainer: Brodie Gaslam
  • Last Published: December 8th, 2023
  • Monthly Downloads: 1,214,426

Functions in fansi (1.0.6)

  • nchar_sgr: Control Sequence Aware Version of nchar
  • sgr_256: Show 8 Bit CSI SGR Colors
  • nchar_ctl: Control Sequence Aware Version of nchar
  • strip_ctl: Strip Control Sequences
  • normalize_state: Normalize CSI and OSC Sequences
  • state_at_end: Utilities for Managing CSI and OSC State In Strings
  • set_knit_hooks: Set an Output Hook Convert Control Sequences to HTML in Rmarkdown
  • strsplit_ctl: Control Sequence Aware Version of strsplit
  • strip_sgr: Strip Control Sequences
  • sgr_to_html: Convert Control Sequences to HTML Equivalents
  • tabs_as_spaces: Replace Tabs With Spaces
  • strtrim_ctl: Control Sequence Aware Version of strtrim
  • strwrap_ctl: Control Sequence Aware Version of strwrap
  • substr_sgr: SGR Control Sequence Aware Version of substr
  • strwrap_sgr: Control Sequence Aware Version of strwrap
  • strtrim_sgr: Control Sequence Aware Version of strtrim
  • strsplit_sgr: Check for Presence of Control Sequences
  • substr_ctl: Control Sequence Aware Version of substr
  • term_cap_test: Test Terminal Capabilities
  • to_html: Convert Control Sequences to HTML Equivalents
  • trimws_ctl: Control Sequence Aware Version of trimws
  • unhandled_ctl: Identify Unhandled Control Sequences
  • fansi_lines: Colorize Character Vectors
  • html_code_block: Format Character Vector for Display as Code in HTML
  • html_esc: Escape Characters With Special HTML Meaning
  • fansi: Details About Manipulation of Strings Containing Control Sequences
  • dflt_term_cap: Default Arg Helper Funs
  • has_sgr: Check for Presence of Control Sequences
  • make_styles: Generate CSS Mapping Classes to Colors
  • has_ctl: Check for Presence of Control Sequences
  • in_html: Frame HTML in a Web Page And Display
  • fwl: Display Strings to Terminal