fansi (version 0.4.0)

nchar_ctl: ANSI Control Sequence Aware Version of nchar

Description

nchar_ctl counts all characters that are not part of a Control Sequence. nzchar_ctl returns TRUE for each input vector element that contains at least one such character. By default newlines and other C0 control characters are not counted.

Usage

nchar_ctl(x, type = "chars", allowNA = FALSE, keepNA = NA,
  ctl = "all", warn = getOption("fansi.warn"), strip)

nchar_sgr(x, type = "chars", allowNA = FALSE, keepNA = NA,
  warn = getOption("fansi.warn"))

nzchar_ctl(x, keepNA = NA, ctl = "all", warn = getOption("fansi.warn"))

nzchar_sgr(x, keepNA = NA, warn = getOption("fansi.warn"))

Arguments

x

a character vector or object that can be coerced to character.

type

character string, one of "chars" or "width". For byte counts use base::nchar.

allowNA

logical: should NA be returned for invalid multibyte strings or "bytes"-encoded strings (rather than throwing an error)?

keepNA

logical: should NA be returned wherever x is NA? If FALSE, nchar() returns 2, as that is the number of printing characters used when strings are written to output, and nzchar() is TRUE. The default for nchar(), NA, means to use keepNA = TRUE unless type is "width". This used to be (implicitly) hard-coded to FALSE in R versions ≤ 3.2.0.

ctl

character, which Control Sequences should be treated specially. See the "_ctl vs. _sgr" section for details.

  • "nl": newlines.

  • "c0": all other "C0" control characters (i.e. 0x01-0x1F, 0x7F), except for newlines and the actual ESC (0x1B) character.

  • "sgr": ANSI CSI SGR sequences.

  • "csi": all non-SGR ANSI CSI sequences.

  • "esc": all other escape sequences.

  • "all": all of the above, except when used in combination with any of the above, in which case it means "all but".

warn

TRUE (default) or FALSE, whether to warn when potentially problematic Control Sequences are encountered. These could cause the assumptions fansi makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see fansi).

strip

deprecated in favor of ctl.

_ctl vs. _sgr

The *_ctl versions of the functions treat all Control Sequences specially by default. Special treatment is context dependent, and may include detecting them and/or computing their display/character width as zero. For the SGR subset of the ANSI CSI sequences, fansi will also parse, interpret, and reapply the text styles they encode if needed. You can modify whether a Control Sequence is treated specially with the ctl parameter. You can exclude a type of Control Sequence from special treatment by combining "all" with that type of sequence (e.g. ctl=c("all", "nl") for special treatment of all Control Sequences but newlines). The *_sgr versions only treat ANSI CSI SGR sequences specially, and are equivalent to the *_ctl versions with the ctl parameter set to "sgr".
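
The effect of the ctl parameter can be illustrated with a short sketch. It assumes the fansi package (0.4.0 or later) is installed; the input string x is a made-up example, not taken from the package.

```r
# Assumes fansi >= 0.4.0 is installed; `x` is an illustrative input.
library(fansi)

x <- "\033[31mab\n"   # SGR "red" sequence, two letters, one newline

# Default: every Control Sequence, including the newline, is ignored.
nchar_ctl(x)

# "all" combined with "nl" means "all but newlines", so the newline
# is counted along with the two letters.
nchar_ctl(x, ctl = c("all", "nl"))

# Only SGR sequences are treated specially; both calls should agree.
nchar_ctl(x, ctl = "sgr")
nchar_sgr(x)
```

With these inputs the first call should count only the two letters, while the remaining calls also count the newline.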

Details

nchar_ctl is just a wrapper around nchar(strip_ctl(...)). nzchar_ctl is implemented in native code and is much faster than the otherwise equivalent nzchar(strip_ctl(...)).

These functions will warn if either malformed or non-CSI escape sequences are encountered, as these may be incorrectly interpreted.
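
The wrapper relationship described in this section can be checked directly. The sketch below assumes fansi (0.4.0 or later) is installed and uses an illustrative input vector that contains only SGR sequences and C0 characters, so no warnings are expected.

```r
# Assumes fansi >= 0.4.0 is installed; `x` is an illustrative input.
library(fansi)

x <- c("\033[1mbold\033[0m", "\t\n", "")

# nchar_ctl should match base nchar applied to the stripped strings.
identical(nchar_ctl(x), nchar(strip_ctl(x)))

# nzchar_ctl should give the same answer as the slower two-step form.
all(nzchar_ctl(x) == nzchar(strip_ctl(x)))
```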

See Also

fansi for details on how Control Sequences are interpreted, particularly if you are getting unexpected results; strip_ctl for removing Control Sequences.

Examples

# NOT RUN {
nchar_ctl("\033[31m123\a\r")
## with some wide characters
cn.string <-  sprintf("\033[31m%s\a\r", "\u4E00\u4E01\u4E03")
nchar_ctl(cn.string)
nchar_ctl(cn.string, type='width')

## Remember newlines are not counted by default
nchar_ctl("\t\n\r")

## The 'c0' value for the `ctl` argument does
## not include newlines.
nchar_ctl("\t\n\r", ctl="c0")
nchar_ctl("\t\n\r", ctl=c("c0", "nl"))

## The _sgr flavor only treats SGR sequences as zero width

nchar_sgr("\033[31m123")
nchar_sgr("\t\n\n123")

## All of the following are Control Sequences
nzchar_ctl("\n\033[42;31m\033[123P\a")
# }
