Wraps strings to a specified width accounting for Control Sequences.
strwrap_ctl is intended to emulate strwrap closely except with respect to
the Control Sequences (see details for other minor differences), while
strwrap2_ctl adds features and changes the processing of whitespace.
strwrap_ctl is faster than strwrap.
strwrap_ctl(
x,
width = 0.9 * getOption("width"),
indent = 0,
exdent = 0,
prefix = "",
simplify = TRUE,
initial = prefix,
warn = getOption("fansi.warn", TRUE),
term.cap = getOption("fansi.term.cap", dflt_term_cap()),
ctl = "all",
normalize = getOption("fansi.normalize", FALSE),
carry = getOption("fansi.carry", FALSE),
terminate = getOption("fansi.terminate", TRUE)
)

strwrap2_ctl(
x,
width = 0.9 * getOption("width"),
indent = 0,
exdent = 0,
prefix = "",
simplify = TRUE,
initial = prefix,
wrap.always = FALSE,
pad.end = "",
strip.spaces = !tabs.as.spaces,
tabs.as.spaces = getOption("fansi.tabs.as.spaces", FALSE),
tab.stops = getOption("fansi.tab.stops", 8L),
warn = getOption("fansi.warn", TRUE),
term.cap = getOption("fansi.term.cap", dflt_term_cap()),
ctl = "all",
normalize = getOption("fansi.normalize", FALSE),
carry = getOption("fansi.carry", FALSE),
terminate = getOption("fansi.terminate", TRUE)
)
A character vector, or a list of character vectors if simplify is FALSE.
a character vector, or an object which can be converted to a
character vector by as.character.
a positive integer giving the target column for wrapping lines in the output.
a non-negative integer giving the indentation of the first line in a paragraph.
a non-negative integer specifying the indentation of subsequent lines in paragraphs.
a character string to be used as prefix for each line except the first,
for which initial is used.
a logical. If TRUE, the result is a single character vector of line text;
otherwise, it is a list of the same length as x, the elements of which are
character vectors of line text obtained from the corresponding element of x.
(Hence, the result in the former case is obtained by unlisting that
of the latter.)
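As a minimal sketch of the two return shapes described above (the input strings are made up for illustration, and the snippet assumes fansi is installed):

```r
library(fansi)

# Two illustrative inputs; the first contains SGR colour codes
x <- c("hello \033[31mred\033[39m world", "short")

flat   <- strwrap_ctl(x, 10)                    # one character vector
nested <- strwrap_ctl(x, 10, simplify = FALSE)  # list, one element per input

is.character(flat)
length(nested) == length(x)
# Unlisting the list form should give back the simplified form
identical(flat, unlist(nested))
```

The list form is convenient when you need to know which output lines came from which input element.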
TRUE (default) or FALSE, whether to warn when potentially
problematic Control Sequences are encountered. These could cause the
assumptions fansi makes about how strings are rendered on your display
to be incorrect, for example by moving the cursor (see ?fansi).
At most one warning will be issued per element in each input vector. Will
also warn about some badly encoded UTF-8 strings, but a lack of UTF-8
warnings is not a guarantee of correct encoding (use validUTF8 for that).
character, a vector of the capabilities of the terminal; can
be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes
starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with
"38;2" or "48;2"), and "all". "all" behaves as it does for the ctl
parameter: "all" combined with any other value means all terminal
capabilities except that one. fansi will warn if it encounters SGR codes
that exceed the terminal capabilities specified (see term_cap_test
for details). In versions prior to 1.0, fansi would also skip exceeding
SGRs entirely instead of interpreting them. You may add the string "old"
to any otherwise valid term.cap spec to restore the pre-1.0 behavior.
"old" will not interact with "all" the way other valid values for this
parameter do.
character, which Control Sequences should be treated
specially. Special treatment is context dependent, and may include
detecting them and/or computing their display/character width as zero. For
the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, fansi
will also parse, interpret, and reapply the sequences as needed. You can
modify whether a Control Sequence is treated specially with the ctl
parameter.
"nl": newlines.
"c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except for newlines and the actual ESC (0x1B) character.
"sgr": ANSI CSI SGR sequences.
"csi": all non-SGR ANSI CSI sequences.
"url": OSC hyperlinks
"osc": all non-OSC-hyperlink OSC sequences.
"esc": all other escape sequences.
"all": all of the above, except when used in combination with any of the above, in which case it means "all but".
TRUE or FALSE (default), whether SGR sequences should be
normalized such that there is one distinct sequence for each SGR code.
Normalized strings will occupy more space (e.g. "\033[31;42m" becomes
"\033[31m\033[42m"), but will work better with code that assumes each SGR
code will be in its own escape, as crayon does.
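A short sketch of what normalization does, using the compound sequence from the description above. The input text is illustrative; normalize_state is the standalone fansi function that applies the same treatment.

```r
library(fansi)

x <- "\033[31;42mred on green\033[0m"   # one compound SGR sequence

norm <- normalize_state(x)
# The compound opening sequence is split into one sequence per SGR code
startsWith(norm, "\033[31m\033[42m")

# The wrapping functions apply the same treatment with normalize = TRUE
wrapped <- strwrap_ctl(x, 20, normalize = TRUE)
# The visible text is unchanged
identical(strip_ctl(wrapped), "red on green")
```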
TRUE, FALSE (default), or a scalar string; controls whether to
interpret the character vector as a "single document" (TRUE or string) or
as independent elements (FALSE). In "single document" mode, active state
at the end of an input element is considered active at the beginning of the
next vector element, simulating what happens with a document with active
state at the end of a line. If FALSE, each vector element is interpreted as
if there were no active state when it begins. If character, then the
active state at the end of the carry string is carried into the first
element of x (see "Replacement Functions" for differences there). The
carried state is injected in the interstice between an imaginary zeroth
character and the first character of a vector element. See the "Position
Semantics" section of substr_ctl and the "State Interactions" section
of ?fansi for details. Except for strwrap_ctl, where NA is
treated as the string "NA", carry will cause NAs in inputs to
propagate through the remaining vector elements.
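The difference between the two modes can be sketched as follows, assuming an illustrative two-element input whose first element opens a colour and never closes it:

```r
library(fansi)

# The first element opens red text and never closes it
x <- c("\033[31mred", "next element")

indep <- strwrap_ctl(x, 20, carry = FALSE)  # elements treated independently
doc   <- strwrap_ctl(x, 20, carry = TRUE)   # "single document" mode

# Without carry the second element contains no escape sequences
grepl("\033", indep[2], fixed = TRUE)
# With carry the red state is re-applied at the start of the second element
startsWith(doc[2], "\033[31m")
```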
TRUE (default) or FALSE, whether substrings should have
active state closed to avoid it bleeding into other strings they may be
prepended onto. This does not stop state from carrying if carry = TRUE.
See the "State Interactions" section of ?fansi for details.
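A minimal sketch of the effect of terminate, using an illustrative string that leaves red active at its end:

```r
library(fansi)

x <- "\033[31mred text"   # active red state is never closed in the input

closed <- strwrap_ctl(x, 20, terminate = TRUE)
open   <- strwrap_ctl(x, 20, terminate = FALSE)

# Same visible text either way
identical(strip_ctl(closed), strip_ctl(open))
# The terminated version carries an extra closing sequence at the end
nchar(closed) > nchar(open)
```

Leaving state open is mostly useful when the pieces will be pasted back together and the state is meant to continue.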
TRUE or FALSE (default), whether to hard wrap at requested
width if no word breaks are detected within a line. If set to TRUE then
width must be at least 2.
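As an illustration of hard wrapping, here is a sketch with a made-up single word that is wider than the target width:

```r
library(fansi)

long <- "abcdefghijklmnop"   # a single word wider than the target width

# Default: no word break is available within the word
strwrap2_ctl(long, 5)

# wrap.always = TRUE hard-breaks the word so no line exceeds the width
hard <- strwrap2_ctl(long, 5, wrap.always = TRUE)
all(nchar(hard) <= 5)
# No characters are lost: the pieces join back into the original word
identical(paste(hard, collapse = ""), long)
```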
character(1L), a single character to use as padding at the
end of each line until the line is width wide. This must be a printable
ASCII character or an empty string (default). If you set it to an empty
string the line remains unpadded.
TRUE (default) or FALSE. If TRUE, extraneous whitespace (spaces, newlines, tabs) is removed in the same way as base::strwrap does. When FALSE, whitespace is preserved, except for newlines, as those are implicit boundaries between output vector elements.
FALSE (default) or TRUE, whether to convert tabs to
spaces. This can only be set to TRUE if strip.spaces is FALSE.
integer(1:n) indicating position of tab stops to use when converting tabs to spaces. If there are more tabs in a line than defined tab stops the last tab stop is re-used. For the purposes of applying tab stops, each input line is considered a line and the character count begins from the beginning of the input line.
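A small sketch of tab expansion with a single tab stop value. The input is illustrative, and only scalar tab.stops values are used here to keep the arithmetic simple.

```r
library(fansi)

x <- "a\tb"

# With 8-column tab stops the tab pads "a" out to column 8,
# so "b" lands in column 9
s8 <- strwrap2_ctl(x, 80, tabs.as.spaces = TRUE, tab.stops = 8L)

# A narrower tab stop yields a shorter result
s4 <- strwrap2_ctl(x, 80, tabs.as.spaces = TRUE, tab.stops = 4L)

nchar(s8)
nchar(s4)
```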
Control Sequences are non-printing characters or sequences of characters.
Special Sequences are a subset of the Control Sequences, and include CSI
SGR sequences which can be used to change rendered appearance of text, and
OSC hyperlinks. See fansi for details.
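Which Control Sequences get special treatment is governed by the ctl parameter. The sketch below uses strip_ctl, another fansi function that accepts the same ctl values, because its effect is easy to see. The input string is illustrative.

```r
library(fansi)

x <- "\033[31mred\033[0m text"   # illustrative input with SGR colour codes

# Treat only SGR sequences specially: they are stripped, the text stays
only_sgr <- strip_ctl(x, ctl = "sgr")
only_sgr

# "all" combined with "sgr" means everything except SGR,
# so the SGR sequences should be left in place here
kept <- strip_ctl(x, ctl = c("all", "sgr"))
grepl("\033[31m", kept, fixed = TRUE)

# The wrapping functions accept the same values
strwrap_ctl(x, 12, ctl = "sgr")
```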
fansi approximates grapheme widths and counts by using heuristics for
grapheme breaks that work for most common graphemes, including emoji
combining sequences. The heuristic is known to work incorrectly with
invalid combining sequences, prepending marks, and sequence interruptors.
fansi does not provide a full implementation of grapheme break detection to
avoid carrying a copy of the Unicode grapheme breaks table, and also because
the hope is that R will add the feature eventually itself.
The utf8 package provides a conforming grapheme parsing implementation.
Several factors could affect the exact output produced by fansi
functions across versions of fansi, R, and/or across systems.
In general it is best not to rely on exact fansi output, e.g. by
embedding it in tests.
Width and grapheme calculations depend on locale, Unicode database
version, and grapheme processing logic (which is still in development), among
other things. For the most part fansi (currently) uses the internals of
base::nchar(type='width'), but there are exceptions and this may change in
the future.
How a particular display format is encoded in Control Sequences is
not guaranteed to be stable across fansi versions. Additionally, which
Special Sequences are re-encoded vs transcribed untouched may change.
In general we will strive to keep the rendered appearance stable.
To maximize the odds of getting stable output set normalize to TRUE
and type to "chars" in functions that allow it, and
set term.cap to a specific set of capabilities.
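In the spirit of the advice above, one way to check wrapped output without depending on the exact escape codes is to compare the visible text only. This is a sketch, not a prescribed testing approach; the term.cap values chosen are just an example of a specific capability set.

```r
library(fansi)

x <- "hello \033[41mred\033[49m world"
wrapped <- strwrap_ctl(
  x, 12, normalize = TRUE, term.cap = c("bright", "256")
)

# Compare visible text and line breaks rather than raw escape sequences
identical(strip_ctl(wrapped), strwrap(strip_ctl(x), 12))
```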
fansi is unaware of text directionality and operates as if all strings are
left to right (LTR). Using fansi functions with strings that contain mixed
direction scripts (i.e. both LTR and RTL) may produce undesirable results.
strwrap2_ctl can convert tabs to spaces, pad strings up to width, and
hard-break words if single words are wider than width.
Unlike base::strwrap, both these functions will translate any non-ASCII
strings to UTF-8 and return them in UTF-8. Additionally, invalid UTF-8
always causes errors, and prefix and indent must be scalar.
When replacing tabs with spaces the tabs are computed relative to the
beginning of the input line, not the most recent wrap point.
Additionally, indent, exdent, initial, and prefix will be ignored when
computing tab positions.
?fansi for details on how Control Sequences are
interpreted, particularly if you are getting unexpected results,
normalize_state for more details on what the normalize parameter does,
state_at_end to compute active state at the end of strings,
close_state to compute the sequence required to close active state.
hello.1 <- "hello \033[41mred\033[49m world"
hello.2 <- "hello\t\033[41mred\033[49m\tworld"
strwrap_ctl(hello.1, 12)
strwrap_ctl(hello.2, 12)
## In default mode strwrap2_ctl is the same as strwrap_ctl
strwrap2_ctl(hello.2, 12)
## But you can leave whitespace unchanged; `warn` is
## set to FALSE as otherwise the tabs cause a warning
strwrap2_ctl(hello.2, 12, strip.spaces=FALSE, warn=FALSE)
## And convert tabs to spaces
strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE)
## If your display has 8 wide tab stops the following two
## outputs should look the same
writeLines(strwrap2_ctl(hello.2, 80, tabs.as.spaces=TRUE))
writeLines(hello.2)
## tab stops are NOT auto-detected, but you may provide
## your own
strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE, tab.stops=c(6, 12))
## You can also force padding at the end to equal width
writeLines(strwrap2_ctl("hello how are you today", 10, pad.end="."))
## And a more involved example where we read the
## NEWS file, color it line by line, wrap it to
## 25 width and display some of it in 3 columns
## (works best on displays that support 256 color
## SGR sequences)
NEWS <- readLines(file.path(R.home('doc'), 'NEWS'))
NEWS.C <- fansi_lines(NEWS, step=2) # color each line
W <- strwrap2_ctl(NEWS.C, 25, pad.end=" ", wrap.always=TRUE)
writeLines(c("", paste(W[1:20], W[100:120], W[200:220]), ""))