nchar

Count the Number of Characters (or Bytes or Width)

nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x.

nzchar is a fast way to find out if elements of a character vector are non-empty strings.

Keywords
character
Usage
nchar(x, type = "chars", allowNA = FALSE)
nzchar(x)
Arguments
x
character vector, or a vector to be coerced to a character vector. Giving a factor is an error.
type
character string: partial matching to one of c("bytes", "chars", "width"). See ‘Details’.
allowNA
logical: should NA be returned for invalid multibyte strings or "bytes"-encoded strings (rather than throwing an error)?
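The arguments above can be illustrated with a minimal sketch. The variable names are illustrative only, and the factor case is wrapped in tryCatch because the argument description states that it is an error.

```r
# A non-character vector is coerced to character before counting.
n_int <- nchar(12345L)                # "12345" has 5 characters

# 'type' is partially matched, so "b" selects "bytes".
n_b <- nchar("hello", type = "b")     # 5 bytes for an ASCII string

# A factor is rejected rather than coerced.
fac_result <- tryCatch(nchar(factor("abc")), error = function(e) e)
is_error <- inherits(fac_result, "error")
```

To measure the labels of a factor, converting it explicitly with as.character first is one option.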
Details

The ‘size’ of a character string can be measured in one of three ways:

bytes
The number of bytes needed to store the string (plus in C a final terminator which is not counted).

chars
The number of human-readable characters.

width
The number of columns cat will use to print the string in a monospaced font. The same as chars if this cannot be calculated.

These will often be the same, and almost always will be in single-byte locales. There will be differences between the first two with multibyte character sequences, e.g. in UTF-8 locales.
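As a small sketch of the difference between the three measures, the following compares an ASCII string with one containing a single non-ASCII character. The call to enc2utf8 is an assumption made here so that the byte count does not depend on the session's locale.

```r
# "\u00e9" is e-acute: one character, stored as two bytes in UTF-8.
s <- enc2utf8("caf\u00e9")
n_chars <- nchar(s, type = "chars")   # 4
n_bytes <- nchar(s, type = "bytes")   # 5
n_width <- nchar(s, type = "width")   # display columns

# For a plain ASCII string all three measures agree.
ascii <- "cafe"
ascii_sizes <- c(nchar(ascii, "chars"),
                 nchar(ascii, "bytes"),
                 nchar(ascii, "width"))
```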

The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch). If you want to operate on non-vector objects, you will need to pass them through deparse first.
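A brief sketch of the deparse route for a non-vector object follows. The function f is a hypothetical example, and the exact line lengths depend on how deparse formats the source, so no specific sizes are shown.

```r
# A closure is not a vector, so nchar() cannot coerce it directly.
f <- function(x) x + 1
direct <- tryCatch(nchar(f), error = function(e) e)

# deparse() turns the object into a character vector, one element per line.
src <- deparse(f)
sizes <- nchar(src)   # one size per deparsed line
```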

Value

For nchar, an integer vector giving the sizes of each element, currently always 2 for missing values (for NA).

If allowNA = TRUE and an element is invalid in a multi-byte character set such as UTF-8, its number of characters and the width will be NA. Otherwise the number of characters will be non-negative, so !is.na(nchar(x, "chars", TRUE)) is a test of validity.

A character string marked with "bytes" encoding (see Encoding) has a number of bytes, but neither a known number of characters nor a width, so the latter two types are NA if allowNA = TRUE, otherwise an error.

Names, dims and dimnames are copied from the input.

For nzchar, a logical vector of the same length as x, true if and only if the element has non-zero length.
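The return values described above can be sketched as follows. The invalid string is built from raw bytes and then marked as UTF-8 by hand; this construction is an assumption chosen for illustration, not something the help page prescribes.

```r
# Byte 0xff never occurs in valid UTF-8, so this string is invalid once marked.
bad <- rawToChar(as.raw(c(0x61, 0xff)))
Encoding(bad) <- "UTF-8"
bad_bytes <- nchar(bad, type = "bytes")                   # 2
bad_chars <- nchar(bad, type = "chars", allowNA = TRUE)   # NA
bad_error <- tryCatch(nchar(bad, type = "chars"), error = function(e) e)

# A string marked as "bytes" has a byte count but no character count.
raw_str <- enc2utf8("caf\u00e9")
Encoding(raw_str) <- "bytes"
raw_bytes <- nchar(raw_str, type = "bytes")                   # 5
raw_chars <- nchar(raw_str, type = "chars", allowNA = TRUE)   # NA

# dims and dimnames of the input carry over to the result.
m <- matrix(c("a", "bb", "ccc", "dddd"), nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))
m_sizes <- nchar(m)

# nzchar() flags the non-empty strings.
non_empty <- nzchar(c("a", "", "abc"))   # TRUE FALSE TRUE
```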

Note

This does not by default give the number of characters that will be used to print() the string. Use encodeString to find the characters used to print the string.

On Windows, this is particularly important when \uxxxx sequences have been used to enter Unicode characters not representable in the current encoding. Thus nchar("\u2642") is 1, and it is printed in Rgui as one character, but it will be printed in Rterm as <U+2642>, which is what encodeString gives.

On Unix-alikes, where character strings have been marked as UTF-8, the number of characters and widths will be computed in UTF-8, even though printing may use escapes such as <U+2642> in a non-UTF-8 locale.
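The gap between the stored size and the printed size can be sketched with a tab character, which print() shows as a two-character escape. A tab is used here as an assumption to keep the result independent of the locale; the Unicode cases in the note behave analogously but depend on the platform.

```r
s <- "a\tb"
stored <- nchar(s)                                  # 3: a, tab, b
shown <- encodeString(s)                            # the escaped form a\tb
printed <- nchar(shown)                             # 4: a, backslash, t, b
printed_quoted <- nchar(encodeString(s, quote = '"'))   # 6, including the quotes
```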

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

strwidth giving width of strings for plotting; paste, substr, strsplit

Aliases
  • nchar
  • nzchar
Examples
library(base)
x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
nchar(x)
# 5 6 6 1 15
nchar(deparse(mean))
# 18 17
Documentation reproduced from package base, version 3.2.0, License: Part of R 3.2.0
