nchar: Count the Number of Characters (or Bytes or Width)

Description

nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x.

nzchar is a fast way to find out if elements of a character vector are non-empty strings.

Usage

nchar(x, type = "chars", allowNA = FALSE)
nzchar(x)

Arguments

x

character vector, or a vector to be coerced to a character vector. Giving a factor is an error.

type

character string: partial matching to one of c("bytes", "chars", "width"). See ‘Details’.

allowNA

logical: should NA be returned for invalid multibyte strings or "bytes"-encoded strings (rather than throwing an error)?

Value

For nchar, an integer vector giving the sizes of each element, currently always 2 for missing values (for NA).

If allowNA = TRUE and an element is invalid in a multi-byte character set such as UTF-8, its number of characters and the width will be NA. Otherwise the number of characters will be non-negative, so !is.na(nchar(x, "chars", TRUE)) is a test of validity.

A character string marked with "bytes" encoding (see Encoding) has a number of bytes, but neither a known number of characters nor a width, so the latter two types are NA if allowNA = TRUE, otherwise an error.

Names, dims and dimnames are copied from the input.

For nzchar, a logical vector of the same length as x, true if and only if the element has non-zero length.
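
The validity test mentioned above can be sketched as follows. The input vector is hypothetical: its second element holds two bytes that cannot form valid UTF-8, and it is explicitly marked as UTF-8 so that the result should not depend on the session locale.

```r
# Hypothetical input: a valid string and a string of bytes that are not valid UTF-8.
x <- c("abc", "\xff\xfe")
Encoding(x) <- "UTF-8"   # declare the non-ASCII element to be UTF-8

# With allowNA = TRUE an invalid element gives NA instead of an error.
n <- nchar(x, type = "chars", allowNA = TRUE)

# Non-NA counts identify the valid elements.
is_valid <- !is.na(n)
is_valid
# TRUE FALSE

# The byte count does not require a valid encoding.
nchar(x, type = "bytes")
# 3 2
```

With the default allowNA = FALSE, the same call with type = "chars" would stop with an error at the invalid element.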

Details

The ‘size’ of a character string can be measured in one of three ways:

bytes: The number of bytes needed to store the string (plus in C a final terminator which is not counted).
chars: The number of human-readable characters.
width: The number of columns cat will use to print the string in a monospaced font. The same as chars if this cannot be calculated.

These will often be the same, and almost always will be in single-byte locales. There will be differences between the first two with multibyte character sequences, e.g. in UTF-8 locales.
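
A minimal sketch of the difference between the first two measures, using an illustrative string (not from the original page) written with a \u escape so that it is stored as UTF-8:

```r
# "café": the last character needs two bytes in UTF-8.
s <- "caf\u00e9"

n_chars <- nchar(s, type = "chars")
n_bytes <- nchar(s, type = "bytes")
n_chars
# 4
n_bytes
# 5

# type is partially matched, so an abbreviation selects the same measure.
nchar(s, type = "b")
# 5

# For a pure ASCII string the two measures agree.
nchar("cafe", type = "chars") == nchar("cafe", type = "bytes")
# TRUE
```

The width is left out of the expected results here because it can depend on the platform and the characters involved.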

The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch). To operate on non-vector objects, pass them through deparse first.
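
The coercion behaviour can be illustrated with a short sketch. The inputs are arbitrary examples, and the factor case is wrapped in tryCatch only so that the error can be observed without stopping the script.

```r
# Atomic vectors are coerced as if by as.character before counting.
nchar(123456)
# 6
nchar(c(1.5, 10))    # counted as "1.5" and "10"
# 3 2
nchar(TRUE)          # counted as "TRUE"
# 4

# A factor is rejected rather than coerced.
res <- tryCatch(nchar(factor("abc")), error = function(e) e)
factor_is_error <- inherits(res, "error")
factor_is_error
# TRUE

# A non-vector object (here a call) goes through deparse first.
expr_size <- nchar(deparse(quote(a + b)))   # deparses to "a + b"
expr_size
# 5
```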

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

x <- c("asfef", "qwerty", "yuiop[", "b", "stuff.blah.yech")
nchar(x)
# 5  6  6  1 15

nchar(deparse(mean))
# 18 17
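
The examples above cover only nchar. As a complement, here is a small sketch of nzchar on an illustrative vector (not part of the original examples), including its use for dropping empty strings:

```r
y <- c("", "a", " ", "stuff")

# TRUE for every element with at least one character; a single space counts.
non_empty <- nzchar(y)
non_empty
# FALSE  TRUE  TRUE  TRUE

# Keep only the non-empty strings.
kept <- y[non_empty]
kept
# "a"     " "     "stuff"
```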