Return the indices of elements of x containing characters
that are not in standardCharacters.
grepNonStandardCharacters(x, value=FALSE,
standardCharacters=c(letters, LETTERS, ' ','.', ',', 0:9,
'"', "'", '-', '_', '(', ')', '[', ']', ''),
... )
- x
{
character vector in which to identify elements
containing characters not in standardCharacters.
}
- value
{
logical: TRUE to return the values found in x,
FALSE to return their indices.
}
- standardCharacters
{
Characters to treat as standard: an element of x is flagged
only if it contains a character not in standardCharacters.
}
- ...
{ optional arguments for regexpr }
1. x. <- strsplit(x, ''): convert the input character vector to a
list of character vectors, one per element of x, so that every
entry of x.[[i]] is a single character, for i in 1:length(x).
2. sapply(x., ...) to identify all elements for which any element of
x.[[i]] is not in standardCharacters.
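The two steps above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package source: the name sketchNonStandard is hypothetical, the default character set is shortened, and the ... arguments are omitted.

```r
# Hypothetical helper illustrating the two steps; not the package's own code.
sketchNonStandard <- function(x, value = FALSE,
    standardCharacters = c(letters, LETTERS, ' ', '.', ',', 0:9)) {
  # Step 1: split each string into a vector of single characters.
  x. <- strsplit(x, '')
  # Step 2: flag each element that has any character outside the set.
  bad <- vapply(x., function(chars) any(!(chars %in% standardCharacters)),
                logical(1))
  # Return the flagged values or their indices, depending on 'value'.
  if (value) x[bad] else which(bad)
}

sketchNonStandard(c('Raul', 'Ra`l'))        # indices
sketchNonStandard(c('Raul', 'Ra`l'), TRUE)  # values
```

vapply is used here in place of sapply only so that the result is always a logical vector, even when x has length zero.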
an integer vector identifying all elements of x containing a
character not in standardCharacters or, if value is TRUE, a
character vector of those elements.
stringi-package,
grep,
regexpr,
subNonStandardCharacters,
showNonASCII
Names <- c('Raul', 'Ra`l', 'Torres,Raul', 'Torres, Raul')
# confusion in character sets can create
# names like Names[2]
chk <- grepNonStandardCharacters(Names)
stopifnot(
all.equal(chk, 2)
)
chkv <- grepNonStandardCharacters(Names, TRUE)
stopifnot(
all.equal(chkv, 'Ra`l')
)
manip