Create or test for text objects.
as_text(x, ...)
is_text(x)
x: object to be coerced or tested.
...: further arguments passed to or from other methods.
as_text attempts to coerce its argument to text type; it strips all attributes except for names.
is_text returns TRUE or FALSE depending on whether its argument is of text type or not.
The text type is a new data type provided by the corpus package, suitable for processing Unicode text. Text vectors behave like character vectors (and can be converted to them with the as.character function). They can be created using the read_ndjson function or by converting another object using the as_text function.
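As a rough illustration of the round trip described above, the following sketch converts a character vector to text and back. It assumes a version of the corpus package that exports as_text and is_text under these names; the sample strings are made up for the example.

```r
library(corpus)  # assumption: this corpus version exports as_text() and is_text()

# Create a text vector from an ordinary character vector.
x <- as_text(c("first document", "second document"))
is_text(x)

# Convert back to a plain character vector with as.character().
y <- as.character(x)
is.character(y)
```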
The default behavior for as_text is to proceed as follows:
1. If x is a character vector, then we create a new text vector from x, preserving names(x) if they exist.
2. If is_text(x) is TRUE, then we drop all attributes from the object except for its names, and we set the object class to text.
3. Otherwise, if is.data.frame(x) is TRUE, then we look for a column to convert. First, we look for a column named "text". If none exists, we look for a column of type text. If we find such a column, then we call as_text on the found column and we set the object names to match x's row names. If there are no columns of type text, or if there are multiple columns of type text, none of which is named "text", then we fail with an error message.
4. Finally, if x is not a character vector, and if is_text(x) and is.data.frame(x) are both FALSE, then we try to use as.character on the object and then we convert the resulting character vector to text.
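The cases above can be sketched as follows. This is only an illustration under the assumption that the installed corpus version exports as_text and is_text; the data frame, its row names, and the sample values are hypothetical.

```r
library(corpus)  # assumption: this corpus version exports as_text() and is_text()

# Character vector: a new text vector is created and names are preserved.
a <- as_text(c(a = "goodnight", b = "moon"))
names(a)

# Data frame: the column named "text" is converted, and the result is
# named after the row names of the data frame (per the description above).
df <- data.frame(text = c("one fish", "two fish"),
                 row.names = c("r1", "r2"),
                 stringsAsFactors = FALSE)
b <- as_text(df)
names(b)

# Anything else: the object goes through as.character() first,
# and the resulting character vector is converted to text.
n <- as_text(c(1.5, 2))
is_text(n)
```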
This special handling for the names of the object is different from the other R conversion functions (as.numeric, as.character, etc.), which drop the names.
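A small comparison makes the difference in name handling concrete. The sketch assumes a corpus version that exports as_text; the named vector is invented for the example.

```r
library(corpus)  # assumption: this corpus version exports as_text()

x <- c(a = "1", b = "2")

# Base conversion functions strip the names attribute.
names(as.numeric(x))  # NULL

# as_text() keeps the names, as described above.
names(as_text(x))
```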
as_text and is_text are generic: you can write methods to handle specific classes of objects.
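One possible way to add such a method is sketched below using ordinary S3 dispatch. The "memo" class, its body field, and the new_memo constructor are hypothetical names introduced only for this illustration, and the sketch assumes the installed corpus version exports as_text as an S3 generic.

```r
library(corpus)  # assumption: as_text() is an exported S3 generic

# Hypothetical class "memo": a list holding a character `body` field.
new_memo <- function(body) {
    structure(list(body = body), class = "memo")
}

# Method picked up by the as_text() generic for objects of class "memo".
# It delegates to the default behavior on the underlying character vector.
as_text.memo <- function(x, ...) {
    as_text(x$body, ...)
}

m <- new_memo(c("call the office", "buy milk"))
txt <- as_text(m)
is_text(txt)
```

Delegating to as_text on the underlying character vector keeps the method short and reuses the default handling of names.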
as_text("hello, world!")
as_text(c(a="goodnight", b="moon")) # keeps names
is_text("hello") # FALSE, "hello" is character, not text