The text type is a new datatype provided by the corpus package,
suitable for processing Unicode text. Text vectors behave like
character vectors (and can be converted to them with the as.character
function). They can be created using the read_ndjson function or by
converting another object using the as_text
function.

The as_text function first gets the names of the object by calling
names(x); then it converts the object to type "text" and drops all of
the object's attributes. Finally, the function sets the converted
object's names to the original object's names. This special handling
of names differs from the other R conversion functions (as.numeric,
as.character, etc.), which drop the names.
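
The contrast with the base R conversion functions can be checked
directly. The sketch below uses only base R. keep_names_convert is a
hypothetical helper that mimics the three steps described above, with
as.character standing in for the conversion to type "text"; it is an
illustration, not the corpus implementation.

```r
x <- c(a = 1.5, b = 2, c = 3)

# Base R conversion functions strip attributes, including names.
names(as.character(x))   # NULL
names(as.numeric(x))     # NULL

# Hypothetical helper mimicking the steps described above:
# save the names, convert (dropping attributes), restore the names.
keep_names_convert <- function(x) {
  nm <- names(x)         # step 1: get the names
  y <- as.character(x)   # step 2: convert; attributes are dropped
  names(y) <- nm         # step 3: restore the original names
  y
}

names(keep_names_convert(x))   # "a" "b" "c"
```

Restoring the names after the conversion means element labels, such as
document identifiers, survive the change of type.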

as_text and is_text are generic: you can write methods to handle
specific classes of objects. The default behavior is to extract the
names from the object using the names function, then call
as.character on the object and convert the resulting character vector
to a text object.
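
The method-writing pattern can be sketched in base R, assuming the
generics use S3 dispatch. Every name below (to_text_sketch, its default
method, and the memo class) is hypothetical, and the default method
returns a plain character vector in place of a real text object. Under
that assumption, the analogous method for the package's generic would
be named as_text.memo.

```r
# Hypothetical generic standing in for as_text.
to_text_sketch <- function(x, ...) UseMethod("to_text_sketch")

# Default method: extract the names, call as.character, restore the names.
to_text_sketch.default <- function(x, ...) {
  nm <- names(x)
  y <- as.character(x)
  names(y) <- nm
  y
}

# Method for a hypothetical "memo" class: a list whose body field
# holds the text. It delegates to the generic on that field.
to_text_sketch.memo <- function(x, ...) {
  to_text_sketch(x$body, ...)
}

m <- structure(
  list(author = "ann", body = c(first = "Hello.", second = "Bye.")),
  class = "memo"
)

to_text_sketch(m)   # named character vector with names "first", "second"
```

Delegating to the generic rather than calling as.character directly
keeps the name handling in one place, so a class-specific method only
has to say where the text lives.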