quanteda (version 2.1.2)

spacyr-methods: Extensions for and from spacy_parse objects

Description

These functions provide quanteda methods for spacyr objects, and also extend spacy_parse and spacy_tokenize to work directly with corpus objects.

Usage

# S3 method for spacyr_parsed
docnames(x)

# S3 method for spacyr_parsed
ndoc(x)

# S3 method for spacyr_parsed
ntoken(x, ...)

# S3 method for spacyr_parsed
ntype(x, ...)

# S3 method for spacyr_parsed
nsentence(x, ...)

Arguments

x

an object returned by spacy_parse, or (for spacy_parse and spacy_tokenize) a corpus object

...

not used for these functions

Details

spacy_parse(x, ...) and spacy_tokenize(x, ...) work directly on quanteda corpus objects.

docnames() returns the document names

ndoc() returns the number of documents

ntoken() returns the number of tokens by document

ntype() returns the number of types (unique tokens) by document

nsentence() returns the number of sentences by document
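
To make the per-document counts above concrete, here is a minimal base R sketch that computes the same kinds of quantities on a small hand-built data frame. It is not quanteda's implementation and does not call spaCy. The `toy` object is a hypothetical stand-in for spacy_parse output, using the doc_id, sentence_id, and token columns that spacyr documents; the real methods may differ in details such as the order of the results.

```r
# Hypothetical stand-in for spacy_parse() output: one row per token.
# The values are typed by hand for illustration, not produced by spaCy.
toy <- data.frame(
  doc_id      = c("doc1", "doc1", "doc1", "doc1", "doc2", "doc2", "doc2"),
  sentence_id = c(1L, 1L, 2L, 2L, 1L, 1L, 1L),
  token       = c("Now", ".", "Now", ".", "Jack", "and", "Jill"),
  stringsAsFactors = FALSE
)

# Document names and number of documents
doc_names <- unique(toy$doc_id)
n_docs    <- length(doc_names)

# Tokens per document: count the rows within each doc_id
n_tokens <- tapply(toy$token, toy$doc_id, length)

# Types per document: count the distinct token strings within each doc_id
n_types <- tapply(toy$token, toy$doc_id, function(tok) length(unique(tok)))

# Sentences per document: count the distinct sentence_id values within each doc_id
n_sentences <- tapply(toy$sentence_id, toy$doc_id, function(s) length(unique(s)))
```

In this toy data, doc1 has 4 tokens but only 2 types, because "Now" and "." each occur twice, and its tokens span 2 sentences; doc2 has 3 tokens, 3 types, and 1 sentence.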

Examples

# NOT RUN {
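library("quanteda")  # provides corpus() and the methods documented here
library("spacyr")
spacy_initialize()   # requires a working spaCy installation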

corp <- corpus(c(doc1 = "And now, now, now for something completely different.",
                 doc2 = "Jack and Jill are children."))
spacy_tokenize(corp)
(parsed <- spacy_parse(corp))

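ntype(parsed)      # types (unique tokens) per document
ntoken(parsed)     # tokens per document
nsentence(parsed)  # sentences per document
ndoc(parsed)       # number of documents
docnames(parsed)   # document names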
# }