A data frame with one row for each sentence and three columns:
parent, index, and text. The parent value
is the integer index of the parent text in x; the index value
is the integer index of the sentence in its parent; the
text value is the text of the sentence, a value of type
text.
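
To make the shape of the result concrete, here is a hand-built sketch in base R. It does not call sentences: the input x and the sentence strings are made up for illustration, and the text column is a plain character vector rather than a value of type text. It also assumes that the sentences partition each parent text, trailing whitespace included, as UAX 29 segmentation does.

```r
# Hypothetical input: a character vector of two texts.
x <- c("Hello there. How are you?", "Fine.")

# Hand-built illustration of the documented shape: one row per sentence.
# Plain character stands in for the text type here.
result <- data.frame(
  parent = c(1L, 1L, 2L),  # which element of x the sentence came from
  index  = c(1L, 2L, 1L),  # position of the sentence within that element
  text   = c("Hello there. ", "How are you?", "Fine."),
  stringsAsFactors = FALSE
)

# Pasting the sentences of each parent back together recovers x.
rebuilt <- vapply(split(result$text, result$parent),
                  paste, character(1), collapse = "")
```

The parent column is what ties each row back to an element of x, so grouping by parent is the natural way to work with the result.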
Details
sentences splits text at the sentence boundaries defined by Unicode
Standard Annex #29 (UAX 29),
http://unicode.org/reports/tr29/#Sentence_Boundaries.
These boundaries handle Unicode correctly and give reasonable
behavior across a variety of languages. However, the UAX 29
sentence-breaking rules do not treat abbreviations specially. For
example, the text "I saw Mr. Jones today." is split into
two sentences, with the break falling after "Mr. ".
Future versions of the sentences function may change to
accommodate special rules for abbreviations such as "Mr." and "Dr.".
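
Until such rules exist, one possible workaround is to rejoin sentences after the fact. The sketch below is a hypothetical base R helper, not part of the package: merge_abbreviations and its abbrevs argument are names invented here, and it assumes that as.character() converts the text column to a character vector. It merges a row into the previous one when both share a parent and the previous sentence ends with a listed abbreviation.

```r
# Hypothetical helper, not part of the package: rejoin sentences that were
# split after a known abbreviation.
merge_abbreviations <- function(df, abbrevs = c("Mr.", "Mrs.", "Dr.")) {
  n <- nrow(df)
  if (n == 0L) return(df)
  txt <- as.character(df$text)

  # TRUE where a sentence ends with an abbreviation, ignoring trailing space.
  trimmed <- sub("[[:space:]]+$", "", txt)
  ends_abbrev <- vapply(trimmed, function(s) any(endsWith(s, abbrevs)),
                        logical(1), USE.NAMES = FALSE)

  # A row continues the previous row when that row has the same parent
  # and ends with an abbreviation.
  continues <- c(FALSE, ends_abbrev[-n] & df$parent[-n] == df$parent[-1])
  group <- cumsum(!continues)

  merged <- data.frame(
    parent = as.integer(tapply(df$parent, group, function(p) p[1])),
    text   = as.character(tapply(txt, group, paste, collapse = "")),
    stringsAsFactors = FALSE
  )
  # Renumber the sentences within each parent.
  merged$index <- ave(seq_len(nrow(merged)), merged$parent, FUN = seq_along)
  merged[, c("parent", "index", "text")]
}

# Hand-built input mimicking the split described above.
df <- data.frame(parent = c(1L, 1L, 2L), index = c(1L, 2L, 1L),
                 text = c("I saw Mr. ", "Jones today.", "Hi."),
                 stringsAsFactors = FALSE)
fixed <- merge_abbreviations(df)
```

This is a heuristic: it will also join two sentences when the first genuinely ends with an abbreviation, so the abbreviation list should be chosen with the input in mind.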