Parse the elements of a character vector into a data frame of sentences, with document and sentence identifiers.
sentenceParse(text, docId = "create")
text: Character vector to be parsed into sentences.
docId: A vector of document IDs with length equal to the length of text. If docId == "create", then doc IDs will be created as an index from 1 to n, where n is the length of text.
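The docId rule can be sketched in base R as follows. This is only an illustration of the documented behavior, not the package's own code: resolve_doc_ids is a hypothetical helper, and the length check is an assumption based on the requirement that docId match the length of text.

```r
# Hypothetical helper (not part of the package): mirrors the documented docId rule.
resolve_doc_ids <- function(text, docId = "create") {
  # Default: build an index from 1 to n, where n is the length of text.
  if (identical(docId, "create")) {
    return(seq_along(text))
  }
  # Assumption: user-supplied IDs must line up one-to-one with the documents.
  if (length(docId) != length(text)) {
    stop("docId must have the same length as text")
  }
  docId
}

docs <- c("First document.", "Second document.")
resolve_doc_ids(docs)                         # generated IDs: 1 and 2
resolve_doc_ids(docs, docId = c("d1", "d2"))  # supplied IDs are kept as given
```

Under this sketch, each element of text counts as one document, so every sentence parsed from that element would share the element's ID.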
A data frame with 3 columns and n rows, where n is the number of sentences found by the routine.
Column 1: docId, the document id for the sentence.
Column 2: sentenceId, the sentence id for the sentence.
Column 3: sentence, the sentences found in the routine.
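# Two documents, with document IDs created automatically
sentenceParse(c("Bill is trying to earn a Ph.D.", "You have to have a 5.0 GPA."))
# The same two documents, with user-supplied document IDs
sentenceParse(c("Bill is trying to earn a Ph.D.", "You have to have a 5.0 GPA."),
              docId = c("d1", "d2"))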