word.alignment (version 1.1)

remove.punct: Tokenizing and Removing Punctuation Marks

Description

Splits a given text into separate words and removes punctuation marks.

Usage

remove.punct(text)

Arguments

text

an object containing the text to be tokenized, typically a character string.

Value

A character vector of words.

Details

This function also treats numbers as separate words.

Note that this function removes a dot (".") only if it appears separately at the end of the sentence. It does not remove dashes or hyphens, because words containing these marks are assumed to be single words.
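
The rules above can be approximated in base R. The following is only a minimal sketch under one reading of this section, not the package's implementation: the helper name sketch_tokenize is hypothetical, "separately" is assumed to mean a dot that stands alone as the last token, and the output of remove.punct itself may differ.

```r
# Hypothetical helper, NOT part of word.alignment: a base-R approximation
# of the behaviour described in Details, under the assumptions stated above.
sketch_tokenize <- function(text) {
  # Split on runs of whitespace so repeated spaces do not yield empty tokens.
  tokens <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  # Remove punctuation, but keep hyphens (so "example-based" stays one word)
  # and dots (handled separately below).
  tokens <- gsub("[^[:alnum:].-]", "", tokens)
  # Discard tokens that consisted only of removed punctuation.
  tokens <- tokens[nzchar(tokens)]
  # Assumption: a dot is dropped only when it stands alone as the last token.
  n <- length(tokens)
  if (n > 0 && tokens[n] == ".") {
    tokens <- tokens[-n]
  }
  tokens
}

sketch_tokenize("This is an  example-based MT!")
sketch_tokenize("It has 2 parts .")
```

In this sketch the number 2 is kept as its own token and the hyphenated word is left intact, which mirrors the two points made above.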

Examples

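# NOT RUN {
library(word.alignment)

# Tokenize a sentence; the hyphenated word is expected to stay in one piece.
x <- "This is an  example-based MT!"
remove.punct(x)
# }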
