pubmed.mineR (version 1.0.11)

wordscluster: To cluster the words

Description

wordscluster clusters words, using the Levenshtein distance, that occur together in combination with 'prefixes', 'suffixes', or other compound words. The first word in each cluster, usually the shortest and in many cases a 'stemmed' form (sometimes drastically so), is taken as the representative of that cluster.
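The idea of grouping words by Levenshtein distance and taking the shortest member as the representative can be sketched in base R. This is only an illustration of the concept, not the pubmed.mineR implementation: the word list, the distance threshold of 2, and the use of complete-linkage clustering are all assumptions made for the example.

```r
# Illustrative sketch only; NOT how wordscluster is implemented.
# The word list below is made up for the example.
words <- c("cluster", "clusters", "clustered", "protein", "proteins", "kinase")

# Keep words whose length lies between the (assumed) lower and upper limits.
lower <- 5
upper <- 10
words <- words[nchar(words) >= lower & nchar(words) <= upper]

# Pairwise Levenshtein distances; adist() is part of base R (utils).
d <- adist(words)
dimnames(d) <- list(words, words)

# Group words with complete-linkage hierarchical clustering, cutting the
# tree at distance 2. Both choices are arbitrary assumptions.
groups <- cutree(hclust(as.dist(d), method = "complete"), h = 2)

# Use the shortest word in each group as its representative.
clusters <- split(words, groups)
representatives <- vapply(
  clusters,
  function(w) w[which.min(nchar(w))],
  character(1)
)
```

With this toy input, inflected forms such as "clusters" and "clustered" fall into the same group as "cluster", while unrelated words stay in groups of their own.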

Usage

wordscluster(lower, upper)

Arguments

lower

lower limit on the number of characters in a word. Default = 5.

upper

upper limit on the number of characters in a word. Default = 30.

Value

a list object of words clustered together, and a text file named "resulttable.txt" with the columns cluster number, cluster size, and cluster representative.

Details

This function is useful for dampening the 'explosion' of words output by word_atomizations. Clustering the words makes the terms easier to examine.

See Also

whichcluster, word_atomizations

Examples

# NOT RUN {
test <- wordscluster(5, 10)
## Clusters words that are at least 5 and at most 10 characters long.
# }