sets (version 1.0-18)

fuzzydocs: Documents on Fuzzy Theory

Description

Occurence of three terms (neural networks, fuzzy, and image) in 30 documents retrieved from a Japanese article data base on fuzzy theory and systems.

Usage

data("fuzzy_docs")

Arguments

Format

fuzzy_docs is a list of 30 fuzzy multisets, representing the occurrence of the terms “neural networks”, “fuzzy”, and “image” in each document. Each term appears with up to three membership values representing weights, depending on whether the term occurred in the abstract (0.2), the keywords section (0.6), and/or the title (1). The first 12 documents concern neural networks, the remaining 18 image processing. In the reference, various clustering methods have been employed to recover the two groups in the data set.

Examples

Run this code
# NOT RUN {
data(fuzzy_docs)

## compute distance matrix using Jaccard dissimilarity
d <- as.dist(set_outer(fuzzy_docs, gset_dissimilarity))

## apply hierarchical clustering (Ward method)
cl <- hclust(d, "ward")

## retrieve two clusters
cutree(cl, 2)

## -> clearly, the clusters are formed by docs 1--12 and 13--30,
## respectively.

# }

Run the code above in your browser using DataCamp Workspace