clusterImage: Cluster the properties listed for a concept by their order of mention

Description

This function receives the data from a property listing task, a concept, and a threshold. It clusters the properties according to the order in which they were listed. Using the properties that all users mentioned for the given concept, the algorithm estimates a similarity between each pair of properties, based on the number of words mentioned between them. For example, if properties A and B are usually mentioned one after another, their similarity will be higher than that of properties A and C, which are usually not even mentioned together. Properties whose similarity to every other property is low (below the user-defined threshold) are discarded from the plot.
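The idea of an order-based similarity can be illustrated with a small, self-contained sketch. This is not the package's implementation: the helper `order_similarity`, the toy data, and the scoring rule (the reciprocal of the gap between list positions, averaged over all users) are assumptions chosen only to show how properties listed close together end up more similar, and how a threshold then drops weakly related properties.

```r
# Toy data in the documented layout: ID, Concept, Property (values are made up).
toy <- data.frame(
  ID       = c(1, 1, 1, 2, 2, 2, 3, 3),
  Concept  = "Ability",
  Property = c("skill", "talent", "effort",
               "skill", "talent", "practice",
               "talent", "skill"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not part of the package).
# For each user, a pair of properties scores 1 / (gap between their list
# positions); users who did not list both contribute 0. Scores are averaged
# over all users. Assumes each user lists a property at most once.
order_similarity <- function(data) {
  props <- sort(unique(data$Property))
  ids <- unique(data$ID)
  sim <- matrix(0, length(props), length(props), dimnames = list(props, props))
  for (id in ids) {
    listed <- data$Property[data$ID == id]  # properties in listing order
    pos <- seq_along(listed)
    names(pos) <- listed
    for (a in listed) {
      for (b in listed) {
        if (a != b) {
          sim[a, b] <- sim[a, b] + 1 / abs(pos[[a]] - pos[[b]])
        }
      }
    }
  }
  sim / length(ids)
}

sim <- order_similarity(toy)

# Keep a property only if its best similarity to any other property
# exceeds the threshold (illustrative value).
dist_threshold <- 0.5
keep <- apply(sim, 1, max) > dist_threshold
kept_properties <- names(keep)[keep]
print(kept_properties)
```

In this toy data, "skill" and "talent" are adjacent in every list, so they score highest with each other, while "effort" and "practice" are each mentioned by a single user and fall below the illustrative threshold.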

Usage

clusterImage(data, distThreshold, concept = NULL)

Value

A list with 2 elements: a ggplot2 plot and a data frame with cluster information.

Arguments

data: Data frame with 3 columns: ID, Concept and Property
distThreshold: Distance value. Properties are assigned to a specific cluster only if their similarity is greater than distThreshold.
concept: Text value. Clusters will only be generated with properties from this concept.

Examples

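# Example data with the columns ID, Concept and Property
data_cpn <- data.frame(CPN_27)
# Similarity threshold; properties below it are dropped from the plot
threshold <- 0.061
concept <- "Ability"
# Returns a list with the ggplot2 plot and the cluster data frame
cluster_data <- clusterImage(data_cpn, threshold, concept)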
