
An experimental function implementing a consensus clustering algorithm. It targets a range of average next-state probabilities, derived by fitting a Markov chain to each cluster.
getConsensusClusters(
trainingCLS,
testCLS,
maxIterations = 5,
optimalProbMean = 0.5,
range = 0.3,
centresMin = 2,
clusterCentresRange = 0,
order = 1,
takeHighest = FALSE,
verbose = FALSE
)
trainingCLS: Clickstream object with training data (this should be the data used to build the Markov chain object).
testCLS: Clickstream object with test data.
maxIterations: Number of times to repeat the k-means clustering.
optimalProbMean: The target average probability of each next-page-click prediction in a first-order Markov chain.
range: The range above the optimal probability to target.
centresMin: The minimum number of cluster centres to evaluate.
clusterCentresRange: The number of additional cluster centres to evaluate.
order: The order of the Markov chains used to evaluate each cluster.
takeHighest: Whether to fall back to the highest mean next-click probability found (TRUE) or to raise an error (FALSE) if the target is not reached after the given number of k-means iterations.
verbose: Should this function report extra information on progress?
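The relationship between optimalProbMean and range can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helpers transitionMatrix, meanNextProb and inTargetWindow are hypothetical names introduced here, and the sketch assumes that the "average next-state probability" is the mean probability of the most likely next click from the last state of each test sequence, and that the target window is the closed interval from optimalProbMean to optimalProbMean + range.

```r
# Hypothetical helper (not part of clickstream): estimate a first-order
# transition matrix from a list of click sequences (character vectors).
transitionMatrix <- function(sequences) {
  states <- sort(unique(unlist(sequences)))
  counts <- matrix(0, length(states), length(states),
                   dimnames = list(from = states, to = states))
  for (s in sequences) {
    if (length(s) < 2) next
    for (k in seq_len(length(s) - 1)) {
      counts[s[k], s[k + 1]] <- counts[s[k], s[k + 1]] + 1
    }
  }
  rowTotals <- rowSums(counts)
  rowTotals[rowTotals == 0] <- 1  # states never left keep an all-zero row
  counts / rowTotals              # divides row i by rowTotals[i]
}

# Hypothetical helper: mean probability of the most likely next click,
# read from the last state of each test sequence (an assumed definition).
meanNextProb <- function(P, testSequences) {
  lastStates <- vapply(testSequences, function(s) s[length(s)], character(1))
  lastStates <- lastStates[lastStates %in% rownames(P)]
  mean(apply(P[lastStates, , drop = FALSE], 1, max))
}

# Assumed target window: [optimalProbMean, optimalProbMean + range].
inTargetWindow <- function(probMean, optimalProbMean = 0.5, range = 0.3) {
  probMean >= optimalProbMean && probMean <= optimalProbMean + range
}

trainSeqs <- list(c("h", "c", "c", "p"), c("i", "c", "p", "p"), c("h", "c", "p"))
testSeqs  <- list(c("h", "c"), c("i"))

P <- transitionMatrix(trainSeqs)
probMean <- meanNextProb(P, testSeqs)
hit <- inTargetWindow(probMean, optimalProbMean = 0.5, range = 0.4)
```

Under these assumptions, a clustering would be accepted when hit is TRUE for its clusters; otherwise another k-means iteration would be tried, up to maxIterations.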
Theo van Kraay theo.vankraay@hotmail.com
library(clickstream)

# Training clickstreams: one user per entry, user ID first, then the clicks
training <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
              "User2,i,c,i,c,c,c,d",
              "User3,h,i,c,i,c,p,c,c,p,c,c,i,d",
              "User4,h,c,c,p,p,c,p,p,p,i,p,o",
              "User5,i,h,c,c,p,p,c,p,c,d",
              "User6,i,h,c,c,p,p,c,p,c,o",
              "User7,i,h,c,c,p,p,c,p,c,d",
              "User8,i,h,c,c,p,p,c,p,c,d,o")

# Test clickstreams used to evaluate each candidate clustering
test <- c("User1,h,c,c,p,c,h,c,p,p,c,p,p,o",
          "User2,i,c,i,c,c,c,d",
          "User3,h,i,c,i,c,p,c,c,p,c,c,i,d")

# header = TRUE treats the first field of each entry as the clickstream name
trainingCLS <- as.clickstreams(training, header = TRUE)
testCLS <- as.clickstreams(test, header = TRUE)

# Search for clusters whose mean next-click probability reaches the target
clusters <- getConsensusClusters(trainingCLS, testCLS, maxIterations = 5,
                                 optimalProbMean = 0.40, range = 0.70,
                                 centresMin = 2, clusterCentresRange = 0,
                                 order = 1, takeHighest = FALSE,
                                 verbose = FALSE)

# Fit one Markov chain per cluster
markovchains <- fitMarkovChains(clusters)

# Choose the chain that best matches the start pattern, then predict from it
startPattern <- new("Pattern", sequence = c("i", "h", "c", "p"))
mc <- getOptimalMarkovChain(startPattern, markovchains, clusters)
predict(mc, startPattern)