
arulesSequences (version 0.2-1)

similarity-methods: Compute Similarities

Description

Provides the generic function similarity and the S4 method to compute similarities among a collection of sequences.

is.subset, is.superset find subsequence or supersequence relationships among a collection of sequences.

Usage

similarity(x, y = NULL, ...)

## S4 method for signature 'sequences'
similarity(x, y = NULL, method = c("jaccard", "dice", "cosine", "subset"), strict = FALSE)

## S4 method for signature 'sequences'
is.subset(x, y = NULL, proper = FALSE)

## S4 method for signature 'sequences'
is.superset(x, y = NULL, proper = FALSE)

Arguments

x, y
an object.
...
further (unused) arguments.
method
a string specifying the similarity measure to use (see details).
strict
a logical value specifying if strict itemset matching should be used.
proper
a logical value specifying if only strict relationships (omitting equality) should be indicated.
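
The effect of proper can be illustrated with a small base R sketch. It assumes that a sequence is a list of itemsets (character vectors) and that one sequence is a subsequence of another when its itemsets are contained, in order, in distinct itemsets of the other. The helper is_subsequence is hypothetical and is not part of arulesSequences; the package methods operate on sequences objects and return sparse matrices.

```r
# Hypothetical helper, not the package implementation.
# TRUE if each itemset of `a` is contained in a distinct itemset of `b`,
# preserving order (greedy earliest match).
is_subsequence <- function(a, b, proper = FALSE) {
  j <- 1L
  for (itemset in a) {
    # advance in b until an itemset containing the current one is found
    while (j <= length(b) && !all(itemset %in% b[[j]])) j <- j + 1L
    if (j > length(b)) return(FALSE)
    j <- j + 1L
  }
  # with proper = TRUE, equal sequences are excluded: two sequences are
  # equal exactly when each is a subsequence of the other
  if (proper) !is_subsequence(b, a) else TRUE
}

a <- list("A", "C")
b <- list("A", c("B", "C"), "D")

is_subsequence(a, b)                 # a is contained in b
is_subsequence(a, a)                 # equality counts when proper = FALSE
is_subsequence(a, a, proper = TRUE)  # equality is omitted when proper = TRUE
```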

Value

  • For similarity, returns an object of class dsCMatrix if the result is symmetric (or method = "subset") and an object of class dgCMatrix otherwise.

    For is.subset and is.superset, returns an object of class lgCMatrix.
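
The classes named above come from the Matrix package. As a minimal sketch of how a symmetric sparse result can be handled, the following builds a small dsCMatrix by hand (the values are made up for illustration and are not output of similarity) and converts it to a dense base R matrix.

```r
library(Matrix)

# A 2 x 2 symmetric sparse matrix; only the upper triangle is supplied.
# The entries are illustrative values, not computed similarities.
m <- sparseMatrix(i = c(1, 1, 2), j = c(1, 2, 2),
                  x = c(1, 0.5, 1), symmetric = TRUE)

class(m)             # symmetric sparse class from Matrix
d <- as.matrix(m)    # dense matrix with both triangles filled in
d
```

Converting to a dense matrix is convenient for small collections; for large ones, keeping the sparse representation avoids storing the zero entries.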

Details

Let the number of common elements of two sequences refer to those that occur in a longest common subsequence. The following similarity measures are implemented: "jaccard", "dice", "cosine", and "subset" (the values accepted by method).

If strict = TRUE, the elements (itemsets) of the sequences must be equal to be matched. Otherwise, matches are quantified by the similarity of the itemsets (as specified by method), thresholded at 0.5, and the common sequence is quantified by the sum of these similarities.
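
To make the role of the longest common subsequence concrete, here is a rough base R sketch for the strict case. It assumes the textbook forms of the Jaccard, Dice, and cosine coefficients, with the length of a longest common subsequence used as the number of common elements; the package's exact definitions may differ, and "subset" is not sketched. The helpers lcs_length and seq_similarity are hypothetical names, not part of arulesSequences.

```r
# Hypothetical helpers, not the package implementation.
# Length of a longest common subsequence, by dynamic programming.
lcs_length <- function(a, b) {
  n <- length(a)
  m <- length(b)
  tab <- matrix(0L, n + 1L, m + 1L)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      tab[i + 1L, j + 1L] <- if (identical(a[[i]], b[[j]])) {
        tab[i, j] + 1L                          # elements match exactly
      } else {
        max(tab[i, j + 1L], tab[i + 1L, j])     # skip an element of a or b
      }
    }
  }
  tab[n + 1L, m + 1L]
}

# Assumed textbook coefficients with the LCS length as the common count.
seq_similarity <- function(a, b, method = c("jaccard", "dice", "cosine")) {
  method <- match.arg(method)
  common <- lcs_length(a, b)
  na <- length(a)
  nb <- length(b)
  switch(method,
    jaccard = common / (na + nb - common),
    dice    = 2 * common / (na + nb),
    cosine  = common / sqrt(na * nb)
  )
}

a <- c("A", "B", "C", "D")
b <- c("A", "C", "D", "E")   # longest common subsequence: A, C, D

seq_similarity(a, b, "jaccard")  # 3 / (4 + 4 - 3)
seq_similarity(a, b, "dice")     # 2 * 3 / (4 + 4)
seq_similarity(a, b, "cosine")   # 3 / sqrt(4 * 4)
```

Under these assumed forms, Dice counts the common elements twice, so it is never smaller than Jaccard for the same pair of sequences.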

See Also

Class sequences, method dissimilarity.

Examples

## load the package and use example data
library(arulesSequences)
data(zaki)
z <- as(zaki, "timedsequences")
similarity(z)

## require equality
similarity(z, strict = TRUE)

## emphasize common elements
similarity(z, method = "dice")

## subsequence relationships
is.subset(z)
is.subset(z, proper = TRUE)
