TraMineR (version 1.8-9)

seqecmpgroup: Identifying discriminating subsequences

Description

Identify and sort the most discriminating subsequences by their discriminating power.

Usage

seqecmpgroup(subseq, group, method="chisq", pvalue.limit=NULL,
             weighted = TRUE)

Arguments

subseq
A subseqelist object (list of subsequences) such as produced by seqefsub
group
Group membership, i.e., a variable or factor defining the groups to discriminate between
method
The discrimination method; one of "bonferroni" or "chisq"
pvalue.limit
Can be used to filter the results. Only subsequences with a p-value lower than this parameter are selected. If NULL all subsequences are returned (regardless of their p-values).
weighted
Logical. If TRUE, seqecmpgroup uses the weights specified in subseq (see seqefsub).

Value

An object of type subseqelistchisq (a subtype of subseqelist) with the following elements:

  • subseq: Sorted list of the discriminating subsequences found
  • seqe: The event sequence object on which the tests were computed
  • constraint: Time constraints used for searching the subsequences (see seqeconstraint)
  • labels: Levels (value labels) of the target group variable
  • type: Type of test used
  • data: A data frame with columns support, index (original order of the subsequence) and a pair of frequency and Pearson residual columns for each group
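To make the per-group frequency and Pearson residual columns more concrete, here is a minimal base-R sketch on toy data. The vectors `group` and `has_subseq` are hypothetical. The sketch also assumes that the frequency is the share of sequences in each group containing the subsequence and that the residual is the standard Pearson residual of the "present" cell. The exact computation inside seqecmpgroup is not shown on this page and may differ.

```r
# Hypothetical toy data: 12 sequences, 6 per group, and a flag telling
# whether a given subsequence occurs in each sequence.
group <- factor(rep(c("man", "woman"), each = 6))
has_subseq <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE,
                FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)

# Contingency table: presence of the subsequence by group
tab <- table(present = factor(has_subseq, levels = c(FALSE, TRUE)),
             group = group)

# Assumed meaning of "frequency": share of sequences in each group
# that contain the subsequence
freq <- prop.table(tab, margin = 2)["TRUE", ]

# Pearson residuals: (observed - expected) / sqrt(expected),
# with expected counts computed under independence
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
resid <- ((tab - expected) / sqrt(expected))["TRUE", ]

print(freq)
print(resid)
```

A positive residual indicates that the subsequence is more frequent in that group than expected under independence; a negative one, less frequent.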

Details

Two discrimination tests are implemented: chisq, the Pearson chi-squared test of independence, and bonferroni, the same test with a Bonferroni correction of the p-values.
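The difference between the two methods can be illustrated with base R alone. The sketch below is not the TraMineR implementation. It assumes that each subsequence is tested with a Pearson chi-squared test on a presence-by-group table, and that the Bonferroni correction multiplies each p-value by the number of tests (capped at 1), as `p.adjust` does. The data and the names `sub_a` and `sub_b` are hypothetical.

```r
# Hypothetical toy data: presence of two subsequences in 12 sequences
group <- factor(rep(c("man", "woman"), each = 6))
present <- list(
  sub_a = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE,
            FALSE, FALSE, FALSE, FALSE, FALSE, TRUE),
  sub_b = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE,
            TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# One Pearson chi-squared test of independence per subsequence
# (no continuity correction; warnings about small expected counts
# are silenced because the data are only illustrative)
p_chisq <- sapply(present, function(x) {
  tab <- table(factor(x, levels = c(FALSE, TRUE)), group)
  suppressWarnings(chisq.test(tab, correct = FALSE)$p.value)
})

# Bonferroni correction for the number of subsequences tested
p_bonf <- p.adjust(p_chisq, method = "bonferroni")

# Most discriminating subsequences first
print(sort(p_bonf))
```

Because the corrected p-values are never smaller than the raw ones, combining the Bonferroni method with a pvalue.limit gives a stricter filter than the plain chi-squared test.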

References

Studer, M., Müller, N.S., Ritschard, G. & Gabadinho, A. (2010), "Classer, discriminer et visualiser des séquences d'événements", In Extraction et gestion des connaissances (EGC 2010), Revue des nouvelles technologies de l'information RNTI. Vol. E-19, pp. 37-48.

See Also

plot.subseqelistchisq to plot the results.

Examples

library(TraMineR)
data(actcal.tse)
actcal.seqe <- seqecreate(actcal.tse)

##Searching for frequent subsequences, that is, those with a minimum support of 1% (appearing at least 20 times here)
fsubseq <- seqefsub(actcal.seqe, pMinSupport=0.01)

##Searching for the subsequences that best discriminate between men and women
data(actcal)
discr <- seqecmpgroup(fsubseq, group=actcal$sex, method="bonferroni")
##Printing discriminating subsequences
print(discr)
##Plotting the six most discriminating subsequences
plot(discr[1:6])
