Checks whether seqdata contains sequences in the compressed format (character strings with states separated by a separator) or in the extended format (sequences stored in a matrix with each successive state in a separate column). For a more detailed description of the compressed and extended formats, see Gabadinho et al. (2009).
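To make the two formats concrete, here is a small base R sketch. The sequences and the state codes S, M and D are made up for illustration. The conversion uses only base R, not the package's own functions.

```r
# Two made-up sequences in compressed format: one string per sequence,
# with successive states separated by "-".
compressed <- c("S-S-M-M", "S-M-M-D")

# The same sequences in extended format: one row per sequence,
# one column per successive state.
extended <- do.call(rbind, strsplit(compressed, "-", fixed = TRUE))
extended

# Going back: paste the states of each row together with the separator.
apply(extended, 1, paste, collapse = "-")
```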
seqfcheck(seqdata)
Returns a character string coding the format of the sequence data, either ':', '-', 'X' or '-X'.
seqdata: a vector, data frame or matrix containing sequence data.
Alexis Gabadinho
Whether the sequences are in the compressed format is checked by counting the number of columns and searching for the '-' or ':' separator. The function returns the separator if it is found in the data. If the data contain more than one column, they are assumed to be in the extended format and 'X' is returned, unless some state codes contain the '-' character (e.g., states coded with negative integer values), in which case '-X' is returned.
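The rules described above can be sketched in base R as follows. This is only an illustration, not TraMineR's actual implementation: guess_seq_format is a hypothetical helper, and returning NA when a single column contains no separator is an assumption, since the text does not cover that case.

```r
# Hypothetical helper, NOT part of TraMineR: guesses the format code
# by following the rules described in the text.
guess_seq_format <- function(seqdata) {
  # A plain vector becomes a one-column matrix.
  m <- as.matrix(seqdata)

  if (ncol(m) > 1) {
    # Several columns: extended format. Use "-X" if any state code
    # contains "-" (e.g. negative integers), otherwise "X".
    has_dash <- any(grepl("-", m, fixed = TRUE))
    return(if (has_dash) "-X" else "X")
  }

  # One column: look for a separator inside the strings.
  for (sep in c("-", ":")) {
    if (any(grepl(sep, m[, 1], fixed = TRUE))) return(sep)
  }

  # Assumption: no separator found, so the format is left undetermined.
  NA_character_
}

guess_seq_format(c("S-S-M", "S-M-M"))                    # compressed, "-" separator
guess_seq_format(c("S:S:M", "S:M:M"))                    # compressed, ":" separator
guess_seq_format(data.frame(t1 = c("S", "S"),
                            t2 = c("M", "S")))           # extended
guess_seq_format(matrix(c(1, -1, 2, 3), nrow = 2))       # extended, negative codes
```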
Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2009). Mining Sequence Data in R with TraMineR: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.
seqconc, seqdecomp
## Load the package that provides seqfcheck and the example data sets
library(TraMineR)

## The sequences in the actcal data set
## are in the extended format
data(actcal)
head(actcal[,13:24])
seqfcheck(actcal[,13:24])
## The sequences in the famform data set
## are in the compressed format
data(famform)
famform
seqfcheck(famform)