seqformat(data, var=NULL, id=NULL,
    from, to, compressed=FALSE,
    nrep=NULL, tevent, stsep=NULL, covar=NULL,
    SPS.in=list(xfix="()", sdsep=","),
    SPS.out=list(xfix="()", sdsep=","),
    begin=NULL, end=NULL, status=NULL,
    process=TRUE, pdata=NULL, pvar=NULL,
    limit=100, overwrite=TRUE,
    fillblanks=NULL, tmin=NULL, tmax=NULL)

var: Default is NULL, i.e. all the columns. Whether the sequences are
in the compressed (character strings) or extended format is
automatically detected by counting the number of columns.

id: With the SPELL format as input, this identification number
is mandatory, in order to identify all spells belonging to each
individual in the data set.

from: One of "STS", "SPS", "SPELL".
If data is a sequence object, the format is automatically set to
"STS".

to: One of "STS", "SPS", "SRS", "DSS", "TSE".

compressed: If TRUE and the output format is one of "STS", "SPS" or
"DSS", the output sequences are compressed into character strings.

nrep: Applies when the output is in "SRS" format.

tevent: If the output is in time-stamped event ("TSE") format,
a matrix of size $d \times d$, where $d$ is the number of distinct
states appearing in the sequences, must be given. In this matrix, the
cell $(i,j)$ contains all events associated with a transition from
state $i$ to state $j$.

stsep: If NULL (default value), the seqfcheck function is called to
detect the separator automatically.

covar: If the "SRS" output format is chosen, the covariates are
replicated across each row. Default is NULL.

SPS.in: Options for the input SPS format. Set the xfix element of the
list to "" if there is no prefix/suffix.

SPS.out: Options for the output SPS format. Set the xfix element of
the list to "" if there is no prefix/suffix.

begin: With SPELL input, the column with
the beginning position of the spell.

end: With SPELL input, the column with the end position of the spell.

status: With SPELL input, the column with the status.

process: If TRUE (default) when converting from SPELL, sequences are
created on a process time axis. If set to FALSE, they are created on
a calendar time axis.

pdata: With SPELL input and process=TRUE, either NULL, "auto" or the
name of
the data frame containing the individual 'birth' time, that is, the
entering time from which the process time will be computed.

pvar: Used together with pdata.

limit: With SPELL input, size of the resulting data frame when
creating age sequences (by default it goes from age 1 to age 100).

overwrite: With SPELL input, if overwrite is set to TRUE, the most
recent episode overwrites the older one if they overlap each other.
If set to
FALSE, the most recent episode starts from the end of the older one.

fillblanks: With SPELL input, if fillblanks is not NULL, gaps between
episodes are filled with the character given as argument.

tmin: With SPELL input, if sequences are to be defined on a calendar
time axis, it defines the starting time of the axis. If set to NULL,
the minimum time is taken from the 'begin' column in the data.

tmax: With SPELL input, if year sequences are wanted, defines the
ending year of the data frame. If set to NULL, it is guessed from the
data (not so accurately!).

The seqformat function is used to convert data
from one format to another. The input data is first converted into
the STS format and then converted to the output format. Depending on
input and output formats, some information can be lost in the
conversion process. The output is a matrix, NOT a sequence object to
be passed to TraMineR functions for plotting and mining sequences
(use the seqdef function for that). See Gabadinho et al. (2009) and
Ritschard et al. (2009) for more details on longitudinal data formats
and converting between them.

Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2009). Mining
sequence data in R with the TraMineR package: A user's guide.
Department of Econometrics and Laboratory of Demography, University of
Geneva.

Ritschard, G., A. Gabadinho, M. Studer and N. S. Müller. Converting
between various sequence representations. In Ras, Z. & Dardzinska, A.
(eds.) Advances in Data Management, Springer, 2009, 223, 155-175.

See also: seqdef

## Converting sequences into SPS format
data(actcal)
actcal.SPS.A <- seqformat(actcal,13:24, from="STS", to="SPS")
head(actcal.SPS.A)
## SPS (compressed) format with no prefix/suffix and "/" as state/duration separator
actcal.SPS.B <- seqformat(actcal, 13:24,
    from="STS", to="SPS", compressed=TRUE,
    SPS.out=list(xfix="", sdsep="/"))
head(actcal.SPS.B)
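
To make the xfix and sdsep options more concrete, the base R sketch below builds SPS-style strings from a single STS row using rle(). It only illustrates the format and is not TraMineR's implementation: the helper name sts_to_sps and the "-" separator placed between spells are assumptions made here.

```r
# Hypothetical helper (not part of TraMineR): collapse one STS sequence
# into an SPS-style string of state/duration spells.
sts_to_sps <- function(states, xfix = "()", sdsep = ",", spellsep = "-") {
  runs <- rle(as.character(states))   # consecutive runs of identical states
  prefix <- substr(xfix, 1, 1)        # empty string when xfix is ""
  suffix <- substr(xfix, 2, 2)
  spells <- paste0(prefix, runs$values, sdsep, runs$lengths, suffix)
  paste(spells, collapse = spellsep)  # spellsep is an assumed separator
}

sts <- c("A", "A", "A", "B", "B", "A")
sps_default <- sts_to_sps(sts)                        # prefix/suffix "()", sdsep ","
sps_plain   <- sts_to_sps(sts, xfix = "", sdsep = "/") # no prefix/suffix, sdsep "/"
print(sps_default)
print(sps_plain)
```

Each spell pairs a state with the number of consecutive positions it occupies, which is why two STS sequences of the same length can yield SPS strings with different numbers of spells.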
## Converting sequences into DSS (compressed) format
actcal.DSS <- seqformat(actcal,13:24,
from="STS", to="DSS", compressed=TRUE)
head(actcal.DSS)
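
The SPELL-related arguments (begin, end, status, limit, overwrite, fillblanks) can be easier to follow with a toy example. The base R sketch below expands the spells of one individual onto positions 1 to limit. It is a simplified illustration under stated assumptions, not the code used by seqformat: the helper spells_to_sts is hypothetical, positions are integers with inclusive bounds, "most recent" is taken to mean the later begin value, and with overwrite=FALSE a later episode is assumed to start one position after the end of the older one.

```r
# Hypothetical helper (not part of TraMineR): turn the spells of a single
# individual into one STS row of length `limit`.
spells_to_sts <- function(begin, end, status, limit,
                          overwrite = TRUE, fillblanks = NULL) {
  sts <- rep(NA_character_, limit)
  last_end <- 0
  for (k in order(begin)) {             # process episodes from oldest to newest
    start <- begin[k]
    if (!overwrite) {
      # Assumption: the newer episode may not cover positions already taken.
      start <- max(start, last_end + 1)
    }
    if (start <= end[k]) sts[start:end[k]] <- status[k]
    last_end <- max(last_end, end[k])
  }
  if (!is.null(fillblanks)) sts[is.na(sts)] <- fillblanks  # fill gaps
  sts
}

begin  <- c(1, 4, 8)
end    <- c(5, 6, 9)
status <- c("S", "W", "R")

sts_over   <- spells_to_sts(begin, end, status, limit = 10)
sts_keep   <- spells_to_sts(begin, end, status, limit = 10, overwrite = FALSE)
sts_filled <- spells_to_sts(begin, end, status, limit = 10, fillblanks = "*")
print(rbind(sts_over, sts_keep, sts_filled))
```

In this toy data the second spell overlaps the first on positions 4 and 5, so the two overwrite settings differ only there, and position 7 is a gap that fillblanks replaces.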