TraMineR (version 1.1)

seqdef: Create a sequence object

Description

Create a sequence object to be passed to other functions provided by the TraMineR package. There are specific methods for plotting and printing sequence objects.

Usage

seqdef(data, var=NULL, informat="STS", stsep="-", 
	alphabet=NULL, states=NULL, start=1, 
	left=NA, right="DEL", gaps=NA, missing=NA, void="%", nr="*", 
	cnames=NULL, cpal=NULL, missing.color="darkgrey", labels=NULL, ...)

Arguments

data
a data frame or matrix containing sequence data.
var
the list of columns containing the sequences. Defaults to NULL, i.e. all the columns. Whether the sequences are in the compressed format (successive states in a character string) or the extended format is detected automatically.
informat
format of the original data. Default is 'STS'. Available formats are: STS, SPS, SPELL. See the TraMineR user manual (Gabadinho et al., 2008) for a description of the formats.
stsep
the character used as separator in the original data if input format is successive states in a character string. By default, "-".
alphabet
optional vector containing the alphabet (the list of all possible states). Use this option if some states in the alphabet don't appear in the data or if you want to reorder the states in the alphabet. The specified vector MUST contain AT LEAST all the st
states
an optional vector containing the labels for the states. Must have a length equal to the number of states in the data, and the labels must be sorted according to the output of the seqstatl function.
start
starting time. For instance, if your sequences begin at age 15, you can specify 15. At this stage, used only for labelling column names.
left
the behavior for missing values appearing in the left part of the sequences, i.e. the part before the first (leftmost) valid state in the sequences. See Gabadinho et al. (2008) for more details on the options for handling missing values when d
right
the behavior for missing values appearing in the right part of the sequences, i.e. the part after the last (rightmost) valid state in the sequences. See Gabadinho et al. (2008) for more details on the options for handling missing values when d
gaps
the behavior for missing values appearing in the central part of the sequences, i.e. the part after the first (leftmost) valid state in the sequences and before the last (rightmost) valid state in the sequences. See Gabadinho et al. (2008) for
missing
the code for the missing values appearing in the input data. If specified, all cells containing this value will be replaced by NA's, the internal R code for missing values. If 'missing' is not specified, cells containing NA's are considered to be missing
void
the internal code used by TraMineR for the void elements in the sequences. Defaults to "%".
nr
the internal code used by TraMineR for the missing elements in the sequences. Defaults to "*".
cnames
optional names for the columns composing the sequence data. These names are used by default as axis labels in the graphics. If not specified, names are taken from the original column names in the data.
cpal
an optional color palette for representing the states in the graphics. If not specified, a color palette is created with the RColorBrewer package, using the "accent" palette. Note that the maximum number of colors in this palette is 8. If the number of sta
missing.color
alternative color for representing missing values inside the sequences. Defaults to "darkgrey".
labels
labels for the states, to appear in the graphics' legend.
...
options passed to the seqformat function to handle input data not in STS format.
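
As a minimal sketch of the compressed-format handling described for var and stsep, the toy data frame below stores each sequence as a single character string with "-" as separator. The data and the names toy and toy.seq are hypothetical and not part of the package; the sketch assumes that TraMineR is installed and that the installed version behaves as documented on this page.

```r
library(TraMineR)

## Hypothetical toy data: three sequences in compressed STS format,
## one character string per sequence, states separated by "-"
toy <- data.frame(seq = c("A-A-B-C", "A-B-B-C", "C-C-A-A"),
                  stringsAsFactors = FALSE)

## var = 1 selects the single column holding the sequences;
## stsep gives the separator used inside each string
toy.seq <- seqdef(toy, var = 1, stsep = "-")

## Print the sequence object and the alphabet derived from the data
toy.seq
alphabet(toy.seq)
```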

Value

  • An object of class stslist. There are methods for print, summary, and subscripting sequence objects. Sequence objects are required as arguments to other functions, such as the plotting functions (seqdplot, seqiplot or seqfplot) and the functions that compute distances (seqdist).

Details

Subscripts applied to sequence objects (e.g. seq[,1:5] or seq[1:10,]) return a sequence object with preserved attributes (alphabet, missing) and adapted attributes (start, column names), unless only one column is selected, in which case a factor is returned.
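
The subscripting behavior can be checked with a short sketch on the 'actcal' example data set. The object names sub.seq and first.month are chosen here for illustration only, and the sketch assumes the installed TraMineR version behaves as described above.

```r
library(TraMineR)

## Build a sequence object from columns 13 to 24 of 'actcal'
data(actcal)
actcal.seq <- seqdef(actcal, 13:24)

## Selecting several rows and columns: the result should still be
## a sequence object (class stslist)
sub.seq <- actcal.seq[1:10, 1:5]
class(sub.seq)

## Selecting a single column: according to the description above,
## a factor is returned instead of a sequence object
first.month <- actcal.seq[, 1]
class(first.month)
```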

References

Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2008). Mining Sequence Data in R with TraMineR: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.

Examples

## Creating a sequence object with the columns 13 to 24 
## in the 'actcal' example data set
data(actcal)
actcal.seq <- seqdef(actcal,13:24,
	labels=c("> 37 hours", "19-36 hours", "1-18 hours", "no work"))

## Displaying the first 10 rows of the sequence object
actcal.seq[1:10,]

## Displaying the first 10 rows of the sequence object
## in SPS format
print(actcal.seq[1:10,], format="SPS")

## Frequency plot for the months June to September
seqfplot(actcal.seq[,6:9])

## Re-ordering the alphabet
actcal.seq <- seqdef(actcal,13:24,alphabet=c("B","A","D","C"))
alphabet(actcal.seq)

## Adding a state not appearing in the data to the
## alphabet
actcal.seq <- seqdef(actcal,13:24,alphabet=c("A","B","C","D","E"))
alphabet(actcal.seq)

## Adding a state not appearing in the data to the
## alphabet and changing the states labels
actcal.seq <- seqdef(actcal,13:24,
  alphabet=c("A","B","C","D","E"),
  states=c("FT","PT","LT","NO","TR"))
alphabet(actcal.seq)
actcal.seq[1:10,]
