arulesSequences (version 0.2-19)

read_baskets: Read Transaction Data

Description

Read transaction data in basket format (with additional temporal or other information) and create an object of class transactions.

Usage

read_baskets(con, sep = "[ \t]+", info = NULL, iteminfo = NULL,
             encoding = "unknown")

Arguments

con

an object of class connection or file name.

sep

a regular expression specifying how fields are separated in the data file.

info

a character vector specifying the header for columns with additional transaction information.

iteminfo

a data frame specifying (additional) item information.

encoding

a character string indicating the encoding, which is passed to readLines (see Encoding).

Value

An object of class transactions.

Details

Each line of text represents a transaction where items are separated by a pattern matching the regular expression specified by sep.

Columns with additional information such as customer or time (event) identifiers are required to come before any item identifiers and to be separated by sep, and must be specified by info.
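The line layout described above can be sketched with base R alone. This is only an illustration of the documented format, not the package's actual parsing code; the sample lines are made up for the example, and the helper objects fields, tinfo and items are hypothetical names.

```r
# Illustrative only: mimics the documented line layout with base R.
# The sample lines below are invented for this sketch.
lines <- c("1 10 2 C D",
           "1 15 3 A B C",
           "2 15 3 A B F")
sep  <- "[ \t]+"                              # default separator pattern
info <- c("sequenceID", "eventID", "SIZE")    # leading information columns

# Split every line into fields using the regular expression in sep.
fields <- strsplit(lines, split = sep)

# The first length(info) fields hold the transaction information ...
tinfo <- as.data.frame(do.call(rbind, lapply(fields, `[`, seq_along(info))),
                       stringsAsFactors = FALSE)
names(tinfo) <- info

# ... and all remaining fields are item identifiers.
items <- lapply(fields, function(f) f[-seq_along(info)])
```

Here tinfo holds one row per line with the three information columns (as character strings), and items is a list with one character vector of item identifiers per line.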

Sequential data are identified by the presence of the columns "sequenceID" (sequence or customer identifier) and "eventID" (time or event identifier) in transactionInfo.

The row names of iteminfo must match the item identifiers present in the data. However, iteminfo need not contain a labels column.
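As a rough sketch of that requirement, the base R snippet below builds a hypothetical iteminfo data frame and checks that every item identifier has a matching row name. The category column and the item identifiers are assumptions made for illustration; whether rows for items absent from the data are tolerated is not stated here, so the check covers only the one direction.

```r
# Hypothetical item identifiers as they might appear in a data file.
items <- list(c("C", "D"), c("A", "B", "C"))

# Hypothetical item information: row names are the item identifiers,
# and no labels column is supplied.
iteminfo <- data.frame(category = c("x", "y", "x", "y"),
                       row.names = c("A", "B", "C", "D"),
                       stringsAsFactors = FALSE)

# Item identifiers in the data that have no row in iteminfo.
unmatched <- setdiff(unique(unlist(items)), rownames(iteminfo))
```

An empty unmatched vector means every item identifier in the data has a corresponding row in iteminfo.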

See Also

Classes timedsequences, transactions; function cspade.

Examples

## read example data
x <- read_baskets(con  = system.file("misc", "zaki.txt",
                                     package = "arulesSequences"),
                  info = c("sequenceID", "eventID", "SIZE"))
as(x, "data.frame")

## calendar dates
transactionInfo(x)$Date <-
    as.Date(transactionInfo(x)$eventID, origin = "2015-04-01")
transactionInfo(x)
all.equal(transactionInfo(x)$eventID,
          as.integer(transactionInfo(x)$Date - as.Date("2015-04-01")))