transactions.
read_baskets(con, sep = "[ ]+", info = NULL, iteminfo = NULL,
             encoding = "unknown")
- con: an object of class connection or a file name.
- sep: a regular expression specifying how fields are separated in the data file.
- info: a character vector specifying the header for columns with additional transaction information.
- iteminfo: a data frame specifying (additional) item information.
- encoding: a character string indicating the encoding, which is passed to readLines (see Encoding).
Each line of text represents a transaction where items are
separated by a pattern matching the regular expression specified
by sep. Columns with additional information such as customer or time (event)
identifiers are required to come before any item identifiers and to
be separated by sep, and must be specified by info.
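
The line layout described above can be illustrated in base R. This is only a sketch of the file format, not the package's implementation; the sample lines and the object names (lines, fields, tinfo, items) are made up for illustration.

```r
# Hypothetical basket lines: sequenceID, eventID, SIZE, then the items.
lines <- c("1 10 2 C D",
           "1 15 3 A B C",
           "2 15 3 A B F")
info <- c("sequenceID", "eventID", "SIZE")

# Split each line on the same pattern as the default `sep`.
fields <- strsplit(lines, "[ ]+")

# The first length(info) fields are the transaction information ...
tinfo <- as.data.frame(do.call(rbind, lapply(fields, `[`, seq_along(info))),
                       stringsAsFactors = FALSE)
names(tinfo) <- info

# ... and everything after them is the item list of that transaction.
items <- lapply(fields, function(f) f[-seq_along(info)])
```

Here tinfo holds one row per line with the three information columns (as character strings), and items holds the remaining fields of each line.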
Sequential data are identified by the presence of the column identifiers
"sequenceID" (sequence or customer identifier) and "eventID"
(time or event identifier) of slot transactionInfo.
The row names of iteminfo must match the item identifiers
present in the data. However, iteminfo need not contain a
labels column.
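
A data frame suitable for iteminfo might be prepared along the following lines. The identifiers and the category column are hypothetical, and the check assumes the stricter reading that the row names and the identifiers in the data coincide exactly; whether extra rows are tolerated is not stated here.

```r
# Hypothetical item identifiers as they would occur in the data file.
item_ids <- c("A", "B", "C", "D", "F")

# Hypothetical item information: the row names carry the identifiers,
# and no `labels` column is supplied.
iteminfo <- data.frame(category = c("x", "x", "y", "y", "z"),
                       row.names = item_ids,
                       stringsAsFactors = FALSE)

# Sanity check before reading: row names and identifiers should coincide.
ok <- setequal(rownames(iteminfo), item_ids)
```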
An object of class transactions.
The item labels are sorted in the order they appear first in the
data.
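
The ordering by first appearance can be reproduced in base R with unique(), which keeps the first occurrence of each value. The item lists below are made up for illustration, and this is a sketch of the ordering rule only, not of how the package computes its labels.

```r
# Hypothetical item lists, one per transaction, in file order.
items <- list(c("C", "D"), c("A", "B", "C"), c("A", "B", "F"))

# unique() keeps the first occurrence of each value, so the result
# follows the order in which items first appear in the data.
labels <- unique(unlist(items))
labels
```

Note that the result is not alphabetical: "C" and "D" come first because they occur in the first transaction.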
Class timedsequences, transactions, function cspade.
## read example data
x <- read_baskets(con = system.file("misc", "zaki.txt",
                                    package = "arulesSequences"),
                  info = c("sequenceID", "eventID", "SIZE"))
as(x, "data.frame")

## calendar dates
transactionInfo(x)$Date <-
as.Date(transactionInfo(x)$eventID, origin = "2015-04-01")
transactionInfo(x)
all.equal(transactionInfo(x)$eventID,
as.integer(transactionInfo(x)$Date - as.Date("2015-04-01")))