The transactions class represents transaction data used for mining itemsets or rules. It is a direct extension of class itemMatrix to store a binary incidence matrix, item labels, and optionally transaction IDs and user IDs.
Objects are created by coercion from objects of other classes (see the examples) or by calls of the form new("transactions", ...). The class extends itemMatrix, directly.
For example, an item describing a person (i.e., the considered object, called a transaction) could be tall. The fact that the person is tall would be encoded in the transaction containing the item tall. This is typically encoded in a transaction-by-items matrix by a TRUE value. This is why coercion with as(data, "transactions") can deal with logical columns: it assumes that each such column stands for an item. The coercion can also convert columns with nominal values (i.e., factors) into a series of binary items (one for each level). So if you have nominal variables, make sure they are factors (and not characters or numbers) using something like data[,"a_nominal_var"] <- factor(data[,"a_nominal_var"]).
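The factor conversion described above can be applied to several nominal columns at once. The following is a minimal base-R sketch; the data frame `data` and its column names are hypothetical and chosen only for illustration.

```r
# Hypothetical data: one character column, one numeric code, one logical column
data <- data.frame(
  color = c("red", "blue", "red"),
  size_code = c(1, 2, 2),
  tall = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Nominal columns stored as characters or numbers must become factors
nominal_vars <- c("color", "size_code")
data[nominal_vars] <- lapply(data[nominal_vars], factor)

# Logical columns are left as they are; each one stands for a single item
sapply(data, class)
```

After this step every column is either a factor or a logical, which is the form the data.frame coercion in example 3 works with.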
Continuous variables need to be discretized first. An item resulting from discretization might be age>18, in which case the column contains only TRUE or FALSE. Alternatively, it can be a factor with levels age<=18, 50=>age>18 and age>50. These will be automatically converted into 3 items, one for each level. Have a look at the function discretize for automatic discretization.
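To illustrate both variants, the sketch below discretizes an age vector by hand. It uses base R's cut() rather than discretize, and the age values and level labels are made up for the example.

```r
# Hypothetical ages to discretize
age <- c(12, 18, 25, 50, 67)

# Factor variant: intervals are right-closed by default,
# i.e. (-Inf,18], (18,50], (50,Inf)
age_group <- cut(
  age,
  breaks = c(-Inf, 18, 50, Inf),
  labels = c("age<=18", "18<age<=50", "age>50")
)

# A factor with three levels; coercion would turn it into three items
table(age_group)

# Logical variant: a single item that is either present or absent
age_over_18 <- age > 18
age_over_18
```

The factor variant keeps all three ranges as separate items, while the logical variant records only whether the threshold is exceeded.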
Complete examples for how to prepare data can be found in the man pages for Income and Adult.
Transactions are represented as sparse binary matrices of class itemMatrix. If you work with several transaction sets at the same time, then the encoding (order of the items in the binary matrix) in the different sets is important. See itemCoding to learn how to encode and recode transaction sets.
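A rough sketch of why the encoding matters: two sets created independently each get their own item order, and recode() from the itemCoding man page can bring them to a shared one. The transaction sets below are hypothetical, and the exact arguments accepted by recode() may differ between arules versions, so check the itemCoding page for the installed version.

```r
library(arules)

# Two hypothetical transaction sets that share only some items
trans_a <- as(list(c("a", "b"), c("b", "c")), "transactions")
trans_b <- as(list(c("c", "d"), c("a", "d")), "transactions")

# Each set has its own item labels, so matrix columns do not line up
itemLabels(trans_a)
itemLabels(trans_b)

# Recode both sets to one common, ordered set of item labels
all_items <- sort(union(itemLabels(trans_a), itemLabels(trans_b)))
trans_a2 <- recode(trans_a, itemLabels = all_items)
trans_b2 <- recode(trans_b, itemLabels = all_items)

# Both sets now use the same encoding
identical(itemLabels(trans_a2), itemLabels(trans_b2))
```

Using the union of the labels is one simple choice; it ensures that every item occurring in either set has a column in both recoded sets.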
See also: [-methods, discretize, LIST, write, c, image, inspect, itemCoding, read.transactions, random.transactions, sets, itemMatrix-class
## example 1: creating transactions from a list
a_list <- list(
c("a","b","c"),
c("a","b"),
c("a","b","d"),
c("c","e"),
c("a","b","d","e")
)
## set transaction names
names(a_list) <- paste("Tr",c(1:5), sep = "")
a_list
## coerce into transactions
trans1 <- as(a_list, "transactions")
## analyze transactions
summary(trans1)
image(trans1)
## example 2: creating transactions from a matrix
## (rows are transactions, columns are items)
a_matrix <- matrix(c(
1,1,1,0,0,
1,1,0,0,0,
1,1,0,1,0,
0,0,1,0,1,
1,1,0,1,1
), ncol = 5, byrow = TRUE)
## set dim names: transaction IDs for the rows, item labels for the columns
dimnames(a_matrix) <- list(paste("Tr",c(1:5), sep = ""),
c("a","b","c","d","e"))
a_matrix
## coerce
trans2 <- as(a_matrix, "transactions")
trans2
inspect(trans2)
## example 3: creating transactions from data.frame
a_df <- data.frame(
age = as.factor(c(6, 8, NA, 9, 16)),
grade = as.factor(c("A", "C", "F", NA, "C")),
pass = c(TRUE, TRUE, FALSE, TRUE, TRUE))
## note: factors are translated differently to logicals and NAs are ignored
a_df
## coerce
trans3 <- as(a_df, "transactions")
inspect(trans3)
as(trans3, "data.frame")
## example 4: creating transactions from a data.frame with
## transaction IDs and items
a_df3 <- data.frame(
TID = c(1,1,2,2,2,3),
item=c("a","b","a","b","c", "b")
)
a_df3
trans4 <- as(split(a_df3[,"item"], a_df3[,"TID"]), "transactions")
trans4
inspect(trans4)