ffbase (version 0.12.8)

duplicated.ff: Duplicated for ff and ffdf objects

Description

Methods of duplicated for ff and ffdf objects, analogous to duplicated in the base package. Note that these methods differ slightly from the base method: they first order the ffdf or ff_vector object and then apply duplicated. To obtain exactly the same result as the base package, you therefore need to order the ffdf or ff_vector beforehand. See the example.
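
The order dependence described above can be seen with base R alone. The following is a minimal sketch that does not call ffbase; the vector x is made up for illustration. It only shows that applying duplicated to reordered data flags different positions while the number of flagged elements stays the same.

```r
# Illustrative data (an assumption, not taken from the package documentation).
x <- c("b", "a", "b", "a", "c")

# duplicated() flags every occurrence after the first one it encounters.
dup_original <- duplicated(x)

# Sort first, then apply duplicated() to the sorted values.
ord <- order(x)
dup_sorted <- duplicated(x[ord])

# The number of flagged elements is the same in both cases ...
n_original <- sum(dup_original)
n_sorted <- sum(dup_sorted)

# ... but the flags sit at different positions.
same_positions <- identical(dup_original, dup_sorted)
```

This is why comparing the two results element by element is only meaningful once both inputs are in the same order.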

Usage

# S3 method for ff
duplicated(x, incomparables = FALSE, fromLast = FALSE, trace = FALSE, ...)

# S3 method for ffdf
duplicated(x, incomparables = FALSE, fromLast = FALSE, trace = FALSE, ...)

Arguments

x

ff object or ffdf object

incomparables

a vector of values that cannot be compared. FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.

fromLast

logical indicating whether duplication should be considered from the last element, i.e., the last (or rightmost) of identical elements is the one that is not marked as duplicated

trace

logical indicating whether to report which chunk the function is currently processing

...

other parameters passed on to chunk
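
As a small illustration of the fromLast argument, the sketch below uses the base duplicated function on an in-memory vector. The ff and ffdf methods accept the same argument, but they are not exercised here, and the data are made up for illustration.

```r
# Illustrative data (an assumption, not taken from the package documentation).
x <- c(1, 2, 1, 3, 2)

# Default: the first occurrence of each value is kept (not flagged).
first_kept <- duplicated(x)

# fromLast = TRUE: the last occurrence of each value is kept instead.
last_kept <- duplicated(x, fromLast = TRUE)

# Dropping the flagged elements leaves one copy of each value either way.
unique_first <- x[!first_kept]
unique_last <- x[!last_kept]
```

Both filters return the same set of values; only the retained positions, and hence the resulting order, differ.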

Value

A logical ff vector of length nrow(x) or length(x) indicating if each row or element is duplicated.

See Also

duplicated, ffdforder, fforder

Examples

# NOT RUN {
## duplicated.ffdf - note that you need to order the records the way you
## want in order to get the same result as the base duplicated method
library(ffbase)  # also attaches ff, which provides as.ffdf
data(iris)
irisdouble <- rbind(iris, iris)
irisdouble <- irisdouble[ sample(x=1:nrow(irisdouble), size=nrow(irisdouble)
                        , replace = FALSE), ]
ffiris <- as.ffdf(irisdouble)
duplicated(ffiris, by=10, trace=TRUE)
duplicated(ffiris$Sepal.Length, by=10, trace=TRUE)
table(duplicated(irisdouble), duplicated(ffiris, by=10)[])
irisdouble <- irisdouble[order(apply( irisdouble
                                    , FUN=function(x) paste(x, collapse=".")
                                    , MARGIN=1
                                    )), ]
ffiris <- as.ffdf(irisdouble)
table(duplicated(irisdouble), duplicated(ffiris, by=10)[])
table(duplicated(ffiris$Sepal.Width, by=10)[], duplicated(ffiris$Sepal.Width[]))

measures <- c("Sepal.Width","Species")
irisdouble <- irisdouble[order(apply( irisdouble[, measures]
                                    , FUN=function(x) paste(x, collapse=".")
                                    , MARGIN=1)), ]
ffiris <- as.ffdf(irisdouble)
table(duplicated(irisdouble[, measures]), duplicated(ffiris[measures], by=10)[])
table(duplicated(ffiris$Sepal.Width, by=10)[], duplicated(ffiris$Sepal.Width[]))
# }
