Drops duplicated rows across specified columns.
h2o.drop_duplicates(frame, columns, keep = "first")
frame: An H2OFrame object to drop duplicates on.
columns: Columns to compare during the duplicate detection process.
keep: Which rows to keep. "first" (the default) keeps the first row of each set of duplicates and deletes the rest; "last" keeps the last row instead.
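The difference between the two keep values can be illustrated without an H2O cluster. The sketch below is a base R analogue that uses duplicated() on a small, made-up data frame. It is not h2o code, and it assumes that h2o.drop_duplicates treats "first" and "last" relative to row order in the same way; the data frame and the variable names are hypothetical.

```r
# Base R analogue of keep = "first" / keep = "last" (illustrative only, not h2o).
# The data frame below is made up for this example.
df <- data.frame(
  id    = 1:5,
  group = c("a", "b", "a", "c", "b"),
  value = c(10, 20, 10, 30, 20)
)

# Only these columns are compared when looking for duplicates.
cols <- c("group", "value")

# Analogue of keep = "first": drop every row that repeats an earlier row.
keep_first <- df[!duplicated(df[cols]), ]

# Analogue of keep = "last": drop every row that is repeated by a later row.
keep_last <- df[!duplicated(df[cols], fromLast = TRUE), ]

print(keep_first$id)
print(keep_last$id)
```

Rows 1 and 3 share the same group and value, as do rows 2 and 5, so the two variants retain different id values even though both return three rows.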
# NOT RUN {
library(h2o)
h2o.init()
data <- as.h2o(iris)
deduplicated_data <- h2o.drop_duplicates(data, c("Species", "Sepal.Length"), keep = "first")
# }