h2o (version 3.30.1.2)

h2o.drop_duplicates: Drops duplicated rows.

Description

Drops duplicated rows across specified columns.

Usage

h2o.drop_duplicates(frame, columns, keep = "first")

Arguments

frame

An H2OFrame object to drop duplicates on.

columns

Columns to compare during the duplicate detection process.

keep

Which row to keep from each set of duplicates. "first" (default) keeps the first row and deletes the rest; "last" keeps the last row and deletes the rest.

Examples

# NOT RUN {
library(h2o)
h2o.init()

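# Upload the built-in iris data set to the H2O cluster as an H2OFrame
data <- as.h2o(iris)
# Keep the first row for each distinct (Species, Sepal.Length) combination
deduplicated_data <- h2o.drop_duplicates(data, c("Species", "Sepal.Length"), keep = "first")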
# }
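
The difference between keep = "first" and keep = "last" can be illustrated without an H2O cluster. The following is a minimal base R sketch of the same idea, using duplicated() on an ordinary data.frame; it is an analogy, not the h2o implementation, and it assumes h2o.drop_duplicates behaves as the keep argument is described above. The data frame df, the row_id column, and the variable names are hypothetical and exist only for illustration.

```r
# Hypothetical toy data: rows 1 and 2 share the same (Species, Sepal.Length) pair.
df <- data.frame(
  Species = c("setosa", "setosa", "virginica", "setosa"),
  Sepal.Length = c(5.1, 5.1, 6.3, 4.9),
  row_id = 1:4
)

# Columns compared during duplicate detection (analogous to the columns argument).
key_cols <- c("Species", "Sepal.Length")

# Analogue of keep = "first": drop every later occurrence of a key combination.
keep_first <- df[!duplicated(df[, key_cols]), ]

# Analogue of keep = "last": scan from the end, so the last occurrence survives.
keep_last <- df[!duplicated(df[, key_cols], fromLast = TRUE), ]

print(keep_first$row_id)
print(keep_last$row_id)
```

Columns outside key_cols (here row_id) are not compared, so they show which of the duplicated rows survived in each case.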
