h2o.importFile(path, destination_frame = "", parse = TRUE, header = NA,
sep = "", col.names = NULL, col.types = NULL, na.strings = NULL)
h2o.importFolder(path, pattern = "", destination_frame = "", parse = TRUE,
header = NA, sep = "", col.names = NULL, col.types = NULL,
na.strings = NULL)
h2o.importURL(path, destination_frame = "", parse = TRUE, header = NA,
sep = "", col.names = NULL, na.strings = NULL)
h2o.importHDFS(path, pattern = "", destination_frame = "", parse = TRUE,
header = NA, sep = "", col.names = NULL, na.strings = NULL)
h2o.uploadFile(path, destination_frame = "", parse = TRUE, header = NA,
sep = "", col.names = NULL, col.types = NULL, na.strings = NULL,
progressBar = FALSE, parse_type = NULL)
If sep = "", the parser automatically detects the separator.

h2o.importFile is a parallelized reader that pulls data from a server-side location specified
by the client. The path is a server-side path. This is a fast, scalable, highly optimized way
to read data: H2O pulls the data from a data store and initiates the data transfer as a read
operation.

Unlike the import function, h2o.uploadFile is a push from the client to the server. The
specified path must be a client-side path. This is not scalable and is intended only for
smaller data sizes. The client pushes the data from a local filesystem (for example, on the
machine where R is running) to H2O. For big-data operations, you don't want the data stored on
or flowing through the client.

h2o.importFolder imports an entire directory of files. If the given path is relative, it is
resolved relative to the start location of the H2O instance. The default behavior is to pass
through to the parse phase automatically.

h2o.importURL and h2o.importHDFS are both deprecated. Use h2o.importFile instead.
library(h2o)
h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)
prosPath = system.file("extdata", "prostate.csv", package = "h2o")
prostate.hex = h2o.importFile(path = prosPath, destination_frame = "prostate.hex")
class(prostate.hex)
summary(prostate.hex)
# Import files matching a regex pattern with h2o.importFolder().
# This example imports all .csv files in the directory prostate_folder.
prosPath = system.file("extdata", "prostate_folder", package = "h2o")
prostate_pattern.hex = h2o.importFolder(path = prosPath, pattern = ".*\\.csv",
  destination_frame = "prostate.hex")
class(prostate_pattern.hex)
summary(prostate_pattern.hex)
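
The examples above use only the server-side import functions. As a contrast, the following is a minimal sketch of the client-side push with h2o.uploadFile, assuming an H2O instance can be started or reached from the R session. The frame key "prostate_upload.hex" and the explicit sep and header values are illustrative choices, not requirements; leaving sep = "" would let the parser detect the separator itself.

```r
library(h2o)
h2o.init()

# Client-side path: the R session reads this file and pushes it to H2O.
prosPath <- system.file("extdata", "prostate.csv", package = "h2o")

# "prostate_upload.hex" is a hypothetical key chosen for this sketch.
# sep and header are set explicitly here instead of relying on auto-detection.
prostate_up <- h2o.uploadFile(path = prosPath,
                              destination_frame = "prostate_upload.hex",
                              header = TRUE, sep = ",")

class(prostate_up)
dim(prostate_up)
```

When the client and the H2O server run on the same machine, as here, both approaches can see the same file. The distinction matters once the server is remote: h2o.importFile then needs a path that is visible to the server, while h2o.uploadFile needs one that is visible to the R session.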