Usage
hive_stream( mapper, reducer, input, output, henv = hive(),
mapper_args = NULL, reducer_args = NULL, cmdenv_arg = NULL )
Arguments
mapper
a function which is executed on each worker node. The
so-called mapper typically maps input key/value pairs to a set of
intermediate key/value pairs.
reducer
a function which is executed on each worker node. The
so-called reducer reduces a set of intermediate values which share a
key to a smaller set of values. If no reducer is used, leave this
argument empty.
input
specifies the directory in the DFS holding the input data.
output
specifies the output directory in the DFS containing the
results after the streaming job has finished.
henv
the Hadoop local environment. Defaults to the value of hive().
mapper_args
additional arguments to the mapper.
reducer_args
additional arguments to the reducer.
cmdenv_arg
additional arguments passed as environment variables
to distributed tasks.
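Examples
The following is a minimal word-count sketch of how the arguments above might fit together; it is an illustration under stated assumptions, not a definitive recipe. It assumes that each task receives its data line by line on stdin and emits tab-separated key/value lines on stdout (the usual Hadoop Streaming convention), and that a working Hadoop installation is available to hive(). The DFS paths "/tmp/input" and "/tmp/output" and the wrapper run_wordcount() are hypothetical names chosen for the example. mapper_args, reducer_args and cmdenv_arg are left at their defaults.

```r
## Mapper: emit "<term>\t1" for every term longer than one character.
## Assumption: input lines arrive on stdin, one record per line.
mapper <- function(x) {
  con <- file("stdin", open = "r")
  on.exit(close(con))
  while (length(line <- readLines(con, n = 1L, warn = FALSE)) > 0) {
    terms <- unlist(strsplit(line, " ", fixed = TRUE))
    terms <- terms[nchar(terms) > 1]
    if (length(terms))
      writeLines(paste(terms, 1, sep = "\t"))
  }
}

## Reducer: sum the counts that share a key and emit "<term>\t<total>".
## Assumption: intermediate "<key>\t<value>" lines arrive on stdin.
reducer <- function(x) {
  counts <- new.env(hash = TRUE)
  con <- file("stdin", open = "r")
  on.exit(close(con))
  while (length(line <- readLines(con, n = 1L, warn = FALSE)) > 0) {
    keyvalue <- unlist(strsplit(line, "\t", fixed = TRUE))
    if (length(keyvalue) < 2L)
      next                                  # skip malformed lines
    key <- keyvalue[1]
    previous <- if (exists(key, envir = counts, inherits = FALSE))
      get(key, envir = counts) else 0L
    assign(key, previous + as.integer(keyvalue[2]), envir = counts)
  }
  for (term in ls(counts, all.names = TRUE))
    writeLines(paste(term, get(term, envir = counts), sep = "\t"))
}

## Hypothetical wrapper: submit the job and list the output directory.
## Requires the hive package and a reachable Hadoop cluster.
run_wordcount <- function(input = "/tmp/input", output = "/tmp/output") {
  hive::hive_stream(mapper = mapper, reducer = reducer,
                    input = input, output = output)
  hive::DFS_list(output)
}

## Not run: run_wordcount()
```

Both functions are written to be self-contained because they are executed on the worker nodes, where objects from the calling session may not be available. Wrapping the submission in run_wordcount() keeps the definitions loadable on a machine without Hadoop.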