h2o.tokenize: Tokenize String

x: The column or columns whose strings to tokenize.

split: The regular expression to split on.

h2o.tokenize is similar to h2o.strsplit. The difference is that h2o.tokenize stores the tokenized
text in a single column, which makes further processing easier (filtering stop words, training a word2vec model, and so on).
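The contrast with h2o.strsplit can be sketched as follows. This is a minimal sketch, not the package's official example: it assumes a local H2O cluster can be started with h2o.init() and that the call takes the form h2o.tokenize(x, split). The sample sentences and the names docs, wide and tokens are illustrative only.

```r
library(h2o)

# Start (or connect to) a local H2O cluster; assumes Java is available.
h2o.init()

# Two illustrative documents, uploaded as a one-column H2OFrame.
docs <- as.h2o(data.frame(
  text = c("the quick brown fox", "jumps over the lazy dog"),
  stringsAsFactors = FALSE
))

# Make sure the column has string type before splitting.
text <- as.character(docs$text)

# h2o.strsplit spreads the pieces of each row across several columns.
wide <- h2o.strsplit(text, " ")

# h2o.tokenize stacks the tokens of all rows into a single column.
tokens <- h2o.tokenize(text, " ")

print(dim(wide))
print(dim(tokens))
```

Because all tokens end up in one column, the result can be filtered row by row (for example to drop stop words) before being passed on to later steps.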

R Interface for 'H2O'

R interface for 'H2O', the scalable open source machine learning
platform that offers parallelized implementations of many supervised and
unsupervised machine learning algorithms such as Generalized Linear
Models, Gradient Boosting Machines (including XGBoost), Random Forests,
Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Cox
Proportional Hazards, K-Means, PCA, Word2Vec, as well as a fully automatic
machine learning algorithm (AutoML).

Authors: Erin LeDell, Navdeep Gill, Spencer Aiello, Anqi Fu, Arno Candel, Cliff Click, Tom Kraljevic, Tomas Nykodym, Patrick Aboyoun, Michal Kurka, Michal Malohlava, Ludi Rehak, Eric Eckstrand, Brandon Hill, Sebastian Vidrio, Surekha Jadhawani, Amy Wang, Raymond Peck, Wendy Wong, Jan Gorecki, Matt Dowle, Yuan Tang, Lauren DiPerna, H2O.ai
