sdf_separate_column

Separate a Vector Column into Scalar Columns

Given a vector column in a Spark DataFrame, split it into n separate scalar columns, each holding one element of the original vector column.

Usage
sdf_separate_column(x, column, into = NULL)
Arguments
x

A spark_connection, ml_pipeline, or a tbl_spark.

column

The name of a (vector-typed) column.

into

A specification of the columns that should be generated from column. This can either be a vector of column names, or an R list mapping column names to the (1-based) index at which a particular vector element should be extracted.
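
The two forms of into can be illustrated with a minimal sketch. It assumes a local Spark installation, the iris dataset copied to Spark (sparklyr replaces the dots in its column names with underscores), and ft_vector_assembler with the underscore-style argument names used by sparklyr 0.8.0 (earlier releases used dotted names such as input.col). The table name and output column names are hypothetical, chosen only for illustration.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available.
sc <- spark_connect(master = "local")

# Copy iris to Spark; "Sepal.Length" becomes "Sepal_Length", and so on.
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)

# Build a vector-typed column to separate.
assembled <- iris_tbl %>%
  ft_vector_assembler(
    input_cols = c("Sepal_Length", "Sepal_Width", "Petal_Length"),
    output_col = "features"
  )

# Form 1: a character vector of names, one per element, taken in order.
by_name <- assembled %>%
  sdf_separate_column("features", into = c("f1", "f2", "f3"))

# Form 2: a list mapping new column names to 1-based element indices,
# which allows extracting only some of the elements.
by_index <- assembled %>%
  sdf_separate_column("features", into = list(first = 1, third = 3))

spark_disconnect(sc)
```

The list form is useful when only a few elements of a long vector are needed, since the remaining elements are never materialized as columns.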

Aliases
  • sdf_separate_column
Documentation reproduced from package sparklyr, version 0.8.0, License: Apache License 2.0 | file LICENSE

Community examples

dalloliogm@gmail.com at Apr 4, 2018 sparklyr v0.7.0

This is generally used in combination with ft_regex_tokenizer to split a column containing comma-separated values (or values separated by another pattern, such as the semicolon used here) into multiple columns.

```
mydf %>%
  ft_regex_tokenizer(input.col = "mycolumn", output.col = "mycolumnSplit", pattern = ";") %>%
  sdf_separate_column("mycolumnSplit", into = c("column1", "column2"))
```