rquery (version 1.4.99)

normalize_cols: Build an optree pipeline that normalizes a set of columns so each column sums to one in each partition.

Description

This is an example of building a reusable pipeline fragment from relop nodes.

Usage

normalize_cols(source, columns, ..., partitionby = NULL, env = parent.frame())

Arguments

source

relop tree or data.frame source.

columns

character, columns to normalize.

...

force later arguments to bind by name.

partitionby

character, names of the columns that define the partitions (window-function partitioning).

env

environment to look for values in.
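
As a rough illustration of what the arguments mean, the sketch below reproduces the intended arithmetic in base R on a small local data.frame. It does not use rquery and is not the package's implementation. The data values are made up, with probability playing the role of columns and subjectID the role of partitionby.

```r
# Toy data; the values are hypothetical and chosen only for illustration.
d <- data.frame(
  subjectID   = c(1, 1, 2, 2),
  probability = c(1, 3, 2, 2)
)

# Divide each value by the sum of its partition
# (one partition per distinct subjectID).
d$probability <- d$probability / ave(d$probability, d$subjectID, FUN = sum)

print(d)
```

After this step the probability values sum to one within each subjectID, which is the property normalize_cols is described as producing for each listed column.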

Examples

# by-hand logistic regression example
library(rquery)

scale <- 0.237
# table description only (column names, no data)
d <- mk_td("survey_table",
           c("subjectID", "surveyCategory", "assessmentTotal"))
optree <- d %.>%
  extend(.,
         probability %:=%
           exp(assessmentTotal * scale))  %.>%
  normalize_cols(.,
                 "probability",
                 partitionby = 'subjectID') %.>%
  pick_top_k(.,
             partitionby = 'subjectID',
             orderby = c('probability', 'surveyCategory'),
             reverse = c('probability')) %.>%
  rename_columns(., 'diagnosis' %:=% 'surveyCategory') %.>%
  select_columns(., c('subjectID',
                      'diagnosis',
                      'probability')) %.>%
  orderby(., 'subjectID')
cat(format(optree))
