# mlr3pipelines v0.1.1


## Preprocessing Operators and Pipelines for 'mlr3'

Dataflow programming toolkit that enriches 'mlr3' with a diverse
set of pipelining operators ('PipeOps') that can be composed into graphs.
Operations exist for data preprocessing, model fitting, and ensemble
learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can
therefore be resampled, benchmarked, and tuned.

## Readme

# mlr3pipelines

Dataflow Programming for Machine Learning in R.

## What is `mlr3pipelines`?

For a 15-minute introduction, watch our useR! 2019 presentation on YouTube.

**mlr3pipelines** is a dataflow programming toolkit for machine learning in R utilising the **mlr3** package. Machine learning workflows can be written as directed “Graphs” that represent data flows between preprocessing, model fitting, and ensemble learning units in an expressive and intuitive language. Using methods from the **mlr3tuning** package, it is even possible to simultaneously optimize parameters of multiple processing units.

In principle, *mlr3pipelines* is about defining single data and model
manipulation steps as “PipeOps”:

```
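library(mlr3)           # provides lrn(), tsk(), rsmp(), resample()
library(mlr3pipelines)  # provides po(), %>>%, GraphLearner
pca = po("pca")
filter = po("filter", filter = mlr3filters::flt("variance"), filter.frac = 0.5)
learner_po = po("learner", learner = lrn("classif.rpart"))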
```
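To make the idea of a single step more concrete, the following minimal sketch trains one PipeOp on its own. It assumes that *mlr3* and *mlr3pipelines* are installed and that a PipeOp's `$train()` and `$predict()` methods take a list of inputs and return a named list of outputs; details may differ between package versions.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")
pca = po("pca")

# Training takes a list of inputs and returns a named list of outputs;
# for PCA the single output is a Task with rotated features.
trained = pca$train(list(task))
trained$output$feature_names

# Prediction reuses the rotation stored during training.
predicted = pca$predict(list(task))
predicted$output$nrow
```

The original task is left unchanged; each call returns a new Task object.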

These PipeOps can then be combined to define machine learning
pipelines. A pipeline can be wrapped in a `GraphLearner` that behaves like any
other `Learner` in `mlr3`.

```
graph = pca %>>% filter %>>% learner_po
glrn = GraphLearner$new(graph)
```
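Because a `GraphLearner` follows the `Learner` interface, it can also be trained and used for prediction directly. The sketch below uses a shorter graph (PCA followed by a decision tree) so that it does not depend on *mlr3filters*; the train/test split and the seed are arbitrary choices made for illustration, and the *rpart* package is assumed to be installed.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")

# A two-step pipeline: PCA, then a classification tree.
graph = po("pca") %>>% po("learner", learner = lrn("classif.rpart"))
glrn = GraphLearner$new(graph)

# Arbitrary hold-out split, for illustration only.
set.seed(1)
train_ids = sample(task$nrow, 100)
test_ids = setdiff(seq_len(task$nrow), train_ids)

# Preprocessing and model are fitted together on the training rows ...
glrn$train(task, row_ids = train_ids)

# ... and applied together to the held-out rows.
pred = glrn$predict(task, row_ids = test_ids)
pred$score(msr("classif.acc"))
```

The PCA rotation is estimated on the training rows only and then reapplied at prediction time, which is the main reason to wrap preprocessing and model in one learner.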

This learner can be used for resampling, benchmarking, and even tuning.

```
resample(tsk("iris"), glrn, rsmp("cv"))
#> <ResampleResult> of 10 iterations
#> * Task: iris
#> * Learner: pca.variance.classif.rpart
#> * Warnings: 0 in 0 iterations
#> * Errors: 0 in 0 iterations
```

## Feature Overview

Single computational steps can be represented as so-called **PipeOps**,
which can then be connected with directed edges in a **Graph**. The
scope of *mlr3pipelines* is still growing; currently supported features
are:

- Simple data manipulation and preprocessing operations, e.g. PCA, feature filtering
- Task subsampling for speed and outcome class imbalance handling
- *mlr3* *Learner* operations for prediction and stacking
- Simultaneous path branching (data going both ways)
- Alternative path branching (data going one specific way, controlled by hyperparameters)
- Ensemble methods and aggregation of predictions
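
As an illustration of alternative path branching from the list above, the following sketch builds a graph in which a hyperparameter decides whether the data passes through PCA or is left untouched. It assumes the `branch`, `unbranch`, and `nop` PipeOps and the `gunion()` function listed in this package's function index, and that the graph exposes the branch choice as `branch.selection`; argument names may differ between versions.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")

# Two alternative paths: "pca" applies PCA, "nop" passes data through.
# The order of options must match the order of PipeOps in gunion().
graph = po("branch", options = c("pca", "nop")) %>>%
  gunion(list(po("pca"), po("nop"))) %>>%
  po("unbranch", options = c("pca", "nop"))

# Route the data through the no-op path.
graph$param_set$values$branch.selection = "nop"
out_nop = graph$train(task)[[1]]
out_nop$feature_names

# Route the data through the PCA path instead.
graph$param_set$values$branch.selection = "pca"
out_pca = graph$train(task)[[1]]
out_pca$feature_names
```

Since the selection is an ordinary hyperparameter, it could in principle be tuned together with the parameters of the individual steps.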

## Documentation

The easiest way to get started is reading some of the vignettes that are shipped with the package, which can also be viewed online:

- Quick Introduction, with short examples to get started
- Detailed Introduction, diving into concepts and describing the objects involved
- Comparison of `mlr3pipelines` with other packages (not yet authoritative)
- Writing Custom `PipeOp`s to extend and build on top of `mlr3pipelines`

## Bugs, Questions, Feedback

*mlr3pipelines* is a free and open source software project that
encourages participation and feedback. If you have any issues,
questions, suggestions or feedback, please do not hesitate to open an
“issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behaviour (but don’t worry about this if the bug is obvious).

Please understand that the resources of the project are limited: response may sometimes be delayed by a few days, and some feature suggestions may be rejected if they are deemed too tangential to the vision behind the project.

## Similar Projects

A predecessor to this package is the *mlrCPO* package, which works with
*mlr* 2.x. Other packages that provide, to varying degrees, some
preprocessing functionality or a machine-learning domain-specific
language are the *caret* package, the related *recipes* project,
and the *dplyr* package.

## Functions in mlr3pipelines

| Name | Description |
| --- | --- |
| PipeOpImpute | PipeOpImpute |
| PipeOpTaskPreprocSimple | PipeOpTaskPreprocSimple |
| Selector | Selector Functions |
| Graph | Graph |
| PipeOp | PipeOp |
| PipeOpEnsemble | PipeOpEnsemble |
| as_graph | Conversion to mlr3pipeline Graph |
| NO_OP | No-Op Sentinel Used for Alternative Branching |
| as.data.table | Re-export of as.data.table; see data.table::as.data.table |
| add_class_hierarchy_cache | Add a Class Hierarchy to the Cache |
| mlr_pipeops_ica | PipeOpICA |
| branch | Branch Between Alternative Paths |
| mlr_pipeops_imputehist | PipeOpImputeHist |
| mlr_pipeops_nop | PipeOpNOP |
| filter_noop | Remove NO_OPs from a List |
| mlr_pipeops_pca | PipeOpPCA |
| as_pipeop | Conversion to mlr3pipeline PipeOp |
| %>>% | PipeOp Composition Operator |
| greplicate | Create Disjoint Graph Union of Copies of a Graph |
| mlr_pipeops_classbalancing | PipeOpClassBalancing |
| mlr_pipeops_chunk | PipeOpChunk |
| assert_pipeop | Assertion for mlr3pipeline PipeOp |
| assert_graph | Assertion for mlr3pipeline Graph |
| mlr_pipeops_classifavg | PipeOpClassifAvg |
| mlr_pipeops_classweights | PipeOpClassWeights |
| mlr3pipelines-package | mlr3pipelines: Preprocessing Operators and Pipelines for 'mlr3' |
| mlr_pipeops_copy | PipeOpCopy |
| mlr_learners_avg | Optimized Weighted Average of Features for Classification and Regression |
| mlr_pipeops_encode | PipeOpEncode |
| mlr_learners_graph | GraphLearner |
| mlr_pipeops_colapply | PipeOpColApply |
| mlr_pipeops_removeconstants | PipeOpRemoveConstants |
| mlr_pipeops | Dictionary of PipeOps |
| mlr_pipeops_collapsefactors | PipeOpCollapseFactors |
| mlr_pipeops_yeojohnson | PipeOpYeoJohnson |
| mlr_pipeops_unbranch | PipeOpUnbranch |
| mlr_pipeops_imputemean | PipeOpImputeMean |
| mlr_pipeops_scale | PipeOpScale |
| mlr_pipeops_imputemedian | PipeOpImputeMedian |
| gunion | Disjoint Union of Graphs |
| mlr_pipeops_spatialsign | PipeOpSpatialSign |
| mlr_pipeops_subsample | PipeOpSubsample |
| mlr_pipeops_modelmatrix | PipeOpModelMatrix |
| po | Shorthand PipeOp Constructor |
| mlr_pipeops_mutate | PipeOpMutate |
| register_autoconvert_function | Add Autoconvert Function to Conversion Register |
| is_noop | Test for NO_OP |
| mlr_pipeops_encodelmer | Impact Encoding with Random Intercept Models |
| mlr_pipeops_encodeimpact | Conditional Target Value Impact Encoding |
| mlr_pipeops_kernelpca | PipeOpKernelPCA |
| mlr_pipeops_learner | PipeOpLearner |
| mlr_pipeops_featureunion | PipeOpFeatureUnion |
| mlr_pipeops_regravg | PipeOpRegrAvg |
| mlr_pipeops_quantilebin | PipeOpQuantileBin |
| mlr_pipeops_learner_cv | PipeOpLearnerCV |
| mlr_pipeops_histbin | PipeOpHistBin |
| mlr_pipeops_branch | PipeOpBranch |
| mlr_pipeops_boxcox | PipeOpBoxCox |
| mlr_pipeops_missind | PipeOpMissInd |
| mlr_pipeops_fixfactors | PipeOpFixFactors |
| mlr_pipeops_scalemaxabs | PipeOpScaleMaxAbs |
| mlr_pipeops_scalerange | PipeOpScaleRange |
| mlr_pipeops_smote | PipeOpSmote |
| mlr_pipeops_select | PipeOpSelect |
| mlr_pipeops_filter | PipeOpFilter |
| reset_autoconvert_register | Reset Autoconvert Register |
| reset_class_hierarchy_cache | Reset the Class Hierarchy Cache |
| mlr_pipeops_imputenewlvl | PipeOpImputeNewlvl |
| mlr_pipeops_imputesample | PipeOpImputeSample |
| PipeOpTaskPreproc | PipeOpTaskPreproc |

## Vignettes of mlr3pipelines

- figures/po_multi_alone.png
- figures/po_multi_viz.png
- figures/po_viz.png
- comparison_mlr3pipelines_mlr_sklearn.Rmd
- introduction.Rmd


## Details

| Field | Value |
| --- | --- |
| License | LGPL-3 |
| URL | https://mlr3pipelines.mlr-org.com, https://github.com/mlr-org/mlr3pipelines |
| BugReports | https://github.com/mlr-org/mlr3pipelines/issues |
| VignetteBuilder | knitr |
| ByteCompile | true |
| Encoding | UTF-8 |
| LazyData | true |
| NeedsCompilation | no |
| RoxygenNote | 6.1.1 |
| Collate | 'Graph.R' 'GraphLearner.R' 'mlr_pipeops.R' 'utils.R' 'PipeOp.R' 'PipeOpEnsemble.R' 'LearnerAvg.R' 'NO_OP.R' 'PipeOpTaskPreproc.R' 'PipeOpBoxCox.R' 'PipeOpBranch.R' 'PipeOpChunk.R' 'PipeOpClassBalancing.R' 'PipeOpClassWeights.R' 'PipeOpClassifAvg.R' 'PipeOpColApply.R' 'PipeOpCollapseFactors.R' 'PipeOpCopy.R' 'PipeOpEncode.R' 'PipeOpEncodeImpact.R' 'PipeOpEncodeLmer.R' 'PipeOpFeatureUnion.R' 'PipeOpFilter.R' 'PipeOpFixFactors.R' 'PipeOpHistBin.R' 'PipeOpICA.R' 'PipeOpImpute.R' 'PipeOpImputeHist.R' 'PipeOpImputeMean.R' 'PipeOpImputeMedian.R' 'PipeOpImputeNewlvl.R' 'PipeOpImputeSample.R' 'PipeOpKernelPCA.R' 'PipeOpLearner.R' 'PipeOpLearnerCV.R' 'PipeOpMissingIndicators.R' 'PipeOpModelMatrix.R' 'PipeOpMutate.R' 'PipeOpNOP.R' 'PipeOpPCA.R' 'PipeOpQuantileBin.R' 'PipeOpRegrAvg.R' 'PipeOpRemoveConstants.R' 'PipeOpScale.R' 'PipeOpScaleMaxAbs.R' 'PipeOpScaleRange.R' 'PipeOpSelect.R' 'PipeOpSmote.R' 'PipeOpSpatialSign.R' 'PipeOpSubsample.R' 'PipeOpUnbranch.R' 'PipeOpYeoJohnson.R' 'Selector.R' 'assert_graph.R' 'greplicate.R' 'gunion.R' 'operators.R' 'po.R' 'reexports.R' 'typecheck.R' 'zzz.R' |
| Packaged | 2019-10-28 19:07:19 UTC; user |
| Repository | CRAN |
| Date/Publication | 2019-10-29 07:00:02 UTC |
| Imports | backports, checkmate, data.table, digest, mlr3 (>= 0.1.4), mlr3misc (>= 0.1.4), paradox, R6, withr |
| Suggests | bestNormalize, fastICA, ggplot2, glmnet, igraph, kernlab, knitr, lgr, lme4, mlbench, mlr3filters, mlr3learners, nloptr, rmarkdown, rpart, smotefamily, testthat, visNetwork |
| Depends | R (>= 3.1.0) |
| Contributors | Michel Lang, Bernd Bischl, Florian Pfisterer, Susanne Dandl |
