Aggregates features from all input tasks by cbind()ing them together into a single
Task.
DataBackend primary keys and Task targets have to be equal
across all Tasks. Only the target column(s) of the first Task
are kept.
If assert_targets_equal is TRUE, target column names are compared and an error is thrown
if they differ across inputs.
If input tasks share feature names but the values of those features are not identical, an error is thrown. The check first compares feature names and, if duplicate names are found, also compares the values of the features in question. Features that are true duplicates are added to the output task only once.
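The duplicate handling described above can be sketched as follows. This is a minimal example, assuming mlr3 and mlr3pipelines are installed; the ids "nop1" and "nop2" and the variable names are illustrative. Both branches pass the iris task through unchanged, so every feature arrives twice with identical values and should appear only once in the result.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Two branches that both pass the task through unchanged. Distinct ids are
# used because PipeOp ids must be unique within a Graph.
graph = gunion(list(
  po("nop", id = "nop1"),
  po("nop", id = "nop2")
)) %>>% po("featureunion")

out = graph$train(task)[[1]]

# Identical features from both branches are expected to be kept only once.
print(out$feature_names)
```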
PipeOpFeatureUnion$new(innum = 0, collect_multiplicity = FALSE, id = "featureunion", param_vals = list(), assert_targets_equal = TRUE)
innum :: numeric(1) | character
Determines the number of input channels.
If innum is 0 (default), a vararg input channel is created that can take an arbitrary number
of inputs. If innum is a character vector, the number of input channels is the length of
innum, and the columns of the result are prefixed with the values.
collect_multiplicity :: logical(1)
If TRUE, the input is a Multiplicity collecting channel: a single
Multiplicity input is accepted instead of multiple normal inputs, and its members are aggregated. This requires innum to be 0.
Default is FALSE.
id :: character(1)
Identifier of the resulting object, default "featureunion".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise
be set during construction. Default list().
assert_targets_equal :: logical(1)
If assert_targets_equal is TRUE (default), task target column names are checked for
agreement. Disagreeing target column names are usually a bug, so this should usually be left at
the default.
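As a rough illustration of the innum argument, the sketch below constructs the PipeOp with a character innum and inspects the result. It assumes mlr3 and mlr3pipelines are installed; the prefixes "a" and "b" and the variable names are arbitrary choices for the example.

```r
library("mlr3")
library("mlr3pipelines")

# Character innum: one input channel per element, and the columns coming
# through each channel are prefixed with the corresponding value.
op = po("featureunion", innum = c("a", "b"))

out = op$train(list(tsk("iris"), tsk("iris")))[[1]]

# The four iris features arrive through each channel under different
# prefixes, so they no longer clash and all of them are kept.
print(out$feature_names)

# Only the target of the first task is kept.
print(out$target_names)
```

Without prefixes, the same two inputs would be treated as duplicates and collapse to the four original features.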
PipeOpFeatureUnion has multiple input channels depending on the innum construction
argument, named "input1", "input2", ... if innum is nonzero; if innum is 0, there is
only one vararg input channel named "...". All input channels take a Task
both during training and prediction.
PipeOpFeatureUnion has one output channel named "output", producing a Task
both during training and prediction.
The output is a Task constructed by cbind()ing all features from all input
Tasks, both during training and prediction.
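The channel layout described above can be inspected through the $input and $output tables that every PipeOp carries. A small sketch, assuming mlr3pipelines is installed (the variable names are illustrative):

```r
library("mlr3pipelines")

# Default (innum = 0): a single vararg input channel.
vararg_names = po("featureunion")$input$name
print(vararg_names)

# Nonzero innum: one numbered input channel per input.
fixed_names = po("featureunion", innum = 2)$input$name
print(fixed_names)

# There is always exactly one output channel.
output_name = po("featureunion")$output$name
print(output_name)
```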
The $state is left empty (list()).
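A quick way to confirm this is to train the PipeOp and look at $state afterwards. The sketch assumes mlr3 and mlr3pipelines are installed and uses two copies of the iris task purely as placeholder inputs.

```r
library("mlr3")
library("mlr3pipelines")

op = po("featureunion")
op$train(list(tsk("iris"), tsk("iris")))

# Nothing is learned during training, so the state should be an empty list.
print(op$state)
```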
PipeOpFeatureUnion has no Parameters.
PipeOpFeatureUnion uses the Task $cbind() method to bind the data of all inputs
beyond the first to the first Task. This means that if the Tasks
are database-backed, all of them except the first are fetched into R memory. This
behaviour may change in the future.
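To illustrate the underlying mechanism, the sketch below calls the Task $cbind() method from mlr3 directly. It is not the PipeOp's actual code path, only a demonstration of the method it relies on; the column Sepal.Ratio is a made-up feature, and the rows of the new data are assumed to be in the order of the task's row ids.

```r
library("mlr3")

task = tsk("iris")
n_before = length(task$feature_names)

# Hypothetical extra feature held in R memory; rows follow the order of
# task$row_ids because no primary key column is supplied.
extra = data.frame(Sepal.Ratio = iris$Sepal.Length / iris$Sepal.Width)

# $cbind() modifies the task in place and registers the new column as a feature.
task$cbind(extra)
print(task$feature_names)
```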
Only fields inherited from PipeOp.
Only methods inherited from PipeOp.
Other PipeOps:
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreprocSimple,
PipeOpTaskPreproc,
PipeOp,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encode,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_scale,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson,
mlr_pipeops
Other Multiplicity PipeOps:
Multiplicity(),
PipeOpEnsemble,
mlr_pipeops_classifavg,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_regravg,
mlr_pipeops_replicate
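library("mlr3")
library("mlr3pipelines")

# Union the original iris features with their principal components.
task1 = tsk("iris")
gr = gunion(list(
  po("nop"),
  po("pca")
)) %>>% po("featureunion")
gr$train(task1)

# Two named input channels; result columns are prefixed with "a" and "b".
task2 = tsk("iris")
task3 = tsk("iris")
po_union = po("featureunion", innum = c("a", "b"))
po_union$train(list(task2, task3))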