4.1 The Building Blocks: PipeOps

The building blocks of mlr3pipelines are PipeOp objects (POs). They can be constructed directly using PipeOp<NAME>$new(), but the recommended way is to retrieve them from the mlr_pipeops dictionary:

library("mlr3pipelines")
as.data.table(mlr_pipeops)

## # A tibble: 42 x 8
##    key   packages input.num output.num input.type.train input.type.pred…
##    <chr> <list>       <int>      <int> <list>           <list>          
##  1 boxc… <chr [1…         1          1 <chr [1]>        <chr [1]>       
##  2 bran… <chr [0…         1         NA <chr [1]>        <chr [1]>       
##  3 chunk <chr [0…         1         NA <chr [1]>        <chr [1]>       
##  4 clas… <chr [0…         1          1 <chr [1]>        <chr [1]>       
##  5 clas… <chr [1…        NA          1 <chr [1]>        <chr [1]>       
##  6 clas… <chr [0…         1          1 <chr [1]>        <chr [1]>       
##  7 cola… <chr [0…         1          1 <chr [1]>        <chr [1]>       
##  8 coll… <chr [0…         1          1 <chr [1]>        <chr [1]>       
##  9 copy  <chr [0…         1         NA <chr [1]>        <chr [1]>       
## 10 enco… <chr [1…         1          1 <chr [1]>        <chr [1]>       
## # … with 32 more rows, and 2 more variables: output.type.train <list>,
## #   output.type.predict <list>
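Because the table above is truncated, it can help to query the dictionary keys directly. The following is a minimal sketch; the pattern "^classif" is only an illustrative choice, and the set of keys returned depends on the installed version of mlr3pipelines.

```r
library("mlr3pipelines")

# All keys registered in the dictionary, as a character vector
keys = mlr_pipeops$keys()
head(keys)

# Check whether a particular PipeOp is available
"pca" %in% keys

# Keys can also be filtered with a regular expression
mlr_pipeops$keys("^classif")
```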

Single POs can be created using mlr_pipeops$get(<name>):

pca = mlr_pipeops$get("pca")

or using syntactic sugar:

pca = po("pca")
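Once constructed, a PO can be inspected before it is used. The sketch below assumes the `pca` object from above and only looks at a few of its fields; the exact hyperparameters listed depend on the installed package version.

```r
library("mlr3pipelines")

pca = po("pca")

# Identifier of the PipeOp (defaults to the dictionary key)
pca$id

# Input and output channels, with the types accepted during train and predict
pca$input
pca$output

# Names of the available hyperparameters
pca$param_set$ids()
```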

Some POs require additional arguments for construction:

learner = mlr_pipeops$get("learner")

# Error in as_learner(learner) : argument "learner" is missing, with no default

learner = mlr_pipeops$get("learner", mlr_learners$get("classif.rpart"))

or in short po("learner", lrn("classif.rpart")).
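As a small sketch of the short form, the wrapped learner remains accessible from the resulting PO. The variable name `lrn_po` is chosen here for illustration only, and the example assumes the rpart package is installed.

```r
library("mlr3")
library("mlr3pipelines")

# Wrap a learner in a PipeOp using the short form
lrn_po = po("learner", lrn("classif.rpart"))

# The PipeOp takes its id from the wrapped learner
lrn_po$id

# The wrapped learner itself
lrn_po$learner
```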

Hyperparameters of POs can be set through the param_vals argument. Here we set the fraction of features for a filter:

filter = mlr_pipeops$get("filter",
  filter = mlr3filters::FilterVariance$new(),
  param_vals = list(filter.frac = 0.5))

or in short notation:

po("filter", mlr3filters::FilterVariance$new(), filter.frac = 0.5)
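Hyperparameters can also be read and changed after construction through the PO's parameter set. The following sketch assumes mlr3filters is installed; it uses `flt("variance")` as a shorthand for `FilterVariance$new()`, and the name `flt_po` is purely illustrative.

```r
library("mlr3pipelines")
library("mlr3filters")

# Construct the filter PipeOp with an initial hyperparameter value
flt_po = po("filter", flt("variance"), filter.frac = 0.5)

# Read the current value
flt_po$param_set$values$filter.frac

# Change it after construction
flt_po$param_set$values$filter.frac = 0.25
flt_po$param_set$values$filter.frac
```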

The figure below shows an example PipeOp. It takes an input, transforms it during $train() and $predict(), and returns data:

knitr::include_graphics("images/po_viz.png")
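The train and predict behaviour described above can be sketched with the PCA PipeOp. This example assumes the iris task shipped with mlr3 as input; note that both methods take a list of inputs and return a list of outputs.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pca = po("pca")

# Training fits the PCA rotation and returns the transformed task in a list
out_train = pca$train(list(task))
out_train[[1]]$feature_names

# The fitted state is stored in the PipeOp
pca$is_trained

# Prediction applies the stored rotation to (new) data
out_pred = pca$predict(list(task))
out_pred[[1]]$head()
```

Here the same task is used for training and prediction only to keep the sketch short; in practice the prediction input would be held-out data.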