Following the usual conventions introduced from the world of
gene expression microarrays, a typical data matrix is constructed from
columns representing samples on which we want to make predictions
and rows representing the features used to construct the predictive
model. In this context, we define a filter to be a function
that accepts a data matrix as its only argument and returns a logical
vector, whose length equals the number of rows in the matrix, where
'TRUE' indicates features that should be retained. Most filtering
functions belong to parametrized families, with one of the most common
examples being
"retain all features whose mean is above some pre-specified cutoff".
We implement this idea using a set of function-generating functions,
whose arguments are the parameters that pick out the desired member
of the family. The return value is an instantiation of a particular
filtering function. We define filters this way so that the methods
can be applied inside cross-validation (or other) loops, where
we want to ensure that the same filtering rule is used each time.
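
To make the idea concrete, the following is a minimal sketch of a
function-generating function for the "mean above a cutoff" family. It is
written in Python purely for illustration and is not the implementation
described here; the names make_mean_filter, keep_high, and selections,
the toy data matrix, and the two-fold split are all assumptions
introduced for this example.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Convention from the text: rows are features, columns are samples.
Matrix = Sequence[Sequence[float]]
Filter = Callable[[Matrix], List[bool]]


def make_mean_filter(cutoff: float) -> Filter:
    """Hypothetical generator: pick one member of the 'mean above cutoff' family."""
    def mean_filter(data: Matrix) -> List[bool]:
        # One logical value per row; True marks a feature to retain.
        return [mean(row) > cutoff for row in data]
    return mean_filter


# Toy data matrix (assumed values): 3 features measured on 4 samples.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [0.5, 0.5, 9.0, 9.0],
]

# Instantiate the rule once, outside the loop.
keep_high = make_mean_filter(cutoff=4.0)

# Apply the same rule to the training samples of each fold.
folds = [[0, 1], [2, 3]]  # column indices held out in each fold
selections = []
for held_out in folds:
    train = [[x for j, x in enumerate(row) if j not in held_out] for row in data]
    selections.append(keep_high(train))
```

Because the cutoff is fixed when keep_high is created, every pass through
the loop applies an identical rule, even though the set of retained
features may differ from one training split to the next.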