exprso (version 0.1.8)

pipeFilter: Filter ExprsPipeline Object

Description

pipeFilter subsets an ExprsPipeline object.

Usage

pipeFilter(object, colBy, how = 0, gate = 0, top = 0)

# S4 method for ExprsPipeline
pipeFilter(object, colBy, how = 0, gate = 0, top = 0)

Arguments

object

An ExprsPipeline object.

colBy

A character vector or string. Specifies column(s) to use when filtering by classifier performance. Listing multiple columns will result in a filter based on a performance metric equal to the product of those listed columns.

how, gate

A numeric scalar or string. how sets the threshold filter and gate sets the ceiling filter. Arguments between 0 and 1 will impose the filter based on the raw value of colBy. Arguments between 1 and 100 will impose the filter based on the percentile of colBy. The user may also provide "midrange", "median", or "mean" as an argument for these filters. Set how = 0 or gate = 0 to skip the threshold or ceiling filter, respectively.
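The interpretation rules above can be sketched in base R. This is only an illustration, not the package's implementation: resolve_cutoff is a hypothetical helper, the percentile is assumed to come from stats::quantile, "midrange" is assumed to mean halfway between the minimum and maximum, and a value of exactly 1 is assumed to be read as a raw value.

```r
# Hypothetical helper (not part of exprso): turn a `how` or `gate` argument
# into a raw cutoff on the performance values `perf`.
resolve_cutoff <- function(perf, arg) {
  if (is.character(arg)) {
    # Named summaries of the observed performance values
    switch(arg,
           midrange = (max(perf) + min(perf)) / 2,  # assumed definition
           median   = stats::median(perf),
           mean     = mean(perf),
           stop("unrecognized argument: ", arg))
  } else if (arg > 1) {
    # Values above 1 are read as percentiles (assumed: stats::quantile)
    unname(stats::quantile(perf, probs = arg / 100))
  } else {
    # Values up to 1 are used as raw cutoffs; 0 means "skip" for the caller
    arg
  }
}

perf <- c(0.55, 0.60, 0.70, 0.80, 0.95)
raw_cut      <- resolve_cutoff(perf, 0.75)        # raw value
pct_cut      <- resolve_cutoff(perf, 50)          # 50th percentile of perf
midrange_cut <- resolve_cutoff(perf, "midrange")  # halfway between min and max
```

With this reading, how = 50 and how = "median" resolve to the same cutoff, while how = 0.5 is compared against the raw performance values directly.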

top

A numeric scalar. Determines the top N models based on colBy to include after the threshold and ceiling filters. In the case that the @summary slot contains the column "boot", this determines the top N models for each unique bootstrap. Set top = 0 to skip this subset.

Value

An ExprsPipeline-class object.

Methods (by class)

  • ExprsPipeline: Method to filter ExprsPipeline objects.

Details

The filter process occurs in three steps. The user may skip any of these steps by setting the respective argument to 0.

First, a threshold filter gets imposed. Any model with a performance less than the threshold filter, how, gets excluded. Second, a ceiling filter gets imposed. Any model with a performance greater than the ceiling filter, gate, gets excluded. Third, an arbitrary subset occurs. The top N models in the ExprsPipeline object get selected based on the argument top. However, in the case that the @summary slot contains the column "boot", pipeFilter selects the top N models for each unique bootstrap.
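The three steps can be sketched on a plain data frame. This is a minimal illustration under stated assumptions, not the exprso source: summary_df is a made-up stand-in for the @summary slot, filter_sketch is a hypothetical function, the column name acc is invented, and the sketch handles only numeric raw cutoffs (no percentiles or named summaries).

```r
# Hypothetical stand-in for the @summary slot: one row per model.
summary_df <- data.frame(
  boot = c(1, 1, 1, 2, 2, 2),
  acc  = c(0.60, 0.85, 0.99, 0.70, 0.90, 0.80)
)

# Sketch of the three steps; cutoffs are raw values and 0 skips a step.
filter_sketch <- function(df, col, how = 0, gate = 0, top = 0) {
  perf <- df[[col]]
  keep <- rep(TRUE, nrow(df))
  if (how != 0)  keep <- keep & perf >= how   # threshold: drop perf < how
  if (gate != 0) keep <- keep & perf <= gate  # ceiling: drop perf > gate
  df <- df[keep, , drop = FALSE]
  if (top != 0 && nrow(df) > 0) {
    # Rank within each bootstrap when a "boot" column exists
    groups <- if ("boot" %in% names(df)) df$boot else rep(1, nrow(df))
    pieces <- lapply(split(df, groups), function(d) {
      d <- d[order(d[[col]], decreasing = TRUE), , drop = FALSE]
      head(d, top)
    })
    df <- do.call(rbind, pieces)
  }
  df
}

res <- filter_sketch(summary_df, "acc", how = 0.65, gate = 0.95, top = 1)
```

In this toy data, the threshold removes the 0.60 model, the ceiling removes the 0.99 model, and top = 1 then keeps the best remaining model from each bootstrap.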

pipeFilter will apply this filter for one or more performance metrics listed in the colBy argument. Listing multiple columns will result in a filter based on a performance metric equal to the product of all listed performance metrics. To weight one performance metric more heavily than another, consider listing that column more than once.
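The effect of listing a column more than once can be seen with a small base R sketch. The data frame, the column names sens and spec, and the helper combined_metric are all hypothetical; the sketch only illustrates the product rule described above.

```r
# Hypothetical performance table: one row per model.
summary_df <- data.frame(
  sens = c(0.95, 0.70, 0.80),
  spec = c(0.65, 0.90, 0.80)
)

# Hypothetical helper: multiply the listed columns row by row.
# A column listed twice contributes its square to the product.
combined_metric <- function(df, colBy) {
  Reduce(`*`, lapply(colBy, function(cn) df[[cn]]))
}

equal_weight <- combined_metric(summary_df, c("sens", "spec"))
sens_heavy   <- combined_metric(summary_df, c("sens", "sens", "spec"))

best_equal <- which.max(equal_weight)  # best model under sens * spec
best_heavy <- which.max(sens_heavy)    # best model under sens^2 * spec
```

Here the balanced third model ranks first under sens * spec, while repeating sens shifts the top rank to the first model, which has the highest sens.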

See Also

pipeFilter, pipeUnboot, plCV, plGrid, plGridMulti, plMonteCarlo, plNested