butcher

Overview

Modeling or machine learning in R can result in fitted model objects that take up too much memory. There are two main culprits:

  1. Heavy usage of formulas and closures that capture the enclosing environment in model training
  2. Lack of selectivity in the construction of the model object itself

As a result, fitted model objects contain components that are often redundant and not required for post-fit estimation activities. The butcher package provides tooling to “axe” parts of the fitted output that are no longer needed, without sacrificing prediction functionality from the original model object.

Installation

Install the released version from CRAN:

install.packages("butcher")

Or install the development version from GitHub:

# install.packages("pak")
pak::pak("tidymodels/butcher")

Butchering

As an example, let’s wrap an lm model so it contains a lot of unnecessary stuff:

library(butcher)
our_model <- function() {
  some_junk_in_the_environment <- runif(1e6) # we didn't know about
  lm(mpg ~ ., data = mtcars)
}

This object is unnecessarily large:

library(lobstr)
obj_size(our_model())
#> 8.02 MB

When, in fact, it should only be:

small_lm <- lm(mpg ~ ., data = mtcars)
obj_size(small_lm)
#> 22.22 kB

To understand which part of our original model object is taking up the most memory, we use the weigh() function, which reports the size of each component (in MB by default):

big_lm <- our_model()
weigh(big_lm)
#> # A tibble: 25 × 2
#>    object            size
#>    <chr>            <dbl>
#>  1 terms         8.01    
#>  2 qr.qr         0.00666 
#>  3 residuals     0.00286 
#>  4 fitted.values 0.00286 
#>  5 effects       0.0014  
#>  6 coefficients  0.00109 
#>  7 call          0.000728
#>  8 model.mpg     0.000304
#>  9 model.cyl     0.000304
#> 10 model.disp    0.000304
#> # ℹ 15 more rows

The problem here is in the terms component of our big_lm. Because of how lm() is implemented in the stats package, the environment in which our model was made is carried along in the fitted output. To remove the (mostly) extraneous component, we can use butcher():

cleaned_lm <- butcher(big_lm, verbose = TRUE)
#> ✔ Memory released: 8.00 MB
#> ✖ Disabled: `print()`, `summary()`, and `fitted()`
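To make the role of the captured environment concrete, the following is a minimal sketch rather than part of the original walkthrough. It rebuilds the wrapped model, lists what the environment attached to the terms component holds, and then checks that butchering leaves predictions on new data unchanged. The names captured and new_cars are illustrative, and the sketch assumes the terms environment is stored in the ".Environment" attribute, as is standard for objects from stats.

```r
library(butcher)

# Same wrapper as above: the formula is created inside the function,
# so it keeps a reference to the function's environment.
our_model <- function() {
  some_junk_in_the_environment <- runif(1e6)
  lm(mpg ~ ., data = mtcars)
}
big_lm <- our_model()

# The environment captured by the terms component still holds the junk vector.
captured <- attr(big_lm$terms, ".Environment")
ls(captured)
#> [1] "some_junk_in_the_environment"

cleaned_lm <- butcher(big_lm)

# Does the butchered terms component still point at that environment?
identical(attr(cleaned_lm$terms, ".Environment"), captured)

# Predictions on new data should match those of the original fit.
new_cars <- head(mtcars)
all.equal(
  predict(cleaned_lm, newdata = new_cars),
  predict(big_lm, newdata = new_cars)
)
```

Supplying newdata matters here: the verbose output above reports fitted() as disabled, so the check relies on fresh predictions rather than stored fitted values.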

Comparing it against our small_lm, we find:

weigh(cleaned_lm)
#> # A tibble: 25 × 2
#>    object           size
#>    <chr>           <dbl>
#>  1 terms        0.00771 
#>  2 qr.qr        0.00666 
#>  3 residuals    0.00286 
#>  4 effects      0.0014  
#>  5 coefficients 0.00109 
#>  6 model.mpg    0.000304
#>  7 model.cyl    0.000304
#>  8 model.disp   0.000304
#>  9 model.hp     0.000304
#> 10 model.drat   0.000304
#> # ℹ 15 more rows

It now takes up about as much memory as small_lm:

weigh(small_lm)
#> # A tibble: 25 × 2
#>    object            size
#>    <chr>            <dbl>
#>  1 terms         0.00763 
#>  2 qr.qr         0.00666 
#>  3 residuals     0.00286 
#>  4 fitted.values 0.00286 
#>  5 effects       0.0014  
#>  6 coefficients  0.00109 
#>  7 call          0.000728
#>  8 model.mpg     0.000304
#>  9 model.cyl     0.000304
#> 10 model.disp    0.000304
#> # ℹ 15 more rows

To make the most of your available memory, this package provides five S3 generics that each remove one part of a model object:

  • axe_call(): To remove the call object.
  • axe_ctrl(): To remove controls associated with training.
  • axe_data(): To remove the original training data.
  • axe_env(): To remove environments.
  • axe_fitted(): To remove fitted values.

When you run butcher(), you execute all of these axing functions at once. Any kind of axing on the object will append a butchered class to the current model object class(es) as well as a new attribute named butcher_disabled that lists any post-fit estimation functions that are disabled as a result.
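The axe functions can also be applied one at a time. The sketch below is an illustration under stated assumptions, not an excerpt from the package documentation: it removes only the fitted values from a small lm fit, then inspects the class vector and the butcher_disabled attribute described above. The object names fit and no_fitted are made up for this example, and the exact contents of the attribute may differ between versions.

```r
library(butcher)

fit <- lm(mpg ~ wt + hp, data = mtcars)

# Remove only the fitted values; the call, environment, and so on are untouched.
no_fitted <- axe_fitted(fit)

# The class vector should now include a butchered class ...
class(no_fitted)

# ... and this attribute should list the functions reported as disabled.
attr(no_fitted, "butcher_disabled")

# Prediction on new data is expected to keep working after this single axe.
predict(no_fitted, newdata = data.frame(wt = 3, hp = 110))
```

Axing selectively like this can be useful when some post-fit functions still need to work, since each axe function disables only what depends on the part it removes.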

Model Object Coverage

Check out the vignette("available-axe-methods") to see butcher’s current coverage. If you are working with a new model object that could benefit from any kind of axing, we would love for you to make a pull request! You can visit the vignette("adding-models-to-butcher") for more guidelines, but in short, to contribute a set of axe methods:

  1. Run new_model_butcher(model_class = "your_object", package_name = "your_package")
  2. Use butcher helper functions weigh() and locate() to decide what to axe
  3. Finalize edits to R/your_object.R and tests/testthat/test-your_object.R
  4. Make a pull request!
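Step 2 above can be tried on any fitted object before writing new methods. The following is a rough sketch using an lm fit as a stand-in for a new model class; the name = "env" argument to locate() reflects my understanding of its interface and should be checked against the function's help page. new_model_butcher() is not called here because it creates files in the package source.

```r
library(butcher)

# An lm fit stands in for the model object you want to support.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Component sizes, one row per component, to spot what is worth axing.
sizes <- weigh(fit)
head(sizes, 3)

# Search the nested object for components whose names match "env".
locate(fit, name = "env")
```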

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Monthly Downloads: 9,350
Version: 0.4.0
License: MIT + file LICENSE

Maintainer: Max Kuhn
Last Published: December 9th, 2025

Functions in butcher (0.4.0)

  • axe-kknn: Axing a kknn.
  • axe-ranger: Axing a ranger.
  • axe-train: Axing a train object.
  • axe-train.recipe: Axing a train.recipe object.
  • butcher: Butcher an object.
  • axe-rpart: Axing a rpart.
  • butcher_example: Get path to model object example.
  • axe-recipe: Axing a recipe object.
  • axe-xrf: Axing a xrf.
  • axe-xgb.Booster: Axing a xgb.Booster.
  • axe_data: Axe data.
  • axe-pls: Axing mixOmics models.
  • axe-tabnet_fit: Axing a tabnet_fit.
  • axe-terms: Axing for terms inputs.
  • axe-randomForest: Axing a randomForest.
  • axe-survreg.penal: Axing a survreg.penal.
  • axe-multnet: Axing a multnet.
  • axe-nnet: Axing a nnet.
  • axe-survreg: Axing a survreg.
  • axe-mda: Axing a mda.
  • axe_env: Axe an environment.
  • axe-rda: Axing an rda.
  • axe_ctrl: Axe controls.
  • axe-sclass: Axing a sclass object.
  • axe-model_fit: Axing a model_fit.
  • axe-spark: Axing a spark object.
  • axe_call: Axe a call.
  • axe-rsample-data: Axe data within rsample objects.
  • new_model_butcher: New axe functions for a modeling object.
  • locate: Locate part of an object.
  • axe-rsample-indicators: Axe indicators within rsample objects.
  • axe_fitted: Axe fitted values.
  • butcher-package: Reduce the Size of Modeling Objects.
  • weigh: Weigh the object.
  • ui: Console Messages.
  • axe-elnet: Axing an elnet.
  • axe-C5.0: Axing a C5.0.
  • axe-NaiveBayes: Axing a NaiveBayes.
  • axe-coxph: Axing a coxph.
  • axe-bart: Axing a bart model.
  • axe-formula: Axing formulas.
  • axe-function: Axing functions.
  • axe-earth: Axing an earth object.
  • axe-KMeansCluster: Axing a KMeansCluster.
  • axe-flexsurvreg: Axing a flexsurvreg.
  • axe-glmnet: Axing a glmnet.
  • axe-ksvm: Axing a ksvm object.
  • axe-glm: Axing a glm.
  • axe-gausspr: Axing a gausspr.
  • axe-ipred: Axing a bagged tree.
  • axe-mass: Axing a MASS discriminant analysis object.
  • axe-kproto: Axing a kproto.
  • axe-gam: Axing a gam.
  • axe-lm: Axing an lm.