# tidypredict v0.4.0

## Run Predictions Inside the Database

It parses a fitted 'R' model object and returns a formula
in 'Tidy Eval' code that calculates the predictions.
It works with several database back-ends because it leverages 'dplyr'
and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently
supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(),
cubist(), and ctree() models.

## Readme

# tidypredict

The main goal of `tidypredict` is to enable running predictions inside
databases. It reads the model, extracts the components needed to
calculate the prediction, and then creates an R formula that can be
translated into SQL. In other words, it is able to parse a model such as
this one:

```
model <- lm(mpg ~ wt + cyl, data = mtcars)
```

`tidypredict` can return a SQL statement that is ready to run inside the
database. Because it uses `dplyr`’s database interface, it works with
several database back-ends, such as MS SQL:

```
library(tidypredict)
tidypredict_sql(model, dbplyr::simulate_mssql())
```

```
## <SQL> 39.6862614802529 + (`wt` * -3.19097213898374) + (`cyl` * -1.5077949682598)
```
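
As a hedged illustration of the back-end point, the same model can be rendered against more than one simulated connection. This sketch assumes `dbplyr` is installed; `simulate_mssql()` and `simulate_postgres()` are `dbplyr` helpers that stand in for a live connection, and the exact quoting of the output depends on the installed `dbplyr` version.

```r
library(tidypredict)

model <- lm(mpg ~ wt + cyl, data = mtcars)

# Simulated connections let dbplyr translate without a live database
sql_mssql    <- tidypredict_sql(model, dbplyr::simulate_mssql())
sql_postgres <- tidypredict_sql(model, dbplyr::simulate_postgres())

sql_mssql
sql_postgres
```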

## Installation

Install `tidypredict` from CRAN using:

```
# install.packages("tidypredict")
```

Or install the **development version** using `remotes` as follows:

```
# install.packages("remotes")
# remotes::install_github("tidymodels/tidypredict")
```

## Functions

`tidypredict` has only a few functions, and that number is not expected
to grow much. The main focus at this time is to add more models to
support.

Function | Description |
---|---|
`tidypredict_fit()` | Returns an R formula that calculates the prediction |
`tidypredict_sql()` | Returns a SQL query based on the formula from `tidypredict_fit()` |
`tidypredict_to_column()` | Adds a new column using the formula from `tidypredict_fit()` |
`tidypredict_test()` | Tests `tidypredict` predictions against the model’s native `predict()` function |
`tidypredict_interval()` | Same as `tidypredict_fit()` but for intervals (only works with `lm` and `glm`) |
`tidypredict_sql_interval()` | Same as `tidypredict_sql()` but for intervals (only works with `lm` and `glm`) |
`parse_model()` | Creates a list spec based on the R model |
`as_parsed_model()` | Prepares an object to be recognized as a parsed model |
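
A short sketch of how three of these functions might be used together on a local data frame. It assumes `dplyr` and `tidypredict` are installed; the variable names are illustrative only, and the `alert` element read from the test result is an assumption about the structure of the returned object.

```r
library(dplyr)
library(tidypredict)

model <- lm(mpg ~ wt + cyl, data = mtcars)

# An R expression (not yet SQL) that reproduces the model's prediction
fit_formula <- tidypredict_fit(model)
fit_formula

# Add the prediction as a new column; the default column name is "fit"
scored <- tidypredict_to_column(mtcars, model)
head(scored[, c("mpg", "fit")])

# Compare the formula's output against the model's own predict() method
test_result <- tidypredict_test(model)
test_result
```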

## How it works

Instead of translating directly to a SQL statement, `tidypredict`
creates an R formula. That formula can then be used inside `dplyr`. The
overall workflow is illustrated in the image above and described here:

- Fit the model using a base R model, or one from the packages listed in Supported Models
- `tidypredict` reads the model and creates a list object with the necessary components to run predictions
- `tidypredict` builds an R formula based on the list object
- `dplyr` evaluates the formula created by `tidypredict`
- `dplyr` translates the formula into a SQL statement, or any other interface
- The database executes the SQL statement(s) created by `dplyr`
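
The steps above can be sketched end to end as follows. This is a minimal illustration, not the package's internal code path: it assumes `dplyr` and `dbplyr` are installed, uses `dbplyr::lazy_frame()` with a simulated connection as a stand-in for a real database table, and stops before the final step because no live database is involved.

```r
library(dplyr)
library(tidypredict)

# Step 1: fit the model in R
model <- lm(mpg ~ wt + cyl, data = mtcars)

# Step 2: parse the model into a list spec
pm <- parse_model(model)

# Step 3: build an R formula from the spec
f <- tidypredict_fit(pm)

# Step 4: dplyr evaluates the formula on a local data frame
scored_local <- mutate(mtcars, fit = !!f)

# Step 5: dplyr/dbplyr translate the same formula into SQL.
# lazy_frame() is a stand-in for a remote table (assumption: no live database).
remote_tbl <- dbplyr::lazy_frame(wt = 1, cyl = 1, con = dbplyr::simulate_mssql())
query <- mutate(remote_tbl, fit = !!f)
rendered <- dbplyr::sql_render(query)
rendered
```

Unquoting the formula with `!!` is what lets the same expression run locally or be translated for a remote table.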

### Parsed model spec

`tidypredict` writes and reads a spec based on a model. Instead of
simply writing the R formula directly, splitting the spec from the
formula adds the following capabilities:

- No more saving models as `.rds` - Specifically for cases when the model needs to be used for predictions in a Shiny app.
- Beyond R models - Technically, anything that can write a proper spec can be read into `tidypredict`. It also means that the parsed model spec can become a good alternative to using *PMML*.
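
A minimal sketch of the spec round trip, assuming the `yaml` package (listed under Suggests) is installed. The temporary file path is hypothetical, and predictions from the reloaded spec may differ slightly from the original model's if the YAML writer rounds the coefficients.

```r
library(tidypredict)

model <- lm(mpg ~ wt + cyl, data = mtcars)
pm <- parse_model(model)

# Write the spec to a plain-text file instead of saving the model as .rds
spec_file <- tempfile(fileext = ".yml")  # hypothetical location
yaml::write_yaml(pm, spec_file)

# Later, possibly in another session or app: read the spec back
reloaded <- as_parsed_model(yaml::read_yaml(spec_file))

# The reloaded spec can be turned into a prediction formula again
f <- tidypredict_fit(reloaded)
preds <- rlang::eval_tidy(f, data = mtcars)
head(preds)
```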

## Supported models

The following models are supported by `tidypredict`:

- Linear Regression - `lm()`
- Generalized Linear model - `glm()`
- Random Forest models - `randomForest::randomForest()`
- Random Forest models, via `ranger` - `ranger::ranger()`
- MARS models - `earth::earth()`
- XGBoost models - `xgboost::xgb.Booster.complete()`
- Cubist models - `Cubist::cubist()`
- Tree models, via `partykit` - `partykit::ctree()`

### `parsnip`

`tidypredict` supports models fitted via the `parsnip` interface. The
ones confirmed to currently work in `tidypredict` are:

- `lm()` - `parsnip`: `linear_reg()` with *“lm”* as the engine.
- `randomForest::randomForest()` - `parsnip`: `rand_forest()` with *“randomForest”* as the engine.
- `ranger::ranger()` - `parsnip`: `rand_forest()` with *“ranger”* as the engine.
- `earth::earth()` - `parsnip`: `mars()` with *“earth”* as the engine.
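
As a hedged example of the first combination, a `parsnip` linear regression with the *“lm”* engine can be passed to `tidypredict_fit()`. This sketch assumes `parsnip` is installed and that `tidypredict` dispatches on the fitted `parsnip` object directly; if it does not in a given version, passing the underlying engine fit (`p_model$fit`) is the fallback.

```r
library(parsnip)
library(tidypredict)

# Specify and fit a linear regression through the parsnip interface
spec <- set_engine(linear_reg(), "lm")
p_model <- fit(spec, mpg ~ wt + cyl, data = mtcars)

# Build the prediction formula from the parsnip fit
f <- tidypredict_fit(p_model)
f

# Evaluate it locally to check it against the underlying lm object
preds <- rlang::eval_tidy(f, data = mtcars)
```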

### `broom`

The `tidy()` function from broom works with linear models parsed via
`tidypredict`:

```
library(tidypredict)
pm <- parse_model(lm(wt ~ ., mtcars))
tidy(pm)
```

```
## # A tibble: 11 x 2
## term estimate
## <chr> <dbl>
## 1 (Intercept) -0.231
## 2 mpg -0.0417
## 3 cyl -0.0573
## 4 disp 0.00669
## 5 hp -0.00323
## 6 drat -0.0901
## 7 qsec 0.200
## 8 vs -0.0664
## 9 am 0.0184
## 10 gear -0.0935
## 11 carb 0.249
```

## Functions in tidypredict

Name | Description |
---|---|
tidypredict_interval | Returns a Tidy Eval formula to calculate prediction interval |
tidypredict_test | Tests base predict function against tidypredict |
tidypredict_to_column | Adds the prediction columns to a piped command set |
tidypredict_sql_interval | Returns a SQL query with formula to calculate predicted interval |
tidypredict_sql | Returns a SQL query with formula to calculate fitted values |
knit_print.tidypredict_test | Knit print method for test predictions results |
acceptable_formula | Checks that the formula can be parsed |
tidypredict-package | tidypredict: Run Predictions Inside the Database |
parse_model | Converts an R model object into a table |
tidypredict_fit | Returns a Tidy Eval formula to calculate fitted values |
tidy.pm_regression | Tidy the parsed model results |
as_parsed_model | Prepares parsed model object |
print.tidypredict_test | print method for test predictions results |
reexports | Objects exported from other packages |

## Vignettes of tidypredict

- cubist.Rmd
- glm.Rmd
- lm.Rmd
- mars.Rmd
- non-r.Rmd
- ranger.Rmd
- regression.Rmd
- regression.csv
- rf.Rmd
- save.Rmd
- sql.Rmd
- tree.Rmd
- tree.csv
- xgboost.Rmd

## Details

Field | Value |
---|---|
License | GPL-3 |
URL | https://tidymodels.github.io/tidypredict |
BugReports | https://github.com/tidymodels/tidypredict/issues |
RoxygenNote | 6.1.1 |
Encoding | UTF-8 |
VignetteBuilder | knitr |
NeedsCompilation | no |
Packaged | 2019-07-12 16:10:49 UTC; edgar |
Repository | CRAN |
Date/Publication | 2019-07-12 22:30:03 UTC |
Suggests | covr, Cubist, DBI, dbplyr, earth, methods, mlbench, nycflights13, parsnip, partykit, randomForest, ranger, rmarkdown, RSQLite, testthat (>= 2.1.0), xgboost, yaml |
Imports | dplyr (>= 0.7), generics, knitr, purrr, rlang, tibble |
Depends | R (>= 3.1) |
