Learn R Programming

⚠️There's a newer version (1.0.11) of this package.Take me there.

tidyquant

tidyquant integrates the best quantitative resources for collecting and analyzing quantitative data, zoo, xts, quantmod and TTR, with the tidy data infrastructure of the tidyverse allowing for seamless interaction between each and working within the tidyverse.

Benefits

The tidyquant philosophy:

  • A few core functions with a lot of power, that
  • leverage the quantitative analysis power of zoo, xts, quantmod and TTR, and are
  • designed to be used and scaled with the tidyverse.

Installation

Development Version with Latest Features:

# install.packages("devtools")
devtools::install_github("mdancho84/tidyquant")

CRAN Approved Version:

install.packages("tidyquant")

Examples

Start by loading tidyquant.

# Loads tidyquant, tidyverse, lubridate, quantmod, TTR, and xts/zoo
library(tidyquant) 

Getting Data:

tq_get() is the one-stop shop for retrieving data. The full list of get options are:

tq_get_options()
#> [1] "stock.prices"   "stock.index"    "dividends"      "splits"        
#> [5] "financials"     "key.ratios"     "economic.data"  "exchange.rates"
#> [9] "metal.prices"

Stock Prices:

Set get = "stock.prices" to get stock prices. Notice the output is always a tibble.

aapl_prices <- tq_get("AAPL", get = "stock.prices")
aapl_prices
#> # A tibble: 2,522 × 7
#>          date  open  high   low close    volume adjusted
#>        <date> <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
#> 1  2007-01-03 86.29 86.58 81.90 83.80 309579900 10.90416
#> 2  2007-01-04 84.05 85.95 83.82 85.66 211815100 11.14619
#> 3  2007-01-05 85.77 86.20 84.40 85.05 208685400 11.06681
#> 4  2007-01-08 85.96 86.53 85.28 85.47 199276700 11.12147
#> 5  2007-01-09 86.45 92.98 85.15 92.57 837324600 12.04533
#> 6  2007-01-10 94.75 97.80 93.45 97.00 738220000 12.62176
#> 7  2007-01-11 95.94 96.78 95.10 95.80 360063200 12.46562
#> 8  2007-01-12 94.59 95.06 93.23 94.62 328172600 12.31207
#> 9  2007-01-16 95.68 97.25 95.45 97.10 311019100 12.63477
#> 10 2007-01-17 97.56 97.60 94.82 94.95 411565000 12.35501
#> # ... with 2,512 more rows

Financial Statements:

Set get = "financials" to get financial statements. The statements are returned as nested tibbles, that can be unnested and analyzed together.

tq_get("AAPL", get = "financials")
#> # A tibble: 3 × 3
#>    type             annual            quarter
#> * <chr>             <list>             <list>
#> 1    BS <tibble [168 × 4]> <tibble [210 × 4]>
#> 2    CF  <tibble [76 × 4]>  <tibble [76 × 4]>
#> 3    IS <tibble [196 × 4]> <tibble [245 × 4]>

Key Ratios:

Set get = "key.ratios" to get 10 years of 89 different key ratios (e.g. P/E, P/S, EPS, ROA, ROE, current ratio, debt/equity, inventory turnover, and many more), separated into seven primary sections.

tq_get("AAPL", get = "key.ratios")
#> # A tibble: 7 × 2
#>             section               data
#>               <chr>             <list>
#> 1        Financials <tibble [150 × 5]>
#> 2     Profitability <tibble [170 × 5]>
#> 3            Growth <tibble [160 × 5]>
#> 4         Cash Flow  <tibble [50 × 5]>
#> 5  Financial Health <tibble [240 × 5]>
#> 6 Efficiency Ratios  <tibble [80 × 5]>
#> 7  Valuation Ratios  <tibble [40 × 5]>

Stock Indexes:

There are 18 indexes available to select from, which can be seen with a call to tq_get_stock_index_options().

tq_get_stock_index_options()
#>  [1] "DOWJONES"    "DJI"         "DJT"         "DJU"         "SP100"      
#>  [6] "SP400"       "SP500"       "SP600"       "RUSSELL1000" "RUSSELL2000"
#> [11] "RUSSELL3000" "AMEX"        "AMEXGOLD"    "AMEXOIL"     "NASDAQ"     
#> [16] "NASDAQ100"   "NYSE"        "SOX"

Set x to one of the options and get = "stock.index" to get a full list of stocks within the specified index.

tq_get("SP500", get = "stock.index")
#> Getting data...
#> # A tibble: 501 × 2
#>    symbol                   company
#>     <chr>                     <chr>
#> 1     MMM                        3M
#> 2     ABT       ABBOTT LABORATORIES
#> 3    ABBV                ABBVIE INC
#> 4     ACN                 ACCENTURE
#> 5    ATVI       ACTIVISION BLIZZARD
#> 6     AYI             ACUITY BRANDS
#> 7    ADBE             ADOBE SYSTEMS
#> 8     AAP        ADVANCE AUTO PARTS
#> 9     AET                     AETNA
#> 10    AMG AFFILIATED MANAGERS GROUP
#> # ... with 491 more rows

Other Options:

There are many other get options including dividends, splits, economic data from the FRED, and exchange rates and metal prices from Oanda.

Working in the tidyverse

You may already know and love tidyverse packages like ggplot2, dplyr, tidyr, purrr, readr, and tibble along with lubridate for working with date and datetime. tidyquant works solely in tibbles, so all of the tidyverse functionality is intact.

A simple example inspired by Kan Nishida's blog shows the dplyr and lubridate capability: Say we want the growth in the stock over the past year. We can do this with dplyr operations.

Getting the last year is simple with dplyr and lubridate. We first select the date and adjusted price (adjusted for stock splits). We then filter using lubridate date functions. We can use the mutate to add columns to the data frame: Add the baseline price using the first function, add the growth and growth percent versus baseline columns using standard mathematical operations. We tack on a final select statement to remove unnecessary columns. The final workflow looks like this:

aapl_prices %>%
    select(date, adjusted) %>%
    filter(date >= today() - years(1)) %>%
    mutate(baseline = first(adjusted),
           growth = adjusted - baseline,
           growth_pct = growth / baseline * 100) %>%
    select(-(baseline:growth))
#> # A tibble: 253 × 3
#>          date adjusted growth_pct
#>        <date>    <dbl>      <dbl>
#> 1  2016-01-07 94.35077  0.0000000
#> 2  2016-01-08 94.84967  0.5287736
#> 3  2016-01-11 96.38550  2.1565601
#> 4  2016-01-12 97.78438  3.6391934
#> 5  2016-01-13 95.27031  0.9746004
#> 6  2016-01-14 97.35395  3.1829958
#> 7  2016-01-15 95.01597  0.7050287
#> 8  2016-01-19 94.55621  0.2177364
#> 9  2016-01-20 94.68337  0.3525186
#> 10 2016-01-21 94.20404 -0.1555144
#> # ... with 243 more rows

Transforming & Mutating Data with zoo, xts, quantmod, and TTR Functions

You may already know and love zoo, xts, quantmod, and TTR, which is why the core functionality is fully integrated. The workhorse functions, tq_transform() and tq_mutate(), apply zoo, xts, quantmod, and TTR functions to tibbles. The full list of compatible functions are shown with a call to tq_transform_fun_options(). Remove the %>% str() to expand the list.

tq_transform_fun_options() %>% str()
#> List of 4
#>  $ zoo     : chr [1:14] "rollapply" "rollapplyr" "rollmax" "rollmax.default" ...
#>  $ xts     : chr [1:27] "apply.daily" "apply.monthly" "apply.quarterly" "apply.weekly" ...
#>  $ quantmod: chr [1:25] "allReturns" "annualReturn" "ClCl" "dailyReturn" ...
#>  $ TTR     : chr [1:61] "adjRatios" "ADX" "ALMA" "aroon" ...

tq_transform

tq_transform() returns a new data set that can either be in the same periodicity or a different periodicity as the original data set. Let's use tq_transform to transform the periodicity of the aapl_prices. The quantmod OHLC codes are used to select the open, high, low, close, and volume (OHLCV) columns, which are then sent to the transformation function, xts::to.period, for transformation to monthly periodicity. We now have a much smaller data set containing the monthly prices.

aapl_prices %>%
    tq_transform(ohlc_fun = OHLCV, transform_fun = to.period, period = "months")
#> # A tibble: 121 × 6
#>          date   open   high    low  close    volume
#>        <dttm>  <dbl>  <dbl>  <dbl>  <dbl>     <dbl>
#> 1  2007-01-31  84.86  86.00  84.35  85.73 214017300
#> 2  2007-02-28  83.00  85.60  83.00  84.61 229868800
#> 3  2007-03-30  94.28  94.68  92.75  92.91 150139500
#> 4  2007-04-30 100.09 101.00  99.67  99.80 154127400
#> 5  2007-05-31 120.07 122.17 119.54 121.19 324266600
#> 6  2007-06-29 121.97 124.00 121.09 122.04 284460400
#> 7  2007-07-31 142.97 143.48 131.52 131.76 440598200
#> 8  2007-08-31 139.49 139.65 137.41 138.48 219221800
#> 9  2007-09-28 153.44 154.60 152.75 153.47 153775300
#> 10 2007-10-31 187.63 190.12 184.95 189.95 208327700
#> # ... with 111 more rows

tq_mutate

The cousin of tq_transform() is tq_mutate(). While tq_transform() produces a new, transformed data set, tq_mutate() modifies the existing data set. This is very useful for applying TTR functions like BBands, MACD, Moving Averages, etc. There is one caveat: the mutation must be in the same periodicity as the original data set (otherwise you can't add columns because the rows will not match up). Let's use tq_mutate() to add some Bollinger Bands and MACD using the closing prices (Cl in OHLC notation).

aapl_prices %>%
    tq_mutate(ohlc_fun = Cl, mutate_fun = MACD) %>%
    tq_mutate(ohlc_fun = HLC, mutate_fun = BBands)
#> # A tibble: 2,522 × 13
#>          date  open  high   low close    volume adjusted  macd signal
#>        <date> <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl>
#> 1  2007-01-03 86.29 86.58 81.90 83.80 309579900 10.90416    NA     NA
#> 2  2007-01-04 84.05 85.95 83.82 85.66 211815100 11.14619    NA     NA
#> 3  2007-01-05 85.77 86.20 84.40 85.05 208685400 11.06681    NA     NA
#> 4  2007-01-08 85.96 86.53 85.28 85.47 199276700 11.12147    NA     NA
#> 5  2007-01-09 86.45 92.98 85.15 92.57 837324600 12.04533    NA     NA
#> 6  2007-01-10 94.75 97.80 93.45 97.00 738220000 12.62176    NA     NA
#> 7  2007-01-11 95.94 96.78 95.10 95.80 360063200 12.46562    NA     NA
#> 8  2007-01-12 94.59 95.06 93.23 94.62 328172600 12.31207    NA     NA
#> 9  2007-01-16 95.68 97.25 95.45 97.10 311019100 12.63477    NA     NA
#> 10 2007-01-17 97.56 97.60 94.82 94.95 411565000 12.35501    NA     NA
#> # ... with 2,512 more rows, and 4 more variables: dn <dbl>, mavg <dbl>,
#> #   up <dbl>, pctb <dbl>

tq_tranform_xy and tq_mutate_xy

The "xy" variants are useful in situations where (1) you have two inputs (hence x and y) that don't fit into the OHLC function, or (2) you are working with a single column of non-OHLC data.

Two inputs that don't fit OHLC mold:

aapl_prices %>%
    tq_mutate_xy(x = close, y = volume, mutate_fun = EVWMA)
#> # A tibble: 2,522 × 8
#>          date  open  high   low close    volume adjusted    v1
#>        <date> <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl> <dbl>
#> 1  2007-01-03 86.29 86.58 81.90 83.80 309579900 10.90416    NA
#> 2  2007-01-04 84.05 85.95 83.82 85.66 211815100 11.14619    NA
#> 3  2007-01-05 85.77 86.20 84.40 85.05 208685400 11.06681    NA
#> 4  2007-01-08 85.96 86.53 85.28 85.47 199276700 11.12147    NA
#> 5  2007-01-09 86.45 92.98 85.15 92.57 837324600 12.04533    NA
#> 6  2007-01-10 94.75 97.80 93.45 97.00 738220000 12.62176    NA
#> 7  2007-01-11 95.94 96.78 95.10 95.80 360063200 12.46562    NA
#> 8  2007-01-12 94.59 95.06 93.23 94.62 328172600 12.31207    NA
#> 9  2007-01-16 95.68 97.25 95.45 97.10 311019100 12.63477    NA
#> 10 2007-01-17 97.56 97.60 94.82 94.95 411565000 12.35501 94.95
#> # ... with 2,512 more rows

Working with a single column of non-OHLC data:

tq_get("GDPC1", get = "economic.data") %>%
    tq_mutate_xy(x = price, mutate_fun = rollapply, width = 5, FUN = mean)
#> # A tibble: 39 × 3
#>          date   price rollapply
#>        <date>   <dbl>     <dbl>
#> 1  2007-01-01 14726.0        NA
#> 2  2007-04-01 14838.7        NA
#> 3  2007-07-01 14938.5        NA
#> 4  2007-10-01 14991.8        NA
#> 5  2008-01-01 14889.5  14876.90
#> 6  2008-04-01 14963.4  14924.38
#> 7  2008-07-01 14891.6  14934.96
#> 8  2008-10-01 14577.0  14862.66
#> 9  2009-01-01 14375.0  14739.30
#> 10 2009-04-01 14355.6  14632.52
#> # ... with 29 more rows

Scaling with the tidyverse

All functions return data sets as tibbles, which allows for interaction within the tidyverse. This means we can:

  • Use dplyr and tidyr to select, filter, nest/unnest, etc.
  • Use the pipe (%>%) for chaining operations.
  • Seamlessly scale data retrieval and transformations/mutations using purrr to map functions.

A very basic example is retrieving the stock prices for multiple stocks. We can do this by piping a tibble of stock symbols to a mutation that maps the tq_get(get = "stock.prices") function.

tibble(symbol = c("AAPL", "GOOG", "AMZN", "FB", "AVGO", "SWKS","NVDA")) %>%
    mutate(stock.prices = map(.x = symbol, ~ tq_get(.x, get = "stock.prices")))
#> # A tibble: 7 × 2
#>   symbol         stock.prices
#>    <chr>               <list>
#> 1   AAPL <tibble [2,522 × 7]>
#> 2   GOOG <tibble [2,522 × 7]>
#> 3   AMZN <tibble [2,522 × 7]>
#> 4     FB <tibble [1,167 × 7]>
#> 5   AVGO <tibble [1,869 × 7]>
#> 6   SWKS <tibble [2,522 × 7]>
#> 7   NVDA <tibble [2,522 × 7]>

Further Information

This just scratches the surface of the features. See the tidyquant vignette for further details on the package.

Copy Link

Version

Install

install.packages('tidyquant')

Monthly Downloads

19,925

Version

0.2.0

License

MIT + file LICENSE

Issues

Pull Requests

Stars

Forks

Maintainer

Matt Dancho

Last Published

January 8th, 2017

Functions in tidyquant (0.2.0)

tidyquant

tidyquant: Integrating quantitative financial analysis tools with the tidyverse
tq_mutate

Mutates quantitative data (adds new variables to existing tibble)
tq_get

Get quantitative data in tibble format
as_xts

Coerce objects to xts, designed to work with tibble and data.frame objects
as_tibble

Coerce to tibble. Enable preserving row names when coercing matrix and time-series-like objects with row names.
tq_transform

Transforms quantitative data (returns new variables in new tibble)