datapasta 3.1.0 'Leave to Simmer'

The Goods

Introducing datapasta

datapasta is about reducing resistance associated with copying and pasting data to and from R. It is a response to the realisation that I often found myself using intermediate programs like Sublime to munge text into suitable formats. Addins and functions in datapasta support a wide variety of input and output situations, so it (probably) "just works". Hopefully tools in this package will remove such intermediate steps and associated frustrations from our data slinging workflows.

Prerequisites

  • Linux users will need to install either xsel or xclip. These applications provide an interface to X selections (clipboard-like).
    • For example: sudo apt-get install xsel - it's 72kb...
  • Windows and macOS users have nothing extra to do.

Installation

  1. Get the package: install.packages("datapasta")
  2. Set the keyboard shortcuts using Tools -> Addins -> Browse Addins, then click Keyboard Shortcuts...

Usage

Use with RStudio

Getting data into source

At the moment this package contains these RStudio addins that paste data to the cursor:

  • tribble_paste which pastes a table as a nicely formatted call to tibble::tribble()
    • Recommend Ctrl + Shift + t as shortcut.
    • Table can be delimited with tab, comma, pipe or semicolon.
  • vector_paste which will paste delimited data as a vector definition, e.g. c("a", "b").
    • Recommend Ctrl + Alt + Shift + v as shortcut.
  • vector_paste_vertical which will paste delimited data as a vertically formatted vector definition.
    • Recommend Ctrl + Shift + v as shortcut.
    • Example output:
c("Mint",
  "Fedora",
  "Debian",
  "Ubuntu",
  "OpenSUSE")
  • df_paste which pastes a table on the clipboard as a standard data.frame definition rather than a tribble call. This has certain advantages in the context of reproducible examples and educational posts. Many thanks to Jonathan Carroll for getting this rolling and coding the bulk of the feature.
    • Recommend Ctrl + Alt + Shift + d as shortcut.
  • dt_paste which is the same as df_paste, but for data.table.

Massaging data in source

There are two Addins that can help with creating and aligning data in your editor:

  • Fiddle Selection will perform magic on a selection. It can be used to:

    • Turn raw data delimited by any combination of commas, spaces, and newlines into a c() expression
    • Pivot a c() expr between horizontal and vertical layout.
    • Reflow messy tribble() and data.frame() exprs.
    • Recommend Ctrl + Shift + f as shortcut.
  • Toggle Vector Quotes will toggle a c() expr between all elements wrapped in "" and all bare unquoted form. Handy in combination with above to save mucho keystrokes.

    • Recommend Ctrl + Shift + q as shortcut.

Getting Data out of an R session

There are two R functions available that accept R objects and output formatted text for pasting to a reprex or other application:

  • dpasta accepts tibbles, data.frames, and vectors. Data is output in a format that matches its input class. Formatted text is pasted at the cursor.

  • dmdclip accepts the same inputs as dpasta but places the formatted text on the clipboard, with each line preceded by 4 spaces so that it can be pasted as a preformatted block to GitHub, Stack Overflow, etc.
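
As a rough illustration of the two output functions, the sketch below passes a small data.frame and a vector to dpasta. The object name `distros` and its contents are made up for this example. The guards assume datapasta may not be installed and that a clipboard may not be reachable; outside RStudio the formatted text should appear on the console rather than at a cursor.

```r
# Hypothetical example data; any data.frame, tibble or vector should do.
distros <- data.frame(
  name = c("Mint", "Fedora", "Debian"),
  rank = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Guarded so the script still runs where datapasta is not installed.
if (requireNamespace("datapasta", quietly = TRUE)) {
  # A data.frame goes in, a data.frame() definition comes out.
  datapasta::dpasta(distros)

  # A vector goes in, a c() definition comes out.
  datapasta::dpasta(distros$name)

  # dmdclip() writes to the clipboard, so only call it when one is reachable.
  if (requireNamespace("clipr", quietly = TRUE) && clipr::clipr_available()) {
    datapasta::dmdclip(distros)
  }
}
```

The clipboard check uses clipr::clipr_available(), which typically reports FALSE in non-interactive sessions, so the dmdclip() call is simply skipped there.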

Use with other editors

The only hard dependency of datapasta is readr for type guessing. All the above *paste functions can be called directly instead of as an addin, and will fall back to console output if the rstudioapi is not available.
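
A minimal sketch of calling the paste functions directly, assuming each accepts an R object as its first argument in place of clipboard contents. The `scores` table is invented for illustration. In a plain R console or script, with no RStudio session available, the output is expected to be printed rather than inserted at a cursor.

```r
# Hypothetical input table used only for this illustration.
scores <- data.frame(
  player = c("a", "b"),
  score = c(1.5, 2),
  stringsAsFactors = FALSE
)

# Guarded so the script still runs where datapasta is not installed.
if (requireNamespace("datapasta", quietly = TRUE)) {
  # Same table, two output styles.
  datapasta::tribble_paste(scores)  # tibble::tribble() call
  datapasta::df_paste(scores)       # data.frame() call

  # One element per line, as in the vertical example output shown earlier.
  datapasta::vector_paste_vertical(c("Mint", "Fedora", "Debian"))
}
```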

On a system without access to the clipboard (or without clipr installed) datapasta can still be used to output R objects from an R session. dpasta is probably the only function you care about in this scenario.

Custom Installation

datapasta imports clipr and rstudioapi so as to make installation smooth and easy for most users. If you wish to avoid installing an rstudioapi you will never use, you can use:

  • install.packages("datapasta", dependencies = "Depends")
  • Followed by install.packages("clipr") to enable clipboard features.

Pitfalls

  • tribble_paste works well with CSVs, Excel files, and HTML tables, but is currently brittle with respect to irregular table structures like merged cells or multi-line column headings. For some reason Wikipedia seems chock full of these. :(
  • Quoted CSV data where the quoted fields contain commas will not be parsed correctly.
  • Nested list columns have limited support with tribble_paste()/dpasta(). Nested lists of length 1 fail unless all are length 1 - It's complicated. You still get some output so it might be viable to fix and reflow with Fiddle Selection. Tread with caution.
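
To see why quoted fields containing commas are awkward, here is a base R illustration. It is not datapasta's parser, only a sketch of the general problem: splitting on every comma cuts a quoted field in two, while a quote-aware reader keeps it whole. The sample line is invented.

```r
# A row whose second field is quoted and contains a comma.
line <- 'id,"Smith, Jane",42'

# Naive approach: split on every comma, ignoring the quotes.
naive <- strsplit(line, ",", fixed = TRUE)[[1]]
length(naive)  # 4 pieces: the quoted field has been split

# Quote-aware approach: base R's CSV reader respects the quotes.
parsed <- utils::read.csv(text = line, header = FALSE, stringsAsFactors = FALSE)
ncol(parsed)   # 3 columns
parsed$V2      # "Smith, Jane"
```

One possible workaround, under the assumption that the data can be read into R first: parse it with a quote-aware reader such as read.csv, then hand the resulting data.frame to dpasta.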

Prior art

This package is made possible by mdlincoln's clipr, and Hadley's packages tibble and readr (for data-type guessing). I especially appreciate clipr's thoughtful approach to the clipboard on Linux, which pretty much every other R clipboard package just nope'd out on.

Future developments

I am interested in expanding the types of objects supported by the output functions dpasta and dmdclip. I would also like to eventually have Fiddle Selection pivot function calls and named vectors. Feel free to contribute your ideas to the open issues.

Bonus

0 to datapasta in 64 seconds via a video vignette:

Monthly Downloads: 1,366
Version: 3.1.0
License: MIT + file LICENSE
Maintainer: Miles McBain
Last Published: January 17th, 2020

Functions in datapasta (3.1.0)

  • df_paste
  • read_clip_tbl_guess: read_clip_table_guess
  • nchar_type
  • pad_to
  • vector_paste_vertical
  • parse_vector
  • zzz_rs_dfiddle: dfiddle
  • df_format
  • dmdclip
  • dp_set_max_rows
  • render_type
  • vector_format
  • vector_construct_vertical
  • tribble_paste
  • nquote_str: Count the number of quotes in a string
  • vector_construct
  • render_type_pad_to
  • dpasta
  • tribble_construct
  • dp_set_decimal_mark
  • tortellini: wrap the datapasta around itself
  • vector_format_vertical
  • tribble_format
  • dt_format
  • vector_paste
  • guess_sep
  • guess_output_context
  • zzz_rs_toggle_quotes: Toggle Quotes
  • dt_paste
  • clipboard_context: custom_context
  • dfdt_construct