tibble v1.2

by Kirill Müller

Simple Data Frames

Provides a 'tbl_df' class that offers better checking and printing capabilities than traditional data frames.

Readme

tibble

tibble implements a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not. It extracts these basic ideas out of dplyr, which is now more clearly focused on data manipulation. tibble provides a lighter-weight package for the basic care and feeding of tbl_df's, aka "tibble diffs" or just "tibbles". Tibbles are data.frames with nicer behavior around printing, subsetting, and factor handling.

Creating tibbles

You can create a tibble from an existing object with as_tibble():

library(tibble)
as_tibble(iris)
#> # A tibble: 150 × 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#> 1           5.1         3.5          1.4         0.2  setosa
#> 2           4.9         3.0          1.4         0.2  setosa
#> 3           4.7         3.2          1.3         0.2  setosa
#> 4           4.6         3.1          1.5         0.2  setosa
#> 5           5.0         3.6          1.4         0.2  setosa
#> 6           5.4         3.9          1.7         0.4  setosa
#> 7           4.6         3.4          1.4         0.3  setosa
#> 8           5.0         3.4          1.5         0.2  setosa
#> 9           4.4         2.9          1.4         0.2  setosa
#> 10          4.9         3.1          1.5         0.1  setosa
#> # ... with 140 more rows

This works for reasonable inputs that are already data frames, lists, matrices, or tables.
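As a minimal sketch of the list and matrix cases, the snippet below coerces one of each. The object names (`from_list`, `m`, `from_matrix`) are illustrative only, and the matrix is given explicit column names so the result does not depend on how unnamed columns are labelled.

```r
library(tibble)

# A named list of equal-length vectors becomes one column per element.
from_list <- as_tibble(list(x = 1:3, y = c("a", "b", "c")))

# A matrix becomes one column per matrix column. Column names are set
# explicitly here (an assumption made to keep the example predictable).
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
from_matrix <- as_tibble(m)

dim(from_list)
#> [1] 3 2
dim(from_matrix)
#> [1] 2 3
```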

You can also create a new tibble from vectors that represent the columns with tibble():

tibble(x = 1:5, y = 1, z = x ^ 2 + y)
#> # A tibble: 5 × 3
#>       x     y     z
#>   <int> <dbl> <dbl>
#> 1     1     1     2
#> 2     2     1     5
#> 3     3     1    10
#> 4     4     1    17
#> 5     5     1    26

tibble() does much less than data.frame(): it never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates row.names(). You can read more about these features in the vignette, vignette("tibble").
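To illustrate two of these points, the sketch below compares how data.frame() and tibble() treat a non-syntactic column name, and checks that a character vector stays a character vector. The column names are made up for the example; the data.frame() behavior shown relies on its default check.names = TRUE.

```r
library(tibble)

# Base R rewrites non-syntactic names by default (check.names = TRUE).
df_names <- names(data.frame(`a b` = 1))
df_names
#> [1] "a.b"

# tibble() leaves the name untouched.
tb_names <- names(tibble(`a b` = 1))
tb_names
#> [1] "a b"

# Character input stays character; it is not converted to a factor.
s_class <- class(tibble(s = c("x", "y"))$s)
s_class
#> [1] "character"
```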

You can define a tibble row-by-row with tribble():

tribble(
  ~x, ~y,  ~z,
  "a", 2,  3.6,
  "b", 1,  8.5
)
#> # A tibble: 2 × 3
#>       x     y     z
#>   <chr> <dbl> <dbl>
#> 1     a     2   3.6
#> 2     b     1   8.5

You can see why this variant of the data.frame is called a "tibble diff" from its class:

class(as_tibble(iris))
#> [1] "tbl_df"     "tbl"        "data.frame"

Tibbles vs data frames

There are two main differences in the usage of a data frame vs a tibble: printing, and subsetting.

Tibbles have a refined print method that shows only the first 10 rows, and all the columns that fit on screen. This makes it much easier to work with large data. In addition to its name, each column reports its type, a nice feature borrowed from str():

library(nycflights13)
flights
#> # A tibble: 336,776 × 19
#>     year month   day dep_time sched_dep_time dep_delay arr_time
#>    <int> <int> <int>    <int>          <int>     <dbl>    <int>
#> 1   2013     1     1      517            515         2      830
#> 2   2013     1     1      533            529         4      850
#> 3   2013     1     1      542            540         2      923
#> 4   2013     1     1      544            545        -1     1004
#> 5   2013     1     1      554            600        -6      812
#> 6   2013     1     1      554            558        -4      740
#> 7   2013     1     1      555            600        -5      913
#> 8   2013     1     1      557            600        -3      709
#> 9   2013     1     1      557            600        -3      838
#> 10  2013     1     1      558            600        -2      753
#> # ... with 336,766 more rows, and 12 more variables: sched_arr_time <int>,
#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
#> #   minute <dbl>, time_hour <dttm>

Tibbles are strict about subsetting. If you try to access a variable that does not exist via $, you'll get a warning:

flights$yea
#> Warning: Unknown column 'yea'
#> NULL
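For comparison, a plain data frame silently completes the partial name. The sketch below uses a small one-column data frame (`df`, an illustrative stand-in for flights) so that it runs without nycflights13.

```r
library(tibble)

df <- data.frame(year = 2013)
tb <- as_tibble(df)

# A base data frame partially matches `yea` to `year`.
df$yea
#> [1] 2013

# The tibble refuses the partial match: it warns and returns NULL.
# The warning is suppressed here only to keep the output short.
is.null(suppressWarnings(tb$yea))
#> [1] TRUE
```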

Tibbles also clearly delineate [ and [[: [ always returns another tibble, [[ always returns a vector. No more drop = FALSE!

class(iris[ , 1])
#> [1] "numeric"
class(iris[ , 1, drop = FALSE])
#> [1] "data.frame"
class(as_tibble(iris)[ , 1])
#> [1] "tbl_df"     "tbl"        "data.frame"
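The other half of the rule, that [[ always returns a vector, can be checked the same way. This is a small sketch; `tb` is just a local name for the converted iris data.

```r
library(tibble)

tb <- as_tibble(iris)

# `[[` extracts the column itself, by position or by name.
class(tb[[1]])
#> [1] "numeric"
class(tb[["Species"]])
#> [1] "factor"

# `[` keeps the tibble, even when a single column is selected.
inherits(tb[, 1], "tbl_df")
#> [1] TRUE
dim(tb[, 1])
#> [1] 150   1
```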

Installation

tibble is on CRAN; install it with:

install.packages("tibble")

You can try out the dev version with:

# install.packages("devtools")
devtools::install_github("hadley/tibble")

Functions in tibble

Name                  Description
as_tibble             Coerce lists and matrices to data frames.
tibble                Build a data frame or list.
obj_sum               Provide a succinct summary of an object
rownames              Tools for working with row names
tribble               Row-wise tibble creation
repair_names          Repair object names.
all_equal             Flexible equality comparison for data frames.
glimpse               Get a glimpse of your data.
enframe               Converting atomic vectors to data frames
print.tbl_df          Tools for describing matrices
has_name              Convenience function to check presence of a named element
add_column            Add columns to a data frame
is.tibble             Test if the object is a tibble.
knit_print.trunc_mat  knit_print method for trunc mat
add_row               Add rows to a data frame
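A few of the listed helpers can be combined as in the sketch below. It is illustrative only: the object names and values are made up, and only enframe(), add_column(), and add_row() from the list are exercised.

```r
library(tibble)

# enframe() turns a named vector into a two-column tibble (name, value).
scores <- enframe(c(a = 1, b = 2))

# add_column() and add_row() return a new tibble; `scores` is unchanged.
wider  <- add_column(scores, flag = c(TRUE, FALSE))
longer <- add_row(scores, name = "c", value = 3)

ncol(wider)
#> [1] 3
nrow(longer)
#> [1] 3
```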

Details

Encoding UTF-8
URL https://github.com/hadley/tibble
BugReports https://github.com/hadley/tibble/issues
LinkingTo Rcpp
LazyData yes
License MIT + file LICENSE
RoxygenNote 5.0.1
VignetteBuilder knitr
NeedsCompilation yes
Packaged 2016-08-26 11:42:44 UTC; muelleki
Repository CRAN
Date/Publication 2016-08-26 21:50:28
