Attention

This repository is unstable and experimental. Check back once a new version corresponding to dat 1.0 is available. For updates, follow #dat on freenode or @dat_project on Twitter.

rdat

This software is in alpha stage and is not yet ready for use with real-world data.

The rdat package provides an R wrapper for the Dat project. Dat ("git for data") is a framework for data versioning, replication and synchronisation; see dat-data.com.

Installation instructions

Prerequisites: the instructions below require R, git and Node.js (with npm).
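
Before installing, it can help to confirm that the prerequisites are on the PATH. The following is a minimal sketch using only base R (`Sys.which`); the executable names `node` and `npm` are assumptions, and some systems install Node.js under a different name such as `nodejs`.

```r
# Look up each required executable on the PATH.
# Sys.which() returns "" for any tool it cannot find.
tools <- c("R", "git", "node", "npm")
found <- Sys.which(tools)

not_found <- names(found)[found == ""]
if (length(not_found) > 0) {
  message("Missing prerequisites: ", paste(not_found, collapse = ", "))
} else {
  message("All prerequisites found.")
}
```

If a tool is reported as missing but is known to be installed, check the PATH of the R session, which may differ from the PATH of an interactive shell.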

Installing `dat` stable

Install the latest stable version from npm:

sudo npm install -g dat

See the dat installation instructions for more details.

Installing `dat` development version

If you have not already installed dat, grab it from GitHub:

git clone https://github.com/maxogden/dat ~/dat
cd ~/dat
npm install .
sudo npm link

To update an existing copy of dat:

cd ~/dat
git pull
rm -Rf node_modules
npm install .

Installing `rdat`

Then install the R package:

library(devtools)
install_github("ropensci/rdat")

Run through the examples to verify that everything works:

library(rdat)
example(dat)

API

This API is experimental and has not been finalized or implemented. Stay tuned for updates.

init

When no remote is specified, dat() initializes a new repository:

repo <- dat("cars", path = getwd())

insert

Insert data from a data frame and get the dat version key:

# insert some data
repo$insert(cars[1:20,])
v1 <- repo$status()$version
v1

Insert more data and get a new version key:

# insert more data
repo$insert(cars[21:25,])
v2 <- repo$status()$version
v2

get

Retrieve particular versions of the dataset by key.

data1 <- repo$get(v1)
data2 <- repo$get(v2)

diff

List the changes between two versions:

diff <- repo$diff(v1, v2)
diff$key

branching

Fork a dataset from a particular version into a new branch.

# create fork
repo$checkout(v1)
repo$insert(cars[40:42,])
repo$forks()
v3 <- repo$status()$version

checkout

Check out the data at a particular version.

# go back to v2
repo$checkout(v2)
repo$get()

binary data

Save binary data (files) as attachments to the dataset.

# store binary attachments
repo$write(serialize(iris, NULL), "iris")
unserialize(repo$read("iris"))
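
The example above relies on base R's `serialize` and `unserialize` to turn an R object into bytes and back. As a rough illustration of that round trip on its own, without touching the experimental rdat API, the sketch below uses only base R and the built-in `iris` dataset:

```r
# serialize() with connection = NULL returns the object as a raw vector (bytes).
bytes <- serialize(iris, NULL)
is.raw(bytes)

# unserialize() rebuilds the original object from those bytes.
restored <- unserialize(bytes)
identical(restored, iris)
```

Any R object can be packed this way, so the same pattern should apply to models or lists as well as data frames.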

clone

Specify a remote (path or URL) to clone an existing repo. In this case we clone the previous repo into a new location.

# Create another repo
dir.create(newdir <- tempfile())
repo2 <- dat("cars", path = newdir, remote = repo$path())
repo2$forks()
repo2$get()

push and pull

Let's make yet another clone of our original repository:

# Create a third repo
dir.create(newdir <- tempfile())
repo3 <- dat("cars", path = newdir, remote = repo$path())

Add data in repo2 and then push it back to the original repository (repo).

# Add some data and push to origin
repo2$insert(cars[31:40,])
repo2$push()

Then pull data back into repo3.

# sync data with origin
repo3$pull()

# Verify that repositories are in sync
mydata2 <- repo2$get()
mydata3 <- repo3$get()
all.equal(mydata2, mydata3)
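
One detail of the final check: `all.equal` returns `TRUE` when its arguments match, but a character vector describing the differences when they do not, so it is usually wrapped in `isTRUE()` before being used in a condition. The sketch below shows this with base R only; the data frames `a` and `b` are hypothetical stand-ins for the results of the two `get()` calls.

```r
# Two small data frames standing in for the contents of two repositories.
a <- cars[1:10, ]
b <- a
b$dist[1] <- b$dist[1] + 1   # introduce a single difference

# isTRUE() collapses the result to a single logical value.
same    <- isTRUE(all.equal(a, a))   # identical inputs
differs <- !isTRUE(all.equal(a, b))  # inputs that differ in one cell

if (!isTRUE(all.equal(a, b))) {
  message("Repositories are out of sync")
}
```

Using `all.equal(a, b)` directly inside `if ()` would raise an error when the inputs differ, because the result is then a character vector rather than a logical.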
