rchallenge-package: A Simple Data Science Challenge System
Description
A simple data science challenge system using R Markdown and Dropbox.
It requires no network configuration, does not depend on external platforms
such as Kaggle, and can be easily installed on a personal computer.
Installation
Install the R package from the CRAN repositories:

    install.packages("rchallenge")

or install the latest development version from GitHub:

    # install.packages("devtools")
    devtools::install_github("adrtod/rchallenge")

A recent version of pandoc (>= 1.12.3) is also required.
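To confirm that the pandoc found on your PATH meets this requirement, a minimal base R sketch along the following lines may help. The helper name parse_pandoc_version is hypothetical (it is not part of rchallenge), and the sketch assumes that the first line printed by pandoc --version contains the version number.

```r
# Hypothetical helper (not part of rchallenge): pull the version number
# out of the first line printed by `pandoc --version`, e.g. "pandoc 1.12.3".
parse_pandoc_version <- function(first_line) {
  m <- regmatches(first_line, regexpr("[0-9]+(\\.[0-9]+)+", first_line))
  if (length(m) == 0) stop("no version number found in: ", first_line)
  numeric_version(m)
}

required <- numeric_version("1.12.3")   # minimum version required by the package
pandoc <- unname(Sys.which("pandoc"))   # "" when pandoc is not on the PATH

if (!nzchar(pandoc)) {
  message("pandoc was not found on the PATH")
} else {
  found <- parse_pandoc_version(system2(pandoc, "--version", stdout = TRUE)[1])
  message("pandoc ", found,
          if (found >= required) " is recent enough" else " is too old")
}
```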
See the pandoc installation instructions
for details on installing pandoc for your platform.
Getting started
Install a new challenge in Dropbox/mychallenge:

    setwd("~/Dropbox/mychallenge")
    library(rchallenge)
    new_challenge()

or, for a French version:

    new_challenge(template = "fr")

You will obtain a ready-to-use challenge in the folder Dropbox/mychallenge containing:
- challenge.rmd: Template R Markdown script for the webpage.
- data: Directory of the data, containing the data_train and data_test datasets.
- submissions: Directory of the submissions. It will contain one subdirectory per team, where each team can place its submissions. The subdirectories are shared through Dropbox.
- history: Directory where the submissions history is stored.
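To verify that a folder has this layout after running new_challenge(), a quick check in base R might look like the sketch below. check_layout is a hypothetical helper written for illustration, not a function of rchallenge, and the folder path is the example path used in this document.

```r
# Hypothetical helper (not part of rchallenge): report which of the four
# expected entries are present in a challenge folder.
check_layout <- function(path) {
  expected <- c("challenge.rmd", "data", "submissions", "history")
  present <- file.exists(file.path(path.expand(path), expected))
  names(present) <- expected
  present
}

# Usage: point it at the folder where new_challenge() was run.
print(check_layout("~/Dropbox/mychallenge"))
```

Any entry reported as FALSE is missing from that folder.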
The default challenge provided is a binary classification problem on the German Credit Card dataset. You can easily customize the challenge in two ways:
- During the creation of the challenge: by using the options of the new_challenge function.
- After the creation of the challenge: by manually replacing the data files in the data subdirectory and the baseline predictions in submissions/baseline, and by customizing the template challenge.rmd as needed.
Next steps
To complete the installation:
1. Create and share subdirectories in submissions for each team:

    new_team("team_foo", "team_bar")

2. Publish the HTML page in Dropbox/Public:

    publish()

   Prior to this, make sure you have enabled your Public Dropbox folder.
3. Give the public link to your Dropbox/Public/challenge.html file to the participants.
4. Refresh the webpage by repeating step 2 on a regular basis. See below for automating this step.
From now on, a fully autonomous challenge system is set up, requiring no further
administration. With each update, the program automatically performs the necessary
tasks using the functions available in the package.
Automating the updates on Unix/OSX
For step 4, you can add the following line to your crontab
using crontab -e (mind the quotes):

    0 * * * * Rscript -e 'rchallenge::publish("~/Dropbox/mychallenge/challenge.rmd")'

This will publish an HTML webpage in your Dropbox/Public folder every hour. You might have to add the path to Rscript and pandoc at the beginning of your crontab:

    PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

Depending on your system or pandoc version, you might also have to explicitly add the encoding option to the command:

    0 * * * * Rscript -e 'rchallenge::publish("~/Dropbox/mychallenge/challenge.rmd", encoding = "utf8")'
Automating the updates on Windows
You can use the Task Scheduler
to create a new task with a "Start a program" action with the following settings (mind the quotes):
- Program/script: Rscript.exe
- Options: -e rchallenge::publish('~/Dropbox/mychallenge/challenge.rmd')
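Whichever scheduler you use, the job can only run if it can locate Rscript and pandoc, which is why the crontab setup may need a PATH line. As a rough sketch in base R, the directories to list could be found as follows. crontab_path_line is a hypothetical helper (not part of rchallenge), and the trailing /usr/bin and /bin entries are only illustrative defaults. On Windows, the same directories would instead help you fill in the full path to Rscript.exe.

```r
# Hypothetical helper (not part of rchallenge): build a crontab-style PATH
# line from a set of directories, dropping empty entries and duplicates.
crontab_path_line <- function(dirs) {
  dirs <- unique(dirs[nzchar(dirs)])
  paste0("PATH=", paste(dirs, collapse = ":"))
}

rscript_dir <- R.home("bin")             # directory holding the Rscript binary
pandoc <- unname(Sys.which("pandoc"))    # "" when pandoc is not on the PATH
pandoc_dir <- if (nzchar(pandoc)) dirname(pandoc) else ""

# Print a candidate line to paste at the top of the crontab.
cat(crontab_path_line(c(rscript_dir, pandoc_dir, "/usr/bin", "/bin")), "\n")
```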