crandalf: No More "YOU SHALL NOT PASS"

There are a lot of things to check before you submit an R package to CRAN, and the last one is probably to make sure your new version will not break any existing packages on CRAN. Otherwise, you may have to recall what Gandalf said.

Checking reverse dependencies is certainly not a pleasant thing to do. The basic idea is extremely simple: you just download all your reverse dependencies and run R CMD check on them. It is fairly easy to automate this, e.g. using tools::package_dependencies(). The code below shows how many packages one may have to check before submitting a new version to CRAN:

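# use a CRAN mirror and fetch the index of all available packages
options(repos = c(CRAN = 'http://cran.rstudio.com'))
db = available.packages()
pkgs = rownames(db)
# for each package, find the packages that depend on, import, link to, suggest, or enhance it
deps = tools::package_dependencies(pkgs, db, which = 'all', reverse = TRUE)
len  = sapply(deps, length)  # number of reverse dependencies per package
tail(sort(len), 10)  # the top 10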
#  knitr     plyr survival     Rcpp  mvtnorm   Matrix  ggplot2 testthat  lattice     MASS
#    241      256      273      280      295      314      338      338      421      844
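
If you only care about one package, the same function can list its reverse dependencies directly. Below is a minimal sketch that runs offline on a made-up package index: the package names (mypkg, alpha, beta, gamma) and their relations are purely hypothetical, and toy_db only mimics the columns of available.packages() that tools::package_dependencies() reads.

```r
# a toy package index (hypothetical packages) with the same dependency
# columns as the matrix returned by available.packages()
fields = c('Package', 'Depends', 'Imports', 'LinkingTo', 'Suggests', 'Enhances')
toy_db = rbind(
  c('mypkg', NA,      NA,      NA, 'gamma', NA),
  c('alpha', NA,      'mypkg', NA, NA,      NA),  # alpha imports mypkg
  c('beta',  'mypkg', NA,      NA, NA,      NA),  # beta depends on mypkg
  c('gamma', NA,      NA,      NA, NA,      NA)   # gamma does not use mypkg
)
dimnames(toy_db) = list(toy_db[, 1], fields)

# reverse dependencies of a single package, over all dependency types
rev_deps = tools::package_dependencies(
  'mypkg', toy_db, which = 'all', reverse = TRUE
)[['mypkg']]
rev_deps = sort(rev_deps)
print(rev_deps)
```

With the real index from available.packages() in place of toy_db, rev_deps would be the list of packages to download and run R CMD check on.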

The challenges

However, the devil is in the details (as always). The biggest challenge, in my opinion, is the external system dependencies of some R packages. Reverse dependency checking would be simple if all the reverse dependencies were pure R packages. Clearly that is not the case, e.g. to install the XML package, you will need libxml2-dev. Unfortunately there is no official way to formally specify such dependencies; the closest thing is the SystemRequirements field in the package DESCRIPTION, which is fairly "loose", and normally you cannot automatically figure out which system packages to install. The manual process is the pain here. To make things worse, some system packages may not be available on your system, and you may have to find some PPAs to install them. There are other details that you may need to take care of, such as LaTeX packages (some package vignettes use uncommon LaTeX packages that are not included in a "minimal" LaTeX installation, so you may have to install a gigantic Debian package just to check these packages).

Well, not everyone needs to face the above pain. I feel both honored and pained as the author of knitr: on one hand, I certainly love people making use of my work; on the other, I have to figure out how to install and check packages like RcppOctave even though I never use them (sorry, Renaud, no offence, just an example).

For many package authors, this crandalf repository can be useful and simple to use, and I will explain the simple way and the complicated way, respectively. Note the packages are checked on Travis CI, so you do not really need to install anything locally.

Packages with a small number of reverse dependencies

If your package is not as popular as MASS, testthat, or ggplot2, all you need to do when you are ready to check your reverse dependencies is submit a pull request with a special Git commit message of the form [crandalf pkg@repo], where pkg is the name of your package, and repo is your GitHub repo. For example, the commit message aloha, [crandalf highr@yihui/highr] will trigger the checking of reverse dependencies of the highr package, which will be installed using devtools::install_github('yihui/highr'). At the moment, only GitHub is supported. So you fork this crandalf repo, make some changes, commit with a message containing [crandalf pkg@repo], and submit a pull request. If you have multiple commits, make sure [crandalf pkg@repo] is included in the message of the last commit.
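
To make the message format concrete, here is a rough sketch of how such a tag could be pulled out of a commit message. It is not the code this repo actually runs on Travis CI: parse_crandalf_tag() is a hypothetical helper, and the regular expression assumes that repo always has the user/name form.

```r
# hypothetical helper: extract pkg and repo from a commit message that
# contains a tag of the form [crandalf pkg@user/repo]
parse_crandalf_tag = function(msg) {
  pattern = '\\[crandalf ([^@[:space:]]+)@([[:alnum:]_.-]+/[[:alnum:]_.-]+)\\]'
  m = regmatches(msg, regexec(pattern, msg))[[1]]
  if (length(m) == 0) return(NULL)  # no tag in this message
  list(pkg = m[2], repo = m[3])
}

tag = parse_crandalf_tag('aloha, [crandalf highr@yihui/highr]')
print(tag$pkg)   # the package whose reverse dependencies get checked
print(tag$repo)  # the repo that would be passed to devtools::install_github()
```

A message without the tag returns NULL, which is one simple way to skip the checks for ordinary commits.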

I do not even need to merge your pull request -- the whole point of submitting the pull request is to trigger the Travis CI service. You do not really need to make any substantial changes in your pull request, either. You are welcome to change anything in this repo, such as correcting typos or improving things. If you cannot think of anything to do, you can add pictures of kittens to this repo (@RCatLadies) if you want...

Oh, I have a package on which hundreds of packages depend

First, congratulations!

In that case, you will likely need to split your reverse dependencies into groups, because Travis CI has a time limit of 50 minutes per job. However, it does not have a limit on how many jobs in a build matrix you can submit each time. Therefore, if you have 100 packages to check, you may group them in 10 jobs, each job checking 10 packages.
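
The grouping itself is simple arithmetic. The sketch below illustrates one way to do it in R; split_jobs() is a hypothetical helper and the package names are placeholders, so do not read it as the way this repo arranges the build matrix.

```r
# hypothetical helper: split a vector of package names into n_jobs
# contiguous groups of (nearly) equal size
split_jobs = function(pkgs, n_jobs) {
  job_id = sort(rep(seq_len(n_jobs), length.out = length(pkgs)))
  unname(split(pkgs, job_id))
}

pkgs = sprintf('pkg%03d', 1:100)  # 100 placeholder package names
jobs = split_jobs(pkgs, 10)
print(sapply(jobs, length))       # number of packages in each job
```

Equal-sized groups do not guarantee equal running time, of course, since some packages take much longer to check than others.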

You will have to add your package to the file <inst/config/PACKAGES>. There are already some examples in this file. Let me explain the knitr configuration:

package: knitr
install: yihui/knitr
matrix: 30
only:
exclude: localsolver tabplot
separate: dbmss | DLMtool | hot.deck | HiveR pkgmaker
sysdeps:

The install field specifies the GitHub repo; matrix specifies how many jobs you want to arrange in a Travis build matrix; only is useful when you want to check a subset of packages: you can specify these package names in this field, otherwise all reverse dependencies are checked; exclude will exclude some packages (usually ones known to be broken); separate can be used to separate a few packages so they are checked in separate jobs (usually these are the very time-consuming ones); you can include some system commands in the sysdeps field to, for example, install additional system packages before checking reverse dependencies. All of these fields are optional except package and install.
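
To show how these fields fit together, here is a small sketch that reads the knitr entry above into an R list. It is only an illustration, not the parser used by this repo: parse_config() and split_words() are hypothetical helpers, and I am assuming that package names are separated by spaces and that | in the separate field delimits jobs (so HiveR and pkgmaker would share one job).

```r
# the knitr entry from PACKAGES, one field per line
config_text = c(
  'package: knitr',
  'install: yihui/knitr',
  'matrix: 30',
  'only:',
  'exclude: localsolver tabplot',
  'separate: dbmss | DLMtool | hot.deck | HiveR pkgmaker',
  'sysdeps:'
)

# hypothetical helper: split each line at the first colon into key and value
parse_config = function(lines) {
  keys = sub(':.*$', '', lines)
  vals = trimws(sub('^[^:]*:', '', lines))
  as.list(setNames(vals, keys))
}

# hypothetical helper: split a space-separated field into package names
split_words = function(x) {
  words = strsplit(trimws(x), '[[:space:]]+')[[1]]
  words[nzchar(words)]
}

cfg = parse_config(config_text)
n_jobs   = as.integer(cfg$matrix)
exclude  = split_words(cfg$exclude)
only     = split_words(cfg$only)  # empty: check all reverse dependencies
# assumption: each |-delimited chunk of `separate` is one job
separate = lapply(strsplit(cfg$separate, '|', fixed = TRUE)[[1]], split_words)
str(separate)
```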

You need to submit a pull request to get your changes in the PACKAGES file merged. Then I will create a branch of the form pkg/name, where name is your package name. The reason for the new branches is that I will have to arrange packages in .travis.yml. You can take a look at .travis.yml in the pkg/knitr branch to understand what I mean here. That .travis.yml file was automatically generated using the information in PACKAGES, and the .travis.yml in the master branch is used as a template.

When you find certain packages cannot be installed due to missing system dependencies, you can add these dependencies in the RECIPES file. Note the latest versions of TeXLive and Pandoc were pre-installed from the ubuntu-bin repo. In case you find missing LaTeX packages, you can either do tlmgr install in the sysdeps field, or let me know so I can bundle them in the ubuntu-bin repo. Normally there is no need to install gigabytes of texlive-* packages. Please do note all the settings here are tailored for Travis CI, and I do not mean you should do the same thing on your local computer.

Want to contribute?

Sure. I absolutely hate figuring out the system dependencies of R packages one by one. Please do help me expand the RECIPES file so more R package authors can benefit from it.

I can talk endlessly about the pain in this project, such as the broken R packages in the Ubuntu repo (built before R 3.0.0), the gory details of missing LaTeX packages, and so on, but that may be meaningless to you. There are also a few cool features that I do not have time to introduce, and I will see if other users find this repo useful.

May Gandalf bless CRAN!

Version: 0.0.1

License: MIT + file LICENSE

Last Published: October 6th, 2014