cranly (version 0.5.4)

clean_CRAN_db: Clean and organize package and author names in the output of tools::CRAN_package_db()

Description

Clean and organize package and author names in the output of tools::CRAN_package_db()

Usage

clean_CRAN_db(packages_db = tools::CRAN_package_db(),
  clean_directives = clean_up_directives,
  clean_author = clean_up_author,
  clean_maintainer = standardize_whitespace)

Arguments

packages_db

a data.frame with the same structure as the output of tools::CRAN_package_db() (default) or utils::available.packages().

clean_directives

a function that transforms the contents of the various directives in the package descriptions to vectors of package names. Default is clean_up_directives.

clean_author

a function that transforms the contents of Author to vectors of package authors. Default is clean_up_author.

clean_maintainer

a function that transforms the contents of Maintainer to vectors of maintainer names. Default is standardize_whitespace.

Value

A data.frame with the same variables as packages_db (but with lower case names), that also inherits from class cranly_db, and has a timestamp attribute.

Details

clean_CRAN_db uses clean_up_directives and clean_up_author to clean up the author names and the package names in the various directives (like Imports, Depends, Suggests, Enhances, LinkingTo) of the data.frame that results from tools::CRAN_package_db(), and returns an organized data.frame of class cranly_db that can be used for further analysis.

The function tries hard to identify and eliminate mistakes in the Author field of the DESCRIPTION file, and to extract a clean list of author names only. The relevant operations are coded in the clean_up_author function. Specifically, some references to copyright holders had to go because they were contaminating the list of authors (most are not necessary anyway, but that is a different story...). The current version of clean_up_author is far from best practice in its use of regular expressions, but it does a fair job of cleaning up messy Author fields. It will be improved in future versions.
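To give a feel for the kind of regex-based clean-up described above, here is a minimal base-R sketch. It is not the code in clean_up_author: the helper name toy_clean_author, the specific patterns and the sample Author string are all made up for illustration, and real Author fields are far messier.

```r
## Hypothetical toy cleaner; NOT cranly's clean_up_author.
## Takes a character vector of Author fields and returns a list with one
## character vector of names per field.
toy_clean_author <- function(author) {
  lapply(author, function(a) {
    a <- gsub("\\[[^]]*\\]", "", a)   # drop role tags such as [aut, cre]
    a <- gsub("\\([^)]*\\)", "", a)   # drop parenthesised notes (e.g. ORCID links)
    a <- gsub("<[^>]*>", "", a)       # drop e-mail addresses
    ## split on commas and on the standalone word "and"
    parts <- strsplit(a, ",|\\band\\b", perl = TRUE)[[1]]
    parts <- trimws(gsub("\\s+", " ", parts))  # collapse and trim whitespace
    parts[nzchar(parts)]
  })
}

## Made-up Author field with placeholder ORCID and e-mail
raw <- "Jane Doe [aut, cre] (<https://orcid.org/0000-0000-0000-0000>),\n  John Smith [ctb] and Ann Lee <ann@example.org>"
toy_clean_author(raw)
```

Removing the bracketed role tags before splitting matters, because the tags themselves contain commas.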

Custom clean-up functions can also be supplied via the clean_directives and clean_author arguments.
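As a rough sketch of what a custom clean-up function might look like, the code below defines a hypothetical my_clean_directives. It assumes that clean_directives is called on a character vector holding one directive string per package and is expected to return a list of package-name vectors; check that assumption against clean_up_directives before relying on it. Dropping "R" and the version constraints is a design choice of this sketch only.

```r
## Hypothetical custom cleaner (the name and the assumed interface are
## illustrative): character vector in, list of character vectors out.
my_clean_directives <- function(variable) {
  lapply(variable, function(x) {
    if (is.na(x)) return(character(0))      # package has no such directive
    pkgs <- strsplit(x, ",", fixed = TRUE)[[1]]
    pkgs <- gsub("\\([^)]*\\)", "", pkgs)   # drop version constraints, e.g. (>= 3.5.0)
    pkgs <- trimws(pkgs)
    pkgs[nzchar(pkgs) & pkgs != "R"]        # drop empty entries and R itself
  })
}

## Made-up directive strings
imports <- c("R (>= 3.5.0), stats, igraph (>= 1.2.0)", NA, "utils,\n  tools")
my_clean_directives(imports)

## Passing it to clean_CRAN_db (needs cranly and an internet connection):
## package_db <- clean_CRAN_db(cran_db, clean_directives = my_clean_directives)
```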

Examples

# NOT RUN {
library(cranly)

## Before cleaning (downloads the CRAN package database; needs internet access)
cran_db <- tools::CRAN_package_db()
cran_db[cran_db$Package == "weights", "Author"]

## After clean up
package_db <- clean_CRAN_db(cran_db)
package_db[package_db$package == "weights", "author"]
# }
