Oliver Keyes

39 packages on CRAN

1 package on GitHub

urltools

cran
96th

Percentile

A toolkit for all URL-handling needs, including encoding and decoding, parsing, and parameter extraction and modification. All functions are designed to be both fast and entirely vectorised. It is intended for people dealing with web-related datasets, such as server-side logs, although it may also be useful in other situations involving large sets of URLs.

triebeard

cran
96th

Percentile

'Radix trees', or 'tries', are key-value data structures optimised for efficient lookups, similar in purpose to hash tables. 'triebeard' provides an implementation of 'radix trees' for use in R programming and in developing packages with 'Rcpp'.
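To illustrate the kind of key-value lookup described above, here is a minimal Python sketch of a plain character trie. It is not the 'triebeard' API and it does not compress single-child paths as a true radix tree would; the class and method names (`Trie`, `insert`, `get`, `longest_match`) are hypothetical and chosen only for this example.

```python
class Trie:
    """A plain (uncompressed) character trie mapping string keys to values."""

    def __init__(self):
        self.children = {}      # maps one character to a child Trie node
        self.value = None       # value stored at this node, if any
        self.has_value = False  # distinguishes "no value" from a stored None

    def insert(self, key, value):
        """Store value under key, creating nodes along the path as needed."""
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.value = value
        node.has_value = True

    def get(self, key, default=None):
        """Exact lookup: return the value stored under key, else default."""
        node = self
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return default
        return node.value if node.has_value else default

    def longest_match(self, query):
        """Return the value of the longest non-empty stored key that is a
        prefix of query, or None if no stored key is a prefix."""
        node = self
        best = None
        for ch in query:
            node = node.children.get(ch)
            if node is None:
                break
            if node.has_value:
                best = node.value
        return best


trie = Trie()
trie.insert("ab", 1)
trie.insert("abcd", 2)
print(trie.get("ab"))               # exact match on a stored key
print(trie.longest_match("abcde"))  # value of the longest stored prefix
```

Lookup cost depends on the length of the key rather than on the number of stored keys, which is what makes this family of structures attractive for prefix-style matching.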

WikipediR

cran
91st

Percentile

A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree.

WikidataR

cran
90th

Percentile

An API client for the Wikidata <http://wikidata.org/> store of semantic data.

wicket

cran
87th

Percentile

Utilities to generate bounding boxes from 'WKT' (Well-Known Text) objects and R data types, validate 'WKT' objects and convert object types from the 'sp' package into 'WKT' representations.

pageviews

cran
84th

Percentile

Pageview data from the 'Wikimedia' sites, such as 'Wikipedia' <https://www.wikipedia.org/>, from entire projects to per-article levels of granularity, through the new RESTful API and data source <https://wikimedia.org/api/rest_v1/?doc>.

udapi

cran
80th

Percentile

A client for the Urban Dictionary <http://www.urbandictionary.com/> API.

piton

cran
79th

Percentile

A wrapper around the 'Parsing Expression Grammar Template Library', a C++11 library for generating Parsing Expression Grammars, that makes it accessible within Rcpp. With this, developers can implement their own grammars and easily expose them in R packages.

rgeolocate

cran
79th

Percentile

Connectors to online and offline sources for taking IP addresses and geolocating them to country, city, timezone and other geographic ranges. For individual connectors, see the package index.

humaniformat

cran
72nd

Percentile

Human names are complicated and nonstandard things. Humaniformat, which is based on Anthony Ettinger's 'humanparser' project (https://github.com/chovy/humanparser), provides functions for parsing human names, making a best-guess attempt to distinguish sub-components such as prefixes, suffixes, middle names and salutations.

batman

cran
63rd

Percentile

Survey systems and other third-party data sources commonly use non-standard representations of logical values when it comes to qualitative data - "Yes", "No" and "N/A", say. batman is a package designed to seamlessly convert these into logicals. It is highly localised, and contains equivalents to boolean values in languages including German, French, Spanish, Italian, Turkish, Chinese and Polish.

birdnik

cran
63rd

Percentile

A connector to the API for 'Wordnik' <https://www.wordnik.com>, a dictionary service that also provides bigram generation, word frequency data, and a whole host of other functionality.

primes

cran
63rd

Percentile

Functions to test whether a number is prime and to generate the prime numbers within a specified range. Based around an implementation of Wilson's theorem for testing an integer's primality.
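Wilson's theorem states that an integer n > 1 is prime exactly when (n - 1)! ≡ -1 (mod n). The following Python sketch shows one way such a test could look. It is an illustration of the theorem, not the 'primes' package's implementation, and the function names are hypothetical.

```python
def is_prime_wilson(n: int) -> bool:
    """Primality test via Wilson's theorem: n > 1 is prime
    if and only if (n - 1)! is congruent to -1 modulo n."""
    if n < 2:
        return False
    factorial_mod = 1
    # Build (n - 1)! modulo n incrementally so the numbers stay small.
    for k in range(2, n):
        factorial_mod = (factorial_mod * k) % n
    # -1 modulo n is the same residue as n - 1.
    return factorial_mod == n - 1


def primes_in_range(lo: int, hi: int) -> list:
    """Return all primes p with lo <= p <= hi."""
    return [n for n in range(lo, hi + 1) if is_prime_wilson(n)]


print(primes_in_range(1, 20))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

The test needs about n multiplications per candidate, so it suits small inputs and illustration better than large-scale prime generation.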

hail

cran
56th

Percentile

Read data from the City of Portland's 'HYDRA' <http://or.water.usgs.gov/precip/> rainfall datasets within R.

favnums

cran
48th

Percentile

A dataset of favourite numbers, selected from an online poll of over 30,000 people by Alex Bellos (http://pages.bloomsbury.com/favouritenumber).

rwars

cran
48th

Percentile

Provides functions to retrieve and reformat data from the 'Star Wars' API (SWAPI) <https://swapi.co/>.

muckrock

cran
38th

Percentile

A data package containing public domain information on requests made by the 'MuckRock' (https://www.muckrock.com/) project under the United States Freedom of Information Act.

olctools

cran
38th

Percentile

'Open Location Codes' <http://openlocationcode.com/> are a Google-created standard for identifying geographic locations. 'olctools' provides utilities for validating, encoding and decoding entries that follow this standard.

ores

cran
38th

Percentile

A connector to ORES (<http://ores.wmflabs.org/>), an AI project to provide edit scoring for content on Wikipedia and other Wikimedia projects. This lets a researcher identify if edits are likely to be reverted, damaging, or made in good faith.

reconstructr

cran
38th

Percentile

Functions to reconstruct sessions from web log or other user trace data and calculate various metrics around them, producing tabular output that is compatible with 'dplyr'- or 'data.table'-centered processes.

webreadr

cran
38th

Percentile

R is used by a vast array of people for a vast array of purposes - including web analytics. This package contains functions for consuming and munging various common forms of request log, including the Common and Combined Web Log formats and various Amazon access logs.

lucr

cran
27th

Percentile

Reformat currency-based data as numeric values (or numeric values as currency-based data) and convert between currencies.

rdian

cran
27th

Percentile

A client library for 'The Guardian' (https://www.guardian.com/) and its API. This package allows users to search for Guardian articles and retrieve both the content and metadata.

threewords

cran
27th

Percentile

A connector to the 'What3Words' (http://what3words.com/) service, which represents each 3m by 3m square on earth with a unique trio of English-language words.

whoapi

cran
27th

Percentile

Retrieve data from the 'Whoapi' (https://whoapi.com) store of domain information, including a domain's geographic location, registration status and search prominence.

exif

cran
18th

Percentile

Extracts Exchangeable Image File Format (EXIF) metadata, such as camera make and model, ISO speed and the date and time the picture was taken, from JPEG images. Incorporates the 'easyexif' (https://github.com/mayanklahiri/easyexif) library.

geohash

cran
18th

Percentile

Provides tools to encode lat/long pairs into geohashes, decode those geohashes, and identify their neighbours.
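As a rough illustration of what encoding a lat/long pair into a geohash involves, here is a minimal Python sketch of the standard geohash scheme: repeatedly halve the longitude and latitude ranges, interleave the resulting bits, and emit one base-32 character for every five bits. It is not the 'geohash' package's code, and the function name and `precision` parameter are hypothetical.

```python
# The geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    bits = 0          # bits accumulated for the current character
    bit_count = 0     # how many bits are currently in `bits`
    use_lng = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, value = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1   # upper half: emit 1 and narrow upwards
            rng[0] = mid
        else:
            bits = bits << 1         # lower half: emit 0 and narrow downwards
            rng[1] = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:           # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=6))
```

Because each extra character narrows the cell further, two geohashes that share a long prefix refer to nearby locations, which is what makes neighbour lookups on the encoded strings practical.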

osi

cran
17th

Percentile

A connector to the API maintained by the Open Source Initiative <https://api.opensource.org/licenses/>, which provides machine-readable metadata about a variety of open source software licenses.

openssl

cran
99.5th

Percentile

Bindings to OpenSSL libssl and libcrypto, plus custom SSH pubkey parsers. Supports RSA, DSA and EC curves P-256, P-384 and P-521. Cryptographic signatures can either be created and verified manually or via x509 certificates. AES can be used in cbc, ctr or gcm mode for symmetric encryption; RSA for asymmetric (public key) encryption or EC for Diffie Hellman. High-level envelope functions combine RSA and AES for encrypting arbitrary sized data. Other utilities include key generators, hash functions (md5, sha1, sha256, etc), base64 encoder, a secure random number generator, and 'bignum' math methods for manually performing crypto calculations on large multibyte integers.

tidytext

cran
98th

Percentile

Text mining for word processing and sentiment analysis using 'dplyr', 'ggplot2', and other tidy tools.

tokenizers

cran
97th

Percentile

Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, tweets, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.

rticles

cran
95th

Percentile

A suite of custom R Markdown formats and templates for authoring journal articles and conference submissions.

dotwhisker

cran
84th

Percentile

Quick and easy dot-and-whisker plots of regression results.

rEDM

cran
81st

Percentile

A new implementation of EDM algorithms based on research software previously developed for internal use in the Sugihara Lab (UCSD/SIO). Contains C++ compiled objects that use time delay embedding to perform state-space reconstruction and nonlinear forecasting and an R interface to those objects using 'Rcpp'. It supports both the simplex projection method from Sugihara & May (1990) <DOI:10.1038/344734a0> and the S-map algorithm in Sugihara (1994) <DOI:10.1098/rsta.1994.0106>. In addition, this package implements convergent cross mapping as described in Sugihara et al. (2012) <DOI:10.1126/science.1227079> and multiview embedding as described in Ye & Sugihara (2016) <DOI:10.1126/science.aag0863>.

iptools

cran
68th

Percentile

A toolkit for manipulating, validating and testing 'IP' addresses and ranges, along with datasets relating to 'IP' addresses. Tools are also provided to map 'IPv4' blocks to country codes. While it primarily has support for the 'IPv4' address space, more extensive 'IPv6' support is intended.

phonics

cran
56th

Percentile

Provides a collection of phonetic algorithms including Soundex, Metaphone, NYSIIS, Caverphone, and others.

forwards

cran
48th

Percentile

Anonymized data from surveys conducted by Forwards <http://forwards.github.io/>, the R Foundation task force on women and other under-represented groups. Currently, the package contains a single data set of responses to a survey of attendees at useR! 2016 <http://user2016.org/>, the R user conference held at Stanford University, Stanford, California, USA, June 27 - June 30 2016.

rLTP

cran
27th

Percentile

R interface to the 'LTP'-Cloud service for Natural Language Processing in Chinese (http://www.ltp-cloud.com/).

pystr

cran
17th

Percentile

String operations the Python way - a package for those of us who miss Python's string methods while we're working in R.

notary

github
17th

Percentile

Signing and verification of R packages.