pudu

Overview

The goal of pudu is to provide function declarations and inline function definitions that facilitate cleaning strings in C++ code before passing them to R. It works with cpp11::strings and std::vector<std::string> objects.

The idea is the same as the janitor package, but for C++ code.

Why the name Pudu? The pudu is the smallest deer on Earth, and this package is tiny too. The original Pudu (unvectorized) was drawn by Pokanvas. This package emerged as a spinoff from the redatam package while cleaning strings in C++ code.

Installation

You can install the development version of pudu with:

remotes::install_github("pachadotdev/pudu")

Example

Here is how you can use the functions in this package in C++ code:

#include <cpp11.hpp>
#include <pudu.hpp>

using namespace cpp11;

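// Examples 1 and 2
// Statements cannot appear at file scope in C++, so they are wrapped in a
// function here; std_examples is an illustrative name, not part of pudu.
void std_examples() {
  std::vector<std::string> x = {" REGION NAME "};

  // Example 1
  tidy_std_names(x); // returns 'REGION NAME'

  // Example 2
  tidy_std_vars(x); // returns 'region_name'
}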

// Example 3

// test_tidy_r_names(" REGION NAME ") returns 'REGION NAME'
[[cpp11::register]] cpp11::writable::strings test_tidy_r_names(
  const cpp11::strings& x) {
  cpp11::writable::strings res = tidy_r_names(x);
  return res;
}

// Example 4

// test_tidy_r_vars(" REGION NAME ") returns 'region_name'
[[cpp11::register]] cpp11::writable::strings test_tidy_r_vars(
  const cpp11::strings& x) {
  cpp11::writable::strings res = tidy_r_vars(x);
  return res;
}

Messy strings such as " DEPTO. .REF_ID_ " are converted to "depto_ref_id" or "DEPTO. .REF_ID_", depending on the function used.
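To make the two kinds of output concrete, here is a minimal standalone sketch for ASCII input. It is not pudu's implementation: `trim_copy` and `to_snake_ascii` are hypothetical helpers written for illustration, and the rules they apply (trim surrounding whitespace; lowercase; collapse each run of non-alphanumeric characters into one underscore; drop leading and trailing underscores) are assumptions inferred from the example strings above.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of pudu): strip leading and trailing whitespace.
inline std::string trim_copy(const std::string& s) {
  const char* ws = " \t\n\r\f\v";
  const std::size_t first = s.find_first_not_of(ws);
  if (first == std::string::npos) return "";  // empty or all whitespace
  const std::size_t last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}

// Hypothetical helper (not part of pudu): ASCII-only snake_case conversion.
// Every run of non-alphanumeric bytes becomes a single '_' between words;
// separators at the start or end of the string are dropped.
inline std::string to_snake_ascii(const std::string& s) {
  std::string out;
  bool pending_sep = false;
  for (unsigned char c : s) {
    if (std::isalnum(c)) {
      if (pending_sep && !out.empty()) out += '_';
      out += static_cast<char>(std::tolower(c));
      pending_sep = false;
    } else {
      pending_sep = true;  // remember the separator, emit it lazily
    }
  }
  return out;
}

// Usage: both results match the strings quoted above.
inline void check_ascii_examples() {
  assert(trim_copy(" DEPTO. .REF_ID_ ") == "DEPTO. .REF_ID_");
  assert(to_snake_ascii(" DEPTO. .REF_ID_ ") == "depto_ref_id");
}
```

Emitting the underscore lazily (only when another word character follows) is what keeps the trailing "_ " of the input from producing a dangling underscore.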

The following tests in R should give an idea of how the functions work:

# German
vars <- "Gau\xc3\x9f"
expect_equal(test_tidy_r_names(vars), "gau")
expect_equal(test_tidy_r_vars(vars), "Gau\u00df")

# French
vars <- "c\xc2\xb4est-\xc3\xa0-dire"
expect_equal(test_tidy_r_names(vars), "c_est_a_dire")
expect_equal(test_tidy_r_vars(vars), "c\u00b4est-\u00e0-dire")

# Spanish
vars <- "\xc2\xbfC\xc3\xb3mo est\xc3\xa1s\x3f"
expect_equal(test_tidy_r_names(vars), "como_estas")
expect_equal(test_tidy_r_vars(vars), "\u00bfC\u00f3mo est\u00e1s\u003f")

# Japanese
vars <- "Konnichiwa \xe3\x81\x93\xe3\x82\x93\xe3\x81\xab\xe3\x81\xa1\xe3\x81\xaf"
expect_equal(test_tidy_r_names(vars), "konnichiwa")
expect_equal(test_tidy_r_vars(vars), "Konnichiwa \u3053\u3093\u306b\u3061\u306f")
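The snake_case results in these tests suggest that accented Latin letters are reduced to their ASCII base letter, while other non-ASCII characters (such as "ß", "´", "¿", or Japanese kana) are dropped. The following standalone sketch reproduces that behavior for the four test strings. It is an assumption-laden illustration, not pudu's code: `decode_utf8`, `ascii_base`, and `to_snake_utf8` are hypothetical names, the decoder is simplified (it does not reject overlong encodings), and the transliteration table covers only a handful of Latin-1 letters.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-8 bytes into code points.
// Malformed or truncated sequences are reported as U+FFFD.
inline std::vector<std::uint32_t> decode_utf8(const std::string& s) {
  std::vector<std::uint32_t> out;
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char b = static_cast<unsigned char>(s[i]);
    std::uint32_t cp = 0xFFFD;
    std::size_t len = 1;
    if (b < 0x80) { cp = b; }
    else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
    else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
    else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
    if (i + len > s.size()) { out.push_back(0xFFFD); break; }  // truncated
    bool ok = true;
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(s[i + k]);
      if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation byte
      cp = (cp << 6) | (c & 0x3F);
    }
    out.push_back(ok ? cp : 0xFFFD);
    i += ok ? len : 1;
  }
  return out;
}

// Hypothetical, deliberately tiny table: lowercase ASCII base letter for a
// few accented Latin-1 letters, or 0 when the code point has no mapping.
inline char ascii_base(std::uint32_t cp) {
  if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7) cp += 0x20;  // fold to lowercase
  if (cp >= 0xE0 && cp <= 0xE5) return 'a';
  if (cp == 0xE7) return 'c';
  if (cp >= 0xE8 && cp <= 0xEB) return 'e';
  if (cp >= 0xEC && cp <= 0xEF) return 'i';
  if (cp == 0xF1) return 'n';
  if (cp >= 0xF2 && cp <= 0xF6) return 'o';
  if (cp >= 0xF9 && cp <= 0xFC) return 'u';
  return 0;
}

// Hypothetical helper: snake_case with transliteration. Code points without
// an ASCII mapping act as word separators.
inline std::string to_snake_utf8(const std::string& s) {
  std::string out;
  bool pending_sep = false;
  for (std::uint32_t cp : decode_utf8(s)) {
    char c = 0;
    if (cp >= 0x41 && cp <= 0x5A) c = static_cast<char>(cp + 0x20);  // A-Z
    else if ((cp >= 0x61 && cp <= 0x7A) || (cp >= 0x30 && cp <= 0x39))
      c = static_cast<char>(cp);                                     // a-z, 0-9
    else c = ascii_base(cp);
    if (c == 0) { pending_sep = true; continue; }
    if (pending_sep && !out.empty()) out += '_';
    out += c;
    pending_sep = false;
  }
  return out;
}

// Usage: the same inputs as the R tests. String literals are split where a
// hex escape would otherwise swallow a following hex-digit letter.
inline void check_utf8_examples() {
  assert(to_snake_utf8("Gau\xc3\x9f") == "gau");
  assert(to_snake_utf8("c\xc2\xb4" "est-\xc3\xa0-dire") == "c_est_a_dire");
  assert(to_snake_utf8("\xc2\xbf" "C\xc3\xb3mo est\xc3\xa1s?") == "como_estas");
}
```

Because unmapped code points are treated as separators in this sketch, a word with an unmapped letter in the middle would be split by an underscore; a real implementation may handle that case differently.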

Install

install.packages('pudu')

Monthly Downloads

135

Version

0.1.0

License

Apache License (>= 2)

Maintainer

Mauricio Vargas

Last Published

January 14th, 2025

Functions in pudu (0.1.0)

pudu-package

pudu: C++ Tools for Cleaning Strings

cpp_vendor

Vendor the cpp11 and pudu dependency