epitrix (version 0.2.2)

clean_labels: Standardise labels

Description

This function standardises labels, e.g. those used as variable names or character string values, by removing non-ASCII characters, replacing diacritics (e.g. é, ô) with their closest ASCII equivalents, and standardising separating characters. See Details for more information on label transformation.

Usage

clean_labels(x, sep = "_")

Arguments

x

A vector of labels, normally provided as a character vector.

sep

A character string used as separator, defaulting to '_'.

Author

Thibaut Jombart <thibautjombart@gmail.com>

Details

The following changes are performed:

  • all non-ASCII characters are removed

  • all diacritics are replaced with their unaccented equivalents, e.g. 'é', 'ê' and 'è' become 'e', 'ö' becomes 'o', etc.

  • all characters are set to lower case

  • separators are standardised to a single character, provided in sep (defaults to '_'); leading and trailing separators are removed.
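
The steps listed above can be approximated in base R. The following is a minimal sketch, not the package's actual implementation: the helper name sketch_clean_labels is hypothetical, the diacritic table covers only a handful of characters for illustration, and both the order of the steps and the treatment of every run of non-alphanumeric characters as a separator are assumptions.

```r
# Hypothetical helper: a rough base-R approximation of the steps described
# in Details. This is NOT the epitrix implementation.
sketch_clean_labels <- function(x, sep = "_") {
  x <- enc2utf8(as.character(x))

  # Replace a small, illustrative set of diacritics with unaccented letters
  # (assumption: a real implementation would cover far more characters).
  accented   <- "\u00e9\u00ea\u00e8\u00eb\u00f6\u00f4\u00ef\u00cf"  # é ê è ë ö ô ï Ï
  unaccented <- "eeeeooiI"
  x <- chartr(accented, unaccented, x)

  # Drop any remaining non-ASCII characters.
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")

  # Set all characters to lower case.
  x <- tolower(x)

  # Collapse each run of non-alphanumeric characters into one placeholder
  # space, trim the ends, then substitute the requested separator.
  # fixed = TRUE keeps separators such as "." from being read as regex.
  x <- gsub("[^a-z0-9]+", " ", x)
  x <- trimws(x)
  gsub(" ", sep, x, fixed = TRUE)
}

sketch_clean_labels("-_-This is; A    We\u00cfrD**./s\u00eant\u00ebnce...")
sketch_clean_labels("-_-This is; A    We\u00cfrD**./s\u00eant\u00ebnce...", sep = ".")
```

Using a placeholder space before inserting sep avoids having to escape the separator in a regular expression when trimming the ends of each label.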

Examples


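# Clean a single messy label using the default separator "_"
clean_labels("-_-This is; A    WeÏrD**./sêntënce...")

# Same label, using "." as the separator
clean_labels("-_-This is; A    WeÏrD**./sêntënce...", sep = ".")

# Clean a vector of labels that differ in case, accents and separators
input <- c("Peter and stëven", "peter-and.stëven", "pëtêr and stëven  _-")
input
clean_labels(input)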
