dcmdata (version 0.1.0)

generate_ids: Generate unique identifiers

Description

Create unique alphanumeric identifiers with a specified character length and proportions of alpha and numeric characters.

Usage

generate_ids(n, characters, prop_numeric = 1, n_attempt = n * 3)

Value

A factor vector of length n.

Arguments

n

The number of unique identifiers to generate.

characters

The number of characters to be included in each identifier.

prop_numeric

The proportion of characters that should be numeric. The default is 1 (i.e., all numbers). If less than 1, identifiers will also include lowercase and uppercase letters.

n_attempt

The number of allowed attempts for generating the requested number of identifiers. The default is n * 3. See Details for more information.

Details

When identifiers are long (e.g., characters >= 10), it is slow and computationally intensive to enumerate every possible arrangement of the specified number of alpha and numeric characters. Therefore, identifiers are generated one at a time by sampling the required number of characters. This is much more efficient, as no time is wasted generating millions of identifiers when only a few hundred are needed.

However, sampling means that duplicate identifiers can be generated. The n_attempt argument controls how many identifiers may be generated in total in order to obtain the desired n unique identifiers. If n unique identifiers have not been found after n_attempt attempts, the function throws an error.

For example, consider a request for 1,000 identifiers, each with 2 characters and using only numbers. With the digits 0-9, there are only 10 x 10 = 100 possible two-character identifiers. Thus, the function will fail after n_attempt attempts, as 1,000 unique identifiers cannot be found.
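The sample-and-retry approach described above can be sketched roughly as follows. This is an illustration only, not the dcmdata source: the name sketch_generate_ids is hypothetical, and the rounding of the numeric share and the shuffling of character positions are assumptions. Unlike generate_ids, this sketch returns a plain character vector.

```r
# Hypothetical re-implementation for illustration; not the dcmdata source.
sketch_generate_ids <- function(n, characters, prop_numeric = 1,
                                n_attempt = n * 3) {
  # Assumption: the numeric share is rounded to a whole number of characters.
  n_num <- round(characters * prop_numeric)
  n_alpha <- characters - n_num
  digits <- as.character(0:9)

  ids <- character(0)
  attempts <- 0
  while (length(ids) < n) {
    # Give up once the attempt budget is exhausted.
    if (attempts >= n_attempt) {
      stop("Could not find ", n, " unique identifiers in ", n_attempt,
           " attempts.")
    }
    attempts <- attempts + 1
    # Sample the required digits and letters, then shuffle their order.
    chars <- c(sample(digits, n_num, replace = TRUE),
               sample(c(letters, LETTERS), n_alpha, replace = TRUE))
    candidate <- paste(sample(chars), collapse = "")
    # Keep the candidate only if it has not been seen before.
    if (!candidate %in% ids) ids <- c(ids, candidate)
  }
  ids
}

# Ten unique five-digit identifiers.
ids <- sketch_generate_ids(n = 10, characters = 5)

# Only 100 two-digit identifiers exist, so this request cannot succeed.
res <- tryCatch(
  sketch_generate_ids(n = 1000, characters = 2),
  error = function(e) conditionMessage(e)
)
```

In the second call the loop stops finding new identifiers once all 100 two-digit strings have been seen, keeps drawing duplicates until the 3,000 default attempts are used up, and then raises the error that tryCatch captures.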

Examples

generate_ids(n = 10, characters = 5)
generate_ids(n = 100, characters = 10, prop_numeric = 0.5)
