sumer (version 1.0.0)

convert_to_dictionary: Convert Translation Data to a Sumerian Dictionary

Description

Converts a data frame of Sumerian translations into a structured dictionary format, adding cuneiform representations and phonetic readings for each sign.

Usage

convert_to_dictionary(df, mapping = NULL)

Value

A data frame with the following columns:

sign_name

The normalized Sumerian text (e.g., "A", "AN", "A2.TAB")

row_type

Type of entry: "cunei." (cuneiform character), "reading" (phonetic readings), or "trans." (translation)

count

Number of occurrences for translations; NA for cuneiform and reading entries

type

Grammatical type (e.g., "S", "V", "A") for translations; empty string for other row types

meaning

The cuneiform character(s), phonetic reading(s), or translated meaning depending on row_type

The data frame is sorted by sign_name, row_type, and descending count.
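The sort order described above can be reproduced in base R. The following is a minimal sketch, not the package's implementation: the rows are hypothetical placeholders in the documented column layout, and it assumes that row_type is ordered alphabetically, which happens to give "cunei.", "reading", "trans.".

```r
# Hypothetical rows in the documented column layout (values are placeholders).
dict <- data.frame(
  sign_name = c("AN", "AN", "AN", "AN"),
  row_type  = c("trans.", "reading", "trans.", "cunei."),
  count     = c(1L, NA, 3L, NA),
  type      = c("V", "", "S", ""),
  meaning   = c("to be high", "{an, dingir}", "sky", "<cuneiform sign>"),
  stringsAsFactors = FALSE
)

# Sort by sign_name, then row_type, then count in descending order.
# Negating count gives descending order; NA counts only occur on
# non-translation rows, so they never compete with real counts.
sorted <- dict[order(dict$sign_name, dict$row_type, -dict$count), ]
rownames(sorted) <- NULL
print(sorted)
```

With these placeholder rows, the cuneiform row comes first, then the reading row, then the translations from most to least frequent.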

Arguments

df

A data frame with columns sign_name, type, and meaning, typically produced by read_translated_text.

mapping

A data frame containing sign-to-reading mappings with columns name, cuneiform and syllables. If NULL (default), the package's built-in mapping file etcsl_mapping.txt is used.

Details

Processing Steps

  1. Aggregates translations and counts occurrences of each unique combination in df

  2. Looks up phonetic readings and cuneiform signs for each sign component

  3. Combines cuneiform, reading, and translation rows into a single data frame

  4. Sorts the result by sign name, row type, and descending count
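Step 1 can be illustrated with base R. This is a sketch under assumptions, not the function's actual code: the input rows are made up, and aggregate() is only one of several ways to count unique combinations.

```r
# Toy input with the three columns expected in `df` (values are made up).
df <- data.frame(
  sign_name = c("AN", "AN", "AN", "A"),
  type      = c("S", "S", "V", "S"),
  meaning   = c("sky", "sky", "to be high", "water"),
  stringsAsFactors = FALSE
)

# Count how often each unique (sign_name, type, meaning) combination occurs.
df$count <- 1L
counts <- aggregate(count ~ sign_name + type + meaning, data = df, FUN = sum)

# Mark the aggregated rows as translation entries.
counts$row_type <- "trans."
print(counts)
```

The repeated "sky" row collapses into a single entry with a count of 2, while the other two combinations each keep a count of 1.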

Reading Format

Phonetic readings are formatted as follows:

  • Multiple possible readings are enclosed in braces: {a, dur5, duru5}

  • For compound signs, readings of individual components are joined with hyphens

  • If a sign has more than three possible readings in a compound, only the first three are shown followed by ...

  • Unknown readings are marked with ?
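The formatting rules above can be sketched as a small helper. format_readings is a hypothetical function written for illustration and is not part of the sumer package. It assumes that a single reading is shown without braces and that the ... marker is placed inside the braces; the package's actual output may differ in both respects.

```r
# Hypothetical helper, not part of the sumer package.
# `readings` is a list with one character vector per sign component;
# a zero-length vector stands for an unknown reading.
format_readings <- function(readings, max_shown = 3) {
  compound <- length(readings) > 1
  parts <- vapply(readings, function(r) {
    if (length(r) == 0) return("?")            # unknown reading
    if (compound && length(r) > max_shown) {   # truncate only inside compounds
      r <- c(r[seq_len(max_shown)], "...")
    }
    # Assumption: a single reading is shown bare, several are wrapped in braces.
    if (length(r) == 1) r else paste0("{", paste(r, collapse = ", "), "}")
  }, character(1))
  paste(parts, collapse = "-")                 # join components with hyphens
}

format_readings(list(c("a", "dur5", "duru5")))
format_readings(list(c("a", "dur5", "duru5", "e4"), "tab"))
format_readings(list(character(0), "an"))
```

The three calls cover a single sign with several readings, a compound whose first component has more than three readings, and a compound with an unknown component.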

See Also

read_translated_text for reading translation files; make_dictionary for creating a complete dictionary with cuneiform representations and readings in a single step.

Examples

# Read translations from a single text document
filename     <- system.file("extdata", "text_with_translations.txt", package = "sumer")
translations <- read_translated_text(filename)

# View the structure
head(translations)

# Apply a custom normalization (here: removing the word "the")
translations$meaning <- gsub("\\bthe\\b", "", translations$meaning, ignore.case = TRUE)
translations$meaning <- trimws(gsub("\\s+", " ", translations$meaning))

# View the cleaned translations
head(translations)

# Convert the result into a dictionary
dictionary   <- convert_to_dictionary(translations)

# View the structure
head(dictionary)

# View entries for a specific sign
dictionary[dictionary$sign_name == "EN", ]

# With a custom mapping (here: the built-in mapping file, loaded explicitly)
path    <- system.file("extdata", "etcsl_mapping.txt", package = "sumer")
mapping <- read.csv2(path, sep = ";", na.strings = "")
translations <- read_translated_text(filename, mapping = mapping)
dictionary <- convert_to_dictionary(translations, mapping = mapping)
head(dictionary)