wide2long: Convert from the wide format (multiple entries per row) to the long format (single entry per row).

Description

Takes a data frame of word pairs/triples/..., each stored in a single row, and returns a data frame with the same pairs/triples/... but with each word stored in its own row.

Usage

wide2long(data, suffixes, col.lang = "LANGUAGE", strip = 0)

Arguments

data

[data.frame] The dataset to be converted.

suffixes

[character vector] Suffixes that distinguish the per-language columns in data (e.g. ".German"); in the output, they are used as language names.

col.lang

[character] Name of the column in which language names are to be stored. Defaults to "LANGUAGE".

strip

[integer] The number of characters to strip from the beginning of suffixes when they are turned into language names. Defaults to 0.
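
As a rough illustration of how suffixes and strip interact, the snippet below mimics the renaming in base R. It is only a sketch: it assumes that stripping simply drops the first strip characters of each suffix, and the variable lang.names is hypothetical, not part of soundcorrs.

```r
# Suffixes as they appear at the end of the column names in the wide data
suffixes <- c(".German", ".Polish", ".Spanish")
strip <- 1

# Assumption: stripping drops the first `strip` characters of each suffix
lang.names <- substring(suffixes, strip + 1)
print(lang.names)
```

With strip = 0, the suffixes would be kept unchanged, including the leading dot.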

Value

[data.frame] A data frame in the long format (single entry per row).

Details

Data for soundcorrs can be prepared in one of two formats: the 'long format' and the 'wide format'. In the 'long format', each row contains only a single word and metadata associated with it. In the 'wide format', each row contains the entire pair/triple/... of words, and all the metadata associated with them. The 'long format' is convenient for making sure that all the words in a pair/triple/... have the same number of segments, but it cannot be read directly by soundcorrs. long2wide and wide2long convert between the two formats.
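
The difference between the two formats can be sketched in base R on a made-up data frame. This is not the implementation of wide2long; the helper to_long_sketch, the toy columns, and the row order of its result are assumptions made for illustration only.

```r
# Toy data in the wide format: one row per word pair (made-up values)
wide <- data.frame(
  ID = 1:2,
  ALIGNED.German = c("b|e|r|l|i|n", "p|r|a|g|-"),
  ALIGNED.Polish = c("b|e|r|l|i|n", "p|r|a|g|a"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stack the per-language columns into single rows
to_long_sketch <- function(data, suffixes, col.lang = "LANGUAGE", strip = 0) {
  nam <- names(data)
  # Columns ending in any of the suffixes are language-specific
  is.lang <- Reduce(`|`, lapply(suffixes, function(s) endsWith(nam, s)))
  shared <- nam[!is.lang]
  parts <- lapply(suffixes, function(s) {
    cols <- nam[endsWith(nam, s)]
    part <- data[, c(shared, cols), drop = FALSE]
    # Drop the suffix from the column names so that the parts can be stacked
    names(part) <- c(shared, substr(cols, 1, nchar(cols) - nchar(s)))
    # Store the (stripped) suffix as the language name
    part[[col.lang]] <- substring(s, strip + 1)
    part
  })
  do.call(rbind, parts)
}

long <- to_long_sketch(wide, c(".German", ".Polish"), strip = 1)
print(long)
```

Each of the two rows in wide becomes two rows in long, one per language, with the shared ID column repeated and the language name recorded in the LANGUAGE column.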

Examples

# NOT RUN {
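# Path to the sample data in the "wide format"
fName <- system.file("extdata", "data-capitals.tsv", package="soundcorrs")
# Read the table; the first line holds the column names
wide <- read.table(fName, header=TRUE)
# Convert to the long format; strip=1 drops the leading "." from each suffix,
# so the language names become "German", "Polish", and "Spanish"
long <- wide2long(wide, c(".German", ".Polish", ".Spanish"), strip=1)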
# }