Reshape: Reshape Wide Data Into a Semi-long Form

Description

The reshape function in base R is very handy when you want a semi-long (or semi-wide) data.frame. However, base R's reshape has problems with "unbalanced" panel data, for instance data where one variable was measured at three points in time and another only twice.

Usage

Reshape(data, id.vars = NULL, var.stubs, sep = ".", rm.rownames = TRUE,
  ...)

Arguments

data

The source data.frame.

id.vars

The variables that serve as unique identifiers. Defaults to NULL, in which case all columns not identified as belonging to a variable group are used as the identifiers.
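The default described above can be pictured with a small base R sketch. This is only an illustration of the idea, not the package's actual implementation; the column names, stubs, and the helper names is_group and id_guess are hypothetical.

```r
# Hypothetical column names, stubs, and separator
nms <- c("id_1", "id_2", "varA.1", "varA.2", "varB.2")
var.stubs <- c("varA", "varB")
sep <- "."

# TRUE for columns whose names start with a stub followed by the separator
is_group <- Reduce(`|`, lapply(var.stubs, function(s) {
  startsWith(nms, paste0(s, sep))
}))

# Everything that is not part of a variable group is treated as an identifier
id_guess <- nms[!is_group]
id_guess
```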

var.stubs

The prefixes of the variable groups, for example "varA" for the columns varA.1, varA.2 and varA.3.

sep

The character that separates the "variable name" from the "times" in the wide data.frame.
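As a rough illustration of what the separator does, the base R sketch below splits wide column names into a stub and a time. The names are hypothetical, and this is not necessarily how Reshape parses names internally. Because "." is a regular-expression metacharacter, the split uses fixed = TRUE.

```r
# Hypothetical wide column names using "." as the separator
nms <- c("varA.1", "varA.2", "varB.2")

# Split each name on the literal separator into (stub, time)
parts <- do.call(rbind, strsplit(nms, ".", fixed = TRUE))
stubs <- parts[, 1]
times <- as.integer(parts[, 2])

data.frame(name = nms, stub = stubs, time = times)
```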

rm.rownames

Logical. reshape creates long, distracting rownames that do not seem to serve much purpose. Defaults to TRUE, which removes them.

...

Further arguments to NoSep in case the separator is of a different form.

Value

A "long" data.frame of the reshaped data that retains the attributes added by base R's reshape function.
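To show what those attributes look like, the sketch below uses base R's reshape on a small, hypothetical balanced data.frame. It inspects the "reshapeLong" attribute that reshape attaches and uses it to convert back to wide form. The example does not call Reshape itself.

```r
# Hypothetical balanced wide data
wide <- data.frame(id = 1:2, x.1 = c(1, 2), x.2 = c(3, 4))

# Base R reshape stores the information needed to undo the reshaping
long <- reshape(wide, direction = "long", idvar = "id", varying = 2:3)
names(attributes(long))

# With the attribute present, going back to wide needs no further arguments
back <- reshape(long, direction = "wide")
back
```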

Details

This function was written to overcome base R reshape's limitation with unbalanced data, but it is also appropriate for basic wide-to-long reshaping tasks.
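One way to see the limitation, and a manual workaround, is to pad the missing measurement with an all-NA column before calling base R's reshape. The sketch below does this on hypothetical data. It illustrates the general idea only and is not presented as the way Reshape is implemented.

```r
# Hypothetical unbalanced data: varB was not measured at time 1
wide <- data.frame(id = 1:3,
                   varA.1 = c("a", "b", "c"), varA.2 = c("d", "e", "f"),
                   varB.2 = c(10, 20, 30))

# Add an all-NA placeholder so both groups cover times 1 and 2
wide$varB.1 <- NA_real_

# List the columns group by group, in time order
long <- reshape(wide, direction = "long", idvar = "id",
                varying = list(c("varA.1", "varA.2"), c("varB.1", "varB.2")),
                v.names = c("varA", "varB"), times = 1:2)
long
```

The padded rows show up as NA in varB for time 1, while varA keeps its observed values.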

Related functions like stack in base R and melt in "reshape2" are also very handy when you want a "long" reshaping of data, but they produce a fully long structure with all values in a single column, not the "semi-wide" format that reshape produces.
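The difference between the two shapes can be seen with base R alone. In the sketch below (hypothetical data), stack puts every value into one column, while reshape keeps one column per variable group.

```r
# Hypothetical balanced wide data with two variable groups, x and y
wide <- data.frame(id = 1:2,
                   x.1 = c(1, 2), x.2 = c(3, 4),
                   y.1 = c(5, 6), y.2 = c(7, 8))

# Fully long: a "values" column and an "ind" column naming the source column
fully_long <- stack(wide[-1])
fully_long

# Semi-long: one row per id and time, with x and y kept as separate columns
semi_long <- reshape(wide, direction = "long", idvar = "id", varying = 2:5)
semi_long
```

Note that stack drops the id column here, so it would have to be carried along separately.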

Examples


library(splitstackshape)

set.seed(1)
## Wide data: id_2 is recycled to length 6; varA, varB and varC are variable groups
mydf <- data.frame(id_1 = 1:6, id_2 = c("A", "B"), varA.1 = sample(letters, 6),
                 varA.2 = sample(letters, 6), varA.3 = sample(letters, 6),
                 varB.2 = sample(10, 6), varB.3 = sample(10, 6),
                 varC.3 = rnorm(6))
mydf

## Note that these data are unbalanced: varA has three time points,
## varB has two, and varC has one, so reshape() stops with an error.
## try() lets the rest of the example continue.
try(reshape(mydf, direction = "long", idvar = 1:2, varying = 3:ncol(mydf)))

## The Reshape() function can handle such scenarios
Reshape(mydf, id.vars = c("id_1", "id_2"),
       var.stubs = c("varA", "varB", "varC"))
