git2rdata (version 0.1)

meta: Optimize an Object for Storage as Plain Text and Add Metadata

Description

Prepares a vector for storage. When relevant, meta() optimizes the object for storage by changing the format to one which needs fewer characters. The metadata stored in the meta attribute contains all the information required to back-transform the optimized format into the original format.
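The idea of an optimized format plus metadata for the back-transformation can be sketched in base R. This is only an illustration: it does not reproduce the internals of meta() or the layout of its meta attribute, and the helpers encode_factor() and decode_factor() are hypothetical names that are not part of git2rdata.

```r
# Hypothetical helpers (not part of git2rdata): store a factor as integer
# indices and keep the levels as metadata for the back-transformation.
encode_factor <- function(x) {
  stopifnot(is.factor(x))
  list(
    data = as.integer(x), # compact representation of the values
    meta = list(levels = levels(x), ordered = is.ordered(x))
  )
}

# Rebuild the original factor from the indices and the stored metadata.
decode_factor <- function(encoded) {
  factor(
    encoded$meta$levels[encoded$data],
    levels = encoded$meta$levels,
    ordered = encoded$meta$ordered
  )
}

original <- factor(c("b", NA, "a"), levels = c("a", "b", "c"))
encoded <- encode_factor(original)
restored <- decode_factor(encoded)
identical(original, restored)
```

The integer indices need fewer characters than the labels once the labels are longer than a digit or two, and the stored levels are enough to restore the factor, including unused levels and missing values.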

In case of a data.frame, meta() applies itself to each of the columns. The meta attribute becomes a named list containing the metadata for each column plus an additional ..generic element. ..generic is a reserved name for the metadata and is not allowed as a column name in a data.frame.

write_vc uses this function to prepare a data.frame for storage. Existing metadata is passed through the optional old argument. This argument is intended for internal use.

Usage

meta(x, ...)

# S3 method for character
meta(x, na = "NA", ...)

# S3 method for factor
meta(x, optimize = TRUE, na = "NA", index, ...)

# S3 method for logical
meta(x, optimize = TRUE, ...)

# S3 method for POSIXct
meta(x, optimize = TRUE, ...)

# S3 method for Date
meta(x, optimize = TRUE, ...)

# S3 method for data.frame
meta(x, optimize = TRUE, na = "NA", sorting, ...)

Arguments

x

the vector or data.frame to prepare.

...

further arguments to the methods.

na

the string to use for missing values in the data.

optimize

If TRUE, recode the data to get smaller text files. If FALSE, meta() converts the data to character. Defaults to TRUE.

index

an optional named vector with existing factor indices. The names must match the existing factor levels. Unmatched levels from x will get new indices.

sorting

an optional vector of column names defining which columns to use for sorting x and in what order to use them. Omitting sorting yields a warning, so supply it to avoid the warning. Sorting is strongly recommended in combination with version control. See vignette("efficiency", package = "git2rdata") for an illustration of the importance of sorting.
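The index and sorting arguments can be illustrated with a base R sketch. It makes two assumptions that the text above does not confirm: new factor levels are numbered after the largest existing index, and the data is rendered as tab-separated text. The helpers extend_index() and as_sorted_text() are hypothetical and are not part of git2rdata.

```r
# Hypothetical helper: keep the existing indices and give unmatched levels
# new ones (assumption: numbering continues after the largest existing index).
extend_index <- function(x, index) {
  stopifnot(is.factor(x), !is.null(names(index)))
  new_levels <- setdiff(levels(x), names(index))
  start <- if (length(index) > 0) max(index) else 0L
  new_index <- start + seq_along(new_levels)
  names(new_index) <- new_levels
  c(index, new_index)
}

old_index <- c(a = 1L, b = 2L)
x <- factor(c("b", "d", "a"), levels = c("a", "b", "d"))
full_index <- extend_index(x, old_index)
unname(full_index[as.character(x)]) # indices to store instead of the labels

# Hypothetical helper: sort on the given columns, then render the data as
# tab-separated text lines.
as_sorted_text <- function(df, sorting) {
  row_order <- do.call(order, unname(as.list(df[sorting])))
  sorted <- df[row_order, , drop = FALSE]
  utils::capture.output(
    utils::write.table(sorted, sep = "\t", row.names = FALSE, quote = FALSE)
  )
}

# The same observations in a different row order give identical text after
# sorting, which is what keeps diffs under version control small.
df1 <- data.frame(id = c(2L, 1L, 3L), value = c("b", "a", "c"))
df2 <- df1[c(3, 1, 2), ]
identical(as_sorted_text(df1, "id"), as_sorted_text(df2, "id"))
```

Reusing existing indices means that adding a level does not change how the already stored values are encoded, and a fixed row order means that two writes of the same data produce the same file.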

Value

the optimized vector x with meta attribute.

See Also

Other internal: is_git2rdata, is_git2rmeta, upgrade_data

Examples

# NOT RUN {
meta(c(NA, "'NA'", '"NA"', "abc\tdef", "abc\ndef"))
meta(1:3)
meta(seq(1, 3, length.out = 4))
meta(factor(c("b", NA, "NA"), levels = c("NA", "b", "c")))
meta(factor(c("b", NA, "a"), levels = c("a", "b", "c")), optimize = FALSE)
meta(factor(c("b", NA, "a"), levels = c("a", "b", "c"), ordered = TRUE))
meta(
  factor(c("b", NA, "a"), levels = c("a", "b", "c"), ordered = TRUE),
  optimize = FALSE
)
meta(c(FALSE, NA, TRUE))
meta(c(FALSE, NA, TRUE), optimize = FALSE)
meta(complex(real = c(1, NA, 2), imaginary = c(3, NA, -1)))
meta(as.POSIXct("2019-02-01 10:59:59", tz = "CET"))
meta(as.POSIXct("2019-02-01 10:59:59", tz = "CET"), optimize = FALSE)
meta(as.Date("2019-02-01"))
meta(as.Date("2019-02-01"), optimize = FALSE)
# }