Convenience functions in the collapse package that help with variable names, labels, missing values, matching, object checking, etc. Some functions are performance-improved replacements for base R functions.
.c(...) # Non-standard concatenation, i.e. .c(a, b) == c("a", "b")
vlabels(X, attrn = "label") # Get labels of variables in X, in attr(X[[i]], attrn)
vlabels(X, attrn = "label") <- value # Set labels of variables in X
vclasses(X) # Get classes of variables in X
vtypes(X) # Get data storage types of variables in X (calling typeof)
namlab(X, class = FALSE, # Return data frame of names, labels and classes
attrn = "label")
add_stub(X, stub, pre = TRUE) # Add a stub (i.e. prefix or postfix) to column names
rm_stub(X, stub, pre = TRUE) # Remove stub from column names
x %!in% table # The opposite of %in%
ckmatch(x, table, # Check-match: throws an informative error if non-matched
e = "Unknown columns:")
fnlevels(x) # Faster version of nlevels(x) (for factors)
fnrow(X) # Faster nrow for data frames (not faster for matrices)
fncol(X) # Faster ncol for data frames (not faster for matrices)
fdim(X) # Faster dim for data frames (not faster for matrices)
na_rm(x) # Remove missing values from vector and return vector
na_omit(X, cols = NULL, # Faster na.omit for matrices and data frames
na.attr = FALSE)
na_insert(X, prop = 0.1) # Insert missing values at random in vectors, matrices or data frames
all_identical(...) # Check exact equality of multiple objects or list-elements
all_obj_equal(...) # Check near equality of multiple objects or list-elements
seq_row(X) # Fast integer sequences along rows of X
seq_col(X) # Fast integer sequences along columns of X
setRownames(object, # Set rownames of object and return object
nm = if(is.atomic(object)) seq_row(object) else NULL)
setColnames(object, nm) # Set colnames of object and return object
setDimnames(object, dn, # Set dimension names of object and return object
which = NULL)
unattrib(object) # Remove all attributes from object
is.categorical(x) # The opposite of is.numeric
is.Date(x) # Check if object is of class "Date", "POSIXlt" or "POSIXct"
X: a matrix or data frame (some functions also support vectors and arrays, although that is less common).
object: a suitable R object.
x: an atomic vector.
attrn: character. Name of the attribute to store labels in or retrieve labels from.
value: a matching character vector of variable labels.
class: logical. Also show the classes of variables in X in a column?
stub: a single character stub, e.g. "log.", which by default will be pre-applied to all variables or column names in X.
pre: logical. FALSE will post-apply stub.
cols: only removes rows with missing values on these columns. Columns can be selected using column names, indices, a logical vector or a selector function (e.g. is.numeric).
na.attr: logical. TRUE adds an attribute containing the removed cases. For compatibility reasons this is exactly the same format as na.omit, i.e. the attribute is called "na.action" and is of class "omit".
nm: a suitable vector of row- or column-names.
dn: a suitable vector or list of names for dimension(s).
which: integer. If NULL, dn has to be a list fully specifying the dimension names of the object. Alternatively, a vector or list of names for the dimensions given in which can be supplied. See Examples.
prop: double. Specify the proportion of observations randomly replaced with NA.
e: the error message thrown by ckmatch for non-matched elements. The message is followed by the comma-separated non-matched elements.
...: for .c: comma-separated expressions. For all_identical / all_obj_equal: either multiple comma-separated objects or a single list of objects in which all elements will be checked for exact / numeric equality.
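As a rough sketch of how a few of these arguments behave (e in ckmatch, cols and na.attr in na_omit), the following can be run with collapse attached. The vector wanted and the data frame d are made up for illustration and are not part of the package.

```r
library(collapse)

# Hypothetical inputs, for illustration only
wanted <- c("mpg", "cyl", "hp")
d <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

# %!in% negates %in%: TRUE marks elements NOT found in the table
not_found <- c("mpg", "foo") %!in% names(mtcars)

# ckmatch returns positions, like match(), when every element is found
pos <- ckmatch(wanted, names(mtcars))

# With an unmatched element it stops; the message starts with e
# and lists the non-matched elements
msg <- tryCatch(ckmatch(c("mpg", "foo"), names(mtcars), e = "Unknown columns:"),
                error = function(err) conditionMessage(err))

# cols restricts the missing-value check to column a, so the NA in b is kept
kept <- na_omit(d, cols = "a")

# na.attr = TRUE records the removed cases in an "na.action" attribute
dropped <- attr(na_omit(d, na.attr = TRUE), "na.action")

print(list(not_found = not_found, pos = pos, msg = msg,
           rows_kept = fnrow(kept), dropped_class = class(dropped)))
```

Wrapping ckmatch in tryCatch is only done here to show the message; in ordinary use the error is meant to propagate so that a misspelled column name fails early.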
# NOT RUN {
## Non-standard concatenation
.c(a, b, "c d", e == f)
## Variable labels
namlab(wlddev, class = TRUE)
vlabels(wlddev)
vlabels(wlddev) <- vlabels(wlddev)
## Stub-renaming
log_mtc <- add_stub(log(mtcars), "log.")
head(log_mtc)
head(rm_stub(log_mtc, "log."))
rm(log_mtc)
## Setting dimension names of an object
head(setRownames(mtcars))
ar <- array(1:9, c(3,3,3))
setRownames(ar)
setColnames(ar, c("a","b","c"))
setDimnames(ar, c("a","b","c"), which = 3)
setDimnames(ar, list(c("d","e","f"), c("a","b","c")), which = 2:3)
setDimnames(ar, list(c("g","h","i"), c("d","e","f"), c("a","b","c")))
## Checking exact equality of multiple objects
all_identical(iris, iris, iris, iris)
l <- replicate(100, fmean(num_vars(iris), iris$Species), simplify = FALSE)
all_identical(l)
rm(l)
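## Checking near equality of multiple objects
## (illustrative sketch: x and y are hypothetical values that differ only by floating-point error)
x <- 0.1 + 0.2
y <- 0.3
all_identical(x, y) # Exact comparison: sensitive to the rounding difference
all_obj_equal(x, y) # Comparison up to numeric tolerance
rm(x, y)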
## Missing values
mtc_na <- na_insert(mtcars, 0.15) # Set 15% of values missing at random
fNobs(mtc_na) # See observation count
na_omit(mtc_na) # 12x faster than na.omit(mtc_na)
na_omit(mtc_na, na.attr = TRUE) # Adds attribute with removed cases, like na.omit
na_omit(mtc_na, cols = c("vs","am")) # Removes only cases missing vs or am
na_omit(qM(mtc_na)) # Also works for matrices
na_omit(mtc_na$vs, na.attr = TRUE) # Also works with vectors
na_rm(mtc_na$vs) # For vectors na_rm is faster ...
rm(mtc_na)
# }
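Several helpers from the usage listing are not exercised in the examples above. The block below is a small supplementary sketch, using only the built-in mtcars and iris data; the matrix m is a hypothetical object created for the demonstration.

```r
library(collapse)

# Dimensions of a data frame
nr <- fnrow(mtcars)   # Number of rows
nc <- fncol(mtcars)   # Number of columns
dm <- fdim(mtcars)    # Both at once

# Integer sequences along rows and columns, e.g. for use in loops
rows <- seq_row(mtcars)
cols <- seq_col(mtcars)

# Storage types and classes of the columns of iris
types   <- vtypes(iris)
classes <- vclasses(iris)

# Number of levels of a factor
nlev <- fnlevels(iris$Species)

# unattrib drops all attributes, including dim and dimnames,
# so a matrix becomes a plain vector
m <- matrix(1:4, 2, dimnames = list(c("a", "b"), c("x", "y")))
v <- unattrib(m)

print(list(nr = nr, nc = nc, dm = dm, types = types,
           classes = classes, nlev = nlev, v = v))
```

Note that vtypes reports the factor column Species as "integer", since it calls typeof, while vclasses reports it as "factor".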