
This function "lengthens" data, increasing the number of rows and decreasing
the number of columns. This is a dependency-free base-R equivalent of
tidyr::pivot_longer().
data_to_long(
  data,
  select = "all",
  names_to = "name",
  names_prefix = NULL,
  names_sep = NULL,
  names_pattern = NULL,
  values_to = "value",
  values_drop_na = FALSE,
  rows_to = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  ...,
  cols,
  colnames_to
)

reshape_longer(
  data,
  select = "all",
  names_to = "name",
  names_prefix = NULL,
  names_sep = NULL,
  names_pattern = NULL,
  values_to = "value",
  values_drop_na = FALSE,
  rows_to = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  ...,
  cols,
  colnames_to
)
If a tibble was provided as input, reshape_longer() also returns a
tibble. Otherwise, it returns a data frame.
A data frame to pivot.
Variables that will be included when performing the required tasks. Can be either
a variable specified as a literal variable name (e.g., column_name),
a string with the variable name (e.g., "column_name"), or a character vector of variable names (e.g., c("col1", "col2", "col3")),
a formula with variable names (e.g., ~column_1 + column_2),
a vector of positive integers, giving the positions counting from the left (e.g., 1 or c(1, 3, 5)),
a vector of negative integers, giving the positions counting from the right (e.g., -1 or -1:-3),
one of the following select-helpers: starts_with(""), ends_with(""), contains(""), a range using :, or regex(""),
or a function testing for logical conditions, e.g. is.numeric() (or is.numeric), or any user-defined function that selects the variables for which the function returns TRUE (like: foo <- function(x) mean(x) > 3),
ranges specified via literal variable names, select-helpers (except regex()) and (user-defined) functions can be negated, i.e. return non-matching elements, when prefixed with a -, e.g. -ends_with(""), -is.numeric or -Sepal.Width:Petal.Length. Note: Negation means that matches are excluded, and thus, the exclude argument can be used alternatively. For instance, select=-ends_with("Length") (with -) is equivalent to exclude=ends_with("Length") (no -). If negation does not work as expected, use the exclude argument instead.
If NULL, selects all columns. Patterns that found no matches are silently ignored, e.g. find_columns(iris, select = c("Species", "Test")) will just return "Species".
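The selection styles listed above can be sketched as follows. This is a minimal illustration, not output copied from the package: the base-R lines only show which columns of the built-in iris data each selector is expected to resolve to, and the guarded calls assume that datawizard is installed and behaves as described in this page.

```r
# Base R illustration of which iris columns each selection style refers to.
by_position  <- names(iris)[c(1, 2)]                          # positions from the left
from_right   <- rev(names(iris))[1:3]                         # what -1:-3 means here
by_predicate <- names(iris)[vapply(iris, is.numeric, logical(1))]

# Note: base R negative indexing *drops* columns, whereas the description
# above says negative integers count positions from the right.

# Guarded calls: assumes the datawizard package is available.
if (requireNamespace("datawizard", quietly = TRUE)) {
  library(datawizard)
  numeric_iris <- iris[1:4]

  long_by_name   <- data_to_long(iris, select = c("Sepal.Length", "Sepal.Width"))
  long_by_pos    <- data_to_long(iris, select = c(1, 2))
  long_by_helper <- data_to_long(iris, select = starts_with("Sepal"))
  long_by_fun    <- data_to_long(iris, select = is.numeric)
  # Negated select-helper: pivot everything except the "Width" columns.
  long_negated   <- data_to_long(numeric_iris, select = -ends_with("Width"))
}
```

Unselected columns (for example Species in the first call) are kept as identifier columns and repeated for each pivoted value.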
The name of the new column that will contain the column names.
A regular expression used to remove matching text from the start of each variable name.
If names_to contains multiple values, this argument controls how the column name is broken up.
names_pattern takes a regular expression containing matching groups, i.e. "()".
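To see what "matching groups" means in practice, the sketch below uses base R only to split two column names with the pattern from the examples section. The small data frame `wide` and its values are made up for illustration, and the guarded call assumes datawizard is installed.

```r
# Each pair of parentheses in the pattern captures one piece of the column name.
pattern   <- "new_?(.*)_(.)(.*)"
col_names <- c("new_sp_m014", "newrel_f65")

# regexec() returns the full match followed by one entry per group.
parts  <- regmatches(col_names, regexec(pattern, col_names))
pieces <- do.call(rbind, lapply(parts, function(p) p[-1]))  # drop the full match
colnames(pieces) <- c("diagnosis", "gender", "age")
pieces

# Guarded call: assumes the datawizard package is available.
if (requireNamespace("datawizard", quietly = TRUE)) {
  # Hypothetical toy data, not taken from tidyr::who.
  wide <- data.frame(id = 1:2, new_sp_m014 = c(3, 5), newrel_f65 = c(0, 2))
  datawizard::data_to_long(
    wide,
    select = starts_with("new"),
    names_to = c("diagnosis", "gender", "age"),
    names_pattern = pattern,
    values_to = "count"
  )
}
```

The number of groups in names_pattern should match the number of names given in names_to, one new column per group.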
The name of the new column that will contain the values of the pivoted variables.
If TRUE, will drop rows that contain only NA in the values_to column. This effectively converts explicit missing values to
implicit missing values, and should generally be used only when missing values
in data were created by its structure.
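The difference between explicit and implicit missing values can be illustrated with a small sketch. The base-R part uses stack() as a stand-in for pivoting (its output columns are named values and ind, not name and value); the toy data frame is made up, and the guarded call assumes datawizard is installed.

```r
# Toy wide data with one missing cell.
wide <- data.frame(a = c(1, NA), b = c(3, 4))

# Base R stand-in for pivoting: one row per cell, NA kept explicitly.
long_explicit <- stack(wide)

# Dropping the NA rows makes the missing value implicit (the row is simply absent).
long_implicit <- long_explicit[!is.na(long_explicit$values), ]

nrow(long_explicit)
nrow(long_implicit)

# Guarded call: assumes the datawizard package is available.
if (requireNamespace("datawizard", quietly = TRUE)) {
  datawizard::data_to_long(wide, values_drop_na = TRUE)
}
```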
The name of the column that will contain the row names or row
numbers from the original data. If NULL, the row names or row numbers are removed.
Logical, if TRUE and when one of the select-helpers or
a regular expression is used in select, ignores lower/upper case in the
search pattern when matching against variable names.
Logical, if TRUE, the search pattern from select will be
treated as regular expression. When regex = TRUE, select must be a
character string (or a variable containing a character string) and is not
allowed to be one of the supported select-helpers or a character vector
of length > 1. regex = TRUE is comparable to using one of the two
select-helpers, select = contains("") or select = regex(""). However,
since the select-helpers may not work when called from inside other
functions (see 'Details'), this argument may be used as a workaround.
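A minimal sketch of the workaround follows. The base-R grep() calls only illustrate which iris columns a pattern matches; `pivot_matching()` is a hypothetical wrapper written for this example, not part of datawizard, and the guarded code assumes the package is installed.

```r
# Base R illustration of which column names a pattern matches.
pattern <- "^Sepal"
matched <- grep(pattern, names(iris), value = TRUE)

# Case-insensitive matching, analogous to combining a pattern with ignore_case = TRUE.
matched_ci <- grep("^sepal", names(iris), value = TRUE, ignore.case = TRUE)

# Guarded code: assumes the datawizard package is available.
if (requireNamespace("datawizard", quietly = TRUE)) {
  # Hypothetical wrapper: the pattern arrives as a plain string, so no
  # select-helper has to be evaluated inside the function.
  pivot_matching <- function(data, pattern) {
    datawizard::data_to_long(data, select = pattern, regex = TRUE)
  }
  long_sepal <- pivot_matching(iris, "^Sepal")
}
```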
Currently not used.
Identical to select. This argument is here to ensure compatibility
with tidyr::pivot_longer(). If both select and cols are provided, cols is used.
Deprecated. Use names_to instead.
Functions to rename stuff: data_rename(), data_rename_rows(), data_addprefix(), data_addsuffix()
Functions to reorder or remove columns: data_reorder(), data_relocate(), data_remove()
Functions to reshape, pivot or rotate data frames: data_to_long(), data_to_wide(), data_rotate()
Functions to recode data: rescale(), reverse(), categorize(), change_code(), slide()
Functions to standardize, normalize, rank-transform: center(), standardize(), normalize(), ranktransform(), winsorize()
Split and merge data frames: data_partition(), data_merge()
Functions to find or select columns: data_select(), data_find()
Functions to filter rows: data_match(), data_filter()
wide_data <- data.frame(replicate(5, rnorm(10)))

# Default behaviour (equivalent to tidyr::pivot_longer(wide_data, cols = 1:5))
data_to_long(wide_data)

# Customizing the names
data_to_long(wide_data,
  select = c(1, 2),
  names_to = "Column",
  values_to = "Numbers",
  rows_to = "Row"
)

# Full example
# ------------------
if (require("psych")) {
  data <- psych::bfi # Wide format with one row per participant's personality test

  # Pivot to long format
  data_to_long(data,
    select = regex("\\d"), # Select all columns that contain a digit
    names_to = "Item", # names_to replaces the deprecated colnames_to
    values_to = "Score",
    rows_to = "Participant"
  )

  if (require("tidyr")) {
    reshape_longer(
      tidyr::who,
      select = new_sp_m014:newrel_f65,
      names_to = c("diagnosis", "gender", "age"),
      names_pattern = "new_?(.*)_(.)(.*)",
      values_to = "count"
    )
  }
}