
all_lower_case(): Translate all non-numeric strings of a data frame to lower case (e.g., "Env" to "env").
all_upper_case(): Translate all non-numeric strings of a data frame to upper case (e.g., "Env" to "ENV").
all_title_case(): Translate all non-numeric strings of a data frame to title case (e.g., "ENV" to "Env").
extract_number(): Extract the number(s) of a string.
extract_string(): Extract all strings, ignoring case.
find_text_in_num(): Find text characters in a numeric sequence and return the row index.
has_text_in_num(): Inspect columns looking for text in numeric sequence and return a warning if text is found.
remove_space(): Remove all blank spaces of a string.
remove_strings(): Remove all strings of a variable.
replace_number(): Replace numbers with a replacement.
replace_string(): Replace all strings with a replacement, ignoring case.
round_cols(): Round a selected column or a whole data frame to significant figures.
tidy_strings(): Tidy up character strings, non-numeric columns, or any selected columns in a data frame by putting all words in upper case, replacing any space, tabulation, or punctuation character with '_', and putting '_' between lower and upper case. Suppose that str = c("Env1", "env 1", "env.1") (which by definition should represent a unique level in plant breeding trials, e.g., environment 1) is subjected to tidy_strings(str): the result will then be c("ENV_1", "ENV_1", "ENV_1"). See the Examples section for more examples.
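The tidying rules described for tidy_strings() can be approximated in base R. The following is only a sketch under stated assumptions, not metan's implementation: the helper name tidy_sketch() is hypothetical, and the step that inserts the separator between a letter and a digit is an assumption inferred from the "Env1" to "ENV_1" example rather than something the description states.

```r
# Hypothetical helper (not part of metan): approximates the described rules.
tidy_sketch <- function(x, sep = "_") {
  # Put the separator between a lower-case letter and a following upper-case letter
  x <- gsub("([a-z])([A-Z])", paste0("\\1", sep, "\\2"), x)
  # Assumption: also separate a letter from a following digit ("Env1" -> "Env_1")
  x <- gsub("([A-Za-z])([0-9])", paste0("\\1", sep, "\\2"), x)
  # Replace runs of spaces, tabs, or punctuation with the separator
  x <- gsub("[[:space:][:punct:]]+", sep, x)
  # Finally, put everything in upper case
  toupper(x)
}

str <- c("Env1", "env 1", "env.1")
tidy_sketch(str)
# "ENV_1" "ENV_1" "ENV_1"
```

All three spellings collapse to a single level, which is the point of tidying factor labels before an analysis.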
all_upper_case(.data, ...)
all_lower_case(.data, ...)
all_title_case(.data, ...)
extract_number(
  .data,
  var,
  new_var = new_var,
  drop = FALSE,
  pull = FALSE,
  .before = NULL,
  .after = NULL
)
extract_string(
  .data,
  var,
  new_var = new_var,
  drop = FALSE,
  pull = FALSE,
  .before = NULL,
  .after = NULL
)
find_text_in_num(.data, ...)
has_text_in_num(.data)
remove_space(.data, ...)
remove_strings(.data, ...)
replace_number(
  .data,
  var,
  new_var = new_var,
  pattern = NULL,
  replacement = "",
  drop = FALSE,
  pull = FALSE,
  .before = NULL,
  .after = NULL
)
replace_string(
  .data,
  var,
  new_var = new_var,
  pattern = NULL,
  replacement = "",
  ignore_case = FALSE,
  drop = FALSE,
  pull = FALSE,
  .before = NULL,
  .after = NULL
)
round_cols(.data, ..., digits = 2)
tidy_strings(.data, ..., sep = "_")
.data: A data frame.
...: The argument depends on the function used. For round_cols(), ... are the variables to round. If no variable is given, all the numeric variables from .data are used. For all_lower_case(), all_upper_case(), all_title_case(), remove_strings(), and tidy_strings(), ... are the variables to apply the function to. If no variable is given, the function is applied to all non-numeric variables in .data.
var: The variable to extract or replace numbers or strings.
new_var: The name of the new variable containing the numbers or strings extracted or replaced. Defaults to new_var.
drop: Logical argument. If TRUE, keeps the new variable new_var and drops the existing ones. Defaults to FALSE.
pull: Logical argument. If TRUE, returns the last column (on the assumption that's the column you've created most recently), as a vector.
.before, .after: For replace_string(), replace_number(), extract_string(), and extract_number(), one-based column index or column name where to add the new columns.
pattern: A string to be matched. Regular expression syntax is also allowed.
replacement: A string for replacement.
ignore_case: If FALSE (default), the pattern matching is case sensitive and if TRUE, case is ignored during matching.
digits: The number of significant figures.
sep: A character string to separate the terms. Defaults to "_".
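To make the roles of a pattern, a replacement, and case sensitivity concrete, the same ideas can be shown with base R's gsub(). This is only an analogy under the assumption that the matching follows ordinary regular-expression semantics; it does not call metan, and the vector gen is made up for illustration.

```r
# Illustrative data (not from metan)
gen <- c("G1", "g2", "G10")

# Case-sensitive matching (the default): only the lower-case "g" is replaced
sensitive <- gsub("g", "GENOTYPE_", gen)
sensitive
# "G1" "GENOTYPE_2" "G10"

# Case ignored during matching: every "g" or "G" is replaced
insensitive <- gsub("g", "GENOTYPE_", gen, ignore.case = TRUE)
insensitive
# "GENOTYPE_1" "GENOTYPE_2" "GENOTYPE_10"

# A regular expression as the pattern: drop everything that is not a digit
numbers <- as.numeric(gsub("[^0-9]", "", gen))
numbers
# 1 2 10
```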
# NOT RUN {
library(metan)
################ Rounding numbers ###############
# All numeric columns
round_cols(data_ge2, digits = 1)
# Round specific columns
round_cols(data_ge2, EP, digits = 1)
########### Extract or replace numbers ##########
# Extract numbers
extract_number(data_ge, GEN)
extract_number(data_ge,
               var = GEN,
               drop = TRUE,
               new_var = g_number)
# Replace numbers
replace_number(data_ge, GEN)
replace_number(data_ge,
               var = GEN,
               pattern = "1",
               replacement = "_one",
               pull = TRUE)
########## Extract, replace or remove strings ##########
# Extract strings
extract_string(data_ge, GEN)
extract_string(data_ge,
               var = GEN,
               drop = TRUE,
               new_var = g_name)
# Replace strings
replace_string(data_ge, GEN)
replace_string(data_ge,
               var = GEN,
               new_var = GENOTYPE,
               pattern = "G",
               replacement = "GENOTYPE_")
# Remove strings
remove_strings(data_ge)
remove_strings(data_ge, ENV)
############ Find text in numeric sequences ###########
mixed_text <- data.frame(data_ge)
mixed_text[2, 4] <- "2..503"
mixed_text[3, 4] <- "3.2o75"
find_text_in_num(mixed_text, GY)
############# upper, lower and title cases ############
gen_text <- c("GEN 1", "Gen 1", "gen 1")
all_lower_case(gen_text)
all_upper_case(gen_text)
all_title_case(gen_text)
# A whole data frame
all_lower_case(data_ge)
############### Tidy up messy text string ##############
messy_env <- c("ENV 1", "Env 1", "Env1", "env1", "Env.1", "Env_1")
tidy_strings(messy_env)
messy_gen <- c("GEN1", "gen 2", "Gen.3", "gen-4", "Gen_5", "GEN_6")
tidy_strings(messy_gen)
messy_int <- c("EnvGen", "Env_Gen", "env gen", "Env Gen", "ENV.GEN", "ENV_GEN")
tidy_strings(messy_int)
library(tibble)
# Or a whole data frame
df <- tibble(Env = messy_env,
             gen = messy_gen,
             Env_GEN = interaction(Env, gen),
             y = rnorm(6, 300, 10))
df
tidy_strings(df)
# }
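The idea behind locating text in a numeric sequence, as in the mixed_text example above, can be sketched in base R. This is a rough illustration and not how metan implements find_text_in_num(): the helper name find_text_sketch() is hypothetical, and it assumes that "text" means any non-missing value that as.numeric() cannot parse.

```r
# Hypothetical helper (not part of metan): returns the positions of values
# that are not missing but cannot be converted to a number.
find_text_sketch <- function(x) {
  x <- as.character(x)
  # as.numeric() yields NA (with a warning) for unparseable entries
  parsed <- suppressWarnings(as.numeric(x))
  which(!is.na(x) & is.na(parsed))
}

# Made-up values mirroring the typos introduced in the example above
gy <- c("2.61", "2..503", "3.2o75", "2.89")
find_text_sketch(gy)
# 2 3
```

Entries 2 and 3 are flagged because of the doubled decimal point and the letter "o" typed in place of a zero.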