layer: Get 'layer' table

Description

Download data from the 'layer' ("camada") table of one or more datasets published in the Data Repository of the Brazilian Soil. This table includes the sampling depth and horizon designation of each layer, along with soil variables such as pH, carbon content, and clay content.

Usage

layer(
  data.set,
  variable,
  stack = FALSE,
  missing = list(depth = "keep", data = "keep"),
  standardization = list(plus.sign = "keep", plus.depth = 2.5, lessthan.sign = "keep",
    lessthan.frac = 0.5, repetition = "keep", combine.fun = "mean", transition = "keep",
    smoothing.fun = "mean", units = FALSE, round = FALSE),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE,
  verbose = TRUE,
  febr.repo = NULL
)

Arguments

data.set

Character vector indicating the identification code of one or more data sets. Use data.set = "all" to download all data sets.

variable

(optional) Character vector indicating one or more variables. Accepts only general identification codes, e.g. "ferro" and "carbono". If missing, then a set of standard identification variables is downloaded. Use variable = "all" to download all variables. See ‘Details’ for more information.

stack

(optional) Logical value indicating if tables from different datasets should be stacked into a single table for output. Requires standardization = list(units = TRUE) -- see below. Defaults to stack = FALSE, in which case the output is a list of tables.

missing

(optional) List with two named sub-arguments indicating what should be done with a layer that is missing data: depth, for layers missing data on sampling depth, and data, for layers missing data on the variable(s). Options for both are "keep" (default) and "drop".
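
As a rough illustration of what "drop" means for each sub-argument, the sketch below filters a small made-up table in base R. It does not call layer(), the data are invented, and the exact rule the package applies internally (for example, how it treats layers with several variables) is an assumption here.

```r
# Made-up layer table; column names follow the default fields of the 'layer' table.
layers <- data.frame(
  evento_id_febr = c("P1", "P1", "P2"),
  camada_id = c(1, 2, 1),
  profund_sup = c(0, 20, NA),
  profund_inf = c(20, 40, NA),
  carbono = c(12.5, NA, 8.1)
)

# Analogue of missing = list(depth = "drop"):
# drop layers lacking the upper or the lower depth limit (assumed rule).
has_depth <- !is.na(layers$profund_sup) & !is.na(layers$profund_inf)
kept_depth <- layers[has_depth, ]

# Analogue of missing = list(data = "drop"):
# drop layers lacking data on the requested variable (assumed rule).
kept_data <- layers[!is.na(layers$carbono), ]

print(kept_depth)
print(kept_data)
```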

standardization

(optional) List with named sub-arguments indicating how to perform data standardization.

plus.sign Character string indicating what should be done with the plus sign (+) commonly used along with the lower limit of the bottom layer of an observation. Options are "keep" (default), "add", and "remove".
plus.depth Numeric value indicating the depth increment (in centimeters) when processing the plus sign (+) with plus.sign = "add". Defaults to plus.depth = 2.5.
lessthan.sign Character string indicating what should be done with the less-than sign (<) used to indicate that the value of a variable is below the lower limit of detection. Options are "keep" (default), "subtract", and "remove".
lessthan.frac Numeric value between 0 and 1 indicating the fraction of the lower limit of detection that should be subtracted from it when lessthan.sign = "subtract". Defaults to lessthan.frac = 0.5, i.e. subtract 50%.
repetition Character string indicating what should be done with repetitions, i.e. repeated measurements of layers in an observation. Options are "keep" (default) and "combine". In the latter case, it is recommended to set lessthan.sign = "subtract" or lessthan.sign = "remove".
combine.fun Character string indicating the function that should be used to combine repeated measurements of layers in an observation when repetition = "combine". Options are "mean" (default), "min", "max", and "median".
transition Character string indicating what should be done about the wavy and irregular transition between subsequent layers in an observation. Options are "keep" (default) and "smooth".
smoothing.fun Character string indicating the function that should be used to smooth wavy and irregular transitions between subsequent layers in an observation when transition = "smooth". Options are "mean" (default), "min", "max", and "median".
units Logical value indicating if the measurement unit(s) of the continuous variable(s) should be converted to the standard measurement unit(s). Defaults to units = FALSE, i.e. no conversion is performed. See dictionary() for more information.
round Logical value indicating if the values of the continuous variable(s) should be rounded to the standard number of decimal places. Requires units = TRUE. Defaults to round = FALSE, i.e. no rounding is performed. See dictionary() for more information.
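
To make the plus.sign = "add" and lessthan.sign = "subtract" options more concrete, here is a minimal base R sketch of the arithmetic they describe. The input values are invented and the code is not the package's internal implementation; in particular, reading "subtract" as value minus lessthan.frac times value is an assumption based on the descriptions above.

```r
# Hypothetical raw values as they might be recorded in a dataset.
profund_inf <- c("20", "45", "80+")
carbono <- c("12.5", "< 0.5", "3.2")

# plus.sign = "add": strip the plus sign and add plus.depth (cm)
# to the lower limit of the bottom layer.
plus.depth <- 2.5
has_plus <- grepl("+", profund_inf, fixed = TRUE)
depth_num <- as.numeric(gsub("+", "", profund_inf, fixed = TRUE))
depth_num[has_plus] <- depth_num[has_plus] + plus.depth

# lessthan.sign = "subtract": strip the less-than sign and subtract
# a fraction (lessthan.frac) of the detection limit from the limit itself.
lessthan.frac <- 0.5
below <- grepl("<", carbono, fixed = TRUE)
carbon_num <- as.numeric(trimws(gsub("<", "", carbono, fixed = TRUE)))
carbon_num[below] <- carbon_num[below] - lessthan.frac * carbon_num[below]

print(depth_num)
print(carbon_num)
```

With the defaults plus.depth = 2.5 and lessthan.frac = 0.5, a lower limit recorded as "80+" becomes 82.5 cm and a value recorded as "< 0.5" becomes 0.25.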

harmonization

(optional) List with named sub-arguments indicating if and how to perform data harmonization.

harmonize Logical value indicating if data should be harmonized. Defaults to harmonize = FALSE, i.e. no harmonization is performed.
level Integer value indicating the number of levels of the identification code of the variable(s) that should be considered for harmonization. Defaults to level = 2. See ‘Details’ for more information.

progress

(optional) Logical value indicating if a download progress bar should be displayed.

verbose

(optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report to febr-forum@googlegroups.com if you find any issue.

febr.repo

(optional) Defaults to the remote file directory of the Federal University of Technology - Paraná at https://cloud.utfpr.edu.br/index.php/s/Df6dhfzYJ1DDeso. Alternatively, a local directory path can be provided if the user has a local copy of the data repository.

Value

A list of data.frames or a single data.frame containing the data, possibly standardized or harmonized, of the chosen variable(s) of the chosen dataset(s).

Details

Default variables

Default variables (fields) present in the 'layer' table are as follows:

dataset_id. Identification of the dataset in FEBR to which an observation belongs.
evento_id_febr. Identification code of an observation in a dataset.
camada_id. Sequential layer number, from top to bottom.
camada_altid. Layer designation according to some standard description guide.
amostra_id. Laboratory number of a sample.
profund_sup. Upper boundary of a layer (cm).
profund_inf. Lower boundary of a layer (cm).

Further details about the content of the default variables (fields) can be found in https://docs.google.com/document/d/1Bqo8HtitZv11TXzTviVq2bI5dE6_t_fJt0HE-l3IMqM (in Portuguese).
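
The depth fields can be combined directly once they are numeric. The following base R sketch uses invented values, and the column names thickness and mid_depth are hypothetical additions that are not part of the 'layer' table.

```r
# Made-up layers of a single observation, using the default field names.
layers <- data.frame(
  evento_id_febr = c("P1", "P1", "P1"),
  camada_id = 1:3,
  profund_sup = c(0, 15, 40),
  profund_inf = c(15, 40, 90)
)

# Layer thickness (cm): lower boundary minus upper boundary.
layers$thickness <- layers$profund_inf - layers$profund_sup

# Mid-depth of each layer (cm), a common reference depth for plotting.
layers$mid_depth <- (layers$profund_sup + layers$profund_inf) / 2

print(layers)
```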

Harmonization

Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet digestion method to the standard dry combustion method is data harmonization.

A heuristic data harmonization procedure is implemented in the febr package. It consists of grouping variables based on a chosen number of levels of their identification code. For example, consider a variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where aaa is the first level and ddd is the fourth level. Now consider a related variable, aaa_bbb_eee_fff. If the harmonization is to consider all four coding levels (level = 4), then these two variables will remain coded as separate variables. But if level = 2, then both variables will be re-coded as aaa_bbb, thus becoming the same variable.
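
The re-coding step described above can be sketched in base R as follows. The helper harmonize_code() is a hypothetical name used only for illustration; it mimics the grouping rule as described here and is not the function used inside the febr package.

```r
# Hypothetical helper: keep only the first `level` levels of each
# underscore-separated identification code.
harmonize_code <- function(code, level = 2) {
  parts <- strsplit(code, "_", fixed = TRUE)
  vapply(
    parts,
    function(p) paste(head(p, level), collapse = "_"),
    character(1)
  )
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")

# level = 4 keeps all four levels, so the variables stay separate.
print(harmonize_code(codes, level = 4))

# level = 2 keeps only the first two levels, so both become "aaa_bbb".
print(harmonize_code(codes, level = 2))
```

Lower values of level merge more variables under one code, so they trade method-specific detail for a larger pooled sample.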

Examples

# NOT RUN {
if (interactive()) {
  # Download the standard identification variables of a single dataset
  res <- layer(data.set = "ctb0003")

  # Download two datasets, standardize units, and stack them into one table
  res <- layer(
    data.set = paste0("ctb000", 4:5),
    variable = "carbono", stack = TRUE,
    standardization = list(units = TRUE))

  # Try to download a dataset that is not available yet
  res <- layer(data.set = "ctb0020")

  # Try to download a nonexistent dataset
  # res <- layer(data.set = "ctb0000")
}
# }