Download data from the layer ("camada") table of one or more datasets contained in the Free Brazilian
Repository for Open Soil Data (febr), http://www.ufsm.br/febr. This includes sampling depth,
horizon designation, and variables such as pH, carbon content, and clay content, among others. Use header
to check which variables the layer table of a dataset contains before downloading it.
layer(dataset, variable, stack = FALSE,
  missing = list(depth = "keep", data = "keep"),
  standardization = list(plus.sign = "keep", plus.depth = 2.5,
    lessthan.sign = "keep", lessthan.frac = 0.5, repetition = "keep",
    combine.fun = "mean", transition = "keep", smoothing.fun = "mean",
    units = FALSE, round = FALSE),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE, verbose = TRUE)
dataset
Character vector indicating one or more datasets. Identification codes should be as recorded
in http://www.ufsm.br/febr/catalog/. Use dataset = "all" to download all datasets.
variable
(optional) Character vector indicating one or more variables. Accepts only general
identification codes, e.g. "ferro" and "carbono". If missing, then a set of standard identification
variables is downloaded. Use variable = "all" to download all variables. See ‘Details’ for
more information.
stack
(optional) Logical value indicating if tables from different datasets should be stacked into a
single table for output. Requires standardization = list(units = TRUE) (see below). Defaults to
stack = FALSE, the output being a list of tables.
missing
(optional) List with named sub-arguments indicating what should be done with a layer missing
data on sampling depth (depth) or data on variable(s) (data). Options are "keep" (default) and "drop".
standardization
(optional) List with named sub-arguments indicating how to perform data standardization.
plus.sign
Character string indicating what should be done with the plus sign (+) commonly used
along with the lower limit of the bottom layer of an observation. Options are "keep" (default),
"add", and "remove".
plus.depth
Numeric value indicating the depth increment (in centimeters) when processing the plus
sign (+) with plus.sign = "add". Defaults to plus.depth = 2.5.
lessthan.sign
Character string indicating what should be done with the less-than sign (<) used
to indicate that the value of a variable is below the lower limit of detection. Options are "keep"
(default), "subtract", and "remove".
lessthan.frac
Numeric value between 0 and 1 (a fraction) by which the lower limit of detection
should be reduced when lessthan.sign = "subtract". Defaults to lessthan.frac = 0.5, i.e.
the lower limit of detection is reduced by 50%.
repetition
Character string indicating what should be done with repetitions, i.e. repeated
measurements of layers in an observation. Options are "keep" (default) and "combine". In the
latter case, it is recommended to set lessthan.sign = "subtract" or lessthan.sign = "remove".
combine.fun
Character string indicating the function that should be used to combine repeated
measurements of layers in an observation when repetition = "combine". Options are "mean"
(default), "min", "max", and "median".
transition
Character string indicating what should be done about wavy and irregular
transitions between subsequent layers in an observation. Options are "keep" (default) and
"smooth".
smoothing.fun
Character string indicating the function that should be used to smooth wavy and
irregular transitions between subsequent layers in an observation when transition = "smooth".
Options are "mean" (default), "min", "max", and "median".
units
Logical value indicating if the measurement units of the continuous variable(s) should
be converted to the standard measurement unit(s). Defaults to units = FALSE, i.e. no conversion is
performed. See standard for more information.
round
Logical value indicating if the values of the continuous variable(s) should be rounded
to the standard number of decimal places. Requires units = TRUE. Defaults to round = FALSE, i.e.
no rounding is performed. See standard for more information.
harmonization
(optional) List with named sub-arguments indicating if and how to perform data harmonization.
harmonize
Logical value indicating if data should be harmonized. Defaults to harmonize = FALSE,
i.e. no harmonization is performed.
level
Integer value indicating the number of levels of the identification code of the variable(s)
that should be considered for harmonization. Defaults to level = 2. See ‘Details’ for more
information.
progress
(optional) Logical value indicating if a download progress bar should be displayed.
verbose
(optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report any issues you find to febr-forum@googlegroups.com.
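The effect of plus.sign = "add" and lessthan.sign = "subtract" can be illustrated with a minimal base R sketch. The helpers add_plus_depth() and subtract_lessthan() and the sample values are hypothetical and are not part of febr; they only mimic the behaviour described above and may differ from the package's internal implementation.

```r
# Hypothetical helpers (not part of febr) mimicking the documented options.

# plus.sign = "add": strip the trailing "+" and add plus.depth (cm)
# to the depth values that carried the sign.
add_plus_depth <- function(x, plus.depth = 2.5) {
  has_plus <- grepl("+", x, fixed = TRUE)
  depth <- as.numeric(sub("+", "", x, fixed = TRUE))
  depth[has_plus] <- depth[has_plus] + plus.depth
  depth
}

# lessthan.sign = "subtract": strip the "<" and reduce the lower limit
# of detection by the fraction lessthan.frac.
subtract_lessthan <- function(x, lessthan.frac = 0.5) {
  below <- grepl("<", x, fixed = TRUE)
  value <- as.numeric(sub("<", "", x, fixed = TRUE))
  value[below] <- value[below] * (1 - lessthan.frac)
  value
}

# Made-up values, for illustration only.
profund_inf <- c("20", "60", "120+")
ferro <- c("1.2", "<0.1", "0.8")

add_plus_depth(profund_inf)  # last depth becomes 120 + 2.5
subtract_lessthan(ferro)     # "<0.1" becomes 0.1 - 0.5 * 0.1
```

Values without a sign pass through unchanged, which is why setting lessthan.sign to "subtract" or "remove" is a sensible step before combining repetitions numerically.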
A list of data frames or a data frame with data on the chosen variable(s) of the chosen dataset(s).
Standard identification variables and their content are as follows:
dataset_id. Identification code of the dataset in febr to which an observation belongs.
observacao_id. Identification code of an observation in febr.
camada_id. Sequential layer number, from top to bottom.
camada_nome. Layer designation according to some standard description guide.
amostra_id. Laboratory number of a sample.
profund_sup. Upper boundary of a layer (cm).
profund_inf. Lower boundary of a layer (cm).
Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet digestion method to the standard dry combustion method is data harmonization.
A heuristic data harmonization procedure is implemented in the febr package. It consists of grouping
variables based on a chosen number of levels of their identification code. For example, consider a
variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where aaa is the first
level and ddd is the fourth level. Now consider a related variable, aaa_bbb_eee_fff. If the harmonization
is to consider all four coding levels (level = 4), then these two variables will remain coded as
separate variables. But if level = 2, then both variables will be re-coded as aaa_bbb, thus becoming the
same variable.
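The re-coding step can be sketched in base R as follows. The helper recode_levels() is hypothetical and only illustrates the grouping rule described above, assuming that levels are separated by underscores as in the example codes; it is not the code used inside febr.

```r
# Hypothetical helper (not part of febr): keep only the first `level`
# levels of an underscore-separated identification code.
recode_levels <- function(code, level = 2) {
  parts <- strsplit(code, "_", fixed = TRUE)
  vapply(parts, function(p) paste(head(p, level), collapse = "_"), character(1))
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")

recode_levels(codes, level = 4)  # codes unchanged: still two separate variables
recode_levels(codes, level = 2)  # both re-coded as "aaa_bbb"
```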
# NOT RUN {
res <- layer(dataset = "ctb0013")
str(res)
# }