- response
a data frame with tree-ring proxy variables as columns and
(optionally) years as row names. Row names should match those of the
env_data data frame. If they do not, set row_names_subset = TRUE.
- env_data
a data frame of daily sequences of environmental data as
columns and years as row names. Each row represents a year and
each column represents a day of the year. Row names should match
those of the response data frame. If they do not, set row_names_subset = TRUE.
Alternatively, env_data can be a tidy data frame with three columns:
Year, DOY and a third column holding the values (mean temperature,
precipitation sum, etc.). If tidy data is passed to the function, set the
tidy_env_data argument to TRUE.
- method
a character string specifying which method to use. Current
possibilities are "cor" (default), "lm" and "brnn".
- metric
a character string specifying which metric to use. Current
possibilities are "r.squared" and "adj.r.squared". If method = "cor",
metric is not relevant.
- cor_method
a character string indicating which correlation
coefficient is to be computed. One of "pearson" (default), "kendall", or
"spearman".
- lower_limit
lower limit of window width
- upper_limit
upper limit of window width
- fixed_width
fixed width used for calculation. If fixed_width is
assigned a value, upper_limit and lower_limit will be ignored
- previous_year
if set to TRUE, env_data and response variables will be
rearranged so that data from the previous year are also used when calculating
the selected statistical metric.
- neurons
positive integer that indicates the number of neurons used
for brnn method
- brnn_smooth
if set to TRUE, a smoothing algorithm is applied that
removes unrealistic calculations which are a result of neural net failure.
- remove_insignificant
if set to TRUE, removes all correlations below
the significance threshold, based on the selected alpha. For the "lm" and
"brnn" methods, the squared correlation is used as the threshold.
- alpha
significance level used to remove insignificant calculations.
- row_names_subset
if set to TRUE, row names are used to subset the
env_data and response data frames. Only years present in both data frames are
kept.
- PCA_transformation
if set to TRUE, all variables in the response
data frame will be transformed using PCA transformation.
- log_preprocess
if set to TRUE, variables will be transformed with a
logarithmic transformation before being used in PCA.
- components_selection
character string specifying how to select the Principal
Components used as predictors.
There are three options: "automatic", "manual" and "plot_selection". If the
argument is set to "automatic", all scores with eigenvalues above 1 will be
selected. This threshold can be changed with the
eigenvalues_threshold argument. If the argument is set to "manual", the user should
set the number of components with the N_components argument. If it is set to
"plot_selection", a scree plot will be shown and the user must
manually enter the number of components to be used as predictors.
- eigenvalues_threshold
threshold for automatic selection of Principal Components
- N_components
number of Principal Components used as predictors
- aggregate_function
character string specifying how the daily data
should be aggregated. The default is 'mean', the other options are 'median',
'sum', 'min' and 'max'
- temporal_stability_check
character string specifying how temporal stability
between the optimal selection and response variable(s) will be analysed. Current
possibilities are "sequential", "progressive" and "running_window". The sequential check
splits the data into k splits and calculates the selected metric for each split. The progressive
check splits the data into k splits, calculates the metric for the first split and then
progressively adds one split at a time, calculating the selected metric each time. For the
running window, set the length of the running window with the k_running_window argument.
- k
integer, number of breaks (splits) for temporal stability and cross validation
analysis.
- k_running_window
the length of the running window for the temporal stability check.
Applicable only if the temporal_stability_check argument is set to "running_window".
- cross_validation_type
character string specifying how to perform cross validation
between the optimal selection and response variables. If the argument is set to "blocked",
years will not be shuffled. If the argument is set to "randomized", years will be shuffled.
- subset_years
a subset of years to be analyzed. Should be given in the form of
subset_years = c(1980, 2005)
- plot_specific_window
integer representing window width to be displayed
for plot_specific
- ylimits
limits of the y axis for plot_extreme and plot_specific. They should be
given in the form of: ylimits = c(0, 1)
- seed
optional seed argument for reproducible results
- tidy_env_data
if set to TRUE, env_data should be inserted as a data frame with three
columns: "Year", "DOY", "Precipitation/Temperature/etc."
- reference_window
character string describing which day of the window each
calculation refers to. There are three options: 'start'
(default), 'end' and 'middle'. If reference_window is set to 'start',
each calculation is related to the first day of the window. If it is set to
'middle', each calculation is related to the middle day of the window. If it
is set to 'end', each calculation is related to the last day of the window.
For example, consider correlations calculated with a window from DOY 15 to DOY 35.
If reference_window is set to 'start', this calculation will be related to
DOY 15. If it is set to 'end', the calculation will be
related to DOY 35. If it is set to 'middle', the
calculation will be related to DOY 25.
The optimal selection, which describes the optimal consecutive days that return
the highest calculated metric and is obtained from the $plot_extreme output, is the
same for all three reference windows.
- boot
logical, if TRUE, a bootstrap procedure will be used to calculate
estimates of correlation coefficients, R squared or adjusted R squared metrics
- boot_n
The number of bootstrap replicates
- boot_ci_type
A character string representing the type of bootstrap intervals
required. The value should be any subset of the values c("norm", "basic", "stud",
"perc", "bca").
- boot_conf_int
A scalar or vector containing the confidence level(s) of
the required interval(s)
- day_interval
a vector of two values: the lower and upper limits of the interval of
days that will be used to calculate statistical metrics. Negative values indicate
days of the previous growing season. This argument overrides the calculation
limits defined by the lower_limit and upper_limit arguments.
- dc_method
a character string to determine the method to detrend climate
(environmental) data. Possible values are c("Spline", "ModNegExp", "Mean",
"Friedman", "ModHugershoff"). Defaults to "none" (see dplR R package).
- dc_nyrs
a number giving the rigidity of the smoothing spline, defaults
to 0.67 of series length if nyrs is NULL (see dplR R package).
- dc_f
a number between 0 and 1 giving the frequency response or wavelength
cutoff. Defaults to 0.5 (see dplR R package).
- dc_pos.slope
a logical flag. Will allow for a positive slope to be used
in method "ModNegExp" and "ModHugershoff". If FALSE the line will be horizontal
(see dplR R package).
- dc_constrain.nls
a character string which controls the constraints of
the "ModNegExp" and "ModHugershoff" models (see dplR R package).
- dc_span
a numeric value controlling method "Friedman", or "cv" (default)
for automatic choice by cross-validation (see dplR R package).
- dc_bass
a numeric value controlling the smoothness of the fitted curve
in method "Friedman" (see dplR R package).
- dc_difference
a logical flag. Compute residuals by subtraction if TRUE,
otherwise use division (see dplR R package).
- cor_na_use
an optional character string giving a method for computing
covariances in the presence of missing values for correlation coefficients.
This must be (an abbreviation of) one of the strings "everything" (default),
"all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". See
also the documentation for the base cor() function.
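
The interplay of fixed_width, aggregate_function and reference_window can be
illustrated with a minimal base R sketch. It is not the package's implementation:
the helper reference_day(), the simulated data and the rounding rule for windows
with an even number of days are assumptions made for illustration only.

```r
# Hypothetical helper (not part of any package): returns the day of year
# that a window calculation would be attributed to.
reference_day <- function(start_doy, end_doy,
                          reference_window = c("start", "end", "middle")) {
  reference_window <- match.arg(reference_window)
  switch(reference_window,
         start  = start_doy,
         end    = end_doy,
         # Assumption: for an even number of days, round down.
         middle = start_doy + (end_doy - start_doy) %/% 2)
}

# The DOY 15 to DOY 35 example from the reference_window description
ref_start  <- reference_day(15, 35, "start")   # 15
ref_end    <- reference_day(15, 35, "end")     # 35
ref_middle <- reference_day(15, 35, "middle")  # 25

# Simulated daily data: years as rows, days of year as columns
set.seed(1)
years <- 1981:2010
env <- matrix(rnorm(length(years) * 366), nrow = length(years),
              dimnames = list(years, 1:366))
# Simulated proxy that depends on the DOY 15 to 35 mean
response <- rowMeans(env[, 15:35]) + rnorm(length(years), sd = 0.1)

# Moving window of fixed width, aggregated with the mean, correlated with
# the response (Pearson). One value per possible starting day.
width <- 21
starts <- seq_len(ncol(env) - width + 1)
window_cor <- sapply(starts, function(s) {
  cor(rowMeans(env[, s:(s + width - 1)]), response)
})

# Label each calculation by its reference day; only the labels change,
# the windows and their correlations stay the same.
names(window_cor) <- reference_day(starts, starts + width - 1, "middle")
best <- which.max(window_cor)
optimal_window <- c(start = starts[best], end = starts[best] + width - 1)
```

Because reference_window only relabels the calculations, optimal_window is
identical whichever option is used to name the elements of window_cor.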