marinespeed (version 0.1.0)

lapply_kfold_species: Apply a function over the folds of a set of species

Description

lapply_kfold_species returns a list of lists: one entry per species (all species, or the provided subset), each holding the result of applying fun to every requested fold of that species.

Usage

lapply_kfold_species(fun, ..., species = NULL, fold_type = "disc", k = 1:5)

Arguments

fun
function. The function to be applied to the occurrence records of each species. It receives the species name, a list with the occurrence and background training and test records, and the fold number.
...
optional arguments to fun.
species
dataframe or character vector. Either a dataframe as returned by list_species, or the names of the species. If NULL (default), fun is applied to all species.
fold_type
character. Type of partitioning to use: "disc" (default), "grid_4", "grid_9", "random" or "targetgroup". See Details.
k
integer vector. Numbers of the folds to get data for. To get all 5 folds, pass 1:5, which is the default.

Value

A list with one named entry for every species provided, or for all species if none were provided. Each entry is itself a list, with the fold numbers (k) as names and the results of fun as values.
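To make the nested return shape concrete, here is a minimal base R sketch. It does not call marinespeed: the result object is a hand-built mock that follows the structure described above, and the species names and values are invented for illustration. It also assumes the fold numbers appear as character names ("1", "2", and so on).

```r
# Mock result with the documented shape: one entry per species, each a list
# named by fold number. Species names and values are invented for illustration.
result <- list(
  "Species A" = list("1" = 10, "2" = 12),
  "Species B" = list("1" = 7, "2" = 9)
)

# Result of fun for one species and one fold
fold1_a <- result[["Species A"]][["1"]]

# Flatten the nested list into one row per species and fold
summary_df <- do.call(rbind, lapply(names(result), function(sp) {
  folds <- result[[sp]]
  data.frame(species = sp,
             fold = as.integer(names(folds)),
             value = unlist(folds, use.names = FALSE))
}))
print(summary_df)
```

Flattening like this only works when fun returns a single value per fold; for richer return values the nested list is usually easier to keep as is.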

Details

The parameters passed to fun are speciesname, data and fold. data is a list with 4 elements (occurrence_training, occurrence_test, background_training and background_test) and fold contains the fold number.

The different values of fold_type are:

"disc": 5-fold disc partitioning of occurrences with pairwise distance sampled and buffer filtered random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "disc", k = 5, pwd_sample = TRUE, background_buffer = 200*1000

"grid_4" and "grid_9": 4-fold and 9-fold grid partitioning of occurrences with pairwise distance sampled and buffer filtered random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "grid", k = 4 (presumably k = 9 for "grid_9"), pwd_sample = TRUE, background_buffer = 200*1000

"random": 5-fold random partitioning of occurrences and random background points, equivalent to calling kfold_occurrence_background with occurrence_fold_type = "random", k = 5, pwd_sample = FALSE, background_buffer = 0

"targetgroup": same partitioning as the "random" folds, but instead of random background points a random subset of all occurrence points is used, creating a target-group background set that has the same sampling bias as the entire dataset.
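The callback contract described above can be sketched as follows. count_records is a hypothetical function written for this illustration, and mock_points and mock_data are made-up stand-ins for the data that lapply_kfold_species would supply, so the snippet runs without the package or its datasets.

```r
# Hypothetical callback with the documented signature (speciesname, data, fold):
# returns the number of records in each of the four data sets.
count_records <- function(speciesname, data, fold) {
  parts <- c("occurrence_training", "occurrence_test",
             "background_training", "background_test")
  counts <- vapply(data[parts], nrow, integer(1))
  c(list(species = speciesname, fold = fold), as.list(counts))
}

# Mock input, invented for illustration; real records have more columns.
mock_points <- function(n) {
  data.frame(longitude = seq_len(n), latitude = seq_len(n))
}
mock_data <- list(
  occurrence_training = mock_points(8),
  occurrence_test     = mock_points(2),
  background_training = mock_points(40),
  background_test     = mock_points(10)
)

res <- count_records("Mock species", mock_data, fold = 1)
str(res)
```

With marinespeed installed, the same function could be passed directly, for example lapply_kfold_species(count_records, fold_type = "random", k = 1:2), which would apply it to the first two random folds of every species.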

See Also

list_species, lapply_species, get_fold_data

Examples

## Not run: ------------------------------------
# plot_occurrences <- function(speciesname, data, fold) {
#    title <- paste0(speciesname, " (fold = ", fold, ")")
#    plot(data$occurrence_training[,c("longitude", "latitude")], pch=".",
#         col="blue", main = title)
#    points(data$occurrence_test[,c("longitude", "latitude")], pch=".",
#         col="red")
# }
# 
# # plot training (blue) and test (red) occurrences
# # of the first 2 folds for the first 5 species
# species <- list_species()
# lapply_kfold_species(plot_occurrences, species=species[1:5,],
#                      fold_type = "disc", k = 1:2)
## ---------------------------------------------
